* [RFC PATCH v3 00/31] CXL 2.0 Support
@ 2021-02-02  0:59 ` Ben Widawsky
  0 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

Major changes since v2 [1]:
 * Removed all register endian/alignment/size checking. Using core functionality
   instead. This is untested on big endian hosts, but Should Work(tm).
 * Fixed component capability header generation (off by 1).
 * Fixed HDM programming (multiple issues).
 * Fixed timestamp command implementations.
 * Added commands: GET_FIRMWARE_UPDATE_INFO, GET_PARTITION_INFO, GET_LSA, SET_LSA

Things have remained fairly stable since v2. The biggest change here is
definitely the HDM programming, which has received limited (but not 0) testing
in the Linux driver.

Jonathan Cameron has gotten this patch series working on ARM [2], and added some
much sought after functionality [3].

---

I've started #cxl on OFTC IRC for discussion. Please feel free to use that
channel for questions or suggestions in addition to #qemu.

---

Introduce emulation of Compute Express Link 2.0
(https://www.computeexpresslink.org/). Specifically, add support for Type 3
memory expanders with persistent memory.

The emulation has been critical to get the Linux enabling started [4]. It would
also be an ideal place to land regression tests for different topology handling,
and there may be applications for this emulation as a way for a guest to
manipulate its address space relative to different performance memories.

Three of the five CXL component types are emulated with some level of
functionality: host bridge, root port, and memory device. All components and
devices implement basic MMIO. Devices/memory devices implement the mailbox
interface. Basic ACPI support is also included. Upstream ports and downstream
ports aren't implemented (the two components needed to make up a switch).

CXL 2.0 is built on top of PCIe (see spec for details). As a result, much of the
implementation utilizes existing PCI paradigms. To implement the host bridge,
I've chosen to use PXB (PCI Expander Bridge). It seemed to be the most natural
fit even though it doesn't directly map to how hardware will work. For
persistent capacity of the memory device, I utilized the memory subsystem
(hw/mem).

We have 3 reasons why this work is valuable:
1. Linux driver feature development benefits from emulation, both because of a
   lack of initial hardware availability and because, as is seen with
   NVDIMM/PMEM emulation, there is value in being able to share topologies with
   system-software developers even after hardware is available.

2. The Linux kernel's unit test suite for NVDIMM/PMEM ended up injecting fake
   resources via custom modules (nfit_test). In retrospect a QEMU emulation of
   nfit_test capabilities would have made the test environment more portable,
   and allowed for easier community contributions of example configurations.

3. This is still being fleshed out, but in short it provides a standardized
   mechanism for the guest to provide feedback to the host about size and
   placement needs of the memory. After the host gives the guest a physical
   window mapping to the CXL device, the emulated HDM decoders allow the guest a
   way to tell the host how much it wants and where. There are likely simpler
   ways to do this, but they'd require inventing a new interface and you'd need
   to have diverging driver code in the guest programming of the HDM decoder vs.
   the host. Since we've already done this work, why not use it?

There is quite a long list of work to do for full spec compliance, but I don't
believe that any of it precludes merging. Off the top of my head:
- Main host bridge support (WIP)
- Interleaving
- Better Tests
- Hot plug support
- Emulating volatile capacity
- CDAT emulation [3]

The flow of the patches in general is to define all the data structures and
registers associated with the various components in a top-down manner: host
bridge, component, ports, devices. Then, the actual implementation is done in
the same order.

The summary is:
1-5: Infrastructure for component and device emulation
6-9: Basic mailbox command implementations
10-19: Implement CXL host bridges as PXB devices
20: Implement a root port
21-22: Implement a memory device
23-26: ACPI bits
27-29: Add some more advanced mailbox command implementations
30: Start working on enabling the main host bridge
31: Basic test case

---

[1]: https://lore.kernel.org/qemu-devel/20210105165323.783725-1-ben.widawsky@intel.com/
[2]: https://lore.kernel.org/qemu-devel/20210201152655.31027-1-Jonathan.Cameron@huawei.com/
[3]: https://lore.kernel.org/qemu-devel/20210201151629.29656-1-Jonathan.Cameron@huawei.com/
[4]: https://lore.kernel.org/linux-cxl/20210130002438.1872527-1-ben.widawsky@intel.com/

---

Ben Widawsky (31):
  hw/pci/cxl: Add a CXL component type (interface)
  hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5)
  hw/cxl/device: Introduce a CXL device (8.2.8)
  hw/cxl/device: Implement the CAP array (8.2.8.1-2)
  hw/cxl/device: Implement basic mailbox (8.2.8.4)
  hw/cxl/device: Add memory device utilities
  hw/cxl/device: Add cheap EVENTS implementation (8.2.9.1)
  hw/cxl/device: Timestamp implementation (8.2.9.3)
  hw/cxl/device: Add log commands (8.2.9.4) + CEL
  hw/pxb: Use a type for realizing expanders
  hw/pci/cxl: Create a CXL bus type
  hw/pxb: Allow creation of a CXL PXB (host bridge)
  qtest: allow DSDT acpi table changes
  acpi/pci: Consolidate host bridge setup
  tests/acpi: remove stale allowed tables
  hw/pci: Plumb _UID through host bridges
  hw/cxl/component: Implement host bridge MMIO (8.2.5, table 142)
  acpi/pxb/cxl: Reserve host bridge MMIO
  hw/pxb/cxl: Add "windows" for host bridges
  hw/cxl/rp: Add a root port
  hw/cxl/device: Add a memory device (8.2.8.5)
  hw/cxl/device: Implement MMIO HDM decoding (8.2.5.12)
  acpi/cxl: Add _OSC implementation (9.14.2)
  tests/acpi: allow CEDT table addition
  acpi/cxl: Create the CEDT (9.14.1)
  tests/acpi: Add new CEDT files
  hw/cxl/device: Add some trivial commands
  hw/cxl/device: Plumb real LSA sizing
  hw/cxl/device: Implement get/set LSA
  qtest/cxl: Add very basic sanity tests
  WIP: i386/cxl: Initialize a host bridge

 MAINTAINERS                         |   6 +
 hw/Kconfig                          |   1 +
 hw/acpi/Kconfig                     |   5 +
 hw/acpi/cxl.c                       | 173 ++++++++++
 hw/acpi/meson.build                 |   1 +
 hw/arm/virt.c                       |   1 +
 hw/core/machine.c                   |  26 ++
 hw/core/numa.c                      |   3 +
 hw/cxl/Kconfig                      |   3 +
 hw/cxl/cxl-component-utils.c        | 208 ++++++++++++
 hw/cxl/cxl-device-utils.c           | 264 +++++++++++++++
 hw/cxl/cxl-mailbox-utils.c          | 498 ++++++++++++++++++++++++++++
 hw/cxl/meson.build                  |   5 +
 hw/i386/acpi-build.c                |  87 ++++-
 hw/i386/microvm.c                   |   1 +
 hw/i386/pc.c                        |   2 +
 hw/mem/Kconfig                      |   5 +
 hw/mem/cxl_type3.c                  | 405 ++++++++++++++++++++++
 hw/mem/meson.build                  |   1 +
 hw/meson.build                      |   1 +
 hw/pci-bridge/Kconfig               |   5 +
 hw/pci-bridge/cxl_root_port.c       | 231 +++++++++++++
 hw/pci-bridge/meson.build           |   1 +
 hw/pci-bridge/pci_expander_bridge.c | 209 +++++++++++-
 hw/pci-bridge/pcie_root_port.c      |   6 +-
 hw/pci/pci.c                        |  32 +-
 hw/pci/pcie.c                       |  30 ++
 hw/ppc/spapr.c                      |   2 +
 include/hw/acpi/cxl.h               |  27 ++
 include/hw/boards.h                 |   2 +
 include/hw/cxl/cxl.h                |  29 ++
 include/hw/cxl/cxl_component.h      | 187 +++++++++++
 include/hw/cxl/cxl_device.h         | 255 ++++++++++++++
 include/hw/cxl/cxl_pci.h            | 160 +++++++++
 include/hw/pci/pci.h                |  15 +
 include/hw/pci/pci_bridge.h         |  25 ++
 include/hw/pci/pci_bus.h            |   8 +
 include/hw/pci/pci_ids.h            |   1 +
 monitor/hmp-cmds.c                  |  15 +
 qapi/machine.json                   |   1 +
 tests/data/acpi/pc/CEDT             | Bin 0 -> 36 bytes
 tests/data/acpi/pc/DSDT             | Bin 5065 -> 5065 bytes
 tests/data/acpi/pc/DSDT.acpihmat    | Bin 6390 -> 6390 bytes
 tests/data/acpi/pc/DSDT.bridge      | Bin 6924 -> 6924 bytes
 tests/data/acpi/pc/DSDT.cphp        | Bin 5529 -> 5529 bytes
 tests/data/acpi/pc/DSDT.dimmpxm     | Bin 6719 -> 6719 bytes
 tests/data/acpi/pc/DSDT.hpbridge    | Bin 5026 -> 5026 bytes
 tests/data/acpi/pc/DSDT.hpbrroot    | Bin 3084 -> 3084 bytes
 tests/data/acpi/pc/DSDT.ipmikcs     | Bin 5137 -> 5137 bytes
 tests/data/acpi/pc/DSDT.memhp       | Bin 6424 -> 6424 bytes
 tests/data/acpi/pc/DSDT.numamem     | Bin 5071 -> 5071 bytes
 tests/data/acpi/pc/DSDT.roothp      | Bin 5261 -> 5261 bytes
 tests/data/acpi/q35/CEDT            | Bin 0 -> 36 bytes
 tests/data/acpi/q35/DSDT            | Bin 7801 -> 7801 bytes
 tests/data/acpi/q35/DSDT.acpihmat   | Bin 9126 -> 9126 bytes
 tests/data/acpi/q35/DSDT.bridge     | Bin 7819 -> 7819 bytes
 tests/data/acpi/q35/DSDT.cphp       | Bin 8265 -> 8265 bytes
 tests/data/acpi/q35/DSDT.dimmpxm    | Bin 9455 -> 9455 bytes
 tests/data/acpi/q35/DSDT.ipmibt     | Bin 7876 -> 7876 bytes
 tests/data/acpi/q35/DSDT.memhp      | Bin 9160 -> 9160 bytes
 tests/data/acpi/q35/DSDT.mmio64     | Bin 8932 -> 8932 bytes
 tests/data/acpi/q35/DSDT.numamem    | Bin 7807 -> 7807 bytes
 tests/qtest/cxl-test.c              |  93 ++++++
 tests/qtest/meson.build             |   4 +
 64 files changed, 3004 insertions(+), 30 deletions(-)
 create mode 100644 hw/acpi/cxl.c
 create mode 100644 hw/cxl/Kconfig
 create mode 100644 hw/cxl/cxl-component-utils.c
 create mode 100644 hw/cxl/cxl-device-utils.c
 create mode 100644 hw/cxl/cxl-mailbox-utils.c
 create mode 100644 hw/cxl/meson.build
 create mode 100644 hw/mem/cxl_type3.c
 create mode 100644 hw/pci-bridge/cxl_root_port.c
 create mode 100644 include/hw/acpi/cxl.h
 create mode 100644 include/hw/cxl/cxl.h
 create mode 100644 include/hw/cxl/cxl_component.h
 create mode 100644 include/hw/cxl/cxl_device.h
 create mode 100644 include/hw/cxl/cxl_pci.h
 create mode 100644 tests/data/acpi/pc/CEDT
 create mode 100644 tests/data/acpi/q35/CEDT
 create mode 100644 tests/qtest/cxl-test.c

-- 
2.30.0



* [RFC PATCH v3 01/31] hw/pci/cxl: Add a CXL component type (interface)
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

A CXL component is a hardware entity that implements CXL component
registers from the CXL 2.0 spec (8.2.3). Currently these represent 3
general types:
1. Host Bridge
2. Ports (root, upstream, downstream)
3. Devices (memory, other)

A CXL component can be conceptually thought of as a PCIe device with
extra functionality when enumerated and enabled. For this reason, CXL
support adds on to existing PCI code paths here, and will continue to do so.

Host bridges will typically need to be handled specially and so they can
implement this newly introduced interface or not. All other components
should implement this interface. Implementing this interface allows the
core pci code to treat these devices as special where appropriate.
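
As a rough illustration of the mechanism, the standalone sketch below models
how the realize path marks a device once it advertises the CXL interface. Only
the bit number (10) and the QEMU_PCIE_CAP_CXL name come from this patch;
`model_pci_dev`, `model_realize`, and `model_is_cxl` are hypothetical stand-ins
rather than QEMU APIs, and the boolean parameter stands in for the
object_class_dynamic_cast() check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit number and flag name mirror the patch; everything else is hypothetical. */
#define QEMU_PCIE_CXL_BITNR 10
#define QEMU_PCIE_CAP_CXL (1u << QEMU_PCIE_CXL_BITNR)

/* Hypothetical stand-in for the part of PCIDevice that matters here. */
struct model_pci_dev {
    uint32_t cap_present;
};

/*
 * Models the hunk added to pci_qdev_realize(): implements_cxl stands in for
 * the result of object_class_dynamic_cast(klass, INTERFACE_CXL_DEVICE).
 */
static void model_realize(struct model_pci_dev *dev, bool implements_cxl)
{
    assert(dev != NULL);
    if (implements_cxl) {
        dev->cap_present |= QEMU_PCIE_CAP_CXL;
    }
}

/* Core code can later test the flag to treat the device specially. */
static bool model_is_cxl(const struct model_pci_dev *dev)
{
    assert(dev != NULL);
    return (dev->cap_present & QEMU_PCIE_CAP_CXL) != 0;
}
```

Other capability bits are left untouched by the OR, so a device that does not
implement the interface keeps whatever flags it already had.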

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/pci/pci.c         | 10 ++++++++++
 include/hw/pci/pci.h |  8 ++++++++
 2 files changed, 18 insertions(+)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 512e9042ff..a45ca326ed 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -194,6 +194,11 @@ static const TypeInfo pci_bus_info = {
     .class_init = pci_bus_class_init,
 };
 
+static const TypeInfo cxl_interface_info = {
+    .name          = INTERFACE_CXL_DEVICE,
+    .parent        = TYPE_INTERFACE,
+};
+
 static const TypeInfo pcie_interface_info = {
     .name          = INTERFACE_PCIE_DEVICE,
     .parent        = TYPE_INTERFACE,
@@ -2091,6 +2096,10 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
         pci_dev->cap_present |= QEMU_PCI_CAP_EXPRESS;
     }
 
+    if (object_class_dynamic_cast(klass, INTERFACE_CXL_DEVICE)) {
+        pci_dev->cap_present |= QEMU_PCIE_CAP_CXL;
+    }
+
     pci_dev = do_pci_register_device(pci_dev,
                                      object_get_typename(OBJECT(qdev)),
                                      pci_dev->devfn, errp);
@@ -2817,6 +2826,7 @@ static void pci_register_types(void)
     type_register_static(&pci_bus_info);
     type_register_static(&pcie_bus_info);
     type_register_static(&conventional_pci_interface_info);
+    type_register_static(&cxl_interface_info);
     type_register_static(&pcie_interface_info);
     type_register_static(&pci_device_type_info);
 }
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 66db08462f..528cef341c 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -195,6 +195,8 @@ enum {
     QEMU_PCIE_LNKSTA_DLLLA = (1 << QEMU_PCIE_LNKSTA_DLLLA_BITNR),
 #define QEMU_PCIE_EXTCAP_INIT_BITNR 9
     QEMU_PCIE_EXTCAP_INIT = (1 << QEMU_PCIE_EXTCAP_INIT_BITNR),
+#define QEMU_PCIE_CXL_BITNR 10
+    QEMU_PCIE_CAP_CXL = (1 << QEMU_PCIE_CXL_BITNR),
 };
 
 #define TYPE_PCI_DEVICE "pci-device"
@@ -202,6 +204,12 @@ typedef struct PCIDeviceClass PCIDeviceClass;
 DECLARE_OBJ_CHECKERS(PCIDevice, PCIDeviceClass,
                      PCI_DEVICE, TYPE_PCI_DEVICE)
 
+/*
+ * Implemented by devices that can be plugged on CXL buses. In the spec, this is
+ * actually a "CXL Component", but we name it device to match the PCI naming.
+ */
+#define INTERFACE_CXL_DEVICE "cxl-device"
+
 /* Implemented by devices that can be plugged on PCI Express buses */
 #define INTERFACE_PCIE_DEVICE "pci-express-device"
 
-- 
2.30.0



* [RFC PATCH v3 02/31] hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5)
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

A CXL 2.0 component is any entity in the CXL topology. All components
have an analogous function in PCIe. Except for the CXL host bridge, all
have a PCIe config space that is accessible via the common PCIe
mechanisms. CXL components are enumerated via DVSEC fields in the
extended PCIe header space. CXL components will minimally implement some
subset of CXL.mem and CXL.cache registers defined in 8.2.5 of the CXL
2.0 specification. Two headers and a utility library are introduced to
support the minimum functionality needed to enumerate components.

The cxl_pci header manages bits associated with PCI, specifically the
DVSEC and related fields. The cxl_component.h variant has data
structures and APIs that are useful for drivers implementing any of the
CXL 2.0 components. The library takes care of making use of the DVSEC
bits and the CXL.[mem|cache] registers. Per spec, the registers are
little endian.

None of the mechanisms required to enumerate a CXL-capable host bridge
are introduced at this point.

Note that the CXL.mem and CXL.cache registers used are always 4B wide.
It's possible in the future that this constraint will not hold.
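
The 4-byte constraint can be pictured with the standalone sketch below. It
models a register block stored as an array of 32-bit words and indexed by
`offset / 4`, as the read and write callbacks in this patch do. The names
`model_regs`, `model_reg_read`, `model_reg_write`, and the 16-register size are
hypothetical. Unlike the patch, which asserts on the access size, the sketch
rejects other sizes (and, as an added assumption, unaligned or out-of-range
offsets) by returning false so that the behaviour is easy to exercise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NUM_REGS 16 /* hypothetical size, not taken from the spec */

/* Hypothetical register block: one 32-bit word per register. */
struct model_regs {
    uint32_t words[MODEL_NUM_REGS];
};

/* Accept only aligned, in-range, 4-byte accesses. */
static bool model_access_ok(uint64_t offset, unsigned size)
{
    return size == 4 && (offset % 4) == 0 &&
           offset / 4 < MODEL_NUM_REGS;
}

static bool model_reg_write(struct model_regs *r, uint64_t offset,
                            uint64_t value, unsigned size)
{
    assert(r != NULL);
    if (!model_access_ok(offset, size)) {
        return false;
    }
    r->words[offset / 4] = (uint32_t)value; /* upper 32 bits are dropped */
    return true;
}

static bool model_reg_read(const struct model_regs *r, uint64_t offset,
                           unsigned size, uint64_t *out)
{
    assert(r != NULL && out != NULL);
    if (!model_access_ok(offset, size)) {
        return false;
    }
    *out = r->words[offset / 4];
    return true;
}
```

If wider registers are needed later, a model like this would have to change
both the storage type and the size check, which is the constraint the note
above refers to.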

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 MAINTAINERS                    |   6 +
 hw/Kconfig                     |   1 +
 hw/cxl/Kconfig                 |   3 +
 hw/cxl/cxl-component-utils.c   | 208 +++++++++++++++++++++++++++++++++
 hw/cxl/meson.build             |   3 +
 hw/meson.build                 |   1 +
 include/hw/cxl/cxl.h           |  17 +++
 include/hw/cxl/cxl_component.h | 187 +++++++++++++++++++++++++++++
 include/hw/cxl/cxl_pci.h       | 138 ++++++++++++++++++++++
 9 files changed, 564 insertions(+)
 create mode 100644 hw/cxl/Kconfig
 create mode 100644 hw/cxl/cxl-component-utils.c
 create mode 100644 hw/cxl/meson.build
 create mode 100644 include/hw/cxl/cxl.h
 create mode 100644 include/hw/cxl/cxl_component.h
 create mode 100644 include/hw/cxl/cxl_pci.h

diff --git a/MAINTAINERS b/MAINTAINERS
index bcd88668bc..981dc92e25 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2234,6 +2234,12 @@ F: qapi/block*.json
 F: qapi/transaction.json
 T: git https://repo.or.cz/qemu/armbru.git block-next
 
+Compute Express Link
+M: Ben Widawsky <ben.widawsky@intel.com>
+S: Supported
+F: hw/cxl/
+F: include/hw/cxl/
+
 Dirty Bitmaps
 M: Eric Blake <eblake@redhat.com>
 M: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
diff --git a/hw/Kconfig b/hw/Kconfig
index 5ad3c6b5a4..c03650c5ed 100644
--- a/hw/Kconfig
+++ b/hw/Kconfig
@@ -6,6 +6,7 @@ source audio/Kconfig
 source block/Kconfig
 source char/Kconfig
 source core/Kconfig
+source cxl/Kconfig
 source display/Kconfig
 source dma/Kconfig
 source gpio/Kconfig
diff --git a/hw/cxl/Kconfig b/hw/cxl/Kconfig
new file mode 100644
index 0000000000..8e67519b16
--- /dev/null
+++ b/hw/cxl/Kconfig
@@ -0,0 +1,3 @@
+config CXL
+    bool
+    default y if PCI_EXPRESS
diff --git a/hw/cxl/cxl-component-utils.c b/hw/cxl/cxl-component-utils.c
new file mode 100644
index 0000000000..8d56ad5c7d
--- /dev/null
+++ b/hw/cxl/cxl-component-utils.c
@@ -0,0 +1,208 @@
+/*
+ * CXL Utility library for components
+ *
+ * Copyright(C) 2020 Intel Corporation.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See the
+ * COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "hw/pci/pci.h"
+#include "hw/cxl/cxl.h"
+
+static uint64_t cxl_cache_mem_read_reg(void *opaque, hwaddr offset,
+                                       unsigned size)
+{
+    CXLComponentState *cxl_cstate = opaque;
+    ComponentRegisters *cregs = &cxl_cstate->crb;
+
+    assert(size == 4);
+
+    if (cregs->special_ops && cregs->special_ops->read) {
+        return cregs->special_ops->read(cxl_cstate, offset, size);
+    } else {
+        return cregs->cache_mem_registers[offset / 4];
+    }
+}
+
+static void cxl_cache_mem_write_reg(void *opaque, hwaddr offset, uint64_t value,
+                                    unsigned size)
+{
+    CXLComponentState *cxl_cstate = opaque;
+    ComponentRegisters *cregs = &cxl_cstate->crb;
+
+    assert(size == 4);
+
+    if (cregs->special_ops && cregs->special_ops->write) {
+        cregs->special_ops->write(cxl_cstate, offset, value, size);
+    } else {
+        cregs->cache_mem_registers[offset / 4] = value;
+    }
+}
+
+/*
+ * 8.2.3
+ *   The access restrictions specified in Section 8.2.2 also apply to CXL 2.0
+ *   Component Registers.
+ *
+ * 8.2.2
+ *   • A 32 bit register shall be accessed as a 4 Bytes quantity. Partial
+ *   reads are not permitted.
+ *   • A 64 bit register shall be accessed as a 8 Bytes quantity. Partial
+ *   reads are not permitted.
+ *
+ * As of the spec defined today, only 4 byte registers exist.
+ */
+static const MemoryRegionOps cache_mem_ops = {
+    .read = cxl_cache_mem_read_reg,
+    .write = cxl_cache_mem_write_reg,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .valid = {
+        .min_access_size = 4,
+        .max_access_size = 4,
+        .unaligned = false,
+    },
+    .impl = {
+        .min_access_size = 4,
+        .max_access_size = 4,
+    },
+};
+
+void cxl_component_register_block_init(Object *obj,
+                                       CXLComponentState *cxl_cstate,
+                                       const char *type)
+{
+    ComponentRegisters *cregs = &cxl_cstate->crb;
+
+    memory_region_init(&cregs->component_registers, obj, type,
+                       CXL2_COMPONENT_BLOCK_SIZE);
+
+    /* The io registers control the link, which we don't care about in QEMU */
+    memory_region_init_io(&cregs->io, obj, NULL, cregs, ".io",
+                          CXL2_COMPONENT_IO_REGION_SIZE);
+    memory_region_init_io(&cregs->cache_mem, obj, &cache_mem_ops, cregs,
+                          ".cache_mem", CXL2_COMPONENT_CM_REGION_SIZE);
+
+    memory_region_add_subregion(&cregs->component_registers, 0, &cregs->io);
+    memory_region_add_subregion(&cregs->component_registers,
+                                CXL2_COMPONENT_IO_REGION_SIZE,
+                                &cregs->cache_mem);
+}
+
+static void ras_init_common(uint32_t *reg_state)
+{
+    reg_state[R_CXL_RAS_UNC_ERR_STATUS] = 0;
+    reg_state[R_CXL_RAS_UNC_ERR_MASK] = 0x1efff;
+    reg_state[R_CXL_RAS_UNC_ERR_SEVERITY] = 0x1efff;
+    reg_state[R_CXL_RAS_COR_ERR_STATUS] = 0;
+    reg_state[R_CXL_RAS_COR_ERR_MASK] = 0x3f;
+    reg_state[R_CXL_RAS_ERR_CAP_CTRL] = 0; /* CXL switches and devices must set */
+}
+
+static void hdm_init_common(uint32_t *reg_state)
+{
+    ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY, DECODER_COUNT, 0);
+    ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_GLOBAL_CONTROL, HDM_DECODER_ENABLE, 0);
+}
+
+void cxl_component_register_init_common(uint32_t *reg_state, enum reg_type type)
+{
+    int caps = 0;
+    switch (type) {
+    case CXL2_DOWNSTREAM_PORT:
+    case CXL2_DEVICE:
+        /* CAP, RAS, Link */
+        caps = 2;
+        break;
+    case CXL2_UPSTREAM_PORT:
+    case CXL2_TYPE3_DEVICE:
+    case CXL2_LOGICAL_DEVICE:
+        /* + HDM */
+        caps = 3;
+        break;
+    case CXL2_ROOT_PORT:
+        /* + Extended Security, + Snoop */
+        caps = 5;
+        break;
+    default:
+        abort();
+    }
+
+    memset(reg_state, 0, 0x1000);
+
+    /* CXL Capability Header Register */
+    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ID, 1);
+    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, VERSION, 1);
+    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, CACHE_MEM_VERSION, 1);
+    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ARRAY_SIZE, caps);
+
+
+#define init_cap_reg(reg, id, version)                                        \
+    _Static_assert(CXL_##reg##_REGISTERS_OFFSET != 0, "Invalid cap offset");  \
+    do {                                                                      \
+        int which = R_CXL_##reg##_CAPABILITY_HEADER;                          \
+        reg_state[which] = FIELD_DP32(reg_state[which],                       \
+                                      CXL_##reg##_CAPABILITY_HEADER, ID, id); \
+        reg_state[which] =                                                    \
+            FIELD_DP32(reg_state[which], CXL_##reg##_CAPABILITY_HEADER,       \
+                       VERSION, version);                                     \
+        reg_state[which] =                                                    \
+            FIELD_DP32(reg_state[which], CXL_##reg##_CAPABILITY_HEADER, PTR,  \
+                       CXL_##reg##_REGISTERS_OFFSET);                         \
+    } while (0)
+
+    init_cap_reg(RAS, 2, 1);
+    ras_init_common(reg_state);
+
+    init_cap_reg(LINK, 4, 2);
+
+    if (caps < 3) {
+        return;
+    }
+
+    init_cap_reg(HDM, 5, 1);
+    hdm_init_common(reg_state);
+
+    if (caps < 5) {
+        return;
+    }
+
+    init_cap_reg(EXTSEC, 6, 1);
+    init_cap_reg(SNOOP, 8, 1);
+
+#undef init_cap_reg
+}
+
+/*
+ * Helper to create a DVSEC header for a CXL entity. The caller is responsible
+ * for tracking the valid offset.
+ *
+ * This function will build the DVSEC header on behalf of the caller and then
+ * copy in the remaining data for the vendor specific bits.
+ */
+void cxl_component_create_dvsec(CXLComponentState *cxl, uint16_t length,
+                                uint16_t type, uint8_t rev, uint8_t *body)
+{
+    PCIDevice *pdev = cxl->pdev;
+    uint16_t offset = cxl->dvsec_offset;
+
+    assert(offset >= PCI_CFG_SPACE_SIZE &&
+           ((offset + length) < PCI_CFG_SPACE_EXP_SIZE));
+    assert((length & 0xf000) == 0);
+    assert((rev & ~0xf) == 0);
+
+    /* Create the DVSEC in the MCFG space */
+    pcie_add_capability(pdev, PCI_EXT_CAP_ID_DVSEC, 1, offset, length);
+    pci_set_long(pdev->config + offset + PCIE_DVSEC_HEADER1_OFFSET,
+                 (length << 20) | (rev << 16) | CXL_VENDOR_ID);
+    pci_set_word(pdev->config + offset + PCIE_DVSEC_ID_OFFSET, type);
+    memcpy(pdev->config + offset + sizeof(struct dvsec_header),
+           body + sizeof(struct dvsec_header),
+           length - sizeof(struct dvsec_header));
+
+    /* Update state for future DVSEC additions */
+    range_init_nofail(&cxl->dvsecs[type], cxl->dvsec_offset, length);
+    cxl->dvsec_offset += length;
+}
diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build
new file mode 100644
index 0000000000..00c3876a0f
--- /dev/null
+++ b/hw/cxl/meson.build
@@ -0,0 +1,3 @@
+softmmu_ss.add(when: 'CONFIG_CXL', if_true: files(
+  'cxl-component-utils.c',
+))
diff --git a/hw/meson.build b/hw/meson.build
index 010de7219c..3e440c341a 100644
--- a/hw/meson.build
+++ b/hw/meson.build
@@ -6,6 +6,7 @@ subdir('block')
 subdir('char')
 subdir('core')
 subdir('cpu')
+subdir('cxl')
 subdir('display')
 subdir('dma')
 subdir('gpio')
diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
new file mode 100644
index 0000000000..55f6cc30a5
--- /dev/null
+++ b/include/hw/cxl/cxl.h
@@ -0,0 +1,17 @@
+/*
+ * QEMU CXL Support
+ *
+ * Copyright (c) 2020 Intel
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See the
+ * COPYING file in the top-level directory.
+ */
+
+#ifndef CXL_H
+#define CXL_H
+
+#include "cxl_pci.h"
+#include "cxl_component.h"
+
+#endif
+
diff --git a/include/hw/cxl/cxl_component.h b/include/hw/cxl/cxl_component.h
new file mode 100644
index 0000000000..762feb54da
--- /dev/null
+++ b/include/hw/cxl/cxl_component.h
@@ -0,0 +1,187 @@
+/*
+ * QEMU CXL Component
+ *
+ * Copyright (c) 2020 Intel
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See the
+ * COPYING file in the top-level directory.
+ */
+
+#ifndef CXL_COMPONENT_H
+#define CXL_COMPONENT_H
+
+/* CXL 2.0 - 8.2.4 */
+#define CXL2_COMPONENT_IO_REGION_SIZE 0x1000
+#define CXL2_COMPONENT_CM_REGION_SIZE 0x1000
+#define CXL2_COMPONENT_BLOCK_SIZE 0x10000
+
+#include "qemu/range.h"
+#include "qemu/typedefs.h"
+#include "hw/register.h"
+
+enum reg_type {
+    CXL2_DEVICE,
+    CXL2_TYPE3_DEVICE,
+    CXL2_LOGICAL_DEVICE,
+    CXL2_ROOT_PORT,
+    CXL2_UPSTREAM_PORT,
+    CXL2_DOWNSTREAM_PORT
+};
+
+/*
+ * Capability registers are defined at the top of the CXL.cache/mem region and
+ * are packed. For our purposes we will always define the caps in the same
+ * order.
+ * CXL 2.0 - 8.2.5 Table 142 for details.
+ */
+
+/* CXL 2.0 - 8.2.5.1 */
+REG32(CXL_CAPABILITY_HEADER, 0)
+    FIELD(CXL_CAPABILITY_HEADER, ID, 0, 16)
+    FIELD(CXL_CAPABILITY_HEADER, VERSION, 16, 4)
+    FIELD(CXL_CAPABILITY_HEADER, CACHE_MEM_VERSION, 20, 4)
+    FIELD(CXL_CAPABILITY_HEADER, ARRAY_SIZE, 24, 8)
+
+#define CXLx_CAPABILITY_HEADER(type, offset)                  \
+    REG32(CXL_##type##_CAPABILITY_HEADER, offset)             \
+        FIELD(CXL_##type##_CAPABILITY_HEADER, ID, 0, 16)      \
+        FIELD(CXL_##type##_CAPABILITY_HEADER, VERSION, 16, 4) \
+        FIELD(CXL_##type##_CAPABILITY_HEADER, PTR, 20, 12)
+CXLx_CAPABILITY_HEADER(RAS, 0x4)
+CXLx_CAPABILITY_HEADER(LINK, 0x8)
+CXLx_CAPABILITY_HEADER(HDM, 0xc)
+CXLx_CAPABILITY_HEADER(EXTSEC, 0x10)
+CXLx_CAPABILITY_HEADER(SNOOP, 0x14)
+
+/*
+ * Capability structures contain the actual registers that the CXL component
+ * implements. Some of these are specific to certain types of components, but
+ * this implementation leaves enough space regardless.
+ */
+/* 8.2.5.9 - CXL RAS Capability Structure */
+#define CXL_RAS_REGISTERS_OFFSET 0x80 /* Give ample space for caps before this */
+#define CXL_RAS_REGISTERS_SIZE   0x58
+REG32(CXL_RAS_UNC_ERR_STATUS, CXL_RAS_REGISTERS_OFFSET)
+REG32(CXL_RAS_UNC_ERR_MASK, CXL_RAS_REGISTERS_OFFSET + 0x4)
+REG32(CXL_RAS_UNC_ERR_SEVERITY, CXL_RAS_REGISTERS_OFFSET + 0x8)
+REG32(CXL_RAS_COR_ERR_STATUS, CXL_RAS_REGISTERS_OFFSET + 0xc)
+REG32(CXL_RAS_COR_ERR_MASK, CXL_RAS_REGISTERS_OFFSET + 0x10)
+REG32(CXL_RAS_ERR_CAP_CTRL, CXL_RAS_REGISTERS_OFFSET + 0x14)
+/* Offset 0x18 - 0x58 reserved for RAS logs */
+
+/* 8.2.5.10 - CXL Security Capability Structure */
+#define CXL_SEC_REGISTERS_OFFSET (CXL_RAS_REGISTERS_OFFSET + CXL_RAS_REGISTERS_SIZE)
+#define CXL_SEC_REGISTERS_SIZE   0 /* We don't implement 1.1 downstream ports */
+
+/* 8.2.5.11 - CXL Link Capability Structure */
+#define CXL_LINK_REGISTERS_OFFSET (CXL_SEC_REGISTERS_OFFSET + CXL_SEC_REGISTERS_SIZE)
+#define CXL_LINK_REGISTERS_SIZE   0x38
+
+/* 8.2.5.12 - CXL HDM Decoder Capability Structure */
+#define HDM_DECODE_MAX 10 /* 8.2.5.12.1 */
+#define CXL_HDM_REGISTERS_OFFSET \
+    (CXL_LINK_REGISTERS_OFFSET + CXL_LINK_REGISTERS_SIZE) /* 8.2.5.12 */
+#define CXL_HDM_REGISTERS_SIZE (0x20 + HDM_DECODE_MAX * 10)
+#define HDM_DECODER_INIT(n)                                                    \
+  REG32(CXL_HDM_DECODER##n##_BASE_LO,                                          \
+        CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x10)                          \
+            FIELD(CXL_HDM_DECODER##n##_BASE_LO, L, 28, 4)                      \
+  REG32(CXL_HDM_DECODER##n##_BASE_HI,                                          \
+        CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x14)                          \
+  REG32(CXL_HDM_DECODER##n##_SIZE_LO,                                          \
+        CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x18)                          \
+  REG32(CXL_HDM_DECODER##n##_SIZE_HI,                                          \
+        CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x1C)                          \
+  REG32(CXL_HDM_DECODER##n##_CTRL,                                             \
+        CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x20)                          \
+            FIELD(CXL_HDM_DECODER##n##_CTRL, IG, 0, 4)                         \
+            FIELD(CXL_HDM_DECODER##n##_CTRL, IW, 4, 4)                         \
+            FIELD(CXL_HDM_DECODER##n##_CTRL, LOCK_ON_COMMIT, 8, 1)             \
+            FIELD(CXL_HDM_DECODER##n##_CTRL, COMMIT, 9, 1)                     \
+            FIELD(CXL_HDM_DECODER##n##_CTRL, COMMITTED, 10, 1)                 \
+            FIELD(CXL_HDM_DECODER##n##_CTRL, ERROR, 11, 1)                     \
+            FIELD(CXL_HDM_DECODER##n##_CTRL, TYPE, 12, 1)                      \
+  REG32(CXL_HDM_DECODER##n##_TARGET_LIST_LO, CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x24) \
+  REG32(CXL_HDM_DECODER##n##_TARGET_LIST_HI, CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x28)
+
+REG32(CXL_HDM_DECODER_CAPABILITY, CXL_HDM_REGISTERS_OFFSET)
+    FIELD(CXL_HDM_DECODER_CAPABILITY, DECODER_COUNT, 0, 4)
+    FIELD(CXL_HDM_DECODER_CAPABILITY, TARGET_COUNT, 4, 4)
+    FIELD(CXL_HDM_DECODER_CAPABILITY, INTERLEAVE_256B, 8, 1)
+    FIELD(CXL_HDM_DECODER_CAPABILITY, INTERLEAVE_4K, 9, 1)
+    FIELD(CXL_HDM_DECODER_CAPABILITY, POISON_ON_ERR_CAP, 10, 1)
+REG32(CXL_HDM_DECODER_GLOBAL_CONTROL, CXL_HDM_REGISTERS_OFFSET + 4)
+    FIELD(CXL_HDM_DECODER_GLOBAL_CONTROL, POISON_ON_ERR_EN, 0, 1)
+    FIELD(CXL_HDM_DECODER_GLOBAL_CONTROL, HDM_DECODER_ENABLE, 1, 1)
+
+HDM_DECODER_INIT(0);
+
+/* 8.2.5.13 - CXL Extended Security Capability Structure (Root complex only) */
+#define EXTSEC_ENTRY_MAX        256
+#define CXL_EXTSEC_REGISTERS_OFFSET (CXL_HDM_REGISTERS_OFFSET + CXL_HDM_REGISTERS_SIZE)
+#define CXL_EXTSEC_REGISTERS_SIZE   (8 * EXTSEC_ENTRY_MAX + 4)
+
+/* 8.2.5.14 - CXL IDE Capability Structure */
+#define CXL_IDE_REGISTERS_OFFSET (CXL_EXTSEC_REGISTERS_OFFSET + CXL_EXTSEC_REGISTERS_SIZE)
+#define CXL_IDE_REGISTERS_SIZE   0
+
+/* 8.2.5.15 - CXL Snoop Filter Capability Structure */
+#define CXL_SNOOP_REGISTERS_OFFSET (CXL_IDE_REGISTERS_OFFSET + CXL_IDE_REGISTERS_SIZE)
+#define CXL_SNOOP_REGISTERS_SIZE   0x8
+
+_Static_assert((CXL_SNOOP_REGISTERS_OFFSET + CXL_SNOOP_REGISTERS_SIZE) < 0x1000,
+               "No space for registers");
+
+typedef struct component_registers {
+    /*
+     * Main memory region to be registered with QEMU core.
+     */
+    MemoryRegion component_registers;
+
+    /*
+     * 8.2.4 Table 141:
+     *   0x0000 - 0x0fff CXL.io registers
+     *   0x1000 - 0x1fff CXL.cache and CXL.mem
+     *   0x2000 - 0xdfff Implementation specific
+     *   0xe000 - 0xe3ff CXL ARB/MUX registers
+     *   0xe400 - 0xffff RSVD
+     */
+    uint32_t io_registers[CXL2_COMPONENT_IO_REGION_SIZE >> 2];
+    MemoryRegion io;
+
+    uint32_t cache_mem_registers[CXL2_COMPONENT_CM_REGION_SIZE >> 2];
+    MemoryRegion cache_mem;
+
+    MemoryRegion impl_specific;
+    MemoryRegion arb_mux;
+    MemoryRegion rsvd;
+
+    /* special_ops is used for any component that needs any specific handling */
+    MemoryRegionOps *special_ops;
+} ComponentRegisters;
+
+/*
+ * A CXL component represents any entity in a CXL hierarchy. This includes
+ * host bridges, root ports, upstream/downstream switch ports, and devices.
+ */
+typedef struct cxl_component {
+    ComponentRegisters crb;
+    union {
+        struct {
+            Range dvsecs[CXL20_MAX_DVSEC];
+            uint16_t dvsec_offset;
+            struct PCIDevice *pdev;
+        };
+    };
+} CXLComponentState;
+
+void cxl_component_register_block_init(Object *obj,
+                                       CXLComponentState *cxl_cstate,
+                                       const char *type);
+void cxl_component_register_init_common(uint32_t *reg_state,
+                                        enum reg_type type);
+
+void cxl_component_create_dvsec(CXLComponentState *cxl_cstate, uint16_t length,
+                                uint16_t type, uint8_t rev, uint8_t *body);
+
+#endif
diff --git a/include/hw/cxl/cxl_pci.h b/include/hw/cxl/cxl_pci.h
new file mode 100644
index 0000000000..a53c2e5ae7
--- /dev/null
+++ b/include/hw/cxl/cxl_pci.h
@@ -0,0 +1,138 @@
+/*
+ * QEMU CXL PCI interfaces
+ *
+ * Copyright (c) 2020 Intel
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See the
+ * COPYING file in the top-level directory.
+ */
+
+#ifndef CXL_PCI_H
+#define CXL_PCI_H
+
+#include "hw/pci/pci.h"
+#include "hw/pci/pcie.h"
+
+#define CXL_VENDOR_ID 0x1e98
+
+#define PCIE_DVSEC_HEADER1_OFFSET 0x4 /* Offset from start of extended cap */
+#define PCIE_DVSEC_ID_OFFSET 0x8
+
+#define PCIE_CXL_DEVICE_DVSEC_LENGTH 0x38
+#define PCIE_CXL1_DEVICE_DVSEC_REVID 0
+#define PCIE_CXL2_DEVICE_DVSEC_REVID 1
+
+#define EXTENSIONS_PORT_DVSEC_LENGTH 0x28
+#define EXTENSIONS_PORT_DVSEC_REVID 0
+
+#define GPF_PORT_DVSEC_LENGTH 0x10
+#define GPF_PORT_DVSEC_REVID  0
+
+#define PCIE_FLEXBUS_PORT_DVSEC_LENGTH_2_0 0x14
+#define PCIE_FLEXBUS_PORT_DVSEC_REVID_2_0  1
+
+#define REG_LOC_DVSEC_LENGTH 0x24
+#define REG_LOC_DVSEC_REVID  0
+
+enum {
+    PCIE_CXL_DEVICE_DVSEC      = 0,
+    NON_CXL_FUNCTION_MAP_DVSEC = 2,
+    EXTENSIONS_PORT_DVSEC      = 3,
+    GPF_PORT_DVSEC             = 4,
+    GPF_DEVICE_DVSEC           = 5,
+    PCIE_FLEXBUS_PORT_DVSEC    = 7,
+    REG_LOC_DVSEC              = 8,
+    MLD_DVSEC                  = 9,
+    CXL20_MAX_DVSEC
+};
+
+struct dvsec_header {
+    uint32_t cap_hdr;
+    uint32_t dv_hdr1;
+    uint16_t dv_hdr2;
+} __attribute__((__packed__));
+_Static_assert(sizeof(struct dvsec_header) == 10,
+               "dvsec header size incorrect");
+
+/*
+ * CXL 2.0 devices must implement certain DVSEC IDs, and can [optionally]
+ * implement others.
+ *
+ * CXL 2.0 Device: 0, [2], 5, 8
+ * CXL 2.0 RP: 3, 4, 7, 8
+ * CXL 2.0 Upstream Port: [2], 7, 8
+ * CXL 2.0 Downstream Port: 3, 4, 7, 8
+ */
+
+/* CXL 2.0 - 8.1.5 (ID 0003) */
+struct extensions_dvsec_port {
+    struct dvsec_header hdr;
+    uint16_t status;
+    uint16_t control;
+    uint8_t alt_bus_base;
+    uint8_t alt_bus_limit;
+    uint16_t alt_memory_base;
+    uint16_t alt_memory_limit;
+    uint16_t alt_prefetch_base;
+    uint16_t alt_prefetch_limit;
+    uint32_t alt_prefetch_base_high;
+    uint32_t alt_prefetch_base_low;
+    uint32_t rcrb_base;
+    uint32_t rcrb_base_high;
+};
+_Static_assert(sizeof(struct extensions_dvsec_port) == 0x28,
+               "extensions dvsec port size incorrect");
+#define PORT_CONTROL_OVERRIDE_OFFSET 0xc
+#define PORT_CONTROL_UNMASK_SBR      1
+#define PORT_CONTROL_ALT_MEMID_EN    4
+
+/* CXL 2.0 - 8.1.6 GPF DVSEC (ID 0004) */
+struct dvsec_port_gpf {
+    struct dvsec_header hdr;
+    uint16_t rsvd;
+    uint16_t phase1_ctrl;
+    uint16_t phase2_ctrl;
+};
+_Static_assert(sizeof(struct dvsec_port_gpf) == 0x10,
+               "dvsec port GPF size incorrect");
+
+/* CXL 2.0 - 8.1.8/8.2.1.3 Flexbus DVSEC (ID 0007) */
+struct dvsec_port_flexbus {
+    struct dvsec_header hdr;
+    uint16_t cap;
+    uint16_t ctrl;
+    uint16_t status;
+    uint32_t rcvd_mod_ts_data;
+};
+_Static_assert(sizeof(struct dvsec_port_flexbus) == 0x14,
+               "dvsec port flexbus size incorrect");
+
+/* CXL 2.0 - 8.1.9 Register Locator DVSEC (ID 0008) */
+struct dvsec_register_locator {
+    struct dvsec_header hdr;
+    uint16_t rsvd;
+    uint32_t reg0_base_lo;
+    uint32_t reg0_base_hi;
+    uint32_t reg1_base_lo;
+    uint32_t reg1_base_hi;
+    uint32_t reg2_base_lo;
+    uint32_t reg2_base_hi;
+};
+_Static_assert(sizeof(struct dvsec_register_locator) == 0x24,
+               "dvsec register locator size incorrect");
+
+/* BAR Equivalence Indicator */
+#define BEI_BAR_10H 0
+#define BEI_BAR_14H 1
+#define BEI_BAR_18H 2
+#define BEI_BAR_1cH 3
+#define BEI_BAR_20H 4
+#define BEI_BAR_24H 5
+
+/* Register Block Identifier */
+#define RBI_EMPTY          0
+#define RBI_COMPONENT_REG  (1 << 8)
+#define RBI_BAR_VIRT_ACL   (2 << 8)
+#define RBI_CXL_DEVICE_REG (3 << 8)
+
+#endif
-- 
2.30.0
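
As a reading aid for cxl_component_create_dvsec(), the DVSEC header 1
value it writes can be packed and unpacked as below. This is a standalone
sketch under stated assumptions: the helper names are hypothetical and do
not exist in QEMU. The vendor ID and the shifts are the ones used in the
patch (vendor ID in bits 15:0, revision in bits 19:16, length in bits
31:20).

```c
#include <assert.h>
#include <stdint.h>

#define CXL_VENDOR_ID 0x1e98

/* Pack DVSEC header 1 from a 12-bit length and a 4-bit revision. */
static uint32_t dvsec_header1_pack(uint16_t length, uint8_t rev)
{
    assert((length & 0xf000) == 0); /* length must fit in 12 bits */
    assert((rev & ~0xf) == 0);      /* revision must fit in 4 bits */
    return ((uint32_t)length << 20) | ((uint32_t)rev << 16) | CXL_VENDOR_ID;
}

/* Hypothetical accessors that undo the packing above. */
static uint16_t dvsec_header1_vendor(uint32_t hdr1)
{
    return hdr1 & 0xffff;
}

static uint8_t dvsec_header1_rev(uint32_t hdr1)
{
    return (hdr1 >> 16) & 0xf;
}

static uint16_t dvsec_header1_length(uint32_t hdr1)
{
    return hdr1 >> 20;
}
```

For example, the CXL 2.0 device DVSEC (length 0x38, revision 1) packs to
0x03811e98.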



* [RFC PATCH v3 02/31] hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5)
@ 2021-02-02  0:59   ` Ben Widawsky
  0 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, David Hildenbrand, Vishal Verma,
	John Groves (jgroves),
	Chris Browy, Markus Armbruster, linux-cxl,
	Philippe Mathieu-Daudé,
	Michael S. Tsirkin, Jonathan Cameron, Igor Mammedov,
	Dan Williams, Ira Weiny

A CXL 2.0 component is any entity in the CXL topology. All components
have an analogous function in PCIe. Except for the CXL host bridge, all
have a PCIe config space that is accessible via the common PCIe
mechanisms. CXL components are enumerated via DVSEC fields in the
extended PCIe header space. CXL components will minimally implement some
subset of the CXL.mem and CXL.cache registers defined in 8.2.5 of the
CXL 2.0 specification. Two headers and a utility library are introduced
to support the minimum functionality needed to enumerate components.

The cxl_pci.h header manages bits associated with PCI, specifically the
DVSEC and related fields. The cxl_component.h header has data structures
and APIs that are useful for drivers implementing any of the CXL 2.0
components. The library takes care of the DVSEC bits and the
CXL.[mem|cache] registers. Per the spec, the registers are little endian.

None of the mechanisms required to enumerate a CXL-capable host bridge
are introduced at this point.

Note that the CXL.mem and CXL.cache registers used are always 4B wide.
This constraint may not hold in the future.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
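The per-capability headers built by init_cap_reg() share one layout: ID in
bits 15:0, version in bits 19:16, and pointer in bits 31:20. The sketch
below shows that packing in isolation. It is an illustration only:
field_dp32 here is a local, simplified stand-in for QEMU's FIELD_DP32
macro, and cap_header_pack is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Deposit val into bits [shift, shift + len) of reg; assumes 0 < len < 32. */
static uint32_t field_dp32(uint32_t reg, unsigned shift, unsigned len,
                           uint32_t val)
{
    uint32_t mask = ((1u << len) - 1) << shift;
    return (reg & ~mask) | ((val << shift) & mask);
}

/* Build a capability header from its ID, version and register offset. */
static uint32_t cap_header_pack(uint16_t id, uint8_t version, uint16_t ptr)
{
    uint32_t hdr = 0;

    hdr = field_dp32(hdr, 0, 16, id);       /* ID */
    hdr = field_dp32(hdr, 16, 4, version);  /* VERSION */
    hdr = field_dp32(hdr, 20, 12, ptr);     /* PTR: offset of the registers */
    return hdr;
}
```

With the values used in this patch, the RAS header (ID 2, version 1,
registers at 0x80) comes out as 0x08010002. Values wider than a field are
truncated by the mask rather than spilling into the neighbouring field.
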
 MAINTAINERS                    |   6 +
 hw/Kconfig                     |   1 +
 hw/cxl/Kconfig                 |   3 +
 hw/cxl/cxl-component-utils.c   | 208 +++++++++++++++++++++++++++++++++
 hw/cxl/meson.build             |   3 +
 hw/meson.build                 |   1 +
 include/hw/cxl/cxl.h           |  17 +++
 include/hw/cxl/cxl_component.h | 187 +++++++++++++++++++++++++++++
 include/hw/cxl/cxl_pci.h       | 138 ++++++++++++++++++++++
 9 files changed, 564 insertions(+)
 create mode 100644 hw/cxl/Kconfig
 create mode 100644 hw/cxl/cxl-component-utils.c
 create mode 100644 hw/cxl/meson.build
 create mode 100644 include/hw/cxl/cxl.h
 create mode 100644 include/hw/cxl/cxl_component.h
 create mode 100644 include/hw/cxl/cxl_pci.h
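
The capability structures in cxl_component.h are laid out back to back,
each offset being the previous offset plus the previous size. The sketch
below repeats that chain with shortened, hypothetical macro names so the
resulting offsets can be checked by hand. The sizes are copied from the
patch as posted, and layout_end is an illustrative helper only.

```c
#include <assert.h>
#include <stdint.h>

#define RAS_OFFSET     0x80
#define RAS_SIZE       0x58
#define SEC_OFFSET     (RAS_OFFSET + RAS_SIZE)
#define SEC_SIZE       0 /* not implemented, occupies no space */
#define LINK_OFFSET    (SEC_OFFSET + SEC_SIZE)
#define LINK_SIZE      0x38
#define HDM_DECODE_MAX 10
#define HDM_OFFSET     (LINK_OFFSET + LINK_SIZE)
#define HDM_SIZE       (0x20 + HDM_DECODE_MAX * 10) /* as in the patch */
#define EXTSEC_OFFSET  (HDM_OFFSET + HDM_SIZE)
#define EXTSEC_SIZE    (8 * 256 + 4)
#define IDE_OFFSET     (EXTSEC_OFFSET + EXTSEC_SIZE)
#define IDE_SIZE       0
#define SNOOP_OFFSET   (IDE_OFFSET + IDE_SIZE)
#define SNOOP_SIZE     0x8

/* Everything must fit inside the 4 KiB CXL.cache/mem region. */
_Static_assert(SNOOP_OFFSET + SNOOP_SIZE < 0x1000, "No space for registers");

/* First byte offset past the last capability structure. */
static uint32_t layout_end(void)
{
    return SNOOP_OFFSET + SNOOP_SIZE;
}
```

With these sizes the Link registers start at 0xd8, the HDM registers at
0x110, and the last structure ends at 0x9a0, well inside the region.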

diff --git a/MAINTAINERS b/MAINTAINERS
index bcd88668bc..981dc92e25 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2234,6 +2234,12 @@ F: qapi/block*.json
 F: qapi/transaction.json
 T: git https://repo.or.cz/qemu/armbru.git block-next
 
+Compute Express Link
+M: Ben Widawsky <ben.widawsky@intel.com>
+S: Supported
+F: hw/cxl/
+F: include/hw/cxl/
+
 Dirty Bitmaps
 M: Eric Blake <eblake@redhat.com>
 M: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
diff --git a/hw/Kconfig b/hw/Kconfig
index 5ad3c6b5a4..c03650c5ed 100644
--- a/hw/Kconfig
+++ b/hw/Kconfig
@@ -6,6 +6,7 @@ source audio/Kconfig
 source block/Kconfig
 source char/Kconfig
 source core/Kconfig
+source cxl/Kconfig
 source display/Kconfig
 source dma/Kconfig
 source gpio/Kconfig
diff --git a/hw/cxl/Kconfig b/hw/cxl/Kconfig
new file mode 100644
index 0000000000..8e67519b16
--- /dev/null
+++ b/hw/cxl/Kconfig
@@ -0,0 +1,3 @@
+config CXL
+    bool
+    default y if PCI_EXPRESS
diff --git a/hw/cxl/cxl-component-utils.c b/hw/cxl/cxl-component-utils.c
new file mode 100644
index 0000000000..8d56ad5c7d
--- /dev/null
+++ b/hw/cxl/cxl-component-utils.c
@@ -0,0 +1,208 @@
+/*
+ * CXL Utility library for components
+ *
+ * Copyright(C) 2020 Intel Corporation.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See the
+ * COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "hw/pci/pci.h"
+#include "hw/cxl/cxl.h"
+
+static uint64_t cxl_cache_mem_read_reg(void *opaque, hwaddr offset,
+                                       unsigned size)
+{
+    CXLComponentState *cxl_cstate = opaque;
+    ComponentRegisters *cregs = &cxl_cstate->crb;
+
+    assert(size == 4);
+
+    if (cregs->special_ops && cregs->special_ops->read) {
+        return cregs->special_ops->read(cxl_cstate, offset, size);
+    } else {
+        return cregs->cache_mem_registers[offset / 4];
+    }
+}
+
+static void cxl_cache_mem_write_reg(void *opaque, hwaddr offset, uint64_t value,
+                                    unsigned size)
+{
+    CXLComponentState *cxl_cstate = opaque;
+    ComponentRegisters *cregs = &cxl_cstate->crb;
+
+    assert(size == 4);
+
+    if (cregs->special_ops && cregs->special_ops->write) {
+        cregs->special_ops->write(cxl_cstate, offset, value, size);
+    } else {
+        cregs->cache_mem_registers[offset / 4] = value;
+    }
+}
+
+/*
+ * 8.2.3
+ *   The access restrictions specified in Section 8.2.2 also apply to CXL 2.0
+ *   Component Registers.
+ *
+ * 8.2.2
+ *   • A 32 bit register shall be accessed as a 4 Bytes quantity. Partial
+ *   reads are not permitted.
+ *   • A 64 bit register shall be accessed as a 8 Bytes quantity. Partial
+ *   reads are not permitted.
+ *
+ * As of the spec defined today, only 4 byte registers exist.
+ */
+static const MemoryRegionOps cache_mem_ops = {
+    .read = cxl_cache_mem_read_reg,
+    .write = cxl_cache_mem_write_reg,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .valid = {
+        .min_access_size = 4,
+        .max_access_size = 4,
+        .unaligned = false,
+    },
+    .impl = {
+        .min_access_size = 4,
+        .max_access_size = 4,
+    },
+};
+
+void cxl_component_register_block_init(Object *obj,
+                                       CXLComponentState *cxl_cstate,
+                                       const char *type)
+{
+    ComponentRegisters *cregs = &cxl_cstate->crb;
+
+    memory_region_init(&cregs->component_registers, obj, type,
+                       CXL2_COMPONENT_BLOCK_SIZE);
+
+    /* The io registers control the link, which we don't care about in QEMU */
+    memory_region_init_io(&cregs->io, obj, NULL, cregs, ".io",
+                          CXL2_COMPONENT_IO_REGION_SIZE);
+    memory_region_init_io(&cregs->cache_mem, obj, &cache_mem_ops, cregs,
+                          ".cache_mem", CXL2_COMPONENT_CM_REGION_SIZE);
+
+    memory_region_add_subregion(&cregs->component_registers, 0, &cregs->io);
+    memory_region_add_subregion(&cregs->component_registers,
+                                CXL2_COMPONENT_IO_REGION_SIZE,
+                                &cregs->cache_mem);
+}
+
+static void ras_init_common(uint32_t *reg_state)
+{
+    reg_state[R_CXL_RAS_UNC_ERR_STATUS] = 0;
+    reg_state[R_CXL_RAS_UNC_ERR_MASK] = 0x1efff;
+    reg_state[R_CXL_RAS_UNC_ERR_SEVERITY] = 0x1efff;
+    reg_state[R_CXL_RAS_COR_ERR_STATUS] = 0;
+    reg_state[R_CXL_RAS_COR_ERR_MASK] = 0x3f;
+    reg_state[R_CXL_RAS_ERR_CAP_CTRL] = 0; /* CXL switches and devices must set */
+}
+
+static void hdm_init_common(uint32_t *reg_state)
+{
+    ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY, DECODER_COUNT, 0);
+    ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_GLOBAL_CONTROL, HDM_DECODER_ENABLE, 0);
+}
+
+void cxl_component_register_init_common(uint32_t *reg_state, enum reg_type type)
+{
+    int caps = 0;
+    switch (type) {
+    case CXL2_DOWNSTREAM_PORT:
+    case CXL2_DEVICE:
+        /* CAP, RAS, Link */
+        caps = 2;
+        break;
+    case CXL2_UPSTREAM_PORT:
+    case CXL2_TYPE3_DEVICE:
+    case CXL2_LOGICAL_DEVICE:
+        /* + HDM */
+        caps = 3;
+        break;
+    case CXL2_ROOT_PORT:
+        /* + Extended Security, + Snoop */
+        caps = 5;
+        break;
+    default:
+        abort();
+    }
+
+    memset(reg_state, 0, 0x1000);
+
+    /* CXL Capability Header Register */
+    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ID, 1);
+    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, VERSION, 1);
+    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, CACHE_MEM_VERSION, 1);
+    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ARRAY_SIZE, caps);
+
+
+#define init_cap_reg(reg, id, version)                                        \
+    _Static_assert(CXL_##reg##_REGISTERS_OFFSET != 0, "Invalid cap offset");  \
+    do {                                                                      \
+        int which = R_CXL_##reg##_CAPABILITY_HEADER;                          \
+        reg_state[which] = FIELD_DP32(reg_state[which],                       \
+                                      CXL_##reg##_CAPABILITY_HEADER, ID, id); \
+        reg_state[which] =                                                    \
+            FIELD_DP32(reg_state[which], CXL_##reg##_CAPABILITY_HEADER,       \
+                       VERSION, version);                                     \
+        reg_state[which] =                                                    \
+            FIELD_DP32(reg_state[which], CXL_##reg##_CAPABILITY_HEADER, PTR,  \
+                       CXL_##reg##_REGISTERS_OFFSET);                         \
+    } while (0)
+
+    init_cap_reg(RAS, 2, 1);
+    ras_init_common(reg_state);
+
+    init_cap_reg(LINK, 4, 2);
+
+    if (caps < 3) {
+        return;
+    }
+
+    init_cap_reg(HDM, 5, 1);
+    hdm_init_common(reg_state);
+
+    if (caps < 5) {
+        return;
+    }
+
+    init_cap_reg(EXTSEC, 6, 1);
+    init_cap_reg(SNOOP, 8, 1);
+
+#undef init_cap_reg
+}
+
+/*
+ * Helper to create a DVSEC header for a CXL entity. The caller is responsible
+ * for tracking the valid offset.
+ *
+ * This function will build the DVSEC header on behalf of the caller and then
+ * copy in the remaining data for the vendor specific bits.
+ */
+void cxl_component_create_dvsec(CXLComponentState *cxl, uint16_t length,
+                                uint16_t type, uint8_t rev, uint8_t *body)
+{
+    PCIDevice *pdev = cxl->pdev;
+    uint16_t offset = cxl->dvsec_offset;
+
+    assert(offset >= PCI_CFG_SPACE_SIZE &&
+           ((offset + length) < PCI_CFG_SPACE_EXP_SIZE));
+    assert((length & 0xf000) == 0);
+    assert((rev & ~0xf) == 0);
+
+    /* Create the DVSEC in the MCFG space */
+    pcie_add_capability(pdev, PCI_EXT_CAP_ID_DVSEC, 1, offset, length);
+    pci_set_long(pdev->config + offset + PCIE_DVSEC_HEADER1_OFFSET,
+                 (length << 20) | (rev << 16) | CXL_VENDOR_ID);
+    pci_set_word(pdev->config + offset + PCIE_DVSEC_ID_OFFSET, type);
+    memcpy(pdev->config + offset + sizeof(struct dvsec_header),
+           body + sizeof(struct dvsec_header),
+           length - sizeof(struct dvsec_header));
+
+    /* Update state for future DVSEC additions */
+    range_init_nofail(&cxl->dvsecs[type], cxl->dvsec_offset, length);
+    cxl->dvsec_offset += length;
+}
diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build
new file mode 100644
index 0000000000..00c3876a0f
--- /dev/null
+++ b/hw/cxl/meson.build
@@ -0,0 +1,3 @@
+softmmu_ss.add(when: 'CONFIG_CXL', if_true: files(
+  'cxl-component-utils.c',
+))
diff --git a/hw/meson.build b/hw/meson.build
index 010de7219c..3e440c341a 100644
--- a/hw/meson.build
+++ b/hw/meson.build
@@ -6,6 +6,7 @@ subdir('block')
 subdir('char')
 subdir('core')
 subdir('cpu')
+subdir('cxl')
 subdir('display')
 subdir('dma')
 subdir('gpio')
diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
new file mode 100644
index 0000000000..55f6cc30a5
--- /dev/null
+++ b/include/hw/cxl/cxl.h
@@ -0,0 +1,17 @@
+/*
+ * QEMU CXL Support
+ *
+ * Copyright (c) 2020 Intel
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See the
+ * COPYING file in the top-level directory.
+ */
+
+#ifndef CXL_H
+#define CXL_H
+
+#include "cxl_pci.h"
+#include "cxl_component.h"
+
+#endif
+
diff --git a/include/hw/cxl/cxl_component.h b/include/hw/cxl/cxl_component.h
new file mode 100644
index 0000000000..762feb54da
--- /dev/null
+++ b/include/hw/cxl/cxl_component.h
@@ -0,0 +1,187 @@
+/*
+ * QEMU CXL Component
+ *
+ * Copyright (c) 2020 Intel
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See the
+ * COPYING file in the top-level directory.
+ */
+
+#ifndef CXL_COMPONENT_H
+#define CXL_COMPONENT_H
+
+/* CXL 2.0 - 8.2.4 */
+#define CXL2_COMPONENT_IO_REGION_SIZE 0x1000
+#define CXL2_COMPONENT_CM_REGION_SIZE 0x1000
+#define CXL2_COMPONENT_BLOCK_SIZE 0x10000
+
+#include "qemu/range.h"
+#include "qemu/typedefs.h"
+#include "hw/register.h"
+
+enum reg_type {
+    CXL2_DEVICE,
+    CXL2_TYPE3_DEVICE,
+    CXL2_LOGICAL_DEVICE,
+    CXL2_ROOT_PORT,
+    CXL2_UPSTREAM_PORT,
+    CXL2_DOWNSTREAM_PORT
+};
+
+/*
+ * Capability registers are defined at the top of the CXL.cache/mem region and
+ * are packed. For our purposes we will always define the caps in the same
+ * order.
+ * See CXL 2.0 - 8.2.5 Table 142 for details.
+ */
+
+/* CXL 2.0 - 8.2.5.1 */
+REG32(CXL_CAPABILITY_HEADER, 0)
+    FIELD(CXL_CAPABILITY_HEADER, ID, 0, 16)
+    FIELD(CXL_CAPABILITY_HEADER, VERSION, 16, 4)
+    FIELD(CXL_CAPABILITY_HEADER, CACHE_MEM_VERSION, 20, 4)
+    FIELD(CXL_CAPABILITY_HEADER, ARRAY_SIZE, 24, 8)
+
+#define CXLx_CAPABILITY_HEADER(type, offset)                  \
+    REG32(CXL_##type##_CAPABILITY_HEADER, offset)             \
+        FIELD(CXL_##type##_CAPABILITY_HEADER, ID, 0, 16)      \
+        FIELD(CXL_##type##_CAPABILITY_HEADER, VERSION, 16, 4) \
+        FIELD(CXL_##type##_CAPABILITY_HEADER, PTR, 20, 12)
+CXLx_CAPABILITY_HEADER(RAS, 0x4)
+CXLx_CAPABILITY_HEADER(LINK, 0x8)
+CXLx_CAPABILITY_HEADER(HDM, 0xc)
+CXLx_CAPABILITY_HEADER(EXTSEC, 0x10)
+CXLx_CAPABILITY_HEADER(SNOOP, 0x14)
+
+/*
+ * Capability structures contain the actual registers that the CXL component
+ * implements. Some of these are specific to certain types of components, but
+ * this implementation leaves enough space regardless.
+ */
+/* 8.2.5.9 - CXL RAS Capability Structure */
+#define CXL_RAS_REGISTERS_OFFSET 0x80 /* Give ample space for caps before this */
+#define CXL_RAS_REGISTERS_SIZE   0x58
+REG32(CXL_RAS_UNC_ERR_STATUS, CXL_RAS_REGISTERS_OFFSET)
+REG32(CXL_RAS_UNC_ERR_MASK, CXL_RAS_REGISTERS_OFFSET + 0x4)
+REG32(CXL_RAS_UNC_ERR_SEVERITY, CXL_RAS_REGISTERS_OFFSET + 0x8)
+REG32(CXL_RAS_COR_ERR_STATUS, CXL_RAS_REGISTERS_OFFSET + 0xc)
+REG32(CXL_RAS_COR_ERR_MASK, CXL_RAS_REGISTERS_OFFSET + 0x10)
+REG32(CXL_RAS_ERR_CAP_CTRL, CXL_RAS_REGISTERS_OFFSET + 0x14)
+/* Offset 0x18 - 0x58 reserved for RAS logs */
+
+/* 8.2.5.10 - CXL Security Capability Structure */
+#define CXL_SEC_REGISTERS_OFFSET (CXL_RAS_REGISTERS_OFFSET + CXL_RAS_REGISTERS_SIZE)
+#define CXL_SEC_REGISTERS_SIZE   0 /* We don't implement 1.1 downstream ports */
+
+/* 8.2.5.11 - CXL Link Capability Structure */
+#define CXL_LINK_REGISTERS_OFFSET (CXL_SEC_REGISTERS_OFFSET + CXL_SEC_REGISTERS_SIZE)
+#define CXL_LINK_REGISTERS_SIZE   0x38
+
+/* 8.2.5.12 - CXL HDM Decoder Capability Structure */
+#define HDM_DECODE_MAX 10 /* 8.2.5.12.1 */
+#define CXL_HDM_REGISTERS_OFFSET \
+    (CXL_LINK_REGISTERS_OFFSET + CXL_LINK_REGISTERS_SIZE) /* 8.2.5.12 */
+#define CXL_HDM_REGISTERS_SIZE (0x20 + HDM_DECODE_MAX * 10)
+#define HDM_DECODER_INIT(n)                                                    \
+  REG32(CXL_HDM_DECODER##n##_BASE_LO,                                          \
+        CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x10)                          \
+            FIELD(CXL_HDM_DECODER##n##_BASE_LO, L, 28, 4)                      \
+  REG32(CXL_HDM_DECODER##n##_BASE_HI,                                          \
+        CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x14)                          \
+  REG32(CXL_HDM_DECODER##n##_SIZE_LO,                                          \
+        CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x18)                          \
+  REG32(CXL_HDM_DECODER##n##_SIZE_HI,                                          \
+        CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x1C)                          \
+  REG32(CXL_HDM_DECODER##n##_CTRL,                                             \
+        CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x20)                          \
+            FIELD(CXL_HDM_DECODER##n##_CTRL, IG, 0, 4)                         \
+            FIELD(CXL_HDM_DECODER##n##_CTRL, IW, 4, 4)                         \
+            FIELD(CXL_HDM_DECODER##n##_CTRL, LOCK_ON_COMMIT, 8, 1)             \
+            FIELD(CXL_HDM_DECODER##n##_CTRL, COMMIT, 9, 1)                     \
+            FIELD(CXL_HDM_DECODER##n##_CTRL, COMMITTED, 10, 1)                 \
+            FIELD(CXL_HDM_DECODER##n##_CTRL, ERROR, 11, 1)                     \
+            FIELD(CXL_HDM_DECODER##n##_CTRL, TYPE, 12, 1)                      \
+  REG32(CXL_HDM_DECODER##n##_TARGET_LIST_LO, CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x24) \
+  REG32(CXL_HDM_DECODER##n##_TARGET_LIST_HI, CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x28)
+
+REG32(CXL_HDM_DECODER_CAPABILITY, CXL_HDM_REGISTERS_OFFSET)
+    FIELD(CXL_HDM_DECODER_CAPABILITY, DECODER_COUNT, 0, 4)
+    FIELD(CXL_HDM_DECODER_CAPABILITY, TARGET_COUNT, 4, 4)
+    FIELD(CXL_HDM_DECODER_CAPABILITY, INTERLEAVE_256B, 8, 1)
+    FIELD(CXL_HDM_DECODER_CAPABILITY, INTELEAVE_4K, 9, 1)
+    FIELD(CXL_HDM_DECODER_CAPABILITY, POISON_ON_ERR_CAP, 10, 1)
+REG32(CXL_HDM_DECODER_GLOBAL_CONTROL, CXL_HDM_REGISTERS_OFFSET + 4)
+    FIELD(CXL_HDM_DECODER_GLOBAL_CONTROL, POISON_ON_ERR_EN, 0, 1)
+    FIELD(CXL_HDM_DECODER_GLOBAL_CONTROL, HDM_DECODER_ENABLE, 1, 1)
+
+HDM_DECODER_INIT(0);
+
+/* 8.2.5.13 - CXL Extended Security Capability Structure (Root complex only) */
+#define EXTSEC_ENTRY_MAX        256
+#define CXL_EXTSEC_REGISTERS_OFFSET (CXL_HDM_REGISTERS_OFFSET + CXL_HDM_REGISTERS_SIZE)
+#define CXL_EXTSEC_REGISTERS_SIZE   (8 * EXTSEC_ENTRY_MAX + 4)
+
+/* 8.2.5.14 - CXL IDE Capability Structure */
+#define CXL_IDE_REGISTERS_OFFSET (CXL_EXTSEC_REGISTERS_OFFSET + CXL_EXTSEC_REGISTERS_SIZE)
+#define CXL_IDE_REGISTERS_SIZE   0
+
+/* 8.2.5.15 - CXL Snoop Filter Capability Structure */
+#define CXL_SNOOP_REGISTERS_OFFSET (CXL_IDE_REGISTERS_OFFSET + CXL_IDE_REGISTERS_SIZE)
+#define CXL_SNOOP_REGISTERS_SIZE   0x8
+
+_Static_assert((CXL_SNOOP_REGISTERS_OFFSET + CXL_SNOOP_REGISTERS_SIZE) < 0x1000,
+               "No space for registers");
+
+typedef struct component_registers {
+    /*
+     * Main memory region to be registered with QEMU core.
+     */
+    MemoryRegion component_registers;
+
+    /*
+     * 8.2.4 Table 141:
+     *   0x0000 - 0x0fff CXL.io registers
+     *   0x1000 - 0x1fff CXL.cache and CXL.mem
+     *   0x2000 - 0xdfff Implementation specific
+     *   0xe000 - 0xe3ff CXL ARB/MUX registers
+     *   0xe400 - 0xffff RSVD
+     */
+    uint32_t io_registers[CXL2_COMPONENT_IO_REGION_SIZE >> 2];
+    MemoryRegion io;
+
+    uint32_t cache_mem_registers[CXL2_COMPONENT_CM_REGION_SIZE >> 2];
+    MemoryRegion cache_mem;
+
+    MemoryRegion impl_specific;
+    MemoryRegion arb_mux;
+    MemoryRegion rsvd;
+
+    /* special_ops is used for any component that needs any specific handling */
+    MemoryRegionOps *special_ops;
+} ComponentRegisters;
+
+/*
+ * A CXL component represents all entities in a CXL hierarchy. This includes
+ * host bridges, root ports, upstream/downstream switch ports, and devices.
+ */
+typedef struct cxl_component {
+    ComponentRegisters crb;
+    union {
+        struct {
+            Range dvsecs[CXL20_MAX_DVSEC];
+            uint16_t dvsec_offset;
+            struct PCIDevice *pdev;
+        };
+    };
+} CXLComponentState;
+
+void cxl_component_register_block_init(Object *obj,
+                                       CXLComponentState *cxl_cstate,
+                                       const char *type);
+void cxl_component_register_init_common(uint32_t *reg_state,
+                                        enum reg_type type);
+
+void cxl_component_create_dvsec(CXLComponentState *cxl_cstate, uint16_t length,
+                                uint16_t type, uint8_t rev, uint8_t *body);
+
+#endif
diff --git a/include/hw/cxl/cxl_pci.h b/include/hw/cxl/cxl_pci.h
new file mode 100644
index 0000000000..a53c2e5ae7
--- /dev/null
+++ b/include/hw/cxl/cxl_pci.h
@@ -0,0 +1,138 @@
+/*
+ * QEMU CXL PCI interfaces
+ *
+ * Copyright (c) 2020 Intel
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See the
+ * COPYING file in the top-level directory.
+ */
+
+#ifndef CXL_PCI_H
+#define CXL_PCI_H
+
+#include "hw/pci/pci.h"
+#include "hw/pci/pcie.h"
+
+#define CXL_VENDOR_ID 0x1e98
+
+#define PCIE_DVSEC_HEADER1_OFFSET 0x4 /* Offset from start of extended cap */
+#define PCIE_DVSEC_ID_OFFSET 0x8
+
+#define PCIE_CXL_DEVICE_DVSEC_LENGTH 0x38
+#define PCIE_CXL1_DEVICE_DVSEC_REVID 0
+#define PCIE_CXL2_DEVICE_DVSEC_REVID 1
+
+#define EXTENSIONS_PORT_DVSEC_LENGTH 0x28
+#define EXTENSIONS_PORT_DVSEC_REVID 0
+
+#define GPF_PORT_DVSEC_LENGTH 0x10
+#define GPF_PORT_DVSEC_REVID  0
+
+#define PCIE_FLEXBUS_PORT_DVSEC_LENGTH_2_0 0x14
+#define PCIE_FLEXBUS_PORT_DVSEC_REVID_2_0  1
+
+#define REG_LOC_DVSEC_LENGTH 0x24
+#define REG_LOC_DVSEC_REVID  0
+
+enum {
+    PCIE_CXL_DEVICE_DVSEC      = 0,
+    NON_CXL_FUNCTION_MAP_DVSEC = 2,
+    EXTENSIONS_PORT_DVSEC      = 3,
+    GPF_PORT_DVSEC             = 4,
+    GPF_DEVICE_DVSEC           = 5,
+    PCIE_FLEXBUS_PORT_DVSEC    = 7,
+    REG_LOC_DVSEC              = 8,
+    MLD_DVSEC                  = 9,
+    CXL20_MAX_DVSEC
+};
+
+struct dvsec_header {
+    uint32_t cap_hdr;
+    uint32_t dv_hdr1;
+    uint16_t dv_hdr2;
+} __attribute__((__packed__));
+_Static_assert(sizeof(struct dvsec_header) == 10,
+               "dvsec header size incorrect");
+
+/*
+ * CXL 2.0 devices must implement certain DVSEC IDs, and can [optionally]
+ * implement others.
+ *
+ * CXL 2.0 Device: 0, [2], 5, 8
+ * CXL 2.0 RP: 3, 4, 7, 8
+ * CXL 2.0 Upstream Port: [2], 7, 8
+ * CXL 2.0 Downstream Port: 3, 4, 7, 8
+ */
+
+/* CXL 2.0 - 8.1.5 (ID 0003) */
+struct extensions_dvsec_port {
+    struct dvsec_header hdr;
+    uint16_t status;
+    uint16_t control;
+    uint8_t alt_bus_base;
+    uint8_t alt_bus_limit;
+    uint16_t alt_memory_base;
+    uint16_t alt_memory_limit;
+    uint16_t alt_prefetch_base;
+    uint16_t alt_prefetch_limit;
+    uint32_t alt_prefetch_base_high;
+    uint32_t alt_prefetch_base_low;
+    uint32_t rcrb_base;
+    uint32_t rcrb_base_high;
+};
+_Static_assert(sizeof(struct extensions_dvsec_port) == 0x28,
+               "extensions dvsec port size incorrect");
+#define PORT_CONTROL_OVERRIDE_OFFSET 0xc
+#define PORT_CONTROL_UNMASK_SBR      1
+#define PORT_CONTROL_ALT_MEMID_EN    4
+
+/* CXL 2.0 - 8.1.6 GPF DVSEC (ID 0004) */
+struct dvsec_port_gpf {
+    struct dvsec_header hdr;
+    uint16_t rsvd;
+    uint16_t phase1_ctrl;
+    uint16_t phase2_ctrl;
+};
+_Static_assert(sizeof(struct dvsec_port_gpf) == 0x10,
+               "dvsec port GPF size incorrect");
+
+/* CXL 2.0 - 8.1.8/8.2.1.3 Flexbus DVSEC (ID 0007) */
+struct dvsec_port_flexbus {
+    struct dvsec_header hdr;
+    uint16_t cap;
+    uint16_t ctrl;
+    uint16_t status;
+    uint32_t rcvd_mod_ts_data;
+};
+_Static_assert(sizeof(struct dvsec_port_flexbus) == 0x14,
+               "dvsec port flexbus size incorrect");
+
+/* CXL 2.0 - 8.1.9 Register Locator DVSEC (ID 0008) */
+struct dvsec_register_locator {
+    struct dvsec_header hdr;
+    uint16_t rsvd;
+    uint32_t reg0_base_lo;
+    uint32_t reg0_base_hi;
+    uint32_t reg1_base_lo;
+    uint32_t reg1_base_hi;
+    uint32_t reg2_base_lo;
+    uint32_t reg2_base_hi;
+};
+_Static_assert(sizeof(struct dvsec_register_locator) == 0x24,
+               "dvsec register locator size incorrect");
+
+/* BAR Equivalence Indicator */
+#define BEI_BAR_10H 0
+#define BEI_BAR_14H 1
+#define BEI_BAR_18H 2
+#define BEI_BAR_1cH 3
+#define BEI_BAR_20H 4
+#define BEI_BAR_24H 5
+
+/* Register Block Identifier */
+#define RBI_EMPTY          0
+#define RBI_COMPONENT_REG  (1 << 8)
+#define RBI_BAR_VIRT_ACL   (2 << 8)
+#define RBI_CXL_DEVICE_REG (3 << 8)
+
+#endif
-- 
2.30.0



^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [RFC PATCH v3 03/31] hw/cxl/device: Introduce a CXL device (8.2.8)
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

A CXL device is a type of CXL component. Conceptually, a CXL device
would be a leaf node in a CXL topology. From an emulation perspective,
CXL devices are the most complex and so the actual implementation is
reserved for discrete commits.

This new device type is specifically catered towards the eventual
implementation of a Type3 CXL.mem device, 8.2.8.5 in the CXL 2.0
specification.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 include/hw/cxl/cxl.h        |   1 +
 include/hw/cxl/cxl_device.h | 155 ++++++++++++++++++++++++++++++++++++
 2 files changed, 156 insertions(+)
 create mode 100644 include/hw/cxl/cxl_device.h

diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
index 55f6cc30a5..23f52c4cf9 100644
--- a/include/hw/cxl/cxl.h
+++ b/include/hw/cxl/cxl.h
@@ -12,6 +12,7 @@
 
 #include "cxl_pci.h"
 #include "cxl_component.h"
+#include "cxl_device.h"
 
 #endif
 
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
new file mode 100644
index 0000000000..a85f250503
--- /dev/null
+++ b/include/hw/cxl/cxl_device.h
@@ -0,0 +1,155 @@
+/*
+ * QEMU CXL Devices
+ *
+ * Copyright (c) 2020 Intel
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See the
+ * COPYING file in the top-level directory.
+ */
+
+#ifndef CXL_DEVICE_H
+#define CXL_DEVICE_H
+
+#include "hw/register.h"
+
+/*
+ * The following is how a CXL device's MMIO space is laid out. The only
+ * requirement from the spec is that the capabilities array and the capability
+ * headers start at offset 0 and are contiguously packed. The headers themselves
+ * provide offsets to the register fields. For this emulation, registers will
+ * start at offset 0x80 (m == 0x80). No secondary mailbox is implemented which
+ * means that n = m + sizeof(mailbox registers) + sizeof(device registers).
+ *
+ * This is roughly described in 8.2.8 Figure 138 of the CXL 2.0 spec.
+ *
+ * n + PAYLOAD_SIZE_MAX  +---------------------------------+
+ *                       |                                 |
+ *                  ^    |                                 |
+ *                  |    |                                 |
+ *                  |    |                                 |
+ *                  |    |                                 |
+ *                  |    |         Command Payload         |
+ *                  |    |                                 |
+ *                  |    |                                 |
+ *                  |    |                                 |
+ *                  |    |                                 |
+ *                  |    |                                 |
+ *                  n    +---------------------------------+
+ *                  ^    |                                 |
+ *                  |    |    Device Capability Registers  |
+ *                  |    |    x, mailbox, y                |
+ *                  |    |                                 |
+ *                  m    +---------------------------------+
+ *                  ^    |     Device Capability Header y  |
+ *                  |    +---------------------------------+
+ *                  |    | Device Capability Header Mailbox|
+ *                  |    +---------------------------------+
+ *                  |    |     Device Capability Header x  |
+ *                  |    +---------------------------------+
+ *                  |    |                                 |
+ *                  |    |                                 |
+ *                  |    |      Device Cap Array[0..n]     |
+ *                  |    |                                 |
+ *                  |    |                                 |
+ *                  |    |                                 |
+ *                  0    +---------------------------------+
+ */
+
+#define CXL_DEVICE_CAP_HDR1_OFFSET 0x10 /* Figure 138 */
+#define CXL_DEVICE_CAP_REG_SIZE 0x10 /* 8.2.8.2 */
+#define CXL_DEVICE_CAPS_MAX 4 /* 8.2.8.2.1 + 8.2.8.5 */
+
+#define CXL_DEVICE_REGISTERS_OFFSET 0x80 /* Read comment above */
+#define CXL_DEVICE_REGISTERS_LENGTH 0x8 /* 8.2.8.3.1 */
+
+#define CXL_MAILBOX_REGISTERS_OFFSET \
+    (CXL_DEVICE_REGISTERS_OFFSET + CXL_DEVICE_REGISTERS_LENGTH)
+#define CXL_MAILBOX_REGISTERS_SIZE 0x20
+#define CXL_MAILBOX_PAYLOAD_SHIFT 11
+#define CXL_MAILBOX_MAX_PAYLOAD_SIZE (1 << CXL_MAILBOX_PAYLOAD_SHIFT)
+#define CXL_MAILBOX_REGISTERS_LENGTH \
+    (CXL_MAILBOX_REGISTERS_SIZE + CXL_MAILBOX_MAX_PAYLOAD_SIZE)
+
+typedef struct cxl_device_state {
+    MemoryRegion device_registers;
+
+    /* mmio for device capabilities array - 8.2.8.2 */
+    MemoryRegion caps;
+
+    /* mmio for the device status registers 8.2.8.3 */
+    MemoryRegion device;
+
+    /* mmio for the mailbox registers 8.2.8.4 */
+    MemoryRegion mailbox;
+
+    /* memory region for persistent memory, HDM */
+    MemoryRegion *pmem;
+
+    /* memory region for volatile memory, HDM */
+    MemoryRegion *vmem;
+} CXLDeviceState;
+
+/* Initialize the register block for a device */
+void cxl_device_register_block_init(Object *obj, CXLDeviceState *dev);
+
+/* Set up default values for the register block */
+void cxl_device_register_init_common(CXLDeviceState *dev);
+
+/* CXL 2.0 - 8.2.8.1 */
+REG32(CXL_DEV_CAP_ARRAY, 0) /* 48b!?!?! */
+    FIELD(CXL_DEV_CAP_ARRAY, CAP_ID, 0, 16)
+    FIELD(CXL_DEV_CAP_ARRAY, CAP_VERSION, 16, 8)
+REG32(CXL_DEV_CAP_ARRAY2, 4) /* We're going to pretend it's 64b */
+    FIELD(CXL_DEV_CAP_ARRAY2, CAP_COUNT, 0, 16)
+
+/*
+ * Helper macro to initialize capability headers for CXL devices.
+ *
+ * In 8.2.8.2, this is listed as a 128b register, but in 8.2.8, it says:
+ * > No registers defined in Section 8.2.8 are larger than 64-bits wide so that
+ * > is the maximum access size allowed for these registers. If this rule is not
+ * > followed, the behavior is undefined
+ *
+ * Here we've chosen to make it 4 dwords. The spec allows any pow2 multiple
+ * access to be used for a register (2 qwords, 8 words, 16 bytes).
+ */
+#define CXL_DEVICE_CAPABILITY_HEADER_REGISTER(n, offset)                            \
+    REG32(CXL_DEV_##n##_CAP_HDR0, offset)                 \
+        FIELD(CXL_DEV_##n##_CAP_HDR0, CAP_ID, 0, 16)      \
+        FIELD(CXL_DEV_##n##_CAP_HDR0, CAP_VERSION, 16, 8) \
+    REG32(CXL_DEV_##n##_CAP_HDR1, offset + 4)             \
+        FIELD(CXL_DEV_##n##_CAP_HDR1, CAP_OFFSET, 0, 32)  \
+    REG32(CXL_DEV_##n##_CAP_HDR2, offset + 8)             \
+        FIELD(CXL_DEV_##n##_CAP_HDR2, CAP_LENGTH, 0, 32)
+
+CXL_DEVICE_CAPABILITY_HEADER_REGISTER(DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET)
+CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MAILBOX, CXL_DEVICE_CAP_HDR1_OFFSET + \
+                                               CXL_DEVICE_CAP_REG_SIZE)
+
+REG32(CXL_DEV_MAILBOX_CAP, 0)
+    FIELD(CXL_DEV_MAILBOX_CAP, PAYLOAD_SIZE, 0, 5)
+    FIELD(CXL_DEV_MAILBOX_CAP, INT_CAP, 5, 1)
+    FIELD(CXL_DEV_MAILBOX_CAP, BG_INT_CAP, 6, 1)
+    FIELD(CXL_DEV_MAILBOX_CAP, MSI_N, 7, 4)
+
+REG32(CXL_DEV_MAILBOX_CTRL, 4)
+    FIELD(CXL_DEV_MAILBOX_CTRL, DOORBELL, 0, 1)
+    FIELD(CXL_DEV_MAILBOX_CTRL, INT_EN, 1, 1)
+    FIELD(CXL_DEV_MAILBOX_CTRL, BG_INT_EN, 2, 1)
+
+/* XXX: actually a 64b register */
+REG32(CXL_DEV_MAILBOX_STS, 0x10)
+    FIELD(CXL_DEV_MAILBOX_STS, BG_OP, 0, 1)
+    FIELD(CXL_DEV_MAILBOX_STS, ERRNO, 32, 16)
+    FIELD(CXL_DEV_MAILBOX_STS, VENDOR_ERRNO, 48, 16)
+
+/* XXX: actually a 64b register */
+REG32(CXL_DEV_BG_CMD_STS, 0x18)
+    FIELD(CXL_DEV_BG_CMD_STS, BG, 0, 16)
+    FIELD(CXL_DEV_BG_CMD_STS, DONE, 16, 7)
+    FIELD(CXL_DEV_BG_CMD_STS, ERRNO, 32, 16)
+    FIELD(CXL_DEV_BG_CMD_STS, VENDOR_ERRNO, 48, 16)
+
+REG32(CXL_DEV_CMD_PAYLOAD, 0x20)
+
+#endif
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [RFC PATCH v3 04/31] hw/cxl/device: Implement the CAP array (8.2.8.1-2)
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

This implements all device MMIO up to the first capability. That
includes the CXL Device Capabilities Array Register, as well as all of
the CXL Device Capability Header Registers. The latter are filled in as
they are implemented in the following patches.

Endianness and alignment are managed by the softmmu memory core.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/cxl/cxl-device-utils.c   | 105 ++++++++++++++++++++++++++++++++++++
 hw/cxl/meson.build          |   1 +
 include/hw/cxl/cxl_device.h |  27 +++++++++-
 3 files changed, 132 insertions(+), 1 deletion(-)
 create mode 100644 hw/cxl/cxl-device-utils.c

diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
new file mode 100644
index 0000000000..bb15ad9a0f
--- /dev/null
+++ b/hw/cxl/cxl-device-utils.c
@@ -0,0 +1,105 @@
+/*
+ * CXL Utility library for devices
+ *
+ * Copyright(C) 2020 Intel Corporation.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See the
+ * COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "hw/cxl/cxl.h"
+
+/*
+ * Device registers have no restrictions per the spec, and so fall back to the
+ * default memory mapped register rules in 8.2:
+ *   Software shall use CXL.io Memory Read and Write to access memory mapped
+ *   register defined in this section. Unless otherwise specified, software
+ *   shall restrict the accesses width based on the following:
+ *   • A 32 bit register shall be accessed as a 1 Byte, 2 Bytes or 4 Bytes
+ *     quantity.
+ *   • A 64 bit register shall be accessed as a 1 Byte, 2 Bytes, 4 Bytes or 8
+ *     Bytes
+ *   • The address shall be a multiple of the access width, e.g. when
+ *     accessing a register as a 4 Byte quantity, the address shall be
+ *     multiple of 4.
+ *   • The accesses shall map to contiguous bytes.
+ *   If these rules are not followed, the behavior is undefined.
+ */
+
+static uint64_t caps_reg_read(void *opaque, hwaddr offset, unsigned size)
+{
+    CXLDeviceState *cxl_dstate = opaque;
+
+    return cxl_dstate->caps_reg_state32[offset / 4];
+}
+
+static uint64_t dev_reg_read(void *opaque, hwaddr offset, unsigned size)
+{
+    return 0;
+}
+
+static const MemoryRegionOps dev_ops = {
+    .read = dev_reg_read,
+    .write = NULL, /* status register is read only */
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .valid = {
+        .min_access_size = 1,
+        .max_access_size = 8,
+        .unaligned = false,
+    },
+    .impl = {
+        .min_access_size = 1,
+        .max_access_size = 8,
+    },
+};
+
+static const MemoryRegionOps caps_ops = {
+    .read = caps_reg_read,
+    .write = NULL, /* caps registers are read only */
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .valid = {
+        .min_access_size = 1,
+        .max_access_size = 8,
+        .unaligned = false,
+    },
+    .impl = {
+        .min_access_size = 4,
+        .max_access_size = 4,
+    },
+};
+
+void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
+{
+    /* This will be a BAR, so needs to be rounded up to pow2 for PCI spec */
+    memory_region_init(&cxl_dstate->device_registers, obj, "device-registers",
+                       pow2ceil(CXL_MMIO_SIZE));
+
+    memory_region_init_io(&cxl_dstate->caps, obj, &caps_ops, cxl_dstate,
+                          "cap-array", CXL_DEVICE_REGISTERS_OFFSET - 0);
+    memory_region_init_io(&cxl_dstate->device, obj, &dev_ops, cxl_dstate,
+                          "device-status", CXL_DEVICE_REGISTERS_LENGTH);
+
+    memory_region_add_subregion(&cxl_dstate->device_registers, 0,
+                                &cxl_dstate->caps);
+    memory_region_add_subregion(&cxl_dstate->device_registers,
+                                CXL_DEVICE_REGISTERS_OFFSET,
+                                &cxl_dstate->device);
+}
+
+static void device_reg_init_common(CXLDeviceState *cxl_dstate) { }
+
+void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
+{
+    uint32_t *cap_hdrs = cxl_dstate->caps_reg_state32;
+    const int cap_count = 1;
+
+    /* CXL Device Capabilities Array Register */
+    ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_ID, 0);
+    ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_VERSION, 1);
+    ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY2, CAP_COUNT, cap_count);
+
+    cxl_device_cap_init(cxl_dstate, DEVICE, 1);
+    device_reg_init_common(cxl_dstate);
+}
diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build
index 00c3876a0f..47154d6850 100644
--- a/hw/cxl/meson.build
+++ b/hw/cxl/meson.build
@@ -1,3 +1,4 @@
 softmmu_ss.add(when: 'CONFIG_CXL', if_true: files(
   'cxl-component-utils.c',
+  'cxl-device-utils.c',
 ))
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index a85f250503..f3bcf19410 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -58,6 +58,8 @@
 #define CXL_DEVICE_CAP_HDR1_OFFSET 0x10 /* Figure 138 */
 #define CXL_DEVICE_CAP_REG_SIZE 0x10 /* 8.2.8.2 */
 #define CXL_DEVICE_CAPS_MAX 4 /* 8.2.8.2.1 + 8.2.8.5 */
+#define CXL_CAPS_SIZE \
+    (CXL_DEVICE_CAP_REG_SIZE * (CXL_DEVICE_CAPS_MAX + 1)) /* +1 for header */
 
 #define CXL_DEVICE_REGISTERS_OFFSET 0x80 /* Read comment above */
 #define CXL_DEVICE_REGISTERS_LENGTH 0x8 /* 8.2.8.3.1 */
@@ -70,11 +72,18 @@
 #define CXL_MAILBOX_REGISTERS_LENGTH \
     (CXL_MAILBOX_REGISTERS_SIZE + CXL_MAILBOX_MAX_PAYLOAD_SIZE)
 
+#define CXL_MMIO_SIZE                                        \
+    (CXL_DEVICE_CAP_REG_SIZE + CXL_DEVICE_REGISTERS_LENGTH + \
+     CXL_MAILBOX_REGISTERS_LENGTH)
+
 typedef struct cxl_device_state {
     MemoryRegion device_registers;
 
     /* mmio for device capabilities array - 8.2.8.2 */
-    MemoryRegion caps;
+    struct {
+        MemoryRegion caps;
+        uint32_t caps_reg_state32[CXL_CAPS_SIZE / 4];
+    };
 
     /* mmio for the device status registers 8.2.8.3 */
     MemoryRegion device;
@@ -126,6 +135,22 @@ CXL_DEVICE_CAPABILITY_HEADER_REGISTER(DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET)
 CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MAILBOX, CXL_DEVICE_CAP_HDR1_OFFSET + \
                                                CXL_DEVICE_CAP_REG_SIZE)
 
+#define cxl_device_cap_init(dstate, reg, cap_id)                                   \
+    do {                                                                           \
+        uint32_t *cap_hdrs = dstate->caps_reg_state32;                             \
+        int which = R_CXL_DEV_##reg##_CAP_HDR0;                                    \
+        cap_hdrs[which] =                                                          \
+            FIELD_DP32(cap_hdrs[which], CXL_DEV_##reg##_CAP_HDR0, CAP_ID, cap_id); \
+        cap_hdrs[which] = FIELD_DP32(                                              \
+            cap_hdrs[which], CXL_DEV_##reg##_CAP_HDR0, CAP_VERSION, 1);            \
+        cap_hdrs[which + 1] =                                                      \
+            FIELD_DP32(cap_hdrs[which + 1], CXL_DEV_##reg##_CAP_HDR1,              \
+                       CAP_OFFSET, CXL_##reg##_REGISTERS_OFFSET);                  \
+        cap_hdrs[which + 2] =                                                      \
+            FIELD_DP32(cap_hdrs[which + 2], CXL_DEV_##reg##_CAP_HDR2,              \
+                       CAP_LENGTH, CXL_##reg##_REGISTERS_LENGTH);                  \
+    } while (0)
+
 REG32(CXL_DEV_MAILBOX_CAP, 0)
     FIELD(CXL_DEV_MAILBOX_CAP, PAYLOAD_SIZE, 0, 5)
     FIELD(CXL_DEV_MAILBOX_CAP, INT_CAP, 5, 1)
-- 
2.30.0
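
A note on the access sizes in caps_ops: guest accesses of 1 to 8 bytes
are accepted (.valid) but only 4-byte accesses are implemented (.impl),
which is why caps_reg_read() can ignore 'size'. A rough model of how an
8-byte read could be served from two 4-byte callbacks is sketched below.
It assumes the pieces are assembled least-significant first, as for a
little-endian device. regs32, read4() and guest_read() are made up for
illustration; this is not the memory core's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical backing store standing in for caps_reg_state32. */
static const uint32_t regs32[4] = { 0x00010000, 0x00000002, 0, 0 };

/* Device callback: only ever sees aligned 4-byte reads (the .impl size). */
static uint64_t read4(uint64_t offset)
{
    assert(offset % 4 == 0 && offset / 4 < 4);
    return regs32[offset / 4];
}

/* Model of a 4- or 8-byte guest read assembled from 4-byte pieces. */
static uint64_t guest_read(uint64_t offset, unsigned size)
{
    uint64_t val = 0;

    assert((size == 4 || size == 8) && offset % size == 0);
    for (unsigned i = 0; i < size; i += 4) {
        /* Piece at byte offset i lands in bits [8*i, 8*i + 31]. */
        val |= read4(offset + i) << (8 * i);
    }
    return val;
}
```

In this model guest_read(0, 8) returns the dword at offset 0 in the low
half and the dword at offset 4 in the high half.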


* [RFC PATCH v3 05/31] hw/cxl/device: Implement basic mailbox (8.2.8.4)
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

Begin implementing mailbox support for CXL 2.0 devices. The
implementation recognizes when the doorbell is rung, handles the
command and its payload, and then clears the doorbell after writing
back the return code and any output data.

Generally the mailbox mechanism is designed to permit communication
between the host OS and the firmware running on the device. For our
purposes, we emulate both the firmware, implemented primarily in
cxl-mailbox-utils.c, and the hardware.

No commands are implemented yet.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/cxl/cxl-device-utils.c   | 125 ++++++++++++++++++++++-
 hw/cxl/cxl-mailbox-utils.c  | 197 ++++++++++++++++++++++++++++++++++++
 hw/cxl/meson.build          |   1 +
 include/hw/cxl/cxl.h        |   3 +
 include/hw/cxl/cxl_device.h |  28 ++++-
 5 files changed, 349 insertions(+), 5 deletions(-)
 create mode 100644 hw/cxl/cxl-mailbox-utils.c
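
The doorbell flow described in the commit message can be sketched in
isolation. This is a toy model under stated assumptions: struct
toy_mailbox, pack_cmd(), device_process() and host_submit() are
hypothetical names, the command register packing follows the
CXL_DEV_MAILBOX_CMD fields in this patch, and the toy device answers
every command with Unsupported (0x3), which is a choice made for the
sketch rather than the behavior of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified mailbox; not the register block from the patch. */
struct toy_mailbox {
    bool doorbell;     /* host sets, device clears */
    uint64_t cmd;      /* COMMAND[7:0], COMMAND_SET[15:8], LENGTH[35:16] */
    uint16_t ret_code; /* device writes the command return code here */
};

static uint64_t pack_cmd(uint8_t set, uint8_t cmd, uint32_t len)
{
    return (uint64_t)cmd | ((uint64_t)set << 8) |
           ((uint64_t)(len & 0xfffff) << 16);
}

/* "Firmware" side: runs when the doorbell is observed set. */
static void device_process(struct toy_mailbox *mb)
{
    if (!mb->doorbell) {
        return;
    }
    mb->ret_code = 0x3;          /* Unsupported: the toy has no commands */
    mb->cmd = pack_cmd(0, 0, 0); /* no output payload */
    mb->doorbell = false;        /* tells the host the command is done */
}

/* Host side: write the command, ring the doorbell, read the result. */
static uint16_t host_submit(struct toy_mailbox *mb, uint8_t set, uint8_t cmd)
{
    assert(!mb->doorbell); /* mailbox must be idle before a new command */
    mb->cmd = pack_cmd(set, cmd, 0);
    mb->doorbell = true;
    device_process(mb);    /* the toy device handles it synchronously */
    assert(!mb->doorbell);
    return mb->ret_code;
}
```

A host would call host_submit() on a zero-initialized struct toy_mailbox
and inspect the returned code once the doorbell has been cleared.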

diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
index bb15ad9a0f..6602606f3d 100644
--- a/hw/cxl/cxl-device-utils.c
+++ b/hw/cxl/cxl-device-utils.c
@@ -40,6 +40,111 @@ static uint64_t dev_reg_read(void *opaque, hwaddr offset, unsigned size)
     return 0;
 }
 
+static uint64_t mailbox_reg_read(void *opaque, hwaddr offset, unsigned size)
+{
+    CXLDeviceState *cxl_dstate = opaque;
+
+    switch (size) {
+    case 8:
+        return cxl_dstate->mbox_reg_state64[offset / 8];
+    case 4:
+        return cxl_dstate->mbox_reg_state32[offset / 4];
+    default:
+        g_assert_not_reached();
+    }
+}
+
+static void mailbox_mem_writel(uint32_t *reg_state, hwaddr offset,
+                               uint64_t value)
+{
+    switch (offset) {
+    case A_CXL_DEV_MAILBOX_CTRL:
+        break;
+    case A_CXL_DEV_MAILBOX_CAP:
+        /* RO register */
+        return;
+    default:
+        qemu_log_mask(LOG_UNIMP,
+                      "%s Unexpected 32-bit access to 0x%" PRIx64 " (WI)\n",
+                      __func__, offset);
+        return;
+    }
+
+    reg_state[offset / 4] = value;
+}
+
+static void mailbox_mem_writeq(uint64_t *reg_state, hwaddr offset,
+                               uint64_t value)
+{
+    switch (offset) {
+    case A_CXL_DEV_MAILBOX_CMD:
+        break;
+    case A_CXL_DEV_BG_CMD_STS:
+        /* BG not supported */
+        /* fallthrough */
+    case A_CXL_DEV_MAILBOX_STS:
+        /* Read only register, will get updated by the state machine */
+        return;
+    default:
+        qemu_log_mask(LOG_UNIMP,
+                      "%s Unexpected 64-bit access to 0x%" PRIx64 " (WI)\n",
+                      __func__, offset);
+        return;
+    }
+
+
+    reg_state[offset / 8] = value;
+}
+
+static void mailbox_reg_write(void *opaque, hwaddr offset, uint64_t value,
+                              unsigned size)
+{
+    CXLDeviceState *cxl_dstate = opaque;
+
+    if (offset >= A_CXL_DEV_CMD_PAYLOAD) {
+        memcpy(cxl_dstate->mbox_reg_state + offset, &value, size);
+        return;
+    }
+
+    /*
+     * Lock is needed to prevent concurrent writes as well as to prevent writes
+     * coming in while the firmware is processing. Without background commands
+     * or the second mailbox implemented, this serves no purpose since the
+     * memory access is synchronized at a higher level (per memory region).
+     */
+    RCU_READ_LOCK_GUARD();
+
+    switch (size) {
+    case 4:
+        mailbox_mem_writel(cxl_dstate->mbox_reg_state32, offset, value);
+        break;
+    case 8:
+        mailbox_mem_writeq(cxl_dstate->mbox_reg_state64, offset, value);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+
+    if (ARRAY_FIELD_EX32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL,
+                         DOORBELL))
+        cxl_process_mailbox(cxl_dstate);
+}
+
+static const MemoryRegionOps mailbox_ops = {
+    .read = mailbox_reg_read,
+    .write = mailbox_reg_write,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .valid = {
+        .min_access_size = 1,
+        .max_access_size = 8,
+        .unaligned = false,
+    },
+    .impl = {
+        .min_access_size = 4,
+        .max_access_size = 8,
+    },
+};
+
 static const MemoryRegionOps dev_ops = {
     .read = dev_reg_read,
     .write = NULL, /* status register is read only */
@@ -80,20 +185,33 @@ void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
                           "cap-array", CXL_DEVICE_REGISTERS_OFFSET - 0);
     memory_region_init_io(&cxl_dstate->device, obj, &dev_ops, cxl_dstate,
                           "device-status", CXL_DEVICE_REGISTERS_LENGTH);
+    memory_region_init_io(&cxl_dstate->mailbox, obj, &mailbox_ops, cxl_dstate,
+                          "mailbox", CXL_MAILBOX_REGISTERS_LENGTH);
 
     memory_region_add_subregion(&cxl_dstate->device_registers, 0,
                                 &cxl_dstate->caps);
     memory_region_add_subregion(&cxl_dstate->device_registers,
                                 CXL_DEVICE_REGISTERS_OFFSET,
                                 &cxl_dstate->device);
+    memory_region_add_subregion(&cxl_dstate->device_registers,
+                                CXL_MAILBOX_REGISTERS_OFFSET,
+                                &cxl_dstate->mailbox);
 }
 
 static void device_reg_init_common(CXLDeviceState *cxl_dstate) { }
 
+static void mailbox_reg_init_common(CXLDeviceState *cxl_dstate)
+{
+    /* 2048 payload size, with no interrupt or background support */
+    ARRAY_FIELD_DP32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CAP,
+                     PAYLOAD_SIZE, CXL_MAILBOX_PAYLOAD_SHIFT);
+    cxl_dstate->payload_size = CXL_MAILBOX_MAX_PAYLOAD_SIZE;
+}
+
 void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
 {
     uint32_t *cap_hdrs = cxl_dstate->caps_reg_state32;
-    const int cap_count = 1;
+    const int cap_count = 2;
 
     /* CXL Device Capabilities Array Register */
     ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_ID, 0);
@@ -102,4 +220,9 @@ void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
 
     cxl_device_cap_init(cxl_dstate, DEVICE, 1);
     device_reg_init_common(cxl_dstate);
+
+    cxl_device_cap_init(cxl_dstate, MAILBOX, 2);
+    mailbox_reg_init_common(cxl_dstate);
+
+    assert(cxl_initialize_mailbox(cxl_dstate) == 0);
 }
diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
new file mode 100644
index 0000000000..466055b01a
--- /dev/null
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -0,0 +1,197 @@
+/*
+ * CXL Utility library for mailbox interface
+ *
+ * Copyright(C) 2020 Intel Corporation.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See the
+ * COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/cxl/cxl.h"
+#include "hw/pci/pci.h"
+#include "qemu/log.h"
+#include "qemu/uuid.h"
+
+/*
+ * How to add a new command, example. The command set FOO, with cmd BAR.
+ *  1. Add the command set and cmd to the enum.
+ *     FOO    = 0x7f,
+ *          #define BAR 0
+ *  2. Forward declare the handler.
+ *     declare_mailbox_handler(FOO_BAR);
+ *  3. Add the command to the cxl_cmd_set[][]
+ *     CXL_CMD(FOO, BAR, 0, 0),
+ *  4. Implement your handler
+ *     define_mailbox_handler(FOO_BAR) { ... return CXL_MBOX_SUCCESS; }
+ *
+ *
+ *  Writing the handler:
+ *    The handler will provide the &struct cxl_cmd, the &CXLDeviceState, and the
+ *    in/out length of the payload. The handler is responsible for consuming the
+ *    payload from cmd->payload and operating upon it as necessary. It must then
+ *    fill the output data into cmd->payload (overwriting what was there),
+ *    setting the length, and returning a valid return code.
+ *
+ *  XXX: The handler need not worry about endianness. The payload is read out
+ *  of a register interface that already deals with it.
+ */
+
+/* 8.2.8.4.5.1 Command Return Codes */
+typedef enum {
+    CXL_MBOX_SUCCESS = 0x0,
+    CXL_MBOX_BG_STARTED = 0x1,
+    CXL_MBOX_INVALID_INPUT = 0x2,
+    CXL_MBOX_UNSUPPORTED = 0x3,
+    CXL_MBOX_INTERNAL_ERROR = 0x4,
+    CXL_MBOX_RETRY_REQUIRED = 0x5,
+    CXL_MBOX_BUSY = 0x6,
+    CXL_MBOX_MEDIA_DISABLED = 0x7,
+    CXL_MBOX_FW_XFER_IN_PROGRESS = 0x8,
+    CXL_MBOX_FW_XFER_OUT_OF_ORDER = 0x9,
+    CXL_MBOX_FW_AUTH_FAILED = 0xa,
+    CXL_MBOX_FW_INVALID_SLOT = 0xb,
+    CXL_MBOX_FW_ROLLEDBACK = 0xc,
+    CXL_MBOX_FW_REST_REQD = 0xd,
+    CXL_MBOX_INVALID_HANDLE = 0xe,
+    CXL_MBOX_INVALID_PA = 0xf,
+    CXL_MBOX_INJECT_POISON_LIMIT = 0x10,
+    CXL_MBOX_PERMANENT_MEDIA_FAILURE = 0x11,
+    CXL_MBOX_ABORTED = 0x12,
+    CXL_MBOX_INVALID_SECURITY_STATE = 0x13,
+    CXL_MBOX_INCORRECT_PASSPHRASE = 0x14,
+    CXL_MBOX_UNSUPPORTED_MAILBOX = 0x15,
+    CXL_MBOX_INVALID_PAYLOAD_LENGTH = 0x16,
+    CXL_MBOX_MAX = 0x17
+} ret_code;
+
+struct cxl_cmd;
+typedef ret_code (*opcode_handler)(struct cxl_cmd *cmd,
+                                   CXLDeviceState *cxl_dstate, uint16_t *len);
+struct cxl_cmd {
+    const char *name;
+    opcode_handler handler;
+    ssize_t in;
+    uint16_t effect; /* Reported in CEL */
+    uint8_t *payload;
+};
+
+#define define_mailbox_handler(name)                \
+    static ret_code cmd_##name(struct cxl_cmd *cmd, \
+                               CXLDeviceState *cxl_dstate, uint16_t *len)
+#define declare_mailbox_handler(name) define_mailbox_handler(name)
+
+#define define_mailbox_handler_zeroed(name, size)                         \
+    uint16_t __zero##name = size;                                         \
+    static ret_code cmd_##name(struct cxl_cmd *cmd,                       \
+                               CXLDeviceState *cxl_dstate, uint16_t *len) \
+    {                                                                     \
+        *len = __zero##name;                                              \
+        memset(cmd->payload, 0, *len);                                    \
+        return CXL_MBOX_SUCCESS;                                          \
+    }
+#define define_mailbox_handler_const(name, data)                          \
+    static ret_code cmd_##name(struct cxl_cmd *cmd,                       \
+                               CXLDeviceState *cxl_dstate, uint16_t *len) \
+    {                                                                     \
+        *len = sizeof(data);                                              \
+        memcpy(cmd->payload, data, *len);                                 \
+        return CXL_MBOX_SUCCESS;                                          \
+    }
+#define define_mailbox_handler_nop(name)                                  \
+    static ret_code cmd_##name(struct cxl_cmd *cmd,                       \
+                               CXLDeviceState *cxl_dstate, uint16_t *len) \
+    {                                                                     \
+        return CXL_MBOX_SUCCESS;                                          \
+    }
+
+#define CXL_CMD(s, c, in, cel_effect) \
+    [s][c] = { stringify(s##_##c), cmd_##s##_##c, in, cel_effect }
+
+static struct cxl_cmd cxl_cmd_set[256][256] = {};
+
+#undef CXL_CMD
+
+QemuUUID cel_uuid;
+
+void cxl_process_mailbox(CXLDeviceState *cxl_dstate)
+{
+    uint16_t ret = CXL_MBOX_SUCCESS;
+    struct cxl_cmd *cxl_cmd;
+    uint64_t status_reg;
+    opcode_handler h;
+
+    /*
+     * current state of mailbox interface
+     *  mbox_cap_reg = cxl_dstate->reg_state32[R_CXL_DEV_MAILBOX_CAP];
+     *  mbox_ctrl_reg = cxl_dstate->reg_state32[R_CXL_DEV_MAILBOX_CTRL];
+     *  status_reg = *(uint64_t *)&cxl_dstate->reg_state[A_CXL_DEV_MAILBOX_STS];
+     */
+    uint64_t command_reg =
+        *(uint64_t *)&cxl_dstate->mbox_reg_state[A_CXL_DEV_MAILBOX_CMD];
+
+    /* Check if we have to do anything */
+    if (!ARRAY_FIELD_EX32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL,
+                          DOORBELL)) {
+        qemu_log_mask(LOG_UNIMP, "Corrupt internal state for firmware\n");
+        return;
+    }
+
+    uint8_t set = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND_SET);
+    uint8_t cmd = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND);
+    uint16_t len = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, LENGTH);
+    cxl_cmd = &cxl_cmd_set[set][cmd];
+    h = cxl_cmd->handler;
+    if (!h) {
+        goto handled;
+    }
+
+    if (len != cxl_cmd->in) {
+        ret = CXL_MBOX_INVALID_PAYLOAD_LENGTH;
+        goto handled;
+    }
+    cxl_cmd->payload = cxl_dstate->mbox_reg_state + A_CXL_DEV_CMD_PAYLOAD;
+    ret = (*h)(cxl_cmd, cxl_dstate, &len);
+    assert(len <= cxl_dstate->payload_size);
+
+handled:
+    /*
+     * Set the return code
+     * XXX: it's a 64b register, but we're not setting the vendor, so we can get
+     * away with this
+     */
+    status_reg = FIELD_DP64(0, CXL_DEV_MAILBOX_STS, ERRNO, ret);
+
+    /*
+     * Set the return length
+     */
+    command_reg = FIELD_DP64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND_SET, 0);
+    command_reg = FIELD_DP64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND, 0);
+    command_reg = FIELD_DP64(command_reg, CXL_DEV_MAILBOX_CMD, LENGTH, len);
+
+    cxl_dstate->mbox_reg_state64[A_CXL_DEV_MAILBOX_CMD / 8] = command_reg;
+    cxl_dstate->mbox_reg_state64[A_CXL_DEV_MAILBOX_STS / 8] = status_reg;
+
+    /* Tell the host we're done */
+    ARRAY_FIELD_DP32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL,
+                     DOORBELL, 0);
+}
+
+int cxl_initialize_mailbox(CXLDeviceState *cxl_dstate)
+{
+    const char *cel_uuidstr = "0da9c0b5-bf41-4b78-8f79-96b1623b3f17";
+
+    for (int i = 0; i < 256; i++) {
+        for (int j = 0; j < 256; j++) {
+            if (cxl_cmd_set[i][j].handler) {
+                struct cxl_cmd *c = &cxl_cmd_set[i][j];
+
+                cxl_dstate->cel_log[cxl_dstate->cel_size].opcode = (i << 8) | j;
+                cxl_dstate->cel_log[cxl_dstate->cel_size].effect = c->effect;
+                cxl_dstate->cel_size++;
+            }
+        }
+    }
+
+    return qemu_uuid_parse(cel_uuidstr, &cel_uuid);
+}
diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build
index 47154d6850..0eca715d10 100644
--- a/hw/cxl/meson.build
+++ b/hw/cxl/meson.build
@@ -1,4 +1,5 @@
 softmmu_ss.add(when: 'CONFIG_CXL', if_true: files(
   'cxl-component-utils.c',
   'cxl-device-utils.c',
+  'cxl-mailbox-utils.c',
 ))
diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
index 23f52c4cf9..362cda40de 100644
--- a/include/hw/cxl/cxl.h
+++ b/include/hw/cxl/cxl.h
@@ -14,5 +14,8 @@
 #include "cxl_component.h"
 #include "cxl_device.h"
 
+#define COMPONENT_REG_BAR_IDX 0
+#define DEVICE_REG_BAR_IDX 2
+
 #endif
 
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index f3bcf19410..af91bec10c 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -80,16 +80,27 @@ typedef struct cxl_device_state {
     MemoryRegion device_registers;
 
     /* mmio for device capabilities array - 8.2.8.2 */
+    MemoryRegion device;
     struct {
         MemoryRegion caps;
         uint32_t caps_reg_state32[CXL_CAPS_SIZE / 4];
     };
 
-    /* mmio for the device status registers 8.2.8.3 */
-    MemoryRegion device;
-
     /* mmio for the mailbox registers 8.2.8.4 */
-    MemoryRegion mailbox;
+    struct {
+        MemoryRegion mailbox;
+        uint16_t payload_size;
+        union {
+            uint8_t mbox_reg_state[CXL_MAILBOX_REGISTERS_LENGTH];
+            uint32_t mbox_reg_state32[CXL_MAILBOX_REGISTERS_LENGTH / 4];
+            uint64_t mbox_reg_state64[CXL_MAILBOX_REGISTERS_LENGTH / 8];
+        };
+        struct {
+            uint16_t opcode;
+            uint16_t effect;
+        } cel_log[1 << 16];
+        size_t cel_size;
+    };
 
     /* memory region for persistent memory, HDM */
     MemoryRegion *pmem;
@@ -135,6 +146,9 @@ CXL_DEVICE_CAPABILITY_HEADER_REGISTER(DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET)
 CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MAILBOX, CXL_DEVICE_CAP_HDR1_OFFSET + \
                                                CXL_DEVICE_CAP_REG_SIZE)
 
+int cxl_initialize_mailbox(CXLDeviceState *cxl_dstate);
+void cxl_process_mailbox(CXLDeviceState *cxl_dstate);
+
 #define cxl_device_cap_init(dstate, reg, cap_id)                                   \
     do {                                                                           \
         uint32_t *cap_hdrs = dstate->caps_reg_state32;                             \
@@ -162,6 +176,12 @@ REG32(CXL_DEV_MAILBOX_CTRL, 4)
     FIELD(CXL_DEV_MAILBOX_CTRL, INT_EN, 1, 1)
     FIELD(CXL_DEV_MAILBOX_CTRL, BG_INT_EN, 2, 1)
 
+/* XXX: actually a 64b register */
+REG32(CXL_DEV_MAILBOX_CMD, 8)
+    FIELD(CXL_DEV_MAILBOX_CMD, COMMAND, 0, 8)
+    FIELD(CXL_DEV_MAILBOX_CMD, COMMAND_SET, 8, 8)
+    FIELD(CXL_DEV_MAILBOX_CMD, LENGTH, 16, 20)
+
 /* XXX: actually a 64b register */
 REG32(CXL_DEV_MAILBOX_STS, 0x10)
     FIELD(CXL_DEV_MAILBOX_STS, BG_OP, 0, 1)
-- 
2.30.0
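
The "How to add a new command" steps in cxl-mailbox-utils.c can be
tried outside QEMU with a cut-down table. The sketch below is a
simplified stand-in, not the code from this patch: struct cmd,
dispatch() and the reply bytes are hypothetical, FOO/BAR are the
placeholder names from the comment, and returning Unsupported for an
empty slot is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { MBOX_SUCCESS = 0x0, MBOX_UNSUPPORTED = 0x3 } ret_code;

struct cmd;
typedef ret_code (*handler_fn)(struct cmd *cmd, uint16_t *len);

struct cmd {
    const char *name;
    handler_fn handler;
    uint8_t *payload;
};

/* Step 1: hypothetical command set FOO with command BAR. */
enum { FOO = 0x7f };
#define BAR 0x0

/* Steps 2 and 4: the handler overwrites the payload and sets the length. */
static ret_code cmd_FOO_BAR(struct cmd *cmd, uint16_t *len)
{
    static const uint8_t reply[4] = { 0xde, 0xad, 0xbe, 0xef };

    memcpy(cmd->payload, reply, sizeof(reply));
    *len = sizeof(reply);
    return MBOX_SUCCESS;
}

/* Step 3: designated initializers leave every other slot zeroed. */
static struct cmd cmd_set[256][256] = {
    [FOO][BAR] = { "FOO_BAR", cmd_FOO_BAR, NULL },
};

/* Look up and run a command; slots without a handler are unsupported. */
static ret_code dispatch(uint8_t set, uint8_t op, uint8_t *payload,
                         uint16_t *len)
{
    struct cmd *c = &cmd_set[set][op];

    if (!c->handler) {
        *len = 0;
        return MBOX_UNSUPPORTED;
    }
    c->payload = payload;
    return c->handler(c, len);
}
```

Indexing the table by [set][command] keeps lookup to a single array
access at the cost of a sparse 256x256 table, which is the same
trade-off the patch makes.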



* [RFC PATCH v3 05/31] hw/cxl/device: Implement basic mailbox (8.2.8.4)
@ 2021-02-02  0:59   ` Ben Widawsky
  0 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, David Hildenbrand, Vishal Verma,
	John Groves (jgroves),
	Chris Browy, Markus Armbruster, linux-cxl,
	Philippe Mathieu-Daudé,
	Michael S. Tsirkin, Jonathan Cameron, Igor Mammedov,
	Dan Williams, Ira Weiny

Begin implementing mailbox support for CXL 2.0 devices. The
implementation recognizes when the doorbell is rung, handles the
command and its payload, and then clears the doorbell after writing
back the return code and any output data.

Generally the mailbox mechanism is designed to permit communication
between the host OS and the firmware running on the device. For our
purposes, we emulate both the firmware, implemented primarily in
cxl-mailbox-utils.c, and the hardware.

No commands are implemented yet.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/cxl/cxl-device-utils.c   | 125 ++++++++++++++++++++++-
 hw/cxl/cxl-mailbox-utils.c  | 197 ++++++++++++++++++++++++++++++++++++
 hw/cxl/meson.build          |   1 +
 include/hw/cxl/cxl.h        |   3 +
 include/hw/cxl/cxl_device.h |  28 ++++-
 5 files changed, 349 insertions(+), 5 deletions(-)
 create mode 100644 hw/cxl/cxl-mailbox-utils.c

diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
index bb15ad9a0f..6602606f3d 100644
--- a/hw/cxl/cxl-device-utils.c
+++ b/hw/cxl/cxl-device-utils.c
@@ -40,6 +40,111 @@ static uint64_t dev_reg_read(void *opaque, hwaddr offset, unsigned size)
     return 0;
 }
 
+static uint64_t mailbox_reg_read(void *opaque, hwaddr offset, unsigned size)
+{
+    CXLDeviceState *cxl_dstate = opaque;
+
+    switch (size) {
+    case 8:
+        return cxl_dstate->mbox_reg_state64[offset / 8];
+    case 4:
+        return cxl_dstate->mbox_reg_state32[offset / 4];
+    default:
+        g_assert_not_reached();
+    }
+}
+
+static void mailbox_mem_writel(uint32_t *reg_state, hwaddr offset,
+                               uint64_t value)
+{
+    switch (offset) {
+    case A_CXL_DEV_MAILBOX_CTRL:
+        /* fallthrough */
+    case A_CXL_DEV_MAILBOX_CAP:
+        /* RO register */
+        break;
+    default:
+        qemu_log_mask(LOG_UNIMP,
+                      "%s Unexpected 32-bit access to 0x%" PRIx64 " (WI)\n",
+                      __func__, offset);
+        break;
+    }
+
+    reg_state[offset / 4] = value;
+}
+
+static void mailbox_mem_writeq(uint64_t *reg_state, hwaddr offset,
+                               uint64_t value)
+{
+    switch (offset) {
+    case A_CXL_DEV_MAILBOX_CMD:
+        break;
+    case A_CXL_DEV_BG_CMD_STS:
+        /* BG not supported */
+        /* fallthrough */
+    case A_CXL_DEV_MAILBOX_STS:
+        /* Read only register, will get updated by the state machine */
+        return;
+    default:
+        qemu_log_mask(LOG_UNIMP,
+                      "%s Unexpected 64-bit access to 0x%" PRIx64 " (WI)\n",
+                      __func__, offset);
+        return;
+    }
+
+
+    reg_state[offset / 8] = value;
+}
+
+static void mailbox_reg_write(void *opaque, hwaddr offset, uint64_t value,
+                              unsigned size)
+{
+    CXLDeviceState *cxl_dstate = opaque;
+
+    if (offset >= A_CXL_DEV_CMD_PAYLOAD) {
+        memcpy(cxl_dstate->mbox_reg_state + offset, &value, size);
+        return;
+    }
+
+    /*
+     * Lock is needed to prevent concurrent writes as well as to prevent writes
+     * coming in while the firmware is processing. Without background commands
+     * or the second mailbox implemented, this serves no purpose since the
+     * memory access is synchronized at a higher level (per memory region).
+     */
+    RCU_READ_LOCK_GUARD();
+
+    switch (size) {
+    case 4:
+        mailbox_mem_writel(cxl_dstate->mbox_reg_state32, offset, value);
+        break;
+    case 8:
+        mailbox_mem_writeq(cxl_dstate->mbox_reg_state64, offset, value);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+
+    if (ARRAY_FIELD_EX32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL,
+                         DOORBELL))
+        cxl_process_mailbox(cxl_dstate);
+}
+
+static const MemoryRegionOps mailbox_ops = {
+    .read = mailbox_reg_read,
+    .write = mailbox_reg_write,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .valid = {
+        .min_access_size = 1,
+        .max_access_size = 8,
+        .unaligned = false,
+    },
+    .impl = {
+        .min_access_size = 4,
+        .max_access_size = 8,
+    },
+};
+
 static const MemoryRegionOps dev_ops = {
     .read = dev_reg_read,
     .write = NULL, /* status register is read only */
@@ -80,20 +185,33 @@ void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
                           "cap-array", CXL_DEVICE_REGISTERS_OFFSET - 0);
     memory_region_init_io(&cxl_dstate->device, obj, &dev_ops, cxl_dstate,
                           "device-status", CXL_DEVICE_REGISTERS_LENGTH);
+    memory_region_init_io(&cxl_dstate->mailbox, obj, &mailbox_ops, cxl_dstate,
+                          "mailbox", CXL_MAILBOX_REGISTERS_LENGTH);
 
     memory_region_add_subregion(&cxl_dstate->device_registers, 0,
                                 &cxl_dstate->caps);
     memory_region_add_subregion(&cxl_dstate->device_registers,
                                 CXL_DEVICE_REGISTERS_OFFSET,
                                 &cxl_dstate->device);
+    memory_region_add_subregion(&cxl_dstate->device_registers,
+                                CXL_MAILBOX_REGISTERS_OFFSET,
+                                &cxl_dstate->mailbox);
 }
 
 static void device_reg_init_common(CXLDeviceState *cxl_dstate) { }
 
+static void mailbox_reg_init_common(CXLDeviceState *cxl_dstate)
+{
+    /* 2048 payload size, with no interrupt or background support */
+    ARRAY_FIELD_DP32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CAP,
+                     PAYLOAD_SIZE, CXL_MAILBOX_PAYLOAD_SHIFT);
+    cxl_dstate->payload_size = CXL_MAILBOX_MAX_PAYLOAD_SIZE;
+}
+
 void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
 {
     uint32_t *cap_hdrs = cxl_dstate->caps_reg_state32;
-    const int cap_count = 1;
+    const int cap_count = 2;
 
     /* CXL Device Capabilities Array Register */
     ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_ID, 0);
@@ -102,4 +220,9 @@ void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
 
     cxl_device_cap_init(cxl_dstate, DEVICE, 1);
     device_reg_init_common(cxl_dstate);
+
+    cxl_device_cap_init(cxl_dstate, MAILBOX, 2);
+    mailbox_reg_init_common(cxl_dstate);
+
+    assert(cxl_initialize_mailbox(cxl_dstate) == 0);
 }
diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
new file mode 100644
index 0000000000..466055b01a
--- /dev/null
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -0,0 +1,197 @@
+/*
+ * CXL Utility library for mailbox interface
+ *
+ * Copyright(C) 2020 Intel Corporation.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See the
+ * COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/cxl/cxl.h"
+#include "hw/pci/pci.h"
+#include "qemu/log.h"
+#include "qemu/uuid.h"
+
+/*
+ * How to add a new command, example. The command set FOO, with cmd BAR.
+ *  1. Add the command set and cmd to the enum.
+ *     FOO    = 0x7f,
+ *          #define BAR 0
+ *  2. Forward declare the handler.
+ *     declare_mailbox_handler(FOO_BAR);
+ *  3. Add the command to the cxl_cmd_set[][]
+ *     CXL_CMD(FOO, BAR, 0, 0),
+ *  4. Implement your handler
+ *     define_mailbox_handler(FOO_BAR) { ... return CXL_MBOX_SUCCESS; }
+ *
+ *
+ *  Writing the handler:
+ *    The handler is given the &struct cxl_cmd, the &CXLDeviceState, and the
+ *    in/out length of the payload. The handler is responsible for consuming the
+ *    payload from cmd->payload and operating upon it as necessary. It must then
+ *    fill the output data into cmd->payload (overwriting what was there),
+ *    set the length, and return a valid return code.
+ *
+ *  XXX: The handler need not worry about endianness. The payload is read out of
+ *  a register interface that already deals with it.
+ */
+
+/* 8.2.8.4.5.1 Command Return Codes */
+typedef enum {
+    CXL_MBOX_SUCCESS = 0x0,
+    CXL_MBOX_BG_STARTED = 0x1,
+    CXL_MBOX_INVALID_INPUT = 0x2,
+    CXL_MBOX_UNSUPPORTED = 0x3,
+    CXL_MBOX_INTERNAL_ERROR = 0x4,
+    CXL_MBOX_RETRY_REQUIRED = 0x5,
+    CXL_MBOX_BUSY = 0x6,
+    CXL_MBOX_MEDIA_DISABLED = 0x7,
+    CXL_MBOX_FW_XFER_IN_PROGRESS = 0x8,
+    CXL_MBOX_FW_XFER_OUT_OF_ORDER = 0x9,
+    CXL_MBOX_FW_AUTH_FAILED = 0xa,
+    CXL_MBOX_FW_INVALID_SLOT = 0xb,
+    CXL_MBOX_FW_ROLLEDBACK = 0xc,
+    CXL_MBOX_FW_REST_REQD = 0xd,
+    CXL_MBOX_INVALID_HANDLE = 0xe,
+    CXL_MBOX_INVALID_PA = 0xf,
+    CXL_MBOX_INJECT_POISON_LIMIT = 0x10,
+    CXL_MBOX_PERMANENT_MEDIA_FAILURE = 0x11,
+    CXL_MBOX_ABORTED = 0x12,
+    CXL_MBOX_INVALID_SECURITY_STATE = 0x13,
+    CXL_MBOX_INCORRECT_PASSPHRASE = 0x14,
+    CXL_MBOX_UNSUPPORTED_MAILBOX = 0x15,
+    CXL_MBOX_INVALID_PAYLOAD_LENGTH = 0x16,
+    CXL_MBOX_MAX = 0x17
+} ret_code;
+
+struct cxl_cmd;
+typedef ret_code (*opcode_handler)(struct cxl_cmd *cmd,
+                                   CXLDeviceState *cxl_dstate, uint16_t *len);
+struct cxl_cmd {
+    const char *name;
+    opcode_handler handler;
+    ssize_t in;
+    uint16_t effect; /* Reported in CEL */
+    uint8_t *payload;
+};
+
+#define define_mailbox_handler(name)                \
+    static ret_code cmd_##name(struct cxl_cmd *cmd, \
+                               CXLDeviceState *cxl_dstate, uint16_t *len)
+#define declare_mailbox_handler(name) define_mailbox_handler(name)
+
+#define define_mailbox_handler_zeroed(name, size)                         \
+    uint16_t __zero##name = size;                                         \
+    static ret_code cmd_##name(struct cxl_cmd *cmd,                       \
+                               CXLDeviceState *cxl_dstate, uint16_t *len) \
+    {                                                                     \
+        *len = __zero##name;                                              \
+        memset(cmd->payload, 0, *len);                                    \
+        return CXL_MBOX_SUCCESS;                                          \
+    }
+#define define_mailbox_handler_const(name, data)                          \
+    static ret_code cmd_##name(struct cxl_cmd *cmd,                       \
+                               CXLDeviceState *cxl_dstate, uint16_t *len) \
+    {                                                                     \
+        *len = sizeof(data);                                              \
+        memcpy(cmd->payload, data, *len);                                 \
+        return CXL_MBOX_SUCCESS;                                          \
+    }
+#define define_mailbox_handler_nop(name)                                  \
+    static ret_code cmd_##name(struct cxl_cmd *cmd,                       \
+                               CXLDeviceState *cxl_dstate, uint16_t *len) \
+    {                                                                     \
+        return CXL_MBOX_SUCCESS;                                          \
+    }
+
+#define CXL_CMD(s, c, in, cel_effect) \
+    [s][c] = { stringify(s##_##c), cmd_##s##_##c, in, cel_effect }
+
+static struct cxl_cmd cxl_cmd_set[256][256] = {};
+
+#undef CXL_CMD
+
+QemuUUID cel_uuid;
+
+void cxl_process_mailbox(CXLDeviceState *cxl_dstate)
+{
+    uint16_t ret = CXL_MBOX_SUCCESS;
+    struct cxl_cmd *cxl_cmd;
+    uint64_t status_reg;
+    opcode_handler h;
+
+    /*
+     * current state of mailbox interface
+     *  mbox_cap_reg = cxl_dstate->reg_state32[R_CXL_DEV_MAILBOX_CAP];
+     *  mbox_ctrl_reg = cxl_dstate->reg_state32[R_CXL_DEV_MAILBOX_CTRL];
+     *  status_reg = *(uint64_t *)&cxl_dstate->reg_state[A_CXL_DEV_MAILBOX_STS];
+     */
+    uint64_t command_reg =
+        *(uint64_t *)&cxl_dstate->mbox_reg_state[A_CXL_DEV_MAILBOX_CMD];
+
+    /* Check if we have to do anything */
+    if (!ARRAY_FIELD_EX32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL,
+                          DOORBELL)) {
+        qemu_log_mask(LOG_UNIMP, "Corrupt internal state for firmware\n");
+        return;
+    }
+
+    uint8_t set = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND_SET);
+    uint8_t cmd = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND);
+    uint16_t len = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, LENGTH);
+    cxl_cmd = &cxl_cmd_set[set][cmd];
+    h = cxl_cmd->handler;
+    if (!h) {
+        goto handled;
+    }
+
+    if (len != cxl_cmd->in) {
+        ret = CXL_MBOX_INVALID_PAYLOAD_LENGTH;
+    }
+
+    cxl_cmd->payload = cxl_dstate->mbox_reg_state + A_CXL_DEV_CMD_PAYLOAD;
+    ret = (*h)(cxl_cmd, cxl_dstate, &len);
+    assert(len <= cxl_dstate->payload_size);
+
+handled:
+    /*
+     * Set the return code
+     * XXX: it's a 64b register, but we're not setting the vendor, so we can get
+     * away with this
+     */
+    status_reg = FIELD_DP64(0, CXL_DEV_MAILBOX_STS, ERRNO, ret);
+
+    /*
+     * Set the return length
+     */
+    command_reg = FIELD_DP64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND_SET, 0);
+    command_reg = FIELD_DP64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND, 0);
+    command_reg = FIELD_DP64(command_reg, CXL_DEV_MAILBOX_CMD, LENGTH, len);
+
+    cxl_dstate->mbox_reg_state64[A_CXL_DEV_MAILBOX_CMD / 8] = command_reg;
+    cxl_dstate->mbox_reg_state64[A_CXL_DEV_MAILBOX_STS / 8] = status_reg;
+
+    /* Tell the host we're done */
+    ARRAY_FIELD_DP32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL,
+                     DOORBELL, 0);
+}
+
+int cxl_initialize_mailbox(CXLDeviceState *cxl_dstate)
+{
+    const char *cel_uuidstr = "0da9c0b5-bf41-4b78-8f79-96b1623b3f17";
+
+    for (int i = 0; i < 256; i++) {
+        for (int j = 0; j < 256; j++) {
+            if (cxl_cmd_set[i][j].handler) {
+                struct cxl_cmd *c = &cxl_cmd_set[i][j];
+
+                cxl_dstate->cel_log[cxl_dstate->cel_size].opcode = (i << 8) | j;
+                cxl_dstate->cel_log[cxl_dstate->cel_size].effect = c->effect;
+                cxl_dstate->cel_size++;
+            }
+        }
+    }
+
+    return qemu_uuid_parse(cel_uuidstr, &cel_uuid);
+}
diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build
index 47154d6850..0eca715d10 100644
--- a/hw/cxl/meson.build
+++ b/hw/cxl/meson.build
@@ -1,4 +1,5 @@
 softmmu_ss.add(when: 'CONFIG_CXL', if_true: files(
   'cxl-component-utils.c',
   'cxl-device-utils.c',
+  'cxl-mailbox-utils.c',
 ))
diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
index 23f52c4cf9..362cda40de 100644
--- a/include/hw/cxl/cxl.h
+++ b/include/hw/cxl/cxl.h
@@ -14,5 +14,8 @@
 #include "cxl_component.h"
 #include "cxl_device.h"
 
+#define COMPONENT_REG_BAR_IDX 0
+#define DEVICE_REG_BAR_IDX 2
+
 #endif
 
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index f3bcf19410..af91bec10c 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -80,16 +80,27 @@ typedef struct cxl_device_state {
     MemoryRegion device_registers;
 
     /* mmio for device capabilities array - 8.2.8.2 */
+    MemoryRegion device;
     struct {
         MemoryRegion caps;
         uint32_t caps_reg_state32[CXL_CAPS_SIZE / 4];
     };
 
-    /* mmio for the device status registers 8.2.8.3 */
-    MemoryRegion device;
-
     /* mmio for the mailbox registers 8.2.8.4 */
-    MemoryRegion mailbox;
+    struct {
+        MemoryRegion mailbox;
+        uint16_t payload_size;
+        union {
+            uint8_t mbox_reg_state[CXL_MAILBOX_REGISTERS_LENGTH];
+            uint32_t mbox_reg_state32[CXL_MAILBOX_REGISTERS_LENGTH / 4];
+            uint64_t mbox_reg_state64[CXL_MAILBOX_REGISTERS_LENGTH / 8];
+        };
+        struct {
+            uint16_t opcode;
+            uint16_t effect;
+        } cel_log[1 << 16];
+        size_t cel_size;
+    };
 
     /* memory region for persistent memory, HDM */
     MemoryRegion *pmem;
@@ -135,6 +146,9 @@ CXL_DEVICE_CAPABILITY_HEADER_REGISTER(DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET)
 CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MAILBOX, CXL_DEVICE_CAP_HDR1_OFFSET + \
                                                CXL_DEVICE_CAP_REG_SIZE)
 
+int cxl_initialize_mailbox(CXLDeviceState *cxl_dstate);
+void cxl_process_mailbox(CXLDeviceState *cxl_dstate);
+
 #define cxl_device_cap_init(dstate, reg, cap_id)                                   \
     do {                                                                           \
         uint32_t *cap_hdrs = dstate->caps_reg_state32;                             \
@@ -162,6 +176,12 @@ REG32(CXL_DEV_MAILBOX_CTRL, 4)
     FIELD(CXL_DEV_MAILBOX_CTRL, INT_EN, 1, 1)
     FIELD(CXL_DEV_MAILBOX_CTRL, BG_INT_EN, 2, 1)
 
+/* XXX: actually a 64b register */
+REG32(CXL_DEV_MAILBOX_CMD, 8)
+    FIELD(CXL_DEV_MAILBOX_CMD, COMMAND, 0, 8)
+    FIELD(CXL_DEV_MAILBOX_CMD, COMMAND_SET, 8, 8)
+    FIELD(CXL_DEV_MAILBOX_CMD, LENGTH, 16, 20)
+
 /* XXX: actually a 64b register */
 REG32(CXL_DEV_MAILBOX_STS, 0x10)
     FIELD(CXL_DEV_MAILBOX_STS, BG_OP, 0, 1)
-- 
2.30.0




* [RFC PATCH v3 06/31] hw/cxl/device: Add memory device utilities
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

Memory devices implement extra capabilities on top of CXL devices. This
adds support for that.

A large part of a memory device is the mailbox/command interface. All of
the mailbox handling is done in the mailbox-utils library. Longer term,
newly emulated CXL devices may want to handle commands differently, and
would therefore need a mechanism to opt in to or out of specific generic
handlers. The shared handling is considered sufficient for now, but it
may need more depth in the future.
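
For illustration, the constant value that the memory device status
register reads back can be worked out with a standalone sketch. The
field positions are the ones in the CXL_MEM_DEV_STS FIELD() lines of
this patch; field_dp64(), field_ex64() and toy_mem_dev_status() are
hypothetical stand-ins for QEMU's FIELD_DP64()/FIELD_EX64() macros and
for mdev_reg_read(). Reading MEDIA_STATUS == 1 as "ready" is my
understanding of the CXL 2.0 encoding and should be treated as an
assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Deposit val into the len-bit field that starts at bit shift. */
static uint64_t field_dp64(uint64_t reg, unsigned shift, unsigned len,
                           uint64_t val)
{
    assert(len > 0 && len < 64);
    uint64_t mask = ((1ULL << len) - 1) << shift;
    return (reg & ~mask) | ((val << shift) & mask);
}

/* Extract the len-bit field that starts at bit shift. */
static uint64_t field_ex64(uint64_t reg, unsigned shift, unsigned len)
{
    assert(len > 0 && len < 64);
    return (reg >> shift) & ((1ULL << len) - 1);
}

/* Positions from the FIELD(CXL_MEM_DEV_STS, ...) lines in the patch. */
enum {
    MEDIA_STATUS_SHIFT = 2, MEDIA_STATUS_LEN = 2,
    MBOX_READY_SHIFT = 4, MBOX_READY_LEN = 1,
};

/* Same two deposits as the read callback: media status 1, mailbox ready. */
static uint64_t toy_mem_dev_status(void)
{
    uint64_t v = 0;

    v = field_dp64(v, MEDIA_STATUS_SHIFT, MEDIA_STATUS_LEN, 1);
    v = field_dp64(v, MBOX_READY_SHIFT, MBOX_READY_LEN, 1);
    return v;
}
```

With these positions the register reads 0x14: bit 2 from MEDIA_STATUS
and bit 4 from MBOX_READY. The FATAL, FW_HALT and RESET_NEEDED fields
stay zero.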

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/cxl/cxl-device-utils.c   | 38 ++++++++++++++++++++++++++++++++++++-
 include/hw/cxl/cxl_device.h | 18 +++++++++++++++++-
 2 files changed, 54 insertions(+), 2 deletions(-)

diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
index 6602606f3d..639ace523d 100644
--- a/hw/cxl/cxl-device-utils.c
+++ b/hw/cxl/cxl-device-utils.c
@@ -130,6 +130,31 @@ static void mailbox_reg_write(void *opaque, hwaddr offset, uint64_t value,
         cxl_process_mailbox(cxl_dstate);
 }
 
+static uint64_t mdev_reg_read(void *opaque, hwaddr offset, unsigned size)
+{
+    uint64_t retval = 0;
+
+    retval = FIELD_DP64(retval, CXL_MEM_DEV_STS, MEDIA_STATUS, 1);
+    retval = FIELD_DP64(retval, CXL_MEM_DEV_STS, MBOX_READY, 1);
+
+    return retval;
+}
+
+static const MemoryRegionOps mdev_ops = {
+    .read = mdev_reg_read,
+    .write = NULL, /* memory device register is read only */
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .valid = {
+        .min_access_size = 1,
+        .max_access_size = 8,
+        .unaligned = false,
+    },
+    .impl = {
+        .min_access_size = 8,
+        .max_access_size = 8,
+    },
+};
+
 static const MemoryRegionOps mailbox_ops = {
     .read = mailbox_reg_read,
     .write = mailbox_reg_write,
@@ -187,6 +212,9 @@ void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
                           "device-status", CXL_DEVICE_REGISTERS_LENGTH);
     memory_region_init_io(&cxl_dstate->mailbox, obj, &mailbox_ops, cxl_dstate,
                           "mailbox", CXL_MAILBOX_REGISTERS_LENGTH);
+    memory_region_init_io(&cxl_dstate->memory_device, obj, &mdev_ops,
+                          cxl_dstate, "memory device caps",
+                          CXL_MEMORY_DEVICE_REGISTERS_LENGTH);
 
     memory_region_add_subregion(&cxl_dstate->device_registers, 0,
                                 &cxl_dstate->caps);
@@ -196,6 +224,9 @@ void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
     memory_region_add_subregion(&cxl_dstate->device_registers,
                                 CXL_MAILBOX_REGISTERS_OFFSET,
                                 &cxl_dstate->mailbox);
+    memory_region_add_subregion(&cxl_dstate->device_registers,
+                                CXL_MEMORY_DEVICE_REGISTERS_OFFSET,
+                                &cxl_dstate->memory_device);
 }
 
 static void device_reg_init_common(CXLDeviceState *cxl_dstate) { }
@@ -208,10 +239,12 @@ static void mailbox_reg_init_common(CXLDeviceState *cxl_dstate)
     cxl_dstate->payload_size = CXL_MAILBOX_MAX_PAYLOAD_SIZE;
 }
 
+static void memdev_reg_init_common(CXLDeviceState *cxl_dstate) { }
+
 void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
 {
     uint32_t *cap_hdrs = cxl_dstate->caps_reg_state32;
-    const int cap_count = 2;
+    const int cap_count = 3;
 
     /* CXL Device Capabilities Array Register */
     ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_ID, 0);
@@ -224,5 +257,8 @@ void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
     cxl_device_cap_init(cxl_dstate, MAILBOX, 2);
     mailbox_reg_init_common(cxl_dstate);
 
+    cxl_device_cap_init(cxl_dstate, MEMORY_DEVICE, 0x4000);
+    memdev_reg_init_common(cxl_dstate);
+
     assert(cxl_initialize_mailbox(cxl_dstate) == 0);
 }
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index af91bec10c..0cc5354ba4 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -72,15 +72,20 @@
 #define CXL_MAILBOX_REGISTERS_LENGTH \
     (CXL_MAILBOX_REGISTERS_SIZE + CXL_MAILBOX_MAX_PAYLOAD_SIZE)
 
+#define CXL_MEMORY_DEVICE_REGISTERS_OFFSET \
+    (CXL_MAILBOX_REGISTERS_OFFSET + CXL_MAILBOX_REGISTERS_LENGTH)
+#define CXL_MEMORY_DEVICE_REGISTERS_LENGTH 0x8
+
 #define CXL_MMIO_SIZE                                       \
     CXL_DEVICE_CAP_REG_SIZE + CXL_DEVICE_REGISTERS_LENGTH + \
-        CXL_MAILBOX_REGISTERS_LENGTH
+        CXL_MAILBOX_REGISTERS_LENGTH + CXL_MEMORY_DEVICE_REGISTERS_LENGTH
 
 typedef struct cxl_device_state {
     MemoryRegion device_registers;
 
     /* mmio for device capabilities array - 8.2.8.2 */
     MemoryRegion device;
+    MemoryRegion memory_device;
     struct {
         MemoryRegion caps;
         uint32_t caps_reg_state32[CXL_CAPS_SIZE / 4];
@@ -145,6 +150,9 @@ REG32(CXL_DEV_CAP_ARRAY2, 4) /* We're going to pretend it's 64b */
 CXL_DEVICE_CAPABILITY_HEADER_REGISTER(DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET)
 CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MAILBOX, CXL_DEVICE_CAP_HDR1_OFFSET + \
                                                CXL_DEVICE_CAP_REG_SIZE)
+CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MEMORY_DEVICE,
+                                      CXL_DEVICE_CAP_HDR1_OFFSET +
+                                          CXL_DEVICE_CAP_REG_SIZE * 2)
 
 int cxl_initialize_mailbox(CXLDeviceState *cxl_dstate);
 void cxl_process_mailbox(CXLDeviceState *cxl_dstate);
@@ -197,4 +205,12 @@ REG32(CXL_DEV_BG_CMD_STS, 0x18)
 
 REG32(CXL_DEV_CMD_PAYLOAD, 0x20)
 
+/* XXX: actually a 64b register */
+REG32(CXL_MEM_DEV_STS, 0)
+    FIELD(CXL_MEM_DEV_STS, FATAL, 0, 1)
+    FIELD(CXL_MEM_DEV_STS, FW_HALT, 1, 1)
+    FIELD(CXL_MEM_DEV_STS, MEDIA_STATUS, 2, 2)
+    FIELD(CXL_MEM_DEV_STS, MBOX_READY, 4, 1)
+    FIELD(CXL_MEM_DEV_STS, RESET_NEEDED, 5, 3)
+
 #endif
-- 
2.30.0



* [RFC PATCH v3 07/31] hw/cxl/device: Add cheap EVENTS implementation (8.2.9.1)
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

Using the previously implemented stubbed helpers, it is now possible to
easily add the missing, required commands to the implementation.
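
As a rough standalone sketch of how the stubbed helpers and the command
table fit together (not the code in this patch): the TOY_* macros,
toy_cmd_set and toy_dispatch() below are hypothetical simplifications of
define_mailbox_handler_zeroed(), define_mailbox_handler_nop(), CXL_CMD()
and the lookup in cxl_process_mailbox(), with CXLDeviceState and the
in/effect columns left out. The sketch returns 0x3
(CXL_MBOX_UNSUPPORTED) for an empty slot, which is a choice made here
for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint16_t ret_code;

struct toy_cmd;
typedef ret_code (*toy_handler)(struct toy_cmd *cmd, uint16_t *len);

struct toy_cmd {
    const char *name;
    toy_handler handler;
    uint8_t *payload;
};

/* Handler that returns `size` bytes of zeroes. */
#define TOY_HANDLER_ZEROED(name, size)                             \
    static ret_code cmd_##name(struct toy_cmd *cmd, uint16_t *len) \
    {                                                              \
        *len = (size);                                             \
        memset(cmd->payload, 0, *len);                             \
        return 0;                                                  \
    }

/* Handler that accepts the command and does nothing. */
#define TOY_HANDLER_NOP(name)                                      \
    static ret_code cmd_##name(struct toy_cmd *cmd, uint16_t *len) \
    {                                                              \
        (void)cmd;                                                 \
        (void)len;                                                 \
        return 0;                                                  \
    }

enum { EVENTS = 0x01 };
#define GET_RECORDS   0x0
#define CLEAR_RECORDS 0x1

TOY_HANDLER_ZEROED(EVENTS_GET_RECORDS, 0x20)
TOY_HANDLER_NOP(EVENTS_CLEAR_RECORDS)

/*
 * [s][c] is macro-expanded to numeric indices, while the operands of
 * # and ## are not expanded, so the entry name and handler symbol keep
 * the spelled-out command name.
 */
#define TOY_CMD(s, c) [s][c] = { #s "_" #c, cmd_##s##_##c, NULL }

static struct toy_cmd toy_cmd_set[256][256] = {
    TOY_CMD(EVENTS, GET_RECORDS),
    TOY_CMD(EVENTS, CLEAR_RECORDS),
};

/* Look up (set, opcode) and run the handler against `payload`. */
static ret_code toy_dispatch(uint8_t set, uint8_t opcode,
                             uint8_t *payload, uint16_t *len)
{
    struct toy_cmd *c = &toy_cmd_set[set][opcode];

    if (!c->handler) {
        return 0x3; /* CXL_MBOX_UNSUPPORTED in the patch's enum */
    }
    c->payload = payload;
    return c->handler(c, len);
}
```

Calling toy_dispatch(EVENTS, GET_RECORDS, buf, &len) zeroes the first
0x20 bytes of buf and sets len to 0x20. Adding another stubbed command
takes one handler macro line and one table entry, which is the property
this patch relies on.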

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/cxl/cxl-mailbox-utils.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index 466055b01a..7c939a1851 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -37,6 +37,14 @@
  *  a register interface that already deals with it.
  */
 
+enum {
+    EVENTS      = 0x01,
+        #define GET_RECORDS   0x0
+        #define CLEAR_RECORDS   0x1
+        #define GET_INTERRUPT_POLICY   0x2
+        #define SET_INTERRUPT_POLICY   0x3
+};
+
 /* 8.2.8.4.5.1 Command Return Codes */
 typedef enum {
     CXL_MBOX_SUCCESS = 0x0,
@@ -105,10 +113,23 @@ struct cxl_cmd {
         return CXL_MBOX_SUCCESS;                                          \
     }
 
+define_mailbox_handler_zeroed(EVENTS_GET_RECORDS, 0x20);
+define_mailbox_handler_nop(EVENTS_CLEAR_RECORDS);
+define_mailbox_handler_zeroed(EVENTS_GET_INTERRUPT_POLICY, 4);
+define_mailbox_handler_nop(EVENTS_SET_INTERRUPT_POLICY);
+
+#define IMMEDIATE_CONFIG_CHANGE (1 << 1)
+#define IMMEDIATE_LOG_CHANGE (1 << 4)
+
 #define CXL_CMD(s, c, in, cel_effect) \
     [s][c] = { stringify(s##_##c), cmd_##s##_##c, in, cel_effect }
 
-static struct cxl_cmd cxl_cmd_set[256][256] = {};
+static struct cxl_cmd cxl_cmd_set[256][256] = {
+    CXL_CMD(EVENTS, GET_RECORDS, 1, 0),
+    CXL_CMD(EVENTS, CLEAR_RECORDS, ~0, IMMEDIATE_LOG_CHANGE),
+    CXL_CMD(EVENTS, GET_INTERRUPT_POLICY, 0, 0),
+    CXL_CMD(EVENTS, SET_INTERRUPT_POLICY, 4, IMMEDIATE_CONFIG_CHANGE),
+};
 
 #undef CXL_CMD
 
-- 
2.30.0



* [RFC PATCH v3 08/31] hw/cxl/device: Timestamp implementation (8.2.9.3)
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

Per the spec, the timestamp appears to be a free-running counter that
starts from a value set by the host via the Set Timestamp command
(0301h). There are references to the epoch, which seem like a red
herring. Therefore, the timestamp is implemented as a free-running
counter that starts from the last value issued by the Set Timestamp
command.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/cxl/cxl-mailbox-utils.c  | 53 +++++++++++++++++++++++++++++++++++++
 include/hw/cxl/cxl_device.h |  6 +++++
 2 files changed, 59 insertions(+)

diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index 7c939a1851..3d36614c0c 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -43,6 +43,9 @@ enum {
         #define CLEAR_RECORDS   0x1
         #define GET_INTERRUPT_POLICY   0x2
         #define SET_INTERRUPT_POLICY   0x3
+    TIMESTAMP   = 0x03,
+        #define GET           0x0
+        #define SET           0x1
 };
 
 /* 8.2.8.4.5.1 Command Return Codes */
@@ -117,8 +120,11 @@ define_mailbox_handler_zeroed(EVENTS_GET_RECORDS, 0x20);
 define_mailbox_handler_nop(EVENTS_CLEAR_RECORDS);
 define_mailbox_handler_zeroed(EVENTS_GET_INTERRUPT_POLICY, 4);
 define_mailbox_handler_nop(EVENTS_SET_INTERRUPT_POLICY);
+declare_mailbox_handler(TIMESTAMP_GET);
+declare_mailbox_handler(TIMESTAMP_SET);
 
 #define IMMEDIATE_CONFIG_CHANGE (1 << 1)
+#define IMMEDIATE_POLICY_CHANGE (1 << 3)
 #define IMMEDIATE_LOG_CHANGE (1 << 4)
 
 #define CXL_CMD(s, c, in, cel_effect) \
@@ -129,10 +135,57 @@ static struct cxl_cmd cxl_cmd_set[256][256] = {
     CXL_CMD(EVENTS, CLEAR_RECORDS, ~0, IMMEDIATE_LOG_CHANGE),
     CXL_CMD(EVENTS, GET_INTERRUPT_POLICY, 0, 0),
     CXL_CMD(EVENTS, SET_INTERRUPT_POLICY, 4, IMMEDIATE_CONFIG_CHANGE),
+    CXL_CMD(TIMESTAMP, GET, 0, 0),
+    CXL_CMD(TIMESTAMP, SET, 8, IMMEDIATE_POLICY_CHANGE),
 };
 
 #undef CXL_CMD
 
+/*
+ * 8.2.9.3.1
+ */
+define_mailbox_handler(TIMESTAMP_GET)
+{
+    struct timespec ts;
+    uint64_t delta;
+
+    if (!cxl_dstate->timestamp.set) {
+        *(uint64_t *)cmd->payload = 0;
+        goto done;
+    }
+
+    /* First find the delta from the last time the host set the time. */
+    clock_gettime(CLOCK_REALTIME, &ts);
+    delta = (ts.tv_sec * NANOSECONDS_PER_SECOND + ts.tv_nsec) -
+            cxl_dstate->timestamp.last_set;
+
+    /* Then adjust the actual time */
+    stq_le_p(cmd->payload, cxl_dstate->timestamp.host_set + delta);
+
+done:
+    *len = 8;
+    return CXL_MBOX_SUCCESS;
+}
+
+/*
+ * 8.2.9.3.2
+ */
+define_mailbox_handler(TIMESTAMP_SET)
+{
+    struct timespec ts;
+
+    clock_gettime(CLOCK_REALTIME, &ts);
+
+    cxl_dstate->timestamp.set = true;
+    cxl_dstate->timestamp.last_set =
+        ts.tv_sec * NANOSECONDS_PER_SECOND + ts.tv_nsec;
+
+    cxl_dstate->timestamp.host_set = le64_to_cpu(*(uint64_t *)cmd->payload);
+
+    *len = 0;
+    return CXL_MBOX_SUCCESS;
+}
+
 QemuUUID cel_uuid;
 
 void cxl_process_mailbox(CXLDeviceState *cxl_dstate)
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index 0cc5354ba4..ca5328a581 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -107,6 +107,12 @@ typedef struct cxl_device_state {
         size_t cel_size;
     };
 
+    struct {
+        bool set;
+        uint64_t last_set;
+        uint64_t host_set;
+    } timestamp;
+
     /* memory region for persistent memory, HDM */
     MemoryRegion *pmem;
 
-- 
2.30.0
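
Editor's sketch of the arithmetic behind the two handlers above: Set records the host's value together with a local clock reading, and Get returns the host's value plus the time elapsed since then. The demo_ names are hypothetical, and the clock reading is passed in as a parameter (an assumption made so the logic can be exercised without a real clock); the patch itself reads CLOCK_REALTIME inside the handlers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_timestamp {
    bool set;           /* has the host issued Set Timestamp yet? */
    uint64_t last_set;  /* local clock reading (ns) at the last Set */
    uint64_t host_set;  /* value (ns) the host supplied at the last Set */
};

/* Record the host's value and the local time at which it arrived. */
static void demo_timestamp_set(struct demo_timestamp *t,
                               uint64_t host_value, uint64_t now_ns)
{
    assert(t != NULL);
    t->set = true;
    t->last_set = now_ns;
    t->host_set = host_value;
}

/* Return 0 until the host has set a value, then host value + elapsed. */
static uint64_t demo_timestamp_get(const struct demo_timestamp *t,
                                   uint64_t now_ns)
{
    assert(t != NULL);
    if (!t->set) {
        return 0;
    }
    return t->host_set + (now_ns - t->last_set);
}
```

For example, if the host sets 1000 when the local clock reads 50, a Get issued when the local clock reads 80 yields 1000 + (80 - 50) = 1030. One design note, offered tentatively: a wall clock can step backwards, so a monotonic clock source might be a safer basis for the elapsed-time term.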



* [RFC PATCH v3 09/31] hw/cxl/device: Add log commands (8.2.9.4) + CEL
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

The CXL specification provides for the ability to obtain logs from the
device. Logs are either spec defined, like the "Command Effects Log"
(CEL), or vendor specific. UUIDs are defined for all log types.

The CEL is a mechanism to provide information to the host about which
commands are supported. It is useful both to determine which optional
commands defined by the spec are supported and to provide a list of
vendor-specific commands that might be used. The CEL is already created
as part of mailbox initialization, but here it is now exported to hosts
that use these log commands.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/cxl/cxl-mailbox-utils.c | 67 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 67 insertions(+)

diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index 3d36614c0c..3f0ae8b9e5 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -46,6 +46,9 @@ enum {
     TIMESTAMP   = 0x03,
         #define GET           0x0
         #define SET           0x1
+    LOGS        = 0x04,
+        #define GET_SUPPORTED 0x0
+        #define GET_LOG       0x1
 };
 
 /* 8.2.8.4.5.1 Command Return Codes */
@@ -122,6 +125,8 @@ define_mailbox_handler_zeroed(EVENTS_GET_INTERRUPT_POLICY, 4);
 define_mailbox_handler_nop(EVENTS_SET_INTERRUPT_POLICY);
 declare_mailbox_handler(TIMESTAMP_GET);
 declare_mailbox_handler(TIMESTAMP_SET);
+declare_mailbox_handler(LOGS_GET_SUPPORTED);
+declare_mailbox_handler(LOGS_GET_LOG);
 
 #define IMMEDIATE_CONFIG_CHANGE (1 << 1)
 #define IMMEDIATE_POLICY_CHANGE (1 << 3)
@@ -137,6 +142,8 @@ static struct cxl_cmd cxl_cmd_set[256][256] = {
     CXL_CMD(EVENTS, SET_INTERRUPT_POLICY, 4, IMMEDIATE_CONFIG_CHANGE),
     CXL_CMD(TIMESTAMP, GET, 0, 0),
     CXL_CMD(TIMESTAMP, SET, 8, IMMEDIATE_POLICY_CHANGE),
+    CXL_CMD(LOGS, GET_SUPPORTED, 0, 0),
+    CXL_CMD(LOGS, GET_LOG, 0x18, 0),
 };
 
 #undef CXL_CMD
@@ -188,6 +195,66 @@ define_mailbox_handler(TIMESTAMP_SET)
 
 QemuUUID cel_uuid;
 
+/* 8.2.9.4.1 */
+define_mailbox_handler(LOGS_GET_SUPPORTED)
+{
+    struct {
+        uint16_t entries;
+        uint8_t rsvd[6];
+        struct {
+            QemuUUID uuid;
+            uint32_t size;
+        } log_entries[1];
+    } __attribute__((packed)) *supported_logs = (void *)cmd->payload;
+    _Static_assert(sizeof(*supported_logs) == 0x1c, "Bad supported log size");
+
+    supported_logs->entries = 1;
+    supported_logs->log_entries[0].uuid = cel_uuid;
+    supported_logs->log_entries[0].size = 4 * cxl_dstate->cel_size;
+
+    *len = sizeof(*supported_logs);
+    return CXL_MBOX_SUCCESS;
+}
+
+/* 8.2.9.4.2 */
+define_mailbox_handler(LOGS_GET_LOG)
+{
+    struct {
+        QemuUUID uuid;
+        uint32_t offset;
+        uint32_t length;
+    } __attribute__((packed, __aligned__(16))) *get_log = (void *)cmd->payload;
+
+    /*
+     * 8.2.9.4.2
+     *   The device shall return Invalid Parameter if the Offset or Length
+     *   fields attempt to access beyond the size of the log as reported by Get
+     *   Supported Logs.
+     *
+     * XXX: Spec is wrong, "Invalid Parameter" isn't a thing.
+     * XXX: Spec doesn't address an incorrect UUID.
+     *
+     * The CEL buffer is large enough to fit all commands in the emulation, so
+     * the only possible failure would be if the mailbox itself isn't big
+     * enough.
+     */
+    if (get_log->offset + get_log->length > cxl_dstate->payload_size) {
+        return CXL_MBOX_INVALID_INPUT;
+    }
+
+    if (!qemu_uuid_is_equal(&get_log->uuid, &cel_uuid)) {
+        return CXL_MBOX_UNSUPPORTED;
+    }
+
+    /* Save the length before the copy overwrites the request payload */
+    *len = get_log->length;
+
+    memmove(cmd->payload, cxl_dstate->cel_log + get_log->offset,
+           get_log->length);
+
+    return CXL_MBOX_SUCCESS;
+}
+
 void cxl_process_mailbox(CXLDeviceState *cxl_dstate)
 {
     uint16_t ret = CXL_MBOX_SUCCESS;
-- 
2.30.0
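
Editor's sketch of the range check quoted from 8.2.9.4.2 in the diff above, written so that it cannot wrap. With 32-bit offset and length fields, computing offset + length directly can overflow and slip past a simple comparison, so this version compares against the remaining space instead. Note the differences from the patch: the sketch checks against the log size (as the quoted spec text describes) rather than the mailbox payload size, and all demo_ names, the byte-array log, and the return values are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative return codes, not the spec's numbering. */
enum { DEMO_OK = 0, DEMO_INVALID_INPUT = 1 };

/*
 * Copy `length` bytes starting at `offset` from a log of `log_size` bytes
 * into `out`, which can hold `out_size` bytes.
 */
static int demo_get_log(const uint8_t *log, uint32_t log_size,
                        uint32_t offset, uint32_t length,
                        uint8_t *out, uint32_t out_size, uint32_t *out_len)
{
    assert(log != NULL && out != NULL && out_len != NULL);

    /* Never form offset + length: it could wrap around in 32 bits. */
    if (offset > log_size || length > log_size - offset) {
        return DEMO_INVALID_INPUT;
    }

    /* The response also has to fit in the output buffer. */
    if (length > out_size) {
        return DEMO_INVALID_INPUT;
    }

    memcpy(out, log + offset, length);
    *out_len = length;
    return DEMO_OK;
}
```

With an 8-byte log, a request for offset 2 and length 4 succeeds, while offset 6 with length 4, or offset 0xffffffff with length 2, is rejected before any copy happens.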



* [RFC PATCH v3 10/31] hw/pxb: Use a type for realizing expanders
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

This opens up the possibility for more types of expanders (other than
PCI and PCIe). We'll need this to create a CXL expander.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/pci-bridge/pci_expander_bridge.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
index aedded1064..232b7ce305 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -24,6 +24,8 @@
 #include "hw/boards.h"
 #include "qom/object.h"
 
+enum BusType { PCI, PCIE };
+
 #define TYPE_PXB_BUS "pxb-bus"
 typedef struct PXBBus PXBBus;
 DECLARE_INSTANCE_CHECKER(PXBBus, PXB_BUS,
@@ -214,7 +216,8 @@ static gint pxb_compare(gconstpointer a, gconstpointer b)
            0;
 }
 
-static void pxb_dev_realize_common(PCIDevice *dev, bool pcie, Error **errp)
+static void pxb_dev_realize_common(PCIDevice *dev, enum BusType type,
+                                   Error **errp)
 {
     PXBDev *pxb = convert_to_pxb(dev);
     DeviceState *ds, *bds = NULL;
@@ -239,7 +242,7 @@ static void pxb_dev_realize_common(PCIDevice *dev, bool pcie, Error **errp)
     }
 
     ds = qdev_new(TYPE_PXB_HOST);
-    if (pcie) {
+    if (type == PCIE) {
         bus = pci_root_bus_new(ds, dev_name, NULL, NULL, 0, TYPE_PXB_PCIE_BUS);
     } else {
         bus = pci_root_bus_new(ds, "pxb-internal", NULL, NULL, 0, TYPE_PXB_BUS);
@@ -287,7 +290,7 @@ static void pxb_dev_realize(PCIDevice *dev, Error **errp)
         return;
     }
 
-    pxb_dev_realize_common(dev, false, errp);
+    pxb_dev_realize_common(dev, PCI, errp);
 }
 
 static void pxb_dev_exitfn(PCIDevice *pci_dev)
@@ -339,7 +342,7 @@ static void pxb_pcie_dev_realize(PCIDevice *dev, Error **errp)
         return;
     }
 
-    pxb_dev_realize_common(dev, true, errp);
+    pxb_dev_realize_common(dev, PCIE, errp);
 }
 
 static void pxb_pcie_dev_class_init(ObjectClass *klass, void *data)
-- 
2.30.0



* [RFC PATCH v3 11/31] hw/pci/cxl: Create a CXL bus type
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

The easiest way to differentiate a CXL bus from a PCIe bus is with a
flag. A CXL bus, in hardware, is backward compatible with PCIe, and
therefore the code tries pretty hard to keep them in sync as much as
possible.

The other way to implement this would be to try to cast the bus to the
correct type. Using a flag is less code, and it helps debugging because
the bus type can be seen by simply looking at the flags.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/pci-bridge/pci_expander_bridge.c | 9 ++++++++-
 include/hw/pci/pci_bus.h            | 7 +++++++
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
index 232b7ce305..88c45dc3b5 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -24,7 +24,7 @@
 #include "hw/boards.h"
 #include "qom/object.h"
 
-enum BusType { PCI, PCIE };
+enum BusType { PCI, PCIE, CXL };
 
 #define TYPE_PXB_BUS "pxb-bus"
 typedef struct PXBBus PXBBus;
@@ -35,6 +35,10 @@ DECLARE_INSTANCE_CHECKER(PXBBus, PXB_BUS,
 DECLARE_INSTANCE_CHECKER(PXBBus, PXB_PCIE_BUS,
                          TYPE_PXB_PCIE_BUS)
 
+#define TYPE_PXB_CXL_BUS "pxb-cxl-bus"
+DECLARE_INSTANCE_CHECKER(PXBBus, PXB_CXL_BUS,
+                         TYPE_PXB_CXL_BUS)
+
 struct PXBBus {
     /*< private >*/
     PCIBus parent_obj;
@@ -244,6 +248,9 @@ static void pxb_dev_realize_common(PCIDevice *dev, enum BusType type,
     ds = qdev_new(TYPE_PXB_HOST);
     if (type == PCIE) {
         bus = pci_root_bus_new(ds, dev_name, NULL, NULL, 0, TYPE_PXB_PCIE_BUS);
+    } else if (type == CXL) {
+        bus = pci_root_bus_new(ds, dev_name, NULL, NULL, 0, TYPE_PXB_CXL_BUS);
+        bus->flags |= PCI_BUS_CXL;
     } else {
         bus = pci_root_bus_new(ds, "pxb-internal", NULL, NULL, 0, TYPE_PXB_BUS);
         bds = qdev_new("pci-bridge");
diff --git a/include/hw/pci/pci_bus.h b/include/hw/pci/pci_bus.h
index 347440d42c..eb94e7e85c 100644
--- a/include/hw/pci/pci_bus.h
+++ b/include/hw/pci/pci_bus.h
@@ -24,6 +24,8 @@ enum PCIBusFlags {
     PCI_BUS_IS_ROOT                                         = 0x0001,
     /* PCIe extended configuration space is accessible on this bus */
     PCI_BUS_EXTENDED_CONFIG_SPACE                           = 0x0002,
+    /* This is a CXL Type BUS */
+    PCI_BUS_CXL                                             = 0x0004,
 };
 
 struct PCIBus {
@@ -53,6 +55,11 @@ struct PCIBus {
     Notifier machine_done;
 };
 
+static inline bool pci_bus_is_cxl(PCIBus *bus)
+{
+    return !!(bus->flags & PCI_BUS_CXL);
+}
+
 static inline bool pci_bus_is_root(PCIBus *bus)
 {
     return !!(bus->flags & PCI_BUS_IS_ROOT);
-- 
2.30.0
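
Editor's sketch of the flag approach described in the commit message above: the bus carries a bit mask, the CXL bit is OR-ed in when the bus is created, and a small predicate tests it. The demo_ types are hypothetical stand-ins; only the three flag values and the shape of the !!(flags & BIT) test are taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_bus_flags {
    DEMO_BUS_IS_ROOT               = 0x0001,
    DEMO_BUS_EXTENDED_CONFIG_SPACE = 0x0002,
    DEMO_BUS_CXL                   = 0x0004,
};

/* Minimal stand-in for a bus object: just the flag word. */
struct demo_bus {
    unsigned int flags;
};

/* True if the CXL bit is set, regardless of any other flags. */
static inline bool demo_bus_is_cxl(const struct demo_bus *bus)
{
    assert(bus != NULL);
    return !!(bus->flags & DEMO_BUS_CXL);
}

/* True if the root bit is set, regardless of any other flags. */
static inline bool demo_bus_is_root(const struct demo_bus *bus)
{
    assert(bus != NULL);
    return !!(bus->flags & DEMO_BUS_IS_ROOT);
}
```

Since each property is an independent bit, a bus can be a root bus and a CXL bus at the same time, and the whole state is visible in a debugger as a single integer (0x0005 in that case).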



* [RFC PATCH v3 12/31] hw/pxb: Allow creation of a CXL PXB (host bridge)
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

This works like adding a typical pxb device, except the name is
'pxb-cxl' instead of 'pxb-pcie'. An example command line would be as
follows:
  -device pxb-cxl,id=cxl.0,bus="pcie.0",bus_nr=1

A CXL PXB is backward compatible with PCIe. What this means in practice
is that an operating system that is unaware of CXL should still be able
to enumerate this topology as if it were PCIe.

One can create multiple CXL PXB host bridges, but a host bridge can only
be connected to the main root bus. Host bridges cannot appear elsewhere
in the topology.

Note that as of this patch, the ACPI tables needed for the host bridge
(specifically, an ACPI object in _SB named ACPI0016 and the CEDT) aren't
created. So while this patch internally creates it, it cannot be
properly used by an operating system or other system software.

Upcoming patches will allow creating multiple host bridges.

v2: Remove vendor and device ID (Ben)

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/pci-bridge/pci_expander_bridge.c | 67 ++++++++++++++++++++++++++++-
 hw/pci/pci.c                        |  7 +++
 include/hw/pci/pci.h                |  6 +++
 3 files changed, 78 insertions(+), 2 deletions(-)

diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
index 88c45dc3b5..b42592e1ff 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -56,6 +56,10 @@ DECLARE_INSTANCE_CHECKER(PXBDev, PXB_DEV,
 DECLARE_INSTANCE_CHECKER(PXBDev, PXB_PCIE_DEV,
                          TYPE_PXB_PCIE_DEVICE)
 
+#define TYPE_PXB_CXL_DEVICE "pxb-cxl"
+DECLARE_INSTANCE_CHECKER(PXBDev, PXB_CXL_DEV,
+                         TYPE_PXB_CXL_DEVICE)
+
 struct PXBDev {
     /*< private >*/
     PCIDevice parent_obj;
@@ -67,6 +71,11 @@ struct PXBDev {
 
 static PXBDev *convert_to_pxb(PCIDevice *dev)
 {
+    /* A CXL PXB's parent bus is PCIe, so the normal check won't work */
+    if (object_dynamic_cast(OBJECT(dev), TYPE_PXB_CXL_DEVICE)) {
+        return PXB_CXL_DEV(dev);
+    }
+
     return pci_bus_is_express(pci_get_bus(dev))
         ? PXB_PCIE_DEV(dev) : PXB_DEV(dev);
 }
@@ -111,11 +120,20 @@ static const TypeInfo pxb_pcie_bus_info = {
     .class_init    = pxb_bus_class_init,
 };
 
+static const TypeInfo pxb_cxl_bus_info = {
+    .name          = TYPE_PXB_CXL_BUS,
+    .parent        = TYPE_CXL_BUS,
+    .instance_size = sizeof(PXBBus),
+    .class_init    = pxb_bus_class_init,
+};
+
 static const char *pxb_host_root_bus_path(PCIHostState *host_bridge,
                                           PCIBus *rootbus)
 {
-    PXBBus *bus = pci_bus_is_express(rootbus) ?
-                  PXB_PCIE_BUS(rootbus) : PXB_BUS(rootbus);
+    PXBBus *bus = pci_bus_is_cxl(rootbus) ?
+                      PXB_CXL_BUS(rootbus) :
+                      pci_bus_is_express(rootbus) ? PXB_PCIE_BUS(rootbus) :
+                                                    PXB_BUS(rootbus);
 
     snprintf(bus->bus_path, 8, "0000:%02x", pxb_bus_num(rootbus));
     return bus->bus_path;
@@ -380,13 +398,58 @@ static const TypeInfo pxb_pcie_dev_info = {
     },
 };
 
+static void pxb_cxl_dev_realize(PCIDevice *dev, Error **errp)
+{
+    /* A CXL PXB's parent bus is still PCIe */
+    if (!pci_bus_is_express(pci_get_bus(dev))) {
+        error_setg(errp, "pxb-cxl devices cannot reside on a PCI bus");
+        return;
+    }
+
+    pxb_dev_realize_common(dev, CXL, errp);
+}
+
+static void pxb_cxl_dev_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc   = DEVICE_CLASS(klass);
+    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+
+    k->realize             = pxb_cxl_dev_realize;
+    k->exit                = pxb_dev_exitfn;
+    /*
+     * XXX: These types of bridges don't actually show up in the hierarchy so
+     * vendor, device, class, etc. ids are intentionally left out.
+     */
+
+    dc->desc = "CXL Host Bridge";
+    device_class_set_props(dc, pxb_dev_properties);
+    set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
+
+    /* Host bridges aren't hotpluggable. FIXME: spec reference */
+    dc->hotpluggable = false;
+}
+
+static const TypeInfo pxb_cxl_dev_info = {
+    .name          = TYPE_PXB_CXL_DEVICE,
+    .parent        = TYPE_PCI_DEVICE,
+    .instance_size = sizeof(PXBDev),
+    .class_init    = pxb_cxl_dev_class_init,
+    .interfaces =
+        (InterfaceInfo[]){
+            { INTERFACE_CONVENTIONAL_PCI_DEVICE },
+            {},
+        },
+};
+
 static void pxb_register_types(void)
 {
     type_register_static(&pxb_bus_info);
     type_register_static(&pxb_pcie_bus_info);
+    type_register_static(&pxb_cxl_bus_info);
     type_register_static(&pxb_host_info);
     type_register_static(&pxb_dev_info);
     type_register_static(&pxb_pcie_dev_info);
+    type_register_static(&pxb_cxl_dev_info);
 }
 
 type_init(pxb_register_types)
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index a45ca326ed..adbe8aa260 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -222,6 +222,12 @@ static const TypeInfo pcie_bus_info = {
     .class_init = pcie_bus_class_init,
 };
 
+static const TypeInfo cxl_bus_info = {
+    .name       = TYPE_CXL_BUS,
+    .parent     = TYPE_PCIE_BUS,
+    .class_init = pcie_bus_class_init,
+};
+
 static PCIBus *pci_find_bus_nr(PCIBus *bus, int bus_num);
 static void pci_update_mappings(PCIDevice *d);
 static void pci_irq_handler(void *opaque, int irq_num, int level);
@@ -2825,6 +2831,7 @@ static void pci_register_types(void)
 {
     type_register_static(&pci_bus_info);
     type_register_static(&pcie_bus_info);
+    type_register_static(&cxl_bus_info);
     type_register_static(&conventional_pci_interface_info);
     type_register_static(&cxl_interface_info);
     type_register_static(&pcie_interface_info);
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 528cef341c..bde3697bee 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -406,6 +406,7 @@ typedef PCIINTxRoute (*pci_route_irq_fn)(void *opaque, int pin);
 #define TYPE_PCI_BUS "PCI"
 OBJECT_DECLARE_TYPE(PCIBus, PCIBusClass, PCI_BUS)
 #define TYPE_PCIE_BUS "PCIE"
+#define TYPE_CXL_BUS "CXL"
 
 bool pci_bus_is_express(PCIBus *bus);
 
@@ -754,6 +755,11 @@ static inline void pci_irq_pulse(PCIDevice *pci_dev)
     pci_irq_deassert(pci_dev);
 }
 
+static inline int pci_is_cxl(const PCIDevice *d)
+{
+    return d->cap_present & QEMU_PCIE_CAP_CXL;
+}
+
 static inline int pci_is_express(const PCIDevice *d)
 {
     return d->cap_present & QEMU_PCI_CAP_EXPRESS;
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 117+ messages in thread
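
To make the dispatch in pxb_host_root_bus_path() above easier to follow, here
is a minimal standalone sketch. It is not QEMU code: bus_kind, pxb_flavor() and
pxb_root_bus_path() are hypothetical names used only for illustration. It
assumes the "is express" check is a dynamic cast against the PCIe bus type, so
a bus whose parent type is TYPE_PCIE_BUS (as TYPE_CXL_BUS is in this patch)
would also pass it, which is why the CXL test has to come first.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-ins for the three bus flavors; not the QEMU PCIBus API. */
enum bus_kind { BUS_PCI, BUS_PCIE, BUS_CXL };

/*
 * Choose the most specific flavor first. If a CXL bus also satisfies the
 * "express" test (assumed here), testing express first would misclassify it.
 */
static const char *pxb_flavor(int is_cxl, int is_express)
{
    return is_cxl ? "pxb-cxl" : is_express ? "pxb-pcie" : "pxb";
}

/*
 * Format the root bus path the same way the patch context does:
 * a fixed "0000:" segment followed by the bus number as two hex digits.
 * Seven characters plus the terminating NUL fit the 8-byte buffer.
 */
static int pxb_root_bus_path(char *buf, size_t len, unsigned bus_nr)
{
    return snprintf(buf, len, "0000:%02x", bus_nr & 0xffu);
}
```

With bus_nr=1, as in the example command line, the path comes out as
"0000:01" regardless of whether the expander bridge is PCI, PCIe or CXL.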

* [RFC PATCH v3 13/31] qtest: allow DSDT acpi table changes
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 tests/qtest/bios-tables-test-allowed-diff.h | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523c8b..5c695cdf37 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,22 @@
 /* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/pc/DSDT",
+"tests/data/acpi/pc/DSDT.acpihmat",
+"tests/data/acpi/pc/DSDT.bridge",
+"tests/data/acpi/pc/DSDT.cphp",
+"tests/data/acpi/pc/DSDT.dimmpxm",
+"tests/data/acpi/pc/DSDT.hpbridge",
+"tests/data/acpi/pc/DSDT.hpbrroot",
+"tests/data/acpi/pc/DSDT.ipmikcs",
+"tests/data/acpi/pc/DSDT.memhp",
+"tests/data/acpi/pc/DSDT.numamem",
+"tests/data/acpi/pc/DSDT.roothp",
+"tests/data/acpi/q35/DSDT",
+"tests/data/acpi/q35/DSDT.acpihmat",
+"tests/data/acpi/q35/DSDT.bridge",
+"tests/data/acpi/q35/DSDT.cphp",
+"tests/data/acpi/q35/DSDT.dimmpxm",
+"tests/data/acpi/q35/DSDT.ipmibt",
+"tests/data/acpi/q35/DSDT.memhp",
+"tests/data/acpi/q35/DSDT.mmio64",
+"tests/data/acpi/q35/DSDT.numamem",
+"tests/data/acpi/q35/DSDT.tis",
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 117+ messages in thread
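
For readers unfamiliar with this file: it is a bare list of string literals
that the bios-tables test includes into an array, and a table whose expected
blob appears in the list may differ without failing the test. The list is
filled here, before the ACPI change, and emptied again in 15/31 once the
expected blobs are regenerated. A rough sketch of the lookup, with the list
inlined rather than #included and with hypothetical names (allowed_diff,
diff_is_allowed); the real harness may match entries differently:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Stand-in for the header's contents: comma-terminated string literals.
 * Only two entries are reproduced; the NULL sentinel is added by the
 * including code, not by the header itself.
 */
static const char *const allowed_diff[] = {
    "tests/data/acpi/pc/DSDT",
    "tests/data/acpi/q35/DSDT",
    NULL,
};

/* Return true if a mismatch in this expected-table file should be tolerated. */
static bool diff_is_allowed(const char *aml_file)
{
    for (size_t i = 0; allowed_diff[i] != NULL; i++) {
        if (strcmp(allowed_diff[i], aml_file) == 0) {
            return true;
        }
    }
    return false;
}
```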

* [RFC PATCH v3 14/31] acpi/pci: Consolidate host bridge setup
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

This cleanup will make it easier to add support for CXL to the mix.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/i386/acpi-build.c | 31 +++++++++++++++++--------------
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index f56d699c7f..cf6eb54c22 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1194,6 +1194,20 @@ static void build_smb0(Aml *table, I2CBus *smbus, int devnr, int func)
     aml_append(table, scope);
 }
 
+enum { PCI, PCIE };
+static void init_pci_acpi(Aml *dev, int uid, int type)
+{
+    if (type == PCI) {
+        aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
+        aml_append(dev, aml_name_decl("_UID", aml_int(uid)));
+    } else {
+        aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
+        aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
+        aml_append(dev, aml_name_decl("_UID", aml_int(uid)));
+        aml_append(dev, build_q35_osc_method());
+    }
+}
+
 static void
 build_dsdt(GArray *table_data, BIOSLinker *linker,
            AcpiPmInfo *pm, AcpiMiscInfo *misc,
@@ -1222,9 +1236,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
     if (misc->is_piix4) {
         sb_scope = aml_scope("_SB");
         dev = aml_device("PCI0");
-        aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
+        init_pci_acpi(dev, 0, PCI);
         aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
-        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
         aml_append(sb_scope, dev);
         aml_append(dsdt, sb_scope);
 
@@ -1238,11 +1251,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
     } else {
         sb_scope = aml_scope("_SB");
         dev = aml_device("PCI0");
-        aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
-        aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
+        init_pci_acpi(dev, 0, PCIE);
         aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
-        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
-        aml_append(dev, build_q35_osc_method());
         aml_append(sb_scope, dev);
 
         if (pm->smi_on_cpuhp) {
@@ -1345,15 +1355,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 
             scope = aml_scope("\\_SB");
             dev = aml_device("PC%.02X", bus_num);
-            aml_append(dev, aml_name_decl("_UID", aml_int(bus_num)));
             aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num)));
-            if (pci_bus_is_express(bus)) {
-                aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
-                aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
-                aml_append(dev, build_q35_osc_method());
-            } else {
-                aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
-            }
+            init_pci_acpi(dev, bus_num, pci_bus_is_express(bus) ? PCIE : PCI);
 
             if (numa_node != NUMA_NODE_UNASSIGNED) {
                 aml_append(dev, aml_name_decl("_PXM", aml_int(numa_node)));
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 117+ messages in thread
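
A small standalone model of what init_pci_acpi() declares may help when
reading the DSDT changes in 15/31. This is only a sketch: it writes the names
into a string instead of building AML, and bridge_type and
host_bridge_names() are hypothetical. The point it illustrates is the order:
the helper emits _HID (and _CID for PCIe) followed by _UID, and the caller
adds _ADR afterwards, so _UID now precedes _ADR in PCI0 while the table size
stays the same.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical mirror of the patch's anonymous enum { PCI, PCIE }. */
enum bridge_type { BRIDGE_PCI, BRIDGE_PCIE };

/*
 * List, in order, the names the helper declares for a host bridge.
 * PNP0A03 identifies a PCI bus; PNP0A08 identifies a PCI Express bus and
 * keeps PNP0A03 as its compatible ID. The PCIe branch also gets _OSC.
 */
static int host_bridge_names(char *out, size_t len, int uid,
                             enum bridge_type type)
{
    if (type == BRIDGE_PCI) {
        return snprintf(out, len, "_HID=PNP0A03 _UID=%d", uid);
    }
    return snprintf(out, len, "_HID=PNP0A08 _CID=PNP0A03 _UID=%d _OSC", uid);
}
```

Adding a CXL host bridge later then only needs one more enum value and one
more branch in the helper rather than edits at each of the three call sites.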

* [RFC PATCH v3 15/31] tests/acpi: remove stale allowed tables
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

     Scope (_SB)
     {
         Device (PCI0)
         {
             Name (_HID, EisaId ("PNP0A03") /* PCI Bus */)  // _HID: Hardware ID
-            Name (_ADR, Zero)  // _ADR: Address
             Name (_UID, Zero)  // _UID: Unique ID
+            Name (_ADR, Zero)  // _ADR: Address

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 tests/data/acpi/pc/DSDT                     | Bin 5065 -> 5065 bytes
 tests/data/acpi/pc/DSDT.acpihmat            | Bin 6390 -> 6390 bytes
 tests/data/acpi/pc/DSDT.bridge              | Bin 6924 -> 6924 bytes
 tests/data/acpi/pc/DSDT.cphp                | Bin 5529 -> 5529 bytes
 tests/data/acpi/pc/DSDT.dimmpxm             | Bin 6719 -> 6719 bytes
 tests/data/acpi/pc/DSDT.hpbridge            | Bin 5026 -> 5026 bytes
 tests/data/acpi/pc/DSDT.hpbrroot            | Bin 3084 -> 3084 bytes
 tests/data/acpi/pc/DSDT.ipmikcs             | Bin 5137 -> 5137 bytes
 tests/data/acpi/pc/DSDT.memhp               | Bin 6424 -> 6424 bytes
 tests/data/acpi/pc/DSDT.numamem             | Bin 5071 -> 5071 bytes
 tests/data/acpi/pc/DSDT.roothp              | Bin 5261 -> 5261 bytes
 tests/data/acpi/q35/DSDT                    | Bin 7801 -> 7801 bytes
 tests/data/acpi/q35/DSDT.acpihmat           | Bin 9126 -> 9126 bytes
 tests/data/acpi/q35/DSDT.bridge             | Bin 7819 -> 7819 bytes
 tests/data/acpi/q35/DSDT.cphp               | Bin 8265 -> 8265 bytes
 tests/data/acpi/q35/DSDT.dimmpxm            | Bin 9455 -> 9455 bytes
 tests/data/acpi/q35/DSDT.ipmibt             | Bin 7876 -> 7876 bytes
 tests/data/acpi/q35/DSDT.memhp              | Bin 9160 -> 9160 bytes
 tests/data/acpi/q35/DSDT.mmio64             | Bin 8932 -> 8932 bytes
 tests/data/acpi/q35/DSDT.numamem            | Bin 7807 -> 7807 bytes
 tests/qtest/bios-tables-test-allowed-diff.h |  21 --------------------
 21 files changed, 21 deletions(-)

diff --git a/tests/data/acpi/pc/DSDT b/tests/data/acpi/pc/DSDT
index f6173df1d598767a79aa34ad7585ad7d45c5d4f3..b516745128e3f1a297b6327e9057026a2d16229c 100644
GIT binary patch
delta 20
bcmX@9eo}oxJ7=h;3j;^Iqf5}n36{bDOsEE~

delta 20
bcmX@9eo}oxJEx;d5CcbisHe-u36{bDOlAhI

diff --git a/tests/data/acpi/pc/DSDT.acpihmat b/tests/data/acpi/pc/DSDT.acpihmat
index 67f3f7249eaaa9404ebf0f2d0a324b8c8e3bd445..aeae285c6434ae6cf3c53660e34425727a497871 100644
GIT binary patch
delta 20
bcmexn_|0%aJ7=h;3j;^Iqf5}n3271lRUHRT

delta 20
bcmexn_|0%aJEx;d5CcbisHe-u3271lRNDtm

diff --git a/tests/data/acpi/pc/DSDT.bridge b/tests/data/acpi/pc/DSDT.bridge
index 643390f4c4138b37fc481656d3f555d0eeedcb02..4cd26a87dd11d96e10bf6de786b9d56ebfe0a4f9 100644
GIT binary patch
delta 20
bcmeA%>oJ?q&Kc_I!oU&l=n}MXLX8vvMneXi

delta 20
bcmeA%>oJ?q&gtk9#J~|B>glp^LX8vvMgaz#

diff --git a/tests/data/acpi/pc/DSDT.cphp b/tests/data/acpi/pc/DSDT.cphp
index 1ddcf7d8812f5d8d4d38fe7e7b35fd5885806046..fecb784812cbb2308ef58acf4a2c580f56d35c39 100644
GIT binary patch
delta 20
bcmbQKJyUx^J7=h;3j;^Iqf5}n37nz;MY;wk

delta 20
bcmbQKJyUx^JEx;d5CcbisHe-u37nz;MR*1%

diff --git a/tests/data/acpi/pc/DSDT.dimmpxm b/tests/data/acpi/pc/DSDT.dimmpxm
index c44385cc01879324738ffb7f997b8cdd762cbf97..f2c31e150ead16e4931367a6dab42704950a21e9 100644
GIT binary patch
delta 20
bcmdmQvfpGvJ7=h;3j;^Iqf5}n3F{>RP4WjY

delta 20
bcmdmQvfpGvJEx;d5CcbisHe-u3F{>RO|S<r

diff --git a/tests/data/acpi/pc/DSDT.hpbridge b/tests/data/acpi/pc/DSDT.hpbridge
index 4ecf1eb13bf49499f729b53a6d0114672a76e28d..7a8955cdbc52c025a2fd8f160cf8aff9442c985b 100644
GIT binary patch
delta 20
bcmZ3azDRvSJ7=h;3j;^Iqf5}n2|~gEMvw+M

delta 20
bcmZ3azDRvSJEx;d5CcbisHe-u2|~gEMotDf

diff --git a/tests/data/acpi/pc/DSDT.hpbrroot b/tests/data/acpi/pc/DSDT.hpbrroot
index a3046226ec1dcb234b726029b3790dfedb3b9221..88d23fca4743c2ee57493e7d77d6297a60964d3c 100644
GIT binary patch
delta 20
bcmeB?=#iMv&Kc_I!oU&l=n}MXLJc<nLHq_$

delta 20
bcmeB?=#iMv&gtk9#J~|B>glp^LJc<nLAnM}

diff --git a/tests/data/acpi/pc/DSDT.ipmikcs b/tests/data/acpi/pc/DSDT.ipmikcs
index f1638c5d079a9442c09390426a913010df6efd8d..d670ae793b5778c095a7f8c79ff1a046889d1a56 100644
GIT binary patch
delta 20
bcmbQJF;QbeJ7=h;3j;^Iqf5}n35~)4MGOXr

delta 20
bcmbQJF;QbeJEx;d5CcbisHe-u35~)4M9Kz;

diff --git a/tests/data/acpi/pc/DSDT.memhp b/tests/data/acpi/pc/DSDT.memhp
index 4c19e45e66918c61674785c99e4474e58866f125..a7de3d9fd94e62e8fc357fe3093bf7f394a39219 100644
GIT binary patch
delta 20
bcmbPXG{a~@J7=h;3j;^Iqf5}n2^|suN0A1$

delta 20
bcmbPXG{a~@JEx;d5CcbisHe-u2^|suM^6T}

diff --git a/tests/data/acpi/pc/DSDT.numamem b/tests/data/acpi/pc/DSDT.numamem
index 40cfd933259af05ac2aee07fca32f22122255211..57958b6cec216c1fb8731f4ed2da67f0fad7484a 100644
GIT binary patch
delta 20
bcmX@FeqMb-J7=h;3j;^Iqf5}n3HHJOO_&D2

delta 20
bcmX@FeqMb-JEx;d5CcbisHe-u3HHJOO;!fL

diff --git a/tests/data/acpi/pc/DSDT.roothp b/tests/data/acpi/pc/DSDT.roothp
index 078fc8031b479cc77b6527a2b7b4bd576b6e6028..624d0e367693fe267a4237a5fc97295cee2ebd60 100644
GIT binary patch
delta 20
bcmeCx?A4sm&Kc_I!oU&l=n}MX!e3zkMUV#m

delta 20
bcmeCx?A4sm&gtk9#J~|B>glp^!e3zkMNS6(

diff --git a/tests/data/acpi/q35/DSDT b/tests/data/acpi/q35/DSDT
index d25cd7072932886d6967f4023faac1e1fa6e836c..17e2aebde98e0a3161d93e9b2e200737b13699ac 100644
GIT binary patch
delta 21
dcmexq^V4R+<cTvI**M}IU4j@kOEJdF0sv{z2gd*a

delta 19
bcmexq^V4R+WEMx4Aclz(n>R}_#>)Z#RX+z<

diff --git a/tests/data/acpi/q35/DSDT.acpihmat b/tests/data/acpi/q35/DSDT.acpihmat
index 722e06af83abcde203a2b96a8ec81fd3bab9fc98..7b3d659352a0923822f6a5db1dbd0a6ad853c446 100644
GIT binary patch
delta 21
dcmZ4HzRZ2X<cTvI**M}IU4j@kOELB+0RUdw2WbER

delta 19
bcmZ4HzRZ2XWEMx4Aclz(n>R}__9y`WOK1lA

diff --git a/tests/data/acpi/q35/DSDT.bridge b/tests/data/acpi/q35/DSDT.bridge
index 06bac139d668ddfc7914e258b471a303c9dbd192..5961b55b1067c3090b2f1f4cd3386d71efee241d 100644
GIT binary patch
delta 21
ccmeCS?Y5mTdE(4QHja2lmmr4CQjCSN09fk={{R30

delta 19
acmeCS?Y5mTnZ?m1h+*Qy=FL)!g|Yxf4F-?^

diff --git a/tests/data/acpi/q35/DSDT.cphp b/tests/data/acpi/q35/DSDT.cphp
index 2b933ac482e6883efccbd7d6c96089602f2c0b4d..09c92d52f92bb346ed807945b9638cad958446f8 100644
GIT binary patch
delta 21
dcmX@<aMEGI<cTvI**M}IU4j@kOEK!p0{~)+2SES;

delta 19
bcmX@<aMEGIWEMx4Aclz(n>R}_>dONFPN@dc

diff --git a/tests/data/acpi/q35/DSDT.dimmpxm b/tests/data/acpi/q35/DSDT.dimmpxm
index bd8f8305b028ef20f9b6d1a0c69ac428d027e3d1..1da97afb32dddafefe7f27934acbcb7d56a67489 100644
GIT binary patch
delta 21
dcmaFw`QCHF<cTvI**M}IU4j@kOEF$m1^{az2uT0{

delta 19
bcmaFw`QCHFWEMx4Aclz(n>R}_UR4GFR)YuH

diff --git a/tests/data/acpi/q35/DSDT.ipmibt b/tests/data/acpi/q35/DSDT.ipmibt
index a8f868e23c25688ab1c0371016c071f23e9d732f..c7e68432b66e7b4d03284c882c65bbf3066825dc 100644
GIT binary patch
delta 21
dcmX?Nd&G9a<cTvI**M}IU4j@kOEIpJ1ps122dV%7

delta 19
bcmX?Nd&G9aWEMx4Aclz(n>R}_u95`+PJ;(K

diff --git a/tests/data/acpi/q35/DSDT.memhp b/tests/data/acpi/q35/DSDT.memhp
index 9a802e4c67022386442976d5cb997ea3fc57b58f..3af457dd550461b2d2ea85aa85d7740452913b34 100644
GIT binary patch
delta 21
dcmX@%e!_ji<cTvI**M}IU4j@kOEIof0sv%g2hRWi

delta 19
bcmX@%e!_jiWEMx4Aclz(n>R}_u2TX4P;>`i

diff --git a/tests/data/acpi/q35/DSDT.mmio64 b/tests/data/acpi/q35/DSDT.mmio64
index 948c2dc7264c31932b490ca00691a7c4d9aefdb0..a4d20f676ac173e6846dcd4e076220d512215963 100644
GIT binary patch
delta 21
dcmaFj`owj@<cTvI**M}IU4j@kOEI2O1ORBc2p#|c

delta 19
bcmaFj`owj@WEMx4Aclz(n>R}_o>Bw=R96SD

diff --git a/tests/data/acpi/q35/DSDT.numamem b/tests/data/acpi/q35/DSDT.numamem
index 44ec1b0af400da6d298284aa959aa38add7e6dd5..bbab0d10a2a064528519fa69e90c799430129b75 100644
GIT binary patch
delta 21
dcmexw^WSE|<cTvI**M}IU4j@kOEIR(0sv~w2iX7s

delta 19
bcmexw^WSE|WEMx4Aclz(n>R}_rpf{URwD;$

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index 5c695cdf37..dfb8523c8b 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1,22 +1 @@
 /* List of comma-separated changed AML files to ignore */
-"tests/data/acpi/pc/DSDT",
-"tests/data/acpi/pc/DSDT.acpihmat",
-"tests/data/acpi/pc/DSDT.bridge",
-"tests/data/acpi/pc/DSDT.cphp",
-"tests/data/acpi/pc/DSDT.dimmpxm",
-"tests/data/acpi/pc/DSDT.hpbridge",
-"tests/data/acpi/pc/DSDT.hpbrroot",
-"tests/data/acpi/pc/DSDT.ipmikcs",
-"tests/data/acpi/pc/DSDT.memhp",
-"tests/data/acpi/pc/DSDT.numamem",
-"tests/data/acpi/pc/DSDT.roothp",
-"tests/data/acpi/q35/DSDT",
-"tests/data/acpi/q35/DSDT.acpihmat",
-"tests/data/acpi/q35/DSDT.bridge",
-"tests/data/acpi/q35/DSDT.cphp",
-"tests/data/acpi/q35/DSDT.dimmpxm",
-"tests/data/acpi/q35/DSDT.ipmibt",
-"tests/data/acpi/q35/DSDT.memhp",
-"tests/data/acpi/q35/DSDT.mmio64",
-"tests/data/acpi/q35/DSDT.numamem",
-"tests/data/acpi/q35/DSDT.tis",
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [RFC PATCH v3 16/31] hw/pci: Plumb _UID through host bridges
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

Currently, QEMU makes _UID equivalent to the bus number (_BBN). While
there is nothing wrong with doing it this way, the CXL spec relies
heavily on _UID to identify host bridges, and _UID has no inherent link
to the bus number. Having a distinct UID solves two problems. First, it
gets us around the limit of 256 bus numbers (the current maximum).
Second, it allows us to replicate hardware configurations where bus
number and UID aren't equivalent, which benefits development and
debugging with QEMU.

The other way to do this would be to implement the expanded bus
numbering, but having an explicit UID makes more sense when trying to
replicate real hardware configurations.

The QEMU command line to utilize this would be:
  -device pxb-cxl,id=cxl.0,bus="pcie.0",bus_nr=1,uid=x

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

--

I'm guessing this patch will be somewhat controversial. For early CXL
work, this can be dropped without too much heartache.
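
For illustration only, here is a minimal sketch of the check that the
FIXME in pxb_cxl_dev_realize() leaves open: reject a uid that is negative
or already taken by another host bridge. The helper name and the flat
array of used UIDs are hypothetical and not part of this patch; real code
would presumably walk the existing host bridges instead.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch.
 *
 * Returns true if `uid` could be assigned to a new pxb-cxl device:
 * it must be non-negative (the property default of -1 means "unset")
 * and must not already appear in used[0..n_used).
 */
static bool cxl_uid_is_usable(const int32_t *used, size_t n_used, int32_t uid)
{
    if (uid < 0) {
        return false;
    }

    for (size_t i = 0; i < n_used; i++) {
        if (used[i] == uid) {
            return false; /* collides with an existing host bridge */
        }
    }

    return true;
}
```

With used UIDs {0, 3}, a request for uid=1 would pass, while uid=3 and
the unset default of -1 would both be rejected.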
---
 hw/i386/acpi-build.c                |  3 ++-
 hw/pci-bridge/pci_expander_bridge.c | 19 +++++++++++++++++++
 hw/pci/pci.c                        | 11 +++++++++++
 include/hw/pci/pci.h                |  1 +
 include/hw/pci/pci_bus.h            |  1 +
 5 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index cf6eb54c22..145a503e92 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1343,6 +1343,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
         QLIST_FOREACH(bus, &bus->child, sibling) {
             uint8_t bus_num = pci_bus_num(bus);
             uint8_t numa_node = pci_bus_numa_node(bus);
+            int32_t uid = pci_bus_uid(bus);
 
             /* look only for expander root buses */
             if (!pci_bus_is_root(bus)) {
@@ -1356,7 +1357,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
             scope = aml_scope("\\_SB");
             dev = aml_device("PC%.02X", bus_num);
             aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num)));
-            init_pci_acpi(dev, bus_num, pci_bus_is_express(bus) ? PCIE : PCI);
+            init_pci_acpi(dev, uid, pci_bus_is_express(bus) ? PCIE : PCI);
 
             if (numa_node != NUMA_NODE_UNASSIGNED) {
                 aml_append(dev, aml_name_decl("_PXM", aml_int(numa_node)));
diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
index b42592e1ff..5021b60435 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -67,6 +67,7 @@ struct PXBDev {
 
     uint8_t bus_nr;
     uint16_t numa_node;
+    int32_t uid;
 };
 
 static PXBDev *convert_to_pxb(PCIDevice *dev)
@@ -98,12 +99,20 @@ static uint16_t pxb_bus_numa_node(PCIBus *bus)
     return pxb->numa_node;
 }
 
+static int32_t pxb_bus_uid(PCIBus *bus)
+{
+    PXBDev *pxb = convert_to_pxb(bus->parent_dev);
+
+    return pxb->uid;
+}
+
 static void pxb_bus_class_init(ObjectClass *class, void *data)
 {
     PCIBusClass *pbc = PCI_BUS_CLASS(class);
 
     pbc->bus_num = pxb_bus_num;
     pbc->numa_node = pxb_bus_numa_node;
+    pbc->uid = pxb_bus_uid;
 }
 
 static const TypeInfo pxb_bus_info = {
@@ -329,6 +338,7 @@ static Property pxb_dev_properties[] = {
     /* Note: 0 is not a legal PXB bus number. */
     DEFINE_PROP_UINT8("bus_nr", PXBDev, bus_nr, 0),
     DEFINE_PROP_UINT16("numa_node", PXBDev, numa_node, NUMA_NODE_UNASSIGNED),
+    DEFINE_PROP_INT32("uid", PXBDev, uid, -1),
     DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -400,12 +410,21 @@ static const TypeInfo pxb_pcie_dev_info = {
 
 static void pxb_cxl_dev_realize(PCIDevice *dev, Error **errp)
 {
+    PXBDev *pxb = convert_to_pxb(dev);
+
     /* A CXL PXB's parent bus is still PCIe */
     if (!pci_bus_is_express(pci_get_bus(dev))) {
         error_setg(errp, "pxb-cxl devices cannot reside on a PCI bus");
         return;
     }
 
+    if (pxb->uid < 0) {
+        error_setg(errp, "pxb-cxl devices must have a valid uid (0-2147483647)");
+        return;
+    }
+
+    /* FIXME: Check that uid doesn't collide with UIDs of other host bridges */
+
     pxb_dev_realize_common(dev, CXL, errp);
 }
 
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index adbe8aa260..bf019d91a0 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -170,6 +170,11 @@ static uint16_t pcibus_numa_node(PCIBus *bus)
     return NUMA_NODE_UNASSIGNED;
 }
 
+static int32_t pcibus_uid(PCIBus *bus)
+{
+    return -1;
+}
+
 static void pci_bus_class_init(ObjectClass *klass, void *data)
 {
     BusClass *k = BUS_CLASS(klass);
@@ -184,6 +189,7 @@ static void pci_bus_class_init(ObjectClass *klass, void *data)
 
     pbc->bus_num = pcibus_num;
     pbc->numa_node = pcibus_numa_node;
+    pbc->uid = pcibus_uid;
 }
 
 static const TypeInfo pci_bus_info = {
@@ -530,6 +536,11 @@ int pci_bus_numa_node(PCIBus *bus)
     return PCI_BUS_GET_CLASS(bus)->numa_node(bus);
 }
 
+int pci_bus_uid(PCIBus *bus)
+{
+    return PCI_BUS_GET_CLASS(bus)->uid(bus);
+}
+
 static int get_pci_config_device(QEMUFile *f, void *pv, size_t size,
                                  const VMStateField *field)
 {
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index bde3697bee..a46de48ccd 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -463,6 +463,7 @@ static inline int pci_dev_bus_num(const PCIDevice *dev)
 }
 
 int pci_bus_numa_node(PCIBus *bus);
+int pci_bus_uid(PCIBus *bus);
 void pci_for_each_device(PCIBus *bus, int bus_num,
                          void (*fn)(PCIBus *bus, PCIDevice *d, void *opaque),
                          void *opaque);
diff --git a/include/hw/pci/pci_bus.h b/include/hw/pci/pci_bus.h
index eb94e7e85c..3c9fbc55bb 100644
--- a/include/hw/pci/pci_bus.h
+++ b/include/hw/pci/pci_bus.h
@@ -17,6 +17,7 @@ struct PCIBusClass {
 
     int (*bus_num)(PCIBus *bus);
     uint16_t (*numa_node)(PCIBus *bus);
+    int32_t (*uid)(PCIBus *bus);
 };
 
 enum PCIBusFlags {
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [RFC PATCH v3 17/31] hw/cxl/component: Implement host bridge MMIO (8.2.5, table 142)
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

CXL host bridges may themselves have MMIO registers. Because a host
bridge has no BAR, its MMIO is handled as a special case: a dedicated
sysbus device maps the component registers at a fixed address.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

--

0xD0000000 is chosen arbitrarily here as the base for the host bridge
MMIO. I'm not sure what the right way is to find free space for
platform-hardcoded things like this.
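
As a rough sketch of the address arithmetic in pxb_cxl_realize(), each
host bridge gets one register block at CXL_HOST_BASE plus its UID times
the block size. The helper below is hypothetical; the 64 KiB block size
is an assumption for the example and is not defined by this patch, and
the range checks are additions that the patch itself does not perform.

```c
#include <stdbool.h>
#include <stdint.h>

#define CXL_HOST_BASE      0xD0000000ULL /* value introduced by this patch */
#define EXAMPLE_BLOCK_SIZE 0x10000ULL    /* assumed 64 KiB, illustration only */

/*
 * Hypothetical helper: compute where the component register block of the
 * host bridge with the given UID would be mapped.
 *
 * Returns false for an unset (negative) UID, a zero block size, or a
 * result that would not fit in 64 bits; *base is only written on success.
 */
static bool cxl_host_mmio_base(int32_t uid, uint64_t block_size,
                               uint64_t *base)
{
    if (uid < 0 || block_size == 0) {
        return false;
    }

    /* Guard the multiply-and-add against wrapping around. */
    if ((uint64_t)uid > (UINT64_MAX - CXL_HOST_BASE) / block_size) {
        return false;
    }

    *base = CXL_HOST_BASE + block_size * (uint64_t)uid;
    return true;
}
```

Under these assumptions uid=0 maps at 0xD0000000 and uid=2 at
0xD0020000, so distinct UIDs never share a block.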
---
 hw/pci-bridge/pci_expander_bridge.c | 53 ++++++++++++++++++++++++++++-
 include/hw/cxl/cxl.h                |  2 ++
 2 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
index 5021b60435..226a8a5fff 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -17,6 +17,7 @@
 #include "hw/pci/pci_host.h"
 #include "hw/qdev-properties.h"
 #include "hw/pci/pci_bridge.h"
+#include "hw/cxl/cxl.h"
 #include "qemu/range.h"
 #include "qemu/error-report.h"
 #include "qemu/module.h"
@@ -70,6 +71,12 @@ struct PXBDev {
     int32_t uid;
 };
 
+typedef struct CXLHost {
+    PCIHostState parent_obj;
+
+    CXLComponentState cxl_cstate;
+} CXLHost;
+
 static PXBDev *convert_to_pxb(PCIDevice *dev)
 {
     /* A CXL PXB's parent bus is PCIe, so the normal check won't work */
@@ -85,6 +92,9 @@ static GList *pxb_dev_list;
 
 #define TYPE_PXB_HOST "pxb-host"
 
+#define TYPE_PXB_CXL_HOST "pxb-cxl-host"
+#define PXB_CXL_HOST(obj) OBJECT_CHECK(CXLHost, (obj), TYPE_PXB_CXL_HOST)
+
 static int pxb_bus_num(PCIBus *bus)
 {
     PXBDev *pxb = convert_to_pxb(bus->parent_dev);
@@ -198,6 +208,46 @@ static const TypeInfo pxb_host_info = {
     .class_init    = pxb_host_class_init,
 };
 
+static void pxb_cxl_realize(DeviceState *dev, Error **errp)
+{
+    SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
+    PCIHostState *phb = PCI_HOST_BRIDGE(dev);
+    CXLHost *cxl = PXB_CXL_HOST(dev);
+    CXLComponentState *cxl_cstate = &cxl->cxl_cstate;
+    struct MemoryRegion *mr = &cxl_cstate->crb.component_registers;
+
+    cxl_component_register_block_init(OBJECT(dev), cxl_cstate,
+                                      TYPE_PXB_CXL_HOST);
+    sysbus_init_mmio(sbd, mr);
+
+    /* FIXME: support multiple host bridges. */
+    sysbus_mmio_map(sbd, 0, CXL_HOST_BASE +
+                            memory_region_size(mr) * pci_bus_uid(phb->bus));
+}
+
+static void pxb_cxl_host_class_init(ObjectClass *class, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(class);
+    PCIHostBridgeClass *hc = PCI_HOST_BRIDGE_CLASS(class);
+
+    hc->root_bus_path = pxb_host_root_bus_path;
+    dc->fw_name = "cxl";
+    dc->realize = pxb_cxl_realize;
+    /* Reason: Internal part of the pxb-cxl device, not usable by itself */
+    dc->user_creatable = false;
+}
+
+/*
+ * This is a device to handle the MMIO for a CXL host bridge. It does nothing
+ * else.
+ */
+static const TypeInfo cxl_host_info = {
+    .name          = TYPE_PXB_CXL_HOST,
+    .parent        = TYPE_PCI_HOST_BRIDGE,
+    .instance_size = sizeof(CXLHost),
+    .class_init    = pxb_cxl_host_class_init,
+};
+
 /*
  * Registers the PXB bus as a child of pci host root bus.
  */
@@ -272,7 +322,7 @@ static void pxb_dev_realize_common(PCIDevice *dev, enum BusType type,
         dev_name = dev->qdev.id;
     }
 
-    ds = qdev_new(TYPE_PXB_HOST);
+    ds = qdev_new(type == CXL ? TYPE_PXB_CXL_HOST : TYPE_PXB_HOST);
     if (type == PCIE) {
         bus = pci_root_bus_new(ds, dev_name, NULL, NULL, 0, TYPE_PXB_PCIE_BUS);
     } else if (type == CXL) {
@@ -466,6 +516,7 @@ static void pxb_register_types(void)
     type_register_static(&pxb_pcie_bus_info);
     type_register_static(&pxb_cxl_bus_info);
     type_register_static(&pxb_host_info);
+    type_register_static(&cxl_host_info);
     type_register_static(&pxb_dev_info);
     type_register_static(&pxb_pcie_dev_info);
     type_register_static(&pxb_cxl_dev_info);
diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
index 362cda40de..6bc344f205 100644
--- a/include/hw/cxl/cxl.h
+++ b/include/hw/cxl/cxl.h
@@ -17,5 +17,7 @@
 #define COMPONENT_REG_BAR_IDX 0
 #define DEVICE_REG_BAR_IDX 2
 
+#define CXL_HOST_BASE 0xD0000000
+
 #endif
 
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [RFC PATCH v3 18/31] acpi/pxb/cxl: Reserve host bridge MMIO
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

For each CXL host bridge, reserve its MMIO space with _CRS. The MMIO for
the host bridge lives at a hard-coded location in the system's physical
address space (CXL_HOST_BASE plus 64K per UID). _CRS is the standard
mechanism for telling the OS which regions are claimed by host bridges
and so can't be used for anything else.
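
For reference, a minimal sketch of the address arithmetic that the
build_dsdt() hunk below performs for each CXL host bridge. The struct
and function names are hypothetical and only for illustration;
CXL_HOST_BASE and the 0x10000 stride are taken from this series. Note
that the reservation only matches the mapping done in pxb_cxl_realize()
if the component register block is 64K in size, which is assumed here.

```c
#include <assert.h>
#include <stdint.h>

#define CXL_HOST_BASE      0xD0000000ULL /* from include/hw/cxl/cxl.h */
#define CXL_HOST_MMIO_SIZE 0x10000ULL    /* hypothetical name for the 64K stride */

/* Hypothetical helper type: an inclusive [base, limit] MMIO range. */
struct mmio_range {
    uint64_t base;
    uint64_t limit;
};

/* Compute the range reserved in _CRS for the host bridge with this UID. */
static struct mmio_range cxl_host_mmio_range(int32_t uid)
{
    struct mmio_range r;

    assert(uid >= 0); /* a negative UID would underflow the base */
    r.base = CXL_HOST_BASE + (uint64_t)uid * CXL_HOST_MMIO_SIZE;
    r.limit = r.base + CXL_HOST_MMIO_SIZE - 1;
    return r;
}
```

Under these assumptions, uid=0 reserves 0xD0000000-0xD000FFFF and uid=1
starts at 0xD0010000.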

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/i386/acpi-build.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 145a503e92..ecdc10b148 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -28,6 +28,7 @@
 #include "qemu/bitmap.h"
 #include "qemu/error-report.h"
 #include "hw/pci/pci.h"
+#include "hw/cxl/cxl.h"
 #include "hw/core/cpu.h"
 #include "target/i386/cpu.h"
 #include "hw/misc/pvpanic.h"
@@ -1194,7 +1195,7 @@ static void build_smb0(Aml *table, I2CBus *smbus, int devnr, int func)
     aml_append(table, scope);
 }
 
-enum { PCI, PCIE };
+enum { PCI, PCIE, CXL };
 static void init_pci_acpi(Aml *dev, int uid, int type)
 {
     if (type == PCI) {
@@ -1344,20 +1345,28 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
             uint8_t bus_num = pci_bus_num(bus);
             uint8_t numa_node = pci_bus_numa_node(bus);
             int32_t uid = pci_bus_uid(bus);
+            int type;
 
             /* look only for expander root buses */
             if (!pci_bus_is_root(bus)) {
                 continue;
             }
 
+            type = pci_bus_is_cxl(bus) ? CXL :
+                                         pci_bus_is_express(bus) ? PCIE : PCI;
+
             if (bus_num < root_bus_limit) {
                 root_bus_limit = bus_num - 1;
             }
 
             scope = aml_scope("\\_SB");
-            dev = aml_device("PC%.02X", bus_num);
+            if (type == CXL) {
+                dev = aml_device("CXL%.01X", pci_bus_uid(bus));
+            } else {
+                dev = aml_device("PC%.02X", bus_num);
+            }
             aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num)));
-            init_pci_acpi(dev, uid, pci_bus_is_express(bus) ? PCIE : PCI);
+            init_pci_acpi(dev, uid, type);
 
             if (numa_node != NUMA_NODE_UNASSIGNED) {
                 aml_append(dev, aml_name_decl("_PXM", aml_int(numa_node)));
@@ -1369,6 +1378,13 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
             aml_append(dev, aml_name_decl("_CRS", crs));
             aml_append(scope, dev);
             aml_append(dsdt, scope);
+
+            /* Handle the ranges for the PXB expanders */
+            if (type == CXL) {
+                uint64_t base = CXL_HOST_BASE + uid * 0x10000;
+                crs_range_insert(crs_range_set.mem_ranges, base,
+                                 base + 0x10000 - 1);
+            }
         }
     }
 
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [RFC PATCH v3 19/31] hw/pxb/cxl: Add "windows" for host bridges
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

In a bare metal CXL capable system, system firmware will program
physical address ranges on the host. This is done by programming
internal registers that aren't typically known to the OS. These address
ranges might be contiguous or interleaved across host bridges.

For a QEMU guest a new construct is introduced allowing passing a memory
backend to the host bridge for this same purpose. Each memory backend
needs to be passed to the host bridge as well as any device that will be
emulating that memory (not implemented here).

I'm hopeful the interleaving work in the link can be re-purposed here
(see Link).

An example that creates a host bridge with a 512M window at 0x4c0000000:
 -object memory-backend-file,id=cxl-mem1,share,mem-path=cxl-type3,size=512M
 -device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=52,uid=0,len-window-base=1,window-base\[0\]=0x4c0000000,memdev\[0\]=cxl-mem1
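
The consistency rule enforced in pxb_cxl_dev_realize() (every memory
backend needs a matching window base) can be sketched in isolation as
below. This is only an illustration: the function names are
hypothetical, and plain void pointers stand in for the
HostMemoryBackend links.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CXL_WINDOW_MAX 10 /* from include/hw/cxl/cxl.h in this patch */

/* Count how many window slots have a backend attached. */
static uint32_t count_windows(const void *const backends[CXL_WINDOW_MAX])
{
    uint32_t count = 0;

    assert(backends != NULL);
    for (size_t i = 0; i < CXL_WINDOW_MAX; i++) {
        if (backends[i]) {
            count++;
        }
    }
    return count;
}

/* Return non-zero when the number of bases equals the number of backends. */
static int windows_consistent(const void *const backends[CXL_WINDOW_MAX],
                              uint32_t num_bases)
{
    return count_windows(backends) == num_bases;
}
```

With the command line above there is one backend and one base, so the
check passes; dropping either property would trip it.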

Link: https://lists.nongnu.org/archive/html/qemu-devel/2020-08/msg03680.html
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/pci-bridge/pci_expander_bridge.c | 65 +++++++++++++++++++++++++++--
 include/hw/cxl/cxl.h                |  1 +
 2 files changed, 62 insertions(+), 4 deletions(-)

diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
index 226a8a5fff..af1450c69d 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -69,12 +69,19 @@ struct PXBDev {
     uint8_t bus_nr;
     uint16_t numa_node;
     int32_t uid;
+    struct cxl_dev {
+        HostMemoryBackend *memory_window[CXL_WINDOW_MAX];
+
+        uint32_t num_windows;
+        hwaddr *window_base[CXL_WINDOW_MAX];
+    } cxl;
 };
 
 typedef struct CXLHost {
     PCIHostState parent_obj;
 
     CXLComponentState cxl_cstate;
+    PXBDev *dev;
 } CXLHost;
 
 static PXBDev *convert_to_pxb(PCIDevice *dev)
@@ -213,16 +220,31 @@ static void pxb_cxl_realize(DeviceState *dev, Error **errp)
     SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
     PCIHostState *phb = PCI_HOST_BRIDGE(dev);
     CXLHost *cxl = PXB_CXL_HOST(dev);
+    struct cxl_dev *cxl_dev = &cxl->dev->cxl;
     CXLComponentState *cxl_cstate = &cxl->cxl_cstate;
     struct MemoryRegion *mr = &cxl_cstate->crb.component_registers;
+    int uid = pci_bus_uid(phb->bus);
 
     cxl_component_register_block_init(OBJECT(dev), cxl_cstate,
                                       TYPE_PXB_CXL_HOST);
     sysbus_init_mmio(sbd, mr);
 
-    /* FIXME: support multiple host bridges. */
-    sysbus_mmio_map(sbd, 0, CXL_HOST_BASE +
-                            memory_region_size(mr) * pci_bus_uid(phb->bus));
+    sysbus_mmio_map(sbd, 0, CXL_HOST_BASE + memory_region_size(mr) * uid);
+
+    /*
+     * A CXL host bridge can exist without a fixed memory window, but it would
+     * only operate in legacy PCIe mode.
+     */
+    if (!cxl_dev->memory_window[uid]) {
+        warn_report(
+            "CXL expander bridge created without window. Consider using %s",
+            "memdev[0]=<memory_backend>");
+        return;
+    }
+
+    mr = host_memory_backend_get_memory(cxl_dev->memory_window[uid]);
+    sysbus_init_mmio(sbd, mr);
+    sysbus_mmio_map(sbd, 1 + uid, *cxl_dev->window_base[uid]);
 }
 
 static void pxb_cxl_host_class_init(ObjectClass *class, void *data)
@@ -328,6 +350,7 @@ static void pxb_dev_realize_common(PCIDevice *dev, enum BusType type,
     } else if (type == CXL) {
         bus = pci_root_bus_new(ds, dev_name, NULL, NULL, 0, TYPE_PXB_CXL_BUS);
         bus->flags |= PCI_BUS_CXL;
+        PXB_CXL_HOST(ds)->dev = PXB_CXL_DEV(dev);
     } else {
         bus = pci_root_bus_new(ds, "pxb-internal", NULL, NULL, 0, TYPE_PXB_BUS);
         bds = qdev_new("pci-bridge");
@@ -389,6 +412,8 @@ static Property pxb_dev_properties[] = {
     DEFINE_PROP_UINT8("bus_nr", PXBDev, bus_nr, 0),
     DEFINE_PROP_UINT16("numa_node", PXBDev, numa_node, NUMA_NODE_UNASSIGNED),
     DEFINE_PROP_INT32("uid", PXBDev, uid, -1),
+    DEFINE_PROP_ARRAY("window-base", PXBDev, cxl.num_windows, cxl.window_base,
+                      qdev_prop_uint64, hwaddr),
     DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -460,7 +485,9 @@ static const TypeInfo pxb_pcie_dev_info = {
 
 static void pxb_cxl_dev_realize(PCIDevice *dev, Error **errp)
 {
-    PXBDev *pxb = convert_to_pxb(dev);
+    PXBDev *pxb = PXB_CXL_DEV(dev);
+    struct cxl_dev *cxl = &pxb->cxl;
+    int count = 0;
 
     /* A CXL PXB's parent bus is still PCIe */
     if (!pci_bus_is_express(pci_get_bus(dev))) {
@@ -476,6 +503,23 @@ static void pxb_cxl_dev_realize(PCIDevice *dev, Error **errp)
     /* FIXME: Check that uid doesn't collide with UIDs of other host bridges */
 
     pxb_dev_realize_common(dev, CXL, errp);
+
+    for (unsigned i = 0; i < CXL_WINDOW_MAX; i++) {
+        if (!cxl->memory_window[i]) {
+            continue;
+        }
+
+        count++;
+    }
+
+    if (!count) {
+        warn_report("memory-windows should be set when creating CXL host bridges");
+    }
+
+    if (count != cxl->num_windows) {
+        error_setg(errp, "window bases count (%d) must match window count (%d)",
+                   cxl->num_windows, count);
+    }
 }
 
 static void pxb_cxl_dev_class_init(ObjectClass *klass, void *data)
@@ -496,6 +540,19 @@ static void pxb_cxl_dev_class_init(ObjectClass *klass, void *data)
 
     /* Host bridges aren't hotpluggable. FIXME: spec reference */
     dc->hotpluggable = false;
+
+    /*
+     * Below is the moral equivalent of:
+     *   DEFINE_PROP_ARRAY("memdev", PXBDev, window_count, windows,
+     *                     qdev_prop_memory_device, HostMemoryBackend)
+     */
+    for (unsigned i = 0; i < CXL_WINDOW_MAX; i++) {
+        g_autofree char *name = g_strdup_printf("memdev[%u]", i);
+        object_class_property_add_link(klass, name, TYPE_MEMORY_BACKEND,
+                offsetof(PXBDev, cxl.memory_window[i]),
+                qdev_prop_allow_set_link_before_realize,
+                OBJ_PROP_LINK_STRONG);
+    }
 }
 
 static const TypeInfo pxb_cxl_dev_info = {
diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
index 6bc344f205..b1e5f4a8fa 100644
--- a/include/hw/cxl/cxl.h
+++ b/include/hw/cxl/cxl.h
@@ -18,6 +18,7 @@
 #define DEVICE_REG_BAR_IDX 2
 
 #define CXL_HOST_BASE 0xD0000000
+#define CXL_WINDOW_MAX 10
 
 #endif
 
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [RFC PATCH v3 20/31] hw/cxl/rp: Add a root port
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

This adds just enough of a root port implementation to be able to
enumerate root ports (creating the required DVSEC entries). What's not
here yet is the MMIO or the ability to write some of the DVSEC entries.
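
As a rough illustration of where the DVSEC entries land in config
space, here is the offset arithmetic from cxl_root_port.c worked
through on its own. The values used for PCI_ERR_SIZEOF and
PCI_ACS_SIZEOF are assumptions based on my reading of QEMU's
pcie_regs.h, so please check them against your tree; the helper
function is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; verify against include/hw/pci/pcie_regs.h. */
#define PCI_ERR_SIZEOF 0x48
#define PCI_ACS_SIZEOF 8

/* Taken from cxl_root_port.c in this patch. */
#define GEN_PCIE_ROOT_PORT_AER_OFFSET 0x100
#define GEN_PCIE_ROOT_PORT_ACS_OFFSET \
    (GEN_PCIE_ROOT_PORT_AER_OFFSET + PCI_ERR_SIZEOF)
#define CXL_ROOT_PORT_DVSEC_OFFSET \
    (GEN_PCIE_ROOT_PORT_ACS_OFFSET + PCI_ACS_SIZEOF)

/* Hypothetical helper: return the first DVSEC offset after a sanity check. */
static uint16_t cxl_rp_first_dvsec_offset(void)
{
    uint16_t off = CXL_ROOT_PORT_DVSEC_OFFSET;

    /* Extended capabilities live at 0x100 or above and are dword aligned. */
    assert(off >= 0x100 && (off & 0x3) == 0);
    return off;
}
```

With those assumed sizes, ACS sits at 0x148 and the first DVSEC at
0x150.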

This can be added on the QEMU command line by attaching a root port to a
specific CXL host bridge. For example:
  -device cxl-rp,id=rp0,bus="cxl.0",addr=0.0,chassis=4

Like the host bridge patch, the ACPI tables aren't generated at this
point and so system software cannot use it.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/pci-bridge/Kconfig          |   5 +
 hw/pci-bridge/cxl_root_port.c  | 231 +++++++++++++++++++++++++++++++++
 hw/pci-bridge/meson.build      |   1 +
 hw/pci-bridge/pcie_root_port.c |   6 +-
 hw/pci/pci.c                   |   4 +-
 5 files changed, 245 insertions(+), 2 deletions(-)
 create mode 100644 hw/pci-bridge/cxl_root_port.c

diff --git a/hw/pci-bridge/Kconfig b/hw/pci-bridge/Kconfig
index f8df4315ba..02614f49aa 100644
--- a/hw/pci-bridge/Kconfig
+++ b/hw/pci-bridge/Kconfig
@@ -27,3 +27,8 @@ config DEC_PCI
 
 config SIMBA
     bool
+
+config CXL
+    bool
+    default y if PCI_EXPRESS && PXB
+    depends on PCI_EXPRESS && MSI_NONBROKEN && PXB
diff --git a/hw/pci-bridge/cxl_root_port.c b/hw/pci-bridge/cxl_root_port.c
new file mode 100644
index 0000000000..6c3b215bb3
--- /dev/null
+++ b/hw/pci-bridge/cxl_root_port.c
@@ -0,0 +1,231 @@
+/*
+ * CXL 2.0 Root Port Implementation
+ *
+ * Copyright(C) 2020 Intel Corporation.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "qemu/range.h"
+#include "hw/pci/pci_bridge.h"
+#include "hw/pci/pcie_port.h"
+#include "hw/qdev-properties.h"
+#include "hw/sysbus.h"
+#include "qapi/error.h"
+#include "hw/cxl/cxl.h"
+
+#define CXL_ROOT_PORT_DID 0x7075
+
+/* Copied from the gen root port which we derive */
+#define GEN_PCIE_ROOT_PORT_AER_OFFSET 0x100
+#define GEN_PCIE_ROOT_PORT_ACS_OFFSET \
+    (GEN_PCIE_ROOT_PORT_AER_OFFSET + PCI_ERR_SIZEOF)
+#define CXL_ROOT_PORT_DVSEC_OFFSET \
+    (GEN_PCIE_ROOT_PORT_ACS_OFFSET + PCI_ACS_SIZEOF)
+
+typedef struct CXLRootPort {
+    /*< private >*/
+    PCIESlot parent_obj;
+
+    CXLComponentState cxl_cstate;
+    PCIResReserve res_reserve;
+} CXLRootPort;
+
+#define TYPE_CXL_ROOT_PORT "cxl-rp"
+DECLARE_INSTANCE_CHECKER(CXLRootPort, CXL_ROOT_PORT, TYPE_CXL_ROOT_PORT)
+
+static void latch_registers(CXLRootPort *crp)
+{
+    uint32_t *reg_state = crp->cxl_cstate.crb.cache_mem_registers;
+
+    cxl_component_register_init_common(reg_state, CXL2_ROOT_PORT);
+}
+
+static void build_dvsecs(CXLComponentState *cxl)
+{
+    uint8_t *dvsec;
+
+    dvsec = (uint8_t *)&(struct extensions_dvsec_port){ 0 };
+    cxl_component_create_dvsec(cxl, EXTENSIONS_PORT_DVSEC_LENGTH,
+                               EXTENSIONS_PORT_DVSEC,
+                               EXTENSIONS_PORT_DVSEC_REVID, dvsec);
+
+    dvsec = (uint8_t *)&(struct dvsec_port_gpf){
+        .rsvd        = 0,
+        .phase1_ctrl = 1, /* 1μs timeout */
+        .phase2_ctrl = 1, /* 1μs timeout */
+    };
+    cxl_component_create_dvsec(cxl, GPF_PORT_DVSEC_LENGTH, GPF_PORT_DVSEC,
+                               GPF_PORT_DVSEC_REVID, dvsec);
+
+    dvsec = (uint8_t *)&(struct dvsec_port_flexbus){
+        .cap              = 0x26, /* IO, Mem, non-MLD */
+        .ctrl             = 0,
+        .status           = 0x26, /* same */
+        .rcvd_mod_ts_data = 0xef, /* WTF? */
+    };
+    cxl_component_create_dvsec(cxl, PCIE_FLEXBUS_PORT_DVSEC_LENGTH_2_0,
+                               PCIE_FLEXBUS_PORT_DVSEC,
+                               PCIE_FLEXBUS_PORT_DVSEC_REVID_2_0, dvsec);
+
+    dvsec = (uint8_t *)&(struct dvsec_register_locator){
+        .rsvd         = 0,
+        .reg0_base_lo = RBI_COMPONENT_REG | COMPONENT_REG_BAR_IDX,
+        .reg0_base_hi = 0,
+    };
+    cxl_component_create_dvsec(cxl, REG_LOC_DVSEC_LENGTH, REG_LOC_DVSEC,
+                               REG_LOC_DVSEC_REVID, dvsec);
+}
+
+static void cxl_rp_realize(DeviceState *dev, Error **errp)
+{
+    PCIDevice *pci_dev     = PCI_DEVICE(dev);
+    PCIERootPortClass *rpc = PCIE_ROOT_PORT_GET_CLASS(dev);
+    CXLRootPort *crp       = CXL_ROOT_PORT(dev);
+    CXLComponentState *cxl_cstate = &crp->cxl_cstate;
+    ComponentRegisters *cregs = &cxl_cstate->crb;
+    MemoryRegion *component_bar = &cregs->component_registers;
+    Error *local_err = NULL;
+
+    rpc->parent_realize(dev, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+
+    int rc =
+        pci_bridge_qemu_reserve_cap_init(pci_dev, 0, crp->res_reserve, errp);
+    if (rc < 0) {
+        rpc->parent_class.exit(pci_dev);
+        return;
+    }
+
+    if (!crp->res_reserve.io || crp->res_reserve.io == -1) {
+        pci_word_test_and_clear_mask(pci_dev->wmask + PCI_COMMAND,
+                                     PCI_COMMAND_IO);
+        pci_dev->wmask[PCI_IO_BASE]  = 0;
+        pci_dev->wmask[PCI_IO_LIMIT] = 0;
+    }
+
+    cxl_cstate->dvsec_offset = CXL_ROOT_PORT_DVSEC_OFFSET;
+    cxl_cstate->pdev = pci_dev;
+    build_dvsecs(&crp->cxl_cstate);
+
+    cxl_component_register_block_init(OBJECT(pci_dev), cxl_cstate,
+                                      TYPE_CXL_ROOT_PORT);
+
+    pci_register_bar(pci_dev, COMPONENT_REG_BAR_IDX,
+                     PCI_BASE_ADDRESS_SPACE_MEMORY |
+                         PCI_BASE_ADDRESS_MEM_TYPE_64,
+                     component_bar);
+}
+
+static void cxl_rp_reset(DeviceState *dev)
+{
+    PCIERootPortClass *rpc = PCIE_ROOT_PORT_GET_CLASS(dev);
+    CXLRootPort *crp = CXL_ROOT_PORT(dev);
+
+    rpc->parent_reset(dev);
+
+    latch_registers(crp);
+}
+
+static Property gen_rp_props[] = {
+    DEFINE_PROP_UINT32("bus-reserve", CXLRootPort, res_reserve.bus, -1),
+    DEFINE_PROP_SIZE("io-reserve", CXLRootPort, res_reserve.io, -1),
+    DEFINE_PROP_SIZE("mem-reserve", CXLRootPort, res_reserve.mem_non_pref, -1),
+    DEFINE_PROP_SIZE("pref32-reserve", CXLRootPort, res_reserve.mem_pref_32,
+                     -1),
+    DEFINE_PROP_SIZE("pref64-reserve", CXLRootPort, res_reserve.mem_pref_64,
+                     -1),
+    DEFINE_PROP_END_OF_LIST()
+};
+
+static void cxl_rp_dvsec_write_config(PCIDevice *dev, uint32_t addr,
+                                      uint32_t val, int len)
+{
+    CXLRootPort *crp = CXL_ROOT_PORT(dev);
+
+    if (range_contains(&crp->cxl_cstate.dvsecs[EXTENSIONS_PORT_DVSEC], addr)) {
+        uint8_t *reg = &dev->config[addr];
+        addr -= crp->cxl_cstate.dvsecs[EXTENSIONS_PORT_DVSEC].lob;
+        if (addr == PORT_CONTROL_OVERRIDE_OFFSET) {
+            if (pci_get_word(reg) & PORT_CONTROL_UNMASK_SBR) {
+                /* unmask SBR */
+            }
+            if (pci_get_word(reg) & PORT_CONTROL_ALT_MEMID_EN) {
+                /* Alt Memory & ID Space Enable */
+            }
+        }
+    }
+}
+
+static void cxl_rp_write_config(PCIDevice *d, uint32_t address, uint32_t val,
+                                int len)
+{
+    uint16_t slt_ctl, slt_sta;
+
+    pcie_cap_slot_get(d, &slt_ctl, &slt_sta);
+    pci_bridge_write_config(d, address, val, len);
+    pcie_cap_flr_write_config(d, address, val, len);
+    pcie_cap_slot_write_config(d, slt_ctl, slt_sta, address, val, len);
+    pcie_aer_write_config(d, address, val, len);
+
+    cxl_rp_dvsec_write_config(d, address, val, len);
+}
+
+static void cxl_root_port_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc        = DEVICE_CLASS(oc);
+    PCIDeviceClass *k      = PCI_DEVICE_CLASS(oc);
+    PCIERootPortClass *rpc = PCIE_ROOT_PORT_CLASS(oc);
+
+    k->vendor_id = PCI_VENDOR_ID_INTEL;
+    k->device_id = CXL_ROOT_PORT_DID;
+    dc->desc     = "CXL Root Port";
+    k->revision  = 0;
+    device_class_set_props(dc, gen_rp_props);
+    k->config_write = cxl_rp_write_config;
+
+    device_class_set_parent_realize(dc, cxl_rp_realize, &rpc->parent_realize);
+    device_class_set_parent_reset(dc, cxl_rp_reset, &rpc->parent_reset);
+
+    rpc->aer_offset = GEN_PCIE_ROOT_PORT_AER_OFFSET;
+    rpc->acs_offset = GEN_PCIE_ROOT_PORT_ACS_OFFSET;
+
+    /*
+     * Explain
+     */
+    dc->hotpluggable = false;
+}
+
+static const TypeInfo cxl_root_port_info = {
+    .name = TYPE_CXL_ROOT_PORT,
+    .parent = TYPE_PCIE_ROOT_PORT,
+    .instance_size = sizeof(CXLRootPort),
+    .class_init = cxl_root_port_class_init,
+    .interfaces = (InterfaceInfo[]) {
+        { INTERFACE_CXL_DEVICE },
+        { }
+    },
+};
+
+static void cxl_register(void)
+{
+    type_register_static(&cxl_root_port_info);
+}
+
+type_init(cxl_register);
diff --git a/hw/pci-bridge/meson.build b/hw/pci-bridge/meson.build
index daab8acf2a..b6d26a03d5 100644
--- a/hw/pci-bridge/meson.build
+++ b/hw/pci-bridge/meson.build
@@ -5,6 +5,7 @@ pci_ss.add(when: 'CONFIG_IOH3420', if_true: files('ioh3420.c'))
 pci_ss.add(when: 'CONFIG_PCIE_PORT', if_true: files('pcie_root_port.c', 'gen_pcie_root_port.c', 'pcie_pci_bridge.c'))
 pci_ss.add(when: 'CONFIG_PXB', if_true: files('pci_expander_bridge.c'))
 pci_ss.add(when: 'CONFIG_XIO3130', if_true: files('xio3130_upstream.c', 'xio3130_downstream.c'))
+pci_ss.add(when: 'CONFIG_CXL', if_true: files('cxl_root_port.c'))
 
 # NewWorld PowerMac
 pci_ss.add(when: 'CONFIG_DEC_PCI', if_true: files('dec.c'))
diff --git a/hw/pci-bridge/pcie_root_port.c b/hw/pci-bridge/pcie_root_port.c
index f1cfe9d14a..460e48269d 100644
--- a/hw/pci-bridge/pcie_root_port.c
+++ b/hw/pci-bridge/pcie_root_port.c
@@ -67,7 +67,11 @@ static void rp_realize(PCIDevice *d, Error **errp)
     int rc;
 
     pci_config_set_interrupt_pin(d->config, 1);
-    pci_bridge_initfn(d, TYPE_PCIE_BUS);
+    if (d->cap_present & QEMU_PCIE_CAP_CXL) {
+        pci_bridge_initfn(d, TYPE_CXL_BUS);
+    } else {
+        pci_bridge_initfn(d, TYPE_PCIE_BUS);
+    }
     pcie_port_init_reg(d);
 
     rc = pci_bridge_ssvid_init(d, rpc->ssvid_offset, dc->vendor_id,
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index bf019d91a0..eb325704e7 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2668,7 +2668,9 @@ static void pci_device_class_base_init(ObjectClass *klass, void *data)
             object_class_dynamic_cast(klass, INTERFACE_CONVENTIONAL_PCI_DEVICE);
         ObjectClass *pcie =
             object_class_dynamic_cast(klass, INTERFACE_PCIE_DEVICE);
-        assert(conventional || pcie);
+        ObjectClass *cxl =
+            object_class_dynamic_cast(klass, INTERFACE_CXL_DEVICE);
+        assert(conventional || pcie || cxl);
     }
 }
 
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [RFC PATCH v3 20/31] hw/cxl/rp: Add a root port
@ 2021-02-02  0:59   ` Ben Widawsky
  0 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, David Hildenbrand, Vishal Verma,
	John Groves (jgroves),
	Chris Browy, Markus Armbruster, linux-cxl,
	Philippe Mathieu-Daudé,
	Michael S. Tsirkin, Jonathan Cameron, Igor Mammedov,
	Dan Williams, Ira Weiny

This adds just enough of a root port implementation to be able to
enumerate root ports (creating the required DVSEC entries). Not here
yet are the MMIO and the ability to write some of the DVSEC entries.

A root port is created on the QEMU command line by attaching it to a
specific CXL host bridge. For example:
  -device cxl-rp,id=rp0,bus="cxl.0",addr=0.0,chassis=4

Like the host bridge patch, the ACPI tables aren't generated at this
point and so system software cannot use it.
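
For reference, a minimal sketch of how system software might locate the
DVSEC entries created here, by walking the PCIe extended capability
list. This is not code from this series: find_cxl_dvsec() and the
in-memory config space snapshot are hypothetical, and the constants
(DVSEC capability ID 0x0023, CXL vendor ID 0x1e98) are assumptions
taken from my reading of the PCIe and CXL specifications.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PCIE_CFG_SIZE      4096
#define PCIE_EXT_CAP_START 0x100
#define PCIE_EXT_CAP_DVSEC 0x0023 /* assumed: PCIe DVSEC capability ID */
#define CXL_VENDOR_ID      0x1e98 /* assumed: CXL vendor ID */

/* Read a little-endian dword from a config space snapshot. */
static uint32_t cfg_read32(const uint8_t *cfg, uint16_t off)
{
    return (uint32_t)cfg[off] | ((uint32_t)cfg[off + 1] << 8) |
           ((uint32_t)cfg[off + 2] << 16) | ((uint32_t)cfg[off + 3] << 24);
}

/*
 * Hypothetical helper: return the config space offset of the CXL DVSEC
 * with the given DVSEC ID, or 0 if it is not present.
 */
static uint16_t find_cxl_dvsec(const uint8_t *cfg, uint16_t dvsec_id)
{
    uint16_t off = PCIE_EXT_CAP_START;
    int budget = (PCIE_CFG_SIZE - PCIE_EXT_CAP_START) / 8; /* loop guard */

    assert(cfg != NULL);
    while (off >= PCIE_EXT_CAP_START && off <= PCIE_CFG_SIZE - 12 &&
           budget-- > 0) {
        uint32_t cap  = cfg_read32(cfg, off);     /* ext capability header */
        uint32_t hdr1 = cfg_read32(cfg, off + 4); /* vendor, rev, length */
        uint32_t hdr2 = cfg_read32(cfg, off + 8); /* DVSEC ID */

        if ((cap & 0xffff) == PCIE_EXT_CAP_DVSEC &&
            (hdr1 & 0xffff) == CXL_VENDOR_ID &&
            (hdr2 & 0xffff) == dvsec_id) {
            return off;
        }
        off = cap >> 20; /* next capability offset; 0 ends the list */
    }
    return 0;
}
```

Given a snapshot of the root port's config space, find_cxl_dvsec(cfg, 8)
would be expected to return the offset of the register locator DVSEC,
assuming the DVSEC ID numbering (3, 4, 7, 8 for ports) noted in
cxl_pci.h.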

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/pci-bridge/Kconfig          |   5 +
 hw/pci-bridge/cxl_root_port.c  | 231 +++++++++++++++++++++++++++++++++
 hw/pci-bridge/meson.build      |   1 +
 hw/pci-bridge/pcie_root_port.c |   6 +-
 hw/pci/pci.c                   |   4 +-
 5 files changed, 245 insertions(+), 2 deletions(-)
 create mode 100644 hw/pci-bridge/cxl_root_port.c

diff --git a/hw/pci-bridge/Kconfig b/hw/pci-bridge/Kconfig
index f8df4315ba..02614f49aa 100644
--- a/hw/pci-bridge/Kconfig
+++ b/hw/pci-bridge/Kconfig
@@ -27,3 +27,8 @@ config DEC_PCI
 
 config SIMBA
     bool
+
+config CXL
+    bool
+    default y if PCI_EXPRESS && PXB
+    depends on PCI_EXPRESS && MSI_NONBROKEN && PXB
diff --git a/hw/pci-bridge/cxl_root_port.c b/hw/pci-bridge/cxl_root_port.c
new file mode 100644
index 0000000000..6c3b215bb3
--- /dev/null
+++ b/hw/pci-bridge/cxl_root_port.c
@@ -0,0 +1,231 @@
+/*
+ * CXL 2.0 Root Port Implementation
+ *
+ * Copyright(C) 2020 Intel Corporation.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "qemu/range.h"
+#include "hw/pci/pci_bridge.h"
+#include "hw/pci/pcie_port.h"
+#include "hw/qdev-properties.h"
+#include "hw/sysbus.h"
+#include "qapi/error.h"
+#include "hw/cxl/cxl.h"
+
+#define CXL_ROOT_PORT_DID 0x7075
+
+/* Copied from the gen root port which we derive */
+#define GEN_PCIE_ROOT_PORT_AER_OFFSET 0x100
+#define GEN_PCIE_ROOT_PORT_ACS_OFFSET \
+    (GEN_PCIE_ROOT_PORT_AER_OFFSET + PCI_ERR_SIZEOF)
+#define CXL_ROOT_PORT_DVSEC_OFFSET \
+    (GEN_PCIE_ROOT_PORT_ACS_OFFSET + PCI_ACS_SIZEOF)
+
+typedef struct CXLRootPort {
+    /*< private >*/
+    PCIESlot parent_obj;
+
+    CXLComponentState cxl_cstate;
+    PCIResReserve res_reserve;
+} CXLRootPort;
+
+#define TYPE_CXL_ROOT_PORT "cxl-rp"
+DECLARE_INSTANCE_CHECKER(CXLRootPort, CXL_ROOT_PORT, TYPE_CXL_ROOT_PORT)
+
+static void latch_registers(CXLRootPort *crp)
+{
+    uint32_t *reg_state = crp->cxl_cstate.crb.cache_mem_registers;
+
+    cxl_component_register_init_common(reg_state, CXL2_ROOT_PORT);
+}
+
+static void build_dvsecs(CXLComponentState *cxl)
+{
+    uint8_t *dvsec;
+
+    dvsec = (uint8_t *)&(struct extensions_dvsec_port){ 0 };
+    cxl_component_create_dvsec(cxl, EXTENSIONS_PORT_DVSEC_LENGTH,
+                               EXTENSIONS_PORT_DVSEC,
+                               EXTENSIONS_PORT_DVSEC_REVID, dvsec);
+
+    dvsec = (uint8_t *)&(struct dvsec_port_gpf){
+        .rsvd        = 0,
+        .phase1_ctrl = 1, /* 1μs timeout */
+        .phase2_ctrl = 1, /* 1μs timeout */
+    };
+    cxl_component_create_dvsec(cxl, GPF_PORT_DVSEC_LENGTH, GPF_PORT_DVSEC,
+                               GPF_PORT_DVSEC_REVID, dvsec);
+
+    dvsec = (uint8_t *)&(struct dvsec_port_flexbus){
+        .cap              = 0x26, /* IO, Mem, non-MLD */
+        .ctrl             = 0,
+        .status           = 0x26, /* same */
+        .rcvd_mod_ts_data = 0xef, /* WTF? */
+    };
+    cxl_component_create_dvsec(cxl, PCIE_FLEXBUS_PORT_DVSEC_LENGTH_2_0,
+                               PCIE_FLEXBUS_PORT_DVSEC,
+                               PCIE_FLEXBUS_PORT_DVSEC_REVID_2_0, dvsec);
+
+    dvsec = (uint8_t *)&(struct dvsec_register_locator){
+        .rsvd         = 0,
+        .reg0_base_lo = RBI_COMPONENT_REG | COMPONENT_REG_BAR_IDX,
+        .reg0_base_hi = 0,
+    };
+    cxl_component_create_dvsec(cxl, REG_LOC_DVSEC_LENGTH, REG_LOC_DVSEC,
+                               REG_LOC_DVSEC_REVID, dvsec);
+}
+
+static void cxl_rp_realize(DeviceState *dev, Error **errp)
+{
+    PCIDevice *pci_dev     = PCI_DEVICE(dev);
+    PCIERootPortClass *rpc = PCIE_ROOT_PORT_GET_CLASS(dev);
+    CXLRootPort *crp       = CXL_ROOT_PORT(dev);
+    CXLComponentState *cxl_cstate = &crp->cxl_cstate;
+    ComponentRegisters *cregs = &cxl_cstate->crb;
+    MemoryRegion *component_bar = &cregs->component_registers;
+    Error *local_err = NULL;
+
+    rpc->parent_realize(dev, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+
+    int rc =
+        pci_bridge_qemu_reserve_cap_init(pci_dev, 0, crp->res_reserve, errp);
+    if (rc < 0) {
+        rpc->parent_class.exit(pci_dev);
+        return;
+    }
+
+    if (!crp->res_reserve.io || crp->res_reserve.io == -1) {
+        pci_word_test_and_clear_mask(pci_dev->wmask + PCI_COMMAND,
+                                     PCI_COMMAND_IO);
+        pci_dev->wmask[PCI_IO_BASE]  = 0;
+        pci_dev->wmask[PCI_IO_LIMIT] = 0;
+    }
+
+    cxl_cstate->dvsec_offset = CXL_ROOT_PORT_DVSEC_OFFSET;
+    cxl_cstate->pdev = pci_dev;
+    build_dvsecs(&crp->cxl_cstate);
+
+    cxl_component_register_block_init(OBJECT(pci_dev), cxl_cstate,
+                                      TYPE_CXL_ROOT_PORT);
+
+    pci_register_bar(pci_dev, COMPONENT_REG_BAR_IDX,
+                     PCI_BASE_ADDRESS_SPACE_MEMORY |
+                         PCI_BASE_ADDRESS_MEM_TYPE_64,
+                     component_bar);
+}
+
+static void cxl_rp_reset(DeviceState *dev)
+{
+    PCIERootPortClass *rpc = PCIE_ROOT_PORT_GET_CLASS(dev);
+    CXLRootPort *crp = CXL_ROOT_PORT(dev);
+
+    rpc->parent_reset(dev);
+
+    latch_registers(crp);
+}
+
+static Property gen_rp_props[] = {
+    DEFINE_PROP_UINT32("bus-reserve", CXLRootPort, res_reserve.bus, -1),
+    DEFINE_PROP_SIZE("io-reserve", CXLRootPort, res_reserve.io, -1),
+    DEFINE_PROP_SIZE("mem-reserve", CXLRootPort, res_reserve.mem_non_pref, -1),
+    DEFINE_PROP_SIZE("pref32-reserve", CXLRootPort, res_reserve.mem_pref_32,
+                     -1),
+    DEFINE_PROP_SIZE("pref64-reserve", CXLRootPort, res_reserve.mem_pref_64,
+                     -1),
+    DEFINE_PROP_END_OF_LIST()
+};
+
+static void cxl_rp_dvsec_write_config(PCIDevice *dev, uint32_t addr,
+                                      uint32_t val, int len)
+{
+    CXLRootPort *crp = CXL_ROOT_PORT(dev);
+
+    if (range_contains(&crp->cxl_cstate.dvsecs[EXTENSIONS_PORT_DVSEC], addr)) {
+        uint8_t *reg = &dev->config[addr];
+        addr -= crp->cxl_cstate.dvsecs[EXTENSIONS_PORT_DVSEC].lob;
+        if (addr == PORT_CONTROL_OVERRIDE_OFFSET) {
+            if (pci_get_word(reg) & PORT_CONTROL_UNMASK_SBR) {
+                /* unmask SBR */
+            }
+            if (pci_get_word(reg) & PORT_CONTROL_ALT_MEMID_EN) {
+                /* Alt Memory & ID Space Enable */
+            }
+        }
+    }
+}
+
+static void cxl_rp_write_config(PCIDevice *d, uint32_t address, uint32_t val,
+                                int len)
+{
+    uint16_t slt_ctl, slt_sta;
+
+    pcie_cap_slot_get(d, &slt_ctl, &slt_sta);
+    pci_bridge_write_config(d, address, val, len);
+    pcie_cap_flr_write_config(d, address, val, len);
+    pcie_cap_slot_write_config(d, slt_ctl, slt_sta, address, val, len);
+    pcie_aer_write_config(d, address, val, len);
+
+    cxl_rp_dvsec_write_config(d, address, val, len);
+}
+
+static void cxl_root_port_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc        = DEVICE_CLASS(oc);
+    PCIDeviceClass *k      = PCI_DEVICE_CLASS(oc);
+    PCIERootPortClass *rpc = PCIE_ROOT_PORT_CLASS(oc);
+
+    k->vendor_id = PCI_VENDOR_ID_INTEL;
+    k->device_id = CXL_ROOT_PORT_DID;
+    dc->desc     = "CXL Root Port";
+    k->revision  = 0;
+    device_class_set_props(dc, gen_rp_props);
+    k->config_write = cxl_rp_write_config;
+
+    device_class_set_parent_realize(dc, cxl_rp_realize, &rpc->parent_realize);
+    device_class_set_parent_reset(dc, cxl_rp_reset, &rpc->parent_reset);
+
+    rpc->aer_offset = GEN_PCIE_ROOT_PORT_AER_OFFSET;
+    rpc->acs_offset = GEN_PCIE_ROOT_PORT_ACS_OFFSET;
+
+    /*
+     * Explain
+     */
+    dc->hotpluggable = false;
+}
+
+static const TypeInfo cxl_root_port_info = {
+    .name = TYPE_CXL_ROOT_PORT,
+    .parent = TYPE_PCIE_ROOT_PORT,
+    .instance_size = sizeof(CXLRootPort),
+    .class_init = cxl_root_port_class_init,
+    .interfaces = (InterfaceInfo[]) {
+        { INTERFACE_CXL_DEVICE },
+        { }
+    },
+};
+
+static void cxl_register(void)
+{
+    type_register_static(&cxl_root_port_info);
+}
+
+type_init(cxl_register);
diff --git a/hw/pci-bridge/meson.build b/hw/pci-bridge/meson.build
index daab8acf2a..b6d26a03d5 100644
--- a/hw/pci-bridge/meson.build
+++ b/hw/pci-bridge/meson.build
@@ -5,6 +5,7 @@ pci_ss.add(when: 'CONFIG_IOH3420', if_true: files('ioh3420.c'))
 pci_ss.add(when: 'CONFIG_PCIE_PORT', if_true: files('pcie_root_port.c', 'gen_pcie_root_port.c', 'pcie_pci_bridge.c'))
 pci_ss.add(when: 'CONFIG_PXB', if_true: files('pci_expander_bridge.c'))
 pci_ss.add(when: 'CONFIG_XIO3130', if_true: files('xio3130_upstream.c', 'xio3130_downstream.c'))
+pci_ss.add(when: 'CONFIG_CXL', if_true: files('cxl_root_port.c'))
 
 # NewWorld PowerMac
 pci_ss.add(when: 'CONFIG_DEC_PCI', if_true: files('dec.c'))
diff --git a/hw/pci-bridge/pcie_root_port.c b/hw/pci-bridge/pcie_root_port.c
index f1cfe9d14a..460e48269d 100644
--- a/hw/pci-bridge/pcie_root_port.c
+++ b/hw/pci-bridge/pcie_root_port.c
@@ -67,7 +67,11 @@ static void rp_realize(PCIDevice *d, Error **errp)
     int rc;
 
     pci_config_set_interrupt_pin(d->config, 1);
-    pci_bridge_initfn(d, TYPE_PCIE_BUS);
+    if (d->cap_present & QEMU_PCIE_CAP_CXL) {
+        pci_bridge_initfn(d, TYPE_CXL_BUS);
+    } else {
+        pci_bridge_initfn(d, TYPE_PCIE_BUS);
+    }
     pcie_port_init_reg(d);
 
     rc = pci_bridge_ssvid_init(d, rpc->ssvid_offset, dc->vendor_id,
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index bf019d91a0..eb325704e7 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2668,7 +2668,9 @@ static void pci_device_class_base_init(ObjectClass *klass, void *data)
             object_class_dynamic_cast(klass, INTERFACE_CONVENTIONAL_PCI_DEVICE);
         ObjectClass *pcie =
             object_class_dynamic_cast(klass, INTERFACE_PCIE_DEVICE);
-        assert(conventional || pcie);
+        ObjectClass *cxl =
+            object_class_dynamic_cast(klass, INTERFACE_CXL_DEVICE);
+        assert(conventional || pcie || cxl);
     }
 }
 
-- 
2.30.0




* [RFC PATCH v3 21/31] hw/cxl/device: Add a memory device (8.2.8.5)
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

A CXL memory device (AKA Type 3) is a CXL component that contains some
combination of volatile and persistent memory. It also implements the
previously defined mailbox interface as well as the memory device
firmware interface.

Although the memory device is configured like a normal PCIe device, the
memory traffic is on an entirely separate bus conceptually (using the
same physical wires as PCIe, but a different protocol).

The guest physical address for the memory device is part of a larger
window which is owned by the platform. Currently, this is hardcoded as
an object property on host bridge (PXB) creation, but that will need to
change for interleaving.

The following example will create a 256M device in a 512M window:
-object "memory-backend-file,id=cxl-mem1,share,mem-path=cxl-type3,size=512M"
-device "cxl-type3,bus=rp0,memdev=cxl-mem1,id=cxl-pmem0,size=256M"
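
The placement rule implied by this example (the device is carved out of
the backing window at the first free offset, and creation fails when it
does not fit) could be sketched as below. This is a simplified model,
not the QEMU MemoryRegion API: struct cxl_window and window_alloc() are
hypothetical names, and the bump allocation stands in for the
memory_region_find() scan done in cxl_setup_memory().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)

/* Hypothetical model of the platform-owned window backing CXL devices. */
struct cxl_window {
    uint64_t size;      /* total window size in bytes */
    uint64_t next_free; /* offset of the first unallocated byte */
};

/*
 * Reserve dev_size bytes at the first free offset. On success, store the
 * offset in *offset and return true; return false if the device does not
 * fit in the remaining space.
 */
static bool window_alloc(struct cxl_window *w, uint64_t dev_size,
                         uint64_t *offset)
{
    assert(w != NULL && offset != NULL);
    assert(w->next_free <= w->size);

    if (dev_size == 0 || w->size - w->next_free < dev_size) {
        return false;
    }
    *offset = w->next_free;
    w->next_free += dev_size;
    return true;
}
```

In this model, a 512M window accepts two 256M devices, at offsets 0 and
256M, and rejects a third.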

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/core/numa.c             |   3 +
 hw/cxl/cxl-mailbox-utils.c |  41 ++++++
 hw/i386/pc.c               |   1 +
 hw/mem/Kconfig             |   5 +
 hw/mem/cxl_type3.c         | 281 +++++++++++++++++++++++++++++++++++++
 hw/mem/meson.build         |   1 +
 hw/pci/pcie.c              |  30 ++++
 include/hw/cxl/cxl.h       |   2 +
 include/hw/cxl/cxl_pci.h   |  22 +++
 include/hw/pci/pci_ids.h   |   1 +
 monitor/hmp-cmds.c         |  15 ++
 qapi/machine.json          |   1 +
 12 files changed, 403 insertions(+)
 create mode 100644 hw/mem/cxl_type3.c

diff --git a/hw/core/numa.c b/hw/core/numa.c
index 68cee65f61..cd7df371e6 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -770,6 +770,9 @@ static void numa_stat_memory_devices(NumaNodeMem node_mem[])
                 node_mem[pcdimm_info->node].node_plugged_mem +=
                     pcdimm_info->size;
                 break;
+            case MEMORY_DEVICE_INFO_KIND_CXL:
+                /* FINISHME */
+                break;
             case MEMORY_DEVICE_INFO_KIND_VIRTIO_PMEM:
                 vpi = value->u.virtio_pmem.data;
                 /* TODO: once we support numa, assign to right node */
diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index 3f0ae8b9e5..f92dfad882 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -49,6 +49,8 @@ enum {
     LOGS        = 0x04,
         #define GET_SUPPORTED 0x0
         #define GET_LOG       0x1
+    IDENTIFY    = 0x40,
+        #define MEMORY_DEVICE 0x0
 };
 
 /* 8.2.8.4.5.1 Command Return Codes */
@@ -127,6 +129,7 @@ declare_mailbox_handler(TIMESTAMP_GET);
 declare_mailbox_handler(TIMESTAMP_SET);
 declare_mailbox_handler(LOGS_GET_SUPPORTED);
 declare_mailbox_handler(LOGS_GET_LOG);
+declare_mailbox_handler(IDENTIFY_MEMORY_DEVICE);
 
 #define IMMEDIATE_CONFIG_CHANGE (1 << 1)
 #define IMMEDIATE_POLICY_CHANGE (1 << 3)
@@ -144,6 +147,7 @@ static struct cxl_cmd cxl_cmd_set[256][256] = {
     CXL_CMD(TIMESTAMP, SET, 8, IMMEDIATE_POLICY_CHANGE),
     CXL_CMD(LOGS, GET_SUPPORTED, 0, 0),
     CXL_CMD(LOGS, GET_LOG, 0x18, 0),
+    CXL_CMD(IDENTIFY, MEMORY_DEVICE, 0, 0),
 };
 
 #undef CXL_CMD
@@ -255,6 +259,43 @@ define_mailbox_handler(LOGS_GET_LOG)
     return CXL_MBOX_SUCCESS;
 }
 
+/* 8.2.9.5.1.1 */
+define_mailbox_handler(IDENTIFY_MEMORY_DEVICE)
+{
+    struct {
+        char fw_revision[0x10];
+        uint64_t total_capacity;
+        uint64_t volatile_capacity;
+        uint64_t persistent_capacity;
+        uint64_t partition_align;
+        uint16_t info_event_log_size;
+        uint16_t warning_event_log_size;
+        uint16_t failure_event_log_size;
+        uint16_t fatal_event_log_size;
+        uint32_t lsa_size;
+        uint8_t poison_list_max_mer[3];
+        uint16_t inject_poison_limit;
+        uint8_t poison_caps;
+        uint8_t qos_telemetry_caps;
+    } __attribute__((packed)) *id;
+    _Static_assert(sizeof(*id) == 0x43, "Bad identify size");
+
+    if (memory_region_size(cxl_dstate->pmem) < (256 << 20)) {
+        return CXL_MBOX_INTERNAL_ERROR;
+    }
+
+    id = (void *)cmd->payload;
+    memset(id, 0, sizeof(*id));
+
+    /* PMEM only */
+    snprintf(id->fw_revision, 0x10, "BWFW VERSION %02d", 0);
+    id->total_capacity = memory_region_size(cxl_dstate->pmem);
+    id->persistent_capacity = memory_region_size(cxl_dstate->pmem);
+
+    *len = sizeof(*id);
+    return CXL_MBOX_SUCCESS;
+}
+
 void cxl_process_mailbox(CXLDeviceState *cxl_dstate)
 {
     uint16_t ret = CXL_MBOX_SUCCESS;
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 5458f61d10..5d41809b37 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -79,6 +79,7 @@
 #include "acpi-build.h"
 #include "hw/mem/pc-dimm.h"
 #include "hw/mem/nvdimm.h"
+#include "hw/cxl/cxl.h"
 #include "qapi/error.h"
 #include "qapi/qapi-visit-common.h"
 #include "qapi/visitor.h"
diff --git a/hw/mem/Kconfig b/hw/mem/Kconfig
index a0ef2cf648..7d9d1ced3e 100644
--- a/hw/mem/Kconfig
+++ b/hw/mem/Kconfig
@@ -10,3 +10,8 @@ config NVDIMM
     default y
     depends on (PC || PSERIES || ARM_VIRT)
     select MEM_DEVICE
+
+config CXL_MEM_DEVICE
+    bool
+    default y if CXL
+    select MEM_DEVICE
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
new file mode 100644
index 0000000000..4e9a016448
--- /dev/null
+++ b/hw/mem/cxl_type3.c
@@ -0,0 +1,281 @@
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "qemu/error-report.h"
+#include "hw/mem/memory-device.h"
+#include "hw/mem/pc-dimm.h"
+#include "hw/pci/pci.h"
+#include "hw/qdev-properties.h"
+#include "qapi/error.h"
+#include "qemu/log.h"
+#include "qemu/module.h"
+#include "qemu/range.h"
+#include "qemu/rcu.h"
+#include "sysemu/hostmem.h"
+#include "hw/cxl/cxl.h"
+
+typedef struct cxl_type3_dev {
+    /* Private */
+    PCIDevice parent_obj;
+
+    /* Properties */
+    uint64_t size;
+    HostMemoryBackend *hostmem;
+
+    /* State */
+    CXLComponentState cxl_cstate;
+    CXLDeviceState cxl_dstate;
+} CXLType3Dev;
+
+#define CT3(obj) OBJECT_CHECK(CXLType3Dev, (obj), TYPE_CXL_TYPE3_DEV)
+
+static void build_dvsecs(CXLType3Dev *ct3d)
+{
+    CXLComponentState *cxl_cstate = &ct3d->cxl_cstate;
+    uint8_t *dvsec;
+
+    dvsec = (uint8_t *)&(struct dvsec_device){
+        .cap = 0x1e,
+        .ctrl = 0x6,
+        .status2 = 0x2,
+        .range1_size_hi = 0,
+        .range1_size_lo = (2 << 5) | (2 << 2) | 0x3 | ct3d->size,
+        .range1_base_hi = 0,
+        .range1_base_lo = 0,
+    };
+    cxl_component_create_dvsec(cxl_cstate, PCIE_CXL_DEVICE_DVSEC_LENGTH,
+                               PCIE_CXL_DEVICE_DVSEC,
+                               PCIE_CXL2_DEVICE_DVSEC_REVID, dvsec);
+
+    dvsec = (uint8_t *)&(struct dvsec_register_locator){
+        .rsvd         = 0,
+        .reg0_base_lo = RBI_COMPONENT_REG | COMPONENT_REG_BAR_IDX,
+        .reg0_base_hi = 0,
+        .reg1_base_lo = RBI_CXL_DEVICE_REG | DEVICE_REG_BAR_IDX,
+        .reg1_base_hi = 0,
+    };
+    cxl_component_create_dvsec(cxl_cstate, REG_LOC_DVSEC_LENGTH, REG_LOC_DVSEC,
+                               REG_LOC_DVSEC_REVID, dvsec);
+}
+
+static void ct3_instance_init(Object *obj)
+{
+    /* MemoryDeviceClass *mdc = MEMORY_DEVICE_GET_CLASS(obj); */
+}
+
+static void ct3_finalize(Object *obj)
+{
+    CXLType3Dev *ct3d = CT3(obj);
+
+    g_free(ct3d->cxl_dstate.pmem);
+}
+
+#ifdef SET_PMEM_PADDR
+static void cxl_set_addr(CXLType3Dev *ct3d, hwaddr addr, Error **errp)
+{
+    MemoryDeviceClass *mdc = MEMORY_DEVICE_GET_CLASS(ct3d);
+    mdc->set_addr(MEMORY_DEVICE(ct3d), addr, errp);
+}
+#endif
+
+static void cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
+{
+    MemoryRegionSection mrs;
+    MemoryRegion *pmem;
+    MemoryRegion *mr;
+    uint64_t offset = 0;
+    size_t remaining_size;
+
+    if (!ct3d->hostmem) {
+        error_setg(errp, "memdev property must be set");
+        return;
+    }
+
+    /* FIXME: need to check mr is the host bridge's MR */
+    mr = host_memory_backend_get_memory(ct3d->hostmem);
+
+    /* Create our new subregion */
+    pmem = g_new(MemoryRegion, 1);
+
+    /* Find the first free space in the window */
+    WITH_RCU_READ_LOCK_GUARD()
+    {
+        mrs = memory_region_find(mr, offset, 1);
+        while (mrs.mr && mrs.mr != mr) {
+            offset += memory_region_size(mrs.mr);
+            mrs = memory_region_find(mr, offset, 1);
+        }
+    }
+
+    remaining_size = memory_region_size(mr) - offset;
+    if (remaining_size < ct3d->size) {
+        g_free(pmem);
+        error_setg(errp, "Not enough free space (%zu) for device (%" PRIu64 ")",
+                   remaining_size, ct3d->size);
+        return;
+    }
+
+    memory_region_init_alias(pmem, OBJECT(ct3d), "cxl_type3-memory", mr, 0,
+                             ct3d->size);
+    memory_region_set_nonvolatile(pmem, true);
+    memory_region_set_enabled(pmem, false);
+    ct3d->cxl_dstate.pmem = pmem;
+
+#ifdef SET_PMEM_PADDR
+    /* This path will initialize the memory device as if BIOS had done it */
+    cxl_set_addr(ct3d, mr->addr + offset, errp);
+    memory_region_set_enabled(pmem, true);
+#endif
+}
+
+static MemoryRegion *cxl_md_get_memory_region(MemoryDeviceState *md,
+                                              Error **errp)
+{
+    CXLType3Dev *ct3d = CT3(md);
+
+    if (!ct3d->cxl_dstate.pmem) {
+        cxl_setup_memory(ct3d, errp);
+    }
+
+    return ct3d->cxl_dstate.pmem;
+}
+
+static void ct3_realize(PCIDevice *pci_dev, Error **errp)
+{
+    CXLType3Dev *ct3d = CT3(pci_dev);
+    CXLComponentState *cxl_cstate = &ct3d->cxl_cstate;
+    ComponentRegisters *regs = &cxl_cstate->crb;
+    MemoryRegion *mr = &regs->component_registers;
+    uint8_t *pci_conf = pci_dev->config;
+
+    if (!ct3d->cxl_dstate.pmem) {
+        cxl_setup_memory(ct3d, errp);
+    }
+
+    pci_config_set_prog_interface(pci_conf, 0x10);
+    pci_config_set_class(pci_conf, PCI_CLASS_MEMORY_CXL);
+
+    pcie_endpoint_cap_init(pci_dev, 0x80);
+    cxl_cstate->dvsec_offset = 0x100;
+
+    ct3d->cxl_cstate.pdev = pci_dev;
+    build_dvsecs(ct3d);
+
+    cxl_component_register_block_init(OBJECT(pci_dev), cxl_cstate,
+                                      TYPE_CXL_TYPE3_DEV);
+
+    pci_register_bar(
+        pci_dev, COMPONENT_REG_BAR_IDX,
+        PCI_BASE_ADDRESS_SPACE_MEMORY | PCI_BASE_ADDRESS_MEM_TYPE_64, mr);
+
+    cxl_device_register_block_init(OBJECT(pci_dev), &ct3d->cxl_dstate);
+    pci_register_bar(pci_dev, DEVICE_REG_BAR_IDX,
+                     PCI_BASE_ADDRESS_SPACE_MEMORY |
+                         PCI_BASE_ADDRESS_MEM_TYPE_64,
+                     &ct3d->cxl_dstate.device_registers);
+}
+
+static uint64_t cxl_md_get_addr(const MemoryDeviceState *md)
+{
+    CXLType3Dev *ct3d = CT3(md);
+    MemoryRegion *pmem = ct3d->cxl_dstate.pmem;
+
+    assert(pmem->alias);
+    return pmem->alias_offset;
+}
+
+static void cxl_md_set_addr(MemoryDeviceState *md, uint64_t addr, Error **errp)
+{
+    CXLType3Dev *ct3d = CT3(md);
+    MemoryRegion *pmem = ct3d->cxl_dstate.pmem;
+
+    assert(pmem->alias);
+    memory_region_set_alias_offset(pmem, addr);
+    memory_region_set_address(pmem, addr);
+}
+
+static void ct3d_reset(DeviceState *dev)
+{
+    CXLType3Dev *ct3d = CT3(dev);
+    uint32_t *reg_state = ct3d->cxl_cstate.crb.cache_mem_registers;
+
+    cxl_component_register_init_common(reg_state, CXL2_TYPE3_DEVICE);
+    cxl_device_register_init_common(&ct3d->cxl_dstate);
+}
+
+static Property ct3_props[] = {
+    DEFINE_PROP_SIZE("size", CXLType3Dev, size, -1),
+    DEFINE_PROP_LINK("memdev", CXLType3Dev, hostmem, TYPE_MEMORY_BACKEND,
+                     HostMemoryBackend *),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void pc_dimm_md_fill_device_info(const MemoryDeviceState *md,
+                                        MemoryDeviceInfo *info)
+{
+    PCDIMMDeviceInfo *di = g_new0(PCDIMMDeviceInfo, 1);
+    const DeviceClass *dc = DEVICE_GET_CLASS(md);
+    const DeviceState *dev = DEVICE(md);
+    CXLType3Dev *ct3d = CT3(md);
+
+    if (dev->id) {
+        di->has_id = true;
+        di->id = g_strdup(dev->id);
+    }
+
+    di->hotplugged = dev->hotplugged;
+    di->hotpluggable = dc->hotpluggable;
+    di->addr = cxl_md_get_addr(md);
+    di->slot = 0;
+    di->node = 0;
+    di->size = memory_device_get_region_size(md, NULL);
+    di->memdev = object_get_canonical_path(OBJECT(ct3d->hostmem));
+
+    info->u.cxl.data = di;
+    info->type = MEMORY_DEVICE_INFO_KIND_CXL;
+}
+
+static void ct3_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+    PCIDeviceClass *pc = PCI_DEVICE_CLASS(oc);
+    MemoryDeviceClass *mdc = MEMORY_DEVICE_CLASS(oc);
+
+    pc->realize = ct3_realize;
+    pc->class_id = PCI_CLASS_STORAGE_EXPRESS;
+    pc->vendor_id = PCI_VENDOR_ID_INTEL;
+    pc->device_id = 0xd93; /* LVF for now */
+    pc->revision = 1;
+
+    set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
+    dc->desc = "CXL PMEM Device (Type 3)";
+    dc->reset = ct3d_reset;
+    device_class_set_props(dc, ct3_props);
+
+    mdc->get_memory_region = cxl_md_get_memory_region;
+    mdc->get_addr = cxl_md_get_addr;
+    mdc->fill_device_info = pc_dimm_md_fill_device_info;
+    mdc->get_plugged_size = memory_device_get_region_size;
+    mdc->set_addr = cxl_md_set_addr;
+}
+
+static const TypeInfo ct3d_info = {
+    .name = TYPE_CXL_TYPE3_DEV,
+    .parent = TYPE_PCI_DEVICE,
+    .class_init = ct3_class_init,
+    .instance_size = sizeof(CXLType3Dev),
+    .instance_init = ct3_instance_init,
+    .instance_finalize = ct3_finalize,
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_MEMORY_DEVICE },
+        { INTERFACE_CXL_DEVICE },
+        { INTERFACE_PCIE_DEVICE },
+        {}
+    },
+};
+
+static void ct3d_registers(void)
+{
+    type_register_static(&ct3d_info);
+}
+
+type_init(ct3d_registers);
diff --git a/hw/mem/meson.build b/hw/mem/meson.build
index 0d22f2b572..d13c3ed117 100644
--- a/hw/mem/meson.build
+++ b/hw/mem/meson.build
@@ -3,5 +3,6 @@ mem_ss.add(files('memory-device.c'))
 mem_ss.add(when: 'CONFIG_DIMM', if_true: files('pc-dimm.c'))
 mem_ss.add(when: 'CONFIG_NPCM7XX', if_true: files('npcm7xx_mc.c'))
 mem_ss.add(when: 'CONFIG_NVDIMM', if_true: files('nvdimm.c'))
+mem_ss.add(when: 'CONFIG_CXL_MEM_DEVICE', if_true: files('cxl_type3.c'))
 
 softmmu_ss.add_all(when: 'CONFIG_MEM_DEVICE', if_true: mem_ss)
diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
index d4010cf8f3..1ecf6f6a55 100644
--- a/hw/pci/pcie.c
+++ b/hw/pci/pcie.c
@@ -20,6 +20,7 @@
 
 #include "qemu/osdep.h"
 #include "qapi/error.h"
+#include "hw/mem/memory-device.h"
 #include "hw/pci/pci_bridge.h"
 #include "hw/pci/pcie.h"
 #include "hw/pci/msix.h"
@@ -27,6 +28,8 @@
 #include "hw/pci/pci_bus.h"
 #include "hw/pci/pcie_regs.h"
 #include "hw/pci/pcie_port.h"
+#include "hw/cxl/cxl.h"
+#include "hw/boards.h"
 #include "qemu/range.h"
 
 //#define DEBUG_PCIE
@@ -419,6 +422,28 @@ void pcie_cap_slot_pre_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev,
     }
 
     pcie_cap_slot_plug_common(PCI_DEVICE(hotplug_dev), dev, errp);
+
+#ifdef CXL_MEM_DEVICE
+    /*
+     * FIXME:
+     * if (object_dynamic_cast(OBJECT(dev), TYPE_CXL_TYPE3_DEV)) {
+     *    HotplugHandler *hotplug_ctrl;
+     *   Error *local_err = NULL;
+     *  hotplug_ctrl = qdev_get_hotplug_handler(dev);
+     *  if (hotplug_ctrl) {
+     *      hotplug_handler_pre_plug(hotplug_ctrl, dev, &local_err);
+     *      if (local_err) {
+     *          error_propagate(errp, local_err);
+     *          return;
+     *      }
+     *  }
+     */
+
+    if (object_dynamic_cast(OBJECT(dev), TYPE_CXL_TYPE3_DEV)) {
+        memory_device_pre_plug(MEMORY_DEVICE(dev), MACHINE(qdev_get_machine()),
+                               NULL, errp);
+    }
+#endif
 }
 
 void pcie_cap_slot_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev,
@@ -455,6 +480,11 @@ void pcie_cap_slot_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev,
         pcie_cap_slot_event(hotplug_pdev,
                             PCI_EXP_HP_EV_PDC | PCI_EXP_HP_EV_ABP);
     }
+
+#ifdef CXL_MEM_DEVICE
+    if (object_dynamic_cast(OBJECT(dev), TYPE_CXL_TYPE3_DEV))
+        memory_device_plug(MEMORY_DEVICE(dev), MACHINE(qdev_get_machine()));
+#endif
 }
 
 void pcie_cap_slot_unplug_cb(HotplugHandler *hotplug_dev, DeviceState *dev,
diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
index b1e5f4a8fa..809ed7de60 100644
--- a/include/hw/cxl/cxl.h
+++ b/include/hw/cxl/cxl.h
@@ -17,6 +17,8 @@
 #define COMPONENT_REG_BAR_IDX 0
 #define DEVICE_REG_BAR_IDX 2
 
+#define TYPE_CXL_TYPE3_DEV "cxl-type3"
+
 #define CXL_HOST_BASE 0xD0000000
 #define CXL_WINDOW_MAX 10
 
diff --git a/include/hw/cxl/cxl_pci.h b/include/hw/cxl/cxl_pci.h
index a53c2e5ae7..9ec28c9feb 100644
--- a/include/hw/cxl/cxl_pci.h
+++ b/include/hw/cxl/cxl_pci.h
@@ -64,6 +64,28 @@ _Static_assert(sizeof(struct dvsec_header) == 10,
  * CXL 2.0 Downstream Port: 3, 4, 7, 8
  */
 
+/* CXL 2.0 - 8.1.3 (ID 0001) */
+struct dvsec_device {
+    struct dvsec_header hdr;
+    uint16_t cap;
+    uint16_t ctrl;
+    uint16_t status;
+    uint16_t ctrl2;
+    uint16_t status2;
+    uint16_t lock;
+    uint16_t cap2;
+    uint32_t range1_size_hi;
+    uint32_t range1_size_lo;
+    uint32_t range1_base_hi;
+    uint32_t range1_base_lo;
+    uint32_t range2_size_hi;
+    uint32_t range2_size_lo;
+    uint32_t range2_base_hi;
+    uint32_t range2_base_lo;
+};
+_Static_assert(sizeof(struct dvsec_device) == 0x38,
+               "dvsec device size incorrect");
+
 /* CXL 2.0 - 8.1.5 (ID 0003) */
 struct extensions_dvsec_port {
     struct dvsec_header hdr;
diff --git a/include/hw/pci/pci_ids.h b/include/hw/pci/pci_ids.h
index 11f8ab7149..76bf3ed590 100644
--- a/include/hw/pci/pci_ids.h
+++ b/include/hw/pci/pci_ids.h
@@ -53,6 +53,7 @@
 #define PCI_BASE_CLASS_MEMORY            0x05
 #define PCI_CLASS_MEMORY_RAM             0x0500
 #define PCI_CLASS_MEMORY_FLASH           0x0501
+#define PCI_CLASS_MEMORY_CXL             0x0502
 #define PCI_CLASS_MEMORY_OTHER           0x0580
 
 #define PCI_BASE_CLASS_BRIDGE            0x06
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index a48bc1e904..1c8edfb136 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -1884,6 +1884,21 @@ void hmp_info_memory_devices(Monitor *mon, const QDict *qdict)
                 monitor_printf(mon, "  hotpluggable: %s\n",
                                di->hotpluggable ? "true" : "false");
                 break;
+            case MEMORY_DEVICE_INFO_KIND_CXL:
+                di = value->u.cxl.data;
+                monitor_printf(mon, "Memory device [%s]: \"%s\"\n",
+                               MemoryDeviceInfoKind_str(value->type),
+                               di->id ? di->id : "");
+                monitor_printf(mon, "  addr: 0x%" PRIx64 "\n", di->addr);
+                monitor_printf(mon, "  slot: %" PRId64 "\n", di->slot);
+                monitor_printf(mon, "  node: %" PRId64 "\n", di->node);
+                monitor_printf(mon, "  size: %" PRIu64 "\n", di->size);
+                monitor_printf(mon, "  memdev: %s\n", di->memdev);
+                monitor_printf(mon, "  hotplugged: %s\n",
+                               di->hotplugged ? "true" : "false");
+                monitor_printf(mon, "  hotpluggable: %s\n",
+                               di->hotpluggable ? "true" : "false");
+                break;
             case MEMORY_DEVICE_INFO_KIND_VIRTIO_PMEM:
                 vpi = value->u.virtio_pmem.data;
                 monitor_printf(mon, "Memory device [%s]: \"%s\"\n",
diff --git a/qapi/machine.json b/qapi/machine.json
index 330189efe3..aa96d662bd 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -1394,6 +1394,7 @@
 { 'union': 'MemoryDeviceInfo',
   'data': { 'dimm': 'PCDIMMDeviceInfo',
             'nvdimm': 'PCDIMMDeviceInfo',
+            'cxl': 'PCDIMMDeviceInfo',
             'virtio-pmem': 'VirtioPMEMDeviceInfo',
             'virtio-mem': 'VirtioMEMDeviceInfo'
           }
-- 
2.30.0



* [RFC PATCH v3 21/31] hw/cxl/device: Add a memory device (8.2.8.5)
@ 2021-02-02  0:59   ` Ben Widawsky
  0 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, David Hildenbrand, Vishal Verma,
	John Groves (jgroves),
	Chris Browy, Markus Armbruster, linux-cxl,
	Philippe Mathieu-Daudé,
	Michael S. Tsirkin, Jonathan Cameron, Igor Mammedov,
	Dan Williams, Ira Weiny

A CXL memory device (AKA Type 3) is a CXL component that contains some
combination of volatile and persistent memory. It also implements the
previously defined mailbox interface as well as the memory device
firmware interface.

Although the memory device is configured like a normal PCIe device, the
memory traffic is conceptually on an entirely separate bus (using the
same physical wires as PCIe, but a different protocol).

The guest physical address for the memory device is part of a larger
window which is owned by the platform. Currently, this is hardcoded as
an object property on host bridge (PXB) creation, but that will need to
change for interleaving.

The following example will create a 256M device in a 512M window:
-object "memory-backend-file,id=cxl-mem1,share,mem-path=cxl-type3,size=512M"
-device "cxl-type3,bus=rp0,memdev=cxl-mem1,id=cxl-pmem0,size=256M"
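
The sizing in the example above can be checked with a small sketch. This is
not code from the patch: cxl_window_has_room() and its parameters are
hypothetical names, and it only mirrors the "remaining space in the window"
comparison that cxl_setup_memory() performs, under the assumption that the
window is filled from offset 0 upwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)

/*
 * Hypothetical helper (not in the patch): returns true when a device of
 * dev_size bytes fits in a window of window_size bytes whose first
 * `offset` bytes are already in use.
 */
static bool cxl_window_has_room(uint64_t window_size, uint64_t offset,
                                uint64_t dev_size)
{
    if (offset > window_size) {
        return false;
    }
    /* Subtract first so that offset + dev_size cannot overflow. */
    return window_size - offset >= dev_size;
}
```

With the numbers from the example, a 256M device fits at offset 0 of the
512M window, and a second 256M device would still fit at offset 256M.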

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/core/numa.c             |   3 +
 hw/cxl/cxl-mailbox-utils.c |  41 ++++++
 hw/i386/pc.c               |   1 +
 hw/mem/Kconfig             |   5 +
 hw/mem/cxl_type3.c         | 281 +++++++++++++++++++++++++++++++++++++
 hw/mem/meson.build         |   1 +
 hw/pci/pcie.c              |  30 ++++
 include/hw/cxl/cxl.h       |   2 +
 include/hw/cxl/cxl_pci.h   |  22 +++
 include/hw/pci/pci_ids.h   |   1 +
 monitor/hmp-cmds.c         |  15 ++
 qapi/machine.json          |   1 +
 12 files changed, 403 insertions(+)
 create mode 100644 hw/mem/cxl_type3.c

diff --git a/hw/core/numa.c b/hw/core/numa.c
index 68cee65f61..cd7df371e6 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -770,6 +770,9 @@ static void numa_stat_memory_devices(NumaNodeMem node_mem[])
                 node_mem[pcdimm_info->node].node_plugged_mem +=
                     pcdimm_info->size;
                 break;
+            case MEMORY_DEVICE_INFO_KIND_CXL:
+                /* FINISHME */
+                break;
             case MEMORY_DEVICE_INFO_KIND_VIRTIO_PMEM:
                 vpi = value->u.virtio_pmem.data;
                 /* TODO: once we support numa, assign to right node */
diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index 3f0ae8b9e5..f92dfad882 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -49,6 +49,8 @@ enum {
     LOGS        = 0x04,
         #define GET_SUPPORTED 0x0
         #define GET_LOG       0x1
+    IDENTIFY    = 0x40,
+        #define MEMORY_DEVICE 0x0
 };
 
 /* 8.2.8.4.5.1 Command Return Codes */
@@ -127,6 +129,7 @@ declare_mailbox_handler(TIMESTAMP_GET);
 declare_mailbox_handler(TIMESTAMP_SET);
 declare_mailbox_handler(LOGS_GET_SUPPORTED);
 declare_mailbox_handler(LOGS_GET_LOG);
+declare_mailbox_handler(IDENTIFY_MEMORY_DEVICE);
 
 #define IMMEDIATE_CONFIG_CHANGE (1 << 1)
 #define IMMEDIATE_POLICY_CHANGE (1 << 3)
@@ -144,6 +147,7 @@ static struct cxl_cmd cxl_cmd_set[256][256] = {
     CXL_CMD(TIMESTAMP, SET, 8, IMMEDIATE_POLICY_CHANGE),
     CXL_CMD(LOGS, GET_SUPPORTED, 0, 0),
     CXL_CMD(LOGS, GET_LOG, 0x18, 0),
+    CXL_CMD(IDENTIFY, MEMORY_DEVICE, 0, 0),
 };
 
 #undef CXL_CMD
@@ -255,6 +259,43 @@ define_mailbox_handler(LOGS_GET_LOG)
     return CXL_MBOX_SUCCESS;
 }
 
+/* 8.2.9.5.1.1 */
+define_mailbox_handler(IDENTIFY_MEMORY_DEVICE)
+{
+    struct {
+        char fw_revision[0x10];
+        uint64_t total_capacity;
+        uint64_t volatile_capacity;
+        uint64_t persistent_capacity;
+        uint64_t partition_align;
+        uint16_t info_event_log_size;
+        uint16_t warning_event_log_size;
+        uint16_t failure_event_log_size;
+        uint16_t fatal_event_log_size;
+        uint32_t lsa_size;
+        uint8_t poison_list_max_mer[3];
+        uint16_t inject_poison_limit;
+        uint8_t poison_caps;
+        uint8_t qos_telemetry_caps;
+    } __attribute__((packed)) *id;
+    _Static_assert(sizeof(*id) == 0x43, "Bad identify size");
+
+    if (memory_region_size(cxl_dstate->pmem) < (256 << 20)) {
+        return CXL_MBOX_INTERNAL_ERROR;
+    }
+
+    id = (void *)cmd->payload;
+    memset(id, 0, sizeof(*id));
+
+    /* PMEM only */
+    snprintf(id->fw_revision, 0x10, "BWFW VERSION %02d", 0);
+    id->total_capacity = memory_region_size(cxl_dstate->pmem);
+    id->persistent_capacity = memory_region_size(cxl_dstate->pmem);
+
+    *len = sizeof(*id);
+    return CXL_MBOX_SUCCESS;
+}
+
 void cxl_process_mailbox(CXLDeviceState *cxl_dstate)
 {
     uint16_t ret = CXL_MBOX_SUCCESS;
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 5458f61d10..5d41809b37 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -79,6 +79,7 @@
 #include "acpi-build.h"
 #include "hw/mem/pc-dimm.h"
 #include "hw/mem/nvdimm.h"
+#include "hw/cxl/cxl.h"
 #include "qapi/error.h"
 #include "qapi/qapi-visit-common.h"
 #include "qapi/visitor.h"
diff --git a/hw/mem/Kconfig b/hw/mem/Kconfig
index a0ef2cf648..7d9d1ced3e 100644
--- a/hw/mem/Kconfig
+++ b/hw/mem/Kconfig
@@ -10,3 +10,8 @@ config NVDIMM
     default y
     depends on (PC || PSERIES || ARM_VIRT)
     select MEM_DEVICE
+
+config CXL_MEM_DEVICE
+    bool
+    default y if CXL
+    select MEM_DEVICE
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
new file mode 100644
index 0000000000..4e9a016448
--- /dev/null
+++ b/hw/mem/cxl_type3.c
@@ -0,0 +1,281 @@
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "qemu/error-report.h"
+#include "hw/mem/memory-device.h"
+#include "hw/mem/pc-dimm.h"
+#include "hw/pci/pci.h"
+#include "hw/qdev-properties.h"
+#include "qapi/error.h"
+#include "qemu/log.h"
+#include "qemu/module.h"
+#include "qemu/range.h"
+#include "qemu/rcu.h"
+#include "sysemu/hostmem.h"
+#include "hw/cxl/cxl.h"
+
+typedef struct cxl_type3_dev {
+    /* Private */
+    PCIDevice parent_obj;
+
+    /* Properties */
+    uint64_t size;
+    HostMemoryBackend *hostmem;
+
+    /* State */
+    CXLComponentState cxl_cstate;
+    CXLDeviceState cxl_dstate;
+} CXLType3Dev;
+
+#define CT3(obj) OBJECT_CHECK(CXLType3Dev, (obj), TYPE_CXL_TYPE3_DEV)
+
+static void build_dvsecs(CXLType3Dev *ct3d)
+{
+    CXLComponentState *cxl_cstate = &ct3d->cxl_cstate;
+    uint8_t *dvsec;
+
+    dvsec = (uint8_t *)&(struct dvsec_device){
+        .cap = 0x1e,
+        .ctrl = 0x6,
+        .status2 = 0x2,
+        .range1_size_hi = 0,
+        .range1_size_lo = (2 << 5) | (2 << 2) | 0x3 | ct3d->size,
+        .range1_base_hi = 0,
+        .range1_base_lo = 0,
+    };
+    cxl_component_create_dvsec(cxl_cstate, PCIE_CXL_DEVICE_DVSEC_LENGTH,
+                               PCIE_CXL_DEVICE_DVSEC,
+                               PCIE_CXL2_DEVICE_DVSEC_REVID, dvsec);
+
+    dvsec = (uint8_t *)&(struct dvsec_register_locator){
+        .rsvd         = 0,
+        .reg0_base_lo = RBI_COMPONENT_REG | COMPONENT_REG_BAR_IDX,
+        .reg0_base_hi = 0,
+        .reg1_base_lo = RBI_CXL_DEVICE_REG | DEVICE_REG_BAR_IDX,
+        .reg1_base_hi = 0,
+    };
+    cxl_component_create_dvsec(cxl_cstate, REG_LOC_DVSEC_LENGTH, REG_LOC_DVSEC,
+                               REG_LOC_DVSEC_REVID, dvsec);
+}
+
+static void ct3_instance_init(Object *obj)
+{
+    /* MemoryDeviceClass *mdc = MEMORY_DEVICE_GET_CLASS(obj); */
+}
+
+static void ct3_finalize(Object *obj)
+{
+    CXLType3Dev *ct3d = CT3(obj);
+
+    g_free(ct3d->cxl_dstate.pmem);
+}
+
+#ifdef SET_PMEM_PADDR
+static void cxl_set_addr(CXLType3Dev *ct3d, hwaddr addr, Error **errp)
+{
+    MemoryDeviceClass *mdc = MEMORY_DEVICE_GET_CLASS(ct3d);
+    mdc->set_addr(MEMORY_DEVICE(ct3d), addr, errp);
+}
+#endif
+
+static void cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
+{
+    MemoryRegionSection mrs;
+    MemoryRegion *pmem;
+    MemoryRegion *mr;
+    uint64_t offset = 0;
+    size_t remaining_size;
+
+    if (!ct3d->hostmem) {
+        error_setg(errp, "memdev property must be set");
+        return;
+    }
+
+    /* FIXME: need to check mr is the host bridge's MR */
+    mr = host_memory_backend_get_memory(ct3d->hostmem);
+
+    /* Create our new subregion */
+    pmem = g_new(MemoryRegion, 1);
+
+    /* Find the first free space in the window */
+    WITH_RCU_READ_LOCK_GUARD()
+    {
+        mrs = memory_region_find(mr, offset, 1);
+        while (mrs.mr && mrs.mr != mr) {
+            offset += memory_region_size(mrs.mr);
+            mrs = memory_region_find(mr, offset, 1);
+        }
+    }
+
+    remaining_size = memory_region_size(mr) - offset;
+    if (remaining_size < ct3d->size) {
+        g_free(pmem);
+        error_setg(errp, "Not enough free space (%zu) for device (%" PRIu64 ")",
+                   remaining_size, ct3d->size);
+        return;
+    }
+
+    memory_region_set_nonvolatile(pmem, true);
+    memory_region_set_enabled(pmem, false);
+    memory_region_init_alias(pmem, OBJECT(ct3d), "cxl_type3-memory", mr, 0,
+                             ct3d->size);
+    ct3d->cxl_dstate.pmem = pmem;
+
+#ifdef SET_PMEM_PADDR
+    /* This path will initialize the memory device as if BIOS had done it */
+    cxl_set_addr(ct3d, mr->addr + offset, errp);
+    memory_region_set_enabled(pmem, true);
+#endif
+}
+
+static MemoryRegion *cxl_md_get_memory_region(MemoryDeviceState *md,
+                                              Error **errp)
+{
+    CXLType3Dev *ct3d = CT3(md);
+
+    if (!ct3d->cxl_dstate.pmem) {
+        cxl_setup_memory(ct3d, errp);
+    }
+
+    return ct3d->cxl_dstate.pmem;
+}
+
+static void ct3_realize(PCIDevice *pci_dev, Error **errp)
+{
+    CXLType3Dev *ct3d = CT3(pci_dev);
+    CXLComponentState *cxl_cstate = &ct3d->cxl_cstate;
+    ComponentRegisters *regs = &cxl_cstate->crb;
+    MemoryRegion *mr = &regs->component_registers;
+    uint8_t *pci_conf = pci_dev->config;
+
+    if (!ct3d->cxl_dstate.pmem) {
+        cxl_setup_memory(ct3d, errp);
+    }
+
+    pci_config_set_prog_interface(pci_conf, 0x10);
+    pci_config_set_class(pci_conf, PCI_CLASS_MEMORY_CXL);
+
+    pcie_endpoint_cap_init(pci_dev, 0x80);
+    cxl_cstate->dvsec_offset = 0x100;
+
+    ct3d->cxl_cstate.pdev = pci_dev;
+    build_dvsecs(ct3d);
+
+    cxl_component_register_block_init(OBJECT(pci_dev), cxl_cstate,
+                                      TYPE_CXL_TYPE3_DEV);
+
+    pci_register_bar(
+        pci_dev, COMPONENT_REG_BAR_IDX,
+        PCI_BASE_ADDRESS_SPACE_MEMORY | PCI_BASE_ADDRESS_MEM_TYPE_64, mr);
+
+    cxl_device_register_block_init(OBJECT(pci_dev), &ct3d->cxl_dstate);
+    pci_register_bar(pci_dev, DEVICE_REG_BAR_IDX,
+                     PCI_BASE_ADDRESS_SPACE_MEMORY |
+                         PCI_BASE_ADDRESS_MEM_TYPE_64,
+                     &ct3d->cxl_dstate.device_registers);
+}
+
+static uint64_t cxl_md_get_addr(const MemoryDeviceState *md)
+{
+    CXLType3Dev *ct3d = CT3(md);
+    MemoryRegion *pmem = ct3d->cxl_dstate.pmem;
+
+    assert(pmem->alias);
+    return pmem->alias_offset;
+}
+
+static void cxl_md_set_addr(MemoryDeviceState *md, uint64_t addr, Error **errp)
+{
+    CXLType3Dev *ct3d = CT3(md);
+    MemoryRegion *pmem = ct3d->cxl_dstate.pmem;
+
+    assert(pmem->alias);
+    memory_region_set_alias_offset(pmem, addr);
+    memory_region_set_address(pmem, addr);
+}
+
+static void ct3d_reset(DeviceState *dev)
+{
+    CXLType3Dev *ct3d = CT3(dev);
+    uint32_t *reg_state = ct3d->cxl_cstate.crb.cache_mem_registers;
+
+    cxl_component_register_init_common(reg_state, CXL2_TYPE3_DEVICE);
+    cxl_device_register_init_common(&ct3d->cxl_dstate);
+}
+
+static Property ct3_props[] = {
+    DEFINE_PROP_SIZE("size", CXLType3Dev, size, -1),
+    DEFINE_PROP_LINK("memdev", CXLType3Dev, hostmem, TYPE_MEMORY_BACKEND,
+                     HostMemoryBackend *),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void pc_dimm_md_fill_device_info(const MemoryDeviceState *md,
+                                        MemoryDeviceInfo *info)
+{
+    PCDIMMDeviceInfo *di = g_new0(PCDIMMDeviceInfo, 1);
+    const DeviceClass *dc = DEVICE_GET_CLASS(md);
+    const DeviceState *dev = DEVICE(md);
+    CXLType3Dev *ct3d = CT3(md);
+
+    if (dev->id) {
+        di->has_id = true;
+        di->id = g_strdup(dev->id);
+    }
+
+    di->hotplugged = dev->hotplugged;
+    di->hotpluggable = dc->hotpluggable;
+    di->addr = cxl_md_get_addr(md);
+    di->slot = 0;
+    di->node = 0;
+    di->size = memory_device_get_region_size(md, NULL);
+    di->memdev = object_get_canonical_path(OBJECT(ct3d->hostmem));
+
+    info->u.cxl.data = di;
+    info->type = MEMORY_DEVICE_INFO_KIND_CXL;
+}
+
+static void ct3_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+    PCIDeviceClass *pc = PCI_DEVICE_CLASS(oc);
+    MemoryDeviceClass *mdc = MEMORY_DEVICE_CLASS(oc);
+
+    pc->realize = ct3_realize;
+    pc->class_id = PCI_CLASS_STORAGE_EXPRESS;
+    pc->vendor_id = PCI_VENDOR_ID_INTEL;
+    pc->device_id = 0xd93; /* LVF for now */
+    pc->revision = 1;
+
+    set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
+    dc->desc = "CXL PMEM Device (Type 3)";
+    dc->reset = ct3d_reset;
+    device_class_set_props(dc, ct3_props);
+
+    mdc->get_memory_region = cxl_md_get_memory_region;
+    mdc->get_addr = cxl_md_get_addr;
+    mdc->fill_device_info = pc_dimm_md_fill_device_info;
+    mdc->get_plugged_size = memory_device_get_region_size;
+    mdc->set_addr = cxl_md_set_addr;
+}
+
+static const TypeInfo ct3d_info = {
+    .name = TYPE_CXL_TYPE3_DEV,
+    .parent = TYPE_PCI_DEVICE,
+    .class_init = ct3_class_init,
+    .instance_size = sizeof(CXLType3Dev),
+    .instance_init = ct3_instance_init,
+    .instance_finalize = ct3_finalize,
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_MEMORY_DEVICE },
+        { INTERFACE_CXL_DEVICE },
+        { INTERFACE_PCIE_DEVICE },
+        {}
+    },
+};
+
+static void ct3d_registers(void)
+{
+    type_register_static(&ct3d_info);
+}
+
+type_init(ct3d_registers);
diff --git a/hw/mem/meson.build b/hw/mem/meson.build
index 0d22f2b572..d13c3ed117 100644
--- a/hw/mem/meson.build
+++ b/hw/mem/meson.build
@@ -3,5 +3,6 @@ mem_ss.add(files('memory-device.c'))
 mem_ss.add(when: 'CONFIG_DIMM', if_true: files('pc-dimm.c'))
 mem_ss.add(when: 'CONFIG_NPCM7XX', if_true: files('npcm7xx_mc.c'))
 mem_ss.add(when: 'CONFIG_NVDIMM', if_true: files('nvdimm.c'))
+mem_ss.add(when: 'CONFIG_CXL_MEM_DEVICE', if_true: files('cxl_type3.c'))
 
 softmmu_ss.add_all(when: 'CONFIG_MEM_DEVICE', if_true: mem_ss)
diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
index d4010cf8f3..1ecf6f6a55 100644
--- a/hw/pci/pcie.c
+++ b/hw/pci/pcie.c
@@ -20,6 +20,7 @@
 
 #include "qemu/osdep.h"
 #include "qapi/error.h"
+#include "hw/mem/memory-device.h"
 #include "hw/pci/pci_bridge.h"
 #include "hw/pci/pcie.h"
 #include "hw/pci/msix.h"
@@ -27,6 +28,8 @@
 #include "hw/pci/pci_bus.h"
 #include "hw/pci/pcie_regs.h"
 #include "hw/pci/pcie_port.h"
+#include "hw/cxl/cxl.h"
+#include "hw/boards.h"
 #include "qemu/range.h"
 
 //#define DEBUG_PCIE
@@ -419,6 +422,28 @@ void pcie_cap_slot_pre_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev,
     }
 
     pcie_cap_slot_plug_common(PCI_DEVICE(hotplug_dev), dev, errp);
+
+#ifdef CXL_MEM_DEVICE
+    /*
+     * FIXME:
+     * if (object_dynamic_cast(OBJECT(dev), TYPE_CXL_TYPE3_DEV)) {
+     *    HotplugHandler *hotplug_ctrl;
+     *   Error *local_err = NULL;
+     *  hotplug_ctrl = qdev_get_hotplug_handler(dev);
+     *  if (hotplug_ctrl) {
+     *      hotplug_handler_pre_plug(hotplug_ctrl, dev, &local_err);
+     *      if (local_err) {
+     *          error_propagate(errp, local_err);
+     *          return;
+     *      }
+     *  }
+     */
+
+    if (object_dynamic_cast(OBJECT(dev), TYPE_CXL_TYPE3_DEV)) {
+        memory_device_pre_plug(MEMORY_DEVICE(dev), MACHINE(qdev_get_machine()),
+                               NULL, errp);
+    }
+#endif
 }
 
 void pcie_cap_slot_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev,
@@ -455,6 +480,11 @@ void pcie_cap_slot_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev,
         pcie_cap_slot_event(hotplug_pdev,
                             PCI_EXP_HP_EV_PDC | PCI_EXP_HP_EV_ABP);
     }
+
+#ifdef CXL_MEM_DEVICE
+    if (object_dynamic_cast(OBJECT(dev), TYPE_CXL_TYPE3_DEV))
+        memory_device_plug(MEMORY_DEVICE(dev), MACHINE(qdev_get_machine()));
+#endif
 }
 
 void pcie_cap_slot_unplug_cb(HotplugHandler *hotplug_dev, DeviceState *dev,
diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
index b1e5f4a8fa..809ed7de60 100644
--- a/include/hw/cxl/cxl.h
+++ b/include/hw/cxl/cxl.h
@@ -17,6 +17,8 @@
 #define COMPONENT_REG_BAR_IDX 0
 #define DEVICE_REG_BAR_IDX 2
 
+#define TYPE_CXL_TYPE3_DEV "cxl-type3"
+
 #define CXL_HOST_BASE 0xD0000000
 #define CXL_WINDOW_MAX 10
 
diff --git a/include/hw/cxl/cxl_pci.h b/include/hw/cxl/cxl_pci.h
index a53c2e5ae7..9ec28c9feb 100644
--- a/include/hw/cxl/cxl_pci.h
+++ b/include/hw/cxl/cxl_pci.h
@@ -64,6 +64,28 @@ _Static_assert(sizeof(struct dvsec_header) == 10,
  * CXL 2.0 Downstream Port: 3, 4, 7, 8
  */
 
+/* CXL 2.0 - 8.1.3 (ID 0001) */
+struct dvsec_device {
+    struct dvsec_header hdr;
+    uint16_t cap;
+    uint16_t ctrl;
+    uint16_t status;
+    uint16_t ctrl2;
+    uint16_t status2;
+    uint16_t lock;
+    uint16_t cap2;
+    uint32_t range1_size_hi;
+    uint32_t range1_size_lo;
+    uint32_t range1_base_hi;
+    uint32_t range1_base_lo;
+    uint32_t range2_size_hi;
+    uint32_t range2_size_lo;
+    uint32_t range2_base_hi;
+    uint32_t range2_base_lo;
+};
+_Static_assert(sizeof(struct dvsec_device) == 0x38,
+               "dvsec device size incorrect");
+
 /* CXL 2.0 - 8.1.5 (ID 0003) */
 struct extensions_dvsec_port {
     struct dvsec_header hdr;
diff --git a/include/hw/pci/pci_ids.h b/include/hw/pci/pci_ids.h
index 11f8ab7149..76bf3ed590 100644
--- a/include/hw/pci/pci_ids.h
+++ b/include/hw/pci/pci_ids.h
@@ -53,6 +53,7 @@
 #define PCI_BASE_CLASS_MEMORY            0x05
 #define PCI_CLASS_MEMORY_RAM             0x0500
 #define PCI_CLASS_MEMORY_FLASH           0x0501
+#define PCI_CLASS_MEMORY_CXL             0x0502
 #define PCI_CLASS_MEMORY_OTHER           0x0580
 
 #define PCI_BASE_CLASS_BRIDGE            0x06
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index a48bc1e904..1c8edfb136 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -1884,6 +1884,21 @@ void hmp_info_memory_devices(Monitor *mon, const QDict *qdict)
                 monitor_printf(mon, "  hotpluggable: %s\n",
                                di->hotpluggable ? "true" : "false");
                 break;
+            case MEMORY_DEVICE_INFO_KIND_CXL:
+                di = value->u.cxl.data;
+                monitor_printf(mon, "Memory device [%s]: \"%s\"\n",
+                               MemoryDeviceInfoKind_str(value->type),
+                               di->id ? di->id : "");
+                monitor_printf(mon, "  addr: 0x%" PRIx64 "\n", di->addr);
+                monitor_printf(mon, "  slot: %" PRId64 "\n", di->slot);
+                monitor_printf(mon, "  node: %" PRId64 "\n", di->node);
+                monitor_printf(mon, "  size: %" PRIu64 "\n", di->size);
+                monitor_printf(mon, "  memdev: %s\n", di->memdev);
+                monitor_printf(mon, "  hotplugged: %s\n",
+                               di->hotplugged ? "true" : "false");
+                monitor_printf(mon, "  hotpluggable: %s\n",
+                               di->hotpluggable ? "true" : "false");
+                break;
             case MEMORY_DEVICE_INFO_KIND_VIRTIO_PMEM:
                 vpi = value->u.virtio_pmem.data;
                 monitor_printf(mon, "Memory device [%s]: \"%s\"\n",
diff --git a/qapi/machine.json b/qapi/machine.json
index 330189efe3..aa96d662bd 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -1394,6 +1394,7 @@
 { 'union': 'MemoryDeviceInfo',
   'data': { 'dimm': 'PCDIMMDeviceInfo',
             'nvdimm': 'PCDIMMDeviceInfo',
+            'cxl': 'PCDIMMDeviceInfo',
             'virtio-pmem': 'VirtioPMEMDeviceInfo',
             'virtio-mem': 'VirtioMEMDeviceInfo'
           }
-- 
2.30.0




* [RFC PATCH v3 22/31] hw/cxl/device: Implement MMIO HDM decoding (8.2.5.12)
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

A device's volatile and persistent memory are known as Host-managed Device
Memory (HDM) regions. The device is programmed to claim the addresses
associated with those regions through dedicated logic known as the HDM
decoder. To allow the OS to program the HDMs properly, the HDM decoders
must be modeled.
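
As a rough illustration of what committing a decoder involves, here is a
minimal, self-contained sketch. It is not the code in this patch: the
struct, join64() and hdm_commit_sketch() are hypothetical names, reserved
bits in the low registers are ignored, and the rule that the programmed
size must equal the device size is taken from the current (no resizing)
behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of one decoder's registers. */
struct hdm_decoder_sketch {
    uint32_t base_lo, base_hi;
    uint32_t size_lo, size_hi;
    bool committed;
    bool error;
};

/* Combine the high and low 32-bit halves of a 64-bit register pair. */
static uint64_t join64(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/*
 * Commit succeeds only if the programmed range lies entirely inside the
 * platform window and its size matches the device's memory size.
 */
static bool hdm_commit_sketch(struct hdm_decoder_sketch *d,
                              uint64_t win_base, uint64_t win_size,
                              uint64_t dev_size)
{
    uint64_t base = join64(d->base_hi, d->base_lo);
    uint64_t size = join64(d->size_hi, d->size_lo);

    d->committed = false;
    d->error = false;

    /* Written with subtractions so that base + size cannot overflow. */
    if (size == 0 || size != dev_size ||
        base < win_base || size > win_size ||
        base - win_base > win_size - size) {
        d->error = true;
        return false;
    }

    d->committed = true;
    return true;
}
```

A guest driver would see the result through the COMMITTED and ERROR bits of
the decoder control register; the sketch models those as two booleans.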

There are two mechanisms for programming the HDM decoders: the legacy
mechanism is PCIe DVSEC programming from CXL 1.1 (8.1.3.8), and the MMIO
mechanism is described in 8.2.5.12 of the spec. For now, 8.1.3.8 is not
implemented.

Much of the CXL device logic is implemented in cxl-utils. The HDM decoder,
however, is implemented directly in the device. The generic cxl-utils is
probably the correct place for it, since HDM decoders aren't unique to a
type3 device. For now it is easier, and requires less design
consideration, to implement it in the device and figure out how to
consolidate it later.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/mem/cxl_type3.c | 92 ++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 84 insertions(+), 8 deletions(-)

diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 4e9a016448..fe02c3b63c 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -57,6 +57,84 @@ static void build_dvsecs(CXLType3Dev *ct3d)
                                REG_LOC_DVSEC_REVID, dvsec);
 }
 
+static void cxl_set_addr(CXLType3Dev *ct3d, hwaddr addr, Error **errp)
+{
+    MemoryDeviceClass *mdc = MEMORY_DEVICE_GET_CLASS(ct3d);
+    mdc->set_addr(MEMORY_DEVICE(ct3d), addr, errp);
+}
+
+static void hdm_decoder_commit(CXLType3Dev *ct3d, int which)
+{
+    MemoryRegion *pmem = ct3d->cxl_dstate.pmem;
+    MemoryRegion *mr = host_memory_backend_get_memory(ct3d->hostmem);
+    Range window, device;
+    ComponentRegisters *cregs = &ct3d->cxl_cstate.crb;
+    uint32_t *cache_mem = cregs->cache_mem_registers;
+    uint64_t offset, size;
+    Error *err = NULL;
+
+    assert(which == 0);
+
+    ARRAY_FIELD_DP32(cache_mem, CXL_HDM_DECODER0_CTRL, COMMIT, 0);
+    ARRAY_FIELD_DP32(cache_mem, CXL_HDM_DECODER0_CTRL, ERROR, 0);
+
+    offset = ((uint64_t)cache_mem[R_CXL_HDM_DECODER0_BASE_HI] << 32) |
+             cache_mem[R_CXL_HDM_DECODER0_BASE_LO];
+    size = ((uint64_t)cache_mem[R_CXL_HDM_DECODER0_SIZE_HI] << 32) |
+           cache_mem[R_CXL_HDM_DECODER0_SIZE_LO];
+
+    range_init_nofail(&window, mr->addr, memory_region_size(mr));
+    range_init_nofail(&device, offset, size);
+
+    if (!range_contains_range(&window, &device)) {
+        ARRAY_FIELD_DP32(cache_mem, CXL_HDM_DECODER0_CTRL, ERROR, 1);
+        return;
+    }
+
+    /*
+     * FIXME: Support resizing.
+     * Maybe just memory_region_ram_resize(pmem, size, &err)?
+     */
+    if (size != memory_region_size(pmem)) {
+        ARRAY_FIELD_DP32(cache_mem, CXL_HDM_DECODER0_CTRL, ERROR, 1);
+        return;
+    }
+
+    cxl_set_addr(ct3d, offset, &err);
+    if (err) {
+        ARRAY_FIELD_DP32(cache_mem, CXL_HDM_DECODER0_CTRL, ERROR, 1);
+        return;
+    }
+    memory_region_set_enabled(pmem, true);
+
+    ARRAY_FIELD_DP32(cache_mem, CXL_HDM_DECODER0_CTRL, COMMITTED, 1);
+}
+
+static void ct3d_reg_write(void *opaque, hwaddr offset, uint64_t value, unsigned size)
+{
+    CXLComponentState *cxl_cstate = opaque;
+    ComponentRegisters *cregs = &cxl_cstate->crb;
+    CXLType3Dev *ct3d = container_of(cxl_cstate, CXLType3Dev, cxl_cstate);
+    uint32_t *cache_mem = cregs->cache_mem_registers;
+    bool should_commit = false;
+    int which_hdm = -1;
+
+    assert(size == 4);
+
+    switch (offset) {
+    case A_CXL_HDM_DECODER0_CTRL:
+        should_commit = FIELD_EX32(value, CXL_HDM_DECODER0_CTRL, COMMIT);
+        which_hdm = 0;
+        break;
+    default:
+        break;
+    }
+
+    stl_le_p((uint8_t *)cache_mem + offset, value);
+    if (should_commit)
+        hdm_decoder_commit(ct3d, which_hdm);
+}
+
 static void ct3_instance_init(Object *obj)
 {
     /* MemoryDeviceClass *mdc = MEMORY_DEVICE_GET_CLASS(obj); */
@@ -65,18 +143,13 @@ static void ct3_instance_init(Object *obj)
 static void ct3_finalize(Object *obj)
 {
     CXLType3Dev *ct3d = CT3(obj);
+    CXLComponentState *cxl_cstate = &ct3d->cxl_cstate;
+    ComponentRegisters *regs = &cxl_cstate->crb;
 
+    g_free((void *)regs->special_ops);
     g_free(ct3d->cxl_dstate.pmem);
 }
 
-#ifdef SET_PMEM_PADDR
-static void cxl_set_addr(CXLType3Dev *ct3d, hwaddr addr, Error **errp)
-{
-    MemoryDeviceClass *mdc = MEMORY_DEVICE_GET_CLASS(ct3d);
-    mdc->set_addr(MEMORY_DEVICE(ct3d), addr, errp);
-}
-#endif
-
 static void cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
 {
     MemoryRegionSection mrs;
@@ -160,6 +233,9 @@ static void ct3_realize(PCIDevice *pci_dev, Error **errp)
     ct3d->cxl_cstate.pdev = pci_dev;
     build_dvsecs(ct3d);
 
+    regs->special_ops = g_new0(MemoryRegionOps, 1);
+    regs->special_ops->write = ct3d_reg_write;
+
     cxl_component_register_block_init(OBJECT(pci_dev), cxl_cstate,
                                       TYPE_CXL_TYPE3_DEV);
 
-- 
2.30.0




* [RFC PATCH v3 23/31] acpi/cxl: Add _OSC implementation (9.14.2)
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

The CXL 2.0 specification adds 2 new dwords to the existing _OSC definition
from PCIe. The new dwords are accessed with a new UUID. This
implementation supports what is in the specification.
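
For readers unfamiliar with _OSC, the negotiation can be sketched in plain
C. This is not the AML generated by this patch. It assumes the two CXL
dwords follow the three PCIe ones (status, support, control), so the
capabilities buffer has five dwords; osc_grant_cxl_control() and the
`grantable` mask are hypothetical. The "capabilities masked" status bit
(bit 4 of the first dword) is the one defined by ACPI for _OSC.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: status, PCIe support, PCIe control, CXL support, CXL control. */
enum { OSC_DWORDS = 5 };

/* ACPI _OSC status dword: set when firmware cleared requested control bits. */
#define OSC_STATUS_CAPS_MASKED (1u << 4)

/*
 * Hypothetical helper: grant only the CXL control bits the platform is
 * willing to hand over, and flag the status dword if any were refused.
 */
static void osc_grant_cxl_control(uint32_t cap[OSC_DWORDS], uint32_t grantable)
{
    uint32_t requested = cap[4];

    cap[4] = requested & grantable;
    if (cap[4] != requested) {
        cap[0] |= OSC_STATUS_CAPS_MASKED;
    }
}
```

The OS compares the returned control dword with what it requested and
takes ownership only of the features whose bits survived.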

We are currently in the process of defining a new _OSC. See later work
for an explanation.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/acpi/Kconfig       |   5 ++
 hw/acpi/cxl.c         | 104 ++++++++++++++++++++++++++++++++++++++++++
 hw/acpi/meson.build   |   1 +
 hw/i386/acpi-build.c  |  12 ++++-
 include/hw/acpi/cxl.h |  23 ++++++++++
 5 files changed, 144 insertions(+), 1 deletion(-)
 create mode 100644 hw/acpi/cxl.c
 create mode 100644 include/hw/acpi/cxl.h

diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig
index 1932f66af8..b27907953e 100644
--- a/hw/acpi/Kconfig
+++ b/hw/acpi/Kconfig
@@ -5,6 +5,7 @@ config ACPI_X86
     bool
     select ACPI
     select ACPI_NVDIMM
+    select ACPI_CXL
     select ACPI_CPU_HOTPLUG
     select ACPI_MEMORY_HOTPLUG
     select ACPI_HMAT
@@ -42,3 +43,7 @@ config ACPI_VMGENID
     depends on PC
 
 config ACPI_HW_REDUCED
+
+config ACPI_CXL
+    bool
+    depends on ACPI
diff --git a/hw/acpi/cxl.c b/hw/acpi/cxl.c
new file mode 100644
index 0000000000..7124d5a1a3
--- /dev/null
+++ b/hw/acpi/cxl.c
@@ -0,0 +1,104 @@
+/*
+ * CXL ACPI Implementation
+ *
+ * Copyright(C) 2020 Intel Corporation.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>
+ */
+
+#include "qemu/osdep.h"
+#include "hw/cxl/cxl.h"
+#include "hw/acpi/acpi.h"
+#include "hw/acpi/aml-build.h"
+#include "hw/acpi/bios-linker-loader.h"
+#include "hw/acpi/cxl.h"
+#include "qapi/error.h"
+#include "qemu/uuid.h"
+
+static Aml *__build_cxl_osc_method(void)
+{
+    Aml *method, *if_uuid, *else_uuid, *if_arg1_not_1, *if_cxl, *if_caps_masked;
+    Aml *a_ctrl = aml_local(0);
+    Aml *a_cdw1 = aml_name("CDW1");
+
+    method = aml_method("_OSC", 4, AML_NOTSERIALIZED);
+    aml_append(method, aml_create_dword_field(aml_arg(3), aml_int(0), "CDW1"));
+
+    /* 9.14.2.1.4 */
+    if_uuid = aml_if(
+        aml_lor(aml_equal(aml_arg(0),
+                          aml_touuid("33DB4D5B-1FF7-401C-9657-7441C03DD766")),
+                aml_equal(aml_arg(0),
+                          aml_touuid("68F2D50B-C469-4D8A-BD3D-941A103FD3FC"))));
+    aml_append(if_uuid, aml_create_dword_field(aml_arg(3), aml_int(4), "CDW2"));
+    aml_append(if_uuid, aml_create_dword_field(aml_arg(3), aml_int(8), "CDW3"));
+
+    aml_append(if_uuid, aml_store(aml_name("CDW3"), a_ctrl));
+
+    /* This is all the same as what's used for PCIe */
+    aml_append(if_uuid,
+               aml_and(aml_name("CTRL"), aml_int(0x1F), aml_name("CTRL")));
+
+    if_arg1_not_1 = aml_if(aml_lnot(aml_equal(aml_arg(1), aml_int(0x1))));
+    /* Unknown revision */
+    aml_append(if_arg1_not_1, aml_or(a_cdw1, aml_int(0x08), a_cdw1));
+    aml_append(if_uuid, if_arg1_not_1);
+
+    if_caps_masked = aml_if(aml_lnot(aml_equal(aml_name("CDW3"), a_ctrl)));
+    /* Capability bits were masked */
+    aml_append(if_caps_masked, aml_or(a_cdw1, aml_int(0x10), a_cdw1));
+    aml_append(if_uuid, if_caps_masked);
+
+    aml_append(if_uuid, aml_store(aml_name("CDW2"), aml_name("SUPP")));
+    aml_append(if_uuid, aml_store(aml_name("CDW3"), aml_name("CTRL")));
+
+    if_cxl = aml_if(aml_equal(
+        aml_arg(0), aml_touuid("68F2D50B-C469-4D8A-BD3D-941A103FD3FC")));
+    /* CXL support field */
+    aml_append(if_cxl, aml_create_dword_field(aml_arg(3), aml_int(12), "CDW4"));
+    /* CXL capabilities */
+    aml_append(if_cxl, aml_create_dword_field(aml_arg(3), aml_int(16), "CDW5"));
+    aml_append(if_cxl, aml_store(aml_name("CDW4"), aml_name("SUPC")));
+    aml_append(if_cxl, aml_store(aml_name("CDW5"), aml_name("CTRC")));
+
+    /* CXL 2.0 Port/Device Register access */
+    aml_append(if_cxl,
+               aml_or(aml_name("CDW5"), aml_int(0x1), aml_name("CDW5")));
+    aml_append(if_uuid, if_cxl);
+
+    /* Update DWORD3 (the return value) */
+    aml_append(if_uuid, aml_store(a_ctrl, aml_name("CDW3")));
+
+    aml_append(if_uuid, aml_return(aml_arg(3)));
+    aml_append(method, if_uuid);
+
+    else_uuid = aml_else();
+
+    /* unrecognized uuid */
+    aml_append(else_uuid,
+               aml_or(aml_name("CDW1"), aml_int(0x4), aml_name("CDW1")));
+    aml_append(else_uuid, aml_return(aml_arg(3)));
+    aml_append(method, else_uuid);
+
+    return method;
+}
+
+void build_cxl_osc_method(Aml *dev)
+{
+    aml_append(dev, aml_name_decl("SUPP", aml_int(0)));
+    aml_append(dev, aml_name_decl("CTRL", aml_int(0)));
+    aml_append(dev, aml_name_decl("SUPC", aml_int(0)));
+    aml_append(dev, aml_name_decl("CTRC", aml_int(0)));
+    aml_append(dev, __build_cxl_osc_method());
+}
diff --git a/hw/acpi/meson.build b/hw/acpi/meson.build
index dd69577212..9f5c5ced28 100644
--- a/hw/acpi/meson.build
+++ b/hw/acpi/meson.build
@@ -10,6 +10,7 @@ acpi_ss.add(when: 'CONFIG_ACPI_CPU_HOTPLUG', if_true: files('cpu_hotplug.c'))
 acpi_ss.add(when: 'CONFIG_ACPI_MEMORY_HOTPLUG', if_true: files('memory_hotplug.c'))
 acpi_ss.add(when: 'CONFIG_ACPI_NVDIMM', if_true: files('nvdimm.c'))
 acpi_ss.add(when: 'CONFIG_ACPI_PCI', if_true: files('pci.c'))
+acpi_ss.add(when: 'CONFIG_ACPI_CXL', if_true: files('cxl.c'))
 acpi_ss.add(when: 'CONFIG_ACPI_VMGENID', if_true: files('vmgenid.c'))
 acpi_ss.add(when: 'CONFIG_ACPI_HW_REDUCED', if_true: files('generic_event_device.c'))
 acpi_ss.add(when: 'CONFIG_ACPI_HMAT', if_true: files('hmat.c'))
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index ecdc10b148..2c2293b55f 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -66,6 +66,7 @@
 #include "hw/acpi/aml-build.h"
 #include "hw/acpi/utils.h"
 #include "hw/acpi/pci.h"
+#include "hw/acpi/cxl.h"
 
 #include "qom/qom-qobject.h"
 #include "hw/i386/amd_iommu.h"
@@ -1201,11 +1202,20 @@ static void init_pci_acpi(Aml *dev, int uid, int type)
     if (type == PCI) {
         aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
         aml_append(dev, aml_name_decl("_UID", aml_int(uid)));
-    } else {
+    } else if (type == PCIE) {
         aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
         aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
         aml_append(dev, aml_name_decl("_UID", aml_int(uid)));
         aml_append(dev, build_q35_osc_method());
+    } else /* CXL */ {
+        struct Aml *pkg = aml_package(2);
+
+        aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0016")));
+        aml_append(pkg, aml_eisaid("PNP0A08"));
+        aml_append(pkg, aml_eisaid("PNP0A03"));
+        aml_append(dev, aml_name_decl("_CID", pkg));
+        aml_append(dev, aml_name_decl("_UID", aml_int(uid)));
+        build_cxl_osc_method(dev);
     }
 }
 
diff --git a/include/hw/acpi/cxl.h b/include/hw/acpi/cxl.h
new file mode 100644
index 0000000000..7b8f3b8a2e
--- /dev/null
+++ b/include/hw/acpi/cxl.h
@@ -0,0 +1,23 @@
+/*
+ * Copyright (C) 2020 Intel Corporation
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_ACPI_CXL_H
+#define HW_ACPI_CXL_H
+
+void build_cxl_osc_method(Aml *dev);
+
+#endif
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [RFC PATCH v3 24/31] tests/acpi: allow CEDT table addition
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 tests/data/acpi/pc/CEDT                     | 0
 tests/data/acpi/q35/CEDT                    | 0
 tests/qtest/bios-tables-test-allowed-diff.h | 2 ++
 3 files changed, 2 insertions(+)
 create mode 100644 tests/data/acpi/pc/CEDT
 create mode 100644 tests/data/acpi/q35/CEDT

diff --git a/tests/data/acpi/pc/CEDT b/tests/data/acpi/pc/CEDT
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/tests/data/acpi/q35/CEDT b/tests/data/acpi/q35/CEDT
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523c8b..9b07f1e1ff 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,3 @@
 /* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/pc/CEDT",
+"tests/data/acpi/q35/CEDT",
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [RFC PATCH v3 25/31] acpi/cxl: Create the CEDT (9.14.1)
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

The CXL Early Discovery Table is defined in the CXL 2.0 specification as
a way for the OS to get CXL specific information from the system
firmware.

The CXL 2.0 specification adds an _HID, ACPI0016, for CXL capable host
bridges, with a _CID of PNP0A08 (PCIe host bridge). CXL aware software
is able to use this to initiate the proper _OSC method, and get the _UID
which is referenced by the CEDT. Therefore the existence of an ACPI0016
device allows a CXL aware driver to perform the necessary actions. For a
CXL capable OS, this works. For a CXL unaware OS, this works.

CEDT awareness requires more. The motivation for ACPI0017 is to provide
the possibility of having a Linux CXL module that can work on a legacy
Linux kernel. Linux core PCI/ACPI, which won't be built as a module,
will see the _CID of PNP0A08 and bind a driver to it. If we later load
a driver for ACPI0016, Linux won't be able to bind it to the hardware
because it has already bound the PNP0A08 driver. The ACPI0017 device is
an opportunity to have an object to bind a driver to; it will be used by
a Linux driver to walk the CXL topology and do everything that we would
have preferred to do with ACPI0016.

There is another motivation for an ACPI0017 device which isn't
implemented here. An operating system needs an attach point for a
non-volatile region provider that understands cross-hostbridge
interleaving. Since QEMU emulation doesn't support interleaving yet,
this is more important on the OS side, for now.

As of the CXL 2.0 spec, only 1 substructure is defined, the CXL Host
Bridge Structure (CHBS), which is primarily useful for telling the OS
exactly where the MMIO for the host bridge is.
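
As a rough illustration of the CHBS record this patch emits, the sketch
below packs the same fields, in the same order and widths, into a
32-byte little-endian buffer. It is a standalone model, not the QEMU
implementation: put_le() and chbs_encode() are hypothetical helpers, and
any base/length values passed in are the caller's own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHBS_LEN 32 /* record length written into the structure */

/* Store the low `size` bytes of val at buf[off], least significant first. */
static size_t put_le(uint8_t *buf, size_t off, uint64_t val, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        buf[off + i] = (uint8_t)(val >> (8 * i));
    }
    return off + size;
}

/* Encode one CHBS record for the host bridge identified by uid. */
static void chbs_encode(uint8_t out[CHBS_LEN], uint32_t uid,
                        uint64_t base, uint64_t length)
{
    size_t off = 0;

    off = put_le(out, off, 0, 1);        /* Type: 0 */
    off = put_le(out, off, 0, 1);        /* Reserved */
    off = put_le(out, off, CHBS_LEN, 2); /* Record Length */
    off = put_le(out, off, uid, 4);      /* UID, matches the bridge _UID */
    off = put_le(out, off, 1, 4);        /* Version */
    off = put_le(out, off, 0, 4);        /* Reserved */
    off = put_le(out, off, base, 8);     /* Base of the bridge MMIO */
    off = put_le(out, off, length, 8);   /* Length of the bridge MMIO */

    /* 1 + 1 + 2 + 4 + 4 + 4 + 8 + 8 bytes must add up to the length. */
    assert(off == CHBS_LEN);
}
```

The UID field is what lets the OS tie a CHBS entry back to the ACPI0016
device with the same _UID.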

v2: Update CHBS to spec released definition
v3: squash ACPI0017 in now that it's ratified.

Link: https://lore.kernel.org/linux-cxl/20210115034911.nkgpzc756d6qmjpl@intel.com/T/#t
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/acpi/cxl.c                       | 69 +++++++++++++++++++++++++++++
 hw/i386/acpi-build.c                | 25 ++++++++++-
 hw/pci-bridge/pci_expander_bridge.c | 21 +--------
 include/hw/acpi/cxl.h               |  4 ++
 include/hw/pci/pci_bridge.h         | 25 +++++++++++
 5 files changed, 123 insertions(+), 21 deletions(-)

diff --git a/hw/acpi/cxl.c b/hw/acpi/cxl.c
index 7124d5a1a3..68db0fe3a8 100644
--- a/hw/acpi/cxl.c
+++ b/hw/acpi/cxl.c
@@ -18,14 +18,83 @@
  */
 
 #include "qemu/osdep.h"
+#include "hw/sysbus.h"
+#include "hw/pci/pci_bridge.h"
+#include "hw/pci/pci_host.h"
 #include "hw/cxl/cxl.h"
+#include "hw/mem/memory-device.h"
 #include "hw/acpi/acpi.h"
 #include "hw/acpi/aml-build.h"
 #include "hw/acpi/bios-linker-loader.h"
 #include "hw/acpi/cxl.h"
+#include "hw/acpi/cxl.h"
 #include "qapi/error.h"
 #include "qemu/uuid.h"
 
+static void cedt_build_chbs(GArray *table_data, PXBDev *cxl)
+{
+    SysBusDevice *sbd = SYS_BUS_DEVICE(cxl->cxl.cxl_host_bridge);
+    struct MemoryRegion *mr = sbd->mmio[0].memory;
+
+    /* Type */
+    build_append_int_noprefix(table_data, 0, 1);
+
+    /* Reserved */
+    build_append_int_noprefix(table_data, 0, 1);
+
+    /* Record Length */
+    build_append_int_noprefix(table_data, 32, 2);
+
+    /* UID */
+    build_append_int_noprefix(table_data, cxl->uid, 4);
+
+    /* Version */
+    build_append_int_noprefix(table_data, 1, 4);
+
+    /* Reserved */
+    build_append_int_noprefix(table_data, 0, 4);
+
+    /* Base */
+    build_append_int_noprefix(table_data, mr->addr, 8);
+
+    /* Length */
+    build_append_int_noprefix(table_data, memory_region_size(mr), 8);
+}
+
+static int cxl_foreach_pxb_hb(Object *obj, void *opaque)
+{
+    Aml *cedt = opaque;
+
+    if (object_dynamic_cast(obj, TYPE_PXB_CXL_DEVICE)) {
+        PXBDev *pxb = PXB_CXL_DEV(obj);
+
+        cedt_build_chbs(cedt->buf, pxb);
+    }
+
+    return 0;
+}
+
+void cxl_build_cedt(GArray *table_offsets, GArray *table_data,
+                    BIOSLinker *linker)
+{
+    const int cedt_start = table_data->len;
+    Aml *cedt;
+
+    cedt = init_aml_allocator();
+
+    /* reserve space for CEDT header */
+    acpi_add_table(table_offsets, table_data);
+    acpi_data_push(cedt->buf, sizeof(AcpiTableHeader));
+
+    object_child_foreach_recursive(object_get_root(), cxl_foreach_pxb_hb, cedt);
+
+    /* copy AML table into ACPI tables blob and patch header there */
+    g_array_append_vals(table_data, cedt->buf->data, cedt->buf->len);
+    build_header(linker, table_data, (void *)(table_data->data + cedt_start),
+                 "CEDT", table_data->len - cedt_start, 1, NULL, NULL);
+    free_aml_allocator();
+}
+
 static Aml *__build_cxl_osc_method(void)
 {
     Aml *method, *if_uuid, *else_uuid, *if_arg1_not_1, *if_cxl, *if_caps_masked;
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 2c2293b55f..7706856c49 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -75,6 +75,8 @@
 #include "hw/acpi/ipmi.h"
 #include "hw/acpi/hmat.h"
 
+#include "hw/acpi/cxl.h"
+
 /* These are used to size the ACPI tables for -M pc-i440fx-1.7 and
  * -M pc-i440fx-2.0.  Even if the actual amount of AML generated grows
  * a little bit, there should be plenty of free space since the DSDT
@@ -1219,6 +1221,19 @@ static void init_pci_acpi(Aml *dev, int uid, int type)
     }
 }
 
+static void build_acpi0017(Aml *table)
+{
+    Aml *dev;
+    Aml *scope;
+
+    scope =  aml_scope("_SB");
+    dev = aml_device("CXLM");
+    aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0017")));
+
+    aml_append(scope, dev);
+    aml_append(table, scope);
+}
+
 static void
 build_dsdt(GArray *table_data, BIOSLinker *linker,
            AcpiPmInfo *pm, AcpiMiscInfo *misc,
@@ -1235,6 +1250,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
     int root_bus_limit = 0xFF;
     PCIBus *bus = NULL;
     TPMIf *tpm = tpm_find();
+    bool cxl_present = false;
     int i;
     VMBusBridge *vmbus_bridge = vmbus_bridge_find();
 
@@ -1371,7 +1387,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 
             scope = aml_scope("\\_SB");
             if (type == CXL) {
-                dev = aml_device("CXL%.01X", pci_bus_uid(bus));
+                dev = aml_device("CXL%.01X", uid);
             } else {
                 dev = aml_device("PC%.02X", bus_num);
             }
@@ -1391,6 +1407,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 
             /* Handle the ranges for the PXB expanders */
             if (type == CXL) {
+                cxl_present = true;
                 uint64_t base = CXL_HOST_BASE + uid * 0x10000;
                 crs_range_insert(crs_range_set.mem_ranges, base,
                                  base + 0x10000 - 1);
@@ -1398,6 +1415,10 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
         }
     }
 
+    if (cxl_present) {
+        build_acpi0017(dsdt);
+    }
+
     /*
      * At this point crs_range_set has all the ranges used by pci
      * busses *other* than PCI0.  These ranges will be excluded from
@@ -2278,6 +2299,8 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
                           machine->nvdimms_state, machine->ram_slots);
     }
 
+    cxl_build_cedt(table_offsets, tables_blob, tables->linker);
+
     acpi_add_table(table_offsets, tables_blob);
     build_waet(tables_blob, tables->linker);
 
diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
index af1450c69d..6458d5b76e 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -57,26 +57,6 @@ DECLARE_INSTANCE_CHECKER(PXBDev, PXB_DEV,
 DECLARE_INSTANCE_CHECKER(PXBDev, PXB_PCIE_DEV,
                          TYPE_PXB_PCIE_DEVICE)
 
-#define TYPE_PXB_CXL_DEVICE "pxb-cxl"
-DECLARE_INSTANCE_CHECKER(PXBDev, PXB_CXL_DEV,
-                         TYPE_PXB_CXL_DEVICE)
-
-struct PXBDev {
-    /*< private >*/
-    PCIDevice parent_obj;
-    /*< public >*/
-
-    uint8_t bus_nr;
-    uint16_t numa_node;
-    int32_t uid;
-    struct cxl_dev {
-        HostMemoryBackend *memory_window[CXL_WINDOW_MAX];
-
-        uint32_t num_windows;
-        hwaddr *window_base[CXL_WINDOW_MAX];
-    } cxl;
-};
-
 typedef struct CXLHost {
     PCIHostState parent_obj;
 
@@ -351,6 +331,7 @@ static void pxb_dev_realize_common(PCIDevice *dev, enum BusType type,
         bus = pci_root_bus_new(ds, dev_name, NULL, NULL, 0, TYPE_PXB_CXL_BUS);
         bus->flags |= PCI_BUS_CXL;
         PXB_CXL_HOST(ds)->dev = PXB_CXL_DEV(dev);
+        PXB_CXL_DEV(dev)->cxl.cxl_host_bridge = ds;
     } else {
         bus = pci_root_bus_new(ds, "pxb-internal", NULL, NULL, 0, TYPE_PXB_BUS);
         bds = qdev_new("pci-bridge");
diff --git a/include/hw/acpi/cxl.h b/include/hw/acpi/cxl.h
index 7b8f3b8a2e..db2063f8c9 100644
--- a/include/hw/acpi/cxl.h
+++ b/include/hw/acpi/cxl.h
@@ -18,6 +18,10 @@
 #ifndef HW_ACPI_CXL_H
 #define HW_ACPI_CXL_H
 
+#include "hw/acpi/bios-linker-loader.h"
+
+void cxl_build_cedt(GArray *table_offsets, GArray *table_data,
+                    BIOSLinker *linker);
 void build_cxl_osc_method(Aml *dev);
 
 #endif
diff --git a/include/hw/pci/pci_bridge.h b/include/hw/pci/pci_bridge.h
index a94d350034..50dd7fdf33 100644
--- a/include/hw/pci/pci_bridge.h
+++ b/include/hw/pci/pci_bridge.h
@@ -28,6 +28,7 @@
 
 #include "hw/pci/pci.h"
 #include "hw/pci/pci_bus.h"
+#include "hw/cxl/cxl.h"
 #include "qom/object.h"
 
 typedef struct PCIBridgeWindows PCIBridgeWindows;
@@ -81,6 +82,30 @@ struct PCIBridge {
 #define PCI_BRIDGE_DEV_PROP_MSI        "msi"
 #define PCI_BRIDGE_DEV_PROP_SHPC       "shpc"
 
+struct PXBDev {
+    /*< private >*/
+    PCIDevice parent_obj;
+    /*< public >*/
+
+    uint8_t bus_nr;
+    uint16_t numa_node;
+    int32_t uid;
+
+    struct cxl_dev {
+        HostMemoryBackend *memory_window[CXL_WINDOW_MAX];
+
+        uint32_t num_windows;
+        hwaddr *window_base[CXL_WINDOW_MAX];
+
+        void *cxl_host_bridge; /* Pointer to a CXLHost */
+    } cxl;
+};
+
+typedef struct PXBDev PXBDev;
+#define TYPE_PXB_CXL_DEVICE "pxb-cxl"
+DECLARE_INSTANCE_CHECKER(PXBDev, PXB_CXL_DEV,
+                         TYPE_PXB_CXL_DEVICE)
+
 int pci_bridge_ssvid_init(PCIDevice *dev, uint8_t offset,
                           uint16_t svid, uint16_t ssid,
                           Error **errp);
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [RFC PATCH v3 25/31] acpi/cxl: Create the CEDT (9.14.1)
@ 2021-02-02  0:59   ` Ben Widawsky
  0 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, David Hildenbrand, Vishal Verma,
	John Groves (jgroves),
	Chris Browy, Markus Armbruster, linux-cxl,
	Philippe Mathieu-Daudé,
	Michael S. Tsirkin, Jonathan Cameron, Igor Mammedov,
	Dan Williams, Ira Weiny

The CXL Early Discovery Table is defined in the CXL 2.0 specification as
a way for the OS to get CXL specific information from the system
firmware.

The CXL 2.0 specification adds an _HID, ACPI0016, for CXL capable host
bridges, with a _CID of PNP0A08 (PCIe host bridge). CXL aware software
is able to use this to initiate the proper _OSC method, and get the _UID
which is referenced by the CEDT. Therefore the existence of an ACPI0016
device allows a CXL aware driver to perform the necessary actions. For a
CXL capable OS, this works. For a CXL unaware OS, this works.

CEDT awareness requires more. The motivation for ACPI0017 is to provide
the possibility of having a Linux CXL module that can work on a legacy
Linux kernel. Linux core PCI/ACPI, which won't be built as a module,
will see the _CID of PNP0A08 and bind a driver to it. If we later load
a driver for ACPI0016, Linux won't be able to bind it to the hardware
because it has already bound the PNP0A08 driver. The ACPI0017 device is
an opportunity to have an object to bind a driver to; it will be used by
a Linux driver to walk the CXL topology and do everything that we would
have preferred to do with ACPI0016.

There is another motivation for an ACPI0017 device which isn't
implemented here. An operating system needs an attach point for a
non-volatile region provider that understands cross-hostbridge
interleaving. Since QEMU emulation doesn't support interleaving yet,
this is more important on the OS side, for now.

As of the CXL 2.0 spec, only one substructure is defined, the CXL Host
Bridge Structure (CHBS), which is primarily useful for telling the OS
exactly where the MMIO for the host bridge is.
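
For reference, here is a minimal standalone sketch of the 32-byte CHBS
record layout, following the field order that cedt_build_chbs() in this
patch emits. The helper names put_le() and chbs_encode() are made up for
illustration and are not part of QEMU; the real code uses
build_append_int_noprefix() on a GArray.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHBS_RECORD_LEN 32

/* Hypothetical helper: store `size` bytes of `value` little-endian at `off`. */
static size_t put_le(uint8_t *buf, size_t off, uint64_t value, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        buf[off + i] = (uint8_t)(value >> (8 * i));
    }
    return off + size;
}

/* Hypothetical encoder mirroring the field order used by cedt_build_chbs(). */
static size_t chbs_encode(uint8_t buf[CHBS_RECORD_LEN], uint32_t uid,
                          uint64_t base, uint64_t length)
{
    size_t off = 0;

    off = put_le(buf, off, 0, 1);               /* Type */
    off = put_le(buf, off, 0, 1);               /* Reserved */
    off = put_le(buf, off, CHBS_RECORD_LEN, 2); /* Record Length */
    off = put_le(buf, off, uid, 4);             /* UID */
    off = put_le(buf, off, 1, 4);               /* Version */
    off = put_le(buf, off, 0, 4);               /* Reserved */
    off = put_le(buf, off, base, 8);            /* Base of host bridge MMIO */
    off = put_le(buf, off, length, 8);          /* Length of that MMIO region */

    /* 1 + 1 + 2 + 4 + 4 + 4 + 8 + 8 bytes */
    assert(off == CHBS_RECORD_LEN);
    return off;
}
```

The field widths sum to 32, which is why the Record Length field is
hard-coded to 32 in the patch. Any base address passed in here is the
caller's choice; the sketch does not know the value of CXL_HOST_BASE.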

v2: Update CHBS to spec released definition
v3: squash ACPI0017 in now that it's ratified.

Link: https://lore.kernel.org/linux-cxl/20210115034911.nkgpzc756d6qmjpl@intel.com/T/#t
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/acpi/cxl.c                       | 69 +++++++++++++++++++++++++++++
 hw/i386/acpi-build.c                | 25 ++++++++++-
 hw/pci-bridge/pci_expander_bridge.c | 21 +--------
 include/hw/acpi/cxl.h               |  4 ++
 include/hw/pci/pci_bridge.h         | 25 +++++++++++
 5 files changed, 123 insertions(+), 21 deletions(-)

diff --git a/hw/acpi/cxl.c b/hw/acpi/cxl.c
index 7124d5a1a3..68db0fe3a8 100644
--- a/hw/acpi/cxl.c
+++ b/hw/acpi/cxl.c
@@ -18,14 +18,83 @@
  */
 
 #include "qemu/osdep.h"
+#include "hw/sysbus.h"
+#include "hw/pci/pci_bridge.h"
+#include "hw/pci/pci_host.h"
 #include "hw/cxl/cxl.h"
+#include "hw/mem/memory-device.h"
 #include "hw/acpi/acpi.h"
 #include "hw/acpi/aml-build.h"
 #include "hw/acpi/bios-linker-loader.h"
 #include "hw/acpi/cxl.h"
+#include "hw/acpi/cxl.h"
 #include "qapi/error.h"
 #include "qemu/uuid.h"
 
+static void cedt_build_chbs(GArray *table_data, PXBDev *cxl)
+{
+    SysBusDevice *sbd = SYS_BUS_DEVICE(cxl->cxl.cxl_host_bridge);
+    struct MemoryRegion *mr = sbd->mmio[0].memory;
+
+    /* Type */
+    build_append_int_noprefix(table_data, 0, 1);
+
+    /* Reserved */
+    build_append_int_noprefix(table_data, 0, 1);
+
+    /* Record Length */
+    build_append_int_noprefix(table_data, 32, 2);
+
+    /* UID */
+    build_append_int_noprefix(table_data, cxl->uid, 4);
+
+    /* Version */
+    build_append_int_noprefix(table_data, 1, 4);
+
+    /* Reserved */
+    build_append_int_noprefix(table_data, 0, 4);
+
+    /* Base */
+    build_append_int_noprefix(table_data, mr->addr, 8);
+
+    /* Length */
+    build_append_int_noprefix(table_data, memory_region_size(mr), 8);
+}
+
+static int cxl_foreach_pxb_hb(Object *obj, void *opaque)
+{
+    Aml *cedt = opaque;
+
+    if (object_dynamic_cast(obj, TYPE_PXB_CXL_DEVICE)) {
+        PXBDev *pxb = PXB_CXL_DEV(obj);
+
+        cedt_build_chbs(cedt->buf, pxb);
+    }
+
+    return 0;
+}
+
+void cxl_build_cedt(GArray *table_offsets, GArray *table_data,
+                    BIOSLinker *linker)
+{
+    const int cedt_start = table_data->len;
+    Aml *cedt;
+
+    cedt = init_aml_allocator();
+
+    /* reserve space for CEDT header */
+    acpi_add_table(table_offsets, table_data);
+    acpi_data_push(cedt->buf, sizeof(AcpiTableHeader));
+
+    object_child_foreach_recursive(object_get_root(), cxl_foreach_pxb_hb, cedt);
+
+    /* copy AML table into ACPI tables blob and patch header there */
+    g_array_append_vals(table_data, cedt->buf->data, cedt->buf->len);
+    build_header(linker, table_data, (void *)(table_data->data + cedt_start),
+                 "CEDT", table_data->len - cedt_start, 1, NULL, NULL);
+    free_aml_allocator();
+}
+
 static Aml *__build_cxl_osc_method(void)
 {
     Aml *method, *if_uuid, *else_uuid, *if_arg1_not_1, *if_cxl, *if_caps_masked;
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 2c2293b55f..7706856c49 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -75,6 +75,8 @@
 #include "hw/acpi/ipmi.h"
 #include "hw/acpi/hmat.h"
 
+#include "hw/acpi/cxl.h"
+
 /* These are used to size the ACPI tables for -M pc-i440fx-1.7 and
  * -M pc-i440fx-2.0.  Even if the actual amount of AML generated grows
  * a little bit, there should be plenty of free space since the DSDT
@@ -1219,6 +1221,19 @@ static void init_pci_acpi(Aml *dev, int uid, int type)
     }
 }
 
+static void build_acpi0017(Aml *table)
+{
+    Aml *dev;
+    Aml *scope;
+
+    scope =  aml_scope("_SB");
+    dev = aml_device("CXLM");
+    aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0017")));
+
+    aml_append(scope, dev);
+    aml_append(table, scope);
+}
+
 static void
 build_dsdt(GArray *table_data, BIOSLinker *linker,
            AcpiPmInfo *pm, AcpiMiscInfo *misc,
@@ -1235,6 +1250,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
     int root_bus_limit = 0xFF;
     PCIBus *bus = NULL;
     TPMIf *tpm = tpm_find();
+    bool cxl_present = false;
     int i;
     VMBusBridge *vmbus_bridge = vmbus_bridge_find();
 
@@ -1371,7 +1387,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 
             scope = aml_scope("\\_SB");
             if (type == CXL) {
-                dev = aml_device("CXL%.01X", pci_bus_uid(bus));
+                dev = aml_device("CXL%.01X", uid);
             } else {
                 dev = aml_device("PC%.02X", bus_num);
             }
@@ -1391,6 +1407,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 
             /* Handle the ranges for the PXB expanders */
             if (type == CXL) {
+                cxl_present = true;
                 uint64_t base = CXL_HOST_BASE + uid * 0x10000;
                 crs_range_insert(crs_range_set.mem_ranges, base,
                                  base + 0x10000 - 1);
@@ -1398,6 +1415,10 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
         }
     }
 
+    if (cxl_present) {
+        build_acpi0017(dsdt);
+    }
+
     /*
      * At this point crs_range_set has all the ranges used by pci
      * busses *other* than PCI0.  These ranges will be excluded from
@@ -2278,6 +2299,8 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
                           machine->nvdimms_state, machine->ram_slots);
     }
 
+    cxl_build_cedt(table_offsets, tables_blob, tables->linker);
+
     acpi_add_table(table_offsets, tables_blob);
     build_waet(tables_blob, tables->linker);
 
diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
index af1450c69d..6458d5b76e 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -57,26 +57,6 @@ DECLARE_INSTANCE_CHECKER(PXBDev, PXB_DEV,
 DECLARE_INSTANCE_CHECKER(PXBDev, PXB_PCIE_DEV,
                          TYPE_PXB_PCIE_DEVICE)
 
-#define TYPE_PXB_CXL_DEVICE "pxb-cxl"
-DECLARE_INSTANCE_CHECKER(PXBDev, PXB_CXL_DEV,
-                         TYPE_PXB_CXL_DEVICE)
-
-struct PXBDev {
-    /*< private >*/
-    PCIDevice parent_obj;
-    /*< public >*/
-
-    uint8_t bus_nr;
-    uint16_t numa_node;
-    int32_t uid;
-    struct cxl_dev {
-        HostMemoryBackend *memory_window[CXL_WINDOW_MAX];
-
-        uint32_t num_windows;
-        hwaddr *window_base[CXL_WINDOW_MAX];
-    } cxl;
-};
-
 typedef struct CXLHost {
     PCIHostState parent_obj;
 
@@ -351,6 +331,7 @@ static void pxb_dev_realize_common(PCIDevice *dev, enum BusType type,
         bus = pci_root_bus_new(ds, dev_name, NULL, NULL, 0, TYPE_PXB_CXL_BUS);
         bus->flags |= PCI_BUS_CXL;
         PXB_CXL_HOST(ds)->dev = PXB_CXL_DEV(dev);
+        PXB_CXL_DEV(dev)->cxl.cxl_host_bridge = ds;
     } else {
         bus = pci_root_bus_new(ds, "pxb-internal", NULL, NULL, 0, TYPE_PXB_BUS);
         bds = qdev_new("pci-bridge");
diff --git a/include/hw/acpi/cxl.h b/include/hw/acpi/cxl.h
index 7b8f3b8a2e..db2063f8c9 100644
--- a/include/hw/acpi/cxl.h
+++ b/include/hw/acpi/cxl.h
@@ -18,6 +18,10 @@
 #ifndef HW_ACPI_CXL_H
 #define HW_ACPI_CXL_H
 
+#include "hw/acpi/bios-linker-loader.h"
+
+void cxl_build_cedt(GArray *table_offsets, GArray *table_data,
+                    BIOSLinker *linker);
 void build_cxl_osc_method(Aml *dev);
 
 #endif
diff --git a/include/hw/pci/pci_bridge.h b/include/hw/pci/pci_bridge.h
index a94d350034..50dd7fdf33 100644
--- a/include/hw/pci/pci_bridge.h
+++ b/include/hw/pci/pci_bridge.h
@@ -28,6 +28,7 @@
 
 #include "hw/pci/pci.h"
 #include "hw/pci/pci_bus.h"
+#include "hw/cxl/cxl.h"
 #include "qom/object.h"
 
 typedef struct PCIBridgeWindows PCIBridgeWindows;
@@ -81,6 +82,30 @@ struct PCIBridge {
 #define PCI_BRIDGE_DEV_PROP_MSI        "msi"
 #define PCI_BRIDGE_DEV_PROP_SHPC       "shpc"
 
+struct PXBDev {
+    /*< private >*/
+    PCIDevice parent_obj;
+    /*< public >*/
+
+    uint8_t bus_nr;
+    uint16_t numa_node;
+    int32_t uid;
+
+    struct cxl_dev {
+        HostMemoryBackend *memory_window[CXL_WINDOW_MAX];
+
+        uint32_t num_windows;
+        hwaddr *window_base[CXL_WINDOW_MAX];
+
+        void *cxl_host_bridge; /* Pointer to a CXLHost */
+    } cxl;
+};
+
+typedef struct PXBDev PXBDev;
+#define TYPE_PXB_CXL_DEVICE "pxb-cxl"
+DECLARE_INSTANCE_CHECKER(PXBDev, PXB_CXL_DEV,
+                         TYPE_PXB_CXL_DEVICE)
+
 int pci_bridge_ssvid_init(PCIDevice *dev, uint8_t offset,
                           uint16_t svid, uint16_t ssid,
                           Error **errp);
-- 
2.30.0



^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [RFC PATCH v3 26/31] tests/acpi: Add new CEDT files
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 tests/data/acpi/pc/CEDT                     | Bin 0 -> 36 bytes
 tests/data/acpi/q35/CEDT                    | Bin 0 -> 36 bytes
 tests/qtest/bios-tables-test-allowed-diff.h |   2 --
 3 files changed, 2 deletions(-)

diff --git a/tests/data/acpi/pc/CEDT b/tests/data/acpi/pc/CEDT
index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..ebf9b54b0b27d9efca53359c3c2e560511f0e165 100644
GIT binary patch
literal 36
kcmZ>EbqP^nU|?X};N<V@5v<@85#a0$6k`O6f!H7#0Fb2z0RR91

literal 0
HcmV?d00001

diff --git a/tests/data/acpi/q35/CEDT b/tests/data/acpi/q35/CEDT
index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..ebf9b54b0b27d9efca53359c3c2e560511f0e165 100644
GIT binary patch
literal 36
kcmZ>EbqP^nU|?X};N<V@5v<@85#a0$6k`O6f!H7#0Fb2z0RR91

literal 0
HcmV?d00001

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index 9b07f1e1ff..dfb8523c8b 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1,3 +1 @@
 /* List of comma-separated changed AML files to ignore */
-"tests/data/acpi/pc/CEDT",
-"tests/data/acpi/q35/CEDT",
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [RFC PATCH v3 27/31] hw/cxl/device: Add some trivial commands
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

GET_FW_INFO and GET_PARTITION_INFO return information that, for this
emulation, is equivalent to what the IDENTIFY command already returns.
Add them anyway to make the implementation more robust.
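
As a rough standalone sketch of what the partition info handler reports
for a persistent-memory-only device (the struct and function names below
are hypothetical; the real handler writes into cmd->payload and returns
CXL_MBOX_* codes):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same four 64-bit fields as the payload in CCLS_GET_PARTITION_INFO. */
struct partition_info {
    uint64_t active_vmem;
    uint64_t active_pmem;
    uint64_t next_vmem;
    uint64_t next_pmem;
} __attribute__((packed));
_Static_assert(sizeof(struct partition_info) == 0x20,
               "Bad get partition info size");

/* The handlers in this patch reject devices smaller than 256 MiB. */
#define MIN_PMEM_SIZE (256ULL << 20)

/* Hypothetical stand-in: returns 0 on success, -1 if the device is too small. */
static int fill_partition_info(struct partition_info *info, uint64_t pmem_size)
{
    assert(info != NULL);

    if (pmem_size < MIN_PMEM_SIZE) {
        return -1;
    }

    /* PMEM only: no volatile capacity, and no pending repartition. */
    info->active_vmem = 0;
    info->next_vmem = 0;
    info->active_pmem = pmem_size;
    info->next_pmem = info->active_pmem;
    return 0;
}
```

Active and next capacities are identical because the emulation offers no
way to change the partitioning.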

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/cxl/cxl-mailbox-utils.c | 65 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 65 insertions(+)

diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index f92dfad882..dc8e0eb08e 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -43,6 +43,8 @@ enum {
         #define CLEAR_RECORDS   0x1
         #define GET_INTERRUPT_POLICY   0x2
         #define SET_INTERRUPT_POLICY   0x3
+    FIRMWARE_UPDATE = 0x02,
+        #define GET_INFO      0x0
     TIMESTAMP   = 0x03,
         #define GET           0x0
         #define SET           0x1
@@ -51,6 +53,8 @@ enum {
         #define GET_LOG       0x1
     IDENTIFY    = 0x40,
         #define MEMORY_DEVICE 0x0
+    CCLS        = 0x41,
+        #define GET_PARTITION_INFO     0x0
 };
 
 /* 8.2.8.4.5.1 Command Return Codes */
@@ -125,11 +129,13 @@ define_mailbox_handler_zeroed(EVENTS_GET_RECORDS, 0x20);
 define_mailbox_handler_nop(EVENTS_CLEAR_RECORDS);
 define_mailbox_handler_zeroed(EVENTS_GET_INTERRUPT_POLICY, 4);
 define_mailbox_handler_nop(EVENTS_SET_INTERRUPT_POLICY);
+declare_mailbox_handler(FIRMWARE_UPDATE_GET_INFO);
 declare_mailbox_handler(TIMESTAMP_GET);
 declare_mailbox_handler(TIMESTAMP_SET);
 declare_mailbox_handler(LOGS_GET_SUPPORTED);
 declare_mailbox_handler(LOGS_GET_LOG);
 declare_mailbox_handler(IDENTIFY_MEMORY_DEVICE);
+declare_mailbox_handler(CCLS_GET_PARTITION_INFO);
 
 #define IMMEDIATE_CONFIG_CHANGE (1 << 1)
 #define IMMEDIATE_POLICY_CHANGE (1 << 3)
@@ -143,15 +149,50 @@ static struct cxl_cmd cxl_cmd_set[256][256] = {
     CXL_CMD(EVENTS, CLEAR_RECORDS, ~0, IMMEDIATE_LOG_CHANGE),
     CXL_CMD(EVENTS, GET_INTERRUPT_POLICY, 0, 0),
     CXL_CMD(EVENTS, SET_INTERRUPT_POLICY, 4, IMMEDIATE_CONFIG_CHANGE),
+    CXL_CMD(FIRMWARE_UPDATE, GET_INFO, 0, 0),
     CXL_CMD(TIMESTAMP, GET, 0, 0),
     CXL_CMD(TIMESTAMP, SET, 8, IMMEDIATE_POLICY_CHANGE),
     CXL_CMD(LOGS, GET_SUPPORTED, 0, 0),
     CXL_CMD(LOGS, GET_LOG, 0x18, 0),
     CXL_CMD(IDENTIFY, MEMORY_DEVICE, 0, 0),
+    CXL_CMD(CCLS, GET_PARTITION_INFO, 0, 0),
 };
 
 #undef CXL_CMD
 
+/*
+ * 8.2.9.2.1
+ */
+define_mailbox_handler(FIRMWARE_UPDATE_GET_INFO)
+{
+    struct {
+        uint8_t slots_supported;
+        uint8_t slot_info;
+        uint8_t caps;
+        uint8_t rsvd[0xd];
+        char fw_rev1[0x10];
+        char fw_rev2[0x10];
+        char fw_rev3[0x10];
+        char fw_rev4[0x10];
+    } __attribute__((packed)) *fw_info;
+    _Static_assert(sizeof(*fw_info) == 0x50, "Bad firmware info size");
+
+    if (memory_region_size(cxl_dstate->pmem) < (256 << 20)) {
+        return CXL_MBOX_INTERNAL_ERROR;
+    }
+
+    fw_info = (void *)cmd->payload;
+    memset(fw_info, 0, sizeof(*fw_info));
+
+    fw_info->slots_supported = 2;
+    fw_info->slot_info = BIT(0) | BIT(3);
+    fw_info->caps = 0;
+    snprintf(fw_info->fw_rev1, 0x10, "BWFW VERSION %02d", 0);
+
+    *len = sizeof(*fw_info);
+    return CXL_MBOX_SUCCESS;
+}
+
 /*
  * 8.2.9.3.1
  */
@@ -296,6 +337,30 @@ define_mailbox_handler(IDENTIFY_MEMORY_DEVICE)
     return CXL_MBOX_SUCCESS;
 }
 
+define_mailbox_handler(CCLS_GET_PARTITION_INFO)
+{
+    struct {
+        uint64_t active_vmem;
+        uint64_t active_pmem;
+        uint64_t next_vmem;
+        uint64_t next_pmem;
+    } __attribute__((packed)) *part_info = (void *)cmd->payload;
+    _Static_assert(sizeof(*part_info) == 0x20, "Bad get partition info size");
+
+    if (memory_region_size(cxl_dstate->pmem) < (256 << 20)) {
+        return CXL_MBOX_INTERNAL_ERROR;
+    }
+
+    /* PMEM only */
+    part_info->active_vmem = 0;
+    part_info->next_vmem = 0;
+    part_info->active_pmem = memory_region_size(cxl_dstate->pmem);
+    part_info->next_pmem = part_info->active_pmem;
+
+    *len = sizeof(*part_info);
+    return CXL_MBOX_SUCCESS;
+}
+
 void cxl_process_mailbox(CXLDeviceState *cxl_dstate)
 {
     uint16_t ret = CXL_MBOX_SUCCESS;
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [RFC PATCH v3 28/31] hw/cxl/device: Plumb real LSA sizing
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

This should introduce no functional change. Subsequent work will make
use of this new class member.
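
The shape of the new class member, stripped of QOM, looks roughly like
the sketch below. All names here are hypothetical stand-ins; the real
code uses CXLType3Class, CXLType3Dev and CXL_TYPE3_DEV_GET_CLASS().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct type3_dev;

/* Stand-in for the class struct: one overridable hook. */
struct type3_class {
    uint64_t (*get_lsa_size)(const struct type3_dev *dev);
};

/* Stand-in for the device instance. */
struct type3_dev {
    const struct type3_class *klass;
    uint64_t lsa_bytes; /* hypothetical size of an attached LSA backend */
};

/* Behaviour added by this patch: no LSA is reported yet. */
static uint64_t stub_get_lsa_size(const struct type3_dev *dev)
{
    (void)dev;
    return 0;
}

/* What a later implementation might do once an LSA backend exists. */
static uint64_t backed_get_lsa_size(const struct type3_dev *dev)
{
    return dev->lsa_bytes;
}

/* The IDENTIFY handler only ever goes through the class hook. */
static uint64_t identify_lsa_size(const struct type3_dev *dev)
{
    assert(dev->klass != NULL && dev->klass->get_lsa_size != NULL);
    return dev->klass->get_lsa_size(dev);
}
```

Because the mailbox code calls the hook rather than reading device state
directly, swapping the stub for a real implementation should not require
touching the IDENTIFY handler again.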

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/cxl/cxl-mailbox-utils.c  |  4 ++++
 hw/mem/cxl_type3.c          | 24 +++++++++---------------
 include/hw/cxl/cxl.h        |  1 -
 include/hw/cxl/cxl_device.h | 24 ++++++++++++++++++++++++
 4 files changed, 37 insertions(+), 16 deletions(-)

diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index dc8e0eb08e..2637250c7b 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -321,6 +321,9 @@ define_mailbox_handler(IDENTIFY_MEMORY_DEVICE)
     } __attribute__((packed)) *id;
     _Static_assert(sizeof(*id) == 0x43, "Bad identify size");
 
+    CXLType3Dev *ct3d = container_of(cxl_dstate, CXLType3Dev, cxl_dstate);
+    CXLType3Class *cvc = CXL_TYPE3_DEV_GET_CLASS(ct3d);
+
     if (memory_region_size(cxl_dstate->pmem) < (256 << 20)) {
         return CXL_MBOX_INTERNAL_ERROR;
     }
@@ -332,6 +335,7 @@ define_mailbox_handler(IDENTIFY_MEMORY_DEVICE)
     snprintf(id->fw_revision, 0x10, "BWFW VERSION %02d", 0);
     id->total_capacity = memory_region_size(cxl_dstate->pmem);
     id->persistent_capacity = memory_region_size(cxl_dstate->pmem);
+    id->lsa_size = cvc->get_lsa_size(ct3d);
 
     *len = sizeof(*id);
     return CXL_MBOX_SUCCESS;
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index fe02c3b63c..074d1dd41f 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -13,21 +13,6 @@
 #include "sysemu/hostmem.h"
 #include "hw/cxl/cxl.h"
 
-typedef struct cxl_type3_dev {
-    /* Private */
-    PCIDevice parent_obj;
-
-    /* Properties */
-    uint64_t size;
-    HostMemoryBackend *hostmem;
-
-    /* State */
-    CXLComponentState cxl_cstate;
-    CXLDeviceState cxl_dstate;
-} CXLType3Dev;
-
-#define CT3(obj) OBJECT_CHECK(CXLType3Dev, (obj), TYPE_CXL_TYPE3_DEV)
-
 static void build_dvsecs(CXLType3Dev *ct3d)
 {
     CXLComponentState *cxl_cstate = &ct3d->cxl_cstate;
@@ -310,11 +295,17 @@ static void pc_dimm_md_fill_device_info(const MemoryDeviceState *md,
     info->type = MEMORY_DEVICE_INFO_KIND_CXL;
 }
 
+static uint64_t get_lsa_size(CXLType3Dev *ct3d)
+{
+    return 0;
+}
+
 static void ct3_class_init(ObjectClass *oc, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(oc);
     PCIDeviceClass *pc = PCI_DEVICE_CLASS(oc);
     MemoryDeviceClass *mdc = MEMORY_DEVICE_CLASS(oc);
+    CXLType3Class *cvc = CXL_TYPE3_DEV_CLASS(oc);
 
     pc->realize = ct3_realize;
     pc->class_id = PCI_CLASS_STORAGE_EXPRESS;
@@ -332,11 +323,14 @@ static void ct3_class_init(ObjectClass *oc, void *data)
     mdc->fill_device_info = pc_dimm_md_fill_device_info;
     mdc->get_plugged_size = memory_device_get_region_size;
     mdc->set_addr = cxl_md_set_addr;
+
+    cvc->get_lsa_size = get_lsa_size;
 }
 
 static const TypeInfo ct3d_info = {
     .name = TYPE_CXL_TYPE3_DEV,
     .parent = TYPE_PCI_DEVICE,
+    .class_size = sizeof(struct CXLType3Class),
     .class_init = ct3_class_init,
     .instance_size = sizeof(CXLType3Dev),
     .instance_init = ct3_instance_init,
diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
index 809ed7de60..c7ca42930f 100644
--- a/include/hw/cxl/cxl.h
+++ b/include/hw/cxl/cxl.h
@@ -23,4 +23,3 @@
 #define CXL_WINDOW_MAX 10
 
 #endif
-
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index ca5328a581..a79a0f106c 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -219,4 +219,28 @@ REG32(CXL_MEM_DEV_STS, 0)
     FIELD(CXL_MEM_DEV_STS, MBOX_READY, 4, 1)
     FIELD(CXL_MEM_DEV_STS, RESET_NEEDED, 5, 3)
 
+typedef struct cxl_type3_dev {
+    /* Private */
+    PCIDevice parent_obj;
+
+    /* Properties */
+    uint64_t size;
+    HostMemoryBackend *hostmem;
+    HostMemoryBackend *lsa;
+
+    /* State */
+    CXLComponentState cxl_cstate;
+    CXLDeviceState cxl_dstate;
+} CXLType3Dev;
+
+#define CT3(obj) OBJECT_CHECK(CXLType3Dev, (obj), TYPE_CXL_TYPE3_DEV)
+
+struct CXLType3Class {
+    /* Private */
+    PCIDeviceClass parent_class;
+
+    /* public */
+    uint64_t (*get_lsa_size)(CXLType3Dev *ct3d);
+};
+
 #endif
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [RFC PATCH v3 29/31] hw/cxl/device: Implement get/set LSA
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
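Note (not part of the patch): offsets and lengths for GET_LSA/SET_LSA come
from the guest, so any range check on them has to hold up against integer
wraparound. Below is a minimal standalone sketch of a wrap-free check.
lsa_range_ok() and lsa_range_end() are hypothetical helper names that this
series does not define, and the code is plain C rather than QEMU API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (assumption, not in this series): true when the
 * non-empty range [offset, offset + size) lies inside a region of
 * region_size bytes.
 */
static bool lsa_range_ok(uint64_t region_size, uint64_t offset, uint64_t size)
{
    /*
     * Compare size against the space left after offset instead of
     * computing offset + size, which could wrap around.
     */
    return size != 0 && offset <= region_size && size <= region_size - offset;
}

/*
 * Hypothetical helper: exclusive end of a range. The caller must already
 * have validated the range, so the addition below cannot overflow.
 */
static uint64_t lsa_range_end(uint64_t region_size, uint64_t offset,
                              uint64_t size)
{
    assert(lsa_range_ok(region_size, offset, size));
    return offset + size;
}
```

With a 1 KiB region, lsa_range_ok(1024, 0, 1024) holds, while
lsa_range_ok(1024, 1, 1024) and lsa_range_ok(1024, UINT64_MAX, 2) do not.
Rejecting an empty range mirrors the "offset + size > offset" assertion in
validate_lsa_access().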
 hw/cxl/cxl-mailbox-utils.c  | 50 +++++++++++++++++++++++++++++++++
 hw/mem/cxl_type3.c          | 56 ++++++++++++++++++++++++++++++++++++-
 include/hw/cxl/cxl_device.h |  9 ++++++
 3 files changed, 114 insertions(+), 1 deletion(-)

diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index 2637250c7b..c133cf0341 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -55,6 +55,8 @@ enum {
         #define MEMORY_DEVICE 0x0
     CCLS        = 0x41,
         #define GET_PARTITION_INFO     0x0
+        #define GET_LSA       0x2
+        #define SET_LSA       0x3
 };
 
 /* 8.2.8.4.5.1 Command Return Codes */
@@ -136,8 +138,11 @@ declare_mailbox_handler(LOGS_GET_SUPPORTED);
 declare_mailbox_handler(LOGS_GET_LOG);
 declare_mailbox_handler(IDENTIFY_MEMORY_DEVICE);
 declare_mailbox_handler(CCLS_GET_PARTITION_INFO);
+declare_mailbox_handler(CCLS_GET_LSA);
+declare_mailbox_handler(CCLS_SET_LSA);
 
 #define IMMEDIATE_CONFIG_CHANGE (1 << 1)
+#define IMMEDIATE_DATA_CHANGE (1 << 2)
 #define IMMEDIATE_POLICY_CHANGE (1 << 3)
 #define IMMEDIATE_LOG_CHANGE (1 << 4)
 
@@ -156,6 +161,8 @@ static struct cxl_cmd cxl_cmd_set[256][256] = {
     CXL_CMD(LOGS, GET_LOG, 0x18, 0),
     CXL_CMD(IDENTIFY, MEMORY_DEVICE, 0, 0),
     CXL_CMD(CCLS, GET_PARTITION_INFO, 0, 0),
+    CXL_CMD(CCLS, GET_LSA, 0, 0),
+    CXL_CMD(CCLS, SET_LSA, ~0, IMMEDIATE_CONFIG_CHANGE | IMMEDIATE_DATA_CHANGE),
 };
 
 #undef CXL_CMD
@@ -365,6 +372,49 @@ define_mailbox_handler(CCLS_GET_PARTITION_INFO)
     return CXL_MBOX_SUCCESS;
 }
 
+define_mailbox_handler(CCLS_GET_LSA)
+{
+    struct {
+        uint32_t offset;
+        uint32_t length;
+    } __attribute__((packed, __aligned__(16))) *get_lsa;
+    CXLType3Dev *ct3d = container_of(cxl_dstate, CXLType3Dev, cxl_dstate);
+    CXLType3Class *cvc = CXL_TYPE3_DEV_GET_CLASS(ct3d);
+    uint32_t offset, length;
+
+    get_lsa = (void *)cmd->payload;
+    offset = get_lsa->offset;
+    length = get_lsa->length;
+
+    *len = 0;
+    if ((uint64_t)offset + length > cvc->get_lsa_size(ct3d)) {
+        return CXL_MBOX_INVALID_INPUT;
+    }
+
+    *len = cvc->get_lsa(ct3d, get_lsa, length, offset);
+    return CXL_MBOX_SUCCESS;
+}
+
+define_mailbox_handler(CCLS_SET_LSA)
+{
+    struct {
+        uint32_t offset;
+        uint32_t rsvd;
+        uint8_t data[];
+    } __attribute__((packed, __aligned__(16))) *set_lsa = (void *)cmd->payload;
+    CXLType3Dev *ct3d = container_of(cxl_dstate, CXLType3Dev, cxl_dstate);
+    CXLType3Class *cvc = CXL_TYPE3_DEV_GET_CLASS(ct3d);
+    uint16_t plen = *len;
+
+    *len = 0;
+    if (((uint64_t)set_lsa->offset + plen) > cvc->get_lsa_size(ct3d)) {
+        return CXL_MBOX_INVALID_INPUT;
+    }
+
+    cvc->set_lsa(ct3d, set_lsa->data, plen, set_lsa->offset);
+    return CXL_MBOX_SUCCESS;
+}
+
 void cxl_process_mailbox(CXLDeviceState *cxl_dstate)
 {
     uint16_t ret = CXL_MBOX_SUCCESS;
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 074d1dd41f..d091e645aa 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -8,6 +8,7 @@
 #include "qapi/error.h"
 #include "qemu/log.h"
 #include "qemu/module.h"
+#include "qemu/pmem.h"
 #include "qemu/range.h"
 #include "qemu/rcu.h"
 #include "sysemu/hostmem.h"
@@ -148,6 +149,11 @@ static void cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
         return;
     }
 
+    if (!ct3d->lsa) {
+        error_setg(errp, "lsa property must be set");
+        return;
+    }
+
     /* FIXME: need to check mr is the host bridge's MR */
     mr = host_memory_backend_get_memory(ct3d->hostmem);
 
@@ -267,6 +273,8 @@ static Property ct3_props[] = {
     DEFINE_PROP_SIZE("size", CXLType3Dev, size, -1),
     DEFINE_PROP_LINK("memdev", CXLType3Dev, hostmem, TYPE_MEMORY_BACKEND,
                      HostMemoryBackend *),
+    DEFINE_PROP_LINK("lsa", CXLType3Dev, lsa, TYPE_MEMORY_BACKEND,
+                     HostMemoryBackend *),
     DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -297,7 +305,51 @@ static void pc_dimm_md_fill_device_info(const MemoryDeviceState *md,
 
 static uint64_t get_lsa_size(CXLType3Dev *ct3d)
 {
-    return 0;
+    MemoryRegion *mr;
+
+    mr = host_memory_backend_get_memory(ct3d->lsa);
+    return memory_region_size(mr);
+}
+
+static void validate_lsa_access(MemoryRegion *mr, uint64_t size,
+                                uint64_t offset)
+{
+    assert(offset + size <= memory_region_size(mr));
+    assert(offset + size > offset);
+}
+
+static uint64_t get_lsa(CXLType3Dev *ct3d, void *buf, uint64_t size,
+                    uint64_t offset)
+{
+    MemoryRegion *mr;
+    void *lsa;
+
+    mr = host_memory_backend_get_memory(ct3d->lsa);
+    validate_lsa_access(mr, size, offset);
+
+    lsa = memory_region_get_ram_ptr(mr) + offset;
+    memcpy(buf, lsa, size);
+
+    return size;
+}
+
+static void set_lsa(CXLType3Dev *ct3d, const void *buf, uint64_t size,
+                    uint64_t offset)
+{
+    MemoryRegion *mr;
+    void *lsa;
+
+    mr = host_memory_backend_get_memory(ct3d->lsa);
+    validate_lsa_access(mr, size, offset);
+
+    lsa = memory_region_get_ram_ptr(mr) + offset;
+    memcpy(lsa, buf, size);
+    memory_region_set_dirty(mr, offset, size);
+
+    /*
+     * Just like the PMEM, if the guest is not allowed to exit gracefully, label
+     * updates will get lost.
+     */
 }
 
 static void ct3_class_init(ObjectClass *oc, void *data)
@@ -325,6 +377,8 @@ static void ct3_class_init(ObjectClass *oc, void *data)
     mdc->set_addr = cxl_md_set_addr;
 
     cvc->get_lsa_size = get_lsa_size;
+    cvc->get_lsa = get_lsa;
+    cvc->set_lsa = set_lsa;
 }
 
 static const TypeInfo ct3d_info = {
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index a79a0f106c..1869876ef6 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -233,7 +233,11 @@ typedef struct cxl_type3_dev {
     CXLDeviceState cxl_dstate;
 } CXLType3Dev;
 
+#ifndef TYPE_CXL_TYPE3_DEV
+#define TYPE_CXL_TYPE3_DEV "cxl-type3"
+#endif
 #define CT3(obj) OBJECT_CHECK(CXLType3Dev, (obj), TYPE_CXL_TYPE3_DEV)
+OBJECT_DECLARE_TYPE(CXLType3Device, CXLType3Class, CXL_TYPE3_DEV)
 
 struct CXLType3Class {
     /* Private */
@@ -241,6 +245,11 @@ struct CXLType3Class {
 
     /* public */
     uint64_t (*get_lsa_size)(CXLType3Dev *ct3d);
+
+    uint64_t (*get_lsa)(CXLType3Dev *ct3d, void *buf, uint64_t size,
+                        uint64_t offset);
+    void (*set_lsa)(CXLType3Dev *ct3d, const void *buf, uint64_t size,
+                    uint64_t offset);
 };
 
 #endif
-- 
2.30.0



* [RFC PATCH v3 30/31] qtest/cxl: Add very basic sanity tests
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 tests/qtest/cxl-test.c  | 93 +++++++++++++++++++++++++++++++++++++++++
 tests/qtest/meson.build |  4 ++
 2 files changed, 97 insertions(+)
 create mode 100644 tests/qtest/cxl-test.c

diff --git a/tests/qtest/cxl-test.c b/tests/qtest/cxl-test.c
new file mode 100644
index 0000000000..00eca14faa
--- /dev/null
+++ b/tests/qtest/cxl-test.c
@@ -0,0 +1,93 @@
+/*
+ * QTest testcase for CXL
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "libqtest-single.h"
+
+#define QEMU_PXB_CMD "-machine q35 -object memory-backend-file,id=cxl-mem1," \
+                     "share,mem-path=%s,size=512M "                          \
+                     "-device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=52,uid=0,"  \
+                     "len-window-base=1,window-base[0]=0x4c0000000,memdev[0]=cxl-mem1"
+#define QEMU_RP "-device cxl-rp,id=rp0,bus=cxl.0,addr=0.0,chassis=0,slot=0"
+
+#define QEMU_T3D "-device cxl-type3,bus=rp0,memdev=cxl-mem1,id=cxl-pmem0,size=256M"
+
+static void cxl_basic_hb(void)
+{
+    qtest_start("-machine q35,cxl");
+    qtest_end();
+}
+
+static void cxl_basic_pxb(void)
+{
+    qtest_start("-machine q35 -device pxb-cxl,bus=pcie.0,uid=0");
+    qtest_end();
+}
+
+static void cxl_pxb_with_window(void)
+{
+    GString *cmdline;
+    char template[] = "/tmp/cxl-test-XXXXXX";
+    const char *tmpfs;
+
+    tmpfs = mkdtemp(template);
+
+    cmdline = g_string_new(NULL);
+    g_string_printf(cmdline, QEMU_PXB_CMD, tmpfs);
+
+    qtest_start(cmdline->str);
+    qtest_end();
+
+    g_string_free(cmdline, TRUE);
+}
+
+static void cxl_root_port(void)
+{
+    GString *cmdline;
+    char template[] = "/tmp/cxl-test-XXXXXX";
+    const char *tmpfs;
+
+    tmpfs = mkdtemp(template);
+
+    cmdline = g_string_new(NULL);
+    g_string_printf(cmdline, QEMU_PXB_CMD " %s", tmpfs, QEMU_RP);
+
+    qtest_start(cmdline->str);
+    qtest_end();
+
+    g_string_free(cmdline, TRUE);
+}
+
+static void cxl_t3d(void)
+{
+    GString *cmdline;
+    char template[] = "/tmp/cxl-test-XXXXXX";
+    const char *tmpfs;
+
+    tmpfs = mkdtemp(template);
+
+    cmdline = g_string_new(NULL);
+    g_string_printf(cmdline, QEMU_PXB_CMD " %s %s", tmpfs, QEMU_RP, QEMU_T3D);
+
+    qtest_start(cmdline->str);
+    qtest_end();
+
+    g_string_free(cmdline, TRUE);
+}
+
+int main(int argc, char **argv)
+{
+    g_test_init(&argc, &argv, NULL);
+
+    qtest_add_func("/pci/cxl/basic_hostbridge", cxl_basic_hb);
+    qtest_add_func("/pci/cxl/basic_pxb", cxl_basic_pxb);
+    qtest_add_func("/pci/cxl/pxb_with_window", cxl_pxb_with_window);
+    qtest_add_func("/pci/cxl/root_port", cxl_root_port);
+    qtest_add_func("/pci/cxl/type3_device", cxl_t3d);
+
+    return g_test_run();
+}
diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index c83bc211b6..554152b7c5 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -22,6 +22,9 @@ qtests_pci = \
   (config_all_devices.has_key('CONFIG_VGA') ? ['display-vga-test'] : []) +                  \
   (config_all_devices.has_key('CONFIG_IVSHMEM_DEVICE') ? ['ivshmem-test'] : [])
 
+qtests_cxl = \
+  (config_all_devices.has_key('CONFIG_CXL') ? ['cxl-test'] : [])
+
 qtests_i386 = \
   (slirp.found() ? ['pxe-test', 'test-netfilter'] : []) +             \
   (config_host.has_key('CONFIG_POSIX') ? ['test-filter-mirror'] : []) +                     \
@@ -48,6 +51,7 @@ qtests_i386 = \
   (config_all_devices.has_key('CONFIG_TPM_TIS_ISA') ? ['tpm-tis-swtpm-test'] : []) +        \
   (config_all_devices.has_key('CONFIG_RTL8139_PCI') ? ['rtl8139-test'] : []) +              \
   qtests_pci +                                                                              \
+  qtests_cxl +                                                                              \
   ['fdc-test',
    'ide-test',
    'hd-geo-test',
-- 
2.30.0



* [RFC PATCH v3 31/31] WIP: i386/cxl: Initialize a host bridge
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  0:59   ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Jonathan Cameron,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

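Allow the primary host bridge to be initialized as a CXL-capable host
bridge. Machine types that set cxl_supported gain a boolean "cxl" machine
property (for example "-machine q35,cxl=on") that enables it.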

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

--
This patch is WIP.
---
 hw/arm/virt.c        |  1 +
 hw/core/machine.c    | 26 ++++++++++++++++++++++++++
 hw/i386/acpi-build.c |  8 +++++++-
 hw/i386/microvm.c    |  1 +
 hw/i386/pc.c         |  1 +
 hw/ppc/spapr.c       |  2 ++
 include/hw/boards.h  |  2 ++
 include/hw/cxl/cxl.h |  4 ++++
 8 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 399da73454..fd5f5b656c 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2547,6 +2547,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
     hc->unplug_request = virt_machine_device_unplug_request_cb;
     hc->unplug = virt_machine_device_unplug_cb;
     mc->nvdimm_supported = true;
+    mc->cxl_supported = false;
     mc->auto_enable_numa_with_memhp = true;
     mc->auto_enable_numa_with_memdev = true;
     mc->default_ram_id = "mach-virt.ram";
diff --git a/hw/core/machine.c b/hw/core/machine.c
index de3b8f1b31..c739803854 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -30,6 +30,7 @@
 #include "sysemu/qtest.h"
 #include "hw/pci/pci.h"
 #include "hw/mem/nvdimm.h"
+#include "hw/cxl/cxl.h"
 #include "migration/global_state.h"
 #include "migration/vmstate.h"
 
@@ -502,6 +503,20 @@ static void machine_set_nvdimm_persistence(Object *obj, const char *value,
     nvdimms_state->persistence_string = g_strdup(value);
 }
 
+static bool machine_get_cxl(Object *obj, Error **errp)
+{
+    MachineState *ms = MACHINE(obj);
+
+    return ms->cxl_devices_state->is_enabled;
+}
+
+static void machine_set_cxl(Object *obj, bool value, Error **errp)
+{
+    MachineState *ms = MACHINE(obj);
+
+    ms->cxl_devices_state->is_enabled = value;
+}
+
 void machine_class_allow_dynamic_sysbus_dev(MachineClass *mc, const char *type)
 {
     QAPI_LIST_PREPEND(mc->allowed_dynamic_sysbus_devices, g_strdup(type));
@@ -903,6 +918,16 @@ static void machine_initfn(Object *obj)
                                         "Valid values are cpu, mem-ctrl");
     }
 
+    if (mc->cxl_supported) {
+        Object *obj = OBJECT(ms);
+
+        ms->cxl_devices_state = g_new0(CXLState, 1);
+        object_property_add_bool(obj, "cxl", machine_get_cxl, machine_set_cxl);
+        object_property_set_description(obj, "cxl",
+                                        "Set on/off to enable/disable "
+                                        "CXL instantiation");
+    }
+
     if (mc->cpu_index_to_instance_props && mc->get_default_cpu_node_id) {
         ms->numa_state = g_new0(NumaState, 1);
         object_property_add_bool(obj, "hmat",
@@ -939,6 +964,7 @@ static void machine_finalize(Object *obj)
     g_free(ms->device_memory);
     g_free(ms->nvdimms_state);
     g_free(ms->numa_state);
+    g_free(ms->cxl_devices_state);
 }
 
 bool machine_usb(MachineState *machine)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 7706856c49..2250e6d27b 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -53,6 +53,7 @@
 #include "sysemu/numa.h"
 #include "sysemu/reset.h"
 #include "hw/hyperv/vmbus-bridge.h"
+#include "hw/cxl/cxl.h"
 
 /* Supported chipsets: */
 #include "hw/southbridge/piix.h"
@@ -1277,8 +1278,13 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
         build_piix4_pci0_int(dsdt);
     } else {
         sb_scope = aml_scope("_SB");
+        /*
+         * XXX: The CXL spec calls this "CXL0", but renaming it would require
+         * changes throughout, so it stays "PCI0" even when CXL is enabled.
+         */
         dev = aml_device("PCI0");
-        init_pci_acpi(dev, 0, PCIE);
+        init_pci_acpi(dev, 0,
+                machine->cxl_devices_state->is_enabled ? CXL : PCIE);
         aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
         aml_append(sb_scope, dev);
 
diff --git a/hw/i386/microvm.c b/hw/i386/microvm.c
index edf2b0f061..970b299a69 100644
--- a/hw/i386/microvm.c
+++ b/hw/i386/microvm.c
@@ -688,6 +688,7 @@ static void microvm_class_init(ObjectClass *oc, void *data)
     mc->auto_enable_numa_with_memdev = false;
     mc->default_cpu_type = TARGET_DEFAULT_CPU_TYPE;
     mc->nvdimm_supported = false;
+    mc->cxl_supported = false;
     mc->default_ram_id = "microvm.ram";
 
     /* Avoid relying too much on kernel components */
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 5d41809b37..7350eeea9c 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1725,6 +1725,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
     hc->unplug = pc_machine_device_unplug_cb;
     mc->default_cpu_type = TARGET_DEFAULT_CPU_TYPE;
     mc->nvdimm_supported = true;
+    mc->cxl_supported = true;
     mc->default_ram_id = "pc.ram";
 
     object_class_property_add(oc, PC_MACHINE_MAX_RAM_BELOW_4G, "size",
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 6c47466fc2..9773dbd83c 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -4440,6 +4440,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
     mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power9_v2.0");
     mc->has_hotpluggable_cpus = true;
     mc->nvdimm_supported = true;
+    mc->cxl_supported = false;
     smc->resize_hpt_default = SPAPR_RESIZE_HPT_ENABLED;
     fwc->get_dev_path = spapr_get_fw_dev_path;
     nc->nmi_monitor_handler = spapr_nmi;
@@ -4600,6 +4601,7 @@ static void spapr_machine_4_2_class_options(MachineClass *mc)
     smc->default_caps.caps[SPAPR_CAP_FWNMI] = SPAPR_CAP_OFF;
     smc->rma_limit = 16 * GiB;
     mc->nvdimm_supported = false;
+    mc->cxl_supported = false;
 }
 
 DEFINE_SPAPR_MACHINE(4_2, "4.2", false);
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 17b1f3f0b9..808f73e134 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -206,6 +206,7 @@ struct MachineClass {
     bool ignore_boot_device_suffixes;
     bool smbus_no_migration_support;
     bool nvdimm_supported;
+    bool cxl_supported;
     bool numa_mem_supported;
     bool auto_enable_numa;
     const char *default_ram_id;
@@ -292,6 +293,7 @@ struct MachineState {
     CPUArchIdList *possible_cpus;
     CpuTopology smp;
     struct NVDIMMState *nvdimms_state;
+    struct CXLState *cxl_devices_state;
     struct NumaState *numa_state;
 };
 
diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
index c7ca42930f..784b4f7a04 100644
--- a/include/hw/cxl/cxl.h
+++ b/include/hw/cxl/cxl.h
@@ -22,4 +22,8 @@
 #define CXL_HOST_BASE 0xD0000000
 #define CXL_WINDOW_MAX 10
 
+typedef struct CXLState {
+    bool is_enabled;
+} CXLState;
+
 #endif
-- 
2.30.0



* [RFC PATCH v3 31/31] WIP: i386/cxl: Initialize a host bridge
@ 2021-02-02  0:59   ` Ben Widawsky
  0 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02  0:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, David Hildenbrand, Vishal Verma,
	John Groves (jgroves),
	Chris Browy, Markus Armbruster, linux-cxl,
	Philippe Mathieu-Daudé,
	Michael S. Tsirkin, Jonathan Cameron, Igor Mammedov,
	Dan Williams, Ira Weiny

This patch allows initializing the primary host bridge as a CXL capable
hostbridge.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

--
This patch is WIP.
---
 hw/arm/virt.c        |  1 +
 hw/core/machine.c    | 26 ++++++++++++++++++++++++++
 hw/i386/acpi-build.c |  8 +++++++-
 hw/i386/microvm.c    |  1 +
 hw/i386/pc.c         |  1 +
 hw/ppc/spapr.c       |  2 ++
 include/hw/boards.h  |  2 ++
 include/hw/cxl/cxl.h |  4 ++++
 8 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 399da73454..fd5f5b656c 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2547,6 +2547,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
     hc->unplug_request = virt_machine_device_unplug_request_cb;
     hc->unplug = virt_machine_device_unplug_cb;
     mc->nvdimm_supported = true;
+    mc->cxl_supported = false;
     mc->auto_enable_numa_with_memhp = true;
     mc->auto_enable_numa_with_memdev = true;
     mc->default_ram_id = "mach-virt.ram";
diff --git a/hw/core/machine.c b/hw/core/machine.c
index de3b8f1b31..c739803854 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -30,6 +30,7 @@
 #include "sysemu/qtest.h"
 #include "hw/pci/pci.h"
 #include "hw/mem/nvdimm.h"
+#include "hw/cxl/cxl.h"
 #include "migration/global_state.h"
 #include "migration/vmstate.h"
 
@@ -502,6 +503,20 @@ static void machine_set_nvdimm_persistence(Object *obj, const char *value,
     nvdimms_state->persistence_string = g_strdup(value);
 }
 
+static bool machine_get_cxl(Object *obj, Error **errp)
+{
+    MachineState *ms = MACHINE(obj);
+
+    return ms->cxl_devices_state->is_enabled;
+}
+
+static void machine_set_cxl(Object *obj, bool value, Error **errp)
+{
+    MachineState *ms = MACHINE(obj);
+
+    ms->cxl_devices_state->is_enabled = value;
+}
+
 void machine_class_allow_dynamic_sysbus_dev(MachineClass *mc, const char *type)
 {
     QAPI_LIST_PREPEND(mc->allowed_dynamic_sysbus_devices, g_strdup(type));
@@ -903,6 +918,16 @@ static void machine_initfn(Object *obj)
                                         "Valid values are cpu, mem-ctrl");
     }
 
+    if (mc->cxl_supported) {
+        Object *obj = OBJECT(ms);
+
+        ms->cxl_devices_state = g_new0(CXLState, 1);
+        object_property_add_bool(obj, "cxl", machine_get_cxl, machine_set_cxl);
+        object_property_set_description(obj, "cxl",
+                                        "Set on/off to enable/disable "
+                                        "CXL instantiation");
+    }
+
     if (mc->cpu_index_to_instance_props && mc->get_default_cpu_node_id) {
         ms->numa_state = g_new0(NumaState, 1);
         object_property_add_bool(obj, "hmat",
@@ -939,6 +964,7 @@ static void machine_finalize(Object *obj)
     g_free(ms->device_memory);
     g_free(ms->nvdimms_state);
     g_free(ms->numa_state);
+    g_free(ms->cxl_devices_state);
 }
 
 bool machine_usb(MachineState *machine)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 7706856c49..2250e6d27b 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -53,6 +53,7 @@
 #include "sysemu/numa.h"
 #include "sysemu/reset.h"
 #include "hw/hyperv/vmbus-bridge.h"
+#include "hw/cxl/cxl.h"
 
 /* Supported chipsets: */
 #include "hw/southbridge/piix.h"
@@ -1277,8 +1278,13 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
         build_piix4_pci0_int(dsdt);
     } else {
         sb_scope = aml_scope("_SB");
+        /*
+         * XXX: CXL spec calls this "CXL0", but that would require lots of
+         * changes throughout and so even for CXL enabled, we call it "PCI0"
+         */
         dev = aml_device("PCI0");
-        init_pci_acpi(dev, 0, PCIE);
+        init_pci_acpi(dev, 0,
+                machine->cxl_devices_state->is_enabled ? CXL : PCIE);
         aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
         aml_append(sb_scope, dev);
 
diff --git a/hw/i386/microvm.c b/hw/i386/microvm.c
index edf2b0f061..970b299a69 100644
--- a/hw/i386/microvm.c
+++ b/hw/i386/microvm.c
@@ -688,6 +688,7 @@ static void microvm_class_init(ObjectClass *oc, void *data)
     mc->auto_enable_numa_with_memdev = false;
     mc->default_cpu_type = TARGET_DEFAULT_CPU_TYPE;
     mc->nvdimm_supported = false;
+    mc->cxl_supported = false;
     mc->default_ram_id = "microvm.ram";
 
     /* Avoid relying too much on kernel components */
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 5d41809b37..7350eeea9c 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1725,6 +1725,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
     hc->unplug = pc_machine_device_unplug_cb;
     mc->default_cpu_type = TARGET_DEFAULT_CPU_TYPE;
     mc->nvdimm_supported = true;
+    mc->cxl_supported = true;
     mc->default_ram_id = "pc.ram";
 
     object_class_property_add(oc, PC_MACHINE_MAX_RAM_BELOW_4G, "size",
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 6c47466fc2..9773dbd83c 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -4440,6 +4440,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
     mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power9_v2.0");
     mc->has_hotpluggable_cpus = true;
     mc->nvdimm_supported = true;
+    mc->cxl_supported = false;
     smc->resize_hpt_default = SPAPR_RESIZE_HPT_ENABLED;
     fwc->get_dev_path = spapr_get_fw_dev_path;
     nc->nmi_monitor_handler = spapr_nmi;
@@ -4600,6 +4601,7 @@ static void spapr_machine_4_2_class_options(MachineClass *mc)
     smc->default_caps.caps[SPAPR_CAP_FWNMI] = SPAPR_CAP_OFF;
     smc->rma_limit = 16 * GiB;
     mc->nvdimm_supported = false;
+    mc->cxl_supported = false;
 }
 
 DEFINE_SPAPR_MACHINE(4_2, "4.2", false);
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 17b1f3f0b9..808f73e134 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -206,6 +206,7 @@ struct MachineClass {
     bool ignore_boot_device_suffixes;
     bool smbus_no_migration_support;
     bool nvdimm_supported;
+    bool cxl_supported;
     bool numa_mem_supported;
     bool auto_enable_numa;
     const char *default_ram_id;
@@ -292,6 +293,7 @@ struct MachineState {
     CPUArchIdList *possible_cpus;
     CpuTopology smp;
     struct NVDIMMState *nvdimms_state;
+    struct CXLState *cxl_devices_state;
     struct NumaState *numa_state;
 };
 
diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
index c7ca42930f..784b4f7a04 100644
--- a/include/hw/cxl/cxl.h
+++ b/include/hw/cxl/cxl.h
@@ -22,4 +22,8 @@
 #define CXL_HOST_BASE 0xD0000000
 #define CXL_WINDOW_MAX 10
 
+typedef struct CXLState {
+    bool is_enabled;
+} CXLState;
+
 #endif
-- 
2.30.0
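
A note on the hunks above: machine_initfn() allocates cxl_devices_state only
when mc->cxl_supported is set, while build_dsdt() dereferences it
unconditionally. That is fine for the PC machines this patch enables, but it
would matter if a machine type without CXL support ever reached the same code
path. A minimal sketch of a NULL-safe check follows; MachineModel and
machine_cxl_enabled() are hypothetical stand-ins for illustration and are not
part of the patch or of QEMU.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same shape as the CXLState added to include/hw/cxl/cxl.h by this patch. */
typedef struct CXLState {
    bool is_enabled;
} CXLState;

/* Hypothetical stand-in for MachineState; only the relevant field is kept. */
typedef struct MachineModel {
    /* NULL when the machine class does not set cxl_supported. */
    CXLState *cxl_devices_state;
} MachineModel;

/*
 * Hypothetical helper: reports CXL as disabled when the state was never
 * allocated, instead of dereferencing a NULL pointer.
 */
bool machine_cxl_enabled(const MachineModel *ms)
{
    return ms->cxl_devices_state != NULL &&
           ms->cxl_devices_state->is_enabled;
}
```

With a helper of this shape, the call site in build_dsdt() could select CXL or
PCIE without assuming the state has been allocated.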




* Re: [RFC PATCH v3 00/31] CXL 2.0 Support
  2021-02-02  0:59 ` Ben Widawsky
@ 2021-02-02  1:33   ` no-reply
  -1 siblings, 0 replies; 117+ messages in thread
From: no-reply @ 2021-02-02  1:33 UTC (permalink / raw)
  To: ben.widawsky
  Cc: qemu-devel, ben.widawsky, david, vishal.l.verma, jgroves, cbrowy,
	armbru, linux-cxl, f4bug, mst, Jonathan.Cameron, imammedo,
	dan.j.williams, ira.weiny

Patchew URL: https://patchew.org/QEMU/20210202005948.241655-1-ben.widawsky@intel.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20210202005948.241655-1-ben.widawsky@intel.com
Subject: [RFC PATCH v3 00/31] CXL 2.0 Support

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]         patchew/20210202005948.241655-1-ben.widawsky@intel.com -> patchew/20210202005948.241655-1-ben.widawsky@intel.com
Switched to a new branch 'test'
e26ed22 WIP: i386/cxl: Initialize a host bridge
9329c2b qtest/cxl: Add very basic sanity tests
c140fd9 hw/cxl/device: Implement get/set LSA
8ed7755 hw/cxl/device: Plumb real LSA sizing
5f683ab hw/cxl/device: Add some trivial commands
4399501 tests/acpi: Add new CEDT files
6c13c92 acpi/cxl: Create the CEDT (9.14.1)
04a874a tests/acpi: allow CEDT table addition
50f82e6 acpi/cxl: Add _OSC implementation (9.14.2)
7eb8038 hw/cxl/device: Implement MMIO HDM decoding (8.2.5.12)
ba80470 hw/cxl/device: Add a memory device (8.2.8.5)
54b9662 hw/cxl/rp: Add a root port
e70de08 hw/pxb/cxl: Add "windows" for host bridges
606831a acpi/pxb/cxl: Reserve host bridge MMIO
29a562b hw/cxl/component: Implement host bridge MMIO (8.2.5, table 142)
32e7bdd hw/pci: Plumb _UID through host bridges
6651f84 tests/acpi: remove stale allowed tables
24837fc acpi/pci: Consolidate host bridge setup
52f548c qtest: allow DSDT acpi table changes
bdcd7d9 hw/pxb: Allow creation of a CXL PXB (host bridge)
5d67d7e hw/pci/cxl: Create a CXL bus type
3b0d310 hw/pxb: Use a type for realizing expanders
5ccf850 hw/cxl/device: Add log commands (8.2.9.4) + CEL
892e722 hw/cxl/device: Timestamp implementation (8.2.9.3)
f2444bb hw/cxl/device: Add cheap EVENTS implementation (8.2.9.1)
67fa438 hw/cxl/device: Add memory device utilities
cfa875c hw/cxl/device: Implement basic mailbox (8.2.8.4)
bdd7975 hw/cxl/device: Implement the CAP array (8.2.8.1-2)
c9e87d1 hw/cxl/device: Introduce a CXL device (8.2.8)
1cc9e2a hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5)
7b0e042 hw/pci/cxl: Add a CXL component type (interface)

=== OUTPUT BEGIN ===
1/31 Checking commit 7b0e042bc22b (hw/pci/cxl: Add a CXL component type (interface))
2/31 Checking commit 1cc9e2a0a6d5 (hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5))
WARNING: line over 80 characters
#187: FILE: hw/cxl/cxl-component-utils.c:101:
+    reg_state[R_CXL_RAS_ERR_CAP_CTRL] = 0; /* CXL switches and devices must set */

WARNING: line over 80 characters
#193: FILE: hw/cxl/cxl-component-utils.c:107:
+    ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_GLOBAL_CONTROL, HDM_DECODER_ENABLE, 0);

WARNING: line over 80 characters
#406: FILE: include/hw/cxl/cxl_component.h:62:
+#define CXL_RAS_REGISTERS_OFFSET 0x80 /* Give ample space for caps before this */

WARNING: line over 80 characters
#417: FILE: include/hw/cxl/cxl_component.h:73:
+#define CXL_SEC_REGISTERS_OFFSET (CXL_RAS_REGISTERS_OFFSET + CXL_RAS_REGISTERS_SIZE)

WARNING: line over 80 characters
#421: FILE: include/hw/cxl/cxl_component.h:77:
+#define CXL_LINK_REGISTERS_OFFSET (CXL_SEC_REGISTERS_OFFSET + CXL_SEC_REGISTERS_SIZE)

WARNING: line over 80 characters
#465: FILE: include/hw/cxl/cxl_component.h:121:
+#define CXL_EXTSEC_REGISTERS_OFFSET (CXL_HDM_REGISTERS_OFFSET + CXL_HDM_REGISTERS_SIZE)

WARNING: line over 80 characters
#469: FILE: include/hw/cxl/cxl_component.h:125:
+#define CXL_IDE_REGISTERS_OFFSET (CXL_EXTSEC_REGISTERS_OFFSET + CXL_EXTSEC_REGISTERS_SIZE)

WARNING: line over 80 characters
#473: FILE: include/hw/cxl/cxl_component.h:129:
+#define CXL_SNOOP_REGISTERS_OFFSET (CXL_IDE_REGISTERS_OFFSET + CXL_IDE_REGISTERS_SIZE)

total: 0 errors, 8 warnings, 582 lines checked

Patch 2/31 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
3/31 Checking commit c9e87d150708 (hw/cxl/device: Introduce a CXL device (8.2.8))
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#36: 
new file mode 100644

WARNING: line over 80 characters
#156: FILE: include/hw/cxl/cxl_device.h:116:
+#define CXL_DEVICE_CAPABILITY_HEADER_REGISTER(n, offset)                            \

total: 0 errors, 2 warnings, 162 lines checked

Patch 3/31 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
4/31 Checking commit bdd7975aa4bc (hw/cxl/device: Implement the CAP array (8.2.8.1-2))
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#23: 
new file mode 100644

ERROR: Macros with complex values should be enclosed in parenthesis
#159: FILE: include/hw/cxl/cxl_device.h:75:
+#define CXL_MMIO_SIZE                                       \
+    CXL_DEVICE_CAP_REG_SIZE + CXL_DEVICE_REGISTERS_LENGTH + \
+        CXL_MAILBOX_REGISTERS_LENGTH

WARNING: line over 80 characters
#179: FILE: include/hw/cxl/cxl_device.h:138:
+#define cxl_device_cap_init(dstate, reg, cap_id)                                   \

WARNING: line over 80 characters
#180: FILE: include/hw/cxl/cxl_device.h:139:
+    do {                                                                           \

WARNING: line over 80 characters
#181: FILE: include/hw/cxl/cxl_device.h:140:
+        uint32_t *cap_hdrs = dstate->caps_reg_state32;                             \

WARNING: line over 80 characters
#182: FILE: include/hw/cxl/cxl_device.h:141:
+        int which = R_CXL_DEV_##reg##_CAP_HDR0;                                    \

WARNING: line over 80 characters
#183: FILE: include/hw/cxl/cxl_device.h:142:
+        cap_hdrs[which] =                                                          \

WARNING: line over 80 characters
#184: FILE: include/hw/cxl/cxl_device.h:143:
+            FIELD_DP32(cap_hdrs[which], CXL_DEV_##reg##_CAP_HDR0, CAP_ID, cap_id); \

WARNING: line over 80 characters
#185: FILE: include/hw/cxl/cxl_device.h:144:
+        cap_hdrs[which] = FIELD_DP32(                                              \

WARNING: line over 80 characters
#186: FILE: include/hw/cxl/cxl_device.h:145:
+            cap_hdrs[which], CXL_DEV_##reg##_CAP_HDR0, CAP_VERSION, 1);            \

WARNING: line over 80 characters
#187: FILE: include/hw/cxl/cxl_device.h:146:
+        cap_hdrs[which + 1] =                                                      \

WARNING: line over 80 characters
#188: FILE: include/hw/cxl/cxl_device.h:147:
+            FIELD_DP32(cap_hdrs[which + 1], CXL_DEV_##reg##_CAP_HDR1,              \

WARNING: line over 80 characters
#189: FILE: include/hw/cxl/cxl_device.h:148:
+                       CAP_OFFSET, CXL_##reg##_REGISTERS_OFFSET);                  \

WARNING: line over 80 characters
#190: FILE: include/hw/cxl/cxl_device.h:149:
+        cap_hdrs[which + 2] =                                                      \

WARNING: line over 80 characters
#191: FILE: include/hw/cxl/cxl_device.h:150:
+            FIELD_DP32(cap_hdrs[which + 2], CXL_DEV_##reg##_CAP_HDR2,              \

WARNING: line over 80 characters
#192: FILE: include/hw/cxl/cxl_device.h:151:
+                       CAP_LENGTH, CXL_##reg##_REGISTERS_LENGTH);                  \

total: 1 errors, 15 warnings, 158 lines checked

Patch 4/31 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

5/31 Checking commit cfa875c3c48b (hw/cxl/device: Implement basic mailbox (8.2.8.4))
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#191: 
new file mode 100644

ERROR: space prohibited between function name and open parenthesis '('
#264: FILE: hw/cxl/cxl-mailbox-utils.c:69:
+typedef ret_code (*opcode_handler)(struct cxl_cmd *cmd,

total: 1 errors, 1 warnings, 416 lines checked

Patch 5/31 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

6/31 Checking commit 67fa4383326f (hw/cxl/device: Add memory device utilities)
7/31 Checking commit f2444bba9cd7 (hw/cxl/device: Add cheap EVENTS implementation (8.2.9.1))
8/31 Checking commit 892e722ede17 (hw/cxl/device: Timestamp implementation (8.2.9.3))
9/31 Checking commit 5ccf850db462 (hw/cxl/device: Add log commands (8.2.9.4) + CEL)
10/31 Checking commit 3b0d3108b26c (hw/pxb: Use a type for realizing expanders)
11/31 Checking commit 5d67d7eb3d82 (hw/pci/cxl: Create a CXL bus type)
12/31 Checking commit bdcd7d995e9f (hw/pxb: Allow creation of a CXL PXB (host bridge))
13/31 Checking commit 52f548ca385d (qtest: allow DSDT acpi table changes)
14/31 Checking commit 24837fc1bb0e (acpi/pci: Consolidate host bridge setup)
15/31 Checking commit 6651f845de76 (tests/acpi: remove stale allowed tables)
16/31 Checking commit 32e7bdd7607d (hw/pci: Plumb _UID through host bridges)
WARNING: line over 80 characters
#113: FILE: hw/pci-bridge/pci_expander_bridge.c:422:
+        error_setg(errp, "pxb-cxl devices must have a valid uid (0-2147483647)");

total: 0 errors, 1 warnings, 113 lines checked

Patch 16/31 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
17/31 Checking commit 29a562ba112f (hw/cxl/component: Implement host bridge MMIO (8.2.5, table 142))
18/31 Checking commit 606831a33eda (acpi/pxb/cxl: Reserve host bridge MMIO)
19/31 Checking commit e70de0847b95 (hw/pxb/cxl: Add "windows" for host bridges)
WARNING: line over 80 characters
#133: FILE: hw/pci-bridge/pci_expander_bridge.c:516:
+        warn_report("memory-windows should be set when creating CXL host bridges");

total: 0 errors, 1 warnings, 127 lines checked

Patch 19/31 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
20/31 Checking commit 54b96623ffd8 (hw/cxl/rp: Add a root port)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#42: 
new file mode 100644

total: 0 errors, 1 warnings, 268 lines checked

Patch 20/31 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
21/31 Checking commit ba804700c6a6 (hw/cxl/device: Add a memory device (8.2.8.5))
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#155: 
new file mode 100644

WARNING: line over 80 characters
#272: FILE: hw/mem/cxl_type3.c:113:
+                   "Not enough free space (%zd) required for device (%" PRId64  ")",

total: 0 errors, 2 warnings, 501 lines checked

Patch 21/31 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
22/31 Checking commit 7eb80384a516 (hw/cxl/device: Implement MMIO HDM decoding (8.2.5.12))
WARNING: line over 80 characters
#92: FILE: hw/mem/cxl_type3.c:113:
+static void ct3d_reg_write(void *opaque, hwaddr offset, uint64_t value, unsigned size)

total: 0 errors, 1 warnings, 114 lines checked

Patch 22/31 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
23/31 Checking commit 50f82e6dddb1 (acpi/cxl: Add _OSC implementation (9.14.2))
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#45: 
new file mode 100644

WARNING: Block comments use a leading /* on a separate line
#188: FILE: hw/i386/acpi-build.c:1210:
+    } else /* CXL */ {

total: 0 errors, 2 warnings, 176 lines checked

Patch 23/31 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
24/31 Checking commit 04a874a8d982 (tests/acpi: allow CEDT table addition)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#16: 
new file mode 100644

total: 0 errors, 1 warnings, 3 lines checked

Patch 24/31 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
25/31 Checking commit 6c13c9205746 (acpi/cxl: Create the CEDT (9.14.1))
26/31 Checking commit 43995014aa28 (tests/acpi: Add new CEDT files)
27/31 Checking commit 5f683ab6ee1a (hw/cxl/device: Add some trivial commands)
28/31 Checking commit 8ed7755c7a36 (hw/cxl/device: Plumb real LSA sizing)
29/31 Checking commit c140fd9d4517 (hw/cxl/device: Implement get/set LSA)
30/31 Checking commit 9329c2b72e7f (qtest/cxl: Add very basic sanity tests)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#15: 
new file mode 100644

WARNING: line over 80 characters
#36: FILE: tests/qtest/cxl-test.c:17:
+#define QEMU_T3D "-device cxl-type3,bus=rp0,memdev=cxl-mem1,id=cxl-pmem0,size=256M"

total: 0 errors, 2 warnings, 109 lines checked

Patch 30/31 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
31/31 Checking commit e26ed228d062 (WIP: i386/cxl: Initialize a host bridge)
=== OUTPUT END ===

Test command exited with code: 1
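
For context on the ERROR reported for CXL_MMIO_SIZE in patch 4/31: a macro
whose value is an unparenthesized expression is expanded textually, so
operators at the use site can bind more tightly than the additions inside the
macro. The sketch below shows the effect. The register sizes and macro names
are hypothetical values chosen for illustration, not the ones in
cxl_device.h.

```c
#include <stddef.h>

/* Hypothetical sizes, chosen only for illustration. */
#define CAP_REG_SIZE 0x10
#define DEV_REG_LEN  0x08
#define MBOX_REG_LEN 0x20

/* Unparenthesized: the use site's operators bind before the additions. */
#define MMIO_SIZE_BAD CAP_REG_SIZE + DEV_REG_LEN + MBOX_REG_LEN

/* Parenthesized: the macro always evaluates as a single value. */
#define MMIO_SIZE_GOOD (CAP_REG_SIZE + DEV_REG_LEN + MBOX_REG_LEN)

/* Expands to 2 * 0x10 + 0x08 + 0x20, which is 0x48. */
size_t doubled_bad(void)
{
    return 2 * MMIO_SIZE_BAD;
}

/* Expands to 2 * (0x10 + 0x08 + 0x20), which is 0x70. */
size_t doubled_good(void)
{
    return 2 * MMIO_SIZE_GOOD;
}
```

Wrapping the macro body in parentheses is enough to silence the checkpatch
ERROR and avoids this class of surprise.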


The full log is available at
http://patchew.org/logs/20210202005948.241655-1-ben.widawsky@intel.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com


* Re: [RFC PATCH v3 02/31] hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5)
  2021-02-02  0:59   ` Ben Widawsky
@ 2021-02-02 11:48     ` Jonathan Cameron
  -1 siblings, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2021-02-02 11:48 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: qemu-devel, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Marcel Apfelbaum,
	Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

On Mon, 1 Feb 2021 16:59:19 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> A CXL 2.0 component is any entity in the CXL topology. All components
> have a analogous function in PCIe. Except for the CXL host bridge, all
> have a PCIe config space that is accessible via the common PCIe
> mechanisms. CXL components are enumerated via DVSEC fields in the
> extended PCIe header space. CXL components will minimally implement some
> subset of CXL.mem and CXL.cache registers defined in 8.2.5 of the CXL
> 2.0 specification. Two headers and a utility library are introduced to
> support the minimum functionality needed to enumerate components.
> 
> The cxl_pci header manages bits associated with PCI, specifically the
> DVSEC and related fields. The cxl_component.h variant has data
> structures and APIs that are useful for drivers implementing any of the
> CXL 2.0 components. The library takes care of making use of the DVSEC
> bits and the CXL.[mem|cache] registers. Per spec, the registers are
> little endian.
> 
> None of the mechanisms required to enumerate a CXL capable hostbridge
> are introduced at this point.
> 
> Note that the CXL.mem and CXL.cache registers used are always 4B wide.
> It's possible in the future that this constraint will not hold.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

A few minor discrepancies from the spec, + naming suggestions.

Otherwise LGTM.

> ---
>  MAINTAINERS                    |   6 +
>  hw/Kconfig                     |   1 +
>  hw/cxl/Kconfig                 |   3 +
>  hw/cxl/cxl-component-utils.c   | 208 +++++++++++++++++++++++++++++++++
>  hw/cxl/meson.build             |   3 +
>  hw/meson.build                 |   1 +
>  include/hw/cxl/cxl.h           |  17 +++
>  include/hw/cxl/cxl_component.h | 187 +++++++++++++++++++++++++++++
>  include/hw/cxl/cxl_pci.h       | 138 ++++++++++++++++++++++
>  9 files changed, 564 insertions(+)
>  create mode 100644 hw/cxl/Kconfig
>  create mode 100644 hw/cxl/cxl-component-utils.c
>  create mode 100644 hw/cxl/meson.build
>  create mode 100644 include/hw/cxl/cxl.h
>  create mode 100644 include/hw/cxl/cxl_component.h
>  create mode 100644 include/hw/cxl/cxl_pci.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index bcd88668bc..981dc92e25 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -2234,6 +2234,12 @@ F: qapi/block*.json
>  F: qapi/transaction.json
>  T: git https://repo.or.cz/qemu/armbru.git block-next
>  
> +Compute Express Link
> +M: Ben Widawsky <ben.widawsky@intel.com>
> +S: Supported
> +F: hw/cxl/
> +F: include/hw/cxl/
> +
>  Dirty Bitmaps
>  M: Eric Blake <eblake@redhat.com>
>  M: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> diff --git a/hw/Kconfig b/hw/Kconfig
> index 5ad3c6b5a4..c03650c5ed 100644
> --- a/hw/Kconfig
> +++ b/hw/Kconfig
> @@ -6,6 +6,7 @@ source audio/Kconfig
>  source block/Kconfig
>  source char/Kconfig
>  source core/Kconfig
> +source cxl/Kconfig
>  source display/Kconfig
>  source dma/Kconfig
>  source gpio/Kconfig
> diff --git a/hw/cxl/Kconfig b/hw/cxl/Kconfig
> new file mode 100644
> index 0000000000..8e67519b16
> --- /dev/null
> +++ b/hw/cxl/Kconfig
> @@ -0,0 +1,3 @@
> +config CXL
> +    bool
> +    default y if PCI_EXPRESS
> diff --git a/hw/cxl/cxl-component-utils.c b/hw/cxl/cxl-component-utils.c
> new file mode 100644
> index 0000000000..8d56ad5c7d
> --- /dev/null
> +++ b/hw/cxl/cxl-component-utils.c
> @@ -0,0 +1,208 @@
> +/*
> + * CXL Utility library for components
> + *
> + * Copyright(C) 2020 Intel Corporation.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See the
> + * COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu/log.h"
> +#include "hw/pci/pci.h"
> +#include "hw/cxl/cxl.h"
> +
> +static uint64_t cxl_cache_mem_read_reg(void *opaque, hwaddr offset,
> +                                       unsigned size)
> +{
> +    CXLComponentState *cxl_cstate = opaque;
> +    ComponentRegisters *cregs = &cxl_cstate->crb;
> +
> +    assert(size == 4);
> +
> +    if (cregs->special_ops && cregs->special_ops->read) {
> +        return cregs->special_ops->read(cxl_cstate, offset, size);
> +    } else {
> +        return cregs->cache_mem_registers[offset / 4];
> +    }
> +}
> +
> +static void cxl_cache_mem_write_reg(void *opaque, hwaddr offset, uint64_t value,
> +                                    unsigned size)
> +{
> +    CXLComponentState *cxl_cstate = opaque;
> +    ComponentRegisters *cregs = &cxl_cstate->crb;
> +
> +    assert(size == 4);
> +
> +    if (cregs->special_ops && cregs->special_ops->write) {
> +        cregs->special_ops->write(cxl_cstate, offset, value, size);
> +    } else {
> +        cregs->cache_mem_registers[offset / 4] = value;
> +    }
> +}
> +
> +/*
> + * 8.2.3
> + *   The access restrictions specified in Section 8.2.2 also apply to CXL 2.0
> + *   Component Registers.
> + *
> + * 8.2.2
> + *   • A 32 bit register shall be accessed as a 4 Bytes quantity. Partial
> + *   reads are not permitted.
> + *   • A 64 bit register shall be accessed as a 8 Bytes quantity. Partial
> + *   reads are not permitted.
> + *
> + * As of the spec defined today, only 4 byte registers exist.
> + */
> +static const MemoryRegionOps cache_mem_ops = {
> +    .read = cxl_cache_mem_read_reg,
> +    .write = cxl_cache_mem_write_reg,
> +    .endianness = DEVICE_LITTLE_ENDIAN,
> +    .valid = {
> +        .min_access_size = 4,
> +        .max_access_size = 4,
> +        .unaligned = false,
> +    },
> +    .impl = {
> +        .min_access_size = 4,
> +        .max_access_size = 4,
> +    },
> +};
> +
> +void cxl_component_register_block_init(Object *obj,
> +                                       CXLComponentState *cxl_cstate,
> +                                       const char *type)
> +{
> +    ComponentRegisters *cregs = &cxl_cstate->crb;
> +
> +    memory_region_init(&cregs->component_registers, obj, type,
> +                       CXL2_COMPONENT_BLOCK_SIZE);
> +
> +    /* io registers controls link which we don't care about in QEMU */
> +    memory_region_init_io(&cregs->io, obj, NULL, cregs, ".io",
> +                          CXL2_COMPONENT_IO_REGION_SIZE);
> +    memory_region_init_io(&cregs->cache_mem, obj, &cache_mem_ops, cregs,
> +                          ".cache_mem", CXL2_COMPONENT_CM_REGION_SIZE);
> +
> +    memory_region_add_subregion(&cregs->component_registers, 0, &cregs->io);
> +    memory_region_add_subregion(&cregs->component_registers,
> +                                CXL2_COMPONENT_IO_REGION_SIZE,
> +                                &cregs->cache_mem);
> +}
> +
> +static void ras_init_common(uint32_t *reg_state)
> +{
> +    reg_state[R_CXL_RAS_UNC_ERR_STATUS] = 0;
> +    reg_state[R_CXL_RAS_UNC_ERR_MASK] = 0x1efff;  
This should be bits 0-11 plus bits 14-16, I believe, i.e. 0x1cfff.
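
A quick standalone check of that value, assuming the mask is meant to cover
bits 0-11 plus bits 14-16 as described above. BIT_RANGE and RAS_UNC_ERR_BITS
are made-up names for illustration, not from the patch or from QEMU:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: mask with bits lo..hi (inclusive) set. */
#define BIT_RANGE(lo, hi) \
    ((uint32_t)(((UINT64_C(1) << ((hi) - (lo) + 1)) - 1) << (lo)))

/* Bits 0-11 plus bits 14-16, per the review comment. */
#define RAS_UNC_ERR_BITS (BIT_RANGE(0, 11) | BIT_RANGE(14, 16))

_Static_assert(RAS_UNC_ERR_BITS == 0x1cfff, "unexpected RAS mask");
/* The value in the patch (0x1efff) differs only in bit 13. */
_Static_assert((0x1efff ^ RAS_UNC_ERR_BITS) == (1u << 13), "expected bit 13");
```

So the only difference between the two constants is bit 13, which the patch
sets and the suggested value leaves clear.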

> +    reg_state[R_CXL_RAS_UNC_ERR_SEVERITY] = 0x1efff;
0x1cfff as well

> +    reg_state[R_CXL_RAS_COR_ERR_STATUS] = 0;
> +    reg_state[R_CXL_RAS_COR_ERR_MASK] = 0x3f;
> +    reg_state[R_CXL_RAS_ERR_CAP_CTRL] = 0; /* CXL switches and devices must set */
> +}
> +
> +static void hdm_init_common(uint32_t *reg_state)
> +{
> +    ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY, DECODER_COUNT, 0);
> +    ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_GLOBAL_CONTROL, HDM_DECODER_ENABLE, 0);
> +}
> +
> +void cxl_component_register_init_common(uint32_t *reg_state, enum reg_type type)
> +{
> +    int caps = 0;
> +    switch (type) {
> +    case CXL2_DOWNSTREAM_PORT:
> +    case CXL2_DEVICE:
> +        /* CAP, RAS, Link */
> +        caps = 2;
> +        break;
> +    case CXL2_UPSTREAM_PORT:
> +    case CXL2_TYPE3_DEVICE:
> +    case CXL2_LOGICAL_DEVICE:
> +        /* + HDM */
> +        caps = 3;
> +        break;
> +    case CXL2_ROOT_PORT:
> +        /* + Extended Security, + Snoop */
> +        caps = 5;
> +        break;
> +    default:
> +        abort();
> +    }
> +
> +    memset(reg_state, 0, 0x1000);

Better to pass the size in so it's apparent where that came from?
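
Roughly what I have in mind, as a standalone sketch rather than a proposed
change to the patch. reg_state_clear and CM_REGION_SIZE are hypothetical
names; the latter is assumed to mirror CXL2_COMPONENT_CM_REGION_SIZE:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed to mirror CXL2_COMPONENT_CM_REGION_SIZE (0x1000) from the patch. */
#define CM_REGION_SIZE 0x1000

/*
 * Hypothetical variant of the clearing step: the caller states how many
 * bytes reg_state spans, so the size is visible at the call site.
 */
static void reg_state_clear(uint32_t *reg_state, size_t size)
{
    /* The register file is an array of 32 bit registers. */
    assert(size % sizeof(uint32_t) == 0);
    memset(reg_state, 0, size);
}
```

A caller would then pass sizeof() of its register array, which ties the
length to the array definition instead of to a bare 0x1000.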

> +
> +    /* CXL Capability Header Register */
> +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ID, 1);
> +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, VERSION, 1);
> +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, CACHE_MEM_VERSION, 1);
> +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ARRAY_SIZE, caps);
> +
> +
> +#define init_cap_reg(reg, id, version)                                        \
> +    _Static_assert(CXL_##reg##_REGISTERS_OFFSET != 0, "Invalid cap offset\n");\
> +    do {                                                                      \
> +        int which = R_CXL_##reg##_CAPABILITY_HEADER;                          \
> +        reg_state[which] = FIELD_DP32(reg_state[which],                       \
> +                                      CXL_##reg##_CAPABILITY_HEADER, ID, id); \
> +        reg_state[which] =                                                    \
> +            FIELD_DP32(reg_state[which], CXL_##reg##_CAPABILITY_HEADER,       \
> +                       VERSION, version);                                     \
> +        reg_state[which] =                                                    \
> +            FIELD_DP32(reg_state[which], CXL_##reg##_CAPABILITY_HEADER, PTR,  \
> +                       CXL_##reg##_REGISTERS_OFFSET);                         \
> +    } while (0)
> +
> +    init_cap_reg(RAS, 2, 1);
> +    ras_init_common(reg_state);
> +
> +    init_cap_reg(LINK, 4, 2);
> +
> +    if (caps < 3) {

It strikes me that this approach of basing it purely on the number of caps is
not particularly flexible or maintainable, but I guess it will do until we need
something more sophisticated (i.e. when they aren't a series of expanding
inclusive sets of entries).
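
One possible shape for the "more sophisticated" version, purely as a sketch:
describe each component type by a bitmask of capabilities and derive the
count from it. The CAP_* flags, caps_for_type and cap_count are hypothetical
names; the per-type sets are taken from the switch statement in the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Same component types as enum reg_type in the patch. */
enum reg_type {
    CXL2_DEVICE,
    CXL2_TYPE3_DEVICE,
    CXL2_LOGICAL_DEVICE,
    CXL2_ROOT_PORT,
    CXL2_UPSTREAM_PORT,
    CXL2_DOWNSTREAM_PORT
};

/* Hypothetical flags, one bit per capability structure. */
#define CAP_RAS    (1u << 0)
#define CAP_LINK   (1u << 1)
#define CAP_HDM    (1u << 2)
#define CAP_EXTSEC (1u << 3)
#define CAP_SNOOP  (1u << 4)

/* Which capabilities a component type exposes. */
static uint32_t caps_for_type(enum reg_type type)
{
    switch (type) {
    case CXL2_DOWNSTREAM_PORT:
    case CXL2_DEVICE:
        return CAP_RAS | CAP_LINK;
    case CXL2_UPSTREAM_PORT:
    case CXL2_TYPE3_DEVICE:
    case CXL2_LOGICAL_DEVICE:
        return CAP_RAS | CAP_LINK | CAP_HDM;
    case CXL2_ROOT_PORT:
        return CAP_RAS | CAP_LINK | CAP_HDM | CAP_EXTSEC | CAP_SNOOP;
    }
    assert(!"unknown component type");
    return 0;
}

/* Number of set bits, i.e. the value that would go into ARRAY_SIZE. */
static unsigned cap_count(uint32_t caps)
{
    unsigned n = 0;

    for (; caps; caps &= caps - 1) {
        n++;
    }
    return n;
}
```

The init code could then test individual bits (caps & CAP_HDM and so on)
rather than comparing against 3 or 5, so a type with, say, HDM but no Link
capability would not need special handling.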

> +        return;
> +    }
> +
> +    init_cap_reg(HDM, 5, 1);
> +    hdm_init_common(reg_state);
> +
> +    if (caps < 5) {
> +        return;
> +    }
> +
> +    init_cap_reg(EXTSEC, 6, 1);
> +    init_cap_reg(SNOOP, 8, 1);
> +
> +#undef init_cap_reg
> +}
> +
> +/*
> + * Helper to creates a DVSEC header for a CXL entity. The caller is responsible
> + * for tracking the valid offset.
> + *
> + * This function will build the DVSEC header on behalf of the caller and then
> + * copy in the remaining data for the vendor specific bits.
> + */
> +void cxl_component_create_dvsec(CXLComponentState *cxl, uint16_t length,
> +                                uint16_t type, uint8_t rev, uint8_t *body)
> +{
> +    PCIDevice *pdev = cxl->pdev;
> +    uint16_t offset = cxl->dvsec_offset;
> +
> +    assert(offset >= PCI_CFG_SPACE_SIZE &&
> +           ((offset + length) < PCI_CFG_SPACE_EXP_SIZE));
> +    assert((length & 0xf000) == 0);
> +    assert((rev & ~0xf) == 0);
> +
> +    /* Create the DVSEC in the MCFG space */
> +    pcie_add_capability(pdev, PCI_EXT_CAP_ID_DVSEC, 1, offset, length);
> +    pci_set_long(pdev->config + offset + PCIE_DVSEC_HEADER1_OFFSET,
> +                 (length << 20) | (rev << 16) | CXL_VENDOR_ID);
> +    pci_set_word(pdev->config + offset + PCIE_DVSEC_ID_OFFSET, type);
> +    memcpy(pdev->config + offset + sizeof(struct dvsec_header),
> +           body + sizeof(struct dvsec_header),
> +           length - sizeof(struct dvsec_header));
> +
> +    /* Update state for future DVSEC additions */
> +    range_init_nofail(&cxl->dvsecs[type], cxl->dvsec_offset, length);
> +    cxl->dvsec_offset += length;
> +}
> diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build
> new file mode 100644
> index 0000000000..00c3876a0f
> --- /dev/null
> +++ b/hw/cxl/meson.build
> @@ -0,0 +1,3 @@
> +softmmu_ss.add(when: 'CONFIG_CXL', if_true: files(
> +  'cxl-component-utils.c',
> +))
> diff --git a/hw/meson.build b/hw/meson.build
> index 010de7219c..3e440c341a 100644
> --- a/hw/meson.build
> +++ b/hw/meson.build
> @@ -6,6 +6,7 @@ subdir('block')
>  subdir('char')
>  subdir('core')
>  subdir('cpu')
> +subdir('cxl')
>  subdir('display')
>  subdir('dma')
>  subdir('gpio')
> diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
> new file mode 100644
> index 0000000000..55f6cc30a5
> --- /dev/null
> +++ b/include/hw/cxl/cxl.h
> @@ -0,0 +1,17 @@
> +/*
> + * QEMU CXL Support
> + *
> + * Copyright (c) 2020 Intel
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See the
> + * COPYING file in the top-level directory.
> + */
> +
> +#ifndef CXL_H
> +#define CXL_H
> +
> +#include "cxl_pci.h"
> +#include "cxl_component.h"
> +
> +#endif
> +
> diff --git a/include/hw/cxl/cxl_component.h b/include/hw/cxl/cxl_component.h
> new file mode 100644
> index 0000000000..762feb54da
> --- /dev/null
> +++ b/include/hw/cxl/cxl_component.h
> @@ -0,0 +1,187 @@
> +/*
> + * QEMU CXL Component
> + *
> + * Copyright (c) 2020 Intel
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See the
> + * COPYING file in the top-level directory.
> + */
> +
> +#ifndef CXL_COMPONENT_H
> +#define CXL_COMPONENT_H
> +
> +/* CXL 2.0 - 8.2.4 */
> +#define CXL2_COMPONENT_IO_REGION_SIZE 0x1000
> +#define CXL2_COMPONENT_CM_REGION_SIZE 0x1000
> +#define CXL2_COMPONENT_BLOCK_SIZE 0x10000
> +
> +#include "qemu/range.h"
> +#include "qemu/typedefs.h"
> +#include "hw/register.h"
> +
> +enum reg_type {
> +    CXL2_DEVICE,
> +    CXL2_TYPE3_DEVICE,
> +    CXL2_LOGICAL_DEVICE,
> +    CXL2_ROOT_PORT,
> +    CXL2_UPSTREAM_PORT,
> +    CXL2_DOWNSTREAM_PORT
> +};
> +
> +/*
> + * Capability registers are defined at the top of the CXL.cache/mem region and
> + * are packed. For our purposes we will always define the caps in the same
> + * order.
> + * CXL 2.0 - 8.2.5 Table 142 for details.
> + */
> +
> +/* CXL 2.0 - 8.2.5.1 */
> +REG32(CXL_CAPABILITY_HEADER, 0)
> +    FIELD(CXL_CAPABILITY_HEADER, ID, 0, 16)
> +    FIELD(CXL_CAPABILITY_HEADER, VERSION, 16, 4)
> +    FIELD(CXL_CAPABILITY_HEADER, CACHE_MEM_VERSION, 20, 4)
> +    FIELD(CXL_CAPABILITY_HEADER, ARRAY_SIZE, 24, 8)
> +
> +#define CXLx_CAPABILITY_HEADER(type, offset)                  \
> +    REG32(CXL_##type##_CAPABILITY_HEADER, offset)             \
> +        FIELD(CXL_##type##_CAPABILITY_HEADER, ID, 0, 16)      \
> +        FIELD(CXL_##type##_CAPABILITY_HEADER, VERSION, 16, 4) \
> +        FIELD(CXL_##type##_CAPABILITY_HEADER, PTR, 20, 12)
> +CXLx_CAPABILITY_HEADER(RAS, 0x4)
> +CXLx_CAPABILITY_HEADER(LINK, 0x8)
> +CXLx_CAPABILITY_HEADER(HDM, 0xc)
> +CXLx_CAPABILITY_HEADER(EXTSEC, 0x10)
> +CXLx_CAPABILITY_HEADER(SNOOP, 0x14)
> +
> +/*
> + * Capability structures contain the actual registers that the CXL component
> + * implements. Some of these are specific to certain types of components, but
> + * this implementation leaves enough space regardless.
> + */
> +/* 8.2.5.9 - CXL RAS Capability Structure */
> +#define CXL_RAS_REGISTERS_OFFSET 0x80 /* Give ample space for caps before this */
> +#define CXL_RAS_REGISTERS_SIZE   0x58
> +REG32(CXL_RAS_UNC_ERR_STATUS, CXL_RAS_REGISTERS_OFFSET)
> +REG32(CXL_RAS_UNC_ERR_MASK, CXL_RAS_REGISTERS_OFFSET + 0x4)
> +REG32(CXL_RAS_UNC_ERR_SEVERITY, CXL_RAS_REGISTERS_OFFSET + 0x8)
> +REG32(CXL_RAS_COR_ERR_STATUS, CXL_RAS_REGISTERS_OFFSET + 0xc)
> +REG32(CXL_RAS_COR_ERR_MASK, CXL_RAS_REGISTERS_OFFSET + 0x10)
> +REG32(CXL_RAS_ERR_CAP_CTRL, CXL_RAS_REGISTERS_OFFSET + 0x14)
> +/* Offset 0x18 - 0x58 reserved for RAS logs */
> +
> +/* 8.2.5.10 - CXL Security Capability Structure */
> +#define CXL_SEC_REGISTERS_OFFSET (CXL_RAS_REGISTERS_OFFSET + CXL_RAS_REGISTERS_SIZE)
> +#define CXL_SEC_REGISTERS_SIZE   0 /* We don't implement 1.1 downstream ports */
> +
> +/* 8.2.5.11 - CXL Link Capability Structure */
> +#define CXL_LINK_REGISTERS_OFFSET (CXL_SEC_REGISTERS_OFFSET + CXL_SEC_REGISTERS_SIZE)
> +#define CXL_LINK_REGISTERS_SIZE   0x38
> +
> +/* 8.2.5.12 - CXL HDM Decoder Capability Structure */
> +#define HDM_DECODE_MAX 10 /* 8.2.5.12.1 */
> +#define CXL_HDM_REGISTERS_OFFSET \
> +    (CXL_LINK_REGISTERS_OFFSET + CXL_LINK_REGISTERS_SIZE) /* 8.2.5.12 */

The positioning of this isn't really defined by 8.2.5.12 that I can see. So I'd drop
that trailing comment.

> +#define CXL_HDM_REGISTERS_SIZE (0x20 + HDM_DECODE_MAX * 10)

This doesn't look quite right. I think it should be 0x10 + HDM_DECODE_MAX * 0x20:
offset to decoder 0 + number of HDM decoders * size of each decoder description.
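
To put numbers on it, a standalone comparison of the two expressions. The
HDM_* names below are illustrative only; the 0x10 and 0x20 values are the
ones suggested above, not independently checked against the spec here:

```c
#include <assert.h>

#define HDM_DECODE_MAX      10   /* from the patch */
#define HDM_DECODER0_OFFSET 0x10 /* assumed offset of decoder 0 in the block */
#define HDM_DECODER_STRIDE  0x20 /* assumed bytes per decoder description */

/* Expression currently in the patch. */
#define HDM_REGISTERS_SIZE_PATCH (0x20 + HDM_DECODE_MAX * 10)

/* Expression suggested in the review. */
#define HDM_REGISTERS_SIZE_SUGGESTED \
    (HDM_DECODER0_OFFSET + HDM_DECODE_MAX * HDM_DECODER_STRIDE)

_Static_assert(HDM_REGISTERS_SIZE_PATCH == 0x84, "patch value");
_Static_assert(HDM_REGISTERS_SIZE_SUGGESTED == 0x150, "suggested value");
```

With 10 decoders the patch reserves 0x84 bytes, whereas the suggested layout
needs 0x150.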


> +#define HDM_DECODER_INIT(n)                                                    \
> +  REG32(CXL_HDM_DECODER##n##_BASE_LO,                                          \
> +        CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x10)                          \
This might be easier to read if you define something like
#define CXL_HDM_REGS_DECODER0_OFFSET (CXL_HDM_REGISTERS_OFFSET + 0x10)
and then use that for the base.

> +            FIELD(CXL_HDM_DECODER##n##_BASE_LO, L, 28, 4)                      \
> +  REG32(CXL_HDM_DECODER##n##_BASE_HI,                                          \
> +        CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x14)                          \
> +  REG32(CXL_HDM_DECODER##n##_SIZE_LO,                                          \
> +        CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x18)                          \
> +  REG32(CXL_HDM_DECODER##n##_SIZE_HI,                                          \
> +        CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x1C)                          \
> +  REG32(CXL_HDM_DECODER##n##_CTRL,                                             \
> +        CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x20)                          \
> +            FIELD(CXL_HDM_DECODER##n##_CTRL, IG, 0, 4)                         \
> +            FIELD(CXL_HDM_DECODER##n##_CTRL, IW, 4, 4)                         \
> +            FIELD(CXL_HDM_DECODER##n##_CTRL, LOCK_ON_COMMIT, 8, 1)             \
> +            FIELD(CXL_HDM_DECODER##n##_CTRL, COMMIT, 9, 1)                     \
> +            FIELD(CXL_HDM_DECODER##n##_CTRL, COMMITTED, 10, 1)                 \
> +            FIELD(CXL_HDM_DECODER##n##_CTRL, ERROR, 11, 1)                     \
> +            FIELD(CXL_HDM_DECODER##n##_CTRL, TYPE, 12, 1)                      \
> +  REG32(CXL_HDM_DECODER##n##_TARGET_LIST_LO, 0x24)                             \
The offset should, I think, be per 'n'.

> +  REG32(CXL_HDM_DECODER##n##_TARGET_LIST_HI, 0x28)

Hmm. There is a hole here in the spec.  Probably needs a reserved 4 bytes
at CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x2c given next entry is at 0x30

Should be added to 8.2.5.12 table
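
Pulling the comments on this macro together, a standalone sketch of
per-decoder offsets with a named decoder 0 base, a per-'n' target list and
the reserved dword. All HDM_DECODER_* names are hypothetical, and offsets are
relative to the start of the HDM capability structure (i.e. without
CXL_HDM_REGISTERS_OFFSET added):

```c
#include <assert.h>

#define HDM_DECODER0_OFFSET 0x10 /* first decoder within the HDM block */
#define HDM_DECODER_STRIDE  0x20 /* distance between consecutive decoders */

/* Start of decoder n; every register below is relative to this. */
#define HDM_DECODER_BASE(n) (HDM_DECODER0_OFFSET + HDM_DECODER_STRIDE * (n))

#define HDM_DECODER_BASE_LO(n)        (HDM_DECODER_BASE(n) + 0x00)
#define HDM_DECODER_BASE_HI(n)        (HDM_DECODER_BASE(n) + 0x04)
#define HDM_DECODER_SIZE_LO(n)        (HDM_DECODER_BASE(n) + 0x08)
#define HDM_DECODER_SIZE_HI(n)        (HDM_DECODER_BASE(n) + 0x0c)
#define HDM_DECODER_CTRL(n)           (HDM_DECODER_BASE(n) + 0x10)
#define HDM_DECODER_TARGET_LIST_LO(n) (HDM_DECODER_BASE(n) + 0x14)
#define HDM_DECODER_TARGET_LIST_HI(n) (HDM_DECODER_BASE(n) + 0x18)
#define HDM_DECODER_RSVD(n)           (HDM_DECODER_BASE(n) + 0x1c)

/* Decoder 0 matches the offsets quoted in the thread. */
_Static_assert(HDM_DECODER_TARGET_LIST_LO(0) == 0x24, "target list lo");
_Static_assert(HDM_DECODER_TARGET_LIST_HI(0) == 0x28, "target list hi");
_Static_assert(HDM_DECODER_RSVD(0) == 0x2c, "reserved dword");
/* The reserved dword pads each decoder so the next one starts at 0x30. */
_Static_assert(HDM_DECODER_BASE_LO(1) == 0x30, "decoder 1 base");
```

In the patch itself these would be wrapped in REG32() with
CXL_HDM_REGISTERS_OFFSET added, but the arithmetic is the same.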

> +
> +REG32(CXL_HDM_DECODER_CAPABILITY, CXL_HDM_REGISTERS_OFFSET)
> +    FIELD(CXL_HDM_DECODER_CAPABILITY, DECODER_COUNT, 0, 4)
> +    FIELD(CXL_HDM_DECODER_CAPABILITY, TARGET_COUNT, 4, 4)
> +    FIELD(CXL_HDM_DECODER_CAPABILITY, INTERLEAVE_256B, 8, 1)
> +    FIELD(CXL_HDM_DECODER_CAPABILITY, INTELEAVE_4K, 9, 1)
> +    FIELD(CXL_HDM_DECODER_CAPABILITY, POISON_ON_ERR_CAP, 10, 1)
> +REG32(CXL_HDM_DECODER_GLOBAL_CONTROL, CXL_HDM_REGISTERS_OFFSET + 4) 

I'd be consistent on using hex for all offsets (even when it clearly makes no
difference like here!)

> +    FIELD(CXL_HDM_DECODER_GLOBAL_CONTROL, POISON_ON_ERR_EN, 0, 1)
> +    FIELD(CXL_HDM_DECODER_GLOBAL_CONTROL, HDM_DECODER_ENABLE, 1, 1)
> +
> +HDM_DECODER_INIT(0);
> +
> +/* 8.2.5.13 - CXL Extended Security Capability Structure (Root complex only) */
> +#define EXTSEC_ENTRY_MAX        256
> +#define CXL_EXTSEC_REGISTERS_OFFSET (CXL_HDM_REGISTERS_OFFSET + CXL_HDM_REGISTERS_SIZE)
> +#define CXL_EXTSEC_REGISTERS_SIZE   (8 * EXTSEC_ENTRY_MAX + 4)
> +
> +/* 8.2.5.14 - CXL IDE Capability Structure */
> +#define CXL_IDE_REGISTERS_OFFSET (CXL_EXTSEC_REGISTERS_OFFSET + CXL_EXTSEC_REGISTERS_SIZE)
> +#define CXL_IDE_REGISTERS_SIZE   0

0x20 (given we seem to be sizing other things we aren't using yet)

> +
> +/* 8.2.5.15 - CXL Snoop Filter Capability Structure */
> +#define CXL_SNOOP_REGISTERS_OFFSET (CXL_IDE_REGISTERS_OFFSET + CXL_IDE_REGISTERS_SIZE)
> +#define CXL_SNOOP_REGISTERS_SIZE   0x8
> +
> +_Static_assert((CXL_SNOOP_REGISTERS_OFFSET + CXL_SNOOP_REGISTERS_SIZE) < 0x1000,
> +               "No space for registers");
> +
> +typedef struct component_registers {
> +    /*
> +     * Main memory region to be registered with QEMU core.
> +     */
> +    MemoryRegion component_registers;
> +
> +    /*
> +     * 8.2.4 Table 141:
> +     *   0x0000 - 0x0fff CXL.io registers
> +     *   0x1000 - 0x1fff CXL.cache and CXL.mem
> +     *   0x2000 - 0xdfff Implementation specific
> +     *   0xe000 - 0xe3ff CXL ARB/MUX registers
> +     *   0xe400 - 0xffff RSVD
> +     */
> +    uint32_t io_registers[CXL2_COMPONENT_IO_REGION_SIZE >> 2];
> +    MemoryRegion io;
> +
> +    uint32_t cache_mem_registers[CXL2_COMPONENT_CM_REGION_SIZE >> 2];
> +    MemoryRegion cache_mem;
> +
> +    MemoryRegion impl_specific;
> +    MemoryRegion arb_mux;
> +    MemoryRegion rsvd;
> +
> +    /* special_ops is used for any component that needs any specific handling */
> +    MemoryRegionOps *special_ops;
> +} ComponentRegisters;
> +
> +/*
> + * A CXL component represents all entities in a CXL hierarchy. This includes,
> + * host bridges, root ports, upstream/downstream switch ports, and devices
> + */
> +typedef struct cxl_component {
> +    ComponentRegisters crb;
> +    union {
> +        struct {
> +            Range dvsecs[CXL20_MAX_DVSEC];
> +            uint16_t dvsec_offset;
> +            struct PCIDevice *pdev;
> +        };
> +    };
> +} CXLComponentState;
> +
> +void cxl_component_register_block_init(Object *obj,
> +                                       CXLComponentState *cxl_cstate,
> +                                       const char *type);
> +void cxl_component_register_init_common(uint32_t *reg_state,
> +                                        enum reg_type type);
> +
> +void cxl_component_create_dvsec(CXLComponentState *cxl_cstate, uint16_t length,
> +                                uint16_t type, uint8_t rev, uint8_t *body);
> +
> +#endif
> diff --git a/include/hw/cxl/cxl_pci.h b/include/hw/cxl/cxl_pci.h
> new file mode 100644
> index 0000000000..a53c2e5ae7
> --- /dev/null
> +++ b/include/hw/cxl/cxl_pci.h
> @@ -0,0 +1,138 @@
> +/*
> + * QEMU CXL PCI interfaces
> + *
> + * Copyright (c) 2020 Intel
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See the
> + * COPYING file in the top-level directory.
> + */
> +
> +#ifndef CXL_PCI_H
> +#define CXL_PCI_H
> +
> +#include "hw/pci/pci.h"
> +#include "hw/pci/pcie.h"
> +
> +#define CXL_VENDOR_ID 0x1e98
> +
> +#define PCIE_DVSEC_HEADER1_OFFSET 0x4 /* Offset from start of extend cap */
> +#define PCIE_DVSEC_ID_OFFSET 0x8
> +
> +#define PCIE_CXL_DEVICE_DVSEC_LENGTH 0x38
> +#define PCIE_CXL1_DEVICE_DVSEC_REVID 0
> +#define PCIE_CXL2_DEVICE_DVSEC_REVID 1
> +
> +#define EXTENSIONS_PORT_DVSEC_LENGTH 0x28
> +#define EXTENSIONS_PORT_DVSEC_REVID 0
> +
> +#define GPF_PORT_DVSEC_LENGTH 0x10
> +#define GPF_PORT_DVSEC_REVID  0
> +
> +#define PCIE_FLEXBUS_PORT_DVSEC_LENGTH_2_0 0x14
> +#define PCIE_FLEXBUS_PORT_DVSEC_REVID_2_0  1
> +
> +#define REG_LOC_DVSEC_LENGTH 0x24
> +#define REG_LOC_DVSEC_REVID  0
> +
> +enum {
> +    PCIE_CXL_DEVICE_DVSEC      = 0,
> +    NON_CXL_FUNCTION_MAP_DVSEC = 2,
> +    EXTENSIONS_PORT_DVSEC      = 3,
> +    GPF_PORT_DVSEC             = 4,
> +    GPF_DEVICE_DVSEC           = 5,
> +    PCIE_FLEXBUS_PORT_DVSEC    = 7,
> +    REG_LOC_DVSEC              = 8,
> +    MLD_DVSEC                  = 9,
> +    CXL20_MAX_DVSEC
> +};
> +
> +struct dvsec_header {
> +    uint32_t cap_hdr;
> +    uint32_t dv_hdr1;
> +    uint16_t dv_hdr2;
> +} __attribute__((__packed__));
> +_Static_assert(sizeof(struct dvsec_header) == 10,

Be consistent on decimal or hex for these checks. I don't really
care which but odd to mix and match.

> +               "dvsec header size incorrect");
> +
> +/*
> + * CXL 2.0 devices must implement certain DVSEC IDs, and can [optionally]
> + * implement others.
> + *
> + * CXL 2.0 Device: 0, [2], 5, 8
> + * CXL 2.0 RP: 3, 4, 7, 8
> + * CXL 2.0 Upstream Port: [2], 7, 8
> + * CXL 2.0 Downstream Port: 3, 4, 7, 8
> + */
> +
> +/* CXL 2.0 - 8.1.5 (ID 0003) */
> +struct extensions_dvsec_port {
Probably want consistent naming (as can be done anyway) so
cxl_dvsec_port_extensions maybe?

> +    struct dvsec_header hdr;
> +    uint16_t status;
> +    uint16_t control;
> +    uint8_t alt_bus_base;
> +    uint8_t alt_bus_limit;
> +    uint16_t alt_memory_base;
> +    uint16_t alt_memory_limit;
> +    uint16_t alt_prefetch_base;
> +    uint16_t alt_prefetch_limit;
> +    uint32_t alt_prefetch_base_high;
> +    uint32_t alt_prefetch_base_low;
> +    uint32_t rcrb_base;
> +    uint32_t rcrb_base_high;
> +};
> +_Static_assert(sizeof(struct extensions_dvsec_port) == 0x28,
> +               "extensions dvsec port size incorrect");
> +#define PORT_CONTROL_OVERRIDE_OFFSET 0xc

What's this one?  Looks to just be the PORT_CONTROL_OFFSET
though admittedly the spec does refer to this as OVERRIDE_OFFSET in
one place.

> +#define PORT_CONTROL_UNMASK_SBR      1
> +#define PORT_CONTROL_ALT_MEMID_EN    4
> +
> +/* CXL 2.0 - 8.1.6 GPF DVSEC (ID 0004) */
> +struct dvsec_port_gpf {
> +    struct dvsec_header hdr;
> +    uint16_t rsvd;
> +    uint16_t phase1_ctrl;
> +    uint16_t phase2_ctrl;
> +};
> +_Static_assert(sizeof(struct dvsec_port_gpf) == 0x10,
> +               "dvsec port GPF size incorrect");
> +
> +/* CXL 2.0 - 8.1.8/8.2.1.3 Flexbus DVSEC (ID 0007) */
> +struct dvsec_port_flexbus {
> +    struct dvsec_header hdr;
> +    uint16_t cap;
> +    uint16_t ctrl;
> +    uint16_t status;
> +    uint32_t rcvd_mod_ts_data;

Whilst it is wordy, I'd keep the full naming of that field.
rcvd_mod_ts_data_phase_1; 

> +};
> +_Static_assert(sizeof(struct dvsec_port_flexbus) == 0x14,
> +               "dvsec port flexbus size incorrect");
> +
> +/* CXL 2.0 - 8.1.9 Register Locator DVSEC (ID 0008) */
> +struct dvsec_register_locator {

I'd prefix these structures with cxl_ for all the normal
namespacing collision reasons.
Same for #defines


> +    struct dvsec_header hdr;
> +    uint16_t rsvd;
> +    uint32_t reg0_base_lo;
> +    uint32_t reg0_base_hi;
> +    uint32_t reg1_base_lo;
> +    uint32_t reg1_base_hi;
> +    uint32_t reg2_base_lo;
> +    uint32_t reg2_base_hi;
> +};
> +_Static_assert(sizeof(struct dvsec_register_locator) == 0x24,
> +               "dvsec register locator size incorrect");
> +
> +/* BAR Equivalence Indicator */
> +#define BEI_BAR_10H 0
> +#define BEI_BAR_14H 1
> +#define BEI_BAR_18H 2
> +#define BEI_BAR_1cH 3
> +#define BEI_BAR_20H 4
> +#define BEI_BAR_24H 5
> +
> +/* Register Block Identifier */
> +#define RBI_EMPTY          0
> +#define RBI_COMPONENT_REG  (1 << 8)
> +#define RBI_BAR_VIRT_ACL   (2 << 8)
> +#define RBI_CXL_DEVICE_REG (3 << 8)
> +
> +#endif


^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [RFC PATCH v3 02/31] hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5)
@ 2021-02-02 11:48     ` Jonathan Cameron
  0 siblings, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2021-02-02 11:48 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: David Hildenbrand, Vishal Verma, John Groves (jgroves),
	Chris Browy, qemu-devel, linux-cxl, Markus Armbruster,
	Michael S. Tsirkin, Igor Mammedov, Dan Williams, Ira Weiny,
	Philippe Mathieu-Daudé

On Mon, 1 Feb 2021 16:59:19 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> A CXL 2.0 component is any entity in the CXL topology. All components
> have a analogous function in PCIe. Except for the CXL host bridge, all
> have a PCIe config space that is accessible via the common PCIe
> mechanisms. CXL components are enumerated via DVSEC fields in the
> extended PCIe header space. CXL components will minimally implement some
> subset of CXL.mem and CXL.cache registers defined in 8.2.5 of the CXL
> 2.0 specification. Two headers and a utility library are introduced to
> support the minimum functionality needed to enumerate components.
> 
> The cxl_pci header manages bits associated with PCI, specifically the
> DVSEC and related fields. The cxl_component.h variant has data
> structures and APIs that are useful for drivers implementing any of the
> CXL 2.0 components. The library takes care of making use of the DVSEC
> bits and the CXL.[mem|cache] registers. Per spec, the registers are
> little endian.
> 
> None of the mechanisms required to enumerate a CXL capable hostbridge
> are introduced at this point.
> 
> Note that the CXL.mem and CXL.cache registers used are always 4B wide.
> It's possible in the future that this constraint will not hold.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

A few minor discrepancies from the spec, + naming suggestions.

Otherwise LGTM.

> ---
>  MAINTAINERS                    |   6 +
>  hw/Kconfig                     |   1 +
>  hw/cxl/Kconfig                 |   3 +
>  hw/cxl/cxl-component-utils.c   | 208 +++++++++++++++++++++++++++++++++
>  hw/cxl/meson.build             |   3 +
>  hw/meson.build                 |   1 +
>  include/hw/cxl/cxl.h           |  17 +++
>  include/hw/cxl/cxl_component.h | 187 +++++++++++++++++++++++++++++
>  include/hw/cxl/cxl_pci.h       | 138 ++++++++++++++++++++++
>  9 files changed, 564 insertions(+)
>  create mode 100644 hw/cxl/Kconfig
>  create mode 100644 hw/cxl/cxl-component-utils.c
>  create mode 100644 hw/cxl/meson.build
>  create mode 100644 include/hw/cxl/cxl.h
>  create mode 100644 include/hw/cxl/cxl_component.h
>  create mode 100644 include/hw/cxl/cxl_pci.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index bcd88668bc..981dc92e25 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -2234,6 +2234,12 @@ F: qapi/block*.json
>  F: qapi/transaction.json
>  T: git https://repo.or.cz/qemu/armbru.git block-next
>  
> +Compute Express Link
> +M: Ben Widawsky <ben.widawsky@intel.com>
> +S: Supported
> +F: hw/cxl/
> +F: include/hw/cxl/
> +
>  Dirty Bitmaps
>  M: Eric Blake <eblake@redhat.com>
>  M: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> diff --git a/hw/Kconfig b/hw/Kconfig
> index 5ad3c6b5a4..c03650c5ed 100644
> --- a/hw/Kconfig
> +++ b/hw/Kconfig
> @@ -6,6 +6,7 @@ source audio/Kconfig
>  source block/Kconfig
>  source char/Kconfig
>  source core/Kconfig
> +source cxl/Kconfig
>  source display/Kconfig
>  source dma/Kconfig
>  source gpio/Kconfig
> diff --git a/hw/cxl/Kconfig b/hw/cxl/Kconfig
> new file mode 100644
> index 0000000000..8e67519b16
> --- /dev/null
> +++ b/hw/cxl/Kconfig
> @@ -0,0 +1,3 @@
> +config CXL
> +    bool
> +    default y if PCI_EXPRESS
> diff --git a/hw/cxl/cxl-component-utils.c b/hw/cxl/cxl-component-utils.c
> new file mode 100644
> index 0000000000..8d56ad5c7d
> --- /dev/null
> +++ b/hw/cxl/cxl-component-utils.c
> @@ -0,0 +1,208 @@
> +/*
> + * CXL Utility library for components
> + *
> + * Copyright(C) 2020 Intel Corporation.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See the
> + * COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu/log.h"
> +#include "hw/pci/pci.h"
> +#include "hw/cxl/cxl.h"
> +
> +static uint64_t cxl_cache_mem_read_reg(void *opaque, hwaddr offset,
> +                                       unsigned size)
> +{
> +    CXLComponentState *cxl_cstate = opaque;
> +    ComponentRegisters *cregs = &cxl_cstate->crb;
> +
> +    assert(size == 4);
> +
> +    if (cregs->special_ops && cregs->special_ops->read) {
> +        return cregs->special_ops->read(cxl_cstate, offset, size);
> +    } else {
> +        return cregs->cache_mem_registers[offset / 4];
> +    }
> +}
> +
> +static void cxl_cache_mem_write_reg(void *opaque, hwaddr offset, uint64_t value,
> +                                    unsigned size)
> +{
> +    CXLComponentState *cxl_cstate = opaque;
> +    ComponentRegisters *cregs = &cxl_cstate->crb;
> +
> +    assert(size == 4);
> +
> +    if (cregs->special_ops && cregs->special_ops->write) {
> +        cregs->special_ops->write(cxl_cstate, offset, value, size);
> +    } else {
> +        cregs->cache_mem_registers[offset / 4] = value;
> +    }
> +}
> +
> +/*
> + * 8.2.3
> + *   The access restrictions specified in Section 8.2.2 also apply to CXL 2.0
> + *   Component Registers.
> + *
> + * 8.2.2
> + *   • A 32 bit register shall be accessed as a 4 Bytes quantity. Partial
> + *   reads are not permitted.
> + *   • A 64 bit register shall be accessed as a 8 Bytes quantity. Partial
> + *   reads are not permitted.
> + *
> + * As of the spec defined today, only 4 byte registers exist.
> + */
> +static const MemoryRegionOps cache_mem_ops = {
> +    .read = cxl_cache_mem_read_reg,
> +    .write = cxl_cache_mem_write_reg,
> +    .endianness = DEVICE_LITTLE_ENDIAN,
> +    .valid = {
> +        .min_access_size = 4,
> +        .max_access_size = 4,
> +        .unaligned = false,
> +    },
> +    .impl = {
> +        .min_access_size = 4,
> +        .max_access_size = 4,
> +    },
> +};
> +
> +void cxl_component_register_block_init(Object *obj,
> +                                       CXLComponentState *cxl_cstate,
> +                                       const char *type)
> +{
> +    ComponentRegisters *cregs = &cxl_cstate->crb;
> +
> +    memory_region_init(&cregs->component_registers, obj, type,
> +                       CXL2_COMPONENT_BLOCK_SIZE);
> +
> +    /* io registers controls link which we don't care about in QEMU */
> +    memory_region_init_io(&cregs->io, obj, NULL, cregs, ".io",
> +                          CXL2_COMPONENT_IO_REGION_SIZE);
> +    memory_region_init_io(&cregs->cache_mem, obj, &cache_mem_ops, cregs,
> +                          ".cache_mem", CXL2_COMPONENT_CM_REGION_SIZE);
> +
> +    memory_region_add_subregion(&cregs->component_registers, 0, &cregs->io);
> +    memory_region_add_subregion(&cregs->component_registers,
> +                                CXL2_COMPONENT_IO_REGION_SIZE,
> +                                &cregs->cache_mem);
> +}
> +
> +static void ras_init_common(uint32_t *reg_state)
> +{
> +    reg_state[R_CXL_RAS_UNC_ERR_STATUS] = 0;
> +    reg_state[R_CXL_RAS_UNC_ERR_MASK] = 0x1efff;  
This should be bits 0-11 plus bits 14-16, I believe, i.e. 0x1cfff.

> +    reg_state[R_CXL_RAS_UNC_ERR_SEVERITY] = 0x1efff;
0x1cfff here as well.
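
Both values follow from the two bit ranges. A minimal sketch of the arithmetic,
assuming nothing beyond the ranges quoted above (the macro names are mine, not
existing QEMU API):

```c
#include <assert.h>
#include <stdint.h>

/* Mask with bits lo..hi (inclusive) set. Assumes lo <= hi <= 31. */
#define BIT_RANGE(lo, hi) \
    ((uint32_t)(((UINT64_C(1) << ((hi) + 1)) - 1) & \
                ~((UINT64_C(1) << (lo)) - 1)))

/* Hypothetical name: bits 0-11 plus bits 14-16. */
#define RAS_UNC_ERR_DEFAULT (BIT_RANGE(0, 11) | BIT_RANGE(14, 16))

/* Compile-time check that the two ranges give the suggested constant. */
static_assert(RAS_UNC_ERR_DEFAULT == 0x1cfff, "unexpected RAS default mask");
```

The value in the patch, 0x1efff, differs only in also setting bit 13.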

> +    reg_state[R_CXL_RAS_COR_ERR_STATUS] = 0;
> +    reg_state[R_CXL_RAS_COR_ERR_MASK] = 0x3f;
> +    reg_state[R_CXL_RAS_ERR_CAP_CTRL] = 0; /* CXL switches and devices must set */
> +}
> +
> +static void hdm_init_common(uint32_t *reg_state)
> +{
> +    ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY, DECODER_COUNT, 0);
> +    ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_GLOBAL_CONTROL, HDM_DECODER_ENABLE, 0);
> +}
> +
> +void cxl_component_register_init_common(uint32_t *reg_state, enum reg_type type)
> +{
> +    int caps = 0;
> +    switch (type) {
> +    case CXL2_DOWNSTREAM_PORT:
> +    case CXL2_DEVICE:
> +        /* CAP, RAS, Link */
> +        caps = 2;
> +        break;
> +    case CXL2_UPSTREAM_PORT:
> +    case CXL2_TYPE3_DEVICE:
> +    case CXL2_LOGICAL_DEVICE:
> +        /* + HDM */
> +        caps = 3;
> +        break;
> +    case CXL2_ROOT_PORT:
> +        /* + Extended Security, + Snoop */
> +        caps = 5;
> +        break;
> +    default:
> +        abort();
> +    }
> +
> +    memset(reg_state, 0, 0x1000);

Better to pass the size in so it's apparent where that came from?

> +
> +    /* CXL Capability Header Register */
> +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ID, 1);
> +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, VERSION, 1);
> +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, CACHE_MEM_VERSION, 1);
> +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ARRAY_SIZE, caps);
> +
> +
> +#define init_cap_reg(reg, id, version)                                        \
> +    _Static_assert(CXL_##reg##_REGISTERS_OFFSET != 0, "Invalid cap offset\n");\
> +    do {                                                                      \
> +        int which = R_CXL_##reg##_CAPABILITY_HEADER;                          \
> +        reg_state[which] = FIELD_DP32(reg_state[which],                       \
> +                                      CXL_##reg##_CAPABILITY_HEADER, ID, id); \
> +        reg_state[which] =                                                    \
> +            FIELD_DP32(reg_state[which], CXL_##reg##_CAPABILITY_HEADER,       \
> +                       VERSION, version);                                     \
> +        reg_state[which] =                                                    \
> +            FIELD_DP32(reg_state[which], CXL_##reg##_CAPABILITY_HEADER, PTR,  \
> +                       CXL_##reg##_REGISTERS_OFFSET);                         \
> +    } while (0)
> +
> +    init_cap_reg(RAS, 2, 1);
> +    ras_init_common(reg_state);
> +
> +    init_cap_reg(LINK, 4, 2);
> +
> +    if (caps < 3) {

It strikes me that basing this purely on the number of caps is not
particularly flexible or maintainable, but I guess it will do until we need
something more sophisticated (i.e. when the capability lists are no longer a
series of expanding, inclusive sets of entries).
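
To illustrate what "more sophisticated" might look like: a per-type bitmask of
capabilities, with the ARRAY_SIZE value derived from it. This is only a sketch
under my own assumptions. The CAP_* flags and both helpers are hypothetical,
and the enum simply mirrors the one in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors enum reg_type from the patch. */
enum reg_type {
    CXL2_DEVICE,
    CXL2_TYPE3_DEVICE,
    CXL2_LOGICAL_DEVICE,
    CXL2_ROOT_PORT,
    CXL2_UPSTREAM_PORT,
    CXL2_DOWNSTREAM_PORT
};

/* Hypothetical capability flags, one bit per capability structure. */
#define CAP_RAS    (1u << 0)
#define CAP_LINK   (1u << 1)
#define CAP_HDM    (1u << 2)
#define CAP_EXTSEC (1u << 3)
#define CAP_SNOOP  (1u << 4)

static_assert((CAP_RAS | CAP_LINK | CAP_HDM | CAP_EXTSEC | CAP_SNOOP) == 0x1f,
              "capability flags must be distinct, contiguous bits");

/* Same sets the patch encodes as caps = 2, 3 and 5. Returns 0 if unknown. */
static inline uint32_t caps_for_type(enum reg_type type)
{
    switch (type) {
    case CXL2_DOWNSTREAM_PORT:
    case CXL2_DEVICE:
        return CAP_RAS | CAP_LINK;
    case CXL2_UPSTREAM_PORT:
    case CXL2_TYPE3_DEVICE:
    case CXL2_LOGICAL_DEVICE:
        return CAP_RAS | CAP_LINK | CAP_HDM;
    case CXL2_ROOT_PORT:
        return CAP_RAS | CAP_LINK | CAP_HDM | CAP_EXTSEC | CAP_SNOOP;
    }
    return 0;
}

/* Number of set bits, i.e. the value that would go into ARRAY_SIZE. */
static inline unsigned cap_count(uint32_t caps)
{
    unsigned n = 0;

    for (; caps; caps &= caps - 1) {
        n++;
    }
    return n;
}
```

The init function could then test individual flags (caps & CAP_HDM) rather
than compare a count, so a type with a non-nested set would not need special
casing.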

> +        return;
> +    }
> +
> +    init_cap_reg(HDM, 5, 1);
> +    hdm_init_common(reg_state);
> +
> +    if (caps < 5) {
> +        return;
> +    }
> +
> +    init_cap_reg(EXTSEC, 6, 1);
> +    init_cap_reg(SNOOP, 8, 1);
> +
> +#undef init_cap_reg
> +}
> +
> +/*
> + * Helper to creates a DVSEC header for a CXL entity. The caller is responsible
> + * for tracking the valid offset.
> + *
> + * This function will build the DVSEC header on behalf of the caller and then
> + * copy in the remaining data for the vendor specific bits.
> + */
> +void cxl_component_create_dvsec(CXLComponentState *cxl, uint16_t length,
> +                                uint16_t type, uint8_t rev, uint8_t *body)
> +{
> +    PCIDevice *pdev = cxl->pdev;
> +    uint16_t offset = cxl->dvsec_offset;
> +
> +    assert(offset >= PCI_CFG_SPACE_SIZE &&
> +           ((offset + length) < PCI_CFG_SPACE_EXP_SIZE));
> +    assert((length & 0xf000) == 0);
> +    assert((rev & ~0xf) == 0);
> +
> +    /* Create the DVSEC in the MCFG space */
> +    pcie_add_capability(pdev, PCI_EXT_CAP_ID_DVSEC, 1, offset, length);
> +    pci_set_long(pdev->config + offset + PCIE_DVSEC_HEADER1_OFFSET,
> +                 (length << 20) | (rev << 16) | CXL_VENDOR_ID);
> +    pci_set_word(pdev->config + offset + PCIE_DVSEC_ID_OFFSET, type);
> +    memcpy(pdev->config + offset + sizeof(struct dvsec_header),
> +           body + sizeof(struct dvsec_header),
> +           length - sizeof(struct dvsec_header));
> +
> +    /* Update state for future DVSEC additions */
> +    range_init_nofail(&cxl->dvsecs[type], cxl->dvsec_offset, length);
> +    cxl->dvsec_offset += length;
> +}
> diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build
> new file mode 100644
> index 0000000000..00c3876a0f
> --- /dev/null
> +++ b/hw/cxl/meson.build
> @@ -0,0 +1,3 @@
> +softmmu_ss.add(when: 'CONFIG_CXL', if_true: files(
> +  'cxl-component-utils.c',
> +))
> diff --git a/hw/meson.build b/hw/meson.build
> index 010de7219c..3e440c341a 100644
> --- a/hw/meson.build
> +++ b/hw/meson.build
> @@ -6,6 +6,7 @@ subdir('block')
>  subdir('char')
>  subdir('core')
>  subdir('cpu')
> +subdir('cxl')
>  subdir('display')
>  subdir('dma')
>  subdir('gpio')
> diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
> new file mode 100644
> index 0000000000..55f6cc30a5
> --- /dev/null
> +++ b/include/hw/cxl/cxl.h
> @@ -0,0 +1,17 @@
> +/*
> + * QEMU CXL Support
> + *
> + * Copyright (c) 2020 Intel
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See the
> + * COPYING file in the top-level directory.
> + */
> +
> +#ifndef CXL_H
> +#define CXL_H
> +
> +#include "cxl_pci.h"
> +#include "cxl_component.h"
> +
> +#endif
> +
> diff --git a/include/hw/cxl/cxl_component.h b/include/hw/cxl/cxl_component.h
> new file mode 100644
> index 0000000000..762feb54da
> --- /dev/null
> +++ b/include/hw/cxl/cxl_component.h
> @@ -0,0 +1,187 @@
> +/*
> + * QEMU CXL Component
> + *
> + * Copyright (c) 2020 Intel
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See the
> + * COPYING file in the top-level directory.
> + */
> +
> +#ifndef CXL_COMPONENT_H
> +#define CXL_COMPONENT_H
> +
> +/* CXL 2.0 - 8.2.4 */
> +#define CXL2_COMPONENT_IO_REGION_SIZE 0x1000
> +#define CXL2_COMPONENT_CM_REGION_SIZE 0x1000
> +#define CXL2_COMPONENT_BLOCK_SIZE 0x10000
> +
> +#include "qemu/range.h"
> +#include "qemu/typedefs.h"
> +#include "hw/register.h"
> +
> +enum reg_type {
> +    CXL2_DEVICE,
> +    CXL2_TYPE3_DEVICE,
> +    CXL2_LOGICAL_DEVICE,
> +    CXL2_ROOT_PORT,
> +    CXL2_UPSTREAM_PORT,
> +    CXL2_DOWNSTREAM_PORT
> +};
> +
> +/*
> + * Capability registers are defined at the top of the CXL.cache/mem region and
> + * are packed. For our purposes we will always define the caps in the same
> + * order.
> + * CXL 2.0 - 8.2.5 Table 142 for details.
> + */
> +
> +/* CXL 2.0 - 8.2.5.1 */
> +REG32(CXL_CAPABILITY_HEADER, 0)
> +    FIELD(CXL_CAPABILITY_HEADER, ID, 0, 16)
> +    FIELD(CXL_CAPABILITY_HEADER, VERSION, 16, 4)
> +    FIELD(CXL_CAPABILITY_HEADER, CACHE_MEM_VERSION, 20, 4)
> +    FIELD(CXL_CAPABILITY_HEADER, ARRAY_SIZE, 24, 8)
> +
> +#define CXLx_CAPABILITY_HEADER(type, offset)                  \
> +    REG32(CXL_##type##_CAPABILITY_HEADER, offset)             \
> +        FIELD(CXL_##type##_CAPABILITY_HEADER, ID, 0, 16)      \
> +        FIELD(CXL_##type##_CAPABILITY_HEADER, VERSION, 16, 4) \
> +        FIELD(CXL_##type##_CAPABILITY_HEADER, PTR, 20, 12)
> +CXLx_CAPABILITY_HEADER(RAS, 0x4)
> +CXLx_CAPABILITY_HEADER(LINK, 0x8)
> +CXLx_CAPABILITY_HEADER(HDM, 0xc)
> +CXLx_CAPABILITY_HEADER(EXTSEC, 0x10)
> +CXLx_CAPABILITY_HEADER(SNOOP, 0x14)
> +
> +/*
> + * Capability structures contain the actual registers that the CXL component
> + * implements. Some of these are specific to certain types of components, but
> + * this implementation leaves enough space regardless.
> + */
> +/* 8.2.5.9 - CXL RAS Capability Structure */
> +#define CXL_RAS_REGISTERS_OFFSET 0x80 /* Give ample space for caps before this */
> +#define CXL_RAS_REGISTERS_SIZE   0x58
> +REG32(CXL_RAS_UNC_ERR_STATUS, CXL_RAS_REGISTERS_OFFSET)
> +REG32(CXL_RAS_UNC_ERR_MASK, CXL_RAS_REGISTERS_OFFSET + 0x4)
> +REG32(CXL_RAS_UNC_ERR_SEVERITY, CXL_RAS_REGISTERS_OFFSET + 0x8)
> +REG32(CXL_RAS_COR_ERR_STATUS, CXL_RAS_REGISTERS_OFFSET + 0xc)
> +REG32(CXL_RAS_COR_ERR_MASK, CXL_RAS_REGISTERS_OFFSET + 0x10)
> +REG32(CXL_RAS_ERR_CAP_CTRL, CXL_RAS_REGISTERS_OFFSET + 0x14)
> +/* Offset 0x18 - 0x58 reserved for RAS logs */
> +
> +/* 8.2.5.10 - CXL Security Capability Structure */
> +#define CXL_SEC_REGISTERS_OFFSET (CXL_RAS_REGISTERS_OFFSET + CXL_RAS_REGISTERS_SIZE)
> +#define CXL_SEC_REGISTERS_SIZE   0 /* We don't implement 1.1 downstream ports */
> +
> +/* 8.2.5.11 - CXL Link Capability Structure */
> +#define CXL_LINK_REGISTERS_OFFSET (CXL_SEC_REGISTERS_OFFSET + CXL_SEC_REGISTERS_SIZE)
> +#define CXL_LINK_REGISTERS_SIZE   0x38
> +
> +/* 8.2.5.12 - CXL HDM Decoder Capability Structure */
> +#define HDM_DECODE_MAX 10 /* 8.2.5.12.1 */
> +#define CXL_HDM_REGISTERS_OFFSET \
> +    (CXL_LINK_REGISTERS_OFFSET + CXL_LINK_REGISTERS_SIZE) /* 8.2.5.12 */

The positioning of this isn't really defined by 8.2.5.12 that I can see. So I'd drop
that trailing comment.

> +#define CXL_HDM_REGISTERS_SIZE (0x20 + HDM_DECODE_MAX * 10)

This doesn't look quite right. I think it should be
0x10 + HDM_DECODE_MAX * 0x20, i.e. the offset to decoder 0 plus the number of
HDM decoders times the size of each decoder's register set.
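
Spelled out with named constants, as a sketch only (HDM_DECODER0_OFFSET and
HDM_DECODER_STRIDE are placeholder names of mine, and the values are my
reading of the layout in 8.2.5.12):

```c
#include <assert.h>

#define HDM_DECODE_MAX      10   /* as in the patch */
#define HDM_DECODER0_OFFSET 0x10 /* hypothetical: bytes before decoder 0 */
#define HDM_DECODER_STRIDE  0x20 /* hypothetical: bytes per decoder */

#define HDM_REGISTERS_SIZE(count) \
    (HDM_DECODER0_OFFSET + (count) * HDM_DECODER_STRIDE)

static_assert(HDM_REGISTERS_SIZE(HDM_DECODE_MAX) == 0x150,
              "10 decoders need 0x150 bytes");
/* The expression currently in the patch evaluates to much less. */
static_assert(0x20 + HDM_DECODE_MAX * 10 == 0x84,
              "value of the expression in the patch");
```

So the patch currently reserves 0x84 bytes where 0x150 would be needed for ten
decoders.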


> +#define HDM_DECODER_INIT(n)                                                    \
> +  REG32(CXL_HDM_DECODER##n##_BASE_LO,                                          \
> +        CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x10)                          \
This might be easier to read if you define something like
#define CXL_HDM_REGS_DECODER0_OFFSET (CXL_HDM_REGISTERS_OFFSET + 0x10)
and then use that as the base.

> +            FIELD(CXL_HDM_DECODER##n##_BASE_LO, L, 28, 4)                      \
> +  REG32(CXL_HDM_DECODER##n##_BASE_HI,                                          \
> +        CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x14)                          \
> +  REG32(CXL_HDM_DECODER##n##_SIZE_LO,                                          \
> +        CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x18)                          \
> +  REG32(CXL_HDM_DECODER##n##_SIZE_HI,                                          \
> +        CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x1C)                          \
> +  REG32(CXL_HDM_DECODER##n##_CTRL,                                             \
> +        CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x20)                          \
> +            FIELD(CXL_HDM_DECODER##n##_CTRL, IG, 0, 4)                         \
> +            FIELD(CXL_HDM_DECODER##n##_CTRL, IW, 4, 4)                         \
> +            FIELD(CXL_HDM_DECODER##n##_CTRL, LOCK_ON_COMMIT, 8, 1)             \
> +            FIELD(CXL_HDM_DECODER##n##_CTRL, COMMIT, 9, 1)                     \
> +            FIELD(CXL_HDM_DECODER##n##_CTRL, COMMITTED, 10, 1)                 \
> +            FIELD(CXL_HDM_DECODER##n##_CTRL, ERROR, 11, 1)                     \
> +            FIELD(CXL_HDM_DECODER##n##_CTRL, TYPE, 12, 1)                      \
> +  REG32(CXL_HDM_DECODER##n##_TARGET_LIST_LO, 0x24)                             \
I think the offset should be per 'n' and relative to the HDM block, i.e.
CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x24.

> +  REG32(CXL_HDM_DECODER##n##_TARGET_LIST_HI, 0x28)

Hmm. There is a hole here in the spec.  Probably needs a reserved 4 bytes
at CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x2c given next entry is at 0x30

Should be added to 8.2.5.12 table
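
To make the per-decoder layout concrete, here is a sketch of how I read it,
relative to the start of the HDM capability structure. The enum and macro
names are hypothetical. The offsets for decoder 0 match the ones the patch
already uses for BASE_LO through CTRL.

```c
#include <assert.h>

/* Offsets within one decoder's 0x20-byte register set (hypothetical names). */
enum hdm_decoder_reg {
    HDM_DEC_BASE_LO        = 0x00,
    HDM_DEC_BASE_HI        = 0x04,
    HDM_DEC_SIZE_LO        = 0x08,
    HDM_DEC_SIZE_HI        = 0x0c,
    HDM_DEC_CTRL           = 0x10,
    HDM_DEC_TARGET_LIST_LO = 0x14,
    HDM_DEC_TARGET_LIST_HI = 0x18,
    HDM_DEC_RSVD           = 0x1c
};

/* Offset of register 'reg' of decoder 'n' from the start of the HDM block. */
#define HDM_DECODER_REG(n, reg) (0x10 + 0x20 * (n) + (reg))

static_assert(HDM_DECODER_REG(0, HDM_DEC_TARGET_LIST_LO) == 0x24,
              "decoder 0 target list low");
static_assert(HDM_DECODER_REG(0, HDM_DEC_RSVD) == 0x2c,
              "reserved dword closes decoder 0");
static_assert(HDM_DECODER_REG(1, HDM_DEC_BASE_LO) == 0x30,
              "decoder 1 starts at 0x30");
```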

> +
> +REG32(CXL_HDM_DECODER_CAPABILITY, CXL_HDM_REGISTERS_OFFSET)
> +    FIELD(CXL_HDM_DECODER_CAPABILITY, DECODER_COUNT, 0, 4)
> +    FIELD(CXL_HDM_DECODER_CAPABILITY, TARGET_COUNT, 4, 4)
> +    FIELD(CXL_HDM_DECODER_CAPABILITY, INTERLEAVE_256B, 8, 1)
> +    FIELD(CXL_HDM_DECODER_CAPABILITY, INTELEAVE_4K, 9, 1)
> +    FIELD(CXL_HDM_DECODER_CAPABILITY, POISON_ON_ERR_CAP, 10, 1)
> +REG32(CXL_HDM_DECODER_GLOBAL_CONTROL, CXL_HDM_REGISTERS_OFFSET + 4) 

I'd be consistent on using hex for all offsets (even when it clearly makes no
difference like here!)

> +    FIELD(CXL_HDM_DECODER_GLOBAL_CONTROL, POISON_ON_ERR_EN, 0, 1)
> +    FIELD(CXL_HDM_DECODER_GLOBAL_CONTROL, HDM_DECODER_ENABLE, 1, 1)
> +
> +HDM_DECODER_INIT(0);
> +
> +/* 8.2.5.13 - CXL Extended Security Capability Structure (Root complex only) */
> +#define EXTSEC_ENTRY_MAX        256
> +#define CXL_EXTSEC_REGISTERS_OFFSET (CXL_HDM_REGISTERS_OFFSET + CXL_HDM_REGISTERS_SIZE)
> +#define CXL_EXTSEC_REGISTERS_SIZE   (8 * EXTSEC_ENTRY_MAX + 4)
> +
> +/* 8.2.5.14 - CXL IDE Capability Structure */
> +#define CXL_IDE_REGISTERS_OFFSET (CXL_EXTSEC_REGISTERS_OFFSET + CXL_EXTSEC_REGISTERS_SIZE)
> +#define CXL_IDE_REGISTERS_SIZE   0

0x20 (given we seem to be sizing other things we aren't using yet)

> +
> +/* 8.2.5.15 - CXL Snoop Filter Capability Structure */
> +#define CXL_SNOOP_REGISTERS_OFFSET (CXL_IDE_REGISTERS_OFFSET + CXL_IDE_REGISTERS_SIZE)
> +#define CXL_SNOOP_REGISTERS_SIZE   0x8
> +
> +_Static_assert((CXL_SNOOP_REGISTERS_OFFSET + CXL_SNOOP_REGISTERS_SIZE) < 0x1000,
> +               "No space for registers");
> +
> +typedef struct component_registers {
> +    /*
> +     * Main memory region to be registered with QEMU core.
> +     */
> +    MemoryRegion component_registers;
> +
> +    /*
> +     * 8.2.4 Table 141:
> +     *   0x0000 - 0x0fff CXL.io registers
> +     *   0x1000 - 0x1fff CXL.cache and CXL.mem
> +     *   0x2000 - 0xdfff Implementation specific
> +     *   0xe000 - 0xe3ff CXL ARB/MUX registers
> +     *   0xe400 - 0xffff RSVD
> +     */
> +    uint32_t io_registers[CXL2_COMPONENT_IO_REGION_SIZE >> 2];
> +    MemoryRegion io;
> +
> +    uint32_t cache_mem_registers[CXL2_COMPONENT_CM_REGION_SIZE >> 2];
> +    MemoryRegion cache_mem;
> +
> +    MemoryRegion impl_specific;
> +    MemoryRegion arb_mux;
> +    MemoryRegion rsvd;
> +
> +    /* special_ops is used for any component that needs any specific handling */
> +    MemoryRegionOps *special_ops;
> +} ComponentRegisters;
> +
> +/*
> + * A CXL component represents all entities in a CXL hierarchy. This includes,
> + * host bridges, root ports, upstream/downstream switch ports, and devices
> + */
> +typedef struct cxl_component {
> +    ComponentRegisters crb;
> +    union {
> +        struct {
> +            Range dvsecs[CXL20_MAX_DVSEC];
> +            uint16_t dvsec_offset;
> +            struct PCIDevice *pdev;
> +        };
> +    };
> +} CXLComponentState;
> +
> +void cxl_component_register_block_init(Object *obj,
> +                                       CXLComponentState *cxl_cstate,
> +                                       const char *type);
> +void cxl_component_register_init_common(uint32_t *reg_state,
> +                                        enum reg_type type);
> +
> +void cxl_component_create_dvsec(CXLComponentState *cxl_cstate, uint16_t length,
> +                                uint16_t type, uint8_t rev, uint8_t *body);
> +
> +#endif
> diff --git a/include/hw/cxl/cxl_pci.h b/include/hw/cxl/cxl_pci.h
> new file mode 100644
> index 0000000000..a53c2e5ae7
> --- /dev/null
> +++ b/include/hw/cxl/cxl_pci.h
> @@ -0,0 +1,138 @@
> +/*
> + * QEMU CXL PCI interfaces
> + *
> + * Copyright (c) 2020 Intel
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See the
> + * COPYING file in the top-level directory.
> + */
> +
> +#ifndef CXL_PCI_H
> +#define CXL_PCI_H
> +
> +#include "hw/pci/pci.h"
> +#include "hw/pci/pcie.h"
> +
> +#define CXL_VENDOR_ID 0x1e98
> +
> +#define PCIE_DVSEC_HEADER1_OFFSET 0x4 /* Offset from start of extend cap */
> +#define PCIE_DVSEC_ID_OFFSET 0x8
> +
> +#define PCIE_CXL_DEVICE_DVSEC_LENGTH 0x38
> +#define PCIE_CXL1_DEVICE_DVSEC_REVID 0
> +#define PCIE_CXL2_DEVICE_DVSEC_REVID 1
> +
> +#define EXTENSIONS_PORT_DVSEC_LENGTH 0x28
> +#define EXTENSIONS_PORT_DVSEC_REVID 0
> +
> +#define GPF_PORT_DVSEC_LENGTH 0x10
> +#define GPF_PORT_DVSEC_REVID  0
> +
> +#define PCIE_FLEXBUS_PORT_DVSEC_LENGTH_2_0 0x14
> +#define PCIE_FLEXBUS_PORT_DVSEC_REVID_2_0  1
> +
> +#define REG_LOC_DVSEC_LENGTH 0x24
> +#define REG_LOC_DVSEC_REVID  0
> +
> +enum {
> +    PCIE_CXL_DEVICE_DVSEC      = 0,
> +    NON_CXL_FUNCTION_MAP_DVSEC = 2,
> +    EXTENSIONS_PORT_DVSEC      = 3,
> +    GPF_PORT_DVSEC             = 4,
> +    GPF_DEVICE_DVSEC           = 5,
> +    PCIE_FLEXBUS_PORT_DVSEC    = 7,
> +    REG_LOC_DVSEC              = 8,
> +    MLD_DVSEC                  = 9,
> +    CXL20_MAX_DVSEC
> +};
> +
> +struct dvsec_header {
> +    uint32_t cap_hdr;
> +    uint32_t dv_hdr1;
> +    uint16_t dv_hdr2;
> +} __attribute__((__packed__));
> +_Static_assert(sizeof(struct dvsec_header) == 10,

Be consistent on decimal or hex for these checks. I don't really
care which but odd to mix and match.

> +               "dvsec header size incorrect");
> +
> +/*
> + * CXL 2.0 devices must implement certain DVSEC IDs, and can [optionally]
> + * implement others.
> + *
> + * CXL 2.0 Device: 0, [2], 5, 8
> + * CXL 2.0 RP: 3, 4, 7, 8
> + * CXL 2.0 Upstream Port: [2], 7, 8
> + * CXL 2.0 Downstream Port: 3, 4, 7, 8
> + */
> +
> +/* CXL 2.0 - 8.1.5 (ID 0003) */
> +struct extensions_dvsec_port {
Probably want consistent naming (as can be done anyway) so
cxl_dvsec_port_extensions maybe?

> +    struct dvsec_header hdr;
> +    uint16_t status;
> +    uint16_t control;
> +    uint8_t alt_bus_base;
> +    uint8_t alt_bus_limit;
> +    uint16_t alt_memory_base;
> +    uint16_t alt_memory_limit;
> +    uint16_t alt_prefetch_base;
> +    uint16_t alt_prefetch_limit;
> +    uint32_t alt_prefetch_base_high;
> +    uint32_t alt_prefetch_base_low;
> +    uint32_t rcrb_base;
> +    uint32_t rcrb_base_high;
> +};
> +_Static_assert(sizeof(struct extensions_dvsec_port) == 0x28,
> +               "extensions dvsec port size incorrect");
> +#define PORT_CONTROL_OVERRIDE_OFFSET 0xc

What's this one?  Looks to just be the PORT_CONTROL_OFFSET
though admittedly the spec does refer to this as OVERRIDE_OFFSET in
one place.

> +#define PORT_CONTROL_UNMASK_SBR      1
> +#define PORT_CONTROL_ALT_MEMID_EN    4
> +
> +/* CXL 2.0 - 8.1.6 GPF DVSEC (ID 0004) */
> +struct dvsec_port_gpf {
> +    struct dvsec_header hdr;
> +    uint16_t rsvd;
> +    uint16_t phase1_ctrl;
> +    uint16_t phase2_ctrl;
> +};
> +_Static_assert(sizeof(struct dvsec_port_gpf) == 0x10,
> +               "dvsec port GPF size incorrect");
> +
> +/* CXL 2.0 - 8.1.8/8.2.1.3 Flexbus DVSEC (ID 0007) */
> +struct dvsec_port_flexbus {
> +    struct dvsec_header hdr;
> +    uint16_t cap;
> +    uint16_t ctrl;
> +    uint16_t status;
> +    uint32_t rcvd_mod_ts_data;

Whilst it is wordy, I'd keep the full naming of that field.
rcvd_mod_ts_data_phase_1; 

> +};
> +_Static_assert(sizeof(struct dvsec_port_flexbus) == 0x14,
> +               "dvsec port flexbus size incorrect");
> +
> +/* CXL 2.0 - 8.1.9 Register Locator DVSEC (ID 0008) */
> +struct dvsec_register_locator {

I'd prefix these structures with cxl_ for all the normal
namespacing collision reasons.
Same for #defines


> +    struct dvsec_header hdr;
> +    uint16_t rsvd;
> +    uint32_t reg0_base_lo;
> +    uint32_t reg0_base_hi;
> +    uint32_t reg1_base_lo;
> +    uint32_t reg1_base_hi;
> +    uint32_t reg2_base_lo;
> +    uint32_t reg2_base_hi;
> +};
> +_Static_assert(sizeof(struct dvsec_register_locator) == 0x24,
> +               "dvsec register locator size incorrect");
> +
> +/* BAR Equivalence Indicator */
> +#define BEI_BAR_10H 0
> +#define BEI_BAR_14H 1
> +#define BEI_BAR_18H 2
> +#define BEI_BAR_1cH 3
> +#define BEI_BAR_20H 4
> +#define BEI_BAR_24H 5
> +
> +/* Register Block Identifier */
> +#define RBI_EMPTY          0
> +#define RBI_COMPONENT_REG  (1 << 8)
> +#define RBI_BAR_VIRT_ACL   (2 << 8)
> +#define RBI_CXL_DEVICE_REG (3 << 8)
> +
> +#endif




* Re: [RFC PATCH v3 03/31] hw/cxl/device: Introduce a CXL device (8.2.8)
  2021-02-02  0:59   ` Ben Widawsky
@ 2021-02-02 12:03     ` Jonathan Cameron
  -1 siblings, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2021-02-02 12:03 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: qemu-devel, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Marcel Apfelbaum,
	Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

On Mon, 1 Feb 2021 16:59:20 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> A CXL device is a type of CXL component. Conceptually, a CXL device
> would be a leaf node in a CXL topology. From an emulation perspective,
> CXL devices are the most complex and so the actual implementation is
> reserved for discrete commits.
> 
> This new device type is specifically catered towards the eventual
> implementation of a Type3 CXL.mem device, 8.2.8.5 in the CXL 2.0
> specification.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Really minor comments inline.

In the interests of avoiding giving myself a headache again
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
>  include/hw/cxl/cxl.h        |   1 +
>  include/hw/cxl/cxl_device.h | 155 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 156 insertions(+)
>  create mode 100644 include/hw/cxl/cxl_device.h
> 
> diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
> index 55f6cc30a5..23f52c4cf9 100644
> --- a/include/hw/cxl/cxl.h
> +++ b/include/hw/cxl/cxl.h
> @@ -12,6 +12,7 @@
>  
>  #include "cxl_pci.h"
>  #include "cxl_component.h"
> +#include "cxl_device.h"
>  
>  #endif
>  
> diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> new file mode 100644
> index 0000000000..a85f250503
> --- /dev/null
> +++ b/include/hw/cxl/cxl_device.h
> @@ -0,0 +1,155 @@
> +/*
> + * QEMU CXL Devices
> + *
> + * Copyright (c) 2020 Intel
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See the
> + * COPYING file in the top-level directory.
> + */
> +
> +#ifndef CXL_DEVICE_H
> +#define CXL_DEVICE_H
> +
> +#include "hw/register.h"
> +
> +/*
> + * The following is how a CXL device's MMIO space is laid out. The only
> + * requirement from the spec is that the capabilities array and the capability
> + * headers start at offset 0 and are contiguously packed. The headers themselves
> + * provide offsets to the register fields. For this emulation, registers will
> + * start at offset 0x80 (m == 0x80). No secondary mailbox is implemented which
> + * means that n = m + sizeof(mailbox registers) + sizeof(device registers).
> + *
> + * This is roughly described in 8.2.8 Figure 138 of the CXL 2.0 spec.
> + *
> + * n + PAYLOAD_SIZE_MAX  +---------------------------------+
> + *                       |                                 |
> + *                  ^    |                                 |
> + *                  |    |                                 |
> + *                  |    |                                 |
> + *                  |    |                                 |
> + *                  |    |         Command Payload         |
> + *                  |    |                                 |
> + *                  |    |                                 |
> + *                  |    |                                 |
> + *                  |    |                                 |
> + *                  |    |                                 |
> + *                  n    +---------------------------------+
> + *                  ^    |                                 |
> + *                  |    |    Device Capability Registers  |
> + *                  |    |    x, mailbox, y                |
> + *                  |    |                                 |
> + *                  m    +---------------------------------+
> + *                  ^    |     Device Capability Header y  |
> + *                  |    +---------------------------------+
> + *                  |    | Device Capability Header Mailbox|
> + *                  |    +------------- --------------------
> + *                  |    |     Device Capability Header x  |
> + *                  |    +---------------------------------+
> + *                  |    |                                 |
> + *                  |    |                                 |
> + *                  |    |      Device Cap Array[0..n]     |
> + *                  |    |                                 |
> + *                  |    |                                 |
> + *                  |    |                                 |
> + *                  0    +---------------------------------+
> + */
> +
> +#define CXL_DEVICE_CAP_HDR1_OFFSET 0x10 /* Figure 138 */
> +#define CXL_DEVICE_CAP_REG_SIZE 0x10 /* 8.2.8.2 */
> +#define CXL_DEVICE_CAPS_MAX 4 /* 8.2.8.2.1 + 8.2.8.5 */
> +
> +#define CXL_DEVICE_REGISTERS_OFFSET 0x80 /* Read comment above */
> +#define CXL_DEVICE_REGISTERS_LENGTH 0x8 /* 8.2.8.3.1 */
> +
> +#define CXL_MAILBOX_REGISTERS_OFFSET \
> +    (CXL_DEVICE_REGISTERS_OFFSET + CXL_DEVICE_REGISTERS_LENGTH)
> +#define CXL_MAILBOX_REGISTERS_SIZE 0x20

Perhaps a ref to 8.2.8.4 or Figure 139 here somewhere?
Thanks for all the refs by the way. They make checking this a lot quicker!

> +#define CXL_MAILBOX_PAYLOAD_SHIFT 11
> +#define CXL_MAILBOX_MAX_PAYLOAD_SIZE (1 << CXL_MAILBOX_PAYLOAD_SHIFT)
> +#define CXL_MAILBOX_REGISTERS_LENGTH \
> +    (CXL_MAILBOX_REGISTERS_SIZE + CXL_MAILBOX_MAX_PAYLOAD_SIZE)
> +
> +typedef struct cxl_device_state {
> +    MemoryRegion device_registers;
> +
> +    /* mmio for device capabilities array - 8.2.8.2 */
> +    MemoryRegion caps;
> +
> +    /* mmio for the device status registers 8.2.8.3 */
> +    MemoryRegion device;
> +
> +    /* mmio for the mailbox registers 8.2.8.4 */
> +    MemoryRegion mailbox;
> +
> +    /* memory region for persistent memory, HDM */
> +    MemoryRegion *pmem;
> +
> +    /* memory region for volatile  memory, HDM */
> +    MemoryRegion *vmem;
> +} CXLDeviceState;
> +
> +/* Initialize the register block for a device */
> +void cxl_device_register_block_init(Object *obj, CXLDeviceState *dev);
> +
> +/* Set up default values for the register block */
> +void cxl_device_register_init_common(CXLDeviceState *dev);
> +
> +/* CXL 2.0 - 8.2.8.1 */
> +REG32(CXL_DEV_CAP_ARRAY, 0) /* 48b!?!?! */

Also missing a reserved 64 bits to fill in below the device capability headers
which are offset by 0x10
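
Roughly, the layout I have in mind is the following. It is a sketch only:
CXL_DEVICE_CAP_HDR_OFFSET() is a hypothetical helper, and the two constants are
the ones already defined in this patch.

```c
#include <assert.h>

#define CXL_DEVICE_CAP_HDR1_OFFSET 0x10 /* as in the patch */
#define CXL_DEVICE_CAP_REG_SIZE    0x10 /* as in the patch */

/* Hypothetical helper: offset of the i-th (0-based) capability header. */
#define CXL_DEVICE_CAP_HDR_OFFSET(i) \
    (CXL_DEVICE_CAP_HDR1_OFFSET + (i) * CXL_DEVICE_CAP_REG_SIZE)

/* 8 bytes of capabilities array register + 8 reserved bytes fill 0x00-0x0f. */
static_assert(8 + 8 == CXL_DEVICE_CAP_HDR1_OFFSET,
              "array register plus reserved bytes precede the headers");
static_assert(CXL_DEVICE_CAP_HDR_OFFSET(1) == 0x20,
              "mailbox header follows the device header");
```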

> +    FIELD(CXL_DEV_CAP_ARRAY, CAP_ID, 0, 16)
> +    FIELD(CXL_DEV_CAP_ARRAY, CAP_VERSION, 16, 8)
> +REG32(CXL_DEV_CAP_ARRAY2, 4) /* We're going to pretend it's 64b */
> +    FIELD(CXL_DEV_CAP_ARRAY2, CAP_COUNT, 0, 16)
> +
> +/*
> + * Helper macro to initialize capability headers for CXL devices.
> + *
> + * In the 8.2.8.2, this is listed as a 128b register, but in 8.2.8, it says:
> + * > No registers defined in Section 8.2.8 are larger than 64-bits wide so that
> + * > is the maximum access size allowed for these registers. If this rule is not
> + * > followed, the behavior is undefined
> + *
> + * Here we've chosen to make it 4 dwords. The spec allows any pow2 multiple
> + * access to be used for a register (2 qwords, 8 words, 128 bytes).
> + */
> +#define CXL_DEVICE_CAPABILITY_HEADER_REGISTER(n, offset)                            \
> +    REG32(CXL_DEV_##n##_CAP_HDR0, offset)                 \
> +        FIELD(CXL_DEV_##n##_CAP_HDR0, CAP_ID, 0, 16)      \
> +        FIELD(CXL_DEV_##n##_CAP_HDR0, CAP_VERSION, 16, 8) \
> +    REG32(CXL_DEV_##n##_CAP_HDR1, offset + 4)             \
> +        FIELD(CXL_DEV_##n##_CAP_HDR1, CAP_OFFSET, 0, 32)  \
> +    REG32(CXL_DEV_##n##_CAP_HDR2, offset + 8)             \
> +        FIELD(CXL_DEV_##n##_CAP_HDR2, CAP_LENGTH, 0, 32)
> +
> +CXL_DEVICE_CAPABILITY_HEADER_REGISTER(DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET)
> +CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MAILBOX, CXL_DEVICE_CAP_HDR1_OFFSET + \
> +                                               CXL_DEVICE_CAP_REG_SIZE)
> +
> +REG32(CXL_DEV_MAILBOX_CAP, 0)
> +    FIELD(CXL_DEV_MAILBOX_CAP, PAYLOAD_SIZE, 0, 5)
> +    FIELD(CXL_DEV_MAILBOX_CAP, INT_CAP, 5, 1)
> +    FIELD(CXL_DEV_MAILBOX_CAP, BG_INT_CAP, 6, 1)
> +    FIELD(CXL_DEV_MAILBOX_CAP, MSI_N, 7, 4)
> +
> +REG32(CXL_DEV_MAILBOX_CTRL, 4)
> +    FIELD(CXL_DEV_MAILBOX_CTRL, DOORBELL, 0, 1)
> +    FIELD(CXL_DEV_MAILBOX_CTRL, INT_EN, 1, 1)
> +    FIELD(CXL_DEV_MAILBOX_CTRL, BG_INT_EN, 2, 1)

Is it not worth defining the CXL_DEV_MAILBOX_CMD register (offset 0x8) for
completeness?

> +
> +/* XXX: actually a 64b register */
> +REG32(CXL_DEV_MAILBOX_STS, 0x10)
> +    FIELD(CXL_DEV_MAILBOX_STS, BG_OP, 0, 1)
> +    FIELD(CXL_DEV_MAILBOX_STS, ERRNO, 32, 16)
> +    FIELD(CXL_DEV_MAILBOX_STS, VENDOR_ERRNO, 48, 16)
> +
> +/* XXX: actually a 64b register */
> +REG32(CXL_DEV_BG_CMD_STS, 0x18)
> +    FIELD(CXL_DEV_BG_CMD_STS, BG, 0, 16)
> +    FIELD(CXL_DEV_BG_CMD_STS, DONE, 16, 7)
> +    FIELD(CXL_DEV_BG_CMD_STS, ERRNO, 32, 16)
> +    FIELD(CXL_DEV_BG_CMD_STS, VENDOR_ERRNO, 48, 16)
> +
> +REG32(CXL_DEV_CMD_PAYLOAD, 0x20)
Probably want a comment for this one that it might be huge.

> +
> +#endif


^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [RFC PATCH v3 03/31] hw/cxl/device: Introduce a CXL device (8.2.8)
@ 2021-02-02 12:03     ` Jonathan Cameron
  0 siblings, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2021-02-02 12:03 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: David Hildenbrand, Vishal Verma, John Groves (jgroves),
	Chris Browy, qemu-devel, linux-cxl, Markus Armbruster,
	Michael S. Tsirkin, Igor Mammedov, Dan Williams, Ira Weiny,
	Philippe Mathieu-Daudé

On Mon, 1 Feb 2021 16:59:20 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> A CXL device is a type of CXL component. Conceptually, a CXL device
> would be a leaf node in a CXL topology. From an emulation perspective,
> CXL devices are the most complex and so the actual implementation is
> reserved for discrete commits.
> 
> This new device type is specifically catered towards the eventual
> implementation of a Type3 CXL.mem device, 8.2.8.5 in the CXL 2.0
> specification.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Really minor comments inline.

In the interests of avoiding giving myself a headache again
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
>  include/hw/cxl/cxl.h        |   1 +
>  include/hw/cxl/cxl_device.h | 155 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 156 insertions(+)
>  create mode 100644 include/hw/cxl/cxl_device.h
> 
> diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
> index 55f6cc30a5..23f52c4cf9 100644
> --- a/include/hw/cxl/cxl.h
> +++ b/include/hw/cxl/cxl.h
> @@ -12,6 +12,7 @@
>  
>  #include "cxl_pci.h"
>  #include "cxl_component.h"
> +#include "cxl_device.h"
>  
>  #endif
>  
> diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> new file mode 100644
> index 0000000000..a85f250503
> --- /dev/null
> +++ b/include/hw/cxl/cxl_device.h
> @@ -0,0 +1,155 @@
> +/*
> + * QEMU CXL Devices
> + *
> + * Copyright (c) 2020 Intel
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See the
> + * COPYING file in the top-level directory.
> + */
> +
> +#ifndef CXL_DEVICE_H
> +#define CXL_DEVICE_H
> +
> +#include "hw/register.h"
> +
> +/*
> + * The following is how a CXL device's MMIO space is laid out. The only
> + * requirement from the spec is that the capabilities array and the capability
> + * headers start at offset 0 and are contiguously packed. The headers themselves
> + * provide offsets to the register fields. For this emulation, registers will
> + * start at offset 0x80 (m == 0x80). No secondary mailbox is implemented which
> + * means that n = m + sizeof(mailbox registers) + sizeof(device registers).
> + *
> + * This is roughly described in 8.2.8 Figure 138 of the CXL 2.0 spec.
> + *
> + * n + PAYLOAD_SIZE_MAX  +---------------------------------+
> + *                       |                                 |
> + *                  ^    |                                 |
> + *                  |    |                                 |
> + *                  |    |                                 |
> + *                  |    |                                 |
> + *                  |    |         Command Payload         |
> + *                  |    |                                 |
> + *                  |    |                                 |
> + *                  |    |                                 |
> + *                  |    |                                 |
> + *                  |    |                                 |
> + *                  n    +---------------------------------+
> + *                  ^    |                                 |
> + *                  |    |    Device Capability Registers  |
> + *                  |    |    x, mailbox, y                |
> + *                  |    |                                 |
> + *                  m    +---------------------------------+
> + *                  ^    |     Device Capability Header y  |
> + *                  |    +---------------------------------+
> + *                  |    | Device Capability Header Mailbox|
> + *                  |    +------------- --------------------
> + *                  |    |     Device Capability Header x  |
> + *                  |    +---------------------------------+
> + *                  |    |                                 |
> + *                  |    |                                 |
> + *                  |    |      Device Cap Array[0..n]     |
> + *                  |    |                                 |
> + *                  |    |                                 |
> + *                  |    |                                 |
> + *                  0    +---------------------------------+
> + */
> +
> +#define CXL_DEVICE_CAP_HDR1_OFFSET 0x10 /* Figure 138 */
> +#define CXL_DEVICE_CAP_REG_SIZE 0x10 /* 8.2.8.2 */
> +#define CXL_DEVICE_CAPS_MAX 4 /* 8.2.8.2.1 + 8.2.8.5 */
> +
> +#define CXL_DEVICE_REGISTERS_OFFSET 0x80 /* Read comment above */
> +#define CXL_DEVICE_REGISTERS_LENGTH 0x8 /* 8.2.8.3.1 */
> +
> +#define CXL_MAILBOX_REGISTERS_OFFSET \
> +    (CXL_DEVICE_REGISTERS_OFFSET + CXL_DEVICE_REGISTERS_LENGTH)
> +#define CXL_MAILBOX_REGISTERS_SIZE 0x20

Perhaps a ref to 8.2.8.4 or Figure 139 here somewhere?
Thanks for all the refs by the way. They make checking this a lot quicker!
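
For anyone cross-checking the numbers, the offsets implied by the defines
above work out as below. This is only a standalone sketch: the constants are
copied from the quoted patch, while the shortened enum names and the helper
cxl_mailbox_payload_offset() are hypothetical and not part of the series.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the quoted cxl_device.h defines (names shortened). */
enum {
    DEVICE_REGISTERS_OFFSET  = 0x80,
    DEVICE_REGISTERS_LENGTH  = 0x8,
    MAILBOX_REGISTERS_SIZE   = 0x20,
    MAILBOX_PAYLOAD_SHIFT    = 11,
    MAILBOX_MAX_PAYLOAD_SIZE = 1 << MAILBOX_PAYLOAD_SHIFT,
    MAILBOX_REGISTERS_OFFSET = DEVICE_REGISTERS_OFFSET + DEVICE_REGISTERS_LENGTH,
    MAILBOX_REGISTERS_LENGTH = MAILBOX_REGISTERS_SIZE + MAILBOX_MAX_PAYLOAD_SIZE,
};

/* Compile-time checks of the arithmetic. */
static_assert(MAILBOX_REGISTERS_OFFSET == 0x88, "mailbox follows device status");
static_assert(MAILBOX_MAX_PAYLOAD_SIZE == 2048, "payload is 2^11 bytes");
static_assert(MAILBOX_REGISTERS_LENGTH == 0x820, "0x20 of registers + payload");

/* Hypothetical helper: offset of the command payload within the block. */
static inline uint32_t cxl_mailbox_payload_offset(void)
{
    return MAILBOX_REGISTERS_OFFSET + MAILBOX_REGISTERS_SIZE;
}
```

The result, 0xa8, matches "n" in the layout diagram: m (0x80) plus the device
registers (0x8) plus the mailbox registers (0x20).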

> +#define CXL_MAILBOX_PAYLOAD_SHIFT 11
> +#define CXL_MAILBOX_MAX_PAYLOAD_SIZE (1 << CXL_MAILBOX_PAYLOAD_SHIFT)
> +#define CXL_MAILBOX_REGISTERS_LENGTH \
> +    (CXL_MAILBOX_REGISTERS_SIZE + CXL_MAILBOX_MAX_PAYLOAD_SIZE)
> +
> +typedef struct cxl_device_state {
> +    MemoryRegion device_registers;
> +
> +    /* mmio for device capabilities array - 8.2.8.2 */
> +    MemoryRegion caps;
> +
> +    /* mmio for the device status registers 8.2.8.3 */
> +    MemoryRegion device;
> +
> +    /* mmio for the mailbox registers 8.2.8.4 */
> +    MemoryRegion mailbox;
> +
> +    /* memory region for persistent memory, HDM */
> +    MemoryRegion *pmem;
> +
> +    /* memory region for volatile  memory, HDM */
> +    MemoryRegion *vmem;
> +} CXLDeviceState;
> +
> +/* Initialize the register block for a device */
> +void cxl_device_register_block_init(Object *obj, CXLDeviceState *dev);
> +
> +/* Set up default values for the register block */
> +void cxl_device_register_init_common(CXLDeviceState *dev);
> +
> +/* CXL 2.0 - 8.2.8.1 */
> +REG32(CXL_DEV_CAP_ARRAY, 0) /* 48b!?!?! */

This is also missing the reserved 64 bits that fill the gap below the device
capability headers, which start at offset 0x10.
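
A minimal sketch of the 16-byte layout this implies, with the reserved tail
spelled out. The field positions follow the quoted FIELD() definitions; the
struct, its member names, and cap_header_offset() are hypothetical, and the
placement of the smaller reserved fields is an assumption to check against
8.2.8.1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CAP_HDR1_OFFSET 0x10 /* CXL_DEVICE_CAP_HDR1_OFFSET in the patch */
#define CAP_REG_SIZE    0x10 /* CXL_DEVICE_CAP_REG_SIZE in the patch */

/* Hypothetical view of the capabilities array register. */
struct cap_array_reg {
    uint16_t cap_id;      /* bits 15:0 of dword 0 */
    uint8_t  cap_version; /* bits 23:16 of dword 0 */
    uint8_t  rsvd0;
    uint16_t cap_count;   /* bits 15:0 of dword 1 */
    uint16_t rsvd1;
    uint64_t rsvd2;       /* the reserved 64 bits below the headers */
};

static_assert(sizeof(struct cap_array_reg) == CAP_HDR1_OFFSET,
              "array register must end where the first header starts");
static_assert(offsetof(struct cap_array_reg, cap_count) == 4,
              "count lives in the second dword");
static_assert(offsetof(struct cap_array_reg, rsvd2) == 8,
              "reserved qword fills 0x8..0xf");

/* Offset of capability header 'index' (0-based), packed after the array. */
static inline uint32_t cap_header_offset(unsigned int index)
{
    return CAP_HDR1_OFFSET + index * CAP_REG_SIZE;
}
```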

> +    FIELD(CXL_DEV_CAP_ARRAY, CAP_ID, 0, 16)
> +    FIELD(CXL_DEV_CAP_ARRAY, CAP_VERSION, 16, 8)
> +REG32(CXL_DEV_CAP_ARRAY2, 4) /* We're going to pretend it's 64b */
> +    FIELD(CXL_DEV_CAP_ARRAY2, CAP_COUNT, 0, 16)
> +
> +/*
> + * Helper macro to initialize capability headers for CXL devices.
> + *
> + * In the 8.2.8.2, this is listed as a 128b register, but in 8.2.8, it says:
> + * > No registers defined in Section 8.2.8 are larger than 64-bits wide so that
> + * > is the maximum access size allowed for these registers. If this rule is not
> + * > followed, the behavior is undefined
> + *
> + * Here we've chosen to make it 4 dwords. The spec allows any pow2 multiple
> + * access to be used for a register (2 qwords, 8 words, 128 bytes).
> + */
> +#define CXL_DEVICE_CAPABILITY_HEADER_REGISTER(n, offset)                            \
> +    REG32(CXL_DEV_##n##_CAP_HDR0, offset)                 \
> +        FIELD(CXL_DEV_##n##_CAP_HDR0, CAP_ID, 0, 16)      \
> +        FIELD(CXL_DEV_##n##_CAP_HDR0, CAP_VERSION, 16, 8) \
> +    REG32(CXL_DEV_##n##_CAP_HDR1, offset + 4)             \
> +        FIELD(CXL_DEV_##n##_CAP_HDR1, CAP_OFFSET, 0, 32)  \
> +    REG32(CXL_DEV_##n##_CAP_HDR2, offset + 8)             \
> +        FIELD(CXL_DEV_##n##_CAP_HDR2, CAP_LENGTH, 0, 32)
> +
> +CXL_DEVICE_CAPABILITY_HEADER_REGISTER(DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET)
> +CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MAILBOX, CXL_DEVICE_CAP_HDR1_OFFSET + \
> +                                               CXL_DEVICE_CAP_REG_SIZE)
> +
> +REG32(CXL_DEV_MAILBOX_CAP, 0)
> +    FIELD(CXL_DEV_MAILBOX_CAP, PAYLOAD_SIZE, 0, 5)
> +    FIELD(CXL_DEV_MAILBOX_CAP, INT_CAP, 5, 1)
> +    FIELD(CXL_DEV_MAILBOX_CAP, BG_INT_CAP, 6, 1)
> +    FIELD(CXL_DEV_MAILBOX_CAP, MSI_N, 7, 4)
> +
> +REG32(CXL_DEV_MAILBOX_CTRL, 4)
> +    FIELD(CXL_DEV_MAILBOX_CTRL, DOORBELL, 0, 1)
> +    FIELD(CXL_DEV_MAILBOX_CTRL, INT_EN, 1, 1)
> +    FIELD(CXL_DEV_MAILBOX_CTRL, BG_INT_EN, 2, 1)

Is it not worth defining the CXL_DEV_MAILBOX_CMD register (offset 0x8) for
completeness?
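
As a rough illustration of what such a definition would need to describe,
here is a standalone sketch that decodes a 64-bit command register value. The
field positions (command in bits 7:0, command set in bits 15:8, payload length
starting at bit 16) are assumptions to be verified against the mailbox section
of the spec, and every name below is hypothetical rather than taken from the
series.

```c
#include <assert.h>
#include <stdint.h>

#define MBOX_CMD_OFFSET 0x8 /* offset suggested in the comment above */

/* Assumed field positions; verify against the spec before relying on them. */
#define MBOX_CMD_COMMAND_SHIFT     0
#define MBOX_CMD_COMMAND_BITS      8
#define MBOX_CMD_COMMAND_SET_SHIFT 8
#define MBOX_CMD_COMMAND_SET_BITS  8
#define MBOX_CMD_LENGTH_SHIFT      16
#define MBOX_CMD_LENGTH_BITS       20

/* An 8-byte register at 0x8 ends exactly where MAILBOX_STS (0x10) begins. */
static_assert(MBOX_CMD_OFFSET + 8 == 0x10, "no gap before the status register");

/* Extract 'bits' bits starting at 'shift' from a 64-bit register value. */
static inline uint64_t extract_bits(uint64_t reg, unsigned shift, unsigned bits)
{
    return (reg >> shift) & ((UINT64_C(1) << bits) - 1);
}

static inline uint8_t mbox_cmd_command(uint64_t reg)
{
    return (uint8_t)extract_bits(reg, MBOX_CMD_COMMAND_SHIFT,
                                 MBOX_CMD_COMMAND_BITS);
}

static inline uint8_t mbox_cmd_command_set(uint64_t reg)
{
    return (uint8_t)extract_bits(reg, MBOX_CMD_COMMAND_SET_SHIFT,
                                 MBOX_CMD_COMMAND_SET_BITS);
}

static inline uint32_t mbox_cmd_length(uint64_t reg)
{
    return (uint32_t)extract_bits(reg, MBOX_CMD_LENGTH_SHIFT,
                                  MBOX_CMD_LENGTH_BITS);
}
```

Under these assumptions, a value of 0x40103 would decode as command 0x03 in
command set 0x01 with a 4-byte input payload.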

> +
> +/* XXX: actually a 64b register */
> +REG32(CXL_DEV_MAILBOX_STS, 0x10)
> +    FIELD(CXL_DEV_MAILBOX_STS, BG_OP, 0, 1)
> +    FIELD(CXL_DEV_MAILBOX_STS, ERRNO, 32, 16)
> +    FIELD(CXL_DEV_MAILBOX_STS, VENDOR_ERRNO, 48, 16)
> +
> +/* XXX: actually a 64b register */
> +REG32(CXL_DEV_BG_CMD_STS, 0x18)
> +    FIELD(CXL_DEV_BG_CMD_STS, BG, 0, 16)
> +    FIELD(CXL_DEV_BG_CMD_STS, DONE, 16, 7)
> +    FIELD(CXL_DEV_BG_CMD_STS, ERRNO, 32, 16)
> +    FIELD(CXL_DEV_BG_CMD_STS, VENDOR_ERRNO, 48, 16)
> +
> +REG32(CXL_DEV_CMD_PAYLOAD, 0x20)
Probably want a comment for this one that it might be huge.
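
To put a number on "huge": the 5-bit PAYLOAD_SIZE field quoted above holds an
exponent, and the emulation uses CXL_MAILBOX_PAYLOAD_SHIFT = 11. The sketch
below assumes the payload size is 2^n bytes for a field value of n; the helper
name is hypothetical and the upper bound a real device may report should be
taken from the spec, not from this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAYLOAD_SIZE_FIELD_BITS 5  /* FIELD(CXL_DEV_MAILBOX_CAP, PAYLOAD_SIZE, 0, 5) */
#define PAYLOAD_SHIFT_EMULATED  11 /* CXL_MAILBOX_PAYLOAD_SHIFT in the patch */

static_assert(PAYLOAD_SHIFT_EMULATED < (1 << PAYLOAD_SIZE_FIELD_BITS),
              "emulated exponent must fit in the 5-bit field");

/*
 * Hypothetical helper: payload size in bytes from a raw mailbox capability
 * register value, assuming PAYLOAD_SIZE encodes log2 of the size.
 */
static inline size_t mbox_payload_bytes(uint32_t mailbox_cap)
{
    uint32_t n = mailbox_cap & ((1u << PAYLOAD_SIZE_FIELD_BITS) - 1);

    return (size_t)1 << n;
}
```

With the emulated exponent this gives 2048 bytes; an exponent of 20 would
already mean a 1 MiB payload area following the register at 0x20.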

> +
> +#endif



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [RFC PATCH v3 04/31] hw/cxl/device: Implement the CAP array (8.2.8.1-2)
  2021-02-02  0:59   ` Ben Widawsky
@ 2021-02-02 12:23     ` Jonathan Cameron
  -1 siblings, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2021-02-02 12:23 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: qemu-devel, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Marcel Apfelbaum,
	Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

On Mon, 1 Feb 2021 16:59:21 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> This implements all device MMIO up to the first capability. That
> includes the CXL Device Capabilities Array Register, as well as all of
> the CXL Device Capability Header Registers. The latter are filled in as
> they are implemented in the following patches.
> 
> Endianness and alignment are managed by softmmu memory core.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
A few trivials
> ---
>  hw/cxl/cxl-device-utils.c   | 105 ++++++++++++++++++++++++++++++++++++
>  hw/cxl/meson.build          |   1 +
>  include/hw/cxl/cxl_device.h |  27 +++++++++-
>  3 files changed, 132 insertions(+), 1 deletion(-)
>  create mode 100644 hw/cxl/cxl-device-utils.c
> 
> diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
> new file mode 100644
> index 0000000000..bb15ad9a0f
> --- /dev/null
> +++ b/hw/cxl/cxl-device-utils.c
> @@ -0,0 +1,105 @@
> +/*
> + * CXL Utility library for devices
> + *
> + * Copyright(C) 2020 Intel Corporation.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See the
> + * COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu/log.h"
> +#include "hw/cxl/cxl.h"
> +
> +/*
> + * Device registers have no restrictions per the spec, and so fall back to the
> + * default memory mapped register rules in 8.2:
> + *   Software shall use CXL.io Memory Read and Write to access memory mapped
> + *   register defined in this section. Unless otherwise specified, software
> + *   shall restrict the accesses width based on the following:
> + *   • A 32 bit register shall   be accessed as a 1 Byte, 2 Bytes or 4 Bytes

odd spacing

> + *     quantity.
> + *   • A 64 bit register shall be accessed as a 1 Byte, 2 Bytes, 4 Bytes or 8
> + *     Bytes
> + *   • The address shall be a multiple of the access width, e.g. when
> + *     accessing a register as a 4 Byte quantity, the address shall be
> + *     multiple of 4.
> + *   • The accesses shall map to contiguous bytes.If these rules are not
> + *     followed, the behavior is undefined
> + */
> +
> +static uint64_t caps_reg_read(void *opaque, hwaddr offset, unsigned size)
> +{
> +    CXLDeviceState *cxl_dstate = opaque;
> +
> +    return cxl_dstate->caps_reg_state32[offset / 4];
> +}
> +
> +static uint64_t dev_reg_read(void *opaque, hwaddr offset, unsigned size)
> +{
> +    return 0;
> +}
> +
> +static const MemoryRegionOps dev_ops = {
> +    .read = dev_reg_read,
> +    .write = NULL, /* status register is read only */
> +    .endianness = DEVICE_LITTLE_ENDIAN,
> +    .valid = {
> +        .min_access_size = 1,
> +        .max_access_size = 8,
> +        .unaligned = false,
> +    },
> +    .impl = {
> +        .min_access_size = 1,
> +        .max_access_size = 8,
> +    },
> +};
> +
> +static const MemoryRegionOps caps_ops = {
> +    .read = caps_reg_read,
> +    .write = NULL, /* caps registers are read only */
> +    .endianness = DEVICE_LITTLE_ENDIAN,
> +    .valid = {
> +        .min_access_size = 1,
> +        .max_access_size = 8,
> +        .unaligned = false,
> +    },
> +    .impl = {
> +        .min_access_size = 4,
> +        .max_access_size = 4,
> +    },
> +};
> +
> +void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
> +{
> +    /* This will be a BAR, so needs to be rounded up to pow2 for PCI spec */
> +    memory_region_init(&cxl_dstate->device_registers, obj, "device-registers",
> +                       pow2ceil(CXL_MMIO_SIZE));
> +
> +    memory_region_init_io(&cxl_dstate->caps, obj, &caps_ops, cxl_dstate,
> +                          "cap-array", CXL_DEVICE_REGISTERS_OFFSET - 0);

Specifying a size in terms of the offset of another region isn't exactly
intuitive, so perhaps add a comment on why, or better yet use a size
parameter covering what is actually there rather than simply the region
below CXL_DEVICE_REGISTERS_OFFSET.
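
A sketch of what an explicit size could look like, using the constants from
this patch. It also shows what CXL_CAPS_SIZE evaluates to as quoted further
down, next to one possible reading of its "+1 for header" comment. Which of
the two is intended is for the author to confirm; the enum and helper names
here are hypothetical.

```c
#include <assert.h>

enum {
    CAP_REG_SIZE            = 0x10, /* CXL_DEVICE_CAP_REG_SIZE */
    CAPS_MAX                = 4,    /* CXL_DEVICE_CAPS_MAX */
    DEVICE_REGISTERS_OFFSET = 0x80, /* CXL_DEVICE_REGISTERS_OFFSET */

    /* As quoted: the +1 is added to the byte count. */
    CAPS_SIZE_AS_QUOTED = CAP_REG_SIZE * CAPS_MAX + 1,
    /* One reading of "+1 for header": one extra 16-byte slot. */
    CAPS_SIZE_EXPLICIT  = CAP_REG_SIZE * (CAPS_MAX + 1),
};

static_assert(CAPS_SIZE_AS_QUOTED == 65, "0x40 bytes plus one byte");
static_assert(CAPS_SIZE_EXPLICIT == 0x50, "array register plus four headers");
static_assert(CAPS_SIZE_EXPLICIT <= DEVICE_REGISTERS_OFFSET,
              "caps region must not overlap the device registers");

/* Number of 32-bit backing entries for a region of 'size_bytes' bytes. */
static inline unsigned int caps_state_dwords(unsigned int size_bytes)
{
    return size_bytes / 4;
}
```

Sizing the "cap-array" region with a constant like the second one would cover
exactly the array register and headers, leaving 0x50..0x7f unclaimed instead
of routing it to caps_reg_read().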


> +    memory_region_init_io(&cxl_dstate->device, obj, &dev_ops, cxl_dstate,
> +                          "device-status", CXL_DEVICE_REGISTERS_LENGTH);
> +
> +    memory_region_add_subregion(&cxl_dstate->device_registers, 0,
> +                                &cxl_dstate->caps);
> +    memory_region_add_subregion(&cxl_dstate->device_registers,
> +                                CXL_DEVICE_REGISTERS_OFFSET,
> +                                &cxl_dstate->device);
> +}
> +
> +static void device_reg_init_common(CXLDeviceState *cxl_dstate) { }
> +
> +void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
> +{
> +    uint32_t *cap_hdrs = cxl_dstate->caps_reg_state32;
> +    const int cap_count = 1;
> +
> +    /* CXL Device Capabilities Array Register */
> +    ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_ID, 0);
> +    ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_VERSION, 1);
> +    ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY2, CAP_COUNT, cap_count);
> +
> +    cxl_device_cap_init(cxl_dstate, DEVICE, 1);
> +    device_reg_init_common(cxl_dstate);
> +}
> diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build
> index 00c3876a0f..47154d6850 100644
> --- a/hw/cxl/meson.build
> +++ b/hw/cxl/meson.build
> @@ -1,3 +1,4 @@
>  softmmu_ss.add(when: 'CONFIG_CXL', if_true: files(
>    'cxl-component-utils.c',
> +  'cxl-device-utils.c',
>  ))
> diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> index a85f250503..f3bcf19410 100644
> --- a/include/hw/cxl/cxl_device.h
> +++ b/include/hw/cxl/cxl_device.h
> @@ -58,6 +58,8 @@
>  #define CXL_DEVICE_CAP_HDR1_OFFSET 0x10 /* Figure 138 */
>  #define CXL_DEVICE_CAP_REG_SIZE 0x10 /* 8.2.8.2 */
>  #define CXL_DEVICE_CAPS_MAX 4 /* 8.2.8.2.1 + 8.2.8.5 */
> +#define CXL_CAPS_SIZE \
> +    (CXL_DEVICE_CAP_REG_SIZE * CXL_DEVICE_CAPS_MAX + 1) /* +1 for header */
>  
>  #define CXL_DEVICE_REGISTERS_OFFSET 0x80 /* Read comment above */
>  #define CXL_DEVICE_REGISTERS_LENGTH 0x8 /* 8.2.8.3.1 */
> @@ -70,11 +72,18 @@
>  #define CXL_MAILBOX_REGISTERS_LENGTH \
>      (CXL_MAILBOX_REGISTERS_SIZE + CXL_MAILBOX_MAX_PAYLOAD_SIZE)
>  
> +#define CXL_MMIO_SIZE                                       \
> +    CXL_DEVICE_CAP_REG_SIZE + CXL_DEVICE_REGISTERS_LENGTH + \
> +        CXL_MAILBOX_REGISTERS_LENGTH
> +
>  typedef struct cxl_device_state {
>      MemoryRegion device_registers;
>  
>      /* mmio for device capabilities array - 8.2.8.2 */
> -    MemoryRegion caps;
> +    struct {
> +        MemoryRegion caps;
> +        uint32_t caps_reg_state32[CXL_CAPS_SIZE / 4];
> +    };

With this unnamed, what is the benefit of having these two in a
struct?  The naming makes it clear they are related anyway.
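
To illustrate the point: members of a C11 anonymous struct are accessed as if
they were direct members of the enclosing struct, so the grouping is purely
visual. The sketch below uses a stand-in type for MemoryRegion and an
arbitrary array length; all names are hypothetical, and the layout equality
holds for this particular example rather than in general.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { int placeholder; } FakeRegion; /* stand-in for MemoryRegion */

struct grouped {
    struct { /* anonymous struct, as in the patch */
        FakeRegion caps;
        uint32_t caps_reg_state32[16];
    };
    FakeRegion device;
};

struct flat {
    FakeRegion caps;
    uint32_t caps_reg_state32[16];
    FakeRegion device;
};

/* Same member offsets and total size either way for this layout. */
static_assert(offsetof(struct grouped, caps_reg_state32) ==
              offsetof(struct flat, caps_reg_state32), "same offset");
static_assert(sizeof(struct grouped) == sizeof(struct flat), "same size");

/* Access looks identical: no intermediate member name is needed. */
static inline uint32_t *grouped_state(struct grouped *g)
{
    return g->caps_reg_state32;
}
```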

>  
>      /* mmio for the device status registers 8.2.8.3 */
>      MemoryRegion device;
> @@ -126,6 +135,22 @@ CXL_DEVICE_CAPABILITY_HEADER_REGISTER(DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET)
>  CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MAILBOX, CXL_DEVICE_CAP_HDR1_OFFSET + \
>                                                 CXL_DEVICE_CAP_REG_SIZE)
>  
> +#define cxl_device_cap_init(dstate, reg, cap_id)                                   \
> +    do {                                                                           \
> +        uint32_t *cap_hdrs = dstate->caps_reg_state32;                             \
> +        int which = R_CXL_DEV_##reg##_CAP_HDR0;                                    \
> +        cap_hdrs[which] =                                                          \
> +            FIELD_DP32(cap_hdrs[which], CXL_DEV_##reg##_CAP_HDR0, CAP_ID, cap_id); \
> +        cap_hdrs[which] = FIELD_DP32(                                              \
> +            cap_hdrs[which], CXL_DEV_##reg##_CAP_HDR0, CAP_VERSION, 1);            \
> +        cap_hdrs[which + 1] =                                                      \
> +            FIELD_DP32(cap_hdrs[which + 1], CXL_DEV_##reg##_CAP_HDR1,              \
> +                       CAP_OFFSET, CXL_##reg##_REGISTERS_OFFSET);                  \
> +        cap_hdrs[which + 2] =                                                      \
> +            FIELD_DP32(cap_hdrs[which + 2], CXL_DEV_##reg##_CAP_HDR2,              \
> +                       CAP_LENGTH, CXL_##reg##_REGISTERS_LENGTH);                  \
> +    } while (0)
> +
>  REG32(CXL_DEV_MAILBOX_CAP, 0)
>      FIELD(CXL_DEV_MAILBOX_CAP, PAYLOAD_SIZE, 0, 5)
>      FIELD(CXL_DEV_MAILBOX_CAP, INT_CAP, 5, 1)


^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [RFC PATCH v3 07/31] hw/cxl/device: Add cheap EVENTS implementation (8.2.9.1)
  2021-02-02  0:59   ` Ben Widawsky
@ 2021-02-02 13:44     ` Jonathan Cameron
  -1 siblings, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2021-02-02 13:44 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: qemu-devel, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Marcel Apfelbaum,
	Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

On Mon, 1 Feb 2021 16:59:24 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> Using the previously implemented stubbed helpers, it is now possible to
> easily add the missing, required commands to the implementation.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
comment inline. Otherwise LGTM.
> ---
>  hw/cxl/cxl-mailbox-utils.c | 23 ++++++++++++++++++++++-
>  1 file changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> index 466055b01a..7c939a1851 100644
> --- a/hw/cxl/cxl-mailbox-utils.c
> +++ b/hw/cxl/cxl-mailbox-utils.c
> @@ -37,6 +37,14 @@
>   *  a register interface that already deals with it.
>   */
>  
> +enum {
> +    EVENTS      = 0x01,
> +        #define GET_RECORDS   0x0
> +        #define CLEAR_RECORDS   0x1
> +        #define GET_INTERRUPT_POLICY   0x2
> +        #define SET_INTERRUPT_POLICY   0x3
> +};
> +
>  /* 8.2.8.4.5.1 Command Return Codes */
>  typedef enum {
>      CXL_MBOX_SUCCESS = 0x0,
> @@ -105,10 +113,23 @@ struct cxl_cmd {
>          return CXL_MBOX_SUCCESS;                                          \
>      }
>  
> +define_mailbox_handler_zeroed(EVENTS_GET_RECORDS, 0x20);
> +define_mailbox_handler_nop(EVENTS_CLEAR_RECORDS);
> +define_mailbox_handler_zeroed(EVENTS_GET_INTERRUPT_POLICY, 4);

Ideally add section reference comments for these.  Makes my life a tiny
bit easier as a reviewer!
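
A standalone sketch of the table pattern used in this patch, with a section
reference comment placed next to each handler. The struct, handler signature,
and dispatch function are simplified and hypothetical; only the [set][command]
designated-initializer idea and the 0x20 zeroed reply are taken from the
quoted code, and the comments cite only the section named in the subject line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef int (*cmd_handler)(uint8_t *payload, uint16_t *len);

struct cmd {
    const char *name;
    cmd_handler handler;
};

enum { EVENTS = 0x01 };
enum { GET_RECORDS = 0x0, CLEAR_RECORDS = 0x1 };

/* 8.2.9.1: reply with a zeroed 0x20-byte payload. */
static int cmd_events_get_records(uint8_t *payload, uint16_t *len)
{
    memset(payload, 0, 0x20);
    *len = 0x20;
    return 0;
}

/* 8.2.9.1: accept the command and return no payload. */
static int cmd_events_clear_records(uint8_t *payload, uint16_t *len)
{
    (void)payload;
    *len = 0;
    return 0;
}

/* Sparse table indexed by [command set][command]; unset slots are zeroed. */
static const struct cmd cmd_set[256][256] = {
    [EVENTS][GET_RECORDS]   = { "EVENTS_GET_RECORDS", cmd_events_get_records },
    [EVENTS][CLEAR_RECORDS] = { "EVENTS_CLEAR_RECORDS", cmd_events_clear_records },
};

/* Returns the handler's result, or -1 if the command is not implemented. */
static int mbox_dispatch(uint8_t set, uint8_t cmd, uint8_t *payload,
                         uint16_t *len)
{
    const struct cmd *c = &cmd_set[set][cmd];

    if (!c->handler) {
        return -1;
    }
    return c->handler(payload, len);
}
```

Because unlisted entries are zero-initialized, a NULL handler check is enough
to reject unsupported opcodes.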

> +define_mailbox_handler_nop(EVENTS_SET_INTERRUPT_POLICY);
> +
> +#define IMMEDIATE_CONFIG_CHANGE (1 << 1)
> +#define IMMEDIATE_LOG_CHANGE (1 << 4)
> +
>  #define CXL_CMD(s, c, in, cel_effect) \
>      [s][c] = { stringify(s##_##c), cmd_##s##_##c, in, cel_effect }
>  
> -static struct cxl_cmd cxl_cmd_set[256][256] = {};
> +static struct cxl_cmd cxl_cmd_set[256][256] = {
> +    CXL_CMD(EVENTS, GET_RECORDS, 1, 0),
> +    CXL_CMD(EVENTS, CLEAR_RECORDS, ~0, IMMEDIATE_LOG_CHANGE),
> +    CXL_CMD(EVENTS, GET_INTERRUPT_POLICY, 0, 0),
> +    CXL_CMD(EVENTS, SET_INTERRUPT_POLICY, 4, IMMEDIATE_CONFIG_CHANGE),
> +};
>  
>  #undef CXL_CMD
>  


^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [RFC PATCH v3 10/31] hw/pxb: Use a type for realizing expanders
  2021-02-02  0:59   ` Ben Widawsky
@ 2021-02-02 13:50     ` Jonathan Cameron
  -1 siblings, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2021-02-02 13:50 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: qemu-devel, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Marcel Apfelbaum,
	Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

On Mon, 1 Feb 2021 16:59:27 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> This opens up the possibility for more types of expanders (other than
> PCI and PCIe). We'll need this to create a CXL expander.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Minor suggestion inline but nothing important if you don't want to change it.

Jonathan

> ---
>  hw/pci-bridge/pci_expander_bridge.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
> index aedded1064..232b7ce305 100644
> --- a/hw/pci-bridge/pci_expander_bridge.c
> +++ b/hw/pci-bridge/pci_expander_bridge.c
> @@ -24,6 +24,8 @@
>  #include "hw/boards.h"
>  #include "qom/object.h"
>  
> +enum BusType { PCI, PCIE };
> +
>  #define TYPE_PXB_BUS "pxb-bus"
>  typedef struct PXBBus PXBBus;
>  DECLARE_INSTANCE_CHECKER(PXBBus, PXB_BUS,
> @@ -214,7 +216,8 @@ static gint pxb_compare(gconstpointer a, gconstpointer b)
>             0;
>  }
>  
> -static void pxb_dev_realize_common(PCIDevice *dev, bool pcie, Error **errp)
> +static void pxb_dev_realize_common(PCIDevice *dev, enum BusType type,
> +                                   Error **errp)
>  {
>      PXBDev *pxb = convert_to_pxb(dev);
>      DeviceState *ds, *bds = NULL;
> @@ -239,7 +242,7 @@ static void pxb_dev_realize_common(PCIDevice *dev, bool pcie, Error **errp)
>      }
>  
>      ds = qdev_new(TYPE_PXB_HOST);
> -    if (pcie) {
> +    if (type == PCIE) {

I'd make this a switch statement now, given we are about to have 3 entries and may
well get more in the future.
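
A rough sketch of what the switch form could look like, kept standalone so it
compiles outside QEMU. The CXL enumerator, the helper name pxb_bus_type_name()
and the "pxb-cxl-bus" string are assumptions for illustration only; "pxb-bus"
is the TYPE_PXB_BUS value from the patch context.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the enum added by the patch, plus a hypothetical third entry. */
enum BusType { PCI, PCIE, CXL };

/*
 * Hypothetical helper: map an expander type to the bus type name that
 * would be handed to pci_root_bus_new(). With no default label, a
 * compiler using -Wswitch can flag any enumerator added later that is
 * not handled here.
 */
static const char *pxb_bus_type_name(enum BusType type)
{
    switch (type) {
    case PCI:
        return "pxb-bus";
    case PCIE:
        return "pxb-pcie-bus";
    case CXL:
        return "pxb-cxl-bus"; /* placeholder name, not from the patch */
    }

    /* Only reached if a value outside the enum is passed in. */
    assert(!"unknown bus type");
    return NULL;
}
```

In the real function each case would call pci_root_bus_new() directly; the
string lookup is only there to keep the example self-contained.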

>          bus = pci_root_bus_new(ds, dev_name, NULL, NULL, 0, TYPE_PXB_PCIE_BUS);
>      } else {
>          bus = pci_root_bus_new(ds, "pxb-internal", NULL, NULL, 0, TYPE_PXB_BUS);
> @@ -287,7 +290,7 @@ static void pxb_dev_realize(PCIDevice *dev, Error **errp)
>          return;
>      }
>  
> -    pxb_dev_realize_common(dev, false, errp);
> +    pxb_dev_realize_common(dev, PCI, errp);
>  }
>  
>  static void pxb_dev_exitfn(PCIDevice *pci_dev)
> @@ -339,7 +342,7 @@ static void pxb_pcie_dev_realize(PCIDevice *dev, Error **errp)
>          return;
>      }
>  
> -    pxb_dev_realize_common(dev, true, errp);
> +    pxb_dev_realize_common(dev, PCIE, errp);
>  }
>  
>  static void pxb_pcie_dev_class_init(ObjectClass *klass, void *data)


^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [RFC PATCH v3 14/31] acpi/pci: Consolidate host bridge setup
  2021-02-02  0:59   ` Ben Widawsky
@ 2021-02-02 13:56     ` Jonathan Cameron
  -1 siblings, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2021-02-02 13:56 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: qemu-devel, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Marcel Apfelbaum,
	Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

On Mon, 1 Feb 2021 16:59:31 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> This cleanup will make it easier to add support for CXL to the mix.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
One trivial comment, but it's up to you whether you think it's worth changing.


> ---
>  hw/i386/acpi-build.c | 31 +++++++++++++++++--------------
>  1 file changed, 17 insertions(+), 14 deletions(-)
> 
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index f56d699c7f..cf6eb54c22 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -1194,6 +1194,20 @@ static void build_smb0(Aml *table, I2CBus *smbus, int devnr, int func)
>      aml_append(table, scope);
>  }
>  
> +enum { PCI, PCIE };
> +static void init_pci_acpi(Aml *dev, int uid, int type)
> +{
> +    if (type == PCI) {
> +        aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
> +        aml_append(dev, aml_name_decl("_UID", aml_int(uid)));
> +    } else {
> +        aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
> +        aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
> +        aml_append(dev, aml_name_decl("_UID", aml_int(uid)));
> +        aml_append(dev, build_q35_osc_method());
> +    }

Gut feeling is that this will end up nicer as a switch. I suspect this isn't
the last time this will get extended!

> +}
> +
>  static void
>  build_dsdt(GArray *table_data, BIOSLinker *linker,
>             AcpiPmInfo *pm, AcpiMiscInfo *misc,
> @@ -1222,9 +1236,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>      if (misc->is_piix4) {
>          sb_scope = aml_scope("_SB");
>          dev = aml_device("PCI0");
> -        aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
> +        init_pci_acpi(dev, 0, PCI);
>          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> -        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>          aml_append(sb_scope, dev);
>          aml_append(dsdt, sb_scope);
>  
> @@ -1238,11 +1251,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>      } else {
>          sb_scope = aml_scope("_SB");
>          dev = aml_device("PCI0");
> -        aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
> -        aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
> +        init_pci_acpi(dev, 0, PCIE);
>          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> -        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
> -        aml_append(dev, build_q35_osc_method());
>          aml_append(sb_scope, dev);
>  
>          if (pm->smi_on_cpuhp) {
> @@ -1345,15 +1355,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>  
>              scope = aml_scope("\\_SB");
>              dev = aml_device("PC%.02X", bus_num);
> -            aml_append(dev, aml_name_decl("_UID", aml_int(bus_num)));
>              aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num)));
> -            if (pci_bus_is_express(bus)) {
> -                aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
> -                aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
> -                aml_append(dev, build_q35_osc_method());
> -            } else {
> -                aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
> -            }
> +            init_pci_acpi(dev, bus_num, pci_bus_is_express(bus) ? PCIE : PCI);
>  
>              if (numa_node != NUMA_NODE_UNASSIGNED) {
>                  aml_append(dev, aml_name_decl("_PXM", aml_int(numa_node)));


^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [RFC PATCH v3 21/31] hw/cxl/device: Add a memory device (8.2.8.5)
  2021-02-02  0:59   ` Ben Widawsky
  (?)
@ 2021-02-02 14:26   ` Eric Blake
  2021-02-02 15:06       ` Ben Widawsky
  -1 siblings, 1 reply; 117+ messages in thread
From: Eric Blake @ 2021-02-02 14:26 UTC (permalink / raw)
  To: Ben Widawsky, qemu-devel
  Cc: David Hildenbrand, Vishal Verma, John Groves (jgroves),
	Chris Browy, Markus Armbruster, linux-cxl,
	Philippe Mathieu-Daudé,
	Michael S. Tsirkin, Jonathan Cameron, Igor Mammedov,
	Dan Williams, Ira Weiny

On 2/1/21 6:59 PM, Ben Widawsky wrote:
> A CXL memory device (AKA Type 3) is a CXL component that contains some
> combination of volatile and persistent memory. It also implements the
> previously defined mailbox interface as well as the memory device
> firmware interface.
> 
> Although the memory device is configured like a normal PCIe device, the
> memory traffic is on an entirely separate bus conceptually (using the
> same physical wires as PCIe, but different protocol).
> 
> The guest physical address for the memory device is part of a larger
> window which is owned by the platform. Currently, this is hardcoded as
> an object property on host bridge (PXB) creation, but that will need to
> change for interleaving.
> 
> The following example will create a 256M device in a 512M window:
> -object "memory-backend-file,id=cxl-mem1,share,mem-path=cxl-type3,size=512M"
> -device "cxl-type3,bus=rp0,memdev=cxl-mem1,id=cxl-pmem0,size=256M"
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---

> +++ b/qapi/machine.json
> @@ -1394,6 +1394,7 @@
>  { 'union': 'MemoryDeviceInfo',
>    'data': { 'dimm': 'PCDIMMDeviceInfo',
>              'nvdimm': 'PCDIMMDeviceInfo',
> +            'cxl': 'PCDIMMDeviceInfo',
>              'virtio-pmem': 'VirtioPMEMDeviceInfo',
>              'virtio-mem': 'VirtioMEMDeviceInfo'
>            }

Missing documentation that 'cxl' was introduced in 6.0.  Also, is it
worth keeping the branches of this union in lexicographic order?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [RFC PATCH v3 05/31] hw/cxl/device: Implement basic mailbox (8.2.8.4)
  2021-02-02  0:59   ` Ben Widawsky
@ 2021-02-02 14:58     ` Jonathan Cameron
  -1 siblings, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2021-02-02 14:58 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: qemu-devel, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Marcel Apfelbaum,
	Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

On Mon, 1 Feb 2021 16:59:22 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> This is the beginning of implementing mailbox support for CXL 2.0
> devices. The implementation recognizes when the doorbell is rung,
> handles the command/payload, clears the doorbell while returning error
> codes and data.
> 
> Generally the mailbox mechanism is designed to permit communication
> between the host OS and the firmware running on the device. For our
> purposes, we emulate both the firmware, implemented primarily in
> cxl-mailbox-utils.c, and the hardware.
> 
> No commands are implemented yet.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
>  hw/cxl/cxl-device-utils.c   | 125 ++++++++++++++++++++++-
>  hw/cxl/cxl-mailbox-utils.c  | 197 ++++++++++++++++++++++++++++++++++++
>  hw/cxl/meson.build          |   1 +
>  include/hw/cxl/cxl.h        |   3 +
>  include/hw/cxl/cxl_device.h |  28 ++++-
>  5 files changed, 349 insertions(+), 5 deletions(-)
>  create mode 100644 hw/cxl/cxl-mailbox-utils.c
> 
> diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
> index bb15ad9a0f..6602606f3d 100644
> --- a/hw/cxl/cxl-device-utils.c
> +++ b/hw/cxl/cxl-device-utils.c
> @@ -40,6 +40,111 @@ static uint64_t dev_reg_read(void *opaque, hwaddr offset, unsigned size)
>      return 0;
>  }
>  
> +static uint64_t mailbox_reg_read(void *opaque, hwaddr offset, unsigned size)
> +{
> +    CXLDeviceState *cxl_dstate = opaque;
> +
> +    switch (size) {
> +    case 8:
> +        return cxl_dstate->mbox_reg_state64[offset / 8];
> +    case 4:
> +        return cxl_dstate->mbox_reg_state32[offset / 4];

Numeric order (case 4 before case 8) seems more natural and I can't see a
reason not to do it. You also do them in that order below.

> +    default:
> +        g_assert_not_reached();
> +    }
> +}
> +
> +static void mailbox_mem_writel(uint32_t *reg_state, hwaddr offset,
> +                               uint64_t value)
> +{
> +    switch (offset) {
> +    case A_CXL_DEV_MAILBOX_CTRL:
> +        /* fallthrough */
> +    case A_CXL_DEV_MAILBOX_CAP:
> +        /* RO register */
> +        break;
> +    default:
> +        qemu_log_mask(LOG_UNIMP,
> +                      "%s Unexpected 32-bit access to 0x%" PRIx64 " (WI)\n",
> +                      __func__, offset);
> +        break;
> +    }
> +
> +    reg_state[offset / 4] = value;
> +}
> +
> +static void mailbox_mem_writeq(uint64_t *reg_state, hwaddr offset,
> +                               uint64_t value)
> +{
> +    switch (offset) {
> +    case A_CXL_DEV_MAILBOX_CMD:
> +        break;
> +    case A_CXL_DEV_BG_CMD_STS:
> +        /* BG not supported */
> +        /* fallthrough */
> +    case A_CXL_DEV_MAILBOX_STS:
> +        /* Read only register, will get updated by the state machine */
> +        return;
> +    default:
> +        qemu_log_mask(LOG_UNIMP,
> +                      "%s Unexpected 64-bit access to 0x%" PRIx64 " (WI)\n",
> +                      __func__, offset);
> +        return;
> +    }
> +
> +
> +    reg_state[offset / 8] = value;
> +}
> +
> +static void mailbox_reg_write(void *opaque, hwaddr offset, uint64_t value,
> +                              unsigned size)
> +{
> +    CXLDeviceState *cxl_dstate = opaque;
> +
> +    if (offset >= A_CXL_DEV_CMD_PAYLOAD) {
> +        memcpy(cxl_dstate->mbox_reg_state + offset, &value, size);
> +        return;
> +    }
> +
> +    /*
> +     * Lock is needed to prevent concurrent writes as well as to prevent writes
> +     * coming in while the firmware is processing. Without background commands
> +     * or the second mailbox implemented, this serves no purpose since the
> +     * memory access is synchronized at a higher level (per memory region).
> +     */
> +    RCU_READ_LOCK_GUARD();
> +
> +    switch (size) {
> +    case 4:
> +        mailbox_mem_writel(cxl_dstate->mbox_reg_state32, offset, value);
> +        break;
> +    case 8:
> +        mailbox_mem_writeq(cxl_dstate->mbox_reg_state64, offset, value);
> +        break;
> +    default:
> +        g_assert_not_reached();
> +    }
> +
> +    if (ARRAY_FIELD_EX32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL,
> +                         DOORBELL))
> +        cxl_process_mailbox(cxl_dstate);
> +}
> +
> +static const MemoryRegionOps mailbox_ops = {
> +    .read = mailbox_reg_read,
> +    .write = mailbox_reg_write,
> +    .endianness = DEVICE_LITTLE_ENDIAN,
> +    .valid = {
> +        .min_access_size = 1,
> +        .max_access_size = 8,
> +        .unaligned = false,
> +    },
> +    .impl = {
> +        .min_access_size = 4,
> +        .max_access_size = 8,
> +    },
> +};
> +
>  static const MemoryRegionOps dev_ops = {
>      .read = dev_reg_read,
>      .write = NULL, /* status register is read only */
> @@ -80,20 +185,33 @@ void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
>                            "cap-array", CXL_DEVICE_REGISTERS_OFFSET - 0);
>      memory_region_init_io(&cxl_dstate->device, obj, &dev_ops, cxl_dstate,
>                            "device-status", CXL_DEVICE_REGISTERS_LENGTH);
> +    memory_region_init_io(&cxl_dstate->mailbox, obj, &mailbox_ops, cxl_dstate,
> +                          "mailbox", CXL_MAILBOX_REGISTERS_LENGTH);
>  
>      memory_region_add_subregion(&cxl_dstate->device_registers, 0,
>                                  &cxl_dstate->caps);
>      memory_region_add_subregion(&cxl_dstate->device_registers,
>                                  CXL_DEVICE_REGISTERS_OFFSET,
>                                  &cxl_dstate->device);
> +    memory_region_add_subregion(&cxl_dstate->device_registers,
> +                                CXL_MAILBOX_REGISTERS_OFFSET,
> +                                &cxl_dstate->mailbox);
>  }
>  
>  static void device_reg_init_common(CXLDeviceState *cxl_dstate) { }
>  
> +static void mailbox_reg_init_common(CXLDeviceState *cxl_dstate)
> +{
> +    /* 2048 payload size, with no interrupt or background support */
> +    ARRAY_FIELD_DP32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CAP,
> +                     PAYLOAD_SIZE, CXL_MAILBOX_PAYLOAD_SHIFT);
> +    cxl_dstate->payload_size = CXL_MAILBOX_MAX_PAYLOAD_SIZE;
> +}
> +
>  void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
>  {
>      uint32_t *cap_hdrs = cxl_dstate->caps_reg_state32;
> -    const int cap_count = 1;
> +    const int cap_count = 2;
>  
>      /* CXL Device Capabilities Array Register */
>      ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_ID, 0);
> @@ -102,4 +220,9 @@ void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
>  
>      cxl_device_cap_init(cxl_dstate, DEVICE, 1);
>      device_reg_init_common(cxl_dstate);
> +
> +    cxl_device_cap_init(cxl_dstate, MAILBOX, 2);
> +    mailbox_reg_init_common(cxl_dstate);
> +
> +    assert(cxl_initialize_mailbox(cxl_dstate) == 0);
>  }
> diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> new file mode 100644
> index 0000000000..466055b01a
> --- /dev/null
> +++ b/hw/cxl/cxl-mailbox-utils.c
> @@ -0,0 +1,197 @@
> +/*
> + * CXL Utility library for mailbox interface
> + *
> + * Copyright(C) 2020 Intel Corporation.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See the
> + * COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "hw/cxl/cxl.h"
> +#include "hw/pci/pci.h"
> +#include "qemu/log.h"
> +#include "qemu/uuid.h"
> +
> +/*
> + * How to add a new command, example. The command set FOO, with cmd BAR.
> + *  1. Add the command set and cmd to the enum.
> + *     FOO    = 0x7f,
> + *          #define BAR 0
> + *  2. Forward declare the handler.
> + *     declare_mailbox_handler(FOO_BAR);
> + *  3. Add the command to the cxl_cmd_set[][]
> + *     CXL_CMD(FOO, BAR, 0, 0),
> + *  4. Implement your handler
> + *     define_mailbox_handler(FOO_BAR) { ... return CXL_MBOX_SUCCESS; }
> + *
> + *
> + *  Writing the handler:
> + *    The handler will provide the &struct cxl_cmd, the &CXLDeviceState, and the
> + *    in/out length of the payload. The handler is responsible for consuming the
> + *    payload from cmd->payload and operating upon it as necessary. It must then
> + *    fill the output data into cmd->payload (overwriting what was there),
> + *    setting the length, and returning a valid return code.
> + *
> + *  XXX: The handler need not worry about endianess. The payload is read out of
> + *  a register interface that already deals with it.
> + */
> +
> +/* 8.2.8.4.5.1 Command Return Codes */
> +typedef enum {
> +    CXL_MBOX_SUCCESS = 0x0,
> +    CXL_MBOX_BG_STARTED = 0x1,
> +    CXL_MBOX_INVALID_INPUT = 0x2,
> +    CXL_MBOX_UNSUPPORTED = 0x3,
> +    CXL_MBOX_INTERNAL_ERROR = 0x4,
> +    CXL_MBOX_RETRY_REQUIRED = 0x5,
> +    CXL_MBOX_BUSY = 0x6,
> +    CXL_MBOX_MEDIA_DISABLED = 0x7,
> +    CXL_MBOX_FW_XFER_IN_PROGRESS = 0x8,
> +    CXL_MBOX_FW_XFER_OUT_OF_ORDER = 0x9,
> +    CXL_MBOX_FW_AUTH_FAILED = 0xa,
> +    CXL_MBOX_FW_INVALID_SLOT = 0xb,
> +    CXL_MBOX_FW_ROLLEDBACK = 0xc,
> +    CXL_MBOX_FW_REST_REQD = 0xd,
> +    CXL_MBOX_INVALID_HANDLE = 0xe,
> +    CXL_MBOX_INVALID_PA = 0xf,
> +    CXL_MBOX_INJECT_POISON_LIMIT = 0x10,
> +    CXL_MBOX_PERMANENT_MEDIA_FAILURE = 0x11,
> +    CXL_MBOX_ABORTED = 0x12,
> +    CXL_MBOX_INVALID_SECURITY_STATE = 0x13,
> +    CXL_MBOX_INCORRECT_PASSPHRASE = 0x14,
> +    CXL_MBOX_UNSUPPORTED_MAILBOX = 0x15,
> +    CXL_MBOX_INVALID_PAYLOAD_LENGTH = 0x16,
> +    CXL_MBOX_MAX = 0x17
> +} ret_code;
> +
> +struct cxl_cmd;
> +typedef ret_code (*opcode_handler)(struct cxl_cmd *cmd,
> +                                   CXLDeviceState *cxl_dstate, uint16_t *len);
> +struct cxl_cmd {
> +    const char *name;
> +    opcode_handler handler;
> +    ssize_t in;
> +    uint16_t effect; /* Reported in CEL */
> +    uint8_t *payload;

This payload pointer feels somewhat out of place. Perhaps it should be a parameter
passed to the opcode_handler()?  The address of the payload doesn't feel like
part of the command as such, so you are just using it as somewhere to stash
the address when passing to the handler.
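
A minimal sketch of that alternative, assuming the handler signature simply
grows a payload argument. The CXLDeviceState stand-in and the handler
cmd_zeroed_example() are hypothetical and exist only so the fragment compiles
on its own; the remaining names follow the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

typedef enum { CXL_MBOX_SUCCESS = 0x0 } ret_code;

/* Stand-in for QEMU's CXLDeviceState, reduced to what the sketch needs. */
typedef struct CXLDeviceState {
    size_t payload_size;
} CXLDeviceState;

struct cxl_cmd;

/* The payload buffer is passed explicitly instead of living in cxl_cmd. */
typedef ret_code (*opcode_handler)(struct cxl_cmd *cmd,
                                   CXLDeviceState *cxl_dstate,
                                   uint8_t *payload, uint16_t *len);

struct cxl_cmd {
    const char *name;
    opcode_handler handler;
    ssize_t in;
    uint16_t effect; /* Reported in CEL */
};

/* Hypothetical handler: returns four zeroed bytes of output payload. */
static ret_code cmd_zeroed_example(struct cxl_cmd *cmd,
                                   CXLDeviceState *cxl_dstate,
                                   uint8_t *payload, uint16_t *len)
{
    (void)cmd;

    *len = 4;
    assert(*len <= cxl_dstate->payload_size);
    memset(payload, 0, *len);
    return CXL_MBOX_SUCCESS;
}
```

The caller would then pass mbox_reg_state + A_CXL_DEV_CMD_PAYLOAD at the call
site rather than storing it in the command table entry first.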


> +};
> +
> +#define define_mailbox_handler(name)                \
> +    static ret_code cmd_##name(struct cxl_cmd *cmd, \
> +                               CXLDeviceState *cxl_dstate, uint16_t *len)
> +#define declare_mailbox_handler(name) define_mailbox_handler(name)
> +
> +#define define_mailbox_handler_zeroed(name, size)                         \
> +    uint16_t __zero##name = size;                                         \
> +    static ret_code cmd_##name(struct cxl_cmd *cmd,                       \
> +                               CXLDeviceState *cxl_dstate, uint16_t *len) \
> +    {                                                                     \
> +        *len = __zero##name;                                              \
> +        memset(cmd->payload, 0, *len);                                    \
> +        return CXL_MBOX_SUCCESS;                                          \
> +    }
> +#define define_mailbox_handler_const(name, data)                          \
> +    static ret_code cmd_##name(struct cxl_cmd *cmd,                       \
> +                               CXLDeviceState *cxl_dstate, uint16_t *len) \
> +    {                                                                     \
> +        *len = sizeof(data);                                              \
> +        memcpy(cmd->payload, data, *len);                                 \
> +        return CXL_MBOX_SUCCESS;                                          \
> +    }
> +#define define_mailbox_handler_nop(name)                                  \
> +    static ret_code cmd_##name(struct cxl_cmd *cmd,                       \
> +                               CXLDeviceState *cxl_dstate, uint16_t *len) \
> +    {                                                                     \
> +        return CXL_MBOX_SUCCESS;                                          \
> +    }
> +
> +#define CXL_CMD(s, c, in, cel_effect) \
> +    [s][c] = { stringify(s##_##c), cmd_##s##_##c, in, cel_effect }
> +
> +static struct cxl_cmd cxl_cmd_set[256][256] = {};
> +
> +#undef CXL_CMD
> +
> +QemuUUID cel_uuid;
> +
> +void cxl_process_mailbox(CXLDeviceState *cxl_dstate)
> +{
> +    uint16_t ret = CXL_MBOX_SUCCESS;
> +    struct cxl_cmd *cxl_cmd;
> +    uint64_t status_reg;
> +    opcode_handler h;
> +
> +    /*
> +     * current state of mailbox interface
> +     *  mbox_cap_reg = cxl_dstate->reg_state32[R_CXL_DEV_MAILBOX_CAP];
> +     *  mbox_ctrl_reg = cxl_dstate->reg_state32[R_CXL_DEV_MAILBOX_CTRL];
> +     *  status_reg = *(uint64_t *)&cxl_dstate->reg_state[A_CXL_DEV_MAILBOX_STS];
> +     */
> +    uint64_t command_reg =
> +        *(uint64_t *)&cxl_dstate->mbox_reg_state[A_CXL_DEV_MAILBOX_CMD];
> +
> +    /* Check if we have to do anything */
> +    if (!ARRAY_FIELD_EX32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL,
> +                          DOORBELL)) {
> +        qemu_log_mask(LOG_UNIMP, "Corrupt internal state for firmware\n");
> +        return;
> +    }
> +
> +    uint8_t set = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND_SET);
> +    uint8_t cmd = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND);
> +    uint16_t len = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, LENGTH);
> +    cxl_cmd = &cxl_cmd_set[set][cmd];
> +    h = cxl_cmd->handler;
> +    if (!h) {

This path seems to not convey information it perhaps should.  Maybe some feedback that
a command was requested that isn't registered?
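
One possible shape for that feedback, sketched standalone. Returning
CXL_MBOX_UNSUPPORTED for an unregistered opcode is an assumption about what
the fix should be, the handler signature is simplified to take no arguments,
and fprintf() stands in for qemu_log_mask(LOG_UNIMP, ...). The helper name
dispatch() is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

typedef enum {
    CXL_MBOX_SUCCESS = 0x0,
    CXL_MBOX_UNSUPPORTED = 0x3,
} ret_code;

/* Simplified handler type; the real one takes cmd, device state and len. */
typedef ret_code (*opcode_handler)(void);

struct cxl_cmd {
    const char *name;
    opcode_handler handler;
};

static ret_code cmd_example(void)
{
    return CXL_MBOX_SUCCESS;
}

/* One registered entry (EVENTS = 0x01, GET_RECORDS = 0x0); the rest are empty. */
static struct cxl_cmd cxl_cmd_set[256][256] = {
    [0x01][0x00] = { "EVENTS_GET_RECORDS", cmd_example },
};

/* Look up and run a command, reporting opcodes that have no handler. */
static ret_code dispatch(uint8_t set, uint8_t cmd)
{
    struct cxl_cmd *c = &cxl_cmd_set[set][cmd];

    if (!c->handler) {
        fprintf(stderr, "unimplemented command: set 0x%02x cmd 0x%02x\n",
                set, cmd);
        return CXL_MBOX_UNSUPPORTED;
    }

    /* Registered entries are expected to carry a name for logging. */
    assert(c->name != NULL);
    return c->handler();
}
```

With this shape the status register would carry a non-zero return code, so
the guest driver can tell "not implemented" apart from success.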

> +        goto handled;
> +    }
> +
> +    if (len != cxl_cmd->in) {
> +        ret = CXL_MBOX_INVALID_PAYLOAD_LENGTH;
> +    }
> +
> +    cxl_cmd->payload = cxl_dstate->mbox_reg_state + A_CXL_DEV_CMD_PAYLOAD;
> +    ret = (*h)(cxl_cmd, cxl_dstate, &len);
> +    assert(len <= cxl_dstate->payload_size);
> +
> +handled:
> +    /*
> +     * Set the return code
> +     * XXX: it's a 64b register, but we're not setting the vendor, so we can get
> +     * away with this
> +     */
> +    status_reg = FIELD_DP64(0, CXL_DEV_MAILBOX_STS, ERRNO, ret);
> +
> +    /*
> +     * Set the return length
> +     */
> +    command_reg = FIELD_DP64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND_SET, 0);
> +    command_reg = FIELD_DP64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND, 0);
> +    command_reg = FIELD_DP64(command_reg, CXL_DEV_MAILBOX_CMD, LENGTH, len);
> +
> +    cxl_dstate->mbox_reg_state64[A_CXL_DEV_MAILBOX_CMD / 8] = command_reg;
> +    cxl_dstate->mbox_reg_state64[A_CXL_DEV_MAILBOX_STS / 8] = status_reg;
> +
> +    /* Tell the host we're done */
> +    ARRAY_FIELD_DP32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL,
> +                     DOORBELL, 0);
> +}
> +
> +int cxl_initialize_mailbox(CXLDeviceState *cxl_dstate)
> +{
> +    const char *cel_uuidstr = "0da9c0b5-bf41-4b78-8f79-96b1623b3f17";
> +
> +    for (int i = 0; i < 256; i++) {

Instead of indexing with i and j, perhaps this would be more consistent if it
used the names you have above, i.e. set and cmd?
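
Something along these lines, as a standalone sketch. struct cmd_desc,
cmd_table, struct cel_entry and build_cel() are hypothetical, cut-down
stand-ins for the patch's cxl_cmd, cxl_cmd_set and cel_log; the capacity check
is an extra that the patch does not have.

```c
#include <stddef.h>
#include <stdint.h>

struct cel_entry {
    uint16_t opcode;
    uint16_t effect;
};

/* Reduced stand-in for struct cxl_cmd: nonzero `present` means registered. */
struct cmd_desc {
    int present;
    uint16_t effect;
};

static struct cmd_desc cmd_table[256][256];

/* Fill `log` (capacity `cap`) and return the number of entries written. */
static size_t build_cel(struct cel_entry *log, size_t cap)
{
    size_t n = 0;

    for (int set = 0; set < 256; set++) {
        for (int cmd = 0; cmd < 256; cmd++) {
            const struct cmd_desc *c = &cmd_table[set][cmd];

            if (!c->present || n == cap) {
                continue;
            }
            /* Opcode layout as in the patch: set in the high byte. */
            log[n].opcode = (uint16_t)((set << 8) | cmd);
            log[n].effect = c->effect;
            n++;
        }
    }
    return n;
}
```

With the loop variables named this way, the (set << 8) | cmd expression reads
the same as the table index order [set][cmd].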


> +        for (int j = 0; j < 256; j++) {
> +            if (cxl_cmd_set[i][j].handler) {
> +                struct cxl_cmd *c = &cxl_cmd_set[i][j];
> +
> +                cxl_dstate->cel_log[cxl_dstate->cel_size].opcode = (i << 8) | j;
> +                cxl_dstate->cel_log[cxl_dstate->cel_size].effect = c->effect;
> +                cxl_dstate->cel_size++;
> +            }
> +        }
> +    }
> +
> +    return qemu_uuid_parse(cel_uuidstr, &cel_uuid);
> +}
> diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build
> index 47154d6850..0eca715d10 100644
> --- a/hw/cxl/meson.build
> +++ b/hw/cxl/meson.build
> @@ -1,4 +1,5 @@
>  softmmu_ss.add(when: 'CONFIG_CXL', if_true: files(
>    'cxl-component-utils.c',
>    'cxl-device-utils.c',
> +  'cxl-mailbox-utils.c',
>  ))
> diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
> index 23f52c4cf9..362cda40de 100644
> --- a/include/hw/cxl/cxl.h
> +++ b/include/hw/cxl/cxl.h
> @@ -14,5 +14,8 @@
>  #include "cxl_component.h"
>  #include "cxl_device.h"
>  
> +#define COMPONENT_REG_BAR_IDX 0
> +#define DEVICE_REG_BAR_IDX 2

I'd argue for prefixing all defines

CXL_COMPONENT_REG_BAR_IDX etc

Will make it clear they are generic CXL related things.

> +
>  #endif
>  
> diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> index f3bcf19410..af91bec10c 100644
> --- a/include/hw/cxl/cxl_device.h
> +++ b/include/hw/cxl/cxl_device.h
> @@ -80,16 +80,27 @@ typedef struct cxl_device_state {
>      MemoryRegion device_registers;
>  
>      /* mmio for device capabilities array - 8.2.8.2 */
> +    MemoryRegion device;
>      struct {
>          MemoryRegion caps;
>          uint32_t caps_reg_state32[CXL_CAPS_SIZE / 4];
>      };
>  
> -    /* mmio for the device status registers 8.2.8.3 */
> -    MemoryRegion device;
> -

Move this block back to where it was originally introduced rather than
introduce it then move it later?

>      /* mmio for the mailbox registers 8.2.8.4 */
> -    MemoryRegion mailbox;
> +    struct {
> +        MemoryRegion mailbox;
> +        uint16_t payload_size;
> +        union {
> +            uint8_t mbox_reg_state[CXL_MAILBOX_REGISTERS_LENGTH];
> +            uint32_t mbox_reg_state32[CXL_MAILBOX_REGISTERS_LENGTH / 4];
> +            uint64_t mbox_reg_state64[CXL_MAILBOX_REGISTERS_LENGTH / 8];
> +        };
> +        struct {
> +            uint16_t opcode;
> +            uint16_t effect;
> +        } cel_log[1 << 16];
> +        size_t cel_size;
> +    };

If the structure is unnamed, chances of a naming clash seem rather high
if you don't prefix all the elements with mbx_ or something like that.
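
For illustration only, here is one way to avoid the clash without prefixing
every element: give the block a name so its fields are namespaced by it. This
is an alternative to the prefix idea, not something the patch does. The struct
name, field names and MBOX_REGS_LEN are invented for the sketch, and the inner
anonymous union needs C11.

```c
#include <stddef.h>
#include <stdint.h>

#define MBOX_REGS_LEN 0x20 /* assumed size, for illustration only */

struct dev_state {
    /* Named member: fields are reached as s->mbox.payload_size etc. */
    struct {
        uint16_t payload_size;
        union {
            uint8_t  reg_state[MBOX_REGS_LEN];
            uint32_t reg_state32[MBOX_REGS_LEN / 4];
            uint64_t reg_state64[MBOX_REGS_LEN / 8];
        };
        size_t cel_size;
    } mbox;

    /* A later block can reuse a short name without colliding. */
    struct {
        uint16_t payload_size;
    } other;
};

static void dev_state_init(struct dev_state *s)
{
    s->mbox.payload_size = 2048;
    s->mbox.cel_size = 0;
    s->other.payload_size = 64;
}
```

Had both blocks been anonymous, the second payload_size would be a duplicate
member and the struct would not compile.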

>  
>      /* memory region for persistent memory, HDM */
>      MemoryRegion *pmem;
> @@ -135,6 +146,9 @@ CXL_DEVICE_CAPABILITY_HEADER_REGISTER(DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET)
>  CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MAILBOX, CXL_DEVICE_CAP_HDR1_OFFSET + \
>                                                 CXL_DEVICE_CAP_REG_SIZE)
>  
> +int cxl_initialize_mailbox(CXLDeviceState *cxl_dstate);
> +void cxl_process_mailbox(CXLDeviceState *cxl_dstate);
> +
>  #define cxl_device_cap_init(dstate, reg, cap_id)                                   \
>      do {                                                                           \
>          uint32_t *cap_hdrs = dstate->caps_reg_state32;                             \
> @@ -162,6 +176,12 @@ REG32(CXL_DEV_MAILBOX_CTRL, 4)
>      FIELD(CXL_DEV_MAILBOX_CTRL, INT_EN, 1, 1)
>      FIELD(CXL_DEV_MAILBOX_CTRL, BG_INT_EN, 2, 1)
>  
> +/* XXX: actually a 64b register */
> +REG32(CXL_DEV_MAILBOX_CMD, 8)
> +    FIELD(CXL_DEV_MAILBOX_CMD, COMMAND, 0, 8)
> +    FIELD(CXL_DEV_MAILBOX_CMD, COMMAND_SET, 8, 8)
> +    FIELD(CXL_DEV_MAILBOX_CMD, LENGTH, 16, 20)
> +

Ah, this is where this definition went.  Perhaps pull it back into patch 3?
That patch defines plenty of other things that aren't used until later patches,
I think, so one more won't hurt and will save me asking why you skipped it :)
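
On the "actually a 64b register" note in the quoted definition: LENGTH starts
at bit 16 and is 20 bits wide, so it runs up to bit 35 and has to be accessed
through the 64-bit value. A minimal standalone sketch of that layout follows.
field_ex64() and field_dp64() here are hand-written helpers for illustration,
not QEMU's FIELD_EX64/FIELD_DP64 macros, and the enum names are made up; only
the shifts and widths come from the patch.

```c
#include <stdint.h>

/* Field positions as declared in the quoted FIELD() lines. */
enum {
    CMD_SHIFT = 0,     CMD_LEN = 8,
    SET_SHIFT = 8,     SET_LEN = 8,
    LENGTH_SHIFT = 16, LENGTH_LEN = 20,
};

/* Extract `len` bits starting at bit `shift` (valid for 1 <= len <= 63). */
static uint64_t field_ex64(uint64_t reg, unsigned shift, unsigned len)
{
    return (reg >> shift) & ((UINT64_C(1) << len) - 1);
}

/* Return `reg` with the field at `shift`/`len` replaced by `val`. */
static uint64_t field_dp64(uint64_t reg, unsigned shift, unsigned len,
                           uint64_t val)
{
    uint64_t mask = ((UINT64_C(1) << len) - 1) << shift;

    return (reg & ~mask) | ((val << shift) & mask);
}
```

Note that a full 20-bit length does not fit in a uint16_t, so the extracted
value wants a wider type on the consumer side.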


>  /* XXX: actually a 64b register */
>  REG32(CXL_DEV_MAILBOX_STS, 0x10)
>      FIELD(CXL_DEV_MAILBOX_STS, BG_OP, 0, 1)


^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [RFC PATCH v3 16/31] hw/pci: Plumb _UID through host bridges
  2021-02-02  0:59   ` Ben Widawsky
@ 2021-02-02 15:00     ` Jonathan Cameron
  -1 siblings, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2021-02-02 15:00 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: qemu-devel, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Marcel Apfelbaum,
	Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

On Mon, 1 Feb 2021 16:59:33 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> Currently, QEMU makes _UID equivalent to the bus number (_BBN). While
> there is nothing wrong with doing it this way, CXL spec has a heavy
> reliance on _UID to identify host bridges and there is no link to the
> bus number. Having a distinct UID solves two problems. The first is it
> gets us around the limitation of 256 (current max bus number). The
> second is it allows us to replicate hardware configurations where bus
> number and uid aren't equivalent. The latter has benefits for our
> development and debugging using QEMU.
> 
> The other way to do this would be to implement the expanded bus
> numbering, but having an explicit uid makes more sense when trying to
> replicate real hardware configurations.
> 
> The QEMU commandline to utilize this would be:
>   -device pxb-cxl,id=cxl.0,bus="pcie.0",bus_nr=1,uid=x
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> 
> --
> 
> I'm guessing this patch will be somewhat controversial. For early CXL
> work, this can be dropped without too much heartache.

Whilst I'm not personally against it, this may be best to drop for now, as you
say.
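
If it does stay, the patch rejects a negative uid for pxb-cxl and leaves a
FIXME for collisions with other host bridges. A minimal sketch of such a check,
assuming the UIDs already in use can be collected into an array; UID_UNSET and
uid_is_usable() are hypothetical names, and how QEMU would actually enumerate
the other host bridges is not shown.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define UID_UNSET (-1) /* same value as the property default in the patch */

/*
 * Return true when `uid` has been set (non-negative) and does not match any
 * of the `n` entries in `used`.
 */
static bool uid_is_usable(int32_t uid, const int32_t *used, size_t n)
{
    if (uid < 0) {
        return false;
    }
    for (size_t i = 0; i < n; i++) {
        if (used[i] == uid) {
            return false;
        }
    }
    return true;
}
```

A linear scan is plenty here, since the number of host bridges is small.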

> ---
>  hw/i386/acpi-build.c                |  3 ++-
>  hw/pci-bridge/pci_expander_bridge.c | 19 +++++++++++++++++++
>  hw/pci/pci.c                        | 11 +++++++++++
>  include/hw/pci/pci.h                |  1 +
>  include/hw/pci/pci_bus.h            |  1 +
>  5 files changed, 34 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index cf6eb54c22..145a503e92 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -1343,6 +1343,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>          QLIST_FOREACH(bus, &bus->child, sibling) {
>              uint8_t bus_num = pci_bus_num(bus);
>              uint8_t numa_node = pci_bus_numa_node(bus);
> +            int32_t uid = pci_bus_uid(bus);
>  
>              /* look only for expander root buses */
>              if (!pci_bus_is_root(bus)) {
> @@ -1356,7 +1357,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>              scope = aml_scope("\\_SB");
>              dev = aml_device("PC%.02X", bus_num);
>              aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num)));
> -            init_pci_acpi(dev, bus_num, pci_bus_is_express(bus) ? PCIE : PCI);
> +            init_pci_acpi(dev, uid, pci_bus_is_express(bus) ? PCIE : PCI);
>  
>              if (numa_node != NUMA_NODE_UNASSIGNED) {
>                  aml_append(dev, aml_name_decl("_PXM", aml_int(numa_node)));
> diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
> index b42592e1ff..5021b60435 100644
> --- a/hw/pci-bridge/pci_expander_bridge.c
> +++ b/hw/pci-bridge/pci_expander_bridge.c
> @@ -67,6 +67,7 @@ struct PXBDev {
>  
>      uint8_t bus_nr;
>      uint16_t numa_node;
> +    int32_t uid;
>  };
>  
>  static PXBDev *convert_to_pxb(PCIDevice *dev)
> @@ -98,12 +99,20 @@ static uint16_t pxb_bus_numa_node(PCIBus *bus)
>      return pxb->numa_node;
>  }
>  
> +static int32_t pxb_bus_uid(PCIBus *bus)
> +{
> +    PXBDev *pxb = convert_to_pxb(bus->parent_dev);
> +
> +    return pxb->uid;
> +}
> +
>  static void pxb_bus_class_init(ObjectClass *class, void *data)
>  {
>      PCIBusClass *pbc = PCI_BUS_CLASS(class);
>  
>      pbc->bus_num = pxb_bus_num;
>      pbc->numa_node = pxb_bus_numa_node;
> +    pbc->uid = pxb_bus_uid;
>  }
>  
>  static const TypeInfo pxb_bus_info = {
> @@ -329,6 +338,7 @@ static Property pxb_dev_properties[] = {
>      /* Note: 0 is not a legal PXB bus number. */
>      DEFINE_PROP_UINT8("bus_nr", PXBDev, bus_nr, 0),
>      DEFINE_PROP_UINT16("numa_node", PXBDev, numa_node, NUMA_NODE_UNASSIGNED),
> +    DEFINE_PROP_INT32("uid", PXBDev, uid, -1),
>      DEFINE_PROP_END_OF_LIST(),
>  };
>  
> @@ -400,12 +410,21 @@ static const TypeInfo pxb_pcie_dev_info = {
>  
>  static void pxb_cxl_dev_realize(PCIDevice *dev, Error **errp)
>  {
> +    PXBDev *pxb = convert_to_pxb(dev);
> +
>      /* A CXL PXB's parent bus is still PCIe */
>      if (!pci_bus_is_express(pci_get_bus(dev))) {
>          error_setg(errp, "pxb-cxl devices cannot reside on a PCI bus");
>          return;
>      }
>  
> +    if (pxb->uid < 0) {
> +        error_setg(errp, "pxb-cxl devices must have a valid uid (0-2147483647)");
> +        return;
> +    }
> +
> +    /* FIXME: Check that uid doesn't collide with UIDs of other host bridges */
> +
>      pxb_dev_realize_common(dev, CXL, errp);
>  }
>  
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index adbe8aa260..bf019d91a0 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -170,6 +170,11 @@ static uint16_t pcibus_numa_node(PCIBus *bus)
>      return NUMA_NODE_UNASSIGNED;
>  }
>  
> +static int32_t pcibus_uid(PCIBus *bus)
> +{
> +    return -1;
> +}
> +
>  static void pci_bus_class_init(ObjectClass *klass, void *data)
>  {
>      BusClass *k = BUS_CLASS(klass);
> @@ -184,6 +189,7 @@ static void pci_bus_class_init(ObjectClass *klass, void *data)
>  
>      pbc->bus_num = pcibus_num;
>      pbc->numa_node = pcibus_numa_node;
> +    pbc->uid = pcibus_uid;
>  }
>  
>  static const TypeInfo pci_bus_info = {
> @@ -530,6 +536,11 @@ int pci_bus_numa_node(PCIBus *bus)
>      return PCI_BUS_GET_CLASS(bus)->numa_node(bus);
>  }
>  
> +int pci_bus_uid(PCIBus *bus)
> +{
> +    return PCI_BUS_GET_CLASS(bus)->uid(bus);
> +}
> +
>  static int get_pci_config_device(QEMUFile *f, void *pv, size_t size,
>                                   const VMStateField *field)
>  {
> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> index bde3697bee..a46de48ccd 100644
> --- a/include/hw/pci/pci.h
> +++ b/include/hw/pci/pci.h
> @@ -463,6 +463,7 @@ static inline int pci_dev_bus_num(const PCIDevice *dev)
>  }
>  
>  int pci_bus_numa_node(PCIBus *bus);
> +int pci_bus_uid(PCIBus *bus);
>  void pci_for_each_device(PCIBus *bus, int bus_num,
>                           void (*fn)(PCIBus *bus, PCIDevice *d, void *opaque),
>                           void *opaque);
> diff --git a/include/hw/pci/pci_bus.h b/include/hw/pci/pci_bus.h
> index eb94e7e85c..3c9fbc55bb 100644
> --- a/include/hw/pci/pci_bus.h
> +++ b/include/hw/pci/pci_bus.h
> @@ -17,6 +17,7 @@ struct PCIBusClass {
>  
>      int (*bus_num)(PCIBus *bus);
>      uint16_t (*numa_node)(PCIBus *bus);
> +    int32_t (*uid)(PCIBus *bus);
>  };
>  
>  enum PCIBusFlags {


^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [RFC PATCH v3 16/31] hw/pci: Plumb _UID through host bridges
@ 2021-02-02 15:00     ` Jonathan Cameron
  0 siblings, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2021-02-02 15:00 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: David Hildenbrand, Vishal Verma, John Groves (jgroves),
	Chris Browy, qemu-devel, linux-cxl, Markus Armbruster,
	Michael S. Tsirkin, Igor Mammedov, Dan Williams, Ira Weiny,
	Philippe Mathieu-Daudé

On Mon, 1 Feb 2021 16:59:33 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> Currently, QEMU makes _UID equivalent to the bus number (_BBN). While
> there is nothing wrong with doing it this way, CXL spec has a heavy
> reliance on _UID to identify host bridges and there is no link to the
> bus number. Having a distinct UID solves two problems. The first is it
> gets us around the limitation of 256 (current max bus number). The
> second is it allows us to replicate hardware configurations where bus
> number and uid aren't equivalent. The latter has benefits for our
> development and debugging using QEMU.
> 
> The other way to do this would be to implement the expanded bus
> numbering, but having an explicit uid makes more sense when trying to
> replicate real hardware configurations.
> 
> The QEMU commandline to utilize this would be:
>   -device pxb-cxl,id=cxl.0,bus="pcie.0",bus_nr=1,uid=x
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> 
> --
> 
> I'm guessing this patch will be somewhat controversial. For early CXL
> work, this can be dropped without too much heartache.

Whilst I'm not personally against it, this may be best to drop for now, as
you say.

> ---
>  hw/i386/acpi-build.c                |  3 ++-
>  hw/pci-bridge/pci_expander_bridge.c | 19 +++++++++++++++++++
>  hw/pci/pci.c                        | 11 +++++++++++
>  include/hw/pci/pci.h                |  1 +
>  include/hw/pci/pci_bus.h            |  1 +
>  5 files changed, 34 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index cf6eb54c22..145a503e92 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -1343,6 +1343,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>          QLIST_FOREACH(bus, &bus->child, sibling) {
>              uint8_t bus_num = pci_bus_num(bus);
>              uint8_t numa_node = pci_bus_numa_node(bus);
> +            int32_t uid = pci_bus_uid(bus);
>  
>              /* look only for expander root buses */
>              if (!pci_bus_is_root(bus)) {
> @@ -1356,7 +1357,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>              scope = aml_scope("\\_SB");
>              dev = aml_device("PC%.02X", bus_num);
>              aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num)));
> -            init_pci_acpi(dev, bus_num, pci_bus_is_express(bus) ? PCIE : PCI);
> +            init_pci_acpi(dev, uid, pci_bus_is_express(bus) ? PCIE : PCI);
>  
>              if (numa_node != NUMA_NODE_UNASSIGNED) {
>                  aml_append(dev, aml_name_decl("_PXM", aml_int(numa_node)));
> diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
> index b42592e1ff..5021b60435 100644
> --- a/hw/pci-bridge/pci_expander_bridge.c
> +++ b/hw/pci-bridge/pci_expander_bridge.c
> @@ -67,6 +67,7 @@ struct PXBDev {
>  
>      uint8_t bus_nr;
>      uint16_t numa_node;
> +    int32_t uid;
>  };
>  
>  static PXBDev *convert_to_pxb(PCIDevice *dev)
> @@ -98,12 +99,20 @@ static uint16_t pxb_bus_numa_node(PCIBus *bus)
>      return pxb->numa_node;
>  }
>  
> +static int32_t pxb_bus_uid(PCIBus *bus)
> +{
> +    PXBDev *pxb = convert_to_pxb(bus->parent_dev);
> +
> +    return pxb->uid;
> +}
> +
>  static void pxb_bus_class_init(ObjectClass *class, void *data)
>  {
>      PCIBusClass *pbc = PCI_BUS_CLASS(class);
>  
>      pbc->bus_num = pxb_bus_num;
>      pbc->numa_node = pxb_bus_numa_node;
> +    pbc->uid = pxb_bus_uid;
>  }
>  
>  static const TypeInfo pxb_bus_info = {
> @@ -329,6 +338,7 @@ static Property pxb_dev_properties[] = {
>      /* Note: 0 is not a legal PXB bus number. */
>      DEFINE_PROP_UINT8("bus_nr", PXBDev, bus_nr, 0),
>      DEFINE_PROP_UINT16("numa_node", PXBDev, numa_node, NUMA_NODE_UNASSIGNED),
> +    DEFINE_PROP_INT32("uid", PXBDev, uid, -1),
>      DEFINE_PROP_END_OF_LIST(),
>  };
>  
> @@ -400,12 +410,21 @@ static const TypeInfo pxb_pcie_dev_info = {
>  
>  static void pxb_cxl_dev_realize(PCIDevice *dev, Error **errp)
>  {
> +    PXBDev *pxb = convert_to_pxb(dev);
> +
>      /* A CXL PXB's parent bus is still PCIe */
>      if (!pci_bus_is_express(pci_get_bus(dev))) {
>          error_setg(errp, "pxb-cxl devices cannot reside on a PCI bus");
>          return;
>      }
>  
> +    if (pxb->uid < 0) {
> +        error_setg(errp, "pxb-cxl devices must have a valid uid (0-2147483647)");
> +        return;
> +    }
> +
> +    /* FIXME: Check that uid doesn't collide with UIDs of other host bridges */
> +
>      pxb_dev_realize_common(dev, CXL, errp);
>  }
>  
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index adbe8aa260..bf019d91a0 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -170,6 +170,11 @@ static uint16_t pcibus_numa_node(PCIBus *bus)
>      return NUMA_NODE_UNASSIGNED;
>  }
>  
> +static int32_t pcibus_uid(PCIBus *bus)
> +{
> +    return -1;
> +}
> +
>  static void pci_bus_class_init(ObjectClass *klass, void *data)
>  {
>      BusClass *k = BUS_CLASS(klass);
> @@ -184,6 +189,7 @@ static void pci_bus_class_init(ObjectClass *klass, void *data)
>  
>      pbc->bus_num = pcibus_num;
>      pbc->numa_node = pcibus_numa_node;
> +    pbc->uid = pcibus_uid;
>  }
>  
>  static const TypeInfo pci_bus_info = {
> @@ -530,6 +536,11 @@ int pci_bus_numa_node(PCIBus *bus)
>      return PCI_BUS_GET_CLASS(bus)->numa_node(bus);
>  }
>  
> +int pci_bus_uid(PCIBus *bus)
> +{
> +    return PCI_BUS_GET_CLASS(bus)->uid(bus);
> +}
> +
>  static int get_pci_config_device(QEMUFile *f, void *pv, size_t size,
>                                   const VMStateField *field)
>  {
> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> index bde3697bee..a46de48ccd 100644
> --- a/include/hw/pci/pci.h
> +++ b/include/hw/pci/pci.h
> @@ -463,6 +463,7 @@ static inline int pci_dev_bus_num(const PCIDevice *dev)
>  }
>  
>  int pci_bus_numa_node(PCIBus *bus);
> +int pci_bus_uid(PCIBus *bus);
>  void pci_for_each_device(PCIBus *bus, int bus_num,
>                           void (*fn)(PCIBus *bus, PCIDevice *d, void *opaque),
>                           void *opaque);
> diff --git a/include/hw/pci/pci_bus.h b/include/hw/pci/pci_bus.h
> index eb94e7e85c..3c9fbc55bb 100644
> --- a/include/hw/pci/pci_bus.h
> +++ b/include/hw/pci/pci_bus.h
> @@ -17,6 +17,7 @@ struct PCIBusClass {
>  
>      int (*bus_num)(PCIBus *bus);
>      uint16_t (*numa_node)(PCIBus *bus);
> +    int32_t (*uid)(PCIBus *bus);
>  };
>  
>  enum PCIBusFlags {



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [RFC PATCH v3 21/31] hw/cxl/device: Add a memory device (8.2.8.5)
  2021-02-02 14:26   ` Eric Blake
@ 2021-02-02 15:06       ` Ben Widawsky
  0 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02 15:06 UTC (permalink / raw)
  To: Eric Blake
  Cc: qemu-devel, David Hildenbrand, Vishal Verma,
	John Groves (jgroves),
	Chris Browy, Markus Armbruster, linux-cxl,
	Philippe Mathieu-Daudé,
	Michael S. Tsirkin, Jonathan Cameron, Igor Mammedov,
	Dan Williams, Ira Weiny

On 21-02-02 08:26:14, Eric Blake wrote:
> On 2/1/21 6:59 PM, Ben Widawsky wrote:
> > A CXL memory device (AKA Type 3) is a CXL component that contains some
> > combination of volatile and persistent memory. It also implements the
> > previously defined mailbox interface as well as the memory device
> > firmware interface.
> > 
> > Although the memory device is configured like a normal PCIe device, the
> > memory traffic is on an entirely separate bus conceptually (using the
> > same physical wires as PCIe, but different protocol).
> > 
> > The guest physical address for the memory device is part of a larger
> > window which is owned by the platform. Currently, this is hardcoded as
> > an object property on host bridge (PXB) creation, but that will need to
> > change for interleaving.
> > 
> > The following example will create a 256M device in a 512M window:
> > -object "memory-backend-file,id=cxl-mem1,share,mem-path=cxl-type3,size=512M"
> > -device "cxl-type3,bus=rp0,memdev=cxl-mem1,id=cxl-pmem0,size=256M"
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > ---
> 
> > +++ b/qapi/machine.json
> > @@ -1394,6 +1394,7 @@
> >  { 'union': 'MemoryDeviceInfo',
> >    'data': { 'dimm': 'PCDIMMDeviceInfo',
> >              'nvdimm': 'PCDIMMDeviceInfo',
> > +            'cxl': 'PCDIMMDeviceInfo',
> >              'virtio-pmem': 'VirtioPMEMDeviceInfo',
> >              'virtio-mem': 'VirtioMEMDeviceInfo'
> >            }
> 
> Missing documentation that 'cxl' was introduced in 6.0.  Also, is it
> worth keeping the branches of this union in lexicographic order?
> 

Sure.

As discussed on the list previously, I think more thought needs to be put in
here, and I could really use some input.

A CXL type3 memory device can have both persistent and volatile capacity. As
such, I believe a single PCDIMMDeviceInfo is insufficient. The current code
supports persistent memory only, so this is fine for now.

I'd guess my best bet is to create a new CXLType3DeviceInfo, but I'm not
entirely sure of all the implications that has.

Any advice?
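
To make the question concrete, a dedicated info structure could carry the two
capacities separately. This is only a sketch under stated assumptions:
CXLType3DeviceInfoSketch, its field names, and cxl_type3_total_size() are
hypothetical, and a real type would be generated from qapi/machine.json rather
than written by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical shape for a CXL Type 3 info record. Unlike a single
 * PCDIMMDeviceInfo-style "size", it keeps volatile and persistent
 * capacity apart so a device exposing both can be described.
 */
typedef struct CXLType3DeviceInfoSketch {
    const char *id;            /* device id, e.g. "cxl-pmem0" */
    uint64_t volatile_size;    /* bytes of volatile capacity */
    uint64_t persistent_size;  /* bytes of persistent capacity */
} CXLType3DeviceInfoSketch;

/* Total capacity of the device in bytes (volatile plus persistent). */
static uint64_t cxl_type3_total_size(const CXLType3DeviceInfoSketch *info)
{
    assert(info != NULL);
    return info->volatile_size + info->persistent_size;
}
```

With the current persistent-only code, volatile_size would simply be 0 for
every device.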

> -- 
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.           +1-919-301-3226
> Virtualization:  qemu.org | libvirt.org
> 
> 

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [RFC PATCH v3 16/31] hw/pci: Plumb _UID through host bridges
  2021-02-02 15:00     ` Jonathan Cameron
@ 2021-02-02 15:24       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 117+ messages in thread
From: Michael S. Tsirkin @ 2021-02-02 15:24 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Ben Widawsky, qemu-devel, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Marcel Apfelbaum,
	Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves)

On Tue, Feb 02, 2021 at 03:00:56PM +0000, Jonathan Cameron wrote:
> On Mon, 1 Feb 2021 16:59:33 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > Currently, QEMU makes _UID equivalent to the bus number (_BBN). While
> > there is nothing wrong with doing it this way, CXL spec has a heavy
> > reliance on _UID to identify host bridges and there is no link to the
> > bus number. Having a distinct UID solves two problems. The first is it
> > gets us around the limitation of 256 (current max bus number).

Not sure I understand. You want more than 256 host bridges?

> The
> > second is it allows us to replicate hardware configurations where bus
> > number and uid aren't equivalent.

A bit more data on when this needs to be the case?

> The latter has benefits for our
> > development and debugging using QEMU.
> > 
> > The other way to do this would be to implement the expanded bus
> > numbering, but having an explicit uid makes more sense when trying to
> > replicate real hardware configurations.
> > 
> > The QEMU commandline to utilize this would be:
> >   -device pxb-cxl,id=cxl.0,bus="pcie.0",bus_nr=1,uid=x
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

However, if we do this, how do we ensure the UID is still unique?
What do we do in cases where no UID was specified?
One idea is to generate a string UID and just embed the bus number
in it.
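
A minimal sketch of that string-UID idea follows. The helper name
pxb_format_uid() and the "UID"/"BUS" prefixes are assumptions made for
illustration; nothing like this exists in the patch. The distinct prefixes keep
a user-supplied uid from colliding with a bus-number fallback.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a string _UID for an expander host bridge.
 * A non-negative uid is taken as user-supplied; a negative uid (the
 * property default of -1 in the patch) falls back to the bus number.
 * Returns the snprintf() result.
 */
static int pxb_format_uid(char *buf, size_t len, int32_t uid, uint8_t bus_nr)
{
    assert(buf != NULL && len > 0);

    if (uid >= 0) {
        return snprintf(buf, len, "UID%" PRId32, uid);
    }
    return snprintf(buf, len, "BUS%02X", (unsigned int)bus_nr);
}
```

Uniqueness among user-supplied values would still need a separate check.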


> > --
> > 
> > I'm guessing this patch will be somewhat controversial. For early CXL
> > work, this can be dropped without too much heartache.
> 
> Whilst I'm not personally against it, this may be best to drop for now, as
> you say.
> 
> > ---
> >  hw/i386/acpi-build.c                |  3 ++-
> >  hw/pci-bridge/pci_expander_bridge.c | 19 +++++++++++++++++++
> >  hw/pci/pci.c                        | 11 +++++++++++
> >  include/hw/pci/pci.h                |  1 +
> >  include/hw/pci/pci_bus.h            |  1 +
> >  5 files changed, 34 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> > index cf6eb54c22..145a503e92 100644
> > --- a/hw/i386/acpi-build.c
> > +++ b/hw/i386/acpi-build.c
> > @@ -1343,6 +1343,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> >          QLIST_FOREACH(bus, &bus->child, sibling) {
> >              uint8_t bus_num = pci_bus_num(bus);
> >              uint8_t numa_node = pci_bus_numa_node(bus);
> > +            int32_t uid = pci_bus_uid(bus);
> >  
> >              /* look only for expander root buses */
> >              if (!pci_bus_is_root(bus)) {
> > @@ -1356,7 +1357,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> >              scope = aml_scope("\\_SB");
> >              dev = aml_device("PC%.02X", bus_num);
> >              aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num)));
> > -            init_pci_acpi(dev, bus_num, pci_bus_is_express(bus) ? PCIE : PCI);
> > +            init_pci_acpi(dev, uid, pci_bus_is_express(bus) ? PCIE : PCI);
> >  
> >              if (numa_node != NUMA_NODE_UNASSIGNED) {
> >                  aml_append(dev, aml_name_decl("_PXM", aml_int(numa_node)));
> > diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
> > index b42592e1ff..5021b60435 100644
> > --- a/hw/pci-bridge/pci_expander_bridge.c
> > +++ b/hw/pci-bridge/pci_expander_bridge.c
> > @@ -67,6 +67,7 @@ struct PXBDev {
> >  
> >      uint8_t bus_nr;
> >      uint16_t numa_node;
> > +    int32_t uid;
> >  };
> >  
> >  static PXBDev *convert_to_pxb(PCIDevice *dev)

As long as we are doing this, do we want to support a string uid too?
How about a 64 bit uid? Why not?


> > @@ -98,12 +99,20 @@ static uint16_t pxb_bus_numa_node(PCIBus *bus)
> >      return pxb->numa_node;
> >  }
> >  
> > +static int32_t pxb_bus_uid(PCIBus *bus)
> > +{
> > +    PXBDev *pxb = convert_to_pxb(bus->parent_dev);
> > +
> > +    return pxb->uid;
> > +}
> > +
> >  static void pxb_bus_class_init(ObjectClass *class, void *data)
> >  {
> >      PCIBusClass *pbc = PCI_BUS_CLASS(class);
> >  
> >      pbc->bus_num = pxb_bus_num;
> >      pbc->numa_node = pxb_bus_numa_node;
> > +    pbc->uid = pxb_bus_uid;
> >  }
> >  
> >  static const TypeInfo pxb_bus_info = {
> > @@ -329,6 +338,7 @@ static Property pxb_dev_properties[] = {
> >      /* Note: 0 is not a legal PXB bus number. */
> >      DEFINE_PROP_UINT8("bus_nr", PXBDev, bus_nr, 0),
> >      DEFINE_PROP_UINT16("numa_node", PXBDev, numa_node, NUMA_NODE_UNASSIGNED),
> > +    DEFINE_PROP_INT32("uid", PXBDev, uid, -1),
> >      DEFINE_PROP_END_OF_LIST(),
> >  };
> >  
> > @@ -400,12 +410,21 @@ static const TypeInfo pxb_pcie_dev_info = {
> >  
> >  static void pxb_cxl_dev_realize(PCIDevice *dev, Error **errp)
> >  {
> > +    PXBDev *pxb = convert_to_pxb(dev);
> > +
> >      /* A CXL PXB's parent bus is still PCIe */
> >      if (!pci_bus_is_express(pci_get_bus(dev))) {
> >          error_setg(errp, "pxb-cxl devices cannot reside on a PCI bus");
> >          return;
> >      }
> >  
> > +    if (pxb->uid < 0) {
> > +        error_setg(errp, "pxb-cxl devices must have a valid uid (0-2147483647)");
> > +        return;
> > +    }
> > +
> > +    /* FIXME: Check that uid doesn't collide with UIDs of other host bridges */
> > +
> >      pxb_dev_realize_common(dev, CXL, errp);
> >  }
> >  
> > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > index adbe8aa260..bf019d91a0 100644
> > --- a/hw/pci/pci.c
> > +++ b/hw/pci/pci.c
> > @@ -170,6 +170,11 @@ static uint16_t pcibus_numa_node(PCIBus *bus)
> >      return NUMA_NODE_UNASSIGNED;
> >  }
> >  
> > +static int32_t pcibus_uid(PCIBus *bus)
> > +{
> > +    return -1;
> > +}
> > +
> >  static void pci_bus_class_init(ObjectClass *klass, void *data)
> >  {
> >      BusClass *k = BUS_CLASS(klass);
> > @@ -184,6 +189,7 @@ static void pci_bus_class_init(ObjectClass *klass, void *data)
> >  
> >      pbc->bus_num = pcibus_num;
> >      pbc->numa_node = pcibus_numa_node;
> > +    pbc->uid = pcibus_uid;
> >  }
> >  
> >  static const TypeInfo pci_bus_info = {
> > @@ -530,6 +536,11 @@ int pci_bus_numa_node(PCIBus *bus)
> >      return PCI_BUS_GET_CLASS(bus)->numa_node(bus);
> >  }
> >  
> > +int pci_bus_uid(PCIBus *bus)
> > +{
> > +    return PCI_BUS_GET_CLASS(bus)->uid(bus);
> > +}
> > +
> >  static int get_pci_config_device(QEMUFile *f, void *pv, size_t size,
> >                                   const VMStateField *field)
> >  {
> > diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> > index bde3697bee..a46de48ccd 100644
> > --- a/include/hw/pci/pci.h
> > +++ b/include/hw/pci/pci.h
> > @@ -463,6 +463,7 @@ static inline int pci_dev_bus_num(const PCIDevice *dev)
> >  }
> >  
> >  int pci_bus_numa_node(PCIBus *bus);
> > +int pci_bus_uid(PCIBus *bus);
> >  void pci_for_each_device(PCIBus *bus, int bus_num,
> >                           void (*fn)(PCIBus *bus, PCIDevice *d, void *opaque),
> >                           void *opaque);
> > diff --git a/include/hw/pci/pci_bus.h b/include/hw/pci/pci_bus.h
> > index eb94e7e85c..3c9fbc55bb 100644
> > --- a/include/hw/pci/pci_bus.h
> > +++ b/include/hw/pci/pci_bus.h
> > @@ -17,6 +17,7 @@ struct PCIBusClass {
> >  
> >      int (*bus_num)(PCIBus *bus);
> >      uint16_t (*numa_node)(PCIBus *bus);
> > +    int32_t (*uid)(PCIBus *bus);
> >  };
> >  
> >  enum PCIBusFlags {


^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [RFC PATCH v3 16/31] hw/pci: Plumb _UID through host bridges
  2021-02-02 15:24       ` Michael S. Tsirkin
@ 2021-02-02 15:42         ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02 15:42 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jonathan Cameron, qemu-devel, linux-cxl, Chris Browy,
	Dan Williams, David Hildenbrand, Igor Mammedov, Ira Weiny,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves)

Thanks for looking! I've mixed replies to Jonathan and Michael below.

On 21-02-02 10:24:43, Michael S. Tsirkin wrote:
> On Tue, Feb 02, 2021 at 03:00:56PM +0000, Jonathan Cameron wrote:
> > On Mon, 1 Feb 2021 16:59:33 -0800
> > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > 
> > > Currently, QEMU makes _UID equivalent to the bus number (_BBN). While
> > > there is nothing wrong with doing it this way, CXL spec has a heavy
> > > reliance on _UID to identify host bridges and there is no link to the
> > > bus number. Having a distinct UID solves two problems. The first is it
> > > gets us around the limitation of 256 (current max bus number).
> 
> Not sure I understand. You want more than 256 host bridges?
> 

I don't want more than 256 host bridges, but I do want the ability to decouple
_UID from the bus number (_BBN). The reasoning is simply to align with the
spec, where _UID identifies a CXL host bridge and is not necessarily related to
the bus number.

> > The
> > > second is it allows us to replicate hardware configurations where bus
> > > number and uid aren't equivalent.
> 
> A bit more data on when this needs to be the case?
> 

Doesn't *need* to be the case. I was making a concerted effort to allow full
spec flexibility, but I don't believe it to be necessary unless we want to
accurately emulate a real platform.

> > The latter has benefits for our
> > > development and debugging using QEMU.
> > > 
> > > The other way to do this would be to implement the expanded bus
> > > numbering, but having an explicit uid makes more sense when trying to
> > > replicate real hardware configurations.
> > > 
> > > The QEMU commandline to utilize this would be:
> > >   -device pxb-cxl,id=cxl.0,bus="pcie.0",bus_nr=1,uid=x
> > > 
> > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> 
> However, if doing this how do we ensure UID is still unique?
> What do we do for cases where UID was not specified?
> One idea is to generate a string UID and just stick the bus #
> in there.

This is totally mishandled in the code currently. I like your idea though.
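
For the collision FIXME in pxb_cxl_dev_realize(), the check could look roughly
like the sketch below. It assumes the UIDs of already-realized host bridges
have been gathered into a plain array; uid_in_use() and that array are
hypothetical stand-ins, and real code would walk the existing bridges instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: report whether uid already appears among the
 * UIDs of previously realized host bridges.
 */
static bool uid_in_use(const int32_t *uids, size_t count, int32_t uid)
{
    assert(uids != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        if (uids[i] == uid) {
            return true;
        }
    }
    return false;
}
```

Realize would then fail with an error when uid_in_use() returns true, much as
it already does for a negative uid.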

> 
> 
> > > --
> > > 
> > > I'm guessing this patch will be somewhat controversial. For early CXL
> > > work, this can be dropped without too much heartache.
> > 
> > Whilst I'm not personally against it, this may be best to drop for now, as
> > you say.
> > 

I think it'd be good to understand from the PCIe experts if CXL matches in this
regard. If PCIe generally allows (and does in practice) _UID not matching _BBN,
perhaps this is an overall improvement to the code.

> > > ---
> > >  hw/i386/acpi-build.c                |  3 ++-
> > >  hw/pci-bridge/pci_expander_bridge.c | 19 +++++++++++++++++++
> > >  hw/pci/pci.c                        | 11 +++++++++++
> > >  include/hw/pci/pci.h                |  1 +
> > >  include/hw/pci/pci_bus.h            |  1 +
> > >  5 files changed, 34 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> > > index cf6eb54c22..145a503e92 100644
> > > --- a/hw/i386/acpi-build.c
> > > +++ b/hw/i386/acpi-build.c
> > > @@ -1343,6 +1343,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> > >          QLIST_FOREACH(bus, &bus->child, sibling) {
> > >              uint8_t bus_num = pci_bus_num(bus);
> > >              uint8_t numa_node = pci_bus_numa_node(bus);
> > > +            int32_t uid = pci_bus_uid(bus);
> > >  
> > >              /* look only for expander root buses */
> > >              if (!pci_bus_is_root(bus)) {
> > > @@ -1356,7 +1357,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> > >              scope = aml_scope("\\_SB");
> > >              dev = aml_device("PC%.02X", bus_num);
> > >              aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num)));
> > > -            init_pci_acpi(dev, bus_num, pci_bus_is_express(bus) ? PCIE : PCI);
> > > +            init_pci_acpi(dev, uid, pci_bus_is_express(bus) ? PCIE : PCI);
> > >  
> > >              if (numa_node != NUMA_NODE_UNASSIGNED) {
> > >                  aml_append(dev, aml_name_decl("_PXM", aml_int(numa_node)));
> > > diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
> > > index b42592e1ff..5021b60435 100644
> > > --- a/hw/pci-bridge/pci_expander_bridge.c
> > > +++ b/hw/pci-bridge/pci_expander_bridge.c
> > > @@ -67,6 +67,7 @@ struct PXBDev {
> > >  
> > >      uint8_t bus_nr;
> > >      uint16_t numa_node;
> > > +    int32_t uid;
> > >  };
> > >  
> > >  static PXBDev *convert_to_pxb(PCIDevice *dev)
> 
> As long as we are doing this, do we want to support a string uid too?
> How about a 64 bit uid? Why not?

If generally the idea of the patch is welcome, I am happy to change it.
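
A rough sketch of how the unspecified-uid case raised earlier in the thread
might be handled: keep the numeric uid when the user gave one, otherwise fall
back to a string with the bus number in it. The function name and the
"BUS%02X" format are made up for illustration; nothing like this exists in the
patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of the patch: write the _UID for a host
 * bridge into buf as text.
 *
 * uid >= 0  -> the user-supplied value is used as-is.
 * uid <  0  -> the property was left at its -1 default, so fall back to a
 *              string derived from the bus number.
 *
 * Returns the number of characters snprintf() would have written.
 */
int format_host_bridge_uid(char *buf, size_t len, int32_t uid, uint8_t bus_nr)
{
    assert(buf != NULL && len > 0);

    if (uid >= 0) {
        return snprintf(buf, len, "%" PRId32, uid);
    }
    /* Illustrative format only; any scheme that embeds bus_nr would do. */
    return snprintf(buf, len, "BUS%02X", (unsigned)bus_nr);
}
```

The fallback is only as unique as bus_nr itself, and it does not address
collisions between user-supplied numeric uids, which the FIXME in the patch
still leaves open.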

> 
> 
> > > @@ -98,12 +99,20 @@ static uint16_t pxb_bus_numa_node(PCIBus *bus)
> > >      return pxb->numa_node;
> > >  }
> > >  
> > > +static int32_t pxb_bus_uid(PCIBus *bus)
> > > +{
> > > +    PXBDev *pxb = convert_to_pxb(bus->parent_dev);
> > > +
> > > +    return pxb->uid;
> > > +}
> > > +
> > >  static void pxb_bus_class_init(ObjectClass *class, void *data)
> > >  {
> > >      PCIBusClass *pbc = PCI_BUS_CLASS(class);
> > >  
> > >      pbc->bus_num = pxb_bus_num;
> > >      pbc->numa_node = pxb_bus_numa_node;
> > > +    pbc->uid = pxb_bus_uid;
> > >  }
> > >  
> > >  static const TypeInfo pxb_bus_info = {
> > > @@ -329,6 +338,7 @@ static Property pxb_dev_properties[] = {
> > >      /* Note: 0 is not a legal PXB bus number. */
> > >      DEFINE_PROP_UINT8("bus_nr", PXBDev, bus_nr, 0),
> > >      DEFINE_PROP_UINT16("numa_node", PXBDev, numa_node, NUMA_NODE_UNASSIGNED),
> > > +    DEFINE_PROP_INT32("uid", PXBDev, uid, -1),
> > >      DEFINE_PROP_END_OF_LIST(),
> > >  };
> > >  
> > > @@ -400,12 +410,21 @@ static const TypeInfo pxb_pcie_dev_info = {
> > >  
> > >  static void pxb_cxl_dev_realize(PCIDevice *dev, Error **errp)
> > >  {
> > > +    PXBDev *pxb = convert_to_pxb(dev);
> > > +
> > >      /* A CXL PXB's parent bus is still PCIe */
> > >      if (!pci_bus_is_express(pci_get_bus(dev))) {
> > >          error_setg(errp, "pxb-cxl devices cannot reside on a PCI bus");
> > >          return;
> > >      }
> > >  
> > > +    if (pxb->uid < 0) {
> > > +        error_setg(errp, "pxb-cxl devices must have a valid uid (0-2147483647)");
> > > +        return;
> > > +    }
> > > +
> > > +    /* FIXME: Check that uid doesn't collide with UIDs of other host bridges */
> > > +
> > >      pxb_dev_realize_common(dev, CXL, errp);
> > >  }
> > >  
> > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > > index adbe8aa260..bf019d91a0 100644
> > > --- a/hw/pci/pci.c
> > > +++ b/hw/pci/pci.c
> > > @@ -170,6 +170,11 @@ static uint16_t pcibus_numa_node(PCIBus *bus)
> > >      return NUMA_NODE_UNASSIGNED;
> > >  }
> > >  
> > > +static int32_t pcibus_uid(PCIBus *bus)
> > > +{
> > > +    return -1;
> > > +}
> > > +
> > >  static void pci_bus_class_init(ObjectClass *klass, void *data)
> > >  {
> > >      BusClass *k = BUS_CLASS(klass);
> > > @@ -184,6 +189,7 @@ static void pci_bus_class_init(ObjectClass *klass, void *data)
> > >  
> > >      pbc->bus_num = pcibus_num;
> > >      pbc->numa_node = pcibus_numa_node;
> > > +    pbc->uid = pcibus_uid;
> > >  }
> > >  
> > >  static const TypeInfo pci_bus_info = {
> > > @@ -530,6 +536,11 @@ int pci_bus_numa_node(PCIBus *bus)
> > >      return PCI_BUS_GET_CLASS(bus)->numa_node(bus);
> > >  }
> > >  
> > > +int pci_bus_uid(PCIBus *bus)
> > > +{
> > > +    return PCI_BUS_GET_CLASS(bus)->uid(bus);
> > > +}
> > > +
> > >  static int get_pci_config_device(QEMUFile *f, void *pv, size_t size,
> > >                                   const VMStateField *field)
> > >  {
> > > diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> > > index bde3697bee..a46de48ccd 100644
> > > --- a/include/hw/pci/pci.h
> > > +++ b/include/hw/pci/pci.h
> > > @@ -463,6 +463,7 @@ static inline int pci_dev_bus_num(const PCIDevice *dev)
> > >  }
> > >  
> > >  int pci_bus_numa_node(PCIBus *bus);
> > > +int pci_bus_uid(PCIBus *bus);
> > >  void pci_for_each_device(PCIBus *bus, int bus_num,
> > >                           void (*fn)(PCIBus *bus, PCIDevice *d, void *opaque),
> > >                           void *opaque);
> > > diff --git a/include/hw/pci/pci_bus.h b/include/hw/pci/pci_bus.h
> > > index eb94e7e85c..3c9fbc55bb 100644
> > > --- a/include/hw/pci/pci_bus.h
> > > +++ b/include/hw/pci/pci_bus.h
> > > @@ -17,6 +17,7 @@ struct PCIBusClass {
> > >  
> > >      int (*bus_num)(PCIBus *bus);
> > >      uint16_t (*numa_node)(PCIBus *bus);
> > > +    int32_t (*uid)(PCIBus *bus);
> > >  };
> > >  
> > >  enum PCIBusFlags {
> 

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [RFC PATCH v3 16/31] hw/pci: Plumb _UID through host bridges
  2021-02-02 15:42         ` Ben Widawsky
@ 2021-02-02 15:51           ` Michael S. Tsirkin
  -1 siblings, 0 replies; 117+ messages in thread
From: Michael S. Tsirkin @ 2021-02-02 15:51 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: Jonathan Cameron, qemu-devel, linux-cxl, Chris Browy,
	Dan Williams, David Hildenbrand, Igor Mammedov, Ira Weiny,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves)

On Tue, Feb 02, 2021 at 07:42:57AM -0800, Ben Widawsky wrote:
> Thanks for looking! Mixing comments to Jonathan and Michael..
> 
> On 21-02-02 10:24:43, Michael S. Tsirkin wrote:
> > On Tue, Feb 02, 2021 at 03:00:56PM +0000, Jonathan Cameron wrote:
> > > On Mon, 1 Feb 2021 16:59:33 -0800
> > > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > 
> > > > Currently, QEMU makes _UID equivalent to the bus number (_BBN). While
> > > > there is nothing wrong with doing it this way, CXL spec has a heavy
> > > > reliance on _UID to identify host bridges and there is no link to the
> > > > bus number. Having a distinct UID solves two problems. The first is it
> > > > gets us around the limitation of 256 (current max bus number).
> > 
> > Not sure I understand. You want more than 256 host bridges?
> > 
> 
> I don't want more than 256 host bridges, but I want the ability to disaggregate
> _UID and bus (_BBN). The reasoning is just to align with the spec where _UID is
> used to identify a CXL host bridge which is unrelated (perhaps) to the bus
> number.

Which spec is that?

> > > The
> > > > second is it allows us to replicate hardware configurations where bus
> > > > number and uid aren't equivalent.
> > 
> > A bit more data on when this needs to be the case?
> > 
> 
> Doesn't *need* to be the case. I was making a concerted effort to allow full
> spec flexibility, but I don't believe it to be necessary unless we want to
> accurately emulate a real platform.
> 
> > > The latter has benefits for our
> > > > development and debugging using QEMU.
> > > > 
> > > > The other way to do this would be to implement the expanded bus
> > > > numbering, but having an explicit uid makes more sense when trying to
> > > > replicate real hardware configurations.
> > > > 
> > > > The QEMU commandline to utilize this would be:
> > > >   -device pxb-cxl,id=cxl.0,bus="pcie.0",bus_nr=1,uid=x
> > > > 
> > > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > 
> > However, if doing this how do we ensure UID is still unique?
> > What do we do for cases where UID was not specified?
> > One idea is to generate a string UID and just stick the bus #
> > in there.
> 
> This is totally mishandled in the code currently. I like your idea though.
> 
> > 
> > 
> > > > --
> > > > 
> > > > I'm guessing this patch will be somewhat controversial. For early CXL
> > > > work, this can be dropped without too much heartache.
> > > 
> > > Whilst I'm not personally against, this maybe best to drop for now as you
> > > say.
> > > 
> 
> I think it'd be good to understand from the PCIe experts if CXL matches in this
> regard. If PCIe generally allows (and does in practice) _UID not matching _BBN,
> perhaps this is an overall improvement to the code.

Well

6.1.12 _UID (Unique ID)
	This object provides OSPM with a logical device ID that does not change across reboots. This object
	is optional, but is required when the device has no other way to report a persistent unique device ID.
	The _UID must be unique across all devices with either a common _HID or _CID. This is because a
	device needs to be uniquely identified to the OSPM, which may match on either a _HID or a _CID to
	identify the device. The uniqueness match must be true regardless of whether the OSPM uses the
	_HID or the _CID. OSPM typically uses the unique device ID to ensure that the device-specific
	information, such as network protocol binding information, is remembered for the device even if its
	relative location changes. For most integrated devices, this object contains a unique identifier.
	A _UID object evaluates to either a numeric value or a string.


In other words, _UID is there so the guest can tell devices with identical
_HID/_CID apart.
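
As a minimal sketch of the uniqueness rule quoted above (and of the check the
FIXME in the patch asks for), assuming numeric UIDs only and ignoring _CID: no
two devices that share an _HID may share a _UID. The struct and function names
here are hypothetical, not QEMU APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record: the identifiers one ACPI device would report. */
struct acpi_dev_id {
    const char *hid;   /* _HID string, e.g. "PNP0A08" */
    uint64_t uid;      /* numeric _UID */
};

/*
 * Return true if no two entries with the same _HID also have the same _UID.
 * Quadratic, which should be fine for the small number of host bridges a
 * machine is expected to have.
 */
bool acpi_uids_unique(const struct acpi_dev_id *devs, size_t n)
{
    assert(devs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            if (strcmp(devs[i].hid, devs[j].hid) == 0 &&
                devs[i].uid == devs[j].uid) {
                return false;   /* same _HID and same _UID: collision */
            }
        }
    }
    return true;
}
```

Two devices with different _HIDs may reuse a _UID under this check; a fuller
version would also have to compare _CIDs and string UIDs.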




> > > > ---
> > > >  hw/i386/acpi-build.c                |  3 ++-
> > > >  hw/pci-bridge/pci_expander_bridge.c | 19 +++++++++++++++++++
> > > >  hw/pci/pci.c                        | 11 +++++++++++
> > > >  include/hw/pci/pci.h                |  1 +
> > > >  include/hw/pci/pci_bus.h            |  1 +
> > > >  5 files changed, 34 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> > > > index cf6eb54c22..145a503e92 100644
> > > > --- a/hw/i386/acpi-build.c
> > > > +++ b/hw/i386/acpi-build.c
> > > > @@ -1343,6 +1343,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> > > >          QLIST_FOREACH(bus, &bus->child, sibling) {
> > > >              uint8_t bus_num = pci_bus_num(bus);
> > > >              uint8_t numa_node = pci_bus_numa_node(bus);
> > > > +            int32_t uid = pci_bus_uid(bus);
> > > >  
> > > >              /* look only for expander root buses */
> > > >              if (!pci_bus_is_root(bus)) {
> > > > @@ -1356,7 +1357,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> > > >              scope = aml_scope("\\_SB");
> > > >              dev = aml_device("PC%.02X", bus_num);
> > > >              aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num)));
> > > > -            init_pci_acpi(dev, bus_num, pci_bus_is_express(bus) ? PCIE : PCI);
> > > > +            init_pci_acpi(dev, uid, pci_bus_is_express(bus) ? PCIE : PCI);
> > > >  
> > > >              if (numa_node != NUMA_NODE_UNASSIGNED) {
> > > >                  aml_append(dev, aml_name_decl("_PXM", aml_int(numa_node)));
> > > > diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
> > > > index b42592e1ff..5021b60435 100644
> > > > --- a/hw/pci-bridge/pci_expander_bridge.c
> > > > +++ b/hw/pci-bridge/pci_expander_bridge.c
> > > > @@ -67,6 +67,7 @@ struct PXBDev {
> > > >  
> > > >      uint8_t bus_nr;
> > > >      uint16_t numa_node;
> > > > +    int32_t uid;
> > > >  };
> > > >  
> > > >  static PXBDev *convert_to_pxb(PCIDevice *dev)
> > 
> > As long as we are doing this, do we want to support a string uid too?
> > How about a 64 bit uid? Why not?
> 
> If generally the idea of the patch is welcome, I am happy to change it.

I am still not sure; let's figure out the motivation first.

For example, grep for UID and you will find more cases
like this. E.g. hw/pci-host/gpex-acpi.c

Do we care?



> > 
> > 
> > > > @@ -98,12 +99,20 @@ static uint16_t pxb_bus_numa_node(PCIBus *bus)
> > > >      return pxb->numa_node;
> > > >  }
> > > >  
> > > > +static int32_t pxb_bus_uid(PCIBus *bus)
> > > > +{
> > > > +    PXBDev *pxb = convert_to_pxb(bus->parent_dev);
> > > > +
> > > > +    return pxb->uid;
> > > > +}
> > > > +
> > > >  static void pxb_bus_class_init(ObjectClass *class, void *data)
> > > >  {
> > > >      PCIBusClass *pbc = PCI_BUS_CLASS(class);
> > > >  
> > > >      pbc->bus_num = pxb_bus_num;
> > > >      pbc->numa_node = pxb_bus_numa_node;
> > > > +    pbc->uid = pxb_bus_uid;
> > > >  }
> > > >  
> > > >  static const TypeInfo pxb_bus_info = {
> > > > @@ -329,6 +338,7 @@ static Property pxb_dev_properties[] = {
> > > >      /* Note: 0 is not a legal PXB bus number. */
> > > >      DEFINE_PROP_UINT8("bus_nr", PXBDev, bus_nr, 0),
> > > >      DEFINE_PROP_UINT16("numa_node", PXBDev, numa_node, NUMA_NODE_UNASSIGNED),
> > > > +    DEFINE_PROP_INT32("uid", PXBDev, uid, -1),
> > > >      DEFINE_PROP_END_OF_LIST(),
> > > >  };
> > > >  
> > > > @@ -400,12 +410,21 @@ static const TypeInfo pxb_pcie_dev_info = {
> > > >  
> > > >  static void pxb_cxl_dev_realize(PCIDevice *dev, Error **errp)
> > > >  {
> > > > +    PXBDev *pxb = convert_to_pxb(dev);
> > > > +
> > > >      /* A CXL PXB's parent bus is still PCIe */
> > > >      if (!pci_bus_is_express(pci_get_bus(dev))) {
> > > >          error_setg(errp, "pxb-cxl devices cannot reside on a PCI bus");
> > > >          return;
> > > >      }
> > > >  
> > > > +    if (pxb->uid < 0) {
> > > > +        error_setg(errp, "pxb-cxl devices must have a valid uid (0-2147483647)");
> > > > +        return;
> > > > +    }
> > > > +
> > > > +    /* FIXME: Check that uid doesn't collide with UIDs of other host bridges */
> > > > +
> > > >      pxb_dev_realize_common(dev, CXL, errp);
> > > >  }
> > > >  
> > > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > > > index adbe8aa260..bf019d91a0 100644
> > > > --- a/hw/pci/pci.c
> > > > +++ b/hw/pci/pci.c
> > > > @@ -170,6 +170,11 @@ static uint16_t pcibus_numa_node(PCIBus *bus)
> > > >      return NUMA_NODE_UNASSIGNED;
> > > >  }
> > > >  
> > > > +static int32_t pcibus_uid(PCIBus *bus)
> > > > +{
> > > > +    return -1;
> > > > +}
> > > > +
> > > >  static void pci_bus_class_init(ObjectClass *klass, void *data)
> > > >  {
> > > >      BusClass *k = BUS_CLASS(klass);
> > > > @@ -184,6 +189,7 @@ static void pci_bus_class_init(ObjectClass *klass, void *data)
> > > >  
> > > >      pbc->bus_num = pcibus_num;
> > > >      pbc->numa_node = pcibus_numa_node;
> > > > +    pbc->uid = pcibus_uid;
> > > >  }
> > > >  
> > > >  static const TypeInfo pci_bus_info = {
> > > > @@ -530,6 +536,11 @@ int pci_bus_numa_node(PCIBus *bus)
> > > >      return PCI_BUS_GET_CLASS(bus)->numa_node(bus);
> > > >  }
> > > >  
> > > > +int pci_bus_uid(PCIBus *bus)
> > > > +{
> > > > +    return PCI_BUS_GET_CLASS(bus)->uid(bus);
> > > > +}
> > > > +
> > > >  static int get_pci_config_device(QEMUFile *f, void *pv, size_t size,
> > > >                                   const VMStateField *field)
> > > >  {
> > > > diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> > > > index bde3697bee..a46de48ccd 100644
> > > > --- a/include/hw/pci/pci.h
> > > > +++ b/include/hw/pci/pci.h
> > > > @@ -463,6 +463,7 @@ static inline int pci_dev_bus_num(const PCIDevice *dev)
> > > >  }
> > > >  
> > > >  int pci_bus_numa_node(PCIBus *bus);
> > > > +int pci_bus_uid(PCIBus *bus);
> > > >  void pci_for_each_device(PCIBus *bus, int bus_num,
> > > >                           void (*fn)(PCIBus *bus, PCIDevice *d, void *opaque),
> > > >                           void *opaque);
> > > > diff --git a/include/hw/pci/pci_bus.h b/include/hw/pci/pci_bus.h
> > > > index eb94e7e85c..3c9fbc55bb 100644
> > > > --- a/include/hw/pci/pci_bus.h
> > > > +++ b/include/hw/pci/pci_bus.h
> > > > @@ -17,6 +17,7 @@ struct PCIBusClass {
> > > >  
> > > >      int (*bus_num)(PCIBus *bus);
> > > >      uint16_t (*numa_node)(PCIBus *bus);
> > > > +    int32_t (*uid)(PCIBus *bus);
> > > >  };
> > > >  
> > > >  enum PCIBusFlags {
> > 


^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [RFC PATCH v3 16/31] hw/pci: Plumb _UID through host bridges
@ 2021-02-02 15:51           ` Michael S. Tsirkin
  0 siblings, 0 replies; 117+ messages in thread
From: Michael S. Tsirkin @ 2021-02-02 15:51 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: David Hildenbrand, Vishal Verma, John Groves (jgroves),
	Chris Browy, qemu-devel, linux-cxl, Markus Armbruster,
	Jonathan Cameron, Igor Mammedov, Dan Williams, Ira Weiny,
	Philippe Mathieu-Daudé

On Tue, Feb 02, 2021 at 07:42:57AM -0800, Ben Widawsky wrote:
> Thanks for looking! Mixing comments to Jonathan and Michael..
> 
> On 21-02-02 10:24:43, Michael S. Tsirkin wrote:
> > On Tue, Feb 02, 2021 at 03:00:56PM +0000, Jonathan Cameron wrote:
> > > On Mon, 1 Feb 2021 16:59:33 -0800
> > > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > 
> > > > Currently, QEMU makes _UID equivalent to the bus number (_BBN). While
> > > > there is nothing wrong with doing it this way, CXL spec has a heavy
> > > > reliance on _UID to identify host bridges and there is no link to the
> > > > bus number. Having a distinct UID solves two problems. The first is it
> > > > gets us around the limitation of 256 (current max bus number).
> > 
> > Not sure I understand. You want more than 256 host bridges?
> > 
> 
> I don't want more than 256 host bridges, but I want the ability to disaggregate
> _UID and bus (_BBN). The reasoning is just to align with the spec where _UID is
> used to identify a CXL host bridge which is unrelated (perhaps) to the bus
> number.

Which spec is that?

> > > The
> > > > second is it allows us to replicate hardware configurations where bus
> > > > number and uid aren't equivalent.
> > 
> > A bit more data on when this needs to be the case?
> > 
> 
> Doesn't *need* to be the case. I was making a concerted effort to allow full
> spec flexibility, but I don't believe it to be necessary unless we want to
> accurately emulate a real platform.
> 
> > > The latter has benefits for our
> > > > development and debugging using QEMU.
> > > > 
> > > > The other way to do this would be to implement the expanded bus
> > > > numbering, but having an explicit uid makes more sense when trying to
> > > > replicate real hardware configurations.
> > > > 
> > > > The QEMU commandline to utilize this would be:
> > > >   -device pxb-cxl,id=cxl.0,bus="pcie.0",bus_nr=1,uid=x
> > > > 
> > > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > 
> > However, if doing this how do we ensure UID is still unique?
> > What do we do for cases where UID was not specified?
> > One idea is to generate a string UID and just stick the bus #
> > in there.
> 
> This is totally mishandled in the code currently. I like your idea though.
> 
> > 
> > 
> > > > --
> > > > 
> > > > I'm guessing this patch will be somewhat controversial. For early CXL
> > > > work, this can be dropped without too much heartache.
> > > 
> > > Whilst I'm not personally against, this maybe best to drop for now as you
> > > say.
> > > 
> 
> I think it'd be good to understand from the PCIe experts if CXL matches in this
> regard. If PCIe generally allows (and does in practice) _UID not matching _BBN,
> perhaps this is an overall improvement to the code.

Well

6.1.12 _UID (Unique ID)
	This object provides OSPM with a logical device ID that does not change across reboots. This object
	is optional, but is required when the device has no other way to report a persistent unique device ID.
	The _UID must be unique across all devices with either a common _HID or _CID. This is because a
	device needs to be uniquely identified to the OSPM, which may match on either a _HID or a _CID to
	identify the device. The uniqueness match must be true regardless of whether the OSPM uses the
	_HID or the _CID. OSPM typically uses the unique device ID to ensure that the device-specific
	information, such as network protocol binding information, is remembered for the device even if its
	relative location changes. For most integrated devices, this object contains a unique identifier.
	A _UID object evaluates to either a numeric value or a string.


In other words, _UID is there so the guest can tell devices with identical
_HID/_CID apart.




> > > > ---
> > > >  hw/i386/acpi-build.c                |  3 ++-
> > > >  hw/pci-bridge/pci_expander_bridge.c | 19 +++++++++++++++++++
> > > >  hw/pci/pci.c                        | 11 +++++++++++
> > > >  include/hw/pci/pci.h                |  1 +
> > > >  include/hw/pci/pci_bus.h            |  1 +
> > > >  5 files changed, 34 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> > > > index cf6eb54c22..145a503e92 100644
> > > > --- a/hw/i386/acpi-build.c
> > > > +++ b/hw/i386/acpi-build.c
> > > > @@ -1343,6 +1343,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> > > >          QLIST_FOREACH(bus, &bus->child, sibling) {
> > > >              uint8_t bus_num = pci_bus_num(bus);
> > > >              uint8_t numa_node = pci_bus_numa_node(bus);
> > > > +            int32_t uid = pci_bus_uid(bus);
> > > >  
> > > >              /* look only for expander root buses */
> > > >              if (!pci_bus_is_root(bus)) {
> > > > @@ -1356,7 +1357,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> > > >              scope = aml_scope("\\_SB");
> > > >              dev = aml_device("PC%.02X", bus_num);
> > > >              aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num)));
> > > > -            init_pci_acpi(dev, bus_num, pci_bus_is_express(bus) ? PCIE : PCI);
> > > > +            init_pci_acpi(dev, uid, pci_bus_is_express(bus) ? PCIE : PCI);
> > > >  
> > > >              if (numa_node != NUMA_NODE_UNASSIGNED) {
> > > >                  aml_append(dev, aml_name_decl("_PXM", aml_int(numa_node)));
> > > > diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
> > > > index b42592e1ff..5021b60435 100644
> > > > --- a/hw/pci-bridge/pci_expander_bridge.c
> > > > +++ b/hw/pci-bridge/pci_expander_bridge.c
> > > > @@ -67,6 +67,7 @@ struct PXBDev {
> > > >  
> > > >      uint8_t bus_nr;
> > > >      uint16_t numa_node;
> > > > +    int32_t uid;
> > > >  };
> > > >  
> > > >  static PXBDev *convert_to_pxb(PCIDevice *dev)
> > 
> > As long as we are doing this, do we want to support a string uid too?
> > How about a 64 bit uid? Why not?
> 
> If generally the idea of the patch is welcome, I am happy to change it.

I am still not sure; let's figure out the motivation first.

For example, grep for UID and you will find more cases
like this. E.g. hw/pci-host/gpex-acpi.c

Do we care?



> > 
> > 
> > > > @@ -98,12 +99,20 @@ static uint16_t pxb_bus_numa_node(PCIBus *bus)
> > > >      return pxb->numa_node;
> > > >  }
> > > >  
> > > > +static int32_t pxb_bus_uid(PCIBus *bus)
> > > > +{
> > > > +    PXBDev *pxb = convert_to_pxb(bus->parent_dev);
> > > > +
> > > > +    return pxb->uid;
> > > > +}
> > > > +
> > > >  static void pxb_bus_class_init(ObjectClass *class, void *data)
> > > >  {
> > > >      PCIBusClass *pbc = PCI_BUS_CLASS(class);
> > > >  
> > > >      pbc->bus_num = pxb_bus_num;
> > > >      pbc->numa_node = pxb_bus_numa_node;
> > > > +    pbc->uid = pxb_bus_uid;
> > > >  }
> > > >  
> > > >  static const TypeInfo pxb_bus_info = {
> > > > @@ -329,6 +338,7 @@ static Property pxb_dev_properties[] = {
> > > >      /* Note: 0 is not a legal PXB bus number. */
> > > >      DEFINE_PROP_UINT8("bus_nr", PXBDev, bus_nr, 0),
> > > >      DEFINE_PROP_UINT16("numa_node", PXBDev, numa_node, NUMA_NODE_UNASSIGNED),
> > > > +    DEFINE_PROP_INT32("uid", PXBDev, uid, -1),
> > > >      DEFINE_PROP_END_OF_LIST(),
> > > >  };
> > > >  
> > > > @@ -400,12 +410,21 @@ static const TypeInfo pxb_pcie_dev_info = {
> > > >  
> > > >  static void pxb_cxl_dev_realize(PCIDevice *dev, Error **errp)
> > > >  {
> > > > +    PXBDev *pxb = convert_to_pxb(dev);
> > > > +
> > > >      /* A CXL PXB's parent bus is still PCIe */
> > > >      if (!pci_bus_is_express(pci_get_bus(dev))) {
> > > >          error_setg(errp, "pxb-cxl devices cannot reside on a PCI bus");
> > > >          return;
> > > >      }
> > > >  
> > > > +    if (pxb->uid < 0) {
> > > > +        error_setg(errp, "pxb-cxl devices must have a valid uid (0-2147483647)");
> > > > +        return;
> > > > +    }
> > > > +
> > > > +    /* FIXME: Check that uid doesn't collide with UIDs of other host bridges */
> > > > +
> > > >      pxb_dev_realize_common(dev, CXL, errp);
> > > >  }
> > > >  
> > > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > > > index adbe8aa260..bf019d91a0 100644
> > > > --- a/hw/pci/pci.c
> > > > +++ b/hw/pci/pci.c
> > > > @@ -170,6 +170,11 @@ static uint16_t pcibus_numa_node(PCIBus *bus)
> > > >      return NUMA_NODE_UNASSIGNED;
> > > >  }
> > > >  
> > > > +static int32_t pcibus_uid(PCIBus *bus)
> > > > +{
> > > > +    return -1;
> > > > +}
> > > > +
> > > >  static void pci_bus_class_init(ObjectClass *klass, void *data)
> > > >  {
> > > >      BusClass *k = BUS_CLASS(klass);
> > > > @@ -184,6 +189,7 @@ static void pci_bus_class_init(ObjectClass *klass, void *data)
> > > >  
> > > >      pbc->bus_num = pcibus_num;
> > > >      pbc->numa_node = pcibus_numa_node;
> > > > +    pbc->uid = pcibus_uid;
> > > >  }
> > > >  
> > > >  static const TypeInfo pci_bus_info = {
> > > > @@ -530,6 +536,11 @@ int pci_bus_numa_node(PCIBus *bus)
> > > >      return PCI_BUS_GET_CLASS(bus)->numa_node(bus);
> > > >  }
> > > >  
> > > > +int pci_bus_uid(PCIBus *bus)
> > > > +{
> > > > +    return PCI_BUS_GET_CLASS(bus)->uid(bus);
> > > > +}
> > > > +
> > > >  static int get_pci_config_device(QEMUFile *f, void *pv, size_t size,
> > > >                                   const VMStateField *field)
> > > >  {
> > > > diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> > > > index bde3697bee..a46de48ccd 100644
> > > > --- a/include/hw/pci/pci.h
> > > > +++ b/include/hw/pci/pci.h
> > > > @@ -463,6 +463,7 @@ static inline int pci_dev_bus_num(const PCIDevice *dev)
> > > >  }
> > > >  
> > > >  int pci_bus_numa_node(PCIBus *bus);
> > > > +int pci_bus_uid(PCIBus *bus);
> > > >  void pci_for_each_device(PCIBus *bus, int bus_num,
> > > >                           void (*fn)(PCIBus *bus, PCIDevice *d, void *opaque),
> > > >                           void *opaque);
> > > > diff --git a/include/hw/pci/pci_bus.h b/include/hw/pci/pci_bus.h
> > > > index eb94e7e85c..3c9fbc55bb 100644
> > > > --- a/include/hw/pci/pci_bus.h
> > > > +++ b/include/hw/pci/pci_bus.h
> > > > @@ -17,6 +17,7 @@ struct PCIBusClass {
> > > >  
> > > >      int (*bus_num)(PCIBus *bus);
> > > >      uint16_t (*numa_node)(PCIBus *bus);
> > > > +    int32_t (*uid)(PCIBus *bus);
> > > >  };
> > > >  
> > > >  enum PCIBusFlags {
> > 



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [RFC PATCH v3 16/31] hw/pci: Plumb _UID through host bridges
  2021-02-02 15:51           ` Michael S. Tsirkin
@ 2021-02-02 16:20             ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02 16:20 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jonathan Cameron, qemu-devel, linux-cxl, Chris Browy,
	Dan Williams, David Hildenbrand, Igor Mammedov, Ira Weiny,
	Marcel Apfelbaum, Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves)

On 21-02-02 10:51:55, Michael S. Tsirkin wrote:
> On Tue, Feb 02, 2021 at 07:42:57AM -0800, Ben Widawsky wrote:
> > Thanks for looking! Mixing comments to Jonathan and Michael..
> > 
> > On 21-02-02 10:24:43, Michael S. Tsirkin wrote:
> > > On Tue, Feb 02, 2021 at 03:00:56PM +0000, Jonathan Cameron wrote:
> > > > On Mon, 1 Feb 2021 16:59:33 -0800
> > > > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > > 
> > > > > Currently, QEMU makes _UID equivalent to the bus number (_BBN). While
> > > > > there is nothing wrong with doing it this way, CXL spec has a heavy
> > > > > reliance on _UID to identify host bridges and there is no link to the
> > > > > bus number. Having a distinct UID solves two problems. The first is it
> > > > > gets us around the limitation of 256 (current max bus number).
> > > 
> > > Not sure I understand. You want more than 256 host bridges?
> > > 
> > 
> > I don't want more than 256 host bridges, but I want the ability to disaggregate
> > _UID and bus (_BBN). The reasoning is just to align with the spec where _UID is
> > used to identify a CXL host bridge which is unrelated (perhaps) to the bus
> > number.
> 
> Which spec is that?
> 

CXL spec
https://www.computeexpresslink.org/download-the-specification

The spec introduces a new ACPI table which links information about a host bridge
based on the _UID. The _UID is 4 bytes.

"In an ACPI compliant system, there shall be one instance of CXL Host Bridge
Device object in ACPI namespace (HID=”ACPI0016”) for every CHBS entry. The _UID
object under a CXL Host Bridge object, when evaluated, shall match the UID field
in the associated CHBS entry."

"CXL Host Bridge Unique ID. Used to associate a CHBS instance with CXL Host
Bridge instance. The value of this field shall match the output of _UID under
the associated CXL Host Bridge in ACPI namespace"
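
To make the association concrete, here is a minimal sketch of how a consumer
might pair an evaluated _UID with its CHBS entry. The struct below is a
simplified, hypothetical view (the field names are my own and it is not the
spec's full layout), and chbs_find_by_uid() is an illustrative helper, not an
existing QEMU or Linux function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a CHBS entry; not the spec's full layout. */
struct chbs_entry {
    uint32_t uid;       /* must equal _UID of the matching ACPI0016 device */
    uint64_t reg_base;  /* base of the host bridge's component registers */
};

/* Return the CHBS entry whose UID matches the evaluated _UID, or NULL. */
static const struct chbs_entry *
chbs_find_by_uid(const struct chbs_entry *entries, size_t count, uint32_t uid)
{
    assert(entries != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        if (entries[i].uid == uid) {
            return &entries[i];
        }
    }
    return NULL;
}
```

Nothing in the lookup involves the bus number, which is the reason for wanting
_UID to be settable independently of _BBN.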


> > > > The
> > > > > second is it allows us to replicate hardware configurations where bus
> > > > > number and uid aren't equivalent.
> > > 
> > > A bit more data on when this needs to be the case?
> > > 
> > 
> > Doesn't *need* to be the case. I was making a concerted effort to allow full
> > spec flexibility, but I don't believe it to be necessary unless we want to
> > accurately emulate a real platform.
> > 
> > > > The latter has benefits for our
> > > > > development and debugging using QEMU.
> > > > > 
> > > > > The other way to do this would be to implement the expanded bus
> > > > > numbering, but having an explicit uid makes more sense when trying to
> > > > > replicate real hardware configurations.
> > > > > 
> > > > > The QEMU commandline to utilize this would be:
> > > > >   -device pxb-cxl,id=cxl.0,bus="pcie.0",bus_nr=1,uid=x
> > > > > 
> > > > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > 
> > > However, if doing this how do we ensure UID is still unique?
> > > What do we do for cases where UID was not specified?
> > > One idea is to generate a string UID and just stick the bus #
> > > in there.
> > 
> > This is totally mishandled in the code currently. I like your idea though.
> > 
> > > 
> > > 
> > > > > --
> > > > > 
> > > > > I'm guessing this patch will be somewhat controversial. For early CXL
> > > > > work, this can be dropped without too much heartache.
> > > > 
> > > > Whilst I'm not personally against, this maybe best to drop for now as you
> > > > say.
> > > > 
> > 
> > I think it'd be good to understand from the PCIe experts if CXL matches in this
> > regard. If PCIe generally allows (and does in practice) _UID not matching _BBN,
> > perhaps this is an overall improvement to the code.
> 
> Well
> 
> 6.1.12 _UID (Unique ID)
> 	This object provides OSPM with a logical device ID that does not change across reboots. This object
> 	is optional, but is required when the device has no other way to report a persistent unique device ID.
> 	The _UID must be unique across all devices with either a common _HID or _CID. This is because a
> 	device needs to be uniquely identified to the OSPM, which may match on either a _HID or a _CID to
> 	identify the device. The uniqueness match must be true regardless of whether the OSPM uses the
> 	_HID or the _CID. OSPM typically uses the unique device ID to ensure that the device-specific
> 	information, such as network protocol binding information, is remembered for the device even if its
> 	relative location changes. For most integrated devices, this object contains a unique identifier.
> 	A _UID object evaluates to either a numeric value or a string.
> 
> 
> IOW _UID is there so guest can tell devices with identical HID/CID apart.
> 

Right, so it's pretty much the same thing. There's ancillary data about the CXL
host bridge that is identified by the _UID.
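
For the uniqueness question raised above, a rough sketch of what a collision
check could look like. This is only an illustration under assumptions: struct
bridge is a hypothetical stand-in rather than QEMU's PXBDev, and it uses a
numeric fallback to the bus number when no uid is given, which is a simpler
variant than the string UID that was suggested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-bridge state; not QEMU's PXBDev. */
struct bridge {
    uint8_t bus_nr;
    int32_t uid;        /* -1 means "not specified", as in the patch */
};

/* Fall back to the bus number when no explicit uid was given. */
static uint32_t bridge_effective_uid(const struct bridge *b)
{
    assert(b != NULL);
    return b->uid >= 0 ? (uint32_t)b->uid : b->bus_nr;
}

/* True if 'candidate' would share an effective uid with an existing bridge. */
static bool bridge_uid_collides(const struct bridge *existing, size_t count,
                                const struct bridge *candidate)
{
    uint32_t uid = bridge_effective_uid(candidate);

    for (size_t i = 0; i < count; i++) {
        if (bridge_effective_uid(&existing[i]) == uid) {
            return true;
        }
    }
    return false;
}
```

Note that an explicit uid can collide with another bridge's bus-number
fallback, so the check has to compare effective values rather than only the
explicitly set ones.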

> 
> 
> 
> > > > > ---
> > > > >  hw/i386/acpi-build.c                |  3 ++-
> > > > >  hw/pci-bridge/pci_expander_bridge.c | 19 +++++++++++++++++++
> > > > >  hw/pci/pci.c                        | 11 +++++++++++
> > > > >  include/hw/pci/pci.h                |  1 +
> > > > >  include/hw/pci/pci_bus.h            |  1 +
> > > > >  5 files changed, 34 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> > > > > index cf6eb54c22..145a503e92 100644
> > > > > --- a/hw/i386/acpi-build.c
> > > > > +++ b/hw/i386/acpi-build.c
> > > > > @@ -1343,6 +1343,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> > > > >          QLIST_FOREACH(bus, &bus->child, sibling) {
> > > > >              uint8_t bus_num = pci_bus_num(bus);
> > > > >              uint8_t numa_node = pci_bus_numa_node(bus);
> > > > > +            int32_t uid = pci_bus_uid(bus);
> > > > >  
> > > > >              /* look only for expander root buses */
> > > > >              if (!pci_bus_is_root(bus)) {
> > > > > @@ -1356,7 +1357,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> > > > >              scope = aml_scope("\\_SB");
> > > > >              dev = aml_device("PC%.02X", bus_num);
> > > > >              aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num)));
> > > > > -            init_pci_acpi(dev, bus_num, pci_bus_is_express(bus) ? PCIE : PCI);
> > > > > +            init_pci_acpi(dev, uid, pci_bus_is_express(bus) ? PCIE : PCI);
> > > > >  
> > > > >              if (numa_node != NUMA_NODE_UNASSIGNED) {
> > > > >                  aml_append(dev, aml_name_decl("_PXM", aml_int(numa_node)));
> > > > > diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
> > > > > index b42592e1ff..5021b60435 100644
> > > > > --- a/hw/pci-bridge/pci_expander_bridge.c
> > > > > +++ b/hw/pci-bridge/pci_expander_bridge.c
> > > > > @@ -67,6 +67,7 @@ struct PXBDev {
> > > > >  
> > > > >      uint8_t bus_nr;
> > > > >      uint16_t numa_node;
> > > > > +    int32_t uid;
> > > > >  };
> > > > >  
> > > > >  static PXBDev *convert_to_pxb(PCIDevice *dev)
> > > 
> > > As long as we are doing this, do we want to support a string uid too?
> > > How about a 64 bit uid? Why not?
> > 
> > If generally the idea of the patch is welcome, I am happy to change it.
> 
> I am still not sure, let's figure out the motivation.
> 
> For example, grep for UID and you will find more cases
> like this. E.g. hw/pci-host/gpex-acpi.c
> 
> Do we care?

For my purposes, I do not.

> 
> 
> 
> > > 
> > > 
> > > > > @@ -98,12 +99,20 @@ static uint16_t pxb_bus_numa_node(PCIBus *bus)
> > > > >      return pxb->numa_node;
> > > > >  }
> > > > >  
> > > > > +static int32_t pxb_bus_uid(PCIBus *bus)
> > > > > +{
> > > > > +    PXBDev *pxb = convert_to_pxb(bus->parent_dev);
> > > > > +
> > > > > +    return pxb->uid;
> > > > > +}
> > > > > +
> > > > >  static void pxb_bus_class_init(ObjectClass *class, void *data)
> > > > >  {
> > > > >      PCIBusClass *pbc = PCI_BUS_CLASS(class);
> > > > >  
> > > > >      pbc->bus_num = pxb_bus_num;
> > > > >      pbc->numa_node = pxb_bus_numa_node;
> > > > > +    pbc->uid = pxb_bus_uid;
> > > > >  }
> > > > >  
> > > > >  static const TypeInfo pxb_bus_info = {
> > > > > @@ -329,6 +338,7 @@ static Property pxb_dev_properties[] = {
> > > > >      /* Note: 0 is not a legal PXB bus number. */
> > > > >      DEFINE_PROP_UINT8("bus_nr", PXBDev, bus_nr, 0),
> > > > >      DEFINE_PROP_UINT16("numa_node", PXBDev, numa_node, NUMA_NODE_UNASSIGNED),
> > > > > +    DEFINE_PROP_INT32("uid", PXBDev, uid, -1),
> > > > >      DEFINE_PROP_END_OF_LIST(),
> > > > >  };
> > > > >  
> > > > > @@ -400,12 +410,21 @@ static const TypeInfo pxb_pcie_dev_info = {
> > > > >  
> > > > >  static void pxb_cxl_dev_realize(PCIDevice *dev, Error **errp)
> > > > >  {
> > > > > +    PXBDev *pxb = convert_to_pxb(dev);
> > > > > +
> > > > >      /* A CXL PXB's parent bus is still PCIe */
> > > > >      if (!pci_bus_is_express(pci_get_bus(dev))) {
> > > > >          error_setg(errp, "pxb-cxl devices cannot reside on a PCI bus");
> > > > >          return;
> > > > >      }
> > > > >  
> > > > > +    if (pxb->uid < 0) {
> > > > > +        error_setg(errp, "pxb-cxl devices must have a valid uid (0-2147483647)");
> > > > > +        return;
> > > > > +    }
> > > > > +
> > > > > +    /* FIXME: Check that uid doesn't collide with UIDs of other host bridges */
> > > > > +
> > > > >      pxb_dev_realize_common(dev, CXL, errp);
> > > > >  }
> > > > >  
> > > > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > > > > index adbe8aa260..bf019d91a0 100644
> > > > > --- a/hw/pci/pci.c
> > > > > +++ b/hw/pci/pci.c
> > > > > @@ -170,6 +170,11 @@ static uint16_t pcibus_numa_node(PCIBus *bus)
> > > > >      return NUMA_NODE_UNASSIGNED;
> > > > >  }
> > > > >  
> > > > > +static int32_t pcibus_uid(PCIBus *bus)
> > > > > +{
> > > > > +    return -1;
> > > > > +}
> > > > > +
> > > > >  static void pci_bus_class_init(ObjectClass *klass, void *data)
> > > > >  {
> > > > >      BusClass *k = BUS_CLASS(klass);
> > > > > @@ -184,6 +189,7 @@ static void pci_bus_class_init(ObjectClass *klass, void *data)
> > > > >  
> > > > >      pbc->bus_num = pcibus_num;
> > > > >      pbc->numa_node = pcibus_numa_node;
> > > > > +    pbc->uid = pcibus_uid;
> > > > >  }
> > > > >  
> > > > >  static const TypeInfo pci_bus_info = {
> > > > > @@ -530,6 +536,11 @@ int pci_bus_numa_node(PCIBus *bus)
> > > > >      return PCI_BUS_GET_CLASS(bus)->numa_node(bus);
> > > > >  }
> > > > >  
> > > > > +int pci_bus_uid(PCIBus *bus)
> > > > > +{
> > > > > +    return PCI_BUS_GET_CLASS(bus)->uid(bus);
> > > > > +}
> > > > > +
> > > > >  static int get_pci_config_device(QEMUFile *f, void *pv, size_t size,
> > > > >                                   const VMStateField *field)
> > > > >  {
> > > > > diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> > > > > index bde3697bee..a46de48ccd 100644
> > > > > --- a/include/hw/pci/pci.h
> > > > > +++ b/include/hw/pci/pci.h
> > > > > @@ -463,6 +463,7 @@ static inline int pci_dev_bus_num(const PCIDevice *dev)
> > > > >  }
> > > > >  
> > > > >  int pci_bus_numa_node(PCIBus *bus);
> > > > > +int pci_bus_uid(PCIBus *bus);
> > > > >  void pci_for_each_device(PCIBus *bus, int bus_num,
> > > > >                           void (*fn)(PCIBus *bus, PCIDevice *d, void *opaque),
> > > > >                           void *opaque);
> > > > > diff --git a/include/hw/pci/pci_bus.h b/include/hw/pci/pci_bus.h
> > > > > index eb94e7e85c..3c9fbc55bb 100644
> > > > > --- a/include/hw/pci/pci_bus.h
> > > > > +++ b/include/hw/pci/pci_bus.h
> > > > > @@ -17,6 +17,7 @@ struct PCIBusClass {
> > > > >  
> > > > >      int (*bus_num)(PCIBus *bus);
> > > > >      uint16_t (*numa_node)(PCIBus *bus);
> > > > > +    int32_t (*uid)(PCIBus *bus);
> > > > >  };
> > > > >  
> > > > >  enum PCIBusFlags {
> > > 
> 

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [RFC PATCH v3 17/31] hw/cxl/component: Implement host bridge MMIO (8.2.5, table 142)
  2021-02-02  0:59   ` Ben Widawsky
@ 2021-02-02 19:21     ` Jonathan Cameron
  -1 siblings, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2021-02-02 19:21 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: qemu-devel, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Marcel Apfelbaum,
	Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

On Mon, 1 Feb 2021 16:59:34 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> CXL host bridges themselves may have MMIO. Since host bridges don't have
> a BAR they are treated as special for MMIO.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> 
> --
> 
> It's arbitrarily chosen here to pick 0xD0000000 as the base for the host
> bridge MMIO. I'm not sure what the right way to find free space for
> platform hardcoded things like this is.

Seems like this needs to come from the machine definition.
This is fairly easy for arm/virt, where there is a clearly laid out memory map.
For hw/i386 I'm less sure on how to do it.

Having said that, for this particular magic device, we do have a PCI EP
associated with it.  How about putting all the host bridge MMIO into a
BAR of that rather than having it separate?
That has the added advantage of making it discoverable from firmware.

Any normal system is going to have this as impdef for discovery anyway.

That would then let you drop the separate definition of CXLHost structure
though it needs a bit of figuring out what to do with the memory window
setup etc.

I tried hacking it together, but have not gotten it working yet.

> ---
>  hw/pci-bridge/pci_expander_bridge.c | 53 ++++++++++++++++++++++++++++-
>  include/hw/cxl/cxl.h                |  2 ++
>  2 files changed, 54 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
> index 5021b60435..226a8a5fff 100644
> --- a/hw/pci-bridge/pci_expander_bridge.c
> +++ b/hw/pci-bridge/pci_expander_bridge.c
> @@ -17,6 +17,7 @@
>  #include "hw/pci/pci_host.h"
>  #include "hw/qdev-properties.h"
>  #include "hw/pci/pci_bridge.h"
> +#include "hw/cxl/cxl.h"
>  #include "qemu/range.h"
>  #include "qemu/error-report.h"
>  #include "qemu/module.h"
> @@ -70,6 +71,12 @@ struct PXBDev {
>      int32_t uid;
>  };
>  
> +typedef struct CXLHost {
> +    PCIHostState parent_obj;
> +
> +    CXLComponentState cxl_cstate;
> +} CXLHost;
> +
>  static PXBDev *convert_to_pxb(PCIDevice *dev)
>  {
>      /* A CXL PXB's parent bus is PCIe, so the normal check won't work */
> @@ -85,6 +92,9 @@ static GList *pxb_dev_list;
>  
>  #define TYPE_PXB_HOST "pxb-host"
>  
> +#define TYPE_PXB_CXL_HOST "pxb-cxl-host"
> +#define PXB_CXL_HOST(obj) OBJECT_CHECK(CXLHost, (obj), TYPE_PXB_CXL_HOST)
> +
>  static int pxb_bus_num(PCIBus *bus)
>  {
>      PXBDev *pxb = convert_to_pxb(bus->parent_dev);
> @@ -198,6 +208,46 @@ static const TypeInfo pxb_host_info = {
>      .class_init    = pxb_host_class_init,
>  };
>  
> +static void pxb_cxl_realize(DeviceState *dev, Error **errp)
> +{
> +    SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
> +    PCIHostState *phb = PCI_HOST_BRIDGE(dev);
> +    CXLHost *cxl = PXB_CXL_HOST(dev);
> +    CXLComponentState *cxl_cstate = &cxl->cxl_cstate;
> +    struct MemoryRegion *mr = &cxl_cstate->crb.component_registers;
> +
> +    cxl_component_register_block_init(OBJECT(dev), cxl_cstate,
> +                                      TYPE_PXB_CXL_HOST);
> +    sysbus_init_mmio(sbd, mr);
> +
> +    /* FIXME: support multiple host bridges. */
> +    sysbus_mmio_map(sbd, 0, CXL_HOST_BASE +
> +                            memory_region_size(mr) * pci_bus_uid(phb->bus));
> +}
> +
> +static void pxb_cxl_host_class_init(ObjectClass *class, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(class);
> +    PCIHostBridgeClass *hc = PCI_HOST_BRIDGE_CLASS(class);
> +
> +    hc->root_bus_path = pxb_host_root_bus_path;
> +    dc->fw_name = "cxl";
> +    dc->realize = pxb_cxl_realize;
> +    /* Reason: Internal part of the pxb/pxb-pcie device, not usable by itself */
> +    dc->user_creatable = false;
> +}
> +
> +/*
> + * This is a device to handle the MMIO for a CXL host bridge. It does nothing
> + * else.
> + */
> +static const TypeInfo cxl_host_info = {
> +    .name          = TYPE_PXB_CXL_HOST,
> +    .parent        = TYPE_PCI_HOST_BRIDGE,
> +    .instance_size = sizeof(CXLHost),
> +    .class_init    = pxb_cxl_host_class_init,
> +};
> +
>  /*
>   * Registers the PXB bus as a child of pci host root bus.
>   */
> @@ -272,7 +322,7 @@ static void pxb_dev_realize_common(PCIDevice *dev, enum BusType type,
>          dev_name = dev->qdev.id;
>      }
>  
> -    ds = qdev_new(TYPE_PXB_HOST);
> +    ds = qdev_new(type == CXL ? TYPE_PXB_CXL_HOST : TYPE_PXB_HOST);
>      if (type == PCIE) {
>          bus = pci_root_bus_new(ds, dev_name, NULL, NULL, 0, TYPE_PXB_PCIE_BUS);
>      } else if (type == CXL) {
> @@ -466,6 +516,7 @@ static void pxb_register_types(void)
>      type_register_static(&pxb_pcie_bus_info);
>      type_register_static(&pxb_cxl_bus_info);
>      type_register_static(&pxb_host_info);
> +    type_register_static(&cxl_host_info);
>      type_register_static(&pxb_dev_info);
>      type_register_static(&pxb_pcie_dev_info);
>      type_register_static(&pxb_cxl_dev_info);
> diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
> index 362cda40de..6bc344f205 100644
> --- a/include/hw/cxl/cxl.h
> +++ b/include/hw/cxl/cxl.h
> @@ -17,5 +17,7 @@
>  #define COMPONENT_REG_BAR_IDX 0
>  #define DEVICE_REG_BAR_IDX 2
>  
> +#define CXL_HOST_BASE 0xD0000000
> +
>  #endif
>  


^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [RFC PATCH v3 17/31] hw/cxl/component: Implement host bridge MMIO (8.2.5, table 142)
@ 2021-02-02 19:21     ` Jonathan Cameron
  0 siblings, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2021-02-02 19:21 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: David Hildenbrand, Vishal Verma, John Groves (jgroves),
	Chris Browy, qemu-devel, linux-cxl, Markus Armbruster,
	Michael S. Tsirkin, Igor Mammedov, Dan Williams, Ira Weiny,
	Philippe Mathieu-Daudé

On Mon, 1 Feb 2021 16:59:34 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> CXL host bridges themselves may have MMIO. Since host bridges don't have
> a BAR they are treated as special for MMIO.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> 
> --
> 
> It's arbitrarily chosen here to pick 0xD0000000 as the base for the host
> bridge MMIO. I'm not sure what the right way to find free space for
> platform hardcoded things like this is.

Seems like this needs to come from the machine definition.
This is fairly easy for arm/virt, where there is a clearly laid out memory map.
For hw/i386 I'm less sure on how to do it.
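
As a rough illustration of what "coming from the machine definition" could
look like, here is a minimal standalone sketch (not QEMU code; the WindowEntry
type and the cxl_host_slot() helper are hypothetical names). It assumes the
machine reserves one window and hands each host bridge a fixed-size slot in it,
indexed by UID, mirroring the CXL_HOST_BASE + size * uid arithmetic in the
patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical machine-owned window reserved for CXL host bridge MMIO. */
typedef struct WindowEntry {
    uint64_t base;
    uint64_t size;
} WindowEntry;

/*
 * Compute the MMIO base of host bridge `uid` inside the machine-provided
 * window, where every bridge gets one `block_size`-sized slot.
 * Returns false if the slot would fall outside the window.
 */
static bool cxl_host_slot(const WindowEntry *win, uint64_t block_size,
                          uint32_t uid, uint64_t *base_out)
{
    assert(win && base_out);

    /* Number of whole slots in the window must exceed the requested index. */
    if (block_size == 0 || win->size / block_size <= uid) {
        return false;
    }
    *base_out = win->base + block_size * uid;
    return true;
}
```

With a 256 KiB window at 0xD0000000 and 64 KiB register blocks, UID 3 would
land at 0xD0030000 and UID 4 would be rejected instead of silently mapping
past the end of the window.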

Having said that, for this particular magic device, we do have a PCI EP
associated with it.  How about putting all the host bridge MMIO into a
BAR of that, rather than having it separate?
That has the added advantage of making it discoverable from firmware.

Any normal system is going to have this as impdef for discovery anyway.

That would then let you drop the separate definition of the CXLHost structure,
though it needs a bit of figuring out what to do with the memory window
setup etc.

I tried hacking it together, but have not gotten it working yet.

> ---
>  hw/pci-bridge/pci_expander_bridge.c | 53 ++++++++++++++++++++++++++++-
>  include/hw/cxl/cxl.h                |  2 ++
>  2 files changed, 54 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
> index 5021b60435..226a8a5fff 100644
> --- a/hw/pci-bridge/pci_expander_bridge.c
> +++ b/hw/pci-bridge/pci_expander_bridge.c
> @@ -17,6 +17,7 @@
>  #include "hw/pci/pci_host.h"
>  #include "hw/qdev-properties.h"
>  #include "hw/pci/pci_bridge.h"
> +#include "hw/cxl/cxl.h"
>  #include "qemu/range.h"
>  #include "qemu/error-report.h"
>  #include "qemu/module.h"
> @@ -70,6 +71,12 @@ struct PXBDev {
>      int32_t uid;
>  };
>  
> +typedef struct CXLHost {
> +    PCIHostState parent_obj;
> +
> +    CXLComponentState cxl_cstate;
> +} CXLHost;
> +
>  static PXBDev *convert_to_pxb(PCIDevice *dev)
>  {
>      /* A CXL PXB's parent bus is PCIe, so the normal check won't work */
> @@ -85,6 +92,9 @@ static GList *pxb_dev_list;
>  
>  #define TYPE_PXB_HOST "pxb-host"
>  
> +#define TYPE_PXB_CXL_HOST "pxb-cxl-host"
> +#define PXB_CXL_HOST(obj) OBJECT_CHECK(CXLHost, (obj), TYPE_PXB_CXL_HOST)
> +
>  static int pxb_bus_num(PCIBus *bus)
>  {
>      PXBDev *pxb = convert_to_pxb(bus->parent_dev);
> @@ -198,6 +208,46 @@ static const TypeInfo pxb_host_info = {
>      .class_init    = pxb_host_class_init,
>  };
>  
> +static void pxb_cxl_realize(DeviceState *dev, Error **errp)
> +{
> +    SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
> +    PCIHostState *phb = PCI_HOST_BRIDGE(dev);
> +    CXLHost *cxl = PXB_CXL_HOST(dev);
> +    CXLComponentState *cxl_cstate = &cxl->cxl_cstate;
> +    struct MemoryRegion *mr = &cxl_cstate->crb.component_registers;
> +
> +    cxl_component_register_block_init(OBJECT(dev), cxl_cstate,
> +                                      TYPE_PXB_CXL_HOST);
> +    sysbus_init_mmio(sbd, mr);
> +
> +    /* FIXME: support multiple host bridges. */
> +    sysbus_mmio_map(sbd, 0, CXL_HOST_BASE +
> +                            memory_region_size(mr) * pci_bus_uid(phb->bus));
> +}
> +
> +static void pxb_cxl_host_class_init(ObjectClass *class, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(class);
> +    PCIHostBridgeClass *hc = PCI_HOST_BRIDGE_CLASS(class);
> +
> +    hc->root_bus_path = pxb_host_root_bus_path;
> +    dc->fw_name = "cxl";
> +    dc->realize = pxb_cxl_realize;
> +    /* Reason: Internal part of the pxb/pxb-pcie device, not usable by itself */
> +    dc->user_creatable = false;
> +}
> +
> +/*
> + * This is a device to handle the MMIO for a CXL host bridge. It does nothing
> + * else.
> + */
> +static const TypeInfo cxl_host_info = {
> +    .name          = TYPE_PXB_CXL_HOST,
> +    .parent        = TYPE_PCI_HOST_BRIDGE,
> +    .instance_size = sizeof(CXLHost),
> +    .class_init    = pxb_cxl_host_class_init,
> +};
> +
>  /*
>   * Registers the PXB bus as a child of pci host root bus.
>   */
> @@ -272,7 +322,7 @@ static void pxb_dev_realize_common(PCIDevice *dev, enum BusType type,
>          dev_name = dev->qdev.id;
>      }
>  
> -    ds = qdev_new(TYPE_PXB_HOST);
> +    ds = qdev_new(type == CXL ? TYPE_PXB_CXL_HOST : TYPE_PXB_HOST);
>      if (type == PCIE) {
>          bus = pci_root_bus_new(ds, dev_name, NULL, NULL, 0, TYPE_PXB_PCIE_BUS);
>      } else if (type == CXL) {
> @@ -466,6 +516,7 @@ static void pxb_register_types(void)
>      type_register_static(&pxb_pcie_bus_info);
>      type_register_static(&pxb_cxl_bus_info);
>      type_register_static(&pxb_host_info);
> +    type_register_static(&cxl_host_info);
>      type_register_static(&pxb_dev_info);
>      type_register_static(&pxb_pcie_dev_info);
>      type_register_static(&pxb_cxl_dev_info);
> diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
> index 362cda40de..6bc344f205 100644
> --- a/include/hw/cxl/cxl.h
> +++ b/include/hw/cxl/cxl.h
> @@ -17,5 +17,7 @@
>  #define COMPONENT_REG_BAR_IDX 0
>  #define DEVICE_REG_BAR_IDX 2
>  
> +#define CXL_HOST_BASE 0xD0000000
> +
>  #endif
>  




* Re: [RFC PATCH v3 17/31] hw/cxl/component: Implement host bridge MMIO (8.2.5, table 142)
  2021-02-02 19:21     ` Jonathan Cameron
  (?)
@ 2021-02-02 19:45     ` Ben Widawsky
  2021-02-02 20:43       ` Jonathan Cameron
  -1 siblings, 1 reply; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02 19:45 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: David Hildenbrand, Vishal Verma, John Groves (jgroves),
	Chris Browy, qemu-devel, linux-cxl, Markus Armbruster,
	Michael S. Tsirkin, Igor Mammedov, Dan Williams, Ira Weiny,
	Philippe Mathieu-Daudé

On 21-02-02 19:21:35, Jonathan Cameron wrote:
> On Mon, 1 Feb 2021 16:59:34 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > CXL host bridges themselves may have MMIO. Since host bridges don't have
> > a BAR they are treated as special for MMIO.
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > 
> > --
> > 
> > It's arbitrarily chosen here to pick 0xD0000000 as the base for the host
> > bridge MMIO. I'm not sure what the right way to find free space for
> > platform hardcoded things like this is.
> 
> Seems like this needs to come from the machine definition.
> This is fairly easy for arm/virt, where there is a clearly laid out memory map.
> For hw/i386 I'm less sure on how to do it.

I think this is how to do it :D

> 
> Having said that, for this particular magic device, we do have a PCI EP
> associated with it.  How about putting all the host bridge MMIO into a
> BAR of that rather than having it separate.   
> That has the added advantage of making it discoverable from firmware.
> 
> Any normal system is going to have this is impdef for discovery anyway.
> 

This is not how it's expected to work for Intel at least. If the device were
discoverable you wouldn't need CEDT/CHBS. The magic host bridges are only
advertised via the CEDT.

When I build and run QEMU for x86_64, I do not see the host bridge in the PCI
topology, do you (it's meant to not be there)?

00:00.0 Host bridge: Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
...
34:00.0 PCI bridge: Intel Corporation Device 7075
35:00.0 Memory controller [0502]: Intel Corporation Device 0d93 (rev 01)

That's Q35, Root Port, and Type 3 device respectively.
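
For context, a sketch of the CHBS entry that the CEDT would carry for such a
host bridge. This is standalone C, not the QEMU ACPI builder; build_chbs() and
put_le() are hypothetical helpers, and the field offsets are an assumption
based on my reading of the CXL 2.0 CHBS layout (a 32-byte record holding type,
record length, UID, CXL version, base and length), so check them against the
spec before relying on them:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CHBS_LEN 32 /* assumed record length for a CXL 2.0 CHBS entry */

/* Store the low `nbytes` bytes of `v` at `p` in little-endian order. */
static void put_le(uint8_t *p, uint64_t v, int nbytes)
{
    for (int i = 0; i < nbytes; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

/* Fill `out` with one CHBS record describing a host bridge register block. */
static void build_chbs(uint8_t out[CHBS_LEN], uint32_t uid,
                       uint64_t base, uint64_t length)
{
    assert(out);
    memset(out, 0, CHBS_LEN);   /* also clears the reserved fields */
    out[0] = 0;                 /* structure type: CHBS */
    put_le(out + 2, CHBS_LEN, 2); /* record length */
    put_le(out + 4, uid, 4);    /* must match the host bridge _UID */
    put_le(out + 8, 1, 4);      /* CXL version: 1 is assumed to mean CXL 2.0 */
    put_le(out + 16, base, 8);  /* component register block base */
    put_le(out + 24, length, 8); /* component register block length */
}
```

The point of the sketch is only that the OS learns the register block address
from this table rather than from a BAR, which is why the host bridge does not
need to show up in the PCI topology.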

> That would then let you drop the separate definition of CXLHost structure
> though it needs a bit of figuring out what to do with the memory window
> setup etc.
> 
> I tried hacking it together, but not gotten it working yet.
> 


* Re: [RFC PATCH v3 17/31] hw/cxl/component: Implement host bridge MMIO (8.2.5, table 142)
  2021-02-02 19:45     ` Ben Widawsky
@ 2021-02-02 20:43       ` Jonathan Cameron
  2021-02-02 21:03         ` Ben Widawsky
  0 siblings, 1 reply; 117+ messages in thread
From: Jonathan Cameron @ 2021-02-02 20:43 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: David Hildenbrand, Vishal Verma, John Groves (jgroves),
	Chris Browy, qemu-devel, linux-cxl, Markus Armbruster,
	Michael S. Tsirkin, Igor Mammedov, Dan Williams, Ira Weiny,
	Philippe Mathieu-Daudé

On Tue, 2 Feb 2021 11:45:05 -0800
Ben Widawsky <ben@bwidawsk.net> wrote:

> On 21-02-02 19:21:35, Jonathan Cameron wrote:
> > On Mon, 1 Feb 2021 16:59:34 -0800
> > Ben Widawsky <ben.widawsky@intel.com> wrote:
> >   
> > > CXL host bridges themselves may have MMIO. Since host bridges don't have
> > > a BAR they are treated as special for MMIO.
> > > 
> > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > 
> > > --
> > > 
> > > It's arbitrarily chosen here to pick 0xD0000000 as the base for the host
> > > bridge MMIO. I'm not sure what the right way to find free space for
> > > platform hardcoded things like this is.  
> > 
> > Seems like this needs to come from the machine definition.
> > This is fairly easy for arm/virt, where there is a clearly laid out memory map.
> > For hw/i386 I'm less sure on how to do it.  
> 
> I think this is how to do it :D

It may well be, but then you'll need to find a suitable region, document
it, and ensure no one else ever tramples on it.  Easy to do on a physical system,
a bit trickier in emulation.

> 
> > 
> > Having said that, for this particular magic device, we do have a PCI EP
> > associated with it.  How about putting all the host bridge MMIO into a
> > BAR of that rather than having it separate.   
> > That has the added advantage of making it discoverable from firmware.
> > 
> > Any normal system is going to have this is impdef for discovery anyway.
> >   
> 
> This is not how it's expected to work for Intel at least. If the device was
> discoverable you wouldn't need CEDT/CHBS. The magic host bridges are only
> advertised via the CEDT.

I agree that on a normal system (i.e. a real one) this doesn't work,
but then a normal system doesn't involve a magic PCI RCiEP that also happens
to instantiate an extra host bridge. This is what pxb_pcie is doing and
what your pxb_cxl is almost doing.

> 
> When I build and run QEMU for x86_64, I do not see the host bridge in the pci
> topology, do you (it's meant to not be there)?
> 
> 00:00.0 Host bridge: Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
> ...
> 34:00.0 PCI bridge: Intel Corporation Device 7075
> 35:00.0 Memory controller [0502]: Intel Corporation Device 0d93 (rev 01)
> 
> That's Q35, Root Port, and Type 3 device respectively.

You don't see the host bridge for pxb_cxl, but for pxb_pcie you do:
00:06.0 Host bridge: Red Hat, Inc QEMU PCIe Expander bridge.
If you have another device after your pxb-cxl you'll also notice that there
is a hole punched in the list where you'd expect pxb-cxl to be (device number
skipped).  (That had me confused earlier.)

This seems to be because it has no VID etc. (unlike pxb-pcie).

I gave it vendor and device IDs (and a BAR, to test that it could be done) and it
then appears just like pxb_pcie does.  Hence it is a handy place to hang our
magic memory off, so that EDK2 or similar can work with it and indeed
construct the CHBS as needed, so we can get to this via the same paths as
a normal system.  It's a bit convoluted but in some ways more elegant.

Jonathan

> 
> > That would then let you drop the separate definition of CXLHost structure
> > though it needs a bit of figuring out what to do with the memory window
> > setup etc.
> > 
> > I tried hacking it together, but not gotten it working yet.
> >   



* Re: [RFC PATCH v3 17/31] hw/cxl/component: Implement host bridge MMIO (8.2.5, table 142)
  2021-02-02 20:43       ` Jonathan Cameron
@ 2021-02-02 21:03         ` Ben Widawsky
  2021-02-02 22:06           ` Jonathan Cameron
  0 siblings, 1 reply; 117+ messages in thread
From: Ben Widawsky @ 2021-02-02 21:03 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: David Hildenbrand, Vishal Verma, John Groves (jgroves),
	Chris Browy, qemu-devel, linux-cxl, Markus Armbruster,
	Michael S. Tsirkin, Igor Mammedov, Dan Williams, Ira Weiny,
	Philippe Mathieu-Daudé

On 21-02-02 20:43:38, Jonathan Cameron wrote:
> On Tue, 2 Feb 2021 11:45:05 -0800
> Ben Widawsky <ben@bwidawsk.net> wrote:
> 
> > On 21-02-02 19:21:35, Jonathan Cameron wrote:
> > > On Mon, 1 Feb 2021 16:59:34 -0800
> > > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >   
> > > > CXL host bridges themselves may have MMIO. Since host bridges don't have
> > > > a BAR they are treated as special for MMIO.
> > > > 
> > > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > > 
> > > > --
> > > > 
> > > > It's arbitrarily chosen here to pick 0xD0000000 as the base for the host
> > > > bridge MMIO. I'm not sure what the right way to find free space for
> > > > platform hardcoded things like this is.  
> > > 
> > > Seems like this needs to come from the machine definition.
> > > This is fairly easy for arm/virt, where there is a clearly laid out memory map.
> > > For hw/i386 I'm less sure on how to do it.  
> > 
> > I think this is how to do it :D
> 
> It may well be, but then you'll need to find a suitable region and document
> it and ensure no one else ever tramples on it.  Easy to do on a physical system,
> bit trickier in emulation.
> 

Maybe? x86 is full of magic physical address holes. As long as it's conveyed to
EDK via _CRS, I think it's pretty safe. If something else tries to use the same
address, you should get a fairly obvious error.

Document somehow, yes please.
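
To make the "fairly obvious error" concrete, a minimal standalone sketch of a
reservation table that refuses overlapping claims. This is not how QEMU or
firmware actually tracks address space; RegionMap and region_claim() are
hypothetical names, and the 64 KiB block size used below is an assumption:

```c
#include <assert.h>
#include <stdint.h>

#define MAX_REGIONS 8

typedef struct Region {
    uint64_t base;
    uint64_t size;
    const char *owner;
} Region;

typedef struct RegionMap {
    Region r[MAX_REGIONS];
    int n;
} RegionMap;

/* Half-open interval overlap test; assumes base + size does not wrap. */
static int region_overlaps(const Region *a, uint64_t base, uint64_t size)
{
    return a->base < base + size && base < a->base + a->size;
}

/*
 * Claim [base, base + size) for `owner`.
 * Returns 0 on success, -1 if it collides with an existing claim,
 * -2 if the table is full.
 */
static int region_claim(RegionMap *m, uint64_t base, uint64_t size,
                        const char *owner)
{
    assert(m && size > 0);

    for (int i = 0; i < m->n; i++) {
        if (region_overlaps(&m->r[i], base, size)) {
            return -1;
        }
    }
    if (m->n == MAX_REGIONS) {
        return -2;
    }
    m->r[m->n++] = (Region){ base, size, owner };
    return 0;
}
```

Two host bridges claiming adjacent 64 KiB blocks from 0xD0000000 both succeed,
while anything else landing inside either block is rejected up front.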

> > 
> > > 
> > > Having said that, for this particular magic device, we do have a PCI EP
> > > associated with it.  How about putting all the host bridge MMIO into a
> > > BAR of that rather than having it separate.   
> > > That has the added advantage of making it discoverable from firmware.
> > > 
> > > Any normal system is going to have this is impdef for discovery anyway.
> > >   
> > 
> > This is not how it's expected to work for Intel at least. If the device was
> > discoverable you wouldn't need CEDT/CHBS. The magic host bridges are only
> > advertised via the CEDT.
> 
> I agree on a normal system (i.e. a real one) this doesn't work,
> but then a normal system doesn't involve a magic PCI RCiEP that also happens
> to instantiate an extra host bridge. This is what pxb_pcie is doing and
> what your pxb_cxl is almost doing.
> 
> > 
> > When I build and run QEMU for x86_64, I do not see the host bridge in the pci
> > topology, do you (it's meant to not be there)?
> > 
> > 00:00.0 Host bridge: Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
> > ...
> > 34:00.0 PCI bridge: Intel Corporation Device 7075
> > 35:00.0 Memory controller [0502]: Intel Corporation Device 0d93 (rev 01)
> > 
> > That's Q35, Root Port, and Type 3 device respectively.
> 
> You don't see the host bridge, for pxb_cxl, but for pxb_pcie you do.
> 00:06.0 Host bridge: Red Hat, Inc QEMU PCIe Expander bridge.
> If you have another device after your pxb-cxl you'll also notice that there
> is a hole punched in the list where you'd expect pxb-cxl to be (device number
> skipped).  (that had me confused earlier).
> 
> This seems to be because no VID etc (unlike pxb-pcie).
> 

Right. These were in an earlier version of the series and you made me realize
that if I got rid of them, it disappears. I really like that this more accurately
represents the hardware.

I agree it can be implemented more simply, but why do it if you can accurately
model it?

> I gave vendor and device IDs (and a bar to test that could be done) and it
> then appears just like pxb_pcie does.  Hence handy place to hang our
> magic memory off so that EDK2 or similar can work with it and indeed
> construct the CHBS as needed so we can get to this via the same paths as
> a normal system.  It's a bit convoluted but in some ways more elegant. 
> 

What are you looking to get out of EDK2 or similar? Anything you want to convey
should work with _CRS, I think. That was the path I was going down.

> Jonathan
> 
> > 
> > > That would then let you drop the separate definition of CXLHost structure
> > > though it needs a bit of figuring out what to do with the memory window
> > > setup etc.
> > > 
> > > I tried hacking it together, but not gotten it working yet.
> > >   


* Re: [RFC PATCH v3 17/31] hw/cxl/component: Implement host bridge MMIO (8.2.5, table 142)
  2021-02-02 21:03         ` Ben Widawsky
@ 2021-02-02 22:06           ` Jonathan Cameron
  0 siblings, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2021-02-02 22:06 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: David Hildenbrand, Vishal Verma, John Groves (jgroves),
	Chris Browy, qemu-devel, linux-cxl, Markus Armbruster,
	Michael S. Tsirkin, Igor Mammedov, Dan Williams, Ira Weiny,
	Philippe Mathieu-Daudé

On Tue, 2 Feb 2021 13:03:57 -0800
Ben Widawsky <ben@bwidawsk.net> wrote:

> On 21-02-02 20:43:38, Jonathan Cameron wrote:
> > On Tue, 2 Feb 2021 11:45:05 -0800
> > Ben Widawsky <ben@bwidawsk.net> wrote:
> >   
> > > On 21-02-02 19:21:35, Jonathan Cameron wrote:  
> > > > On Mon, 1 Feb 2021 16:59:34 -0800
> > > > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > >     
> > > > > CXL host bridges themselves may have MMIO. Since host bridges don't have
> > > > > a BAR they are treated as special for MMIO.
> > > > > 
> > > > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > > > 
> > > > > --
> > > > > 
> > > > > It's arbitrarily chosen here to pick 0xD0000000 as the base for the host
> > > > > bridge MMIO. I'm not sure what the right way to find free space for
> > > > > platform hardcoded things like this is.    
> > > > 
> > > > Seems like this needs to come from the machine definition.
> > > > This is fairly easy for arm/virt, where there is a clearly laid out memory map.
> > > > For hw/i386 I'm less sure on how to do it.    
> > > 
> > > I think this is how to do it :D  
> > 
> > It may well be, but then you'll need to find a suitable region and document
> > it and ensure no one else ever tramples on it.  Easy to do on a physical system,
> > bit trickier in emulation.
> >   
> 
> Maybe? x86 is full of magic physical address holes. As long as it's conveyed to
> EDK via _CRS, I think it's pretty safe. If something else tries to use the same
> address, you should get a fairly obvious error.
> 
> Document somehow, yes please.
> 
> > >   
> > > > 
> > > > Having said that, for this particular magic device, we do have a PCI EP
> > > > associated with it.  How about putting all the host bridge MMIO into a
> > > > BAR of that rather than having it separate.   
> > > > That has the added advantage of making it discoverable from firmware.
> > > > 
> > > > Any normal system is going to have this as impdef for discovery anyway.
> > > >     
> > > 
> > > This is not how it's expected to work for Intel at least. If the device was
> > > discoverable you wouldn't need CEDT/CHBS. The magic host bridges are only
> > > advertised via the CEDT.  
> > 
> > I agree on a normal system (i.e. a real one) this doesn't work,
> > but then a normal system doesn't involve a magic PCI RCiEP that also happens
> > to instantiate an extra host bridge. This is what pxb_pcie is doing and
> > what your pxb_cxl is almost doing.
> >   
> > > 
> > > When I build and run QEMU for x86_64, I do not see the host bridge in the pci
> > > topology, do you (it's meant to not be there)?
> > > 
> > > 00:00.0 Host bridge: Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
> > > ...
> > > 34:00.0 PCI bridge: Intel Corporation Device 7075
> > > 35:00.0 Memory controller [0502]: Intel Corporation Device 0d93 (rev 01)
> > > 
> > > That's Q35, Root Port, and Type 3 device respectively.  
> > 
> > You don't see the host bridge, for pxb_cxl, but for pxb_pcie you do.
> > 00:06.0 Host bridge: Red Hat, Inc QEMU PCIe Expander bridge.
> > If you have another device after your pxb-cxl you'll also notice that there
> > is a hole punched in the list where you'd expect pxb-cxl to be (device number
> > skipped).  (that had me confused earlier).
> > 
> > This seems to be because no VID etc (unlike pxb-pcie).
> >   
> 
> Right. This was in an earlier version of the series and you made me realize if I
> got rid of them that it disappears. I really like that this more accurately
> represents the hardware.
> 
> I agree it can be implemented more simply, but why do it if you can accurately
> model it?

I'd like to understand all the reasons pxb_pcie is deliberately exposed and
whether any of them affect CXL cases.  Right now I'm not entirely sure what
they are.

> 
> > I gave vendor and device IDs (and a bar to test that could be done) and it
> > then appears just like pxb_pcie does.  Hence handy place to hang our
> > magic memory off so that EDK2 or similar can work with it and indeed
> > construct the CHBS as needed so we can get to this via the same paths as
> > a normal system.  It's a bit convoluted but in some ways more elegant. 
> >   
> 
> What are you looking to get out of EDK2 or similar? Anything you want to convey
> should work with _CRS, I think. That was the path I was going down.

If you go down the fixed route that's fine. Just seemed a bit convenient we had
somewhere to hang this that didn't rely on arbitrary fixed regions.

If we went with the BAR we'd need to enumerate first to get bar memory location
and then fill in the table.  Bit messy so fixed does look like a better option.

Jonathan

> 
> > Jonathan
> >   
> > >   
> > > > That would then let you drop the separate definition of CXLHost structure
> > > > though it needs a bit of figuring out what to do with the memory window
> > > > setup etc.
> > > > 
> > > > I tried hacking it together, but have not gotten it working yet.
> > > >     
> > > > > ---
> > > > >  hw/pci-bridge/pci_expander_bridge.c | 53 ++++++++++++++++++++++++++++-
> > > > >  include/hw/cxl/cxl.h                |  2 ++
> > > > >  2 files changed, 54 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
> > > > > index 5021b60435..226a8a5fff 100644
> > > > > --- a/hw/pci-bridge/pci_expander_bridge.c
> > > > > +++ b/hw/pci-bridge/pci_expander_bridge.c
> > > > > @@ -17,6 +17,7 @@
> > > > >  #include "hw/pci/pci_host.h"
> > > > >  #include "hw/qdev-properties.h"
> > > > >  #include "hw/pci/pci_bridge.h"
> > > > > +#include "hw/cxl/cxl.h"
> > > > >  #include "qemu/range.h"
> > > > >  #include "qemu/error-report.h"
> > > > >  #include "qemu/module.h"
> > > > > @@ -70,6 +71,12 @@ struct PXBDev {
> > > > >      int32_t uid;
> > > > >  };
> > > > >  
> > > > > +typedef struct CXLHost {
> > > > > +    PCIHostState parent_obj;
> > > > > +
> > > > > +    CXLComponentState cxl_cstate;
> > > > > +} CXLHost;
> > > > > +
> > > > >  static PXBDev *convert_to_pxb(PCIDevice *dev)
> > > > >  {
> > > > >      /* A CXL PXB's parent bus is PCIe, so the normal check won't work */
> > > > > @@ -85,6 +92,9 @@ static GList *pxb_dev_list;
> > > > >  
> > > > >  #define TYPE_PXB_HOST "pxb-host"
> > > > >  
> > > > > +#define TYPE_PXB_CXL_HOST "pxb-cxl-host"
> > > > > +#define PXB_CXL_HOST(obj) OBJECT_CHECK(CXLHost, (obj), TYPE_PXB_CXL_HOST)
> > > > > +
> > > > >  static int pxb_bus_num(PCIBus *bus)
> > > > >  {
> > > > >      PXBDev *pxb = convert_to_pxb(bus->parent_dev);
> > > > > @@ -198,6 +208,46 @@ static const TypeInfo pxb_host_info = {
> > > > >      .class_init    = pxb_host_class_init,
> > > > >  };
> > > > >  
> > > > > +static void pxb_cxl_realize(DeviceState *dev, Error **errp)
> > > > > +{
> > > > > +    SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
> > > > > +    PCIHostState *phb = PCI_HOST_BRIDGE(dev);
> > > > > +    CXLHost *cxl = PXB_CXL_HOST(dev);
> > > > > +    CXLComponentState *cxl_cstate = &cxl->cxl_cstate;
> > > > > +    struct MemoryRegion *mr = &cxl_cstate->crb.component_registers;
> > > > > +
> > > > > +    cxl_component_register_block_init(OBJECT(dev), cxl_cstate,
> > > > > +                                      TYPE_PXB_CXL_HOST);
> > > > > +    sysbus_init_mmio(sbd, mr);
> > > > > +
> > > > > +    /* FIXME: support multiple host bridges. */
> > > > > +    sysbus_mmio_map(sbd, 0, CXL_HOST_BASE +
> > > > > +                            memory_region_size(mr) * pci_bus_uid(phb->bus));
> > > > > +}
> > > > > +
> > > > > +static void pxb_cxl_host_class_init(ObjectClass *class, void *data)
> > > > > +{
> > > > > +    DeviceClass *dc = DEVICE_CLASS(class);
> > > > > +    PCIHostBridgeClass *hc = PCI_HOST_BRIDGE_CLASS(class);
> > > > > +
> > > > > +    hc->root_bus_path = pxb_host_root_bus_path;
> > > > > +    dc->fw_name = "cxl";
> > > > > +    dc->realize = pxb_cxl_realize;
> > > > > +    /* Reason: Internal part of the pxb/pxb-pcie device, not usable by itself */
> > > > > +    dc->user_creatable = false;
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > + * This is a device to handle the MMIO for a CXL host bridge. It does nothing
> > > > > + * else.
> > > > > + */
> > > > > +static const TypeInfo cxl_host_info = {
> > > > > +    .name          = TYPE_PXB_CXL_HOST,
> > > > > +    .parent        = TYPE_PCI_HOST_BRIDGE,
> > > > > +    .instance_size = sizeof(CXLHost),
> > > > > +    .class_init    = pxb_cxl_host_class_init,
> > > > > +};
> > > > > +
> > > > >  /*
> > > > >   * Registers the PXB bus as a child of pci host root bus.
> > > > >   */
> > > > > @@ -272,7 +322,7 @@ static void pxb_dev_realize_common(PCIDevice *dev, enum BusType type,
> > > > >          dev_name = dev->qdev.id;
> > > > >      }
> > > > >  
> > > > > -    ds = qdev_new(TYPE_PXB_HOST);
> > > > > +    ds = qdev_new(type == CXL ? TYPE_PXB_CXL_HOST : TYPE_PXB_HOST);
> > > > >      if (type == PCIE) {
> > > > >          bus = pci_root_bus_new(ds, dev_name, NULL, NULL, 0, TYPE_PXB_PCIE_BUS);
> > > > >      } else if (type == CXL) {
> > > > > @@ -466,6 +516,7 @@ static void pxb_register_types(void)
> > > > >      type_register_static(&pxb_pcie_bus_info);
> > > > >      type_register_static(&pxb_cxl_bus_info);
> > > > >      type_register_static(&pxb_host_info);
> > > > > +    type_register_static(&cxl_host_info);
> > > > >      type_register_static(&pxb_dev_info);
> > > > >      type_register_static(&pxb_pcie_dev_info);
> > > > >      type_register_static(&pxb_cxl_dev_info);
> > > > > diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
> > > > > index 362cda40de..6bc344f205 100644
> > > > > --- a/include/hw/cxl/cxl.h
> > > > > +++ b/include/hw/cxl/cxl.h
> > > > > @@ -17,5 +17,7 @@
> > > > >  #define COMPONENT_REG_BAR_IDX 0
> > > > >  #define DEVICE_REG_BAR_IDX 2
> > > > >  
> > > > > +#define CXL_HOST_BASE 0xD0000000
> > > > > +
> > > > >  #endif
> > > > >      
> > > > 
> > > >     
> > >   
> >   
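
For reference, the fixed-region placement discussed above (CXL_HOST_BASE plus
the register block size times the bridge UID) can be sketched outside QEMU as
plain arithmetic. This is only an illustration: HB_MMIO_SIZE, CXL_HOST_LIMIT
and the helper names are hypothetical and do not come from the patch; only the
0xD0000000 base does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Base taken from the patch under discussion. */
#define CXL_HOST_BASE 0xD0000000ULL

/* Hypothetical: size of one host bridge's component register block. */
#define HB_MMIO_SIZE 0x10000ULL

/* Hypothetical: exclusive end of the hole reserved for host bridge MMIO. */
#define CXL_HOST_LIMIT 0xD0100000ULL

/* Address of the block for bridge `uid`, mirroring base + size * uid. */
static uint64_t hb_mmio_base(uint32_t uid)
{
    return CXL_HOST_BASE + HB_MMIO_SIZE * uid;
}

/* True if the block for `uid` lies entirely inside the reserved hole. */
static bool hb_mmio_fits(uint32_t uid)
{
    return hb_mmio_base(uid) + HB_MMIO_SIZE <= CXL_HOST_LIMIT;
}

/* Same as hb_mmio_base(), but fails loudly when the hole is exhausted. */
static uint64_t hb_mmio_base_checked(uint32_t uid)
{
    assert(hb_mmio_fits(uid));
    return hb_mmio_base(uid);
}
```

With these assumed sizes the hole holds sixteen bridges; whatever range is
chosen would still need to be documented and reported to firmware (for
example via _CRS), as noted in the thread.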


^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [RFC PATCH v3 00/31] CXL 2.0 Support
  2021-02-02  0:59 ` Ben Widawsky
                   ` (32 preceding siblings ...)
  (?)
@ 2021-02-03 17:42 ` Ben Widawsky
  2021-02-11 18:51     ` Jonathan Cameron
  -1 siblings, 1 reply; 117+ messages in thread
From: Ben Widawsky @ 2021-02-03 17:42 UTC (permalink / raw)
  To: qemu-devel
  Cc: David Hildenbrand, Vishal Verma, John Groves (jgroves),
	Chris Browy, Markus Armbruster, linux-cxl,
	Philippe Mathieu-Daudé,
	Michael S. Tsirkin, Jonathan Cameron, Igor Mammedov,
	Dan Williams, Ira Weiny

I've started a barebones project plan:
https://gitlab.com/bwidawsk/qemu/-/snippets/2070304

Jonathan, if you have a moment, perhaps you can send a MR summarizing CDAT/DOE
work from you and Chris?

If folks feel priorities are drastically off, we can discuss it in the snippet
comments.

As for wider acceptance, if I'm looking at this from the QEMU community
perspective, better test cases are really needed. If your fingers are itching
for some typing, might I suggest starting with that.

I've opted not to use issue tracker for this because I am hopeful this won't be
a long living gitlab project.

On 21-02-01 16:59:17, Ben Widawsky wrote:
> Major changes since v2 [1]:
>  * Removed all register endian/alignment/size checking. Using core functionality
>    instead. This is untested on big endian hosts, but Should Work(tm).
>  * Fix component capability header generation (off by 1).
>  * Fixed HDM programming (multiple issues).
>  * Fixed timestamp command implementations.
>  * Added commands: GET_FIRMWARE_UPDATE_INFO, GET_PARTITION_INFO, GET_LSA, SET_LSA
> 
> Things have remained fairly stable since v2. The biggest change here is
> definitely the HDM programming which has received limited (but not 0) testing in
> the Linux driver.
> 
> Jonathan Cameron has gotten this patch series working on ARM [2], and added some
> much sought after functionality [3].
> 
> ---
> 
> I've started #cxl on OFTC IRC for discussion. Please feel free to use that
> channel for questions or suggestions in addition to #qemu.
> 
> ---
> 
> Introduce emulation of Compute Express Link 2.0
> (https://www.computeexpresslink.org/). Specifically, add support for Type 3
> memory expanders with persistent memory.
> 
> The emulation has been critical to get the Linux enabling started [4], it would
> be an ideal place to land regression tests for different topology handling, and
> there may be applications for this emulation as a way for a guest to manipulate
> its address space relative to different performance memories.
> 
> Three of the five CXL component types are emulated with some level of
> functionality: host bridge, root port, and memory device. All components and
> devices implement basic MMIO. Devices/memory devices implement the mailbox
> interface. Basic ACPI support is also included. Upstream ports and downstream
> ports aren't implemented (the two components needed to make up a switch).
> 
> CXL 2.0 is built on top of PCIe (see spec for details). As a result, much of the
> implementation utilizes existing PCI paradigms. To implement the host bridge,
> I've chosen to use PXB (PCI Expander Bridge). It seemed to be the most natural
> fit even though it doesn't directly map to how hardware will work. For
> persistent capacity of the memory device, I utilized the memory subsystem
> (hw/mem).
> 
> We have 3 reasons why this work is valuable:
> 1. Linux driver feature development benefits from emulation both due to a lack
>    of initial hardware availability, but also, as is seen with NVDIMM/PMEM
>    emulation, there is value in being able to share topologies with
>    system-software developers even after hardware is available.
> 
> 2. The Linux kernel's unit test suite for NVDIMM/PMEM ended up injecting fake
>    resources via custom modules (nfit_test). In retrospect a QEMU emulation of
>    nfit_test capabilities would have made the test environment more portable,
>    and allowed for easier community contributions of example configurations.
> 
> 3. This is still being fleshed out, but in short it provides a standardized
>    mechanism for the guest to provide feedback to the host about size and
>    placement needs of the memory. After the host gives the guest a physical
>    window mapping to the CXL device, the emulated HDM decoders allow the guest a
>    way to tell the host how much it wants and where. There are likely simpler
>    ways to do this, but they'd require inventing a new interface and you'd need
>    to have diverging driver code in the guest programming of the HDM decoder vs.
>    the host. Since we've already done this work, why not use it?
> 
> There is quite a long list of work to do for full spec compliance, but I don't
> believe that any of it precludes merging. Off the top of my head:
> - Main host bridge support (WIP)
> - Interleaving
> - Better Tests
> - Hot plug support
> - Emulating volatile capacity
> - CDAT emulation [3]
> 
> The flow of the patches in general is to define all the data structures and
> registers associated with the various components in a top down manner. Host
> bridge, component, ports, devices. Then, the actual implementation is done in
> the same order.
> 
> The summary is:
> 1-5: Infrastructure for component and device emulation
> 6-9: Basic mailbox command implementations
> 10-19: Implement CXL host bridges as PXB devices
> 20: Implement a root port
> 21-22: Implement a memory device
> 23-26: ACPI bits
> 27-29: Add some more advanced mailbox command implementations
> 30: Start working on enabling the main host bridge
> 31: Basic test case
> 
> ---
> 
> [1]: https://lore.kernel.org/qemu-devel/20210105165323.783725-1-ben.widawsky@intel.com/
> [2]: https://lore.kernel.org/qemu-devel/20210201152655.31027-1-Jonathan.Cameron@huawei.com/
> [3]: https://lore.kernel.org/qemu-devel/20210201151629.29656-1-Jonathan.Cameron@huawei.com/
> [4]: https://lore.kernel.org/linux-cxl/20210130002438.1872527-1-ben.widawsky@intel.com/
> 
> ---
> 
> Ben Widawsky (31):
>   hw/pci/cxl: Add a CXL component type (interface)
>   hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5)
>   hw/cxl/device: Introduce a CXL device (8.2.8)
>   hw/cxl/device: Implement the CAP array (8.2.8.1-2)
>   hw/cxl/device: Implement basic mailbox (8.2.8.4)
>   hw/cxl/device: Add memory device utilities
>   hw/cxl/device: Add cheap EVENTS implementation (8.2.9.1)
>   hw/cxl/device: Timestamp implementation (8.2.9.3)
>   hw/cxl/device: Add log commands (8.2.9.4) + CEL
>   hw/pxb: Use a type for realizing expanders
>   hw/pci/cxl: Create a CXL bus type
>   hw/pxb: Allow creation of a CXL PXB (host bridge)
>   qtest: allow DSDT acpi table changes
>   acpi/pci: Consolidate host bridge setup
>   tests/acpi: remove stale allowed tables
>   hw/pci: Plumb _UID through host bridges
>   hw/cxl/component: Implement host bridge MMIO (8.2.5, table 142)
>   acpi/pxb/cxl: Reserve host bridge MMIO
>   hw/pxb/cxl: Add "windows" for host bridges
>   hw/cxl/rp: Add a root port
>   hw/cxl/device: Add a memory device (8.2.8.5)
>   hw/cxl/device: Implement MMIO HDM decoding (8.2.5.12)
>   acpi/cxl: Add _OSC implementation (9.14.2)
>   tests/acpi: allow CEDT table addition
>   acpi/cxl: Create the CEDT (9.14.1)
>   tests/acpi: Add new CEDT files
>   hw/cxl/device: Add some trivial commands
>   hw/cxl/device: Plumb real LSA sizing
>   hw/cxl/device: Implement get/set LSA
>   qtest/cxl: Add very basic sanity tests
>   WIP: i386/cxl: Initialize a host bridge
> 
>  MAINTAINERS                         |   6 +
>  hw/Kconfig                          |   1 +
>  hw/acpi/Kconfig                     |   5 +
>  hw/acpi/cxl.c                       | 173 ++++++++++
>  hw/acpi/meson.build                 |   1 +
>  hw/arm/virt.c                       |   1 +
>  hw/core/machine.c                   |  26 ++
>  hw/core/numa.c                      |   3 +
>  hw/cxl/Kconfig                      |   3 +
>  hw/cxl/cxl-component-utils.c        | 208 ++++++++++++
>  hw/cxl/cxl-device-utils.c           | 264 +++++++++++++++
>  hw/cxl/cxl-mailbox-utils.c          | 498 ++++++++++++++++++++++++++++
>  hw/cxl/meson.build                  |   5 +
>  hw/i386/acpi-build.c                |  87 ++++-
>  hw/i386/microvm.c                   |   1 +
>  hw/i386/pc.c                        |   2 +
>  hw/mem/Kconfig                      |   5 +
>  hw/mem/cxl_type3.c                  | 405 ++++++++++++++++++++++
>  hw/mem/meson.build                  |   1 +
>  hw/meson.build                      |   1 +
>  hw/pci-bridge/Kconfig               |   5 +
>  hw/pci-bridge/cxl_root_port.c       | 231 +++++++++++++
>  hw/pci-bridge/meson.build           |   1 +
>  hw/pci-bridge/pci_expander_bridge.c | 209 +++++++++++-
>  hw/pci-bridge/pcie_root_port.c      |   6 +-
>  hw/pci/pci.c                        |  32 +-
>  hw/pci/pcie.c                       |  30 ++
>  hw/ppc/spapr.c                      |   2 +
>  include/hw/acpi/cxl.h               |  27 ++
>  include/hw/boards.h                 |   2 +
>  include/hw/cxl/cxl.h                |  29 ++
>  include/hw/cxl/cxl_component.h      | 187 +++++++++++
>  include/hw/cxl/cxl_device.h         | 255 ++++++++++++++
>  include/hw/cxl/cxl_pci.h            | 160 +++++++++
>  include/hw/pci/pci.h                |  15 +
>  include/hw/pci/pci_bridge.h         |  25 ++
>  include/hw/pci/pci_bus.h            |   8 +
>  include/hw/pci/pci_ids.h            |   1 +
>  monitor/hmp-cmds.c                  |  15 +
>  qapi/machine.json                   |   1 +
>  tests/data/acpi/pc/CEDT             | Bin 0 -> 36 bytes
>  tests/data/acpi/pc/DSDT             | Bin 5065 -> 5065 bytes
>  tests/data/acpi/pc/DSDT.acpihmat    | Bin 6390 -> 6390 bytes
>  tests/data/acpi/pc/DSDT.bridge      | Bin 6924 -> 6924 bytes
>  tests/data/acpi/pc/DSDT.cphp        | Bin 5529 -> 5529 bytes
>  tests/data/acpi/pc/DSDT.dimmpxm     | Bin 6719 -> 6719 bytes
>  tests/data/acpi/pc/DSDT.hpbridge    | Bin 5026 -> 5026 bytes
>  tests/data/acpi/pc/DSDT.hpbrroot    | Bin 3084 -> 3084 bytes
>  tests/data/acpi/pc/DSDT.ipmikcs     | Bin 5137 -> 5137 bytes
>  tests/data/acpi/pc/DSDT.memhp       | Bin 6424 -> 6424 bytes
>  tests/data/acpi/pc/DSDT.numamem     | Bin 5071 -> 5071 bytes
>  tests/data/acpi/pc/DSDT.roothp      | Bin 5261 -> 5261 bytes
>  tests/data/acpi/q35/CEDT            | Bin 0 -> 36 bytes
>  tests/data/acpi/q35/DSDT            | Bin 7801 -> 7801 bytes
>  tests/data/acpi/q35/DSDT.acpihmat   | Bin 9126 -> 9126 bytes
>  tests/data/acpi/q35/DSDT.bridge     | Bin 7819 -> 7819 bytes
>  tests/data/acpi/q35/DSDT.cphp       | Bin 8265 -> 8265 bytes
>  tests/data/acpi/q35/DSDT.dimmpxm    | Bin 9455 -> 9455 bytes
>  tests/data/acpi/q35/DSDT.ipmibt     | Bin 7876 -> 7876 bytes
>  tests/data/acpi/q35/DSDT.memhp      | Bin 9160 -> 9160 bytes
>  tests/data/acpi/q35/DSDT.mmio64     | Bin 8932 -> 8932 bytes
>  tests/data/acpi/q35/DSDT.numamem    | Bin 7807 -> 7807 bytes
>  tests/qtest/cxl-test.c              |  93 ++++++
>  tests/qtest/meson.build             |   4 +
>  64 files changed, 3004 insertions(+), 30 deletions(-)
>  create mode 100644 hw/acpi/cxl.c
>  create mode 100644 hw/cxl/Kconfig
>  create mode 100644 hw/cxl/cxl-component-utils.c
>  create mode 100644 hw/cxl/cxl-device-utils.c
>  create mode 100644 hw/cxl/cxl-mailbox-utils.c
>  create mode 100644 hw/cxl/meson.build
>  create mode 100644 hw/mem/cxl_type3.c
>  create mode 100644 hw/pci-bridge/cxl_root_port.c
>  create mode 100644 include/hw/acpi/cxl.h
>  create mode 100644 include/hw/cxl/cxl.h
>  create mode 100644 include/hw/cxl/cxl_component.h
>  create mode 100644 include/hw/cxl/cxl_device.h
>  create mode 100644 include/hw/cxl/cxl_pci.h
>  create mode 100644 tests/data/acpi/pc/CEDT
>  create mode 100644 tests/data/acpi/q35/CEDT
>  create mode 100644 tests/qtest/cxl-test.c
> 
> -- 
> 2.30.0
> 
> 

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [RFC PATCH v3 02/31] hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5)
  2021-02-02  0:59   ` Ben Widawsky
@ 2021-02-11 17:08     ` Jonathan Cameron
  -1 siblings, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2021-02-11 17:08 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: qemu-devel, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Marcel Apfelbaum,
	Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

On Mon, 1 Feb 2021 16:59:19 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> A CXL 2.0 component is any entity in the CXL topology. All components
> have an analogous function in PCIe. Except for the CXL host bridge, all
> have a PCIe config space that is accessible via the common PCIe
> mechanisms. CXL components are enumerated via DVSEC fields in the
> extended PCIe header space. CXL components will minimally implement some
> subset of CXL.mem and CXL.cache registers defined in 8.2.5 of the CXL
> 2.0 specification. Two headers and a utility library are introduced to
> support the minimum functionality needed to enumerate components.
> 
> The cxl_pci header manages bits associated with PCI, specifically the
> DVSEC and related fields. The cxl_component.h variant has data
> structures and APIs that are useful for drivers implementing any of the
> CXL 2.0 components. The library takes care of making use of the DVSEC
> bits and the CXL.[mem|cache] registers. Per spec, the registers are
> little endian.
> 
> None of the mechanisms required to enumerate a CXL capable hostbridge
> are introduced at this point.
> 
> Note that the CXL.mem and CXL.cache registers used are always 4B wide.
> It's possible in the future that this constraint will not hold.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
A few additions to previous comments.

> ---
>  MAINTAINERS                    |   6 +
>  hw/Kconfig                     |   1 +
>  hw/cxl/Kconfig                 |   3 +
>  hw/cxl/cxl-component-utils.c   | 208 +++++++++++++++++++++++++++++++++
>  hw/cxl/meson.build             |   3 +
>  hw/meson.build                 |   1 +
>  include/hw/cxl/cxl.h           |  17 +++
>  include/hw/cxl/cxl_component.h | 187 +++++++++++++++++++++++++++++
>  include/hw/cxl/cxl_pci.h       | 138 ++++++++++++++++++++++
>  9 files changed, 564 insertions(+)
>  create mode 100644 hw/cxl/Kconfig
>  create mode 100644 hw/cxl/cxl-component-utils.c
>  create mode 100644 hw/cxl/meson.build
>  create mode 100644 include/hw/cxl/cxl.h
>  create mode 100644 include/hw/cxl/cxl_component.h
>  create mode 100644 include/hw/cxl/cxl_pci.h
> 


> diff --git a/hw/cxl/cxl-component-utils.c b/hw/cxl/cxl-component-utils.c
> new file mode 100644
> index 0000000000..8d56ad5c7d
> --- /dev/null
> +++ b/hw/cxl/cxl-component-utils.c
> @@ -0,0 +1,208 @@
> +/*
> + * CXL Utility library for components
> + *
> + * Copyright(C) 2020 Intel Corporation.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See the
> + * COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu/log.h"
> +#include "hw/pci/pci.h"
> +#include "hw/cxl/cxl.h"
> +
> +static uint64_t cxl_cache_mem_read_reg(void *opaque, hwaddr offset,
> +                                       unsigned size)
> +{
> +    CXLComponentState *cxl_cstate = opaque;
> +    ComponentRegisters *cregs = &cxl_cstate->crb;
> +
> +    assert(size == 4);
> +
> +    if (cregs->special_ops && cregs->special_ops->read) {
> +        return cregs->special_ops->read(cxl_cstate, offset, size);
> +    } else {
> +        return cregs->cache_mem_registers[offset / 4];
> +    }
> +}
> +
> +static void cxl_cache_mem_write_reg(void *opaque, hwaddr offset, uint64_t value,
> +                                    unsigned size)
> +{
> +    CXLComponentState *cxl_cstate = opaque;
> +    ComponentRegisters *cregs = &cxl_cstate->crb;
> +
> +    assert(size == 4);
> +
> +    if (cregs->special_ops && cregs->special_ops->write) {
> +        cregs->special_ops->write(cxl_cstate, offset, value, size);
> +    } else {
> +        cregs->cache_mem_registers[offset / 4] = value;
> +    }
> +}
> +
> +/*
> + * 8.2.3
> + *   The access restrictions specified in Section 8.2.2 also apply to CXL 2.0
> + *   Component Registers.
> + *
> + * 8.2.2
> + *   • A 32 bit register shall be accessed as a 4 Bytes quantity. Partial
> + *   reads are not permitted.
> + *   • A 64 bit register shall be accessed as a 8 Bytes quantity. Partial
> + *   reads are not permitted.
> + *
> + * As of the spec defined today, only 4 byte registers exist.

The exciting exception to this is the RAS header log which is
defined as 512 bits.  Will seek clarification but I think the spec should
probably say that is a set of 32 bit registers.

A bunch of the other elements that we probably want to block in plausible
values for also seem to use 64 bit registers.

> + */
> +static const MemoryRegionOps cache_mem_ops = {
> +    .read = cxl_cache_mem_read_reg,
> +    .write = cxl_cache_mem_write_reg,
> +    .endianness = DEVICE_LITTLE_ENDIAN,
> +    .valid = {
> +        .min_access_size = 4,
> +        .max_access_size = 4,
> +        .unaligned = false,
> +    },
> +    .impl = {
> +        .min_access_size = 4,
> +        .max_access_size = 4,
> +    },
> +};
> +

..
> +
> +void cxl_component_register_init_common(uint32_t *reg_state, enum reg_type type)
> +{
> +    int caps = 0;
> +    switch (type) {
> +    case CXL2_DOWNSTREAM_PORT:
> +    case CXL2_DEVICE:
> +        /* CAP, RAS, Link */
> +        caps = 2;
> +        break;
> +    case CXL2_UPSTREAM_PORT:
> +    case CXL2_TYPE3_DEVICE:
> +    case CXL2_LOGICAL_DEVICE:
> +        /* + HDM */
> +        caps = 3;
> +        break;
> +    case CXL2_ROOT_PORT:
> +        /* + Extended Security, + Snoop */
> +        caps = 5;
> +        break;
> +    default:
> +        abort();
> +    }
> +
> +    memset(reg_state, 0, 0x1000);
> +
> +    /* CXL Capability Header Register */
> +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ID, 1);
> +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, VERSION, 1);
> +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, CACHE_MEM_VERSION, 1);
> +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ARRAY_SIZE, caps);
> +
> +
> +#define init_cap_reg(reg, id, version)                                        \
> +    _Static_assert(CXL_##reg##_REGISTERS_OFFSET != 0, "Invalid cap offset\n");\
> +    do {                                                                      \
> +        int which = R_CXL_##reg##_CAPABILITY_HEADER;                          \
> +        reg_state[which] = FIELD_DP32(reg_state[which],                       \
> +                                      CXL_##reg##_CAPABILITY_HEADER, ID, id); \
> +        reg_state[which] =                                                    \
> +            FIELD_DP32(reg_state[which], CXL_##reg##_CAPABILITY_HEADER,       \
> +                       VERSION, version);                                     \
> +        reg_state[which] =                                                    \
> +            FIELD_DP32(reg_state[which], CXL_##reg##_CAPABILITY_HEADER, PTR,  \
> +                       CXL_##reg##_REGISTERS_OFFSET);                         \
> +    } while (0)

Seems like this would be cleaner using ARRAY_FIELD_DP32 as you did for the header.

    #define init_cap_reg(reg, id, version)                                        \
        _Static_assert(CXL_##reg##_REGISTERS_OFFSET != 0, "Invalid cap offset\n");\
        do {                                                                    \
            ARRAY_FIELD_DP32(reg_state, CXL_##reg##_CAPABILITY_HEADER, ID, id); \
            ARRAY_FIELD_DP32(reg_state, CXL_##reg##_CAPABILITY_HEADER,          \
                             VERSION, version);                                 \
            ARRAY_FIELD_DP32(reg_state, CXL_##reg##_CAPABILITY_HEADER,          \
                             PTR, CXL_##reg##_REGISTERS_OFFSET);                \
        } while (0)
I think that gives the same result.
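
A minimal standalone sketch of why the two forms should be equivalent,
assuming an array-style deposit is just "read the indexed word, deposit the
field, write it back". The macros below are hypothetical stand-ins, not the
ones from QEMU's registerfields.h, and the field layout (ID in bits 15:0,
VERSION in 19:16, PTR in 31:20) is assumed for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field layout for a capability header word. */
#define HDR_ID_SHIFT       0
#define HDR_ID_LEN         16
#define HDR_VERSION_SHIFT  16
#define HDR_VERSION_LEN    4
#define HDR_PTR_SHIFT      20
#define HDR_PTR_LEN        12

/* Replace `len` bits of `word` starting at `shift` with `val`. */
static uint32_t deposit_bits(uint32_t word, unsigned shift, unsigned len,
                             uint32_t val)
{
    uint32_t mask = (len == 32 ? UINT32_MAX : ((1u << len) - 1)) << shift;

    return (word & ~mask) | ((val << shift) & mask);
}

/* Value-returning form: the caller stores the result. */
#define FIELD_DEPOSIT(word, field, val) \
    deposit_bits((word), field##_SHIFT, field##_LEN, (val))

/* Array form: updates regs[idx] in place. */
#define ARRAY_FIELD_DEPOSIT(regs, idx, field, val) \
    ((regs)[idx] = FIELD_DEPOSIT((regs)[idx], field, (val)))

/* Style of the original macro: explicit index and repeated assignment. */
static uint32_t header_via_temp(uint32_t id, uint32_t version, uint32_t ptr)
{
    uint32_t regs[4] = { 0 };
    int which = 2;

    regs[which] = FIELD_DEPOSIT(regs[which], HDR_ID, id);
    regs[which] = FIELD_DEPOSIT(regs[which], HDR_VERSION, version);
    regs[which] = FIELD_DEPOSIT(regs[which], HDR_PTR, ptr);
    return regs[which];
}

/* Style of the suggested macro: the array form hides the index handling. */
static uint32_t header_via_array(uint32_t id, uint32_t version, uint32_t ptr)
{
    uint32_t regs[4] = { 0 };

    ARRAY_FIELD_DEPOSIT(regs, 2, HDR_ID, id);
    ARRAY_FIELD_DEPOSIT(regs, 2, HDR_VERSION, version);
    ARRAY_FIELD_DEPOSIT(regs, 2, HDR_PTR, ptr);
    return regs[2];
}
```

Both helpers build the same word for any inputs that fit their fields, which
is the property the suggestion relies on.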

> +
> +    init_cap_reg(RAS, 2, 1);
> +    ras_init_common(reg_state);
> +
> +    init_cap_reg(LINK, 4, 2);

Feels like we'll want to block some values for the rest of these to at least
ensure whatever is read isn't crazy.

> +
> +    if (caps < 3) {
> +        return;
> +    }
> +
> +    init_cap_reg(HDM, 5, 1);
> +    hdm_init_common(reg_state);
> +
> +    if (caps < 5) {
> +        return;
> +    }
> +
> +    init_cap_reg(EXTSEC, 6, 1);
> +    init_cap_reg(SNOOP, 8, 1);
> +
> +#undef init_cap_reg
> +}
> +
> +/*
> + * Helper to create a DVSEC header for a CXL entity. The caller is responsible
> + * for tracking the valid offset.
> + *
> + * This function will build the DVSEC header on behalf of the caller and then
> + * copy in the remaining data for the vendor specific bits.
> + */
> +void cxl_component_create_dvsec(CXLComponentState *cxl, uint16_t length,
> +                                uint16_t type, uint8_t rev, uint8_t *body)
> +{
> +    PCIDevice *pdev = cxl->pdev;
> +    uint16_t offset = cxl->dvsec_offset;
> +
> +    assert(offset >= PCI_CFG_SPACE_SIZE &&
> +           ((offset + length) < PCI_CFG_SPACE_EXP_SIZE));
> +    assert((length & 0xf000) == 0);
> +    assert((rev & ~0xf) == 0);
> +
> +    /* Create the DVSEC in the MCFG space */
> +    pcie_add_capability(pdev, PCI_EXT_CAP_ID_DVSEC, 1, offset, length);
> +    pci_set_long(pdev->config + offset + PCIE_DVSEC_HEADER1_OFFSET,
> +                 (length << 20) | (rev << 16) | CXL_VENDOR_ID);
> +    pci_set_word(pdev->config + offset + PCIE_DVSEC_ID_OFFSET, type);
> +    memcpy(pdev->config + offset + sizeof(struct dvsec_header),
> +           body + sizeof(struct dvsec_header),
> +           length - sizeof(struct dvsec_header));
> +
> +    /* Update state for future DVSEC additions */
> +    range_init_nofail(&cxl->dvsecs[type], cxl->dvsec_offset, length);
> +    cxl->dvsec_offset += length;
> +}
...
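
As a side note on the DVSEC header written by cxl_component_create_dvsec(),
here is a small standalone sketch of the packing it uses: vendor ID in bits
15:0, revision in 19:16, length in 31:20. The helper names are hypothetical
and the vendor ID value is assumed here rather than taken from the patch. The
explicit uint32_t casts are an addition in this sketch: a uint16_t length is
promoted to int before shifting, so lengths of 0x800 and above would
otherwise shift into the sign bit.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value of the CXL vendor ID; the patch only names the macro. */
#define CXL_VENDOR_ID 0x1e98

/* Pack DVSEC header 1: length in 31:20, revision in 19:16, vendor in 15:0. */
static uint32_t dvsec_header1(uint16_t length, uint8_t rev, uint16_t vendor)
{
    /* Same range checks as the asserts in the patch. */
    assert((length & 0xf000) == 0);
    assert((rev & ~0xf) == 0);

    return ((uint32_t)length << 20) | ((uint32_t)rev << 16) | vendor;
}

/* Accessors to read the three fields back out of the packed word. */
static uint16_t dvsec_header1_length(uint32_t hdr)
{
    return hdr >> 20;
}

static uint8_t dvsec_header1_rev(uint32_t hdr)
{
    return (hdr >> 16) & 0xf;
}

static uint16_t dvsec_header1_vendor(uint32_t hdr)
{
    return hdr & 0xffff;
}
```

Packing a 0x38-byte, revision 1 DVSEC with this vendor ID yields 0x03811e98,
and the accessors recover the three inputs.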


^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [RFC PATCH v3 02/31] hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5)
@ 2021-02-11 17:08     ` Jonathan Cameron
  0 siblings, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2021-02-11 17:08 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: David Hildenbrand, Vishal Verma, John Groves (jgroves),
	Chris Browy, qemu-devel, linux-cxl, Markus Armbruster,
	Michael S. Tsirkin, Igor Mammedov, Dan Williams, Ira Weiny,
	Philippe Mathieu-Daudé

On Mon, 1 Feb 2021 16:59:19 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> A CXL 2.0 component is any entity in the CXL topology. All components
> have a analogous function in PCIe. Except for the CXL host bridge, all
> have a PCIe config space that is accessible via the common PCIe
> mechanisms. CXL components are enumerated via DVSEC fields in the
> extended PCIe header space. CXL components will minimally implement some
> subset of CXL.mem and CXL.cache registers defined in 8.2.5 of the CXL
> 2.0 specification. Two headers and a utility library are introduced to
> support the minimum functionality needed to enumerate components.
> 
> The cxl_pci header manages bits associated with PCI, specifically the
> DVSEC and related fields. The cxl_component.h variant has data
> structures and APIs that are useful for drivers implementing any of the
> CXL 2.0 components. The library takes care of making use of the DVSEC
> bits and the CXL.[mem|cache] registers. Per spec, the registers are
> little endian.
> 
> None of the mechanisms required to enumerate a CXL capable hostbridge
> are introduced at this point.
> 
> Note that the CXL.mem and CXL.cache registers used are always 4B wide.
> It's possible in the future that this constraint will not hold.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
A few additions to previous comments.

> ---
>  MAINTAINERS                    |   6 +
>  hw/Kconfig                     |   1 +
>  hw/cxl/Kconfig                 |   3 +
>  hw/cxl/cxl-component-utils.c   | 208 +++++++++++++++++++++++++++++++++
>  hw/cxl/meson.build             |   3 +
>  hw/meson.build                 |   1 +
>  include/hw/cxl/cxl.h           |  17 +++
>  include/hw/cxl/cxl_component.h | 187 +++++++++++++++++++++++++++++
>  include/hw/cxl/cxl_pci.h       | 138 ++++++++++++++++++++++
>  9 files changed, 564 insertions(+)
>  create mode 100644 hw/cxl/Kconfig
>  create mode 100644 hw/cxl/cxl-component-utils.c
>  create mode 100644 hw/cxl/meson.build
>  create mode 100644 include/hw/cxl/cxl.h
>  create mode 100644 include/hw/cxl/cxl_component.h
>  create mode 100644 include/hw/cxl/cxl_pci.h
> 


> diff --git a/hw/cxl/cxl-component-utils.c b/hw/cxl/cxl-component-utils.c
> new file mode 100644
> index 0000000000..8d56ad5c7d
> --- /dev/null
> +++ b/hw/cxl/cxl-component-utils.c
> @@ -0,0 +1,208 @@
> +/*
> + * CXL Utility library for components
> + *
> + * Copyright(C) 2020 Intel Corporation.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See the
> + * COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu/log.h"
> +#include "hw/pci/pci.h"
> +#include "hw/cxl/cxl.h"
> +
> +static uint64_t cxl_cache_mem_read_reg(void *opaque, hwaddr offset,
> +                                       unsigned size)
> +{
> +    CXLComponentState *cxl_cstate = opaque;
> +    ComponentRegisters *cregs = &cxl_cstate->crb;
> +
> +    assert(size == 4);
> +
> +    if (cregs->special_ops && cregs->special_ops->read) {
> +        return cregs->special_ops->read(cxl_cstate, offset, size);
> +    } else {
> +        return cregs->cache_mem_registers[offset / 4];
> +    }
> +}
> +
> +static void cxl_cache_mem_write_reg(void *opaque, hwaddr offset, uint64_t value,
> +                                    unsigned size)
> +{
> +    CXLComponentState *cxl_cstate = opaque;
> +    ComponentRegisters *cregs = &cxl_cstate->crb;
> +
> +    assert(size == 4);
> +
> +    if (cregs->special_ops && cregs->special_ops->write) {
> +        cregs->special_ops->write(cxl_cstate, offset, value, size);
> +    } else {
> +        cregs->cache_mem_registers[offset / 4] = value;
> +    }
> +}
> +
> +/*
> + * 8.2.3
> + *   The access restrictions specified in Section 8.2.2 also apply to CXL 2.0
> + *   Component Registers.
> + *
> + * 8.2.2
> + *   • A 32 bit register shall be accessed as a 4 Bytes quantity. Partial
> + *   reads are not permitted.
> + *   • A 64 bit register shall be accessed as a 8 Bytes quantity. Partial
> + *   reads are not permitted.
> + *
> + * As of the spec defined today, only 4 byte registers exist.

The exciting exception to this is the RAS header log which is
defined as 512 bits.  Will seek clarification but I think the spec should
probably say that is a set of 32 bit registers.

Several of the other elements that we probably want to fill in with
plausible values also seem to use 64 bit registers.

> + */
> +static const MemoryRegionOps cache_mem_ops = {
> +    .read = cxl_cache_mem_read_reg,
> +    .write = cxl_cache_mem_write_reg,
> +    .endianness = DEVICE_LITTLE_ENDIAN,
> +    .valid = {
> +        .min_access_size = 4,
> +        .max_access_size = 4,
> +        .unaligned = false,
> +    },
> +    .impl = {
> +        .min_access_size = 4,
> +        .max_access_size = 4,
> +    },
> +};
> +

..
> +
> +void cxl_component_register_init_common(uint32_t *reg_state, enum reg_type type)
> +{
> +    int caps = 0;
> +    switch (type) {
> +    case CXL2_DOWNSTREAM_PORT:
> +    case CXL2_DEVICE:
> +        /* CAP, RAS, Link */
> +        caps = 2;
> +        break;
> +    case CXL2_UPSTREAM_PORT:
> +    case CXL2_TYPE3_DEVICE:
> +    case CXL2_LOGICAL_DEVICE:
> +        /* + HDM */
> +        caps = 3;
> +        break;
> +    case CXL2_ROOT_PORT:
> +        /* + Extended Security, + Snoop */
> +        caps = 5;
> +        break;
> +    default:
> +        abort();
> +    }
> +
> +    memset(reg_state, 0, 0x1000);
> +
> +    /* CXL Capability Header Register */
> +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ID, 1);
> +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, VERSION, 1);
> +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, CACHE_MEM_VERSION, 1);
> +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ARRAY_SIZE, caps);
> +
> +
> +#define init_cap_reg(reg, id, version)                                        \
> +    _Static_assert(CXL_##reg##_REGISTERS_OFFSET != 0, "Invalid cap offset\n");\
> +    do {                                                                      \
> +        int which = R_CXL_##reg##_CAPABILITY_HEADER;                          \
> +        reg_state[which] = FIELD_DP32(reg_state[which],                       \
> +                                      CXL_##reg##_CAPABILITY_HEADER, ID, id); \
> +        reg_state[which] =                                                    \
> +            FIELD_DP32(reg_state[which], CXL_##reg##_CAPABILITY_HEADER,       \
> +                       VERSION, version);                                     \
> +        reg_state[which] =                                                    \
> +            FIELD_DP32(reg_state[which], CXL_##reg##_CAPABILITY_HEADER, PTR,  \
> +                       CXL_##reg##_REGISTERS_OFFSET);                         \
> +    } while (0)

Seems like this would be cleaner using ARRAY_FIELD_DP32, as you did for the header.

    #define init_cap_reg(reg, id, version)                                        \
        _Static_assert(CXL_##reg##_REGISTERS_OFFSET != 0, "Invalid cap offset\n");\
        do {                                                                    \
            ARRAY_FIELD_DP32(reg_state, CXL_##reg##_CAPABILITY_HEADER, ID, id); \
            ARRAY_FIELD_DP32(reg_state, CXL_##reg##_CAPABILITY_HEADER,          \
                             VERSION, version);                                 \
            ARRAY_FIELD_DP32(reg_state, CXL_##reg##_CAPABILITY_HEADER,          \
                             PTR, CXL_##reg##_REGISTERS_OFFSET);                \
        } while (0)

I think that gives the same result.

> +
> +    init_cap_reg(RAS, 2, 1);
> +    ras_init_common(reg_state);
> +
> +    init_cap_reg(LINK, 4, 2);

Feels like we'll want to block some values for the rest of these to at least
ensure whatever is read isn't crazy.

> +
> +    if (caps < 3) {
> +        return;
> +    }
> +
> +    init_cap_reg(HDM, 5, 1);
> +    hdm_init_common(reg_state);
> +
> +    if (caps < 5) {
> +        return;
> +    }
> +
> +    init_cap_reg(EXTSEC, 6, 1);
> +    init_cap_reg(SNOOP, 8, 1);
> +
> +#undef init_cap_reg
> +}
> +
> +/*
> + * Helper to creates a DVSEC header for a CXL entity. The caller is responsible
> + * for tracking the valid offset.
> + *
> + * This function will build the DVSEC header on behalf of the caller and then
> + * copy in the remaining data for the vendor specific bits.
> + */
> +void cxl_component_create_dvsec(CXLComponentState *cxl, uint16_t length,
> +                                uint16_t type, uint8_t rev, uint8_t *body)
> +{
> +    PCIDevice *pdev = cxl->pdev;
> +    uint16_t offset = cxl->dvsec_offset;
> +
> +    assert(offset >= PCI_CFG_SPACE_SIZE &&
> +           ((offset + length) < PCI_CFG_SPACE_EXP_SIZE));
> +    assert((length & 0xf000) == 0);
> +    assert((rev & ~0xf) == 0);
> +
> +    /* Create the DVSEC in the MCFG space */
> +    pcie_add_capability(pdev, PCI_EXT_CAP_ID_DVSEC, 1, offset, length);
> +    pci_set_long(pdev->config + offset + PCIE_DVSEC_HEADER1_OFFSET,
> +                 (length << 20) | (rev << 16) | CXL_VENDOR_ID);
> +    pci_set_word(pdev->config + offset + PCIE_DVSEC_ID_OFFSET, type);
> +    memcpy(pdev->config + offset + sizeof(struct dvsec_header),
> +           body + sizeof(struct dvsec_header),
> +           length - sizeof(struct dvsec_header));
> +
> +    /* Update state for future DVSEC additions */
> +    range_init_nofail(&cxl->dvsecs[type], cxl->dvsec_offset, length);
> +    cxl->dvsec_offset += length;
> +}
...



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [RFC PATCH v3 05/31] hw/cxl/device: Implement basic mailbox (8.2.8.4)
  2021-02-02 14:58     ` Jonathan Cameron
  (?)
@ 2021-02-11 17:46     ` Jonathan Cameron
  2021-02-18  0:55       ` Ben Widawsky
  -1 siblings, 1 reply; 117+ messages in thread
From: Jonathan Cameron @ 2021-02-11 17:46 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: David Hildenbrand, Vishal Verma, John Groves (jgroves),
	Chris Browy, qemu-devel, linux-cxl, Markus Armbruster,
	Michael S. Tsirkin, Igor Mammedov, Dan Williams, Ira Weiny,
	Philippe Mathieu-Daudé

On Tue, 2 Feb 2021 14:58:30 +0000
Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:

> On Mon, 1 Feb 2021 16:59:22 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > This is the beginning of implementing mailbox support for CXL 2.0
> > devices. The implementation recognizes when the doorbell is rung,
> > handles the command/payload, clears the doorbell while returning error
> > codes and data.
> > 
> > Generally the mailbox mechanism is designed to permit communication
> > between the host OS and the firmware running on the device. For our
> > purposes, we emulate both the firmware, implemented primarily in
> > cxl-mailbox-utils.c, and the hardware.
> > 
> > No commands are implemented yet.
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > ---
> >  hw/cxl/cxl-device-utils.c   | 125 ++++++++++++++++++++++-
> >  hw/cxl/cxl-mailbox-utils.c  | 197 ++++++++++++++++++++++++++++++++++++
> >  hw/cxl/meson.build          |   1 +
> >  include/hw/cxl/cxl.h        |   3 +
> >  include/hw/cxl/cxl_device.h |  28 ++++-
> >  5 files changed, 349 insertions(+), 5 deletions(-)
> >  create mode 100644 hw/cxl/cxl-mailbox-utils.c
> > 
> > diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
> > index bb15ad9a0f..6602606f3d 100644
> > --- a/hw/cxl/cxl-device-utils.c
> > +++ b/hw/cxl/cxl-device-utils.c
> > @@ -40,6 +40,111 @@ static uint64_t dev_reg_read(void *opaque, hwaddr offset, unsigned size)
> >      return 0;
> >  }
> >  
> > +static uint64_t mailbox_reg_read(void *opaque, hwaddr offset, unsigned size)
> > +{
> > +    CXLDeviceState *cxl_dstate = opaque;
> > +
> > +    switch (size) {

This relates to the thread on the Linux driver and the infinite loop I
saw there as a result of doing 1 byte reads.

With the current setup of .impl.min_access_size = 4 and this function,
QEMU will helpfully issue a series of unaligned 4 byte reads to this
function. It will then mask those down to 1 byte each and combine them.
Given the integer division, that results in the bottom byte of the
register at offset / 4 being returned up to 4 times.

To handle 2 and 1 byte reads we need explicit support in here and
the MemoryRegionOps need to reflect that as well.

All the similar cases where such reads are allowed need to do the
same.
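
To illustrate the difference, here is a minimal standalone sketch (not
QEMU code). The buffer mbox_regs, the length MBOX_LEN and both helper
names are hypothetical. The first helper mimics serving every narrow read
from the word at offset / 4, the second decodes exactly the bytes asked
for in little endian order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MBOX_LEN 32 /* hypothetical register block length in bytes */

/* Hypothetical stand-in for the mailbox register backing store. */
static uint8_t mbox_regs[MBOX_LEN];

/*
 * Problem pattern: a 1 byte read is answered from the 32-bit word at
 * index offset / 4 and then masked, so offsets 4, 5, 6 and 7 all return
 * the bottom byte of the same word.
 */
static uint64_t read_by_word_index(size_t offset)
{
    size_t base = (offset / 4) * 4;
    uint32_t word = 0;

    assert(base + 4 <= MBOX_LEN);
    for (unsigned i = 0; i < 4; i++) {
        word |= (uint32_t)mbox_regs[base + i] << (8 * i);
    }
    return word & 0xff;
}

/*
 * Size-aware read: handles 1, 2, 4 and 8 byte naturally aligned accesses
 * by assembling the requested bytes, independent of host endianness.
 */
static uint64_t read_sized(size_t offset, unsigned size)
{
    uint64_t val = 0;

    assert(size == 1 || size == 2 || size == 4 || size == 8);
    assert(offset % size == 0 && offset + size <= MBOX_LEN);
    for (unsigned i = 0; i < size; i++) {
        val |= (uint64_t)mbox_regs[offset + i] << (8 * i);
    }
    return val;
}
```

In the real device model the equivalent would be extra cases in the
switch (size) plus .impl.min_access_size = 1 in the MemoryRegionOps, so
that QEMU passes the narrow access through rather than widening it.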

> > +    case 8:
> > +        return cxl_dstate->mbox_reg_state64[offset / 8];
> > +    case 4:
> > +        return cxl_dstate->mbox_reg_state32[offset / 4];  
> 
> Numeric order seems more natural and I can't see a reason not to do it.
> + you do them in that order below.
> 
> > +    default:
> > +        g_assert_not_reached();
> > +    }
> > +}
> > +
> > +static void mailbox_mem_writel(uint32_t *reg_state, hwaddr offset,
> > +                               uint64_t value)
> > +{
> > +    switch (offset) {
> > +    case A_CXL_DEV_MAILBOX_CTRL:
> > +        /* fallthrough */
> > +    case A_CXL_DEV_MAILBOX_CAP:
> > +        /* RO register */
> > +        break;
> > +    default:
> > +        qemu_log_mask(LOG_UNIMP,
> > +                      "%s Unexpected 32-bit access to 0x%" PRIx64 " (WI)\n",
> > +                      __func__, offset);
> > +        break;
> > +    }
> > +
> > +    reg_state[offset / 4] = value;
> > +}
> > +
> > +static void mailbox_mem_writeq(uint64_t *reg_state, hwaddr offset,
> > +                               uint64_t value)
> > +{
> > +    switch (offset) {
> > +    case A_CXL_DEV_MAILBOX_CMD:
> > +        break;
> > +    case A_CXL_DEV_BG_CMD_STS:
> > +        /* BG not supported */
> > +        /* fallthrough */
> > +    case A_CXL_DEV_MAILBOX_STS:
> > +        /* Read only register, will get updated by the state machine */
> > +        return;
> > +    default:
> > +        qemu_log_mask(LOG_UNIMP,
> > +                      "%s Unexpected 64-bit access to 0x%" PRIx64 " (WI)\n",
> > +                      __func__, offset);
> > +        return;
> > +    }
> > +
> > +
> > +    reg_state[offset / 8] = value;
> > +}
> > +
> > +static void mailbox_reg_write(void *opaque, hwaddr offset, uint64_t value,
> > +                              unsigned size)
> > +{
> > +    CXLDeviceState *cxl_dstate = opaque;
> > +
> > +    if (offset >= A_CXL_DEV_CMD_PAYLOAD) {
> > +        memcpy(cxl_dstate->mbox_reg_state + offset, &value, size);
> > +        return;
> > +    }
> > +
> > +    /*
> > +     * Lock is needed to prevent concurrent writes as well as to prevent writes
> > +     * coming in while the firmware is processing. Without background commands
> > +     * or the second mailbox implemented, this serves no purpose since the
> > +     * memory access is synchronized at a higher level (per memory region).
> > +     */
> > +    RCU_READ_LOCK_GUARD();
> > +
> > +    switch (size) {
> > +    case 4:
> > +        mailbox_mem_writel(cxl_dstate->mbox_reg_state32, offset, value);
> > +        break;
> > +    case 8:
> > +        mailbox_mem_writeq(cxl_dstate->mbox_reg_state64, offset, value);
> > +        break;
> > +    default:
> > +        g_assert_not_reached();
> > +    }
> > +
> > +    if (ARRAY_FIELD_EX32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL,
> > +                         DOORBELL))
> > +        cxl_process_mailbox(cxl_dstate);
> > +}
> > +
> > +static const MemoryRegionOps mailbox_ops = {
> > +    .read = mailbox_reg_read,
> > +    .write = mailbox_reg_write,
> > +    .endianness = DEVICE_LITTLE_ENDIAN,
> > +    .valid = {
> > +        .min_access_size = 1,
> > +        .max_access_size = 8,
> > +        .unaligned = false,
> > +    },
> > +    .impl = {
> > +        .min_access_size = 4,
> > +        .max_access_size = 8,
> > +    },
> > +};
> > +
> >  static const MemoryRegionOps dev_ops = {
> >      .read = dev_reg_read,
> >      .write = NULL, /* status register is read only */
> > @@ -80,20 +185,33 @@ void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
> >                            "cap-array", CXL_DEVICE_REGISTERS_OFFSET - 0);
> >      memory_region_init_io(&cxl_dstate->device, obj, &dev_ops, cxl_dstate,
> >                            "device-status", CXL_DEVICE_REGISTERS_LENGTH);
> > +    memory_region_init_io(&cxl_dstate->mailbox, obj, &mailbox_ops, cxl_dstate,
> > +                          "mailbox", CXL_MAILBOX_REGISTERS_LENGTH);
> >  
> >      memory_region_add_subregion(&cxl_dstate->device_registers, 0,
> >                                  &cxl_dstate->caps);
> >      memory_region_add_subregion(&cxl_dstate->device_registers,
> >                                  CXL_DEVICE_REGISTERS_OFFSET,
> >                                  &cxl_dstate->device);
> > +    memory_region_add_subregion(&cxl_dstate->device_registers,
> > +                                CXL_MAILBOX_REGISTERS_OFFSET,
> > +                                &cxl_dstate->mailbox);
> >  }
> >  
> >  static void device_reg_init_common(CXLDeviceState *cxl_dstate) { }
> >  
> > +static void mailbox_reg_init_common(CXLDeviceState *cxl_dstate)
> > +{
> > +    /* 2048 payload size, with no interrupt or background support */
> > +    ARRAY_FIELD_DP32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CAP,
> > +                     PAYLOAD_SIZE, CXL_MAILBOX_PAYLOAD_SHIFT);
> > +    cxl_dstate->payload_size = CXL_MAILBOX_MAX_PAYLOAD_SIZE;
> > +}
> > +
> >  void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
> >  {
> >      uint32_t *cap_hdrs = cxl_dstate->caps_reg_state32;
> > -    const int cap_count = 1;
> > +    const int cap_count = 2;
> >  
> >      /* CXL Device Capabilities Array Register */
> >      ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_ID, 0);
> > @@ -102,4 +220,9 @@ void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
> >  
> >      cxl_device_cap_init(cxl_dstate, DEVICE, 1);
> >      device_reg_init_common(cxl_dstate);
> > +
> > +    cxl_device_cap_init(cxl_dstate, MAILBOX, 2);
> > +    mailbox_reg_init_common(cxl_dstate);
> > +
> > +    assert(cxl_initialize_mailbox(cxl_dstate) == 0);
> >  }
> > diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> > new file mode 100644
> > index 0000000000..466055b01a
> > --- /dev/null
> > +++ b/hw/cxl/cxl-mailbox-utils.c
> > @@ -0,0 +1,197 @@
> > +/*
> > + * CXL Utility library for mailbox interface
> > + *
> > + * Copyright(C) 2020 Intel Corporation.
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2. See the
> > + * COPYING file in the top-level directory.
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "hw/cxl/cxl.h"
> > +#include "hw/pci/pci.h"
> > +#include "qemu/log.h"
> > +#include "qemu/uuid.h"
> > +
> > +/*
> > + * How to add a new command, example. The command set FOO, with cmd BAR.
> > + *  1. Add the command set and cmd to the enum.
> > + *     FOO    = 0x7f,
> > + *          #define BAR 0
> > + *  2. Forward declare the handler.
> > + *     declare_mailbox_handler(FOO_BAR);
> > + *  3. Add the command to the cxl_cmd_set[][]
> > + *     CXL_CMD(FOO, BAR, 0, 0),
> > + *  4. Implement your handler
> > + *     define_mailbox_handler(FOO_BAR) { ... return CXL_MBOX_SUCCESS; }
> > + *
> > + *
> > + *  Writing the handler:
> > + *    The handler will provide the &struct cxl_cmd, the &CXLDeviceState, and the
> > + *    in/out length of the payload. The handler is responsible for consuming the
> > + *    payload from cmd->payload and operating upon it as necessary. It must then
> > + *    fill the output data into cmd->payload (overwriting what was there),
> > + *    setting the length, and returning a valid return code.
> > + *
> > + *  XXX: The handler need not worry about endianess. The payload is read out of
> > + *  a register interface that already deals with it.
> > + */
> > +
> > +/* 8.2.8.4.5.1 Command Return Codes */
> > +typedef enum {
> > +    CXL_MBOX_SUCCESS = 0x0,
> > +    CXL_MBOX_BG_STARTED = 0x1,
> > +    CXL_MBOX_INVALID_INPUT = 0x2,
> > +    CXL_MBOX_UNSUPPORTED = 0x3,
> > +    CXL_MBOX_INTERNAL_ERROR = 0x4,
> > +    CXL_MBOX_RETRY_REQUIRED = 0x5,
> > +    CXL_MBOX_BUSY = 0x6,
> > +    CXL_MBOX_MEDIA_DISABLED = 0x7,
> > +    CXL_MBOX_FW_XFER_IN_PROGRESS = 0x8,
> > +    CXL_MBOX_FW_XFER_OUT_OF_ORDER = 0x9,
> > +    CXL_MBOX_FW_AUTH_FAILED = 0xa,
> > +    CXL_MBOX_FW_INVALID_SLOT = 0xb,
> > +    CXL_MBOX_FW_ROLLEDBACK = 0xc,
> > +    CXL_MBOX_FW_REST_REQD = 0xd,
> > +    CXL_MBOX_INVALID_HANDLE = 0xe,
> > +    CXL_MBOX_INVALID_PA = 0xf,
> > +    CXL_MBOX_INJECT_POISON_LIMIT = 0x10,
> > +    CXL_MBOX_PERMANENT_MEDIA_FAILURE = 0x11,
> > +    CXL_MBOX_ABORTED = 0x12,
> > +    CXL_MBOX_INVALID_SECURITY_STATE = 0x13,
> > +    CXL_MBOX_INCORRECT_PASSPHRASE = 0x14,
> > +    CXL_MBOX_UNSUPPORTED_MAILBOX = 0x15,
> > +    CXL_MBOX_INVALID_PAYLOAD_LENGTH = 0x16,
> > +    CXL_MBOX_MAX = 0x17
> > +} ret_code;
> > +
> > +struct cxl_cmd;
> > +typedef ret_code (*opcode_handler)(struct cxl_cmd *cmd,
> > +                                   CXLDeviceState *cxl_dstate, uint16_t *len);
> > +struct cxl_cmd {
> > +    const char *name;
> > +    opcode_handler handler;
> > +    ssize_t in;
> > +    uint16_t effect; /* Reported in CEL */
> > +    uint8_t *payload;  
> 
> This payload pointer feels somewhat out of place. Perhaps it should be a parameter
> passed to the opcode_handler()?  The address of the payload doesn't feel like
> part of the command as such, so you are just using it as somewhere to stash
> the address when passing to the handler.
> 
> 
> > +};
> > +
> > +#define define_mailbox_handler(name)                \
> > +    static ret_code cmd_##name(struct cxl_cmd *cmd, \
> > +                               CXLDeviceState *cxl_dstate, uint16_t *len)
> > +#define declare_mailbox_handler(name) define_mailbox_handler(name)
> > +
> > +#define define_mailbox_handler_zeroed(name, size)                         \
> > +    uint16_t __zero##name = size;                                         \
> > +    static ret_code cmd_##name(struct cxl_cmd *cmd,                       \
> > +                               CXLDeviceState *cxl_dstate, uint16_t *len) \
> > +    {                                                                     \
> > +        *len = __zero##name;                                              \
> > +        memset(cmd->payload, 0, *len);                                    \
> > +        return CXL_MBOX_SUCCESS;                                          \
> > +    }
> > +#define define_mailbox_handler_const(name, data)                          \
> > +    static ret_code cmd_##name(struct cxl_cmd *cmd,                       \
> > +                               CXLDeviceState *cxl_dstate, uint16_t *len) \
> > +    {                                                                     \
> > +        *len = sizeof(data);                                              \
> > +        memcpy(cmd->payload, data, *len);                                 \
> > +        return CXL_MBOX_SUCCESS;                                          \
> > +    }
> > +#define define_mailbox_handler_nop(name)                                  \
> > +    static ret_code cmd_##name(struct cxl_cmd *cmd,                       \
> > +                               CXLDeviceState *cxl_dstate, uint16_t *len) \
> > +    {                                                                     \
> > +        return CXL_MBOX_SUCCESS;                                          \
> > +    }
> > +
> > +#define CXL_CMD(s, c, in, cel_effect) \
> > +    [s][c] = { stringify(s##_##c), cmd_##s##_##c, in, cel_effect }
> > +
> > +static struct cxl_cmd cxl_cmd_set[256][256] = {};
> > +
> > +#undef CXL_CMD
> > +
> > +QemuUUID cel_uuid;
> > +
> > +void cxl_process_mailbox(CXLDeviceState *cxl_dstate)
> > +{
> > +    uint16_t ret = CXL_MBOX_SUCCESS;
> > +    struct cxl_cmd *cxl_cmd;
> > +    uint64_t status_reg;
> > +    opcode_handler h;
> > +
> > +    /*
> > +     * current state of mailbox interface
> > +     *  mbox_cap_reg = cxl_dstate->reg_state32[R_CXL_DEV_MAILBOX_CAP];
> > +     *  mbox_ctrl_reg = cxl_dstate->reg_state32[R_CXL_DEV_MAILBOX_CTRL];
> > +     *  status_reg = *(uint64_t *)&cxl_dstate->reg_state[A_CXL_DEV_MAILBOX_STS];
> > +     */
> > +    uint64_t command_reg =
> > +        *(uint64_t *)&cxl_dstate->mbox_reg_state[A_CXL_DEV_MAILBOX_CMD];
> > +
> > +    /* Check if we have to do anything */
> > +    if (!ARRAY_FIELD_EX32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL,
> > +                          DOORBELL)) {
> > +        qemu_log_mask(LOG_UNIMP, "Corrupt internal state for firmware\n");
> > +        return;
> > +    }
> > +
> > +    uint8_t set = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND_SET);
> > +    uint8_t cmd = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND);
> > +    uint16_t len = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, LENGTH);
> > +    cxl_cmd = &cxl_cmd_set[set][cmd];
> > +    h = cxl_cmd->handler;
> > +    if (!h) {  
> 
> This path seems to not convey information it perhaps should.  Maybe some feedback that
> a command was requested that isn't registered?
> 
> > +        goto handled;
> > +    }
> > +
> > +    if (len != cxl_cmd->in) {
> > +        ret = CXL_MBOX_INVALID_PAYLOAD_LENGTH;
> > +    }
> > +
> > +    cxl_cmd->payload = cxl_dstate->mbox_reg_state + A_CXL_DEV_CMD_PAYLOAD;
> > +    ret = (*h)(cxl_cmd, cxl_dstate, &len);
> > +    assert(len <= cxl_dstate->payload_size);
> > +
> > +handled:
> > +    /*
> > +     * Set the return code
> > +     * XXX: it's a 64b register, but we're not setting the vendor, so we can get
> > +     * away with this
> > +     */
> > +    status_reg = FIELD_DP64(0, CXL_DEV_MAILBOX_STS, ERRNO, ret);
> > +
> > +    /*
> > +     * Set the return length
> > +     */
> > +    command_reg = FIELD_DP64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND_SET, 0);
> > +    command_reg = FIELD_DP64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND, 0);
> > +    command_reg = FIELD_DP64(command_reg, CXL_DEV_MAILBOX_CMD, LENGTH, len);
> > +
> > +    cxl_dstate->mbox_reg_state64[A_CXL_DEV_MAILBOX_CMD / 8] = command_reg;
> > +    cxl_dstate->mbox_reg_state64[A_CXL_DEV_MAILBOX_STS / 8] = status_reg;
> > +
> > +    /* Tell the host we're done */
> > +    ARRAY_FIELD_DP32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL,
> > +                     DOORBELL, 0);
> > +}
> > +
> > +int cxl_initialize_mailbox(CXLDeviceState *cxl_dstate)
> > +{
> > +    const char *cel_uuidstr = "0da9c0b5-bf41-4b78-8f79-96b1623b3f17";
> > +
> > +    for (int i = 0; i < 256; i++) {  
> 
> Instead of indexing with i and j, perhaps this would be more consistent using
> the naming you have above cmd and set?
> 
> 
> > +        for (int j = 0; j < 256; j++) {
> > +            if (cxl_cmd_set[i][j].handler) {
> > +                struct cxl_cmd *c = &cxl_cmd_set[i][j];
> > +
> > +                cxl_dstate->cel_log[cxl_dstate->cel_size].opcode = (i << 8) | j;
> > +                cxl_dstate->cel_log[cxl_dstate->cel_size].effect = c->effect;
> > +                cxl_dstate->cel_size++;
> > +            }
> > +        }
> > +    }
> > +
> > +    return qemu_uuid_parse(cel_uuidstr, &cel_uuid);
> > +}
> > diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build
> > index 47154d6850..0eca715d10 100644
> > --- a/hw/cxl/meson.build
> > +++ b/hw/cxl/meson.build
> > @@ -1,4 +1,5 @@
> >  softmmu_ss.add(when: 'CONFIG_CXL', if_true: files(
> >    'cxl-component-utils.c',
> >    'cxl-device-utils.c',
> > +  'cxl-mailbox-utils.c',
> >  ))
> > diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
> > index 23f52c4cf9..362cda40de 100644
> > --- a/include/hw/cxl/cxl.h
> > +++ b/include/hw/cxl/cxl.h
> > @@ -14,5 +14,8 @@
> >  #include "cxl_component.h"
> >  #include "cxl_device.h"
> >  
> > +#define COMPONENT_REG_BAR_IDX 0
> > +#define DEVICE_REG_BAR_IDX 2  
> 
> I'd argue for prefixing all defines
> 
> CXL_COMPONENT_REG_BAR_IDX etc
> 
> Will make it clear they are generic CXL related things.
> 
> > +
> >  #endif
> >  
> > diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> > index f3bcf19410..af91bec10c 100644
> > --- a/include/hw/cxl/cxl_device.h
> > +++ b/include/hw/cxl/cxl_device.h
> > @@ -80,16 +80,27 @@ typedef struct cxl_device_state {
> >      MemoryRegion device_registers;
> >  
> >      /* mmio for device capabilities array - 8.2.8.2 */
> > +    MemoryRegion device;
> >      struct {
> >          MemoryRegion caps;
> >          uint32_t caps_reg_state32[CXL_CAPS_SIZE / 4];
> >      };
> >  
> > -    /* mmio for the device status registers 8.2.8.3 */
> > -    MemoryRegion device;
> > -  
> 
> Move this block back to where it was originally introduced rather than
> introduce it then move it later?
> 
> >      /* mmio for the mailbox registers 8.2.8.4 */
> > -    MemoryRegion mailbox;
> > +    struct {
> > +        MemoryRegion mailbox;
> > +        uint16_t payload_size;
> > +        union {
> > +            uint8_t mbox_reg_state[CXL_MAILBOX_REGISTERS_LENGTH];
> > +            uint32_t mbox_reg_state32[CXL_MAILBOX_REGISTERS_LENGTH / 4];
> > +            uint64_t mbox_reg_state64[CXL_MAILBOX_REGISTERS_LENGTH / 8];
> > +        };
> > +        struct {
> > +            uint16_t opcode;
> > +            uint16_t effect;
> > +        } cel_log[1 << 16];
> > +        size_t cel_size;
> > +    };  
> 
> If the structure is unnamed, chances of a naming clash seem rather high
> if you don't prefix all the elements with mbx_ or something like that.
> 
> >  
> >      /* memory region for persistent memory, HDM */
> >      MemoryRegion *pmem;
> > @@ -135,6 +146,9 @@ CXL_DEVICE_CAPABILITY_HEADER_REGISTER(DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET)
> >  CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MAILBOX, CXL_DEVICE_CAP_HDR1_OFFSET + \
> >                                                 CXL_DEVICE_CAP_REG_SIZE)
> >  
> > +int cxl_initialize_mailbox(CXLDeviceState *cxl_dstate);
> > +void cxl_process_mailbox(CXLDeviceState *cxl_dstate);
> > +
> >  #define cxl_device_cap_init(dstate, reg, cap_id)                                   \
> >      do {                                                                           \
> >          uint32_t *cap_hdrs = dstate->caps_reg_state32;                             \
> > @@ -162,6 +176,12 @@ REG32(CXL_DEV_MAILBOX_CTRL, 4)
> >      FIELD(CXL_DEV_MAILBOX_CTRL, INT_EN, 1, 1)
> >      FIELD(CXL_DEV_MAILBOX_CTRL, BG_INT_EN, 2, 1)
> >  
> > +/* XXX: actually a 64b register */
> > +REG32(CXL_DEV_MAILBOX_CMD, 8)
> > +    FIELD(CXL_DEV_MAILBOX_CMD, COMMAND, 0, 8)
> > +    FIELD(CXL_DEV_MAILBOX_CMD, COMMAND_SET, 8, 8)
> > +    FIELD(CXL_DEV_MAILBOX_CMD, LENGTH, 16, 20)
> > +  
> 
> Ah. this is where this definition went.  Perhaps pull it back into patch 3?
> That patch defines plenty of other things that aren't used until later patches
> I think, so one more won't hurt and will save me asking why you skipped it:)
> 
> 
> >  /* XXX: actually a 64b register */
> >  REG32(CXL_DEV_MAILBOX_STS, 0x10)
> >      FIELD(CXL_DEV_MAILBOX_STS, BG_OP, 0, 1)  
> 
> 


^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [RFC PATCH v3 07/31] hw/cxl/device: Add cheap EVENTS implementation (8.2.9.1)
  2021-02-02  0:59   ` Ben Widawsky
@ 2021-02-11 17:59     ` Jonathan Cameron
  -1 siblings, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2021-02-11 17:59 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: qemu-devel, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Marcel Apfelbaum,
	Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

On Mon, 1 Feb 2021 16:59:24 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> Using the previously implemented stubbed helpers, it is now possible to
> easily add the missing, required commands to the implementation.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
>  hw/cxl/cxl-mailbox-utils.c | 23 ++++++++++++++++++++++-
>  1 file changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> index 466055b01a..7c939a1851 100644
> --- a/hw/cxl/cxl-mailbox-utils.c
> +++ b/hw/cxl/cxl-mailbox-utils.c
> @@ -37,6 +37,14 @@
>   *  a register interface that already deals with it.
>   */
>  
> +enum {
> +    EVENTS      = 0x01,
> +        #define GET_RECORDS   0x0
> +        #define CLEAR_RECORDS   0x1
> +        #define GET_INTERRUPT_POLICY   0x2
> +        #define SET_INTERRUPT_POLICY   0x3
> +};
> +
>  /* 8.2.8.4.5.1 Command Return Codes */
>  typedef enum {
>      CXL_MBOX_SUCCESS = 0x0,
> @@ -105,10 +113,23 @@ struct cxl_cmd {
>          return CXL_MBOX_SUCCESS;                                          \
>      }
>  
> +define_mailbox_handler_zeroed(EVENTS_GET_RECORDS, 0x20);
> +define_mailbox_handler_nop(EVENTS_CLEAR_RECORDS);
> +define_mailbox_handler_zeroed(EVENTS_GET_INTERRUPT_POLICY, 4);
> +define_mailbox_handler_nop(EVENTS_SET_INTERRUPT_POLICY);
> +
> +#define IMMEDIATE_CONFIG_CHANGE (1 << 1)
> +#define IMMEDIATE_LOG_CHANGE (1 << 4)
> +
>  #define CXL_CMD(s, c, in, cel_effect) \
>      [s][c] = { stringify(s##_##c), cmd_##s##_##c, in, cel_effect }
>  
> -static struct cxl_cmd cxl_cmd_set[256][256] = {};
> +static struct cxl_cmd cxl_cmd_set[256][256] = {
> +    CXL_CMD(EVENTS, GET_RECORDS, 1, 0),
> +    CXL_CMD(EVENTS, CLEAR_RECORDS, ~0, IMMEDIATE_LOG_CHANGE),
> +    CXL_CMD(EVENTS, GET_INTERRUPT_POLICY, 0, 0),
> +    CXL_CMD(EVENTS, SET_INTERRUPT_POLICY, 4, IMMEDIATE_CONFIG_CHANGE),

CXL 2.0 spec says IMMEDIATE_POLICY_CHANGE for this rather than
IMMEDIATE_CONFIG_CHANGE.
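
A minimal, untested sketch of what that could look like. The CONFIG and LOG
bit positions are the ones in the patch; bit 3 for IMMEDIATE_POLICY_CHANGE is
an assumption from my reading of the CEL command effect definition and should
be checked against the spec. The struct cmd_info type and the CMD() macro are
simplified stand-ins for struct cxl_cmd and CXL_CMD(), not the real thing.

```c
#include <stddef.h>
#include <stdint.h>

/* Command effect bits. Bit 3 for POLICY is an assumption; verify it. */
#define IMMEDIATE_CONFIG_CHANGE (1 << 1)
#define IMMEDIATE_POLICY_CHANGE (1 << 3)
#define IMMEDIATE_LOG_CHANGE    (1 << 4)

/* Command set and opcodes, with the values used in the patch. */
enum { EVENTS = 0x01 };
enum {
    GET_RECORDS          = 0x0,
    CLEAR_RECORDS        = 0x1,
    GET_INTERRUPT_POLICY = 0x2,
    SET_INTERRUPT_POLICY = 0x3,
};

/* Simplified stand-in for struct cxl_cmd: name plus CEL effect only. */
struct cmd_info {
    const char *name;
    uint16_t effect;
};

/* Designated initializer indexed by [command set][command]. */
#define CMD(s, c, cel_effect) [s][c] = { #s "_" #c, cel_effect }

/* Entries not listed here stay zeroed, so name == NULL marks "unimplemented". */
static const struct cmd_info cmd_table[256][256] = {
    CMD(EVENTS, GET_RECORDS, 0),
    CMD(EVENTS, CLEAR_RECORDS, IMMEDIATE_LOG_CHANGE),
    CMD(EVENTS, GET_INTERRUPT_POLICY, 0),
    CMD(EVENTS, SET_INTERRUPT_POLICY, IMMEDIATE_POLICY_CHANGE),
};

#undef CMD
```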

> +};
>  
>  #undef CXL_CMD
>  


^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [RFC PATCH v3 05/31] hw/cxl/device: Implement basic mailbox (8.2.8.4)
  2021-02-02  0:59   ` Ben Widawsky
@ 2021-02-11 18:09     ` Jonathan Cameron
  -1 siblings, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2021-02-11 18:09 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: qemu-devel, linux-cxl, Chris Browy, Dan Williams,
	David Hildenbrand, Igor Mammedov, Ira Weiny, Marcel Apfelbaum,
	Markus Armbruster, Philippe Mathieu-Daudé,
	Vishal Verma, John Groves (jgroves),
	Michael S. Tsirkin

On Mon, 1 Feb 2021 16:59:22 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> This is the beginning of implementing mailbox support for CXL 2.0
> devices. The implementation recognizes when the doorbell is rung,
> handles the command/payload, clears the doorbell while returning error
> codes and data.
> 
> Generally the mailbox mechanism is designed to permit communication
> between the host OS and the firmware running on the device. For our
> purposes, we emulate both the firmware, implemented primarily in
> cxl-mailbox-utils.c, and the hardware.
> 
> No commands are implemented yet.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
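
To make the flow described above concrete, here is a minimal sketch of
doorbell handling: do nothing until the doorbell is set, look up a handler by
command set and command, record the return code and output length, then clear
the doorbell. Everything in it (struct mbox, mbox_process(), the handlers
table, the return code values) is a hypothetical simplification and not the
code in this patch. With an empty table every command is reported as
unsupported, which matches "no commands are implemented yet".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative return codes; the values are placeholders for this sketch. */
enum { MBOX_SUCCESS = 0x0, MBOX_UNSUPPORTED = 0x15 };

/* Hypothetical, simplified mailbox state (not QEMU's CXLDeviceState). */
struct mbox {
    bool doorbell;          /* set by the host to submit a command */
    uint8_t command_set;
    uint8_t command;
    uint32_t length;        /* in: input payload size, out: output size */
    uint16_t return_code;
    uint8_t payload[256];
};

typedef uint16_t (*mbox_handler)(struct mbox *mb, uint16_t *len);

/* Zero-initialized: no handlers registered, so nothing is implemented. */
static mbox_handler handlers[256][256];

/* Emulated "firmware": run one command if the doorbell has been rung. */
static void mbox_process(struct mbox *mb)
{
    assert(mb != NULL);

    if (!mb->doorbell) {
        return;
    }

    mbox_handler h = handlers[mb->command_set][mb->command];
    uint16_t len = 0;

    mb->return_code = h ? h(mb, &len) : MBOX_UNSUPPORTED;
    mb->length = len;

    /* Clearing the doorbell tells the host the command has completed. */
    mb->doorbell = false;
}
```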

Sorry, this review is a little incoherent. It's a lot of patches,
so I've ended up looking at your tree and then trying to figure out
which patch a given comment belongs alongside.

> ---
>  hw/cxl/cxl-device-utils.c   | 125 ++++++++++++++++++++++-
>  hw/cxl/cxl-mailbox-utils.c  | 197 ++++++++++++++++++++++++++++++++++++
>  hw/cxl/meson.build          |   1 +
>  include/hw/cxl/cxl.h        |   3 +
>  include/hw/cxl/cxl_device.h |  28 ++++-
>  5 files changed, 349 insertions(+), 5 deletions(-)
>  create mode 100644 hw/cxl/cxl-mailbox-utils.c
> 
> diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
> index bb15ad9a0f..6602606f3d 100644
> --- a/hw/cxl/cxl-device-utils.c
> +++ b/hw/cxl/cxl-device-utils.c
> @@ -40,6 +40,111 @@ static uint64_t dev_reg_read(void *opaque, hwaddr offset, unsigned size)
>      return 0;
>  }
>  


> +
> +#define define_mailbox_handler_zeroed(name, size)                         \
> +    uint16_t __zero##name = size;                                         \
> +    static ret_code cmd_##name(struct cxl_cmd *cmd,                       \
> +                               CXLDeviceState *cxl_dstate, uint16_t *len) \
> +    {                                                                     \
> +        *len = __zero##name;                                              \

Why not just use the value of size here?

__zero##name isn't used anywhere else.
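
A minimal sketch of the macro with the global dropped and size used directly.
The struct cxl_cmd and CXLDeviceState definitions below are stand-ins so the
snippet compiles on its own (the fixed 64-byte payload in particular is an
assumption); only the macro body is the point.

```c
#include <stdint.h>
#include <string.h>

/* Stand-ins for the real QEMU types, just enough to compile the macro. */
typedef enum { CXL_MBOX_SUCCESS = 0x0 } ret_code;
typedef struct CXLDeviceState CXLDeviceState;
struct cxl_cmd {
    uint8_t payload[64];    /* hypothetical fixed-size payload buffer */
};

/* Same handler as in the patch, without the __zero##name global. */
#define define_mailbox_handler_zeroed(name, size)                         \
    static ret_code cmd_##name(struct cxl_cmd *cmd,                       \
                               CXLDeviceState *cxl_dstate, uint16_t *len) \
    {                                                                     \
        (void)cxl_dstate; /* unused by the zeroed handler */              \
        *len = (size);                                                    \
        memset(cmd->payload, 0, *len);                                    \
        return CXL_MBOX_SUCCESS;                                          \
    }

/* Expands to cmd_EVENTS_GET_RECORDS(), which zeroes 0x20 payload bytes. */
define_mailbox_handler_zeroed(EVENTS_GET_RECORDS, 0x20)
```

This also avoids emitting a non-static global per command.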

> +        memset(cmd->payload, 0, *len);                                    \
> +        return CXL_MBOX_SUCCESS;                                          \
> +    }
> +#define define_mailbox_handler_const(name, data)                          \
> +    static ret_code cmd_##name(struct cxl_cmd *cmd,                       \
> +                               CXLDeviceState *cxl_dstate, uint16_t *len) \
> +    {                                                                     \
> +        *len = sizeof(data);                                              \
> +        memcpy(cmd->payload, data, *len);                                 \
> +        return CXL_MBOX_SUCCESS;                                          \
> +    }
> +#define define_mailbox_handler_nop(name)                                  \
> +    static ret_code cmd_##name(struct cxl_cmd *cmd,                       \
> +                               CXLDeviceState *cxl_dstate, uint16_t *len) \
> +    {                                                                     \
> +        return CXL_MBOX_SUCCESS;                                          \
> +    }
> +


^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [RFC PATCH v3 00/31] CXL 2.0 Support
  2021-02-03 17:42 ` Ben Widawsky
@ 2021-02-11 18:51     ` Jonathan Cameron
  0 siblings, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2021-02-11 18:51 UTC (permalink / raw)
  To: Ben Widawsky, David Hildenbrand
  Cc: qemu-devel, Vishal Verma, John Groves (jgroves),
	Chris Browy, Markus Armbruster, linux-cxl,
	Philippe Mathieu-Daudé,
	Michael S. Tsirkin, Igor Mammedov, Dan Williams, Ira Weiny

On Wed, 3 Feb 2021 09:42:16 -0800
Ben Widawsky <ben@bwidawsk.net> wrote:

> I've started a barebones project plan:
> https://gitlab.com/bwidawsk/qemu/-/snippets/2070304

Great.

> 
> Jonathan, if you have a moment, perhaps you can send a MR summarizing CDAT/DOE
> work from you and Chris?

I need to catch up with what Chris has posted, but sure after that I'll add to your
doc if Chris doesn't get there first.  My intent is to let Chris get on with the
DOE QEMU support.  Plenty of other stuff to do as long as it covers what I need
(if not I'll hack stuff on top :) Will catch up with reviewing that in the next
few days.

I'll also add some other comments on the plan when I get a chance.
Will need to fake at least a partial switch sometime soon for example.

> 
> If folks feel priorities are drastically off, we can discuss it in the snippet
> comments.
> 
> As for wider acceptance, if I'm looking at this from the QEMU community
> perspective, better test cases are really needed. If your fingers are itching
> for some typing, might I suggest starting with that.
> 
> I've opted not to use issue tracker for this because I am hopeful this won't be
> a long living gitlab project.

All sounds good.

I've not reviewed that much on the last few patches in here, at least partly
because a bunch of them have todo comments so I'm assuming they are very much
a work in progress.

One thing I will note is this has become large and complex enough that I'd be
tempted to start separating the 'racy cutting edge' parts from bits that have
been moderately stable for a while.  Hopefully some of that stable part can get
wider review without the fun stuff and all the churn related to that.

Jonathan



> 
> On 21-02-01 16:59:17, Ben Widawsky wrote:
> > Major changes since v2 [1]:
> >  * Removed all register endian/alignment/size checking. Using core functionality
> >    instead. This is untested on big endian hosts, but Should Work(tm).
> >  * Fix component capability header generation (off by 1).
> >  * Fixed HDM programming (multiple issues).
> >  * Fixed timestamp command implementations.
> >  * Added commands: GET_FIRMWARE_UPDATE_INFO, GET_PARTITION_INFO, GET_LSA, SET_LSA
> > 
> > Things have remained fairly stable since v2. The biggest change here is
> > definitely the HDM programming which has received limited (but not 0) testing in
> > the Linux driver.
> > 
> > Jonathan Cameron has gotten this patch series working on ARM [2], and added some
> > much sought after functionality [3].
> > 
> > ---
> > 
> > I've started #cxl on OFTC IRC for discussion. Please feel free to use that
> > channel for questions or suggestions in addition to #qemu.
> > 
> > ---
> > 
> > Introduce emulation of Compute Express Link 2.0
> > (https://www.computeexpresslink.org/). Specifically, add support for Type 3
> > memory expanders with persistent memory.
> > 
> > The emulation has been critical to get the Linux enabling started [4], it would
> > be an ideal place to land regression tests for different topology handling, and
> > there may be applications for this emulation as a way for a guest to manipulate
> > its address space relative to different performance memories.
> > 
> > Three of the five CXL component types are emulated with some level of
> > functionality: host bridge, root port, and memory device. All components and
> > devices implement basic MMIO. Devices/memory devices implement the mailbox
> > interface. Basic ACPI support is also included. Upstream ports and downstream
> > ports aren't implemented (the two components needed to make up a switch).
> > 
> > CXL 2.0 is built on top of PCIe (see spec for details). As a result, much of the
> > implementation utilizes existing PCI paradigms. To implement the host bridge,
> > I've chosen to use PXB (PCI Expander Bridge). It seemed to be the most natural
> > fit even though it doesn't directly map to how hardware will work. For
> > persistent capacity of the memory device, I utilized the memory subsystem
> > (hw/mem).
> > 
> > We have 3 reasons why this work is valuable:
> > 1. Linux driver feature development benefits from emulation both due to a lack
> >    of initial hardware availability, but also, as is seen with NVDIMM/PMEM
> >    emulation, there is value in being able to share topologies with
> >    system-software developers even after hardware is available.
> > 
> > 2. The Linux kernel's unit test suite for NVDIMM/PMEM ended up injecting fake
> >    resources via custom modules (nfit_test). In retrospect a QEMU emulation of
> >    nfit_test capabilities would have made the test environment more portable,
> >    and allowed for easier community contributions of example configurations.
> > 
> > 3. This is still being fleshed out, but in short it provides a standardized
> >    mechanism for the guest to provide feedback to the host about size and
> >    placement needs of the memory. After the host gives the guest a physical
> >    window mapping to the CXL device, the emulated HDM decoders allow the guest a
> >    way to tell the host how much it wants and where. There are likely simpler
> >    ways to do this, but they'd require inventing a new interface and you'd need
> >    to have diverging driver code in the guest programming of the HDM decoder vs.
> >    the host. Since we've already done this work, why not use it?
> > 
> > There is quite a long list of work to do for full spec compliance, but I don't
> > believe that any of it precludes merging. Off the top of my head:
> > - Main host bridge support (WIP)
> > - Interleaving
> > - Better Tests
> > - Hot plug support
> > - Emulating volatile capacity
> > - CDAT emulation [3]
> > 
> > The flow of the patches in general is to define all the data structures and
> > registers associated with the various components in a top down manner. Host
> > bridge, component, ports, devices. Then, the actual implementation is done in
> > the same order.
> > 
> > The summary is:
> > 1-5: Infrastructure for component and device emulation
> > 6-9: Basic mailbox command implementations
> > 10-19: Implement CXL host bridges as PXB devices
> > 20: Implement a root port
> > 21-22: Implement a memory device
> > 23-26: ACPI bits
> > 27-29: Add some more advanced mailbox command implementations
> > 30: Start working on enabling the main host bridge
> > 31: Basic test case
> > 
> > ---
> > 
> > [1]: https://lore.kernel.org/qemu-devel/20210105165323.783725-1-ben.widawsky@intel.com/
> > [2]: https://lore.kernel.org/qemu-devel/20210201152655.31027-1-Jonathan.Cameron@huawei.com/
> > [3]: https://lore.kernel.org/qemu-devel/20210201151629.29656-1-Jonathan.Cameron@huawei.com/
> > [4]: https://lore.kernel.org/linux-cxl/20210130002438.1872527-1-ben.widawsky@intel.com/
> > 
> > ---
> > 
> > Ben Widawsky (31):
> >   hw/pci/cxl: Add a CXL component type (interface)
> >   hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5)
> >   hw/cxl/device: Introduce a CXL device (8.2.8)
> >   hw/cxl/device: Implement the CAP array (8.2.8.1-2)
> >   hw/cxl/device: Implement basic mailbox (8.2.8.4)
> >   hw/cxl/device: Add memory device utilities
> >   hw/cxl/device: Add cheap EVENTS implementation (8.2.9.1)
> >   hw/cxl/device: Timestamp implementation (8.2.9.3)
> >   hw/cxl/device: Add log commands (8.2.9.4) + CEL
> >   hw/pxb: Use a type for realizing expanders
> >   hw/pci/cxl: Create a CXL bus type
> >   hw/pxb: Allow creation of a CXL PXB (host bridge)
> >   qtest: allow DSDT acpi table changes
> >   acpi/pci: Consolidate host bridge setup
> >   tests/acpi: remove stale allowed tables
> >   hw/pci: Plumb _UID through host bridges
> >   hw/cxl/component: Implement host bridge MMIO (8.2.5, table 142)
> >   acpi/pxb/cxl: Reserve host bridge MMIO
> >   hw/pxb/cxl: Add "windows" for host bridges
> >   hw/cxl/rp: Add a root port
> >   hw/cxl/device: Add a memory device (8.2.8.5)
> >   hw/cxl/device: Implement MMIO HDM decoding (8.2.5.12)
> >   acpi/cxl: Add _OSC implementation (9.14.2)
> >   tests/acpi: allow CEDT table addition
> >   acpi/cxl: Create the CEDT (9.14.1)
> >   tests/acpi: Add new CEDT files
> >   hw/cxl/device: Add some trivial commands
> >   hw/cxl/device: Plumb real LSA sizing
> >   hw/cxl/device: Implement get/set LSA
> >   qtest/cxl: Add very basic sanity tests
> >   WIP: i386/cxl: Initialize a host bridge
> > 
> >  MAINTAINERS                         |   6 +
> >  hw/Kconfig                          |   1 +
> >  hw/acpi/Kconfig                     |   5 +
> >  hw/acpi/cxl.c                       | 173 ++++++++++
> >  hw/acpi/meson.build                 |   1 +
> >  hw/arm/virt.c                       |   1 +
> >  hw/core/machine.c                   |  26 ++
> >  hw/core/numa.c                      |   3 +
> >  hw/cxl/Kconfig                      |   3 +
> >  hw/cxl/cxl-component-utils.c        | 208 ++++++++++++
> >  hw/cxl/cxl-device-utils.c           | 264 +++++++++++++++
> >  hw/cxl/cxl-mailbox-utils.c          | 498 ++++++++++++++++++++++++++++
> >  hw/cxl/meson.build                  |   5 +
> >  hw/i386/acpi-build.c                |  87 ++++-
> >  hw/i386/microvm.c                   |   1 +
> >  hw/i386/pc.c                        |   2 +
> >  hw/mem/Kconfig                      |   5 +
> >  hw/mem/cxl_type3.c                  | 405 ++++++++++++++++++++++
> >  hw/mem/meson.build                  |   1 +
> >  hw/meson.build                      |   1 +
> >  hw/pci-bridge/Kconfig               |   5 +
> >  hw/pci-bridge/cxl_root_port.c       | 231 +++++++++++++
> >  hw/pci-bridge/meson.build           |   1 +
> >  hw/pci-bridge/pci_expander_bridge.c | 209 +++++++++++-
> >  hw/pci-bridge/pcie_root_port.c      |   6 +-
> >  hw/pci/pci.c                        |  32 +-
> >  hw/pci/pcie.c                       |  30 ++
> >  hw/ppc/spapr.c                      |   2 +
> >  include/hw/acpi/cxl.h               |  27 ++
> >  include/hw/boards.h                 |   2 +
> >  include/hw/cxl/cxl.h                |  29 ++
> >  include/hw/cxl/cxl_component.h      | 187 +++++++++++
> >  include/hw/cxl/cxl_device.h         | 255 ++++++++++++++
> >  include/hw/cxl/cxl_pci.h            | 160 +++++++++
> >  include/hw/pci/pci.h                |  15 +
> >  include/hw/pci/pci_bridge.h         |  25 ++
> >  include/hw/pci/pci_bus.h            |   8 +
> >  include/hw/pci/pci_ids.h            |   1 +
> >  monitor/hmp-cmds.c                  |  15 +
> >  qapi/machine.json                   |   1 +
> >  tests/data/acpi/pc/CEDT             | Bin 0 -> 36 bytes
> >  tests/data/acpi/pc/DSDT             | Bin 5065 -> 5065 bytes
> >  tests/data/acpi/pc/DSDT.acpihmat    | Bin 6390 -> 6390 bytes
> >  tests/data/acpi/pc/DSDT.bridge      | Bin 6924 -> 6924 bytes
> >  tests/data/acpi/pc/DSDT.cphp        | Bin 5529 -> 5529 bytes
> >  tests/data/acpi/pc/DSDT.dimmpxm     | Bin 6719 -> 6719 bytes
> >  tests/data/acpi/pc/DSDT.hpbridge    | Bin 5026 -> 5026 bytes
> >  tests/data/acpi/pc/DSDT.hpbrroot    | Bin 3084 -> 3084 bytes
> >  tests/data/acpi/pc/DSDT.ipmikcs     | Bin 5137 -> 5137 bytes
> >  tests/data/acpi/pc/DSDT.memhp       | Bin 6424 -> 6424 bytes
> >  tests/data/acpi/pc/DSDT.numamem     | Bin 5071 -> 5071 bytes
> >  tests/data/acpi/pc/DSDT.roothp      | Bin 5261 -> 5261 bytes
> >  tests/data/acpi/q35/CEDT            | Bin 0 -> 36 bytes
> >  tests/data/acpi/q35/DSDT            | Bin 7801 -> 7801 bytes
> >  tests/data/acpi/q35/DSDT.acpihmat   | Bin 9126 -> 9126 bytes
> >  tests/data/acpi/q35/DSDT.bridge     | Bin 7819 -> 7819 bytes
> >  tests/data/acpi/q35/DSDT.cphp       | Bin 8265 -> 8265 bytes
> >  tests/data/acpi/q35/DSDT.dimmpxm    | Bin 9455 -> 9455 bytes
> >  tests/data/acpi/q35/DSDT.ipmibt     | Bin 7876 -> 7876 bytes
> >  tests/data/acpi/q35/DSDT.memhp      | Bin 9160 -> 9160 bytes
> >  tests/data/acpi/q35/DSDT.mmio64     | Bin 8932 -> 8932 bytes
> >  tests/data/acpi/q35/DSDT.numamem    | Bin 7807 -> 7807 bytes
> >  tests/qtest/cxl-test.c              |  93 ++++++
> >  tests/qtest/meson.build             |   4 +
> >  64 files changed, 3004 insertions(+), 30 deletions(-)
> >  create mode 100644 hw/acpi/cxl.c
> >  create mode 100644 hw/cxl/Kconfig
> >  create mode 100644 hw/cxl/cxl-component-utils.c
> >  create mode 100644 hw/cxl/cxl-device-utils.c
> >  create mode 100644 hw/cxl/cxl-mailbox-utils.c
> >  create mode 100644 hw/cxl/meson.build
> >  create mode 100644 hw/mem/cxl_type3.c
> >  create mode 100644 hw/pci-bridge/cxl_root_port.c
> >  create mode 100644 include/hw/acpi/cxl.h
> >  create mode 100644 include/hw/cxl/cxl.h
> >  create mode 100644 include/hw/cxl/cxl_component.h
> >  create mode 100644 include/hw/cxl/cxl_device.h
> >  create mode 100644 include/hw/cxl/cxl_pci.h
> >  create mode 100644 tests/data/acpi/pc/CEDT
> >  create mode 100644 tests/data/acpi/q35/CEDT
> >  create mode 100644 tests/qtest/cxl-test.c
> > 
> > -- 
> > 2.30.0
> > 
> >   


^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [RFC PATCH v3 00/31] CXL 2.0 Support
@ 2021-02-11 18:51     ` Jonathan Cameron
  0 siblings, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2021-02-11 18:51 UTC (permalink / raw)
  To: Ben Widawsky, David Hildenbrand
  Cc: Michael S. Tsirkin, Vishal Verma, John Groves (jgroves),
	Chris Browy, qemu-devel, linux-cxl, Markus Armbruster,
	Igor Mammedov, Dan Williams, Ira Weiny,
	Philippe Mathieu-Daudé

On Wed, 3 Feb 2021 09:42:16 -0800
Ben Widawsky <ben@bwidawsk.net> wrote:

> I've started a barebones project plan:
> https://gitlab.com/bwidawsk/qemu/-/snippets/2070304

Great.

> 
> Jonathan, if you have a moment, perhaps you can send a MR summarizing CDAT/DOE
> work from you and Chris?

I need to catch up with what Chris has posted, but sure after that I'll add to your
doc if Chris doesn't get there first.  My intent is to let Chris get on with the
DOE QEMU support.  Plenty of other stuff to do as long as it covers what I need
(if not I'll hack stuff on top :) Will catch up with reviewing that in the next
few days.

I'll also add some other comments on the plan when I get a chance.
Will need to fake at least a partial switch sometime soon for example.

> 
> If folks feel priorities are drastically off, we can discuss it in the snippet
> comments.
> 
> As for wider acceptance, if I'm looking at this from the QEMU community
> perspective, better test cases are really needed. If your fingers are itching
> for some typing, might I suggest starting with that.
> 
> I've opted not to use issue tracker for this because I am hopeful this won't be
> a long living gitlab project.

All sounds good.

I've not reviewed that much on the last few patches in here, at least partly
because a bunch of them have todo comments so I'm assuming they are very much
a work in progress.

One thing I will note is this has become large and complex enough that I'd be
tempted to start separating the 'racy cutting edge' parts from bits that have
been moderately stable for a while.  Hopefully some of that stable part can get
wider review without the fun stuff and all the churn related to that.

Jonathan



> 
> On 21-02-01 16:59:17, Ben Widawsky wrote:
> > Major changes since v2 [1]:
> >  * Removed all register endian/alignment/size checking. Using core functionality
> >    instead. This is untested on big endian hosts, but Should Work(tm).
> >  * Fix component capability header generation (off by 1).
> >  * Fixed HDM programming (multiple issues).
> >  * Fixed timestamp command implementations.
> >  * Added commands: GET_FIRMWARE_UPDATE_INFO, GET_PARTITION_INFO, GET_LSA, SET_LSA
> > 
> > Things have remained fairly stable since v2. The biggest change here is
> > definitely the HDM programming which has received limited (but not 0) testing in
> > the Linux driver.
> > 
> > Jonathan Cameron has gotten this patch series working on ARM [2], and added some
> > much sought after functionality [3].
> > 
> > ---
> > 
> > I've started #cxl on OFTC IRC for discussion. Please feel free to use that
> > channel for questions or suggestions in addition to #qemu.
> > 
> > ---
> > 
> > Introduce emulation of Compute Express Link 2.0
> > (https://www.computeexpresslink.org/). Specifically, add support for Type 3
> > memory expanders with persistent memory.
> > 
> > The emulation has been critical to get the Linux enabling started [4], it would
> > be an ideal place to land regression tests for different topology handling, and
> > there may be applications for this emulation as a way for a guest to manipulate
> > its address space relative to different performance memories.
> > 
> > Three of the five CXL component types are emulated with some level of
> > functionality: host bridge, root port, and memory device. All components and
> > devices implement basic MMIO. Devices/memory devices implement the mailbox
> > interface. Basic ACPI support is also included. Upstream ports and downstream
> > ports aren't implemented (the two components needed to make up a switch).
> > 
> > CXL 2.0 is built on top of PCIe (see spec for details). As a result, much of the
> > implementation utilizes existing PCI paradigms. To implement the host bridge,
> > I've chosen to use PXB (PCI Expander Bridge). It seemed to be the most natural
> > fit even though it doesn't directly map to how hardware will work. For
> > persistent capacity of the memory device, I utilized the memory subsystem
> > (hw/mem).
> > 
> > We have 3 reasons why this work is valuable:
> > 1. Linux driver feature development benefits from emulation both due to a lack
> >    of initial hardware availability, but also, as is seen with NVDIMM/PMEM
> >    emulation, there is value in being able to share topologies with
> >    system-software developers even after hardware is available.
> > 
> > 2. The Linux kernel's unit test suite for NVDIMM/PMEM ended up injecting fake
> >    resources via custom modules (nfit_test). In retrospect a QEMU emulation of
> >    nfit_test capabilities would have made the test environment more portable,
> >    and allowed for easier community contributions of example configurations.
> > 
> > 3. This is still being fleshed out, but in short it provides a standardized
> >    mechanism for the guest to provide feedback to the host about size and
> >    placement needs of the memory. After the host gives the guest a physical
> >    window mapping to the CXL device, the emulated HDM decoders allow the guest a
> >    way to tell the host how much it wants and where. There are likely simpler
> >    ways to do this, but they'd require inventing a new interface and you'd need
> >    to have diverging driver code in the guest programming of the HDM decoder vs.
> >    the host. Since we've already done this work, why not use it?
> > 
> > There is quite a long list of work to do for full spec compliance, but I don't
> > believe that any of it precludes merging. Off the top of my head:
> > - Main host bridge support (WIP)
> > - Interleaving
> > - Better Tests
> > - Hot plug support
> > - Emulating volatile capacity
> > - CDAT emulation [3]
> > 
> > The flow of the patches in general is to define all the data structures and
> > registers associated with the various components in a top down manner. Host
> > bridge, component, ports, devices. Then, the actual implementation is done in
> > the same order.
> > 
> > The summary is:
> > 1-5: Infrastructure for component and device emulation
> > 6-9: Basic mailbox command implementations
> > 10-19: Implement CXL host bridges as PXB devices
> > 20: Implement a root port
> > 21-22: Implement a memory device
> > 23-26: ACPI bits
> > 27-29: Add some more advanced mailbox command implementations
> > 30: Start working on enabling the main host bridge
> > 31: Basic test case
> > 
> > ---
> > 
> > [1]: https://lore.kernel.org/qemu-devel/20210105165323.783725-1-ben.widawsky@intel.com/
> > [2]: https://lore.kernel.org/qemu-devel/20210201152655.31027-1-Jonathan.Cameron@huawei.com/
> > [3]: https://lore.kernel.org/qemu-devel/20210201151629.29656-1-Jonathan.Cameron@huawei.com/
> > [4]: https://lore.kernel.org/linux-cxl/20210130002438.1872527-1-ben.widawsky@intel.com/
> > 
> > ---
> > 
> > Ben Widawsky (31):
> >   hw/pci/cxl: Add a CXL component type (interface)
> >   hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5)
> >   hw/cxl/device: Introduce a CXL device (8.2.8)
> >   hw/cxl/device: Implement the CAP array (8.2.8.1-2)
> >   hw/cxl/device: Implement basic mailbox (8.2.8.4)
> >   hw/cxl/device: Add memory device utilities
> >   hw/cxl/device: Add cheap EVENTS implementation (8.2.9.1)
> >   hw/cxl/device: Timestamp implementation (8.2.9.3)
> >   hw/cxl/device: Add log commands (8.2.9.4) + CEL
> >   hw/pxb: Use a type for realizing expanders
> >   hw/pci/cxl: Create a CXL bus type
> >   hw/pxb: Allow creation of a CXL PXB (host bridge)
> >   qtest: allow DSDT acpi table changes
> >   acpi/pci: Consolidate host bridge setup
> >   tests/acpi: remove stale allowed tables
> >   hw/pci: Plumb _UID through host bridges
> >   hw/cxl/component: Implement host bridge MMIO (8.2.5, table 142)
> >   acpi/pxb/cxl: Reserve host bridge MMIO
> >   hw/pxb/cxl: Add "windows" for host bridges
> >   hw/cxl/rp: Add a root port
> >   hw/cxl/device: Add a memory device (8.2.8.5)
> >   hw/cxl/device: Implement MMIO HDM decoding (8.2.5.12)
> >   acpi/cxl: Add _OSC implementation (9.14.2)
> >   tests/acpi: allow CEDT table addition
> >   acpi/cxl: Create the CEDT (9.14.1)
> >   tests/acpi: Add new CEDT files
> >   hw/cxl/device: Add some trivial commands
> >   hw/cxl/device: Plumb real LSA sizing
> >   hw/cxl/device: Implement get/set LSA
> >   qtest/cxl: Add very basic sanity tests
> >   WIP: i386/cxl: Initialize a host bridge
> > 
> >  MAINTAINERS                         |   6 +
> >  hw/Kconfig                          |   1 +
> >  hw/acpi/Kconfig                     |   5 +
> >  hw/acpi/cxl.c                       | 173 ++++++++++
> >  hw/acpi/meson.build                 |   1 +
> >  hw/arm/virt.c                       |   1 +
> >  hw/core/machine.c                   |  26 ++
> >  hw/core/numa.c                      |   3 +
> >  hw/cxl/Kconfig                      |   3 +
> >  hw/cxl/cxl-component-utils.c        | 208 ++++++++++++
> >  hw/cxl/cxl-device-utils.c           | 264 +++++++++++++++
> >  hw/cxl/cxl-mailbox-utils.c          | 498 ++++++++++++++++++++++++++++
> >  hw/cxl/meson.build                  |   5 +
> >  hw/i386/acpi-build.c                |  87 ++++-
> >  hw/i386/microvm.c                   |   1 +
> >  hw/i386/pc.c                        |   2 +
> >  hw/mem/Kconfig                      |   5 +
> >  hw/mem/cxl_type3.c                  | 405 ++++++++++++++++++++++
> >  hw/mem/meson.build                  |   1 +
> >  hw/meson.build                      |   1 +
> >  hw/pci-bridge/Kconfig               |   5 +
> >  hw/pci-bridge/cxl_root_port.c       | 231 +++++++++++++
> >  hw/pci-bridge/meson.build           |   1 +
> >  hw/pci-bridge/pci_expander_bridge.c | 209 +++++++++++-
> >  hw/pci-bridge/pcie_root_port.c      |   6 +-
> >  hw/pci/pci.c                        |  32 +-
> >  hw/pci/pcie.c                       |  30 ++
> >  hw/ppc/spapr.c                      |   2 +
> >  include/hw/acpi/cxl.h               |  27 ++
> >  include/hw/boards.h                 |   2 +
> >  include/hw/cxl/cxl.h                |  29 ++
> >  include/hw/cxl/cxl_component.h      | 187 +++++++++++
> >  include/hw/cxl/cxl_device.h         | 255 ++++++++++++++
> >  include/hw/cxl/cxl_pci.h            | 160 +++++++++
> >  include/hw/pci/pci.h                |  15 +
> >  include/hw/pci/pci_bridge.h         |  25 ++
> >  include/hw/pci/pci_bus.h            |   8 +
> >  include/hw/pci/pci_ids.h            |   1 +
> >  monitor/hmp-cmds.c                  |  15 +
> >  qapi/machine.json                   |   1 +
> >  tests/data/acpi/pc/CEDT             | Bin 0 -> 36 bytes
> >  tests/data/acpi/pc/DSDT             | Bin 5065 -> 5065 bytes
> >  tests/data/acpi/pc/DSDT.acpihmat    | Bin 6390 -> 6390 bytes
> >  tests/data/acpi/pc/DSDT.bridge      | Bin 6924 -> 6924 bytes
> >  tests/data/acpi/pc/DSDT.cphp        | Bin 5529 -> 5529 bytes
> >  tests/data/acpi/pc/DSDT.dimmpxm     | Bin 6719 -> 6719 bytes
> >  tests/data/acpi/pc/DSDT.hpbridge    | Bin 5026 -> 5026 bytes
> >  tests/data/acpi/pc/DSDT.hpbrroot    | Bin 3084 -> 3084 bytes
> >  tests/data/acpi/pc/DSDT.ipmikcs     | Bin 5137 -> 5137 bytes
> >  tests/data/acpi/pc/DSDT.memhp       | Bin 6424 -> 6424 bytes
> >  tests/data/acpi/pc/DSDT.numamem     | Bin 5071 -> 5071 bytes
> >  tests/data/acpi/pc/DSDT.roothp      | Bin 5261 -> 5261 bytes
> >  tests/data/acpi/q35/CEDT            | Bin 0 -> 36 bytes
> >  tests/data/acpi/q35/DSDT            | Bin 7801 -> 7801 bytes
> >  tests/data/acpi/q35/DSDT.acpihmat   | Bin 9126 -> 9126 bytes
> >  tests/data/acpi/q35/DSDT.bridge     | Bin 7819 -> 7819 bytes
> >  tests/data/acpi/q35/DSDT.cphp       | Bin 8265 -> 8265 bytes
> >  tests/data/acpi/q35/DSDT.dimmpxm    | Bin 9455 -> 9455 bytes
> >  tests/data/acpi/q35/DSDT.ipmibt     | Bin 7876 -> 7876 bytes
> >  tests/data/acpi/q35/DSDT.memhp      | Bin 9160 -> 9160 bytes
> >  tests/data/acpi/q35/DSDT.mmio64     | Bin 8932 -> 8932 bytes
> >  tests/data/acpi/q35/DSDT.numamem    | Bin 7807 -> 7807 bytes
> >  tests/qtest/cxl-test.c              |  93 ++++++
> >  tests/qtest/meson.build             |   4 +
> >  64 files changed, 3004 insertions(+), 30 deletions(-)
> >  create mode 100644 hw/acpi/cxl.c
> >  create mode 100644 hw/cxl/Kconfig
> >  create mode 100644 hw/cxl/cxl-component-utils.c
> >  create mode 100644 hw/cxl/cxl-device-utils.c
> >  create mode 100644 hw/cxl/cxl-mailbox-utils.c
> >  create mode 100644 hw/cxl/meson.build
> >  create mode 100644 hw/mem/cxl_type3.c
> >  create mode 100644 hw/pci-bridge/cxl_root_port.c
> >  create mode 100644 include/hw/acpi/cxl.h
> >  create mode 100644 include/hw/cxl/cxl.h
> >  create mode 100644 include/hw/cxl/cxl_component.h
> >  create mode 100644 include/hw/cxl/cxl_device.h
> >  create mode 100644 include/hw/cxl/cxl_pci.h
> >  create mode 100644 tests/data/acpi/pc/CEDT
> >  create mode 100644 tests/data/acpi/q35/CEDT
> >  create mode 100644 tests/qtest/cxl-test.c
> > 
> > -- 
> > 2.30.0
> > 
> >   




* Re: [RFC PATCH v3 02/31] hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5)
  2021-02-11 17:08     ` Jonathan Cameron
  (?)
@ 2021-02-17 16:40     ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-17 16:40 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: David Hildenbrand, Vishal Verma, John Groves (jgroves),
	Chris Browy, qemu-devel, linux-cxl, Markus Armbruster,
	Michael S. Tsirkin, Igor Mammedov, Dan Williams, Ira Weiny,
	Philippe Mathieu-Daudé

On 21-02-11 17:08:45, Jonathan Cameron wrote:
> On Mon, 1 Feb 2021 16:59:19 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > A CXL 2.0 component is any entity in the CXL topology. All components
> > have an analogous function in PCIe. Except for the CXL host bridge, all
> > have a PCIe config space that is accessible via the common PCIe
> > mechanisms. CXL components are enumerated via DVSEC fields in the
> > extended PCIe header space. CXL components will minimally implement some
> > subset of CXL.mem and CXL.cache registers defined in 8.2.5 of the CXL
> > 2.0 specification. Two headers and a utility library are introduced to
> > support the minimum functionality needed to enumerate components.
> > 
> > The cxl_pci header manages bits associated with PCI, specifically the
> > DVSEC and related fields. The cxl_component.h variant has data
> > structures and APIs that are useful for drivers implementing any of the
> > CXL 2.0 components. The library takes care of making use of the DVSEC
> > bits and the CXL.[mem|cache] registers. Per spec, the registers are
> > little endian.
> > 
> > None of the mechanisms required to enumerate a CXL capable hostbridge
> > are introduced at this point.
> > 
> > Note that the CXL.mem and CXL.cache registers used are always 4B wide.
> > It's possible in the future that this constraint will not hold.
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> A few additions to previous comments.

Thanks for continuing to look.

> 
> > ---
> >  MAINTAINERS                    |   6 +
> >  hw/Kconfig                     |   1 +
> >  hw/cxl/Kconfig                 |   3 +
> >  hw/cxl/cxl-component-utils.c   | 208 +++++++++++++++++++++++++++++++++
> >  hw/cxl/meson.build             |   3 +
> >  hw/meson.build                 |   1 +
> >  include/hw/cxl/cxl.h           |  17 +++
> >  include/hw/cxl/cxl_component.h | 187 +++++++++++++++++++++++++++++
> >  include/hw/cxl/cxl_pci.h       | 138 ++++++++++++++++++++++
> >  9 files changed, 564 insertions(+)
> >  create mode 100644 hw/cxl/Kconfig
> >  create mode 100644 hw/cxl/cxl-component-utils.c
> >  create mode 100644 hw/cxl/meson.build
> >  create mode 100644 include/hw/cxl/cxl.h
> >  create mode 100644 include/hw/cxl/cxl_component.h
> >  create mode 100644 include/hw/cxl/cxl_pci.h
> > 
> 
> 
> > diff --git a/hw/cxl/cxl-component-utils.c b/hw/cxl/cxl-component-utils.c
> > new file mode 100644
> > index 0000000000..8d56ad5c7d
> > --- /dev/null
> > +++ b/hw/cxl/cxl-component-utils.c
> > @@ -0,0 +1,208 @@
> > +/*
> > + * CXL Utility library for components
> > + *
> > + * Copyright(C) 2020 Intel Corporation.
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2. See the
> > + * COPYING file in the top-level directory.
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "qemu/log.h"
> > +#include "hw/pci/pci.h"
> > +#include "hw/cxl/cxl.h"
> > +
> > +static uint64_t cxl_cache_mem_read_reg(void *opaque, hwaddr offset,
> > +                                       unsigned size)
> > +{
> > +    CXLComponentState *cxl_cstate = opaque;
> > +    ComponentRegisters *cregs = &cxl_cstate->crb;
> > +
> > +    assert(size == 4);
> > +
> > +    if (cregs->special_ops && cregs->special_ops->read) {
> > +        return cregs->special_ops->read(cxl_cstate, offset, size);
> > +    } else {
> > +        return cregs->cache_mem_registers[offset / 4];
> > +    }
> > +}
> > +
> > +static void cxl_cache_mem_write_reg(void *opaque, hwaddr offset, uint64_t value,
> > +                                    unsigned size)
> > +{
> > +    CXLComponentState *cxl_cstate = opaque;
> > +    ComponentRegisters *cregs = &cxl_cstate->crb;
> > +
> > +    assert(size == 4);
> > +
> > +    if (cregs->special_ops && cregs->special_ops->write) {
> > +        cregs->special_ops->write(cxl_cstate, offset, value, size);
> > +    } else {
> > +        cregs->cache_mem_registers[offset / 4] = value;
> > +    }
> > +}
> > +
> > +/*
> > + * 8.2.3
> > + *   The access restrictions specified in Section 8.2.2 also apply to CXL 2.0
> > + *   Component Registers.
> > + *
> > + * 8.2.2
> > + *   • A 32 bit register shall be accessed as a 4 Bytes quantity. Partial
> > + *   reads are not permitted.
> > + *   • A 64 bit register shall be accessed as a 8 Bytes quantity. Partial
> > + *   reads are not permitted.
> > + *
> > + * As of the spec defined today, only 4 byte registers exist.
> 
> The exciting exception to this is the RAS header log which is
> defined as 512 bits.  Will seek clarification but I think the spec should
> probably say that is a set of 32 bit registers.
> 
> A bunch of the other elements that we probably want to block in plausible
> values for also seem to use 64 bit registers.
> 

IIRC, it was only the link caps, but I can look again. (I don't ever intend to
emulate link caps). The RAS log was a mistake...

FWIW, I gave a bunch of feedback about a few register mixups in this vein, and I
think errata have been published, but that was before I was sitting in on the
consortium calls, so I'm not sure.
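
To illustrate the point about the 512-bit header log, here is a minimal
standalone sketch of treating such a field as sixteen 32-bit registers, so that
every access stays a 4-byte quantity. reg_read32() and read_header_log() are
hypothetical helpers for illustration only, not QEMU or spec-defined names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A 512-bit log viewed as sixteen 4-byte registers (illustrative sizes). */
#define HEADER_LOG_BITS   512
#define HEADER_LOG_DWORDS (HEADER_LOG_BITS / 32)

/* Stand-in for a 4-byte-only register read; not a QEMU API. */
static uint32_t reg_read32(const uint32_t *regs, size_t byte_offset)
{
    assert(byte_offset % 4 == 0);   /* partial/unaligned access not allowed */
    return regs[byte_offset / 4];
}

/* Copy the whole log out using nothing but 4-byte accesses. */
static void read_header_log(const uint32_t *regs, size_t log_offset,
                            uint32_t out[HEADER_LOG_DWORDS])
{
    for (size_t i = 0; i < HEADER_LOG_DWORDS; i++) {
        out[i] = reg_read32(regs, log_offset + i * 4);
    }
}
```

With this view, the min/max access size of 4 in cache_mem_ops would not need to
change to cover the log.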

> > + */
> > +static const MemoryRegionOps cache_mem_ops = {
> > +    .read = cxl_cache_mem_read_reg,
> > +    .write = cxl_cache_mem_write_reg,
> > +    .endianness = DEVICE_LITTLE_ENDIAN,
> > +    .valid = {
> > +        .min_access_size = 4,
> > +        .max_access_size = 4,
> > +        .unaligned = false,
> > +    },
> > +    .impl = {
> > +        .min_access_size = 4,
> > +        .max_access_size = 4,
> > +    },
> > +};
> > +
> 
> ..
> > +
> > +void cxl_component_register_init_common(uint32_t *reg_state, enum reg_type type)
> > +{
> > +    int caps = 0;
> > +    switch (type) {
> > +    case CXL2_DOWNSTREAM_PORT:
> > +    case CXL2_DEVICE:
> > +        /* CAP, RAS, Link */
> > +        caps = 2;
> > +        break;
> > +    case CXL2_UPSTREAM_PORT:
> > +    case CXL2_TYPE3_DEVICE:
> > +    case CXL2_LOGICAL_DEVICE:
> > +        /* + HDM */
> > +        caps = 3;
> > +        break;
> > +    case CXL2_ROOT_PORT:
> > +        /* + Extended Security, + Snoop */
> > +        caps = 5;
> > +        break;
> > +    default:
> > +        abort();
> > +    }
> > +
> > +    memset(reg_state, 0, 0x1000);
> > +
> > +    /* CXL Capability Header Register */
> > +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ID, 1);
> > +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, VERSION, 1);
> > +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, CACHE_MEM_VERSION, 1);
> > +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ARRAY_SIZE, caps);
> > +
> > +
> > +#define init_cap_reg(reg, id, version)                                        \
> > +    _Static_assert(CXL_##reg##_REGISTERS_OFFSET != 0, "Invalid cap offset\n");\
> > +    do {                                                                      \
> > +        int which = R_CXL_##reg##_CAPABILITY_HEADER;                          \
> > +        reg_state[which] = FIELD_DP32(reg_state[which],                       \
> > +                                      CXL_##reg##_CAPABILITY_HEADER, ID, id); \
> > +        reg_state[which] =                                                    \
> > +            FIELD_DP32(reg_state[which], CXL_##reg##_CAPABILITY_HEADER,       \
> > +                       VERSION, version);                                     \
> > +        reg_state[which] =                                                    \
> > +            FIELD_DP32(reg_state[which], CXL_##reg##_CAPABILITY_HEADER, PTR,  \
> > +                       CXL_##reg##_REGISTERS_OFFSET);                         \
> > +    } while (0)
> 
> Seems like this would be cleaner using ARRAY_FIELD_DP32 as you did for the header.
> 
>     #define init_cap_reg(reg, id, version)                                        \
>         _Static_assert(CXL_##reg##_REGISTERS_OFFSET != 0, "Invalid cap offset\n");\
>         do {                                                                    \
>             ARRAY_FIELD_DP32(reg_state, CXL_##reg##_CAPABILITY_HEADER, ID, id); \
>             ARRAY_FIELD_DP32(reg_state, CXL_##reg##_CAPABILITY_HEADER,          \
>                              VERSION, version);                                 \
>             ARRAY_FIELD_DP32(reg_state, CXL_##reg##_CAPABILITY_HEADER,          \
>                              PTR, CXL_##reg##_REGISTERS_OFFSET);                \
>         } while (0)
> I think gives the same result.
> 

I think it looks better too. I don't remember why I didn't do this.

Could I entice you to send a tested patch to change it? I'll gladly put it on
top. I'm trying to not mess with the original patches at this point and do
everything on top, until someone yells to squash it in.
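
As a sanity check on "gives the same result", here is a small standalone sketch
comparing the two styles. dep32() and array_dep32() are simplified stand-ins
written for this example, not the QEMU registerfields macros, and the PTR
position (bits 20-31) is an assumption; ID and VERSION follow the header
layout in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for a 32-bit field deposit (field length 1..31). */
static uint32_t dep32(uint32_t value, int shift, int length, uint32_t field)
{
    assert(shift >= 0 && length > 0 && length < 32 && shift + length <= 32);
    uint32_t mask = ((1u << length) - 1) << shift;
    return (value & ~mask) | ((field << shift) & mask);
}

/* ID in bits 0-15, VERSION in bits 16-19, PTR assumed in bits 20-31. */
enum { ID_SHIFT = 0, ID_LEN = 16, VER_SHIFT = 16, VER_LEN = 4,
       PTR_SHIFT = 20, PTR_LEN = 12 };

/* Style 1: read the word, deposit, assign back (as in the posted macro). */
static void init_explicit(uint32_t *regs, int which,
                          uint32_t id, uint32_t ver, uint32_t ptr)
{
    regs[which] = dep32(regs[which], ID_SHIFT, ID_LEN, id);
    regs[which] = dep32(regs[which], VER_SHIFT, VER_LEN, ver);
    regs[which] = dep32(regs[which], PTR_SHIFT, PTR_LEN, ptr);
}

/* Style 2: an update-in-place helper, analogous to an array-field deposit. */
static void array_dep32(uint32_t *regs, int which, int shift, int length,
                        uint32_t field)
{
    regs[which] = dep32(regs[which], shift, length, field);
}

static void init_in_place(uint32_t *regs, int which,
                          uint32_t id, uint32_t ver, uint32_t ptr)
{
    array_dep32(regs, which, ID_SHIFT, ID_LEN, id);
    array_dep32(regs, which, VER_SHIFT, VER_LEN, ver);
    array_dep32(regs, which, PTR_SHIFT, PTR_LEN, ptr);
}
```

Both paths perform the same three deposits on the same word, so the resulting
register contents should be identical; the second is just less to read.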

> > +
> > +    init_cap_reg(RAS, 2, 1);
> > +    ras_init_common(reg_state);
> > +
> > +    init_cap_reg(LINK, 4, 2);
> 
> Feels like we'll want to block some values for the rest of these to at least
> ensure whatever is read isn't crazy.
> 

Yep. I've pretty much left everything as a TODO in the component register block.
I only did RAS as an example on how one would add things, but then I ended up
adding HDM as a better example.

Would be good as part of the "project plan" to identify what registers are
interesting to implement.

> > +
> > +    if (caps < 3) {
> > +        return;
> > +    }
> > +
> > +    init_cap_reg(HDM, 5, 1);
> > +    hdm_init_common(reg_state);
> > +
> > +    if (caps < 5) {
> > +        return;
> > +    }
> > +
> > +    init_cap_reg(EXTSEC, 6, 1);
> > +    init_cap_reg(SNOOP, 8, 1);
> > +
> > +#undef init_cap_reg
> > +}
> > +
> > +/*
> > + * Helper to create a DVSEC header for a CXL entity. The caller is responsible
> > + * for tracking the valid offset.
> > + *
> > + * This function will build the DVSEC header on behalf of the caller and then
> > + * copy in the remaining data for the vendor specific bits.
> > + */
> > +void cxl_component_create_dvsec(CXLComponentState *cxl, uint16_t length,
> > +                                uint16_t type, uint8_t rev, uint8_t *body)
> > +{
> > +    PCIDevice *pdev = cxl->pdev;
> > +    uint16_t offset = cxl->dvsec_offset;
> > +
> > +    assert(offset >= PCI_CFG_SPACE_SIZE &&
> > +           ((offset + length) < PCI_CFG_SPACE_EXP_SIZE));
> > +    assert((length & 0xf000) == 0);
> > +    assert((rev & ~0xf) == 0);
> > +
> > +    /* Create the DVSEC in the MCFG space */
> > +    pcie_add_capability(pdev, PCI_EXT_CAP_ID_DVSEC, 1, offset, length);
> > +    pci_set_long(pdev->config + offset + PCIE_DVSEC_HEADER1_OFFSET,
> > +                 (length << 20) | (rev << 16) | CXL_VENDOR_ID);
> > +    pci_set_word(pdev->config + offset + PCIE_DVSEC_ID_OFFSET, type);
> > +    memcpy(pdev->config + offset + sizeof(struct dvsec_header),
> > +           body + sizeof(struct dvsec_header),
> > +           length - sizeof(struct dvsec_header));
> > +
> > +    /* Update state for future DVSEC additions */
> > +    range_init_nofail(&cxl->dvsecs[type], cxl->dvsec_offset, length);
> > +    cxl->dvsec_offset += length;
> > +}
> ...
> 
> 
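
For reference, a minimal standalone sketch of the DVSEC header 1 packing that
cxl_component_create_dvsec() performs above: vendor ID in bits 0-15, revision
in bits 16-19, length in bits 20-31. The function names are hypothetical, and
the vendor ID is passed in rather than assumed. The casts to uint32_t are an
addition here; they keep the shift from reaching the sign bit of a promoted
int when length is large.

```c
#include <assert.h>
#include <stdint.h>

/* Pack vendor ID, revision and length into one 32-bit DVSEC header word. */
static uint32_t dvsec_header1(uint16_t vendor, uint8_t rev, uint16_t length)
{
    assert((length & 0xf000) == 0);   /* length is a 12-bit field */
    assert((rev & ~0xf) == 0);        /* revision is a 4-bit field */
    return ((uint32_t)length << 20) | ((uint32_t)rev << 16) | vendor;
}

/* Unpack helpers, mainly useful for checking the round trip. */
static uint16_t dvsec_vendor(uint32_t hdr) { return hdr & 0xffff; }
static uint8_t  dvsec_rev(uint32_t hdr)    { return (hdr >> 16) & 0xf; }
static uint16_t dvsec_length(uint32_t hdr) { return (hdr >> 20) & 0xfff; }
```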


* Re: [RFC PATCH v3 02/31] hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5)
  2021-02-02 11:48     ` Jonathan Cameron
  (?)
@ 2021-02-17 18:36     ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-17 18:36 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: David Hildenbrand, Vishal Verma, John Groves (jgroves),
	Chris Browy, qemu-devel, linux-cxl, Markus Armbruster,
	Michael S. Tsirkin, Igor Mammedov, Dan Williams, Ira Weiny,
	Philippe Mathieu-Daudé

On 21-02-02 11:48:15, Jonathan Cameron wrote:
> On Mon, 1 Feb 2021 16:59:19 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > A CXL 2.0 component is any entity in the CXL topology. All components
> > have an analogous function in PCIe. Except for the CXL host bridge, all
> > have a PCIe config space that is accessible via the common PCIe
> > mechanisms. CXL components are enumerated via DVSEC fields in the
> > extended PCIe header space. CXL components will minimally implement some
> > subset of CXL.mem and CXL.cache registers defined in 8.2.5 of the CXL
> > 2.0 specification. Two headers and a utility library are introduced to
> > support the minimum functionality needed to enumerate components.
> > 
> > The cxl_pci header manages bits associated with PCI, specifically the
> > DVSEC and related fields. The cxl_component.h variant has data
> > structures and APIs that are useful for drivers implementing any of the
> > CXL 2.0 components. The library takes care of making use of the DVSEC
> > bits and the CXL.[mem|cache] registers. Per spec, the registers are
> > little endian.
> > 
> > None of the mechanisms required to enumerate a CXL capable hostbridge
> > are introduced at this point.
> > 
> > Note that the CXL.mem and CXL.cache registers used are always 4B wide.
> > It's possible in the future that this constraint will not hold.
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> 
> A few minor discrepancies from the spec, + naming suggestions.
> 
> Otherwise LGTM.
> 
> > ---
> >  MAINTAINERS                    |   6 +
> >  hw/Kconfig                     |   1 +
> >  hw/cxl/Kconfig                 |   3 +
> >  hw/cxl/cxl-component-utils.c   | 208 +++++++++++++++++++++++++++++++++
> >  hw/cxl/meson.build             |   3 +
> >  hw/meson.build                 |   1 +
> >  include/hw/cxl/cxl.h           |  17 +++
> >  include/hw/cxl/cxl_component.h | 187 +++++++++++++++++++++++++++++
> >  include/hw/cxl/cxl_pci.h       | 138 ++++++++++++++++++++++
> >  9 files changed, 564 insertions(+)
> >  create mode 100644 hw/cxl/Kconfig
> >  create mode 100644 hw/cxl/cxl-component-utils.c
> >  create mode 100644 hw/cxl/meson.build
> >  create mode 100644 include/hw/cxl/cxl.h
> >  create mode 100644 include/hw/cxl/cxl_component.h
> >  create mode 100644 include/hw/cxl/cxl_pci.h
> > 
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index bcd88668bc..981dc92e25 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -2234,6 +2234,12 @@ F: qapi/block*.json
> >  F: qapi/transaction.json
> >  T: git https://repo.or.cz/qemu/armbru.git block-next
> >  
> > +Compute Express Link
> > +M: Ben Widawsky <ben.widawsky@intel.com>
> > +S: Supported
> > +F: hw/cxl/
> > +F: include/hw/cxl/
> > +
> >  Dirty Bitmaps
> >  M: Eric Blake <eblake@redhat.com>
> >  M: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> > diff --git a/hw/Kconfig b/hw/Kconfig
> > index 5ad3c6b5a4..c03650c5ed 100644
> > --- a/hw/Kconfig
> > +++ b/hw/Kconfig
> > @@ -6,6 +6,7 @@ source audio/Kconfig
> >  source block/Kconfig
> >  source char/Kconfig
> >  source core/Kconfig
> > +source cxl/Kconfig
> >  source display/Kconfig
> >  source dma/Kconfig
> >  source gpio/Kconfig
> > diff --git a/hw/cxl/Kconfig b/hw/cxl/Kconfig
> > new file mode 100644
> > index 0000000000..8e67519b16
> > --- /dev/null
> > +++ b/hw/cxl/Kconfig
> > @@ -0,0 +1,3 @@
> > +config CXL
> > +    bool
> > +    default y if PCI_EXPRESS
> > diff --git a/hw/cxl/cxl-component-utils.c b/hw/cxl/cxl-component-utils.c
> > new file mode 100644
> > index 0000000000..8d56ad5c7d
> > --- /dev/null
> > +++ b/hw/cxl/cxl-component-utils.c
> > @@ -0,0 +1,208 @@
> > +/*
> > + * CXL Utility library for components
> > + *
> > + * Copyright(C) 2020 Intel Corporation.
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2. See the
> > + * COPYING file in the top-level directory.
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "qemu/log.h"
> > +#include "hw/pci/pci.h"
> > +#include "hw/cxl/cxl.h"
> > +
> > +static uint64_t cxl_cache_mem_read_reg(void *opaque, hwaddr offset,
> > +                                       unsigned size)
> > +{
> > +    CXLComponentState *cxl_cstate = opaque;
> > +    ComponentRegisters *cregs = &cxl_cstate->crb;
> > +
> > +    assert(size == 4);
> > +
> > +    if (cregs->special_ops && cregs->special_ops->read) {
> > +        return cregs->special_ops->read(cxl_cstate, offset, size);
> > +    } else {
> > +        return cregs->cache_mem_registers[offset / 4];
> > +    }
> > +}
> > +
> > +static void cxl_cache_mem_write_reg(void *opaque, hwaddr offset, uint64_t value,
> > +                                    unsigned size)
> > +{
> > +    CXLComponentState *cxl_cstate = opaque;
> > +    ComponentRegisters *cregs = &cxl_cstate->crb;
> > +
> > +    assert(size == 4);
> > +
> > +    if (cregs->special_ops && cregs->special_ops->write) {
> > +        cregs->special_ops->write(cxl_cstate, offset, value, size);
> > +    } else {
> > +        cregs->cache_mem_registers[offset / 4] = value;
> > +    }
> > +}
> > +
> > +/*
> > + * 8.2.3
> > + *   The access restrictions specified in Section 8.2.2 also apply to CXL 2.0
> > + *   Component Registers.
> > + *
> > + * 8.2.2
> > + *   • A 32 bit register shall be accessed as a 4 Bytes quantity. Partial
> > + *   reads are not permitted.
> > + *   • A 64 bit register shall be accessed as a 8 Bytes quantity. Partial
> > + *   reads are not permitted.
> > + *
> > + * As of the spec defined today, only 4 byte registers exist.
> > + */
> > +static const MemoryRegionOps cache_mem_ops = {
> > +    .read = cxl_cache_mem_read_reg,
> > +    .write = cxl_cache_mem_write_reg,
> > +    .endianness = DEVICE_LITTLE_ENDIAN,
> > +    .valid = {
> > +        .min_access_size = 4,
> > +        .max_access_size = 4,
> > +        .unaligned = false,
> > +    },
> > +    .impl = {
> > +        .min_access_size = 4,
> > +        .max_access_size = 4,
> > +    },
> > +};
> > +
> > +void cxl_component_register_block_init(Object *obj,
> > +                                       CXLComponentState *cxl_cstate,
> > +                                       const char *type)
> > +{
> > +    ComponentRegisters *cregs = &cxl_cstate->crb;
> > +
> > +    memory_region_init(&cregs->component_registers, obj, type,
> > +                       CXL2_COMPONENT_BLOCK_SIZE);
> > +
> > +    /* io registers control the link, which we don't care about in QEMU */
> > +    memory_region_init_io(&cregs->io, obj, NULL, cregs, ".io",
> > +                          CXL2_COMPONENT_IO_REGION_SIZE);
> > +    memory_region_init_io(&cregs->cache_mem, obj, &cache_mem_ops, cregs,
> > +                          ".cache_mem", CXL2_COMPONENT_CM_REGION_SIZE);
> > +
> > +    memory_region_add_subregion(&cregs->component_registers, 0, &cregs->io);
> > +    memory_region_add_subregion(&cregs->component_registers,
> > +                                CXL2_COMPONENT_IO_REGION_SIZE,
> > +                                &cregs->cache_mem);
> > +}
> > +
> > +static void ras_init_common(uint32_t *reg_state)
> > +{
> > +    reg_state[R_CXL_RAS_UNC_ERR_STATUS] = 0;
> > +    reg_state[R_CXL_RAS_UNC_ERR_MASK] = 0x1efff;  
> This should be everything up to bit 11 then bits 14-16 I believe.
> 0x1cfff 
> 
> > +    reg_state[R_CXL_RAS_UNC_ERR_SEVERITY] = 0x1efff;
> 0x1cfff as well
> 

Thanks. This changed since the spec version I originally wrote to...
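
A quick way to double check the suggested value is to build the mask from the
bit ranges named in the review (bits 0-11 plus bits 14-16). This is a
standalone sketch; bit_range() and ras_unc_err_mask() are hypothetical helper
names, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with bits lo..hi (inclusive) set; requires lo <= hi <= 31. */
static uint32_t bit_range(unsigned lo, unsigned hi)
{
    assert(lo <= hi && hi <= 31);
    unsigned width = hi - lo + 1;
    uint32_t ones = (width == 32) ? UINT32_MAX : ((1u << width) - 1);
    return ones << lo;
}

/* Bits 0-11 and bits 14-16, per the review comment. */
static uint32_t ras_unc_err_mask(void)
{
    return bit_range(0, 11) | bit_range(14, 16);
}
```

The result is 0x1cfff; the posted 0x1efff differs from it only in bit 13.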

> > +    reg_state[R_CXL_RAS_COR_ERR_STATUS] = 0;
> > +    reg_state[R_CXL_RAS_COR_ERR_MASK] = 0x3f;
> > +    reg_state[R_CXL_RAS_ERR_CAP_CTRL] = 0; /* CXL switches and devices must set */
> > +}
> > +
> > +static void hdm_init_common(uint32_t *reg_state)
> > +{
> > +    ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY, DECODER_COUNT, 0);
> > +    ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_GLOBAL_CONTROL, HDM_DECODER_ENABLE, 0);
> > +}
> > +
> > +void cxl_component_register_init_common(uint32_t *reg_state, enum reg_type type)
> > +{
> > +    int caps = 0;
> > +    switch (type) {
> > +    case CXL2_DOWNSTREAM_PORT:
> > +    case CXL2_DEVICE:
> > +        /* CAP, RAS, Link */
> > +        caps = 2;
> > +        break;
> > +    case CXL2_UPSTREAM_PORT:
> > +    case CXL2_TYPE3_DEVICE:
> > +    case CXL2_LOGICAL_DEVICE:
> > +        /* + HDM */
> > +        caps = 3;
> > +        break;
> > +    case CXL2_ROOT_PORT:
> > +        /* + Extended Security, + Snoop */
> > +        caps = 5;
> > +        break;
> > +    default:
> > +        abort();
> > +    }
> > +
> > +    memset(reg_state, 0, 0x1000);
> 
> Better to pass the size in so it's apparent where that came from?

I think a #define is a better fit for now. As you allude to below, I suspect this
gets messy with future versions of the spec.
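
A minimal sketch of the #define variant, assuming the literal 0x1000 is meant
to be the cache/mem region size that cxl_component.h already names
CXL2_COMPONENT_CM_REGION_SIZE. The clear_cache_mem_regs() wrapper is
hypothetical and only exists to make the example standalone.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same value as in cxl_component.h; repeated so the sketch is standalone. */
#define CXL2_COMPONENT_CM_REGION_SIZE 0x1000

/* Zero the whole cache/mem register block using the named size. */
static void clear_cache_mem_regs(uint32_t *reg_state)
{
    assert(reg_state != NULL);
    memset(reg_state, 0, CXL2_COMPONENT_CM_REGION_SIZE);
}
```
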

> 
> > +
> > +    /* CXL Capability Header Register */
> > +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ID, 1);
> > +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, VERSION, 1);
> > +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, CACHE_MEM_VERSION, 1);
> > +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ARRAY_SIZE, caps);
> > +
> > +
> > +#define init_cap_reg(reg, id, version)                                        \
> > +    _Static_assert(CXL_##reg##_REGISTERS_OFFSET != 0, "Invalid cap offset\n");\
> > +    do {                                                                      \
> > +        int which = R_CXL_##reg##_CAPABILITY_HEADER;                          \
> > +        reg_state[which] = FIELD_DP32(reg_state[which],                       \
> > +                                      CXL_##reg##_CAPABILITY_HEADER, ID, id); \
> > +        reg_state[which] =                                                    \
> > +            FIELD_DP32(reg_state[which], CXL_##reg##_CAPABILITY_HEADER,       \
> > +                       VERSION, version);                                     \
> > +        reg_state[which] =                                                    \
> > +            FIELD_DP32(reg_state[which], CXL_##reg##_CAPABILITY_HEADER, PTR,  \
> > +                       CXL_##reg##_REGISTERS_OFFSET);                         \
> > +    } while (0)
> > +
> > +    init_cap_reg(RAS, 2, 1);
> > +    ras_init_common(reg_state);
> > +
> > +    init_cap_reg(LINK, 4, 2);
> > +
> > +    if (caps < 3) {
> 
> It strikes me that this approach of basing it purely on number of caps is
> not particularly flexible or maintainable but I guess it will do until we need
> something more sophisticated. (i.e. when they aren't a series of expanding inclusive
> sets of entries)
> 

SUPER inflexible. I am not yet convinced we need flexible here and I liked
simple. It's totally fine to change this when it's needed.
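
If it ever is needed, one possible shape (purely a sketch, not something in
this series) is a per-type bitmask, which would no longer require the
capability sets to nest. All names below are hypothetical; the counts still
come out as the 2, 3 and 5 used in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical capability flags; names are illustrative only. */
enum cap_flag {
    CAP_RAS    = 1u << 0,
    CAP_LINK   = 1u << 1,
    CAP_HDM    = 1u << 2,
    CAP_EXTSEC = 1u << 3,
    CAP_SNOOP  = 1u << 4,
};

/* Reduced, hypothetical set of component types for the example. */
enum comp_type { COMP_DEVICE, COMP_TYPE3, COMP_ROOT_PORT };

/* Each type lists exactly the capabilities it implements. */
static uint32_t caps_for(enum comp_type t)
{
    switch (t) {
    case COMP_DEVICE:
        return CAP_RAS | CAP_LINK;
    case COMP_TYPE3:
        return CAP_RAS | CAP_LINK | CAP_HDM;
    case COMP_ROOT_PORT:
        return CAP_RAS | CAP_LINK | CAP_HDM | CAP_EXTSEC | CAP_SNOOP;
    }
    assert(0 && "unknown component type");
    return 0;
}

/* Number of set bits, i.e. the value for the ARRAY_SIZE field. */
static int cap_count(uint32_t caps)
{
    int n = 0;
    while (caps) {
        caps &= caps - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}
```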

> > +        return;
> > +    }
> > +
> > +    init_cap_reg(HDM, 5, 1);
> > +    hdm_init_common(reg_state);
> > +
> > +    if (caps < 5) {
> > +        return;
> > +    }
> > +
> > +    init_cap_reg(EXTSEC, 6, 1);
> > +    init_cap_reg(SNOOP, 8, 1);
> > +
> > +#undef init_cap_reg
> > +}
> > +
> > +/*
> > + * Helper to create a DVSEC header for a CXL entity. The caller is responsible
> > + * for tracking the valid offset.
> > + *
> > + * This function will build the DVSEC header on behalf of the caller and then
> > + * copy in the remaining data for the vendor specific bits.
> > + */
> > +void cxl_component_create_dvsec(CXLComponentState *cxl, uint16_t length,
> > +                                uint16_t type, uint8_t rev, uint8_t *body)
> > +{
> > +    PCIDevice *pdev = cxl->pdev;
> > +    uint16_t offset = cxl->dvsec_offset;
> > +
> > +    assert(offset >= PCI_CFG_SPACE_SIZE &&
> > +           ((offset + length) < PCI_CFG_SPACE_EXP_SIZE));
> > +    assert((length & 0xf000) == 0);
> > +    assert((rev & ~0xf) == 0);
> > +
> > +    /* Create the DVSEC in the MCFG space */
> > +    pcie_add_capability(pdev, PCI_EXT_CAP_ID_DVSEC, 1, offset, length);
> > +    pci_set_long(pdev->config + offset + PCIE_DVSEC_HEADER1_OFFSET,
> > +                 (length << 20) | (rev << 16) | CXL_VENDOR_ID);
> > +    pci_set_word(pdev->config + offset + PCIE_DVSEC_ID_OFFSET, type);
> > +    memcpy(pdev->config + offset + sizeof(struct dvsec_header),
> > +           body + sizeof(struct dvsec_header),
> > +           length - sizeof(struct dvsec_header));
> > +
> > +    /* Update state for future DVSEC additions */
> > +    range_init_nofail(&cxl->dvsecs[type], cxl->dvsec_offset, length);
> > +    cxl->dvsec_offset += length;
> > +}
> > diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build
> > new file mode 100644
> > index 0000000000..00c3876a0f
> > --- /dev/null
> > +++ b/hw/cxl/meson.build
> > @@ -0,0 +1,3 @@
> > +softmmu_ss.add(when: 'CONFIG_CXL', if_true: files(
> > +  'cxl-component-utils.c',
> > +))
> > diff --git a/hw/meson.build b/hw/meson.build
> > index 010de7219c..3e440c341a 100644
> > --- a/hw/meson.build
> > +++ b/hw/meson.build
> > @@ -6,6 +6,7 @@ subdir('block')
> >  subdir('char')
> >  subdir('core')
> >  subdir('cpu')
> > +subdir('cxl')
> >  subdir('display')
> >  subdir('dma')
> >  subdir('gpio')
> > diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
> > new file mode 100644
> > index 0000000000..55f6cc30a5
> > --- /dev/null
> > +++ b/include/hw/cxl/cxl.h
> > @@ -0,0 +1,17 @@
> > +/*
> > + * QEMU CXL Support
> > + *
> > + * Copyright (c) 2020 Intel
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2. See the
> > + * COPYING file in the top-level directory.
> > + */
> > +
> > +#ifndef CXL_H
> > +#define CXL_H
> > +
> > +#include "cxl_pci.h"
> > +#include "cxl_component.h"
> > +
> > +#endif
> > +
> > diff --git a/include/hw/cxl/cxl_component.h b/include/hw/cxl/cxl_component.h
> > new file mode 100644
> > index 0000000000..762feb54da
> > --- /dev/null
> > +++ b/include/hw/cxl/cxl_component.h
> > @@ -0,0 +1,187 @@
> > +/*
> > + * QEMU CXL Component
> > + *
> > + * Copyright (c) 2020 Intel
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2. See the
> > + * COPYING file in the top-level directory.
> > + */
> > +
> > +#ifndef CXL_COMPONENT_H
> > +#define CXL_COMPONENT_H
> > +
> > +/* CXL 2.0 - 8.2.4 */
> > +#define CXL2_COMPONENT_IO_REGION_SIZE 0x1000
> > +#define CXL2_COMPONENT_CM_REGION_SIZE 0x1000
> > +#define CXL2_COMPONENT_BLOCK_SIZE 0x10000
> > +
> > +#include "qemu/range.h"
> > +#include "qemu/typedefs.h"
> > +#include "hw/register.h"
> > +
> > +enum reg_type {
> > +    CXL2_DEVICE,
> > +    CXL2_TYPE3_DEVICE,
> > +    CXL2_LOGICAL_DEVICE,
> > +    CXL2_ROOT_PORT,
> > +    CXL2_UPSTREAM_PORT,
> > +    CXL2_DOWNSTREAM_PORT
> > +};
> > +
> > +/*
> > + * Capability registers are defined at the top of the CXL.cache/mem region and
> > + * are packed. For our purposes we will always define the caps in the same
> > + * order.
> > + * CXL 2.0 - 8.2.5 Table 142 for details.
> > + */
> > +
> > +/* CXL 2.0 - 8.2.5.1 */
> > +REG32(CXL_CAPABILITY_HEADER, 0)
> > +    FIELD(CXL_CAPABILITY_HEADER, ID, 0, 16)
> > +    FIELD(CXL_CAPABILITY_HEADER, VERSION, 16, 4)
> > +    FIELD(CXL_CAPABILITY_HEADER, CACHE_MEM_VERSION, 20, 4)
> > +    FIELD(CXL_CAPABILITY_HEADER, ARRAY_SIZE, 24, 8)
> > +
> > +#define CXLx_CAPABILITY_HEADER(type, offset)                  \
> > +    REG32(CXL_##type##_CAPABILITY_HEADER, offset)             \
> > +        FIELD(CXL_##type##_CAPABILITY_HEADER, ID, 0, 16)      \
> > +        FIELD(CXL_##type##_CAPABILITY_HEADER, VERSION, 16, 4) \
> > +        FIELD(CXL_##type##_CAPABILITY_HEADER, PTR, 20, 12)
> > +CXLx_CAPABILITY_HEADER(RAS, 0x4)
> > +CXLx_CAPABILITY_HEADER(LINK, 0x8)
> > +CXLx_CAPABILITY_HEADER(HDM, 0xc)
> > +CXLx_CAPABILITY_HEADER(EXTSEC, 0x10)
> > +CXLx_CAPABILITY_HEADER(SNOOP, 0x14)
> > +
> > +/*
> > + * Capability structures contain the actual registers that the CXL component
> > + * implements. Some of these are specific to certain types of components, but
> > + * this implementation leaves enough space regardless.
> > + */
> > +/* 8.2.5.9 - CXL RAS Capability Structure */
> > +#define CXL_RAS_REGISTERS_OFFSET 0x80 /* Give ample space for caps before this */
> > +#define CXL_RAS_REGISTERS_SIZE   0x58
> > +REG32(CXL_RAS_UNC_ERR_STATUS, CXL_RAS_REGISTERS_OFFSET)
> > +REG32(CXL_RAS_UNC_ERR_MASK, CXL_RAS_REGISTERS_OFFSET + 0x4)
> > +REG32(CXL_RAS_UNC_ERR_SEVERITY, CXL_RAS_REGISTERS_OFFSET + 0x8)
> > +REG32(CXL_RAS_COR_ERR_STATUS, CXL_RAS_REGISTERS_OFFSET + 0xc)
> > +REG32(CXL_RAS_COR_ERR_MASK, CXL_RAS_REGISTERS_OFFSET + 0x10)
> > +REG32(CXL_RAS_ERR_CAP_CTRL, CXL_RAS_REGISTERS_OFFSET + 0x14)
> > +/* Offset 0x18 - 0x58 reserved for RAS logs */
> > +
> > +/* 8.2.5.10 - CXL Security Capability Structure */
> > +#define CXL_SEC_REGISTERS_OFFSET (CXL_RAS_REGISTERS_OFFSET + CXL_RAS_REGISTERS_SIZE)
> > +#define CXL_SEC_REGISTERS_SIZE   0 /* We don't implement 1.1 downstream ports */
> > +
> > +/* 8.2.5.11 - CXL Link Capability Structure */
> > +#define CXL_LINK_REGISTERS_OFFSET (CXL_SEC_REGISTERS_OFFSET + CXL_SEC_REGISTERS_SIZE)
> > +#define CXL_LINK_REGISTERS_SIZE   0x38
> > +
> > +/* 8.2.5.12 - CXL HDM Decoder Capability Structure */
> > +#define HDM_DECODE_MAX 10 /* 8.2.5.12.1 */
> > +#define CXL_HDM_REGISTERS_OFFSET \
> > +    (CXL_LINK_REGISTERS_OFFSET + CXL_LINK_REGISTERS_SIZE) /* 8.2.5.12 */
> 
> The positioning of this isn't really defined by 8.2.5.12 that I can see. So I'd drop
> that trailing comment.
> 
> > +#define CXL_HDM_REGISTERS_SIZE (0x20 + HDM_DECODE_MAX * 10)
> 
> This doesn't look quite right.  0x10 + HDM_DECODE_MAX * 0x20 I think.
> Offset to decoder 0 + number of HDM decoders * size of each decode description.

Not sure how I botched that. It's literally in the spec as:
20h *n+ 10h

Thanks.
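
As a quick sanity check of that formula, here is a standalone sketch. It is not
part of the patch: HDM_DECODER0_OFFSET, HDM_DECODER_STRIDE and both helper
functions are names made up for illustration, and only HDM_DECODE_MAX comes
from the header under review.

```c
#include <stdint.h>

/* Hypothetical names for illustration; only HDM_DECODE_MAX is from the patch. */
#define HDM_DECODE_MAX      10
#define HDM_DECODER0_OFFSET 0x10 /* decoder 0 starts here, per the review */
#define HDM_DECODER_STRIDE  0x20 /* bytes per decoder, per "20h * n + 10h" */

/* Size of the HDM decoder capability structure for n decoders. */
static inline uint32_t hdm_registers_size(uint32_t n)
{
    return HDM_DECODER_STRIDE * n + HDM_DECODER0_OFFSET;
}

/* What the v3 expression (0x20 + HDM_DECODE_MAX * 10) evaluates to. */
static inline uint32_t hdm_registers_size_v3(void)
{
    return 0x20 + HDM_DECODE_MAX * 10;
}
```

With HDM_DECODE_MAX = 10 the spec formula gives 0x150 bytes, while the v3
expression gives 0x84, so the region was undersized.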

> 
> 
> > +#define HDM_DECODER_INIT(n)                                                    \
> > +  REG32(CXL_HDM_DECODER##n##_BASE_LO,                                          \
> > +        CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x10)                          \
> this might be easier to read if you define something like
> CXL_HDM_REGS_DECODER0_OFFSET  CXL_HDM_REGISTERS_OFFSET + 0x10
> then use that for the base.
> 
> > +            FIELD(CXL_HDM_DECODER##n##_BASE_LO, L, 28, 4)                      \
> > +  REG32(CXL_HDM_DECODER##n##_BASE_HI,                                          \
> > +        CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x14)                          \
> > +  REG32(CXL_HDM_DECODER##n##_SIZE_LO,                                          \
> > +        CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x18)                          \
> > +  REG32(CXL_HDM_DECODER##n##_SIZE_HI,                                          \
> > +        CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x1C)                          \
> > +  REG32(CXL_HDM_DECODER##n##_CTRL,                                             \
> > +        CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x20)                          \
> > +            FIELD(CXL_HDM_DECODER##n##_CTRL, IG, 0, 4)                         \
> > +            FIELD(CXL_HDM_DECODER##n##_CTRL, IW, 4, 4)                         \
> > +            FIELD(CXL_HDM_DECODER##n##_CTRL, LOCK_ON_COMMIT, 8, 1)             \
> > +            FIELD(CXL_HDM_DECODER##n##_CTRL, COMMIT, 9, 1)                     \
> > +            FIELD(CXL_HDM_DECODER##n##_CTRL, COMMITTED, 10, 1)                 \
> > +            FIELD(CXL_HDM_DECODER##n##_CTRL, ERROR, 11, 1)                     \
> > +            FIELD(CXL_HDM_DECODER##n##_CTRL, TYPE, 12, 1)                      \
> > +  REG32(CXL_HDM_DECODER##n##_TARGET_LIST_LO, 0x24)                             \
> I think the offset should be per 'n'.
> 
> > +  REG32(CXL_HDM_DECODER##n##_TARGET_LIST_HI, 0x28)
> 
> Hmm. There is a hole here in the spec.  Probably needs a reserved 4 bytes
> at CXL_HDM_REGISTERS_OFFSET + (0x20 * n) + 0x2c given next entry is at 0x30
> 
> Should be added to 8.2.5.12 table
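
To make the per-'n' point concrete, here is a standalone sketch of the decoder
layout as discussed in this review. The enum and helper names are hypothetical,
and offsets are relative to the start of the HDM capability structure (that is,
before CXL_HDM_REGISTERS_OFFSET is added).

```c
#include <stdint.h>

/* Hypothetical names; per-decoder offsets follow the review comments above. */
enum hdm_decoder_reg {
    HDM_BASE_LO        = 0x00,
    HDM_BASE_HI        = 0x04,
    HDM_SIZE_LO        = 0x08,
    HDM_SIZE_HI        = 0x0c,
    HDM_CTRL           = 0x10,
    HDM_TARGET_LIST_LO = 0x14,
    HDM_TARGET_LIST_HI = 0x18,
    HDM_RSVD           = 0x1c, /* the 4 byte hole noted in the review */
};

/* Offset of register 'reg' of decoder 'n': 0x10 + 0x20 * n + reg. */
static inline uint32_t hdm_decoder_reg_offset(uint32_t n,
                                              enum hdm_decoder_reg reg)
{
    return 0x10 + 0x20 * n + (uint32_t)reg;
}
```

For decoder 0 this yields 0x24 and 0x28 for the target list, matching the
constants in the patch, while decoder 1 lands at 0x44 and 0x48. Hard-coding
0x24 and 0x28 in the macro is therefore only right for n = 0.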
> 
> > +
> > +REG32(CXL_HDM_DECODER_CAPABILITY, CXL_HDM_REGISTERS_OFFSET)
> > +    FIELD(CXL_HDM_DECODER_CAPABILITY, DECODER_COUNT, 0, 4)
> > +    FIELD(CXL_HDM_DECODER_CAPABILITY, TARGET_COUNT, 4, 4)
> > +    FIELD(CXL_HDM_DECODER_CAPABILITY, INTERLEAVE_256B, 8, 1)
> > +    FIELD(CXL_HDM_DECODER_CAPABILITY, INTELEAVE_4K, 9, 1)
> > +    FIELD(CXL_HDM_DECODER_CAPABILITY, POISON_ON_ERR_CAP, 10, 1)
> > +REG32(CXL_HDM_DECODER_GLOBAL_CONTROL, CXL_HDM_REGISTERS_OFFSET + 4) 
> 
> I'd be consistent on using hex for all offsets (even when it clearly makes no
> difference like here!)
> 

I'm going to punt on this for now to avoid thrash. But I'm fine to do this or
take a patch from someone else who does it.

> > +    FIELD(CXL_HDM_DECODER_GLOBAL_CONTROL, POISON_ON_ERR_EN, 0, 1)
> > +    FIELD(CXL_HDM_DECODER_GLOBAL_CONTROL, HDM_DECODER_ENABLE, 1, 1)
> > +
> > +HDM_DECODER_INIT(0);
> > +
> > +/* 8.2.5.13 - CXL Extended Security Capability Structure (Root complex only) */
> > +#define EXTSEC_ENTRY_MAX        256
> > +#define CXL_EXTSEC_REGISTERS_OFFSET (CXL_HDM_REGISTERS_OFFSET + CXL_HDM_REGISTERS_SIZE)
> > +#define CXL_EXTSEC_REGISTERS_SIZE   (8 * EXTSEC_ENTRY_MAX + 4)
> > +
> > +/* 8.2.5.14 - CXL IDE Capability Structure */
> > +#define CXL_IDE_REGISTERS_OFFSET (CXL_EXTSEC_REGISTERS_OFFSET + CXL_EXTSEC_REGISTERS_SIZE)
> > +#define CXL_IDE_REGISTERS_SIZE   0
> 
> 0x20 (given we seem to be sizing other things we aren't using yet)
> 
> > +
> > +/* 8.2.5.15 - CXL Snoop Filter Capability Structure */
> > +#define CXL_SNOOP_REGISTERS_OFFSET (CXL_IDE_REGISTERS_OFFSET + CXL_IDE_REGISTERS_SIZE)
> > +#define CXL_SNOOP_REGISTERS_SIZE   0x8
> > +
> > +_Static_assert((CXL_SNOOP_REGISTERS_OFFSET + CXL_SNOOP_REGISTERS_SIZE) < 0x1000,
> > +               "No space for registers");
> > +
> > +typedef struct component_registers {
> > +    /*
> > +     * Main memory region to be registered with QEMU core.
> > +     */
> > +    MemoryRegion component_registers;
> > +
> > +    /*
> > +     * 8.2.4 Table 141:
> > +     *   0x0000 - 0x0fff CXL.io registers
> > +     *   0x1000 - 0x1fff CXL.cache and CXL.mem
> > +     *   0x2000 - 0xdfff Implementation specific
> > +     *   0xe000 - 0xe3ff CXL ARB/MUX registers
> > +     *   0xe400 - 0xffff RSVD
> > +     */
> > +    uint32_t io_registers[CXL2_COMPONENT_IO_REGION_SIZE >> 2];
> > +    MemoryRegion io;
> > +
> > +    uint32_t cache_mem_registers[CXL2_COMPONENT_CM_REGION_SIZE >> 2];
> > +    MemoryRegion cache_mem;
> > +
> > +    MemoryRegion impl_specific;
> > +    MemoryRegion arb_mux;
> > +    MemoryRegion rsvd;
> > +
> > +    /* special_ops is used for any component that needs any specific handling */
> > +    MemoryRegionOps *special_ops;
> > +} ComponentRegisters;
> > +
> > +/*
> > + * A CXL component represents all entities in a CXL hierarchy. This includes,
> > + * host bridges, root ports, upstream/downstream switch ports, and devices
> > + */
> > +typedef struct cxl_component {
> > +    ComponentRegisters crb;
> > +    union {
> > +        struct {
> > +            Range dvsecs[CXL20_MAX_DVSEC];
> > +            uint16_t dvsec_offset;
> > +            struct PCIDevice *pdev;
> > +        };
> > +    };
> > +} CXLComponentState;
> > +
> > +void cxl_component_register_block_init(Object *obj,
> > +                                       CXLComponentState *cxl_cstate,
> > +                                       const char *type);
> > +void cxl_component_register_init_common(uint32_t *reg_state,
> > +                                        enum reg_type type);
> > +
> > +void cxl_component_create_dvsec(CXLComponentState *cxl_cstate, uint16_t length,
> > +                                uint16_t type, uint8_t rev, uint8_t *body);
> > +
> > +#endif
> > diff --git a/include/hw/cxl/cxl_pci.h b/include/hw/cxl/cxl_pci.h
> > new file mode 100644
> > index 0000000000..a53c2e5ae7
> > --- /dev/null
> > +++ b/include/hw/cxl/cxl_pci.h
> > @@ -0,0 +1,138 @@
> > +/*
> > + * QEMU CXL PCI interfaces
> > + *
> > + * Copyright (c) 2020 Intel
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2. See the
> > + * COPYING file in the top-level directory.
> > + */
> > +
> > +#ifndef CXL_PCI_H
> > +#define CXL_PCI_H
> > +
> > +#include "hw/pci/pci.h"
> > +#include "hw/pci/pcie.h"
> > +
> > +#define CXL_VENDOR_ID 0x1e98
> > +
> > +#define PCIE_DVSEC_HEADER1_OFFSET 0x4 /* Offset from start of extend cap */
> > +#define PCIE_DVSEC_ID_OFFSET 0x8
> > +
> > +#define PCIE_CXL_DEVICE_DVSEC_LENGTH 0x38
> > +#define PCIE_CXL1_DEVICE_DVSEC_REVID 0
> > +#define PCIE_CXL2_DEVICE_DVSEC_REVID 1
> > +
> > +#define EXTENSIONS_PORT_DVSEC_LENGTH 0x28
> > +#define EXTENSIONS_PORT_DVSEC_REVID 0
> > +
> > +#define GPF_PORT_DVSEC_LENGTH 0x10
> > +#define GPF_PORT_DVSEC_REVID  0
> > +
> > +#define PCIE_FLEXBUS_PORT_DVSEC_LENGTH_2_0 0x14
> > +#define PCIE_FLEXBUS_PORT_DVSEC_REVID_2_0  1
> > +
> > +#define REG_LOC_DVSEC_LENGTH 0x24
> > +#define REG_LOC_DVSEC_REVID  0
> > +
> > +enum {
> > +    PCIE_CXL_DEVICE_DVSEC      = 0,
> > +    NON_CXL_FUNCTION_MAP_DVSEC = 2,
> > +    EXTENSIONS_PORT_DVSEC      = 3,
> > +    GPF_PORT_DVSEC             = 4,
> > +    GPF_DEVICE_DVSEC           = 5,
> > +    PCIE_FLEXBUS_PORT_DVSEC    = 7,
> > +    REG_LOC_DVSEC              = 8,
> > +    MLD_DVSEC                  = 9,
> > +    CXL20_MAX_DVSEC
> > +};
> > +
> > +struct dvsec_header {
> > +    uint32_t cap_hdr;
> > +    uint32_t dv_hdr1;
> > +    uint16_t dv_hdr2;
> > +} __attribute__((__packed__));
> > +_Static_assert(sizeof(struct dvsec_header) == 10,
> 
> Be consistent on decimal or hex for these checks. I don't really
> care which but odd to mix and match.
> 

...

> > +               "dvsec header size incorrect");
> > +
> > +/*
> > + * CXL 2.0 devices must implement certain DVSEC IDs, and can [optionally]
> > + * implement others.
> > + *
> > + * CXL 2.0 Device: 0, [2], 5, 8
> > + * CXL 2.0 RP: 3, 4, 7, 8
> > + * CXL 2.0 Upstream Port: [2], 7, 8
> > + * CXL 2.0 Downstream Port: 3, 4, 7, 8
> > + */
> > +
> > +/* CXL 2.0 - 8.1.5 (ID 0003) */
> > +struct extensions_dvsec_port {
> Probably want consistent naming (as can be done anyway) so
> cxl_dvsec_port_extensions maybe?
> 
> > +    struct dvsec_header hdr;
> > +    uint16_t status;
> > +    uint16_t control;
> > +    uint8_t alt_bus_base;
> > +    uint8_t alt_bus_limit;
> > +    uint16_t alt_memory_base;
> > +    uint16_t alt_memory_limit;
> > +    uint16_t alt_prefetch_base;
> > +    uint16_t alt_prefetch_limit;
> > +    uint32_t alt_prefetch_base_high;
> > +    uint32_t alt_prefetch_base_low;
> > +    uint32_t rcrb_base;
> > +    uint32_t rcrb_base_high;
> > +};
> > +_Static_assert(sizeof(struct extensions_dvsec_port) == 0x28,
> > +               "extensions dvsec port size incorrect");
> > +#define PORT_CONTROL_OVERRIDE_OFFSET 0xc
> 
> What's this one?  Looks to just be the PORT_CONTROL_OFFSET
> though admittedly the spec does refer to this as OVERRIDE_OFFSET in
> one place.
> 

This has changed since earlier versions of the spec. I think 9.12.3 needs an
update...

> > +#define PORT_CONTROL_UNMASK_SBR      1
> > +#define PORT_CONTROL_ALT_MEMID_EN    4
> > +
> > +/* CXL 2.0 - 8.1.6 GPF DVSEC (ID 0004) */
> > +struct dvsec_port_gpf {
> > +    struct dvsec_header hdr;
> > +    uint16_t rsvd;
> > +    uint16_t phase1_ctrl;
> > +    uint16_t phase2_ctrl;
> > +};
> > +_Static_assert(sizeof(struct dvsec_port_gpf) == 0x10,
> > +               "dvsec port GPF size incorrect");
> > +
> > +/* CXL 2.0 - 8.1.8/8.2.1.3 Flexbus DVSEC (ID 0007) */
> > +struct dvsec_port_flexbus {
> > +    struct dvsec_header hdr;
> > +    uint16_t cap;
> > +    uint16_t ctrl;
> > +    uint16_t status;
> > +    uint32_t rcvd_mod_ts_data;
> 
> Whilst it is wordy, I'd keep the full naming of that field.
> rcvd_mod_ts_data_phase_1; 
> 
> > +};
> > +_Static_assert(sizeof(struct dvsec_port_flexbus) == 0x14,
> > +               "dvsec port flexbus size incorrect");
> > +
> > +/* CXL 2.0 - 8.1.9 Register Locator DVSEC (ID 0008) */
> > +struct dvsec_register_locator {
> 
> I'd prefix these structures with cxl_ for all the normal
> namespacing collision reasons.
> Same for #defines
> 

Agreed

> 
> > +    struct dvsec_header hdr;
> > +    uint16_t rsvd;
> > +    uint32_t reg0_base_lo;
> > +    uint32_t reg0_base_hi;
> > +    uint32_t reg1_base_lo;
> > +    uint32_t reg1_base_hi;
> > +    uint32_t reg2_base_lo;
> > +    uint32_t reg2_base_hi;
> > +};
> > +_Static_assert(sizeof(struct dvsec_register_locator) == 0x24,
> > +               "dvsec register locator size incorrect");
> > +
> > +/* BAR Equivalence Indicator */
> > +#define BEI_BAR_10H 0
> > +#define BEI_BAR_14H 1
> > +#define BEI_BAR_18H 2
> > +#define BEI_BAR_1cH 3
> > +#define BEI_BAR_20H 4
> > +#define BEI_BAR_24H 5
> > +
> > +/* Register Block Identifier */
> > +#define RBI_EMPTY          0
> > +#define RBI_COMPONENT_REG  (1 << 8)
> > +#define RBI_BAR_VIRT_ACL   (2 << 8)
> > +#define RBI_CXL_DEVICE_REG (3 << 8)
> > +
> > +#endif
> 
> 


* Re: [RFC PATCH v3 04/31] hw/cxl/device: Implement the CAP array (8.2.8.1-2)
  2021-02-02 12:23     ` Jonathan Cameron
  (?)
@ 2021-02-17 22:15     ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-02-17 22:15 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: David Hildenbrand, Vishal Verma, John Groves (jgroves),
	Chris Browy, qemu-devel, linux-cxl, Markus Armbruster,
	Michael S. Tsirkin, Igor Mammedov, Dan Williams, Ira Weiny,
	Philippe Mathieu-Daudé

On 21-02-02 12:23:50, Jonathan Cameron wrote:
> On Mon, 1 Feb 2021 16:59:21 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > This implements all device MMIO up to the first capability. That
> > includes the CXL Device Capabilities Array Register, as well as all of
> > the CXL Device Capability Header Registers. The latter are filled in as
> > they are implemented in the following patches.
> > 
> > Endianness and alignment are managed by softmmu memory core.
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> A few trivials
> > ---
> >  hw/cxl/cxl-device-utils.c   | 105 ++++++++++++++++++++++++++++++++++++
> >  hw/cxl/meson.build          |   1 +
> >  include/hw/cxl/cxl_device.h |  27 +++++++++-
> >  3 files changed, 132 insertions(+), 1 deletion(-)
> >  create mode 100644 hw/cxl/cxl-device-utils.c
> > 
> > diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
> > new file mode 100644
> > index 0000000000..bb15ad9a0f
> > --- /dev/null
> > +++ b/hw/cxl/cxl-device-utils.c
> > @@ -0,0 +1,105 @@
> > +/*
> > + * CXL Utility library for devices
> > + *
> > + * Copyright(C) 2020 Intel Corporation.
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2. See the
> > + * COPYING file in the top-level directory.
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "qemu/log.h"
> > +#include "hw/cxl/cxl.h"
> > +
> > +/*
> > + * Device registers have no restrictions per the spec, and so fall back to the
> > + * default memory mapped register rules in 8.2:
> > + *   Software shall use CXL.io Memory Read and Write to access memory mapped
> > + *   register defined in this section. Unless otherwise specified, software
> > + *   shall restrict the accesses width based on the following:
> > + *   • A 32 bit register shall   be accessed as a 1 Byte, 2 Bytes or 4 Bytes
> 
> odd spacing
> 
> > + *     quantity.
> > + *   • A 64 bit register shall be accessed as a 1 Byte, 2 Bytes, 4 Bytes or 8
> > + *     Bytes
> > + *   • The address shall be a multiple of the access width, e.g. when
> > + *     accessing a register as a 4 Byte quantity, the address shall be
> > + *     multiple of 4.
> > + *   • The accesses shall map to contiguous bytes.If these rules are not
> > + *     followed, the behavior is undefined
> > + */
> > +
> > +static uint64_t caps_reg_read(void *opaque, hwaddr offset, unsigned size)
> > +{
> > +    CXLDeviceState *cxl_dstate = opaque;
> > +
> > +    return cxl_dstate->caps_reg_state32[offset / 4];
> > +}
> > +
> > +static uint64_t dev_reg_read(void *opaque, hwaddr offset, unsigned size)
> > +{
> > +    return 0;
> > +}
> > +
> > +static const MemoryRegionOps dev_ops = {
> > +    .read = dev_reg_read,
> > +    .write = NULL, /* status register is read only */
> > +    .endianness = DEVICE_LITTLE_ENDIAN,
> > +    .valid = {
> > +        .min_access_size = 1,
> > +        .max_access_size = 8,
> > +        .unaligned = false,
> > +    },
> > +    .impl = {
> > +        .min_access_size = 1,
> > +        .max_access_size = 8,
> > +    },
> > +};
> > +
> > +static const MemoryRegionOps caps_ops = {
> > +    .read = caps_reg_read,
> > +    .write = NULL, /* caps registers are read only */
> > +    .endianness = DEVICE_LITTLE_ENDIAN,
> > +    .valid = {
> > +        .min_access_size = 1,
> > +        .max_access_size = 8,
> > +        .unaligned = false,
> > +    },
> > +    .impl = {
> > +        .min_access_size = 4,
> > +        .max_access_size = 4,
> > +    },
> > +};
> > +
> > +void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
> > +{
> > +    /* This will be a BAR, so needs to be rounded up to pow2 for PCI spec */
> > +    memory_region_init(&cxl_dstate->device_registers, obj, "device-registers",
> > +                       pow2ceil(CXL_MMIO_SIZE));
> > +
> > +    memory_region_init_io(&cxl_dstate->caps, obj, &caps_ops, cxl_dstate,
> > +                          "cap-array", CXL_DEVICE_REGISTERS_OFFSET - 0);
> 
> Specifying a size in terms of the offset of another region isn't exactly 
> intuitive so perhaps a comment on why or better yet actually use a size
> parameter covering what is there rather than simply the region below
> the CXL_DEVICE_REGISTERS_OFFSET.
> 

I didn't have a simple way to accommodate this before, but now I do have
CXL_CAPS_SIZE which is what we want here.

You also made me realize I have a PEMDAS bug in CXL_CAPS_SIZE, so thanks.
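
For anyone following along, a standalone sketch of the precedence problem. The
two constants are taken from the patch; the _V3 and _FIXED macro names and
caps_words() are made up here, and the "fixed" form is my assumption about the
intent behind the "+1 for header" comment.

```c
#include <stddef.h>

#define CXL_DEVICE_CAP_REG_SIZE 0x10 /* from the patch */
#define CXL_DEVICE_CAPS_MAX     4    /* from the patch */

/* As posted in v3: '*' binds tighter than '+', so this is (0x10 * 4) + 1. */
#define CXL_CAPS_SIZE_V3 \
    (CXL_DEVICE_CAP_REG_SIZE * CXL_DEVICE_CAPS_MAX + 1)

/* Assumed intent: one extra 0x10 byte slot for the array header. */
#define CXL_CAPS_SIZE_FIXED \
    (CXL_DEVICE_CAP_REG_SIZE * (CXL_DEVICE_CAPS_MAX + 1))

/* Number of 32-bit words backing a region of 'bytes' bytes. */
static inline size_t caps_words(size_t bytes)
{
    return bytes / 4;
}
```

The v3 form evaluates to 65 bytes, so caps_reg_state32[CXL_CAPS_SIZE / 4] ends
up with 16 words. The parenthesized form gives 80 bytes and 20 words.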

> 
> > +    memory_region_init_io(&cxl_dstate->device, obj, &dev_ops, cxl_dstate,
> > +                          "device-status", CXL_DEVICE_REGISTERS_LENGTH);
> > +
> > +    memory_region_add_subregion(&cxl_dstate->device_registers, 0,
> > +                                &cxl_dstate->caps);
> > +    memory_region_add_subregion(&cxl_dstate->device_registers,
> > +                                CXL_DEVICE_REGISTERS_OFFSET,
> > +                                &cxl_dstate->device);
> > +}
> > +
> > +static void device_reg_init_common(CXLDeviceState *cxl_dstate) { }
> > +
> > +void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
> > +{
> > +    uint32_t *cap_hdrs = cxl_dstate->caps_reg_state32;
> > +    const int cap_count = 1;
> > +
> > +    /* CXL Device Capabilities Array Register */
> > +    ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_ID, 0);
> > +    ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_VERSION, 1);
> > +    ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY2, CAP_COUNT, cap_count);
> > +
> > +    cxl_device_cap_init(cxl_dstate, DEVICE, 1);
> > +    device_reg_init_common(cxl_dstate);
> > +}
> > diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build
> > index 00c3876a0f..47154d6850 100644
> > --- a/hw/cxl/meson.build
> > +++ b/hw/cxl/meson.build
> > @@ -1,3 +1,4 @@
> >  softmmu_ss.add(when: 'CONFIG_CXL', if_true: files(
> >    'cxl-component-utils.c',
> > +  'cxl-device-utils.c',
> >  ))
> > diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> > index a85f250503..f3bcf19410 100644
> > --- a/include/hw/cxl/cxl_device.h
> > +++ b/include/hw/cxl/cxl_device.h
> > @@ -58,6 +58,8 @@
> >  #define CXL_DEVICE_CAP_HDR1_OFFSET 0x10 /* Figure 138 */
> >  #define CXL_DEVICE_CAP_REG_SIZE 0x10 /* 8.2.8.2 */
> >  #define CXL_DEVICE_CAPS_MAX 4 /* 8.2.8.2.1 + 8.2.8.5 */
> > +#define CXL_CAPS_SIZE \
> > +    (CXL_DEVICE_CAP_REG_SIZE * CXL_DEVICE_CAPS_MAX + 1) /* +1 for header */
> >  
> >  #define CXL_DEVICE_REGISTERS_OFFSET 0x80 /* Read comment above */
> >  #define CXL_DEVICE_REGISTERS_LENGTH 0x8 /* 8.2.8.3.1 */
> > @@ -70,11 +72,18 @@
> >  #define CXL_MAILBOX_REGISTERS_LENGTH \
> >      (CXL_MAILBOX_REGISTERS_SIZE + CXL_MAILBOX_MAX_PAYLOAD_SIZE)
> >  
> > +#define CXL_MMIO_SIZE                                       \
> > +    CXL_DEVICE_CAP_REG_SIZE + CXL_DEVICE_REGISTERS_LENGTH + \
> > +        CXL_MAILBOX_REGISTERS_LENGTH
> > +
> >  typedef struct cxl_device_state {
> >      MemoryRegion device_registers;
> >  
> >      /* mmio for device capabilities array - 8.2.8.2 */
> > -    MemoryRegion caps;
> > +    struct {
> > +        MemoryRegion caps;
> > +        uint32_t caps_reg_state32[CXL_CAPS_SIZE / 4];
> > +    };
> 
> With this unnamed, what is the benefit of having these two in a
> struct?  The naming makes it clear they are related anyway.
> 
> >  
> >      /* mmio for the device status registers 8.2.8.3 */
> >      MemoryRegion device;
> > @@ -126,6 +135,22 @@ CXL_DEVICE_CAPABILITY_HEADER_REGISTER(DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET)
> >  CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MAILBOX, CXL_DEVICE_CAP_HDR1_OFFSET + \
> >                                                 CXL_DEVICE_CAP_REG_SIZE)
> >  
> > +#define cxl_device_cap_init(dstate, reg, cap_id)                                   \
> > +    do {                                                                           \
> > +        uint32_t *cap_hdrs = dstate->caps_reg_state32;                             \
> > +        int which = R_CXL_DEV_##reg##_CAP_HDR0;                                    \
> > +        cap_hdrs[which] =                                                          \
> > +            FIELD_DP32(cap_hdrs[which], CXL_DEV_##reg##_CAP_HDR0, CAP_ID, cap_id); \
> > +        cap_hdrs[which] = FIELD_DP32(                                              \
> > +            cap_hdrs[which], CXL_DEV_##reg##_CAP_HDR0, CAP_VERSION, 1);            \
> > +        cap_hdrs[which + 1] =                                                      \
> > +            FIELD_DP32(cap_hdrs[which + 1], CXL_DEV_##reg##_CAP_HDR1,              \
> > +                       CAP_OFFSET, CXL_##reg##_REGISTERS_OFFSET);                  \
> > +        cap_hdrs[which + 2] =                                                      \
> > +            FIELD_DP32(cap_hdrs[which + 2], CXL_DEV_##reg##_CAP_HDR2,              \
> > +                       CAP_LENGTH, CXL_##reg##_REGISTERS_LENGTH);                  \
> > +    } while (0)
> > +
> >  REG32(CXL_DEV_MAILBOX_CAP, 0)
> >      FIELD(CXL_DEV_MAILBOX_CAP, PAYLOAD_SIZE, 0, 5)
> >      FIELD(CXL_DEV_MAILBOX_CAP, INT_CAP, 5, 1)
> 
> 


* Re: [RFC PATCH v3 05/31] hw/cxl/device: Implement basic mailbox (8.2.8.4)
  2021-02-11 17:46     ` Jonathan Cameron
@ 2021-02-18  0:55       ` Ben Widawsky
  2021-02-18 16:50         ` Jonathan Cameron
  0 siblings, 1 reply; 117+ messages in thread
From: Ben Widawsky @ 2021-02-18  0:55 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: David Hildenbrand, Vishal Verma, John Groves (jgroves),
	Chris Browy, qemu-devel, linux-cxl, Markus Armbruster,
	Michael S. Tsirkin, Igor Mammedov, Dan Williams, Ira Weiny,
	Philippe Mathieu-Daudé

On 21-02-11 17:46:39, Jonathan Cameron wrote:
> On Tue, 2 Feb 2021 14:58:30 +0000
> Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
> 
> > On Mon, 1 Feb 2021 16:59:22 -0800
> > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > 
> > > This is the beginning of implementing mailbox support for CXL 2.0
> > > devices. The implementation recognizes when the doorbell is rung,
> > > handles the command/payload, clears the doorbell while returning error
> > > codes and data.
> > > 
> > > Generally the mailbox mechanism is designed to permit communication
> > > between the host OS and the firmware running on the device. For our
> > > purposes, we emulate both the firmware, implemented primarily in
> > > cxl-mailbox-utils.c, and the hardware.
> > > 
> > > No commands are implemented yet.
> > > 
> > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > ---
> > >  hw/cxl/cxl-device-utils.c   | 125 ++++++++++++++++++++++-
> > >  hw/cxl/cxl-mailbox-utils.c  | 197 ++++++++++++++++++++++++++++++++++++
> > >  hw/cxl/meson.build          |   1 +
> > >  include/hw/cxl/cxl.h        |   3 +
> > >  include/hw/cxl/cxl_device.h |  28 ++++-
> > >  5 files changed, 349 insertions(+), 5 deletions(-)
> > >  create mode 100644 hw/cxl/cxl-mailbox-utils.c
> > > 
> > > diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
> > > index bb15ad9a0f..6602606f3d 100644
> > > --- a/hw/cxl/cxl-device-utils.c
> > > +++ b/hw/cxl/cxl-device-utils.c
> > > @@ -40,6 +40,111 @@ static uint64_t dev_reg_read(void *opaque, hwaddr offset, unsigned size)
> > >      return 0;
> > >  }
> > >  
> > > +static uint64_t mailbox_reg_read(void *opaque, hwaddr offset, unsigned size)
> > > +{
> > > +    CXLDeviceState *cxl_dstate = opaque;
> > > +
> > > +    switch (size) {
> 
> This follows on from the thread on the Linux driver and the infinite loop
> I saw there as a result of doing 1 byte reads.
> 
> With the current setup of min_access_size = 4 and this function, QEMU will
> helpfully issue a series of unaligned 4 byte reads to this function. It will
> then mask those down to 1 byte each and combine them. Because of the integer
> division, the bottom byte of the register at offset / 4 is returned up to
> 4 times.
> 
> To handle 2 and 1 byte reads we need explicit support in here and
> the MemoryRegionOps need to reflect that as well.
> 
> All the similar cases where such reads are allowed need to do the
> same.
> 

From the documentation (memory.rst)
- .impl.min_access_size, .impl.max_access_size define the access sizes
  (in bytes) supported by the *implementation*; other access sizes will be
  emulated using the ones available. For example a 4-byte write will be
  emulated using four 1-byte writes, if .impl.max_access_size = 1.

I'm intentionally not looking at the implementation. As I read this, the
behavior you describe is either a QEMU bug, or poor documentation.

Am I missing something?
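
For reference, the effect under discussion can be modelled outside QEMU. This
is a standalone sketch, not QEMU code, and it assumes the read callback really
is handed the original byte offset together with a widened size, as described
above. naive_read() and sized_read() are hypothetical names; 'regs' stands in
for mbox_reg_state32.

```c
#include <stdint.h>

/* Mirrors the patch: index by offset / 4 and ignore the low offset bits. */
static inline uint32_t naive_read(const uint32_t *regs, uint64_t offset)
{
    return regs[offset / 4];
}

/* Byte-aware variant: select the requested bytes within the 32-bit word. */
static inline uint32_t sized_read(const uint32_t *regs, uint64_t offset,
                                  unsigned size)
{
    uint32_t word = regs[offset / 4];
    unsigned shift = 8 * (unsigned)(offset % 4);
    uint32_t mask = size >= 4 ? UINT32_MAX : ((1u << (8 * size)) - 1);

    return (word >> shift) & mask;
}
```

With a register holding 0x44332211, naive_read() returns the same word for
offsets 0 through 3, so a caller that keeps only the low byte sees 0x11 four
times. sized_read() returns 0x11, 0x22, 0x33 and 0x44 for 1 byte reads at those
offsets. Whether the fix belongs in the callback or in the memory core is the
open question in this thread.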

> > > +    case 8:
> > > +        return cxl_dstate->mbox_reg_state64[offset / 8];
> > > +    case 4:
> > > +        return cxl_dstate->mbox_reg_state32[offset / 4];  
> > 
> > Numeric order seems more natural and I can't see a reason not to do it.
> > + you do them in that order below.
> > 
> > > +    default:
> > > +        g_assert_not_reached();
> > > +    }
> > > +}
> > > +
> > > +static void mailbox_mem_writel(uint32_t *reg_state, hwaddr offset,
> > > +                               uint64_t value)
> > > +{
> > > +    switch (offset) {
> > > +    case A_CXL_DEV_MAILBOX_CTRL:
> > > +        /* fallthrough */
> > > +    case A_CXL_DEV_MAILBOX_CAP:
> > > +        /* RO register */
> > > +        break;
> > > +    default:
> > > +        qemu_log_mask(LOG_UNIMP,
> > > +                      "%s Unexpected 32-bit access to 0x%" PRIx64 " (WI)\n",
> > > +                      __func__, offset);
> > > +        break;
> > > +    }
> > > +
> > > +    reg_state[offset / 4] = value;
> > > +}
> > > +
> > > +static void mailbox_mem_writeq(uint64_t *reg_state, hwaddr offset,
> > > +                               uint64_t value)
> > > +{
> > > +    switch (offset) {
> > > +    case A_CXL_DEV_MAILBOX_CMD:
> > > +        break;
> > > +    case A_CXL_DEV_BG_CMD_STS:
> > > +        /* BG not supported */
> > > +        /* fallthrough */
> > > +    case A_CXL_DEV_MAILBOX_STS:
> > > +        /* Read only register, will get updated by the state machine */
> > > +        return;
> > > +    default:
> > > +        qemu_log_mask(LOG_UNIMP,
> > > +                      "%s Unexpected 64-bit access to 0x%" PRIx64 " (WI)\n",
> > > +                      __func__, offset);
> > > +        return;
> > > +    }
> > > +
> > > +
> > > +    reg_state[offset / 8] = value;
> > > +}
> > > +
> > > +static void mailbox_reg_write(void *opaque, hwaddr offset, uint64_t value,
> > > +                              unsigned size)
> > > +{
> > > +    CXLDeviceState *cxl_dstate = opaque;
> > > +
> > > +    if (offset >= A_CXL_DEV_CMD_PAYLOAD) {
> > > +        memcpy(cxl_dstate->mbox_reg_state + offset, &value, size);
> > > +        return;
> > > +    }
> > > +
> > > +    /*
> > > +     * Lock is needed to prevent concurrent writes as well as to prevent writes
> > > +     * coming in while the firmware is processing. Without background commands
> > > +     * or the second mailbox implemented, this serves no purpose since the
> > > +     * memory access is synchronized at a higher level (per memory region).
> > > +     */
> > > +    RCU_READ_LOCK_GUARD();
> > > +
> > > +    switch (size) {
> > > +    case 4:
> > > +        mailbox_mem_writel(cxl_dstate->mbox_reg_state32, offset, value);
> > > +        break;
> > > +    case 8:
> > > +        mailbox_mem_writeq(cxl_dstate->mbox_reg_state64, offset, value);
> > > +        break;
> > > +    default:
> > > +        g_assert_not_reached();
> > > +    }
> > > +
> > > +    if (ARRAY_FIELD_EX32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL,
> > > +                         DOORBELL))
> > > +        cxl_process_mailbox(cxl_dstate);
> > > +}
> > > +
> > > +static const MemoryRegionOps mailbox_ops = {
> > > +    .read = mailbox_reg_read,
> > > +    .write = mailbox_reg_write,
> > > +    .endianness = DEVICE_LITTLE_ENDIAN,
> > > +    .valid = {
> > > +        .min_access_size = 1,
> > > +        .max_access_size = 8,
> > > +        .unaligned = false,
> > > +    },
> > > +    .impl = {
> > > +        .min_access_size = 4,
> > > +        .max_access_size = 8,
> > > +    },
> > > +};
> > > +
> > >  static const MemoryRegionOps dev_ops = {
> > >      .read = dev_reg_read,
> > >      .write = NULL, /* status register is read only */
> > > @@ -80,20 +185,33 @@ void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
> > >                            "cap-array", CXL_DEVICE_REGISTERS_OFFSET - 0);
> > >      memory_region_init_io(&cxl_dstate->device, obj, &dev_ops, cxl_dstate,
> > >                            "device-status", CXL_DEVICE_REGISTERS_LENGTH);
> > > +    memory_region_init_io(&cxl_dstate->mailbox, obj, &mailbox_ops, cxl_dstate,
> > > +                          "mailbox", CXL_MAILBOX_REGISTERS_LENGTH);
> > >  
> > >      memory_region_add_subregion(&cxl_dstate->device_registers, 0,
> > >                                  &cxl_dstate->caps);
> > >      memory_region_add_subregion(&cxl_dstate->device_registers,
> > >                                  CXL_DEVICE_REGISTERS_OFFSET,
> > >                                  &cxl_dstate->device);
> > > +    memory_region_add_subregion(&cxl_dstate->device_registers,
> > > +                                CXL_MAILBOX_REGISTERS_OFFSET,
> > > +                                &cxl_dstate->mailbox);
> > >  }
> > >  
> > >  static void device_reg_init_common(CXLDeviceState *cxl_dstate) { }
> > >  
> > > +static void mailbox_reg_init_common(CXLDeviceState *cxl_dstate)
> > > +{
> > > +    /* 2048 payload size, with no interrupt or background support */
> > > +    ARRAY_FIELD_DP32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CAP,
> > > +                     PAYLOAD_SIZE, CXL_MAILBOX_PAYLOAD_SHIFT);
> > > +    cxl_dstate->payload_size = CXL_MAILBOX_MAX_PAYLOAD_SIZE;
> > > +}
> > > +
> > >  void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
> > >  {
> > >      uint32_t *cap_hdrs = cxl_dstate->caps_reg_state32;
> > > -    const int cap_count = 1;
> > > +    const int cap_count = 2;
> > >  
> > >      /* CXL Device Capabilities Array Register */
> > >      ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_ID, 0);
> > > @@ -102,4 +220,9 @@ void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
> > >  
> > >      cxl_device_cap_init(cxl_dstate, DEVICE, 1);
> > >      device_reg_init_common(cxl_dstate);
> > > +
> > > +    cxl_device_cap_init(cxl_dstate, MAILBOX, 2);
> > > +    mailbox_reg_init_common(cxl_dstate);
> > > +
> > > +    assert(cxl_initialize_mailbox(cxl_dstate) == 0);
> > >  }
> > > diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> > > new file mode 100644
> > > index 0000000000..466055b01a
> > > --- /dev/null
> > > +++ b/hw/cxl/cxl-mailbox-utils.c
> > > @@ -0,0 +1,197 @@
> > > +/*
> > > + * CXL Utility library for mailbox interface
> > > + *
> > > + * Copyright(C) 2020 Intel Corporation.
> > > + *
> > > + * This work is licensed under the terms of the GNU GPL, version 2. See the
> > > + * COPYING file in the top-level directory.
> > > + */
> > > +
> > > +#include "qemu/osdep.h"
> > > +#include "hw/cxl/cxl.h"
> > > +#include "hw/pci/pci.h"
> > > +#include "qemu/log.h"
> > > +#include "qemu/uuid.h"
> > > +
> > > +/*
> > > + * How to add a new command, example. The command set FOO, with cmd BAR.
> > > + *  1. Add the command set and cmd to the enum.
> > > + *     FOO    = 0x7f,
> > > + *          #define BAR 0
> > > + *  2. Forward declare the handler.
> > > + *     declare_mailbox_handler(FOO_BAR);
> > > + *  3. Add the command to the cxl_cmd_set[][]
> > > + *     CXL_CMD(FOO, BAR, 0, 0),
> > > + *  4. Implement your handler
> > > + *     define_mailbox_handler(FOO_BAR) { ... return CXL_MBOX_SUCCESS; }
> > > + *
> > > + *
> > > + *  Writing the handler:
> > > + *    The handler will provide the &struct cxl_cmd, the &CXLDeviceState, and the
> > > + *    in/out length of the payload. The handler is responsible for consuming the
> > > + *    payload from cmd->payload and operating upon it as necessary. It must then
> > > + *    fill the output data into cmd->payload (overwriting what was there),
> > > + *    setting the length, and returning a valid return code.
> > > + *
> > > + *  XXX: The handler need not worry about endianess. The payload is read out of
> > > + *  a register interface that already deals with it.
> > > + */
> > > +
> > > +/* 8.2.8.4.5.1 Command Return Codes */
> > > +typedef enum {
> > > +    CXL_MBOX_SUCCESS = 0x0,
> > > +    CXL_MBOX_BG_STARTED = 0x1,
> > > +    CXL_MBOX_INVALID_INPUT = 0x2,
> > > +    CXL_MBOX_UNSUPPORTED = 0x3,
> > > +    CXL_MBOX_INTERNAL_ERROR = 0x4,
> > > +    CXL_MBOX_RETRY_REQUIRED = 0x5,
> > > +    CXL_MBOX_BUSY = 0x6,
> > > +    CXL_MBOX_MEDIA_DISABLED = 0x7,
> > > +    CXL_MBOX_FW_XFER_IN_PROGRESS = 0x8,
> > > +    CXL_MBOX_FW_XFER_OUT_OF_ORDER = 0x9,
> > > +    CXL_MBOX_FW_AUTH_FAILED = 0xa,
> > > +    CXL_MBOX_FW_INVALID_SLOT = 0xb,
> > > +    CXL_MBOX_FW_ROLLEDBACK = 0xc,
> > > +    CXL_MBOX_FW_REST_REQD = 0xd,
> > > +    CXL_MBOX_INVALID_HANDLE = 0xe,
> > > +    CXL_MBOX_INVALID_PA = 0xf,
> > > +    CXL_MBOX_INJECT_POISON_LIMIT = 0x10,
> > > +    CXL_MBOX_PERMANENT_MEDIA_FAILURE = 0x11,
> > > +    CXL_MBOX_ABORTED = 0x12,
> > > +    CXL_MBOX_INVALID_SECURITY_STATE = 0x13,
> > > +    CXL_MBOX_INCORRECT_PASSPHRASE = 0x14,
> > > +    CXL_MBOX_UNSUPPORTED_MAILBOX = 0x15,
> > > +    CXL_MBOX_INVALID_PAYLOAD_LENGTH = 0x16,
> > > +    CXL_MBOX_MAX = 0x17
> > > +} ret_code;
> > > +
> > > +struct cxl_cmd;
> > > +typedef ret_code (*opcode_handler)(struct cxl_cmd *cmd,
> > > +                                   CXLDeviceState *cxl_dstate, uint16_t *len);
> > > +struct cxl_cmd {
> > > +    const char *name;
> > > +    opcode_handler handler;
> > > +    ssize_t in;
> > > +    uint16_t effect; /* Reported in CEL */
> > > +    uint8_t *payload;  
> > 
> > This payload pointer feels somewhat out of place. Perhaps it should be a parameter
> > passed to the opcode_handler()?  The address of the payload doesn't feel like
> > part of the command as such so you are just using it as somewhere to stash
> > the address when passing to the handler.
> > 
> > 
> > > +};
> > > +
> > > +#define define_mailbox_handler(name)                \
> > > +    static ret_code cmd_##name(struct cxl_cmd *cmd, \
> > > +                               CXLDeviceState *cxl_dstate, uint16_t *len)
> > > +#define declare_mailbox_handler(name) define_mailbox_handler(name)
> > > +
> > > +#define define_mailbox_handler_zeroed(name, size)                         \
> > > +    uint16_t __zero##name = size;                                         \
> > > +    static ret_code cmd_##name(struct cxl_cmd *cmd,                       \
> > > +                               CXLDeviceState *cxl_dstate, uint16_t *len) \
> > > +    {                                                                     \
> > > +        *len = __zero##name;                                              \
> > > +        memset(cmd->payload, 0, *len);                                    \
> > > +        return CXL_MBOX_SUCCESS;                                          \
> > > +    }
> > > +#define define_mailbox_handler_const(name, data)                          \
> > > +    static ret_code cmd_##name(struct cxl_cmd *cmd,                       \
> > > +                               CXLDeviceState *cxl_dstate, uint16_t *len) \
> > > +    {                                                                     \
> > > +        *len = sizeof(data);                                              \
> > > +        memcpy(cmd->payload, data, *len);                                 \
> > > +        return CXL_MBOX_SUCCESS;                                          \
> > > +    }
> > > +#define define_mailbox_handler_nop(name)                                  \
> > > +    static ret_code cmd_##name(struct cxl_cmd *cmd,                       \
> > > +                               CXLDeviceState *cxl_dstate, uint16_t *len) \
> > > +    {                                                                     \
> > > +        return CXL_MBOX_SUCCESS;                                          \
> > > +    }
> > > +
> > > +#define CXL_CMD(s, c, in, cel_effect) \
> > > +    [s][c] = { stringify(s##_##c), cmd_##s##_##c, in, cel_effect }
> > > +
> > > +static struct cxl_cmd cxl_cmd_set[256][256] = {};
> > > +
> > > +#undef CXL_CMD
> > > +
> > > +QemuUUID cel_uuid;
> > > +
> > > +void cxl_process_mailbox(CXLDeviceState *cxl_dstate)
> > > +{
> > > +    uint16_t ret = CXL_MBOX_SUCCESS;
> > > +    struct cxl_cmd *cxl_cmd;
> > > +    uint64_t status_reg;
> > > +    opcode_handler h;
> > > +
> > > +    /*
> > > +     * current state of mailbox interface
> > > +     *  mbox_cap_reg = cxl_dstate->reg_state32[R_CXL_DEV_MAILBOX_CAP];
> > > +     *  mbox_ctrl_reg = cxl_dstate->reg_state32[R_CXL_DEV_MAILBOX_CTRL];
> > > +     *  status_reg = *(uint64_t *)&cxl_dstate->reg_state[A_CXL_DEV_MAILBOX_STS];
> > > +     */
> > > +    uint64_t command_reg =
> > > +        *(uint64_t *)&cxl_dstate->mbox_reg_state[A_CXL_DEV_MAILBOX_CMD];
> > > +
> > > +    /* Check if we have to do anything */
> > > +    if (!ARRAY_FIELD_EX32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL,
> > > +                          DOORBELL)) {
> > > +        qemu_log_mask(LOG_UNIMP, "Corrupt internal state for firmware\n");
> > > +        return;
> > > +    }
> > > +
> > > +    uint8_t set = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND_SET);
> > > +    uint8_t cmd = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND);
> > > +    uint16_t len = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, LENGTH);
> > > +    cxl_cmd = &cxl_cmd_set[set][cmd];
> > > +    h = cxl_cmd->handler;
> > > +    if (!h) {  
> > 
> > This path seems to not convey information it perhaps should.  Maybe some feedback that
> > a command was requested that isn't registered?
> > 
> > > +        goto handled;
> > > +    }
> > > +
> > > +    if (len != cxl_cmd->in) {
> > > +        ret = CXL_MBOX_INVALID_PAYLOAD_LENGTH;
> > > +    }
> > > +
> > > +    cxl_cmd->payload = cxl_dstate->mbox_reg_state + A_CXL_DEV_CMD_PAYLOAD;
> > > +    ret = (*h)(cxl_cmd, cxl_dstate, &len);
> > > +    assert(len <= cxl_dstate->payload_size);
> > > +
> > > +handled:
> > > +    /*
> > > +     * Set the return code
> > > +     * XXX: it's a 64b register, but we're not setting the vendor, so we can get
> > > +     * away with this
> > > +     */
> > > +    status_reg = FIELD_DP64(0, CXL_DEV_MAILBOX_STS, ERRNO, ret);
> > > +
> > > +    /*
> > > +     * Set the return length
> > > +     */
> > > +    command_reg = FIELD_DP64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND_SET, 0);
> > > +    command_reg = FIELD_DP64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND, 0);
> > > +    command_reg = FIELD_DP64(command_reg, CXL_DEV_MAILBOX_CMD, LENGTH, len);
> > > +
> > > +    cxl_dstate->mbox_reg_state64[A_CXL_DEV_MAILBOX_CMD / 8] = command_reg;
> > > +    cxl_dstate->mbox_reg_state64[A_CXL_DEV_MAILBOX_STS / 8] = status_reg;
> > > +
> > > +    /* Tell the host we're done */
> > > +    ARRAY_FIELD_DP32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL,
> > > +                     DOORBELL, 0);
> > > +}
> > > +
> > > +int cxl_initialize_mailbox(CXLDeviceState *cxl_dstate)
> > > +{
> > > +    const char *cel_uuidstr = "0da9c0b5-bf41-4b78-8f79-96b1623b3f17";
> > > +
> > > +    for (int i = 0; i < 256; i++) {  
> > 
> > Instead of indexing with i and j, perhaps this would be more consistent using
> > the naming you have above cmd and set?
> > 
> > 
> > > +        for (int j = 0; j < 256; j++) {
> > > +            if (cxl_cmd_set[i][j].handler) {
> > > +                struct cxl_cmd *c = &cxl_cmd_set[i][j];
> > > +
> > > +                cxl_dstate->cel_log[cxl_dstate->cel_size].opcode = (i << 8) | j;
> > > +                cxl_dstate->cel_log[cxl_dstate->cel_size].effect = c->effect;
> > > +                cxl_dstate->cel_size++;
> > > +            }
> > > +        }
> > > +    }
> > > +
> > > +    return qemu_uuid_parse(cel_uuidstr, &cel_uuid);
> > > +}
> > > diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build
> > > index 47154d6850..0eca715d10 100644
> > > --- a/hw/cxl/meson.build
> > > +++ b/hw/cxl/meson.build
> > > @@ -1,4 +1,5 @@
> > >  softmmu_ss.add(when: 'CONFIG_CXL', if_true: files(
> > >    'cxl-component-utils.c',
> > >    'cxl-device-utils.c',
> > > +  'cxl-mailbox-utils.c',
> > >  ))
> > > diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
> > > index 23f52c4cf9..362cda40de 100644
> > > --- a/include/hw/cxl/cxl.h
> > > +++ b/include/hw/cxl/cxl.h
> > > @@ -14,5 +14,8 @@
> > >  #include "cxl_component.h"
> > >  #include "cxl_device.h"
> > >  
> > > +#define COMPONENT_REG_BAR_IDX 0
> > > +#define DEVICE_REG_BAR_IDX 2  
> > 
> > I'd argue for prefixing all defines
> > 
> > CXL_COMPONENT_REG_BAR_IDX etc
> > 
> > Will make it clear they are generic CXL related things.
> > 
> > > +
> > >  #endif
> > >  
> > > diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> > > index f3bcf19410..af91bec10c 100644
> > > --- a/include/hw/cxl/cxl_device.h
> > > +++ b/include/hw/cxl/cxl_device.h
> > > @@ -80,16 +80,27 @@ typedef struct cxl_device_state {
> > >      MemoryRegion device_registers;
> > >  
> > >      /* mmio for device capabilities array - 8.2.8.2 */
> > > +    MemoryRegion device;
> > >      struct {
> > >          MemoryRegion caps;
> > >          uint32_t caps_reg_state32[CXL_CAPS_SIZE / 4];
> > >      };
> > >  
> > > -    /* mmio for the device status registers 8.2.8.3 */
> > > -    MemoryRegion device;
> > > -  
> > 
> > Move this block back to where it was originally introduced rather than
> > introduce it then move it later?
> > 
> > >      /* mmio for the mailbox registers 8.2.8.4 */
> > > -    MemoryRegion mailbox;
> > > +    struct {
> > > +        MemoryRegion mailbox;
> > > +        uint16_t payload_size;
> > > +        union {
> > > +            uint8_t mbox_reg_state[CXL_MAILBOX_REGISTERS_LENGTH];
> > > +            uint32_t mbox_reg_state32[CXL_MAILBOX_REGISTERS_LENGTH / 4];
> > > +            uint64_t mbox_reg_state64[CXL_MAILBOX_REGISTERS_LENGTH / 8];
> > > +        };
> > > +        struct {
> > > +            uint16_t opcode;
> > > +            uint16_t effect;
> > > +        } cel_log[1 << 16];
> > > +        size_t cel_size;
> > > +    };  
> > 
> > If the structure is unnamed, chances of a naming clash seem rather high
> > if you don't prefix all the elements with mbx_ or something like that.
> > 
> > >  
> > >      /* memory region for persistent memory, HDM */
> > >      MemoryRegion *pmem;
> > > @@ -135,6 +146,9 @@ CXL_DEVICE_CAPABILITY_HEADER_REGISTER(DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET)
> > >  CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MAILBOX, CXL_DEVICE_CAP_HDR1_OFFSET + \
> > >                                                 CXL_DEVICE_CAP_REG_SIZE)
> > >  
> > > +int cxl_initialize_mailbox(CXLDeviceState *cxl_dstate);
> > > +void cxl_process_mailbox(CXLDeviceState *cxl_dstate);
> > > +
> > >  #define cxl_device_cap_init(dstate, reg, cap_id)                                   \
> > >      do {                                                                           \
> > >          uint32_t *cap_hdrs = dstate->caps_reg_state32;                             \
> > > @@ -162,6 +176,12 @@ REG32(CXL_DEV_MAILBOX_CTRL, 4)
> > >      FIELD(CXL_DEV_MAILBOX_CTRL, INT_EN, 1, 1)
> > >      FIELD(CXL_DEV_MAILBOX_CTRL, BG_INT_EN, 2, 1)
> > >  
> > > +/* XXX: actually a 64b register */
> > > +REG32(CXL_DEV_MAILBOX_CMD, 8)
> > > +    FIELD(CXL_DEV_MAILBOX_CMD, COMMAND, 0, 8)
> > > +    FIELD(CXL_DEV_MAILBOX_CMD, COMMAND_SET, 8, 8)
> > > +    FIELD(CXL_DEV_MAILBOX_CMD, LENGTH, 16, 20)
> > > +  
> > 
> > Ah. this is where this definition went.  Perhaps pull it back into patch 3?
> > That patch defines plenty of other things that aren't used until later patches
> > I think, so one more won't hurt and will save me asking why you skipped it:)
> > 
> > 
> > >  /* XXX: actually a 64b register */
> > >  REG32(CXL_DEV_MAILBOX_STS, 0x10)
> > >      FIELD(CXL_DEV_MAILBOX_STS, BG_OP, 0, 1)  
> > 
> > 
> 
> 


* Re: [RFC PATCH v3 05/31] hw/cxl/device: Implement basic mailbox (8.2.8.4)
  2021-02-18  0:55       ` Ben Widawsky
@ 2021-02-18 16:50         ` Jonathan Cameron
  0 siblings, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2021-02-18 16:50 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: David Hildenbrand, Vishal Verma, John Groves (jgroves),
	Chris Browy, qemu-devel, linux-cxl, Markus Armbruster,
	Michael S. Tsirkin, Igor Mammedov, Dan Williams, Ira Weiny,
	Philippe Mathieu-Daudé

On Wed, 17 Feb 2021 16:55:14 -0800
Ben Widawsky <ben@bwidawsk.net> wrote:

> On 21-02-11 17:46:39, Jonathan Cameron wrote:
> > On Tue, 2 Feb 2021 14:58:30 +0000
> > Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
> >   
> > > On Mon, 1 Feb 2021 16:59:22 -0800
> > > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >   
> > > > This is the beginning of implementing mailbox support for CXL 2.0
> > > > devices. The implementation recognizes when the doorbell is rung,
> > > > handles the command/payload, clears the doorbell while returning error
> > > > codes and data.
> > > > 
> > > > Generally the mailbox mechanism is designed to permit communication
> > > > between the host OS and the firmware running on the device. For our
> > > > purposes, we emulate both the firmware, implemented primarily in
> > > > cxl-mailbox-utils.c, and the hardware.
> > > > 
> > > > No commands are implemented yet.
> > > > 
> > > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > > ---
> > > >  hw/cxl/cxl-device-utils.c   | 125 ++++++++++++++++++++++-
> > > >  hw/cxl/cxl-mailbox-utils.c  | 197 ++++++++++++++++++++++++++++++++++++
> > > >  hw/cxl/meson.build          |   1 +
> > > >  include/hw/cxl/cxl.h        |   3 +
> > > >  include/hw/cxl/cxl_device.h |  28 ++++-
> > > >  5 files changed, 349 insertions(+), 5 deletions(-)
> > > >  create mode 100644 hw/cxl/cxl-mailbox-utils.c
> > > > 
> > > > diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
> > > > index bb15ad9a0f..6602606f3d 100644
> > > > --- a/hw/cxl/cxl-device-utils.c
> > > > +++ b/hw/cxl/cxl-device-utils.c
> > > > @@ -40,6 +40,111 @@ static uint64_t dev_reg_read(void *opaque, hwaddr offset, unsigned size)
> > > >      return 0;
> > > >  }
> > > >  
> > > > +static uint64_t mailbox_reg_read(void *opaque, hwaddr offset, unsigned size)
> > > > +{
> > > > +    CXLDeviceState *cxl_dstate = opaque;
> > > > +
> > > > +    switch (size) {  
> > 
> > This relates to the thread on the Linux driver and the infinite loop I
> > saw there as a result of doing 1-byte reads.
> > 
> > With the current setup of min_access_size = 4 and this
> > function, QEMU will helpfully issue a series of unaligned 4-byte
> > reads to this function. It will then mask those down to 1 byte each
> > and combine them.  Given the integer division, that results in
> > the bottom byte of the dword at offset / 4 being returned up to 4 times.
> > 
> > To handle 2 and 1 byte reads we need explicit support in here and
> > the MemoryRegionOps need to reflect that as well.
> > 
> > All the similar cases where such reads are allowed need to do the
> > same.
> >   
> 
> From the documentation (memory.rst)
> - .impl.min_access_size, .impl.max_access_size define the access sizes
>   (in bytes) supported by the *implementation*; other access sizes will be
>   emulated using the ones available. For example a 4-byte write will be
>   emulated using four 1-byte writes, if .impl.max_access_size = 1.
> 
> I'm intentionally not looking at the implementation. As I read this, the
> behavior you describe is either a QEMU bug, or poor documentation.
> 
> Am I missing something?

Good question...  

For the case of emulating a larger access than supported, it seems to be
fine.

The code is implemented in softmmu/memory.c as something like:

for (i = 0; i < size; i += access_size) {
    r |= access_fn(mr, addr + i, value, access_size,
                   i * 8, access_mask, attrs);
}

For a smaller read than supported,
memory_region_shift_read_access() looks like it would do the
masking but the mask and shift passed to the function are wrong.
There is a fixme in the code referring to unaligned accesses,
but these are aligned.
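
To make the symptom concrete, here is a small standalone model of it. This
is not QEMU code. It assumes the core forwards a 1-byte read as a 4-byte
read at the same offset and keeps only the low byte of the result. The
names model_reg_read and model_core_read1, and the register contents, are
made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register contents: byte N of the block holds the value N. */
static const uint32_t model_reg_state32[2] = { 0x03020100, 0x07060504 };

/*
 * Mirrors the 4-byte case of mailbox_reg_read(): the offset is divided by 4
 * to index the dword array, so the low two bits of the offset are dropped.
 */
uint64_t model_reg_read(uint64_t offset, unsigned size)
{
    (void)size;
    assert(offset / 4 < 2);
    return model_reg_state32[offset / 4];
}

/*
 * Assumed core behaviour for a 1-byte guest read when the implementation
 * minimum is 4: call the device with size 4 at the same offset, then keep
 * the low 8 bits of whatever comes back.
 */
uint8_t model_core_read1(uint64_t offset)
{
    uint64_t tmp = model_reg_read(offset, 4);

    return (uint8_t)(tmp & 0xff);
}
```

In this model, 1-byte reads at offsets 4, 5, 6 and 7 all return 0x04,
where the guest would expect 0x04, 0x05, 0x06 and 0x07.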

It's fairly easy to make this work, but unexpected side effects
worry me a little + my stab at big endian support may be completely
wrong.

I'll mess around with this a bit more then send out something to
see how people would like to fix this issue
(either the docs, or the code).
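
For the device-side alternative raised earlier in the thread (explicit
1 and 2 byte support in the read function, with the MemoryRegionOps
changed to match), a rough sketch could look like the following. It is
only a model: struct model_mbox, MODEL_MBOX_LEN and model_mbox_read are
hypothetical names, and it assembles the value from the byte view of the
register state so it does not depend on host endianness.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical length; stands in for CXL_MAILBOX_REGISTERS_LENGTH. */
#define MODEL_MBOX_LEN 16

struct model_mbox {
    uint8_t reg_state[MODEL_MBOX_LEN]; /* byte view of the register block */
};

/*
 * Read 1, 2, 4 or 8 bytes at a naturally aligned offset and return them
 * as a little-endian value, matching a DEVICE_LITTLE_ENDIAN region.
 */
uint64_t model_mbox_read(const struct model_mbox *m, uint64_t offset,
                         unsigned size)
{
    uint64_t val = 0;

    assert(size == 1 || size == 2 || size == 4 || size == 8);
    assert(offset % size == 0);
    assert(offset + size <= MODEL_MBOX_LEN);

    for (unsigned i = 0; i < size; i++) {
        val |= (uint64_t)m->reg_state[offset + i] << (8 * i);
    }

    return val;
}
```

With something along these lines each byte offset returns its own byte,
at the cost of the device code handling every access size itself.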

Jonathan

> 
> > > > +    case 8:
> > > > +        return cxl_dstate->mbox_reg_state64[offset / 8];
> > > > +    case 4:
> > > > +        return cxl_dstate->mbox_reg_state32[offset / 4];    
> > > 
> > > Numeric order seems more natural and I can't see a reason not to do it.
> > > + you do them in that order below.
> > >   
> > > > +    default:
> > > > +        g_assert_not_reached();
> > > > +    }
> > > > +}
> > > > +
> > > > +static void mailbox_mem_writel(uint32_t *reg_state, hwaddr offset,
> > > > +                               uint64_t value)
> > > > +{
> > > > +    switch (offset) {
> > > > +    case A_CXL_DEV_MAILBOX_CTRL:
> > > > +        /* fallthrough */
> > > > +    case A_CXL_DEV_MAILBOX_CAP:
> > > > +        /* RO register */
> > > > +        break;
> > > > +    default:
> > > > +        qemu_log_mask(LOG_UNIMP,
> > > > +                      "%s Unexpected 32-bit access to 0x%" PRIx64 " (WI)\n",
> > > > +                      __func__, offset);
> > > > +        break;
> > > > +    }
> > > > +
> > > > +    reg_state[offset / 4] = value;
> > > > +}
> > > > +
> > > > +static void mailbox_mem_writeq(uint64_t *reg_state, hwaddr offset,
> > > > +                               uint64_t value)
> > > > +{
> > > > +    switch (offset) {
> > > > +    case A_CXL_DEV_MAILBOX_CMD:
> > > > +        break;
> > > > +    case A_CXL_DEV_BG_CMD_STS:
> > > > +        /* BG not supported */
> > > > +        /* fallthrough */
> > > > +    case A_CXL_DEV_MAILBOX_STS:
> > > > +        /* Read only register, will get updated by the state machine */
> > > > +        return;
> > > > +    default:
> > > > +        qemu_log_mask(LOG_UNIMP,
> > > > +                      "%s Unexpected 64-bit access to 0x%" PRIx64 " (WI)\n",
> > > > +                      __func__, offset);
> > > > +        return;
> > > > +    }
> > > > +
> > > > +
> > > > +    reg_state[offset / 8] = value;
> > > > +}
> > > > +
> > > > +static void mailbox_reg_write(void *opaque, hwaddr offset, uint64_t value,
> > > > +                              unsigned size)
> > > > +{
> > > > +    CXLDeviceState *cxl_dstate = opaque;
> > > > +
> > > > +    if (offset >= A_CXL_DEV_CMD_PAYLOAD) {
> > > > +        memcpy(cxl_dstate->mbox_reg_state + offset, &value, size);
> > > > +        return;
> > > > +    }
> > > > +
> > > > +    /*
> > > > +     * Lock is needed to prevent concurrent writes as well as to prevent writes
> > > > +     * coming in while the firmware is processing. Without background commands
> > > > +     * or the second mailbox implemented, this serves no purpose since the
> > > > +     * memory access is synchronized at a higher level (per memory region).
> > > > +     */
> > > > +    RCU_READ_LOCK_GUARD();
> > > > +
> > > > +    switch (size) {
> > > > +    case 4:
> > > > +        mailbox_mem_writel(cxl_dstate->mbox_reg_state32, offset, value);
> > > > +        break;
> > > > +    case 8:
> > > > +        mailbox_mem_writeq(cxl_dstate->mbox_reg_state64, offset, value);
> > > > +        break;
> > > > +    default:
> > > > +        g_assert_not_reached();
> > > > +    }
> > > > +
> > > > +    if (ARRAY_FIELD_EX32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL,
> > > > +                         DOORBELL))
> > > > +        cxl_process_mailbox(cxl_dstate);
> > > > +}
> > > > +
> > > > +static const MemoryRegionOps mailbox_ops = {
> > > > +    .read = mailbox_reg_read,
> > > > +    .write = mailbox_reg_write,
> > > > +    .endianness = DEVICE_LITTLE_ENDIAN,
> > > > +    .valid = {
> > > > +        .min_access_size = 1,
> > > > +        .max_access_size = 8,
> > > > +        .unaligned = false,
> > > > +    },
> > > > +    .impl = {
> > > > +        .min_access_size = 4,
> > > > +        .max_access_size = 8,
> > > > +    },
> > > > +};
> > > > +
> > > >  static const MemoryRegionOps dev_ops = {
> > > >      .read = dev_reg_read,
> > > >      .write = NULL, /* status register is read only */
> > > > @@ -80,20 +185,33 @@ void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
> > > >                            "cap-array", CXL_DEVICE_REGISTERS_OFFSET - 0);
> > > >      memory_region_init_io(&cxl_dstate->device, obj, &dev_ops, cxl_dstate,
> > > >                            "device-status", CXL_DEVICE_REGISTERS_LENGTH);
> > > > +    memory_region_init_io(&cxl_dstate->mailbox, obj, &mailbox_ops, cxl_dstate,
> > > > +                          "mailbox", CXL_MAILBOX_REGISTERS_LENGTH);
> > > >  
> > > >      memory_region_add_subregion(&cxl_dstate->device_registers, 0,
> > > >                                  &cxl_dstate->caps);
> > > >      memory_region_add_subregion(&cxl_dstate->device_registers,
> > > >                                  CXL_DEVICE_REGISTERS_OFFSET,
> > > >                                  &cxl_dstate->device);
> > > > +    memory_region_add_subregion(&cxl_dstate->device_registers,
> > > > +                                CXL_MAILBOX_REGISTERS_OFFSET,
> > > > +                                &cxl_dstate->mailbox);
> > > >  }
> > > >  
> > > >  static void device_reg_init_common(CXLDeviceState *cxl_dstate) { }
> > > >  
> > > > +static void mailbox_reg_init_common(CXLDeviceState *cxl_dstate)
> > > > +{
> > > > +    /* 2048 payload size, with no interrupt or background support */
> > > > +    ARRAY_FIELD_DP32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CAP,
> > > > +                     PAYLOAD_SIZE, CXL_MAILBOX_PAYLOAD_SHIFT);
> > > > +    cxl_dstate->payload_size = CXL_MAILBOX_MAX_PAYLOAD_SIZE;
> > > > +}
> > > > +
> > > >  void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
> > > >  {
> > > >      uint32_t *cap_hdrs = cxl_dstate->caps_reg_state32;
> > > > -    const int cap_count = 1;
> > > > +    const int cap_count = 2;
> > > >  
> > > >      /* CXL Device Capabilities Array Register */
> > > >      ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_ID, 0);
> > > > @@ -102,4 +220,9 @@ void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
> > > >  
> > > >      cxl_device_cap_init(cxl_dstate, DEVICE, 1);
> > > >      device_reg_init_common(cxl_dstate);
> > > > +
> > > > +    cxl_device_cap_init(cxl_dstate, MAILBOX, 2);
> > > > +    mailbox_reg_init_common(cxl_dstate);
> > > > +
> > > > +    assert(cxl_initialize_mailbox(cxl_dstate) == 0);
> > > >  }
> > > > diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> > > > new file mode 100644
> > > > index 0000000000..466055b01a
> > > > --- /dev/null
> > > > +++ b/hw/cxl/cxl-mailbox-utils.c
> > > > @@ -0,0 +1,197 @@
> > > > +/*
> > > > + * CXL Utility library for mailbox interface
> > > > + *
> > > > + * Copyright(C) 2020 Intel Corporation.
> > > > + *
> > > > + * This work is licensed under the terms of the GNU GPL, version 2. See the
> > > > + * COPYING file in the top-level directory.
> > > > + */
> > > > +
> > > > +#include "qemu/osdep.h"
> > > > +#include "hw/cxl/cxl.h"
> > > > +#include "hw/pci/pci.h"
> > > > +#include "qemu/log.h"
> > > > +#include "qemu/uuid.h"
> > > > +
> > > > +/*
> > > > + * How to add a new command, example. The command set FOO, with cmd BAR.
> > > > + *  1. Add the command set and cmd to the enum.
> > > > + *     FOO    = 0x7f,
> > > > + *          #define BAR 0
> > > > + *  2. Forward declare the handler.
> > > > + *     declare_mailbox_handler(FOO_BAR);
> > > > + *  3. Add the command to the cxl_cmd_set[][]
> > > > + *     CXL_CMD(FOO, BAR, 0, 0),
> > > > + *  4. Implement your handler
> > > > + *     define_mailbox_handler(FOO_BAR) { ... return CXL_MBOX_SUCCESS; }
> > > > + *
> > > > + *
> > > > + *  Writing the handler:
> > > > + *    The handler will provide the &struct cxl_cmd, the &CXLDeviceState, and the
> > > > + *    in/out length of the payload. The handler is responsible for consuming the
> > > > + *    payload from cmd->payload and operating upon it as necessary. It must then
> > > > + *    fill the output data into cmd->payload (overwriting what was there),
> > > > + *    setting the length, and returning a valid return code.
> > > > + *
> > > > + *  XXX: The handler need not worry about endianess. The payload is read out of
> > > > + *  a register interface that already deals with it.
> > > > + */
> > > > +
> > > > +/* 8.2.8.4.5.1 Command Return Codes */
> > > > +typedef enum {
> > > > +    CXL_MBOX_SUCCESS = 0x0,
> > > > +    CXL_MBOX_BG_STARTED = 0x1,
> > > > +    CXL_MBOX_INVALID_INPUT = 0x2,
> > > > +    CXL_MBOX_UNSUPPORTED = 0x3,
> > > > +    CXL_MBOX_INTERNAL_ERROR = 0x4,
> > > > +    CXL_MBOX_RETRY_REQUIRED = 0x5,
> > > > +    CXL_MBOX_BUSY = 0x6,
> > > > +    CXL_MBOX_MEDIA_DISABLED = 0x7,
> > > > +    CXL_MBOX_FW_XFER_IN_PROGRESS = 0x8,
> > > > +    CXL_MBOX_FW_XFER_OUT_OF_ORDER = 0x9,
> > > > +    CXL_MBOX_FW_AUTH_FAILED = 0xa,
> > > > +    CXL_MBOX_FW_INVALID_SLOT = 0xb,
> > > > +    CXL_MBOX_FW_ROLLEDBACK = 0xc,
> > > > +    CXL_MBOX_FW_REST_REQD = 0xd,
> > > > +    CXL_MBOX_INVALID_HANDLE = 0xe,
> > > > +    CXL_MBOX_INVALID_PA = 0xf,
> > > > +    CXL_MBOX_INJECT_POISON_LIMIT = 0x10,
> > > > +    CXL_MBOX_PERMANENT_MEDIA_FAILURE = 0x11,
> > > > +    CXL_MBOX_ABORTED = 0x12,
> > > > +    CXL_MBOX_INVALID_SECURITY_STATE = 0x13,
> > > > +    CXL_MBOX_INCORRECT_PASSPHRASE = 0x14,
> > > > +    CXL_MBOX_UNSUPPORTED_MAILBOX = 0x15,
> > > > +    CXL_MBOX_INVALID_PAYLOAD_LENGTH = 0x16,
> > > > +    CXL_MBOX_MAX = 0x17
> > > > +} ret_code;
> > > > +
> > > > +struct cxl_cmd;
> > > > +typedef ret_code (*opcode_handler)(struct cxl_cmd *cmd,
> > > > +                                   CXLDeviceState *cxl_dstate, uint16_t *len);
> > > > +struct cxl_cmd {
> > > > +    const char *name;
> > > > +    opcode_handler handler;
> > > > +    ssize_t in;
> > > > +    uint16_t effect; /* Reported in CEL */
> > > > +    uint8_t *payload;    
> > > 
> > > This payload pointer feels somewhat out of place. Perhaps it should be a parameter
> > > passed to the opcode_handler()?  The address of the payload doesn't feel like
> > > > part of the command as such so you are just using it as somewhere to stash
> > > the address when passing to the handler.
> > > 
> > >   
> > > > +};
> > > > +
> > > > +#define define_mailbox_handler(name)                \
> > > > +    static ret_code cmd_##name(struct cxl_cmd *cmd, \
> > > > +                               CXLDeviceState *cxl_dstate, uint16_t *len)
> > > > +#define declare_mailbox_handler(name) define_mailbox_handler(name)
> > > > +
> > > > +#define define_mailbox_handler_zeroed(name, size)                         \
> > > > +    uint16_t __zero##name = size;                                         \
> > > > +    static ret_code cmd_##name(struct cxl_cmd *cmd,                       \
> > > > +                               CXLDeviceState *cxl_dstate, uint16_t *len) \
> > > > +    {                                                                     \
> > > > +        *len = __zero##name;                                              \
> > > > +        memset(cmd->payload, 0, *len);                                    \
> > > > +        return CXL_MBOX_SUCCESS;                                          \
> > > > +    }
> > > > +#define define_mailbox_handler_const(name, data)                          \
> > > > +    static ret_code cmd_##name(struct cxl_cmd *cmd,                       \
> > > > +                               CXLDeviceState *cxl_dstate, uint16_t *len) \
> > > > +    {                                                                     \
> > > > +        *len = sizeof(data);                                              \
> > > > +        memcpy(cmd->payload, data, *len);                                 \
> > > > +        return CXL_MBOX_SUCCESS;                                          \
> > > > +    }
> > > > +#define define_mailbox_handler_nop(name)                                  \
> > > > +    static ret_code cmd_##name(struct cxl_cmd *cmd,                       \
> > > > +                               CXLDeviceState *cxl_dstate, uint16_t *len) \
> > > > +    {                                                                     \
> > > > +        return CXL_MBOX_SUCCESS;                                          \
> > > > +    }
> > > > +
> > > > +#define CXL_CMD(s, c, in, cel_effect) \
> > > > +    [s][c] = { stringify(s##_##c), cmd_##s##_##c, in, cel_effect }
> > > > +
> > > > +static struct cxl_cmd cxl_cmd_set[256][256] = {};
> > > > +
> > > > +#undef CXL_CMD
> > > > +
> > > > +QemuUUID cel_uuid;
> > > > +
> > > > +void cxl_process_mailbox(CXLDeviceState *cxl_dstate)
> > > > +{
> > > > +    uint16_t ret = CXL_MBOX_SUCCESS;
> > > > +    struct cxl_cmd *cxl_cmd;
> > > > +    uint64_t status_reg;
> > > > +    opcode_handler h;
> > > > +
> > > > +    /*
> > > > +     * current state of mailbox interface
> > > > +     *  mbox_cap_reg = cxl_dstate->reg_state32[R_CXL_DEV_MAILBOX_CAP];
> > > > +     *  mbox_ctrl_reg = cxl_dstate->reg_state32[R_CXL_DEV_MAILBOX_CTRL];
> > > > +     *  status_reg = *(uint64_t *)&cxl_dstate->reg_state[A_CXL_DEV_MAILBOX_STS];
> > > > +     */
> > > > +    uint64_t command_reg =
> > > > +        *(uint64_t *)&cxl_dstate->mbox_reg_state[A_CXL_DEV_MAILBOX_CMD];
> > > > +
> > > > +    /* Check if we have to do anything */
> > > > +    if (!ARRAY_FIELD_EX32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL,
> > > > +                          DOORBELL)) {
> > > > +        qemu_log_mask(LOG_UNIMP, "Corrupt internal state for firmware\n");
> > > > +        return;
> > > > +    }
> > > > +
> > > > +    uint8_t set = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND_SET);
> > > > +    uint8_t cmd = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND);
> > > > +    uint16_t len = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, LENGTH);
> > > > +    cxl_cmd = &cxl_cmd_set[set][cmd];
> > > > +    h = cxl_cmd->handler;
> > > > +    if (!h) {    
> > > 
> > > This path seems to not convey information it perhaps should.  Maybe some feedback that
> > > a command was requested that isn't registered?
> > >   
> > > > +        goto handled;
> > > > +    }
> > > > +
> > > > +    if (len != cxl_cmd->in) {
> > > > +        ret = CXL_MBOX_INVALID_PAYLOAD_LENGTH;
> > > > +    }
> > > > +
> > > > +    cxl_cmd->payload = cxl_dstate->mbox_reg_state + A_CXL_DEV_CMD_PAYLOAD;
> > > > +    ret = (*h)(cxl_cmd, cxl_dstate, &len);
> > > > +    assert(len <= cxl_dstate->payload_size);
> > > > +
> > > > +handled:
> > > > +    /*
> > > > +     * Set the return code
> > > > +     * XXX: it's a 64b register, but we're not setting the vendor, so we can get
> > > > +     * away with this
> > > > +     */
> > > > +    status_reg = FIELD_DP64(0, CXL_DEV_MAILBOX_STS, ERRNO, ret);
> > > > +
> > > > +    /*
> > > > +     * Set the return length
> > > > +     */
> > > > +    command_reg = FIELD_DP64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND_SET, 0);
> > > > +    command_reg = FIELD_DP64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND, 0);
> > > > +    command_reg = FIELD_DP64(command_reg, CXL_DEV_MAILBOX_CMD, LENGTH, len);
> > > > +
> > > > +    cxl_dstate->mbox_reg_state64[A_CXL_DEV_MAILBOX_CMD / 8] = command_reg;
> > > > +    cxl_dstate->mbox_reg_state64[A_CXL_DEV_MAILBOX_STS / 8] = status_reg;
> > > > +
> > > > +    /* Tell the host we're done */
> > > > +    ARRAY_FIELD_DP32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL,
> > > > +                     DOORBELL, 0);
> > > > +}
> > > > +
> > > > +int cxl_initialize_mailbox(CXLDeviceState *cxl_dstate)
> > > > +{
> > > > +    const char *cel_uuidstr = "0da9c0b5-bf41-4b78-8f79-96b1623b3f17";
> > > > +
> > > > +    for (int i = 0; i < 256; i++) {    
> > > 
> > > Instead of indexing with i and j, perhaps this would be more consistent using
> > > the naming you have above (set and cmd)?
> > > 
> > >   
> > > > +        for (int j = 0; j < 256; j++) {
> > > > +            if (cxl_cmd_set[i][j].handler) {
> > > > +                struct cxl_cmd *c = &cxl_cmd_set[i][j];
> > > > +
> > > > +                cxl_dstate->cel_log[cxl_dstate->cel_size].opcode = (i << 8) | j;
> > > > +                cxl_dstate->cel_log[cxl_dstate->cel_size].effect = c->effect;
> > > > +                cxl_dstate->cel_size++;
> > > > +            }
> > > > +        }
> > > > +    }
> > > > +
> > > > +    return qemu_uuid_parse(cel_uuidstr, &cel_uuid);
> > > > +}
> > > > diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build
> > > > index 47154d6850..0eca715d10 100644
> > > > --- a/hw/cxl/meson.build
> > > > +++ b/hw/cxl/meson.build
> > > > @@ -1,4 +1,5 @@
> > > >  softmmu_ss.add(when: 'CONFIG_CXL', if_true: files(
> > > >    'cxl-component-utils.c',
> > > >    'cxl-device-utils.c',
> > > > +  'cxl-mailbox-utils.c',
> > > >  ))
> > > > diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
> > > > index 23f52c4cf9..362cda40de 100644
> > > > --- a/include/hw/cxl/cxl.h
> > > > +++ b/include/hw/cxl/cxl.h
> > > > @@ -14,5 +14,8 @@
> > > >  #include "cxl_component.h"
> > > >  #include "cxl_device.h"
> > > >  
> > > > +#define COMPONENT_REG_BAR_IDX 0
> > > > +#define DEVICE_REG_BAR_IDX 2    
> > > 
> > > I'd argue for prefixing all defines
> > > 
> > > CXL_COMPONENT_REG_BAR_IDX etc
> > > 
> > > Will make it clear they are generic CXL related things.
> > >   
> > > > +
> > > >  #endif
> > > >  
> > > > diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> > > > index f3bcf19410..af91bec10c 100644
> > > > --- a/include/hw/cxl/cxl_device.h
> > > > +++ b/include/hw/cxl/cxl_device.h
> > > > @@ -80,16 +80,27 @@ typedef struct cxl_device_state {
> > > >      MemoryRegion device_registers;
> > > >  
> > > >      /* mmio for device capabilities array - 8.2.8.2 */
> > > > +    MemoryRegion device;
> > > >      struct {
> > > >          MemoryRegion caps;
> > > >          uint32_t caps_reg_state32[CXL_CAPS_SIZE / 4];
> > > >      };
> > > >  
> > > > -    /* mmio for the device status registers 8.2.8.3 */
> > > > -    MemoryRegion device;
> > > > -    
> > > 
> > > Move this block back to where it was originally introduced rather than
> > > introduce it then move it later?
> > >   
> > > >      /* mmio for the mailbox registers 8.2.8.4 */
> > > > -    MemoryRegion mailbox;
> > > > +    struct {
> > > > +        MemoryRegion mailbox;
> > > > +        uint16_t payload_size;
> > > > +        union {
> > > > +            uint8_t mbox_reg_state[CXL_MAILBOX_REGISTERS_LENGTH];
> > > > +            uint32_t mbox_reg_state32[CXL_MAILBOX_REGISTERS_LENGTH / 4];
> > > > +            uint64_t mbox_reg_state64[CXL_MAILBOX_REGISTERS_LENGTH / 8];
> > > > +        };
> > > > +        struct {
> > > > +            uint16_t opcode;
> > > > +            uint16_t effect;
> > > > +        } cel_log[1 << 16];
> > > > +        size_t cel_size;
> > > > +    };    
> > > 
> > > If the structure is unnamed, chances of a naming clash seem rather high
> > > if you don't prefix all the elements with mbx_ or something like that.
> > >   
> > > >  
> > > >      /* memory region for persistent memory, HDM */
> > > >      MemoryRegion *pmem;
> > > > @@ -135,6 +146,9 @@ CXL_DEVICE_CAPABILITY_HEADER_REGISTER(DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET)
> > > >  CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MAILBOX, CXL_DEVICE_CAP_HDR1_OFFSET + \
> > > >                                                 CXL_DEVICE_CAP_REG_SIZE)
> > > >  
> > > > +int cxl_initialize_mailbox(CXLDeviceState *cxl_dstate);
> > > > +void cxl_process_mailbox(CXLDeviceState *cxl_dstate);
> > > > +
> > > >  #define cxl_device_cap_init(dstate, reg, cap_id)                                   \
> > > >      do {                                                                           \
> > > >          uint32_t *cap_hdrs = dstate->caps_reg_state32;                             \
> > > > @@ -162,6 +176,12 @@ REG32(CXL_DEV_MAILBOX_CTRL, 4)
> > > >      FIELD(CXL_DEV_MAILBOX_CTRL, INT_EN, 1, 1)
> > > >      FIELD(CXL_DEV_MAILBOX_CTRL, BG_INT_EN, 2, 1)
> > > >  
> > > > +/* XXX: actually a 64b register */
> > > > +REG32(CXL_DEV_MAILBOX_CMD, 8)
> > > > +    FIELD(CXL_DEV_MAILBOX_CMD, COMMAND, 0, 8)
> > > > +    FIELD(CXL_DEV_MAILBOX_CMD, COMMAND_SET, 8, 8)
> > > > +    FIELD(CXL_DEV_MAILBOX_CMD, LENGTH, 16, 20)
> > > > +    
> > > 
> > > Ah. this is where this definition went.  Perhaps pull it back into patch 3?
> > > That patch defines plenty of other things that aren't used until later patches
> > > I think, so one more won't hurt and will save me asking why you skipped it:)
> > > 
> > >   
> > > >  /* XXX: actually a 64b register */
> > > >  REG32(CXL_DEV_MAILBOX_STS, 0x10)
> > > >      FIELD(CXL_DEV_MAILBOX_STS, BG_OP, 0, 1)    
> > > 
> > >   
> > 
> >   
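
Two of the review comments quoted above (pass the payload to the handler as a parameter, and give feedback when an unregistered command is requested) can be sketched roughly as follows. This is only an illustration under assumptions: the type names, the `mbox_dispatch` helper, and the return-code values are hypothetical and are not taken from the patch. The early return on a length mismatch is likewise an assumption about the intended behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical return codes; the numeric values are placeholders. */
typedef enum {
    MBOX_SUCCESS = 0,
    MBOX_UNSUPPORTED,
    MBOX_INVALID_PAYLOAD_LENGTH,
} mbox_ret;

/* The payload is an argument, not a field stashed in the command entry. */
typedef mbox_ret (*mbox_handler)(uint8_t *payload, uint16_t *len);

struct mbox_cmd {
    const char *name;
    mbox_handler handler;
    uint16_t in; /* expected input payload length */
};

/* Example handler: returns 8 zeroed bytes. */
static mbox_ret cmd_zero8(uint8_t *payload, uint16_t *len)
{
    *len = 8;
    memset(payload, 0, *len);
    return MBOX_SUCCESS;
}

/* Sparse table indexed by [command set][command]. */
static const struct mbox_cmd mbox_cmds[256][256] = {
    [0x01][0x00] = { "zero8", cmd_zero8, 0 },
};

static mbox_ret mbox_dispatch(uint8_t set, uint8_t cmd,
                              uint8_t *payload, uint16_t *len)
{
    const struct mbox_cmd *c = &mbox_cmds[set][cmd];

    if (!c->handler) {
        /* Unregistered opcode: report it instead of silently succeeding. */
        *len = 0;
        return MBOX_UNSUPPORTED;
    }
    if (*len != c->in) {
        /* Return early so the handler cannot overwrite this error. */
        *len = 0;
        return MBOX_INVALID_PAYLOAD_LENGTH;
    }
    return c->handler(payload, len);
}
```

With this shape the command table can be const, since nothing is written back into an entry at dispatch time.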


^ permalink raw reply	[flat|nested] 117+ messages in thread

* [RFC PATCH] hw/mem/cxl_type3: Go back to subregions
  2021-02-02  0:59 ` Ben Widawsky
                   ` (33 preceding siblings ...)
  (?)
@ 2021-03-11 23:27 ` Ben Widawsky
  -1 siblings, 0 replies; 117+ messages in thread
From: Ben Widawsky @ 2021-03-11 23:27 UTC (permalink / raw)
  To: qemu-devel
  Cc: Igor Mammedov, Ben Widawsky, Chris Browy, Michael S. Tsirkin,
	Jonathan Cameron

Each device allocates its memory (persistent only for now) out of a
container memory that represents a "window" that would be defined by the
host bridge. For example, a host bridge may claim all traffic from 0x0 -
0x4000; it might then also direct 0x1000-0x1fff to a specific CXL
device. Change the memory region type claiming 0x1000-0x1fff from an
alias, to a subregion.

v1 and v2 of the patch series both used a subregion. There were two
issues with this and so for v3, an alias was chosen mimicking nvdimm.

The switch to an alias left an issue in the implementation. There's logic
that checks that two memory regions (i.e. two devices under the same host
bridge) cannot collide. While hardware doesn't make this guarantee, it's
nice for driver debug. There is no clean way to do that with an alias.
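
For illustration, the kind of collision check described above boils down to a half-open interval overlap test. The sketch below is not the QEMU code; `struct range` and `ranges_overlap` are hypothetical names, and it assumes base + size does not wrap around 64 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A half-open address range [base, base + size). */
struct range {
    uint64_t base;
    uint64_t size;
};

/* True if the two ranges share at least one byte. */
static bool ranges_overlap(struct range a, struct range b)
{
    assert(a.size > 0 && b.size > 0);
    return a.base < b.base + b.size && b.base < a.base + a.size;
}
```

Using the example window from earlier in this message, a device at 0x1000-0x1fff does not collide with one at 0x2000-0x2fff, but would collide with one starting at 0x1800.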

More importantly, this change was inspired by implementing support for
both volatile and persistent memory. In that case, you may have multiple
devices to which the BIOS is going to assign address ranges. Since we
are the BIOS in this case, we need a way of finding used space in the
memory window so that the next chunk can be allocated, and that is easily
accomplished with subregions.
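
A minimal sketch of "find used space, then allocate the next chunk", assuming the already-placed regions are known as a sorted, non-overlapping list of offsets inside the window. The names (`used_range`, `find_free_offset`, `NO_SPACE`) are made up for this example and are not part of the patch; the actual code would walk the container's subregions instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NO_SPACE UINT64_MAX

/* An allocated chunk, as an offset and size within the window. */
struct used_range {
    uint64_t base;
    uint64_t size;
};

/*
 * First-fit search: return the lowest offset in [0, window_size) where
 * `size` bytes fit, or NO_SPACE. `used` must be sorted by base, must not
 * overlap, and must lie entirely inside the window.
 */
static uint64_t find_free_offset(const struct used_range *used, size_t n,
                                 uint64_t window_size, uint64_t size)
{
    uint64_t cursor = 0;

    for (size_t i = 0; i < n; i++) {
        /* Gap between the end of the previous chunk and this one. */
        if (used[i].base - cursor >= size) {
            return cursor;
        }
        cursor = used[i].base + used[i].size;
    }
    /* Tail of the window after the last chunk. */
    return (window_size - cursor >= size) ? cursor : NO_SPACE;
}
```

For a 512 MiB window with one 256 MiB device already at offset 0, a second 256 MiB request lands at offset 0x10000000.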

With this, I have the following output from `info mtree`
    0000004c00000000-0000004c1fffffff (prio 0, ram): cxl-mem1
      0000004c00000000-0000004c0fffffff (prio 1, i/o): cxl_type3-memory

And `info memory-devices`
Memory device [cxl]: "cxl-pmem0"
  addr: 0x4c00000000
  slot: 0
  node: 0
  size: 268435456
  memdev: /objects/cxl-mem1
  hotplugged: false
  hotpluggable: true
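
As a cross-check of the two outputs above: the reported addr plus the reported size should give the last byte shown by `info mtree`. The helpers below are hypothetical and only spell out that arithmetic; a subregion's absolute address is its container's base plus its offset within the container.

```c
#include <assert.h>
#include <stdint.h>

/* Absolute address of a subregion placed at `offset` in its container. */
static uint64_t abs_addr(uint64_t container_base, uint64_t offset)
{
    return container_base + offset;
}

/* Last byte covered by a region of `size` bytes starting at `base`. */
static uint64_t last_byte(uint64_t base, uint64_t size)
{
    assert(size > 0);
    return base + size - 1;
}
```

With container base 0x4c00000000, offset 0 and size 268435456 (256 MiB), the last byte is 0x4c0fffffff, matching the cxl_type3-memory line.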

This functionality has been tested with a WIP Linux patch which amounts
to this:
       reg = readl(cxlm->regs.hdm_decoder + CXL_HDM_DECODER_CAP_OFFSET);

       dev_err(dev, "decoder cap:\n"
                    "\tcount: %lu\n",
               FIELD_GET(CXL_HDM_DECODER_COUNT_MASK, reg));

       writel(0x4c, cxlm->regs.hdm_decoder + CXL_HDM_DECODER0_BASE_HIGH_OFFSET);
       writel(0, cxlm->regs.hdm_decoder + CXL_HDM_DECODER0_BASE_LOW_OFFSET);
       writel(0, cxlm->regs.hdm_decoder + CXL_HDM_DECODER0_SIZE_HIGH_OFFSET);
       writel(256 << 20, cxlm->regs.hdm_decoder + CXL_HDM_DECODER0_SIZE_LOW_OFFSET);
       writel(BIT(9),  cxlm->regs.hdm_decoder + CXL_HDM_DECODER0_CTRL_OFFSET);

       tmp = ioremap_uc(0x4c00000000, 4096);
       writel(0x20, tmp);
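
The snippet above programs the decoder base and size as separate high and low 32-bit writes. A small sketch of how those halves combine into the 64-bit values, using a hypothetical helper and ignoring any reserved low bits the real registers may define:

```c
#include <assert.h>
#include <stdint.h>

/* Combine the high and low 32-bit register halves into one 64-bit value. */
static uint64_t join_hi_lo(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}
```

Here join_hi_lo(0x4c, 0) gives the base 0x4c00000000 that is later passed to ioremap_uc(), and join_hi_lo(0, 256 << 20) gives the 256 MiB size.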

Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/mem/cxl_type3.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index bf33ddb915..33991079a6 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -46,7 +46,9 @@ static void build_dvsecs(CXLType3Dev *ct3d)
 static void cxl_set_addr(CXLType3Dev *ct3d, hwaddr addr, Error **errp)
 {
     MemoryDeviceClass *mdc = MEMORY_DEVICE_GET_CLASS(ct3d);
-    mdc->set_addr(MEMORY_DEVICE(ct3d), addr, errp);
+    MemoryRegion *mr = host_memory_backend_get_memory(ct3d->hostmem);
+
+    mdc->set_addr(MEMORY_DEVICE(ct3d), addr - mr->addr, errp);
 }
 
 static void hdm_decoder_commit(CXLType3Dev *ct3d, int which)
@@ -180,13 +182,13 @@ static void cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
 
     memory_region_set_nonvolatile(pmem, true);
     memory_region_set_enabled(pmem, false);
-    memory_region_init_alias(pmem, OBJECT(ct3d), "cxl_type3-memory", mr, 0,
-                             ct3d->size);
+    memory_region_init(pmem, OBJECT(ct3d), "cxl_type3-memory", ct3d->size);
+    memory_region_add_subregion_overlap(mr, 0, pmem, 1);
     ct3d->cxl_dstate.pmem = pmem;
 
 #ifdef SET_PMEM_PADDR
     /* This path will initialize the memory device as if BIOS had done it */
-    cxl_set_addr(ct3d, mr->addr + offset, errp);
+    cxl_set_addr(ct3d, offset, errp);
     memory_region_set_enabled(pmem, true);
 #endif
 }
@@ -246,8 +248,11 @@ static uint64_t cxl_md_get_addr(const MemoryDeviceState *md)
     CXLType3Dev *ct3d = CT3(md);
     MemoryRegion *pmem = ct3d->cxl_dstate.pmem;
 
-    assert(pmem->alias);
-    return pmem->alias_offset;
+    if (pmem) {
+        return pmem->addr + pmem->container->addr;
+    }
+
+    return 0;
 }
 
 static void cxl_md_set_addr(MemoryDeviceState *md, uint64_t addr, Error **errp)
@@ -255,8 +260,6 @@ static void cxl_md_set_addr(MemoryDeviceState *md, uint64_t addr, Error **errp)
     CXLType3Dev *ct3d = CT3(md);
     MemoryRegion *pmem = ct3d->cxl_dstate.pmem;
 
-    assert(pmem->alias);
-    memory_region_set_alias_offset(pmem, addr);
     memory_region_set_address(pmem, addr);
 }
 
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 117+ messages in thread

* Re: [RFC PATCH v3 14/31] acpi/pci: Consolidate host bridge setup
  2021-02-02  0:59   ` Ben Widawsky
@ 2021-12-02 10:32     ` Jonathan Cameron via
  -1 siblings, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2021-12-02 10:32 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: qemu-devel, David Hildenbrand, Vishal Verma,
	John Groves (jgroves),
	Chris Browy, Markus Armbruster, linux-cxl,
	Philippe Mathieu-Daudé,
	Michael S. Tsirkin, Igor Mammedov, Dan Williams, Ira Weiny

On Mon, 1 Feb 2021 16:59:31 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> This cleanup will make it easier to add support for CXL to the mix.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

Hi Ben / all (particularly PCI experts!)

So... I was looking at the large impact this has on needing to update
ACPI tables for the tests and that got me wondering.
The issue is it reorders things a bit rather than making any functional changes.

Why does the pxb expander ACPI not have _ADR set when all other
PNP0A03 / PNP0A08 root bridge ACPI chunks do?

I've messed around with the values provided and dug around in Linux
and other than them being exposed in the sysfs for firmware blobs associated with
/sys/class/pci_bus/devices these particular _ADR entries don't actually seem to
be used by Linux.

So it becomes a question of what the specification says...

As ever clear as mud :)

The PCI firmware spec doesn't say, but has an example with no _ADR entry:
4.5.3 in the PCI Firmware Specification, Revision 3.3.
The ACPI spec has an example with _ADR when describing _SEG:
6.5.6 in ACPI 6.4.

The _ADR description is the address of the device on its parent bus.
I'm not sure a root bridge's parent bus (which is probably the SB) has
any such concept of an address (which probably explains why I've never
seen it set to anything other than 0).

Checking where the _ADR 0 entries came from, they go back to Michael importing
tables from seabios in 2013.

My current feeling is we don't want to risk breaking some user of these values
so perhaps the simple option is just to add _ADR to the PXB ACPI block as well?

That way I think we will cause less churn in the ACPI tables needed for tests
and end up at least consistent in what QEMU presents.
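
One way to picture the suggestion, as a toy model only: if the shared helper emitted _ADR itself, in the position the existing PCI0 blocks already use, every host bridge (PXB included) would carry the same set of names. The types and functions below are stand-ins invented for this sketch; they are not QEMU's AML API, and the exact ordering that minimises table churn is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_NAMES 8

/* Toy stand-in for an ACPI device: just the list of names declared on it. */
struct toy_dev {
    const char *names[MAX_NAMES];
    size_t n;
};

static void toy_append(struct toy_dev *d, const char *name)
{
    assert(d->n < MAX_NAMES);
    d->names[d->n++] = name;
}

static bool toy_has(const struct toy_dev *d, const char *name)
{
    for (size_t i = 0; i < d->n; i++) {
        if (strcmp(d->names[i], name) == 0) {
            return true;
        }
    }
    return false;
}

/* Shared setup that always declares _ADR, for PCI and PCIe bridges alike. */
static void toy_init_host_bridge(struct toy_dev *d, bool express)
{
    toy_append(d, "_HID");
    if (express) {
        toy_append(d, "_CID");
    }
    toy_append(d, "_ADR");
    toy_append(d, "_UID");
    if (express) {
        toy_append(d, "_OSC");
    }
}
```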

Note I'd ideally like to separate this as a precursor to the CXL series rather than
burying it in the middle of that!

Jonathan

> ---
>  hw/i386/acpi-build.c | 31 +++++++++++++++++--------------
>  1 file changed, 17 insertions(+), 14 deletions(-)
> 
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index f56d699c7f..cf6eb54c22 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -1194,6 +1194,20 @@ static void build_smb0(Aml *table, I2CBus *smbus, int devnr, int func)
>      aml_append(table, scope);
>  }
>  
> +enum { PCI, PCIE };
> +static void init_pci_acpi(Aml *dev, int uid, int type)
> +{
> +    if (type == PCI) {
> +        aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
> +        aml_append(dev, aml_name_decl("_UID", aml_int(uid)));
> +    } else {
> +        aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
> +        aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
> +        aml_append(dev, aml_name_decl("_UID", aml_int(uid)));
> +        aml_append(dev, build_q35_osc_method());
> +    }
> +}
> +
>  static void
>  build_dsdt(GArray *table_data, BIOSLinker *linker,
>             AcpiPmInfo *pm, AcpiMiscInfo *misc,
> @@ -1222,9 +1236,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>      if (misc->is_piix4) {
>          sb_scope = aml_scope("_SB");
>          dev = aml_device("PCI0");
> -        aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
> +        init_pci_acpi(dev, 0, PCI);
>          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> -        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>          aml_append(sb_scope, dev);
>          aml_append(dsdt, sb_scope);
>  
> @@ -1238,11 +1251,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>      } else {
>          sb_scope = aml_scope("_SB");
>          dev = aml_device("PCI0");
> -        aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
> -        aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
> +        init_pci_acpi(dev, 0, PCIE);
>          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> -        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
> -        aml_append(dev, build_q35_osc_method());
>          aml_append(sb_scope, dev);
>  
>          if (pm->smi_on_cpuhp) {
> @@ -1345,15 +1355,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>  
>              scope = aml_scope("\\_SB");
>              dev = aml_device("PC%.02X", bus_num);
> -            aml_append(dev, aml_name_decl("_UID", aml_int(bus_num)));
>              aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num)));
> -            if (pci_bus_is_express(bus)) {
> -                aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
> -                aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
> -                aml_append(dev, build_q35_osc_method());
> -            } else {
> -                aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
> -            }
> +            init_pci_acpi(dev, bus_num, pci_bus_is_express(bus) ? PCIE : PCI);
>  
>              if (numa_node != NUMA_NODE_UNASSIGNED) {
>                  aml_append(dev, aml_name_decl("_PXM", aml_int(numa_node)));


^ permalink raw reply	[flat|nested] 117+ messages in thread

end of thread, other threads:[~2021-12-02 10:33 UTC | newest]

Thread overview: 117+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-02  0:59 [RFC PATCH v3 00/31] CXL 2.0 Support Ben Widawsky
2021-02-02  0:59 ` Ben Widawsky
2021-02-02  0:59 ` [RFC PATCH v3 01/31] hw/pci/cxl: Add a CXL component type (interface) Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02  0:59 ` [RFC PATCH v3 02/31] hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5) Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02 11:48   ` Jonathan Cameron
2021-02-02 11:48     ` Jonathan Cameron
2021-02-17 18:36     ` Ben Widawsky
2021-02-11 17:08   ` Jonathan Cameron
2021-02-11 17:08     ` Jonathan Cameron
2021-02-17 16:40     ` Ben Widawsky
2021-02-02  0:59 ` [RFC PATCH v3 03/31] hw/cxl/device: Introduce a CXL device (8.2.8) Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02 12:03   ` Jonathan Cameron
2021-02-02 12:03     ` Jonathan Cameron
2021-02-02  0:59 ` [RFC PATCH v3 04/31] hw/cxl/device: Implement the CAP array (8.2.8.1-2) Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02 12:23   ` Jonathan Cameron
2021-02-02 12:23     ` Jonathan Cameron
2021-02-17 22:15     ` Ben Widawsky
2021-02-02  0:59 ` [RFC PATCH v3 05/31] hw/cxl/device: Implement basic mailbox (8.2.8.4) Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02 14:58   ` Jonathan Cameron
2021-02-02 14:58     ` Jonathan Cameron
2021-02-11 17:46     ` Jonathan Cameron
2021-02-18  0:55       ` Ben Widawsky
2021-02-18 16:50         ` Jonathan Cameron
2021-02-11 18:09   ` Jonathan Cameron
2021-02-11 18:09     ` Jonathan Cameron
2021-02-02  0:59 ` [RFC PATCH v3 06/31] hw/cxl/device: Add memory device utilities Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02  0:59 ` [RFC PATCH v3 07/31] hw/cxl/device: Add cheap EVENTS implementation (8.2.9.1) Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02 13:44   ` Jonathan Cameron
2021-02-02 13:44     ` Jonathan Cameron
2021-02-11 17:59   ` Jonathan Cameron
2021-02-11 17:59     ` Jonathan Cameron
2021-02-02  0:59 ` [RFC PATCH v3 08/31] hw/cxl/device: Timestamp implementation (8.2.9.3) Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02  0:59 ` [RFC PATCH v3 09/31] hw/cxl/device: Add log commands (8.2.9.4) + CEL Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02  0:59 ` [RFC PATCH v3 10/31] hw/pxb: Use a type for realizing expanders Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02 13:50   ` Jonathan Cameron
2021-02-02 13:50     ` Jonathan Cameron
2021-02-02  0:59 ` [RFC PATCH v3 11/31] hw/pci/cxl: Create a CXL bus type Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02  0:59 ` [RFC PATCH v3 12/31] hw/pxb: Allow creation of a CXL PXB (host bridge) Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02  0:59 ` [RFC PATCH v3 13/31] qtest: allow DSDT acpi table changes Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02  0:59 ` [RFC PATCH v3 14/31] acpi/pci: Consolidate host bridge setup Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02 13:56   ` Jonathan Cameron
2021-02-02 13:56     ` Jonathan Cameron
2021-12-02 10:32   ` Jonathan Cameron
2021-12-02 10:32     ` Jonathan Cameron via
2021-02-02  0:59 ` [RFC PATCH v3 15/31] tests/acpi: remove stale allowed tables Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02  0:59 ` [RFC PATCH v3 16/31] hw/pci: Plumb _UID through host bridges Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02 15:00   ` Jonathan Cameron
2021-02-02 15:00     ` Jonathan Cameron
2021-02-02 15:24     ` Michael S. Tsirkin
2021-02-02 15:24       ` Michael S. Tsirkin
2021-02-02 15:42       ` Ben Widawsky
2021-02-02 15:42         ` Ben Widawsky
2021-02-02 15:51         ` Michael S. Tsirkin
2021-02-02 15:51           ` Michael S. Tsirkin
2021-02-02 16:20           ` Ben Widawsky
2021-02-02 16:20             ` Ben Widawsky
2021-02-02  0:59 ` [RFC PATCH v3 17/31] hw/cxl/component: Implement host bridge MMIO (8.2.5, table 142) Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02 19:21   ` Jonathan Cameron
2021-02-02 19:21     ` Jonathan Cameron
2021-02-02 19:45     ` Ben Widawsky
2021-02-02 20:43       ` Jonathan Cameron
2021-02-02 21:03         ` Ben Widawsky
2021-02-02 22:06           ` Jonathan Cameron
2021-02-02  0:59 ` [RFC PATCH v3 18/31] acpi/pxb/cxl: Reserve host bridge MMIO Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02  0:59 ` [RFC PATCH v3 19/31] hw/pxb/cxl: Add "windows" for host bridges Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02  0:59 ` [RFC PATCH v3 20/31] hw/cxl/rp: Add a root port Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02  0:59 ` [RFC PATCH v3 21/31] hw/cxl/device: Add a memory device (8.2.8.5) Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02 14:26   ` Eric Blake
2021-02-02 15:06     ` Ben Widawsky
2021-02-02 15:06       ` Ben Widawsky
2021-02-02  0:59 ` [RFC PATCH v3 22/31] hw/cxl/device: Implement MMIO HDM decoding (8.2.5.12) Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02  0:59 ` [RFC PATCH v3 23/31] acpi/cxl: Add _OSC implementation (9.14.2) Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02  0:59 ` [RFC PATCH v3 24/31] tests/acpi: allow CEDT table addition Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02  0:59 ` [RFC PATCH v3 25/31] acpi/cxl: Create the CEDT (9.14.1) Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02  0:59 ` [RFC PATCH v3 26/31] tests/acpi: Add new CEDT files Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02  0:59 ` [RFC PATCH v3 27/31] hw/cxl/device: Add some trivial commands Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02  0:59 ` [RFC PATCH v3 28/31] hw/cxl/device: Plumb real LSA sizing Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02  0:59 ` [RFC PATCH v3 29/31] hw/cxl/device: Implement get/set LSA Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02  0:59 ` [RFC PATCH v3 30/31] qtest/cxl: Add very basic sanity tests Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02  0:59 ` [RFC PATCH v3 31/31] WIP: i386/cxl: Initialize a host bridge Ben Widawsky
2021-02-02  0:59   ` Ben Widawsky
2021-02-02  1:33 ` [RFC PATCH v3 00/31] CXL 2.0 Support no-reply
2021-02-02  1:33   ` no-reply
2021-02-03 17:42 ` Ben Widawsky
2021-02-11 18:51   ` Jonathan Cameron
2021-02-11 18:51     ` Jonathan Cameron
2021-03-11 23:27 ` [RFC PATCH] hw/mem/cxl_type3: Go back to subregions Ben Widawsky
