From: Dan Williams <dan.j.williams@intel.com>
To: linux-cxl@vger.kernel.org
Cc: David Hildenbrand <david@redhat.com>,
Tony Luck <tony.luck@intel.com>, Jason Gunthorpe <jgg@nvidia.com>,
Ben Widawsky <bwidawsk@kernel.org>,
Jonathan Cameron <Jonathan.Cameron@huawei.com>,
Christoph Hellwig <hch@lst.de>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Matthew Wilcox <willy@infradead.org>,
Andrew Morton <akpm@linux-foundation.org>,
nvdimm@lists.linux.dev, linux-pci@vger.kernel.org
Subject: [PATCH v2 00/28] CXL PMEM Region Provisioning
Date: Thu, 14 Jul 2022 17:00:41 -0700 [thread overview]
Message-ID: <165784324066.1758207.15025479284039479071.stgit@dwillia2-xfh.jf.intel.com> (raw)
Changes since v1 [1]:
- Move 19 patches that have received a Reviewed-by to the 'pending'
branch in cxl.git (Thanks Alison, Adam, and Jonathan!)
- Improve the changelog and add more Cc's to "cxl/acpi: Track CXL
resources in iomem_resource" and highlight the new export of
insert_resource_expand_to_fit()
- Switch all occurrences of the pattern "rc = -ECODE; if (condition)
goto err;" to "if (condition) { rc = -ECODE; goto err; }" (Jonathan)
- Re-organize all the cxl_{root,switch,endpoint}_decoder() patches to
move the decoder-type-specific setup into the decoder-type-specific
allocation routines (Jonathan)
- Add kdoc to clarify the behavior of add_cxl_resources() (Jonathan)
- Add IORES_DESC_CXL for kernel components like EDAC to determine when
they might be dealing with a CXL address range (Tony)
- Drop usage of dev_set_drvdata() for passing @cxl_res (Jonathan)
- Drop @remove_action argument to __cxl_dpa_release(), make it behave
like any other devm_<free> helper (Jonathan)
- Clarify 'skip' vs 'skipped' in DPA handling helpers (Jonathan)
- Clarify why port teardown no proceeds under the lock with the
conversion from list to xarray (Jonathan)
- Revert rename of cxl_find_dport_by_dev() (Jonathan)
- Fold down_read() / up_write() mismatch fix to the patch that
introduced the problem (Jonathan)
- Fix description of interleave_ways and interleave_granularity in the
sysfs ABI document
- Clarify tangential cleanups in "resource: Introduce
alloc_free_mem_region()" (Jonathan)
- Clarify rationale for the region creation / naming ABI (Jonathan)
- Add SET_CXL_REGION_ATTR() to supplement CXL_REGION_ATTR() the former
is used to optionally added region attributes to an attribute list
(position independent) and the latter is used to retrieve a pointer to
the attribute in code. (Jonathan)
- For writes to region attributes allow the same value to be written
multiple times without error (Jonathan)
- Clarify the actions performed by cxl_port_attach_region() (Jonathan)
- Commit message spelling fixes (Alison and Jonathan)
- Rename cxl_dpa_resource() => cxl_dpa_resource_start() (Jonathan)
- Reword error message in cxl_parse_cfmws() (Adam)
- Keep @expected_len signed in cxl_acpi_cfmws_verify() (Jonathan)
- Miscellaneous formatting and doc fixes (Jonathan)
- Rename port->dpa_end port->hdm_end (Jonathan)
- Rename unregister_region() => unregister_nvdimm_region() (Jonathan)
[1]: https://lore.kernel.org/linux-cxl/165603869943.551046.3498980330327696732.stgit@dwillia2-xfh
---
Until the CXL 2.0 definition arrived there was little reason for OS
drivers to care about CXL memory expanders. Similar to DDR they just
implemented a physical address range that was described to the OS by
platform firmware (EFI Memory Map + ACPI SRAT/SLIT/HMAT etc). The CXL
2.0 definition adds support for PMEM, hotplug, switch topologies, and
device-interleaving which exceeds the limits of what can be reasonably
abstracted by EFI + ACPI mechanisms. As a result, Linux needs a native
capability to provision new CXL regions.
The term "region" is the same term that originated in the LIBNVDIMM
implementation to describe a host physical / system physical address
range. For PMEM a region is a persistent memory range that can be
further sub-divided into namespaces. For CXL there are three
classifications of regions:
- PMEM: set up by CXL native tooling and persisted in CXL region labels
- RAM: set up dynamically by CXL native tooling after hotplug events, or
leftover capacity not mapped by platform firmware. Any persistent
configuration would come from set up scripts / configuration files in
userspace.
- System RAM: set up by platform firmware and described by EFI + ACPI
metadata, these regions are static.
For now, these patches implement just PMEM regions without region label
support. Note though that the infrastructure routines like
cxl_region_attach() and cxl_region_setup_targets() are building blocks
for region-label support, provisioning RAM regions, and enumerating
System RAM regions.
The general flow for provisioning a CXL region is to:
- Find a device or set of devices with available device-physical-address
(DPA) capacity
- Find a platform CXL window that has free capacity to map a new region
and that is able to target the devices in the previous step.
- Allocate DPA according to the CXL specification rules of sequential
enabling of decoders by id and when a device hosts multiple decoders
make sure that lower-id decoders map lower HPA and higher-id decoders
map higher HPA.
- Assign endpoint decoders to a region and validate that the switching
topology supports the requested configuration. Recall that
interleaving is governed by modulo or xormap math that constrains which
device can support which positions in a given region interleave.
- Program all the decoders an all endpoints and participating switches
to bring the new address range online.
Once the range is online then existing drivers like LIBNVDIMM or
device-dax can manage the memory range as if the ACPI BIOS had conveyed
its parameters at boot.
This patch kit is the result of significant amounts of path finding work
[2] and long discussions with Ben. Thank you Ben for all that work!
Where the patches in this kit go in a different design direction than
the RFC, the authorship is changed and a Co-developed-by is added mainly
so I get blamed for the bad decisions and not Ben. The major updates
from that last posting are:
- all CXL resources are reflected in full in iomem_resource
- host-physical-address (HPA) range allocation moves to a
devm_request_free_mem_region() derivative
- locking moves to two global rwsems, one for DPA / endpoint decoders
and one for HPA / regions.
- the existing port scanning path is augmented to cache more topology
information rather than recreate it at region creation time
[2]: https://lore.kernel.org/r/20220413183720.2444089-1-ben.widawsky@intel.com
---
Ben Widawsky (4):
cxl/hdm: Add sysfs attributes for interleave ways + granularity
cxl/region: Add region creation support
cxl/region: Add a 'uuid' attribute
cxl/region: Add interleave geometry attributes
Dan Williams (24):
Documentation/cxl: Use a double line break between entries
cxl/core: Define a 'struct cxl_switch_decoder'
cxl/acpi: Track CXL resources in iomem_resource
cxl/core: Define a 'struct cxl_root_decoder'
cxl/core: Define a 'struct cxl_endpoint_decoder'
cxl/hdm: Enumerate allocated DPA
cxl/hdm: Add 'mode' attribute to decoder objects
cxl/hdm: Track next decoder to allocate
cxl/hdm: Add support for allocating DPA to an endpoint decoder
cxl/port: Record dport in endpoint references
cxl/port: Record parent dport when adding ports
cxl/port: Move 'cxl_ep' references to an xarray per port
cxl/port: Move dport tracking to an xarray
cxl/mem: Enumerate port targets before adding endpoints
resource: Introduce alloc_free_mem_region()
cxl/region: Allocate HPA capacity to regions
cxl/region: Enable the assignment of endpoint decoders to regions
cxl/acpi: Add a host-bridge index lookup mechanism
cxl/region: Attach endpoint decoders
cxl/region: Program target lists
cxl/hdm: Commit decoder state to hardware
cxl/region: Add region driver boiler plate
cxl/pmem: Fix offline_nvdimm_bus() to offline by bridge
cxl/region: Introduce cxl_pmem_region objects
Documentation/ABI/testing/sysfs-bus-cxl | 213 +++
Documentation/driver-api/cxl/memory-devices.rst | 11
drivers/cxl/Kconfig | 8
drivers/cxl/acpi.c | 185 ++
drivers/cxl/core/Makefile | 1
drivers/cxl/core/core.h | 49 +
drivers/cxl/core/hdm.c | 623 +++++++-
drivers/cxl/core/pmem.c | 4
drivers/cxl/core/port.c | 669 ++++++--
drivers/cxl/core/region.c | 1830 +++++++++++++++++++++++
drivers/cxl/cxl.h | 263 +++
drivers/cxl/cxlmem.h | 18
drivers/cxl/mem.c | 32
drivers/cxl/pmem.c | 259 +++
drivers/nvdimm/region_devs.c | 28
include/linux/ioport.h | 3
include/linux/libnvdimm.h | 5
kernel/resource.c | 185 ++
mm/Kconfig | 5
tools/testing/cxl/Kbuild | 1
tools/testing/cxl/test/cxl.c | 75 +
21 files changed, 4156 insertions(+), 311 deletions(-)
create mode 100644 drivers/cxl/core/region.c
base-commit: b060edfd8cdd52bc8648392500bf152a8dd6d4c5
next reply other threads:[~2022-07-15 0:00 UTC|newest]
Thread overview: 67+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-07-15 0:00 Dan Williams [this message]
2022-07-15 0:00 ` [PATCH v2 01/28] Documentation/cxl: Use a double line break between entries Dan Williams
2022-07-20 13:26 ` Jonathan Cameron
2022-07-15 0:00 ` [PATCH v2 02/28] cxl/core: Define a 'struct cxl_switch_decoder' Dan Williams
2022-07-15 2:57 ` kernel test robot
2022-07-20 15:39 ` Jonathan Cameron
2022-07-15 0:00 ` [PATCH v2 03/28] cxl/acpi: Track CXL resources in iomem_resource Dan Williams
2022-07-15 5:23 ` Greg Kroah-Hartman
2022-07-20 16:03 ` Jonathan Cameron
2022-07-15 0:01 ` [PATCH v2 04/28] cxl/core: Define a 'struct cxl_root_decoder' Dan Williams
2022-07-20 16:07 ` Jonathan Cameron
2022-07-15 0:01 ` [PATCH v2 05/28] cxl/core: Define a 'struct cxl_endpoint_decoder' Dan Williams
2022-07-20 16:11 ` Jonathan Cameron
2022-07-15 0:01 ` [PATCH v2 06/28] cxl/hdm: Enumerate allocated DPA Dan Williams
2022-07-20 16:40 ` Jonathan Cameron
2022-07-21 15:29 ` Dan Williams
2022-07-15 0:01 ` [PATCH v2 07/28] cxl/hdm: Add 'mode' attribute to decoder objects Dan Williams
2022-07-15 0:01 ` [PATCH v2 08/28] cxl/hdm: Track next decoder to allocate Dan Williams
2022-07-20 16:45 ` Jonathan Cameron
2022-07-15 0:01 ` [PATCH v2 09/28] cxl/hdm: Add support for allocating DPA to an endpoint decoder Dan Williams
2022-07-20 16:51 ` Jonathan Cameron
2022-07-15 0:01 ` [PATCH v2 10/28] cxl/port: Record dport in endpoint references Dan Williams
2022-07-20 16:53 ` Jonathan Cameron
2022-07-15 0:01 ` [PATCH v2 11/28] cxl/port: Record parent dport when adding ports Dan Williams
2022-07-15 0:01 ` [PATCH v2 12/28] cxl/port: Move 'cxl_ep' references to an xarray per port Dan Williams
2022-07-15 0:01 ` [PATCH v2 13/28] cxl/port: Move dport tracking to an xarray Dan Williams
2022-07-20 16:56 ` Jonathan Cameron
2022-07-15 0:02 ` [PATCH v2 14/28] cxl/hdm: Add sysfs attributes for interleave ways + granularity Dan Williams
2022-07-20 16:58 ` Jonathan Cameron
2022-07-15 0:02 ` [PATCH v2 15/28] cxl/mem: Enumerate port targets before adding endpoints Dan Williams
2022-07-15 0:02 ` [PATCH v2 16/28] resource: Introduce alloc_free_mem_region() Dan Williams
2022-07-20 17:00 ` Jonathan Cameron
2022-07-21 16:10 ` Dan Williams
2022-07-21 16:10 ` Dan Williams
2022-07-21 16:10 ` [Nouveau] " Dan Williams
2022-09-06 13:25 ` Rogerio Alves
2022-07-15 0:02 ` [PATCH v2 17/28] cxl/region: Add region creation support Dan Williams
2022-07-20 17:16 ` Jonathan Cameron
2022-07-15 0:02 ` [PATCH v2 18/28] cxl/region: Add a 'uuid' attribute Dan Williams
2022-07-20 17:18 ` Jonathan Cameron
2022-07-15 0:02 ` [PATCH v2 19/28] cxl/region: Add interleave geometry attributes Dan Williams
2022-07-15 0:02 ` [PATCH v2 20/28] cxl/region: Allocate HPA capacity to regions Dan Williams
2022-07-20 17:20 ` Jonathan Cameron
2022-07-15 0:02 ` [PATCH v2 21/28] cxl/region: Enable the assignment of endpoint decoders " Dan Williams
2022-07-15 3:28 ` kernel test robot
2022-07-20 17:26 ` Jonathan Cameron
2022-07-20 19:05 ` Dan Williams
2022-07-15 0:02 ` [PATCH v2 22/28] cxl/acpi: Add a host-bridge index lookup mechanism Dan Williams
2022-07-15 0:02 ` [PATCH v2 23/28] cxl/region: Attach endpoint decoders Dan Williams
2022-07-20 17:29 ` Jonathan Cameron
2022-07-15 0:02 ` [PATCH v2 24/28] cxl/region: Program target lists Dan Williams
2022-07-20 17:41 ` Jonathan Cameron
2022-07-21 16:56 ` Dan Williams
2022-07-15 0:03 ` [PATCH v2 25/28] cxl/hdm: Commit decoder state to hardware Dan Williams
2022-07-20 17:44 ` Jonathan Cameron
2022-07-15 0:03 ` [PATCH v2 26/28] cxl/region: Add region driver boiler plate Dan Williams
2022-07-15 0:03 ` [PATCH v2 27/28] cxl/pmem: Fix offline_nvdimm_bus() to offline by bridge Dan Williams
2022-07-20 17:46 ` Jonathan Cameron
2022-07-15 0:03 ` [PATCH v2 28/28] cxl/region: Introduce cxl_pmem_region objects Dan Williams
2022-07-20 18:05 ` Jonathan Cameron
2022-07-20 18:12 ` [PATCH v2 00/28] CXL PMEM Region Provisioning Jonathan Cameron
2022-07-21 18:34 ` Dan Williams
2022-07-21 14:59 ` Jonathan Cameron
2022-07-21 16:29 ` Dan Williams
2022-07-21 17:22 ` Jonathan Cameron
2022-07-16 19:55 [PATCH v2 21/28] cxl/region: Enable the assignment of endpoint decoders to regions kernel test robot
2022-07-18 11:32 ` Dan Carpenter
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=165784324066.1758207.15025479284039479071.stgit@dwillia2-xfh.jf.intel.com \
--to=dan.j.williams@intel.com \
--cc=Jonathan.Cameron@huawei.com \
--cc=akpm@linux-foundation.org \
--cc=bwidawsk@kernel.org \
--cc=david@redhat.com \
--cc=gregkh@linuxfoundation.org \
--cc=hch@lst.de \
--cc=jgg@nvidia.com \
--cc=linux-cxl@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=nvdimm@lists.linux.dev \
--cc=tony.luck@intel.com \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.