All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ben Widawsky <ben.widawsky@intel.com>
To: linux-cxl@vger.kernel.org, nvdimm@lists.linux.dev
Cc: patches@lists.linux.dev, Ben Widawsky <ben.widawsky@intel.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Alison Schofield <alison.schofield@intel.com>,
	Ira Weiny <ira.weiny@intel.com>,
	Jonathan Cameron <Jonathan.Cameron@huawei.com>,
	Vishal Verma <vishal.l.verma@intel.com>
Subject: [RFC PATCH 05/15] cxl/acpi: Reserve CXL resources from request_free_mem_region
Date: Wed, 13 Apr 2022 11:37:10 -0700	[thread overview]
Message-ID: <20220413183720.2444089-6-ben.widawsky@intel.com> (raw)
In-Reply-To: <20220413183720.2444089-1-ben.widawsky@intel.com>

Define an API which allows CXL drivers to manage CXL address space.
CXL is unique in that the address space and various properties are only
known after CXL drivers come up, and therefore cannot be part of core
memory enumeration.

Compute Express Link 2.0 [ECN] defines a concept called CXL Fixed Memory
Window Structures (CFMWS). Each CFMWS conveys a region of host physical
address (HPA) space which has certain properties that are familiar to
CXL, mainly interleave properties, and restrictions, such as
persistence. The HPA ranges therefore should be owned, or at least
guided by the relevant CXL driver, cxl_acpi [1].

It would be desirable to simply insert this address space into
iomem_resource with a new flag to denote this is CXL memory. This would
permit request_free_mem_region() to be reused for CXL memory provided it
learned some new tricks. For that, it is tempting to simply use
insert_resource(). The API was designed specifically for cases where new
devices may offer new address space. This cannot work in the general
case. Boot firmware can pass, some, none, or all of the CFMWS range as
various types of memory to the kernel, and this may be left alone,
merged, or even expanded. As a result iomem_resource may intersect CFMWS
regions in ways insert_resource cannot handle [2]. Similar reasoning
applies to allocate_resource().

With the insert_resource option out, the only reasonable approach left
is to let the CXL driver manage the address space independently of
iomem_resource and attempt to prevent users of device private memory
APIs from using CXL memory. In the case where cxl_acpi comes up first,
the new API allows cxl to block use of any CFMWS defined address space
by assuming everything above the highest CFMWS entry is fair game. It is
expected that this effectively will prevent usage of device private
memory, but if such behavior is undesired, cxl_acpi can be blocked from
loading, or unloaded. When device private memory is used before CXL
comes up, or, there are intersections as described above, the CXL driver
will have to make sure to not reuse sysram that is BUSY.

[1]: The specification defines enumeration via ACPI, however, one could
envision devicetree, or some other hardcoded mechanisms for doing the
same thing.

[2]: A common way to hit this case is when BIOS creates a volatile
region with extra space for hotplug. In this case, you're likely to have

|<--------------HPA space---------------------->|
|<---iomem_resource -->|
| DDR  | CXL Volatile  |
|      | CFMWS for volatile w/ hotplug |

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/acpi.c     | 26 ++++++++++++++++++++++++++
 include/linux/ioport.h |  1 +
 kernel/resource.c      | 11 ++++++++++-
 3 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index 9b69955b90cb..0870904fe4b5 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -76,6 +76,7 @@ static int cxl_acpi_cfmws_verify(struct device *dev,
 struct cxl_cfmws_context {
 	struct device *dev;
 	struct cxl_port *root_port;
+	struct acpi_cedt_cfmws *high_cfmws;
 };
 
 static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
@@ -126,6 +127,14 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
 			cfmws->base_hpa + cfmws->window_size - 1);
 		return 0;
 	}
+
+	if (ctx->high_cfmws) {
+		if (cfmws->base_hpa > ctx->high_cfmws->base_hpa)
+			ctx->high_cfmws = cfmws;
+	} else {
+		ctx->high_cfmws = cfmws;
+	}
+
 	dev_dbg(dev, "add: %s node: %d range %#llx-%#llx\n",
 		dev_name(&cxld->dev), phys_to_target_node(cxld->range.start),
 		cfmws->base_hpa, cfmws->base_hpa + cfmws->window_size - 1);
@@ -299,6 +308,7 @@ static int cxl_acpi_probe(struct platform_device *pdev)
 	ctx = (struct cxl_cfmws_context) {
 		.dev = host,
 		.root_port = root_port,
+		.high_cfmws = NULL,
 	};
 	acpi_table_parse_cedt(ACPI_CEDT_TYPE_CFMWS, cxl_parse_cfmws, &ctx);
 
@@ -317,10 +327,25 @@ static int cxl_acpi_probe(struct platform_device *pdev)
 	if (rc < 0)
 		return rc;
 
+	if (ctx.high_cfmws) {
+		resource_size_t end =
+			ctx.high_cfmws->base_hpa + ctx.high_cfmws->window_size;
+		dev_dbg(host,
+			"Disabling free device private regions below %#llx\n",
+			end);
+		set_request_free_min_base(end);
+	}
+
 	/* In case PCI is scanned before ACPI re-trigger memdev attach */
 	return cxl_bus_rescan();
 }
 
+static int cxl_acpi_remove(struct platform_device *pdev)
+{
+	set_request_free_min_base(0);
+	return 0;
+}
+
 static const struct acpi_device_id cxl_acpi_ids[] = {
 	{ "ACPI0017" },
 	{ },
@@ -329,6 +354,7 @@ MODULE_DEVICE_TABLE(acpi, cxl_acpi_ids);
 
 static struct platform_driver cxl_acpi_driver = {
 	.probe = cxl_acpi_probe,
+	.remove = cxl_acpi_remove,
 	.driver = {
 		.name = KBUILD_MODNAME,
 		.acpi_match_table = cxl_acpi_ids,
diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index ec5f71f7135b..dc41e4be5635 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -325,6 +325,7 @@ extern int
 walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start, u64 end,
 		    void *arg, int (*func)(struct resource *, void *));
 
+void set_request_free_min_base(resource_size_t val);
 struct resource *devm_request_free_mem_region(struct device *dev,
 		struct resource *base, unsigned long size);
 struct resource *request_free_mem_region(struct resource *base,
diff --git a/kernel/resource.c b/kernel/resource.c
index 34eaee179689..a4750689e529 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -1774,6 +1774,14 @@ void resource_list_free(struct list_head *head)
 EXPORT_SYMBOL(resource_list_free);
 
 #ifdef CONFIG_DEVICE_PRIVATE
+static resource_size_t request_free_min_base;
+
+void set_request_free_min_base(resource_size_t val)
+{
+	request_free_min_base = val;
+}
+EXPORT_SYMBOL_GPL(set_request_free_min_base);
+
 static struct resource *__request_free_mem_region(struct device *dev,
 		struct resource *base, unsigned long size, const char *name)
 {
@@ -1799,7 +1807,8 @@ static struct resource *__request_free_mem_region(struct device *dev,
 	}
 
 	write_lock(&resource_lock);
-	for (; addr > size && addr >= base->start; addr -= size) {
+	for (; addr > size && addr >= max(base->start, request_free_min_base);
+	     addr -= size) {
 		if (__region_intersects(addr, size, 0, IORES_DESC_NONE) !=
 				REGION_DISJOINT)
 			continue;
-- 
2.35.1


  parent reply	other threads:[~2022-04-13 18:38 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-13 18:37 [RFC PATCH 00/15] Region driver Ben Widawsky
2022-04-13 18:37 ` [RFC PATCH 01/15] cxl/core: Use is_endpoint_decoder Ben Widawsky
2022-04-13 21:22   ` Dan Williams
     [not found]   ` <CGME20220415205052uscas1p209e03abf95b9c80b2ba1f287c82dfd80@uscas1p2.samsung.com>
2022-04-15 20:50     ` Adam Manzanares
2022-04-13 18:37 ` [RFC PATCH 02/15] cxl/core/hdm: Bail on endpoint init fail Ben Widawsky
2022-04-13 21:31   ` Dan Williams
     [not found]     ` <CGME20220418163713uscas1p17b3b1b45c7d27e54e3ecb62eb8af2469@uscas1p1.samsung.com>
2022-04-18 16:37       ` Adam Manzanares
2022-05-12 15:50         ` Ben Widawsky
2022-05-12 17:27           ` Luis Chamberlain
2022-05-13 12:09             ` Jonathan Cameron
2022-05-13 15:03               ` Dan Williams
2022-05-13 15:12               ` Luis Chamberlain
2022-05-13 19:14                 ` Dan Williams
2022-05-13 19:31                   ` Luis Chamberlain
2022-05-19  5:09                     ` Dan Williams
2022-04-13 18:37 ` [RFC PATCH 03/15] Revert "cxl/core: Convert decoder range to resource" Ben Widawsky
2022-04-13 21:43   ` Dan Williams
2022-05-12 16:09     ` Ben Widawsky
2022-04-13 18:37 ` [RFC PATCH 04/15] cxl/core: Create distinct decoder structs Ben Widawsky
2022-04-15  1:45   ` Dan Williams
2022-04-18 20:43     ` Dan Williams
2022-04-13 18:37 ` Ben Widawsky [this message]
2022-04-18 16:42   ` [RFC PATCH 05/15] cxl/acpi: Reserve CXL resources from request_free_mem_region Dan Williams
2022-04-19 16:43     ` Jason Gunthorpe
2022-04-19 21:50       ` Dan Williams
2022-04-19 21:59         ` Dan Williams
2022-04-19 23:04           ` Jason Gunthorpe
2022-04-20  0:47             ` Dan Williams
2022-04-20 14:34               ` Jason Gunthorpe
2022-04-20 15:32                 ` Dan Williams
2022-04-13 18:37 ` [RFC PATCH 06/15] cxl/acpi: Manage root decoder's address space Ben Widawsky
2022-04-18 22:15   ` Dan Williams
2022-05-12 19:18     ` Ben Widawsky
2022-04-13 18:37 ` [RFC PATCH 07/15] cxl/port: Surface ram and pmem resources Ben Widawsky
2022-04-13 18:37 ` [RFC PATCH 08/15] cxl/core/hdm: Allocate resources from the media Ben Widawsky
2022-04-13 18:37 ` [RFC PATCH 09/15] cxl/core/port: Add attrs for size and volatility Ben Widawsky
2022-04-13 18:37 ` [RFC PATCH 10/15] cxl/core: Extract IW/IG decoding Ben Widawsky
2022-04-13 18:37 ` [RFC PATCH 11/15] cxl/acpi: Use common " Ben Widawsky
2022-04-13 18:37 ` [RFC PATCH 12/15] cxl/region: Add region creation ABI Ben Widawsky
2022-05-04 22:56   ` Verma, Vishal L
2022-05-05  5:17     ` Dan Williams
2022-05-12 15:54       ` Ben Widawsky
2022-04-13 18:37 ` [RFC PATCH 13/15] cxl/core/port: Add attrs for root ways & granularity Ben Widawsky
2022-04-13 18:37 ` [RFC PATCH 14/15] cxl/region: Introduce configuration Ben Widawsky
2022-04-13 18:37 ` [RFC PATCH 15/15] cxl/region: Introduce a cxl_region driver Ben Widawsky
2022-05-20 16:23 ` [RFC PATCH 00/15] Region driver Jonathan Cameron
2022-05-20 16:41   ` Dan Williams
2022-05-31 12:21     ` Jonathan Cameron
2022-06-23  5:40       ` Dan Williams
2022-06-23 15:08         ` Jonathan Cameron
2022-06-23 17:33           ` Dan Williams
2022-06-23 23:44             ` Dan Williams
2022-06-24  9:08             ` Jonathan Cameron

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220413183720.2444089-6-ben.widawsky@intel.com \
    --to=ben.widawsky@intel.com \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=alison.schofield@intel.com \
    --cc=dan.j.williams@intel.com \
    --cc=ira.weiny@intel.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=nvdimm@lists.linux.dev \
    --cc=patches@lists.linux.dev \
    --cc=vishal.l.verma@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.