linux-cxl.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/6] Region creation
@ 2021-06-17 17:36 Ben Widawsky
  2021-06-17 17:36 ` [PATCH 1/6] cxl/region: Add region creation ABI Ben Widawsky
                   ` (5 more replies)
  0 siblings, 6 replies; 20+ messages in thread
From: Ben Widawsky @ 2021-06-17 17:36 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

CXL interleave sets and non-interleave sets are described via regions. A region
is specified in the CXL 2.0 specification and the purpose is to create a
standardized way to preserve the region across reboots.

Introduced here is the basic mechanism to create configure and delete a CXL
region. Configuring a region simply means giving it a size, UUID, and a target
list. Enabling/activating a region is comprised of three parts:
1. Validate and find space for the region in the CFMWS (included here)
2. Program the HDM decoders (WIP)
3. Adding the region label to the LSA (Dan WIP)

Here is a sample topology

    decoder1.0
    ├── create_region
    ├── delete_region
    ├── devtype
    ├── locked
    ├── region1.0:0
    │   ├── offset
    │   ├── size
    │   ├── subsystem -> ../../../../../../../bus/cxl
    │   ├── target0
    │   ├── uevent
    │   ├── uuid
    │   └── verify
    ├── size
    ├── start
    ├── subsystem -> ../../../../../../bus/cxl
    ├── target_list
    ├── target_type
    └── uevent

Next Steps.
Next task is to create a CXL memory driver which will enumerate CXL capable
endpoints. A CXL capable endpoint is similar to the existing cxl_mem structure
except that it's known that there exists a path from a hostbridge down to the
endpoint that is CXL.mem capable. Once that path is established, the HDM decoder
programming can be determined from top of the hierarchy down.

Ben Widawsky (6):
  cxl/region: Add region creation ABI
  cxl: Move cxl_memdev conversion helper to mem.h
  cxl/region: Introduce concept of region configuration
  cxl/region: Introduce a cxl_region driver
  cxl/core: Convert decoder range to resource
  cxl/region: Handle region's address space allocation

 Documentation/ABI/testing/sysfs-bus-cxl       |  53 ++
 .../driver-api/cxl/memory-devices.rst         |  11 +
 drivers/cxl/Makefile                          |   5 +-
 drivers/cxl/core.c                            | 108 +++-
 drivers/cxl/cxl.h                             |  18 +-
 drivers/cxl/mem.h                             |   6 +
 drivers/cxl/pci.c                             |   5 -
 drivers/cxl/region.c                          | 492 ++++++++++++++++++
 drivers/cxl/region.h                          |  45 ++
 drivers/cxl/trace.h                           |  33 ++
 10 files changed, 760 insertions(+), 16 deletions(-)
 create mode 100644 drivers/cxl/region.c
 create mode 100644 drivers/cxl/region.h
 create mode 100644 drivers/cxl/trace.h

-- 
2.32.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/6] cxl/region: Add region creation ABI
  2021-06-17 17:36 [PATCH 0/6] Region creation Ben Widawsky
@ 2021-06-17 17:36 ` Ben Widawsky
  2021-06-17 21:11   ` [PATCH v2 " Ben Widawsky
  2021-06-18  9:13   ` [PATCH " Jonathan Cameron
  2021-06-17 17:36 ` [PATCH 2/6] cxl: Move cxl_memdev conversion helper to mem.h Ben Widawsky
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 20+ messages in thread
From: Ben Widawsky @ 2021-06-17 17:36 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Regions are created as a child of the decoder that encompasses an
address space with constraints. Regions only exist for persistent
capacities.

When regions are created, the number of desired interleave ways must be
known. To enable this, the sysfs attribute will take the desired ways as
input. This interface intentionally allows creation of
impossible-to-enable regions based on interleave constraints in the
topology. The reasoning is to create new regions through the kernel
interfaces which may become possible on reboot under a variety of
circumstances.

As an example, creating a x1 region with:
echo 1 > /sys/bus/cxl/devices/decoder1.0/create_region

Will yield /sys/bus/cxl/devices/decoder1.0/region1.0:0

That region may then be deleted with:
echo region1.0:0 > /sys/bus/cxl/devices/decoder1.0/delete_region

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 Documentation/ABI/testing/sysfs-bus-cxl       |  21 +++
 .../driver-api/cxl/memory-devices.rst         |  11 ++
 drivers/cxl/Makefile                          |   3 +-
 drivers/cxl/core.c                            |  71 +++++++++
 drivers/cxl/cxl.h                             |  11 ++
 drivers/cxl/region.c                          | 147 ++++++++++++++++++
 drivers/cxl/region.h                          |  43 +++++
 7 files changed, 306 insertions(+), 1 deletion(-)
 create mode 100644 drivers/cxl/region.c
 create mode 100644 drivers/cxl/region.h

diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
index 0b6a2e6e8fbb..115a25d2899d 100644
--- a/Documentation/ABI/testing/sysfs-bus-cxl
+++ b/Documentation/ABI/testing/sysfs-bus-cxl
@@ -127,3 +127,24 @@ Description:
 		memory (type-3). The 'target_type' attribute indicates the
 		current setting which may dynamically change based on what
 		memory regions are activated in this decode hierarchy.
+
+What:		/sys/bus/cxl/devices/decoderX.Y/create_region
+Date:		June, 2021
+KernelVersion:	v5.14
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		Creates a new CXL region of N interleaved ways. Writing a value
+		of '2' will create a new uninitialized region with 2x interleave
+		that will be mapped by the CXL decoderX.Y. Reading from this
+		node will return the last created region. Regions must be
+		subsequently configured and bound to a region driver before they
+		can be used.
+
+What:		/sys/bus/cxl/devices/decoderX.Y/delete_region
+Date:		June, 2021
+KernelVersion:	v5.14
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		Deletes the named region. A region must be unbound from the
+		region driver before being deleted. The attributes expects a
+		region in the form "regionX.Y:Z".
diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst
index 487ce4f41d77..c7f59a8c94db 100644
--- a/Documentation/driver-api/cxl/memory-devices.rst
+++ b/Documentation/driver-api/cxl/memory-devices.rst
@@ -39,6 +39,17 @@ CXL Core
 .. kernel-doc:: drivers/cxl/core.c
    :doc: cxl core
 
+CXL Regions
+-----------
+.. kernel-doc:: drivers/cxl/region.c
+   :doc: cxl region
+
+.. kernel-doc:: drivers/cxl/region.h
+   :identifiers:
+
+.. kernel-doc:: drivers/cxl/region.c
+   :identifiers:
+
 External Interfaces
 ===================
 
diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
index 32954059b37b..c3151198c041 100644
--- a/drivers/cxl/Makefile
+++ b/drivers/cxl/Makefile
@@ -1,6 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_CXL_BUS) += cxl_core.o
-obj-$(CONFIG_CXL_MEM) += cxl_pci.o
+obj-$(CONFIG_CXL_MEM) += cxl_pci.o cxl_region.o
 obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
 obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
 
@@ -9,3 +9,4 @@ cxl_core-y := core.o
 cxl_pci-y := pci.o
 cxl_acpi-y := acpi.o
 cxl_pmem-y := pmem.o
+cxl_region-y := region.o
diff --git a/drivers/cxl/core.c b/drivers/cxl/core.c
index a2e4d54fc7bc..d8d7ca85e110 100644
--- a/drivers/cxl/core.c
+++ b/drivers/cxl/core.c
@@ -6,6 +6,7 @@
 #include <linux/pci.h>
 #include <linux/slab.h>
 #include <linux/idr.h>
+#include "region.h"
 #include "cxl.h"
 #include "mem.h"
 
@@ -120,7 +121,68 @@ static ssize_t target_list_show(struct device *dev,
 }
 static DEVICE_ATTR_RO(target_list);
 
+static ssize_t create_region_show(struct device *dev,
+				  struct device_attribute *attr, char *buf)
+{
+	struct cxl_decoder *cxld = to_cxl_decoder(dev);
+	int rc;
+
+	device_lock(dev);
+	rc = sprintf(buf, "%s\n",
+		     cxld->youngest ? dev_name(&cxld->youngest->dev) : "");
+	device_unlock(dev);
+
+	return rc;
+}
+
+static ssize_t create_region_store(struct device *dev,
+				   struct device_attribute *attr,
+				   const char *buf, size_t len)
+{
+	struct cxl_decoder *cxld = to_cxl_decoder(dev);
+	struct cxl_region *region;
+	ssize_t rc;
+	int val;
+
+	rc = kstrtoint(buf, 0, &val);
+	if (rc)
+		return rc;
+	if (val < 0 || val > 16)
+		return -EINVAL;
+
+	region = cxl_alloc_region(cxld, val);
+	if (IS_ERR(region))
+		return PTR_ERR(region);
+
+	rc = cxl_add_region(cxld, region);
+	if (rc) {
+		cxl_free_region(cxld, region);
+		return rc;
+	}
+
+	cxld->youngest = region;
+	return len;
+}
+static DEVICE_ATTR_RW(create_region);
+
+static ssize_t delete_region_store(struct device *dev,
+				   struct device_attribute *attr,
+				   const char *buf, size_t len)
+{
+	struct cxl_decoder *cxld = to_cxl_decoder(dev);
+	int rc;
+
+	rc = cxl_delete_region(cxld, buf);
+	if (rc)
+		return rc;
+
+	return len;
+}
+static DEVICE_ATTR_WO(delete_region);
+
 static struct attribute *cxl_decoder_base_attrs[] = {
+	&dev_attr_create_region.attr,
+	&dev_attr_delete_region.attr,
 	&dev_attr_start.attr,
 	&dev_attr_size.attr,
 	&dev_attr_locked.attr,
@@ -171,7 +233,13 @@ static void cxl_decoder_release(struct device *dev)
 {
 	struct cxl_decoder *cxld = to_cxl_decoder(dev);
 	struct cxl_port *port = to_cxl_port(dev->parent);
+	struct cxl_region *region;
 
+	list_for_each_entry(region, &cxld->regions, list)
+		cxl_delete_region(cxld, dev_name(&region->dev));
+
+	dev_WARN_ONCE(dev, !ida_is_empty(&cxld->region_ida),
+		      "Lost track of a region");
 	ida_free(&port->decoder_ida, cxld->id);
 	kfree(cxld);
 }
@@ -483,8 +551,11 @@ cxl_decoder_alloc(struct cxl_port *port, int nr_targets, resource_size_t base,
 		.interleave_ways = interleave_ways,
 		.interleave_granularity = interleave_granularity,
 		.target_type = type,
+		.regions = LIST_HEAD_INIT(cxld->regions),
 	};
 
+	ida_init(&cxld->region_ida);
+
 	/* handle implied target_list */
 	if (interleave_ways == 1)
 		cxld->target[0] =
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index b6bda39a59e3..8b27a07d7d0f 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -190,6 +190,9 @@ enum cxl_decoder_type {
  * @interleave_granularity: data stride per dport
  * @target_type: accelerator vs expander (type2 vs type3) selector
  * @flags: memory type capabilities and locking
+ * @region_ida: allocator for region ids.
+ * @regions: List of regions mapped (may be disabled) by this decoder.
+ * @youngest: Last region created for this decoder.
  * @target: active ordered target list in current decoder configuration
  */
 struct cxl_decoder {
@@ -200,6 +203,9 @@ struct cxl_decoder {
 	int interleave_granularity;
 	enum cxl_decoder_type target_type;
 	unsigned long flags;
+	struct ida region_ida;
+	struct list_head regions;
+	struct cxl_region *youngest;
 	struct cxl_dport *target[];
 };
 
@@ -262,6 +268,11 @@ struct cxl_dport {
 	struct list_head list;
 };
 
+struct cxl_region *cxl_alloc_region(struct cxl_decoder *cxld,
+				    int interleave_ways);
+void cxl_free_region(struct cxl_decoder *cxld, struct cxl_region *region);
+int cxl_add_region(struct cxl_decoder *cxld, struct cxl_region *region);
+int cxl_delete_region(struct cxl_decoder *cxld, const char *region);
 struct cxl_port *to_cxl_port(struct device *dev);
 struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport,
 				   resource_size_t component_reg_phys,
diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
new file mode 100644
index 000000000000..391467e864a2
--- /dev/null
+++ b/drivers/cxl/region.c
@@ -0,0 +1,147 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
+#include <linux/io-64-nonatomic-lo-hi.h>
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/idr.h>
+#include "region.h"
+#include "cxl.h"
+#include "mem.h"
+
+/**
+ * DOC: cxl region
+ *
+ * A CXL region encompasses a chunk of host physical address space that may be
+ * consumed by a single device (x1 interleave aka linear) or across multiple
+ * devices (xN interleaved). A region is a child device of a &struct
+ * cxl_decoder. There may be multiple active regions under a single &struct
+ * cxl_decoder. The common case for multiple regions would be several linear,
+ * contiguous regions under a single decoder. Generally, there will be a 1:1
+ * relationship between decoder and region when the region is interleaved.
+ */
+
+static void cxl_region_release(struct device *dev);
+
+const struct device_type cxl_region_type = {
+	.name = "cxl_region",
+	.release = cxl_region_release,
+};
+
+void cxl_free_region(struct cxl_decoder *cxld, struct cxl_region *region)
+{
+	ida_free(&cxld->region_ida, region->id);
+	kfree(region);
+}
+
+static void cxl_region_release(struct device *dev)
+{
+	struct cxl_decoder *cxld = to_cxl_decoder(dev->parent);
+	struct cxl_region *region = to_cxl_region(dev);
+
+	cxl_free_region(cxld, region);
+}
+
+struct cxl_region *cxl_alloc_region(struct cxl_decoder *cxld,
+				    int interleave_ways)
+{
+	struct cxl_region *region;
+	int rc;
+
+	region = kzalloc(struct_size(region, targets, interleave_ways),
+			 GFP_KERNEL);
+	if (!region)
+		return ERR_PTR(-ENOMEM);
+
+	region->eniw = interleave_ways;
+
+	rc = ida_alloc(&cxld->region_ida, GFP_KERNEL);
+	if (rc < 0) {
+		dev_err(&cxld->dev, "Couldn't get a new id\n");
+		kfree(region);
+		return ERR_PTR(rc);
+	}
+	region->id = rc;
+
+	return region;
+}
+
+/**
+ * cxl_add_region - Adds a region to a decoder
+ * @cxld: Parent decoder.
+ * @region: Region to be added to the decoder.
+ *
+ * This is the second step of region initialization. Regions exist within an
+ * address space which is mapped by a @cxld, and that @cxld enforces constraints
+ * upon the region as it is configured. Regions may be added to a @cxld but not
+ * activated and therefore it is possible to have more regions in a @cxld than
+ * there are interleave ways in the @cxld. Regions exist only for persistent
+ * capacities.
+ *
+ * Return: zero if the region was added to the @cxld, else returns negative
+ * error code.
+ */
+int cxl_add_region(struct cxl_decoder *cxld, struct cxl_region *region)
+{
+	struct cxl_port *port = to_cxl_port(cxld->dev.parent);
+	struct device *dev = &region->dev;
+	int rc;
+
+	device_initialize(dev);
+	dev->parent = &cxld->dev;
+	device_set_pm_not_required(dev);
+	dev->bus = &cxl_bus_type;
+	dev->type = &cxl_region_type;
+	rc = dev_set_name(dev, "region%d.%d:%d", port->id, cxld->id, region->id);
+	if (rc)
+		goto err;
+
+	rc = device_add(dev);
+	if (rc)
+		goto err;
+
+	dev_dbg(dev, "Added %s to %s\n", dev_name(dev), dev_name(&cxld->dev));
+
+	return 0;
+
+err:
+	put_device(dev);
+	return rc;
+}
+
+static struct cxl_region *
+cxl_find_region_by_name(struct cxl_decoder *cxld, const char *name)
+{
+	struct device *region_dev;
+
+	region_dev = device_find_child_by_name(&cxld->dev, name);
+	if (!region_dev)
+		return ERR_PTR(-ENOENT);
+
+	return to_cxl_region(region_dev);
+}
+
+int cxl_delete_region(struct cxl_decoder *cxld, const char *region_name)
+{
+	struct cxl_region *region;
+
+	device_lock(&cxld->dev);
+
+	region = cxl_find_region_by_name(cxld, region_name);
+	if (IS_ERR(region)) {
+		device_unlock(&cxld->dev);
+		return PTR_ERR(region);
+	}
+
+	dev_dbg(&cxld->dev, "Requested removal of %s from %s\n",
+		dev_name(&region->dev), dev_name(&cxld->dev));
+
+	cmpxchg(&cxld->youngest, region, NULL);
+
+	device_unregister(&region->dev);
+	device_unlock(&cxld->dev);
+
+	put_device(&region->dev);
+
+	return 0;
+}
diff --git a/drivers/cxl/region.h b/drivers/cxl/region.h
new file mode 100644
index 000000000000..7a87d229e38a
--- /dev/null
+++ b/drivers/cxl/region.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright(c) 2021 Intel Corporation. */
+#ifndef __CXL_REGION_H__
+#define __CXL_REGION_H__
+
+#include <linux/uuid.h>
+
+extern const struct device_type cxl_region_type;
+
+/**
+ * struct cxl_region - CXL region
+ * @dev: This region's device.
+ * @id: This regions id. Id is globally unique across all regions.
+ * @res: Address space consumed by this region.
+ * @requested_size: Size of the region determined from LSA or userspace.
+ * @uuid: The UUID for this region.
+ * @list: Node in decoders region list.
+ * @eniw: Number of interleave ways this region is configured for.
+ * @targets: The memory devices comprising the region.
+ */
+struct cxl_region {
+	struct device dev;
+	int id;
+	struct resource *res;
+	u64 requested_size;
+	uuid_t uuid;
+	struct list_head list;
+	int eniw;
+	struct cxl_memdev *targets[];
+};
+
+static inline struct cxl_region *to_cxl_region(struct device *dev)
+{
+	if (dev_WARN_ONCE(dev, dev->type != &cxl_region_type,
+			  "not a cxl_region device\n"))
+		return NULL;
+
+	return container_of(dev, struct cxl_region, dev);
+}
+
+bool cxl_is_region_configured(struct cxl_region *region);
+
+#endif
-- 
2.32.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 2/6] cxl: Move cxl_memdev conversion helper to mem.h
  2021-06-17 17:36 [PATCH 0/6] Region creation Ben Widawsky
  2021-06-17 17:36 ` [PATCH 1/6] cxl/region: Add region creation ABI Ben Widawsky
@ 2021-06-17 17:36 ` Ben Widawsky
  2021-06-18  9:13   ` Jonathan Cameron
  2021-06-18 15:00   ` Dan Williams
  2021-06-17 17:36 ` [PATCH 3/6] cxl/region: Introduce concept of region configuration Ben Widawsky
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 20+ messages in thread
From: Ben Widawsky @ 2021-06-17 17:36 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/mem.h | 5 +++++
 drivers/cxl/pci.c | 5 -----
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/cxl/mem.h b/drivers/cxl/mem.h
index 8f02d02b26b4..fe12ef3c3dde 100644
--- a/drivers/cxl/mem.h
+++ b/drivers/cxl/mem.h
@@ -48,6 +48,11 @@ struct cxl_memdev {
 	int id;
 };
 
+static inline struct cxl_memdev *to_cxl_memdev(struct device *dev)
+{
+       return container_of(dev, struct cxl_memdev, dev);
+}
+
 /**
  * struct cxl_mem - A CXL memory device
  * @pdev: The PCI device associated with this CXL device.
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index f8408e5f0754..52508282ec37 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -1161,11 +1161,6 @@ static int cxl_mem_setup_regs(struct cxl_mem *cxlm)
 	return ret;
 }
 
-static struct cxl_memdev *to_cxl_memdev(struct device *dev)
-{
-	return container_of(dev, struct cxl_memdev, dev);
-}
-
 static void cxl_memdev_release(struct device *dev)
 {
 	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
-- 
2.32.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 3/6] cxl/region: Introduce concept of region configuration
  2021-06-17 17:36 [PATCH 0/6] Region creation Ben Widawsky
  2021-06-17 17:36 ` [PATCH 1/6] cxl/region: Add region creation ABI Ben Widawsky
  2021-06-17 17:36 ` [PATCH 2/6] cxl: Move cxl_memdev conversion helper to mem.h Ben Widawsky
@ 2021-06-17 17:36 ` Ben Widawsky
  2021-06-18 11:22   ` Jonathan Cameron
  2021-06-17 17:36 ` [PATCH 4/6] cxl/region: Introduce a cxl_region driver Ben Widawsky
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 20+ messages in thread
From: Ben Widawsky @ 2021-06-17 17:36 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

The region creation APIs leave a region unconfigured. Configuring the
region will work in the same way as similar subsystems such as devdax.
Sysfs attrs will be provided to allow userspace to configure the region.
Finally once all configuration is complete, userspace may "commit" the
config. What the kernel decides to do after a config is committed is out
of scope at this point.

Introduced here are the most basic attributes needed to configure a
region.

A x1 interleave example is provided below:

decoder1.0
├── create_region
├── delete_region
├── devtype
├── locked
├── region1.0:0
│   ├── offset
│   ├── size
│   ├── subsystem -> ../../../../../../../bus/cxl
│   ├── target0
│   ├── uevent
│   ├── uuid
│   └── verify
├── size
├── start
├── subsystem -> ../../../../../../bus/cxl
├── target_list
├── target_type
└── uevent

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 Documentation/ABI/testing/sysfs-bus-cxl |  32 ++++
 drivers/cxl/region.c                    | 223 ++++++++++++++++++++++++
 2 files changed, 255 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
index 115a25d2899d..70f9d09385a4 100644
--- a/Documentation/ABI/testing/sysfs-bus-cxl
+++ b/Documentation/ABI/testing/sysfs-bus-cxl
@@ -148,3 +148,35 @@ Description:
 		Deletes the named region. A region must be unbound from the
 		region driver before being deleted. The attributes expects a
 		region in the form "regionX.Y:Z".
+
+What:		/sys/bus/cxl/devices/decoderX.Y/regionX.Y:Z/offset
+Date:		June, 2021
+KernelVersion:	v5.14
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		(RO) A region resides within an address space that is claimed by
+		a decoder. Region space allocation is handled by the driver, but
+		the offset may be read by userspace tooling in order to
+		determine fragmentation, and available size for new regions.
+
+What:
+/sys/bus/cxl/devices/decoderX.Y/regionX.Y:Z/{size,uuid,target[0-15]}
+Date:		June, 2021
+KernelVersion:	v5.14
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		(RW) Configuring regions requires a minimal set of parameters in
+		order for the subsequent bind operation to succeed. The
+		following parameters are defined:
+
+		==	========================================================
+		size	Manadatory. Phsyical address space the region will
+			consume.
+		uuid	Optional. A unique identifier for the region. If none is
+			selected, the kernel will create one.
+		target  Mandatory. Memory devices are the backing storage for a
+			region. There will be N targets based on the number of
+			interleave ways that the top level decoder is configured
+			for. Each target must be set with a memdev device ie.
+			'mem1'.
+		==	========================================================
diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
index 391467e864a2..cf7fd3027419 100644
--- a/drivers/cxl/region.c
+++ b/drivers/cxl/region.c
@@ -3,7 +3,9 @@
 #include <linux/io-64-nonatomic-lo-hi.h>
 #include <linux/device.h>
 #include <linux/module.h>
+#include <linux/sizes.h>
 #include <linux/slab.h>
+#include <linux/uuid.h>
 #include <linux/idr.h>
 #include "region.h"
 #include "cxl.h"
@@ -21,16 +23,237 @@
  * relationship between decoder and region when the region is interleaved.
  */
 
+static struct cxl_region *to_cxl_region(struct device *dev);
+
+static bool is_region_active(struct cxl_region *region)
+{
+	/* TODO: Regions can't be activated yet. */
+	return false;
+}
+
+static ssize_t offset_show(struct device *dev, struct device_attribute *attr,
+			   char *buf)
+{
+	struct cxl_region *region = to_cxl_region(dev);
+	struct cxl_decoder *cxld = to_cxl_decoder(dev->parent);
+
+	if (!region->res)
+		return sysfs_emit(buf, "\n");
+
+	return sysfs_emit(buf, "%#llx\n", cxld->range.start - region->res->start);
+}
+static DEVICE_ATTR_RO(offset);
+
+static ssize_t size_show(struct device *dev, struct device_attribute *attr,
+			 char *buf)
+{
+	struct cxl_region *region = to_cxl_region(dev);
+
+	if (!region->res)
+		return sysfs_emit(buf, "*%llu\n", region->requested_size);
+
+	return sysfs_emit(buf, "%llu\n", resource_size(region->res));
+}
+
+static ssize_t size_store(struct device *dev, struct device_attribute *attr,
+			  const char *buf, size_t len)
+{
+	struct cxl_region *region = to_cxl_region(dev);
+	unsigned long long val;
+	ssize_t rc;
+
+	rc = kstrtoull(buf, 0, &val);
+	if (rc)
+		return rc;
+
+	device_lock(&region->dev);
+	if (is_region_active(region)) {
+		rc = -EBUSY;
+	} else {
+		region->requested_size = val;
+	}
+	device_unlock(&region->dev);
+
+	return len;
+}
+static DEVICE_ATTR_RW(size);
+
+static ssize_t uuid_show(struct device *dev, struct device_attribute *attr,
+			 char *buf)
+{
+	struct cxl_region *region = to_cxl_region(dev);
+
+	return sysfs_emit(buf, "%pUb\n", &region->uuid);
+}
+static ssize_t uuid_store(struct device *dev, struct device_attribute *attr,
+			  const char *buf, size_t len)
+{
+	struct cxl_region *region = to_cxl_region(dev);
+	ssize_t rc;
+
+	if (len != UUID_STRING_LEN + 1)
+		return -EINVAL;
+
+	device_lock(&region->dev);
+	if (is_region_active(region)) {
+		rc = -EBUSY;
+	} else {
+		rc = uuid_parse(buf, &region->uuid);
+	}
+	device_unlock(&region->dev);
+
+	return rc ? rc : len;
+}
+static DEVICE_ATTR_RW(uuid);
+
+static struct attribute *region_attrs[] = {
+	&dev_attr_offset.attr,
+	&dev_attr_size.attr,
+	&dev_attr_uuid.attr,
+	NULL,
+};
+
+static const struct attribute_group region_group = {
+	.attrs = region_attrs,
+};
+
+static size_t show_targetN(struct cxl_region *region, char *buf, int n)
+{
+	int ret;
+
+	device_lock(&region->dev);
+	if (!region->targets[n])
+		ret = sysfs_emit(buf, "%s\n", dev_name(&region->targets[n]->dev));
+	else
+		ret = sysfs_emit(buf, "nil\n");
+	device_unlock(&region->dev);
+
+	return ret;
+}
+
+static size_t set_targetN(struct cxl_region *region, const char *buf, int n, size_t len)
+{
+	struct device *memdev_dev;
+	struct cxl_memdev *cxlmd;
+	ssize_t rc;
+	int val;
+
+	device_lock(&region->dev);
+
+	rc = kstrtoint(buf, 0, &val);
+	/* Special 'remote' target operation */
+	if (!rc && val == 0) {
+		cxlmd = region->targets[n];
+		if (cxlmd)
+			put_device(&cxlmd->dev);
+		region->targets[n] = NULL;
+		device_unlock(&region->dev);
+		return len;
+	}
+
+	memdev_dev = bus_find_device_by_name(&cxl_bus_type, NULL, buf);
+	if (!memdev_dev)
+		return -ENOENT;
+
+	cxlmd = to_cxl_memdev(memdev_dev);
+	get_device(&cxlmd->dev);
+	region->targets[n] = cxlmd;
+
+	device_unlock(&region->dev);
+
+	return len;
+}
+
+#define TARGET_ATTR_RW(n)                                                      \
+	static ssize_t target##n##_show(                                       \
+		struct device *dev, struct device_attribute *attr, char *buf)  \
+	{                                                                      \
+		return show_targetN(to_cxl_region(dev), buf, n);               \
+	}                                                                      \
+	static ssize_t target##n##_store(struct device *dev,                   \
+					 struct device_attribute *attr,        \
+					 const char *buf, size_t len)          \
+	{                                                                      \
+		return set_targetN(to_cxl_region(dev), buf, n, len);           \
+	}                                                                      \
+	static DEVICE_ATTR_RW(target##n)
+
+TARGET_ATTR_RW(0);
+TARGET_ATTR_RW(1);
+TARGET_ATTR_RW(2);
+TARGET_ATTR_RW(3);
+TARGET_ATTR_RW(4);
+TARGET_ATTR_RW(5);
+TARGET_ATTR_RW(6);
+TARGET_ATTR_RW(7);
+TARGET_ATTR_RW(8);
+TARGET_ATTR_RW(9);
+TARGET_ATTR_RW(10);
+TARGET_ATTR_RW(11);
+TARGET_ATTR_RW(12);
+TARGET_ATTR_RW(13);
+TARGET_ATTR_RW(14);
+TARGET_ATTR_RW(15);
+
+static struct attribute *interleave_attrs[] = {
+	&dev_attr_target0.attr,
+	&dev_attr_target1.attr,
+	&dev_attr_target2.attr,
+	&dev_attr_target3.attr,
+	&dev_attr_target4.attr,
+	&dev_attr_target5.attr,
+	&dev_attr_target6.attr,
+	&dev_attr_target7.attr,
+	&dev_attr_target8.attr,
+	&dev_attr_target9.attr,
+	&dev_attr_target10.attr,
+	&dev_attr_target11.attr,
+	&dev_attr_target12.attr,
+	&dev_attr_target13.attr,
+	&dev_attr_target14.attr,
+	&dev_attr_target15.attr,
+	NULL,
+};
+
+static umode_t visible_targets(struct kobject *kobj, struct attribute *a, int n)
+{
+	struct device *dev = container_of(kobj, struct device, kobj);
+	struct cxl_region *region = to_cxl_region(dev);
+
+	if (n < region->eniw)
+		return a->mode;
+	return 0;
+}
+
+static const struct attribute_group region_interleave_group = {
+	.attrs = interleave_attrs,
+	.is_visible = visible_targets,
+};
+
+static const struct attribute_group *region_groups[] = {
+	&region_group,
+	&region_interleave_group,
+	NULL,
+};
+
 static void cxl_region_release(struct device *dev);
 
 const struct device_type cxl_region_type = {
 	.name = "cxl_region",
 	.release = cxl_region_release,
+	.groups = region_groups
 };
 
 void cxl_free_region(struct cxl_decoder *cxld, struct cxl_region *region)
 {
+	int i;
+
 	ida_free(&cxld->region_ida, region->id);
+	for (i = 0; i < region->eniw; i++) {
+		if (region->targets[i])
+			put_device(&region->targets[i]->dev);
+	}
+
 	kfree(region);
 }
 
-- 
2.32.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 4/6] cxl/region: Introduce a cxl_region driver
  2021-06-17 17:36 [PATCH 0/6] Region creation Ben Widawsky
                   ` (2 preceding siblings ...)
  2021-06-17 17:36 ` [PATCH 3/6] cxl/region: Introduce concept of region configuration Ben Widawsky
@ 2021-06-17 17:36 ` Ben Widawsky
  2021-06-17 21:13   ` [PATCH v2 " Ben Widawsky
  2021-06-17 17:36 ` [PATCH 5/6] cxl/core: Convert decoder range to resource Ben Widawsky
  2021-06-17 17:36 ` [PATCH 6/6] cxl/region: Handle region's address space allocation Ben Widawsky
  5 siblings, 1 reply; 20+ messages in thread
From: Ben Widawsky @ 2021-06-17 17:36 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

The cxl_region driver is responsible for managing the HDM decoder
programming in the CXL topology. Once a region is created it must be
configured and bound to the driver in order to activate it.

The following is a sample of how such controls might work:

echo 1 > /sys/bus/cxl/devices/decoder0.0/create_region
echo $((256<<20)) > /sys/bus/cxl/devices/decoder0.0/region0.0:0/size
echo mem0 > /sys/bus/cxl/devices/decoder0.0/region0.0:0/target0
echo region0.0:0 > /sys/bus/cxl/drivers/cxl_region/bind

In order to handle the eventual rise in failure modes of binding a
region, a new trace event is created to help track these failures for
debug and reconfiguration paths in userspace.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/Makefile |  2 +-
 drivers/cxl/core.c   | 15 ++++++-
 drivers/cxl/cxl.h    |  1 +
 drivers/cxl/mem.h    |  1 +
 drivers/cxl/region.c | 99 +++++++++++++++++++++++++++++++++++++++++++-
 drivers/cxl/region.h |  2 +
 drivers/cxl/trace.h  | 33 +++++++++++++++
 7 files changed, 149 insertions(+), 4 deletions(-)
 create mode 100644 drivers/cxl/trace.h

diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
index c3151198c041..f35077c073b8 100644
--- a/drivers/cxl/Makefile
+++ b/drivers/cxl/Makefile
@@ -4,7 +4,7 @@ obj-$(CONFIG_CXL_MEM) += cxl_pci.o cxl_region.o
 obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
 obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
 
-ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=CXL
+ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=CXL -I$(src)
 cxl_core-y := core.o
 cxl_pci-y := pci.o
 cxl_acpi-y := acpi.o
diff --git a/drivers/cxl/core.c b/drivers/cxl/core.c
index d8d7ca85e110..44f982f4f247 100644
--- a/drivers/cxl/core.c
+++ b/drivers/cxl/core.c
@@ -1086,6 +1086,8 @@ static int cxl_device_id(struct device *dev)
 		return CXL_DEVICE_NVDIMM_BRIDGE;
 	if (dev->type == &cxl_nvdimm_type)
 		return CXL_DEVICE_NVDIMM;
+	if (dev->type == &cxl_region_type)
+		return CXL_DEVICE_REGION;
 	return 0;
 }
 
@@ -1102,7 +1104,18 @@ static int cxl_bus_match(struct device *dev, struct device_driver *drv)
 
 static int cxl_bus_probe(struct device *dev)
 {
-	return to_cxl_drv(dev->driver)->probe(dev);
+	int id = cxl_device_id(dev);
+
+	if (id == CXL_DEVICE_REGION) {
+		struct cxl_region *region = to_cxl_region(dev);
+
+		if (cxl_is_region_configured(region))
+			return to_cxl_drv(dev->driver)->probe(dev);
+	} else {
+		return to_cxl_drv(dev->driver)->probe(dev);
+	}
+
+	return -ENODEV;
 }
 
 static int cxl_bus_remove(struct device *dev)
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 8b27a07d7d0f..90107b21125b 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -325,6 +325,7 @@ void cxl_driver_unregister(struct cxl_driver *cxl_drv);
 
 #define CXL_DEVICE_NVDIMM_BRIDGE	1
 #define CXL_DEVICE_NVDIMM		2
+#define CXL_DEVICE_REGION		3
 
 #define MODULE_ALIAS_CXL(type) MODULE_ALIAS("cxl:t" __stringify(type) "*")
 #define CXL_MODALIAS_FMT "cxl:t%d"
diff --git a/drivers/cxl/mem.h b/drivers/cxl/mem.h
index fe12ef3c3dde..ff1f9c57e089 100644
--- a/drivers/cxl/mem.h
+++ b/drivers/cxl/mem.h
@@ -83,4 +83,5 @@ struct cxl_mem {
 	struct range pmem_range;
 	struct range ram_range;
 };
+
 #endif /* __CXL_MEM_H__ */
diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
index cf7fd3027419..616b47903d69 100644
--- a/drivers/cxl/region.c
+++ b/drivers/cxl/region.c
@@ -11,6 +11,9 @@
 #include "cxl.h"
 #include "mem.h"
 
+#define CREATE_TRACE_POINTS
+#include "trace.h"
+
 /**
  * DOC: cxl region
  *
@@ -27,8 +30,24 @@ static struct cxl_region *to_cxl_region(struct device *dev);
 
 static bool is_region_active(struct cxl_region *region)
 {
-	/* TODO: Regions can't be activated yet. */
-	return false;
+	return region->active;
+}
+
+/*
+ * Most sanity checking is left up to region binding. This does the most basic
+ * check to determine whether or not the core should try probing the driver.
+ */
+bool cxl_is_region_configured(struct cxl_region *region)
+{
+	/* zero sized regions aren't a thing. */
+	if (region->requested_size <= 0)
+		return false;
+
+	/* all regions have at least 1 target */
+	if (region->targets[0])
+		return false;
+
+	return true;
 }
 
 static ssize_t offset_show(struct device *dev, struct device_attribute *attr,
@@ -368,3 +387,79 @@ int cxl_delete_region(struct cxl_decoder *cxld, const char *region_name)
 
 	return 0;
 }
+
+static int bind_region(struct cxl_region *region)
+{
+	int i;
+
+	if (dev_WARN_ONCE(&region->dev, !cxl_is_region_configured(region),
+			  "unconfigured regions can't be probed (race?)\n")) {
+		return -ENXIO;
+	}
+
+	if (region->requested_size % (SZ_256M * region->eniw)) {
+		trace_cxl_region_bind(region, "Invalid size. Must be multiple of NIW");
+		return -ENXIO;
+	}
+
+	for (i = 0; i < region->eniw; i++)
+		if (!region->targets[i]) {
+			trace_cxl_region_bind(region, "Missing memory device target");
+			return -ENXIO;
+		}
+
+	/* TODO: Allocate from decoder's address space */
+
+	/* TODO: program HDM decoders */
+
+	if (uuid_is_null(&region->uuid))
+		uuid_gen(&region->uuid);
+
+	trace_cxl_region_bind(region, "Region binding succeeded.");
+	return 0;
+}
+
+static int cxl_region_probe(struct device *dev)
+{
+	struct cxl_region *region = to_cxl_region(dev);
+	int ret;
+
+	if (region->active)
+		return -EBUSY;
+
+	device_lock(&region->dev);
+	ret = bind_region(region);
+	if (!ret)
+		region->active = true;
+	device_unlock(&region->dev);
+
+	return ret;
+}
+
+static void cxl_region_remove(struct device *dev)
+{
+	/* Remove region from the decoder's address space */
+}
+
+static struct cxl_driver cxl_region_driver = {
+	.name = "cxl_region",
+	.probe = cxl_region_probe,
+	.remove = cxl_region_remove,
+	.id = CXL_DEVICE_REGION,
+};
+
+static __init int cxl_region_init(void)
+{
+	return cxl_driver_register(&cxl_region_driver);
+}
+
+static __exit void cxl_region_exit(void)
+{
+	cxl_driver_unregister(&cxl_region_driver);
+}
+
+MODULE_LICENSE("GPL v2");
+module_init(cxl_region_init);
+module_exit(cxl_region_exit);
+MODULE_IMPORT_NS(CXL);
+MODULE_ALIAS_CXL(CXL_REGION);
diff --git a/drivers/cxl/region.h b/drivers/cxl/region.h
index 7a87d229e38a..926dec98709b 100644
--- a/drivers/cxl/region.h
+++ b/drivers/cxl/region.h
@@ -16,6 +16,7 @@ extern const struct device_type cxl_region_type;
  * @uuid: The UUID for this region.
  * @list: Node in decoders region list.
  * @eniw: Number of interleave ways this region is configured for.
+ * @active: If the region has been activated.
  * @targets: The memory devices comprising the region.
  */
 struct cxl_region {
@@ -26,6 +27,7 @@ struct cxl_region {
 	uuid_t uuid;
 	struct list_head list;
 	int eniw;
+	bool active;
 	struct cxl_memdev *targets[];
 };
 
diff --git a/drivers/cxl/trace.h b/drivers/cxl/trace.h
new file mode 100644
index 000000000000..fe9576ec2cde
--- /dev/null
+++ b/drivers/cxl/trace.h
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM cxl
+
+#if !defined (__CXL_TRACE_H__) || defined(TRACE_HEADER_MULTI_READ)
+#define __CXL_TRACE_H__
+
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(cxl_region_bind,
+	TP_PROTO(struct cxl_region *region, char *status),
+	TP_ARGS(region, status),
+	TP_STRUCT__entry(
+		__field(struct cxl_region *, region)
+		__string(status, status)
+	),
+
+	TP_fast_assign(
+		__entry->region = region;
+		__assign_str(status, status);
+	),
+
+	TP_printk("%s failed to bind (%s)", dev_name(&__entry->region->dev), __get_str(status))
+);
+
+#endif /* if !defined (__CXL_TRACE_H__) || defined(TRACE_HEADER_MULTI_READ) */
+
+/* This part must be outside protection */
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH .
+#define TRACE_INCLUDE_FILE trace
+#include <trace/define_trace.h>
-- 
2.32.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 5/6] cxl/core: Convert decoder range to resource
  2021-06-17 17:36 [PATCH 0/6] Region creation Ben Widawsky
                   ` (3 preceding siblings ...)
  2021-06-17 17:36 ` [PATCH 4/6] cxl/region: Introduce a cxl_region driver Ben Widawsky
@ 2021-06-17 17:36 ` Ben Widawsky
  2021-06-18 11:52   ` Jonathan Cameron
  2021-06-17 17:36 ` [PATCH 6/6] cxl/region: Handle region's address space allocation Ben Widawsky
  5 siblings, 1 reply; 20+ messages in thread
From: Ben Widawsky @ 2021-06-17 17:36 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Regions will use the resource API in order to help manage allocated
space. As regions are children of the decoder, it makes sense that the
parent host the main resource to be suballocated by the region.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/core.c   | 9 +++------
 drivers/cxl/cxl.h    | 4 ++--
 drivers/cxl/region.c | 2 +-
 3 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/cxl/core.c b/drivers/cxl/core.c
index 44f982f4f247..0e598a18e95f 100644
--- a/drivers/cxl/core.c
+++ b/drivers/cxl/core.c
@@ -40,7 +40,7 @@ static ssize_t start_show(struct device *dev, struct device_attribute *attr,
 {
 	struct cxl_decoder *cxld = to_cxl_decoder(dev);
 
-	return sysfs_emit(buf, "%#llx\n", cxld->range.start);
+	return sysfs_emit(buf, "%#llx\n", cxld->res.start);
 }
 static DEVICE_ATTR_RO(start);
 
@@ -49,7 +49,7 @@ static ssize_t size_show(struct device *dev, struct device_attribute *attr,
 {
 	struct cxl_decoder *cxld = to_cxl_decoder(dev);
 
-	return sysfs_emit(buf, "%#llx\n", range_len(&cxld->range));
+	return sysfs_emit(buf, "%#llx\n", resource_size(&cxld->res));
 }
 static DEVICE_ATTR_RO(size);
 
@@ -543,10 +543,7 @@ cxl_decoder_alloc(struct cxl_port *port, int nr_targets, resource_size_t base,
 
 	*cxld = (struct cxl_decoder) {
 		.id = rc,
-		.range = {
-			.start = base,
-			.end = base + len - 1,
-		},
+		.res = DEFINE_RES_MEM_NAMED(base, len, "cxl-decoder"),
 		.flags = flags,
 		.interleave_ways = interleave_ways,
 		.interleave_granularity = interleave_granularity,
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 90107b21125b..b08f0969abeb 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -185,7 +185,7 @@ enum cxl_decoder_type {
  * struct cxl_decoder - CXL address range decode configuration
  * @dev: this decoder's device
  * @id: kernel device name id
- * @range: address range considered by this decoder
+ * @res: address space resources considered by this decoder
  * @interleave_ways: number of cxl_dports in this decode
  * @interleave_granularity: data stride per dport
  * @target_type: accelerator vs expander (type2 vs type3) selector
@@ -198,7 +198,7 @@ enum cxl_decoder_type {
 struct cxl_decoder {
 	struct device dev;
 	int id;
-	struct range range;
+	struct resource res;
 	int interleave_ways;
 	int interleave_granularity;
 	enum cxl_decoder_type target_type;
diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
index 616b47903d69..62bca8b30349 100644
--- a/drivers/cxl/region.c
+++ b/drivers/cxl/region.c
@@ -59,7 +59,7 @@ static ssize_t offset_show(struct device *dev, struct device_attribute *attr,
 	if (!region->res)
 		return sysfs_emit(buf, "\n");
 
-	return sysfs_emit(buf, "%#llx\n", cxld->range.start - region->res->start);
+	return sysfs_emit(buf, "%#llx\n", cxld->res.start - region->res->start);
 }
 static DEVICE_ATTR_RO(offset);
 
-- 
2.32.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 6/6] cxl/region: Handle region's address space allocation
  2021-06-17 17:36 [PATCH 0/6] Region creation Ben Widawsky
                   ` (4 preceding siblings ...)
  2021-06-17 17:36 ` [PATCH 5/6] cxl/core: Convert decoder range to resource Ben Widawsky
@ 2021-06-17 17:36 ` Ben Widawsky
  2021-06-18 13:35   ` Jonathan Cameron
  5 siblings, 1 reply; 20+ messages in thread
From: Ben Widawsky @ 2021-06-17 17:36 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Regions are carved out of an addresses space which is claimed by top
level decoders, and subsequently their children decoders. Regions are
created with a size and therefore must fit, with proper alignment, in
that address space. The support for doing this fitting is handled by the
driver automatically.

As an example, a platform might configure a top level decoder to claim
1TB of address space @ 0x800000000 -> 0x10800000000; it would be
possible to create M regions with appropriate alignment to occupy that
address space. Each of those regions would have a host physical address
somewhere in the range between 32G and 1.3TB, and the location will be
determined by the logic added here.

The request_region() usage is not strictly mandatory at this point as
the actual handling of the address space is done with genpools. It is
highly likely however that the resource/region APIs will become useful
in the not too distant future.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/core.c   | 13 +++++++++++++
 drivers/cxl/cxl.h    |  2 ++
 drivers/cxl/region.c | 31 +++++++++++++++++++++++++++++--
 3 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/drivers/cxl/core.c b/drivers/cxl/core.c
index 0e598a18e95f..b57ee627bc6a 100644
--- a/drivers/cxl/core.c
+++ b/drivers/cxl/core.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /* Copyright(c) 2020 Intel Corporation. All rights reserved. */
 #include <linux/io-64-nonatomic-lo-hi.h>
+#include <linux/genalloc.h>
 #include <linux/device.h>
 #include <linux/module.h>
 #include <linux/pci.h>
@@ -607,6 +608,18 @@ devm_cxl_add_decoder(struct device *host, struct cxl_port *port, int nr_targets,
 	rc = devm_add_action_or_reset(host, unregister_dev, dev);
 	if (rc)
 		return ERR_PTR(rc);
+
+	cxld->address_space =
+		devm_gen_pool_create(dev, ilog2(SZ_256M * cxld->interleave_ways),
+				     NUMA_NO_NODE, dev_name(dev));
+	if (IS_ERR(cxld->address_space))
+		return ERR_CAST(cxld->address_space);
+
+	rc = gen_pool_add(cxld->address_space, cxld->res.start,
+			  resource_size(&cxld->res), NUMA_NO_NODE);
+	if (rc)
+		return ERR_PTR(rc);
+
 	return cxld;
 
 err:
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index b08f0969abeb..b5b728155d86 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -193,6 +193,7 @@ enum cxl_decoder_type {
  * @region_ida: allocator for region ids.
  * @regions: List of regions mapped (may be disabled) by this decoder.
  * @youngest: Last region created for this decoder.
+ * @address_space: Used/free address space for regions.
  * @target: active ordered target list in current decoder configuration
  */
 struct cxl_decoder {
@@ -206,6 +207,7 @@ struct cxl_decoder {
 	struct ida region_ida;
 	struct list_head regions;
 	struct cxl_region *youngest;
+	struct gen_pool *address_space;
 	struct cxl_dport *target[];
 };
 
diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
index 62bca8b30349..0554f3fff7ac 100644
--- a/drivers/cxl/region.c
+++ b/drivers/cxl/region.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /* Copyright(c) 2021 Intel Corporation. All rights reserved. */
 #include <linux/io-64-nonatomic-lo-hi.h>
+#include <linux/genalloc.h>
 #include <linux/device.h>
 #include <linux/module.h>
 #include <linux/sizes.h>
@@ -388,9 +389,33 @@ int cxl_delete_region(struct cxl_decoder *cxld, const char *region_name)
 	return 0;
 }
 
+static int allocate_region_addr(struct cxl_region *region)
+{
+	struct cxl_decoder *cxld = to_cxl_decoder(region->dev.parent);
+	unsigned long start;
+
+	start = gen_pool_alloc(cxld->address_space, region->requested_size);
+	if (!start) {
+		trace_cxl_region_bind(region, "Couldn't allocate address space");
+		return -ENOMEM;
+	}
+
+	region->res =
+		__request_region(&cxld->res, start, region->requested_size,
+				 dev_name(&region->dev), IORESOURCE_EXCLUSIVE);
+	if (IS_ERR(region->res)) {
+		trace_cxl_region_bind(region, "Couldn't obtain region");
+		gen_pool_free(cxld->address_space, start,
+			      region->requested_size);
+		return PTR_ERR(region->res);
+	}
+
+	return 0;
+}
+
 static int bind_region(struct cxl_region *region)
 {
-	int i;
+	int i, rc;
 
 	if (dev_WARN_ONCE(&region->dev, !cxl_is_region_configured(region),
 			  "unconfigured regions can't be probed (race?)\n")) {
@@ -408,7 +433,9 @@ static int bind_region(struct cxl_region *region)
 			return -ENXIO;
 		}
 
-	/* TODO: Allocate from decoder's address space */
+	rc = allocate_region_addr(region);
+	if (rc)
+		return rc;
 
 	/* TODO: program HDM decoders */
 
-- 
2.32.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2 1/6] cxl/region: Add region creation ABI
  2021-06-17 17:36 ` [PATCH 1/6] cxl/region: Add region creation ABI Ben Widawsky
@ 2021-06-17 21:11   ` Ben Widawsky
  2021-06-18  9:13   ` [PATCH " Jonathan Cameron
  1 sibling, 0 replies; 20+ messages in thread
From: Ben Widawsky @ 2021-06-17 21:11 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Regions are created as a child of the decoder that encompasses an
address space with constraints. Regions only exist for persistent
capacities.

When regions are created, the number of desired interleave ways must be
known. To enable this, the sysfs attribute will take the desired ways as
input. This interface intentionally allows creation of
impossible-to-enable regions based on interleave constraints in the
topology. The reasoning is to create new regions through the kernel
interfaces which may become possible on reboot under a variety of
circumstances.

As an example, creating a x1 region with:
echo 1 > /sys/bus/cxl/devices/decoder1.0/create_region

Will yield /sys/bus/cxl/devices/decoder1.0/region1.0:0

That region may then be deleted with:
echo region1.0:0 > /sys/bus/cxl/devices/decoder1.0/delete_region

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
v2 keeps the region type static 
---
 Documentation/ABI/testing/sysfs-bus-cxl       |  21 +++
 .../driver-api/cxl/memory-devices.rst         |  11 ++
 drivers/cxl/Makefile                          |   3 +-
 drivers/cxl/core.c                            |  71 ++++++++
 drivers/cxl/cxl.h                             |  11 ++
 drivers/cxl/region.c                          | 162 ++++++++++++++++++
 drivers/cxl/region.h                          |  33 ++++
 7 files changed, 311 insertions(+), 1 deletion(-)
 create mode 100644 drivers/cxl/region.c
 create mode 100644 drivers/cxl/region.h

diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
index 0b6a2e6e8fbb..115a25d2899d 100644
--- a/Documentation/ABI/testing/sysfs-bus-cxl
+++ b/Documentation/ABI/testing/sysfs-bus-cxl
@@ -127,3 +127,24 @@ Description:
 		memory (type-3). The 'target_type' attribute indicates the
 		current setting which may dynamically change based on what
 		memory regions are activated in this decode hierarchy.
+
+What:		/sys/bus/cxl/devices/decoderX.Y/create_region
+Date:		June, 2021
+KernelVersion:	v5.14
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		Creates a new CXL region of N interleaved ways. Writing a value
+		of '2' will create a new uninitialized region with 2x interleave
+		that will be mapped by the CXL decoderX.Y. Reading from this
+		node will return the last created region. Regions must be
+		subsequently configured and bound to a region driver before they
+		can be used.
+
+What:		/sys/bus/cxl/devices/decoderX.Y/delete_region
+Date:		June, 2021
+KernelVersion:	v5.14
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		Deletes the named region. A region must be unbound from the
+		region driver before being deleted. The attributes expects a
+		region in the form "regionX.Y:Z".
diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst
index 487ce4f41d77..c7f59a8c94db 100644
--- a/Documentation/driver-api/cxl/memory-devices.rst
+++ b/Documentation/driver-api/cxl/memory-devices.rst
@@ -39,6 +39,17 @@ CXL Core
 .. kernel-doc:: drivers/cxl/core.c
    :doc: cxl core
 
+CXL Regions
+-----------
+.. kernel-doc:: drivers/cxl/region.c
+   :doc: cxl region
+
+.. kernel-doc:: drivers/cxl/region.h
+   :identifiers:
+
+.. kernel-doc:: drivers/cxl/region.c
+   :identifiers:
+
 External Interfaces
 ===================
 
diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
index 32954059b37b..c3151198c041 100644
--- a/drivers/cxl/Makefile
+++ b/drivers/cxl/Makefile
@@ -1,6 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_CXL_BUS) += cxl_core.o

-obj-$(CONFIG_CXL_MEM) += cxl_pci.o
+obj-$(CONFIG_CXL_MEM) += cxl_pci.o cxl_region.o
 obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
 obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
 
@@ -9,3 +9,4 @@ cxl_core-y := core.o
 cxl_pci-y := pci.o
 cxl_acpi-y := acpi.o
 cxl_pmem-y := pmem.o
+cxl_region-y := region.o
diff --git a/drivers/cxl/core.c b/drivers/cxl/core.c
index a2e4d54fc7bc..d8d7ca85e110 100644
--- a/drivers/cxl/core.c
+++ b/drivers/cxl/core.c
@@ -6,6 +6,7 @@
 #include <linux/pci.h>
 #include <linux/slab.h>
 #include <linux/idr.h>
+#include "region.h"
 #include "cxl.h"
 #include "mem.h"
 
@@ -120,7 +121,68 @@ static ssize_t target_list_show(struct device *dev,
 }
 static DEVICE_ATTR_RO(target_list);
 
+static ssize_t create_region_show(struct device *dev,
+				  struct device_attribute *attr, char *buf)
+{
+	struct cxl_decoder *cxld = to_cxl_decoder(dev);
+	int rc;
+
+	device_lock(dev);
+	rc = sprintf(buf, "%s\n",
+		     cxld->youngest ? dev_name(&cxld->youngest->dev) : "");
+	device_unlock(dev);
+
+	return rc;
+}
+
+static ssize_t create_region_store(struct device *dev,
+				   struct device_attribute *attr,
+				   const char *buf, size_t len)
+{
+	struct cxl_decoder *cxld = to_cxl_decoder(dev);
+	struct cxl_region *region;
+	ssize_t rc;
+	int val;
+
+	rc = kstrtoint(buf, 0, &val);
+	if (rc)
+		return rc;
+	if (val < 0 || val > 16)
+		return -EINVAL;
+
+	region = cxl_alloc_region(cxld, val);
+	if (IS_ERR(region))
+		return PTR_ERR(region);
+
+	rc = cxl_add_region(cxld, region);
+	if (rc) {
+		cxl_free_region(cxld, region);
+		return rc;
+	}
+
+	cxld->youngest = region;
+	return len;
+}
+static DEVICE_ATTR_RW(create_region);
+
+static ssize_t delete_region_store(struct device *dev,
+				   struct device_attribute *attr,
+				   const char *buf, size_t len)
+{
+	struct cxl_decoder *cxld = to_cxl_decoder(dev);
+	int rc;
+
+	rc = cxl_delete_region(cxld, buf);
+	if (rc)
+		return rc;
+
+	return len;
+}
+static DEVICE_ATTR_WO(delete_region);
+
 static struct attribute *cxl_decoder_base_attrs[] = {
+	&dev_attr_create_region.attr,
+	&dev_attr_delete_region.attr,
 	&dev_attr_start.attr,
 	&dev_attr_size.attr,
 	&dev_attr_locked.attr,
@@ -171,7 +233,13 @@ static void cxl_decoder_release(struct device *dev)
 {
 	struct cxl_decoder *cxld = to_cxl_decoder(dev);
 	struct cxl_port *port = to_cxl_port(dev->parent);
+	struct cxl_region *region;
 
+	list_for_each_entry(region, &cxld->regions, list)
+		cxl_delete_region(cxld, dev_name(&region->dev));
+
+	dev_WARN_ONCE(dev, !ida_is_empty(&cxld->region_ida),
+		      "Lost track of a region");
 	ida_free(&port->decoder_ida, cxld->id);
 	kfree(cxld);
 }
@@ -483,8 +551,11 @@ cxl_decoder_alloc(struct cxl_port *port, int nr_targets, resource_size_t base,
 		.interleave_ways = interleave_ways,
 		.interleave_granularity = interleave_granularity,
 		.target_type = type,
+		.regions = LIST_HEAD_INIT(cxld->regions),
 	};
 
+	ida_init(&cxld->region_ida);
+
 	/* handle implied target_list */
 	if (interleave_ways == 1)
 		cxld->target[0] =
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index b6bda39a59e3..8b27a07d7d0f 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -190,6 +190,9 @@ enum cxl_decoder_type {
  * @interleave_granularity: data stride per dport
  * @target_type: accelerator vs expander (type2 vs type3) selector
  * @flags: memory type capabilities and locking
+ * @region_ida: allocator for region ids.
+ * @regions: List of regions mapped (may be disabled) by this decoder.
+ * @youngest: Last region created for this decoder.
  * @target: active ordered target list in current decoder configuration
  */
 struct cxl_decoder {
@@ -200,6 +203,9 @@ struct cxl_decoder {
 	int interleave_granularity;
 	enum cxl_decoder_type target_type;
 	unsigned long flags;
+	struct ida region_ida;
+	struct list_head regions;
+	struct cxl_region *youngest;
 	struct cxl_dport *target[];
 };
 
@@ -262,6 +268,11 @@ struct cxl_dport {
 	struct list_head list;
 };
 
+struct cxl_region *cxl_alloc_region(struct cxl_decoder *cxld,
+				    int interleave_ways);
+void cxl_free_region(struct cxl_decoder *cxld, struct cxl_region *region);
+int cxl_add_region(struct cxl_decoder *cxld, struct cxl_region *region);
+int cxl_delete_region(struct cxl_decoder *cxld, const char *region);
 struct cxl_port *to_cxl_port(struct device *dev);
 struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport,
 				   resource_size_t component_reg_phys,
diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
new file mode 100644
index 000000000000..7475dc6ba84b
--- /dev/null
+++ b/drivers/cxl/region.c
@@ -0,0 +1,162 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
+#include <linux/io-64-nonatomic-lo-hi.h>
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/idr.h>
+#include "region.h"
+#include "cxl.h"
+#include "mem.h"
+
+/**
+ * DOC: cxl region
+ *
+ * A CXL region encompasses a chunk of host physical address space that may be
+ * consumed by a single device (x1 interleave aka linear) or across multiple
+ * devices (xN interleaved). A region is a child device of a &struct
+ * cxl_decoder. There may be multiple active regions under a single &struct
+ * cxl_decoder. The common case for multiple regions would be several linear,
+ * contiguous regions under a single decoder. Generally, there will be a 1:1
+ * relationship between decoder and region when the region is interleaved.
+ */
+
+static void cxl_region_release(struct device *dev);
+
+static const struct device_type cxl_region_type = {
+	.name = "cxl_region",
+	.release = cxl_region_release,
+};
+
+bool is_cxl_region(struct device *dev)
+{
+	return dev->type == &cxl_region_type;
+}
+
+static struct cxl_region *to_cxl_region(struct device *dev)
+{
+	if (dev_WARN_ONCE(dev, dev->type != &cxl_region_type,
+			  "not a cxl_region device\n"))
+		return NULL;
+
+	return container_of(dev, struct cxl_region, dev);
+}
+
+
+void cxl_free_region(struct cxl_decoder *cxld, struct cxl_region *region)
+{
+	ida_free(&cxld->region_ida, region->id);
+	kfree(region);
+}
+
+static void cxl_region_release(struct device *dev)
+{
+	struct cxl_decoder *cxld = to_cxl_decoder(dev->parent);
+	struct cxl_region *region = to_cxl_region(dev);
+
+	cxl_free_region(cxld, region);
+}
+
+struct cxl_region *cxl_alloc_region(struct cxl_decoder *cxld,
+				    int interleave_ways)
+{
+	struct cxl_region *region;
+	int rc;
+
+	region = kzalloc(struct_size(region, targets, interleave_ways),
+			 GFP_KERNEL);
+	if (!region)
+		return ERR_PTR(-ENOMEM);
+
+	region->eniw = interleave_ways;
+
+	rc = ida_alloc(&cxld->region_ida, GFP_KERNEL);
+	if (rc < 0) {
+		dev_err(&cxld->dev, "Couldn't get a new id\n");
+		kfree(region);
+		return ERR_PTR(rc);
+	}
+	region->id = rc;
+
+	return region;
+}
+
+/**
+ * cxl_add_region - Adds a region to a decoder
+ * @cxld: Parent decoder.
+ * @region: Region to be added to the decoder.
+ *
+ * This is the second step of region initialization. Regions exist within an
+ * address space which is mapped by a @cxld, and that @cxld enforces constraints
+ * upon the region as it is configured. Regions may be added to a @cxld but not
+ * activated and therefore it is possible to have more regions in a @cxld than
+ * there are interleave ways in the @cxld. Regions exist only for persistent
+ * capacities.
+ *
+ * Return: zero if the region was added to the @cxld, else returns negative
+ * error code.
+ */
+int cxl_add_region(struct cxl_decoder *cxld, struct cxl_region *region)
+{
+	struct cxl_port *port = to_cxl_port(cxld->dev.parent);
+	struct device *dev = &region->dev;
+	int rc;
+
+	device_initialize(dev);
+	dev->parent = &cxld->dev;
+	device_set_pm_not_required(dev);
+	dev->bus = &cxl_bus_type;
+	dev->type = &cxl_region_type;
+	rc = dev_set_name(dev, "region%d.%d:%d", port->id, cxld->id, region->id);
+	if (rc)
+		goto err;
+
+	rc = device_add(dev);
+	if (rc)
+		goto err;
+
+	dev_dbg(dev, "Added %s to %s\n", dev_name(dev), dev_name(&cxld->dev));
+
+	return 0;
+
+err:
+	put_device(dev);
+	return rc;
+}
+
+static struct cxl_region *
+cxl_find_region_by_name(struct cxl_decoder *cxld, const char *name)
+{
+	struct device *region_dev;
+
+	region_dev = device_find_child_by_name(&cxld->dev, name);
+	if (!region_dev)
+		return ERR_PTR(-ENOENT);
+
+	return to_cxl_region(region_dev);
+}
+
+int cxl_delete_region(struct cxl_decoder *cxld, const char *region_name)
+{
+	struct cxl_region *region;
+
+	device_lock(&cxld->dev);
+
+	region = cxl_find_region_by_name(cxld, region_name);
+	if (IS_ERR(region)) {
+		device_unlock(&cxld->dev);
+		return PTR_ERR(region);
+	}
+
+	dev_dbg(&cxld->dev, "Requested removal of %s from %s\n",
+		dev_name(&region->dev), dev_name(&cxld->dev));
+
+	cmpxchg(&cxld->youngest, region, NULL);
+
+	device_unregister(&region->dev);
+	device_unlock(&cxld->dev);
+
+	put_device(&region->dev);
+
+	return 0;
+}
diff --git a/drivers/cxl/region.h b/drivers/cxl/region.h
new file mode 100644
index 000000000000..f2c37a6f1192
--- /dev/null
+++ b/drivers/cxl/region.h
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright(c) 2021 Intel Corporation. */
+#ifndef __CXL_REGION_H__
+#define __CXL_REGION_H__
+
+#include <linux/uuid.h>
+
+/**
+ * struct cxl_region - CXL region
+ * @dev: This region's device.
+ * @id: This regions id. Id is globally unique across all regions.
+ * @res: Address space consumed by this region.
+ * @requested_size: Size of the region determined from LSA or userspace.
+ * @uuid: The UUID for this region.
+ * @list: Node in decoders region list.
+ * @eniw: Number of interleave ways this region is configured for.
+ * @targets: The memory devices comprising the region.
+ */
+struct cxl_region {
+	struct device dev;
+	int id;
+	struct resource *res;
+	u64 requested_size;
+	uuid_t uuid;
+	struct list_head list;
+	int eniw;
+	struct cxl_memdev *targets[];
+};
+
+bool is_cxl_region(struct device *dev);
+bool cxl_is_region_configured(struct cxl_region *region);
+
+#endif
-- 
2.32.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2 4/6] cxl/region: Introduce a cxl_region driver
  2021-06-17 17:36 ` [PATCH 4/6] cxl/region: Introduce a cxl_region driver Ben Widawsky
@ 2021-06-17 21:13   ` Ben Widawsky
  2021-06-18 11:49     ` Jonathan Cameron
  0 siblings, 1 reply; 20+ messages in thread
From: Ben Widawsky @ 2021-06-17 21:13 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

The cxl_region driver is responsible for managing the HDM decoder
programming in the CXL topology. Once a region is created it must be
configured and bound to the driver in order to activate it.

The following is a sample of how such controls might work:

echo 1 > /sys/bus/cxl/devices/decoder0.0/create_region
echo $((256<<20)) > /sys/bus/cxl/devices/decoder0.0/region0.0:0/size
echo mem0 > /sys/bus/cxl/devices/decoder0.0/region0.0:0/target0
echo region0.0:0 > /sys/bus/cxl/drivers/cxl_region/bind

In order to handle the eventual rise in failure modes of binding a
region, a new trace event is created to help track these failures for
debug and reconfiguration paths in userspace.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---

v2 Is updated to take into account the static region_type.

---
 drivers/cxl/Makefile |   2 +-
 drivers/cxl/core.c   |  15 +++++-
 drivers/cxl/cxl.h    |   1 +
 drivers/cxl/mem.h    |   1 +
 drivers/cxl/region.c | 109 ++++++++++++++++++++++++++++++++++++++-----
 drivers/cxl/region.h |  11 +++++
 drivers/cxl/trace.h  |  33 +++++++++++++
 7 files changed, 158 insertions(+), 14 deletions(-)
 create mode 100644 drivers/cxl/trace.h

diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
index c3151198c041..f35077c073b8 100644
--- a/drivers/cxl/Makefile
+++ b/drivers/cxl/Makefile
@@ -4,7 +4,7 @@ obj-$(CONFIG_CXL_MEM) += cxl_pci.o cxl_region.o
 obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
 obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
 
-ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=CXL
+ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=CXL -I$(src)
 cxl_core-y := core.o
 cxl_pci-y := pci.o
 cxl_acpi-y := acpi.o
diff --git a/drivers/cxl/core.c b/drivers/cxl/core.c
index d8d7ca85e110..be65a1a1493e 100644
--- a/drivers/cxl/core.c
+++ b/drivers/cxl/core.c
@@ -1086,6 +1086,8 @@ static int cxl_device_id(struct device *dev)
 		return CXL_DEVICE_NVDIMM_BRIDGE;
 	if (dev->type == &cxl_nvdimm_type)
 		return CXL_DEVICE_NVDIMM;
+	if (is_cxl_region(dev))
+		return CXL_DEVICE_REGION;
 	return 0;
 }
 
@@ -1102,7 +1104,18 @@ static int cxl_bus_match(struct device *dev, struct device_driver *drv)
 
 static int cxl_bus_probe(struct device *dev)
 {
-	return to_cxl_drv(dev->driver)->probe(dev);
+	int id = cxl_device_id(dev);
+
+	if (id == CXL_DEVICE_REGION) {
+		struct cxl_region *region = to_cxl_region(dev);
+
+		if (cxl_is_region_configured(region))
+			return to_cxl_drv(dev->driver)->probe(dev);
+	} else {
+		return to_cxl_drv(dev->driver)->probe(dev);
+	}
+
+	return -ENODEV;
 }
 
 static int cxl_bus_remove(struct device *dev)
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 8b27a07d7d0f..90107b21125b 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -325,6 +325,7 @@ void cxl_driver_unregister(struct cxl_driver *cxl_drv);
 
 #define CXL_DEVICE_NVDIMM_BRIDGE	1
 #define CXL_DEVICE_NVDIMM		2
+#define CXL_DEVICE_REGION		3
 
 #define MODULE_ALIAS_CXL(type) MODULE_ALIAS("cxl:t" __stringify(type) "*")
 #define CXL_MODALIAS_FMT "cxl:t%d"
diff --git a/drivers/cxl/mem.h b/drivers/cxl/mem.h
index fe12ef3c3dde..ff1f9c57e089 100644
--- a/drivers/cxl/mem.h
+++ b/drivers/cxl/mem.h
@@ -83,4 +83,5 @@ struct cxl_mem {
 	struct range pmem_range;
 	struct range ram_range;
 };
+
 #endif /* __CXL_MEM_H__ */
diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
index 41299bd6da53..74d6f6c06608 100644
--- a/drivers/cxl/region.c
+++ b/drivers/cxl/region.c
@@ -11,6 +11,9 @@
 #include "cxl.h"
 #include "mem.h"
 
+#define CREATE_TRACE_POINTS
+#include "trace.h"
+
 /**
  * DOC: cxl region
  *
@@ -27,8 +30,24 @@ static struct cxl_region *to_cxl_region(struct device *dev);
 
 static bool is_region_active(struct cxl_region *region)
 {
-	/* TODO: Regions can't be activated yet. */
-	return false;
+	return region->active;
+}
+
+/*
+ * Most sanity checking is left up to region binding. This does the most basic
+ * check to determine whether or not the core should try probing the driver.
+ */
+bool cxl_is_region_configured(struct cxl_region *region)
+{
+	/* zero sized regions aren't a thing. */
+	if (region->requested_size <= 0)
+		return false;
+
+	/* all regions have at least 1 target */
+	if (region->targets[0])
+		return false;
+
+	return true;
 }
 
 static ssize_t offset_show(struct device *dev, struct device_attribute *attr,
@@ -249,16 +268,6 @@ bool is_cxl_region(struct device *dev)
 	return dev->type == &cxl_region_type;
 }
 
-static struct cxl_region *to_cxl_region(struct device *dev)
-{
-	if (dev_WARN_ONCE(dev, dev->type != &cxl_region_type,
-			  "not a cxl_region device\n"))
-		return NULL;
-
-	return container_of(dev, struct cxl_region, dev);
-}
-
-
 void cxl_free_region(struct cxl_decoder *cxld, struct cxl_region *region)
 {
 	int i;
@@ -383,3 +392,79 @@ int cxl_delete_region(struct cxl_decoder *cxld, const char *region_name)
 
 	return 0;
 }
+
+static int bind_region(struct cxl_region *region)
+{
+	int i;
+
+	if (dev_WARN_ONCE(&region->dev, !cxl_is_region_configured(region),
+			  "unconfigured regions can't be probed (race?)\n")) {
+		return -ENXIO;
+	}
+
+	if (region->requested_size % (SZ_256M * region->eniw)) {
+		trace_cxl_region_bind(region, "Invalid size. Must be multiple of NIW");
+		return -ENXIO;
+	}
+
+	for (i = 0; i < region->eniw; i++)
+		if (!region->targets[i]) {
+			trace_cxl_region_bind(region, "Missing memory device target");
+			return -ENXIO;
+		}
+
+	/* TODO: Allocate from decoder's address space */
+
+	/* TODO: program HDM decoders */
+
+	if (uuid_is_null(&region->uuid))
+		uuid_gen(&region->uuid);
+
+	trace_cxl_region_bind(region, "Region binding succeeded.");
+	return 0;
+}
+
+static int cxl_region_probe(struct device *dev)
+{
+	struct cxl_region *region = to_cxl_region(dev);
+	int ret;
+
+	if (region->active)
+		return -EBUSY;
+
+	device_lock(&region->dev);
+	ret = bind_region(region);
+	if (!ret)
+		region->active = true;
+	device_unlock(&region->dev);
+
+	return ret;
+}
+
+static void cxl_region_remove(struct device *dev)
+{
+	/* Remove region from the decoder's address space */
+}
+
+static struct cxl_driver cxl_region_driver = {
+	.name = "cxl_region",
+	.probe = cxl_region_probe,
+	.remove = cxl_region_remove,
+	.id = CXL_DEVICE_REGION,
+};
+
+static __init int cxl_region_init(void)
+{
+	return cxl_driver_register(&cxl_region_driver);
+}
+
+static __exit void cxl_region_exit(void)
+{
+	cxl_driver_unregister(&cxl_region_driver);
+}
+
+MODULE_LICENSE("GPL v2");
+module_init(cxl_region_init);
+module_exit(cxl_region_exit);
+MODULE_IMPORT_NS(CXL);
+MODULE_ALIAS_CXL(CXL_REGION);
diff --git a/drivers/cxl/region.h b/drivers/cxl/region.h
index f2c37a6f1192..7f7331d9029b 100644
--- a/drivers/cxl/region.h
+++ b/drivers/cxl/region.h
@@ -14,6 +14,7 @@
  * @uuid: The UUID for this region.
  * @list: Node in decoders region list.
  * @eniw: Number of interleave ways this region is configured for.
+ * @active: If the region has been activated.
  * @targets: The memory devices comprising the region.
  */
 struct cxl_region {
@@ -24,10 +25,20 @@ struct cxl_region {
 	uuid_t uuid;
 	struct list_head list;
 	int eniw;
+	bool active;
 	struct cxl_memdev *targets[];
 };
 
 bool is_cxl_region(struct device *dev);
 bool cxl_is_region_configured(struct cxl_region *region);
 
+static inline struct cxl_region *to_cxl_region(struct device *dev)
+{
+	if (dev_WARN_ONCE(dev, !is_cxl_region(dev),
+			  "not a cxl_region device\n"))
+		return NULL;
+
+	return container_of(dev, struct cxl_region, dev);
+}
+
 #endif
diff --git a/drivers/cxl/trace.h b/drivers/cxl/trace.h
new file mode 100644
index 000000000000..fe9576ec2cde
--- /dev/null
+++ b/drivers/cxl/trace.h
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM cxl
+
+#if !defined (__CXL_TRACE_H__) || defined(TRACE_HEADER_MULTI_READ)
+#define __CXL_TRACE_H__
+
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(cxl_region_bind,
+	TP_PROTO(struct cxl_region *region, char *status),
+	TP_ARGS(region, status),
+	TP_STRUCT__entry(
+		__field(struct cxl_region *, region)
+		__string(status, status)
+	),
+
+	TP_fast_assign(
+		__entry->region = region;
+		__assign_str(status, status);
+	),
+
+	TP_printk("%s failed to bind (%s)", dev_name(&__entry->region->dev), __get_str(status))
+);
+
+#endif /* if !defined (__CXL_TRACE_H__) || defined(TRACE_HEADER_MULTI_READ) */
+
+/* This part must be outside protection */
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH .
+#define TRACE_INCLUDE_FILE trace
+#include <trace/define_trace.h>
-- 
2.32.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/6] cxl/region: Add region creation ABI
  2021-06-17 17:36 ` [PATCH 1/6] cxl/region: Add region creation ABI Ben Widawsky
  2021-06-17 21:11   ` [PATCH v2 " Ben Widawsky
@ 2021-06-18  9:13   ` Jonathan Cameron
  2021-06-18 15:07     ` Ben Widawsky
  1 sibling, 1 reply; 20+ messages in thread
From: Jonathan Cameron @ 2021-06-18  9:13 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Alison Schofield, Dan Williams, Ira Weiny, Vishal Verma

On Thu, 17 Jun 2021 10:36:50 -0700
Ben Widawsky <ben.widawsky@intel.com> wrote:

> Regions are created as a child of the decoder that encompasses an
> address space with constraints. Regions only exist for persistent
> capacities.
> 
> When regions are created, the number of desired interleave ways must be
> known. To enable this, the sysfs attribute will take the desired ways as
> input. This interface intentionally allows creation of
> impossible-to-enable regions based on interleave constraints in the
> topology. The reasoning is to create new regions through the kernel
> interfaces which may become possible on reboot under a variety of
> circumstances.
> 
> As an example, creating a x1 region with:
> echo 1 > /sys/bus/cxl/devices/decoder1.0/create_region
> 
> Will yield /sys/bus/cxl/devices/decoder1.0/region1.0:0
> 
> That region may then be deleted with:
> echo region1.0:0 > /sys/bus/cxl/devices/decoder1.0/delete_region
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

Hi Ben,

Some comments inline. The sysfs interface is getting a little
clever for my liking....

> ---
>  Documentation/ABI/testing/sysfs-bus-cxl       |  21 +++
>  .../driver-api/cxl/memory-devices.rst         |  11 ++
>  drivers/cxl/Makefile                          |   3 +-
>  drivers/cxl/core.c                            |  71 +++++++++
>  drivers/cxl/cxl.h                             |  11 ++
>  drivers/cxl/region.c                          | 147 ++++++++++++++++++
>  drivers/cxl/region.h                          |  43 +++++
>  7 files changed, 306 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/cxl/region.c
>  create mode 100644 drivers/cxl/region.h
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
> index 0b6a2e6e8fbb..115a25d2899d 100644
> --- a/Documentation/ABI/testing/sysfs-bus-cxl
> +++ b/Documentation/ABI/testing/sysfs-bus-cxl
> @@ -127,3 +127,24 @@ Description:
>  		memory (type-3). The 'target_type' attribute indicates the
>  		current setting which may dynamically change based on what
>  		memory regions are activated in this decode hierarchy.
> +
> +What:		/sys/bus/cxl/devices/decoderX.Y/create_region
> +Date:		June, 2021
> +KernelVersion:	v5.14
> +Contact:	linux-cxl@vger.kernel.org
> +Description:
> +		Creates a new CXL region of N interleaved ways. Writing a value
> +		of '2' will create a new uninitialized region with 2x interleave
> +		that will be mapped by the CXL decoderX.Y. Reading from this
> +		node will return the last created region. Regions must be
> +		subsequently configured and bound to a region driver before they
> +		can be used.

I don't like sysfs attributes that return entirely different looking (and seemingly
unrelated) values from what you write to them.
I could sort of get behind writing 1 and returning the region (as in the RFC)
on basis that felt like 'create' and 'what did I create'.
Doing '2' and getting back anything other than '2' feels wrong

Maybe, would feel better if the name made it more explicit that the thing took
interleave values?

create_region_with_interleave perhaps?

Not ideal.  Sysfs always a bit clunky for this stuff and configfs would mean
a split interface which is never ideal.

It's a bit horrible, but maybe a more intuitive interface would be to make the
targetX visibility dynamic and hence make interleave an attribute of the region?

sysfs_update_group() is rarely used, but I think the intent is to support
this sort of case.

I've not fully thought through the effects of that though so might
well be missing something.

Also, add to the docs what can be expected if there is no 'last created'
or it's been removed again.

> +
> +What:		/sys/bus/cxl/devices/decoderX.Y/delete_region
> +Date:		June, 2021
> +KernelVersion:	v5.14
> +Contact:	linux-cxl@vger.kernel.org
> +Description:
> +		Deletes the named region. A region must be unbound from the
> +		region driver before being deleted. The attributes expects a
> +		region in the form "regionX.Y:Z".
> diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst
> index 487ce4f41d77..c7f59a8c94db 100644
> --- a/Documentation/driver-api/cxl/memory-devices.rst
> +++ b/Documentation/driver-api/cxl/memory-devices.rst
> @@ -39,6 +39,17 @@ CXL Core
>  .. kernel-doc:: drivers/cxl/core.c
>     :doc: cxl core
>  
> +CXL Regions
> +-----------
> +.. kernel-doc:: drivers/cxl/region.c
> +   :doc: cxl region
> +
> +.. kernel-doc:: drivers/cxl/region.h
> +   :identifiers:
> +
> +.. kernel-doc:: drivers/cxl/region.c
> +   :identifiers:
> +
>  External Interfaces
>  ===================
>  
> diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
> index 32954059b37b..c3151198c041 100644
> --- a/drivers/cxl/Makefile
> +++ b/drivers/cxl/Makefile
> @@ -1,6 +1,6 @@
>  # SPDX-License-Identifier: GPL-2.0
>  obj-$(CONFIG_CXL_BUS) += cxl_core.o
> -obj-$(CONFIG_CXL_MEM) += cxl_pci.o
> +obj-$(CONFIG_CXL_MEM) += cxl_pci.o cxl_region.o
>  obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
>  obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
>  
> @@ -9,3 +9,4 @@ cxl_core-y := core.o
>  cxl_pci-y := pci.o
>  cxl_acpi-y := acpi.o
>  cxl_pmem-y := pmem.o
> +cxl_region-y := region.o
> diff --git a/drivers/cxl/core.c b/drivers/cxl/core.c
> index a2e4d54fc7bc..d8d7ca85e110 100644
> --- a/drivers/cxl/core.c
> +++ b/drivers/cxl/core.c
> @@ -6,6 +6,7 @@
>  #include <linux/pci.h>
>  #include <linux/slab.h>
>  #include <linux/idr.h>
> +#include "region.h"
>  #include "cxl.h"
>  #include "mem.h"
>  
> @@ -120,7 +121,68 @@ static ssize_t target_list_show(struct device *dev,
>  }
>  static DEVICE_ATTR_RO(target_list);
>  
> +static ssize_t create_region_show(struct device *dev,
> +				  struct device_attribute *attr, char *buf)
> +{
> +	struct cxl_decoder *cxld = to_cxl_decoder(dev);
> +	int rc;
> +
> +	device_lock(dev);
> +	rc = sprintf(buf, "%s\n",
> +		     cxld->youngest ? dev_name(&cxld->youngest->dev) : "");

An alternative is to be cynical and adjust the interface a touch.
Lets say it always returns the name of the last created region.  If that's
true, just copy the name at creation time.  Then no need to care if the
pointer is live or not.

> +	device_unlock(dev);
> +
> +	return rc;
> +}
> +
> +static ssize_t create_region_store(struct device *dev,
> +				   struct device_attribute *attr,
> +				   const char *buf, size_t len)
> +{
> +	struct cxl_decoder *cxld = to_cxl_decoder(dev);
> +	struct cxl_region *region;
> +	ssize_t rc;
> +	int val;
> +
> +	rc = kstrtoint(buf, 0, &val);
> +	if (rc)
> +		return rc;
> +	if (val < 0 || val > 16)
> +		return -EINVAL;
> +
> +	region = cxl_alloc_region(cxld, val);
> +	if (IS_ERR(region))
> +		return PTR_ERR(region);
> +
> +	rc = cxl_add_region(cxld, region);
> +	if (rc) {
> +		cxl_free_region(cxld, region);
> +		return rc;
> +	}
> +
> +	cxld->youngest = region;
> +	return len;
> +}
> +static DEVICE_ATTR_RW(create_region);
> +
> +static ssize_t delete_region_store(struct device *dev,
> +				   struct device_attribute *attr,
> +				   const char *buf, size_t len)
> +{
> +	struct cxl_decoder *cxld = to_cxl_decoder(dev);
> +	int rc;
> +
> +	rc = cxl_delete_region(cxld, buf);
> +	if (rc)
> +		return rc;
> +
> +	return len;
> +}
> +static DEVICE_ATTR_WO(delete_region);
> +
>  static struct attribute *cxl_decoder_base_attrs[] = {
> +	&dev_attr_create_region.attr,
> +	&dev_attr_delete_region.attr,
>  	&dev_attr_start.attr,
>  	&dev_attr_size.attr,
>  	&dev_attr_locked.attr,
> @@ -171,7 +233,13 @@ static void cxl_decoder_release(struct device *dev)
>  {
>  	struct cxl_decoder *cxld = to_cxl_decoder(dev);
>  	struct cxl_port *port = to_cxl_port(dev->parent);
> +	struct cxl_region *region;
>  
> +	list_for_each_entry(region, &cxld->regions, list)
> +		cxl_delete_region(cxld, dev_name(&region->dev));
> +
> +	dev_WARN_ONCE(dev, !ida_is_empty(&cxld->region_ida),
> +		      "Lost track of a region");
>  	ida_free(&port->decoder_ida, cxld->id);
>  	kfree(cxld);
>  }
> @@ -483,8 +551,11 @@ cxl_decoder_alloc(struct cxl_port *port, int nr_targets, resource_size_t base,
>  		.interleave_ways = interleave_ways,
>  		.interleave_granularity = interleave_granularity,
>  		.target_type = type,
> +		.regions = LIST_HEAD_INIT(cxld->regions),
>  	};
>  
> +	ida_init(&cxld->region_ida);
> +
>  	/* handle implied target_list */
>  	if (interleave_ways == 1)
>  		cxld->target[0] =
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index b6bda39a59e3..8b27a07d7d0f 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -190,6 +190,9 @@ enum cxl_decoder_type {
>   * @interleave_granularity: data stride per dport
>   * @target_type: accelerator vs expander (type2 vs type3) selector
>   * @flags: memory type capabilities and locking
> + * @region_ida: allocator for region ids.
> + * @regions: List of regions mapped (may be disabled) by this decoder.
> + * @youngest: Last region created for this decoder.
>   * @target: active ordered target list in current decoder configuration
>   */
>  struct cxl_decoder {
> @@ -200,6 +203,9 @@ struct cxl_decoder {
>  	int interleave_granularity;
>  	enum cxl_decoder_type target_type;
>  	unsigned long flags;
> +	struct ida region_ida;
> +	struct list_head regions;
> +	struct cxl_region *youngest;
>  	struct cxl_dport *target[];
>  };
>  
> @@ -262,6 +268,11 @@ struct cxl_dport {
>  	struct list_head list;
>  };
>  
> +struct cxl_region *cxl_alloc_region(struct cxl_decoder *cxld,
> +				    int interleave_ways);
> +void cxl_free_region(struct cxl_decoder *cxld, struct cxl_region *region);
> +int cxl_add_region(struct cxl_decoder *cxld, struct cxl_region *region);
> +int cxl_delete_region(struct cxl_decoder *cxld, const char *region);
>  struct cxl_port *to_cxl_port(struct device *dev);
>  struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport,
>  				   resource_size_t component_reg_phys,
> diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
> new file mode 100644
> index 000000000000..391467e864a2
> --- /dev/null
> +++ b/drivers/cxl/region.c
> @@ -0,0 +1,147 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
> +#include <linux/io-64-nonatomic-lo-hi.h>
> +#include <linux/device.h>
> +#include <linux/module.h>
> +#include <linux/slab.h>
> +#include <linux/idr.h>
> +#include "region.h"
> +#include "cxl.h"
> +#include "mem.h"
> +
> +/**
> + * DOC: cxl region
> + *
> + * A CXL region encompasses a chunk of host physical address space that may be
> + * consumed by a single device (x1 interleave aka linear) or across multiple
> + * devices (xN interleaved). A region is a child device of a &struct
> + * cxl_decoder. There may be multiple active regions under a single &struct
> + * cxl_decoder. The common case for multiple regions would be several linear,
> + * contiguous regions under a single decoder. Generally, there will be a 1:1
> + * relationship between decoder and region when the region is interleaved.
> + */
> +
> +static void cxl_region_release(struct device *dev);
> +
> +const struct device_type cxl_region_type = {
> +	.name = "cxl_region",
> +	.release = cxl_region_release,
> +};
> +
> +void cxl_free_region(struct cxl_decoder *cxld, struct cxl_region *region)
> +{
> +	ida_free(&cxld->region_ida, region->id);
> +	kfree(region);
> +}
> +
> +static void cxl_region_release(struct device *dev)
> +{
> +	struct cxl_decoder *cxld = to_cxl_decoder(dev->parent);
> +	struct cxl_region *region = to_cxl_region(dev);
> +
> +	cxl_free_region(cxld, region);
> +}
> +
> +struct cxl_region *cxl_alloc_region(struct cxl_decoder *cxld,
> +				    int interleave_ways)
> +{
> +	struct cxl_region *region;
> +	int rc;
> +
> +	region = kzalloc(struct_size(region, targets, interleave_ways),
> +			 GFP_KERNEL);
> +	if (!region)
> +		return ERR_PTR(-ENOMEM);
> +
> +	region->eniw = interleave_ways;
> +
> +	rc = ida_alloc(&cxld->region_ida, GFP_KERNEL);
> +	if (rc < 0) {
> +		dev_err(&cxld->dev, "Couldn't get a new id\n");
> +		kfree(region);
> +		return ERR_PTR(rc);
> +	}
> +	region->id = rc;
> +
> +	return region;
> +}
> +
> +/**
> + * cxl_add_region - Adds a region to a decoder
> + * @cxld: Parent decoder.
> + * @region: Region to be added to the decoder.
> + *
> + * This is the second step of region initialization. Regions exist within an
> + * address space which is mapped by a @cxld, and that @cxld enforces constraints
> + * upon the region as it is configured. Regions may be added to a @cxld but not
> + * activated and therefore it is possible to have more regions in a @cxld than
> + * there are interleave ways in the @cxld. Regions exist only for persistent
> + * capacities.
> + *
> + * Return: zero if the region was added to the @cxld, else returns negative
> + * error code.
> + */
> +int cxl_add_region(struct cxl_decoder *cxld, struct cxl_region *region)
> +{
> +	struct cxl_port *port = to_cxl_port(cxld->dev.parent);
> +	struct device *dev = &region->dev;
> +	int rc;
> +
> +	device_initialize(dev);
> +	dev->parent = &cxld->dev;
> +	device_set_pm_not_required(dev);
> +	dev->bus = &cxl_bus_type;
> +	dev->type = &cxl_region_type;
> +	rc = dev_set_name(dev, "region%d.%d:%d", port->id, cxld->id, region->id);
> +	if (rc)
> +		goto err;
> +
> +	rc = device_add(dev);
> +	if (rc)
> +		goto err;
> +
> +	dev_dbg(dev, "Added %s to %s\n", dev_name(dev), dev_name(&cxld->dev));
> +
> +	return 0;
> +
> +err:
> +	put_device(dev);
> +	return rc;
> +}
> +
> +static struct cxl_region *
> +cxl_find_region_by_name(struct cxl_decoder *cxld, const char *name)
> +{
> +	struct device *region_dev;
> +
> +	region_dev = device_find_child_by_name(&cxld->dev, name);
> +	if (!region_dev)
> +		return ERR_PTR(-ENOENT);
> +
> +	return to_cxl_region(region_dev);
> +}
> +
> +int cxl_delete_region(struct cxl_decoder *cxld, const char *region_name)
> +{
> +	struct cxl_region *region;
> +
> +	device_lock(&cxld->dev);
> +
> +	region = cxl_find_region_by_name(cxld, region_name);
> +	if (IS_ERR(region)) {
> +		device_unlock(&cxld->dev);
> +		return PTR_ERR(region);
> +	}
> +
> +	dev_dbg(&cxld->dev, "Requested removal of %s from %s\n",
> +		dev_name(&region->dev), dev_name(&cxld->dev));
> +
> +	cmpxchg(&cxld->youngest, region, NULL);

Why does this need to be atomic?  I think the other side of anything
that might make use of this isn't atomic, except because you are holding
the device_lock for cxld->dev

> +
> +	device_unregister(&region->dev);
> +	device_unlock(&cxld->dev);
> +
> +	put_device(&region->dev);
> +
> +	return 0;
> +}
> diff --git a/drivers/cxl/region.h b/drivers/cxl/region.h
> new file mode 100644
> index 000000000000..7a87d229e38a
> --- /dev/null
> +++ b/drivers/cxl/region.h
> @@ -0,0 +1,43 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/* Copyright(c) 2021 Intel Corporation. */
> +#ifndef __CXL_REGION_H__
> +#define __CXL_REGION_H__
> +
> +#include <linux/uuid.h>
> +
> +extern const struct device_type cxl_region_type;
> +
> +/**
> + * struct cxl_region - CXL region
> + * @dev: This region's device.
> + * @id: This regions id. Id is globally unique across all regions.
> + * @res: Address space consumed by this region.
> + * @requested_size: Size of the region determined from LSA or userspace.
> + * @uuid: The UUID for this region.
> + * @list: Node in decoders region list.
> + * @eniw: Number of interleave ways this region is configured for.
> + * @targets: The memory devices comprising the region.
> + */
> +struct cxl_region {
> +	struct device dev;
> +	int id;
> +	struct resource *res;
> +	u64 requested_size;
> +	uuid_t uuid;
> +	struct list_head list;
> +	int eniw;
> +	struct cxl_memdev *targets[];
> +};
> +
> +static inline struct cxl_region *to_cxl_region(struct device *dev)
> +{
> +	if (dev_WARN_ONCE(dev, dev->type != &cxl_region_type,
> +			  "not a cxl_region device\n"))
> +		return NULL;
> +
> +	return container_of(dev, struct cxl_region, dev);
> +}
> +
> +bool cxl_is_region_configured(struct cxl_region *region);
> +
> +#endif


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/6] cxl: Move cxl_memdev conversion helper to mem.h
  2021-06-17 17:36 ` [PATCH 2/6] cxl: Move cxl_memdev conversion helper to mem.h Ben Widawsky
@ 2021-06-18  9:13   ` Jonathan Cameron
  2021-06-18 15:00   ` Dan Williams
  1 sibling, 0 replies; 20+ messages in thread
From: Jonathan Cameron @ 2021-06-18  9:13 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Alison Schofield, Dan Williams, Ira Weiny, Vishal Verma

On Thu, 17 Jun 2021 10:36:51 -0700
Ben Widawsky <ben.widawsky@intel.com> wrote:

> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Hard to argue with this one ;)  So I don't bother reading it again...

Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
>  drivers/cxl/mem.h | 5 +++++
>  drivers/cxl/pci.c | 5 -----
>  2 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/cxl/mem.h b/drivers/cxl/mem.h
> index 8f02d02b26b4..fe12ef3c3dde 100644
> --- a/drivers/cxl/mem.h
> +++ b/drivers/cxl/mem.h
> @@ -48,6 +48,11 @@ struct cxl_memdev {
>  	int id;
>  };
>  
> +static inline struct cxl_memdev *to_cxl_memdev(struct device *dev)
> +{
> +       return container_of(dev, struct cxl_memdev, dev);
> +}
> +
>  /**
>   * struct cxl_mem - A CXL memory device
>   * @pdev: The PCI device associated with this CXL device.
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index f8408e5f0754..52508282ec37 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -1161,11 +1161,6 @@ static int cxl_mem_setup_regs(struct cxl_mem *cxlm)
>  	return ret;
>  }
>  
> -static struct cxl_memdev *to_cxl_memdev(struct device *dev)
> -{
> -	return container_of(dev, struct cxl_memdev, dev);
> -}
> -
>  static void cxl_memdev_release(struct device *dev)
>  {
>  	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/6] cxl/region: Introduce concept of region configuration
  2021-06-17 17:36 ` [PATCH 3/6] cxl/region: Introduce concept of region configuration Ben Widawsky
@ 2021-06-18 11:22   ` Jonathan Cameron
  2021-06-18 15:25     ` Ben Widawsky
  0 siblings, 1 reply; 20+ messages in thread
From: Jonathan Cameron @ 2021-06-18 11:22 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Alison Schofield, Dan Williams, Ira Weiny, Vishal Verma

On Thu, 17 Jun 2021 10:36:52 -0700
Ben Widawsky <ben.widawsky@intel.com> wrote:

> The region creation APIs leave a region unconfigured. Configuring the
> region will work in the same way as similar subsystems such as devdax.
> Sysfs attrs will be provided to allow userspace to configure the region.
> Finally once all configuration is complete, userspace may "commit" the
> config. What the kernel decides to do after a config is committed is out
> of scope at this point.
> 
> Introduced here are the most basic attributes needed to configure a
> region.
> 
> A x1 interleave example is provided below:
> 
> decoder1.0
> ├── create_region
> ├── delete_region
> ├── devtype
> ├── locked
> ├── region1.0:0
> │   ├── offset
> │   ├── size
> │   ├── subsystem -> ../../../../../../../bus/cxl
> │   ├── target0
> │   ├── uevent
> │   ├── uuid
> │   └── verify
> ├── size
> ├── start
> ├── subsystem -> ../../../../../../bus/cxl
> ├── target_list
> ├── target_type
> └── uevent
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

Hi Ben,

I think some of the sysfs interface needs more detailed docs
and consideration about consistency.

If an interface isn't fairly obvious without reading the docs
(but understanding what it is trying to do) then it's normally
a bad sign...

Jonathan

> ---
>  Documentation/ABI/testing/sysfs-bus-cxl |  32 ++++
>  drivers/cxl/region.c                    | 223 ++++++++++++++++++++++++
>  2 files changed, 255 insertions(+)
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
> index 115a25d2899d..70f9d09385a4 100644
> --- a/Documentation/ABI/testing/sysfs-bus-cxl
> +++ b/Documentation/ABI/testing/sysfs-bus-cxl
> @@ -148,3 +148,35 @@ Description:
>  		Deletes the named region. A region must be unbound from the
>  		region driver before being deleted. The attributes expects a
>  		region in the form "regionX.Y:Z".
> +
> +What:		/sys/bus/cxl/devices/decoderX.Y/regionX.Y:Z/offset
> +Date:		June, 2021
> +KernelVersion:	v5.14
> +Contact:	linux-cxl@vger.kernel.org
> +Description:
> +		(RO) A region resides within an address space that is claimed by
> +		a decoder. Region space allocation is handled by the driver, but
> +		the offset may be read by userspace tooling in order to
> +		determine fragmentation, and available size for new regions.

Side note rather than abotu this patch.  Feels to me like the format definition
of these docs should include something on permissions.  The (RO) bit is fine
but makes the docs less machine readable.  

> +
> +What:
> +/sys/bus/cxl/devices/decoderX.Y/regionX.Y:Z/{size,uuid,target[0-15]}
> +Date:		June, 2021
> +KernelVersion:	v5.14
> +Contact:	linux-cxl@vger.kernel.org
> +Description:
> +		(RW) Configuring regions requires a minimal set of parameters in
> +		order for the subsequent bind operation to succeed. The
> +		following parameters are defined:
> +
> +		==	========================================================
> +		size	Manadatory. Phsyical address space the region will

Spell check.

> +			consume.
> +		uuid	Optional. A unique identifier for the region. If none is
> +			selected, the kernel will create one.
> +		target  Mandatory. Memory devices are the backing storage for a
> +			region. There will be N targets based on the number of
> +			interleave ways that the top level decoder is configured
> +			for. Each target must be set with a memdev device ie.
> +			'mem1'.

Document what happens if not set.  'nil' is not obvious.

> +		==	========================================================

The automation looking at these files is increasing, so I wonder if that will
cause some issues with having these all documented in one block.

I don't see a huge disadvantage in breaking this into 3 entries.  Alternatively
wait for someone to scream and refactor it then.

> diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
> index 391467e864a2..cf7fd3027419 100644
> --- a/drivers/cxl/region.c
> +++ b/drivers/cxl/region.c
> @@ -3,7 +3,9 @@
>  #include <linux/io-64-nonatomic-lo-hi.h>
>  #include <linux/device.h>
>  #include <linux/module.h>
> +#include <linux/sizes.h>
>  #include <linux/slab.h>
> +#include <linux/uuid.h>
>  #include <linux/idr.h>
>  #include "region.h"
>  #include "cxl.h"
> @@ -21,16 +23,237 @@
>   * relationship between decoder and region when the region is interleaved.
>   */
>  
> +static struct cxl_region *to_cxl_region(struct device *dev);
> +
> +static bool is_region_active(struct cxl_region *region)
> +{
> +	/* TODO: Regions can't be activated yet. */
> +	return false;
> +}
> +
> +static ssize_t offset_show(struct device *dev, struct device_attribute *attr,
> +			   char *buf)
> +{
> +	struct cxl_region *region = to_cxl_region(dev);
> +	struct cxl_decoder *cxld = to_cxl_decoder(dev->parent);
> +
> +	if (!region->res)
> +		return sysfs_emit(buf, "\n");
> +
> +	return sysfs_emit(buf, "%#llx\n", cxld->range.start - region->res->start);
> +}
> +static DEVICE_ATTR_RO(offset);
> +
> +static ssize_t size_show(struct device *dev, struct device_attribute *attr,
> +			 char *buf)
> +{
> +	struct cxl_region *region = to_cxl_region(dev);
> +
> +	if (!region->res)
> +		return sysfs_emit(buf, "*%llu\n", region->requested_size);
> +
> +	return sysfs_emit(buf, "%llu\n", resource_size(region->res));
> +}
> +
> +static ssize_t size_store(struct device *dev, struct device_attribute *attr,
> +			  const char *buf, size_t len)
> +{
> +	struct cxl_region *region = to_cxl_region(dev);
> +	unsigned long long val;
> +	ssize_t rc;
> +
> +	rc = kstrtoull(buf, 0, &val);
> +	if (rc)
> +		return rc;
> +
> +	device_lock(&region->dev);
> +	if (is_region_active(region)) {
> +		rc = -EBUSY;
> +	} else {
> +		region->requested_size = val;
> +	}
> +	device_unlock(&region->dev);
> +
> +	return len;
> +}
> +static DEVICE_ATTR_RW(size);
> +
> +static ssize_t uuid_show(struct device *dev, struct device_attribute *attr,
> +			 char *buf)
> +{
> +	struct cxl_region *region = to_cxl_region(dev);
> +
> +	return sysfs_emit(buf, "%pUb\n", &region->uuid);
> +}
> +static ssize_t uuid_store(struct device *dev, struct device_attribute *attr,
> +			  const char *buf, size_t len)
> +{
> +	struct cxl_region *region = to_cxl_region(dev);
> +	ssize_t rc;
> +
> +	if (len != UUID_STRING_LEN + 1)
> +		return -EINVAL;
> +
> +	device_lock(&region->dev);
> +	if (is_region_active(region)) {
> +		rc = -EBUSY;
> +	} else {
> +		rc = uuid_parse(buf, &region->uuid);
> +	}
> +	device_unlock(&region->dev);
> +
> +	return rc ? rc : len;
> +}
> +static DEVICE_ATTR_RW(uuid);
> +
> +static struct attribute *region_attrs[] = {
> +	&dev_attr_offset.attr,
> +	&dev_attr_size.attr,
> +	&dev_attr_uuid.attr,
> +	NULL,
> +};
> +
> +static const struct attribute_group region_group = {
> +	.attrs = region_attrs,
> +};
> +
> +static size_t show_targetN(struct cxl_region *region, char *buf, int n)
> +{
> +	int ret;
> +
> +	device_lock(&region->dev);
> +	if (!region->targets[n])
> +		ret = sysfs_emit(buf, "%s\n", dev_name(&region->targets[n]->dev));
> +	else
> +		ret = sysfs_emit(buf, "nil\n");
> +	device_unlock(&region->dev);
> +
> +	return ret;
> +}
> +
> +static size_t set_targetN(struct cxl_region *region, const char *buf, int n, size_t len)
> +{
> +	struct device *memdev_dev;
> +	struct cxl_memdev *cxlmd;
> +	ssize_t rc;
> +	int val;
> +
> +	device_lock(&region->dev);
> +
> +	rc = kstrtoint(buf, 0, &val);
> +	/* Special 'remote' target operation */

remote?

> +	if (!rc && val == 0) {

I'm not sure on logic here.  This seems to be if the value is valid and 0
we do something different?  Why is 0 special? It's not documented as such
and the thing returns nil, not 0.

> +		cxlmd = region->targets[n];
> +		if (cxlmd)
> +			put_device(&cxlmd->dev);
> +		region->targets[n] = NULL;
> +		device_unlock(&region->dev);
> +		return len;
> +	}
> +
> +	memdev_dev = bus_find_device_by_name(&cxl_bus_type, NULL, buf);
> +	if (!memdev_dev)
> +		return -ENOENT;
> +
> +	cxlmd = to_cxl_memdev(memdev_dev);
> +	get_device(&cxlmd->dev);
> +	region->targets[n] = cxlmd;
> +
> +	device_unlock(&region->dev);
> +
> +	return len;
> +}
> +
> +#define TARGET_ATTR_RW(n)                                                      \
> +	static ssize_t target##n##_show(                                       \
> +		struct device *dev, struct device_attribute *attr, char *buf)  \
> +	{                                                                      \
> +		return show_targetN(to_cxl_region(dev), buf, n);               \
> +	}                                                                      \
> +	static ssize_t target##n##_store(struct device *dev,                   \
> +					 struct device_attribute *attr,        \
> +					 const char *buf, size_t len)          \
> +	{                                                                      \
> +		return set_targetN(to_cxl_region(dev), buf, n, len);           \
> +	}                                                                      \
> +	static DEVICE_ATTR_RW(target##n)
> +
> +TARGET_ATTR_RW(0);
> +TARGET_ATTR_RW(1);
> +TARGET_ATTR_RW(2);
> +TARGET_ATTR_RW(3);
> +TARGET_ATTR_RW(4);
> +TARGET_ATTR_RW(5);
> +TARGET_ATTR_RW(6);
> +TARGET_ATTR_RW(7);
> +TARGET_ATTR_RW(8);
> +TARGET_ATTR_RW(9);
> +TARGET_ATTR_RW(10);
> +TARGET_ATTR_RW(11);
> +TARGET_ATTR_RW(12);
> +TARGET_ATTR_RW(13);
> +TARGET_ATTR_RW(14);
> +TARGET_ATTR_RW(15);
> +
> +static struct attribute *interleave_attrs[] = {
> +	&dev_attr_target0.attr,
> +	&dev_attr_target1.attr,
> +	&dev_attr_target2.attr,
> +	&dev_attr_target3.attr,
> +	&dev_attr_target4.attr,
> +	&dev_attr_target5.attr,
> +	&dev_attr_target6.attr,
> +	&dev_attr_target7.attr,
> +	&dev_attr_target8.attr,
> +	&dev_attr_target9.attr,
> +	&dev_attr_target10.attr,
> +	&dev_attr_target11.attr,
> +	&dev_attr_target12.attr,
> +	&dev_attr_target13.attr,
> +	&dev_attr_target14.attr,
> +	&dev_attr_target15.attr,
> +	NULL,
> +};
> +
> +static umode_t visible_targets(struct kobject *kobj, struct attribute *a, int n)
> +{
> +	struct device *dev = container_of(kobj, struct device, kobj);
> +	struct cxl_region *region = to_cxl_region(dev);
> +
> +	if (n < region->eniw)
> +		return a->mode;
> +	return 0;
> +}
> +
> +static const struct attribute_group region_interleave_group = {
> +	.attrs = interleave_attrs,
> +	.is_visible = visible_targets,
> +};
> +
> +static const struct attribute_group *region_groups[] = {
> +	&region_group,
> +	&region_interleave_group,
> +	NULL,
> +};
> +
>  static void cxl_region_release(struct device *dev);
>  
>  const struct device_type cxl_region_type = {
>  	.name = "cxl_region",
>  	.release = cxl_region_release,
> +	.groups = region_groups
>  };
>  
>  void cxl_free_region(struct cxl_decoder *cxld, struct cxl_region *region)
>  {
> +	int i;
> +
>  	ida_free(&cxld->region_ida, region->id);
> +	for (i = 0; i < region->eniw; i++) {
> +		if (region->targets[i])
> +			put_device(&region->targets[i]->dev);
> +	}
> +
>  	kfree(region);
>  }
>  


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 4/6] cxl/region: Introduce a cxl_region driver
  2021-06-17 21:13   ` [PATCH v2 " Ben Widawsky
@ 2021-06-18 11:49     ` Jonathan Cameron
  0 siblings, 0 replies; 20+ messages in thread
From: Jonathan Cameron @ 2021-06-18 11:49 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Alison Schofield, Dan Williams, Ira Weiny, Vishal Verma

On Thu, 17 Jun 2021 14:13:29 -0700
Ben Widawsky <ben.widawsky@intel.com> wrote:

> The cxl_region driver is responsible for managing the HDM decoder
> programming in the CXL topology. Once a region is created it must be
> configured and bound to the driver in order to activate it.
> 
> The following is a sample of how such controls might work:
> 
> echo 1 > /sys/bus/cxl/devices/decoder0.0/create_region
> echo $((256<<20)) > /sys/bus/cxl/devices/decoder0.0/region0.0:0/size
> echo mem0 > /sys/bus/cxl/devices/decoder0.0/region0.0:0/target0
> echo region0.0:0 > /sys/bus/cxl/drivers/cxl_region/bind
> 
> In order to handle the eventual rise in failure modes of binding a
> region, a new trace event is created to help track these failures for
> debug and reconfiguration paths in userspace.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
> 
> v2 Is updated to take into account the static region_type.
> 
> ---
>  drivers/cxl/Makefile |   2 +-
>  drivers/cxl/core.c   |  15 +++++-
>  drivers/cxl/cxl.h    |   1 +
>  drivers/cxl/mem.h    |   1 +
>  drivers/cxl/region.c | 109 ++++++++++++++++++++++++++++++++++++++-----
>  drivers/cxl/region.h |  11 +++++
>  drivers/cxl/trace.h  |  33 +++++++++++++
>  7 files changed, 158 insertions(+), 14 deletions(-)
>  create mode 100644 drivers/cxl/trace.h
> 
> diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
> index c3151198c041..f35077c073b8 100644
> --- a/drivers/cxl/Makefile
> +++ b/drivers/cxl/Makefile
> @@ -4,7 +4,7 @@ obj-$(CONFIG_CXL_MEM) += cxl_pci.o cxl_region.o
>  obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
>  obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
>  
> -ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=CXL
> +ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=CXL -I$(src)
>  cxl_core-y := core.o
>  cxl_pci-y := pci.o
>  cxl_acpi-y := acpi.o
> diff --git a/drivers/cxl/core.c b/drivers/cxl/core.c
> index d8d7ca85e110..be65a1a1493e 100644
> --- a/drivers/cxl/core.c
> +++ b/drivers/cxl/core.c
> @@ -1086,6 +1086,8 @@ static int cxl_device_id(struct device *dev)
>  		return CXL_DEVICE_NVDIMM_BRIDGE;
>  	if (dev->type == &cxl_nvdimm_type)
>  		return CXL_DEVICE_NVDIMM;
> +	if (is_cxl_region(dev))
> +		return CXL_DEVICE_REGION;
>  	return 0;
>  }
>  
> @@ -1102,7 +1104,18 @@ static int cxl_bus_match(struct device *dev, struct device_driver *drv)
>  
>  static int cxl_bus_probe(struct device *dev)
>  {
> -	return to_cxl_drv(dev->driver)->probe(dev);
> +	int id = cxl_device_id(dev);
> +
> +	if (id == CXL_DEVICE_REGION) {
> +		struct cxl_region *region = to_cxl_region(dev);
> +
> +		if (cxl_is_region_configured(region))
> +			return to_cxl_drv(dev->driver)->probe(dev);
> +	} else {
> +		return to_cxl_drv(dev->driver)->probe(dev);
> +	}
you could express this as a sanity check followed by common path. 

	if (id == CXL_DEVICE_REGION) {
		struct cxl_region *region = to_cxl_region(dev);

		if (!cxl_is_region_configured(region))
			return -ENODEV;  //perhaps not a great error code for this case.
	}

	return to_cxl_drv(dev->driver)->probe(dev);
>  }

A little simpler, but whether it makes sense depends a bit on whether
we expect to see other flows in this function as more features are added.

>  
>  static int cxl_bus_remove(struct device *dev)
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 8b27a07d7d0f..90107b21125b 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -325,6 +325,7 @@ void cxl_driver_unregister(struct cxl_driver *cxl_drv);
>  
>  #define CXL_DEVICE_NVDIMM_BRIDGE	1
>  #define CXL_DEVICE_NVDIMM		2
> +#define CXL_DEVICE_REGION		3
>  
>  #define MODULE_ALIAS_CXL(type) MODULE_ALIAS("cxl:t" __stringify(type) "*")
>  #define CXL_MODALIAS_FMT "cxl:t%d"
> diff --git a/drivers/cxl/mem.h b/drivers/cxl/mem.h
> index fe12ef3c3dde..ff1f9c57e089 100644
> --- a/drivers/cxl/mem.h
> +++ b/drivers/cxl/mem.h
> @@ -83,4 +83,5 @@ struct cxl_mem {
>  	struct range pmem_range;
>  	struct range ram_range;
>  };
> +

Tidy that up.

>  #endif /* __CXL_MEM_H__ */
> diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
> index 41299bd6da53..74d6f6c06608 100644
> --- a/drivers/cxl/region.c
> +++ b/drivers/cxl/region.c
> @@ -11,6 +11,9 @@
>  #include "cxl.h"
>  #include "mem.h"
>  
> +#define CREATE_TRACE_POINTS
> +#include "trace.h"
> +
>  /**
>   * DOC: cxl region
>   *
> @@ -27,8 +30,24 @@ static struct cxl_region *to_cxl_region(struct device *dev);
>  
>  static bool is_region_active(struct cxl_region *region)
>  {
> -	/* TODO: Regions can't be activated yet. */
> -	return false;
> +	return region->active;
> +}
> +
> +/*
> + * Most sanity checking is left up to region binding. This does the most basic
> + * check to determine whether or not the core should try probing the driver.
> + */
> +bool cxl_is_region_configured(struct cxl_region *region)
> +{
> +	/* zero sized regions aren't a thing. */

Nitpick.  Make these sentences or not.  Either option is fine.

> +	if (region->requested_size <= 0)
> +		return false;
> +
> +	/* all regions have at least 1 target */
> +	if (region->targets[0])
> +		return false;
> +
> +	return true;
>  }
>  
>  static ssize_t offset_show(struct device *dev, struct device_attribute *attr,
> @@ -249,16 +268,6 @@ bool is_cxl_region(struct device *dev)
>  	return dev->type == &cxl_region_type;
>  }
>  
> -static struct cxl_region *to_cxl_region(struct device *dev)
> -{
> -	if (dev_WARN_ONCE(dev, dev->type != &cxl_region_type,
> -			  "not a cxl_region device\n"))
> -		return NULL;
> -
> -	return container_of(dev, struct cxl_region, dev);
> -}
> -
> -
>  void cxl_free_region(struct cxl_decoder *cxld, struct cxl_region *region)
>  {
>  	int i;
> @@ -383,3 +392,79 @@ int cxl_delete_region(struct cxl_decoder *cxld, const char *region_name)
>  
>  	return 0;
>  }
> +
> +static int bind_region(struct cxl_region *region)
> +{
> +	int i;
> +
> +	if (dev_WARN_ONCE(&region->dev, !cxl_is_region_configured(region),
> +			  "unconfigured regions can't be probed (race?)\n")) {

Given this check is in here, why do we need the earlier one before we call
probe?  Easier to just have one path given it's not performance sensitive or
anything.

> +		return -ENXIO;
> +	}
> +
> +	if (region->requested_size % (SZ_256M * region->eniw)) {
> +		trace_cxl_region_bind(region, "Invalid size. Must be multiple of NIW");
> +		return -ENXIO;
> +	}
> +
> +	for (i = 0; i < region->eniw; i++)
> +		if (!region->targets[i]) {
> +			trace_cxl_region_bind(region, "Missing memory device target");
> +			return -ENXIO;
> +		}
> +
> +	/* TODO: Allocate from decoder's address space */
> +
> +	/* TODO: program HDM decoders */
> +
> +	if (uuid_is_null(&region->uuid))
> +		uuid_gen(&region->uuid);
> +
> +	trace_cxl_region_bind(region, "Region binding succeeded.");

nitpick, blank line here nice for readability.

> +	return 0;
> +}
> +
> +static int cxl_region_probe(struct device *dev)
> +{
> +	struct cxl_region *region = to_cxl_region(dev);
> +	int ret;
> +
> +	if (region->active)
> +		return -EBUSY;
> +
> +	device_lock(&region->dev);
> +	ret = bind_region(region);
> +	if (!ret)
> +		region->active = true;
> +	device_unlock(&region->dev);
> +
> +	return ret;
> +}
> +
> +static void cxl_region_remove(struct device *dev)
> +{
> +	/* Remove region from the decoder's address space */
> +}
> +
> +static struct cxl_driver cxl_region_driver = {
> +	.name = "cxl_region",
> +	.probe = cxl_region_probe,
> +	.remove = cxl_region_remove,
> +	.id = CXL_DEVICE_REGION,
> +};
> +
> +static __init int cxl_region_init(void)
> +{
> +	return cxl_driver_register(&cxl_region_driver);
> +}
> +
> +static __exit void cxl_region_exit(void)
> +{
> +	cxl_driver_unregister(&cxl_region_driver);
> +}
> +
> +MODULE_LICENSE("GPL v2");
> +module_init(cxl_region_init);
> +module_exit(cxl_region_exit);
> +MODULE_IMPORT_NS(CXL);
> +MODULE_ALIAS_CXL(CXL_REGION);
> diff --git a/drivers/cxl/region.h b/drivers/cxl/region.h
> index f2c37a6f1192..7f7331d9029b 100644
> --- a/drivers/cxl/region.h
> +++ b/drivers/cxl/region.h
> @@ -14,6 +14,7 @@
>   * @uuid: The UUID for this region.
>   * @list: Node in decoders region list.
>   * @eniw: Number of interleave ways this region is configured for.
> + * @active: If the region has been activated.
>   * @targets: The memory devices comprising the region.
>   */
>  struct cxl_region {
> @@ -24,10 +25,20 @@ struct cxl_region {
>  	uuid_t uuid;
>  	struct list_head list;
>  	int eniw;
> +	bool active;
>  	struct cxl_memdev *targets[];
>  };
>  
>  bool is_cxl_region(struct device *dev);
>  bool cxl_is_region_configured(struct cxl_region *region);
>  
> +static inline struct cxl_region *to_cxl_region(struct device *dev)
> +{
> +	if (dev_WARN_ONCE(dev, !is_cxl_region(dev),
> +			  "not a cxl_region device\n"))
> +		return NULL;
> +
> +	return container_of(dev, struct cxl_region, dev);
> +}
> +
>  #endif
> diff --git a/drivers/cxl/trace.h b/drivers/cxl/trace.h
> new file mode 100644
> index 000000000000..fe9576ec2cde
> --- /dev/null
> +++ b/drivers/cxl/trace.h
> @@ -0,0 +1,33 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM cxl
> +
> +#if !defined (__CXL_TRACE_H__) || defined(TRACE_HEADER_MULTI_READ)
> +#define __CXL_TRACE_H__
> +
> +#include <linux/tracepoint.h>
> +
> +TRACE_EVENT(cxl_region_bind,
> +	TP_PROTO(struct cxl_region *region, char *status),
> +	TP_ARGS(region, status),
> +	TP_STRUCT__entry(
> +		__field(struct cxl_region *, region)
> +		__string(status, status)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->region = region;
> +		__assign_str(status, status);
> +	),
> +
> +	TP_printk("%s failed to bind (%s)", dev_name(&__entry->region->dev), __get_str(status))

Is that safe? A quick grep suggests it's not done elsewhere.
From what I recall TP_printk can be called a lot later than where the trace data got pushed.
More than possible region went away in the meantime.

If you want to print this, put the string in the trace point itself.

> +);
> +
> +#endif /* if !defined (__CXL_TRACE_H__) || defined(TRACE_HEADER_MULTI_READ) */
> +
> +/* This part must be outside protection */
> +#undef TRACE_INCLUDE_PATH
> +#define TRACE_INCLUDE_PATH .
> +#define TRACE_INCLUDE_FILE trace
> +#include <trace/define_trace.h>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 5/6] cxl/core: Convert decoder range to resource
  2021-06-17 17:36 ` [PATCH 5/6] cxl/core: Convert decoder range to resource Ben Widawsky
@ 2021-06-18 11:52   ` Jonathan Cameron
  0 siblings, 0 replies; 20+ messages in thread
From: Jonathan Cameron @ 2021-06-18 11:52 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Alison Schofield, Dan Williams, Ira Weiny, Vishal Verma

On Thu, 17 Jun 2021 10:36:54 -0700
Ben Widawsky <ben.widawsky@intel.com> wrote:

> Regions will use the resource API in order to help manage allocated
> space. As regions are children of the decoder, it makes sense that the
> parent host the main resource to be suballocated by the region.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Seems reasonable.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
>  drivers/cxl/core.c   | 9 +++------
>  drivers/cxl/cxl.h    | 4 ++--
>  drivers/cxl/region.c | 2 +-
>  3 files changed, 6 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/cxl/core.c b/drivers/cxl/core.c
> index 44f982f4f247..0e598a18e95f 100644
> --- a/drivers/cxl/core.c
> +++ b/drivers/cxl/core.c
> @@ -40,7 +40,7 @@ static ssize_t start_show(struct device *dev, struct device_attribute *attr,
>  {
>  	struct cxl_decoder *cxld = to_cxl_decoder(dev);
>  
> -	return sysfs_emit(buf, "%#llx\n", cxld->range.start);
> +	return sysfs_emit(buf, "%#llx\n", cxld->res.start);
>  }
>  static DEVICE_ATTR_RO(start);
>  
> @@ -49,7 +49,7 @@ static ssize_t size_show(struct device *dev, struct device_attribute *attr,
>  {
>  	struct cxl_decoder *cxld = to_cxl_decoder(dev);
>  
> -	return sysfs_emit(buf, "%#llx\n", range_len(&cxld->range));
> +	return sysfs_emit(buf, "%#llx\n", resource_size(&cxld->res));
>  }
>  static DEVICE_ATTR_RO(size);
>  
> @@ -543,10 +543,7 @@ cxl_decoder_alloc(struct cxl_port *port, int nr_targets, resource_size_t base,
>  
>  	*cxld = (struct cxl_decoder) {
>  		.id = rc,
> -		.range = {
> -			.start = base,
> -			.end = base + len - 1,
> -		},
> +		.res = DEFINE_RES_MEM_NAMED(base, len, "cxl-decoder"),
>  		.flags = flags,
>  		.interleave_ways = interleave_ways,
>  		.interleave_granularity = interleave_granularity,
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 90107b21125b..b08f0969abeb 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -185,7 +185,7 @@ enum cxl_decoder_type {
>   * struct cxl_decoder - CXL address range decode configuration
>   * @dev: this decoder's device
>   * @id: kernel device name id
> - * @range: address range considered by this decoder
> + * @res: address space resources considered by this decoder
>   * @interleave_ways: number of cxl_dports in this decode
>   * @interleave_granularity: data stride per dport
>   * @target_type: accelerator vs expander (type2 vs type3) selector
> @@ -198,7 +198,7 @@ enum cxl_decoder_type {
>  struct cxl_decoder {
>  	struct device dev;
>  	int id;
> -	struct range range;
> +	struct resource res;
>  	int interleave_ways;
>  	int interleave_granularity;
>  	enum cxl_decoder_type target_type;
> diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
> index 616b47903d69..62bca8b30349 100644
> --- a/drivers/cxl/region.c
> +++ b/drivers/cxl/region.c
> @@ -59,7 +59,7 @@ static ssize_t offset_show(struct device *dev, struct device_attribute *attr,
>  	if (!region->res)
>  		return sysfs_emit(buf, "\n");
>  
> -	return sysfs_emit(buf, "%#llx\n", cxld->range.start - region->res->start);
> +	return sysfs_emit(buf, "%#llx\n", cxld->res.start - region->res->start);
>  }
>  static DEVICE_ATTR_RO(offset);
>  


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 6/6] cxl/region: Handle region's address space allocation
  2021-06-17 17:36 ` [PATCH 6/6] cxl/region: Handle region's address space allocation Ben Widawsky
@ 2021-06-18 13:35   ` Jonathan Cameron
  0 siblings, 0 replies; 20+ messages in thread
From: Jonathan Cameron @ 2021-06-18 13:35 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Alison Schofield, Dan Williams, Ira Weiny, Vishal Verma

On Thu, 17 Jun 2021 10:36:55 -0700
Ben Widawsky <ben.widawsky@intel.com> wrote:

> Regions are carved out of an addresses space which is claimed by top
> level decoders, and subsequently their children decoders. Regions are
> created with a size and therefore must fit, with proper alignment, in
> that address space. The support for doing this fitting is handled by the
> driver automatically.
> 
> As an example, a platform might configure a top level decoder to claim
> 1TB of address space @ 0x800000000 -> 0x10800000000; it would be
> possible to create M regions with appropriate alignment to occupy that
> address space. Each of those regions would have a host physical address
> somewhere in the range between 32G and 1.3TB, and the location will be
> determined by the logic added here.
> 
> The request_region() usage is not strictly mandatory at this point as
> the actual handling of the address space is done with genpools. It is
> highly likely however that the resource/region APIs will become useful
> in the not too distant future.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

This feels like something that will be much more obvious once the rest
of the missing bits you mention in the cover letter are in place.

Until then it looks fine to me, but I'm not sure I understand how it is
going to get used.  Guess I'll wait and see ;)

Jonathan

> ---
>  drivers/cxl/core.c   | 13 +++++++++++++
>  drivers/cxl/cxl.h    |  2 ++
>  drivers/cxl/region.c | 31 +++++++++++++++++++++++++++++--
>  3 files changed, 44 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cxl/core.c b/drivers/cxl/core.c
> index 0e598a18e95f..b57ee627bc6a 100644
> --- a/drivers/cxl/core.c
> +++ b/drivers/cxl/core.c
> @@ -1,6 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0-only
>  /* Copyright(c) 2020 Intel Corporation. All rights reserved. */
>  #include <linux/io-64-nonatomic-lo-hi.h>
> +#include <linux/genalloc.h>
>  #include <linux/device.h>
>  #include <linux/module.h>
>  #include <linux/pci.h>
> @@ -607,6 +608,18 @@ devm_cxl_add_decoder(struct device *host, struct cxl_port *port, int nr_targets,
>  	rc = devm_add_action_or_reset(host, unregister_dev, dev);
>  	if (rc)
>  		return ERR_PTR(rc);
> +
> +	cxld->address_space =
> +		devm_gen_pool_create(dev, ilog2(SZ_256M * cxld->interleave_ways),
> +				     NUMA_NO_NODE, dev_name(dev));
> +	if (IS_ERR(cxld->address_space))
> +		return ERR_CAST(cxld->address_space);
> +
> +	rc = gen_pool_add(cxld->address_space, cxld->res.start,
> +			  resource_size(&cxld->res), NUMA_NO_NODE);
> +	if (rc)
> +		return ERR_PTR(rc);
> +
>  	return cxld;
>  
>  err:
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index b08f0969abeb..b5b728155d86 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -193,6 +193,7 @@ enum cxl_decoder_type {
>   * @region_ida: allocator for region ids.
>   * @regions: List of regions mapped (may be disabled) by this decoder.
>   * @youngest: Last region created for this decoder.
> + * @address_space: Used/free address space for regions.
>   * @target: active ordered target list in current decoder configuration
>   */
>  struct cxl_decoder {
> @@ -206,6 +207,7 @@ struct cxl_decoder {
>  	struct ida region_ida;
>  	struct list_head regions;
>  	struct cxl_region *youngest;
> +	struct gen_pool *address_space;
>  	struct cxl_dport *target[];
>  };
>  
> diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
> index 62bca8b30349..0554f3fff7ac 100644
> --- a/drivers/cxl/region.c
> +++ b/drivers/cxl/region.c
> @@ -1,6 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0-only
>  /* Copyright(c) 2021 Intel Corporation. All rights reserved. */
>  #include <linux/io-64-nonatomic-lo-hi.h>
> +#include <linux/genalloc.h>
>  #include <linux/device.h>
>  #include <linux/module.h>
>  #include <linux/sizes.h>
> @@ -388,9 +389,33 @@ int cxl_delete_region(struct cxl_decoder *cxld, const char *region_name)
>  	return 0;
>  }
>  
> +static int allocate_region_addr(struct cxl_region *region)
> +{
> +	struct cxl_decoder *cxld = to_cxl_decoder(region->dev.parent);
> +	unsigned long start;
> +
> +	start = gen_pool_alloc(cxld->address_space, region->requested_size);
> +	if (!start) {
> +		trace_cxl_region_bind(region, "Couldn't allocate address space");
> +		return -ENOMEM;
> +	}
> +
> +	region->res =
> +		__request_region(&cxld->res, start, region->requested_size,
> +				 dev_name(&region->dev), IORESOURCE_EXCLUSIVE);
> +	if (IS_ERR(region->res)) {
> +		trace_cxl_region_bind(region, "Couldn't obtain region");
> +		gen_pool_free(cxld->address_space, start,
> +			      region->requested_size);
> +		return PTR_ERR(region->res);
> +	}
> +
> +	return 0;
> +}
> +
>  static int bind_region(struct cxl_region *region)
>  {
> -	int i;
> +	int i, rc;
>  
>  	if (dev_WARN_ONCE(&region->dev, !cxl_is_region_configured(region),
>  			  "unconfigured regions can't be probed (race?)\n")) {
> @@ -408,7 +433,9 @@ static int bind_region(struct cxl_region *region)
>  			return -ENXIO;
>  		}
>  
> -	/* TODO: Allocate from decoder's address space */
> +	rc = allocate_region_addr(region);
> +	if (rc)
> +		return rc;
>  
>  	/* TODO: program HDM decoders */
>  


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/6] cxl: Move cxl_memdev conversion helper to mem.h
  2021-06-17 17:36 ` [PATCH 2/6] cxl: Move cxl_memdev conversion helper to mem.h Ben Widawsky
  2021-06-18  9:13   ` Jonathan Cameron
@ 2021-06-18 15:00   ` Dan Williams
  1 sibling, 0 replies; 20+ messages in thread
From: Dan Williams @ 2021-06-18 15:00 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Alison Schofield, Ira Weiny, Jonathan Cameron, Vishal Verma

On Thu, Jun 17, 2021 at 10:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>

If you find yourself not writing a changelog it usually means the
patch has no reason to stand alone...

> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
>  drivers/cxl/mem.h | 5 +++++
>  drivers/cxl/pci.c | 5 -----
>  2 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/cxl/mem.h b/drivers/cxl/mem.h
> index 8f02d02b26b4..fe12ef3c3dde 100644
> --- a/drivers/cxl/mem.h
> +++ b/drivers/cxl/mem.h
> @@ -48,6 +48,11 @@ struct cxl_memdev {
>         int id;
>  };
>
> +static inline struct cxl_memdev *to_cxl_memdev(struct device *dev)
> +{
> +       return container_of(dev, struct cxl_memdev, dev);
> +}
> +
>  /**
>   * struct cxl_mem - A CXL memory device
>   * @pdev: The PCI device associated with this CXL device.
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index f8408e5f0754..52508282ec37 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -1161,11 +1161,6 @@ static int cxl_mem_setup_regs(struct cxl_mem *cxlm)
>         return ret;
>  }
>
> -static struct cxl_memdev *to_cxl_memdev(struct device *dev)
> -{
> -       return container_of(dev, struct cxl_memdev, dev);
> -}
> -
>  static void cxl_memdev_release(struct device *dev)
>  {
>         struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> --
> 2.32.0
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/6] cxl/region: Add region creation ABI
  2021-06-18  9:13   ` [PATCH " Jonathan Cameron
@ 2021-06-18 15:07     ` Ben Widawsky
  2021-06-18 16:39       ` Dan Williams
  0 siblings, 1 reply; 20+ messages in thread
From: Ben Widawsky @ 2021-06-18 15:07 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, Alison Schofield, Dan Williams, Ira Weiny, Vishal Verma

On 21-06-18 10:13:11, Jonathan Cameron wrote:
> On Thu, 17 Jun 2021 10:36:50 -0700
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > Regions are created as a child of the decoder that encompasses an
> > address space with constraints. Regions only exist for persistent
> > capacities.
> > 
> > When regions are created, the number of desired interleave ways must be
> > known. To enable this, the sysfs attribute will take the desired ways as
> > input. This interface intentionally allows creation of
> > impossible-to-enable regions based on interleave constraints in the
> > topology. The reasoning is to create new regions through the kernel
> > interfaces which may become possible on reboot under a variety of
> > circumstances.
> > 
> > As an example, creating a x1 region with:
> > echo 1 > /sys/bus/cxl/devices/decoder1.0/create_region
> > 
> > Will yield /sys/bus/cxl/devices/decoder1.0/region1.0:0
> > 
> > That region may then be deleted with:
> > echo region1.0:0 > /sys/bus/cxl/devices/decoder1.0/delete_region
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> 
> Hi Ben,
> 
> Some comments inline. The sysfs interface is getting a little
> clever for my liking....

Thanks for the feedback.

> 
> > ---
> >  Documentation/ABI/testing/sysfs-bus-cxl       |  21 +++
> >  .../driver-api/cxl/memory-devices.rst         |  11 ++
> >  drivers/cxl/Makefile                          |   3 +-
> >  drivers/cxl/core.c                            |  71 +++++++++
> >  drivers/cxl/cxl.h                             |  11 ++
> >  drivers/cxl/region.c                          | 147 ++++++++++++++++++
> >  drivers/cxl/region.h                          |  43 +++++
> >  7 files changed, 306 insertions(+), 1 deletion(-)
> >  create mode 100644 drivers/cxl/region.c
> >  create mode 100644 drivers/cxl/region.h
> > 
> > diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
> > index 0b6a2e6e8fbb..115a25d2899d 100644
> > --- a/Documentation/ABI/testing/sysfs-bus-cxl
> > +++ b/Documentation/ABI/testing/sysfs-bus-cxl
> > @@ -127,3 +127,24 @@ Description:
> >  		memory (type-3). The 'target_type' attribute indicates the
> >  		current setting which may dynamically change based on what
> >  		memory regions are activated in this decode hierarchy.
> > +
> > +What:		/sys/bus/cxl/devices/decoderX.Y/create_region
> > +Date:		June, 2021
> > +KernelVersion:	v5.14
> > +Contact:	linux-cxl@vger.kernel.org
> > +Description:
> > +		Creates a new CXL region of N interleaved ways. Writing a value
> > +		of '2' will create a new uninitialized region with 2x interleave
> > +		that will be mapped by the CXL decoderX.Y. Reading from this
> > +		node will return the last created region. Regions must be
> > +		subsequently configured and bound to a region driver before they
> > +		can be used.
> 
> I don't like sysfs attributes that return entirely different looking (and seemingly
> unrelated) values from what you write to them.
> I could sort of get behind writing 1 and returning the region (as in the RFC)
> on basis that felt like 'create' and 'what did I create'.
> Doing '2' and getting back anything other than '2' feels wrong
> 
> Maybe, would feel better if the name made it more explicit that the thing took
> interleave values?
> 
> create_region_with_interleave perhaps?

I could argue both ways. I agree the asymmetry isn't ideal. The expectation is
that userspace tooling is going to handle most of this anyway, and it's "well
documented" in Documentation/.../ABI. So I think if the interface has some
warts, it's not a huge deal, but perhaps designing it that way isn't ideal.
More below...

> 
> Not ideal.  Sysfs always a bit clunky for this stuff and configfs would mean
> a split interface which is never ideal.

Overall, the programming model follows NVDIMM. With my DRM background, my
default would have been IOCTLs - those have a long legacy. I've never touched
configfs. I'm not advocating one way or another. Maybe it would be good to
discuss pros/cons to different options?

> 
> It's a bit horrible, but maybe a more intuitive interface would be to make the
> targetX visibility dynamic and hence make interleave an attribute of the region?
> 
> sysfs_update_group() is rarely used, but I think the intent is to support
> this sort of case.
> 
> I've not fully thought through the effects of that though so might
> well be missing something.

I really would have liked something dynamic like you described. I searched for
other drivers that this without much luck. I'm relatively unfamiliar with a lot
of the device model machinery in Linux. Dan, are you familiar with this, is it
feasible? If it sounds reasonable I can spend some more time figuring out how to
do it.

> 
> Also, add to the docs what can be expected if there is no 'last created'
> or it's been removed again.
> 
> > +
> > +What:		/sys/bus/cxl/devices/decoderX.Y/delete_region
> > +Date:		June, 2021
> > +KernelVersion:	v5.14
> > +Contact:	linux-cxl@vger.kernel.org
> > +Description:
> > +		Deletes the named region. A region must be unbound from the
> > +		region driver before being deleted. The attributes expects a
> > +		region in the form "regionX.Y:Z".
> > diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst
> > index 487ce4f41d77..c7f59a8c94db 100644
> > --- a/Documentation/driver-api/cxl/memory-devices.rst
> > +++ b/Documentation/driver-api/cxl/memory-devices.rst
> > @@ -39,6 +39,17 @@ CXL Core
> >  .. kernel-doc:: drivers/cxl/core.c
> >     :doc: cxl core
> >  
> > +CXL Regions
> > +-----------
> > +.. kernel-doc:: drivers/cxl/region.c
> > +   :doc: cxl region
> > +
> > +.. kernel-doc:: drivers/cxl/region.h
> > +   :identifiers:
> > +
> > +.. kernel-doc:: drivers/cxl/region.c
> > +   :identifiers:
> > +
> >  External Interfaces
> >  ===================
> >  
> > diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
> > index 32954059b37b..c3151198c041 100644
> > --- a/drivers/cxl/Makefile
> > +++ b/drivers/cxl/Makefile
> > @@ -1,6 +1,6 @@
> >  # SPDX-License-Identifier: GPL-2.0
> >  obj-$(CONFIG_CXL_BUS) += cxl_core.o
> > -obj-$(CONFIG_CXL_MEM) += cxl_pci.o
> > +obj-$(CONFIG_CXL_MEM) += cxl_pci.o cxl_region.o
> >  obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
> >  obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
> >  
> > @@ -9,3 +9,4 @@ cxl_core-y := core.o
> >  cxl_pci-y := pci.o
> >  cxl_acpi-y := acpi.o
> >  cxl_pmem-y := pmem.o
> > +cxl_region-y := region.o
> > diff --git a/drivers/cxl/core.c b/drivers/cxl/core.c
> > index a2e4d54fc7bc..d8d7ca85e110 100644
> > --- a/drivers/cxl/core.c
> > +++ b/drivers/cxl/core.c
> > @@ -6,6 +6,7 @@
> >  #include <linux/pci.h>
> >  #include <linux/slab.h>
> >  #include <linux/idr.h>
> > +#include "region.h"
> >  #include "cxl.h"
> >  #include "mem.h"
> >  
> > @@ -120,7 +121,68 @@ static ssize_t target_list_show(struct device *dev,
> >  }
> >  static DEVICE_ATTR_RO(target_list);
> >  
> > +static ssize_t create_region_show(struct device *dev,
> > +				  struct device_attribute *attr, char *buf)
> > +{
> > +	struct cxl_decoder *cxld = to_cxl_decoder(dev);
> > +	int rc;
> > +
> > +	device_lock(dev);
> > +	rc = sprintf(buf, "%s\n",
> > +		     cxld->youngest ? dev_name(&cxld->youngest->dev) : "");
> 
> An alternative is to be cynical and adjust the interface a touch.
> Lets say it always returns the name of the last created region.  If that's
> true, just copy the name at creation time.  Then no need to care if the
> pointer is live or not.
> 

Sounds fine to me.

> > +	device_unlock(dev);
> > +
> > +	return rc;
> > +}
> > +
> > +static ssize_t create_region_store(struct device *dev,
> > +				   struct device_attribute *attr,
> > +				   const char *buf, size_t len)
> > +{
> > +	struct cxl_decoder *cxld = to_cxl_decoder(dev);
> > +	struct cxl_region *region;
> > +	ssize_t rc;
> > +	int val;
> > +
> > +	rc = kstrtoint(buf, 0, &val);
> > +	if (rc)
> > +		return rc;
> > +	if (val < 0 || val > 16)
> > +		return -EINVAL;
> > +
> > +	region = cxl_alloc_region(cxld, val);
> > +	if (IS_ERR(region))
> > +		return PTR_ERR(region);
> > +
> > +	rc = cxl_add_region(cxld, region);
> > +	if (rc) {
> > +		cxl_free_region(cxld, region);
> > +		return rc;
> > +	}
> > +
> > +	cxld->youngest = region;
> > +	return len;
> > +}
> > +static DEVICE_ATTR_RW(create_region);
> > +
> > +static ssize_t delete_region_store(struct device *dev,
> > +				   struct device_attribute *attr,
> > +				   const char *buf, size_t len)
> > +{
> > +	struct cxl_decoder *cxld = to_cxl_decoder(dev);
> > +	int rc;
> > +
> > +	rc = cxl_delete_region(cxld, buf);
> > +	if (rc)
> > +		return rc;
> > +
> > +	return len;
> > +}
> > +static DEVICE_ATTR_WO(delete_region);
> > +
> >  static struct attribute *cxl_decoder_base_attrs[] = {
> > +	&dev_attr_create_region.attr,
> > +	&dev_attr_delete_region.attr,
> >  	&dev_attr_start.attr,
> >  	&dev_attr_size.attr,
> >  	&dev_attr_locked.attr,
> > @@ -171,7 +233,13 @@ static void cxl_decoder_release(struct device *dev)
> >  {
> >  	struct cxl_decoder *cxld = to_cxl_decoder(dev);
> >  	struct cxl_port *port = to_cxl_port(dev->parent);
> > +	struct cxl_region *region;
> >  
> > +	list_for_each_entry(region, &cxld->regions, list)
> > +		cxl_delete_region(cxld, dev_name(&region->dev));
> > +
> > +	dev_WARN_ONCE(dev, !ida_is_empty(&cxld->region_ida),
> > +		      "Lost track of a region");
> >  	ida_free(&port->decoder_ida, cxld->id);
> >  	kfree(cxld);
> >  }
> > @@ -483,8 +551,11 @@ cxl_decoder_alloc(struct cxl_port *port, int nr_targets, resource_size_t base,
> >  		.interleave_ways = interleave_ways,
> >  		.interleave_granularity = interleave_granularity,
> >  		.target_type = type,
> > +		.regions = LIST_HEAD_INIT(cxld->regions),
> >  	};
> >  
> > +	ida_init(&cxld->region_ida);
> > +
> >  	/* handle implied target_list */
> >  	if (interleave_ways == 1)
> >  		cxld->target[0] =
> > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > index b6bda39a59e3..8b27a07d7d0f 100644
> > --- a/drivers/cxl/cxl.h
> > +++ b/drivers/cxl/cxl.h
> > @@ -190,6 +190,9 @@ enum cxl_decoder_type {
> >   * @interleave_granularity: data stride per dport
> >   * @target_type: accelerator vs expander (type2 vs type3) selector
> >   * @flags: memory type capabilities and locking
> > + * @region_ida: allocator for region ids.
> > + * @regions: List of regions mapped (may be disabled) by this decoder.
> > + * @youngest: Last region created for this decoder.
> >   * @target: active ordered target list in current decoder configuration
> >   */
> >  struct cxl_decoder {
> > @@ -200,6 +203,9 @@ struct cxl_decoder {
> >  	int interleave_granularity;
> >  	enum cxl_decoder_type target_type;
> >  	unsigned long flags;
> > +	struct ida region_ida;
> > +	struct list_head regions;
> > +	struct cxl_region *youngest;
> >  	struct cxl_dport *target[];
> >  };
> >  
> > @@ -262,6 +268,11 @@ struct cxl_dport {
> >  	struct list_head list;
> >  };
> >  
> > +struct cxl_region *cxl_alloc_region(struct cxl_decoder *cxld,
> > +				    int interleave_ways);
> > +void cxl_free_region(struct cxl_decoder *cxld, struct cxl_region *region);
> > +int cxl_add_region(struct cxl_decoder *cxld, struct cxl_region *region);
> > +int cxl_delete_region(struct cxl_decoder *cxld, const char *region);
> >  struct cxl_port *to_cxl_port(struct device *dev);
> >  struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport,
> >  				   resource_size_t component_reg_phys,
> > diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
> > new file mode 100644
> > index 000000000000..391467e864a2
> > --- /dev/null
> > +++ b/drivers/cxl/region.c
> > @@ -0,0 +1,147 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
> > +#include <linux/io-64-nonatomic-lo-hi.h>
> > +#include <linux/device.h>
> > +#include <linux/module.h>
> > +#include <linux/slab.h>
> > +#include <linux/idr.h>
> > +#include "region.h"
> > +#include "cxl.h"
> > +#include "mem.h"
> > +
> > +/**
> > + * DOC: cxl region
> > + *
> > + * A CXL region encompasses a chunk of host physical address space that may be
> > + * consumed by a single device (x1 interleave aka linear) or across multiple
> > + * devices (xN interleaved). A region is a child device of a &struct
> > + * cxl_decoder. There may be multiple active regions under a single &struct
> > + * cxl_decoder. The common case for multiple regions would be several linear,
> > + * contiguous regions under a single decoder. Generally, there will be a 1:1
> > + * relationship between decoder and region when the region is interleaved.
> > + */
> > +
> > +static void cxl_region_release(struct device *dev);
> > +
> > +const struct device_type cxl_region_type = {
> > +	.name = "cxl_region",
> > +	.release = cxl_region_release,
> > +};
> > +
> > +void cxl_free_region(struct cxl_decoder *cxld, struct cxl_region *region)
> > +{
> > +	ida_free(&cxld->region_ida, region->id);
> > +	kfree(region);
> > +}
> > +
> > +static void cxl_region_release(struct device *dev)
> > +{
> > +	struct cxl_decoder *cxld = to_cxl_decoder(dev->parent);
> > +	struct cxl_region *region = to_cxl_region(dev);
> > +
> > +	cxl_free_region(cxld, region);
> > +}
> > +
> > +struct cxl_region *cxl_alloc_region(struct cxl_decoder *cxld,
> > +				    int interleave_ways)
> > +{
> > +	struct cxl_region *region;
> > +	int rc;
> > +
> > +	region = kzalloc(struct_size(region, targets, interleave_ways),
> > +			 GFP_KERNEL);
> > +	if (!region)
> > +		return ERR_PTR(-ENOMEM);
> > +
> > +	region->eniw = interleave_ways;
> > +
> > +	rc = ida_alloc(&cxld->region_ida, GFP_KERNEL);
> > +	if (rc < 0) {
> > +		dev_err(&cxld->dev, "Couldn't get a new id\n");
> > +		kfree(region);
> > +		return ERR_PTR(rc);
> > +	}
> > +	region->id = rc;
> > +
> > +	return region;
> > +}
> > +
> > +/**
> > + * cxl_add_region - Adds a region to a decoder
> > + * @cxld: Parent decoder.
> > + * @region: Region to be added to the decoder.
> > + *
> > + * This is the second step of region initialization. Regions exist within an
> > + * address space which is mapped by a @cxld, and that @cxld enforces constraints
> > + * upon the region as it is configured. Regions may be added to a @cxld but not
> > + * activated and therefore it is possible to have more regions in a @cxld than
> > + * there are interleave ways in the @cxld. Regions exist only for persistent
> > + * capacities.
> > + *
> > + * Return: zero if the region was added to the @cxld, else returns negative
> > + * error code.
> > + */
> > +int cxl_add_region(struct cxl_decoder *cxld, struct cxl_region *region)
> > +{
> > +	struct cxl_port *port = to_cxl_port(cxld->dev.parent);
> > +	struct device *dev = &region->dev;
> > +	int rc;
> > +
> > +	device_initialize(dev);
> > +	dev->parent = &cxld->dev;
> > +	device_set_pm_not_required(dev);
> > +	dev->bus = &cxl_bus_type;
> > +	dev->type = &cxl_region_type;
> > +	rc = dev_set_name(dev, "region%d.%d:%d", port->id, cxld->id, region->id);
> > +	if (rc)
> > +		goto err;
> > +
> > +	rc = device_add(dev);
> > +	if (rc)
> > +		goto err;
> > +
> > +	dev_dbg(dev, "Added %s to %s\n", dev_name(dev), dev_name(&cxld->dev));
> > +
> > +	return 0;
> > +
> > +err:
> > +	put_device(dev);
> > +	return rc;
> > +}
> > +
> > +static struct cxl_region *
> > +cxl_find_region_by_name(struct cxl_decoder *cxld, const char *name)
> > +{
> > +	struct device *region_dev;
> > +
> > +	region_dev = device_find_child_by_name(&cxld->dev, name);
> > +	if (!region_dev)
> > +		return ERR_PTR(-ENOENT);
> > +
> > +	return to_cxl_region(region_dev);
> > +}
> > +
> > +int cxl_delete_region(struct cxl_decoder *cxld, const char *region_name)
> > +{
> > +	struct cxl_region *region;
> > +
> > +	device_lock(&cxld->dev);
> > +
> > +	region = cxl_find_region_by_name(cxld, region_name);
> > +	if (IS_ERR(region)) {
> > +		device_unlock(&cxld->dev);
> > +		return PTR_ERR(region);
> > +	}
> > +
> > +	dev_dbg(&cxld->dev, "Requested removal of %s from %s\n",
> > +		dev_name(&region->dev), dev_name(&cxld->dev));
> > +
> > +	cmpxchg(&cxld->youngest, region, NULL);
> 
> Why does this need to be atomic?  I think the other side of anything
> that might make use of this isn't atomic, except because you are holding
> the device_lock for cxld->dev

I attempted a version that was lockless before this (failed) and this remained.
It doesn't need to be atomic.

> 
> > +
> > +	device_unregister(&region->dev);
> > +	device_unlock(&cxld->dev);
> > +
> > +	put_device(&region->dev);
> > +
> > +	return 0;
> > +}
> > diff --git a/drivers/cxl/region.h b/drivers/cxl/region.h
> > new file mode 100644
> > index 000000000000..7a87d229e38a
> > --- /dev/null
> > +++ b/drivers/cxl/region.h
> > @@ -0,0 +1,43 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/* Copyright(c) 2021 Intel Corporation. */
> > +#ifndef __CXL_REGION_H__
> > +#define __CXL_REGION_H__
> > +
> > +#include <linux/uuid.h>
> > +
> > +extern const struct device_type cxl_region_type;
> > +
> > +/**
> > + * struct cxl_region - CXL region
> > + * @dev: This region's device.
> > + * @id: This regions id. Id is globally unique across all regions.
> > + * @res: Address space consumed by this region.
> > + * @requested_size: Size of the region determined from LSA or userspace.
> > + * @uuid: The UUID for this region.
> > + * @list: Node in decoders region list.
> > + * @eniw: Number of interleave ways this region is configured for.
> > + * @targets: The memory devices comprising the region.
> > + */
> > +struct cxl_region {
> > +	struct device dev;
> > +	int id;
> > +	struct resource *res;
> > +	u64 requested_size;
> > +	uuid_t uuid;
> > +	struct list_head list;
> > +	int eniw;
> > +	struct cxl_memdev *targets[];
> > +};
> > +
> > +static inline struct cxl_region *to_cxl_region(struct device *dev)
> > +{
> > +	if (dev_WARN_ONCE(dev, dev->type != &cxl_region_type,
> > +			  "not a cxl_region device\n"))
> > +		return NULL;
> > +
> > +	return container_of(dev, struct cxl_region, dev);
> > +}
> > +
> > +bool cxl_is_region_configured(struct cxl_region *region);
> > +
> > +#endif
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/6] cxl/region: Introduce concept of region configuration
  2021-06-18 11:22   ` Jonathan Cameron
@ 2021-06-18 15:25     ` Ben Widawsky
  2021-06-18 15:44       ` Jonathan Cameron
  0 siblings, 1 reply; 20+ messages in thread
From: Ben Widawsky @ 2021-06-18 15:25 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, Alison Schofield, Dan Williams, Ira Weiny, Vishal Verma

On 21-06-18 12:22:37, Jonathan Cameron wrote:
> On Thu, 17 Jun 2021 10:36:52 -0700
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > The region creation APIs leave a region unconfigured. Configuring the
> > region will work in the same way as similar subsystems such as devdax.
> > Sysfs attrs will be provided to allow userspace to configure the region.
> > Finally once all configuration is complete, userspace may "commit" the
> > config. What the kernel decides to do after a config is committed is out
> > of scope at this point.
> > 
> > Introduced here are the most basic attributes needed to configure a
> > region.
> > 
> > A x1 interleave example is provided below:
> > 
> > decoder1.0
> > ├── create_region
> > ├── delete_region
> > ├── devtype
> > ├── locked
> > ├── region1.0:0
> > │   ├── offset
> > │   ├── size
> > │   ├── subsystem -> ../../../../../../../bus/cxl
> > │   ├── target0
> > │   ├── uevent
> > │   ├── uuid
> > │   └── verify
> > ├── size
> > ├── start
> > ├── subsystem -> ../../../../../../bus/cxl
> > ├── target_list
> > ├── target_type
> > └── uevent
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> 
> Hi Ben,
> 
> I think some of the sysfs interface needs more detailed docs
> and consideration about consistency.
> 
> If an interface isn't fairly obvious without reading the docs
> (but understanding what it is trying to do) then it's normally
> a bad sign...
> 
> Jonathan
> 

My measuring stick has been: the interface should be fairly obvious if you're
fairly familiar with the spec. I think we've satisfied that here, but often
times the creator of something can be blind to effects of the creation.

I can certainly add more descriptive text to the ABI doc, but if you have
suggestions on how to make the sysfs attrs more descriptive, that'd be great.

> > ---
> >  Documentation/ABI/testing/sysfs-bus-cxl |  32 ++++
> >  drivers/cxl/region.c                    | 223 ++++++++++++++++++++++++
> >  2 files changed, 255 insertions(+)
> > 
> > diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
> > index 115a25d2899d..70f9d09385a4 100644
> > --- a/Documentation/ABI/testing/sysfs-bus-cxl
> > +++ b/Documentation/ABI/testing/sysfs-bus-cxl
> > @@ -148,3 +148,35 @@ Description:
> >  		Deletes the named region. A region must be unbound from the
> >  		region driver before being deleted. The attributes expects a
> >  		region in the form "regionX.Y:Z".
> > +
> > +What:		/sys/bus/cxl/devices/decoderX.Y/regionX.Y:Z/offset
> > +Date:		June, 2021
> > +KernelVersion:	v5.14
> > +Contact:	linux-cxl@vger.kernel.org
> > +Description:
> > +		(RO) A region resides within an address space that is claimed by
> > +		a decoder. Region space allocation is handled by the driver, but
> > +		the offset may be read by userspace tooling in order to
> > +		determine fragmentation, and available size for new regions.
> 
> Side note rather than abotu this patch.  Feels to me like the format definition
> of these docs should include something on permissions.  The (RO) bit is fine
> but makes the docs less machine readable.  

Any examples of something you consider ideal?

> 
> > +
> > +What:
> > +/sys/bus/cxl/devices/decoderX.Y/regionX.Y:Z/{size,uuid,target[0-15]}
> > +Date:		June, 2021
> > +KernelVersion:	v5.14
> > +Contact:	linux-cxl@vger.kernel.org
> > +Description:
> > +		(RW) Configuring regions requires a minimal set of parameters in
> > +		order for the subsequent bind operation to succeed. The
> > +		following parameters are defined:
> > +
> > +		==	========================================================
> > +		size	Manadatory. Phsyical address space the region will
> 
> Spell check.
> 
> > +			consume.
> > +		uuid	Optional. A unique identifier for the region. If none is
> > +			selected, the kernel will create one.
> > +		target  Mandatory. Memory devices are the backing storage for a
> > +			region. There will be N targets based on the number of
> > +			interleave ways that the top level decoder is configured
> > +			for. Each target must be set with a memdev device ie.
> > +			'mem1'.
> 
> Document what happens if not set.  'nil' is not obvious.
> 
> > +		==	========================================================
> 
> The automation looking at these files is increasing, so I wonder if that will
> cause some issues with having these all documented in one block.
> 
> I don't see a huge disadvantage in breaking this into 3 entries.  Alternatively
> wait for someone to scream and refactor it then.

Okay.

> 
> > diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
> > index 391467e864a2..cf7fd3027419 100644
> > --- a/drivers/cxl/region.c
> > +++ b/drivers/cxl/region.c
> > @@ -3,7 +3,9 @@
> >  #include <linux/io-64-nonatomic-lo-hi.h>
> >  #include <linux/device.h>
> >  #include <linux/module.h>
> > +#include <linux/sizes.h>
> >  #include <linux/slab.h>
> > +#include <linux/uuid.h>
> >  #include <linux/idr.h>
> >  #include "region.h"
> >  #include "cxl.h"
> > @@ -21,16 +23,237 @@
> >   * relationship between decoder and region when the region is interleaved.
> >   */
> >  
> > +static struct cxl_region *to_cxl_region(struct device *dev);
> > +
> > +static bool is_region_active(struct cxl_region *region)
> > +{
> > +	/* TODO: Regions can't be activated yet. */
> > +	return false;
> > +}
> > +
> > +static ssize_t offset_show(struct device *dev, struct device_attribute *attr,
> > +			   char *buf)
> > +{
> > +	struct cxl_region *region = to_cxl_region(dev);
> > +	struct cxl_decoder *cxld = to_cxl_decoder(dev->parent);
> > +
> > +	if (!region->res)
> > +		return sysfs_emit(buf, "\n");
> > +
> > +	return sysfs_emit(buf, "%#llx\n", cxld->range.start - region->res->start);
> > +}
> > +static DEVICE_ATTR_RO(offset);
> > +
> > +static ssize_t size_show(struct device *dev, struct device_attribute *attr,
> > +			 char *buf)
> > +{
> > +	struct cxl_region *region = to_cxl_region(dev);
> > +
> > +	if (!region->res)
> > +		return sysfs_emit(buf, "*%llu\n", region->requested_size);
> > +
> > +	return sysfs_emit(buf, "%llu\n", resource_size(region->res));
> > +}
> > +
> > +static ssize_t size_store(struct device *dev, struct device_attribute *attr,
> > +			  const char *buf, size_t len)
> > +{
> > +	struct cxl_region *region = to_cxl_region(dev);
> > +	unsigned long long val;
> > +	ssize_t rc;
> > +
> > +	rc = kstrtoull(buf, 0, &val);
> > +	if (rc)
> > +		return rc;
> > +
> > +	device_lock(&region->dev);
> > +	if (is_region_active(region)) {
> > +		rc = -EBUSY;
> > +	} else {
> > +		region->requested_size = val;
> > +	}
> > +	device_unlock(&region->dev);
> > +
> > +	return len;
> > +}
> > +static DEVICE_ATTR_RW(size);
> > +
> > +static ssize_t uuid_show(struct device *dev, struct device_attribute *attr,
> > +			 char *buf)
> > +{
> > +	struct cxl_region *region = to_cxl_region(dev);
> > +
> > +	return sysfs_emit(buf, "%pUb\n", &region->uuid);
> > +}
> > +static ssize_t uuid_store(struct device *dev, struct device_attribute *attr,
> > +			  const char *buf, size_t len)
> > +{
> > +	struct cxl_region *region = to_cxl_region(dev);
> > +	ssize_t rc;
> > +
> > +	if (len != UUID_STRING_LEN + 1)
> > +		return -EINVAL;
> > +
> > +	device_lock(&region->dev);
> > +	if (is_region_active(region)) {
> > +		rc = -EBUSY;
> > +	} else {
> > +		rc = uuid_parse(buf, &region->uuid);
> > +	}
> > +	device_unlock(&region->dev);
> > +
> > +	return rc ? rc : len;
> > +}
> > +static DEVICE_ATTR_RW(uuid);
> > +
> > +static struct attribute *region_attrs[] = {
> > +	&dev_attr_offset.attr,
> > +	&dev_attr_size.attr,
> > +	&dev_attr_uuid.attr,
> > +	NULL,
> > +};
> > +
> > +static const struct attribute_group region_group = {
> > +	.attrs = region_attrs,
> > +};
> > +
> > +static size_t show_targetN(struct cxl_region *region, char *buf, int n)
> > +{
> > +	int ret;
> > +
> > +	device_lock(&region->dev);
> > +	if (!region->targets[n])
> > +		ret = sysfs_emit(buf, "%s\n", dev_name(&region->targets[n]->dev));
> > +	else
> > +		ret = sysfs_emit(buf, "nil\n");
> > +	device_unlock(&region->dev);
> > +
> > +	return ret;
> > +}
> > +
> > +static size_t set_targetN(struct cxl_region *region, const char *buf, int n, size_t len)
> > +{
> > +	struct device *memdev_dev;
> > +	struct cxl_memdev *cxlmd;
> > +	ssize_t rc;
> > +	int val;
> > +
> > +	device_lock(&region->dev);
> > +
> > +	rc = kstrtoint(buf, 0, &val);
> > +	/* Special 'remote' target operation */
> 
> remote?
> 

remove

> > +	if (!rc && val == 0) {
> 
> I'm not sure on logic here.  This seems to be if the value is valid and 0
> we do something different?  Why is 0 special? It's not documented as such
> and the thing returns nil, not 0.

I changed it from the RFC to return empty string (or I meant to at least). I
should make it take the empty string as well.

> 
> > +		cxlmd = region->targets[n];
> > +		if (cxlmd)
> > +			put_device(&cxlmd->dev);
> > +		region->targets[n] = NULL;
> > +		device_unlock(&region->dev);
> > +		return len;
> > +	}
> > +
> > +	memdev_dev = bus_find_device_by_name(&cxl_bus_type, NULL, buf);
> > +	if (!memdev_dev)
> > +		return -ENOENT;
> > +
> > +	cxlmd = to_cxl_memdev(memdev_dev);
> > +	get_device(&cxlmd->dev);
> > +	region->targets[n] = cxlmd;
> > +
> > +	device_unlock(&region->dev);
> > +
> > +	return len;
> > +}
> > +
> > +#define TARGET_ATTR_RW(n)                                                      \
> > +	static ssize_t target##n##_show(                                       \
> > +		struct device *dev, struct device_attribute *attr, char *buf)  \
> > +	{                                                                      \
> > +		return show_targetN(to_cxl_region(dev), buf, n);               \
> > +	}                                                                      \
> > +	static ssize_t target##n##_store(struct device *dev,                   \
> > +					 struct device_attribute *attr,        \
> > +					 const char *buf, size_t len)          \
> > +	{                                                                      \
> > +		return set_targetN(to_cxl_region(dev), buf, n, len);           \
> > +	}                                                                      \
> > +	static DEVICE_ATTR_RW(target##n)
> > +
> > +TARGET_ATTR_RW(0);
> > +TARGET_ATTR_RW(1);
> > +TARGET_ATTR_RW(2);
> > +TARGET_ATTR_RW(3);
> > +TARGET_ATTR_RW(4);
> > +TARGET_ATTR_RW(5);
> > +TARGET_ATTR_RW(6);
> > +TARGET_ATTR_RW(7);
> > +TARGET_ATTR_RW(8);
> > +TARGET_ATTR_RW(9);
> > +TARGET_ATTR_RW(10);
> > +TARGET_ATTR_RW(11);
> > +TARGET_ATTR_RW(12);
> > +TARGET_ATTR_RW(13);
> > +TARGET_ATTR_RW(14);
> > +TARGET_ATTR_RW(15);
> > +
> > +static struct attribute *interleave_attrs[] = {
> > +	&dev_attr_target0.attr,
> > +	&dev_attr_target1.attr,
> > +	&dev_attr_target2.attr,
> > +	&dev_attr_target3.attr,
> > +	&dev_attr_target4.attr,
> > +	&dev_attr_target5.attr,
> > +	&dev_attr_target6.attr,
> > +	&dev_attr_target7.attr,
> > +	&dev_attr_target8.attr,
> > +	&dev_attr_target9.attr,
> > +	&dev_attr_target10.attr,
> > +	&dev_attr_target11.attr,
> > +	&dev_attr_target12.attr,
> > +	&dev_attr_target13.attr,
> > +	&dev_attr_target14.attr,
> > +	&dev_attr_target15.attr,
> > +	NULL,
> > +};
> > +
> > +static umode_t visible_targets(struct kobject *kobj, struct attribute *a, int n)
> > +{
> > +	struct device *dev = container_of(kobj, struct device, kobj);
> > +	struct cxl_region *region = to_cxl_region(dev);
> > +
> > +	if (n < region->eniw)
> > +		return a->mode;
> > +	return 0;
> > +}
> > +
> > +static const struct attribute_group region_interleave_group = {
> > +	.attrs = interleave_attrs,
> > +	.is_visible = visible_targets,
> > +};
> > +
> > +static const struct attribute_group *region_groups[] = {
> > +	&region_group,
> > +	&region_interleave_group,
> > +	NULL,
> > +};
> > +
> >  static void cxl_region_release(struct device *dev);
> >  
> >  const struct device_type cxl_region_type = {
> >  	.name = "cxl_region",
> >  	.release = cxl_region_release,
> > +	.groups = region_groups
> >  };
> >  
> >  void cxl_free_region(struct cxl_decoder *cxld, struct cxl_region *region)
> >  {
> > +	int i;
> > +
> >  	ida_free(&cxld->region_ida, region->id);
> > +	for (i = 0; i < region->eniw; i++) {
> > +		if (region->targets[i])
> > +			put_device(&region->targets[i]->dev);
> > +	}
> > +
> >  	kfree(region);
> >  }
> >  
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/6] cxl/region: Introduce concept of region configuration
  2021-06-18 15:25     ` Ben Widawsky
@ 2021-06-18 15:44       ` Jonathan Cameron
  0 siblings, 0 replies; 20+ messages in thread
From: Jonathan Cameron @ 2021-06-18 15:44 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Alison Schofield, Dan Williams, Ira Weiny, Vishal Verma

On Fri, 18 Jun 2021 08:25:51 -0700
Ben Widawsky <ben.widawsky@intel.com> wrote:

> On 21-06-18 12:22:37, Jonathan Cameron wrote:
> > On Thu, 17 Jun 2021 10:36:52 -0700
> > Ben Widawsky <ben.widawsky@intel.com> wrote:
> >   
> > > The region creation APIs leave a region unconfigured. Configuring the
> > > region will work in the same way as similar subsystems such as devdax.
> > > Sysfs attrs will be provided to allow userspace to configure the region.
> > > Finally once all configuration is complete, userspace may "commit" the
> > > config. What the kernel decides to do after a config is committed is out
> > > of scope at this point.
> > > 
> > > Introduced here are the most basic attributes needed to configure a
> > > region.
> > > 
> > > A x1 interleave example is provided below:
> > > 
> > > decoder1.0
> > > ├── create_region
> > > ├── delete_region
> > > ├── devtype
> > > ├── locked
> > > ├── region1.0:0
> > > │   ├── offset
> > > │   ├── size
> > > │   ├── subsystem -> ../../../../../../../bus/cxl
> > > │   ├── target0
> > > │   ├── uevent
> > > │   ├── uuid
> > > │   └── verify

Does that exist any more?

> > > ├── size
> > > ├── start
> > > ├── subsystem -> ../../../../../../bus/cxl
> > > ├── target_list
> > > ├── target_type
> > > └── uevent
> > > 
> > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>  
> > 
> > Hi Ben,
> > 
> > I think some of the sysfs interface needs more detailed docs
> > and consideration about consistency.
> > 
> > If an interface isn't fairly obvious without reading the docs
> > (but understanding what it is trying to do) then it's normally
> > a bad sign...
> > 
> > Jonathan
> >   
> 
> My measuring stick has been: the interface should be fairly obvious if you're
> fairly familiar with the spec. I think we've satisfied that here, but often
> times the creator of something can be blind to effects of the creation.
> 
> I can certainly add more descriptive text to the ABI doc, but if you have
> suggestions on how to make the sysfs attrs more descriptive, that'd be great.

In this patch, mainly the whole 0 / nil etc.
Empty string make sense to me when clearing as that reflects what is returned
when it wasn't set in the first place.

Attr names are fine for this one.
> 
> > > ---
> > >  Documentation/ABI/testing/sysfs-bus-cxl |  32 ++++
> > >  drivers/cxl/region.c                    | 223 ++++++++++++++++++++++++
> > >  2 files changed, 255 insertions(+)
> > > 
> > > diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
> > > index 115a25d2899d..70f9d09385a4 100644
> > > --- a/Documentation/ABI/testing/sysfs-bus-cxl
> > > +++ b/Documentation/ABI/testing/sysfs-bus-cxl
> > > @@ -148,3 +148,35 @@ Description:
> > >  		Deletes the named region. A region must be unbound from the
> > >  		region driver before being deleted. The attributes expects a
> > >  		region in the form "regionX.Y:Z".
> > > +
> > > +What:		/sys/bus/cxl/devices/decoderX.Y/regionX.Y:Z/offset
> > > +Date:		June, 2021
> > > +KernelVersion:	v5.14
> > > +Contact:	linux-cxl@vger.kernel.org
> > > +Description:
> > > +		(RO) A region resides within an address space that is claimed by
> > > +		a decoder. Region space allocation is handled by the driver, but
> > > +		the offset may be read by userspace tooling in order to
> > > +		determine fragmentation, and available size for new regions.  
> > 
> > Side note rather than abotu this patch.  Feels to me like the format definition
> > of these docs should include something on permissions.  The (RO) bit is fine
> > but makes the docs less machine readable.    
> 
> Any examples of something you consider ideal?

Nope :)  Mauro is doing some work on automated check of this documentation.
If that works gets anywhere near RO vs RW we can figure out how to improve
the docs then to support it!

> 
> >   
> > > +
> > > +What:
> > > +/sys/bus/cxl/devices/decoderX.Y/regionX.Y:Z/{size,uuid,target[0-15]}
> > > +Date:		June, 2021
> > > +KernelVersion:	v5.14
> > > +Contact:	linux-cxl@vger.kernel.org
> > > +Description:
> > > +		(RW) Configuring regions requires a minimal set of parameters in
> > > +		order for the subsequent bind operation to succeed. The
> > > +		following parameters are defined:
> > > +
> > > +		==	========================================================
> > > +		size	Manadatory. Phsyical address space the region will  
> > 
> > Spell check.
> >   
> > > +			consume.
> > > +		uuid	Optional. A unique identifier for the region. If none is
> > > +			selected, the kernel will create one.
> > > +		target  Mandatory. Memory devices are the backing storage for a
> > > +			region. There will be N targets based on the number of
> > > +			interleave ways that the top level decoder is configured
> > > +			for. Each target must be set with a memdev device ie.
> > > +			'mem1'.  
> > 
> > Document what happens if not set.  'nil' is not obvious.
> >   
> > > +		==	========================================================  
> > 
> > The automation looking at these files is increasing, so I wonder if that will
> > cause some issues with having these all documented in one block.
> > 
> > I don't see a huge disadvantage in breaking this into 3 entries.  Alternatively
> > wait for someone to scream and refactor it then.  
> 
> Okay.
> 
> >   
> > > diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
> > > index 391467e864a2..cf7fd3027419 100644
> > > --- a/drivers/cxl/region.c
> > > +++ b/drivers/cxl/region.c
> > > @@ -3,7 +3,9 @@
> > >  #include <linux/io-64-nonatomic-lo-hi.h>
> > >  #include <linux/device.h>
> > >  #include <linux/module.h>
> > > +#include <linux/sizes.h>
> > >  #include <linux/slab.h>
> > > +#include <linux/uuid.h>
> > >  #include <linux/idr.h>
> > >  #include "region.h"
> > >  #include "cxl.h"
> > > @@ -21,16 +23,237 @@
> > >   * relationship between decoder and region when the region is interleaved.
> > >   */
> > >  
> > > +static struct cxl_region *to_cxl_region(struct device *dev);
> > > +
> > > +static bool is_region_active(struct cxl_region *region)
> > > +{
> > > +	/* TODO: Regions can't be activated yet. */
> > > +	return false;
> > > +}
> > > +
> > > +static ssize_t offset_show(struct device *dev, struct device_attribute *attr,
> > > +			   char *buf)
> > > +{
> > > +	struct cxl_region *region = to_cxl_region(dev);
> > > +	struct cxl_decoder *cxld = to_cxl_decoder(dev->parent);
> > > +
> > > +	if (!region->res)
> > > +		return sysfs_emit(buf, "\n");
> > > +
> > > +	return sysfs_emit(buf, "%#llx\n", cxld->range.start - region->res->start);
> > > +}
> > > +static DEVICE_ATTR_RO(offset);
> > > +
> > > +static ssize_t size_show(struct device *dev, struct device_attribute *attr,
> > > +			 char *buf)
> > > +{
> > > +	struct cxl_region *region = to_cxl_region(dev);
> > > +
> > > +	if (!region->res)
> > > +		return sysfs_emit(buf, "*%llu\n", region->requested_size);
> > > +
> > > +	return sysfs_emit(buf, "%llu\n", resource_size(region->res));
> > > +}
> > > +
> > > +static ssize_t size_store(struct device *dev, struct device_attribute *attr,
> > > +			  const char *buf, size_t len)
> > > +{
> > > +	struct cxl_region *region = to_cxl_region(dev);
> > > +	unsigned long long val;
> > > +	ssize_t rc;
> > > +
> > > +	rc = kstrtoull(buf, 0, &val);
> > > +	if (rc)
> > > +		return rc;
> > > +
> > > +	device_lock(&region->dev);
> > > +	if (is_region_active(region)) {
> > > +		rc = -EBUSY;
> > > +	} else {
> > > +		region->requested_size = val;
> > > +	}
> > > +	device_unlock(&region->dev);
> > > +
> > > +	return len;
> > > +}
> > > +static DEVICE_ATTR_RW(size);
> > > +
> > > +static ssize_t uuid_show(struct device *dev, struct device_attribute *attr,
> > > +			 char *buf)
> > > +{
> > > +	struct cxl_region *region = to_cxl_region(dev);
> > > +
> > > +	return sysfs_emit(buf, "%pUb\n", &region->uuid);
> > > +}
> > > +static ssize_t uuid_store(struct device *dev, struct device_attribute *attr,
> > > +			  const char *buf, size_t len)
> > > +{
> > > +	struct cxl_region *region = to_cxl_region(dev);
> > > +	ssize_t rc;
> > > +
> > > +	if (len != UUID_STRING_LEN + 1)
> > > +		return -EINVAL;
> > > +
> > > +	device_lock(&region->dev);
> > > +	if (is_region_active(region)) {
> > > +		rc = -EBUSY;
> > > +	} else {
> > > +		rc = uuid_parse(buf, &region->uuid);
> > > +	}
> > > +	device_unlock(&region->dev);
> > > +
> > > +	return rc ? rc : len;
> > > +}
> > > +static DEVICE_ATTR_RW(uuid);
> > > +
> > > +static struct attribute *region_attrs[] = {
> > > +	&dev_attr_offset.attr,
> > > +	&dev_attr_size.attr,
> > > +	&dev_attr_uuid.attr,
> > > +	NULL,
> > > +};
> > > +
> > > +static const struct attribute_group region_group = {
> > > +	.attrs = region_attrs,
> > > +};
> > > +
> > > +static size_t show_targetN(struct cxl_region *region, char *buf, int n)
> > > +{
> > > +	int ret;
> > > +
> > > +	device_lock(&region->dev);
> > > +	if (!region->targets[n])
> > > +		ret = sysfs_emit(buf, "%s\n", dev_name(&region->targets[n]->dev));
> > > +	else
> > > +		ret = sysfs_emit(buf, "nil\n");
> > > +	device_unlock(&region->dev);
> > > +
> > > +	return ret;
> > > +}
> > > +
> > > +static size_t set_targetN(struct cxl_region *region, const char *buf, int n, size_t len)
> > > +{
> > > +	struct device *memdev_dev;
> > > +	struct cxl_memdev *cxlmd;
> > > +	ssize_t rc;
> > > +	int val;
> > > +
> > > +	device_lock(&region->dev);
> > > +
> > > +	rc = kstrtoint(buf, 0, &val);
> > > +	/* Special 'remote' target operation */  
> > 
> > remote?
> >   
> 
> remove
> 
> > > +	if (!rc && val == 0) {  
> > 
> > I'm not sure on logic here.  This seems to be if the value is valid and 0
> > we do something different?  Why is 0 special? It's not documented as such
> > and the thing returns nil, not 0.  
> 
> I changed it from the RFC to return empty string (or I meant to at least). I
> should make it take the empty string as well.

That works better.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/6] cxl/region: Add region creation ABI
  2021-06-18 15:07     ` Ben Widawsky
@ 2021-06-18 16:39       ` Dan Williams
  0 siblings, 0 replies; 20+ messages in thread
From: Dan Williams @ 2021-06-18 16:39 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: Jonathan Cameron, linux-cxl, Alison Schofield, Ira Weiny, Vishal Verma

On Fri, Jun 18, 2021 at 8:08 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
[..]
> > It's a bit horrible, but maybe a more intuitive interface would be to make the
> > targetX visibility dynamic and hence make interleave an attribute of the region?
> >
> > sysfs_update_group() is rarely used, but I think the intent is to support
> > this sort of case.
> >
> > I've not fully thought through the effects of that though so might
> > well be missing something.
>
> I really would have liked something dynamic like you described. I searched for
> other drivers that this without much luck. I'm relatively unfamiliar with a lot
> of the device model machinery in Linux. Dan, are you familiar with this, is it
> feasible? If it sounds reasonable I can spend some more time figuring out how to
> do it.

I would prefer not to go the sysfs_update_group() route. That tends to
make it harder for tooling to reason about the arrival of attributes
relative to KOBJ_ADD events.

That said "create" may indeed be too generic an attribute to reason
about what it does. We could make all the setup parameters that do not
involve targets their own attribute group and then have a trigger
attribute that adds the next region device. Something like:

decoderX.Y/setup_region/uuid
decoderX.Y/setup_region/size
decoderX.Y/setup_region/interleave
decoderX.Y/setup_region/commit

...and then @commit atomically adds the region and clears the
parameters for the next setup?

Alternatively, renaming create to something like add_interleave sounds
ok to me too... trying to approach the border with configfs without
crossing over.

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2021-06-18 16:39 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-17 17:36 [PATCH 0/6] Region creation Ben Widawsky
2021-06-17 17:36 ` [PATCH 1/6] cxl/region: Add region creation ABI Ben Widawsky
2021-06-17 21:11   ` [PATCH v2 " Ben Widawsky
2021-06-18  9:13   ` [PATCH " Jonathan Cameron
2021-06-18 15:07     ` Ben Widawsky
2021-06-18 16:39       ` Dan Williams
2021-06-17 17:36 ` [PATCH 2/6] cxl: Move cxl_memdev conversion helper to mem.h Ben Widawsky
2021-06-18  9:13   ` Jonathan Cameron
2021-06-18 15:00   ` Dan Williams
2021-06-17 17:36 ` [PATCH 3/6] cxl/region: Introduce concept of region configuration Ben Widawsky
2021-06-18 11:22   ` Jonathan Cameron
2021-06-18 15:25     ` Ben Widawsky
2021-06-18 15:44       ` Jonathan Cameron
2021-06-17 17:36 ` [PATCH 4/6] cxl/region: Introduce a cxl_region driver Ben Widawsky
2021-06-17 21:13   ` [PATCH v2 " Ben Widawsky
2021-06-18 11:49     ` Jonathan Cameron
2021-06-17 17:36 ` [PATCH 5/6] cxl/core: Convert decoder range to resource Ben Widawsky
2021-06-18 11:52   ` Jonathan Cameron
2021-06-17 17:36 ` [PATCH 6/6] cxl/region: Handle region's address space allocation Ben Widawsky
2021-06-18 13:35   ` Jonathan Cameron

This is a public inbox, see mirroring instructions
on how to clone and mirror all data and code used for this inbox