* [PATCH v4 00/21] cxl_test: Enable CXL Topology and UAPI regression tests
From: Dan Williams @ 2021-09-09  5:11 UTC
  To: linux-cxl
  Cc: Ira Weiny, Ben Widawsky, Jonathan Cameron, Vishal Verma,
	Andy Shevchenko, Nathan Chancellor, kernel test robot,
	Dan Carpenter, nvdimm, ben.widawsky, alison.schofield,
	vishal.l.verma, ira.weiny, Jonathan.Cameron

Changes since v3 [1]:
- Rebase on the cxl-for-5.15 tag where some of v3 was accepted.

- Move the introduction of the uuid_to_nvdimm_cclass() helper to the
  patch that uses it to avoid a "defined but not used" warning (now
  fatal as of the -Werror upstream change).

- Fix the kernel-doc for the new CXL structure definitions in
  drivers/nvdimm/label.h. This resulted in a lead-in patch to fix up
  existing kernel-doc warnings. (Jonathan)

- Move the cxl_mem_get_partition_info() cleanups to their own patch
  (Jonathan).

- Fix the cxl_mem_get_partition_info() kernel-doc (Ben)

- Move the idr.h include fixup to its own patch (Jonathan)

- Fix the kernel-doc for struct cxl_mbox_cmd (Jonathan)

- Add a note about the ABI implications of the nvdimm-bridge device-name
  change (Jonathan)

- Squash "cxl/core: Replace devm_cxl_add_decoder() with non-devm
  version" into "cxl/core: Split decoder setup into alloc + add"

- Cleanup some pointless cxlm->dev to pci_dev and back conversions
  (Jonathan)

- Drop pci.h include from core/mbox.c (Jonathan)

- Drop duplicate cxl_doorbell_busy() and CXL_MAILBOX_TIMEOUT_MS
  definitions, only pci.c needs them. (Jonathan)

- Fix style violation in cxl_for_each_cmd() (Jonathan)

- Move cxl_mem_create() api change to "cxl/pci: Make 'struct cxl_mem'
  device type generic" (Jonathan)

- Add a helper, decoder_populate_targets(), for initializing a decoder
  target_list (Jonathan)

- Use put_unaligned() for potential unaligned write in
  cxl_pmem_set_config_data() (Jonathan)

- Add cxl_nvdimm_remove() to explicitly handle exclusive command
  shutdown rather than some gymnastics to handle the
  devm_add_action_or_reset() error code vs nvdimm_create() failure.
  (Jonathan)

- Switch to platform_device_unregister() lest we wake the dragons
  guarding platform_device_del(). (Jonathan)

- Fix CEL retrieval for the case where CEL is larger than payload_size
  in mock_get_log(). (Jonathan)

- Move CFMWS size definition out of the setup code and into the static
  table definition directly. (Jonathan)

[1]: https://lore.kernel.org/r/162982112370.1124374.2020303588105269226.stgit@dwillia2-desk3.amr.corp.intel.com

---

As mentioned in patch 17 of this series, the response of the upstream
QEMU community to CXL device emulation has been underwhelming to date.
Even if that picked up, it would still result in a situation where new
driver features and new test capabilities for those features are split
across multiple repositories.

The "nfit_test" approach of mocking up platform resources via an
external test module continues to yield positive results, catching
regressions early and often. This series attempts to repeat that
success with a "cxl_test" module that injects custom-crafted topologies
and command responses into the CXL subsystem's sysfs and ioctl UAPIs.

The first target for cxl_test to verify is the integration of CXL with
LIBNVDIMM and the new support for the CXL namespace-label + region-label
format. The first 6 patches in this series finish off the work that was
merged in the v5.15 merge window to introduce support for the new label
format.

The next 10 patches rework the CXL PCI driver to move more common
infrastructure into the core for the unit test environment to reuse. The
largest change here is disconnecting the mailbox command processing
infrastructure from the PCI-specific transport. The unit test
environment replaces the PCI transport with a custom backend that
returns mocked responses to command requests.
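
To illustrate the shape of that indirection, here is a sketch based on
the "mbox_send" naming in patch 9's title; the surrounding members are
elided/illustrative, not the literal patch:

    struct cxl_mem {
            struct device *dev;
            /* ... mailbox payload buffers, command capabilities, etc. ... */

            /*
             * Transport-neutral send operation: cxl_pci points this at
             * the hardware mailbox register protocol, while cxl_test
             * points it at a mock backend that fabricates responses.
             */
            int (*mbox_send)(struct cxl_mem *cxlm, struct cxl_mbox_cmd *cmd);
    };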

Patch 17 introduces just enough mocked functionality for the cxl_acpi
driver to load against cxl_test resources. Patch 18 fixes the first bug
discovered by this framework, namely that HDM decoder target-list maps
were not being filled out.

Finally, patches 19 and 20 introduce a cxl_test representation of memory
expander devices. In this initial implementation, these memory-expander
targets implement just enough command support to pass the basic driver
init sequence and to enable label command passthrough to LIBNVDIMM.

The topology of cxl_test includes:
- (4) platform fixed memory windows: one each of x1-volatile,
  x4-volatile, x1-persistent, and x4-persistent
- (4) host bridges, each with (2) root ports
- (8) CXL memory expanders, one for each root port
- Each memory expander device supports the GET_SUPPORTED_LOGS, GET_LOG,
  IDENTIFY, GET_LSA, and SET_LSA commands.

Going forward, the expectation is that, where possible, new UAPI-visible
subsystem functionality comes with cxl_test emulation of the same.

The build process for cxl_test is:

    make M=tools/testing/cxl
    make M=tools/testing/cxl modules_install
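
The expectation (mirroring the nfit_test workflow; the module name here
is inferred from the tools/testing/cxl/test/ layout) is that the test
environment is then activated with:

    modprobe cxl_test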

The implementation methodology of the test module is the same as
nfit_test, where the bulk of the emulation comes from replacing symbols
that cxl_acpi and cxl_core import with mocked implementations of those
symbols. See the "--wrap=" lines in tools/testing/cxl/Kbuild. Some
symbols that need to be replaced are local to the modules, like
match_add_root_ports(); in those cases the local symbol is marked __weak
(via __mock) with a strong implementation coming from
tools/testing/cxl/. The goal is to be minimally invasive to production
code paths. A sketch of the wrap mechanism follows.
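
For those unfamiliar with the linker's "--wrap" mechanism, a minimal
sketch (the symbol name "foo" is illustrative; see
tools/testing/cxl/Kbuild for the real list of wrapped symbols):

    # Kbuild (sketch): references to foo() in the covered objects now
    # resolve to __wrap_foo(); __real_foo() names the original symbol.
    ldflags-y += --wrap=foo

    /* tools/testing/cxl/test/mock.c (sketch) */
    int __wrap_foo(struct device *dev)
    {
            if (is_mock_dev(dev))   /* hypothetical mock-device check */
                    return mock_foo(dev);
            return __real_foo(dev); /* defer to the production path */
    }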

---

Dan Williams (21):
      libnvdimm/labels: Add uuid helpers
      libnvdimm/label: Add a helper for nlabel validation
      libnvdimm/labels: Introduce the concept of multi-range namespace labels
      libnvdimm/labels: Fix kernel-doc for label.h
      libnvdimm/label: Define CXL region labels
      libnvdimm/labels: Introduce CXL labels
      cxl/pci: Make 'struct cxl_mem' device type generic
      cxl/pci: Clean up cxl_mem_get_partition_info()
      cxl/mbox: Introduce the mbox_send operation
      cxl/pci: Drop idr.h
      cxl/mbox: Move mailbox and other non-PCI specific infrastructure to the core
      cxl/pci: Use module_pci_driver
      cxl/mbox: Convert 'enabled_cmds' to DECLARE_BITMAP
      cxl/mbox: Add exclusive kernel command support
      cxl/pmem: Translate NVDIMM label commands to CXL label commands
      cxl/pmem: Add support for multiple nvdimm-bridge objects
      tools/testing/cxl: Introduce a mocked-up CXL port hierarchy
      cxl/bus: Populate the target list at decoder create
      cxl/mbox: Move command definitions to common location
      tools/testing/cxl: Introduce a mock memory device + driver
      cxl/core: Split decoder setup into alloc + add


 Documentation/driver-api/cxl/memory-devices.rst |    3 
 drivers/cxl/acpi.c                              |  133 ++-
 drivers/cxl/core/Makefile                       |    1 
 drivers/cxl/core/bus.c                          |  114 +-
 drivers/cxl/core/core.h                         |   11 
 drivers/cxl/core/mbox.c                         |  787 +++++++++++++++++
 drivers/cxl/core/memdev.c                       |  117 ++-
 drivers/cxl/core/pmem.c                         |   39 +
 drivers/cxl/cxl.h                               |   49 +
 drivers/cxl/cxlmem.h                            |  202 ++++
 drivers/cxl/pci.c                               | 1059 +----------------------
 drivers/cxl/pmem.c                              |  172 +++-
 drivers/nvdimm/btt.c                            |   11 
 drivers/nvdimm/btt_devs.c                       |   14 
 drivers/nvdimm/core.c                           |   40 -
 drivers/nvdimm/label.c                          |  139 ++-
 drivers/nvdimm/label.h                          |   94 ++
 drivers/nvdimm/namespace_devs.c                 |   95 +-
 drivers/nvdimm/nd-core.h                        |    5 
 drivers/nvdimm/nd.h                             |  185 +++-
 drivers/nvdimm/pfn_devs.c                       |    2 
 include/linux/nd.h                              |    4 
 tools/testing/cxl/Kbuild                        |   38 +
 tools/testing/cxl/config_check.c                |   13 
 tools/testing/cxl/mock_acpi.c                   |  109 ++
 tools/testing/cxl/mock_pmem.c                   |   24 +
 tools/testing/cxl/test/Kbuild                   |   10 
 tools/testing/cxl/test/cxl.c                    |  576 +++++++++++++
 tools/testing/cxl/test/mem.c                    |  256 ++++++
 tools/testing/cxl/test/mock.c                   |  171 ++++
 tools/testing/cxl/test/mock.h                   |   27 +
 31 files changed, 3134 insertions(+), 1366 deletions(-)
 create mode 100644 drivers/cxl/core/mbox.c
 create mode 100644 tools/testing/cxl/Kbuild
 create mode 100644 tools/testing/cxl/config_check.c
 create mode 100644 tools/testing/cxl/mock_acpi.c
 create mode 100644 tools/testing/cxl/mock_pmem.c
 create mode 100644 tools/testing/cxl/test/Kbuild
 create mode 100644 tools/testing/cxl/test/cxl.c
 create mode 100644 tools/testing/cxl/test/mem.c
 create mode 100644 tools/testing/cxl/test/mock.c
 create mode 100644 tools/testing/cxl/test/mock.h

base-commit: 2b922a9d064f8e86b53b04f5819917b7a04142ed


* [PATCH v4 01/21] libnvdimm/labels: Add uuid helpers
From: Dan Williams @ 2021-09-09  5:11 UTC
  To: linux-cxl
  Cc: Andy Shevchenko, Jonathan Cameron, vishal.l.verma, nvdimm,
	ben.widawsky, alison.schofield, ira.weiny, Jonathan.Cameron

In preparation for CXL labels that move the uuid to a different offset
in the label, add nsl_{ref,get,validate}_uuid(). These helpers use the
proper uuid_t type. That type definition predated the libnvdimm
subsystem, so now is as good a time as any to convert all the uuid
handling in the subsystem to uuid_t to match the helpers.

Note that the uuid fields in the label data and superblocks are not
replaced, per Andy's expectation that uuid_t is a kernel-internal type
that should not appear in external ABI interfaces. So, in those cases
{import,export}_uuid() is used to convert between the two types.
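
A sketch of that conversion pattern, as it appears throughout the diff
below (the fragment is context-free, for illustration only):

    uuid_t uuid;

    /* the label media stores a raw u8[16]; pull it into the kernel type */
    import_uuid(&uuid, nd_label->uuid);
    if (uuid_equal(&uuid, target_uuid))
            /* ...match... */;

    /* push a uuid_t back out to the raw ABI field */
    export_uuid(nd_label->uuid, &uuid);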

Also note that this rework uncovered some unnecessary copies made for
label comparisons; those are cleaned up with nsl_uuid_equal().

As for the whitespace changes, all new code is clang-format compliant.

Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/nvdimm/btt.c            |   11 +++--
 drivers/nvdimm/btt_devs.c       |   14 +++---
 drivers/nvdimm/core.c           |   40 ++---------------
 drivers/nvdimm/label.c          |   34 ++++++---------
 drivers/nvdimm/namespace_devs.c |   90 ++++++++++++++++++++++-----------------
 drivers/nvdimm/nd-core.h        |    5 +-
 drivers/nvdimm/nd.h             |   40 ++++++++++++++++-
 drivers/nvdimm/pfn_devs.c       |    2 -
 include/linux/nd.h              |    4 +-
 9 files changed, 124 insertions(+), 116 deletions(-)

diff --git a/drivers/nvdimm/btt.c b/drivers/nvdimm/btt.c
index 92dec4952297..52de60b7adee 100644
--- a/drivers/nvdimm/btt.c
+++ b/drivers/nvdimm/btt.c
@@ -973,7 +973,7 @@ static int btt_arena_write_layout(struct arena_info *arena)
 	u64 sum;
 	struct btt_sb *super;
 	struct nd_btt *nd_btt = arena->nd_btt;
-	const u8 *parent_uuid = nd_dev_to_uuid(&nd_btt->ndns->dev);
+	const uuid_t *parent_uuid = nd_dev_to_uuid(&nd_btt->ndns->dev);
 
 	ret = btt_map_init(arena);
 	if (ret)
@@ -988,8 +988,8 @@ static int btt_arena_write_layout(struct arena_info *arena)
 		return -ENOMEM;
 
 	strncpy(super->signature, BTT_SIG, BTT_SIG_LEN);
-	memcpy(super->uuid, nd_btt->uuid, 16);
-	memcpy(super->parent_uuid, parent_uuid, 16);
+	export_uuid(super->uuid, nd_btt->uuid);
+	export_uuid(super->parent_uuid, parent_uuid);
 	super->flags = cpu_to_le32(arena->flags);
 	super->version_major = cpu_to_le16(arena->version_major);
 	super->version_minor = cpu_to_le16(arena->version_minor);
@@ -1575,7 +1575,8 @@ static void btt_blk_cleanup(struct btt *btt)
  * Pointer to a new struct btt on success, NULL on failure.
  */
 static struct btt *btt_init(struct nd_btt *nd_btt, unsigned long long rawsize,
-		u32 lbasize, u8 *uuid, struct nd_region *nd_region)
+			    u32 lbasize, uuid_t *uuid,
+			    struct nd_region *nd_region)
 {
 	int ret;
 	struct btt *btt;
@@ -1694,7 +1695,7 @@ int nvdimm_namespace_attach_btt(struct nd_namespace_common *ndns)
 	}
 	nd_region = to_nd_region(nd_btt->dev.parent);
 	btt = btt_init(nd_btt, rawsize, nd_btt->lbasize, nd_btt->uuid,
-			nd_region);
+		       nd_region);
 	if (!btt)
 		return -ENOMEM;
 	nd_btt->btt = btt;
diff --git a/drivers/nvdimm/btt_devs.c b/drivers/nvdimm/btt_devs.c
index 05feb97e11ce..8b52e5144f08 100644
--- a/drivers/nvdimm/btt_devs.c
+++ b/drivers/nvdimm/btt_devs.c
@@ -180,8 +180,8 @@ bool is_nd_btt(struct device *dev)
 EXPORT_SYMBOL(is_nd_btt);
 
 static struct device *__nd_btt_create(struct nd_region *nd_region,
-		unsigned long lbasize, u8 *uuid,
-		struct nd_namespace_common *ndns)
+				      unsigned long lbasize, uuid_t *uuid,
+				      struct nd_namespace_common *ndns)
 {
 	struct nd_btt *nd_btt;
 	struct device *dev;
@@ -244,14 +244,16 @@ struct device *nd_btt_create(struct nd_region *nd_region)
  */
 bool nd_btt_arena_is_valid(struct nd_btt *nd_btt, struct btt_sb *super)
 {
-	const u8 *parent_uuid = nd_dev_to_uuid(&nd_btt->ndns->dev);
+	const uuid_t *ns_uuid = nd_dev_to_uuid(&nd_btt->ndns->dev);
+	uuid_t parent_uuid;
 	u64 checksum;
 
 	if (memcmp(super->signature, BTT_SIG, BTT_SIG_LEN) != 0)
 		return false;
 
-	if (!guid_is_null((guid_t *)&super->parent_uuid))
-		if (memcmp(super->parent_uuid, parent_uuid, 16) != 0)
+	import_uuid(&parent_uuid, super->parent_uuid);
+	if (!uuid_is_null(&parent_uuid))
+		if (!uuid_equal(&parent_uuid, ns_uuid))
 			return false;
 
 	checksum = le64_to_cpu(super->checksum);
@@ -319,7 +321,7 @@ static int __nd_btt_probe(struct nd_btt *nd_btt,
 		return rc;
 
 	nd_btt->lbasize = le32_to_cpu(btt_sb->external_lbasize);
-	nd_btt->uuid = kmemdup(btt_sb->uuid, 16, GFP_KERNEL);
+	nd_btt->uuid = kmemdup(&btt_sb->uuid, sizeof(uuid_t), GFP_KERNEL);
 	if (!nd_btt->uuid)
 		return -ENOMEM;
 
diff --git a/drivers/nvdimm/core.c b/drivers/nvdimm/core.c
index 7de592d7eff4..690152d62bf0 100644
--- a/drivers/nvdimm/core.c
+++ b/drivers/nvdimm/core.c
@@ -206,38 +206,6 @@ struct device *to_nvdimm_bus_dev(struct nvdimm_bus *nvdimm_bus)
 }
 EXPORT_SYMBOL_GPL(to_nvdimm_bus_dev);
 
-static bool is_uuid_sep(char sep)
-{
-	if (sep == '\n' || sep == '-' || sep == ':' || sep == '\0')
-		return true;
-	return false;
-}
-
-static int nd_uuid_parse(struct device *dev, u8 *uuid_out, const char *buf,
-		size_t len)
-{
-	const char *str = buf;
-	u8 uuid[16];
-	int i;
-
-	for (i = 0; i < 16; i++) {
-		if (!isxdigit(str[0]) || !isxdigit(str[1])) {
-			dev_dbg(dev, "pos: %d buf[%zd]: %c buf[%zd]: %c\n",
-					i, str - buf, str[0],
-					str + 1 - buf, str[1]);
-			return -EINVAL;
-		}
-
-		uuid[i] = (hex_to_bin(str[0]) << 4) | hex_to_bin(str[1]);
-		str += 2;
-		if (is_uuid_sep(*str))
-			str++;
-	}
-
-	memcpy(uuid_out, uuid, sizeof(uuid));
-	return 0;
-}
-
 /**
  * nd_uuid_store: common implementation for writing 'uuid' sysfs attributes
  * @dev: container device for the uuid property
@@ -248,21 +216,21 @@ static int nd_uuid_parse(struct device *dev, u8 *uuid_out, const char *buf,
  * (driver detached)
  * LOCKING: expects nd_device_lock() is held on entry
  */
-int nd_uuid_store(struct device *dev, u8 **uuid_out, const char *buf,
+int nd_uuid_store(struct device *dev, uuid_t **uuid_out, const char *buf,
 		size_t len)
 {
-	u8 uuid[16];
+	uuid_t uuid;
 	int rc;
 
 	if (dev->driver)
 		return -EBUSY;
 
-	rc = nd_uuid_parse(dev, uuid, buf, len);
+	rc = uuid_parse(buf, &uuid);
 	if (rc)
 		return rc;
 
 	kfree(*uuid_out);
-	*uuid_out = kmemdup(uuid, sizeof(uuid), GFP_KERNEL);
+	*uuid_out = kmemdup(&uuid, sizeof(uuid), GFP_KERNEL);
 	if (!(*uuid_out))
 		return -ENOMEM;
 
diff --git a/drivers/nvdimm/label.c b/drivers/nvdimm/label.c
index 7f473f9db300..e7fdb718ebf0 100644
--- a/drivers/nvdimm/label.c
+++ b/drivers/nvdimm/label.c
@@ -321,7 +321,8 @@ static bool preamble_index(struct nvdimm_drvdata *ndd, int idx,
 	return true;
 }
 
-char *nd_label_gen_id(struct nd_label_id *label_id, u8 *uuid, u32 flags)
+char *nd_label_gen_id(struct nd_label_id *label_id, const uuid_t *uuid,
+		      u32 flags)
 {
 	if (!label_id || !uuid)
 		return NULL;
@@ -400,9 +401,9 @@ int nd_label_reserve_dpa(struct nvdimm_drvdata *ndd)
 		struct nvdimm *nvdimm = to_nvdimm(ndd->dev);
 		struct nd_namespace_label *nd_label;
 		struct nd_region *nd_region = NULL;
-		u8 label_uuid[NSLABEL_UUID_LEN];
 		struct nd_label_id label_id;
 		struct resource *res;
+		uuid_t label_uuid;
 		u32 flags;
 
 		nd_label = to_label(ndd, slot);
@@ -410,11 +411,11 @@ int nd_label_reserve_dpa(struct nvdimm_drvdata *ndd)
 		if (!slot_valid(ndd, nd_label, slot))
 			continue;
 
-		memcpy(label_uuid, nd_label->uuid, NSLABEL_UUID_LEN);
+		nsl_get_uuid(ndd, nd_label, &label_uuid);
 		flags = nsl_get_flags(ndd, nd_label);
 		if (test_bit(NDD_NOBLK, &nvdimm->flags))
 			flags &= ~NSLABEL_FLAG_LOCAL;
-		nd_label_gen_id(&label_id, label_uuid, flags);
+		nd_label_gen_id(&label_id, &label_uuid, flags);
 		res = nvdimm_allocate_dpa(ndd, &label_id,
 					  nsl_get_dpa(ndd, nd_label),
 					  nsl_get_rawsize(ndd, nd_label));
@@ -851,7 +852,7 @@ static int __pmem_label_update(struct nd_region *nd_region,
 
 	nd_label = to_label(ndd, slot);
 	memset(nd_label, 0, sizeof_namespace_label(ndd));
-	memcpy(nd_label->uuid, nspm->uuid, NSLABEL_UUID_LEN);
+	nsl_set_uuid(ndd, nd_label, nspm->uuid);
 	nsl_set_name(ndd, nd_label, nspm->alt_name);
 	nsl_set_flags(ndd, nd_label, flags);
 	nsl_set_nlabel(ndd, nd_label, nd_region->ndr_mappings);
@@ -878,9 +879,8 @@ static int __pmem_label_update(struct nd_region *nd_region,
 	list_for_each_entry(label_ent, &nd_mapping->labels, list) {
 		if (!label_ent->label)
 			continue;
-		if (test_and_clear_bit(ND_LABEL_REAP, &label_ent->flags)
-				|| memcmp(nspm->uuid, label_ent->label->uuid,
-					NSLABEL_UUID_LEN) == 0)
+		if (test_and_clear_bit(ND_LABEL_REAP, &label_ent->flags) ||
+		    nsl_uuid_equal(ndd, label_ent->label, nspm->uuid))
 			reap_victim(nd_mapping, label_ent);
 	}
 
@@ -1005,7 +1005,6 @@ static int __blk_label_update(struct nd_region *nd_region,
 	unsigned long *free, *victim_map = NULL;
 	struct resource *res, **old_res_list;
 	struct nd_label_id label_id;
-	u8 uuid[NSLABEL_UUID_LEN];
 	int min_dpa_idx = 0;
 	LIST_HEAD(list);
 	u32 nslot, slot;
@@ -1043,8 +1042,7 @@ static int __blk_label_update(struct nd_region *nd_region,
 		/* mark unused labels for garbage collection */
 		for_each_clear_bit_le(slot, free, nslot) {
 			nd_label = to_label(ndd, slot);
-			memcpy(uuid, nd_label->uuid, NSLABEL_UUID_LEN);
-			if (memcmp(uuid, nsblk->uuid, NSLABEL_UUID_LEN) != 0)
+			if (!nsl_uuid_equal(ndd, nd_label, nsblk->uuid))
 				continue;
 			res = to_resource(ndd, nd_label);
 			if (res && is_old_resource(res, old_res_list,
@@ -1113,7 +1111,7 @@ static int __blk_label_update(struct nd_region *nd_region,
 
 		nd_label = to_label(ndd, slot);
 		memset(nd_label, 0, sizeof_namespace_label(ndd));
-		memcpy(nd_label->uuid, nsblk->uuid, NSLABEL_UUID_LEN);
+		nsl_set_uuid(ndd, nd_label, nsblk->uuid);
 		nsl_set_name(ndd, nd_label, nsblk->alt_name);
 		nsl_set_flags(ndd, nd_label, NSLABEL_FLAG_LOCAL);
 
@@ -1161,8 +1159,7 @@ static int __blk_label_update(struct nd_region *nd_region,
 		if (!nd_label)
 			continue;
 		nlabel++;
-		memcpy(uuid, nd_label->uuid, NSLABEL_UUID_LEN);
-		if (memcmp(uuid, nsblk->uuid, NSLABEL_UUID_LEN) != 0)
+		if (!nsl_uuid_equal(ndd, nd_label, nsblk->uuid))
 			continue;
 		nlabel--;
 		list_move(&label_ent->list, &list);
@@ -1192,8 +1189,7 @@ static int __blk_label_update(struct nd_region *nd_region,
 	}
 	for_each_clear_bit_le(slot, free, nslot) {
 		nd_label = to_label(ndd, slot);
-		memcpy(uuid, nd_label->uuid, NSLABEL_UUID_LEN);
-		if (memcmp(uuid, nsblk->uuid, NSLABEL_UUID_LEN) != 0)
+		if (!nsl_uuid_equal(ndd, nd_label, nsblk->uuid))
 			continue;
 		res = to_resource(ndd, nd_label);
 		res->flags &= ~DPA_RESOURCE_ADJUSTED;
@@ -1273,12 +1269,11 @@ static int init_labels(struct nd_mapping *nd_mapping, int num_labels)
 	return max(num_labels, old_num_labels);
 }
 
-static int del_labels(struct nd_mapping *nd_mapping, u8 *uuid)
+static int del_labels(struct nd_mapping *nd_mapping, uuid_t *uuid)
 {
 	struct nvdimm_drvdata *ndd = to_ndd(nd_mapping);
 	struct nd_label_ent *label_ent, *e;
 	struct nd_namespace_index *nsindex;
-	u8 label_uuid[NSLABEL_UUID_LEN];
 	unsigned long *free;
 	LIST_HEAD(list);
 	u32 nslot, slot;
@@ -1298,8 +1293,7 @@ static int del_labels(struct nd_mapping *nd_mapping, u8 *uuid)
 		if (!nd_label)
 			continue;
 		active++;
-		memcpy(label_uuid, nd_label->uuid, NSLABEL_UUID_LEN);
-		if (memcmp(label_uuid, uuid, NSLABEL_UUID_LEN) != 0)
+		if (!nsl_uuid_equal(ndd, nd_label, uuid))
 			continue;
 		active--;
 		slot = to_slot(ndd, nd_label);
diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c
index 58c76d74127a..d4959981c7d4 100644
--- a/drivers/nvdimm/namespace_devs.c
+++ b/drivers/nvdimm/namespace_devs.c
@@ -51,7 +51,7 @@ static bool is_namespace_io(const struct device *dev);
 
 static int is_uuid_busy(struct device *dev, void *data)
 {
-	u8 *uuid1 = data, *uuid2 = NULL;
+	uuid_t *uuid1 = data, *uuid2 = NULL;
 
 	if (is_namespace_pmem(dev)) {
 		struct nd_namespace_pmem *nspm = to_nd_namespace_pmem(dev);
@@ -71,7 +71,7 @@ static int is_uuid_busy(struct device *dev, void *data)
 		uuid2 = nd_pfn->uuid;
 	}
 
-	if (uuid2 && memcmp(uuid1, uuid2, NSLABEL_UUID_LEN) == 0)
+	if (uuid2 && uuid_equal(uuid1, uuid2))
 		return -EBUSY;
 
 	return 0;
@@ -89,7 +89,7 @@ static int is_namespace_uuid_busy(struct device *dev, void *data)
  * @dev: any device on a nvdimm_bus
  * @uuid: uuid to check
  */
-bool nd_is_uuid_unique(struct device *dev, u8 *uuid)
+bool nd_is_uuid_unique(struct device *dev, uuid_t *uuid)
 {
 	struct nvdimm_bus *nvdimm_bus = walk_to_nvdimm_bus(dev);
 
@@ -192,12 +192,10 @@ const char *nvdimm_namespace_disk_name(struct nd_namespace_common *ndns,
 }
 EXPORT_SYMBOL(nvdimm_namespace_disk_name);
 
-const u8 *nd_dev_to_uuid(struct device *dev)
+const uuid_t *nd_dev_to_uuid(struct device *dev)
 {
-	static const u8 null_uuid[16];
-
 	if (!dev)
-		return null_uuid;
+		return &uuid_null;
 
 	if (is_namespace_pmem(dev)) {
 		struct nd_namespace_pmem *nspm = to_nd_namespace_pmem(dev);
@@ -208,7 +206,7 @@ const u8 *nd_dev_to_uuid(struct device *dev)
 
 		return nsblk->uuid;
 	} else
-		return null_uuid;
+		return &uuid_null;
 }
 EXPORT_SYMBOL(nd_dev_to_uuid);
 
@@ -938,7 +936,8 @@ static void nd_namespace_pmem_set_resource(struct nd_region *nd_region,
 	res->end = res->start + size - 1;
 }
 
-static bool uuid_not_set(const u8 *uuid, struct device *dev, const char *where)
+static bool uuid_not_set(const uuid_t *uuid, struct device *dev,
+			 const char *where)
 {
 	if (!uuid) {
 		dev_dbg(dev, "%s: uuid not set\n", where);
@@ -957,7 +956,7 @@ static ssize_t __size_store(struct device *dev, unsigned long long val)
 	struct nd_label_id label_id;
 	u32 flags = 0, remainder;
 	int rc, i, id = -1;
-	u8 *uuid = NULL;
+	uuid_t *uuid = NULL;
 
 	if (dev->driver || ndns->claim)
 		return -EBUSY;
@@ -1050,7 +1049,7 @@ static ssize_t size_store(struct device *dev,
 {
 	struct nd_region *nd_region = to_nd_region(dev->parent);
 	unsigned long long val;
-	u8 **uuid = NULL;
+	uuid_t **uuid = NULL;
 	int rc;
 
 	rc = kstrtoull(buf, 0, &val);
@@ -1147,7 +1146,7 @@ static ssize_t size_show(struct device *dev,
 }
 static DEVICE_ATTR(size, 0444, size_show, size_store);
 
-static u8 *namespace_to_uuid(struct device *dev)
+static uuid_t *namespace_to_uuid(struct device *dev)
 {
 	if (is_namespace_pmem(dev)) {
 		struct nd_namespace_pmem *nspm = to_nd_namespace_pmem(dev);
@@ -1161,10 +1160,10 @@ static u8 *namespace_to_uuid(struct device *dev)
 		return ERR_PTR(-ENXIO);
 }
 
-static ssize_t uuid_show(struct device *dev,
-		struct device_attribute *attr, char *buf)
+static ssize_t uuid_show(struct device *dev, struct device_attribute *attr,
+			 char *buf)
 {
-	u8 *uuid = namespace_to_uuid(dev);
+	uuid_t *uuid = namespace_to_uuid(dev);
 
 	if (IS_ERR(uuid))
 		return PTR_ERR(uuid);
@@ -1181,7 +1180,8 @@ static ssize_t uuid_show(struct device *dev,
  * @old_uuid: reference to the uuid storage location in the namespace object
  */
 static int namespace_update_uuid(struct nd_region *nd_region,
-		struct device *dev, u8 *new_uuid, u8 **old_uuid)
+				 struct device *dev, uuid_t *new_uuid,
+				 uuid_t **old_uuid)
 {
 	u32 flags = is_namespace_blk(dev) ? NSLABEL_FLAG_LOCAL : 0;
 	struct nd_label_id old_label_id;
@@ -1231,10 +1231,12 @@ static int namespace_update_uuid(struct nd_region *nd_region,
 		list_for_each_entry(label_ent, &nd_mapping->labels, list) {
 			struct nd_namespace_label *nd_label = label_ent->label;
 			struct nd_label_id label_id;
+			uuid_t uuid;
 
 			if (!nd_label)
 				continue;
-			nd_label_gen_id(&label_id, nd_label->uuid,
+			nsl_get_uuid(ndd, nd_label, &uuid);
+			nd_label_gen_id(&label_id, &uuid,
 					nsl_get_flags(ndd, nd_label));
 			if (strcmp(old_label_id.id, label_id.id) == 0)
 				set_bit(ND_LABEL_REAP, &label_ent->flags);
@@ -1251,9 +1253,9 @@ static ssize_t uuid_store(struct device *dev,
 		struct device_attribute *attr, const char *buf, size_t len)
 {
 	struct nd_region *nd_region = to_nd_region(dev->parent);
-	u8 *uuid = NULL;
+	uuid_t *uuid = NULL;
+	uuid_t **ns_uuid;
 	ssize_t rc = 0;
-	u8 **ns_uuid;
 
 	if (is_namespace_pmem(dev)) {
 		struct nd_namespace_pmem *nspm = to_nd_namespace_pmem(dev);
@@ -1378,8 +1380,8 @@ static ssize_t dpa_extents_show(struct device *dev,
 {
 	struct nd_region *nd_region = to_nd_region(dev->parent);
 	struct nd_label_id label_id;
+	uuid_t *uuid = NULL;
 	int count = 0, i;
-	u8 *uuid = NULL;
 	u32 flags = 0;
 
 	nvdimm_bus_lock(dev);
@@ -1831,8 +1833,8 @@ static struct device **create_namespace_io(struct nd_region *nd_region)
 	return devs;
 }
 
-static bool has_uuid_at_pos(struct nd_region *nd_region, u8 *uuid,
-		u64 cookie, u16 pos)
+static bool has_uuid_at_pos(struct nd_region *nd_region, const uuid_t *uuid,
+			    u64 cookie, u16 pos)
 {
 	struct nd_namespace_label *found = NULL;
 	int i;
@@ -1856,7 +1858,7 @@ static bool has_uuid_at_pos(struct nd_region *nd_region, u8 *uuid,
 			if (!nsl_validate_isetcookie(ndd, nd_label, cookie))
 				continue;
 
-			if (memcmp(nd_label->uuid, uuid, NSLABEL_UUID_LEN) != 0)
+			if (!nsl_uuid_equal(ndd, nd_label, uuid))
 				continue;
 
 			if (!nsl_validate_type_guid(ndd, nd_label,
@@ -1881,7 +1883,7 @@ static bool has_uuid_at_pos(struct nd_region *nd_region, u8 *uuid,
 	return found != NULL;
 }
 
-static int select_pmem_id(struct nd_region *nd_region, u8 *pmem_id)
+static int select_pmem_id(struct nd_region *nd_region, const uuid_t *pmem_id)
 {
 	int i;
 
@@ -1900,7 +1902,7 @@ static int select_pmem_id(struct nd_region *nd_region, u8 *pmem_id)
 			nd_label = label_ent->label;
 			if (!nd_label)
 				continue;
-			if (memcmp(nd_label->uuid, pmem_id, NSLABEL_UUID_LEN) == 0)
+			if (nsl_uuid_equal(ndd, nd_label, pmem_id))
 				break;
 			nd_label = NULL;
 		}
@@ -1923,7 +1925,8 @@ static int select_pmem_id(struct nd_region *nd_region, u8 *pmem_id)
 			/* pass */;
 		else {
 			dev_dbg(&nd_region->dev, "%s invalid label for %pUb\n",
-					dev_name(ndd->dev), nd_label->uuid);
+				dev_name(ndd->dev),
+				nsl_uuid_raw(ndd, nd_label));
 			return -EINVAL;
 		}
 
@@ -1953,6 +1956,7 @@ static struct device *create_namespace_pmem(struct nd_region *nd_region,
 	resource_size_t size = 0;
 	struct resource *res;
 	struct device *dev;
+	uuid_t uuid;
 	int rc = 0;
 	u16 i;
 
@@ -1963,12 +1967,12 @@ static struct device *create_namespace_pmem(struct nd_region *nd_region,
 
 	if (!nsl_validate_isetcookie(ndd, nd_label, cookie)) {
 		dev_dbg(&nd_region->dev, "invalid cookie in label: %pUb\n",
-				nd_label->uuid);
+			nsl_uuid_raw(ndd, nd_label));
 		if (!nsl_validate_isetcookie(ndd, nd_label, altcookie))
 			return ERR_PTR(-EAGAIN);
 
 		dev_dbg(&nd_region->dev, "valid altcookie in label: %pUb\n",
-				nd_label->uuid);
+			nsl_uuid_raw(ndd, nd_label));
 	}
 
 	nspm = kzalloc(sizeof(*nspm), GFP_KERNEL);
@@ -1984,9 +1988,12 @@ static struct device *create_namespace_pmem(struct nd_region *nd_region,
 	res->flags = IORESOURCE_MEM;
 
 	for (i = 0; i < nd_region->ndr_mappings; i++) {
-		if (has_uuid_at_pos(nd_region, nd_label->uuid, cookie, i))
+		uuid_t uuid;
+
+		nsl_get_uuid(ndd, nd_label, &uuid);
+		if (has_uuid_at_pos(nd_region, &uuid, cookie, i))
 			continue;
-		if (has_uuid_at_pos(nd_region, nd_label->uuid, altcookie, i))
+		if (has_uuid_at_pos(nd_region, &uuid, altcookie, i))
 			continue;
 		break;
 	}
@@ -2000,7 +2007,7 @@ static struct device *create_namespace_pmem(struct nd_region *nd_region,
 		 * find a dimm with two instances of the same uuid.
 		 */
 		dev_err(&nd_region->dev, "%s missing label for %pUb\n",
-				nvdimm_name(nvdimm), nd_label->uuid);
+			nvdimm_name(nvdimm), nsl_uuid_raw(ndd, nd_label));
 		rc = -EINVAL;
 		goto err;
 	}
@@ -2013,7 +2020,8 @@ static struct device *create_namespace_pmem(struct nd_region *nd_region,
 	 * the dimm being enabled (i.e. nd_label_reserve_dpa()
 	 * succeeded).
 	 */
-	rc = select_pmem_id(nd_region, nd_label->uuid);
+	nsl_get_uuid(ndd, nd_label, &uuid);
+	rc = select_pmem_id(nd_region, &uuid);
 	if (rc)
 		goto err;
 
@@ -2039,8 +2047,8 @@ static struct device *create_namespace_pmem(struct nd_region *nd_region,
 		WARN_ON(nspm->alt_name || nspm->uuid);
 		nspm->alt_name = kmemdup(nsl_ref_name(ndd, label0),
 					 NSLABEL_NAME_LEN, GFP_KERNEL);
-		nspm->uuid = kmemdup((void __force *) label0->uuid,
-				NSLABEL_UUID_LEN, GFP_KERNEL);
+		nsl_get_uuid(ndd, label0, &uuid);
+		nspm->uuid = kmemdup(&uuid, sizeof(uuid_t), GFP_KERNEL);
 		nspm->lbasize = nsl_get_lbasize(ndd, label0);
 		nspm->nsio.common.claim_class =
 			nsl_get_claim_class(ndd, label0);
@@ -2217,15 +2225,15 @@ static int add_namespace_resource(struct nd_region *nd_region,
 	int i;
 
 	for (i = 0; i < count; i++) {
-		u8 *uuid = namespace_to_uuid(devs[i]);
+		uuid_t *uuid = namespace_to_uuid(devs[i]);
 		struct resource *res;
 
-		if (IS_ERR_OR_NULL(uuid)) {
+		if (IS_ERR(uuid)) {
 			WARN_ON(1);
 			continue;
 		}
 
-		if (memcmp(uuid, nd_label->uuid, NSLABEL_UUID_LEN) != 0)
+		if (!nsl_uuid_equal(ndd, nd_label, uuid))
 			continue;
 		if (is_namespace_blk(devs[i])) {
 			res = nsblk_add_resource(nd_region, ndd,
@@ -2236,8 +2244,8 @@ static int add_namespace_resource(struct nd_region *nd_region,
 			nd_dbg_dpa(nd_region, ndd, res, "%d assign\n", count);
 		} else {
 			dev_err(&nd_region->dev,
-					"error: conflicting extents for uuid: %pUb\n",
-					nd_label->uuid);
+				"error: conflicting extents for uuid: %pUb\n",
+				uuid);
 			return -ENXIO;
 		}
 		break;
@@ -2257,6 +2265,7 @@ static struct device *create_namespace_blk(struct nd_region *nd_region,
 	char name[NSLABEL_NAME_LEN];
 	struct device *dev = NULL;
 	struct resource *res;
+	uuid_t uuid;
 
 	if (!nsl_validate_type_guid(ndd, nd_label, &nd_set->type_guid))
 		return ERR_PTR(-EAGAIN);
@@ -2271,7 +2280,8 @@ static struct device *create_namespace_blk(struct nd_region *nd_region,
 	dev->parent = &nd_region->dev;
 	nsblk->id = -1;
 	nsblk->lbasize = nsl_get_lbasize(ndd, nd_label);
-	nsblk->uuid = kmemdup(nd_label->uuid, NSLABEL_UUID_LEN, GFP_KERNEL);
+	nsl_get_uuid(ndd, nd_label, &uuid);
+	nsblk->uuid = kmemdup(&uuid, sizeof(uuid_t), GFP_KERNEL);
 	nsblk->common.claim_class = nsl_get_claim_class(ndd, nd_label);
 	if (!nsblk->uuid)
 		goto blk_err;
diff --git a/drivers/nvdimm/nd-core.h b/drivers/nvdimm/nd-core.h
index 564faa36a3ca..a11850dd475d 100644
--- a/drivers/nvdimm/nd-core.h
+++ b/drivers/nvdimm/nd-core.h
@@ -126,8 +126,9 @@ void nvdimm_bus_destroy_ndctl(struct nvdimm_bus *nvdimm_bus);
 void nd_synchronize(void);
 void __nd_device_register(struct device *dev);
 struct nd_label_id;
-char *nd_label_gen_id(struct nd_label_id *label_id, u8 *uuid, u32 flags);
-bool nd_is_uuid_unique(struct device *dev, u8 *uuid);
+char *nd_label_gen_id(struct nd_label_id *label_id, const uuid_t *uuid,
+		      u32 flags);
+bool nd_is_uuid_unique(struct device *dev, uuid_t *uuid);
 struct nd_region;
 struct nvdimm_drvdata;
 struct nd_mapping;
diff --git a/drivers/nvdimm/nd.h b/drivers/nvdimm/nd.h
index 5467ebbb4a6b..ec3c9aad7f50 100644
--- a/drivers/nvdimm/nd.h
+++ b/drivers/nvdimm/nd.h
@@ -177,6 +177,38 @@ static inline void nsl_set_lbasize(struct nvdimm_drvdata *ndd,
 	nd_label->lbasize = __cpu_to_le64(lbasize);
 }
 
+static inline const uuid_t *nsl_get_uuid(struct nvdimm_drvdata *ndd,
+					 struct nd_namespace_label *nd_label,
+					 uuid_t *uuid)
+{
+	import_uuid(uuid, nd_label->uuid);
+	return uuid;
+}
+
+static inline const uuid_t *nsl_set_uuid(struct nvdimm_drvdata *ndd,
+					 struct nd_namespace_label *nd_label,
+					 const uuid_t *uuid)
+{
+	export_uuid(nd_label->uuid, uuid);
+	return uuid;
+}
+
+static inline bool nsl_uuid_equal(struct nvdimm_drvdata *ndd,
+				  struct nd_namespace_label *nd_label,
+				  const uuid_t *uuid)
+{
+	uuid_t tmp;
+
+	import_uuid(&tmp, nd_label->uuid);
+	return uuid_equal(&tmp, uuid);
+}
+
+static inline const u8 *nsl_uuid_raw(struct nvdimm_drvdata *ndd,
+				     struct nd_namespace_label *nd_label)
+{
+	return nd_label->uuid;
+}
+
 bool nsl_validate_blk_isetcookie(struct nvdimm_drvdata *ndd,
 				 struct nd_namespace_label *nd_label,
 				 u64 isetcookie);
@@ -335,7 +367,7 @@ struct nd_btt {
 	struct btt *btt;
 	unsigned long lbasize;
 	u64 size;
-	u8 *uuid;
+	uuid_t *uuid;
 	int id;
 	int initial_offset;
 	u16 version_major;
@@ -350,7 +382,7 @@ enum nd_pfn_mode {
 
 struct nd_pfn {
 	int id;
-	u8 *uuid;
+	uuid_t *uuid;
 	struct device dev;
 	unsigned long align;
 	unsigned long npfns;
@@ -378,7 +410,7 @@ void wait_nvdimm_bus_probe_idle(struct device *dev);
 void nd_device_register(struct device *dev);
 void nd_device_unregister(struct device *dev, enum nd_async_mode mode);
 void nd_device_notify(struct device *dev, enum nvdimm_event event);
-int nd_uuid_store(struct device *dev, u8 **uuid_out, const char *buf,
+int nd_uuid_store(struct device *dev, uuid_t **uuid_out, const char *buf,
 		size_t len);
 ssize_t nd_size_select_show(unsigned long current_size,
 		const unsigned long *supported, char *buf);
@@ -561,6 +593,6 @@ static inline bool is_bad_pmem(struct badblocks *bb, sector_t sector,
 	return false;
 }
 resource_size_t nd_namespace_blk_validate(struct nd_namespace_blk *nsblk);
-const u8 *nd_dev_to_uuid(struct device *dev);
+const uuid_t *nd_dev_to_uuid(struct device *dev);
 bool pmem_should_map_pages(struct device *dev);
 #endif /* __ND_H__ */
diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
index b499df630d4d..58eda16f5c53 100644
--- a/drivers/nvdimm/pfn_devs.c
+++ b/drivers/nvdimm/pfn_devs.c
@@ -452,7 +452,7 @@ int nd_pfn_validate(struct nd_pfn *nd_pfn, const char *sig)
 	unsigned long align, start_pad;
 	struct nd_pfn_sb *pfn_sb = nd_pfn->pfn_sb;
 	struct nd_namespace_common *ndns = nd_pfn->ndns;
-	const u8 *parent_uuid = nd_dev_to_uuid(&ndns->dev);
+	const uuid_t *parent_uuid = nd_dev_to_uuid(&ndns->dev);
 
 	if (!pfn_sb || !ndns)
 		return -ENODEV;
diff --git a/include/linux/nd.h b/include/linux/nd.h
index ee9ad76afbba..8a8c63edb1b2 100644
--- a/include/linux/nd.h
+++ b/include/linux/nd.h
@@ -88,7 +88,7 @@ struct nd_namespace_pmem {
 	struct nd_namespace_io nsio;
 	unsigned long lbasize;
 	char *alt_name;
-	u8 *uuid;
+	uuid_t *uuid;
 	int id;
 };
 
@@ -105,7 +105,7 @@ struct nd_namespace_pmem {
 struct nd_namespace_blk {
 	struct nd_namespace_common common;
 	char *alt_name;
-	u8 *uuid;
+	uuid_t *uuid;
 	int id;
 	unsigned long lbasize;
 	resource_size_t size;



* [PATCH v4 02/21] libnvdimm/label: Add a helper for nlabel validation
From: Dan Williams @ 2021-09-09  5:11 UTC
  To: linux-cxl
  Cc: Jonathan Cameron, vishal.l.verma, nvdimm, ben.widawsky,
	alison.schofield, ira.weiny, Jonathan.Cameron

In the CXL namespace label there is no need for nlabel since that is
inferred from the region. Add a helper, nsl_validate_nlabel(), that
moves the nsl_get_nlabel() check behind an interface that validates the
number of labels relative to the region.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/nvdimm/namespace_devs.c |    5 ++---
 drivers/nvdimm/nd.h             |    7 +++++++
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c
index d4959981c7d4..28ed14052e36 100644
--- a/drivers/nvdimm/namespace_devs.c
+++ b/drivers/nvdimm/namespace_devs.c
@@ -1848,12 +1848,11 @@ static bool has_uuid_at_pos(struct nd_region *nd_region, const uuid_t *uuid,
 
 		list_for_each_entry(label_ent, &nd_mapping->labels, list) {
 			struct nd_namespace_label *nd_label = label_ent->label;
-			u16 position, nlabel;
+			u16 position;
 
 			if (!nd_label)
 				continue;
 			position = nsl_get_position(ndd, nd_label);
-			nlabel = nsl_get_nlabel(ndd, nd_label);
 
 			if (!nsl_validate_isetcookie(ndd, nd_label, cookie))
 				continue;
@@ -1870,7 +1869,7 @@ static bool has_uuid_at_pos(struct nd_region *nd_region, const uuid_t *uuid,
 				return false;
 			}
 			found_uuid = true;
-			if (nlabel != nd_region->ndr_mappings)
+			if (!nsl_validate_nlabel(nd_region, ndd, nd_label))
 				continue;
 			if (position != pos)
 				continue;
diff --git a/drivers/nvdimm/nd.h b/drivers/nvdimm/nd.h
index ec3c9aad7f50..036638bdb7e3 100644
--- a/drivers/nvdimm/nd.h
+++ b/drivers/nvdimm/nd.h
@@ -342,6 +342,13 @@ struct nd_region {
 	struct nd_mapping mapping[];
 };
 
+static inline bool nsl_validate_nlabel(struct nd_region *nd_region,
+				       struct nvdimm_drvdata *ndd,
+				       struct nd_namespace_label *nd_label)
+{
+	return nsl_get_nlabel(ndd, nd_label) == nd_region->ndr_mappings;
+}
+
 struct nd_blk_region {
 	int (*enable)(struct nvdimm_bus *nvdimm_bus, struct device *dev);
 	int (*do_io)(struct nd_blk_region *ndbr, resource_size_t dpa,



* [PATCH v4 03/21] libnvdimm/labels: Introduce the concept of multi-range namespace labels
From: Dan Williams @ 2021-09-09  5:11 UTC
  To: linux-cxl
  Cc: Jonathan Cameron, vishal.l.verma, nvdimm, ben.widawsky,
	alison.schofield, ira.weiny, Jonathan.Cameron

The CXL specification defines a mechanism for namespaces to comprise
multiple discontiguous ranges. Introduce that concept to the legacy
NVDIMM namespace implementation with a new nsl_set_nrange() helper that
sets the number of ranges to 1. Once the NVDIMM subsystem supports CXL
labels and updates its namespace capacity provisioning for discontiguous
support, nsl_set_nrange() can be updated; in the meantime, CXL label
validation requires that nrange be non-zero.

Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/nvdimm/label.c |    1 +
 drivers/nvdimm/nd.h    |   13 +++++++++++++
 2 files changed, 14 insertions(+)

diff --git a/drivers/nvdimm/label.c b/drivers/nvdimm/label.c
index e7fdb718ebf0..7832b190efd7 100644
--- a/drivers/nvdimm/label.c
+++ b/drivers/nvdimm/label.c
@@ -856,6 +856,7 @@ static int __pmem_label_update(struct nd_region *nd_region,
 	nsl_set_name(ndd, nd_label, nspm->alt_name);
 	nsl_set_flags(ndd, nd_label, flags);
 	nsl_set_nlabel(ndd, nd_label, nd_region->ndr_mappings);
+	nsl_set_nrange(ndd, nd_label, 1);
 	nsl_set_position(ndd, nd_label, pos);
 	nsl_set_isetcookie(ndd, nd_label, cookie);
 	nsl_set_rawsize(ndd, nd_label, resource_size(res));
diff --git a/drivers/nvdimm/nd.h b/drivers/nvdimm/nd.h
index 036638bdb7e3..d57f95a48fe1 100644
--- a/drivers/nvdimm/nd.h
+++ b/drivers/nvdimm/nd.h
@@ -164,6 +164,19 @@ static inline void nsl_set_nlabel(struct nvdimm_drvdata *ndd,
 	nd_label->nlabel = __cpu_to_le16(nlabel);
 }
 
+static inline u16 nsl_get_nrange(struct nvdimm_drvdata *ndd,
+				 struct nd_namespace_label *nd_label)
+{
+	/* EFI labels do not have an nrange field */
+	return 1;
+}
+
+static inline void nsl_set_nrange(struct nvdimm_drvdata *ndd,
+				  struct nd_namespace_label *nd_label,
+				  u16 nrange)
+{
+}
+
 static inline u64 nsl_get_lbasize(struct nvdimm_drvdata *ndd,
 				  struct nd_namespace_label *nd_label)
 {



* [PATCH v4 04/21] libnvdimm/labels: Fix kernel-doc for label.h
From: Dan Williams @ 2021-09-09  5:11 UTC
  To: linux-cxl
  Cc: Jonathan Cameron, vishal.l.verma, nvdimm, ben.widawsky,
	alison.schofield, ira.weiny, Jonathan.Cameron

Clean up existing kernel-doc warnings before adding new CXL label data
structures.

drivers/nvdimm/label.h:66: warning: Function parameter or member 'labelsize' not described in 'nd_namespace_index'
drivers/nvdimm/label.h:66: warning: Function parameter or member 'free' not described in 'nd_namespace_index'
drivers/nvdimm/label.h:103: warning: Function parameter or member 'align' not described in 'nd_namespace_label'
drivers/nvdimm/label.h:103: warning: Function parameter or member 'reserved' not described in 'nd_namespace_label'
drivers/nvdimm/label.h:103: warning: Function parameter or member 'type_guid' not described in 'nd_namespace_label'
drivers/nvdimm/label.h:103: warning: Function parameter or member 'abstraction_guid' not described in 'nd_namespace_label'
drivers/nvdimm/label.h:103: warning: Function parameter or member 'reserved2' not described in 'nd_namespace_label'
drivers/nvdimm/label.h:103: warning: Function parameter or member 'checksum' not described in 'nd_namespace_label'

Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/nvdimm/label.h |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/nvdimm/label.h b/drivers/nvdimm/label.h
index 31f94fad7b92..7fa757d47846 100644
--- a/drivers/nvdimm/label.h
+++ b/drivers/nvdimm/label.h
@@ -34,6 +34,7 @@ enum {
  * struct nd_namespace_index - label set superblock
  * @sig: NAMESPACE_INDEX\0
  * @flags: placeholder
+ * @labelsize: log2 size (v1 labels 128 bytes v2 labels 256 bytes)
  * @seq: sequence number for this index
  * @myoff: offset of this index in label area
  * @mysize: size of this index struct
@@ -43,7 +44,7 @@ enum {
  * @major: label area major version
  * @minor: label area minor version
  * @checksum: fletcher64 of all fields
- * @free[0]: bitmap, nlabel bits
+ * @free: bitmap, nlabel bits
  *
  * The size of free[] is rounded up so the total struct size is a
  * multiple of NSINDEX_ALIGN bytes.  Any bits this allocates beyond
@@ -77,7 +78,12 @@ struct nd_namespace_index {
  * @dpa: DPA of NVM range on this DIMM
  * @rawsize: size of namespace
  * @slot: slot of this label in label area
- * @unused: must be zero
+ * @align: physical address alignment of the namespace
+ * @reserved: reserved
+ * @type_guid: copy of struct acpi_nfit_system_address.range_guid
+ * @abstraction_guid: personality id (btt, btt2, fsdax, devdax....)
+ * @reserved2: reserved
+ * @checksum: fletcher64 sum of this object
  */
 struct nd_namespace_label {
 	u8 uuid[NSLABEL_UUID_LEN];



* [PATCH v4 05/21] libnvdimm/label: Define CXL region labels
From: Dan Williams @ 2021-09-09  5:11 UTC
  To: linux-cxl
  Cc: Jonathan Cameron, vishal.l.verma, nvdimm, ben.widawsky,
	alison.schofield, ira.weiny, Jonathan.Cameron

Add a definition of the CXL 2.0 region label format. Note this is done
as a separate patch to make the next patch, which adds namespace label
support, easier to read.
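
One field is worth a worked example: per the kernel-doc below, @ig
encodes the interleave granularity as a power-of-two multiple of 256
bytes; a sketch (the helper name is invented for illustration):

    static inline u64 cxl_ig_to_bytes(u32 ig)
    {
            return 256ULL << ig;    /* ig == 0 -> 256B, ig == 8 -> 64KiB */
    }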

Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/nvdimm/label.h |   32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/drivers/nvdimm/label.h b/drivers/nvdimm/label.h
index 7fa757d47846..0519aacc2926 100644
--- a/drivers/nvdimm/label.h
+++ b/drivers/nvdimm/label.h
@@ -66,6 +66,38 @@ struct nd_namespace_index {
 	u8 free[];
 };
 
+/**
+ * struct cxl_region_label - CXL 2.0 Table 211
+ * @type: uuid identifying this label format (region)
+ * @uuid: uuid for the region this label describes
+ * @flags: NSLABEL_FLAG_UPDATING (all other flags reserved)
+ * @nlabel: 1 per interleave-way in the region
+ * @position: this label's position in the set
+ * @dpa: start address in device-local capacity for this label
+ * @rawsize: size of this label's contribution to region
+ * @hpa: mandatory system physical address to map this region
+ * @slot: slot id of this label in label area
+ * @ig: interleave granularity (1 << @ig) * 256 bytes
+ * @align: alignment in SZ_256M blocks
+ * @reserved: reserved
+ * @checksum: fletcher64 sum of this label
+ */
+struct cxl_region_label {
+	u8 type[NSLABEL_UUID_LEN];
+	u8 uuid[NSLABEL_UUID_LEN];
+	__le32 flags;
+	__le16 nlabel;
+	__le16 position;
+	__le64 dpa;
+	__le64 rawsize;
+	__le64 hpa;
+	__le32 slot;
+	__le32 ig;
+	__le32 align;
+	u8 reserved[0xac];
+	__le64 checksum;
+};
+
 /**
  * struct nd_namespace_label - namespace superblock
  * @uuid: UUID per RFC 4122



* [PATCH v4 06/21] libnvdimm/labels: Introduce CXL labels
From: Dan Williams @ 2021-09-09  5:12 UTC
  To: linux-cxl
  Cc: Jonathan Cameron, vishal.l.verma, nvdimm, ben.widawsky,
	alison.schofield, ira.weiny, Jonathan.Cameron

Now that all of the use sites of label data have been converted to nsl_*
helpers, introduce the CXL label format. The ->cxl flag in
nvdimm_drvdata indicates the label format the device expects. A
follow-on patch allows a bus provider to select the label style.

Note that the EFI definition of the labels represents the Linux "claim
class" with a GUID. The CXL definition of the labels stores the same
identifier in UUID byte order.
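
In other words (a sketch), the same identifier string lands on media in
two different byte orders depending on which parser produced it, which
is why nd_label_init() in the diff below parses each identifier twice:

    guid_t g;
    uuid_t u;

    /* guid_t: first three fields little-endian (EFI convention) */
    WARN_ON(guid_parse(NVDIMM_BTT_GUID, &g));
    /* uuid_t: plain big-endian byte stream (RFC 4122 / CXL usage) */
    WARN_ON(uuid_parse(NVDIMM_BTT_GUID, &u));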

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/nvdimm/label.c |  104 +++++++++++++++++++++++++++++++------
 drivers/nvdimm/label.h |   52 +++++++++++++++++-
 drivers/nvdimm/nd.h    |  135 +++++++++++++++++++++++++++++++++++++-----------
 3 files changed, 241 insertions(+), 50 deletions(-)

diff --git a/drivers/nvdimm/label.c b/drivers/nvdimm/label.c
index 7832b190efd7..5ec9a4023df9 100644
--- a/drivers/nvdimm/label.c
+++ b/drivers/nvdimm/label.c
@@ -17,6 +17,14 @@ static guid_t nvdimm_btt2_guid;
 static guid_t nvdimm_pfn_guid;
 static guid_t nvdimm_dax_guid;
 
+static uuid_t nvdimm_btt_uuid;
+static uuid_t nvdimm_btt2_uuid;
+static uuid_t nvdimm_pfn_uuid;
+static uuid_t nvdimm_dax_uuid;
+
+static uuid_t cxl_region_uuid;
+static uuid_t cxl_namespace_uuid;
+
 static const char NSINDEX_SIGNATURE[] = "NAMESPACE_INDEX\0";
 
 static u32 best_seq(u32 a, u32 b)
@@ -352,7 +360,7 @@ static bool nsl_validate_checksum(struct nvdimm_drvdata *ndd,
 {
 	u64 sum, sum_save;
 
-	if (!namespace_label_has(ndd, checksum))
+	if (!ndd->cxl && !efi_namespace_label_has(ndd, checksum))
 		return true;
 
 	sum_save = nsl_get_checksum(ndd, nd_label);
@@ -367,7 +375,7 @@ static void nsl_calculate_checksum(struct nvdimm_drvdata *ndd,
 {
 	u64 sum;
 
-	if (!namespace_label_has(ndd, checksum))
+	if (!ndd->cxl && !efi_namespace_label_has(ndd, checksum))
 		return;
 	nsl_set_checksum(ndd, nd_label, 0);
 	sum = nd_fletcher64(nd_label, sizeof_namespace_label(ndd), 1);
@@ -725,7 +733,7 @@ static unsigned long nd_label_offset(struct nvdimm_drvdata *ndd,
 		- (unsigned long) to_namespace_index(ndd, 0);
 }
 
-static enum nvdimm_claim_class to_nvdimm_cclass(guid_t *guid)
+static enum nvdimm_claim_class guid_to_nvdimm_cclass(guid_t *guid)
 {
 	if (guid_equal(guid, &nvdimm_btt_guid))
 		return NVDIMM_CCLASS_BTT;
@@ -741,6 +749,23 @@ static enum nvdimm_claim_class to_nvdimm_cclass(guid_t *guid)
 	return NVDIMM_CCLASS_UNKNOWN;
 }
 
+/* CXL labels store UUIDs instead of GUIDs for the same data */
+static enum nvdimm_claim_class uuid_to_nvdimm_cclass(uuid_t *uuid)
+{
+	if (uuid_equal(uuid, &nvdimm_btt_uuid))
+		return NVDIMM_CCLASS_BTT;
+	else if (uuid_equal(uuid, &nvdimm_btt2_uuid))
+		return NVDIMM_CCLASS_BTT2;
+	else if (uuid_equal(uuid, &nvdimm_pfn_uuid))
+		return NVDIMM_CCLASS_PFN;
+	else if (uuid_equal(uuid, &nvdimm_dax_uuid))
+		return NVDIMM_CCLASS_DAX;
+	else if (uuid_equal(uuid, &uuid_null))
+		return NVDIMM_CCLASS_NONE;
+
+	return NVDIMM_CCLASS_UNKNOWN;
+}
+
 static const guid_t *to_abstraction_guid(enum nvdimm_claim_class claim_class,
 	guid_t *target)
 {
@@ -762,6 +787,28 @@ static const guid_t *to_abstraction_guid(enum nvdimm_claim_class claim_class,
 		return &guid_null;
 }
 
+/* CXL labels store UUIDs instead of GUIDs for the same data */
+static const uuid_t *to_abstraction_uuid(enum nvdimm_claim_class claim_class,
+					 uuid_t *target)
+{
+	if (claim_class == NVDIMM_CCLASS_BTT)
+		return &nvdimm_btt_uuid;
+	else if (claim_class == NVDIMM_CCLASS_BTT2)
+		return &nvdimm_btt2_uuid;
+	else if (claim_class == NVDIMM_CCLASS_PFN)
+		return &nvdimm_pfn_uuid;
+	else if (claim_class == NVDIMM_CCLASS_DAX)
+		return &nvdimm_dax_uuid;
+	else if (claim_class == NVDIMM_CCLASS_UNKNOWN) {
+		/*
+		 * If we're modifying a namespace for which we don't
+		 * know the claim_class, don't touch the existing uuid.
+		 */
+		return target;
+	} else
+		return &uuid_null;
+}
+
 static void reap_victim(struct nd_mapping *nd_mapping,
 		struct nd_label_ent *victim)
 {
@@ -776,18 +823,18 @@ static void reap_victim(struct nd_mapping *nd_mapping,
 static void nsl_set_type_guid(struct nvdimm_drvdata *ndd,
 			      struct nd_namespace_label *nd_label, guid_t *guid)
 {
-	if (namespace_label_has(ndd, type_guid))
-		guid_copy(&nd_label->type_guid, guid);
+	if (efi_namespace_label_has(ndd, type_guid))
+		guid_copy(&nd_label->efi.type_guid, guid);
 }
 
 bool nsl_validate_type_guid(struct nvdimm_drvdata *ndd,
 			    struct nd_namespace_label *nd_label, guid_t *guid)
 {
-	if (!namespace_label_has(ndd, type_guid))
+	if (ndd->cxl || !efi_namespace_label_has(ndd, type_guid))
 		return true;
-	if (!guid_equal(&nd_label->type_guid, guid)) {
+	if (!guid_equal(&nd_label->efi.type_guid, guid)) {
 		dev_dbg(ndd->dev, "expect type_guid %pUb got %pUb\n", guid,
-			&nd_label->type_guid);
+			&nd_label->efi.type_guid);
 		return false;
 	}
 	return true;
@@ -797,19 +844,34 @@ static void nsl_set_claim_class(struct nvdimm_drvdata *ndd,
 				struct nd_namespace_label *nd_label,
 				enum nvdimm_claim_class claim_class)
 {
-	if (!namespace_label_has(ndd, abstraction_guid))
+	if (ndd->cxl) {
+		uuid_t uuid;
+
+		import_uuid(&uuid, nd_label->cxl.abstraction_uuid);
+		export_uuid(nd_label->cxl.abstraction_uuid,
+			    to_abstraction_uuid(claim_class, &uuid));
+		return;
+	}
+
+	if (!efi_namespace_label_has(ndd, abstraction_guid))
 		return;
-	guid_copy(&nd_label->abstraction_guid,
+	guid_copy(&nd_label->efi.abstraction_guid,
 		  to_abstraction_guid(claim_class,
-				      &nd_label->abstraction_guid));
+				      &nd_label->efi.abstraction_guid));
 }
 
 enum nvdimm_claim_class nsl_get_claim_class(struct nvdimm_drvdata *ndd,
 					    struct nd_namespace_label *nd_label)
 {
-	if (!namespace_label_has(ndd, abstraction_guid))
+	if (ndd->cxl) {
+		uuid_t uuid;
+
+		import_uuid(&uuid, nd_label->cxl.abstraction_uuid);
+		return uuid_to_nvdimm_cclass(&uuid);
+	}
+	if (!efi_namespace_label_has(ndd, abstraction_guid))
 		return NVDIMM_CCLASS_NONE;
-	return to_nvdimm_cclass(&nd_label->abstraction_guid);
+	return guid_to_nvdimm_cclass(&nd_label->efi.abstraction_guid);
 }
 
 static int __pmem_label_update(struct nd_region *nd_region,
@@ -942,7 +1004,7 @@ static void nsl_set_blk_isetcookie(struct nvdimm_drvdata *ndd,
 				   struct nd_namespace_label *nd_label,
 				   u64 isetcookie)
 {
-	if (namespace_label_has(ndd, type_guid)) {
+	if (efi_namespace_label_has(ndd, type_guid)) {
 		nsl_set_isetcookie(ndd, nd_label, isetcookie);
 		return;
 	}
@@ -953,7 +1015,7 @@ bool nsl_validate_blk_isetcookie(struct nvdimm_drvdata *ndd,
 				 struct nd_namespace_label *nd_label,
 				 u64 isetcookie)
 {
-	if (!namespace_label_has(ndd, type_guid))
+	if (!efi_namespace_label_has(ndd, type_guid))
 		return true;
 
 	if (nsl_get_isetcookie(ndd, nd_label) != isetcookie) {
@@ -969,7 +1031,7 @@ static void nsl_set_blk_nlabel(struct nvdimm_drvdata *ndd,
 			       struct nd_namespace_label *nd_label, int nlabel,
 			       bool first)
 {
-	if (!namespace_label_has(ndd, type_guid)) {
+	if (!efi_namespace_label_has(ndd, type_guid)) {
 		nsl_set_nlabel(ndd, nd_label, 0); /* N/A */
 		return;
 	}
@@ -980,7 +1042,7 @@ static void nsl_set_blk_position(struct nvdimm_drvdata *ndd,
 				 struct nd_namespace_label *nd_label,
 				 bool first)
 {
-	if (!namespace_label_has(ndd, type_guid)) {
+	if (!efi_namespace_label_has(ndd, type_guid)) {
 		nsl_set_position(ndd, nd_label, 0);
 		return;
 	}
@@ -1390,5 +1452,13 @@ int __init nd_label_init(void)
 	WARN_ON(guid_parse(NVDIMM_PFN_GUID, &nvdimm_pfn_guid));
 	WARN_ON(guid_parse(NVDIMM_DAX_GUID, &nvdimm_dax_guid));
 
+	WARN_ON(uuid_parse(NVDIMM_BTT_GUID, &nvdimm_btt_uuid));
+	WARN_ON(uuid_parse(NVDIMM_BTT2_GUID, &nvdimm_btt2_uuid));
+	WARN_ON(uuid_parse(NVDIMM_PFN_GUID, &nvdimm_pfn_uuid));
+	WARN_ON(uuid_parse(NVDIMM_DAX_GUID, &nvdimm_dax_uuid));
+
+	WARN_ON(uuid_parse(CXL_REGION_UUID, &cxl_region_uuid));
+	WARN_ON(uuid_parse(CXL_NAMESPACE_UUID, &cxl_namespace_uuid));
+
 	return 0;
 }
diff --git a/drivers/nvdimm/label.h b/drivers/nvdimm/label.h
index 0519aacc2926..8ee248fc214f 100644
--- a/drivers/nvdimm/label.h
+++ b/drivers/nvdimm/label.h
@@ -99,7 +99,7 @@ struct cxl_region_label {
 };
 
 /**
- * struct nd_namespace_label - namespace superblock
+ * struct nvdimm_efi_label - namespace superblock
  * @uuid: UUID per RFC 4122
  * @name: optional name (NULL-terminated)
  * @flags: see NSLABEL_FLAG_*
@@ -117,7 +117,7 @@ struct cxl_region_label {
  * @reserved2: reserved
  * @checksum: fletcher64 sum of this object
  */
-struct nd_namespace_label {
+struct nvdimm_efi_label {
 	u8 uuid[NSLABEL_UUID_LEN];
 	u8 name[NSLABEL_NAME_LEN];
 	__le32 flags;
@@ -130,7 +130,7 @@ struct nd_namespace_label {
 	__le32 slot;
 	/*
 	 * Accessing fields past this point should be gated by a
-	 * namespace_label_has() check.
+	 * efi_namespace_label_has() check.
 	 */
 	u8 align;
 	u8 reserved[3];
@@ -140,11 +140,57 @@ struct nd_namespace_label {
 	__le64 checksum;
 };
 
+/**
+ * struct nvdimm_cxl_label - CXL 2.0 Table 212
+ * @type: uuid identifying this label format (namespace)
+ * @uuid: uuid for the namespace this label describes
+ * @name: friendly name for the namespace
+ * @flags: NSLABEL_FLAG_UPDATING (all other flags reserved)
+ * @nrange: discontiguous namespace support
+ * @position: this label's position in the set
+ * @dpa: start address in device-local capacity for this label
+ * @rawsize: size of this label's contribution to namespace
+ * @slot: slot id of this label in label area
+ * @align: alignment in SZ_256M blocks
+ * @region_uuid: host interleave set identifier
+ * @abstraction_uuid: personality driver for this namespace
+ * @lbasize: address geometry for disk-like personalities
+ * @reserved: reserved
+ * @checksum: fletcher64 sum of this label
+ */
+struct nvdimm_cxl_label {
+	u8 type[NSLABEL_UUID_LEN];
+	u8 uuid[NSLABEL_UUID_LEN];
+	u8 name[NSLABEL_NAME_LEN];
+	__le32 flags;
+	__le16 nrange;
+	__le16 position;
+	__le64 dpa;
+	__le64 rawsize;
+	__le32 slot;
+	__le32 align;
+	u8 region_uuid[16];
+	u8 abstraction_uuid[16];
+	__le16 lbasize;
+	u8 reserved[0x56];
+	__le64 checksum;
+};
+
+struct nd_namespace_label {
+	union {
+		struct nvdimm_cxl_label cxl;
+		struct nvdimm_efi_label efi;
+	};
+};
+
 #define NVDIMM_BTT_GUID "8aed63a2-29a2-4c66-8b12-f05d15d3922a"
 #define NVDIMM_BTT2_GUID "18633bfc-1735-4217-8ac9-17239282d3f8"
 #define NVDIMM_PFN_GUID "266400ba-fb9f-4677-bcb0-968f11d0d225"
 #define NVDIMM_DAX_GUID "97a86d9c-3cdd-4eda-986f-5068b4f80088"
 
+#define CXL_REGION_UUID "529d7c61-da07-47c4-a93f-ecdf2c06f444"
+#define CXL_NAMESPACE_UUID "68bb2c0a-5a77-4937-9f85-3caf41a0f93c"
+
 /**
  * struct nd_label_id - identifier string for dpa allocation
  * @id: "{blk|pmem}-<namespace uuid>"
diff --git a/drivers/nvdimm/nd.h b/drivers/nvdimm/nd.h
index d57f95a48fe1..6f8ce114032d 100644
--- a/drivers/nvdimm/nd.h
+++ b/drivers/nvdimm/nd.h
@@ -30,6 +30,7 @@ struct nvdimm_drvdata {
 	int nslabel_size;
 	struct nd_cmd_get_config_size nsarea;
 	void *data;
+	bool cxl;
 	int ns_current, ns_next;
 	struct resource dpa;
 	struct kref kref;
@@ -38,13 +39,17 @@ struct nvdimm_drvdata {
 static inline const u8 *nsl_ref_name(struct nvdimm_drvdata *ndd,
 				     struct nd_namespace_label *nd_label)
 {
-	return nd_label->name;
+	if (ndd->cxl)
+		return nd_label->cxl.name;
+	return nd_label->efi.name;
 }
 
 static inline u8 *nsl_get_name(struct nvdimm_drvdata *ndd,
 			       struct nd_namespace_label *nd_label, u8 *name)
 {
-	return memcpy(name, nd_label->name, NSLABEL_NAME_LEN);
+	if (ndd->cxl)
+		return memcpy(name, nd_label->cxl.name, NSLABEL_NAME_LEN);
+	return memcpy(name, nd_label->efi.name, NSLABEL_NAME_LEN);
 }
 
 static inline u8 *nsl_set_name(struct nvdimm_drvdata *ndd,
@@ -52,122 +57,168 @@ static inline u8 *nsl_set_name(struct nvdimm_drvdata *ndd,
 {
 	if (!name)
 		return NULL;
-	return memcpy(nd_label->name, name, NSLABEL_NAME_LEN);
+	if (ndd->cxl)
+		return memcpy(nd_label->cxl.name, name, NSLABEL_NAME_LEN);
+	return memcpy(nd_label->efi.name, name, NSLABEL_NAME_LEN);
 }
 
 static inline u32 nsl_get_slot(struct nvdimm_drvdata *ndd,
 			       struct nd_namespace_label *nd_label)
 {
-	return __le32_to_cpu(nd_label->slot);
+	if (ndd->cxl)
+		return __le32_to_cpu(nd_label->cxl.slot);
+	return __le32_to_cpu(nd_label->efi.slot);
 }
 
 static inline void nsl_set_slot(struct nvdimm_drvdata *ndd,
 				struct nd_namespace_label *nd_label, u32 slot)
 {
-	nd_label->slot = __cpu_to_le32(slot);
+	if (ndd->cxl)
+		nd_label->cxl.slot = __cpu_to_le32(slot);
+	else
+		nd_label->efi.slot = __cpu_to_le32(slot);
 }
 
 static inline u64 nsl_get_checksum(struct nvdimm_drvdata *ndd,
 				   struct nd_namespace_label *nd_label)
 {
-	return __le64_to_cpu(nd_label->checksum);
+	if (ndd->cxl)
+		return __le64_to_cpu(nd_label->cxl.checksum);
+	return __le64_to_cpu(nd_label->efi.checksum);
 }
 
 static inline void nsl_set_checksum(struct nvdimm_drvdata *ndd,
 				    struct nd_namespace_label *nd_label,
 				    u64 checksum)
 {
-	nd_label->checksum = __cpu_to_le64(checksum);
+	if (ndd->cxl)
+		nd_label->cxl.checksum = __cpu_to_le64(checksum);
+	else
+		nd_label->efi.checksum = __cpu_to_le64(checksum);
 }
 
 static inline u32 nsl_get_flags(struct nvdimm_drvdata *ndd,
 				struct nd_namespace_label *nd_label)
 {
-	return __le32_to_cpu(nd_label->flags);
+	if (ndd->cxl)
+		return __le32_to_cpu(nd_label->cxl.flags);
+	return __le32_to_cpu(nd_label->efi.flags);
 }
 
 static inline void nsl_set_flags(struct nvdimm_drvdata *ndd,
 				 struct nd_namespace_label *nd_label, u32 flags)
 {
-	nd_label->flags = __cpu_to_le32(flags);
+	if (ndd->cxl)
+		nd_label->cxl.flags = __cpu_to_le32(flags);
+	else
+		nd_label->efi.flags = __cpu_to_le32(flags);
 }
 
 static inline u64 nsl_get_dpa(struct nvdimm_drvdata *ndd,
 			      struct nd_namespace_label *nd_label)
 {
-	return __le64_to_cpu(nd_label->dpa);
+	if (ndd->cxl)
+		return __le64_to_cpu(nd_label->cxl.dpa);
+	return __le64_to_cpu(nd_label->efi.dpa);
 }
 
 static inline void nsl_set_dpa(struct nvdimm_drvdata *ndd,
 			       struct nd_namespace_label *nd_label, u64 dpa)
 {
-	nd_label->dpa = __cpu_to_le64(dpa);
+	if (ndd->cxl)
+		nd_label->cxl.dpa = __cpu_to_le64(dpa);
+	else
+		nd_label->efi.dpa = __cpu_to_le64(dpa);
 }
 
 static inline u64 nsl_get_rawsize(struct nvdimm_drvdata *ndd,
 				  struct nd_namespace_label *nd_label)
 {
-	return __le64_to_cpu(nd_label->rawsize);
+	if (ndd->cxl)
+		return __le64_to_cpu(nd_label->cxl.rawsize);
+	return __le64_to_cpu(nd_label->efi.rawsize);
 }
 
 static inline void nsl_set_rawsize(struct nvdimm_drvdata *ndd,
 				   struct nd_namespace_label *nd_label,
 				   u64 rawsize)
 {
-	nd_label->rawsize = __cpu_to_le64(rawsize);
+	if (ndd->cxl)
+		nd_label->cxl.rawsize = __cpu_to_le64(rawsize);
+	else
+		nd_label->efi.rawsize = __cpu_to_le64(rawsize);
 }
 
 static inline u64 nsl_get_isetcookie(struct nvdimm_drvdata *ndd,
 				     struct nd_namespace_label *nd_label)
 {
-	return __le64_to_cpu(nd_label->isetcookie);
+	/* WARN future refactor attempts that break this assumption */
+	if (dev_WARN_ONCE(ndd->dev, ndd->cxl,
+			  "CXL labels do not use the isetcookie concept\n"))
+		return 0;
+	return __le64_to_cpu(nd_label->efi.isetcookie);
 }
 
 static inline void nsl_set_isetcookie(struct nvdimm_drvdata *ndd,
 				      struct nd_namespace_label *nd_label,
 				      u64 isetcookie)
 {
-	nd_label->isetcookie = __cpu_to_le64(isetcookie);
+	if (!ndd->cxl)
+		nd_label->efi.isetcookie = __cpu_to_le64(isetcookie);
 }
 
 static inline bool nsl_validate_isetcookie(struct nvdimm_drvdata *ndd,
 					   struct nd_namespace_label *nd_label,
 					   u64 cookie)
 {
-	return cookie == __le64_to_cpu(nd_label->isetcookie);
+	/*
+	 * Let the EFI and CXL validation comingle, where fields that
+	 * don't matter to CXL always validate.
+	 */
+	if (ndd->cxl)
+		return true;
+	return cookie == __le64_to_cpu(nd_label->efi.isetcookie);
 }
 
 static inline u16 nsl_get_position(struct nvdimm_drvdata *ndd,
 				   struct nd_namespace_label *nd_label)
 {
-	return __le16_to_cpu(nd_label->position);
+	if (ndd->cxl)
+		return __le16_to_cpu(nd_label->cxl.position);
+	return __le16_to_cpu(nd_label->efi.position);
 }
 
 static inline void nsl_set_position(struct nvdimm_drvdata *ndd,
 				    struct nd_namespace_label *nd_label,
 				    u16 position)
 {
-	nd_label->position = __cpu_to_le16(position);
+	if (ndd->cxl)
+		nd_label->cxl.position = __cpu_to_le16(position);
+	else
+		nd_label->efi.position = __cpu_to_le16(position);
 }
 
-
 static inline u16 nsl_get_nlabel(struct nvdimm_drvdata *ndd,
 				 struct nd_namespace_label *nd_label)
 {
-	return __le16_to_cpu(nd_label->nlabel);
+	if (ndd->cxl)
+		return 0;
+	return __le16_to_cpu(nd_label->efi.nlabel);
 }
 
 static inline void nsl_set_nlabel(struct nvdimm_drvdata *ndd,
 				  struct nd_namespace_label *nd_label,
 				  u16 nlabel)
 {
-	nd_label->nlabel = __cpu_to_le16(nlabel);
+	if (!ndd->cxl)
+		nd_label->efi.nlabel = __cpu_to_le16(nlabel);
 }
 
 static inline u16 nsl_get_nrange(struct nvdimm_drvdata *ndd,
 				 struct nd_namespace_label *nd_label)
 {
-	/* EFI labels do not have an nrange field */
+	if (ndd->cxl)
+		return __le16_to_cpu(nd_label->cxl.nrange);
 	return 1;
 }
 
@@ -175,26 +226,40 @@ static inline void nsl_set_nrange(struct nvdimm_drvdata *ndd,
 				  struct nd_namespace_label *nd_label,
 				  u16 nrange)
 {
+	if (ndd->cxl)
+		nd_label->cxl.nrange = __cpu_to_le16(nrange);
 }
 
 static inline u64 nsl_get_lbasize(struct nvdimm_drvdata *ndd,
 				  struct nd_namespace_label *nd_label)
 {
-	return __le64_to_cpu(nd_label->lbasize);
+	/*
+	 * Yes, for some reason the EFI labels convey a massive 64-bit
+	 * lbasize; that got fixed for CXL.
+	 */
+	if (ndd->cxl)
+		return __le16_to_cpu(nd_label->cxl.lbasize);
+	return __le64_to_cpu(nd_label->efi.lbasize);
 }
 
 static inline void nsl_set_lbasize(struct nvdimm_drvdata *ndd,
 				   struct nd_namespace_label *nd_label,
 				   u64 lbasize)
 {
-	nd_label->lbasize = __cpu_to_le64(lbasize);
+	if (ndd->cxl)
+		nd_label->cxl.lbasize = __cpu_to_le16(lbasize);
+	else
+		nd_label->efi.lbasize = __cpu_to_le64(lbasize);
 }
 
 static inline const uuid_t *nsl_get_uuid(struct nvdimm_drvdata *ndd,
 					 struct nd_namespace_label *nd_label,
 					 uuid_t *uuid)
 {
-	import_uuid(uuid, nd_label->uuid);
+	if (ndd->cxl)
+		import_uuid(uuid, nd_label->cxl.uuid);
+	else
+		import_uuid(uuid, nd_label->efi.uuid);
 	return uuid;
 }
 
@@ -202,7 +267,10 @@ static inline const uuid_t *nsl_set_uuid(struct nvdimm_drvdata *ndd,
 					 struct nd_namespace_label *nd_label,
 					 const uuid_t *uuid)
 {
-	export_uuid(nd_label->uuid, uuid);
+	if (ndd->cxl)
+		export_uuid(nd_label->cxl.uuid, uuid);
+	else
+		export_uuid(nd_label->efi.uuid, uuid);
 	return uuid;
 }
 
@@ -212,14 +280,19 @@ static inline bool nsl_uuid_equal(struct nvdimm_drvdata *ndd,
 {
 	uuid_t tmp;
 
-	import_uuid(&tmp, nd_label->uuid);
+	if (ndd->cxl)
+		import_uuid(&tmp, nd_label->cxl.uuid);
+	else
+		import_uuid(&tmp, nd_label->efi.uuid);
 	return uuid_equal(&tmp, uuid);
 }
 
 static inline const u8 *nsl_uuid_raw(struct nvdimm_drvdata *ndd,
 				     struct nd_namespace_label *nd_label)
 {
-	return nd_label->uuid;
+	if (ndd->cxl)
+		return nd_label->cxl.uuid;
+	return nd_label->efi.uuid;
 }
 
 bool nsl_validate_blk_isetcookie(struct nvdimm_drvdata *ndd,
@@ -278,8 +351,8 @@ static inline struct nd_namespace_index *to_next_namespace_index(
 
 unsigned sizeof_namespace_label(struct nvdimm_drvdata *ndd);
 
-#define namespace_label_has(ndd, field) \
-	(offsetof(struct nd_namespace_label, field) \
+#define efi_namespace_label_has(ndd, field) \
+	(!ndd->cxl && offsetof(struct nvdimm_efi_label, field) \
 		< sizeof_namespace_label(ndd))
 
 #define nd_dbg_dpa(r, d, res, fmt, arg...) \
@@ -359,6 +432,8 @@ static inline bool nsl_validate_nlabel(struct nd_region *nd_region,
 				       struct nvdimm_drvdata *ndd,
 				       struct nd_namespace_label *nd_label)
 {
+	if (ndd->cxl)
+		return true;
 	return nsl_get_nlabel(ndd, nd_label) == nd_region->ndr_mappings;
 }
 

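For reference, a minimal sketch (not part of the patch) of how any
future label accessor would follow the ndd->cxl dispatch pattern
established above. The helper name nsl_get_align() is illustrative,
picking a field present in both label formats:

static inline u32 nsl_get_align(struct nvdimm_drvdata *ndd,
				struct nd_namespace_label *nd_label)
{
	/* CXL labels always define @align */
	if (ndd->cxl)
		return __le32_to_cpu(nd_label->cxl.align);
	/* EFI labels only gained @align in the later (v1.2) format */
	if (efi_namespace_label_has(ndd, align))
		return nd_label->efi.align;
	return 0;
}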


* [PATCH v4 07/21] cxl/pci: Make 'struct cxl_mem' device type generic
  2021-09-09  5:11 [PATCH v4 00/21] cxl_test: Enable CXL Topology and UAPI regression tests Dan Williams
                   ` (5 preceding siblings ...)
  2021-09-09  5:12 ` [PATCH v4 06/21] libnvdimm/labels: Introduce CXL labels Dan Williams
@ 2021-09-09  5:12 ` Dan Williams
  2021-09-09 16:12   ` Ben Widawsky
  2021-09-10  8:43   ` Jonathan Cameron
  2021-09-09  5:12 ` [PATCH v4 08/21] cxl/pci: Clean up cxl_mem_get_partition_info() Dan Williams
                   ` (13 subsequent siblings)
  20 siblings, 2 replies; 78+ messages in thread
From: Dan Williams @ 2021-09-09  5:12 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ben Widawsky, vishal.l.verma, nvdimm, ben.widawsky,
	alison.schofield, vishal.l.verma, ira.weiny, Jonathan.Cameron

In preparation for adding a unit test provider of a cxl_memdev, convert
the 'struct cxl_mem' driver context to carry a generic device rather
than a PCI device.

Note, some dev_dbg() lines needed extra reformatting per clang-format.

This conversion also allows the cxl_mem_create() and
devm_cxl_add_memdev() calling conventions to be simplified. The "host"
for a cxl_memdev must be the same device that the driver used to
allocate @cxlm.
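
To illustrate the payoff (a hypothetical sketch, not part of this
patch): a non-PCI provider, like the planned unit test module, can now
host a memdev from a platform device. cxl_mock_mem_probe(),
cxl_mock_create() and cxl_mock_fops are made-up names:

static int cxl_mock_mem_probe(struct platform_device *pdev)
{
	struct cxl_memdev *cxlmd;
	struct cxl_mem *cxlm;

	/* cxlm->dev is now just &pdev->dev, no PCI assumptions */
	cxlm = cxl_mock_create(&pdev->dev);
	if (IS_ERR(cxlm))
		return PTR_ERR(cxlm);

	cxlmd = devm_cxl_add_memdev(cxlm, &cxl_mock_fops);
	return PTR_ERR_OR_ZERO(cxlmd);
}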

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/memdev.c |    7 ++--
 drivers/cxl/cxlmem.h      |    6 ++--
 drivers/cxl/pci.c         |   75 +++++++++++++++++++++------------------------
 3 files changed, 41 insertions(+), 47 deletions(-)

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index a9c317e32010..331ef7d6c598 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -149,7 +149,6 @@ static void cxl_memdev_unregister(void *_cxlmd)
 static struct cxl_memdev *cxl_memdev_alloc(struct cxl_mem *cxlm,
 					   const struct file_operations *fops)
 {
-	struct pci_dev *pdev = cxlm->pdev;
 	struct cxl_memdev *cxlmd;
 	struct device *dev;
 	struct cdev *cdev;
@@ -166,7 +165,7 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_mem *cxlm,
 
 	dev = &cxlmd->dev;
 	device_initialize(dev);
-	dev->parent = &pdev->dev;
+	dev->parent = cxlm->dev;
 	dev->bus = &cxl_bus_type;
 	dev->devt = MKDEV(cxl_mem_major, cxlmd->id);
 	dev->type = &cxl_memdev_type;
@@ -182,7 +181,7 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_mem *cxlm,
 }
 
 struct cxl_memdev *
-devm_cxl_add_memdev(struct device *host, struct cxl_mem *cxlm,
+devm_cxl_add_memdev(struct cxl_mem *cxlm,
 		    const struct cdevm_file_operations *cdevm_fops)
 {
 	struct cxl_memdev *cxlmd;
@@ -210,7 +209,7 @@ devm_cxl_add_memdev(struct device *host, struct cxl_mem *cxlm,
 	if (rc)
 		goto err;
 
-	rc = devm_add_action_or_reset(host, cxl_memdev_unregister, cxlmd);
+	rc = devm_add_action_or_reset(cxlm->dev, cxl_memdev_unregister, cxlmd);
 	if (rc)
 		return ERR_PTR(rc);
 	return cxlmd;
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 6c0b1e2ea97c..d5334df83fb2 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -63,12 +63,12 @@ static inline struct cxl_memdev *to_cxl_memdev(struct device *dev)
 }
 
 struct cxl_memdev *
-devm_cxl_add_memdev(struct device *host, struct cxl_mem *cxlm,
+devm_cxl_add_memdev(struct cxl_mem *cxlm,
 		    const struct cdevm_file_operations *cdevm_fops);
 
 /**
  * struct cxl_mem - A CXL memory device
- * @pdev: The PCI device associated with this CXL device.
+ * @dev: The device associated with this CXL device.
  * @cxlmd: Logical memory device chardev / interface
  * @regs: Parsed register blocks
  * @payload_size: Size of space for payload
@@ -82,7 +82,7 @@ devm_cxl_add_memdev(struct device *host, struct cxl_mem *cxlm,
  * @ram_range: Volatile memory capacity information.
  */
 struct cxl_mem {
-	struct pci_dev *pdev;
+	struct device *dev;
 	struct cxl_memdev *cxlmd;
 
 	struct cxl_regs regs;
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 8e45aa07d662..c1e1d12e24b6 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -250,7 +250,7 @@ static int cxl_mem_wait_for_doorbell(struct cxl_mem *cxlm)
 		cpu_relax();
 	}
 
-	dev_dbg(&cxlm->pdev->dev, "Doorbell wait took %dms",
+	dev_dbg(cxlm->dev, "Doorbell wait took %dms",
 		jiffies_to_msecs(end) - jiffies_to_msecs(start));
 	return 0;
 }
@@ -268,7 +268,7 @@ static bool cxl_is_security_command(u16 opcode)
 static void cxl_mem_mbox_timeout(struct cxl_mem *cxlm,
 				 struct mbox_cmd *mbox_cmd)
 {
-	struct device *dev = &cxlm->pdev->dev;
+	struct device *dev = cxlm->dev;
 
 	dev_dbg(dev, "Mailbox command (opcode: %#x size: %zub) timed out\n",
 		mbox_cmd->opcode, mbox_cmd->size_in);
@@ -300,6 +300,7 @@ static int __cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm,
 				   struct mbox_cmd *mbox_cmd)
 {
 	void __iomem *payload = cxlm->regs.mbox + CXLDEV_MBOX_PAYLOAD_OFFSET;
+	struct device *dev = cxlm->dev;
 	u64 cmd_reg, status_reg;
 	size_t out_len;
 	int rc;
@@ -325,8 +326,7 @@ static int __cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm,
 
 	/* #1 */
 	if (cxl_doorbell_busy(cxlm)) {
-		dev_err_ratelimited(&cxlm->pdev->dev,
-				    "Mailbox re-busy after acquiring\n");
+		dev_err_ratelimited(dev, "Mailbox re-busy after acquiring\n");
 		return -EBUSY;
 	}
 
@@ -345,7 +345,7 @@ static int __cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm,
 	writeq(cmd_reg, cxlm->regs.mbox + CXLDEV_MBOX_CMD_OFFSET);
 
 	/* #4 */
-	dev_dbg(&cxlm->pdev->dev, "Sending command\n");
+	dev_dbg(dev, "Sending command\n");
 	writel(CXLDEV_MBOX_CTRL_DOORBELL,
 	       cxlm->regs.mbox + CXLDEV_MBOX_CTRL_OFFSET);
 
@@ -362,7 +362,7 @@ static int __cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm,
 		FIELD_GET(CXLDEV_MBOX_STATUS_RET_CODE_MASK, status_reg);
 
 	if (mbox_cmd->return_code != 0) {
-		dev_dbg(&cxlm->pdev->dev, "Mailbox operation had an error\n");
+		dev_dbg(dev, "Mailbox operation had an error\n");
 		return 0;
 	}
 
@@ -399,7 +399,7 @@ static int __cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm,
  */
 static int cxl_mem_mbox_get(struct cxl_mem *cxlm)
 {
-	struct device *dev = &cxlm->pdev->dev;
+	struct device *dev = cxlm->dev;
 	u64 md_status;
 	int rc;
 
@@ -502,7 +502,7 @@ static int handle_mailbox_cmd_from_user(struct cxl_mem *cxlm,
 					u64 in_payload, u64 out_payload,
 					s32 *size_out, u32 *retval)
 {
-	struct device *dev = &cxlm->pdev->dev;
+	struct device *dev = cxlm->dev;
 	struct mbox_cmd mbox_cmd = {
 		.opcode = cmd->opcode,
 		.size_in = cmd->info.size_in,
@@ -925,20 +925,19 @@ static int cxl_mem_setup_mailbox(struct cxl_mem *cxlm)
 	 */
 	cxlm->payload_size = min_t(size_t, cxlm->payload_size, SZ_1M);
 	if (cxlm->payload_size < 256) {
-		dev_err(&cxlm->pdev->dev, "Mailbox is too small (%zub)",
+		dev_err(cxlm->dev, "Mailbox is too small (%zub)",
 			cxlm->payload_size);
 		return -ENXIO;
 	}
 
-	dev_dbg(&cxlm->pdev->dev, "Mailbox payload sized %zu",
+	dev_dbg(cxlm->dev, "Mailbox payload sized %zu",
 		cxlm->payload_size);
 
 	return 0;
 }
 
-static struct cxl_mem *cxl_mem_create(struct pci_dev *pdev)
+static struct cxl_mem *cxl_mem_create(struct device *dev)
 {
-	struct device *dev = &pdev->dev;
 	struct cxl_mem *cxlm;
 
 	cxlm = devm_kzalloc(dev, sizeof(*cxlm), GFP_KERNEL);
@@ -948,7 +947,7 @@ static struct cxl_mem *cxl_mem_create(struct pci_dev *pdev)
 	}
 
 	mutex_init(&cxlm->mbox_mutex);
-	cxlm->pdev = pdev;
+	cxlm->dev = dev;
 	cxlm->enabled_cmds =
 		devm_kmalloc_array(dev, BITS_TO_LONGS(cxl_cmd_count),
 				   sizeof(unsigned long),
@@ -964,9 +963,9 @@ static struct cxl_mem *cxl_mem_create(struct pci_dev *pdev)
 static void __iomem *cxl_mem_map_regblock(struct cxl_mem *cxlm,
 					  u8 bar, u64 offset)
 {
-	struct pci_dev *pdev = cxlm->pdev;
-	struct device *dev = &pdev->dev;
 	void __iomem *addr;
+	struct device *dev = cxlm->dev;
+	struct pci_dev *pdev = to_pci_dev(dev);
 
 	/* Basic sanity check that BAR is big enough */
 	if (pci_resource_len(pdev, bar) < offset) {
@@ -989,7 +988,7 @@ static void __iomem *cxl_mem_map_regblock(struct cxl_mem *cxlm,
 
 static void cxl_mem_unmap_regblock(struct cxl_mem *cxlm, void __iomem *base)
 {
-	pci_iounmap(cxlm->pdev, base);
+	pci_iounmap(to_pci_dev(cxlm->dev), base);
 }
 
 static int cxl_mem_dvsec(struct pci_dev *pdev, int dvsec)
@@ -1018,10 +1017,9 @@ static int cxl_mem_dvsec(struct pci_dev *pdev, int dvsec)
 static int cxl_probe_regs(struct cxl_mem *cxlm, void __iomem *base,
 			  struct cxl_register_map *map)
 {
-	struct pci_dev *pdev = cxlm->pdev;
-	struct device *dev = &pdev->dev;
 	struct cxl_component_reg_map *comp_map;
 	struct cxl_device_reg_map *dev_map;
+	struct device *dev = cxlm->dev;
 
 	switch (map->reg_type) {
 	case CXL_REGLOC_RBI_COMPONENT:
@@ -1057,8 +1055,8 @@ static int cxl_probe_regs(struct cxl_mem *cxlm, void __iomem *base,
 
 static int cxl_map_regs(struct cxl_mem *cxlm, struct cxl_register_map *map)
 {
-	struct pci_dev *pdev = cxlm->pdev;
-	struct device *dev = &pdev->dev;
+	struct device *dev = cxlm->dev;
+	struct pci_dev *pdev = to_pci_dev(dev);
 
 	switch (map->reg_type) {
 	case CXL_REGLOC_RBI_COMPONENT:
@@ -1096,13 +1094,12 @@ static void cxl_decode_register_block(u32 reg_lo, u32 reg_hi,
  */
 static int cxl_mem_setup_regs(struct cxl_mem *cxlm)
 {
-	struct pci_dev *pdev = cxlm->pdev;
-	struct device *dev = &pdev->dev;
-	u32 regloc_size, regblocks;
 	void __iomem *base;
-	int regloc, i, n_maps;
+	u32 regloc_size, regblocks;
+	int regloc, i, n_maps, ret = 0;
+	struct device *dev = cxlm->dev;
+	struct pci_dev *pdev = to_pci_dev(dev);
 	struct cxl_register_map *map, maps[CXL_REGLOC_RBI_TYPES];
-	int ret = 0;
 
 	regloc = cxl_mem_dvsec(pdev, PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID);
 	if (!regloc) {
@@ -1226,7 +1223,7 @@ static void cxl_walk_cel(struct cxl_mem *cxlm, size_t size, u8 *cel)
 		struct cxl_mem_command *cmd = cxl_mem_find_command(opcode);
 
 		if (!cmd) {
-			dev_dbg(&cxlm->pdev->dev,
+			dev_dbg(cxlm->dev,
 				"Opcode 0x%04x unsupported by driver", opcode);
 			continue;
 		}
@@ -1323,7 +1320,7 @@ static int cxl_mem_get_partition_info(struct cxl_mem *cxlm,
 static int cxl_mem_enumerate_cmds(struct cxl_mem *cxlm)
 {
 	struct cxl_mbox_get_supported_logs *gsl;
-	struct device *dev = &cxlm->pdev->dev;
+	struct device *dev = cxlm->dev;
 	struct cxl_mem_command *cmd;
 	int i, rc;
 
@@ -1418,15 +1415,14 @@ static int cxl_mem_identify(struct cxl_mem *cxlm)
 	cxlm->partition_align_bytes = le64_to_cpu(id.partition_align);
 	cxlm->partition_align_bytes *= CXL_CAPACITY_MULTIPLIER;
 
-	dev_dbg(&cxlm->pdev->dev, "Identify Memory Device\n"
+	dev_dbg(cxlm->dev,
+		"Identify Memory Device\n"
 		"     total_bytes = %#llx\n"
 		"     volatile_only_bytes = %#llx\n"
 		"     persistent_only_bytes = %#llx\n"
 		"     partition_align_bytes = %#llx\n",
-			cxlm->total_bytes,
-			cxlm->volatile_only_bytes,
-			cxlm->persistent_only_bytes,
-			cxlm->partition_align_bytes);
+		cxlm->total_bytes, cxlm->volatile_only_bytes,
+		cxlm->persistent_only_bytes, cxlm->partition_align_bytes);
 
 	cxlm->lsa_size = le32_to_cpu(id.lsa_size);
 	memcpy(cxlm->firmware_version, id.fw_revision, sizeof(id.fw_revision));
@@ -1453,19 +1449,18 @@ static int cxl_mem_create_range_info(struct cxl_mem *cxlm)
 					&cxlm->next_volatile_bytes,
 					&cxlm->next_persistent_bytes);
 	if (rc < 0) {
-		dev_err(&cxlm->pdev->dev, "Failed to query partition information\n");
+		dev_err(cxlm->dev, "Failed to query partition information\n");
 		return rc;
 	}
 
-	dev_dbg(&cxlm->pdev->dev, "Get Partition Info\n"
+	dev_dbg(cxlm->dev,
+		"Get Partition Info\n"
 		"     active_volatile_bytes = %#llx\n"
 		"     active_persistent_bytes = %#llx\n"
 		"     next_volatile_bytes = %#llx\n"
 		"     next_persistent_bytes = %#llx\n",
-			cxlm->active_volatile_bytes,
-			cxlm->active_persistent_bytes,
-			cxlm->next_volatile_bytes,
-			cxlm->next_persistent_bytes);
+		cxlm->active_volatile_bytes, cxlm->active_persistent_bytes,
+		cxlm->next_volatile_bytes, cxlm->next_persistent_bytes);
 
 	cxlm->ram_range.start = 0;
 	cxlm->ram_range.end = cxlm->active_volatile_bytes - 1;
@@ -1487,7 +1482,7 @@ static int cxl_mem_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (rc)
 		return rc;
 
-	cxlm = cxl_mem_create(pdev);
+	cxlm = cxl_mem_create(&pdev->dev);
 	if (IS_ERR(cxlm))
 		return PTR_ERR(cxlm);
 
@@ -1511,7 +1506,7 @@ static int cxl_mem_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (rc)
 		return rc;
 
-	cxlmd = devm_cxl_add_memdev(&pdev->dev, cxlm, &cxl_memdev_fops);
+	cxlmd = devm_cxl_add_memdev(cxlm, &cxl_memdev_fops);
 	if (IS_ERR(cxlmd))
 		return PTR_ERR(cxlmd);
 



* [PATCH v4 08/21] cxl/pci: Clean up cxl_mem_get_partition_info()
  2021-09-09  5:11 [PATCH v4 00/21] cxl_test: Enable CXL Topology and UAPI regression tests Dan Williams
                   ` (6 preceding siblings ...)
  2021-09-09  5:12 ` [PATCH v4 07/21] cxl/pci: Make 'struct cxl_mem' device type generic Dan Williams
@ 2021-09-09  5:12 ` Dan Williams
  2021-09-09 16:20   ` Ben Widawsky
                     ` (2 more replies)
  2021-09-09  5:12 ` [PATCH v4 09/21] cxl/mbox: Introduce the mbox_send operation Dan Williams
                   ` (12 subsequent siblings)
  20 siblings, 3 replies; 78+ messages in thread
From: Dan Williams @ 2021-09-09  5:12 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ira Weiny, Ben Widawsky, vishal.l.verma, nvdimm, ben.widawsky,
	alison.schofield, vishal.l.verma, ira.weiny, Jonathan.Cameron

Commit 0b9159d0ff21 ("cxl/pci: Store memory capacity values") missed
updating the kernel-doc for 'struct cxl_mem' leading to the following
warnings:

./scripts/kernel-doc -v drivers/cxl/cxlmem.h 2>&1 | grep warn
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'total_bytes' not described in 'cxl_mem'
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'volatile_only_bytes' not described in 'cxl_mem'
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'persistent_only_bytes' not described in 'cxl_mem'
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'partition_align_bytes' not described in 'cxl_mem'
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'active_volatile_bytes' not described in 'cxl_mem'
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'active_persistent_bytes' not described in 'cxl_mem'
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'next_volatile_bytes' not described in 'cxl_mem'
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'next_persistent_bytes' not described in 'cxl_mem'

Also, it is redundant to describe those same parameters in the
kernel-doc for cxl_mem_get_partition_info(). Given that the only user
of that routine copies the values into @cxlm, update @cxlm directly
inside the helper instead of returning them through output parameters.
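
For reference, the partition capacity fields are specified in multiples
of 256MB (CXL 2.0 Section 8.2.9.5), so the conversion the helper now
performs internally amounts to (illustrative value only):

	/* illustrative: device reports 4 units of active volatile capacity */
	u64 raw = le64_to_cpu(pi.active_volatile_cap);	/* raw == 4 */

	cxlm->active_volatile_bytes = raw * CXL_CAPACITY_MULTIPLIER;
	/* 4 * SZ_256M == 1GB */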

Cc: Ira Weiny <ira.weiny@intel.com>
Reported-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/cxlmem.h |   15 +++++++++++++--
 drivers/cxl/pci.c    |   35 +++++++++++------------------------
 2 files changed, 24 insertions(+), 26 deletions(-)

diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index d5334df83fb2..c6fce966084a 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -78,8 +78,19 @@ devm_cxl_add_memdev(struct cxl_mem *cxlm,
  * @mbox_mutex: Mutex to synchronize mailbox access.
  * @firmware_version: Firmware version for the memory device.
  * @enabled_cmds: Hardware commands found enabled in CEL.
- * @pmem_range: Persistent memory capacity information.
- * @ram_range: Volatile memory capacity information.
+ * @pmem_range: Active Persistent memory capacity configuration
+ * @ram_range: Active Volatile memory capacity configuration
+ * @total_bytes: sum of all possible capacities
+ * @volatile_only_bytes: hard volatile capacity
+ * @persistent_only_bytes: hard persistent capacity
+ * @partition_align_bytes: soft setting for configurable capacity
+ * @active_volatile_bytes: sum of hard + soft volatile
+ * @active_persistent_bytes: sum of hard + soft persistent
+ * @next_volatile_bytes: volatile capacity change pending device reset
+ * @next_persistent_bytes: persistent capacity change pending device reset
+ *
+ * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
+ * details on capacity parameters.
  */
 struct cxl_mem {
 	struct device *dev;
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index c1e1d12e24b6..8077d907e7d3 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -1262,11 +1262,7 @@ static struct cxl_mbox_get_supported_logs *cxl_get_gsl(struct cxl_mem *cxlm)
 
 /**
  * cxl_mem_get_partition_info - Get partition info
- * @cxlm: The device to act on
- * @active_volatile_bytes: returned active volatile capacity
- * @active_persistent_bytes: returned active persistent capacity
- * @next_volatile_bytes: return next volatile capacity
- * @next_persistent_bytes: return next persistent capacity
+ * @cxlm: cxl_mem instance to update partition info
  *
  * Retrieve the current partition info for the device specified.  If not 0, the
  * 'next' values are pending and take effect on next cold reset.
@@ -1275,11 +1271,7 @@ static struct cxl_mbox_get_supported_logs *cxl_get_gsl(struct cxl_mem *cxlm)
  *
  * See CXL @8.2.9.5.2.1 Get Partition Info
  */
-static int cxl_mem_get_partition_info(struct cxl_mem *cxlm,
-				      u64 *active_volatile_bytes,
-				      u64 *active_persistent_bytes,
-				      u64 *next_volatile_bytes,
-				      u64 *next_persistent_bytes)
+static int cxl_mem_get_partition_info(struct cxl_mem *cxlm)
 {
 	struct cxl_mbox_get_partition_info {
 		__le64 active_volatile_cap;
@@ -1294,15 +1286,14 @@ static int cxl_mem_get_partition_info(struct cxl_mem *cxlm,
 	if (rc)
 		return rc;
 
-	*active_volatile_bytes = le64_to_cpu(pi.active_volatile_cap);
-	*active_persistent_bytes = le64_to_cpu(pi.active_persistent_cap);
-	*next_volatile_bytes = le64_to_cpu(pi.next_volatile_cap);
-	*next_persistent_bytes = le64_to_cpu(pi.next_volatile_cap);
-
-	*active_volatile_bytes *= CXL_CAPACITY_MULTIPLIER;
-	*active_persistent_bytes *= CXL_CAPACITY_MULTIPLIER;
-	*next_volatile_bytes *= CXL_CAPACITY_MULTIPLIER;
-	*next_persistent_bytes *= CXL_CAPACITY_MULTIPLIER;
+	cxlm->active_volatile_bytes =
+		le64_to_cpu(pi.active_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
+	cxlm->active_persistent_bytes =
+		le64_to_cpu(pi.active_persistent_cap) * CXL_CAPACITY_MULTIPLIER;
+	cxlm->next_volatile_bytes =
+		le64_to_cpu(pi.next_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
+	cxlm->next_persistent_bytes =
+		le64_to_cpu(pi.next_persistent_cap) * CXL_CAPACITY_MULTIPLIER;
 
 	return 0;
 }
@@ -1443,11 +1434,7 @@ static int cxl_mem_create_range_info(struct cxl_mem *cxlm)
 		return 0;
 	}
 
-	rc = cxl_mem_get_partition_info(cxlm,
-					&cxlm->active_volatile_bytes,
-					&cxlm->active_persistent_bytes,
-					&cxlm->next_volatile_bytes,
-					&cxlm->next_persistent_bytes);
+	rc = cxl_mem_get_partition_info(cxlm);
 	if (rc < 0) {
 		dev_err(cxlm->dev, "Failed to query partition information\n");
 		return rc;



* [PATCH v4 09/21] cxl/mbox: Introduce the mbox_send operation
  2021-09-09  5:11 [PATCH v4 00/21] cxl_test: Enable CXL Topology and UAPI regression tests Dan Williams
                   ` (7 preceding siblings ...)
  2021-09-09  5:12 ` [PATCH v4 08/21] cxl/pci: Clean up cxl_mem_get_partition_info() Dan Williams
@ 2021-09-09  5:12 ` Dan Williams
  2021-09-09 16:34   ` Ben Widawsky
  2021-09-10  8:58   ` Jonathan Cameron
  2021-09-09  5:12 ` [PATCH v4 10/21] cxl/pci: Drop idr.h Dan Williams
                   ` (11 subsequent siblings)
  20 siblings, 2 replies; 78+ messages in thread
From: Dan Williams @ 2021-09-09  5:12 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ben Widawsky, vishal.l.verma, nvdimm, ben.widawsky,
	alison.schofield, vishal.l.verma, ira.weiny, Jonathan.Cameron

In preparation for implementing a unit test backend transport for ioctl
operations, and making the mailbox available to the cxl/pmem
infrastructure, move the existing PCI-specific portion of mailbox handling
to an "mbox_send" operation.

With this split, all the PCI-specific transport details are
comprehended by a single operation, and the rest of the mailbox
infrastructure is 'struct cxl_mem' and 'struct cxl_memdev' generic.
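
Here is a hypothetical sketch (not part of this patch) of what an
alternate transport looks like once the operation is indirected;
mock_handle_command() is a made-up stand-in for a software payload
emulator:

static int mock_mbox_send(struct cxl_mem *cxlm, struct cxl_mbox_cmd *cmd)
{
	/* no PCI registers: emulate the device entirely in software */
	return mock_handle_command(cxlm, cmd);
}

static int mock_setup_mailbox(struct cxl_mem *cxlm)
{
	cxlm->mbox_send = mock_mbox_send;
	cxlm->payload_size = SZ_4K;
	return 0;
}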

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/cxlmem.h |   42 ++++++++++++++++++++++++++++
 drivers/cxl/pci.c    |   76 ++++++++++++++------------------------------------
 2 files changed, 63 insertions(+), 55 deletions(-)

diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index c6fce966084a..9be5e26c5b48 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -66,6 +66,45 @@ struct cxl_memdev *
 devm_cxl_add_memdev(struct cxl_mem *cxlm,
 		    const struct cdevm_file_operations *cdevm_fops);
 
+/**
+ * struct cxl_mbox_cmd - A command to be submitted to hardware.
+ * @opcode: (input) The command set and command submitted to hardware.
+ * @payload_in: (input) Pointer to the input payload.
+ * @payload_out: (output) Pointer to the output payload. Must be allocated by
+ *		 the caller.
+ * @size_in: (input) Number of bytes to load from @payload_in.
+ * @size_out: (input) Max number of bytes loaded into @payload_out.
+ *            (output) Number of bytes generated by the device. For fixed size
+ *            outputs commands this is always expected to be deterministic. For
+ *            variable sized output commands, it tells the exact number of bytes
+ *            written.
+ * @return_code: (output) Error code returned from hardware.
+ *
+ * This is the primary mechanism used to send commands to the hardware.
+ * All the fields except @payload_* correspond exactly to the fields described in
+ * Command Register section of the CXL 2.0 8.2.8.4.5. @payload_in and
+ * @payload_out are written to, and read from the Command Payload Registers
+ * defined in CXL 2.0 8.2.8.4.8.
+ */
+struct cxl_mbox_cmd {
+	u16 opcode;
+	void *payload_in;
+	void *payload_out;
+	size_t size_in;
+	size_t size_out;
+	u16 return_code;
+#define CXL_MBOX_SUCCESS 0
+};
+
+/*
+ * CXL 2.0 - Memory capacity multiplier
+ * See Section 8.2.9.5
+ *
+ * Volatile, Persistent, and Partition capacities are specified to be in
+ * multiples of 256MB - define a multiplier to convert to/from bytes.
+ */
+#define CXL_CAPACITY_MULTIPLIER SZ_256M
+
 /**
  * struct cxl_mem - A CXL memory device
  * @dev: The device associated with this CXL device.
@@ -88,6 +127,7 @@ devm_cxl_add_memdev(struct cxl_mem *cxlm,
  * @active_persistent_bytes: sum of hard + soft persistent
  * @next_volatile_bytes: volatile capacity change pending device reset
  * @next_persistent_bytes: persistent capacity change pending device reset
+ * @mbox_send: @dev specific transport for transmitting mailbox commands
  *
  * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
  * details on capacity parameters.
@@ -115,5 +155,7 @@ struct cxl_mem {
 	u64 active_persistent_bytes;
 	u64 next_volatile_bytes;
 	u64 next_persistent_bytes;
+
+	int (*mbox_send)(struct cxl_mem *cxlm, struct cxl_mbox_cmd *cmd);
 };
 #endif /* __CXL_MEM_H__ */
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 8077d907e7d3..e2f27671c6b2 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -64,45 +64,6 @@ enum opcode {
 	CXL_MBOX_OP_MAX			= 0x10000
 };
 
-/*
- * CXL 2.0 - Memory capacity multiplier
- * See Section 8.2.9.5
- *
- * Volatile, Persistent, and Partition capacities are specified to be in
- * multiples of 256MB - define a multiplier to convert to/from bytes.
- */
-#define CXL_CAPACITY_MULTIPLIER SZ_256M
-
-/**
- * struct mbox_cmd - A command to be submitted to hardware.
- * @opcode: (input) The command set and command submitted to hardware.
- * @payload_in: (input) Pointer to the input payload.
- * @payload_out: (output) Pointer to the output payload. Must be allocated by
- *		 the caller.
- * @size_in: (input) Number of bytes to load from @payload_in.
- * @size_out: (input) Max number of bytes loaded into @payload_out.
- *            (output) Number of bytes generated by the device. For fixed size
- *            outputs commands this is always expected to be deterministic. For
- *            variable sized output commands, it tells the exact number of bytes
- *            written.
- * @return_code: (output) Error code returned from hardware.
- *
- * This is the primary mechanism used to send commands to the hardware.
- * All the fields except @payload_* correspond exactly to the fields described in
- * Command Register section of the CXL 2.0 8.2.8.4.5. @payload_in and
- * @payload_out are written to, and read from the Command Payload Registers
- * defined in CXL 2.0 8.2.8.4.8.
- */
-struct mbox_cmd {
-	u16 opcode;
-	void *payload_in;
-	void *payload_out;
-	size_t size_in;
-	size_t size_out;
-	u16 return_code;
-#define CXL_MBOX_SUCCESS 0
-};
-
 static DECLARE_RWSEM(cxl_memdev_rwsem);
 static struct dentry *cxl_debugfs;
 static bool cxl_raw_allow_all;
@@ -266,7 +227,7 @@ static bool cxl_is_security_command(u16 opcode)
 }
 
 static void cxl_mem_mbox_timeout(struct cxl_mem *cxlm,
-				 struct mbox_cmd *mbox_cmd)
+				 struct cxl_mbox_cmd *mbox_cmd)
 {
 	struct device *dev = cxlm->dev;
 
@@ -297,7 +258,7 @@ static void cxl_mem_mbox_timeout(struct cxl_mem *cxlm,
  * mailbox.
  */
 static int __cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm,
-				   struct mbox_cmd *mbox_cmd)
+				   struct cxl_mbox_cmd *mbox_cmd)
 {
 	void __iomem *payload = cxlm->regs.mbox + CXLDEV_MBOX_PAYLOAD_OFFSET;
 	struct device *dev = cxlm->dev;
@@ -472,6 +433,20 @@ static void cxl_mem_mbox_put(struct cxl_mem *cxlm)
 	mutex_unlock(&cxlm->mbox_mutex);
 }
 
+static int cxl_pci_mbox_send(struct cxl_mem *cxlm, struct cxl_mbox_cmd *cmd)
+{
+	int rc;
+
+	rc = cxl_mem_mbox_get(cxlm);
+	if (rc)
+		return rc;
+
+	rc = __cxl_mem_mbox_send_cmd(cxlm, cmd);
+	cxl_mem_mbox_put(cxlm);
+
+	return rc;
+}
+
 /**
  * handle_mailbox_cmd_from_user() - Dispatch a mailbox command for userspace.
  * @cxlm: The CXL memory device to communicate with.
@@ -503,7 +478,7 @@ static int handle_mailbox_cmd_from_user(struct cxl_mem *cxlm,
 					s32 *size_out, u32 *retval)
 {
 	struct device *dev = cxlm->dev;
-	struct mbox_cmd mbox_cmd = {
+	struct cxl_mbox_cmd mbox_cmd = {
 		.opcode = cmd->opcode,
 		.size_in = cmd->info.size_in,
 		.size_out = cmd->info.size_out,
@@ -525,10 +500,6 @@ static int handle_mailbox_cmd_from_user(struct cxl_mem *cxlm,
 		}
 	}
 
-	rc = cxl_mem_mbox_get(cxlm);
-	if (rc)
-		goto out;
-
 	dev_dbg(dev,
 		"Submitting %s command for user\n"
 		"\topcode: %x\n"
@@ -539,8 +510,7 @@ static int handle_mailbox_cmd_from_user(struct cxl_mem *cxlm,
 	dev_WARN_ONCE(dev, cmd->info.id == CXL_MEM_COMMAND_ID_RAW,
 		      "raw command path used\n");
 
-	rc = __cxl_mem_mbox_send_cmd(cxlm, &mbox_cmd);
-	cxl_mem_mbox_put(cxlm);
+	rc = cxlm->mbox_send(cxlm, &mbox_cmd);
 	if (rc)
 		goto out;
 
@@ -874,7 +844,7 @@ static int cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm, u16 opcode,
 				 void *out, size_t out_size)
 {
 	const struct cxl_mem_command *cmd = cxl_mem_find_command(opcode);
-	struct mbox_cmd mbox_cmd = {
+	struct cxl_mbox_cmd mbox_cmd = {
 		.opcode = opcode,
 		.payload_in = in,
 		.size_in = in_size,
@@ -886,12 +856,7 @@ static int cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm, u16 opcode,
 	if (out_size > cxlm->payload_size)
 		return -E2BIG;
 
-	rc = cxl_mem_mbox_get(cxlm);
-	if (rc)
-		return rc;
-
-	rc = __cxl_mem_mbox_send_cmd(cxlm, &mbox_cmd);
-	cxl_mem_mbox_put(cxlm);
+	rc = cxlm->mbox_send(cxlm, &mbox_cmd);
 	if (rc)
 		return rc;
 
@@ -913,6 +878,7 @@ static int cxl_mem_setup_mailbox(struct cxl_mem *cxlm)
 {
 	const int cap = readl(cxlm->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
 
+	cxlm->mbox_send = cxl_pci_mbox_send;
 	cxlm->payload_size =
 		1 << FIELD_GET(CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK, cap);
 



* [PATCH v4 10/21] cxl/pci: Drop idr.h
  2021-09-09  5:11 [PATCH v4 00/21] cxl_test: Enable CXL Topology and UAPI regression tests Dan Williams
                   ` (8 preceding siblings ...)
  2021-09-09  5:12 ` [PATCH v4 09/21] cxl/mbox: Introduce the mbox_send operation Dan Williams
@ 2021-09-09  5:12 ` Dan Williams
  2021-09-09 16:34   ` Ben Widawsky
  2021-09-09  5:12 ` [PATCH v4 11/21] cxl/mbox: Move mailbox and other non-PCI specific infrastructure to the core Dan Williams
                   ` (10 subsequent siblings)
  20 siblings, 1 reply; 78+ messages in thread
From: Dan Williams @ 2021-09-09  5:12 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ben Widawsky, Jonathan Cameron, vishal.l.verma, nvdimm,
	ben.widawsky, alison.schofield, vishal.l.verma, ira.weiny,
	Jonathan.Cameron

Commit 3d135db51024 ("cxl/core: Move memdev management to core") left
this straggling include for cxl_memdev setup. Clean it up.

Cc: Ben Widawsky <ben.widawsky@intel.com>
Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/pci.c |    1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index e2f27671c6b2..9d8050fdd69c 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -8,7 +8,6 @@
 #include <linux/mutex.h>
 #include <linux/list.h>
 #include <linux/cdev.h>
-#include <linux/idr.h>
 #include <linux/pci.h>
 #include <linux/io.h>
 #include <linux/io-64-nonatomic-lo-hi.h>



* [PATCH v4 11/21] cxl/mbox: Move mailbox and other non-PCI specific infrastructure to the core
  2021-09-09  5:11 [PATCH v4 00/21] cxl_test: Enable CXL Topology and UAPI regression tests Dan Williams
                   ` (9 preceding siblings ...)
  2021-09-09  5:12 ` [PATCH v4 10/21] cxl/pci: Drop idr.h Dan Williams
@ 2021-09-09  5:12 ` Dan Williams
  2021-09-09 16:41   ` Ben Widawsky
  2021-09-10  9:13   ` Jonathan Cameron
  2021-09-09  5:12 ` [PATCH v4 12/21] cxl/pci: Use module_pci_driver Dan Williams
                   ` (9 subsequent siblings)
  20 siblings, 2 replies; 78+ messages in thread
From: Dan Williams @ 2021-09-09  5:12 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ben Widawsky, kernel test robot, vishal.l.verma, nvdimm,
	ben.widawsky, alison.schofield, vishal.l.verma, ira.weiny,
	Jonathan.Cameron

Now that the internals of mailbox operations are abstracted from the PCI
specifics a bulk of infrastructure can move to the core.

The CXL_PMEM driver intends to proxy LIBNVDIMM UAPI and driver requests
to the equivalent functionality provided by the CXL hardware mailbox
interface. In support of that intent move the mailbox implementation to
a shared location for the CXL_PCI driver native IOCTL path and CXL_PMEM
nvdimm command proxy path to share.

A unit test framework seeks to implement a backend transport for
mailbox commands that communicates mocked-up payloads. It can reuse all
of the mailbox infrastructure minus the PCI specifics, so that also
gets moved to the core.

Finally, with the mailbox infrastructure and ioctl handling being
transport generic, there is no longer any need to pass file_operations
to devm_cxl_add_memdev(). That allows all the ioctl boilerplate to move
into the core for unit test reuse.

No functional change intended, just code movement.
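
To illustrate the intended cxl_pmem consumer (a sketch only, assuming
the CXL 2.0 Get LSA input payload layout; cxl_pmem_read_lsa() is a
made-up name):

static int cxl_pmem_read_lsa(struct cxl_mem *cxlm, u32 offset, u32 len,
			     void *buf)
{
	struct {
		__le32 offset;
		__le32 length;
	} get_lsa = {
		.offset = cpu_to_le32(offset),
		.length = cpu_to_le32(len),
	};

	return cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_GET_LSA, &get_lsa,
				     sizeof(get_lsa), buf, len);
}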

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 Documentation/driver-api/cxl/memory-devices.rst |    3 
 drivers/cxl/core/Makefile                       |    1 
 drivers/cxl/core/bus.c                          |    4 
 drivers/cxl/core/core.h                         |    8 
 drivers/cxl/core/mbox.c                         |  825 +++++++++++++++++++++
 drivers/cxl/core/memdev.c                       |   81 ++
 drivers/cxl/cxlmem.h                            |   78 +-
 drivers/cxl/pci.c                               |  924 -----------------------
 8 files changed, 975 insertions(+), 949 deletions(-)
 create mode 100644 drivers/cxl/core/mbox.c

diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst
index 50ebcda17ad0..4624951b5c38 100644
--- a/Documentation/driver-api/cxl/memory-devices.rst
+++ b/Documentation/driver-api/cxl/memory-devices.rst
@@ -45,6 +45,9 @@ CXL Core
 .. kernel-doc:: drivers/cxl/core/regs.c
    :doc: cxl registers
 
+.. kernel-doc:: drivers/cxl/core/mbox.c
+   :doc: cxl mbox
+
 External Interfaces
 ===================
 
diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
index 0fdbf3c6ac1a..07eb8e1fb8a6 100644
--- a/drivers/cxl/core/Makefile
+++ b/drivers/cxl/core/Makefile
@@ -6,3 +6,4 @@ cxl_core-y := bus.o
 cxl_core-y += pmem.o
 cxl_core-y += regs.o
 cxl_core-y += memdev.o
+cxl_core-y += mbox.o
diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
index 37b87adaa33f..8073354ba232 100644
--- a/drivers/cxl/core/bus.c
+++ b/drivers/cxl/core/bus.c
@@ -636,6 +636,8 @@ static __init int cxl_core_init(void)
 {
 	int rc;
 
+	cxl_mbox_init();
+
 	rc = cxl_memdev_init();
 	if (rc)
 		return rc;
@@ -647,6 +649,7 @@ static __init int cxl_core_init(void)
 
 err:
 	cxl_memdev_exit();
+	cxl_mbox_exit();
 	return rc;
 }
 
@@ -654,6 +657,7 @@ static void cxl_core_exit(void)
 {
 	bus_unregister(&cxl_bus_type);
 	cxl_memdev_exit();
+	cxl_mbox_exit();
 }
 
 module_init(cxl_core_init);
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index 036a3c8106b4..c85b7fbad02d 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -14,7 +14,15 @@ static inline void unregister_cxl_dev(void *dev)
 	device_unregister(dev);
 }
 
+struct cxl_send_command;
+struct cxl_mem_query_commands;
+int cxl_query_cmd(struct cxl_memdev *cxlmd,
+		  struct cxl_mem_query_commands __user *q);
+int cxl_send_cmd(struct cxl_memdev *cxlmd, struct cxl_send_command __user *s);
+
 int cxl_memdev_init(void);
 void cxl_memdev_exit(void);
+void cxl_mbox_init(void);
+void cxl_mbox_exit(void);
 
 #endif /* __CXL_CORE_H__ */
diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
new file mode 100644
index 000000000000..31e183991726
--- /dev/null
+++ b/drivers/cxl/core/mbox.c
@@ -0,0 +1,825 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2020 Intel Corporation. All rights reserved. */
+#include <linux/io-64-nonatomic-lo-hi.h>
+#include <linux/security.h>
+#include <linux/debugfs.h>
+#include <linux/mutex.h>
+#include <cxlmem.h>
+#include <cxl.h>
+
+#include "core.h"
+
+static bool cxl_raw_allow_all;
+
+/**
+ * DOC: cxl mbox
+ *
+ * Core implementation of the CXL 2.0 Type-3 Memory Device Mailbox. The
+ * implementation is used by the cxl_pci driver to initialize the device
+ * and implement the cxl_mem.h IOCTL UAPI. It also implements the
+ * backend of the cxl_pmem_ctl() transport for LIBNVDIMM.
+ */
+
+#define cxl_for_each_cmd(cmd)                                                  \
+	for ((cmd) = &cxl_mem_commands[0];                                     \
+	     ((cmd) - cxl_mem_commands) < ARRAY_SIZE(cxl_mem_commands); (cmd)++)
+
+#define CXL_CMD(_id, sin, sout, _flags)                                        \
+	[CXL_MEM_COMMAND_ID_##_id] = {                                         \
+	.info =	{                                                              \
+			.id = CXL_MEM_COMMAND_ID_##_id,                        \
+			.size_in = sin,                                        \
+			.size_out = sout,                                      \
+		},                                                             \
+	.opcode = CXL_MBOX_OP_##_id,                                           \
+	.flags = _flags,                                                       \
+	}
+
+/*
+ * This table defines the supported mailbox commands for the driver. This table
+ * is made up of a UAPI structure. Non-negative values as parameters in the
+ * table will be validated against the user's input. For example, if size_in is
+ * 0, and the user passed in 1, it is an error.
+ */
+static struct cxl_mem_command cxl_mem_commands[CXL_MEM_COMMAND_ID_MAX] = {
+	CXL_CMD(IDENTIFY, 0, 0x43, CXL_CMD_FLAG_FORCE_ENABLE),
+#ifdef CONFIG_CXL_MEM_RAW_COMMANDS
+	CXL_CMD(RAW, ~0, ~0, 0),
+#endif
+	CXL_CMD(GET_SUPPORTED_LOGS, 0, ~0, CXL_CMD_FLAG_FORCE_ENABLE),
+	CXL_CMD(GET_FW_INFO, 0, 0x50, 0),
+	CXL_CMD(GET_PARTITION_INFO, 0, 0x20, 0),
+	CXL_CMD(GET_LSA, 0x8, ~0, 0),
+	CXL_CMD(GET_HEALTH_INFO, 0, 0x12, 0),
+	CXL_CMD(GET_LOG, 0x18, ~0, CXL_CMD_FLAG_FORCE_ENABLE),
+	CXL_CMD(SET_PARTITION_INFO, 0x0a, 0, 0),
+	CXL_CMD(SET_LSA, ~0, 0, 0),
+	CXL_CMD(GET_ALERT_CONFIG, 0, 0x10, 0),
+	CXL_CMD(SET_ALERT_CONFIG, 0xc, 0, 0),
+	CXL_CMD(GET_SHUTDOWN_STATE, 0, 0x1, 0),
+	CXL_CMD(SET_SHUTDOWN_STATE, 0x1, 0, 0),
+	CXL_CMD(GET_POISON, 0x10, ~0, 0),
+	CXL_CMD(INJECT_POISON, 0x8, 0, 0),
+	CXL_CMD(CLEAR_POISON, 0x48, 0, 0),
+	CXL_CMD(GET_SCAN_MEDIA_CAPS, 0x10, 0x4, 0),
+	CXL_CMD(SCAN_MEDIA, 0x11, 0, 0),
+	CXL_CMD(GET_SCAN_MEDIA, 0, ~0, 0),
+};
+
+/*
+ * Commands that RAW doesn't permit. The rationale for each:
+ *
+ * CXL_MBOX_OP_ACTIVATE_FW: Firmware activation requires adjustment /
+ * coordination of transaction timeout values at the root bridge level.
+ *
+ * CXL_MBOX_OP_SET_PARTITION_INFO: The device memory map may change live
+ * and needs to be coordinated with HDM updates.
+ *
+ * CXL_MBOX_OP_SET_LSA: The label storage area may be cached by the
+ * driver and any writes from userspace invalidates those contents.
+ *
+ * CXL_MBOX_OP_SET_SHUTDOWN_STATE: Set shutdown state assumes no writes
+ * to the device after it is marked clean, userspace can not make that
+ * assertion.
+ *
+ * CXL_MBOX_OP_[GET_]SCAN_MEDIA: The kernel provides a native error list that
+ * is kept up to date with patrol notifications and error management.
+ */
+static u16 cxl_disabled_raw_commands[] = {
+	CXL_MBOX_OP_ACTIVATE_FW,
+	CXL_MBOX_OP_SET_PARTITION_INFO,
+	CXL_MBOX_OP_SET_LSA,
+	CXL_MBOX_OP_SET_SHUTDOWN_STATE,
+	CXL_MBOX_OP_SCAN_MEDIA,
+	CXL_MBOX_OP_GET_SCAN_MEDIA,
+};
+
+/*
+ * Command sets that RAW doesn't permit. All opcodes in this set are
+ * disabled because they pass plain text security payloads over the
+ * user/kernel boundary. This functionality is intended to be wrapped
+ * behind the keys ABI which allows for encrypted payloads in the UAPI
+ */
+static u8 security_command_sets[] = {
+	0x44, /* Sanitize */
+	0x45, /* Persistent Memory Data-at-rest Security */
+	0x46, /* Security Passthrough */
+};
+
+static bool cxl_is_security_command(u16 opcode)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(security_command_sets); i++)
+		if (security_command_sets[i] == (opcode >> 8))
+			return true;
+	return false;
+}
+
+static struct cxl_mem_command *cxl_mem_find_command(u16 opcode)
+{
+	struct cxl_mem_command *c;
+
+	cxl_for_each_cmd(c)
+		if (c->opcode == opcode)
+			return c;
+
+	return NULL;
+}
+
+/**
+ * cxl_mem_mbox_send_cmd() - Send a mailbox command to a memory device.
+ * @cxlm: The CXL memory device to communicate with.
+ * @opcode: Opcode for the mailbox command.
+ * @in: The input payload for the mailbox command.
+ * @in_size: The length of the input payload
+ * @out: Caller allocated buffer for the output.
+ * @out_size: Expected size of output.
+ *
+ * Context: Any context. Will acquire and release mbox_mutex.
+ * Return:
+ *  * %>=0	- Number of bytes returned in @out.
+ *  * %-E2BIG	- Payload is too large for hardware.
+ *  * %-EBUSY	- Couldn't acquire exclusive mailbox access.
+ *  * %-EFAULT	- Hardware error occurred.
+ *  * %-ENXIO	- Command completed, but device reported an error.
+ *  * %-EIO	- Unexpected output size.
+ *
+ * Mailbox commands may execute successfully yet the device itself reported an
+ * error. While this distinction can be useful for commands from userspace, the
+ * kernel will only be able to use results when both are successful.
+ *
+ * See __cxl_mem_mbox_send_cmd()
+ */
+int cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm, u16 opcode, void *in,
+			  size_t in_size, void *out, size_t out_size)
+{
+	const struct cxl_mem_command *cmd = cxl_mem_find_command(opcode);
+	struct cxl_mbox_cmd mbox_cmd = {
+		.opcode = opcode,
+		.payload_in = in,
+		.size_in = in_size,
+		.size_out = out_size,
+		.payload_out = out,
+	};
+	int rc;
+
+	if (out_size > cxlm->payload_size)
+		return -E2BIG;
+
+	rc = cxlm->mbox_send(cxlm, &mbox_cmd);
+	if (rc)
+		return rc;
+
+	/* TODO: Map return code to proper kernel style errno */
+	if (mbox_cmd.return_code != CXL_MBOX_SUCCESS)
+		return -ENXIO;
+
+	/*
+	 * Variable sized commands can't be validated and so it's up to the
+	 * caller to do that if they wish.
+	 */
+	if (cmd->info.size_out >= 0 && mbox_cmd.size_out != out_size)
+		return -EIO;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(cxl_mem_mbox_send_cmd);
+
+static bool cxl_mem_raw_command_allowed(u16 opcode)
+{
+	int i;
+
+	if (!IS_ENABLED(CONFIG_CXL_MEM_RAW_COMMANDS))
+		return false;
+
+	if (security_locked_down(LOCKDOWN_PCI_ACCESS))
+		return false;
+
+	if (cxl_raw_allow_all)
+		return true;
+
+	if (cxl_is_security_command(opcode))
+		return false;
+
+	for (i = 0; i < ARRAY_SIZE(cxl_disabled_raw_commands); i++)
+		if (cxl_disabled_raw_commands[i] == opcode)
+			return false;
+
+	return true;
+}
+
+/**
+ * cxl_validate_cmd_from_user() - Check fields for CXL_MEM_SEND_COMMAND.
+ * @cxlm: &struct cxl_mem device whose mailbox will be used.
+ * @send_cmd: &struct cxl_send_command copied in from userspace.
+ * @out_cmd: Sanitized and populated &struct cxl_mem_command.
+ *
+ * Return:
+ *  * %0	- @out_cmd is ready to send.
+ *  * %-ENOTTY	- Invalid command specified.
+ *  * %-EINVAL	- Reserved fields or invalid values were used.
+ *  * %-ENOMEM	- Input or output buffer wasn't sized properly.
+ *  * %-EPERM	- Attempted to use a protected command.
+ *
+ * The result of this command is a fully validated command in @out_cmd that is
+ * safe to send to the hardware.
+ *
+ * See handle_mailbox_cmd_from_user()
+ */
+static int cxl_validate_cmd_from_user(struct cxl_mem *cxlm,
+				      const struct cxl_send_command *send_cmd,
+				      struct cxl_mem_command *out_cmd)
+{
+	const struct cxl_command_info *info;
+	struct cxl_mem_command *c;
+
+	if (send_cmd->id == 0 || send_cmd->id >= CXL_MEM_COMMAND_ID_MAX)
+		return -ENOTTY;
+
+	/*
+	 * The user can never specify an input payload larger than what hardware
+	 * supports, but output can be arbitrarily large (simply write out as
+	 * much data as the hardware provides).
+	 */
+	if (send_cmd->in.size > cxlm->payload_size)
+		return -EINVAL;
+
+	/*
+	 * Checks are bypassed for raw commands but a WARN/taint will occur
+	 * later in the callchain
+	 */
+	if (send_cmd->id == CXL_MEM_COMMAND_ID_RAW) {
+		const struct cxl_mem_command temp = {
+			.info = {
+				.id = CXL_MEM_COMMAND_ID_RAW,
+				.flags = 0,
+				.size_in = send_cmd->in.size,
+				.size_out = send_cmd->out.size,
+			},
+			.opcode = send_cmd->raw.opcode
+		};
+
+		if (send_cmd->raw.rsvd)
+			return -EINVAL;
+
+		/*
+		 * Unlike supported commands, the output size of RAW commands
+		 * gets passed along without further checking, so it must be
+		 * validated here.
+		 */
+		if (send_cmd->out.size > cxlm->payload_size)
+			return -EINVAL;
+
+		if (!cxl_mem_raw_command_allowed(send_cmd->raw.opcode))
+			return -EPERM;
+
+		memcpy(out_cmd, &temp, sizeof(temp));
+
+		return 0;
+	}
+
+	if (send_cmd->flags & ~CXL_MEM_COMMAND_FLAG_MASK)
+		return -EINVAL;
+
+	if (send_cmd->rsvd)
+		return -EINVAL;
+
+	if (send_cmd->in.rsvd || send_cmd->out.rsvd)
+		return -EINVAL;
+
+	/* Convert user's command into the internal representation */
+	c = &cxl_mem_commands[send_cmd->id];
+	info = &c->info;
+
+	/* Check that the command is enabled for hardware */
+	if (!test_bit(info->id, cxlm->enabled_cmds))
+		return -ENOTTY;
+
+	/* Check the input buffer is the expected size */
+	if (info->size_in >= 0 && info->size_in != send_cmd->in.size)
+		return -ENOMEM;
+
+	/* Check the output buffer is at least large enough */
+	if (info->size_out >= 0 && send_cmd->out.size < info->size_out)
+		return -ENOMEM;
+
+	memcpy(out_cmd, c, sizeof(*c));
+	out_cmd->info.size_in = send_cmd->in.size;
+	/*
+	 * XXX: out_cmd->info.size_out will be controlled by the driver, and the
+	 * specified number of bytes @send_cmd->out.size will be copied back out
+	 * to userspace.
+	 */
+
+	return 0;
+}
+
+#define cxl_cmd_count ARRAY_SIZE(cxl_mem_commands)
+
+int cxl_query_cmd(struct cxl_memdev *cxlmd,
+		  struct cxl_mem_query_commands __user *q)
+{
+	struct device *dev = &cxlmd->dev;
+	struct cxl_mem_command *cmd;
+	u32 n_commands;
+	int j = 0;
+
+	dev_dbg(dev, "Query IOCTL\n");
+
+	if (get_user(n_commands, &q->n_commands))
+		return -EFAULT;
+
+	/* returns the total number if 0 elements are requested. */
+	if (n_commands == 0)
+		return put_user(cxl_cmd_count, &q->n_commands);
+
+	/*
+	 * otherwise, return min(n_commands, total commands) cxl_command_info
+	 * structures.
+	 */
+	cxl_for_each_cmd(cmd) {
+		const struct cxl_command_info *info = &cmd->info;
+
+		if (copy_to_user(&q->commands[j++], info, sizeof(*info)))
+			return -EFAULT;
+
+		if (j == n_commands)
+			break;
+	}
+
+	return 0;
+}
+
+/**
+ * handle_mailbox_cmd_from_user() - Dispatch a mailbox command for userspace.
+ * @cxlm: The CXL memory device to communicate with.
+ * @cmd: The validated command.
+ * @in_payload: Pointer to userspace's input payload.
+ * @out_payload: Pointer to userspace's output payload.
+ * @size_out: (Input) Max payload size to copy out.
+ *            (Output) Payload size hardware generated.
+ * @retval: Hardware generated return code from the operation.
+ *
+ * Return:
+ *  * %0	- Mailbox transaction succeeded. This implies the mailbox
+ *		  protocol completed successfully, not that the operation itself
+ *		  was successful.
+ *  * %-ENOMEM  - Couldn't allocate a bounce buffer.
+ *  * %-EFAULT	- Something happened with copy_to/from_user.
+ *  * %-EINTR	- Mailbox acquisition interrupted.
+ *  * %-EXXX	- Transaction level failures.
+ *
+ * Creates the appropriate mailbox command and dispatches it on behalf of a
+ * userspace request. The input and output payloads are copied between
+ * userspace and the device mailbox.
+ *
+ * See cxl_send_cmd().
+ */
+static int handle_mailbox_cmd_from_user(struct cxl_mem *cxlm,
+					const struct cxl_mem_command *cmd,
+					u64 in_payload, u64 out_payload,
+					s32 *size_out, u32 *retval)
+{
+	struct device *dev = cxlm->dev;
+	struct cxl_mbox_cmd mbox_cmd = {
+		.opcode = cmd->opcode,
+		.size_in = cmd->info.size_in,
+		.size_out = cmd->info.size_out,
+	};
+	int rc;
+
+	if (cmd->info.size_out) {
+		mbox_cmd.payload_out = kvzalloc(cmd->info.size_out, GFP_KERNEL);
+		if (!mbox_cmd.payload_out)
+			return -ENOMEM;
+	}
+
+	if (cmd->info.size_in) {
+		mbox_cmd.payload_in = vmemdup_user(u64_to_user_ptr(in_payload),
+						   cmd->info.size_in);
+		if (IS_ERR(mbox_cmd.payload_in)) {
+			kvfree(mbox_cmd.payload_out);
+			return PTR_ERR(mbox_cmd.payload_in);
+		}
+	}
+
+	dev_dbg(dev,
+		"Submitting %s command for user\n"
+		"\topcode: %x\n"
+		"\tsize: %ub\n",
+		cxl_command_names[cmd->info.id].name, mbox_cmd.opcode,
+		cmd->info.size_in);
+
+	dev_WARN_ONCE(dev, cmd->info.id == CXL_MEM_COMMAND_ID_RAW,
+		      "raw command path used\n");
+
+	rc = cxlm->mbox_send(cxlm, &mbox_cmd);
+	if (rc)
+		goto out;
+
+	/*
+	 * @size_out contains the max size that's allowed to be written back out
+	 * to userspace. While the hardware may have produced more output than
+	 * this, the excess is ignored.
+	 */
+	if (mbox_cmd.size_out) {
+		dev_WARN_ONCE(dev, mbox_cmd.size_out > *size_out,
+			      "Invalid return size\n");
+		if (copy_to_user(u64_to_user_ptr(out_payload),
+				 mbox_cmd.payload_out, mbox_cmd.size_out)) {
+			rc = -EFAULT;
+			goto out;
+		}
+	}
+
+	*size_out = mbox_cmd.size_out;
+	*retval = mbox_cmd.return_code;
+
+out:
+	kvfree(mbox_cmd.payload_in);
+	kvfree(mbox_cmd.payload_out);
+	return rc;
+}
+
+int cxl_send_cmd(struct cxl_memdev *cxlmd, struct cxl_send_command __user *s)
+{
+	struct cxl_mem *cxlm = cxlmd->cxlm;
+	struct device *dev = &cxlmd->dev;
+	struct cxl_send_command send;
+	struct cxl_mem_command c;
+	int rc;
+
+	dev_dbg(dev, "Send IOCTL\n");
+
+	if (copy_from_user(&send, s, sizeof(send)))
+		return -EFAULT;
+
+	rc = cxl_validate_cmd_from_user(cxlmd->cxlm, &send, &c);
+	if (rc)
+		return rc;
+
+	/* Prepare to handle a full payload for variable sized output */
+	if (c.info.size_out < 0)
+		c.info.size_out = cxlm->payload_size;
+
+	rc = handle_mailbox_cmd_from_user(cxlm, &c, send.in.payload,
+					  send.out.payload, &send.out.size,
+					  &send.retval);
+	if (rc)
+		return rc;
+
+	if (copy_to_user(s, &send, sizeof(send)))
+		return -EFAULT;
+
+	return 0;
+}
+
+static int cxl_xfer_log(struct cxl_mem *cxlm, uuid_t *uuid, u32 size, u8 *out)
+{
+	u32 remaining = size;
+	u32 offset = 0;
+
+	while (remaining) {
+		u32 xfer_size = min_t(u32, remaining, cxlm->payload_size);
+		struct cxl_mbox_get_log {
+			uuid_t uuid;
+			__le32 offset;
+			__le32 length;
+		} __packed log = {
+			.uuid = *uuid,
+			.offset = cpu_to_le32(offset),
+			.length = cpu_to_le32(xfer_size)
+		};
+		int rc;
+
+		rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_GET_LOG, &log,
+					   sizeof(log), out, xfer_size);
+		if (rc < 0)
+			return rc;
+
+		out += xfer_size;
+		remaining -= xfer_size;
+		offset += xfer_size;
+	}
+
+	return 0;
+}
+
+/**
+ * cxl_walk_cel() - Walk through the Command Effects Log.
+ * @cxlm: Device.
+ * @size: Length of the Command Effects Log.
+ * @cel: Buffer containing the Command Effects Log entries.
+ *
+ * Iterate over each entry in the CEL and determine if the driver supports the
+ * command. If so, the command is enabled for the device and can be used later.
+ */
+static void cxl_walk_cel(struct cxl_mem *cxlm, size_t size, u8 *cel)
+{
+	struct cel_entry {
+		__le16 opcode;
+		__le16 effect;
+	} __packed * cel_entry;
+	const int cel_entries = size / sizeof(*cel_entry);
+	int i;
+
+	cel_entry = (struct cel_entry *)cel;
+
+	for (i = 0; i < cel_entries; i++) {
+		u16 opcode = le16_to_cpu(cel_entry[i].opcode);
+		struct cxl_mem_command *cmd = cxl_mem_find_command(opcode);
+
+		if (!cmd) {
+			dev_dbg(cxlm->dev,
+				"Opcode 0x%04x unsupported by driver", opcode);
+			continue;
+		}
+
+		set_bit(cmd->info.id, cxlm->enabled_cmds);
+	}
+}
+
+struct cxl_mbox_get_supported_logs {
+	__le16 entries;
+	u8 rsvd[6];
+	struct gsl_entry {
+		uuid_t uuid;
+		__le32 size;
+	} __packed entry[];
+} __packed;
+
+static struct cxl_mbox_get_supported_logs *cxl_get_gsl(struct cxl_mem *cxlm)
+{
+	struct cxl_mbox_get_supported_logs *ret;
+	int rc;
+
+	ret = kvmalloc(cxlm->payload_size, GFP_KERNEL);
+	if (!ret)
+		return ERR_PTR(-ENOMEM);
+
+	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_GET_SUPPORTED_LOGS, NULL,
+				   0, ret, cxlm->payload_size);
+	if (rc < 0) {
+		kvfree(ret);
+		return ERR_PTR(rc);
+	}
+
+	return ret;
+}
+
+enum {
+	CEL_UUID,
+	VENDOR_DEBUG_UUID,
+};
+
+/* See CXL 2.0 Table 170. Get Log Input Payload */
+static const uuid_t log_uuid[] = {
+	[CEL_UUID] = UUID_INIT(0xda9c0b5, 0xbf41, 0x4b78, 0x8f, 0x79, 0x96,
+			       0xb1, 0x62, 0x3b, 0x3f, 0x17),
+	[VENDOR_DEBUG_UUID] = UUID_INIT(0xe1819d9, 0x11a9, 0x400c, 0x81, 0x1f,
+					0xd6, 0x07, 0x19, 0x40, 0x3d, 0x86),
+};
+
+/**
+ * cxl_mem_enumerate_cmds() - Enumerate commands for a device.
+ * @cxlm: The device.
+ *
+ * Returns 0 if enumeration completed successfully.
+ *
+ * CXL devices have optional support for certain commands. This function will
+ * determine the set of supported commands for the hardware and update the
+ * enabled_cmds bitmap in the @cxlm.
+ */
+int cxl_mem_enumerate_cmds(struct cxl_mem *cxlm)
+{
+	struct cxl_mbox_get_supported_logs *gsl;
+	struct device *dev = cxlm->dev;
+	struct cxl_mem_command *cmd;
+	int i, rc;
+
+	gsl = cxl_get_gsl(cxlm);
+	if (IS_ERR(gsl))
+		return PTR_ERR(gsl);
+
+	rc = -ENOENT;
+	for (i = 0; i < le16_to_cpu(gsl->entries); i++) {
+		u32 size = le32_to_cpu(gsl->entry[i].size);
+		uuid_t uuid = gsl->entry[i].uuid;
+		u8 *log;
+
+		dev_dbg(dev, "Found LOG type %pU of size %d", &uuid, size);
+
+		if (!uuid_equal(&uuid, &log_uuid[CEL_UUID]))
+			continue;
+
+		log = kvmalloc(size, GFP_KERNEL);
+		if (!log) {
+			rc = -ENOMEM;
+			goto out;
+		}
+
+		rc = cxl_xfer_log(cxlm, &uuid, size, log);
+		if (rc) {
+			kvfree(log);
+			goto out;
+		}
+
+		cxl_walk_cel(cxlm, size, log);
+		kvfree(log);
+
+		/* In case CEL was bogus, enable some default commands. */
+		cxl_for_each_cmd(cmd)
+			if (cmd->flags & CXL_CMD_FLAG_FORCE_ENABLE)
+				set_bit(cmd->info.id, cxlm->enabled_cmds);
+
+		/* Found the required CEL */
+		rc = 0;
+	}
+
+out:
+	kvfree(gsl);
+	return rc;
+}
+EXPORT_SYMBOL_GPL(cxl_mem_enumerate_cmds);
+
+/**
+ * cxl_mem_get_partition_info() - Get partition info
+ * @cxlm: cxl_mem instance to update partition info
+ *
+ * Retrieve the current partition info for the specified device.  The active
+ * values are the current capacity in bytes.  If not 0, the 'next' values are
+ * the pending values, in bytes, which take effect on the next cold reset.
+ *
+ * Return: 0 if no error; otherwise the result of the mailbox command.
+ *
+ * See CXL @8.2.9.5.2.1 Get Partition Info
+ */
+static int cxl_mem_get_partition_info(struct cxl_mem *cxlm)
+{
+	struct cxl_mbox_get_partition_info {
+		__le64 active_volatile_cap;
+		__le64 active_persistent_cap;
+		__le64 next_volatile_cap;
+		__le64 next_persistent_cap;
+	} __packed pi;
+	int rc;
+
+	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_GET_PARTITION_INFO,
+				   NULL, 0, &pi, sizeof(pi));
+
+	if (rc)
+		return rc;
+
+	cxlm->active_volatile_bytes =
+		le64_to_cpu(pi.active_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
+	cxlm->active_persistent_bytes =
+		le64_to_cpu(pi.active_persistent_cap) * CXL_CAPACITY_MULTIPLIER;
+	cxlm->next_volatile_bytes =
+		le64_to_cpu(pi.next_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
+	cxlm->next_persistent_bytes =
+		le64_to_cpu(pi.next_persistent_cap) * CXL_CAPACITY_MULTIPLIER;
+
+	return 0;
+}
+
+/**
+ * cxl_mem_identify() - Send the IDENTIFY command to the device.
+ * @cxlm: The device to identify.
+ *
+ * Return: 0 if identify was executed successfully.
+ *
+ * This will dispatch the identify command to the device and on success populate
+ * structures to be exported to sysfs.
+ */
+int cxl_mem_identify(struct cxl_mem *cxlm)
+{
+	/* See CXL 2.0 Table 175 Identify Memory Device Output Payload */
+	struct cxl_mbox_identify {
+		char fw_revision[0x10];
+		__le64 total_capacity;
+		__le64 volatile_capacity;
+		__le64 persistent_capacity;
+		__le64 partition_align;
+		__le16 info_event_log_size;
+		__le16 warning_event_log_size;
+		__le16 failure_event_log_size;
+		__le16 fatal_event_log_size;
+		__le32 lsa_size;
+		u8 poison_list_max_mer[3];
+		__le16 inject_poison_limit;
+		u8 poison_caps;
+		u8 qos_telemetry_caps;
+	} __packed id;
+	int rc;
+
+	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_IDENTIFY, NULL, 0, &id,
+				   sizeof(id));
+	if (rc < 0)
+		return rc;
+
+	cxlm->total_bytes =
+		le64_to_cpu(id.total_capacity) * CXL_CAPACITY_MULTIPLIER;
+	cxlm->volatile_only_bytes =
+		le64_to_cpu(id.volatile_capacity) * CXL_CAPACITY_MULTIPLIER;
+	cxlm->persistent_only_bytes =
+		le64_to_cpu(id.persistent_capacity) * CXL_CAPACITY_MULTIPLIER;
+	cxlm->partition_align_bytes =
+		le64_to_cpu(id.partition_align) * CXL_CAPACITY_MULTIPLIER;
+
+	dev_dbg(cxlm->dev,
+		"Identify Memory Device\n"
+		"     total_bytes = %#llx\n"
+		"     volatile_only_bytes = %#llx\n"
+		"     persistent_only_bytes = %#llx\n"
+		"     partition_align_bytes = %#llx\n",
+		cxlm->total_bytes, cxlm->volatile_only_bytes,
+		cxlm->persistent_only_bytes, cxlm->partition_align_bytes);
+
+	cxlm->lsa_size = le32_to_cpu(id.lsa_size);
+	memcpy(cxlm->firmware_version, id.fw_revision, sizeof(id.fw_revision));
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(cxl_mem_identify);
+
+int cxl_mem_create_range_info(struct cxl_mem *cxlm)
+{
+	int rc;
+
+	if (cxlm->partition_align_bytes == 0) {
+		cxlm->ram_range.start = 0;
+		cxlm->ram_range.end = cxlm->volatile_only_bytes - 1;
+		cxlm->pmem_range.start = cxlm->volatile_only_bytes;
+		cxlm->pmem_range.end = cxlm->volatile_only_bytes +
+				       cxlm->persistent_only_bytes - 1;
+		return 0;
+	}
+
+	rc = cxl_mem_get_partition_info(cxlm);
+	if (rc) {
+		dev_err(cxlm->dev, "Failed to query partition information\n");
+		return rc;
+	}
+
+	dev_dbg(cxlm->dev,
+		"Get Partition Info\n"
+		"     active_volatile_bytes = %#llx\n"
+		"     active_persistent_bytes = %#llx\n"
+		"     next_volatile_bytes = %#llx\n"
+		"     next_persistent_bytes = %#llx\n",
+		cxlm->active_volatile_bytes, cxlm->active_persistent_bytes,
+		cxlm->next_volatile_bytes, cxlm->next_persistent_bytes);
+
+	cxlm->ram_range.start = 0;
+	cxlm->ram_range.end = cxlm->active_volatile_bytes - 1;
+
+	cxlm->pmem_range.start = cxlm->active_volatile_bytes;
+	cxlm->pmem_range.end =
+		cxlm->active_volatile_bytes + cxlm->active_persistent_bytes - 1;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(cxl_mem_create_range_info);
+
+struct cxl_mem *cxl_mem_create(struct device *dev)
+{
+	struct cxl_mem *cxlm;
+
+	cxlm = devm_kzalloc(dev, sizeof(*cxlm), GFP_KERNEL);
+	if (!cxlm) {
+		dev_err(dev, "No memory available\n");
+		return ERR_PTR(-ENOMEM);
+	}
+
+	mutex_init(&cxlm->mbox_mutex);
+	cxlm->dev = dev;
+	cxlm->enabled_cmds =
+		devm_kmalloc_array(dev, BITS_TO_LONGS(cxl_cmd_count),
+				   sizeof(unsigned long),
+				   GFP_KERNEL | __GFP_ZERO);
+	if (!cxlm->enabled_cmds) {
+		dev_err(dev, "No memory available for bitmap\n");
+		return ERR_PTR(-ENOMEM);
+	}
+
+	return cxlm;
+}
+EXPORT_SYMBOL_GPL(cxl_mem_create);
+
+static struct dentry *cxl_debugfs;
+
+void __init cxl_mbox_init(void)
+{
+	struct dentry *mbox_debugfs;
+
+	cxl_debugfs = debugfs_create_dir("cxl", NULL);
+	mbox_debugfs = debugfs_create_dir("mbox", cxl_debugfs);
+	debugfs_create_bool("raw_allow_all", 0600, mbox_debugfs,
+			    &cxl_raw_allow_all);
+}
+
+void cxl_mbox_exit(void)
+{
+	debugfs_remove_recursive(cxl_debugfs);
+}
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 331ef7d6c598..df2ba87238c2 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -8,6 +8,8 @@
 #include <cxlmem.h>
 #include "core.h"
 
+static DECLARE_RWSEM(cxl_memdev_rwsem);
+
 /*
  * An entire PCI topology full of devices should be enough for any
  * config
@@ -132,16 +134,21 @@ static const struct device_type cxl_memdev_type = {
 	.groups = cxl_memdev_attribute_groups,
 };
 
+static void cxl_memdev_shutdown(struct device *dev)
+{
+	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
+
+	down_write(&cxl_memdev_rwsem);
+	cxlmd->cxlm = NULL;
+	up_write(&cxl_memdev_rwsem);
+}
+
 static void cxl_memdev_unregister(void *_cxlmd)
 {
 	struct cxl_memdev *cxlmd = _cxlmd;
 	struct device *dev = &cxlmd->dev;
-	struct cdev *cdev = &cxlmd->cdev;
-	const struct cdevm_file_operations *cdevm_fops;
-
-	cdevm_fops = container_of(cdev->ops, typeof(*cdevm_fops), fops);
-	cdevm_fops->shutdown(dev);
 
+	cxl_memdev_shutdown(dev);
 	cdev_device_del(&cxlmd->cdev, dev);
 	put_device(dev);
 }
@@ -180,16 +187,72 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_mem *cxlm,
 	return ERR_PTR(rc);
 }
 
+static long __cxl_memdev_ioctl(struct cxl_memdev *cxlmd, unsigned int cmd,
+			       unsigned long arg)
+{
+	switch (cmd) {
+	case CXL_MEM_QUERY_COMMANDS:
+		return cxl_query_cmd(cxlmd, (void __user *)arg);
+	case CXL_MEM_SEND_COMMAND:
+		return cxl_send_cmd(cxlmd, (void __user *)arg);
+	default:
+		return -ENOTTY;
+	}
+}
+
+static long cxl_memdev_ioctl(struct file *file, unsigned int cmd,
+			     unsigned long arg)
+{
+	struct cxl_memdev *cxlmd = file->private_data;
+	int rc = -ENXIO;
+
+	down_read(&cxl_memdev_rwsem);
+	if (cxlmd->cxlm)
+		rc = __cxl_memdev_ioctl(cxlmd, cmd, arg);
+	up_read(&cxl_memdev_rwsem);
+
+	return rc;
+}
+
+static int cxl_memdev_open(struct inode *inode, struct file *file)
+{
+	struct cxl_memdev *cxlmd =
+		container_of(inode->i_cdev, typeof(*cxlmd), cdev);
+
+	get_device(&cxlmd->dev);
+	file->private_data = cxlmd;
+
+	return 0;
+}
+
+static int cxl_memdev_release_file(struct inode *inode, struct file *file)
+{
+	struct cxl_memdev *cxlmd =
+		container_of(inode->i_cdev, typeof(*cxlmd), cdev);
+
+	put_device(&cxlmd->dev);
+
+	return 0;
+}
+
+static const struct file_operations cxl_memdev_fops = {
+	.owner = THIS_MODULE,
+	.unlocked_ioctl = cxl_memdev_ioctl,
+	.open = cxl_memdev_open,
+	.release = cxl_memdev_release_file,
+	.compat_ioctl = compat_ptr_ioctl,
+	.llseek = noop_llseek,
+};
+
 struct cxl_memdev *
-devm_cxl_add_memdev(struct cxl_mem *cxlm,
-		    const struct cdevm_file_operations *cdevm_fops)
+devm_cxl_add_memdev(struct cxl_mem *cxlm)
 {
 	struct cxl_memdev *cxlmd;
 	struct device *dev;
 	struct cdev *cdev;
 	int rc;
 
-	cxlmd = cxl_memdev_alloc(cxlm, &cdevm_fops->fops);
+	cxlmd = cxl_memdev_alloc(cxlm, &cxl_memdev_fops);
 	if (IS_ERR(cxlmd))
 		return cxlmd;
 
@@ -219,7 +282,7 @@ devm_cxl_add_memdev(struct cxl_mem *cxlm,
 	 * The cdev was briefly live, shutdown any ioctl operations that
 	 * saw that state.
 	 */
-	cdevm_fops->shutdown(dev);
+	cxl_memdev_shutdown(dev);
 	put_device(dev);
 	return ERR_PTR(rc);
 }
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 9be5e26c5b48..35fd16d12532 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -2,6 +2,7 @@
 /* Copyright(c) 2020-2021 Intel Corporation. */
 #ifndef __CXL_MEM_H__
 #define __CXL_MEM_H__
+#include <uapi/linux/cxl_mem.h>
 #include <linux/cdev.h>
 #include "cxl.h"
 
@@ -28,21 +29,6 @@
 	(FIELD_GET(CXLMDEV_RESET_NEEDED_MASK, status) !=                       \
 	 CXLMDEV_RESET_NEEDED_NOT)
 
-/**
- * struct cdevm_file_operations - devm coordinated cdev file operations
- * @fops: file operations that are synchronized against @shutdown
- * @shutdown: disconnect driver data
- *
- * @shutdown is invoked in the devres release path to disconnect any
- * driver instance data from @dev. It assumes synchronization with any
- * fops operation that requires driver data. After @shutdown an
- * operation may only reference @device data.
- */
-struct cdevm_file_operations {
-	struct file_operations fops;
-	void (*shutdown)(struct device *dev);
-};
-
 /**
  * struct cxl_memdev - CXL bus object representing a Type-3 Memory Device
  * @dev: driver core device object
@@ -62,9 +48,7 @@ static inline struct cxl_memdev *to_cxl_memdev(struct device *dev)
 	return container_of(dev, struct cxl_memdev, dev);
 }
 
-struct cxl_memdev *
-devm_cxl_add_memdev(struct cxl_mem *cxlm,
-		    const struct cdevm_file_operations *cdevm_fops);
+struct cxl_memdev *devm_cxl_add_memdev(struct cxl_mem *cxlm);
 
 /**
  * struct cxl_mbox_cmd - A command to be submitted to hardware.
@@ -158,4 +142,62 @@ struct cxl_mem {
 
 	int (*mbox_send)(struct cxl_mem *cxlm, struct cxl_mbox_cmd *cmd);
 };
+
+enum cxl_opcode {
+	CXL_MBOX_OP_INVALID		= 0x0000,
+	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
+	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
+	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
+	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
+	CXL_MBOX_OP_GET_LOG		= 0x0401,
+	CXL_MBOX_OP_IDENTIFY		= 0x4000,
+	CXL_MBOX_OP_GET_PARTITION_INFO	= 0x4100,
+	CXL_MBOX_OP_SET_PARTITION_INFO	= 0x4101,
+	CXL_MBOX_OP_GET_LSA		= 0x4102,
+	CXL_MBOX_OP_SET_LSA		= 0x4103,
+	CXL_MBOX_OP_GET_HEALTH_INFO	= 0x4200,
+	CXL_MBOX_OP_GET_ALERT_CONFIG	= 0x4201,
+	CXL_MBOX_OP_SET_ALERT_CONFIG	= 0x4202,
+	CXL_MBOX_OP_GET_SHUTDOWN_STATE	= 0x4203,
+	CXL_MBOX_OP_SET_SHUTDOWN_STATE	= 0x4204,
+	CXL_MBOX_OP_GET_POISON		= 0x4300,
+	CXL_MBOX_OP_INJECT_POISON	= 0x4301,
+	CXL_MBOX_OP_CLEAR_POISON	= 0x4302,
+	CXL_MBOX_OP_GET_SCAN_MEDIA_CAPS	= 0x4303,
+	CXL_MBOX_OP_SCAN_MEDIA		= 0x4304,
+	CXL_MBOX_OP_GET_SCAN_MEDIA	= 0x4305,
+	CXL_MBOX_OP_MAX			= 0x10000
+};
+
+/**
+ * struct cxl_mem_command - Driver representation of a memory device command
+ * @info: Command information as it exists for the UAPI
+ * @opcode: The actual bits used for the mailbox protocol
+ * @flags: Set of flags affecting driver behavior.
+ *
+ *  * %CXL_CMD_FLAG_FORCE_ENABLE: In cases of error, commands with this flag
+ *    will be enabled by the driver regardless of what hardware may have
+ *    advertised.
+ *
+ * The cxl_mem_command is the driver's internal representation of commands that
+ * are supported by the driver. Some of these commands may not be supported by
+ * the hardware. The driver will use @info to validate the fields passed in by
+ * the user then submit the @opcode to the hardware.
+ *
+ * See struct cxl_command_info.
+ */
+struct cxl_mem_command {
+	struct cxl_command_info info;
+	enum cxl_opcode opcode;
+	u32 flags;
+#define CXL_CMD_FLAG_NONE 0
+#define CXL_CMD_FLAG_FORCE_ENABLE BIT(0)
+};
+
+int cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm, u16 opcode, void *in,
+			  size_t in_size, void *out, size_t out_size);
+int cxl_mem_identify(struct cxl_mem *cxlm);
+int cxl_mem_enumerate_cmds(struct cxl_mem *cxlm);
+int cxl_mem_create_range_info(struct cxl_mem *cxlm);
+struct cxl_mem *cxl_mem_create(struct device *dev);
 #endif /* __CXL_MEM_H__ */
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 9d8050fdd69c..c9f2ac134f4d 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -1,16 +1,12 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /* Copyright(c) 2020 Intel Corporation. All rights reserved. */
-#include <uapi/linux/cxl_mem.h>
-#include <linux/security.h>
-#include <linux/debugfs.h>
+#include <linux/io-64-nonatomic-lo-hi.h>
 #include <linux/module.h>
 #include <linux/sizes.h>
 #include <linux/mutex.h>
 #include <linux/list.h>
-#include <linux/cdev.h>
 #include <linux/pci.h>
 #include <linux/io.h>
-#include <linux/io-64-nonatomic-lo-hi.h>
 #include "cxlmem.h"
 #include "pci.h"
 #include "cxl.h"
@@ -37,162 +33,6 @@
 /* CXL 2.0 - 8.2.8.4 */
 #define CXL_MAILBOX_TIMEOUT_MS (2 * HZ)
 
-enum opcode {
-	CXL_MBOX_OP_INVALID		= 0x0000,
-	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
-	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
-	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
-	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
-	CXL_MBOX_OP_GET_LOG		= 0x0401,
-	CXL_MBOX_OP_IDENTIFY		= 0x4000,
-	CXL_MBOX_OP_GET_PARTITION_INFO	= 0x4100,
-	CXL_MBOX_OP_SET_PARTITION_INFO	= 0x4101,
-	CXL_MBOX_OP_GET_LSA		= 0x4102,
-	CXL_MBOX_OP_SET_LSA		= 0x4103,
-	CXL_MBOX_OP_GET_HEALTH_INFO	= 0x4200,
-	CXL_MBOX_OP_GET_ALERT_CONFIG	= 0x4201,
-	CXL_MBOX_OP_SET_ALERT_CONFIG	= 0x4202,
-	CXL_MBOX_OP_GET_SHUTDOWN_STATE	= 0x4203,
-	CXL_MBOX_OP_SET_SHUTDOWN_STATE	= 0x4204,
-	CXL_MBOX_OP_GET_POISON		= 0x4300,
-	CXL_MBOX_OP_INJECT_POISON	= 0x4301,
-	CXL_MBOX_OP_CLEAR_POISON	= 0x4302,
-	CXL_MBOX_OP_GET_SCAN_MEDIA_CAPS	= 0x4303,
-	CXL_MBOX_OP_SCAN_MEDIA		= 0x4304,
-	CXL_MBOX_OP_GET_SCAN_MEDIA	= 0x4305,
-	CXL_MBOX_OP_MAX			= 0x10000
-};
-
-static DECLARE_RWSEM(cxl_memdev_rwsem);
-static struct dentry *cxl_debugfs;
-static bool cxl_raw_allow_all;
-
-enum {
-	CEL_UUID,
-	VENDOR_DEBUG_UUID,
-};
-
-/* See CXL 2.0 Table 170. Get Log Input Payload */
-static const uuid_t log_uuid[] = {
-	[CEL_UUID] = UUID_INIT(0xda9c0b5, 0xbf41, 0x4b78, 0x8f, 0x79, 0x96,
-			       0xb1, 0x62, 0x3b, 0x3f, 0x17),
-	[VENDOR_DEBUG_UUID] = UUID_INIT(0xe1819d9, 0x11a9, 0x400c, 0x81, 0x1f,
-					0xd6, 0x07, 0x19, 0x40, 0x3d, 0x86),
-};
-
-/**
- * struct cxl_mem_command - Driver representation of a memory device command
- * @info: Command information as it exists for the UAPI
- * @opcode: The actual bits used for the mailbox protocol
- * @flags: Set of flags effecting driver behavior.
- *
- *  * %CXL_CMD_FLAG_FORCE_ENABLE: In cases of error, commands with this flag
- *    will be enabled by the driver regardless of what hardware may have
- *    advertised.
- *
- * The cxl_mem_command is the driver's internal representation of commands that
- * are supported by the driver. Some of these commands may not be supported by
- * the hardware. The driver will use @info to validate the fields passed in by
- * the user then submit the @opcode to the hardware.
- *
- * See struct cxl_command_info.
- */
-struct cxl_mem_command {
-	struct cxl_command_info info;
-	enum opcode opcode;
-	u32 flags;
-#define CXL_CMD_FLAG_NONE 0
-#define CXL_CMD_FLAG_FORCE_ENABLE BIT(0)
-};
-
-#define CXL_CMD(_id, sin, sout, _flags)                                        \
-	[CXL_MEM_COMMAND_ID_##_id] = {                                         \
-	.info =	{                                                              \
-			.id = CXL_MEM_COMMAND_ID_##_id,                        \
-			.size_in = sin,                                        \
-			.size_out = sout,                                      \
-		},                                                             \
-	.opcode = CXL_MBOX_OP_##_id,                                           \
-	.flags = _flags,                                                       \
-	}
-
-/*
- * This table defines the supported mailbox commands for the driver. This table
- * is made up of a UAPI structure. Non-negative values as parameters in the
- * table will be validated against the user's input. For example, if size_in is
- * 0, and the user passed in 1, it is an error.
- */
-static struct cxl_mem_command mem_commands[CXL_MEM_COMMAND_ID_MAX] = {
-	CXL_CMD(IDENTIFY, 0, 0x43, CXL_CMD_FLAG_FORCE_ENABLE),
-#ifdef CONFIG_CXL_MEM_RAW_COMMANDS
-	CXL_CMD(RAW, ~0, ~0, 0),
-#endif
-	CXL_CMD(GET_SUPPORTED_LOGS, 0, ~0, CXL_CMD_FLAG_FORCE_ENABLE),
-	CXL_CMD(GET_FW_INFO, 0, 0x50, 0),
-	CXL_CMD(GET_PARTITION_INFO, 0, 0x20, 0),
-	CXL_CMD(GET_LSA, 0x8, ~0, 0),
-	CXL_CMD(GET_HEALTH_INFO, 0, 0x12, 0),
-	CXL_CMD(GET_LOG, 0x18, ~0, CXL_CMD_FLAG_FORCE_ENABLE),
-	CXL_CMD(SET_PARTITION_INFO, 0x0a, 0, 0),
-	CXL_CMD(SET_LSA, ~0, 0, 0),
-	CXL_CMD(GET_ALERT_CONFIG, 0, 0x10, 0),
-	CXL_CMD(SET_ALERT_CONFIG, 0xc, 0, 0),
-	CXL_CMD(GET_SHUTDOWN_STATE, 0, 0x1, 0),
-	CXL_CMD(SET_SHUTDOWN_STATE, 0x1, 0, 0),
-	CXL_CMD(GET_POISON, 0x10, ~0, 0),
-	CXL_CMD(INJECT_POISON, 0x8, 0, 0),
-	CXL_CMD(CLEAR_POISON, 0x48, 0, 0),
-	CXL_CMD(GET_SCAN_MEDIA_CAPS, 0x10, 0x4, 0),
-	CXL_CMD(SCAN_MEDIA, 0x11, 0, 0),
-	CXL_CMD(GET_SCAN_MEDIA, 0, ~0, 0),
-};
-
-/*
- * Commands that RAW doesn't permit. The rationale for each:
- *
- * CXL_MBOX_OP_ACTIVATE_FW: Firmware activation requires adjustment /
- * coordination of transaction timeout values at the root bridge level.
- *
- * CXL_MBOX_OP_SET_PARTITION_INFO: The device memory map may change live
- * and needs to be coordinated with HDM updates.
- *
- * CXL_MBOX_OP_SET_LSA: The label storage area may be cached by the
- * driver and any writes from userspace invalidates those contents.
- *
- * CXL_MBOX_OP_SET_SHUTDOWN_STATE: Set shutdown state assumes no writes
- * to the device after it is marked clean, userspace can not make that
- * assertion.
- *
- * CXL_MBOX_OP_[GET_]SCAN_MEDIA: The kernel provides a native error list that
- * is kept up to date with patrol notifications and error management.
- */
-static u16 cxl_disabled_raw_commands[] = {
-	CXL_MBOX_OP_ACTIVATE_FW,
-	CXL_MBOX_OP_SET_PARTITION_INFO,
-	CXL_MBOX_OP_SET_LSA,
-	CXL_MBOX_OP_SET_SHUTDOWN_STATE,
-	CXL_MBOX_OP_SCAN_MEDIA,
-	CXL_MBOX_OP_GET_SCAN_MEDIA,
-};
-
-/*
- * Command sets that RAW doesn't permit. All opcodes in this set are
- * disabled because they pass plain text security payloads over the
- * user/kernel boundary. This functionality is intended to be wrapped
- * behind the keys ABI which allows for encrypted payloads in the UAPI
- */
-static u8 security_command_sets[] = {
-	0x44, /* Sanitize */
-	0x45, /* Persistent Memory Data-at-rest Security */
-	0x46, /* Security Passthrough */
-};
-
-#define cxl_for_each_cmd(cmd)                                                  \
-	for ((cmd) = &mem_commands[0];                                         \
-	     ((cmd) - mem_commands) < ARRAY_SIZE(mem_commands); (cmd)++)
-
-#define cxl_cmd_count ARRAY_SIZE(mem_commands)
-
 static int cxl_mem_wait_for_doorbell(struct cxl_mem *cxlm)
 {
 	const unsigned long start = jiffies;
@@ -215,16 +55,6 @@ static int cxl_mem_wait_for_doorbell(struct cxl_mem *cxlm)
 	return 0;
 }
 
-static bool cxl_is_security_command(u16 opcode)
-{
-	int i;
-
-	for (i = 0; i < ARRAY_SIZE(security_command_sets); i++)
-		if (security_command_sets[i] == (opcode >> 8))
-			return true;
-	return false;
-}
-
 static void cxl_mem_mbox_timeout(struct cxl_mem *cxlm,
 				 struct cxl_mbox_cmd *mbox_cmd)
 {
@@ -446,433 +276,6 @@ static int cxl_pci_mbox_send(struct cxl_mem *cxlm, struct cxl_mbox_cmd *cmd)
 	return rc;
 }
 
-/**
- * handle_mailbox_cmd_from_user() - Dispatch a mailbox command for userspace.
- * @cxlm: The CXL memory device to communicate with.
- * @cmd: The validated command.
- * @in_payload: Pointer to userspace's input payload.
- * @out_payload: Pointer to userspace's output payload.
- * @size_out: (Input) Max payload size to copy out.
- *            (Output) Payload size hardware generated.
- * @retval: Hardware generated return code from the operation.
- *
- * Return:
- *  * %0	- Mailbox transaction succeeded. This implies the mailbox
- *		  protocol completed successfully not that the operation itself
- *		  was successful.
- *  * %-ENOMEM  - Couldn't allocate a bounce buffer.
- *  * %-EFAULT	- Something happened with copy_to/from_user.
- *  * %-EINTR	- Mailbox acquisition interrupted.
- *  * %-EXXX	- Transaction level failures.
- *
- * Creates the appropriate mailbox command and dispatches it on behalf of a
- * userspace request. The input and output payloads are copied between
- * userspace.
- *
- * See cxl_send_cmd().
- */
-static int handle_mailbox_cmd_from_user(struct cxl_mem *cxlm,
-					const struct cxl_mem_command *cmd,
-					u64 in_payload, u64 out_payload,
-					s32 *size_out, u32 *retval)
-{
-	struct device *dev = cxlm->dev;
-	struct cxl_mbox_cmd mbox_cmd = {
-		.opcode = cmd->opcode,
-		.size_in = cmd->info.size_in,
-		.size_out = cmd->info.size_out,
-	};
-	int rc;
-
-	if (cmd->info.size_out) {
-		mbox_cmd.payload_out = kvzalloc(cmd->info.size_out, GFP_KERNEL);
-		if (!mbox_cmd.payload_out)
-			return -ENOMEM;
-	}
-
-	if (cmd->info.size_in) {
-		mbox_cmd.payload_in = vmemdup_user(u64_to_user_ptr(in_payload),
-						   cmd->info.size_in);
-		if (IS_ERR(mbox_cmd.payload_in)) {
-			kvfree(mbox_cmd.payload_out);
-			return PTR_ERR(mbox_cmd.payload_in);
-		}
-	}
-
-	dev_dbg(dev,
-		"Submitting %s command for user\n"
-		"\topcode: %x\n"
-		"\tsize: %ub\n",
-		cxl_command_names[cmd->info.id].name, mbox_cmd.opcode,
-		cmd->info.size_in);
-
-	dev_WARN_ONCE(dev, cmd->info.id == CXL_MEM_COMMAND_ID_RAW,
-		      "raw command path used\n");
-
-	rc = cxlm->mbox_send(cxlm, &mbox_cmd);
-	if (rc)
-		goto out;
-
-	/*
-	 * @size_out contains the max size that's allowed to be written back out
-	 * to userspace. While the payload may have written more output than
-	 * this it will have to be ignored.
-	 */
-	if (mbox_cmd.size_out) {
-		dev_WARN_ONCE(dev, mbox_cmd.size_out > *size_out,
-			      "Invalid return size\n");
-		if (copy_to_user(u64_to_user_ptr(out_payload),
-				 mbox_cmd.payload_out, mbox_cmd.size_out)) {
-			rc = -EFAULT;
-			goto out;
-		}
-	}
-
-	*size_out = mbox_cmd.size_out;
-	*retval = mbox_cmd.return_code;
-
-out:
-	kvfree(mbox_cmd.payload_in);
-	kvfree(mbox_cmd.payload_out);
-	return rc;
-}
-
-static bool cxl_mem_raw_command_allowed(u16 opcode)
-{
-	int i;
-
-	if (!IS_ENABLED(CONFIG_CXL_MEM_RAW_COMMANDS))
-		return false;
-
-	if (security_locked_down(LOCKDOWN_PCI_ACCESS))
-		return false;
-
-	if (cxl_raw_allow_all)
-		return true;
-
-	if (cxl_is_security_command(opcode))
-		return false;
-
-	for (i = 0; i < ARRAY_SIZE(cxl_disabled_raw_commands); i++)
-		if (cxl_disabled_raw_commands[i] == opcode)
-			return false;
-
-	return true;
-}
-
-/**
- * cxl_validate_cmd_from_user() - Check fields for CXL_MEM_SEND_COMMAND.
- * @cxlm: &struct cxl_mem device whose mailbox will be used.
- * @send_cmd: &struct cxl_send_command copied in from userspace.
- * @out_cmd: Sanitized and populated &struct cxl_mem_command.
- *
- * Return:
- *  * %0	- @out_cmd is ready to send.
- *  * %-ENOTTY	- Invalid command specified.
- *  * %-EINVAL	- Reserved fields or invalid values were used.
- *  * %-ENOMEM	- Input or output buffer wasn't sized properly.
- *  * %-EPERM	- Attempted to use a protected command.
- *
- * The result of this command is a fully validated command in @out_cmd that is
- * safe to send to the hardware.
- *
- * See handle_mailbox_cmd_from_user()
- */
-static int cxl_validate_cmd_from_user(struct cxl_mem *cxlm,
-				      const struct cxl_send_command *send_cmd,
-				      struct cxl_mem_command *out_cmd)
-{
-	const struct cxl_command_info *info;
-	struct cxl_mem_command *c;
-
-	if (send_cmd->id == 0 || send_cmd->id >= CXL_MEM_COMMAND_ID_MAX)
-		return -ENOTTY;
-
-	/*
-	 * The user can never specify an input payload larger than what hardware
-	 * supports, but output can be arbitrarily large (simply write out as
-	 * much data as the hardware provides).
-	 */
-	if (send_cmd->in.size > cxlm->payload_size)
-		return -EINVAL;
-
-	/*
-	 * Checks are bypassed for raw commands but a WARN/taint will occur
-	 * later in the callchain
-	 */
-	if (send_cmd->id == CXL_MEM_COMMAND_ID_RAW) {
-		const struct cxl_mem_command temp = {
-			.info = {
-				.id = CXL_MEM_COMMAND_ID_RAW,
-				.flags = 0,
-				.size_in = send_cmd->in.size,
-				.size_out = send_cmd->out.size,
-			},
-			.opcode = send_cmd->raw.opcode
-		};
-
-		if (send_cmd->raw.rsvd)
-			return -EINVAL;
-
-		/*
-		 * Unlike supported commands, the output size of RAW commands
-		 * gets passed along without further checking, so it must be
-		 * validated here.
-		 */
-		if (send_cmd->out.size > cxlm->payload_size)
-			return -EINVAL;
-
-		if (!cxl_mem_raw_command_allowed(send_cmd->raw.opcode))
-			return -EPERM;
-
-		memcpy(out_cmd, &temp, sizeof(temp));
-
-		return 0;
-	}
-
-	if (send_cmd->flags & ~CXL_MEM_COMMAND_FLAG_MASK)
-		return -EINVAL;
-
-	if (send_cmd->rsvd)
-		return -EINVAL;
-
-	if (send_cmd->in.rsvd || send_cmd->out.rsvd)
-		return -EINVAL;
-
-	/* Convert user's command into the internal representation */
-	c = &mem_commands[send_cmd->id];
-	info = &c->info;
-
-	/* Check that the command is enabled for hardware */
-	if (!test_bit(info->id, cxlm->enabled_cmds))
-		return -ENOTTY;
-
-	/* Check the input buffer is the expected size */
-	if (info->size_in >= 0 && info->size_in != send_cmd->in.size)
-		return -ENOMEM;
-
-	/* Check the output buffer is at least large enough */
-	if (info->size_out >= 0 && send_cmd->out.size < info->size_out)
-		return -ENOMEM;
-
-	memcpy(out_cmd, c, sizeof(*c));
-	out_cmd->info.size_in = send_cmd->in.size;
-	/*
-	 * XXX: out_cmd->info.size_out will be controlled by the driver, and the
-	 * specified number of bytes @send_cmd->out.size will be copied back out
-	 * to userspace.
-	 */
-
-	return 0;
-}
-
-static int cxl_query_cmd(struct cxl_memdev *cxlmd,
-			 struct cxl_mem_query_commands __user *q)
-{
-	struct device *dev = &cxlmd->dev;
-	struct cxl_mem_command *cmd;
-	u32 n_commands;
-	int j = 0;
-
-	dev_dbg(dev, "Query IOCTL\n");
-
-	if (get_user(n_commands, &q->n_commands))
-		return -EFAULT;
-
-	/* returns the total number if 0 elements are requested. */
-	if (n_commands == 0)
-		return put_user(cxl_cmd_count, &q->n_commands);
-
-	/*
-	 * otherwise, return max(n_commands, total commands) cxl_command_info
-	 * structures.
-	 */
-	cxl_for_each_cmd(cmd) {
-		const struct cxl_command_info *info = &cmd->info;
-
-		if (copy_to_user(&q->commands[j++], info, sizeof(*info)))
-			return -EFAULT;
-
-		if (j == n_commands)
-			break;
-	}
-
-	return 0;
-}
-
-static int cxl_send_cmd(struct cxl_memdev *cxlmd,
-			struct cxl_send_command __user *s)
-{
-	struct cxl_mem *cxlm = cxlmd->cxlm;
-	struct device *dev = &cxlmd->dev;
-	struct cxl_send_command send;
-	struct cxl_mem_command c;
-	int rc;
-
-	dev_dbg(dev, "Send IOCTL\n");
-
-	if (copy_from_user(&send, s, sizeof(send)))
-		return -EFAULT;
-
-	rc = cxl_validate_cmd_from_user(cxlmd->cxlm, &send, &c);
-	if (rc)
-		return rc;
-
-	/* Prepare to handle a full payload for variable sized output */
-	if (c.info.size_out < 0)
-		c.info.size_out = cxlm->payload_size;
-
-	rc = handle_mailbox_cmd_from_user(cxlm, &c, send.in.payload,
-					  send.out.payload, &send.out.size,
-					  &send.retval);
-	if (rc)
-		return rc;
-
-	if (copy_to_user(s, &send, sizeof(send)))
-		return -EFAULT;
-
-	return 0;
-}
-
-static long __cxl_memdev_ioctl(struct cxl_memdev *cxlmd, unsigned int cmd,
-			       unsigned long arg)
-{
-	switch (cmd) {
-	case CXL_MEM_QUERY_COMMANDS:
-		return cxl_query_cmd(cxlmd, (void __user *)arg);
-	case CXL_MEM_SEND_COMMAND:
-		return cxl_send_cmd(cxlmd, (void __user *)arg);
-	default:
-		return -ENOTTY;
-	}
-}
-
-static long cxl_memdev_ioctl(struct file *file, unsigned int cmd,
-			     unsigned long arg)
-{
-	struct cxl_memdev *cxlmd = file->private_data;
-	int rc = -ENXIO;
-
-	down_read(&cxl_memdev_rwsem);
-	if (cxlmd->cxlm)
-		rc = __cxl_memdev_ioctl(cxlmd, cmd, arg);
-	up_read(&cxl_memdev_rwsem);
-
-	return rc;
-}
-
-static int cxl_memdev_open(struct inode *inode, struct file *file)
-{
-	struct cxl_memdev *cxlmd =
-		container_of(inode->i_cdev, typeof(*cxlmd), cdev);
-
-	get_device(&cxlmd->dev);
-	file->private_data = cxlmd;
-
-	return 0;
-}
-
-static int cxl_memdev_release_file(struct inode *inode, struct file *file)
-{
-	struct cxl_memdev *cxlmd =
-		container_of(inode->i_cdev, typeof(*cxlmd), cdev);
-
-	put_device(&cxlmd->dev);
-
-	return 0;
-}
-
-static void cxl_memdev_shutdown(struct device *dev)
-{
-	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
-
-	down_write(&cxl_memdev_rwsem);
-	cxlmd->cxlm = NULL;
-	up_write(&cxl_memdev_rwsem);
-}
-
-static const struct cdevm_file_operations cxl_memdev_fops = {
-	.fops = {
-		.owner = THIS_MODULE,
-		.unlocked_ioctl = cxl_memdev_ioctl,
-		.open = cxl_memdev_open,
-		.release = cxl_memdev_release_file,
-		.compat_ioctl = compat_ptr_ioctl,
-		.llseek = noop_llseek,
-	},
-	.shutdown = cxl_memdev_shutdown,
-};
-
-static inline struct cxl_mem_command *cxl_mem_find_command(u16 opcode)
-{
-	struct cxl_mem_command *c;
-
-	cxl_for_each_cmd(c)
-		if (c->opcode == opcode)
-			return c;
-
-	return NULL;
-}
-
-/**
- * cxl_mem_mbox_send_cmd() - Send a mailbox command to a memory device.
- * @cxlm: The CXL memory device to communicate with.
- * @opcode: Opcode for the mailbox command.
- * @in: The input payload for the mailbox command.
- * @in_size: The length of the input payload
- * @out: Caller allocated buffer for the output.
- * @out_size: Expected size of output.
- *
- * Context: Any context. Will acquire and release mbox_mutex.
- * Return:
- *  * %>=0	- Number of bytes returned in @out.
- *  * %-E2BIG	- Payload is too large for hardware.
- *  * %-EBUSY	- Couldn't acquire exclusive mailbox access.
- *  * %-EFAULT	- Hardware error occurred.
- *  * %-ENXIO	- Command completed, but device reported an error.
- *  * %-EIO	- Unexpected output size.
- *
- * Mailbox commands may execute successfully yet the device itself reported an
- * error. While this distinction can be useful for commands from userspace, the
- * kernel will only be able to use results when both are successful.
- *
- * See __cxl_mem_mbox_send_cmd()
- */
-static int cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm, u16 opcode,
-				 void *in, size_t in_size,
-				 void *out, size_t out_size)
-{
-	const struct cxl_mem_command *cmd = cxl_mem_find_command(opcode);
-	struct cxl_mbox_cmd mbox_cmd = {
-		.opcode = opcode,
-		.payload_in = in,
-		.size_in = in_size,
-		.size_out = out_size,
-		.payload_out = out,
-	};
-	int rc;
-
-	if (out_size > cxlm->payload_size)
-		return -E2BIG;
-
-	rc = cxlm->mbox_send(cxlm, &mbox_cmd);
-	if (rc)
-		return rc;
-
-	/* TODO: Map return code to proper kernel style errno */
-	if (mbox_cmd.return_code != CXL_MBOX_SUCCESS)
-		return -ENXIO;
-
-	/*
-	 * Variable sized commands can't be validated and so it's up to the
-	 * caller to do that if they wish.
-	 */
-	if (cmd->info.size_out >= 0 && mbox_cmd.size_out != out_size)
-		return -EIO;
-
-	return 0;
-}
-
 static int cxl_mem_setup_mailbox(struct cxl_mem *cxlm)
 {
 	const int cap = readl(cxlm->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
@@ -901,30 +304,6 @@ static int cxl_mem_setup_mailbox(struct cxl_mem *cxlm)
 	return 0;
 }
 
-static struct cxl_mem *cxl_mem_create(struct device *dev)
-{
-	struct cxl_mem *cxlm;
-
-	cxlm = devm_kzalloc(dev, sizeof(*cxlm), GFP_KERNEL);
-	if (!cxlm) {
-		dev_err(dev, "No memory available\n");
-		return ERR_PTR(-ENOMEM);
-	}
-
-	mutex_init(&cxlm->mbox_mutex);
-	cxlm->dev = dev;
-	cxlm->enabled_cmds =
-		devm_kmalloc_array(dev, BITS_TO_LONGS(cxl_cmd_count),
-				   sizeof(unsigned long),
-				   GFP_KERNEL | __GFP_ZERO);
-	if (!cxlm->enabled_cmds) {
-		dev_err(dev, "No memory available for bitmap\n");
-		return ERR_PTR(-ENOMEM);
-	}
-
-	return cxlm;
-}
-
 static void __iomem *cxl_mem_map_regblock(struct cxl_mem *cxlm,
 					  u8 bar, u64 offset)
 {
@@ -1132,298 +511,6 @@ static int cxl_mem_setup_regs(struct cxl_mem *cxlm)
 	return ret;
 }
 
-static int cxl_xfer_log(struct cxl_mem *cxlm, uuid_t *uuid, u32 size, u8 *out)
-{
-	u32 remaining = size;
-	u32 offset = 0;
-
-	while (remaining) {
-		u32 xfer_size = min_t(u32, remaining, cxlm->payload_size);
-		struct cxl_mbox_get_log {
-			uuid_t uuid;
-			__le32 offset;
-			__le32 length;
-		} __packed log = {
-			.uuid = *uuid,
-			.offset = cpu_to_le32(offset),
-			.length = cpu_to_le32(xfer_size)
-		};
-		int rc;
-
-		rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_GET_LOG, &log,
-					   sizeof(log), out, xfer_size);
-		if (rc < 0)
-			return rc;
-
-		out += xfer_size;
-		remaining -= xfer_size;
-		offset += xfer_size;
-	}
-
-	return 0;
-}
-
-/**
- * cxl_walk_cel() - Walk through the Command Effects Log.
- * @cxlm: Device.
- * @size: Length of the Command Effects Log.
- * @cel: CEL
- *
- * Iterate over each entry in the CEL and determine if the driver supports the
- * command. If so, the command is enabled for the device and can be used later.
- */
-static void cxl_walk_cel(struct cxl_mem *cxlm, size_t size, u8 *cel)
-{
-	struct cel_entry {
-		__le16 opcode;
-		__le16 effect;
-	} __packed * cel_entry;
-	const int cel_entries = size / sizeof(*cel_entry);
-	int i;
-
-	cel_entry = (struct cel_entry *)cel;
-
-	for (i = 0; i < cel_entries; i++) {
-		u16 opcode = le16_to_cpu(cel_entry[i].opcode);
-		struct cxl_mem_command *cmd = cxl_mem_find_command(opcode);
-
-		if (!cmd) {
-			dev_dbg(cxlm->dev,
-				"Opcode 0x%04x unsupported by driver", opcode);
-			continue;
-		}
-
-		set_bit(cmd->info.id, cxlm->enabled_cmds);
-	}
-}
-
-struct cxl_mbox_get_supported_logs {
-	__le16 entries;
-	u8 rsvd[6];
-	struct gsl_entry {
-		uuid_t uuid;
-		__le32 size;
-	} __packed entry[];
-} __packed;
-
-static struct cxl_mbox_get_supported_logs *cxl_get_gsl(struct cxl_mem *cxlm)
-{
-	struct cxl_mbox_get_supported_logs *ret;
-	int rc;
-
-	ret = kvmalloc(cxlm->payload_size, GFP_KERNEL);
-	if (!ret)
-		return ERR_PTR(-ENOMEM);
-
-	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_GET_SUPPORTED_LOGS, NULL,
-				   0, ret, cxlm->payload_size);
-	if (rc < 0) {
-		kvfree(ret);
-		return ERR_PTR(rc);
-	}
-
-	return ret;
-}
-
-/**
- * cxl_mem_get_partition_info - Get partition info
- * @cxlm: cxl_mem instance to update partition info
- *
- * Retrieve the current partition info for the device specified.  If not 0, the
- * 'next' values are pending and take affect on next cold reset.
- *
- * Return: 0 if no error: or the result of the mailbox command.
- *
- * See CXL @8.2.9.5.2.1 Get Partition Info
- */
-static int cxl_mem_get_partition_info(struct cxl_mem *cxlm)
-{
-	struct cxl_mbox_get_partition_info {
-		__le64 active_volatile_cap;
-		__le64 active_persistent_cap;
-		__le64 next_volatile_cap;
-		__le64 next_persistent_cap;
-	} __packed pi;
-	int rc;
-
-	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_GET_PARTITION_INFO,
-				   NULL, 0, &pi, sizeof(pi));
-	if (rc)
-		return rc;
-
-	cxlm->active_volatile_bytes =
-		le64_to_cpu(pi.active_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
-	cxlm->active_persistent_bytes =
-		le64_to_cpu(pi.active_persistent_cap) * CXL_CAPACITY_MULTIPLIER;
-	cxlm->next_volatile_bytes =
-		le64_to_cpu(pi.next_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
-	cxlm->next_persistent_bytes =
-		le64_to_cpu(pi.next_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
-
-	return 0;
-}
-
-/**
- * cxl_mem_enumerate_cmds() - Enumerate commands for a device.
- * @cxlm: The device.
- *
- * Returns 0 if enumerate completed successfully.
- *
- * CXL devices have optional support for certain commands. This function will
- * determine the set of supported commands for the hardware and update the
- * enabled_cmds bitmap in the @cxlm.
- */
-static int cxl_mem_enumerate_cmds(struct cxl_mem *cxlm)
-{
-	struct cxl_mbox_get_supported_logs *gsl;
-	struct device *dev = cxlm->dev;
-	struct cxl_mem_command *cmd;
-	int i, rc;
-
-	gsl = cxl_get_gsl(cxlm);
-	if (IS_ERR(gsl))
-		return PTR_ERR(gsl);
-
-	rc = -ENOENT;
-	for (i = 0; i < le16_to_cpu(gsl->entries); i++) {
-		u32 size = le32_to_cpu(gsl->entry[i].size);
-		uuid_t uuid = gsl->entry[i].uuid;
-		u8 *log;
-
-		dev_dbg(dev, "Found LOG type %pU of size %d", &uuid, size);
-
-		if (!uuid_equal(&uuid, &log_uuid[CEL_UUID]))
-			continue;
-
-		log = kvmalloc(size, GFP_KERNEL);
-		if (!log) {
-			rc = -ENOMEM;
-			goto out;
-		}
-
-		rc = cxl_xfer_log(cxlm, &uuid, size, log);
-		if (rc) {
-			kvfree(log);
-			goto out;
-		}
-
-		cxl_walk_cel(cxlm, size, log);
-		kvfree(log);
-
-		/* In case CEL was bogus, enable some default commands. */
-		cxl_for_each_cmd(cmd)
-			if (cmd->flags & CXL_CMD_FLAG_FORCE_ENABLE)
-				set_bit(cmd->info.id, cxlm->enabled_cmds);
-
-		/* Found the required CEL */
-		rc = 0;
-	}
-
-out:
-	kvfree(gsl);
-	return rc;
-}
-
-/**
- * cxl_mem_identify() - Send the IDENTIFY command to the device.
- * @cxlm: The device to identify.
- *
- * Return: 0 if identify was executed successfully.
- *
- * This will dispatch the identify command to the device and on success populate
- * structures to be exported to sysfs.
- */
-static int cxl_mem_identify(struct cxl_mem *cxlm)
-{
-	/* See CXL 2.0 Table 175 Identify Memory Device Output Payload */
-	struct cxl_mbox_identify {
-		char fw_revision[0x10];
-		__le64 total_capacity;
-		__le64 volatile_capacity;
-		__le64 persistent_capacity;
-		__le64 partition_align;
-		__le16 info_event_log_size;
-		__le16 warning_event_log_size;
-		__le16 failure_event_log_size;
-		__le16 fatal_event_log_size;
-		__le32 lsa_size;
-		u8 poison_list_max_mer[3];
-		__le16 inject_poison_limit;
-		u8 poison_caps;
-		u8 qos_telemetry_caps;
-	} __packed id;
-	int rc;
-
-	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_IDENTIFY, NULL, 0, &id,
-				   sizeof(id));
-	if (rc < 0)
-		return rc;
-
-	cxlm->total_bytes = le64_to_cpu(id.total_capacity);
-	cxlm->total_bytes *= CXL_CAPACITY_MULTIPLIER;
-
-	cxlm->volatile_only_bytes = le64_to_cpu(id.volatile_capacity);
-	cxlm->volatile_only_bytes *= CXL_CAPACITY_MULTIPLIER;
-
-	cxlm->persistent_only_bytes = le64_to_cpu(id.persistent_capacity);
-	cxlm->persistent_only_bytes *= CXL_CAPACITY_MULTIPLIER;
-
-	cxlm->partition_align_bytes = le64_to_cpu(id.partition_align);
-	cxlm->partition_align_bytes *= CXL_CAPACITY_MULTIPLIER;
-
-	dev_dbg(cxlm->dev,
-		"Identify Memory Device\n"
-		"     total_bytes = %#llx\n"
-		"     volatile_only_bytes = %#llx\n"
-		"     persistent_only_bytes = %#llx\n"
-		"     partition_align_bytes = %#llx\n",
-		cxlm->total_bytes, cxlm->volatile_only_bytes,
-		cxlm->persistent_only_bytes, cxlm->partition_align_bytes);
-
-	cxlm->lsa_size = le32_to_cpu(id.lsa_size);
-	memcpy(cxlm->firmware_version, id.fw_revision, sizeof(id.fw_revision));
-
-	return 0;
-}
-
-static int cxl_mem_create_range_info(struct cxl_mem *cxlm)
-{
-	int rc;
-
-	if (cxlm->partition_align_bytes == 0) {
-		cxlm->ram_range.start = 0;
-		cxlm->ram_range.end = cxlm->volatile_only_bytes - 1;
-		cxlm->pmem_range.start = cxlm->volatile_only_bytes;
-		cxlm->pmem_range.end = cxlm->volatile_only_bytes +
-					cxlm->persistent_only_bytes - 1;
-		return 0;
-	}
-
-	rc = cxl_mem_get_partition_info(cxlm);
-	if (rc < 0) {
-		dev_err(cxlm->dev, "Failed to query partition information\n");
-		return rc;
-	}
-
-	dev_dbg(cxlm->dev,
-		"Get Partition Info\n"
-		"     active_volatile_bytes = %#llx\n"
-		"     active_persistent_bytes = %#llx\n"
-		"     next_volatile_bytes = %#llx\n"
-		"     next_persistent_bytes = %#llx\n",
-		cxlm->active_volatile_bytes, cxlm->active_persistent_bytes,
-		cxlm->next_volatile_bytes, cxlm->next_persistent_bytes);
-
-	cxlm->ram_range.start = 0;
-	cxlm->ram_range.end = cxlm->active_volatile_bytes - 1;
-
-	cxlm->pmem_range.start = cxlm->active_volatile_bytes;
-	cxlm->pmem_range.end = cxlm->active_volatile_bytes +
-				cxlm->active_persistent_bytes - 1;
-
-	return 0;
-}
-
 static int cxl_mem_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 {
 	struct cxl_memdev *cxlmd;
@@ -1458,7 +545,7 @@ static int cxl_mem_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (rc)
 		return rc;
 
-	cxlmd = devm_cxl_add_memdev(cxlm, &cxl_memdev_fops);
+	cxlmd = devm_cxl_add_memdev(cxlm);
 	if (IS_ERR(cxlmd))
 		return PTR_ERR(cxlmd);
 
@@ -1486,7 +573,6 @@ static struct pci_driver cxl_mem_driver = {
 
 static __init int cxl_mem_init(void)
 {
-	struct dentry *mbox_debugfs;
 	int rc;
 
 	/* Double check the anonymous union trickery in struct cxl_regs */
@@ -1497,17 +583,11 @@ static __init int cxl_mem_init(void)
 	if (rc)
 		return rc;
 
-	cxl_debugfs = debugfs_create_dir("cxl", NULL);
-	mbox_debugfs = debugfs_create_dir("mbox", cxl_debugfs);
-	debugfs_create_bool("raw_allow_all", 0600, mbox_debugfs,
-			    &cxl_raw_allow_all);
-
 	return 0;
 }
 
 static __exit void cxl_mem_exit(void)
 {
-	debugfs_remove_recursive(cxl_debugfs);
 	pci_unregister_driver(&cxl_mem_driver);
 }
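
Not part of the patch, but for illustration: a minimal userspace sketch
of the two-call CXL_MEM_QUERY_COMMANDS flow that cxl_query_cmd() above
implements, assuming a /dev/cxl/mem0 node and the <linux/cxl_mem.h>
uapi header from a kernel with this series applied:

	#include <linux/cxl_mem.h>
	#include <sys/ioctl.h>
	#include <fcntl.h>
	#include <stdio.h>
	#include <stdlib.h>

	int main(void)
	{
		struct cxl_mem_query_commands *q;
		__u32 n, i;
		int fd = open("/dev/cxl/mem0", O_RDWR);

		if (fd < 0)
			return 1;

		/* n_commands == 0 asks for the total number of commands */
		q = calloc(1, sizeof(*q));
		if (!q || ioctl(fd, CXL_MEM_QUERY_COMMANDS, q))
			return 1;
		n = q->n_commands;
		free(q);

		/* second call retrieves n cxl_command_info records */
		q = calloc(1, sizeof(*q) + n * sizeof(q->commands[0]));
		if (!q)
			return 1;
		q->n_commands = n;
		if (ioctl(fd, CXL_MEM_QUERY_COMMANDS, q))
			return 1;
		for (i = 0; i < n; i++)
			printf("id: %u size_in: %d size_out: %d\n",
			       q->commands[i].id, q->commands[i].size_in,
			       q->commands[i].size_out);
		free(q);
		return 0;
	}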
 


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH v4 12/21] cxl/pci: Use module_pci_driver
  2021-09-09  5:11 [PATCH v4 00/21] cxl_test: Enable CXL Topology and UAPI regression tests Dan Williams
                   ` (10 preceding siblings ...)
  2021-09-09  5:12 ` [PATCH v4 11/21] cxl/mbox: Move mailbox and other non-PCI specific infrastructure to the core Dan Williams
@ 2021-09-09  5:12 ` Dan Williams
  2021-09-09  5:12 ` [PATCH v4 13/21] cxl/mbox: Convert 'enabled_cmds' to DECLARE_BITMAP Dan Williams
                   ` (8 subsequent siblings)
  20 siblings, 0 replies; 78+ messages in thread
From: Dan Williams @ 2021-09-09  5:12 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ben Widawsky, Jonathan Cameron, vishal.l.verma, nvdimm,
	ben.widawsky, alison.schofield, vishal.l.verma, ira.weiny,
	Jonathan.Cameron

Now that cxl_mem_{init,exit} no longer need to manage debugfs, switch
back to the smaller form of the boilerplate.
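
For reference, module_pci_driver(cxl_mem_driver) generates roughly the
same init/exit pair that this patch deletes; a sketch of what the macro
in <linux/pci.h> expands to:

	static int __init cxl_mem_driver_init(void)
	{
		return pci_register_driver(&cxl_mem_driver);
	}
	module_init(cxl_mem_driver_init);

	static void __exit cxl_mem_driver_exit(void)
	{
		pci_unregister_driver(&cxl_mem_driver);
	}
	module_exit(cxl_mem_driver_exit);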

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/pci.c |   30 ++++++++----------------------
 1 file changed, 8 insertions(+), 22 deletions(-)

diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index c9f2ac134f4d..27f75b5a2ee2 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -517,6 +517,13 @@ static int cxl_mem_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	struct cxl_mem *cxlm;
 	int rc;
 
+	/*
+	 * Double check the anonymous union trickery in struct cxl_regs
+	 * FIXME switch to struct_group()
+	 */
+	BUILD_BUG_ON(offsetof(struct cxl_regs, memdev) !=
+		     offsetof(struct cxl_regs, device_regs.memdev));
+
 	rc = pcim_enable_device(pdev);
 	if (rc)
 		return rc;
@@ -571,27 +578,6 @@ static struct pci_driver cxl_mem_driver = {
 	},
 };
 
-static __init int cxl_mem_init(void)
-{
-	int rc;
-
-	/* Double check the anonymous union trickery in struct cxl_regs */
-	BUILD_BUG_ON(offsetof(struct cxl_regs, memdev) !=
-		     offsetof(struct cxl_regs, device_regs.memdev));
-
-	rc = pci_register_driver(&cxl_mem_driver);
-	if (rc)
-		return rc;
-
-	return 0;
-}
-
-static __exit void cxl_mem_exit(void)
-{
-	pci_unregister_driver(&cxl_mem_driver);
-}
-
 MODULE_LICENSE("GPL v2");
-module_init(cxl_mem_init);
-module_exit(cxl_mem_exit);
+module_pci_driver(cxl_mem_driver);
 MODULE_IMPORT_NS(CXL);


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH v4 13/21] cxl/mbox: Convert 'enabled_cmds' to DECLARE_BITMAP
  2021-09-09  5:11 [PATCH v4 00/21] cxl_test: Enable CXL Topology and UAPI regression tests Dan Williams
                   ` (11 preceding siblings ...)
  2021-09-09  5:12 ` [PATCH v4 12/21] cxl/pci: Use module_pci_driver Dan Williams
@ 2021-09-09  5:12 ` Dan Williams
  2021-09-09  5:12 ` [PATCH v4 14/21] cxl/mbox: Add exclusive kernel command support Dan Williams
                   ` (7 subsequent siblings)
  20 siblings, 0 replies; 78+ messages in thread
From: Dan Williams @ 2021-09-09  5:12 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ben Widawsky, Jonathan Cameron, vishal.l.verma, nvdimm,
	ben.widawsky, alison.schofield, vishal.l.verma, ira.weiny,
	Jonathan.Cameron

Define enabled_cmds as an embedded member of 'struct cxl_mem' rather
than a pointer to another dynamic allocation.

As this leaves only one user of cxl_cmd_count, just open code it and
delete the helper.
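
For reference, DECLARE_BITMAP() is just a fixed-size array of longs, so
the bitmap storage now lives inside 'struct cxl_mem' itself; a sketch
of the expansion from <linux/types.h>:

	#define DECLARE_BITMAP(name, bits) \
		unsigned long name[BITS_TO_LONGS(bits)]

	/* i.e. the new member is equivalent to: */
	unsigned long enabled_cmds[BITS_TO_LONGS(CXL_MEM_COMMAND_ID_MAX)];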

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/mbox.c |   12 +-----------
 drivers/cxl/cxlmem.h    |    2 +-
 2 files changed, 2 insertions(+), 12 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 31e183991726..422999740649 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -315,8 +315,6 @@ static int cxl_validate_cmd_from_user(struct cxl_mem *cxlm,
 	return 0;
 }
 
-#define cxl_cmd_count ARRAY_SIZE(cxl_mem_commands)
-
 int cxl_query_cmd(struct cxl_memdev *cxlmd,
 		  struct cxl_mem_query_commands __user *q)
 {
@@ -332,7 +330,7 @@ int cxl_query_cmd(struct cxl_memdev *cxlmd,
 
 	/* returns the total number if 0 elements are requested. */
 	if (n_commands == 0)
-		return put_user(cxl_cmd_count, &q->n_commands);
+		return put_user(ARRAY_SIZE(cxl_mem_commands), &q->n_commands);
 
 	/*
 	 * otherwise, return max(n_commands, total commands) cxl_command_info
@@ -794,14 +792,6 @@ struct cxl_mem *cxl_mem_create(struct device *dev)
 
 	mutex_init(&cxlm->mbox_mutex);
 	cxlm->dev = dev;
-	cxlm->enabled_cmds =
-		devm_kmalloc_array(dev, BITS_TO_LONGS(cxl_cmd_count),
-				   sizeof(unsigned long),
-				   GFP_KERNEL | __GFP_ZERO);
-	if (!cxlm->enabled_cmds) {
-		dev_err(dev, "No memory available for bitmap\n");
-		return ERR_PTR(-ENOMEM);
-	}
 
 	return cxlm;
 }
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 35fd16d12532..16201b7d82d2 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -126,7 +126,7 @@ struct cxl_mem {
 	size_t lsa_size;
 	struct mutex mbox_mutex; /* Protects device mailbox and firmware */
 	char firmware_version[0x10];
-	unsigned long *enabled_cmds;
+	DECLARE_BITMAP(enabled_cmds, CXL_MEM_COMMAND_ID_MAX);
 
 	struct range pmem_range;
 	struct range ram_range;


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH v4 14/21] cxl/mbox: Add exclusive kernel command support
  2021-09-09  5:11 [PATCH v4 00/21] cxl_test: Enable CXL Topology and UAPI regression tests Dan Williams
                   ` (12 preceding siblings ...)
  2021-09-09  5:12 ` [PATCH v4 13/21] cxl/mbox: Convert 'enabled_cmds' to DECLARE_BITMAP Dan Williams
@ 2021-09-09  5:12 ` Dan Williams
  2021-09-09 17:02   ` Ben Widawsky
                     ` (2 more replies)
  2021-09-09  5:12 ` [PATCH v4 15/21] cxl/pmem: Translate NVDIMM label commands to CXL label commands Dan Williams
                   ` (6 subsequent siblings)
  20 siblings, 3 replies; 78+ messages in thread
From: Dan Williams @ 2021-09-09  5:12 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ben Widawsky, vishal.l.verma, nvdimm, ben.widawsky,
	alison.schofield, vishal.l.verma, ira.weiny, Jonathan.Cameron

The CXL_PMEM driver expects exclusive control of the label storage
area. Similar to the LIBNVDIMM expectation that the label storage area
is only writable from userspace when the corresponding memory device is
not active in any region, the expectation is that the native CXL_PCI UAPI
path is disabled while the cxl_nvdimm for a given cxl_memdev device is
active in LIBNVDIMM.

Add the ability to toggle the availability of a given command for the
UAPI path. Use that new capability to shut down changes to partitions and
the label storage area while the cxl_nvdimm device is actively proxying
commands for LIBNVDIMM.
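
The intended usage is symmetric across the cxl_nvdimm bind/unbind path;
an illustrative sketch (the command ids shown are the label and
partition commands this series cares about):

	static __read_mostly DECLARE_BITMAP(exclusive_cmds,
					    CXL_MEM_COMMAND_ID_MAX);

	/* at cxl_nvdimm probe */
	set_bit(CXL_MEM_COMMAND_ID_SET_PARTITION_INFO, exclusive_cmds);
	set_bit(CXL_MEM_COMMAND_ID_SET_LSA, exclusive_cmds);
	set_exclusive_cxl_commands(cxlm, exclusive_cmds);

	/* at cxl_nvdimm remove */
	clear_exclusive_cxl_commands(cxlm, exclusive_cmds);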

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/162982123298.1124374.22718002900700392.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/mbox.c   |    5 +++++
 drivers/cxl/core/memdev.c |   31 +++++++++++++++++++++++++++++++
 drivers/cxl/cxlmem.h      |    4 ++++
 drivers/cxl/pmem.c        |   43 ++++++++++++++++++++++++++++++++-----------
 4 files changed, 72 insertions(+), 11 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 422999740649..82e79da195fa 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -221,6 +221,7 @@ static bool cxl_mem_raw_command_allowed(u16 opcode)
  *  * %-EINVAL	- Reserved fields or invalid values were used.
  *  * %-ENOMEM	- Input or output buffer wasn't sized properly.
  *  * %-EPERM	- Attempted to use a protected command.
+ *  * %-EBUSY	- Kernel has claimed exclusive access to this opcode
  *
  * The result of this command is a fully validated command in @out_cmd that is
  * safe to send to the hardware.
@@ -296,6 +297,10 @@ static int cxl_validate_cmd_from_user(struct cxl_mem *cxlm,
 	if (!test_bit(info->id, cxlm->enabled_cmds))
 		return -ENOTTY;
 
+	/* Check that the command is not claimed for exclusive kernel use */
+	if (test_bit(info->id, cxlm->exclusive_cmds))
+		return -EBUSY;
+
 	/* Check the input buffer is the expected size */
 	if (info->size_in >= 0 && info->size_in != send_cmd->in.size)
 		return -ENOMEM;
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index df2ba87238c2..d9ade5b92330 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -134,6 +134,37 @@ static const struct device_type cxl_memdev_type = {
 	.groups = cxl_memdev_attribute_groups,
 };
 
+/**
+ * set_exclusive_cxl_commands() - atomically disable user cxl commands
+ * @cxlm: cxl_mem instance to modify
+ * @cmds: bitmap of commands to mark exclusive
+ *
+ * Flush the ioctl path and disable future execution of commands with
+ * the command ids set in @cmds.
+ */
+void set_exclusive_cxl_commands(struct cxl_mem *cxlm, unsigned long *cmds)
+{
+	down_write(&cxl_memdev_rwsem);
+	bitmap_or(cxlm->exclusive_cmds, cxlm->exclusive_cmds, cmds,
+		  CXL_MEM_COMMAND_ID_MAX);
+	up_write(&cxl_memdev_rwsem);
+}
+EXPORT_SYMBOL_GPL(set_exclusive_cxl_commands);
+
+/**
+ * clear_exclusive_cxl_commands() - atomically enable user cxl commands
+ * @cxlm: cxl_mem instance to modify
+ * @cmds: bitmap of commands to mark available for userspace
+ */
+void clear_exclusive_cxl_commands(struct cxl_mem *cxlm, unsigned long *cmds)
+{
+	down_write(&cxl_memdev_rwsem);
+	bitmap_andnot(cxlm->exclusive_cmds, cxlm->exclusive_cmds, cmds,
+		      CXL_MEM_COMMAND_ID_MAX);
+	up_write(&cxl_memdev_rwsem);
+}
+EXPORT_SYMBOL_GPL(clear_exclusive_cxl_commands);
+
 static void cxl_memdev_shutdown(struct device *dev)
 {
 	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 16201b7d82d2..468b7b8be207 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -101,6 +101,7 @@ struct cxl_mbox_cmd {
  * @mbox_mutex: Mutex to synchronize mailbox access.
  * @firmware_version: Firmware version for the memory device.
  * @enabled_cmds: Hardware commands found enabled in CEL.
+ * @exclusive_cmds: Commands that are kernel-internal only
  * @pmem_range: Active Persistent memory capacity configuration
  * @ram_range: Active Volatile memory capacity configuration
  * @total_bytes: sum of all possible capacities
@@ -127,6 +128,7 @@ struct cxl_mem {
 	struct mutex mbox_mutex; /* Protects device mailbox and firmware */
 	char firmware_version[0x10];
 	DECLARE_BITMAP(enabled_cmds, CXL_MEM_COMMAND_ID_MAX);
+	DECLARE_BITMAP(exclusive_cmds, CXL_MEM_COMMAND_ID_MAX);
 
 	struct range pmem_range;
 	struct range ram_range;
@@ -200,4 +202,6 @@ int cxl_mem_identify(struct cxl_mem *cxlm);
 int cxl_mem_enumerate_cmds(struct cxl_mem *cxlm);
 int cxl_mem_create_range_info(struct cxl_mem *cxlm);
 struct cxl_mem *cxl_mem_create(struct device *dev);
+void set_exclusive_cxl_commands(struct cxl_mem *cxlm, unsigned long *cmds);
+void clear_exclusive_cxl_commands(struct cxl_mem *cxlm, unsigned long *cmds);
 #endif /* __CXL_MEM_H__ */
diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
index 9652c3ee41e7..a972af7a6e0b 100644
--- a/drivers/cxl/pmem.c
+++ b/drivers/cxl/pmem.c
@@ -16,10 +16,7 @@
  */
 static struct workqueue_struct *cxl_pmem_wq;
 
-static void unregister_nvdimm(void *nvdimm)
-{
-	nvdimm_delete(nvdimm);
-}
+static __read_mostly DECLARE_BITMAP(exclusive_cmds, CXL_MEM_COMMAND_ID_MAX);
 
 static int match_nvdimm_bridge(struct device *dev, const void *data)
 {
@@ -36,12 +33,25 @@ static struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(void)
 	return to_cxl_nvdimm_bridge(dev);
 }
 
+static void cxl_nvdimm_remove(struct device *dev)
+{
+	struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev);
+	struct nvdimm *nvdimm = dev_get_drvdata(dev);
+	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
+	struct cxl_mem *cxlm = cxlmd->cxlm;
+
+	nvdimm_delete(nvdimm);
+	clear_exclusive_cxl_commands(cxlm, exclusive_cmds);
+}
+
 static int cxl_nvdimm_probe(struct device *dev)
 {
 	struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev);
+	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
+	struct cxl_mem *cxlm = cxlmd->cxlm;
 	struct cxl_nvdimm_bridge *cxl_nvb;
+	struct nvdimm *nvdimm = NULL;
 	unsigned long flags = 0;
-	struct nvdimm *nvdimm;
 	int rc = -ENXIO;
 
 	cxl_nvb = cxl_find_nvdimm_bridge();
@@ -50,25 +60,32 @@ static int cxl_nvdimm_probe(struct device *dev)
 
 	device_lock(&cxl_nvb->dev);
 	if (!cxl_nvb->nvdimm_bus)
-		goto out;
+		goto out_unlock;
+
+	set_exclusive_cxl_commands(cxlm, exclusive_cmds);
 
 	set_bit(NDD_LABELING, &flags);
+	rc = -ENOMEM;
 	nvdimm = nvdimm_create(cxl_nvb->nvdimm_bus, cxl_nvd, NULL, flags, 0, 0,
 			       NULL);
-	if (!nvdimm)
-		goto out;
+	dev_set_drvdata(dev, nvdimm);
 
-	rc = devm_add_action_or_reset(dev, unregister_nvdimm, nvdimm);
-out:
+out_unlock:
 	device_unlock(&cxl_nvb->dev);
 	put_device(&cxl_nvb->dev);
 
-	return rc;
+	if (!nvdimm) {
+		clear_exclusive_cxl_commands(cxlm, exclusive_cmds);
+		return rc;
+	}
+
+	return 0;
 }
 
 static struct cxl_driver cxl_nvdimm_driver = {
 	.name = "cxl_nvdimm",
 	.probe = cxl_nvdimm_probe,
+	.remove = cxl_nvdimm_remove,
 	.id = CXL_DEVICE_NVDIMM,
 };
 
@@ -194,6 +211,10 @@ static __init int cxl_pmem_init(void)
 {
 	int rc;
 
+	set_bit(CXL_MEM_COMMAND_ID_SET_PARTITION_INFO, exclusive_cmds);
+	set_bit(CXL_MEM_COMMAND_ID_SET_SHUTDOWN_STATE, exclusive_cmds);
+	set_bit(CXL_MEM_COMMAND_ID_SET_LSA, exclusive_cmds);
+
 	cxl_pmem_wq = alloc_ordered_workqueue("cxl_pmem", 0);
 	if (!cxl_pmem_wq)
 		return -ENXIO;


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH v4 15/21] cxl/pmem: Translate NVDIMM label commands to CXL label commands
  2021-09-09  5:11 [PATCH v4 00/21] cxl_test: Enable CXL Topology and UAPI regression tests Dan Williams
                   ` (13 preceding siblings ...)
  2021-09-09  5:12 ` [PATCH v4 14/21] cxl/mbox: Add exclusive kernel command support Dan Williams
@ 2021-09-09  5:12 ` Dan Williams
  2021-09-09 17:22   ` Ben Widawsky
                     ` (2 more replies)
  2021-09-09  5:12 ` [PATCH v4 16/21] cxl/pmem: Add support for multiple nvdimm-bridge objects Dan Williams
                   ` (5 subsequent siblings)
  20 siblings, 3 replies; 78+ messages in thread
From: Dan Williams @ 2021-09-09  5:12 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ben Widawsky, vishal.l.verma, nvdimm, ben.widawsky,
	alison.schofield, vishal.l.verma, ira.weiny, Jonathan.Cameron

The LIBNVDIMM IOCTL UAPI calls back to the nvdimm-bus-provider to
translate the Linux command payload to the device native command format.
The LIBNVDIMM commands get-config-size, get-config-data, and
set-config-data, map to the CXL memory device commands device-identify,
get-lsa, and set-lsa. Recall that the label-storage-area (LSA) on an
NVDIMM device arranges for the provisioning of namespaces. For CXL, the
LSA is additionally used to provision regions.

The data from device-identify is already cached in the 'struct cxl_mem'
instance associated with @cxl_nvd, so the response payload is simply
crafted from that cache and no CXL command is issued. The conversion
for get-lsa is
straightforward, but the conversion for set-lsa requires an allocation
to append the set-lsa header in front of the payload.
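
For illustration, a minimal userspace sketch of the get-config-size
path this translation services (device node name hypothetical, error
handling elided):

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <linux/ndctl.h>

    int main(void)
    {
        struct nd_cmd_get_config_size cmd = { 0 };
        int fd = open("/dev/nmem0", O_RDWR);

        if (fd >= 0 && ioctl(fd, ND_IOCTL_GET_CONFIG_SIZE, &cmd) == 0)
            printf("lsa: %u bytes, max_xfer: %u\n",
                   cmd.config_size, cmd.max_xfer);
        return 0;
    }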

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/pmem.c |  125 ++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 121 insertions(+), 4 deletions(-)

diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
index a972af7a6e0b..29d24f13aa73 100644
--- a/drivers/cxl/pmem.c
+++ b/drivers/cxl/pmem.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /* Copyright(c) 2021 Intel Corporation. All rights reserved. */
 #include <linux/libnvdimm.h>
+#include <asm/unaligned.h>
 #include <linux/device.h>
 #include <linux/module.h>
 #include <linux/ndctl.h>
@@ -48,10 +49,10 @@ static int cxl_nvdimm_probe(struct device *dev)
 {
 	struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev);
 	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
+	unsigned long flags = 0, cmd_mask = 0;
 	struct cxl_mem *cxlm = cxlmd->cxlm;
 	struct cxl_nvdimm_bridge *cxl_nvb;
 	struct nvdimm *nvdimm = NULL;
-	unsigned long flags = 0;
 	int rc = -ENXIO;
 
 	cxl_nvb = cxl_find_nvdimm_bridge();
@@ -66,8 +67,11 @@ static int cxl_nvdimm_probe(struct device *dev)
 
 	set_bit(NDD_LABELING, &flags);
 	rc = -ENOMEM;
-	nvdimm = nvdimm_create(cxl_nvb->nvdimm_bus, cxl_nvd, NULL, flags, 0, 0,
-			       NULL);
+	set_bit(ND_CMD_GET_CONFIG_SIZE, &cmd_mask);
+	set_bit(ND_CMD_GET_CONFIG_DATA, &cmd_mask);
+	set_bit(ND_CMD_SET_CONFIG_DATA, &cmd_mask);
+	nvdimm = nvdimm_create(cxl_nvb->nvdimm_bus, cxl_nvd, NULL, flags,
+			       cmd_mask, 0, NULL);
 	dev_set_drvdata(dev, nvdimm);
 
 out_unlock:
@@ -89,11 +93,124 @@ static struct cxl_driver cxl_nvdimm_driver = {
 	.id = CXL_DEVICE_NVDIMM,
 };
 
+static int cxl_pmem_get_config_size(struct cxl_mem *cxlm,
+				    struct nd_cmd_get_config_size *cmd,
+				    unsigned int buf_len, int *cmd_rc)
+{
+	if (sizeof(*cmd) > buf_len)
+		return -EINVAL;
+
+	*cmd = (struct nd_cmd_get_config_size) {
+		 .config_size = cxlm->lsa_size,
+		 .max_xfer = cxlm->payload_size,
+	};
+	*cmd_rc = 0;
+
+	return 0;
+}
+
+static int cxl_pmem_get_config_data(struct cxl_mem *cxlm,
+				    struct nd_cmd_get_config_data_hdr *cmd,
+				    unsigned int buf_len, int *cmd_rc)
+{
+	struct cxl_mbox_get_lsa {
+		u32 offset;
+		u32 length;
+	} get_lsa;
+	int rc;
+
+	if (sizeof(*cmd) > buf_len)
+		return -EINVAL;
+	if (struct_size(cmd, out_buf, cmd->in_length) > buf_len)
+		return -EINVAL;
+
+	get_lsa = (struct cxl_mbox_get_lsa) {
+		.offset = cmd->in_offset,
+		.length = cmd->in_length,
+	};
+
+	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_GET_LSA, &get_lsa,
+				   sizeof(get_lsa), cmd->out_buf,
+				   cmd->in_length);
+	cmd->status = 0;
+	*cmd_rc = 0;
+
+	return rc;
+}
+
+static int cxl_pmem_set_config_data(struct cxl_mem *cxlm,
+				    struct nd_cmd_set_config_hdr *cmd,
+				    unsigned int buf_len, int *cmd_rc)
+{
+	struct cxl_mbox_set_lsa {
+		u32 offset;
+		u32 reserved;
+		u8 data[];
+	} *set_lsa;
+	int rc;
+
+	if (sizeof(*cmd) > buf_len)
+		return -EINVAL;
+
+	/* 4-byte status follows the input data in the payload */
+	if (struct_size(cmd, in_buf, cmd->in_length) + 4 > buf_len)
+		return -EINVAL;
+
+	set_lsa =
+		kvzalloc(struct_size(set_lsa, data, cmd->in_length), GFP_KERNEL);
+	if (!set_lsa)
+		return -ENOMEM;
+
+	*set_lsa = (struct cxl_mbox_set_lsa) {
+		.offset = cmd->in_offset,
+	};
+	memcpy(set_lsa->data, cmd->in_buf, cmd->in_length);
+
+	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_SET_LSA, set_lsa,
+				   struct_size(set_lsa, data, cmd->in_length),
+				   NULL, 0);
+
+	/*
+	 * Set "firmware" status (4 packed bytes at the end of the input
+	 * payload).
+	 */
+	put_unaligned(0, (u32 *) &cmd->in_buf[cmd->in_length]);
+	*cmd_rc = 0;
+	kvfree(set_lsa);
+
+	return rc;
+}
+
+static int cxl_pmem_nvdimm_ctl(struct nvdimm *nvdimm, unsigned int cmd,
+			       void *buf, unsigned int buf_len, int *cmd_rc)
+{
+	struct cxl_nvdimm *cxl_nvd = nvdimm_provider_data(nvdimm);
+	unsigned long cmd_mask = nvdimm_cmd_mask(nvdimm);
+	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
+	struct cxl_mem *cxlm = cxlmd->cxlm;
+
+	if (!test_bit(cmd, &cmd_mask))
+		return -ENOTTY;
+
+	switch (cmd) {
+	case ND_CMD_GET_CONFIG_SIZE:
+		return cxl_pmem_get_config_size(cxlm, buf, buf_len, cmd_rc);
+	case ND_CMD_GET_CONFIG_DATA:
+		return cxl_pmem_get_config_data(cxlm, buf, buf_len, cmd_rc);
+	case ND_CMD_SET_CONFIG_DATA:
+		return cxl_pmem_set_config_data(cxlm, buf, buf_len, cmd_rc);
+	default:
+		return -ENOTTY;
+	}
+}
+
 static int cxl_pmem_ctl(struct nvdimm_bus_descriptor *nd_desc,
 			struct nvdimm *nvdimm, unsigned int cmd, void *buf,
 			unsigned int buf_len, int *cmd_rc)
 {
-	return -ENOTTY;
+	if (!nvdimm)
+		return -ENOTTY;
+	return cxl_pmem_nvdimm_ctl(nvdimm, cmd, buf, buf_len, cmd_rc);
 }
 
 static bool online_nvdimm_bus(struct cxl_nvdimm_bridge *cxl_nvb)


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH v4 16/21] cxl/pmem: Add support for multiple nvdimm-bridge objects
  2021-09-09  5:11 [PATCH v4 00/21] cxl_test: Enable CXL Topology and UAPI regression tests Dan Williams
                   ` (14 preceding siblings ...)
  2021-09-09  5:12 ` [PATCH v4 15/21] cxl/pmem: Translate NVDIMM label commands to CXL label commands Dan Williams
@ 2021-09-09  5:12 ` Dan Williams
  2021-09-09 22:03   ` Dan Williams
  2021-09-14 19:08   ` [PATCH v5 " Dan Williams
  2021-09-09  5:13 ` [PATCH v4 17/21] tools/testing/cxl: Introduce a mocked-up CXL port hierarchy Dan Williams
                   ` (4 subsequent siblings)
  20 siblings, 2 replies; 78+ messages in thread
From: Dan Williams @ 2021-09-09  5:12 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ben Widawsky, vishal.l.verma, nvdimm, ben.widawsky,
	alison.schofield, vishal.l.verma, ira.weiny, Jonathan.Cameron

In preparation for a mocked unit test environment for CXL objects, allow
for multiple unique nvdimm-bridge objects.

For now, just allow multiple bridges to be registered. Later, when
multiple bridges are present, further updates will be needed in
cxl_find_nvdimm_bridge() to identify which bridge is associated with
which CXL hierarchy for nvdimm registration.

Note that this does change the kernel device-name for the bridge object.
User space should not have any attachment to the device name at this
point as it is still early days in the CXL driver development.
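
For reference, the resulting device names move from "nvdimm-bridge" to
"nvdimm-bridge0", "nvdimm-bridge1", etc. A minimal sketch of the id
allocation pattern adopted here (error paths elided):

    static DEFINE_IDA(cxl_nvdimm_bridge_ida);

    /* at bridge allocation time */
    id = ida_alloc(&cxl_nvdimm_bridge_ida, GFP_KERNEL);
    dev_set_name(dev, "nvdimm-bridge%d", id);

    /* in the device release path */
    ida_free(&cxl_nvdimm_bridge_ida, id);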

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/pmem.c |   32 +++++++++++++++++++++++++++++++-
 drivers/cxl/cxl.h       |    2 ++
 drivers/cxl/pmem.c      |   15 ---------------
 3 files changed, 33 insertions(+), 16 deletions(-)

diff --git a/drivers/cxl/core/pmem.c b/drivers/cxl/core/pmem.c
index d24570f5b8ba..9e56be3994f1 100644
--- a/drivers/cxl/core/pmem.c
+++ b/drivers/cxl/core/pmem.c
@@ -2,6 +2,7 @@
 /* Copyright(c) 2020 Intel Corporation. */
 #include <linux/device.h>
 #include <linux/slab.h>
+#include <linux/idr.h>
 #include <cxlmem.h>
 #include <cxl.h>
 #include "core.h"
@@ -20,10 +21,13 @@
  * operations, for example, namespace label access commands.
  */
 
+static DEFINE_IDA(cxl_nvdimm_bridge_ida);
+
 static void cxl_nvdimm_bridge_release(struct device *dev)
 {
 	struct cxl_nvdimm_bridge *cxl_nvb = to_cxl_nvdimm_bridge(dev);
 
+	ida_free(&cxl_nvdimm_bridge_ida, cxl_nvb->id);
 	kfree(cxl_nvb);
 }
 
@@ -47,16 +51,38 @@ struct cxl_nvdimm_bridge *to_cxl_nvdimm_bridge(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(to_cxl_nvdimm_bridge);
 
+static int match_nvdimm_bridge(struct device *dev, const void *data)
+{
+	return dev->type == &cxl_nvdimm_bridge_type;
+}
+
+struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(void)
+{
+	struct device *dev;
+
+	dev = bus_find_device(&cxl_bus_type, NULL, NULL, match_nvdimm_bridge);
+	if (!dev)
+		return NULL;
+	return to_cxl_nvdimm_bridge(dev);
+}
+EXPORT_SYMBOL_GPL(cxl_find_nvdimm_bridge);
+
 static struct cxl_nvdimm_bridge *
 cxl_nvdimm_bridge_alloc(struct cxl_port *port)
 {
 	struct cxl_nvdimm_bridge *cxl_nvb;
 	struct device *dev;
+	int rc;
 
 	cxl_nvb = kzalloc(sizeof(*cxl_nvb), GFP_KERNEL);
 	if (!cxl_nvb)
 		return ERR_PTR(-ENOMEM);
 
+	rc = ida_alloc(&cxl_nvdimm_bridge_ida, GFP_KERNEL);
+	if (rc < 0)
+		goto err;
+	cxl_nvb->id = rc;
+
 	dev = &cxl_nvb->dev;
 	cxl_nvb->port = port;
 	cxl_nvb->state = CXL_NVB_NEW;
@@ -67,6 +93,10 @@ cxl_nvdimm_bridge_alloc(struct cxl_port *port)
 	dev->type = &cxl_nvdimm_bridge_type;
 
 	return cxl_nvb;
+
+err:
+	kfree(cxl_nvb);
+	return ERR_PTR(rc);
 }
 
 static void unregister_nvb(void *_cxl_nvb)
@@ -119,7 +149,7 @@ struct cxl_nvdimm_bridge *devm_cxl_add_nvdimm_bridge(struct device *host,
 		return cxl_nvb;
 
 	dev = &cxl_nvb->dev;
-	rc = dev_set_name(dev, "nvdimm-bridge");
+	rc = dev_set_name(dev, "nvdimm-bridge%d", cxl_nvb->id);
 	if (rc)
 		goto err;
 
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 53927f9fa77e..1b2e816e061e 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -211,6 +211,7 @@ enum cxl_nvdimm_brige_state {
 };
 
 struct cxl_nvdimm_bridge {
+	int id;
 	struct device dev;
 	struct cxl_port *port;
 	struct nvdimm_bus *nvdimm_bus;
@@ -323,4 +324,5 @@ struct cxl_nvdimm_bridge *devm_cxl_add_nvdimm_bridge(struct device *host,
 struct cxl_nvdimm *to_cxl_nvdimm(struct device *dev);
 bool is_cxl_nvdimm(struct device *dev);
 int devm_cxl_add_nvdimm(struct device *host, struct cxl_memdev *cxlmd);
+struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(void);
 #endif /* __CXL_H__ */
diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
index 29d24f13aa73..bb30d3e55825 100644
--- a/drivers/cxl/pmem.c
+++ b/drivers/cxl/pmem.c
@@ -19,21 +19,6 @@ static struct workqueue_struct *cxl_pmem_wq;
 
 static __read_mostly DECLARE_BITMAP(exclusive_cmds, CXL_MEM_COMMAND_ID_MAX);
 
-static int match_nvdimm_bridge(struct device *dev, const void *data)
-{
-	return strcmp(dev_name(dev), "nvdimm-bridge") == 0;
-}
-
-static struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(void)
-{
-	struct device *dev;
-
-	dev = bus_find_device(&cxl_bus_type, NULL, NULL, match_nvdimm_bridge);
-	if (!dev)
-		return NULL;
-	return to_cxl_nvdimm_bridge(dev);
-}
-
 static void cxl_nvdimm_remove(struct device *dev)
 {
 	struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev);


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH v4 17/21] tools/testing/cxl: Introduce a mocked-up CXL port hierarchy
  2021-09-09  5:11 [PATCH v4 00/21] cxl_test: Enable CXL Topology and UAPI regression tests Dan Williams
                   ` (15 preceding siblings ...)
  2021-09-09  5:12 ` [PATCH v4 16/21] cxl/pmem: Add support for multiple nvdimm-bridge objects Dan Williams
@ 2021-09-09  5:13 ` Dan Williams
  2021-09-10  9:53   ` Jonathan Cameron
  2021-09-14 19:14   ` [PATCH v5 " Dan Williams
  2021-09-09  5:13 ` [PATCH v4 18/21] cxl/bus: Populate the target list at decoder create Dan Williams
                   ` (3 subsequent siblings)
  20 siblings, 2 replies; 78+ messages in thread
From: Dan Williams @ 2021-09-09  5:13 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ben Widawsky, Vishal Verma, nvdimm, ben.widawsky,
	alison.schofield, vishal.l.verma, ira.weiny, Jonathan.Cameron

Create an environment for CXL plumbing unit tests. Especially when it
comes to an algorithm for HDM Decoder (Host-managed Device Memory
Decoder) programming, the availability of an in-kernel-tree emulation
environment for CXL configuration complexity and corner cases speeds
development and deters regressions.

The approach taken mirrors what was done for tools/testing/nvdimm/. I.e.
an external module, cxl_test.ko built out of the tools/testing/cxl/
directory, provides mock implementations of kernel APIs and kernel
objects to simulate a real world device hierarchy.

One piece of feedback on the tools/testing/nvdimm/ proposal was "why
not do this in QEMU?". In fact, the CXL development community has
developed a QEMU
model for CXL [1]. However, there are a few blocking issues that keep
QEMU from being a tight fit for topology + provisioning unit tests:

1/ The QEMU community has yet to show interest in merging any of this
   support that has had patches on the list since November 2020. So,
   testing CXL to date involves building custom QEMU with out-of-tree
   patches.

2/ CXL mechanisms like cross-host-bridge interleave do not have a clear
   path to be emulated by QEMU without major infrastructure work. This
   is easier to achieve with the alloc_mock_res() approach taken in this
   patch to shortcut-define emulated system physical address ranges with
   interleave behavior.

The QEMU enabling has been critical to get the driver off the ground,
and may still move forward, but it does not address the ongoing needs of
a regression testing environment and test-driven development.

This patch adds an ACPI CXL Platform definition with emulated CXL
multi-ported host-bridges. A follow-on patch adds emulated memory
expander devices.
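
For reference, the mocking relies on two override mechanisms. Symbols
marked __mock compile as 'static' in production builds, while the test
Kbuild passes -D__mock=__weak so a 'strong' out-of-tree definition can
take precedence at link time, e.g.:

    __mock struct acpi_device *to_cxl_host_bridge(struct device *host,
                                                  struct device *dev);

Exported kernel APIs are instead intercepted with the linker --wrap
facility (ldflags-y += --wrap=acpi_get_table), which redirects callers
to __wrap_acpi_get_table() where the registered cxl_mock_ops can be
consulted before falling back to the real implementation.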

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/20210202005948.241655-1-ben.widawsky@intel.com [1]
Link: https://lore.kernel.org/r/162982125348.1124374.17808192318402734926.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/acpi.c               |   40 ++-
 drivers/cxl/cxl.h                |   16 +
 tools/testing/cxl/Kbuild         |   36 +++
 tools/testing/cxl/config_check.c |   13 +
 tools/testing/cxl/mock_acpi.c    |  109 ++++++++
 tools/testing/cxl/test/Kbuild    |    6 
 tools/testing/cxl/test/cxl.c     |  509 ++++++++++++++++++++++++++++++++++++++
 tools/testing/cxl/test/mock.c    |  171 +++++++++++++
 tools/testing/cxl/test/mock.h    |   27 ++
 9 files changed, 911 insertions(+), 16 deletions(-)
 create mode 100644 tools/testing/cxl/Kbuild
 create mode 100644 tools/testing/cxl/config_check.c
 create mode 100644 tools/testing/cxl/mock_acpi.c
 create mode 100644 tools/testing/cxl/test/Kbuild
 create mode 100644 tools/testing/cxl/test/cxl.c
 create mode 100644 tools/testing/cxl/test/mock.c
 create mode 100644 tools/testing/cxl/test/mock.h

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index 54e9d4d2cf5f..d31a97218593 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -182,15 +182,7 @@ static resource_size_t get_chbcr(struct acpi_cedt_chbs *chbs)
 	return IS_ERR(chbs) ? CXL_RESOURCE_NONE : chbs->base;
 }
 
-struct cxl_walk_context {
-	struct device *dev;
-	struct pci_bus *root;
-	struct cxl_port *port;
-	int error;
-	int count;
-};
-
-static int match_add_root_ports(struct pci_dev *pdev, void *data)
+__mock int match_add_root_ports(struct pci_dev *pdev, void *data)
 {
 	struct cxl_walk_context *ctx = data;
 	struct pci_bus *root_bus = ctx->root;
@@ -239,15 +231,18 @@ static struct cxl_dport *find_dport_by_dev(struct cxl_port *port, struct device
 	return NULL;
 }
 
-static struct acpi_device *to_cxl_host_bridge(struct device *dev)
+__mock struct acpi_device *to_cxl_host_bridge(struct device *host,
+					      struct device *dev)
 {
 	struct acpi_device *adev = to_acpi_device(dev);
 
 	if (!acpi_pci_find_root(adev->handle))
 		return NULL;
 
-	if (strcmp(acpi_device_hid(adev), "ACPI0016") == 0)
+	if (strcmp(acpi_device_hid(adev), "ACPI0016") == 0) {
+		dev_dbg(host, "found host bridge %s\n", dev_name(&adev->dev));
 		return adev;
+	}
 	return NULL;
 }
 
@@ -257,9 +252,9 @@ static struct acpi_device *to_cxl_host_bridge(struct device *dev)
  */
 static int add_host_bridge_uport(struct device *match, void *arg)
 {
-	struct acpi_device *bridge = to_cxl_host_bridge(match);
 	struct cxl_port *root_port = arg;
 	struct device *host = root_port->dev.parent;
+	struct acpi_device *bridge = to_cxl_host_bridge(host, match);
 	struct acpi_pci_root *pci_root;
 	struct cxl_walk_context ctx;
 	struct cxl_decoder *cxld;
@@ -323,7 +318,7 @@ static int add_host_bridge_dport(struct device *match, void *arg)
 	struct acpi_cedt_chbs *chbs;
 	struct cxl_port *root_port = arg;
 	struct device *host = root_port->dev.parent;
-	struct acpi_device *bridge = to_cxl_host_bridge(match);
+	struct acpi_device *bridge = to_cxl_host_bridge(host, match);
 
 	if (!bridge)
 		return 0;
@@ -375,6 +370,17 @@ static int add_root_nvdimm_bridge(struct device *match, void *data)
 	return 1;
 }
 
+static u32 cedt_instance(struct platform_device *pdev)
+{
+	const bool *native_acpi0017 = acpi_device_get_match_data(&pdev->dev);
+
+	if (native_acpi0017 && *native_acpi0017)
+		return 0;
+
+	/* for cxl_test request a non-canonical instance */
+	return U32_MAX;
+}
+
 static int cxl_acpi_probe(struct platform_device *pdev)
 {
 	int rc;
@@ -388,7 +394,7 @@ static int cxl_acpi_probe(struct platform_device *pdev)
 		return PTR_ERR(root_port);
 	dev_dbg(host, "add: %s\n", dev_name(&root_port->dev));
 
-	status = acpi_get_table(ACPI_SIG_CEDT, 0, &acpi_cedt);
+	status = acpi_get_table(ACPI_SIG_CEDT, cedt_instance(pdev), &acpi_cedt);
 	if (ACPI_FAILURE(status))
 		return -ENXIO;
 
@@ -419,9 +425,11 @@ static int cxl_acpi_probe(struct platform_device *pdev)
 	return 0;
 }
 
+static bool native_acpi0017 = true;
+
 static const struct acpi_device_id cxl_acpi_ids[] = {
-	{ "ACPI0017", 0 },
-	{ "", 0 },
+	{ "ACPI0017", (unsigned long) &native_acpi0017 },
+	{ },
 };
 MODULE_DEVICE_TABLE(acpi, cxl_acpi_ids);
 
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 1b2e816e061e..c5152718267e 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -226,6 +226,14 @@ struct cxl_nvdimm {
 	struct nvdimm *nvdimm;
 };
 
+struct cxl_walk_context {
+	struct device *dev;
+	struct pci_bus *root;
+	struct cxl_port *port;
+	int error;
+	int count;
+};
+
 /**
  * struct cxl_port - logical collection of upstream port devices and
  *		     downstream port devices to construct a CXL memory
@@ -325,4 +333,12 @@ struct cxl_nvdimm *to_cxl_nvdimm(struct device *dev);
 bool is_cxl_nvdimm(struct device *dev);
 int devm_cxl_add_nvdimm(struct device *host, struct cxl_memdev *cxlmd);
 struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(void);
+
+/*
+ * Unit test builds override this to __weak; find the 'strong' versions
+ * of these symbols in tools/testing/cxl/.
+ */
+#ifndef __mock
+#define __mock static
+#endif
 #endif /* __CXL_H__ */
diff --git a/tools/testing/cxl/Kbuild b/tools/testing/cxl/Kbuild
new file mode 100644
index 000000000000..63a4a07e71c4
--- /dev/null
+++ b/tools/testing/cxl/Kbuild
@@ -0,0 +1,36 @@
+# SPDX-License-Identifier: GPL-2.0
+ldflags-y += --wrap=is_acpi_device_node
+ldflags-y += --wrap=acpi_get_table
+ldflags-y += --wrap=acpi_put_table
+ldflags-y += --wrap=acpi_evaluate_integer
+ldflags-y += --wrap=acpi_pci_find_root
+ldflags-y += --wrap=pci_walk_bus
+ldflags-y += --wrap=nvdimm_bus_register
+
+DRIVERS := ../../../drivers
+CXL_SRC := $(DRIVERS)/cxl
+CXL_CORE_SRC := $(DRIVERS)/cxl/core
+ccflags-y := -I$(srctree)/drivers/cxl/
+ccflags-y += -D__mock=__weak
+
+obj-m += cxl_acpi.o
+
+cxl_acpi-y := $(CXL_SRC)/acpi.o
+cxl_acpi-y += mock_acpi.o
+cxl_acpi-y += config_check.o
+
+obj-m += cxl_pmem.o
+
+cxl_pmem-y := $(CXL_SRC)/pmem.o
+cxl_pmem-y += config_check.o
+
+obj-m += cxl_core.o
+
+cxl_core-y := $(CXL_CORE_SRC)/bus.o
+cxl_core-y += $(CXL_CORE_SRC)/pmem.o
+cxl_core-y += $(CXL_CORE_SRC)/regs.o
+cxl_core-y += $(CXL_CORE_SRC)/memdev.o
+cxl_core-y += $(CXL_CORE_SRC)/mbox.o
+cxl_core-y += config_check.o
+
+obj-m += test/
diff --git a/tools/testing/cxl/config_check.c b/tools/testing/cxl/config_check.c
new file mode 100644
index 000000000000..de5e5b3652fd
--- /dev/null
+++ b/tools/testing/cxl/config_check.c
@@ -0,0 +1,13 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/bug.h>
+
+void check(void)
+{
+	/*
+	 * These kconfig symbols must be set to "m" for cxl_test to load
+	 * and operate.
+	 */
+	BUILD_BUG_ON(!IS_MODULE(CONFIG_CXL_BUS));
+	BUILD_BUG_ON(!IS_MODULE(CONFIG_CXL_ACPI));
+	BUILD_BUG_ON(!IS_MODULE(CONFIG_CXL_PMEM));
+}
diff --git a/tools/testing/cxl/mock_acpi.c b/tools/testing/cxl/mock_acpi.c
new file mode 100644
index 000000000000..4c8a493ace56
--- /dev/null
+++ b/tools/testing/cxl/mock_acpi.c
@@ -0,0 +1,109 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
+
+#include <linux/platform_device.h>
+#include <linux/device.h>
+#include <linux/acpi.h>
+#include <linux/pci.h>
+#include <cxl.h>
+#include "test/mock.h"
+
+struct acpi_device *to_cxl_host_bridge(struct device *host, struct device *dev)
+{
+	int index;
+	struct acpi_device *adev, *found = NULL;
+	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
+
+	if (ops && ops->is_mock_bridge(dev)) {
+		found = ACPI_COMPANION(dev);
+		goto out;
+	}
+
+	if (dev->bus == &platform_bus_type)
+		goto out;
+
+	adev = to_acpi_device(dev);
+	if (!acpi_pci_find_root(adev->handle))
+		goto out;
+
+	if (strcmp(acpi_device_hid(adev), "ACPI0016") == 0) {
+		found = adev;
+		dev_dbg(host, "found host bridge %s\n", dev_name(&adev->dev));
+	}
+out:
+	put_cxl_mock_ops(index);
+	return found;
+}
+
+static int match_add_root_port(struct pci_dev *pdev, void *data)
+{
+	struct cxl_walk_context *ctx = data;
+	struct pci_bus *root_bus = ctx->root;
+	struct cxl_port *port = ctx->port;
+	int type = pci_pcie_type(pdev);
+	struct device *dev = ctx->dev;
+	u32 lnkcap, port_num;
+	int rc;
+
+	if (pdev->bus != root_bus)
+		return 0;
+	if (!pci_is_pcie(pdev))
+		return 0;
+	if (type != PCI_EXP_TYPE_ROOT_PORT)
+		return 0;
+	if (pci_read_config_dword(pdev, pci_pcie_cap(pdev) + PCI_EXP_LNKCAP,
+				  &lnkcap) != PCIBIOS_SUCCESSFUL)
+		return 0;
+
+	/* TODO walk DVSEC to find component register base */
+	port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
+	rc = cxl_add_dport(port, &pdev->dev, port_num, CXL_RESOURCE_NONE);
+	if (rc) {
+		dev_err(dev, "failed to add dport: %s (%d)\n",
+			dev_name(&pdev->dev), rc);
+		ctx->error = rc;
+		return rc;
+	}
+	ctx->count++;
+
+	dev_dbg(dev, "add dport%d: %s\n", port_num, dev_name(&pdev->dev));
+
+	return 0;
+}
+
+static int mock_add_root_port(struct platform_device *pdev, void *data)
+{
+	struct cxl_walk_context *ctx = data;
+	struct cxl_port *port = ctx->port;
+	struct device *dev = ctx->dev;
+	int rc;
+
+	rc = cxl_add_dport(port, &pdev->dev, pdev->id, CXL_RESOURCE_NONE);
+	if (rc) {
+		dev_err(dev, "failed to add dport: %s (%d)\n",
+			dev_name(&pdev->dev), rc);
+		ctx->error = rc;
+		return rc;
+	}
+	ctx->count++;
+
+	dev_dbg(dev, "add dport%d: %s\n", pdev->id, dev_name(&pdev->dev));
+
+	return 0;
+}
+
+int match_add_root_ports(struct pci_dev *dev, void *data)
+{
+	int index, rc;
+	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
+	struct platform_device *pdev = (struct platform_device *) dev;
+
+	if (ops && ops->is_mock_port(pdev))
+		rc = mock_add_root_port(pdev, data);
+	else
+		rc = match_add_root_port(dev, data);
+
+	put_cxl_mock_ops(index);
+
+	return rc;
+}
diff --git a/tools/testing/cxl/test/Kbuild b/tools/testing/cxl/test/Kbuild
new file mode 100644
index 000000000000..7de4ddecfd21
--- /dev/null
+++ b/tools/testing/cxl/test/Kbuild
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-m += cxl_test.o
+obj-m += cxl_mock.o
+
+cxl_test-y := cxl.o
+cxl_mock-y := mock.o
diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c
new file mode 100644
index 000000000000..1c47b34244a4
--- /dev/null
+++ b/tools/testing/cxl/test/cxl.c
@@ -0,0 +1,509 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright(c) 2021 Intel Corporation. All rights reserved.
+
+#include <linux/platform_device.h>
+#include <linux/genalloc.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/acpi.h>
+#include <linux/pci.h>
+#include <linux/mm.h>
+#include "mock.h"
+
+#define NR_CXL_HOST_BRIDGES 4
+#define NR_CXL_ROOT_PORTS 2
+
+static struct platform_device *cxl_acpi;
+static struct platform_device *cxl_host_bridge[NR_CXL_HOST_BRIDGES];
+static struct platform_device
+	*cxl_root_port[NR_CXL_HOST_BRIDGES * NR_CXL_ROOT_PORTS];
+
+static struct acpi_device acpi0017_mock;
+static struct acpi_device host_bridge[NR_CXL_HOST_BRIDGES] = {
+	[0] = {
+		.handle = &host_bridge[0],
+	},
+	[1] = {
+		.handle = &host_bridge[1],
+	},
+	[2] = {
+		.handle = &host_bridge[2],
+	},
+	[3] = {
+		.handle = &host_bridge[3],
+	},
+};
+
+static bool is_mock_dev(struct device *dev)
+{
+	if (dev == &cxl_acpi->dev)
+		return true;
+	return false;
+}
+
+static bool is_mock_adev(struct acpi_device *adev)
+{
+	int i;
+
+	if (adev == &acpi0017_mock)
+		return true;
+
+	for (i = 0; i < ARRAY_SIZE(host_bridge); i++)
+		if (adev == &host_bridge[i])
+			return true;
+
+	return false;
+}
+
+static struct {
+	struct acpi_table_cedt cedt;
+	struct acpi_cedt_chbs chbs[NR_CXL_HOST_BRIDGES];
+	struct {
+		struct acpi_cedt_cfmws cfmws;
+		u32 target[1];
+	} cfmws0;
+	struct {
+		struct acpi_cedt_cfmws cfmws;
+		u32 target[4];
+	} cfmws1;
+	struct {
+		struct acpi_cedt_cfmws cfmws;
+		u32 target[1];
+	} cfmws2;
+	struct {
+		struct acpi_cedt_cfmws cfmws;
+		u32 target[4];
+	} cfmws3;
+} __packed mock_cedt = {
+	.cedt = {
+		.header = {
+			.signature = "CEDT",
+			.length = sizeof(mock_cedt),
+			.revision = 1,
+		},
+	},
+	.chbs[0] = {
+		.header = {
+			.type = ACPI_CEDT_TYPE_CHBS,
+			.length = sizeof(mock_cedt.chbs[0]),
+		},
+		.uid = 0,
+		.cxl_version = ACPI_CEDT_CHBS_VERSION_CXL20,
+	},
+	.chbs[1] = {
+		.header = {
+			.type = ACPI_CEDT_TYPE_CHBS,
+			.length = sizeof(mock_cedt.chbs[0]),
+		},
+		.uid = 1,
+		.cxl_version = ACPI_CEDT_CHBS_VERSION_CXL20,
+	},
+	.chbs[2] = {
+		.header = {
+			.type = ACPI_CEDT_TYPE_CHBS,
+			.length = sizeof(mock_cedt.chbs[0]),
+		},
+		.uid = 2,
+		.cxl_version = ACPI_CEDT_CHBS_VERSION_CXL20,
+	},
+	.chbs[3] = {
+		.header = {
+			.type = ACPI_CEDT_TYPE_CHBS,
+			.length = sizeof(mock_cedt.chbs[0]),
+		},
+		.uid = 3,
+		.cxl_version = ACPI_CEDT_CHBS_VERSION_CXL20,
+	},
+	.cfmws0 = {
+		.cfmws = {
+			.header = {
+				.type = ACPI_CEDT_TYPE_CFMWS,
+				.length = sizeof(mock_cedt.cfmws0),
+			},
+			.interleave_ways = 0,
+			.granularity = 4,
+			.restrictions = ACPI_CEDT_CFMWS_RESTRICT_TYPE3 |
+					ACPI_CEDT_CFMWS_RESTRICT_VOLATILE,
+			.qtg_id = 0,
+			.window_size = SZ_256M,
+		},
+		.target = { 0 },
+	},
+	.cfmws1 = {
+		.cfmws = {
+			.header = {
+				.type = ACPI_CEDT_TYPE_CFMWS,
+				.length = sizeof(mock_cedt.cfmws1),
+			},
+			.interleave_ways = 2,
+			.granularity = 4,
+			.restrictions = ACPI_CEDT_CFMWS_RESTRICT_TYPE3 |
+					ACPI_CEDT_CFMWS_RESTRICT_VOLATILE,
+			.qtg_id = 1,
+			.window_size = SZ_256M * 4,
+		},
+		.target = { 0, 1, 2, 3 },
+	},
+	.cfmws2 = {
+		.cfmws = {
+			.header = {
+				.type = ACPI_CEDT_TYPE_CFMWS,
+				.length = sizeof(mock_cedt.cfmws2),
+			},
+			.interleave_ways = 0,
+			.granularity = 4,
+			.restrictions = ACPI_CEDT_CFMWS_RESTRICT_TYPE3 |
+					ACPI_CEDT_CFMWS_RESTRICT_PMEM,
+			.qtg_id = 2,
+			.window_size = SZ_256M,
+		},
+		.target = { 0 },
+	},
+	.cfmws3 = {
+		.cfmws = {
+			.header = {
+				.type = ACPI_CEDT_TYPE_CFMWS,
+				.length = sizeof(mock_cedt.cfmws3),
+			},
+			.interleave_ways = 2,
+			.granularity = 4,
+			.restrictions = ACPI_CEDT_CFMWS_RESTRICT_TYPE3 |
+					ACPI_CEDT_CFMWS_RESTRICT_PMEM,
+			.qtg_id = 3,
+			.window_size = SZ_256M * 4,
+		},
+		.target = { 0, 1, 2, 3 },
+	},
+};
+
+struct cxl_mock_res {
+	struct list_head list;
+	struct range range;
+};
+
+static LIST_HEAD(mock_res);
+static DEFINE_MUTEX(mock_res_lock);
+static struct gen_pool *cxl_mock_pool;
+
+static void depopulate_all_mock_resources(void)
+{
+	struct cxl_mock_res *res, *_res;
+
+	mutex_lock(&mock_res_lock);
+	list_for_each_entry_safe(res, _res, &mock_res, list) {
+		gen_pool_free(cxl_mock_pool, res->range.start,
+			      range_len(&res->range));
+		list_del(&res->list);
+		kfree(res);
+	}
+	mutex_unlock(&mock_res_lock);
+}
+
+static struct cxl_mock_res *alloc_mock_res(resource_size_t size)
+{
+	struct cxl_mock_res *res = kzalloc(sizeof(*res), GFP_KERNEL);
+	struct genpool_data_align data = {
+		.align = SZ_256M,
+	};
+	unsigned long phys;
+
+	INIT_LIST_HEAD(&res->list);
+	phys = gen_pool_alloc_algo(cxl_mock_pool, size,
+				   gen_pool_first_fit_align, &data);
+	if (!phys)
+		return NULL;
+
+	res->range = (struct range) {
+		.start = phys,
+		.end = phys + size - 1,
+	};
+	mutex_lock(&mock_res_lock);
+	list_add(&res->list, &mock_res);
+	mutex_unlock(&mock_res_lock);
+
+	return res;
+}
+
+static int populate_cedt(void)
+{
+	struct acpi_cedt_cfmws *cfmws[4] = {
+		[0] = &mock_cedt.cfmws0.cfmws,
+		[1] = &mock_cedt.cfmws1.cfmws,
+		[2] = &mock_cedt.cfmws2.cfmws,
+		[3] = &mock_cedt.cfmws3.cfmws,
+	};
+	struct cxl_mock_res *res;
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(mock_cedt.chbs); i++) {
+		struct acpi_cedt_chbs *chbs = &mock_cedt.chbs[i];
+		resource_size_t size;
+
+		if (chbs->cxl_version == ACPI_CEDT_CHBS_VERSION_CXL20)
+			size = ACPI_CEDT_CHBS_LENGTH_CXL20;
+		else
+			size = ACPI_CEDT_CHBS_LENGTH_CXL11;
+
+		res = alloc_mock_res(size);
+		if (!res)
+			return -ENOMEM;
+		chbs->base = res->range.start;
+		chbs->length = size;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(cfmws); i++) {
+		struct acpi_cedt_cfmws *window = cfmws[i];
+
+		res = alloc_mock_res(window->window_size);
+		if (!res)
+			return -ENOMEM;
+		window->base_hpa = res->range.start;
+	}
+
+	return 0;
+}
+
+static acpi_status mock_acpi_get_table(char *signature, u32 instance,
+				       struct acpi_table_header **out_table)
+{
+	if (instance < U32_MAX || strcmp(signature, ACPI_SIG_CEDT) != 0)
+		return acpi_get_table(signature, instance, out_table);
+
+	*out_table = (struct acpi_table_header *) &mock_cedt;
+	return AE_OK;
+}
+
+static void mock_acpi_put_table(struct acpi_table_header *table)
+{
+	if (table == (struct acpi_table_header *) &mock_cedt)
+		return;
+	acpi_put_table(table);
+}
+
+static bool is_mock_bridge(struct device *dev)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(cxl_host_bridge); i++)
+		if (dev == &cxl_host_bridge[i]->dev)
+			return true;
+
+	return false;
+}
+
+static int host_bridge_index(struct acpi_device *adev)
+{
+	return adev - host_bridge;
+}
+
+static struct acpi_device *find_host_bridge(acpi_handle handle)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(host_bridge); i++)
+		if (handle == host_bridge[i].handle)
+			return &host_bridge[i];
+	return NULL;
+}
+
+static acpi_status
+mock_acpi_evaluate_integer(acpi_handle handle, acpi_string pathname,
+			   struct acpi_object_list *arguments,
+			   unsigned long long *data)
+{
+	struct acpi_device *adev = find_host_bridge(handle);
+
+	if (!adev || strcmp(pathname, METHOD_NAME__UID) != 0)
+		return acpi_evaluate_integer(handle, pathname, arguments, data);
+
+	*data = host_bridge_index(adev);
+	return AE_OK;
+}
+
+static struct pci_bus mock_pci_bus[NR_CXL_HOST_BRIDGES];
+static struct acpi_pci_root mock_pci_root[NR_CXL_HOST_BRIDGES] = {
+	[0] = {
+		.bus = &mock_pci_bus[0],
+	},
+	[1] = {
+		.bus = &mock_pci_bus[1],
+	},
+	[2] = {
+		.bus = &mock_pci_bus[2],
+	},
+	[3] = {
+		.bus = &mock_pci_bus[3],
+	},
+};
+
+static struct platform_device *mock_cxl_root_port(struct pci_bus *bus, int index)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(mock_pci_bus); i++)
+		if (bus == &mock_pci_bus[i])
+			return cxl_root_port[index + i * NR_CXL_ROOT_PORTS];
+	return NULL;
+}
+
+static bool is_mock_port(struct platform_device *pdev)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(cxl_root_port); i++)
+		if (pdev == cxl_root_port[i])
+			return true;
+	return false;
+}
+
+static bool is_mock_bus(struct pci_bus *bus)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(mock_pci_bus); i++)
+		if (bus == &mock_pci_bus[i])
+			return true;
+	return false;
+}
+
+static struct acpi_pci_root *mock_acpi_pci_find_root(acpi_handle handle)
+{
+	struct acpi_device *adev = find_host_bridge(handle);
+
+	if (!adev)
+		return acpi_pci_find_root(handle);
+	return &mock_pci_root[host_bridge_index(adev)];
+}
+
+static struct cxl_mock_ops cxl_mock_ops = {
+	.is_mock_adev = is_mock_adev,
+	.is_mock_bridge = is_mock_bridge,
+	.is_mock_bus = is_mock_bus,
+	.is_mock_port = is_mock_port,
+	.is_mock_dev = is_mock_dev,
+	.mock_port = mock_cxl_root_port,
+	.acpi_get_table = mock_acpi_get_table,
+	.acpi_put_table = mock_acpi_put_table,
+	.acpi_evaluate_integer = mock_acpi_evaluate_integer,
+	.acpi_pci_find_root = mock_acpi_pci_find_root,
+	.list = LIST_HEAD_INIT(cxl_mock_ops.list),
+};
+
+static void mock_companion(struct acpi_device *adev, struct device *dev)
+{
+	device_initialize(&adev->dev);
+	fwnode_init(&adev->fwnode, NULL);
+	dev->fwnode = &adev->fwnode;
+	adev->fwnode.dev = dev;
+}
+
+#ifndef SZ_64G
+#define SZ_64G (SZ_32G * 2)
+#endif
+
+#ifndef SZ_512G
+#define SZ_512G (SZ_64G * 8)
+#endif
+
+static __init int cxl_test_init(void)
+{
+	int rc, i;
+
+	register_cxl_mock_ops(&cxl_mock_ops);
+
+	cxl_mock_pool = gen_pool_create(ilog2(SZ_2M), NUMA_NO_NODE);
+	if (!cxl_mock_pool) {
+		rc = -ENOMEM;
+		goto err_gen_pool_create;
+	}
+
+	rc = gen_pool_add(cxl_mock_pool, SZ_512G, SZ_64G, NUMA_NO_NODE);
+	if (rc)
+		goto err_gen_pool_add;
+
+	rc = populate_cedt();
+	if (rc)
+		goto err_populate;
+
+	for (i = 0; i < ARRAY_SIZE(cxl_host_bridge); i++) {
+		struct acpi_device *adev = &host_bridge[i];
+		struct platform_device *pdev;
+
+		pdev = platform_device_alloc("cxl_host_bridge", i);
+		if (!pdev)
+			goto err_bridge;
+
+		mock_companion(adev, &pdev->dev);
+		rc = platform_device_add(pdev);
+		if (rc) {
+			platform_device_put(pdev);
+			goto err_bridge;
+		}
+		cxl_host_bridge[i] = pdev;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(cxl_root_port); i++) {
+		struct platform_device *bridge =
+			cxl_host_bridge[i / NR_CXL_ROOT_PORTS];
+		struct platform_device *pdev;
+
+		pdev = platform_device_alloc("cxl_root_port", i);
+		if (!pdev)
+			goto err_port;
+		pdev->dev.parent = &bridge->dev;
+
+		rc = platform_device_add(pdev);
+		if (rc) {
+			platform_device_put(pdev);
+			goto err_port;
+		}
+		cxl_root_port[i] = pdev;
+	}
+
+	cxl_acpi = platform_device_alloc("cxl_acpi", 0);
+	if (!cxl_acpi)
+		goto err_port;
+
+	mock_companion(&acpi0017_mock, &cxl_acpi->dev);
+	acpi0017_mock.dev.bus = &platform_bus_type;
+
+	rc = platform_device_add(cxl_acpi);
+	if (rc)
+		goto err_add;
+
+	return 0;
+
+err_add:
+	platform_device_put(cxl_acpi);
+err_port:
+	for (i = ARRAY_SIZE(cxl_root_port) - 1; i >= 0; i--)
+		platform_device_unregister(cxl_root_port[i]);
+err_bridge:
+	for (i = ARRAY_SIZE(cxl_host_bridge) - 1; i >= 0; i--)
+		platform_device_unregister(cxl_host_bridge[i]);
+err_populate:
+	depopulate_all_mock_resources();
+err_gen_pool_add:
+	gen_pool_destroy(cxl_mock_pool);
+err_gen_pool_create:
+	unregister_cxl_mock_ops(&cxl_mock_ops);
+	return rc;
+}
+
+static __exit void cxl_test_exit(void)
+{
+	int i;
+
+	platform_device_unregister(cxl_acpi);
+	for (i = ARRAY_SIZE(cxl_root_port) - 1; i >= 0; i--)
+		platform_device_unregister(cxl_root_port[i]);
+	for (i = ARRAY_SIZE(cxl_host_bridge) - 1; i >= 0; i--)
+		platform_device_unregister(cxl_host_bridge[i]);
+	depopulate_all_mock_resources();
+	gen_pool_destroy(cxl_mock_pool);
+	unregister_cxl_mock_ops(&cxl_mock_ops);
+}
+
+module_init(cxl_test_init);
+module_exit(cxl_test_exit);
+MODULE_LICENSE("GPL v2");
diff --git a/tools/testing/cxl/test/mock.c b/tools/testing/cxl/test/mock.c
new file mode 100644
index 000000000000..b8c108abcf07
--- /dev/null
+++ b/tools/testing/cxl/test/mock.c
@@ -0,0 +1,171 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright(c) 2021 Intel Corporation. All rights reserved.
+
+#include <linux/libnvdimm.h>
+#include <linux/rculist.h>
+#include <linux/device.h>
+#include <linux/export.h>
+#include <linux/acpi.h>
+#include <linux/pci.h>
+#include "mock.h"
+
+static LIST_HEAD(mock);
+
+void register_cxl_mock_ops(struct cxl_mock_ops *ops)
+{
+	list_add_rcu(&ops->list, &mock);
+}
+EXPORT_SYMBOL_GPL(register_cxl_mock_ops);
+
+static DEFINE_SRCU(cxl_mock_srcu);
+
+void unregister_cxl_mock_ops(struct cxl_mock_ops *ops)
+{
+	list_del_rcu(&ops->list);
+	synchronize_srcu(&cxl_mock_srcu);
+}
+EXPORT_SYMBOL_GPL(unregister_cxl_mock_ops);
+
+struct cxl_mock_ops *get_cxl_mock_ops(int *index)
+{
+	*index = srcu_read_lock(&cxl_mock_srcu);
+	return list_first_or_null_rcu(&mock, struct cxl_mock_ops, list);
+}
+EXPORT_SYMBOL_GPL(get_cxl_mock_ops);
+
+void put_cxl_mock_ops(int index)
+{
+	srcu_read_unlock(&cxl_mock_srcu, index);
+}
+EXPORT_SYMBOL_GPL(put_cxl_mock_ops);
+
+bool __wrap_is_acpi_device_node(const struct fwnode_handle *fwnode)
+{
+	struct acpi_device *adev =
+		container_of(fwnode, struct acpi_device, fwnode);
+	int index;
+	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
+	bool retval = false;
+
+	if (ops)
+		retval = ops->is_mock_adev(adev);
+
+	if (!retval)
+		retval = is_acpi_device_node(fwnode);
+
+	put_cxl_mock_ops(index);
+	return retval;
+}
+EXPORT_SYMBOL(__wrap_is_acpi_device_node);
+
+acpi_status __wrap_acpi_get_table(char *signature, u32 instance,
+				  struct acpi_table_header **out_table)
+{
+	int index;
+	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
+	acpi_status status;
+
+	if (ops)
+		status = ops->acpi_get_table(signature, instance, out_table);
+	else
+		status = acpi_get_table(signature, instance, out_table);
+
+	put_cxl_mock_ops(index);
+
+	return status;
+}
+EXPORT_SYMBOL(__wrap_acpi_get_table);
+
+void __wrap_acpi_put_table(struct acpi_table_header *table)
+{
+	int index;
+	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
+
+	if (ops)
+		ops->acpi_put_table(table);
+	else
+		acpi_put_table(table);
+	put_cxl_mock_ops(index);
+}
+EXPORT_SYMBOL(__wrap_acpi_put_table);
+
+acpi_status __wrap_acpi_evaluate_integer(acpi_handle handle,
+					 acpi_string pathname,
+					 struct acpi_object_list *arguments,
+					 unsigned long long *data)
+{
+	int index;
+	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
+	acpi_status status;
+
+	if (ops)
+		status = ops->acpi_evaluate_integer(handle, pathname, arguments,
+						    data);
+	else
+		status = acpi_evaluate_integer(handle, pathname, arguments,
+					       data);
+	put_cxl_mock_ops(index);
+
+	return status;
+}
+EXPORT_SYMBOL(__wrap_acpi_evaluate_integer);
+
+struct acpi_pci_root *__wrap_acpi_pci_find_root(acpi_handle handle)
+{
+	int index;
+	struct acpi_pci_root *root;
+	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
+
+	if (ops)
+		root = ops->acpi_pci_find_root(handle);
+	else
+		root = acpi_pci_find_root(handle);
+
+	put_cxl_mock_ops(index);
+
+	return root;
+}
+EXPORT_SYMBOL_GPL(__wrap_acpi_pci_find_root);
+
+void __wrap_pci_walk_bus(struct pci_bus *bus,
+			 int (*cb)(struct pci_dev *, void *), void *userdata)
+{
+	int index;
+	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
+
+	if (ops && ops->is_mock_bus(bus)) {
+		int rc, i;
+
+		/*
+		 * Simulate 2 root ports per host-bridge and no
+		 * depth recursion.
+		 */
+		for (i = 0; i < 2; i++) {
+			rc = cb((struct pci_dev *) ops->mock_port(bus, i),
+				userdata);
+			if (rc)
+				break;
+		}
+	} else
+		pci_walk_bus(bus, cb, userdata);
+
+	put_cxl_mock_ops(index);
+}
+EXPORT_SYMBOL_GPL(__wrap_pci_walk_bus);
+
+struct nvdimm_bus *
+__wrap_nvdimm_bus_register(struct device *dev,
+			   struct nvdimm_bus_descriptor *nd_desc)
+{
+	int index;
+	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
+
+	if (ops && ops->is_mock_dev(dev->parent->parent))
+		nd_desc->provider_name = "cxl_test";
+	put_cxl_mock_ops(index);
+
+	return nvdimm_bus_register(dev, nd_desc);
+}
+EXPORT_SYMBOL_GPL(__wrap_nvdimm_bus_register);
+
+MODULE_LICENSE("GPL v2");
diff --git a/tools/testing/cxl/test/mock.h b/tools/testing/cxl/test/mock.h
new file mode 100644
index 000000000000..805a94cb3fbe
--- /dev/null
+++ b/tools/testing/cxl/test/mock.h
@@ -0,0 +1,27 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#include <linux/list.h>
+#include <linux/acpi.h>
+
+struct cxl_mock_ops {
+	struct list_head list;
+	bool (*is_mock_adev)(struct acpi_device *dev);
+	acpi_status (*acpi_get_table)(char *signature, u32 instance,
+				      struct acpi_table_header **out_table);
+	void (*acpi_put_table)(struct acpi_table_header *table);
+	bool (*is_mock_bridge)(struct device *dev);
+	acpi_status (*acpi_evaluate_integer)(acpi_handle handle,
+					     acpi_string pathname,
+					     struct acpi_object_list *arguments,
+					     unsigned long long *data);
+	struct acpi_pci_root *(*acpi_pci_find_root)(acpi_handle handle);
+	struct platform_device *(*mock_port)(struct pci_bus *bus, int index);
+	bool (*is_mock_bus)(struct pci_bus *bus);
+	bool (*is_mock_port)(struct platform_device *pdev);
+	bool (*is_mock_dev)(struct device *dev);
+};
+
+void register_cxl_mock_ops(struct cxl_mock_ops *ops);
+void unregister_cxl_mock_ops(struct cxl_mock_ops *ops);
+struct cxl_mock_ops *get_cxl_mock_ops(int *index);
+void put_cxl_mock_ops(int index);


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH v4 18/21] cxl/bus: Populate the target list at decoder create
  2021-09-09  5:11 [PATCH v4 00/21] cxl_test: Enable CXL Topology and UAPI regression tests Dan Williams
                   ` (16 preceding siblings ...)
  2021-09-09  5:13 ` [PATCH v4 17/21] tools/testing/cxl: Introduce a mocked-up CXL port hierarchy Dan Williams
@ 2021-09-09  5:13 ` Dan Williams
  2021-09-10  9:57   ` Jonathan Cameron
  2021-09-09  5:13 ` [PATCH v4 19/21] cxl/mbox: Move command definitions to common location Dan Williams
                   ` (2 subsequent siblings)
  20 siblings, 1 reply; 78+ messages in thread
From: Dan Williams @ 2021-09-09  5:13 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ben Widawsky, vishal.l.verma, nvdimm, ben.widawsky,
	alison.schofield, vishal.l.verma, ira.weiny, Jonathan.Cameron

As found by cxl_test, the implementation populated the target_list for
the exceptional single-dport case, but missed populating it for the
typical multi-dport case. Root decoders always know their target
list at the beginning of time, and even switch-level decoders should
have a target list of one or more zeros by default, depending on the
interleave-ways setting.

Walk the hosting port's dport list and populate based on the passed in
map.

Move devm_cxl_add_passthrough_decoder() out of line now that it does the
work of generating a target_map.

Before:
$ cat /sys/bus/cxl/devices/root2/decoder*/target_list
0

0

After:
$ cat /sys/bus/cxl/devices/root2/decoder*/target_list
0
0,1,2,3
0
0,1,2,3

Where root2 is a CXL topology root object generated by 'cxl_test'.
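
As a sketch of the updated calling convention (base, size, granularity
and flags illustrative), a root decoder now receives its target map up
front:

    int target_map[CXL_DECODER_MAX_INTERLEAVE] = { 0, 1, 2, 3 };

    cxld = devm_cxl_add_decoder(host, root_port, 4, base, size,
                                4 /* ways */, granularity,
                                CXL_DECODER_EXPANDER, flags,
                                target_map);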

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/acpi.c     |   13 +++++++-
 drivers/cxl/core/bus.c |   80 +++++++++++++++++++++++++++++++++++++++++-------
 drivers/cxl/cxl.h      |   25 ++++++---------
 3 files changed, 91 insertions(+), 27 deletions(-)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index d31a97218593..9d881eacdae5 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -52,6 +52,12 @@ static int cxl_acpi_cfmws_verify(struct device *dev,
 		return -EINVAL;
 	}
 
+	if (CFMWS_INTERLEAVE_WAYS(cfmws) > CXL_DECODER_MAX_INTERLEAVE) {
+		dev_err(dev, "CFMWS Interleave Ways (%d) too large\n",
+			CFMWS_INTERLEAVE_WAYS(cfmws));
+		return -EINVAL;
+	}
+
 	expected_len = struct_size((cfmws), interleave_targets,
 				   CFMWS_INTERLEAVE_WAYS(cfmws));
 
@@ -71,6 +77,7 @@ static int cxl_acpi_cfmws_verify(struct device *dev,
 static void cxl_add_cfmws_decoders(struct device *dev,
 				   struct cxl_port *root_port)
 {
+	int target_map[CXL_DECODER_MAX_INTERLEAVE];
 	struct acpi_cedt_cfmws *cfmws;
 	struct cxl_decoder *cxld;
 	acpi_size len, cur = 0;
@@ -83,6 +90,7 @@ static void cxl_add_cfmws_decoders(struct device *dev,
 
 	while (cur < len) {
 		struct acpi_cedt_header *c = cedt_subtable + cur;
+		int i;
 
 		if (c->type != ACPI_CEDT_TYPE_CFMWS) {
 			cur += c->length;
@@ -108,6 +116,9 @@ static void cxl_add_cfmws_decoders(struct device *dev,
 			continue;
 		}
 
+		for (i = 0; i < CFMWS_INTERLEAVE_WAYS(cfmws); i++)
+			target_map[i] = cfmws->interleave_targets[i];
+
 		flags = cfmws_to_decoder_flags(cfmws->restrictions);
 		cxld = devm_cxl_add_decoder(dev, root_port,
 					    CFMWS_INTERLEAVE_WAYS(cfmws),
@@ -115,7 +126,7 @@ static void cxl_add_cfmws_decoders(struct device *dev,
 					    CFMWS_INTERLEAVE_WAYS(cfmws),
 					    CFMWS_INTERLEAVE_GRANULARITY(cfmws),
 					    CXL_DECODER_EXPANDER,
-					    flags);
+					    flags, target_map);
 
 		if (IS_ERR(cxld)) {
 			dev_err(dev, "Failed to add decoder for %#llx-%#llx\n",
diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
index 8073354ba232..176bede30c55 100644
--- a/drivers/cxl/core/bus.c
+++ b/drivers/cxl/core/bus.c
@@ -453,11 +453,38 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport_dev, int port_id,
 }
 EXPORT_SYMBOL_GPL(cxl_add_dport);
 
+static int decoder_populate_targets(struct device *host,
+				    struct cxl_decoder *cxld,
+				    struct cxl_port *port, int *target_map,
+				    int nr_targets)
+{
+	int rc = 0, i;
+
+	if (!target_map)
+		return 0;
+
+	device_lock(&port->dev);
+	for (i = 0; i < nr_targets; i++) {
+		struct cxl_dport *dport = find_dport(port, target_map[i]);
+
+		if (!dport) {
+			rc = -ENXIO;
+			break;
+		}
+		dev_dbg(host, "%s: target: %d\n", dev_name(dport->dport), i);
+		cxld->target[i] = dport;
+	}
+	device_unlock(&port->dev);
+
+	return rc;
+}
+
 static struct cxl_decoder *
-cxl_decoder_alloc(struct cxl_port *port, int nr_targets, resource_size_t base,
-		  resource_size_t len, int interleave_ways,
-		  int interleave_granularity, enum cxl_decoder_type type,
-		  unsigned long flags)
+cxl_decoder_alloc(struct device *host, struct cxl_port *port, int nr_targets,
+		  resource_size_t base, resource_size_t len,
+		  int interleave_ways, int interleave_granularity,
+		  enum cxl_decoder_type type, unsigned long flags,
+		  int *target_map)
 {
 	struct cxl_decoder *cxld;
 	struct device *dev;
@@ -493,10 +520,10 @@ cxl_decoder_alloc(struct cxl_port *port, int nr_targets, resource_size_t base,
 		.target_type = type,
 	};
 
-	/* handle implied target_list */
-	if (interleave_ways == 1)
-		cxld->target[0] =
-			list_first_entry(&port->dports, struct cxl_dport, list);
+	rc = decoder_populate_targets(host, cxld, port, target_map, nr_targets);
+	if (rc)
+		goto err;
+
 	dev = &cxld->dev;
 	device_initialize(dev);
 	device_set_pm_not_required(dev);
@@ -519,14 +546,19 @@ struct cxl_decoder *
 devm_cxl_add_decoder(struct device *host, struct cxl_port *port, int nr_targets,
 		     resource_size_t base, resource_size_t len,
 		     int interleave_ways, int interleave_granularity,
-		     enum cxl_decoder_type type, unsigned long flags)
+		     enum cxl_decoder_type type, unsigned long flags,
+		     int *target_map)
 {
 	struct cxl_decoder *cxld;
 	struct device *dev;
 	int rc;
 
-	cxld = cxl_decoder_alloc(port, nr_targets, base, len, interleave_ways,
-				 interleave_granularity, type, flags);
+	if (nr_targets > CXL_DECODER_MAX_INTERLEAVE)
+		return ERR_PTR(-EINVAL);
+
+	cxld = cxl_decoder_alloc(host, port, nr_targets, base, len,
+				 interleave_ways, interleave_granularity, type,
+				 flags, target_map);
 	if (IS_ERR(cxld))
 		return cxld;
 
@@ -550,6 +582,32 @@ devm_cxl_add_decoder(struct device *host, struct cxl_port *port, int nr_targets,
 }
 EXPORT_SYMBOL_GPL(devm_cxl_add_decoder);
 
+/*
+ * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability Structure)
+ * single ported host-bridges need not publish a decoder capability when a
+ * passthrough decode can be assumed, i.e. all transactions that the uport sees
+ * are claimed and passed to the single dport. Default the range to a
+ * 0-base, 0-length range until the first CXL region is activated.
+ */
+struct cxl_decoder *devm_cxl_add_passthrough_decoder(struct device *host,
+						     struct cxl_port *port)
+{
+	struct cxl_dport *dport;
+	int target_map[1];
+
+	device_lock(&port->dev);
+	dport = list_first_entry_or_null(&port->dports, typeof(*dport), list);
+	device_unlock(&port->dev);
+
+	if (!dport)
+		return ERR_PTR(-ENXIO);
+
+	target_map[0] = dport->port_id;
+	return devm_cxl_add_decoder(host, port, 1, 0, 0, 1, PAGE_SIZE,
+				    CXL_DECODER_EXPANDER, 0, target_map);
+}
+EXPORT_SYMBOL_GPL(devm_cxl_add_passthrough_decoder);
+
 /**
  * __cxl_driver_register - register a driver for the cxl bus
  * @cxl_drv: cxl driver structure to attach
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index c5152718267e..84b8836c1f91 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -180,6 +180,12 @@ enum cxl_decoder_type {
        CXL_DECODER_EXPANDER = 3,
 };
 
+/*
+ * Current specification goes up to 8; double that seems a reasonable
+ * software max for the foreseeable future.
+ */
+#define CXL_DECODER_MAX_INTERLEAVE 16
+
 /**
  * struct cxl_decoder - CXL address range decode configuration
  * @dev: this decoder's device
@@ -284,22 +290,11 @@ struct cxl_decoder *
 devm_cxl_add_decoder(struct device *host, struct cxl_port *port, int nr_targets,
 		     resource_size_t base, resource_size_t len,
 		     int interleave_ways, int interleave_granularity,
-		     enum cxl_decoder_type type, unsigned long flags);
-
-/*
- * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability Structure)
- * single ported host-bridges need not publish a decoder capability when a
- * passthrough decode can be assumed, i.e. all transactions that the uport sees
- * are claimed and passed to the single dport. Default the range a 0-base
- * 0-length until the first CXL region is activated.
- */
-static inline struct cxl_decoder *
-devm_cxl_add_passthrough_decoder(struct device *host, struct cxl_port *port)
-{
-	return devm_cxl_add_decoder(host, port, 1, 0, 0, 1, PAGE_SIZE,
-				    CXL_DECODER_EXPANDER, 0);
-}
+		     enum cxl_decoder_type type, unsigned long flags,
+		     int *target_map);
 
+struct cxl_decoder *devm_cxl_add_passthrough_decoder(struct device *host,
+						     struct cxl_port *port);
 extern struct bus_type cxl_bus_type;
 
 struct cxl_driver {


* [PATCH v4 19/21] cxl/mbox: Move command definitions to common location
  2021-09-09  5:11 [PATCH v4 00/21] cxl_test: Enable CXL Topology and UAPI regression tests Dan Williams
                   ` (17 preceding siblings ...)
  2021-09-09  5:13 ` [PATCH v4 18/21] cxl/bus: Populate the target list at decoder create Dan Williams
@ 2021-09-09  5:13 ` Dan Williams
  2021-09-09  5:13 ` [PATCH v4 20/21] tools/testing/cxl: Introduce a mock memory device + driver Dan Williams
  2021-09-09  5:13 ` [PATCH v4 21/21] cxl/core: Split decoder setup into alloc + add Dan Williams
  20 siblings, 0 replies; 78+ messages in thread
From: Dan Williams @ 2021-09-09  5:13 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ben Widawsky, Jonathan Cameron, vishal.l.verma, nvdimm,
	ben.widawsky, alison.schofield, vishal.l.verma, ira.weiny,
	Jonathan.Cameron

In preparation for cxl_test to mock responses to mailbox command
requests, move some definitions from core/mbox.c to cxlmem.h.

No functional changes intended.
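
For example, the mock driver added later in this series can now build
replies from the same structures the core parses, along these lines (a
condensed sketch of the mock_cel table from the next patch):

	#include <cxlmem.h>

	static struct cxl_cel_entry entry = {
		.opcode = cpu_to_le16(CXL_MBOX_OP_IDENTIFY),
		.effect = cpu_to_le16(0),
	};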

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/mbox.c |   45 +++++--------------------------------
 drivers/cxl/cxlmem.h    |   57 +++++++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/pmem.c      |   11 ++-------
 3 files changed, 65 insertions(+), 48 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 82e79da195fa..576796a5d9f3 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -485,11 +485,7 @@ static int cxl_xfer_log(struct cxl_mem *cxlm, uuid_t *uuid, u32 size, u8 *out)
 
 	while (remaining) {
 		u32 xfer_size = min_t(u32, remaining, cxlm->payload_size);
-		struct cxl_mbox_get_log {
-			uuid_t uuid;
-			__le32 offset;
-			__le32 length;
-		} __packed log = {
+		struct cxl_mbox_get_log log = {
 			.uuid = *uuid,
 			.offset = cpu_to_le32(offset),
 			.length = cpu_to_le32(xfer_size)
@@ -520,14 +516,11 @@ static int cxl_xfer_log(struct cxl_mem *cxlm, uuid_t *uuid, u32 size, u8 *out)
  */
 static void cxl_walk_cel(struct cxl_mem *cxlm, size_t size, u8 *cel)
 {
-	struct cel_entry {
-		__le16 opcode;
-		__le16 effect;
-	} __packed * cel_entry;
+	struct cxl_cel_entry *cel_entry;
 	const int cel_entries = size / sizeof(*cel_entry);
 	int i;
 
-	cel_entry = (struct cel_entry *)cel;
+	cel_entry = (struct cxl_cel_entry *) cel;
 
 	for (i = 0; i < cel_entries; i++) {
 		u16 opcode = le16_to_cpu(cel_entry[i].opcode);
@@ -543,15 +536,6 @@ static void cxl_walk_cel(struct cxl_mem *cxlm, size_t size, u8 *cel)
 	}
 }
 
-struct cxl_mbox_get_supported_logs {
-	__le16 entries;
-	u8 rsvd[6];
-	struct gsl_entry {
-		uuid_t uuid;
-		__le32 size;
-	} __packed entry[];
-} __packed;
-
 static struct cxl_mbox_get_supported_logs *cxl_get_gsl(struct cxl_mem *cxlm)
 {
 	struct cxl_mbox_get_supported_logs *ret;
@@ -578,10 +562,8 @@ enum {
 
 /* See CXL 2.0 Table 170. Get Log Input Payload */
 static const uuid_t log_uuid[] = {
-	[CEL_UUID] = UUID_INIT(0xda9c0b5, 0xbf41, 0x4b78, 0x8f, 0x79, 0x96,
-			       0xb1, 0x62, 0x3b, 0x3f, 0x17),
-	[VENDOR_DEBUG_UUID] = UUID_INIT(0xe1819d9, 0x11a9, 0x400c, 0x81, 0x1f,
-					0xd6, 0x07, 0x19, 0x40, 0x3d, 0x86),
+	[CEL_UUID] = DEFINE_CXL_CEL_UUID,
+	[VENDOR_DEBUG_UUID] = DEFINE_CXL_VENDOR_DEBUG_UUID,
 };
 
 /**
@@ -698,22 +680,7 @@ static int cxl_mem_get_partition_info(struct cxl_mem *cxlm)
 int cxl_mem_identify(struct cxl_mem *cxlm)
 {
 	/* See CXL 2.0 Table 175 Identify Memory Device Output Payload */
-	struct cxl_mbox_identify {
-		char fw_revision[0x10];
-		__le64 total_capacity;
-		__le64 volatile_capacity;
-		__le64 persistent_capacity;
-		__le64 partition_align;
-		__le16 info_event_log_size;
-		__le16 warning_event_log_size;
-		__le16 failure_event_log_size;
-		__le16 fatal_event_log_size;
-		__le32 lsa_size;
-		u8 poison_list_max_mer[3];
-		__le16 inject_poison_limit;
-		u8 poison_caps;
-		u8 qos_telemetry_caps;
-	} __packed id;
+	struct cxl_mbox_identify id;
 	int rc;
 
 	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_IDENTIFY, NULL, 0, &id,
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 468b7b8be207..3841d7fe73d7 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -171,6 +171,63 @@ enum cxl_opcode {
 	CXL_MBOX_OP_MAX			= 0x10000
 };
 
+#define DEFINE_CXL_CEL_UUID                                                    \
+	UUID_INIT(0xda9c0b5, 0xbf41, 0x4b78, 0x8f, 0x79, 0x96, 0xb1, 0x62,     \
+		  0x3b, 0x3f, 0x17)
+
+#define DEFINE_CXL_VENDOR_DEBUG_UUID                                           \
+	UUID_INIT(0xe1819d9, 0x11a9, 0x400c, 0x81, 0x1f, 0xd6, 0x07, 0x19,     \
+		  0x40, 0x3d, 0x86)
+
+struct cxl_mbox_get_supported_logs {
+	__le16 entries;
+	u8 rsvd[6];
+	struct cxl_gsl_entry {
+		uuid_t uuid;
+		__le32 size;
+	} __packed entry[];
+} __packed;
+
+struct cxl_cel_entry {
+	__le16 opcode;
+	__le16 effect;
+} __packed;
+
+struct cxl_mbox_get_log {
+	uuid_t uuid;
+	__le32 offset;
+	__le32 length;
+} __packed;
+
+/* See CXL 2.0 Table 175 Identify Memory Device Output Payload */
+struct cxl_mbox_identify {
+	char fw_revision[0x10];
+	__le64 total_capacity;
+	__le64 volatile_capacity;
+	__le64 persistent_capacity;
+	__le64 partition_align;
+	__le16 info_event_log_size;
+	__le16 warning_event_log_size;
+	__le16 failure_event_log_size;
+	__le16 fatal_event_log_size;
+	__le32 lsa_size;
+	u8 poison_list_max_mer[3];
+	__le16 inject_poison_limit;
+	u8 poison_caps;
+	u8 qos_telemetry_caps;
+} __packed;
+
+struct cxl_mbox_get_lsa {
+	u32 offset;
+	u32 length;
+} __packed;
+
+struct cxl_mbox_set_lsa {
+	u32 offset;
+	u32 reserved;
+	u8 data[];
+} __packed;
+
 /**
  * struct cxl_mem_command - Driver representation of a memory device command
  * @info: Command information as it exists for the UAPI
diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
index bb30d3e55825..877b0a84e586 100644
--- a/drivers/cxl/pmem.c
+++ b/drivers/cxl/pmem.c
@@ -98,10 +98,7 @@ static int cxl_pmem_get_config_data(struct cxl_mem *cxlm,
 				    struct nd_cmd_get_config_data_hdr *cmd,
 				    unsigned int buf_len, int *cmd_rc)
 {
-	struct cxl_mbox_get_lsa {
-		u32 offset;
-		u32 length;
-	} get_lsa;
+	struct cxl_mbox_get_lsa get_lsa;
 	int rc;
 
 	if (sizeof(*cmd) > buf_len)
@@ -127,11 +124,7 @@ static int cxl_pmem_set_config_data(struct cxl_mem *cxlm,
 				    struct nd_cmd_set_config_hdr *cmd,
 				    unsigned int buf_len, int *cmd_rc)
 {
-	struct cxl_mbox_set_lsa {
-		u32 offset;
-		u32 reserved;
-		u8 data[];
-	} *set_lsa;
+	struct cxl_mbox_set_lsa *set_lsa;
 	int rc;
 
 	if (sizeof(*cmd) > buf_len)


* [PATCH v4 20/21] tools/testing/cxl: Introduce a mock memory device + driver
  2021-09-09  5:11 [PATCH v4 00/21] cxl_test: Enable CXL Topology and UAPI regression tests Dan Williams
                   ` (18 preceding siblings ...)
  2021-09-09  5:13 ` [PATCH v4 19/21] cxl/mbox: Move command definitions to common location Dan Williams
@ 2021-09-09  5:13 ` Dan Williams
  2021-09-10 10:09   ` Jonathan Cameron
  2021-09-09  5:13 ` [PATCH v4 21/21] cxl/core: Split decoder setup into alloc + add Dan Williams
  20 siblings, 1 reply; 78+ messages in thread
From: Dan Williams @ 2021-09-09  5:13 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ben Widawsky, kernel test robot, vishal.l.verma, nvdimm,
	ben.widawsky, alison.schofield, vishal.l.verma, ira.weiny,
	Jonathan.Cameron

Introduce an emulated device-set plus driver to register CXL memory
devices, 'struct cxl_memdev' instances, in the mock cxl_test topology.
This enables development of the HDM (Host-managed Device Memory)
decoder programming flow (region provisioning) in an environment that
can be updated alongside the kernel as it gains more functionality.

Whereas the cxl_pci module looks for CXL memory expanders on the 'pci'
bus, the cxl_mock_mem module attaches to CXL expanders on the platform
bus emitted by cxl_test.
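
The key integration point is that the mock driver supplies its own
mbox_send operation in place of the PCI mailbox path (condensed from
the mem.c addition below):

	cxlm = cxl_mem_create(dev);
	if (IS_ERR(cxlm))
		return PTR_ERR(cxlm);

	cxlm->mbox_send = cxl_mock_mbox_send;
	cxlm->payload_size = SZ_4K;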

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/pmem.c       |    6 -
 drivers/cxl/cxl.h             |    2 
 drivers/cxl/pmem.c            |    2 
 tools/testing/cxl/Kbuild      |    2 
 tools/testing/cxl/mock_pmem.c |   24 ++++
 tools/testing/cxl/test/Kbuild |    4 +
 tools/testing/cxl/test/cxl.c  |   69 +++++++++++
 tools/testing/cxl/test/mem.c  |  256 +++++++++++++++++++++++++++++++++++++++++
 8 files changed, 359 insertions(+), 6 deletions(-)
 create mode 100644 tools/testing/cxl/mock_pmem.c
 create mode 100644 tools/testing/cxl/test/mem.c

diff --git a/drivers/cxl/core/pmem.c b/drivers/cxl/core/pmem.c
index 9e56be3994f1..74be5132df1c 100644
--- a/drivers/cxl/core/pmem.c
+++ b/drivers/cxl/core/pmem.c
@@ -51,16 +51,16 @@ struct cxl_nvdimm_bridge *to_cxl_nvdimm_bridge(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(to_cxl_nvdimm_bridge);
 
-static int match_nvdimm_bridge(struct device *dev, const void *data)
+__mock int match_nvdimm_bridge(struct device *dev, const void *data)
 {
 	return dev->type == &cxl_nvdimm_bridge_type;
 }
 
-struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(void)
+struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_nvdimm *cxl_nvd)
 {
 	struct device *dev;
 
-	dev = bus_find_device(&cxl_bus_type, NULL, NULL, match_nvdimm_bridge);
+	dev = bus_find_device(&cxl_bus_type, NULL, cxl_nvd, match_nvdimm_bridge);
 	if (!dev)
 		return NULL;
 	return to_cxl_nvdimm_bridge(dev);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 84b8836c1f91..9af5745ba2c0 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -327,7 +327,7 @@ struct cxl_nvdimm_bridge *devm_cxl_add_nvdimm_bridge(struct device *host,
 struct cxl_nvdimm *to_cxl_nvdimm(struct device *dev);
 bool is_cxl_nvdimm(struct device *dev);
 int devm_cxl_add_nvdimm(struct device *host, struct cxl_memdev *cxlmd);
-struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(void);
+struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_nvdimm *cxl_nvd);
 
 /*
  * Unit test builds overrides this to __weak, find the 'strong' version
diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
index 877b0a84e586..569e6ff2a456 100644
--- a/drivers/cxl/pmem.c
+++ b/drivers/cxl/pmem.c
@@ -40,7 +40,7 @@ static int cxl_nvdimm_probe(struct device *dev)
 	struct nvdimm *nvdimm = NULL;
 	int rc = -ENXIO;
 
-	cxl_nvb = cxl_find_nvdimm_bridge();
+	cxl_nvb = cxl_find_nvdimm_bridge(cxl_nvd);
 	if (!cxl_nvb)
 		return -ENXIO;
 
diff --git a/tools/testing/cxl/Kbuild b/tools/testing/cxl/Kbuild
index 63a4a07e71c4..86deba8308a1 100644
--- a/tools/testing/cxl/Kbuild
+++ b/tools/testing/cxl/Kbuild
@@ -33,4 +33,6 @@ cxl_core-y += $(CXL_CORE_SRC)/memdev.o
 cxl_core-y += $(CXL_CORE_SRC)/mbox.o
 cxl_core-y += config_check.o
 
+cxl_core-y += mock_pmem.o
+
 obj-m += test/
diff --git a/tools/testing/cxl/mock_pmem.c b/tools/testing/cxl/mock_pmem.c
new file mode 100644
index 000000000000..f7315e6f52c0
--- /dev/null
+++ b/tools/testing/cxl/mock_pmem.c
@@ -0,0 +1,24 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
+#include <cxl.h>
+#include "test/mock.h"
+#include <core/core.h>
+
+int match_nvdimm_bridge(struct device *dev, const void *data)
+{
+	int index, rc = 0;
+	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
+	const struct cxl_nvdimm *cxl_nvd = data;
+
+	if (ops) {
+		if (dev->type == &cxl_nvdimm_bridge_type &&
+		    (ops->is_mock_dev(dev->parent->parent) ==
+		     ops->is_mock_dev(cxl_nvd->dev.parent->parent)))
+			rc = 1;
+	} else
+		rc = dev->type == &cxl_nvdimm_bridge_type;
+
+	put_cxl_mock_ops(index);
+
+	return rc;
+}
diff --git a/tools/testing/cxl/test/Kbuild b/tools/testing/cxl/test/Kbuild
index 7de4ddecfd21..4e59e2c911f6 100644
--- a/tools/testing/cxl/test/Kbuild
+++ b/tools/testing/cxl/test/Kbuild
@@ -1,6 +1,10 @@
 # SPDX-License-Identifier: GPL-2.0
+ccflags-y := -I$(srctree)/drivers/cxl/
+
 obj-m += cxl_test.o
 obj-m += cxl_mock.o
+obj-m += cxl_mock_mem.o
 
 cxl_test-y := cxl.o
 cxl_mock-y := mock.o
+cxl_mock_mem-y := mem.o
diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c
index 1c47b34244a4..cb32f9e27d5d 100644
--- a/tools/testing/cxl/test/cxl.c
+++ b/tools/testing/cxl/test/cxl.c
@@ -17,6 +17,7 @@ static struct platform_device *cxl_acpi;
 static struct platform_device *cxl_host_bridge[NR_CXL_HOST_BRIDGES];
 static struct platform_device
 	*cxl_root_port[NR_CXL_HOST_BRIDGES * NR_CXL_ROOT_PORTS];
+struct platform_device *cxl_mem[NR_CXL_HOST_BRIDGES * NR_CXL_ROOT_PORTS];
 
 static struct acpi_device acpi0017_mock;
 static struct acpi_device host_bridge[NR_CXL_HOST_BRIDGES] = {
@@ -36,6 +37,11 @@ static struct acpi_device host_bridge[NR_CXL_HOST_BRIDGES] = {
 
 static bool is_mock_dev(struct device *dev)
 {
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(cxl_mem); i++)
+		if (dev == &cxl_mem[i]->dev)
+			return true;
 	if (dev == &cxl_acpi->dev)
 		return true;
 	return false;
@@ -405,6 +411,44 @@ static void mock_companion(struct acpi_device *adev, struct device *dev)
 #define SZ_512G (SZ_64G * 8)
 #endif
 
+static struct platform_device *alloc_memdev(int id)
+{
+	struct resource res[] = {
+		[0] = {
+			.flags = IORESOURCE_MEM,
+		},
+		[1] = {
+			.flags = IORESOURCE_MEM,
+			.desc = IORES_DESC_PERSISTENT_MEMORY,
+		},
+	};
+	struct platform_device *pdev;
+	int i, rc;
+
+	for (i = 0; i < ARRAY_SIZE(res); i++) {
+		struct cxl_mock_res *r = alloc_mock_res(SZ_256M);
+
+		if (!r)
+			return NULL;
+		res[i].start = r->range.start;
+		res[i].end = r->range.end;
+	}
+
+	pdev = platform_device_alloc("cxl_mem", id);
+	if (!pdev)
+		return NULL;
+
+	rc = platform_device_add_resources(pdev, res, ARRAY_SIZE(res));
+	if (rc)
+		goto err;
+
+	return pdev;
+
+err:
+	platform_device_put(pdev);
+	return NULL;
+}
+
 static __init int cxl_test_init(void)
 {
 	int rc, i;
@@ -460,9 +504,27 @@ static __init int cxl_test_init(void)
 		cxl_root_port[i] = pdev;
 	}
 
+	BUILD_BUG_ON(ARRAY_SIZE(cxl_mem) != ARRAY_SIZE(cxl_root_port));
+	for (i = 0; i < ARRAY_SIZE(cxl_mem); i++) {
+		struct platform_device *port = cxl_root_port[i];
+		struct platform_device *pdev;
+
+		pdev = alloc_memdev(i);
+		if (!pdev)
+			goto err_mem;
+		pdev->dev.parent = &port->dev;
+
+		rc = platform_device_add(pdev);
+		if (rc) {
+			platform_device_put(pdev);
+			goto err_mem;
+		}
+		cxl_mem[i] = pdev;
+	}
+
 	cxl_acpi = platform_device_alloc("cxl_acpi", 0);
 	if (!cxl_acpi)
-		goto err_port;
+		goto err_mem;
 
 	mock_companion(&acpi0017_mock, &cxl_acpi->dev);
 	acpi0017_mock.dev.bus = &platform_bus_type;
@@ -475,6 +537,9 @@ static __init int cxl_test_init(void)
 
 err_add:
 	platform_device_put(cxl_acpi);
+err_mem:
+	for (i = ARRAY_SIZE(cxl_mem) - 1; i >= 0; i--)
+		platform_device_unregister(cxl_mem[i]);
 err_port:
 	for (i = ARRAY_SIZE(cxl_root_port) - 1; i >= 0; i--)
 		platform_device_unregister(cxl_root_port[i]);
@@ -495,6 +560,8 @@ static __exit void cxl_test_exit(void)
 	int i;
 
 	platform_device_unregister(cxl_acpi);
+	for (i = ARRAY_SIZE(cxl_mem) - 1; i >= 0; i--)
+		platform_device_unregister(cxl_mem[i]);
 	for (i = ARRAY_SIZE(cxl_root_port) - 1; i >= 0; i--)
 		platform_device_unregister(cxl_root_port[i]);
 	for (i = ARRAY_SIZE(cxl_host_bridge) - 1; i >= 0; i--)
diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
new file mode 100644
index 000000000000..12a8437a9ca0
--- /dev/null
+++ b/tools/testing/cxl/test/mem.c
@@ -0,0 +1,256 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright(c) 2021 Intel Corporation. All rights reserved.
+
+#include <linux/platform_device.h>
+#include <linux/mod_devicetable.h>
+#include <linux/module.h>
+#include <linux/sizes.h>
+#include <linux/bits.h>
+#include <cxlmem.h>
+
+#define LSA_SIZE SZ_128K
+#define EFFECT(x) (1U << x)
+
+static struct cxl_cel_entry mock_cel[] = {
+	{
+		.opcode = cpu_to_le16(CXL_MBOX_OP_GET_SUPPORTED_LOGS),
+		.effect = cpu_to_le16(0),
+	},
+	{
+		.opcode = cpu_to_le16(CXL_MBOX_OP_IDENTIFY),
+		.effect = cpu_to_le16(0),
+	},
+	{
+		.opcode = cpu_to_le16(CXL_MBOX_OP_GET_LSA),
+		.effect = cpu_to_le16(0),
+	},
+	{
+		.opcode = cpu_to_le16(CXL_MBOX_OP_SET_LSA),
+		.effect = cpu_to_le16(EFFECT(1) | EFFECT(2)),
+	},
+};
+
+static struct {
+	struct cxl_mbox_get_supported_logs gsl;
+	struct cxl_gsl_entry entry;
+} mock_gsl_payload = {
+	.gsl = {
+		.entries = cpu_to_le16(1),
+	},
+	.entry = {
+		.uuid = DEFINE_CXL_CEL_UUID,
+		.size = cpu_to_le32(sizeof(mock_cel)),
+	},
+};
+
+static int mock_gsl(struct cxl_mbox_cmd *cmd)
+{
+	if (cmd->size_out < sizeof(mock_gsl_payload))
+		return -EINVAL;
+
+	memcpy(cmd->payload_out, &mock_gsl_payload, sizeof(mock_gsl_payload));
+	cmd->size_out = sizeof(mock_gsl_payload);
+
+	return 0;
+}
+
+static int mock_get_log(struct cxl_mem *cxlm, struct cxl_mbox_cmd *cmd)
+{
+	struct cxl_mbox_get_log *gl = cmd->payload_in;
+	u32 offset = le32_to_cpu(gl->offset);
+	u32 length = le32_to_cpu(gl->length);
+	uuid_t uuid = DEFINE_CXL_CEL_UUID;
+	void *data = &mock_cel;
+
+	if (cmd->size_in < sizeof(*gl))
+		return -EINVAL;
+	if (length > cxlm->payload_size)
+		return -EINVAL;
+	if (offset + length > sizeof(mock_cel))
+		return -EINVAL;
+	if (!uuid_equal(&gl->uuid, &uuid))
+		return -EINVAL;
+	if (length > cmd->size_out)
+		return -EINVAL;
+
+	memcpy(cmd->payload_out, data + offset, length);
+
+	return 0;
+}
+
+static int mock_id(struct cxl_mem *cxlm, struct cxl_mbox_cmd *cmd)
+{
+	struct platform_device *pdev = to_platform_device(cxlm->dev);
+	struct cxl_mbox_identify id = {
+		.fw_revision = { "mock fw v1 " },
+		.lsa_size = cpu_to_le32(LSA_SIZE),
+		/* FIXME: Add partition support */
+		.partition_align = cpu_to_le64(0),
+	};
+	u64 capacity = 0;
+	int i;
+
+	if (cmd->size_out < sizeof(id))
+		return -EINVAL;
+
+	for (i = 0; i < 2; i++) {
+		struct resource *res;
+
+		res = platform_get_resource(pdev, IORESOURCE_MEM, i);
+		if (!res)
+			break;
+
+		capacity += resource_size(res) / CXL_CAPACITY_MULTIPLIER;
+
+		if (le64_to_cpu(id.partition_align))
+			continue;
+
+		if (res->desc == IORES_DESC_PERSISTENT_MEMORY)
+			id.persistent_capacity = cpu_to_le64(
+				resource_size(res) / CXL_CAPACITY_MULTIPLIER);
+		else
+			id.volatile_capacity = cpu_to_le64(
+				resource_size(res) / CXL_CAPACITY_MULTIPLIER);
+	}
+
+	id.total_capacity = cpu_to_le64(capacity);
+
+	memcpy(cmd->payload_out, &id, sizeof(id));
+
+	return 0;
+}
+
+static int mock_get_lsa(struct cxl_mem *cxlm, struct cxl_mbox_cmd *cmd)
+{
+	struct cxl_mbox_get_lsa *get_lsa = cmd->payload_in;
+	void *lsa = dev_get_drvdata(cxlm->dev);
+	u32 offset, length;
+
+	if (sizeof(*get_lsa) > cmd->size_in)
+		return -EINVAL;
+	offset = le32_to_cpu(get_lsa->offset);
+	length = le32_to_cpu(get_lsa->length);
+	if (offset + length > LSA_SIZE)
+		return -EINVAL;
+	if (length > cmd->size_out)
+		return -EINVAL;
+
+	memcpy(cmd->payload_out, lsa + offset, length);
+	return 0;
+}
+
+static int mock_set_lsa(struct cxl_mem *cxlm, struct cxl_mbox_cmd *cmd)
+{
+	struct cxl_mbox_set_lsa *set_lsa = cmd->payload_in;
+	void *lsa = dev_get_drvdata(cxlm->dev);
+	u32 offset, length;
+
+	if (sizeof(*set_lsa) > cmd->size_in)
+		return -EINVAL;
+	offset = le32_to_cpu(set_lsa->offset);
+	length = cmd->size_in - sizeof(*set_lsa);
+	if (offset + length > LSA_SIZE)
+		return -EINVAL;
+
+	memcpy(lsa + offset, &set_lsa->data[0], length);
+	return 0;
+}
+
+static int cxl_mock_mbox_send(struct cxl_mem *cxlm, struct cxl_mbox_cmd *cmd)
+{
+	struct device *dev = cxlm->dev;
+	int rc = -EIO;
+
+	switch (cmd->opcode) {
+	case CXL_MBOX_OP_GET_SUPPORTED_LOGS:
+		rc = mock_gsl(cmd);
+		break;
+	case CXL_MBOX_OP_GET_LOG:
+		rc = mock_get_log(cxlm, cmd);
+		break;
+	case CXL_MBOX_OP_IDENTIFY:
+		rc = mock_id(cxlm, cmd);
+		break;
+	case CXL_MBOX_OP_GET_LSA:
+		rc = mock_get_lsa(cxlm, cmd);
+		break;
+	case CXL_MBOX_OP_SET_LSA:
+		rc = mock_set_lsa(cxlm, cmd);
+		break;
+	default:
+		break;
+	}
+
+	dev_dbg(dev, "opcode: %#x sz_in: %zd sz_out: %zd rc: %d\n", cmd->opcode,
+		cmd->size_in, cmd->size_out, rc);
+
+	return rc;
+}
+
+static void label_area_release(void *lsa)
+{
+	vfree(lsa);
+}
+
+static int cxl_mock_mem_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct cxl_memdev *cxlmd;
+	struct cxl_mem *cxlm;
+	void *lsa;
+	int rc;
+
+	lsa = vmalloc(LSA_SIZE);
+	if (!lsa)
+		return -ENOMEM;
+	rc = devm_add_action_or_reset(dev, label_area_release, lsa);
+	if (rc)
+		return rc;
+	dev_set_drvdata(dev, lsa);
+
+	cxlm = cxl_mem_create(dev);
+	if (IS_ERR(cxlm))
+		return PTR_ERR(cxlm);
+
+	cxlm->mbox_send = cxl_mock_mbox_send;
+	cxlm->payload_size = SZ_4K;
+
+	rc = cxl_mem_enumerate_cmds(cxlm);
+	if (rc)
+		return rc;
+
+	rc = cxl_mem_identify(cxlm);
+	if (rc)
+		return rc;
+
+	rc = cxl_mem_create_range_info(cxlm);
+	if (rc)
+		return rc;
+
+	cxlmd = devm_cxl_add_memdev(cxlm);
+	if (IS_ERR(cxlmd))
+		return PTR_ERR(cxlmd);
+
+	if (range_len(&cxlm->pmem_range) && IS_ENABLED(CONFIG_CXL_PMEM))
+		rc = devm_cxl_add_nvdimm(dev, cxlmd);
+
+	return 0;
+}
+
+static const struct platform_device_id cxl_mock_mem_ids[] = {
+	{ .name = "cxl_mem", },
+	{ },
+};
+MODULE_DEVICE_TABLE(platform, cxl_mock_mem_ids);
+
+static struct platform_driver cxl_mock_mem_driver = {
+	.probe = cxl_mock_mem_probe,
+	.id_table = cxl_mock_mem_ids,
+	.driver = {
+		.name = KBUILD_MODNAME,
+	},
+};
+
+module_platform_driver(cxl_mock_mem_driver);
+MODULE_LICENSE("GPL v2");
+MODULE_IMPORT_NS(CXL);


* [PATCH v4 21/21] cxl/core: Split decoder setup into alloc + add
  2021-09-09  5:11 [PATCH v4 00/21] cxl_test: Enable CXL Topology and UAPI regression tests Dan Williams
                   ` (19 preceding siblings ...)
  2021-09-09  5:13 ` [PATCH v4 20/21] tools/testing/cxl: Introduce a mock memory device + driver Dan Williams
@ 2021-09-09  5:13 ` Dan Williams
  2021-09-10 10:33   ` Jonathan Cameron
  2021-09-14 19:31   ` [PATCH v5 " Dan Williams
  20 siblings, 2 replies; 78+ messages in thread
From: Dan Williams @ 2021-09-09  5:13 UTC (permalink / raw)
  To: linux-cxl
  Cc: kernel test robot, Nathan Chancellor, Dan Carpenter,
	vishal.l.verma, nvdimm, ben.widawsky, alison.schofield,
	vishal.l.verma, ira.weiny, Jonathan.Cameron

The kbuild robot reports:

    drivers/cxl/core/bus.c:516:1: warning: stack frame size (1032) exceeds
    limit (1024) in function 'devm_cxl_add_decoder'

It is also the case that devm_cxl_add_decoder() is unwieldy to use for
all the different decoder types. Fix the stack usage by splitting the
creation into alloc and add steps. This also allows for context-specific
construction before adding.

With the split, the caller is responsible for registering a devm
callback to trigger device_unregister() for the decoder, rather than
that being implicit in the decoder registration. I.e., the routine that
calls alloc is responsible for calling put_device() if the "add"
operation fails.
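
Callers now follow this pattern (condensed from the acpi.c hunk below):

	cxld = cxl_decoder_alloc(port, nr_targets);
	if (IS_ERR(cxld))
		return PTR_ERR(cxld);

	/* ...context specific construction, e.g. interleave settings... */

	rc = cxl_decoder_add(host, cxld, target_map);
	if (rc)
		put_device(&cxld->dev);
	else
		rc = cxl_decoder_autoremove(host, cxld);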

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/acpi.c      |   84 +++++++++++++++++++++++++----------
 drivers/cxl/core/bus.c  |  114 ++++++++++++++---------------------------------
 drivers/cxl/core/core.h |    5 --
 drivers/cxl/core/pmem.c |    7 ++-
 drivers/cxl/cxl.h       |   16 +++----
 5 files changed, 106 insertions(+), 120 deletions(-)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index 9d881eacdae5..654a80547526 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -82,7 +82,6 @@ static void cxl_add_cfmws_decoders(struct device *dev,
 	struct cxl_decoder *cxld;
 	acpi_size len, cur = 0;
 	void *cedt_subtable;
-	unsigned long flags;
 	int rc;
 
 	len = acpi_cedt->length - sizeof(*acpi_cedt);
@@ -119,24 +118,36 @@ static void cxl_add_cfmws_decoders(struct device *dev,
 		for (i = 0; i < CFMWS_INTERLEAVE_WAYS(cfmws); i++)
 			target_map[i] = cfmws->interleave_targets[i];
 
-		flags = cfmws_to_decoder_flags(cfmws->restrictions);
-		cxld = devm_cxl_add_decoder(dev, root_port,
-					    CFMWS_INTERLEAVE_WAYS(cfmws),
-					    cfmws->base_hpa, cfmws->window_size,
-					    CFMWS_INTERLEAVE_WAYS(cfmws),
-					    CFMWS_INTERLEAVE_GRANULARITY(cfmws),
-					    CXL_DECODER_EXPANDER,
-					    flags, target_map);
-
-		if (IS_ERR(cxld)) {
+		cxld = cxl_decoder_alloc(root_port,
+					 CFMWS_INTERLEAVE_WAYS(cfmws));
+		if (IS_ERR(cxld))
+			goto next;
+
+		cxld->flags = cfmws_to_decoder_flags(cfmws->restrictions);
+		cxld->target_type = CXL_DECODER_EXPANDER;
+		cxld->range = (struct range) {
+			.start = cfmws->base_hpa,
+			.end = cfmws->base_hpa + cfmws->window_size - 1,
+		};
+		cxld->interleave_ways = CFMWS_INTERLEAVE_WAYS(cfmws);
+		cxld->interleave_granularity =
+			CFMWS_INTERLEAVE_GRANULARITY(cfmws);
+
+		rc = cxl_decoder_add(dev, cxld, target_map);
+		if (rc)
+			put_device(&cxld->dev);
+		else
+			rc = cxl_decoder_autoremove(dev, cxld);
+		if (rc) {
 			dev_err(dev, "Failed to add decoder for %#llx-%#llx\n",
 				cfmws->base_hpa, cfmws->base_hpa +
 				cfmws->window_size - 1);
-		} else {
-			dev_dbg(dev, "add: %s range %#llx-%#llx\n",
-				dev_name(&cxld->dev), cfmws->base_hpa,
-				 cfmws->base_hpa + cfmws->window_size - 1);
+			goto next;
 		}
+		dev_dbg(dev, "add: %s range %#llx-%#llx\n",
+			dev_name(&cxld->dev), cfmws->base_hpa,
+			cfmws->base_hpa + cfmws->window_size - 1);
+next:
 		cur += c->length;
 	}
 }
@@ -268,6 +279,7 @@ static int add_host_bridge_uport(struct device *match, void *arg)
 	struct acpi_device *bridge = to_cxl_host_bridge(host, match);
 	struct acpi_pci_root *pci_root;
 	struct cxl_walk_context ctx;
+	int single_port_map[1], rc;
 	struct cxl_decoder *cxld;
 	struct cxl_dport *dport;
 	struct cxl_port *port;
@@ -303,22 +315,46 @@ static int add_host_bridge_uport(struct device *match, void *arg)
 		return -ENODEV;
 	if (ctx.error)
 		return ctx.error;
+	if (ctx.count > 1)
+		return 0;
 
 	/* TODO: Scan CHBCR for HDM Decoder resources */
 
 	/*
-	 * In the single-port host-bridge case there are no HDM decoders
-	 * in the CHBCR and a 1:1 passthrough decode is implied.
+	 * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability
+	 * Structure) single ported host-bridges need not publish a decoder
+	 * capability when a passthrough decode can be assumed, i.e. all
+	 * transactions that the uport sees are claimed and passed to the single
+	 * dport. Default the range to 0-base, 0-length until the first CXL region
+	 * is activated.
 	 */
-	if (ctx.count == 1) {
-		cxld = devm_cxl_add_passthrough_decoder(host, port);
-		if (IS_ERR(cxld))
-			return PTR_ERR(cxld);
+	cxld = cxl_decoder_alloc(port, 1);
+	if (IS_ERR(cxld))
+		return PTR_ERR(cxld);
+
+	cxld->interleave_ways = 1;
+	cxld->interleave_granularity = PAGE_SIZE;
+	cxld->target_type = CXL_DECODER_EXPANDER;
+	cxld->range = (struct range) {
+		.start = 0,
+		.end = -1,
+	};
 
-		dev_dbg(host, "add: %s\n", dev_name(&cxld->dev));
-	}
+	device_lock(&port->dev);
+	dport = list_first_entry(&port->dports, typeof(*dport), list);
+	device_unlock(&port->dev);
 
-	return 0;
+	single_port_map[0] = dport->port_id;
+
+	rc = cxl_decoder_add(host, cxld, single_port_map);
+	if (rc)
+		put_device(&cxld->dev);
+	else
+		rc = cxl_decoder_autoremove(host, cxld);
+
+	if (rc == 0)
+		dev_dbg(host, "add: %s\n", dev_name(&cxld->dev));
+	return rc;
 }
 
 static int add_host_bridge_dport(struct device *match, void *arg)
diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
index 176bede30c55..be787685b13e 100644
--- a/drivers/cxl/core/bus.c
+++ b/drivers/cxl/core/bus.c
@@ -455,16 +455,18 @@ EXPORT_SYMBOL_GPL(cxl_add_dport);
 
 static int decoder_populate_targets(struct device *host,
 				    struct cxl_decoder *cxld,
-				    struct cxl_port *port, int *target_map,
-				    int nr_targets)
+				    struct cxl_port *port, int *target_map)
 {
 	int rc = 0, i;
 
+	if (list_empty(&port->dports))
+		return -EINVAL;
+
 	if (!target_map)
 		return 0;
 
 	device_lock(&port->dev);
-	for (i = 0; i < nr_targets; i++) {
+	for (i = 0; i < cxld->nr_targets; i++) {
 		struct cxl_dport *dport = find_dport(port, target_map[i]);
 
 		if (!dport) {
@@ -479,27 +481,15 @@ static int decoder_populate_targets(struct device *host,
 	return rc;
 }
 
-static struct cxl_decoder *
-cxl_decoder_alloc(struct device *host, struct cxl_port *port, int nr_targets,
-		  resource_size_t base, resource_size_t len,
-		  int interleave_ways, int interleave_granularity,
-		  enum cxl_decoder_type type, unsigned long flags,
-		  int *target_map)
+struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
 {
 	struct cxl_decoder *cxld;
 	struct device *dev;
 	int rc = 0;
 
-	if (interleave_ways < 1)
+	if (nr_targets > CXL_DECODER_MAX_INTERLEAVE || nr_targets < 1)
 		return ERR_PTR(-EINVAL);
 
-	device_lock(&port->dev);
-	if (list_empty(&port->dports))
-		rc = -EINVAL;
-	device_unlock(&port->dev);
-	if (rc)
-		return ERR_PTR(rc);
-
 	cxld = kzalloc(struct_size(cxld, target, nr_targets), GFP_KERNEL);
 	if (!cxld)
 		return ERR_PTR(-ENOMEM);
@@ -508,22 +498,8 @@ cxl_decoder_alloc(struct device *host, struct cxl_port *port, int nr_targets,
 	if (rc < 0)
 		goto err;
 
-	*cxld = (struct cxl_decoder) {
-		.id = rc,
-		.range = {
-			.start = base,
-			.end = base + len - 1,
-		},
-		.flags = flags,
-		.interleave_ways = interleave_ways,
-		.interleave_granularity = interleave_granularity,
-		.target_type = type,
-	};
-
-	rc = decoder_populate_targets(host, cxld, port, target_map, nr_targets);
-	if (rc)
-		goto err;
-
+	cxld->id = rc;
+	cxld->nr_targets = nr_targets;
 	dev = &cxld->dev;
 	device_initialize(dev);
 	device_set_pm_not_required(dev);
@@ -541,72 +517,48 @@ cxl_decoder_alloc(struct device *host, struct cxl_port *port, int nr_targets,
 	kfree(cxld);
 	return ERR_PTR(rc);
 }
+EXPORT_SYMBOL_GPL(cxl_decoder_alloc);
 
-struct cxl_decoder *
-devm_cxl_add_decoder(struct device *host, struct cxl_port *port, int nr_targets,
-		     resource_size_t base, resource_size_t len,
-		     int interleave_ways, int interleave_granularity,
-		     enum cxl_decoder_type type, unsigned long flags,
-		     int *target_map)
+int cxl_decoder_add(struct device *host, struct cxl_decoder *cxld,
+		    int *target_map)
 {
-	struct cxl_decoder *cxld;
+	struct cxl_port *port;
 	struct device *dev;
 	int rc;
 
-	if (nr_targets > CXL_DECODER_MAX_INTERLEAVE)
-		return ERR_PTR(-EINVAL);
+	if (!cxld)
+		return -EINVAL;
 
-	cxld = cxl_decoder_alloc(host, port, nr_targets, base, len,
-				 interleave_ways, interleave_granularity, type,
-				 flags, target_map);
 	if (IS_ERR(cxld))
-		return cxld;
+		return PTR_ERR(cxld);
 
-	dev = &cxld->dev;
-	rc = dev_set_name(dev, "decoder%d.%d", port->id, cxld->id);
-	if (rc)
-		goto err;
+	if (cxld->interleave_ways < 1)
+		return -EINVAL;
 
-	rc = device_add(dev);
+	port = to_cxl_port(cxld->dev.parent);
+	rc = decoder_populate_targets(host, cxld, port, target_map);
 	if (rc)
-		goto err;
+		return rc;
 
-	rc = devm_add_action_or_reset(host, unregister_cxl_dev, dev);
+	dev = &cxld->dev;
+	rc = dev_set_name(dev, "decoder%d.%d", port->id, cxld->id);
 	if (rc)
-		return ERR_PTR(rc);
-	return cxld;
+		return rc;
 
-err:
-	put_device(dev);
-	return ERR_PTR(rc);
+	return device_add(dev);
 }
-EXPORT_SYMBOL_GPL(devm_cxl_add_decoder);
+EXPORT_SYMBOL_GPL(cxl_decoder_add);
 
-/*
- * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability Structure)
- * single ported host-bridges need not publish a decoder capability when a
- * passthrough decode can be assumed, i.e. all transactions that the uport sees
- * are claimed and passed to the single dport. Default the range to 0-base,
- * 0-length until the first CXL region is activated.
- */
-struct cxl_decoder *devm_cxl_add_passthrough_decoder(struct device *host,
-						     struct cxl_port *port)
+static void cxld_unregister(void *dev)
 {
-	struct cxl_dport *dport;
-	int target_map[1];
-
-	device_lock(&port->dev);
-	dport = list_first_entry_or_null(&port->dports, typeof(*dport), list);
-	device_unlock(&port->dev);
-
-	if (!dport)
-		return ERR_PTR(-ENXIO);
+	device_unregister(dev);
+}
 
-	target_map[0] = dport->port_id;
-	return devm_cxl_add_decoder(host, port, 1, 0, 0, 1, PAGE_SIZE,
-				    CXL_DECODER_EXPANDER, 0, target_map);
+int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld)
+{
+	return devm_add_action_or_reset(host, cxld_unregister, &cxld->dev);
 }
-EXPORT_SYMBOL_GPL(devm_cxl_add_passthrough_decoder);
+EXPORT_SYMBOL_GPL(cxl_decoder_autoremove);
 
 /**
  * __cxl_driver_register - register a driver for the cxl bus
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index c85b7fbad02d..e0c9aacc4e9c 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -9,11 +9,6 @@ extern const struct device_type cxl_nvdimm_type;
 
 extern struct attribute_group cxl_base_attribute_group;
 
-static inline void unregister_cxl_dev(void *dev)
-{
-	device_unregister(dev);
-}
-
 struct cxl_send_command;
 struct cxl_mem_query_commands;
 int cxl_query_cmd(struct cxl_memdev *cxlmd,
diff --git a/drivers/cxl/core/pmem.c b/drivers/cxl/core/pmem.c
index 74be5132df1c..5032f4c1c69d 100644
--- a/drivers/cxl/core/pmem.c
+++ b/drivers/cxl/core/pmem.c
@@ -222,6 +222,11 @@ static struct cxl_nvdimm *cxl_nvdimm_alloc(struct cxl_memdev *cxlmd)
 	return cxl_nvd;
 }
 
+static void cxl_nvd_unregister(void *dev)
+{
+	device_unregister(dev);
+}
+
 /**
  * devm_cxl_add_nvdimm() - add a bridge between a cxl_memdev and an nvdimm
  * @host: same host as @cxlmd
@@ -251,7 +256,7 @@ int devm_cxl_add_nvdimm(struct device *host, struct cxl_memdev *cxlmd)
 	dev_dbg(host, "%s: register %s\n", dev_name(dev->parent),
 		dev_name(dev));
 
-	return devm_add_action_or_reset(host, unregister_cxl_dev, dev);
+	return devm_add_action_or_reset(host, cxl_nvd_unregister, dev);
 
 err:
 	put_device(dev);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 9af5745ba2c0..6c7a7e9af0d4 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -195,6 +195,7 @@ enum cxl_decoder_type {
  * @interleave_granularity: data stride per dport
  * @target_type: accelerator vs expander (type2 vs type3) selector
  * @flags: memory type capabilities and locking
+ * @nr_targets: number of elements in @target
  * @target: active ordered target list in current decoder configuration
  */
 struct cxl_decoder {
@@ -205,6 +206,7 @@ struct cxl_decoder {
 	int interleave_granularity;
 	enum cxl_decoder_type target_type;
 	unsigned long flags;
+	int nr_targets;
 	struct cxl_dport *target[];
 };
 
@@ -286,15 +288,11 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport, int port_id,
 
 struct cxl_decoder *to_cxl_decoder(struct device *dev);
 bool is_root_decoder(struct device *dev);
-struct cxl_decoder *
-devm_cxl_add_decoder(struct device *host, struct cxl_port *port, int nr_targets,
-		     resource_size_t base, resource_size_t len,
-		     int interleave_ways, int interleave_granularity,
-		     enum cxl_decoder_type type, unsigned long flags,
-		     int *target_map);
-
-struct cxl_decoder *devm_cxl_add_passthrough_decoder(struct device *host,
-						     struct cxl_port *port);
+struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets);
+int cxl_decoder_add(struct device *host, struct cxl_decoder *cxld,
+		    int *target_map);
+int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld);
+
 extern struct bus_type cxl_bus_type;
 
 struct cxl_driver {


* Re: [PATCH v4 03/21] libnvdimm/labels: Introduce the concept of multi-range namespace labels
  2021-09-09  5:11 ` [PATCH v4 03/21] libnvdimm/labels: Introduce the concept of multi-range namespace labels Dan Williams
@ 2021-09-09 13:09   ` Jonathan Cameron
  2021-09-09 15:16     ` Dan Williams
  0 siblings, 1 reply; 78+ messages in thread
From: Jonathan Cameron @ 2021-09-09 13:09 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, vishal.l.verma, nvdimm, ben.widawsky,
	alison.schofield, ira.weiny

On Wed, 8 Sep 2021 22:11:48 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> The CXL specification defines a mechanism for namespaces to be comprised
> of multiple dis-contiguous ranges. Introduce that concept to the legacy
> NVDIMM namespace implementation with a new nsl_set_nrange() helper, that
> sets the number of ranges to 1. Once the NVDIMM subsystem supports CXL
> labels and updates its namespace capacity provisioning for
> dis-contiguous support nsl_set_nrange() can be updated, but in the
> meantime CXL label validation requires nrange be non-zero.
> 
> Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

(gave tag on v3 and this looks to be the same).

Thanks,

Jonathan

> ---
>  drivers/nvdimm/label.c |    1 +
>  drivers/nvdimm/nd.h    |   13 +++++++++++++
>  2 files changed, 14 insertions(+)
> 
> diff --git a/drivers/nvdimm/label.c b/drivers/nvdimm/label.c
> index e7fdb718ebf0..7832b190efd7 100644
> --- a/drivers/nvdimm/label.c
> +++ b/drivers/nvdimm/label.c
> @@ -856,6 +856,7 @@ static int __pmem_label_update(struct nd_region *nd_region,
>  	nsl_set_name(ndd, nd_label, nspm->alt_name);
>  	nsl_set_flags(ndd, nd_label, flags);
>  	nsl_set_nlabel(ndd, nd_label, nd_region->ndr_mappings);
> +	nsl_set_nrange(ndd, nd_label, 1);
>  	nsl_set_position(ndd, nd_label, pos);
>  	nsl_set_isetcookie(ndd, nd_label, cookie);
>  	nsl_set_rawsize(ndd, nd_label, resource_size(res));
> diff --git a/drivers/nvdimm/nd.h b/drivers/nvdimm/nd.h
> index 036638bdb7e3..d57f95a48fe1 100644
> --- a/drivers/nvdimm/nd.h
> +++ b/drivers/nvdimm/nd.h
> @@ -164,6 +164,19 @@ static inline void nsl_set_nlabel(struct nvdimm_drvdata *ndd,
>  	nd_label->nlabel = __cpu_to_le16(nlabel);
>  }
>  
> +static inline u16 nsl_get_nrange(struct nvdimm_drvdata *ndd,
> +				 struct nd_namespace_label *nd_label)
> +{
> +	/* EFI labels do not have an nrange field */
> +	return 1;
> +}
> +
> +static inline void nsl_set_nrange(struct nvdimm_drvdata *ndd,
> +				  struct nd_namespace_label *nd_label,
> +				  u16 nrange)
> +{
> +}
> +
>  static inline u64 nsl_get_lbasize(struct nvdimm_drvdata *ndd,
>  				  struct nd_namespace_label *nd_label)
>  {
> 


* Re: [PATCH v4 03/21] libnvdimm/labels: Introduce the concept of multi-range namespace labels
  2021-09-09 13:09   ` Jonathan Cameron
@ 2021-09-09 15:16     ` Dan Williams
  0 siblings, 0 replies; 78+ messages in thread
From: Dan Williams @ 2021-09-09 15:16 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, Vishal L Verma, Linux NVDIMM, Ben Widawsky, Schofield,
	Alison, Weiny, Ira

On Thu, Sep 9, 2021 at 6:10 AM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Wed, 8 Sep 2021 22:11:48 -0700
> Dan Williams <dan.j.williams@intel.com> wrote:
>
> > The CXL specification defines a mechanism for namespaces to be comprised
> > of multiple dis-contiguous ranges. Introduce that concept to the legacy
> > NVDIMM namespace implementation with a new nsl_set_nrange() helper, that
> > sets the number of ranges to 1. Once the NVDIMM subsystem supports CXL
> > labels and updates its namespace capacity provisioning for
> > dis-contiguous support nsl_set_nrange() can be updated, but in the
> > meantime CXL label validation requires nrange be non-zero.
> >
> > Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>
> (gave tag on v3 and this looks to be the same).

Whoops, sorry about that, now recorded.

* Re: [PATCH v4 05/21] libnvdimm/label: Define CXL region labels
  2021-09-09  5:11 ` [PATCH v4 05/21] libnvdimm/label: Define CXL region labels Dan Williams
@ 2021-09-09 15:58   ` Ben Widawsky
  2021-09-09 18:38     ` Dan Williams
  0 siblings, 1 reply; 78+ messages in thread
From: Ben Widawsky @ 2021-09-09 15:58 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Jonathan Cameron, vishal.l.verma, nvdimm,
	alison.schofield, ira.weiny

On 21-09-08 22:11:58, Dan Williams wrote:
> Add a definition of the CXL 2.0 region label format. Note this is done
> as a separate patch to make the next patch that adds namespace label
> support easier to read.
> 
> Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/nvdimm/label.h |   32 ++++++++++++++++++++++++++++++++
>  1 file changed, 32 insertions(+)

Wondering how awkward it's going to be to use this in the cxl region
driver. Is the intent to have all device-based reads/writes to labels
happen in drivers/nvdimm?

> 
> diff --git a/drivers/nvdimm/label.h b/drivers/nvdimm/label.h
> index 7fa757d47846..0519aacc2926 100644
> --- a/drivers/nvdimm/label.h
> +++ b/drivers/nvdimm/label.h
> @@ -66,6 +66,38 @@ struct nd_namespace_index {
>  	u8 free[];
>  };
>  
> +/**
> + * struct cxl_region_label - CXL 2.0 Table 211
> + * @type: uuid identifying this label format (region)
> + * @uuid: uuid for the region this label describes
> + * @flags: NSLABEL_FLAG_UPDATING (all other flags reserved)
> + * @nlabel: 1 per interleave-way in the region
> + * @position: this label's position in the set
> + * @dpa: start address in device-local capacity for this label
> + * @rawsize: size of this label's contribution to region
> + * @hpa: mandatory system physical address to map this region
> + * @slot: slot id of this label in label area
> + * @ig: interleave granularity (1 << @ig) * 256 bytes
> + * @align: alignment in SZ_256M blocks
> + * @reserved: reserved
> + * @checksum: fletcher64 sum of this label
> + */
> +struct cxl_region_label {
> +	u8 type[NSLABEL_UUID_LEN];
> +	u8 uuid[NSLABEL_UUID_LEN];
> +	__le32 flags;
> +	__le16 nlabel;
> +	__le16 position;
> +	__le64 dpa;
> +	__le64 rawsize;
> +	__le64 hpa;
> +	__le32 slot;
> +	__le32 ig;
> +	__le32 align;
> +	u8 reserved[0xac];
> +	__le64 checksum;
> +};
> +
>  /**
>   * struct nd_namespace_label - namespace superblock
>   * @uuid: UUID per RFC 4122
> 

* Re: [PATCH v4 07/21] cxl/pci: Make 'struct cxl_mem' device type generic
  2021-09-09  5:12 ` [PATCH v4 07/21] cxl/pci: Make 'struct cxl_mem' device type generic Dan Williams
@ 2021-09-09 16:12   ` Ben Widawsky
  2021-09-10  8:43   ` Jonathan Cameron
  1 sibling, 0 replies; 78+ messages in thread
From: Ben Widawsky @ 2021-09-09 16:12 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, vishal.l.verma, nvdimm, alison.schofield, ira.weiny,
	Jonathan.Cameron

On 21-09-08 22:12:09, Dan Williams wrote:
> In preparation for adding a unit test provider of a cxl_memdev, convert
> the 'struct cxl_mem' driver context to carry a generic device rather
> than a pci device.
> 
> Note, some dev_dbg() lines needed extra reformatting per clang-format.
> 
> This conversion also allows the cxl_mem_create() and
> devm_cxl_add_memdev() calling conventions to be simplified. The "host"
> for a cxl_memdev, must be the same device for the driver that allocated
> @cxlm.
> 
> Acked-by: Ben Widawsky <ben.widawsky@intel.com>

You can bump this to
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>

> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/core/memdev.c |    7 ++--
>  drivers/cxl/cxlmem.h      |    6 ++--
>  drivers/cxl/pci.c         |   75 +++++++++++++++++++++------------------------
>  3 files changed, 41 insertions(+), 47 deletions(-)
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index a9c317e32010..331ef7d6c598 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -149,7 +149,6 @@ static void cxl_memdev_unregister(void *_cxlmd)
>  static struct cxl_memdev *cxl_memdev_alloc(struct cxl_mem *cxlm,
>  					   const struct file_operations *fops)
>  {
> -	struct pci_dev *pdev = cxlm->pdev;
>  	struct cxl_memdev *cxlmd;
>  	struct device *dev;
>  	struct cdev *cdev;
> @@ -166,7 +165,7 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_mem *cxlm,
>  
>  	dev = &cxlmd->dev;
>  	device_initialize(dev);
> -	dev->parent = &pdev->dev;
> +	dev->parent = cxlm->dev;
>  	dev->bus = &cxl_bus_type;
>  	dev->devt = MKDEV(cxl_mem_major, cxlmd->id);
>  	dev->type = &cxl_memdev_type;
> @@ -182,7 +181,7 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_mem *cxlm,
>  }
>  
>  struct cxl_memdev *
> -devm_cxl_add_memdev(struct device *host, struct cxl_mem *cxlm,
> +devm_cxl_add_memdev(struct cxl_mem *cxlm,
>  		    const struct cdevm_file_operations *cdevm_fops)
>  {
>  	struct cxl_memdev *cxlmd;
> @@ -210,7 +209,7 @@ devm_cxl_add_memdev(struct device *host, struct cxl_mem *cxlm,
>  	if (rc)
>  		goto err;
>  
> -	rc = devm_add_action_or_reset(host, cxl_memdev_unregister, cxlmd);
> +	rc = devm_add_action_or_reset(cxlm->dev, cxl_memdev_unregister, cxlmd);
>  	if (rc)
>  		return ERR_PTR(rc);
>  	return cxlmd;
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 6c0b1e2ea97c..d5334df83fb2 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -63,12 +63,12 @@ static inline struct cxl_memdev *to_cxl_memdev(struct device *dev)
>  }
>  
>  struct cxl_memdev *
> -devm_cxl_add_memdev(struct device *host, struct cxl_mem *cxlm,
> +devm_cxl_add_memdev(struct cxl_mem *cxlm,
>  		    const struct cdevm_file_operations *cdevm_fops);
>  
>  /**
>   * struct cxl_mem - A CXL memory device
> - * @pdev: The PCI device associated with this CXL device.
> + * @dev: The device associated with this CXL device.
>   * @cxlmd: Logical memory device chardev / interface
>   * @regs: Parsed register blocks
>   * @payload_size: Size of space for payload
> @@ -82,7 +82,7 @@ devm_cxl_add_memdev(struct device *host, struct cxl_mem *cxlm,
>   * @ram_range: Volatile memory capacity information.
>   */
>  struct cxl_mem {
> -	struct pci_dev *pdev;
> +	struct device *dev;
>  	struct cxl_memdev *cxlmd;
>  
>  	struct cxl_regs regs;
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 8e45aa07d662..c1e1d12e24b6 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -250,7 +250,7 @@ static int cxl_mem_wait_for_doorbell(struct cxl_mem *cxlm)
>  		cpu_relax();
>  	}
>  
> -	dev_dbg(&cxlm->pdev->dev, "Doorbell wait took %dms",
> +	dev_dbg(cxlm->dev, "Doorbell wait took %dms",
>  		jiffies_to_msecs(end) - jiffies_to_msecs(start));
>  	return 0;
>  }
> @@ -268,7 +268,7 @@ static bool cxl_is_security_command(u16 opcode)
>  static void cxl_mem_mbox_timeout(struct cxl_mem *cxlm,
>  				 struct mbox_cmd *mbox_cmd)
>  {
> -	struct device *dev = &cxlm->pdev->dev;
> +	struct device *dev = cxlm->dev;
>  
>  	dev_dbg(dev, "Mailbox command (opcode: %#x size: %zub) timed out\n",
>  		mbox_cmd->opcode, mbox_cmd->size_in);
> @@ -300,6 +300,7 @@ static int __cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm,
>  				   struct mbox_cmd *mbox_cmd)
>  {
>  	void __iomem *payload = cxlm->regs.mbox + CXLDEV_MBOX_PAYLOAD_OFFSET;
> +	struct device *dev = cxlm->dev;
>  	u64 cmd_reg, status_reg;
>  	size_t out_len;
>  	int rc;
> @@ -325,8 +326,7 @@ static int __cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm,
>  
>  	/* #1 */
>  	if (cxl_doorbell_busy(cxlm)) {
> -		dev_err_ratelimited(&cxlm->pdev->dev,
> -				    "Mailbox re-busy after acquiring\n");
> +		dev_err_ratelimited(dev, "Mailbox re-busy after acquiring\n");
>  		return -EBUSY;
>  	}
>  
> @@ -345,7 +345,7 @@ static int __cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm,
>  	writeq(cmd_reg, cxlm->regs.mbox + CXLDEV_MBOX_CMD_OFFSET);
>  
>  	/* #4 */
> -	dev_dbg(&cxlm->pdev->dev, "Sending command\n");
> +	dev_dbg(dev, "Sending command\n");
>  	writel(CXLDEV_MBOX_CTRL_DOORBELL,
>  	       cxlm->regs.mbox + CXLDEV_MBOX_CTRL_OFFSET);
>  
> @@ -362,7 +362,7 @@ static int __cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm,
>  		FIELD_GET(CXLDEV_MBOX_STATUS_RET_CODE_MASK, status_reg);
>  
>  	if (mbox_cmd->return_code != 0) {
> -		dev_dbg(&cxlm->pdev->dev, "Mailbox operation had an error\n");
> +		dev_dbg(dev, "Mailbox operation had an error\n");
>  		return 0;
>  	}
>  
> @@ -399,7 +399,7 @@ static int __cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm,
>   */
>  static int cxl_mem_mbox_get(struct cxl_mem *cxlm)
>  {
> -	struct device *dev = &cxlm->pdev->dev;
> +	struct device *dev = cxlm->dev;
>  	u64 md_status;
>  	int rc;
>  
> @@ -502,7 +502,7 @@ static int handle_mailbox_cmd_from_user(struct cxl_mem *cxlm,
>  					u64 in_payload, u64 out_payload,
>  					s32 *size_out, u32 *retval)
>  {
> -	struct device *dev = &cxlm->pdev->dev;
> +	struct device *dev = cxlm->dev;
>  	struct mbox_cmd mbox_cmd = {
>  		.opcode = cmd->opcode,
>  		.size_in = cmd->info.size_in,
> @@ -925,20 +925,19 @@ static int cxl_mem_setup_mailbox(struct cxl_mem *cxlm)
>  	 */
>  	cxlm->payload_size = min_t(size_t, cxlm->payload_size, SZ_1M);
>  	if (cxlm->payload_size < 256) {
> -		dev_err(&cxlm->pdev->dev, "Mailbox is too small (%zub)",
> +		dev_err(cxlm->dev, "Mailbox is too small (%zub)",
>  			cxlm->payload_size);
>  		return -ENXIO;
>  	}
>  
> -	dev_dbg(&cxlm->pdev->dev, "Mailbox payload sized %zu",
> +	dev_dbg(cxlm->dev, "Mailbox payload sized %zu",
>  		cxlm->payload_size);
>  
>  	return 0;
>  }
>  
> -static struct cxl_mem *cxl_mem_create(struct pci_dev *pdev)
> +static struct cxl_mem *cxl_mem_create(struct device *dev)
>  {
> -	struct device *dev = &pdev->dev;
>  	struct cxl_mem *cxlm;
>  
>  	cxlm = devm_kzalloc(dev, sizeof(*cxlm), GFP_KERNEL);
> @@ -948,7 +947,7 @@ static struct cxl_mem *cxl_mem_create(struct pci_dev *pdev)
>  	}
>  
>  	mutex_init(&cxlm->mbox_mutex);
> -	cxlm->pdev = pdev;
> +	cxlm->dev = dev;
>  	cxlm->enabled_cmds =
>  		devm_kmalloc_array(dev, BITS_TO_LONGS(cxl_cmd_count),
>  				   sizeof(unsigned long),
> @@ -964,9 +963,9 @@ static struct cxl_mem *cxl_mem_create(struct pci_dev *pdev)
>  static void __iomem *cxl_mem_map_regblock(struct cxl_mem *cxlm,
>  					  u8 bar, u64 offset)
>  {
> -	struct pci_dev *pdev = cxlm->pdev;
> -	struct device *dev = &pdev->dev;
>  	void __iomem *addr;
> +	struct device *dev = cxlm->dev;
> +	struct pci_dev *pdev = to_pci_dev(dev);
>  
>  	/* Basic sanity check that BAR is big enough */
>  	if (pci_resource_len(pdev, bar) < offset) {
> @@ -989,7 +988,7 @@ static void __iomem *cxl_mem_map_regblock(struct cxl_mem *cxlm,
>  
>  static void cxl_mem_unmap_regblock(struct cxl_mem *cxlm, void __iomem *base)
>  {
> -	pci_iounmap(cxlm->pdev, base);
> +	pci_iounmap(to_pci_dev(cxlm->dev), base);
>  }
>  
>  static int cxl_mem_dvsec(struct pci_dev *pdev, int dvsec)
> @@ -1018,10 +1017,9 @@ static int cxl_mem_dvsec(struct pci_dev *pdev, int dvsec)
>  static int cxl_probe_regs(struct cxl_mem *cxlm, void __iomem *base,
>  			  struct cxl_register_map *map)
>  {
> -	struct pci_dev *pdev = cxlm->pdev;
> -	struct device *dev = &pdev->dev;
>  	struct cxl_component_reg_map *comp_map;
>  	struct cxl_device_reg_map *dev_map;
> +	struct device *dev = cxlm->dev;
>  
>  	switch (map->reg_type) {
>  	case CXL_REGLOC_RBI_COMPONENT:
> @@ -1057,8 +1055,8 @@ static int cxl_probe_regs(struct cxl_mem *cxlm, void __iomem *base,
>  
>  static int cxl_map_regs(struct cxl_mem *cxlm, struct cxl_register_map *map)
>  {
> -	struct pci_dev *pdev = cxlm->pdev;
> -	struct device *dev = &pdev->dev;
> +	struct device *dev = cxlm->dev;
> +	struct pci_dev *pdev = to_pci_dev(dev);
>  
>  	switch (map->reg_type) {
>  	case CXL_REGLOC_RBI_COMPONENT:
> @@ -1096,13 +1094,12 @@ static void cxl_decode_register_block(u32 reg_lo, u32 reg_hi,
>   */
>  static int cxl_mem_setup_regs(struct cxl_mem *cxlm)
>  {
> -	struct pci_dev *pdev = cxlm->pdev;
> -	struct device *dev = &pdev->dev;
> -	u32 regloc_size, regblocks;
>  	void __iomem *base;
> -	int regloc, i, n_maps;
> +	u32 regloc_size, regblocks;
> +	int regloc, i, n_maps, ret = 0;
> +	struct device *dev = cxlm->dev;
> +	struct pci_dev *pdev = to_pci_dev(dev);
>  	struct cxl_register_map *map, maps[CXL_REGLOC_RBI_TYPES];
> -	int ret = 0;
>  
>  	regloc = cxl_mem_dvsec(pdev, PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID);
>  	if (!regloc) {
> @@ -1226,7 +1223,7 @@ static void cxl_walk_cel(struct cxl_mem *cxlm, size_t size, u8 *cel)
>  		struct cxl_mem_command *cmd = cxl_mem_find_command(opcode);
>  
>  		if (!cmd) {
> -			dev_dbg(&cxlm->pdev->dev,
> +			dev_dbg(cxlm->dev,
>  				"Opcode 0x%04x unsupported by driver", opcode);
>  			continue;
>  		}
> @@ -1323,7 +1320,7 @@ static int cxl_mem_get_partition_info(struct cxl_mem *cxlm,
>  static int cxl_mem_enumerate_cmds(struct cxl_mem *cxlm)
>  {
>  	struct cxl_mbox_get_supported_logs *gsl;
> -	struct device *dev = &cxlm->pdev->dev;
> +	struct device *dev = cxlm->dev;
>  	struct cxl_mem_command *cmd;
>  	int i, rc;
>  
> @@ -1418,15 +1415,14 @@ static int cxl_mem_identify(struct cxl_mem *cxlm)
>  	cxlm->partition_align_bytes = le64_to_cpu(id.partition_align);
>  	cxlm->partition_align_bytes *= CXL_CAPACITY_MULTIPLIER;
>  
> -	dev_dbg(&cxlm->pdev->dev, "Identify Memory Device\n"
> +	dev_dbg(cxlm->dev,
> +		"Identify Memory Device\n"
>  		"     total_bytes = %#llx\n"
>  		"     volatile_only_bytes = %#llx\n"
>  		"     persistent_only_bytes = %#llx\n"
>  		"     partition_align_bytes = %#llx\n",
> -			cxlm->total_bytes,
> -			cxlm->volatile_only_bytes,
> -			cxlm->persistent_only_bytes,
> -			cxlm->partition_align_bytes);
> +		cxlm->total_bytes, cxlm->volatile_only_bytes,
> +		cxlm->persistent_only_bytes, cxlm->partition_align_bytes);
>  
>  	cxlm->lsa_size = le32_to_cpu(id.lsa_size);
>  	memcpy(cxlm->firmware_version, id.fw_revision, sizeof(id.fw_revision));
> @@ -1453,19 +1449,18 @@ static int cxl_mem_create_range_info(struct cxl_mem *cxlm)
>  					&cxlm->next_volatile_bytes,
>  					&cxlm->next_persistent_bytes);
>  	if (rc < 0) {
> -		dev_err(&cxlm->pdev->dev, "Failed to query partition information\n");
> +		dev_err(cxlm->dev, "Failed to query partition information\n");
>  		return rc;
>  	}
>  
> -	dev_dbg(&cxlm->pdev->dev, "Get Partition Info\n"
> +	dev_dbg(cxlm->dev,
> +		"Get Partition Info\n"
>  		"     active_volatile_bytes = %#llx\n"
>  		"     active_persistent_bytes = %#llx\n"
>  		"     next_volatile_bytes = %#llx\n"
>  		"     next_persistent_bytes = %#llx\n",
> -			cxlm->active_volatile_bytes,
> -			cxlm->active_persistent_bytes,
> -			cxlm->next_volatile_bytes,
> -			cxlm->next_persistent_bytes);
> +		cxlm->active_volatile_bytes, cxlm->active_persistent_bytes,
> +		cxlm->next_volatile_bytes, cxlm->next_persistent_bytes);
>  
>  	cxlm->ram_range.start = 0;
>  	cxlm->ram_range.end = cxlm->active_volatile_bytes - 1;
> @@ -1487,7 +1482,7 @@ static int cxl_mem_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (rc)
>  		return rc;
>  
> -	cxlm = cxl_mem_create(pdev);
> +	cxlm = cxl_mem_create(&pdev->dev);
>  	if (IS_ERR(cxlm))
>  		return PTR_ERR(cxlm);
>  
> @@ -1511,7 +1506,7 @@ static int cxl_mem_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (rc)
>  		return rc;
>  
> -	cxlmd = devm_cxl_add_memdev(&pdev->dev, cxlm, &cxl_memdev_fops);
> +	cxlmd = devm_cxl_add_memdev(cxlm, &cxl_memdev_fops);
>  	if (IS_ERR(cxlmd))
>  		return PTR_ERR(cxlmd);
>  
> 

* Re: [PATCH v4 08/21] cxl/pci: Clean up cxl_mem_get_partition_info()
  2021-09-09  5:12 ` [PATCH v4 08/21] cxl/pci: Clean up cxl_mem_get_partition_info() Dan Williams
@ 2021-09-09 16:20   ` Ben Widawsky
  2021-09-09 18:06     ` Dan Williams
  2021-09-13 22:19   ` [PATCH v5 " Dan Williams
  2021-09-13 22:24   ` [PATCH v6 " Dan Williams
  2 siblings, 1 reply; 78+ messages in thread
From: Ben Widawsky @ 2021-09-09 16:20 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Ira Weiny, vishal.l.verma, nvdimm, alison.schofield,
	Jonathan.Cameron

On 21-09-08 22:12:15, Dan Williams wrote:
> Commit 0b9159d0ff21 ("cxl/pci: Store memory capacity values") missed
> updating the kernel-doc for 'struct cxl_mem' leading to the following
> warnings:
> 
> ./scripts/kernel-doc -v drivers/cxl/cxlmem.h 2>&1 | grep warn
> drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'total_bytes' not described in 'cxl_mem'
> drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'volatile_only_bytes' not described in 'cxl_mem'
> drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'persistent_only_bytes' not described in 'cxl_mem'
> drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'partition_align_bytes' not described in 'cxl_mem'
> drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'active_volatile_bytes' not described in 'cxl_mem'
> drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'active_persistent_bytes' not described in 'cxl_mem'
> drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'next_volatile_bytes' not described in 'cxl_mem'
> drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'next_persistent_bytes' not described in 'cxl_mem'
> 
> Also, it is redundant to describe those same parameters in the
> kernel-doc for cxl_mem_get_partition_info(). Given that the only user of
> that routine updates the values in @cxlm, just do that update internally
> in the helper.
> 
> Cc: Ira Weiny <ira.weiny@intel.com>
> Reported-by: Ben Widawsky <ben.widawsky@intel.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/cxlmem.h |   15 +++++++++++++--
>  drivers/cxl/pci.c    |   35 +++++++++++------------------------
>  2 files changed, 24 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index d5334df83fb2..c6fce966084a 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -78,8 +78,19 @@ devm_cxl_add_memdev(struct cxl_mem *cxlm,
>   * @mbox_mutex: Mutex to synchronize mailbox access.
>   * @firmware_version: Firmware version for the memory device.
>   * @enabled_cmds: Hardware commands found enabled in CEL.
> - * @pmem_range: Persistent memory capacity information.
> - * @ram_range: Volatile memory capacity information.
> + * @pmem_range: Active persistent memory capacity configuration
> + * @ram_range: Active volatile memory capacity configuration
> + * @total_bytes: sum of all possible capacities
> + * @volatile_only_bytes: hard volatile capacity
> + * @persistent_only_bytes: hard persistent capacity
> + * @partition_align_bytes: alignment granularity for the configurable (soft) capacity
> + * @active_volatile_bytes: sum of hard + soft volatile
> + * @active_persistent_bytes: sum of hard + soft persistent

Looking at this now, it probably makes sense to create some helper macros or
inline functions to calculate these as needed, rather than storing them in
the structure.
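
Something like this, perhaps (a rough, untested sketch; 'volatile_part_bytes'
is a hypothetical field name for wherever the soft volatile share ends up
being tracked):

static inline u64 cxl_active_volatile_bytes(struct cxl_mem *cxlm)
{
	/* hard (volatile-only) capacity plus the soft partitioned share */
	return cxlm->volatile_only_bytes + cxlm->volatile_part_bytes;
}

with an equivalent helper for the persistent side.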

> + * @next_volatile_bytes: volatile capacity change pending device reset
> + * @next_persistent_bytes: persistent capacity change pending device reset
> + *
> + * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
> + * details on capacity parameters.
>   */
>  struct cxl_mem {
>  	struct device *dev;
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index c1e1d12e24b6..8077d907e7d3 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -1262,11 +1262,7 @@ static struct cxl_mbox_get_supported_logs *cxl_get_gsl(struct cxl_mem *cxlm)
>  
>  /**
>   * cxl_mem_get_partition_info - Get partition info
> - * @cxlm: The device to act on
> - * @active_volatile_bytes: returned active volatile capacity
> - * @active_persistent_bytes: returned active persistent capacity
> - * @next_volatile_bytes: return next volatile capacity
> - * @next_persistent_bytes: return next persistent capacity
> + * @cxlm: cxl_mem instance to update partition info
>   *
>   * Retrieve the current partition info for the device specified.  If not 0, the
>   * 'next' values are pending and take effect on next cold reset.
> @@ -1275,11 +1271,7 @@ static struct cxl_mbox_get_supported_logs *cxl_get_gsl(struct cxl_mem *cxlm)
>   *
>   * See CXL @8.2.9.5.2.1 Get Partition Info
>   */
> -static int cxl_mem_get_partition_info(struct cxl_mem *cxlm,
> -				      u64 *active_volatile_bytes,
> -				      u64 *active_persistent_bytes,
> -				      u64 *next_volatile_bytes,
> -				      u64 *next_persistent_bytes)
> +static int cxl_mem_get_partition_info(struct cxl_mem *cxlm)
>  {
>  	struct cxl_mbox_get_partition_info {
>  		__le64 active_volatile_cap;
> @@ -1294,15 +1286,14 @@ static int cxl_mem_get_partition_info(struct cxl_mem *cxlm,
>  	if (rc)
>  		return rc;
>  
> -	*active_volatile_bytes = le64_to_cpu(pi.active_volatile_cap);
> -	*active_persistent_bytes = le64_to_cpu(pi.active_persistent_cap);
> -	*next_volatile_bytes = le64_to_cpu(pi.next_volatile_cap);
> -	*next_persistent_bytes = le64_to_cpu(pi.next_volatile_cap);
> -
> -	*active_volatile_bytes *= CXL_CAPACITY_MULTIPLIER;
> -	*active_persistent_bytes *= CXL_CAPACITY_MULTIPLIER;
> -	*next_volatile_bytes *= CXL_CAPACITY_MULTIPLIER;
> -	*next_persistent_bytes *= CXL_CAPACITY_MULTIPLIER;
> +	cxlm->active_volatile_bytes =
> +		le64_to_cpu(pi.active_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
> +	cxlm->active_persistent_bytes =
> +		le64_to_cpu(pi.active_persistent_cap) * CXL_CAPACITY_MULTIPLIER;
> +	cxlm->next_volatile_bytes =
> +		le64_to_cpu(pi.next_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
> +	cxlm->next_persistent_bytes =
> +		le64_to_cpu(pi.next_persistent_cap) * CXL_CAPACITY_MULTIPLIER;

Personally, I prefer the more functional style implementation. I guess if you
wanted to make the change, my preference would be to kill
cxl_mem_get_partition_info() entirely. Up to you though...
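
If you did want to go that route, the mailbox call could just be open coded
in cxl_mem_create_range_info(), roughly (untested sketch):

	struct cxl_mbox_get_partition_info {
		__le64 active_volatile_cap;
		__le64 active_persistent_cap;
		__le64 next_volatile_cap;
		__le64 next_persistent_cap;
	} __packed pi;
	int rc;

	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_GET_PARTITION_INFO,
				   NULL, 0, &pi, sizeof(pi));
	if (rc)
		return rc;

	cxlm->active_volatile_bytes =
		le64_to_cpu(pi.active_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
	/* ...and likewise for the other three fields... */

i.e. the struct and the conversions move into the one caller and the helper
goes away.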

>  
>  	return 0;
>  }
> @@ -1443,11 +1434,7 @@ static int cxl_mem_create_range_info(struct cxl_mem *cxlm)
>  		return 0;
>  	}
>  
> -	rc = cxl_mem_get_partition_info(cxlm,
> -					&cxlm->active_volatile_bytes,
> -					&cxlm->active_persistent_bytes,
> -					&cxlm->next_volatile_bytes,
> -					&cxlm->next_persistent_bytes);
> +	rc = cxl_mem_get_partition_info(cxlm);
>  	if (rc < 0) {
>  		dev_err(cxlm->dev, "Failed to query partition information\n");
>  		return rc;
> 

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v4 09/21] cxl/mbox: Introduce the mbox_send operation
  2021-09-09  5:12 ` [PATCH v4 09/21] cxl/mbox: Introduce the mbox_send operation Dan Williams
@ 2021-09-09 16:34   ` Ben Widawsky
  2021-09-10  8:58   ` Jonathan Cameron
  1 sibling, 0 replies; 78+ messages in thread
From: Ben Widawsky @ 2021-09-09 16:34 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, vishal.l.verma, nvdimm, alison.schofield, ira.weiny,
	Jonathan.Cameron

On 21-09-08 22:12:21, Dan Williams wrote:
> In preparation for implementing a unit test backend transport for ioctl
> operations, and making the mailbox available to the cxl/pmem
> infrastructure, move the existing PCI specific portion of mailbox handling
> to an "mbox_send" operation.
> 
> With this split all the PCI-specific transport details are comprehended
> by a single operation and the rest of the mailbox infrastructure is
> 'struct cxl_mem' and 'struct cxl_memdev' generic.
> 
> Acked-by: Ben Widawsky <ben.widawsky@intel.com>

Upgrade to:
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>

> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/cxlmem.h |   42 ++++++++++++++++++++++++++++
>  drivers/cxl/pci.c    |   76 ++++++++++++++------------------------------------
>  2 files changed, 63 insertions(+), 55 deletions(-)
> 
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index c6fce966084a..9be5e26c5b48 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -66,6 +66,45 @@ struct cxl_memdev *
>  devm_cxl_add_memdev(struct cxl_mem *cxlm,
>  		    const struct cdevm_file_operations *cdevm_fops);
>  
> +/**
> + * struct cxl_mbox_cmd - A command to be submitted to hardware.
> + * @opcode: (input) The command set and command submitted to hardware.
> + * @payload_in: (input) Pointer to the input payload.
> + * @payload_out: (output) Pointer to the output payload. Must be allocated by
> + *		 the caller.
> + * @size_in: (input) Number of bytes to load from @payload_in.
> + * @size_out: (input) Max number of bytes loaded into @payload_out.
> + *            (output) Number of bytes generated by the device. For fixed size
> + *            (output) Number of bytes generated by the device. For fixed size
> + *            output commands this is always expected to be deterministic. For
> + *            written.
> + * @return_code: (output) Error code returned from hardware.
> + *
> + * This is the primary mechanism used to send commands to the hardware.
> + * All the fields except @payload_* correspond exactly to the fields described in
> + * the Command Register section of CXL 2.0 8.2.8.4.5. @payload_in and
> + * @payload_out are written to and read from the Command Payload Registers
> + * defined in CXL 2.0 8.2.8.4.8.
> + */
> +struct cxl_mbox_cmd {
> +	u16 opcode;
> +	void *payload_in;
> +	void *payload_out;
> +	size_t size_in;
> +	size_t size_out;
> +	u16 return_code;
> +#define CXL_MBOX_SUCCESS 0
> +};
> +
> +/*
> + * CXL 2.0 - Memory capacity multiplier
> + * See Section 8.2.9.5
> + *
> + * Volatile, Persistent, and Partition capacities are specified to be in
> + * multiples of 256MB - define a multiplier to convert to/from bytes.
> + */
> +#define CXL_CAPACITY_MULTIPLIER SZ_256M
> +
>  /**
>   * struct cxl_mem - A CXL memory device
>   * @dev: The device associated with this CXL device.
> @@ -88,6 +127,7 @@ devm_cxl_add_memdev(struct cxl_mem *cxlm,
>   * @active_persistent_bytes: sum of hard + soft persistent
>   * @next_volatile_bytes: volatile capacity change pending device reset
>   * @next_persistent_bytes: persistent capacity change pending device reset
> + * @mbox_send: @dev specific transport for transmitting mailbox commands
>   *
>   * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
>   * details on capacity parameters.
> @@ -115,5 +155,7 @@ struct cxl_mem {
>  	u64 active_persistent_bytes;
>  	u64 next_volatile_bytes;
>  	u64 next_persistent_bytes;
> +
> +	int (*mbox_send)(struct cxl_mem *cxlm, struct cxl_mbox_cmd *cmd);
>  };
>  #endif /* __CXL_MEM_H__ */
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 8077d907e7d3..e2f27671c6b2 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -64,45 +64,6 @@ enum opcode {
>  	CXL_MBOX_OP_MAX			= 0x10000
>  };
>  
> -/*
> - * CXL 2.0 - Memory capacity multiplier
> - * See Section 8.2.9.5
> - *
> - * Volatile, Persistent, and Partition capacities are specified to be in
> - * multiples of 256MB - define a multiplier to convert to/from bytes.
> - */
> -#define CXL_CAPACITY_MULTIPLIER SZ_256M
> -
> -/**
> - * struct mbox_cmd - A command to be submitted to hardware.
> - * @opcode: (input) The command set and command submitted to hardware.
> - * @payload_in: (input) Pointer to the input payload.
> - * @payload_out: (output) Pointer to the output payload. Must be allocated by
> - *		 the caller.
> - * @size_in: (input) Number of bytes to load from @payload_in.
> - * @size_out: (input) Max number of bytes loaded into @payload_out.
> - *            (output) Number of bytes generated by the device. For fixed size
> - *            outputs commands this is always expected to be deterministic. For
> - *            variable sized output commands, it tells the exact number of bytes
> - *            written.
> - * @return_code: (output) Error code returned from hardware.
> - *
> - * This is the primary mechanism used to send commands to the hardware.
> - * All the fields except @payload_* correspond exactly to the fields described in
> - * Command Register section of the CXL 2.0 8.2.8.4.5. @payload_in and
> - * @payload_out are written to, and read from the Command Payload Registers
> - * defined in CXL 2.0 8.2.8.4.8.
> - */
> -struct mbox_cmd {
> -	u16 opcode;
> -	void *payload_in;
> -	void *payload_out;
> -	size_t size_in;
> -	size_t size_out;
> -	u16 return_code;
> -#define CXL_MBOX_SUCCESS 0
> -};
> -
>  static DECLARE_RWSEM(cxl_memdev_rwsem);
>  static struct dentry *cxl_debugfs;
>  static bool cxl_raw_allow_all;
> @@ -266,7 +227,7 @@ static bool cxl_is_security_command(u16 opcode)
>  }
>  
>  static void cxl_mem_mbox_timeout(struct cxl_mem *cxlm,
> -				 struct mbox_cmd *mbox_cmd)
> +				 struct cxl_mbox_cmd *mbox_cmd)
>  {
>  	struct device *dev = cxlm->dev;
>  
> @@ -297,7 +258,7 @@ static void cxl_mem_mbox_timeout(struct cxl_mem *cxlm,
>   * mailbox.
>   */
>  static int __cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm,
> -				   struct mbox_cmd *mbox_cmd)
> +				   struct cxl_mbox_cmd *mbox_cmd)
>  {
>  	void __iomem *payload = cxlm->regs.mbox + CXLDEV_MBOX_PAYLOAD_OFFSET;
>  	struct device *dev = cxlm->dev;
> @@ -472,6 +433,20 @@ static void cxl_mem_mbox_put(struct cxl_mem *cxlm)
>  	mutex_unlock(&cxlm->mbox_mutex);
>  }
>  
> +static int cxl_pci_mbox_send(struct cxl_mem *cxlm, struct cxl_mbox_cmd *cmd)
> +{
> +	int rc;
> +
> +	rc = cxl_mem_mbox_get(cxlm);
> +	if (rc)
> +		return rc;
> +
> +	rc = __cxl_mem_mbox_send_cmd(cxlm, cmd);
> +	cxl_mem_mbox_put(cxlm);
> +
> +	return rc;
> +}
> +
>  /**
>   * handle_mailbox_cmd_from_user() - Dispatch a mailbox command for userspace.
>   * @cxlm: The CXL memory device to communicate with.
> @@ -503,7 +478,7 @@ static int handle_mailbox_cmd_from_user(struct cxl_mem *cxlm,
>  					s32 *size_out, u32 *retval)
>  {
>  	struct device *dev = cxlm->dev;
> -	struct mbox_cmd mbox_cmd = {
> +	struct cxl_mbox_cmd mbox_cmd = {
>  		.opcode = cmd->opcode,
>  		.size_in = cmd->info.size_in,
>  		.size_out = cmd->info.size_out,
> @@ -525,10 +500,6 @@ static int handle_mailbox_cmd_from_user(struct cxl_mem *cxlm,
>  		}
>  	}
>  
> -	rc = cxl_mem_mbox_get(cxlm);
> -	if (rc)
> -		goto out;
> -
>  	dev_dbg(dev,
>  		"Submitting %s command for user\n"
>  		"\topcode: %x\n"
> @@ -539,8 +510,7 @@ static int handle_mailbox_cmd_from_user(struct cxl_mem *cxlm,
>  	dev_WARN_ONCE(dev, cmd->info.id == CXL_MEM_COMMAND_ID_RAW,
>  		      "raw command path used\n");
>  
> -	rc = __cxl_mem_mbox_send_cmd(cxlm, &mbox_cmd);
> -	cxl_mem_mbox_put(cxlm);
> +	rc = cxlm->mbox_send(cxlm, &mbox_cmd);
>  	if (rc)
>  		goto out;
>  
> @@ -874,7 +844,7 @@ static int cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm, u16 opcode,
>  				 void *out, size_t out_size)
>  {
>  	const struct cxl_mem_command *cmd = cxl_mem_find_command(opcode);
> -	struct mbox_cmd mbox_cmd = {
> +	struct cxl_mbox_cmd mbox_cmd = {
>  		.opcode = opcode,
>  		.payload_in = in,
>  		.size_in = in_size,
> @@ -886,12 +856,7 @@ static int cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm, u16 opcode,
>  	if (out_size > cxlm->payload_size)
>  		return -E2BIG;
>  
> -	rc = cxl_mem_mbox_get(cxlm);
> -	if (rc)
> -		return rc;
> -
> -	rc = __cxl_mem_mbox_send_cmd(cxlm, &mbox_cmd);
> -	cxl_mem_mbox_put(cxlm);
> +	rc = cxlm->mbox_send(cxlm, &mbox_cmd);
>  	if (rc)
>  		return rc;
>  
> @@ -913,6 +878,7 @@ static int cxl_mem_setup_mailbox(struct cxl_mem *cxlm)
>  {
>  	const int cap = readl(cxlm->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
>  
> +	cxlm->mbox_send = cxl_pci_mbox_send;
>  	cxlm->payload_size =
>  		1 << FIELD_GET(CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK, cap);
>  
> 

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v4 10/21] cxl/pci: Drop idr.h
  2021-09-09  5:12 ` [PATCH v4 10/21] cxl/pci: Drop idr.h Dan Williams
@ 2021-09-09 16:34   ` Ben Widawsky
  2021-09-10  8:46     ` Jonathan Cameron
  0 siblings, 1 reply; 78+ messages in thread
From: Ben Widawsky @ 2021-09-09 16:34 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Jonathan Cameron, vishal.l.verma, nvdimm,
	alison.schofield, ira.weiny

On 21-09-08 22:12:26, Dan Williams wrote:
> Commit 3d135db51024 ("cxl/core: Move memdev management to core") left
> this straggling include for cxl_memdev setup. Clean it up.
> 
> Cc: Ben Widawsky <ben.widawsky@intel.com>
> Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>

> ---
>  drivers/cxl/pci.c |    1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index e2f27671c6b2..9d8050fdd69c 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -8,7 +8,6 @@
>  #include <linux/mutex.h>
>  #include <linux/list.h>
>  #include <linux/cdev.h>
> -#include <linux/idr.h>
>  #include <linux/pci.h>
>  #include <linux/io.h>
>  #include <linux/io-64-nonatomic-lo-hi.h>
> 

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v4 11/21] cxl/mbox: Move mailbox and other non-PCI specific infrastructure to the core
  2021-09-09  5:12 ` [PATCH v4 11/21] cxl/mbox: Move mailbox and other non-PCI specific infrastructure to the core Dan Williams
@ 2021-09-09 16:41   ` Ben Widawsky
  2021-09-09 18:50     ` Dan Williams
  2021-09-10  9:13   ` Jonathan Cameron
  1 sibling, 1 reply; 78+ messages in thread
From: Ben Widawsky @ 2021-09-09 16:41 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, kernel test robot, vishal.l.verma, nvdimm,
	alison.schofield, ira.weiny, Jonathan.Cameron

On 21-09-08 22:12:32, Dan Williams wrote:
> Now that the internals of mailbox operations are abstracted from the PCI
> specifics a bulk of infrastructure can move to the core.
> 
> The CXL_PMEM driver intends to proxy LIBNVDIMM UAPI and driver requests
> to the equivalent functionality provided by the CXL hardware mailbox
> interface. In support of that intent move the mailbox implementation to
> a shared location for the CXL_PCI driver native IOCTL path and CXL_PMEM
> nvdimm command proxy path to share.
> 
> A unit test framework seeks to implement a unit test backend transport
> for mailbox commands to communicate mocked up payloads. It can reuse all
> of the mailbox infrastructure minus the PCI specifics, so that also gets
> moved to the core.
> 
> Finally, with the mailbox infrastructure and ioctl handling being
> transport generic, there is no longer any need to pass
> file_operations to devm_cxl_add_memdev(). That allows all the ioctl
> boilerplate to move into the core for unit test reuse.
> 
> No functional change intended, just code movement.

At some point, I think some of the comments and kernel docs need updating since
the target is no longer exclusively memory devices. Perhaps you do this in later
patches....

> 
> Acked-by: Ben Widawsky <ben.widawsky@intel.com>
> Reported-by: kernel test robot <lkp@intel.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  Documentation/driver-api/cxl/memory-devices.rst |    3 
>  drivers/cxl/core/Makefile                       |    1 
>  drivers/cxl/core/bus.c                          |    4 
>  drivers/cxl/core/core.h                         |    8 
>  drivers/cxl/core/mbox.c                         |  825 +++++++++++++++++++++
>  drivers/cxl/core/memdev.c                       |   81 ++
>  drivers/cxl/cxlmem.h                            |   78 +-
>  drivers/cxl/pci.c                               |  924 -----------------------
>  8 files changed, 975 insertions(+), 949 deletions(-)
>  create mode 100644 drivers/cxl/core/mbox.c
> 
> diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst
> index 50ebcda17ad0..4624951b5c38 100644
> --- a/Documentation/driver-api/cxl/memory-devices.rst
> +++ b/Documentation/driver-api/cxl/memory-devices.rst
> @@ -45,6 +45,9 @@ CXL Core
>  .. kernel-doc:: drivers/cxl/core/regs.c
>     :doc: cxl registers
>  
> +.. kernel-doc:: drivers/cxl/core/mbox.c
> +   :doc: cxl mbox
> +
>  External Interfaces
>  ===================
>  
> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> index 0fdbf3c6ac1a..07eb8e1fb8a6 100644
> --- a/drivers/cxl/core/Makefile
> +++ b/drivers/cxl/core/Makefile
> @@ -6,3 +6,4 @@ cxl_core-y := bus.o
>  cxl_core-y += pmem.o
>  cxl_core-y += regs.o
>  cxl_core-y += memdev.o
> +cxl_core-y += mbox.o
> diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> index 37b87adaa33f..8073354ba232 100644
> --- a/drivers/cxl/core/bus.c
> +++ b/drivers/cxl/core/bus.c
> @@ -636,6 +636,8 @@ static __init int cxl_core_init(void)
>  {
>  	int rc;
>  
> +	cxl_mbox_init();
> +
>  	rc = cxl_memdev_init();
>  	if (rc)
>  		return rc;
> @@ -647,6 +649,7 @@ static __init int cxl_core_init(void)
>  
>  err:
>  	cxl_memdev_exit();
> +	cxl_mbox_exit();
>  	return rc;
>  }
>  
> @@ -654,6 +657,7 @@ static void cxl_core_exit(void)
>  {
>  	bus_unregister(&cxl_bus_type);
>  	cxl_memdev_exit();
> +	cxl_mbox_exit();
>  }
>  
>  module_init(cxl_core_init);
> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> index 036a3c8106b4..c85b7fbad02d 100644
> --- a/drivers/cxl/core/core.h
> +++ b/drivers/cxl/core/core.h
> @@ -14,7 +14,15 @@ static inline void unregister_cxl_dev(void *dev)
>  	device_unregister(dev);
>  }
>  
> +struct cxl_send_command;
> +struct cxl_mem_query_commands;
> +int cxl_query_cmd(struct cxl_memdev *cxlmd,
> +		  struct cxl_mem_query_commands __user *q);
> +int cxl_send_cmd(struct cxl_memdev *cxlmd, struct cxl_send_command __user *s);
> +
>  int cxl_memdev_init(void);
>  void cxl_memdev_exit(void);
> +void cxl_mbox_init(void);
> +void cxl_mbox_exit(void);

cxl_mbox_fini()?

>  
>  #endif /* __CXL_CORE_H__ */
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> new file mode 100644
> index 000000000000..31e183991726
> --- /dev/null
> +++ b/drivers/cxl/core/mbox.c
> @@ -0,0 +1,825 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2020 Intel Corporation. All rights reserved. */
> +#include <linux/io-64-nonatomic-lo-hi.h>
> +#include <linux/security.h>
> +#include <linux/debugfs.h>
> +#include <linux/mutex.h>
> +#include <cxlmem.h>
> +#include <cxl.h>
> +
> +#include "core.h"
> +
> +static bool cxl_raw_allow_all;
> +
> +/**
> + * DOC: cxl mbox
> + *
> + * Core implementation of the CXL 2.0 Type-3 Memory Device Mailbox. The
> + * implementation is used by the cxl_pci driver to initialize the device
> + * and implement the cxl_mem.h IOCTL UAPI. It also implements the
> + * backend of the cxl_pmem_ctl() transport for LIBNVDIMM.
> + */
> +
> +#define cxl_for_each_cmd(cmd)                                                  \
> +	for ((cmd) = &cxl_mem_commands[0];                                     \
> +	     ((cmd) - cxl_mem_commands) < ARRAY_SIZE(cxl_mem_commands); (cmd)++)
> +
> +#define CXL_CMD(_id, sin, sout, _flags)                                        \
> +	[CXL_MEM_COMMAND_ID_##_id] = {                                         \
> +	.info =	{                                                              \
> +			.id = CXL_MEM_COMMAND_ID_##_id,                        \
> +			.size_in = sin,                                        \
> +			.size_out = sout,                                      \
> +		},                                                             \
> +	.opcode = CXL_MBOX_OP_##_id,                                           \
> +	.flags = _flags,                                                       \
> +	}
> +
> +/*
> + * This table defines the supported mailbox commands for the driver. This table
> + * is made up of a UAPI structure. Non-negative values as parameters in the
> + * table will be validated against the user's input. For example, if size_in is
> + * 0, and the user passed in 1, it is an error.
> + */
> +static struct cxl_mem_command cxl_mem_commands[CXL_MEM_COMMAND_ID_MAX] = {
> +	CXL_CMD(IDENTIFY, 0, 0x43, CXL_CMD_FLAG_FORCE_ENABLE),
> +#ifdef CONFIG_CXL_MEM_RAW_COMMANDS
> +	CXL_CMD(RAW, ~0, ~0, 0),
> +#endif
> +	CXL_CMD(GET_SUPPORTED_LOGS, 0, ~0, CXL_CMD_FLAG_FORCE_ENABLE),
> +	CXL_CMD(GET_FW_INFO, 0, 0x50, 0),
> +	CXL_CMD(GET_PARTITION_INFO, 0, 0x20, 0),
> +	CXL_CMD(GET_LSA, 0x8, ~0, 0),
> +	CXL_CMD(GET_HEALTH_INFO, 0, 0x12, 0),
> +	CXL_CMD(GET_LOG, 0x18, ~0, CXL_CMD_FLAG_FORCE_ENABLE),
> +	CXL_CMD(SET_PARTITION_INFO, 0x0a, 0, 0),
> +	CXL_CMD(SET_LSA, ~0, 0, 0),
> +	CXL_CMD(GET_ALERT_CONFIG, 0, 0x10, 0),
> +	CXL_CMD(SET_ALERT_CONFIG, 0xc, 0, 0),
> +	CXL_CMD(GET_SHUTDOWN_STATE, 0, 0x1, 0),
> +	CXL_CMD(SET_SHUTDOWN_STATE, 0x1, 0, 0),
> +	CXL_CMD(GET_POISON, 0x10, ~0, 0),
> +	CXL_CMD(INJECT_POISON, 0x8, 0, 0),
> +	CXL_CMD(CLEAR_POISON, 0x48, 0, 0),
> +	CXL_CMD(GET_SCAN_MEDIA_CAPS, 0x10, 0x4, 0),
> +	CXL_CMD(SCAN_MEDIA, 0x11, 0, 0),
> +	CXL_CMD(GET_SCAN_MEDIA, 0, ~0, 0),
> +};
> +
> +/*
> + * Commands that RAW doesn't permit. The rationale for each:
> + *
> + * CXL_MBOX_OP_ACTIVATE_FW: Firmware activation requires adjustment /
> + * coordination of transaction timeout values at the root bridge level.
> + *
> + * CXL_MBOX_OP_SET_PARTITION_INFO: The device memory map may change live
> + * and needs to be coordinated with HDM updates.
> + *
> + * CXL_MBOX_OP_SET_LSA: The label storage area may be cached by the
> + * driver and any writes from userspace invalidates those contents.
> + *
> + * CXL_MBOX_OP_SET_SHUTDOWN_STATE: Set shutdown state assumes no writes
> + * to the device after it is marked clean, userspace can not make that
> + * assertion.
> + *
> + * CXL_MBOX_OP_[GET_]SCAN_MEDIA: The kernel provides a native error list that
> + * is kept up to date with patrol notifications and error management.
> + */
> +static u16 cxl_disabled_raw_commands[] = {
> +	CXL_MBOX_OP_ACTIVATE_FW,
> +	CXL_MBOX_OP_SET_PARTITION_INFO,
> +	CXL_MBOX_OP_SET_LSA,
> +	CXL_MBOX_OP_SET_SHUTDOWN_STATE,
> +	CXL_MBOX_OP_SCAN_MEDIA,
> +	CXL_MBOX_OP_GET_SCAN_MEDIA,
> +};
> +
> +/*
> + * Command sets that RAW doesn't permit. All opcodes in this set are
> + * disabled because they pass plain text security payloads over the
> + * user/kernel boundary. This functionality is intended to be wrapped
> + * behind the keys ABI which allows for encrypted payloads in the UAPI
> + */
> +static u8 security_command_sets[] = {
> +	0x44, /* Sanitize */
> +	0x45, /* Persistent Memory Data-at-rest Security */
> +	0x46, /* Security Passthrough */
> +};
> +
> +static bool cxl_is_security_command(u16 opcode)
> +{
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(security_command_sets); i++)
> +		if (security_command_sets[i] == (opcode >> 8))
> +			return true;
> +	return false;
> +}
> +
> +static struct cxl_mem_command *cxl_mem_find_command(u16 opcode)
> +{
> +	struct cxl_mem_command *c;
> +
> +	cxl_for_each_cmd(c)
> +		if (c->opcode == opcode)
> +			return c;
> +
> +	return NULL;
> +}
> +
> +/**
> + * cxl_mem_mbox_send_cmd() - Send a mailbox command to a memory device.
> + * @cxlm: The CXL memory device to communicate with.
> + * @opcode: Opcode for the mailbox command.
> + * @in: The input payload for the mailbox command.
> + * @in_size: The length of the input payload
> + * @out: Caller allocated buffer for the output.
> + * @out_size: Expected size of output.
> + *
> + * Context: Any context. Will acquire and release mbox_mutex.
> + * Return:
> + *  * %>=0	- Number of bytes returned in @out.
> + *  * %-E2BIG	- Payload is too large for hardware.
> + *  * %-EBUSY	- Couldn't acquire exclusive mailbox access.
> + *  * %-EFAULT	- Hardware error occurred.
> + *  * %-ENXIO	- Command completed, but device reported an error.
> + *  * %-EIO	- Unexpected output size.
> + *
> + * Mailbox commands may execute successfully yet the device itself may report an
> + * error. While this distinction can be useful for commands from userspace, the
> + * kernel will only be able to use results when both are successful.
> + *
> + * See __cxl_mem_mbox_send_cmd()
> + */
> +int cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm, u16 opcode, void *in,
> +			  size_t in_size, void *out, size_t out_size)
> +{
> +	const struct cxl_mem_command *cmd = cxl_mem_find_command(opcode);
> +	struct cxl_mbox_cmd mbox_cmd = {
> +		.opcode = opcode,
> +		.payload_in = in,
> +		.size_in = in_size,
> +		.size_out = out_size,
> +		.payload_out = out,
> +	};
> +	int rc;
> +
> +	if (out_size > cxlm->payload_size)
> +		return -E2BIG;
> +
> +	rc = cxlm->mbox_send(cxlm, &mbox_cmd);
> +	if (rc)
> +		return rc;
> +
> +	/* TODO: Map return code to proper kernel style errno */
> +	if (mbox_cmd.return_code != CXL_MBOX_SUCCESS)
> +		return -ENXIO;
> +
> +	/*
> +	 * Variable sized commands can't be validated and so it's up to the
> +	 * caller to do that if they wish.
> +	 */
> +	if (cmd->info.size_out >= 0 && mbox_cmd.size_out != out_size)
> +		return -EIO;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(cxl_mem_mbox_send_cmd);
> +
> +static bool cxl_mem_raw_command_allowed(u16 opcode)
> +{
> +	int i;
> +
> +	if (!IS_ENABLED(CONFIG_CXL_MEM_RAW_COMMANDS))
> +		return false;
> +
> +	if (security_locked_down(LOCKDOWN_PCI_ACCESS))
> +		return false;
> +
> +	if (cxl_raw_allow_all)
> +		return true;
> +
> +	if (cxl_is_security_command(opcode))
> +		return false;
> +
> +	for (i = 0; i < ARRAY_SIZE(cxl_disabled_raw_commands); i++)
> +		if (cxl_disabled_raw_commands[i] == opcode)
> +			return false;
> +
> +	return true;
> +}
> +
> +/**
> + * cxl_validate_cmd_from_user() - Check fields for CXL_MEM_SEND_COMMAND.
> + * @cxlm: &struct cxl_mem device whose mailbox will be used.
> + * @send_cmd: &struct cxl_send_command copied in from userspace.
> + * @out_cmd: Sanitized and populated &struct cxl_mem_command.
> + *
> + * Return:
> + *  * %0	- @out_cmd is ready to send.
> + *  * %-ENOTTY	- Invalid command specified.
> + *  * %-EINVAL	- Reserved fields or invalid values were used.
> + *  * %-ENOMEM	- Input or output buffer wasn't sized properly.
> + *  * %-EPERM	- Attempted to use a protected command.
> + *
> + * The result of this command is a fully validated command in @out_cmd that is
> + * safe to send to the hardware.
> + *
> + * See handle_mailbox_cmd_from_user()
> + */
> +static int cxl_validate_cmd_from_user(struct cxl_mem *cxlm,
> +				      const struct cxl_send_command *send_cmd,
> +				      struct cxl_mem_command *out_cmd)
> +{
> +	const struct cxl_command_info *info;
> +	struct cxl_mem_command *c;
> +
> +	if (send_cmd->id == 0 || send_cmd->id >= CXL_MEM_COMMAND_ID_MAX)
> +		return -ENOTTY;
> +
> +	/*
> +	 * The user can never specify an input payload larger than what hardware
> +	 * supports, but output can be arbitrarily large (simply write out as
> +	 * much data as the hardware provides).
> +	 */
> +	if (send_cmd->in.size > cxlm->payload_size)
> +		return -EINVAL;
> +
> +	/*
> +	 * Checks are bypassed for raw commands but a WARN/taint will occur
> +	 * later in the callchain
> +	 */
> +	if (send_cmd->id == CXL_MEM_COMMAND_ID_RAW) {
> +		const struct cxl_mem_command temp = {
> +			.info = {
> +				.id = CXL_MEM_COMMAND_ID_RAW,
> +				.flags = 0,
> +				.size_in = send_cmd->in.size,
> +				.size_out = send_cmd->out.size,
> +			},
> +			.opcode = send_cmd->raw.opcode
> +		};
> +
> +		if (send_cmd->raw.rsvd)
> +			return -EINVAL;
> +
> +		/*
> +		 * Unlike supported commands, the output size of RAW commands
> +		 * gets passed along without further checking, so it must be
> +		 * validated here.
> +		 */
> +		if (send_cmd->out.size > cxlm->payload_size)
> +			return -EINVAL;
> +
> +		if (!cxl_mem_raw_command_allowed(send_cmd->raw.opcode))
> +			return -EPERM;
> +
> +		memcpy(out_cmd, &temp, sizeof(temp));
> +
> +		return 0;
> +	}
> +
> +	if (send_cmd->flags & ~CXL_MEM_COMMAND_FLAG_MASK)
> +		return -EINVAL;
> +
> +	if (send_cmd->rsvd)
> +		return -EINVAL;
> +
> +	if (send_cmd->in.rsvd || send_cmd->out.rsvd)
> +		return -EINVAL;
> +
> +	/* Convert user's command into the internal representation */
> +	c = &cxl_mem_commands[send_cmd->id];
> +	info = &c->info;
> +
> +	/* Check that the command is enabled for hardware */
> +	if (!test_bit(info->id, cxlm->enabled_cmds))
> +		return -ENOTTY;
> +
> +	/* Check the input buffer is the expected size */
> +	if (info->size_in >= 0 && info->size_in != send_cmd->in.size)
> +		return -ENOMEM;
> +
> +	/* Check the output buffer is at least large enough */
> +	if (info->size_out >= 0 && send_cmd->out.size < info->size_out)
> +		return -ENOMEM;
> +
> +	memcpy(out_cmd, c, sizeof(*c));
> +	out_cmd->info.size_in = send_cmd->in.size;
> +	/*
> +	 * XXX: out_cmd->info.size_out will be controlled by the driver, and the
> +	 * specified number of bytes @send_cmd->out.size will be copied back out
> +	 * to userspace.
> +	 */
> +
> +	return 0;
> +}
> +
> +#define cxl_cmd_count ARRAY_SIZE(cxl_mem_commands)
> +
> +int cxl_query_cmd(struct cxl_memdev *cxlmd,
> +		  struct cxl_mem_query_commands __user *q)
> +{
> +	struct device *dev = &cxlmd->dev;
> +	struct cxl_mem_command *cmd;
> +	u32 n_commands;
> +	int j = 0;
> +
> +	dev_dbg(dev, "Query IOCTL\n");
> +
> +	if (get_user(n_commands, &q->n_commands))
> +		return -EFAULT;
> +
> +	/* returns the total number if 0 elements are requested. */
> +	if (n_commands == 0)
> +		return put_user(cxl_cmd_count, &q->n_commands);
> +
> +	/*
> +	 * otherwise, return min(n_commands, total commands) cxl_command_info
> +	 * structures.
> +	 */
> +	cxl_for_each_cmd(cmd) {
> +		const struct cxl_command_info *info = &cmd->info;
> +
> +		if (copy_to_user(&q->commands[j++], info, sizeof(*info)))
> +			return -EFAULT;
> +
> +		if (j == n_commands)
> +			break;
> +	}
> +
> +	return 0;
> +}
> +
> +/**
> + * handle_mailbox_cmd_from_user() - Dispatch a mailbox command for userspace.
> + * @cxlm: The CXL memory device to communicate with.
> + * @cmd: The validated command.
> + * @in_payload: Pointer to userspace's input payload.
> + * @out_payload: Pointer to userspace's output payload.
> + * @size_out: (Input) Max payload size to copy out.
> + *            (Output) Payload size hardware generated.
> + * @retval: Hardware generated return code from the operation.
> + *
> + * Return:
> + *  * %0	- Mailbox transaction succeeded. This implies the mailbox
> + *		  protocol completed successfully, not that the operation itself
> + *		  was successful.
> + *  * %-ENOMEM  - Couldn't allocate a bounce buffer.
> + *  * %-EFAULT	- Something happened with copy_to/from_user.
> + *  * %-EINTR	- Mailbox acquisition interrupted.
> + *  * %-EXXX	- Transaction level failures.
> + *
> + * Creates the appropriate mailbox command and dispatches it on behalf of a
> + * userspace request. The input and output payloads are copied between
> + * userspace.
> + *
> + * See cxl_send_cmd().
> + */
> +static int handle_mailbox_cmd_from_user(struct cxl_mem *cxlm,
> +					const struct cxl_mem_command *cmd,
> +					u64 in_payload, u64 out_payload,
> +					s32 *size_out, u32 *retval)
> +{
> +	struct device *dev = cxlm->dev;
> +	struct cxl_mbox_cmd mbox_cmd = {
> +		.opcode = cmd->opcode,
> +		.size_in = cmd->info.size_in,
> +		.size_out = cmd->info.size_out,
> +	};
> +	int rc;
> +
> +	if (cmd->info.size_out) {
> +		mbox_cmd.payload_out = kvzalloc(cmd->info.size_out, GFP_KERNEL);
> +		if (!mbox_cmd.payload_out)
> +			return -ENOMEM;
> +	}
> +
> +	if (cmd->info.size_in) {
> +		mbox_cmd.payload_in = vmemdup_user(u64_to_user_ptr(in_payload),
> +						   cmd->info.size_in);
> +		if (IS_ERR(mbox_cmd.payload_in)) {
> +			kvfree(mbox_cmd.payload_out);
> +			return PTR_ERR(mbox_cmd.payload_in);
> +		}
> +	}
> +
> +	dev_dbg(dev,
> +		"Submitting %s command for user\n"
> +		"\topcode: %x\n"
> +		"\tsize: %ub\n",
> +		cxl_command_names[cmd->info.id].name, mbox_cmd.opcode,
> +		cmd->info.size_in);
> +
> +	dev_WARN_ONCE(dev, cmd->info.id == CXL_MEM_COMMAND_ID_RAW,
> +		      "raw command path used\n");
> +
> +	rc = cxlm->mbox_send(cxlm, &mbox_cmd);
> +	if (rc)
> +		goto out;
> +
> +	/*
> +	 * @size_out contains the max size that's allowed to be written back out
> +	 * to userspace. While the device may have written more output than
> +	 * this, the excess will be ignored.
> +	 */
> +	if (mbox_cmd.size_out) {
> +		dev_WARN_ONCE(dev, mbox_cmd.size_out > *size_out,
> +			      "Invalid return size\n");
> +		if (copy_to_user(u64_to_user_ptr(out_payload),
> +				 mbox_cmd.payload_out, mbox_cmd.size_out)) {
> +			rc = -EFAULT;
> +			goto out;
> +		}
> +	}
> +
> +	*size_out = mbox_cmd.size_out;
> +	*retval = mbox_cmd.return_code;
> +
> +out:
> +	kvfree(mbox_cmd.payload_in);
> +	kvfree(mbox_cmd.payload_out);
> +	return rc;
> +}
> +
> +int cxl_send_cmd(struct cxl_memdev *cxlmd, struct cxl_send_command __user *s)
> +{
> +	struct cxl_mem *cxlm = cxlmd->cxlm;
> +	struct device *dev = &cxlmd->dev;
> +	struct cxl_send_command send;
> +	struct cxl_mem_command c;
> +	int rc;
> +
> +	dev_dbg(dev, "Send IOCTL\n");
> +
> +	if (copy_from_user(&send, s, sizeof(send)))
> +		return -EFAULT;
> +
> +	rc = cxl_validate_cmd_from_user(cxlmd->cxlm, &send, &c);
> +	if (rc)
> +		return rc;
> +
> +	/* Prepare to handle a full payload for variable sized output */
> +	if (c.info.size_out < 0)
> +		c.info.size_out = cxlm->payload_size;
> +
> +	rc = handle_mailbox_cmd_from_user(cxlm, &c, send.in.payload,
> +					  send.out.payload, &send.out.size,
> +					  &send.retval);
> +	if (rc)
> +		return rc;
> +
> +	if (copy_to_user(s, &send, sizeof(send)))
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
> +static int cxl_xfer_log(struct cxl_mem *cxlm, uuid_t *uuid, u32 size, u8 *out)
> +{
> +	u32 remaining = size;
> +	u32 offset = 0;
> +
> +	while (remaining) {
> +		u32 xfer_size = min_t(u32, remaining, cxlm->payload_size);
> +		struct cxl_mbox_get_log {
> +			uuid_t uuid;
> +			__le32 offset;
> +			__le32 length;
> +		} __packed log = {
> +			.uuid = *uuid,
> +			.offset = cpu_to_le32(offset),
> +			.length = cpu_to_le32(xfer_size)
> +		};
> +		int rc;
> +
> +		rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_GET_LOG, &log,
> +					   sizeof(log), out, xfer_size);
> +		if (rc < 0)
> +			return rc;
> +
> +		out += xfer_size;
> +		remaining -= xfer_size;
> +		offset += xfer_size;
> +	}
> +
> +	return 0;
> +}
> +
> +/**
> + * cxl_walk_cel() - Walk through the Command Effects Log.
> + * @cxlm: Device.
> + * @size: Length of the Command Effects Log.
> + * @cel: Buffer holding the Command Effects Log entries
> + *
> + * Iterate over each entry in the CEL and determine if the driver supports the
> + * command. If so, the command is enabled for the device and can be used later.
> + */
> +static void cxl_walk_cel(struct cxl_mem *cxlm, size_t size, u8 *cel)
> +{
> +	struct cel_entry {
> +		__le16 opcode;
> +		__le16 effect;
> +	} __packed * cel_entry;
> +	const int cel_entries = size / sizeof(*cel_entry);
> +	int i;
> +
> +	cel_entry = (struct cel_entry *)cel;
> +
> +	for (i = 0; i < cel_entries; i++) {
> +		u16 opcode = le16_to_cpu(cel_entry[i].opcode);
> +		struct cxl_mem_command *cmd = cxl_mem_find_command(opcode);
> +
> +		if (!cmd) {
> +			dev_dbg(cxlm->dev,
> +				"Opcode 0x%04x unsupported by driver", opcode);
> +			continue;
> +		}
> +
> +		set_bit(cmd->info.id, cxlm->enabled_cmds);
> +	}
> +}
> +
> +struct cxl_mbox_get_supported_logs {
> +	__le16 entries;
> +	u8 rsvd[6];
> +	struct gsl_entry {
> +		uuid_t uuid;
> +		__le32 size;
> +	} __packed entry[];
> +} __packed;
> +
> +static struct cxl_mbox_get_supported_logs *cxl_get_gsl(struct cxl_mem *cxlm)
> +{
> +	struct cxl_mbox_get_supported_logs *ret;
> +	int rc;
> +
> +	ret = kvmalloc(cxlm->payload_size, GFP_KERNEL);
> +	if (!ret)
> +		return ERR_PTR(-ENOMEM);
> +
> +	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_GET_SUPPORTED_LOGS, NULL,
> +				   0, ret, cxlm->payload_size);
> +	if (rc < 0) {
> +		kvfree(ret);
> +		return ERR_PTR(rc);
> +	}
> +
> +	return ret;
> +}
> +
> +enum {
> +	CEL_UUID,
> +	VENDOR_DEBUG_UUID,
> +};
> +
> +/* See CXL 2.0 Table 170. Get Log Input Payload */
> +static const uuid_t log_uuid[] = {
> +	[CEL_UUID] = UUID_INIT(0xda9c0b5, 0xbf41, 0x4b78, 0x8f, 0x79, 0x96,
> +			       0xb1, 0x62, 0x3b, 0x3f, 0x17),
> +	[VENDOR_DEBUG_UUID] = UUID_INIT(0xe1819d9, 0x11a9, 0x400c, 0x81, 0x1f,
> +					0xd6, 0x07, 0x19, 0x40, 0x3d, 0x86),
> +};
> +
> +/**
> + * cxl_mem_enumerate_cmds() - Enumerate commands for a device.
> + * @cxlm: The device.
> + *
> + * Returns 0 if enumeration completed successfully.
> + *
> + * CXL devices have optional support for certain commands. This function will
> + * determine the set of supported commands for the hardware and update the
> + * enabled_cmds bitmap in the @cxlm.
> + */
> +int cxl_mem_enumerate_cmds(struct cxl_mem *cxlm)
> +{
> +	struct cxl_mbox_get_supported_logs *gsl;
> +	struct device *dev = cxlm->dev;
> +	struct cxl_mem_command *cmd;
> +	int i, rc;
> +
> +	gsl = cxl_get_gsl(cxlm);
> +	if (IS_ERR(gsl))
> +		return PTR_ERR(gsl);
> +
> +	rc = -ENOENT;
> +	for (i = 0; i < le16_to_cpu(gsl->entries); i++) {
> +		u32 size = le32_to_cpu(gsl->entry[i].size);
> +		uuid_t uuid = gsl->entry[i].uuid;
> +		u8 *log;
> +
> +		dev_dbg(dev, "Found LOG type %pU of size %d", &uuid, size);
> +
> +		if (!uuid_equal(&uuid, &log_uuid[CEL_UUID]))
> +			continue;
> +
> +		log = kvmalloc(size, GFP_KERNEL);
> +		if (!log) {
> +			rc = -ENOMEM;
> +			goto out;
> +		}
> +
> +		rc = cxl_xfer_log(cxlm, &uuid, size, log);
> +		if (rc) {
> +			kvfree(log);
> +			goto out;
> +		}
> +
> +		cxl_walk_cel(cxlm, size, log);
> +		kvfree(log);
> +
> +		/* In case CEL was bogus, enable some default commands. */
> +		cxl_for_each_cmd(cmd)
> +			if (cmd->flags & CXL_CMD_FLAG_FORCE_ENABLE)
> +				set_bit(cmd->info.id, cxlm->enabled_cmds);
> +
> +		/* Found the required CEL */
> +		rc = 0;
> +	}
> +
> +out:
> +	kvfree(gsl);
> +	return rc;
> +}
> +EXPORT_SYMBOL_GPL(cxl_mem_enumerate_cmds);
> +
> +/**
> + * cxl_mem_get_partition_info - Get partition info
> + * @cxlm: cxl_mem instance to update partition info
> + *
> + * Retrieve the current partition info for the device specified.  The active
> + * values are the current capacity in bytes.  If not 0, the 'next' values are
> + * the pending values, in bytes, which take effect on next cold reset.
> + *
> + * Return: 0 if no error; otherwise the result of the mailbox command.
> + *
> + * See CXL @8.2.9.5.2.1 Get Partition Info
> + */
> +static int cxl_mem_get_partition_info(struct cxl_mem *cxlm)
> +{
> +	struct cxl_mbox_get_partition_info {
> +		__le64 active_volatile_cap;
> +		__le64 active_persistent_cap;
> +		__le64 next_volatile_cap;
> +		__le64 next_persistent_cap;
> +	} __packed pi;
> +	int rc;
> +
> +	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_GET_PARTITION_INFO,
> +				   NULL, 0, &pi, sizeof(pi));
> +
> +	if (rc)
> +		return rc;
> +
> +	cxlm->active_volatile_bytes =
> +		le64_to_cpu(pi.active_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
> +	cxlm->active_persistent_bytes =
> +		le64_to_cpu(pi.active_persistent_cap) * CXL_CAPACITY_MULTIPLIER;
> +	cxlm->next_volatile_bytes =
> +		le64_to_cpu(pi.next_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
> +	cxlm->next_persistent_bytes =
> +		le64_to_cpu(pi.next_persistent_cap) * CXL_CAPACITY_MULTIPLIER;
> +
> +	return 0;
> +}
> +
> +/**
> + * cxl_mem_identify() - Send the IDENTIFY command to the device.
> + * @cxlm: The device to identify.
> + *
> + * Return: 0 if identify was executed successfully.
> + *
> + * This will dispatch the identify command to the device and on success populate
> + * structures to be exported to sysfs.
> + */
> +int cxl_mem_identify(struct cxl_mem *cxlm)
> +{
> +	/* See CXL 2.0 Table 175 Identify Memory Device Output Payload */
> +	struct cxl_mbox_identify {
> +		char fw_revision[0x10];
> +		__le64 total_capacity;
> +		__le64 volatile_capacity;
> +		__le64 persistent_capacity;
> +		__le64 partition_align;
> +		__le16 info_event_log_size;
> +		__le16 warning_event_log_size;
> +		__le16 failure_event_log_size;
> +		__le16 fatal_event_log_size;
> +		__le32 lsa_size;
> +		u8 poison_list_max_mer[3];
> +		__le16 inject_poison_limit;
> +		u8 poison_caps;
> +		u8 qos_telemetry_caps;
> +	} __packed id;
> +	int rc;
> +
> +	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_IDENTIFY, NULL, 0, &id,
> +				   sizeof(id));
> +	if (rc < 0)
> +		return rc;
> +
> +	cxlm->total_bytes =
> +		le64_to_cpu(id.total_capacity) * CXL_CAPACITY_MULTIPLIER;
> +	cxlm->volatile_only_bytes =
> +		le64_to_cpu(id.volatile_capacity) * CXL_CAPACITY_MULTIPLIER;
> +	cxlm->persistent_only_bytes =
> +		le64_to_cpu(id.persistent_capacity) * CXL_CAPACITY_MULTIPLIER;
> +	cxlm->partition_align_bytes =
> +		le64_to_cpu(id.partition_align) * CXL_CAPACITY_MULTIPLIER;
> +
> +	dev_dbg(cxlm->dev,
> +		"Identify Memory Device\n"
> +		"     total_bytes = %#llx\n"
> +		"     volatile_only_bytes = %#llx\n"
> +		"     persistent_only_bytes = %#llx\n"
> +		"     partition_align_bytes = %#llx\n",
> +		cxlm->total_bytes, cxlm->volatile_only_bytes,
> +		cxlm->persistent_only_bytes, cxlm->partition_align_bytes);
> +
> +	cxlm->lsa_size = le32_to_cpu(id.lsa_size);
> +	memcpy(cxlm->firmware_version, id.fw_revision, sizeof(id.fw_revision));
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(cxl_mem_identify);
> +
> +int cxl_mem_create_range_info(struct cxl_mem *cxlm)
> +{
> +	int rc;
> +
> +	if (cxlm->partition_align_bytes == 0) {
> +		cxlm->ram_range.start = 0;
> +		cxlm->ram_range.end = cxlm->volatile_only_bytes - 1;
> +		cxlm->pmem_range.start = cxlm->volatile_only_bytes;
> +		cxlm->pmem_range.end = cxlm->volatile_only_bytes +
> +				       cxlm->persistent_only_bytes - 1;
> +		return 0;
> +	}
> +
> +	rc = cxl_mem_get_partition_info(cxlm);
> +	if (rc) {
> +		dev_err(cxlm->dev, "Failed to query partition information\n");
> +		return rc;
> +	}
> +
> +	dev_dbg(cxlm->dev,
> +		"Get Partition Info\n"
> +		"     active_volatile_bytes = %#llx\n"
> +		"     active_persistent_bytes = %#llx\n"
> +		"     next_volatile_bytes = %#llx\n"
> +		"     next_persistent_bytes = %#llx\n",
> +		cxlm->active_volatile_bytes, cxlm->active_persistent_bytes,
> +		cxlm->next_volatile_bytes, cxlm->next_persistent_bytes);
> +
> +	cxlm->ram_range.start = 0;
> +	cxlm->ram_range.end = cxlm->active_volatile_bytes - 1;
> +
> +	cxlm->pmem_range.start = cxlm->active_volatile_bytes;
> +	cxlm->pmem_range.end =
> +		cxlm->active_volatile_bytes + cxlm->active_persistent_bytes - 1;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(cxl_mem_create_range_info);
> +
> +struct cxl_mem *cxl_mem_create(struct device *dev)
> +{
> +	struct cxl_mem *cxlm;
> +
> +	cxlm = devm_kzalloc(dev, sizeof(*cxlm), GFP_KERNEL);
> +	if (!cxlm) {
> +		dev_err(dev, "No memory available\n");
> +		return ERR_PTR(-ENOMEM);
> +	}
> +
> +	mutex_init(&cxlm->mbox_mutex);
> +	cxlm->dev = dev;
> +	cxlm->enabled_cmds =
> +		devm_kmalloc_array(dev, BITS_TO_LONGS(cxl_cmd_count),
> +				   sizeof(unsigned long),
> +				   GFP_KERNEL | __GFP_ZERO);
> +	if (!cxlm->enabled_cmds) {
> +		dev_err(dev, "No memory available for bitmap\n");
> +		return ERR_PTR(-ENOMEM);
> +	}
> +
> +	return cxlm;
> +}
> +EXPORT_SYMBOL_GPL(cxl_mem_create);
> +
> +static struct dentry *cxl_debugfs;
> +
> +void __init cxl_mbox_init(void)
> +{
> +	struct dentry *mbox_debugfs;
> +
> +	cxl_debugfs = debugfs_create_dir("cxl", NULL);
> +	mbox_debugfs = debugfs_create_dir("mbox", cxl_debugfs);
> +	debugfs_create_bool("raw_allow_all", 0600, mbox_debugfs,
> +			    &cxl_raw_allow_all);
> +}
> +
> +void cxl_mbox_exit(void)
> +{
> +	debugfs_remove_recursive(cxl_debugfs);
> +}
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 331ef7d6c598..df2ba87238c2 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -8,6 +8,8 @@
>  #include <cxlmem.h>
>  #include "core.h"
>  
> +static DECLARE_RWSEM(cxl_memdev_rwsem);
> +
>  /*
>   * An entire PCI topology full of devices should be enough for any
>   * config
> @@ -132,16 +134,21 @@ static const struct device_type cxl_memdev_type = {
>  	.groups = cxl_memdev_attribute_groups,
>  };
>  
> +static void cxl_memdev_shutdown(struct device *dev)
> +{
> +	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> +
> +	down_write(&cxl_memdev_rwsem);
> +	cxlmd->cxlm = NULL;
> +	up_write(&cxl_memdev_rwsem);
> +}
> +
>  static void cxl_memdev_unregister(void *_cxlmd)
>  {
>  	struct cxl_memdev *cxlmd = _cxlmd;
>  	struct device *dev = &cxlmd->dev;
> -	struct cdev *cdev = &cxlmd->cdev;
> -	const struct cdevm_file_operations *cdevm_fops;
> -
> -	cdevm_fops = container_of(cdev->ops, typeof(*cdevm_fops), fops);
> -	cdevm_fops->shutdown(dev);
>  
> +	cxl_memdev_shutdown(dev);
>  	cdev_device_del(&cxlmd->cdev, dev);
>  	put_device(dev);
>  }
> @@ -180,16 +187,72 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_mem *cxlm,
>  	return ERR_PTR(rc);
>  }
>  
> +static long __cxl_memdev_ioctl(struct cxl_memdev *cxlmd, unsigned int cmd,
> +			       unsigned long arg)
> +{
> +	switch (cmd) {
> +	case CXL_MEM_QUERY_COMMANDS:
> +		return cxl_query_cmd(cxlmd, (void __user *)arg);
> +	case CXL_MEM_SEND_COMMAND:
> +		return cxl_send_cmd(cxlmd, (void __user *)arg);
> +	default:
> +		return -ENOTTY;
> +	}
> +}
> +
> +static long cxl_memdev_ioctl(struct file *file, unsigned int cmd,
> +			     unsigned long arg)
> +{
> +	struct cxl_memdev *cxlmd = file->private_data;
> +	int rc = -ENXIO;
> +
> +	down_read(&cxl_memdev_rwsem);
> +	if (cxlmd->cxlm)
> +		rc = __cxl_memdev_ioctl(cxlmd, cmd, arg);
> +	up_read(&cxl_memdev_rwsem);
> +
> +	return rc;
> +}
> +
> +static int cxl_memdev_open(struct inode *inode, struct file *file)
> +{
> +	struct cxl_memdev *cxlmd =
> +		container_of(inode->i_cdev, typeof(*cxlmd), cdev);
> +
> +	get_device(&cxlmd->dev);
> +	file->private_data = cxlmd;
> +
> +	return 0;
> +}
> +
> +static int cxl_memdev_release_file(struct inode *inode, struct file *file)
> +{
> +	struct cxl_memdev *cxlmd =
> +		container_of(inode->i_cdev, typeof(*cxlmd), cdev);
> +
> +	put_device(&cxlmd->dev);
> +
> +	return 0;
> +}
> +
> +static const struct file_operations cxl_memdev_fops = {
> +	.owner = THIS_MODULE,
> +	.unlocked_ioctl = cxl_memdev_ioctl,
> +	.open = cxl_memdev_open,
> +	.release = cxl_memdev_release_file,
> +	.compat_ioctl = compat_ptr_ioctl,
> +	.llseek = noop_llseek,
> +};
> +
>  struct cxl_memdev *
> -devm_cxl_add_memdev(struct cxl_mem *cxlm,
> -		    const struct cdevm_file_operations *cdevm_fops)
> +devm_cxl_add_memdev(struct cxl_mem *cxlm)
>  {
>  	struct cxl_memdev *cxlmd;
>  	struct device *dev;
>  	struct cdev *cdev;
>  	int rc;
>  
> -	cxlmd = cxl_memdev_alloc(cxlm, &cdevm_fops->fops);
> +	cxlmd = cxl_memdev_alloc(cxlm, &cxl_memdev_fops);
>  	if (IS_ERR(cxlmd))
>  		return cxlmd;
>  
> @@ -219,7 +282,7 @@ devm_cxl_add_memdev(struct cxl_mem *cxlm,
>  	 * The cdev was briefly live, shutdown any ioctl operations that
>  	 * saw that state.
>  	 */
> -	cdevm_fops->shutdown(dev);
> +	cxl_memdev_shutdown(dev);
>  	put_device(dev);
>  	return ERR_PTR(rc);
>  }
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 9be5e26c5b48..35fd16d12532 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -2,6 +2,7 @@
>  /* Copyright(c) 2020-2021 Intel Corporation. */
>  #ifndef __CXL_MEM_H__
>  #define __CXL_MEM_H__
> +#include <uapi/linux/cxl_mem.h>
>  #include <linux/cdev.h>
>  #include "cxl.h"
>  
> @@ -28,21 +29,6 @@
>  	(FIELD_GET(CXLMDEV_RESET_NEEDED_MASK, status) !=                       \
>  	 CXLMDEV_RESET_NEEDED_NOT)
>  
> -/**
> - * struct cdevm_file_operations - devm coordinated cdev file operations
> - * @fops: file operations that are synchronized against @shutdown
> - * @shutdown: disconnect driver data
> - *
> - * @shutdown is invoked in the devres release path to disconnect any
> - * driver instance data from @dev. It assumes synchronization with any
> - * fops operation that requires driver data. After @shutdown an
> - * operation may only reference @device data.
> - */
> -struct cdevm_file_operations {
> -	struct file_operations fops;
> -	void (*shutdown)(struct device *dev);
> -};
> -
>  /**
>   * struct cxl_memdev - CXL bus object representing a Type-3 Memory Device
>   * @dev: driver core device object
> @@ -62,9 +48,7 @@ static inline struct cxl_memdev *to_cxl_memdev(struct device *dev)
>  	return container_of(dev, struct cxl_memdev, dev);
>  }
>  
> -struct cxl_memdev *
> -devm_cxl_add_memdev(struct cxl_mem *cxlm,
> -		    const struct cdevm_file_operations *cdevm_fops);
> +struct cxl_memdev *devm_cxl_add_memdev(struct cxl_mem *cxlm);
>  
>  /**
>   * struct cxl_mbox_cmd - A command to be submitted to hardware.
> @@ -158,4 +142,62 @@ struct cxl_mem {
>  
>  	int (*mbox_send)(struct cxl_mem *cxlm, struct cxl_mbox_cmd *cmd);
>  };
> +
> +enum cxl_opcode {
> +	CXL_MBOX_OP_INVALID		= 0x0000,
> +	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
> +	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
> +	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
> +	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
> +	CXL_MBOX_OP_GET_LOG		= 0x0401,
> +	CXL_MBOX_OP_IDENTIFY		= 0x4000,
> +	CXL_MBOX_OP_GET_PARTITION_INFO	= 0x4100,
> +	CXL_MBOX_OP_SET_PARTITION_INFO	= 0x4101,
> +	CXL_MBOX_OP_GET_LSA		= 0x4102,
> +	CXL_MBOX_OP_SET_LSA		= 0x4103,
> +	CXL_MBOX_OP_GET_HEALTH_INFO	= 0x4200,
> +	CXL_MBOX_OP_GET_ALERT_CONFIG	= 0x4201,
> +	CXL_MBOX_OP_SET_ALERT_CONFIG	= 0x4202,
> +	CXL_MBOX_OP_GET_SHUTDOWN_STATE	= 0x4203,
> +	CXL_MBOX_OP_SET_SHUTDOWN_STATE	= 0x4204,
> +	CXL_MBOX_OP_GET_POISON		= 0x4300,
> +	CXL_MBOX_OP_INJECT_POISON	= 0x4301,
> +	CXL_MBOX_OP_CLEAR_POISON	= 0x4302,
> +	CXL_MBOX_OP_GET_SCAN_MEDIA_CAPS	= 0x4303,
> +	CXL_MBOX_OP_SCAN_MEDIA		= 0x4304,
> +	CXL_MBOX_OP_GET_SCAN_MEDIA	= 0x4305,
> +	CXL_MBOX_OP_MAX			= 0x10000
> +};
> +
> +/**
> + * struct cxl_mem_command - Driver representation of a memory device command
> + * @info: Command information as it exists for the UAPI
> + * @opcode: The actual bits used for the mailbox protocol
> + * @flags: Set of flags affecting driver behavior.
> + *
> + *  * %CXL_CMD_FLAG_FORCE_ENABLE: In cases of error, commands with this flag
> + *    will be enabled by the driver regardless of what hardware may have
> + *    advertised.
> + *
> + * The cxl_mem_command is the driver's internal representation of commands that
> + * are supported by the driver. Some of these commands may not be supported by
> + * the hardware. The driver will use @info to validate the fields passed in by
> + * the user, then submit the @opcode to the hardware.
> + *
> + * See struct cxl_command_info.
> + */
> +struct cxl_mem_command {
> +	struct cxl_command_info info;
> +	enum cxl_opcode opcode;
> +	u32 flags;
> +#define CXL_CMD_FLAG_NONE 0
> +#define CXL_CMD_FLAG_FORCE_ENABLE BIT(0)
> +};
> +
> +int cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm, u16 opcode, void *in,
> +			  size_t in_size, void *out, size_t out_size);
> +int cxl_mem_identify(struct cxl_mem *cxlm);
> +int cxl_mem_enumerate_cmds(struct cxl_mem *cxlm);
> +int cxl_mem_create_range_info(struct cxl_mem *cxlm);
> +struct cxl_mem *cxl_mem_create(struct device *dev);
>  #endif /* __CXL_MEM_H__ */
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 9d8050fdd69c..c9f2ac134f4d 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -1,16 +1,12 @@
>  // SPDX-License-Identifier: GPL-2.0-only
>  /* Copyright(c) 2020 Intel Corporation. All rights reserved. */
> -#include <uapi/linux/cxl_mem.h>
> -#include <linux/security.h>
> -#include <linux/debugfs.h>
> +#include <linux/io-64-nonatomic-lo-hi.h>
>  #include <linux/module.h>
>  #include <linux/sizes.h>
>  #include <linux/mutex.h>
>  #include <linux/list.h>
> -#include <linux/cdev.h>
>  #include <linux/pci.h>
>  #include <linux/io.h>
> -#include <linux/io-64-nonatomic-lo-hi.h>
>  #include "cxlmem.h"
>  #include "pci.h"
>  #include "cxl.h"
> @@ -37,162 +33,6 @@
>  /* CXL 2.0 - 8.2.8.4 */
>  #define CXL_MAILBOX_TIMEOUT_MS (2 * HZ)
>  
> -enum opcode {
> -	CXL_MBOX_OP_INVALID		= 0x0000,
> -	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
> -	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
> -	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
> -	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
> -	CXL_MBOX_OP_GET_LOG		= 0x0401,
> -	CXL_MBOX_OP_IDENTIFY		= 0x4000,
> -	CXL_MBOX_OP_GET_PARTITION_INFO	= 0x4100,
> -	CXL_MBOX_OP_SET_PARTITION_INFO	= 0x4101,
> -	CXL_MBOX_OP_GET_LSA		= 0x4102,
> -	CXL_MBOX_OP_SET_LSA		= 0x4103,
> -	CXL_MBOX_OP_GET_HEALTH_INFO	= 0x4200,
> -	CXL_MBOX_OP_GET_ALERT_CONFIG	= 0x4201,
> -	CXL_MBOX_OP_SET_ALERT_CONFIG	= 0x4202,
> -	CXL_MBOX_OP_GET_SHUTDOWN_STATE	= 0x4203,
> -	CXL_MBOX_OP_SET_SHUTDOWN_STATE	= 0x4204,
> -	CXL_MBOX_OP_GET_POISON		= 0x4300,
> -	CXL_MBOX_OP_INJECT_POISON	= 0x4301,
> -	CXL_MBOX_OP_CLEAR_POISON	= 0x4302,
> -	CXL_MBOX_OP_GET_SCAN_MEDIA_CAPS	= 0x4303,
> -	CXL_MBOX_OP_SCAN_MEDIA		= 0x4304,
> -	CXL_MBOX_OP_GET_SCAN_MEDIA	= 0x4305,
> -	CXL_MBOX_OP_MAX			= 0x10000
> -};
> -
> -static DECLARE_RWSEM(cxl_memdev_rwsem);
> -static struct dentry *cxl_debugfs;
> -static bool cxl_raw_allow_all;
> -
> -enum {
> -	CEL_UUID,
> -	VENDOR_DEBUG_UUID,
> -};
> -
> -/* See CXL 2.0 Table 170. Get Log Input Payload */
> -static const uuid_t log_uuid[] = {
> -	[CEL_UUID] = UUID_INIT(0xda9c0b5, 0xbf41, 0x4b78, 0x8f, 0x79, 0x96,
> -			       0xb1, 0x62, 0x3b, 0x3f, 0x17),
> -	[VENDOR_DEBUG_UUID] = UUID_INIT(0xe1819d9, 0x11a9, 0x400c, 0x81, 0x1f,
> -					0xd6, 0x07, 0x19, 0x40, 0x3d, 0x86),
> -};
> -
> -/**
> - * struct cxl_mem_command - Driver representation of a memory device command
> - * @info: Command information as it exists for the UAPI
> - * @opcode: The actual bits used for the mailbox protocol
> - * @flags: Set of flags effecting driver behavior.
> - *
> - *  * %CXL_CMD_FLAG_FORCE_ENABLE: In cases of error, commands with this flag
> - *    will be enabled by the driver regardless of what hardware may have
> - *    advertised.
> - *
> - * The cxl_mem_command is the driver's internal representation of commands that
> - * are supported by the driver. Some of these commands may not be supported by
> - * the hardware. The driver will use @info to validate the fields passed in by
> - * the user then submit the @opcode to the hardware.
> - *
> - * See struct cxl_command_info.
> - */
> -struct cxl_mem_command {
> -	struct cxl_command_info info;
> -	enum opcode opcode;
> -	u32 flags;
> -#define CXL_CMD_FLAG_NONE 0
> -#define CXL_CMD_FLAG_FORCE_ENABLE BIT(0)
> -};
> -
> -#define CXL_CMD(_id, sin, sout, _flags)                                        \
> -	[CXL_MEM_COMMAND_ID_##_id] = {                                         \
> -	.info =	{                                                              \
> -			.id = CXL_MEM_COMMAND_ID_##_id,                        \
> -			.size_in = sin,                                        \
> -			.size_out = sout,                                      \
> -		},                                                             \
> -	.opcode = CXL_MBOX_OP_##_id,                                           \
> -	.flags = _flags,                                                       \
> -	}
> -
> -/*
> - * This table defines the supported mailbox commands for the driver. This table
> - * is made up of a UAPI structure. Non-negative values as parameters in the
> - * table will be validated against the user's input. For example, if size_in is
> - * 0, and the user passed in 1, it is an error.
> - */
> -static struct cxl_mem_command mem_commands[CXL_MEM_COMMAND_ID_MAX] = {
> -	CXL_CMD(IDENTIFY, 0, 0x43, CXL_CMD_FLAG_FORCE_ENABLE),
> -#ifdef CONFIG_CXL_MEM_RAW_COMMANDS
> -	CXL_CMD(RAW, ~0, ~0, 0),
> -#endif
> -	CXL_CMD(GET_SUPPORTED_LOGS, 0, ~0, CXL_CMD_FLAG_FORCE_ENABLE),
> -	CXL_CMD(GET_FW_INFO, 0, 0x50, 0),
> -	CXL_CMD(GET_PARTITION_INFO, 0, 0x20, 0),
> -	CXL_CMD(GET_LSA, 0x8, ~0, 0),
> -	CXL_CMD(GET_HEALTH_INFO, 0, 0x12, 0),
> -	CXL_CMD(GET_LOG, 0x18, ~0, CXL_CMD_FLAG_FORCE_ENABLE),
> -	CXL_CMD(SET_PARTITION_INFO, 0x0a, 0, 0),
> -	CXL_CMD(SET_LSA, ~0, 0, 0),
> -	CXL_CMD(GET_ALERT_CONFIG, 0, 0x10, 0),
> -	CXL_CMD(SET_ALERT_CONFIG, 0xc, 0, 0),
> -	CXL_CMD(GET_SHUTDOWN_STATE, 0, 0x1, 0),
> -	CXL_CMD(SET_SHUTDOWN_STATE, 0x1, 0, 0),
> -	CXL_CMD(GET_POISON, 0x10, ~0, 0),
> -	CXL_CMD(INJECT_POISON, 0x8, 0, 0),
> -	CXL_CMD(CLEAR_POISON, 0x48, 0, 0),
> -	CXL_CMD(GET_SCAN_MEDIA_CAPS, 0x10, 0x4, 0),
> -	CXL_CMD(SCAN_MEDIA, 0x11, 0, 0),
> -	CXL_CMD(GET_SCAN_MEDIA, 0, ~0, 0),
> -};
> -
> -/*
> - * Commands that RAW doesn't permit. The rationale for each:
> - *
> - * CXL_MBOX_OP_ACTIVATE_FW: Firmware activation requires adjustment /
> - * coordination of transaction timeout values at the root bridge level.
> - *
> - * CXL_MBOX_OP_SET_PARTITION_INFO: The device memory map may change live
> - * and needs to be coordinated with HDM updates.
> - *
> - * CXL_MBOX_OP_SET_LSA: The label storage area may be cached by the
> - * driver and any writes from userspace invalidates those contents.
> - *
> - * CXL_MBOX_OP_SET_SHUTDOWN_STATE: Set shutdown state assumes no writes
> - * to the device after it is marked clean, userspace can not make that
> - * assertion.
> - *
> - * CXL_MBOX_OP_[GET_]SCAN_MEDIA: The kernel provides a native error list that
> - * is kept up to date with patrol notifications and error management.
> - */
> -static u16 cxl_disabled_raw_commands[] = {
> -	CXL_MBOX_OP_ACTIVATE_FW,
> -	CXL_MBOX_OP_SET_PARTITION_INFO,
> -	CXL_MBOX_OP_SET_LSA,
> -	CXL_MBOX_OP_SET_SHUTDOWN_STATE,
> -	CXL_MBOX_OP_SCAN_MEDIA,
> -	CXL_MBOX_OP_GET_SCAN_MEDIA,
> -};
> -
> -/*
> - * Command sets that RAW doesn't permit. All opcodes in this set are
> - * disabled because they pass plain text security payloads over the
> - * user/kernel boundary. This functionality is intended to be wrapped
> - * behind the keys ABI which allows for encrypted payloads in the UAPI
> - */
> -static u8 security_command_sets[] = {
> -	0x44, /* Sanitize */
> -	0x45, /* Persistent Memory Data-at-rest Security */
> -	0x46, /* Security Passthrough */
> -};
> -
> -#define cxl_for_each_cmd(cmd)                                                  \
> -	for ((cmd) = &mem_commands[0];                                         \
> -	     ((cmd) - mem_commands) < ARRAY_SIZE(mem_commands); (cmd)++)
> -
> -#define cxl_cmd_count ARRAY_SIZE(mem_commands)
> -
>  static int cxl_mem_wait_for_doorbell(struct cxl_mem *cxlm)
>  {
>  	const unsigned long start = jiffies;
> @@ -215,16 +55,6 @@ static int cxl_mem_wait_for_doorbell(struct cxl_mem *cxlm)
>  	return 0;
>  }
>  
> -static bool cxl_is_security_command(u16 opcode)
> -{
> -	int i;
> -
> -	for (i = 0; i < ARRAY_SIZE(security_command_sets); i++)
> -		if (security_command_sets[i] == (opcode >> 8))
> -			return true;
> -	return false;
> -}
> -
>  static void cxl_mem_mbox_timeout(struct cxl_mem *cxlm,
>  				 struct cxl_mbox_cmd *mbox_cmd)
>  {
> @@ -446,433 +276,6 @@ static int cxl_pci_mbox_send(struct cxl_mem *cxlm, struct cxl_mbox_cmd *cmd)
>  	return rc;
>  }
>  
> -/**
> - * handle_mailbox_cmd_from_user() - Dispatch a mailbox command for userspace.
> - * @cxlm: The CXL memory device to communicate with.
> - * @cmd: The validated command.
> - * @in_payload: Pointer to userspace's input payload.
> - * @out_payload: Pointer to userspace's output payload.
> - * @size_out: (Input) Max payload size to copy out.
> - *            (Output) Payload size hardware generated.
> - * @retval: Hardware generated return code from the operation.
> - *
> - * Return:
> - *  * %0	- Mailbox transaction succeeded. This implies the mailbox
> - *		  protocol completed successfully not that the operation itself
> - *		  was successful.
> - *  * %-ENOMEM  - Couldn't allocate a bounce buffer.
> - *  * %-EFAULT	- Something happened with copy_to/from_user.
> - *  * %-EINTR	- Mailbox acquisition interrupted.
> - *  * %-EXXX	- Transaction level failures.
> - *
> - * Creates the appropriate mailbox command and dispatches it on behalf of a
> - * userspace request. The input and output payloads are copied between
> - * userspace.
> - *
> - * See cxl_send_cmd().
> - */
> -static int handle_mailbox_cmd_from_user(struct cxl_mem *cxlm,
> -					const struct cxl_mem_command *cmd,
> -					u64 in_payload, u64 out_payload,
> -					s32 *size_out, u32 *retval)
> -{
> -	struct device *dev = cxlm->dev;
> -	struct cxl_mbox_cmd mbox_cmd = {
> -		.opcode = cmd->opcode,
> -		.size_in = cmd->info.size_in,
> -		.size_out = cmd->info.size_out,
> -	};
> -	int rc;
> -
> -	if (cmd->info.size_out) {
> -		mbox_cmd.payload_out = kvzalloc(cmd->info.size_out, GFP_KERNEL);
> -		if (!mbox_cmd.payload_out)
> -			return -ENOMEM;
> -	}
> -
> -	if (cmd->info.size_in) {
> -		mbox_cmd.payload_in = vmemdup_user(u64_to_user_ptr(in_payload),
> -						   cmd->info.size_in);
> -		if (IS_ERR(mbox_cmd.payload_in)) {
> -			kvfree(mbox_cmd.payload_out);
> -			return PTR_ERR(mbox_cmd.payload_in);
> -		}
> -	}
> -
> -	dev_dbg(dev,
> -		"Submitting %s command for user\n"
> -		"\topcode: %x\n"
> -		"\tsize: %ub\n",
> -		cxl_command_names[cmd->info.id].name, mbox_cmd.opcode,
> -		cmd->info.size_in);
> -
> -	dev_WARN_ONCE(dev, cmd->info.id == CXL_MEM_COMMAND_ID_RAW,
> -		      "raw command path used\n");
> -
> -	rc = cxlm->mbox_send(cxlm, &mbox_cmd);
> -	if (rc)
> -		goto out;
> -
> -	/*
> -	 * @size_out contains the max size that's allowed to be written back out
> -	 * to userspace. While the payload may have written more output than
> -	 * this it will have to be ignored.
> -	 */
> -	if (mbox_cmd.size_out) {
> -		dev_WARN_ONCE(dev, mbox_cmd.size_out > *size_out,
> -			      "Invalid return size\n");
> -		if (copy_to_user(u64_to_user_ptr(out_payload),
> -				 mbox_cmd.payload_out, mbox_cmd.size_out)) {
> -			rc = -EFAULT;
> -			goto out;
> -		}
> -	}
> -
> -	*size_out = mbox_cmd.size_out;
> -	*retval = mbox_cmd.return_code;
> -
> -out:
> -	kvfree(mbox_cmd.payload_in);
> -	kvfree(mbox_cmd.payload_out);
> -	return rc;
> -}
> -
> -static bool cxl_mem_raw_command_allowed(u16 opcode)
> -{
> -	int i;
> -
> -	if (!IS_ENABLED(CONFIG_CXL_MEM_RAW_COMMANDS))
> -		return false;
> -
> -	if (security_locked_down(LOCKDOWN_PCI_ACCESS))
> -		return false;
> -
> -	if (cxl_raw_allow_all)
> -		return true;
> -
> -	if (cxl_is_security_command(opcode))
> -		return false;
> -
> -	for (i = 0; i < ARRAY_SIZE(cxl_disabled_raw_commands); i++)
> -		if (cxl_disabled_raw_commands[i] == opcode)
> -			return false;
> -
> -	return true;
> -}
> -
> -/**
> - * cxl_validate_cmd_from_user() - Check fields for CXL_MEM_SEND_COMMAND.
> - * @cxlm: &struct cxl_mem device whose mailbox will be used.
> - * @send_cmd: &struct cxl_send_command copied in from userspace.
> - * @out_cmd: Sanitized and populated &struct cxl_mem_command.
> - *
> - * Return:
> - *  * %0	- @out_cmd is ready to send.
> - *  * %-ENOTTY	- Invalid command specified.
> - *  * %-EINVAL	- Reserved fields or invalid values were used.
> - *  * %-ENOMEM	- Input or output buffer wasn't sized properly.
> - *  * %-EPERM	- Attempted to use a protected command.
> - *
> - * The result of this command is a fully validated command in @out_cmd that is
> - * safe to send to the hardware.
> - *
> - * See handle_mailbox_cmd_from_user()
> - */
> -static int cxl_validate_cmd_from_user(struct cxl_mem *cxlm,
> -				      const struct cxl_send_command *send_cmd,
> -				      struct cxl_mem_command *out_cmd)
> -{
> -	const struct cxl_command_info *info;
> -	struct cxl_mem_command *c;
> -
> -	if (send_cmd->id == 0 || send_cmd->id >= CXL_MEM_COMMAND_ID_MAX)
> -		return -ENOTTY;
> -
> -	/*
> -	 * The user can never specify an input payload larger than what hardware
> -	 * supports, but output can be arbitrarily large (simply write out as
> -	 * much data as the hardware provides).
> -	 */
> -	if (send_cmd->in.size > cxlm->payload_size)
> -		return -EINVAL;
> -
> -	/*
> -	 * Checks are bypassed for raw commands but a WARN/taint will occur
> -	 * later in the callchain
> -	 */
> -	if (send_cmd->id == CXL_MEM_COMMAND_ID_RAW) {
> -		const struct cxl_mem_command temp = {
> -			.info = {
> -				.id = CXL_MEM_COMMAND_ID_RAW,
> -				.flags = 0,
> -				.size_in = send_cmd->in.size,
> -				.size_out = send_cmd->out.size,
> -			},
> -			.opcode = send_cmd->raw.opcode
> -		};
> -
> -		if (send_cmd->raw.rsvd)
> -			return -EINVAL;
> -
> -		/*
> -		 * Unlike supported commands, the output size of RAW commands
> -		 * gets passed along without further checking, so it must be
> -		 * validated here.
> -		 */
> -		if (send_cmd->out.size > cxlm->payload_size)
> -			return -EINVAL;
> -
> -		if (!cxl_mem_raw_command_allowed(send_cmd->raw.opcode))
> -			return -EPERM;
> -
> -		memcpy(out_cmd, &temp, sizeof(temp));
> -
> -		return 0;
> -	}
> -
> -	if (send_cmd->flags & ~CXL_MEM_COMMAND_FLAG_MASK)
> -		return -EINVAL;
> -
> -	if (send_cmd->rsvd)
> -		return -EINVAL;
> -
> -	if (send_cmd->in.rsvd || send_cmd->out.rsvd)
> -		return -EINVAL;
> -
> -	/* Convert user's command into the internal representation */
> -	c = &mem_commands[send_cmd->id];
> -	info = &c->info;
> -
> -	/* Check that the command is enabled for hardware */
> -	if (!test_bit(info->id, cxlm->enabled_cmds))
> -		return -ENOTTY;
> -
> -	/* Check the input buffer is the expected size */
> -	if (info->size_in >= 0 && info->size_in != send_cmd->in.size)
> -		return -ENOMEM;
> -
> -	/* Check the output buffer is at least large enough */
> -	if (info->size_out >= 0 && send_cmd->out.size < info->size_out)
> -		return -ENOMEM;
> -
> -	memcpy(out_cmd, c, sizeof(*c));
> -	out_cmd->info.size_in = send_cmd->in.size;
> -	/*
> -	 * XXX: out_cmd->info.size_out will be controlled by the driver, and the
> -	 * specified number of bytes @send_cmd->out.size will be copied back out
> -	 * to userspace.
> -	 */
> -
> -	return 0;
> -}
> -
> -static int cxl_query_cmd(struct cxl_memdev *cxlmd,
> -			 struct cxl_mem_query_commands __user *q)
> -{
> -	struct device *dev = &cxlmd->dev;
> -	struct cxl_mem_command *cmd;
> -	u32 n_commands;
> -	int j = 0;
> -
> -	dev_dbg(dev, "Query IOCTL\n");
> -
> -	if (get_user(n_commands, &q->n_commands))
> -		return -EFAULT;
> -
> -	/* returns the total number if 0 elements are requested. */
> -	if (n_commands == 0)
> -		return put_user(cxl_cmd_count, &q->n_commands);
> -
> -	/*
> -	 * otherwise, return max(n_commands, total commands) cxl_command_info
> -	 * structures.
> -	 */
> -	cxl_for_each_cmd(cmd) {
> -		const struct cxl_command_info *info = &cmd->info;
> -
> -		if (copy_to_user(&q->commands[j++], info, sizeof(*info)))
> -			return -EFAULT;
> -
> -		if (j == n_commands)
> -			break;
> -	}
> -
> -	return 0;
> -}
> -
> -static int cxl_send_cmd(struct cxl_memdev *cxlmd,
> -			struct cxl_send_command __user *s)
> -{
> -	struct cxl_mem *cxlm = cxlmd->cxlm;
> -	struct device *dev = &cxlmd->dev;
> -	struct cxl_send_command send;
> -	struct cxl_mem_command c;
> -	int rc;
> -
> -	dev_dbg(dev, "Send IOCTL\n");
> -
> -	if (copy_from_user(&send, s, sizeof(send)))
> -		return -EFAULT;
> -
> -	rc = cxl_validate_cmd_from_user(cxlmd->cxlm, &send, &c);
> -	if (rc)
> -		return rc;
> -
> -	/* Prepare to handle a full payload for variable sized output */
> -	if (c.info.size_out < 0)
> -		c.info.size_out = cxlm->payload_size;
> -
> -	rc = handle_mailbox_cmd_from_user(cxlm, &c, send.in.payload,
> -					  send.out.payload, &send.out.size,
> -					  &send.retval);
> -	if (rc)
> -		return rc;
> -
> -	if (copy_to_user(s, &send, sizeof(send)))
> -		return -EFAULT;
> -
> -	return 0;
> -}
> -
> -static long __cxl_memdev_ioctl(struct cxl_memdev *cxlmd, unsigned int cmd,
> -			       unsigned long arg)
> -{
> -	switch (cmd) {
> -	case CXL_MEM_QUERY_COMMANDS:
> -		return cxl_query_cmd(cxlmd, (void __user *)arg);
> -	case CXL_MEM_SEND_COMMAND:
> -		return cxl_send_cmd(cxlmd, (void __user *)arg);
> -	default:
> -		return -ENOTTY;
> -	}
> -}
> -
> -static long cxl_memdev_ioctl(struct file *file, unsigned int cmd,
> -			     unsigned long arg)
> -{
> -	struct cxl_memdev *cxlmd = file->private_data;
> -	int rc = -ENXIO;
> -
> -	down_read(&cxl_memdev_rwsem);
> -	if (cxlmd->cxlm)
> -		rc = __cxl_memdev_ioctl(cxlmd, cmd, arg);
> -	up_read(&cxl_memdev_rwsem);
> -
> -	return rc;
> -}
> -
> -static int cxl_memdev_open(struct inode *inode, struct file *file)
> -{
> -	struct cxl_memdev *cxlmd =
> -		container_of(inode->i_cdev, typeof(*cxlmd), cdev);
> -
> -	get_device(&cxlmd->dev);
> -	file->private_data = cxlmd;
> -
> -	return 0;
> -}
> -
> -static int cxl_memdev_release_file(struct inode *inode, struct file *file)
> -{
> -	struct cxl_memdev *cxlmd =
> -		container_of(inode->i_cdev, typeof(*cxlmd), cdev);
> -
> -	put_device(&cxlmd->dev);
> -
> -	return 0;
> -}
> -
> -static void cxl_memdev_shutdown(struct device *dev)
> -{
> -	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> -
> -	down_write(&cxl_memdev_rwsem);
> -	cxlmd->cxlm = NULL;
> -	up_write(&cxl_memdev_rwsem);
> -}
> -
> -static const struct cdevm_file_operations cxl_memdev_fops = {
> -	.fops = {
> -		.owner = THIS_MODULE,
> -		.unlocked_ioctl = cxl_memdev_ioctl,
> -		.open = cxl_memdev_open,
> -		.release = cxl_memdev_release_file,
> -		.compat_ioctl = compat_ptr_ioctl,
> -		.llseek = noop_llseek,
> -	},
> -	.shutdown = cxl_memdev_shutdown,
> -};
> -
> -static inline struct cxl_mem_command *cxl_mem_find_command(u16 opcode)
> -{
> -	struct cxl_mem_command *c;
> -
> -	cxl_for_each_cmd(c)
> -		if (c->opcode == opcode)
> -			return c;
> -
> -	return NULL;
> -}
> -
> -/**
> - * cxl_mem_mbox_send_cmd() - Send a mailbox command to a memory device.
> - * @cxlm: The CXL memory device to communicate with.
> - * @opcode: Opcode for the mailbox command.
> - * @in: The input payload for the mailbox command.
> - * @in_size: The length of the input payload
> - * @out: Caller allocated buffer for the output.
> - * @out_size: Expected size of output.
> - *
> - * Context: Any context. Will acquire and release mbox_mutex.
> - * Return:
> - *  * %>=0	- Number of bytes returned in @out.
> - *  * %-E2BIG	- Payload is too large for hardware.
> - *  * %-EBUSY	- Couldn't acquire exclusive mailbox access.
> - *  * %-EFAULT	- Hardware error occurred.
> - *  * %-ENXIO	- Command completed, but device reported an error.
> - *  * %-EIO	- Unexpected output size.
> - *
> - * Mailbox commands may execute successfully yet the device itself reported an
> - * error. While this distinction can be useful for commands from userspace, the
> - * kernel will only be able to use results when both are successful.
> - *
> - * See __cxl_mem_mbox_send_cmd()
> - */
> -static int cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm, u16 opcode,
> -				 void *in, size_t in_size,
> -				 void *out, size_t out_size)
> -{
> -	const struct cxl_mem_command *cmd = cxl_mem_find_command(opcode);
> -	struct cxl_mbox_cmd mbox_cmd = {
> -		.opcode = opcode,
> -		.payload_in = in,
> -		.size_in = in_size,
> -		.size_out = out_size,
> -		.payload_out = out,
> -	};
> -	int rc;
> -
> -	if (out_size > cxlm->payload_size)
> -		return -E2BIG;
> -
> -	rc = cxlm->mbox_send(cxlm, &mbox_cmd);
> -	if (rc)
> -		return rc;
> -
> -	/* TODO: Map return code to proper kernel style errno */
> -	if (mbox_cmd.return_code != CXL_MBOX_SUCCESS)
> -		return -ENXIO;
> -
> -	/*
> -	 * Variable sized commands can't be validated and so it's up to the
> -	 * caller to do that if they wish.
> -	 */
> -	if (cmd->info.size_out >= 0 && mbox_cmd.size_out != out_size)
> -		return -EIO;
> -
> -	return 0;
> -}
> -
>  static int cxl_mem_setup_mailbox(struct cxl_mem *cxlm)
>  {
>  	const int cap = readl(cxlm->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
> @@ -901,30 +304,6 @@ static int cxl_mem_setup_mailbox(struct cxl_mem *cxlm)
>  	return 0;
>  }
>  
> -static struct cxl_mem *cxl_mem_create(struct device *dev)
> -{
> -	struct cxl_mem *cxlm;
> -
> -	cxlm = devm_kzalloc(dev, sizeof(*cxlm), GFP_KERNEL);
> -	if (!cxlm) {
> -		dev_err(dev, "No memory available\n");
> -		return ERR_PTR(-ENOMEM);
> -	}
> -
> -	mutex_init(&cxlm->mbox_mutex);
> -	cxlm->dev = dev;
> -	cxlm->enabled_cmds =
> -		devm_kmalloc_array(dev, BITS_TO_LONGS(cxl_cmd_count),
> -				   sizeof(unsigned long),
> -				   GFP_KERNEL | __GFP_ZERO);
> -	if (!cxlm->enabled_cmds) {
> -		dev_err(dev, "No memory available for bitmap\n");
> -		return ERR_PTR(-ENOMEM);
> -	}
> -
> -	return cxlm;
> -}
> -
>  static void __iomem *cxl_mem_map_regblock(struct cxl_mem *cxlm,
>  					  u8 bar, u64 offset)
>  {
> @@ -1132,298 +511,6 @@ static int cxl_mem_setup_regs(struct cxl_mem *cxlm)
>  	return ret;
>  }
>  
> -static int cxl_xfer_log(struct cxl_mem *cxlm, uuid_t *uuid, u32 size, u8 *out)
> -{
> -	u32 remaining = size;
> -	u32 offset = 0;
> -
> -	while (remaining) {
> -		u32 xfer_size = min_t(u32, remaining, cxlm->payload_size);
> -		struct cxl_mbox_get_log {
> -			uuid_t uuid;
> -			__le32 offset;
> -			__le32 length;
> -		} __packed log = {
> -			.uuid = *uuid,
> -			.offset = cpu_to_le32(offset),
> -			.length = cpu_to_le32(xfer_size)
> -		};
> -		int rc;
> -
> -		rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_GET_LOG, &log,
> -					   sizeof(log), out, xfer_size);
> -		if (rc < 0)
> -			return rc;
> -
> -		out += xfer_size;
> -		remaining -= xfer_size;
> -		offset += xfer_size;
> -	}
> -
> -	return 0;
> -}
> -
> -/**
> - * cxl_walk_cel() - Walk through the Command Effects Log.
> - * @cxlm: Device.
> - * @size: Length of the Command Effects Log.
> - * @cel: CEL
> - *
> - * Iterate over each entry in the CEL and determine if the driver supports the
> - * command. If so, the command is enabled for the device and can be used later.
> - */
> -static void cxl_walk_cel(struct cxl_mem *cxlm, size_t size, u8 *cel)
> -{
> -	struct cel_entry {
> -		__le16 opcode;
> -		__le16 effect;
> -	} __packed * cel_entry;
> -	const int cel_entries = size / sizeof(*cel_entry);
> -	int i;
> -
> -	cel_entry = (struct cel_entry *)cel;
> -
> -	for (i = 0; i < cel_entries; i++) {
> -		u16 opcode = le16_to_cpu(cel_entry[i].opcode);
> -		struct cxl_mem_command *cmd = cxl_mem_find_command(opcode);
> -
> -		if (!cmd) {
> -			dev_dbg(cxlm->dev,
> -				"Opcode 0x%04x unsupported by driver", opcode);
> -			continue;
> -		}
> -
> -		set_bit(cmd->info.id, cxlm->enabled_cmds);
> -	}
> -}
> -
> -struct cxl_mbox_get_supported_logs {
> -	__le16 entries;
> -	u8 rsvd[6];
> -	struct gsl_entry {
> -		uuid_t uuid;
> -		__le32 size;
> -	} __packed entry[];
> -} __packed;
> -
> -static struct cxl_mbox_get_supported_logs *cxl_get_gsl(struct cxl_mem *cxlm)
> -{
> -	struct cxl_mbox_get_supported_logs *ret;
> -	int rc;
> -
> -	ret = kvmalloc(cxlm->payload_size, GFP_KERNEL);
> -	if (!ret)
> -		return ERR_PTR(-ENOMEM);
> -
> -	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_GET_SUPPORTED_LOGS, NULL,
> -				   0, ret, cxlm->payload_size);
> -	if (rc < 0) {
> -		kvfree(ret);
> -		return ERR_PTR(rc);
> -	}
> -
> -	return ret;
> -}
> -
> -/**
> - * cxl_mem_get_partition_info - Get partition info
> - * @cxlm: cxl_mem instance to update partition info
> - *
> - * Retrieve the current partition info for the device specified.  If not 0, the
> - * 'next' values are pending and take affect on next cold reset.
> - *
> - * Return: 0 if no error: or the result of the mailbox command.
> - *
> - * See CXL @8.2.9.5.2.1 Get Partition Info
> - */
> -static int cxl_mem_get_partition_info(struct cxl_mem *cxlm)
> -{
> -	struct cxl_mbox_get_partition_info {
> -		__le64 active_volatile_cap;
> -		__le64 active_persistent_cap;
> -		__le64 next_volatile_cap;
> -		__le64 next_persistent_cap;
> -	} __packed pi;
> -	int rc;
> -
> -	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_GET_PARTITION_INFO,
> -				   NULL, 0, &pi, sizeof(pi));
> -	if (rc)
> -		return rc;
> -
> -	cxlm->active_volatile_bytes =
> -		le64_to_cpu(pi.active_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
> -	cxlm->active_persistent_bytes =
> -		le64_to_cpu(pi.active_persistent_cap) * CXL_CAPACITY_MULTIPLIER;
> -	cxlm->next_volatile_bytes =
> -		le64_to_cpu(pi.next_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
> -	cxlm->next_persistent_bytes =
> -		le64_to_cpu(pi.next_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
> -
> -	return 0;
> -}
> -
> -/**
> - * cxl_mem_enumerate_cmds() - Enumerate commands for a device.
> - * @cxlm: The device.
> - *
> - * Returns 0 if enumerate completed successfully.
> - *
> - * CXL devices have optional support for certain commands. This function will
> - * determine the set of supported commands for the hardware and update the
> - * enabled_cmds bitmap in the @cxlm.
> - */
> -static int cxl_mem_enumerate_cmds(struct cxl_mem *cxlm)
> -{
> -	struct cxl_mbox_get_supported_logs *gsl;
> -	struct device *dev = cxlm->dev;
> -	struct cxl_mem_command *cmd;
> -	int i, rc;
> -
> -	gsl = cxl_get_gsl(cxlm);
> -	if (IS_ERR(gsl))
> -		return PTR_ERR(gsl);
> -
> -	rc = -ENOENT;
> -	for (i = 0; i < le16_to_cpu(gsl->entries); i++) {
> -		u32 size = le32_to_cpu(gsl->entry[i].size);
> -		uuid_t uuid = gsl->entry[i].uuid;
> -		u8 *log;
> -
> -		dev_dbg(dev, "Found LOG type %pU of size %d", &uuid, size);
> -
> -		if (!uuid_equal(&uuid, &log_uuid[CEL_UUID]))
> -			continue;
> -
> -		log = kvmalloc(size, GFP_KERNEL);
> -		if (!log) {
> -			rc = -ENOMEM;
> -			goto out;
> -		}
> -
> -		rc = cxl_xfer_log(cxlm, &uuid, size, log);
> -		if (rc) {
> -			kvfree(log);
> -			goto out;
> -		}
> -
> -		cxl_walk_cel(cxlm, size, log);
> -		kvfree(log);
> -
> -		/* In case CEL was bogus, enable some default commands. */
> -		cxl_for_each_cmd(cmd)
> -			if (cmd->flags & CXL_CMD_FLAG_FORCE_ENABLE)
> -				set_bit(cmd->info.id, cxlm->enabled_cmds);
> -
> -		/* Found the required CEL */
> -		rc = 0;
> -	}
> -
> -out:
> -	kvfree(gsl);
> -	return rc;
> -}
> -
> -/**
> - * cxl_mem_identify() - Send the IDENTIFY command to the device.
> - * @cxlm: The device to identify.
> - *
> - * Return: 0 if identify was executed successfully.
> - *
> - * This will dispatch the identify command to the device and on success populate
> - * structures to be exported to sysfs.
> - */
> -static int cxl_mem_identify(struct cxl_mem *cxlm)
> -{
> -	/* See CXL 2.0 Table 175 Identify Memory Device Output Payload */
> -	struct cxl_mbox_identify {
> -		char fw_revision[0x10];
> -		__le64 total_capacity;
> -		__le64 volatile_capacity;
> -		__le64 persistent_capacity;
> -		__le64 partition_align;
> -		__le16 info_event_log_size;
> -		__le16 warning_event_log_size;
> -		__le16 failure_event_log_size;
> -		__le16 fatal_event_log_size;
> -		__le32 lsa_size;
> -		u8 poison_list_max_mer[3];
> -		__le16 inject_poison_limit;
> -		u8 poison_caps;
> -		u8 qos_telemetry_caps;
> -	} __packed id;
> -	int rc;
> -
> -	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_IDENTIFY, NULL, 0, &id,
> -				   sizeof(id));
> -	if (rc < 0)
> -		return rc;
> -
> -	cxlm->total_bytes = le64_to_cpu(id.total_capacity);
> -	cxlm->total_bytes *= CXL_CAPACITY_MULTIPLIER;
> -
> -	cxlm->volatile_only_bytes = le64_to_cpu(id.volatile_capacity);
> -	cxlm->volatile_only_bytes *= CXL_CAPACITY_MULTIPLIER;
> -
> -	cxlm->persistent_only_bytes = le64_to_cpu(id.persistent_capacity);
> -	cxlm->persistent_only_bytes *= CXL_CAPACITY_MULTIPLIER;
> -
> -	cxlm->partition_align_bytes = le64_to_cpu(id.partition_align);
> -	cxlm->partition_align_bytes *= CXL_CAPACITY_MULTIPLIER;
> -
> -	dev_dbg(cxlm->dev,
> -		"Identify Memory Device\n"
> -		"     total_bytes = %#llx\n"
> -		"     volatile_only_bytes = %#llx\n"
> -		"     persistent_only_bytes = %#llx\n"
> -		"     partition_align_bytes = %#llx\n",
> -		cxlm->total_bytes, cxlm->volatile_only_bytes,
> -		cxlm->persistent_only_bytes, cxlm->partition_align_bytes);
> -
> -	cxlm->lsa_size = le32_to_cpu(id.lsa_size);
> -	memcpy(cxlm->firmware_version, id.fw_revision, sizeof(id.fw_revision));
> -
> -	return 0;
> -}
> -
> -static int cxl_mem_create_range_info(struct cxl_mem *cxlm)
> -{
> -	int rc;
> -
> -	if (cxlm->partition_align_bytes == 0) {
> -		cxlm->ram_range.start = 0;
> -		cxlm->ram_range.end = cxlm->volatile_only_bytes - 1;
> -		cxlm->pmem_range.start = cxlm->volatile_only_bytes;
> -		cxlm->pmem_range.end = cxlm->volatile_only_bytes +
> -					cxlm->persistent_only_bytes - 1;
> -		return 0;
> -	}
> -
> -	rc = cxl_mem_get_partition_info(cxlm);
> -	if (rc < 0) {
> -		dev_err(cxlm->dev, "Failed to query partition information\n");
> -		return rc;
> -	}
> -
> -	dev_dbg(cxlm->dev,
> -		"Get Partition Info\n"
> -		"     active_volatile_bytes = %#llx\n"
> -		"     active_persistent_bytes = %#llx\n"
> -		"     next_volatile_bytes = %#llx\n"
> -		"     next_persistent_bytes = %#llx\n",
> -		cxlm->active_volatile_bytes, cxlm->active_persistent_bytes,
> -		cxlm->next_volatile_bytes, cxlm->next_persistent_bytes);
> -
> -	cxlm->ram_range.start = 0;
> -	cxlm->ram_range.end = cxlm->active_volatile_bytes - 1;
> -
> -	cxlm->pmem_range.start = cxlm->active_volatile_bytes;
> -	cxlm->pmem_range.end = cxlm->active_volatile_bytes +
> -				cxlm->active_persistent_bytes - 1;
> -
> -	return 0;
> -}
> -
>  static int cxl_mem_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  {
>  	struct cxl_memdev *cxlmd;
> @@ -1458,7 +545,7 @@ static int cxl_mem_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (rc)
>  		return rc;
>  
> -	cxlmd = devm_cxl_add_memdev(cxlm, &cxl_memdev_fops);
> +	cxlmd = devm_cxl_add_memdev(cxlm);
>  	if (IS_ERR(cxlmd))
>  		return PTR_ERR(cxlmd);
>  
> @@ -1486,7 +573,6 @@ static struct pci_driver cxl_mem_driver = {
>  
>  static __init int cxl_mem_init(void)
>  {
> -	struct dentry *mbox_debugfs;
>  	int rc;
>  
>  	/* Double check the anonymous union trickery in struct cxl_regs */
> @@ -1497,17 +583,11 @@ static __init int cxl_mem_init(void)
>  	if (rc)
>  		return rc;
>  
> -	cxl_debugfs = debugfs_create_dir("cxl", NULL);
> -	mbox_debugfs = debugfs_create_dir("mbox", cxl_debugfs);
> -	debugfs_create_bool("raw_allow_all", 0600, mbox_debugfs,
> -			    &cxl_raw_allow_all);
> -
>  	return 0;
>  }
>  
>  static __exit void cxl_mem_exit(void)
>  {
> -	debugfs_remove_recursive(cxl_debugfs);
>  	pci_unregister_driver(&cxl_mem_driver);
>  }
>  
> 

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v4 14/21] cxl/mbox: Add exclusive kernel command support
  2021-09-09  5:12 ` [PATCH v4 14/21] cxl/mbox: Add exclusive kernel command support Dan Williams
@ 2021-09-09 17:02   ` Ben Widawsky
  2021-09-10  9:33   ` Jonathan Cameron
  2021-09-14 19:03   ` [PATCH v5 " Dan Williams
  2 siblings, 0 replies; 78+ messages in thread
From: Ben Widawsky @ 2021-09-09 17:02 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, vishal.l.verma, nvdimm, alison.schofield, ira.weiny,
	Jonathan.Cameron

On 21-09-08 22:12:49, Dan Williams wrote:
> The CXL_PMEM driver expects exclusive control of the label storage area
> space. Similar to the LIBNVDIMM expectation that the label storage area
> is only writable from userspace when the corresponding memory device is
> not active in any region, the expectation is the native CXL_PCI UAPI
> path is disabled while the cxl_nvdimm for a given cxl_memdev device is
> active in LIBNVDIMM.
> 
> Add the ability to toggle the availability of a given command for the
> UAPI path. Use that new capability to shutdown changes to partitions and
> the label storage area while the cxl_nvdimm device is actively proxying
> commands for LIBNVDIMM.
> 
> Acked-by: Ben Widawsky <ben.widawsky@intel.com>
> Link: https://lore.kernel.org/r/162982123298.1124374.22718002900700392.stgit@dwillia2-desk3.amr.corp.intel.com
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

I really wanted a way to make the exclusivity a property of the command itself
and to determine whether an nvdimm bridge is connected before dispatching the
command. Unfortunately, I couldn't come up with anything less complex than
this, so my Acked-by is upgraded to:
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
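
For illustration, the rejected alternative might have looked something like the
sketch below. Everything here is hypothetical: CXL_CMD_FLAG_EXCLUSIVE and
cxl_nvdimm_bridge_active() are not part of the posted series.

	/* hypothetical flag; BIT(0) is already CXL_CMD_FLAG_FORCE_ENABLE */
	#define CXL_CMD_FLAG_EXCLUSIVE BIT(1)

	static bool cxl_cmd_is_claimed(struct cxl_mem *cxlm,
				       const struct cxl_mem_command *c)
	{
		if (!(c->flags & CXL_CMD_FLAG_EXCLUSIVE))
			return false;
		/*
		 * Hypothetical helper: answering "is an nvdimm bridge
		 * actively proxying commands?" at dispatch time races
		 * cxl_nvdimm bind/unbind, which is the complexity the
		 * posted rwsem-protected bitmap avoids.
		 */
		return cxl_nvdimm_bridge_active(cxlm);
	}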

> ---
>  drivers/cxl/core/mbox.c   |    5 +++++
>  drivers/cxl/core/memdev.c |   31 +++++++++++++++++++++++++++++++
>  drivers/cxl/cxlmem.h      |    4 ++++
>  drivers/cxl/pmem.c        |   43 ++++++++++++++++++++++++++++++++-----------
>  4 files changed, 72 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 422999740649..82e79da195fa 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -221,6 +221,7 @@ static bool cxl_mem_raw_command_allowed(u16 opcode)
>   *  * %-EINVAL	- Reserved fields or invalid values were used.
>   *  * %-ENOMEM	- Input or output buffer wasn't sized properly.
>   *  * %-EPERM	- Attempted to use a protected command.
> + *  * %-EBUSY	- Kernel has claimed exclusive access to this opcode
>   *
>   * The result of this command is a fully validated command in @out_cmd that is
>   * safe to send to the hardware.
> @@ -296,6 +297,10 @@ static int cxl_validate_cmd_from_user(struct cxl_mem *cxlm,
>  	if (!test_bit(info->id, cxlm->enabled_cmds))
>  		return -ENOTTY;
>  
> +	/* Check that the command is not claimed for exclusive kernel use */
> +	if (test_bit(info->id, cxlm->exclusive_cmds))
> +		return -EBUSY;
> +
>  	/* Check the input buffer is the expected size */
>  	if (info->size_in >= 0 && info->size_in != send_cmd->in.size)
>  		return -ENOMEM;
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index df2ba87238c2..d9ade5b92330 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -134,6 +134,37 @@ static const struct device_type cxl_memdev_type = {
>  	.groups = cxl_memdev_attribute_groups,
>  };
>  
> +/**
> + * set_exclusive_cxl_commands() - atomically disable user cxl commands
> + * @cxlm: cxl_mem instance to modify
> + * @cmds: bitmap of commands to mark exclusive
> + *
> + * Flush the ioctl path and disable future execution of commands with
> + * the command ids set in @cmds.
> + */
> +void set_exclusive_cxl_commands(struct cxl_mem *cxlm, unsigned long *cmds)
> +{
> +	down_write(&cxl_memdev_rwsem);
> +	bitmap_or(cxlm->exclusive_cmds, cxlm->exclusive_cmds, cmds,
> +		  CXL_MEM_COMMAND_ID_MAX);
> +	up_write(&cxl_memdev_rwsem);
> +}
> +EXPORT_SYMBOL_GPL(set_exclusive_cxl_commands);
> +
> +/**
> + * clear_exclusive_cxl_commands() - atomically enable user cxl commands
> + * @cxlm: cxl_mem instance to modify
> + * @cmds: bitmap of commands to mark available for userspace
> + */
> +void clear_exclusive_cxl_commands(struct cxl_mem *cxlm, unsigned long *cmds)
> +{
> +	down_write(&cxl_memdev_rwsem);
> +	bitmap_andnot(cxlm->exclusive_cmds, cxlm->exclusive_cmds, cmds,
> +		      CXL_MEM_COMMAND_ID_MAX);
> +	up_write(&cxl_memdev_rwsem);
> +}
> +EXPORT_SYMBOL_GPL(clear_exclusive_cxl_commands);
> +
>  static void cxl_memdev_shutdown(struct device *dev)
>  {
>  	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 16201b7d82d2..468b7b8be207 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -101,6 +101,7 @@ struct cxl_mbox_cmd {
>   * @mbox_mutex: Mutex to synchronize mailbox access.
>   * @firmware_version: Firmware version for the memory device.
>   * @enabled_cmds: Hardware commands found enabled in CEL.
> + * @exclusive_cmds: Commands that are kernel-internal only
>   * @pmem_range: Active Persistent memory capacity configuration
>   * @ram_range: Active Volatile memory capacity configuration
>   * @total_bytes: sum of all possible capacities
> @@ -127,6 +128,7 @@ struct cxl_mem {
>  	struct mutex mbox_mutex; /* Protects device mailbox and firmware */
>  	char firmware_version[0x10];
>  	DECLARE_BITMAP(enabled_cmds, CXL_MEM_COMMAND_ID_MAX);
> +	DECLARE_BITMAP(exclusive_cmds, CXL_MEM_COMMAND_ID_MAX);
>  
>  	struct range pmem_range;
>  	struct range ram_range;
> @@ -200,4 +202,6 @@ int cxl_mem_identify(struct cxl_mem *cxlm);
>  int cxl_mem_enumerate_cmds(struct cxl_mem *cxlm);
>  int cxl_mem_create_range_info(struct cxl_mem *cxlm);
>  struct cxl_mem *cxl_mem_create(struct device *dev);
> +void set_exclusive_cxl_commands(struct cxl_mem *cxlm, unsigned long *cmds);
> +void clear_exclusive_cxl_commands(struct cxl_mem *cxlm, unsigned long *cmds);
>  #endif /* __CXL_MEM_H__ */
> diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
> index 9652c3ee41e7..a972af7a6e0b 100644
> --- a/drivers/cxl/pmem.c
> +++ b/drivers/cxl/pmem.c
> @@ -16,10 +16,7 @@
>   */
>  static struct workqueue_struct *cxl_pmem_wq;
>  
> -static void unregister_nvdimm(void *nvdimm)
> -{
> -	nvdimm_delete(nvdimm);
> -}
> +static __read_mostly DECLARE_BITMAP(exclusive_cmds, CXL_MEM_COMMAND_ID_MAX);
>  
>  static int match_nvdimm_bridge(struct device *dev, const void *data)
>  {
> @@ -36,12 +33,25 @@ static struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(void)
>  	return to_cxl_nvdimm_bridge(dev);
>  }
>  
> +static void cxl_nvdimm_remove(struct device *dev)
> +{
> +	struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev);
> +	struct nvdimm *nvdimm = dev_get_drvdata(dev);
> +	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
> +	struct cxl_mem *cxlm = cxlmd->cxlm;
> +
> +	nvdimm_delete(nvdimm);
> +	clear_exclusive_cxl_commands(cxlm, exclusive_cmds);
> +}
> +
>  static int cxl_nvdimm_probe(struct device *dev)
>  {
>  	struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev);
> +	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
> +	struct cxl_mem *cxlm = cxlmd->cxlm;
>  	struct cxl_nvdimm_bridge *cxl_nvb;
> +	struct nvdimm *nvdimm = NULL;
>  	unsigned long flags = 0;
> -	struct nvdimm *nvdimm;
>  	int rc = -ENXIO;
>  
>  	cxl_nvb = cxl_find_nvdimm_bridge();
> @@ -50,25 +60,32 @@ static int cxl_nvdimm_probe(struct device *dev)
>  
>  	device_lock(&cxl_nvb->dev);
>  	if (!cxl_nvb->nvdimm_bus)
> -		goto out;
> +		goto out_unlock;
> +
> +	set_exclusive_cxl_commands(cxlm, exclusive_cmds);
>  
>  	set_bit(NDD_LABELING, &flags);
> +	rc = -ENOMEM;
>  	nvdimm = nvdimm_create(cxl_nvb->nvdimm_bus, cxl_nvd, NULL, flags, 0, 0,
>  			       NULL);
> -	if (!nvdimm)
> -		goto out;
> +	dev_set_drvdata(dev, nvdimm);
>  
> -	rc = devm_add_action_or_reset(dev, unregister_nvdimm, nvdimm);
> -out:
> +out_unlock:
>  	device_unlock(&cxl_nvb->dev);
>  	put_device(&cxl_nvb->dev);
>  
> -	return rc;
> +	if (!nvdimm) {
> +		clear_exclusive_cxl_commands(cxlm, exclusive_cmds);
> +		return rc;
> +	}
> +
> +	return 0;
>  }
>  
>  static struct cxl_driver cxl_nvdimm_driver = {
>  	.name = "cxl_nvdimm",
>  	.probe = cxl_nvdimm_probe,
> +	.remove = cxl_nvdimm_remove,
>  	.id = CXL_DEVICE_NVDIMM,
>  };
>  
> @@ -194,6 +211,10 @@ static __init int cxl_pmem_init(void)
>  {
>  	int rc;
>  
> +	set_bit(CXL_MEM_COMMAND_ID_SET_PARTITION_INFO, exclusive_cmds);
> +	set_bit(CXL_MEM_COMMAND_ID_SET_SHUTDOWN_STATE, exclusive_cmds);
> +	set_bit(CXL_MEM_COMMAND_ID_SET_LSA, exclusive_cmds);
> +
>  	cxl_pmem_wq = alloc_ordered_workqueue("cxl_pmem", 0);
>  	if (!cxl_pmem_wq)
>  		return -ENXIO;
> 

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v4 15/21] cxl/pmem: Translate NVDIMM label commands to CXL label commands
  2021-09-09  5:12 ` [PATCH v4 15/21] cxl/pmem: Translate NVDIMM label commands to CXL label commands Dan Williams
@ 2021-09-09 17:22   ` Ben Widawsky
  2021-09-09 19:03     ` Dan Williams
  2021-09-09 22:08   ` [PATCH v5 " Dan Williams
  2021-09-14 19:06   ` Dan Williams
  2 siblings, 1 reply; 78+ messages in thread
From: Ben Widawsky @ 2021-09-09 17:22 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, vishal.l.verma, nvdimm, alison.schofield, ira.weiny,
	Jonathan.Cameron

On 21-09-08 22:12:54, Dan Williams wrote:
> The LIBNVDIMM IOCTL UAPI calls back to the nvdimm-bus-provider to
> translate the Linux command payload to the device native command format.
> The LIBNVDIMM commands get-config-size, get-config-data, and
> set-config-data, map to the CXL memory device commands device-identify,
> get-lsa, and set-lsa. Recall that the label-storage-area (LSA) on an
> NVDIMM device arranges for the provisioning of namespaces. Additionally
> for CXL the LSA is used for provisioning regions as well.
> 
> The data from device-identify is already cached in the 'struct cxl_mem'
> instance associated with @cxl_nvd, so that payload return is simply
> crafted and no CXL command is issued. The conversion for get-lsa is
> straightforward, but the conversion for set-lsa requires an allocation
> to append the set-lsa header in front of the payload.
> 
> Acked-by: Ben Widawsky <ben.widawsky@intel.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/pmem.c |  125 ++++++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 121 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
> index a972af7a6e0b..29d24f13aa73 100644
> --- a/drivers/cxl/pmem.c
> +++ b/drivers/cxl/pmem.c
> @@ -1,6 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0-only
>  /* Copyright(c) 2021 Intel Corporation. All rights reserved. */
>  #include <linux/libnvdimm.h>
> +#include <asm/unaligned.h>
>  #include <linux/device.h>
>  #include <linux/module.h>
>  #include <linux/ndctl.h>
> @@ -48,10 +49,10 @@ static int cxl_nvdimm_probe(struct device *dev)
>  {
>  	struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev);
>  	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
> +	unsigned long flags = 0, cmd_mask = 0;
>  	struct cxl_mem *cxlm = cxlmd->cxlm;
>  	struct cxl_nvdimm_bridge *cxl_nvb;
>  	struct nvdimm *nvdimm = NULL;
> -	unsigned long flags = 0;
>  	int rc = -ENXIO;
>  
>  	cxl_nvb = cxl_find_nvdimm_bridge();
> @@ -66,8 +67,11 @@ static int cxl_nvdimm_probe(struct device *dev)
>  
>  	set_bit(NDD_LABELING, &flags);
>  	rc = -ENOMEM;
> -	nvdimm = nvdimm_create(cxl_nvb->nvdimm_bus, cxl_nvd, NULL, flags, 0, 0,
> -			       NULL);
> +	set_bit(ND_CMD_GET_CONFIG_SIZE, &cmd_mask);
> +	set_bit(ND_CMD_GET_CONFIG_DATA, &cmd_mask);
> +	set_bit(ND_CMD_SET_CONFIG_DATA, &cmd_mask);
> +	nvdimm = nvdimm_create(cxl_nvb->nvdimm_bus, cxl_nvd, NULL, flags,
> +			       cmd_mask, 0, NULL);
>  	dev_set_drvdata(dev, nvdimm);
>  
>  out_unlock:
> @@ -89,11 +93,124 @@ static struct cxl_driver cxl_nvdimm_driver = {
>  	.id = CXL_DEVICE_NVDIMM,
>  };
>  
> +static int cxl_pmem_get_config_size(struct cxl_mem *cxlm,
> +				    struct nd_cmd_get_config_size *cmd,
> +				    unsigned int buf_len, int *cmd_rc)
> +{
> +	if (sizeof(*cmd) > buf_len)
> +		return -EINVAL;
> +
> +	*cmd = (struct nd_cmd_get_config_size) {
> +		 .config_size = cxlm->lsa_size,
> +		 .max_xfer = cxlm->payload_size,
> +	};
> +	*cmd_rc = 0;
> +
> +	return 0;
> +}
> +
> +static int cxl_pmem_get_config_data(struct cxl_mem *cxlm,
> +				    struct nd_cmd_get_config_data_hdr *cmd,
> +				    unsigned int buf_len, int *cmd_rc)
> +{
> +	struct cxl_mbox_get_lsa {
> +		u32 offset;
> +		u32 length;
> +	} get_lsa;
> +	int rc;
> +
> +	if (sizeof(*cmd) > buf_len)
> +		return -EINVAL;
> +	if (struct_size(cmd, out_buf, cmd->in_length) > buf_len)
> +		return -EINVAL;
> +
> +	get_lsa = (struct cxl_mbox_get_lsa) {
> +		.offset = cmd->in_offset,
> +		.length = cmd->in_length,
> +	};
> +
> +	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_GET_LSA, &get_lsa,
> +				   sizeof(get_lsa), cmd->out_buf,
> +				   cmd->in_length);
> +	cmd->status = 0;
> +	*cmd_rc = 0;
> +
> +	return rc;
> +}
> +
> +static int cxl_pmem_set_config_data(struct cxl_mem *cxlm,
> +				    struct nd_cmd_set_config_hdr *cmd,
> +				    unsigned int buf_len, int *cmd_rc)
> +{
> +	struct cxl_mbox_set_lsa {
> +		u32 offset;
> +		u32 reserved;
> +		u8 data[];
> +	} *set_lsa;
> +	int rc;
> +
> +	if (sizeof(*cmd) > buf_len)
> +		return -EINVAL;
> +
> +	/* 4-byte status follows the input data in the payload */
> +	if (struct_size(cmd, in_buf, cmd->in_length) + 4 > buf_len)
> +		return -EINVAL;
> +
> +	set_lsa =
> +		kvzalloc(struct_size(set_lsa, data, cmd->in_length), GFP_KERNEL);
> +	if (!set_lsa)
> +		return -ENOMEM;
> +
> +	*set_lsa = (struct cxl_mbox_set_lsa) {
> +		.offset = cmd->in_offset,
> +	};
> +	memcpy(set_lsa->data, cmd->in_buf, cmd->in_length);
> +
> +	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_SET_LSA, set_lsa,
> +				   struct_size(set_lsa, data, cmd->in_length),
> +				   NULL, 0);
> +
> +	/*
> +	 * Set "firmware" status (4-packed bytes at the end of the input
> +	 * payload.
> +	 */
> +	put_unaligned(0, (u32 *) &cmd->in_buf[cmd->in_length]);
> +	*cmd_rc = 0;
> +	kvfree(set_lsa);
> +
> +	return rc;
> +}
> +
> +static int cxl_pmem_nvdimm_ctl(struct nvdimm *nvdimm, unsigned int cmd,
> +			       void *buf, unsigned int buf_len, int *cmd_rc)
> +{
> +	struct cxl_nvdimm *cxl_nvd = nvdimm_provider_data(nvdimm);
> +	unsigned long cmd_mask = nvdimm_cmd_mask(nvdimm);
> +	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
> +	struct cxl_mem *cxlm = cxlmd->cxlm;
> +
> +	if (!test_bit(cmd, &cmd_mask))
> +		return -ENOTTY;
> +
> +	switch (cmd) {
> +	case ND_CMD_GET_CONFIG_SIZE:
> +		return cxl_pmem_get_config_size(cxlm, buf, buf_len, cmd_rc);
> +	case ND_CMD_GET_CONFIG_DATA:
> +		return cxl_pmem_get_config_data(cxlm, buf, buf_len, cmd_rc);
> +	case ND_CMD_SET_CONFIG_DATA:
> +		return cxl_pmem_set_config_data(cxlm, buf, buf_len, cmd_rc);
> +	default:
> +		return -ENOTTY;
> +	}
> +}
> +

Is there some intended purpose for passing cmd_rc down, if it isn't actually
ever used? Perhaps add it when needed later?
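
For context on where the parameter comes from: it mirrors the LIBNVDIMM bus
->ndctl() contract, which keeps transport errors distinct from firmware status.
Roughly, simplified from the drivers/nvdimm dispatch path:

	int cmd_rc = 0, rc;

	rc = nd_desc->ndctl(nd_desc, nvdimm, cmd, buf, buf_len, &cmd_rc);
	if (rc < 0)
		return rc;	/* kernel-side failure, command not executed */
	if (cmd_rc < 0)
		return cmd_rc;	/* device executed the command and failed */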

>  static int cxl_pmem_ctl(struct nvdimm_bus_descriptor *nd_desc,
>  			struct nvdimm *nvdimm, unsigned int cmd, void *buf,
>  			unsigned int buf_len, int *cmd_rc)
>  {
> -	return -ENOTTY;
> +	if (!nvdimm)
> +		return -ENOTTY;
> +	return cxl_pmem_nvdimm_ctl(nvdimm, cmd, buf, buf_len, cmd_rc);
>  }
>  
>  static bool online_nvdimm_bus(struct cxl_nvdimm_bridge *cxl_nvb)
> 

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v4 08/21] cxl/pci: Clean up cxl_mem_get_partition_info()
  2021-09-09 16:20   ` Ben Widawsky
@ 2021-09-09 18:06     ` Dan Williams
  2021-09-09 21:05       ` Ben Widawsky
  0 siblings, 1 reply; 78+ messages in thread
From: Dan Williams @ 2021-09-09 18:06 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Ira Weiny, Vishal L Verma, Linux NVDIMM, Schofield,
	Alison, Jonathan Cameron

On Thu, Sep 9, 2021 at 9:20 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-09-08 22:12:15, Dan Williams wrote:
> > Commit 0b9159d0ff21 ("cxl/pci: Store memory capacity values") missed
> > updating the kernel-doc for 'struct cxl_mem' leading to the following
> > warnings:
> >
> > ./scripts/kernel-doc -v drivers/cxl/cxlmem.h 2>&1 | grep warn
> > drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'total_bytes' not described in 'cxl_mem'
> > drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'volatile_only_bytes' not described in 'cxl_mem'
> > drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'persistent_only_bytes' not described in 'cxl_mem'
> > drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'partition_align_bytes' not described in 'cxl_mem'
> > drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'active_volatile_bytes' not described in 'cxl_mem'
> > drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'active_persistent_bytes' not described in 'cxl_mem'
> > drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'next_volatile_bytes' not described in 'cxl_mem'
> > drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'next_persistent_bytes' not described in 'cxl_mem'
> >
> > Also, it is redundant to describe those same parameters in the
> > kernel-doc for cxl_mem_get_partition_info(). Given the only user of that
> > routine updates the values in @cxlm, just do that implicitly internal to
> > the helper.
> >
> > Cc: Ira Weiny <ira.weiny@intel.com>
> > Reported-by: Ben Widawsky <ben.widawsky@intel.com>
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> > ---
> >  drivers/cxl/cxlmem.h |   15 +++++++++++++--
> >  drivers/cxl/pci.c    |   35 +++++++++++------------------------
> >  2 files changed, 24 insertions(+), 26 deletions(-)
> >
> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > index d5334df83fb2..c6fce966084a 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> > @@ -78,8 +78,19 @@ devm_cxl_add_memdev(struct cxl_mem *cxlm,
> >   * @mbox_mutex: Mutex to synchronize mailbox access.
> >   * @firmware_version: Firmware version for the memory device.
> >   * @enabled_cmds: Hardware commands found enabled in CEL.
> > - * @pmem_range: Persistent memory capacity information.
> > - * @ram_range: Volatile memory capacity information.
> > + * @pmem_range: Active Persistent memory capacity configuration
> > + * @ram_range: Active Volatile memory capacity configuration
> > + * @total_bytes: sum of all possible capacities
> > + * @volatile_only_bytes: hard volatile capacity
> > + * @persistent_only_bytes: hard persistent capacity
> > + * @partition_align_bytes: soft setting for configurable capacity
> > + * @active_volatile_bytes: sum of hard + soft volatile
> > + * @active_persistent_bytes: sum of hard + soft persistent
>
> Looking at this now, probably makes sense to create some helper macros or inline
> functions to calculate these as needed, rather than storing them in the
> structure.

Perhaps. I would need to look deeper into what is worth caching vs. what is
cheap enough to recalculate on demand. Do you have a proposal here?
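
Concretely, the kind of helper Ben suggests might look like the sketch below,
assuming the cxl_mbox_get_partition_info definition were lifted out of
cxl_mem_get_partition_info(). Note that every read becomes a mailbox
round-trip, which is the trade-off being weighed here:

	/* sketch only: errors collapsed to 0 for brevity */
	static inline u64 cxl_active_volatile_bytes(struct cxl_mem *cxlm)
	{
		struct cxl_mbox_get_partition_info pi;

		if (cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_GET_PARTITION_INFO,
					  NULL, 0, &pi, sizeof(pi)))
			return 0;
		return le64_to_cpu(pi.active_volatile_cap) *
		       CXL_CAPACITY_MULTIPLIER;
	}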


>
> > + * @next_volatile_bytes: volatile capacity change pending device reset
> > + * @next_persistent_bytes: persistent capacity change pending device reset
> > + *
> > + * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
> > + * details on capacity parameters.
> >   */
> >  struct cxl_mem {
> >       struct device *dev;
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index c1e1d12e24b6..8077d907e7d3 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -1262,11 +1262,7 @@ static struct cxl_mbox_get_supported_logs *cxl_get_gsl(struct cxl_mem *cxlm)
> >
> >  /**
> >   * cxl_mem_get_partition_info - Get partition info
> > - * @cxlm: The device to act on
> > - * @active_volatile_bytes: returned active volatile capacity
> > - * @active_persistent_bytes: returned active persistent capacity
> > - * @next_volatile_bytes: return next volatile capacity
> > - * @next_persistent_bytes: return next persistent capacity
> > + * @cxlm: cxl_mem instance to update partition info
> >   *
> >   * Retrieve the current partition info for the device specified.  If not 0, the
> >   * 'next' values are pending and take affect on next cold reset.
> > @@ -1275,11 +1271,7 @@ static struct cxl_mbox_get_supported_logs *cxl_get_gsl(struct cxl_mem *cxlm)
> >   *
> >   * See CXL @8.2.9.5.2.1 Get Partition Info
> >   */
> > -static int cxl_mem_get_partition_info(struct cxl_mem *cxlm,
> > -                                   u64 *active_volatile_bytes,
> > -                                   u64 *active_persistent_bytes,
> > -                                   u64 *next_volatile_bytes,
> > -                                   u64 *next_persistent_bytes)
> > +static int cxl_mem_get_partition_info(struct cxl_mem *cxlm)
> >  {
> >       struct cxl_mbox_get_partition_info {
> >               __le64 active_volatile_cap;
> > @@ -1294,15 +1286,14 @@ static int cxl_mem_get_partition_info(struct cxl_mem *cxlm,
> >       if (rc)
> >               return rc;
> >
> > -     *active_volatile_bytes = le64_to_cpu(pi.active_volatile_cap);
> > -     *active_persistent_bytes = le64_to_cpu(pi.active_persistent_cap);
> > -     *next_volatile_bytes = le64_to_cpu(pi.next_volatile_cap);
> > -     *next_persistent_bytes = le64_to_cpu(pi.next_volatile_cap);
> > -
> > -     *active_volatile_bytes *= CXL_CAPACITY_MULTIPLIER;
> > -     *active_persistent_bytes *= CXL_CAPACITY_MULTIPLIER;
> > -     *next_volatile_bytes *= CXL_CAPACITY_MULTIPLIER;
> > -     *next_persistent_bytes *= CXL_CAPACITY_MULTIPLIER;
> > +     cxlm->active_volatile_bytes =
> > +             le64_to_cpu(pi.active_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
> > +     cxlm->active_persistent_bytes =
> > +             le64_to_cpu(pi.active_persistent_cap) * CXL_CAPACITY_MULTIPLIER;
> > +     cxlm->next_volatile_bytes =
> > +             le64_to_cpu(pi.next_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
> > +     cxlm->next_persistent_bytes =
> > +             le64_to_cpu(pi.next_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
>
> Personally, I prefer the more functional style implementation. I guess if you
> wanted to make the change, my preference would be to kill
> cxl_mem_get_partition_info() entirely. Up to you though...

I was bringing this function in line with the precedent we already set
with cxl_mem_identify() that caches the result in @cxlm. Are you
saying you want to change that style too?

I feel like caching device attributes is idiomatic. Look at all the PCI
attributes cached in "struct pci_dev" that could be re-read or re-calculated
rather than cached. In general I think a routine that returns 4 values is
better off filling in a structure.
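
As a sketch of that last point (the struct name is illustrative, not from the
series):

	struct cxl_partition_info {
		u64 active_volatile_bytes;
		u64 active_persistent_bytes;
		u64 next_volatile_bytes;
		u64 next_persistent_bytes;
	};

	/* one out-parameter instead of four scalars */
	int cxl_mem_get_partition_info(struct cxl_mem *cxlm,
				       struct cxl_partition_info *info);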

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v4 05/21] libnvdimm/label: Define CXL region labels
  2021-09-09 15:58   ` Ben Widawsky
@ 2021-09-09 18:38     ` Dan Williams
  0 siblings, 0 replies; 78+ messages in thread
From: Dan Williams @ 2021-09-09 18:38 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Jonathan Cameron, Vishal L Verma, Linux NVDIMM,
	Schofield, Alison, Weiny, Ira

On Thu, Sep 9, 2021 at 8:58 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-09-08 22:11:58, Dan Williams wrote:
> > Add a definition of the CXL 2.0 region label format. Note this is done
> > as a separate patch to make the next patch that adds namespace label
> > support easier to read.
> >
> > Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> > ---
> >  drivers/nvdimm/label.h |   32 ++++++++++++++++++++++++++++++++
> >  1 file changed, 32 insertions(+)
>
> Wondering how awkward it's going to be to use this in the cxl region driver. Is
> the intent to push all device-based label reads/writes into drivers/nvdimm?

I am looking at it from the perspective that namespace provisioning
and assembly will need to consult region details. So drivers/nvdimm/
needs to have region-label awareness regardless.

Now, when the CXL side wants to provision a region, it will also need
to coordinate label area access with any namespace provisioning
operations that might be happening on other regions that the CXL
device is contributing to.

Both of those lead me to believe that CXL should just request the
nvdimm sub-system to update labels to keep all the competing label
operations and cached label data in-sync.
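
In other words, something of this shape, where the function is hypothetical and
the region label struct is the one patch 05 defines:

	/*
	 * Hypothetical: the CXL side hands region-label updates to
	 * LIBNVDIMM so they serialize with namespace-label writes and
	 * the cached LSA contents in one place.
	 */
	int nvdimm_update_region_label(struct nvdimm *nvdimm,
				       const struct cxl_region_label *label);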

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v4 11/21] cxl/mbox: Move mailbox and other non-PCI specific infrastructure to the core
  2021-09-09 16:41   ` Ben Widawsky
@ 2021-09-09 18:50     ` Dan Williams
  2021-09-09 20:35       ` Ben Widawsky
  0 siblings, 1 reply; 78+ messages in thread
From: Dan Williams @ 2021-09-09 18:50 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, kernel test robot, Vishal L Verma, Linux NVDIMM,
	Schofield, Alison, Weiny, Ira, Jonathan Cameron

On Thu, Sep 9, 2021 at 9:41 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-09-08 22:12:32, Dan Williams wrote:
> > Now that the internals of mailbox operations are abstracted from the PCI
> > specifics a bulk of infrastructure can move to the core.
> >
> > The CXL_PMEM driver intends to proxy LIBNVDIMM UAPI and driver requests
> > to the equivalent functionality provided by the CXL hardware mailbox
> > interface. In support of that intent move the mailbox implementation to
> > a shared location for the CXL_PCI driver native IOCTL path and CXL_PMEM
> > nvdimm command proxy path to share.
> >
> > A unit test framework seeks to implement a unit test backend transport
> > for mailbox commands to communicate mocked up payloads. It can reuse all
> > of the mailbox infrastructure minus the PCI specifics, so that also gets
> > moved to the core.
> >
> > Finally with the mailbox infrastructure and ioctl handling being
> > transport generic there is no longer any need to pass
> > file_operations to devm_cxl_add_memdev(). That allows all the ioctl
> > boilerplate to move into the core for unit test reuse.
> >
> > No functional change intended, just code movement.
>
> At some point, I think some of the comments and kernel docs need updating since
> the target is no longer exclusively memory devices. Perhaps you do this in later
> patches....

I would wait to rework comments when/if it becomes clear that a
non-memory-device driver wants to reuse the mailbox core. I do not see
any indications that the comments are currently broken, do you?

[..]
> > diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> > index 036a3c8106b4..c85b7fbad02d 100644
> > --- a/drivers/cxl/core/core.h
> > +++ b/drivers/cxl/core/core.h
> > @@ -14,7 +14,15 @@ static inline void unregister_cxl_dev(void *dev)
> >       device_unregister(dev);
> >  }
> >
> > +struct cxl_send_command;
> > +struct cxl_mem_query_commands;
> > +int cxl_query_cmd(struct cxl_memdev *cxlmd,
> > +               struct cxl_mem_query_commands __user *q);
> > +int cxl_send_cmd(struct cxl_memdev *cxlmd, struct cxl_send_command __user *s);
> > +
> >  int cxl_memdev_init(void);
> >  void cxl_memdev_exit(void);
> > +void cxl_mbox_init(void);
> > +void cxl_mbox_exit(void);
>
> cxl_mbox_fini()?

The idiomatic kernel module shutdown function is suffixed _exit().
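
The pairing mirrors the module entry points themselves; a minimal example:

	#include <linux/module.h>

	static int __init demo_init(void)
	{
		return 0;
	}

	static void __exit demo_exit(void)
	{
	}

	module_init(demo_init);
	module_exit(demo_exit);
	MODULE_LICENSE("GPL");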

[..]

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v4 15/21] cxl/pmem: Translate NVDIMM label commands to CXL label commands
  2021-09-09 17:22   ` Ben Widawsky
@ 2021-09-09 19:03     ` Dan Williams
  2021-09-09 20:32       ` Ben Widawsky
  0 siblings, 1 reply; 78+ messages in thread
From: Dan Williams @ 2021-09-09 19:03 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Vishal L Verma, Linux NVDIMM, Schofield, Alison,
	Weiny, Ira, Jonathan Cameron

On Thu, Sep 9, 2021 at 10:22 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-09-08 22:12:54, Dan Williams wrote:
> > The LIBNVDIMM IOCTL UAPI calls back to the nvdimm-bus-provider to
> > translate the Linux command payload to the device native command format.
> > The LIBNVDIMM commands get-config-size, get-config-data, and
> > set-config-data, map to the CXL memory device commands device-identify,
> > get-lsa, and set-lsa. Recall that the label-storage-area (LSA) on an
> > NVDIMM device arranges for the provisioning of namespaces. Additionally
> > for CXL the LSA is used for provisioning regions as well.
> >
> > The data from device-identify is already cached in the 'struct cxl_mem'
> > instance associated with @cxl_nvd, so that payload return is simply
> > crafted and no CXL command is issued. The conversion for get-lsa is
> > straightforward, but the conversion for set-lsa requires an allocation
> > to append the set-lsa header in front of the payload.
> >
> > Acked-by: Ben Widawsky <ben.widawsky@intel.com>
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> > ---
> >  drivers/cxl/pmem.c |  125 ++++++++++++++++++++++++++++++++++++++++++++++++++--
> >  1 file changed, 121 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
> > index a972af7a6e0b..29d24f13aa73 100644
> > --- a/drivers/cxl/pmem.c
> > +++ b/drivers/cxl/pmem.c
> > @@ -1,6 +1,7 @@
> >  // SPDX-License-Identifier: GPL-2.0-only
> >  /* Copyright(c) 2021 Intel Corporation. All rights reserved. */
> >  #include <linux/libnvdimm.h>
> > +#include <asm/unaligned.h>
> >  #include <linux/device.h>
> >  #include <linux/module.h>
> >  #include <linux/ndctl.h>
> > @@ -48,10 +49,10 @@ static int cxl_nvdimm_probe(struct device *dev)
> >  {
> >       struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev);
> >       struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
> > +     unsigned long flags = 0, cmd_mask = 0;
> >       struct cxl_mem *cxlm = cxlmd->cxlm;
> >       struct cxl_nvdimm_bridge *cxl_nvb;
> >       struct nvdimm *nvdimm = NULL;
> > -     unsigned long flags = 0;
> >       int rc = -ENXIO;
> >
> >       cxl_nvb = cxl_find_nvdimm_bridge();
> > @@ -66,8 +67,11 @@ static int cxl_nvdimm_probe(struct device *dev)
> >
> >       set_bit(NDD_LABELING, &flags);
> >       rc = -ENOMEM;
> > -     nvdimm = nvdimm_create(cxl_nvb->nvdimm_bus, cxl_nvd, NULL, flags, 0, 0,
> > -                            NULL);
> > +     set_bit(ND_CMD_GET_CONFIG_SIZE, &cmd_mask);
> > +     set_bit(ND_CMD_GET_CONFIG_DATA, &cmd_mask);
> > +     set_bit(ND_CMD_SET_CONFIG_DATA, &cmd_mask);
> > +     nvdimm = nvdimm_create(cxl_nvb->nvdimm_bus, cxl_nvd, NULL, flags,
> > +                            cmd_mask, 0, NULL);
> >       dev_set_drvdata(dev, nvdimm);
> >
> >  out_unlock:
> > @@ -89,11 +93,124 @@ static struct cxl_driver cxl_nvdimm_driver = {
> >       .id = CXL_DEVICE_NVDIMM,
> >  };
> >
> > +static int cxl_pmem_get_config_size(struct cxl_mem *cxlm,
> > +                                 struct nd_cmd_get_config_size *cmd,
> > +                                 unsigned int buf_len, int *cmd_rc)
> > +{
> > +     if (sizeof(*cmd) > buf_len)
> > +             return -EINVAL;
> > +
> > +     *cmd = (struct nd_cmd_get_config_size) {
> > +              .config_size = cxlm->lsa_size,
> > +              .max_xfer = cxlm->payload_size,
> > +     };
> > +     *cmd_rc = 0;
> > +
> > +     return 0;
> > +}
> > +
> > +static int cxl_pmem_get_config_data(struct cxl_mem *cxlm,
> > +                                 struct nd_cmd_get_config_data_hdr *cmd,
> > +                                 unsigned int buf_len, int *cmd_rc)
> > +{
> > +     struct cxl_mbox_get_lsa {
> > +             u32 offset;
> > +             u32 length;
> > +     } get_lsa;
> > +     int rc;
> > +
> > +     if (sizeof(*cmd) > buf_len)
> > +             return -EINVAL;
> > +     if (struct_size(cmd, out_buf, cmd->in_length) > buf_len)
> > +             return -EINVAL;
> > +
> > +     get_lsa = (struct cxl_mbox_get_lsa) {
> > +             .offset = cmd->in_offset,
> > +             .length = cmd->in_length,
> > +     };
> > +
> > +     rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_GET_LSA, &get_lsa,
> > +                                sizeof(get_lsa), cmd->out_buf,
> > +                                cmd->in_length);
> > +     cmd->status = 0;
> > +     *cmd_rc = 0;
> > +
> > +     return rc;
> > +}
> > +
> > +static int cxl_pmem_set_config_data(struct cxl_mem *cxlm,
> > +                                 struct nd_cmd_set_config_hdr *cmd,
> > +                                 unsigned int buf_len, int *cmd_rc)
> > +{
> > +     struct cxl_mbox_set_lsa {
> > +             u32 offset;
> > +             u32 reserved;
> > +             u8 data[];
> > +     } *set_lsa;
> > +     int rc;
> > +
> > +     if (sizeof(*cmd) > buf_len)
> > +             return -EINVAL;
> > +
> > +     /* 4-byte status follows the input data in the payload */
> > +     if (struct_size(cmd, in_buf, cmd->in_length) + 4 > buf_len)
> > +             return -EINVAL;
> > +
> > +     set_lsa =
> > +             kvzalloc(struct_size(set_lsa, data, cmd->in_length), GFP_KERNEL);
> > +     if (!set_lsa)
> > +             return -ENOMEM;
> > +
> > +     *set_lsa = (struct cxl_mbox_set_lsa) {
> > +             .offset = cmd->in_offset,
> > +     };
> > +     memcpy(set_lsa->data, cmd->in_buf, cmd->in_length);
> > +
> > +     rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_SET_LSA, set_lsa,
> > +                                struct_size(set_lsa, data, cmd->in_length),
> > +                                NULL, 0);
> > +
> > +     /*
> > +      * Set "firmware" status (4 packed bytes at the end of the input
> > +      * payload).
> > +      */
> > +     put_unaligned(0, (u32 *) &cmd->in_buf[cmd->in_length]);
> > +     *cmd_rc = 0;
> > +     kvfree(set_lsa);
> > +
> > +     return rc;
> > +}
> > +
> > +static int cxl_pmem_nvdimm_ctl(struct nvdimm *nvdimm, unsigned int cmd,
> > +                            void *buf, unsigned int buf_len, int *cmd_rc)
> > +{
> > +     struct cxl_nvdimm *cxl_nvd = nvdimm_provider_data(nvdimm);
> > +     unsigned long cmd_mask = nvdimm_cmd_mask(nvdimm);
> > +     struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
> > +     struct cxl_mem *cxlm = cxlmd->cxlm;
> > +
> > +     if (!test_bit(cmd, &cmd_mask))
> > +             return -ENOTTY;
> > +
> > +     switch (cmd) {
> > +     case ND_CMD_GET_CONFIG_SIZE:
> > +             return cxl_pmem_get_config_size(cxlm, buf, buf_len, cmd_rc);
> > +     case ND_CMD_GET_CONFIG_DATA:
> > +             return cxl_pmem_get_config_data(cxlm, buf, buf_len, cmd_rc);
> > +     case ND_CMD_SET_CONFIG_DATA:
> > +             return cxl_pmem_set_config_data(cxlm, buf, buf_len, cmd_rc);
> > +     default:
> > +             return -ENOTTY;
> > +     }
> > +}
> > +
>
> Is there some intended purpose for passing cmd_rc down, if it isn't actually
> ever used? Perhaps add it when needed later?

Ah true, copy-pasta leftovers from other similar routines. I'll clean this up.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v4 15/21] cxl/pmem: Translate NVDIMM label commands to CXL label commands
  2021-09-09 19:03     ` Dan Williams
@ 2021-09-09 20:32       ` Ben Widawsky
  2021-09-10  9:39         ` Jonathan Cameron
  0 siblings, 1 reply; 78+ messages in thread
From: Ben Widawsky @ 2021-09-09 20:32 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Vishal L Verma, Linux NVDIMM, Schofield, Alison,
	Weiny, Ira, Jonathan Cameron

On 21-09-09 12:03:49, Dan Williams wrote:
> On Thu, Sep 9, 2021 at 10:22 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
[..]
> > Is there some intended purpose for passing cmd_rc down, if it isn't actually
> > ever used? Perhaps add it when needed later?
> 
> Ah true, copy-pasta leftovers from other similar routines. I'll clean this up.

With that,
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v4 11/21] cxl/mbox: Move mailbox and other non-PCI specific infrastructure to the core
  2021-09-09 18:50     ` Dan Williams
@ 2021-09-09 20:35       ` Ben Widawsky
  2021-09-09 21:05         ` Dan Williams
  0 siblings, 1 reply; 78+ messages in thread
From: Ben Widawsky @ 2021-09-09 20:35 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, kernel test robot, Vishal L Verma, Linux NVDIMM,
	Schofield, Alison, Weiny, Ira, Jonathan Cameron

On 21-09-09 11:50:01, Dan Williams wrote:
> On Thu, Sep 9, 2021 at 9:41 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > On 21-09-08 22:12:32, Dan Williams wrote:
> > > Now that the internals of mailbox operations are abstracted from the PCI
> > > specifics a bulk of infrastructure can move to the core.
> > >
> > > The CXL_PMEM driver intends to proxy LIBNVDIMM UAPI and driver requests
> > > to the equivalent functionality provided by the CXL hardware mailbox
> > > interface. In support of that intent move the mailbox implementation to
> > > a shared location for the CXL_PCI driver native IOCTL path and CXL_PMEM
> > > nvdimm command proxy path to share.
> > >
> > > A unit test framework seeks to implement a unit test backend transport
> > > for mailbox commands to communicate mocked up payloads. It can reuse all
> > > of the mailbox infrastructure minus the PCI specifics, so that also gets
> > > moved to the core.
> > >
> > > Finally with the mailbox infrastructure and ioctl handling being
> > > transport generic there is no longer any need to pass
> > > file_operations to devm_cxl_add_memdev(). That allows all the ioctl
> > > boilerplate to move into the core for unit test reuse.
> > >
> > > No functional change intended, just code movement.
> >
> > At some point, I think some of the comments and kernel docs need updating since
> > the target is no longer exclusively memory devices. Perhaps you do this in later
> > patches....
> 
> I would wait to rework comments when/if it becomes clear that a
> non-memory-device driver wants to reuse the mailbox core. I do not see
> any indications that the comments are currently broken, do you?

I didn't see anything which is incorrect, no. But would-be non-memory-driver
writers could be scared off by such comments... I don't mean that it
should hold this patch up, btw.

> 
> [..]
> > > diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> > > index 036a3c8106b4..c85b7fbad02d 100644
> > > --- a/drivers/cxl/core/core.h
> > > +++ b/drivers/cxl/core/core.h
> > > @@ -14,7 +14,15 @@ static inline void unregister_cxl_dev(void *dev)
> > >       device_unregister(dev);
> > >  }
> > >
> > > +struct cxl_send_command;
> > > +struct cxl_mem_query_commands;
> > > +int cxl_query_cmd(struct cxl_memdev *cxlmd,
> > > +               struct cxl_mem_query_commands __user *q);
> > > +int cxl_send_cmd(struct cxl_memdev *cxlmd, struct cxl_send_command __user *s);
> > > +
> > >  int cxl_memdev_init(void);
> > >  void cxl_memdev_exit(void);
> > > +void cxl_mbox_init(void);
> > > +void cxl_mbox_exit(void);
> >
> > cxl_mbox_fini()?
> 
> The idiomatic kernel module shutdown function is suffixed _exit().
> 
> [..]

Got it. I'd argue that these aren't kernel module init/exit functions though. I
will leave it at that.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v4 11/21] cxl/mbox: Move mailbox and other non-PCI specific infrastructure to the core
  2021-09-09 20:35       ` Ben Widawsky
@ 2021-09-09 21:05         ` Dan Williams
  0 siblings, 0 replies; 78+ messages in thread
From: Dan Williams @ 2021-09-09 21:05 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, kernel test robot, Vishal L Verma, Linux NVDIMM,
	Schofield, Alison, Weiny, Ira, Jonathan Cameron

On Thu, Sep 9, 2021 at 1:35 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-09-09 11:50:01, Dan Williams wrote:
> > On Thu, Sep 9, 2021 at 9:41 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >
> > > On 21-09-08 22:12:32, Dan Williams wrote:
[..]
> > > > No functional change intended, just code movement.
> > >
> > > At some point, I think some of the comments and kernel docs need updating since
> > > the target is no longer exclusively memory devices. Perhaps you do this in later
> > > patches....
> >
> > I would wait to rework comments when/if it becomes clear that a
> > non-memory-device driver wants to reuse the mailbox core. I do not see
> > any indications that the comments are currently broken, do you?
>
> I didn't see anything which is incorrect, no. But would-be non-memory-driver
> writers could be scared off by such comments... I don't mean that it
> should hold this patch up, btw.

Ok, we can cross that bridge when it comes to it. I don't expect
someone to reinvent mailbox command infrastructure, especially when the
changelog for the commit creating drivers/cxl/core/mbox.c
specifically mentions disconnecting the mailbox core from the cxl_pci
driver.

>
> >
> > [..]
> > > > diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> > > > index 036a3c8106b4..c85b7fbad02d 100644
> > > > --- a/drivers/cxl/core/core.h
> > > > +++ b/drivers/cxl/core/core.h
> > > > @@ -14,7 +14,15 @@ static inline void unregister_cxl_dev(void *dev)
> > > >       device_unregister(dev);
> > > >  }
> > > >
> > > > +struct cxl_send_command;
> > > > +struct cxl_mem_query_commands;
> > > > +int cxl_query_cmd(struct cxl_memdev *cxlmd,
> > > > +               struct cxl_mem_query_commands __user *q);
> > > > +int cxl_send_cmd(struct cxl_memdev *cxlmd, struct cxl_send_command __user *s);
> > > > +
> > > >  int cxl_memdev_init(void);
> > > >  void cxl_memdev_exit(void);
> > > > +void cxl_mbox_init(void);
> > > > +void cxl_mbox_exit(void);
> > >
> > > cxl_mbox_fini()?
> >
> > The idiomatic kernel module shutdown function is suffixed _exit().
> >
> > [..]
>
> Got it. I'd argue that these aren't kernel module init/exit functions though. I
> will leave it at that.

I would expect a _fini() function to destroy a single object, not a
void _exit() that tears down the global static infrastructure of a
compilation unit.
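
To put the distinction in code, a hedged sketch with hypothetical 'widget'
names:

    /* _fini(): destructor for a single object instance */
    void widget_fini(struct widget *w)
    {
            ida_free(&widget_ida, w->id);
            kfree(w->name);
    }

    /* _exit(): teardown of a compilation unit's global static state */
    void widget_exit(void)
    {
            bus_unregister(&widget_bus_type);
            ida_destroy(&widget_ida);
    }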

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v4 08/21] cxl/pci: Clean up cxl_mem_get_partition_info()
  2021-09-09 18:06     ` Dan Williams
@ 2021-09-09 21:05       ` Ben Widawsky
  2021-09-09 21:10         ` Dan Williams
  2021-09-10  8:56         ` Jonathan Cameron
  0 siblings, 2 replies; 78+ messages in thread
From: Ben Widawsky @ 2021-09-09 21:05 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Ira Weiny, Vishal L Verma, Linux NVDIMM, Schofield,
	Alison, Jonathan Cameron

On 21-09-09 11:06:53, Dan Williams wrote:
> On Thu, Sep 9, 2021 at 9:20 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > On 21-09-08 22:12:15, Dan Williams wrote:
> > > Commit 0b9159d0ff21 ("cxl/pci: Store memory capacity values") missed
> > > updating the kernel-doc for 'struct cxl_mem' leading to the following
> > > warnings:
> > >
> > > ./scripts/kernel-doc -v drivers/cxl/cxlmem.h 2>&1 | grep warn
> > > drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'total_bytes' not described in 'cxl_mem'
> > > drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'volatile_only_bytes' not described in 'cxl_mem'
> > > drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'persistent_only_bytes' not described in 'cxl_mem'
> > > drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'partition_align_bytes' not described in 'cxl_mem'
> > > drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'active_volatile_bytes' not described in 'cxl_mem'
> > > drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'active_persistent_bytes' not described in 'cxl_mem'
> > > drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'next_volatile_bytes' not described in 'cxl_mem'
> > > drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'next_persistent_bytes' not described in 'cxl_mem'
> > >
> > > Also, it is redundant to describe those same parameters in the
> > > kernel-doc for cxl_mem_get_partition_info(). Given the only user of that
> > > routine updates the values in @cxlm, just do that implicitly inside
> > > the helper.
> > >
> > > Cc: Ira Weiny <ira.weiny@intel.com>
> > > Reported-by: Ben Widawsky <ben.widawsky@intel.com>
> > > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> > > ---
> > >  drivers/cxl/cxlmem.h |   15 +++++++++++++--
> > >  drivers/cxl/pci.c    |   35 +++++++++++------------------------
> > >  2 files changed, 24 insertions(+), 26 deletions(-)
> > >
> > > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > > index d5334df83fb2..c6fce966084a 100644
> > > --- a/drivers/cxl/cxlmem.h
> > > +++ b/drivers/cxl/cxlmem.h
> > > @@ -78,8 +78,19 @@ devm_cxl_add_memdev(struct cxl_mem *cxlm,
> > >   * @mbox_mutex: Mutex to synchronize mailbox access.
> > >   * @firmware_version: Firmware version for the memory device.
> > >   * @enabled_cmds: Hardware commands found enabled in CEL.
> > > - * @pmem_range: Persistent memory capacity information.
> > > - * @ram_range: Volatile memory capacity information.
> > > + * @pmem_range: Active Persistent memory capacity configuration
> > > + * @ram_range: Active Volatile memory capacity configuration
> > > + * @total_bytes: sum of all possible capacities
> > > + * @volatile_only_bytes: hard volatile capacity
> > > + * @persistent_only_bytes: hard persistent capacity
> > > + * @partition_align_bytes: soft setting for configurable capacity

see below... How about:
"alignment size for partition-able capacity"

> > > + * @active_volatile_bytes: sum of hard + soft volatile
> > > + * @active_persistent_bytes: sum of hard + soft persistent
> >
> > Looking at this now, probably makes sense to create some helper macros or inline
> > functions to calculate these as needed, rather than storing them in the
> > structure.
> 
> Perhaps, I would need to look deeper into what is worth caching vs
> what is suitable to be recalculated. Do you have a proposal here?

I think it's worth considering... however, this suggestion was my mistake. I
thought there was a way to infer the active capacities just from identify, but
there isn't. That confusion is why I'm requesting an update to the kdoc above.
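
For reference, Identify only yields the hard capacities (total, volatile-only,
persistent-only, plus the partition alignment); the active/next values have to
come from Get Partition Info. A hypothetical inline helper in the spirit of the
suggestion above, for a value that is derivable from Identify alone:

    /* capacity not hard-wired to either partition */
    static inline u64 cxl_mem_configurable_bytes(struct cxl_mem *cxlm)
    {
            return cxlm->total_bytes - cxlm->volatile_only_bytes -
                   cxlm->persistent_only_bytes;
    }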

> 
> 
> >
> > > + * @next_volatile_bytes: volatile capacity change pending device reset
> > > + * @next_persistent_bytes: persistent capacity change pending device reset
> > > + *
> > > + * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
> > > + * details on capacity parameters.
> > >   */
> > >  struct cxl_mem {
> > >       struct device *dev;
> > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > index c1e1d12e24b6..8077d907e7d3 100644
> > > --- a/drivers/cxl/pci.c
> > > +++ b/drivers/cxl/pci.c
> > > @@ -1262,11 +1262,7 @@ static struct cxl_mbox_get_supported_logs *cxl_get_gsl(struct cxl_mem *cxlm)
> > >
> > >  /**
> > >   * cxl_mem_get_partition_info - Get partition info
> > > - * @cxlm: The device to act on
> > > - * @active_volatile_bytes: returned active volatile capacity
> > > - * @active_persistent_bytes: returned active persistent capacity
> > > - * @next_volatile_bytes: return next volatile capacity
> > > - * @next_persistent_bytes: return next persistent capacity
> > > + * @cxlm: cxl_mem instance to update partition info
> > >   *
> > >   * Retrieve the current partition info for the device specified.  If not 0, the
> > >   * 'next' values are pending and take effect on next cold reset.
> > > @@ -1275,11 +1271,7 @@ static struct cxl_mbox_get_supported_logs *cxl_get_gsl(struct cxl_mem *cxlm)
> > >   *
> > >   * See CXL @8.2.9.5.2.1 Get Partition Info
> > >   */
> > > -static int cxl_mem_get_partition_info(struct cxl_mem *cxlm,
> > > -                                   u64 *active_volatile_bytes,
> > > -                                   u64 *active_persistent_bytes,
> > > -                                   u64 *next_volatile_bytes,
> > > -                                   u64 *next_persistent_bytes)
> > > +static int cxl_mem_get_partition_info(struct cxl_mem *cxlm)
> > >  {
> > >       struct cxl_mbox_get_partition_info {
> > >               __le64 active_volatile_cap;
> > > @@ -1294,15 +1286,14 @@ static int cxl_mem_get_partition_info(struct cxl_mem *cxlm,
> > >       if (rc)
> > >               return rc;
> > >
> > > -     *active_volatile_bytes = le64_to_cpu(pi.active_volatile_cap);
> > > -     *active_persistent_bytes = le64_to_cpu(pi.active_persistent_cap);
> > > -     *next_volatile_bytes = le64_to_cpu(pi.next_volatile_cap);
> > > -     *next_persistent_bytes = le64_to_cpu(pi.next_volatile_cap);
> > > -
> > > -     *active_volatile_bytes *= CXL_CAPACITY_MULTIPLIER;
> > > -     *active_persistent_bytes *= CXL_CAPACITY_MULTIPLIER;
> > > -     *next_volatile_bytes *= CXL_CAPACITY_MULTIPLIER;
> > > -     *next_persistent_bytes *= CXL_CAPACITY_MULTIPLIER;
> > > +     cxlm->active_volatile_bytes =
> > > +             le64_to_cpu(pi.active_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
> > > +     cxlm->active_persistent_bytes =
> > > +             le64_to_cpu(pi.active_persistent_cap) * CXL_CAPACITY_MULTIPLIER;
> > > +     cxlm->next_volatile_bytes =
> > > +             le64_to_cpu(pi.next_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
> > > +     cxlm->next_persistent_bytes =
> > > +             le64_to_cpu(pi.next_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
> >
> > Personally, I prefer the more functional style implementation. I guess if you
> > wanted to make the change, my preference would be to kill
> > cxl_mem_get_partition_info() entirely. Up to you though...
> 
> I was bringing this function in line with the precedent we already set
> with cxl_mem_identify() that caches the result in @cxlm. Are you
> saying you want to change that style too?
> 
> I feel like caching device attributes is idiomatic. Look at all the
> PCI attributes that are cached in "struct pci_dev" that could be
> re-read or re-calculated rather than cached. In general I think a
> routine that returns 4 values is better off filling in a structure.

Caching is totally fine, I was just suggesting keeping the side effects out of
the innocuous-sounding cxl_mem_get_partition_info(). I think I originally
authored cxl_mem_identify(), so mea culpa. I just realized after seeing the
removal that I liked Ira's functional style.

Perhaps simply renaming these functions is the right solution (i.e., save it
for a rainy day). Populate is not my favorite verb, but as an example:
static int cxl_mem_populate_partition_info
static int cxl_mem_populate_identify

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v4 08/21] cxl/pci: Clean up cxl_mem_get_partition_info()
  2021-09-09 21:05       ` Ben Widawsky
@ 2021-09-09 21:10         ` Dan Williams
  2021-09-10  8:56         ` Jonathan Cameron
  1 sibling, 0 replies; 78+ messages in thread
From: Dan Williams @ 2021-09-09 21:10 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Ira Weiny, Vishal L Verma, Linux NVDIMM, Schofield,
	Alison, Jonathan Cameron

On Thu, Sep 9, 2021 at 2:05 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
[..]
>
> Caching is totally fine, I was just suggesting keeping the side effects out of
> the innocuous sounding cxl_mem_get_partition_info(). I think I originally
> authored authored cxl_mem_identify(), so mea culpa. I just realized it after
> seeing the removal that I liked Ira's functional style.
>
> Perhaps simply renaming these functions is the right solution (ie. save it for a
> rainy day). Populate is not my favorite verb, but as an example:
> static int cxl_mem_populate_partition_info
> static int cxl_mem_populate_indentify

Sure, I would not say "no" to a follow-on cleanup patch along those
lines that brought these two functions into a coherent style.
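
A rough sketch of what such a follow-on could look like (hypothetical names,
building on Ben's suggestion), with both helpers sharing a verb and caching
their results in @cxlm:

    static int cxl_mem_populate_identify(struct cxl_mem *cxlm);
    static int cxl_mem_populate_partition_info(struct cxl_mem *cxlm);

    /* probe-time usage: cache everything up front, read fields thereafter */
    static int cxl_mem_cache_state(struct cxl_mem *cxlm)
    {
            int rc = cxl_mem_populate_identify(cxlm);

            if (rc)
                    return rc;
            return cxl_mem_populate_partition_info(cxlm);
    }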

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v4 16/21] cxl/pmem: Add support for multiple nvdimm-bridge objects
  2021-09-09  5:12 ` [PATCH v4 16/21] cxl/pmem: Add support for multiple nvdimm-bridge objects Dan Williams
@ 2021-09-09 22:03   ` Dan Williams
  2021-09-14 19:08   ` [PATCH v5 " Dan Williams
  1 sibling, 0 replies; 78+ messages in thread
From: Dan Williams @ 2021-09-09 22:03 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ben Widawsky, Vishal L Verma, Linux NVDIMM, Schofield, Alison,
	Weiny, Ira, Jonathan Cameron

On Wed, Sep 8, 2021 at 10:13 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> In preparation for a mocked unit test environment for CXL objects, allow
> for multiple unique nvdimm-bridge objects.
>
> For now, just allow multiple bridges to be registered. Later, when there
> are multiple present, further updates are needed to
> cxl_find_nvdimm_bridge() to identify which bridge is associated with
> which CXL hierarchy for nvdimm registration.
>
> Note that this does change the kernel device-name for the bridge object.
> User space should not have any attachment to the device name at this
> point as it is still early days in the CXL driver development.

Apologies Jonathan, missed your:

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

here...

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [PATCH v5 15/21] cxl/pmem: Translate NVDIMM label commands to CXL label commands
  2021-09-09  5:12 ` [PATCH v4 15/21] cxl/pmem: Translate NVDIMM label commands to CXL label commands Dan Williams
  2021-09-09 17:22   ` Ben Widawsky
@ 2021-09-09 22:08   ` Dan Williams
  2021-09-10  9:40     ` Jonathan Cameron
  2021-09-14 19:06   ` Dan Williams
  2 siblings, 1 reply; 78+ messages in thread
From: Dan Williams @ 2021-09-09 22:08 UTC (permalink / raw)
  To: linux-cxl
  Cc: nvdimm, ben.widawsky, ira.weiny, Jonathan.Cameron,
	vishal.l.verma, ira.weiny

The LIBNVDIMM IOCTL UAPI calls back to the nvdimm-bus-provider to
translate the Linux command payload to the device native command format.
The LIBNVDIMM commands get-config-size, get-config-data, and
set-config-data map to the CXL memory device commands device-identify,
get-lsa, and set-lsa. Recall that the label-storage-area (LSA) on an
NVDIMM device arranges for the provisioning of namespaces. Additionally,
for CXL the LSA is used for provisioning regions.

The data from device-identify is already cached in the 'struct cxl_mem'
instance associated with @cxl_nvd, so that payload return is simply
crafted and no CXL command is issued. The conversion for get-lsa is
straightforward, but the conversion for set-lsa requires an allocation
to append the set-lsa header in front of the payload.

Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
Changes since v4:
- Drop cmd_rc plumbing as it is presently unused (Ben)

 drivers/cxl/pmem.c |  128 ++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 124 insertions(+), 4 deletions(-)

diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
index a972af7a6e0b..d349a6cb01df 100644
--- a/drivers/cxl/pmem.c
+++ b/drivers/cxl/pmem.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /* Copyright(c) 2021 Intel Corporation. All rights reserved. */
 #include <linux/libnvdimm.h>
+#include <asm/unaligned.h>
 #include <linux/device.h>
 #include <linux/module.h>
 #include <linux/ndctl.h>
@@ -48,10 +49,10 @@ static int cxl_nvdimm_probe(struct device *dev)
 {
 	struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev);
 	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
+	unsigned long flags = 0, cmd_mask = 0;
 	struct cxl_mem *cxlm = cxlmd->cxlm;
 	struct cxl_nvdimm_bridge *cxl_nvb;
 	struct nvdimm *nvdimm = NULL;
-	unsigned long flags = 0;
 	int rc = -ENXIO;
 
 	cxl_nvb = cxl_find_nvdimm_bridge();
@@ -66,8 +67,11 @@ static int cxl_nvdimm_probe(struct device *dev)
 
 	set_bit(NDD_LABELING, &flags);
 	rc = -ENOMEM;
-	nvdimm = nvdimm_create(cxl_nvb->nvdimm_bus, cxl_nvd, NULL, flags, 0, 0,
-			       NULL);
+	set_bit(ND_CMD_GET_CONFIG_SIZE, &cmd_mask);
+	set_bit(ND_CMD_GET_CONFIG_DATA, &cmd_mask);
+	set_bit(ND_CMD_SET_CONFIG_DATA, &cmd_mask);
+	nvdimm = nvdimm_create(cxl_nvb->nvdimm_bus, cxl_nvd, NULL, flags,
+			       cmd_mask, 0, NULL);
 	dev_set_drvdata(dev, nvdimm);
 
 out_unlock:
@@ -89,11 +93,127 @@ static struct cxl_driver cxl_nvdimm_driver = {
 	.id = CXL_DEVICE_NVDIMM,
 };
 
+static int cxl_pmem_get_config_size(struct cxl_mem *cxlm,
+				    struct nd_cmd_get_config_size *cmd,
+				    unsigned int buf_len)
+{
+	if (sizeof(*cmd) > buf_len)
+		return -EINVAL;
+
+	*cmd = (struct nd_cmd_get_config_size) {
+		 .config_size = cxlm->lsa_size,
+		 .max_xfer = cxlm->payload_size,
+	};
+
+	return 0;
+}
+
+static int cxl_pmem_get_config_data(struct cxl_mem *cxlm,
+				    struct nd_cmd_get_config_data_hdr *cmd,
+				    unsigned int buf_len)
+{
+	struct cxl_mbox_get_lsa {
+		u32 offset;
+		u32 length;
+	} get_lsa;
+	int rc;
+
+	if (sizeof(*cmd) > buf_len)
+		return -EINVAL;
+	if (struct_size(cmd, out_buf, cmd->in_length) > buf_len)
+		return -EINVAL;
+
+	get_lsa = (struct cxl_mbox_get_lsa) {
+		.offset = cmd->in_offset,
+		.length = cmd->in_length,
+	};
+
+	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_GET_LSA, &get_lsa,
+				   sizeof(get_lsa), cmd->out_buf,
+				   cmd->in_length);
+	cmd->status = 0;
+
+	return rc;
+}
+
+static int cxl_pmem_set_config_data(struct cxl_mem *cxlm,
+				    struct nd_cmd_set_config_hdr *cmd,
+				    unsigned int buf_len)
+{
+	struct cxl_mbox_set_lsa {
+		u32 offset;
+		u32 reserved;
+		u8 data[];
+	} *set_lsa;
+	int rc;
+
+	if (sizeof(*cmd) > buf_len)
+		return -EINVAL;
+
+	/* 4-byte status follows the input data in the payload */
+	if (struct_size(cmd, in_buf, cmd->in_length) + 4 > buf_len)
+		return -EINVAL;
+
+	set_lsa =
+		kvzalloc(struct_size(set_lsa, data, cmd->in_length), GFP_KERNEL);
+	if (!set_lsa)
+		return -ENOMEM;
+
+	*set_lsa = (struct cxl_mbox_set_lsa) {
+		.offset = cmd->in_offset,
+	};
+	memcpy(set_lsa->data, cmd->in_buf, cmd->in_length);
+
+	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_SET_LSA, set_lsa,
+				   struct_size(set_lsa, data, cmd->in_length),
+				   NULL, 0);
+
+	/*
+	 * Set "firmware" status (4 packed bytes at the end of the input
+	 * payload).
+	 */
+	put_unaligned(0, (u32 *) &cmd->in_buf[cmd->in_length]);
+	kvfree(set_lsa);
+
+	return rc;
+}
+
+static int cxl_pmem_nvdimm_ctl(struct nvdimm *nvdimm, unsigned int cmd,
+			       void *buf, unsigned int buf_len)
+{
+	struct cxl_nvdimm *cxl_nvd = nvdimm_provider_data(nvdimm);
+	unsigned long cmd_mask = nvdimm_cmd_mask(nvdimm);
+	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
+	struct cxl_mem *cxlm = cxlmd->cxlm;
+
+	if (!test_bit(cmd, &cmd_mask))
+		return -ENOTTY;
+
+	switch (cmd) {
+	case ND_CMD_GET_CONFIG_SIZE:
+		return cxl_pmem_get_config_size(cxlm, buf, buf_len);
+	case ND_CMD_GET_CONFIG_DATA:
+		return cxl_pmem_get_config_data(cxlm, buf, buf_len);
+	case ND_CMD_SET_CONFIG_DATA:
+		return cxl_pmem_set_config_data(cxlm, buf, buf_len);
+	default:
+		return -ENOTTY;
+	}
+}
+
 static int cxl_pmem_ctl(struct nvdimm_bus_descriptor *nd_desc,
 			struct nvdimm *nvdimm, unsigned int cmd, void *buf,
 			unsigned int buf_len, int *cmd_rc)
 {
-	return -ENOTTY;
+	/*
+	 * No firmware response to translate, let the transport error
+	 * code take precedence.
+	 */
+	*cmd_rc = 0;
+
+	if (!nvdimm)
+		return -ENOTTY;
+	return cxl_pmem_nvdimm_ctl(nvdimm, cmd, buf, buf_len);
 }
 
 static bool online_nvdimm_bus(struct cxl_nvdimm_bridge *cxl_nvb)
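
For context, the proxy path above is what a LIBNVDIMM ioctl from userspace
ends up exercising. A minimal caller, assuming a hypothetical /dev/nmem0
registered via the CXL nvdimm bridge:

    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <linux/ndctl.h>

    int main(void)
    {
            struct nd_cmd_get_config_size cmd = { 0 };
            int fd = open("/dev/nmem0", O_RDWR);

            if (fd < 0)
                    return 1;
            /* dispatched to cxl_pmem_ctl() -> cxl_pmem_get_config_size() */
            if (ioctl(fd, ND_IOCTL_GET_CONFIG_SIZE, &cmd) == 0)
                    printf("lsa size: %u max xfer: %u\n",
                           cmd.config_size, cmd.max_xfer);
            close(fd);
            return 0;
    }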


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* Re: [PATCH v4 04/21] libnvdimm/labels: Fix kernel-doc for label.h
  2021-09-09  5:11 ` [PATCH v4 04/21] libnvdimm/labels: Fix kernel-doc for label.h Dan Williams
@ 2021-09-10  8:38   ` Jonathan Cameron
  0 siblings, 0 replies; 78+ messages in thread
From: Jonathan Cameron @ 2021-09-10  8:38 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, vishal.l.verma, nvdimm, ben.widawsky,
	alison.schofield, ira.weiny

On Wed, 8 Sep 2021 22:11:53 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> Clean up existing kernel-doc warnings before adding new CXL label data
> structures.
> 
> drivers/nvdimm/label.h:66: warning: Function parameter or member 'labelsize' not described in 'nd_namespace_index'
> drivers/nvdimm/label.h:66: warning: Function parameter or member 'free' not described in 'nd_namespace_index'
> drivers/nvdimm/label.h:103: warning: Function parameter or member 'align' not described in 'nd_namespace_label'
> drivers/nvdimm/label.h:103: warning: Function parameter or member 'reserved' not described in 'nd_namespace_label'
> drivers/nvdimm/label.h:103: warning: Function parameter or member 'type_guid' not described in 'nd_namespace_label'
> drivers/nvdimm/label.h:103: warning: Function parameter or member 'abstraction_guid' not described in 'nd_namespace_label'
> drivers/nvdimm/label.h:103: warning: Function parameter or member 'reserved2' not described in 'nd_namespace_label'
> drivers/nvdimm/label.h:103: warning: Function parameter or member 'checksum' not described in 'nd_namespace_label'
> 
> Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
LGTM
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
>  drivers/nvdimm/label.h |   10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/nvdimm/label.h b/drivers/nvdimm/label.h
> index 31f94fad7b92..7fa757d47846 100644
> --- a/drivers/nvdimm/label.h
> +++ b/drivers/nvdimm/label.h
> @@ -34,6 +34,7 @@ enum {
>   * struct nd_namespace_index - label set superblock
>   * @sig: NAMESPACE_INDEX\0
>   * @flags: placeholder
> + * @labelsize: log2 size (v1 labels 128 bytes v2 labels 256 bytes)
>   * @seq: sequence number for this index
>   * @myoff: offset of this index in label area
>   * @mysize: size of this index struct
> @@ -43,7 +44,7 @@ enum {
>   * @major: label area major version
>   * @minor: label area minor version
>   * @checksum: fletcher64 of all fields
> - * @free[0]: bitmap, nlabel bits
> + * @free: bitmap, nlabel bits
>   *
>   * The size of free[] is rounded up so the total struct size is a
>   * multiple of NSINDEX_ALIGN bytes.  Any bits this allocates beyond
> @@ -77,7 +78,12 @@ struct nd_namespace_index {
>   * @dpa: DPA of NVM range on this DIMM
>   * @rawsize: size of namespace
>   * @slot: slot of this label in label area
> - * @unused: must be zero
> + * @align: physical address alignment of the namespace
> + * @reserved: reserved
> + * @type_guid: copy of struct acpi_nfit_system_address.range_guid
> + * @abstraction_guid: personality id (btt, btt2, fsdax, devdax....)
> + * @reserved2: reserved
> + * @checksum: fletcher64 sum of this object
>   */
>  struct nd_namespace_label {
>  	u8 uuid[NSLABEL_UUID_LEN];
> 
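
For anyone wondering how the warnings relate to the fix: scripts/kernel-doc
pairs '@member:' lines with struct members by name, so every member needs
exactly one. A minimal illustration with a hypothetical struct:

    /**
     * struct example - demo of the @member pairing rule
     * @a: documented, no warning
     * @b: removing this line reproduces a "not described" warning
     */
    struct example {
            int a;
            int b;
    };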


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v4 07/21] cxl/pci: Make 'struct cxl_mem' device type generic
  2021-09-09  5:12 ` [PATCH v4 07/21] cxl/pci: Make 'struct cxl_mem' device type generic Dan Williams
  2021-09-09 16:12   ` Ben Widawsky
@ 2021-09-10  8:43   ` Jonathan Cameron
  1 sibling, 0 replies; 78+ messages in thread
From: Jonathan Cameron @ 2021-09-10  8:43 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Ben Widawsky, vishal.l.verma, nvdimm,
	alison.schofield, ira.weiny

On Wed, 8 Sep 2021 22:12:09 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> In preparation for adding a unit test provider of a cxl_memdev, convert
> the 'struct cxl_mem' driver context to carry a generic device rather
> than a pci device.
> 
> Note, some dev_dbg() lines needed extra reformatting per clang-format.
> 
> This conversion also allows the cxl_mem_create() and
> devm_cxl_add_memdev() calling conventions to be simplified. The "host"
> for a cxl_memdev, must be the same device for the driver that allocated
> @cxlm.
> 
> Acked-by: Ben Widawsky <ben.widawsky@intel.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
>  drivers/cxl/core/memdev.c |    7 ++--
>  drivers/cxl/cxlmem.h      |    6 ++--
>  drivers/cxl/pci.c         |   75 +++++++++++++++++++++------------------------
>  3 files changed, 41 insertions(+), 47 deletions(-)
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index a9c317e32010..331ef7d6c598 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -149,7 +149,6 @@ static void cxl_memdev_unregister(void *_cxlmd)
>  static struct cxl_memdev *cxl_memdev_alloc(struct cxl_mem *cxlm,
>  					   const struct file_operations *fops)
>  {
> -	struct pci_dev *pdev = cxlm->pdev;
>  	struct cxl_memdev *cxlmd;
>  	struct device *dev;
>  	struct cdev *cdev;
> @@ -166,7 +165,7 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_mem *cxlm,
>  
>  	dev = &cxlmd->dev;
>  	device_initialize(dev);
> -	dev->parent = &pdev->dev;
> +	dev->parent = cxlm->dev;
>  	dev->bus = &cxl_bus_type;
>  	dev->devt = MKDEV(cxl_mem_major, cxlmd->id);
>  	dev->type = &cxl_memdev_type;
> @@ -182,7 +181,7 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_mem *cxlm,
>  }
>  
>  struct cxl_memdev *
> -devm_cxl_add_memdev(struct device *host, struct cxl_mem *cxlm,
> +devm_cxl_add_memdev(struct cxl_mem *cxlm,
>  		    const struct cdevm_file_operations *cdevm_fops)
>  {
>  	struct cxl_memdev *cxlmd;
> @@ -210,7 +209,7 @@ devm_cxl_add_memdev(struct device *host, struct cxl_mem *cxlm,
>  	if (rc)
>  		goto err;
>  
> -	rc = devm_add_action_or_reset(host, cxl_memdev_unregister, cxlmd);
> +	rc = devm_add_action_or_reset(cxlm->dev, cxl_memdev_unregister, cxlmd);
>  	if (rc)
>  		return ERR_PTR(rc);
>  	return cxlmd;
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 6c0b1e2ea97c..d5334df83fb2 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -63,12 +63,12 @@ static inline struct cxl_memdev *to_cxl_memdev(struct device *dev)
>  }
>  
>  struct cxl_memdev *
> -devm_cxl_add_memdev(struct device *host, struct cxl_mem *cxlm,
> +devm_cxl_add_memdev(struct cxl_mem *cxlm,
>  		    const struct cdevm_file_operations *cdevm_fops);
>  
>  /**
>   * struct cxl_mem - A CXL memory device
> - * @pdev: The PCI device associated with this CXL device.
> + * @dev: The device associated with this CXL device.
>   * @cxlmd: Logical memory device chardev / interface
>   * @regs: Parsed register blocks
>   * @payload_size: Size of space for payload
> @@ -82,7 +82,7 @@ devm_cxl_add_memdev(struct device *host, struct cxl_mem *cxlm,
>   * @ram_range: Volatile memory capacity information.
>   */
>  struct cxl_mem {
> -	struct pci_dev *pdev;
> +	struct device *dev;
>  	struct cxl_memdev *cxlmd;
>  
>  	struct cxl_regs regs;
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 8e45aa07d662..c1e1d12e24b6 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -250,7 +250,7 @@ static int cxl_mem_wait_for_doorbell(struct cxl_mem *cxlm)
>  		cpu_relax();
>  	}
>  
> -	dev_dbg(&cxlm->pdev->dev, "Doorbell wait took %dms",
> +	dev_dbg(cxlm->dev, "Doorbell wait took %dms",
>  		jiffies_to_msecs(end) - jiffies_to_msecs(start));
>  	return 0;
>  }
> @@ -268,7 +268,7 @@ static bool cxl_is_security_command(u16 opcode)
>  static void cxl_mem_mbox_timeout(struct cxl_mem *cxlm,
>  				 struct mbox_cmd *mbox_cmd)
>  {
> -	struct device *dev = &cxlm->pdev->dev;
> +	struct device *dev = cxlm->dev;
>  
>  	dev_dbg(dev, "Mailbox command (opcode: %#x size: %zub) timed out\n",
>  		mbox_cmd->opcode, mbox_cmd->size_in);
> @@ -300,6 +300,7 @@ static int __cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm,
>  				   struct mbox_cmd *mbox_cmd)
>  {
>  	void __iomem *payload = cxlm->regs.mbox + CXLDEV_MBOX_PAYLOAD_OFFSET;
> +	struct device *dev = cxlm->dev;
>  	u64 cmd_reg, status_reg;
>  	size_t out_len;
>  	int rc;
> @@ -325,8 +326,7 @@ static int __cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm,
>  
>  	/* #1 */
>  	if (cxl_doorbell_busy(cxlm)) {
> -		dev_err_ratelimited(&cxlm->pdev->dev,
> -				    "Mailbox re-busy after acquiring\n");
> +		dev_err_ratelimited(dev, "Mailbox re-busy after acquiring\n");
>  		return -EBUSY;
>  	}
>  
> @@ -345,7 +345,7 @@ static int __cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm,
>  	writeq(cmd_reg, cxlm->regs.mbox + CXLDEV_MBOX_CMD_OFFSET);
>  
>  	/* #4 */
> -	dev_dbg(&cxlm->pdev->dev, "Sending command\n");
> +	dev_dbg(dev, "Sending command\n");
>  	writel(CXLDEV_MBOX_CTRL_DOORBELL,
>  	       cxlm->regs.mbox + CXLDEV_MBOX_CTRL_OFFSET);
>  
> @@ -362,7 +362,7 @@ static int __cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm,
>  		FIELD_GET(CXLDEV_MBOX_STATUS_RET_CODE_MASK, status_reg);
>  
>  	if (mbox_cmd->return_code != 0) {
> -		dev_dbg(&cxlm->pdev->dev, "Mailbox operation had an error\n");
> +		dev_dbg(dev, "Mailbox operation had an error\n");
>  		return 0;
>  	}
>  
> @@ -399,7 +399,7 @@ static int __cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm,
>   */
>  static int cxl_mem_mbox_get(struct cxl_mem *cxlm)
>  {
> -	struct device *dev = &cxlm->pdev->dev;
> +	struct device *dev = cxlm->dev;
>  	u64 md_status;
>  	int rc;
>  
> @@ -502,7 +502,7 @@ static int handle_mailbox_cmd_from_user(struct cxl_mem *cxlm,
>  					u64 in_payload, u64 out_payload,
>  					s32 *size_out, u32 *retval)
>  {
> -	struct device *dev = &cxlm->pdev->dev;
> +	struct device *dev = cxlm->dev;
>  	struct mbox_cmd mbox_cmd = {
>  		.opcode = cmd->opcode,
>  		.size_in = cmd->info.size_in,
> @@ -925,20 +925,19 @@ static int cxl_mem_setup_mailbox(struct cxl_mem *cxlm)
>  	 */
>  	cxlm->payload_size = min_t(size_t, cxlm->payload_size, SZ_1M);
>  	if (cxlm->payload_size < 256) {
> -		dev_err(&cxlm->pdev->dev, "Mailbox is too small (%zub)",
> +		dev_err(cxlm->dev, "Mailbox is too small (%zub)",
>  			cxlm->payload_size);
>  		return -ENXIO;
>  	}
>  
> -	dev_dbg(&cxlm->pdev->dev, "Mailbox payload sized %zu",
> +	dev_dbg(cxlm->dev, "Mailbox payload sized %zu",
>  		cxlm->payload_size);
>  
>  	return 0;
>  }
>  
> -static struct cxl_mem *cxl_mem_create(struct pci_dev *pdev)
> +static struct cxl_mem *cxl_mem_create(struct device *dev)
>  {
> -	struct device *dev = &pdev->dev;
>  	struct cxl_mem *cxlm;
>  
>  	cxlm = devm_kzalloc(dev, sizeof(*cxlm), GFP_KERNEL);
> @@ -948,7 +947,7 @@ static struct cxl_mem *cxl_mem_create(struct pci_dev *pdev)
>  	}
>  
>  	mutex_init(&cxlm->mbox_mutex);
> -	cxlm->pdev = pdev;
> +	cxlm->dev = dev;
>  	cxlm->enabled_cmds =
>  		devm_kmalloc_array(dev, BITS_TO_LONGS(cxl_cmd_count),
>  				   sizeof(unsigned long),
> @@ -964,9 +963,9 @@ static struct cxl_mem *cxl_mem_create(struct pci_dev *pdev)
>  static void __iomem *cxl_mem_map_regblock(struct cxl_mem *cxlm,
>  					  u8 bar, u64 offset)
>  {
> -	struct pci_dev *pdev = cxlm->pdev;
> -	struct device *dev = &pdev->dev;
>  	void __iomem *addr;
> +	struct device *dev = cxlm->dev;
> +	struct pci_dev *pdev = to_pci_dev(dev);
>  
>  	/* Basic sanity check that BAR is big enough */
>  	if (pci_resource_len(pdev, bar) < offset) {
> @@ -989,7 +988,7 @@ static void __iomem *cxl_mem_map_regblock(struct cxl_mem *cxlm,
>  
>  static void cxl_mem_unmap_regblock(struct cxl_mem *cxlm, void __iomem *base)
>  {
> -	pci_iounmap(cxlm->pdev, base);
> +	pci_iounmap(to_pci_dev(cxlm->dev), base);
>  }
>  
>  static int cxl_mem_dvsec(struct pci_dev *pdev, int dvsec)
> @@ -1018,10 +1017,9 @@ static int cxl_mem_dvsec(struct pci_dev *pdev, int dvsec)
>  static int cxl_probe_regs(struct cxl_mem *cxlm, void __iomem *base,
>  			  struct cxl_register_map *map)
>  {
> -	struct pci_dev *pdev = cxlm->pdev;
> -	struct device *dev = &pdev->dev;
>  	struct cxl_component_reg_map *comp_map;
>  	struct cxl_device_reg_map *dev_map;
> +	struct device *dev = cxlm->dev;
>  
>  	switch (map->reg_type) {
>  	case CXL_REGLOC_RBI_COMPONENT:
> @@ -1057,8 +1055,8 @@ static int cxl_probe_regs(struct cxl_mem *cxlm, void __iomem *base,
>  
>  static int cxl_map_regs(struct cxl_mem *cxlm, struct cxl_register_map *map)
>  {
> -	struct pci_dev *pdev = cxlm->pdev;
> -	struct device *dev = &pdev->dev;
> +	struct device *dev = cxlm->dev;
> +	struct pci_dev *pdev = to_pci_dev(dev);
>  
>  	switch (map->reg_type) {
>  	case CXL_REGLOC_RBI_COMPONENT:
> @@ -1096,13 +1094,12 @@ static void cxl_decode_register_block(u32 reg_lo, u32 reg_hi,
>   */
>  static int cxl_mem_setup_regs(struct cxl_mem *cxlm)
>  {
> -	struct pci_dev *pdev = cxlm->pdev;
> -	struct device *dev = &pdev->dev;
> -	u32 regloc_size, regblocks;
>  	void __iomem *base;
> -	int regloc, i, n_maps;
> +	u32 regloc_size, regblocks;
> +	int regloc, i, n_maps, ret = 0;
> +	struct device *dev = cxlm->dev;
> +	struct pci_dev *pdev = to_pci_dev(dev);
>  	struct cxl_register_map *map, maps[CXL_REGLOC_RBI_TYPES];
> -	int ret = 0;
>  
>  	regloc = cxl_mem_dvsec(pdev, PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID);
>  	if (!regloc) {
> @@ -1226,7 +1223,7 @@ static void cxl_walk_cel(struct cxl_mem *cxlm, size_t size, u8 *cel)
>  		struct cxl_mem_command *cmd = cxl_mem_find_command(opcode);
>  
>  		if (!cmd) {
> -			dev_dbg(&cxlm->pdev->dev,
> +			dev_dbg(cxlm->dev,
>  				"Opcode 0x%04x unsupported by driver", opcode);
>  			continue;
>  		}
> @@ -1323,7 +1320,7 @@ static int cxl_mem_get_partition_info(struct cxl_mem *cxlm,
>  static int cxl_mem_enumerate_cmds(struct cxl_mem *cxlm)
>  {
>  	struct cxl_mbox_get_supported_logs *gsl;
> -	struct device *dev = &cxlm->pdev->dev;
> +	struct device *dev = cxlm->dev;
>  	struct cxl_mem_command *cmd;
>  	int i, rc;
>  
> @@ -1418,15 +1415,14 @@ static int cxl_mem_identify(struct cxl_mem *cxlm)
>  	cxlm->partition_align_bytes = le64_to_cpu(id.partition_align);
>  	cxlm->partition_align_bytes *= CXL_CAPACITY_MULTIPLIER;
>  
> -	dev_dbg(&cxlm->pdev->dev, "Identify Memory Device\n"
> +	dev_dbg(cxlm->dev,
> +		"Identify Memory Device\n"
>  		"     total_bytes = %#llx\n"
>  		"     volatile_only_bytes = %#llx\n"
>  		"     persistent_only_bytes = %#llx\n"
>  		"     partition_align_bytes = %#llx\n",
> -			cxlm->total_bytes,
> -			cxlm->volatile_only_bytes,
> -			cxlm->persistent_only_bytes,
> -			cxlm->partition_align_bytes);
> +		cxlm->total_bytes, cxlm->volatile_only_bytes,
> +		cxlm->persistent_only_bytes, cxlm->partition_align_bytes);
>  
>  	cxlm->lsa_size = le32_to_cpu(id.lsa_size);
>  	memcpy(cxlm->firmware_version, id.fw_revision, sizeof(id.fw_revision));
> @@ -1453,19 +1449,18 @@ static int cxl_mem_create_range_info(struct cxl_mem *cxlm)
>  					&cxlm->next_volatile_bytes,
>  					&cxlm->next_persistent_bytes);
>  	if (rc < 0) {
> -		dev_err(&cxlm->pdev->dev, "Failed to query partition information\n");
> +		dev_err(cxlm->dev, "Failed to query partition information\n");
>  		return rc;
>  	}
>  
> -	dev_dbg(&cxlm->pdev->dev, "Get Partition Info\n"
> +	dev_dbg(cxlm->dev,
> +		"Get Partition Info\n"
>  		"     active_volatile_bytes = %#llx\n"
>  		"     active_persistent_bytes = %#llx\n"
>  		"     next_volatile_bytes = %#llx\n"
>  		"     next_persistent_bytes = %#llx\n",
> -			cxlm->active_volatile_bytes,
> -			cxlm->active_persistent_bytes,
> -			cxlm->next_volatile_bytes,
> -			cxlm->next_persistent_bytes);
> +		cxlm->active_volatile_bytes, cxlm->active_persistent_bytes,
> +		cxlm->next_volatile_bytes, cxlm->next_persistent_bytes);
>  
>  	cxlm->ram_range.start = 0;
>  	cxlm->ram_range.end = cxlm->active_volatile_bytes - 1;
> @@ -1487,7 +1482,7 @@ static int cxl_mem_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (rc)
>  		return rc;
>  
> -	cxlm = cxl_mem_create(pdev);
> +	cxlm = cxl_mem_create(&pdev->dev);
>  	if (IS_ERR(cxlm))
>  		return PTR_ERR(cxlm);
>  
> @@ -1511,7 +1506,7 @@ static int cxl_mem_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (rc)
>  		return rc;
>  
> -	cxlmd = devm_cxl_add_memdev(&pdev->dev, cxlm, &cxl_memdev_fops);
> +	cxlmd = devm_cxl_add_memdev(cxlm, &cxl_memdev_fops);
>  	if (IS_ERR(cxlmd))
>  		return PTR_ERR(cxlmd);
>  
> 
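
One note on the pattern in this conversion: to_pci_dev() is plain
container_of() arithmetic (roughly as below), so it is only valid where
cxlm->dev is known to be the struct device embedded in a pci_dev; that is
why only the PCI-specific paths use it, while a unit-test provider can
supply some other device type.

    /* per include/linux/pci.h */
    #define to_pci_dev(n) container_of(n, struct pci_dev, dev)

    /* safe only in cxl_pci, where cxlm->dev == &pdev->dev */
    struct pci_dev *pdev = to_pci_dev(cxlm->dev);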


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v4 10/21] cxl/pci: Drop idr.h
  2021-09-09 16:34   ` Ben Widawsky
@ 2021-09-10  8:46     ` Jonathan Cameron
  0 siblings, 0 replies; 78+ messages in thread
From: Jonathan Cameron @ 2021-09-10  8:46 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: Dan Williams, linux-cxl, vishal.l.verma, nvdimm,
	alison.schofield, ira.weiny

On Thu, 9 Sep 2021 09:34:57 -0700
Ben Widawsky <ben.widawsky@intel.com> wrote:

> On 21-09-08 22:12:26, Dan Williams wrote:
> > Commit 3d135db51024 ("cxl/core: Move memdev management to core") left
> > this straggling include for cxl_memdev setup. Clean it up.
> > 
> > Cc: Ben Widawsky <ben.widawsky@intel.com>
> > Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>  
> Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Thanks for breaking this out.

J
> 
> > ---
> >  drivers/cxl/pci.c |    1 -
> >  1 file changed, 1 deletion(-)
> > 
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index e2f27671c6b2..9d8050fdd69c 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -8,7 +8,6 @@
> >  #include <linux/mutex.h>
> >  #include <linux/list.h>
> >  #include <linux/cdev.h>
> > -#include <linux/idr.h>
> >  #include <linux/pci.h>
> >  #include <linux/io.h>
> >  #include <linux/io-64-nonatomic-lo-hi.h>
> >   


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v4 08/21] cxl/pci: Clean up cxl_mem_get_partition_info()
  2021-09-09 21:05       ` Ben Widawsky
  2021-09-09 21:10         ` Dan Williams
@ 2021-09-10  8:56         ` Jonathan Cameron
  1 sibling, 0 replies; 78+ messages in thread
From: Jonathan Cameron @ 2021-09-10  8:56 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: Dan Williams, linux-cxl, Ira Weiny, Vishal L Verma, Linux NVDIMM,
	Schofield, Alison

On Thu, 9 Sep 2021 14:05:27 -0700
Ben Widawsky <ben.widawsky@intel.com> wrote:

> On 21-09-09 11:06:53, Dan Williams wrote:
> > On Thu, Sep 9, 2021 at 9:20 AM Ben Widawsky <ben.widawsky@intel.com> wrote:  
> > >
> > > On 21-09-08 22:12:15, Dan Williams wrote:  
[..]
> > > > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > > > index d5334df83fb2..c6fce966084a 100644
> > > > --- a/drivers/cxl/cxlmem.h
> > > > +++ b/drivers/cxl/cxlmem.h
> > > > @@ -78,8 +78,19 @@ devm_cxl_add_memdev(struct cxl_mem *cxlm,
> > > >   * @mbox_mutex: Mutex to synchronize mailbox access.
> > > >   * @firmware_version: Firmware version for the memory device.
> > > >   * @enabled_cmds: Hardware commands found enabled in CEL.
> > > > - * @pmem_range: Persistent memory capacity information.
> > > > - * @ram_range: Volatile memory capacity information.
> > > > + * @pmem_range: Active Persistent memory capacity configuration
> > > > + * @ram_range: Active Volatile memory capacity configuration
> > > > + * @total_bytes: sum of all possible capacities
> > > > + * @volatile_only_bytes: hard volatile capacity
> > > > + * @persistent_only_bytes: hard persistent capacity
> > > > + * @partition_align_bytes: soft setting for configurable capacity  
> 
> see below... How about:
> "alignment size for partition-able capacity"

With that change or something similar.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Renaming functions as discussed looks sensible to me as well in the longer term.

> 
> > > > + * @active_volatile_bytes: sum of hard + soft volatile
> > > > + * @active_persistent_bytes: sum of hard + soft persistent  
> > >
> > > Looking at this now, probably makes sense to create some helper macros or inline
> > > functions to calculate these as needed, rather than storing them in the
> > > structure.  
> > 
> > Perhaps, I would need to look deeper into what is worth caching vs
> > what is suitable to be recalculated. Do you have a proposal here?  
> 
> I think it's worth being considered... however, this suggestion was my mistake.
> I thought there was a way to infer the active capacities just from identify, but
> there isn't. This leads me to request an update to the kdoc above which was how
> I got confused.
> 
> > 
> >   
> > >  
> > > > + * @next_volatile_bytes: volatile capacity change pending device reset
> > > > + * @next_persistent_bytes: persistent capacity change pending device reset
> > > > + *
> > > > + * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
> > > > + * details on capacity parameters.
> > > >   */
> > > >  struct cxl_mem {
> > > >       struct device *dev;
> > > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > > index c1e1d12e24b6..8077d907e7d3 100644
> > > > --- a/drivers/cxl/pci.c
> > > > +++ b/drivers/cxl/pci.c
> > > > @@ -1262,11 +1262,7 @@ static struct cxl_mbox_get_supported_logs *cxl_get_gsl(struct cxl_mem *cxlm)
> > > >
> > > >  /**
> > > >   * cxl_mem_get_partition_info - Get partition info
> > > > - * @cxlm: The device to act on
> > > > - * @active_volatile_bytes: returned active volatile capacity
> > > > - * @active_persistent_bytes: returned active persistent capacity
> > > > - * @next_volatile_bytes: return next volatile capacity
> > > > - * @next_persistent_bytes: return next persistent capacity
> > > > + * @cxlm: cxl_mem instance to update partition info
> > > >   *
> > > >   * Retrieve the current partition info for the device specified.  If not 0, the
> > > >   * 'next' values are pending and take effect on next cold reset.
> > > > @@ -1275,11 +1271,7 @@ static struct cxl_mbox_get_supported_logs *cxl_get_gsl(struct cxl_mem *cxlm)
> > > >   *
> > > >   * See CXL @8.2.9.5.2.1 Get Partition Info
> > > >   */
> > > > -static int cxl_mem_get_partition_info(struct cxl_mem *cxlm,
> > > > -                                   u64 *active_volatile_bytes,
> > > > -                                   u64 *active_persistent_bytes,
> > > > -                                   u64 *next_volatile_bytes,
> > > > -                                   u64 *next_persistent_bytes)
> > > > +static int cxl_mem_get_partition_info(struct cxl_mem *cxlm)
> > > >  {
> > > >       struct cxl_mbox_get_partition_info {
> > > >               __le64 active_volatile_cap;
> > > > @@ -1294,15 +1286,14 @@ static int cxl_mem_get_partition_info(struct cxl_mem *cxlm,
> > > >       if (rc)
> > > >               return rc;
> > > >
> > > > -     *active_volatile_bytes = le64_to_cpu(pi.active_volatile_cap);
> > > > -     *active_persistent_bytes = le64_to_cpu(pi.active_persistent_cap);
> > > > -     *next_volatile_bytes = le64_to_cpu(pi.next_volatile_cap);
> > > > -     *next_persistent_bytes = le64_to_cpu(pi.next_volatile_cap);
> > > > -
> > > > -     *active_volatile_bytes *= CXL_CAPACITY_MULTIPLIER;
> > > > -     *active_persistent_bytes *= CXL_CAPACITY_MULTIPLIER;
> > > > -     *next_volatile_bytes *= CXL_CAPACITY_MULTIPLIER;
> > > > -     *next_persistent_bytes *= CXL_CAPACITY_MULTIPLIER;
> > > > +     cxlm->active_volatile_bytes =
> > > > +             le64_to_cpu(pi.active_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
> > > > +     cxlm->active_persistent_bytes =
> > > > +             le64_to_cpu(pi.active_persistent_cap) * CXL_CAPACITY_MULTIPLIER;
> > > > +     cxlm->next_volatile_bytes =
> > > > +             le64_to_cpu(pi.next_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
> > > > +     cxlm->next_persistent_bytes =
> > > > +             le64_to_cpu(pi.next_persistent_cap) * CXL_CAPACITY_MULTIPLIER;  
> > >
> > > Personally, I prefer the more functional style implementation. I guess if you
> > > wanted to make the change, my preference would be to kill
> > > cxl_mem_get_partition_info() entirely. Up to you though...  
> > 
> > I was bringing this function in line with the precedent we already set
> > with cxl_mem_identify() that caches the result in @cxlm. Are you
> > saying you want to change that style too?
> > 
> > I feel like caching device attributes is idiomatic. Look at all the
> > PCI attributes that are cached in "struct pci_device" that could be
> > re-read or re-calculated rather than cached. In general I think a
> > routine that returns 4 values is better off filling in a structure.  
> 
> Caching is totally fine, I was just suggesting keeping the side effects out of
> the innocuous sounding cxl_mem_get_partition_info(). I think I originally
> authored cxl_mem_identify(), so mea culpa. I only realized after seeing the
> removal that I liked Ira's functional style.
> 
> Perhaps simply renaming these functions is the right solution (i.e. save it
> for a rainy day). Populate is not my favorite verb, but as an example:
> static int cxl_mem_populate_partition_info
> static int cxl_mem_populate_identify
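
[ Editor's note: a minimal sketch of the suggested rename -- the "populate"
  names are hypothetical, taken from the discussion above, not code in this
  series. Same bodies as today; the verb just makes the write into @cxlm
  explicit at the call site: ]

    /* fills cxlm->active_*_bytes and cxlm->next_*_bytes via GET_PARTITION_INFO */
    static int cxl_mem_populate_partition_info(struct cxl_mem *cxlm);

    /* fills cxlm->total_bytes, firmware_version, etc. via IDENTIFY */
    static int cxl_mem_populate_identify(struct cxl_mem *cxlm);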



* Re: [PATCH v4 09/21] cxl/mbox: Introduce the mbox_send operation
  2021-09-09  5:12 ` [PATCH v4 09/21] cxl/mbox: Introduce the mbox_send operation Dan Williams
  2021-09-09 16:34   ` Ben Widawsky
@ 2021-09-10  8:58   ` Jonathan Cameron
  1 sibling, 0 replies; 78+ messages in thread
From: Jonathan Cameron @ 2021-09-10  8:58 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Ben Widawsky, vishal.l.verma, nvdimm,
	alison.schofield, ira.weiny

On Wed, 8 Sep 2021 22:12:21 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> In preparation for implementing a unit test backend transport for ioctl
> operations, and making the mailbox available to the cxl/pmem
> infrastructure, move the existing PCI specific portion of mailbox handling
> to an "mbox_send" operation.
> 
> With this split all the PCI-specific transport details are comprehended
> by a single operation and the rest of the mailbox infrastructure is
> 'struct cxl_mem' and 'struct cxl_memdev' generic.
> 
> Acked-by: Ben Widawsky <ben.widawsky@intel.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Looks good.  Not sure why I didn't give a provisional tag for this one
in v4 given the trivial nature of my only review comment.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
>  drivers/cxl/cxlmem.h |   42 ++++++++++++++++++++++++++++
>  drivers/cxl/pci.c    |   76 ++++++++++++++------------------------------------
>  2 files changed, 63 insertions(+), 55 deletions(-)
> 
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index c6fce966084a..9be5e26c5b48 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -66,6 +66,45 @@ struct cxl_memdev *
>  devm_cxl_add_memdev(struct cxl_mem *cxlm,
>  		    const struct cdevm_file_operations *cdevm_fops);
>  
> +/**
> + * struct cxl_mbox_cmd - A command to be submitted to hardware.
> + * @opcode: (input) The command set and command submitted to hardware.
> + * @payload_in: (input) Pointer to the input payload.
> + * @payload_out: (output) Pointer to the output payload. Must be allocated by
> + *		 the caller.
> + * @size_in: (input) Number of bytes to load from @payload_in.
> + * @size_out: (input) Max number of bytes loaded into @payload_out.
> + *            (output) Number of bytes generated by the device. For fixed-size
> + *            output commands this is always expected to be deterministic. For
> + *            variable sized output commands, it tells the exact number of bytes
> + *            written.
> + * @return_code: (output) Error code returned from hardware.
> + *
> + * This is the primary mechanism used to send commands to the hardware.
> + * All the fields except @payload_* correspond exactly to the fields described
> + * in the Command Register section of CXL 2.0 8.2.8.4.5. @payload_in and
> + * @payload_out are written to, and read from the Command Payload Registers
> + * defined in CXL 2.0 8.2.8.4.8.
> + */
> +struct cxl_mbox_cmd {
> +	u16 opcode;
> +	void *payload_in;
> +	void *payload_out;
> +	size_t size_in;
> +	size_t size_out;
> +	u16 return_code;
> +#define CXL_MBOX_SUCCESS 0
> +};
> +
> +/*
> + * CXL 2.0 - Memory capacity multiplier
> + * See Section 8.2.9.5
> + *
> + * Volatile, Persistent, and Partition capacities are specified to be in
> + * multiples of 256MB - define a multiplier to convert to/from bytes.
> + */
> +#define CXL_CAPACITY_MULTIPLIER SZ_256M
> +
>  /**
>   * struct cxl_mem - A CXL memory device
>   * @dev: The device associated with this CXL device.
> @@ -88,6 +127,7 @@ devm_cxl_add_memdev(struct cxl_mem *cxlm,
>   * @active_persistent_bytes: sum of hard + soft persistent
>   * @next_volatile_bytes: volatile capacity change pending device reset
>   * @next_persistent_bytes: persistent capacity change pending device reset
> + * @mbox_send: @dev specific transport for transmitting mailbox commands
>   *
>   * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
>   * details on capacity parameters.
> @@ -115,5 +155,7 @@ struct cxl_mem {
>  	u64 active_persistent_bytes;
>  	u64 next_volatile_bytes;
>  	u64 next_persistent_bytes;
> +
> +	int (*mbox_send)(struct cxl_mem *cxlm, struct cxl_mbox_cmd *cmd);
>  };
>  #endif /* __CXL_MEM_H__ */
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 8077d907e7d3..e2f27671c6b2 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -64,45 +64,6 @@ enum opcode {
>  	CXL_MBOX_OP_MAX			= 0x10000
>  };
>  
> -/*
> - * CXL 2.0 - Memory capacity multiplier
> - * See Section 8.2.9.5
> - *
> - * Volatile, Persistent, and Partition capacities are specified to be in
> - * multiples of 256MB - define a multiplier to convert to/from bytes.
> - */
> -#define CXL_CAPACITY_MULTIPLIER SZ_256M
> -
> -/**
> - * struct mbox_cmd - A command to be submitted to hardware.
> - * @opcode: (input) The command set and command submitted to hardware.
> - * @payload_in: (input) Pointer to the input payload.
> - * @payload_out: (output) Pointer to the output payload. Must be allocated by
> - *		 the caller.
> - * @size_in: (input) Number of bytes to load from @payload_in.
> - * @size_out: (input) Max number of bytes loaded into @payload_out.
> - *            (output) Number of bytes generated by the device. For fixed size
> - *            outputs commands this is always expected to be deterministic. For
> - *            variable sized output commands, it tells the exact number of bytes
> - *            written.
> - * @return_code: (output) Error code returned from hardware.
> - *
> - * This is the primary mechanism used to send commands to the hardware.
> - * All the fields except @payload_* correspond exactly to the fields described in
> - * Command Register section of the CXL 2.0 8.2.8.4.5. @payload_in and
> - * @payload_out are written to, and read from the Command Payload Registers
> - * defined in CXL 2.0 8.2.8.4.8.
> - */
> -struct mbox_cmd {
> -	u16 opcode;
> -	void *payload_in;
> -	void *payload_out;
> -	size_t size_in;
> -	size_t size_out;
> -	u16 return_code;
> -#define CXL_MBOX_SUCCESS 0
> -};
> -
>  static DECLARE_RWSEM(cxl_memdev_rwsem);
>  static struct dentry *cxl_debugfs;
>  static bool cxl_raw_allow_all;
> @@ -266,7 +227,7 @@ static bool cxl_is_security_command(u16 opcode)
>  }
>  
>  static void cxl_mem_mbox_timeout(struct cxl_mem *cxlm,
> -				 struct mbox_cmd *mbox_cmd)
> +				 struct cxl_mbox_cmd *mbox_cmd)
>  {
>  	struct device *dev = cxlm->dev;
>  
> @@ -297,7 +258,7 @@ static void cxl_mem_mbox_timeout(struct cxl_mem *cxlm,
>   * mailbox.
>   */
>  static int __cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm,
> -				   struct mbox_cmd *mbox_cmd)
> +				   struct cxl_mbox_cmd *mbox_cmd)
>  {
>  	void __iomem *payload = cxlm->regs.mbox + CXLDEV_MBOX_PAYLOAD_OFFSET;
>  	struct device *dev = cxlm->dev;
> @@ -472,6 +433,20 @@ static void cxl_mem_mbox_put(struct cxl_mem *cxlm)
>  	mutex_unlock(&cxlm->mbox_mutex);
>  }
>  
> +static int cxl_pci_mbox_send(struct cxl_mem *cxlm, struct cxl_mbox_cmd *cmd)
> +{
> +	int rc;
> +
> +	rc = cxl_mem_mbox_get(cxlm);
> +	if (rc)
> +		return rc;
> +
> +	rc = __cxl_mem_mbox_send_cmd(cxlm, cmd);
> +	cxl_mem_mbox_put(cxlm);
> +
> +	return rc;
> +}
> +
>  /**
>   * handle_mailbox_cmd_from_user() - Dispatch a mailbox command for userspace.
>   * @cxlm: The CXL memory device to communicate with.
> @@ -503,7 +478,7 @@ static int handle_mailbox_cmd_from_user(struct cxl_mem *cxlm,
>  					s32 *size_out, u32 *retval)
>  {
>  	struct device *dev = cxlm->dev;
> -	struct mbox_cmd mbox_cmd = {
> +	struct cxl_mbox_cmd mbox_cmd = {
>  		.opcode = cmd->opcode,
>  		.size_in = cmd->info.size_in,
>  		.size_out = cmd->info.size_out,
> @@ -525,10 +500,6 @@ static int handle_mailbox_cmd_from_user(struct cxl_mem *cxlm,
>  		}
>  	}
>  
> -	rc = cxl_mem_mbox_get(cxlm);
> -	if (rc)
> -		goto out;
> -
>  	dev_dbg(dev,
>  		"Submitting %s command for user\n"
>  		"\topcode: %x\n"
> @@ -539,8 +510,7 @@ static int handle_mailbox_cmd_from_user(struct cxl_mem *cxlm,
>  	dev_WARN_ONCE(dev, cmd->info.id == CXL_MEM_COMMAND_ID_RAW,
>  		      "raw command path used\n");
>  
> -	rc = __cxl_mem_mbox_send_cmd(cxlm, &mbox_cmd);
> -	cxl_mem_mbox_put(cxlm);
> +	rc = cxlm->mbox_send(cxlm, &mbox_cmd);
>  	if (rc)
>  		goto out;
>  
> @@ -874,7 +844,7 @@ static int cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm, u16 opcode,
>  				 void *out, size_t out_size)
>  {
>  	const struct cxl_mem_command *cmd = cxl_mem_find_command(opcode);
> -	struct mbox_cmd mbox_cmd = {
> +	struct cxl_mbox_cmd mbox_cmd = {
>  		.opcode = opcode,
>  		.payload_in = in,
>  		.size_in = in_size,
> @@ -886,12 +856,7 @@ static int cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm, u16 opcode,
>  	if (out_size > cxlm->payload_size)
>  		return -E2BIG;
>  
> -	rc = cxl_mem_mbox_get(cxlm);
> -	if (rc)
> -		return rc;
> -
> -	rc = __cxl_mem_mbox_send_cmd(cxlm, &mbox_cmd);
> -	cxl_mem_mbox_put(cxlm);
> +	rc = cxlm->mbox_send(cxlm, &mbox_cmd);
>  	if (rc)
>  		return rc;
>  
> @@ -913,6 +878,7 @@ static int cxl_mem_setup_mailbox(struct cxl_mem *cxlm)
>  {
>  	const int cap = readl(cxlm->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
>  
> +	cxlm->mbox_send = cxl_pci_mbox_send;
>  	cxlm->payload_size =
>  		1 << FIELD_GET(CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK, cap);
>  
> 
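
[ Editor's note: the value of the mbox_send indirection is that any transport
  can now back the mailbox. A hypothetical unit-test transport is sketched
  below -- illustrative only, not the actual cxl_test code later in this
  series; 'mock_id' is an assumed static buffer holding a canned IDENTIFY
  output payload: ]

    /* sketch: answer IDENTIFY from canned data, reject everything else */
    static int mock_mbox_send(struct cxl_mem *cxlm, struct cxl_mbox_cmd *cmd)
    {
            if (cmd->opcode != CXL_MBOX_OP_IDENTIFY)
                    return -EINVAL;
            if (cmd->size_out < sizeof(mock_id))
                    return -EINVAL;
            memcpy(cmd->payload_out, &mock_id, sizeof(mock_id));
            cmd->size_out = sizeof(mock_id);
            cmd->return_code = CXL_MBOX_SUCCESS;
            return 0;
    }

    /* installed at device setup time in place of cxl_pci_mbox_send() */
    cxlm->mbox_send = mock_mbox_send;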



* Re: [PATCH v4 11/21] cxl/mbox: Move mailbox and other non-PCI specific infrastructure to the core
  2021-09-09  5:12 ` [PATCH v4 11/21] cxl/mbox: Move mailbox and other non-PCI specific infrastructure to the core Dan Williams
  2021-09-09 16:41   ` Ben Widawsky
@ 2021-09-10  9:13   ` Jonathan Cameron
  1 sibling, 0 replies; 78+ messages in thread
From: Jonathan Cameron @ 2021-09-10  9:13 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Ben Widawsky, kernel test robot, vishal.l.verma,
	nvdimm, alison.schofield, ira.weiny

On Wed, 8 Sep 2021 22:12:32 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> Now that the internals of mailbox operations are abstracted from the PCI
> specifics a bulk of infrastructure can move to the core.
> 
> The CXL_PMEM driver intends to proxy LIBNVDIMM UAPI and driver requests
> to the equivalent functionality provided by the CXL hardware mailbox
> interface. In support of that intent move the mailbox implementation to
> a shared location for the CXL_PCI driver native IOCTL path and CXL_PMEM
> nvdimm command proxy path to share.
> 
> A unit test framework seeks to implement a unit test backend transport
> for mailbox commands to communicate mocked up payloads. It can reuse all
> of the mailbox infrastructure minus the PCI specifics, so that also gets
> moved to the core.
> 
> Finally with the mailbox infrastructure and ioctl handling being
> transport generic there is no longer any need to pass
> file_operations to devm_cxl_add_memdev(). That allows all the ioctl
> boilerplate to move into the core for unit test reuse.
> 
> No functional change intended, just code movement.
> 
> Acked-by: Ben Widawsky <ben.widawsky@intel.com>
> Reported-by: kernel test robot <lkp@intel.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
>  Documentation/driver-api/cxl/memory-devices.rst |    3 
>  drivers/cxl/core/Makefile                       |    1 
>  drivers/cxl/core/bus.c                          |    4 
>  drivers/cxl/core/core.h                         |    8 
>  drivers/cxl/core/mbox.c                         |  825 +++++++++++++++++++++
>  drivers/cxl/core/memdev.c                       |   81 ++
>  drivers/cxl/cxlmem.h                            |   78 +-
>  drivers/cxl/pci.c                               |  924 -----------------------
>  8 files changed, 975 insertions(+), 949 deletions(-)
>  create mode 100644 drivers/cxl/core/mbox.c
> 
> diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst
> index 50ebcda17ad0..4624951b5c38 100644
> --- a/Documentation/driver-api/cxl/memory-devices.rst
> +++ b/Documentation/driver-api/cxl/memory-devices.rst
> @@ -45,6 +45,9 @@ CXL Core
>  .. kernel-doc:: drivers/cxl/core/regs.c
>     :doc: cxl registers
>  
> +.. kernel-doc:: drivers/cxl/core/mbox.c
> +   :doc: cxl mbox
> +
>  External Interfaces
>  ===================
>  
> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> index 0fdbf3c6ac1a..07eb8e1fb8a6 100644
> --- a/drivers/cxl/core/Makefile
> +++ b/drivers/cxl/core/Makefile
> @@ -6,3 +6,4 @@ cxl_core-y := bus.o
>  cxl_core-y += pmem.o
>  cxl_core-y += regs.o
>  cxl_core-y += memdev.o
> +cxl_core-y += mbox.o
> diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> index 37b87adaa33f..8073354ba232 100644
> --- a/drivers/cxl/core/bus.c
> +++ b/drivers/cxl/core/bus.c
> @@ -636,6 +636,8 @@ static __init int cxl_core_init(void)
>  {
>  	int rc;
>  
> +	cxl_mbox_init();
> +
>  	rc = cxl_memdev_init();
>  	if (rc)
>  		return rc;
> @@ -647,6 +649,7 @@ static __init int cxl_core_init(void)
>  
>  err:
>  	cxl_memdev_exit();
> +	cxl_mbox_exit();
>  	return rc;
>  }
>  
> @@ -654,6 +657,7 @@ static void cxl_core_exit(void)
>  {
>  	bus_unregister(&cxl_bus_type);
>  	cxl_memdev_exit();
> +	cxl_mbox_exit();
>  }
>  
>  module_init(cxl_core_init);
> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> index 036a3c8106b4..c85b7fbad02d 100644
> --- a/drivers/cxl/core/core.h
> +++ b/drivers/cxl/core/core.h
> @@ -14,7 +14,15 @@ static inline void unregister_cxl_dev(void *dev)
>  	device_unregister(dev);
>  }
>  
> +struct cxl_send_command;
> +struct cxl_mem_query_commands;
> +int cxl_query_cmd(struct cxl_memdev *cxlmd,
> +		  struct cxl_mem_query_commands __user *q);
> +int cxl_send_cmd(struct cxl_memdev *cxlmd, struct cxl_send_command __user *s);
> +
>  int cxl_memdev_init(void);
>  void cxl_memdev_exit(void);
> +void cxl_mbox_init(void);
> +void cxl_mbox_exit(void);
>  
>  #endif /* __CXL_CORE_H__ */
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> new file mode 100644
> index 000000000000..31e183991726
> --- /dev/null
> +++ b/drivers/cxl/core/mbox.c
> @@ -0,0 +1,825 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2020 Intel Corporation. All rights reserved. */
> +#include <linux/io-64-nonatomic-lo-hi.h>
> +#include <linux/security.h>
> +#include <linux/debugfs.h>
> +#include <linux/mutex.h>
> +#include <cxlmem.h>
> +#include <cxl.h>
> +
> +#include "core.h"
> +
> +static bool cxl_raw_allow_all;
> +
> +/**
> + * DOC: cxl mbox
> + *
> + * Core implementation of the CXL 2.0 Type-3 Memory Device Mailbox. The
> + * implementation is used by the cxl_pci driver to initialize the device
> + * and implement the cxl_mem.h IOCTL UAPI. It also implements the
> + * backend of the cxl_pmem_ctl() transport for LIBNVDIMM.
> + */
> +
> +#define cxl_for_each_cmd(cmd)                                                  \
> +	for ((cmd) = &cxl_mem_commands[0];                                     \
> +	     ((cmd) - cxl_mem_commands) < ARRAY_SIZE(cxl_mem_commands); (cmd)++)
> +
> +#define CXL_CMD(_id, sin, sout, _flags)                                        \
> +	[CXL_MEM_COMMAND_ID_##_id] = {                                         \
> +	.info =	{                                                              \
> +			.id = CXL_MEM_COMMAND_ID_##_id,                        \
> +			.size_in = sin,                                        \
> +			.size_out = sout,                                      \
> +		},                                                             \
> +	.opcode = CXL_MBOX_OP_##_id,                                           \
> +	.flags = _flags,                                                       \
> +	}
> +
> +/*
> + * This table defines the supported mailbox commands for the driver. This table
> + * is made up of a UAPI structure. Non-negative values as parameters in the
> + * table will be validated against the user's input. For example, if size_in is
> + * 0, and the user passed in 1, it is an error.
> + */
> +static struct cxl_mem_command cxl_mem_commands[CXL_MEM_COMMAND_ID_MAX] = {
> +	CXL_CMD(IDENTIFY, 0, 0x43, CXL_CMD_FLAG_FORCE_ENABLE),
> +#ifdef CONFIG_CXL_MEM_RAW_COMMANDS
> +	CXL_CMD(RAW, ~0, ~0, 0),
> +#endif
> +	CXL_CMD(GET_SUPPORTED_LOGS, 0, ~0, CXL_CMD_FLAG_FORCE_ENABLE),
> +	CXL_CMD(GET_FW_INFO, 0, 0x50, 0),
> +	CXL_CMD(GET_PARTITION_INFO, 0, 0x20, 0),
> +	CXL_CMD(GET_LSA, 0x8, ~0, 0),
> +	CXL_CMD(GET_HEALTH_INFO, 0, 0x12, 0),
> +	CXL_CMD(GET_LOG, 0x18, ~0, CXL_CMD_FLAG_FORCE_ENABLE),
> +	CXL_CMD(SET_PARTITION_INFO, 0x0a, 0, 0),
> +	CXL_CMD(SET_LSA, ~0, 0, 0),
> +	CXL_CMD(GET_ALERT_CONFIG, 0, 0x10, 0),
> +	CXL_CMD(SET_ALERT_CONFIG, 0xc, 0, 0),
> +	CXL_CMD(GET_SHUTDOWN_STATE, 0, 0x1, 0),
> +	CXL_CMD(SET_SHUTDOWN_STATE, 0x1, 0, 0),
> +	CXL_CMD(GET_POISON, 0x10, ~0, 0),
> +	CXL_CMD(INJECT_POISON, 0x8, 0, 0),
> +	CXL_CMD(CLEAR_POISON, 0x48, 0, 0),
> +	CXL_CMD(GET_SCAN_MEDIA_CAPS, 0x10, 0x4, 0),
> +	CXL_CMD(SCAN_MEDIA, 0x11, 0, 0),
> +	CXL_CMD(GET_SCAN_MEDIA, 0, ~0, 0),
> +};
> +
> +/*
> + * Commands that RAW doesn't permit. The rationale for each:
> + *
> + * CXL_MBOX_OP_ACTIVATE_FW: Firmware activation requires adjustment /
> + * coordination of transaction timeout values at the root bridge level.
> + *
> + * CXL_MBOX_OP_SET_PARTITION_INFO: The device memory map may change live
> + * and needs to be coordinated with HDM updates.
> + *
> + * CXL_MBOX_OP_SET_LSA: The label storage area may be cached by the
> + * driver and any writes from userspace invalidates those contents.
> + *
> + * CXL_MBOX_OP_SET_SHUTDOWN_STATE: Set shutdown state assumes no writes
> + * to the device after it is marked clean, userspace can not make that
> + * assertion.
> + *
> + * CXL_MBOX_OP_[GET_]SCAN_MEDIA: The kernel provides a native error list that
> + * is kept up to date with patrol notifications and error management.
> + */
> +static u16 cxl_disabled_raw_commands[] = {
> +	CXL_MBOX_OP_ACTIVATE_FW,
> +	CXL_MBOX_OP_SET_PARTITION_INFO,
> +	CXL_MBOX_OP_SET_LSA,
> +	CXL_MBOX_OP_SET_SHUTDOWN_STATE,
> +	CXL_MBOX_OP_SCAN_MEDIA,
> +	CXL_MBOX_OP_GET_SCAN_MEDIA,
> +};
> +
> +/*
> + * Command sets that RAW doesn't permit. All opcodes in this set are
> + * disabled because they pass plain text security payloads over the
> + * user/kernel boundary. This functionality is intended to be wrapped
> + * behind the keys ABI which allows for encrypted payloads in the UAPI
> + */
> +static u8 security_command_sets[] = {
> +	0x44, /* Sanitize */
> +	0x45, /* Persistent Memory Data-at-rest Security */
> +	0x46, /* Security Passthrough */
> +};
> +
> +static bool cxl_is_security_command(u16 opcode)
> +{
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(security_command_sets); i++)
> +		if (security_command_sets[i] == (opcode >> 8))
> +			return true;
> +	return false;
> +}
> +
> +static struct cxl_mem_command *cxl_mem_find_command(u16 opcode)
> +{
> +	struct cxl_mem_command *c;
> +
> +	cxl_for_each_cmd(c)
> +		if (c->opcode == opcode)
> +			return c;
> +
> +	return NULL;
> +}
> +
> +/**
> + * cxl_mem_mbox_send_cmd() - Send a mailbox command to a memory device.
> + * @cxlm: The CXL memory device to communicate with.
> + * @opcode: Opcode for the mailbox command.
> + * @in: The input payload for the mailbox command.
> + * @in_size: The length of the input payload
> + * @out: Caller allocated buffer for the output.
> + * @out_size: Expected size of output.
> + *
> + * Context: Any context. Will acquire and release mbox_mutex.
> + * Return:
> + *  * %0	- Success.
> + *  * %-E2BIG	- Payload is too large for hardware.
> + *  * %-EBUSY	- Couldn't acquire exclusive mailbox access.
> + *  * %-EFAULT	- Hardware error occurred.
> + *  * %-ENXIO	- Command completed, but device reported an error.
> + *  * %-EIO	- Unexpected output size.
> + *
> + * Mailbox commands may execute successfully yet the device itself reported an
> + * error. While this distinction can be useful for commands from userspace, the
> + * kernel will only be able to use results when both are successful.
> + *
> + * See __cxl_mem_mbox_send_cmd()
> + */
> +int cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm, u16 opcode, void *in,
> +			  size_t in_size, void *out, size_t out_size)
> +{
> +	const struct cxl_mem_command *cmd = cxl_mem_find_command(opcode);
> +	struct cxl_mbox_cmd mbox_cmd = {
> +		.opcode = opcode,
> +		.payload_in = in,
> +		.size_in = in_size,
> +		.size_out = out_size,
> +		.payload_out = out,
> +	};
> +	int rc;
> +
> +	if (out_size > cxlm->payload_size)
> +		return -E2BIG;
> +
> +	rc = cxlm->mbox_send(cxlm, &mbox_cmd);
> +	if (rc)
> +		return rc;
> +
> +	/* TODO: Map return code to proper kernel style errno */
> +	if (mbox_cmd.return_code != CXL_MBOX_SUCCESS)
> +		return -ENXIO;
> +
> +	/*
> +	 * Variable sized commands can't be validated and so it's up to the
> +	 * caller to do that if they wish.
> +	 */
> +	if (cmd->info.size_out >= 0 && mbox_cmd.size_out != out_size)
> +		return -EIO;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(cxl_mem_mbox_send_cmd);
> +
> +static bool cxl_mem_raw_command_allowed(u16 opcode)
> +{
> +	int i;
> +
> +	if (!IS_ENABLED(CONFIG_CXL_MEM_RAW_COMMANDS))
> +		return false;
> +
> +	if (security_locked_down(LOCKDOWN_PCI_ACCESS))
> +		return false;
> +
> +	if (cxl_raw_allow_all)
> +		return true;
> +
> +	if (cxl_is_security_command(opcode))
> +		return false;
> +
> +	for (i = 0; i < ARRAY_SIZE(cxl_disabled_raw_commands); i++)
> +		if (cxl_disabled_raw_commands[i] == opcode)
> +			return false;
> +
> +	return true;
> +}
> +
> +/**
> + * cxl_validate_cmd_from_user() - Check fields for CXL_MEM_SEND_COMMAND.
> + * @cxlm: &struct cxl_mem device whose mailbox will be used.
> + * @send_cmd: &struct cxl_send_command copied in from userspace.
> + * @out_cmd: Sanitized and populated &struct cxl_mem_command.
> + *
> + * Return:
> + *  * %0	- @out_cmd is ready to send.
> + *  * %-ENOTTY	- Invalid command specified.
> + *  * %-EINVAL	- Reserved fields or invalid values were used.
> + *  * %-ENOMEM	- Input or output buffer wasn't sized properly.
> + *  * %-EPERM	- Attempted to use a protected command.
> + *
> + * The result of this command is a fully validated command in @out_cmd that is
> + * safe to send to the hardware.
> + *
> + * See handle_mailbox_cmd_from_user()
> + */
> +static int cxl_validate_cmd_from_user(struct cxl_mem *cxlm,
> +				      const struct cxl_send_command *send_cmd,
> +				      struct cxl_mem_command *out_cmd)
> +{
> +	const struct cxl_command_info *info;
> +	struct cxl_mem_command *c;
> +
> +	if (send_cmd->id == 0 || send_cmd->id >= CXL_MEM_COMMAND_ID_MAX)
> +		return -ENOTTY;
> +
> +	/*
> +	 * The user can never specify an input payload larger than what hardware
> +	 * supports, but output can be arbitrarily large (simply write out as
> +	 * much data as the hardware provides).
> +	 */
> +	if (send_cmd->in.size > cxlm->payload_size)
> +		return -EINVAL;
> +
> +	/*
> +	 * Checks are bypassed for raw commands but a WARN/taint will occur
> +	 * later in the callchain
> +	 */
> +	if (send_cmd->id == CXL_MEM_COMMAND_ID_RAW) {
> +		const struct cxl_mem_command temp = {
> +			.info = {
> +				.id = CXL_MEM_COMMAND_ID_RAW,
> +				.flags = 0,
> +				.size_in = send_cmd->in.size,
> +				.size_out = send_cmd->out.size,
> +			},
> +			.opcode = send_cmd->raw.opcode
> +		};
> +
> +		if (send_cmd->raw.rsvd)
> +			return -EINVAL;
> +
> +		/*
> +		 * Unlike supported commands, the output size of RAW commands
> +		 * gets passed along without further checking, so it must be
> +		 * validated here.
> +		 */
> +		if (send_cmd->out.size > cxlm->payload_size)
> +			return -EINVAL;
> +
> +		if (!cxl_mem_raw_command_allowed(send_cmd->raw.opcode))
> +			return -EPERM;
> +
> +		memcpy(out_cmd, &temp, sizeof(temp));
> +
> +		return 0;
> +	}
> +
> +	if (send_cmd->flags & ~CXL_MEM_COMMAND_FLAG_MASK)
> +		return -EINVAL;
> +
> +	if (send_cmd->rsvd)
> +		return -EINVAL;
> +
> +	if (send_cmd->in.rsvd || send_cmd->out.rsvd)
> +		return -EINVAL;
> +
> +	/* Convert user's command into the internal representation */
> +	c = &cxl_mem_commands[send_cmd->id];
> +	info = &c->info;
> +
> +	/* Check that the command is enabled for hardware */
> +	if (!test_bit(info->id, cxlm->enabled_cmds))
> +		return -ENOTTY;
> +
> +	/* Check the input buffer is the expected size */
> +	if (info->size_in >= 0 && info->size_in != send_cmd->in.size)
> +		return -ENOMEM;
> +
> +	/* Check the output buffer is at least large enough */
> +	if (info->size_out >= 0 && send_cmd->out.size < info->size_out)
> +		return -ENOMEM;
> +
> +	memcpy(out_cmd, c, sizeof(*c));
> +	out_cmd->info.size_in = send_cmd->in.size;
> +	/*
> +	 * XXX: out_cmd->info.size_out will be controlled by the driver, and the
> +	 * specified number of bytes @send_cmd->out.size will be copied back out
> +	 * to userspace.
> +	 */
> +
> +	return 0;
> +}
> +
> +#define cxl_cmd_count ARRAY_SIZE(cxl_mem_commands)
> +
> +int cxl_query_cmd(struct cxl_memdev *cxlmd,
> +		  struct cxl_mem_query_commands __user *q)
> +{
> +	struct device *dev = &cxlmd->dev;
> +	struct cxl_mem_command *cmd;
> +	u32 n_commands;
> +	int j = 0;
> +
> +	dev_dbg(dev, "Query IOCTL\n");
> +
> +	if (get_user(n_commands, &q->n_commands))
> +		return -EFAULT;
> +
> +	/* returns the total number if 0 elements are requested. */
> +	if (n_commands == 0)
> +		return put_user(cxl_cmd_count, &q->n_commands);
> +
> +	/*
> +	 * otherwise, return min(n_commands, total commands) cxl_command_info
> +	 * structures.
> +	 */
> +	cxl_for_each_cmd(cmd) {
> +		const struct cxl_command_info *info = &cmd->info;
> +
> +		if (copy_to_user(&q->commands[j++], info, sizeof(*info)))
> +			return -EFAULT;
> +
> +		if (j == n_commands)
> +			break;
> +	}
> +
> +	return 0;
> +}
> +
> +/**
> + * handle_mailbox_cmd_from_user() - Dispatch a mailbox command for userspace.
> + * @cxlm: The CXL memory device to communicate with.
> + * @cmd: The validated command.
> + * @in_payload: Pointer to userspace's input payload.
> + * @out_payload: Pointer to userspace's output payload.
> + * @size_out: (Input) Max payload size to copy out.
> + *            (Output) Payload size hardware generated.
> + * @retval: Hardware generated return code from the operation.
> + *
> + * Return:
> + *  * %0	- Mailbox transaction succeeded. This implies the mailbox
> + *		  protocol completed successfully, not that the operation itself
> + *		  was successful.
> + *  * %-ENOMEM  - Couldn't allocate a bounce buffer.
> + *  * %-EFAULT	- Something happened with copy_to/from_user.
> + *  * %-EINTR	- Mailbox acquisition interrupted.
> + *  * %-EXXX	- Transaction level failures.
> + *
> + * Creates the appropriate mailbox command and dispatches it on behalf of a
> + * userspace request. The input and output payloads are copied to and from
> + * userspace.
> + *
> + * See cxl_send_cmd().
> + */
> +static int handle_mailbox_cmd_from_user(struct cxl_mem *cxlm,
> +					const struct cxl_mem_command *cmd,
> +					u64 in_payload, u64 out_payload,
> +					s32 *size_out, u32 *retval)
> +{
> +	struct device *dev = cxlm->dev;
> +	struct cxl_mbox_cmd mbox_cmd = {
> +		.opcode = cmd->opcode,
> +		.size_in = cmd->info.size_in,
> +		.size_out = cmd->info.size_out,
> +	};
> +	int rc;
> +
> +	if (cmd->info.size_out) {
> +		mbox_cmd.payload_out = kvzalloc(cmd->info.size_out, GFP_KERNEL);
> +		if (!mbox_cmd.payload_out)
> +			return -ENOMEM;
> +	}
> +
> +	if (cmd->info.size_in) {
> +		mbox_cmd.payload_in = vmemdup_user(u64_to_user_ptr(in_payload),
> +						   cmd->info.size_in);
> +		if (IS_ERR(mbox_cmd.payload_in)) {
> +			kvfree(mbox_cmd.payload_out);
> +			return PTR_ERR(mbox_cmd.payload_in);
> +		}
> +	}
> +
> +	dev_dbg(dev,
> +		"Submitting %s command for user\n"
> +		"\topcode: %x\n"
> +		"\tsize: %ub\n",
> +		cxl_command_names[cmd->info.id].name, mbox_cmd.opcode,
> +		cmd->info.size_in);
> +
> +	dev_WARN_ONCE(dev, cmd->info.id == CXL_MEM_COMMAND_ID_RAW,
> +		      "raw command path used\n");
> +
> +	rc = cxlm->mbox_send(cxlm, &mbox_cmd);
> +	if (rc)
> +		goto out;
> +
> +	/*
> +	 * @size_out contains the max size that's allowed to be written back out
> +	 * to userspace. While the payload may have written more output than
> +	 * this it will have to be ignored.
> +	 */
> +	if (mbox_cmd.size_out) {
> +		dev_WARN_ONCE(dev, mbox_cmd.size_out > *size_out,
> +			      "Invalid return size\n");
> +		if (copy_to_user(u64_to_user_ptr(out_payload),
> +				 mbox_cmd.payload_out, mbox_cmd.size_out)) {
> +			rc = -EFAULT;
> +			goto out;
> +		}
> +	}
> +
> +	*size_out = mbox_cmd.size_out;
> +	*retval = mbox_cmd.return_code;
> +
> +out:
> +	kvfree(mbox_cmd.payload_in);
> +	kvfree(mbox_cmd.payload_out);
> +	return rc;
> +}
> +
> +int cxl_send_cmd(struct cxl_memdev *cxlmd, struct cxl_send_command __user *s)
> +{
> +	struct cxl_mem *cxlm = cxlmd->cxlm;
> +	struct device *dev = &cxlmd->dev;
> +	struct cxl_send_command send;
> +	struct cxl_mem_command c;
> +	int rc;
> +
> +	dev_dbg(dev, "Send IOCTL\n");
> +
> +	if (copy_from_user(&send, s, sizeof(send)))
> +		return -EFAULT;
> +
> +	rc = cxl_validate_cmd_from_user(cxlmd->cxlm, &send, &c);
> +	if (rc)
> +		return rc;
> +
> +	/* Prepare to handle a full payload for variable sized output */
> +	if (c.info.size_out < 0)
> +		c.info.size_out = cxlm->payload_size;
> +
> +	rc = handle_mailbox_cmd_from_user(cxlm, &c, send.in.payload,
> +					  send.out.payload, &send.out.size,
> +					  &send.retval);
> +	if (rc)
> +		return rc;
> +
> +	if (copy_to_user(s, &send, sizeof(send)))
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
> +static int cxl_xfer_log(struct cxl_mem *cxlm, uuid_t *uuid, u32 size, u8 *out)
> +{
> +	u32 remaining = size;
> +	u32 offset = 0;
> +
> +	while (remaining) {
> +		u32 xfer_size = min_t(u32, remaining, cxlm->payload_size);
> +		struct cxl_mbox_get_log {
> +			uuid_t uuid;
> +			__le32 offset;
> +			__le32 length;
> +		} __packed log = {
> +			.uuid = *uuid,
> +			.offset = cpu_to_le32(offset),
> +			.length = cpu_to_le32(xfer_size)
> +		};
> +		int rc;
> +
> +		rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_GET_LOG, &log,
> +					   sizeof(log), out, xfer_size);
> +		if (rc < 0)
> +			return rc;
> +
> +		out += xfer_size;
> +		remaining -= xfer_size;
> +		offset += xfer_size;
> +	}
> +
> +	return 0;
> +}
> +
> +/**
> + * cxl_walk_cel() - Walk through the Command Effects Log.
> + * @cxlm: Device.
> + * @size: Length of the Command Effects Log.
> + * @cel: CEL
> + *
> + * Iterate over each entry in the CEL and determine if the driver supports the
> + * command. If so, the command is enabled for the device and can be used later.
> + */
> +static void cxl_walk_cel(struct cxl_mem *cxlm, size_t size, u8 *cel)
> +{
> +	struct cel_entry {
> +		__le16 opcode;
> +		__le16 effect;
> +	} __packed * cel_entry;
> +	const int cel_entries = size / sizeof(*cel_entry);
> +	int i;
> +
> +	cel_entry = (struct cel_entry *)cel;
> +
> +	for (i = 0; i < cel_entries; i++) {
> +		u16 opcode = le16_to_cpu(cel_entry[i].opcode);
> +		struct cxl_mem_command *cmd = cxl_mem_find_command(opcode);
> +
> +		if (!cmd) {
> +			dev_dbg(cxlm->dev,
> +				"Opcode 0x%04x unsupported by driver", opcode);
> +			continue;
> +		}
> +
> +		set_bit(cmd->info.id, cxlm->enabled_cmds);
> +	}
> +}
> +
> +struct cxl_mbox_get_supported_logs {
> +	__le16 entries;
> +	u8 rsvd[6];
> +	struct gsl_entry {
> +		uuid_t uuid;
> +		__le32 size;
> +	} __packed entry[];
> +} __packed;
> +
> +static struct cxl_mbox_get_supported_logs *cxl_get_gsl(struct cxl_mem *cxlm)
> +{
> +	struct cxl_mbox_get_supported_logs *ret;
> +	int rc;
> +
> +	ret = kvmalloc(cxlm->payload_size, GFP_KERNEL);
> +	if (!ret)
> +		return ERR_PTR(-ENOMEM);
> +
> +	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_GET_SUPPORTED_LOGS, NULL,
> +				   0, ret, cxlm->payload_size);
> +	if (rc < 0) {
> +		kvfree(ret);
> +		return ERR_PTR(rc);
> +	}
> +
> +	return ret;
> +}
> +
> +enum {
> +	CEL_UUID,
> +	VENDOR_DEBUG_UUID,
> +};
> +
> +/* See CXL 2.0 Table 170. Get Log Input Payload */
> +static const uuid_t log_uuid[] = {
> +	[CEL_UUID] = UUID_INIT(0xda9c0b5, 0xbf41, 0x4b78, 0x8f, 0x79, 0x96,
> +			       0xb1, 0x62, 0x3b, 0x3f, 0x17),
> +	[VENDOR_DEBUG_UUID] = UUID_INIT(0xe1819d9, 0x11a9, 0x400c, 0x81, 0x1f,
> +					0xd6, 0x07, 0x19, 0x40, 0x3d, 0x86),
> +};
> +
> +/**
> + * cxl_mem_enumerate_cmds() - Enumerate commands for a device.
> + * @cxlm: The device.
> + *
> + * Returns 0 if enumeration completed successfully.
> + *
> + * CXL devices have optional support for certain commands. This function will
> + * determine the set of supported commands for the hardware and update the
> + * enabled_cmds bitmap in the @cxlm.
> + */
> +int cxl_mem_enumerate_cmds(struct cxl_mem *cxlm)
> +{
> +	struct cxl_mbox_get_supported_logs *gsl;
> +	struct device *dev = cxlm->dev;
> +	struct cxl_mem_command *cmd;
> +	int i, rc;
> +
> +	gsl = cxl_get_gsl(cxlm);
> +	if (IS_ERR(gsl))
> +		return PTR_ERR(gsl);
> +
> +	rc = -ENOENT;
> +	for (i = 0; i < le16_to_cpu(gsl->entries); i++) {
> +		u32 size = le32_to_cpu(gsl->entry[i].size);
> +		uuid_t uuid = gsl->entry[i].uuid;
> +		u8 *log;
> +
> +		dev_dbg(dev, "Found LOG type %pU of size %d", &uuid, size);
> +
> +		if (!uuid_equal(&uuid, &log_uuid[CEL_UUID]))
> +			continue;
> +
> +		log = kvmalloc(size, GFP_KERNEL);
> +		if (!log) {
> +			rc = -ENOMEM;
> +			goto out;
> +		}
> +
> +		rc = cxl_xfer_log(cxlm, &uuid, size, log);
> +		if (rc) {
> +			kvfree(log);
> +			goto out;
> +		}
> +
> +		cxl_walk_cel(cxlm, size, log);
> +		kvfree(log);
> +
> +		/* In case CEL was bogus, enable some default commands. */
> +		cxl_for_each_cmd(cmd)
> +			if (cmd->flags & CXL_CMD_FLAG_FORCE_ENABLE)
> +				set_bit(cmd->info.id, cxlm->enabled_cmds);
> +
> +		/* Found the required CEL */
> +		rc = 0;
> +	}
> +
> +out:
> +	kvfree(gsl);
> +	return rc;
> +}
> +EXPORT_SYMBOL_GPL(cxl_mem_enumerate_cmds);
> +
> +/**
> + * cxl_mem_get_partition_info - Get partition info
> + * @cxlm: cxl_mem instance to update partition info
> + *
> + * Retrieve the current partition info for the device specified.  The active
> + * values are the current capacity in bytes.  If not 0, the 'next' values are
> + * the pending values, in bytes, which take effect on the next cold reset.
> + *
> + * Return: 0 if no error; otherwise the result of the mailbox command.
> + *
> + * See CXL @8.2.9.5.2.1 Get Partition Info
> + */
> +static int cxl_mem_get_partition_info(struct cxl_mem *cxlm)
> +{
> +	struct cxl_mbox_get_partition_info {
> +		__le64 active_volatile_cap;
> +		__le64 active_persistent_cap;
> +		__le64 next_volatile_cap;
> +		__le64 next_persistent_cap;
> +	} __packed pi;
> +	int rc;
> +
> +	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_GET_PARTITION_INFO,
> +				   NULL, 0, &pi, sizeof(pi));
> +
> +	if (rc)
> +		return rc;
> +
> +	cxlm->active_volatile_bytes =
> +		le64_to_cpu(pi.active_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
> +	cxlm->active_persistent_bytes =
> +		le64_to_cpu(pi.active_persistent_cap) * CXL_CAPACITY_MULTIPLIER;
> +	cxlm->next_volatile_bytes =
> +		le64_to_cpu(pi.next_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
> +	cxlm->next_persistent_bytes =
> +		le64_to_cpu(pi.next_persistent_cap) * CXL_CAPACITY_MULTIPLIER;
> +
> +	return 0;
> +}
> +
> +/**
> + * cxl_mem_identify() - Send the IDENTIFY command to the device.
> + * @cxlm: The device to identify.
> + *
> + * Return: 0 if identify was executed successfully.
> + *
> + * This will dispatch the identify command to the device and on success populate
> + * structures to be exported to sysfs.
> + */
> +int cxl_mem_identify(struct cxl_mem *cxlm)
> +{
> +	/* See CXL 2.0 Table 175 Identify Memory Device Output Payload */
> +	struct cxl_mbox_identify {
> +		char fw_revision[0x10];
> +		__le64 total_capacity;
> +		__le64 volatile_capacity;
> +		__le64 persistent_capacity;
> +		__le64 partition_align;
> +		__le16 info_event_log_size;
> +		__le16 warning_event_log_size;
> +		__le16 failure_event_log_size;
> +		__le16 fatal_event_log_size;
> +		__le32 lsa_size;
> +		u8 poison_list_max_mer[3];
> +		__le16 inject_poison_limit;
> +		u8 poison_caps;
> +		u8 qos_telemetry_caps;
> +	} __packed id;
> +	int rc;
> +
> +	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_IDENTIFY, NULL, 0, &id,
> +				   sizeof(id));
> +	if (rc < 0)
> +		return rc;
> +
> +	cxlm->total_bytes =
> +		le64_to_cpu(id.total_capacity) * CXL_CAPACITY_MULTIPLIER;
> +	cxlm->volatile_only_bytes =
> +		le64_to_cpu(id.volatile_capacity) * CXL_CAPACITY_MULTIPLIER;
> +	cxlm->persistent_only_bytes =
> +		le64_to_cpu(id.persistent_capacity) * CXL_CAPACITY_MULTIPLIER;
> +	cxlm->partition_align_bytes =
> +		le64_to_cpu(id.partition_align) * CXL_CAPACITY_MULTIPLIER;
> +
> +	dev_dbg(cxlm->dev,
> +		"Identify Memory Device\n"
> +		"     total_bytes = %#llx\n"
> +		"     volatile_only_bytes = %#llx\n"
> +		"     persistent_only_bytes = %#llx\n"
> +		"     partition_align_bytes = %#llx\n",
> +		cxlm->total_bytes, cxlm->volatile_only_bytes,
> +		cxlm->persistent_only_bytes, cxlm->partition_align_bytes);
> +
> +	cxlm->lsa_size = le32_to_cpu(id.lsa_size);
> +	memcpy(cxlm->firmware_version, id.fw_revision, sizeof(id.fw_revision));
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(cxl_mem_identify);
> +
> +int cxl_mem_create_range_info(struct cxl_mem *cxlm)
> +{
> +	int rc;
> +
> +	if (cxlm->partition_align_bytes == 0) {
> +		cxlm->ram_range.start = 0;
> +		cxlm->ram_range.end = cxlm->volatile_only_bytes - 1;
> +		cxlm->pmem_range.start = cxlm->volatile_only_bytes;
> +		cxlm->pmem_range.end = cxlm->volatile_only_bytes +
> +				       cxlm->persistent_only_bytes - 1;
> +		return 0;
> +	}
> +
> +	rc = cxl_mem_get_partition_info(cxlm);
> +	if (rc) {
> +		dev_err(cxlm->dev, "Failed to query partition information\n");
> +		return rc;
> +	}
> +
> +	dev_dbg(cxlm->dev,
> +		"Get Partition Info\n"
> +		"     active_volatile_bytes = %#llx\n"
> +		"     active_persistent_bytes = %#llx\n"
> +		"     next_volatile_bytes = %#llx\n"
> +		"     next_persistent_bytes = %#llx\n",
> +		cxlm->active_volatile_bytes, cxlm->active_persistent_bytes,
> +		cxlm->next_volatile_bytes, cxlm->next_persistent_bytes);
> +
> +	cxlm->ram_range.start = 0;
> +	cxlm->ram_range.end = cxlm->active_volatile_bytes - 1;
> +
> +	cxlm->pmem_range.start = cxlm->active_volatile_bytes;
> +	cxlm->pmem_range.end =
> +		cxlm->active_volatile_bytes + cxlm->active_persistent_bytes - 1;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(cxl_mem_create_range_info);
> +
> +struct cxl_mem *cxl_mem_create(struct device *dev)
> +{
> +	struct cxl_mem *cxlm;
> +
> +	cxlm = devm_kzalloc(dev, sizeof(*cxlm), GFP_KERNEL);
> +	if (!cxlm) {
> +		dev_err(dev, "No memory available\n");
> +		return ERR_PTR(-ENOMEM);
> +	}
> +
> +	mutex_init(&cxlm->mbox_mutex);
> +	cxlm->dev = dev;
> +	cxlm->enabled_cmds =
> +		devm_kmalloc_array(dev, BITS_TO_LONGS(cxl_cmd_count),
> +				   sizeof(unsigned long),
> +				   GFP_KERNEL | __GFP_ZERO);
> +	if (!cxlm->enabled_cmds) {
> +		dev_err(dev, "No memory available for bitmap\n");
> +		return ERR_PTR(-ENOMEM);
> +	}
> +
> +	return cxlm;
> +}
> +EXPORT_SYMBOL_GPL(cxl_mem_create);
> +
> +static struct dentry *cxl_debugfs;
> +
> +void __init cxl_mbox_init(void)
> +{
> +	struct dentry *mbox_debugfs;
> +
> +	cxl_debugfs = debugfs_create_dir("cxl", NULL);
> +	mbox_debugfs = debugfs_create_dir("mbox", cxl_debugfs);
> +	debugfs_create_bool("raw_allow_all", 0600, mbox_debugfs,
> +			    &cxl_raw_allow_all);
> +}
> +
> +void cxl_mbox_exit(void)
> +{
> +	debugfs_remove_recursive(cxl_debugfs);
> +}
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 331ef7d6c598..df2ba87238c2 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -8,6 +8,8 @@
>  #include <cxlmem.h>
>  #include "core.h"
>  
> +static DECLARE_RWSEM(cxl_memdev_rwsem);
> +
>  /*
>   * An entire PCI topology full of devices should be enough for any
>   * config
> @@ -132,16 +134,21 @@ static const struct device_type cxl_memdev_type = {
>  	.groups = cxl_memdev_attribute_groups,
>  };
>  
> +static void cxl_memdev_shutdown(struct device *dev)
> +{
> +	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> +
> +	down_write(&cxl_memdev_rwsem);
> +	cxlmd->cxlm = NULL;
> +	up_write(&cxl_memdev_rwsem);
> +}
> +
>  static void cxl_memdev_unregister(void *_cxlmd)
>  {
>  	struct cxl_memdev *cxlmd = _cxlmd;
>  	struct device *dev = &cxlmd->dev;
> -	struct cdev *cdev = &cxlmd->cdev;
> -	const struct cdevm_file_operations *cdevm_fops;
> -
> -	cdevm_fops = container_of(cdev->ops, typeof(*cdevm_fops), fops);
> -	cdevm_fops->shutdown(dev);
>  
> +	cxl_memdev_shutdown(dev);
>  	cdev_device_del(&cxlmd->cdev, dev);
>  	put_device(dev);
>  }
> @@ -180,16 +187,72 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_mem *cxlm,
>  	return ERR_PTR(rc);
>  }
>  
> +static long __cxl_memdev_ioctl(struct cxl_memdev *cxlmd, unsigned int cmd,
> +			       unsigned long arg)
> +{
> +	switch (cmd) {
> +	case CXL_MEM_QUERY_COMMANDS:
> +		return cxl_query_cmd(cxlmd, (void __user *)arg);
> +	case CXL_MEM_SEND_COMMAND:
> +		return cxl_send_cmd(cxlmd, (void __user *)arg);
> +	default:
> +		return -ENOTTY;
> +	}
> +}
> +
> +static long cxl_memdev_ioctl(struct file *file, unsigned int cmd,
> +			     unsigned long arg)
> +{
> +	struct cxl_memdev *cxlmd = file->private_data;
> +	int rc = -ENXIO;
> +
> +	down_read(&cxl_memdev_rwsem);
> +	if (cxlmd->cxlm)
> +		rc = __cxl_memdev_ioctl(cxlmd, cmd, arg);
> +	up_read(&cxl_memdev_rwsem);
> +
> +	return rc;
> +}
> +
> +static int cxl_memdev_open(struct inode *inode, struct file *file)
> +{
> +	struct cxl_memdev *cxlmd =
> +		container_of(inode->i_cdev, typeof(*cxlmd), cdev);
> +
> +	get_device(&cxlmd->dev);
> +	file->private_data = cxlmd;
> +
> +	return 0;
> +}
> +
> +static int cxl_memdev_release_file(struct inode *inode, struct file *file)
> +{
> +	struct cxl_memdev *cxlmd =
> +		container_of(inode->i_cdev, typeof(*cxlmd), cdev);
> +
> +	put_device(&cxlmd->dev);
> +
> +	return 0;
> +}
> +
> +static const struct file_operations cxl_memdev_fops = {
> +	.owner = THIS_MODULE,
> +	.unlocked_ioctl = cxl_memdev_ioctl,
> +	.open = cxl_memdev_open,
> +	.release = cxl_memdev_release_file,
> +	.compat_ioctl = compat_ptr_ioctl,
> +	.llseek = noop_llseek,
> +};
> +
>  struct cxl_memdev *
> -devm_cxl_add_memdev(struct cxl_mem *cxlm,
> -		    const struct cdevm_file_operations *cdevm_fops)
> +devm_cxl_add_memdev(struct cxl_mem *cxlm)
>  {
>  	struct cxl_memdev *cxlmd;
>  	struct device *dev;
>  	struct cdev *cdev;
>  	int rc;
>  
> -	cxlmd = cxl_memdev_alloc(cxlm, &cdevm_fops->fops);
> +	cxlmd = cxl_memdev_alloc(cxlm, &cxl_memdev_fops);
>  	if (IS_ERR(cxlmd))
>  		return cxlmd;
>  
> @@ -219,7 +282,7 @@ devm_cxl_add_memdev(struct cxl_mem *cxlm,
>  	 * The cdev was briefly live, shutdown any ioctl operations that
>  	 * saw that state.
>  	 */
> -	cdevm_fops->shutdown(dev);
> +	cxl_memdev_shutdown(dev);
>  	put_device(dev);
>  	return ERR_PTR(rc);
>  }
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 9be5e26c5b48..35fd16d12532 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -2,6 +2,7 @@
>  /* Copyright(c) 2020-2021 Intel Corporation. */
>  #ifndef __CXL_MEM_H__
>  #define __CXL_MEM_H__
> +#include <uapi/linux/cxl_mem.h>
>  #include <linux/cdev.h>
>  #include "cxl.h"
>  
> @@ -28,21 +29,6 @@
>  	(FIELD_GET(CXLMDEV_RESET_NEEDED_MASK, status) !=                       \
>  	 CXLMDEV_RESET_NEEDED_NOT)
>  
> -/**
> - * struct cdevm_file_operations - devm coordinated cdev file operations
> - * @fops: file operations that are synchronized against @shutdown
> - * @shutdown: disconnect driver data
> - *
> - * @shutdown is invoked in the devres release path to disconnect any
> - * driver instance data from @dev. It assumes synchronization with any
> - * fops operation that requires driver data. After @shutdown an
> - * operation may only reference @device data.
> - */
> -struct cdevm_file_operations {
> -	struct file_operations fops;
> -	void (*shutdown)(struct device *dev);
> -};
> -
>  /**
>   * struct cxl_memdev - CXL bus object representing a Type-3 Memory Device
>   * @dev: driver core device object
> @@ -62,9 +48,7 @@ static inline struct cxl_memdev *to_cxl_memdev(struct device *dev)
>  	return container_of(dev, struct cxl_memdev, dev);
>  }
>  
> -struct cxl_memdev *
> -devm_cxl_add_memdev(struct cxl_mem *cxlm,
> -		    const struct cdevm_file_operations *cdevm_fops);
> +struct cxl_memdev *devm_cxl_add_memdev(struct cxl_mem *cxlm);
>  
>  /**
>   * struct cxl_mbox_cmd - A command to be submitted to hardware.
> @@ -158,4 +142,62 @@ struct cxl_mem {
>  
>  	int (*mbox_send)(struct cxl_mem *cxlm, struct cxl_mbox_cmd *cmd);
>  };
> +
> +enum cxl_opcode {
> +	CXL_MBOX_OP_INVALID		= 0x0000,
> +	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
> +	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
> +	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
> +	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
> +	CXL_MBOX_OP_GET_LOG		= 0x0401,
> +	CXL_MBOX_OP_IDENTIFY		= 0x4000,
> +	CXL_MBOX_OP_GET_PARTITION_INFO	= 0x4100,
> +	CXL_MBOX_OP_SET_PARTITION_INFO	= 0x4101,
> +	CXL_MBOX_OP_GET_LSA		= 0x4102,
> +	CXL_MBOX_OP_SET_LSA		= 0x4103,
> +	CXL_MBOX_OP_GET_HEALTH_INFO	= 0x4200,
> +	CXL_MBOX_OP_GET_ALERT_CONFIG	= 0x4201,
> +	CXL_MBOX_OP_SET_ALERT_CONFIG	= 0x4202,
> +	CXL_MBOX_OP_GET_SHUTDOWN_STATE	= 0x4203,
> +	CXL_MBOX_OP_SET_SHUTDOWN_STATE	= 0x4204,
> +	CXL_MBOX_OP_GET_POISON		= 0x4300,
> +	CXL_MBOX_OP_INJECT_POISON	= 0x4301,
> +	CXL_MBOX_OP_CLEAR_POISON	= 0x4302,
> +	CXL_MBOX_OP_GET_SCAN_MEDIA_CAPS	= 0x4303,
> +	CXL_MBOX_OP_SCAN_MEDIA		= 0x4304,
> +	CXL_MBOX_OP_GET_SCAN_MEDIA	= 0x4305,
> +	CXL_MBOX_OP_MAX			= 0x10000
> +};
> +
> +/**
> + * struct cxl_mem_command - Driver representation of a memory device command
> + * @info: Command information as it exists for the UAPI
> + * @opcode: The actual bits used for the mailbox protocol
> + * @flags: Set of flags affecting driver behavior.
> + *
> + *  * %CXL_CMD_FLAG_FORCE_ENABLE: In cases of error, commands with this flag
> + *    will be enabled by the driver regardless of what hardware may have
> + *    advertised.
> + *
> + * The cxl_mem_command is the driver's internal representation of commands that
> + * are supported by the driver. Some of these commands may not be supported by
> + * the hardware. The driver will use @info to validate the fields passed in by
> + * the user then submit the @opcode to the hardware.
> + *
> + * See struct cxl_command_info.
> + */
> +struct cxl_mem_command {
> +	struct cxl_command_info info;
> +	enum cxl_opcode opcode;
> +	u32 flags;
> +#define CXL_CMD_FLAG_NONE 0
> +#define CXL_CMD_FLAG_FORCE_ENABLE BIT(0)
> +};
> +
> +int cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm, u16 opcode, void *in,
> +			  size_t in_size, void *out, size_t out_size);
> +int cxl_mem_identify(struct cxl_mem *cxlm);
> +int cxl_mem_enumerate_cmds(struct cxl_mem *cxlm);
> +int cxl_mem_create_range_info(struct cxl_mem *cxlm);
> +struct cxl_mem *cxl_mem_create(struct device *dev);
>  #endif /* __CXL_MEM_H__ */
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 9d8050fdd69c..c9f2ac134f4d 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -1,16 +1,12 @@
>  // SPDX-License-Identifier: GPL-2.0-only
>  /* Copyright(c) 2020 Intel Corporation. All rights reserved. */
> -#include <uapi/linux/cxl_mem.h>
> -#include <linux/security.h>
> -#include <linux/debugfs.h>
> +#include <linux/io-64-nonatomic-lo-hi.h>
>  #include <linux/module.h>
>  #include <linux/sizes.h>
>  #include <linux/mutex.h>
>  #include <linux/list.h>
> -#include <linux/cdev.h>
>  #include <linux/pci.h>
>  #include <linux/io.h>
> -#include <linux/io-64-nonatomic-lo-hi.h>
>  #include "cxlmem.h"
>  #include "pci.h"
>  #include "cxl.h"
> @@ -37,162 +33,6 @@
>  /* CXL 2.0 - 8.2.8.4 */
>  #define CXL_MAILBOX_TIMEOUT_MS (2 * HZ)
>  
> -enum opcode {
> -	CXL_MBOX_OP_INVALID		= 0x0000,
> -	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
> -	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
> -	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
> -	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
> -	CXL_MBOX_OP_GET_LOG		= 0x0401,
> -	CXL_MBOX_OP_IDENTIFY		= 0x4000,
> -	CXL_MBOX_OP_GET_PARTITION_INFO	= 0x4100,
> -	CXL_MBOX_OP_SET_PARTITION_INFO	= 0x4101,
> -	CXL_MBOX_OP_GET_LSA		= 0x4102,
> -	CXL_MBOX_OP_SET_LSA		= 0x4103,
> -	CXL_MBOX_OP_GET_HEALTH_INFO	= 0x4200,
> -	CXL_MBOX_OP_GET_ALERT_CONFIG	= 0x4201,
> -	CXL_MBOX_OP_SET_ALERT_CONFIG	= 0x4202,
> -	CXL_MBOX_OP_GET_SHUTDOWN_STATE	= 0x4203,
> -	CXL_MBOX_OP_SET_SHUTDOWN_STATE	= 0x4204,
> -	CXL_MBOX_OP_GET_POISON		= 0x4300,
> -	CXL_MBOX_OP_INJECT_POISON	= 0x4301,
> -	CXL_MBOX_OP_CLEAR_POISON	= 0x4302,
> -	CXL_MBOX_OP_GET_SCAN_MEDIA_CAPS	= 0x4303,
> -	CXL_MBOX_OP_SCAN_MEDIA		= 0x4304,
> -	CXL_MBOX_OP_GET_SCAN_MEDIA	= 0x4305,
> -	CXL_MBOX_OP_MAX			= 0x10000
> -};
> -
> -static DECLARE_RWSEM(cxl_memdev_rwsem);
> -static struct dentry *cxl_debugfs;
> -static bool cxl_raw_allow_all;
> -
> -enum {
> -	CEL_UUID,
> -	VENDOR_DEBUG_UUID,
> -};
> -
> -/* See CXL 2.0 Table 170. Get Log Input Payload */
> -static const uuid_t log_uuid[] = {
> -	[CEL_UUID] = UUID_INIT(0xda9c0b5, 0xbf41, 0x4b78, 0x8f, 0x79, 0x96,
> -			       0xb1, 0x62, 0x3b, 0x3f, 0x17),
> -	[VENDOR_DEBUG_UUID] = UUID_INIT(0xe1819d9, 0x11a9, 0x400c, 0x81, 0x1f,
> -					0xd6, 0x07, 0x19, 0x40, 0x3d, 0x86),
> -};
> -
> -/**
> - * struct cxl_mem_command - Driver representation of a memory device command
> - * @info: Command information as it exists for the UAPI
> - * @opcode: The actual bits used for the mailbox protocol
> - * @flags: Set of flags effecting driver behavior.
> - *
> - *  * %CXL_CMD_FLAG_FORCE_ENABLE: In cases of error, commands with this flag
> - *    will be enabled by the driver regardless of what hardware may have
> - *    advertised.
> - *
> - * The cxl_mem_command is the driver's internal representation of commands that
> - * are supported by the driver. Some of these commands may not be supported by
> - * the hardware. The driver will use @info to validate the fields passed in by
> - * the user then submit the @opcode to the hardware.
> - *
> - * See struct cxl_command_info.
> - */
> -struct cxl_mem_command {
> -	struct cxl_command_info info;
> -	enum opcode opcode;
> -	u32 flags;
> -#define CXL_CMD_FLAG_NONE 0
> -#define CXL_CMD_FLAG_FORCE_ENABLE BIT(0)
> -};
> -
> -#define CXL_CMD(_id, sin, sout, _flags)                                        \
> -	[CXL_MEM_COMMAND_ID_##_id] = {                                         \
> -	.info =	{                                                              \
> -			.id = CXL_MEM_COMMAND_ID_##_id,                        \
> -			.size_in = sin,                                        \
> -			.size_out = sout,                                      \
> -		},                                                             \
> -	.opcode = CXL_MBOX_OP_##_id,                                           \
> -	.flags = _flags,                                                       \
> -	}
> -
> -/*
> - * This table defines the supported mailbox commands for the driver. This table
> - * is made up of a UAPI structure. Non-negative values as parameters in the
> - * table will be validated against the user's input. For example, if size_in is
> - * 0, and the user passed in 1, it is an error.
> - */
> -static struct cxl_mem_command mem_commands[CXL_MEM_COMMAND_ID_MAX] = {
> -	CXL_CMD(IDENTIFY, 0, 0x43, CXL_CMD_FLAG_FORCE_ENABLE),
> -#ifdef CONFIG_CXL_MEM_RAW_COMMANDS
> -	CXL_CMD(RAW, ~0, ~0, 0),
> -#endif
> -	CXL_CMD(GET_SUPPORTED_LOGS, 0, ~0, CXL_CMD_FLAG_FORCE_ENABLE),
> -	CXL_CMD(GET_FW_INFO, 0, 0x50, 0),
> -	CXL_CMD(GET_PARTITION_INFO, 0, 0x20, 0),
> -	CXL_CMD(GET_LSA, 0x8, ~0, 0),
> -	CXL_CMD(GET_HEALTH_INFO, 0, 0x12, 0),
> -	CXL_CMD(GET_LOG, 0x18, ~0, CXL_CMD_FLAG_FORCE_ENABLE),
> -	CXL_CMD(SET_PARTITION_INFO, 0x0a, 0, 0),
> -	CXL_CMD(SET_LSA, ~0, 0, 0),
> -	CXL_CMD(GET_ALERT_CONFIG, 0, 0x10, 0),
> -	CXL_CMD(SET_ALERT_CONFIG, 0xc, 0, 0),
> -	CXL_CMD(GET_SHUTDOWN_STATE, 0, 0x1, 0),
> -	CXL_CMD(SET_SHUTDOWN_STATE, 0x1, 0, 0),
> -	CXL_CMD(GET_POISON, 0x10, ~0, 0),
> -	CXL_CMD(INJECT_POISON, 0x8, 0, 0),
> -	CXL_CMD(CLEAR_POISON, 0x48, 0, 0),
> -	CXL_CMD(GET_SCAN_MEDIA_CAPS, 0x10, 0x4, 0),
> -	CXL_CMD(SCAN_MEDIA, 0x11, 0, 0),
> -	CXL_CMD(GET_SCAN_MEDIA, 0, ~0, 0),
> -};
> -
> -/*
> - * Commands that RAW doesn't permit. The rationale for each:
> - *
> - * CXL_MBOX_OP_ACTIVATE_FW: Firmware activation requires adjustment /
> - * coordination of transaction timeout values at the root bridge level.
> - *
> - * CXL_MBOX_OP_SET_PARTITION_INFO: The device memory map may change live
> - * and needs to be coordinated with HDM updates.
> - *
> - * CXL_MBOX_OP_SET_LSA: The label storage area may be cached by the
> - * driver and any writes from userspace invalidates those contents.
> - *
> - * CXL_MBOX_OP_SET_SHUTDOWN_STATE: Set shutdown state assumes no writes
> - * to the device after it is marked clean, userspace can not make that
> - * assertion.
> - *
> - * CXL_MBOX_OP_[GET_]SCAN_MEDIA: The kernel provides a native error list that
> - * is kept up to date with patrol notifications and error management.
> - */
> -static u16 cxl_disabled_raw_commands[] = {
> -	CXL_MBOX_OP_ACTIVATE_FW,
> -	CXL_MBOX_OP_SET_PARTITION_INFO,
> -	CXL_MBOX_OP_SET_LSA,
> -	CXL_MBOX_OP_SET_SHUTDOWN_STATE,
> -	CXL_MBOX_OP_SCAN_MEDIA,
> -	CXL_MBOX_OP_GET_SCAN_MEDIA,
> -};
> -
> -/*
> - * Command sets that RAW doesn't permit. All opcodes in this set are
> - * disabled because they pass plain text security payloads over the
> - * user/kernel boundary. This functionality is intended to be wrapped
> - * behind the keys ABI which allows for encrypted payloads in the UAPI
> - */
> -static u8 security_command_sets[] = {
> -	0x44, /* Sanitize */
> -	0x45, /* Persistent Memory Data-at-rest Security */
> -	0x46, /* Security Passthrough */
> -};
> -
> -#define cxl_for_each_cmd(cmd)                                                  \
> -	for ((cmd) = &mem_commands[0];                                         \
> -	     ((cmd) - mem_commands) < ARRAY_SIZE(mem_commands); (cmd)++)
> -
> -#define cxl_cmd_count ARRAY_SIZE(mem_commands)
> -
>  static int cxl_mem_wait_for_doorbell(struct cxl_mem *cxlm)
>  {
>  	const unsigned long start = jiffies;
> @@ -215,16 +55,6 @@ static int cxl_mem_wait_for_doorbell(struct cxl_mem *cxlm)
>  	return 0;
>  }
>  
> -static bool cxl_is_security_command(u16 opcode)
> -{
> -	int i;
> -
> -	for (i = 0; i < ARRAY_SIZE(security_command_sets); i++)
> -		if (security_command_sets[i] == (opcode >> 8))
> -			return true;
> -	return false;
> -}
> -
>  static void cxl_mem_mbox_timeout(struct cxl_mem *cxlm,
>  				 struct cxl_mbox_cmd *mbox_cmd)
>  {
> @@ -446,433 +276,6 @@ static int cxl_pci_mbox_send(struct cxl_mem *cxlm, struct cxl_mbox_cmd *cmd)
>  	return rc;
>  }
>  
> -/**
> - * handle_mailbox_cmd_from_user() - Dispatch a mailbox command for userspace.
> - * @cxlm: The CXL memory device to communicate with.
> - * @cmd: The validated command.
> - * @in_payload: Pointer to userspace's input payload.
> - * @out_payload: Pointer to userspace's output payload.
> - * @size_out: (Input) Max payload size to copy out.
> - *            (Output) Payload size hardware generated.
> - * @retval: Hardware generated return code from the operation.
> - *
> - * Return:
> - *  * %0	- Mailbox transaction succeeded. This implies the mailbox
> - *		  protocol completed successfully not that the operation itself
> - *		  was successful.
> - *  * %-ENOMEM  - Couldn't allocate a bounce buffer.
> - *  * %-EFAULT	- Something happened with copy_to/from_user.
> - *  * %-EINTR	- Mailbox acquisition interrupted.
> - *  * %-EXXX	- Transaction level failures.
> - *
> - * Creates the appropriate mailbox command and dispatches it on behalf of a
> - * userspace request. The input and output payloads are copied between
> - * userspace.
> - *
> - * See cxl_send_cmd().
> - */
> -static int handle_mailbox_cmd_from_user(struct cxl_mem *cxlm,
> -					const struct cxl_mem_command *cmd,
> -					u64 in_payload, u64 out_payload,
> -					s32 *size_out, u32 *retval)
> -{
> -	struct device *dev = cxlm->dev;
> -	struct cxl_mbox_cmd mbox_cmd = {
> -		.opcode = cmd->opcode,
> -		.size_in = cmd->info.size_in,
> -		.size_out = cmd->info.size_out,
> -	};
> -	int rc;
> -
> -	if (cmd->info.size_out) {
> -		mbox_cmd.payload_out = kvzalloc(cmd->info.size_out, GFP_KERNEL);
> -		if (!mbox_cmd.payload_out)
> -			return -ENOMEM;
> -	}
> -
> -	if (cmd->info.size_in) {
> -		mbox_cmd.payload_in = vmemdup_user(u64_to_user_ptr(in_payload),
> -						   cmd->info.size_in);
> -		if (IS_ERR(mbox_cmd.payload_in)) {
> -			kvfree(mbox_cmd.payload_out);
> -			return PTR_ERR(mbox_cmd.payload_in);
> -		}
> -	}
> -
> -	dev_dbg(dev,
> -		"Submitting %s command for user\n"
> -		"\topcode: %x\n"
> -		"\tsize: %ub\n",
> -		cxl_command_names[cmd->info.id].name, mbox_cmd.opcode,
> -		cmd->info.size_in);
> -
> -	dev_WARN_ONCE(dev, cmd->info.id == CXL_MEM_COMMAND_ID_RAW,
> -		      "raw command path used\n");
> -
> -	rc = cxlm->mbox_send(cxlm, &mbox_cmd);
> -	if (rc)
> -		goto out;
> -
> -	/*
> -	 * @size_out contains the max size that's allowed to be written back out
> -	 * to userspace. While the payload may have written more output than
> -	 * this it will have to be ignored.
> -	 */
> -	if (mbox_cmd.size_out) {
> -		dev_WARN_ONCE(dev, mbox_cmd.size_out > *size_out,
> -			      "Invalid return size\n");
> -		if (copy_to_user(u64_to_user_ptr(out_payload),
> -				 mbox_cmd.payload_out, mbox_cmd.size_out)) {
> -			rc = -EFAULT;
> -			goto out;
> -		}
> -	}
> -
> -	*size_out = mbox_cmd.size_out;
> -	*retval = mbox_cmd.return_code;
> -
> -out:
> -	kvfree(mbox_cmd.payload_in);
> -	kvfree(mbox_cmd.payload_out);
> -	return rc;
> -}
> -
> -static bool cxl_mem_raw_command_allowed(u16 opcode)
> -{
> -	int i;
> -
> -	if (!IS_ENABLED(CONFIG_CXL_MEM_RAW_COMMANDS))
> -		return false;
> -
> -	if (security_locked_down(LOCKDOWN_PCI_ACCESS))
> -		return false;
> -
> -	if (cxl_raw_allow_all)
> -		return true;
> -
> -	if (cxl_is_security_command(opcode))
> -		return false;
> -
> -	for (i = 0; i < ARRAY_SIZE(cxl_disabled_raw_commands); i++)
> -		if (cxl_disabled_raw_commands[i] == opcode)
> -			return false;
> -
> -	return true;
> -}
> -
> -/**
> - * cxl_validate_cmd_from_user() - Check fields for CXL_MEM_SEND_COMMAND.
> - * @cxlm: &struct cxl_mem device whose mailbox will be used.
> - * @send_cmd: &struct cxl_send_command copied in from userspace.
> - * @out_cmd: Sanitized and populated &struct cxl_mem_command.
> - *
> - * Return:
> - *  * %0	- @out_cmd is ready to send.
> - *  * %-ENOTTY	- Invalid command specified.
> - *  * %-EINVAL	- Reserved fields or invalid values were used.
> - *  * %-ENOMEM	- Input or output buffer wasn't sized properly.
> - *  * %-EPERM	- Attempted to use a protected command.
> - *
> - * The result of this command is a fully validated command in @out_cmd that is
> - * safe to send to the hardware.
> - *
> - * See handle_mailbox_cmd_from_user()
> - */
> -static int cxl_validate_cmd_from_user(struct cxl_mem *cxlm,
> -				      const struct cxl_send_command *send_cmd,
> -				      struct cxl_mem_command *out_cmd)
> -{
> -	const struct cxl_command_info *info;
> -	struct cxl_mem_command *c;
> -
> -	if (send_cmd->id == 0 || send_cmd->id >= CXL_MEM_COMMAND_ID_MAX)
> -		return -ENOTTY;
> -
> -	/*
> -	 * The user can never specify an input payload larger than what hardware
> -	 * supports, but output can be arbitrarily large (simply write out as
> -	 * much data as the hardware provides).
> -	 */
> -	if (send_cmd->in.size > cxlm->payload_size)
> -		return -EINVAL;
> -
> -	/*
> -	 * Checks are bypassed for raw commands but a WARN/taint will occur
> -	 * later in the callchain
> -	 */
> -	if (send_cmd->id == CXL_MEM_COMMAND_ID_RAW) {
> -		const struct cxl_mem_command temp = {
> -			.info = {
> -				.id = CXL_MEM_COMMAND_ID_RAW,
> -				.flags = 0,
> -				.size_in = send_cmd->in.size,
> -				.size_out = send_cmd->out.size,
> -			},
> -			.opcode = send_cmd->raw.opcode
> -		};
> -
> -		if (send_cmd->raw.rsvd)
> -			return -EINVAL;
> -
> -		/*
> -		 * Unlike supported commands, the output size of RAW commands
> -		 * gets passed along without further checking, so it must be
> -		 * validated here.
> -		 */
> -		if (send_cmd->out.size > cxlm->payload_size)
> -			return -EINVAL;
> -
> -		if (!cxl_mem_raw_command_allowed(send_cmd->raw.opcode))
> -			return -EPERM;
> -
> -		memcpy(out_cmd, &temp, sizeof(temp));
> -
> -		return 0;
> -	}
> -
> -	if (send_cmd->flags & ~CXL_MEM_COMMAND_FLAG_MASK)
> -		return -EINVAL;
> -
> -	if (send_cmd->rsvd)
> -		return -EINVAL;
> -
> -	if (send_cmd->in.rsvd || send_cmd->out.rsvd)
> -		return -EINVAL;
> -
> -	/* Convert user's command into the internal representation */
> -	c = &mem_commands[send_cmd->id];
> -	info = &c->info;
> -
> -	/* Check that the command is enabled for hardware */
> -	if (!test_bit(info->id, cxlm->enabled_cmds))
> -		return -ENOTTY;
> -
> -	/* Check the input buffer is the expected size */
> -	if (info->size_in >= 0 && info->size_in != send_cmd->in.size)
> -		return -ENOMEM;
> -
> -	/* Check the output buffer is at least large enough */
> -	if (info->size_out >= 0 && send_cmd->out.size < info->size_out)
> -		return -ENOMEM;
> -
> -	memcpy(out_cmd, c, sizeof(*c));
> -	out_cmd->info.size_in = send_cmd->in.size;
> -	/*
> -	 * XXX: out_cmd->info.size_out will be controlled by the driver, and the
> -	 * specified number of bytes @send_cmd->out.size will be copied back out
> -	 * to userspace.
> -	 */
> -
> -	return 0;
> -}
> -
> -static int cxl_query_cmd(struct cxl_memdev *cxlmd,
> -			 struct cxl_mem_query_commands __user *q)
> -{
> -	struct device *dev = &cxlmd->dev;
> -	struct cxl_mem_command *cmd;
> -	u32 n_commands;
> -	int j = 0;
> -
> -	dev_dbg(dev, "Query IOCTL\n");
> -
> -	if (get_user(n_commands, &q->n_commands))
> -		return -EFAULT;
> -
> -	/* returns the total number if 0 elements are requested. */
> -	if (n_commands == 0)
> -		return put_user(cxl_cmd_count, &q->n_commands);
> -
> -	/*
> -	 * otherwise, return max(n_commands, total commands) cxl_command_info
> -	 * structures.
> -	 */
> -	cxl_for_each_cmd(cmd) {
> -		const struct cxl_command_info *info = &cmd->info;
> -
> -		if (copy_to_user(&q->commands[j++], info, sizeof(*info)))
> -			return -EFAULT;
> -
> -		if (j == n_commands)
> -			break;
> -	}
> -
> -	return 0;
> -}
> -
> -static int cxl_send_cmd(struct cxl_memdev *cxlmd,
> -			struct cxl_send_command __user *s)
> -{
> -	struct cxl_mem *cxlm = cxlmd->cxlm;
> -	struct device *dev = &cxlmd->dev;
> -	struct cxl_send_command send;
> -	struct cxl_mem_command c;
> -	int rc;
> -
> -	dev_dbg(dev, "Send IOCTL\n");
> -
> -	if (copy_from_user(&send, s, sizeof(send)))
> -		return -EFAULT;
> -
> -	rc = cxl_validate_cmd_from_user(cxlmd->cxlm, &send, &c);
> -	if (rc)
> -		return rc;
> -
> -	/* Prepare to handle a full payload for variable sized output */
> -	if (c.info.size_out < 0)
> -		c.info.size_out = cxlm->payload_size;
> -
> -	rc = handle_mailbox_cmd_from_user(cxlm, &c, send.in.payload,
> -					  send.out.payload, &send.out.size,
> -					  &send.retval);
> -	if (rc)
> -		return rc;
> -
> -	if (copy_to_user(s, &send, sizeof(send)))
> -		return -EFAULT;
> -
> -	return 0;
> -}
> -
> -static long __cxl_memdev_ioctl(struct cxl_memdev *cxlmd, unsigned int cmd,
> -			       unsigned long arg)
> -{
> -	switch (cmd) {
> -	case CXL_MEM_QUERY_COMMANDS:
> -		return cxl_query_cmd(cxlmd, (void __user *)arg);
> -	case CXL_MEM_SEND_COMMAND:
> -		return cxl_send_cmd(cxlmd, (void __user *)arg);
> -	default:
> -		return -ENOTTY;
> -	}
> -}
> -
> -static long cxl_memdev_ioctl(struct file *file, unsigned int cmd,
> -			     unsigned long arg)
> -{
> -	struct cxl_memdev *cxlmd = file->private_data;
> -	int rc = -ENXIO;
> -
> -	down_read(&cxl_memdev_rwsem);
> -	if (cxlmd->cxlm)
> -		rc = __cxl_memdev_ioctl(cxlmd, cmd, arg);
> -	up_read(&cxl_memdev_rwsem);
> -
> -	return rc;
> -}
> -
> -static int cxl_memdev_open(struct inode *inode, struct file *file)
> -{
> -	struct cxl_memdev *cxlmd =
> -		container_of(inode->i_cdev, typeof(*cxlmd), cdev);
> -
> -	get_device(&cxlmd->dev);
> -	file->private_data = cxlmd;
> -
> -	return 0;
> -}
> -
> -static int cxl_memdev_release_file(struct inode *inode, struct file *file)
> -{
> -	struct cxl_memdev *cxlmd =
> -		container_of(inode->i_cdev, typeof(*cxlmd), cdev);
> -
> -	put_device(&cxlmd->dev);
> -
> -	return 0;
> -}
> -
> -static void cxl_memdev_shutdown(struct device *dev)
> -{
> -	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> -
> -	down_write(&cxl_memdev_rwsem);
> -	cxlmd->cxlm = NULL;
> -	up_write(&cxl_memdev_rwsem);
> -}
> -
> -static const struct cdevm_file_operations cxl_memdev_fops = {
> -	.fops = {
> -		.owner = THIS_MODULE,
> -		.unlocked_ioctl = cxl_memdev_ioctl,
> -		.open = cxl_memdev_open,
> -		.release = cxl_memdev_release_file,
> -		.compat_ioctl = compat_ptr_ioctl,
> -		.llseek = noop_llseek,
> -	},
> -	.shutdown = cxl_memdev_shutdown,
> -};
> -
> -static inline struct cxl_mem_command *cxl_mem_find_command(u16 opcode)
> -{
> -	struct cxl_mem_command *c;
> -
> -	cxl_for_each_cmd(c)
> -		if (c->opcode == opcode)
> -			return c;
> -
> -	return NULL;
> -}
> -
> -/**
> - * cxl_mem_mbox_send_cmd() - Send a mailbox command to a memory device.
> - * @cxlm: The CXL memory device to communicate with.
> - * @opcode: Opcode for the mailbox command.
> - * @in: The input payload for the mailbox command.
> - * @in_size: The length of the input payload
> - * @out: Caller allocated buffer for the output.
> - * @out_size: Expected size of output.
> - *
> - * Context: Any context. Will acquire and release mbox_mutex.
> - * Return:
> - *  * %>=0	- Number of bytes returned in @out.
> - *  * %-E2BIG	- Payload is too large for hardware.
> - *  * %-EBUSY	- Couldn't acquire exclusive mailbox access.
> - *  * %-EFAULT	- Hardware error occurred.
> - *  * %-ENXIO	- Command completed, but device reported an error.
> - *  * %-EIO	- Unexpected output size.
> - *
> - * Mailbox commands may execute successfully yet the device itself reported an
> - * error. While this distinction can be useful for commands from userspace, the
> - * kernel will only be able to use results when both are successful.
> - *
> - * See __cxl_mem_mbox_send_cmd()
> - */
> -static int cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm, u16 opcode,
> -				 void *in, size_t in_size,
> -				 void *out, size_t out_size)
> -{
> -	const struct cxl_mem_command *cmd = cxl_mem_find_command(opcode);
> -	struct cxl_mbox_cmd mbox_cmd = {
> -		.opcode = opcode,
> -		.payload_in = in,
> -		.size_in = in_size,
> -		.size_out = out_size,
> -		.payload_out = out,
> -	};
> -	int rc;
> -
> -	if (out_size > cxlm->payload_size)
> -		return -E2BIG;
> -
> -	rc = cxlm->mbox_send(cxlm, &mbox_cmd);
> -	if (rc)
> -		return rc;
> -
> -	/* TODO: Map return code to proper kernel style errno */
> -	if (mbox_cmd.return_code != CXL_MBOX_SUCCESS)
> -		return -ENXIO;
> -
> -	/*
> -	 * Variable sized commands can't be validated and so it's up to the
> -	 * caller to do that if they wish.
> -	 */
> -	if (cmd->info.size_out >= 0 && mbox_cmd.size_out != out_size)
> -		return -EIO;
> -
> -	return 0;
> -}
> -
>  static int cxl_mem_setup_mailbox(struct cxl_mem *cxlm)
>  {
>  	const int cap = readl(cxlm->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
> @@ -901,30 +304,6 @@ static int cxl_mem_setup_mailbox(struct cxl_mem *cxlm)
>  	return 0;
>  }
>  
> -static struct cxl_mem *cxl_mem_create(struct device *dev)
> -{
> -	struct cxl_mem *cxlm;
> -
> -	cxlm = devm_kzalloc(dev, sizeof(*cxlm), GFP_KERNEL);
> -	if (!cxlm) {
> -		dev_err(dev, "No memory available\n");
> -		return ERR_PTR(-ENOMEM);
> -	}
> -
> -	mutex_init(&cxlm->mbox_mutex);
> -	cxlm->dev = dev;
> -	cxlm->enabled_cmds =
> -		devm_kmalloc_array(dev, BITS_TO_LONGS(cxl_cmd_count),
> -				   sizeof(unsigned long),
> -				   GFP_KERNEL | __GFP_ZERO);
> -	if (!cxlm->enabled_cmds) {
> -		dev_err(dev, "No memory available for bitmap\n");
> -		return ERR_PTR(-ENOMEM);
> -	}
> -
> -	return cxlm;
> -}
> -
>  static void __iomem *cxl_mem_map_regblock(struct cxl_mem *cxlm,
>  					  u8 bar, u64 offset)
>  {
> @@ -1132,298 +511,6 @@ static int cxl_mem_setup_regs(struct cxl_mem *cxlm)
>  	return ret;
>  }
>  
> -static int cxl_xfer_log(struct cxl_mem *cxlm, uuid_t *uuid, u32 size, u8 *out)
> -{
> -	u32 remaining = size;
> -	u32 offset = 0;
> -
> -	while (remaining) {
> -		u32 xfer_size = min_t(u32, remaining, cxlm->payload_size);
> -		struct cxl_mbox_get_log {
> -			uuid_t uuid;
> -			__le32 offset;
> -			__le32 length;
> -		} __packed log = {
> -			.uuid = *uuid,
> -			.offset = cpu_to_le32(offset),
> -			.length = cpu_to_le32(xfer_size)
> -		};
> -		int rc;
> -
> -		rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_GET_LOG, &log,
> -					   sizeof(log), out, xfer_size);
> -		if (rc < 0)
> -			return rc;
> -
> -		out += xfer_size;
> -		remaining -= xfer_size;
> -		offset += xfer_size;
> -	}
> -
> -	return 0;
> -}
> -
> -/**
> - * cxl_walk_cel() - Walk through the Command Effects Log.
> - * @cxlm: Device.
> - * @size: Length of the Command Effects Log.
> - * @cel: CEL
> - *
> - * Iterate over each entry in the CEL and determine if the driver supports the
> - * command. If so, the command is enabled for the device and can be used later.
> - */
> -static void cxl_walk_cel(struct cxl_mem *cxlm, size_t size, u8 *cel)
> -{
> -	struct cel_entry {
> -		__le16 opcode;
> -		__le16 effect;
> -	} __packed * cel_entry;
> -	const int cel_entries = size / sizeof(*cel_entry);
> -	int i;
> -
> -	cel_entry = (struct cel_entry *)cel;
> -
> -	for (i = 0; i < cel_entries; i++) {
> -		u16 opcode = le16_to_cpu(cel_entry[i].opcode);
> -		struct cxl_mem_command *cmd = cxl_mem_find_command(opcode);
> -
> -		if (!cmd) {
> -			dev_dbg(cxlm->dev,
> -				"Opcode 0x%04x unsupported by driver", opcode);
> -			continue;
> -		}
> -
> -		set_bit(cmd->info.id, cxlm->enabled_cmds);
> -	}
> -}
> -
> -struct cxl_mbox_get_supported_logs {
> -	__le16 entries;
> -	u8 rsvd[6];
> -	struct gsl_entry {
> -		uuid_t uuid;
> -		__le32 size;
> -	} __packed entry[];
> -} __packed;
> -
> -static struct cxl_mbox_get_supported_logs *cxl_get_gsl(struct cxl_mem *cxlm)
> -{
> -	struct cxl_mbox_get_supported_logs *ret;
> -	int rc;
> -
> -	ret = kvmalloc(cxlm->payload_size, GFP_KERNEL);
> -	if (!ret)
> -		return ERR_PTR(-ENOMEM);
> -
> -	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_GET_SUPPORTED_LOGS, NULL,
> -				   0, ret, cxlm->payload_size);
> -	if (rc < 0) {
> -		kvfree(ret);
> -		return ERR_PTR(rc);
> -	}
> -
> -	return ret;
> -}
> -
> -/**
> - * cxl_mem_get_partition_info - Get partition info
> - * @cxlm: cxl_mem instance to update partition info
> - *
> - * Retrieve the current partition info for the device specified.  If not 0, the
> - * 'next' values are pending and take affect on next cold reset.
> - *
> - * Return: 0 if no error: or the result of the mailbox command.
> - *
> - * See CXL @8.2.9.5.2.1 Get Partition Info
> - */
> -static int cxl_mem_get_partition_info(struct cxl_mem *cxlm)
> -{
> -	struct cxl_mbox_get_partition_info {
> -		__le64 active_volatile_cap;
> -		__le64 active_persistent_cap;
> -		__le64 next_volatile_cap;
> -		__le64 next_persistent_cap;
> -	} __packed pi;
> -	int rc;
> -
> -	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_GET_PARTITION_INFO,
> -				   NULL, 0, &pi, sizeof(pi));
> -	if (rc)
> -		return rc;
> -
> -	cxlm->active_volatile_bytes =
> -		le64_to_cpu(pi.active_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
> -	cxlm->active_persistent_bytes =
> -		le64_to_cpu(pi.active_persistent_cap) * CXL_CAPACITY_MULTIPLIER;
> -	cxlm->next_volatile_bytes =
> -		le64_to_cpu(pi.next_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
> -	cxlm->next_persistent_bytes =
> -		le64_to_cpu(pi.next_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
> -
> -	return 0;
> -}
> -
> -/**
> - * cxl_mem_enumerate_cmds() - Enumerate commands for a device.
> - * @cxlm: The device.
> - *
> - * Returns 0 if enumerate completed successfully.
> - *
> - * CXL devices have optional support for certain commands. This function will
> - * determine the set of supported commands for the hardware and update the
> - * enabled_cmds bitmap in the @cxlm.
> - */
> -static int cxl_mem_enumerate_cmds(struct cxl_mem *cxlm)
> -{
> -	struct cxl_mbox_get_supported_logs *gsl;
> -	struct device *dev = cxlm->dev;
> -	struct cxl_mem_command *cmd;
> -	int i, rc;
> -
> -	gsl = cxl_get_gsl(cxlm);
> -	if (IS_ERR(gsl))
> -		return PTR_ERR(gsl);
> -
> -	rc = -ENOENT;
> -	for (i = 0; i < le16_to_cpu(gsl->entries); i++) {
> -		u32 size = le32_to_cpu(gsl->entry[i].size);
> -		uuid_t uuid = gsl->entry[i].uuid;
> -		u8 *log;
> -
> -		dev_dbg(dev, "Found LOG type %pU of size %d", &uuid, size);
> -
> -		if (!uuid_equal(&uuid, &log_uuid[CEL_UUID]))
> -			continue;
> -
> -		log = kvmalloc(size, GFP_KERNEL);
> -		if (!log) {
> -			rc = -ENOMEM;
> -			goto out;
> -		}
> -
> -		rc = cxl_xfer_log(cxlm, &uuid, size, log);
> -		if (rc) {
> -			kvfree(log);
> -			goto out;
> -		}
> -
> -		cxl_walk_cel(cxlm, size, log);
> -		kvfree(log);
> -
> -		/* In case CEL was bogus, enable some default commands. */
> -		cxl_for_each_cmd(cmd)
> -			if (cmd->flags & CXL_CMD_FLAG_FORCE_ENABLE)
> -				set_bit(cmd->info.id, cxlm->enabled_cmds);
> -
> -		/* Found the required CEL */
> -		rc = 0;
> -	}
> -
> -out:
> -	kvfree(gsl);
> -	return rc;
> -}
> -
> -/**
> - * cxl_mem_identify() - Send the IDENTIFY command to the device.
> - * @cxlm: The device to identify.
> - *
> - * Return: 0 if identify was executed successfully.
> - *
> - * This will dispatch the identify command to the device and on success populate
> - * structures to be exported to sysfs.
> - */
> -static int cxl_mem_identify(struct cxl_mem *cxlm)
> -{
> -	/* See CXL 2.0 Table 175 Identify Memory Device Output Payload */
> -	struct cxl_mbox_identify {
> -		char fw_revision[0x10];
> -		__le64 total_capacity;
> -		__le64 volatile_capacity;
> -		__le64 persistent_capacity;
> -		__le64 partition_align;
> -		__le16 info_event_log_size;
> -		__le16 warning_event_log_size;
> -		__le16 failure_event_log_size;
> -		__le16 fatal_event_log_size;
> -		__le32 lsa_size;
> -		u8 poison_list_max_mer[3];
> -		__le16 inject_poison_limit;
> -		u8 poison_caps;
> -		u8 qos_telemetry_caps;
> -	} __packed id;
> -	int rc;
> -
> -	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_IDENTIFY, NULL, 0, &id,
> -				   sizeof(id));
> -	if (rc < 0)
> -		return rc;
> -
> -	cxlm->total_bytes = le64_to_cpu(id.total_capacity);
> -	cxlm->total_bytes *= CXL_CAPACITY_MULTIPLIER;
> -
> -	cxlm->volatile_only_bytes = le64_to_cpu(id.volatile_capacity);
> -	cxlm->volatile_only_bytes *= CXL_CAPACITY_MULTIPLIER;
> -
> -	cxlm->persistent_only_bytes = le64_to_cpu(id.persistent_capacity);
> -	cxlm->persistent_only_bytes *= CXL_CAPACITY_MULTIPLIER;
> -
> -	cxlm->partition_align_bytes = le64_to_cpu(id.partition_align);
> -	cxlm->partition_align_bytes *= CXL_CAPACITY_MULTIPLIER;
> -
> -	dev_dbg(cxlm->dev,
> -		"Identify Memory Device\n"
> -		"     total_bytes = %#llx\n"
> -		"     volatile_only_bytes = %#llx\n"
> -		"     persistent_only_bytes = %#llx\n"
> -		"     partition_align_bytes = %#llx\n",
> -		cxlm->total_bytes, cxlm->volatile_only_bytes,
> -		cxlm->persistent_only_bytes, cxlm->partition_align_bytes);
> -
> -	cxlm->lsa_size = le32_to_cpu(id.lsa_size);
> -	memcpy(cxlm->firmware_version, id.fw_revision, sizeof(id.fw_revision));
> -
> -	return 0;
> -}
> -
> -static int cxl_mem_create_range_info(struct cxl_mem *cxlm)
> -{
> -	int rc;
> -
> -	if (cxlm->partition_align_bytes == 0) {
> -		cxlm->ram_range.start = 0;
> -		cxlm->ram_range.end = cxlm->volatile_only_bytes - 1;
> -		cxlm->pmem_range.start = cxlm->volatile_only_bytes;
> -		cxlm->pmem_range.end = cxlm->volatile_only_bytes +
> -					cxlm->persistent_only_bytes - 1;
> -		return 0;
> -	}
> -
> -	rc = cxl_mem_get_partition_info(cxlm);
> -	if (rc < 0) {
> -		dev_err(cxlm->dev, "Failed to query partition information\n");
> -		return rc;
> -	}
> -
> -	dev_dbg(cxlm->dev,
> -		"Get Partition Info\n"
> -		"     active_volatile_bytes = %#llx\n"
> -		"     active_persistent_bytes = %#llx\n"
> -		"     next_volatile_bytes = %#llx\n"
> -		"     next_persistent_bytes = %#llx\n",
> -		cxlm->active_volatile_bytes, cxlm->active_persistent_bytes,
> -		cxlm->next_volatile_bytes, cxlm->next_persistent_bytes);
> -
> -	cxlm->ram_range.start = 0;
> -	cxlm->ram_range.end = cxlm->active_volatile_bytes - 1;
> -
> -	cxlm->pmem_range.start = cxlm->active_volatile_bytes;
> -	cxlm->pmem_range.end = cxlm->active_volatile_bytes +
> -				cxlm->active_persistent_bytes - 1;
> -
> -	return 0;
> -}
> -
>  static int cxl_mem_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  {
>  	struct cxl_memdev *cxlmd;
> @@ -1458,7 +545,7 @@ static int cxl_mem_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (rc)
>  		return rc;
>  
> -	cxlmd = devm_cxl_add_memdev(cxlm, &cxl_memdev_fops);
> +	cxlmd = devm_cxl_add_memdev(cxlm);
>  	if (IS_ERR(cxlmd))
>  		return PTR_ERR(cxlmd);
>  
> @@ -1486,7 +573,6 @@ static struct pci_driver cxl_mem_driver = {
>  
>  static __init int cxl_mem_init(void)
>  {
> -	struct dentry *mbox_debugfs;
>  	int rc;
>  
>  	/* Double check the anonymous union trickery in struct cxl_regs */
> @@ -1497,17 +583,11 @@ static __init int cxl_mem_init(void)
>  	if (rc)
>  		return rc;
>  
> -	cxl_debugfs = debugfs_create_dir("cxl", NULL);
> -	mbox_debugfs = debugfs_create_dir("mbox", cxl_debugfs);
> -	debugfs_create_bool("raw_allow_all", 0600, mbox_debugfs,
> -			    &cxl_raw_allow_all);
> -
>  	return 0;
>  }
>  
>  static __exit void cxl_mem_exit(void)
>  {
> -	debugfs_remove_recursive(cxl_debugfs);
>  	pci_unregister_driver(&cxl_mem_driver);
>  }
>  
> 


^ permalink raw reply	[flat|nested] 78+ messages in thread
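
The net effect of the move above is that drivers/cxl/pci.c now consumes
the mailbox layer purely through the entry points declared in cxlmem.h.
As a rough sketch of how those calls string together at probe time
(assuming only the signatures quoted above; register and mailbox setup
plus error unwinding abbreviated, so treat this as an illustration
rather than the actual cxl_mem_probe() body):

	struct cxl_mem *cxlm;
	int rc;

	cxlm = cxl_mem_create(&pdev->dev);	/* alloc + mbox_mutex/bitmap setup */
	if (IS_ERR(cxlm))
		return PTR_ERR(cxlm);

	/* ... cxl_mem_setup_regs() / cxl_mem_setup_mailbox() elided ... */

	rc = cxl_mem_enumerate_cmds(cxlm);	/* walk the CEL, fill enabled_cmds */
	if (rc)
		return rc;

	rc = cxl_mem_identify(cxlm);		/* IDENTIFY -> capacities, lsa_size */
	if (rc)
		return rc;

	rc = cxl_mem_create_range_info(cxlm);	/* derive ram_range / pmem_range */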

* Re: [PATCH v4 14/21] cxl/mbox: Add exclusive kernel command support
  2021-09-09  5:12 ` [PATCH v4 14/21] cxl/mbox: Add exclusive kernel command support Dan Williams
  2021-09-09 17:02   ` Ben Widawsky
@ 2021-09-10  9:33   ` Jonathan Cameron
  2021-09-13 23:46     ` Dan Williams
  2021-09-14 19:03   ` [PATCH v5 " Dan Williams
  2 siblings, 1 reply; 78+ messages in thread
From: Jonathan Cameron @ 2021-09-10  9:33 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Ben Widawsky, vishal.l.verma, nvdimm,
	alison.schofield, ira.weiny

On Wed, 8 Sep 2021 22:12:49 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> The CXL_PMEM driver expects exclusive control of the label storage area
> space. Similar to the LIBNVDIMM expectation that the label storage area
> is only writable from userspace when the corresponding memory device is
> not active in any region, the expectation is the native CXL_PCI UAPI
> path is disabled while the cxl_nvdimm for a given cxl_memdev device is
> active in LIBNVDIMM.
> 
> Add the ability to toggle the availability of a given command for the
> UAPI path. Use that new capability to shutdown changes to partitions and
> the label storage area while the cxl_nvdimm device is actively proxying
> commands for LIBNVDIMM.
> 
> Acked-by: Ben Widawsky <ben.widawsky@intel.com>
> Link: https://lore.kernel.org/r/162982123298.1124374.22718002900700392.stgit@dwillia2-desk3.amr.corp.intel.com
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

In an ideal world I'd like to have seen this as a no-op patch going from devm
to non-devm for cleanup, followed by the new stuff. Meh, the world isn't ideal
and all that sort of nice stuff takes time!

Whilst I'm not that keen on the exact form of the code in probe(), it will
be easier to read when not a diff, so if you prefer to keep it as you have
it I won't object - it just took a little more careful reading than I'd like.

Thanks,

Jonathan


> ---
>  drivers/cxl/core/mbox.c   |    5 +++++
>  drivers/cxl/core/memdev.c |   31 +++++++++++++++++++++++++++++++
>  drivers/cxl/cxlmem.h      |    4 ++++
>  drivers/cxl/pmem.c        |   43 ++++++++++++++++++++++++++++++++-----------
>  4 files changed, 72 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 422999740649..82e79da195fa 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -221,6 +221,7 @@ static bool cxl_mem_raw_command_allowed(u16 opcode)
>   *  * %-EINVAL	- Reserved fields or invalid values were used.
>   *  * %-ENOMEM	- Input or output buffer wasn't sized properly.
>   *  * %-EPERM	- Attempted to use a protected command.
> + *  * %-EBUSY	- Kernel has claimed exclusive access to this opcode
>   *
>   * The result of this command is a fully validated command in @out_cmd that is
>   * safe to send to the hardware.
> @@ -296,6 +297,10 @@ static int cxl_validate_cmd_from_user(struct cxl_mem *cxlm,
>  	if (!test_bit(info->id, cxlm->enabled_cmds))
>  		return -ENOTTY;
>  
> +	/* Check that the command is not claimed for exclusive kernel use */
> +	if (test_bit(info->id, cxlm->exclusive_cmds))
> +		return -EBUSY;
> +
>  	/* Check the input buffer is the expected size */
>  	if (info->size_in >= 0 && info->size_in != send_cmd->in.size)
>  		return -ENOMEM;
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index df2ba87238c2..d9ade5b92330 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -134,6 +134,37 @@ static const struct device_type cxl_memdev_type = {
>  	.groups = cxl_memdev_attribute_groups,
>  };
>  
> +/**
> + * set_exclusive_cxl_commands() - atomically disable user cxl commands
> + * @cxlm: cxl_mem instance to modify
> + * @cmds: bitmap of commands to mark exclusive
> + *
> + * Flush the ioctl path and disable future execution of commands with
> + * the command ids set in @cmds.

It's not obvious this function is doing that 'flush'; perhaps consider rewording?
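
FWIW the 'flush' falls out of cxl_memdev_rwsem: the ioctl path takes the
rwsem for read around every submission, so the down_write() below can't
proceed until in-flight commands drain. Maybe a one-line comment along
these lines (just a suggestion):

	/* down_write() waits out readers in cxl_memdev_ioctl() */

would save the next person the lookup.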

> + */
> +void set_exclusive_cxl_commands(struct cxl_mem *cxlm, unsigned long *cmds)
> +{
> +	down_write(&cxl_memdev_rwsem);
> +	bitmap_or(cxlm->exclusive_cmds, cxlm->exclusive_cmds, cmds,
> +		  CXL_MEM_COMMAND_ID_MAX);
> +	up_write(&cxl_memdev_rwsem);
> +}
> +EXPORT_SYMBOL_GPL(set_exclusive_cxl_commands);
> +
> +/**
> + * clear_exclusive_cxl_commands() - atomically enable user cxl commands
> + * @cxlm: cxl_mem instance to modify
> + * @cmds: bitmap of commands to mark available for userspace
> + */
> +void clear_exclusive_cxl_commands(struct cxl_mem *cxlm, unsigned long *cmds)
> +{
> +	down_write(&cxl_memdev_rwsem);
> +	bitmap_andnot(cxlm->exclusive_cmds, cxlm->exclusive_cmds, cmds,
> +		      CXL_MEM_COMMAND_ID_MAX);
> +	up_write(&cxl_memdev_rwsem);
> +}
> +EXPORT_SYMBOL_GPL(clear_exclusive_cxl_commands);
> +
>  static void cxl_memdev_shutdown(struct device *dev)
>  {
>  	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 16201b7d82d2..468b7b8be207 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -101,6 +101,7 @@ struct cxl_mbox_cmd {
>   * @mbox_mutex: Mutex to synchronize mailbox access.
>   * @firmware_version: Firmware version for the memory device.
>   * @enabled_cmds: Hardware commands found enabled in CEL.
> + * @exclusive_cmds: Commands that are kernel-internal only
>   * @pmem_range: Active Persistent memory capacity configuration
>   * @ram_range: Active Volatile memory capacity configuration
>   * @total_bytes: sum of all possible capacities
> @@ -127,6 +128,7 @@ struct cxl_mem {
>  	struct mutex mbox_mutex; /* Protects device mailbox and firmware */
>  	char firmware_version[0x10];
>  	DECLARE_BITMAP(enabled_cmds, CXL_MEM_COMMAND_ID_MAX);
> +	DECLARE_BITMAP(exclusive_cmds, CXL_MEM_COMMAND_ID_MAX);
>  
>  	struct range pmem_range;
>  	struct range ram_range;
> @@ -200,4 +202,6 @@ int cxl_mem_identify(struct cxl_mem *cxlm);
>  int cxl_mem_enumerate_cmds(struct cxl_mem *cxlm);
>  int cxl_mem_create_range_info(struct cxl_mem *cxlm);
>  struct cxl_mem *cxl_mem_create(struct device *dev);
> +void set_exclusive_cxl_commands(struct cxl_mem *cxlm, unsigned long *cmds);
> +void clear_exclusive_cxl_commands(struct cxl_mem *cxlm, unsigned long *cmds);
>  #endif /* __CXL_MEM_H__ */
> diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
> index 9652c3ee41e7..a972af7a6e0b 100644
> --- a/drivers/cxl/pmem.c
> +++ b/drivers/cxl/pmem.c
> @@ -16,10 +16,7 @@
>   */
>  static struct workqueue_struct *cxl_pmem_wq;
>  
> -static void unregister_nvdimm(void *nvdimm)
> -{
> -	nvdimm_delete(nvdimm);
> -}
> +static __read_mostly DECLARE_BITMAP(exclusive_cmds, CXL_MEM_COMMAND_ID_MAX);
>  
>  static int match_nvdimm_bridge(struct device *dev, const void *data)
>  {
> @@ -36,12 +33,25 @@ static struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(void)
>  	return to_cxl_nvdimm_bridge(dev);
>  }
>  
> +static void cxl_nvdimm_remove(struct device *dev)
> +{
> +	struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev);
> +	struct nvdimm *nvdimm = dev_get_drvdata(dev);
> +	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
> +	struct cxl_mem *cxlm = cxlmd->cxlm;

Given cxlmd isn't used, perhaps combine the two lines above?
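
i.e. just

	struct cxl_mem *cxlm = cxl_nvd->cxlmd->cxlm;

(untested, and moot if cxlmd grows another user later in the series).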

> +
> +	nvdimm_delete(nvdimm);
> +	clear_exclusive_cxl_commands(cxlm, exclusive_cmds);
> +}
> +
>  static int cxl_nvdimm_probe(struct device *dev)
>  {
>  	struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev);
> +	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
> +	struct cxl_mem *cxlm = cxlmd->cxlm;

Again, cxlmd is not otherwise used, so you could save a line of code
without losing anything (unless it gets used in a later patch, of
course!)

>  	struct cxl_nvdimm_bridge *cxl_nvb;
> +	struct nvdimm *nvdimm = NULL;
>  	unsigned long flags = 0;
> -	struct nvdimm *nvdimm;
>  	int rc = -ENXIO;
>  
>  	cxl_nvb = cxl_find_nvdimm_bridge();
> @@ -50,25 +60,32 @@ static int cxl_nvdimm_probe(struct device *dev)
>  
>  	device_lock(&cxl_nvb->dev);
>  	if (!cxl_nvb->nvdimm_bus)
> -		goto out;
> +		goto out_unlock;
> +
> +	set_exclusive_cxl_commands(cxlm, exclusive_cmds);
>  
>  	set_bit(NDD_LABELING, &flags);
> +	rc = -ENOMEM;

Hmm. Setting rc to an error value even in the good path is a bit
unusual.  I'd just add the few lines to set rc = -ENXIO only in the
error path above, and rc = -ENOMEM here only if nvdimm_create fails.

What you have strikes me as a bit too clever :)
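
i.e. something like this (sketch only, not even compile tested):

	nvdimm = nvdimm_create(cxl_nvb->nvdimm_bus, cxl_nvd, NULL, flags,
			       0, 0, NULL);
	if (!nvdimm) {
		rc = -ENOMEM;
		goto out_unlock;
	}

so rc only ever carries an error in the paths that actually failed.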

>  	nvdimm = nvdimm_create(cxl_nvb->nvdimm_bus, cxl_nvd, NULL, flags, 0, 0,
>  			       NULL);
> -	if (!nvdimm)
> -		goto out;
> +	dev_set_drvdata(dev, nvdimm);
>  
> -	rc = devm_add_action_or_reset(dev, unregister_nvdimm, nvdimm);
> -out:
> +out_unlock:
>  	device_unlock(&cxl_nvb->dev);
>  	put_device(&cxl_nvb->dev);
>  
> -	return rc;
> +	if (!nvdimm) {

If you change the above as suggested this becomes a simple if (ret)

> +		clear_exclusive_cxl_commands(cxlm, exclusive_cmds);
> +		return rc;

> +	}
> +
> +	return 0;
>  }
>  
>  static struct cxl_driver cxl_nvdimm_driver = {
>  	.name = "cxl_nvdimm",
>  	.probe = cxl_nvdimm_probe,
> +	.remove = cxl_nvdimm_remove,
>  	.id = CXL_DEVICE_NVDIMM,
>  };
>  
> @@ -194,6 +211,10 @@ static __init int cxl_pmem_init(void)
>  {
>  	int rc;
>  
> +	set_bit(CXL_MEM_COMMAND_ID_SET_PARTITION_INFO, exclusive_cmds);
> +	set_bit(CXL_MEM_COMMAND_ID_SET_SHUTDOWN_STATE, exclusive_cmds);
> +	set_bit(CXL_MEM_COMMAND_ID_SET_LSA, exclusive_cmds);
> +
>  	cxl_pmem_wq = alloc_ordered_workqueue("cxl_pmem", 0);
>  	if (!cxl_pmem_wq)
>  		return -ENXIO;
> 


^ permalink raw reply	[flat|nested] 78+ messages in thread
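
Taken together, the hunks above boil down to a claim/release pairing
tied to the cxl_nvdimm device lifetime; condensed (this just restates
the quoted code, with the bridge lookup and nvdimm_create() plumbing
elided):

	static __read_mostly DECLARE_BITMAP(exclusive_cmds, CXL_MEM_COMMAND_ID_MAX);

	/* probe: ioctl submission of SET_LSA et al now returns -EBUSY */
	set_exclusive_cxl_commands(cxlm, exclusive_cmds);

	/* remove: after nvdimm_delete(), hand the commands back to userspace */
	nvdimm_delete(nvdimm);
	clear_exclusive_cxl_commands(cxlm, exclusive_cmds);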

* Re: [PATCH v4 15/21] cxl/pmem: Translate NVDIMM label commands to CXL label commands
  2021-09-09 20:32       ` Ben Widawsky
@ 2021-09-10  9:39         ` Jonathan Cameron
  0 siblings, 0 replies; 78+ messages in thread
From: Jonathan Cameron @ 2021-09-10  9:39 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: Dan Williams, linux-cxl, Vishal L Verma, Linux NVDIMM, Schofield,
	Alison, Weiny, Ira

On Thu, 9 Sep 2021 13:32:14 -0700
Ben Widawsky <ben.widawsky@intel.com> wrote:

> On 21-09-09 12:03:49, Dan Williams wrote:
> > On Thu, Sep 9, 2021 at 10:22 AM Ben Widawsky <ben.widawsky@intel.com> wrote:  
> > >
> > > On 21-09-08 22:12:54, Dan Williams wrote:  
> > > > The LIBNVDIMM IOCTL UAPI calls back to the nvdimm-bus-provider to
> > > > translate the Linux command payload to the device native command format.
> > > > The LIBNVDIMM commands get-config-size, get-config-data, and
> > > > set-config-data, map to the CXL memory device commands device-identify,
> > > > get-lsa, and set-lsa. Recall that the label-storage-area (LSA) on an
> > > > NVDIMM device arranges for the provisioning of namespaces. Additionally
> > > > for CXL the LSA is used for provisioning regions as well.
> > > >
> > > > The data from device-identify is already cached in the 'struct cxl_mem'
> > > > instance associated with @cxl_nvd, so that payload return is simply
> > > > crafted and no CXL command is issued. The conversion for get-lsa is
> > > > straightforward, but the conversion for set-lsa requires an allocation
> > > > to append the set-lsa header in front of the payload.
> > > >
> > > > Acked-by: Ben Widawsky <ben.widawsky@intel.com>
> > > > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> > > > ---
> > > >  drivers/cxl/pmem.c |  125 ++++++++++++++++++++++++++++++++++++++++++++++++++--
> > > >  1 file changed, 121 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
> > > > index a972af7a6e0b..29d24f13aa73 100644
> > > > --- a/drivers/cxl/pmem.c
> > > > +++ b/drivers/cxl/pmem.c
> > > > @@ -1,6 +1,7 @@
> > > >  // SPDX-License-Identifier: GPL-2.0-only
> > > >  /* Copyright(c) 2021 Intel Corporation. All rights reserved. */
> > > >  #include <linux/libnvdimm.h>
> > > > +#include <asm/unaligned.h>
> > > >  #include <linux/device.h>
> > > >  #include <linux/module.h>
> > > >  #include <linux/ndctl.h>
> > > > @@ -48,10 +49,10 @@ static int cxl_nvdimm_probe(struct device *dev)
> > > >  {
> > > >       struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev);
> > > >       struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
> > > > +     unsigned long flags = 0, cmd_mask = 0;
> > > >       struct cxl_mem *cxlm = cxlmd->cxlm;
> > > >       struct cxl_nvdimm_bridge *cxl_nvb;
> > > >       struct nvdimm *nvdimm = NULL;
> > > > -     unsigned long flags = 0;
> > > >       int rc = -ENXIO;
> > > >
> > > >       cxl_nvb = cxl_find_nvdimm_bridge();
> > > > @@ -66,8 +67,11 @@ static int cxl_nvdimm_probe(struct device *dev)
> > > >
> > > >       set_bit(NDD_LABELING, &flags);
> > > >       rc = -ENOMEM;
> > > > -     nvdimm = nvdimm_create(cxl_nvb->nvdimm_bus, cxl_nvd, NULL, flags, 0, 0,
> > > > -                            NULL);
> > > > +     set_bit(ND_CMD_GET_CONFIG_SIZE, &cmd_mask);
> > > > +     set_bit(ND_CMD_GET_CONFIG_DATA, &cmd_mask);
> > > > +     set_bit(ND_CMD_SET_CONFIG_DATA, &cmd_mask);
> > > > +     nvdimm = nvdimm_create(cxl_nvb->nvdimm_bus, cxl_nvd, NULL, flags,
> > > > +                            cmd_mask, 0, NULL);
> > > >       dev_set_drvdata(dev, nvdimm);
> > > >
> > > >  out_unlock:
> > > > @@ -89,11 +93,124 @@ static struct cxl_driver cxl_nvdimm_driver = {
> > > >       .id = CXL_DEVICE_NVDIMM,
> > > >  };
> > > >
> > > > +static int cxl_pmem_get_config_size(struct cxl_mem *cxlm,
> > > > +                                 struct nd_cmd_get_config_size *cmd,
> > > > +                                 unsigned int buf_len, int *cmd_rc)
> > > > +{
> > > > +     if (sizeof(*cmd) > buf_len)
> > > > +             return -EINVAL;
> > > > +
> > > > +     *cmd = (struct nd_cmd_get_config_size) {
> > > > +              .config_size = cxlm->lsa_size,
> > > > +              .max_xfer = cxlm->payload_size,
> > > > +     };
> > > > +     *cmd_rc = 0;
> > > > +
> > > > +     return 0;
> > > > +}
> > > > +
> > > > +static int cxl_pmem_get_config_data(struct cxl_mem *cxlm,
> > > > +                                 struct nd_cmd_get_config_data_hdr *cmd,
> > > > +                                 unsigned int buf_len, int *cmd_rc)
> > > > +{
> > > > +     struct cxl_mbox_get_lsa {
> > > > +             u32 offset;
> > > > +             u32 length;
> > > > +     } get_lsa;
> > > > +     int rc;
> > > > +
> > > > +     if (sizeof(*cmd) > buf_len)
> > > > +             return -EINVAL;
> > > > +     if (struct_size(cmd, out_buf, cmd->in_length) > buf_len)
> > > > +             return -EINVAL;
> > > > +
> > > > +     get_lsa = (struct cxl_mbox_get_lsa) {
> > > > +             .offset = cmd->in_offset,
> > > > +             .length = cmd->in_length,
> > > > +     };
> > > > +
> > > > +     rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_GET_LSA, &get_lsa,
> > > > +                                sizeof(get_lsa), cmd->out_buf,
> > > > +                                cmd->in_length);
> > > > +     cmd->status = 0;
> > > > +     *cmd_rc = 0;
> > > > +
> > > > +     return rc;
> > > > +}
> > > > +
> > > > +static int cxl_pmem_set_config_data(struct cxl_mem *cxlm,
> > > > +                                 struct nd_cmd_set_config_hdr *cmd,
> > > > +                                 unsigned int buf_len, int *cmd_rc)
> > > > +{
> > > > +     struct cxl_mbox_set_lsa {
> > > > +             u32 offset;
> > > > +             u32 reserved;
> > > > +             u8 data[];
> > > > +     } *set_lsa;
> > > > +     int rc;
> > > > +
> > > > +     if (sizeof(*cmd) > buf_len)
> > > > +             return -EINVAL;
> > > > +
> > > > +     /* 4-byte status follows the input data in the payload */
> > > > +     if (struct_size(cmd, in_buf, cmd->in_length) + 4 > buf_len)
> > > > +             return -EINVAL;
> > > > +
> > > > +     set_lsa =
> > > > +             kvzalloc(struct_size(set_lsa, data, cmd->in_length), GFP_KERNEL);
> > > > +     if (!set_lsa)
> > > > +             return -ENOMEM;
> > > > +
> > > > +     *set_lsa = (struct cxl_mbox_set_lsa) {
> > > > +             .offset = cmd->in_offset,
> > > > +     };
> > > > +     memcpy(set_lsa->data, cmd->in_buf, cmd->in_length);
> > > > +
> > > > +     rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_SET_LSA, set_lsa,
> > > > +                                struct_size(set_lsa, data, cmd->in_length),
> > > > +                                NULL, 0);
> > > > +
> > > > +     /*
> > > > +      * Set "firmware" status (4-packed bytes at the end of the input
> > > > +      * payload).
> > > > +      */
> > > > +     put_unaligned(0, (u32 *) &cmd->in_buf[cmd->in_length]);
> > > > +     *cmd_rc = 0;
> > > > +     kvfree(set_lsa);
> > > > +
> > > > +     return rc;
> > > > +}
> > > > +
> > > > +static int cxl_pmem_nvdimm_ctl(struct nvdimm *nvdimm, unsigned int cmd,
> > > > +                            void *buf, unsigned int buf_len, int *cmd_rc)
> > > > +{
> > > > +     struct cxl_nvdimm *cxl_nvd = nvdimm_provider_data(nvdimm);
> > > > +     unsigned long cmd_mask = nvdimm_cmd_mask(nvdimm);
> > > > +     struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
> > > > +     struct cxl_mem *cxlm = cxlmd->cxlm;
> > > > +
> > > > +     if (!test_bit(cmd, &cmd_mask))
> > > > +             return -ENOTTY;
> > > > +
> > > > +     switch (cmd) {
> > > > +     case ND_CMD_GET_CONFIG_SIZE:
> > > > +             return cxl_pmem_get_config_size(cxlm, buf, buf_len, cmd_rc);
> > > > +     case ND_CMD_GET_CONFIG_DATA:
> > > > +             return cxl_pmem_get_config_data(cxlm, buf, buf_len, cmd_rc);
> > > > +     case ND_CMD_SET_CONFIG_DATA:
> > > > +             return cxl_pmem_set_config_data(cxlm, buf, buf_len, cmd_rc);
> > > > +     default:
> > > > +             return -ENOTTY;
> > > > +     }
> > > > +}
> > > > +  
> > >
> > > Is there some intended purpose for passing cmd_rc down, if it isn't actually
> > > ever used? Perhaps add it when needed later?  
> > 
> > Ah true, copy-pasta leftovers from other similar routines. I'll clean this up.  
> 
> With that,
> Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
on the basis that you fixed the one thing I moaned about in v3 :)

^ permalink raw reply	[flat|nested] 78+ messages in thread
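
For reference, the call chain when e.g. ndctl reads labels through this
path is roughly as follows (the intermediate LIBNVDIMM ioctl plumbing is
elided and recalled from memory, so treat it as approximate):

	ioctl(ND_CMD_GET_CONFIG_DATA)
	  -> nd_desc->ndctl() == cxl_pmem_ctl()
	    -> cxl_pmem_nvdimm_ctl()
	      -> cxl_pmem_get_config_data()
	        -> cxl_mem_mbox_send_cmd(CXL_MBOX_OP_GET_LSA)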

* Re: [PATCH v5 15/21] cxl/pmem: Translate NVDIMM label commands to CXL label commands
  2021-09-09 22:08   ` [PATCH v5 " Dan Williams
@ 2021-09-10  9:40     ` Jonathan Cameron
  0 siblings, 0 replies; 78+ messages in thread
From: Jonathan Cameron @ 2021-09-10  9:40 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, nvdimm, ben.widawsky, ira.weiny, vishal.l.verma

On Thu, 9 Sep 2021 15:08:15 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> The LIBNVDIMM IOCTL UAPI calls back to the nvdimm-bus-provider to
> translate the Linux command payload to the device native command format.
> The LIBNVDIMM commands get-config-size, get-config-data, and
> set-config-data, map to the CXL memory device commands device-identify,
> get-lsa, and set-lsa. Recall that the label-storage-area (LSA) on an
> NVDIMM device arranges for the provisioning of namespaces. Additionally
> for CXL the LSA is used for provisioning regions as well.
> 
> The data from device-identify is already cached in the 'struct cxl_mem'
> instance associated with @cxl_nvd, so that payload return is simply
> crafted and no CXL command is issued. The conversion for get-lsa is
> straightforward, but the conversion for set-lsa requires an allocation
> to append the set-lsa header in front of the payload.
> 
> Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

As I just sent to the outdated v4:
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
> Changes since v4:
> - Drop cmd_rc plumbing as it is presently unused (Ben)
> 
>  drivers/cxl/pmem.c |  128 ++++++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 124 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
> index a972af7a6e0b..d349a6cb01df 100644
> --- a/drivers/cxl/pmem.c
> +++ b/drivers/cxl/pmem.c
> @@ -1,6 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0-only
>  /* Copyright(c) 2021 Intel Corporation. All rights reserved. */
>  #include <linux/libnvdimm.h>
> +#include <asm/unaligned.h>
>  #include <linux/device.h>
>  #include <linux/module.h>
>  #include <linux/ndctl.h>
> @@ -48,10 +49,10 @@ static int cxl_nvdimm_probe(struct device *dev)
>  {
>  	struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev);
>  	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
> +	unsigned long flags = 0, cmd_mask = 0;
>  	struct cxl_mem *cxlm = cxlmd->cxlm;
>  	struct cxl_nvdimm_bridge *cxl_nvb;
>  	struct nvdimm *nvdimm = NULL;
> -	unsigned long flags = 0;
>  	int rc = -ENXIO;
>  
>  	cxl_nvb = cxl_find_nvdimm_bridge();
> @@ -66,8 +67,11 @@ static int cxl_nvdimm_probe(struct device *dev)
>  
>  	set_bit(NDD_LABELING, &flags);
>  	rc = -ENOMEM;
> -	nvdimm = nvdimm_create(cxl_nvb->nvdimm_bus, cxl_nvd, NULL, flags, 0, 0,
> -			       NULL);
> +	set_bit(ND_CMD_GET_CONFIG_SIZE, &cmd_mask);
> +	set_bit(ND_CMD_GET_CONFIG_DATA, &cmd_mask);
> +	set_bit(ND_CMD_SET_CONFIG_DATA, &cmd_mask);
> +	nvdimm = nvdimm_create(cxl_nvb->nvdimm_bus, cxl_nvd, NULL, flags,
> +			       cmd_mask, 0, NULL);
>  	dev_set_drvdata(dev, nvdimm);
>  
>  out_unlock:
> @@ -89,11 +93,127 @@ static struct cxl_driver cxl_nvdimm_driver = {
>  	.id = CXL_DEVICE_NVDIMM,
>  };
>  
> +static int cxl_pmem_get_config_size(struct cxl_mem *cxlm,
> +				    struct nd_cmd_get_config_size *cmd,
> +				    unsigned int buf_len)
> +{
> +	if (sizeof(*cmd) > buf_len)
> +		return -EINVAL;
> +
> +	*cmd = (struct nd_cmd_get_config_size) {
> +		 .config_size = cxlm->lsa_size,
> +		 .max_xfer = cxlm->payload_size,
> +	};
> +
> +	return 0;
> +}
> +
> +static int cxl_pmem_get_config_data(struct cxl_mem *cxlm,
> +				    struct nd_cmd_get_config_data_hdr *cmd,
> +				    unsigned int buf_len)
> +{
> +	struct cxl_mbox_get_lsa {
> +		u32 offset;
> +		u32 length;
> +	} get_lsa;
> +	int rc;
> +
> +	if (sizeof(*cmd) > buf_len)
> +		return -EINVAL;
> +	if (struct_size(cmd, out_buf, cmd->in_length) > buf_len)
> +		return -EINVAL;
> +
> +	get_lsa = (struct cxl_mbox_get_lsa) {
> +		.offset = cmd->in_offset,
> +		.length = cmd->in_length,
> +	};
> +
> +	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_GET_LSA, &get_lsa,
> +				   sizeof(get_lsa), cmd->out_buf,
> +				   cmd->in_length);
> +	cmd->status = 0;
> +
> +	return rc;
> +}
> +
> +static int cxl_pmem_set_config_data(struct cxl_mem *cxlm,
> +				    struct nd_cmd_set_config_hdr *cmd,
> +				    unsigned int buf_len)
> +{
> +	struct cxl_mbox_set_lsa {
> +		u32 offset;
> +		u32 reserved;
> +		u8 data[];
> +	} *set_lsa;
> +	int rc;
> +
> +	if (sizeof(*cmd) > buf_len)
> +		return -EINVAL;
> +
> +	/* 4-byte status follows the input data in the payload */
> +	if (struct_size(cmd, in_buf, cmd->in_length) + 4 > buf_len)
> +		return -EINVAL;
> +
> +	set_lsa =
> +		kvzalloc(struct_size(set_lsa, data, cmd->in_length), GFP_KERNEL);
> +	if (!set_lsa)
> +		return -ENOMEM;
> +
> +	*set_lsa = (struct cxl_mbox_set_lsa) {
> +		.offset = cmd->in_offset,
> +	};
> +	memcpy(set_lsa->data, cmd->in_buf, cmd->in_length);
> +
> +	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_SET_LSA, set_lsa,
> +				   struct_size(set_lsa, data, cmd->in_length),
> +				   NULL, 0);
> +
> +	/*
> +	 * Set "firmware" status (4-packed bytes at the end of the input
> +	 * payload).
> +	 */
> +	put_unaligned(0, (u32 *) &cmd->in_buf[cmd->in_length]);
> +	kvfree(set_lsa);
> +
> +	return rc;
> +}
> +
> +static int cxl_pmem_nvdimm_ctl(struct nvdimm *nvdimm, unsigned int cmd,
> +			       void *buf, unsigned int buf_len)
> +{
> +	struct cxl_nvdimm *cxl_nvd = nvdimm_provider_data(nvdimm);
> +	unsigned long cmd_mask = nvdimm_cmd_mask(nvdimm);
> +	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
> +	struct cxl_mem *cxlm = cxlmd->cxlm;
> +
> +	if (!test_bit(cmd, &cmd_mask))
> +		return -ENOTTY;
> +
> +	switch (cmd) {
> +	case ND_CMD_GET_CONFIG_SIZE:
> +		return cxl_pmem_get_config_size(cxlm, buf, buf_len);
> +	case ND_CMD_GET_CONFIG_DATA:
> +		return cxl_pmem_get_config_data(cxlm, buf, buf_len);
> +	case ND_CMD_SET_CONFIG_DATA:
> +		return cxl_pmem_set_config_data(cxlm, buf, buf_len);
> +	default:
> +		return -ENOTTY;
> +	}
> +}
> +
>  static int cxl_pmem_ctl(struct nvdimm_bus_descriptor *nd_desc,
>  			struct nvdimm *nvdimm, unsigned int cmd, void *buf,
>  			unsigned int buf_len, int *cmd_rc)
>  {
> -	return -ENOTTY;
> +	/*
> +	 * No firmware response to translate, let the transport error
> +	 * code take precedence.
> +	 */
> +	*cmd_rc = 0;
> +
> +	if (!nvdimm)
> +		return -ENOTTY;
> +	return cxl_pmem_nvdimm_ctl(nvdimm, cmd, buf, buf_len);
>  }
>  
>  static bool online_nvdimm_bus(struct cxl_nvdimm_bridge *cxl_nvb)
> 


^ permalink raw reply	[flat|nested] 78+ messages in thread
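
One convention in the set-config-data path that is easy to miss: the
ioctl buffer carries cmd->in_length bytes of label data immediately
followed by a 4-byte "firmware" status word. That is why the length
check adds 4 and why put_unaligned() lands the status at
&cmd->in_buf[cmd->in_length]. Sketched against the UAPI header (from
memory, so double-check include/uapi/linux/ndctl.h):

	struct nd_cmd_set_config_hdr {
		__u32 in_offset;
		__u32 in_length;
		__u8 in_buf[];	/* in_length data bytes, then a 4-byte status */
	};

	/* hence the validation quoted above: */
	if (struct_size(cmd, in_buf, cmd->in_length) + 4 > buf_len)
		return -EINVAL;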

* Re: [PATCH v4 17/21] tools/testing/cxl: Introduce a mocked-up CXL port hierarchy
  2021-09-09  5:13 ` [PATCH v4 17/21] tools/testing/cxl: Introduce a mocked-up CXL port hierarchy Dan Williams
@ 2021-09-10  9:53   ` Jonathan Cameron
  2021-09-10 18:46     ` Dan Williams
  2021-09-14 19:14   ` [PATCH v5 " Dan Williams
  1 sibling, 1 reply; 78+ messages in thread
From: Jonathan Cameron @ 2021-09-10  9:53 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Ben Widawsky, Vishal Verma, nvdimm, alison.schofield,
	ira.weiny

On Wed, 8 Sep 2021 22:13:04 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> Create an environment for CXL plumbing unit tests. Especially when it
> comes to an algorithm for HDM Decoder (Host-managed Device Memory
> Decoder) programming, the availability of an in-kernel-tree emulation
> environment for CXL configuration complexity and corner cases speeds
> development and deters regressions.
> 
> The approach taken mirrors what was done for tools/testing/nvdimm/. I.e.
> an external module, cxl_test.ko built out of the tools/testing/cxl/
> directory, provides mock implementations of kernel APIs and kernel
> objects to simulate a real world device hierarchy.
> 
> One feedback for the tools/testing/nvdimm/ proposal was "why not do this
> in QEMU?". In fact, the CXL development community has developed a QEMU
> model for CXL [1]. However, there are a few blocking issues that keep
> QEMU from being a tight fit for topology + provisioning unit tests:
> 
> 1/ The QEMU community has yet to show interest in merging any of this
>    support that has had patches on the list since November 2020. So,
>    testing CXL to date involves building custom QEMU with out-of-tree
>    patches.
> 
> 2/ CXL mechanisms like cross-host-bridge interleave do not have a clear
>    path to be emulated by QEMU without major infrastructure work. This
>    is easier to achieve with the alloc_mock_res() approach taken in this
>    patch to shortcut-define emulated system physical address ranges with
>    interleave behavior.
> 
> The QEMU enabling has been critical to get the driver off the ground,
> and may still move forward, but it does not address the ongoing needs of
> a regression testing environment and test driven development.
> 
> This patch adds an ACPI CXL Platform definition with emulated CXL
> multi-ported host-bridges. A follow on patch adds emulated memory
> expander devices.
> 
> Acked-by: Ben Widawsky <ben.widawsky@intel.com>
> Reported-by: Vishal Verma <vishal.l.verma@intel.com>
> Link: https://lore.kernel.org/r/20210202005948.241655-1-ben.widawsky@intel.com [1]
> Link: https://lore.kernel.org/r/162982125348.1124374.17808192318402734926.stgit@dwillia2-desk3.amr.corp.intel.com
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
A trivial comment below, but I'm fine with leaving that one change in here
as it is only a very small amount of noise.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
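
For anyone who has not looked at the nfit_test precedent: the
interposition is plain linker symbol wrapping, which is what lets the
out-of-tree test module answer selected kernel calls, e.g. the
non-canonical CEDT instance requested by cedt_instance() below. A
generic illustration only (not the actual cxl_test Kbuild; the helper
names here are made up):

	# Kbuild fragment
	ldflags-y += --wrap=acpi_get_table

	/* mock.c */
	acpi_status __wrap_acpi_get_table(char *signature, u32 instance,
					  struct acpi_table_header **out_table)
	{
		if (is_mock_instance(instance))	/* hypothetical predicate */
			return mock_acpi_get_table(signature, instance,
						   out_table);
		return __real_acpi_get_table(signature, instance, out_table);
	}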



> ---
>  drivers/cxl/acpi.c               |   40 ++-
>  drivers/cxl/cxl.h                |   16 +
>  tools/testing/cxl/Kbuild         |   36 +++
>  tools/testing/cxl/config_check.c |   13 +
>  tools/testing/cxl/mock_acpi.c    |  109 ++++++++
>  tools/testing/cxl/test/Kbuild    |    6 
>  tools/testing/cxl/test/cxl.c     |  509 ++++++++++++++++++++++++++++++++++++++
>  tools/testing/cxl/test/mock.c    |  171 +++++++++++++
>  tools/testing/cxl/test/mock.h    |   27 ++
>  9 files changed, 911 insertions(+), 16 deletions(-)
>  create mode 100644 tools/testing/cxl/Kbuild
>  create mode 100644 tools/testing/cxl/config_check.c
>  create mode 100644 tools/testing/cxl/mock_acpi.c
>  create mode 100644 tools/testing/cxl/test/Kbuild
>  create mode 100644 tools/testing/cxl/test/cxl.c
>  create mode 100644 tools/testing/cxl/test/mock.c
>  create mode 100644 tools/testing/cxl/test/mock.h
> 
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index 54e9d4d2cf5f..d31a97218593 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -182,15 +182,7 @@ static resource_size_t get_chbcr(struct acpi_cedt_chbs *chbs)
>  	return IS_ERR(chbs) ? CXL_RESOURCE_NONE : chbs->base;
>  }
>  
> -struct cxl_walk_context {
> -	struct device *dev;
> -	struct pci_bus *root;
> -	struct cxl_port *port;
> -	int error;
> -	int count;
> -};
> -
> -static int match_add_root_ports(struct pci_dev *pdev, void *data)
> +__mock int match_add_root_ports(struct pci_dev *pdev, void *data)
>  {
>  	struct cxl_walk_context *ctx = data;
>  	struct pci_bus *root_bus = ctx->root;
> @@ -239,15 +231,18 @@ static struct cxl_dport *find_dport_by_dev(struct cxl_port *port, struct device
>  	return NULL;
>  }
>  
> -static struct acpi_device *to_cxl_host_bridge(struct device *dev)
> +__mock struct acpi_device *to_cxl_host_bridge(struct device *host,
> +					      struct device *dev)
>  {
>  	struct acpi_device *adev = to_acpi_device(dev);
>  
>  	if (!acpi_pci_find_root(adev->handle))
>  		return NULL;
>  
> -	if (strcmp(acpi_device_hid(adev), "ACPI0016") == 0)
> +	if (strcmp(acpi_device_hid(adev), "ACPI0016") == 0) {
> +		dev_dbg(host, "found host bridge %s\n", dev_name(&adev->dev));

I didn't call it out in the previous review, but this hunk is technically
unrelated to the rest of the patch, even if useful.

>  		return adev;
> +	}
>  	return NULL;
>  }
>  
> @@ -257,9 +252,9 @@ static struct acpi_device *to_cxl_host_bridge(struct device *dev)
>   */
>  static int add_host_bridge_uport(struct device *match, void *arg)
>  {
> -	struct acpi_device *bridge = to_cxl_host_bridge(match);
>  	struct cxl_port *root_port = arg;
>  	struct device *host = root_port->dev.parent;
> +	struct acpi_device *bridge = to_cxl_host_bridge(host, match);
>  	struct acpi_pci_root *pci_root;
>  	struct cxl_walk_context ctx;
>  	struct cxl_decoder *cxld;
> @@ -323,7 +318,7 @@ static int add_host_bridge_dport(struct device *match, void *arg)
>  	struct acpi_cedt_chbs *chbs;
>  	struct cxl_port *root_port = arg;
>  	struct device *host = root_port->dev.parent;
> -	struct acpi_device *bridge = to_cxl_host_bridge(match);
> +	struct acpi_device *bridge = to_cxl_host_bridge(host, match);
>  
>  	if (!bridge)
>  		return 0;
> @@ -375,6 +370,17 @@ static int add_root_nvdimm_bridge(struct device *match, void *data)
>  	return 1;
>  }
>  
> +static u32 cedt_instance(struct platform_device *pdev)
> +{
> +	const bool *native_acpi0017 = acpi_device_get_match_data(&pdev->dev);
> +
> +	if (native_acpi0017 && *native_acpi0017)
> +		return 0;
> +
> +	/* for cxl_test request a non-canonical instance */
> +	return U32_MAX;
> +}
> +
>  static int cxl_acpi_probe(struct platform_device *pdev)
>  {
>  	int rc;
> @@ -388,7 +394,7 @@ static int cxl_acpi_probe(struct platform_device *pdev)
>  		return PTR_ERR(root_port);
>  	dev_dbg(host, "add: %s\n", dev_name(&root_port->dev));
>  
> -	status = acpi_get_table(ACPI_SIG_CEDT, 0, &acpi_cedt);
> +	status = acpi_get_table(ACPI_SIG_CEDT, cedt_instance(pdev), &acpi_cedt);
>  	if (ACPI_FAILURE(status))
>  		return -ENXIO;
>  
> @@ -419,9 +425,11 @@ static int cxl_acpi_probe(struct platform_device *pdev)
>  	return 0;
>  }
>  
> +static bool native_acpi0017 = true;
> +
>  static const struct acpi_device_id cxl_acpi_ids[] = {
> -	{ "ACPI0017", 0 },
> -	{ "", 0 },
> +	{ "ACPI0017", (unsigned long) &native_acpi0017 },
> +	{ },
>  };
>  MODULE_DEVICE_TABLE(acpi, cxl_acpi_ids);
>  
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 1b2e816e061e..c5152718267e 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -226,6 +226,14 @@ struct cxl_nvdimm {
>  	struct nvdimm *nvdimm;
>  };
>  
> +struct cxl_walk_context {
> +	struct device *dev;
> +	struct pci_bus *root;
> +	struct cxl_port *port;
> +	int error;
> +	int count;
> +};
> +
>  /**
>   * struct cxl_port - logical collection of upstream port devices and
>   *		     downstream port devices to construct a CXL memory
> @@ -325,4 +333,12 @@ struct cxl_nvdimm *to_cxl_nvdimm(struct device *dev);
>  bool is_cxl_nvdimm(struct device *dev);
>  int devm_cxl_add_nvdimm(struct device *host, struct cxl_memdev *cxlmd);
>  struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(void);
> +
> +/*
> + * Unit test builds overrides this to __weak, find the 'strong' version
> + * of these symbols in tools/testing/cxl/.
> + */
> +#ifndef __mock
> +#define __mock static
> +#endif
>  #endif /* __CXL_H__ */
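
The trick with __mock: in a normal driver build it expands to static, so
these helpers keep internal linkage. The unit test build (see the Kbuild
below) compiles the same sources with -D__mock=__weak, which turns them
into weak global symbols that strong definitions in mock_acpi.c override
at link time. A minimal sketch of the pattern, with an illustrative
function name:

    /* shared source, built into both the real and the test module */
    #ifndef __mock
    #define __mock static
    #endif

    __mock int cxl_get_id(void)   /* production: static; test: __weak */
    {
            return 0;
    }

    /* test-only object linked into the same module */
    int cxl_get_id(void)          /* strong symbol wins over the weak one */
    {
            return 42;
    }
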
> diff --git a/tools/testing/cxl/Kbuild b/tools/testing/cxl/Kbuild
> new file mode 100644
> index 000000000000..63a4a07e71c4
> --- /dev/null
> +++ b/tools/testing/cxl/Kbuild
> @@ -0,0 +1,36 @@
> +# SPDX-License-Identifier: GPL-2.0
> +ldflags-y += --wrap=is_acpi_device_node
> +ldflags-y += --wrap=acpi_get_table
> +ldflags-y += --wrap=acpi_put_table
> +ldflags-y += --wrap=acpi_evaluate_integer
> +ldflags-y += --wrap=acpi_pci_find_root
> +ldflags-y += --wrap=pci_walk_bus
> +ldflags-y += --wrap=nvdimm_bus_register
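
For anyone unfamiliar with the mechanism: --wrap is a GNU ld feature, not
something defined by this patch. For a symbol foo, the linker redirects
every call to foo() in these objects to __wrap_foo(), and the original
stays reachable as __real_foo(). A minimal sketch with an illustrative
symbol:

    /* linked with: ld --wrap=foo */
    int __real_foo(int arg);      /* resolves to the original foo() */

    int __wrap_foo(int arg)       /* every call to foo() lands here */
    {
            /* intercept, then optionally fall through */
            return __real_foo(arg);
    }

Note that test/mock.c below calls e.g. acpi_get_table() directly on its
fallback paths; that works because cxl_mock.ko is built from test/Kbuild
without these ldflags, so its calls are not rewritten.
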
> +
> +DRIVERS := ../../../drivers
> +CXL_SRC := $(DRIVERS)/cxl
> +CXL_CORE_SRC := $(DRIVERS)/cxl/core
> +ccflags-y := -I$(srctree)/drivers/cxl/
> +ccflags-y += -D__mock=__weak
> +
> +obj-m += cxl_acpi.o
> +
> +cxl_acpi-y := $(CXL_SRC)/acpi.o
> +cxl_acpi-y += mock_acpi.o
> +cxl_acpi-y += config_check.o
> +
> +obj-m += cxl_pmem.o
> +
> +cxl_pmem-y := $(CXL_SRC)/pmem.o
> +cxl_pmem-y += config_check.o
> +
> +obj-m += cxl_core.o
> +
> +cxl_core-y := $(CXL_CORE_SRC)/bus.o
> +cxl_core-y += $(CXL_CORE_SRC)/pmem.o
> +cxl_core-y += $(CXL_CORE_SRC)/regs.o
> +cxl_core-y += $(CXL_CORE_SRC)/memdev.o
> +cxl_core-y += $(CXL_CORE_SRC)/mbox.o
> +cxl_core-y += config_check.o
> +
> +obj-m += test/
> diff --git a/tools/testing/cxl/config_check.c b/tools/testing/cxl/config_check.c
> new file mode 100644
> index 000000000000..de5e5b3652fd
> --- /dev/null
> +++ b/tools/testing/cxl/config_check.c
> @@ -0,0 +1,13 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <linux/bug.h>
> +
> +void check(void)
> +{
> +	/*
> +	 * These kconfig symbols must be set to "m" for cxl_test to load
> +	 * and operate.
> +	 */
> +	BUILD_BUG_ON(!IS_MODULE(CONFIG_CXL_BUS));
> +	BUILD_BUG_ON(!IS_MODULE(CONFIG_CXL_ACPI));
> +	BUILD_BUG_ON(!IS_MODULE(CONFIG_CXL_PMEM));
> +}
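
In other words, the required .config fragment for a cxl_test-capable
kernel is:

    CONFIG_CXL_BUS=m
    CONFIG_CXL_ACPI=m
    CONFIG_CXL_PMEM=m

Anything else turns into a compile failure here rather than a confusing
runtime failure when the test modules try to load.
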
> diff --git a/tools/testing/cxl/mock_acpi.c b/tools/testing/cxl/mock_acpi.c
> new file mode 100644
> index 000000000000..4c8a493ace56
> --- /dev/null
> +++ b/tools/testing/cxl/mock_acpi.c
> @@ -0,0 +1,109 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
> +
> +#include <linux/platform_device.h>
> +#include <linux/device.h>
> +#include <linux/acpi.h>
> +#include <linux/pci.h>
> +#include <cxl.h>
> +#include "test/mock.h"
> +
> +struct acpi_device *to_cxl_host_bridge(struct device *host, struct device *dev)
> +{
> +	int index;
> +	struct acpi_device *adev, *found = NULL;
> +	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
> +
> +	if (ops && ops->is_mock_bridge(dev)) {
> +		found = ACPI_COMPANION(dev);
> +		goto out;
> +	}
> +
> +	if (dev->bus == &platform_bus_type)
> +		goto out;
> +
> +	adev = to_acpi_device(dev);
> +	if (!acpi_pci_find_root(adev->handle))
> +		goto out;
> +
> +	if (strcmp(acpi_device_hid(adev), "ACPI0016") == 0) {
> +		found = adev;
> +		dev_dbg(host, "found host bridge %s\n", dev_name(&adev->dev));
> +	}
> +out:
> +	put_cxl_mock_ops(index);
> +	return found;
> +}
> +
> +static int match_add_root_port(struct pci_dev *pdev, void *data)
> +{
> +	struct cxl_walk_context *ctx = data;
> +	struct pci_bus *root_bus = ctx->root;
> +	struct cxl_port *port = ctx->port;
> +	int type = pci_pcie_type(pdev);
> +	struct device *dev = ctx->dev;
> +	u32 lnkcap, port_num;
> +	int rc;
> +
> +	if (pdev->bus != root_bus)
> +		return 0;
> +	if (!pci_is_pcie(pdev))
> +		return 0;
> +	if (type != PCI_EXP_TYPE_ROOT_PORT)
> +		return 0;
> +	if (pci_read_config_dword(pdev, pci_pcie_cap(pdev) + PCI_EXP_LNKCAP,
> +				  &lnkcap) != PCIBIOS_SUCCESSFUL)
> +		return 0;
> +
> +	/* TODO walk DVSEC to find component register base */
> +	port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
> +	rc = cxl_add_dport(port, &pdev->dev, port_num, CXL_RESOURCE_NONE);
> +	if (rc) {
> +		dev_err(dev, "failed to add dport: %s (%d)\n",
> +			dev_name(&pdev->dev), rc);
> +		ctx->error = rc;
> +		return rc;
> +	}
> +	ctx->count++;
> +
> +	dev_dbg(dev, "add dport%d: %s\n", port_num, dev_name(&pdev->dev));
> +
> +	return 0;
> +}
> +
> +static int mock_add_root_port(struct platform_device *pdev, void *data)
> +{
> +	struct cxl_walk_context *ctx = data;
> +	struct cxl_port *port = ctx->port;
> +	struct device *dev = ctx->dev;
> +	int rc;
> +
> +	rc = cxl_add_dport(port, &pdev->dev, pdev->id, CXL_RESOURCE_NONE);
> +	if (rc) {
> +		dev_err(dev, "failed to add dport: %s (%d)\n",
> +			dev_name(&pdev->dev), rc);
> +		ctx->error = rc;
> +		return rc;
> +	}
> +	ctx->count++;
> +
> +	dev_dbg(dev, "add dport%d: %s\n", pdev->id, dev_name(&pdev->dev));
> +
> +	return 0;
> +}
> +
> +int match_add_root_ports(struct pci_dev *dev, void *data)
> +{
> +	int index, rc;
> +	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
> +	struct platform_device *pdev = (struct platform_device *) dev;
> +
> +	if (ops && ops->is_mock_port(pdev))
> +		rc = mock_add_root_port(pdev, data);
> +	else
> +		rc = match_add_root_port(dev, data);
> +
> +	put_cxl_mock_ops(index);
> +
> +	return rc;
> +}
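
The (struct platform_device *) cast pairs with __wrap_pci_walk_bus()
further down, which hands mock ports to the callback as:

    cb((struct pci_dev *)ops->mock_port(bus, i), userdata);

The pointer only ever makes the round trip; is_mock_port() confirms it is
really a platform device before the cast back, so it is never dereferenced
through the wrong type.
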
> diff --git a/tools/testing/cxl/test/Kbuild b/tools/testing/cxl/test/Kbuild
> new file mode 100644
> index 000000000000..7de4ddecfd21
> --- /dev/null
> +++ b/tools/testing/cxl/test/Kbuild
> @@ -0,0 +1,6 @@
> +# SPDX-License-Identifier: GPL-2.0
> +obj-m += cxl_test.o
> +obj-m += cxl_mock.o
> +
> +cxl_test-y := cxl.o
> +cxl_mock-y := mock.o
> diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c
> new file mode 100644
> index 000000000000..1c47b34244a4
> --- /dev/null
> +++ b/tools/testing/cxl/test/cxl.c
> @@ -0,0 +1,509 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// Copyright(c) 2021 Intel Corporation. All rights reserved.
> +
> +#include <linux/platform_device.h>
> +#include <linux/genalloc.h>
> +#include <linux/module.h>
> +#include <linux/mutex.h>
> +#include <linux/acpi.h>
> +#include <linux/pci.h>
> +#include <linux/mm.h>
> +#include "mock.h"
> +
> +#define NR_CXL_HOST_BRIDGES 4
> +#define NR_CXL_ROOT_PORTS 2
> +
> +static struct platform_device *cxl_acpi;
> +static struct platform_device *cxl_host_bridge[NR_CXL_HOST_BRIDGES];
> +static struct platform_device
> +	*cxl_root_port[NR_CXL_HOST_BRIDGES * NR_CXL_ROOT_PORTS];
> +
> +static struct acpi_device acpi0017_mock;
> +static struct acpi_device host_bridge[NR_CXL_HOST_BRIDGES] = {
> +	[0] = {
> +		.handle = &host_bridge[0],
> +	},
> +	[1] = {
> +		.handle = &host_bridge[1],
> +	},
> +	[2] = {
> +		.handle = &host_bridge[2],
> +	},
> +	[3] = {
> +		.handle = &host_bridge[3],
> +	},
> +};
> +
> +static bool is_mock_dev(struct device *dev)
> +{
> +	if (dev == &cxl_acpi->dev)
> +		return true;
> +	return false;
> +}
> +
> +static bool is_mock_adev(struct acpi_device *adev)
> +{
> +	int i;
> +
> +	if (adev == &acpi0017_mock)
> +		return true;
> +
> +	for (i = 0; i < ARRAY_SIZE(host_bridge); i++)
> +		if (adev == &host_bridge[i])
> +			return true;
> +
> +	return false;
> +}
> +
> +static struct {
> +	struct acpi_table_cedt cedt;
> +	struct acpi_cedt_chbs chbs[NR_CXL_HOST_BRIDGES];
> +	struct {
> +		struct acpi_cedt_cfmws cfmws;
> +		u32 target[1];
> +	} cfmws0;
> +	struct {
> +		struct acpi_cedt_cfmws cfmws;
> +		u32 target[4];
> +	} cfmws1;
> +	struct {
> +		struct acpi_cedt_cfmws cfmws;
> +		u32 target[1];
> +	} cfmws2;
> +	struct {
> +		struct acpi_cedt_cfmws cfmws;
> +		u32 target[4];
> +	} cfmws3;
> +} __packed mock_cedt = {
> +	.cedt = {
> +		.header = {
> +			.signature = "CEDT",
> +			.length = sizeof(mock_cedt),
> +			.revision = 1,
> +		},
> +	},
> +	.chbs[0] = {
> +		.header = {
> +			.type = ACPI_CEDT_TYPE_CHBS,
> +			.length = sizeof(mock_cedt.chbs[0]),
> +		},
> +		.uid = 0,
> +		.cxl_version = ACPI_CEDT_CHBS_VERSION_CXL20,
> +	},
> +	.chbs[1] = {
> +		.header = {
> +			.type = ACPI_CEDT_TYPE_CHBS,
> +			.length = sizeof(mock_cedt.chbs[0]),
> +		},
> +		.uid = 1,
> +		.cxl_version = ACPI_CEDT_CHBS_VERSION_CXL20,
> +	},
> +	.chbs[2] = {
> +		.header = {
> +			.type = ACPI_CEDT_TYPE_CHBS,
> +			.length = sizeof(mock_cedt.chbs[0]),
> +		},
> +		.uid = 2,
> +		.cxl_version = ACPI_CEDT_CHBS_VERSION_CXL20,
> +	},
> +	.chbs[3] = {
> +		.header = {
> +			.type = ACPI_CEDT_TYPE_CHBS,
> +			.length = sizeof(mock_cedt.chbs[0]),
> +		},
> +		.uid = 3,
> +		.cxl_version = ACPI_CEDT_CHBS_VERSION_CXL20,
> +	},
> +	.cfmws0 = {
> +		.cfmws = {
> +			.header = {
> +				.type = ACPI_CEDT_TYPE_CFMWS,
> +				.length = sizeof(mock_cedt.cfmws0),
> +			},
> +			.interleave_ways = 0,
> +			.granularity = 4,
> +			.restrictions = ACPI_CEDT_CFMWS_RESTRICT_TYPE3 |
> +					ACPI_CEDT_CFMWS_RESTRICT_VOLATILE,
> +			.qtg_id = 0,
> +			.window_size = SZ_256M,
> +		},
> +		.target = { 0 },
> +	},
> +	.cfmws1 = {
> +		.cfmws = {
> +			.header = {
> +				.type = ACPI_CEDT_TYPE_CFMWS,
> +				.length = sizeof(mock_cedt.cfmws1),
> +			},
> +			.interleave_ways = 2,
> +			.granularity = 4,
> +			.restrictions = ACPI_CEDT_CFMWS_RESTRICT_TYPE3 |
> +					ACPI_CEDT_CFMWS_RESTRICT_VOLATILE,
> +			.qtg_id = 1,
> +			.window_size = SZ_256M * 4,
> +		},
> +		.target = { 0, 1, 2, 3 },
> +	},
> +	.cfmws2 = {
> +		.cfmws = {
> +			.header = {
> +				.type = ACPI_CEDT_TYPE_CFMWS,
> +				.length = sizeof(mock_cedt.cfmws2),
> +			},
> +			.interleave_ways = 0,
> +			.granularity = 4,
> +			.restrictions = ACPI_CEDT_CFMWS_RESTRICT_TYPE3 |
> +					ACPI_CEDT_CFMWS_RESTRICT_PMEM,
> +			.qtg_id = 2,
> +			.window_size = SZ_256M,
> +		},
> +		.target = { 0 },
> +	},
> +	.cfmws3 = {
> +		.cfmws = {
> +			.header = {
> +				.type = ACPI_CEDT_TYPE_CFMWS,
> +				.length = sizeof(mock_cedt.cfmws3),
> +			},
> +			.interleave_ways = 2,
> +			.granularity = 4,
> +			.restrictions = ACPI_CEDT_CFMWS_RESTRICT_TYPE3 |
> +					ACPI_CEDT_CFMWS_RESTRICT_PMEM,
> +			.qtg_id = 3,
> +			.window_size = SZ_256M * 4,
> +		},
> +		.target = { 0, 1, 2, 3 },
> +	},
> +};
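
One thing worth noting when reading this table: interleave_ways is the
encoded CFMWS field, which cxl_acpi decodes as a power of two, so the
values line up with the target[] array sizes (assuming the usual
1 << interleave_ways decode):

    .interleave_ways = 0    /* 1 << 0 == 1-way,  u32 target[1] */
    .interleave_ways = 2    /* 1 << 2 == 4-way,  u32 target[4] */
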
> +
> +struct cxl_mock_res {
> +	struct list_head list;
> +	struct range range;
> +};
> +
> +static LIST_HEAD(mock_res);
> +static DEFINE_MUTEX(mock_res_lock);
> +static struct gen_pool *cxl_mock_pool;
> +
> +static void depopulate_all_mock_resources(void)
> +{
> +	struct cxl_mock_res *res, *_res;
> +
> +	mutex_lock(&mock_res_lock);
> +	list_for_each_entry_safe(res, _res, &mock_res, list) {
> +		gen_pool_free(cxl_mock_pool, res->range.start,
> +			      range_len(&res->range));
> +		list_del(&res->list);
> +		kfree(res);
> +	}
> +	mutex_unlock(&mock_res_lock);
> +}
> +
> +static struct cxl_mock_res *alloc_mock_res(resource_size_t size)
> +{
> +	struct cxl_mock_res *res = kzalloc(sizeof(*res), GFP_KERNEL);
> +	struct genpool_data_align data = {
> +		.align = SZ_256M,
> +	};
> +	unsigned long phys;
> +
> +	INIT_LIST_HEAD(&res->list);
> +	phys = gen_pool_alloc_algo(cxl_mock_pool, size,
> +				   gen_pool_first_fit_align, &data);
> +	if (!phys)
> +		return NULL;
> +
> +	res->range = (struct range) {
> +		.start = phys,
> +		.end = phys + size - 1,
> +	};
> +	mutex_lock(&mock_res_lock);
> +	list_add(&res->list, &mock_res);
> +	mutex_unlock(&mock_res_lock);
> +
> +	return res;
> +}
> +
> +static int populate_cedt(void)
> +{
> +	struct acpi_cedt_cfmws *cfmws[4] = {
> +		[0] = &mock_cedt.cfmws0.cfmws,
> +		[1] = &mock_cedt.cfmws1.cfmws,
> +		[2] = &mock_cedt.cfmws2.cfmws,
> +		[3] = &mock_cedt.cfmws3.cfmws,
> +	};
> +	struct cxl_mock_res *res;
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(mock_cedt.chbs); i++) {
> +		struct acpi_cedt_chbs *chbs = &mock_cedt.chbs[i];
> +		resource_size_t size;
> +
> +		if (chbs->cxl_version == ACPI_CEDT_CHBS_VERSION_CXL20)
> +			size = ACPI_CEDT_CHBS_LENGTH_CXL20;
> +		else
> +			size = ACPI_CEDT_CHBS_LENGTH_CXL11;
> +
> +		res = alloc_mock_res(size);
> +		if (!res)
> +			return -ENOMEM;
> +		chbs->base = res->range.start;
> +		chbs->length = size;
> +	}
> +
> +	for (i = 0; i < ARRAY_SIZE(cfmws); i++) {
> +		struct acpi_cedt_cfmws *window = cfmws[i];
> +
> +		res = alloc_mock_res(window->window_size);
> +		if (!res)
> +			return -ENOMEM;
> +		window->base_hpa = res->range.start;
> +	}
> +
> +	return 0;
> +}
> +
> +static acpi_status mock_acpi_get_table(char *signature, u32 instance,
> +				       struct acpi_table_header **out_table)
> +{
> +	if (instance < U32_MAX || strcmp(signature, ACPI_SIG_CEDT) != 0)
> +		return acpi_get_table(signature, instance, out_table);
> +
> +	*out_table = (struct acpi_table_header *) &mock_cedt;
> +	return AE_OK;
> +}
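
This is the consumer side of cedt_instance() from the acpi.c hunk above:
real firmware publishes the CEDT as instance 0, while cxl_test requests
the non-canonical instance U32_MAX, so the mock can route on the instance
number alone. The round trip looks roughly like:

    /* cxl_acpi_probe() on the cxl_test topology */
    acpi_get_table(ACPI_SIG_CEDT, U32_MAX, &cedt)
        -> __wrap_acpi_get_table()      /* ld --wrap redirection */
        -> ops->acpi_get_table()        /* registered cxl_mock_ops */
        -> mock_acpi_get_table()        /* hands back &mock_cedt */
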
> +
> +static void mock_acpi_put_table(struct acpi_table_header *table)
> +{
> +	if (table == (struct acpi_table_header *) &mock_cedt)
> +		return;
> +	acpi_put_table(table);
> +}
> +
> +static bool is_mock_bridge(struct device *dev)
> +{
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(cxl_host_bridge); i++)
> +		if (dev == &cxl_host_bridge[i]->dev)
> +			return true;
> +
> +	return false;
> +}
> +
> +static int host_bridge_index(struct acpi_device *adev)
> +{
> +	return adev - host_bridge;
> +}
> +
> +static struct acpi_device *find_host_bridge(acpi_handle handle)
> +{
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(host_bridge); i++)
> +		if (handle == host_bridge[i].handle)
> +			return &host_bridge[i];
> +	return NULL;
> +}
> +
> +static acpi_status
> +mock_acpi_evaluate_integer(acpi_handle handle, acpi_string pathname,
> +			   struct acpi_object_list *arguments,
> +			   unsigned long long *data)
> +{
> +	struct acpi_device *adev = find_host_bridge(handle);
> +
> +	if (!adev || strcmp(pathname, METHOD_NAME__UID) != 0)
> +		return acpi_evaluate_integer(handle, pathname, arguments, data);
> +
> +	*data = host_bridge_index(adev);
> +	return AE_OK;
> +}
> +
> +static struct pci_bus mock_pci_bus[NR_CXL_HOST_BRIDGES];
> +static struct acpi_pci_root mock_pci_root[NR_CXL_HOST_BRIDGES] = {
> +	[0] = {
> +		.bus = &mock_pci_bus[0],
> +	},
> +	[1] = {
> +		.bus = &mock_pci_bus[1],
> +	},
> +	[2] = {
> +		.bus = &mock_pci_bus[2],
> +	},
> +	[3] = {
> +		.bus = &mock_pci_bus[3],
> +	},
> +};
> +
> +static struct platform_device *mock_cxl_root_port(struct pci_bus *bus, int index)
> +{
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(mock_pci_bus); i++)
> +		if (bus == &mock_pci_bus[i])
> +			return cxl_root_port[index + i * NR_CXL_ROOT_PORTS];
> +	return NULL;
> +}
> +
> +static bool is_mock_port(struct platform_device *pdev)
> +{
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(cxl_root_port); i++)
> +		if (pdev == cxl_root_port[i])
> +			return true;
> +	return false;
> +}
> +
> +static bool is_mock_bus(struct pci_bus *bus)
> +{
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(mock_pci_bus); i++)
> +		if (bus == &mock_pci_bus[i])
> +			return true;
> +	return false;
> +}
> +
> +static struct acpi_pci_root *mock_acpi_pci_find_root(acpi_handle handle)
> +{
> +	struct acpi_device *adev = find_host_bridge(handle);
> +
> +	if (!adev)
> +		return acpi_pci_find_root(handle);
> +	return &mock_pci_root[host_bridge_index(adev)];
> +}
> +
> +static struct cxl_mock_ops cxl_mock_ops = {
> +	.is_mock_adev = is_mock_adev,
> +	.is_mock_bridge = is_mock_bridge,
> +	.is_mock_bus = is_mock_bus,
> +	.is_mock_port = is_mock_port,
> +	.is_mock_dev = is_mock_dev,
> +	.mock_port = mock_cxl_root_port,
> +	.acpi_get_table = mock_acpi_get_table,
> +	.acpi_put_table = mock_acpi_put_table,
> +	.acpi_evaluate_integer = mock_acpi_evaluate_integer,
> +	.acpi_pci_find_root = mock_acpi_pci_find_root,
> +	.list = LIST_HEAD_INIT(cxl_mock_ops.list),
> +};
> +
> +static void mock_companion(struct acpi_device *adev, struct device *dev)
> +{
> +	device_initialize(&adev->dev);
> +	fwnode_init(&adev->fwnode, NULL);
> +	dev->fwnode = &adev->fwnode;
> +	adev->fwnode.dev = dev;
> +}
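
mock_companion() is what lets a bare platform device pose as having an
ACPI companion: ACPI_COMPANION() resolves through dev->fwnode, and the
wrapped is_acpi_device_node() in test/mock.c accepts the mock fwnodes so
the container_of() back to the acpi_device succeeds. Roughly:

    mock_companion(&host_bridge[i], &pdev->dev);
    /* then, in code linked with --wrap=is_acpi_device_node: */
    ACPI_COMPANION(&pdev->dev) == &host_bridge[i]   /* true */
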
> +
> +#ifndef SZ_64G
> +#define SZ_64G (SZ_32G * 2)
> +#endif
> +
> +#ifndef SZ_512G
> +#define SZ_512G (SZ_64G * 8)
> +#endif
> +
> +static __init int cxl_test_init(void)
> +{
> +	int rc, i;
> +
> +	register_cxl_mock_ops(&cxl_mock_ops);
> +
> +	cxl_mock_pool = gen_pool_create(ilog2(SZ_2M), NUMA_NO_NODE);
> +	if (!cxl_mock_pool) {
> +		rc = -ENOMEM;
> +		goto err_gen_pool_create;
> +	}
> +
> +	rc = gen_pool_add(cxl_mock_pool, SZ_512G, SZ_64G, NUMA_NO_NODE);
> +	if (rc)
> +		goto err_gen_pool_add;
> +
> +	rc = populate_cedt();
> +	if (rc)
> +		goto err_populate;
> +
> +	for (i = 0; i < ARRAY_SIZE(cxl_host_bridge); i++) {
> +		struct acpi_device *adev = &host_bridge[i];
> +		struct platform_device *pdev;
> +
> +		pdev = platform_device_alloc("cxl_host_bridge", i);
> +		if (!pdev)
> +			goto err_bridge;
> +
> +		mock_companion(adev, &pdev->dev);
> +		rc = platform_device_add(pdev);
> +		if (rc) {
> +			platform_device_put(pdev);
> +			goto err_bridge;
> +		}
> +		cxl_host_bridge[i] = pdev;
> +	}
> +
> +	for (i = 0; i < ARRAY_SIZE(cxl_root_port); i++) {
> +		struct platform_device *bridge =
> +			cxl_host_bridge[i / NR_CXL_ROOT_PORTS];
> +		struct platform_device *pdev;
> +
> +		pdev = platform_device_alloc("cxl_root_port", i);
> +		if (!pdev)
> +			goto err_port;
> +		pdev->dev.parent = &bridge->dev;
> +
> +		rc = platform_device_add(pdev);
> +		if (rc) {
> +			platform_device_put(pdev);
> +			goto err_port;
> +		}
> +		cxl_root_port[i] = pdev;
> +	}
> +
> +	cxl_acpi = platform_device_alloc("cxl_acpi", 0);
> +	if (!cxl_acpi)
> +		goto err_port;
> +
> +	mock_companion(&acpi0017_mock, &cxl_acpi->dev);
> +	acpi0017_mock.dev.bus = &platform_bus_type;
> +
> +	rc = platform_device_add(cxl_acpi);
> +	if (rc)
> +		goto err_add;
> +
> +	return 0;
> +
> +err_add:
> +	platform_device_put(cxl_acpi);
> +err_port:
> +	for (i = ARRAY_SIZE(cxl_root_port) - 1; i >= 0; i--)
> +		platform_device_unregister(cxl_root_port[i]);
> +err_bridge:
> +	for (i = ARRAY_SIZE(cxl_host_bridge) - 1; i >= 0; i--)
> +		platform_device_unregister(cxl_host_bridge[i]);
> +err_populate:
> +	depopulate_all_mock_resources();
> +err_gen_pool_add:
> +	gen_pool_destroy(cxl_mock_pool);
> +err_gen_pool_create:
> +	unregister_cxl_mock_ops(&cxl_mock_ops);
> +	return rc;
> +}
> +
> +static __exit void cxl_test_exit(void)
> +{
> +	int i;
> +
> +	platform_device_unregister(cxl_acpi);
> +	for (i = ARRAY_SIZE(cxl_root_port) - 1; i >= 0; i--)
> +		platform_device_unregister(cxl_root_port[i]);
> +	for (i = ARRAY_SIZE(cxl_host_bridge) - 1; i >= 0; i--)
> +		platform_device_unregister(cxl_host_bridge[i]);
> +	depopulate_all_mock_resources();
> +	gen_pool_destroy(cxl_mock_pool);
> +	unregister_cxl_mock_ops(&cxl_mock_ops);
> +}
> +
> +module_init(cxl_test_init);
> +module_exit(cxl_test_exit);
> +MODULE_LICENSE("GPL v2");
> diff --git a/tools/testing/cxl/test/mock.c b/tools/testing/cxl/test/mock.c
> new file mode 100644
> index 000000000000..b8c108abcf07
> --- /dev/null
> +++ b/tools/testing/cxl/test/mock.c
> @@ -0,0 +1,171 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +//Copyright(c) 2021 Intel Corporation. All rights reserved.
> +
> +#include <linux/libnvdimm.h>
> +#include <linux/rculist.h>
> +#include <linux/device.h>
> +#include <linux/export.h>
> +#include <linux/acpi.h>
> +#include <linux/pci.h>
> +#include "mock.h"
> +
> +static LIST_HEAD(mock);
> +
> +void register_cxl_mock_ops(struct cxl_mock_ops *ops)
> +{
> +	list_add_rcu(&ops->list, &mock);
> +}
> +EXPORT_SYMBOL_GPL(register_cxl_mock_ops);
> +
> +static DEFINE_SRCU(cxl_mock_srcu);
> +
> +void unregister_cxl_mock_ops(struct cxl_mock_ops *ops)
> +{
> +	list_del_rcu(&ops->list);
> +	synchronize_srcu(&cxl_mock_srcu);
> +}
> +EXPORT_SYMBOL_GPL(unregister_cxl_mock_ops);
> +
> +struct cxl_mock_ops *get_cxl_mock_ops(int *index)
> +{
> +	*index = srcu_read_lock(&cxl_mock_srcu);
> +	return list_first_or_null_rcu(&mock, struct cxl_mock_ops, list);
> +}
> +EXPORT_SYMBOL_GPL(get_cxl_mock_ops);
> +
> +void put_cxl_mock_ops(int index)
> +{
> +	srcu_read_unlock(&cxl_mock_srcu, index);
> +}
> +EXPORT_SYMBOL_GPL(put_cxl_mock_ops);
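
SRCU rather than plain RCU because the mocked operations can sleep (they
take device locks, walk buses, etc.) while the ops pointer is held. Every
wrapper below follows the same pattern:

    int index;
    struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);

    if (ops)
            ; /* consult the mock, else fall back to the real call */
    put_cxl_mock_ops(index);

which lets unregister_cxl_mock_ops() use synchronize_srcu() to wait out
any in-flight wrapper before cxl_test unloads.
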
> +
> +bool __wrap_is_acpi_device_node(const struct fwnode_handle *fwnode)
> +{
> +	struct acpi_device *adev =
> +		container_of(fwnode, struct acpi_device, fwnode);
> +	int index;
> +	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
> +	bool retval = false;
> +
> +	if (ops)
> +		retval = ops->is_mock_adev(adev);
> +
> +	if (!retval)
> +		retval = is_acpi_device_node(fwnode);
> +
> +	put_cxl_mock_ops(index);
> +	return retval;
> +}
> +EXPORT_SYMBOL(__wrap_is_acpi_device_node);
> +
> +acpi_status __wrap_acpi_get_table(char *signature, u32 instance,
> +				  struct acpi_table_header **out_table)
> +{
> +	int index;
> +	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
> +	acpi_status status;
> +
> +	if (ops)
> +		status = ops->acpi_get_table(signature, instance, out_table);
> +	else
> +		status = acpi_get_table(signature, instance, out_table);
> +
> +	put_cxl_mock_ops(index);
> +
> +	return status;
> +}
> +EXPORT_SYMBOL(__wrap_acpi_get_table);
> +
> +void __wrap_acpi_put_table(struct acpi_table_header *table)
> +{
> +	int index;
> +	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
> +
> +	if (ops)
> +		ops->acpi_put_table(table);
> +	else
> +		acpi_put_table(table);
> +	put_cxl_mock_ops(index);
> +}
> +EXPORT_SYMBOL(__wrap_acpi_put_table);
> +
> +acpi_status __wrap_acpi_evaluate_integer(acpi_handle handle,
> +					 acpi_string pathname,
> +					 struct acpi_object_list *arguments,
> +					 unsigned long long *data)
> +{
> +	int index;
> +	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
> +	acpi_status status;
> +
> +	if (ops)
> +		status = ops->acpi_evaluate_integer(handle, pathname, arguments,
> +						    data);
> +	else
> +		status = acpi_evaluate_integer(handle, pathname, arguments,
> +					       data);
> +	put_cxl_mock_ops(index);
> +
> +	return status;
> +}
> +EXPORT_SYMBOL(__wrap_acpi_evaluate_integer);
> +
> +struct acpi_pci_root *__wrap_acpi_pci_find_root(acpi_handle handle)
> +{
> +	int index;
> +	struct acpi_pci_root *root;
> +	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
> +
> +	if (ops)
> +		root = ops->acpi_pci_find_root(handle);
> +	else
> +		root = acpi_pci_find_root(handle);
> +
> +	put_cxl_mock_ops(index);
> +
> +	return root;
> +}
> +EXPORT_SYMBOL_GPL(__wrap_acpi_pci_find_root);
> +
> +void __wrap_pci_walk_bus(struct pci_bus *bus,
> +			 int (*cb)(struct pci_dev *, void *), void *userdata)
> +{
> +	int index;
> +	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
> +
> +	if (ops && ops->is_mock_bus(bus)) {
> +		int rc, i;
> +
> +		/*
> +		 * Simulate 2 root ports per host-bridge and no
> +		 * depth recursion.
> +		 */
> +		for (i = 0; i < 2; i++) {
> +			rc = cb((struct pci_dev *) ops->mock_port(bus, i),
> +				userdata);
> +			if (rc)
> +				break;
> +		}
> +	} else
> +		pci_walk_bus(bus, cb, userdata);
> +
> +	put_cxl_mock_ops(index);
> +}
> +EXPORT_SYMBOL_GPL(__wrap_pci_walk_bus);
> +
> +struct nvdimm_bus *
> +__wrap_nvdimm_bus_register(struct device *dev,
> +			   struct nvdimm_bus_descriptor *nd_desc)
> +{
> +	int index;
> +	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
> +
> +	if (ops && ops->is_mock_dev(dev->parent->parent))
> +		nd_desc->provider_name = "cxl_test";
> +	put_cxl_mock_ops(index);
> +
> +	return nvdimm_bus_register(dev, nd_desc);
> +}
> +EXPORT_SYMBOL_GPL(__wrap_nvdimm_bus_register);
> +
> +MODULE_LICENSE("GPL v2");
> diff --git a/tools/testing/cxl/test/mock.h b/tools/testing/cxl/test/mock.h
> new file mode 100644
> index 000000000000..805a94cb3fbe
> --- /dev/null
> +++ b/tools/testing/cxl/test/mock.h
> @@ -0,0 +1,27 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#include <linux/list.h>
> +#include <linux/acpi.h>
> +
> +struct cxl_mock_ops {
> +	struct list_head list;
> +	bool (*is_mock_adev)(struct acpi_device *dev);
> +	acpi_status (*acpi_get_table)(char *signature, u32 instance,
> +				      struct acpi_table_header **out_table);
> +	void (*acpi_put_table)(struct acpi_table_header *table);
> +	bool (*is_mock_bridge)(struct device *dev);
> +	acpi_status (*acpi_evaluate_integer)(acpi_handle handle,
> +					     acpi_string pathname,
> +					     struct acpi_object_list *arguments,
> +					     unsigned long long *data);
> +	struct acpi_pci_root *(*acpi_pci_find_root)(acpi_handle handle);
> +	struct platform_device *(*mock_port)(struct pci_bus *bus, int index);
> +	bool (*is_mock_bus)(struct pci_bus *bus);
> +	bool (*is_mock_port)(struct platform_device *pdev);
> +	bool (*is_mock_dev)(struct device *dev);
> +};
> +
> +void register_cxl_mock_ops(struct cxl_mock_ops *ops);
> +void unregister_cxl_mock_ops(struct cxl_mock_ops *ops);
> +struct cxl_mock_ops *get_cxl_mock_ops(int *index);
> +void put_cxl_mock_ops(int index);
> 



* Re: [PATCH v4 18/21] cxl/bus: Populate the target list at decoder create
  2021-09-09  5:13 ` [PATCH v4 18/21] cxl/bus: Populate the target list at decoder create Dan Williams
@ 2021-09-10  9:57   ` Jonathan Cameron
  0 siblings, 0 replies; 78+ messages in thread
From: Jonathan Cameron @ 2021-09-10  9:57 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Ben Widawsky, vishal.l.verma, nvdimm,
	alison.schofield, ira.weiny

On Wed, 8 Sep 2021 22:13:10 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> As found by cxl_test, the implementation populated the target_list for
> the exceptional single-dport case, but missed populating the target_list
> for the typical multi-dport case. Root decoders always know their target
> list at the beginning of time, and even switch-level decoders should
> have a target list of one or more zeros by default, depending on the
> interleave-ways setting.
> 
> Walk the hosting port's dport list and populate based on the passed in
> map.
> 
> Move devm_cxl_add_passthrough_decoder() out of line now that it does the
> work of generating a target_map.
> 
> Before:
> $ cat /sys/bus/cxl/devices/root2/decoder*/target_list
> 0
> 
> 0
> 
> After:
> $ cat /sys/bus/cxl/devices/root2/decoder*/target_list
> 0
> 0,1,2,3
> 0
> 0,1,2,3
> 
> Where root2 is a CXL topology root object generated by 'cxl_test'.
> 
> Acked-by: Ben Widawsky <ben.widawsky@intel.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
>  drivers/cxl/acpi.c     |   13 +++++++-
>  drivers/cxl/core/bus.c |   80 +++++++++++++++++++++++++++++++++++++++++-------
>  drivers/cxl/cxl.h      |   25 ++++++---------
>  3 files changed, 91 insertions(+), 27 deletions(-)
> 
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index d31a97218593..9d881eacdae5 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -52,6 +52,12 @@ static int cxl_acpi_cfmws_verify(struct device *dev,
>  		return -EINVAL;
>  	}
>  
> +	if (CFMWS_INTERLEAVE_WAYS(cfmws) > CXL_DECODER_MAX_INTERLEAVE) {
> +		dev_err(dev, "CFMWS Interleave Ways (%d) too large\n",
> +			CFMWS_INTERLEAVE_WAYS(cfmws));
> +		return -EINVAL;
> +	}
> +
>  	expected_len = struct_size((cfmws), interleave_targets,
>  				   CFMWS_INTERLEAVE_WAYS(cfmws));
>  
> @@ -71,6 +77,7 @@ static int cxl_acpi_cfmws_verify(struct device *dev,
>  static void cxl_add_cfmws_decoders(struct device *dev,
>  				   struct cxl_port *root_port)
>  {
> +	int target_map[CXL_DECODER_MAX_INTERLEAVE];
>  	struct acpi_cedt_cfmws *cfmws;
>  	struct cxl_decoder *cxld;
>  	acpi_size len, cur = 0;
> @@ -83,6 +90,7 @@ static void cxl_add_cfmws_decoders(struct device *dev,
>  
>  	while (cur < len) {
>  		struct acpi_cedt_header *c = cedt_subtable + cur;
> +		int i;
>  
>  		if (c->type != ACPI_CEDT_TYPE_CFMWS) {
>  			cur += c->length;
> @@ -108,6 +116,9 @@ static void cxl_add_cfmws_decoders(struct device *dev,
>  			continue;
>  		}
>  
> +		for (i = 0; i < CFMWS_INTERLEAVE_WAYS(cfmws); i++)
> +			target_map[i] = cfmws->interleave_targets[i];
> +
>  		flags = cfmws_to_decoder_flags(cfmws->restrictions);
>  		cxld = devm_cxl_add_decoder(dev, root_port,
>  					    CFMWS_INTERLEAVE_WAYS(cfmws),
> @@ -115,7 +126,7 @@ static void cxl_add_cfmws_decoders(struct device *dev,
>  					    CFMWS_INTERLEAVE_WAYS(cfmws),
>  					    CFMWS_INTERLEAVE_GRANULARITY(cfmws),
>  					    CXL_DECODER_EXPANDER,
> -					    flags);
> +					    flags, target_map);
>  
>  		if (IS_ERR(cxld)) {
>  			dev_err(dev, "Failed to add decoder for %#llx-%#llx\n",
> diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> index 8073354ba232..176bede30c55 100644
> --- a/drivers/cxl/core/bus.c
> +++ b/drivers/cxl/core/bus.c
> @@ -453,11 +453,38 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport_dev, int port_id,
>  }
>  EXPORT_SYMBOL_GPL(cxl_add_dport);
>  
> +static int decoder_populate_targets(struct device *host,
> +				    struct cxl_decoder *cxld,
> +				    struct cxl_port *port, int *target_map,
> +				    int nr_targets)
> +{
> +	int rc = 0, i;
> +
> +	if (!target_map)
> +		return 0;
> +
> +	device_lock(&port->dev);
> +	for (i = 0; i < nr_targets; i++) {
> +		struct cxl_dport *dport = find_dport(port, target_map[i]);
> +
> +		if (!dport) {
> +			rc = -ENXIO;
> +			break;
> +		}
> +		dev_dbg(host, "%s: target: %d\n", dev_name(dport->dport), i);
> +		cxld->target[i] = dport;
> +	}
> +	device_unlock(&port->dev);
> +
> +	return rc;
> +}
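
So the contract is: the caller fills target_map[] with dport ids (for the
root port these come straight from the CFMWS), and this helper resolves
each id to a struct cxl_dport under the port's device_lock, failing with
-ENXIO if the map names a dport that was never added:

    int target_map[CXL_DECODER_MAX_INTERLEAVE];

    for (i = 0; i < CFMWS_INTERLEAVE_WAYS(cfmws); i++)
            target_map[i] = cfmws->interleave_targets[i];  /* dport ids */
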
> +
>  static struct cxl_decoder *
> -cxl_decoder_alloc(struct cxl_port *port, int nr_targets, resource_size_t base,
> -		  resource_size_t len, int interleave_ways,
> -		  int interleave_granularity, enum cxl_decoder_type type,
> -		  unsigned long flags)
> +cxl_decoder_alloc(struct device *host, struct cxl_port *port, int nr_targets,
> +		  resource_size_t base, resource_size_t len,
> +		  int interleave_ways, int interleave_granularity,
> +		  enum cxl_decoder_type type, unsigned long flags,
> +		  int *target_map)
>  {
>  	struct cxl_decoder *cxld;
>  	struct device *dev;
> @@ -493,10 +520,10 @@ cxl_decoder_alloc(struct cxl_port *port, int nr_targets, resource_size_t base,
>  		.target_type = type,
>  	};
>  
> -	/* handle implied target_list */
> -	if (interleave_ways == 1)
> -		cxld->target[0] =
> -			list_first_entry(&port->dports, struct cxl_dport, list);
> +	rc = decoder_populate_targets(host, cxld, port, target_map, nr_targets);
> +	if (rc)
> +		goto err;
> +
>  	dev = &cxld->dev;
>  	device_initialize(dev);
>  	device_set_pm_not_required(dev);
> @@ -519,14 +546,19 @@ struct cxl_decoder *
>  devm_cxl_add_decoder(struct device *host, struct cxl_port *port, int nr_targets,
>  		     resource_size_t base, resource_size_t len,
>  		     int interleave_ways, int interleave_granularity,
> -		     enum cxl_decoder_type type, unsigned long flags)
> +		     enum cxl_decoder_type type, unsigned long flags,
> +		     int *target_map)
>  {
>  	struct cxl_decoder *cxld;
>  	struct device *dev;
>  	int rc;
>  
> -	cxld = cxl_decoder_alloc(port, nr_targets, base, len, interleave_ways,
> -				 interleave_granularity, type, flags);
> +	if (nr_targets > CXL_DECODER_MAX_INTERLEAVE)
> +		return ERR_PTR(-EINVAL);
> +
> +	cxld = cxl_decoder_alloc(host, port, nr_targets, base, len,
> +				 interleave_ways, interleave_granularity, type,
> +				 flags, target_map);
>  	if (IS_ERR(cxld))
>  		return cxld;
>  
> @@ -550,6 +582,32 @@ devm_cxl_add_decoder(struct device *host, struct cxl_port *port, int nr_targets,
>  }
>  EXPORT_SYMBOL_GPL(devm_cxl_add_decoder);
>  
> +/*
> + * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability Structure)
> + * single ported host-bridges need not publish a decoder capability when a
> + * passthrough decode can be assumed, i.e. all transactions that the uport sees
> + * are claimed and passed to the single dport. Default the range a 0-base
> + * 0-length until the first CXL region is activated.
> + */
> +struct cxl_decoder *devm_cxl_add_passthrough_decoder(struct device *host,
> +						     struct cxl_port *port)
> +{
> +	struct cxl_dport *dport;
> +	int target_map[1];
> +
> +	device_lock(&port->dev);
> +	dport = list_first_entry_or_null(&port->dports, typeof(*dport), list);
> +	device_unlock(&port->dev);
> +
> +	if (!dport)
> +		return ERR_PTR(-ENXIO);
> +
> +	target_map[0] = dport->port_id;
> +	return devm_cxl_add_decoder(host, port, 1, 0, 0, 1, PAGE_SIZE,
> +				    CXL_DECODER_EXPANDER, 0, target_map);
> +}
> +EXPORT_SYMBOL_GPL(devm_cxl_add_passthrough_decoder);
> +
>  /**
>   * __cxl_driver_register - register a driver for the cxl bus
>   * @cxl_drv: cxl driver structure to attach
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index c5152718267e..84b8836c1f91 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -180,6 +180,12 @@ enum cxl_decoder_type {
>         CXL_DECODER_EXPANDER = 3,
>  };
>  
> +/*
> + * Current specification goes up to 8, double that seems a reasonable
> + * software max for the foreseeable future
> + */
> +#define CXL_DECODER_MAX_INTERLEAVE 16
> +
>  /**
>   * struct cxl_decoder - CXL address range decode configuration
>   * @dev: this decoder's device
> @@ -284,22 +290,11 @@ struct cxl_decoder *
>  devm_cxl_add_decoder(struct device *host, struct cxl_port *port, int nr_targets,
>  		     resource_size_t base, resource_size_t len,
>  		     int interleave_ways, int interleave_granularity,
> -		     enum cxl_decoder_type type, unsigned long flags);
> -
> -/*
> - * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability Structure)
> - * single ported host-bridges need not publish a decoder capability when a
> - * passthrough decode can be assumed, i.e. all transactions that the uport sees
> - * are claimed and passed to the single dport. Default the range a 0-base
> - * 0-length until the first CXL region is activated.
> - */
> -static inline struct cxl_decoder *
> -devm_cxl_add_passthrough_decoder(struct device *host, struct cxl_port *port)
> -{
> -	return devm_cxl_add_decoder(host, port, 1, 0, 0, 1, PAGE_SIZE,
> -				    CXL_DECODER_EXPANDER, 0);
> -}
> +		     enum cxl_decoder_type type, unsigned long flags,
> +		     int *target_map);
>  
> +struct cxl_decoder *devm_cxl_add_passthrough_decoder(struct device *host,
> +						     struct cxl_port *port);
>  extern struct bus_type cxl_bus_type;
>  
>  struct cxl_driver {
> 



* Re: [PATCH v4 20/21] tools/testing/cxl: Introduce a mock memory device + driver
  2021-09-09  5:13 ` [PATCH v4 20/21] tools/testing/cxl: Introduce a mock memory device + driver Dan Williams
@ 2021-09-10 10:09   ` Jonathan Cameron
  0 siblings, 0 replies; 78+ messages in thread
From: Jonathan Cameron @ 2021-09-10 10:09 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Ben Widawsky, kernel test robot, vishal.l.verma,
	nvdimm, alison.schofield, ira.weiny

On Wed, 8 Sep 2021 22:13:21 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> Introduce an emulated device-set plus driver to register CXL memory
> devices, 'struct cxl_memdev' instances, in the mock cxl_test topology.
> This enables the development of HDM Decoder (Host-managed Device Memory
> Decoder) programming flow (region provisioning) in an environment that
> can be updated alongside the kernel as it gains more functionality.
> 
> Whereas the cxl_pci module looks for CXL memory expanders on the 'pci'
> bus, the cxl_mock_mem module attaches to CXL expanders on the platform
> bus emitted by cxl_test.
> 
> Acked-by: Ben Widawsky <ben.widawsky@intel.com>
> Reported-by: kernel test robot <lkp@intel.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
>  drivers/cxl/core/pmem.c       |    6 -
>  drivers/cxl/cxl.h             |    2 
>  drivers/cxl/pmem.c            |    2 
>  tools/testing/cxl/Kbuild      |    2 
>  tools/testing/cxl/mock_pmem.c |   24 ++++
>  tools/testing/cxl/test/Kbuild |    4 +
>  tools/testing/cxl/test/cxl.c  |   69 +++++++++++
>  tools/testing/cxl/test/mem.c  |  256 +++++++++++++++++++++++++++++++++++++++++
>  8 files changed, 359 insertions(+), 6 deletions(-)
>  create mode 100644 tools/testing/cxl/mock_pmem.c
>  create mode 100644 tools/testing/cxl/test/mem.c
> 
> diff --git a/drivers/cxl/core/pmem.c b/drivers/cxl/core/pmem.c
> index 9e56be3994f1..74be5132df1c 100644
> --- a/drivers/cxl/core/pmem.c
> +++ b/drivers/cxl/core/pmem.c
> @@ -51,16 +51,16 @@ struct cxl_nvdimm_bridge *to_cxl_nvdimm_bridge(struct device *dev)
>  }
>  EXPORT_SYMBOL_GPL(to_cxl_nvdimm_bridge);
>  
> -static int match_nvdimm_bridge(struct device *dev, const void *data)
> +__mock int match_nvdimm_bridge(struct device *dev, const void *data)
>  {
>  	return dev->type == &cxl_nvdimm_bridge_type;
>  }
>  
> -struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(void)
> +struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_nvdimm *cxl_nvd)
>  {
>  	struct device *dev;
>  
> -	dev = bus_find_device(&cxl_bus_type, NULL, NULL, match_nvdimm_bridge);
> +	dev = bus_find_device(&cxl_bus_type, NULL, cxl_nvd, match_nvdimm_bridge);
>  	if (!dev)
>  		return NULL;
>  	return to_cxl_nvdimm_bridge(dev);
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 84b8836c1f91..9af5745ba2c0 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -327,7 +327,7 @@ struct cxl_nvdimm_bridge *devm_cxl_add_nvdimm_bridge(struct device *host,
>  struct cxl_nvdimm *to_cxl_nvdimm(struct device *dev);
>  bool is_cxl_nvdimm(struct device *dev);
>  int devm_cxl_add_nvdimm(struct device *host, struct cxl_memdev *cxlmd);
> -struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(void);
> +struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_nvdimm *cxl_nvd);
>  
>  /*
>   * Unit test builds overrides this to __weak, find the 'strong' version
> diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
> index 877b0a84e586..569e6ff2a456 100644
> --- a/drivers/cxl/pmem.c
> +++ b/drivers/cxl/pmem.c
> @@ -40,7 +40,7 @@ static int cxl_nvdimm_probe(struct device *dev)
>  	struct nvdimm *nvdimm = NULL;
>  	int rc = -ENXIO;
>  
> -	cxl_nvb = cxl_find_nvdimm_bridge();
> +	cxl_nvb = cxl_find_nvdimm_bridge(cxl_nvd);
>  	if (!cxl_nvb)
>  		return -ENXIO;
>  
> diff --git a/tools/testing/cxl/Kbuild b/tools/testing/cxl/Kbuild
> index 63a4a07e71c4..86deba8308a1 100644
> --- a/tools/testing/cxl/Kbuild
> +++ b/tools/testing/cxl/Kbuild
> @@ -33,4 +33,6 @@ cxl_core-y += $(CXL_CORE_SRC)/memdev.o
>  cxl_core-y += $(CXL_CORE_SRC)/mbox.o
>  cxl_core-y += config_check.o
>  
> +cxl_core-y += mock_pmem.o
> +
>  obj-m += test/
> diff --git a/tools/testing/cxl/mock_pmem.c b/tools/testing/cxl/mock_pmem.c
> new file mode 100644
> index 000000000000..f7315e6f52c0
> --- /dev/null
> +++ b/tools/testing/cxl/mock_pmem.c
> @@ -0,0 +1,24 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
> +#include <cxl.h>
> +#include "test/mock.h"
> +#include <core/core.h>
> +
> +int match_nvdimm_bridge(struct device *dev, const void *data)
> +{
> +	int index, rc = 0;
> +	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
> +	const struct cxl_nvdimm *cxl_nvd = data;
> +
> +	if (ops) {
> +		if (dev->type == &cxl_nvdimm_bridge_type &&
> +		    (ops->is_mock_dev(dev->parent->parent) ==
> +		     ops->is_mock_dev(cxl_nvd->dev.parent->parent)))
> +			rc = 1;
> +	} else
> +		rc = dev->type == &cxl_nvdimm_bridge_type;
> +
> +	put_cxl_mock_ops(index);
> +
> +	return rc;
> +}
> diff --git a/tools/testing/cxl/test/Kbuild b/tools/testing/cxl/test/Kbuild
> index 7de4ddecfd21..4e59e2c911f6 100644
> --- a/tools/testing/cxl/test/Kbuild
> +++ b/tools/testing/cxl/test/Kbuild
> @@ -1,6 +1,10 @@
>  # SPDX-License-Identifier: GPL-2.0
> +ccflags-y := -I$(srctree)/drivers/cxl/
> +
>  obj-m += cxl_test.o
>  obj-m += cxl_mock.o
> +obj-m += cxl_mock_mem.o
>  
>  cxl_test-y := cxl.o
>  cxl_mock-y := mock.o
> +cxl_mock_mem-y := mem.o
> diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c
> index 1c47b34244a4..cb32f9e27d5d 100644
> --- a/tools/testing/cxl/test/cxl.c
> +++ b/tools/testing/cxl/test/cxl.c
> @@ -17,6 +17,7 @@ static struct platform_device *cxl_acpi;
>  static struct platform_device *cxl_host_bridge[NR_CXL_HOST_BRIDGES];
>  static struct platform_device
>  	*cxl_root_port[NR_CXL_HOST_BRIDGES * NR_CXL_ROOT_PORTS];
> +struct platform_device *cxl_mem[NR_CXL_HOST_BRIDGES * NR_CXL_ROOT_PORTS];
>  
>  static struct acpi_device acpi0017_mock;
>  static struct acpi_device host_bridge[NR_CXL_HOST_BRIDGES] = {
> @@ -36,6 +37,11 @@ static struct acpi_device host_bridge[NR_CXL_HOST_BRIDGES] = {
>  
>  static bool is_mock_dev(struct device *dev)
>  {
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(cxl_mem); i++)
> +		if (dev == &cxl_mem[i]->dev)
> +			return true;
>  	if (dev == &cxl_acpi->dev)
>  		return true;
>  	return false;
> @@ -405,6 +411,44 @@ static void mock_companion(struct acpi_device *adev, struct device *dev)
>  #define SZ_512G (SZ_64G * 8)
>  #endif
>  
> +static struct platform_device *alloc_memdev(int id)
> +{
> +	struct resource res[] = {
> +		[0] = {
> +			.flags = IORESOURCE_MEM,
> +		},
> +		[1] = {
> +			.flags = IORESOURCE_MEM,
> +			.desc = IORES_DESC_PERSISTENT_MEMORY,
> +		},
> +	};
> +	struct platform_device *pdev;
> +	int i, rc;
> +
> +	for (i = 0; i < ARRAY_SIZE(res); i++) {
> +		struct cxl_mock_res *r = alloc_mock_res(SZ_256M);
> +
> +		if (!r)
> +			return NULL;
> +		res[i].start = r->range.start;
> +		res[i].end = r->range.end;
> +	}
> +
> +	pdev = platform_device_alloc("cxl_mem", id);
> +	if (!pdev)
> +		return NULL;
> +
> +	rc = platform_device_add_resources(pdev, res, ARRAY_SIZE(res));
> +	if (rc)
> +		goto err;
> +
> +	return pdev;
> +
> +err:
> +	platform_device_put(pdev);
> +	return NULL;
> +}
> +
>  static __init int cxl_test_init(void)
>  {
>  	int rc, i;
> @@ -460,9 +504,27 @@ static __init int cxl_test_init(void)
>  		cxl_root_port[i] = pdev;
>  	}
>  
> +	BUILD_BUG_ON(ARRAY_SIZE(cxl_mem) != ARRAY_SIZE(cxl_root_port));
> +	for (i = 0; i < ARRAY_SIZE(cxl_mem); i++) {
> +		struct platform_device *port = cxl_root_port[i];
> +		struct platform_device *pdev;
> +
> +		pdev = alloc_memdev(i);
> +		if (!pdev)
> +			goto err_mem;
> +		pdev->dev.parent = &port->dev;
> +
> +		rc = platform_device_add(pdev);
> +		if (rc) {
> +			platform_device_put(pdev);
> +			goto err_mem;
> +		}
> +		cxl_mem[i] = pdev;
> +	}
> +
>  	cxl_acpi = platform_device_alloc("cxl_acpi", 0);
>  	if (!cxl_acpi)
> -		goto err_port;
> +		goto err_mem;
>  
>  	mock_companion(&acpi0017_mock, &cxl_acpi->dev);
>  	acpi0017_mock.dev.bus = &platform_bus_type;
> @@ -475,6 +537,9 @@ static __init int cxl_test_init(void)
>  
>  err_add:
>  	platform_device_put(cxl_acpi);
> +err_mem:
> +	for (i = ARRAY_SIZE(cxl_mem) - 1; i >= 0; i--)
> +		platform_device_unregister(cxl_mem[i]);
>  err_port:
>  	for (i = ARRAY_SIZE(cxl_root_port) - 1; i >= 0; i--)
>  		platform_device_unregister(cxl_root_port[i]);
> @@ -495,6 +560,8 @@ static __exit void cxl_test_exit(void)
>  	int i;
>  
>  	platform_device_unregister(cxl_acpi);
> +	for (i = ARRAY_SIZE(cxl_mem) - 1; i >= 0; i--)
> +		platform_device_unregister(cxl_mem[i]);
>  	for (i = ARRAY_SIZE(cxl_root_port) - 1; i >= 0; i--)
>  		platform_device_unregister(cxl_root_port[i]);
>  	for (i = ARRAY_SIZE(cxl_host_bridge) - 1; i >= 0; i--)
> diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
> new file mode 100644
> index 000000000000..12a8437a9ca0
> --- /dev/null
> +++ b/tools/testing/cxl/test/mem.c
> @@ -0,0 +1,256 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// Copyright(c) 2021 Intel Corporation. All rights reserved.
> +
> +#include <linux/platform_device.h>
> +#include <linux/mod_devicetable.h>
> +#include <linux/module.h>
> +#include <linux/sizes.h>
> +#include <linux/bits.h>
> +#include <cxlmem.h>
> +
> +#define LSA_SIZE SZ_128K
> +#define EFFECT(x) (1U << x)
> +
> +static struct cxl_cel_entry mock_cel[] = {
> +	{
> +		.opcode = cpu_to_le16(CXL_MBOX_OP_GET_SUPPORTED_LOGS),
> +		.effect = cpu_to_le16(0),
> +	},
> +	{
> +		.opcode = cpu_to_le16(CXL_MBOX_OP_IDENTIFY),
> +		.effect = cpu_to_le16(0),
> +	},
> +	{
> +		.opcode = cpu_to_le16(CXL_MBOX_OP_GET_LSA),
> +		.effect = cpu_to_le16(0),
> +	},
> +	{
> +		.opcode = cpu_to_le16(CXL_MBOX_OP_SET_LSA),
> +		.effect = cpu_to_le16(EFFECT(1) | EFFECT(2)),
> +	},
> +};
> +
> +static struct {
> +	struct cxl_mbox_get_supported_logs gsl;
> +	struct cxl_gsl_entry entry;
> +} mock_gsl_payload = {
> +	.gsl = {
> +		.entries = cpu_to_le16(1),
> +	},
> +	.entry = {
> +		.uuid = DEFINE_CXL_CEL_UUID,
> +		.size = cpu_to_le32(sizeof(mock_cel)),
> +	},
> +};
> +
> +static int mock_gsl(struct cxl_mbox_cmd *cmd)
> +{
> +	if (cmd->size_out < sizeof(mock_gsl_payload))
> +		return -EINVAL;
> +
> +	memcpy(cmd->payload_out, &mock_gsl_payload, sizeof(mock_gsl_payload));
> +	cmd->size_out = sizeof(mock_gsl_payload);
> +
> +	return 0;
> +}
> +
> +static int mock_get_log(struct cxl_mem *cxlm, struct cxl_mbox_cmd *cmd)
> +{
> +	struct cxl_mbox_get_log *gl = cmd->payload_in;
> +	u32 offset = le32_to_cpu(gl->offset);
> +	u32 length = le32_to_cpu(gl->length);
> +	uuid_t uuid = DEFINE_CXL_CEL_UUID;
> +	void *data = &mock_cel;
> +
> +	if (cmd->size_in < sizeof(*gl))
> +		return -EINVAL;
> +	if (length > cxlm->payload_size)
> +		return -EINVAL;
> +	if (offset + length > sizeof(mock_cel))
> +		return -EINVAL;
> +	if (!uuid_equal(&gl->uuid, &uuid))
> +		return -EINVAL;
> +	if (length > cmd->size_out)
> +		return -EINVAL;
> +
> +	memcpy(cmd->payload_out, data + offset, length);
> +
> +	return 0;
> +}
> +
> +static int mock_id(struct cxl_mem *cxlm, struct cxl_mbox_cmd *cmd)
> +{
> +	struct platform_device *pdev = to_platform_device(cxlm->dev);
> +	struct cxl_mbox_identify id = {
> +		.fw_revision = { "mock fw v1 " },
> +		.lsa_size = cpu_to_le32(LSA_SIZE),
> +		/* FIXME: Add partition support */
> +		.partition_align = cpu_to_le64(0),
> +	};
> +	u64 capacity = 0;
> +	int i;
> +
> +	if (cmd->size_out < sizeof(id))
> +		return -EINVAL;
> +
> +	for (i = 0; i < 2; i++) {
> +		struct resource *res;
> +
> +		res = platform_get_resource(pdev, IORESOURCE_MEM, i);
> +		if (!res)
> +			break;
> +
> +		capacity += resource_size(res) / CXL_CAPACITY_MULTIPLIER;
> +
> +		if (le64_to_cpu(id.partition_align))
> +			continue;
> +
> +		if (res->desc == IORES_DESC_PERSISTENT_MEMORY)
> +			id.persistent_capacity = cpu_to_le64(
> +				resource_size(res) / CXL_CAPACITY_MULTIPLIER);
> +		else
> +			id.volatile_capacity = cpu_to_le64(
> +				resource_size(res) / CXL_CAPACITY_MULTIPLIER);
> +	}
> +
> +	id.total_capacity = cpu_to_le64(capacity);
> +
> +	memcpy(cmd->payload_out, &id, sizeof(id));
> +
> +	return 0;
> +}
> +
> +static int mock_get_lsa(struct cxl_mem *cxlm, struct cxl_mbox_cmd *cmd)
> +{
> +	struct cxl_mbox_get_lsa *get_lsa = cmd->payload_in;
> +	void *lsa = dev_get_drvdata(cxlm->dev);
> +	u32 offset, length;
> +
> +	if (sizeof(*get_lsa) > cmd->size_in)
> +		return -EINVAL;
> +	offset = le32_to_cpu(get_lsa->offset);
> +	length = le32_to_cpu(get_lsa->length);
> +	if (offset + length > LSA_SIZE)
> +		return -EINVAL;
> +	if (length > cmd->size_out)
> +		return -EINVAL;
> +
> +	memcpy(cmd->payload_out, lsa + offset, length);
> +	return 0;
> +}
> +
> +static int mock_set_lsa(struct cxl_mem *cxlm, struct cxl_mbox_cmd *cmd)
> +{
> +	struct cxl_mbox_set_lsa *set_lsa = cmd->payload_in;
> +	void *lsa = dev_get_drvdata(cxlm->dev);
> +	u32 offset, length;
> +
> +	if (sizeof(*set_lsa) > cmd->size_in)
> +		return -EINVAL;
> +	offset = le32_to_cpu(set_lsa->offset);
> +	length = cmd->size_in - sizeof(*set_lsa);
> +	if (offset + length > LSA_SIZE)
> +		return -EINVAL;
> +
> +	memcpy(lsa + offset, &set_lsa->data[0], length);
> +	return 0;
> +}
> +
> +static int cxl_mock_mbox_send(struct cxl_mem *cxlm, struct cxl_mbox_cmd *cmd)
> +{
> +	struct device *dev = cxlm->dev;
> +	int rc = -EIO;
> +
> +	switch (cmd->opcode) {
> +	case CXL_MBOX_OP_GET_SUPPORTED_LOGS:
> +		rc = mock_gsl(cmd);
> +		break;
> +	case CXL_MBOX_OP_GET_LOG:
> +		rc = mock_get_log(cxlm, cmd);
> +		break;
> +	case CXL_MBOX_OP_IDENTIFY:
> +		rc = mock_id(cxlm, cmd);
> +		break;
> +	case CXL_MBOX_OP_GET_LSA:
> +		rc = mock_get_lsa(cxlm, cmd);
> +		break;
> +	case CXL_MBOX_OP_SET_LSA:
> +		rc = mock_set_lsa(cxlm, cmd);
> +		break;
> +	default:
> +		break;
> +	}
> +
> +	dev_dbg(dev, "opcode: %#x sz_in: %zd sz_out: %zd rc: %d\n", cmd->opcode,
> +		cmd->size_in, cmd->size_out, rc);
> +
> +	return rc;
> +}
> +
> +static void label_area_release(void *lsa)
> +{
> +	vfree(lsa);
> +}
> +
> +static int cxl_mock_mem_probe(struct platform_device *pdev)
> +{
> +	struct device *dev = &pdev->dev;
> +	struct cxl_memdev *cxlmd;
> +	struct cxl_mem *cxlm;
> +	void *lsa;
> +	int rc;
> +
> +	lsa = vmalloc(LSA_SIZE);
> +	if (!lsa)
> +		return -ENOMEM;
> +	rc = devm_add_action_or_reset(dev, label_area_release, lsa);
> +	if (rc)
> +		return rc;
> +	dev_set_drvdata(dev, lsa);
> +
> +	cxlm = cxl_mem_create(dev);
> +	if (IS_ERR(cxlm))
> +		return PTR_ERR(cxlm);
> +
> +	cxlm->mbox_send = cxl_mock_mbox_send;
> +	cxlm->payload_size = SZ_4K;
> +
> +	rc = cxl_mem_enumerate_cmds(cxlm);
> +	if (rc)
> +		return rc;
> +
> +	rc = cxl_mem_identify(cxlm);
> +	if (rc)
> +		return rc;
> +
> +	rc = cxl_mem_create_range_info(cxlm);
> +	if (rc)
> +		return rc;
> +
> +	cxlmd = devm_cxl_add_memdev(cxlm);
> +	if (IS_ERR(cxlmd))
> +		return PTR_ERR(cxlmd);
> +
> +	if (range_len(&cxlm->pmem_range) && IS_ENABLED(CONFIG_CXL_PMEM))
> +		rc = devm_cxl_add_nvdimm(dev, cxlmd);
> +
> +	return 0;
> +}
> +
> +static const struct platform_device_id cxl_mock_mem_ids[] = {
> +	{ .name = "cxl_mem", },
> +	{ },
> +};
> +MODULE_DEVICE_TABLE(platform, cxl_mock_mem_ids);
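
Binding here is purely by platform device name, matching the devices that
test/cxl.c creates:

    /* tools/testing/cxl/test/cxl.c */
    pdev = platform_device_alloc("cxl_mem", id);

    /* matched by this driver's id_table entry */
    { .name = "cxl_mem", }
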
> +
> +static struct platform_driver cxl_mock_mem_driver = {
> +	.probe = cxl_mock_mem_probe,
> +	.id_table = cxl_mock_mem_ids,
> +	.driver = {
> +		.name = KBUILD_MODNAME,
> +	},
> +};
> +
> +module_platform_driver(cxl_mock_mem_driver);
> +MODULE_LICENSE("GPL v2");
> +MODULE_IMPORT_NS(CXL);
> 



* Re: [PATCH v4 21/21] cxl/core: Split decoder setup into alloc + add
  2021-09-09  5:13 ` [PATCH v4 21/21] cxl/core: Split decoder setup into alloc + add Dan Williams
@ 2021-09-10 10:33   ` Jonathan Cameron
  2021-09-10 18:36     ` Dan Williams
  2021-09-14 19:31   ` [PATCH v5 " Dan Williams
  1 sibling, 1 reply; 78+ messages in thread
From: Jonathan Cameron @ 2021-09-10 10:33 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, kernel test robot, Nathan Chancellor, Dan Carpenter,
	vishal.l.verma, nvdimm, ben.widawsky, alison.schofield,
	ira.weiny

On Wed, 8 Sep 2021 22:13:26 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> The kbuild robot reports:
> 
>     drivers/cxl/core/bus.c:516:1: warning: stack frame size (1032) exceeds
>     limit (1024) in function 'devm_cxl_add_decoder'
> 
> It is also the case that devm_cxl_add_decoder() is unwieldy to use for
> all the different decoder types. Fix the stack usage by splitting the
> creation into alloc and add steps. This also allows for context
> specific construction before adding.
> 
> With the split the caller is responsible for registering a devm callback
> to trigger device_unregister() for the decoder rather than it being
> implicit in the decoder registration. I.e. the routine that calls alloc
> is responsible for calling put_device() if the "add" operation fails.
> 
> Reported-by: kernel test robot <lkp@intel.com>
> Reported-by: Nathan Chancellor <nathan@kernel.org>
> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

A few minor things inline. This one was definitely a case where diff
wasn't being helpful in how it chose to format things!

I haven't taken the time to figure out whether the device_lock() changes
make complete sense, as I don't understand the intent.
I think they should be called out in the patch description as they
seem a little non-obvious.

Jonathan

> ---
>  drivers/cxl/acpi.c      |   84 +++++++++++++++++++++++++----------
>  drivers/cxl/core/bus.c  |  114 ++++++++++++++---------------------------------
>  drivers/cxl/core/core.h |    5 --
>  drivers/cxl/core/pmem.c |    7 ++-
>  drivers/cxl/cxl.h       |   16 +++----
>  5 files changed, 106 insertions(+), 120 deletions(-)
> 
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index 9d881eacdae5..654a80547526 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -82,7 +82,6 @@ static void cxl_add_cfmws_decoders(struct device *dev,
>  	struct cxl_decoder *cxld;
>  	acpi_size len, cur = 0;
>  	void *cedt_subtable;
> -	unsigned long flags;
>  	int rc;
>  
>  	len = acpi_cedt->length - sizeof(*acpi_cedt);
> @@ -119,24 +118,36 @@ static void cxl_add_cfmws_decoders(struct device *dev,
>  		for (i = 0; i < CFMWS_INTERLEAVE_WAYS(cfmws); i++)
>  			target_map[i] = cfmws->interleave_targets[i];
>  
> -		flags = cfmws_to_decoder_flags(cfmws->restrictions);
> -		cxld = devm_cxl_add_decoder(dev, root_port,
> -					    CFMWS_INTERLEAVE_WAYS(cfmws),
> -					    cfmws->base_hpa, cfmws->window_size,
> -					    CFMWS_INTERLEAVE_WAYS(cfmws),
> -					    CFMWS_INTERLEAVE_GRANULARITY(cfmws),
> -					    CXL_DECODER_EXPANDER,
> -					    flags, target_map);
> -
> -		if (IS_ERR(cxld)) {
> +		cxld = cxl_decoder_alloc(root_port,
> +					 CFMWS_INTERLEAVE_WAYS(cfmws));
> +		if (IS_ERR(cxld))
> +			goto next;
> +
> +		cxld->flags = cfmws_to_decoder_flags(cfmws->restrictions);
> +		cxld->target_type = CXL_DECODER_EXPANDER;
> +		cxld->range = (struct range) {
> +			.start = cfmws->base_hpa,
> +			.end = cfmws->base_hpa + cfmws->window_size - 1,
> +		};
> +		cxld->interleave_ways = CFMWS_INTERLEAVE_WAYS(cfmws);
> +		cxld->interleave_granularity =
> +			CFMWS_INTERLEAVE_GRANULARITY(cfmws);
> +
> +		rc = cxl_decoder_add(dev, cxld, target_map);
> +		if (rc)
> +			put_device(&cxld->dev);
> +		else
> +			rc = cxl_decoder_autoremove(dev, cxld);
> +		if (rc) {
>  			dev_err(dev, "Failed to add decoder for %#llx-%#llx\n",
>  				cfmws->base_hpa, cfmws->base_hpa +
>  				cfmws->window_size - 1);
> -		} else {
> -			dev_dbg(dev, "add: %s range %#llx-%#llx\n",
> -				dev_name(&cxld->dev), cfmws->base_hpa,
> -				 cfmws->base_hpa + cfmws->window_size - 1);
> +			goto next;
>  		}
> +		dev_dbg(dev, "add: %s range %#llx-%#llx\n",
> +			dev_name(&cxld->dev), cfmws->base_hpa,
> +			cfmws->base_hpa + cfmws->window_size - 1);
> +next:
>  		cur += c->length;
>  	}
>  }
> @@ -268,6 +279,7 @@ static int add_host_bridge_uport(struct device *match, void *arg)
>  	struct acpi_device *bridge = to_cxl_host_bridge(host, match);
>  	struct acpi_pci_root *pci_root;
>  	struct cxl_walk_context ctx;
> +	int single_port_map[1], rc;
>  	struct cxl_decoder *cxld;
>  	struct cxl_dport *dport;
>  	struct cxl_port *port;
> @@ -303,22 +315,46 @@ static int add_host_bridge_uport(struct device *match, void *arg)
>  		return -ENODEV;
>  	if (ctx.error)
>  		return ctx.error;
> +	if (ctx.count > 1)
> +		return 0;
>  
>  	/* TODO: Scan CHBCR for HDM Decoder resources */
>  
>  	/*
> -	 * In the single-port host-bridge case there are no HDM decoders
> -	 * in the CHBCR and a 1:1 passthrough decode is implied.
> +	 * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability
> +	 * Structure) single ported host-bridges need not publish a decoder
> +	 * capability when a passthrough decode can be assumed, i.e. all
> +	 * transactions that the uport sees are claimed and passed to the single
> +	 * dport. Default the range a 0-base 0-length until the first CXL region
> +	 * is activated.
>  	 */
> -	if (ctx.count == 1) {
> -		cxld = devm_cxl_add_passthrough_decoder(host, port);
> -		if (IS_ERR(cxld))
> -			return PTR_ERR(cxld);
> +	cxld = cxl_decoder_alloc(port, 1);
> +	if (IS_ERR(cxld))
> +		return PTR_ERR(cxld);
> +
> +	cxld->interleave_ways = 1;
> +	cxld->interleave_granularity = PAGE_SIZE;
> +	cxld->target_type = CXL_DECODER_EXPANDER;
> +	cxld->range = (struct range) {
> +		.start = 0,
> +		.end = -1,
> +	};
>  
> -		dev_dbg(host, "add: %s\n", dev_name(&cxld->dev));
> -	}

This was previously done without taking the device lock... (see below)

> +	device_lock(&port->dev);
> +	dport = list_first_entry(&port->dports, typeof(*dport), list);
> +	device_unlock(&port->dev);
>  
> -	return 0;
> +	single_port_map[0] = dport->port_id;
> +
> +	rc = cxl_decoder_add(host, cxld, single_port_map);
> +	if (rc)
> +		put_device(&cxld->dev);
> +	else
> +		rc = cxl_decoder_autoremove(host, cxld);
> +
> +	if (rc == 0)
> +		dev_dbg(host, "add: %s\n", dev_name(&cxld->dev));
> +	return rc;
>  }
>  
>  static int add_host_bridge_dport(struct device *match, void *arg)
> diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> index 176bede30c55..be787685b13e 100644
> --- a/drivers/cxl/core/bus.c
> +++ b/drivers/cxl/core/bus.c
> @@ -455,16 +455,18 @@ EXPORT_SYMBOL_GPL(cxl_add_dport);
>  
>  static int decoder_populate_targets(struct device *host,
>  				    struct cxl_decoder *cxld,
> -				    struct cxl_port *port, int *target_map,
> -				    int nr_targets)
> +				    struct cxl_port *port, int *target_map)
>  {
>  	int rc = 0, i;
>  
> +	if (list_empty(&port->dports))
> +		return -EINVAL;
> +

The code before this patch did this under the device_lock(). Perhaps call
out the fact that there is no need to do that, if indeed we don't need to?

>  	if (!target_map)
>  		return 0;
>  
>  	device_lock(&port->dev);
> -	for (i = 0; i < nr_targets; i++) {
> +	for (i = 0; i < cxld->nr_targets; i++) {
>  		struct cxl_dport *dport = find_dport(port, target_map[i]);
>  
>  		if (!dport) {
> @@ -479,27 +481,15 @@ static int decoder_populate_targets(struct device *host,
>  	return rc;
>  }
>  
> -static struct cxl_decoder *
> -cxl_decoder_alloc(struct device *host, struct cxl_port *port, int nr_targets,
> -		  resource_size_t base, resource_size_t len,
> -		  int interleave_ways, int interleave_granularity,
> -		  enum cxl_decoder_type type, unsigned long flags,
> -		  int *target_map)
> +struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
>  {
>  	struct cxl_decoder *cxld;
>  	struct device *dev;
>  	int rc = 0;
>  
> -	if (interleave_ways < 1)
> +	if (nr_targets > CXL_DECODER_MAX_INTERLEAVE || nr_targets < 1)
>  		return ERR_PTR(-EINVAL);
>  
> -	device_lock(&port->dev);
> -	if (list_empty(&port->dports))
> -		rc = -EINVAL;
> -	device_unlock(&port->dev);
> -	if (rc)
> -		return ERR_PTR(rc);
> -
>  	cxld = kzalloc(struct_size(cxld, target, nr_targets), GFP_KERNEL);
>  	if (!cxld)
>  		return ERR_PTR(-ENOMEM);
> @@ -508,22 +498,8 @@ cxl_decoder_alloc(struct device *host, struct cxl_port *port, int nr_targets,
>  	if (rc < 0)
>  		goto err;
>  
> -	*cxld = (struct cxl_decoder) {
> -		.id = rc,
> -		.range = {
> -			.start = base,
> -			.end = base + len - 1,
> -		},
> -		.flags = flags,
> -		.interleave_ways = interleave_ways,
> -		.interleave_granularity = interleave_granularity,
> -		.target_type = type,
> -	};
> -
> -	rc = decoder_populate_targets(host, cxld, port, target_map, nr_targets);
> -	if (rc)
> -		goto err;
> -
> +	cxld->id = rc;
> +	cxld->nr_targets = nr_targets;
>  	dev = &cxld->dev;
>  	device_initialize(dev);
>  	device_set_pm_not_required(dev);
> @@ -541,72 +517,48 @@ cxl_decoder_alloc(struct device *host, struct cxl_port *port, int nr_targets,
>  	kfree(cxld);
>  	return ERR_PTR(rc);
>  }
> +EXPORT_SYMBOL_GPL(cxl_decoder_alloc);
>  
> -struct cxl_decoder *
> -devm_cxl_add_decoder(struct device *host, struct cxl_port *port, int nr_targets,
> -		     resource_size_t base, resource_size_t len,
> -		     int interleave_ways, int interleave_granularity,
> -		     enum cxl_decoder_type type, unsigned long flags,
> -		     int *target_map)
> +int cxl_decoder_add(struct device *host, struct cxl_decoder *cxld,
> +		    int *target_map)
>  {
> -	struct cxl_decoder *cxld;
> +	struct cxl_port *port;
>  	struct device *dev;
>  	int rc;
>  
> -	if (nr_targets > CXL_DECODER_MAX_INTERLEAVE)
> -		return ERR_PTR(-EINVAL);
> +	if (!cxld)
> +		return -EINVAL;
>  
> -	cxld = cxl_decoder_alloc(host, port, nr_targets, base, len,
> -				 interleave_ways, interleave_granularity, type,
> -				 flags, target_map);
>  	if (IS_ERR(cxld))
> -		return cxld;
> +		return PTR_ERR(cxld);

It feels like the checks here are overly paranoid.
Obviously harmless, but if we get as far as cxl_decoder_add() with either
cxld == NULL or IS_ERR() then something horrible is going on.

I think you could reasonably drop them both.

>  
> -	dev = &cxld->dev;
> -	rc = dev_set_name(dev, "decoder%d.%d", port->id, cxld->id);
> -	if (rc)
> -		goto err;
> +	if (cxld->interleave_ways < 1)
> +		return -EINVAL;
>  
> -	rc = device_add(dev);
> +	port = to_cxl_port(cxld->dev.parent);
> +	rc = decoder_populate_targets(host, cxld, port, target_map);
>  	if (rc)
> -		goto err;
> +		return rc;
>  
> -	rc = devm_add_action_or_reset(host, unregister_cxl_dev, dev);
> +	dev = &cxld->dev;
> +	rc = dev_set_name(dev, "decoder%d.%d", port->id, cxld->id);
>  	if (rc)
> -		return ERR_PTR(rc);
> -	return cxld;
> +		return rc;
>  
> -err:
> -	put_device(dev);
> -	return ERR_PTR(rc);
> +	return device_add(dev);
>  }
> -EXPORT_SYMBOL_GPL(devm_cxl_add_decoder);
> +EXPORT_SYMBOL_GPL(cxl_decoder_add);
>  
> -/*
> - * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability Structure)
> - * single ported host-bridges need not publish a decoder capability when a
> - * passthrough decode can be assumed, i.e. all transactions that the uport sees
> - * are claimed and passed to the single dport. Default the range a 0-base
> - * 0-length until the first CXL region is activated.
> - */
> -struct cxl_decoder *devm_cxl_add_passthrough_decoder(struct device *host,
> -						     struct cxl_port *port)
> +static void cxld_unregister(void *dev)
>  {
> -	struct cxl_dport *dport;
> -	int target_map[1];
> -
> -	device_lock(&port->dev);
> -	dport = list_first_entry_or_null(&port->dports, typeof(*dport), list);
> -	device_unlock(&port->dev);
> -
> -	if (!dport)
> -		return ERR_PTR(-ENXIO);
> +	device_unregister(dev);
> +}
>  
> -	target_map[0] = dport->port_id;
> -	return devm_cxl_add_decoder(host, port, 1, 0, 0, 1, PAGE_SIZE,
> -				    CXL_DECODER_EXPANDER, 0, target_map);
> +int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld)
> +{
> +	return devm_add_action_or_reset(host, cxld_unregister, &cxld->dev);
>  }
> -EXPORT_SYMBOL_GPL(devm_cxl_add_passthrough_decoder);
> +EXPORT_SYMBOL_GPL(cxl_decoder_autoremove);
>  
>  /**
>   * __cxl_driver_register - register a driver for the cxl bus
> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> index c85b7fbad02d..e0c9aacc4e9c 100644
> --- a/drivers/cxl/core/core.h
> +++ b/drivers/cxl/core/core.h
> @@ -9,11 +9,6 @@ extern const struct device_type cxl_nvdimm_type;
>  
>  extern struct attribute_group cxl_base_attribute_group;
>  
> -static inline void unregister_cxl_dev(void *dev)
> -{
> -	device_unregister(dev);
> -}
> -
>  struct cxl_send_command;
>  struct cxl_mem_query_commands;
>  int cxl_query_cmd(struct cxl_memdev *cxlmd,
> diff --git a/drivers/cxl/core/pmem.c b/drivers/cxl/core/pmem.c
> index 74be5132df1c..5032f4c1c69d 100644
> --- a/drivers/cxl/core/pmem.c
> +++ b/drivers/cxl/core/pmem.c
> @@ -222,6 +222,11 @@ static struct cxl_nvdimm *cxl_nvdimm_alloc(struct cxl_memdev *cxlmd)
>  	return cxl_nvd;
>  }
>  
> +static void cxl_nvd_unregister(void *dev)
> +{
> +	device_unregister(dev);
> +}
> +
>  /**
>   * devm_cxl_add_nvdimm() - add a bridge between a cxl_memdev and an nvdimm
>   * @host: same host as @cxlmd
> @@ -251,7 +256,7 @@ int devm_cxl_add_nvdimm(struct device *host, struct cxl_memdev *cxlmd)
>  	dev_dbg(host, "%s: register %s\n", dev_name(dev->parent),
>  		dev_name(dev));
>  
> -	return devm_add_action_or_reset(host, unregister_cxl_dev, dev);
> +	return devm_add_action_or_reset(host, cxl_nvd_unregister, dev);
>  
>  err:
>  	put_device(dev);
...

> @@ -286,15 +288,11 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport, int port_id,
>  
>  struct cxl_decoder *to_cxl_decoder(struct device *dev);
>  bool is_root_decoder(struct device *dev);
> -struct cxl_decoder *
> -devm_cxl_add_decoder(struct device *host, struct cxl_port *port, int nr_targets,
> -		     resource_size_t base, resource_size_t len,
> -		     int interleave_ways, int interleave_granularity,
> -		     enum cxl_decoder_type type, unsigned long flags,
> -		     int *target_map);
> -
> -struct cxl_decoder *devm_cxl_add_passthrough_decoder(struct device *host,
> -						     struct cxl_port *port);
> +struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets);
> +int cxl_decoder_add(struct device *host, struct cxl_decoder *cxld,
> +		    int *target_map);
> +int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld);
> +
>  extern struct bus_type cxl_bus_type;
>  
>  struct cxl_driver {
> 



* Re: [PATCH v4 21/21] cxl/core: Split decoder setup into alloc + add
  2021-09-10 10:33   ` Jonathan Cameron
@ 2021-09-10 18:36     ` Dan Williams
  2021-09-11 17:15       ` Ben Widawsky
  0 siblings, 1 reply; 78+ messages in thread
From: Dan Williams @ 2021-09-10 18:36 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, kernel test robot, Nathan Chancellor, Dan Carpenter,
	Vishal L Verma, Linux NVDIMM, Ben Widawsky, Schofield, Alison,
	Weiny, Ira

On Fri, Sep 10, 2021 at 3:34 AM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Wed, 8 Sep 2021 22:13:26 -0700
> Dan Williams <dan.j.williams@intel.com> wrote:
>
> > The kbuild robot reports:
> >
> >     drivers/cxl/core/bus.c:516:1: warning: stack frame size (1032) exceeds
> >     limit (1024) in function 'devm_cxl_add_decoder'
> >
> > It is also the case the devm_cxl_add_decoder() is unwieldy to use for
> > all the different decoder types. Fix the stack usage by splitting the
> > creation into alloc and add steps. This also allows for context
> > specific construction before adding.
> >
> > With the split the caller is responsible for registering a devm callback
> > to trigger device_unregister() for the decoder rather than it being
> > implicit in the decoder registration. I.e. the routine that calls alloc
> > is responsible for calling put_device() if the "add" operation fails.
> >
> > Reported-by: kernel test robot <lkp@intel.com>
> > Reported-by: Nathan Chancellor <nathan@kernel.org>
> > Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>
> A few minor things inline. This one was definitely a case where diff
> wasn't being helpful in how it chose to format things!
>
> I haven't taken the time to figure out if the device_lock() changes
> make complete sense as I don't understand the intent.
> I think they should be called out in the patch description as they
> seem a little non obvious.
>
> Jonathan
>
> > ---
> >  drivers/cxl/acpi.c      |   84 +++++++++++++++++++++++++----------
> >  drivers/cxl/core/bus.c  |  114 ++++++++++++++---------------------------------
> >  drivers/cxl/core/core.h |    5 --
> >  drivers/cxl/core/pmem.c |    7 ++-
> >  drivers/cxl/cxl.h       |   16 +++----
> >  5 files changed, 106 insertions(+), 120 deletions(-)
> >
> > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> > index 9d881eacdae5..654a80547526 100644
> > --- a/drivers/cxl/acpi.c
> > +++ b/drivers/cxl/acpi.c
> > @@ -82,7 +82,6 @@ static void cxl_add_cfmws_decoders(struct device *dev,
> >       struct cxl_decoder *cxld;
> >       acpi_size len, cur = 0;
> >       void *cedt_subtable;
> > -     unsigned long flags;
> >       int rc;
> >
> >       len = acpi_cedt->length - sizeof(*acpi_cedt);
> > @@ -119,24 +118,36 @@ static void cxl_add_cfmws_decoders(struct device *dev,
> >               for (i = 0; i < CFMWS_INTERLEAVE_WAYS(cfmws); i++)
> >                       target_map[i] = cfmws->interleave_targets[i];
> >
> > -             flags = cfmws_to_decoder_flags(cfmws->restrictions);
> > -             cxld = devm_cxl_add_decoder(dev, root_port,
> > -                                         CFMWS_INTERLEAVE_WAYS(cfmws),
> > -                                         cfmws->base_hpa, cfmws->window_size,
> > -                                         CFMWS_INTERLEAVE_WAYS(cfmws),
> > -                                         CFMWS_INTERLEAVE_GRANULARITY(cfmws),
> > -                                         CXL_DECODER_EXPANDER,
> > -                                         flags, target_map);
> > -
> > -             if (IS_ERR(cxld)) {
> > +             cxld = cxl_decoder_alloc(root_port,
> > +                                      CFMWS_INTERLEAVE_WAYS(cfmws));
> > +             if (IS_ERR(cxld))
> > +                     goto next;
> > +
> > +             cxld->flags = cfmws_to_decoder_flags(cfmws->restrictions);
> > +             cxld->target_type = CXL_DECODER_EXPANDER;
> > +             cxld->range = (struct range) {
> > +                     .start = cfmws->base_hpa,
> > +                     .end = cfmws->base_hpa + cfmws->window_size - 1,
> > +             };
> > +             cxld->interleave_ways = CFMWS_INTERLEAVE_WAYS(cfmws);
> > +             cxld->interleave_granularity =
> > +                     CFMWS_INTERLEAVE_GRANULARITY(cfmws);
> > +
> > +             rc = cxl_decoder_add(dev, cxld, target_map);
> > +             if (rc)
> > +                     put_device(&cxld->dev);
> > +             else
> > +                     rc = cxl_decoder_autoremove(dev, cxld);
> > +             if (rc) {
> >                       dev_err(dev, "Failed to add decoder for %#llx-%#llx\n",
> >                               cfmws->base_hpa, cfmws->base_hpa +
> >                               cfmws->window_size - 1);
> > -             } else {
> > -                     dev_dbg(dev, "add: %s range %#llx-%#llx\n",
> > -                             dev_name(&cxld->dev), cfmws->base_hpa,
> > -                              cfmws->base_hpa + cfmws->window_size - 1);
> > +                     goto next;
> >               }
> > +             dev_dbg(dev, "add: %s range %#llx-%#llx\n",
> > +                     dev_name(&cxld->dev), cfmws->base_hpa,
> > +                     cfmws->base_hpa + cfmws->window_size - 1);
> > +next:
> >               cur += c->length;
> >       }
> >  }
> > @@ -268,6 +279,7 @@ static int add_host_bridge_uport(struct device *match, void *arg)
> >       struct acpi_device *bridge = to_cxl_host_bridge(host, match);
> >       struct acpi_pci_root *pci_root;
> >       struct cxl_walk_context ctx;
> > +     int single_port_map[1], rc;
> >       struct cxl_decoder *cxld;
> >       struct cxl_dport *dport;
> >       struct cxl_port *port;
> > @@ -303,22 +315,46 @@ static int add_host_bridge_uport(struct device *match, void *arg)
> >               return -ENODEV;
> >       if (ctx.error)
> >               return ctx.error;
> > +     if (ctx.count > 1)
> > +             return 0;
> >
> >       /* TODO: Scan CHBCR for HDM Decoder resources */
> >
> >       /*
> > -      * In the single-port host-bridge case there are no HDM decoders
> > -      * in the CHBCR and a 1:1 passthrough decode is implied.
> > +      * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability
> > +      * Structure) single ported host-bridges need not publish a decoder
> > +      * capability when a passthrough decode can be assumed, i.e. all
> > +      * transactions that the uport sees are claimed and passed to the single
> > +      * dport. Default the range a 0-base 0-length until the first CXL region
> > +      * is activated.
> >        */
> > -     if (ctx.count == 1) {
> > -             cxld = devm_cxl_add_passthrough_decoder(host, port);
> > -             if (IS_ERR(cxld))
> > -                     return PTR_ERR(cxld);
> > +     cxld = cxl_decoder_alloc(port, 1);
> > +     if (IS_ERR(cxld))
> > +             return PTR_ERR(cxld);
> > +
> > +     cxld->interleave_ways = 1;
> > +     cxld->interleave_granularity = PAGE_SIZE;
> > +     cxld->target_type = CXL_DECODER_EXPANDER;
> > +     cxld->range = (struct range) {
> > +             .start = 0,
> > +             .end = -1,
> > +     };
> >
> > -             dev_dbg(host, "add: %s\n", dev_name(&cxld->dev));
> > -     }
>
> This was previously done without taking the device lock... (see below)
>
> > +     device_lock(&port->dev);
> > +     dport = list_first_entry(&port->dports, typeof(*dport), list);
> > +     device_unlock(&port->dev);
> >
> > -     return 0;
> > +     single_port_map[0] = dport->port_id;
> > +
> > +     rc = cxl_decoder_add(host, cxld, single_port_map);
> > +     if (rc)
> > +             put_device(&cxld->dev);
> > +     else
> > +             rc = cxl_decoder_autoremove(host, cxld);
> > +
> > +     if (rc == 0)
> > +             dev_dbg(host, "add: %s\n", dev_name(&cxld->dev));
> > +     return rc;
> >  }
> >
> >  static int add_host_bridge_dport(struct device *match, void *arg)
> > diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> > index 176bede30c55..be787685b13e 100644
> > --- a/drivers/cxl/core/bus.c
> > +++ b/drivers/cxl/core/bus.c
> > @@ -455,16 +455,18 @@ EXPORT_SYMBOL_GPL(cxl_add_dport);
> >
> >  static int decoder_populate_targets(struct device *host,
> >                                   struct cxl_decoder *cxld,
> > -                                 struct cxl_port *port, int *target_map,
> > -                                 int nr_targets)
> > +                                 struct cxl_port *port, int *target_map)
> >  {
> >       int rc = 0, i;
> >
> > +     if (list_empty(&port->dports))
> > +             return -EINVAL;
> > +
>
> The code before this patch did this under the device_lock() Perhaps call
> out the fact there is no need to do that if we don't need to?

Nope, bug, good catch.

>
> >       if (!target_map)
> >               return 0;
> >
> >       device_lock(&port->dev);
> > -     for (i = 0; i < nr_targets; i++) {
> > +     for (i = 0; i < cxld->nr_targets; i++) {
> >               struct cxl_dport *dport = find_dport(port, target_map[i]);
> >
> >               if (!dport) {
> > @@ -479,27 +481,15 @@ static int decoder_populate_targets(struct device *host,
> >       return rc;
> >  }
> >
> > -static struct cxl_decoder *
> > -cxl_decoder_alloc(struct device *host, struct cxl_port *port, int nr_targets,
> > -               resource_size_t base, resource_size_t len,
> > -               int interleave_ways, int interleave_granularity,
> > -               enum cxl_decoder_type type, unsigned long flags,
> > -               int *target_map)
> > +struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
> >  {
> >       struct cxl_decoder *cxld;
> >       struct device *dev;
> >       int rc = 0;
> >
> > -     if (interleave_ways < 1)
> > +     if (nr_targets > CXL_DECODER_MAX_INTERLEAVE || nr_targets < 1)
> >               return ERR_PTR(-EINVAL);
> >
> > -     device_lock(&port->dev);
> > -     if (list_empty(&port->dports))
> > -             rc = -EINVAL;
> > -     device_unlock(&port->dev);
> > -     if (rc)
> > -             return ERR_PTR(rc);
> > -
> >       cxld = kzalloc(struct_size(cxld, target, nr_targets), GFP_KERNEL);
> >       if (!cxld)
> >               return ERR_PTR(-ENOMEM);
> > @@ -508,22 +498,8 @@ cxl_decoder_alloc(struct device *host, struct cxl_port *port, int nr_targets,
> >       if (rc < 0)
> >               goto err;
> >
> > -     *cxld = (struct cxl_decoder) {
> > -             .id = rc,
> > -             .range = {
> > -                     .start = base,
> > -                     .end = base + len - 1,
> > -             },
> > -             .flags = flags,
> > -             .interleave_ways = interleave_ways,
> > -             .interleave_granularity = interleave_granularity,
> > -             .target_type = type,
> > -     };
> > -
> > -     rc = decoder_populate_targets(host, cxld, port, target_map, nr_targets);
> > -     if (rc)
> > -             goto err;
> > -
> > +     cxld->id = rc;
> > +     cxld->nr_targets = nr_targets;
> >       dev = &cxld->dev;
> >       device_initialize(dev);
> >       device_set_pm_not_required(dev);
> > @@ -541,72 +517,48 @@ cxl_decoder_alloc(struct device *host, struct cxl_port *port, int nr_targets,
> >       kfree(cxld);
> >       return ERR_PTR(rc);
> >  }
> > +EXPORT_SYMBOL_GPL(cxl_decoder_alloc);
> >
> > -struct cxl_decoder *
> > -devm_cxl_add_decoder(struct device *host, struct cxl_port *port, int nr_targets,
> > -                  resource_size_t base, resource_size_t len,
> > -                  int interleave_ways, int interleave_granularity,
> > -                  enum cxl_decoder_type type, unsigned long flags,
> > -                  int *target_map)
> > +int cxl_decoder_add(struct device *host, struct cxl_decoder *cxld,
> > +                 int *target_map)
> >  {
> > -     struct cxl_decoder *cxld;
> > +     struct cxl_port *port;
> >       struct device *dev;
> >       int rc;
> >
> > -     if (nr_targets > CXL_DECODER_MAX_INTERLEAVE)
> > -             return ERR_PTR(-EINVAL);
> > +     if (!cxld)
> > +             return -EINVAL;
> >
> > -     cxld = cxl_decoder_alloc(host, port, nr_targets, base, len,
> > -                              interleave_ways, interleave_granularity, type,
> > -                              flags, target_map);
> >       if (IS_ERR(cxld))
> > -             return cxld;
> > +             return PTR_ERR(cxld);
>
> It feels like the checks here are overly paranoid.
> Obviously harmless, but if we get as far as cxl_decoder_add() with either
> cxld == NULL or IS_ERR() then something horrible is going on.

True in the current users...

>
> I think you could reasonably drop them both.

They could be dropped, but they have already paid dividends by
indicating to static analysis checkers that the input could be NULL.
If these were internal static functions I would agree with dropping
the extra error checking, but since this is exported and the call
sites will grow, I'm inclined to keep them.
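
Keeping them also lets a hypothetical future call site pass the
allocation result straight through without an open-coded check, e.g.
(sketch only):

	cxld = cxl_decoder_alloc(port, nr_targets);
	rc = cxl_decoder_add(host, cxld, target_map); /* tolerates ERR_PTR */
	if (rc) {
		if (!IS_ERR(cxld))
			put_device(&cxld->dev);
		return rc;
	}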


* Re: [PATCH v4 17/21] tools/testing/cxl: Introduce a mocked-up CXL port hierarchy
  2021-09-10  9:53   ` Jonathan Cameron
@ 2021-09-10 18:46     ` Dan Williams
  0 siblings, 0 replies; 78+ messages in thread
From: Dan Williams @ 2021-09-10 18:46 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, Ben Widawsky, Vishal Verma, Linux NVDIMM, Schofield,
	Alison, Weiny, Ira

On Fri, Sep 10, 2021 at 2:53 AM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Wed, 8 Sep 2021 22:13:04 -0700
> Dan Williams <dan.j.williams@intel.com> wrote:
>
> > Create an environment for CXL plumbing unit tests. Especially when it
> > comes to an algorithm for HDM Decoder (Host-managed Device Memory
> > Decoder) programming, the availability of an in-kernel-tree emulation
> > environment for CXL configuration complexity and corner cases speeds
> > development and deters regressions.
> >
> > The approach taken mirrors what was done for tools/testing/nvdimm/. I.e.
> > an external module, cxl_test.ko built out of the tools/testing/cxl/
> > directory, provides mock implementations of kernel APIs and kernel
> > objects to simulate a real world device hierarchy.
> >
> > One piece of feedback on the tools/testing/nvdimm/ proposal was "why not do this
> > in QEMU?". In fact, the CXL development community has developed a QEMU
> > model for CXL [1]. However, there are a few blocking issues that keep
> > QEMU from being a tight fit for topology + provisioning unit tests:
> >
> > 1/ The QEMU community has yet to show interest in merging any of this
> >    support that has had patches on the list since November 2020. So,
> >    testing CXL to date involves building custom QEMU with out-of-tree
> >    patches.
> >
> > 2/ CXL mechanisms like cross-host-bridge interleave do not have a clear
> >    path to be emulated by QEMU without major infrastructure work. This
> >    is easier to achieve with the alloc_mock_res() approach taken in this
> >    patch to shortcut-define emulated system physical address ranges with
> >    interleave behavior.
> >
> > The QEMU enabling has been critical to get the driver off the ground,
> > and may still move forward, but it does not address the ongoing needs of
> > a regression testing environment and test driven development.
> >
> > This patch adds an ACPI CXL Platform definition with emulated CXL
> > multi-ported host-bridges. A follow on patch adds emulated memory
> > expander devices.
> >
> > Acked-by: Ben Widawsky <ben.widawsky@intel.com>
> > Reported-by: Vishal Verma <vishal.l.verma@intel.com>
> > Link: https://lore.kernel.org/r/20210202005948.241655-1-ben.widawsky@intel.com [1]
> > Link: https://lore.kernel.org/r/162982125348.1124374.17808192318402734926.stgit@dwillia2-desk3.amr.corp.intel.com
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> A trivial comment below, but I'm fine with leaving that one change in here
> as it is only a very small amount of noise.
>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>
>
>
> > ---
> >  drivers/cxl/acpi.c               |   40 ++-
> >  drivers/cxl/cxl.h                |   16 +
> >  tools/testing/cxl/Kbuild         |   36 +++
> >  tools/testing/cxl/config_check.c |   13 +
> >  tools/testing/cxl/mock_acpi.c    |  109 ++++++++
> >  tools/testing/cxl/test/Kbuild    |    6
> >  tools/testing/cxl/test/cxl.c     |  509 ++++++++++++++++++++++++++++++++++++++
> >  tools/testing/cxl/test/mock.c    |  171 +++++++++++++
> >  tools/testing/cxl/test/mock.h    |   27 ++
> >  9 files changed, 911 insertions(+), 16 deletions(-)
> >  create mode 100644 tools/testing/cxl/Kbuild
> >  create mode 100644 tools/testing/cxl/config_check.c
> >  create mode 100644 tools/testing/cxl/mock_acpi.c
> >  create mode 100644 tools/testing/cxl/test/Kbuild
> >  create mode 100644 tools/testing/cxl/test/cxl.c
> >  create mode 100644 tools/testing/cxl/test/mock.c
> >  create mode 100644 tools/testing/cxl/test/mock.h
> >
> > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> > index 54e9d4d2cf5f..d31a97218593 100644
> > --- a/drivers/cxl/acpi.c
> > +++ b/drivers/cxl/acpi.c
> > @@ -182,15 +182,7 @@ static resource_size_t get_chbcr(struct acpi_cedt_chbs *chbs)
> >       return IS_ERR(chbs) ? CXL_RESOURCE_NONE : chbs->base;
> >  }
> >
> > -struct cxl_walk_context {
> > -     struct device *dev;
> > -     struct pci_bus *root;
> > -     struct cxl_port *port;
> > -     int error;
> > -     int count;
> > -};
> > -
> > -static int match_add_root_ports(struct pci_dev *pdev, void *data)
> > +__mock int match_add_root_ports(struct pci_dev *pdev, void *data)
> >  {
> >       struct cxl_walk_context *ctx = data;
> >       struct pci_bus *root_bus = ctx->root;
> > @@ -239,15 +231,18 @@ static struct cxl_dport *find_dport_by_dev(struct cxl_port *port, struct device
> >       return NULL;
> >  }
> >
> > -static struct acpi_device *to_cxl_host_bridge(struct device *dev)
> > +__mock struct acpi_device *to_cxl_host_bridge(struct device *host,
> > +                                           struct device *dev)
> >  {
> >       struct acpi_device *adev = to_acpi_device(dev);
> >
> >       if (!acpi_pci_find_root(adev->handle))
> >               return NULL;
> >
> > -     if (strcmp(acpi_device_hid(adev), "ACPI0016") == 0)
> > +     if (strcmp(acpi_device_hid(adev), "ACPI0016") == 0) {
> > +             dev_dbg(host, "found host bridge %s\n", dev_name(&adev->dev));
>
> I didn't call it out in the previous review, but it is technically
> unrelated to the rest of the patch, even if useful.

I'll delete it. It was added during development to debug the mocking
code, but it should have been moved to a separate patch, or dropped.
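
For context, the interception mechanism cxl_test borrows from nfit_test
is the linker's --wrap facility: the Kbuild relinks the modules under
test so that selected symbol references resolve to __wrap_ variants,
which can route to a mock or fall back to the real implementation. A
rough sketch (illustrative only; mock_active() and mock_acpi_get_table()
are stand-in names, not the helpers in this series):

	acpi_status __real_acpi_get_table(char *signature, u32 instance,
					  struct acpi_table_header **out_table);

	acpi_status __wrap_acpi_get_table(char *signature, u32 instance,
					  struct acpi_table_header **out_table)
	{
		if (mock_active()) /* hypothetical mock-detection check */
			return mock_acpi_get_table(signature, instance,
						   out_table);
		return __real_acpi_get_table(signature, instance, out_table);
	}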


* Re: [PATCH v4 21/21] cxl/core: Split decoder setup into alloc + add
  2021-09-10 18:36     ` Dan Williams
@ 2021-09-11 17:15       ` Ben Widawsky
  2021-09-11 20:20         ` Dan Williams
  0 siblings, 1 reply; 78+ messages in thread
From: Ben Widawsky @ 2021-09-11 17:15 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jonathan Cameron, linux-cxl, kernel test robot,
	Nathan Chancellor, Dan Carpenter, Vishal L Verma, Linux NVDIMM,
	Schofield, Alison, Weiny, Ira

On 21-09-10 11:36:05, Dan Williams wrote:
> On Fri, Sep 10, 2021 at 3:34 AM Jonathan Cameron
> <Jonathan.Cameron@huawei.com> wrote:
> >
> > On Wed, 8 Sep 2021 22:13:26 -0700
> > Dan Williams <dan.j.williams@intel.com> wrote:
> >
> > > The kbuild robot reports:
> > >
> > >     drivers/cxl/core/bus.c:516:1: warning: stack frame size (1032) exceeds
> > >     limit (1024) in function 'devm_cxl_add_decoder'
> > >
> > > It is also the case the devm_cxl_add_decoder() is unwieldy to use for
> > > all the different decoder types. Fix the stack usage by splitting the
> > > creation into alloc and add steps. This also allows for context
> > > specific construction before adding.
> > >
> > > With the split the caller is responsible for registering a devm callback
> > > to trigger device_unregister() for the decoder rather than it being
> > > implicit in the decoder registration. I.e. the routine that calls alloc
> > > is responsible for calling put_device() if the "add" operation fails.
> > >
> > > Reported-by: kernel test robot <lkp@intel.com>
> > > Reported-by: Nathan Chancellor <nathan@kernel.org>
> > > Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> > > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> >
> > A few minor things inline. This one was definitely a case where diff
> > wasn't being helpful in how it chose to format things!
> >
> > I haven't taken the time to figure out if the device_lock() changes
> > make complete sense as I don't understand the intent.
> > I think they should be called out in the patch description as they
> > seem a little non obvious.
> >
> > Jonathan
> >
> > > ---
> > >  drivers/cxl/acpi.c      |   84 +++++++++++++++++++++++++----------
> > >  drivers/cxl/core/bus.c  |  114 ++++++++++++++---------------------------------
> > >  drivers/cxl/core/core.h |    5 --
> > >  drivers/cxl/core/pmem.c |    7 ++-
> > >  drivers/cxl/cxl.h       |   16 +++----
> > >  5 files changed, 106 insertions(+), 120 deletions(-)
> > >
> > > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> > > index 9d881eacdae5..654a80547526 100644
> > > --- a/drivers/cxl/acpi.c
> > > +++ b/drivers/cxl/acpi.c
> > > @@ -82,7 +82,6 @@ static void cxl_add_cfmws_decoders(struct device *dev,
> > >       struct cxl_decoder *cxld;
> > >       acpi_size len, cur = 0;
> > >       void *cedt_subtable;
> > > -     unsigned long flags;
> > >       int rc;
> > >
> > >       len = acpi_cedt->length - sizeof(*acpi_cedt);
> > > @@ -119,24 +118,36 @@ static void cxl_add_cfmws_decoders(struct device *dev,
> > >               for (i = 0; i < CFMWS_INTERLEAVE_WAYS(cfmws); i++)
> > >                       target_map[i] = cfmws->interleave_targets[i];
> > >
> > > -             flags = cfmws_to_decoder_flags(cfmws->restrictions);
> > > -             cxld = devm_cxl_add_decoder(dev, root_port,
> > > -                                         CFMWS_INTERLEAVE_WAYS(cfmws),
> > > -                                         cfmws->base_hpa, cfmws->window_size,
> > > -                                         CFMWS_INTERLEAVE_WAYS(cfmws),
> > > -                                         CFMWS_INTERLEAVE_GRANULARITY(cfmws),
> > > -                                         CXL_DECODER_EXPANDER,
> > > -                                         flags, target_map);
> > > -
> > > -             if (IS_ERR(cxld)) {
> > > +             cxld = cxl_decoder_alloc(root_port,
> > > +                                      CFMWS_INTERLEAVE_WAYS(cfmws));
> > > +             if (IS_ERR(cxld))
> > > +                     goto next;
> > > +
> > > +             cxld->flags = cfmws_to_decoder_flags(cfmws->restrictions);
> > > +             cxld->target_type = CXL_DECODER_EXPANDER;
> > > +             cxld->range = (struct range) {
> > > +                     .start = cfmws->base_hpa,
> > > +                     .end = cfmws->base_hpa + cfmws->window_size - 1,
> > > +             };
> > > +             cxld->interleave_ways = CFMWS_INTERLEAVE_WAYS(cfmws);
> > > +             cxld->interleave_granularity =
> > > +                     CFMWS_INTERLEAVE_GRANULARITY(cfmws);
> > > +
> > > +             rc = cxl_decoder_add(dev, cxld, target_map);
> > > +             if (rc)
> > > +                     put_device(&cxld->dev);
> > > +             else
> > > +                     rc = cxl_decoder_autoremove(dev, cxld);
> > > +             if (rc) {
> > >                       dev_err(dev, "Failed to add decoder for %#llx-%#llx\n",
> > >                               cfmws->base_hpa, cfmws->base_hpa +
> > >                               cfmws->window_size - 1);
> > > -             } else {
> > > -                     dev_dbg(dev, "add: %s range %#llx-%#llx\n",
> > > -                             dev_name(&cxld->dev), cfmws->base_hpa,
> > > -                              cfmws->base_hpa + cfmws->window_size - 1);
> > > +                     goto next;
> > >               }
> > > +             dev_dbg(dev, "add: %s range %#llx-%#llx\n",
> > > +                     dev_name(&cxld->dev), cfmws->base_hpa,
> > > +                     cfmws->base_hpa + cfmws->window_size - 1);
> > > +next:
> > >               cur += c->length;
> > >       }
> > >  }
> > > @@ -268,6 +279,7 @@ static int add_host_bridge_uport(struct device *match, void *arg)
> > >       struct acpi_device *bridge = to_cxl_host_bridge(host, match);
> > >       struct acpi_pci_root *pci_root;
> > >       struct cxl_walk_context ctx;
> > > +     int single_port_map[1], rc;
> > >       struct cxl_decoder *cxld;
> > >       struct cxl_dport *dport;
> > >       struct cxl_port *port;
> > > @@ -303,22 +315,46 @@ static int add_host_bridge_uport(struct device *match, void *arg)
> > >               return -ENODEV;
> > >       if (ctx.error)
> > >               return ctx.error;
> > > +     if (ctx.count > 1)
> > > +             return 0;
> > >
> > >       /* TODO: Scan CHBCR for HDM Decoder resources */
> > >
> > >       /*
> > > -      * In the single-port host-bridge case there are no HDM decoders
> > > -      * in the CHBCR and a 1:1 passthrough decode is implied.
> > > +      * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability
> > > +      * Structure) single ported host-bridges need not publish a decoder
> > > +      * capability when a passthrough decode can be assumed, i.e. all
> > > +      * transactions that the uport sees are claimed and passed to the single
> > > +      * dport. Default the range a 0-base 0-length until the first CXL region
> > > +      * is activated.
> > >        */
> > > -     if (ctx.count == 1) {
> > > -             cxld = devm_cxl_add_passthrough_decoder(host, port);
> > > -             if (IS_ERR(cxld))
> > > -                     return PTR_ERR(cxld);
> > > +     cxld = cxl_decoder_alloc(port, 1);
> > > +     if (IS_ERR(cxld))
> > > +             return PTR_ERR(cxld);
> > > +
> > > +     cxld->interleave_ways = 1;
> > > +     cxld->interleave_granularity = PAGE_SIZE;
> > > +     cxld->target_type = CXL_DECODER_EXPANDER;
> > > +     cxld->range = (struct range) {
> > > +             .start = 0,
> > > +             .end = -1,
> > > +     };
> > >
> > > -             dev_dbg(host, "add: %s\n", dev_name(&cxld->dev));
> > > -     }
> >
> > This was previously done without taking the device lock... (see below)
> >
> > > +     device_lock(&port->dev);
> > > +     dport = list_first_entry(&port->dports, typeof(*dport), list);
> > > +     device_unlock(&port->dev);
> > >
> > > -     return 0;
> > > +     single_port_map[0] = dport->port_id;
> > > +
> > > +     rc = cxl_decoder_add(host, cxld, single_port_map);
> > > +     if (rc)
> > > +             put_device(&cxld->dev);
> > > +     else
> > > +             rc = cxl_decoder_autoremove(host, cxld);
> > > +
> > > +     if (rc == 0)
> > > +             dev_dbg(host, "add: %s\n", dev_name(&cxld->dev));
> > > +     return rc;
> > >  }
> > >
> > >  static int add_host_bridge_dport(struct device *match, void *arg)
> > > diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> > > index 176bede30c55..be787685b13e 100644
> > > --- a/drivers/cxl/core/bus.c
> > > +++ b/drivers/cxl/core/bus.c
> > > @@ -455,16 +455,18 @@ EXPORT_SYMBOL_GPL(cxl_add_dport);
> > >
> > >  static int decoder_populate_targets(struct device *host,
> > >                                   struct cxl_decoder *cxld,
> > > -                                 struct cxl_port *port, int *target_map,
> > > -                                 int nr_targets)
> > > +                                 struct cxl_port *port, int *target_map)
> > >  {
> > >       int rc = 0, i;
> > >
> > > +     if (list_empty(&port->dports))
> > > +             return -EINVAL;
> > > +
> >
> > The code before this patch did this under the device_lock() Perhaps call
> > out the fact there is no need to do that if we don't need to?
> 
> Nope, bug, good catch.
> 

While you're changing this, might I request that you make this change so I can drop my
patch:

commit fe46c7a3e30129c649e17a4c555699e816cf04e7
Author: Ben Widawsky <ben.widawsky@intel.com>
Date:   16 hours ago

    core/bus: Don't error for targetless decoders

    Decoders may not have any targets; endpoints are the best example of
    this. To avoid special-casing, it's easiest to simply not return an
    error when no target map is specified, as those endpoints also won't
    have dports.

    Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
index c13b6c4d4135..b98b3e343a3d 100644
--- a/drivers/cxl/core/bus.c
+++ b/drivers/cxl/core/bus.c
@@ -547,12 +547,12 @@ static int decoder_populate_targets(struct device *host,
 {
        int rc = 0, i;

-       if (list_empty(&port->dports))
-               return -EINVAL;
-
        if (!target_map)
                return 0;

+       if (list_empty(&port->dports))
+               return -EINVAL;
+
        device_lock(&port->dev);
        for (i = 0; i < cxld->nr_targets; i++) {
                struct cxl_dport *dport = find_dport(port, target_map[i]);

> >
> > >       if (!target_map)
> > >               return 0;
> > >
> > >       device_lock(&port->dev);
> > > -     for (i = 0; i < nr_targets; i++) {
> > > +     for (i = 0; i < cxld->nr_targets; i++) {
> > >               struct cxl_dport *dport = find_dport(port, target_map[i]);
> > >
> > >               if (!dport) {
> > > @@ -479,27 +481,15 @@ static int decoder_populate_targets(struct device *host,
> > >       return rc;
> > >  }
> > >
> > > -static struct cxl_decoder *
> > > -cxl_decoder_alloc(struct device *host, struct cxl_port *port, int nr_targets,
> > > -               resource_size_t base, resource_size_t len,
> > > -               int interleave_ways, int interleave_granularity,
> > > -               enum cxl_decoder_type type, unsigned long flags,
> > > -               int *target_map)
> > > +struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
> > >  {
> > >       struct cxl_decoder *cxld;
> > >       struct device *dev;
> > >       int rc = 0;
> > >
> > > -     if (interleave_ways < 1)
> > > +     if (nr_targets > CXL_DECODER_MAX_INTERLEAVE || nr_targets < 1)
> > >               return ERR_PTR(-EINVAL);
> > >
> > > -     device_lock(&port->dev);
> > > -     if (list_empty(&port->dports))
> > > -             rc = -EINVAL;
> > > -     device_unlock(&port->dev);
> > > -     if (rc)
> > > -             return ERR_PTR(rc);
> > > -
> > >       cxld = kzalloc(struct_size(cxld, target, nr_targets), GFP_KERNEL);
> > >       if (!cxld)
> > >               return ERR_PTR(-ENOMEM);
> > > @@ -508,22 +498,8 @@ cxl_decoder_alloc(struct device *host, struct cxl_port *port, int nr_targets,
> > >       if (rc < 0)
> > >               goto err;
> > >
> > > -     *cxld = (struct cxl_decoder) {
> > > -             .id = rc,
> > > -             .range = {
> > > -                     .start = base,
> > > -                     .end = base + len - 1,
> > > -             },
> > > -             .flags = flags,
> > > -             .interleave_ways = interleave_ways,
> > > -             .interleave_granularity = interleave_granularity,
> > > -             .target_type = type,
> > > -     };
> > > -
> > > -     rc = decoder_populate_targets(host, cxld, port, target_map, nr_targets);
> > > -     if (rc)
> > > -             goto err;
> > > -
> > > +     cxld->id = rc;
> > > +     cxld->nr_targets = nr_targets;
> > >       dev = &cxld->dev;
> > >       device_initialize(dev);
> > >       device_set_pm_not_required(dev);
> > > @@ -541,72 +517,48 @@ cxl_decoder_alloc(struct device *host, struct cxl_port *port, int nr_targets,
> > >       kfree(cxld);
> > >       return ERR_PTR(rc);
> > >  }
> > > +EXPORT_SYMBOL_GPL(cxl_decoder_alloc);
> > >
> > > -struct cxl_decoder *
> > > -devm_cxl_add_decoder(struct device *host, struct cxl_port *port, int nr_targets,
> > > -                  resource_size_t base, resource_size_t len,
> > > -                  int interleave_ways, int interleave_granularity,
> > > -                  enum cxl_decoder_type type, unsigned long flags,
> > > -                  int *target_map)
> > > +int cxl_decoder_add(struct device *host, struct cxl_decoder *cxld,
> > > +                 int *target_map)
> > >  {
> > > -     struct cxl_decoder *cxld;
> > > +     struct cxl_port *port;
> > >       struct device *dev;
> > >       int rc;
> > >
> > > -     if (nr_targets > CXL_DECODER_MAX_INTERLEAVE)
> > > -             return ERR_PTR(-EINVAL);
> > > +     if (!cxld)
> > > +             return -EINVAL;
> > >
> > > -     cxld = cxl_decoder_alloc(host, port, nr_targets, base, len,
> > > -                              interleave_ways, interleave_granularity, type,
> > > -                              flags, target_map);
> > >       if (IS_ERR(cxld))
> > > -             return cxld;
> > > +             return PTR_ERR(cxld);
> >
> > It feels like the checks here are overly paranoid.
> > Obviously harmless, but if we get as far as cxl_decoder_add() with either
> > cxld == NULL or IS_ERR() then something horrible is going on.
> 
> True in the current users...
> 
> >
> > I think you could reasonably drop them both.
> 
> They could be dropped, but it actually already paid dividends in
> indicating to static analysis checkers that the input could be NULL.
> If these were internal static functions I would agree with dropping
> the extra error checking, but since this is exported and the call
> sites will grow I'm inclined to keep them.


* Re: [PATCH v4 21/21] cxl/core: Split decoder setup into alloc + add
  2021-09-11 17:15       ` Ben Widawsky
@ 2021-09-11 20:20         ` Dan Williams
  0 siblings, 0 replies; 78+ messages in thread
From: Dan Williams @ 2021-09-11 20:20 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: Jonathan Cameron, linux-cxl, kernel test robot,
	Nathan Chancellor, Dan Carpenter, Vishal L Verma, Linux NVDIMM,
	Schofield, Alison, Weiny, Ira

On Sat, Sep 11, 2021 at 10:15 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-09-10 11:36:05, Dan Williams wrote:
> > On Fri, Sep 10, 2021 at 3:34 AM Jonathan Cameron
> > <Jonathan.Cameron@huawei.com> wrote:
> > >
> > > On Wed, 8 Sep 2021 22:13:26 -0700
> > > Dan Williams <dan.j.williams@intel.com> wrote:
> > >
> > > > The kbuild robot reports:
> > > >
> > > >     drivers/cxl/core/bus.c:516:1: warning: stack frame size (1032) exceeds
> > > >     limit (1024) in function 'devm_cxl_add_decoder'
> > > >
> > > > It is also the case the devm_cxl_add_decoder() is unwieldy to use for
> > > > all the different decoder types. Fix the stack usage by splitting the
> > > > creation into alloc and add steps. This also allows for context
> > > > specific construction before adding.
> > > >
> > > > With the split the caller is responsible for registering a devm callback
> > > > to trigger device_unregister() for the decoder rather than it being
> > > > implicit in the decoder registration. I.e. the routine that calls alloc
> > > > is responsible for calling put_device() if the "add" operation fails.
> > > >
> > > > Reported-by: kernel test robot <lkp@intel.com>
> > > > Reported-by: Nathan Chancellor <nathan@kernel.org>
> > > > Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> > > > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> > >
> > > A few minor things inline. This one was definitely a case where diff
> > > wasn't being helpful in how it chose to format things!
> > >
> > > I haven't taken the time to figure out if the device_lock() changes
> > > make complete sense as I don't understand the intent.
> > > I think they should be called out in the patch description as they
> > > seem a little non obvious.
> > >
> > > Jonathan
> > >
> > > > ---
> > > >  drivers/cxl/acpi.c      |   84 +++++++++++++++++++++++++----------
> > > >  drivers/cxl/core/bus.c  |  114 ++++++++++++++---------------------------------
> > > >  drivers/cxl/core/core.h |    5 --
> > > >  drivers/cxl/core/pmem.c |    7 ++-
> > > >  drivers/cxl/cxl.h       |   16 +++----
> > > >  5 files changed, 106 insertions(+), 120 deletions(-)
> > > >
> > > > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> > > > index 9d881eacdae5..654a80547526 100644
> > > > --- a/drivers/cxl/acpi.c
> > > > +++ b/drivers/cxl/acpi.c
> > > > @@ -82,7 +82,6 @@ static void cxl_add_cfmws_decoders(struct device *dev,
> > > >       struct cxl_decoder *cxld;
> > > >       acpi_size len, cur = 0;
> > > >       void *cedt_subtable;
> > > > -     unsigned long flags;
> > > >       int rc;
> > > >
> > > >       len = acpi_cedt->length - sizeof(*acpi_cedt);
> > > > @@ -119,24 +118,36 @@ static void cxl_add_cfmws_decoders(struct device *dev,
> > > >               for (i = 0; i < CFMWS_INTERLEAVE_WAYS(cfmws); i++)
> > > >                       target_map[i] = cfmws->interleave_targets[i];
> > > >
> > > > -             flags = cfmws_to_decoder_flags(cfmws->restrictions);
> > > > -             cxld = devm_cxl_add_decoder(dev, root_port,
> > > > -                                         CFMWS_INTERLEAVE_WAYS(cfmws),
> > > > -                                         cfmws->base_hpa, cfmws->window_size,
> > > > -                                         CFMWS_INTERLEAVE_WAYS(cfmws),
> > > > -                                         CFMWS_INTERLEAVE_GRANULARITY(cfmws),
> > > > -                                         CXL_DECODER_EXPANDER,
> > > > -                                         flags, target_map);
> > > > -
> > > > -             if (IS_ERR(cxld)) {
> > > > +             cxld = cxl_decoder_alloc(root_port,
> > > > +                                      CFMWS_INTERLEAVE_WAYS(cfmws));
> > > > +             if (IS_ERR(cxld))
> > > > +                     goto next;
> > > > +
> > > > +             cxld->flags = cfmws_to_decoder_flags(cfmws->restrictions);
> > > > +             cxld->target_type = CXL_DECODER_EXPANDER;
> > > > +             cxld->range = (struct range) {
> > > > +                     .start = cfmws->base_hpa,
> > > > +                     .end = cfmws->base_hpa + cfmws->window_size - 1,
> > > > +             };
> > > > +             cxld->interleave_ways = CFMWS_INTERLEAVE_WAYS(cfmws);
> > > > +             cxld->interleave_granularity =
> > > > +                     CFMWS_INTERLEAVE_GRANULARITY(cfmws);
> > > > +
> > > > +             rc = cxl_decoder_add(dev, cxld, target_map);
> > > > +             if (rc)
> > > > +                     put_device(&cxld->dev);
> > > > +             else
> > > > +                     rc = cxl_decoder_autoremove(dev, cxld);
> > > > +             if (rc) {
> > > >                       dev_err(dev, "Failed to add decoder for %#llx-%#llx\n",
> > > >                               cfmws->base_hpa, cfmws->base_hpa +
> > > >                               cfmws->window_size - 1);
> > > > -             } else {
> > > > -                     dev_dbg(dev, "add: %s range %#llx-%#llx\n",
> > > > -                             dev_name(&cxld->dev), cfmws->base_hpa,
> > > > -                              cfmws->base_hpa + cfmws->window_size - 1);
> > > > +                     goto next;
> > > >               }
> > > > +             dev_dbg(dev, "add: %s range %#llx-%#llx\n",
> > > > +                     dev_name(&cxld->dev), cfmws->base_hpa,
> > > > +                     cfmws->base_hpa + cfmws->window_size - 1);
> > > > +next:
> > > >               cur += c->length;
> > > >       }
> > > >  }
> > > > @@ -268,6 +279,7 @@ static int add_host_bridge_uport(struct device *match, void *arg)
> > > >       struct acpi_device *bridge = to_cxl_host_bridge(host, match);
> > > >       struct acpi_pci_root *pci_root;
> > > >       struct cxl_walk_context ctx;
> > > > +     int single_port_map[1], rc;
> > > >       struct cxl_decoder *cxld;
> > > >       struct cxl_dport *dport;
> > > >       struct cxl_port *port;
> > > > @@ -303,22 +315,46 @@ static int add_host_bridge_uport(struct device *match, void *arg)
> > > >               return -ENODEV;
> > > >       if (ctx.error)
> > > >               return ctx.error;
> > > > +     if (ctx.count > 1)
> > > > +             return 0;
> > > >
> > > >       /* TODO: Scan CHBCR for HDM Decoder resources */
> > > >
> > > >       /*
> > > > -      * In the single-port host-bridge case there are no HDM decoders
> > > > -      * in the CHBCR and a 1:1 passthrough decode is implied.
> > > > +      * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability
> > > > +      * Structure) single ported host-bridges need not publish a decoder
> > > > +      * capability when a passthrough decode can be assumed, i.e. all
> > > > +      * transactions that the uport sees are claimed and passed to the single
> > > > +      * dport. Default the range a 0-base 0-length until the first CXL region
> > > > +      * is activated.
> > > >        */
> > > > -     if (ctx.count == 1) {
> > > > -             cxld = devm_cxl_add_passthrough_decoder(host, port);
> > > > -             if (IS_ERR(cxld))
> > > > -                     return PTR_ERR(cxld);
> > > > +     cxld = cxl_decoder_alloc(port, 1);
> > > > +     if (IS_ERR(cxld))
> > > > +             return PTR_ERR(cxld);
> > > > +
> > > > +     cxld->interleave_ways = 1;
> > > > +     cxld->interleave_granularity = PAGE_SIZE;
> > > > +     cxld->target_type = CXL_DECODER_EXPANDER;
> > > > +     cxld->range = (struct range) {
> > > > +             .start = 0,
> > > > +             .end = -1,
> > > > +     };
> > > >
> > > > -             dev_dbg(host, "add: %s\n", dev_name(&cxld->dev));
> > > > -     }
> > >
> > > This was previously done without taking the device lock... (see below)
> > >
> > > > +     device_lock(&port->dev);
> > > > +     dport = list_first_entry(&port->dports, typeof(*dport), list);
> > > > +     device_unlock(&port->dev);
> > > >
> > > > -     return 0;
> > > > +     single_port_map[0] = dport->port_id;
> > > > +
> > > > +     rc = cxl_decoder_add(host, cxld, single_port_map);
> > > > +     if (rc)
> > > > +             put_device(&cxld->dev);
> > > > +     else
> > > > +             rc = cxl_decoder_autoremove(host, cxld);
> > > > +
> > > > +     if (rc == 0)
> > > > +             dev_dbg(host, "add: %s\n", dev_name(&cxld->dev));
> > > > +     return rc;
> > > >  }
> > > >
> > > >  static int add_host_bridge_dport(struct device *match, void *arg)
> > > > diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> > > > index 176bede30c55..be787685b13e 100644
> > > > --- a/drivers/cxl/core/bus.c
> > > > +++ b/drivers/cxl/core/bus.c
> > > > @@ -455,16 +455,18 @@ EXPORT_SYMBOL_GPL(cxl_add_dport);
> > > >
> > > >  static int decoder_populate_targets(struct device *host,
> > > >                                   struct cxl_decoder *cxld,
> > > > -                                 struct cxl_port *port, int *target_map,
> > > > -                                 int nr_targets)
> > > > +                                 struct cxl_port *port, int *target_map)
> > > >  {
> > > >       int rc = 0, i;
> > > >
> > > > +     if (list_empty(&port->dports))
> > > > +             return -EINVAL;
> > > > +
> > >
> > > The code before this patch did this under the device_lock() Perhaps call
> > > out the fact there is no need to do that if we don't need to?
> >
> > Nope, bug, good catch.
> >
>
> While you're changing this might I request you make this change so I can drop my
> patch:
>
> commit fe46c7a3e30129c649e17a4c555699e816cf04e7
> Author: Ben Widawsky <ben.widawsky@intel.com>
> Date:   16 hours ago
>
>     core/bus: Don't error for targetless decoders
>
>     Decoders may not have any targets, endpoints are the best example of
>     this. In order to prevent having to special case, it's easiest to just
>     not return an error when no target map is specified as those endpoints
>     also won't have dports.
>
>     Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
>
> diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> index c13b6c4d4135..b98b3e343a3d 100644
> --- a/drivers/cxl/core/bus.c
> +++ b/drivers/cxl/core/bus.c
> @@ -547,12 +547,12 @@ static int decoder_populate_targets(struct device *host,
>  {
>         int rc = 0, i;
>
> -       if (list_empty(&port->dports))
> -               return -EINVAL;
> -
>         if (!target_map)
>                 return 0;
>
> +       if (list_empty(&port->dports))
> +               return -EINVAL;
> +
>         device_lock(&port->dev);
>         for (i = 0; i < cxld->nr_targets; i++) {
>                 struct cxl_dport *dport = find_dport(port, target_map[i]);
>

Yes, I will fold that fix in when I address the locking around the
list_empty check.
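
For the record, the folded result would plausibly take the following
shape (a sketch only, assuming the final revision keeps the !target_map
short-circuit from Ben's patch above and moves the list_empty() check
under the device_lock()):

	static int decoder_populate_targets(struct device *host,
					    struct cxl_decoder *cxld,
					    struct cxl_port *port, int *target_map)
	{
		int rc = 0, i;

		/* targetless decoders, e.g. endpoints, are not an error */
		if (!target_map)
			return 0;

		device_lock(&port->dev);
		if (list_empty(&port->dports)) {
			rc = -EINVAL;
			goto out_unlock;
		}

		for (i = 0; i < cxld->nr_targets; i++) {
			struct cxl_dport *dport = find_dport(port, target_map[i]);

			if (!dport) {
				rc = -ENXIO;
				goto out_unlock;
			}
			cxld->target[i] = dport;
		}

	out_unlock:
		device_unlock(&port->dev);
		return rc;
	}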

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [PATCH v5 08/21] cxl/pci: Clean up cxl_mem_get_partition_info()
  2021-09-09  5:12 ` [PATCH v4 08/21] cxl/pci: Clean up cxl_mem_get_partition_info() Dan Williams
  2021-09-09 16:20   ` Ben Widawsky
@ 2021-09-13 22:19   ` Dan Williams
  2021-09-13 22:21     ` Dan Williams
  2021-09-13 22:24   ` [PATCH v6 " Dan Williams
  2 siblings, 1 reply; 78+ messages in thread
From: Dan Williams @ 2021-09-13 22:19 UTC (permalink / raw)
  To: linux-cxl; +Cc: Ira Weiny, Ben Widawsky, Jonathan Cameron, nvdimm

Commit 0b9159d0ff21 ("cxl/pci: Store memory capacity values") missed
updating the kernel-doc for 'struct cxl_mem' leading to the following
warnings:

./scripts/kernel-doc -v drivers/cxl/cxlmem.h 2>&1 | grep warn
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'total_bytes' not described in 'cxl_mem'
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'volatile_only_bytes' not described in 'cxl_mem'
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'persistent_only_bytes' not described in 'cxl_mem'
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'partition_align_bytes' not described in 'cxl_mem'
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'active_volatile_bytes' not described in 'cxl_mem'
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'active_persistent_bytes' not described in 'cxl_mem'
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'next_volatile_bytes' not described in 'cxl_mem'
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'next_persistent_bytes' not described in 'cxl_mem'

Also, it is redundant to describe those same parameters in the
kernel-doc for cxl_mem_get_partition_info(). Given the only user of that
routine updates the values in @cxlm, just do that implicitly internal to
the helper.
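
As an aside for readers without the spec handy: the mailbox expresses
these capacity fields in units of 256MB, which is what the
CXL_CAPACITY_MULTIPLIER scaling in the diff below accounts for. A
sketch of the assumed definition:

	/* CXL 2.0 8.2.9.5.2: capacity fields are in 256MB units */
	#define CXL_CAPACITY_MULTIPLIER SZ_256M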

Cc: Ira Weiny <ira.weiny@intel.com>
Reported-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huwei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
Changes since v4:
- Update the kdoc for @partition_align_bytes (Ben)
- Collect Jonathan's reviewed-by pending above update.

 drivers/cxl/cxlmem.h |   15 +++++++++++++--
 drivers/cxl/pci.c    |   35 +++++++++++------------------------
 2 files changed, 24 insertions(+), 26 deletions(-)

diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index d5334df83fb2..c6fce966084a 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -78,8 +78,19 @@ devm_cxl_add_memdev(struct cxl_mem *cxlm,
  * @mbox_mutex: Mutex to synchronize mailbox access.
  * @firmware_version: Firmware version for the memory device.
  * @enabled_cmds: Hardware commands found enabled in CEL.
- * @pmem_range: Persistent memory capacity information.
- * @ram_range: Volatile memory capacity information.
+ * @pmem_range: Active Persistent memory capacity configuration
+ * @ram_range: Active Volatile memory capacity configuration
+ * @total_bytes: sum of all possible capacities
+ * @volatile_only_bytes: hard volatile capacity
+ * @persistent_only_bytes: hard persistent capacity
+ * @partition_align_bytes: soft setting for configurable capacity
+ * @active_volatile_bytes: sum of hard + soft volatile
+ * @active_persistent_bytes: sum of hard + soft persistent
+ * @next_volatile_bytes: volatile capacity change pending device reset
+ * @next_persistent_bytes: persistent capacity change pending device reset
+ *
+ * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
+ * details on capacity parameters.
  */
 struct cxl_mem {
 	struct device *dev;
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index c1e1d12e24b6..8077d907e7d3 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -1262,11 +1262,7 @@ static struct cxl_mbox_get_supported_logs *cxl_get_gsl(struct cxl_mem *cxlm)
 
 /**
  * cxl_mem_get_partition_info - Get partition info
- * @cxlm: The device to act on
- * @active_volatile_bytes: returned active volatile capacity
- * @active_persistent_bytes: returned active persistent capacity
- * @next_volatile_bytes: return next volatile capacity
- * @next_persistent_bytes: return next persistent capacity
+ * @cxlm: cxl_mem instance to update partition info
  *
  * Retrieve the current partition info for the device specified.  If not 0, the
  * 'next' values are pending and take effect on next cold reset.
@@ -1275,11 +1271,7 @@ static struct cxl_mbox_get_supported_logs *cxl_get_gsl(struct cxl_mem *cxlm)
  *
  * See CXL @8.2.9.5.2.1 Get Partition Info
  */
-static int cxl_mem_get_partition_info(struct cxl_mem *cxlm,
-				      u64 *active_volatile_bytes,
-				      u64 *active_persistent_bytes,
-				      u64 *next_volatile_bytes,
-				      u64 *next_persistent_bytes)
+static int cxl_mem_get_partition_info(struct cxl_mem *cxlm)
 {
 	struct cxl_mbox_get_partition_info {
 		__le64 active_volatile_cap;
@@ -1294,15 +1286,14 @@ static int cxl_mem_get_partition_info(struct cxl_mem *cxlm,
 	if (rc)
 		return rc;
 
-	*active_volatile_bytes = le64_to_cpu(pi.active_volatile_cap);
-	*active_persistent_bytes = le64_to_cpu(pi.active_persistent_cap);
-	*next_volatile_bytes = le64_to_cpu(pi.next_volatile_cap);
-	*next_persistent_bytes = le64_to_cpu(pi.next_volatile_cap);
-
-	*active_volatile_bytes *= CXL_CAPACITY_MULTIPLIER;
-	*active_persistent_bytes *= CXL_CAPACITY_MULTIPLIER;
-	*next_volatile_bytes *= CXL_CAPACITY_MULTIPLIER;
-	*next_persistent_bytes *= CXL_CAPACITY_MULTIPLIER;
+	cxlm->active_volatile_bytes =
+		le64_to_cpu(pi.active_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
+	cxlm->active_persistent_bytes =
+		le64_to_cpu(pi.active_persistent_cap) * CXL_CAPACITY_MULTIPLIER;
+	cxlm->next_volatile_bytes =
+		le64_to_cpu(pi.next_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
+	cxlm->next_persistent_bytes =
+		le64_to_cpu(pi.next_persistent_cap) * CXL_CAPACITY_MULTIPLIER;
 
 	return 0;
 }
@@ -1443,11 +1434,7 @@ static int cxl_mem_create_range_info(struct cxl_mem *cxlm)
 		return 0;
 	}
 
-	rc = cxl_mem_get_partition_info(cxlm,
-					&cxlm->active_volatile_bytes,
-					&cxlm->active_persistent_bytes,
-					&cxlm->next_volatile_bytes,
-					&cxlm->next_persistent_bytes);
+	rc = cxl_mem_get_partition_info(cxlm);
 	if (rc < 0) {
 		dev_err(cxlm->dev, "Failed to query partition information\n");
 		return rc;


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* Re: [PATCH v5 08/21] cxl/pci: Clean up cxl_mem_get_partition_info()
  2021-09-13 22:19   ` [PATCH v5 " Dan Williams
@ 2021-09-13 22:21     ` Dan Williams
  0 siblings, 0 replies; 78+ messages in thread
From: Dan Williams @ 2021-09-13 22:21 UTC (permalink / raw)
  To: linux-cxl; +Cc: Ira Weiny, Ben Widawsky, Jonathan Cameron, Linux NVDIMM

On Mon, Sep 13, 2021 at 3:20 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> Commit 0b9159d0ff21 ("cxl/pci: Store memory capacity values") missed
> updating the kernel-doc for 'struct cxl_mem' leading to the following
> warnings:
>
> ./scripts/kernel-doc -v drivers/cxl/cxlmem.h 2>&1 | grep warn
> drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'total_bytes' not described in 'cxl_mem'
> drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'volatile_only_bytes' not described in 'cxl_mem'
> drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'persistent_only_bytes' not described in 'cxl_mem'
> drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'partition_align_bytes' not described in 'cxl_mem'
> drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'active_volatile_bytes' not described in 'cxl_mem'
> drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'active_persistent_bytes' not described in 'cxl_mem'
> drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'next_volatile_bytes' not described in 'cxl_mem'
> drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'next_persistent_bytes' not described in 'cxl_mem'
>
> Also, it is redundant to describe those same parameters in the
> kernel-doc for cxl_mem_get_partition_info(). Given the only user of that
> routine updates the values in @cxlm, just do that implicitly internal to
> the helper.
>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Reported-by: Ben Widawsky <ben.widawsky@intel.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huwei.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
> Changes since v4:
> - Update the kdoc for @partition_align_bytes (Ben)
> - Collect Jonathan's reviewed-by pending above update.

Ugh, nope, still had this change in my working tree. v6 inbound.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [PATCH v6 08/21] cxl/pci: Clean up cxl_mem_get_partition_info()
  2021-09-09  5:12 ` [PATCH v4 08/21] cxl/pci: Clean up cxl_mem_get_partition_info() Dan Williams
  2021-09-09 16:20   ` Ben Widawsky
  2021-09-13 22:19   ` [PATCH v5 " Dan Williams
@ 2021-09-13 22:24   ` Dan Williams
  2 siblings, 0 replies; 78+ messages in thread
From: Dan Williams @ 2021-09-13 22:24 UTC (permalink / raw)
  To: linux-cxl; +Cc: Ira Weiny, Ben Widawsky, Jonathan Cameron, nvdimm

Commit 0b9159d0ff21 ("cxl/pci: Store memory capacity values") missed
updating the kernel-doc for 'struct cxl_mem' leading to the following
warnings:

./scripts/kernel-doc -v drivers/cxl/cxlmem.h 2>&1 | grep warn
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'total_bytes' not described in 'cxl_mem'
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'volatile_only_bytes' not described in 'cxl_mem'
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'persistent_only_bytes' not described in 'cxl_mem'
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'partition_align_bytes' not described in 'cxl_mem'
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'active_volatile_bytes' not described in 'cxl_mem'
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'active_persistent_bytes' not described in 'cxl_mem'
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'next_volatile_bytes' not described in 'cxl_mem'
drivers/cxl/cxlmem.h:107: warning: Function parameter or member 'next_persistent_bytes' not described in 'cxl_mem'

Also, it is redundant to describe those same parameters in the
kernel-doc for cxl_mem_get_partition_info(). Given the only user of that
routine updates the values in @cxlm, just do that implicitly internal to
the helper.

Cc: Ira Weiny <ira.weiny@intel.com>
Reported-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
Changes since v5:
- Update the kdoc for @partition_align_bytes (Ben) (for real this time)
- Fix Jonathan's typo in his reviewed-by tag.

 drivers/cxl/cxlmem.h |   15 +++++++++++++--
 drivers/cxl/pci.c    |   35 +++++++++++------------------------
 2 files changed, 24 insertions(+), 26 deletions(-)

diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index d5334df83fb2..e14bcd7a1ba1 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -78,8 +78,19 @@ devm_cxl_add_memdev(struct cxl_mem *cxlm,
  * @mbox_mutex: Mutex to synchronize mailbox access.
  * @firmware_version: Firmware version for the memory device.
  * @enabled_cmds: Hardware commands found enabled in CEL.
- * @pmem_range: Persistent memory capacity information.
- * @ram_range: Volatile memory capacity information.
+ * @pmem_range: Active Persistent memory capacity configuration
+ * @ram_range: Active Volatile memory capacity configuration
+ * @total_bytes: sum of all possible capacities
+ * @volatile_only_bytes: hard volatile capacity
+ * @persistent_only_bytes: hard persistent capacity
+ * @partition_align_bytes: alignment size for partition-able capacity
+ * @active_volatile_bytes: sum of hard + soft volatile
+ * @active_persistent_bytes: sum of hard + soft persistent
+ * @next_volatile_bytes: volatile capacity change pending device reset
+ * @next_persistent_bytes: persistent capacity change pending device reset
+ *
+ * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
+ * details on capacity parameters.
  */
 struct cxl_mem {
 	struct device *dev;
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index c1e1d12e24b6..8077d907e7d3 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -1262,11 +1262,7 @@ static struct cxl_mbox_get_supported_logs *cxl_get_gsl(struct cxl_mem *cxlm)
 
 /**
  * cxl_mem_get_partition_info - Get partition info
- * @cxlm: The device to act on
- * @active_volatile_bytes: returned active volatile capacity
- * @active_persistent_bytes: returned active persistent capacity
- * @next_volatile_bytes: return next volatile capacity
- * @next_persistent_bytes: return next persistent capacity
+ * @cxlm: cxl_mem instance to update partition info
  *
  * Retrieve the current partition info for the device specified.  If not 0, the
  * 'next' values are pending and take effect on next cold reset.
@@ -1275,11 +1271,7 @@ static struct cxl_mbox_get_supported_logs *cxl_get_gsl(struct cxl_mem *cxlm)
  *
  * See CXL @8.2.9.5.2.1 Get Partition Info
  */
-static int cxl_mem_get_partition_info(struct cxl_mem *cxlm,
-				      u64 *active_volatile_bytes,
-				      u64 *active_persistent_bytes,
-				      u64 *next_volatile_bytes,
-				      u64 *next_persistent_bytes)
+static int cxl_mem_get_partition_info(struct cxl_mem *cxlm)
 {
 	struct cxl_mbox_get_partition_info {
 		__le64 active_volatile_cap;
@@ -1294,15 +1286,14 @@ static int cxl_mem_get_partition_info(struct cxl_mem *cxlm,
 	if (rc)
 		return rc;
 
-	*active_volatile_bytes = le64_to_cpu(pi.active_volatile_cap);
-	*active_persistent_bytes = le64_to_cpu(pi.active_persistent_cap);
-	*next_volatile_bytes = le64_to_cpu(pi.next_volatile_cap);
-	*next_persistent_bytes = le64_to_cpu(pi.next_volatile_cap);
-
-	*active_volatile_bytes *= CXL_CAPACITY_MULTIPLIER;
-	*active_persistent_bytes *= CXL_CAPACITY_MULTIPLIER;
-	*next_volatile_bytes *= CXL_CAPACITY_MULTIPLIER;
-	*next_persistent_bytes *= CXL_CAPACITY_MULTIPLIER;
+	cxlm->active_volatile_bytes =
+		le64_to_cpu(pi.active_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
+	cxlm->active_persistent_bytes =
+		le64_to_cpu(pi.active_persistent_cap) * CXL_CAPACITY_MULTIPLIER;
+	cxlm->next_volatile_bytes =
+		le64_to_cpu(pi.next_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
+	cxlm->next_persistent_bytes =
+		le64_to_cpu(pi.next_persistent_cap) * CXL_CAPACITY_MULTIPLIER;
 
 	return 0;
 }
@@ -1443,11 +1434,7 @@ static int cxl_mem_create_range_info(struct cxl_mem *cxlm)
 		return 0;
 	}
 
-	rc = cxl_mem_get_partition_info(cxlm,
-					&cxlm->active_volatile_bytes,
-					&cxlm->active_persistent_bytes,
-					&cxlm->next_volatile_bytes,
-					&cxlm->next_persistent_bytes);
+	rc = cxl_mem_get_partition_info(cxlm);
 	if (rc < 0) {
 		dev_err(cxlm->dev, "Failed to query partition information\n");
 		return rc;


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* Re: [PATCH v4 14/21] cxl/mbox: Add exclusive kernel command support
  2021-09-10  9:33   ` Jonathan Cameron
@ 2021-09-13 23:46     ` Dan Williams
  2021-09-14  9:01       ` Jonathan Cameron
  2021-09-14 12:22       ` Konstantin Ryabitsev
  0 siblings, 2 replies; 78+ messages in thread
From: Dan Williams @ 2021-09-13 23:46 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, Ben Widawsky, Vishal L Verma, Linux NVDIMM, Schofield,
	Alison, Weiny, Ira

On Fri, Sep 10, 2021 at 2:34 AM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Wed, 8 Sep 2021 22:12:49 -0700
> Dan Williams <dan.j.williams@intel.com> wrote:
>
> > The CXL_PMEM driver expects exclusive control of the label storage area
> > space. Similar to the LIBNVDIMM expectation that the label storage area
> > is only writable from userspace when the corresponding memory device is
> > not active in any region, the expectation is the native CXL_PCI UAPI
> > path is disabled while the cxl_nvdimm for a given cxl_memdev device is
> > active in LIBNVDIMM.
> >
> > Add the ability to toggle the availability of a given command for the
> > UAPI path. Use that new capability to shutdown changes to partitions and
> > the label storage area while the cxl_nvdimm device is actively proxying
> > commands for LIBNVDIMM.
> >
> > Acked-by: Ben Widawsky <ben.widawsky@intel.com>
> > Link: https://lore.kernel.org/r/162982123298.1124374.22718002900700392.stgit@dwillia2-desk3.amr.corp.intel.com
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>
> In the ideal world I'd like to have seen this as a noop patch going from devm
> to non devm for cleanup followed by new stuff.  meh, the world isn't ideal
> and all that sort of nice stuff takes time!

It would also require a series resend since I can't use the in-place
update in a way that b4 will recognize.

> Whilst I'm not that keen on the exact form of the code in probe() it will
> be easier to read when not a diff so if you prefer to keep it as you have
> it I won't object - it just took a little more careful reading than I'd like.

I circled back to devm after taking out the cleverness as you noted,
and that makes the patch more readable.

>
> Thanks,
>
> Jonathan
>
>
> > ---
> >  drivers/cxl/core/mbox.c   |    5 +++++
> >  drivers/cxl/core/memdev.c |   31 +++++++++++++++++++++++++++++++
> >  drivers/cxl/cxlmem.h      |    4 ++++
> >  drivers/cxl/pmem.c        |   43 ++++++++++++++++++++++++++++++++-----------
> >  4 files changed, 72 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> > index 422999740649..82e79da195fa 100644
> > --- a/drivers/cxl/core/mbox.c
> > +++ b/drivers/cxl/core/mbox.c
> > @@ -221,6 +221,7 @@ static bool cxl_mem_raw_command_allowed(u16 opcode)
> >   *  * %-EINVAL       - Reserved fields or invalid values were used.
> >   *  * %-ENOMEM       - Input or output buffer wasn't sized properly.
> >   *  * %-EPERM        - Attempted to use a protected command.
> > + *  * %-EBUSY        - Kernel has claimed exclusive access to this opcode
> >   *
> >   * The result of this command is a fully validated command in @out_cmd that is
> >   * safe to send to the hardware.
> > @@ -296,6 +297,10 @@ static int cxl_validate_cmd_from_user(struct cxl_mem *cxlm,
> >       if (!test_bit(info->id, cxlm->enabled_cmds))
> >               return -ENOTTY;
> >
> > +     /* Check that the command is not claimed for exclusive kernel use */
> > +     if (test_bit(info->id, cxlm->exclusive_cmds))
> > +             return -EBUSY;
> > +
> >       /* Check the input buffer is the expected size */
> >       if (info->size_in >= 0 && info->size_in != send_cmd->in.size)
> >               return -ENOMEM;
> > diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> > index df2ba87238c2..d9ade5b92330 100644
> > --- a/drivers/cxl/core/memdev.c
> > +++ b/drivers/cxl/core/memdev.c
> > @@ -134,6 +134,37 @@ static const struct device_type cxl_memdev_type = {
> >       .groups = cxl_memdev_attribute_groups,
> >  };
> >
> > +/**
> > + * set_exclusive_cxl_commands() - atomically disable user cxl commands
> > + * @cxlm: cxl_mem instance to modify
> > + * @cmds: bitmap of commands to mark exclusive
> > + *
> > + * Flush the ioctl path and disable future execution of commands with
> > + * the command ids set in @cmds.
>
> It's not obvious this function is doing that 'flush', Perhaps consider rewording?

Changed it to:

"Grab the cxl_memdev_rwsem in write mode to flush in-flight
invocations of the ioctl path and then disable future execution of
commands with the command ids set in @cmds."
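
For context, the "flush" works because the ioctl path takes the same
rwsem in read mode, so a writer naturally waits out all in-flight
invocations. A sketch, assuming the entry point keeps roughly its
upstream shape:

	static long cxl_memdev_ioctl(struct file *file, unsigned int cmd,
				     unsigned long arg)
	{
		struct cxl_memdev *cxlmd = file->private_data;
		int rc = -ENXIO;

		down_read(&cxl_memdev_rwsem);
		if (cxlmd->cxlm)
			rc = __cxl_memdev_ioctl(cxlmd, cmd, arg);
		up_read(&cxl_memdev_rwsem);

		return rc;
	}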

>
> > + */
> > +void set_exclusive_cxl_commands(struct cxl_mem *cxlm, unsigned long *cmds)
> > +{
> > +     down_write(&cxl_memdev_rwsem);
> > +     bitmap_or(cxlm->exclusive_cmds, cxlm->exclusive_cmds, cmds,
> > +               CXL_MEM_COMMAND_ID_MAX);
> > +     up_write(&cxl_memdev_rwsem);
> > +}
> > +EXPORT_SYMBOL_GPL(set_exclusive_cxl_commands);
> > +
> > +/**
> > + * clear_exclusive_cxl_commands() - atomically enable user cxl commands
> > + * @cxlm: cxl_mem instance to modify
> > + * @cmds: bitmap of commands to mark available for userspace
> > + */
> > +void clear_exclusive_cxl_commands(struct cxl_mem *cxlm, unsigned long *cmds)
> > +{
> > +     down_write(&cxl_memdev_rwsem);
> > +     bitmap_andnot(cxlm->exclusive_cmds, cxlm->exclusive_cmds, cmds,
> > +                   CXL_MEM_COMMAND_ID_MAX);
> > +     up_write(&cxl_memdev_rwsem);
> > +}
> > +EXPORT_SYMBOL_GPL(clear_exclusive_cxl_commands);
> > +
> >  static void cxl_memdev_shutdown(struct device *dev)
> >  {
> >       struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > index 16201b7d82d2..468b7b8be207 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> > @@ -101,6 +101,7 @@ struct cxl_mbox_cmd {
> >   * @mbox_mutex: Mutex to synchronize mailbox access.
> >   * @firmware_version: Firmware version for the memory device.
> >   * @enabled_cmds: Hardware commands found enabled in CEL.
> > + * @exclusive_cmds: Commands that are kernel-internal only
> >   * @pmem_range: Active Persistent memory capacity configuration
> >   * @ram_range: Active Volatile memory capacity configuration
> >   * @total_bytes: sum of all possible capacities
> > @@ -127,6 +128,7 @@ struct cxl_mem {
> >       struct mutex mbox_mutex; /* Protects device mailbox and firmware */
> >       char firmware_version[0x10];
> >       DECLARE_BITMAP(enabled_cmds, CXL_MEM_COMMAND_ID_MAX);
> > +     DECLARE_BITMAP(exclusive_cmds, CXL_MEM_COMMAND_ID_MAX);
> >
> >       struct range pmem_range;
> >       struct range ram_range;
> > @@ -200,4 +202,6 @@ int cxl_mem_identify(struct cxl_mem *cxlm);
> >  int cxl_mem_enumerate_cmds(struct cxl_mem *cxlm);
> >  int cxl_mem_create_range_info(struct cxl_mem *cxlm);
> >  struct cxl_mem *cxl_mem_create(struct device *dev);
> > +void set_exclusive_cxl_commands(struct cxl_mem *cxlm, unsigned long *cmds);
> > +void clear_exclusive_cxl_commands(struct cxl_mem *cxlm, unsigned long *cmds);
> >  #endif /* __CXL_MEM_H__ */
> > diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
> > index 9652c3ee41e7..a972af7a6e0b 100644
> > --- a/drivers/cxl/pmem.c
> > +++ b/drivers/cxl/pmem.c
> > @@ -16,10 +16,7 @@
> >   */
> >  static struct workqueue_struct *cxl_pmem_wq;
> >
> > -static void unregister_nvdimm(void *nvdimm)
> > -{
> > -     nvdimm_delete(nvdimm);
> > -}
> > +static __read_mostly DECLARE_BITMAP(exclusive_cmds, CXL_MEM_COMMAND_ID_MAX);
> >
> >  static int match_nvdimm_bridge(struct device *dev, const void *data)
> >  {
> > @@ -36,12 +33,25 @@ static struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(void)
> >       return to_cxl_nvdimm_bridge(dev);
> >  }
> >
> > +static void cxl_nvdimm_remove(struct device *dev)
> > +{
> > +     struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev);
> > +     struct nvdimm *nvdimm = dev_get_drvdata(dev);
> > +     struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
> > +     struct cxl_mem *cxlm = cxlmd->cxlm;
>
> Given cxlmd isn't used, perhaps combine the two lines above?

...gone with the return of devm.

>
> > +
> > +     nvdimm_delete(nvdimm);
> > +     clear_exclusive_cxl_commands(cxlm, exclusive_cmds);
> > +}
> > +
> >  static int cxl_nvdimm_probe(struct device *dev)
> >  {
> >       struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev);
> > +     struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
> > +     struct cxl_mem *cxlm = cxlmd->cxlm;
>
> Again, cxlmd not used so could save a line of code
> without losing anything (unless it gets used in a later patch of
> course!)

It is used... to grab cxlm, but it's an arbitrary style preference to
avoid de-reference chains longer than one. However, since I'm only
doing it once now perhaps you'll grant me this indulgence?

>
> >       struct cxl_nvdimm_bridge *cxl_nvb;
> > +     struct nvdimm *nvdimm = NULL;
> >       unsigned long flags = 0;
> > -     struct nvdimm *nvdimm;
> >       int rc = -ENXIO;
> >
> >       cxl_nvb = cxl_find_nvdimm_bridge();
> > @@ -50,25 +60,32 @@ static int cxl_nvdimm_probe(struct device *dev)
> >
> >       device_lock(&cxl_nvb->dev);
> >       if (!cxl_nvb->nvdimm_bus)
> > -             goto out;
> > +             goto out_unlock;
> > +
> > +     set_exclusive_cxl_commands(cxlm, exclusive_cmds);
> >
> >       set_bit(NDD_LABELING, &flags);
> > +     rc = -ENOMEM;
>
> Hmm. Setting rc to an error value even in the good path is a bit
> unusual.  I'd just add the few lines to set rc = -ENXIO only in the error
> path above and
> rc = -ENOMEM here only if nvdimm_create fails.
>
> What you have strikes me as a bit too clever :)

Agree, and devm slots in nicely again with that removed.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v4 14/21] cxl/mbox: Add exclusive kernel command support
  2021-09-13 23:46     ` Dan Williams
@ 2021-09-14  9:01       ` Jonathan Cameron
  2021-09-14 12:22       ` Konstantin Ryabitsev
  1 sibling, 0 replies; 78+ messages in thread
From: Jonathan Cameron @ 2021-09-14  9:01 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Ben Widawsky, Vishal L Verma, Linux NVDIMM, Schofield,
	Alison, Weiny, Ira

> > > diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> > > index df2ba87238c2..d9ade5b92330 100644
> > > --- a/drivers/cxl/core/memdev.c
> > > +++ b/drivers/cxl/core/memdev.c
> > > @@ -134,6 +134,37 @@ static const struct device_type cxl_memdev_type = {
> > >       .groups = cxl_memdev_attribute_groups,
> > >  };
> > >
> > > +/**
> > > + * set_exclusive_cxl_commands() - atomically disable user cxl commands
> > > + * @cxlm: cxl_mem instance to modify
> > > + * @cmds: bitmap of commands to mark exclusive
> > > + *
> > > + * Flush the ioctl path and disable future execution of commands with
> > > + * the command ids set in @cmds.  
> >
> > It's not obvious this function is doing that 'flush', Perhaps consider rewording?  
> 
> Changed it to:
> 
> "Grab the cxl_memdev_rwsem in write mode to flush in-flight
> invocations of the ioctl path and then disable future execution of
> commands with the command ids set in @cmds."

Great

> 
> >  
> > > + */
> > > +void set_exclusive_cxl_commands(struct cxl_mem *cxlm, unsigned long *cmds)
> > > +{
> > > +     down_write(&cxl_memdev_rwsem);
> > > +     bitmap_or(cxlm->exclusive_cmds, cxlm->exclusive_cmds, cmds,
> > > +               CXL_MEM_COMMAND_ID_MAX);
> > > +     up_write(&cxl_memdev_rwsem);
> > > +}
> > > +EXPORT_SYMBOL_GPL(set_exclusive_cxl_commands);

...

> > > diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
> > > index 9652c3ee41e7..a972af7a6e0b 100644
> > > --- a/drivers/cxl/pmem.c
> > > +++ b/drivers/cxl/pmem.c
> > > @@ -16,10 +16,7 @@
> > >   */
> > >  static struct workqueue_struct *cxl_pmem_wq;
> > >

...

> >  
> > > +
> > > +     nvdimm_delete(nvdimm);
> > > +     clear_exclusive_cxl_commands(cxlm, exclusive_cmds);
> > > +}
> > > +
> > >  static int cxl_nvdimm_probe(struct device *dev)
> > >  {
> > >       struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev);
> > > +     struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
> > > +     struct cxl_mem *cxlm = cxlmd->cxlm;  
> >
> > Again, cxlmd not used so could save a line of code
> > without losing anything (unless it gets used in a later patch of
> > course!)  
> 
> It is used... to grab cxlm, but it's an arbitrary style preference to
> avoid de-reference chains longer than one. However, since I'm only
> doing it once now perhaps you'll grant me this indulgence?
> 

This one was a 'could'.  Entirely up to you whether you do :)

Jonathan



^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v4 14/21] cxl/mbox: Add exclusive kernel command support
  2021-09-13 23:46     ` Dan Williams
  2021-09-14  9:01       ` Jonathan Cameron
@ 2021-09-14 12:22       ` Konstantin Ryabitsev
  2021-09-14 14:39         ` Dan Williams
  1 sibling, 1 reply; 78+ messages in thread
From: Konstantin Ryabitsev @ 2021-09-14 12:22 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jonathan Cameron, linux-cxl, Ben Widawsky, Vishal L Verma,
	Linux NVDIMM, Schofield, Alison, Weiny, Ira

On Mon, Sep 13, 2021 at 04:46:47PM -0700, Dan Williams wrote:
> > In the ideal world I'd like to have seen this as a noop patch going from devm
> > to non devm for cleanup followed by new stuff.  meh, the world isn't ideal
> > and all that sort of nice stuff takes time!
> 
> It would also require a series resend since I can't use the in-place
> update in a way that b4 will recognize.

BTW, b4 0.7+ can do partial series rerolls. You can just send a single
follow-up patch without needing to reroll the whole series, e.g.:

[PATCH 1/3]
[PATCH 2/3]
\- [PATCH v2 2/3]
[PATCH 3/3]

This is enough for b4 to make a v2 series where only 2/3 is replaced.

-K

(Yes, I am monitoring all mentions of "b4" on lore.kernel.org/all in a totally
non-creepy way, I swear.)

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v4 14/21] cxl/mbox: Add exclusive kernel command support
  2021-09-14 12:22       ` Konstantin Ryabitsev
@ 2021-09-14 14:39         ` Dan Williams
  2021-09-14 15:51           ` Konstantin Ryabitsev
  0 siblings, 1 reply; 78+ messages in thread
From: Dan Williams @ 2021-09-14 14:39 UTC (permalink / raw)
  To: Konstantin Ryabitsev
  Cc: Jonathan Cameron, linux-cxl, Ben Widawsky, Vishal L Verma,
	Linux NVDIMM, Schofield, Alison, Weiny, Ira

On Tue, Sep 14, 2021 at 5:22 AM Konstantin Ryabitsev
<konstantin@linuxfoundation.org> wrote:
>
> On Mon, Sep 13, 2021 at 04:46:47PM -0700, Dan Williams wrote:
> > > In the ideal world I'd like to have seen this as a noop patch going from devm
> > > to non devm for cleanup followed by new stuff.  meh, the world isn't ideal
> > > and all that sort of nice stuff takes time!
> >
> > It would also require a series resend since I can't use the in-place
> > update in a way that b4 will recognize.
>
> BTW, b4 0.7+ can do partial series rerolls. You can just send a single
> follow-up patch without needing to reroll the whole series, e.g.:
>
> [PATCH 1/3]
> [PATCH 2/3]
> \- [PATCH v2 2/3]
> [PATCH 3/3]
>
> This is enough for b4 to make a v2 series where only 2/3 is replaced.

Oh, yes, I use that liberally, istr asking for it originally. What I
was referring to here was feedback that alluded to injecting another
patch into the series, ala:

[PATCH 1/3]
[PATCH 2/3]
\- [PATCH v2 2/4]
 \- [PATCH v2 3/4]
[PATCH 3/3]   <-- this one would be 4/4

I don't expect b4 to handle that case, and would expect to re-roll the
series with the new numbering.

>
> -K
>
> (Yes, I am monitoring all mentions of "b4" on lore.kernel.org/all in a totally
> non-creepy way, I swear.)

I still need to do that for my sub-systems.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v4 14/21] cxl/mbox: Add exclusive kernel command support
  2021-09-14 14:39         ` Dan Williams
@ 2021-09-14 15:51           ` Konstantin Ryabitsev
  0 siblings, 0 replies; 78+ messages in thread
From: Konstantin Ryabitsev @ 2021-09-14 15:51 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jonathan Cameron, linux-cxl, Ben Widawsky, Vishal L Verma,
	Linux NVDIMM, Schofield, Alison, Weiny, Ira

On Tue, Sep 14, 2021 at 07:39:47AM -0700, Dan Williams wrote:
> > > It would also require a series resend since I can't use the in-place
> > > update in a way that b4 will recognize.
> >
> > BTW, b4 0.7+ can do partial series rerolls. You can just send a single
> > follow-up patch without needing to reroll the whole series, e.g.:
> >
> > [PATCH 1/3]
> > [PATCH 2/3]
> > \- [PATCH v2 2/3]
> > [PATCH 3/3]
> >
> > This is enough for b4 to make a v2 series where only 2/3 is replaced.
> 
> Oh, yes, I use that liberally, istr asking for it originally. What I
> was referring to here was feedback that alluded to injecting another
> patch into the series, ala:
> 
> [PATCH 1/3]
> [PATCH 2/3]
> \- [PATCH v2 2/4]
>  \- [PATCH v2 3/4]
> [PATCH 3/3]   <-- this one would be 4/4
> 
> I don't expect b4 to handle that case, and would expect to re-roll the
> series with the new numbering.

Oooh, yeah, you're right. One option is to download the mbox file and manually
edit the patch subject to be [PATCH v2 4/4].

> > (Yes, I am monitoring all mentions of "b4" on lore.kernel.org/all in a totally
> > non-creepy way, I swear.)
> 
> I still need to do that for my sub-systems.

I'll provide ample docs by the time plumbers rolls around next week.

-K

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [PATCH v5 14/21] cxl/mbox: Add exclusive kernel command support
  2021-09-09  5:12 ` [PATCH v4 14/21] cxl/mbox: Add exclusive kernel command support Dan Williams
  2021-09-09 17:02   ` Ben Widawsky
  2021-09-10  9:33   ` Jonathan Cameron
@ 2021-09-14 19:03   ` Dan Williams
  2 siblings, 0 replies; 78+ messages in thread
From: Dan Williams @ 2021-09-14 19:03 UTC (permalink / raw)
  To: linux-cxl; +Cc: Ben Widawsky, nvdimm, Jonathan.Cameron

The CXL_PMEM driver expects exclusive control of the label storage area
space. Similar to the LIBNVDIMM expectation that the label storage area
is only writable from userspace when the corresponding memory device is
not active in any region, the expectation is the native CXL_PCI UAPI
path is disabled while the cxl_nvdimm for a given cxl_memdev device is
active in LIBNVDIMM.

Add the ability to toggle the availability of a given command for the
UAPI path. Use that new capability to shut down changes to partitions and
the label storage area while the cxl_nvdimm device is actively proxying
commands for LIBNVDIMM.

Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
Changes since v4:
- improve the kdoc for set_exclusive_cxl_commands() (Jonathan)
- drop the cleverness in cxl_nvdimm_probe() to make the patch more
  readable (Jonathan)

 drivers/cxl/core/mbox.c   |    5 +++++
 drivers/cxl/core/memdev.c |   32 ++++++++++++++++++++++++++++++++
 drivers/cxl/cxlmem.h      |    4 ++++
 drivers/cxl/pmem.c        |   29 ++++++++++++++++++++++++++---
 4 files changed, 67 insertions(+), 3 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 422999740649..82e79da195fa 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -221,6 +221,7 @@ static bool cxl_mem_raw_command_allowed(u16 opcode)
  *  * %-EINVAL	- Reserved fields or invalid values were used.
  *  * %-ENOMEM	- Input or output buffer wasn't sized properly.
  *  * %-EPERM	- Attempted to use a protected command.
+ *  * %-EBUSY	- Kernel has claimed exclusive access to this opcode
  *
  * The result of this command is a fully validated command in @out_cmd that is
  * safe to send to the hardware.
@@ -296,6 +297,10 @@ static int cxl_validate_cmd_from_user(struct cxl_mem *cxlm,
 	if (!test_bit(info->id, cxlm->enabled_cmds))
 		return -ENOTTY;
 
+	/* Check that the command is not claimed for exclusive kernel use */
+	if (test_bit(info->id, cxlm->exclusive_cmds))
+		return -EBUSY;
+
 	/* Check the input buffer is the expected size */
 	if (info->size_in >= 0 && info->size_in != send_cmd->in.size)
 		return -ENOMEM;
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index df2ba87238c2..bf1b04d00ff4 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -134,6 +134,38 @@ static const struct device_type cxl_memdev_type = {
 	.groups = cxl_memdev_attribute_groups,
 };
 
+/**
+ * set_exclusive_cxl_commands() - atomically disable user cxl commands
+ * @cxlm: cxl_mem instance to modify
+ * @cmds: bitmap of commands to mark exclusive
+ *
+ * Grab the cxl_memdev_rwsem in write mode to flush in-flight
+ * invocations of the ioctl path and then disable future execution of
+ * commands with the command ids set in @cmds.
+ */
+void set_exclusive_cxl_commands(struct cxl_mem *cxlm, unsigned long *cmds)
+{
+	down_write(&cxl_memdev_rwsem);
+	bitmap_or(cxlm->exclusive_cmds, cxlm->exclusive_cmds, cmds,
+		  CXL_MEM_COMMAND_ID_MAX);
+	up_write(&cxl_memdev_rwsem);
+}
+EXPORT_SYMBOL_GPL(set_exclusive_cxl_commands);
+
+/**
+ * clear_exclusive_cxl_commands() - atomically enable user cxl commands
+ * @cxlm: cxl_mem instance to modify
+ * @cmds: bitmap of commands to mark available for userspace
+ */
+void clear_exclusive_cxl_commands(struct cxl_mem *cxlm, unsigned long *cmds)
+{
+	down_write(&cxl_memdev_rwsem);
+	bitmap_andnot(cxlm->exclusive_cmds, cxlm->exclusive_cmds, cmds,
+		      CXL_MEM_COMMAND_ID_MAX);
+	up_write(&cxl_memdev_rwsem);
+}
+EXPORT_SYMBOL_GPL(clear_exclusive_cxl_commands);
+
 static void cxl_memdev_shutdown(struct device *dev)
 {
 	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 8f59a89a0aab..373add570ef6 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -101,6 +101,7 @@ struct cxl_mbox_cmd {
  * @mbox_mutex: Mutex to synchronize mailbox access.
  * @firmware_version: Firmware version for the memory device.
  * @enabled_cmds: Hardware commands found enabled in CEL.
+ * @exclusive_cmds: Commands that are kernel-internal only
  * @pmem_range: Active Persistent memory capacity configuration
  * @ram_range: Active Volatile memory capacity configuration
  * @total_bytes: sum of all possible capacities
@@ -127,6 +128,7 @@ struct cxl_mem {
 	struct mutex mbox_mutex; /* Protects device mailbox and firmware */
 	char firmware_version[0x10];
 	DECLARE_BITMAP(enabled_cmds, CXL_MEM_COMMAND_ID_MAX);
+	DECLARE_BITMAP(exclusive_cmds, CXL_MEM_COMMAND_ID_MAX);
 
 	struct range pmem_range;
 	struct range ram_range;
@@ -200,4 +202,6 @@ int cxl_mem_identify(struct cxl_mem *cxlm);
 int cxl_mem_enumerate_cmds(struct cxl_mem *cxlm);
 int cxl_mem_create_range_info(struct cxl_mem *cxlm);
 struct cxl_mem *cxl_mem_create(struct device *dev);
+void set_exclusive_cxl_commands(struct cxl_mem *cxlm, unsigned long *cmds);
+void clear_exclusive_cxl_commands(struct cxl_mem *cxlm, unsigned long *cmds);
 #endif /* __CXL_MEM_H__ */
diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
index 9652c3ee41e7..5629289939af 100644
--- a/drivers/cxl/pmem.c
+++ b/drivers/cxl/pmem.c
@@ -16,6 +16,13 @@
  */
 static struct workqueue_struct *cxl_pmem_wq;
 
+static __read_mostly DECLARE_BITMAP(exclusive_cmds, CXL_MEM_COMMAND_ID_MAX);
+
+static void clear_exclusive(void *cxlm)
+{
+	clear_exclusive_cxl_commands(cxlm, exclusive_cmds);
+}
+
 static void unregister_nvdimm(void *nvdimm)
 {
 	nvdimm_delete(nvdimm);
@@ -39,25 +46,37 @@ static struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(void)
 static int cxl_nvdimm_probe(struct device *dev)
 {
 	struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev);
+	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
+	struct cxl_mem *cxlm = cxlmd->cxlm;
 	struct cxl_nvdimm_bridge *cxl_nvb;
 	unsigned long flags = 0;
 	struct nvdimm *nvdimm;
-	int rc = -ENXIO;
+	int rc;
 
 	cxl_nvb = cxl_find_nvdimm_bridge();
 	if (!cxl_nvb)
 		return -ENXIO;
 
 	device_lock(&cxl_nvb->dev);
-	if (!cxl_nvb->nvdimm_bus)
+	if (!cxl_nvb->nvdimm_bus) {
+		rc = -ENXIO;
+		goto out;
+	}
+
+	set_exclusive_cxl_commands(cxlm, exclusive_cmds);
+	rc = devm_add_action_or_reset(dev, clear_exclusive, cxlm);
+	if (rc)
 		goto out;
 
 	set_bit(NDD_LABELING, &flags);
 	nvdimm = nvdimm_create(cxl_nvb->nvdimm_bus, cxl_nvd, NULL, flags, 0, 0,
 			       NULL);
-	if (!nvdimm)
+	if (!nvdimm) {
+		rc = -ENOMEM;
 		goto out;
+	}
 
+	dev_set_drvdata(dev, nvdimm);
 	rc = devm_add_action_or_reset(dev, unregister_nvdimm, nvdimm);
 out:
 	device_unlock(&cxl_nvb->dev);
@@ -194,6 +213,10 @@ static __init int cxl_pmem_init(void)
 {
 	int rc;
 
+	set_bit(CXL_MEM_COMMAND_ID_SET_PARTITION_INFO, exclusive_cmds);
+	set_bit(CXL_MEM_COMMAND_ID_SET_SHUTDOWN_STATE, exclusive_cmds);
+	set_bit(CXL_MEM_COMMAND_ID_SET_LSA, exclusive_cmds);
+
 	cxl_pmem_wq = alloc_ordered_workqueue("cxl_pmem", 0);
 	if (!cxl_pmem_wq)
 		return -ENXIO;


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH v5 15/21] cxl/pmem: Translate NVDIMM label commands to CXL label commands
  2021-09-09  5:12 ` [PATCH v4 15/21] cxl/pmem: Translate NVDIMM label commands to CXL label commands Dan Williams
  2021-09-09 17:22   ` Ben Widawsky
  2021-09-09 22:08   ` [PATCH v5 " Dan Williams
@ 2021-09-14 19:06   ` Dan Williams
  2 siblings, 0 replies; 78+ messages in thread
From: Dan Williams @ 2021-09-14 19:06 UTC (permalink / raw)
  To: linux-cxl; +Cc: Ben Widawsky, Jonathan Cameron, nvdimm, Jonathan.Cameron

The LIBNVDIMM IOCTL UAPI calls back to the nvdimm-bus-provider to
translate the Linux command payload to the device native command format.
The LIBNVDIMM commands get-config-size, get-config-data, and
set-config-data, map to the CXL memory device commands device-identify,
get-lsa, and set-lsa. Recall that the label-storage-area (LSA) on an
NVDIMM device arranges for the provisioning of namespaces. Additionally
for CXL the LSA is used for provisioning regions as well.

The data from device-identify is already cached in the 'struct cxl_mem'
instance associated with @cxl_nvd, so that payload return is simply
crafted and no CXL command is issued. The conversion for get-lsa is
straightforward, but the conversion for set-lsa requires an allocation
to prepend the set-lsa header to the payload.

Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
Changes since v4:
- rebase on [PATCH v5 14/21] cxl/mbox: Add exclusive kernel command support

 drivers/cxl/pmem.c |  128 ++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 124 insertions(+), 4 deletions(-)

diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
index 5629289939af..bc1466b0e999 100644
--- a/drivers/cxl/pmem.c
+++ b/drivers/cxl/pmem.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /* Copyright(c) 2021 Intel Corporation. All rights reserved. */
 #include <linux/libnvdimm.h>
+#include <asm/unaligned.h>
 #include <linux/device.h>
 #include <linux/module.h>
 #include <linux/ndctl.h>
@@ -47,9 +48,9 @@ static int cxl_nvdimm_probe(struct device *dev)
 {
 	struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev);
 	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
+	unsigned long flags = 0, cmd_mask = 0;
 	struct cxl_mem *cxlm = cxlmd->cxlm;
 	struct cxl_nvdimm_bridge *cxl_nvb;
-	unsigned long flags = 0;
 	struct nvdimm *nvdimm;
 	int rc;
 
@@ -69,8 +70,11 @@ static int cxl_nvdimm_probe(struct device *dev)
 		goto out;
 
 	set_bit(NDD_LABELING, &flags);
-	nvdimm = nvdimm_create(cxl_nvb->nvdimm_bus, cxl_nvd, NULL, flags, 0, 0,
-			       NULL);
+	set_bit(ND_CMD_GET_CONFIG_SIZE, &cmd_mask);
+	set_bit(ND_CMD_GET_CONFIG_DATA, &cmd_mask);
+	set_bit(ND_CMD_SET_CONFIG_DATA, &cmd_mask);
+	nvdimm = nvdimm_create(cxl_nvb->nvdimm_bus, cxl_nvd, NULL, flags,
+			       cmd_mask, 0, NULL);
 	if (!nvdimm) {
 		rc = -ENOMEM;
 		goto out;
@@ -91,11 +95,127 @@ static struct cxl_driver cxl_nvdimm_driver = {
 	.id = CXL_DEVICE_NVDIMM,
 };
 
+static int cxl_pmem_get_config_size(struct cxl_mem *cxlm,
+				    struct nd_cmd_get_config_size *cmd,
+				    unsigned int buf_len)
+{
+	if (sizeof(*cmd) > buf_len)
+		return -EINVAL;
+
+	*cmd = (struct nd_cmd_get_config_size) {
+		 .config_size = cxlm->lsa_size,
+		 .max_xfer = cxlm->payload_size,
+	};
+
+	return 0;
+}
+
+static int cxl_pmem_get_config_data(struct cxl_mem *cxlm,
+				    struct nd_cmd_get_config_data_hdr *cmd,
+				    unsigned int buf_len)
+{
+	struct cxl_mbox_get_lsa {
+		u32 offset;
+		u32 length;
+	} get_lsa;
+	int rc;
+
+	if (sizeof(*cmd) > buf_len)
+		return -EINVAL;
+	if (struct_size(cmd, out_buf, cmd->in_length) > buf_len)
+		return -EINVAL;
+
+	get_lsa = (struct cxl_mbox_get_lsa) {
+		.offset = cmd->in_offset,
+		.length = cmd->in_length,
+	};
+
+	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_GET_LSA, &get_lsa,
+				   sizeof(get_lsa), cmd->out_buf,
+				   cmd->in_length);
+	cmd->status = 0;
+
+	return rc;
+}
+
+static int cxl_pmem_set_config_data(struct cxl_mem *cxlm,
+				    struct nd_cmd_set_config_hdr *cmd,
+				    unsigned int buf_len)
+{
+	struct cxl_mbox_set_lsa {
+		u32 offset;
+		u32 reserved;
+		u8 data[];
+	} *set_lsa;
+	int rc;
+
+	if (sizeof(*cmd) > buf_len)
+		return -EINVAL;
+
+	/* 4-byte status follows the input data in the payload */
+	if (struct_size(cmd, in_buf, cmd->in_length) + 4 > buf_len)
+		return -EINVAL;
+
+	set_lsa =
+		kvzalloc(struct_size(set_lsa, data, cmd->in_length), GFP_KERNEL);
+	if (!set_lsa)
+		return -ENOMEM;
+
+	*set_lsa = (struct cxl_mbox_set_lsa) {
+		.offset = cmd->in_offset,
+	};
+	memcpy(set_lsa->data, cmd->in_buf, cmd->in_length);
+
+	rc = cxl_mem_mbox_send_cmd(cxlm, CXL_MBOX_OP_SET_LSA, set_lsa,
+				   struct_size(set_lsa, data, cmd->in_length),
+				   NULL, 0);
+
+	/*
+	 * Set "firmware" status (4-packed bytes at the end of the input
+	 * payload.
+	 */
+	put_unaligned(0, (u32 *) &cmd->in_buf[cmd->in_length]);
+	kvfree(set_lsa);
+
+	return rc;
+}
+
+static int cxl_pmem_nvdimm_ctl(struct nvdimm *nvdimm, unsigned int cmd,
+			       void *buf, unsigned int buf_len)
+{
+	struct cxl_nvdimm *cxl_nvd = nvdimm_provider_data(nvdimm);
+	unsigned long cmd_mask = nvdimm_cmd_mask(nvdimm);
+	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
+	struct cxl_mem *cxlm = cxlmd->cxlm;
+
+	if (!test_bit(cmd, &cmd_mask))
+		return -ENOTTY;
+
+	switch (cmd) {
+	case ND_CMD_GET_CONFIG_SIZE:
+		return cxl_pmem_get_config_size(cxlm, buf, buf_len);
+	case ND_CMD_GET_CONFIG_DATA:
+		return cxl_pmem_get_config_data(cxlm, buf, buf_len);
+	case ND_CMD_SET_CONFIG_DATA:
+		return cxl_pmem_set_config_data(cxlm, buf, buf_len);
+	default:
+		return -ENOTTY;
+	}
+}
+
 static int cxl_pmem_ctl(struct nvdimm_bus_descriptor *nd_desc,
 			struct nvdimm *nvdimm, unsigned int cmd, void *buf,
 			unsigned int buf_len, int *cmd_rc)
 {
-	return -ENOTTY;
+	/*
+	 * No firmware response to translate, let the transport error
+	 * code take precedence.
+	 */
+	*cmd_rc = 0;
+
+	if (!nvdimm)
+		return -ENOTTY;
+	return cxl_pmem_nvdimm_ctl(nvdimm, cmd, buf, buf_len);
 }
 
 static bool online_nvdimm_bus(struct cxl_nvdimm_bridge *cxl_nvb)


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH v5 16/21] cxl/pmem: Add support for multiple nvdimm-bridge objects
  2021-09-09  5:12 ` [PATCH v4 16/21] cxl/pmem: Add support for multiple nvdimm-bridge objects Dan Williams
  2021-09-09 22:03   ` Dan Williams
@ 2021-09-14 19:08   ` Dan Williams
  1 sibling, 0 replies; 78+ messages in thread
From: Dan Williams @ 2021-09-14 19:08 UTC (permalink / raw)
  To: linux-cxl; +Cc: Ben Widawsky, Jonathan Cameron, nvdimm, Jonathan.Cameron

In preparation for a mocked unit test environment for CXL objects, allow
for multiple unique nvdimm-bridge objects.

For now, just allow multiple bridges to be registered. Later, when there
are multiple present, further updates are needed to
cxl_find_nvdimm_bridge() to identify which bridge is associated with
which CXL hierarchy for nvdimm registration.

Note that this does change the kernel device-name for the bridge object.
User space should not have any attachment to the device name at this
point as it is still early days in the CXL driver development.

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
Changes since v4:

- rebase on "[PATCH v5 15/21] cxl/pmem: Translate NVDIMM label commands
  to CXL label commands"

 drivers/cxl/core/pmem.c |   32 +++++++++++++++++++++++++++++++-
 drivers/cxl/cxl.h       |    2 ++
 drivers/cxl/pmem.c      |   15 ---------------
 3 files changed, 33 insertions(+), 16 deletions(-)

diff --git a/drivers/cxl/core/pmem.c b/drivers/cxl/core/pmem.c
index d24570f5b8ba..9e56be3994f1 100644
--- a/drivers/cxl/core/pmem.c
+++ b/drivers/cxl/core/pmem.c
@@ -2,6 +2,7 @@
 /* Copyright(c) 2020 Intel Corporation. */
 #include <linux/device.h>
 #include <linux/slab.h>
+#include <linux/idr.h>
 #include <cxlmem.h>
 #include <cxl.h>
 #include "core.h"
@@ -20,10 +21,13 @@
  * operations, for example, namespace label access commands.
  */
 
+static DEFINE_IDA(cxl_nvdimm_bridge_ida);
+
 static void cxl_nvdimm_bridge_release(struct device *dev)
 {
 	struct cxl_nvdimm_bridge *cxl_nvb = to_cxl_nvdimm_bridge(dev);
 
+	ida_free(&cxl_nvdimm_bridge_ida, cxl_nvb->id);
 	kfree(cxl_nvb);
 }
 
@@ -47,16 +51,38 @@ struct cxl_nvdimm_bridge *to_cxl_nvdimm_bridge(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(to_cxl_nvdimm_bridge);
 
+static int match_nvdimm_bridge(struct device *dev, const void *data)
+{
+	return dev->type == &cxl_nvdimm_bridge_type;
+}
+
+struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(void)
+{
+	struct device *dev;
+
+	dev = bus_find_device(&cxl_bus_type, NULL, NULL, match_nvdimm_bridge);
+	if (!dev)
+		return NULL;
+	return to_cxl_nvdimm_bridge(dev);
+}
+EXPORT_SYMBOL_GPL(cxl_find_nvdimm_bridge);
+
 static struct cxl_nvdimm_bridge *
 cxl_nvdimm_bridge_alloc(struct cxl_port *port)
 {
 	struct cxl_nvdimm_bridge *cxl_nvb;
 	struct device *dev;
+	int rc;
 
 	cxl_nvb = kzalloc(sizeof(*cxl_nvb), GFP_KERNEL);
 	if (!cxl_nvb)
 		return ERR_PTR(-ENOMEM);
 
+	rc = ida_alloc(&cxl_nvdimm_bridge_ida, GFP_KERNEL);
+	if (rc < 0)
+		goto err;
+	cxl_nvb->id = rc;
+
 	dev = &cxl_nvb->dev;
 	cxl_nvb->port = port;
 	cxl_nvb->state = CXL_NVB_NEW;
@@ -67,6 +93,10 @@ cxl_nvdimm_bridge_alloc(struct cxl_port *port)
 	dev->type = &cxl_nvdimm_bridge_type;
 
 	return cxl_nvb;
+
+err:
+	kfree(cxl_nvb);
+	return ERR_PTR(rc);
 }
 
 static void unregister_nvb(void *_cxl_nvb)
@@ -119,7 +149,7 @@ struct cxl_nvdimm_bridge *devm_cxl_add_nvdimm_bridge(struct device *host,
 		return cxl_nvb;
 
 	dev = &cxl_nvb->dev;
-	rc = dev_set_name(dev, "nvdimm-bridge");
+	rc = dev_set_name(dev, "nvdimm-bridge%d", cxl_nvb->id);
 	if (rc)
 		goto err;
 
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 53927f9fa77e..1b2e816e061e 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -211,6 +211,7 @@ enum cxl_nvdimm_brige_state {
 };
 
 struct cxl_nvdimm_bridge {
+	int id;
 	struct device dev;
 	struct cxl_port *port;
 	struct nvdimm_bus *nvdimm_bus;
@@ -323,4 +324,5 @@ struct cxl_nvdimm_bridge *devm_cxl_add_nvdimm_bridge(struct device *host,
 struct cxl_nvdimm *to_cxl_nvdimm(struct device *dev);
 bool is_cxl_nvdimm(struct device *dev);
 int devm_cxl_add_nvdimm(struct device *host, struct cxl_memdev *cxlmd);
+struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(void);
 #endif /* __CXL_H__ */
diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
index bc1466b0e999..5ccf9aa6b3ae 100644
--- a/drivers/cxl/pmem.c
+++ b/drivers/cxl/pmem.c
@@ -29,21 +29,6 @@ static void unregister_nvdimm(void *nvdimm)
 	nvdimm_delete(nvdimm);
 }
 
-static int match_nvdimm_bridge(struct device *dev, const void *data)
-{
-	return strcmp(dev_name(dev), "nvdimm-bridge") == 0;
-}
-
-static struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(void)
-{
-	struct device *dev;
-
-	dev = bus_find_device(&cxl_bus_type, NULL, NULL, match_nvdimm_bridge);
-	if (!dev)
-		return NULL;
-	return to_cxl_nvdimm_bridge(dev);
-}
-
 static int cxl_nvdimm_probe(struct device *dev)
 {
 	struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev);


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH v5 17/21] tools/testing/cxl: Introduce a mocked-up CXL port hierarchy
  2021-09-09  5:13 ` [PATCH v4 17/21] tools/testing/cxl: Introduce a mocked-up CXL port hierarchy Dan Williams
  2021-09-10  9:53   ` Jonathan Cameron
@ 2021-09-14 19:14   ` Dan Williams
  1 sibling, 0 replies; 78+ messages in thread
From: Dan Williams @ 2021-09-14 19:14 UTC (permalink / raw)
  To: linux-cxl; +Cc: Ben Widawsky, Vishal Verma, nvdimm, Jonathan.Cameron

Create an environment for CXL plumbing unit tests. Especially when it
comes to an algorithm for HDM Decoder (Host-managed Device Memory
Decoder) programming, the availability of an in-kernel-tree emulation
environment for CXL configuration complexity and corner cases speeds
development and deters regressions.

The approach taken mirrors what was done for tools/testing/nvdimm/. I.e.
an external module, cxl_test.ko built out of the tools/testing/cxl/
directory, provides mock implementations of kernel APIs and kernel
objects to simulate a real world device hierarchy.

One piece of feedback on the tools/testing/nvdimm/ proposal was "why
not do this in QEMU?". In fact, the CXL development community has
developed a QEMU
model for CXL [1]. However, there are a few blocking issues that keep
QEMU from being a tight fit for topology + provisioning unit tests:

1/ The QEMU community has yet to show interest in merging any of this
   support that has had patches on the list since November 2020. So,
   testing CXL to date involves building custom QEMU with out-of-tree
   patches.

2/ CXL mechanisms like cross-host-bridge interleave do not have a clear
   path to be emulated by QEMU without major infrastructure work. This
   is easier to achieve with the alloc_mock_res() approach taken in this
   patch to shortcut-define emulated system physical address ranges with
   interleave behavior.

The QEMU enabling has been critical to get the driver off the ground,
and may still move forward, but it does not address the ongoing needs of
a regression testing environment and test driven development.

This patch adds an ACPI CXL Platform definition with emulated CXL
multi-ported host-bridges. A follow on patch adds emulated memory
expander devices.
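
The mocking leans on two mechanisms. First, the linker's symbol
wrapping: for each symbol S listed via "ldflags-y += --wrap=S" in the
Kbuild below, calls to S are redirected to __wrap_S(), which can
dispatch to a mock or fall through to __real_S(). A sketch of that
shape, where mock_acpi_get_table() is an illustrative stand-in for the
dispatch that test/mock.c performs through its registered mock ops:

	acpi_status __wrap_acpi_get_table(char *signature, u32 instance,
					  struct acpi_table_header **out_table)
	{
		/* cxl_test asks for a non-canonical CEDT instance */
		if (instance == U32_MAX)
			return mock_acpi_get_table(signature, out_table);
		return __real_acpi_get_table(signature, instance, out_table);
	}

Second, functions annotated __mock compile as 'static' in a normal
build, while the test build passes -D__mock=__weak so that
tools/testing/cxl/ can supply overriding definitions.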

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/20210202005948.241655-1-ben.widawsky@intel.com [1]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
Changes since v4:
- drop the addition of a new debug message as it does not belong in this
  patch. (Jonathan)

 drivers/cxl/acpi.c               |   36 ++-
 drivers/cxl/cxl.h                |   16 +
 tools/testing/cxl/Kbuild         |   36 +++
 tools/testing/cxl/config_check.c |   13 +
 tools/testing/cxl/mock_acpi.c    |  109 ++++++++
 tools/testing/cxl/test/Kbuild    |    6 
 tools/testing/cxl/test/cxl.c     |  509 ++++++++++++++++++++++++++++++++++++++
 tools/testing/cxl/test/mock.c    |  171 +++++++++++++
 tools/testing/cxl/test/mock.h    |   27 ++
 9 files changed, 908 insertions(+), 15 deletions(-)
 create mode 100644 tools/testing/cxl/Kbuild
 create mode 100644 tools/testing/cxl/config_check.c
 create mode 100644 tools/testing/cxl/mock_acpi.c
 create mode 100644 tools/testing/cxl/test/Kbuild
 create mode 100644 tools/testing/cxl/test/cxl.c
 create mode 100644 tools/testing/cxl/test/mock.c
 create mode 100644 tools/testing/cxl/test/mock.h

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index 54e9d4d2cf5f..32775d1ac4b3 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -182,15 +182,7 @@ static resource_size_t get_chbcr(struct acpi_cedt_chbs *chbs)
 	return IS_ERR(chbs) ? CXL_RESOURCE_NONE : chbs->base;
 }
 
-struct cxl_walk_context {
-	struct device *dev;
-	struct pci_bus *root;
-	struct cxl_port *port;
-	int error;
-	int count;
-};
-
-static int match_add_root_ports(struct pci_dev *pdev, void *data)
+__mock int match_add_root_ports(struct pci_dev *pdev, void *data)
 {
 	struct cxl_walk_context *ctx = data;
 	struct pci_bus *root_bus = ctx->root;
@@ -239,7 +231,8 @@ static struct cxl_dport *find_dport_by_dev(struct cxl_port *port, struct device
 	return NULL;
 }
 
-static struct acpi_device *to_cxl_host_bridge(struct device *dev)
+__mock struct acpi_device *to_cxl_host_bridge(struct device *host,
+					      struct device *dev)
 {
 	struct acpi_device *adev = to_acpi_device(dev);
 
@@ -257,9 +250,9 @@ static struct acpi_device *to_cxl_host_bridge(struct device *dev)
  */
 static int add_host_bridge_uport(struct device *match, void *arg)
 {
-	struct acpi_device *bridge = to_cxl_host_bridge(match);
 	struct cxl_port *root_port = arg;
 	struct device *host = root_port->dev.parent;
+	struct acpi_device *bridge = to_cxl_host_bridge(host, match);
 	struct acpi_pci_root *pci_root;
 	struct cxl_walk_context ctx;
 	struct cxl_decoder *cxld;
@@ -323,7 +316,7 @@ static int add_host_bridge_dport(struct device *match, void *arg)
 	struct acpi_cedt_chbs *chbs;
 	struct cxl_port *root_port = arg;
 	struct device *host = root_port->dev.parent;
-	struct acpi_device *bridge = to_cxl_host_bridge(match);
+	struct acpi_device *bridge = to_cxl_host_bridge(host, match);
 
 	if (!bridge)
 		return 0;
@@ -375,6 +368,17 @@ static int add_root_nvdimm_bridge(struct device *match, void *data)
 	return 1;
 }
 
+static u32 cedt_instance(struct platform_device *pdev)
+{
+	const bool *native_acpi0017 = acpi_device_get_match_data(&pdev->dev);
+
+	if (native_acpi0017 && *native_acpi0017)
+		return 0;
+
+	/* for cxl_test request a non-canonical instance */
+	return U32_MAX;
+}
+
 static int cxl_acpi_probe(struct platform_device *pdev)
 {
 	int rc;
@@ -388,7 +392,7 @@ static int cxl_acpi_probe(struct platform_device *pdev)
 		return PTR_ERR(root_port);
 	dev_dbg(host, "add: %s\n", dev_name(&root_port->dev));
 
-	status = acpi_get_table(ACPI_SIG_CEDT, 0, &acpi_cedt);
+	status = acpi_get_table(ACPI_SIG_CEDT, cedt_instance(pdev), &acpi_cedt);
 	if (ACPI_FAILURE(status))
 		return -ENXIO;
 
@@ -419,9 +423,11 @@ static int cxl_acpi_probe(struct platform_device *pdev)
 	return 0;
 }
 
+static bool native_acpi0017 = true;
+
 static const struct acpi_device_id cxl_acpi_ids[] = {
-	{ "ACPI0017", 0 },
-	{ "", 0 },
+	{ "ACPI0017", (unsigned long) &native_acpi0017 },
+	{ },
 };
 MODULE_DEVICE_TABLE(acpi, cxl_acpi_ids);
 
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 1b2e816e061e..c5152718267e 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -226,6 +226,14 @@ struct cxl_nvdimm {
 	struct nvdimm *nvdimm;
 };
 
+struct cxl_walk_context {
+	struct device *dev;
+	struct pci_bus *root;
+	struct cxl_port *port;
+	int error;
+	int count;
+};
+
 /**
  * struct cxl_port - logical collection of upstream port devices and
  *		     downstream port devices to construct a CXL memory
@@ -325,4 +333,12 @@ struct cxl_nvdimm *to_cxl_nvdimm(struct device *dev);
 bool is_cxl_nvdimm(struct device *dev);
 int devm_cxl_add_nvdimm(struct device *host, struct cxl_memdev *cxlmd);
 struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(void);
+
+/*
+ * Unit test builds override this to __weak; find the 'strong' versions
+ * of these symbols in tools/testing/cxl/.
+ */
+#ifndef __mock
+#define __mock static
+#endif
 #endif /* __CXL_H__ */
diff --git a/tools/testing/cxl/Kbuild b/tools/testing/cxl/Kbuild
new file mode 100644
index 000000000000..63a4a07e71c4
--- /dev/null
+++ b/tools/testing/cxl/Kbuild
@@ -0,0 +1,36 @@
+# SPDX-License-Identifier: GPL-2.0
+ldflags-y += --wrap=is_acpi_device_node
+ldflags-y += --wrap=acpi_get_table
+ldflags-y += --wrap=acpi_put_table
+ldflags-y += --wrap=acpi_evaluate_integer
+ldflags-y += --wrap=acpi_pci_find_root
+ldflags-y += --wrap=pci_walk_bus
+ldflags-y += --wrap=nvdimm_bus_register
+
+DRIVERS := ../../../drivers
+CXL_SRC := $(DRIVERS)/cxl
+CXL_CORE_SRC := $(DRIVERS)/cxl/core
+ccflags-y := -I$(srctree)/drivers/cxl/
+ccflags-y += -D__mock=__weak
+
+obj-m += cxl_acpi.o
+
+cxl_acpi-y := $(CXL_SRC)/acpi.o
+cxl_acpi-y += mock_acpi.o
+cxl_acpi-y += config_check.o
+
+obj-m += cxl_pmem.o
+
+cxl_pmem-y := $(CXL_SRC)/pmem.o
+cxl_pmem-y += config_check.o
+
+obj-m += cxl_core.o
+
+cxl_core-y := $(CXL_CORE_SRC)/bus.o
+cxl_core-y += $(CXL_CORE_SRC)/pmem.o
+cxl_core-y += $(CXL_CORE_SRC)/regs.o
+cxl_core-y += $(CXL_CORE_SRC)/memdev.o
+cxl_core-y += $(CXL_CORE_SRC)/mbox.o
+cxl_core-y += config_check.o
+
+obj-m += test/
diff --git a/tools/testing/cxl/config_check.c b/tools/testing/cxl/config_check.c
new file mode 100644
index 000000000000..de5e5b3652fd
--- /dev/null
+++ b/tools/testing/cxl/config_check.c
@@ -0,0 +1,13 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/bug.h>
+
+void check(void)
+{
+	/*
+	 * These kconfig symbols must be set to "m" for cxl_test to load
+	 * and operate.
+	 */
+	BUILD_BUG_ON(!IS_MODULE(CONFIG_CXL_BUS));
+	BUILD_BUG_ON(!IS_MODULE(CONFIG_CXL_ACPI));
+	BUILD_BUG_ON(!IS_MODULE(CONFIG_CXL_PMEM));
+}
diff --git a/tools/testing/cxl/mock_acpi.c b/tools/testing/cxl/mock_acpi.c
new file mode 100644
index 000000000000..4c8a493ace56
--- /dev/null
+++ b/tools/testing/cxl/mock_acpi.c
@@ -0,0 +1,109 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
+
+#include <linux/platform_device.h>
+#include <linux/device.h>
+#include <linux/acpi.h>
+#include <linux/pci.h>
+#include <cxl.h>
+#include "test/mock.h"
+
+struct acpi_device *to_cxl_host_bridge(struct device *host, struct device *dev)
+{
+	int index;
+	struct acpi_device *adev, *found = NULL;
+	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
+
+	if (ops && ops->is_mock_bridge(dev)) {
+		found = ACPI_COMPANION(dev);
+		goto out;
+	}
+
+	if (dev->bus == &platform_bus_type)
+		goto out;
+
+	adev = to_acpi_device(dev);
+	if (!acpi_pci_find_root(adev->handle))
+		goto out;
+
+	if (strcmp(acpi_device_hid(adev), "ACPI0016") == 0) {
+		found = adev;
+		dev_dbg(host, "found host bridge %s\n", dev_name(&adev->dev));
+	}
+out:
+	put_cxl_mock_ops(index);
+	return found;
+}
+
+static int match_add_root_port(struct pci_dev *pdev, void *data)
+{
+	struct cxl_walk_context *ctx = data;
+	struct pci_bus *root_bus = ctx->root;
+	struct cxl_port *port = ctx->port;
+	int type = pci_pcie_type(pdev);
+	struct device *dev = ctx->dev;
+	u32 lnkcap, port_num;
+	int rc;
+
+	if (pdev->bus != root_bus)
+		return 0;
+	if (!pci_is_pcie(pdev))
+		return 0;
+	if (type != PCI_EXP_TYPE_ROOT_PORT)
+		return 0;
+	if (pci_read_config_dword(pdev, pci_pcie_cap(pdev) + PCI_EXP_LNKCAP,
+				  &lnkcap) != PCIBIOS_SUCCESSFUL)
+		return 0;
+
+	/* TODO walk DVSEC to find component register base */
+	port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
+	rc = cxl_add_dport(port, &pdev->dev, port_num, CXL_RESOURCE_NONE);
+	if (rc) {
+		dev_err(dev, "failed to add dport: %s (%d)\n",
+			dev_name(&pdev->dev), rc);
+		ctx->error = rc;
+		return rc;
+	}
+	ctx->count++;
+
+	dev_dbg(dev, "add dport%d: %s\n", port_num, dev_name(&pdev->dev));
+
+	return 0;
+}
+
+static int mock_add_root_port(struct platform_device *pdev, void *data)
+{
+	struct cxl_walk_context *ctx = data;
+	struct cxl_port *port = ctx->port;
+	struct device *dev = ctx->dev;
+	int rc;
+
+	rc = cxl_add_dport(port, &pdev->dev, pdev->id, CXL_RESOURCE_NONE);
+	if (rc) {
+		dev_err(dev, "failed to add dport: %s (%d)\n",
+			dev_name(&pdev->dev), rc);
+		ctx->error = rc;
+		return rc;
+	}
+	ctx->count++;
+
+	dev_dbg(dev, "add dport%d: %s\n", pdev->id, dev_name(&pdev->dev));
+
+	return 0;
+}
+
+int match_add_root_ports(struct pci_dev *dev, void *data)
+{
+	int index, rc;
+	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
+	struct platform_device *pdev = (struct platform_device *) dev;
+
+	if (ops && ops->is_mock_port(pdev))
+		rc = mock_add_root_port(pdev, data);
+	else
+		rc = match_add_root_port(dev, data);
+
+	put_cxl_mock_ops(index);
+
+	return rc;
+}
diff --git a/tools/testing/cxl/test/Kbuild b/tools/testing/cxl/test/Kbuild
new file mode 100644
index 000000000000..7de4ddecfd21
--- /dev/null
+++ b/tools/testing/cxl/test/Kbuild
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-m += cxl_test.o
+obj-m += cxl_mock.o
+
+cxl_test-y := cxl.o
+cxl_mock-y := mock.o
diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c
new file mode 100644
index 000000000000..1c47b34244a4
--- /dev/null
+++ b/tools/testing/cxl/test/cxl.c
@@ -0,0 +1,509 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright(c) 2021 Intel Corporation. All rights reserved.
+
+#include <linux/platform_device.h>
+#include <linux/genalloc.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/acpi.h>
+#include <linux/pci.h>
+#include <linux/mm.h>
+#include "mock.h"
+
+#define NR_CXL_HOST_BRIDGES 4
+#define NR_CXL_ROOT_PORTS 2
+
+static struct platform_device *cxl_acpi;
+static struct platform_device *cxl_host_bridge[NR_CXL_HOST_BRIDGES];
+static struct platform_device
+	*cxl_root_port[NR_CXL_HOST_BRIDGES * NR_CXL_ROOT_PORTS];
+
+static struct acpi_device acpi0017_mock;
+static struct acpi_device host_bridge[NR_CXL_HOST_BRIDGES] = {
+	[0] = {
+		.handle = &host_bridge[0],
+	},
+	[1] = {
+		.handle = &host_bridge[1],
+	},
+	[2] = {
+		.handle = &host_bridge[2],
+	},
+	[3] = {
+		.handle = &host_bridge[3],
+	},
+};
+
+static bool is_mock_dev(struct device *dev)
+{
+	if (dev == &cxl_acpi->dev)
+		return true;
+	return false;
+}
+
+static bool is_mock_adev(struct acpi_device *adev)
+{
+	int i;
+
+	if (adev == &acpi0017_mock)
+		return true;
+
+	for (i = 0; i < ARRAY_SIZE(host_bridge); i++)
+		if (adev == &host_bridge[i])
+			return true;
+
+	return false;
+}
+
+static struct {
+	struct acpi_table_cedt cedt;
+	struct acpi_cedt_chbs chbs[NR_CXL_HOST_BRIDGES];
+	struct {
+		struct acpi_cedt_cfmws cfmws;
+		u32 target[1];
+	} cfmws0;
+	struct {
+		struct acpi_cedt_cfmws cfmws;
+		u32 target[4];
+	} cfmws1;
+	struct {
+		struct acpi_cedt_cfmws cfmws;
+		u32 target[1];
+	} cfmws2;
+	struct {
+		struct acpi_cedt_cfmws cfmws;
+		u32 target[4];
+	} cfmws3;
+} __packed mock_cedt = {
+	.cedt = {
+		.header = {
+			.signature = "CEDT",
+			.length = sizeof(mock_cedt),
+			.revision = 1,
+		},
+	},
+	.chbs[0] = {
+		.header = {
+			.type = ACPI_CEDT_TYPE_CHBS,
+			.length = sizeof(mock_cedt.chbs[0]),
+		},
+		.uid = 0,
+		.cxl_version = ACPI_CEDT_CHBS_VERSION_CXL20,
+	},
+	.chbs[1] = {
+		.header = {
+			.type = ACPI_CEDT_TYPE_CHBS,
+			.length = sizeof(mock_cedt.chbs[0]),
+		},
+		.uid = 1,
+		.cxl_version = ACPI_CEDT_CHBS_VERSION_CXL20,
+	},
+	.chbs[2] = {
+		.header = {
+			.type = ACPI_CEDT_TYPE_CHBS,
+			.length = sizeof(mock_cedt.chbs[0]),
+		},
+		.uid = 2,
+		.cxl_version = ACPI_CEDT_CHBS_VERSION_CXL20,
+	},
+	.chbs[3] = {
+		.header = {
+			.type = ACPI_CEDT_TYPE_CHBS,
+			.length = sizeof(mock_cedt.chbs[0]),
+		},
+		.uid = 3,
+		.cxl_version = ACPI_CEDT_CHBS_VERSION_CXL20,
+	},
+	.cfmws0 = {
+		.cfmws = {
+			.header = {
+				.type = ACPI_CEDT_TYPE_CFMWS,
+				.length = sizeof(mock_cedt.cfmws0),
+			},
+			.interleave_ways = 0,
+			.granularity = 4,
+			.restrictions = ACPI_CEDT_CFMWS_RESTRICT_TYPE3 |
+					ACPI_CEDT_CFMWS_RESTRICT_VOLATILE,
+			.qtg_id = 0,
+			.window_size = SZ_256M,
+		},
+		.target = { 0 },
+	},
+	.cfmws1 = {
+		.cfmws = {
+			.header = {
+				.type = ACPI_CEDT_TYPE_CFMWS,
+				.length = sizeof(mock_cedt.cfmws1),
+			},
+			.interleave_ways = 2,
+			.granularity = 4,
+			.restrictions = ACPI_CEDT_CFMWS_RESTRICT_TYPE3 |
+					ACPI_CEDT_CFMWS_RESTRICT_VOLATILE,
+			.qtg_id = 1,
+			.window_size = SZ_256M * 4,
+		},
+		.target = { 0, 1, 2, 3 },
+	},
+	.cfmws2 = {
+		.cfmws = {
+			.header = {
+				.type = ACPI_CEDT_TYPE_CFMWS,
+				.length = sizeof(mock_cedt.cfmws2),
+			},
+			.interleave_ways = 0,
+			.granularity = 4,
+			.restrictions = ACPI_CEDT_CFMWS_RESTRICT_TYPE3 |
+					ACPI_CEDT_CFMWS_RESTRICT_PMEM,
+			.qtg_id = 2,
+			.window_size = SZ_256M,
+		},
+		.target = { 0 },
+	},
+	.cfmws3 = {
+		.cfmws = {
+			.header = {
+				.type = ACPI_CEDT_TYPE_CFMWS,
+				.length = sizeof(mock_cedt.cfmws3),
+			},
+			.interleave_ways = 2,
+			.granularity = 4,
+			.restrictions = ACPI_CEDT_CFMWS_RESTRICT_TYPE3 |
+					ACPI_CEDT_CFMWS_RESTRICT_PMEM,
+			.qtg_id = 3,
+			.window_size = SZ_256M * 4,
+		},
+		.target = { 0, 1, 2, 3 },
+	},
+};
+
+struct cxl_mock_res {
+	struct list_head list;
+	struct range range;
+};
+
+static LIST_HEAD(mock_res);
+static DEFINE_MUTEX(mock_res_lock);
+static struct gen_pool *cxl_mock_pool;
+
+static void depopulate_all_mock_resources(void)
+{
+	struct cxl_mock_res *res, *_res;
+
+	mutex_lock(&mock_res_lock);
+	list_for_each_entry_safe(res, _res, &mock_res, list) {
+		gen_pool_free(cxl_mock_pool, res->range.start,
+			      range_len(&res->range));
+		list_del(&res->list);
+		kfree(res);
+	}
+	mutex_unlock(&mock_res_lock);
+}
+
+static struct cxl_mock_res *alloc_mock_res(resource_size_t size)
+{
+	struct cxl_mock_res *res = kzalloc(sizeof(*res), GFP_KERNEL);
+	struct genpool_data_align data = {
+		.align = SZ_256M,
+	};
+	unsigned long phys;
+
+	if (!res)
+		return NULL;
+
+	INIT_LIST_HEAD(&res->list);
+	phys = gen_pool_alloc_algo(cxl_mock_pool, size,
+				   gen_pool_first_fit_align, &data);
+	if (!phys) {
+		kfree(res);
+		return NULL;
+	}
+
+	res->range = (struct range) {
+		.start = phys,
+		.end = phys + size - 1,
+	};
+	mutex_lock(&mock_res_lock);
+	list_add(&res->list, &mock_res);
+	mutex_unlock(&mock_res_lock);
+
+	return res;
+}
+
+static int populate_cedt(void)
+{
+	struct acpi_cedt_cfmws *cfmws[4] = {
+		[0] = &mock_cedt.cfmws0.cfmws,
+		[1] = &mock_cedt.cfmws1.cfmws,
+		[2] = &mock_cedt.cfmws2.cfmws,
+		[3] = &mock_cedt.cfmws3.cfmws,
+	};
+	struct cxl_mock_res *res;
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(mock_cedt.chbs); i++) {
+		struct acpi_cedt_chbs *chbs = &mock_cedt.chbs[i];
+		resource_size_t size;
+
+		if (chbs->cxl_version == ACPI_CEDT_CHBS_VERSION_CXL20)
+			size = ACPI_CEDT_CHBS_LENGTH_CXL20;
+		else
+			size = ACPI_CEDT_CHBS_LENGTH_CXL11;
+
+		res = alloc_mock_res(size);
+		if (!res)
+			return -ENOMEM;
+		chbs->base = res->range.start;
+		chbs->length = size;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(cfmws); i++) {
+		struct acpi_cedt_cfmws *window = cfmws[i];
+
+		res = alloc_mock_res(window->window_size);
+		if (!res)
+			return -ENOMEM;
+		window->base_hpa = res->range.start;
+	}
+
+	return 0;
+}
+
+static acpi_status mock_acpi_get_table(char *signature, u32 instance,
+				       struct acpi_table_header **out_table)
+{
+	if (instance < U32_MAX || strcmp(signature, ACPI_SIG_CEDT) != 0)
+		return acpi_get_table(signature, instance, out_table);
+
+	*out_table = (struct acpi_table_header *) &mock_cedt;
+	return AE_OK;
+}
+
+static void mock_acpi_put_table(struct acpi_table_header *table)
+{
+	if (table == (struct acpi_table_header *) &mock_cedt)
+		return;
+	acpi_put_table(table);
+}
+
+static bool is_mock_bridge(struct device *dev)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(cxl_host_bridge); i++)
+		if (dev == &cxl_host_bridge[i]->dev)
+			return true;
+
+	return false;
+}
+
+static int host_bridge_index(struct acpi_device *adev)
+{
+	return adev - host_bridge;
+}
+
+static struct acpi_device *find_host_bridge(acpi_handle handle)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(host_bridge); i++)
+		if (handle == host_bridge[i].handle)
+			return &host_bridge[i];
+	return NULL;
+}
+
+static acpi_status
+mock_acpi_evaluate_integer(acpi_handle handle, acpi_string pathname,
+			   struct acpi_object_list *arguments,
+			   unsigned long long *data)
+{
+	struct acpi_device *adev = find_host_bridge(handle);
+
+	if (!adev || strcmp(pathname, METHOD_NAME__UID) != 0)
+		return acpi_evaluate_integer(handle, pathname, arguments, data);
+
+	*data = host_bridge_index(adev);
+	return AE_OK;
+}
+
+static struct pci_bus mock_pci_bus[NR_CXL_HOST_BRIDGES];
+static struct acpi_pci_root mock_pci_root[NR_CXL_HOST_BRIDGES] = {
+	[0] = {
+		.bus = &mock_pci_bus[0],
+	},
+	[1] = {
+		.bus = &mock_pci_bus[1],
+	},
+	[2] = {
+		.bus = &mock_pci_bus[2],
+	},
+	[3] = {
+		.bus = &mock_pci_bus[3],
+	},
+};
+
+static struct platform_device *mock_cxl_root_port(struct pci_bus *bus, int index)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(mock_pci_bus); i++)
+		if (bus == &mock_pci_bus[i])
+			return cxl_root_port[index + i * NR_CXL_ROOT_PORTS];
+	return NULL;
+}
+
+static bool is_mock_port(struct platform_device *pdev)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(cxl_root_port); i++)
+		if (pdev == cxl_root_port[i])
+			return true;
+	return false;
+}
+
+static bool is_mock_bus(struct pci_bus *bus)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(mock_pci_bus); i++)
+		if (bus == &mock_pci_bus[i])
+			return true;
+	return false;
+}
+
+static struct acpi_pci_root *mock_acpi_pci_find_root(acpi_handle handle)
+{
+	struct acpi_device *adev = find_host_bridge(handle);
+
+	if (!adev)
+		return acpi_pci_find_root(handle);
+	return &mock_pci_root[host_bridge_index(adev)];
+}
+
+static struct cxl_mock_ops cxl_mock_ops = {
+	.is_mock_adev = is_mock_adev,
+	.is_mock_bridge = is_mock_bridge,
+	.is_mock_bus = is_mock_bus,
+	.is_mock_port = is_mock_port,
+	.is_mock_dev = is_mock_dev,
+	.mock_port = mock_cxl_root_port,
+	.acpi_get_table = mock_acpi_get_table,
+	.acpi_put_table = mock_acpi_put_table,
+	.acpi_evaluate_integer = mock_acpi_evaluate_integer,
+	.acpi_pci_find_root = mock_acpi_pci_find_root,
+	.list = LIST_HEAD_INIT(cxl_mock_ops.list),
+};
+
+static void mock_companion(struct acpi_device *adev, struct device *dev)
+{
+	device_initialize(&adev->dev);
+	fwnode_init(&adev->fwnode, NULL);
+	dev->fwnode = &adev->fwnode;
+	adev->fwnode.dev = dev;
+}
+
+#ifndef SZ_64G
+#define SZ_64G (SZ_32G * 2)
+#endif
+
+#ifndef SZ_512G
+#define SZ_512G (SZ_64G * 8)
+#endif
+
+static __init int cxl_test_init(void)
+{
+	int rc, i;
+
+	register_cxl_mock_ops(&cxl_mock_ops);
+
+	cxl_mock_pool = gen_pool_create(ilog2(SZ_2M), NUMA_NO_NODE);
+	if (!cxl_mock_pool) {
+		rc = -ENOMEM;
+		goto err_gen_pool_create;
+	}
+
+	rc = gen_pool_add(cxl_mock_pool, SZ_512G, SZ_64G, NUMA_NO_NODE);
+	if (rc)
+		goto err_gen_pool_add;
+
+	rc = populate_cedt();
+	if (rc)
+		goto err_populate;
+
+	for (i = 0; i < ARRAY_SIZE(cxl_host_bridge); i++) {
+		struct acpi_device *adev = &host_bridge[i];
+		struct platform_device *pdev;
+
+		pdev = platform_device_alloc("cxl_host_bridge", i);
+		if (!pdev) {
+			rc = -ENOMEM;
+			goto err_bridge;
+		}
+
+		mock_companion(adev, &pdev->dev);
+		rc = platform_device_add(pdev);
+		if (rc) {
+			platform_device_put(pdev);
+			goto err_bridge;
+		}
+		cxl_host_bridge[i] = pdev;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(cxl_root_port); i++) {
+		struct platform_device *bridge =
+			cxl_host_bridge[i / NR_CXL_ROOT_PORTS];
+		struct platform_device *pdev;
+
+		pdev = platform_device_alloc("cxl_root_port", i);
+		if (!pdev) {
+			rc = -ENOMEM;
+			goto err_port;
+		}
+		pdev->dev.parent = &bridge->dev;
+
+		rc = platform_device_add(pdev);
+		if (rc) {
+			platform_device_put(pdev);
+			goto err_port;
+		}
+		cxl_root_port[i] = pdev;
+	}
+
+	cxl_acpi = platform_device_alloc("cxl_acpi", 0);
+	if (!cxl_acpi) {
+		rc = -ENOMEM;
+		goto err_port;
+	}
+
+	mock_companion(&acpi0017_mock, &cxl_acpi->dev);
+	acpi0017_mock.dev.bus = &platform_bus_type;
+
+	rc = platform_device_add(cxl_acpi);
+	if (rc)
+		goto err_add;
+
+	return 0;
+
+err_add:
+	platform_device_put(cxl_acpi);
+err_port:
+	for (i = ARRAY_SIZE(cxl_root_port) - 1; i >= 0; i--)
+		platform_device_unregister(cxl_root_port[i]);
+err_bridge:
+	for (i = ARRAY_SIZE(cxl_host_bridge) - 1; i >= 0; i--)
+		platform_device_unregister(cxl_host_bridge[i]);
+err_populate:
+	depopulate_all_mock_resources();
+err_gen_pool_add:
+	gen_pool_destroy(cxl_mock_pool);
+err_gen_pool_create:
+	unregister_cxl_mock_ops(&cxl_mock_ops);
+	return rc;
+}
+
+static __exit void cxl_test_exit(void)
+{
+	int i;
+
+	platform_device_unregister(cxl_acpi);
+	for (i = ARRAY_SIZE(cxl_root_port) - 1; i >= 0; i--)
+		platform_device_unregister(cxl_root_port[i]);
+	for (i = ARRAY_SIZE(cxl_host_bridge) - 1; i >= 0; i--)
+		platform_device_unregister(cxl_host_bridge[i]);
+	depopulate_all_mock_resources();
+	gen_pool_destroy(cxl_mock_pool);
+	unregister_cxl_mock_ops(&cxl_mock_ops);
+}
+
+module_init(cxl_test_init);
+module_exit(cxl_test_exit);
+MODULE_LICENSE("GPL v2");
diff --git a/tools/testing/cxl/test/mock.c b/tools/testing/cxl/test/mock.c
new file mode 100644
index 000000000000..b8c108abcf07
--- /dev/null
+++ b/tools/testing/cxl/test/mock.c
@@ -0,0 +1,171 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright(c) 2021 Intel Corporation. All rights reserved.
+
+#include <linux/libnvdimm.h>
+#include <linux/rculist.h>
+#include <linux/device.h>
+#include <linux/export.h>
+#include <linux/acpi.h>
+#include <linux/pci.h>
+#include "mock.h"
+
+static LIST_HEAD(mock);
+
+void register_cxl_mock_ops(struct cxl_mock_ops *ops)
+{
+	list_add_rcu(&ops->list, &mock);
+}
+EXPORT_SYMBOL_GPL(register_cxl_mock_ops);
+
+static DEFINE_SRCU(cxl_mock_srcu);
+
+void unregister_cxl_mock_ops(struct cxl_mock_ops *ops)
+{
+	list_del_rcu(&ops->list);
+	synchronize_srcu(&cxl_mock_srcu);
+}
+EXPORT_SYMBOL_GPL(unregister_cxl_mock_ops);
+
+struct cxl_mock_ops *get_cxl_mock_ops(int *index)
+{
+	*index = srcu_read_lock(&cxl_mock_srcu);
+	return list_first_or_null_rcu(&mock, struct cxl_mock_ops, list);
+}
+EXPORT_SYMBOL_GPL(get_cxl_mock_ops);
+
+void put_cxl_mock_ops(int index)
+{
+	srcu_read_unlock(&cxl_mock_srcu, index);
+}
+EXPORT_SYMBOL_GPL(put_cxl_mock_ops);
+
+bool __wrap_is_acpi_device_node(const struct fwnode_handle *fwnode)
+{
+	struct acpi_device *adev =
+		container_of(fwnode, struct acpi_device, fwnode);
+	int index;
+	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
+	bool retval = false;
+
+	if (ops)
+		retval = ops->is_mock_adev(adev);
+
+	if (!retval)
+		retval = is_acpi_device_node(fwnode);
+
+	put_cxl_mock_ops(index);
+	return retval;
+}
+EXPORT_SYMBOL(__wrap_is_acpi_device_node);
+
+acpi_status __wrap_acpi_get_table(char *signature, u32 instance,
+				  struct acpi_table_header **out_table)
+{
+	int index;
+	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
+	acpi_status status;
+
+	if (ops)
+		status = ops->acpi_get_table(signature, instance, out_table);
+	else
+		status = acpi_get_table(signature, instance, out_table);
+
+	put_cxl_mock_ops(index);
+
+	return status;
+}
+EXPORT_SYMBOL(__wrap_acpi_get_table);
+
+void __wrap_acpi_put_table(struct acpi_table_header *table)
+{
+	int index;
+	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
+
+	if (ops)
+		ops->acpi_put_table(table);
+	else
+		acpi_put_table(table);
+	put_cxl_mock_ops(index);
+}
+EXPORT_SYMBOL(__wrap_acpi_put_table);
+
+acpi_status __wrap_acpi_evaluate_integer(acpi_handle handle,
+					 acpi_string pathname,
+					 struct acpi_object_list *arguments,
+					 unsigned long long *data)
+{
+	int index;
+	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
+	acpi_status status;
+
+	if (ops)
+		status = ops->acpi_evaluate_integer(handle, pathname, arguments,
+						    data);
+	else
+		status = acpi_evaluate_integer(handle, pathname, arguments,
+					       data);
+	put_cxl_mock_ops(index);
+
+	return status;
+}
+EXPORT_SYMBOL(__wrap_acpi_evaluate_integer);
+
+struct acpi_pci_root *__wrap_acpi_pci_find_root(acpi_handle handle)
+{
+	int index;
+	struct acpi_pci_root *root;
+	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
+
+	if (ops)
+		root = ops->acpi_pci_find_root(handle);
+	else
+		root = acpi_pci_find_root(handle);
+
+	put_cxl_mock_ops(index);
+
+	return root;
+}
+EXPORT_SYMBOL_GPL(__wrap_acpi_pci_find_root);
+
+void __wrap_pci_walk_bus(struct pci_bus *bus,
+			 int (*cb)(struct pci_dev *, void *), void *userdata)
+{
+	int index;
+	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
+
+	if (ops && ops->is_mock_bus(bus)) {
+		int rc, i;
+
+		/*
+		 * Simulate 2 root ports per host-bridge and no
+		 * depth recursion.
+		 */
+		for (i = 0; i < 2; i++) {
+			rc = cb((struct pci_dev *) ops->mock_port(bus, i),
+				userdata);
+			if (rc)
+				break;
+		}
+	} else
+		pci_walk_bus(bus, cb, userdata);
+
+	put_cxl_mock_ops(index);
+}
+EXPORT_SYMBOL_GPL(__wrap_pci_walk_bus);
+
+struct nvdimm_bus *
+__wrap_nvdimm_bus_register(struct device *dev,
+			   struct nvdimm_bus_descriptor *nd_desc)
+{
+	int index;
+	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
+
+	if (ops && ops->is_mock_dev(dev->parent->parent))
+		nd_desc->provider_name = "cxl_test";
+	put_cxl_mock_ops(index);
+
+	return nvdimm_bus_register(dev, nd_desc);
+}
+EXPORT_SYMBOL_GPL(__wrap_nvdimm_bus_register);
+
+MODULE_LICENSE("GPL v2");
diff --git a/tools/testing/cxl/test/mock.h b/tools/testing/cxl/test/mock.h
new file mode 100644
index 000000000000..805a94cb3fbe
--- /dev/null
+++ b/tools/testing/cxl/test/mock.h
@@ -0,0 +1,27 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#include <linux/list.h>
+#include <linux/acpi.h>
+
+struct cxl_mock_ops {
+	struct list_head list;
+	bool (*is_mock_adev)(struct acpi_device *dev);
+	acpi_status (*acpi_get_table)(char *signature, u32 instance,
+				      struct acpi_table_header **out_table);
+	void (*acpi_put_table)(struct acpi_table_header *table);
+	bool (*is_mock_bridge)(struct device *dev);
+	acpi_status (*acpi_evaluate_integer)(acpi_handle handle,
+					     acpi_string pathname,
+					     struct acpi_object_list *arguments,
+					     unsigned long long *data);
+	struct acpi_pci_root *(*acpi_pci_find_root)(acpi_handle handle);
+	struct platform_device *(*mock_port)(struct pci_bus *bus, int index);
+	bool (*is_mock_bus)(struct pci_bus *bus);
+	bool (*is_mock_port)(struct platform_device *pdev);
+	bool (*is_mock_dev)(struct device *dev);
+};
+
+void register_cxl_mock_ops(struct cxl_mock_ops *ops);
+void unregister_cxl_mock_ops(struct cxl_mock_ops *ops);
+struct cxl_mock_ops *get_cxl_mock_ops(int *index);
+void put_cxl_mock_ops(int index);



* [PATCH v5 21/21] cxl/core: Split decoder setup into alloc + add
  2021-09-09  5:13 ` [PATCH v4 21/21] cxl/core: Split decoder setup into alloc + add Dan Williams
  2021-09-10 10:33   ` Jonathan Cameron
@ 2021-09-14 19:31   ` Dan Williams
  2021-09-21 14:24     ` Ben Widawsky
  2021-09-21 19:22     ` [PATCH v6 " Dan Williams
  1 sibling, 2 replies; 78+ messages in thread
From: Dan Williams @ 2021-09-14 19:31 UTC (permalink / raw)
  To: linux-cxl
  Cc: kernel test robot, Nathan Chancellor, Dan Carpenter, nvdimm,
	Jonathan.Cameron

The kbuild robot reports:

    drivers/cxl/core/bus.c:516:1: warning: stack frame size (1032) exceeds
    limit (1024) in function 'devm_cxl_add_decoder'

It is also the case that devm_cxl_add_decoder() is unwieldy to use for
all the different decoder types. Fix the stack usage by splitting the
creation into alloc and add steps. This also allows for context-specific
construction before adding.

With the split the caller is responsible for registering a devm callback
to trigger device_unregister() for the decoder rather than it being
implicit in the decoder registration. I.e. the routine that calls alloc
is responsible for calling put_device() if the "add" operation fails.
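
For reference, the calling convention after the split, reduced to a
minimal sketch (error handling abbreviated; port, host, nr_targets, and
target_map stand in for the call-site context visible in the hunks
below):

	struct cxl_decoder *cxld;
	int rc;

	cxld = cxl_decoder_alloc(port, nr_targets);
	if (IS_ERR(cxld))
		return PTR_ERR(cxld);

	/* context-specific construction happens between alloc and add */
	cxld->interleave_ways = 1;
	cxld->interleave_granularity = PAGE_SIZE;
	cxld->target_type = CXL_DECODER_EXPANDER;

	rc = cxl_decoder_add(cxld, target_map);
	if (rc) {
		/* the alloc caller owns the put_device() on failure */
		put_device(&cxld->dev);
		return rc;
	}

	/* devm-managed device_unregister() on host driver unbind */
	return cxl_decoder_autoremove(host, cxld);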

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
Changes since v4:
- hold the device lock over the list_empty(&port->dports) check
  (Jonathan)
- move the list_empty() check after the check for NULL @target_map in
  anticipation of endpoint decoders (Ben)

 drivers/cxl/acpi.c      |   84 +++++++++++++++++++++++---------
 drivers/cxl/core/bus.c  |  123 +++++++++++++++--------------------------------
 drivers/cxl/core/core.h |    5 --
 drivers/cxl/core/pmem.c |    7 ++-
 drivers/cxl/cxl.h       |   15 ++----
 5 files changed, 110 insertions(+), 124 deletions(-)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index d39cc797a64e..2368a8b67698 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -82,7 +82,6 @@ static void cxl_add_cfmws_decoders(struct device *dev,
 	struct cxl_decoder *cxld;
 	acpi_size len, cur = 0;
 	void *cedt_subtable;
-	unsigned long flags;
 	int rc;
 
 	len = acpi_cedt->length - sizeof(*acpi_cedt);
@@ -119,24 +118,36 @@ static void cxl_add_cfmws_decoders(struct device *dev,
 		for (i = 0; i < CFMWS_INTERLEAVE_WAYS(cfmws); i++)
 			target_map[i] = cfmws->interleave_targets[i];
 
-		flags = cfmws_to_decoder_flags(cfmws->restrictions);
-		cxld = devm_cxl_add_decoder(dev, root_port,
-					    CFMWS_INTERLEAVE_WAYS(cfmws),
-					    cfmws->base_hpa, cfmws->window_size,
-					    CFMWS_INTERLEAVE_WAYS(cfmws),
-					    CFMWS_INTERLEAVE_GRANULARITY(cfmws),
-					    CXL_DECODER_EXPANDER,
-					    flags, target_map);
-
-		if (IS_ERR(cxld)) {
+		cxld = cxl_decoder_alloc(root_port,
+					 CFMWS_INTERLEAVE_WAYS(cfmws));
+		if (IS_ERR(cxld))
+			goto next;
+
+		cxld->flags = cfmws_to_decoder_flags(cfmws->restrictions);
+		cxld->target_type = CXL_DECODER_EXPANDER;
+		cxld->range = (struct range) {
+			.start = cfmws->base_hpa,
+			.end = cfmws->base_hpa + cfmws->window_size - 1,
+		};
+		cxld->interleave_ways = CFMWS_INTERLEAVE_WAYS(cfmws);
+		cxld->interleave_granularity =
+			CFMWS_INTERLEAVE_GRANULARITY(cfmws);
+
+		rc = cxl_decoder_add(cxld, target_map);
+		if (rc)
+			put_device(&cxld->dev);
+		else
+			rc = cxl_decoder_autoremove(dev, cxld);
+		if (rc) {
 			dev_err(dev, "Failed to add decoder for %#llx-%#llx\n",
 				cfmws->base_hpa, cfmws->base_hpa +
 				cfmws->window_size - 1);
-		} else {
-			dev_dbg(dev, "add: %s range %#llx-%#llx\n",
-				dev_name(&cxld->dev), cfmws->base_hpa,
-				 cfmws->base_hpa + cfmws->window_size - 1);
+			goto next;
 		}
+		dev_dbg(dev, "add: %s range %#llx-%#llx\n",
+			dev_name(&cxld->dev), cfmws->base_hpa,
+			cfmws->base_hpa + cfmws->window_size - 1);
+next:
 		cur += c->length;
 	}
 }
@@ -266,6 +277,7 @@ static int add_host_bridge_uport(struct device *match, void *arg)
 	struct acpi_device *bridge = to_cxl_host_bridge(host, match);
 	struct acpi_pci_root *pci_root;
 	struct cxl_walk_context ctx;
+	int single_port_map[1], rc;
 	struct cxl_decoder *cxld;
 	struct cxl_dport *dport;
 	struct cxl_port *port;
@@ -301,22 +313,46 @@ static int add_host_bridge_uport(struct device *match, void *arg)
 		return -ENODEV;
 	if (ctx.error)
 		return ctx.error;
+	if (ctx.count > 1)
+		return 0;
 
 	/* TODO: Scan CHBCR for HDM Decoder resources */
 
 	/*
-	 * In the single-port host-bridge case there are no HDM decoders
-	 * in the CHBCR and a 1:1 passthrough decode is implied.
+	 * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability
+	 * Structure) single-ported host-bridges need not publish a decoder
+	 * capability when a passthrough decode can be assumed, i.e. all
+	 * transactions that the uport sees are claimed and passed to the single
+	 * dport. Default the range to 0-base, 0-length until the first CXL
+	 * region is activated.
 	 */
-	if (ctx.count == 1) {
-		cxld = devm_cxl_add_passthrough_decoder(host, port);
-		if (IS_ERR(cxld))
-			return PTR_ERR(cxld);
+	cxld = cxl_decoder_alloc(port, 1);
+	if (IS_ERR(cxld))
+		return PTR_ERR(cxld);
+
+	cxld->interleave_ways = 1;
+	cxld->interleave_granularity = PAGE_SIZE;
+	cxld->target_type = CXL_DECODER_EXPANDER;
+	cxld->range = (struct range) {
+		.start = 0,
+		.end = -1,
+	};
 
-		dev_dbg(host, "add: %s\n", dev_name(&cxld->dev));
-	}
+	device_lock(&port->dev);
+	dport = list_first_entry(&port->dports, typeof(*dport), list);
+	device_unlock(&port->dev);
 
-	return 0;
+	single_port_map[0] = dport->port_id;
+
+	rc = cxl_decoder_add(cxld, single_port_map);
+	if (rc)
+		put_device(&cxld->dev);
+	else
+		rc = cxl_decoder_autoremove(host, cxld);
+
+	if (rc == 0)
+		dev_dbg(host, "add: %s\n", dev_name(&cxld->dev));
+	return rc;
 }
 
 static int add_host_bridge_dport(struct device *match, void *arg)
diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
index 6dfdeaf999f0..396252749477 100644
--- a/drivers/cxl/core/bus.c
+++ b/drivers/cxl/core/bus.c
@@ -453,10 +453,8 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport_dev, int port_id,
 }
 EXPORT_SYMBOL_GPL(cxl_add_dport);
 
-static int decoder_populate_targets(struct device *host,
-				    struct cxl_decoder *cxld,
-				    struct cxl_port *port, int *target_map,
-				    int nr_targets)
+static int decoder_populate_targets(struct cxl_decoder *cxld,
+				    struct cxl_port *port, int *target_map)
 {
 	int rc = 0, i;
 
@@ -464,42 +462,36 @@ static int decoder_populate_targets(struct device *host,
 		return 0;
 
 	device_lock(&port->dev);
-	for (i = 0; i < nr_targets; i++) {
+	if (list_empty(&port->dports)) {
+		rc = -EINVAL;
+		goto out_unlock;
+	}
+
+	for (i = 0; i < cxld->nr_targets; i++) {
 		struct cxl_dport *dport = find_dport(port, target_map[i]);
 
 		if (!dport) {
 			rc = -ENXIO;
-			break;
+			goto out_unlock;
 		}
-		dev_dbg(host, "%s: target: %d\n", dev_name(dport->dport), i);
 		cxld->target[i] = dport;
 	}
+
+out_unlock:
 	device_unlock(&port->dev);
 
 	return rc;
 }
 
-static struct cxl_decoder *
-cxl_decoder_alloc(struct device *host, struct cxl_port *port, int nr_targets,
-		  resource_size_t base, resource_size_t len,
-		  int interleave_ways, int interleave_granularity,
-		  enum cxl_decoder_type type, unsigned long flags,
-		  int *target_map)
+struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
 {
 	struct cxl_decoder *cxld;
 	struct device *dev;
 	int rc = 0;
 
-	if (interleave_ways < 1)
+	if (nr_targets > CXL_DECODER_MAX_INTERLEAVE || nr_targets < 1)
 		return ERR_PTR(-EINVAL);
 
-	device_lock(&port->dev);
-	if (list_empty(&port->dports))
-		rc = -EINVAL;
-	device_unlock(&port->dev);
-	if (rc)
-		return ERR_PTR(rc);
-
 	cxld = kzalloc(struct_size(cxld, target, nr_targets), GFP_KERNEL);
 	if (!cxld)
 		return ERR_PTR(-ENOMEM);
@@ -508,22 +500,8 @@ cxl_decoder_alloc(struct device *host, struct cxl_port *port, int nr_targets,
 	if (rc < 0)
 		goto err;
 
-	*cxld = (struct cxl_decoder) {
-		.id = rc,
-		.range = {
-			.start = base,
-			.end = base + len - 1,
-		},
-		.flags = flags,
-		.interleave_ways = interleave_ways,
-		.interleave_granularity = interleave_granularity,
-		.target_type = type,
-	};
-
-	rc = decoder_populate_targets(host, cxld, port, target_map, nr_targets);
-	if (rc)
-		goto err;
-
+	cxld->id = rc;
+	cxld->nr_targets = nr_targets;
 	dev = &cxld->dev;
 	device_initialize(dev);
 	device_set_pm_not_required(dev);
@@ -541,72 +519,47 @@ cxl_decoder_alloc(struct device *host, struct cxl_port *port, int nr_targets,
 	kfree(cxld);
 	return ERR_PTR(rc);
 }
+EXPORT_SYMBOL_GPL(cxl_decoder_alloc);
 
-struct cxl_decoder *
-devm_cxl_add_decoder(struct device *host, struct cxl_port *port, int nr_targets,
-		     resource_size_t base, resource_size_t len,
-		     int interleave_ways, int interleave_granularity,
-		     enum cxl_decoder_type type, unsigned long flags,
-		     int *target_map)
+int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
 {
-	struct cxl_decoder *cxld;
+	struct cxl_port *port;
 	struct device *dev;
 	int rc;
 
-	if (nr_targets > CXL_DECODER_MAX_INTERLEAVE)
-		return ERR_PTR(-EINVAL);
+	if (!cxld)
+		return -EINVAL;
 
-	cxld = cxl_decoder_alloc(host, port, nr_targets, base, len,
-				 interleave_ways, interleave_granularity, type,
-				 flags, target_map);
 	if (IS_ERR(cxld))
-		return cxld;
+		return PTR_ERR(cxld);
 
-	dev = &cxld->dev;
-	rc = dev_set_name(dev, "decoder%d.%d", port->id, cxld->id);
-	if (rc)
-		goto err;
+	if (cxld->interleave_ways < 1)
+		return -EINVAL;
 
-	rc = device_add(dev);
+	port = to_cxl_port(cxld->dev.parent);
+	rc = decoder_populate_targets(cxld, port, target_map);
 	if (rc)
-		goto err;
+		return rc;
 
-	rc = devm_add_action_or_reset(host, unregister_cxl_dev, dev);
+	dev = &cxld->dev;
+	rc = dev_set_name(dev, "decoder%d.%d", port->id, cxld->id);
 	if (rc)
-		return ERR_PTR(rc);
-	return cxld;
+		return rc;
 
-err:
-	put_device(dev);
-	return ERR_PTR(rc);
+	return device_add(dev);
 }
-EXPORT_SYMBOL_GPL(devm_cxl_add_decoder);
+EXPORT_SYMBOL_GPL(cxl_decoder_add);
 
-/*
- * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability Structure)
- * single ported host-bridges need not publish a decoder capability when a
- * passthrough decode can be assumed, i.e. all transactions that the uport sees
- * are claimed and passed to the single dport. Default the range a 0-base
- * 0-length until the first CXL region is activated.
- */
-struct cxl_decoder *devm_cxl_add_passthrough_decoder(struct device *host,
-						     struct cxl_port *port)
+static void cxld_unregister(void *dev)
 {
-	struct cxl_dport *dport;
-	int target_map[1];
-
-	device_lock(&port->dev);
-	dport = list_first_entry_or_null(&port->dports, typeof(*dport), list);
-	device_unlock(&port->dev);
-
-	if (!dport)
-		return ERR_PTR(-ENXIO);
+	device_unregister(dev);
+}
 
-	target_map[0] = dport->port_id;
-	return devm_cxl_add_decoder(host, port, 1, 0, 0, 1, PAGE_SIZE,
-				    CXL_DECODER_EXPANDER, 0, target_map);
+int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld)
+{
+	return devm_add_action_or_reset(host, cxld_unregister, &cxld->dev);
 }
-EXPORT_SYMBOL_GPL(devm_cxl_add_passthrough_decoder);
+EXPORT_SYMBOL_GPL(cxl_decoder_autoremove);
 
 /**
  * __cxl_driver_register - register a driver for the cxl bus
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index c85b7fbad02d..e0c9aacc4e9c 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -9,11 +9,6 @@ extern const struct device_type cxl_nvdimm_type;
 
 extern struct attribute_group cxl_base_attribute_group;
 
-static inline void unregister_cxl_dev(void *dev)
-{
-	device_unregister(dev);
-}
-
 struct cxl_send_command;
 struct cxl_mem_query_commands;
 int cxl_query_cmd(struct cxl_memdev *cxlmd,
diff --git a/drivers/cxl/core/pmem.c b/drivers/cxl/core/pmem.c
index 74be5132df1c..5032f4c1c69d 100644
--- a/drivers/cxl/core/pmem.c
+++ b/drivers/cxl/core/pmem.c
@@ -222,6 +222,11 @@ static struct cxl_nvdimm *cxl_nvdimm_alloc(struct cxl_memdev *cxlmd)
 	return cxl_nvd;
 }
 
+static void cxl_nvd_unregister(void *dev)
+{
+	device_unregister(dev);
+}
+
 /**
  * devm_cxl_add_nvdimm() - add a bridge between a cxl_memdev and an nvdimm
  * @host: same host as @cxlmd
@@ -251,7 +256,7 @@ int devm_cxl_add_nvdimm(struct device *host, struct cxl_memdev *cxlmd)
 	dev_dbg(host, "%s: register %s\n", dev_name(dev->parent),
 		dev_name(dev));
 
-	return devm_add_action_or_reset(host, unregister_cxl_dev, dev);
+	return devm_add_action_or_reset(host, cxl_nvd_unregister, dev);
 
 err:
 	put_device(dev);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 9af5745ba2c0..7d6b011dd963 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -195,6 +195,7 @@ enum cxl_decoder_type {
  * @interleave_granularity: data stride per dport
  * @target_type: accelerator vs expander (type2 vs type3) selector
  * @flags: memory type capabilities and locking
+ * @nr_targets: number of elements in @target
  * @target: active ordered target list in current decoder configuration
  */
 struct cxl_decoder {
@@ -205,6 +206,7 @@ struct cxl_decoder {
 	int interleave_granularity;
 	enum cxl_decoder_type target_type;
 	unsigned long flags;
+	int nr_targets;
 	struct cxl_dport *target[];
 };
 
@@ -286,15 +288,10 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport, int port_id,
 
 struct cxl_decoder *to_cxl_decoder(struct device *dev);
 bool is_root_decoder(struct device *dev);
-struct cxl_decoder *
-devm_cxl_add_decoder(struct device *host, struct cxl_port *port, int nr_targets,
-		     resource_size_t base, resource_size_t len,
-		     int interleave_ways, int interleave_granularity,
-		     enum cxl_decoder_type type, unsigned long flags,
-		     int *target_map);
-
-struct cxl_decoder *devm_cxl_add_passthrough_decoder(struct device *host,
-						     struct cxl_port *port);
+struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets);
+int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map);
+int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld);
+
 extern struct bus_type cxl_bus_type;
 
 struct cxl_driver {



* Re: [PATCH v5 21/21] cxl/core: Split decoder setup into alloc + add
  2021-09-14 19:31   ` [PATCH v5 " Dan Williams
@ 2021-09-21 14:24     ` Ben Widawsky
  2021-09-21 16:18       ` Dan Williams
  2021-09-21 19:22     ` [PATCH v6 " Dan Williams
  1 sibling, 1 reply; 78+ messages in thread
From: Ben Widawsky @ 2021-09-21 14:24 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, kernel test robot, Nathan Chancellor, Dan Carpenter,
	nvdimm, Jonathan.Cameron

On 21-09-14 12:31:22, Dan Williams wrote:
> The kbuild robot reports:
> 
>     drivers/cxl/core/bus.c:516:1: warning: stack frame size (1032) exceeds
>     limit (1024) in function 'devm_cxl_add_decoder'
> 
> It is also the case that devm_cxl_add_decoder() is unwieldy to use for
> all the different decoder types. Fix the stack usage by splitting the
> creation into alloc and add steps. This also allows for context-specific
> construction before adding.
> 
> With the split the caller is responsible for registering a devm callback
> to trigger device_unregister() for the decoder rather than it being
> implicit in the decoder registration. I.e. the routine that calls alloc
> is responsible for calling put_device() if the "add" operation fails.
> 
> Reported-by: kernel test robot <lkp@intel.com>
> Reported-by: Nathan Chancellor <nathan@kernel.org>
> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

I have some comments inline. You can take them or leave them. Hopefully you can
pull in my patch to document these after too.

Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>

> ---
> Changes since v4:
> - hold the device lock over the list_empty(&port->dports) check
>   (Jonathan)
> - move the list_empty() check after the check for NULL @target_map in
>   anticipation of endpoint decoders (Ben)
> 
>  drivers/cxl/acpi.c      |   84 +++++++++++++++++++++++---------
>  drivers/cxl/core/bus.c  |  123 +++++++++++++++--------------------------------
>  drivers/cxl/core/core.h |    5 --
>  drivers/cxl/core/pmem.c |    7 ++-
>  drivers/cxl/cxl.h       |   15 ++----
>  5 files changed, 110 insertions(+), 124 deletions(-)
> 
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index d39cc797a64e..2368a8b67698 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -82,7 +82,6 @@ static void cxl_add_cfmws_decoders(struct device *dev,
>  	struct cxl_decoder *cxld;
>  	acpi_size len, cur = 0;
>  	void *cedt_subtable;
> -	unsigned long flags;
>  	int rc;
>  
>  	len = acpi_cedt->length - sizeof(*acpi_cedt);
> @@ -119,24 +118,36 @@ static void cxl_add_cfmws_decoders(struct device *dev,
>  		for (i = 0; i < CFMWS_INTERLEAVE_WAYS(cfmws); i++)
>  			target_map[i] = cfmws->interleave_targets[i];
>  
> -		flags = cfmws_to_decoder_flags(cfmws->restrictions);
> -		cxld = devm_cxl_add_decoder(dev, root_port,
> -					    CFMWS_INTERLEAVE_WAYS(cfmws),
> -					    cfmws->base_hpa, cfmws->window_size,
> -					    CFMWS_INTERLEAVE_WAYS(cfmws),
> -					    CFMWS_INTERLEAVE_GRANULARITY(cfmws),
> -					    CXL_DECODER_EXPANDER,
> -					    flags, target_map);
> -
> -		if (IS_ERR(cxld)) {
> +		cxld = cxl_decoder_alloc(root_port,
> +					 CFMWS_INTERLEAVE_WAYS(cfmws));
> +		if (IS_ERR(cxld))
> +			goto next;
> +
> +		cxld->flags = cfmws_to_decoder_flags(cfmws->restrictions);
> +		cxld->target_type = CXL_DECODER_EXPANDER;
> +		cxld->range = (struct range) {
> +			.start = cfmws->base_hpa,
> +			.end = cfmws->base_hpa + cfmws->window_size - 1,
> +		};
> +		cxld->interleave_ways = CFMWS_INTERLEAVE_WAYS(cfmws);
> +		cxld->interleave_granularity =
> +			CFMWS_INTERLEAVE_GRANULARITY(cfmws);
> +
> +		rc = cxl_decoder_add(cxld, target_map);
> +		if (rc)
> +			put_device(&cxld->dev);
> +		else
> +			rc = cxl_decoder_autoremove(dev, cxld);

For posterity, I'll say I don't love this interface overall, but I don't have a
better suggestion.

alloc()
open coded configuration
add()
open coded autoremove

I understand some of the background on moving the responsibility of the devm
callback to the actual caller; it just ends up a fairly weird interface now,
since all 4 steps are needed to actually create a decoder for consumption by
the driver.

I'd request a new function to configure the decoder before adding, except I
don't think it's worth doing that either.
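
Purely to make the trade-off concrete, a consolidated helper (entirely
hypothetical, not something I'm proposing) would look roughly like the
below, and it buys little since every call site still needs a bespoke
configure step:

	static struct cxl_decoder *
	devm_cxl_decoder_create(struct device *host, struct cxl_port *port,
				int nr_targets, int *target_map,
				void (*configure)(struct cxl_decoder *cxld))
	{
		struct cxl_decoder *cxld = cxl_decoder_alloc(port, nr_targets);
		int rc;

		if (IS_ERR(cxld))
			return cxld;

		configure(cxld);	/* open coded configuration moves here */

		rc = cxl_decoder_add(cxld, target_map);
		if (rc) {
			put_device(&cxld->dev);
			return ERR_PTR(rc);
		}

		rc = cxl_decoder_autoremove(host, cxld);
		if (rc)
			return ERR_PTR(rc);

		return cxld;
	}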

> +		if (rc) {
>  			dev_err(dev, "Failed to add decoder for %#llx-%#llx\n",
>  				cfmws->base_hpa, cfmws->base_hpa +
>  				cfmws->window_size - 1);

Do you think it makes sense to explain to the user what the consequence is of
this?

> -		} else {
> -			dev_dbg(dev, "add: %s range %#llx-%#llx\n",
> -				dev_name(&cxld->dev), cfmws->base_hpa,
> -				 cfmws->base_hpa + cfmws->window_size - 1);
> +			goto next;
>  		}
> +		dev_dbg(dev, "add: %s range %#llx-%#llx\n",
> +			dev_name(&cxld->dev), cfmws->base_hpa,
> +			cfmws->base_hpa + cfmws->window_size - 1);
> +next:
>  		cur += c->length;
>  	}
>  }
> @@ -266,6 +277,7 @@ static int add_host_bridge_uport(struct device *match, void *arg)
>  	struct acpi_device *bridge = to_cxl_host_bridge(host, match);
>  	struct acpi_pci_root *pci_root;
>  	struct cxl_walk_context ctx;
> +	int single_port_map[1], rc;
>  	struct cxl_decoder *cxld;
>  	struct cxl_dport *dport;
>  	struct cxl_port *port;
> @@ -301,22 +313,46 @@ static int add_host_bridge_uport(struct device *match, void *arg)
>  		return -ENODEV;
>  	if (ctx.error)
>  		return ctx.error;
> +	if (ctx.count > 1)
> +		return 0;
>  
>  	/* TODO: Scan CHBCR for HDM Decoder resources */
>  
>  	/*
> -	 * In the single-port host-bridge case there are no HDM decoders
> -	 * in the CHBCR and a 1:1 passthrough decode is implied.
> +	 * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability
> +	 * Structure) single ported host-bridges need not publish a decoder
> +	 * capability when a passthrough decode can be assumed, i.e. all
> +	 * transactions that the uport sees are claimed and passed to the single
> +	 * dport. Default the range a 0-base 0-length until the first CXL region
> +	 * is activated.
>  	 */
> -	if (ctx.count == 1) {
> -		cxld = devm_cxl_add_passthrough_decoder(host, port);
> -		if (IS_ERR(cxld))
> -			return PTR_ERR(cxld);
> +	cxld = cxl_decoder_alloc(port, 1);
> +	if (IS_ERR(cxld))
> +		return PTR_ERR(cxld);
> +
> +	cxld->interleave_ways = 1;
> +	cxld->interleave_granularity = PAGE_SIZE;
> +	cxld->target_type = CXL_DECODER_EXPANDER;
> +	cxld->range = (struct range) {
> +		.start = 0,
> +		.end = -1,
> +	};
>  
> -		dev_dbg(host, "add: %s\n", dev_name(&cxld->dev));
> -	}
> +	device_lock(&port->dev);
> +	dport = list_first_entry(&port->dports, typeof(*dport), list);
> +	device_unlock(&port->dev);
>  
> -	return 0;
> +	single_port_map[0] = dport->port_id;
> +
> +	rc = cxl_decoder_add(cxld, single_port_map);
> +	if (rc)
> +		put_device(&cxld->dev);
> +	else
> +		rc = cxl_decoder_autoremove(host, cxld);
> +
> +	if (rc == 0)
> +		dev_dbg(host, "add: %s\n", dev_name(&cxld->dev));
> +	return rc;
>  }
>  
>  static int add_host_bridge_dport(struct device *match, void *arg)
> diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> index 6dfdeaf999f0..396252749477 100644
> --- a/drivers/cxl/core/bus.c
> +++ b/drivers/cxl/core/bus.c
> @@ -453,10 +453,8 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport_dev, int port_id,
>  }
>  EXPORT_SYMBOL_GPL(cxl_add_dport);
>  
> -static int decoder_populate_targets(struct device *host,
> -				    struct cxl_decoder *cxld,
> -				    struct cxl_port *port, int *target_map,
> -				    int nr_targets)
> +static int decoder_populate_targets(struct cxl_decoder *cxld,
> +				    struct cxl_port *port, int *target_map)
>  {
>  	int rc = 0, i;
>  
> @@ -464,42 +462,36 @@ static int decoder_populate_targets(struct device *host,
>  		return 0;
>  
>  	device_lock(&port->dev);
> -	for (i = 0; i < nr_targets; i++) {
> +	if (list_empty(&port->dports)) {
> +		rc = -EINVAL;
> +		goto out_unlock;
> +	}

Forewarning, I think I'm still going to need to modify this check for endpoints.

> +
> +	for (i = 0; i < cxld->nr_targets; i++) {
>  		struct cxl_dport *dport = find_dport(port, target_map[i]);
>  
>  		if (!dport) {
>  			rc = -ENXIO;
> -			break;
> +			goto out_unlock;
>  		}
> -		dev_dbg(host, "%s: target: %d\n", dev_name(dport->dport), i);
>  		cxld->target[i] = dport;
>  	}
> +
> +out_unlock:
>  	device_unlock(&port->dev);
>  
>  	return rc;
>  }
>  
> -static struct cxl_decoder *
> -cxl_decoder_alloc(struct device *host, struct cxl_port *port, int nr_targets,
> -		  resource_size_t base, resource_size_t len,
> -		  int interleave_ways, int interleave_granularity,
> -		  enum cxl_decoder_type type, unsigned long flags,
> -		  int *target_map)
> +struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
>  {
>  	struct cxl_decoder *cxld;
>  	struct device *dev;
>  	int rc = 0;
>  
> -	if (interleave_ways < 1)
> +	if (nr_targets > CXL_DECODER_MAX_INTERLEAVE || nr_targets < 1)
>  		return ERR_PTR(-EINVAL);
>  
> -	device_lock(&port->dev);
> -	if (list_empty(&port->dports))
> -		rc = -EINVAL;
> -	device_unlock(&port->dev);
> -	if (rc)
> -		return ERR_PTR(rc);
> -
>  	cxld = kzalloc(struct_size(cxld, target, nr_targets), GFP_KERNEL);
>  	if (!cxld)
>  		return ERR_PTR(-ENOMEM);
> @@ -508,22 +500,8 @@ cxl_decoder_alloc(struct device *host, struct cxl_port *port, int nr_targets,
>  	if (rc < 0)
>  		goto err;
>  
> -	*cxld = (struct cxl_decoder) {
> -		.id = rc,
> -		.range = {
> -			.start = base,
> -			.end = base + len - 1,
> -		},
> -		.flags = flags,
> -		.interleave_ways = interleave_ways,
> -		.interleave_granularity = interleave_granularity,
> -		.target_type = type,
> -	};
> -
> -	rc = decoder_populate_targets(host, cxld, port, target_map, nr_targets);
> -	if (rc)
> -		goto err;
> -
> +	cxld->id = rc;
> +	cxld->nr_targets = nr_targets;

Would be really nice if cxld->nr_targets could be const...

>  	dev = &cxld->dev;
>  	device_initialize(dev);
>  	device_set_pm_not_required(dev);
> @@ -541,72 +519,47 @@ cxl_decoder_alloc(struct device *host, struct cxl_port *port, int nr_targets,
>  	kfree(cxld);
>  	return ERR_PTR(rc);
>  }
> +EXPORT_SYMBOL_GPL(cxl_decoder_alloc);
>  
> -struct cxl_decoder *
> -devm_cxl_add_decoder(struct device *host, struct cxl_port *port, int nr_targets,
> -		     resource_size_t base, resource_size_t len,
> -		     int interleave_ways, int interleave_granularity,
> -		     enum cxl_decoder_type type, unsigned long flags,
> -		     int *target_map)
> +int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
>  {
> -	struct cxl_decoder *cxld;
> +	struct cxl_port *port;
>  	struct device *dev;
>  	int rc;
>  
> -	if (nr_targets > CXL_DECODER_MAX_INTERLEAVE)
> -		return ERR_PTR(-EINVAL);
> +	if (!cxld)
> +		return -EINVAL;

I don't mind, but I think calling this with !cxld is a driver bug, right?
Perhaps upgrade to WARN_ONCE?
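
Something like (sketch; WARN_ONCE() returns the condition, so this
reads naturally):

	if (WARN_ONCE(!cxld, "%s: NULL decoder\n", __func__))
		return -EINVAL;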

>  
> -	cxld = cxl_decoder_alloc(host, port, nr_targets, base, len,
> -				 interleave_ways, interleave_granularity, type,
> -				 flags, target_map);
>  	if (IS_ERR(cxld))
> -		return cxld;
> +		return PTR_ERR(cxld);

Same as above.

>  
> -	dev = &cxld->dev;
> -	rc = dev_set_name(dev, "decoder%d.%d", port->id, cxld->id);
> -	if (rc)
> -		goto err;
> +	if (cxld->interleave_ways < 1)
> +		return -EINVAL;
>  
> -	rc = device_add(dev);
> +	port = to_cxl_port(cxld->dev.parent);
> +	rc = decoder_populate_targets(cxld, port, target_map);
>  	if (rc)
> -		goto err;
> +		return rc;
>  
> -	rc = devm_add_action_or_reset(host, unregister_cxl_dev, dev);
> +	dev = &cxld->dev;
> +	rc = dev_set_name(dev, "decoder%d.%d", port->id, cxld->id);
>  	if (rc)
> -		return ERR_PTR(rc);
> -	return cxld;
> +		return rc;
>  
> -err:
> -	put_device(dev);
> -	return ERR_PTR(rc);
> +	return device_add(dev);
>  }
> -EXPORT_SYMBOL_GPL(devm_cxl_add_decoder);
> +EXPORT_SYMBOL_GPL(cxl_decoder_add);
>  
> -/*
> - * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability Structure)
> - * single ported host-bridges need not publish a decoder capability when a
> - * passthrough decode can be assumed, i.e. all transactions that the uport sees
> - * are claimed and passed to the single dport. Default the range a 0-base
> - * 0-length until the first CXL region is activated.
> - */
> -struct cxl_decoder *devm_cxl_add_passthrough_decoder(struct device *host,
> -						     struct cxl_port *port)
> +static void cxld_unregister(void *dev)
>  {
> -	struct cxl_dport *dport;
> -	int target_map[1];
> -
> -	device_lock(&port->dev);
> -	dport = list_first_entry_or_null(&port->dports, typeof(*dport), list);
> -	device_unlock(&port->dev);
> -
> -	if (!dport)
> -		return ERR_PTR(-ENXIO);
> +	device_unregister(dev);
> +}
>  
> -	target_map[0] = dport->port_id;
> -	return devm_cxl_add_decoder(host, port, 1, 0, 0, 1, PAGE_SIZE,
> -				    CXL_DECODER_EXPANDER, 0, target_map);
> +int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld)
> +{
> +	return devm_add_action_or_reset(host, cxld_unregister, &cxld->dev);
>  }
> -EXPORT_SYMBOL_GPL(devm_cxl_add_passthrough_decoder);
> +EXPORT_SYMBOL_GPL(cxl_decoder_autoremove);
>  
>  /**
>   * __cxl_driver_register - register a driver for the cxl bus
> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> index c85b7fbad02d..e0c9aacc4e9c 100644
> --- a/drivers/cxl/core/core.h
> +++ b/drivers/cxl/core/core.h
> @@ -9,11 +9,6 @@ extern const struct device_type cxl_nvdimm_type;
>  
>  extern struct attribute_group cxl_base_attribute_group;
>  
> -static inline void unregister_cxl_dev(void *dev)
> -{
> -	device_unregister(dev);
> -}
> -
>  struct cxl_send_command;
>  struct cxl_mem_query_commands;
>  int cxl_query_cmd(struct cxl_memdev *cxlmd,
> diff --git a/drivers/cxl/core/pmem.c b/drivers/cxl/core/pmem.c
> index 74be5132df1c..5032f4c1c69d 100644
> --- a/drivers/cxl/core/pmem.c
> +++ b/drivers/cxl/core/pmem.c
> @@ -222,6 +222,11 @@ static struct cxl_nvdimm *cxl_nvdimm_alloc(struct cxl_memdev *cxlmd)
>  	return cxl_nvd;
>  }
>  
> +static void cxl_nvd_unregister(void *dev)
> +{
> +	device_unregister(dev);
> +}
> +
>  /**
>   * devm_cxl_add_nvdimm() - add a bridge between a cxl_memdev and an nvdimm
>   * @host: same host as @cxlmd
> @@ -251,7 +256,7 @@ int devm_cxl_add_nvdimm(struct device *host, struct cxl_memdev *cxlmd)
>  	dev_dbg(host, "%s: register %s\n", dev_name(dev->parent),
>  		dev_name(dev));
>  
> -	return devm_add_action_or_reset(host, unregister_cxl_dev, dev);
> +	return devm_add_action_or_reset(host, cxl_nvd_unregister, dev);
>  
>  err:
>  	put_device(dev);
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 9af5745ba2c0..7d6b011dd963 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -195,6 +195,7 @@ enum cxl_decoder_type {
>   * @interleave_granularity: data stride per dport
>   * @target_type: accelerator vs expander (type2 vs type3) selector
>   * @flags: memory type capabilities and locking
> + * @nr_targets: number of elements in @target
>   * @target: active ordered target list in current decoder configuration
>   */
>  struct cxl_decoder {
> @@ -205,6 +206,7 @@ struct cxl_decoder {
>  	int interleave_granularity;
>  	enum cxl_decoder_type target_type;
>  	unsigned long flags;
> +	int nr_targets;
>  	struct cxl_dport *target[];
>  };
>  
> @@ -286,15 +288,10 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport, int port_id,
>  
>  struct cxl_decoder *to_cxl_decoder(struct device *dev);
>  bool is_root_decoder(struct device *dev);
> -struct cxl_decoder *
> -devm_cxl_add_decoder(struct device *host, struct cxl_port *port, int nr_targets,
> -		     resource_size_t base, resource_size_t len,
> -		     int interleave_ways, int interleave_granularity,
> -		     enum cxl_decoder_type type, unsigned long flags,
> -		     int *target_map);
> -
> -struct cxl_decoder *devm_cxl_add_passthrough_decoder(struct device *host,
> -						     struct cxl_port *port);
> +struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets);
> +int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map);
> +int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld);
> +
>  extern struct bus_type cxl_bus_type;
>  
>  struct cxl_driver {
> 


* Re: [PATCH v5 21/21] cxl/core: Split decoder setup into alloc + add
  2021-09-21 14:24     ` Ben Widawsky
@ 2021-09-21 16:18       ` Dan Williams
  0 siblings, 0 replies; 78+ messages in thread
From: Dan Williams @ 2021-09-21 16:18 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, kernel test robot, Nathan Chancellor, Dan Carpenter,
	Linux NVDIMM, Jonathan Cameron

On Tue, Sep 21, 2021 at 7:24 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-09-14 12:31:22, Dan Williams wrote:
> > The kbuild robot reports:
> >
> >     drivers/cxl/core/bus.c:516:1: warning: stack frame size (1032) exceeds
> >     limit (1024) in function 'devm_cxl_add_decoder'
> >
> > It is also the case that devm_cxl_add_decoder() is unwieldy to use for
> > all the different decoder types. Fix the stack usage by splitting the
> > creation into alloc and add steps. This also allows for context-specific
> > construction before adding.
> >
> > With the split the caller is responsible for registering a devm callback
> > to trigger device_unregister() for the decoder rather than it being
> > implicit in the decoder registration. I.e. the routine that calls alloc
> > is responsible for calling put_device() if the "add" operation fails.
> >
> > Reported-by: kernel test robot <lkp@intel.com>
> > Reported-by: Nathan Chancellor <nathan@kernel.org>
> > Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>
> I have some comments inline. You can take them or leave them. Hopefully you can
> pull in my patch to document these after too.
>
> Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
>
> > ---
> > Changes since v4:
> > - hold the device lock over the list_empty(&port->dports) check
> >   (Jonathan)
> > - move the list_empty() check after the check for NULL @target_map in
> >   anticipation of endpoint decoders (Ben)
> >
> >  drivers/cxl/acpi.c      |   84 +++++++++++++++++++++++---------
> >  drivers/cxl/core/bus.c  |  123 +++++++++++++++--------------------------------
> >  drivers/cxl/core/core.h |    5 --
> >  drivers/cxl/core/pmem.c |    7 ++-
> >  drivers/cxl/cxl.h       |   15 ++----
> >  5 files changed, 110 insertions(+), 124 deletions(-)
> >
> > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> > index d39cc797a64e..2368a8b67698 100644
> > --- a/drivers/cxl/acpi.c
> > +++ b/drivers/cxl/acpi.c
> > @@ -82,7 +82,6 @@ static void cxl_add_cfmws_decoders(struct device *dev,
> >       struct cxl_decoder *cxld;
> >       acpi_size len, cur = 0;
> >       void *cedt_subtable;
> > -     unsigned long flags;
> >       int rc;
> >
> >       len = acpi_cedt->length - sizeof(*acpi_cedt);
> > @@ -119,24 +118,36 @@ static void cxl_add_cfmws_decoders(struct device *dev,
> >               for (i = 0; i < CFMWS_INTERLEAVE_WAYS(cfmws); i++)
> >                       target_map[i] = cfmws->interleave_targets[i];
> >
> > -             flags = cfmws_to_decoder_flags(cfmws->restrictions);
> > -             cxld = devm_cxl_add_decoder(dev, root_port,
> > -                                         CFMWS_INTERLEAVE_WAYS(cfmws),
> > -                                         cfmws->base_hpa, cfmws->window_size,
> > -                                         CFMWS_INTERLEAVE_WAYS(cfmws),
> > -                                         CFMWS_INTERLEAVE_GRANULARITY(cfmws),
> > -                                         CXL_DECODER_EXPANDER,
> > -                                         flags, target_map);
> > -
> > -             if (IS_ERR(cxld)) {
> > +             cxld = cxl_decoder_alloc(root_port,
> > +                                      CFMWS_INTERLEAVE_WAYS(cfmws));
> > +             if (IS_ERR(cxld))
> > +                     goto next;
> > +
> > +             cxld->flags = cfmws_to_decoder_flags(cfmws->restrictions);
> > +             cxld->target_type = CXL_DECODER_EXPANDER;
> > +             cxld->range = (struct range) {
> > +                     .start = cfmws->base_hpa,
> > +                     .end = cfmws->base_hpa + cfmws->window_size - 1,
> > +             };
> > +             cxld->interleave_ways = CFMWS_INTERLEAVE_WAYS(cfmws);
> > +             cxld->interleave_granularity =
> > +                     CFMWS_INTERLEAVE_GRANULARITY(cfmws);
> > +
> > +             rc = cxl_decoder_add(cxld, target_map);
> > +             if (rc)
> > +                     put_device(&cxld->dev);
> > +             else
> > +                     rc = cxl_decoder_autoremove(dev, cxld);
>
> For posterity, I'll say I don't love this interface overall, but I don't have a
> better suggestion.
>
> alloc()
> open coded configuration
> add()
> open coded autoremove
>
> I understand some of the background on moving the responsibility for the devm
> callback to the actual caller; it just ends up being a fairly weird interface
> now, since all 4 steps are needed to actually create a decoder for consumption
> by the driver.
>
> I'd request a new function to configure the decoder before adding, except I
> don't think it's worth doing that either.

Certainly this approach was taken for practical reasons, not elegance.
I too don't see how to clean this up further. It's a choice between
making the caller responsible for all the steps, or having monster
functions for every possible way a decoder might be configured and
tempting the stack frame warnings again.
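
The resulting caller pattern, distilled from the acpi.c hunks in this
patch, looks roughly like the sketch below. add_one_decoder() is a
hypothetical helper for illustration, not a function in the series, and
the configuration step is condensed:

static int add_one_decoder(struct device *host, struct cxl_port *port,
			   int nr_targets, int *target_map)
{
	struct cxl_decoder *cxld;
	int rc;

	/* 1) alloc: only the target count is fixed at allocation time */
	cxld = cxl_decoder_alloc(port, nr_targets);
	if (IS_ERR(cxld))
		return PTR_ERR(cxld);

	/* 2) open-coded, context-specific configuration */
	cxld->interleave_ways = nr_targets;
	cxld->interleave_granularity = PAGE_SIZE;
	cxld->target_type = CXL_DECODER_EXPANDER;

	/* 3) add: validates the configuration and registers the device */
	rc = cxl_decoder_add(cxld, target_map);
	if (rc) {
		/* alloc succeeded but add failed: drop the reference */
		put_device(&cxld->dev);
		return rc;
	}

	/* 4) arrange devm-triggered device_unregister() at host teardown */
	return cxl_decoder_autoremove(host, cxld);
}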

>
> > +             if (rc) {
> >                       dev_err(dev, "Failed to add decoder for %#llx-%#llx\n",
> >                               cfmws->base_hpa, cfmws->base_hpa +
> >                               cfmws->window_size - 1);
>
> Do you think it makes sense to explain to the user what the consequence is of
> this?

In the log message? No, I think that would be too pedantic.

>
> > -             } else {
> > -                     dev_dbg(dev, "add: %s range %#llx-%#llx\n",
> > -                             dev_name(&cxld->dev), cfmws->base_hpa,
> > -                              cfmws->base_hpa + cfmws->window_size - 1);
> > +                     goto next;
> >               }
> > +             dev_dbg(dev, "add: %s range %#llx-%#llx\n",
> > +                     dev_name(&cxld->dev), cfmws->base_hpa,
> > +                     cfmws->base_hpa + cfmws->window_size - 1);
> > +next:
> >               cur += c->length;
> >       }
> >  }
> > @@ -266,6 +277,7 @@ static int add_host_bridge_uport(struct device *match, void *arg)
> >       struct acpi_device *bridge = to_cxl_host_bridge(host, match);
> >       struct acpi_pci_root *pci_root;
> >       struct cxl_walk_context ctx;
> > +     int single_port_map[1], rc;
> >       struct cxl_decoder *cxld;
> >       struct cxl_dport *dport;
> >       struct cxl_port *port;
> > @@ -301,22 +313,46 @@ static int add_host_bridge_uport(struct device *match, void *arg)
> >               return -ENODEV;
> >       if (ctx.error)
> >               return ctx.error;
> > +     if (ctx.count > 1)
> > +             return 0;
> >
> >       /* TODO: Scan CHBCR for HDM Decoder resources */
> >
> >       /*
> > -      * In the single-port host-bridge case there are no HDM decoders
> > -      * in the CHBCR and a 1:1 passthrough decode is implied.
> > +      * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability
> > +      * Structure) single ported host-bridges need not publish a decoder
> > +      * capability when a passthrough decode can be assumed, i.e. all
> > +      * transactions that the uport sees are claimed and passed to the single
> > +      * dport. Default the range a 0-base 0-length until the first CXL region
> > +      * is activated.
> >        */
> > -     if (ctx.count == 1) {
> > -             cxld = devm_cxl_add_passthrough_decoder(host, port);
> > -             if (IS_ERR(cxld))
> > -                     return PTR_ERR(cxld);
> > +     cxld = cxl_decoder_alloc(port, 1);
> > +     if (IS_ERR(cxld))
> > +             return PTR_ERR(cxld);
> > +
> > +     cxld->interleave_ways = 1;
> > +     cxld->interleave_granularity = PAGE_SIZE;
> > +     cxld->target_type = CXL_DECODER_EXPANDER;
> > +     cxld->range = (struct range) {
> > +             .start = 0,
> > +             .end = -1,
> > +     };
> >
> > -             dev_dbg(host, "add: %s\n", dev_name(&cxld->dev));
> > -     }
> > +     device_lock(&port->dev);
> > +     dport = list_first_entry(&port->dports, typeof(*dport), list);
> > +     device_unlock(&port->dev);
> >
> > -     return 0;
> > +     single_port_map[0] = dport->port_id;
> > +
> > +     rc = cxl_decoder_add(cxld, single_port_map);
> > +     if (rc)
> > +             put_device(&cxld->dev);
> > +     else
> > +             rc = cxl_decoder_autoremove(host, cxld);
> > +
> > +     if (rc == 0)
> > +             dev_dbg(host, "add: %s\n", dev_name(&cxld->dev));
> > +     return rc;
> >  }
> >
> >  static int add_host_bridge_dport(struct device *match, void *arg)
> > diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> > index 6dfdeaf999f0..396252749477 100644
> > --- a/drivers/cxl/core/bus.c
> > +++ b/drivers/cxl/core/bus.c
> > @@ -453,10 +453,8 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport_dev, int port_id,
> >  }
> >  EXPORT_SYMBOL_GPL(cxl_add_dport);
> >
> > -static int decoder_populate_targets(struct device *host,
> > -                                 struct cxl_decoder *cxld,
> > -                                 struct cxl_port *port, int *target_map,
> > -                                 int nr_targets)
> > +static int decoder_populate_targets(struct cxl_decoder *cxld,
> > +                                 struct cxl_port *port, int *target_map)
> >  {
> >       int rc = 0, i;
> >
> > @@ -464,42 +462,36 @@ static int decoder_populate_targets(struct device *host,
> >               return 0;
> >
> >       device_lock(&port->dev);
> > -     for (i = 0; i < nr_targets; i++) {
> > +     if (list_empty(&port->dports)) {
> > +             rc = -EINVAL;
> > +             goto out_unlock;
> > +     }
>
> Forewarning, I think I'm still going to need to modify this check for endpoints.

I should have expanded the amount of context in the diff. That "return
0;" above is from the !target_map check, which should be true for
endpoints; that was one of your earlier feedback items that I heeded.
So I don't think you'll trip over this.
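
For reference, with that elided context restored, the top of
decoder_populate_targets() after this patch reads approximately as
follows (a sketch reconstructed from the hunk above plus the check
described here, not a verbatim quote of the file):

static int decoder_populate_targets(struct cxl_decoder *cxld,
				    struct cxl_port *port, int *target_map)
{
	int rc = 0, i;

	/* endpoint decoders pass a NULL target_map: nothing to populate */
	if (!target_map)
		return 0;

	device_lock(&port->dev);
	/* only switch/root decoders get here, and they must have dports */
	if (list_empty(&port->dports)) {
		rc = -EINVAL;
		goto out_unlock;
	}

	for (i = 0; i < cxld->nr_targets; i++) {
		struct cxl_dport *dport = find_dport(port, target_map[i]);

		if (!dport) {
			rc = -ENXIO;
			goto out_unlock;
		}
		cxld->target[i] = dport;
	}

out_unlock:
	device_unlock(&port->dev);

	return rc;
}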

>
> > +
> > +     for (i = 0; i < cxld->nr_targets; i++) {
> >               struct cxl_dport *dport = find_dport(port, target_map[i]);
> >
> >               if (!dport) {
> >                       rc = -ENXIO;
> > -                     break;
> > +                     goto out_unlock;
> >               }
> > -             dev_dbg(host, "%s: target: %d\n", dev_name(dport->dport), i);
> >               cxld->target[i] = dport;
> >       }
> > +
> > +out_unlock:
> >       device_unlock(&port->dev);
> >
> >       return rc;
> >  }
> >
> > -static struct cxl_decoder *
> > -cxl_decoder_alloc(struct device *host, struct cxl_port *port, int nr_targets,
> > -               resource_size_t base, resource_size_t len,
> > -               int interleave_ways, int interleave_granularity,
> > -               enum cxl_decoder_type type, unsigned long flags,
> > -               int *target_map)
> > +struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
> >  {
> >       struct cxl_decoder *cxld;
> >       struct device *dev;
> >       int rc = 0;
> >
> > -     if (interleave_ways < 1)
> > +     if (nr_targets > CXL_DECODER_MAX_INTERLEAVE || nr_targets < 1)
> >               return ERR_PTR(-EINVAL);
> >
> > -     device_lock(&port->dev);
> > -     if (list_empty(&port->dports))
> > -             rc = -EINVAL;
> > -     device_unlock(&port->dev);
> > -     if (rc)
> > -             return ERR_PTR(rc);
> > -
> >       cxld = kzalloc(struct_size(cxld, target, nr_targets), GFP_KERNEL);
> >       if (!cxld)
> >               return ERR_PTR(-ENOMEM);
> > @@ -508,22 +500,8 @@ cxl_decoder_alloc(struct device *host, struct cxl_port *port, int nr_targets,
> >       if (rc < 0)
> >               goto err;
> >
> > -     *cxld = (struct cxl_decoder) {
> > -             .id = rc,
> > -             .range = {
> > -                     .start = base,
> > -                     .end = base + len - 1,
> > -             },
> > -             .flags = flags,
> > -             .interleave_ways = interleave_ways,
> > -             .interleave_granularity = interleave_granularity,
> > -             .target_type = type,
> > -     };
> > -
> > -     rc = decoder_populate_targets(host, cxld, port, target_map, nr_targets);
> > -     if (rc)
> > -             goto err;
> > -
> > +     cxld->id = rc;
> > +     cxld->nr_targets = nr_targets;
>
> Would be really nice if cxld->nr_targets could be const...
>

Sure, the conversion is a little messy, but not too bad.
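
A minimal sketch of the idiom that conversion uses: build a template on
the stack and memcpy() it over the zeroed heap allocation to seed the
const member (struct demo and demo_alloc() are hypothetical names for
illustration). As the follow-up later in this thread shows, the
stack-resident template is also what re-triggered the frame-size
warning:

struct demo {
	const int nr_targets;	/* fixed for the object's lifetime */
	int id;
};

static struct demo *demo_alloc(int nr_targets)
{
	struct demo *d, init = { .nr_targets = nr_targets };

	d = kzalloc(sizeof(*d), GFP_KERNEL);
	if (!d)
		return NULL;
	/*
	 * A direct "d->nr_targets = nr_targets" would not compile;
	 * copying the whole template in sidesteps the const qualifier.
	 */
	memcpy(d, &init, sizeof(init));
	return d;
}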

> >       dev = &cxld->dev;
> >       device_initialize(dev);
> >       device_set_pm_not_required(dev);
> > @@ -541,72 +519,47 @@ cxl_decoder_alloc(struct device *host, struct cxl_port *port, int nr_targets,
> >       kfree(cxld);
> >       return ERR_PTR(rc);
> >  }
> > +EXPORT_SYMBOL_GPL(cxl_decoder_alloc);
> >
> > -struct cxl_decoder *
> > -devm_cxl_add_decoder(struct device *host, struct cxl_port *port, int nr_targets,
> > -                  resource_size_t base, resource_size_t len,
> > -                  int interleave_ways, int interleave_granularity,
> > -                  enum cxl_decoder_type type, unsigned long flags,
> > -                  int *target_map)
> > +int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
> >  {
> > -     struct cxl_decoder *cxld;
> > +     struct cxl_port *port;
> >       struct device *dev;
> >       int rc;
> >
> > -     if (nr_targets > CXL_DECODER_MAX_INTERLEAVE)
> > -             return ERR_PTR(-EINVAL);
> > +     if (!cxld)
> > +             return -EINVAL;
>
> I don't mind, but I think calling this with !cxld is a driver bug, right?
> Perhaps upgrade to WARN_ONCE?

Sure.

>
> >
> > -     cxld = cxl_decoder_alloc(host, port, nr_targets, base, len,
> > -                              interleave_ways, interleave_granularity, type,
> > -                              flags, target_map);
> >       if (IS_ERR(cxld))
> > -             return cxld;
> > +             return PTR_ERR(cxld);
>
> Same as above.

Ok.

>
> >
> > -     dev = &cxld->dev;
> > -     rc = dev_set_name(dev, "decoder%d.%d", port->id, cxld->id);
> > -     if (rc)
> > -             goto err;
> > +     if (cxld->interleave_ways < 1)
> > +             return -EINVAL;
> >
> > -     rc = device_add(dev);
> > +     port = to_cxl_port(cxld->dev.parent);
> > +     rc = decoder_populate_targets(cxld, port, target_map);
> >       if (rc)
> > -             goto err;
> > +             return rc;
> >
> > -     rc = devm_add_action_or_reset(host, unregister_cxl_dev, dev);
> > +     dev = &cxld->dev;
> > +     rc = dev_set_name(dev, "decoder%d.%d", port->id, cxld->id);
> >       if (rc)
> > -             return ERR_PTR(rc);
> > -     return cxld;
> > +             return rc;
> >
> > -err:
> > -     put_device(dev);
> > -     return ERR_PTR(rc);
> > +     return device_add(dev);
> >  }
> > -EXPORT_SYMBOL_GPL(devm_cxl_add_decoder);
> > +EXPORT_SYMBOL_GPL(cxl_decoder_add);
> >
> > -/*
> > - * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability Structure)
> > - * single ported host-bridges need not publish a decoder capability when a
> > - * passthrough decode can be assumed, i.e. all transactions that the uport sees
> > - * are claimed and passed to the single dport. Default the range a 0-base
> > - * 0-length until the first CXL region is activated.
> > - */
> > -struct cxl_decoder *devm_cxl_add_passthrough_decoder(struct device *host,
> > -                                                  struct cxl_port *port)
> > +static void cxld_unregister(void *dev)
> >  {
> > -     struct cxl_dport *dport;
> > -     int target_map[1];
> > -
> > -     device_lock(&port->dev);
> > -     dport = list_first_entry_or_null(&port->dports, typeof(*dport), list);
> > -     device_unlock(&port->dev);
> > -
> > -     if (!dport)
> > -             return ERR_PTR(-ENXIO);
> > +     device_unregister(dev);
> > +}
> >
> > -     target_map[0] = dport->port_id;
> > -     return devm_cxl_add_decoder(host, port, 1, 0, 0, 1, PAGE_SIZE,
> > -                                 CXL_DECODER_EXPANDER, 0, target_map);
> > +int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld)
> > +{
> > +     return devm_add_action_or_reset(host, cxld_unregister, &cxld->dev);
> >  }
> > -EXPORT_SYMBOL_GPL(devm_cxl_add_passthrough_decoder);
> > +EXPORT_SYMBOL_GPL(cxl_decoder_autoremove);
> >
> >  /**
> >   * __cxl_driver_register - register a driver for the cxl bus
> > diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> > index c85b7fbad02d..e0c9aacc4e9c 100644
> > --- a/drivers/cxl/core/core.h
> > +++ b/drivers/cxl/core/core.h
> > @@ -9,11 +9,6 @@ extern const struct device_type cxl_nvdimm_type;
> >
> >  extern struct attribute_group cxl_base_attribute_group;
> >
> > -static inline void unregister_cxl_dev(void *dev)
> > -{
> > -     device_unregister(dev);
> > -}
> > -
> >  struct cxl_send_command;
> >  struct cxl_mem_query_commands;
> >  int cxl_query_cmd(struct cxl_memdev *cxlmd,
> > diff --git a/drivers/cxl/core/pmem.c b/drivers/cxl/core/pmem.c
> > index 74be5132df1c..5032f4c1c69d 100644
> > --- a/drivers/cxl/core/pmem.c
> > +++ b/drivers/cxl/core/pmem.c
> > @@ -222,6 +222,11 @@ static struct cxl_nvdimm *cxl_nvdimm_alloc(struct cxl_memdev *cxlmd)
> >       return cxl_nvd;
> >  }
> >
> > +static void cxl_nvd_unregister(void *dev)
> > +{
> > +     device_unregister(dev);
> > +}
> > +
> >  /**
> >   * devm_cxl_add_nvdimm() - add a bridge between a cxl_memdev and an nvdimm
> >   * @host: same host as @cxlmd
> > @@ -251,7 +256,7 @@ int devm_cxl_add_nvdimm(struct device *host, struct cxl_memdev *cxlmd)
> >       dev_dbg(host, "%s: register %s\n", dev_name(dev->parent),
> >               dev_name(dev));
> >
> > -     return devm_add_action_or_reset(host, unregister_cxl_dev, dev);
> > +     return devm_add_action_or_reset(host, cxl_nvd_unregister, dev);
> >
> >  err:
> >       put_device(dev);
> > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > index 9af5745ba2c0..7d6b011dd963 100644
> > --- a/drivers/cxl/cxl.h
> > +++ b/drivers/cxl/cxl.h
> > @@ -195,6 +195,7 @@ enum cxl_decoder_type {
> >   * @interleave_granularity: data stride per dport
> >   * @target_type: accelerator vs expander (type2 vs type3) selector
> >   * @flags: memory type capabilities and locking
> > + * @nr_targets: number of elements in @target
> >   * @target: active ordered target list in current decoder configuration
> >   */
> >  struct cxl_decoder {
> > @@ -205,6 +206,7 @@ struct cxl_decoder {
> >       int interleave_granularity;
> >       enum cxl_decoder_type target_type;
> >       unsigned long flags;
> > +     int nr_targets;
> >       struct cxl_dport *target[];
> >  };
> >
> > @@ -286,15 +288,10 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport, int port_id,
> >
> >  struct cxl_decoder *to_cxl_decoder(struct device *dev);
> >  bool is_root_decoder(struct device *dev);
> > -struct cxl_decoder *
> > -devm_cxl_add_decoder(struct device *host, struct cxl_port *port, int nr_targets,
> > -                  resource_size_t base, resource_size_t len,
> > -                  int interleave_ways, int interleave_granularity,
> > -                  enum cxl_decoder_type type, unsigned long flags,
> > -                  int *target_map);
> > -
> > -struct cxl_decoder *devm_cxl_add_passthrough_decoder(struct device *host,
> > -                                                  struct cxl_port *port);
> > +struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets);
> > +int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map);
> > +int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld);
> > +
> >  extern struct bus_type cxl_bus_type;
> >
> >  struct cxl_driver {
> >


* [PATCH v6 21/21] cxl/core: Split decoder setup into alloc + add
  2021-09-14 19:31   ` [PATCH v5 " Dan Williams
  2021-09-21 14:24     ` Ben Widawsky
@ 2021-09-21 19:22     ` Dan Williams
  2021-12-10 19:38       ` Nathan Chancellor
  1 sibling, 1 reply; 78+ messages in thread
From: Dan Williams @ 2021-09-21 19:22 UTC (permalink / raw)
  To: linux-cxl
  Cc: kernel test robot, Nathan Chancellor, Dan Carpenter,
	Ben Widawsky, nvdimm, Jonathan.Cameron, ira.weiny,
	vishal.l.verma, alison.schofield

The kbuild robot reports:

    drivers/cxl/core/bus.c:516:1: warning: stack frame size (1032) exceeds
    limit (1024) in function 'devm_cxl_add_decoder'

It is also the case that devm_cxl_add_decoder() is unwieldy to use for
all the different decoder types. Fix the stack usage by splitting the
creation into alloc and add steps. This also allows for context-specific
construction before adding.

With the split, the caller is responsible for registering a devm callback
to trigger device_unregister() for the decoder rather than it being
implicit in the decoder registration. I.e. the routine that calls alloc
is responsible for calling put_device() if the "add" operation fails.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
Changes since v5:
- Add WARN_ON_ONCE() for the obvious programming error failure cases in
  cxl_decoder_add() (Ben)

- Mark @nr_targets in 'struct cxl_decoder' const (Ben)


 drivers/cxl/acpi.c      |   84 ++++++++++++++++++++++---------
 drivers/cxl/core/bus.c  |  129 +++++++++++++++--------------------------------
 drivers/cxl/core/core.h |    5 --
 drivers/cxl/core/pmem.c |    7 ++-
 drivers/cxl/cxl.h       |   15 ++---
 5 files changed, 114 insertions(+), 126 deletions(-)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index d39cc797a64e..af1c6c1875ac 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -82,7 +82,6 @@ static void cxl_add_cfmws_decoders(struct device *dev,
 	struct cxl_decoder *cxld;
 	acpi_size len, cur = 0;
 	void *cedt_subtable;
-	unsigned long flags;
 	int rc;
 
 	len = acpi_cedt->length - sizeof(*acpi_cedt);
@@ -119,24 +118,36 @@ static void cxl_add_cfmws_decoders(struct device *dev,
 		for (i = 0; i < CFMWS_INTERLEAVE_WAYS(cfmws); i++)
 			target_map[i] = cfmws->interleave_targets[i];
 
-		flags = cfmws_to_decoder_flags(cfmws->restrictions);
-		cxld = devm_cxl_add_decoder(dev, root_port,
-					    CFMWS_INTERLEAVE_WAYS(cfmws),
-					    cfmws->base_hpa, cfmws->window_size,
-					    CFMWS_INTERLEAVE_WAYS(cfmws),
-					    CFMWS_INTERLEAVE_GRANULARITY(cfmws),
-					    CXL_DECODER_EXPANDER,
-					    flags, target_map);
-
-		if (IS_ERR(cxld)) {
+		cxld = cxl_decoder_alloc(root_port,
+					 CFMWS_INTERLEAVE_WAYS(cfmws));
+		if (IS_ERR(cxld))
+			goto next;
+
+		cxld->flags = cfmws_to_decoder_flags(cfmws->restrictions);
+		cxld->target_type = CXL_DECODER_EXPANDER;
+		cxld->range = (struct range) {
+			.start = cfmws->base_hpa,
+			.end = cfmws->base_hpa + cfmws->window_size - 1,
+		};
+		cxld->interleave_ways = CFMWS_INTERLEAVE_WAYS(cfmws);
+		cxld->interleave_granularity =
+			CFMWS_INTERLEAVE_GRANULARITY(cfmws);
+
+		rc = cxl_decoder_add(cxld, target_map);
+		if (rc)
+			put_device(&cxld->dev);
+		else
+			rc = cxl_decoder_autoremove(dev, cxld);
+		if (rc) {
 			dev_err(dev, "Failed to add decoder for %#llx-%#llx\n",
 				cfmws->base_hpa, cfmws->base_hpa +
 				cfmws->window_size - 1);
-		} else {
-			dev_dbg(dev, "add: %s range %#llx-%#llx\n",
-				dev_name(&cxld->dev), cfmws->base_hpa,
-				 cfmws->base_hpa + cfmws->window_size - 1);
+			goto next;
 		}
+		dev_dbg(dev, "add: %s range %#llx-%#llx\n",
+			dev_name(&cxld->dev), cfmws->base_hpa,
+			cfmws->base_hpa + cfmws->window_size - 1);
+next:
 		cur += c->length;
 	}
 }
@@ -266,6 +277,7 @@ static int add_host_bridge_uport(struct device *match, void *arg)
 	struct acpi_device *bridge = to_cxl_host_bridge(host, match);
 	struct acpi_pci_root *pci_root;
 	struct cxl_walk_context ctx;
+	int single_port_map[1], rc;
 	struct cxl_decoder *cxld;
 	struct cxl_dport *dport;
 	struct cxl_port *port;
@@ -301,22 +313,46 @@ static int add_host_bridge_uport(struct device *match, void *arg)
 		return -ENODEV;
 	if (ctx.error)
 		return ctx.error;
+	if (ctx.count > 1)
+		return 0;
 
 	/* TODO: Scan CHBCR for HDM Decoder resources */
 
 	/*
-	 * In the single-port host-bridge case there are no HDM decoders
-	 * in the CHBCR and a 1:1 passthrough decode is implied.
+	 * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability
+	 * Structure) single ported host-bridges need not publish a decoder
+	 * capability when a passthrough decode can be assumed, i.e. all
+	 * transactions that the uport sees are claimed and passed to the single
+	 * dport. Disable the range until the first CXL region is enumerated /
+	 * activated.
 	 */
-	if (ctx.count == 1) {
-		cxld = devm_cxl_add_passthrough_decoder(host, port);
-		if (IS_ERR(cxld))
-			return PTR_ERR(cxld);
+	cxld = cxl_decoder_alloc(port, 1);
+	if (IS_ERR(cxld))
+		return PTR_ERR(cxld);
+
+	cxld->interleave_ways = 1;
+	cxld->interleave_granularity = PAGE_SIZE;
+	cxld->target_type = CXL_DECODER_EXPANDER;
+	cxld->range = (struct range) {
+		.start = 0,
+		.end = -1,
+	};
 
-		dev_dbg(host, "add: %s\n", dev_name(&cxld->dev));
-	}
+	device_lock(&port->dev);
+	dport = list_first_entry(&port->dports, typeof(*dport), list);
+	device_unlock(&port->dev);
 
-	return 0;
+	single_port_map[0] = dport->port_id;
+
+	rc = cxl_decoder_add(cxld, single_port_map);
+	if (rc)
+		put_device(&cxld->dev);
+	else
+		rc = cxl_decoder_autoremove(host, cxld);
+
+	if (rc == 0)
+		dev_dbg(host, "add: %s\n", dev_name(&cxld->dev));
+	return rc;
 }
 
 static int add_host_bridge_dport(struct device *match, void *arg)
diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
index 6dfdeaf999f0..ebd061d03950 100644
--- a/drivers/cxl/core/bus.c
+++ b/drivers/cxl/core/bus.c
@@ -453,10 +453,8 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport_dev, int port_id,
 }
 EXPORT_SYMBOL_GPL(cxl_add_dport);
 
-static int decoder_populate_targets(struct device *host,
-				    struct cxl_decoder *cxld,
-				    struct cxl_port *port, int *target_map,
-				    int nr_targets)
+static int decoder_populate_targets(struct cxl_decoder *cxld,
+				    struct cxl_port *port, int *target_map)
 {
 	int rc = 0, i;
 
@@ -464,66 +462,48 @@ static int decoder_populate_targets(struct device *host,
 		return 0;
 
 	device_lock(&port->dev);
-	for (i = 0; i < nr_targets; i++) {
+	if (list_empty(&port->dports)) {
+		rc = -EINVAL;
+		goto out_unlock;
+	}
+
+	for (i = 0; i < cxld->nr_targets; i++) {
 		struct cxl_dport *dport = find_dport(port, target_map[i]);
 
 		if (!dport) {
 			rc = -ENXIO;
-			break;
+			goto out_unlock;
 		}
-		dev_dbg(host, "%s: target: %d\n", dev_name(dport->dport), i);
 		cxld->target[i] = dport;
 	}
+
+out_unlock:
 	device_unlock(&port->dev);
 
 	return rc;
 }
 
-static struct cxl_decoder *
-cxl_decoder_alloc(struct device *host, struct cxl_port *port, int nr_targets,
-		  resource_size_t base, resource_size_t len,
-		  int interleave_ways, int interleave_granularity,
-		  enum cxl_decoder_type type, unsigned long flags,
-		  int *target_map)
+struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
 {
-	struct cxl_decoder *cxld;
+	struct cxl_decoder *cxld, cxld_const_init = {
+		.nr_targets = nr_targets,
+	};
 	struct device *dev;
 	int rc = 0;
 
-	if (interleave_ways < 1)
+	if (nr_targets > CXL_DECODER_MAX_INTERLEAVE || nr_targets < 1)
 		return ERR_PTR(-EINVAL);
 
-	device_lock(&port->dev);
-	if (list_empty(&port->dports))
-		rc = -EINVAL;
-	device_unlock(&port->dev);
-	if (rc)
-		return ERR_PTR(rc);
-
 	cxld = kzalloc(struct_size(cxld, target, nr_targets), GFP_KERNEL);
 	if (!cxld)
 		return ERR_PTR(-ENOMEM);
+	memcpy(cxld, &cxld_const_init, sizeof(cxld_const_init));
 
 	rc = ida_alloc(&port->decoder_ida, GFP_KERNEL);
 	if (rc < 0)
 		goto err;
 
-	*cxld = (struct cxl_decoder) {
-		.id = rc,
-		.range = {
-			.start = base,
-			.end = base + len - 1,
-		},
-		.flags = flags,
-		.interleave_ways = interleave_ways,
-		.interleave_granularity = interleave_granularity,
-		.target_type = type,
-	};
-
-	rc = decoder_populate_targets(host, cxld, port, target_map, nr_targets);
-	if (rc)
-		goto err;
-
+	cxld->id = rc;
 	dev = &cxld->dev;
 	device_initialize(dev);
 	device_set_pm_not_required(dev);
@@ -541,72 +521,47 @@ cxl_decoder_alloc(struct device *host, struct cxl_port *port, int nr_targets,
 	kfree(cxld);
 	return ERR_PTR(rc);
 }
+EXPORT_SYMBOL_GPL(cxl_decoder_alloc);
 
-struct cxl_decoder *
-devm_cxl_add_decoder(struct device *host, struct cxl_port *port, int nr_targets,
-		     resource_size_t base, resource_size_t len,
-		     int interleave_ways, int interleave_granularity,
-		     enum cxl_decoder_type type, unsigned long flags,
-		     int *target_map)
+int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
 {
-	struct cxl_decoder *cxld;
+	struct cxl_port *port;
 	struct device *dev;
 	int rc;
 
-	if (nr_targets > CXL_DECODER_MAX_INTERLEAVE)
-		return ERR_PTR(-EINVAL);
+	if (WARN_ON_ONCE(!cxld))
+		return -EINVAL;
 
-	cxld = cxl_decoder_alloc(host, port, nr_targets, base, len,
-				 interleave_ways, interleave_granularity, type,
-				 flags, target_map);
-	if (IS_ERR(cxld))
-		return cxld;
+	if (WARN_ON_ONCE(IS_ERR(cxld)))
+		return PTR_ERR(cxld);
 
-	dev = &cxld->dev;
-	rc = dev_set_name(dev, "decoder%d.%d", port->id, cxld->id);
-	if (rc)
-		goto err;
+	if (cxld->interleave_ways < 1)
+		return -EINVAL;
 
-	rc = device_add(dev);
+	port = to_cxl_port(cxld->dev.parent);
+	rc = decoder_populate_targets(cxld, port, target_map);
 	if (rc)
-		goto err;
+		return rc;
 
-	rc = devm_add_action_or_reset(host, unregister_cxl_dev, dev);
+	dev = &cxld->dev;
+	rc = dev_set_name(dev, "decoder%d.%d", port->id, cxld->id);
 	if (rc)
-		return ERR_PTR(rc);
-	return cxld;
+		return rc;
 
-err:
-	put_device(dev);
-	return ERR_PTR(rc);
+	return device_add(dev);
 }
-EXPORT_SYMBOL_GPL(devm_cxl_add_decoder);
+EXPORT_SYMBOL_GPL(cxl_decoder_add);
 
-/*
- * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability Structure)
- * single ported host-bridges need not publish a decoder capability when a
- * passthrough decode can be assumed, i.e. all transactions that the uport sees
- * are claimed and passed to the single dport. Default the range a 0-base
- * 0-length until the first CXL region is activated.
- */
-struct cxl_decoder *devm_cxl_add_passthrough_decoder(struct device *host,
-						     struct cxl_port *port)
+static void cxld_unregister(void *dev)
 {
-	struct cxl_dport *dport;
-	int target_map[1];
-
-	device_lock(&port->dev);
-	dport = list_first_entry_or_null(&port->dports, typeof(*dport), list);
-	device_unlock(&port->dev);
-
-	if (!dport)
-		return ERR_PTR(-ENXIO);
+	device_unregister(dev);
+}
 
-	target_map[0] = dport->port_id;
-	return devm_cxl_add_decoder(host, port, 1, 0, 0, 1, PAGE_SIZE,
-				    CXL_DECODER_EXPANDER, 0, target_map);
+int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld)
+{
+	return devm_add_action_or_reset(host, cxld_unregister, &cxld->dev);
 }
-EXPORT_SYMBOL_GPL(devm_cxl_add_passthrough_decoder);
+EXPORT_SYMBOL_GPL(cxl_decoder_autoremove);
 
 /**
  * __cxl_driver_register - register a driver for the cxl bus
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index c85b7fbad02d..e0c9aacc4e9c 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -9,11 +9,6 @@ extern const struct device_type cxl_nvdimm_type;
 
 extern struct attribute_group cxl_base_attribute_group;
 
-static inline void unregister_cxl_dev(void *dev)
-{
-	device_unregister(dev);
-}
-
 struct cxl_send_command;
 struct cxl_mem_query_commands;
 int cxl_query_cmd(struct cxl_memdev *cxlmd,
diff --git a/drivers/cxl/core/pmem.c b/drivers/cxl/core/pmem.c
index 74be5132df1c..5032f4c1c69d 100644
--- a/drivers/cxl/core/pmem.c
+++ b/drivers/cxl/core/pmem.c
@@ -222,6 +222,11 @@ static struct cxl_nvdimm *cxl_nvdimm_alloc(struct cxl_memdev *cxlmd)
 	return cxl_nvd;
 }
 
+static void cxl_nvd_unregister(void *dev)
+{
+	device_unregister(dev);
+}
+
 /**
  * devm_cxl_add_nvdimm() - add a bridge between a cxl_memdev and an nvdimm
  * @host: same host as @cxlmd
@@ -251,7 +256,7 @@ int devm_cxl_add_nvdimm(struct device *host, struct cxl_memdev *cxlmd)
 	dev_dbg(host, "%s: register %s\n", dev_name(dev->parent),
 		dev_name(dev));
 
-	return devm_add_action_or_reset(host, unregister_cxl_dev, dev);
+	return devm_add_action_or_reset(host, cxl_nvd_unregister, dev);
 
 err:
 	put_device(dev);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 9af5745ba2c0..a6687e7fd598 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -195,6 +195,7 @@ enum cxl_decoder_type {
  * @interleave_granularity: data stride per dport
  * @target_type: accelerator vs expander (type2 vs type3) selector
  * @flags: memory type capabilities and locking
+ * @nr_targets: number of elements in @target
  * @target: active ordered target list in current decoder configuration
  */
 struct cxl_decoder {
@@ -205,6 +206,7 @@ struct cxl_decoder {
 	int interleave_granularity;
 	enum cxl_decoder_type target_type;
 	unsigned long flags;
+	const int nr_targets;
 	struct cxl_dport *target[];
 };
 
@@ -286,15 +288,10 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport, int port_id,
 
 struct cxl_decoder *to_cxl_decoder(struct device *dev);
 bool is_root_decoder(struct device *dev);
-struct cxl_decoder *
-devm_cxl_add_decoder(struct device *host, struct cxl_port *port, int nr_targets,
-		     resource_size_t base, resource_size_t len,
-		     int interleave_ways, int interleave_granularity,
-		     enum cxl_decoder_type type, unsigned long flags,
-		     int *target_map);
-
-struct cxl_decoder *devm_cxl_add_passthrough_decoder(struct device *host,
-						     struct cxl_port *port);
+struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets);
+int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map);
+int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld);
+
 extern struct bus_type cxl_bus_type;
 
 struct cxl_driver {



* Re: [PATCH v6 21/21] cxl/core: Split decoder setup into alloc + add
  2021-09-21 19:22     ` [PATCH v6 " Dan Williams
@ 2021-12-10 19:38       ` Nathan Chancellor
  2021-12-10 19:41         ` Dan Williams
  0 siblings, 1 reply; 78+ messages in thread
From: Nathan Chancellor @ 2021-12-10 19:38 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, kernel test robot, Dan Carpenter, Ben Widawsky,
	nvdimm, Jonathan.Cameron, ira.weiny, vishal.l.verma,
	alison.schofield, llvm

Hi Dan,

On Tue, Sep 21, 2021 at 12:22:16PM -0700, Dan Williams wrote:
> The kbuild robot reports:
> 
>     drivers/cxl/core/bus.c:516:1: warning: stack frame size (1032) exceeds
>     limit (1024) in function 'devm_cxl_add_decoder'
> 
> It is also the case that devm_cxl_add_decoder() is unwieldy to use for
> all the different decoder types. Fix the stack usage by splitting the
> creation into alloc and add steps. This also allows for context-specific
> construction before adding.
> 
> With the split, the caller is responsible for registering a devm callback
> to trigger device_unregister() for the decoder rather than it being
> implicit in the decoder registration. I.e. the routine that calls alloc
> is responsible for calling put_device() if the "add" operation fails.
> 
> Reported-by: kernel test robot <lkp@intel.com>
> Reported-by: Nathan Chancellor <nathan@kernel.org>
> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Apologies for not noticing this sooner, given that I was on the thread.

This patch as commit 48667f676189 ("cxl/core: Split decoder setup into
alloc + add") in mainline does not fully resolve the stack frame
warning. I still see an error with both GCC 11 and LLVM 12 with
allmodconfig minus CONFIG_KASAN.

GCC 11:

drivers/cxl/core/bus.c: In function ‘cxl_decoder_alloc’:
drivers/cxl/core/bus.c:523:1: error: the frame size of 1032 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
  523 | }
      | ^
cc1: all warnings being treated as errors

LLVM 12:

drivers/cxl/core/bus.c:486:21: error: stack frame size of 1056 bytes in function 'cxl_decoder_alloc' [-Werror,-Wframe-larger-than=]
struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
                    ^
1 error generated.

This is caused by the cxld_const_init structure, which is allocated on
the stack, presumably as a consequence of the "const" change requested
in v5 and applied in v6. Undoing that resolves the warning for me with
both compilers. I am not sure if you have a better idea for how to
resolve it.

diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
index ebd061d03950..46ce58376580 100644
--- a/drivers/cxl/core/bus.c
+++ b/drivers/cxl/core/bus.c
@@ -485,9 +485,7 @@ static int decoder_populate_targets(struct cxl_decoder *cxld,
 
 struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
 {
-	struct cxl_decoder *cxld, cxld_const_init = {
-		.nr_targets = nr_targets,
-	};
+	struct cxl_decoder *cxld;
 	struct device *dev;
 	int rc = 0;
 
@@ -497,13 +495,13 @@ struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
 	cxld = kzalloc(struct_size(cxld, target, nr_targets), GFP_KERNEL);
 	if (!cxld)
 		return ERR_PTR(-ENOMEM);
-	memcpy(cxld, &cxld_const_init, sizeof(cxld_const_init));
 
 	rc = ida_alloc(&port->decoder_ida, GFP_KERNEL);
 	if (rc < 0)
 		goto err;
 
 	cxld->id = rc;
+	cxld->nr_targets = nr_targets;
 	dev = &cxld->dev;
 	device_initialize(dev);
 	device_set_pm_not_required(dev);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 3af704e9b448..7c2b51746e31 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -191,7 +191,7 @@ struct cxl_decoder {
 	int interleave_granularity;
 	enum cxl_decoder_type target_type;
 	unsigned long flags;
-	const int nr_targets;
+	int nr_targets;
 	struct cxl_dport *target[];
 };
 


* Re: [PATCH v6 21/21] cxl/core: Split decoder setup into alloc + add
  2021-12-10 19:38       ` Nathan Chancellor
@ 2021-12-10 19:41         ` Dan Williams
  0 siblings, 0 replies; 78+ messages in thread
From: Dan Williams @ 2021-12-10 19:41 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: linux-cxl, kernel test robot, Dan Carpenter, Ben Widawsky,
	Linux NVDIMM, Jonathan Cameron, Weiny, Ira, Vishal L Verma,
	Schofield, Alison, llvm

On Fri, Dec 10, 2021 at 11:39 AM Nathan Chancellor <nathan@kernel.org> wrote:
>
> Hi Dan,
>
> On Tue, Sep 21, 2021 at 12:22:16PM -0700, Dan Williams wrote:
> > The kbuild robot reports:
> >
> >     drivers/cxl/core/bus.c:516:1: warning: stack frame size (1032) exceeds
> >     limit (1024) in function 'devm_cxl_add_decoder'
> >
> > It is also the case that devm_cxl_add_decoder() is unwieldy to use for
> > all the different decoder types. Fix the stack usage by splitting the
> > creation into alloc and add steps. This also allows for context-specific
> > construction before adding.
> >
> > With the split, the caller is responsible for registering a devm callback
> > to trigger device_unregister() for the decoder rather than it being
> > implicit in the decoder registration. I.e. the routine that calls alloc
> > is responsible for calling put_device() if the "add" operation fails.
> >
> > Reported-by: kernel test robot <lkp@intel.com>
> > Reported-by: Nathan Chancellor <nathan@kernel.org>
> > Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> > Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>
> Apologies for not noticing this sooner, given that I was on the thread.
>
> This patch as commit 48667f676189 ("cxl/core: Split decoder setup into
> alloc + add") in mainline does not fully resolve the stack frame
> warning. I still see an error with both GCC 11 and LLVM 12 with
> allmodconfig minus CONFIG_KASAN.
>
> GCC 11:
>
> drivers/cxl/core/bus.c: In function ‘cxl_decoder_alloc’:
> drivers/cxl/core/bus.c:523:1: error: the frame size of 1032 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
>   523 | }
>       | ^
> cc1: all warnings being treated as errors
>
> LLVM 12:
>
> drivers/cxl/core/bus.c:486:21: error: stack frame size of 1056 bytes in function 'cxl_decoder_alloc' [-Werror,-Wframe-larger-than=]
> struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
>                     ^
> 1 error generated.
>
> This is caused by the cxld_const_init structure, which is allocated on
> the stack, presumably as a consequence of the "const" change requested
> in v5 and applied in v6. Undoing that resolves the warning for me with
> both compilers. I am not sure if you have a better idea for how to
> resolve it.
>
> diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> index ebd061d03950..46ce58376580 100644
> --- a/drivers/cxl/core/bus.c
> +++ b/drivers/cxl/core/bus.c
> @@ -485,9 +485,7 @@ static int decoder_populate_targets(struct cxl_decoder *cxld,
>
>  struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
>  {
> -       struct cxl_decoder *cxld, cxld_const_init = {
> -               .nr_targets = nr_targets,
> -       };
> +       struct cxl_decoder *cxld;
>         struct device *dev;
>         int rc = 0;
>
> @@ -497,13 +495,13 @@ struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
>         cxld = kzalloc(struct_size(cxld, target, nr_targets), GFP_KERNEL);
>         if (!cxld)
>                 return ERR_PTR(-ENOMEM);
> -       memcpy(cxld, &cxld_const_init, sizeof(cxld_const_init));
>
>         rc = ida_alloc(&port->decoder_ida, GFP_KERNEL);
>         if (rc < 0)
>                 goto err;
>
>         cxld->id = rc;
> +       cxld->nr_targets = nr_targets;
>         dev = &cxld->dev;
>         device_initialize(dev);
>         device_set_pm_not_required(dev);
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 3af704e9b448..7c2b51746e31 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -191,7 +191,7 @@ struct cxl_decoder {
>         int interleave_granularity;
>         enum cxl_decoder_type target_type;
>         unsigned long flags;
> -       const int nr_targets;
> +       int nr_targets;
>         struct cxl_dport *target[];

That change looks like the right way to go to me. Thanks for the follow-up!
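
Net of this exchange, the allocation path ends up roughly as follows (a
condensed sketch combining the v6 patch with the fix quoted above, not
a verbatim copy of the mainline code):

struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
{
	struct cxl_decoder *cxld;
	int rc;

	if (nr_targets > CXL_DECODER_MAX_INTERLEAVE || nr_targets < 1)
		return ERR_PTR(-EINVAL);

	/* flexible-array allocation sized for nr_targets dport pointers */
	cxld = kzalloc(struct_size(cxld, target, nr_targets), GFP_KERNEL);
	if (!cxld)
		return ERR_PTR(-ENOMEM);

	rc = ida_alloc(&port->decoder_ida, GFP_KERNEL);
	if (rc < 0)
		goto err;

	/* nr_targets is a plain int again, so it can be assigned directly */
	cxld->id = rc;
	cxld->nr_targets = nr_targets;

	/* ... device_initialize() and dev setup as in the v6 hunk ... */
	return cxld;
err:
	kfree(cxld);
	return ERR_PTR(rc);
}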

