* [PATCH 00/19] cxl: Device memory setup
@ 2023-06-04 23:31 Dan Williams
  2023-06-04 23:31 ` [PATCH 01/19] cxl/regs: Clarify when a 'struct cxl_register_map' is input vs output Dan Williams
                   ` (18 more replies)
  0 siblings, 19 replies; 64+ messages in thread
From: Dan Williams @ 2023-06-04 23:31 UTC (permalink / raw)
  To: linux-cxl; +Cc: ira.weiny, navneet.singh

There are two models for implementing CXL memory. To date, the CXL
enabling has been focused on the common class-device case (Type-3),
where the class code mandates the implementation of standard mechanisms
like the mailbox and its mandatory commands. The other model (Type-2)
is implementation-specific memory, typically local device memory
associated with an accelerator. Start the support for Type-2 and take
the opportunity to better prepare the CXL core for "a la carte"
enabling of optional CXL features.

Now, to date there has not been any engagement on the list from an
accelerator driver that wants to reuse the CXL core, but I think it is
worth moving ahead with these patches for the following reasons:

1/ The refactoring of region creation is needed by Persistent Memory
   support where the kernel needs to create regions from labels, not sysfs
   input.

2/ The 'struct cxl_dev_state' object carries infrastructure that is
   optional outside of CXL memory-device class-code devices. That makes it
   difficult to even start the discussion with accelerator driver authors
   who want to evaluate which pieces of the CXL core are suitable for reuse.

3/ The example Type-2 driver in cxl_test guards against
   type-3-exclusive assumptions leaking back into the code base.

In other words, it is difficult to start the "Type-2" discussion when
the kernel is ~1500 lines of change away from the baseline such a driver
might need, and the cleanups make the code more maintainable independent
of an immediate non-test user.

The first 9 patches are general cleanups; the last 10 refactor region
creation in support of driver-instantiated CXL memory regions.

---

Dan Williams (19):
      cxl/regs: Clarify when a 'struct cxl_register_map' is input vs output
      tools/testing/cxl: Remove unused @cxlds argument
      cxl/mbox: Move mailbox related driver state to its own data structure
      cxl/memdev: Make mailbox functionality optional
      cxl/port: Rename CXL_DECODER_{EXPANDER,ACCELERATOR} => {HOSTMEM,DEVMEM}
      cxl/hdm: Default CXL_DEVTYPE_DEVMEM decoders to CXL_DECODER_DEVMEM
      cxl/region: Manage decoder target_type at decoder-attach time
      cxl/port: Enumerate flit mode capability
      cxl/memdev: Formalize endpoint port linkage
      cxl/memdev: Indicate probe deferral
      cxl/region: Factor out construct_region_{begin,end} and drop_region() for reuse
      cxl/region: Factor out interleave ways setup
      cxl/region: Factor out interleave granularity setup
      cxl/region: Clarify locking requirements of cxl_region_attach()
      cxl/region: Specify host-only vs device memory at region creation time
      cxl/hdm: Define a driver interface for DPA allocation
      cxl/region: Define a driver interface for HPA free space enumeration
      cxl/region: Define a driver interface for region creation
      tools/testing/cxl: Emulate a CXL accelerator with local memory


 drivers/cxl/acpi.c           |    2 
 drivers/cxl/core/hdm.c       |  164 +++++++++++++---
 drivers/cxl/core/mbox.c      |  277 ++++++++++++++-------------
 drivers/cxl/core/memdev.c    |  108 +++++++++-
 drivers/cxl/core/pci.c       |   84 ++++++++
 drivers/cxl/core/pmem.c      |    2 
 drivers/cxl/core/port.c      |   19 +-
 drivers/cxl/core/region.c    |  437 ++++++++++++++++++++++++++++++++++++------
 drivers/cxl/core/regs.c      |    8 -
 drivers/cxl/cxl.h            |   21 ++
 drivers/cxl/cxlmem.h         |  124 ++++++++----
 drivers/cxl/cxlpci.h         |   25 ++
 drivers/cxl/mem.c            |   17 +-
 drivers/cxl/pci.c            |  114 ++++++-----
 drivers/cxl/pmem.c           |   35 ++-
 drivers/cxl/port.c           |    5 
 drivers/cxl/security.c       |   24 +-
 tools/testing/cxl/test/cxl.c |   20 ++
 tools/testing/cxl/test/mem.c |  214 ++++++++++++++-------
 19 files changed, 1245 insertions(+), 455 deletions(-)

base-commit: 9561de3a55bed6bdd44a12820ba81ec416e705a7

* [PATCH 01/19] cxl/regs: Clarify when a 'struct cxl_register_map' is input vs output
  2023-06-04 23:31 [PATCH 00/19] cxl: Device memory setup Dan Williams
@ 2023-06-04 23:31 ` Dan Williams
  2023-06-05  8:46   ` Jonathan Cameron
  2023-06-13 22:03   ` Dave Jiang
  2023-06-04 23:31 ` [PATCH 02/19] tools/testing/cxl: Remove unused @cxlds argument Dan Williams
                   ` (17 subsequent siblings)
  18 siblings, 2 replies; 64+ messages in thread
From: Dan Williams @ 2023-06-04 23:31 UTC (permalink / raw)
  To: linux-cxl; +Cc: ira.weiny, navneet.singh

The @map parameter to cxl_probe_X_registers() is filled in with the
mapping parameters of the register block. The @map parameter to
cxl_map_X_registers() only reads that information to perform the
mapping. Mark @map const for cxl_map_X_registers() to clarify that it is
only an input to those helpers.
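
A minimal sketch of the resulting convention (hypothetical caller for
illustration; @base and @map are assumed to come from the usual register
block discovery path, and @regs is local only to keep the example
short):

    static int hypothetical_setup(struct device *dev, void __iomem *base,
                                  struct cxl_register_map *map)
    {
            struct cxl_device_regs regs = { };

            /* @map is output here: probe fills in the register block layout */
            cxl_probe_device_regs(dev, base, &map->device_map);

            /* @map is input-only here, hence the const qualifier */
            return cxl_map_device_regs(dev, &regs, map);
    }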

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/regs.c |    8 ++++----
 drivers/cxl/cxl.h       |    4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
index 1476a0299c9b..52d1dbeda527 100644
--- a/drivers/cxl/core/regs.c
+++ b/drivers/cxl/core/regs.c
@@ -200,10 +200,10 @@ void __iomem *devm_cxl_iomap_block(struct device *dev, resource_size_t addr,
 }
 
 int cxl_map_component_regs(struct device *dev, struct cxl_component_regs *regs,
-			   struct cxl_register_map *map, unsigned long map_mask)
+			   const struct cxl_register_map *map, unsigned long map_mask)
 {
 	struct mapinfo {
-		struct cxl_reg_map *rmap;
+		const struct cxl_reg_map *rmap;
 		void __iomem **addr;
 	} mapinfo[] = {
 		{ &map->component_map.hdm_decoder, &regs->hdm_decoder },
@@ -233,11 +233,11 @@ EXPORT_SYMBOL_NS_GPL(cxl_map_component_regs, CXL);
 
 int cxl_map_device_regs(struct device *dev,
 			struct cxl_device_regs *regs,
-			struct cxl_register_map *map)
+			const struct cxl_register_map *map)
 {
 	resource_size_t phys_addr = map->resource;
 	struct mapinfo {
-		struct cxl_reg_map *rmap;
+		const struct cxl_reg_map *rmap;
 		void __iomem **addr;
 	} mapinfo[] = {
 		{ &map->device_map.status, &regs->status, },
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index f93a28538962..dfc94e76c7d6 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -254,10 +254,10 @@ void cxl_probe_component_regs(struct device *dev, void __iomem *base,
 void cxl_probe_device_regs(struct device *dev, void __iomem *base,
 			   struct cxl_device_reg_map *map);
 int cxl_map_component_regs(struct device *dev, struct cxl_component_regs *regs,
-			   struct cxl_register_map *map,
+			   const struct cxl_register_map *map,
 			   unsigned long map_mask);
 int cxl_map_device_regs(struct device *dev, struct cxl_device_regs *regs,
-			struct cxl_register_map *map);
+			const struct cxl_register_map *map);
 
 enum cxl_regloc_type;
 int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,


* [PATCH 02/19] tools/testing/cxl: Remove unused @cxlds argument
  2023-06-04 23:31 [PATCH 00/19] cxl: Device memory setup Dan Williams
  2023-06-04 23:31 ` [PATCH 01/19] cxl/regs: Clarify when a 'struct cxl_register_map' is input vs output Dan Williams
@ 2023-06-04 23:31 ` Dan Williams
  2023-06-06 10:53   ` Jonathan Cameron
  2023-06-13 22:08   ` Dave Jiang
  2023-06-04 23:31 ` [PATCH 03/19] cxl/mbox: Move mailbox related driver state to its own data structure Dan Williams
                   ` (16 subsequent siblings)
  18 siblings, 2 replies; 64+ messages in thread
From: Dan Williams @ 2023-06-04 23:31 UTC (permalink / raw)
  To: linux-cxl; +Cc: ira.weiny, navneet.singh

In preparation for plumbing a 'struct cxl_memdev_state' as a superset of
a 'struct cxl_dev_state', clean up the usage of @cxlds in the unit test
infrastructure.
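
For illustration, the shape of the conversion as applied to
mock_get_lsa() in the diff below: handlers stop rederiving their context
from @cxlds and take only what they actually use, while the dispatcher,
cxl_mock_mbox_send(), resolves the driver data once.

    /* before: each handler rederives its context from @cxlds */
    static int mock_get_lsa(struct cxl_dev_state *cxlds,
                            struct cxl_mbox_cmd *cmd)
    {
            struct cxl_mockmem_data *mdata = dev_get_drvdata(cxlds->dev);
            /* ... */
    }

    /* after: cxl_mock_mbox_send() resolves @mdata once, passes it down */
    static int mock_get_lsa(struct cxl_mockmem_data *mdata,
                            struct cxl_mbox_cmd *cmd)
    {
            /* ... */
    }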

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 tools/testing/cxl/test/mem.c |   86 +++++++++++++++++++-----------------------
 1 file changed, 39 insertions(+), 47 deletions(-)

diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
index 34b48027b3de..bdaf086d994e 100644
--- a/tools/testing/cxl/test/mem.c
+++ b/tools/testing/cxl/test/mem.c
@@ -180,8 +180,7 @@ static void mes_add_event(struct mock_event_store *mes,
 	log->nr_events++;
 }
 
-static int mock_get_event(struct cxl_dev_state *cxlds,
-			  struct cxl_mbox_cmd *cmd)
+static int mock_get_event(struct device *dev, struct cxl_mbox_cmd *cmd)
 {
 	struct cxl_get_event_payload *pl;
 	struct mock_event_log *log;
@@ -201,7 +200,7 @@ static int mock_get_event(struct cxl_dev_state *cxlds,
 
 	memset(cmd->payload_out, 0, cmd->size_out);
 
-	log = event_find_log(cxlds->dev, log_type);
+	log = event_find_log(dev, log_type);
 	if (!log || event_log_empty(log))
 		return 0;
 
@@ -234,8 +233,7 @@ static int mock_get_event(struct cxl_dev_state *cxlds,
 	return 0;
 }
 
-static int mock_clear_event(struct cxl_dev_state *cxlds,
-			    struct cxl_mbox_cmd *cmd)
+static int mock_clear_event(struct device *dev, struct cxl_mbox_cmd *cmd)
 {
 	struct cxl_mbox_clear_event_payload *pl = cmd->payload_in;
 	struct mock_event_log *log;
@@ -246,7 +244,7 @@ static int mock_clear_event(struct cxl_dev_state *cxlds,
 	if (log_type >= CXL_EVENT_TYPE_MAX)
 		return -EINVAL;
 
-	log = event_find_log(cxlds->dev, log_type);
+	log = event_find_log(dev, log_type);
 	if (!log)
 		return 0; /* No mock data in this log */
 
@@ -256,7 +254,7 @@ static int mock_clear_event(struct cxl_dev_state *cxlds,
 	 * However, this is not good behavior for the host so test it.
 	 */
 	if (log->clear_idx + pl->nr_recs > log->cur_idx) {
-		dev_err(cxlds->dev,
+		dev_err(dev,
 			"Attempting to clear more events than returned!\n");
 		return -EINVAL;
 	}
@@ -266,7 +264,7 @@ static int mock_clear_event(struct cxl_dev_state *cxlds,
 	     nr < pl->nr_recs;
 	     nr++, handle++) {
 		if (handle != le16_to_cpu(pl->handles[nr])) {
-			dev_err(cxlds->dev, "Clearing events out of order\n");
+			dev_err(dev, "Clearing events out of order\n");
 			return -EINVAL;
 		}
 	}
@@ -477,7 +475,7 @@ static int mock_get_log(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
 	return 0;
 }
 
-static int mock_rcd_id(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
+static int mock_rcd_id(struct cxl_mbox_cmd *cmd)
 {
 	struct cxl_mbox_identify id = {
 		.fw_revision = { "mock fw v1 " },
@@ -495,7 +493,7 @@ static int mock_rcd_id(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
 	return 0;
 }
 
-static int mock_id(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
+static int mock_id(struct cxl_mbox_cmd *cmd)
 {
 	struct cxl_mbox_identify id = {
 		.fw_revision = { "mock fw v1 " },
@@ -517,8 +515,7 @@ static int mock_id(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
 	return 0;
 }
 
-static int mock_partition_info(struct cxl_dev_state *cxlds,
-			       struct cxl_mbox_cmd *cmd)
+static int mock_partition_info(struct cxl_mbox_cmd *cmd)
 {
 	struct cxl_mbox_get_partition_info pi = {
 		.active_volatile_cap =
@@ -535,11 +532,9 @@ static int mock_partition_info(struct cxl_dev_state *cxlds,
 	return 0;
 }
 
-static int mock_get_security_state(struct cxl_dev_state *cxlds,
+static int mock_get_security_state(struct cxl_mockmem_data *mdata,
 				   struct cxl_mbox_cmd *cmd)
 {
-	struct cxl_mockmem_data *mdata = dev_get_drvdata(cxlds->dev);
-
 	if (cmd->size_in)
 		return -EINVAL;
 
@@ -569,9 +564,9 @@ static void user_plimit_check(struct cxl_mockmem_data *mdata)
 		mdata->security_state |= CXL_PMEM_SEC_STATE_USER_PLIMIT;
 }
 
-static int mock_set_passphrase(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
+static int mock_set_passphrase(struct cxl_mockmem_data *mdata,
+			       struct cxl_mbox_cmd *cmd)
 {
-	struct cxl_mockmem_data *mdata = dev_get_drvdata(cxlds->dev);
 	struct cxl_set_pass *set_pass;
 
 	if (cmd->size_in != sizeof(*set_pass))
@@ -629,9 +624,9 @@ static int mock_set_passphrase(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd
 	return -EINVAL;
 }
 
-static int mock_disable_passphrase(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
+static int mock_disable_passphrase(struct cxl_mockmem_data *mdata,
+				   struct cxl_mbox_cmd *cmd)
 {
-	struct cxl_mockmem_data *mdata = dev_get_drvdata(cxlds->dev);
 	struct cxl_disable_pass *dis_pass;
 
 	if (cmd->size_in != sizeof(*dis_pass))
@@ -700,10 +695,9 @@ static int mock_disable_passphrase(struct cxl_dev_state *cxlds, struct cxl_mbox_
 	return 0;
 }
 
-static int mock_freeze_security(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
+static int mock_freeze_security(struct cxl_mockmem_data *mdata,
+				struct cxl_mbox_cmd *cmd)
 {
-	struct cxl_mockmem_data *mdata = dev_get_drvdata(cxlds->dev);
-
 	if (cmd->size_in != 0)
 		return -EINVAL;
 
@@ -717,10 +711,9 @@ static int mock_freeze_security(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd
 	return 0;
 }
 
-static int mock_unlock_security(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
+static int mock_unlock_security(struct cxl_mockmem_data *mdata,
+				struct cxl_mbox_cmd *cmd)
 {
-	struct cxl_mockmem_data *mdata = dev_get_drvdata(cxlds->dev);
-
 	if (cmd->size_in != NVDIMM_PASSPHRASE_LEN)
 		return -EINVAL;
 
@@ -759,10 +752,9 @@ static int mock_unlock_security(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd
 	return 0;
 }
 
-static int mock_passphrase_secure_erase(struct cxl_dev_state *cxlds,
+static int mock_passphrase_secure_erase(struct cxl_mockmem_data *mdata,
 					struct cxl_mbox_cmd *cmd)
 {
-	struct cxl_mockmem_data *mdata = dev_get_drvdata(cxlds->dev);
 	struct cxl_pass_erase *erase;
 
 	if (cmd->size_in != sizeof(*erase))
@@ -858,10 +850,10 @@ static int mock_passphrase_secure_erase(struct cxl_dev_state *cxlds,
 	return 0;
 }
 
-static int mock_get_lsa(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
+static int mock_get_lsa(struct cxl_mockmem_data *mdata,
+			struct cxl_mbox_cmd *cmd)
 {
 	struct cxl_mbox_get_lsa *get_lsa = cmd->payload_in;
-	struct cxl_mockmem_data *mdata = dev_get_drvdata(cxlds->dev);
 	void *lsa = mdata->lsa;
 	u32 offset, length;
 
@@ -878,10 +870,10 @@ static int mock_get_lsa(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
 	return 0;
 }
 
-static int mock_set_lsa(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
+static int mock_set_lsa(struct cxl_mockmem_data *mdata,
+			struct cxl_mbox_cmd *cmd)
 {
 	struct cxl_mbox_set_lsa *set_lsa = cmd->payload_in;
-	struct cxl_mockmem_data *mdata = dev_get_drvdata(cxlds->dev);
 	void *lsa = mdata->lsa;
 	u32 offset, length;
 
@@ -896,8 +888,7 @@ static int mock_set_lsa(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
 	return 0;
 }
 
-static int mock_health_info(struct cxl_dev_state *cxlds,
-			    struct cxl_mbox_cmd *cmd)
+static int mock_health_info(struct cxl_mbox_cmd *cmd)
 {
 	struct cxl_mbox_health_info health_info = {
 		/* set flags for maint needed, perf degraded, hw replacement */
@@ -1117,6 +1108,7 @@ ATTRIBUTE_GROUPS(cxl_mock_mem_core);
 static int cxl_mock_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
 {
 	struct device *dev = cxlds->dev;
+	struct cxl_mockmem_data *mdata = dev_get_drvdata(dev);
 	int rc = -EIO;
 
 	switch (cmd->opcode) {
@@ -1131,45 +1123,45 @@ static int cxl_mock_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *
 		break;
 	case CXL_MBOX_OP_IDENTIFY:
 		if (cxlds->rcd)
-			rc = mock_rcd_id(cxlds, cmd);
+			rc = mock_rcd_id(cmd);
 		else
-			rc = mock_id(cxlds, cmd);
+			rc = mock_id(cmd);
 		break;
 	case CXL_MBOX_OP_GET_LSA:
-		rc = mock_get_lsa(cxlds, cmd);
+		rc = mock_get_lsa(mdata, cmd);
 		break;
 	case CXL_MBOX_OP_GET_PARTITION_INFO:
-		rc = mock_partition_info(cxlds, cmd);
+		rc = mock_partition_info(cmd);
 		break;
 	case CXL_MBOX_OP_GET_EVENT_RECORD:
-		rc = mock_get_event(cxlds, cmd);
+		rc = mock_get_event(dev, cmd);
 		break;
 	case CXL_MBOX_OP_CLEAR_EVENT_RECORD:
-		rc = mock_clear_event(cxlds, cmd);
+		rc = mock_clear_event(dev, cmd);
 		break;
 	case CXL_MBOX_OP_SET_LSA:
-		rc = mock_set_lsa(cxlds, cmd);
+		rc = mock_set_lsa(mdata, cmd);
 		break;
 	case CXL_MBOX_OP_GET_HEALTH_INFO:
-		rc = mock_health_info(cxlds, cmd);
+		rc = mock_health_info(cmd);
 		break;
 	case CXL_MBOX_OP_GET_SECURITY_STATE:
-		rc = mock_get_security_state(cxlds, cmd);
+		rc = mock_get_security_state(mdata, cmd);
 		break;
 	case CXL_MBOX_OP_SET_PASSPHRASE:
-		rc = mock_set_passphrase(cxlds, cmd);
+		rc = mock_set_passphrase(mdata, cmd);
 		break;
 	case CXL_MBOX_OP_DISABLE_PASSPHRASE:
-		rc = mock_disable_passphrase(cxlds, cmd);
+		rc = mock_disable_passphrase(mdata, cmd);
 		break;
 	case CXL_MBOX_OP_FREEZE_SECURITY:
-		rc = mock_freeze_security(cxlds, cmd);
+		rc = mock_freeze_security(mdata, cmd);
 		break;
 	case CXL_MBOX_OP_UNLOCK:
-		rc = mock_unlock_security(cxlds, cmd);
+		rc = mock_unlock_security(mdata, cmd);
 		break;
 	case CXL_MBOX_OP_PASSPHRASE_SECURE_ERASE:
-		rc = mock_passphrase_secure_erase(cxlds, cmd);
+		rc = mock_passphrase_secure_erase(mdata, cmd);
 		break;
 	case CXL_MBOX_OP_GET_POISON:
 		rc = mock_get_poison(cxlds, cmd);


* [PATCH 03/19] cxl/mbox: Move mailbox related driver state to its own data structure
  2023-06-04 23:31 [PATCH 00/19] cxl: Device memory setup Dan Williams
  2023-06-04 23:31 ` [PATCH 01/19] cxl/regs: Clarify when a 'struct cxl_register_map' is input vs output Dan Williams
  2023-06-04 23:31 ` [PATCH 02/19] tools/testing/cxl: Remove unused @cxlds argument Dan Williams
@ 2023-06-04 23:31 ` Dan Williams
  2023-06-06 11:10   ` Jonathan Cameron
  2023-06-13 22:15   ` Dave Jiang
  2023-06-04 23:31 ` [PATCH 04/19] cxl/memdev: Make mailbox functionality optional Dan Williams
                   ` (15 subsequent siblings)
  18 siblings, 2 replies; 64+ messages in thread
From: Dan Williams @ 2023-06-04 23:31 UTC (permalink / raw)
  To: linux-cxl; +Cc: ira.weiny, navneet.singh

'struct cxl_dev_state' makes too many assumptions about the capabilities
of a CXL device. In particular, it assumes a CXL device has a mailbox
and all of the infrastructure and state that comes along with that.

In preparation for supporting accelerator / Type-2 devices that may not
have a mailbox, and in general to maintain a minimal core context
structure, make mailbox functionality a superset of 'struct
cxl_dev_state' with 'struct cxl_memdev_state'.

With this reorganization, CXL devices that support HDM decoder mapping,
but not other general-expander / Type-3 capabilities, can enable just
that subset without the rest of the mailbox infrastructure coming along
for the ride.
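
The resulting relationship, excerpted from the definitions later in this
patch: 'struct cxl_memdev_state' embeds the generic device state as its
first member, and mailbox-aware paths upcast from a 'struct
cxl_dev_state' pointer with container_of():

    struct cxl_memdev_state {
            struct cxl_dev_state cxlds;
            /* ...mailbox, command, event, and poison state... */
    };

    static inline struct cxl_memdev_state *
    to_cxl_memdev_state(struct cxl_dev_state *cxlds)
    {
            return container_of(cxlds, struct cxl_memdev_state, cxlds);
    }

    /* e.g. in the ioctl path: */
    struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);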

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/mbox.c      |  276 ++++++++++++++++++++++--------------------
 drivers/cxl/core/memdev.c    |   38 +++---
 drivers/cxl/cxlmem.h         |   89 ++++++++------
 drivers/cxl/mem.c            |   10 +-
 drivers/cxl/pci.c            |  114 +++++++++--------
 drivers/cxl/pmem.c           |   35 +++--
 drivers/cxl/security.c       |   24 ++--
 tools/testing/cxl/test/mem.c |   43 ++++---
 8 files changed, 338 insertions(+), 291 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index bea9cf31a12d..14805dae5a74 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -182,7 +182,7 @@ static const char *cxl_mem_opcode_to_name(u16 opcode)
 
 /**
  * cxl_internal_send_cmd() - Kernel internal interface to send a mailbox command
- * @cxlds: The device data for the operation
+ * @mds: The driver data for the operation
  * @mbox_cmd: initialized command to execute
  *
  * Context: Any context.
@@ -198,19 +198,19 @@ static const char *cxl_mem_opcode_to_name(u16 opcode)
  * error. While this distinction can be useful for commands from userspace, the
  * kernel will only be able to use results when both are successful.
  */
-int cxl_internal_send_cmd(struct cxl_dev_state *cxlds,
+int cxl_internal_send_cmd(struct cxl_memdev_state *mds,
 			  struct cxl_mbox_cmd *mbox_cmd)
 {
 	size_t out_size, min_out;
 	int rc;
 
-	if (mbox_cmd->size_in > cxlds->payload_size ||
-	    mbox_cmd->size_out > cxlds->payload_size)
+	if (mbox_cmd->size_in > mds->payload_size ||
+	    mbox_cmd->size_out > mds->payload_size)
 		return -E2BIG;
 
 	out_size = mbox_cmd->size_out;
 	min_out = mbox_cmd->min_out;
-	rc = cxlds->mbox_send(cxlds, mbox_cmd);
+	rc = mds->mbox_send(mds, mbox_cmd);
 	/*
 	 * EIO is reserved for a payload size mismatch and mbox_send()
 	 * may not return this error.
@@ -297,7 +297,7 @@ static bool cxl_payload_from_user_allowed(u16 opcode, void *payload_in)
 }
 
 static int cxl_mbox_cmd_ctor(struct cxl_mbox_cmd *mbox,
-			     struct cxl_dev_state *cxlds, u16 opcode,
+			     struct cxl_memdev_state *mds, u16 opcode,
 			     size_t in_size, size_t out_size, u64 in_payload)
 {
 	*mbox = (struct cxl_mbox_cmd) {
@@ -312,7 +312,7 @@ static int cxl_mbox_cmd_ctor(struct cxl_mbox_cmd *mbox,
 			return PTR_ERR(mbox->payload_in);
 
 		if (!cxl_payload_from_user_allowed(opcode, mbox->payload_in)) {
-			dev_dbg(cxlds->dev, "%s: input payload not allowed\n",
+			dev_dbg(mds->cxlds.dev, "%s: input payload not allowed\n",
 				cxl_mem_opcode_to_name(opcode));
 			kvfree(mbox->payload_in);
 			return -EBUSY;
@@ -321,7 +321,7 @@ static int cxl_mbox_cmd_ctor(struct cxl_mbox_cmd *mbox,
 
 	/* Prepare to handle a full payload for variable sized output */
 	if (out_size == CXL_VARIABLE_PAYLOAD)
-		mbox->size_out = cxlds->payload_size;
+		mbox->size_out = mds->payload_size;
 	else
 		mbox->size_out = out_size;
 
@@ -343,7 +343,7 @@ static void cxl_mbox_cmd_dtor(struct cxl_mbox_cmd *mbox)
 
 static int cxl_to_mem_cmd_raw(struct cxl_mem_command *mem_cmd,
 			      const struct cxl_send_command *send_cmd,
-			      struct cxl_dev_state *cxlds)
+			      struct cxl_memdev_state *mds)
 {
 	if (send_cmd->raw.rsvd)
 		return -EINVAL;
@@ -353,13 +353,13 @@ static int cxl_to_mem_cmd_raw(struct cxl_mem_command *mem_cmd,
 	 * gets passed along without further checking, so it must be
 	 * validated here.
 	 */
-	if (send_cmd->out.size > cxlds->payload_size)
+	if (send_cmd->out.size > mds->payload_size)
 		return -EINVAL;
 
 	if (!cxl_mem_raw_command_allowed(send_cmd->raw.opcode))
 		return -EPERM;
 
-	dev_WARN_ONCE(cxlds->dev, true, "raw command path used\n");
+	dev_WARN_ONCE(mds->cxlds.dev, true, "raw command path used\n");
 
 	*mem_cmd = (struct cxl_mem_command) {
 		.info = {
@@ -375,7 +375,7 @@ static int cxl_to_mem_cmd_raw(struct cxl_mem_command *mem_cmd,
 
 static int cxl_to_mem_cmd(struct cxl_mem_command *mem_cmd,
 			  const struct cxl_send_command *send_cmd,
-			  struct cxl_dev_state *cxlds)
+			  struct cxl_memdev_state *mds)
 {
 	struct cxl_mem_command *c = &cxl_mem_commands[send_cmd->id];
 	const struct cxl_command_info *info = &c->info;
@@ -390,11 +390,11 @@ static int cxl_to_mem_cmd(struct cxl_mem_command *mem_cmd,
 		return -EINVAL;
 
 	/* Check that the command is enabled for hardware */
-	if (!test_bit(info->id, cxlds->enabled_cmds))
+	if (!test_bit(info->id, mds->enabled_cmds))
 		return -ENOTTY;
 
 	/* Check that the command is not claimed for exclusive kernel use */
-	if (test_bit(info->id, cxlds->exclusive_cmds))
+	if (test_bit(info->id, mds->exclusive_cmds))
 		return -EBUSY;
 
 	/* Check the input buffer is the expected size */
@@ -423,7 +423,7 @@ static int cxl_to_mem_cmd(struct cxl_mem_command *mem_cmd,
 /**
  * cxl_validate_cmd_from_user() - Check fields for CXL_MEM_SEND_COMMAND.
  * @mbox_cmd: Sanitized and populated &struct cxl_mbox_cmd.
- * @cxlds: The device data for the operation
+ * @mds: The driver data for the operation
  * @send_cmd: &struct cxl_send_command copied in from userspace.
  *
  * Return:
@@ -438,7 +438,7 @@ static int cxl_to_mem_cmd(struct cxl_mem_command *mem_cmd,
  * safe to send to the hardware.
  */
 static int cxl_validate_cmd_from_user(struct cxl_mbox_cmd *mbox_cmd,
-				      struct cxl_dev_state *cxlds,
+				      struct cxl_memdev_state *mds,
 				      const struct cxl_send_command *send_cmd)
 {
 	struct cxl_mem_command mem_cmd;
@@ -452,20 +452,20 @@ static int cxl_validate_cmd_from_user(struct cxl_mbox_cmd *mbox_cmd,
 	 * supports, but output can be arbitrarily large (simply write out as
 	 * much data as the hardware provides).
 	 */
-	if (send_cmd->in.size > cxlds->payload_size)
+	if (send_cmd->in.size > mds->payload_size)
 		return -EINVAL;
 
 	/* Sanitize and construct a cxl_mem_command */
 	if (send_cmd->id == CXL_MEM_COMMAND_ID_RAW)
-		rc = cxl_to_mem_cmd_raw(&mem_cmd, send_cmd, cxlds);
+		rc = cxl_to_mem_cmd_raw(&mem_cmd, send_cmd, mds);
 	else
-		rc = cxl_to_mem_cmd(&mem_cmd, send_cmd, cxlds);
+		rc = cxl_to_mem_cmd(&mem_cmd, send_cmd, mds);
 
 	if (rc)
 		return rc;
 
 	/* Sanitize and construct a cxl_mbox_cmd */
-	return cxl_mbox_cmd_ctor(mbox_cmd, cxlds, mem_cmd.opcode,
+	return cxl_mbox_cmd_ctor(mbox_cmd, mds, mem_cmd.opcode,
 				 mem_cmd.info.size_in, mem_cmd.info.size_out,
 				 send_cmd->in.payload);
 }
@@ -473,6 +473,7 @@ static int cxl_validate_cmd_from_user(struct cxl_mbox_cmd *mbox_cmd,
 int cxl_query_cmd(struct cxl_memdev *cxlmd,
 		  struct cxl_mem_query_commands __user *q)
 {
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
 	struct device *dev = &cxlmd->dev;
 	struct cxl_mem_command *cmd;
 	u32 n_commands;
@@ -494,9 +495,9 @@ int cxl_query_cmd(struct cxl_memdev *cxlmd,
 	cxl_for_each_cmd(cmd) {
 		struct cxl_command_info info = cmd->info;
 
-		if (test_bit(info.id, cxlmd->cxlds->enabled_cmds))
+		if (test_bit(info.id, mds->enabled_cmds))
 			info.flags |= CXL_MEM_COMMAND_FLAG_ENABLED;
-		if (test_bit(info.id, cxlmd->cxlds->exclusive_cmds))
+		if (test_bit(info.id, mds->exclusive_cmds))
 			info.flags |= CXL_MEM_COMMAND_FLAG_EXCLUSIVE;
 
 		if (copy_to_user(&q->commands[j++], &info, sizeof(info)))
@@ -511,7 +512,7 @@ int cxl_query_cmd(struct cxl_memdev *cxlmd,
 
 /**
  * handle_mailbox_cmd_from_user() - Dispatch a mailbox command for userspace.
- * @cxlds: The device data for the operation
+ * @mds: The driver data for the operation
  * @mbox_cmd: The validated mailbox command.
  * @out_payload: Pointer to userspace's output payload.
  * @size_out: (Input) Max payload size to copy out.
@@ -532,12 +533,12 @@ int cxl_query_cmd(struct cxl_memdev *cxlmd,
  *
  * See cxl_send_cmd().
  */
-static int handle_mailbox_cmd_from_user(struct cxl_dev_state *cxlds,
+static int handle_mailbox_cmd_from_user(struct cxl_memdev_state *mds,
 					struct cxl_mbox_cmd *mbox_cmd,
 					u64 out_payload, s32 *size_out,
 					u32 *retval)
 {
-	struct device *dev = cxlds->dev;
+	struct device *dev = mds->cxlds.dev;
 	int rc;
 
 	dev_dbg(dev,
@@ -547,7 +548,7 @@ static int handle_mailbox_cmd_from_user(struct cxl_dev_state *cxlds,
 		cxl_mem_opcode_to_name(mbox_cmd->opcode),
 		mbox_cmd->opcode, mbox_cmd->size_in);
 
-	rc = cxlds->mbox_send(cxlds, mbox_cmd);
+	rc = mds->mbox_send(mds, mbox_cmd);
 	if (rc)
 		goto out;
 
@@ -576,7 +577,7 @@ static int handle_mailbox_cmd_from_user(struct cxl_dev_state *cxlds,
 
 int cxl_send_cmd(struct cxl_memdev *cxlmd, struct cxl_send_command __user *s)
 {
-	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
 	struct device *dev = &cxlmd->dev;
 	struct cxl_send_command send;
 	struct cxl_mbox_cmd mbox_cmd;
@@ -587,11 +588,11 @@ int cxl_send_cmd(struct cxl_memdev *cxlmd, struct cxl_send_command __user *s)
 	if (copy_from_user(&send, s, sizeof(send)))
 		return -EFAULT;
 
-	rc = cxl_validate_cmd_from_user(&mbox_cmd, cxlmd->cxlds, &send);
+	rc = cxl_validate_cmd_from_user(&mbox_cmd, mds, &send);
 	if (rc)
 		return rc;
 
-	rc = handle_mailbox_cmd_from_user(cxlds, &mbox_cmd, send.out.payload,
+	rc = handle_mailbox_cmd_from_user(mds, &mbox_cmd, send.out.payload,
 					  &send.out.size, &send.retval);
 	if (rc)
 		return rc;
@@ -602,13 +603,14 @@ int cxl_send_cmd(struct cxl_memdev *cxlmd, struct cxl_send_command __user *s)
 	return 0;
 }
 
-static int cxl_xfer_log(struct cxl_dev_state *cxlds, uuid_t *uuid, u32 *size, u8 *out)
+static int cxl_xfer_log(struct cxl_memdev_state *mds, uuid_t *uuid,
+			u32 *size, u8 *out)
 {
 	u32 remaining = *size;
 	u32 offset = 0;
 
 	while (remaining) {
-		u32 xfer_size = min_t(u32, remaining, cxlds->payload_size);
+		u32 xfer_size = min_t(u32, remaining, mds->payload_size);
 		struct cxl_mbox_cmd mbox_cmd;
 		struct cxl_mbox_get_log log;
 		int rc;
@@ -627,7 +629,7 @@ static int cxl_xfer_log(struct cxl_dev_state *cxlds, uuid_t *uuid, u32 *size, u8
 			.payload_out = out,
 		};
 
-		rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+		rc = cxl_internal_send_cmd(mds, &mbox_cmd);
 
 		/*
 		 * The output payload length that indicates the number
@@ -654,17 +656,18 @@ static int cxl_xfer_log(struct cxl_dev_state *cxlds, uuid_t *uuid, u32 *size, u8
 
 /**
  * cxl_walk_cel() - Walk through the Command Effects Log.
- * @cxlds: The device data for the operation
+ * @mds: The driver data for the operation
  * @size: Length of the Command Effects Log.
  * @cel: CEL
  *
  * Iterate over each entry in the CEL and determine if the driver supports the
  * command. If so, the command is enabled for the device and can be used later.
  */
-static void cxl_walk_cel(struct cxl_dev_state *cxlds, size_t size, u8 *cel)
+static void cxl_walk_cel(struct cxl_memdev_state *mds, size_t size, u8 *cel)
 {
 	struct cxl_cel_entry *cel_entry;
 	const int cel_entries = size / sizeof(*cel_entry);
+	struct device *dev = mds->cxlds.dev;
 	int i;
 
 	cel_entry = (struct cxl_cel_entry *) cel;
@@ -674,39 +677,40 @@ static void cxl_walk_cel(struct cxl_dev_state *cxlds, size_t size, u8 *cel)
 		struct cxl_mem_command *cmd = cxl_mem_find_command(opcode);
 
 		if (!cmd && !cxl_is_poison_command(opcode)) {
-			dev_dbg(cxlds->dev,
+			dev_dbg(dev,
 				"Opcode 0x%04x unsupported by driver\n", opcode);
 			continue;
 		}
 
 		if (cmd)
-			set_bit(cmd->info.id, cxlds->enabled_cmds);
+			set_bit(cmd->info.id, mds->enabled_cmds);
 
 		if (cxl_is_poison_command(opcode))
-			cxl_set_poison_cmd_enabled(&cxlds->poison, opcode);
+			cxl_set_poison_cmd_enabled(&mds->poison, opcode);
 
-		dev_dbg(cxlds->dev, "Opcode 0x%04x enabled\n", opcode);
+		dev_dbg(dev, "Opcode 0x%04x enabled\n", opcode);
 	}
 }
 
-static struct cxl_mbox_get_supported_logs *cxl_get_gsl(struct cxl_dev_state *cxlds)
+static struct cxl_mbox_get_supported_logs *
+cxl_get_gsl(struct cxl_memdev_state *mds)
 {
 	struct cxl_mbox_get_supported_logs *ret;
 	struct cxl_mbox_cmd mbox_cmd;
 	int rc;
 
-	ret = kvmalloc(cxlds->payload_size, GFP_KERNEL);
+	ret = kvmalloc(mds->payload_size, GFP_KERNEL);
 	if (!ret)
 		return ERR_PTR(-ENOMEM);
 
 	mbox_cmd = (struct cxl_mbox_cmd) {
 		.opcode = CXL_MBOX_OP_GET_SUPPORTED_LOGS,
-		.size_out = cxlds->payload_size,
+		.size_out = mds->payload_size,
 		.payload_out = ret,
 		/* At least the record number field must be valid */
 		.min_out = 2,
 	};
-	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+	rc = cxl_internal_send_cmd(mds, &mbox_cmd);
 	if (rc < 0) {
 		kvfree(ret);
 		return ERR_PTR(rc);
@@ -729,22 +733,22 @@ static const uuid_t log_uuid[] = {
 
 /**
  * cxl_enumerate_cmds() - Enumerate commands for a device.
- * @cxlds: The device data for the operation
+ * @mds: The driver data for the operation
  *
  * Returns 0 if enumerate completed successfully.
  *
  * CXL devices have optional support for certain commands. This function will
  * determine the set of supported commands for the hardware and update the
- * enabled_cmds bitmap in the @cxlds.
+ * enabled_cmds bitmap in the @mds.
  */
-int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
+int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
 {
 	struct cxl_mbox_get_supported_logs *gsl;
-	struct device *dev = cxlds->dev;
+	struct device *dev = mds->cxlds.dev;
 	struct cxl_mem_command *cmd;
 	int i, rc;
 
-	gsl = cxl_get_gsl(cxlds);
+	gsl = cxl_get_gsl(mds);
 	if (IS_ERR(gsl))
 		return PTR_ERR(gsl);
 
@@ -765,19 +769,19 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
 			goto out;
 		}
 
-		rc = cxl_xfer_log(cxlds, &uuid, &size, log);
+		rc = cxl_xfer_log(mds, &uuid, &size, log);
 		if (rc) {
 			kvfree(log);
 			goto out;
 		}
 
-		cxl_walk_cel(cxlds, size, log);
+		cxl_walk_cel(mds, size, log);
 		kvfree(log);
 
 		/* In case CEL was bogus, enable some default commands. */
 		cxl_for_each_cmd(cmd)
 			if (cmd->flags & CXL_CMD_FLAG_FORCE_ENABLE)
-				set_bit(cmd->info.id, cxlds->enabled_cmds);
+				set_bit(cmd->info.id, mds->enabled_cmds);
 
 		/* Found the required CEL */
 		rc = 0;
@@ -838,7 +842,7 @@ static void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
 	}
 }
 
-static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
+static int cxl_clear_event_record(struct cxl_memdev_state *mds,
 				  enum cxl_event_log_type log,
 				  struct cxl_get_event_payload *get_pl)
 {
@@ -852,9 +856,9 @@ static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
 	int i;
 
 	/* Payload size may limit the max handles */
-	if (pl_size > cxlds->payload_size) {
-		max_handles = (cxlds->payload_size - sizeof(*payload)) /
-				sizeof(__le16);
+	if (pl_size > mds->payload_size) {
+		max_handles = (mds->payload_size - sizeof(*payload)) /
+			      sizeof(__le16);
 		pl_size = struct_size(payload, handles, max_handles);
 	}
 
@@ -879,12 +883,12 @@ static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
 	i = 0;
 	for (cnt = 0; cnt < total; cnt++) {
 		payload->handles[i++] = get_pl->records[cnt].hdr.handle;
-		dev_dbg(cxlds->dev, "Event log '%d': Clearing %u\n",
-			log, le16_to_cpu(payload->handles[i]));
+		dev_dbg(mds->cxlds.dev, "Event log '%d': Clearing %u\n", log,
+			le16_to_cpu(payload->handles[i]));
 
 		if (i == max_handles) {
 			payload->nr_recs = i;
-			rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+			rc = cxl_internal_send_cmd(mds, &mbox_cmd);
 			if (rc)
 				goto free_pl;
 			i = 0;
@@ -895,7 +899,7 @@ static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
 	if (i) {
 		payload->nr_recs = i;
 		mbox_cmd.size_in = struct_size(payload, handles, i);
-		rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+		rc = cxl_internal_send_cmd(mds, &mbox_cmd);
 		if (rc)
 			goto free_pl;
 	}
@@ -905,32 +909,34 @@ static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
 	return rc;
 }
 
-static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
+static void cxl_mem_get_records_log(struct cxl_memdev_state *mds,
 				    enum cxl_event_log_type type)
 {
+	struct cxl_memdev *cxlmd = mds->cxlds.cxlmd;
+	struct device *dev = mds->cxlds.dev;
 	struct cxl_get_event_payload *payload;
 	struct cxl_mbox_cmd mbox_cmd;
 	u8 log_type = type;
 	u16 nr_rec;
 
-	mutex_lock(&cxlds->event.log_lock);
-	payload = cxlds->event.buf;
+	mutex_lock(&mds->event.log_lock);
+	payload = mds->event.buf;
 
 	mbox_cmd = (struct cxl_mbox_cmd) {
 		.opcode = CXL_MBOX_OP_GET_EVENT_RECORD,
 		.payload_in = &log_type,
 		.size_in = sizeof(log_type),
 		.payload_out = payload,
-		.size_out = cxlds->payload_size,
+		.size_out = mds->payload_size,
 		.min_out = struct_size(payload, records, 0),
 	};
 
 	do {
 		int rc, i;
 
-		rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+		rc = cxl_internal_send_cmd(mds, &mbox_cmd);
 		if (rc) {
-			dev_err_ratelimited(cxlds->dev,
+			dev_err_ratelimited(dev,
 				"Event log '%d': Failed to query event records : %d",
 				type, rc);
 			break;
@@ -941,27 +947,27 @@ static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
 			break;
 
 		for (i = 0; i < nr_rec; i++)
-			cxl_event_trace_record(cxlds->cxlmd, type,
+			cxl_event_trace_record(cxlmd, type,
 					       &payload->records[i]);
 
 		if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
-			trace_cxl_overflow(cxlds->cxlmd, type, payload);
+			trace_cxl_overflow(cxlmd, type, payload);
 
-		rc = cxl_clear_event_record(cxlds, type, payload);
+		rc = cxl_clear_event_record(mds, type, payload);
 		if (rc) {
-			dev_err_ratelimited(cxlds->dev,
+			dev_err_ratelimited(dev,
 				"Event log '%d': Failed to clear events : %d",
 				type, rc);
 			break;
 		}
 	} while (nr_rec);
 
-	mutex_unlock(&cxlds->event.log_lock);
+	mutex_unlock(&mds->event.log_lock);
 }
 
 /**
  * cxl_mem_get_event_records - Get Event Records from the device
- * @cxlds: The device data for the operation
+ * @mds: The driver data for the operation
  * @status: Event Status register value identifying which events are available.
  *
  * Retrieve all event records available on the device, report them as trace
@@ -970,24 +976,24 @@ static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
  * See CXL rev 3.0 @8.2.9.2.2 Get Event Records
  * See CXL rev 3.0 @8.2.9.2.3 Clear Event Records
  */
-void cxl_mem_get_event_records(struct cxl_dev_state *cxlds, u32 status)
+void cxl_mem_get_event_records(struct cxl_memdev_state *mds, u32 status)
 {
-	dev_dbg(cxlds->dev, "Reading event logs: %x\n", status);
+	dev_dbg(mds->cxlds.dev, "Reading event logs: %x\n", status);
 
 	if (status & CXLDEV_EVENT_STATUS_FATAL)
-		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FATAL);
+		cxl_mem_get_records_log(mds, CXL_EVENT_TYPE_FATAL);
 	if (status & CXLDEV_EVENT_STATUS_FAIL)
-		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FAIL);
+		cxl_mem_get_records_log(mds, CXL_EVENT_TYPE_FAIL);
 	if (status & CXLDEV_EVENT_STATUS_WARN)
-		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_WARN);
+		cxl_mem_get_records_log(mds, CXL_EVENT_TYPE_WARN);
 	if (status & CXLDEV_EVENT_STATUS_INFO)
-		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_INFO);
+		cxl_mem_get_records_log(mds, CXL_EVENT_TYPE_INFO);
 }
 EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
 
 /**
  * cxl_mem_get_partition_info - Get partition info
- * @cxlds: The device data for the operation
+ * @mds: The driver data for the operation
  *
  * Retrieve the current partition info for the device specified.  The active
  * values are the current capacity in bytes.  If not 0, the 'next' values are
@@ -997,7 +1003,7 @@ EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
  *
  * See CXL @8.2.9.5.2.1 Get Partition Info
  */
-static int cxl_mem_get_partition_info(struct cxl_dev_state *cxlds)
+static int cxl_mem_get_partition_info(struct cxl_memdev_state *mds)
 {
 	struct cxl_mbox_get_partition_info pi;
 	struct cxl_mbox_cmd mbox_cmd;
@@ -1008,17 +1014,17 @@ static int cxl_mem_get_partition_info(struct cxl_dev_state *cxlds)
 		.size_out = sizeof(pi),
 		.payload_out = &pi,
 	};
-	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+	rc = cxl_internal_send_cmd(mds, &mbox_cmd);
 	if (rc)
 		return rc;
 
-	cxlds->active_volatile_bytes =
+	mds->active_volatile_bytes =
 		le64_to_cpu(pi.active_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
-	cxlds->active_persistent_bytes =
+	mds->active_persistent_bytes =
 		le64_to_cpu(pi.active_persistent_cap) * CXL_CAPACITY_MULTIPLIER;
-	cxlds->next_volatile_bytes =
+	mds->next_volatile_bytes =
 		le64_to_cpu(pi.next_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
-	cxlds->next_persistent_bytes =
+	mds->next_persistent_bytes =
 		le64_to_cpu(pi.next_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
 
 	return 0;
@@ -1026,14 +1032,14 @@ static int cxl_mem_get_partition_info(struct cxl_dev_state *cxlds)
 
 /**
  * cxl_dev_state_identify() - Send the IDENTIFY command to the device.
- * @cxlds: The device data for the operation
+ * @mds: The driver data for the operation
  *
  * Return: 0 if identify was executed successfully or media not ready.
  *
  * This will dispatch the identify command to the device and on success populate
  * structures to be exported to sysfs.
  */
-int cxl_dev_state_identify(struct cxl_dev_state *cxlds)
+int cxl_dev_state_identify(struct cxl_memdev_state *mds)
 {
 	/* See CXL 2.0 Table 175 Identify Memory Device Output Payload */
 	struct cxl_mbox_identify id;
@@ -1041,7 +1047,7 @@ int cxl_dev_state_identify(struct cxl_dev_state *cxlds)
 	u32 val;
 	int rc;
 
-	if (!cxlds->media_ready)
+	if (!mds->cxlds.media_ready)
 		return 0;
 
 	mbox_cmd = (struct cxl_mbox_cmd) {
@@ -1049,25 +1055,26 @@ int cxl_dev_state_identify(struct cxl_dev_state *cxlds)
 		.size_out = sizeof(id),
 		.payload_out = &id,
 	};
-	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+	rc = cxl_internal_send_cmd(mds, &mbox_cmd);
 	if (rc < 0)
 		return rc;
 
-	cxlds->total_bytes =
+	mds->total_bytes =
 		le64_to_cpu(id.total_capacity) * CXL_CAPACITY_MULTIPLIER;
-	cxlds->volatile_only_bytes =
+	mds->volatile_only_bytes =
 		le64_to_cpu(id.volatile_capacity) * CXL_CAPACITY_MULTIPLIER;
-	cxlds->persistent_only_bytes =
+	mds->persistent_only_bytes =
 		le64_to_cpu(id.persistent_capacity) * CXL_CAPACITY_MULTIPLIER;
-	cxlds->partition_align_bytes =
+	mds->partition_align_bytes =
 		le64_to_cpu(id.partition_align) * CXL_CAPACITY_MULTIPLIER;
 
-	cxlds->lsa_size = le32_to_cpu(id.lsa_size);
-	memcpy(cxlds->firmware_version, id.fw_revision, sizeof(id.fw_revision));
+	mds->lsa_size = le32_to_cpu(id.lsa_size);
+	memcpy(mds->firmware_version, id.fw_revision,
+	       sizeof(id.fw_revision));
 
-	if (test_bit(CXL_POISON_ENABLED_LIST, cxlds->poison.enabled_cmds)) {
+	if (test_bit(CXL_POISON_ENABLED_LIST, mds->poison.enabled_cmds)) {
 		val = get_unaligned_le24(id.poison_list_max_mer);
-		cxlds->poison.max_errors = min_t(u32, val, CXL_POISON_LIST_MAX);
+		mds->poison.max_errors = min_t(u32, val, CXL_POISON_LIST_MAX);
 	}
 
 	return 0;
@@ -1100,8 +1107,9 @@ static int add_dpa_res(struct device *dev, struct resource *parent,
 	return 0;
 }
 
-int cxl_mem_create_range_info(struct cxl_dev_state *cxlds)
+int cxl_mem_create_range_info(struct cxl_memdev_state *mds)
 {
+	struct cxl_dev_state *cxlds = &mds->cxlds;
 	struct device *dev = cxlds->dev;
 	int rc;
 
@@ -1113,35 +1121,35 @@ int cxl_mem_create_range_info(struct cxl_dev_state *cxlds)
 	}
 
 	cxlds->dpa_res =
-		(struct resource)DEFINE_RES_MEM(0, cxlds->total_bytes);
+		(struct resource)DEFINE_RES_MEM(0, mds->total_bytes);
 
-	if (cxlds->partition_align_bytes == 0) {
+	if (mds->partition_align_bytes == 0) {
 		rc = add_dpa_res(dev, &cxlds->dpa_res, &cxlds->ram_res, 0,
-				 cxlds->volatile_only_bytes, "ram");
+				 mds->volatile_only_bytes, "ram");
 		if (rc)
 			return rc;
 		return add_dpa_res(dev, &cxlds->dpa_res, &cxlds->pmem_res,
-				   cxlds->volatile_only_bytes,
-				   cxlds->persistent_only_bytes, "pmem");
+				   mds->volatile_only_bytes,
+				   mds->persistent_only_bytes, "pmem");
 	}
 
-	rc = cxl_mem_get_partition_info(cxlds);
+	rc = cxl_mem_get_partition_info(mds);
 	if (rc) {
 		dev_err(dev, "Failed to query partition information\n");
 		return rc;
 	}
 
 	rc = add_dpa_res(dev, &cxlds->dpa_res, &cxlds->ram_res, 0,
-			 cxlds->active_volatile_bytes, "ram");
+			 mds->active_volatile_bytes, "ram");
 	if (rc)
 		return rc;
 	return add_dpa_res(dev, &cxlds->dpa_res, &cxlds->pmem_res,
-			   cxlds->active_volatile_bytes,
-			   cxlds->active_persistent_bytes, "pmem");
+			   mds->active_volatile_bytes,
+			   mds->active_persistent_bytes, "pmem");
 }
 EXPORT_SYMBOL_NS_GPL(cxl_mem_create_range_info, CXL);
 
-int cxl_set_timestamp(struct cxl_dev_state *cxlds)
+int cxl_set_timestamp(struct cxl_memdev_state *mds)
 {
 	struct cxl_mbox_cmd mbox_cmd;
 	struct cxl_mbox_set_timestamp_in pi;
@@ -1154,7 +1162,7 @@ int cxl_set_timestamp(struct cxl_dev_state *cxlds)
 		.payload_in = &pi,
 	};
 
-	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+	rc = cxl_internal_send_cmd(mds, &mbox_cmd);
 	/*
 	 * Command is optional. Devices may have another way of providing
 	 * a timestamp, or may return all 0s in timestamp fields.
@@ -1170,18 +1178,18 @@ EXPORT_SYMBOL_NS_GPL(cxl_set_timestamp, CXL);
 int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
 		       struct cxl_region *cxlr)
 {
-	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
 	struct cxl_mbox_poison_out *po;
 	struct cxl_mbox_poison_in pi;
 	struct cxl_mbox_cmd mbox_cmd;
 	int nr_records = 0;
 	int rc;
 
-	rc = mutex_lock_interruptible(&cxlds->poison.lock);
+	rc = mutex_lock_interruptible(&mds->poison.lock);
 	if (rc)
 		return rc;
 
-	po = cxlds->poison.list_out;
+	po = mds->poison.list_out;
 	pi.offset = cpu_to_le64(offset);
 	pi.length = cpu_to_le64(len / CXL_POISON_LEN_MULT);
 
@@ -1189,13 +1197,13 @@ int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
 		.opcode = CXL_MBOX_OP_GET_POISON,
 		.size_in = sizeof(pi),
 		.payload_in = &pi,
-		.size_out = cxlds->payload_size,
+		.size_out = mds->payload_size,
 		.payload_out = po,
 		.min_out = struct_size(po, record, 0),
 	};
 
 	do {
-		rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+		rc = cxl_internal_send_cmd(mds, &mbox_cmd);
 		if (rc)
 			break;
 
@@ -1206,14 +1214,14 @@ int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
 
 		/* Protect against an uncleared _FLAG_MORE */
 		nr_records = nr_records + le16_to_cpu(po->count);
-		if (nr_records >= cxlds->poison.max_errors) {
+		if (nr_records >= mds->poison.max_errors) {
 			dev_dbg(&cxlmd->dev, "Max Error Records reached: %d\n",
 				nr_records);
 			break;
 		}
 	} while (po->flags & CXL_POISON_FLAG_MORE);
 
-	mutex_unlock(&cxlds->poison.lock);
+	mutex_unlock(&mds->poison.lock);
 	return rc;
 }
 EXPORT_SYMBOL_NS_GPL(cxl_mem_get_poison, CXL);
@@ -1223,52 +1231,52 @@ static void free_poison_buf(void *buf)
 	kvfree(buf);
 }
 
-/* Get Poison List output buffer is protected by cxlds->poison.lock */
-static int cxl_poison_alloc_buf(struct cxl_dev_state *cxlds)
+/* Get Poison List output buffer is protected by mds->poison.lock */
+static int cxl_poison_alloc_buf(struct cxl_memdev_state *mds)
 {
-	cxlds->poison.list_out = kvmalloc(cxlds->payload_size, GFP_KERNEL);
-	if (!cxlds->poison.list_out)
+	mds->poison.list_out = kvmalloc(mds->payload_size, GFP_KERNEL);
+	if (!mds->poison.list_out)
 		return -ENOMEM;
 
-	return devm_add_action_or_reset(cxlds->dev, free_poison_buf,
-					cxlds->poison.list_out);
+	return devm_add_action_or_reset(mds->cxlds.dev, free_poison_buf,
+					mds->poison.list_out);
 }
 
-int cxl_poison_state_init(struct cxl_dev_state *cxlds)
+int cxl_poison_state_init(struct cxl_memdev_state *mds)
 {
 	int rc;
 
-	if (!test_bit(CXL_POISON_ENABLED_LIST, cxlds->poison.enabled_cmds))
+	if (!test_bit(CXL_POISON_ENABLED_LIST, mds->poison.enabled_cmds))
 		return 0;
 
-	rc = cxl_poison_alloc_buf(cxlds);
+	rc = cxl_poison_alloc_buf(mds);
 	if (rc) {
-		clear_bit(CXL_POISON_ENABLED_LIST, cxlds->poison.enabled_cmds);
+		clear_bit(CXL_POISON_ENABLED_LIST, mds->poison.enabled_cmds);
 		return rc;
 	}
 
-	mutex_init(&cxlds->poison.lock);
+	mutex_init(&mds->poison.lock);
 	return 0;
 }
 EXPORT_SYMBOL_NS_GPL(cxl_poison_state_init, CXL);
 
-struct cxl_dev_state *cxl_dev_state_create(struct device *dev)
+struct cxl_memdev_state *cxl_memdev_state_create(struct device *dev)
 {
-	struct cxl_dev_state *cxlds;
+	struct cxl_memdev_state *mds;
 
-	cxlds = devm_kzalloc(dev, sizeof(*cxlds), GFP_KERNEL);
-	if (!cxlds) {
+	mds = devm_kzalloc(dev, sizeof(*mds), GFP_KERNEL);
+	if (!mds) {
 		dev_err(dev, "No memory available\n");
 		return ERR_PTR(-ENOMEM);
 	}
 
-	mutex_init(&cxlds->mbox_mutex);
-	mutex_init(&cxlds->event.log_lock);
-	cxlds->dev = dev;
+	mutex_init(&mds->mbox_mutex);
+	mutex_init(&mds->event.log_lock);
+	mds->cxlds.dev = dev;
 
-	return cxlds;
+	return mds;
 }
-EXPORT_SYMBOL_NS_GPL(cxl_dev_state_create, CXL);
+EXPORT_SYMBOL_NS_GPL(cxl_memdev_state_create, CXL);
 
 void __init cxl_mbox_init(void)
 {
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 057a43267290..15434b1b4909 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -39,8 +39,9 @@ static ssize_t firmware_version_show(struct device *dev,
 {
 	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
 	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
 
-	return sysfs_emit(buf, "%.16s\n", cxlds->firmware_version);
+	return sysfs_emit(buf, "%.16s\n", mds->firmware_version);
 }
 static DEVICE_ATTR_RO(firmware_version);
 
@@ -49,8 +50,9 @@ static ssize_t payload_max_show(struct device *dev,
 {
 	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
 	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
 
-	return sysfs_emit(buf, "%zu\n", cxlds->payload_size);
+	return sysfs_emit(buf, "%zu\n", mds->payload_size);
 }
 static DEVICE_ATTR_RO(payload_max);
 
@@ -59,8 +61,9 @@ static ssize_t label_storage_size_show(struct device *dev,
 {
 	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
 	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
 
-	return sysfs_emit(buf, "%zu\n", cxlds->lsa_size);
+	return sysfs_emit(buf, "%zu\n", mds->lsa_size);
 }
 static DEVICE_ATTR_RO(label_storage_size);
 
@@ -231,7 +234,7 @@ static int cxl_validate_poison_dpa(struct cxl_memdev *cxlmd, u64 dpa)
 
 int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
 {
-	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
 	struct cxl_mbox_inject_poison inject;
 	struct cxl_poison_record record;
 	struct cxl_mbox_cmd mbox_cmd;
@@ -255,13 +258,13 @@ int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
 		.size_in = sizeof(inject),
 		.payload_in = &inject,
 	};
-	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+	rc = cxl_internal_send_cmd(mds, &mbox_cmd);
 	if (rc)
 		goto out;
 
 	cxlr = cxl_dpa_to_region(cxlmd, dpa);
 	if (cxlr)
-		dev_warn_once(cxlds->dev,
+		dev_warn_once(mds->cxlds.dev,
 			      "poison inject dpa:%#llx region: %s\n", dpa,
 			      dev_name(&cxlr->dev));
 
@@ -279,7 +282,7 @@ EXPORT_SYMBOL_NS_GPL(cxl_inject_poison, CXL);
 
 int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
 {
-	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
 	struct cxl_mbox_clear_poison clear;
 	struct cxl_poison_record record;
 	struct cxl_mbox_cmd mbox_cmd;
@@ -312,14 +315,15 @@ int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
 		.payload_in = &clear,
 	};
 
-	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+	rc = cxl_internal_send_cmd(mds, &mbox_cmd);
 	if (rc)
 		goto out;
 
 	cxlr = cxl_dpa_to_region(cxlmd, dpa);
 	if (cxlr)
-		dev_warn_once(cxlds->dev, "poison clear dpa:%#llx region: %s\n",
-			      dpa, dev_name(&cxlr->dev));
+		dev_warn_once(mds->cxlds.dev,
+			      "poison clear dpa:%#llx region: %s\n", dpa,
+			      dev_name(&cxlr->dev));
 
 	record = (struct cxl_poison_record) {
 		.address = cpu_to_le64(dpa),
@@ -397,17 +401,18 @@ EXPORT_SYMBOL_NS_GPL(is_cxl_memdev, CXL);
 
 /**
  * set_exclusive_cxl_commands() - atomically disable user cxl commands
- * @cxlds: The device state to operate on
+ * @mds: The device state to operate on
  * @cmds: bitmap of commands to mark exclusive
  *
  * Grab the cxl_memdev_rwsem in write mode to flush in-flight
  * invocations of the ioctl path and then disable future execution of
  * commands with the command ids set in @cmds.
  */
-void set_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds)
+void set_exclusive_cxl_commands(struct cxl_memdev_state *mds,
+				unsigned long *cmds)
 {
 	down_write(&cxl_memdev_rwsem);
-	bitmap_or(cxlds->exclusive_cmds, cxlds->exclusive_cmds, cmds,
+	bitmap_or(mds->exclusive_cmds, mds->exclusive_cmds, cmds,
 		  CXL_MEM_COMMAND_ID_MAX);
 	up_write(&cxl_memdev_rwsem);
 }
@@ -415,13 +420,14 @@ EXPORT_SYMBOL_NS_GPL(set_exclusive_cxl_commands, CXL);
 
 /**
  * clear_exclusive_cxl_commands() - atomically enable user cxl commands
- * @cxlds: The device state to modify
+ * @mds: The device state to modify
  * @cmds: bitmap of commands to mark available for userspace
  */
-void clear_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds)
+void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
+				  unsigned long *cmds)
 {
 	down_write(&cxl_memdev_rwsem);
-	bitmap_andnot(cxlds->exclusive_cmds, cxlds->exclusive_cmds, cmds,
+	bitmap_andnot(mds->exclusive_cmds, mds->exclusive_cmds, cmds,
 		      CXL_MEM_COMMAND_ID_MAX);
 	up_write(&cxl_memdev_rwsem);
 }
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index a2845a7a69d8..d3fe73d5ba4d 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -267,6 +267,35 @@ struct cxl_poison_state {
  * @cxl_dvsec: Offset to the PCIe device DVSEC
  * @rcd: operating in RCD mode (CXL 3.0 9.11.8 CXL Devices Attached to an RCH)
  * @media_ready: Indicate whether the device media is usable
+ * @dpa_res: Overall DPA resource tree for the device
+ * @pmem_res: Active Persistent memory capacity configuration
+ * @ram_res: Active Volatile memory capacity configuration
+ * @component_reg_phys: register base of component registers
+ * @info: Cached DVSEC information about the device.
+ * @serial: PCIe Device Serial Number
+ */
+struct cxl_dev_state {
+	struct device *dev;
+	struct cxl_memdev *cxlmd;
+	struct cxl_regs regs;
+	int cxl_dvsec;
+	bool rcd;
+	bool media_ready;
+	struct resource dpa_res;
+	struct resource pmem_res;
+	struct resource ram_res;
+	resource_size_t component_reg_phys;
+	u64 serial;
+};
+
+/**
+ * struct cxl_memdev_state - Generic Type-3 Memory Device Class driver data
+ *
+ * CXL 8.1.12.1 PCI Header - Class Code Register Memory Device defines
+ * common memory device functionality like the presence of a mailbox and
+ * the functionality related to that like Identify Memory Device and Get
+ * Partition Info
+ * @cxlds: Core driver state common across Type-2 and Type-3 devices
  * @payload_size: Size of space for payload
  *                (CXL 2.0 8.2.8.4.3 Mailbox Capabilities Register)
  * @lsa_size: Size of Label Storage Area
@@ -275,9 +304,6 @@ struct cxl_poison_state {
  * @firmware_version: Firmware version for the memory device.
  * @enabled_cmds: Hardware commands found enabled in CEL.
  * @exclusive_cmds: Commands that are kernel-internal only
- * @dpa_res: Overall DPA resource tree for the device
- * @pmem_res: Active Persistent memory capacity configuration
- * @ram_res: Active Volatile memory capacity configuration
  * @total_bytes: sum of all possible capacities
  * @volatile_only_bytes: hard volatile capacity
  * @persistent_only_bytes: hard persistent capacity
@@ -286,54 +312,41 @@ struct cxl_poison_state {
  * @active_persistent_bytes: sum of hard + soft persistent
  * @next_volatile_bytes: volatile capacity change pending device reset
  * @next_persistent_bytes: persistent capacity change pending device reset
- * @component_reg_phys: register base of component registers
- * @info: Cached DVSEC information about the device.
- * @serial: PCIe Device Serial Number
  * @event: event log driver state
  * @poison: poison driver state info
  * @mbox_send: @dev specific transport for transmitting mailbox commands
  *
- * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
+ * See CXL 3.0 8.2.9.8.2 Capacity Configuration and Label Storage for
  * details on capacity parameters.
  */
-struct cxl_dev_state {
-	struct device *dev;
-	struct cxl_memdev *cxlmd;
-
-	struct cxl_regs regs;
-	int cxl_dvsec;
-
-	bool rcd;
-	bool media_ready;
+struct cxl_memdev_state {
+	struct cxl_dev_state cxlds;
 	size_t payload_size;
 	size_t lsa_size;
 	struct mutex mbox_mutex; /* Protects device mailbox and firmware */
 	char firmware_version[0x10];
 	DECLARE_BITMAP(enabled_cmds, CXL_MEM_COMMAND_ID_MAX);
 	DECLARE_BITMAP(exclusive_cmds, CXL_MEM_COMMAND_ID_MAX);
-
-	struct resource dpa_res;
-	struct resource pmem_res;
-	struct resource ram_res;
 	u64 total_bytes;
 	u64 volatile_only_bytes;
 	u64 persistent_only_bytes;
 	u64 partition_align_bytes;
-
 	u64 active_volatile_bytes;
 	u64 active_persistent_bytes;
 	u64 next_volatile_bytes;
 	u64 next_persistent_bytes;
-
-	resource_size_t component_reg_phys;
-	u64 serial;
-
 	struct cxl_event_state event;
 	struct cxl_poison_state poison;
-
-	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
+	int (*mbox_send)(struct cxl_memdev_state *mds,
+			 struct cxl_mbox_cmd *cmd);
 };
 
+static inline struct cxl_memdev_state *
+to_cxl_memdev_state(struct cxl_dev_state *cxlds)
+{
+	return container_of(cxlds, struct cxl_memdev_state, cxlds);
+}
+
 enum cxl_opcode {
 	CXL_MBOX_OP_INVALID		= 0x0000,
 	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
@@ -692,18 +705,20 @@ enum {
 	CXL_PMEM_SEC_PASS_USER,
 };
 
-int cxl_internal_send_cmd(struct cxl_dev_state *cxlds,
+int cxl_internal_send_cmd(struct cxl_memdev_state *mds,
 			  struct cxl_mbox_cmd *cmd);
-int cxl_dev_state_identify(struct cxl_dev_state *cxlds);
+int cxl_dev_state_identify(struct cxl_memdev_state *mds);
 int cxl_await_media_ready(struct cxl_dev_state *cxlds);
-int cxl_enumerate_cmds(struct cxl_dev_state *cxlds);
-int cxl_mem_create_range_info(struct cxl_dev_state *cxlds);
-struct cxl_dev_state *cxl_dev_state_create(struct device *dev);
-void set_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
-void clear_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
-void cxl_mem_get_event_records(struct cxl_dev_state *cxlds, u32 status);
-int cxl_set_timestamp(struct cxl_dev_state *cxlds);
-int cxl_poison_state_init(struct cxl_dev_state *cxlds);
+int cxl_enumerate_cmds(struct cxl_memdev_state *mds);
+int cxl_mem_create_range_info(struct cxl_memdev_state *mds);
+struct cxl_memdev_state *cxl_memdev_state_create(struct device *dev);
+void set_exclusive_cxl_commands(struct cxl_memdev_state *mds,
+				unsigned long *cmds);
+void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
+				  unsigned long *cmds);
+void cxl_mem_get_event_records(struct cxl_memdev_state *mds, u32 status);
+int cxl_set_timestamp(struct cxl_memdev_state *mds);
+int cxl_poison_state_init(struct cxl_memdev_state *mds);
 int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
 		       struct cxl_region *cxlr);
 int cxl_trigger_poison_list(struct cxl_memdev *cxlmd);
diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index 519edd0eb196..584f9eec57e4 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -117,6 +117,7 @@ DEFINE_DEBUGFS_ATTRIBUTE(cxl_poison_clear_fops, NULL,
 static int cxl_mem_probe(struct device *dev)
 {
 	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
 	struct cxl_dev_state *cxlds = cxlmd->cxlds;
 	struct device *endpoint_parent;
 	struct cxl_port *parent_port;
@@ -141,10 +142,10 @@ static int cxl_mem_probe(struct device *dev)
 	dentry = cxl_debugfs_create_dir(dev_name(dev));
 	debugfs_create_devm_seqfile(dev, "dpamem", dentry, cxl_mem_dpa_show);
 
-	if (test_bit(CXL_POISON_ENABLED_INJECT, cxlds->poison.enabled_cmds))
+	if (test_bit(CXL_POISON_ENABLED_INJECT, mds->poison.enabled_cmds))
 		debugfs_create_file("inject_poison", 0200, dentry, cxlmd,
 				    &cxl_poison_inject_fops);
-	if (test_bit(CXL_POISON_ENABLED_CLEAR, cxlds->poison.enabled_cmds))
+	if (test_bit(CXL_POISON_ENABLED_CLEAR, mds->poison.enabled_cmds))
 		debugfs_create_file("clear_poison", 0200, dentry, cxlmd,
 				    &cxl_poison_clear_fops);
 
@@ -227,9 +228,12 @@ static umode_t cxl_mem_visible(struct kobject *kobj, struct attribute *a, int n)
 {
 	if (a == &dev_attr_trigger_poison_list.attr) {
 		struct device *dev = kobj_to_dev(kobj);
+		struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
+		struct cxl_memdev_state *mds =
+			to_cxl_memdev_state(cxlmd->cxlds);
 
 		if (!test_bit(CXL_POISON_ENABLED_LIST,
-			      to_cxl_memdev(dev)->cxlds->poison.enabled_cmds))
+			      mds->poison.enabled_cmds))
 			return 0;
 	}
 	return a->mode;
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 0872f2233ed0..4e2845b7331a 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -86,7 +86,7 @@ static int cxl_pci_mbox_wait_for_doorbell(struct cxl_dev_state *cxlds)
 
 /**
  * __cxl_pci_mbox_send_cmd() - Execute a mailbox command
- * @cxlds: The device state to communicate with.
+ * @mds: The memory device driver data
  * @mbox_cmd: Command to send to the memory device.
  *
  * Context: Any context. Expects mbox_mutex to be held.
@@ -106,16 +106,17 @@ static int cxl_pci_mbox_wait_for_doorbell(struct cxl_dev_state *cxlds)
  * not need to coordinate with each other. The driver only uses the primary
  * mailbox.
  */
-static int __cxl_pci_mbox_send_cmd(struct cxl_dev_state *cxlds,
+static int __cxl_pci_mbox_send_cmd(struct cxl_memdev_state *mds,
 				   struct cxl_mbox_cmd *mbox_cmd)
 {
+	struct cxl_dev_state *cxlds = &mds->cxlds;
 	void __iomem *payload = cxlds->regs.mbox + CXLDEV_MBOX_PAYLOAD_OFFSET;
 	struct device *dev = cxlds->dev;
 	u64 cmd_reg, status_reg;
 	size_t out_len;
 	int rc;
 
-	lockdep_assert_held(&cxlds->mbox_mutex);
+	lockdep_assert_held(&mds->mbox_mutex);
 
 	/*
 	 * Here are the steps from 8.2.8.4 of the CXL 2.0 spec.
@@ -196,8 +197,9 @@ static int __cxl_pci_mbox_send_cmd(struct cxl_dev_state *cxlds,
 		 * have requested less data than the hardware supplied even
 		 * within spec.
 		 */
-		size_t n = min3(mbox_cmd->size_out, cxlds->payload_size, out_len);
+		size_t n;
 
+		n = min3(mbox_cmd->size_out, mds->payload_size, out_len);
 		memcpy_fromio(mbox_cmd->payload_out, payload, n);
 		mbox_cmd->size_out = n;
 	} else {
@@ -207,20 +209,23 @@ static int __cxl_pci_mbox_send_cmd(struct cxl_dev_state *cxlds,
 	return 0;
 }
 
-static int cxl_pci_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
+static int cxl_pci_mbox_send(struct cxl_memdev_state *mds,
+			     struct cxl_mbox_cmd *cmd)
 {
 	int rc;
 
-	mutex_lock_io(&cxlds->mbox_mutex);
-	rc = __cxl_pci_mbox_send_cmd(cxlds, cmd);
-	mutex_unlock(&cxlds->mbox_mutex);
+	mutex_lock_io(&mds->mbox_mutex);
+	rc = __cxl_pci_mbox_send_cmd(mds, cmd);
+	mutex_unlock(&mds->mbox_mutex);
 
 	return rc;
 }
 
-static int cxl_pci_setup_mailbox(struct cxl_dev_state *cxlds)
+static int cxl_pci_setup_mailbox(struct cxl_memdev_state *mds)
 {
+	struct cxl_dev_state *cxlds = &mds->cxlds;
 	const int cap = readl(cxlds->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
+	struct device *dev = cxlds->dev;
 	unsigned long timeout;
 	u64 md_status;
 
@@ -234,8 +239,7 @@ static int cxl_pci_setup_mailbox(struct cxl_dev_state *cxlds)
 	} while (!time_after(jiffies, timeout));
 
 	if (!(md_status & CXLMDEV_MBOX_IF_READY)) {
-		cxl_err(cxlds->dev, md_status,
-			"timeout awaiting mailbox ready");
+		cxl_err(dev, md_status, "timeout awaiting mailbox ready");
 		return -ETIMEDOUT;
 	}
 
@@ -246,12 +250,12 @@ static int cxl_pci_setup_mailbox(struct cxl_dev_state *cxlds)
 	 * source for future doorbell busy events.
 	 */
 	if (cxl_pci_mbox_wait_for_doorbell(cxlds) != 0) {
-		cxl_err(cxlds->dev, md_status, "timeout awaiting mailbox idle");
+		cxl_err(dev, md_status, "timeout awaiting mailbox idle");
 		return -ETIMEDOUT;
 	}
 
-	cxlds->mbox_send = cxl_pci_mbox_send;
-	cxlds->payload_size =
+	mds->mbox_send = cxl_pci_mbox_send;
+	mds->payload_size =
 		1 << FIELD_GET(CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK, cap);
 
 	/*
@@ -261,15 +265,14 @@ static int cxl_pci_setup_mailbox(struct cxl_dev_state *cxlds)
 	 * there's no point in going forward. If the size is too large, there's
 	 * no harm in soft limiting it.
 	 */
-	cxlds->payload_size = min_t(size_t, cxlds->payload_size, SZ_1M);
-	if (cxlds->payload_size < 256) {
-		dev_err(cxlds->dev, "Mailbox is too small (%zub)",
-			cxlds->payload_size);
+	mds->payload_size = min_t(size_t, mds->payload_size, SZ_1M);
+	if (mds->payload_size < 256) {
+		dev_err(dev, "Mailbox is too small (%zub)",
+			mds->payload_size);
 		return -ENXIO;
 	}
 
-	dev_dbg(cxlds->dev, "Mailbox payload sized %zu",
-		cxlds->payload_size);
+	dev_dbg(dev, "Mailbox payload sized %zu", mds->payload_size);
 
 	return 0;
 }
@@ -433,18 +436,18 @@ static void free_event_buf(void *buf)
 
 /*
  * There is a single buffer for reading event logs from the mailbox.  All logs
- * share this buffer protected by the cxlds->event_log_lock.
+ * share this buffer protected by the mds->event_log_lock.
  */
-static int cxl_mem_alloc_event_buf(struct cxl_dev_state *cxlds)
+static int cxl_mem_alloc_event_buf(struct cxl_memdev_state *mds)
 {
 	struct cxl_get_event_payload *buf;
 
-	buf = kvmalloc(cxlds->payload_size, GFP_KERNEL);
+	buf = kvmalloc(mds->payload_size, GFP_KERNEL);
 	if (!buf)
 		return -ENOMEM;
-	cxlds->event.buf = buf;
+	mds->event.buf = buf;
 
-	return devm_add_action_or_reset(cxlds->dev, free_event_buf, buf);
+	return devm_add_action_or_reset(mds->cxlds.dev, free_event_buf, buf);
 }
 
 static int cxl_alloc_irq_vectors(struct pci_dev *pdev)
@@ -477,6 +480,7 @@ static irqreturn_t cxl_event_thread(int irq, void *id)
 {
 	struct cxl_dev_id *dev_id = id;
 	struct cxl_dev_state *cxlds = dev_id->cxlds;
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
 	u32 status;
 
 	do {
@@ -489,7 +493,7 @@ static irqreturn_t cxl_event_thread(int irq, void *id)
 		status &= CXLDEV_EVENT_STATUS_ALL;
 		if (!status)
 			break;
-		cxl_mem_get_event_records(cxlds, status);
+		cxl_mem_get_event_records(mds, status);
 		cond_resched();
 	} while (status);
 
@@ -522,7 +526,7 @@ static int cxl_event_req_irq(struct cxl_dev_state *cxlds, u8 setting)
 					 dev_id);
 }
 
-static int cxl_event_get_int_policy(struct cxl_dev_state *cxlds,
+static int cxl_event_get_int_policy(struct cxl_memdev_state *mds,
 				    struct cxl_event_interrupt_policy *policy)
 {
 	struct cxl_mbox_cmd mbox_cmd = {
@@ -532,15 +536,15 @@ static int cxl_event_get_int_policy(struct cxl_dev_state *cxlds,
 	};
 	int rc;
 
-	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+	rc = cxl_internal_send_cmd(mds, &mbox_cmd);
 	if (rc < 0)
-		dev_err(cxlds->dev, "Failed to get event interrupt policy : %d",
-			rc);
+		dev_err(mds->cxlds.dev,
+			"Failed to get event interrupt policy : %d", rc);
 
 	return rc;
 }
 
-static int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
+static int cxl_event_config_msgnums(struct cxl_memdev_state *mds,
 				    struct cxl_event_interrupt_policy *policy)
 {
 	struct cxl_mbox_cmd mbox_cmd;
@@ -559,23 +563,24 @@ static int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
 		.size_in = sizeof(*policy),
 	};
 
-	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+	rc = cxl_internal_send_cmd(mds, &mbox_cmd);
 	if (rc < 0) {
-		dev_err(cxlds->dev, "Failed to set event interrupt policy : %d",
+		dev_err(mds->cxlds.dev, "Failed to set event interrupt policy : %d",
 			rc);
 		return rc;
 	}
 
 	/* Retrieve final interrupt settings */
-	return cxl_event_get_int_policy(cxlds, policy);
+	return cxl_event_get_int_policy(mds, policy);
 }
 
-static int cxl_event_irqsetup(struct cxl_dev_state *cxlds)
+static int cxl_event_irqsetup(struct cxl_memdev_state *mds)
 {
+	struct cxl_dev_state *cxlds = &mds->cxlds;
 	struct cxl_event_interrupt_policy policy;
 	int rc;
 
-	rc = cxl_event_config_msgnums(cxlds, &policy);
+	rc = cxl_event_config_msgnums(mds, &policy);
 	if (rc)
 		return rc;
 
@@ -614,7 +619,7 @@ static bool cxl_event_int_is_fw(u8 setting)
 }
 
 static int cxl_event_config(struct pci_host_bridge *host_bridge,
-			    struct cxl_dev_state *cxlds)
+			    struct cxl_memdev_state *mds)
 {
 	struct cxl_event_interrupt_policy policy;
 	int rc;
@@ -626,11 +631,11 @@ static int cxl_event_config(struct pci_host_bridge *host_bridge,
 	if (!host_bridge->native_cxl_error)
 		return 0;
 
-	rc = cxl_mem_alloc_event_buf(cxlds);
+	rc = cxl_mem_alloc_event_buf(mds);
 	if (rc)
 		return rc;
 
-	rc = cxl_event_get_int_policy(cxlds, &policy);
+	rc = cxl_event_get_int_policy(mds, &policy);
 	if (rc)
 		return rc;
 
@@ -638,15 +643,16 @@ static int cxl_event_config(struct pci_host_bridge *host_bridge,
 	    cxl_event_int_is_fw(policy.warn_settings) ||
 	    cxl_event_int_is_fw(policy.failure_settings) ||
 	    cxl_event_int_is_fw(policy.fatal_settings)) {
-		dev_err(cxlds->dev, "FW still in control of Event Logs despite _OSC settings\n");
+		dev_err(mds->cxlds.dev,
+			"FW still in control of Event Logs despite _OSC settings\n");
 		return -EBUSY;
 	}
 
-	rc = cxl_event_irqsetup(cxlds);
+	rc = cxl_event_irqsetup(mds);
 	if (rc)
 		return rc;
 
-	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
+	cxl_mem_get_event_records(mds, CXLDEV_EVENT_STATUS_ALL);
 
 	return 0;
 }
@@ -654,9 +660,10 @@ static int cxl_event_config(struct pci_host_bridge *host_bridge,
 static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 {
 	struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
+	struct cxl_memdev_state *mds;
+	struct cxl_dev_state *cxlds;
 	struct cxl_register_map map;
 	struct cxl_memdev *cxlmd;
-	struct cxl_dev_state *cxlds;
 	int rc;
 
 	/*
@@ -671,9 +678,10 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 		return rc;
 	pci_set_master(pdev);
 
-	cxlds = cxl_dev_state_create(&pdev->dev);
-	if (IS_ERR(cxlds))
-		return PTR_ERR(cxlds);
+	mds = cxl_memdev_state_create(&pdev->dev);
+	if (IS_ERR(mds))
+		return PTR_ERR(mds);
+	cxlds = &mds->cxlds;
 	pci_set_drvdata(pdev, cxlds);
 
 	cxlds->rcd = is_cxl_restricted(pdev);
@@ -714,27 +722,27 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	else
 		dev_warn(&pdev->dev, "Media not active (%d)\n", rc);
 
-	rc = cxl_pci_setup_mailbox(cxlds);
+	rc = cxl_pci_setup_mailbox(mds);
 	if (rc)
 		return rc;
 
-	rc = cxl_enumerate_cmds(cxlds);
+	rc = cxl_enumerate_cmds(mds);
 	if (rc)
 		return rc;
 
-	rc = cxl_set_timestamp(cxlds);
+	rc = cxl_set_timestamp(mds);
 	if (rc)
 		return rc;
 
-	rc = cxl_poison_state_init(cxlds);
+	rc = cxl_poison_state_init(mds);
 	if (rc)
 		return rc;
 
-	rc = cxl_dev_state_identify(cxlds);
+	rc = cxl_dev_state_identify(mds);
 	if (rc)
 		return rc;
 
-	rc = cxl_mem_create_range_info(cxlds);
+	rc = cxl_mem_create_range_info(mds);
 	if (rc)
 		return rc;
 
@@ -746,7 +754,7 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (IS_ERR(cxlmd))
 		return PTR_ERR(cxlmd);
 
-	rc = cxl_event_config(host_bridge, cxlds);
+	rc = cxl_event_config(host_bridge, mds);
 	if (rc)
 		return rc;
 
diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
index 71cfa1fdf902..7cb8994f8809 100644
--- a/drivers/cxl/pmem.c
+++ b/drivers/cxl/pmem.c
@@ -15,9 +15,9 @@ extern const struct nvdimm_security_ops *cxl_security_ops;
 
 static __read_mostly DECLARE_BITMAP(exclusive_cmds, CXL_MEM_COMMAND_ID_MAX);
 
-static void clear_exclusive(void *cxlds)
+static void clear_exclusive(void *mds)
 {
-	clear_exclusive_cxl_commands(cxlds, exclusive_cmds);
+	clear_exclusive_cxl_commands(mds, exclusive_cmds);
 }
 
 static void unregister_nvdimm(void *nvdimm)
@@ -65,13 +65,13 @@ static int cxl_nvdimm_probe(struct device *dev)
 	struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev);
 	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
 	struct cxl_nvdimm_bridge *cxl_nvb = cxlmd->cxl_nvb;
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
 	unsigned long flags = 0, cmd_mask = 0;
-	struct cxl_dev_state *cxlds = cxlmd->cxlds;
 	struct nvdimm *nvdimm;
 	int rc;
 
-	set_exclusive_cxl_commands(cxlds, exclusive_cmds);
-	rc = devm_add_action_or_reset(dev, clear_exclusive, cxlds);
+	set_exclusive_cxl_commands(mds, exclusive_cmds);
+	rc = devm_add_action_or_reset(dev, clear_exclusive, mds);
 	if (rc)
 		return rc;
 
@@ -100,22 +100,23 @@ static struct cxl_driver cxl_nvdimm_driver = {
 	},
 };
 
-static int cxl_pmem_get_config_size(struct cxl_dev_state *cxlds,
+static int cxl_pmem_get_config_size(struct cxl_memdev_state *mds,
 				    struct nd_cmd_get_config_size *cmd,
 				    unsigned int buf_len)
 {
 	if (sizeof(*cmd) > buf_len)
 		return -EINVAL;
 
-	*cmd = (struct nd_cmd_get_config_size) {
-		 .config_size = cxlds->lsa_size,
-		 .max_xfer = cxlds->payload_size - sizeof(struct cxl_mbox_set_lsa),
+	*cmd = (struct nd_cmd_get_config_size){
+		.config_size = mds->lsa_size,
+		.max_xfer =
+			mds->payload_size - sizeof(struct cxl_mbox_set_lsa),
 	};
 
 	return 0;
 }
 
-static int cxl_pmem_get_config_data(struct cxl_dev_state *cxlds,
+static int cxl_pmem_get_config_data(struct cxl_memdev_state *mds,
 				    struct nd_cmd_get_config_data_hdr *cmd,
 				    unsigned int buf_len)
 {
@@ -140,13 +141,13 @@ static int cxl_pmem_get_config_data(struct cxl_dev_state *cxlds,
 		.payload_out = cmd->out_buf,
 	};
 
-	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+	rc = cxl_internal_send_cmd(mds, &mbox_cmd);
 	cmd->status = 0;
 
 	return rc;
 }
 
-static int cxl_pmem_set_config_data(struct cxl_dev_state *cxlds,
+static int cxl_pmem_set_config_data(struct cxl_memdev_state *mds,
 				    struct nd_cmd_set_config_hdr *cmd,
 				    unsigned int buf_len)
 {
@@ -176,7 +177,7 @@ static int cxl_pmem_set_config_data(struct cxl_dev_state *cxlds,
 		.size_in = struct_size(set_lsa, data, cmd->in_length),
 	};
 
-	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+	rc = cxl_internal_send_cmd(mds, &mbox_cmd);
 
 	/*
 	 * Set "firmware" status (4-packed bytes at the end of the input
@@ -194,18 +195,18 @@ static int cxl_pmem_nvdimm_ctl(struct nvdimm *nvdimm, unsigned int cmd,
 	struct cxl_nvdimm *cxl_nvd = nvdimm_provider_data(nvdimm);
 	unsigned long cmd_mask = nvdimm_cmd_mask(nvdimm);
 	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
-	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
 
 	if (!test_bit(cmd, &cmd_mask))
 		return -ENOTTY;
 
 	switch (cmd) {
 	case ND_CMD_GET_CONFIG_SIZE:
-		return cxl_pmem_get_config_size(cxlds, buf, buf_len);
+		return cxl_pmem_get_config_size(mds, buf, buf_len);
 	case ND_CMD_GET_CONFIG_DATA:
-		return cxl_pmem_get_config_data(cxlds, buf, buf_len);
+		return cxl_pmem_get_config_data(mds, buf, buf_len);
 	case ND_CMD_SET_CONFIG_DATA:
-		return cxl_pmem_set_config_data(cxlds, buf, buf_len);
+		return cxl_pmem_set_config_data(mds, buf, buf_len);
 	default:
 		return -ENOTTY;
 	}
diff --git a/drivers/cxl/security.c b/drivers/cxl/security.c
index 4ad4bda2d18e..8c98fc674fa7 100644
--- a/drivers/cxl/security.c
+++ b/drivers/cxl/security.c
@@ -14,7 +14,7 @@ static unsigned long cxl_pmem_get_security_flags(struct nvdimm *nvdimm,
 {
 	struct cxl_nvdimm *cxl_nvd = nvdimm_provider_data(nvdimm);
 	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
-	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
 	unsigned long security_flags = 0;
 	struct cxl_get_security_output {
 		__le32 flags;
@@ -29,7 +29,7 @@ static unsigned long cxl_pmem_get_security_flags(struct nvdimm *nvdimm,
 		.payload_out = &out,
 	};
 
-	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+	rc = cxl_internal_send_cmd(mds, &mbox_cmd);
 	if (rc < 0)
 		return 0;
 
@@ -67,7 +67,7 @@ static int cxl_pmem_security_change_key(struct nvdimm *nvdimm,
 {
 	struct cxl_nvdimm *cxl_nvd = nvdimm_provider_data(nvdimm);
 	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
-	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
 	struct cxl_mbox_cmd mbox_cmd;
 	struct cxl_set_pass set_pass;
 
@@ -84,7 +84,7 @@ static int cxl_pmem_security_change_key(struct nvdimm *nvdimm,
 		.payload_in = &set_pass,
 	};
 
-	return cxl_internal_send_cmd(cxlds, &mbox_cmd);
+	return cxl_internal_send_cmd(mds, &mbox_cmd);
 }
 
 static int __cxl_pmem_security_disable(struct nvdimm *nvdimm,
@@ -93,7 +93,7 @@ static int __cxl_pmem_security_disable(struct nvdimm *nvdimm,
 {
 	struct cxl_nvdimm *cxl_nvd = nvdimm_provider_data(nvdimm);
 	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
-	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
 	struct cxl_disable_pass dis_pass;
 	struct cxl_mbox_cmd mbox_cmd;
 
@@ -109,7 +109,7 @@ static int __cxl_pmem_security_disable(struct nvdimm *nvdimm,
 		.payload_in = &dis_pass,
 	};
 
-	return cxl_internal_send_cmd(cxlds, &mbox_cmd);
+	return cxl_internal_send_cmd(mds, &mbox_cmd);
 }
 
 static int cxl_pmem_security_disable(struct nvdimm *nvdimm,
@@ -128,12 +128,12 @@ static int cxl_pmem_security_freeze(struct nvdimm *nvdimm)
 {
 	struct cxl_nvdimm *cxl_nvd = nvdimm_provider_data(nvdimm);
 	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
-	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
 	struct cxl_mbox_cmd mbox_cmd = {
 		.opcode = CXL_MBOX_OP_FREEZE_SECURITY,
 	};
 
-	return cxl_internal_send_cmd(cxlds, &mbox_cmd);
+	return cxl_internal_send_cmd(mds, &mbox_cmd);
 }
 
 static int cxl_pmem_security_unlock(struct nvdimm *nvdimm,
@@ -141,7 +141,7 @@ static int cxl_pmem_security_unlock(struct nvdimm *nvdimm,
 {
 	struct cxl_nvdimm *cxl_nvd = nvdimm_provider_data(nvdimm);
 	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
-	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
 	u8 pass[NVDIMM_PASSPHRASE_LEN];
 	struct cxl_mbox_cmd mbox_cmd;
 	int rc;
@@ -153,7 +153,7 @@ static int cxl_pmem_security_unlock(struct nvdimm *nvdimm,
 		.payload_in = pass,
 	};
 
-	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+	rc = cxl_internal_send_cmd(mds, &mbox_cmd);
 	if (rc < 0)
 		return rc;
 
@@ -166,7 +166,7 @@ static int cxl_pmem_security_passphrase_erase(struct nvdimm *nvdimm,
 {
 	struct cxl_nvdimm *cxl_nvd = nvdimm_provider_data(nvdimm);
 	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
-	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
 	struct cxl_mbox_cmd mbox_cmd;
 	struct cxl_pass_erase erase;
 	int rc;
@@ -182,7 +182,7 @@ static int cxl_pmem_security_passphrase_erase(struct nvdimm *nvdimm,
 		.payload_in = &erase,
 	};
 
-	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+	rc = cxl_internal_send_cmd(mds, &mbox_cmd);
 	if (rc < 0)
 		return rc;
 
diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
index bdaf086d994e..6fb5718588f3 100644
--- a/tools/testing/cxl/test/mem.c
+++ b/tools/testing/cxl/test/mem.c
@@ -102,7 +102,7 @@ struct mock_event_log {
 };
 
 struct mock_event_store {
-	struct cxl_dev_state *cxlds;
+	struct cxl_memdev_state *mds;
 	struct mock_event_log mock_logs[CXL_EVENT_TYPE_MAX];
 	u32 ev_status;
 };
@@ -291,7 +291,7 @@ static void cxl_mock_event_trigger(struct device *dev)
 			event_reset_log(log);
 	}
 
-	cxl_mem_get_event_records(mes->cxlds, mes->ev_status);
+	cxl_mem_get_event_records(mes->mds, mes->ev_status);
 }
 
 struct cxl_event_record_raw maint_needed = {
@@ -451,7 +451,7 @@ static int mock_gsl(struct cxl_mbox_cmd *cmd)
 	return 0;
 }
 
-static int mock_get_log(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
+static int mock_get_log(struct cxl_memdev_state *mds, struct cxl_mbox_cmd *cmd)
 {
 	struct cxl_mbox_get_log *gl = cmd->payload_in;
 	u32 offset = le32_to_cpu(gl->offset);
@@ -461,7 +461,7 @@ static int mock_get_log(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
 
 	if (cmd->size_in < sizeof(*gl))
 		return -EINVAL;
-	if (length > cxlds->payload_size)
+	if (length > mds->payload_size)
 		return -EINVAL;
 	if (offset + length > sizeof(mock_cel))
 		return -EINVAL;
@@ -1105,8 +1105,10 @@ static struct attribute *cxl_mock_mem_core_attrs[] = {
 };
 ATTRIBUTE_GROUPS(cxl_mock_mem_core);
 
-static int cxl_mock_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
+static int cxl_mock_mbox_send(struct cxl_memdev_state *mds,
+			      struct cxl_mbox_cmd *cmd)
 {
+	struct cxl_dev_state *cxlds = &mds->cxlds;
 	struct device *dev = cxlds->dev;
 	struct cxl_mockmem_data *mdata = dev_get_drvdata(dev);
 	int rc = -EIO;
@@ -1119,7 +1121,7 @@ static int cxl_mock_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *
 		rc = mock_gsl(cmd);
 		break;
 	case CXL_MBOX_OP_GET_LOG:
-		rc = mock_get_log(cxlds, cmd);
+		rc = mock_get_log(mds, cmd);
 		break;
 	case CXL_MBOX_OP_IDENTIFY:
 		if (cxlds->rcd)
@@ -1207,6 +1209,7 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
 {
 	struct device *dev = &pdev->dev;
 	struct cxl_memdev *cxlmd;
+	struct cxl_memdev_state *mds;
 	struct cxl_dev_state *cxlds;
 	struct cxl_mockmem_data *mdata;
 	int rc;
@@ -1223,48 +1226,50 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
 	if (rc)
 		return rc;
 
-	cxlds = cxl_dev_state_create(dev);
-	if (IS_ERR(cxlds))
-		return PTR_ERR(cxlds);
+	mds = cxl_memdev_state_create(dev);
+	if (IS_ERR(mds))
+		return PTR_ERR(mds);
+
+	mds->mbox_send = cxl_mock_mbox_send;
+	mds->payload_size = SZ_4K;
+	mds->event.buf = (struct cxl_get_event_payload *) mdata->event_buf;
 
+	cxlds = &mds->cxlds;
 	cxlds->serial = pdev->id;
-	cxlds->mbox_send = cxl_mock_mbox_send;
-	cxlds->payload_size = SZ_4K;
-	cxlds->event.buf = (struct cxl_get_event_payload *) mdata->event_buf;
 	if (is_rcd(pdev)) {
 		cxlds->rcd = true;
 		cxlds->component_reg_phys = CXL_RESOURCE_NONE;
 	}
 
-	rc = cxl_enumerate_cmds(cxlds);
+	rc = cxl_enumerate_cmds(mds);
 	if (rc)
 		return rc;
 
-	rc = cxl_poison_state_init(cxlds);
+	rc = cxl_poison_state_init(mds);
 	if (rc)
 		return rc;
 
-	rc = cxl_set_timestamp(cxlds);
+	rc = cxl_set_timestamp(mds);
 	if (rc)
 		return rc;
 
 	cxlds->media_ready = true;
-	rc = cxl_dev_state_identify(cxlds);
+	rc = cxl_dev_state_identify(mds);
 	if (rc)
 		return rc;
 
-	rc = cxl_mem_create_range_info(cxlds);
+	rc = cxl_mem_create_range_info(mds);
 	if (rc)
 		return rc;
 
-	mdata->mes.cxlds = cxlds;
+	mdata->mes.mds = mds;
 	cxl_mock_add_event_logs(&mdata->mes);
 
 	cxlmd = devm_cxl_add_memdev(cxlds);
 	if (IS_ERR(cxlmd))
 		return PTR_ERR(cxlmd);
 
-	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
+	cxl_mem_get_event_records(mds, CXLDEV_EVENT_STATUS_ALL);
 
 	return 0;
 }
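To summarize the split above: generic CXL.mem infrastructure keeps
passing 'struct cxl_dev_state' around, while mailbox-dependent paths
take 'struct cxl_memdev_state'. A minimal sketch of the round-trip
between the two objects, assuming a class-code device:

	struct cxl_memdev_state *mds = cxl_memdev_state_create(dev);
	struct cxl_dev_state *cxlds = &mds->cxlds;	/* generic view */

	/* mailbox-aware paths recover the container when needed */
	struct cxl_memdev_state *again = to_cxl_memdev_state(cxlds);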


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 04/19] cxl/memdev: Make mailbox functionality optional
  2023-06-04 23:31 [PATCH 00/19] cxl: Device memory setup Dan Williams
                   ` (2 preceding siblings ...)
  2023-06-04 23:31 ` [PATCH 03/19] cxl/mbox: Move mailbox related driver state to its own data structure Dan Williams
@ 2023-06-04 23:31 ` Dan Williams
  2023-06-06 11:15   ` Jonathan Cameron
  2023-06-04 23:32 ` [PATCH 05/19] cxl/port: Rename CXL_DECODER_{EXPANDER, ACCELERATOR} => {HOSTMEM, DEVMEM} Dan Williams
                   ` (14 subsequent siblings)
  18 siblings, 1 reply; 64+ messages in thread
From: Dan Williams @ 2023-06-04 23:31 UTC (permalink / raw)
  To: linux-cxl; +Cc: ira.weiny, navneet.singh

In support of the Linux CXL core scaling for a wider set of CXL devices,
allow for the creation of memdevs with some memory device capabilities
disabled. Specifically, allow for CXL devices outside of those claiming
to be compliant with the generic CXL memory device class code, like
vendor specific Type-2/3 devices that host CXL.mem. This implies, allow
for the creation of memdevs that only support component-registers, not
necessarily memory-device-registers (like mailbox registers). A memdev
derived from a CXL endpoint that does not support generic class code
expectations is tagged "CXL_DEVTYPE_DEVMEM", while a memdev derived from a
class-code compliant endpoint is tagged "CXL_DEVTYPE_CLASSMEM".

The primary assumption of a CXL_DEVTYPE_DEVMEM memdev is that it may
not host a mailbox. Disable the command passthrough ioctl
for memdevs that are not CXL_DEVTYPE_CLASSMEM, and return empty strings
from memdev attributes associated with data retrieved via the
class-device-standard IDENTIFY command. Note that empty strings were
chosen over attribute visibility to maintain compatibility with shipping
versions of cxl-cli that expect those attributes to always be present.
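For illustration, a vendor Type-2 driver with no mailbox would embed
the generic state directly and tag it accordingly; a minimal sketch,
where the my_accel_* names are hypothetical:

	struct my_accel_state {
		struct cxl_dev_state cxlds;
		/* vendor-private members ... */
	};

	static int my_accel_init(struct device *dev)
	{
		struct my_accel_state *state;

		state = devm_kzalloc(dev, sizeof(*state), GFP_KERNEL);
		if (!state)
			return -ENOMEM;

		state->cxlds.dev = dev;
		/* opt out of class-code (mailbox) expectations */
		state->cxlds.type = CXL_DEVTYPE_DEVMEM;
		return 0;
	}

With that tag in place to_cxl_memdev_state() returns NULL, so the core
skips mailbox-dependent features for such a memdev.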

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/mbox.c   |    1 +
 drivers/cxl/core/memdev.c |   10 +++++++++-
 drivers/cxl/cxlmem.h      |   18 ++++++++++++++++++
 3 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 14805dae5a74..3ca0bf12c55f 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -1273,6 +1273,7 @@ struct cxl_memdev_state *cxl_memdev_state_create(struct device *dev)
 	mutex_init(&mds->mbox_mutex);
 	mutex_init(&mds->event.log_lock);
 	mds->cxlds.dev = dev;
+	mds->cxlds.type = CXL_DEVTYPE_CLASSMEM;
 
 	return mds;
 }
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 15434b1b4909..3f2d54f30548 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -41,6 +41,8 @@ static ssize_t firmware_version_show(struct device *dev,
 	struct cxl_dev_state *cxlds = cxlmd->cxlds;
 	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
 
+	if (!mds)
+		return sysfs_emit(buf, "\n");
 	return sysfs_emit(buf, "%.16s\n", mds->firmware_version);
 }
 static DEVICE_ATTR_RO(firmware_version);
@@ -52,6 +54,8 @@ static ssize_t payload_max_show(struct device *dev,
 	struct cxl_dev_state *cxlds = cxlmd->cxlds;
 	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
 
+	if (!mds)
+		return sysfs_emit(buf, "\n");
 	return sysfs_emit(buf, "%zu\n", mds->payload_size);
 }
 static DEVICE_ATTR_RO(payload_max);
@@ -63,6 +67,8 @@ static ssize_t label_storage_size_show(struct device *dev,
 	struct cxl_dev_state *cxlds = cxlmd->cxlds;
 	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
 
+	if (!mds)
+		return sysfs_emit(buf, "\n");
 	return sysfs_emit(buf, "%zu\n", mds->lsa_size);
 }
 static DEVICE_ATTR_RO(label_storage_size);
@@ -517,10 +523,12 @@ static long cxl_memdev_ioctl(struct file *file, unsigned int cmd,
 			     unsigned long arg)
 {
 	struct cxl_memdev *cxlmd = file->private_data;
+	struct cxl_dev_state *cxlds;
 	int rc = -ENXIO;
 
 	down_read(&cxl_memdev_rwsem);
-	if (cxlmd->cxlds)
+	cxlds = cxlmd->cxlds;
+	if (cxlds && cxlds->type == CXL_DEVTYPE_CLASSMEM)
 		rc = __cxl_memdev_ioctl(cxlmd, cmd, arg);
 	up_read(&cxl_memdev_rwsem);
 
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index d3fe73d5ba4d..b8bdf7490d2c 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -254,6 +254,20 @@ struct cxl_poison_state {
 	struct mutex lock;  /* Protect reads of poison list */
 };
 
+/*
+ * enum cxl_devtype - delineate type-2 from a generic type-3 device
+ * @CXL_DEVTYPE_DEVMEM - Vendor specific CXL Type-2 device implementing HDM-D or
+ *			 HDM-DB, no expectation that this device implements a
+ *			 mailbox, or other memory-device-standard manageability
+ *			 flows.
+ * @CXL_DEVTYPE_CLASSMEM - Common class definition of a CXL Type-3 device with
+ *			   HDM-H and class-mandatory memory device registers
+ */
+enum cxl_devtype {
+	CXL_DEVTYPE_DEVMEM,
+	CXL_DEVTYPE_CLASSMEM,
+};
+
 /**
  * struct cxl_dev_state - The driver device state
  *
@@ -273,6 +287,7 @@ struct cxl_poison_state {
  * @component_reg_phys: register base of component registers
  * @info: Cached DVSEC information about the device.
  * @serial: PCIe Device Serial Number
+ * @type: Generic Memory Class device or Vendor Specific Memory device
  */
 struct cxl_dev_state {
 	struct device *dev;
@@ -286,6 +301,7 @@ struct cxl_dev_state {
 	struct resource ram_res;
 	resource_size_t component_reg_phys;
 	u64 serial;
+	enum cxl_devtype type;
 };
 
 /**
@@ -344,6 +360,8 @@ struct cxl_memdev_state {
 static inline struct cxl_memdev_state *
 to_cxl_memdev_state(struct cxl_dev_state *cxlds)
 {
+	if (cxlds->type != CXL_DEVTYPE_CLASSMEM)
+		return NULL;
 	return container_of(cxlds, struct cxl_memdev_state, cxlds);
 }
 


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 05/19] cxl/port: Rename CXL_DECODER_{EXPANDER, ACCELERATOR} => {HOSTMEM, DEVMEM}
  2023-06-04 23:31 [PATCH 00/19] cxl: Device memory setup Dan Williams
                   ` (3 preceding siblings ...)
  2023-06-04 23:31 ` [PATCH 04/19] cxl/memdev: Make mailbox functionality optional Dan Williams
@ 2023-06-04 23:32 ` Dan Williams
  2023-06-06 11:21   ` Jonathan Cameron
  2023-06-04 23:32 ` [PATCH 06/19] cxl/hdm: Default CXL_DEVTYPE_DEVMEM decoders to CXL_DECODER_DEVMEM Dan Williams
                   ` (13 subsequent siblings)
  18 siblings, 1 reply; 64+ messages in thread
From: Dan Williams @ 2023-06-04 23:32 UTC (permalink / raw)
  To: linux-cxl; +Cc: ira.weiny, navneet.singh

In preparation for supporting HDM-D and HDM-DB configuration
(device-memory, and device-memory with back-invalidate), rename the
current type designators to use HOSTMEM and DEVMEM as a suffix.

HDM-DB can be supported by devices that are not accelerators, so DEVMEM is
a more generic term for that case.

Fix up one location where this type value was open-coded.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/acpi.c           |    2 +-
 drivers/cxl/core/hdm.c       |   10 +++++-----
 drivers/cxl/core/port.c      |    6 +++---
 drivers/cxl/core/region.c    |    2 +-
 drivers/cxl/cxl.h            |    4 ++--
 tools/testing/cxl/test/cxl.c |    6 +++---
 6 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index 7e1765b09e04..4e66483f1fd3 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -258,7 +258,7 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
 
 	cxld = &cxlrd->cxlsd.cxld;
 	cxld->flags = cfmws_to_decoder_flags(cfmws->restrictions);
-	cxld->target_type = CXL_DECODER_EXPANDER;
+	cxld->target_type = CXL_DECODER_HOSTMEM;
 	cxld->hpa_range = (struct range) {
 		.start = res->start,
 		.end = res->end,
diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index 7889ff203a34..de8a3fb28331 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -570,7 +570,7 @@ static void cxld_set_interleave(struct cxl_decoder *cxld, u32 *ctrl)
 
 static void cxld_set_type(struct cxl_decoder *cxld, u32 *ctrl)
 {
-	u32p_replace_bits(ctrl, !!(cxld->target_type == 3),
+	u32p_replace_bits(ctrl, !!(cxld->target_type == CXL_DECODER_HOSTMEM),
 			  CXL_HDM_DECODER0_CTRL_TYPE);
 }
 
@@ -764,7 +764,7 @@ static int cxl_setup_hdm_decoder_from_dvsec(
 	if (!len)
 		return -ENOENT;
 
-	cxld->target_type = CXL_DECODER_EXPANDER;
+	cxld->target_type = CXL_DECODER_HOSTMEM;
 	cxld->commit = NULL;
 	cxld->reset = NULL;
 	cxld->hpa_range = info->dvsec_range[which];
@@ -838,9 +838,9 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
 		if (ctrl & CXL_HDM_DECODER0_CTRL_LOCK)
 			cxld->flags |= CXL_DECODER_F_LOCK;
 		if (FIELD_GET(CXL_HDM_DECODER0_CTRL_TYPE, ctrl))
-			cxld->target_type = CXL_DECODER_EXPANDER;
+			cxld->target_type = CXL_DECODER_HOSTMEM;
 		else
-			cxld->target_type = CXL_DECODER_ACCELERATOR;
+			cxld->target_type = CXL_DECODER_DEVMEM;
 		if (cxld->id != port->commit_end + 1) {
 			dev_warn(&port->dev,
 				 "decoder%d.%d: Committed out of order\n",
@@ -861,7 +861,7 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
 			ctrl |= CXL_HDM_DECODER0_CTRL_TYPE;
 			writel(ctrl, hdm + CXL_HDM_DECODER0_CTRL_OFFSET(which));
 		}
-		cxld->target_type = CXL_DECODER_EXPANDER;
+		cxld->target_type = CXL_DECODER_HOSTMEM;
 	}
 	rc = eiw_to_ways(FIELD_GET(CXL_HDM_DECODER0_CTRL_IW_MASK, ctrl),
 			  &cxld->interleave_ways);
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index e7c284c890bc..432a4ac38f36 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -117,9 +117,9 @@ static ssize_t target_type_show(struct device *dev,
 	struct cxl_decoder *cxld = to_cxl_decoder(dev);
 
 	switch (cxld->target_type) {
-	case CXL_DECODER_ACCELERATOR:
+	case CXL_DECODER_DEVMEM:
 		return sysfs_emit(buf, "accelerator\n");
-	case CXL_DECODER_EXPANDER:
+	case CXL_DECODER_HOSTMEM:
 		return sysfs_emit(buf, "expander\n");
 	}
 	return -ENXIO;
@@ -1550,7 +1550,7 @@ static int cxl_decoder_init(struct cxl_port *port, struct cxl_decoder *cxld)
 	/* Pre initialize an "empty" decoder */
 	cxld->interleave_ways = 1;
 	cxld->interleave_granularity = PAGE_SIZE;
-	cxld->target_type = CXL_DECODER_EXPANDER;
+	cxld->target_type = CXL_DECODER_HOSTMEM;
 	cxld->hpa_range = (struct range) {
 		.start = 0,
 		.end = -1,
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index f822de44bee0..dca94c458b8f 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2103,7 +2103,7 @@ static struct cxl_region *__create_region(struct cxl_root_decoder *cxlrd,
 		return ERR_PTR(-EBUSY);
 	}
 
-	return devm_cxl_add_region(cxlrd, id, mode, CXL_DECODER_EXPANDER);
+	return devm_cxl_add_region(cxlrd, id, mode, CXL_DECODER_HOSTMEM);
 }
 
 static ssize_t create_pmem_region_store(struct device *dev,
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index dfc94e76c7d6..e2d0ae228cba 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -290,8 +290,8 @@ resource_size_t cxl_rcrb_to_component(struct device *dev,
 #define CXL_DECODER_F_MASK  GENMASK(5, 0)
 
 enum cxl_decoder_type {
-       CXL_DECODER_ACCELERATOR = 2,
-       CXL_DECODER_EXPANDER = 3,
+	CXL_DECODER_DEVMEM = 2,
+	CXL_DECODER_HOSTMEM = 3,
 };
 
 /*
diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c
index bf00dc52fe96..e3f1b2e88e3e 100644
--- a/tools/testing/cxl/test/cxl.c
+++ b/tools/testing/cxl/test/cxl.c
@@ -713,7 +713,7 @@ static void default_mock_decoder(struct cxl_decoder *cxld)
 
 	cxld->interleave_ways = 1;
 	cxld->interleave_granularity = 256;
-	cxld->target_type = CXL_DECODER_EXPANDER;
+	cxld->target_type = CXL_DECODER_HOSTMEM;
 	cxld->commit = mock_decoder_commit;
 	cxld->reset = mock_decoder_reset;
 }
@@ -787,7 +787,7 @@ static void mock_init_hdm_decoder(struct cxl_decoder *cxld)
 
 	cxld->interleave_ways = 2;
 	eig_to_granularity(window->granularity, &cxld->interleave_granularity);
-	cxld->target_type = CXL_DECODER_EXPANDER;
+	cxld->target_type = CXL_DECODER_HOSTMEM;
 	cxld->flags = CXL_DECODER_F_ENABLE;
 	cxled->state = CXL_DECODER_STATE_AUTO;
 	port->commit_end = cxld->id;
@@ -820,7 +820,7 @@ static void mock_init_hdm_decoder(struct cxl_decoder *cxld)
 		} else
 			cxlsd->target[0] = dport;
 		cxld = &cxlsd->cxld;
-		cxld->target_type = CXL_DECODER_EXPANDER;
+		cxld->target_type = CXL_DECODER_HOSTMEM;
 		cxld->flags = CXL_DECODER_F_ENABLE;
 		iter->commit_end = 0;
 		/*


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 06/19] cxl/hdm: Default CXL_DEVTYPE_DEVMEM decoders to CXL_DECODER_DEVMEM
  2023-06-04 23:31 [PATCH 00/19] cxl: Device memory setup Dan Williams
                   ` (4 preceding siblings ...)
  2023-06-04 23:32 ` [PATCH 05/19] cxl/port: Rename CXL_DECODER_{EXPANDER, ACCELERATOR} => {HOSTMEM, DEVMEM} Dan Williams
@ 2023-06-04 23:32 ` Dan Williams
  2023-06-05  1:14   ` kernel test robot
  2023-06-06 11:27   ` Jonathan Cameron
  2023-06-04 23:32 ` [PATCH 07/19] cxl/region: Manage decoder target_type at decoder-attach time Dan Williams
                   ` (12 subsequent siblings)
  18 siblings, 2 replies; 64+ messages in thread
From: Dan Williams @ 2023-06-04 23:32 UTC (permalink / raw)
  To: linux-cxl; +Cc: ira.weiny, navneet.singh

In preparation for device-memory region creation, arrange for decoders
of CXL_DEVTYPE_DEVMEM memdevs to default to CXL_DECODER_DEVMEM for their
target type.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/hdm.c |   14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index de8a3fb28331..ca3b99c6eacf 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -856,12 +856,22 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
 		}
 		port->commit_end = cxld->id;
 	} else {
-		/* unless / until type-2 drivers arrive, assume type-3 */
 		if (FIELD_GET(CXL_HDM_DECODER0_CTRL_TYPE, ctrl) == 0) {
 			ctrl |= CXL_HDM_DECODER0_CTRL_TYPE;
 			writel(ctrl, hdm + CXL_HDM_DECODER0_CTRL_OFFSET(which));
 		}
-		cxld->target_type = CXL_DECODER_HOSTMEM;
+		if (cxled) {
+			struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
+			struct cxl_dev_state *cxlds = cxlmd->cxlds;
+
+			if (cxlds->type == CXL_DEVTYPE_CLASSMEM)
+				cxld->target_type = CXL_DECODER_HOSTMEM;
+			else
+				cxld->target_type = CXL_DECODER_DEVMEM;
+		} else {
+			/* To be overridden by region type at commit time */
+			cxld->target_type = CXL_DECODER_HOSTMEM;
+		}
 	}
 	rc = eiw_to_ways(FIELD_GET(CXL_HDM_DECODER0_CTRL_IW_MASK, ctrl),
 			  &cxld->interleave_ways);


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 07/19] cxl/region: Manage decoder target_type at decoder-attach time
  2023-06-04 23:31 [PATCH 00/19] cxl: Device memory setup Dan Williams
                   ` (5 preceding siblings ...)
  2023-06-04 23:32 ` [PATCH 06/19] cxl/hdm: Default CXL_DEVTYPE_DEVMEM decoders to CXL_DECODER_DEVMEM Dan Williams
@ 2023-06-04 23:32 ` Dan Williams
  2023-06-06 12:36   ` Jonathan Cameron
  2023-06-13 22:42   ` Dave Jiang
  2023-06-04 23:32 ` [PATCH 08/19] cxl/port: Enumerate flit mode capability Dan Williams
                   ` (11 subsequent siblings)
  18 siblings, 2 replies; 64+ messages in thread
From: Dan Williams @ 2023-06-04 23:32 UTC (permalink / raw)
  To: linux-cxl; +Cc: ira.weiny, navneet.singh

Switch-level (mid-level) decoders between the platform root and an
endpoint can dynamically switch modes between HDM-H and HDM-D[B]
depending on which region they target. Use the region type to fix up each
decoder that gets allocated to map the given region.

Note that endpoint decoders are meant to determine the region type, so
warn if one ever needs to be fixed up, but continue since it is
possible to do so.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/region.c |   12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index dca94c458b8f..c7170d92f47f 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -809,6 +809,18 @@ static int cxl_rr_alloc_decoder(struct cxl_port *port, struct cxl_region *cxlr,
 		return -EBUSY;
 	}
 
+	/*
+	 * Endpoints should already match the region type, but backstop that
+	 * assumption with an assertion. Switch-decoders change mapping-type
+	 * based on what is mapped when they are assigned to a region.
+	 */
+	dev_WARN_ONCE(&cxlr->dev,
+		      port == cxled_to_port(cxled) &&
+			      cxld->target_type != cxlr->type,
+		      "%s:%s mismatch decoder type %d -> %d\n",
+		      dev_name(&cxled_to_memdev(cxled)->dev),
+		      dev_name(&cxld->dev), cxld->target_type, cxlr->type);
+	cxld->target_type = cxlr->type;
 	cxl_rr->decoder = cxld;
 	return 0;
 }


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 08/19] cxl/port: Enumerate flit mode capability
  2023-06-04 23:31 [PATCH 00/19] cxl: Device memory setup Dan Williams
                   ` (6 preceding siblings ...)
  2023-06-04 23:32 ` [PATCH 07/19] cxl/region: Manage decoder target_type at decoder-attach time Dan Williams
@ 2023-06-04 23:32 ` Dan Williams
  2023-06-06 13:04   ` Jonathan Cameron
  2023-06-04 23:32 ` [PATCH 09/19] cxl/memdev: Formalize endpoint port linkage Dan Williams
                   ` (10 subsequent siblings)
  18 siblings, 1 reply; 64+ messages in thread
From: Dan Williams @ 2023-06-04 23:32 UTC (permalink / raw)
  To: linux-cxl; +Cc: ira.weiny, navneet.singh

Per CXL 3.0 Section 9.14 Back-Invalidation Configuration, in order to
enable an HDM-DB range (CXL.mem region with device-initiated
back-invalidation support), all ports in the path between the endpoint and
the host bridge must be in 256-bit flit-mode.

Even for typical Type-3 class devices, it is useful to enumerate link
capabilities through the chain for debug purposes.
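
Since each port's @features value is intersected with its parent's, an
HDM-DB eligibility check can reduce to inspecting the endpoint port
alone. A sketch (cxl_path_has_flit256() is a hypothetical helper, not
part of this patch):

	static bool cxl_path_has_flit256(struct cxl_port *endpoint)
	{
		/* @features already reflects the whole path to the root */
		return endpoint->features & CXL_DVSEC_FLEXBUS_FLIT256_ENABLED;
	}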

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/hdm.c  |    2 +
 drivers/cxl/core/pci.c  |   84 +++++++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/core/port.c |    6 +++
 drivers/cxl/cxl.h       |    2 +
 drivers/cxl/cxlpci.h    |   25 +++++++++++++-
 drivers/cxl/port.c      |    5 +++
 6 files changed, 122 insertions(+), 2 deletions(-)

diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index ca3b99c6eacf..91ab3033c781 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -3,8 +3,10 @@
 #include <linux/seq_file.h>
 #include <linux/device.h>
 #include <linux/delay.h>
+#include <linux/pci.h>
 
 #include "cxlmem.h"
+#include "cxlpci.h"
 #include "core.h"
 
 /**
diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index 67f4ab6daa34..b62ec17ccdde 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -519,6 +519,90 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
 }
 EXPORT_SYMBOL_NS_GPL(cxl_hdm_decode_init, CXL);
 
+static struct pci_dev *cxl_port_to_pci(struct cxl_port *port)
+{
+	struct device *dev;
+
+	if (is_cxl_endpoint(port))
+		dev = port->uport->parent;
+	else
+		dev = port->uport;
+
+	if (!dev_is_pci(dev))
+		return NULL;
+
+	return to_pci_dev(dev);
+}
+
+int cxl_probe_link(struct cxl_port *port)
+{
+	struct pci_dev *pdev = cxl_port_to_pci(port);
+	u16 cap, en, parent_features;
+	struct cxl_port *parent_port;
+	struct device *dev;
+	int rc, dvsec;
+	u32 hdr;
+
+	if (!pdev) {
+		/*
+		 * Assume host bridges support all features; the root
+		 * port will dictate the actual enabled set to endpoints.
+		 */
+		return 0;
+	}
+
+	dev = &pdev->dev;
+	dvsec = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
+					  CXL_DVSEC_FLEXBUS_PORT);
+	if (!dvsec) {
+		dev_err(dev, "Failed to enumerate port capabilities\n");
+		return -ENXIO;
+	}
+
+	/*
+	 * Cache the link features for future determination of HDM-D or
+	 * HDM-DB support
+	 */
+	rc = pci_read_config_dword(pdev, dvsec + PCI_DVSEC_HEADER1, &hdr);
+	if (rc)
+		return rc;
+
+	rc = pci_read_config_word(pdev, dvsec + CXL_DVSEC_FLEXBUS_CAP_OFFSET,
+				  &cap);
+	if (rc)
+		return rc;
+
+	rc = pci_read_config_word(pdev, dvsec + CXL_DVSEC_FLEXBUS_STATUS_OFFSET,
+				  &en);
+	if (rc)
+		return rc;
+
+	if (PCI_DVSEC_HEADER1_REV(hdr) < 2)
+		cap &= ~CXL_DVSEC_FLEXBUS_REV2_MASK;
+
+	if (PCI_DVSEC_HEADER1_REV(hdr) < 1)
+		cap &= ~CXL_DVSEC_FLEXBUS_REV1_MASK;
+
+	en &= cap;
+	parent_port = to_cxl_port(port->dev.parent);
+	parent_features = parent_port->features;
+
+	/* Enforce port features are plumbed through to the host bridge */
+	port->features = en & CXL_DVSEC_FLEXBUS_ENABLE_MASK & parent_features;
+
+	dev_dbg(dev, "features:%s%s%s%s%s%s%s\n",
+		en & CXL_DVSEC_FLEXBUS_CACHE_ENABLED ? " cache" : "",
+		en & CXL_DVSEC_FLEXBUS_IO_ENABLED ? " io" : "",
+		en & CXL_DVSEC_FLEXBUS_MEM_ENABLED ? " mem" : "",
+		en & CXL_DVSEC_FLEXBUS_FLIT68_ENABLED ? " flit68" : "",
+		en & CXL_DVSEC_FLEXBUS_MLD_ENABLED ? " mld" : "",
+		en & CXL_DVSEC_FLEXBUS_FLIT256_ENABLED ? " flit256" : "",
+		en & CXL_DVSEC_FLEXBUS_PBR_ENABLED ? " pbr" : "");
+
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_probe_link, CXL);
+
 #define CXL_DOE_TABLE_ACCESS_REQ_CODE		0x000000ff
 #define   CXL_DOE_TABLE_ACCESS_REQ_CODE_READ	0
 #define CXL_DOE_TABLE_ACCESS_TABLE_TYPE		0x0000ff00
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 432a4ac38f36..71a7547a8d6f 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -665,6 +665,12 @@ static struct cxl_port *cxl_port_alloc(struct device *uport,
 	} else
 		dev->parent = uport;
 
+	/*
+	 * Assume all CXL link capabilities for the root-device-to-host-bridge
+	 * link; cxl_probe_link() will fix this up later for all other ports.
+	 */
+	port->features = CXL_DVSEC_FLEXBUS_ENABLE_MASK;
 	port->component_reg_phys = component_reg_phys;
 	ida_init(&port->decoder_ida);
 	port->hdm_end = -1;
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index e2d0ae228cba..258c90727dd2 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -557,6 +557,7 @@ struct cxl_dax_region {
  * @depth: How deep this port is relative to the root. depth 0 is the root.
  * @cdat: Cached CDAT data
  * @cdat_available: Should a CDAT attribute be available in sysfs
+ * @features: active link features (see CXL_DVSEC_FLEXBUS_*_ENABLED)
  */
 struct cxl_port {
 	struct device dev;
@@ -579,6 +580,7 @@ struct cxl_port {
 		size_t length;
 	} cdat;
 	bool cdat_available;
+	u16 features;
 };
 
 static inline struct cxl_dport *
diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
index 7c02e55b8042..7f82ffb5b4be 100644
--- a/drivers/cxl/cxlpci.h
+++ b/drivers/cxl/cxlpci.h
@@ -45,8 +45,28 @@
 /* CXL 2.0 8.1.7: GPF DVSEC for CXL Device */
 #define CXL_DVSEC_DEVICE_GPF					5
 
-/* CXL 2.0 8.1.8: PCIe DVSEC for Flex Bus Port */
-#define CXL_DVSEC_PCIE_FLEXBUS_PORT				7
+/* CXL 3.0 8.2.1.3: PCIe DVSEC for Flex Bus Port */
+#define CXL_DVSEC_FLEXBUS_PORT					7
+#define   CXL_DVSEC_FLEXBUS_CAP_OFFSET		0xA
+#define     CXL_DVSEC_FLEXBUS_CACHE_CAPABLE	BIT(0)
+#define     CXL_DVSEC_FLEXBUS_IO_CAPABLE	BIT(1)
+#define     CXL_DVSEC_FLEXBUS_MEM_CAPABLE	BIT(2)
+#define     CXL_DVSEC_FLEXBUS_FLIT68_CAPABLE	BIT(5)
+#define     CXL_DVSEC_FLEXBUS_MLD_CAPABLE	BIT(6)
+#define     CXL_DVSEC_FLEXBUS_REV1_MASK		GENMASK(6, 5)
+#define     CXL_DVSEC_FLEXBUS_FLIT256_CAPABLE	BIT(13)
+#define     CXL_DVSEC_FLEXBUS_PBR_CAPABLE	BIT(14)
+#define     CXL_DVSEC_FLEXBUS_REV2_MASK		GENMASK(14, 13)
+#define   CXL_DVSEC_FLEXBUS_STATUS_OFFSET	0xE
+#define     CXL_DVSEC_FLEXBUS_CACHE_ENABLED	BIT(0)
+#define     CXL_DVSEC_FLEXBUS_IO_ENABLED	BIT(1)
+#define     CXL_DVSEC_FLEXBUS_MEM_ENABLED	BIT(2)
+#define     CXL_DVSEC_FLEXBUS_FLIT68_ENABLED	BIT(5)
+#define     CXL_DVSEC_FLEXBUS_MLD_ENABLED	BIT(6)
+#define     CXL_DVSEC_FLEXBUS_FLIT256_ENABLED	BIT(13)
+#define     CXL_DVSEC_FLEXBUS_PBR_ENABLED	BIT(14)
+#define     CXL_DVSEC_FLEXBUS_ENABLE_MASK \
+	(GENMASK(2, 0) | GENMASK(6, 5) | GENMASK(14, 13))
 
 /* CXL 2.0 8.1.9: Register Locator DVSEC */
 #define CXL_DVSEC_REG_LOCATOR					8
@@ -88,6 +108,7 @@ int devm_cxl_port_enumerate_dports(struct cxl_port *port);
 struct cxl_dev_state;
 int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
 			struct cxl_endpoint_dvsec_info *info);
+int cxl_probe_link(struct cxl_port *port);
 void read_cdat_data(struct cxl_port *port);
 void cxl_cor_error_detected(struct pci_dev *pdev);
 pci_ers_result_t cxl_error_detected(struct pci_dev *pdev,
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index c23b6164e1c0..5ffe3c7d2f5e 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -140,6 +140,11 @@ static int cxl_endpoint_port_probe(struct cxl_port *port)
 static int cxl_port_probe(struct device *dev)
 {
 	struct cxl_port *port = to_cxl_port(dev);
+	int rc;
+
+	rc = cxl_probe_link(port);
+	if (rc)
+		return rc;
 
 	if (is_cxl_endpoint(port))
 		return cxl_endpoint_port_probe(port);


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 09/19] cxl/memdev: Formalize endpoint port linkage
  2023-06-04 23:31 [PATCH 00/19] cxl: Device memory setup Dan Williams
                   ` (7 preceding siblings ...)
  2023-06-04 23:32 ` [PATCH 08/19] cxl/port: Enumerate flit mode capability Dan Williams
@ 2023-06-04 23:32 ` Dan Williams
  2023-06-06 13:26   ` Jonathan Cameron
                     ` (2 more replies)
  2023-06-04 23:32 ` [PATCH 10/19] cxl/memdev: Indicate probe deferral Dan Williams
                   ` (9 subsequent siblings)
  18 siblings, 3 replies; 64+ messages in thread
From: Dan Williams @ 2023-06-04 23:32 UTC (permalink / raw)
  To: linux-cxl; +Cc: ira.weiny, navneet.singh

Move the endpoint port that the cxl_mem driver establishes from drvdata
to a first-class attribute. This is in preparation for device-memory
drivers reusing the CXL core for memory region management. Those drivers
need a type-safe method to retrieve their CXL port linkage. Leave
drvdata for private usage of the cxl_mem driver, not for external
consumers of a 'struct cxl_memdev' object.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/memdev.c |    4 ++--
 drivers/cxl/core/pmem.c   |    2 +-
 drivers/cxl/core/port.c   |    5 +++--
 drivers/cxl/cxlmem.h      |    2 ++
 4 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 3f2d54f30548..65a685e5616f 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -149,7 +149,7 @@ int cxl_trigger_poison_list(struct cxl_memdev *cxlmd)
 	struct cxl_port *port;
 	int rc;
 
-	port = dev_get_drvdata(&cxlmd->dev);
+	port = cxlmd->endpoint;
 	if (!port || !is_cxl_endpoint(port))
 		return -EINVAL;
 
@@ -207,7 +207,7 @@ static struct cxl_region *cxl_dpa_to_region(struct cxl_memdev *cxlmd, u64 dpa)
 	ctx = (struct cxl_dpa_to_region_context) {
 		.dpa = dpa,
 	};
-	port = dev_get_drvdata(&cxlmd->dev);
+	port = cxlmd->endpoint;
 	if (port && is_cxl_endpoint(port) && port->commit_end != -1)
 		device_for_each_child(&port->dev, &ctx, __cxl_dpa_to_region);
 
diff --git a/drivers/cxl/core/pmem.c b/drivers/cxl/core/pmem.c
index f8c38d997252..fc94f5240327 100644
--- a/drivers/cxl/core/pmem.c
+++ b/drivers/cxl/core/pmem.c
@@ -64,7 +64,7 @@ static int match_nvdimm_bridge(struct device *dev, void *data)
 
 struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_memdev *cxlmd)
 {
-	struct cxl_port *port = find_cxl_root(dev_get_drvdata(&cxlmd->dev));
+	struct cxl_port *port = find_cxl_root(cxlmd->endpoint);
 	struct device *dev;
 
 	if (!port)
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 71a7547a8d6f..6720ab22a494 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -1167,7 +1167,7 @@ static struct device *grandparent(struct device *dev)
 static void delete_endpoint(void *data)
 {
 	struct cxl_memdev *cxlmd = data;
-	struct cxl_port *endpoint = dev_get_drvdata(&cxlmd->dev);
+	struct cxl_port *endpoint = cxlmd->endpoint;
 	struct cxl_port *parent_port;
 	struct device *parent;
 
@@ -1182,6 +1182,7 @@ static void delete_endpoint(void *data)
 		devm_release_action(parent, cxl_unlink_uport, endpoint);
 		devm_release_action(parent, unregister_port, endpoint);
 	}
+	cxlmd->endpoint = NULL;
 	device_unlock(parent);
 	put_device(parent);
 out:
@@ -1193,7 +1194,7 @@ int cxl_endpoint_autoremove(struct cxl_memdev *cxlmd, struct cxl_port *endpoint)
 	struct device *dev = &cxlmd->dev;
 
 	get_device(&endpoint->dev);
-	dev_set_drvdata(dev, endpoint);
+	cxlmd->endpoint = endpoint;
 	cxlmd->depth = endpoint->depth;
 	return devm_add_action_or_reset(dev, delete_endpoint, cxlmd);
 }
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index b8bdf7490d2c..7ee78e79933c 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -38,6 +38,7 @@
  * @detach_work: active memdev lost a port in its ancestry
  * @cxl_nvb: coordinate removal of @cxl_nvd if present
  * @cxl_nvd: optional bridge to an nvdimm if the device supports pmem
+ * @endpoint: connection to the CXL port topology for this memory device
  * @id: id number of this memdev instance.
  * @depth: endpoint port depth
  */
@@ -48,6 +49,7 @@ struct cxl_memdev {
 	struct work_struct detach_work;
 	struct cxl_nvdimm_bridge *cxl_nvb;
 	struct cxl_nvdimm *cxl_nvd;
+	struct cxl_port *endpoint;
 	int id;
 	int depth;
 };


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 10/19] cxl/memdev: Indicate probe deferral
  2023-06-04 23:31 [PATCH 00/19] cxl: Device memory setup Dan Williams
                   ` (8 preceding siblings ...)
  2023-06-04 23:32 ` [PATCH 09/19] cxl/memdev: Formalize endpoint port linkage Dan Williams
@ 2023-06-04 23:32 ` Dan Williams
  2023-06-06 13:54   ` Jonathan Cameron
  2023-06-04 23:32 ` [PATCH 11/19] cxl/region: Factor out construct_region_{begin, end} and drop_region() for reuse Dan Williams
                   ` (8 subsequent siblings)
  18 siblings, 1 reply; 64+ messages in thread
From: Dan Williams @ 2023-06-04 23:32 UTC (permalink / raw)
  To: linux-cxl; +Cc: ira.weiny, navneet.singh

The first stop for a CXL accelerator driver that wants to establish new
CXL.mem regions is to register a 'struct cxl_memdev'. That kicks off
cxl_mem_probe() to enumerate all 'struct cxl_port' instances in the
topology up to the root.

If the root driver has not attached yet, the expectation is that the
driver waits until that link is established. The common cxl_pci driver
has reason to keep the 'struct cxl_memdev' device attached to the bus
until the root driver attaches. An accelerator may instead want to
defer probing until CXL resources can be acquired.

Use the @endpoint attribute of a 'struct cxl_memdev' to convey when
accelerator driver probing should be deferred vs failed. Provide that
indication via a new cxl_acquire_endpoint() API that can retrieve the
probe status of the memdev.

The first consumer of this API is a test driver that exercises the CXL
Type-2 flow.
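
As a rough sketch, an accelerator driver would bracket its region setup
with the new API (my_accel_setup() is a hypothetical consumer):

	static int my_accel_setup(struct cxl_memdev *cxlmd)
	{
		struct cxl_port *endpoint;

		endpoint = cxl_acquire_endpoint(cxlmd);
		if (IS_ERR(endpoint))
			/* may be -EPROBE_DEFER awaiting the root driver */
			return PTR_ERR(endpoint);

		/* ... allocate DPA and establish regions ... */

		cxl_release_endpoint(cxlmd, endpoint);
		return 0;
	}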

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/memdev.c |   41 +++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/core/port.c   |    2 +-
 drivers/cxl/cxlmem.h      |    2 ++
 drivers/cxl/mem.c         |    7 +++++--
 4 files changed, 49 insertions(+), 3 deletions(-)

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 65a685e5616f..859c43c340bb 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -609,6 +609,47 @@ struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds)
 }
 EXPORT_SYMBOL_NS_GPL(devm_cxl_add_memdev, CXL);
 
+/*
+ * Try to get a locked reference on a memdev's CXL port topology
+ * connection. Be careful to observe when cxl_mem_probe() has deposited
+ * a probe deferral awaiting the arrival of the CXL root driver
+ */
+struct cxl_port *cxl_acquire_endpoint(struct cxl_memdev *cxlmd)
+{
+	struct cxl_port *endpoint;
+	int rc = -ENXIO;
+
+	device_lock(&cxlmd->dev);
+	endpoint = cxlmd->endpoint;
+	if (!endpoint)
+		goto err;
+
+	if (IS_ERR(endpoint)) {
+		rc = PTR_ERR(endpoint);
+		goto err;
+	}
+
+	device_lock(&endpoint->dev);
+	if (!endpoint->dev.driver)
+		goto err_endpoint;
+
+	return endpoint;
+
+err_endpoint:
+	device_unlock(&endpoint->dev);
+err:
+	device_unlock(&cxlmd->dev);
+	return ERR_PTR(rc);
+}
+EXPORT_SYMBOL_NS(cxl_acquire_endpoint, CXL);
+
+void cxl_release_endpoint(struct cxl_memdev *cxlmd, struct cxl_port *endpoint)
+{
+	device_unlock(&endpoint->dev);
+	device_unlock(&cxlmd->dev);
+}
+EXPORT_SYMBOL_NS(cxl_release_endpoint, CXL);
+
 __init int cxl_memdev_init(void)
 {
 	dev_t devt;
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 6720ab22a494..5e21b53362e6 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -1336,7 +1336,7 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
 		 */
 		dev_dbg(&cxlmd->dev, "%s is a root dport\n",
 			dev_name(dport_dev));
-		return -ENXIO;
+		return -EPROBE_DEFER;
 	}
 
 	parent_port = find_cxl_port(dparent, &parent_dport);
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 7ee78e79933c..e3bcd6d12a1c 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -83,6 +83,8 @@ static inline bool is_cxl_endpoint(struct cxl_port *port)
 	return is_cxl_memdev(port->uport);
 }
 
+struct cxl_port *cxl_acquire_endpoint(struct cxl_memdev *cxlmd);
+void cxl_release_endpoint(struct cxl_memdev *cxlmd, struct cxl_port *endpoint);
 struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds);
 int devm_cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
 			 resource_size_t base, resource_size_t len,
diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index 584f9eec57e4..2470c6f2621c 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -154,13 +154,16 @@ static int cxl_mem_probe(struct device *dev)
 		return rc;
 
 	rc = devm_cxl_enumerate_ports(cxlmd);
-	if (rc)
+	if (rc) {
+		cxlmd->endpoint = ERR_PTR(rc);
 		return rc;
+	}
 
 	parent_port = cxl_mem_find_port(cxlmd, &dport);
 	if (!parent_port) {
 		dev_err(dev, "CXL port topology not found\n");
-		return -ENXIO;
+		cxlmd->endpoint = ERR_PTR(-EPROBE_DEFER);
+		return -EPROBE_DEFER;
 	}
 
 	if (dport->rch)


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 11/19] cxl/region: Factor out construct_region_{begin, end} and drop_region() for reuse
  2023-06-04 23:31 [PATCH 00/19] cxl: Device memory setup Dan Williams
                   ` (9 preceding siblings ...)
  2023-06-04 23:32 ` [PATCH 10/19] cxl/memdev: Indicate probe deferral Dan Williams
@ 2023-06-04 23:32 ` Dan Williams
  2023-06-06 14:29   ` Jonathan Cameron
  2023-06-13 23:29   ` Dave Jiang
  2023-06-04 23:32 ` [PATCH 12/19] cxl/region: Factor out interleave ways setup Dan Williams
                   ` (7 subsequent siblings)
  18 siblings, 2 replies; 64+ messages in thread
From: Dan Williams @ 2023-06-04 23:32 UTC (permalink / raw)
  To: linux-cxl; +Cc: ira.weiny, navneet.singh

In preparation for constructing regions from newly allocated HPA, factor
out some helpers that can be shared with the existing kernel-internal
region construction from BIOS pre-allocated regions. Handle acquiring a
new region object under the region rwsem, and optionally tearing it down
if the region assembly process fails.
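
The resulting assembly pattern, distilled from the refactored
construct_region() below (a sketch with error handling abbreviated;
build_region() is an illustrative name, not a symbol in this patch):

	static struct cxl_region *build_region(struct cxl_root_decoder *cxlrd,
					       struct cxl_endpoint_decoder *cxled)
	{
		struct cxl_region *cxlr;
		int rc = 0;

		cxlr = construct_region_begin(cxlrd, cxled); /* holds cxl_region_rwsem */
		if (IS_ERR(cxlr))
			return cxlr;

		/* ... populate cxlr->params, setting rc on failure ... */

		construct_region_end(); /* drops cxl_region_rwsem */
		if (rc) {
			drop_region(cxlr);
			return ERR_PTR(rc);
		}
		return cxlr;
	}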

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/region.c |   73 ++++++++++++++++++++++++++++++++-------------
 1 file changed, 52 insertions(+), 21 deletions(-)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index c7170d92f47f..bd3c3d4b2683 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2191,19 +2191,25 @@ cxl_find_region_by_name(struct cxl_root_decoder *cxlrd, const char *name)
 	return to_cxl_region(region_dev);
 }
 
+static void drop_region(struct cxl_region *cxlr)
+{
+	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
+	struct cxl_port *port = cxlrd_to_port(cxlrd);
+
+	devm_release_action(port->uport, unregister_region, cxlr);
+}
+
 static ssize_t delete_region_store(struct device *dev,
 				   struct device_attribute *attr,
 				   const char *buf, size_t len)
 {
 	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(dev);
-	struct cxl_port *port = to_cxl_port(dev->parent);
 	struct cxl_region *cxlr;
 
 	cxlr = cxl_find_region_by_name(cxlrd, buf);
 	if (IS_ERR(cxlr))
 		return PTR_ERR(cxlr);
-
-	devm_release_action(port->uport, unregister_region, cxlr);
+	drop_region(cxlr);
 	put_device(&cxlr->dev);
 
 	return len;
@@ -2664,17 +2670,19 @@ static int match_region_by_range(struct device *dev, void *data)
 	return rc;
 }
 
-/* Establish an empty region covering the given HPA range */
-static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
-					   struct cxl_endpoint_decoder *cxled)
+static void construct_region_end(void)
+{
+	up_write(&cxl_region_rwsem);
+}
+
+static struct cxl_region *
+construct_region_begin(struct cxl_root_decoder *cxlrd,
+		       struct cxl_endpoint_decoder *cxled)
 {
 	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
-	struct cxl_port *port = cxlrd_to_port(cxlrd);
-	struct range *hpa = &cxled->cxld.hpa_range;
 	struct cxl_region_params *p;
 	struct cxl_region *cxlr;
-	struct resource *res;
-	int rc;
+	int err = 0;
 
 	do {
 		cxlr = __create_region(cxlrd, cxled->mode,
@@ -2693,19 +2701,41 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
 	p = &cxlr->params;
 	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
 		dev_err(cxlmd->dev.parent,
-			"%s:%s: %s autodiscovery interrupted\n",
+			"%s:%s: %s region setup interrupted\n",
 			dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
 			__func__);
-		rc = -EBUSY;
-		goto err;
+		err = -EBUSY;
+	}
+
+	if (err) {
+		construct_region_end();
+		drop_region(cxlr);
+		return ERR_PTR(err);
 	}
+	return cxlr;
+}
+
+/* Establish an empty region covering the given HPA range */
+static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
+					   struct cxl_endpoint_decoder *cxled)
+{
+	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
+	struct range *hpa = &cxled->cxld.hpa_range;
+	struct cxl_region_params *p;
+	struct cxl_region *cxlr;
+	struct resource *res;
+	int rc;
+
+	cxlr = construct_region_begin(cxlrd, cxled);
+	if (IS_ERR(cxlr))
+		return cxlr;
 
 	set_bit(CXL_REGION_F_AUTO, &cxlr->flags);
 
 	res = kmalloc(sizeof(*res), GFP_KERNEL);
 	if (!res) {
 		rc = -ENOMEM;
-		goto err;
+		goto out;
 	}
 
 	*res = DEFINE_RES_MEM_NAMED(hpa->start, range_len(hpa),
@@ -2722,6 +2752,7 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
 			 __func__, dev_name(&cxlr->dev));
 	}
 
+	p = &cxlr->params;
 	p->res = res;
 	p->interleave_ways = cxled->cxld.interleave_ways;
 	p->interleave_granularity = cxled->cxld.interleave_granularity;
@@ -2729,7 +2760,7 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
 
 	rc = sysfs_update_group(&cxlr->dev.kobj, get_cxl_region_target_group());
 	if (rc)
-		goto err;
+		goto out;
 
 	dev_dbg(cxlmd->dev.parent, "%s:%s: %s %s res: %pr iw: %d ig: %d\n",
 		dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev), __func__,
@@ -2738,14 +2769,14 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
 
 	/* ...to match put_device() in cxl_add_to_region() */
 	get_device(&cxlr->dev);
-	up_write(&cxl_region_rwsem);
 
+out:
+	construct_region_end();
+	if (rc) {
+		drop_region(cxlr);
+		return ERR_PTR(rc);
+	}
 	return cxlr;
-
-err:
-	up_write(&cxl_region_rwsem);
-	devm_release_action(port->uport, unregister_region, cxlr);
-	return ERR_PTR(rc);
 }
 
 int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 12/19] cxl/region: Factor out interleave ways setup
  2023-06-04 23:31 [PATCH 00/19] cxl: Device memory setup Dan Williams
                   ` (10 preceding siblings ...)
  2023-06-04 23:32 ` [PATCH 11/19] cxl/region: Factor out construct_region_{begin, end} and drop_region() for reuse Dan Williams
@ 2023-06-04 23:32 ` Dan Williams
  2023-06-06 14:31   ` Jonathan Cameron
  2023-06-13 23:30   ` Dave Jiang
  2023-06-04 23:32 ` [PATCH 13/19] cxl/region: Factor out interleave granularity setup Dan Williams
                   ` (6 subsequent siblings)
  18 siblings, 2 replies; 64+ messages in thread
From: Dan Williams @ 2023-06-04 23:32 UTC (permalink / raw)
  To: linux-cxl; +Cc: ira.weiny, navneet.singh

In preparation for kernel-driven region creation, factor out a common
helper from the user-sysfs region setup for interleave_ways.
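
Since the new helper asserts that cxl_region_rwsem is held for write, a
kernel-internal caller follows the same pattern as the sysfs path
(sketch; set_interleave_granularity() in the next patch follows the
same convention):

	down_write(&cxl_region_rwsem);
	rc = set_interleave_ways(cxlr, ways);
	up_write(&cxl_region_rwsem);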

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/region.c |   46 ++++++++++++++++++++++++++-------------------
 1 file changed, 27 insertions(+), 19 deletions(-)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index bd3c3d4b2683..821c2d90154f 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -338,22 +338,14 @@ static ssize_t interleave_ways_show(struct device *dev,
 
 static const struct attribute_group *get_cxl_region_target_group(void);
 
-static ssize_t interleave_ways_store(struct device *dev,
-				     struct device_attribute *attr,
-				     const char *buf, size_t len)
+static int set_interleave_ways(struct cxl_region *cxlr, int val)
 {
-	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(dev->parent);
+	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
 	struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
-	struct cxl_region *cxlr = to_cxl_region(dev);
 	struct cxl_region_params *p = &cxlr->params;
-	unsigned int val, save;
-	int rc;
+	int save, rc;
 	u8 iw;
 
-	rc = kstrtouint(buf, 0, &val);
-	if (rc)
-		return rc;
-
 	rc = ways_to_eiw(val, &iw);
 	if (rc)
 		return rc;
@@ -368,21 +360,37 @@ static ssize_t interleave_ways_store(struct device *dev,
 		return -EINVAL;
 	}
 
-	rc = down_write_killable(&cxl_region_rwsem);
-	if (rc)
-		return rc;
-	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
-		rc = -EBUSY;
-		goto out;
-	}
+	lockdep_assert_held_write(&cxl_region_rwsem);
+	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE)
+		return -EBUSY;
 
 	save = p->interleave_ways;
 	p->interleave_ways = val;
 	rc = sysfs_update_group(&cxlr->dev.kobj, get_cxl_region_target_group());
 	if (rc)
 		p->interleave_ways = save;
-out:
+	return rc;
+}
+
+static ssize_t interleave_ways_store(struct device *dev,
+				     struct device_attribute *attr,
+				     const char *buf, size_t len)
+{
+	struct cxl_region *cxlr = to_cxl_region(dev);
+	unsigned int val;
+	int rc;
+
+	rc = kstrtouint(buf, 0, &val);
+	if (rc)
+		return rc;
+
+	rc = down_write_killable(&cxl_region_rwsem);
+	if (rc)
+		return rc;
+
+	rc = set_interleave_ways(cxlr, val);
 	up_write(&cxl_region_rwsem);
+
 	if (rc)
 		return rc;
 	return len;


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 13/19] cxl/region: Factor out interleave granularity setup
  2023-06-04 23:31 [PATCH 00/19] cxl: Device memory setup Dan Williams
                   ` (11 preceding siblings ...)
  2023-06-04 23:32 ` [PATCH 12/19] cxl/region: Factor out interleave ways setup Dan Williams
@ 2023-06-04 23:32 ` Dan Williams
  2023-06-06 14:33   ` Jonathan Cameron
  2023-06-13 23:42   ` Dave Jiang
  2023-06-04 23:32 ` [PATCH 14/19] cxl/region: Clarify locking requirements of cxl_region_attach() Dan Williams
                   ` (5 subsequent siblings)
  18 siblings, 2 replies; 64+ messages in thread
From: Dan Williams @ 2023-06-04 23:32 UTC (permalink / raw)
  To: linux-cxl; +Cc: ira.weiny, navneet.singh

In preparation for kernel-driven region creation, factor out a common
helper from the user-sysfs region setup for interleave_granularity.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/region.c |   39 +++++++++++++++++++++++----------------
 1 file changed, 23 insertions(+), 16 deletions(-)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 821c2d90154f..4d8dbfedd64a 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -414,21 +414,14 @@ static ssize_t interleave_granularity_show(struct device *dev,
 	return rc;
 }
 
-static ssize_t interleave_granularity_store(struct device *dev,
-					    struct device_attribute *attr,
-					    const char *buf, size_t len)
+static int set_interleave_granularity(struct cxl_region *cxlr, int val)
 {
-	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(dev->parent);
+	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
 	struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
-	struct cxl_region *cxlr = to_cxl_region(dev);
 	struct cxl_region_params *p = &cxlr->params;
-	int rc, val;
+	int rc;
 	u16 ig;
 
-	rc = kstrtoint(buf, 0, &val);
-	if (rc)
-		return rc;
-
 	rc = granularity_to_eig(val, &ig);
 	if (rc)
 		return rc;
@@ -444,16 +437,30 @@ static ssize_t interleave_granularity_store(struct device *dev,
 	if (cxld->interleave_ways > 1 && val != cxld->interleave_granularity)
 		return -EINVAL;
 
+	lockdep_assert_held_write(&cxl_region_rwsem);
+	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE)
+		return -EBUSY;
+
+	p->interleave_granularity = val;
+	return 0;
+}
+
+static ssize_t interleave_granularity_store(struct device *dev,
+					    struct device_attribute *attr,
+					    const char *buf, size_t len)
+{
+	struct cxl_region *cxlr = to_cxl_region(dev);
+	int rc, val;
+
+	rc = kstrtoint(buf, 0, &val);
+	if (rc)
+		return rc;
+
 	rc = down_write_killable(&cxl_region_rwsem);
 	if (rc)
 		return rc;
-	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
-		rc = -EBUSY;
-		goto out;
-	}
 
-	p->interleave_granularity = val;
-out:
+	rc = set_interleave_granularity(cxlr, val);
 	up_write(&cxl_region_rwsem);
 	if (rc)
 		return rc;


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 14/19] cxl/region: Clarify locking requirements of cxl_region_attach()
  2023-06-04 23:31 [PATCH 00/19] cxl: Device memory setup Dan Williams
                   ` (12 preceding siblings ...)
  2023-06-04 23:32 ` [PATCH 13/19] cxl/region: Factor out interleave granularity setup Dan Williams
@ 2023-06-04 23:32 ` Dan Williams
  2023-06-06 14:35   ` Jonathan Cameron
  2023-06-13 23:45   ` Dave Jiang
  2023-06-04 23:33 ` [PATCH 15/19] cxl/region: Specify host-only vs device memory at region creation time Dan Williams
                   ` (4 subsequent siblings)
  18 siblings, 2 replies; 64+ messages in thread
From: Dan Williams @ 2023-06-04 23:32 UTC (permalink / raw)
  To: linux-cxl; +Cc: ira.weiny, navneet.singh

In preparation for cxl_region_attach() being called for kernel-initiated
region creation, enforce the locking context with explicit lockdep
assertions.
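
A kernel-internal caller is now expected to hold both locks across the
attach, e.g. this sketch of the driver-created region flow added later
in the series:

	down_write(&cxl_region_rwsem);
	down_read(&cxl_dpa_rwsem);
	rc = cxl_region_attach(cxlr, cxled, pos);
	up_read(&cxl_dpa_rwsem);
	up_write(&cxl_region_rwsem);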

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/region.c |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 4d8dbfedd64a..defc2f0e43e3 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -1587,6 +1587,9 @@ static int cxl_region_attach(struct cxl_region *cxlr,
 	struct cxl_dport *dport;
 	int rc = -ENXIO;
 
+	lockdep_assert_held_write(&cxl_region_rwsem);
+	lockdep_assert_held_read(&cxl_dpa_rwsem);
+
 	if (cxled->mode != cxlr->mode) {
 		dev_dbg(&cxlr->dev, "%s region mode: %d mismatch: %d\n",
 			dev_name(&cxled->cxld.dev), cxlr->mode, cxled->mode);


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 15/19] cxl/region: Specify host-only vs device memory at region creation time
  2023-06-04 23:31 [PATCH 00/19] cxl: Device memory setup Dan Williams
                   ` (13 preceding siblings ...)
  2023-06-04 23:32 ` [PATCH 14/19] cxl/region: Clarify locking requirements of cxl_region_attach() Dan Williams
@ 2023-06-04 23:33 ` Dan Williams
  2023-06-06 14:42   ` Jonathan Cameron
  2023-06-04 23:33 ` [PATCH 16/19] cxl/hdm: Define a driver interface for DPA allocation Dan Williams
                   ` (3 subsequent siblings)
  18 siblings, 1 reply; 64+ messages in thread
From: Dan Williams @ 2023-06-04 23:33 UTC (permalink / raw)
  To: linux-cxl; +Cc: ira.weiny, navneet.singh

In preparation for supporting device-memory (HDM-D[B]) region creation,
convey the endpoint-decoder target type to devm_cxl_add_region().

Note that none of the existing sysfs ABIs allow for HDM-D[B] region
creation. The expectation is that HDM-D[B] regions are created via a
kernel-internal flow, for example, one driven by an accelerator driver.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/region.c |   14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index defc2f0e43e3..75c5de627868 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2120,7 +2120,8 @@ static ssize_t create_ram_region_show(struct device *dev,
 }
 
 static struct cxl_region *__create_region(struct cxl_root_decoder *cxlrd,
-					  enum cxl_decoder_mode mode, int id)
+					  int id, enum cxl_decoder_mode mode,
+					  enum cxl_decoder_type type)
 {
 	int rc;
 
@@ -2133,7 +2134,7 @@ static struct cxl_region *__create_region(struct cxl_root_decoder *cxlrd,
 		return ERR_PTR(-EBUSY);
 	}
 
-	return devm_cxl_add_region(cxlrd, id, mode, CXL_DECODER_HOSTMEM);
+	return devm_cxl_add_region(cxlrd, id, mode, type);
 }
 
 static ssize_t create_pmem_region_store(struct device *dev,
@@ -2148,7 +2149,8 @@ static ssize_t create_pmem_region_store(struct device *dev,
 	if (rc != 1)
 		return -EINVAL;
 
-	cxlr = __create_region(cxlrd, CXL_DECODER_PMEM, id);
+	cxlr = __create_region(cxlrd, id, CXL_DECODER_PMEM,
+			       CXL_DECODER_HOSTMEM);
 	if (IS_ERR(cxlr))
 		return PTR_ERR(cxlr);
 
@@ -2168,7 +2170,7 @@ static ssize_t create_ram_region_store(struct device *dev,
 	if (rc != 1)
 		return -EINVAL;
 
-	cxlr = __create_region(cxlrd, CXL_DECODER_RAM, id);
+	cxlr = __create_region(cxlrd, id, CXL_DECODER_RAM, CXL_DECODER_HOSTMEM);
 	if (IS_ERR(cxlr))
 		return PTR_ERR(cxlr);
 
@@ -2703,8 +2705,8 @@ construct_region_begin(struct cxl_root_decoder *cxlrd,
 	int err = 0;
 
 	do {
-		cxlr = __create_region(cxlrd, cxled->mode,
-				       atomic_read(&cxlrd->region_id));
+		cxlr = __create_region(cxlrd, atomic_read(&cxlrd->region_id),
+				       cxled->mode, cxled->cxld.target_type);
 	} while (IS_ERR(cxlr) && PTR_ERR(cxlr) == -EBUSY);
 
 	if (IS_ERR(cxlr)) {


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 16/19] cxl/hdm: Define a driver interface for DPA allocation
  2023-06-04 23:31 [PATCH 00/19] cxl: Device memory setup Dan Williams
                   ` (14 preceding siblings ...)
  2023-06-04 23:33 ` [PATCH 15/19] cxl/region: Specify host-only vs device memory at region creation time Dan Williams
@ 2023-06-04 23:33 ` Dan Williams
  2023-06-06 14:58   ` Jonathan Cameron
  2023-06-13 23:53   ` Dave Jiang
  2023-06-04 23:33 ` [PATCH 17/19] cxl/region: Define a driver interface for HPA free space enumeration Dan Williams
                   ` (2 subsequent siblings)
  18 siblings, 2 replies; 64+ messages in thread
From: Dan Williams @ 2023-06-04 23:33 UTC (permalink / raw)
  To: linux-cxl; +Cc: ira.weiny, navneet.singh

Region creation involves finding available DPA (device-physical-address)
capacity to map into HPA (host-physical-address space). Since HPA
capacity is constrained, define an API, cxl_request_dpa(), that can
balance the minimum amount of memory the driver needs to operate
against the total that could be mapped given HPA availability.

Factor out the core of cxl_dpa_alloc(), which does free-space scanning,
into a cxl_dpa_freespace() helper, and use that to balance the capacity
available to map vs the @min and @max arguments to cxl_request_dpa().
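
As exercised by the cxl_test accelerator at the end of this series, a
caller that can operate on any capacity up to @max looks like the
following sketch (error handling abbreviated, accel_alloc_dpa() is an
illustrative name):

	static int accel_alloc_dpa(struct cxl_port *endpoint, resource_size_t max)
	{
		struct cxl_endpoint_decoder *cxled;

		cxled = cxl_request_dpa(endpoint, CXL_DECODER_RAM, 0, max);
		if (IS_ERR(cxled))
			return PTR_ERR(cxled);

		/* ... hand cxled to region creation ... */

		put_device(cxled_dev(cxled));
		return 0;
	}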

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/hdm.c |  140 +++++++++++++++++++++++++++++++++++++++++-------
 drivers/cxl/cxl.h      |    6 ++
 drivers/cxl/cxlmem.h   |    4 +
 3 files changed, 131 insertions(+), 19 deletions(-)

diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index 91ab3033c781..514d30131d92 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -464,30 +464,17 @@ int cxl_dpa_set_mode(struct cxl_endpoint_decoder *cxled,
 	return rc;
 }
 
-int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
+static resource_size_t cxl_dpa_freespace(struct cxl_endpoint_decoder *cxled,
+					 resource_size_t *start_out,
+					 resource_size_t *skip_out)
 {
 	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
 	resource_size_t free_ram_start, free_pmem_start;
-	struct cxl_port *port = cxled_to_port(cxled);
 	struct cxl_dev_state *cxlds = cxlmd->cxlds;
-	struct device *dev = &cxled->cxld.dev;
 	resource_size_t start, avail, skip;
 	struct resource *p, *last;
-	int rc;
 
-	down_write(&cxl_dpa_rwsem);
-	if (cxled->cxld.region) {
-		dev_dbg(dev, "decoder attached to %s\n",
-			dev_name(&cxled->cxld.region->dev));
-		rc = -EBUSY;
-		goto out;
-	}
-
-	if (cxled->cxld.flags & CXL_DECODER_F_ENABLE) {
-		dev_dbg(dev, "decoder enabled\n");
-		rc = -EBUSY;
-		goto out;
-	}
+	lockdep_assert_held(&cxl_dpa_rwsem);
 
 	for (p = cxlds->ram_res.child, last = NULL; p; p = p->sibling)
 		last = p;
@@ -525,11 +512,42 @@ int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
 			skip_end = start - 1;
 		skip = skip_end - skip_start + 1;
 	} else {
-		dev_dbg(dev, "mode not set\n");
-		rc = -EINVAL;
+		dev_dbg(cxled_dev(cxled), "mode not set\n");
+		avail = 0;
+	}
+
+	if (!avail)
+		return 0;
+	if (start_out)
+		*start_out = start;
+	if (skip_out)
+		*skip_out = skip;
+	return avail;
+}
+
+int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
+{
+	struct cxl_port *port = cxled_to_port(cxled);
+	struct device *dev = &cxled->cxld.dev;
+	resource_size_t start, avail, skip;
+	int rc;
+
+	down_write(&cxl_dpa_rwsem);
+	if (cxled->cxld.region) {
+		dev_dbg(dev, "decoder attached to %s\n",
+			dev_name(&cxled->cxld.region->dev));
+		rc = -EBUSY;
+		goto out;
+	}
+
+	if (cxled->cxld.flags & CXL_DECODER_F_ENABLE) {
+		dev_dbg(dev, "decoder enabled\n");
+		rc = -EBUSY;
 		goto out;
 	}
 
+	avail = cxl_dpa_freespace(cxled, &start, &skip);
+
 	if (size > avail) {
 		dev_dbg(dev, "%pa exceeds available %s capacity: %pa\n", &size,
 			cxled->mode == CXL_DECODER_RAM ? "ram" : "pmem",
@@ -548,6 +566,90 @@ int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
 	return devm_add_action_or_reset(&port->dev, cxl_dpa_release, cxled);
 }
 
+static int find_free_decoder(struct device *dev, void *data)
+{
+	struct cxl_endpoint_decoder *cxled;
+	struct cxl_port *port;
+
+	if (!is_endpoint_decoder(dev))
+		return 0;
+
+	cxled = to_cxl_endpoint_decoder(dev);
+	port = cxled_to_port(cxled);
+
+	if (cxled->cxld.id != port->hdm_end + 1)
+		return 0;
+	return 1;
+}
+
+/**
+ * cxl_request_dpa - search and reserve DPA given input constraints
+ * @endpoint: an endpoint port with available decoders
+ * @mode: DPA operation mode (ram vs pmem)
+ * @min: the minimum amount of capacity the call needs
+ * @max: the maximum total capacity to allocate, or 0 for no limit
+ *
+ * Given that a region needs to allocate from limited HPA capacity, it
+ * may be the case that a device has more mappable DPA capacity than
+ * available HPA. So, the expectation is that @min is a driver-known
+ * value for how much capacity is needed, and @max is based on the
+ * limit of how much HPA space is available for a new region.
+ *
+ * Returns a pinned cxl_decoder with at least @min bytes of capacity
+ * reserved, or an error pointer. The caller is also expected to own the
+ * lifetime of the memdev registration associated with the endpoint to
+ * pin the decoder registered as well.
+ */
+struct cxl_endpoint_decoder *cxl_request_dpa(struct cxl_port *endpoint,
+					     enum cxl_decoder_mode mode,
+					     resource_size_t min,
+					     resource_size_t max)
+{
+	struct cxl_endpoint_decoder *cxled;
+	struct device *cxled_dev;
+	resource_size_t alloc;
+	int rc;
+
+	if (!IS_ALIGNED(min | max, SZ_256M))
+		return ERR_PTR(-EINVAL);
+
+	down_read(&cxl_dpa_rwsem);
+	cxled_dev = device_find_child(&endpoint->dev, NULL, find_free_decoder);
+	if (!cxled_dev)
+		cxled = ERR_PTR(-ENXIO);
+	else
+		cxled = to_cxl_endpoint_decoder(cxled_dev);
+	up_read(&cxl_dpa_rwsem);
+
+	if (IS_ERR(cxled))
+		return cxled;
+
+	rc = cxl_dpa_set_mode(cxled, mode);
+	if (rc)
+		goto err;
+
+	down_read(&cxl_dpa_rwsem);
+	alloc = cxl_dpa_freespace(cxled, NULL, NULL);
+	up_read(&cxl_dpa_rwsem);
+
+	if (max)
+		alloc = min(max, alloc);
+	if (alloc < min) {
+		rc = -ENOMEM;
+		goto err;
+	}
+
+	rc = cxl_dpa_alloc(cxled, alloc);
+	if (rc)
+		goto err;
+
+	return cxled;
+err:
+	put_device(cxled_dev);
+	return ERR_PTR(rc);
+}
+EXPORT_SYMBOL_NS_GPL(cxl_request_dpa, CXL);
+
 static void cxld_set_interleave(struct cxl_decoder *cxld, u32 *ctrl)
 {
 	u16 eig;
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 258c90727dd2..55808697773f 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -680,6 +680,12 @@ struct cxl_decoder *to_cxl_decoder(struct device *dev);
 struct cxl_root_decoder *to_cxl_root_decoder(struct device *dev);
 struct cxl_switch_decoder *to_cxl_switch_decoder(struct device *dev);
 struct cxl_endpoint_decoder *to_cxl_endpoint_decoder(struct device *dev);
+
+static inline struct device *cxled_dev(struct cxl_endpoint_decoder *cxled)
+{
+	return &cxled->cxld.dev;
+}
+
 bool is_root_decoder(struct device *dev);
 bool is_switch_decoder(struct device *dev);
 bool is_endpoint_decoder(struct device *dev);
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index e3bcd6d12a1c..8ec5c305d186 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -89,6 +89,10 @@ struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds);
 int devm_cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
 			 resource_size_t base, resource_size_t len,
 			 resource_size_t skipped);
+struct cxl_endpoint_decoder *cxl_request_dpa(struct cxl_port *endpoint,
+					     enum cxl_decoder_mode mode,
+					     resource_size_t min,
+					     resource_size_t max);
 
 static inline struct cxl_ep *cxl_ep_load(struct cxl_port *port,
 					 struct cxl_memdev *cxlmd)


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 17/19] cxl/region: Define a driver interface for HPA free space enumeration
  2023-06-04 23:31 [PATCH 00/19] cxl: Device memory setup Dan Williams
                   ` (15 preceding siblings ...)
  2023-06-04 23:33 ` [PATCH 16/19] cxl/hdm: Define a driver interface for DPA allocation Dan Williams
@ 2023-06-04 23:33 ` Dan Williams
  2023-06-06 15:23   ` Jonathan Cameron
  2023-06-14  0:15   ` Dave Jiang
  2023-06-04 23:33 ` [PATCH 18/19] cxl/region: Define a driver interface for region creation Dan Williams
  2023-06-04 23:33 ` [PATCH 19/19] tools/testing/cxl: Emulate a CXL accelerator with local memory Dan Williams
  18 siblings, 2 replies; 64+ messages in thread
From: Dan Williams @ 2023-06-04 23:33 UTC (permalink / raw)
  To: linux-cxl; +Cc: ira.weiny, navneet.singh

CXL region creation involves allocating capacity from device DPA
(device-physical-address space) and assigning it to decode a given HPA
(host-physical-address space). Before determining how much DPA to
allocate, the amount of available HPA must be determined. Also, not all
HPA is created equal: some specifically targets RAM, some targets PMEM,
some is prepared for the device-memory flows like HDM-D and HDM-DB, and
some is host-only (HDM-H).

Wrap all of those concerns into an API that retrieves a root decoder
(platform CXL window) that fits the specified constraints and the
capacity available for a new region.
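
A single host-bridge consumer, as in the cxl_test accelerator at the
end of this series, looks like this sketch (error handling abbreviated):

	resource_size_t max = 0;

	cxlrd = cxl_hpa_freespace(endpoint, &endpoint->host_bridge, 1,
				  CXL_DECODER_F_RAM | CXL_DECODER_F_TYPE2,
				  &max);
	if (IS_ERR(cxlrd))
		return PTR_ERR(cxlrd);
	/* ... allocate up to @max bytes of DPA and create a region ... */
	put_device(cxlrd_dev(cxlrd));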

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/region.c |  143 +++++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h         |    5 ++
 drivers/cxl/cxlmem.h      |    5 ++
 3 files changed, 153 insertions(+)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 75c5de627868..a41756249f8d 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -575,6 +575,149 @@ static int free_hpa(struct cxl_region *cxlr)
 	return 0;
 }
 
+struct cxlrd_max_context {
+	struct device * const *host_bridges;
+	int interleave_ways;
+	unsigned long flags;
+	resource_size_t max_hpa;
+	struct cxl_root_decoder *cxlrd;
+};
+
+static int find_max_hpa(struct device *dev, void *data)
+{
+	struct cxlrd_max_context *ctx = data;
+	struct cxl_switch_decoder *cxlsd;
+	struct cxl_root_decoder *cxlrd;
+	struct resource *res, *prev;
+	struct cxl_decoder *cxld;
+	resource_size_t max;
+	unsigned int seq;
+	int found;
+
+	if (!is_root_decoder(dev))
+		return 0;
+
+	cxlrd = to_cxl_root_decoder(dev);
+	cxld = &cxlrd->cxlsd.cxld;
+	if ((cxld->flags & ctx->flags) != ctx->flags)
+		return 0;
+
+	if (cxld->interleave_ways != ctx->interleave_ways)
+		return 0;
+
+	cxlsd = &cxlrd->cxlsd;
+	do {
+		found = 0;
+		seq = read_seqbegin(&cxlsd->target_lock);
+		for (int i = 0; i < ctx->interleave_ways; i++)
+			for (int j = 0; j < ctx->interleave_ways; j++)
+				if (ctx->host_bridges[i] ==
+				    cxlsd->target[j]->dport) {
+					found++;
+					break;
+				}
+	} while (read_seqretry(&cxlsd->target_lock, seq));
+
+	if (found != ctx->interleave_ways)
+		return 0;
+
+	/*
+	 * Walk the root decoder resource range relying on cxl_region_rwsem to
+	 * preclude sibling arrival/departure and find the largest free space
+	 * gap.
+	 */
+	lockdep_assert_held_read(&cxl_region_rwsem);
+	res = cxlrd->res->child;
+	if (!res)
+		max = resource_size(cxlrd->res);
+	else
+		max = 0;
+	for (prev = NULL; res; prev = res, res = res->sibling) {
+		struct resource *next = res->sibling;
+		resource_size_t free = 0;
+
+		if (!prev && res->start > cxlrd->res->start) {
+			free = res->start - cxlrd->res->start;
+			max = max(free, max);
+		}
+		if (prev && res->start > prev->end + 1) {
+			free = res->start - prev->end - 1;
+			max = max(free, max);
+		}
+		if (next && res->end + 1 < next->start) {
+			free = next->start - res->end - 1;
+			max = max(free, max);
+		}
+		if (!next && res->end + 1 < cxlrd->res->end + 1) {
+			free = cxlrd->res->end - res->end;
+			max = max(free, max);
+		}
+	}
+
+	if (max > ctx->max_hpa) {
+		if (ctx->cxlrd)
+			put_device(cxlrd_dev(ctx->cxlrd));
+		get_device(cxlrd_dev(cxlrd));
+		ctx->cxlrd = cxlrd;
+		ctx->max_hpa = max;
+		dev_dbg(cxlrd_dev(cxlrd), "found %pa bytes of free space\n", &max);
+	}
+
+	return 0;
+}
+
+/**
+ * cxl_hpa_freespace - find a root decoder with free capacity per constraints
+ * @endpoint: an endpoint that is mapped by the returned decoder
+ * @host_bridges: array of host-bridges that the decoder must interleave
+ * @interleave_ways: number of entries in @host_bridges
+ * @flags: CXL_DECODER_F flags for selecting RAM vs PMEM, and HDM-H vs HDM-D[B]
+ * @max: output parameter of bytes available in the returned decoder
+ *
+ * The return tuple of a 'struct cxl_root_decoder' and 'bytes available (@max)'
+ * is a point-in-time snapshot. If, by the time the caller goes to use
+ * this root decoder's capacity, that capacity has been reduced, then
+ * the caller needs to loop and retry.
+ *
+ * The returned root decoder has an elevated reference count that needs to be
+ * put with put_device(cxlrd_dev(cxlrd)). Locking context is with
+ * cxl_{acquire,release}_endpoint(), that ensures removal of the root decoder
+ * does not race.
+ */
+struct cxl_root_decoder *cxl_hpa_freespace(struct cxl_port *endpoint,
+					   struct device *const *host_bridges,
+					   int interleave_ways,
+					   unsigned long flags,
+					   resource_size_t *max)
+{
+	struct cxlrd_max_context ctx = {
+		.host_bridges = host_bridges,
+		.interleave_ways = interleave_ways,
+		.flags = flags,
+	};
+	struct cxl_port *root;
+
+	if (!is_cxl_endpoint(endpoint))
+		return ERR_PTR(-EINVAL);
+
+	root = find_cxl_root(endpoint);
+	if (!root)
+		return ERR_PTR(-ENXIO);
+
+	down_read(&cxl_region_rwsem);
+	device_for_each_child(&root->dev, &ctx, find_max_hpa);
+	up_read(&cxl_region_rwsem);
+	put_device(&root->dev);
+
+	if (!ctx.cxlrd)
+		return ERR_PTR(-ENOMEM);
+
+	*max = ctx.max_hpa;
+	return ctx.cxlrd;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_hpa_freespace, CXL);
+
 static ssize_t size_store(struct device *dev, struct device_attribute *attr,
 			  const char *buf, size_t len)
 {
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 55808697773f..8400af85d99f 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -686,6 +686,11 @@ static inline struct device *cxled_dev(struct cxl_endpoint_decoder *cxled)
 	return &cxled->cxld.dev;
 }
 
+static inline struct device *cxlrd_dev(struct cxl_root_decoder *cxlrd)
+{
+	return &cxlrd->cxlsd.cxld.dev;
+}
+
 bool is_root_decoder(struct device *dev);
 bool is_switch_decoder(struct device *dev);
 bool is_endpoint_decoder(struct device *dev);
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 8ec5c305d186..69f07186502d 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -93,6 +93,11 @@ struct cxl_endpoint_decoder *cxl_request_dpa(struct cxl_port *endpoint,
 					     enum cxl_decoder_mode mode,
 					     resource_size_t min,
 					     resource_size_t max);
+struct cxl_root_decoder *cxl_hpa_freespace(struct cxl_port *endpoint,
+					   struct device *const *host_bridges,
+					   int interleave_ways,
+					   unsigned long flags,
+					   resource_size_t *max);
 
 static inline struct cxl_ep *cxl_ep_load(struct cxl_port *port,
 					 struct cxl_memdev *cxlmd)


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 18/19] cxl/region: Define a driver interface for region creation
  2023-06-04 23:31 [PATCH 00/19] cxl: Device memory setup Dan Williams
                   ` (16 preceding siblings ...)
  2023-06-04 23:33 ` [PATCH 17/19] cxl/region: Define a driver interface for HPA free space enumeration Dan Williams
@ 2023-06-04 23:33 ` Dan Williams
  2023-06-06 15:31   ` Jonathan Cameron
  2023-06-04 23:33 ` [PATCH 19/19] tools/testing/cxl: Emulate a CXL accelerator with local memory Dan Williams
  18 siblings, 1 reply; 64+ messages in thread
From: Dan Williams @ 2023-06-04 23:33 UTC (permalink / raw)
  To: linux-cxl; +Cc: ira.weiny, navneet.singh

Scenarios like recreating persistent memory regions from label data and
establishing new regions for CXL-attached accelerators with local memory
need a kernel-internal facility to create regions.

Introduce cxl_create_region() that takes an array of endpoint decoders
with reserved capacity and a root decoder object to establish a new
region.
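
Together with cxl_hpa_freespace() and cxl_request_dpa() this completes
the kernel-internal flow, mirroring the cxl_test consumer in the next
patch (sketch, error handling and cleanup abbreviated):

	cxlrd = cxl_hpa_freespace(endpoint, &endpoint->host_bridge, 1,
				  CXL_DECODER_F_RAM | CXL_DECODER_F_TYPE2,
				  &max);
	cxled = cxl_request_dpa(endpoint, CXL_DECODER_RAM, 0, max);
	cxlr = cxl_create_region(cxlrd, &cxled, 1);
	if (IS_ERR(cxlr))
		return PTR_ERR(cxlr);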

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/region.c |  107 +++++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxlmem.h      |    3 +
 2 files changed, 110 insertions(+)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index a41756249f8d..543c4499379e 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2878,6 +2878,104 @@ construct_region_begin(struct cxl_root_decoder *cxlrd,
 	return cxlr;
 }
 
+static struct cxl_region *
+__construct_new_region(struct cxl_root_decoder *cxlrd,
+		       struct cxl_endpoint_decoder **cxled, int ways)
+{
+	struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
+	struct cxl_region_params *p;
+	resource_size_t size = 0;
+	struct cxl_region *cxlr;
+	int rc, i;
+
+	if (ways < 1)
+		return ERR_PTR(-EINVAL);
+
+	cxlr = construct_region_begin(cxlrd, cxled[0]);
+	if (IS_ERR(cxlr))
+		return cxlr;
+
+	rc = set_interleave_ways(cxlr, ways);
+	if (rc)
+		goto out;
+
+	rc = set_interleave_granularity(cxlr, cxld->interleave_granularity);
+	if (rc)
+		goto out;
+
+	down_read(&cxl_dpa_rwsem);
+	for (i = 0; i < ways; i++) {
+		if (!cxled[i]->dpa_res)
+			break;
+		size += resource_size(cxled[i]->dpa_res);
+	}
+	up_read(&cxl_dpa_rwsem);
+
+	/* all endpoint decoders must have DPA reserved at this point */
+	if (i < ways) {
+		rc = -ENXIO;
+		goto out;
+	}
+
+	rc = alloc_hpa(cxlr, size);
+	if (rc)
+		goto out;
+
+	down_read(&cxl_dpa_rwsem);
+	for (i = 0; i < ways; i++) {
+		rc = cxl_region_attach(cxlr, cxled[i], i);
+		if (rc)
+			break;
+	}
+	up_read(&cxl_dpa_rwsem);
+
+	if (rc)
+		goto out;
+
+	rc = cxl_region_decode_commit(cxlr);
+	if (rc)
+		goto out;
+
+	p = &cxlr->params;
+	p->state = CXL_CONFIG_COMMIT;
+out:
+	construct_region_end();
+	if (rc) {
+		drop_region(cxlr);
+		return ERR_PTR(rc);
+	}
+	return cxlr;
+}
+
+/**
+ * cxl_create_region - Establish a region given an array of endpoint decoders
+ * @cxlrd: root decoder to allocate HPA
+ * @cxled: array of endpoint decoders with reserved DPA capacity
+ * @ways: size of @cxled array
+ *
+ * Returns a fully formed region in the commit state and attached to the
+ * cxl_region driver.
+ */
+struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
+				     struct cxl_endpoint_decoder **cxled,
+				     int ways)
+{
+	struct cxl_region *cxlr;
+
+	mutex_lock(&cxlrd->range_lock);
+	cxlr = __construct_new_region(cxlrd, cxled, ways);
+	mutex_unlock(&cxlrd->range_lock);
+
+	if (IS_ERR(cxlr))
+		return cxlr;
+
+	if (device_attach(&cxlr->dev) <= 0) {
+		dev_err(&cxlr->dev, "failed to create region\n");
+		drop_region(cxlr);
+		return ERR_PTR(-ENODEV);
+	}
+
+	return cxlr;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_create_region, CXL);
+
 /* Establish an empty region covering the given HPA range */
 static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
 					   struct cxl_endpoint_decoder *cxled)
@@ -3085,6 +3183,15 @@ static int cxl_region_probe(struct device *dev)
 					p->res->start, p->res->end, cxlr,
 					is_system_ram) > 0)
 			return 0;
+
+		/*
+		 * HDM-D[B] (device-memory) regions have accelerator
+		 * specific usage, skip device-dax registration.
+		 */
+		if (cxlr->type == CXL_DECODER_DEVMEM)
+			return 0;
+
+		/* HDM-H routes to device-dax */
 		return devm_cxl_add_dax_region(cxlr);
 	default:
 		dev_dbg(&cxlr->dev, "unsupported region mode: %d\n",
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 69f07186502d..ad7f806549d3 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -98,6 +98,9 @@ struct cxl_root_decoder *cxl_hpa_freespace(struct cxl_port *endpoint,
 					   int interleave_ways,
 					   unsigned long flags,
 					   resource_size_t *max);
+struct cxl_region *cxl_create_region(struct cxl_root_decoder *cxlrd,
+				     struct cxl_endpoint_decoder **cxled,
+				     int ways);
 
 static inline struct cxl_ep *cxl_ep_load(struct cxl_port *port,
 					 struct cxl_memdev *cxlmd)


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 19/19] tools/testing/cxl: Emulate a CXL accelerator with local memory
  2023-06-04 23:31 [PATCH 00/19] cxl: Device memory setup Dan Williams
                   ` (17 preceding siblings ...)
  2023-06-04 23:33 ` [PATCH 18/19] cxl/region: Define a driver interface for region creation Dan Williams
@ 2023-06-04 23:33 ` Dan Williams
  2023-06-06 15:34   ` Jonathan Cameron
  2023-06-07 21:09   ` Vikram Sethi
  18 siblings, 2 replies; 64+ messages in thread
From: Dan Williams @ 2023-06-04 23:33 UTC (permalink / raw)
  To: linux-cxl; +Cc: ira.weiny, navneet.singh

Mock up a device that does not have a standard mailbox, i.e. a device
that does not implement the CXL memory-device class code, but wants to
map "device" memory (aka Type-2, aka HDM-D[B], aka accelerator memory).

For extra code coverage make this device an RCD to test region creation
flows in the presence of an RCH topology (memory device modeled as a
root-complex-integrated-endpoint RCIEP).

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/memdev.c    |   15 +++++++
 drivers/cxl/cxlmem.h         |    1 
 tools/testing/cxl/test/cxl.c |   16 +++++++-
 tools/testing/cxl/test/mem.c |   85 +++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 112 insertions(+), 5 deletions(-)

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 859c43c340bb..5d1ba7a72567 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -467,6 +467,21 @@ static void detach_memdev(struct work_struct *work)
 	put_device(&cxlmd->dev);
 }
 
+struct cxl_dev_state *cxl_accel_state_create(struct device *dev)
+{
+	struct cxl_dev_state *cxlds;
+
+	cxlds = devm_kzalloc(dev, sizeof(*cxlds), GFP_KERNEL);
+	if (!cxlds)
+		return ERR_PTR(-ENOMEM);
+
+	cxlds->dev = dev;
+	cxlds->type = CXL_DEVTYPE_DEVMEM;
+
+	return cxlds;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_accel_state_create, CXL);
+
 static struct lock_class_key cxl_memdev_key;
 
 static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index ad7f806549d3..89e560ea14c0 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -746,6 +746,7 @@ int cxl_await_media_ready(struct cxl_dev_state *cxlds);
 int cxl_enumerate_cmds(struct cxl_memdev_state *mds);
 int cxl_mem_create_range_info(struct cxl_memdev_state *mds);
 struct cxl_memdev_state *cxl_memdev_state_create(struct device *dev);
+struct cxl_dev_state *cxl_accel_state_create(struct device *dev);
 void set_exclusive_cxl_commands(struct cxl_memdev_state *mds,
 				unsigned long *cmds);
 void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c
index e3f1b2e88e3e..385cdeeab22c 100644
--- a/tools/testing/cxl/test/cxl.c
+++ b/tools/testing/cxl/test/cxl.c
@@ -278,7 +278,7 @@ static struct {
 			},
 			.interleave_ways = 0,
 			.granularity = 4,
-			.restrictions = ACPI_CEDT_CFMWS_RESTRICT_TYPE3 |
+			.restrictions = ACPI_CEDT_CFMWS_RESTRICT_TYPE2 |
 					ACPI_CEDT_CFMWS_RESTRICT_VOLATILE,
 			.qtg_id = 5,
 			.window_size = SZ_256M,
@@ -713,7 +713,19 @@ static void default_mock_decoder(struct cxl_decoder *cxld)
 
 	cxld->interleave_ways = 1;
 	cxld->interleave_granularity = 256;
-	cxld->target_type = CXL_DECODER_HOSTMEM;
+	if (is_endpoint_decoder(&cxld->dev)) {
+		struct cxl_endpoint_decoder *cxled;
+		struct cxl_dev_state *cxlds;
+		struct cxl_memdev *cxlmd;
+
+		cxled = to_cxl_endpoint_decoder(&cxld->dev);
+		cxlmd = cxled_to_memdev(cxled);
+		cxlds = cxlmd->cxlds;
+		if (cxlds->type == CXL_DEVTYPE_CLASSMEM)
+			cxld->target_type = CXL_DECODER_HOSTMEM;
+		else
+			cxld->target_type = CXL_DECODER_DEVMEM;
+	}
 	cxld->commit = mock_decoder_commit;
 	cxld->reset = mock_decoder_reset;
 }
diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
index 6fb5718588f3..620bfcf5e5a5 100644
--- a/tools/testing/cxl/test/mem.c
+++ b/tools/testing/cxl/test/mem.c
@@ -1189,11 +1189,21 @@ static void label_area_release(void *lsa)
 	vfree(lsa);
 }
 
+#define CXL_MOCKMEM_RCD BIT(0)
+#define CXL_MOCKMEM_TYPE2 BIT(1)
+
 static bool is_rcd(struct platform_device *pdev)
 {
 	const struct platform_device_id *id = platform_get_device_id(pdev);
 
-	return !!id->driver_data;
+	return !!(id->driver_data & CXL_MOCKMEM_RCD);
+}
+
+static bool is_type2(struct platform_device *pdev)
+{
+	const struct platform_device_id *id = platform_get_device_id(pdev);
+
+	return !!(id->driver_data & CXL_MOCKMEM_TYPE2);
 }
 
 static ssize_t event_trigger_store(struct device *dev,
@@ -1205,7 +1215,7 @@ static ssize_t event_trigger_store(struct device *dev,
 }
 static DEVICE_ATTR_WO(event_trigger);
 
-static int cxl_mock_mem_probe(struct platform_device *pdev)
+static int __cxl_mock_mem_probe(struct platform_device *pdev)
 {
 	struct device *dev = &pdev->dev;
 	struct cxl_memdev *cxlmd;
@@ -1274,6 +1284,75 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
 	return 0;
 }
 
+static int cxl_mock_type2_probe(struct platform_device *pdev)
+{
+	struct cxl_endpoint_decoder *cxled;
+	struct device *dev = &pdev->dev;
+	struct cxl_root_decoder *cxlrd;
+	struct cxl_dev_state *cxlds;
+	struct cxl_port *endpoint;
+	struct cxl_memdev *cxlmd;
+	resource_size_t max = 0;
+	int rc;
+
+	cxlds = cxl_accel_state_create(dev);
+	if (IS_ERR(cxlds))
+		return PTR_ERR(cxlds);
+
+	cxlds->serial = pdev->id;
+	cxlds->component_reg_phys = CXL_RESOURCE_NONE;
+	cxlds->dpa_res = DEFINE_RES_MEM(0, DEV_SIZE);
+	cxlds->ram_res = DEFINE_RES_MEM_NAMED(0, DEV_SIZE, "ram");
+	cxlds->pmem_res = DEFINE_RES_MEM_NAMED(DEV_SIZE, 0, "pmem");
+	if (is_rcd(pdev))
+		cxlds->rcd = true;
+
+	rc = request_resource(&cxlds->dpa_res, &cxlds->ram_res);
+	if (rc)
+		return rc;
+
+	cxlmd = devm_cxl_add_memdev(cxlds);
+	if (IS_ERR(cxlmd))
+		return PTR_ERR(cxlmd);
+
+	endpoint = cxl_acquire_endpoint(cxlmd);
+	if (IS_ERR(endpoint))
+		return PTR_ERR(endpoint);
+
+	cxlrd = cxl_hpa_freespace(endpoint, &endpoint->host_bridge, 1,
+				  CXL_DECODER_F_RAM | CXL_DECODER_F_TYPE2,
+				  &max);
+
+	if (IS_ERR(cxlrd)) {
+		rc = PTR_ERR(cxlrd);
+		goto out;
+	}
+
+	cxled = cxl_request_dpa(endpoint, CXL_DECODER_RAM, 0, max);
+	if (IS_ERR(cxled)) {
+		rc = PTR_ERR(cxled);
+		goto out_cxlrd;
+	}
+
+	/* A real driver would do something with the returned region */
+	rc = PTR_ERR_OR_ZERO(cxl_create_region(cxlrd, &cxled, 1));
+
+	put_device(cxled_dev(cxled));
+out_cxlrd:
+	put_device(cxlrd_dev(cxlrd));
+out:
+	cxl_release_endpoint(cxlmd, endpoint);
+
+	return rc;
+}
+
+static int cxl_mock_mem_probe(struct platform_device *pdev)
+{
+	if (is_type2(pdev))
+		return cxl_mock_type2_probe(pdev);
+	return __cxl_mock_mem_probe(pdev);
+}
+
 static ssize_t security_lock_show(struct device *dev,
 				  struct device_attribute *attr, char *buf)
 {
@@ -1316,7 +1395,7 @@ ATTRIBUTE_GROUPS(cxl_mock_mem);
 
 static const struct platform_device_id cxl_mock_mem_ids[] = {
 	{ .name = "cxl_mem", 0 },
-	{ .name = "cxl_rcd", 1 },
+	{ .name = "cxl_rcd", CXL_MOCKMEM_RCD | CXL_MOCKMEM_TYPE2 },
 	{ },
 };
 MODULE_DEVICE_TABLE(platform, cxl_mock_mem_ids);


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [PATCH 06/19] cxl/hdm: Default CXL_DEVTYPE_DEVMEM decoders to CXL_DECODER_DEVMEM
  2023-06-04 23:32 ` [PATCH 06/19] cxl/hdm: Default CXL_DEVTYPE_DEVMEM decoders to CXL_DECODER_DEVMEM Dan Williams
@ 2023-06-05  1:14   ` kernel test robot
  2023-06-06 20:10     ` Dan Williams
  2023-06-06 11:27   ` Jonathan Cameron
  1 sibling, 1 reply; 64+ messages in thread
From: kernel test robot @ 2023-06-05  1:14 UTC (permalink / raw)
  To: Dan Williams; +Cc: llvm, oe-kbuild-all

Hi Dan,

kernel test robot noticed the following build warnings:

[auto build test WARNING on 9561de3a55bed6bdd44a12820ba81ec416e705a7]

url:    https://github.com/intel-lab-lkp/linux/commits/Dan-Williams/cxl-regs-Clarify-when-a-struct-cxl_register_map-is-input-vs-output/20230605-073402
base:   9561de3a55bed6bdd44a12820ba81ec416e705a7
patch link:    https://lore.kernel.org/r/168592153054.1948938.12344684637653088842.stgit%40dwillia2-xfh.jf.intel.com
patch subject: [PATCH 06/19] cxl/hdm: Default CXL_DEVTYPE_DEVMEM decoders to CXL_DECODER_DEVMEM
config: riscv-randconfig-r033-20230605 (https://download.01.org/0day-ci/archive/20230605/202306050926.zKn5BTKk-lkp@intel.com/config)
compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project 4faf3aaf28226a4e950c103a14f6fc1d1fdabb1b)
reproduce (this is a W=1 build):
        mkdir -p ~/bin
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install riscv cross compiling tool for clang build
        # apt-get install binutils-riscv-linux-gnu
        # https://github.com/intel-lab-lkp/linux/commit/c10eec1d3096b7e244f6c40478b3c2c1bde921fc
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Dan-Williams/cxl-regs-Clarify-when-a-struct-cxl_register_map-is-input-vs-output/20230605-073402
        git checkout c10eec1d3096b7e244f6c40478b3c2c1bde921fc
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang ~/bin/make.cross W=1 O=build_dir ARCH=riscv olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang ~/bin/make.cross W=1 O=build_dir ARCH=riscv SHELL=/bin/bash drivers/cxl/core/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202306050926.zKn5BTKk-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/cxl/core/hdm.c:863:7: warning: variable 'cxled' is uninitialized when used here [-Wuninitialized]
                   if (cxled) {
                       ^~~~~
   drivers/cxl/core/hdm.c:797:36: note: initialize the variable 'cxled' to silence this warning
           struct cxl_endpoint_decoder *cxled;
                                             ^
                                              = NULL
   1 warning generated.


vim +/cxled +863 drivers/cxl/core/hdm.c

   791	
   792	static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
   793				    int *target_map, void __iomem *hdm, int which,
   794				    u64 *dpa_base, struct cxl_endpoint_dvsec_info *info)
   795	{
   796		u64 size, base, skip, dpa_size, lo, hi;
   797		struct cxl_endpoint_decoder *cxled;
   798		bool committed;
   799		u32 remainder;
   800		int i, rc;
   801		u32 ctrl;
   802		union {
   803			u64 value;
   804			unsigned char target_id[8];
   805		} target_list;
   806	
   807		if (should_emulate_decoders(info))
   808			return cxl_setup_hdm_decoder_from_dvsec(port, cxld, dpa_base,
   809								which, info);
   810	
   811		ctrl = readl(hdm + CXL_HDM_DECODER0_CTRL_OFFSET(which));
   812		lo = readl(hdm + CXL_HDM_DECODER0_BASE_LOW_OFFSET(which));
   813		hi = readl(hdm + CXL_HDM_DECODER0_BASE_HIGH_OFFSET(which));
   814		base = (hi << 32) + lo;
   815		lo = readl(hdm + CXL_HDM_DECODER0_SIZE_LOW_OFFSET(which));
   816		hi = readl(hdm + CXL_HDM_DECODER0_SIZE_HIGH_OFFSET(which));
   817		size = (hi << 32) + lo;
   818		committed = !!(ctrl & CXL_HDM_DECODER0_CTRL_COMMITTED);
   819		cxld->commit = cxl_decoder_commit;
   820		cxld->reset = cxl_decoder_reset;
   821	
   822		if (!committed)
   823			size = 0;
   824		if (base == U64_MAX || size == U64_MAX) {
   825			dev_warn(&port->dev, "decoder%d.%d: Invalid resource range\n",
   826				 port->id, cxld->id);
   827			return -ENXIO;
   828		}
   829	
   830		cxld->hpa_range = (struct range) {
   831			.start = base,
   832			.end = base + size - 1,
   833		};
   834	
   835		/* decoders are enabled if committed */
   836		if (committed) {
   837			cxld->flags |= CXL_DECODER_F_ENABLE;
   838			if (ctrl & CXL_HDM_DECODER0_CTRL_LOCK)
   839				cxld->flags |= CXL_DECODER_F_LOCK;
   840			if (FIELD_GET(CXL_HDM_DECODER0_CTRL_TYPE, ctrl))
   841				cxld->target_type = CXL_DECODER_HOSTMEM;
   842			else
   843				cxld->target_type = CXL_DECODER_DEVMEM;
   844			if (cxld->id != port->commit_end + 1) {
   845				dev_warn(&port->dev,
   846					 "decoder%d.%d: Committed out of order\n",
   847					 port->id, cxld->id);
   848				return -ENXIO;
   849			}
   850	
   851			if (size == 0) {
   852				dev_warn(&port->dev,
   853					 "decoder%d.%d: Committed with zero size\n",
   854					 port->id, cxld->id);
   855				return -ENXIO;
   856			}
   857			port->commit_end = cxld->id;
   858		} else {
   859			if (FIELD_GET(CXL_HDM_DECODER0_CTRL_TYPE, ctrl) == 0) {
   860				ctrl |= CXL_HDM_DECODER0_CTRL_TYPE;
   861				writel(ctrl, hdm + CXL_HDM_DECODER0_CTRL_OFFSET(which));
   862			}
 > 863			if (cxled) {
   864				struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
   865				struct cxl_dev_state *cxlds = cxlmd->cxlds;
   866	
   867				if (cxlds->type == CXL_DEVTYPE_CLASSMEM)
   868					cxld->target_type = CXL_DECODER_HOSTMEM;
   869				else
   870					cxld->target_type = CXL_DECODER_DEVMEM;
   871			} else {
   872				/* To be overridden by region type at commit time */
   873				cxld->target_type = CXL_DECODER_HOSTMEM;
   874			}
   875		}
   876		rc = eiw_to_ways(FIELD_GET(CXL_HDM_DECODER0_CTRL_IW_MASK, ctrl),
   877				  &cxld->interleave_ways);
   878		if (rc) {
   879			dev_warn(&port->dev,
   880				 "decoder%d.%d: Invalid interleave ways (ctrl: %#x)\n",
   881				 port->id, cxld->id, ctrl);
   882			return rc;
   883		}
   884		rc = eig_to_granularity(FIELD_GET(CXL_HDM_DECODER0_CTRL_IG_MASK, ctrl),
   885					 &cxld->interleave_granularity);
   886		if (rc)
   887			return rc;
   888	
   889		dev_dbg(&port->dev, "decoder%d.%d: range: %#llx-%#llx iw: %d ig: %d\n",
   890			port->id, cxld->id, cxld->hpa_range.start, cxld->hpa_range.end,
   891			cxld->interleave_ways, cxld->interleave_granularity);
   892	
   893		if (!info) {
   894			lo = readl(hdm + CXL_HDM_DECODER0_TL_LOW(which));
   895			hi = readl(hdm + CXL_HDM_DECODER0_TL_HIGH(which));
   896			target_list.value = (hi << 32) + lo;
   897			for (i = 0; i < cxld->interleave_ways; i++)
   898				target_map[i] = target_list.target_id[i];
   899	
   900			return 0;
   901		}
   902	
   903		if (!committed)
   904			return 0;
   905	
   906		dpa_size = div_u64_rem(size, cxld->interleave_ways, &remainder);
   907		if (remainder) {
   908			dev_err(&port->dev,
   909				"decoder%d.%d: invalid committed configuration size: %#llx ways: %d\n",
   910				port->id, cxld->id, size, cxld->interleave_ways);
   911			return -ENXIO;
   912		}
   913		lo = readl(hdm + CXL_HDM_DECODER0_SKIP_LOW(which));
   914		hi = readl(hdm + CXL_HDM_DECODER0_SKIP_HIGH(which));
   915		skip = (hi << 32) + lo;
   916		cxled = to_cxl_endpoint_decoder(&cxld->dev);
   917		rc = devm_cxl_dpa_reserve(cxled, *dpa_base + skip, dpa_size, skip);
   918		if (rc) {
   919			dev_err(&port->dev,
   920				"decoder%d.%d: Failed to reserve DPA range %#llx - %#llx\n (%d)",
   921				port->id, cxld->id, *dpa_base,
   922				*dpa_base + dpa_size + skip - 1, rc);
   923			return rc;
   924		}
   925		*dpa_base += dpa_size + skip;
   926	
   927		cxled->state = CXL_DECODER_STATE_AUTO;
   928	
   929		return 0;
   930	}
   931	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 01/19] cxl/regs: Clarify when a 'struct cxl_register_map' is input vs output
  2023-06-04 23:31 ` [PATCH 01/19] cxl/regs: Clarify when a 'struct cxl_register_map' is input vs output Dan Williams
@ 2023-06-05  8:46   ` Jonathan Cameron
  2023-06-13 22:03   ` Dave Jiang
  1 sibling, 0 replies; 64+ messages in thread
From: Jonathan Cameron @ 2023-06-05  8:46 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, ira.weiny, navneet.singh

On Sun, 04 Jun 2023 16:31:43 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> The @map parameter to cxl_probe_X_registers() is filled in with the
> mapping parameters of the register block. The @map parameter to
> cxl_map_X_registers() only reads that information to perform the
> mapping. Mark @map const for cxl_map_X_registers() to clarify that it is
> only an input to those helpers.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This makes sense as a stand alone clarification.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
>  drivers/cxl/core/regs.c |    8 ++++----
>  drivers/cxl/cxl.h       |    4 ++--
>  2 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
> index 1476a0299c9b..52d1dbeda527 100644
> --- a/drivers/cxl/core/regs.c
> +++ b/drivers/cxl/core/regs.c
> @@ -200,10 +200,10 @@ void __iomem *devm_cxl_iomap_block(struct device *dev, resource_size_t addr,
>  }
>  
>  int cxl_map_component_regs(struct device *dev, struct cxl_component_regs *regs,
> -			   struct cxl_register_map *map, unsigned long map_mask)
> +			   const struct cxl_register_map *map, unsigned long map_mask)
>  {
>  	struct mapinfo {
> -		struct cxl_reg_map *rmap;
> +		const struct cxl_reg_map *rmap;
>  		void __iomem **addr;
>  	} mapinfo[] = {
>  		{ &map->component_map.hdm_decoder, &regs->hdm_decoder },
> @@ -233,11 +233,11 @@ EXPORT_SYMBOL_NS_GPL(cxl_map_component_regs, CXL);
>  
>  int cxl_map_device_regs(struct device *dev,
>  			struct cxl_device_regs *regs,
> -			struct cxl_register_map *map)
> +			const struct cxl_register_map *map)
>  {
>  	resource_size_t phys_addr = map->resource;
>  	struct mapinfo {
> -		struct cxl_reg_map *rmap;
> +		const struct cxl_reg_map *rmap;
>  		void __iomem **addr;
>  	} mapinfo[] = {
>  		{ &map->device_map.status, &regs->status, },
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index f93a28538962..dfc94e76c7d6 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -254,10 +254,10 @@ void cxl_probe_component_regs(struct device *dev, void __iomem *base,
>  void cxl_probe_device_regs(struct device *dev, void __iomem *base,
>  			   struct cxl_device_reg_map *map);
>  int cxl_map_component_regs(struct device *dev, struct cxl_component_regs *regs,
> -			   struct cxl_register_map *map,
> +			   const struct cxl_register_map *map,
>  			   unsigned long map_mask);
>  int cxl_map_device_regs(struct device *dev, struct cxl_device_regs *regs,
> -			struct cxl_register_map *map);
> +			const struct cxl_register_map *map);
>  
>  enum cxl_regloc_type;
>  int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
> 
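As a concrete picture of the convention this encodes (a sketch, not from the
patch; the HDM mask value is illustrative and error handling is elided):

	static int my_map_regs(struct device *dev, void __iomem *base,
			       struct cxl_component_regs *regs)
	{
		struct cxl_register_map map;

		/* probe is the producer: it fills in @map */
		cxl_probe_component_regs(dev, base, &map.component_map);

		/* map is the consumer: @map is read-only, hence const */
		return cxl_map_component_regs(dev, regs, &map,
					      BIT(CXL_CM_CAP_CAP_ID_HDM));
	}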


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 02/19] tools/testing/cxl: Remove unused @cxlds argument
  2023-06-04 23:31 ` [PATCH 02/19] tools/testing/cxl: Remove unused @cxlds argument Dan Williams
@ 2023-06-06 10:53   ` Jonathan Cameron
  2023-06-13 22:08   ` Dave Jiang
  1 sibling, 0 replies; 64+ messages in thread
From: Jonathan Cameron @ 2023-06-06 10:53 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, ira.weiny, navneet.singh

On Sun, 04 Jun 2023 16:31:48 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> In preparation for plumbing a 'struct cxl_memdev_state' as a superset of
> a 'struct cxl_dev_state' cleanup the usage of @cxlds in the unit test
> infrastructure.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Sensible cleanup. I'd be tempted to queue some of these up independent
of the main patch set, just to cut down on the noise when we get to the
interesting parts.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 03/19] cxl/mbox: Move mailbox related driver state to its own data structure
  2023-06-04 23:31 ` [PATCH 03/19] cxl/mbox: Move mailbox related driver state to its own data structure Dan Williams
@ 2023-06-06 11:10   ` Jonathan Cameron
  2023-06-14  0:45     ` Dan Williams
  2023-06-13 22:15   ` Dave Jiang
  1 sibling, 1 reply; 64+ messages in thread
From: Jonathan Cameron @ 2023-06-06 11:10 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, ira.weiny, navneet.singh

On Sun, 04 Jun 2023 16:31:54 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> 'struct cxl_dev_state' makes too many assumptions about the capabilities
> of a CXL device. In particular it assumes a CXL device has a mailbox and
> all of the infrastructure and state that comes along with that.
> 
> In preparation for supporting accelerator / Type-2 devices that may not
> have a mailbox and in general maintain a minimal core context structure,
> make mailbox functionality a super-set of  'struct cxl_dev_state' with
> 'struct cxl_memdev_state'.
> 
> With this reorganization it allows for CXL devices that support HDM
> decoder mapping, but not other general-expander / Type-3 capabilities,
> to only enable that subset without the rest of the mailbox
> infrastructure coming along for the ride.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

I'm not yet sure that the division is exactly in the right place, but we
can move things later if it turns out some elements are more general than
we currently think.

A few trivial things inline.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>


> ---

 
> -static struct cxl_mbox_get_supported_logs *cxl_get_gsl(struct cxl_dev_state *cxlds)
> +static struct cxl_mbox_get_supported_logs *
> +cxl_get_gsl(struct cxl_memdev_state *mds)

I'd consider keeping this on one line.  It was between 80 and 90 before and still is...


>  {


> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index a2845a7a69d8..d3fe73d5ba4d 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -267,6 +267,35 @@ struct cxl_poison_state {
>   * @cxl_dvsec: Offset to the PCIe device DVSEC
>   * @rcd: operating in RCD mode (CXL 3.0 9.11.8 CXL Devices Attached to an RCH)
>   * @media_ready: Indicate whether the device media is usable
> + * @dpa_res: Overall DPA resource tree for the device
> + * @pmem_res: Active Persistent memory capacity configuration
> + * @ram_res: Active Volatile memory capacity configuration
> + * @component_reg_phys: register base of component registers
> + * @info: Cached DVSEC information about the device.

Not seeing @info in this structure.

> + * @serial: PCIe Device Serial Number
> + */
> +struct cxl_dev_state {
> +	struct device *dev;
> +	struct cxl_memdev *cxlmd;
> +	struct cxl_regs regs;
> +	int cxl_dvsec;
> +	bool rcd;
> +	bool media_ready;
> +	struct resource dpa_res;
> +	struct resource pmem_res;
> +	struct resource ram_res;
> +	resource_size_t component_reg_phys;
> +	u64 serial;
> +};
> +
> +/**
> + * struct cxl_memdev_state - Generic Type-3 Memory Device Class driver data
> + *
> + * CXL 8.1.12.1 PCI Header - Class Code Register Memory Device defines
> + * common memory device functionality like the presence of a mailbox and
> + * the functionality related to that like Identify Memory Device and Get
> + * Partition Info
> + * @cxlds: Core driver state common across Type-2 and Type-3 devices
>   * @payload_size: Size of space for payload
>   *                (CXL 2.0 8.2.8.4.3 Mailbox Capabilities Register)
>   * @lsa_size: Size of Label Storage Area
> @@ -275,9 +304,6 @@ struct cxl_poison_state {
>   * @firmware_version: Firmware version for the memory device.
>   * @enabled_cmds: Hardware commands found enabled in CEL.
>   * @exclusive_cmds: Commands that are kernel-internal only
> - * @dpa_res: Overall DPA resource tree for the device
> - * @pmem_res: Active Persistent memory capacity configuration
> - * @ram_res: Active Volatile memory capacity configuration
>   * @total_bytes: sum of all possible capacities
>   * @volatile_only_bytes: hard volatile capacity
>   * @persistent_only_bytes: hard persistent capacity
> @@ -286,54 +312,41 @@ struct cxl_poison_state {
>   * @active_persistent_bytes: sum of hard + soft persistent
>   * @next_volatile_bytes: volatile capacity change pending device reset
>   * @next_persistent_bytes: persistent capacity change pending device reset
> - * @component_reg_phys: register base of component registers
> - * @info: Cached DVSEC information about the device.

Not seeing this removed from this structure in this patch.
Curiously, it doesn't seem to have been here in the first place.

Probably wants a precursor fix patch to get rid of it from the docs.

> - * @serial: PCIe Device Serial Number
>   * @event: event log driver state
>   * @poison: poison driver state info
>   * @mbox_send: @dev specific transport for transmitting mailbox commands
>   *
> - * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
> + * See CXL 3.0 8.2.9.8.2 Capacity Configuration and Label Storage for
>   * details on capacity parameters.
>   */
> -struct cxl_dev_state {
> -	struct device *dev;
> -	struct cxl_memdev *cxlmd;
> -
> -	struct cxl_regs regs;
> -	int cxl_dvsec;
> -
> -	bool rcd;
> -	bool media_ready;
> +struct cxl_memdev_state {
> +	struct cxl_dev_state cxlds;
>  	size_t payload_size;
>  	size_t lsa_size;
>  	struct mutex mbox_mutex; /* Protects device mailbox and firmware */
>  	char firmware_version[0x10];
>  	DECLARE_BITMAP(enabled_cmds, CXL_MEM_COMMAND_ID_MAX);
>  	DECLARE_BITMAP(exclusive_cmds, CXL_MEM_COMMAND_ID_MAX);
> -
> -	struct resource dpa_res;
> -	struct resource pmem_res;
> -	struct resource ram_res;
>  	u64 total_bytes;
>  	u64 volatile_only_bytes;
>  	u64 persistent_only_bytes;
>  	u64 partition_align_bytes;
> -
>  	u64 active_volatile_bytes;
>  	u64 active_persistent_bytes;
>  	u64 next_volatile_bytes;
>  	u64 next_persistent_bytes;
> -
> -	resource_size_t component_reg_phys;
> -	u64 serial;
> -
>  	struct cxl_event_state event;
>  	struct cxl_poison_state poison;
> -
> -	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
> +	int (*mbox_send)(struct cxl_memdev_state *mds,
> +			 struct cxl_mbox_cmd *cmd);
>  };

...

...
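
To make the intended layering concrete, a class-code driver now allocates the
superset and hands the embedded core around; a rough sketch using the
structures above (field initialization beyond the core elided):

	static int my_probe(struct device *dev)
	{
		struct cxl_memdev_state *mds;
		struct cxl_memdev *cxlmd;

		mds = devm_kzalloc(dev, sizeof(*mds), GFP_KERNEL);
		if (!mds)
			return -ENOMEM;
		mds->cxlds.dev = dev;

		/* generic code only ever sees the embedded core state */
		cxlmd = devm_cxl_add_memdev(&mds->cxlds);
		return PTR_ERR_OR_ZERO(cxlmd);
	}

...while a driver for a mailbox-less device would embed a bare
'struct cxl_dev_state' instead.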

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 04/19] cxl/memdev: Make mailbox functionality optional
  2023-06-04 23:31 ` [PATCH 04/19] cxl/memdev: Make mailbox functionality optional Dan Williams
@ 2023-06-06 11:15   ` Jonathan Cameron
  2023-06-13 20:53     ` Dan Williams
  0 siblings, 1 reply; 64+ messages in thread
From: Jonathan Cameron @ 2023-06-06 11:15 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, ira.weiny, navneet.singh

On Sun, 04 Jun 2023 16:31:59 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> In support of the Linux CXL core scaling for a wider set of CXL devices,
> allow for the creation of memdevs with some memory device capabilities
> disabled. Specifically, allow for CXL devices outside of those claiming
> to be compliant with the generic CXL memory device class code, like
> vendor specific Type-2/3 devices that host CXL.mem. This implies allowing
> for the creation of memdevs that only support component-registers, not
> necessarily memory-device-registers (like mailbox registers). A memdev
> derived from a CXL endpoint that does not support generic class code
> expectations is tagged "CXL_DEVTYPE_DEVMEM", while a memdev derived from a
> class-code compliant endpoint is tagged "CXL_DEVTYPE_CLASSMEM".
> 
> The primary assumption of a CXL_DEVTYPE_DEVMEM memdev is that it
> optionally may not host a mailbox. Disable the command passthrough ioctl
> for memdevs that are not CXL_DEVTYPE_CLASSMEM, and return empty strings
> from memdev attributes associated with data retrieved via the
> class-device-standard IDENTIFY command. Note that empty strings were
> chosen over attribute visibility to maintain compatibility with shipping
> versions of cxl-cli that expect those attributes to always be present.
Hmm.  I'm not keen on this, but I guess we've ended up in this corner
so we don't have much choice.

> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Trivial stuff inline.

> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index d3fe73d5ba4d..b8bdf7490d2c 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -254,6 +254,20 @@ struct cxl_poison_state {
>  	struct mutex lock;  /* Protect reads of poison list */
>  };
>  
> +/*
> + * enum cxl_devtype - delineate type-2 from a generic type-3 device
> + * @CXL_DEVTYPE_DEVMEM - Vendor specific CXL Type-2 device implementing HDM-D or

Bit of a naming collision with other uses of DEVMEM but I can't immediately think
of a better name so fair enough...

> + *			 HDM-DB, no expectation that this device implements a
> + *			 mailbox, or other memory-device-standard manageability
> + *			 flows.
no requirement that this device

Expectation is a bit strong the other way to my reading.  These devices might well
implement some or all of that, plus other stuff that means they don't want to use the
class code.

> + * @CXL_DEVTYPE_CLASSMEM - Common class definition of a CXL Type-3 device with
> + *			   HDM-H and class-mandatory memory device registers
> + */
> +enum cxl_devtype {
> +	CXL_DEVTYPE_DEVMEM,
> +	CXL_DEVTYPE_CLASSMEM,
> +};
> +
>  /**
>   * struct cxl_dev_state - The driver device state
>   *
> @@ -273,6 +287,7 @@ struct cxl_poison_state {
>   * @component_reg_phys: register base of component registers
>   * @info: Cached DVSEC information about the device.
>   * @serial: PCIe Device Serial Number
> + * @type: Generic Memory Class device or Vendor Specific Memory device
>   */
>  struct cxl_dev_state {
>  	struct device *dev;
> @@ -286,6 +301,7 @@ struct cxl_dev_state {
>  	struct resource ram_res;
>  	resource_size_t component_reg_phys;
>  	u64 serial;
> +	enum cxl_devtype type;
>  };
>  
>  /**
> @@ -344,6 +360,8 @@ struct cxl_memdev_state {
>  static inline struct cxl_memdev_state *
>  to_cxl_memdev_state(struct cxl_dev_state *cxlds)
>  {
> +	if (cxlds->type != CXL_DEVTYPE_CLASSMEM)
> +		return NULL;
>  	return container_of(cxlds, struct cxl_memdev_state, cxlds);
>  }
>  
> 
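For illustration, mailbox-dependent paths then reduce to a sketch like:

	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);

	if (!mds)
		return 0;	/* CXL_DEVTYPE_DEVMEM: no mailbox state to manage */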


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 05/19] cxl/port: Rename CXL_DECODER_{EXPANDER, ACCELERATOR} => {HOSTMEM, DEVMEM}
  2023-06-04 23:32 ` [PATCH 05/19] cxl/port: Rename CXL_DECODER_{EXPANDER, ACCELERATOR} => {HOSTMEM, DEVMEM} Dan Williams
@ 2023-06-06 11:21   ` Jonathan Cameron
  2023-06-13 21:03     ` Dan Williams
  0 siblings, 1 reply; 64+ messages in thread
From: Jonathan Cameron @ 2023-06-06 11:21 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, ira.weiny, navneet.singh

On Sun, 04 Jun 2023 16:32:05 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> In preparation for support for HDM-D and HDM-DB configuration
> (device-memory, and device-memory with back-invalidate), rename the current
> type designators to use HOSTMEM and DEVMEM as a suffix.

I'd state this is in line with the CXL 3.0 spec naming:
Device coherent address range vs Host-only coherent address range.
(Or rename HOSTMEM to HOSTONLYMEM, which I think deals with that subtlety.)
> 
> HDM-DB can be supported by devices that are not accelerators, so DEVMEM is
> a more generic term for that case.
> 
> Fixup one location where this type value was open coded.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Bike-shedding aside LGTM

Jonathan



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 06/19] cxl/hdm: Default CXL_DEVTYPE_DEVMEM decoders to CXL_DECODER_DEVMEM
  2023-06-04 23:32 ` [PATCH 06/19] cxl/hdm: Default CXL_DEVTYPE_DEVMEM decoders to CXL_DECODER_DEVMEM Dan Williams
  2023-06-05  1:14   ` kernel test robot
@ 2023-06-06 11:27   ` Jonathan Cameron
  2023-06-13 21:23     ` Dan Williams
  2023-06-13 22:32     ` Dan Williams
  1 sibling, 2 replies; 64+ messages in thread
From: Jonathan Cameron @ 2023-06-06 11:27 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, ira.weiny, navneet.singh

On Sun, 04 Jun 2023 16:32:10 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> In preparation for device-memory region creation, arrange for decoders
> of CXL_DEVTYPE_DEVMEM memdevs to default to CXL_DECODER_DEVMEM for their
> target type.

Why?  CXL_DEVTYPE_DEVMEM might just be a non-class-code-compliant, HDM-H-only
device.  I'd want those drivers to always set this explicitly.


> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/core/hdm.c |   14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> index de8a3fb28331..ca3b99c6eacf 100644
> --- a/drivers/cxl/core/hdm.c
> +++ b/drivers/cxl/core/hdm.c
> @@ -856,12 +856,22 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
>  		}
>  		port->commit_end = cxld->id;
>  	} else {
> -		/* unless / until type-2 drivers arrive, assume type-3 */
>  		if (FIELD_GET(CXL_HDM_DECODER0_CTRL_TYPE, ctrl) == 0) {
>  			ctrl |= CXL_HDM_DECODER0_CTRL_TYPE;
>  			writel(ctrl, hdm + CXL_HDM_DECODER0_CTRL_OFFSET(which));

This is setting it to be HOSTMEM if it was previously DEVMEM and that
makes it inconsistent with the state cached below.

Not sure why it was conditional in the first place - writing the existing value
should have been safe and would be less code...

>  		}
> -		cxld->target_type = CXL_DECODER_HOSTMEM;
> +		if (cxled) {
> +			struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> +			struct cxl_dev_state *cxlds = cxlmd->cxlds;
> +
> +			if (cxlds->type == CXL_DEVTYPE_CLASSMEM)
> +				cxld->target_type = CXL_DECODER_HOSTMEM;
> +			else
> +				cxld->target_type = CXL_DECODER_DEVMEM;
> +		} else {
> +			/* To be overridden by region type at commit time */
> +			cxld->target_type = CXL_DECODER_HOSTMEM;
> +		}
>  	}
>  	rc = eiw_to_ways(FIELD_GET(CXL_HDM_DECODER0_CTRL_IW_MASK, ctrl),
>  			  &cxld->interleave_ways);
> 


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 07/19] cxl/region: Manage decoder target_type at decoder-attach time
  2023-06-04 23:32 ` [PATCH 07/19] cxl/region: Manage decoder target_type at decoder-attach time Dan Williams
@ 2023-06-06 12:36   ` Jonathan Cameron
  2023-06-13 22:42   ` Dave Jiang
  1 sibling, 0 replies; 64+ messages in thread
From: Jonathan Cameron @ 2023-06-06 12:36 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, ira.weiny, navneet.singh

On Sun, 04 Jun 2023 16:32:16 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> Switch-level (mid-level) decoders between the platform root and an
> endpoint can dynamically switch modes between HDM-H and HDM-D[B]
> depending on which region they target. Use the region type to fixup each
> decoder that gets allocated to map the given region.
> 
> Note that endpoint decoders are meant to determine the region type, so
> warn if those ever need to be fixed up, but since it is possible to
> continue, do so.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
>  drivers/cxl/core/region.c |   12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index dca94c458b8f..c7170d92f47f 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -809,6 +809,18 @@ static int cxl_rr_alloc_decoder(struct cxl_port *port, struct cxl_region *cxlr,
>  		return -EBUSY;
>  	}
>  
> +	/*
> +	 * Endpoints should already match the region type, but backstop that
> +	 * assumption with an assertion. Switch-decoders change mapping-type
> +	 * based on what is mapped when they are assigned to a region.
> +	 */
> +	dev_WARN_ONCE(&cxlr->dev,
> +		      port == cxled_to_port(cxled) &&
> +			      cxld->target_type != cxlr->type,
> +		      "%s:%s mismatch decoder type %d -> %d\n",
> +		      dev_name(&cxled_to_memdev(cxled)->dev),
> +		      dev_name(&cxld->dev), cxld->target_type, cxlr->type);
> +	cxld->target_type = cxlr->type;
>  	cxl_rr->decoder = cxld;
>  	return 0;
>  }
> 


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 08/19] cxl/port: Enumerate flit mode capability
  2023-06-04 23:32 ` [PATCH 08/19] cxl/port: Enumerate flit mode capability Dan Williams
@ 2023-06-06 13:04   ` Jonathan Cameron
  2023-06-14  1:06     ` Dan Williams
  0 siblings, 1 reply; 64+ messages in thread
From: Jonathan Cameron @ 2023-06-06 13:04 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, ira.weiny, navneet.singh

On Sun, 04 Jun 2023 16:32:21 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> Per CXL 3.0 Section 9.14 Back-Invalidation Configuration, in order to
> enable an HDM-DB range (CXL.mem region with device initiated
> back-invalidation support), all ports in the path between the endpoint and
> the host bridge must be in 256-bit flit-mode.
> 
> Even for typical Type-3 class devices it is useful to enumerate link
> capabilities through the chain for debug purposes.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

A few minor comments. In particular, the field you have in here doesn't
distinguish between 256-byte flits and otherwise.  That's done with the PCIe spec
field, not this one, which is about latency optimization.

> ---
>  drivers/cxl/core/hdm.c  |    2 +
>  drivers/cxl/core/pci.c  |   84 +++++++++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/core/port.c |    6 +++
>  drivers/cxl/cxl.h       |    2 +
>  drivers/cxl/cxlpci.h    |   25 +++++++++++++-
>  drivers/cxl/port.c      |    5 +++
>  6 files changed, 122 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> index ca3b99c6eacf..91ab3033c781 100644
> --- a/drivers/cxl/core/hdm.c
> +++ b/drivers/cxl/core/hdm.c
> @@ -3,8 +3,10 @@
>  #include <linux/seq_file.h>
>  #include <linux/device.h>
>  #include <linux/delay.h>
> +#include <linux/pci.h>
>  
>  #include "cxlmem.h"
> +#include "cxlpci.h"
>  #include "core.h"
I'm not following why a link-related patch should change includes in an hdm-related C file?
Maybe this makes sense later once you use it?


>  
>  /**
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index 67f4ab6daa34..b62ec17ccdde 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -519,6 +519,90 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,

> +
> +int cxl_probe_link(struct cxl_port *port)
> +{
> +	struct pci_dev *pdev = cxl_port_to_pci(port);
> +	u16 cap, en, parent_features;
> +	struct cxl_port *parent_port;
> +	struct device *dev;
> +	int rc, dvsec;
> +	u32 hdr;
> +
> +	if (!pdev) {
> +		/*
> +		 * Assume host bridges support all features, the root
> +		 * port will dictate the actual enabled set to endpoints.
> +		 */
> +		return 0;
> +	}
> +
> +	dev = &pdev->dev;
> +	dvsec = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
> +					  CXL_DVSEC_FLEXBUS_PORT);
> +	if (!dvsec) {
> +		dev_err(dev, "Failed to enumerate port capabilities\n");
> +		return -ENXIO;
> +	}
> +
> +	/*
> +	 * Cache the link features for future determination of HDM-D or
> +	 * HDM-DB support
> +	 */
> +	rc = pci_read_config_dword(pdev, dvsec + PCI_DVSEC_HEADER1, &hdr);
> +	if (rc)
> +		return rc;
> +
> +	rc = pci_read_config_word(pdev, dvsec + CXL_DVSEC_FLEXBUS_CAP_OFFSET,
> +				  &cap);
> +	if (rc)
> +		return rc;
> +
> +	rc = pci_read_config_word(pdev, dvsec + CXL_DVSEC_FLEXBUS_STATUS_OFFSET,
> +				  &en);
> +	if (rc)
> +		return rc;
> +
> +	if (PCI_DVSEC_HEADER1_REV(hdr) < 2)
> +		cap &= ~CXL_DVSEC_FLEXBUS_REV2_MASK;
> +
> +	if (PCI_DVSEC_HEADER1_REV(hdr) < 1)
> +		cap &= ~CXL_DVSEC_FLEXBUS_REV1_MASK;

I talk about this below, but I'd not normally expect to see this.
Anyone who used those bits outside the usage defined by later specs has buggy
hardware and should quirk it rather than having it built in here.

> +
> +	en &= cap;
> +	parent_port = to_cxl_port(port->dev.parent);
> +	parent_features = parent_port->features;
> +
> +	/* Enforce port features are plumbed through to the host bridge */
> +	port->features = en & CXL_DVSEC_FLEXBUS_ENABLE_MASK & parent_features;
> +
> +	dev_dbg(dev, "features:%s%s%s%s%s%s%s\n",
> +		en & CXL_DVSEC_FLEXBUS_CACHE_ENABLED ? " cache" : "",
> +		en & CXL_DVSEC_FLEXBUS_IO_ENABLED ? " io" : "",
> +		en & CXL_DVSEC_FLEXBUS_MEM_ENABLED ? " mem" : "",
> +		en & CXL_DVSEC_FLEXBUS_FLIT68_ENABLED ? " flit68" : "",
> +		en & CXL_DVSEC_FLEXBUS_MLD_ENABLED ? " mld" : "",
> +		en & CXL_DVSEC_FLEXBUS_FLIT256_ENABLED ? " flit256" : "",

Definitely want that text to be more explicit about latency optimized

> +		en & CXL_DVSEC_FLEXBUS_PBR_ENABLED ? " pbr" : "");
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_probe_link, CXL);
> +
>  #define CXL_DOE_TABLE_ACCESS_REQ_CODE		0x000000ff
>  #define   CXL_DOE_TABLE_ACCESS_REQ_CODE_READ	0
>  #define CXL_DOE_TABLE_ACCESS_TABLE_TYPE		0x0000ff00


> diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> index 7c02e55b8042..7f82ffb5b4be 100644
> --- a/drivers/cxl/cxlpci.h
> +++ b/drivers/cxl/cxlpci.h
> @@ -45,8 +45,28 @@
>  /* CXL 2.0 8.1.7: GPF DVSEC for CXL Device */
>  #define CXL_DVSEC_DEVICE_GPF					5
>  
> -/* CXL 2.0 8.1.8: PCIe DVSEC for Flex Bus Port */
> -#define CXL_DVSEC_PCIE_FLEXBUS_PORT				7
> +/* CXL 3.0 8.2.1.3: PCIe DVSEC for Flex Bus Port */
> +#define CXL_DVSEC_FLEXBUS_PORT					7
> +#define   CXL_DVSEC_FLEXBUS_CAP_OFFSET		0xA
> +#define     CXL_DVSEC_FLEXBUS_CACHE_CAPABLE	BIT(0)
> +#define     CXL_DVSEC_FLEXBUS_IO_CAPABLE	BIT(1)
> +#define     CXL_DVSEC_FLEXBUS_MEM_CAPABLE	BIT(2)
> +#define     CXL_DVSEC_FLEXBUS_FLIT68_CAPABLE	BIT(5)

This one includes the stuff that makes it 2.0 rather than 1.1.  Might need a longer
name to avoid misuse? (I checked the 1.1 spec; this bit is reserved there, so it would be 0.)

> +#define     CXL_DVSEC_FLEXBUS_MLD_CAPABLE	BIT(6)
> +#define     CXL_DVSEC_FLEXBUS_REV1_MASK		GENMASK(6, 5)

Unusual approach.  Shouldn't be needed, as those bits were RsvdP so
no one should have set them, and now that we are supporting the new bits
we should be good without masking.

> +#define     CXL_DVSEC_FLEXBUS_FLIT256_CAPABLE	BIT(13)

Not just flit256, but the latency-optimized one (split into two halves, each
with a separate CRC).  So this name needs to be something like
FLEXBUS_LAT_OPT_FLIT256_CAPABLE


> +#define     CXL_DVSEC_FLEXBUS_PBR_CAPABLE	BIT(14)
> +#define     CXL_DVSEC_FLEXBUS_REV2_MASK		GENMASK(14, 13)
> +#define   CXL_DVSEC_FLEXBUS_STATUS_OFFSET	0xE
> +#define     CXL_DVSEC_FLEXBUS_CACHE_ENABLED	BIT(0)
> +#define     CXL_DVSEC_FLEXBUS_IO_ENABLED	BIT(1)
> +#define     CXL_DVSEC_FLEXBUS_MEM_ENABLED	BIT(2)
> +#define     CXL_DVSEC_FLEXBUS_FLIT68_ENABLED	BIT(5)

Again, not just FLIT68, but the VH stuff from CXL 2.0 as well.

> +#define     CXL_DVSEC_FLEXBUS_MLD_ENABLED	BIT(6)
> +#define     CXL_DVSEC_FLEXBUS_FLIT256_ENABLED	BIT(13)
Also, latency-optimized is the key here, not plain 256-byte flit mode (though
you need that as well).
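Something like the following (untested sketch of the rename; bit positions
taken from the quoted patch) would make that explicit:

#define     CXL_DVSEC_FLEXBUS_LAT_OPT_FLIT256_CAPABLE	BIT(13)
#define     CXL_DVSEC_FLEXBUS_LAT_OPT_FLIT256_ENABLED	BIT(13)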

> +#define     CXL_DVSEC_FLEXBUS_PBR_ENABLED	BIT(14)
> +#define     CXL_DVSEC_FLEXBUS_ENABLE_MASK \
> +	(GENMASK(2, 0) | GENMASK(6, 5) | GENMASK(14, 13))
Ok - I guess the RsvdP requires this dance.
>  


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/19] cxl/memdev: Formalize endpoint port linkage
  2023-06-04 23:32 ` [PATCH 09/19] cxl/memdev: Formalize endpoint port linkage Dan Williams
@ 2023-06-06 13:26   ` Jonathan Cameron
       [not found]   ` <CGME20230607164756uscas1p2fb025e7f4de5094925cc25fc2ac45212@uscas1p2.samsung.com>
  2023-06-13 22:59   ` Dave Jiang
  2 siblings, 0 replies; 64+ messages in thread
From: Jonathan Cameron @ 2023-06-06 13:26 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, ira.weiny, navneet.singh

On Sun, 04 Jun 2023 16:32:27 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> Move the endpoint port that the cxl_mem driver establishes from drvdata
> to a first class attribute. This is in preparation for device-memory
> drivers reusing the CXL core for memory region management. Those drivers
> need a type-safe method to retrieve their CXL port linkage. Leave
> drvdata for private usage of the cxl_mem driver not external consumers
> of a 'struct cxl_memdev' object.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Good. I never liked this being 'hidden' and un-typed.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 10/19] cxl/memdev: Indicate probe deferral
  2023-06-04 23:32 ` [PATCH 10/19] cxl/memdev: Indicate probe deferral Dan Williams
@ 2023-06-06 13:54   ` Jonathan Cameron
  0 siblings, 0 replies; 64+ messages in thread
From: Jonathan Cameron @ 2023-06-06 13:54 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, ira.weiny, navneet.singh

On Sun, 04 Jun 2023 16:32:32 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> The first stop for a CXL accelerator driver that wants to establish new
> CXL.mem regions is to register a 'struct cxl_memdev'. That kicks off
> cxl_mem_probe() to enumerate all 'struct cxl_port' instances in the
> topology up to the root.
> 
> If the root driver has not attached yet the expectation is that the
> driver waits until that link is established. The common cxl_pci driver
> has reason to keep the 'struct cxl_memdev' device attached to the bus
> until the root driver attaches. An accelerator may want to instead defer
> probing until CXL resources can be acquired.
> 
> Use the @endpoint attribute of a 'struct cxl_memdev' to convey when
> accelerator driver probing should be deferred vs failed. Provide that
> indication via a new cxl_acquire_endpoint() API that can retrieve the
> probe status of the memdev.
> 
> The first consumer of this API is a test driver that exercises the CXL
> Type-2 flow.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/core/memdev.c |   41 +++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/core/port.c   |    2 +-
>  drivers/cxl/cxlmem.h      |    2 ++
>  drivers/cxl/mem.c         |    7 +++++--
>  4 files changed, 49 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 65a685e5616f..859c43c340bb 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -609,6 +609,47 @@ struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds)
>  }
>  EXPORT_SYMBOL_NS_GPL(devm_cxl_add_memdev, CXL);
>  
> +/*
> + * Try to get a locked reference on a memdev's CXL port topology
> + * connection. Be careful to observe when cxl_mem_probe() has deposited
> + * a probe deferral awaiting the arrival of the CXL root driver
> + */
> +struct cxl_port *cxl_acquire_endpoint(struct cxl_memdev *cxlmd)
> +{
> +	struct cxl_port *endpoint;
> +	int rc = -ENXIO;
> +
> +	device_lock(&cxlmd->dev);
> +	endpoint = cxlmd->endpoint;
> +	if (!endpoint)
> +		goto err;
> +
> +	if (IS_ERR(endpoint)) {
> +		rc = PTR_ERR(endpoint);

Whilst the above comment talks about deferred, there are paths where it's
potentially another error pointer currently.  Maybe suppress those at
source so the comment is 'precise' or make the comment above incorporate
those other error pointers.

> +		goto err;
> +	}
> +
> +	device_lock(&endpoint->dev);
> +	if (!endpoint->dev.driver)
> +		goto err_endpoint;
> +
> +	return endpoint;
> +
> +err_endpoint:
> +	device_unlock(&endpoint->dev);
> +err:
> +	device_unlock(&cxlmd->dev);
> +	return ERR_PTR(rc);
> +}
> +EXPORT_SYMBOL_NS(cxl_acquire_endpoint, CXL);
> +
> +void cxl_release_endpoint(struct cxl_memdev *cxlmd, struct cxl_port *endpoint)
> +{
> +	device_unlock(&endpoint->dev);
> +	device_unlock(&cxlmd->dev);
> +}
> +EXPORT_SYMBOL_NS(cxl_release_endpoint, CXL);
> +
>  __init int cxl_memdev_init(void)
>  {
>  	dev_t devt;
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 6720ab22a494..5e21b53362e6 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -1336,7 +1336,7 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
>  		 */
>  		dev_dbg(&cxlmd->dev, "%s is a root dport\n",
>  			dev_name(dport_dev));
> -		return -ENXIO;
> +		return -EPROBE_DEFER;
>  	}
>  
>  	parent_port = find_cxl_port(dparent, &parent_dport);
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 7ee78e79933c..e3bcd6d12a1c 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -83,6 +83,8 @@ static inline bool is_cxl_endpoint(struct cxl_port *port)
>  	return is_cxl_memdev(port->uport);
>  }
>  
> +struct cxl_port *cxl_acquire_endpoint(struct cxl_memdev *cxlmd);
> +void cxl_release_endpoint(struct cxl_memdev *cxlmd, struct cxl_port *endpoint);
>  struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds);
>  int devm_cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
>  			 resource_size_t base, resource_size_t len,
> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
> index 584f9eec57e4..2470c6f2621c 100644
> --- a/drivers/cxl/mem.c
> +++ b/drivers/cxl/mem.c
> @@ -154,13 +154,16 @@ static int cxl_mem_probe(struct device *dev)
>  		return rc;
>  
>  	rc = devm_cxl_enumerate_ports(cxlmd);
> -	if (rc)
> +	if (rc) {
> +		cxlmd->endpoint = ERR_PTR(rc);

This can be errors other than defer, which seems inconsistent
with the above check.

>  		return rc;
> +	}
>  
>  	parent_port = cxl_mem_find_port(cxlmd, &dport);
>  	if (!parent_port) {
>  		dev_err(dev, "CXL port topology not found\n");
> -		return -ENXIO;
> +		cxlmd->endpoint = ERR_PTR(-EPROBE_DEFER);
> +		return -EPROBE_DEFER;

This is a little inelegant as we aren't setting this in the same
function as where the endpoint is otherwise set up.
Partly that's a naming thing.  I wonder if we can pull out the bit
related to the endpoint from cxl_mem_probe() to a new
cxl_mem_ep_probe(), in which case this will feel more logical.
That would be fine for this case, but the one above doesn't
really work for that...

Perhaps this is the best that can be done.


>  	}
>  
>  	if (dport->rch)
> 
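To make the intended use concrete, an accelerator consumer of this API would
look roughly like the following (a sketch; my_accel_setup() is an invented
name for illustration):

	static int my_accel_setup(struct cxl_memdev *cxlmd)
	{
		struct cxl_port *endpoint;

		endpoint = cxl_acquire_endpoint(cxlmd);
		if (IS_ERR(endpoint))
			/* may be -EPROBE_DEFER until the root driver arrives */
			return PTR_ERR(endpoint);

		/* ... allocate DPA / assemble regions with the topology pinned ... */

		cxl_release_endpoint(cxlmd, endpoint);
		return 0;
	}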


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 11/19] cxl/region: Factor out construct_region_{begin, end} and drop_region() for reuse
  2023-06-04 23:32 ` [PATCH 11/19] cxl/region: Factor out construct_region_{begin, end} and drop_region() for reuse Dan Williams
@ 2023-06-06 14:29   ` Jonathan Cameron
  2023-06-13 23:29   ` Dave Jiang
  1 sibling, 0 replies; 64+ messages in thread
From: Jonathan Cameron @ 2023-06-06 14:29 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, ira.weiny, navneet.singh

On Sun, 04 Jun 2023 16:32:38 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> In preparation for constructing regions from newly allocated HPA, factor
> out some helpers that can be shared with the existing kernel-internal
> region construction from BIOS pre-allocated regions. Handle acquiring a
> new region object under the region rwsem, and optionally tearing it down
> if the region assembly process fails.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Nasty diff to read! :)  When will someone add some AI magic to git diff so
we get easier code to review?

I 'think' it ends up fine. Comments are about the usual balance
of keeping open-coded side-effect removal in error handlers (making it
easier to review) vs a small amount of code duplication.

Jonathan

> ---
>  drivers/cxl/core/region.c |   73 ++++++++++++++++++++++++++++++++-------------
>  1 file changed, 52 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index c7170d92f47f..bd3c3d4b2683 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -2191,19 +2191,25 @@ cxl_find_region_by_name(struct cxl_root_decoder *cxlrd, const char *name)
>  	return to_cxl_region(region_dev);
>  }
>  
> +static void drop_region(struct cxl_region *cxlr)
> +{
> +	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
> +	struct cxl_port *port = cxlrd_to_port(cxlrd);
> +
> +	devm_release_action(port->uport, unregister_region, cxlr);
> +}
> +
>  static ssize_t delete_region_store(struct device *dev,
>  				   struct device_attribute *attr,
>  				   const char *buf, size_t len)
>  {
>  	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(dev);
> -	struct cxl_port *port = to_cxl_port(dev->parent);
>  	struct cxl_region *cxlr;
>  
>  	cxlr = cxl_find_region_by_name(cxlrd, buf);
>  	if (IS_ERR(cxlr))
>  		return PTR_ERR(cxlr);
> -
> -	devm_release_action(port->uport, unregister_region, cxlr);
> +	drop_region(cxlr);
>  	put_device(&cxlr->dev);
>  
>  	return len;
> @@ -2664,17 +2670,19 @@ static int match_region_by_range(struct device *dev, void *data)
>  	return rc;
>  }
>  
> -/* Establish an empty region covering the given HPA range */
> -static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
> -					   struct cxl_endpoint_decoder *cxled)
> +static void construct_region_end(void)
> +{
> +	up_write(&cxl_region_rwsem);
> +}
> +
> +static struct cxl_region *
> +construct_region_begin(struct cxl_root_decoder *cxlrd,
> +		       struct cxl_endpoint_decoder *cxled)
>  {
>  	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> -	struct cxl_port *port = cxlrd_to_port(cxlrd);
> -	struct range *hpa = &cxled->cxld.hpa_range;
>  	struct cxl_region_params *p;
>  	struct cxl_region *cxlr;
> -	struct resource *res;
> -	int rc;
> +	int err = 0;
>  
>  	do {
>  		cxlr = __create_region(cxlrd, cxled->mode,
> @@ -2693,19 +2701,41 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>  	p = &cxlr->params;
>  	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
>  		dev_err(cxlmd->dev.parent,
> -			"%s:%s: %s autodiscovery interrupted\n",
> +			"%s:%s: %s region setup interrupted\n",
>  			dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
>  			__func__);
> -		rc = -EBUSY;
> -		goto err;
> +		err = -EBUSY;
> +	}
> +
> +	if (err) {
> +		construct_region_end();

Unless you are going to add more in construct_region_end()
later I'd be tempted to just have the up_write() here as it
can clearly match against the down_write() earlier in the function.

> +		drop_region(cxlr);

Hmm. I'm in two minds about this vs open-coding it for readability.
Given drop_region() matches with construct_region(), I'd slightly prefer this
open-coded as well so I can do nice easy matches on what it's unwinding.

> +		return ERR_PTR(err);
>  	}
> +	return cxlr;
> +}
> +
> +/* Establish an empty region covering the given HPA range */
> +static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
> +					   struct cxl_endpoint_decoder *cxled)
> +{
> +	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> +	struct range *hpa = &cxled->cxld.hpa_range;
> +	struct cxl_region_params *p;
> +	struct cxl_region *cxlr;
> +	struct resource *res;
> +	int rc;
> +
> +	cxlr = construct_region_begin(cxlrd, cxled);
> +	if (IS_ERR(cxlr))
> +		return cxlr;
>  
>  	set_bit(CXL_REGION_F_AUTO, &cxlr->flags);
>  
>  	res = kmalloc(sizeof(*res), GFP_KERNEL);
>  	if (!res) {
>  		rc = -ENOMEM;
> -		goto err;
> +		goto out;
>  	}
>  
>  	*res = DEFINE_RES_MEM_NAMED(hpa->start, range_len(hpa),
> @@ -2722,6 +2752,7 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>  			 __func__, dev_name(&cxlr->dev));
>  	}
>  
> +	p = &cxlr->params;
>  	p->res = res;
>  	p->interleave_ways = cxled->cxld.interleave_ways;
>  	p->interleave_granularity = cxled->cxld.interleave_granularity;
> @@ -2729,7 +2760,7 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>  
>  	rc = sysfs_update_group(&cxlr->dev.kobj, get_cxl_region_target_group());
>  	if (rc)
> -		goto err;
> +		goto out;
>  
>  	dev_dbg(cxlmd->dev.parent, "%s:%s: %s %s res: %pr iw: %d ig: %d\n",
>  		dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev), __func__,
> @@ -2738,14 +2769,14 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>  
>  	/* ...to match put_device() in cxl_add_to_region() */
>  	get_device(&cxlr->dev);
> -	up_write(&cxl_region_rwsem);
>  
> +out:
> +	construct_region_end();
> +	if (rc) {
> +		drop_region(cxlr);
> +		return ERR_PTR(rc);
> +	}
>  	return cxlr;
> -
> -err:
> -	up_write(&cxl_region_rwsem);
> -	devm_release_action(port->uport, unregister_region, cxlr);
> -	return ERR_PTR(rc);
>  }
>  
>  int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
> 


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 12/19] cxl/region: Factor out interleave ways setup
  2023-06-04 23:32 ` [PATCH 12/19] cxl/region: Factor out interleave ways setup Dan Williams
@ 2023-06-06 14:31   ` Jonathan Cameron
  2023-06-13 23:30   ` Dave Jiang
  1 sibling, 0 replies; 64+ messages in thread
From: Jonathan Cameron @ 2023-06-06 14:31 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, ira.weiny, navneet.singh

On Sun, 04 Jun 2023 16:32:44 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> In preparation for kernel driven region creation, factor out a common
> helper from the user-sysfs region setup for interleave_ways.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 13/19] cxl/region: Factor out interleave granularity setup
  2023-06-04 23:32 ` [PATCH 13/19] cxl/region: Factor out interleave granularity setup Dan Williams
@ 2023-06-06 14:33   ` Jonathan Cameron
  2023-06-13 23:42   ` Dave Jiang
  1 sibling, 0 replies; 64+ messages in thread
From: Jonathan Cameron @ 2023-06-06 14:33 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, ira.weiny, navneet.singh

On Sun, 04 Jun 2023 16:32:50 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> In preparation for kernel driven region creation, factor out a common
> helper from the user-sysfs region setup for interleave_granularity.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

One trivial thing inline.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> @@ -444,16 +437,30 @@ static ssize_t interleave_granularity_store(struct device *dev,
>  	if (cxld->interleave_ways > 1 && val != cxld->interleave_granularity)
>  		return -EINVAL;
>  
> +	lockdep_assert_held_write(&cxl_region_rwsem);
> +	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE)
> +		return -EBUSY;
> +
> +	p->interleave_granularity = val;

Trivial: Blank line here would be a tiny bit nicer to read.

> +	return 0;
> +}
> +
>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 14/19] cxl/region: Clarify locking requirements of cxl_region_attach()
  2023-06-04 23:32 ` [PATCH 14/19] cxl/region: Clarify locking requirements of cxl_region_attach() Dan Williams
@ 2023-06-06 14:35   ` Jonathan Cameron
  2023-06-13 23:45   ` Dave Jiang
  1 sibling, 0 replies; 64+ messages in thread
From: Jonathan Cameron @ 2023-06-06 14:35 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, ira.weiny, navneet.singh

On Sun, 04 Jun 2023 16:32:56 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> In preparation for cxl_region_attach() being called for kernel initiated
> region creation, enforce the locking context with explicit lockdep
> assertions.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 15/19] cxl/region: Specify host-only vs device memory at region creation time
  2023-06-04 23:33 ` [PATCH 15/19] cxl/region: Specify host-only vs device memory at region creation time Dan Williams
@ 2023-06-06 14:42   ` Jonathan Cameron
  0 siblings, 0 replies; 64+ messages in thread
From: Jonathan Cameron @ 2023-06-06 14:42 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, ira.weiny, navneet.singh

On Sun, 04 Jun 2023 16:33:01 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> In preparation for supporting device-memory (HDM-D[B]) region creation,
> convey the endpoint-decoder target type to devm_cxl_add_region().
> 
> Note that none of the existing sysfs ABIs allow for HDM-D[B] region
> creation. The expectation is that HDM-D[B] region creation requires a
> kernel-internal region creation flow, for example, driven by an
> accelerator driver.

There are potential advantages in using CXL Type-3 HDM-DB devices with
a normal, userspace-driven region creation flow.  If there is
any potential for UIO P2P transactions targeting them later (which
we won't necessarily know at setup time) then we'll need to use HDM-DB.
Targeting them for UIO will require some accelerator-specific
stuff at the source of the transactions, but we won't want to offline
the memory on a Type-3 device to convert it over to HDM-H
(assuming that's required...)

+ cache-coherent sharing - though I guess the region setup code may
be quite different for that anyway :)  I've not dived into
Navneet's set yet to see how he did it.

Code itself is fine - just the comment potentially being misleading or
overly restrictive.

Jonathan

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 16/19] cxl/hdm: Define a driver interface for DPA allocation
  2023-06-04 23:33 ` [PATCH 16/19] cxl/hdm: Define a driver interface for DPA allocation Dan Williams
@ 2023-06-06 14:58   ` Jonathan Cameron
  2023-06-13 23:53   ` Dave Jiang
  1 sibling, 0 replies; 64+ messages in thread
From: Jonathan Cameron @ 2023-06-06 14:58 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, ira.weiny, navneet.singh

On Sun, 04 Jun 2023 16:33:07 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> Region creation involves finding available DPA (device-physical-address)
> capacity to map into HPA (host-physical-address) space. Given the HPA
> capacity constraint, define an API, cxl_request_dpa(), that has the
> flexibility to map the minimum amount of memory the driver needs to
> operate vs the total possible that can be mapped given HPA availability.

Maybe give an example of when a device might want to do this? I guess
it's the equivalent of resizable BARs?
> 
> Factor out the core of cxl_dpa_alloc(), that does free space scanning,
> into a cxl_dpa_freespace() helper, and use that to balance the capacity
> available to map vs the @min and @max arguments to cxl_request_dpa().
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/core/hdm.c |  140 +++++++++++++++++++++++++++++++++++++++++-------
>  drivers/cxl/cxl.h      |    6 ++
>  drivers/cxl/cxlmem.h   |    4 +
>  3 files changed, 131 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> index 91ab3033c781..514d30131d92 100644
> --- a/drivers/cxl/core/hdm.c
> +++ b/drivers/cxl/core/hdm.c
> @@ -464,30 +464,17 @@ int cxl_dpa_set_mode(struct cxl_endpoint_decoder *cxled,
>  	return rc;
>  }
>  
...

> +static int find_free_decoder(struct device *dev, void *data)
> +{
> +	struct cxl_endpoint_decoder *cxled;
> +	struct cxl_port *port;
> +
> +	if (!is_endpoint_decoder(dev))
> +		return 0;
> +
> +	cxled = to_cxl_endpoint_decoder(dev);
> +	port = cxled_to_port(cxled);
> +
> +	if (cxled->cxld.id != port->hdm_end + 1)
> +		return 0;
> +	return 1;
> +}
> +
> +/**
> + * cxl_request_dpa - search and reserve DPA given input constraints
> + * @endpoint: an endpoint port with available decoders

You can call it on ones without any, given you check.  Whilst not useful
to do so, I'm not sure you want to say they are available here as that
feels like a constraint that isn't true.
"endpoint with decoders" or "endpoint with potentially available decoders"


> + * @mode: DPA operation mode (ram vs pmem)
> + * @min: the minimum amount of capacity the call needs
> + * @max: extra capacity to allocate after min is satisfied

This sounds like another chunk on top. So allocate min + max?
That's an odd thing for 'max'.  
"max: allocate up to this capacity if available - @min must be allocated"

> + *
> + * Given that a region needs to allocate from limited HPA capacity it
> + * may be the case that a device has more mappable DPA capacity than
> + * available HPA. So, the expectation is that @min is a driver known
> + * value for how much capacity is needed, and @max is based the limit of
> + * how much HPA space is available for a new region.
> + *
> + * Returns a pinned cxl_decoder with at least @min bytes of capacity
> + * reserved, or an error pointer. The caller is also expected to own the
> + * lifetime of the memdev registration associated with the endpoint to
> + * pin the decoder registered as well.
> + */
> +struct cxl_endpoint_decoder *cxl_request_dpa(struct cxl_port *endpoint,
> +					     enum cxl_decoder_mode mode,
> +					     resource_size_t min,
> +					     resource_size_t max)
> +{
> +	struct cxl_endpoint_decoder *cxled;
> +	struct device *cxled_dev;
> +	resource_size_t alloc;
> +	int rc;
> +
> +	if (!IS_ALIGNED(min | max, SZ_256M))
> +		return ERR_PTR(-EINVAL);
> +
> +	down_read(&cxl_dpa_rwsem);
> +	cxled_dev = device_find_child(&endpoint->dev, NULL, find_free_decoder);
> +	if (!cxled_dev)
> +		cxled = ERR_PTR(-ENXIO);
> +	else
> +		cxled = to_cxl_endpoint_decoder(cxled_dev);

It can in theory (based on what's in the function at least) return NULL.

> +	up_read(&cxl_dpa_rwsem);
> +
> +	if (IS_ERR(cxled))
> +		return cxled;

Perhaps cleaner as the following. I'm not spotting lifetime concerns from the narrower locking
but I might be missing something.

	down_read(&cxl_dpa_rwsem);
	cxled_dev = device_find_child(&endpoint->dev, NULL, find_free_decoder);
	up_read(&cxl_dpa_rwsem);

	if (!cxled_dev)
		return ERR_PTR(-ENXIO);

	cxled = to_cxl_endpoint_decoder(cxled_dev);
	if (!cxled) {
		put_device(cxled_dev);
		return ERR_PTR(-ENXIO);
	}

> +
> +	rc = cxl_dpa_set_mode(cxled, mode);
> +	if (rc)
> +		goto err;
> +
> +	down_read(&cxl_dpa_rwsem);
> +	alloc = cxl_dpa_freespace(cxled, NULL, NULL);
> +	up_read(&cxl_dpa_rwsem);
> +
> +	if (max)
> +		alloc = min(max, alloc);

Why allow an optional max? Just have it the same as min if
no difference is allowed.  Docs don't mention special
values - though they are a bit confusing as noted above.

> +	if (alloc < min) {
> +		rc = -ENOMEM;
> +		goto err;
> +	}
> +
> +	rc = cxl_dpa_alloc(cxled, alloc);
> +	if (rc)
> +		goto err;
> +
> +	return cxled;
> +err:
> +	put_device(cxled_dev);
> +	return ERR_PTR(rc);
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_request_dpa, CXL);
> +
>  static void cxld_set_interleave(struct cxl_decoder *cxld, u32 *ctrl)
>  {
>  	u16 eig;
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 258c90727dd2..55808697773f 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -680,6 +680,12 @@ struct cxl_decoder *to_cxl_decoder(struct device *dev);
>  struct cxl_root_decoder *to_cxl_root_decoder(struct device *dev);
>  struct cxl_switch_decoder *to_cxl_switch_decoder(struct device *dev);
>  struct cxl_endpoint_decoder *to_cxl_endpoint_decoder(struct device *dev);
> +
> +static inline struct device *cxled_dev(struct cxl_endpoint_decoder *cxled)
> +{
> +	return &cxled->cxld.dev;
> +}
> +
For one current user, I'd not bother.  If there are others in tree (there are ;)
then do this as a precursor cleanup patch.
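
For reference, a caller following the kernel-doc above would look roughly
like this (a sketch; the size values are illustrative):

	/* need at least 256MB of ram-mode DPA, take up to what HPA allows */
	cxled = cxl_request_dpa(endpoint, CXL_DECODER_RAM, SZ_256M, avail_hpa);
	if (IS_ERR(cxled))
		return PTR_ERR(cxled);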



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 17/19] cxl/region: Define a driver interface for HPA free space enumeration
  2023-06-04 23:33 ` [PATCH 17/19] cxl/region: Define a driver interface for HPA free space enumeration Dan Williams
@ 2023-06-06 15:23   ` Jonathan Cameron
  2023-06-14  0:15   ` Dave Jiang
  1 sibling, 0 replies; 64+ messages in thread
From: Jonathan Cameron @ 2023-06-06 15:23 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, ira.weiny, navneet.singh

On Sun, 04 Jun 2023 16:33:12 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> CXL region creation involves allocating capacity from device DPA
> (device-physical-address space) and assigning it to decode a given HPA
> (host-physical-address space). Before determining how much DPA to
> allocate the amount of available HPA must be determined. Also, not all
> HPA is created equal, some specifically targets RAM, some target PMEM,
> some is prepared for the device-memory flows like HDM-D and HDM-DB, and
> some is host-only (HDM-H).
> 
> Wrap all of those concerns into an API that retrieves a root decoder
> (platform CXL window) that fits the specified constraints and the
> capacity available for a new region.

Interaction with QTG?  I'd guess we should be prioritizing CFMWS based
on that if available.  Fun question of how to balance suggested QTG vs
available HPA space.


Otherwise, one suggestion inline.

Jonathan

> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/core/region.c |  143 +++++++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxl.h         |    5 ++
>  drivers/cxl/cxlmem.h      |    5 ++
>  3 files changed, 153 insertions(+)
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 75c5de627868..a41756249f8d 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -575,6 +575,149 @@ static int free_hpa(struct cxl_region *cxlr)
>  	return 0;
>  }
>  
> +struct cxlrd_max_context {
> +	struct device * const *host_bridges;
> +	int interleave_ways;
> +	unsigned long flags;
> +	resource_size_t max_hpa;
> +	struct cxl_root_decoder *cxlrd;
> +};
> +
> +static int find_max_hpa(struct device *dev, void *data)
> +{
> +	struct cxlrd_max_context *ctx = data;
> +	struct cxl_switch_decoder *cxlsd;
> +	struct cxl_root_decoder *cxlrd;
> +	struct resource *res, *prev;
> +	struct cxl_decoder *cxld;
> +	resource_size_t max;
> +	unsigned int seq;
> +	int found;
> +
> +	if (!is_root_decoder(dev))
> +		return 0;
> +
> +	cxlrd = to_cxl_root_decoder(dev);
> +	cxld = &cxlrd->cxlsd.cxld;
> +	if ((cxld->flags & ctx->flags) != ctx->flags)
> +		return 0;
> +
> +	if (cxld->interleave_ways != ctx->interleave_ways)
> +		return 0;
> +
> +	cxlsd = &cxlrd->cxlsd;
> +	do {
> +		found = 0;
> +		seq = read_seqbegin(&cxlsd->target_lock);
> +		for (int i = 0; i < ctx->interleave_ways; i++)
> +			for (int j = 0; j < ctx->interleave_ways; j++)
> +				if (ctx->host_bridges[i] ==
> +				    cxlsd->target[j]->dport) {
> +					found++;
> +					break;
> +				}
> +	} while (read_seqretry(&cxlsd->target_lock, seq));
> +
> +	if (found != ctx->interleave_ways)
> +		return 0;
> +
> +	/*
> +	 * Walk the root decoder resource range relying on cxl_region_rwsem to
> +	 * preclude sibling arrival/departure and find the largest free space
> +	 * gap.
> +	 */
> +	lockdep_assert_held_read(&cxl_region_rwsem);
> +	max = 0;
> +	res = cxlrd->res->child;
> +	if (!res)
> +		max = resource_size(cxlrd->res);

		Maybe a jump from here to after the loop would be clearer,
		or factor the loop out as a utility function to find the
		max gap, which feels like it might be generally useful to
		have? (A sketch of such a helper follows below the quoted hunk.)

> +	else
> +		max = 0;
> +	for (prev = NULL; res; prev = res, res = res->sibling) {
> +		struct resource *next = res->sibling;
> +		resource_size_t free = 0;
> +
> +		if (!prev && res->start > cxlrd->res->start) {
> +			free = res->start - cxlrd->res->start;
> +			max = max(free, max);
> +		}
> +		if (prev && res->start > prev->end + 1) {
> +			free = res->start - prev->end - 1;
> +			max = max(free, max);
> +		}
> +		if (next && res->end + 1 < next->start) {
> +			free = next->start - res->end - 1;
> +			max = max(free, max);
> +		}
> +		if (!next && res->end + 1 < cxlrd->res->end + 1) {
> +			free = cxlrd->res->end - res->end;
> +			max = max(free, max);
> +		}
> +	}
> +
> +	if (max > ctx->max_hpa) {
> +		if (ctx->cxlrd)
> +			put_device(cxlrd_dev(ctx->cxlrd));
> +		get_device(cxlrd_dev(cxlrd));
> +		ctx->cxlrd = cxlrd;
> +		ctx->max_hpa = max;
> +		dev_dbg(cxlrd_dev(cxlrd), "found %pa bytes of free space\n", &max);
> +	}
> +
> +	return 0;
> +}
> +
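For what it's worth, the factored-out gap helper suggested above might look
like this (an untested sketch; assumes the resource core keeps the child list
sorted and that the caller holds the lock protecting it):

	static resource_size_t resource_max_free_gap(struct resource *parent)
	{
		resource_size_t max = 0, pos = parent->start;
		struct resource *res;

		for (res = parent->child; res; res = res->sibling) {
			if (res->start > pos)
				max = max_t(resource_size_t, max, res->start - pos);
			pos = res->end + 1;
		}
		/* trailing gap between the last child and the end of @parent */
		if (parent->end + 1 > pos)
			max = max_t(resource_size_t, max, parent->end + 1 - pos);

		return max;
	}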


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 18/19] cxl/region: Define a driver interface for region creation
  2023-06-04 23:33 ` [PATCH 18/19] cxl/region: Define a driver interface for region creation Dan Williams
@ 2023-06-06 15:31   ` Jonathan Cameron
  0 siblings, 0 replies; 64+ messages in thread
From: Jonathan Cameron @ 2023-06-06 15:31 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, ira.weiny, navneet.singh

On Sun, 04 Jun 2023 16:33:18 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> Scenarios like recreating persistent memory regions from label data and
> establishing new regions for CXL attached accelerators with local memory
> need a kernel internal facility to establish new regions.

Could probably make the label data one a userspace problem, but I agree
that it 'might' be done entirely in kernel.

> 
> Introduce cxl_create_region() that takes an array of endpoint decoders
> with reserved capacity and a root decoder object to establish a new
> region.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

> ---
>  drivers/cxl/core/region.c |  107 +++++++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxlmem.h      |    3 +
>  2 files changed, 110 insertions(+)
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index a41756249f8d..543c4499379e 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -2878,6 +2878,104 @@ construct_region_begin(struct cxl_root_decoder *cxlrd,
>  	return cxlr;
>  }
>  
> +static struct cxl_region *
> +__construct_new_region(struct cxl_root_decoder *cxlrd,
> +		       struct cxl_endpoint_decoder **cxled, int ways)
> +{
> +	struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
> +	struct cxl_region_params *p;
> +	resource_size_t size = 0;
> +	struct cxl_region *cxlr;
> +	int rc, i;
> +
> +	if (ways < 1)
> +		return ERR_PTR(-EINVAL);
> +
> +	cxlr = construct_region_begin(cxlrd, cxled[0]);
> +	if (IS_ERR(cxlr))
> +		return cxlr;
> +
> +	rc = set_interleave_ways(cxlr, ways);
> +	if (rc)
> +		goto out;
> +
> +	rc = set_interleave_granularity(cxlr, cxld->interleave_granularity);
> +	if (rc)
> +		goto out;
> +
> +	down_read(&cxl_dpa_rwsem);
> +	for (i = 0; i < ways; i++) {
> +		if (!cxled[i]->dpa_res)
> +			break;
> +		size += resource_size(cxled[i]->dpa_res);
> +	}
> +	up_read(&cxl_dpa_rwsem);
> +
> +	if (i < ways)
> +		goto out;
> +
> +	rc = alloc_hpa(cxlr, size);
> +	if (rc)
> +		goto out;
> +
> +	down_read(&cxl_dpa_rwsem);
> +	for (i = 0; i < ways; i++) {
> +		rc = cxl_region_attach(cxlr, cxled[i], i);
> +		if (rc)
> +			break;
> +	}
> +	up_read(&cxl_dpa_rwsem);
> +
> +	if (rc)
> +		goto out;
> +
> +	rc = cxl_region_decode_commit(cxlr);
> +	if (rc)
> +		goto out;
> +
> +	p = &cxlr->params;
> +	p->state = CXL_CONFIG_COMMIT;
> +out:
> +	construct_region_end();
> +	if (rc) {
> +		drop_region(cxlr);
> +		return ERR_PTR(rc);
> +	}
> +	return cxlr;
> +}
> +

>  /* Establish an empty region covering the given HPA range */
>  static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>  					   struct cxl_endpoint_decoder *cxled)
> @@ -3085,6 +3183,15 @@ static int cxl_region_probe(struct device *dev)
>  					p->res->start, p->res->end, cxlr,
>  					is_system_ram) > 0)
>  			return 0;
> +
> +		/*
> +		 * HDM-D[B] (device-memory) regions have accelerator
> +		 * specific usage, skip device-dax registration.
> +		 */

As before - I'm not yet convinced that is always the case for HDM-DB,
particularly given you support interleaving, which may never make sense
for accelerators.

> +		if (cxlr->type == CXL_DECODER_DEVMEM)
> +			return 0;
> +
> +		/* HDM-H routes to device-dax */
>  		return devm_cxl_add_dax_region(cxlr);


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 19/19] tools/testing/cxl: Emulate a CXL accelerator with local memory
  2023-06-04 23:33 ` [PATCH 19/19] tools/testing/cxl: Emulate a CXL accelerator with local memory Dan Williams
@ 2023-06-06 15:34   ` Jonathan Cameron
  2023-06-07 21:09   ` Vikram Sethi
  1 sibling, 0 replies; 64+ messages in thread
From: Jonathan Cameron @ 2023-06-06 15:34 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, ira.weiny, navneet.singh

On Sun, 04 Jun 2023 16:33:23 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> Mock-up a device that does not have a standard mailbox, i.e. a device
> that does not implement the CXL memory-device class code, but wants to
> map "device" memory (aka Type-2, aka HDM-D[B], aka accelerator memory).
> 
> For extra code coverage make this device an RCD to test region creation
> flows in the presence of an RCH topology (memory device modeled as a
> root-complex-integrated-endpoint RCIEP).
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Whilst this looks superficially fine, I haven't reviewed it in enough
depth to give a tag. Might get back to it at some point, but don't
wait on me!

Jonathan



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 06/19] cxl/hdm: Default CXL_DEVTYPE_DEVMEM decoders to CXL_DECODER_DEVMEM
  2023-06-05  1:14   ` kernel test robot
@ 2023-06-06 20:10     ` Dan Williams
  0 siblings, 0 replies; 64+ messages in thread
From: Dan Williams @ 2023-06-06 20:10 UTC (permalink / raw)
  To: kernel test robot, Dan Williams; +Cc: llvm, oe-kbuild-all

kernel test robot wrote:
> Hi Dan,
> 
> kernel test robot noticed the following build warnings:
> 
> [auto build test WARNING on 9561de3a55bed6bdd44a12820ba81ec416e705a7]
> 
> url:    https://github.com/intel-lab-lkp/linux/commits/Dan-Williams/cxl-regs-Clarify-when-a-struct-cxl_register_map-is-input-vs-output/20230605-073402
> base:   9561de3a55bed6bdd44a12820ba81ec416e705a7
> patch link:    https://lore.kernel.org/r/168592153054.1948938.12344684637653088842.stgit%40dwillia2-xfh.jf.intel.com
> patch subject: [PATCH 06/19] cxl/hdm: Default CXL_DEVTYPE_DEVMEM decoders to CXL_DECODER_DEVMEM

Will fold in this change:

diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index ca3b99c6eacf..be81fcbb4755 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -793,8 +793,8 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
 			    int *target_map, void __iomem *hdm, int which,
 			    u64 *dpa_base, struct cxl_endpoint_dvsec_info *info)
 {
+	struct cxl_endpoint_decoder *cxled = NULL;
 	u64 size, base, skip, dpa_size, lo, hi;
-	struct cxl_endpoint_decoder *cxled;
 	bool committed;
 	u32 remainder;
 	int i, rc;
@@ -827,6 +827,8 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
 		return -ENXIO;
 	}
 
+	if (info)
+		cxled = to_cxl_endpoint_decoder(&cxld->dev);
 	cxld->hpa_range = (struct range) {
 		.start = base,
 		.end = base + size - 1,
@@ -890,7 +892,7 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
 		port->id, cxld->id, cxld->hpa_range.start, cxld->hpa_range.end,
 		cxld->interleave_ways, cxld->interleave_granularity);
 
-	if (!info) {
+	if (!cxled) {
 		lo = readl(hdm + CXL_HDM_DECODER0_TL_LOW(which));
 		hi = readl(hdm + CXL_HDM_DECODER0_TL_HIGH(which));
 		target_list.value = (hi << 32) + lo;
@@ -913,7 +915,6 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
 	lo = readl(hdm + CXL_HDM_DECODER0_SKIP_LOW(which));
 	hi = readl(hdm + CXL_HDM_DECODER0_SKIP_HIGH(which));
 	skip = (hi << 32) + lo;
-	cxled = to_cxl_endpoint_decoder(&cxld->dev);
 	rc = devm_cxl_dpa_reserve(cxled, *dpa_base + skip, dpa_size, skip);
 	if (rc) {
 		dev_err(&port->dev,

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/19] cxl/memdev: Formalize endpoint port linkage
       [not found]   ` <CGME20230607164756uscas1p2fb025e7f4de5094925cc25fc2ac45212@uscas1p2.samsung.com>
@ 2023-06-07 16:47     ` Fan Ni
  0 siblings, 0 replies; 64+ messages in thread
From: Fan Ni @ 2023-06-07 16:47 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, ira.weiny, navneet.singh

On Sun, Jun 04, 2023 at 04:32:27PM -0700, Dan Williams wrote:
> Move the endpoint port that the cxl_mem driver establishes from drvdata
> to a first class attribute. This is in preparation for device-memory
> drivers reusing the CXL core for memory region management. Those drivers
> need a type-safe method to retrieve their CXL port linkage. Leave
> drvdata for private usage of the cxl_mem driver not external consumers
> of a 'struct cxl_memdev' object.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---

Reviewed-by: Fan Ni <fan.ni@samsung.com>

>  drivers/cxl/core/memdev.c |    4 ++--
>  drivers/cxl/core/pmem.c   |    2 +-
>  drivers/cxl/core/port.c   |    5 +++--
>  drivers/cxl/cxlmem.h      |    2 ++
>  4 files changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 3f2d54f30548..65a685e5616f 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -149,7 +149,7 @@ int cxl_trigger_poison_list(struct cxl_memdev *cxlmd)
>  	struct cxl_port *port;
>  	int rc;
>  
> -	port = dev_get_drvdata(&cxlmd->dev);
> +	port = cxlmd->endpoint;
>  	if (!port || !is_cxl_endpoint(port))
>  		return -EINVAL;
>  
> @@ -207,7 +207,7 @@ static struct cxl_region *cxl_dpa_to_region(struct cxl_memdev *cxlmd, u64 dpa)
>  	ctx = (struct cxl_dpa_to_region_context) {
>  		.dpa = dpa,
>  	};
> -	port = dev_get_drvdata(&cxlmd->dev);
> +	port = cxlmd->endpoint;
>  	if (port && is_cxl_endpoint(port) && port->commit_end != -1)
>  		device_for_each_child(&port->dev, &ctx, __cxl_dpa_to_region);
>  
> diff --git a/drivers/cxl/core/pmem.c b/drivers/cxl/core/pmem.c
> index f8c38d997252..fc94f5240327 100644
> --- a/drivers/cxl/core/pmem.c
> +++ b/drivers/cxl/core/pmem.c
> @@ -64,7 +64,7 @@ static int match_nvdimm_bridge(struct device *dev, void *data)
>  
>  struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_memdev *cxlmd)
>  {
> -	struct cxl_port *port = find_cxl_root(dev_get_drvdata(&cxlmd->dev));
> +	struct cxl_port *port = find_cxl_root(cxlmd->endpoint);
>  	struct device *dev;
>  
>  	if (!port)
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 71a7547a8d6f..6720ab22a494 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -1167,7 +1167,7 @@ static struct device *grandparent(struct device *dev)
>  static void delete_endpoint(void *data)
>  {
>  	struct cxl_memdev *cxlmd = data;
> -	struct cxl_port *endpoint = dev_get_drvdata(&cxlmd->dev);
> +	struct cxl_port *endpoint = cxlmd->endpoint;
>  	struct cxl_port *parent_port;
>  	struct device *parent;
>  
> @@ -1182,6 +1182,7 @@ static void delete_endpoint(void *data)
>  		devm_release_action(parent, cxl_unlink_uport, endpoint);
>  		devm_release_action(parent, unregister_port, endpoint);
>  	}
> +	cxlmd->endpoint = NULL;
>  	device_unlock(parent);
>  	put_device(parent);
>  out:
> @@ -1193,7 +1194,7 @@ int cxl_endpoint_autoremove(struct cxl_memdev *cxlmd, struct cxl_port *endpoint)
>  	struct device *dev = &cxlmd->dev;
>  
>  	get_device(&endpoint->dev);
> -	dev_set_drvdata(dev, endpoint);
> +	cxlmd->endpoint = endpoint;
>  	cxlmd->depth = endpoint->depth;
>  	return devm_add_action_or_reset(dev, delete_endpoint, cxlmd);
>  }
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index b8bdf7490d2c..7ee78e79933c 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -38,6 +38,7 @@
>   * @detach_work: active memdev lost a port in its ancestry
>   * @cxl_nvb: coordinate removal of @cxl_nvd if present
>   * @cxl_nvd: optional bridge to an nvdimm if the device supports pmem
> + * @endpoint: connection to the CXL port topology for this memory device
>   * @id: id number of this memdev instance.
>   * @depth: endpoint port depth
>   */
> @@ -48,6 +49,7 @@ struct cxl_memdev {
>  	struct work_struct detach_work;
>  	struct cxl_nvdimm_bridge *cxl_nvb;
>  	struct cxl_nvdimm *cxl_nvd;
> +	struct cxl_port *endpoint;
>  	int id;
>  	int depth;
>  };
> 
> 

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 19/19] tools/testing/cxl: Emulate a CXL accelerator with local memory
  2023-06-04 23:33 ` [PATCH 19/19] tools/testing/cxl: Emulate a CXL accelerator with local memory Dan Williams
  2023-06-06 15:34   ` Jonathan Cameron
@ 2023-06-07 21:09   ` Vikram Sethi
  2023-06-08 10:47     ` Jonathan Cameron
  1 sibling, 1 reply; 64+ messages in thread
From: Vikram Sethi @ 2023-06-07 21:09 UTC (permalink / raw)
  To: Dan Williams, linux-cxl; +Cc: ira.weiny, navneet.singh

Thanks for posting this, Dan.
> From: Dan Williams <dan.j.williams@intel.com>
> Sent: Sunday, June 4, 2023 6:33 PM
> To: linux-cxl@vger.kernel.org
> Cc: ira.weiny@intel.com; navneet.singh@intel.com
> Subject: [PATCH 19/19] tools/testing/cxl: Emulate a CXL accelerator with local
> memory 
> 
> Mock-up a device that does not have a standard mailbox, i.e. a device that
> does not implement the CXL memory-device class code, but wants to map
> "device" memory (aka Type-2, aka HDM-D[B], aka accelerator memory).
> 
> For extra code coverage make this device an RCD to test region creation
> flows in the presence of an RCH topology (memory device modeled as a
> root-complex-integrated-endpoint RCIEP).
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/core/memdev.c    |   15 +++++++
>  drivers/cxl/cxlmem.h         |    1
>  tools/testing/cxl/test/cxl.c |   16 +++++++-
>  tools/testing/cxl/test/mem.c |   85
> +++++++++++++++++++++++++++++++++++++++++-
>  4 files changed, 112 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c index
> 859c43c340bb..5d1ba7a72567 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -467,6 +467,21 @@ static void detach_memdev(struct work_struct
> *work)
>         put_device(&cxlmd->dev);
>  }
> 
> +struct cxl_dev_state *cxl_accel_state_create(struct device *dev) {
> +       struct cxl_dev_state *cxlds;
> +
> +       cxlds = devm_kzalloc(dev, sizeof(*cxlds), GFP_KERNEL);
> +       if (!cxlds)
> +               return ERR_PTR(-ENOMEM);
> +
> +       cxlds->dev = dev;
> +       cxlds->type = CXL_DEVTYPE_DEVMEM;
> +
> +       return cxlds;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_accel_state_create, CXL);
> +
>  static struct lock_class_key cxl_memdev_key;
> 
>  static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h index
> ad7f806549d3..89e560ea14c0 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -746,6 +746,7 @@ int cxl_await_media_ready(struct cxl_dev_state
> *cxlds);  int cxl_enumerate_cmds(struct cxl_memdev_state *mds);  int
> cxl_mem_create_range_info(struct cxl_memdev_state *mds);  struct
> cxl_memdev_state *cxl_memdev_state_create(struct device *dev);
> +struct cxl_dev_state *cxl_accel_state_create(struct device *dev);
>  void set_exclusive_cxl_commands(struct cxl_memdev_state *mds,
>                                 unsigned long *cmds);  void
> clear_exclusive_cxl_commands(struct cxl_memdev_state *mds, diff --git
> a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c index
> e3f1b2e88e3e..385cdeeab22c 100644
> --- a/tools/testing/cxl/test/cxl.c
> +++ b/tools/testing/cxl/test/cxl.c
> @@ -278,7 +278,7 @@ static struct {
>                         },
>                         .interleave_ways = 0,
>                         .granularity = 4,
> -                       .restrictions = ACPI_CEDT_CFMWS_RESTRICT_TYPE3 |
> +                       .restrictions = ACPI_CEDT_CFMWS_RESTRICT_TYPE2 |
>                                         ACPI_CEDT_CFMWS_RESTRICT_VOLATILE,
>                         .qtg_id = 5,
>                         .window_size = SZ_256M, @@ -713,7 +713,19 @@ static void
> default_mock_decoder(struct cxl_decoder *cxld)
> 
>         cxld->interleave_ways = 1;
>         cxld->interleave_granularity = 256;
> -       cxld->target_type = CXL_DECODER_HOSTMEM;
> +       if (is_endpoint_decoder(&cxld->dev)) {
> +               struct cxl_endpoint_decoder *cxled;
> +               struct cxl_dev_state *cxlds;
> +               struct cxl_memdev *cxlmd;
> +
> +               cxled = to_cxl_endpoint_decoder(&cxld->dev);
> +               cxlmd = cxled_to_memdev(cxled);
> +               cxlds = cxlmd->cxlds;
> +               if (cxlds->type == CXL_DEVTYPE_CLASSMEM)
> +                       cxld->target_type = CXL_DECODER_HOSTMEM;
> +               else
> +                       cxld->target_type = CXL_DECODER_DEVMEM;
> +       }
>         cxld->commit = mock_decoder_commit;
>         cxld->reset = mock_decoder_reset;  } diff --git
> a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c index
> 6fb5718588f3..620bfcf5e5a5 100644
> --- a/tools/testing/cxl/test/mem.c
> +++ b/tools/testing/cxl/test/mem.c
> @@ -1189,11 +1189,21 @@ static void label_area_release(void *lsa)
>         vfree(lsa);
>  }
> 
> +#define CXL_MOCKMEM_RCD BIT(0)
> +#define CXL_MOCKMEM_TYPE2 BIT(1)
> +
>  static bool is_rcd(struct platform_device *pdev)  {
>         const struct platform_device_id *id = platform_get_device_id(pdev);
> 
> -       return !!id->driver_data;
> +       return !!(id->driver_data & CXL_MOCKMEM_RCD); }
> +
> +static bool is_type2(struct platform_device *pdev) {
> +       const struct platform_device_id *id =
> +platform_get_device_id(pdev);
> +
> +       return !!(id->driver_data & CXL_MOCKMEM_TYPE2);
>  }
> 
>  static ssize_t event_trigger_store(struct device *dev, @@ -1205,7 +1215,7
> @@ static ssize_t event_trigger_store(struct device *dev,  }  static
> DEVICE_ATTR_WO(event_trigger);
> 
> -static int cxl_mock_mem_probe(struct platform_device *pdev)
> +static int __cxl_mock_mem_probe(struct platform_device *pdev)
>  {
>         struct device *dev = &pdev->dev;
>         struct cxl_memdev *cxlmd;
> @@ -1274,6 +1284,75 @@ static int cxl_mock_mem_probe(struct
> platform_device *pdev)
>         return 0;
>  }
> 
> +static int cxl_mock_type2_probe(struct platform_device *pdev) {
> +       struct cxl_endpoint_decoder *cxled;
> +       struct device *dev = &pdev->dev;
> +       struct cxl_root_decoder *cxlrd;
> +       struct cxl_dev_state *cxlds;
> +       struct cxl_port *endpoint;
> +       struct cxl_memdev *cxlmd;
> +       resource_size_t max = 0;
> +       int rc;
> +
> +       cxlds = cxl_accel_state_create(dev);
> +       if (IS_ERR(cxlds))
> +               return PTR_ERR(cxlds);
> +
> +       cxlds->serial = pdev->id;
> +       cxlds->component_reg_phys = CXL_RESOURCE_NONE;
> +       cxlds->dpa_res = DEFINE_RES_MEM(0, DEV_SIZE);
> +       cxlds->ram_res = DEFINE_RES_MEM_NAMED(0, DEV_SIZE, "ram");
> +       cxlds->pmem_res = DEFINE_RES_MEM_NAMED(DEV_SIZE, 0, "pmem");
> +       if (is_rcd(pdev))
> +               cxlds->rcd = true;
> +
> +       rc = request_resource(&cxlds->dpa_res, &cxlds->ram_res);
> +       if (rc)
> +               return rc;
> +
> +       cxlmd = devm_cxl_add_memdev(cxlds);
> +       if (IS_ERR(cxlmd))
> +               return PTR_ERR(cxlmd);
> +
> +       endpoint = cxl_acquire_endpoint(cxlmd);
> +       if (IS_ERR(endpoint))
> +               return PTR_ERR(endpoint);
> +
> +       cxlrd = cxl_hpa_freespace(endpoint, &endpoint->host_bridge, 1,
> +                                 CXL_DECODER_F_RAM | CXL_DECODER_F_TYPE2,
> +                                 &max);
> +

IIUC, finding free HPA space is for the case where the platform FW has not already allocated it and initialized the HDM ranges in the device decoders, correct?
If the accelerator driver recognized that FW had initialized its HPA ranges in the device decoders (without committing/locking the decoders), could it skip the cxl_hpa_freespace call?
It would seem reasonable for FW to init the decoder but not commit/lock it. 
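To make that concrete, a rough sketch of the flow I have in mind (the
fw_programmed_hpa() helper is made up for illustration; this assumes
the driver holds a reference to its endpoint decoder):

	/* Hypothetical: did FW already program this decoder's HPA range? */
	static bool fw_programmed_hpa(struct cxl_endpoint_decoder *cxled)
	{
		struct cxl_decoder *cxld = &cxled->cxld;

		return (cxld->flags & CXL_DECODER_F_ENABLE) &&
		       range_len(&cxld->hpa_range);
	}

	...
		if (fw_programmed_hpa(cxled))
			/* reuse the FW-assigned range, skip the search */
			base = cxled->cxld.hpa_range.start;
		else
			cxlrd = cxl_hpa_freespace(endpoint, &endpoint->host_bridge, 1,
						  CXL_DECODER_F_RAM | CXL_DECODER_F_TYPE2,
						  &max);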

> +       if (IS_ERR(cxlrd)) {
> +               rc = PTR_ERR(cxlrd);
> +               goto out;
> +       }
> +
> +       cxled = cxl_request_dpa(endpoint, CXL_DECODER_RAM, 0, max);
> +       if (IS_ERR(cxled)) {
> +               rc = PTR_ERR(cxled);
> +               goto out_cxlrd;
> +       }
> +
> +       /* A real driver would do something with the returned region */
> +       rc = PTR_ERR_OR_ZERO(cxl_create_region(cxlrd, &cxled, 1));

Assuming the accelerator driver wanted to add some or all of its coherent memory to the kernel MM via add_memory_driver_managed, I think it would get the HPA ranges from the decoder's hpa_range field. 
But that API also needs a node ID. 
If the FW ACPI tables had shown a Generic Initiator Affinity structure for the accelerator, then I believe the accelerator's device struct should already have its NUMA node, and the same could be passed to add_memory_driver_managed. Does that sound right, or is there a better way to ensure the accelerator memory gets a distinct NUMA node?
If the ACPI tables had not shown the device as a generic initiator, is there any notion of the CXL memory device structs having a new/distinct NUMA node for the memory device, or would it just point to the NUMA node of the CPU socket hosting the host bridge, or default to NUMA node 0?
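To make the question concrete, the call I am picturing is roughly the
following (sketch only; the "cxl-accel" resource name is made up,
though add_memory_driver_managed() does require the
"System RAM ($DRIVER)" format):

	struct range *hpa = &cxled->cxld.hpa_range;
	int nid = dev_to_node(dev);	/* the GI node, if FW described one */

	if (nid == NUMA_NO_NODE)
		nid = memory_add_physaddr_to_nid(hpa->start);

	rc = add_memory_driver_managed(nid, hpa->start, range_len(hpa),
				       "System RAM (cxl-accel)",
				       MHP_MERGE_RESOURCE);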

> +
> +       put_device(cxled_dev(cxled));
> +out_cxlrd:
> +       put_device(cxlrd_dev(cxlrd));
> +out:
> +       cxl_release_endpoint(cxlmd, endpoint);
> +
> +       return rc;
> +}
> +
> +static int cxl_mock_mem_probe(struct platform_device *pdev) {
> +       if (is_type2(pdev))
> +               return cxl_mock_type2_probe(pdev);
> +       return __cxl_mock_mem_probe(pdev); }
> +
>  static ssize_t security_lock_show(struct device *dev,
>                                   struct device_attribute *attr, char *buf)  { @@ -1316,7
> +1395,7 @@ ATTRIBUTE_GROUPS(cxl_mock_mem);
> 
>  static const struct platform_device_id cxl_mock_mem_ids[] = {
>         { .name = "cxl_mem", 0 },
> -       { .name = "cxl_rcd", 1 },
> +       { .name = "cxl_rcd", CXL_MOCKMEM_RCD | CXL_MOCKMEM_TYPE2 },
>         { },
>  };
>  MODULE_DEVICE_TABLE(platform, cxl_mock_mem_ids);


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 19/19] tools/testing/cxl: Emulate a CXL accelerator with local memory
  2023-06-07 21:09   ` Vikram Sethi
@ 2023-06-08 10:47     ` Jonathan Cameron
  2023-06-08 14:34       ` Vikram Sethi
  0 siblings, 1 reply; 64+ messages in thread
From: Jonathan Cameron @ 2023-06-08 10:47 UTC (permalink / raw)
  To: Vikram Sethi; +Cc: Dan Williams, linux-cxl, ira.weiny, navneet.singh

On Wed, 7 Jun 2023 21:09:21 +0000
Vikram Sethi <vsethi@nvidia.com> wrote:

> Thanks for posting this Dan. 
> > From: Dan Williams <dan.j.williams@intel.com>
> > Sent: Sunday, June 4, 2023 6:33 PM
> > To: linux-cxl@vger.kernel.org
> > Cc: ira.weiny@intel.com; navneet.singh@intel.com
> > Subject: [PATCH 19/19] tools/testing/cxl: Emulate a CXL accelerator with local
> > memory 
> > 
> > Mock-up a device that does not have a standard mailbox, i.e. a device that
> > does not implement the CXL memory-device class code, but wants to map
> > "device" memory (aka Type-2, aka HDM-D[B], aka accelerator memory).
> > 
> > For extra code coverage make this device an RCD to test region creation
> > flows in the presence of an RCH topology (memory device modeled as a
> > root-complex-integrated-endpoint RCIEP).
> > 
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> > ---
> >  drivers/cxl/core/memdev.c    |   15 +++++++
> >  drivers/cxl/cxlmem.h         |    1
> >  tools/testing/cxl/test/cxl.c |   16 +++++++-
> >  tools/testing/cxl/test/mem.c |   85
> > +++++++++++++++++++++++++++++++++++++++++-
> >  4 files changed, 112 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c index
> > 859c43c340bb..5d1ba7a72567 100644
> > --- a/drivers/cxl/core/memdev.c
> > +++ b/drivers/cxl/core/memdev.c
> > @@ -467,6 +467,21 @@ static void detach_memdev(struct work_struct
> > *work)
> >         put_device(&cxlmd->dev);
> >  }
> > 
> > +struct cxl_dev_state *cxl_accel_state_create(struct device *dev) {
> > +       struct cxl_dev_state *cxlds;
> > +
> > +       cxlds = devm_kzalloc(dev, sizeof(*cxlds), GFP_KERNEL);
> > +       if (!cxlds)
> > +               return ERR_PTR(-ENOMEM);
> > +
> > +       cxlds->dev = dev;
> > +       cxlds->type = CXL_DEVTYPE_DEVMEM;
> > +
> > +       return cxlds;
> > +}
> > +EXPORT_SYMBOL_NS_GPL(cxl_accel_state_create, CXL);
> > +
> >  static struct lock_class_key cxl_memdev_key;
> > 
> >  static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h index
> > ad7f806549d3..89e560ea14c0 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> > @@ -746,6 +746,7 @@ int cxl_await_media_ready(struct cxl_dev_state
> > *cxlds);  int cxl_enumerate_cmds(struct cxl_memdev_state *mds);  int
> > cxl_mem_create_range_info(struct cxl_memdev_state *mds);  struct
> > cxl_memdev_state *cxl_memdev_state_create(struct device *dev);
> > +struct cxl_dev_state *cxl_accel_state_create(struct device *dev);
> >  void set_exclusive_cxl_commands(struct cxl_memdev_state *mds,
> >                                 unsigned long *cmds);  void
> > clear_exclusive_cxl_commands(struct cxl_memdev_state *mds, diff --git
> > a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c index
> > e3f1b2e88e3e..385cdeeab22c 100644
> > --- a/tools/testing/cxl/test/cxl.c
> > +++ b/tools/testing/cxl/test/cxl.c
> > @@ -278,7 +278,7 @@ static struct {
> >                         },
> >                         .interleave_ways = 0,
> >                         .granularity = 4,
> > -                       .restrictions = ACPI_CEDT_CFMWS_RESTRICT_TYPE3 |
> > +                       .restrictions = ACPI_CEDT_CFMWS_RESTRICT_TYPE2 |
> >                                         ACPI_CEDT_CFMWS_RESTRICT_VOLATILE,
> >                         .qtg_id = 5,
> >                         .window_size = SZ_256M, @@ -713,7 +713,19 @@ static void
> > default_mock_decoder(struct cxl_decoder *cxld)
> > 
> >         cxld->interleave_ways = 1;
> >         cxld->interleave_granularity = 256;
> > -       cxld->target_type = CXL_DECODER_HOSTMEM;
> > +       if (is_endpoint_decoder(&cxld->dev)) {
> > +               struct cxl_endpoint_decoder *cxled;
> > +               struct cxl_dev_state *cxlds;
> > +               struct cxl_memdev *cxlmd;
> > +
> > +               cxled = to_cxl_endpoint_decoder(&cxld->dev);
> > +               cxlmd = cxled_to_memdev(cxled);
> > +               cxlds = cxlmd->cxlds;
> > +               if (cxlds->type == CXL_DEVTYPE_CLASSMEM)
> > +                       cxld->target_type = CXL_DECODER_HOSTMEM;
> > +               else
> > +                       cxld->target_type = CXL_DECODER_DEVMEM;
> > +       }
> >         cxld->commit = mock_decoder_commit;
> >         cxld->reset = mock_decoder_reset;  } diff --git
> > a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c index
> > 6fb5718588f3..620bfcf5e5a5 100644
> > --- a/tools/testing/cxl/test/mem.c
> > +++ b/tools/testing/cxl/test/mem.c
> > @@ -1189,11 +1189,21 @@ static void label_area_release(void *lsa)
> >         vfree(lsa);
> >  }
> > 
> > +#define CXL_MOCKMEM_RCD BIT(0)
> > +#define CXL_MOCKMEM_TYPE2 BIT(1)
> > +
> >  static bool is_rcd(struct platform_device *pdev)  {
> >         const struct platform_device_id *id = platform_get_device_id(pdev);
> > 
> > -       return !!id->driver_data;
> > +       return !!(id->driver_data & CXL_MOCKMEM_RCD); }
> > +
> > +static bool is_type2(struct platform_device *pdev) {
> > +       const struct platform_device_id *id =
> > +platform_get_device_id(pdev);
> > +
> > +       return !!(id->driver_data & CXL_MOCKMEM_TYPE2);
> >  }
> > 
> >  static ssize_t event_trigger_store(struct device *dev, @@ -1205,7 +1215,7
> > @@ static ssize_t event_trigger_store(struct device *dev,  }  static
> > DEVICE_ATTR_WO(event_trigger);
> > 
> > -static int cxl_mock_mem_probe(struct platform_device *pdev)
> > +static int __cxl_mock_mem_probe(struct platform_device *pdev)
> >  {
> >         struct device *dev = &pdev->dev;
> >         struct cxl_memdev *cxlmd;
> > @@ -1274,6 +1284,75 @@ static int cxl_mock_mem_probe(struct
> > platform_device *pdev)
> >         return 0;
> >  }
> > 
> > +static int cxl_mock_type2_probe(struct platform_device *pdev) {
> > +       struct cxl_endpoint_decoder *cxled;
> > +       struct device *dev = &pdev->dev;
> > +       struct cxl_root_decoder *cxlrd;
> > +       struct cxl_dev_state *cxlds;
> > +       struct cxl_port *endpoint;
> > +       struct cxl_memdev *cxlmd;
> > +       resource_size_t max = 0;
> > +       int rc;
> > +
> > +       cxlds = cxl_accel_state_create(dev);
> > +       if (IS_ERR(cxlds))
> > +               return PTR_ERR(cxlds);
> > +
> > +       cxlds->serial = pdev->id;
> > +       cxlds->component_reg_phys = CXL_RESOURCE_NONE;
> > +       cxlds->dpa_res = DEFINE_RES_MEM(0, DEV_SIZE);
> > +       cxlds->ram_res = DEFINE_RES_MEM_NAMED(0, DEV_SIZE, "ram");
> > +       cxlds->pmem_res = DEFINE_RES_MEM_NAMED(DEV_SIZE, 0, "pmem");
> > +       if (is_rcd(pdev))
> > +               cxlds->rcd = true;
> > +
> > +       rc = request_resource(&cxlds->dpa_res, &cxlds->ram_res);
> > +       if (rc)
> > +               return rc;
> > +
> > +       cxlmd = devm_cxl_add_memdev(cxlds);
> > +       if (IS_ERR(cxlmd))
> > +               return PTR_ERR(cxlmd);
> > +
> > +       endpoint = cxl_acquire_endpoint(cxlmd);
> > +       if (IS_ERR(endpoint))
> > +               return PTR_ERR(endpoint);
> > +
> > +       cxlrd = cxl_hpa_freespace(endpoint, &endpoint->host_bridge, 1,
> > +                                 CXL_DECODER_F_RAM | CXL_DECODER_F_TYPE2,
> > +                                 &max);
> > +  
> 

> IIUC, finding free HPA space is for the case where the platform FW
> has not already allocated it and initialized the HDM ranges in the
> device decoders, correct? If the accelerator driver recognized that
> FW had initialized its HPA ranges in the device decoders (without
> committing/locking the decoders), could it skip the cxl_hpa_freespace
> call? It would seem reasonable for FW to init the decoder but not
> commit/lock it. 

I'd find it a bit odd if firmware did a partial job...
Why do you think it might?  To pass a hint to the kernel?


> 
> > +       if (IS_ERR(cxlrd)) {
> > +               rc = PTR_ERR(cxlrd);
> > +               goto out;
> > +       }
> > +
> > +       cxled = cxl_request_dpa(endpoint, CXL_DECODER_RAM, 0, max);
> > +       if (IS_ERR(cxled)) {
> > +               rc = PTR_ERR(cxled);
> > +               goto out_cxlrd;
> > +       }
> > +
> > +       /* A real driver would do something with the returned region */
> > +       rc = PTR_ERR_OR_ZERO(cxl_create_region(cxlrd, &cxled, 1));  
> 

> Assuming the accelerator driver wanted to add some, or all of its
> coherent memory to the kernel MM via add_memory_driver_managed, I
> think it would get the HPA ranges from the decoder's hpa_range field.
> But that API also needs a node ID. If the FW ACPI tables had shown
> the accelerator Generic Initiator Affinity structure, then I believe
> the accelerator's device struct should already have its numa node,
> and the same could be passed to add_memory_driver_managed. Does that
> sound right, or is there a better way to ensure the accelerator
> memory gets a distinct NUMA node?

If it has a GI node assigned, I agree that probably makes sense.
There might be some fiddly corners where it doesn't, but they are probably
the exception (multiple memory types, different access characteristics, or
a need to represent some other topology complexities).


> If the ACPI tables had not shown
> the device as a generic initiator, is there any notion of the cxl
> memory device structs having a new/distinct NUMA node for the memory
> device, or would it just be pointing to the NUMA node of the
> associated CPU socket which has the host bridge or a default 0 NUMA
> node?

I'll leave this one for others (mostly :). I personally think the current model is too
simplistic and we need to bite the bullet and work out how to do full
dynamic NUMA node creation rather than using some preallocated ones.

> 
> > +
> > +       put_device(cxled_dev(cxled));
> > +out_cxlrd:
> > +       put_device(cxlrd_dev(cxlrd));
> > +out:
> > +       cxl_release_endpoint(cxlmd, endpoint);
> > +
> > +       return rc;
> > +}
> > +
> > +static int cxl_mock_mem_probe(struct platform_device *pdev) {
> > +       if (is_type2(pdev))
> > +               return cxl_mock_type2_probe(pdev);
> > +       return __cxl_mock_mem_probe(pdev); }
> > +
> >  static ssize_t security_lock_show(struct device *dev,
> >                                   struct device_attribute *attr, char *buf)  { @@ -1316,7
> > +1395,7 @@ ATTRIBUTE_GROUPS(cxl_mock_mem);
> > 
> >  static const struct platform_device_id cxl_mock_mem_ids[] = {
> >         { .name = "cxl_mem", 0 },
> > -       { .name = "cxl_rcd", 1 },
> > +       { .name = "cxl_rcd", CXL_MOCKMEM_RCD | CXL_MOCKMEM_TYPE2 },
> >         { },
> >  };
> >  MODULE_DEVICE_TABLE(platform, cxl_mock_mem_ids);  
> 


^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH 19/19] tools/testing/cxl: Emulate a CXL accelerator with local memory
  2023-06-08 10:47     ` Jonathan Cameron
@ 2023-06-08 14:34       ` Vikram Sethi
  2023-06-08 15:22         ` Jonathan Cameron
  0 siblings, 1 reply; 64+ messages in thread
From: Vikram Sethi @ 2023-06-08 14:34 UTC (permalink / raw)
  To: Jonathan Cameron; +Cc: Dan Williams, linux-cxl, ira.weiny, navneet.singh

> From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
> Sent: Thursday, June 8, 2023 5:47 AM
> To: Vikram Sethi <vsethi@nvidia.com>
> Cc: Dan Williams <dan.j.williams@intel.com>; linux-cxl@vger.kernel.org;
> ira.weiny@intel.com; navneet.singh@intel.com
> Subject: Re: [PATCH 19/19] tools/testing/cxl: Emulate a CXL accelerator with
> local memory
> On Wed, 7 Jun 2023 21:09:21 +0000
> Vikram Sethi <vsethi@nvidia.com> wrote:
> 
> > Thanks for posting this Dan.
> > > From: Dan Williams <dan.j.williams@intel.com>
> > > Sent: Sunday, June 4, 2023 6:33 PM
> > > To: linux-cxl@vger.kernel.org
> > > Cc: ira.weiny@intel.com; navneet.singh@intel.com
> > > Subject: [PATCH 19/19] tools/testing/cxl: Emulate a CXL accelerator
> > > with local memory
> > >
> > > Mock-up a device that does not have a standard mailbox, i.e. a
> > > device that does not implement the CXL memory-device class code, but
> > > wants to map "device" memory (aka Type-2, aka HDM-D[B], aka
> accelerator memory).
> > >
> > > +
> > > +       cxlrd = cxl_hpa_freespace(endpoint, &endpoint->host_bridge, 1,
> > > +                                 CXL_DECODER_F_RAM | CXL_DECODER_F_TYPE2,
> > > +                                 &max);
> > > +
> >
> 
> > IIUC, finding free HPA space is for the case where the platform FW has
> > not already allocated it and initialized the HDM ranges in the device
> > decoders, correct? If the accelerator driver recognized that FW had
> > initialized its HPA ranges in the device decoders (without
> > committing/locking the decoders), could it skip the cxl_hpa_freespace
> > call? It would seem reasonable for FW to init the decoder but not
> > commit/lock it.
> 
> I'd find it a bit odd if firmware did a partial job...
> Why do you think it might?  To pass a hint to the kernel?
> 
Firmware could certainly initialize, commit, and lock the decoder for accelerators that are soldered on the motherboard. 
I just wasn't sure if the CXL core code could deal with a committed and locked decoder. 
I was also thinking about chiplets within a package built on new specifications like UCIe, where chip designers may have
assigned a fixed HPA range in the chip address map to a CXL device chiplet's HDM. Would it be sufficient for FW to convey this by committing and
locking the decoders, or would we need some new ACPI flag telling the kernel that this device's decoders really do decode a fixed HPA range and that the fixed values must not be changed?
A similar notion of fixed BARs exists in PCIe: enhanced allocation with hardwired addresses.
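In code terms, I would picture the driver (or the core) treating a
committed and locked decoder as advertising a fixed range, something
like (sketch only):

	/* A decoder FW committed and locked has an immutable HPA range */
	static bool decoder_range_fixed(struct cxl_decoder *cxld)
	{
		return (cxld->flags & CXL_DECODER_F_ENABLE) &&
		       (cxld->flags & CXL_DECODER_F_LOCK);
	}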

> 
> >
> > > +       if (IS_ERR(cxlrd)) {
> > > +               rc = PTR_ERR(cxlrd);
> > > +               goto out;
> > > +       }
> > > +
> > > +       cxled = cxl_request_dpa(endpoint, CXL_DECODER_RAM, 0, max);
> > > +       if (IS_ERR(cxled)) {
> > > +               rc = PTR_ERR(cxled);
> > > +               goto out_cxlrd;
> > > +       }
> > > +
> > > +       /* A real driver would do something with the returned region */
> > > +       rc = PTR_ERR_OR_ZERO(cxl_create_region(cxlrd, &cxled, 1));
> >
> 
> > Assuming the accelerator driver wanted to add some, or all of its
> > coherent memory to the kernel MM via add_memory_driver_managed, I
> > think it would get the HPA ranges from the decoder's hpa_range field.
> > But that API also needs a node ID. If the FW ACPI tables had shown the
> > accelerator Generic Initiator Affinity structure, then I believe the
> > accelerator's device struct should already have its numa node, and the
> > same could be passed to add_memory_driver_managed. Does that sound
> > right, or is there a better way to ensure the accelerator memory gets
> > a distinct NUMA node?
> 
> If it has a GI node assigned I agree that probably makes sense.
> There might be some fiddly corners where it doesn't but they are probably
> the exception (multiple memory types, or different access characteristics,
> need to represent some other topology complexities)
> 
> 
> > If the ACPI tables had not shown
> > the device as a generic initiator, is there any notion of the cxl
> > memory device structs having a new/distinct NUMA node for the memory
> > device, or would it just be pointing to the NUMA node of the
> > associated CPU socket which has the host bridge or a default 0 NUMA
> > node?
> 
> I'll leave this one for others (mostly :). I personally think current model is too
> simplistic and we need to bite the bullet and work out how to do full dynamic
> numa node creation rather that using some preallocated ones.
> 
> >
> > > +
> > > +       put_device(cxled_dev(cxled));
> > > +out_cxlrd:
> > > +       put_device(cxlrd_dev(cxlrd));
> > > +out:
> > > +       cxl_release_endpoint(cxlmd, endpoint);
> > > +
> > > +       return rc;
> > > +}
> > > +
> > > +static int cxl_mock_mem_probe(struct platform_device *pdev) {
> > > +       if (is_type2(pdev))
> > > +               return cxl_mock_type2_probe(pdev);
> > > +       return __cxl_mock_mem_probe(pdev); }
> > > +
> > >  static ssize_t security_lock_show(struct device *dev,
> > >                                   struct device_attribute *attr,
> > > char *buf)  { @@ -1316,7
> > > +1395,7 @@ ATTRIBUTE_GROUPS(cxl_mock_mem);
> > >
> > >  static const struct platform_device_id cxl_mock_mem_ids[] = {
> > >         { .name = "cxl_mem", 0 },
> > > -       { .name = "cxl_rcd", 1 },
> > > +       { .name = "cxl_rcd", CXL_MOCKMEM_RCD | CXL_MOCKMEM_TYPE2
> },
> > >         { },
> > >  };
> > >  MODULE_DEVICE_TABLE(platform, cxl_mock_mem_ids);
> >


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 19/19] tools/testing/cxl: Emulate a CXL accelerator with local memory
  2023-06-08 14:34       ` Vikram Sethi
@ 2023-06-08 15:22         ` Jonathan Cameron
  0 siblings, 0 replies; 64+ messages in thread
From: Jonathan Cameron @ 2023-06-08 15:22 UTC (permalink / raw)
  To: Vikram Sethi; +Cc: Dan Williams, linux-cxl, ira.weiny, navneet.singh

On Thu, 8 Jun 2023 14:34:48 +0000
Vikram Sethi <vsethi@nvidia.com> wrote:

> > From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
> > Sent: Thursday, June 8, 2023 5:47 AM
> > To: Vikram Sethi <vsethi@nvidia.com>
> > Cc: Dan Williams <dan.j.williams@intel.com>; linux-cxl@vger.kernel.org;
> > ira.weiny@intel.com; navneet.singh@intel.com
> > Subject: Re: [PATCH 19/19] tools/testing/cxl: Emulate a CXL accelerator with
> > local memory
> > On Wed, 7 Jun 2023 21:09:21 +0000
> > Vikram Sethi <vsethi@nvidia.com> wrote:
> >   
> > > Thanks for posting this Dan.  
> > > > From: Dan Williams <dan.j.williams@intel.com>
> > > > Sent: Sunday, June 4, 2023 6:33 PM
> > > > To: linux-cxl@vger.kernel.org
> > > > Cc: ira.weiny@intel.com; navneet.singh@intel.com
> > > > Subject: [PATCH 19/19] tools/testing/cxl: Emulate a CXL accelerator
> > > > with local memory
> > > >
> > > > Mock-up a device that does not have a standard mailbox, i.e. a
> > > > device that does not implement the CXL memory-device class code, but
> > > > wants to map "device" memory (aka Type-2, aka HDM-D[B], aka  
> > accelerator memory).  
> > > >
> > > > +
> > > > +       cxlrd = cxl_hpa_freespace(endpoint, &endpoint->host_bridge, 1,
> > > > +                                 CXL_DECODER_F_RAM | CXL_DECODER_F_TYPE2,
> > > > +                                 &max);
> > > > +  
> > >  
> >   
> > > IIUC, finding free HPA space is for the case where the platform FW has
> > > not already allocated it and initialized the HDM ranges in the device
> > > decoders, correct? If the accelerator driver recognized that FW had
> > > initialized its HPA ranges in the device decoders (without
> > > committing/locking the decoders), could it skip the cxl_hpa_freespace
> > > call? It would seem reasonable for FW to init the decoder but not
> > > commit/lock it.  
> > 
> > I'd find it a bit odd if firmware did a partial job...
> > Why do you think it might?  To pass a hint to the kernel?
> >   

> Firmware could certainly initialize, commit, and lock the decoder for
> accelerators that are soldered on the motherboard. I just wasn't sure
> if the CXL core code could deal with a committed and locked decoder. 

It can for type 3 devices; I haven't checked whether that still applies
to this type 2 code.

> I was also thinking about chiplets within a package with new
> specifications like UCIe where it is possible that chip designers
> assigned a fixed HPA range in the chip address map to a CXL device
> chiplet's HDM. Would it be sufficient for FW to convey this by
> committing and locking the decoders, or would we need some new ACPI
> flags telling the kernel that this device's decoders can really
> decode a fixed HPA range and not to change the fixed values? A
> similar notion exists in PCIe of fixed BARs called enhanced
> allocation with hardwired addresses.

If they are hard-wired then FW should just lock them, I think.

Jonathan


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 04/19] cxl/memdev: Make mailbox functionality optional
  2023-06-06 11:15   ` Jonathan Cameron
@ 2023-06-13 20:53     ` Dan Williams
  0 siblings, 0 replies; 64+ messages in thread
From: Dan Williams @ 2023-06-13 20:53 UTC (permalink / raw)
  To: Jonathan Cameron, Dan Williams; +Cc: linux-cxl, ira.weiny, navneet.singh

Jonathan Cameron wrote:
> On Sun, 04 Jun 2023 16:31:59 -0700
> Dan Williams <dan.j.williams@intel.com> wrote:
> 
> > In support of the Linux CXL core scaling for a wider set of CXL devices,
> > allow for the creation of memdevs with some memory device capabilities
> > disabled. Specifically, allow for CXL devices outside of those claiming
> > to be compliant with the generic CXL memory device class code, like
> > vendor specific Type-2/3 devices that host CXL.mem. This implies, allow
> > for the creation of memdevs that only support component-registers, not
> > necessarily memory-device-registers (like mailbox registers). A memdev
> > derived from a CXL endpoint that does not support generic class code
> > expectations is tagged "CXL_DEVTYPE_DEVMEM", while a memdev derived from a
> > class-code compliant endpoint is tagged "CXL_DEVTYPE_CLASSMEM".
> > 
> > The primary assumption of a CXL_DEVTYPE_DEVMEM memdev is that it
> > optionally may not host a mailbox. Disable the command passthrough ioctl
> > for memdevs that are not CXL_DEVTYPE_CLASSMEM, and return empty strings
> > from memdev attributes associated with data retrieved via the
> > class-device-standard IDENTIFY command. Note that empty strings were
> > chosen over attribute visibility to maintain compatibility with shipping
> > versions of cxl-cli that expect those attributes to always be present.
> Hmm.  I'm not keen on this, but I guess we've ended up in this corner
> so we don't have much choice.

We could start with fixing that expectation and then set a deprecation
schedule for this workaround. I filed an internal task to track this.

> 
> > 
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> 
> Trivial stuff inline.
> 
> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > index d3fe73d5ba4d..b8bdf7490d2c 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> > @@ -254,6 +254,20 @@ struct cxl_poison_state {
> >  	struct mutex lock;  /* Protect reads of poison list */
> >  };
> >  
> > +/*
> > + * enum cxl_devtype - delineate type-2 from a generic type-3 device
> > + * @CXL_DEVTYPE_DEVMEM - Vendor specific CXL Type-2 device implementing HDM-D or
> 
> Bit of a naming collision with other uses of DEVMEM but I can't immediately think
> of a better name so fair enough...
> 
> > + *			 HDM-DB, no expectation that this device implements a
> > + *			 mailbox, or other memory-device-standard manageability
> > + *			 flows.
> no requirement that this device
> 
> Expectation is a bit strong the other way to my reading.  These devices might well
> implement some or all of that + other stuff that means they don't want to use the
> class code.

Fair enough.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 05/19] cxl/port: Rename CXL_DECODER_{EXPANDER, ACCELERATOR} => {HOSTMEM, DEVMEM}
  2023-06-06 11:21   ` Jonathan Cameron
@ 2023-06-13 21:03     ` Dan Williams
  0 siblings, 0 replies; 64+ messages in thread
From: Dan Williams @ 2023-06-13 21:03 UTC (permalink / raw)
  To: Jonathan Cameron, Dan Williams; +Cc: linux-cxl, ira.weiny, navneet.singh

Jonathan Cameron wrote:
> On Sun, 04 Jun 2023 16:32:05 -0700
> Dan Williams <dan.j.williams@intel.com> wrote:
> 
> > In preparation for support for HDM-D and HDM-DB configuration
> > (device-memory, and device-memory with back-invalidate). Rename the current
> > type designators to use HOSTMEM and DEVMEM as a suffix.
> 
> I'd state this is in line with the CXL 3.0 spec naming.
> Device-coherent address range vs Host-only coherent address range.
> (or rename HOSTMEM to HOSTONLYMEM, which I think deals with that subtlety) 

HOSTONLYMEM is more clear.

> > 
> > HDM-DB can be supported by devices that are not accelerators, so DEVMEM is
> > a more generic term for that case.
> > 
> > Fixup one location where this type value was open coded.
> > 
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> 
> Bike-shedding aside LGTM

Thanks!

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 06/19] cxl/hdm: Default CXL_DEVTYPE_DEVMEM decoders to CXL_DECODER_DEVMEM
  2023-06-06 11:27   ` Jonathan Cameron
@ 2023-06-13 21:23     ` Dan Williams
  2023-06-13 22:32     ` Dan Williams
  1 sibling, 0 replies; 64+ messages in thread
From: Dan Williams @ 2023-06-13 21:23 UTC (permalink / raw)
  To: Jonathan Cameron, Dan Williams; +Cc: linux-cxl, ira.weiny, navneet.singh

Jonathan Cameron wrote:
> On Sun, 04 Jun 2023 16:32:10 -0700
> Dan Williams <dan.j.williams@intel.com> wrote:
> 
> > In preparation for device-memory region creation, arrange for decoders
> > of CXL_DEVTYPE_DEVMEM memdevs to default to CXL_DECODER_DEVMEM for their
> > target type.
> 
> Why?  CXL_DEVTYPE_DEVMEM might just be a non-class-code-compliant,
> HDM-H-only device.  I'd want those drivers to always set this explicitly.

As it stands /sys/bus/cxl/devices/decoderX.Y/target_type is read-only.
For a non-class-code-compliant HDM-H device, or even a device that
supports mixed HDM-H + HDM-DB operation depending on the decoder, there
would need to be some mechanism to communicate that at decoder
instantiation time.

So the default derived from CXL_DEVTYPE_* is indeed an arbitrary
placeholder until a use case for more precision comes along. In that
case I think target_type becomes writable, and "none" becomes a valid
return value.
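
Roughly, I would expect the writable form to look something like this
(sketch only, ignoring locking, commit-state checks, and the "none"
state; the string values match what target_type_show() emits today):

	static ssize_t target_type_store(struct device *dev,
					 struct device_attribute *attr,
					 const char *buf, size_t len)
	{
		struct cxl_decoder *cxld = to_cxl_decoder(dev);

		if (sysfs_streq(buf, "expander"))
			cxld->target_type = CXL_DECODER_HOSTMEM;
		else if (sysfs_streq(buf, "accelerator"))
			cxld->target_type = CXL_DECODER_DEVMEM;
		else
			return -EINVAL;
		return len;
	}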

> > 
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> > ---
> >  drivers/cxl/core/hdm.c |   14 ++++++++++++--
> >  1 file changed, 12 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> > index de8a3fb28331..ca3b99c6eacf 100644
> > --- a/drivers/cxl/core/hdm.c
> > +++ b/drivers/cxl/core/hdm.c
> > @@ -856,12 +856,22 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
> >  		}
> >  		port->commit_end = cxld->id;
> >  	} else {
> > -		/* unless / until type-2 drivers arrive, assume type-3 */
> >  		if (FIELD_GET(CXL_HDM_DECODER0_CTRL_TYPE, ctrl) == 0) {
> >  			ctrl |= CXL_HDM_DECODER0_CTRL_TYPE;
> >  			writel(ctrl, hdm + CXL_HDM_DECODER0_CTRL_OFFSET(which));
> 
> This is setting it to be HOSTMEM if it was previously DEVMEM and that
> makes it inconsistent with the state cached below.

Oh, definite oversight.

> Not sure why it was conditional in the first place - writing back the existing value
> should have been safe and would be less code...

Agree, that's busted, will fix.
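
For reference, I am thinking the fix reduces to deciding target_type
first and then writing the register unconditionally, roughly (sketch):

	/* ... target_type selection from the hunk above happens first ... */

	if (cxld->target_type == CXL_DECODER_HOSTMEM)
		ctrl |= CXL_HDM_DECODER0_CTRL_TYPE;
	else
		ctrl &= ~CXL_HDM_DECODER0_CTRL_TYPE;
	writel(ctrl, hdm + CXL_HDM_DECODER0_CTRL_OFFSET(which));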

> 
> >  		}
> > -		cxld->target_type = CXL_DECODER_HOSTMEM;
> > +		if (cxled) {
> > +			struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> > +			struct cxl_dev_state *cxlds = cxlmd->cxlds;
> > +
> > +			if (cxlds->type == CXL_DEVTYPE_CLASSMEM)
> > +				cxld->target_type = CXL_DECODER_HOSTMEM;
> > +			else
> > +				cxld->target_type = CXL_DECODER_DEVMEM;
> > +		} else {
> > +			/* To be overridden by region type at commit time */
> > +			cxld->target_type = CXL_DECODER_HOSTMEM;
> > +		}
> >  	}
> >  	rc = eiw_to_ways(FIELD_GET(CXL_HDM_DECODER0_CTRL_IW_MASK, ctrl),
> >  			  &cxld->interleave_ways);
> > 
> 



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 01/19] cxl/regs: Clarify when a 'struct cxl_register_map' is input vs output
  2023-06-04 23:31 ` [PATCH 01/19] cxl/regs: Clarify when a 'struct cxl_register_map' is input vs output Dan Williams
  2023-06-05  8:46   ` Jonathan Cameron
@ 2023-06-13 22:03   ` Dave Jiang
  1 sibling, 0 replies; 64+ messages in thread
From: Dave Jiang @ 2023-06-13 22:03 UTC (permalink / raw)
  To: Dan Williams, linux-cxl; +Cc: ira.weiny, navneet.singh


On 6/4/23 16:31, Dan Williams wrote:
> The @map parameter to cxl_probe_X_registers() is filled in with the
> mapping parameters of the register block. The @map parameter to
> cxl_map_X_registers() only reads that information to perform the
> mapping. Mark @map const for cxl_map_X_registers() to clarify that it is
> only an input to those helpers.
>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>


Reviewed-by: Dave Jiang <dave.jiang@intel.com>


> ---
>   drivers/cxl/core/regs.c |    8 ++++----
>   drivers/cxl/cxl.h       |    4 ++--
>   2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
> index 1476a0299c9b..52d1dbeda527 100644
> --- a/drivers/cxl/core/regs.c
> +++ b/drivers/cxl/core/regs.c
> @@ -200,10 +200,10 @@ void __iomem *devm_cxl_iomap_block(struct device *dev, resource_size_t addr,
>   }
>   
>   int cxl_map_component_regs(struct device *dev, struct cxl_component_regs *regs,
> -			   struct cxl_register_map *map, unsigned long map_mask)
> +			   const struct cxl_register_map *map, unsigned long map_mask)
>   {
>   	struct mapinfo {
> -		struct cxl_reg_map *rmap;
> +		const struct cxl_reg_map *rmap;
>   		void __iomem **addr;
>   	} mapinfo[] = {
>   		{ &map->component_map.hdm_decoder, &regs->hdm_decoder },
> @@ -233,11 +233,11 @@ EXPORT_SYMBOL_NS_GPL(cxl_map_component_regs, CXL);
>   
>   int cxl_map_device_regs(struct device *dev,
>   			struct cxl_device_regs *regs,
> -			struct cxl_register_map *map)
> +			const struct cxl_register_map *map)
>   {
>   	resource_size_t phys_addr = map->resource;
>   	struct mapinfo {
> -		struct cxl_reg_map *rmap;
> +		const struct cxl_reg_map *rmap;
>   		void __iomem **addr;
>   	} mapinfo[] = {
>   		{ &map->device_map.status, &regs->status, },
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index f93a28538962..dfc94e76c7d6 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -254,10 +254,10 @@ void cxl_probe_component_regs(struct device *dev, void __iomem *base,
>   void cxl_probe_device_regs(struct device *dev, void __iomem *base,
>   			   struct cxl_device_reg_map *map);
>   int cxl_map_component_regs(struct device *dev, struct cxl_component_regs *regs,
> -			   struct cxl_register_map *map,
> +			   const struct cxl_register_map *map,
>   			   unsigned long map_mask);
>   int cxl_map_device_regs(struct device *dev, struct cxl_device_regs *regs,
> -			struct cxl_register_map *map);
> +			const struct cxl_register_map *map);
>   
>   enum cxl_regloc_type;
>   int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 02/19] tools/testing/cxl: Remove unused @cxlds argument
  2023-06-04 23:31 ` [PATCH 02/19] tools/testing/cxl: Remove unused @cxlds argument Dan Williams
  2023-06-06 10:53   ` Jonathan Cameron
@ 2023-06-13 22:08   ` Dave Jiang
  1 sibling, 0 replies; 64+ messages in thread
From: Dave Jiang @ 2023-06-13 22:08 UTC (permalink / raw)
  To: Dan Williams, linux-cxl; +Cc: ira.weiny, navneet.singh


On 6/4/23 16:31, Dan Williams wrote:
> In preparation for plumbing a 'struct cxl_memdev_state' as a superset of
> a 'struct cxl_dev_state' cleanup the usage of @cxlds in the unit test
> infrastructure.
>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Reviewed-by: Dave Jiang <dave.jiang@intel.com>


> ---
>   tools/testing/cxl/test/mem.c |   86 +++++++++++++++++++-----------------------
>   1 file changed, 39 insertions(+), 47 deletions(-)
>
> diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
> index 34b48027b3de..bdaf086d994e 100644
> --- a/tools/testing/cxl/test/mem.c
> +++ b/tools/testing/cxl/test/mem.c
> @@ -180,8 +180,7 @@ static void mes_add_event(struct mock_event_store *mes,
>   	log->nr_events++;
>   }
>   
> -static int mock_get_event(struct cxl_dev_state *cxlds,
> -			  struct cxl_mbox_cmd *cmd)
> +static int mock_get_event(struct device *dev, struct cxl_mbox_cmd *cmd)
>   {
>   	struct cxl_get_event_payload *pl;
>   	struct mock_event_log *log;
> @@ -201,7 +200,7 @@ static int mock_get_event(struct cxl_dev_state *cxlds,
>   
>   	memset(cmd->payload_out, 0, cmd->size_out);
>   
> -	log = event_find_log(cxlds->dev, log_type);
> +	log = event_find_log(dev, log_type);
>   	if (!log || event_log_empty(log))
>   		return 0;
>   
> @@ -234,8 +233,7 @@ static int mock_get_event(struct cxl_dev_state *cxlds,
>   	return 0;
>   }
>   
> -static int mock_clear_event(struct cxl_dev_state *cxlds,
> -			    struct cxl_mbox_cmd *cmd)
> +static int mock_clear_event(struct device *dev, struct cxl_mbox_cmd *cmd)
>   {
>   	struct cxl_mbox_clear_event_payload *pl = cmd->payload_in;
>   	struct mock_event_log *log;
> @@ -246,7 +244,7 @@ static int mock_clear_event(struct cxl_dev_state *cxlds,
>   	if (log_type >= CXL_EVENT_TYPE_MAX)
>   		return -EINVAL;
>   
> -	log = event_find_log(cxlds->dev, log_type);
> +	log = event_find_log(dev, log_type);
>   	if (!log)
>   		return 0; /* No mock data in this log */
>   
> @@ -256,7 +254,7 @@ static int mock_clear_event(struct cxl_dev_state *cxlds,
>   	 * However, this is not good behavior for the host so test it.
>   	 */
>   	if (log->clear_idx + pl->nr_recs > log->cur_idx) {
> -		dev_err(cxlds->dev,
> +		dev_err(dev,
>   			"Attempting to clear more events than returned!\n");
>   		return -EINVAL;
>   	}
> @@ -266,7 +264,7 @@ static int mock_clear_event(struct cxl_dev_state *cxlds,
>   	     nr < pl->nr_recs;
>   	     nr++, handle++) {
>   		if (handle != le16_to_cpu(pl->handles[nr])) {
> -			dev_err(cxlds->dev, "Clearing events out of order\n");
> +			dev_err(dev, "Clearing events out of order\n");
>   			return -EINVAL;
>   		}
>   	}
> @@ -477,7 +475,7 @@ static int mock_get_log(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
>   	return 0;
>   }
>   
> -static int mock_rcd_id(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
> +static int mock_rcd_id(struct cxl_mbox_cmd *cmd)
>   {
>   	struct cxl_mbox_identify id = {
>   		.fw_revision = { "mock fw v1 " },
> @@ -495,7 +493,7 @@ static int mock_rcd_id(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
>   	return 0;
>   }
>   
> -static int mock_id(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
> +static int mock_id(struct cxl_mbox_cmd *cmd)
>   {
>   	struct cxl_mbox_identify id = {
>   		.fw_revision = { "mock fw v1 " },
> @@ -517,8 +515,7 @@ static int mock_id(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
>   	return 0;
>   }
>   
> -static int mock_partition_info(struct cxl_dev_state *cxlds,
> -			       struct cxl_mbox_cmd *cmd)
> +static int mock_partition_info(struct cxl_mbox_cmd *cmd)
>   {
>   	struct cxl_mbox_get_partition_info pi = {
>   		.active_volatile_cap =
> @@ -535,11 +532,9 @@ static int mock_partition_info(struct cxl_dev_state *cxlds,
>   	return 0;
>   }
>   
> -static int mock_get_security_state(struct cxl_dev_state *cxlds,
> +static int mock_get_security_state(struct cxl_mockmem_data *mdata,
>   				   struct cxl_mbox_cmd *cmd)
>   {
> -	struct cxl_mockmem_data *mdata = dev_get_drvdata(cxlds->dev);
> -
>   	if (cmd->size_in)
>   		return -EINVAL;
>   
> @@ -569,9 +564,9 @@ static void user_plimit_check(struct cxl_mockmem_data *mdata)
>   		mdata->security_state |= CXL_PMEM_SEC_STATE_USER_PLIMIT;
>   }
>   
> -static int mock_set_passphrase(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
> +static int mock_set_passphrase(struct cxl_mockmem_data *mdata,
> +			       struct cxl_mbox_cmd *cmd)
>   {
> -	struct cxl_mockmem_data *mdata = dev_get_drvdata(cxlds->dev);
>   	struct cxl_set_pass *set_pass;
>   
>   	if (cmd->size_in != sizeof(*set_pass))
> @@ -629,9 +624,9 @@ static int mock_set_passphrase(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd
>   	return -EINVAL;
>   }
>   
> -static int mock_disable_passphrase(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
> +static int mock_disable_passphrase(struct cxl_mockmem_data *mdata,
> +				   struct cxl_mbox_cmd *cmd)
>   {
> -	struct cxl_mockmem_data *mdata = dev_get_drvdata(cxlds->dev);
>   	struct cxl_disable_pass *dis_pass;
>   
>   	if (cmd->size_in != sizeof(*dis_pass))
> @@ -700,10 +695,9 @@ static int mock_disable_passphrase(struct cxl_dev_state *cxlds, struct cxl_mbox_
>   	return 0;
>   }
>   
> -static int mock_freeze_security(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
> +static int mock_freeze_security(struct cxl_mockmem_data *mdata,
> +				struct cxl_mbox_cmd *cmd)
>   {
> -	struct cxl_mockmem_data *mdata = dev_get_drvdata(cxlds->dev);
> -
>   	if (cmd->size_in != 0)
>   		return -EINVAL;
>   
> @@ -717,10 +711,9 @@ static int mock_freeze_security(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd
>   	return 0;
>   }
>   
> -static int mock_unlock_security(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
> +static int mock_unlock_security(struct cxl_mockmem_data *mdata,
> +				struct cxl_mbox_cmd *cmd)
>   {
> -	struct cxl_mockmem_data *mdata = dev_get_drvdata(cxlds->dev);
> -
>   	if (cmd->size_in != NVDIMM_PASSPHRASE_LEN)
>   		return -EINVAL;
>   
> @@ -759,10 +752,9 @@ static int mock_unlock_security(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd
>   	return 0;
>   }
>   
> -static int mock_passphrase_secure_erase(struct cxl_dev_state *cxlds,
> +static int mock_passphrase_secure_erase(struct cxl_mockmem_data *mdata,
>   					struct cxl_mbox_cmd *cmd)
>   {
> -	struct cxl_mockmem_data *mdata = dev_get_drvdata(cxlds->dev);
>   	struct cxl_pass_erase *erase;
>   
>   	if (cmd->size_in != sizeof(*erase))
> @@ -858,10 +850,10 @@ static int mock_passphrase_secure_erase(struct cxl_dev_state *cxlds,
>   	return 0;
>   }
>   
> -static int mock_get_lsa(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
> +static int mock_get_lsa(struct cxl_mockmem_data *mdata,
> +			struct cxl_mbox_cmd *cmd)
>   {
>   	struct cxl_mbox_get_lsa *get_lsa = cmd->payload_in;
> -	struct cxl_mockmem_data *mdata = dev_get_drvdata(cxlds->dev);
>   	void *lsa = mdata->lsa;
>   	u32 offset, length;
>   
> @@ -878,10 +870,10 @@ static int mock_get_lsa(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
>   	return 0;
>   }
>   
> -static int mock_set_lsa(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
> +static int mock_set_lsa(struct cxl_mockmem_data *mdata,
> +			struct cxl_mbox_cmd *cmd)
>   {
>   	struct cxl_mbox_set_lsa *set_lsa = cmd->payload_in;
> -	struct cxl_mockmem_data *mdata = dev_get_drvdata(cxlds->dev);
>   	void *lsa = mdata->lsa;
>   	u32 offset, length;
>   
> @@ -896,8 +888,7 @@ static int mock_set_lsa(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
>   	return 0;
>   }
>   
> -static int mock_health_info(struct cxl_dev_state *cxlds,
> -			    struct cxl_mbox_cmd *cmd)
> +static int mock_health_info(struct cxl_mbox_cmd *cmd)
>   {
>   	struct cxl_mbox_health_info health_info = {
>   		/* set flags for maint needed, perf degraded, hw replacement */
> @@ -1117,6 +1108,7 @@ ATTRIBUTE_GROUPS(cxl_mock_mem_core);
>   static int cxl_mock_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
>   {
>   	struct device *dev = cxlds->dev;
> +	struct cxl_mockmem_data *mdata = dev_get_drvdata(dev);
>   	int rc = -EIO;
>   
>   	switch (cmd->opcode) {
> @@ -1131,45 +1123,45 @@ static int cxl_mock_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *
>   		break;
>   	case CXL_MBOX_OP_IDENTIFY:
>   		if (cxlds->rcd)
> -			rc = mock_rcd_id(cxlds, cmd);
> +			rc = mock_rcd_id(cmd);
>   		else
> -			rc = mock_id(cxlds, cmd);
> +			rc = mock_id(cmd);
>   		break;
>   	case CXL_MBOX_OP_GET_LSA:
> -		rc = mock_get_lsa(cxlds, cmd);
> +		rc = mock_get_lsa(mdata, cmd);
>   		break;
>   	case CXL_MBOX_OP_GET_PARTITION_INFO:
> -		rc = mock_partition_info(cxlds, cmd);
> +		rc = mock_partition_info(cmd);
>   		break;
>   	case CXL_MBOX_OP_GET_EVENT_RECORD:
> -		rc = mock_get_event(cxlds, cmd);
> +		rc = mock_get_event(dev, cmd);
>   		break;
>   	case CXL_MBOX_OP_CLEAR_EVENT_RECORD:
> -		rc = mock_clear_event(cxlds, cmd);
> +		rc = mock_clear_event(dev, cmd);
>   		break;
>   	case CXL_MBOX_OP_SET_LSA:
> -		rc = mock_set_lsa(cxlds, cmd);
> +		rc = mock_set_lsa(mdata, cmd);
>   		break;
>   	case CXL_MBOX_OP_GET_HEALTH_INFO:
> -		rc = mock_health_info(cxlds, cmd);
> +		rc = mock_health_info(cmd);
>   		break;
>   	case CXL_MBOX_OP_GET_SECURITY_STATE:
> -		rc = mock_get_security_state(cxlds, cmd);
> +		rc = mock_get_security_state(mdata, cmd);
>   		break;
>   	case CXL_MBOX_OP_SET_PASSPHRASE:
> -		rc = mock_set_passphrase(cxlds, cmd);
> +		rc = mock_set_passphrase(mdata, cmd);
>   		break;
>   	case CXL_MBOX_OP_DISABLE_PASSPHRASE:
> -		rc = mock_disable_passphrase(cxlds, cmd);
> +		rc = mock_disable_passphrase(mdata, cmd);
>   		break;
>   	case CXL_MBOX_OP_FREEZE_SECURITY:
> -		rc = mock_freeze_security(cxlds, cmd);
> +		rc = mock_freeze_security(mdata, cmd);
>   		break;
>   	case CXL_MBOX_OP_UNLOCK:
> -		rc = mock_unlock_security(cxlds, cmd);
> +		rc = mock_unlock_security(mdata, cmd);
>   		break;
>   	case CXL_MBOX_OP_PASSPHRASE_SECURE_ERASE:
> -		rc = mock_passphrase_secure_erase(cxlds, cmd);
> +		rc = mock_passphrase_secure_erase(mdata, cmd);
>   		break;
>   	case CXL_MBOX_OP_GET_POISON:
>   		rc = mock_get_poison(cxlds, cmd);
>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 03/19] cxl/mbox: Move mailbox related driver state to its own data structure
  2023-06-04 23:31 ` [PATCH 03/19] cxl/mbox: Move mailbox related driver state to its own data structure Dan Williams
  2023-06-06 11:10   ` Jonathan Cameron
@ 2023-06-13 22:15   ` Dave Jiang
  1 sibling, 0 replies; 64+ messages in thread
From: Dave Jiang @ 2023-06-13 22:15 UTC (permalink / raw)
  To: Dan Williams, linux-cxl; +Cc: ira.weiny, navneet.singh


On 6/4/23 16:31, Dan Williams wrote:
> 'struct cxl_dev_state' makes too many assumptions about the capabilities
> of a CXL device. In particular it assumes a CXL device has a mailbox and
> all of the infrastructure and state that comes along with that.
>
> In preparation for supporting accelerator / Type-2 devices that may not
> have a mailbox, and to keep the core context structure minimal in
> general, move mailbox functionality into 'struct cxl_memdev_state', a
> super-set of 'struct cxl_dev_state'.
>
> With this reorganization, CXL devices that support HDM decoder mapping,
> but not other general-expander / Type-3 capabilities, can enable just
> that subset without the rest of the mailbox infrastructure coming along
> for the ride.
>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
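
Side note for anyone skimming the new layout: below is a minimal,
standalone userspace sketch of the containment pattern this patch
introduces. The type and helper names mirror the patch; the trimmed
struct bodies and the simplified container_of() are illustrative
stand-ins, not the kernel definitions.

/*
 * Sketch: 'struct cxl_memdev_state' embeds the generic
 * 'struct cxl_dev_state', and a container_of()-style helper recovers
 * the outer, mailbox-capable type from the embedded core state.
 */
#include <stddef.h>
#include <stdio.h>

/* Trimmed stand-in for the generic Type-2/Type-3 core state */
struct cxl_dev_state {
	const char *dev_name;		/* stand-in for struct device *dev */
};

/* Trimmed stand-in for the Type-3 / mailbox super-set */
struct cxl_memdev_state {
	struct cxl_dev_state cxlds;	/* embedded core state */
	size_t payload_size;
};

/* Simplified, non-type-checked version of the kernel macro */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static struct cxl_memdev_state *
to_cxl_memdev_state(struct cxl_dev_state *cxlds)
{
	return container_of(cxlds, struct cxl_memdev_state, cxlds);
}

int main(void)
{
	struct cxl_memdev_state mds = {
		.cxlds = { .dev_name = "mem0" },
		.payload_size = 256,
	};
	struct cxl_dev_state *cxlds = &mds.cxlds;

	/*
	 * Code that holds only the generic state can get back to the
	 * super-set, but only for devices known to have been created
	 * as a cxl_memdev_state.
	 */
	printf("%s: payload_size=%zu\n", cxlds->dev_name,
	       to_cxl_memdev_state(cxlds)->payload_size);
	return 0;
}

The kernel version has the same shape, just with the type-checked
container_of(); the invariant that matters is that only mailbox-capable
devices are ever downcast this way.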


> ---
>   drivers/cxl/core/mbox.c      |  276 ++++++++++++++++++++++--------------------
>   drivers/cxl/core/memdev.c    |   38 +++---
>   drivers/cxl/cxlmem.h         |   89 ++++++++------
>   drivers/cxl/mem.c            |   10 +-
>   drivers/cxl/pci.c            |  114 +++++++++--------
>   drivers/cxl/pmem.c           |   35 +++--
>   drivers/cxl/security.c       |   24 ++--
>   tools/testing/cxl/test/mem.c |   43 ++++---
>   8 files changed, 338 insertions(+), 291 deletions(-)
>
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index bea9cf31a12d..14805dae5a74 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -182,7 +182,7 @@ static const char *cxl_mem_opcode_to_name(u16 opcode)
>   
>   /**
>    * cxl_internal_send_cmd() - Kernel internal interface to send a mailbox command
> - * @cxlds: The device data for the operation
> + * @mds: The driver data for the operation
>    * @mbox_cmd: initialized command to execute
>    *
>    * Context: Any context.
> @@ -198,19 +198,19 @@ static const char *cxl_mem_opcode_to_name(u16 opcode)
>    * error. While this distinction can be useful for commands from userspace, the
>    * kernel will only be able to use results when both are successful.
>    */
> -int cxl_internal_send_cmd(struct cxl_dev_state *cxlds,
> +int cxl_internal_send_cmd(struct cxl_memdev_state *mds,
>   			  struct cxl_mbox_cmd *mbox_cmd)
>   {
>   	size_t out_size, min_out;
>   	int rc;
>   
> -	if (mbox_cmd->size_in > cxlds->payload_size ||
> -	    mbox_cmd->size_out > cxlds->payload_size)
> +	if (mbox_cmd->size_in > mds->payload_size ||
> +	    mbox_cmd->size_out > mds->payload_size)
>   		return -E2BIG;
>   
>   	out_size = mbox_cmd->size_out;
>   	min_out = mbox_cmd->min_out;
> -	rc = cxlds->mbox_send(cxlds, mbox_cmd);
> +	rc = mds->mbox_send(mds, mbox_cmd);
>   	/*
>   	 * EIO is reserved for a payload size mismatch and mbox_send()
>   	 * may not return this error.
> @@ -297,7 +297,7 @@ static bool cxl_payload_from_user_allowed(u16 opcode, void *payload_in)
>   }
>   
>   static int cxl_mbox_cmd_ctor(struct cxl_mbox_cmd *mbox,
> -			     struct cxl_dev_state *cxlds, u16 opcode,
> +			     struct cxl_memdev_state *mds, u16 opcode,
>   			     size_t in_size, size_t out_size, u64 in_payload)
>   {
>   	*mbox = (struct cxl_mbox_cmd) {
> @@ -312,7 +312,7 @@ static int cxl_mbox_cmd_ctor(struct cxl_mbox_cmd *mbox,
>   			return PTR_ERR(mbox->payload_in);
>   
>   		if (!cxl_payload_from_user_allowed(opcode, mbox->payload_in)) {
> -			dev_dbg(cxlds->dev, "%s: input payload not allowed\n",
> +			dev_dbg(mds->cxlds.dev, "%s: input payload not allowed\n",
>   				cxl_mem_opcode_to_name(opcode));
>   			kvfree(mbox->payload_in);
>   			return -EBUSY;
> @@ -321,7 +321,7 @@ static int cxl_mbox_cmd_ctor(struct cxl_mbox_cmd *mbox,
>   
>   	/* Prepare to handle a full payload for variable sized output */
>   	if (out_size == CXL_VARIABLE_PAYLOAD)
> -		mbox->size_out = cxlds->payload_size;
> +		mbox->size_out = mds->payload_size;
>   	else
>   		mbox->size_out = out_size;
>   
> @@ -343,7 +343,7 @@ static void cxl_mbox_cmd_dtor(struct cxl_mbox_cmd *mbox)
>   
>   static int cxl_to_mem_cmd_raw(struct cxl_mem_command *mem_cmd,
>   			      const struct cxl_send_command *send_cmd,
> -			      struct cxl_dev_state *cxlds)
> +			      struct cxl_memdev_state *mds)
>   {
>   	if (send_cmd->raw.rsvd)
>   		return -EINVAL;
> @@ -353,13 +353,13 @@ static int cxl_to_mem_cmd_raw(struct cxl_mem_command *mem_cmd,
>   	 * gets passed along without further checking, so it must be
>   	 * validated here.
>   	 */
> -	if (send_cmd->out.size > cxlds->payload_size)
> +	if (send_cmd->out.size > mds->payload_size)
>   		return -EINVAL;
>   
>   	if (!cxl_mem_raw_command_allowed(send_cmd->raw.opcode))
>   		return -EPERM;
>   
> -	dev_WARN_ONCE(cxlds->dev, true, "raw command path used\n");
> +	dev_WARN_ONCE(mds->cxlds.dev, true, "raw command path used\n");
>   
>   	*mem_cmd = (struct cxl_mem_command) {
>   		.info = {
> @@ -375,7 +375,7 @@ static int cxl_to_mem_cmd_raw(struct cxl_mem_command *mem_cmd,
>   
>   static int cxl_to_mem_cmd(struct cxl_mem_command *mem_cmd,
>   			  const struct cxl_send_command *send_cmd,
> -			  struct cxl_dev_state *cxlds)
> +			  struct cxl_memdev_state *mds)
>   {
>   	struct cxl_mem_command *c = &cxl_mem_commands[send_cmd->id];
>   	const struct cxl_command_info *info = &c->info;
> @@ -390,11 +390,11 @@ static int cxl_to_mem_cmd(struct cxl_mem_command *mem_cmd,
>   		return -EINVAL;
>   
>   	/* Check that the command is enabled for hardware */
> -	if (!test_bit(info->id, cxlds->enabled_cmds))
> +	if (!test_bit(info->id, mds->enabled_cmds))
>   		return -ENOTTY;
>   
>   	/* Check that the command is not claimed for exclusive kernel use */
> -	if (test_bit(info->id, cxlds->exclusive_cmds))
> +	if (test_bit(info->id, mds->exclusive_cmds))
>   		return -EBUSY;
>   
>   	/* Check the input buffer is the expected size */
> @@ -423,7 +423,7 @@ static int cxl_to_mem_cmd(struct cxl_mem_command *mem_cmd,
>   /**
>    * cxl_validate_cmd_from_user() - Check fields for CXL_MEM_SEND_COMMAND.
>    * @mbox_cmd: Sanitized and populated &struct cxl_mbox_cmd.
> - * @cxlds: The device data for the operation
> + * @mds: The driver data for the operation
>    * @send_cmd: &struct cxl_send_command copied in from userspace.
>    *
>    * Return:
> @@ -438,7 +438,7 @@ static int cxl_to_mem_cmd(struct cxl_mem_command *mem_cmd,
>    * safe to send to the hardware.
>    */
>   static int cxl_validate_cmd_from_user(struct cxl_mbox_cmd *mbox_cmd,
> -				      struct cxl_dev_state *cxlds,
> +				      struct cxl_memdev_state *mds,
>   				      const struct cxl_send_command *send_cmd)
>   {
>   	struct cxl_mem_command mem_cmd;
> @@ -452,20 +452,20 @@ static int cxl_validate_cmd_from_user(struct cxl_mbox_cmd *mbox_cmd,
>   	 * supports, but output can be arbitrarily large (simply write out as
>   	 * much data as the hardware provides).
>   	 */
> -	if (send_cmd->in.size > cxlds->payload_size)
> +	if (send_cmd->in.size > mds->payload_size)
>   		return -EINVAL;
>   
>   	/* Sanitize and construct a cxl_mem_command */
>   	if (send_cmd->id == CXL_MEM_COMMAND_ID_RAW)
> -		rc = cxl_to_mem_cmd_raw(&mem_cmd, send_cmd, cxlds);
> +		rc = cxl_to_mem_cmd_raw(&mem_cmd, send_cmd, mds);
>   	else
> -		rc = cxl_to_mem_cmd(&mem_cmd, send_cmd, cxlds);
> +		rc = cxl_to_mem_cmd(&mem_cmd, send_cmd, mds);
>   
>   	if (rc)
>   		return rc;
>   
>   	/* Sanitize and construct a cxl_mbox_cmd */
> -	return cxl_mbox_cmd_ctor(mbox_cmd, cxlds, mem_cmd.opcode,
> +	return cxl_mbox_cmd_ctor(mbox_cmd, mds, mem_cmd.opcode,
>   				 mem_cmd.info.size_in, mem_cmd.info.size_out,
>   				 send_cmd->in.payload);
>   }
> @@ -473,6 +473,7 @@ static int cxl_validate_cmd_from_user(struct cxl_mbox_cmd *mbox_cmd,
>   int cxl_query_cmd(struct cxl_memdev *cxlmd,
>   		  struct cxl_mem_query_commands __user *q)
>   {
> +	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>   	struct device *dev = &cxlmd->dev;
>   	struct cxl_mem_command *cmd;
>   	u32 n_commands;
> @@ -494,9 +495,9 @@ int cxl_query_cmd(struct cxl_memdev *cxlmd,
>   	cxl_for_each_cmd(cmd) {
>   		struct cxl_command_info info = cmd->info;
>   
> -		if (test_bit(info.id, cxlmd->cxlds->enabled_cmds))
> +		if (test_bit(info.id, mds->enabled_cmds))
>   			info.flags |= CXL_MEM_COMMAND_FLAG_ENABLED;
> -		if (test_bit(info.id, cxlmd->cxlds->exclusive_cmds))
> +		if (test_bit(info.id, mds->exclusive_cmds))
>   			info.flags |= CXL_MEM_COMMAND_FLAG_EXCLUSIVE;
>   
>   		if (copy_to_user(&q->commands[j++], &info, sizeof(info)))
> @@ -511,7 +512,7 @@ int cxl_query_cmd(struct cxl_memdev *cxlmd,
>   
>   /**
>    * handle_mailbox_cmd_from_user() - Dispatch a mailbox command for userspace.
> - * @cxlds: The device data for the operation
> + * @mds: The driver data for the operation
>    * @mbox_cmd: The validated mailbox command.
>    * @out_payload: Pointer to userspace's output payload.
>    * @size_out: (Input) Max payload size to copy out.
> @@ -532,12 +533,12 @@ int cxl_query_cmd(struct cxl_memdev *cxlmd,
>    *
>    * See cxl_send_cmd().
>    */
> -static int handle_mailbox_cmd_from_user(struct cxl_dev_state *cxlds,
> +static int handle_mailbox_cmd_from_user(struct cxl_memdev_state *mds,
>   					struct cxl_mbox_cmd *mbox_cmd,
>   					u64 out_payload, s32 *size_out,
>   					u32 *retval)
>   {
> -	struct device *dev = cxlds->dev;
> +	struct device *dev = mds->cxlds.dev;
>   	int rc;
>   
>   	dev_dbg(dev,
> @@ -547,7 +548,7 @@ static int handle_mailbox_cmd_from_user(struct cxl_dev_state *cxlds,
>   		cxl_mem_opcode_to_name(mbox_cmd->opcode),
>   		mbox_cmd->opcode, mbox_cmd->size_in);
>   
> -	rc = cxlds->mbox_send(cxlds, mbox_cmd);
> +	rc = mds->mbox_send(mds, mbox_cmd);
>   	if (rc)
>   		goto out;
>   
> @@ -576,7 +577,7 @@ static int handle_mailbox_cmd_from_user(struct cxl_dev_state *cxlds,
>   
>   int cxl_send_cmd(struct cxl_memdev *cxlmd, struct cxl_send_command __user *s)
>   {
> -	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> +	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>   	struct device *dev = &cxlmd->dev;
>   	struct cxl_send_command send;
>   	struct cxl_mbox_cmd mbox_cmd;
> @@ -587,11 +588,11 @@ int cxl_send_cmd(struct cxl_memdev *cxlmd, struct cxl_send_command __user *s)
>   	if (copy_from_user(&send, s, sizeof(send)))
>   		return -EFAULT;
>   
> -	rc = cxl_validate_cmd_from_user(&mbox_cmd, cxlmd->cxlds, &send);
> +	rc = cxl_validate_cmd_from_user(&mbox_cmd, mds, &send);
>   	if (rc)
>   		return rc;
>   
> -	rc = handle_mailbox_cmd_from_user(cxlds, &mbox_cmd, send.out.payload,
> +	rc = handle_mailbox_cmd_from_user(mds, &mbox_cmd, send.out.payload,
>   					  &send.out.size, &send.retval);
>   	if (rc)
>   		return rc;
> @@ -602,13 +603,14 @@ int cxl_send_cmd(struct cxl_memdev *cxlmd, struct cxl_send_command __user *s)
>   	return 0;
>   }
>   
> -static int cxl_xfer_log(struct cxl_dev_state *cxlds, uuid_t *uuid, u32 *size, u8 *out)
> +static int cxl_xfer_log(struct cxl_memdev_state *mds, uuid_t *uuid,
> +			u32 *size, u8 *out)
>   {
>   	u32 remaining = *size;
>   	u32 offset = 0;
>   
>   	while (remaining) {
> -		u32 xfer_size = min_t(u32, remaining, cxlds->payload_size);
> +		u32 xfer_size = min_t(u32, remaining, mds->payload_size);
>   		struct cxl_mbox_cmd mbox_cmd;
>   		struct cxl_mbox_get_log log;
>   		int rc;
> @@ -627,7 +629,7 @@ static int cxl_xfer_log(struct cxl_dev_state *cxlds, uuid_t *uuid, u32 *size, u8
>   			.payload_out = out,
>   		};
>   
> -		rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +		rc = cxl_internal_send_cmd(mds, &mbox_cmd);
>   
>   		/*
>   		 * The output payload length that indicates the number
> @@ -654,17 +656,18 @@ static int cxl_xfer_log(struct cxl_dev_state *cxlds, uuid_t *uuid, u32 *size, u8
>   
>   /**
>    * cxl_walk_cel() - Walk through the Command Effects Log.
> - * @cxlds: The device data for the operation
> + * @mds: The driver data for the operation
>    * @size: Length of the Command Effects Log.
>    * @cel: CEL
>    *
>    * Iterate over each entry in the CEL and determine if the driver supports the
>    * command. If so, the command is enabled for the device and can be used later.
>    */
> -static void cxl_walk_cel(struct cxl_dev_state *cxlds, size_t size, u8 *cel)
> +static void cxl_walk_cel(struct cxl_memdev_state *mds, size_t size, u8 *cel)
>   {
>   	struct cxl_cel_entry *cel_entry;
>   	const int cel_entries = size / sizeof(*cel_entry);
> +	struct device *dev = mds->cxlds.dev;
>   	int i;
>   
>   	cel_entry = (struct cxl_cel_entry *) cel;
> @@ -674,39 +677,40 @@ static void cxl_walk_cel(struct cxl_dev_state *cxlds, size_t size, u8 *cel)
>   		struct cxl_mem_command *cmd = cxl_mem_find_command(opcode);
>   
>   		if (!cmd && !cxl_is_poison_command(opcode)) {
> -			dev_dbg(cxlds->dev,
> +			dev_dbg(dev,
>   				"Opcode 0x%04x unsupported by driver\n", opcode);
>   			continue;
>   		}
>   
>   		if (cmd)
> -			set_bit(cmd->info.id, cxlds->enabled_cmds);
> +			set_bit(cmd->info.id, mds->enabled_cmds);
>   
>   		if (cxl_is_poison_command(opcode))
> -			cxl_set_poison_cmd_enabled(&cxlds->poison, opcode);
> +			cxl_set_poison_cmd_enabled(&mds->poison, opcode);
>   
> -		dev_dbg(cxlds->dev, "Opcode 0x%04x enabled\n", opcode);
> +		dev_dbg(dev, "Opcode 0x%04x enabled\n", opcode);
>   	}
>   }
>   
> -static struct cxl_mbox_get_supported_logs *cxl_get_gsl(struct cxl_dev_state *cxlds)
> +static struct cxl_mbox_get_supported_logs *
> +cxl_get_gsl(struct cxl_memdev_state *mds)
>   {
>   	struct cxl_mbox_get_supported_logs *ret;
>   	struct cxl_mbox_cmd mbox_cmd;
>   	int rc;
>   
> -	ret = kvmalloc(cxlds->payload_size, GFP_KERNEL);
> +	ret = kvmalloc(mds->payload_size, GFP_KERNEL);
>   	if (!ret)
>   		return ERR_PTR(-ENOMEM);
>   
>   	mbox_cmd = (struct cxl_mbox_cmd) {
>   		.opcode = CXL_MBOX_OP_GET_SUPPORTED_LOGS,
> -		.size_out = cxlds->payload_size,
> +		.size_out = mds->payload_size,
>   		.payload_out = ret,
>   		/* At least the record number field must be valid */
>   		.min_out = 2,
>   	};
> -	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +	rc = cxl_internal_send_cmd(mds, &mbox_cmd);
>   	if (rc < 0) {
>   		kvfree(ret);
>   		return ERR_PTR(rc);
> @@ -729,22 +733,22 @@ static const uuid_t log_uuid[] = {
>   
>   /**
>    * cxl_enumerate_cmds() - Enumerate commands for a device.
> - * @cxlds: The device data for the operation
> + * @mds: The driver data for the operation
>    *
>    * Returns 0 if enumerate completed successfully.
>    *
>    * CXL devices have optional support for certain commands. This function will
>    * determine the set of supported commands for the hardware and update the
> - * enabled_cmds bitmap in the @cxlds.
> + * enabled_cmds bitmap in the @mds.
>    */
> -int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
> +int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
>   {
>   	struct cxl_mbox_get_supported_logs *gsl;
> -	struct device *dev = cxlds->dev;
> +	struct device *dev = mds->cxlds.dev;
>   	struct cxl_mem_command *cmd;
>   	int i, rc;
>   
> -	gsl = cxl_get_gsl(cxlds);
> +	gsl = cxl_get_gsl(mds);
>   	if (IS_ERR(gsl))
>   		return PTR_ERR(gsl);
>   
> @@ -765,19 +769,19 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
>   			goto out;
>   		}
>   
> -		rc = cxl_xfer_log(cxlds, &uuid, &size, log);
> +		rc = cxl_xfer_log(mds, &uuid, &size, log);
>   		if (rc) {
>   			kvfree(log);
>   			goto out;
>   		}
>   
> -		cxl_walk_cel(cxlds, size, log);
> +		cxl_walk_cel(mds, size, log);
>   		kvfree(log);
>   
>   		/* In case CEL was bogus, enable some default commands. */
>   		cxl_for_each_cmd(cmd)
>   			if (cmd->flags & CXL_CMD_FLAG_FORCE_ENABLE)
> -				set_bit(cmd->info.id, cxlds->enabled_cmds);
> +				set_bit(cmd->info.id, mds->enabled_cmds);
>   
>   		/* Found the required CEL */
>   		rc = 0;
> @@ -838,7 +842,7 @@ static void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
>   	}
>   }
>   
> -static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
> +static int cxl_clear_event_record(struct cxl_memdev_state *mds,
>   				  enum cxl_event_log_type log,
>   				  struct cxl_get_event_payload *get_pl)
>   {
> @@ -852,9 +856,9 @@ static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
>   	int i;
>   
>   	/* Payload size may limit the max handles */
> -	if (pl_size > cxlds->payload_size) {
> -		max_handles = (cxlds->payload_size - sizeof(*payload)) /
> -				sizeof(__le16);
> +	if (pl_size > mds->payload_size) {
> +		max_handles = (mds->payload_size - sizeof(*payload)) /
> +			      sizeof(__le16);
>   		pl_size = struct_size(payload, handles, max_handles);
>   	}
>   
> @@ -879,12 +883,12 @@ static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
>   	i = 0;
>   	for (cnt = 0; cnt < total; cnt++) {
>   		payload->handles[i++] = get_pl->records[cnt].hdr.handle;
> -		dev_dbg(cxlds->dev, "Event log '%d': Clearing %u\n",
> -			log, le16_to_cpu(payload->handles[i]));
> +		dev_dbg(mds->cxlds.dev, "Event log '%d': Clearing %u\n", log,
> +			le16_to_cpu(payload->handles[i]));
>   
>   		if (i == max_handles) {
>   			payload->nr_recs = i;
> -			rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +			rc = cxl_internal_send_cmd(mds, &mbox_cmd);
>   			if (rc)
>   				goto free_pl;
>   			i = 0;
> @@ -895,7 +899,7 @@ static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
>   	if (i) {
>   		payload->nr_recs = i;
>   		mbox_cmd.size_in = struct_size(payload, handles, i);
> -		rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +		rc = cxl_internal_send_cmd(mds, &mbox_cmd);
>   		if (rc)
>   			goto free_pl;
>   	}
> @@ -905,32 +909,34 @@ static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
>   	return rc;
>   }
>   
> -static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
> +static void cxl_mem_get_records_log(struct cxl_memdev_state *mds,
>   				    enum cxl_event_log_type type)
>   {
> +	struct cxl_memdev *cxlmd = mds->cxlds.cxlmd;
> +	struct device *dev = mds->cxlds.dev;
>   	struct cxl_get_event_payload *payload;
>   	struct cxl_mbox_cmd mbox_cmd;
>   	u8 log_type = type;
>   	u16 nr_rec;
>   
> -	mutex_lock(&cxlds->event.log_lock);
> -	payload = cxlds->event.buf;
> +	mutex_lock(&mds->event.log_lock);
> +	payload = mds->event.buf;
>   
>   	mbox_cmd = (struct cxl_mbox_cmd) {
>   		.opcode = CXL_MBOX_OP_GET_EVENT_RECORD,
>   		.payload_in = &log_type,
>   		.size_in = sizeof(log_type),
>   		.payload_out = payload,
> -		.size_out = cxlds->payload_size,
> +		.size_out = mds->payload_size,
>   		.min_out = struct_size(payload, records, 0),
>   	};
>   
>   	do {
>   		int rc, i;
>   
> -		rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +		rc = cxl_internal_send_cmd(mds, &mbox_cmd);
>   		if (rc) {
> -			dev_err_ratelimited(cxlds->dev,
> +			dev_err_ratelimited(dev,
>   				"Event log '%d': Failed to query event records : %d",
>   				type, rc);
>   			break;
> @@ -941,27 +947,27 @@ static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
>   			break;
>   
>   		for (i = 0; i < nr_rec; i++)
> -			cxl_event_trace_record(cxlds->cxlmd, type,
> +			cxl_event_trace_record(cxlmd, type,
>   					       &payload->records[i]);
>   
>   		if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
> -			trace_cxl_overflow(cxlds->cxlmd, type, payload);
> +			trace_cxl_overflow(cxlmd, type, payload);
>   
> -		rc = cxl_clear_event_record(cxlds, type, payload);
> +		rc = cxl_clear_event_record(mds, type, payload);
>   		if (rc) {
> -			dev_err_ratelimited(cxlds->dev,
> +			dev_err_ratelimited(dev,
>   				"Event log '%d': Failed to clear events : %d",
>   				type, rc);
>   			break;
>   		}
>   	} while (nr_rec);
>   
> -	mutex_unlock(&cxlds->event.log_lock);
> +	mutex_unlock(&mds->event.log_lock);
>   }
>   
>   /**
>    * cxl_mem_get_event_records - Get Event Records from the device
> - * @cxlds: The device data for the operation
> + * @mds: The driver data for the operation
>    * @status: Event Status register value identifying which events are available.
>    *
>    * Retrieve all event records available on the device, report them as trace
> @@ -970,24 +976,24 @@ static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
>    * See CXL rev 3.0 @8.2.9.2.2 Get Event Records
>    * See CXL rev 3.0 @8.2.9.2.3 Clear Event Records
>    */
> -void cxl_mem_get_event_records(struct cxl_dev_state *cxlds, u32 status)
> +void cxl_mem_get_event_records(struct cxl_memdev_state *mds, u32 status)
>   {
> -	dev_dbg(cxlds->dev, "Reading event logs: %x\n", status);
> +	dev_dbg(mds->cxlds.dev, "Reading event logs: %x\n", status);
>   
>   	if (status & CXLDEV_EVENT_STATUS_FATAL)
> -		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FATAL);
> +		cxl_mem_get_records_log(mds, CXL_EVENT_TYPE_FATAL);
>   	if (status & CXLDEV_EVENT_STATUS_FAIL)
> -		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FAIL);
> +		cxl_mem_get_records_log(mds, CXL_EVENT_TYPE_FAIL);
>   	if (status & CXLDEV_EVENT_STATUS_WARN)
> -		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_WARN);
> +		cxl_mem_get_records_log(mds, CXL_EVENT_TYPE_WARN);
>   	if (status & CXLDEV_EVENT_STATUS_INFO)
> -		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_INFO);
> +		cxl_mem_get_records_log(mds, CXL_EVENT_TYPE_INFO);
>   }
>   EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
>   
>   /**
>    * cxl_mem_get_partition_info - Get partition info
> - * @cxlds: The device data for the operation
> + * @mds: The driver data for the operation
>    *
>    * Retrieve the current partition info for the device specified.  The active
>    * values are the current capacity in bytes.  If not 0, the 'next' values are
> @@ -997,7 +1003,7 @@ EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
>    *
>    * See CXL @8.2.9.5.2.1 Get Partition Info
>    */
> -static int cxl_mem_get_partition_info(struct cxl_dev_state *cxlds)
> +static int cxl_mem_get_partition_info(struct cxl_memdev_state *mds)
>   {
>   	struct cxl_mbox_get_partition_info pi;
>   	struct cxl_mbox_cmd mbox_cmd;
> @@ -1008,17 +1014,17 @@ static int cxl_mem_get_partition_info(struct cxl_dev_state *cxlds)
>   		.size_out = sizeof(pi),
>   		.payload_out = &pi,
>   	};
> -	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +	rc = cxl_internal_send_cmd(mds, &mbox_cmd);
>   	if (rc)
>   		return rc;
>   
> -	cxlds->active_volatile_bytes =
> +	mds->active_volatile_bytes =
>   		le64_to_cpu(pi.active_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
> -	cxlds->active_persistent_bytes =
> +	mds->active_persistent_bytes =
>   		le64_to_cpu(pi.active_persistent_cap) * CXL_CAPACITY_MULTIPLIER;
> -	cxlds->next_volatile_bytes =
> +	mds->next_volatile_bytes =
>   		le64_to_cpu(pi.next_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
> -	cxlds->next_persistent_bytes =
> +	mds->next_persistent_bytes =
>   		le64_to_cpu(pi.next_volatile_cap) * CXL_CAPACITY_MULTIPLIER;
>   
>   	return 0;
> @@ -1026,14 +1032,14 @@ static int cxl_mem_get_partition_info(struct cxl_dev_state *cxlds)
>   
>   /**
>    * cxl_dev_state_identify() - Send the IDENTIFY command to the device.
> - * @cxlds: The device data for the operation
> + * @mds: The driver data for the operation
>    *
>    * Return: 0 if identify was executed successfully or media not ready.
>    *
>    * This will dispatch the identify command to the device and on success populate
>    * structures to be exported to sysfs.
>    */
> -int cxl_dev_state_identify(struct cxl_dev_state *cxlds)
> +int cxl_dev_state_identify(struct cxl_memdev_state *mds)
>   {
>   	/* See CXL 2.0 Table 175 Identify Memory Device Output Payload */
>   	struct cxl_mbox_identify id;
> @@ -1041,7 +1047,7 @@ int cxl_dev_state_identify(struct cxl_dev_state *cxlds)
>   	u32 val;
>   	int rc;
>   
> -	if (!cxlds->media_ready)
> +	if (!mds->cxlds.media_ready)
>   		return 0;
>   
>   	mbox_cmd = (struct cxl_mbox_cmd) {
> @@ -1049,25 +1055,26 @@ int cxl_dev_state_identify(struct cxl_dev_state *cxlds)
>   		.size_out = sizeof(id),
>   		.payload_out = &id,
>   	};
> -	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +	rc = cxl_internal_send_cmd(mds, &mbox_cmd);
>   	if (rc < 0)
>   		return rc;
>   
> -	cxlds->total_bytes =
> +	mds->total_bytes =
>   		le64_to_cpu(id.total_capacity) * CXL_CAPACITY_MULTIPLIER;
> -	cxlds->volatile_only_bytes =
> +	mds->volatile_only_bytes =
>   		le64_to_cpu(id.volatile_capacity) * CXL_CAPACITY_MULTIPLIER;
> -	cxlds->persistent_only_bytes =
> +	mds->persistent_only_bytes =
>   		le64_to_cpu(id.persistent_capacity) * CXL_CAPACITY_MULTIPLIER;
> -	cxlds->partition_align_bytes =
> +	mds->partition_align_bytes =
>   		le64_to_cpu(id.partition_align) * CXL_CAPACITY_MULTIPLIER;
>   
> -	cxlds->lsa_size = le32_to_cpu(id.lsa_size);
> -	memcpy(cxlds->firmware_version, id.fw_revision, sizeof(id.fw_revision));
> +	mds->lsa_size = le32_to_cpu(id.lsa_size);
> +	memcpy(mds->firmware_version, id.fw_revision,
> +	       sizeof(id.fw_revision));
>   
> -	if (test_bit(CXL_POISON_ENABLED_LIST, cxlds->poison.enabled_cmds)) {
> +	if (test_bit(CXL_POISON_ENABLED_LIST, mds->poison.enabled_cmds)) {
>   		val = get_unaligned_le24(id.poison_list_max_mer);
> -		cxlds->poison.max_errors = min_t(u32, val, CXL_POISON_LIST_MAX);
> +		mds->poison.max_errors = min_t(u32, val, CXL_POISON_LIST_MAX);
>   	}
>   
>   	return 0;
> @@ -1100,8 +1107,9 @@ static int add_dpa_res(struct device *dev, struct resource *parent,
>   	return 0;
>   }
>   
> -int cxl_mem_create_range_info(struct cxl_dev_state *cxlds)
> +int cxl_mem_create_range_info(struct cxl_memdev_state *mds)
>   {
> +	struct cxl_dev_state *cxlds = &mds->cxlds;
>   	struct device *dev = cxlds->dev;
>   	int rc;
>   
> @@ -1113,35 +1121,35 @@ int cxl_mem_create_range_info(struct cxl_dev_state *cxlds)
>   	}
>   
>   	cxlds->dpa_res =
> -		(struct resource)DEFINE_RES_MEM(0, cxlds->total_bytes);
> +		(struct resource)DEFINE_RES_MEM(0, mds->total_bytes);
>   
> -	if (cxlds->partition_align_bytes == 0) {
> +	if (mds->partition_align_bytes == 0) {
>   		rc = add_dpa_res(dev, &cxlds->dpa_res, &cxlds->ram_res, 0,
> -				 cxlds->volatile_only_bytes, "ram");
> +				 mds->volatile_only_bytes, "ram");
>   		if (rc)
>   			return rc;
>   		return add_dpa_res(dev, &cxlds->dpa_res, &cxlds->pmem_res,
> -				   cxlds->volatile_only_bytes,
> -				   cxlds->persistent_only_bytes, "pmem");
> +				   mds->volatile_only_bytes,
> +				   mds->persistent_only_bytes, "pmem");
>   	}
>   
> -	rc = cxl_mem_get_partition_info(cxlds);
> +	rc = cxl_mem_get_partition_info(mds);
>   	if (rc) {
>   		dev_err(dev, "Failed to query partition information\n");
>   		return rc;
>   	}
>   
>   	rc = add_dpa_res(dev, &cxlds->dpa_res, &cxlds->ram_res, 0,
> -			 cxlds->active_volatile_bytes, "ram");
> +			 mds->active_volatile_bytes, "ram");
>   	if (rc)
>   		return rc;
>   	return add_dpa_res(dev, &cxlds->dpa_res, &cxlds->pmem_res,
> -			   cxlds->active_volatile_bytes,
> -			   cxlds->active_persistent_bytes, "pmem");
> +			   mds->active_volatile_bytes,
> +			   mds->active_persistent_bytes, "pmem");
>   }
>   EXPORT_SYMBOL_NS_GPL(cxl_mem_create_range_info, CXL);
>   
> -int cxl_set_timestamp(struct cxl_dev_state *cxlds)
> +int cxl_set_timestamp(struct cxl_memdev_state *mds)
>   {
>   	struct cxl_mbox_cmd mbox_cmd;
>   	struct cxl_mbox_set_timestamp_in pi;
> @@ -1154,7 +1162,7 @@ int cxl_set_timestamp(struct cxl_dev_state *cxlds)
>   		.payload_in = &pi,
>   	};
>   
> -	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +	rc = cxl_internal_send_cmd(mds, &mbox_cmd);
>   	/*
>   	 * Command is optional. Devices may have another way of providing
>   	 * a timestamp, or may return all 0s in timestamp fields.
> @@ -1170,18 +1178,18 @@ EXPORT_SYMBOL_NS_GPL(cxl_set_timestamp, CXL);
>   int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
>   		       struct cxl_region *cxlr)
>   {
> -	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> +	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>   	struct cxl_mbox_poison_out *po;
>   	struct cxl_mbox_poison_in pi;
>   	struct cxl_mbox_cmd mbox_cmd;
>   	int nr_records = 0;
>   	int rc;
>   
> -	rc = mutex_lock_interruptible(&cxlds->poison.lock);
> +	rc = mutex_lock_interruptible(&mds->poison.lock);
>   	if (rc)
>   		return rc;
>   
> -	po = cxlds->poison.list_out;
> +	po = mds->poison.list_out;
>   	pi.offset = cpu_to_le64(offset);
>   	pi.length = cpu_to_le64(len / CXL_POISON_LEN_MULT);
>   
> @@ -1189,13 +1197,13 @@ int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
>   		.opcode = CXL_MBOX_OP_GET_POISON,
>   		.size_in = sizeof(pi),
>   		.payload_in = &pi,
> -		.size_out = cxlds->payload_size,
> +		.size_out = mds->payload_size,
>   		.payload_out = po,
>   		.min_out = struct_size(po, record, 0),
>   	};
>   
>   	do {
> -		rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +		rc = cxl_internal_send_cmd(mds, &mbox_cmd);
>   		if (rc)
>   			break;
>   
> @@ -1206,14 +1214,14 @@ int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
>   
>   		/* Protect against an uncleared _FLAG_MORE */
>   		nr_records = nr_records + le16_to_cpu(po->count);
> -		if (nr_records >= cxlds->poison.max_errors) {
> +		if (nr_records >= mds->poison.max_errors) {
>   			dev_dbg(&cxlmd->dev, "Max Error Records reached: %d\n",
>   				nr_records);
>   			break;
>   		}
>   	} while (po->flags & CXL_POISON_FLAG_MORE);
>   
> -	mutex_unlock(&cxlds->poison.lock);
> +	mutex_unlock(&mds->poison.lock);
>   	return rc;
>   }
>   EXPORT_SYMBOL_NS_GPL(cxl_mem_get_poison, CXL);
> @@ -1223,52 +1231,52 @@ static void free_poison_buf(void *buf)
>   	kvfree(buf);
>   }
>   
> -/* Get Poison List output buffer is protected by cxlds->poison.lock */
> -static int cxl_poison_alloc_buf(struct cxl_dev_state *cxlds)
> +/* Get Poison List output buffer is protected by mds->poison.lock */
> +static int cxl_poison_alloc_buf(struct cxl_memdev_state *mds)
>   {
> -	cxlds->poison.list_out = kvmalloc(cxlds->payload_size, GFP_KERNEL);
> -	if (!cxlds->poison.list_out)
> +	mds->poison.list_out = kvmalloc(mds->payload_size, GFP_KERNEL);
> +	if (!mds->poison.list_out)
>   		return -ENOMEM;
>   
> -	return devm_add_action_or_reset(cxlds->dev, free_poison_buf,
> -					cxlds->poison.list_out);
> +	return devm_add_action_or_reset(mds->cxlds.dev, free_poison_buf,
> +					mds->poison.list_out);
>   }
>   
> -int cxl_poison_state_init(struct cxl_dev_state *cxlds)
> +int cxl_poison_state_init(struct cxl_memdev_state *mds)
>   {
>   	int rc;
>   
> -	if (!test_bit(CXL_POISON_ENABLED_LIST, cxlds->poison.enabled_cmds))
> +	if (!test_bit(CXL_POISON_ENABLED_LIST, mds->poison.enabled_cmds))
>   		return 0;
>   
> -	rc = cxl_poison_alloc_buf(cxlds);
> +	rc = cxl_poison_alloc_buf(mds);
>   	if (rc) {
> -		clear_bit(CXL_POISON_ENABLED_LIST, cxlds->poison.enabled_cmds);
> +		clear_bit(CXL_POISON_ENABLED_LIST, mds->poison.enabled_cmds);
>   		return rc;
>   	}
>   
> -	mutex_init(&cxlds->poison.lock);
> +	mutex_init(&mds->poison.lock);
>   	return 0;
>   }
>   EXPORT_SYMBOL_NS_GPL(cxl_poison_state_init, CXL);
>   
> -struct cxl_dev_state *cxl_dev_state_create(struct device *dev)
> +struct cxl_memdev_state *cxl_memdev_state_create(struct device *dev)
>   {
> -	struct cxl_dev_state *cxlds;
> +	struct cxl_memdev_state *mds;
>   
> -	cxlds = devm_kzalloc(dev, sizeof(*cxlds), GFP_KERNEL);
> -	if (!cxlds) {
> +	mds = devm_kzalloc(dev, sizeof(*mds), GFP_KERNEL);
> +	if (!mds) {
>   		dev_err(dev, "No memory available\n");
>   		return ERR_PTR(-ENOMEM);
>   	}
>   
> -	mutex_init(&cxlds->mbox_mutex);
> -	mutex_init(&cxlds->event.log_lock);
> -	cxlds->dev = dev;
> +	mutex_init(&mds->mbox_mutex);
> +	mutex_init(&mds->event.log_lock);
> +	mds->cxlds.dev = dev;
>   
> -	return cxlds;
> +	return mds;
>   }
> -EXPORT_SYMBOL_NS_GPL(cxl_dev_state_create, CXL);
> +EXPORT_SYMBOL_NS_GPL(cxl_memdev_state_create, CXL);
>   
>   void __init cxl_mbox_init(void)
>   {
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 057a43267290..15434b1b4909 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -39,8 +39,9 @@ static ssize_t firmware_version_show(struct device *dev,
>   {
>   	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
>   	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> +	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
>   
> -	return sysfs_emit(buf, "%.16s\n", cxlds->firmware_version);
> +	return sysfs_emit(buf, "%.16s\n", mds->firmware_version);
>   }
>   static DEVICE_ATTR_RO(firmware_version);
>   
> @@ -49,8 +50,9 @@ static ssize_t payload_max_show(struct device *dev,
>   {
>   	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
>   	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> +	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
>   
> -	return sysfs_emit(buf, "%zu\n", cxlds->payload_size);
> +	return sysfs_emit(buf, "%zu\n", mds->payload_size);
>   }
>   static DEVICE_ATTR_RO(payload_max);
>   
> @@ -59,8 +61,9 @@ static ssize_t label_storage_size_show(struct device *dev,
>   {
>   	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
>   	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> +	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
>   
> -	return sysfs_emit(buf, "%zu\n", cxlds->lsa_size);
> +	return sysfs_emit(buf, "%zu\n", mds->lsa_size);
>   }
>   static DEVICE_ATTR_RO(label_storage_size);
>   
> @@ -231,7 +234,7 @@ static int cxl_validate_poison_dpa(struct cxl_memdev *cxlmd, u64 dpa)
>   
>   int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
>   {
> -	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> +	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>   	struct cxl_mbox_inject_poison inject;
>   	struct cxl_poison_record record;
>   	struct cxl_mbox_cmd mbox_cmd;
> @@ -255,13 +258,13 @@ int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
>   		.size_in = sizeof(inject),
>   		.payload_in = &inject,
>   	};
> -	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +	rc = cxl_internal_send_cmd(mds, &mbox_cmd);
>   	if (rc)
>   		goto out;
>   
>   	cxlr = cxl_dpa_to_region(cxlmd, dpa);
>   	if (cxlr)
> -		dev_warn_once(cxlds->dev,
> +		dev_warn_once(mds->cxlds.dev,
>   			      "poison inject dpa:%#llx region: %s\n", dpa,
>   			      dev_name(&cxlr->dev));
>   
> @@ -279,7 +282,7 @@ EXPORT_SYMBOL_NS_GPL(cxl_inject_poison, CXL);
>   
>   int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
>   {
> -	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> +	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>   	struct cxl_mbox_clear_poison clear;
>   	struct cxl_poison_record record;
>   	struct cxl_mbox_cmd mbox_cmd;
> @@ -312,14 +315,15 @@ int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
>   		.payload_in = &clear,
>   	};
>   
> -	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +	rc = cxl_internal_send_cmd(mds, &mbox_cmd);
>   	if (rc)
>   		goto out;
>   
>   	cxlr = cxl_dpa_to_region(cxlmd, dpa);
>   	if (cxlr)
> -		dev_warn_once(cxlds->dev, "poison clear dpa:%#llx region: %s\n",
> -			      dpa, dev_name(&cxlr->dev));
> +		dev_warn_once(mds->cxlds.dev,
> +			      "poison clear dpa:%#llx region: %s\n", dpa,
> +			      dev_name(&cxlr->dev));
>   
>   	record = (struct cxl_poison_record) {
>   		.address = cpu_to_le64(dpa),
> @@ -397,17 +401,18 @@ EXPORT_SYMBOL_NS_GPL(is_cxl_memdev, CXL);
>   
>   /**
>    * set_exclusive_cxl_commands() - atomically disable user cxl commands
> - * @cxlds: The device state to operate on
> + * @mds: The device state to operate on
>    * @cmds: bitmap of commands to mark exclusive
>    *
>    * Grab the cxl_memdev_rwsem in write mode to flush in-flight
>    * invocations of the ioctl path and then disable future execution of
>    * commands with the command ids set in @cmds.
>    */
> -void set_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds)
> +void set_exclusive_cxl_commands(struct cxl_memdev_state *mds,
> +				unsigned long *cmds)
>   {
>   	down_write(&cxl_memdev_rwsem);
> -	bitmap_or(cxlds->exclusive_cmds, cxlds->exclusive_cmds, cmds,
> +	bitmap_or(mds->exclusive_cmds, mds->exclusive_cmds, cmds,
>   		  CXL_MEM_COMMAND_ID_MAX);
>   	up_write(&cxl_memdev_rwsem);
>   }
> @@ -415,13 +420,14 @@ EXPORT_SYMBOL_NS_GPL(set_exclusive_cxl_commands, CXL);
>   
>   /**
>    * clear_exclusive_cxl_commands() - atomically enable user cxl commands
> - * @cxlds: The device state to modify
> + * @mds: The device state to modify
>    * @cmds: bitmap of commands to mark available for userspace
>    */
> -void clear_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds)
> +void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
> +				  unsigned long *cmds)
>   {
>   	down_write(&cxl_memdev_rwsem);
> -	bitmap_andnot(cxlds->exclusive_cmds, cxlds->exclusive_cmds, cmds,
> +	bitmap_andnot(mds->exclusive_cmds, mds->exclusive_cmds, cmds,
>   		      CXL_MEM_COMMAND_ID_MAX);
>   	up_write(&cxl_memdev_rwsem);
>   }
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index a2845a7a69d8..d3fe73d5ba4d 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -267,6 +267,35 @@ struct cxl_poison_state {
>    * @cxl_dvsec: Offset to the PCIe device DVSEC
>    * @rcd: operating in RCD mode (CXL 3.0 9.11.8 CXL Devices Attached to an RCH)
>    * @media_ready: Indicate whether the device media is usable
> + * @dpa_res: Overall DPA resource tree for the device
> + * @pmem_res: Active Persistent memory capacity configuration
> + * @ram_res: Active Volatile memory capacity configuration
> + * @component_reg_phys: register base of component registers
> + * @info: Cached DVSEC information about the device.
> + * @serial: PCIe Device Serial Number
> + */
> +struct cxl_dev_state {
> +	struct device *dev;
> +	struct cxl_memdev *cxlmd;
> +	struct cxl_regs regs;
> +	int cxl_dvsec;
> +	bool rcd;
> +	bool media_ready;
> +	struct resource dpa_res;
> +	struct resource pmem_res;
> +	struct resource ram_res;
> +	resource_size_t component_reg_phys;
> +	u64 serial;
> +};
> +
> +/**
> + * struct cxl_memdev_state - Generic Type-3 Memory Device Class driver data
> + *
> + * CXL 8.1.12.1 PCI Header - Class Code Register Memory Device defines
> + * common memory device functionality like the presence of a mailbox and
> + * the functionality related to that like Identify Memory Device and Get
> + * Partition Info
> + * @cxlds: Core driver state common across Type-2 and Type-3 devices
>    * @payload_size: Size of space for payload
>    *                (CXL 2.0 8.2.8.4.3 Mailbox Capabilities Register)
>    * @lsa_size: Size of Label Storage Area
> @@ -275,9 +304,6 @@ struct cxl_poison_state {
>    * @firmware_version: Firmware version for the memory device.
>    * @enabled_cmds: Hardware commands found enabled in CEL.
>    * @exclusive_cmds: Commands that are kernel-internal only
> - * @dpa_res: Overall DPA resource tree for the device
> - * @pmem_res: Active Persistent memory capacity configuration
> - * @ram_res: Active Volatile memory capacity configuration
>    * @total_bytes: sum of all possible capacities
>    * @volatile_only_bytes: hard volatile capacity
>    * @persistent_only_bytes: hard persistent capacity
> @@ -286,54 +312,41 @@ struct cxl_poison_state {
>    * @active_persistent_bytes: sum of hard + soft persistent
>    * @next_volatile_bytes: volatile capacity change pending device reset
>    * @next_persistent_bytes: persistent capacity change pending device reset
> - * @component_reg_phys: register base of component registers
> - * @info: Cached DVSEC information about the device.
> - * @serial: PCIe Device Serial Number
>    * @event: event log driver state
>    * @poison: poison driver state info
>    * @mbox_send: @dev specific transport for transmitting mailbox commands
>    *
> - * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
> + * See CXL 3.0 8.2.9.8.2 Capacity Configuration and Label Storage for
>    * details on capacity parameters.
>    */
> -struct cxl_dev_state {
> -	struct device *dev;
> -	struct cxl_memdev *cxlmd;
> -
> -	struct cxl_regs regs;
> -	int cxl_dvsec;
> -
> -	bool rcd;
> -	bool media_ready;
> +struct cxl_memdev_state {
> +	struct cxl_dev_state cxlds;
>   	size_t payload_size;
>   	size_t lsa_size;
>   	struct mutex mbox_mutex; /* Protects device mailbox and firmware */
>   	char firmware_version[0x10];
>   	DECLARE_BITMAP(enabled_cmds, CXL_MEM_COMMAND_ID_MAX);
>   	DECLARE_BITMAP(exclusive_cmds, CXL_MEM_COMMAND_ID_MAX);
> -
> -	struct resource dpa_res;
> -	struct resource pmem_res;
> -	struct resource ram_res;
>   	u64 total_bytes;
>   	u64 volatile_only_bytes;
>   	u64 persistent_only_bytes;
>   	u64 partition_align_bytes;
> -
>   	u64 active_volatile_bytes;
>   	u64 active_persistent_bytes;
>   	u64 next_volatile_bytes;
>   	u64 next_persistent_bytes;
> -
> -	resource_size_t component_reg_phys;
> -	u64 serial;
> -
>   	struct cxl_event_state event;
>   	struct cxl_poison_state poison;
> -
> -	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
> +	int (*mbox_send)(struct cxl_memdev_state *mds,
> +			 struct cxl_mbox_cmd *cmd);
>   };
>   
> +static inline struct cxl_memdev_state *
> +to_cxl_memdev_state(struct cxl_dev_state *cxlds)
> +{
> +	return container_of(cxlds, struct cxl_memdev_state, cxlds);
> +}
> +
>   enum cxl_opcode {
>   	CXL_MBOX_OP_INVALID		= 0x0000,
>   	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
> @@ -692,18 +705,20 @@ enum {
>   	CXL_PMEM_SEC_PASS_USER,
>   };
>   
> -int cxl_internal_send_cmd(struct cxl_dev_state *cxlds,
> +int cxl_internal_send_cmd(struct cxl_memdev_state *mds,
>   			  struct cxl_mbox_cmd *cmd);
> -int cxl_dev_state_identify(struct cxl_dev_state *cxlds);
> +int cxl_dev_state_identify(struct cxl_memdev_state *mds);
>   int cxl_await_media_ready(struct cxl_dev_state *cxlds);
> -int cxl_enumerate_cmds(struct cxl_dev_state *cxlds);
> -int cxl_mem_create_range_info(struct cxl_dev_state *cxlds);
> -struct cxl_dev_state *cxl_dev_state_create(struct device *dev);
> -void set_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
> -void clear_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
> -void cxl_mem_get_event_records(struct cxl_dev_state *cxlds, u32 status);
> -int cxl_set_timestamp(struct cxl_dev_state *cxlds);
> -int cxl_poison_state_init(struct cxl_dev_state *cxlds);
> +int cxl_enumerate_cmds(struct cxl_memdev_state *mds);
> +int cxl_mem_create_range_info(struct cxl_memdev_state *mds);
> +struct cxl_memdev_state *cxl_memdev_state_create(struct device *dev);
> +void set_exclusive_cxl_commands(struct cxl_memdev_state *mds,
> +				unsigned long *cmds);
> +void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
> +				  unsigned long *cmds);
> +void cxl_mem_get_event_records(struct cxl_memdev_state *mds, u32 status);
> +int cxl_set_timestamp(struct cxl_memdev_state *mds);
> +int cxl_poison_state_init(struct cxl_memdev_state *mds);
>   int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
>   		       struct cxl_region *cxlr);
>   int cxl_trigger_poison_list(struct cxl_memdev *cxlmd);
> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
> index 519edd0eb196..584f9eec57e4 100644
> --- a/drivers/cxl/mem.c
> +++ b/drivers/cxl/mem.c
> @@ -117,6 +117,7 @@ DEFINE_DEBUGFS_ATTRIBUTE(cxl_poison_clear_fops, NULL,
>   static int cxl_mem_probe(struct device *dev)
>   {
>   	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> +	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>   	struct cxl_dev_state *cxlds = cxlmd->cxlds;
>   	struct device *endpoint_parent;
>   	struct cxl_port *parent_port;
> @@ -141,10 +142,10 @@ static int cxl_mem_probe(struct device *dev)
>   	dentry = cxl_debugfs_create_dir(dev_name(dev));
>   	debugfs_create_devm_seqfile(dev, "dpamem", dentry, cxl_mem_dpa_show);
>   
> -	if (test_bit(CXL_POISON_ENABLED_INJECT, cxlds->poison.enabled_cmds))
> +	if (test_bit(CXL_POISON_ENABLED_INJECT, mds->poison.enabled_cmds))
>   		debugfs_create_file("inject_poison", 0200, dentry, cxlmd,
>   				    &cxl_poison_inject_fops);
> -	if (test_bit(CXL_POISON_ENABLED_CLEAR, cxlds->poison.enabled_cmds))
> +	if (test_bit(CXL_POISON_ENABLED_CLEAR, mds->poison.enabled_cmds))
>   		debugfs_create_file("clear_poison", 0200, dentry, cxlmd,
>   				    &cxl_poison_clear_fops);
>   
> @@ -227,9 +228,12 @@ static umode_t cxl_mem_visible(struct kobject *kobj, struct attribute *a, int n)
>   {
>   	if (a == &dev_attr_trigger_poison_list.attr) {
>   		struct device *dev = kobj_to_dev(kobj);
> +		struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> +		struct cxl_memdev_state *mds =
> +			to_cxl_memdev_state(cxlmd->cxlds);
>   
>   		if (!test_bit(CXL_POISON_ENABLED_LIST,
> -			      to_cxl_memdev(dev)->cxlds->poison.enabled_cmds))
> +			      mds->poison.enabled_cmds))
>   			return 0;
>   	}
>   	return a->mode;
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 0872f2233ed0..4e2845b7331a 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -86,7 +86,7 @@ static int cxl_pci_mbox_wait_for_doorbell(struct cxl_dev_state *cxlds)
>   
>   /**
>    * __cxl_pci_mbox_send_cmd() - Execute a mailbox command
> - * @cxlds: The device state to communicate with.
> + * @mds: The memory device driver data
>    * @mbox_cmd: Command to send to the memory device.
>    *
>    * Context: Any context. Expects mbox_mutex to be held.
> @@ -106,16 +106,17 @@ static int cxl_pci_mbox_wait_for_doorbell(struct cxl_dev_state *cxlds)
>    * not need to coordinate with each other. The driver only uses the primary
>    * mailbox.
>    */
> -static int __cxl_pci_mbox_send_cmd(struct cxl_dev_state *cxlds,
> +static int __cxl_pci_mbox_send_cmd(struct cxl_memdev_state *mds,
>   				   struct cxl_mbox_cmd *mbox_cmd)
>   {
> +	struct cxl_dev_state *cxlds = &mds->cxlds;
>   	void __iomem *payload = cxlds->regs.mbox + CXLDEV_MBOX_PAYLOAD_OFFSET;
>   	struct device *dev = cxlds->dev;
>   	u64 cmd_reg, status_reg;
>   	size_t out_len;
>   	int rc;
>   
> -	lockdep_assert_held(&cxlds->mbox_mutex);
> +	lockdep_assert_held(&mds->mbox_mutex);
>   
>   	/*
>   	 * Here are the steps from 8.2.8.4 of the CXL 2.0 spec.
> @@ -196,8 +197,9 @@ static int __cxl_pci_mbox_send_cmd(struct cxl_dev_state *cxlds,
>   		 * have requested less data than the hardware supplied even
>   		 * within spec.
>   		 */
> -		size_t n = min3(mbox_cmd->size_out, cxlds->payload_size, out_len);
> +		size_t n;
>   
> +		n = min3(mbox_cmd->size_out, mds->payload_size, out_len);
>   		memcpy_fromio(mbox_cmd->payload_out, payload, n);
>   		mbox_cmd->size_out = n;
>   	} else {
> @@ -207,20 +209,23 @@ static int __cxl_pci_mbox_send_cmd(struct cxl_dev_state *cxlds,
>   	return 0;
>   }
>   
> -static int cxl_pci_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
> +static int cxl_pci_mbox_send(struct cxl_memdev_state *mds,
> +			     struct cxl_mbox_cmd *cmd)
>   {
>   	int rc;
>   
> -	mutex_lock_io(&cxlds->mbox_mutex);
> -	rc = __cxl_pci_mbox_send_cmd(cxlds, cmd);
> -	mutex_unlock(&cxlds->mbox_mutex);
> +	mutex_lock_io(&mds->mbox_mutex);
> +	rc = __cxl_pci_mbox_send_cmd(mds, cmd);
> +	mutex_unlock(&mds->mbox_mutex);
>   
>   	return rc;
>   }
>   
> -static int cxl_pci_setup_mailbox(struct cxl_dev_state *cxlds)
> +static int cxl_pci_setup_mailbox(struct cxl_memdev_state *mds)
>   {
> +	struct cxl_dev_state *cxlds = &mds->cxlds;
>   	const int cap = readl(cxlds->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
> +	struct device *dev = cxlds->dev;
>   	unsigned long timeout;
>   	u64 md_status;
>   
> @@ -234,8 +239,7 @@ static int cxl_pci_setup_mailbox(struct cxl_dev_state *cxlds)
>   	} while (!time_after(jiffies, timeout));
>   
>   	if (!(md_status & CXLMDEV_MBOX_IF_READY)) {
> -		cxl_err(cxlds->dev, md_status,
> -			"timeout awaiting mailbox ready");
> +		cxl_err(dev, md_status, "timeout awaiting mailbox ready");
>   		return -ETIMEDOUT;
>   	}
>   
> @@ -246,12 +250,12 @@ static int cxl_pci_setup_mailbox(struct cxl_dev_state *cxlds)
>   	 * source for future doorbell busy events.
>   	 */
>   	if (cxl_pci_mbox_wait_for_doorbell(cxlds) != 0) {
> -		cxl_err(cxlds->dev, md_status, "timeout awaiting mailbox idle");
> +		cxl_err(dev, md_status, "timeout awaiting mailbox idle");
>   		return -ETIMEDOUT;
>   	}
>   
> -	cxlds->mbox_send = cxl_pci_mbox_send;
> -	cxlds->payload_size =
> +	mds->mbox_send = cxl_pci_mbox_send;
> +	mds->payload_size =
>   		1 << FIELD_GET(CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK, cap);
>   
>   	/*
> @@ -261,15 +265,14 @@ static int cxl_pci_setup_mailbox(struct cxl_dev_state *cxlds)
>   	 * there's no point in going forward. If the size is too large, there's
>   	 * no harm is soft limiting it.
>   	 */
> -	cxlds->payload_size = min_t(size_t, cxlds->payload_size, SZ_1M);
> -	if (cxlds->payload_size < 256) {
> -		dev_err(cxlds->dev, "Mailbox is too small (%zub)",
> -			cxlds->payload_size);
> +	mds->payload_size = min_t(size_t, mds->payload_size, SZ_1M);
> +	if (mds->payload_size < 256) {
> +		dev_err(dev, "Mailbox is too small (%zub)",
> +			mds->payload_size);
>   		return -ENXIO;
>   	}
>   
> -	dev_dbg(cxlds->dev, "Mailbox payload sized %zu",
> -		cxlds->payload_size);
> +	dev_dbg(dev, "Mailbox payload sized %zu", mds->payload_size);
>   
>   	return 0;
>   }
> @@ -433,18 +436,18 @@ static void free_event_buf(void *buf)
>   
>   /*
>    * There is a single buffer for reading event logs from the mailbox.  All logs
> - * share this buffer protected by the cxlds->event_log_lock.
> + * share this buffer protected by the mds->event_log_lock.
>    */
> -static int cxl_mem_alloc_event_buf(struct cxl_dev_state *cxlds)
> +static int cxl_mem_alloc_event_buf(struct cxl_memdev_state *mds)
>   {
>   	struct cxl_get_event_payload *buf;
>   
> -	buf = kvmalloc(cxlds->payload_size, GFP_KERNEL);
> +	buf = kvmalloc(mds->payload_size, GFP_KERNEL);
>   	if (!buf)
>   		return -ENOMEM;
> -	cxlds->event.buf = buf;
> +	mds->event.buf = buf;
>   
> -	return devm_add_action_or_reset(cxlds->dev, free_event_buf, buf);
> +	return devm_add_action_or_reset(mds->cxlds.dev, free_event_buf, buf);
>   }
>   
>   static int cxl_alloc_irq_vectors(struct pci_dev *pdev)
> @@ -477,6 +480,7 @@ static irqreturn_t cxl_event_thread(int irq, void *id)
>   {
>   	struct cxl_dev_id *dev_id = id;
>   	struct cxl_dev_state *cxlds = dev_id->cxlds;
> +	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
>   	u32 status;
>   
>   	do {
> @@ -489,7 +493,7 @@ static irqreturn_t cxl_event_thread(int irq, void *id)
>   		status &= CXLDEV_EVENT_STATUS_ALL;
>   		if (!status)
>   			break;
> -		cxl_mem_get_event_records(cxlds, status);
> +		cxl_mem_get_event_records(mds, status);
>   		cond_resched();
>   	} while (status);
>   
> @@ -522,7 +526,7 @@ static int cxl_event_req_irq(struct cxl_dev_state *cxlds, u8 setting)
>   					 dev_id);
>   }
>   
> -static int cxl_event_get_int_policy(struct cxl_dev_state *cxlds,
> +static int cxl_event_get_int_policy(struct cxl_memdev_state *mds,
>   				    struct cxl_event_interrupt_policy *policy)
>   {
>   	struct cxl_mbox_cmd mbox_cmd = {
> @@ -532,15 +536,15 @@ static int cxl_event_get_int_policy(struct cxl_dev_state *cxlds,
>   	};
>   	int rc;
>   
> -	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +	rc = cxl_internal_send_cmd(mds, &mbox_cmd);
>   	if (rc < 0)
> -		dev_err(cxlds->dev, "Failed to get event interrupt policy : %d",
> -			rc);
> +		dev_err(mds->cxlds.dev,
> +			"Failed to get event interrupt policy : %d", rc);
>   
>   	return rc;
>   }
>   
> -static int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
> +static int cxl_event_config_msgnums(struct cxl_memdev_state *mds,
>   				    struct cxl_event_interrupt_policy *policy)
>   {
>   	struct cxl_mbox_cmd mbox_cmd;
> @@ -559,23 +563,24 @@ static int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
>   		.size_in = sizeof(*policy),
>   	};
>   
> -	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +	rc = cxl_internal_send_cmd(mds, &mbox_cmd);
>   	if (rc < 0) {
> -		dev_err(cxlds->dev, "Failed to set event interrupt policy : %d",
> +		dev_err(mds->cxlds.dev, "Failed to set event interrupt policy : %d",
>   			rc);
>   		return rc;
>   	}
>   
>   	/* Retrieve final interrupt settings */
> -	return cxl_event_get_int_policy(cxlds, policy);
> +	return cxl_event_get_int_policy(mds, policy);
>   }
>   
> -static int cxl_event_irqsetup(struct cxl_dev_state *cxlds)
> +static int cxl_event_irqsetup(struct cxl_memdev_state *mds)
>   {
> +	struct cxl_dev_state *cxlds = &mds->cxlds;
>   	struct cxl_event_interrupt_policy policy;
>   	int rc;
>   
> -	rc = cxl_event_config_msgnums(cxlds, &policy);
> +	rc = cxl_event_config_msgnums(mds, &policy);
>   	if (rc)
>   		return rc;
>   
> @@ -614,7 +619,7 @@ static bool cxl_event_int_is_fw(u8 setting)
>   }
>   
>   static int cxl_event_config(struct pci_host_bridge *host_bridge,
> -			    struct cxl_dev_state *cxlds)
> +			    struct cxl_memdev_state *mds)
>   {
>   	struct cxl_event_interrupt_policy policy;
>   	int rc;
> @@ -626,11 +631,11 @@ static int cxl_event_config(struct pci_host_bridge *host_bridge,
>   	if (!host_bridge->native_cxl_error)
>   		return 0;
>   
> -	rc = cxl_mem_alloc_event_buf(cxlds);
> +	rc = cxl_mem_alloc_event_buf(mds);
>   	if (rc)
>   		return rc;
>   
> -	rc = cxl_event_get_int_policy(cxlds, &policy);
> +	rc = cxl_event_get_int_policy(mds, &policy);
>   	if (rc)
>   		return rc;
>   
> @@ -638,15 +643,16 @@ static int cxl_event_config(struct pci_host_bridge *host_bridge,
>   	    cxl_event_int_is_fw(policy.warn_settings) ||
>   	    cxl_event_int_is_fw(policy.failure_settings) ||
>   	    cxl_event_int_is_fw(policy.fatal_settings)) {
> -		dev_err(cxlds->dev, "FW still in control of Event Logs despite _OSC settings\n");
> +		dev_err(mds->cxlds.dev,
> +			"FW still in control of Event Logs despite _OSC settings\n");
>   		return -EBUSY;
>   	}
>   
> -	rc = cxl_event_irqsetup(cxlds);
> +	rc = cxl_event_irqsetup(mds);
>   	if (rc)
>   		return rc;
>   
> -	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> +	cxl_mem_get_event_records(mds, CXLDEV_EVENT_STATUS_ALL);
>   
>   	return 0;
>   }
> @@ -654,9 +660,10 @@ static int cxl_event_config(struct pci_host_bridge *host_bridge,
>   static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>   {
>   	struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
> +	struct cxl_memdev_state *mds;
> +	struct cxl_dev_state *cxlds;
>   	struct cxl_register_map map;
>   	struct cxl_memdev *cxlmd;
> -	struct cxl_dev_state *cxlds;
>   	int rc;
>   
>   	/*
> @@ -671,9 +678,10 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>   		return rc;
>   	pci_set_master(pdev);
>   
> -	cxlds = cxl_dev_state_create(&pdev->dev);
> -	if (IS_ERR(cxlds))
> -		return PTR_ERR(cxlds);
> +	mds = cxl_memdev_state_create(&pdev->dev);
> +	if (IS_ERR(mds))
> +		return PTR_ERR(mds);
> +	cxlds = &mds->cxlds;
>   	pci_set_drvdata(pdev, cxlds);
>   
>   	cxlds->rcd = is_cxl_restricted(pdev);
> @@ -714,27 +722,27 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>   	else
>   		dev_warn(&pdev->dev, "Media not active (%d)\n", rc);
>   
> -	rc = cxl_pci_setup_mailbox(cxlds);
> +	rc = cxl_pci_setup_mailbox(mds);
>   	if (rc)
>   		return rc;
>   
> -	rc = cxl_enumerate_cmds(cxlds);
> +	rc = cxl_enumerate_cmds(mds);
>   	if (rc)
>   		return rc;
>   
> -	rc = cxl_set_timestamp(cxlds);
> +	rc = cxl_set_timestamp(mds);
>   	if (rc)
>   		return rc;
>   
> -	rc = cxl_poison_state_init(cxlds);
> +	rc = cxl_poison_state_init(mds);
>   	if (rc)
>   		return rc;
>   
> -	rc = cxl_dev_state_identify(cxlds);
> +	rc = cxl_dev_state_identify(mds);
>   	if (rc)
>   		return rc;
>   
> -	rc = cxl_mem_create_range_info(cxlds);
> +	rc = cxl_mem_create_range_info(mds);
>   	if (rc)
>   		return rc;
>   
> @@ -746,7 +754,7 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>   	if (IS_ERR(cxlmd))
>   		return PTR_ERR(cxlmd);
>   
> -	rc = cxl_event_config(host_bridge, cxlds);
> +	rc = cxl_event_config(host_bridge, mds);
>   	if (rc)
>   		return rc;
>   
> diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
> index 71cfa1fdf902..7cb8994f8809 100644
> --- a/drivers/cxl/pmem.c
> +++ b/drivers/cxl/pmem.c
> @@ -15,9 +15,9 @@ extern const struct nvdimm_security_ops *cxl_security_ops;
>   
>   static __read_mostly DECLARE_BITMAP(exclusive_cmds, CXL_MEM_COMMAND_ID_MAX);
>   
> -static void clear_exclusive(void *cxlds)
> +static void clear_exclusive(void *mds)
>   {
> -	clear_exclusive_cxl_commands(cxlds, exclusive_cmds);
> +	clear_exclusive_cxl_commands(mds, exclusive_cmds);
>   }
>   
>   static void unregister_nvdimm(void *nvdimm)
> @@ -65,13 +65,13 @@ static int cxl_nvdimm_probe(struct device *dev)
>   	struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev);
>   	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
>   	struct cxl_nvdimm_bridge *cxl_nvb = cxlmd->cxl_nvb;
> +	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>   	unsigned long flags = 0, cmd_mask = 0;
> -	struct cxl_dev_state *cxlds = cxlmd->cxlds;
>   	struct nvdimm *nvdimm;
>   	int rc;
>   
> -	set_exclusive_cxl_commands(cxlds, exclusive_cmds);
> -	rc = devm_add_action_or_reset(dev, clear_exclusive, cxlds);
> +	set_exclusive_cxl_commands(mds, exclusive_cmds);
> +	rc = devm_add_action_or_reset(dev, clear_exclusive, mds);
>   	if (rc)
>   		return rc;
>   
> @@ -100,22 +100,23 @@ static struct cxl_driver cxl_nvdimm_driver = {
>   	},
>   };
>   
> -static int cxl_pmem_get_config_size(struct cxl_dev_state *cxlds,
> +static int cxl_pmem_get_config_size(struct cxl_memdev_state *mds,
>   				    struct nd_cmd_get_config_size *cmd,
>   				    unsigned int buf_len)
>   {
>   	if (sizeof(*cmd) > buf_len)
>   		return -EINVAL;
>   
> -	*cmd = (struct nd_cmd_get_config_size) {
> -		 .config_size = cxlds->lsa_size,
> -		 .max_xfer = cxlds->payload_size - sizeof(struct cxl_mbox_set_lsa),
> +	*cmd = (struct nd_cmd_get_config_size){
> +		.config_size = mds->lsa_size,
> +		.max_xfer =
> +			mds->payload_size - sizeof(struct cxl_mbox_set_lsa),
>   	};
>   
>   	return 0;
>   }
>   
> -static int cxl_pmem_get_config_data(struct cxl_dev_state *cxlds,
> +static int cxl_pmem_get_config_data(struct cxl_memdev_state *mds,
>   				    struct nd_cmd_get_config_data_hdr *cmd,
>   				    unsigned int buf_len)
>   {
> @@ -140,13 +141,13 @@ static int cxl_pmem_get_config_data(struct cxl_dev_state *cxlds,
>   		.payload_out = cmd->out_buf,
>   	};
>   
> -	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +	rc = cxl_internal_send_cmd(mds, &mbox_cmd);
>   	cmd->status = 0;
>   
>   	return rc;
>   }
>   
> -static int cxl_pmem_set_config_data(struct cxl_dev_state *cxlds,
> +static int cxl_pmem_set_config_data(struct cxl_memdev_state *mds,
>   				    struct nd_cmd_set_config_hdr *cmd,
>   				    unsigned int buf_len)
>   {
> @@ -176,7 +177,7 @@ static int cxl_pmem_set_config_data(struct cxl_dev_state *cxlds,
>   		.size_in = struct_size(set_lsa, data, cmd->in_length),
>   	};
>   
> -	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +	rc = cxl_internal_send_cmd(mds, &mbox_cmd);
>   
>   	/*
>   	 * Set "firmware" status (4-packed bytes at the end of the input
> @@ -194,18 +195,18 @@ static int cxl_pmem_nvdimm_ctl(struct nvdimm *nvdimm, unsigned int cmd,
>   	struct cxl_nvdimm *cxl_nvd = nvdimm_provider_data(nvdimm);
>   	unsigned long cmd_mask = nvdimm_cmd_mask(nvdimm);
>   	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
> -	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> +	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>   
>   	if (!test_bit(cmd, &cmd_mask))
>   		return -ENOTTY;
>   
>   	switch (cmd) {
>   	case ND_CMD_GET_CONFIG_SIZE:
> -		return cxl_pmem_get_config_size(cxlds, buf, buf_len);
> +		return cxl_pmem_get_config_size(mds, buf, buf_len);
>   	case ND_CMD_GET_CONFIG_DATA:
> -		return cxl_pmem_get_config_data(cxlds, buf, buf_len);
> +		return cxl_pmem_get_config_data(mds, buf, buf_len);
>   	case ND_CMD_SET_CONFIG_DATA:
> -		return cxl_pmem_set_config_data(cxlds, buf, buf_len);
> +		return cxl_pmem_set_config_data(mds, buf, buf_len);
>   	default:
>   		return -ENOTTY;
>   	}
> diff --git a/drivers/cxl/security.c b/drivers/cxl/security.c
> index 4ad4bda2d18e..8c98fc674fa7 100644
> --- a/drivers/cxl/security.c
> +++ b/drivers/cxl/security.c
> @@ -14,7 +14,7 @@ static unsigned long cxl_pmem_get_security_flags(struct nvdimm *nvdimm,
>   {
>   	struct cxl_nvdimm *cxl_nvd = nvdimm_provider_data(nvdimm);
>   	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
> -	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> +	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>   	unsigned long security_flags = 0;
>   	struct cxl_get_security_output {
>   		__le32 flags;
> @@ -29,7 +29,7 @@ static unsigned long cxl_pmem_get_security_flags(struct nvdimm *nvdimm,
>   		.payload_out = &out,
>   	};
>   
> -	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +	rc = cxl_internal_send_cmd(mds, &mbox_cmd);
>   	if (rc < 0)
>   		return 0;
>   
> @@ -67,7 +67,7 @@ static int cxl_pmem_security_change_key(struct nvdimm *nvdimm,
>   {
>   	struct cxl_nvdimm *cxl_nvd = nvdimm_provider_data(nvdimm);
>   	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
> -	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> +	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>   	struct cxl_mbox_cmd mbox_cmd;
>   	struct cxl_set_pass set_pass;
>   
> @@ -84,7 +84,7 @@ static int cxl_pmem_security_change_key(struct nvdimm *nvdimm,
>   		.payload_in = &set_pass,
>   	};
>   
> -	return cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +	return cxl_internal_send_cmd(mds, &mbox_cmd);
>   }
>   
>   static int __cxl_pmem_security_disable(struct nvdimm *nvdimm,
> @@ -93,7 +93,7 @@ static int __cxl_pmem_security_disable(struct nvdimm *nvdimm,
>   {
>   	struct cxl_nvdimm *cxl_nvd = nvdimm_provider_data(nvdimm);
>   	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
> -	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> +	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>   	struct cxl_disable_pass dis_pass;
>   	struct cxl_mbox_cmd mbox_cmd;
>   
> @@ -109,7 +109,7 @@ static int __cxl_pmem_security_disable(struct nvdimm *nvdimm,
>   		.payload_in = &dis_pass,
>   	};
>   
> -	return cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +	return cxl_internal_send_cmd(mds, &mbox_cmd);
>   }
>   
>   static int cxl_pmem_security_disable(struct nvdimm *nvdimm,
> @@ -128,12 +128,12 @@ static int cxl_pmem_security_freeze(struct nvdimm *nvdimm)
>   {
>   	struct cxl_nvdimm *cxl_nvd = nvdimm_provider_data(nvdimm);
>   	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
> -	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> +	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>   	struct cxl_mbox_cmd mbox_cmd = {
>   		.opcode = CXL_MBOX_OP_FREEZE_SECURITY,
>   	};
>   
> -	return cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +	return cxl_internal_send_cmd(mds, &mbox_cmd);
>   }
>   
>   static int cxl_pmem_security_unlock(struct nvdimm *nvdimm,
> @@ -141,7 +141,7 @@ static int cxl_pmem_security_unlock(struct nvdimm *nvdimm,
>   {
>   	struct cxl_nvdimm *cxl_nvd = nvdimm_provider_data(nvdimm);
>   	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
> -	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> +	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>   	u8 pass[NVDIMM_PASSPHRASE_LEN];
>   	struct cxl_mbox_cmd mbox_cmd;
>   	int rc;
> @@ -153,7 +153,7 @@ static int cxl_pmem_security_unlock(struct nvdimm *nvdimm,
>   		.payload_in = pass,
>   	};
>   
> -	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +	rc = cxl_internal_send_cmd(mds, &mbox_cmd);
>   	if (rc < 0)
>   		return rc;
>   
> @@ -166,7 +166,7 @@ static int cxl_pmem_security_passphrase_erase(struct nvdimm *nvdimm,
>   {
>   	struct cxl_nvdimm *cxl_nvd = nvdimm_provider_data(nvdimm);
>   	struct cxl_memdev *cxlmd = cxl_nvd->cxlmd;
> -	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> +	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
>   	struct cxl_mbox_cmd mbox_cmd;
>   	struct cxl_pass_erase erase;
>   	int rc;
> @@ -182,7 +182,7 @@ static int cxl_pmem_security_passphrase_erase(struct nvdimm *nvdimm,
>   		.payload_in = &erase,
>   	};
>   
> -	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +	rc = cxl_internal_send_cmd(mds, &mbox_cmd);
>   	if (rc < 0)
>   		return rc;
>   
> diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
> index bdaf086d994e..6fb5718588f3 100644
> --- a/tools/testing/cxl/test/mem.c
> +++ b/tools/testing/cxl/test/mem.c
> @@ -102,7 +102,7 @@ struct mock_event_log {
>   };
>   
>   struct mock_event_store {
> -	struct cxl_dev_state *cxlds;
> +	struct cxl_memdev_state *mds;
>   	struct mock_event_log mock_logs[CXL_EVENT_TYPE_MAX];
>   	u32 ev_status;
>   };
> @@ -291,7 +291,7 @@ static void cxl_mock_event_trigger(struct device *dev)
>   			event_reset_log(log);
>   	}
>   
> -	cxl_mem_get_event_records(mes->cxlds, mes->ev_status);
> +	cxl_mem_get_event_records(mes->mds, mes->ev_status);
>   }
>   
>   struct cxl_event_record_raw maint_needed = {
> @@ -451,7 +451,7 @@ static int mock_gsl(struct cxl_mbox_cmd *cmd)
>   	return 0;
>   }
>   
> -static int mock_get_log(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
> +static int mock_get_log(struct cxl_memdev_state *mds, struct cxl_mbox_cmd *cmd)
>   {
>   	struct cxl_mbox_get_log *gl = cmd->payload_in;
>   	u32 offset = le32_to_cpu(gl->offset);
> @@ -461,7 +461,7 @@ static int mock_get_log(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
>   
>   	if (cmd->size_in < sizeof(*gl))
>   		return -EINVAL;
> -	if (length > cxlds->payload_size)
> +	if (length > mds->payload_size)
>   		return -EINVAL;
>   	if (offset + length > sizeof(mock_cel))
>   		return -EINVAL;
> @@ -1105,8 +1105,10 @@ static struct attribute *cxl_mock_mem_core_attrs[] = {
>   };
>   ATTRIBUTE_GROUPS(cxl_mock_mem_core);
>   
> -static int cxl_mock_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
> +static int cxl_mock_mbox_send(struct cxl_memdev_state *mds,
> +			      struct cxl_mbox_cmd *cmd)
>   {
> +	struct cxl_dev_state *cxlds = &mds->cxlds;
>   	struct device *dev = cxlds->dev;
>   	struct cxl_mockmem_data *mdata = dev_get_drvdata(dev);
>   	int rc = -EIO;
> @@ -1119,7 +1121,7 @@ static int cxl_mock_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *
>   		rc = mock_gsl(cmd);
>   		break;
>   	case CXL_MBOX_OP_GET_LOG:
> -		rc = mock_get_log(cxlds, cmd);
> +		rc = mock_get_log(mds, cmd);
>   		break;
>   	case CXL_MBOX_OP_IDENTIFY:
>   		if (cxlds->rcd)
> @@ -1207,6 +1209,7 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
>   {
>   	struct device *dev = &pdev->dev;
>   	struct cxl_memdev *cxlmd;
> +	struct cxl_memdev_state *mds;
>   	struct cxl_dev_state *cxlds;
>   	struct cxl_mockmem_data *mdata;
>   	int rc;
> @@ -1223,48 +1226,50 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
>   	if (rc)
>   		return rc;
>   
> -	cxlds = cxl_dev_state_create(dev);
> -	if (IS_ERR(cxlds))
> -		return PTR_ERR(cxlds);
> +	mds = cxl_memdev_state_create(dev);
> +	if (IS_ERR(mds))
> +		return PTR_ERR(mds);
> +
> +	mds->mbox_send = cxl_mock_mbox_send;
> +	mds->payload_size = SZ_4K;
> +	mds->event.buf = (struct cxl_get_event_payload *) mdata->event_buf;
>   
> +	cxlds = &mds->cxlds;
>   	cxlds->serial = pdev->id;
> -	cxlds->mbox_send = cxl_mock_mbox_send;
> -	cxlds->payload_size = SZ_4K;
> -	cxlds->event.buf = (struct cxl_get_event_payload *) mdata->event_buf;
>   	if (is_rcd(pdev)) {
>   		cxlds->rcd = true;
>   		cxlds->component_reg_phys = CXL_RESOURCE_NONE;
>   	}
>   
> -	rc = cxl_enumerate_cmds(cxlds);
> +	rc = cxl_enumerate_cmds(mds);
>   	if (rc)
>   		return rc;
>   
> -	rc = cxl_poison_state_init(cxlds);
> +	rc = cxl_poison_state_init(mds);
>   	if (rc)
>   		return rc;
>   
> -	rc = cxl_set_timestamp(cxlds);
> +	rc = cxl_set_timestamp(mds);
>   	if (rc)
>   		return rc;
>   
>   	cxlds->media_ready = true;
> -	rc = cxl_dev_state_identify(cxlds);
> +	rc = cxl_dev_state_identify(mds);
>   	if (rc)
>   		return rc;
>   
> -	rc = cxl_mem_create_range_info(cxlds);
> +	rc = cxl_mem_create_range_info(mds);
>   	if (rc)
>   		return rc;
>   
> -	mdata->mes.cxlds = cxlds;
> +	mdata->mes.mds = mds;
>   	cxl_mock_add_event_logs(&mdata->mes);
>   
>   	cxlmd = devm_cxl_add_memdev(cxlds);
>   	if (IS_ERR(cxlmd))
>   		return PTR_ERR(cxlmd);
>   
> -	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> +	cxl_mem_get_event_records(mds, CXLDEV_EVENT_STATUS_ALL);
>   
>   	return 0;
>   }
>
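
For readers tracking the conversion above, the relationship between the two
objects reduces to a container_of() pattern. A minimal sketch, assuming the
field names visible in these diffs (the member list is illustrative, not the
literal patch):

	struct cxl_memdev_state {
		struct cxl_dev_state cxlds;	/* generic, Type-2 friendly core */
		size_t payload_size;		/* mailbox payload capacity */
		/* ... mailbox, event, and poison state elided ... */
	};

	static inline struct cxl_memdev_state *
	to_cxl_memdev_state(struct cxl_dev_state *cxlds)
	{
		return container_of(cxlds, struct cxl_memdev_state, cxlds);
	}

With that layout, call sites that only hold a 'struct cxl_dev_state *' (like
the event interrupt thread above) can recover the mailbox-capable superset,
while pure Type-2 devices never allocate it.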

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 06/19] cxl/hdm: Default CXL_DEVTYPE_DEVMEM decoders to CXL_DECODER_DEVMEM
  2023-06-06 11:27   ` Jonathan Cameron
  2023-06-13 21:23     ` Dan Williams
@ 2023-06-13 22:32     ` Dan Williams
  2023-06-14  9:15       ` Jonathan Cameron
  1 sibling, 1 reply; 64+ messages in thread
From: Dan Williams @ 2023-06-13 22:32 UTC (permalink / raw)
  To: Jonathan Cameron, Dan Williams; +Cc: linux-cxl, ira.weiny, navneet.singh

Jonathan Cameron wrote:
> On Sun, 04 Jun 2023 16:32:10 -0700
> Dan Williams <dan.j.williams@intel.com> wrote:
> 
> > In preparation for device-memory region creation, arrange for decoders
> > of CXL_DEVTYPE_DEVMEM memdevs to default to CXL_DECODER_DEVMEM for their
> > target type.
> 
> Why?  CXL_DEVTYPE_DEVMEM might just be a non-class-code-compliant,
> HDM-H-only device.  I'd want those drivers to always set this explicitly.
> 
> 
> > 
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> > ---
> >  drivers/cxl/core/hdm.c |   14 ++++++++++++--
> >  1 file changed, 12 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> > index de8a3fb28331..ca3b99c6eacf 100644
> > --- a/drivers/cxl/core/hdm.c
> > +++ b/drivers/cxl/core/hdm.c
> > @@ -856,12 +856,22 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
> >  		}
> >  		port->commit_end = cxld->id;
> >  	} else {
> > -		/* unless / until type-2 drivers arrive, assume type-3 */
> >  		if (FIELD_GET(CXL_HDM_DECODER0_CTRL_TYPE, ctrl) == 0) {
> >  			ctrl |= CXL_HDM_DECODER0_CTRL_TYPE;
> >  			writel(ctrl, hdm + CXL_HDM_DECODER0_CTRL_OFFSET(which));
> 
> This is setting it to be HOSTMEM if it was previously DEVMEM and that
> makes it inconsistent with the state cached below.
> 
> Not sure why it was conditional in the first place - writing the existing value
> should have been safe and would be less code...

folded in the following...

-- >8 --
diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index 8deb362a7e44..715c1f103739 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -572,7 +572,7 @@ static void cxld_set_type(struct cxl_decoder *cxld, u32 *ctrl)
 {
 	u32p_replace_bits(ctrl,
 			  !!(cxld->target_type == CXL_DECODER_HOSTONLYMEM),
-			  CXL_HDM_DECODER0_CTRL_TYPE);
+			  CXL_HDM_DECODER0_CTRL_HOSTONLY);
 }
 
 static int cxlsd_set_targets(struct cxl_switch_decoder *cxlsd, u64 *tgt)
@@ -840,7 +840,7 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
 		cxld->flags |= CXL_DECODER_F_ENABLE;
 		if (ctrl & CXL_HDM_DECODER0_CTRL_LOCK)
 			cxld->flags |= CXL_DECODER_F_LOCK;
-		if (FIELD_GET(CXL_HDM_DECODER0_CTRL_TYPE, ctrl))
+		if (FIELD_GET(CXL_HDM_DECODER0_CTRL_HOSTONLY, ctrl))
 			cxld->target_type = CXL_DECODER_HOSTONLYMEM;
 		else
 			cxld->target_type = CXL_DECODER_DEVMEM;
@@ -859,14 +859,14 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
 		}
 		port->commit_end = cxld->id;
 	} else {
-		if (FIELD_GET(CXL_HDM_DECODER0_CTRL_TYPE, ctrl) == 0) {
-			ctrl |= CXL_HDM_DECODER0_CTRL_TYPE;
-			writel(ctrl, hdm + CXL_HDM_DECODER0_CTRL_OFFSET(which));
-		}
 		if (cxled) {
 			struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
 			struct cxl_dev_state *cxlds = cxlmd->cxlds;
 
+			/*
+			 * Default by devtype until a device arrives that needs
+			 * more precision.
+			 */
 			if (cxlds->type == CXL_DEVTYPE_CLASSMEM)
 				cxld->target_type = CXL_DECODER_HOSTONLYMEM;
 			else
@@ -875,6 +875,12 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
 			/* To be overridden by region type at commit time */
 			cxld->target_type = CXL_DECODER_HOSTONLYMEM;
 		}
+
+		if (!FIELD_GET(CXL_HDM_DECODER0_CTRL_HOSTONLY, ctrl) &&
+		    cxld->target_type == CXL_DECODER_HOSTONLYMEM) {
+			ctrl |= CXL_HDM_DECODER0_CTRL_HOSTONLY;
+			writel(ctrl, hdm + CXL_HDM_DECODER0_CTRL_OFFSET(which));
+		}
 	}
 	rc = eiw_to_ways(FIELD_GET(CXL_HDM_DECODER0_CTRL_IW_MASK, ctrl),
 			  &cxld->interleave_ways);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index ae0965ac8c5a..f309b1387858 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -56,7 +56,7 @@
 #define   CXL_HDM_DECODER0_CTRL_COMMIT BIT(9)
 #define   CXL_HDM_DECODER0_CTRL_COMMITTED BIT(10)
 #define   CXL_HDM_DECODER0_CTRL_COMMIT_ERROR BIT(11)
-#define   CXL_HDM_DECODER0_CTRL_TYPE BIT(12)
+#define   CXL_HDM_DECODER0_CTRL_HOSTONLY BIT(12)
 #define CXL_HDM_DECODER0_TL_LOW(i) (0x20 * (i) + 0x24)
 #define CXL_HDM_DECODER0_TL_HIGH(i) (0x20 * (i) + 0x28)
 #define CXL_HDM_DECODER0_SKIP_LOW(i) CXL_HDM_DECODER0_TL_LOW(i)

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [PATCH 07/19] cxl/region: Manage decoder target_type at decoder-attach time
  2023-06-04 23:32 ` [PATCH 07/19] cxl/region: Manage decoder target_type at decoder-attach time Dan Williams
  2023-06-06 12:36   ` Jonathan Cameron
@ 2023-06-13 22:42   ` Dave Jiang
  1 sibling, 0 replies; 64+ messages in thread
From: Dave Jiang @ 2023-06-13 22:42 UTC (permalink / raw)
  To: Dan Williams, linux-cxl; +Cc: ira.weiny, navneet.singh


On 6/4/23 16:32, Dan Williams wrote:
> Switch-level (mid-level) decoders between the platform root and an
> endpoint can dynamically switch modes between HDM-H and HDM-D[B]
> depending on which region they target. Use the region type to fixup each
> decoder that gets allocated to map the given region.
>
> Note that endpoint decoders are meant to determine the region type, so
> warn if those ever need to be fixed up, but continue since it is
> possible to do so.
>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Reviewed-by: Dave Jiang <dave.jiang@intel.com>


> ---
>   drivers/cxl/core/region.c |   12 ++++++++++++
>   1 file changed, 12 insertions(+)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index dca94c458b8f..c7170d92f47f 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -809,6 +809,18 @@ static int cxl_rr_alloc_decoder(struct cxl_port *port, struct cxl_region *cxlr,
>   		return -EBUSY;
>   	}
>   
> +	/*
> +	 * Endpoints should already match the region type, but backstop that
> +	 * assumption with an assertion. Switch-decoders change mapping-type
> +	 * based on what is mapped when they are assigned to a region.
> +	 */
> +	dev_WARN_ONCE(&cxlr->dev,
> +		      port == cxled_to_port(cxled) &&
> +			      cxld->target_type != cxlr->type,
> +		      "%s:%s mismatch decoder type %d -> %d\n",
> +		      dev_name(&cxled_to_memdev(cxled)->dev),
> +		      dev_name(&cxld->dev), cxld->target_type, cxlr->type);
> +	cxld->target_type = cxlr->type;
>   	cxl_rr->decoder = cxld;
>   	return 0;
>   }
>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/19] cxl/memdev: Formalize endpoint port linkage
  2023-06-04 23:32 ` [PATCH 09/19] cxl/memdev: Formalize endpoint port linkage Dan Williams
  2023-06-06 13:26   ` Jonathan Cameron
       [not found]   ` <CGME20230607164756uscas1p2fb025e7f4de5094925cc25fc2ac45212@uscas1p2.samsung.com>
@ 2023-06-13 22:59   ` Dave Jiang
  2 siblings, 0 replies; 64+ messages in thread
From: Dave Jiang @ 2023-06-13 22:59 UTC (permalink / raw)
  To: Dan Williams, linux-cxl; +Cc: ira.weiny, navneet.singh


On 6/4/23 16:32, Dan Williams wrote:
> Move the endpoint port that the cxl_mem driver establishes from drvdata
> to a first class attribute. This is in preparation for device-memory
> drivers reusing the CXL core for memory region management. Those drivers
> need a type-safe method to retrieve their CXL port linkage. Leave
> drvdata for private usage of the cxl_mem driver not external consumers
> of a 'struct cxl_memdev' object.
>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> ---
>   drivers/cxl/core/memdev.c |    4 ++--
>   drivers/cxl/core/pmem.c   |    2 +-
>   drivers/cxl/core/port.c   |    5 +++--
>   drivers/cxl/cxlmem.h      |    2 ++
>   4 files changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 3f2d54f30548..65a685e5616f 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -149,7 +149,7 @@ int cxl_trigger_poison_list(struct cxl_memdev *cxlmd)
>   	struct cxl_port *port;
>   	int rc;
>   
> -	port = dev_get_drvdata(&cxlmd->dev);
> +	port = cxlmd->endpoint;
>   	if (!port || !is_cxl_endpoint(port))
>   		return -EINVAL;
>   
> @@ -207,7 +207,7 @@ static struct cxl_region *cxl_dpa_to_region(struct cxl_memdev *cxlmd, u64 dpa)
>   	ctx = (struct cxl_dpa_to_region_context) {
>   		.dpa = dpa,
>   	};
> -	port = dev_get_drvdata(&cxlmd->dev);
> +	port = cxlmd->endpoint;
>   	if (port && is_cxl_endpoint(port) && port->commit_end != -1)
>   		device_for_each_child(&port->dev, &ctx, __cxl_dpa_to_region);
>   
> diff --git a/drivers/cxl/core/pmem.c b/drivers/cxl/core/pmem.c
> index f8c38d997252..fc94f5240327 100644
> --- a/drivers/cxl/core/pmem.c
> +++ b/drivers/cxl/core/pmem.c
> @@ -64,7 +64,7 @@ static int match_nvdimm_bridge(struct device *dev, void *data)
>   
>   struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_memdev *cxlmd)
>   {
> -	struct cxl_port *port = find_cxl_root(dev_get_drvdata(&cxlmd->dev));
> +	struct cxl_port *port = find_cxl_root(cxlmd->endpoint);
>   	struct device *dev;
>   
>   	if (!port)
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 71a7547a8d6f..6720ab22a494 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -1167,7 +1167,7 @@ static struct device *grandparent(struct device *dev)
>   static void delete_endpoint(void *data)
>   {
>   	struct cxl_memdev *cxlmd = data;
> -	struct cxl_port *endpoint = dev_get_drvdata(&cxlmd->dev);
> +	struct cxl_port *endpoint = cxlmd->endpoint;
>   	struct cxl_port *parent_port;
>   	struct device *parent;
>   
> @@ -1182,6 +1182,7 @@ static void delete_endpoint(void *data)
>   		devm_release_action(parent, cxl_unlink_uport, endpoint);
>   		devm_release_action(parent, unregister_port, endpoint);
>   	}
> +	cxlmd->endpoint = NULL;
>   	device_unlock(parent);
>   	put_device(parent);
>   out:
> @@ -1193,7 +1194,7 @@ int cxl_endpoint_autoremove(struct cxl_memdev *cxlmd, struct cxl_port *endpoint)
>   	struct device *dev = &cxlmd->dev;
>   
>   	get_device(&endpoint->dev);
> -	dev_set_drvdata(dev, endpoint);
> +	cxlmd->endpoint = endpoint;
>   	cxlmd->depth = endpoint->depth;
>   	return devm_add_action_or_reset(dev, delete_endpoint, cxlmd);
>   }
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index b8bdf7490d2c..7ee78e79933c 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -38,6 +38,7 @@
>    * @detach_work: active memdev lost a port in its ancestry
>    * @cxl_nvb: coordinate removal of @cxl_nvd if present
>    * @cxl_nvd: optional bridge to an nvdimm if the device supports pmem
> + * @endpoint: connection to the CXL port topology for this memory device
>    * @id: id number of this memdev instance.
>    * @depth: endpoint port depth
>    */
> @@ -48,6 +49,7 @@ struct cxl_memdev {
>   	struct work_struct detach_work;
>   	struct cxl_nvdimm_bridge *cxl_nvb;
>   	struct cxl_nvdimm *cxl_nvd;
> +	struct cxl_port *endpoint;
>   	int id;
>   	int depth;
>   };
>
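
To make the type-safety point concrete, a hedged sketch of the consumer-side
difference (the @endpoint field name is from the diff above; the helper
wrapping it is hypothetical):

	static struct cxl_port *memdev_endpoint(struct cxl_memdev *cxlmd)
	{
		/*
		 * Previously: dev_get_drvdata(&cxlmd->dev), an untyped
		 * void * that callers had to trust was a 'struct cxl_port'.
		 */
		return cxlmd->endpoint;	/* NULL until cxl_mem attaches */
	}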

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 11/19] cxl/region: Factor out construct_region_{begin, end} and drop_region() for reuse
  2023-06-04 23:32 ` [PATCH 11/19] cxl/region: Factor out construct_region_{begin, end} and drop_region() for reuse Dan Williams
  2023-06-06 14:29   ` Jonathan Cameron
@ 2023-06-13 23:29   ` Dave Jiang
  1 sibling, 0 replies; 64+ messages in thread
From: Dave Jiang @ 2023-06-13 23:29 UTC (permalink / raw)
  To: Dan Williams, linux-cxl; +Cc: ira.weiny, navneet.singh


On 6/4/23 16:32, Dan Williams wrote:
> In preparation for constructing regions from newly allocated HPA, factor
> out some helpers that can be shared with the existing kernel-internal
> region construction from BIOS pre-allocated regions. Handle acquiring a
> new region object under the region rwsem, and optionally tearing it down
> if the region assembly process fails.
>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Reviewed-by: Dave Jiang <dave.jiang@intel.com>


> ---
>   drivers/cxl/core/region.c |   73 ++++++++++++++++++++++++++++++++-------------
>   1 file changed, 52 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index c7170d92f47f..bd3c3d4b2683 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -2191,19 +2191,25 @@ cxl_find_region_by_name(struct cxl_root_decoder *cxlrd, const char *name)
>   	return to_cxl_region(region_dev);
>   }
>   
> +static void drop_region(struct cxl_region *cxlr)
> +{
> +	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
> +	struct cxl_port *port = cxlrd_to_port(cxlrd);
> +
> +	devm_release_action(port->uport, unregister_region, cxlr);
> +}
> +
>   static ssize_t delete_region_store(struct device *dev,
>   				   struct device_attribute *attr,
>   				   const char *buf, size_t len)
>   {
>   	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(dev);
> -	struct cxl_port *port = to_cxl_port(dev->parent);
>   	struct cxl_region *cxlr;
>   
>   	cxlr = cxl_find_region_by_name(cxlrd, buf);
>   	if (IS_ERR(cxlr))
>   		return PTR_ERR(cxlr);
> -
> -	devm_release_action(port->uport, unregister_region, cxlr);
> +	drop_region(cxlr);
>   	put_device(&cxlr->dev);
>   
>   	return len;
> @@ -2664,17 +2670,19 @@ static int match_region_by_range(struct device *dev, void *data)
>   	return rc;
>   }
>   
> -/* Establish an empty region covering the given HPA range */
> -static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
> -					   struct cxl_endpoint_decoder *cxled)
> +static void construct_region_end(void)
> +{
> +	up_write(&cxl_region_rwsem);
> +}
> +
> +static struct cxl_region *
> +construct_region_begin(struct cxl_root_decoder *cxlrd,
> +		       struct cxl_endpoint_decoder *cxled)
>   {
>   	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> -	struct cxl_port *port = cxlrd_to_port(cxlrd);
> -	struct range *hpa = &cxled->cxld.hpa_range;
>   	struct cxl_region_params *p;
>   	struct cxl_region *cxlr;
> -	struct resource *res;
> -	int rc;
> +	int err = 0;
>   
>   	do {
>   		cxlr = __create_region(cxlrd, cxled->mode,
> @@ -2693,19 +2701,41 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>   	p = &cxlr->params;
>   	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
>   		dev_err(cxlmd->dev.parent,
> -			"%s:%s: %s autodiscovery interrupted\n",
> +			"%s:%s: %s region setup interrupted\n",
>   			dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
>   			__func__);
> -		rc = -EBUSY;
> -		goto err;
> +		err = -EBUSY;
> +	}
> +
> +	if (err) {
> +		construct_region_end();
> +		drop_region(cxlr);
> +		return ERR_PTR(err);
>   	}
> +	return cxlr;
> +}
> +
> +/* Establish an empty region covering the given HPA range */
> +static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
> +					   struct cxl_endpoint_decoder *cxled)
> +{
> +	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> +	struct range *hpa = &cxled->cxld.hpa_range;
> +	struct cxl_region_params *p;
> +	struct cxl_region *cxlr;
> +	struct resource *res;
> +	int rc;
> +
> +	cxlr = construct_region_begin(cxlrd, cxled);
> +	if (IS_ERR(cxlr))
> +		return cxlr;
>   
>   	set_bit(CXL_REGION_F_AUTO, &cxlr->flags);
>   
>   	res = kmalloc(sizeof(*res), GFP_KERNEL);
>   	if (!res) {
>   		rc = -ENOMEM;
> -		goto err;
> +		goto out;
>   	}
>   
>   	*res = DEFINE_RES_MEM_NAMED(hpa->start, range_len(hpa),
> @@ -2722,6 +2752,7 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>   			 __func__, dev_name(&cxlr->dev));
>   	}
>   
> +	p = &cxlr->params;
>   	p->res = res;
>   	p->interleave_ways = cxled->cxld.interleave_ways;
>   	p->interleave_granularity = cxled->cxld.interleave_granularity;
> @@ -2729,7 +2760,7 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>   
>   	rc = sysfs_update_group(&cxlr->dev.kobj, get_cxl_region_target_group());
>   	if (rc)
> -		goto err;
> +		goto out;
>   
>   	dev_dbg(cxlmd->dev.parent, "%s:%s: %s %s res: %pr iw: %d ig: %d\n",
>   		dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev), __func__,
> @@ -2738,14 +2769,14 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>   
>   	/* ...to match put_device() in cxl_add_to_region() */
>   	get_device(&cxlr->dev);
> -	up_write(&cxl_region_rwsem);
>   
> +out:
> +	construct_region_end();
> +	if (rc) {
> +		drop_region(cxlr);
> +		return ERR_PTR(rc);
> +	}
>   	return cxlr;
> -
> -err:
> -	up_write(&cxl_region_rwsem);
> -	devm_release_action(port->uport, unregister_region, cxlr);
> -	return ERR_PTR(rc);
>   }
>   
>   int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 12/19] cxl/region: Factor out interleave ways setup
  2023-06-04 23:32 ` [PATCH 12/19] cxl/region: Factor out interleave ways setup Dan Williams
  2023-06-06 14:31   ` Jonathan Cameron
@ 2023-06-13 23:30   ` Dave Jiang
  1 sibling, 0 replies; 64+ messages in thread
From: Dave Jiang @ 2023-06-13 23:30 UTC (permalink / raw)
  To: Dan Williams, linux-cxl; +Cc: ira.weiny, navneet.singh


On 6/4/23 16:32, Dan Williams wrote:
> In preparation for kernel driven region creation, factor out a common
> helper from the user-sysfs region setup for interleave_ways.
>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> ---
>   drivers/cxl/core/region.c |   46 ++++++++++++++++++++++++++-------------------
>   1 file changed, 27 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index bd3c3d4b2683..821c2d90154f 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -338,22 +338,14 @@ static ssize_t interleave_ways_show(struct device *dev,
>   
>   static const struct attribute_group *get_cxl_region_target_group(void);
>   
> -static ssize_t interleave_ways_store(struct device *dev,
> -				     struct device_attribute *attr,
> -				     const char *buf, size_t len)
> +static int set_interleave_ways(struct cxl_region *cxlr, int val)
>   {
> -	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(dev->parent);
> +	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
>   	struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
> -	struct cxl_region *cxlr = to_cxl_region(dev);
>   	struct cxl_region_params *p = &cxlr->params;
> -	unsigned int val, save;
> -	int rc;
> +	int save, rc;
>   	u8 iw;
>   
> -	rc = kstrtouint(buf, 0, &val);
> -	if (rc)
> -		return rc;
> -
>   	rc = ways_to_eiw(val, &iw);
>   	if (rc)
>   		return rc;
> @@ -368,21 +360,37 @@ static ssize_t interleave_ways_store(struct device *dev,
>   		return -EINVAL;
>   	}
>   
> -	rc = down_write_killable(&cxl_region_rwsem);
> -	if (rc)
> -		return rc;
> -	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
> -		rc = -EBUSY;
> -		goto out;
> -	}
> +	lockdep_assert_held_write(&cxl_region_rwsem);
> +	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE)
> +		return -EBUSY;
>   
>   	save = p->interleave_ways;
>   	p->interleave_ways = val;
>   	rc = sysfs_update_group(&cxlr->dev.kobj, get_cxl_region_target_group());
>   	if (rc)
>   		p->interleave_ways = save;
> -out:
> +	return rc;
> +}
> +
> +static ssize_t interleave_ways_store(struct device *dev,
> +				     struct device_attribute *attr,
> +				     const char *buf, size_t len)
> +{
> +	struct cxl_region *cxlr = to_cxl_region(dev);
> +	unsigned int val;
> +	int rc;
> +
> +	rc = kstrtouint(buf, 0, &val);
> +	if (rc)
> +		return rc;
> +
> +	rc = down_write_killable(&cxl_region_rwsem);
> +	if (rc)
> +		return rc;
> +
> +	rc = set_interleave_ways(cxlr, val);
>   	up_write(&cxl_region_rwsem);
> +
>   	if (rc)
>   		return rc;
>   	return len;
>
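
The shape of the split, reduced to a generic sketch (the helper name here is
hypothetical); the same pattern repeats for interleave granularity in the
next patch:

	static int set_region_param(struct cxl_region *cxlr, int val)
	{
		struct cxl_region_params *p = &cxlr->params;

		/* kernel-internal callers arrive with the lock held */
		lockdep_assert_held_write(&cxl_region_rwsem);
		if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE)
			return -EBUSY;
		/* ... validate and record @val ... */
		return 0;
	}

sysfs remains the only path that takes the lock itself, via
down_write_killable() around the helper.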

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 13/19] cxl/region: Factor out interleave granularity setup
  2023-06-04 23:32 ` [PATCH 13/19] cxl/region: Factor out interleave granularity setup Dan Williams
  2023-06-06 14:33   ` Jonathan Cameron
@ 2023-06-13 23:42   ` Dave Jiang
  1 sibling, 0 replies; 64+ messages in thread
From: Dave Jiang @ 2023-06-13 23:42 UTC (permalink / raw)
  To: Dan Williams, linux-cxl; +Cc: ira.weiny, navneet.singh



On 6/4/23 16:32, Dan Williams wrote:
> In preparation for kernel driven region creation, factor out a common
> helper from the user-sysfs region setup for interleave_granularity.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> ---
>   drivers/cxl/core/region.c |   39 +++++++++++++++++++++++----------------
>   1 file changed, 23 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 821c2d90154f..4d8dbfedd64a 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -414,21 +414,14 @@ static ssize_t interleave_granularity_show(struct device *dev,
>   	return rc;
>   }
>   
> -static ssize_t interleave_granularity_store(struct device *dev,
> -					    struct device_attribute *attr,
> -					    const char *buf, size_t len)
> +static int set_interleave_granularity(struct cxl_region *cxlr, int val)
>   {
> -	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(dev->parent);
> +	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
>   	struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
> -	struct cxl_region *cxlr = to_cxl_region(dev);
>   	struct cxl_region_params *p = &cxlr->params;
> -	int rc, val;
> +	int rc;
>   	u16 ig;
>   
> -	rc = kstrtoint(buf, 0, &val);
> -	if (rc)
> -		return rc;
> -
>   	rc = granularity_to_eig(val, &ig);
>   	if (rc)
>   		return rc;
> @@ -444,16 +437,30 @@ static ssize_t interleave_granularity_store(struct device *dev,
>   	if (cxld->interleave_ways > 1 && val != cxld->interleave_granularity)
>   		return -EINVAL;
>   
> +	lockdep_assert_held_write(&cxl_region_rwsem);
> +	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE)
> +		return -EBUSY;
> +
> +	p->interleave_granularity = val;
> +	return 0;
> +}
> +
> +static ssize_t interleave_granularity_store(struct device *dev,
> +					    struct device_attribute *attr,
> +					    const char *buf, size_t len)
> +{
> +	struct cxl_region *cxlr = to_cxl_region(dev);
> +	int rc, val;
> +
> +	rc = kstrtoint(buf, 0, &val);
> +	if (rc)
> +		return rc;
> +
>   	rc = down_write_killable(&cxl_region_rwsem);
>   	if (rc)
>   		return rc;
> -	if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
> -		rc = -EBUSY;
> -		goto out;
> -	}
>   
> -	p->interleave_granularity = val;
> -out:
> +	rc = set_interleave_granularity(cxlr, val);
>   	up_write(&cxl_region_rwsem);
>   	if (rc)
>   		return rc;
> 

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 14/19] cxl/region: Clarify locking requirements of cxl_region_attach()
  2023-06-04 23:32 ` [PATCH 14/19] cxl/region: Clarify locking requirements of cxl_region_attach() Dan Williams
  2023-06-06 14:35   ` Jonathan Cameron
@ 2023-06-13 23:45   ` Dave Jiang
  1 sibling, 0 replies; 64+ messages in thread
From: Dave Jiang @ 2023-06-13 23:45 UTC (permalink / raw)
  To: Dan Williams, linux-cxl; +Cc: ira.weiny, navneet.singh



On 6/4/23 16:32, Dan Williams wrote:
> In preparation for cxl_region_attach() being called for kernel initiated
> region creation, enforce the locking context with explicit lockdep
> assertions.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>

> ---
>   drivers/cxl/core/region.c |    3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 4d8dbfedd64a..defc2f0e43e3 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -1587,6 +1587,9 @@ static int cxl_region_attach(struct cxl_region *cxlr,
>   	struct cxl_dport *dport;
>   	int rc = -ENXIO;
>   
> +	lockdep_assert_held_write(&cxl_region_rwsem);
> +	lockdep_assert_held_read(&cxl_dpa_rwsem);
> +
>   	if (cxled->mode != cxlr->mode) {
>   		dev_dbg(&cxlr->dev, "%s region mode: %d mismatch: %d\n",
>   			dev_name(&cxled->cxld.dev), cxlr->mode, cxled->mode);
> 
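
As a usage note, a hedged sketch of the locking bracket a kernel-internal
caller would now need (ordering per the assertions above; the call site
itself is hypothetical):

	down_write(&cxl_region_rwsem);
	down_read(&cxl_dpa_rwsem);
	rc = cxl_region_attach(cxlr, cxled, pos);
	up_read(&cxl_dpa_rwsem);
	up_write(&cxl_region_rwsem);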

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 16/19] cxl/hdm: Define a driver interface for DPA allocation
  2023-06-04 23:33 ` [PATCH 16/19] cxl/hdm: Define a driver interface for DPA allocation Dan Williams
  2023-06-06 14:58   ` Jonathan Cameron
@ 2023-06-13 23:53   ` Dave Jiang
  1 sibling, 0 replies; 64+ messages in thread
From: Dave Jiang @ 2023-06-13 23:53 UTC (permalink / raw)
  To: Dan Williams, linux-cxl; +Cc: ira.weiny, navneet.singh



On 6/4/23 16:33, Dan Williams wrote:
> Region creation involves finding available DPA (device-physical-address)
> capacity to map into HPA (host-physical-address) space. Given the HPA
> capacity constraint, define an API, cxl_request_dpa(), that has the
> flexibility to map the minimum amount of memory the driver needs to
> operate vs the total possible that can be mapped given HPA availability.
> 
> Factor out the core of cxl_dpa_alloc(), that does free space scanning,
> into a cxl_dpa_freespace() helper, and use that to balance the capacity
> available to map vs the @min and @max arguments to cxl_request_dpa().
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>   drivers/cxl/core/hdm.c |  140 +++++++++++++++++++++++++++++++++++++++++-------
>   drivers/cxl/cxl.h      |    6 ++
>   drivers/cxl/cxlmem.h   |    4 +
>   3 files changed, 131 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> index 91ab3033c781..514d30131d92 100644
> --- a/drivers/cxl/core/hdm.c
> +++ b/drivers/cxl/core/hdm.c
> @@ -464,30 +464,17 @@ int cxl_dpa_set_mode(struct cxl_endpoint_decoder *cxled,
>   	return rc;
>   }
>   
> -int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
> +static resource_size_t cxl_dpa_freespace(struct cxl_endpoint_decoder *cxled,

This function name reads odd to me. Maybe cxl_dpa_reserve_freespace()?

DJ

> +					 resource_size_t *start_out,
> +					 resource_size_t *skip_out)
>   {
>   	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
>   	resource_size_t free_ram_start, free_pmem_start;
> -	struct cxl_port *port = cxled_to_port(cxled);
>   	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> -	struct device *dev = &cxled->cxld.dev;
>   	resource_size_t start, avail, skip;
>   	struct resource *p, *last;
> -	int rc;
>   
> -	down_write(&cxl_dpa_rwsem);
> -	if (cxled->cxld.region) {
> -		dev_dbg(dev, "decoder attached to %s\n",
> -			dev_name(&cxled->cxld.region->dev));
> -		rc = -EBUSY;
> -		goto out;
> -	}
> -
> -	if (cxled->cxld.flags & CXL_DECODER_F_ENABLE) {
> -		dev_dbg(dev, "decoder enabled\n");
> -		rc = -EBUSY;
> -		goto out;
> -	}
> +	lockdep_assert_held(&cxl_dpa_rwsem);
>   
>   	for (p = cxlds->ram_res.child, last = NULL; p; p = p->sibling)
>   		last = p;
> @@ -525,11 +512,42 @@ int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
>   			skip_end = start - 1;
>   		skip = skip_end - skip_start + 1;
>   	} else {
> -		dev_dbg(dev, "mode not set\n");
> -		rc = -EINVAL;
> +		dev_dbg(cxled_dev(cxled), "mode not set\n");
> +		avail = 0;
> +	}
> +
> +	if (!avail)
> +		return 0;
> +	if (start_out)
> +		*start_out = start;
> +	if (skip_out)
> +		*skip_out = skip;
> +	return avail;
> +}
> +
> +int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
> +{
> +	struct cxl_port *port = cxled_to_port(cxled);
> +	struct device *dev = &cxled->cxld.dev;
> +	resource_size_t start, avail, skip;
> +	int rc;
> +
> +	down_write(&cxl_dpa_rwsem);
> +	if (cxled->cxld.region) {
> +		dev_dbg(dev, "decoder attached to %s\n",
> +			dev_name(&cxled->cxld.region->dev));
> +		rc = -EBUSY;
> +		goto out;
> +	}
> +
> +	if (cxled->cxld.flags & CXL_DECODER_F_ENABLE) {
> +		dev_dbg(dev, "decoder enabled\n");
> +		rc = -EBUSY;
>   		goto out;
>   	}
>   
> +	avail = cxl_dpa_freespace(cxled, &start, &skip);
> +
>   	if (size > avail) {
>   		dev_dbg(dev, "%pa exceeds available %s capacity: %pa\n", &size,
>   			cxled->mode == CXL_DECODER_RAM ? "ram" : "pmem",
> @@ -548,6 +566,90 @@ int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
>   	return devm_add_action_or_reset(&port->dev, cxl_dpa_release, cxled);
>   }
>   
> +static int find_free_decoder(struct device *dev, void *data)
> +{
> +	struct cxl_endpoint_decoder *cxled;
> +	struct cxl_port *port;
> +
> +	if (!is_endpoint_decoder(dev))
> +		return 0;
> +
> +	cxled = to_cxl_endpoint_decoder(dev);
> +	port = cxled_to_port(cxled);
> +
> +	if (cxled->cxld.id != port->hdm_end + 1)
> +		return 0;
> +	return 1;
> +}
> +
> +/**
> + * cxl_request_dpa - search and reserve DPA given input constraints
> + * @endpoint: an endpoint port with available decoders
> + * @mode: DPA operation mode (ram vs pmem)
> + * @min: the minimum amount of capacity the call needs
> + * @max: extra capacity to allocate after min is satisfied
> + *
> + * Given that a region needs to allocate from limited HPA capacity it
> + * may be the case that a device has more mappable DPA capacity than
> + * available HPA. So, the expectation is that @min is a driver known
> + * value for how much capacity is needed, and @max is based on the limit of
> + * how much HPA space is available for a new region.
> + *
> + * Returns a pinned cxl_decoder with at least @min bytes of capacity
> + * reserved, or an error pointer. The caller is also expected to own the
> + * lifetime of the memdev registration associated with the endpoint in
> + * order to keep the returned decoder's registration pinned as well.
> + */
> +struct cxl_endpoint_decoder *cxl_request_dpa(struct cxl_port *endpoint,
> +					     enum cxl_decoder_mode mode,
> +					     resource_size_t min,
> +					     resource_size_t max)
> +{
> +	struct cxl_endpoint_decoder *cxled;
> +	struct device *cxled_dev;
> +	resource_size_t alloc;
> +	int rc;
> +
> +	if (!IS_ALIGNED(min | max, SZ_256M))
> +		return ERR_PTR(-EINVAL);
> +
> +	down_read(&cxl_dpa_rwsem);
> +	cxled_dev = device_find_child(&endpoint->dev, NULL, find_free_decoder);
> +	if (!cxled_dev)
> +		cxled = ERR_PTR(-ENXIO);
> +	else
> +		cxled = to_cxl_endpoint_decoder(cxled_dev);
> +	up_read(&cxl_dpa_rwsem);
> +
> +	if (IS_ERR(cxled))
> +		return cxled;
> +
> +	rc = cxl_dpa_set_mode(cxled, mode);
> +	if (rc)
> +		goto err;
> +
> +	down_read(&cxl_dpa_rwsem);
> +	alloc = cxl_dpa_freespace(cxled, NULL, NULL);
> +	up_read(&cxl_dpa_rwsem);
> +
> +	if (max)
> +		alloc = min(max, alloc);
> +	if (alloc < min) {
> +		rc = -ENOMEM;
> +		goto err;
> +	}
> +
> +	rc = cxl_dpa_alloc(cxled, alloc);
> +	if (rc)
> +		goto err;
> +
> +	return cxled;
> +err:
> +	put_device(cxled_dev);
> +	return ERR_PTR(rc);
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_request_dpa, CXL);
> +
>   static void cxld_set_interleave(struct cxl_decoder *cxld, u32 *ctrl)
>   {
>   	u16 eig;
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 258c90727dd2..55808697773f 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -680,6 +680,12 @@ struct cxl_decoder *to_cxl_decoder(struct device *dev);
>   struct cxl_root_decoder *to_cxl_root_decoder(struct device *dev);
>   struct cxl_switch_decoder *to_cxl_switch_decoder(struct device *dev);
>   struct cxl_endpoint_decoder *to_cxl_endpoint_decoder(struct device *dev);
> +
> +static inline struct device *cxled_dev(struct cxl_endpoint_decoder *cxled)
> +{
> +	return &cxled->cxld.dev;
> +}
> +
>   bool is_root_decoder(struct device *dev);
>   bool is_switch_decoder(struct device *dev);
>   bool is_endpoint_decoder(struct device *dev);
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index e3bcd6d12a1c..8ec5c305d186 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -89,6 +89,10 @@ struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds);
>   int devm_cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
>   			 resource_size_t base, resource_size_t len,
>   			 resource_size_t skipped);
> +struct cxl_endpoint_decoder *cxl_request_dpa(struct cxl_port *endpoint,
> +					     enum cxl_decoder_mode mode,
> +					     resource_size_t min,
> +					     resource_size_t max);
>   
>   static inline struct cxl_ep *cxl_ep_load(struct cxl_port *port,
>   					 struct cxl_memdev *cxlmd)
> 
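
A usage sketch to go with the kernel-doc above; the surrounding
accelerator-driver context ('endpoint' and 'max_hpa') is assumed for
illustration:

	struct cxl_endpoint_decoder *cxled;

	/*
	 * Need at least 256MB of volatile capacity, take up to the HPA
	 * limit a prior cxl_hpa_freespace() call reported. Note that both
	 * @min and @max must be 256MB aligned.
	 */
	cxled = cxl_request_dpa(endpoint, CXL_DECODER_RAM, SZ_256M,
				round_down(max_hpa, SZ_256M));
	if (IS_ERR(cxled))
		return PTR_ERR(cxled);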

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 17/19] cxl/region: Define a driver interface for HPA free space enumeration
  2023-06-04 23:33 ` [PATCH 17/19] cxl/region: Define a driver interface for HPA free space enumeration Dan Williams
  2023-06-06 15:23   ` Jonathan Cameron
@ 2023-06-14  0:15   ` Dave Jiang
  1 sibling, 0 replies; 64+ messages in thread
From: Dave Jiang @ 2023-06-14  0:15 UTC (permalink / raw)
  To: Dan Williams, linux-cxl; +Cc: ira.weiny, navneet.singh



On 6/4/23 16:33, Dan Williams wrote:
> CXL region creation involves allocating capacity from device DPA
> (device-physical-address space) and assigning it to decode a given HPA
> (host-physical-address space). Before determining how much DPA to
> allocate, the amount of available HPA must be determined. Also, not all
> HPA is created equal: some specifically targets RAM, some targets PMEM,
> some is prepared for the device-memory flows like HDM-D and HDM-DB, and
> some is host-only (HDM-H).
> 
> Wrap all of those concerns into an API that retrieves a root decoder
> (platform CXL window) that fits the specified constraints and the
> capacity available for a new region.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>   drivers/cxl/core/region.c |  143 +++++++++++++++++++++++++++++++++++++++++++++
>   drivers/cxl/cxl.h         |    5 ++
>   drivers/cxl/cxlmem.h      |    5 ++
>   3 files changed, 153 insertions(+)
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 75c5de627868..a41756249f8d 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -575,6 +575,149 @@ static int free_hpa(struct cxl_region *cxlr)
>   	return 0;
>   }
>   
> +struct cxlrd_max_context {
> +	struct device * const *host_bridges;
> +	int interleave_ways;
> +	unsigned long flags;
> +	resource_size_t max_hpa;
> +	struct cxl_root_decoder *cxlrd;
> +};
> +
> +static int find_max_hpa(struct device *dev, void *data)
> +{
> +	struct cxlrd_max_context *ctx = data;
> +	struct cxl_switch_decoder *cxlsd;
> +	struct cxl_root_decoder *cxlrd;
> +	struct resource *res, *prev;
> +	struct cxl_decoder *cxld;
> +	resource_size_t max;
> +	unsigned int seq;
> +	int found;
> +
> +	if (!is_root_decoder(dev))
> +		return 0;
> +
> +	cxlrd = to_cxl_root_decoder(dev);
> +	cxld = &cxlrd->cxlsd.cxld;
> +	if ((cxld->flags & ctx->flags) != ctx->flags)
> +		return 0;
> +
> +	if (cxld->interleave_ways != ctx->interleave_ways)
> +		return 0;
> +
> +	cxlsd = &cxlrd->cxlsd;
> +	do {
> +		found = 0;
> +		seq = read_seqbegin(&cxlsd->target_lock);
> +		for (int i = 0; i < ctx->interleave_ways; i++)
> +			for (int j = 0; j < ctx->interleave_ways; j++)
> +				if (ctx->host_bridges[i] ==
> +				    cxlsd->target[j]->dport) {
> +					found++;
> +					break;
> +				}
> +	} while (read_seqretry(&cxlsd->target_lock, seq));
> +
> +	if (found != ctx->interleave_ways)
> +		return 0;
> +
> +	/*
> +	 * Walk the root decoder resource range relying on cxl_region_rwsem to
> +	 * preclude sibling arrival/departure and find the largest free space
> +	 * gap.
> +	 */
> +	lockdep_assert_held_read(&cxl_region_rwsem);
> +	max = 0;
> +	res = cxlrd->res->child;
> +	if (!res)
> +		max = resource_size(cxlrd->res);
> +	else
> +		max = 0;
> +	for (prev = NULL; res; prev = res, res = res->sibling) {
> +		struct resource *next = res->sibling;
> +		resource_size_t free = 0;
> +
> +		if (!prev && res->start > cxlrd->res->start) {
> +			free = res->start - cxlrd->res->start;
> +			max = max(free, max);
> +		}
> +		if (prev && res->start > prev->end + 1) {
> +			free = res->start - prev->end + 1;
> +			max = max(free, max);
> +		}

Can skip the extra compare, not sure if it's worth the extra level of 
indent.

		if (!prev) {
			if (res->start > cxlrd->res->start) {
				free = res->start - cxlrd->res->start;
				max = max(free, max);
			}
		} else {
			if (res->start > prev->end + 1) {
				free = res->start - prev->end + 1;
				max = max(free, max);
			}
		}


Same below.

DJ

> +		if (next && res->end + 1 < next->start) {
> +			free = next->start - res->end + 1;
> +			max = max(free, max);
> +		}
> +		if (!next && res->end + 1 < cxlrd->res->end + 1) {
> +			free = cxlrd->res->end + 1 - res->end + 1;
> +			max = max(free, max);
> +		}
> +	}
> +
> +	if (max > ctx->max_hpa) {
> +		if (ctx->cxlrd)
> +			put_device(cxlrd_dev(ctx->cxlrd));
> +		get_device(cxlrd_dev(cxlrd));
> +		ctx->cxlrd = cxlrd;
> +		ctx->max_hpa = max;
> +		dev_dbg(cxlrd_dev(cxlrd), "found %pa bytes of free space\n", &max);
> +	}
> +
> +	return 0;
> +}
> +
> +/**
> + * cxl_hpa_freespace - find a root decoder with free capacity per constraints
> + * @endpoint: an endpoint that is mapped by the returned decoder
> + * @host_bridges: array of host-bridges that the decoder must interleave
> + * @interleave_ways: number of entries in @host_bridges
> + * @flags: CXL_DECODER_F flags for selecting RAM vs PMEM, and HDM-H vs HDM-D[B]
> + * @max: output parameter of bytes available in the returned decoder
> + *
> + * The return tuple of a 'struct cxl_root_decoder' and 'bytes available (@max)'
> + * is a point-in-time snapshot. If, by the time the caller goes to use this
> + * root decoder's capacity, the capacity has been reduced, the caller needs
> + * to loop and retry.
> + *
> + * The returned root decoder has an elevated reference count that needs to be
> + * put with put_device(cxlrd_dev(cxlrd)). Locking context is with
> + * cxl_{acquire,release}_endpoint(), that ensures removal of the root decoder
> + * does not race.
> + */
> +struct cxl_root_decoder *cxl_hpa_freespace(struct cxl_port *endpoint,
> +					   struct device *const *host_bridges,
> +					   int interleave_ways,
> +					   unsigned long flags,
> +					   resource_size_t *max)
> +{
> +	struct cxlrd_max_context ctx = {
> +		.host_bridges = host_bridges,
> +		.interleave_ways = interleave_ways,
> +		.flags = flags,
> +	};
> +	struct cxl_port *root;
> +
> +	if (!is_cxl_endpoint(endpoint))
> +		return ERR_PTR(-EINVAL);
> +
> +	root = find_cxl_root(endpoint);
> +	if (!root)
> +		return ERR_PTR(-ENXIO);
> +
> +	down_read(&cxl_region_rwsem);
> +	device_for_each_child(&root->dev, &ctx, find_max_hpa);
> +	up_read(&cxl_region_rwsem);
> +	put_device(&root->dev);
> +
> +	if (!ctx.cxlrd)
> +		return ERR_PTR(-ENOMEM);
> +
> +	*max = ctx.max_hpa;
> +	return ctx.cxlrd;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_hpa_freespace, CXL);
> +
>   static ssize_t size_store(struct device *dev, struct device_attribute *attr,
>   			  const char *buf, size_t len)
>   {
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 55808697773f..8400af85d99f 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -686,6 +686,11 @@ static inline struct device *cxled_dev(struct cxl_endpoint_decoder *cxled)
>   	return &cxled->cxld.dev;
>   }
>   
> +static inline struct device *cxlrd_dev(struct cxl_root_decoder *cxlrd)
> +{
> +	return &cxlrd->cxlsd.cxld.dev;
> +}
> +
>   bool is_root_decoder(struct device *dev);
>   bool is_switch_decoder(struct device *dev);
>   bool is_endpoint_decoder(struct device *dev);
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 8ec5c305d186..69f07186502d 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -93,6 +93,11 @@ struct cxl_endpoint_decoder *cxl_request_dpa(struct cxl_port *endpoint,
>   					     enum cxl_decoder_mode mode,
>   					     resource_size_t min,
>   					     resource_size_t max);
> +struct cxl_root_decoder *cxl_hpa_freespace(struct cxl_port *endpoint,
> +					   struct device *const *host_bridges,
> +					   int interleave_ways,
> +					   unsigned long flags,
> +					   resource_size_t *max);
>   
>   static inline struct cxl_ep *cxl_ep_load(struct cxl_port *port,
>   					 struct cxl_memdev *cxlmd)
> 
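
Putting the two driver interfaces together, a sketch of the intended flow
(the cxl_test accelerator in patch 19 is the real user; the single
host-bridge setup and the flag combination below are assumptions):

	resource_size_t max = 0;
	struct cxl_root_decoder *cxlrd;
	struct cxl_endpoint_decoder *cxled;

	/* one host bridge, volatile, device-coherent (HDM-D[B]) capable */
	cxlrd = cxl_hpa_freespace(endpoint, &host_bridge, 1,
				  CXL_DECODER_F_RAM | CXL_DECODER_F_TYPE2,
				  &max);
	if (IS_ERR(cxlrd))
		return PTR_ERR(cxlrd);

	cxled = cxl_request_dpa(endpoint, CXL_DECODER_RAM, SZ_256M,
				round_down(max, SZ_256M));
	if (IS_ERR(cxled)) {
		put_device(cxlrd_dev(cxlrd));
		return PTR_ERR(cxled);
	}
	/* @cxlrd and @cxled then feed region creation (patch 18) */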

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 03/19] cxl/mbox: Move mailbox related driver state to its own data structure
  2023-06-06 11:10   ` Jonathan Cameron
@ 2023-06-14  0:45     ` Dan Williams
  0 siblings, 0 replies; 64+ messages in thread
From: Dan Williams @ 2023-06-14  0:45 UTC (permalink / raw)
  To: Jonathan Cameron, Dan Williams; +Cc: linux-cxl, ira.weiny, navneet.singh

Jonathan Cameron wrote:
> On Sun, 04 Jun 2023 16:31:54 -0700
> Dan Williams <dan.j.williams@intel.com> wrote:
> 
> > 'struct cxl_dev_state' makes too many assumptions about the capabilities
> > of a CXL device. In particular it assumes a CXL device has a mailbox and
> > all of the infrastructure and state that comes along with that.
> > 
> > In preparation for supporting accelerator / Type-2 devices that may not
> > have a mailbox and in general maintain a minimal core context structure,
> > make mailbox functionality a super-set of 'struct cxl_dev_state' with
> > 'struct cxl_memdev_state'.
> > 
> > With this reorganization it allows for CXL devices that support HDM
> > decoder mapping, but not other general-expander / Type-3 capabilities,
> > to only enable that subset without the rest of the mailbox
> > infrastructure coming along for the ride.
> > 
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> 
> I'm not yet sure that the division is exactly in the right place, but we
> can move things later if it turns out some elements are more general than
> we currently think.

Agree, it is more along the lines of: the current trajectory of 'struct
cxl_dev_state' is unsustainable, stem the tide, and revisit as needed.

> 
> A few trivial things inline.
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> 
> > ---
> 
>  
> > -static struct cxl_mbox_get_supported_logs *cxl_get_gsl(struct cxl_dev_state *cxlds)
> > +static struct cxl_mbox_get_supported_logs *
> > +cxl_get_gsl(struct cxl_memdev_state *mds)
> 
> I'd consider keeping this on one line.  It was between 80 and 90 before and still is...
> 
> 
> >  {
> 
> 
> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > index a2845a7a69d8..d3fe73d5ba4d 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> > @@ -267,6 +267,35 @@ struct cxl_poison_state {
> >   * @cxl_dvsec: Offset to the PCIe device DVSEC
> >   * @rcd: operating in RCD mode (CXL 3.0 9.11.8 CXL Devices Attached to an RCH)
> >   * @media_ready: Indicate whether the device media is usable
> > + * @dpa_res: Overall DPA resource tree for the device
> > + * @pmem_res: Active Persistent memory capacity configuration
> > + * @ram_res: Active Volatile memory capacity configuration
> > + * @component_reg_phys: register base of component registers
> > + * @info: Cached DVSEC information about the device.
> 
> Not seeing info in this structure.
> 
> > + * @serial: PCIe Device Serial Number
> > + */
> > +struct cxl_dev_state {
> > +	struct device *dev;
> > +	struct cxl_memdev *cxlmd;
> > +	struct cxl_regs regs;
> > +	int cxl_dvsec;
> > +	bool rcd;
> > +	bool media_ready;
> > +	struct resource dpa_res;
> > +	struct resource pmem_res;
> > +	struct resource ram_res;
> > +	resource_size_t component_reg_phys;
> > +	u64 serial;
> > +};
> > +
> > +/**
> > + * struct cxl_memdev_state - Generic Type-3 Memory Device Class driver data
> > + *
> > + * CXL 8.1.12.1 PCI Header - Class Code Register Memory Device defines
> > + * common memory device functionality like the presence of a mailbox and
> > + * the functionality related to that, like Identify Memory Device and Get
> > + * Partition Info.
> > + * @cxlds: Core driver state common across Type-2 and Type-3 devices
> >   * @payload_size: Size of space for payload
> >   *                (CXL 2.0 8.2.8.4.3 Mailbox Capabilities Register)
> >   * @lsa_size: Size of Label Storage Area
> > @@ -275,9 +304,6 @@ struct cxl_poison_state {
> >   * @firmware_version: Firmware version for the memory device.
> >   * @enabled_cmds: Hardware commands found enabled in CEL.
> >   * @exclusive_cmds: Commands that are kernel-internal only
> > - * @dpa_res: Overall DPA resource tree for the device
> > - * @pmem_res: Active Persistent memory capacity configuration
> > - * @ram_res: Active Volatile memory capacity configuration
> >   * @total_bytes: sum of all possible capacities
> >   * @volatile_only_bytes: hard volatile capacity
> >   * @persistent_only_bytes: hard persistent capacity
> > @@ -286,54 +312,41 @@ struct cxl_poison_state {
> >   * @active_persistent_bytes: sum of hard + soft persistent
> >   * @next_volatile_bytes: volatile capacity change pending device reset
> >   * @next_persistent_bytes: persistent capacity change pending device reset
> > - * @component_reg_phys: register base of component registers
> > - * @info: Cached DVSEC information about the device.
> 
> Not seeing this removed from this structure in this patch.
> Curiously, it doesn't seem to have been here in the first place.
> 
> Probably wants a precursor fix patch to get rid of it from the docs.

I did some digging and it turns out cxlmem.h is not even built during a
docs build, but I went ahead and added another patch to clean up the
warnings from:

./scripts/kernel-doc drivers/cxl/cxlmem.h

I will note, though, that the extra kernel-doc descriptor was not
flagged; only missing attribute definitions are flagged.
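To visualize the layering being discussed here: 'struct cxl_dev_state' is
the always-present core, and the mailbox-capable superset embeds it, so
mailbox users up-cast with container_of(). A minimal sketch, assuming the
embedding shown in the hunks above (the helper name is an assumption):

	struct cxl_memdev_state {
		struct cxl_dev_state cxlds;	/* common Type-2 / Type-3 core state */
		size_t payload_size;		/* mailbox-specific state follows */
		/* ... */
	};

	/* up-cast from core state to the mailbox superset (assumed helper) */
	static inline struct cxl_memdev_state *
	to_cxl_memdev_state(struct cxl_dev_state *cxlds)
	{
		return container_of(cxlds, struct cxl_memdev_state, cxlds);
	}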


* Re: [PATCH 08/19] cxl/port: Enumerate flit mode capability
  2023-06-06 13:04   ` Jonathan Cameron
@ 2023-06-14  1:06     ` Dan Williams
  0 siblings, 0 replies; 64+ messages in thread
From: Dan Williams @ 2023-06-14  1:06 UTC (permalink / raw)
  To: Jonathan Cameron, Dan Williams; +Cc: linux-cxl, ira.weiny, navneet.singh

Jonathan Cameron wrote:
> On Sun, 04 Jun 2023 16:32:21 -0700
> Dan Williams <dan.j.williams@intel.com> wrote:
> 
> > Per CXL 3.0 Section 9.14 Back-Invalidation Configuration, in order to
> > enable an HDM-DB range (a CXL.mem region with device-initiated
> > back-invalidation support), all ports in the path between the endpoint
> > and the host bridge must be in 256-byte flit mode.
> > 
> > Even for typical Type-3 class devices it is useful to enumerate link
> > capabilities through the chain for debug purposes.
> > 
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> 
> A few minor comments. In particular, the field you have in here doesn't
> distinguish between 256-byte flits and otherwise.  That's done with the
> PCIe spec field, not this one, which is about latency optimization.
> 
> > ---
> >  drivers/cxl/core/hdm.c  |    2 +
> >  drivers/cxl/core/pci.c  |   84 +++++++++++++++++++++++++++++++++++++++++++++++
> >  drivers/cxl/core/port.c |    6 +++
> >  drivers/cxl/cxl.h       |    2 +
> >  drivers/cxl/cxlpci.h    |   25 +++++++++++++-
> >  drivers/cxl/port.c      |    5 +++
> >  6 files changed, 122 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> > index ca3b99c6eacf..91ab3033c781 100644
> > --- a/drivers/cxl/core/hdm.c
> > +++ b/drivers/cxl/core/hdm.c
> > @@ -3,8 +3,10 @@
> >  #include <linux/seq_file.h>
> >  #include <linux/device.h>
> >  #include <linux/delay.h>
> > +#include <linux/pci.h>
> >  
> >  #include "cxlmem.h"
> > +#include "cxlpci.h"
> >  #include "core.h"
> I'm not following why a link-related patch should change the includes in
> an HDM-related C file.  Maybe this makes sense later once you use it?

Definitely. I missed that this straggled in here.

> 
> 
> >  
> >  /**
> > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> > index 67f4ab6daa34..b62ec17ccdde 100644
> > --- a/drivers/cxl/core/pci.c
> > +++ b/drivers/cxl/core/pci.c
> > @@ -519,6 +519,90 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> 
> > +
> > +int cxl_probe_link(struct cxl_port *port)
> > +{
> > +	struct pci_dev *pdev = cxl_port_to_pci(port);
> > +	u16 cap, en, parent_features;
> > +	struct cxl_port *parent_port;
> > +	struct device *dev;
> > +	int rc, dvsec;
> > +	u32 hdr;
> > +
> > +	if (!pdev) {
> > +		/*
> > +		 * Assume host bridges support all features, the root
> > +		 * port will dictate the actual enabled set to endpoints.
> > +		 */
> > +		return 0;
> > +	}
> > +
> > +	dev = &pdev->dev;
> > +	dvsec = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
> > +					  CXL_DVSEC_FLEXBUS_PORT);
> > +	if (!dvsec) {
> > +		dev_err(dev, "Failed to enumerate port capabilities\n");
> > +		return -ENXIO;
> > +	}
> > +
> > +	/*
> > +	 * Cache the link features for future determination of HDM-D or
> > +	 * HDM-DB support
> > +	 */
> > +	rc = pci_read_config_dword(pdev, dvsec + PCI_DVSEC_HEADER1, &hdr);
> > +	if (rc)
> > +		return rc;
> > +
> > +	rc = pci_read_config_word(pdev, dvsec + CXL_DVSEC_FLEXBUS_CAP_OFFSET,
> > +				  &cap);
> > +	if (rc)
> > +		return rc;
> > +
> > +	rc = pci_read_config_word(pdev, dvsec + CXL_DVSEC_FLEXBUS_STATUS_OFFSET,
> > +				  &en);
> > +	if (rc)
> > +		return rc;
> > +
> > +	if (PCI_DVSEC_HEADER1_REV(hdr) < 2)
> > +		cap &= ~CXL_DVSEC_FLEXBUS_REV2_MASK;
> > +
> > +	if (PCI_DVSEC_HEADER1_REV(hdr) < 1)
> > +		cap &= ~CXL_DVSEC_FLEXBUS_REV1_MASK;
> 
> I talk about this below, but I'd not normally expect to see this.
> Anyone who used those bits outside the usage defined by later specs has
> buggy hardware, and that should be handled with a quirk rather than built
> in here.

True, makes sense.

> 
> > +
> > +	en &= cap;
> > +	parent_port = to_cxl_port(port->dev.parent);
> > +	parent_features = parent_port->features;
> > +
> > +	/* Enforce port features are plumbed through to the host bridge */
> > +	port->features = en & CXL_DVSEC_FLEXBUS_ENABLE_MASK & parent_features;
> > +
> > +	dev_dbg(dev, "features:%s%s%s%s%s%s%s\n",
> > +		en & CXL_DVSEC_FLEXBUS_CACHE_ENABLED ? " cache" : "",
> > +		en & CXL_DVSEC_FLEXBUS_IO_ENABLED ? " io" : "",
> > +		en & CXL_DVSEC_FLEXBUS_MEM_ENABLED ? " mem" : "",
> > +		en & CXL_DVSEC_FLEXBUS_FLIT68_ENABLED ? " flit68" : "",
> > +		en & CXL_DVSEC_FLEXBUS_MLD_ENABLED ? " mld" : "",
> > +		en & CXL_DVSEC_FLEXBUS_FLIT256_ENABLED ? " flit256" : "",
> 
> Definitely want that text to be more explicit about latency optimization.

Ok, see below, I think dropping flit size altogether from these names
makes sense.

> 
> > +		en & CXL_DVSEC_FLEXBUS_PBR_ENABLED ? " pbr" : "");
> > +
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL_NS_GPL(cxl_probe_link, CXL);
> > +
> >  #define CXL_DOE_TABLE_ACCESS_REQ_CODE		0x000000ff
> >  #define   CXL_DOE_TABLE_ACCESS_REQ_CODE_READ	0
> >  #define CXL_DOE_TABLE_ACCESS_TABLE_TYPE		0x0000ff00
> 
> 
> > diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> > index 7c02e55b8042..7f82ffb5b4be 100644
> > --- a/drivers/cxl/cxlpci.h
> > +++ b/drivers/cxl/cxlpci.h
> > @@ -45,8 +45,28 @@
> >  /* CXL 2.0 8.1.7: GPF DVSEC for CXL Device */
> >  #define CXL_DVSEC_DEVICE_GPF					5
> >  
> > -/* CXL 2.0 8.1.8: PCIe DVSEC for Flex Bus Port */
> > -#define CXL_DVSEC_PCIE_FLEXBUS_PORT				7
> > +/* CXL 3.0 8.2.1.3: PCIe DVSEC for Flex Bus Port */
> > +#define CXL_DVSEC_FLEXBUS_PORT					7
> > +#define   CXL_DVSEC_FLEXBUS_CAP_OFFSET		0xA
> > +#define     CXL_DVSEC_FLEXBUS_CACHE_CAPABLE	BIT(0)
> > +#define     CXL_DVSEC_FLEXBUS_IO_CAPABLE	BIT(1)
> > +#define     CXL_DVSEC_FLEXBUS_MEM_CAPABLE	BIT(2)
> > +#define     CXL_DVSEC_FLEXBUS_FLIT68_CAPABLE	BIT(5)
> 
> This one includes the stuff that makes it 2.0 rather than 1.1.  Might need
> a longer name to avoid misuse?  (I checked the 1.1 spec; the bit is
> reserved there, so it would be 0.)

So maybe I will drop the flit size from the name until a conflict
arises. I.e. this bit is relevant for all known flit sizes that support
VH topologies.

> 
> > +#define     CXL_DVSEC_FLEXBUS_MLD_CAPABLE	BIT(6)
> > +#define     CXL_DVSEC_FLEXBUS_REV1_MASK		GENMASK(6, 5)
> 
> Unusual approach.  It shouldn't be needed: those bits were RsvdP, so no
> one should have set them, and now that we are supporting the new bits we
> should be fine without masking.

Agree.

> 
> > +#define     CXL_DVSEC_FLEXBUS_FLIT256_CAPABLE	BIT(13)
> 
> Not just flit256, but the latency-optimized one (the flit is roughly split
> in two, with separate CRCs).  So this name needs to be something like
> FLEXBUS_LAT_OPT_FLIT256_CAPABLE

Until a non-flit256 latency-optimized mechanism is added, the flit size
is redundant in this name.

> 
> 
> > +#define     CXL_DVSEC_FLEXBUS_PBR_CAPABLE	BIT(14)
> > +#define     CXL_DVSEC_FLEXBUS_REV2_MASK		GENMASK(14, 13)
> > +#define   CXL_DVSEC_FLEXBUS_STATUS_OFFSET	0xE
> > +#define     CXL_DVSEC_FLEXBUS_CACHE_ENABLED	BIT(0)
> > +#define     CXL_DVSEC_FLEXBUS_IO_ENABLED	BIT(1)
> > +#define     CXL_DVSEC_FLEXBUS_MEM_ENABLED	BIT(2)
> > +#define     CXL_DVSEC_FLEXBUS_FLIT68_ENABLED	BIT(5)
> 
> Again, not just FLIT68, but the VH stuff from CXL 2.0 as well.
> 
> > +#define     CXL_DVSEC_FLEXBUS_MLD_ENABLED	BIT(6)
> > +#define     CXL_DVSEC_FLEXBUS_FLIT256_ENABLED	BIT(13)
> Also, latency optimized is the key property here, not 256-byte flits
> (though you need those as well).
> 
> > +#define     CXL_DVSEC_FLEXBUS_PBR_ENABLED	BIT(14)
> > +#define     CXL_DVSEC_FLEXBUS_ENABLE_MASK \
> > +	(GENMASK(2, 0) | GENMASK(6, 5) | GENMASK(14, 13))
> Ok - I guess the RsvdP semantics require this dance.
> >  
> 
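Worth noting for the later HDM-DB enabling: because cxl_probe_link() ANDs
each port's enabled features with its parent's, the endpoint's cached
features already summarize the whole path to the host bridge. An
eligibility check could then reduce to something like this hypothetical
sketch (the helper name is an assumption):

	/* the endpoint's features reflect every link up to the host bridge */
	static bool cxl_path_flit256(struct cxl_port *endpoint)
	{
		return endpoint->features & CXL_DVSEC_FLEXBUS_FLIT256_ENABLED;
	}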




* Re: [PATCH 06/19] cxl/hdm: Default CXL_DEVTYPE_DEVMEM decoders to CXL_DECODER_DEVMEM
  2023-06-13 22:32     ` Dan Williams
@ 2023-06-14  9:15       ` Jonathan Cameron
  0 siblings, 0 replies; 64+ messages in thread
From: Jonathan Cameron @ 2023-06-14  9:15 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-cxl, ira.weiny, navneet.singh

On Tue, 13 Jun 2023 15:32:54 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> Jonathan Cameron wrote:
> > On Sun, 04 Jun 2023 16:32:10 -0700
> > Dan Williams <dan.j.williams@intel.com> wrote:
> >   
> > > In preparation for device-memory region creation, arrange for decoders
> > > of CXL_DEVTYPE_DEVMEM memdevs to default to CXL_DECODER_DEVMEM for their
> > > target type.  
> > 
> > Why?  CXL_DEVTYPE_DEVMEM might just be a non-class-code-compliant,
> > HDM-H-only device.  I'd want those drivers to always set this explicitly.
> > 
> >   
> > > 
> > > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> > > ---
> > >  drivers/cxl/core/hdm.c |   14 ++++++++++++--
> > >  1 file changed, 12 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> > > index de8a3fb28331..ca3b99c6eacf 100644
> > > --- a/drivers/cxl/core/hdm.c
> > > +++ b/drivers/cxl/core/hdm.c
> > > @@ -856,12 +856,22 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
> > >  		}
> > >  		port->commit_end = cxld->id;
> > >  	} else {
> > > -		/* unless / until type-2 drivers arrive, assume type-3 */
> > >  		if (FIELD_GET(CXL_HDM_DECODER0_CTRL_TYPE, ctrl) == 0) {
> > >  			ctrl |= CXL_HDM_DECODER0_CTRL_TYPE;
> > >  			writel(ctrl, hdm + CXL_HDM_DECODER0_CTRL_OFFSET(which));  
> > 
> > This is setting it to HOSTMEM if it was previously DEVMEM, and that
> > makes it inconsistent with the state cached below.
> > 
> > Not sure why it was conditional in the first place - writing back the
> > existing value should have been safe and would be less code...
> 
> folded in the following...
LGTM
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> 
> -- >8 --  
> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> index 8deb362a7e44..715c1f103739 100644
> --- a/drivers/cxl/core/hdm.c
> +++ b/drivers/cxl/core/hdm.c
> @@ -572,7 +572,7 @@ static void cxld_set_type(struct cxl_decoder *cxld, u32 *ctrl)
>  {
>  	u32p_replace_bits(ctrl,
>  			  !!(cxld->target_type == CXL_DECODER_HOSTONLYMEM),
> -			  CXL_HDM_DECODER0_CTRL_TYPE);
> +			  CXL_HDM_DECODER0_CTRL_HOSTONLY);
>  }
>  
>  static int cxlsd_set_targets(struct cxl_switch_decoder *cxlsd, u64 *tgt)
> @@ -840,7 +840,7 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
>  		cxld->flags |= CXL_DECODER_F_ENABLE;
>  		if (ctrl & CXL_HDM_DECODER0_CTRL_LOCK)
>  			cxld->flags |= CXL_DECODER_F_LOCK;
> -		if (FIELD_GET(CXL_HDM_DECODER0_CTRL_TYPE, ctrl))
> +		if (FIELD_GET(CXL_HDM_DECODER0_CTRL_HOSTONLY, ctrl))
>  			cxld->target_type = CXL_DECODER_HOSTONLYMEM;
>  		else
>  			cxld->target_type = CXL_DECODER_DEVMEM;
> @@ -859,14 +859,14 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
>  		}
>  		port->commit_end = cxld->id;
>  	} else {
> -		if (FIELD_GET(CXL_HDM_DECODER0_CTRL_TYPE, ctrl) == 0) {
> -			ctrl |= CXL_HDM_DECODER0_CTRL_TYPE;
> -			writel(ctrl, hdm + CXL_HDM_DECODER0_CTRL_OFFSET(which));
> -		}
>  		if (cxled) {
>  			struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
>  			struct cxl_dev_state *cxlds = cxlmd->cxlds;
>  
> +			/*
> +			 * Default by devtype until a device arrives that needs
> +			 * more precision.
> +			 */
>  			if (cxlds->type == CXL_DEVTYPE_CLASSMEM)
>  				cxld->target_type = CXL_DECODER_HOSTONLYMEM;
>  			else
> @@ -875,6 +875,12 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
>  			/* To be overridden by region type at commit time */
>  			cxld->target_type = CXL_DECODER_HOSTONLYMEM;
>  		}
> +
> +		if (!FIELD_GET(CXL_HDM_DECODER0_CTRL_HOSTONLY, ctrl) &&
> +		    cxld->target_type == CXL_DECODER_HOSTONLYMEM) {
> +			ctrl |= CXL_HDM_DECODER0_CTRL_HOSTONLY;
> +			writel(ctrl, hdm + CXL_HDM_DECODER0_CTRL_OFFSET(which));
> +		}
>  	}
>  	rc = eiw_to_ways(FIELD_GET(CXL_HDM_DECODER0_CTRL_IW_MASK, ctrl),
>  			  &cxld->interleave_ways);
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index ae0965ac8c5a..f309b1387858 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -56,7 +56,7 @@
>  #define   CXL_HDM_DECODER0_CTRL_COMMIT BIT(9)
>  #define   CXL_HDM_DECODER0_CTRL_COMMITTED BIT(10)
>  #define   CXL_HDM_DECODER0_CTRL_COMMIT_ERROR BIT(11)
> -#define   CXL_HDM_DECODER0_CTRL_TYPE BIT(12)
> +#define   CXL_HDM_DECODER0_CTRL_HOSTONLY BIT(12)
>  #define CXL_HDM_DECODER0_TL_LOW(i) (0x20 * (i) + 0x24)
>  #define CXL_HDM_DECODER0_TL_HIGH(i) (0x20 * (i) + 0x28)
>  #define CXL_HDM_DECODER0_SKIP_LOW(i) CXL_HDM_DECODER0_TL_LOW(i)



end of thread, other threads:[~2023-06-14  9:15 UTC | newest]

Thread overview: 64+ messages
2023-06-04 23:31 [PATCH 00/19] cxl: Device memory setup Dan Williams
2023-06-04 23:31 ` [PATCH 01/19] cxl/regs: Clarify when a 'struct cxl_register_map' is input vs output Dan Williams
2023-06-05  8:46   ` Jonathan Cameron
2023-06-13 22:03   ` Dave Jiang
2023-06-04 23:31 ` [PATCH 02/19] tools/testing/cxl: Remove unused @cxlds argument Dan Williams
2023-06-06 10:53   ` Jonathan Cameron
2023-06-13 22:08   ` Dave Jiang
2023-06-04 23:31 ` [PATCH 03/19] cxl/mbox: Move mailbox related driver state to its own data structure Dan Williams
2023-06-06 11:10   ` Jonathan Cameron
2023-06-14  0:45     ` Dan Williams
2023-06-13 22:15   ` Dave Jiang
2023-06-04 23:31 ` [PATCH 04/19] cxl/memdev: Make mailbox functionality optional Dan Williams
2023-06-06 11:15   ` Jonathan Cameron
2023-06-13 20:53     ` Dan Williams
2023-06-04 23:32 ` [PATCH 05/19] cxl/port: Rename CXL_DECODER_{EXPANDER, ACCELERATOR} => {HOSTMEM, DEVMEM} Dan Williams
2023-06-06 11:21   ` Jonathan Cameron
2023-06-13 21:03     ` Dan Williams
2023-06-04 23:32 ` [PATCH 06/19] cxl/hdm: Default CXL_DEVTYPE_DEVMEM decoders to CXL_DECODER_DEVMEM Dan Williams
2023-06-05  1:14   ` kernel test robot
2023-06-06 20:10     ` Dan Williams
2023-06-06 11:27   ` Jonathan Cameron
2023-06-13 21:23     ` Dan Williams
2023-06-13 22:32     ` Dan Williams
2023-06-14  9:15       ` Jonathan Cameron
2023-06-04 23:32 ` [PATCH 07/19] cxl/region: Manage decoder target_type at decoder-attach time Dan Williams
2023-06-06 12:36   ` Jonathan Cameron
2023-06-13 22:42   ` Dave Jiang
2023-06-04 23:32 ` [PATCH 08/19] cxl/port: Enumerate flit mode capability Dan Williams
2023-06-06 13:04   ` Jonathan Cameron
2023-06-14  1:06     ` Dan Williams
2023-06-04 23:32 ` [PATCH 09/19] cxl/memdev: Formalize endpoint port linkage Dan Williams
2023-06-06 13:26   ` Jonathan Cameron
     [not found]   ` <CGME20230607164756uscas1p2fb025e7f4de5094925cc25fc2ac45212@uscas1p2.samsung.com>
2023-06-07 16:47     ` Fan Ni
2023-06-13 22:59   ` Dave Jiang
2023-06-04 23:32 ` [PATCH 10/19] cxl/memdev: Indicate probe deferral Dan Williams
2023-06-06 13:54   ` Jonathan Cameron
2023-06-04 23:32 ` [PATCH 11/19] cxl/region: Factor out construct_region_{begin, end} and drop_region() for reuse Dan Williams
2023-06-06 14:29   ` Jonathan Cameron
2023-06-13 23:29   ` Dave Jiang
2023-06-04 23:32 ` [PATCH 12/19] cxl/region: Factor out interleave ways setup Dan Williams
2023-06-06 14:31   ` Jonathan Cameron
2023-06-13 23:30   ` Dave Jiang
2023-06-04 23:32 ` [PATCH 13/19] cxl/region: Factor out interleave granularity setup Dan Williams
2023-06-06 14:33   ` Jonathan Cameron
2023-06-13 23:42   ` Dave Jiang
2023-06-04 23:32 ` [PATCH 14/19] cxl/region: Clarify locking requirements of cxl_region_attach() Dan Williams
2023-06-06 14:35   ` Jonathan Cameron
2023-06-13 23:45   ` Dave Jiang
2023-06-04 23:33 ` [PATCH 15/19] cxl/region: Specify host-only vs device memory at region creation time Dan Williams
2023-06-06 14:42   ` Jonathan Cameron
2023-06-04 23:33 ` [PATCH 16/19] cxl/hdm: Define a driver interface for DPA allocation Dan Williams
2023-06-06 14:58   ` Jonathan Cameron
2023-06-13 23:53   ` Dave Jiang
2023-06-04 23:33 ` [PATCH 17/19] cxl/region: Define a driver interface for HPA free space enumeration Dan Williams
2023-06-06 15:23   ` Jonathan Cameron
2023-06-14  0:15   ` Dave Jiang
2023-06-04 23:33 ` [PATCH 18/19] cxl/region: Define a driver interface for region creation Dan Williams
2023-06-06 15:31   ` Jonathan Cameron
2023-06-04 23:33 ` [PATCH 19/19] tools/testing/cxl: Emulate a CXL accelerator with local memory Dan Williams
2023-06-06 15:34   ` Jonathan Cameron
2023-06-07 21:09   ` Vikram Sethi
2023-06-08 10:47     ` Jonathan Cameron
2023-06-08 14:34       ` Vikram Sethi
2023-06-08 15:22         ` Jonathan Cameron
