nvdimm.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] cxl/pmem: Add core infrastructure for PMEM support
@ 2021-06-10 22:25 Dan Williams
  2021-06-10 22:26 ` [PATCH 1/5] cxl/core: Add cxl-bus driver infrastructure Dan Williams
                   ` (5 more replies)
  0 siblings, 6 replies; 18+ messages in thread
From: Dan Williams @ 2021-06-10 22:25 UTC (permalink / raw)
  To: linux-cxl
  Cc: nvdimm, ben.widawsky, alison.schofield, vishal.l.verma,
	ira.weiny, linux-kernel

CXL Memory Expander devices (CXL 2.0 Type-3) support persistent memory
in addition to volatile memory expansion. The most significant changes
this requires of the existing LIBNVDIMM infrastructure, compared to what
was needed to support ACPI NFIT defined PMEM, is the ability to
dynamically provision regions in addition to namespaces, and a formal
model for hotplug.

Before region provisioning can be added the CXL enabling needs to
enumerate "nvdimm" devices on a CXL nvdimm-bus. This is modeled as a
CXL-nvdimm-bridge device (bridging CXL to nvdimm) and an associated
driver to activate and deactivate that bus-bridge. Once the bridge is
registered it scans for CXL nvdimm devices registered by endpoints.  The
CXL core bus is used as a rendezvous for nvdimm bridges and endpoints
allowing them to be registered and enabled in any order.

At the end of this series the ndctl utility can see CXL nvdimm resources
just like any other nvdimm bus.

    # ndctl list -BDiu -b CXL
    {
      "provider":"CXL",
      "dev":"ndbus1",
      "dimms":[
        { 
          "dev":"nmem1",
          "state":"disabled"
        },
        { 
          "dev":"nmem0",
          "state":"disabled"
        }
      ]
    }

Follow-on patches extend the nvdimm core label support for CXL region
and namespace labels. For now just add the machinery to register the
bus and nvdimm base objects.

---

Dan Williams (5):
      cxl/core: Add cxl-bus driver infrastructure
      cxl/pmem: Add initial infrastructure for pmem support
      libnvdimm: Export nvdimm shutdown helper, nvdimm_delete()
      libnvdimm: Drop unused device power management support
      cxl/pmem: Register 'pmem' / cxl_nvdimm  devices


 drivers/cxl/Kconfig        |   13 ++
 drivers/cxl/Makefile       |    2 
 drivers/cxl/acpi.c         |   37 +++++-
 drivers/cxl/core.c         |  281 ++++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h          |   57 +++++++++
 drivers/cxl/mem.h          |    2 
 drivers/cxl/pci.c          |   23 +++-
 drivers/cxl/pmem.c         |  230 ++++++++++++++++++++++++++++++++++++
 drivers/nvdimm/bus.c       |   64 ++++++----
 drivers/nvdimm/dimm_devs.c |   18 +++
 include/linux/libnvdimm.h  |    1 
 11 files changed, 697 insertions(+), 31 deletions(-)
 create mode 100644 drivers/cxl/pmem.c

base-commit: 40ba17afdfabb01688c61565dbe02a916241bc05

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/5] cxl/core: Add cxl-bus driver infrastructure
  2021-06-10 22:25 [PATCH 0/5] cxl/pmem: Add core infrastructure for PMEM support Dan Williams
@ 2021-06-10 22:26 ` Dan Williams
  2021-06-11 17:47   ` Ben Widawsky
  2021-06-10 22:26 ` [PATCH 2/5] cxl/pmem: Add initial infrastructure for pmem support Dan Williams
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Dan Williams @ 2021-06-10 22:26 UTC (permalink / raw)
  To: linux-cxl
  Cc: nvdimm, ben.widawsky, alison.schofield, vishal.l.verma,
	ira.weiny, linux-kernel

Enable devices on the 'cxl' bus to be attached to drivers. The initial
user of this functionality is a driver for an 'nvdimm-bridge' device
that anchors a libnvdimm hierarchy attached to CXL persistent memory
resources. Other device types that will leverage this include:

cxl_port: map and use component register functionality (HDM Decoders)

cxl_nvdimm: translate CXL memory expander endpoints to libnvdimm
	    'nvdimm' objects

cxl_region: translate CXL interleave sets to libnvdimm 'region' objects

The pairing of devices to drivers is handled through the cxl_device_id()
matching to cxl_driver.id values. A cxl_device_id() of '0' indicates no
driver support.

In addition to ->match(), ->probe(), and ->remove() support for the
'cxl' bus introduce MODULE_ALIAS_CXL() to autoload modules containing
cxl-drivers. Drivers are added in follow-on changes.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core.c |   73 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h  |   22 ++++++++++++++++
 2 files changed, 95 insertions(+)

diff --git a/drivers/cxl/core.c b/drivers/cxl/core.c
index 1b9ee0b08384..959cecc1f6bf 100644
--- a/drivers/cxl/core.c
+++ b/drivers/cxl/core.c
@@ -767,8 +767,81 @@ int cxl_map_device_regs(struct pci_dev *pdev,
 }
 EXPORT_SYMBOL_GPL(cxl_map_device_regs);
 
+/**
+ * __cxl_driver_register - register a driver for the cxl bus
+ * @cxl_drv: cxl driver structure to attach
+ * @owner: owning module/driver
+ * @modname: KBUILD_MODNAME for parent driver
+ */
+int __cxl_driver_register(struct cxl_driver *cxl_drv, struct module *owner,
+			  const char *modname)
+{
+	if (!cxl_drv->probe) {
+		pr_debug("%s ->probe() must be specified\n", modname);
+		return -EINVAL;
+	}
+
+	if (!cxl_drv->name) {
+		pr_debug("%s ->name must be specified\n", modname);
+		return -EINVAL;
+	}
+
+	if (!cxl_drv->id) {
+		pr_debug("%s ->id must be specified\n", modname);
+		return -EINVAL;
+	}
+
+	cxl_drv->drv.bus = &cxl_bus_type;
+	cxl_drv->drv.owner = owner;
+	cxl_drv->drv.mod_name = modname;
+	cxl_drv->drv.name = cxl_drv->name;
+
+	return driver_register(&cxl_drv->drv);
+}
+EXPORT_SYMBOL_GPL(__cxl_driver_register);
+
+void cxl_driver_unregister(struct cxl_driver *cxl_drv)
+{
+	driver_unregister(&cxl_drv->drv);
+}
+EXPORT_SYMBOL_GPL(cxl_driver_unregister);
+
+static int cxl_device_id(struct device *dev)
+{
+	return 0;
+}
+
+static int cxl_bus_uevent(struct device *dev, struct kobj_uevent_env *env)
+{
+	return add_uevent_var(env, "MODALIAS=" CXL_MODALIAS_FMT,
+			      cxl_device_id(dev));
+}
+
+static int cxl_bus_match(struct device *dev, struct device_driver *drv)
+{
+	return cxl_device_id(dev) == to_cxl_drv(drv)->id;
+}
+
+static int cxl_bus_probe(struct device *dev)
+{
+	return to_cxl_drv(dev->driver)->probe(dev);
+}
+
+static int cxl_bus_remove(struct device *dev)
+{
+	struct cxl_driver *cxl_drv = to_cxl_drv(dev->driver);
+
+	if (cxl_drv->remove)
+		cxl_drv->remove(dev);
+	return 0;
+}
+
 struct bus_type cxl_bus_type = {
 	.name = "cxl",
+	.uevent = cxl_bus_uevent,
+	.match = cxl_bus_match,
+	.probe = cxl_bus_probe,
+	.remove = cxl_bus_remove,
 };
 EXPORT_SYMBOL_GPL(cxl_bus_type);
 
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index b988ea288f53..af2237d1c761 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -261,4 +261,26 @@ devm_cxl_add_passthrough_decoder(struct device *host, struct cxl_port *port)
 }
 
 extern struct bus_type cxl_bus_type;
+
+struct cxl_driver {
+	const char *name;
+	int (*probe)(struct device *dev);
+	void (*remove)(struct device *dev);
+	struct device_driver drv;
+	int id;
+};
+
+static inline struct cxl_driver *to_cxl_drv(struct device_driver *drv)
+{
+	return container_of(drv, struct cxl_driver, drv);
+}
+
+int __cxl_driver_register(struct cxl_driver *cxl_drv, struct module *owner,
+			  const char *modname);
+#define cxl_driver_register(x) __cxl_driver_register(x, THIS_MODULE, KBUILD_MODNAME)
+void cxl_driver_unregister(struct cxl_driver *cxl_drv);
+
+#define MODULE_ALIAS_CXL(type) MODULE_ALIAS("cxl:t" __stringify(type) "*")
+#define CXL_MODALIAS_FMT "cxl:t%d"
+
 #endif /* __CXL_H__ */


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 2/5] cxl/pmem: Add initial infrastructure for pmem support
  2021-06-10 22:25 [PATCH 0/5] cxl/pmem: Add core infrastructure for PMEM support Dan Williams
  2021-06-10 22:26 ` [PATCH 1/5] cxl/core: Add cxl-bus driver infrastructure Dan Williams
@ 2021-06-10 22:26 ` Dan Williams
  2021-06-11 11:39   ` Jonathan Cameron
  2021-06-10 22:26 ` [PATCH 3/5] libnvdimm: Export nvdimm shutdown helper, nvdimm_delete() Dan Williams
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Dan Williams @ 2021-06-10 22:26 UTC (permalink / raw)
  To: linux-cxl
  Cc: nvdimm, ben.widawsky, alison.schofield, vishal.l.verma,
	ira.weiny, linux-kernel

Register an 'nvdimm-bridge' device to act as an anchor for a libnvdimm
bus hierarchy. Also, flesh out the cxl_bus definition to allow a
cxl_nvdimm_bridge_driver to attach to the bridge and trigger the
nvdimm-bus registration.

The creation of the bridge is gated on the detection of a PMEM capable
address space registered to the root. The bridge indirection allows the
libnvdimm module to remain unloaded on platforms without PMEM support.

Given that the probing of ACPI0017 is asynchronous to CXL endpoint
devices, and the expectation that CXL endpoint devices register other
PMEM resources on the 'CXL' nvdimm bus, a workqueue is added. The
workqueue is needed to run bus_rescan_devices() outside of the
device_lock() of the nvdimm-bridge device to rendezvous nvdimm resources
as they arrive. For now only the bus is taken online/offline in the
workqueue.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/Kconfig  |   13 +++++
 drivers/cxl/Makefile |    2 +
 drivers/cxl/acpi.c   |   37 ++++++++++++-
 drivers/cxl/core.c   |  122 +++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h    |   24 +++++++++
 drivers/cxl/pmem.c   |  141 ++++++++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 337 insertions(+), 2 deletions(-)
 create mode 100644 drivers/cxl/pmem.c

diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index 1a44b173dcbc..e6de221cc568 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -61,5 +61,18 @@ config CXL_ACPI
 	  hierarchy to map regions that represent System RAM, or Persistent
 	  Memory regions to be managed by LIBNVDIMM.
 
+	  If unsure say 'm'.
+
+config CXL_PMEM
+	tristate "CXL PMEM: Persistent Memory Support"
+	depends on LIBNVDIMM
+	default CXL_BUS
+	help
+	  In addition to typical memory resources a platform may also advertise
+	  support for persistent memory attached via CXL. This support is
+	  managed via a bridge driver from CXL to the LIBNVDIMM system
+	  subsystem. Say 'y/m' to enable support for enumerating and
+	  provisioning the persistent memory capacity of CXL memory expanders.
+
 	  If unsure say 'm'.
 endif
diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
index a29efb3e8ad2..32954059b37b 100644
--- a/drivers/cxl/Makefile
+++ b/drivers/cxl/Makefile
@@ -2,8 +2,10 @@
 obj-$(CONFIG_CXL_BUS) += cxl_core.o
 obj-$(CONFIG_CXL_MEM) += cxl_pci.o
 obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
+obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
 
 ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=CXL
 cxl_core-y := core.o
 cxl_pci-y := pci.o
 cxl_acpi-y := acpi.o
+cxl_pmem-y := pmem.o
diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index be357eea552c..8a723f7f3f73 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -145,6 +145,30 @@ static int add_host_bridge_dport(struct device *match, void *arg)
 	return 0;
 }
 
+static int add_root_nvdimm_bridge(struct device *match, void *data)
+{
+	struct cxl_decoder *cxld;
+	struct cxl_port *root_port = data;
+	struct cxl_nvdimm_bridge *cxl_nvb;
+	struct device *host = root_port->dev.parent;
+
+	if (!is_root_decoder(match))
+		return 0;
+
+	cxld = to_cxl_decoder(match);
+	if (!(cxld->flags & CXL_DECODER_F_PMEM))
+		return 0;
+
+	cxl_nvb = devm_cxl_add_nvdimm_bridge(host, root_port);
+	if (IS_ERR(cxl_nvb)) {
+		dev_dbg(host, "failed to register pmem\n");
+		return PTR_ERR(cxl_nvb);
+	}
+	dev_dbg(host, "%s: add: %s\n", dev_name(&root_port->dev),
+		dev_name(&cxl_nvb->dev));
+	return 1;
+}
+
 static int cxl_acpi_probe(struct platform_device *pdev)
 {
 	int rc;
@@ -166,8 +190,17 @@ static int cxl_acpi_probe(struct platform_device *pdev)
 	 * Root level scanned with host-bridge as dports, now scan host-bridges
 	 * for their role as CXL uports to their CXL-capable PCIe Root Ports.
 	 */
-	return bus_for_each_dev(adev->dev.bus, NULL, root_port,
-				add_host_bridge_uport);
+	rc = bus_for_each_dev(adev->dev.bus, NULL, root_port,
+			      add_host_bridge_uport);
+	if (rc)
+		return rc;
+
+	if (IS_ENABLED(CONFIG_CXL_PMEM))
+		rc = device_for_each_child(&root_port->dev, root_port,
+					   add_root_nvdimm_bridge);
+	if (rc < 0)
+		return rc;
+	return 0;
 }
 
 static const struct acpi_device_id cxl_acpi_ids[] = {
diff --git a/drivers/cxl/core.c b/drivers/cxl/core.c
index 959cecc1f6bf..f0305c9c91c8 100644
--- a/drivers/cxl/core.c
+++ b/drivers/cxl/core.c
@@ -187,6 +187,12 @@ static const struct device_type cxl_decoder_root_type = {
 	.groups = cxl_decoder_root_attribute_groups,
 };
 
+bool is_root_decoder(struct device *dev)
+{
+	return dev->type == &cxl_decoder_root_type;
+}
+EXPORT_SYMBOL_GPL(is_root_decoder);
+
 struct cxl_decoder *to_cxl_decoder(struct device *dev)
 {
 	if (dev_WARN_ONCE(dev, dev->type->release != cxl_decoder_release,
@@ -194,6 +200,7 @@ struct cxl_decoder *to_cxl_decoder(struct device *dev)
 		return NULL;
 	return container_of(dev, struct cxl_decoder, dev);
 }
+EXPORT_SYMBOL_GPL(to_cxl_decoder);
 
 static void cxl_dport_release(struct cxl_dport *dport)
 {
@@ -611,6 +618,119 @@ void cxl_probe_component_regs(struct device *dev, void __iomem *base,
 }
 EXPORT_SYMBOL_GPL(cxl_probe_component_regs);
 
+static void cxl_nvdimm_bridge_release(struct device *dev)
+{
+	struct cxl_nvdimm_bridge *cxl_nvb = to_cxl_nvdimm_bridge(dev);
+
+	kfree(cxl_nvb);
+}
+
+static const struct attribute_group *cxl_nvdimm_bridge_attribute_groups[] = {
+	&cxl_base_attribute_group,
+	NULL,
+};
+
+static const struct device_type cxl_nvdimm_bridge_type = {
+	.name = "cxl_nvdimm_bridge",
+	.release = cxl_nvdimm_bridge_release,
+	.groups = cxl_nvdimm_bridge_attribute_groups,
+};
+
+struct cxl_nvdimm_bridge *to_cxl_nvdimm_bridge(struct device *dev)
+{
+	if (dev_WARN_ONCE(dev, dev->type != &cxl_nvdimm_bridge_type,
+			  "not a cxl_nvdimm_bridge device\n"))
+		return NULL;
+	return container_of(dev, struct cxl_nvdimm_bridge, dev);
+}
+EXPORT_SYMBOL_GPL(to_cxl_nvdimm_bridge);
+
+static struct cxl_nvdimm_bridge *
+cxl_nvdimm_bridge_alloc(struct cxl_port *port)
+{
+	struct cxl_nvdimm_bridge *cxl_nvb;
+	struct device *dev;
+
+	cxl_nvb = kzalloc(sizeof(*cxl_nvb), GFP_KERNEL);
+	if (!cxl_nvb)
+		return ERR_PTR(-ENOMEM);
+
+	dev = &cxl_nvb->dev;
+	cxl_nvb->port = port;
+	cxl_nvb->state = CXL_NVB_NEW;
+	device_initialize(dev);
+	device_set_pm_not_required(dev);
+	dev->parent = &port->dev;
+	dev->bus = &cxl_bus_type;
+	dev->type = &cxl_nvdimm_bridge_type;
+
+	return cxl_nvb;
+}
+
+static void unregister_nvb(void *_cxl_nvb)
+{
+	struct cxl_nvdimm_bridge *cxl_nvb = _cxl_nvb;
+	bool flush = false;
+
+	/*
+	 * If the bridge was ever activated then there might be in-flight state
+	 * work to flush. Once the state has been changed to 'dead' then no new
+	 * work can be queued by user-triggered bind.
+	 */
+	device_lock(&cxl_nvb->dev);
+	if (cxl_nvb->state != CXL_NVB_NEW)
+		flush = true;
+	cxl_nvb->state = CXL_NVB_DEAD;
+	device_unlock(&cxl_nvb->dev);
+
+	/*
+	 * Even though the device core will trigger device_release_driver()
+	 * before the unregister, it does not know about the fact that
+	 * cxl_nvdimm_bridge_driver defers ->remove() work. So, do the driver
+	 * release not and flush it before tearing down the nvdimm device
+	 * hierarchy.
+	 */
+	device_release_driver(&cxl_nvb->dev);
+	if (flush)
+		flush_work(&cxl_nvb->state_work);
+	device_unregister(&cxl_nvb->dev);
+}
+
+struct cxl_nvdimm_bridge *devm_cxl_add_nvdimm_bridge(struct device *host,
+						     struct cxl_port *port)
+{
+	struct cxl_nvdimm_bridge *cxl_nvb;
+	struct device *dev;
+	int rc;
+
+	if (!IS_ENABLED(CONFIG_CXL_PMEM))
+		return ERR_PTR(-ENXIO);
+
+	cxl_nvb = cxl_nvdimm_bridge_alloc(port);
+	if (IS_ERR(cxl_nvb))
+		return cxl_nvb;
+
+	dev = &cxl_nvb->dev;
+	rc = dev_set_name(dev, "nvdimm-bridge");
+	if (rc)
+		goto err;
+
+	rc = device_add(dev);
+	if (rc)
+		goto err;
+
+	rc = devm_add_action_or_reset(host, unregister_nvb, cxl_nvb);
+	if (rc)
+		return ERR_PTR(rc);
+
+	return cxl_nvb;
+
+err:
+	put_device(dev);
+	return ERR_PTR(rc);
+}
+EXPORT_SYMBOL_GPL(devm_cxl_add_nvdimm_bridge);
+
 /**
  * cxl_probe_device_regs() - Detect CXL Device register blocks
  * @dev: Host device of the @base mapping
@@ -808,6 +928,8 @@ EXPORT_SYMBOL_GPL(cxl_driver_unregister);
 
 static int cxl_device_id(struct device *dev)
 {
+	if (dev->type == &cxl_nvdimm_bridge_type)
+		return CXL_DEVICE_NVDIMM_BRIDGE;
 	return 0;
 }
 
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index af2237d1c761..47fcb7ad5978 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -4,6 +4,7 @@
 #ifndef __CXL_H__
 #define __CXL_H__
 
+#include <linux/libnvdimm.h>
 #include <linux/bitfield.h>
 #include <linux/bitops.h>
 #include <linux/io.h>
@@ -195,6 +196,23 @@ struct cxl_decoder {
 	struct cxl_dport *target[];
 };
 
+
+enum cxl_nvdimm_brige_state {
+	CXL_NVB_NEW,
+	CXL_NVB_DEAD,
+	CXL_NVB_ONLINE,
+	CXL_NVB_OFFLINE,
+};
+
+struct cxl_nvdimm_bridge {
+	struct device dev;
+	struct cxl_port *port;
+	struct nvdimm_bus *nvdimm_bus;
+	struct nvdimm_bus_descriptor nd_desc;
+	struct work_struct state_work;
+	enum cxl_nvdimm_brige_state state;
+};
+
 /**
  * struct cxl_port - logical collection of upstream port devices and
  *		     downstream port devices to construct a CXL memory
@@ -240,6 +258,7 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport, int port_id,
 		  resource_size_t component_reg_phys);
 
 struct cxl_decoder *to_cxl_decoder(struct device *dev);
+bool is_root_decoder(struct device *dev);
 struct cxl_decoder *
 devm_cxl_add_decoder(struct device *host, struct cxl_port *port, int nr_targets,
 		     resource_size_t base, resource_size_t len,
@@ -280,7 +299,12 @@ int __cxl_driver_register(struct cxl_driver *cxl_drv, struct module *owner,
 #define cxl_driver_register(x) __cxl_driver_register(x, THIS_MODULE, KBUILD_MODNAME)
 void cxl_driver_unregister(struct cxl_driver *cxl_drv);
 
+#define CXL_DEVICE_NVDIMM_BRIDGE 1
+
 #define MODULE_ALIAS_CXL(type) MODULE_ALIAS("cxl:t" __stringify(type) "*")
 #define CXL_MODALIAS_FMT "cxl:t%d"
 
+struct cxl_nvdimm_bridge *to_cxl_nvdimm_bridge(struct device *dev);
+struct cxl_nvdimm_bridge *devm_cxl_add_nvdimm_bridge(struct device *host,
+						     struct cxl_port *port);
 #endif /* __CXL_H__ */
diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
new file mode 100644
index 000000000000..0067bd734559
--- /dev/null
+++ b/drivers/cxl/pmem.c
@@ -0,0 +1,141 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
+#include <linux/libnvdimm.h>
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include "cxl.h"
+
+/*
+ * Ordered workqueue for cxl nvdimm device arrival and departure
+ * to coordinate bus rescans when a bridge arrives and trigger remove
+ * operations when the bridge is removed.
+ */
+static struct workqueue_struct *cxl_pmem_wq;
+
+static int cxl_pmem_ctl(struct nvdimm_bus_descriptor *nd_desc,
+			struct nvdimm *nvdimm, unsigned int cmd, void *buf,
+			unsigned int buf_len, int *cmd_rc)
+{
+	return -ENOTTY;
+}
+
+static void online_nvdimm_bus(struct cxl_nvdimm_bridge *cxl_nvb)
+{
+	if (cxl_nvb->nvdimm_bus)
+		return;
+	cxl_nvb->nvdimm_bus =
+		nvdimm_bus_register(&cxl_nvb->dev, &cxl_nvb->nd_desc);
+}
+
+static void offline_nvdimm_bus(struct cxl_nvdimm_bridge *cxl_nvb)
+{
+	if (!cxl_nvb->nvdimm_bus)
+		return;
+	nvdimm_bus_unregister(cxl_nvb->nvdimm_bus);
+	cxl_nvb->nvdimm_bus = NULL;
+}
+
+static void cxl_nvb_update_state(struct work_struct *work)
+{
+	struct cxl_nvdimm_bridge *cxl_nvb =
+		container_of(work, typeof(*cxl_nvb), state_work);
+	bool release = false;
+
+	device_lock(&cxl_nvb->dev);
+	switch (cxl_nvb->state) {
+	case CXL_NVB_ONLINE:
+		online_nvdimm_bus(cxl_nvb);
+		if (!cxl_nvb->nvdimm_bus) {
+			dev_err(&cxl_nvb->dev,
+				"failed to establish nvdimm bus\n");
+			release = true;
+		}
+		break;
+	case CXL_NVB_OFFLINE:
+	case CXL_NVB_DEAD:
+		offline_nvdimm_bus(cxl_nvb);
+		break;
+	default:
+		break;
+	}
+	device_unlock(&cxl_nvb->dev);
+
+	if (release)
+		device_release_driver(&cxl_nvb->dev);
+
+	put_device(&cxl_nvb->dev);
+}
+
+static void cxl_nvdimm_bridge_remove(struct device *dev)
+{
+	struct cxl_nvdimm_bridge *cxl_nvb = to_cxl_nvdimm_bridge(dev);
+
+	if (cxl_nvb->state == CXL_NVB_ONLINE)
+		cxl_nvb->state = CXL_NVB_OFFLINE;
+	if (queue_work(cxl_pmem_wq, &cxl_nvb->state_work))
+		get_device(&cxl_nvb->dev);
+}
+
+static int cxl_nvdimm_bridge_probe(struct device *dev)
+{
+	struct cxl_nvdimm_bridge *cxl_nvb = to_cxl_nvdimm_bridge(dev);
+
+	if (cxl_nvb->state == CXL_NVB_DEAD)
+		return -ENXIO;
+
+	if (cxl_nvb->state == CXL_NVB_NEW) {
+		cxl_nvb->nd_desc = (struct nvdimm_bus_descriptor) {
+			.provider_name = "CXL",
+			.module = THIS_MODULE,
+			.ndctl = cxl_pmem_ctl,
+		};
+
+		INIT_WORK(&cxl_nvb->state_work, cxl_nvb_update_state);
+	}
+
+	cxl_nvb->state = CXL_NVB_ONLINE;
+	if (queue_work(cxl_pmem_wq, &cxl_nvb->state_work))
+		get_device(&cxl_nvb->dev);
+
+	return 0;
+}
+
+static struct cxl_driver cxl_nvdimm_bridge_driver = {
+	.name = "cxl_nvdimm_bridge",
+	.probe = cxl_nvdimm_bridge_probe,
+	.remove = cxl_nvdimm_bridge_remove,
+	.id = CXL_DEVICE_NVDIMM_BRIDGE,
+};
+
+static __init int cxl_pmem_init(void)
+{
+	int rc;
+
+	cxl_pmem_wq = alloc_ordered_workqueue("cxl_pmem", 0);
+
+	if (!cxl_pmem_wq)
+		return -ENXIO;
+
+	rc = cxl_driver_register(&cxl_nvdimm_bridge_driver);
+	if (rc)
+		goto err;
+
+	return 0;
+
+err:
+	destroy_workqueue(cxl_pmem_wq);
+	return rc;
+}
+
+static __exit void cxl_pmem_exit(void)
+{
+	cxl_driver_unregister(&cxl_nvdimm_bridge_driver);
+	destroy_workqueue(cxl_pmem_wq);
+}
+
+MODULE_LICENSE("GPL v2");
+module_init(cxl_pmem_init);
+module_exit(cxl_pmem_exit);
+MODULE_IMPORT_NS(CXL);
+MODULE_ALIAS_CXL(CXL_DEVICE_NVDIMM_BRIDGE);


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 3/5] libnvdimm: Export nvdimm shutdown helper, nvdimm_delete()
  2021-06-10 22:25 [PATCH 0/5] cxl/pmem: Add core infrastructure for PMEM support Dan Williams
  2021-06-10 22:26 ` [PATCH 1/5] cxl/core: Add cxl-bus driver infrastructure Dan Williams
  2021-06-10 22:26 ` [PATCH 2/5] cxl/pmem: Add initial infrastructure for pmem support Dan Williams
@ 2021-06-10 22:26 ` Dan Williams
  2021-06-10 22:26 ` [PATCH 4/5] libnvdimm: Drop unused device power management support Dan Williams
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Dan Williams @ 2021-06-10 22:26 UTC (permalink / raw)
  To: linux-cxl
  Cc: nvdimm, ben.widawsky, alison.schofield, vishal.l.verma,
	ira.weiny, linux-kernel

CXL is a hotplug bus and arranges for nvdimm devices to be dynamically
discovered and removed. The libnvdimm core manages shutdown of nvdimm
security operations when the device is unregistered. That functionality
is moved to nvdimm_delete() and invoked by the CXL-to-nvdimm glue code.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/nvdimm/bus.c       |   19 ++++---------------
 drivers/nvdimm/dimm_devs.c |   18 ++++++++++++++++++
 include/linux/libnvdimm.h  |    1 +
 3 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/drivers/nvdimm/bus.c b/drivers/nvdimm/bus.c
index 3a777d0073b7..a11821df83b5 100644
--- a/drivers/nvdimm/bus.c
+++ b/drivers/nvdimm/bus.c
@@ -396,21 +396,10 @@ static int child_unregister(struct device *dev, void *data)
 	if (dev->class)
 		return 0;
 
-	if (is_nvdimm(dev)) {
-		struct nvdimm *nvdimm = to_nvdimm(dev);
-		bool dev_put = false;
-
-		/* We are shutting down. Make state frozen artificially. */
-		nvdimm_bus_lock(dev);
-		set_bit(NVDIMM_SECURITY_FROZEN, &nvdimm->sec.flags);
-		if (test_and_clear_bit(NDD_WORK_PENDING, &nvdimm->flags))
-			dev_put = true;
-		nvdimm_bus_unlock(dev);
-		cancel_delayed_work_sync(&nvdimm->dwork);
-		if (dev_put)
-			put_device(dev);
-	}
-	nd_device_unregister(dev, ND_SYNC);
+	if (is_nvdimm(dev))
+		nvdimm_delete(to_nvdimm(dev));
+	else
+		nd_device_unregister(dev, ND_SYNC);
 
 	return 0;
 }
diff --git a/drivers/nvdimm/dimm_devs.c b/drivers/nvdimm/dimm_devs.c
index 9d208570d059..dc7449a40003 100644
--- a/drivers/nvdimm/dimm_devs.c
+++ b/drivers/nvdimm/dimm_devs.c
@@ -642,6 +642,24 @@ struct nvdimm *__nvdimm_create(struct nvdimm_bus *nvdimm_bus,
 }
 EXPORT_SYMBOL_GPL(__nvdimm_create);
 
+void nvdimm_delete(struct nvdimm *nvdimm)
+{
+	struct device *dev = &nvdimm->dev;
+	bool dev_put = false;
+
+	/* We are shutting down. Make state frozen artificially. */
+	nvdimm_bus_lock(dev);
+	set_bit(NVDIMM_SECURITY_FROZEN, &nvdimm->sec.flags);
+	if (test_and_clear_bit(NDD_WORK_PENDING, &nvdimm->flags))
+		dev_put = true;
+	nvdimm_bus_unlock(dev);
+	cancel_delayed_work_sync(&nvdimm->dwork);
+	if (dev_put)
+		put_device(dev);
+	nd_device_unregister(dev, ND_SYNC);
+}
+EXPORT_SYMBOL_GPL(nvdimm_delete);
+
 static void shutdown_security_notify(void *data)
 {
 	struct nvdimm *nvdimm = data;
diff --git a/include/linux/libnvdimm.h b/include/linux/libnvdimm.h
index 89b69e645ac7..7074aa9af525 100644
--- a/include/linux/libnvdimm.h
+++ b/include/linux/libnvdimm.h
@@ -278,6 +278,7 @@ static inline struct nvdimm *nvdimm_create(struct nvdimm_bus *nvdimm_bus,
 	return __nvdimm_create(nvdimm_bus, provider_data, groups, flags,
 			cmd_mask, num_flush, flush_wpq, NULL, NULL, NULL);
 }
+void nvdimm_delete(struct nvdimm *nvdimm);
 
 const struct nd_cmd_desc *nd_cmd_dimm_desc(int cmd);
 const struct nd_cmd_desc *nd_cmd_bus_desc(int cmd);


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 4/5] libnvdimm: Drop unused device power management support
  2021-06-10 22:25 [PATCH 0/5] cxl/pmem: Add core infrastructure for PMEM support Dan Williams
                   ` (2 preceding siblings ...)
  2021-06-10 22:26 ` [PATCH 3/5] libnvdimm: Export nvdimm shutdown helper, nvdimm_delete() Dan Williams
@ 2021-06-10 22:26 ` Dan Williams
  2021-06-11 11:47   ` Jonathan Cameron
  2021-06-10 22:26 ` [PATCH 5/5] cxl/pmem: Register 'pmem' / cxl_nvdimm devices Dan Williams
  2021-06-11 12:59 ` [PATCH 0/5] cxl/pmem: Add core infrastructure for PMEM support Jonathan Cameron
  5 siblings, 1 reply; 18+ messages in thread
From: Dan Williams @ 2021-06-10 22:26 UTC (permalink / raw)
  To: linux-cxl
  Cc: nvdimm, ben.widawsky, alison.schofield, vishal.l.verma,
	ira.weiny, linux-kernel

LIBNVDIMM device objects register sysfs power attributes despite nothing
requiring that support. Clean up sysfs remove the power/ attribute
group. This requires a device_create() and a device_register() usage to
be converted to the device_initialize() + device_add() pattern.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/nvdimm/bus.c |   45 +++++++++++++++++++++++++++++++++++++--------
 1 file changed, 37 insertions(+), 8 deletions(-)

diff --git a/drivers/nvdimm/bus.c b/drivers/nvdimm/bus.c
index a11821df83b5..e6aa87043a95 100644
--- a/drivers/nvdimm/bus.c
+++ b/drivers/nvdimm/bus.c
@@ -363,8 +363,13 @@ struct nvdimm_bus *nvdimm_bus_register(struct device *parent,
 	nvdimm_bus->dev.groups = nd_desc->attr_groups;
 	nvdimm_bus->dev.bus = &nvdimm_bus_type;
 	nvdimm_bus->dev.of_node = nd_desc->of_node;
-	dev_set_name(&nvdimm_bus->dev, "ndbus%d", nvdimm_bus->id);
-	rc = device_register(&nvdimm_bus->dev);
+	device_initialize(&nvdimm_bus->dev);
+	device_set_pm_not_required(&nvdimm_bus->dev);
+	rc = dev_set_name(&nvdimm_bus->dev, "ndbus%d", nvdimm_bus->id);
+	if (rc)
+		goto err;
+
+	rc = device_add(&nvdimm_bus->dev);
 	if (rc) {
 		dev_dbg(&nvdimm_bus->dev, "registration failed: %d\n", rc);
 		goto err;
@@ -525,6 +530,7 @@ void __nd_device_register(struct device *dev)
 		set_dev_node(dev, to_nd_region(dev)->numa_node);
 
 	dev->bus = &nvdimm_bus_type;
+	device_set_pm_not_required(dev);
 	if (dev->parent) {
 		get_device(dev->parent);
 		if (dev_to_node(dev) == NUMA_NO_NODE)
@@ -717,18 +723,41 @@ const struct attribute_group nd_numa_attribute_group = {
 	.is_visible = nd_numa_attr_visible,
 };
 
+static void ndctl_release(struct device *dev)
+{
+	kfree(dev);
+}
+
 int nvdimm_bus_create_ndctl(struct nvdimm_bus *nvdimm_bus)
 {
 	dev_t devt = MKDEV(nvdimm_bus_major, nvdimm_bus->id);
 	struct device *dev;
+	int rc;
 
-	dev = device_create(nd_class, &nvdimm_bus->dev, devt, nvdimm_bus,
-			"ndctl%d", nvdimm_bus->id);
+	dev = kzalloc(sizeof(*dev), GFP_KERNEL);
+	if (!dev)
+		return -ENOMEM;
+	device_initialize(dev);
+	device_set_pm_not_required(dev);
+	dev->class = nd_class;
+	dev->parent = &nvdimm_bus->dev;
+	dev->devt = devt;
+	dev->release = ndctl_release;
+	rc = dev_set_name(dev, "ndctl%d", nvdimm_bus->id);
+	if (rc)
+		goto err;
 
-	if (IS_ERR(dev))
-		dev_dbg(&nvdimm_bus->dev, "failed to register ndctl%d: %ld\n",
-				nvdimm_bus->id, PTR_ERR(dev));
-	return PTR_ERR_OR_ZERO(dev);
+	rc = device_add(dev);
+	if (rc) {
+		dev_dbg(&nvdimm_bus->dev, "failed to register ndctl%d: %d\n",
+				nvdimm_bus->id, rc);
+		goto err;
+	}
+	return 0;
+
+err:
+	put_device(dev);
+	return rc;
 }
 
 void nvdimm_bus_destroy_ndctl(struct nvdimm_bus *nvdimm_bus)


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 5/5] cxl/pmem: Register 'pmem' / cxl_nvdimm  devices
  2021-06-10 22:25 [PATCH 0/5] cxl/pmem: Add core infrastructure for PMEM support Dan Williams
                   ` (3 preceding siblings ...)
  2021-06-10 22:26 ` [PATCH 4/5] libnvdimm: Drop unused device power management support Dan Williams
@ 2021-06-10 22:26 ` Dan Williams
  2021-06-11 12:57   ` Jonathan Cameron
  2021-06-11 12:59 ` [PATCH 0/5] cxl/pmem: Add core infrastructure for PMEM support Jonathan Cameron
  5 siblings, 1 reply; 18+ messages in thread
From: Dan Williams @ 2021-06-10 22:26 UTC (permalink / raw)
  To: linux-cxl
  Cc: nvdimm, ben.widawsky, alison.schofield, vishal.l.verma,
	ira.weiny, linux-kernel

While a memX device on /sys/bus/cxl represents a CXL memory expander
control interface, a pmemX device represents the persistent memory
sub-functionality. It bridges the CXL subystem to the libnvdimm nmemX
control interface.

With this skeleton ndctl can now see persistent memory devices on a
"CXL" bus. Later patches add support for translating libnvdimm native
commands to CXL commands.

# ndctl list -BDiu -b CXL
{
  "provider":"CXL",
  "dev":"ndbus1",
  "dimms":[
    {
      "dev":"nmem1",
      "state":"disabled"
    },
    {
      "dev":"nmem0",
      "state":"disabled"
    }
  ]
}

Given nvdimm_bus_unregister() removes all devices on an ndbus0 the
cxl_pmem infrastructure needs to arrange ->remove() to be triggered on
cxl_nvdimm devices to keep their enabled state synchronized with the
registration state of their corresponding device on the nvdimm_bus. In
other words, always arrange for cxl_nvdimm_driver.remove() to unregister
nvdimms from an nvdimm_bus ahead of the bus being unregistered.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core.c |   86 ++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h  |   13 ++++++
 drivers/cxl/mem.h  |    2 +
 drivers/cxl/pci.c  |   23 ++++++++---
 drivers/cxl/pmem.c |  111 +++++++++++++++++++++++++++++++++++++++++++++++-----
 5 files changed, 217 insertions(+), 18 deletions(-)

diff --git a/drivers/cxl/core.c b/drivers/cxl/core.c
index f0305c9c91c8..6db660249cea 100644
--- a/drivers/cxl/core.c
+++ b/drivers/cxl/core.c
@@ -7,6 +7,7 @@
 #include <linux/slab.h>
 #include <linux/idr.h>
 #include "cxl.h"
+#include "mem.h"
 
 /**
  * DOC: cxl core
@@ -731,6 +732,89 @@ struct cxl_nvdimm_bridge *devm_cxl_add_nvdimm_bridge(struct device *host,
 }
 EXPORT_SYMBOL_GPL(devm_cxl_add_nvdimm_bridge);
 
+static void cxl_nvdimm_release(struct device *dev)
+{
+	struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev);
+
+	kfree(cxl_nvd);
+}
+
+static const struct attribute_group *cxl_nvdimm_attribute_groups[] = {
+	&cxl_base_attribute_group,
+	NULL,
+};
+
+static const struct device_type cxl_nvdimm_type = {
+	.name = "cxl_nvdimm",
+	.release = cxl_nvdimm_release,
+	.groups = cxl_nvdimm_attribute_groups,
+};
+
+bool is_cxl_nvdimm(struct device *dev)
+{
+	return dev->type == &cxl_nvdimm_type;
+}
+EXPORT_SYMBOL_GPL(is_cxl_nvdimm);
+
+struct cxl_nvdimm *to_cxl_nvdimm(struct device *dev)
+{
+	if (dev_WARN_ONCE(dev, !is_cxl_nvdimm(dev),
+			  "not a cxl_nvdimm device\n"))
+		return NULL;
+	return container_of(dev, struct cxl_nvdimm, dev);
+}
+EXPORT_SYMBOL_GPL(to_cxl_nvdimm);
+
+static struct cxl_nvdimm *cxl_nvdimm_alloc(struct cxl_memdev *cxlmd)
+{
+	struct cxl_nvdimm *cxl_nvd;
+	struct device *dev;
+
+	cxl_nvd = kzalloc(sizeof(*cxl_nvd), GFP_KERNEL);
+	if (!cxl_nvd)
+		return ERR_PTR(-ENOMEM);
+
+	dev = &cxl_nvd->dev;
+	cxl_nvd->cxlmd = cxlmd;
+	device_initialize(dev);
+	device_set_pm_not_required(dev);
+	dev->parent = &cxlmd->dev;
+	dev->bus = &cxl_bus_type;
+	dev->type = &cxl_nvdimm_type;
+
+	return cxl_nvd;
+}
+
+int devm_cxl_add_nvdimm(struct device *host, struct cxl_memdev *cxlmd)
+{
+	struct cxl_nvdimm *cxl_nvd;
+	struct device *dev;
+	int rc;
+
+	cxl_nvd = cxl_nvdimm_alloc(cxlmd);
+	if (IS_ERR(cxl_nvd))
+		return PTR_ERR(cxl_nvd);
+
+	dev = &cxl_nvd->dev;
+	rc = dev_set_name(dev, "pmem%d", cxlmd->id);
+	if (rc)
+		goto err;
+
+	rc = device_add(dev);
+	if (rc)
+		goto err;
+
+	dev_dbg(host, "%s: register %s\n", dev_name(dev->parent),
+		dev_name(dev));
+
+	return devm_add_action_or_reset(host, unregister_dev, dev);
+
+err:
+	put_device(dev);
+	return rc;
+}
+EXPORT_SYMBOL_GPL(devm_cxl_add_nvdimm);
+
 /**
  * cxl_probe_device_regs() - Detect CXL Device register blocks
  * @dev: Host device of the @base mapping
@@ -930,6 +1014,8 @@ static int cxl_device_id(struct device *dev)
 {
 	if (dev->type == &cxl_nvdimm_bridge_type)
 		return CXL_DEVICE_NVDIMM_BRIDGE;
+	if (dev->type == &cxl_nvdimm_type)
+		return CXL_DEVICE_NVDIMM;
 	return 0;
 }
 
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 47fcb7ad5978..3f9a6f7b05db 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -213,6 +213,13 @@ struct cxl_nvdimm_bridge {
 	enum cxl_nvdimm_brige_state state;
 };
 
+struct cxl_mem;
+struct cxl_nvdimm {
+	struct device dev;
+	struct cxl_memdev *cxlmd;
+	struct nvdimm *nvdimm;
+};
+
 /**
  * struct cxl_port - logical collection of upstream port devices and
  *		     downstream port devices to construct a CXL memory
@@ -299,7 +306,8 @@ int __cxl_driver_register(struct cxl_driver *cxl_drv, struct module *owner,
 #define cxl_driver_register(x) __cxl_driver_register(x, THIS_MODULE, KBUILD_MODNAME)
 void cxl_driver_unregister(struct cxl_driver *cxl_drv);
 
-#define CXL_DEVICE_NVDIMM_BRIDGE 1
+#define CXL_DEVICE_NVDIMM_BRIDGE	1
+#define CXL_DEVICE_NVDIMM		2
 
 #define MODULE_ALIAS_CXL(type) MODULE_ALIAS("cxl:t" __stringify(type) "*")
 #define CXL_MODALIAS_FMT "cxl:t%d"
@@ -307,4 +315,7 @@ void cxl_driver_unregister(struct cxl_driver *cxl_drv);
 struct cxl_nvdimm_bridge *to_cxl_nvdimm_bridge(struct device *dev);
 struct cxl_nvdimm_bridge *devm_cxl_add_nvdimm_bridge(struct device *host,
 						     struct cxl_port *port);
+struct cxl_nvdimm *to_cxl_nvdimm(struct device *dev);
+bool is_cxl_nvdimm(struct device *dev);
+int devm_cxl_add_nvdimm(struct device *host, struct cxl_memdev *cxlmd);
 #endif /* __CXL_H__ */
diff --git a/drivers/cxl/mem.h b/drivers/cxl/mem.h
index 13868ff7cadf..8f02d02b26b4 100644
--- a/drivers/cxl/mem.h
+++ b/drivers/cxl/mem.h
@@ -2,6 +2,8 @@
 /* Copyright(c) 2020-2021 Intel Corporation. */
 #ifndef __CXL_MEM_H__
 #define __CXL_MEM_H__
+#include <linux/cdev.h>
+#include "cxl.h"
 
 /* CXL 2.0 8.2.8.5.1.1 Memory Device Status Register */
 #define CXLMDEV_STATUS_OFFSET 0x0
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 5a1705b52278..83e81b44c8f5 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -1313,7 +1313,8 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_mem *cxlm)
 	return ERR_PTR(rc);
 }
 
-static int cxl_mem_add_memdev(struct cxl_mem *cxlm)
+static struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
+					      struct cxl_mem *cxlm)
 {
 	struct cxl_memdev *cxlmd;
 	struct device *dev;
@@ -1322,7 +1323,7 @@ static int cxl_mem_add_memdev(struct cxl_mem *cxlm)
 
 	cxlmd = cxl_memdev_alloc(cxlm);
 	if (IS_ERR(cxlmd))
-		return PTR_ERR(cxlmd);
+		return cxlmd;
 
 	dev = &cxlmd->dev;
 	rc = dev_set_name(dev, "mem%d", cxlmd->id);
@@ -1340,8 +1341,10 @@ static int cxl_mem_add_memdev(struct cxl_mem *cxlm)
 	if (rc)
 		goto err;
 
-	return devm_add_action_or_reset(dev->parent, cxl_memdev_unregister,
-					cxlmd);
+	rc = devm_add_action_or_reset(host, cxl_memdev_unregister, cxlmd);
+	if (rc)
+		return ERR_PTR(rc);
+	return cxlmd;
 
 err:
 	/*
@@ -1350,7 +1353,7 @@ static int cxl_mem_add_memdev(struct cxl_mem *cxlm)
 	 */
 	cxl_memdev_shutdown(cxlmd);
 	put_device(dev);
-	return rc;
+	return ERR_PTR(rc);
 }
 
 static int cxl_xfer_log(struct cxl_mem *cxlm, uuid_t *uuid, u32 size, u8 *out)
@@ -1561,6 +1564,7 @@ static int cxl_mem_identify(struct cxl_mem *cxlm)
 
 static int cxl_mem_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 {
+	struct cxl_memdev *cxlmd;
 	struct cxl_mem *cxlm;
 	int rc;
 
@@ -1588,7 +1592,14 @@ static int cxl_mem_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (rc)
 		return rc;
 
-	return cxl_mem_add_memdev(cxlm);
+	cxlmd = devm_cxl_add_memdev(&pdev->dev, cxlm);
+	if (IS_ERR(cxlmd))
+		return PTR_ERR(cxlmd);
+
+	if (range_len(&cxlm->pmem_range) && IS_ENABLED(CONFIG_CXL_PMEM))
+		rc = devm_cxl_add_nvdimm(&pdev->dev, cxlmd);
+
+	return rc;
 }
 
 static const struct pci_device_id cxl_mem_pci_tbl[] = {
diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
index 0067bd734559..1ed19e213157 100644
--- a/drivers/cxl/pmem.c
+++ b/drivers/cxl/pmem.c
@@ -3,7 +3,10 @@
 #include <linux/libnvdimm.h>
 #include <linux/device.h>
 #include <linux/module.h>
+#include <linux/ndctl.h>
+#include <linux/async.h>
 #include <linux/slab.h>
+#include "mem.h"
 #include "cxl.h"
 
 /*
@@ -13,6 +16,64 @@
  */
 static struct workqueue_struct *cxl_pmem_wq;
 
+static void unregister_nvdimm(void *nvdimm)
+{
+	nvdimm_delete(nvdimm);
+}
+
+static int match_nvdimm_bridge(struct device *dev, const void *data)
+{
+	return strcmp(dev_name(dev), "nvdimm-bridge") == 0;
+}
+
+static struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(void)
+{
+	struct device *dev;
+
+	dev = bus_find_device(&cxl_bus_type, NULL, NULL, match_nvdimm_bridge);
+	if (!dev)
+		return NULL;
+	return to_cxl_nvdimm_bridge(dev);
+}
+
+static int cxl_nvdimm_probe(struct device *dev)
+{
+	struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev);
+	struct cxl_nvdimm_bridge *cxl_nvb;
+	unsigned long flags = 0;
+	struct nvdimm *nvdimm;
+	int rc = -ENXIO;
+
+	cxl_nvb = cxl_find_nvdimm_bridge();
+	if (!cxl_nvb)
+		return -ENXIO;
+
+	device_lock(&cxl_nvb->dev);
+	if (!cxl_nvb->nvdimm_bus)
+		goto out;
+
+	set_bit(NDD_LABELING, &flags);
+	nvdimm = nvdimm_create(cxl_nvb->nvdimm_bus, cxl_nvd, NULL, flags, 0, 0,
+			       NULL);
+	if (!nvdimm)
+		goto out;
+
+	dev_set_drvdata(dev, nvdimm);
+
+	rc = devm_add_action_or_reset(dev, unregister_nvdimm, nvdimm);
+out:
+	device_unlock(&cxl_nvb->dev);
+	put_device(&cxl_nvb->dev);
+
+	return rc;
+}
+
+static struct cxl_driver cxl_nvdimm_driver = {
+	.name = "cxl_nvdimm",
+	.probe = cxl_nvdimm_probe,
+	.id = CXL_DEVICE_NVDIMM,
+};
+
 static int cxl_pmem_ctl(struct nvdimm_bus_descriptor *nd_desc,
 			struct nvdimm *nvdimm, unsigned int cmd, void *buf,
 			unsigned int buf_len, int *cmd_rc)
@@ -28,19 +89,31 @@ static void online_nvdimm_bus(struct cxl_nvdimm_bridge *cxl_nvb)
 		nvdimm_bus_register(&cxl_nvb->dev, &cxl_nvb->nd_desc);
 }
 
-static void offline_nvdimm_bus(struct cxl_nvdimm_bridge *cxl_nvb)
+static int cxl_nvdimm_release_driver(struct device *dev, void *data)
 {
-	if (!cxl_nvb->nvdimm_bus)
-		return;
-	nvdimm_bus_unregister(cxl_nvb->nvdimm_bus);
-	cxl_nvb->nvdimm_bus = NULL;
+	if (!is_cxl_nvdimm(dev))
+		return 0;
+	device_release_driver(dev);
+	return 0;
+}
+
+static void offline_nvdimm_bus(struct nvdimm_bus *nvdimm_bus)
+{
+	/*
+	 * Set the state of cxl_nvdimm devices to unbound / idle before
+	 * nvdimm_bus_unregister() rips the nvdimm objects out from
+	 * underneath them.
+	 */
+	bus_for_each_dev(&cxl_bus_type, NULL, NULL, cxl_nvdimm_release_driver);
+	nvdimm_bus_unregister(nvdimm_bus);
 }
 
 static void cxl_nvb_update_state(struct work_struct *work)
 {
 	struct cxl_nvdimm_bridge *cxl_nvb =
 		container_of(work, typeof(*cxl_nvb), state_work);
-	bool release = false;
+	struct nvdimm_bus *nvdimm_bus = NULL;
+	bool release = false, rescan = false;
 
 	device_lock(&cxl_nvb->dev);
 	switch (cxl_nvb->state) {
@@ -50,11 +123,13 @@ static void cxl_nvb_update_state(struct work_struct *work)
 			dev_err(&cxl_nvb->dev,
 				"failed to establish nvdimm bus\n");
 			release = true;
-		}
+		} else
+			rescan = true;
 		break;
 	case CXL_NVB_OFFLINE:
 	case CXL_NVB_DEAD:
-		offline_nvdimm_bus(cxl_nvb);
+		nvdimm_bus = cxl_nvb->nvdimm_bus;
+		cxl_nvb->nvdimm_bus = NULL;
 		break;
 	default:
 		break;
@@ -63,6 +138,13 @@ static void cxl_nvb_update_state(struct work_struct *work)
 
 	if (release)
 		device_release_driver(&cxl_nvb->dev);
+	if (rescan) {
+		int rc = bus_rescan_devices(&cxl_bus_type);
+
+		dev_dbg(&cxl_nvb->dev, "rescan: %d\n", rc);
+	}
+	if (nvdimm_bus)
+		offline_nvdimm_bus(nvdimm_bus);
 
 	put_device(&cxl_nvb->dev);
 }
@@ -113,23 +195,29 @@ static __init int cxl_pmem_init(void)
 	int rc;
 
 	cxl_pmem_wq = alloc_ordered_workqueue("cxl_pmem", 0);
-
 	if (!cxl_pmem_wq)
 		return -ENXIO;
 
 	rc = cxl_driver_register(&cxl_nvdimm_bridge_driver);
 	if (rc)
-		goto err;
+		goto err_bridge;
+
+	rc = cxl_driver_register(&cxl_nvdimm_driver);
+	if (rc)
+		goto err_nvdimm;
 
 	return 0;
 
-err:
+err_nvdimm:
+	cxl_driver_unregister(&cxl_nvdimm_bridge_driver);
+err_bridge:
 	destroy_workqueue(cxl_pmem_wq);
 	return rc;
 }
 
 static __exit void cxl_pmem_exit(void)
 {
+	cxl_driver_unregister(&cxl_nvdimm_driver);
 	cxl_driver_unregister(&cxl_nvdimm_bridge_driver);
 	destroy_workqueue(cxl_pmem_wq);
 }
@@ -139,3 +227,4 @@ module_init(cxl_pmem_init);
 module_exit(cxl_pmem_exit);
 MODULE_IMPORT_NS(CXL);
 MODULE_ALIAS_CXL(CXL_DEVICE_NVDIMM_BRIDGE);
+MODULE_ALIAS_CXL(CXL_DEVICE_NVDIMM);


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/5] cxl/pmem: Add initial infrastructure for pmem support
  2021-06-10 22:26 ` [PATCH 2/5] cxl/pmem: Add initial infrastructure for pmem support Dan Williams
@ 2021-06-11 11:39   ` Jonathan Cameron
  2021-06-12  0:07     ` Dan Williams
  0 siblings, 1 reply; 18+ messages in thread
From: Jonathan Cameron @ 2021-06-11 11:39 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, nvdimm, ben.widawsky, alison.schofield,
	vishal.l.verma, ira.weiny, linux-kernel

On Thu, 10 Jun 2021 15:26:08 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> Register an 'nvdimm-bridge' device to act as an anchor for a libnvdimm
> bus hierarchy. Also, flesh out the cxl_bus definition to allow a
> cxl_nvdimm_bridge_driver to attach to the bridge and trigger the
> nvdimm-bus registration.
> 
> The creation of the bridge is gated on the detection of a PMEM capable
> address space registered to the root. The bridge indirection allows the
> libnvdimm module to remain unloaded on platforms without PMEM support.
> 
> Given that the probing of ACPI0017 is asynchronous to CXL endpoint
> devices, and the expectation that CXL endpoint devices register other
> PMEM resources on the 'CXL' nvdimm bus, a workqueue is added. The
> workqueue is needed to run bus_rescan_devices() outside of the
> device_lock() of the nvdimm-bridge device to rendezvous nvdimm resources
> as they arrive. For now only the bus is taken online/offline in the
> workqueue.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

I'm not that familiar with nvdimm side of things, so this is mostly
superficial review of the patch itself.

A few really minor comments inline, but otherwise looks good to me.

Jonathan

> ---
>  drivers/cxl/Kconfig  |   13 +++++
>  drivers/cxl/Makefile |    2 +
>  drivers/cxl/acpi.c   |   37 ++++++++++++-
>  drivers/cxl/core.c   |  122 +++++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxl.h    |   24 +++++++++
>  drivers/cxl/pmem.c   |  141 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  6 files changed, 337 insertions(+), 2 deletions(-)
>  create mode 100644 drivers/cxl/pmem.c
> 
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index 1a44b173dcbc..e6de221cc568 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -61,5 +61,18 @@ config CXL_ACPI
>  	  hierarchy to map regions that represent System RAM, or Persistent
>  	  Memory regions to be managed by LIBNVDIMM.
>  
> +	  If unsure say 'm'.
> +
> +config CXL_PMEM
> +	tristate "CXL PMEM: Persistent Memory Support"
> +	depends on LIBNVDIMM
> +	default CXL_BUS
> +	help
> +	  In addition to typical memory resources a platform may also advertise
> +	  support for persistent memory attached via CXL. This support is
> +	  managed via a bridge driver from CXL to the LIBNVDIMM system
> +	  subsystem. Say 'y/m' to enable support for enumerating and
> +	  provisioning the persistent memory capacity of CXL memory expanders.
> +
>  	  If unsure say 'm'.
>  endif
> diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
> index a29efb3e8ad2..32954059b37b 100644
> --- a/drivers/cxl/Makefile
> +++ b/drivers/cxl/Makefile
> @@ -2,8 +2,10 @@
>  obj-$(CONFIG_CXL_BUS) += cxl_core.o
>  obj-$(CONFIG_CXL_MEM) += cxl_pci.o
>  obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
> +obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
>  
>  ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=CXL
>  cxl_core-y := core.o
>  cxl_pci-y := pci.o
>  cxl_acpi-y := acpi.o
> +cxl_pmem-y := pmem.o
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index be357eea552c..8a723f7f3f73 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -145,6 +145,30 @@ static int add_host_bridge_dport(struct device *match, void *arg)
>  	return 0;
>  }
>  
> +static int add_root_nvdimm_bridge(struct device *match, void *data)
> +{
> +	struct cxl_decoder *cxld;
> +	struct cxl_port *root_port = data;
> +	struct cxl_nvdimm_bridge *cxl_nvb;
> +	struct device *host = root_port->dev.parent;
> +
> +	if (!is_root_decoder(match))
> +		return 0;
> +
> +	cxld = to_cxl_decoder(match);
> +	if (!(cxld->flags & CXL_DECODER_F_PMEM))
> +		return 0;
> +
> +	cxl_nvb = devm_cxl_add_nvdimm_bridge(host, root_port);
> +	if (IS_ERR(cxl_nvb)) {
> +		dev_dbg(host, "failed to register pmem\n");
> +		return PTR_ERR(cxl_nvb);
> +	}
> +	dev_dbg(host, "%s: add: %s\n", dev_name(&root_port->dev),
> +		dev_name(&cxl_nvb->dev));
> +	return 1;
> +}
> +
>  static int cxl_acpi_probe(struct platform_device *pdev)
>  {
>  	int rc;
> @@ -166,8 +190,17 @@ static int cxl_acpi_probe(struct platform_device *pdev)
>  	 * Root level scanned with host-bridge as dports, now scan host-bridges
>  	 * for their role as CXL uports to their CXL-capable PCIe Root Ports.
>  	 */
> -	return bus_for_each_dev(adev->dev.bus, NULL, root_port,
> -				add_host_bridge_uport);
> +	rc = bus_for_each_dev(adev->dev.bus, NULL, root_port,
> +			      add_host_bridge_uport);
> +	if (rc)
> +		return rc;
> +
> +	if (IS_ENABLED(CONFIG_CXL_PMEM))
> +		rc = device_for_each_child(&root_port->dev, root_port,
> +					   add_root_nvdimm_bridge);
> +	if (rc < 0)
> +		return rc;
> +	return 0;
>  }
>  
>  static const struct acpi_device_id cxl_acpi_ids[] = {
> diff --git a/drivers/cxl/core.c b/drivers/cxl/core.c
> index 959cecc1f6bf..f0305c9c91c8 100644
> --- a/drivers/cxl/core.c
> +++ b/drivers/cxl/core.c
> @@ -187,6 +187,12 @@ static const struct device_type cxl_decoder_root_type = {
>  	.groups = cxl_decoder_root_attribute_groups,
>  };
>  
> +bool is_root_decoder(struct device *dev)
> +{
> +	return dev->type == &cxl_decoder_root_type;
> +}
> +EXPORT_SYMBOL_GPL(is_root_decoder);
> +
>  struct cxl_decoder *to_cxl_decoder(struct device *dev)
>  {
>  	if (dev_WARN_ONCE(dev, dev->type->release != cxl_decoder_release,
> @@ -194,6 +200,7 @@ struct cxl_decoder *to_cxl_decoder(struct device *dev)
>  		return NULL;
>  	return container_of(dev, struct cxl_decoder, dev);
>  }
> +EXPORT_SYMBOL_GPL(to_cxl_decoder);
>  
>  static void cxl_dport_release(struct cxl_dport *dport)
>  {
> @@ -611,6 +618,119 @@ void cxl_probe_component_regs(struct device *dev, void __iomem *base,
>  }
>  EXPORT_SYMBOL_GPL(cxl_probe_component_regs);
>  
> +static void cxl_nvdimm_bridge_release(struct device *dev)
> +{
> +	struct cxl_nvdimm_bridge *cxl_nvb = to_cxl_nvdimm_bridge(dev);
> +
> +	kfree(cxl_nvb);
> +}
> +
> +static const struct attribute_group *cxl_nvdimm_bridge_attribute_groups[] = {
> +	&cxl_base_attribute_group,
> +	NULL,
> +};
> +
> +static const struct device_type cxl_nvdimm_bridge_type = {
> +	.name = "cxl_nvdimm_bridge",
> +	.release = cxl_nvdimm_bridge_release,
> +	.groups = cxl_nvdimm_bridge_attribute_groups,
> +};
> +
> +struct cxl_nvdimm_bridge *to_cxl_nvdimm_bridge(struct device *dev)
> +{
> +	if (dev_WARN_ONCE(dev, dev->type != &cxl_nvdimm_bridge_type,
> +			  "not a cxl_nvdimm_bridge device\n"))
> +		return NULL;
> +	return container_of(dev, struct cxl_nvdimm_bridge, dev);
> +}
> +EXPORT_SYMBOL_GPL(to_cxl_nvdimm_bridge);
> +
> +static struct cxl_nvdimm_bridge *
> +cxl_nvdimm_bridge_alloc(struct cxl_port *port)
> +{
> +	struct cxl_nvdimm_bridge *cxl_nvb;
> +	struct device *dev;
> +
> +	cxl_nvb = kzalloc(sizeof(*cxl_nvb), GFP_KERNEL);
> +	if (!cxl_nvb)
> +		return ERR_PTR(-ENOMEM);
> +
> +	dev = &cxl_nvb->dev;
> +	cxl_nvb->port = port;
> +	cxl_nvb->state = CXL_NVB_NEW;
> +	device_initialize(dev);
> +	device_set_pm_not_required(dev);
> +	dev->parent = &port->dev;
> +	dev->bus = &cxl_bus_type;
> +	dev->type = &cxl_nvdimm_bridge_type;
> +
> +	return cxl_nvb;
> +}
> +
> +static void unregister_nvb(void *_cxl_nvb)
> +{
> +	struct cxl_nvdimm_bridge *cxl_nvb = _cxl_nvb;
> +	bool flush = false;
> +
> +	/*
> +	 * If the bridge was ever activated then there might be in-flight state
> +	 * work to flush. Once the state has been changed to 'dead' then no new
> +	 * work can be queued by user-triggered bind.
> +	 */
> +	device_lock(&cxl_nvb->dev);
> +	if (cxl_nvb->state != CXL_NVB_NEW)
> +		flush = true;

flush = clx_nvb->state != CXL_NVB_NEW; 

perhaps?

> +	cxl_nvb->state = CXL_NVB_DEAD;
> +	device_unlock(&cxl_nvb->dev);
> +
> +	/*
> +	 * Even though the device core will trigger device_release_driver()
> +	 * before the unregister, it does not know about the fact that
> +	 * cxl_nvdimm_bridge_driver defers ->remove() work. So, do the driver
> +	 * release not and flush it before tearing down the nvdimm device
> +	 * hierarchy.
> +	 */
> +	device_release_driver(&cxl_nvb->dev);
> +	if (flush)
> +		flush_work(&cxl_nvb->state_work);
> +	device_unregister(&cxl_nvb->dev);
> +}
> +
> +struct cxl_nvdimm_bridge *devm_cxl_add_nvdimm_bridge(struct device *host,
> +						     struct cxl_port *port)
> +{
> +	struct cxl_nvdimm_bridge *cxl_nvb;
> +	struct device *dev;
> +	int rc;
> +
> +	if (!IS_ENABLED(CONFIG_CXL_PMEM))
> +		return ERR_PTR(-ENXIO);
> +
> +	cxl_nvb = cxl_nvdimm_bridge_alloc(port);
> +	if (IS_ERR(cxl_nvb))
> +		return cxl_nvb;
> +
> +	dev = &cxl_nvb->dev;
> +	rc = dev_set_name(dev, "nvdimm-bridge");
> +	if (rc)
> +		goto err;
> +
> +	rc = device_add(dev);
> +	if (rc)
> +		goto err;
> +
> +	rc = devm_add_action_or_reset(host, unregister_nvb, cxl_nvb);
> +	if (rc)
> +		return ERR_PTR(rc);
> +
> +	return cxl_nvb;
> +
> +err:
> +	put_device(dev);
> +	return ERR_PTR(rc);
> +}
> +EXPORT_SYMBOL_GPL(devm_cxl_add_nvdimm_bridge);
> +
>  /**
>   * cxl_probe_device_regs() - Detect CXL Device register blocks
>   * @dev: Host device of the @base mapping
> @@ -808,6 +928,8 @@ EXPORT_SYMBOL_GPL(cxl_driver_unregister);
>  
>  static int cxl_device_id(struct device *dev)
>  {
> +	if (dev->type == &cxl_nvdimm_bridge_type)
> +		return CXL_DEVICE_NVDIMM_BRIDGE;
>  	return 0;
>  }
>  
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index af2237d1c761..47fcb7ad5978 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -4,6 +4,7 @@
>  #ifndef __CXL_H__
>  #define __CXL_H__
>  
> +#include <linux/libnvdimm.h>
>  #include <linux/bitfield.h>
>  #include <linux/bitops.h>
>  #include <linux/io.h>
> @@ -195,6 +196,23 @@ struct cxl_decoder {
>  	struct cxl_dport *target[];
>  };
>  
> +
> +enum cxl_nvdimm_brige_state {
> +	CXL_NVB_NEW,
> +	CXL_NVB_DEAD,
> +	CXL_NVB_ONLINE,
> +	CXL_NVB_OFFLINE,
> +};
> +
> +struct cxl_nvdimm_bridge {
> +	struct device dev;
> +	struct cxl_port *port;
> +	struct nvdimm_bus *nvdimm_bus;
> +	struct nvdimm_bus_descriptor nd_desc;
> +	struct work_struct state_work;
> +	enum cxl_nvdimm_brige_state state;
> +};
> +
>  /**
>   * struct cxl_port - logical collection of upstream port devices and
>   *		     downstream port devices to construct a CXL memory
> @@ -240,6 +258,7 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport, int port_id,
>  		  resource_size_t component_reg_phys);
>  
>  struct cxl_decoder *to_cxl_decoder(struct device *dev);
> +bool is_root_decoder(struct device *dev);
>  struct cxl_decoder *
>  devm_cxl_add_decoder(struct device *host, struct cxl_port *port, int nr_targets,
>  		     resource_size_t base, resource_size_t len,
> @@ -280,7 +299,12 @@ int __cxl_driver_register(struct cxl_driver *cxl_drv, struct module *owner,
>  #define cxl_driver_register(x) __cxl_driver_register(x, THIS_MODULE, KBUILD_MODNAME)
>  void cxl_driver_unregister(struct cxl_driver *cxl_drv);
>  
> +#define CXL_DEVICE_NVDIMM_BRIDGE 1
> +
>  #define MODULE_ALIAS_CXL(type) MODULE_ALIAS("cxl:t" __stringify(type) "*")
>  #define CXL_MODALIAS_FMT "cxl:t%d"
>  
> +struct cxl_nvdimm_bridge *to_cxl_nvdimm_bridge(struct device *dev);
> +struct cxl_nvdimm_bridge *devm_cxl_add_nvdimm_bridge(struct device *host,
> +						     struct cxl_port *port);
>  #endif /* __CXL_H__ */
> diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
> new file mode 100644
> index 000000000000..0067bd734559
> --- /dev/null
> +++ b/drivers/cxl/pmem.c
> @@ -0,0 +1,141 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
> +#include <linux/libnvdimm.h>
> +#include <linux/device.h>
> +#include <linux/module.h>
> +#include <linux/slab.h>
> +#include "cxl.h"
> +
> +/*
> + * Ordered workqueue for cxl nvdimm device arrival and departure
> + * to coordinate bus rescans when a bridge arrives and trigger remove
> + * operations when the bridge is removed.
> + */
> +static struct workqueue_struct *cxl_pmem_wq;
> +
> +static int cxl_pmem_ctl(struct nvdimm_bus_descriptor *nd_desc,
> +			struct nvdimm *nvdimm, unsigned int cmd, void *buf,
> +			unsigned int buf_len, int *cmd_rc)
> +{
> +	return -ENOTTY;
> +}
> +
> +static void online_nvdimm_bus(struct cxl_nvdimm_bridge *cxl_nvb)
> +{
> +	if (cxl_nvb->nvdimm_bus)
> +		return;
> +	cxl_nvb->nvdimm_bus =
> +		nvdimm_bus_register(&cxl_nvb->dev, &cxl_nvb->nd_desc);
> +}
> +
> +static void offline_nvdimm_bus(struct cxl_nvdimm_bridge *cxl_nvb)
> +{
> +	if (!cxl_nvb->nvdimm_bus)
> +		return;
> +	nvdimm_bus_unregister(cxl_nvb->nvdimm_bus);
> +	cxl_nvb->nvdimm_bus = NULL;
> +}
> +
> +static void cxl_nvb_update_state(struct work_struct *work)
> +{
> +	struct cxl_nvdimm_bridge *cxl_nvb =
> +		container_of(work, typeof(*cxl_nvb), state_work);
> +	bool release = false;
> +
> +	device_lock(&cxl_nvb->dev);
> +	switch (cxl_nvb->state) {
> +	case CXL_NVB_ONLINE:
> +		online_nvdimm_bus(cxl_nvb);
> +		if (!cxl_nvb->nvdimm_bus) {

I'd slightly prefer a simple return code from online_nvdimm_bus()
so the reviewer doesn't have to look up above to find out that
this condition corresponds to failure.

> +			dev_err(&cxl_nvb->dev,
> +				"failed to establish nvdimm bus\n");
> +			release = true;
> +		}
> +		break;
> +	case CXL_NVB_OFFLINE:
> +	case CXL_NVB_DEAD:
> +		offline_nvdimm_bus(cxl_nvb);
> +		break;
> +	default:
> +		break;
> +	}
> +	device_unlock(&cxl_nvb->dev);
> +
> +	if (release)
> +		device_release_driver(&cxl_nvb->dev);
> +
> +	put_device(&cxl_nvb->dev);
> +}
> +
> +static void cxl_nvdimm_bridge_remove(struct device *dev)
> +{
> +	struct cxl_nvdimm_bridge *cxl_nvb = to_cxl_nvdimm_bridge(dev);
> +
> +	if (cxl_nvb->state == CXL_NVB_ONLINE)
> +		cxl_nvb->state = CXL_NVB_OFFLINE;
> +	if (queue_work(cxl_pmem_wq, &cxl_nvb->state_work))
> +		get_device(&cxl_nvb->dev);
> +}
> +
> +static int cxl_nvdimm_bridge_probe(struct device *dev)
> +{
> +	struct cxl_nvdimm_bridge *cxl_nvb = to_cxl_nvdimm_bridge(dev);
> +
> +	if (cxl_nvb->state == CXL_NVB_DEAD)
> +		return -ENXIO;
> +
> +	if (cxl_nvb->state == CXL_NVB_NEW) {
> +		cxl_nvb->nd_desc = (struct nvdimm_bus_descriptor) {
> +			.provider_name = "CXL",
> +			.module = THIS_MODULE,
> +			.ndctl = cxl_pmem_ctl,
> +		};
> +
> +		INIT_WORK(&cxl_nvb->state_work, cxl_nvb_update_state);
> +	}
> +
> +	cxl_nvb->state = CXL_NVB_ONLINE;
> +	if (queue_work(cxl_pmem_wq, &cxl_nvb->state_work))
> +		get_device(&cxl_nvb->dev);
> +
> +	return 0;
> +}
> +
> +static struct cxl_driver cxl_nvdimm_bridge_driver = {
> +	.name = "cxl_nvdimm_bridge",
> +	.probe = cxl_nvdimm_bridge_probe,
> +	.remove = cxl_nvdimm_bridge_remove,
> +	.id = CXL_DEVICE_NVDIMM_BRIDGE,
> +};
> +
> +static __init int cxl_pmem_init(void)
> +{
> +	int rc;
> +
> +	cxl_pmem_wq = alloc_ordered_workqueue("cxl_pmem", 0);
> +
> +	if (!cxl_pmem_wq)
> +		return -ENXIO;
> +
> +	rc = cxl_driver_register(&cxl_nvdimm_bridge_driver);
> +	if (rc)
> +		goto err;
> +
> +	return 0;
> +
> +err:
> +	destroy_workqueue(cxl_pmem_wq);
> +	return rc;
> +}
> +
> +static __exit void cxl_pmem_exit(void)
> +{
> +	cxl_driver_unregister(&cxl_nvdimm_bridge_driver);
> +	destroy_workqueue(cxl_pmem_wq);
> +}
> +
> +MODULE_LICENSE("GPL v2");
> +module_init(cxl_pmem_init);
> +module_exit(cxl_pmem_exit);
> +MODULE_IMPORT_NS(CXL);
> +MODULE_ALIAS_CXL(CXL_DEVICE_NVDIMM_BRIDGE);
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/5] libnvdimm: Drop unused device power management support
  2021-06-10 22:26 ` [PATCH 4/5] libnvdimm: Drop unused device power management support Dan Williams
@ 2021-06-11 11:47   ` Jonathan Cameron
  2021-06-12  0:16     ` Dan Williams
  0 siblings, 1 reply; 18+ messages in thread
From: Jonathan Cameron @ 2021-06-11 11:47 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, nvdimm, ben.widawsky, alison.schofield,
	vishal.l.verma, ira.weiny, linux-kernel

On Thu, 10 Jun 2021 15:26:19 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> LIBNVDIMM device objects register sysfs power attributes despite nothing
> requiring that support. Clean up sysfs remove the power/ attribute
> group. This requires a device_create() and a device_register() usage to
> be converted to the device_initialize() + device_add() pattern.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Trivial comment below. Looks good.

> ---
>  drivers/nvdimm/bus.c |   45 +++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 37 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/nvdimm/bus.c b/drivers/nvdimm/bus.c
> index a11821df83b5..e6aa87043a95 100644
> --- a/drivers/nvdimm/bus.c
> +++ b/drivers/nvdimm/bus.c
> @@ -363,8 +363,13 @@ struct nvdimm_bus *nvdimm_bus_register(struct device *parent,
>  	nvdimm_bus->dev.groups = nd_desc->attr_groups;
>  	nvdimm_bus->dev.bus = &nvdimm_bus_type;
>  	nvdimm_bus->dev.of_node = nd_desc->of_node;
> -	dev_set_name(&nvdimm_bus->dev, "ndbus%d", nvdimm_bus->id);
> -	rc = device_register(&nvdimm_bus->dev);
> +	device_initialize(&nvdimm_bus->dev);
> +	device_set_pm_not_required(&nvdimm_bus->dev);
> +	rc = dev_set_name(&nvdimm_bus->dev, "ndbus%d", nvdimm_bus->id);
> +	if (rc)
> +		goto err;

Maybe mention in patch description that you also now handle errors in
dev_set_name()?

> +
> +	rc = device_add(&nvdimm_bus->dev);
>  	if (rc) {
>  		dev_dbg(&nvdimm_bus->dev, "registration failed: %d\n", rc);
>  		goto err;
> @@ -525,6 +530,7 @@ void __nd_device_register(struct device *dev)
>  		set_dev_node(dev, to_nd_region(dev)->numa_node);
>  
>  	dev->bus = &nvdimm_bus_type;
> +	device_set_pm_not_required(dev);
>  	if (dev->parent) {
>  		get_device(dev->parent);
>  		if (dev_to_node(dev) == NUMA_NO_NODE)
> @@ -717,18 +723,41 @@ const struct attribute_group nd_numa_attribute_group = {
>  	.is_visible = nd_numa_attr_visible,
>  };
>  
> +static void ndctl_release(struct device *dev)
> +{
> +	kfree(dev);
> +}
> +
>  int nvdimm_bus_create_ndctl(struct nvdimm_bus *nvdimm_bus)
>  {
>  	dev_t devt = MKDEV(nvdimm_bus_major, nvdimm_bus->id);
>  	struct device *dev;
> +	int rc;
>  
> -	dev = device_create(nd_class, &nvdimm_bus->dev, devt, nvdimm_bus,
> -			"ndctl%d", nvdimm_bus->id);
> +	dev = kzalloc(sizeof(*dev), GFP_KERNEL);
> +	if (!dev)
> +		return -ENOMEM;
> +	device_initialize(dev);
> +	device_set_pm_not_required(dev);
> +	dev->class = nd_class;
> +	dev->parent = &nvdimm_bus->dev;
> +	dev->devt = devt;
> +	dev->release = ndctl_release;
> +	rc = dev_set_name(dev, "ndctl%d", nvdimm_bus->id);
> +	if (rc)
> +		goto err;
>  
> -	if (IS_ERR(dev))
> -		dev_dbg(&nvdimm_bus->dev, "failed to register ndctl%d: %ld\n",
> -				nvdimm_bus->id, PTR_ERR(dev));
> -	return PTR_ERR_OR_ZERO(dev);
> +	rc = device_add(dev);
> +	if (rc) {
> +		dev_dbg(&nvdimm_bus->dev, "failed to register ndctl%d: %d\n",
> +				nvdimm_bus->id, rc);
> +		goto err;
> +	}
> +	return 0;
> +
> +err:
> +	put_device(dev);
> +	return rc;
>  }
>  
>  void nvdimm_bus_destroy_ndctl(struct nvdimm_bus *nvdimm_bus)
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 5/5] cxl/pmem: Register 'pmem' / cxl_nvdimm  devices
  2021-06-10 22:26 ` [PATCH 5/5] cxl/pmem: Register 'pmem' / cxl_nvdimm devices Dan Williams
@ 2021-06-11 12:57   ` Jonathan Cameron
  2021-06-12  0:34     ` Dan Williams
  0 siblings, 1 reply; 18+ messages in thread
From: Jonathan Cameron @ 2021-06-11 12:57 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, nvdimm, ben.widawsky, alison.schofield,
	vishal.l.verma, ira.weiny, linux-kernel

On Thu, 10 Jun 2021 15:26:26 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> While a memX device on /sys/bus/cxl represents a CXL memory expander
> control interface, a pmemX device represents the persistent memory
> sub-functionality. It bridges the CXL subystem to the libnvdimm nmemX
> control interface.
> 
> With this skeleton ndctl can now see persistent memory devices on a
> "CXL" bus. Later patches add support for translating libnvdimm native
> commands to CXL commands.
> 
> # ndctl list -BDiu -b CXL
> {
>   "provider":"CXL",
>   "dev":"ndbus1",
>   "dimms":[
>     {
>       "dev":"nmem1",
>       "state":"disabled"
>     },
>     {
>       "dev":"nmem0",
>       "state":"disabled"
>     }
>   ]
> }
> 
> Given nvdimm_bus_unregister() removes all devices on an ndbus0 the
> cxl_pmem infrastructure needs to arrange ->remove() to be triggered on
> cxl_nvdimm devices to keep their enabled state synchronized with the
> registration state of their corresponding device on the nvdimm_bus. In
> other words, always arrange for cxl_nvdimm_driver.remove() to unregister
> nvdimms from an nvdimm_bus ahead of the bus being unregistered.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Being my usual fussy self, I've highlighed a few header changes that
as far as I can see are unrelated to this specific patch.

Otherwise, one request for a local variable name change and
a bit of trivial editorial stuff.

Thanks,

Jonathan


> ---
>  drivers/cxl/core.c |   86 ++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxl.h  |   13 ++++++
>  drivers/cxl/mem.h  |    2 +
>  drivers/cxl/pci.c  |   23 ++++++++---
>  drivers/cxl/pmem.c |  111 +++++++++++++++++++++++++++++++++++++++++++++++-----
>  5 files changed, 217 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/cxl/core.c b/drivers/cxl/core.c
> index f0305c9c91c8..6db660249cea 100644
> --- a/drivers/cxl/core.c
> +++ b/drivers/cxl/core.c
> @@ -7,6 +7,7 @@
>  #include <linux/slab.h>
>  #include <linux/idr.h>
>  #include "cxl.h"
> +#include "mem.h"
>  
>  /**
>   * DOC: cxl core
> @@ -731,6 +732,89 @@ struct cxl_nvdimm_bridge *devm_cxl_add_nvdimm_bridge(struct device *host,
>  }
>  EXPORT_SYMBOL_GPL(devm_cxl_add_nvdimm_bridge);
>  
> +static void cxl_nvdimm_release(struct device *dev)
> +{
> +	struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev);
> +
> +	kfree(cxl_nvd);
> +}
> +
> +static const struct attribute_group *cxl_nvdimm_attribute_groups[] = {
> +	&cxl_base_attribute_group,
> +	NULL,
> +};
> +
> +static const struct device_type cxl_nvdimm_type = {
> +	.name = "cxl_nvdimm",
> +	.release = cxl_nvdimm_release,
> +	.groups = cxl_nvdimm_attribute_groups,
> +};
> +
> +bool is_cxl_nvdimm(struct device *dev)
> +{
> +	return dev->type == &cxl_nvdimm_type;
> +}
> +EXPORT_SYMBOL_GPL(is_cxl_nvdimm);
> +
> +struct cxl_nvdimm *to_cxl_nvdimm(struct device *dev)
> +{
> +	if (dev_WARN_ONCE(dev, !is_cxl_nvdimm(dev),
> +			  "not a cxl_nvdimm device\n"))
> +		return NULL;
> +	return container_of(dev, struct cxl_nvdimm, dev);
> +}
> +EXPORT_SYMBOL_GPL(to_cxl_nvdimm);
> +
> +static struct cxl_nvdimm *cxl_nvdimm_alloc(struct cxl_memdev *cxlmd)
> +{
> +	struct cxl_nvdimm *cxl_nvd;
> +	struct device *dev;
> +
> +	cxl_nvd = kzalloc(sizeof(*cxl_nvd), GFP_KERNEL);
> +	if (!cxl_nvd)
> +		return ERR_PTR(-ENOMEM);
> +
> +	dev = &cxl_nvd->dev;
> +	cxl_nvd->cxlmd = cxlmd;
> +	device_initialize(dev);
> +	device_set_pm_not_required(dev);
> +	dev->parent = &cxlmd->dev;
> +	dev->bus = &cxl_bus_type;
> +	dev->type = &cxl_nvdimm_type;
> +
> +	return cxl_nvd;
> +}
> +
> +int devm_cxl_add_nvdimm(struct device *host, struct cxl_memdev *cxlmd)
> +{
> +	struct cxl_nvdimm *cxl_nvd;
> +	struct device *dev;
> +	int rc;
> +
> +	cxl_nvd = cxl_nvdimm_alloc(cxlmd);
> +	if (IS_ERR(cxl_nvd))
> +		return PTR_ERR(cxl_nvd);
> +
> +	dev = &cxl_nvd->dev;
> +	rc = dev_set_name(dev, "pmem%d", cxlmd->id);
> +	if (rc)
> +		goto err;
> +
> +	rc = device_add(dev);
> +	if (rc)
> +		goto err;
> +
> +	dev_dbg(host, "%s: register %s\n", dev_name(dev->parent),
> +		dev_name(dev));
> +
> +	return devm_add_action_or_reset(host, unregister_dev, dev);
> +
> +err:
> +	put_device(dev);
> +	return rc;
> +}
> +EXPORT_SYMBOL_GPL(devm_cxl_add_nvdimm);
> +
>  /**
>   * cxl_probe_device_regs() - Detect CXL Device register blocks
>   * @dev: Host device of the @base mapping
> @@ -930,6 +1014,8 @@ static int cxl_device_id(struct device *dev)
>  {
>  	if (dev->type == &cxl_nvdimm_bridge_type)
>  		return CXL_DEVICE_NVDIMM_BRIDGE;
> +	if (dev->type == &cxl_nvdimm_type)
> +		return CXL_DEVICE_NVDIMM;
>  	return 0;
>  }
>  
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 47fcb7ad5978..3f9a6f7b05db 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -213,6 +213,13 @@ struct cxl_nvdimm_bridge {
>  	enum cxl_nvdimm_brige_state state;
>  };
>  
> +struct cxl_mem;

Above looks unrelated as you've not added any uses of cxl_mem

> +struct cxl_nvdimm {
> +	struct device dev;
> +	struct cxl_memdev *cxlmd;
> +	struct nvdimm *nvdimm;
> +};
> +
>  /**
>   * struct cxl_port - logical collection of upstream port devices and
>   *		     downstream port devices to construct a CXL memory
> @@ -299,7 +306,8 @@ int __cxl_driver_register(struct cxl_driver *cxl_drv, struct module *owner,
>  #define cxl_driver_register(x) __cxl_driver_register(x, THIS_MODULE, KBUILD_MODNAME)
>  void cxl_driver_unregister(struct cxl_driver *cxl_drv);
>  
> -#define CXL_DEVICE_NVDIMM_BRIDGE 1
> +#define CXL_DEVICE_NVDIMM_BRIDGE	1
> +#define CXL_DEVICE_NVDIMM		2
>  
>  #define MODULE_ALIAS_CXL(type) MODULE_ALIAS("cxl:t" __stringify(type) "*")
>  #define CXL_MODALIAS_FMT "cxl:t%d"
> @@ -307,4 +315,7 @@ void cxl_driver_unregister(struct cxl_driver *cxl_drv);
>  struct cxl_nvdimm_bridge *to_cxl_nvdimm_bridge(struct device *dev);
>  struct cxl_nvdimm_bridge *devm_cxl_add_nvdimm_bridge(struct device *host,
>  						     struct cxl_port *port);
> +struct cxl_nvdimm *to_cxl_nvdimm(struct device *dev);
> +bool is_cxl_nvdimm(struct device *dev);
> +int devm_cxl_add_nvdimm(struct device *host, struct cxl_memdev *cxlmd);
>  #endif /* __CXL_H__ */
> diff --git a/drivers/cxl/mem.h b/drivers/cxl/mem.h
> index 13868ff7cadf..8f02d02b26b4 100644
> --- a/drivers/cxl/mem.h
> +++ b/drivers/cxl/mem.h
> @@ -2,6 +2,8 @@
>  /* Copyright(c) 2020-2021 Intel Corporation. */
>  #ifndef __CXL_MEM_H__
>  #define __CXL_MEM_H__
> +#include <linux/cdev.h>

Unrelated to rest of patch.  Good to add this, but not hidden in this
patch.

> +#include "cxl.h"
>  
>  /* CXL 2.0 8.2.8.5.1.1 Memory Device Status Register */
>  #define CXLMDEV_STATUS_OFFSET 0x0
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 5a1705b52278..83e81b44c8f5 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -1313,7 +1313,8 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_mem *cxlm)
>  	return ERR_PTR(rc);
>  }
>  
> -static int cxl_mem_add_memdev(struct cxl_mem *cxlm)
> +static struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
> +					      struct cxl_mem *cxlm)
>  {
>  	struct cxl_memdev *cxlmd;
>  	struct device *dev;
> @@ -1322,7 +1323,7 @@ static int cxl_mem_add_memdev(struct cxl_mem *cxlm)
>  
>  	cxlmd = cxl_memdev_alloc(cxlm);
>  	if (IS_ERR(cxlmd))
> -		return PTR_ERR(cxlmd);
> +		return cxlmd;
>  
>  	dev = &cxlmd->dev;
>  	rc = dev_set_name(dev, "mem%d", cxlmd->id);
> @@ -1340,8 +1341,10 @@ static int cxl_mem_add_memdev(struct cxl_mem *cxlm)
>  	if (rc)
>  		goto err;
>  
> -	return devm_add_action_or_reset(dev->parent, cxl_memdev_unregister,
> -					cxlmd);
> +	rc = devm_add_action_or_reset(host, cxl_memdev_unregister, cxlmd);
> +	if (rc)
> +		return ERR_PTR(rc);
> +	return cxlmd;
>  
>  err:
>  	/*
> @@ -1350,7 +1353,7 @@ static int cxl_mem_add_memdev(struct cxl_mem *cxlm)
>  	 */
>  	cxl_memdev_shutdown(cxlmd);
>  	put_device(dev);
> -	return rc;
> +	return ERR_PTR(rc);
>  }
>  
>  static int cxl_xfer_log(struct cxl_mem *cxlm, uuid_t *uuid, u32 size, u8 *out)
> @@ -1561,6 +1564,7 @@ static int cxl_mem_identify(struct cxl_mem *cxlm)
>  
>  static int cxl_mem_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  {
> +	struct cxl_memdev *cxlmd;
>  	struct cxl_mem *cxlm;
>  	int rc;
>  
> @@ -1588,7 +1592,14 @@ static int cxl_mem_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (rc)
>  		return rc;
>  
> -	return cxl_mem_add_memdev(cxlm);
> +	cxlmd = devm_cxl_add_memdev(&pdev->dev, cxlm);
> +	if (IS_ERR(cxlmd))
> +		return PTR_ERR(cxlmd);
> +
> +	if (range_len(&cxlm->pmem_range) && IS_ENABLED(CONFIG_CXL_PMEM))
> +		rc = devm_cxl_add_nvdimm(&pdev->dev, cxlmd);
> +
> +	return rc;
>  }
>  
>  static const struct pci_device_id cxl_mem_pci_tbl[] = {
> diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
> index 0067bd734559..1ed19e213157 100644
> --- a/drivers/cxl/pmem.c
> +++ b/drivers/cxl/pmem.c
> @@ -3,7 +3,10 @@
>  #include <linux/libnvdimm.h>
>  #include <linux/device.h>
>  #include <linux/module.h>
> +#include <linux/ndctl.h>
> +#include <linux/async.h>
>  #include <linux/slab.h>
> +#include "mem.h"
>  #include "cxl.h"
>  
>  /*
> @@ -13,6 +16,64 @@
>   */
>  static struct workqueue_struct *cxl_pmem_wq;
>  
> +static void unregister_nvdimm(void *nvdimm)
> +{
> +	nvdimm_delete(nvdimm);
> +}
> +
> +static int match_nvdimm_bridge(struct device *dev, const void *data)
> +{
> +	return strcmp(dev_name(dev), "nvdimm-bridge") == 0;
> +}
> +
> +static struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(void)
> +{
> +	struct device *dev;
> +
> +	dev = bus_find_device(&cxl_bus_type, NULL, NULL, match_nvdimm_bridge);

Hmm. It's a singleton, so you could just stash the pointer somewhere
appropriate, but I guess this avoids adding extra infrastructure around
that, so fair enough.

> +	if (!dev)
> +		return NULL;
> +	return to_cxl_nvdimm_bridge(dev);
> +}
> +
> +static int cxl_nvdimm_probe(struct device *dev)
> +{
> +	struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev);
> +	struct cxl_nvdimm_bridge *cxl_nvb;
> +	unsigned long flags = 0;
> +	struct nvdimm *nvdimm;
> +	int rc = -ENXIO;
> +
> +	cxl_nvb = cxl_find_nvdimm_bridge();
> +	if (!cxl_nvb)
> +		return -ENXIO;
> +
> +	device_lock(&cxl_nvb->dev);
> +	if (!cxl_nvb->nvdimm_bus)
> +		goto out;
> +
> +	set_bit(NDD_LABELING, &flags);
> +	nvdimm = nvdimm_create(cxl_nvb->nvdimm_bus, cxl_nvd, NULL, flags, 0, 0,
> +			       NULL);
> +	if (!nvdimm)
> +		goto out;
> +
> +	dev_set_drvdata(dev, nvdimm);

Why?  No harm done, but I wanted to check this was only used to get the
nvdimm, but can't find where it's used at all.

> +
> +	rc = devm_add_action_or_reset(dev, unregister_nvdimm, nvdimm);
> +out:
> +	device_unlock(&cxl_nvb->dev);
> +	put_device(&cxl_nvb->dev);
> +
> +	return rc;
> +}
> +
> +static struct cxl_driver cxl_nvdimm_driver = {
> +	.name = "cxl_nvdimm",
> +	.probe = cxl_nvdimm_probe,
> +	.id = CXL_DEVICE_NVDIMM,
> +};
> +
>  static int cxl_pmem_ctl(struct nvdimm_bus_descriptor *nd_desc,
>  			struct nvdimm *nvdimm, unsigned int cmd, void *buf,
>  			unsigned int buf_len, int *cmd_rc)
> @@ -28,19 +89,31 @@ static void online_nvdimm_bus(struct cxl_nvdimm_bridge *cxl_nvb)
>  		nvdimm_bus_register(&cxl_nvb->dev, &cxl_nvb->nd_desc);
>  }
>  
> -static void offline_nvdimm_bus(struct cxl_nvdimm_bridge *cxl_nvb)
> +static int cxl_nvdimm_release_driver(struct device *dev, void *data)
>  {
> -	if (!cxl_nvb->nvdimm_bus)
> -		return;
> -	nvdimm_bus_unregister(cxl_nvb->nvdimm_bus);
> -	cxl_nvb->nvdimm_bus = NULL;
> +	if (!is_cxl_nvdimm(dev))
> +		return 0;
> +	device_release_driver(dev);
> +	return 0;
> +}
> +
> +static void offline_nvdimm_bus(struct nvdimm_bus *nvdimm_bus)
> +{
> +	/*
> +	 * Set the state of cxl_nvdimm devices to unbound / idle before
> +	 * nvdimm_bus_unregister() rips the nvdimm objects out from
> +	 * underneath them.
> +	 */
> +	bus_for_each_dev(&cxl_bus_type, NULL, NULL, cxl_nvdimm_release_driver);
> +	nvdimm_bus_unregister(nvdimm_bus);
>  }
>  
>  static void cxl_nvb_update_state(struct work_struct *work)
>  {
>  	struct cxl_nvdimm_bridge *cxl_nvb =
>  		container_of(work, typeof(*cxl_nvb), state_work);
> -	bool release = false;
> +	struct nvdimm_bus *nvdimm_bus = NULL;
> +	bool release = false, rescan = false;
>  
>  	device_lock(&cxl_nvb->dev);
>  	switch (cxl_nvb->state) {
> @@ -50,11 +123,13 @@ static void cxl_nvb_update_state(struct work_struct *work)
>  			dev_err(&cxl_nvb->dev,
>  				"failed to establish nvdimm bus\n");
>  			release = true;
> -		}
> +		} else
> +			rescan = true;
>  		break;
>  	case CXL_NVB_OFFLINE:
>  	case CXL_NVB_DEAD:
> -		offline_nvdimm_bus(cxl_nvb);
> +		nvdimm_bus = cxl_nvb->nvdimm_bus;

Perhaps rename this to make it clear that it only exists as a means
to offline it later?  I wondered briefly why you were offlining
any bus that existed (e.g. in the online case)

nvdimm_bus_to_offline = ...

> +		cxl_nvb->nvdimm_bus = NULL;
>  		break;
>  	default:
>  		break;
> @@ -63,6 +138,13 @@ static void cxl_nvb_update_state(struct work_struct *work)
>  
>  	if (release)
>  		device_release_driver(&cxl_nvb->dev);
> +	if (rescan) {
> +		int rc = bus_rescan_devices(&cxl_bus_type);
> +
> +		dev_dbg(&cxl_nvb->dev, "rescan: %d\n", rc);
> +	}
> +	if (nvdimm_bus)
> +		offline_nvdimm_bus(nvdimm_bus);
>  
>  	put_device(&cxl_nvb->dev);
>  }
> @@ -113,23 +195,29 @@ static __init int cxl_pmem_init(void)
>  	int rc;
>  
>  	cxl_pmem_wq = alloc_ordered_workqueue("cxl_pmem", 0);
> -

Shouldn't be here obviously. Move to earlier patch.

>  	if (!cxl_pmem_wq)
>  		return -ENXIO;
>  
>  	rc = cxl_driver_register(&cxl_nvdimm_bridge_driver);
>  	if (rc)
> -		goto err;
> +		goto err_bridge;
> +
> +	rc = cxl_driver_register(&cxl_nvdimm_driver);
> +	if (rc)
> +		goto err_nvdimm;
>  
>  	return 0;
>  
> -err:
> +err_nvdimm:
> +	cxl_driver_unregister(&cxl_nvdimm_bridge_driver);
> +err_bridge:
>  	destroy_workqueue(cxl_pmem_wq);
>  	return rc;
>  }
>  
>  static __exit void cxl_pmem_exit(void)
>  {
> +	cxl_driver_unregister(&cxl_nvdimm_driver);
>  	cxl_driver_unregister(&cxl_nvdimm_bridge_driver);
>  	destroy_workqueue(cxl_pmem_wq);
>  }
> @@ -139,3 +227,4 @@ module_init(cxl_pmem_init);
>  module_exit(cxl_pmem_exit);
>  MODULE_IMPORT_NS(CXL);
>  MODULE_ALIAS_CXL(CXL_DEVICE_NVDIMM_BRIDGE);
> +MODULE_ALIAS_CXL(CXL_DEVICE_NVDIMM);
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/5] cxl/pmem: Add core infrastructure for PMEM support
  2021-06-10 22:25 [PATCH 0/5] cxl/pmem: Add core infrastructure for PMEM support Dan Williams
                   ` (4 preceding siblings ...)
  2021-06-10 22:26 ` [PATCH 5/5] cxl/pmem: Register 'pmem' / cxl_nvdimm devices Dan Williams
@ 2021-06-11 12:59 ` Jonathan Cameron
  5 siblings, 0 replies; 18+ messages in thread
From: Jonathan Cameron @ 2021-06-11 12:59 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, nvdimm, ben.widawsky, alison.schofield,
	vishal.l.verma, ira.weiny, linux-kernel

On Thu, 10 Jun 2021 15:25:57 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> CXL Memory Expander devices (CXL 2.0 Type-3) support persistent memory
> in addition to volatile memory expansion. The most significant changes
> this requires of the existing LIBNVDIMM infrastructure, compared to what
> was needed to support ACPI NFIT defined PMEM, is the ability to
> dynamically provision regions in addition to namespaces, and a formal
> model for hotplug.
> 
> Before region provisioning can be added the CXL enabling needs to
> enumerate "nvdimm" devices on a CXL nvdimm-bus. This is modeled as a
> CXL-nvdimm-bridge device (bridging CXL to nvdimm) and an associated
> driver to activate and deactivate that bus-bridge. Once the bridge is
> registered it scans for CXL nvdimm devices registered by endpoints.  The
> CXL core bus is used as a rendezvous for nvdimm bridges and endpoints
> allowing them to be registered and enabled in any order.
> 
> At the end of this series the ndctl utility can see CXL nvdimm resources
> just like any other nvdimm bus.
> 
>     # ndctl list -BDiu -b CXL
>     {
>       "provider":"CXL",
>       "dev":"ndbus1",
>       "dimms":[
>         { 
>           "dev":"nmem1",
>           "state":"disabled"
>         },
>         { 
>           "dev":"nmem0",
>           "state":"disabled"
>         }
>       ]
>     }
> 
> Follow-on patches extend the nvdimm core label support for CXL region
> and namespace labels. For now just add the machinery to register the
> bus and nvdimm base objects.

Hi Dan,

Nice set. I won't claim to know all that much about the nvdimm
side of things. With that in mind and the fact all my comments
were trivial stuff you can clean up without me reading it again.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

for whole set.

Thanks,

Jonathan

> 
> ---
> 
> Dan Williams (5):
>       cxl/core: Add cxl-bus driver infrastructure
>       cxl/pmem: Add initial infrastructure for pmem support
>       libnvdimm: Export nvdimm shutdown helper, nvdimm_delete()
>       libnvdimm: Drop unused device power management support
>       cxl/pmem: Register 'pmem' / cxl_nvdimm  devices
> 
> 
>  drivers/cxl/Kconfig        |   13 ++
>  drivers/cxl/Makefile       |    2 
>  drivers/cxl/acpi.c         |   37 +++++-
>  drivers/cxl/core.c         |  281 ++++++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxl.h          |   57 +++++++++
>  drivers/cxl/mem.h          |    2 
>  drivers/cxl/pci.c          |   23 +++-
>  drivers/cxl/pmem.c         |  230 ++++++++++++++++++++++++++++++++++++
>  drivers/nvdimm/bus.c       |   64 ++++++----
>  drivers/nvdimm/dimm_devs.c |   18 +++
>  include/linux/libnvdimm.h  |    1 
>  11 files changed, 697 insertions(+), 31 deletions(-)
>  create mode 100644 drivers/cxl/pmem.c
> 
> base-commit: 40ba17afdfabb01688c61565dbe02a916241bc05


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/5] cxl/core: Add cxl-bus driver infrastructure
  2021-06-10 22:26 ` [PATCH 1/5] cxl/core: Add cxl-bus driver infrastructure Dan Williams
@ 2021-06-11 17:47   ` Ben Widawsky
  2021-06-11 18:55     ` Dan Williams
  0 siblings, 1 reply; 18+ messages in thread
From: Ben Widawsky @ 2021-06-11 17:47 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, nvdimm, alison.schofield, vishal.l.verma, ira.weiny,
	linux-kernel

On 21-06-10 15:26:03, Dan Williams wrote:
> Enable devices on the 'cxl' bus to be attached to drivers. The initial
> user of this functionality is a driver for an 'nvdimm-bridge' device
> that anchors a libnvdimm hierarchy attached to CXL persistent memory
> resources. Other device types that will leverage this include:
> 
> cxl_port: map and use component register functionality (HDM Decoders)

Since I'm looking at this now, perhaps I can open the discussion here. Have you
thought about how this works yet? Right now I'm thinking there are two "drivers":
cxl_port: Switches (and ACPI0016)
cxl_mem: The memory device's HDM decoders

For port, probe() will figure out that the thing is an upstream port, call
cxl_probe_component_regs and then call devm_cxl_add_port(). I think that's
straight forward.

For the memory device we've already probed the thing via class code so there is
no need to use this driver registration, however, I think it would be nice to do
so. Is there a clean way to do that?

Also, I'd like to make sure we're on the same page about struct cxl_decoder.
Right now they are only created for active HDM decoders. Going forward, we can
either maintain a count of unused decoders on the given CXL component, or we can
instantiate a struct cxl_decoder that isn't active, ie. no interleave ways
granularit, base, etc. What's your thinking there?

> 
> cxl_nvdimm: translate CXL memory expander endpoints to libnvdimm
> 	    'nvdimm' objects
> 
> cxl_region: translate CXL interleave sets to libnvdimm 'region' objects
> 
> The pairing of devices to drivers is handled through the cxl_device_id()
> matching to cxl_driver.id values. A cxl_device_id() of '0' indicates no
> driver support.
> 
> In addition to ->match(), ->probe(), and ->remove() support for the
> 'cxl' bus introduce MODULE_ALIAS_CXL() to autoload modules containing
> cxl-drivers. Drivers are added in follow-on changes.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/core.c |   73 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxl.h  |   22 ++++++++++++++++
>  2 files changed, 95 insertions(+)
> 
> diff --git a/drivers/cxl/core.c b/drivers/cxl/core.c
> index 1b9ee0b08384..959cecc1f6bf 100644
> --- a/drivers/cxl/core.c
> +++ b/drivers/cxl/core.c
> @@ -767,8 +767,81 @@ int cxl_map_device_regs(struct pci_dev *pdev,
>  }
>  EXPORT_SYMBOL_GPL(cxl_map_device_regs);
>  
> +/**
> + * __cxl_driver_register - register a driver for the cxl bus
> + * @cxl_drv: cxl driver structure to attach
> + * @owner: owning module/driver
> + * @modname: KBUILD_MODNAME for parent driver
> + */
> +int __cxl_driver_register(struct cxl_driver *cxl_drv, struct module *owner,
> +			  const char *modname)
> +{
> +	if (!cxl_drv->probe) {
> +		pr_debug("%s ->probe() must be specified\n", modname);
> +		return -EINVAL;
> +	}
> +
> +	if (!cxl_drv->name) {
> +		pr_debug("%s ->name must be specified\n", modname);
> +		return -EINVAL;
> +	}
> +
> +	if (!cxl_drv->id) {
> +		pr_debug("%s ->id must be specified\n", modname);
> +		return -EINVAL;
> +	}
> +
> +	cxl_drv->drv.bus = &cxl_bus_type;
> +	cxl_drv->drv.owner = owner;
> +	cxl_drv->drv.mod_name = modname;
> +	cxl_drv->drv.name = cxl_drv->name;
> +
> +	return driver_register(&cxl_drv->drv);
> +}
> +EXPORT_SYMBOL_GPL(__cxl_driver_register);
> +
> +void cxl_driver_unregister(struct cxl_driver *cxl_drv)
> +{
> +	driver_unregister(&cxl_drv->drv);
> +}
> +EXPORT_SYMBOL_GPL(cxl_driver_unregister);
> +
> +static int cxl_device_id(struct device *dev)
> +{
> +	return 0;
> +}
> +
> +static int cxl_bus_uevent(struct device *dev, struct kobj_uevent_env *env)
> +{
> +	return add_uevent_var(env, "MODALIAS=" CXL_MODALIAS_FMT,
> +			      cxl_device_id(dev));
> +}
> +
> +static int cxl_bus_match(struct device *dev, struct device_driver *drv)
> +{
> +	return cxl_device_id(dev) == to_cxl_drv(drv)->id;
> +}
> +
> +static int cxl_bus_probe(struct device *dev)
> +{
> +	return to_cxl_drv(dev->driver)->probe(dev);
> +}
> +
> +static int cxl_bus_remove(struct device *dev)
> +{
> +	struct cxl_driver *cxl_drv = to_cxl_drv(dev->driver);
> +
> +	if (cxl_drv->remove)
> +		cxl_drv->remove(dev);
> +	return 0;
> +}
> +
>  struct bus_type cxl_bus_type = {
>  	.name = "cxl",
> +	.uevent = cxl_bus_uevent,
> +	.match = cxl_bus_match,
> +	.probe = cxl_bus_probe,
> +	.remove = cxl_bus_remove,
>  };
>  EXPORT_SYMBOL_GPL(cxl_bus_type);
>  
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index b988ea288f53..af2237d1c761 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -261,4 +261,26 @@ devm_cxl_add_passthrough_decoder(struct device *host, struct cxl_port *port)
>  }
>  
>  extern struct bus_type cxl_bus_type;
> +
> +struct cxl_driver {
> +	const char *name;
> +	int (*probe)(struct device *dev);
> +	void (*remove)(struct device *dev);
> +	struct device_driver drv;
> +	int id;
> +};
> +
> +static inline struct cxl_driver *to_cxl_drv(struct device_driver *drv)
> +{
> +	return container_of(drv, struct cxl_driver, drv);
> +}
> +
> +int __cxl_driver_register(struct cxl_driver *cxl_drv, struct module *owner,
> +			  const char *modname);
> +#define cxl_driver_register(x) __cxl_driver_register(x, THIS_MODULE, KBUILD_MODNAME)
> +void cxl_driver_unregister(struct cxl_driver *cxl_drv);
> +
> +#define MODULE_ALIAS_CXL(type) MODULE_ALIAS("cxl:t" __stringify(type) "*")
> +#define CXL_MODALIAS_FMT "cxl:t%d"
> +
>  #endif /* __CXL_H__ */
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/5] cxl/core: Add cxl-bus driver infrastructure
  2021-06-11 17:47   ` Ben Widawsky
@ 2021-06-11 18:55     ` Dan Williams
  2021-06-11 19:28       ` Ben Widawsky
  0 siblings, 1 reply; 18+ messages in thread
From: Dan Williams @ 2021-06-11 18:55 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Linux NVDIMM, Schofield, Alison, Vishal L Verma,
	Weiny, Ira, Linux Kernel Mailing List

On Fri, Jun 11, 2021 at 10:47 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-06-10 15:26:03, Dan Williams wrote:
> > Enable devices on the 'cxl' bus to be attached to drivers. The initial
> > user of this functionality is a driver for an 'nvdimm-bridge' device
> > that anchors a libnvdimm hierarchy attached to CXL persistent memory
> > resources. Other device types that will leverage this include:
> >
> > cxl_port: map and use component register functionality (HDM Decoders)
>
> Since I'm looking at this now, perhaps I can open the discussion here. Have you
> thought about how this works yet? Right now I'm thinking there are two "drivers":
> cxl_port: Switches (and ACPI0016)
> cxl_mem: The memory device's HDM decoders
>
> For port, probe() will figure out that the thing is an upstream port, call
> cxl_probe_component_regs and then call devm_cxl_add_port(). I think that's
> straight forward.

I was expecting cxl_port_driver.probe() comes *after* port discovery.
Think of it like PCI discovery. Some agent does the hardware topology
scan to add devices, in this case devm_cxl_add_port(), and that
triggers cxl_port_driver to load. So the initial enumeration done by
the cxl_acpi driver will populate the first two levels of the port
hierarchy with port objects and populate their component register
physical base addresses. For any other port deeper in the hierarchy I
was expecting that to be scanned after the discovery of a cxl_memdev
that is not attached to the current hierarchy. So, for example imagine
a config like:

Platform --> Host Bridge --> Switch --> Endpoint

...where in sysfs that's modeled as:

root0 --> port1 --> port2 --> port3

Where port3 is assuming that the CXL core models the device's
connection to the topology as yet another cxl_port. At the beginning
of time after cxl_acpi has loaded but before cxl_pci has discovered
the endpoint the topology is:

root0 --> port1

Upon the detection of the endpoint the CXL core can assume that all
intermediary switches between the root and this device have been
registered as PCI devices. So, it follows that endpoint device arrival
triggers "cxl_bus_rescan()" that goes and enumerates all the CXL
resources in the topology to produce:

root0 --> port1 --> port2 --> port3

> For the memory device we've already probed the thing via class code so there is
> no need to use this driver registration, however, I think it would be nice to do
> so. Is there a clean way to do that?

The PCI device associated with the endpoint is already probed, but the
cxl_memdev itself can have a driver on the CXL bus. So I think the
cxl_memdev driver should try to register a cxl_port after telling
cxl_acpi to rescan. If a check like "is_cxl_dport(pdev->dev.parent)"
for the endpoint returns false it means that the cxl_bus_rescan()
failed to enumerate the CXL topology to this endpoint and this
endpoint is limited to only CXL.io operation.

> Also, I'd like to make sure we're on the same page about struct cxl_decoder.
> Right now they are only created for active HDM decoders.

No, I was expecting they are also created for inactive ones. I am
thinking that all decoders ultimately belong to the cxl_acpi driver,
or whatever driver is acting as the root on a non-ACPI system. All
decoder programming is driven by region activation stimulus that asks
the root driver to try to establish a decode chain through the
hieararchy per a given region.

> Going forward, we can
> either maintain a count of unused decoders on the given CXL component, or we can
> instantiate a struct cxl_decoder that isn't active, ie. no interleave ways
> granularit, base, etc. What's your thinking there?

All resources are enumerated, just like PCI. Decode setup belongs to
the core, just like PCI MMIO resource setup. The difference is that
port drivers are needed to map component registers and service
requests from cxl_acpi to reconfigure, but other than that
cxl_decoders themselves don't have drivers and just reflect the
current state of what cxl_acpi / cxl_core have established.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/5] cxl/core: Add cxl-bus driver infrastructure
  2021-06-11 18:55     ` Dan Williams
@ 2021-06-11 19:28       ` Ben Widawsky
  2021-06-11 23:25         ` Dan Williams
  0 siblings, 1 reply; 18+ messages in thread
From: Ben Widawsky @ 2021-06-11 19:28 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Linux NVDIMM, Schofield, Alison, Vishal L Verma,
	Weiny, Ira, Linux Kernel Mailing List

On 21-06-11 11:55:39, Dan Williams wrote:
> On Fri, Jun 11, 2021 at 10:47 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > On 21-06-10 15:26:03, Dan Williams wrote:
> > > Enable devices on the 'cxl' bus to be attached to drivers. The initial
> > > user of this functionality is a driver for an 'nvdimm-bridge' device
> > > that anchors a libnvdimm hierarchy attached to CXL persistent memory
> > > resources. Other device types that will leverage this include:
> > >
> > > cxl_port: map and use component register functionality (HDM Decoders)
> >
> > Since I'm looking at this now, perhaps I can open the discussion here. Have you
> > thought about how this works yet? Right now I'm thinking there are two "drivers":
> > cxl_port: Switches (and ACPI0016)
> > cxl_mem: The memory device's HDM decoders
> >
> > For port, probe() will figure out that the thing is an upstream port, call
> > cxl_probe_component_regs and then call devm_cxl_add_port(). I think that's
> > straight forward.
> 
> I was expecting cxl_port_driver.probe() comes *after* port discovery.
> Think of it like PCI discovery. Some agent does the hardware topology
> scan to add devices, in this case devm_cxl_add_port(), and that
> triggers cxl_port_driver to load. So the initial enumeration done by
> the cxl_acpi driver will populate the first two levels of the port
> hierarchy with port objects and populate their component register
> physical base addresses. For any other port deeper in the hierarchy I
> was expecting that to be scanned after the discovery of a cxl_memdev
> that is not attached to the current hierarchy. So, for example imagine
> a config like:
> 
> Platform --> Host Bridge --> Switch --> Endpoint
> 
> ...where in sysfs that's modeled as:
> 
> root0 --> port1 --> port2 --> port3
> 
> Where port3 is assuming that the CXL core models the device's
> connection to the topology as yet another cxl_port. At the beginning
> of time after cxl_acpi has loaded but before cxl_pci has discovered
> the endpoint the topology is:
> 
> root0 --> port1
> 
> Upon the detection of the endpoint the CXL core can assume that all
> intermediary switches between the root and this device have been
> registered as PCI devices. So, it follows that endpoint device arrival
> triggers "cxl_bus_rescan()" that goes and enumerates all the CXL
> resources in the topology to produce:
> 
> root0 --> port1 --> port2 --> port3
> 

Ah, I had written about scan/rescan in an earlier version of my email but
dropped it. I was actually going to suggest it being a sysfs attr, but I'm fine
with it being implicit so long as...

How do we assert that cxl_pci doesn't run before cxl_acpi has done anything? I
like the idea that the endpoint device can simply ask cxl_acpi to rescan, I just
don't see how it works. I suppose we can queue up the requests to rescan in
cxl_acpi if the ordering can't be guaranteed.

> > For the memory device we've already probed the thing via class code so there is
> > no need to use this driver registration, however, I think it would be nice to do
> > so. Is there a clean way to do that?
> 
> The PCI device associated with the endpoint is already probed, but the
> cxl_memdev itself can have a driver on the CXL bus. So I think the
> cxl_memdev driver should try to register a cxl_port after telling
> cxl_acpi to rescan. If a check like "is_cxl_dport(pdev->dev.parent)"
> for the endpoint returns false it means that the cxl_bus_rescan()
> failed to enumerate the CXL topology to this endpoint and this
> endpoint is limited to only CXL.io operation.

What is going to invoke the memdev driver's probe? That is where we're talking
about putting that is_cxl_dport(...) right? That is the part that tripped me up
and inspired the original email FWIW.

> 
> > Also, I'd like to make sure we're on the same page about struct cxl_decoder.
> > Right now they are only created for active HDM decoders.
> 
> No, I was expecting they are also created for inactive ones. I am
> thinking that all decoders ultimately belong to the cxl_acpi driver,
> or whatever driver is acting as the root on a non-ACPI system. All
> decoder programming is driven by region activation stimulus that asks
> the root driver to try to establish a decode chain through the
> hieararchy per a given region.
> 
> > Going forward, we can
> > either maintain a count of unused decoders on the given CXL component, or we can
> > instantiate a struct cxl_decoder that isn't active, ie. no interleave ways
> > granularit, base, etc. What's your thinking there?
> 
> All resources are enumerated, just like PCI. Decode setup belongs to
> the core, just like PCI MMIO resource setup. The difference is that
> port drivers are needed to map component registers and service
> requests from cxl_acpi to reconfigure, but other than that
> cxl_decoders themselves don't have drivers and just reflect the
> current state of what cxl_acpi / cxl_core have established.

Okay.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/5] cxl/core: Add cxl-bus driver infrastructure
  2021-06-11 19:28       ` Ben Widawsky
@ 2021-06-11 23:25         ` Dan Williams
  2021-06-14 21:40           ` Ben Widawsky
  0 siblings, 1 reply; 18+ messages in thread
From: Dan Williams @ 2021-06-11 23:25 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Linux NVDIMM, Schofield, Alison, Vishal L Verma,
	Weiny, Ira, Linux Kernel Mailing List

On Fri, Jun 11, 2021 at 12:28 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-06-11 11:55:39, Dan Williams wrote:
> > On Fri, Jun 11, 2021 at 10:47 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >
> > > On 21-06-10 15:26:03, Dan Williams wrote:
> > > > Enable devices on the 'cxl' bus to be attached to drivers. The initial
> > > > user of this functionality is a driver for an 'nvdimm-bridge' device
> > > > that anchors a libnvdimm hierarchy attached to CXL persistent memory
> > > > resources. Other device types that will leverage this include:
> > > >
> > > > cxl_port: map and use component register functionality (HDM Decoders)
> > >
> > > Since I'm looking at this now, perhaps I can open the discussion here. Have you
> > > thought about how this works yet? Right now I'm thinking there are two "drivers":
> > > cxl_port: Switches (and ACPI0016)
> > > cxl_mem: The memory device's HDM decoders
> > >
> > > For port, probe() will figure out that the thing is an upstream port, call
> > > cxl_probe_component_regs and then call devm_cxl_add_port(). I think that's
> > > straight forward.
> >
> > I was expecting cxl_port_driver.probe() comes *after* port discovery.
> > Think of it like PCI discovery. Some agent does the hardware topology
> > scan to add devices, in this case devm_cxl_add_port(), and that
> > triggers cxl_port_driver to load. So the initial enumeration done by
> > the cxl_acpi driver will populate the first two levels of the port
> > hierarchy with port objects and populate their component register
> > physical base addresses. For any other port deeper in the hierarchy I
> > was expecting that to be scanned after the discovery of a cxl_memdev
> > that is not attached to the current hierarchy. So, for example imagine
> > a config like:
> >
> > Platform --> Host Bridge --> Switch --> Endpoint
> >
> > ...where in sysfs that's modeled as:
> >
> > root0 --> port1 --> port2 --> port3
> >
> > Where port3 is assuming that the CXL core models the device's
> > connection to the topology as yet another cxl_port. At the beginning
> > of time after cxl_acpi has loaded but before cxl_pci has discovered
> > the endpoint the topology is:
> >
> > root0 --> port1
> >
> > Upon the detection of the endpoint the CXL core can assume that all
> > intermediary switches between the root and this device have been
> > registered as PCI devices. So, it follows that endpoint device arrival
> > triggers "cxl_bus_rescan()" that goes and enumerates all the CXL
> > resources in the topology to produce:
> >
> > root0 --> port1 --> port2 --> port3
> >
>
> Ah, I had written about scan/rescan in an earlier version of my email but
> dropped it. I was actually going to suggest it being a sysfs attr, but I'm fine
> with it being implicit so long as...
>
> How do we assert that cxl_pci doesn't run before cxl_acpi has done anything?

I don't think we need to, or it's broken if the driver load order
matters. The nvdimm enabling code is an example of how to handle this.
The cxl_nvdimm object can be registered before the cxl_nvdimm_bridge,
or after, does not matter. If the cxl_nvdimm comes first it will
trigger the cxl_nvdimm_driver to load. The cxl_nvdimm_driver.probe()
routine finds no bridge present and probe() returns with a failure.
When the bridge arrives it does a rescan  of the cxl_bus_type device
list and if it finds a cxl_nvdimm it re-triggers
cxl_nvdimm_driver.probe(). This time through cxl_nvdimm_driver.probe()
finds the bridge and registers the real nvdimm on the nvdimm_bus.

> I
> like the idea that the endpoint device can simply ask cxl_acpi to rescan, I just
> don't see how it works. I suppose we can queue up the requests to rescan in
> cxl_acpi if the ordering can't be guaranteed.

I think this means that the devm_cxl_add_port() would be triggered by
cxl_memdev_driver.probe() if and only if the parent pci_device of the
CXL endpoint is listed as a dport. If the cxl_memdev is registered
first the search it will search for the CXL root port on the
cxl_bus_type device list. If that fails then cxl_memdev_driver.probe()
fails. If that succeeds it asks the root to scan to the CXL endpoint
parent pci_device and return the confirmation that it is registered as
a dport. If that fails then the device is plugged into a pure PCIe
slot.

When cxl_acpi loads it retriggers all cxl_memdev_driver.probe() to
reconsider all cxl_memdev instances that failed to probe previously.

>
> > > For the memory device we've already probed the thing via class code so there is
> > > no need to use this driver registration, however, I think it would be nice to do
> > > so. Is there a clean way to do that?
> >
> > The PCI device associated with the endpoint is already probed, but the
> > cxl_memdev itself can have a driver on the CXL bus. So I think the
> > cxl_memdev driver should try to register a cxl_port after telling
> > cxl_acpi to rescan. If a check like "is_cxl_dport(pdev->dev.parent)"
> > for the endpoint returns false it means that the cxl_bus_rescan()
> > failed to enumerate the CXL topology to this endpoint and this
> > endpoint is limited to only CXL.io operation.
>
> What is going to invoke the memdev driver's probe? That is where we're talking
> about putting that is_cxl_dport(...) right? That is the part that tripped me up
> and inspired the original email FWIW.

I *think* I worked that out above, but yes please do poke at it to see
if it holds up.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/5] cxl/pmem: Add initial infrastructure for pmem support
  2021-06-11 11:39   ` Jonathan Cameron
@ 2021-06-12  0:07     ` Dan Williams
  0 siblings, 0 replies; 18+ messages in thread
From: Dan Williams @ 2021-06-12  0:07 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, Linux NVDIMM, Ben Widawsky, Schofield, Alison,
	Vishal L Verma, Weiny, Ira, Linux Kernel Mailing List

On Fri, Jun 11, 2021 at 4:40 AM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Thu, 10 Jun 2021 15:26:08 -0700
> Dan Williams <dan.j.williams@intel.com> wrote:
>
> > Register an 'nvdimm-bridge' device to act as an anchor for a libnvdimm
> > bus hierarchy. Also, flesh out the cxl_bus definition to allow a
> > cxl_nvdimm_bridge_driver to attach to the bridge and trigger the
> > nvdimm-bus registration.
> >
> > The creation of the bridge is gated on the detection of a PMEM capable
> > address space registered to the root. The bridge indirection allows the
> > libnvdimm module to remain unloaded on platforms without PMEM support.
> >
> > Given that the probing of ACPI0017 is asynchronous to CXL endpoint
> > devices, and the expectation that CXL endpoint devices register other
> > PMEM resources on the 'CXL' nvdimm bus, a workqueue is added. The
> > workqueue is needed to run bus_rescan_devices() outside of the
> > device_lock() of the nvdimm-bridge device to rendezvous nvdimm resources
> > as they arrive. For now only the bus is taken online/offline in the
> > workqueue.
> >
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>
> I'm not that familiar with nvdimm side of things, so this is mostly
> superficial review of the patch itself.
>
> A few really minor comments inline, but otherwise looks good to me.
>
> Jonathan
>
[..]
> > +static void unregister_nvb(void *_cxl_nvb)
> > +{
> > +     struct cxl_nvdimm_bridge *cxl_nvb = _cxl_nvb;
> > +     bool flush = false;
> > +
> > +     /*
> > +      * If the bridge was ever activated then there might be in-flight state
> > +      * work to flush. Once the state has been changed to 'dead' then no new
> > +      * work can be queued by user-triggered bind.
> > +      */
> > +     device_lock(&cxl_nvb->dev);
> > +     if (cxl_nvb->state != CXL_NVB_NEW)
> > +             flush = true;
>
> flush = clx_nvb->state != CXL_NVB_NEW;
>
> perhaps?

Oh, yeah, that's nicer.

[..]
> > +static void cxl_nvb_update_state(struct work_struct *work)
> > +{
> > +     struct cxl_nvdimm_bridge *cxl_nvb =
> > +             container_of(work, typeof(*cxl_nvb), state_work);
> > +     bool release = false;
> > +
> > +     device_lock(&cxl_nvb->dev);
> > +     switch (cxl_nvb->state) {
> > +     case CXL_NVB_ONLINE:
> > +             online_nvdimm_bus(cxl_nvb);
> > +             if (!cxl_nvb->nvdimm_bus) {
>
> I'd slightly prefer a simple return code from online_nvdimm_bus()
> so the reviewer doesn't have to look up above to find out that
> this condition corresponds to failure.

Yeah, not sure why I made that so obscure.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/5] libnvdimm: Drop unused device power management support
  2021-06-11 11:47   ` Jonathan Cameron
@ 2021-06-12  0:16     ` Dan Williams
  0 siblings, 0 replies; 18+ messages in thread
From: Dan Williams @ 2021-06-12  0:16 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, Linux NVDIMM, Ben Widawsky, Schofield, Alison,
	Vishal L Verma, Weiny, Ira, Linux Kernel Mailing List

On Fri, Jun 11, 2021 at 4:47 AM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Thu, 10 Jun 2021 15:26:19 -0700
> Dan Williams <dan.j.williams@intel.com> wrote:
>
> > LIBNVDIMM device objects register sysfs power attributes despite nothing
> > requiring that support. Clean up sysfs remove the power/ attribute
> > group. This requires a device_create() and a device_register() usage to
> > be converted to the device_initialize() + device_add() pattern.
> >
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> Trivial comment below. Looks good.
>
> > ---
> >  drivers/nvdimm/bus.c |   45 +++++++++++++++++++++++++++++++++++++--------
> >  1 file changed, 37 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/nvdimm/bus.c b/drivers/nvdimm/bus.c
> > index a11821df83b5..e6aa87043a95 100644
> > --- a/drivers/nvdimm/bus.c
> > +++ b/drivers/nvdimm/bus.c
> > @@ -363,8 +363,13 @@ struct nvdimm_bus *nvdimm_bus_register(struct device *parent,
> >       nvdimm_bus->dev.groups = nd_desc->attr_groups;
> >       nvdimm_bus->dev.bus = &nvdimm_bus_type;
> >       nvdimm_bus->dev.of_node = nd_desc->of_node;
> > -     dev_set_name(&nvdimm_bus->dev, "ndbus%d", nvdimm_bus->id);
> > -     rc = device_register(&nvdimm_bus->dev);
> > +     device_initialize(&nvdimm_bus->dev);
> > +     device_set_pm_not_required(&nvdimm_bus->dev);
> > +     rc = dev_set_name(&nvdimm_bus->dev, "ndbus%d", nvdimm_bus->id);
> > +     if (rc)
> > +             goto err;
>
> Maybe mention in patch description that you also now handle errors in
> dev_set_name()?

Yeah, that's a philosophy change from when this code was first written.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 5/5] cxl/pmem: Register 'pmem' / cxl_nvdimm devices
  2021-06-11 12:57   ` Jonathan Cameron
@ 2021-06-12  0:34     ` Dan Williams
  0 siblings, 0 replies; 18+ messages in thread
From: Dan Williams @ 2021-06-12  0:34 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, Linux NVDIMM, Ben Widawsky, Schofield, Alison,
	Vishal L Verma, Weiny, Ira, Linux Kernel Mailing List

On Fri, Jun 11, 2021 at 5:57 AM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Thu, 10 Jun 2021 15:26:26 -0700
> Dan Williams <dan.j.williams@intel.com> wrote:
>
> > While a memX device on /sys/bus/cxl represents a CXL memory expander
> > control interface, a pmemX device represents the persistent memory
> > sub-functionality. It bridges the CXL subystem to the libnvdimm nmemX
> > control interface.
> >
> > With this skeleton ndctl can now see persistent memory devices on a
> > "CXL" bus. Later patches add support for translating libnvdimm native
> > commands to CXL commands.
> >
> > # ndctl list -BDiu -b CXL
> > {
> >   "provider":"CXL",
> >   "dev":"ndbus1",
> >   "dimms":[
> >     {
> >       "dev":"nmem1",
> >       "state":"disabled"
> >     },
> >     {
> >       "dev":"nmem0",
> >       "state":"disabled"
> >     }
> >   ]
> > }
> >
> > Given nvdimm_bus_unregister() removes all devices on an ndbus0 the
> > cxl_pmem infrastructure needs to arrange ->remove() to be triggered on
> > cxl_nvdimm devices to keep their enabled state synchronized with the
> > registration state of their corresponding device on the nvdimm_bus. In
> > other words, always arrange for cxl_nvdimm_driver.remove() to unregister
> > nvdimms from an nvdimm_bus ahead of the bus being unregistered.
> >
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> Being my usual fussy self, I've highlighed a few header changes that
> as far as I can see are unrelated to this specific patch.
>
> Otherwise, one request for a local variable name change and
> a bit of trivial editorial stuff.
>
> Thanks,
>
> Jonathan
>
>
> > ---
> >  drivers/cxl/core.c |   86 ++++++++++++++++++++++++++++++++++++++++
> >  drivers/cxl/cxl.h  |   13 ++++++
> >  drivers/cxl/mem.h  |    2 +
> >  drivers/cxl/pci.c  |   23 ++++++++---
> >  drivers/cxl/pmem.c |  111 +++++++++++++++++++++++++++++++++++++++++++++++-----
> >  5 files changed, 217 insertions(+), 18 deletions(-)
> >
> > diff --git a/drivers/cxl/core.c b/drivers/cxl/core.c
> > index f0305c9c91c8..6db660249cea 100644
> > --- a/drivers/cxl/core.c
> > +++ b/drivers/cxl/core.c
> > @@ -7,6 +7,7 @@
> >  #include <linux/slab.h>
> >  #include <linux/idr.h>
> >  #include "cxl.h"
> > +#include "mem.h"

Looks like this can go too...

> >
> >  /**
> >   * DOC: cxl core
> > @@ -731,6 +732,89 @@ struct cxl_nvdimm_bridge *devm_cxl_add_nvdimm_bridge(struct device *host,
> >  }
> >  EXPORT_SYMBOL_GPL(devm_cxl_add_nvdimm_bridge);
> >
> > +static void cxl_nvdimm_release(struct device *dev)
> > +{
> > +     struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev);
> > +
> > +     kfree(cxl_nvd);
> > +}
> > +
> > +static const struct attribute_group *cxl_nvdimm_attribute_groups[] = {
> > +     &cxl_base_attribute_group,
> > +     NULL,
> > +};
> > +
> > +static const struct device_type cxl_nvdimm_type = {
> > +     .name = "cxl_nvdimm",
> > +     .release = cxl_nvdimm_release,
> > +     .groups = cxl_nvdimm_attribute_groups,
> > +};
> > +
> > +bool is_cxl_nvdimm(struct device *dev)
> > +{
> > +     return dev->type == &cxl_nvdimm_type;
> > +}
> > +EXPORT_SYMBOL_GPL(is_cxl_nvdimm);
> > +
> > +struct cxl_nvdimm *to_cxl_nvdimm(struct device *dev)
> > +{
> > +     if (dev_WARN_ONCE(dev, !is_cxl_nvdimm(dev),
> > +                       "not a cxl_nvdimm device\n"))
> > +             return NULL;
> > +     return container_of(dev, struct cxl_nvdimm, dev);
> > +}
> > +EXPORT_SYMBOL_GPL(to_cxl_nvdimm);
> > +
> > +static struct cxl_nvdimm *cxl_nvdimm_alloc(struct cxl_memdev *cxlmd)
> > +{
> > +     struct cxl_nvdimm *cxl_nvd;
> > +     struct device *dev;
> > +
> > +     cxl_nvd = kzalloc(sizeof(*cxl_nvd), GFP_KERNEL);
> > +     if (!cxl_nvd)
> > +             return ERR_PTR(-ENOMEM);
> > +
> > +     dev = &cxl_nvd->dev;
> > +     cxl_nvd->cxlmd = cxlmd;
> > +     device_initialize(dev);
> > +     device_set_pm_not_required(dev);
> > +     dev->parent = &cxlmd->dev;
> > +     dev->bus = &cxl_bus_type;
> > +     dev->type = &cxl_nvdimm_type;
> > +
> > +     return cxl_nvd;
> > +}
> > +
> > +int devm_cxl_add_nvdimm(struct device *host, struct cxl_memdev *cxlmd)
> > +{
> > +     struct cxl_nvdimm *cxl_nvd;
> > +     struct device *dev;
> > +     int rc;
> > +
> > +     cxl_nvd = cxl_nvdimm_alloc(cxlmd);
> > +     if (IS_ERR(cxl_nvd))
> > +             return PTR_ERR(cxl_nvd);
> > +
> > +     dev = &cxl_nvd->dev;
> > +     rc = dev_set_name(dev, "pmem%d", cxlmd->id);
> > +     if (rc)
> > +             goto err;
> > +
> > +     rc = device_add(dev);
> > +     if (rc)
> > +             goto err;
> > +
> > +     dev_dbg(host, "%s: register %s\n", dev_name(dev->parent),
> > +             dev_name(dev));
> > +
> > +     return devm_add_action_or_reset(host, unregister_dev, dev);
> > +
> > +err:
> > +     put_device(dev);
> > +     return rc;
> > +}
> > +EXPORT_SYMBOL_GPL(devm_cxl_add_nvdimm);
> > +
> >  /**
> >   * cxl_probe_device_regs() - Detect CXL Device register blocks
> >   * @dev: Host device of the @base mapping
> > @@ -930,6 +1014,8 @@ static int cxl_device_id(struct device *dev)
> >  {
> >       if (dev->type == &cxl_nvdimm_bridge_type)
> >               return CXL_DEVICE_NVDIMM_BRIDGE;
> > +     if (dev->type == &cxl_nvdimm_type)
> > +             return CXL_DEVICE_NVDIMM;
> >       return 0;
> >  }
> >
> > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > index 47fcb7ad5978..3f9a6f7b05db 100644
> > --- a/drivers/cxl/cxl.h
> > +++ b/drivers/cxl/cxl.h
> > @@ -213,6 +213,13 @@ struct cxl_nvdimm_bridge {
> >       enum cxl_nvdimm_brige_state state;
> >  };
> >
> > +struct cxl_mem;
>
> Above looks unrelated as you've not added any uses of cxl_mem

I think it was a refactoring leftover from an abandoned approach.

Gone now.

>
> > +struct cxl_nvdimm {
> > +     struct device dev;
> > +     struct cxl_memdev *cxlmd;
> > +     struct nvdimm *nvdimm;
> > +};
> > +
> >  /**
> >   * struct cxl_port - logical collection of upstream port devices and
> >   *                downstream port devices to construct a CXL memory
> > @@ -299,7 +306,8 @@ int __cxl_driver_register(struct cxl_driver *cxl_drv, struct module *owner,
> >  #define cxl_driver_register(x) __cxl_driver_register(x, THIS_MODULE, KBUILD_MODNAME)
> >  void cxl_driver_unregister(struct cxl_driver *cxl_drv);
> >
> > -#define CXL_DEVICE_NVDIMM_BRIDGE 1
> > +#define CXL_DEVICE_NVDIMM_BRIDGE     1
> > +#define CXL_DEVICE_NVDIMM            2
> >
> >  #define MODULE_ALIAS_CXL(type) MODULE_ALIAS("cxl:t" __stringify(type) "*")
> >  #define CXL_MODALIAS_FMT "cxl:t%d"
> > @@ -307,4 +315,7 @@ void cxl_driver_unregister(struct cxl_driver *cxl_drv);
> >  struct cxl_nvdimm_bridge *to_cxl_nvdimm_bridge(struct device *dev);
> >  struct cxl_nvdimm_bridge *devm_cxl_add_nvdimm_bridge(struct device *host,
> >                                                    struct cxl_port *port);
> > +struct cxl_nvdimm *to_cxl_nvdimm(struct device *dev);
> > +bool is_cxl_nvdimm(struct device *dev);
> > +int devm_cxl_add_nvdimm(struct device *host, struct cxl_memdev *cxlmd);
> >  #endif /* __CXL_H__ */
> > diff --git a/drivers/cxl/mem.h b/drivers/cxl/mem.h
> > index 13868ff7cadf..8f02d02b26b4 100644
> > --- a/drivers/cxl/mem.h
> > +++ b/drivers/cxl/mem.h
> > @@ -2,6 +2,8 @@
> >  /* Copyright(c) 2020-2021 Intel Corporation. */
> >  #ifndef __CXL_MEM_H__
> >  #define __CXL_MEM_H__
> > +#include <linux/cdev.h>
>
> Unrelated to rest of patch.  Good to add this, but not hidden in this
> patch.

Yup.

>
> > +#include "cxl.h"
> >
> >  /* CXL 2.0 8.2.8.5.1.1 Memory Device Status Register */
> >  #define CXLMDEV_STATUS_OFFSET 0x0
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index 5a1705b52278..83e81b44c8f5 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -1313,7 +1313,8 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_mem *cxlm)
> >       return ERR_PTR(rc);
> >  }
> >
> > -static int cxl_mem_add_memdev(struct cxl_mem *cxlm)
> > +static struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
> > +                                           struct cxl_mem *cxlm)
> >  {
> >       struct cxl_memdev *cxlmd;
> >       struct device *dev;
> > @@ -1322,7 +1323,7 @@ static int cxl_mem_add_memdev(struct cxl_mem *cxlm)
> >
> >       cxlmd = cxl_memdev_alloc(cxlm);
> >       if (IS_ERR(cxlmd))
> > -             return PTR_ERR(cxlmd);
> > +             return cxlmd;
> >
> >       dev = &cxlmd->dev;
> >       rc = dev_set_name(dev, "mem%d", cxlmd->id);
> > @@ -1340,8 +1341,10 @@ static int cxl_mem_add_memdev(struct cxl_mem *cxlm)
> >       if (rc)
> >               goto err;
> >
> > -     return devm_add_action_or_reset(dev->parent, cxl_memdev_unregister,
> > -                                     cxlmd);
> > +     rc = devm_add_action_or_reset(host, cxl_memdev_unregister, cxlmd);
> > +     if (rc)
> > +             return ERR_PTR(rc);
> > +     return cxlmd;
> >
> >  err:
> >       /*
> > @@ -1350,7 +1353,7 @@ static int cxl_mem_add_memdev(struct cxl_mem *cxlm)
> >        */
> >       cxl_memdev_shutdown(cxlmd);
> >       put_device(dev);
> > -     return rc;
> > +     return ERR_PTR(rc);
> >  }
> >
> >  static int cxl_xfer_log(struct cxl_mem *cxlm, uuid_t *uuid, u32 size, u8 *out)
> > @@ -1561,6 +1564,7 @@ static int cxl_mem_identify(struct cxl_mem *cxlm)
> >
> >  static int cxl_mem_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  {
> > +     struct cxl_memdev *cxlmd;
> >       struct cxl_mem *cxlm;
> >       int rc;
> >
> > @@ -1588,7 +1592,14 @@ static int cxl_mem_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >       if (rc)
> >               return rc;
> >
> > -     return cxl_mem_add_memdev(cxlm);
> > +     cxlmd = devm_cxl_add_memdev(&pdev->dev, cxlm);
> > +     if (IS_ERR(cxlmd))
> > +             return PTR_ERR(cxlmd);
> > +
> > +     if (range_len(&cxlm->pmem_range) && IS_ENABLED(CONFIG_CXL_PMEM))
> > +             rc = devm_cxl_add_nvdimm(&pdev->dev, cxlmd);
> > +
> > +     return rc;
> >  }
> >
> >  static const struct pci_device_id cxl_mem_pci_tbl[] = {
> > diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
> > index 0067bd734559..1ed19e213157 100644
> > --- a/drivers/cxl/pmem.c
> > +++ b/drivers/cxl/pmem.c
> > @@ -3,7 +3,10 @@
> >  #include <linux/libnvdimm.h>
> >  #include <linux/device.h>
> >  #include <linux/module.h>
> > +#include <linux/ndctl.h>
> > +#include <linux/async.h>
> >  #include <linux/slab.h>
> > +#include "mem.h"
> >  #include "cxl.h"
> >
> >  /*
> > @@ -13,6 +16,64 @@
> >   */
> >  static struct workqueue_struct *cxl_pmem_wq;
> >
> > +static void unregister_nvdimm(void *nvdimm)
> > +{
> > +     nvdimm_delete(nvdimm);
> > +}
> > +
> > +static int match_nvdimm_bridge(struct device *dev, const void *data)
> > +{
> > +     return strcmp(dev_name(dev), "nvdimm-bridge") == 0;
> > +}
> > +
> > +static struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(void)
> > +{
> > +     struct device *dev;
> > +
> > +     dev = bus_find_device(&cxl_bus_type, NULL, NULL, match_nvdimm_bridge);
>
> Hmm. It's a singleton, so you could just stash the pointer somewhere
> appropriate, but I guess this avoids adding extra infrastructure around
> that, so fair enough.

Yeah, it's a singleton only by convention. If you want to have
multiple CXL nvdimm buses the core won't prevent it.

>
> > +     if (!dev)
> > +             return NULL;
> > +     return to_cxl_nvdimm_bridge(dev);
> > +}
> > +
> > +static int cxl_nvdimm_probe(struct device *dev)
> > +{
> > +     struct cxl_nvdimm *cxl_nvd = to_cxl_nvdimm(dev);
> > +     struct cxl_nvdimm_bridge *cxl_nvb;
> > +     unsigned long flags = 0;
> > +     struct nvdimm *nvdimm;
> > +     int rc = -ENXIO;
> > +
> > +     cxl_nvb = cxl_find_nvdimm_bridge();
> > +     if (!cxl_nvb)
> > +             return -ENXIO;
> > +
> > +     device_lock(&cxl_nvb->dev);
> > +     if (!cxl_nvb->nvdimm_bus)
> > +             goto out;
> > +
> > +     set_bit(NDD_LABELING, &flags);
> > +     nvdimm = nvdimm_create(cxl_nvb->nvdimm_bus, cxl_nvd, NULL, flags, 0, 0,
> > +                            NULL);
> > +     if (!nvdimm)
> > +             goto out;
> > +
> > +     dev_set_drvdata(dev, nvdimm);
>
> Why?  No harm done, but I wanted to check this was only used to get the
> nvdimm, but can't find where it's used at all.

Yup, vestigial. I thought I would use it in the release action, but
now it's just passed directly to unregister_nvdimm().

>
> > +
> > +     rc = devm_add_action_or_reset(dev, unregister_nvdimm, nvdimm);
> > +out:
> > +     device_unlock(&cxl_nvb->dev);
> > +     put_device(&cxl_nvb->dev);
> > +
> > +     return rc;
> > +}
> > +
> > +static struct cxl_driver cxl_nvdimm_driver = {
> > +     .name = "cxl_nvdimm",
> > +     .probe = cxl_nvdimm_probe,
> > +     .id = CXL_DEVICE_NVDIMM,
> > +};
> > +
> >  static int cxl_pmem_ctl(struct nvdimm_bus_descriptor *nd_desc,
> >                       struct nvdimm *nvdimm, unsigned int cmd, void *buf,
> >                       unsigned int buf_len, int *cmd_rc)
> > @@ -28,19 +89,31 @@ static void online_nvdimm_bus(struct cxl_nvdimm_bridge *cxl_nvb)
> >               nvdimm_bus_register(&cxl_nvb->dev, &cxl_nvb->nd_desc);
> >  }
> >
> > -static void offline_nvdimm_bus(struct cxl_nvdimm_bridge *cxl_nvb)
> > +static int cxl_nvdimm_release_driver(struct device *dev, void *data)
> >  {
> > -     if (!cxl_nvb->nvdimm_bus)
> > -             return;
> > -     nvdimm_bus_unregister(cxl_nvb->nvdimm_bus);
> > -     cxl_nvb->nvdimm_bus = NULL;
> > +     if (!is_cxl_nvdimm(dev))
> > +             return 0;
> > +     device_release_driver(dev);
> > +     return 0;
> > +}
> > +
> > +static void offline_nvdimm_bus(struct nvdimm_bus *nvdimm_bus)
> > +{
> > +     /*
> > +      * Set the state of cxl_nvdimm devices to unbound / idle before
> > +      * nvdimm_bus_unregister() rips the nvdimm objects out from
> > +      * underneath them.
> > +      */
> > +     bus_for_each_dev(&cxl_bus_type, NULL, NULL, cxl_nvdimm_release_driver);
> > +     nvdimm_bus_unregister(nvdimm_bus);
> >  }
> >
> >  static void cxl_nvb_update_state(struct work_struct *work)
> >  {
> >       struct cxl_nvdimm_bridge *cxl_nvb =
> >               container_of(work, typeof(*cxl_nvb), state_work);
> > -     bool release = false;
> > +     struct nvdimm_bus *nvdimm_bus = NULL;
> > +     bool release = false, rescan = false;
> >
> >       device_lock(&cxl_nvb->dev);
> >       switch (cxl_nvb->state) {
> > @@ -50,11 +123,13 @@ static void cxl_nvb_update_state(struct work_struct *work)
> >                       dev_err(&cxl_nvb->dev,
> >                               "failed to establish nvdimm bus\n");
> >                       release = true;
> > -             }
> > +             } else
> > +                     rescan = true;
> >               break;
> >       case CXL_NVB_OFFLINE:
> >       case CXL_NVB_DEAD:
> > -             offline_nvdimm_bus(cxl_nvb);
> > +             nvdimm_bus = cxl_nvb->nvdimm_bus;
>
> Perhaps rename this to make it clear that it only exists as a means
> to offline it later?  I wondered briefly why you were offlining
> any bus that existed (e.g. in the online case)
>
> nvdimm_bus_to_offline = ...
>

Good point, although I renamed it to "victim_bus", and made
offline_nvdimm_bus() handle the NULL bus case, so now it reads with no
conditional.

offline_nvdimm_bus(victim_bus);


> > +             cxl_nvb->nvdimm_bus = NULL;
> >               break;
> >       default:
> >               break;
> > @@ -63,6 +138,13 @@ static void cxl_nvb_update_state(struct work_struct *work)
> >
> >       if (release)
> >               device_release_driver(&cxl_nvb->dev);
> > +     if (rescan) {
> > +             int rc = bus_rescan_devices(&cxl_bus_type);
> > +
> > +             dev_dbg(&cxl_nvb->dev, "rescan: %d\n", rc);
> > +     }
> > +     if (nvdimm_bus)
> > +             offline_nvdimm_bus(nvdimm_bus);
> >
> >       put_device(&cxl_nvb->dev);
> >  }
> > @@ -113,23 +195,29 @@ static __init int cxl_pmem_init(void)
> >       int rc;
> >
> >       cxl_pmem_wq = alloc_ordered_workqueue("cxl_pmem", 0);
> > -
>
> Shouldn't be here obviously. Move to earlier patch.

Yup.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/5] cxl/core: Add cxl-bus driver infrastructure
  2021-06-11 23:25         ` Dan Williams
@ 2021-06-14 21:40           ` Ben Widawsky
  0 siblings, 0 replies; 18+ messages in thread
From: Ben Widawsky @ 2021-06-14 21:40 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Linux NVDIMM, Schofield, Alison, Vishal L Verma,
	Weiny, Ira, Linux Kernel Mailing List

On 21-06-11 16:25:05, Dan Williams wrote:
> On Fri, Jun 11, 2021 at 12:28 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > On 21-06-11 11:55:39, Dan Williams wrote:
> > > On Fri, Jun 11, 2021 at 10:47 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > >
> > > > On 21-06-10 15:26:03, Dan Williams wrote:
> > > > > Enable devices on the 'cxl' bus to be attached to drivers. The initial
> > > > > user of this functionality is a driver for an 'nvdimm-bridge' device
> > > > > that anchors a libnvdimm hierarchy attached to CXL persistent memory
> > > > > resources. Other device types that will leverage this include:
> > > > >
> > > > > cxl_port: map and use component register functionality (HDM Decoders)
> > > >
> > > > Since I'm looking at this now, perhaps I can open the discussion here. Have you
> > > > thought about how this works yet? Right now I'm thinking there are two "drivers":
> > > > cxl_port: Switches (and ACPI0016)
> > > > cxl_mem: The memory device's HDM decoders
> > > >
> > > > For port, probe() will figure out that the thing is an upstream port, call
> > > > cxl_probe_component_regs and then call devm_cxl_add_port(). I think that's
> > > > straight forward.
> > >
> > > I was expecting cxl_port_driver.probe() comes *after* port discovery.
> > > Think of it like PCI discovery. Some agent does the hardware topology
> > > scan to add devices, in this case devm_cxl_add_port(), and that
> > > triggers cxl_port_driver to load. So the initial enumeration done by
> > > the cxl_acpi driver will populate the first two levels of the port
> > > hierarchy with port objects and populate their component register
> > > physical base addresses. For any other port deeper in the hierarchy I
> > > was expecting that to be scanned after the discovery of a cxl_memdev
> > > that is not attached to the current hierarchy. So, for example imagine
> > > a config like:
> > >
> > > Platform --> Host Bridge --> Switch --> Endpoint
> > >
> > > ...where in sysfs that's modeled as:
> > >
> > > root0 --> port1 --> port2 --> port3
> > >
> > > Where port3 is assuming that the CXL core models the device's
> > > connection to the topology as yet another cxl_port. At the beginning
> > > of time after cxl_acpi has loaded but before cxl_pci has discovered
> > > the endpoint the topology is:
> > >
> > > root0 --> port1
> > >
> > > Upon the detection of the endpoint the CXL core can assume that all
> > > intermediary switches between the root and this device have been
> > > registered as PCI devices. So, it follows that endpoint device arrival
> > > triggers "cxl_bus_rescan()" that goes and enumerates all the CXL
> > > resources in the topology to produce:
> > >
> > > root0 --> port1 --> port2 --> port3
> > >
> >
> > Ah, I had written about scan/rescan in an earlier version of my email but
> > dropped it. I was actually going to suggest it being a sysfs attr, but I'm fine
> > with it being implicit so long as...
> >
> > How do we assert that cxl_pci doesn't run before cxl_acpi has done anything?
> 
> I don't think we need to, or it's broken if the driver load order
> matters. The nvdimm enabling code is an example of how to handle this.
> The cxl_nvdimm object can be registered before the cxl_nvdimm_bridge,
> or after, does not matter. If the cxl_nvdimm comes first it will
> trigger the cxl_nvdimm_driver to load. The cxl_nvdimm_driver.probe()
> routine finds no bridge present and probe() returns with a failure.
> When the bridge arrives it does a rescan  of the cxl_bus_type device
> list and if it finds a cxl_nvdimm it re-triggers
> cxl_nvdimm_driver.probe(). This time through cxl_nvdimm_driver.probe()
> finds the bridge and registers the real nvdimm on the nvdimm_bus.
> 
> > I
> > like the idea that the endpoint device can simply ask cxl_acpi to rescan, I just
> > don't see how it works. I suppose we can queue up the requests to rescan in
> > cxl_acpi if the ordering can't be guaranteed.
> 
> I think this means that the devm_cxl_add_port() would be triggered by
> cxl_memdev_driver.probe() if and only if the parent pci_device of the
> CXL endpoint is listed as a dport. If the cxl_memdev is registered
> first the search it will search for the CXL root port on the
> cxl_bus_type device list. If that fails then cxl_memdev_driver.probe()
> fails. If that succeeds it asks the root to scan to the CXL endpoint
> parent pci_device and return the confirmation that it is registered as
> a dport. If that fails then the device is plugged into a pure PCIe
> slot.
> 
> When cxl_acpi loads it retriggers all cxl_memdev_driver.probe() to
> reconsider all cxl_memdev instances that failed to probe previously.
> 
> >
> > > > For the memory device we've already probed the thing via class code so there is
> > > > no need to use this driver registration, however, I think it would be nice to do
> > > > so. Is there a clean way to do that?
> > >
> > > The PCI device associated with the endpoint is already probed, but the
> > > cxl_memdev itself can have a driver on the CXL bus. So I think the
> > > cxl_memdev driver should try to register a cxl_port after telling
> > > cxl_acpi to rescan. If a check like "is_cxl_dport(pdev->dev.parent)"
> > > for the endpoint returns false it means that the cxl_bus_rescan()
> > > failed to enumerate the CXL topology to this endpoint and this
> > > endpoint is limited to only CXL.io operation.
> >
> > What is going to invoke the memdev driver's probe? That is where we're talking
> > about putting that is_cxl_dport(...) right? That is the part that tripped me up
> > and inspired the original email FWIW.
> 
> I *think* I worked that out above, but yes please do poke at it to see
> if it holds up.

I think it works. I have some concerns around synchronization of memdev probing
and cxl_acpi enumerating, but I believe it's workable.

Thanks for the thought.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2021-06-14 21:40 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-10 22:25 [PATCH 0/5] cxl/pmem: Add core infrastructure for PMEM support Dan Williams
2021-06-10 22:26 ` [PATCH 1/5] cxl/core: Add cxl-bus driver infrastructure Dan Williams
2021-06-11 17:47   ` Ben Widawsky
2021-06-11 18:55     ` Dan Williams
2021-06-11 19:28       ` Ben Widawsky
2021-06-11 23:25         ` Dan Williams
2021-06-14 21:40           ` Ben Widawsky
2021-06-10 22:26 ` [PATCH 2/5] cxl/pmem: Add initial infrastructure for pmem support Dan Williams
2021-06-11 11:39   ` Jonathan Cameron
2021-06-12  0:07     ` Dan Williams
2021-06-10 22:26 ` [PATCH 3/5] libnvdimm: Export nvdimm shutdown helper, nvdimm_delete() Dan Williams
2021-06-10 22:26 ` [PATCH 4/5] libnvdimm: Drop unused device power management support Dan Williams
2021-06-11 11:47   ` Jonathan Cameron
2021-06-12  0:16     ` Dan Williams
2021-06-10 22:26 ` [PATCH 5/5] cxl/pmem: Register 'pmem' / cxl_nvdimm devices Dan Williams
2021-06-11 12:57   ` Jonathan Cameron
2021-06-12  0:34     ` Dan Williams
2021-06-11 12:59 ` [PATCH 0/5] cxl/pmem: Add core infrastructure for PMEM support Jonathan Cameron

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).