* [PATCH 00/23] Add drivers for CXL ports and mem devices
@ 2021-11-20  0:02 Ben Widawsky
  2021-11-20  0:02 ` [PATCH 01/23] cxl: Rename CXL_MEM to CXL_PCI Ben Widawsky
                   ` (22 more replies)
  0 siblings, 23 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-20  0:02 UTC
  To: linux-cxl, linux-pci
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

This is the first set of patches from the RFC [1] for region creation. The
patches enable port enumeration for endpoint devices, and enumeration of decoder
resources for ports. In the RFC [1], I felt it necessary to post the consumer of
this work, the region driver, so that it was clear why these patches were
necessary. Because the region patches are less baked, and received no
review in the RFC, they are excluded here. If you find yourself unclear about
why these patches are interesting, go look at the RFC [1].

Each patch contains the list of changes from RFCv2. IMHO the most important
high-level changes are:
1. Rework cxl_pci to fix mailbox handling and allow waiting for media ready.
2. DVSEC range information is passed from cxl_pci and checked.

linux-pci is on the Cc since CXL lives in a parallel universe to PCI and some
PCI mechanisms are reused here. Feedback from experts in that domain is very
welcome.

What was requested and not changed:
1. Dropping global list of root ports.
2. Improving find_parent_cxl_port()

---

Summary
=======

Two new drivers are introduced to support Compute Express Link 2.0 [2] HDM
decoder enumeration. While the existing cxl_acpi and cxl_pci drivers already
create some of the necessary devices, they did not do full enumeration of
decoders, and they did not do port enumeration for switches. Additionally, CXL
2.0 Root Port component registers are now handled as well.

cxl_port
========

The cxl_port driver is implemented within the cxl_port module. While loading
this module is optional, the other new driver (cxl_mem) and cxl_acpi depend on
it for complete enumeration. The port driver is responsible for all activities
around HDM decoder enumeration and programming. The concept of a port,
introduced earlier, is an abstraction over CXL components with an upstream
port: every host bridge, switch, and endpoint.

cxl_mem
=======

The cxl_mem driver's main job is to walk up the hierarchy to determine whether
the device is CXL.mem routed, meaning that all components above it in the
hierarchy are participating in the CXL.mem protocol. It is implemented within
the cxl_mem module. As the host bridge ports are added by a platform specific
driver, such as cxl_acpi, the scope of the mem driver can be reduced to scan for
switches and ask cxl_core to work on enumerating them. With this done, the
determination as to whether a device is CXL.mem routed can be done simply by
checking if the struct device has a driver bound to it.
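
The routing check described above can be pictured with a small userspace
model. This is only a sketch under stated assumptions, not the driver's code:
struct toy_dev and toy_is_mem_routed() are hypothetical names, and the real
implementation operates on struct device inside the kernel. The sketch also
assumes that "routed" means every ancestor has a driver bound.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for struct device: a parent link plus a flag that
 * records whether a driver is bound.
 */
struct toy_dev {
	const char *name;
	struct toy_dev *parent;	/* NULL at the top of the hierarchy */
	bool driver_bound;
};

/*
 * Walk from the endpoint's parent up to the root. The endpoint is treated
 * as CXL.mem routed only if every ancestor has a driver bound (assumption
 * made for this sketch).
 */
static bool toy_is_mem_routed(const struct toy_dev *endpoint)
{
	const struct toy_dev *d;

	for (d = endpoint->parent; d; d = d->parent)
		if (!d->driver_bound)
			return false;

	return true;
}
```

With a chain such as root0 -> port1 -> mem0, the helper returns true while
both root0 and port1 are bound, and false as soon as either one is not.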

Results
=======

Running these patches should yield new devices and new drivers under
/sys/bus/cxl/devices and /sys/bus/cxl/drivers. For example, in a standard QEMU
run, using run_qemu [3]:

/sys/bus/cxl/devices (new):
# The host bridge CHBS decoder
lrwxrwxrwx 1 root root 0 Nov 19 15:23 decoder1.0 -> ../../../devices/platform/ACPI0017:00/root0/port1/decoder1.0
# mem0's decoder
lrwxrwxrwx 1 root root 0 Nov 19 15:23 decoder2.0 -> ../../../devices/platform/ACPI0017:00/root0/port1/port2/decoder2.0
# mem1's decoder
lrwxrwxrwx 1 root root 0 Nov 19 15:23 decoder3.0 -> ../../../devices/platform/ACPI0017:00/root0/port1/port3/decoder3.0
# mem0's port
lrwxrwxrwx 1 root root 0 Nov 19 15:23 port2 -> ../../../devices/platform/ACPI0017:00/root0/port1/port2
# mem1's port
lrwxrwxrwx 1 root root 0 Nov 19 15:23 port3 -> ../../../devices/platform/ACPI0017:00/root0/port1/port3

/sys/bus/cxl/drivers:
drwxr-xr-x 2 root root 0 Nov 19 15:23 cxl_mem
drwxr-xr-x 2 root root 0 Nov 19 15:23 cxl_port

---

[1]: https://lore.kernel.org/linux-cxl/20211022183709.1199701-1-ben.widawsky@intel.com/T/#t
[2]: https://www.computeexpresslink.org/download-the-specification
[3]: https://github.com/pmem/run_qemu/

Ben Widawsky (23):
  cxl: Rename CXL_MEM to CXL_PCI
  cxl: Flesh out register names
  cxl/pci: Extract device status check
  cxl/pci: Implement Interface Ready Timeout
  cxl/pci: Don't poll doorbell for mailbox access
  cxl/pci: Don't check media status for mbox access
  cxl/pci: Add new DVSEC definitions
  cxl/acpi: Map component registers for Root Ports
  cxl: Introduce module_cxl_driver
  cxl/core: Convert decoder range to resource
  cxl/core: Document and tighten up decoder APIs
  cxl: Introduce endpoint decoders
  cxl/core: Move target population locking to caller
  cxl: Introduce topology host registration
  cxl/core: Store global list of root ports
  cxl/pci: Cache device DVSEC offset
  cxl: Cache and pass DVSEC ranges
  cxl/pci: Implement wait for media active
  cxl/pci: Store component register base in cxlds
  cxl/port: Introduce a port driver
  cxl: Unify port enumeration for decoders
  cxl/mem: Introduce cxl_mem driver
  cxl/mem: Disable switch hierarchies for now

 .../driver-api/cxl/memory-devices.rst         |  14 +
 drivers/cxl/Kconfig                           |  54 ++-
 drivers/cxl/Makefile                          |   6 +-
 drivers/cxl/acpi.c                            | 103 ++--
 drivers/cxl/core/Makefile                     |   1 +
 drivers/cxl/core/bus.c                        | 439 ++++++++++++++++--
 drivers/cxl/core/core.h                       |   3 +
 drivers/cxl/core/memdev.c                     |   2 +-
 drivers/cxl/core/pci.c                        | 119 +++++
 drivers/cxl/core/regs.c                       |  60 ++-
 drivers/cxl/cxl.h                             |  73 ++-
 drivers/cxl/cxlmem.h                          |  27 ++
 drivers/cxl/mem.c                             | 197 ++++++++
 drivers/cxl/pci.c                             | 341 ++++++++++----
 drivers/cxl/pci.h                             |  53 ++-
 drivers/cxl/port.c                            | 383 +++++++++++++++
 tools/testing/cxl/Kbuild                      |   1 +
 tools/testing/cxl/mock_acpi.c                 |   4 +-
 18 files changed, 1666 insertions(+), 214 deletions(-)
 create mode 100644 drivers/cxl/core/pci.c
 create mode 100644 drivers/cxl/mem.c
 create mode 100644 drivers/cxl/port.c


base-commit: 53989fad1286e652ea3655ae3367ba698da8d2ff
-- 
2.34.0



* [PATCH 01/23] cxl: Rename CXL_MEM to CXL_PCI
  2021-11-20  0:02 [PATCH 00/23] Add drivers for CXL ports and mem devices Ben Widawsky
@ 2021-11-20  0:02 ` Ben Widawsky
  2021-11-22 14:47   ` Jonathan Cameron
  2021-11-24  4:15   ` Dan Williams
  2021-11-20  0:02 ` [PATCH 02/23] cxl: Flesh out register names Ben Widawsky
                   ` (21 subsequent siblings)
  22 siblings, 2 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-20  0:02 UTC
  To: linux-cxl, linux-pci
  Cc: Ben Widawsky, Dan Williams, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

The cxl_mem module was renamed cxl_pci in commit 21e9f76733a8 ("cxl:
Rename mem to pci"). In preparation for adding an ancillary driver for
cxl_memdev devices (registered on the cxl bus by cxl_pci), go ahead and
rename CONFIG_CXL_MEM to CONFIG_CXL_PCI. Free up the CXL_MEM name for
that new driver to manage CXL.mem endpoint operations.

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

---
Changes since RFCv2:
- Reword commit message (Dan)
- Reword Kconfig description (Dan)
---
 drivers/cxl/Kconfig  | 23 ++++++++++++-----------
 drivers/cxl/Makefile |  2 +-
 2 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index 67c91378f2dd..ef05e96f8f97 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -13,25 +13,26 @@ menuconfig CXL_BUS
 
 if CXL_BUS
 
-config CXL_MEM
-	tristate "CXL.mem: Memory Devices"
+config CXL_PCI
+	tristate "PCI manageability"
 	default CXL_BUS
 	help
-	  The CXL.mem protocol allows a device to act as a provider of
-	  "System RAM" and/or "Persistent Memory" that is fully coherent
-	  as if the memory was attached to the typical CPU memory
-	  controller.
+	  The CXL specification defines a "CXL memory device" sub-class in the
+	  PCI "memory controller" base class of devices. Device's identified by
+	  this class code provide support for volatile and / or persistent
+	  memory to be mapped into the system address map (Host-managed Device
+	  Memory (HDM)).
 
-	  Say 'y/m' to enable a driver that will attach to CXL.mem devices for
-	  configuration and management primarily via the mailbox interface. See
-	  Chapter 2.3 Type 3 CXL Device in the CXL 2.0 specification for more
-	  details.
+	  Say 'y/m' to enable a driver that will attach to CXL memory expander
+	  devices enumerated by the memory device class code for configuration
+	  and management primarily via the mailbox interface. See Chapter 2.3
+	  Type 3 CXL Device in the CXL 2.0 specification for more details.
 
 	  If unsure say 'm'.
 
 config CXL_MEM_RAW_COMMANDS
 	bool "RAW Command Interface for Memory Devices"
-	depends on CXL_MEM
+	depends on CXL_PCI
 	help
 	  Enable CXL RAW command interface.
 
diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
index d1aaabc940f3..cf07ae6cea17 100644
--- a/drivers/cxl/Makefile
+++ b/drivers/cxl/Makefile
@@ -1,6 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_CXL_BUS) += core/
-obj-$(CONFIG_CXL_MEM) += cxl_pci.o
+obj-$(CONFIG_CXL_PCI) += cxl_pci.o
 obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
 obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
 
-- 
2.34.0



* [PATCH 02/23] cxl: Flesh out register names
  2021-11-20  0:02 [PATCH 00/23] Add drivers for CXL ports and mem devices Ben Widawsky
  2021-11-20  0:02 ` [PATCH 01/23] cxl: Rename CXL_MEM to CXL_PCI Ben Widawsky
@ 2021-11-20  0:02 ` Ben Widawsky
  2021-11-22 14:49   ` Jonathan Cameron
  2021-11-24  4:24   ` Dan Williams
  2021-11-20  0:02 ` [PATCH 03/23] cxl/pci: Extract device status check Ben Widawsky
                   ` (20 subsequent siblings)
  22 siblings, 2 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-20  0:02 UTC
  To: linux-cxl, linux-pci
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Get a better naming scheme in place for upcoming additions.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
Changes since RFCv2:
Use some abbreviations (Jonathan)
Prefix everything with CXL (Jonathan)
Remove new additions (Dan)

Original discussion motivating this occurred here:
https://lore.kernel.org/linux-pci/20210913190131.xiiszmno46qie7v5@intel.com/
---
 drivers/cxl/pci.c | 14 +++++++-------
 drivers/cxl/pci.h | 19 ++++++++++---------
 2 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 8dc91fd3396a..a6ea9811a05b 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -403,10 +403,10 @@ static int cxl_map_regs(struct cxl_dev_state *cxlds, struct cxl_register_map *ma
 static void cxl_decode_regblock(u32 reg_lo, u32 reg_hi,
 				struct cxl_register_map *map)
 {
-	map->block_offset =
-		((u64)reg_hi << 32) | (reg_lo & CXL_REGLOC_ADDR_MASK);
-	map->barno = FIELD_GET(CXL_REGLOC_BIR_MASK, reg_lo);
-	map->reg_type = FIELD_GET(CXL_REGLOC_RBI_MASK, reg_lo);
+	map->block_offset = ((u64)reg_hi << 32) |
+			    (reg_lo & CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK);
+	map->barno = FIELD_GET(CXL_DVSEC_REG_LOCATOR_BIR_MASK, reg_lo);
+	map->reg_type = FIELD_GET(CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK, reg_lo);
 }
 
 /**
@@ -427,15 +427,15 @@ static int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
 	int regloc, i;
 
 	regloc = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
-					   PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID);
+					   CXL_DVSEC_REG_LOCATOR);
 	if (!regloc)
 		return -ENXIO;
 
 	pci_read_config_dword(pdev, regloc + PCI_DVSEC_HEADER1, &regloc_size);
 	regloc_size = FIELD_GET(PCI_DVSEC_HEADER1_LENGTH_MASK, regloc_size);
 
-	regloc += PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET;
-	regblocks = (regloc_size - PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET) / 8;
+	regloc += CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET;
+	regblocks = (regloc_size - CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET) / 8;
 
 	for (i = 0; i < regblocks; i++, regloc += 8) {
 		u32 reg_lo, reg_hi;
diff --git a/drivers/cxl/pci.h b/drivers/cxl/pci.h
index 7d3e4bf06b45..29b8eaef3a0a 100644
--- a/drivers/cxl/pci.h
+++ b/drivers/cxl/pci.h
@@ -7,17 +7,21 @@
 
 /*
  * See section 8.1 Configuration Space Registers in the CXL 2.0
- * Specification
+ * Specification. Names are taken straight from the specification with "CXL" and
+ * "DVSEC" redundancies removed. When obvious, abbreviations may be used.
  */
 #define PCI_DVSEC_HEADER1_LENGTH_MASK	GENMASK(31, 20)
 #define PCI_DVSEC_VENDOR_ID_CXL		0x1E98
-#define PCI_DVSEC_ID_CXL		0x0
 
-#define PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID	0x8
-#define PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET	0xC
+/* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
+#define CXL_DVSEC_PCIE_DEVICE					0
 
-/* BAR Indicator Register (BIR) */
-#define CXL_REGLOC_BIR_MASK GENMASK(2, 0)
+/* CXL 2.0 8.1.9: Register Locator DVSEC */
+#define CXL_DVSEC_REG_LOCATOR					8
+#define   CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET			0xC
+#define     CXL_DVSEC_REG_LOCATOR_BIR_MASK			GENMASK(2, 0)
+#define	    CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK			GENMASK(15, 8)
+#define     CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK		GENMASK(31, 16)
 
 /* Register Block Identifier (RBI) */
 enum cxl_regloc_type {
@@ -28,7 +32,4 @@ enum cxl_regloc_type {
 	CXL_REGLOC_RBI_TYPES
 };
 
-#define CXL_REGLOC_RBI_MASK GENMASK(15, 8)
-#define CXL_REGLOC_ADDR_MASK GENMASK(31, 16)
-
 #endif /* __CXL_PCI_H__ */
-- 
2.34.0



* [PATCH 03/23] cxl/pci: Extract device status check
  2021-11-20  0:02 [PATCH 00/23] Add drivers for CXL ports and mem devices Ben Widawsky
  2021-11-20  0:02 ` [PATCH 01/23] cxl: Rename CXL_MEM to CXL_PCI Ben Widawsky
  2021-11-20  0:02 ` [PATCH 02/23] cxl: Flesh out register names Ben Widawsky
@ 2021-11-20  0:02 ` Ben Widawsky
  2021-11-22 15:03   ` Jonathan Cameron
  2021-11-24 19:30   ` Dan Williams
  2021-11-20  0:02 ` [PATCH 04/23] cxl/pci: Implement Interface Ready Timeout Ben Widawsky
                   ` (19 subsequent siblings)
  22 siblings, 2 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-20  0:02 UTC
  To: linux-cxl, linux-pci
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

The Memory Device Status register is inspected in the same way for at
least two flows in the CXL Type 3 Memory Device Software Guide
(Revision: 1.0): 2.13.9 Device discovery and mailbox ready sequence,
and 2.13.10 Media ready sequence. Extract this common functionality for
use by both.
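
As a standalone illustration of the shared check, here is a userspace sketch
that mirrors the error mapping used in the patch below (fatal maps to -EIO,
firmware halt to -EBUSY). The TOY_* names and the bit positions are
assumptions made for the example; the driver takes the real definitions from
its own headers.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed bit positions in the Memory Device Status register. */
#define TOY_DEV_FATAL	(UINT64_C(1) << 0)
#define TOY_FW_HALT	(UINT64_C(1) << 1)

/*
 * Classify a raw status value. A fatal error is reported in preference to
 * a firmware halt when both bits are set.
 */
static int toy_check_device_status(uint64_t md_status)
{
	if (md_status & TOY_DEV_FATAL)
		return -EIO;

	if (md_status & TOY_FW_HALT)
		return -EBUSY;

	return 0;
}
```

Taking the status value as a parameter, rather than reading the register
inside the helper, is a choice made here so the logic can be exercised
without hardware.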

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
This patch did not exist in RFCv2
---
 drivers/cxl/pci.c | 33 +++++++++++++++++++++++++--------
 1 file changed, 25 insertions(+), 8 deletions(-)

diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index a6ea9811a05b..6c8d09fb3a17 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -182,6 +182,27 @@ static int __cxl_pci_mbox_send_cmd(struct cxl_dev_state *cxlds,
 	return 0;
 }
 
+/*
+ * Implements roughly the bottom half of Figure 42 of the CXL Type 3 Memory
+ * Device Software Guide
+ */
+static int check_device_status(struct cxl_dev_state *cxlds)
+{
+	const u64 md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
+
+	if (md_status & CXLMDEV_DEV_FATAL) {
+		dev_err(cxlds->dev, "Fatal: replace device\n");
+		return -EIO;
+	}
+
+	if (md_status & CXLMDEV_FW_HALT) {
+		dev_err(cxlds->dev, "FWHalt: reset or replace device\n");
+		return -EBUSY;
+	}
+
+	return 0;
+}
+
 /**
  * cxl_pci_mbox_get() - Acquire exclusive access to the mailbox.
  * @cxlds: The device state to gain access to.
@@ -231,17 +252,13 @@ static int cxl_pci_mbox_get(struct cxl_dev_state *cxlds)
 	 * Hardware shouldn't allow a ready status but also have failure bits
 	 * set. Spit out an error, this should be a bug report
 	 */
-	rc = -EFAULT;
-	if (md_status & CXLMDEV_DEV_FATAL) {
-		dev_err(dev, "mbox: reported ready, but fatal\n");
+	rc = check_device_status(cxlds);
+	if (rc)
 		goto out;
-	}
-	if (md_status & CXLMDEV_FW_HALT) {
-		dev_err(dev, "mbox: reported ready, but halted\n");
-		goto out;
-	}
+
 	if (CXLMDEV_RESET_NEEDED(md_status)) {
 		dev_err(dev, "mbox: reported ready, but reset needed\n");
+		rc = -EFAULT;
 		goto out;
 	}
 
-- 
2.34.0



* [PATCH 04/23] cxl/pci: Implement Interface Ready Timeout
  2021-11-20  0:02 [PATCH 00/23] Add drivers for CXL ports and mem devices Ben Widawsky
                   ` (2 preceding siblings ...)
  2021-11-20  0:02 ` [PATCH 03/23] cxl/pci: Extract device status check Ben Widawsky
@ 2021-11-20  0:02 ` Ben Widawsky
  2021-11-22 15:02   ` Jonathan Cameron
  2021-11-20  0:02 ` [PATCH 05/23] cxl/pci: Don't poll doorbell for mailbox access Ben Widawsky
                   ` (18 subsequent siblings)
  22 siblings, 1 reply; 133+ messages in thread
From: Ben Widawsky @ 2021-11-20  0:02 UTC
  To: linux-cxl, linux-pci
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

The original driver implementation used the doorbell timeout for the
Mailbox Interface Ready bit to piggyback off of, since the latter
doesn't have a defined timeout. This functionality, introduced in
8adaf747c9f0 ("cxl/mem: Find device capabilities"), can now be improved
since a timeout has been defined with an ECN to the 2.0 spec.

While devices implemented prior to the ECN could have an arbitrarily
long wait and still be within spec, the max ECN value (256s) is chosen
as the default for all devices. All vendors in the consortium agreed to
this amount and so it is reasonable to assume no devices made will
exceed this amount.
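
The bounded wait can be sketched in userspace as follows. The sketch counts
polls instead of using jiffies so that it runs instantly; at the patch's
100 ms poll interval, a 256 s budget is roughly 2560 polls. All toy_* names,
the simulated device, and the ready-bit position are assumptions made for
illustration only.

```c
#include <errno.h>
#include <stdint.h>

#define TOY_MBOX_IF_READY	(UINT64_C(1) << 4)	/* assumed bit position */

typedef uint64_t (*toy_read_status_fn)(void *ctx);

/*
 * Poll the status source until the ready bit is seen or the poll budget is
 * exhausted. Returns 0 when ready, -EIO on timeout.
 */
static int toy_wait_mbox_ready(toy_read_status_fn read_status, void *ctx,
			       unsigned int max_polls)
{
	unsigned int i;

	for (i = 0; i < max_polls; i++) {
		if (read_status(ctx) & TOY_MBOX_IF_READY)
			return 0;
		/* A real caller would sleep between reads here. */
	}

	return -EIO;
}

/* Simulated device: reports ready once reads_until_ready reads have occurred. */
struct toy_sim {
	unsigned int reads;
	unsigned int reads_until_ready;
};

static uint64_t toy_sim_read(void *ctx)
{
	struct toy_sim *sim = ctx;

	return ++sim->reads >= sim->reads_until_ready ? TOY_MBOX_IF_READY : 0;
}
```

A simulated device that becomes ready on its third read makes the helper
return 0 after three reads; one that never becomes ready within the budget
yields -EIO once all polls are used.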

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
This patch did not exist in RFCv2
---
 drivers/cxl/pci.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 6c8d09fb3a17..2cef9fec8599 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -2,6 +2,7 @@
 /* Copyright(c) 2020 Intel Corporation. All rights reserved. */
 #include <linux/io-64-nonatomic-lo-hi.h>
 #include <linux/module.h>
+#include <linux/delay.h>
 #include <linux/sizes.h>
 #include <linux/mutex.h>
 #include <linux/list.h>
@@ -298,6 +299,34 @@ static int cxl_pci_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *c
 static int cxl_pci_setup_mailbox(struct cxl_dev_state *cxlds)
 {
 	const int cap = readl(cxlds->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
+	unsigned long timeout;
+	u64 md_status;
+	int rc;
+
+	/*
+	 * CXL 2.0 ECN "Add Mailbox Ready Time" defines a capability field to
+	 * dictate how long to wait for the mailbox to become ready. For
+	 * simplicity, and to handle devices that might have been implemented
+	 * prior to the ECN, wait the max amount of time no matter what the
+	 * device says.
+	 */
+	timeout = jiffies + 256 * HZ;
+
+	rc = check_device_status(cxlds);
+	if (rc)
+		return rc;
+
+	do {
+		md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
+		if (md_status & CXLMDEV_MBOX_IF_READY)
+			break;
+		if (msleep_interruptible(100))
+			break;
+	} while (!time_after(jiffies, timeout));
+
+	/* It's assumed that once the interface is ready, it will remain ready. */
+	if (!(md_status & CXLMDEV_MBOX_IF_READY))
+		return -EIO;
 
 	cxlds->mbox_send = cxl_pci_mbox_send;
 	cxlds->payload_size =
-- 
2.34.0



* [PATCH 05/23] cxl/pci: Don't poll doorbell for mailbox access
  2021-11-20  0:02 [PATCH 00/23] Add drivers for CXL ports and mem devices Ben Widawsky
                   ` (3 preceding siblings ...)
  2021-11-20  0:02 ` [PATCH 04/23] cxl/pci: Implement Interface Ready Timeout Ben Widawsky
@ 2021-11-20  0:02 ` Ben Widawsky
  2021-11-22 15:11   ` Jonathan Cameron
  2021-11-24 21:55   ` Dan Williams
  2021-11-20  0:02 ` [PATCH 06/23] cxl/pci: Don't check media status for mbox access Ben Widawsky
                   ` (17 subsequent siblings)
  22 siblings, 2 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-20  0:02 UTC
  To: linux-cxl, linux-pci
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

The expectation is that the mailbox interface ready bit is the first
step in access through the mailbox interface. Therefore, waiting for the
doorbell busy bit to be clear would imply that the mailbox interface is
ready. The original driver implementation used the doorbell timeout for
the Mailbox Interface Ready bit to piggyback off of, since the latter
doesn't have a defined timeout (introduced in 8adaf747c9f0 ("cxl/mem:
Find device capabilities"), a timeout has since been defined with an ECN
to the 2.0 spec). With the current driver waiting for mailbox interface
ready as a part of probe() it's no longer necessary to use the
piggyback.

With the piggybacking no longer necessary it doesn't make sense to check
doorbell status when acquiring the mailbox. It will be checked during
the normal mailbox exchange protocol.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
This patch did not exist in RFCv2
---
 drivers/cxl/pci.c | 25 ++++++-------------------
 1 file changed, 6 insertions(+), 19 deletions(-)

diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 2cef9fec8599..869b4fc18e27 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -221,27 +221,14 @@ static int cxl_pci_mbox_get(struct cxl_dev_state *cxlds)
 
 	/*
 	 * XXX: There is some amount of ambiguity in the 2.0 version of the spec
-	 * around the mailbox interface ready (8.2.8.5.1.1).  The purpose of the
+	 * around the mailbox interface ready (8.2.8.5.1.1). The purpose of the
 	 * bit is to allow firmware running on the device to notify the driver
-	 * that it's ready to receive commands. It is unclear if the bit needs
-	 * to be read for each transaction mailbox, ie. the firmware can switch
-	 * it on and off as needed. Second, there is no defined timeout for
-	 * mailbox ready, like there is for the doorbell interface.
-	 *
-	 * Assumptions:
-	 * 1. The firmware might toggle the Mailbox Interface Ready bit, check
-	 *    it for every command.
-	 *
-	 * 2. If the doorbell is clear, the firmware should have first set the
-	 *    Mailbox Interface Ready bit. Therefore, waiting for the doorbell
-	 *    to be ready is sufficient.
+	 * that it's ready to receive commands. The spec does not clearly define
+	 * under what conditions the bit may get set or cleared. As of the 2.0
+	 * base specification there was no defined timeout for mailbox ready,
+	 * like there is for the doorbell interface. This was fixed with an ECN,
+	 * but it's possible early devices implemented this before the ECN.
 	 */
-	rc = cxl_pci_mbox_wait_for_doorbell(cxlds);
-	if (rc) {
-		dev_warn(dev, "Mailbox interface not ready\n");
-		goto out;
-	}
-
 	md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
 	if (!(md_status & CXLMDEV_MBOX_IF_READY && CXLMDEV_READY(md_status))) {
 		dev_err(dev, "mbox: reported doorbell ready, but not mbox ready\n");
-- 
2.34.0



* [PATCH 06/23] cxl/pci: Don't check media status for mbox access
  2021-11-20  0:02 [PATCH 00/23] Add drivers for CXL ports and mem devices Ben Widawsky
                   ` (4 preceding siblings ...)
  2021-11-20  0:02 ` [PATCH 05/23] cxl/pci: Don't poll doorbell for mailbox access Ben Widawsky
@ 2021-11-20  0:02 ` Ben Widawsky
  2021-11-22 15:19   ` Jonathan Cameron
  2021-11-24 21:58   ` Dan Williams
  2021-11-20  0:02 ` [PATCH 07/23] cxl/pci: Add new DVSEC definitions Ben Widawsky
                   ` (16 subsequent siblings)
  22 siblings, 2 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-20  0:02 UTC
  To: linux-cxl, linux-pci
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Media status is necessary for using HDM contained in a CXL device but is
not needed for mailbox accesses. Therefore remove this check. It will be
necessary to have this check (in a different place) when enabling HDM.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
This patch did not exist in RFCv2
---
 drivers/cxl/pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 869b4fc18e27..711bf4514480 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -230,7 +230,7 @@ static int cxl_pci_mbox_get(struct cxl_dev_state *cxlds)
 	 * but it's possible early devices implemented this before the ECN.
 	 */
 	md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
-	if (!(md_status & CXLMDEV_MBOX_IF_READY && CXLMDEV_READY(md_status))) {
+	if (!(md_status & CXLMDEV_MBOX_IF_READY)) {
 		dev_err(dev, "mbox: reported doorbell ready, but not mbox ready\n");
 		rc = -EBUSY;
 		goto out;
-- 
2.34.0



* [PATCH 07/23] cxl/pci: Add new DVSEC definitions
  2021-11-20  0:02 [PATCH 00/23] Add drivers for CXL ports and mem devices Ben Widawsky
                   ` (5 preceding siblings ...)
  2021-11-20  0:02 ` [PATCH 06/23] cxl/pci: Don't check media status for mbox access Ben Widawsky
@ 2021-11-20  0:02 ` Ben Widawsky
  2021-11-22 15:22   ` Jonathan Cameron
  2021-11-20  0:02 ` [PATCH 08/23] cxl/acpi: Map component registers for Root Ports Ben Widawsky
                   ` (15 subsequent siblings)
  22 siblings, 1 reply; 133+ messages in thread
From: Ben Widawsky @ 2021-11-20  0:02 UTC
  To: linux-cxl, linux-pci
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

While the new definitions are not yet necessary, they are introduced at
this point to help solidify the newly minted schema for naming
registers.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

---
This was split from
https://lore.kernel.org/linux-cxl/20211103170552.55ae5u7uvurkync6@intel.com/T/#u
per Dan's request.
---
 drivers/cxl/pci.h | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/cxl/pci.h b/drivers/cxl/pci.h
index 29b8eaef3a0a..8ae2b4adc59d 100644
--- a/drivers/cxl/pci.h
+++ b/drivers/cxl/pci.h
@@ -16,6 +16,21 @@
 /* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
 #define CXL_DVSEC_PCIE_DEVICE					0
 
+/* CXL 2.0 8.1.4: Non-CXL Function Map DVSEC */
+#define CXL_DVSEC_FUNCTION_MAP					2
+
+/* CXL 2.0 8.1.5: CXL 2.0 Extensions DVSEC for Ports */
+#define CXL_DVSEC_PORT_EXTENSIONS				3
+
+/* CXL 2.0 8.1.6: GPF DVSEC for CXL Port */
+#define CXL_DVSEC_PORT_GPF					4
+
+/* CXL 2.0 8.1.7: GPF DVSEC for CXL Device */
+#define CXL_DVSEC_DEVICE_GPF					5
+
+/* CXL 2.0 8.1.8: PCIe DVSEC for Flex Bus Port */
+#define CXL_DVSEC_PCIE_FLEXBUS_PORT				7
+
 /* CXL 2.0 8.1.9: Register Locator DVSEC */
 #define CXL_DVSEC_REG_LOCATOR					8
 #define   CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET			0xC
-- 
2.34.0



* [PATCH 08/23] cxl/acpi: Map component registers for Root Ports
  2021-11-20  0:02 [PATCH 00/23] Add drivers for CXL ports and mem devices Ben Widawsky
                   ` (6 preceding siblings ...)
  2021-11-20  0:02 ` [PATCH 07/23] cxl/pci: Add new DVSEC definitions Ben Widawsky
@ 2021-11-20  0:02 ` Ben Widawsky
  2021-11-22 15:51   ` Jonathan Cameron
  2021-11-24 22:18   ` Dan Williams
  2021-11-20  0:02 ` [PATCH 09/23] cxl: Introduce module_cxl_driver Ben Widawsky
                   ` (14 subsequent siblings)
  22 siblings, 2 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-20  0:02 UTC
  To: linux-cxl, linux-pci
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

This implements the TODO in cxl_acpi for mapping component registers.
cxl_acpi becomes the second consumer of CXL register block enumeration
(cxl_pci being the first). Moving the functionality to cxl_core allows
both of these drivers to use the functionality. Equally importantly it
allows cxl_core to use the functionality in the future.

CXL 2.0 root ports are similar to CXL 2.0 Downstream Ports with the main
distinction being they're a part of the CXL 2.0 host bridge. While
mapping their component registers is not immediately useful for the CXL
drivers, the movement of register block enumeration into core is a vital
step towards HDM decoder programming.
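
For readers unfamiliar with the Register Locator layout, the decode step that
moves into core can be modelled in userspace as below. The field masks match
the GENMASK() definitions in this series' pci.h; the toy_* names and the
struct are hypothetical, and the length guard in toy_regblock_count() is an
addition for the sketch that the patch itself does not have.

```c
#include <stdint.h>

/* Field layout of the low dword of a Register Locator entry. */
#define TOY_BIR_MASK		UINT32_C(0x00000007)	/* GENMASK(2, 0) */
#define TOY_BLOCK_ID_MASK	UINT32_C(0x0000ff00)	/* GENMASK(15, 8) */
#define TOY_BLOCK_OFF_LOW_MASK	UINT32_C(0xffff0000)	/* GENMASK(31, 16) */

/* Hypothetical, trimmed-down stand-in for struct cxl_register_map. */
struct toy_register_map {
	uint64_t block_offset;
	uint8_t barno;
	uint8_t reg_type;
};

/* Split one 8-byte locator entry into offset, BAR number, and block id. */
static void toy_decode_regblock(uint32_t reg_lo, uint32_t reg_hi,
				struct toy_register_map *map)
{
	map->block_offset = ((uint64_t)reg_hi << 32) |
			    (reg_lo & TOY_BLOCK_OFF_LOW_MASK);
	map->barno = reg_lo & TOY_BIR_MASK;
	map->reg_type = (reg_lo & TOY_BLOCK_ID_MASK) >> 8;
}

/*
 * Number of 8-byte entries in a Register Locator DVSEC that is dvsec_len
 * bytes long; the first entry starts at offset 0xC.
 */
static uint32_t toy_regblock_count(uint32_t dvsec_len)
{
	return dvsec_len < 0xC ? 0 : (dvsec_len - 0xC) / 8;
}
```

For example, reg_lo = 0x00010102 with reg_hi = 0x1 decodes to BAR 2, block
id 1, and a block offset of 0x100010000; a DVSEC length of 0x24 bytes holds
three entries.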

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

---
Changes since RFCv2:
- Squash commits together (Dan)
- Reword commit message to account for above.
---
 drivers/cxl/acpi.c      | 10 ++++++--
 drivers/cxl/core/regs.c | 54 +++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h       |  4 +++
 drivers/cxl/pci.c       | 52 ---------------------------------------
 drivers/cxl/pci.h       |  4 +++
 5 files changed, 70 insertions(+), 54 deletions(-)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index 3163167ecc3a..7cfa8b568013 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -7,6 +7,7 @@
 #include <linux/acpi.h>
 #include <linux/pci.h>
 #include "cxl.h"
+#include "pci.h"
 
 /* Encode defined in CXL 2.0 8.2.5.12.7 HDM Decoder Control Register */
 #define CFMWS_INTERLEAVE_WAYS(x)	(1 << (x)->interleave_ways)
@@ -134,11 +135,13 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
 
 __mock int match_add_root_ports(struct pci_dev *pdev, void *data)
 {
+	resource_size_t creg = CXL_RESOURCE_NONE;
 	struct cxl_walk_context *ctx = data;
 	struct pci_bus *root_bus = ctx->root;
 	struct cxl_port *port = ctx->port;
 	int type = pci_pcie_type(pdev);
 	struct device *dev = ctx->dev;
+	struct cxl_register_map map;
 	u32 lnkcap, port_num;
 	int rc;
 
@@ -152,9 +155,12 @@ __mock int match_add_root_ports(struct pci_dev *pdev, void *data)
 				  &lnkcap) != PCIBIOS_SUCCESSFUL)
 		return 0;
 
-	/* TODO walk DVSEC to find component register base */
+	rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
+	if (!rc)
+		creg = cxl_reg_block(pdev, &map);
+
 	port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
-	rc = cxl_add_dport(port, &pdev->dev, port_num, CXL_RESOURCE_NONE);
+	rc = cxl_add_dport(port, &pdev->dev, port_num, creg);
 	if (rc) {
 		ctx->error = rc;
 		return rc;
diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
index e37e23bf4355..41a0245867ea 100644
--- a/drivers/cxl/core/regs.c
+++ b/drivers/cxl/core/regs.c
@@ -5,6 +5,7 @@
 #include <linux/slab.h>
 #include <linux/pci.h>
 #include <cxlmem.h>
+#include <pci.h>
 
 /**
  * DOC: cxl registers
@@ -247,3 +248,56 @@ int cxl_map_device_regs(struct pci_dev *pdev,
 	return 0;
 }
 EXPORT_SYMBOL_NS_GPL(cxl_map_device_regs, CXL);
+
+static void cxl_decode_regblock(u32 reg_lo, u32 reg_hi,
+				struct cxl_register_map *map)
+{
+	map->block_offset = ((u64)reg_hi << 32) |
+			    (reg_lo & CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK);
+	map->barno = FIELD_GET(CXL_DVSEC_REG_LOCATOR_BIR_MASK, reg_lo);
+	map->reg_type = FIELD_GET(CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK, reg_lo);
+}
+
+/**
+ * cxl_find_regblock() - Locate register blocks by type
+ * @pdev: The CXL PCI device to enumerate.
+ * @type: Register Block Indicator id
+ * @map: Enumeration output, clobbered on error
+ *
+ * Return: 0 if register block enumerated, negative error code otherwise
+ *
+ * A CXL DVSEC may point to one or more register blocks, search for them
+ * by @type.
+ */
+int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
+		      struct cxl_register_map *map)
+{
+	u32 regloc_size, regblocks;
+	int regloc, i;
+
+	regloc = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
+					   CXL_DVSEC_REG_LOCATOR);
+	if (!regloc)
+		return -ENXIO;
+
+	pci_read_config_dword(pdev, regloc + PCI_DVSEC_HEADER1, &regloc_size);
+	regloc_size = FIELD_GET(PCI_DVSEC_HEADER1_LENGTH_MASK, regloc_size);
+
+	regloc += CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET;
+	regblocks = (regloc_size - CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET) / 8;
+
+	for (i = 0; i < regblocks; i++, regloc += 8) {
+		u32 reg_lo, reg_hi;
+
+		pci_read_config_dword(pdev, regloc, &reg_lo);
+		pci_read_config_dword(pdev, regloc + 4, &reg_hi);
+
+		cxl_decode_regblock(reg_lo, reg_hi, map);
+
+		if (map->reg_type == type)
+			return 0;
+	}
+
+	return -ENODEV;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_find_regblock, CXL);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index ab4596f0b751..7150a9694f66 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -145,6 +145,10 @@ int cxl_map_device_regs(struct pci_dev *pdev,
 			struct cxl_device_regs *regs,
 			struct cxl_register_map *map);
 
+enum cxl_regloc_type;
+int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
+		      struct cxl_register_map *map);
+
 #define CXL_RESOURCE_NONE ((resource_size_t) -1)
 #define CXL_TARGET_STRLEN 20
 
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 711bf4514480..d2c743a31b0c 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -433,58 +433,6 @@ static int cxl_map_regs(struct cxl_dev_state *cxlds, struct cxl_register_map *ma
 	return 0;
 }
 
-static void cxl_decode_regblock(u32 reg_lo, u32 reg_hi,
-				struct cxl_register_map *map)
-{
-	map->block_offset = ((u64)reg_hi << 32) |
-			    (reg_lo & CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK);
-	map->barno = FIELD_GET(CXL_DVSEC_REG_LOCATOR_BIR_MASK, reg_lo);
-	map->reg_type = FIELD_GET(CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK, reg_lo);
-}
-
-/**
- * cxl_find_regblock() - Locate register blocks by type
- * @pdev: The CXL PCI device to enumerate.
- * @type: Register Block Indicator id
- * @map: Enumeration output, clobbered on error
- *
- * Return: 0 if register block enumerated, negative error code otherwise
- *
- * A CXL DVSEC may point to one or more register blocks, search for them
- * by @type.
- */
-static int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
-			     struct cxl_register_map *map)
-{
-	u32 regloc_size, regblocks;
-	int regloc, i;
-
-	regloc = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
-					   CXL_DVSEC_REG_LOCATOR);
-	if (!regloc)
-		return -ENXIO;
-
-	pci_read_config_dword(pdev, regloc + PCI_DVSEC_HEADER1, &regloc_size);
-	regloc_size = FIELD_GET(PCI_DVSEC_HEADER1_LENGTH_MASK, regloc_size);
-
-	regloc += CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET;
-	regblocks = (regloc_size - CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET) / 8;
-
-	for (i = 0; i < regblocks; i++, regloc += 8) {
-		u32 reg_lo, reg_hi;
-
-		pci_read_config_dword(pdev, regloc, &reg_lo);
-		pci_read_config_dword(pdev, regloc + 4, &reg_hi);
-
-		cxl_decode_regblock(reg_lo, reg_hi, map);
-
-		if (map->reg_type == type)
-			return 0;
-	}
-
-	return -ENODEV;
-}
-
 static int cxl_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
 			  struct cxl_register_map *map)
 {
diff --git a/drivers/cxl/pci.h b/drivers/cxl/pci.h
index 8ae2b4adc59d..a4b506bb37d1 100644
--- a/drivers/cxl/pci.h
+++ b/drivers/cxl/pci.h
@@ -47,4 +47,8 @@ enum cxl_regloc_type {
 	CXL_REGLOC_RBI_TYPES
 };
 
+#define cxl_reg_block(pdev, map)                                               \
+	((resource_size_t)(pci_resource_start(pdev, (map)->barno) +            \
+			   (map)->block_offset))
+
 #endif /* __CXL_PCI_H__ */
-- 
2.34.0


* [PATCH 09/23] cxl: Introduce module_cxl_driver
  2021-11-20  0:02 [PATCH 00/23] Add drivers for CXL ports and mem devices Ben Widawsky
                   ` (7 preceding siblings ...)
  2021-11-20  0:02 ` [PATCH 08/23] cxl/acpi: Map component registers for Root Ports Ben Widawsky
@ 2021-11-20  0:02 ` Ben Widawsky
  2021-11-22 15:54   ` Jonathan Cameron
  2021-11-24 22:22   ` Dan Williams
  2021-11-20  0:02 ` [PATCH 10/23] cxl/core: Convert decoder range to resource Ben Widawsky
                   ` (13 subsequent siblings)
  22 siblings, 2 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-20  0:02 UTC (permalink / raw)
  To: linux-cxl, linux-pci
  Cc: Ben Widawsky, Dan Williams, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Many CXL drivers simply want to register and unregister themselves.
module_driver() already supports this. A simple wrapper around it
removes a fair amount of boilerplate in upcoming patches.
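
As a rough userspace illustration of the pattern (not the kernel
implementation), the sketch below generates an init/exit pair from a
single macro invocation. Every demo_* name is a hypothetical stand-in
and is not part of the CXL core.

```c
/* Hypothetical stand-in for struct cxl_driver; not the kernel definition. */
struct demo_driver {
	const char *name;
	int registered;
};

/* Hypothetical register/unregister hooks. */
static int demo_driver_register(struct demo_driver *drv)
{
	drv->registered = 1;
	return 0;
}

static void demo_driver_unregister(struct demo_driver *drv)
{
	drv->registered = 0;
}

/*
 * Userspace imitation of the module_driver() idea: generate the init/exit
 * pair so that each driver does not have to spell it out by hand.
 */
#define demo_module_driver(__drv, __register, __unregister)	\
static int __drv##_init(void)					\
{								\
	return __register(&(__drv));				\
}								\
static void __drv##_exit(void)					\
{								\
	__unregister(&(__drv));					\
}

static struct demo_driver demo_port_driver = { .name = "demo_port" };

/* One line replaces the hand-written init/exit boilerplate. */
demo_module_driver(demo_port_driver, demo_driver_register,
		   demo_driver_unregister)
```

In the kernel, the generated functions are additionally wired up through
module_init() and module_exit(); the sketch leaves that part out.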

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/cxl.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 7150a9694f66..d39d45f4a770 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -308,6 +308,9 @@ int __cxl_driver_register(struct cxl_driver *cxl_drv, struct module *owner,
 #define cxl_driver_register(x) __cxl_driver_register(x, THIS_MODULE, KBUILD_MODNAME)
 void cxl_driver_unregister(struct cxl_driver *cxl_drv);
 
+#define module_cxl_driver(__cxl_driver) \
+	module_driver(__cxl_driver, cxl_driver_register, cxl_driver_unregister)
+
 #define CXL_DEVICE_NVDIMM_BRIDGE	1
 #define CXL_DEVICE_NVDIMM		2
 
-- 
2.34.0


* [PATCH 10/23] cxl/core: Convert decoder range to resource
  2021-11-20  0:02 [PATCH 00/23] Add drivers for CXL ports and mem devices Ben Widawsky
                   ` (8 preceding siblings ...)
  2021-11-20  0:02 ` [PATCH 09/23] cxl: Introduce module_cxl_driver Ben Widawsky
@ 2021-11-20  0:02 ` Ben Widawsky
  2021-11-22 16:08   ` Jonathan Cameron
  2021-11-24 22:41   ` Dan Williams
  2021-11-20  0:02 ` [PATCH 11/23] cxl/core: Document and tighten up decoder APIs Ben Widawsky
                   ` (12 subsequent siblings)
  22 siblings, 2 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-20  0:02 UTC (permalink / raw)
  To: linux-cxl, linux-pci
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

CXL decoders manage address ranges in a hierarchical fashion whereby a
leaf is a unique subregion of its parent decoder (midlevel or root). It
therefore makes sense to use the resource API for handling this.
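
To make the start/size versus start/end conventions concrete, here is a
small userspace sketch. The demo_* types and helpers are hypothetical and
only mimic the inclusive-end arithmetic of DEFINE_RES_MEM(),
resource_size() and range_len(); a 64-bit address type is assumed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins for struct range and struct resource. */
struct demo_range {
	uint64_t start;
	uint64_t end;
};

struct demo_resource {
	uint64_t start;
	uint64_t end;
	const char *name;
};

/* Mirrors the DEFINE_RES_MEM(start, size) convention: end is inclusive. */
static struct demo_resource demo_res_mem(uint64_t start, uint64_t size)
{
	struct demo_resource res = {
		.start = start,
		.end = start + size - 1,
		.name = NULL,
	};

	return res;
}

/* Inclusive length, as resource_size() computes it. */
static uint64_t demo_resource_size(const struct demo_resource *res)
{
	return res->end - res->start + 1;
}

/* Inclusive length, as range_len() computes it. */
static uint64_t demo_range_len(const struct demo_range *r)
{
	return r->end - r->start + 1;
}
```

Note that a start of 0 with a size of 0 wraps around to an end of all
ones, which is the same "everything" window the old { .start = 0,
.end = -1 } range described.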

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

---
Changes since RFCv2
- Switch to DEFINE_RES_MEM from NAMED variant (Dan)
- Differentiate CFMWS resources and other decoder resources (Ben)
- Make decoder resources be range, not resource (Dan)
- Set decoder name in cxl_decoder_add() (Dan)
---
 drivers/cxl/acpi.c     | 16 ++++++----------
 drivers/cxl/core/bus.c | 19 +++++++++++++++++--
 drivers/cxl/cxl.h      |  8 ++++++--
 3 files changed, 29 insertions(+), 14 deletions(-)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index 7cfa8b568013..3415184a2e61 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -108,10 +108,8 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
 
 	cxld->flags = cfmws_to_decoder_flags(cfmws->restrictions);
 	cxld->target_type = CXL_DECODER_EXPANDER;
-	cxld->range = (struct range){
-		.start = cfmws->base_hpa,
-		.end = cfmws->base_hpa + cfmws->window_size - 1,
-	};
+	cxld->platform_res = (struct resource)DEFINE_RES_MEM(cfmws->base_hpa,
+							     cfmws->window_size);
 	cxld->interleave_ways = CFMWS_INTERLEAVE_WAYS(cfmws);
 	cxld->interleave_granularity = CFMWS_INTERLEAVE_GRANULARITY(cfmws);
 
@@ -127,8 +125,9 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
 		return 0;
 	}
 	dev_dbg(dev, "add: %s node: %d range %#llx-%#llx\n",
-		dev_name(&cxld->dev), phys_to_target_node(cxld->range.start),
-		cfmws->base_hpa, cfmws->base_hpa + cfmws->window_size - 1);
+		dev_name(&cxld->dev),
+		phys_to_target_node(cxld->platform_res.start), cfmws->base_hpa,
+		cfmws->base_hpa + cfmws->window_size - 1);
 
 	return 0;
 }
@@ -267,10 +266,7 @@ static int add_host_bridge_uport(struct device *match, void *arg)
 	cxld->interleave_ways = 1;
 	cxld->interleave_granularity = PAGE_SIZE;
 	cxld->target_type = CXL_DECODER_EXPANDER;
-	cxld->range = (struct range) {
-		.start = 0,
-		.end = -1,
-	};
+	cxld->platform_res = (struct resource)DEFINE_RES_MEM(0, 0);
 
 	device_lock(&port->dev);
 	dport = list_first_entry(&port->dports, typeof(*dport), list);
diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
index 17a4fff029f8..8e80e85350b1 100644
--- a/drivers/cxl/core/bus.c
+++ b/drivers/cxl/core/bus.c
@@ -46,8 +46,14 @@ static ssize_t start_show(struct device *dev, struct device_attribute *attr,
 			  char *buf)
 {
 	struct cxl_decoder *cxld = to_cxl_decoder(dev);
+	u64 start = 0;
 
-	return sysfs_emit(buf, "%#llx\n", cxld->range.start);
+	if (is_root_decoder(dev))
+		start = cxld->platform_res.start;
+	else
+		start = cxld->decoder_range.start;
+
+	return sysfs_emit(buf, "%#llx\n", start);
 }
 static DEVICE_ATTR_RO(start);
 
@@ -55,8 +61,14 @@ static ssize_t size_show(struct device *dev, struct device_attribute *attr,
 			char *buf)
 {
 	struct cxl_decoder *cxld = to_cxl_decoder(dev);
+	u64 size = 0;
 
-	return sysfs_emit(buf, "%#llx\n", range_len(&cxld->range));
+	if (is_root_decoder(dev))
+		size = resource_size(&cxld->platform_res);
+	else
+		size = range_len(&cxld->decoder_range);
+
+	return sysfs_emit(buf, "%#llx\n", size);
 }
 static DEVICE_ATTR_RO(size);
 
@@ -548,6 +560,9 @@ int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
 	if (rc)
 		return rc;
 
+	if (is_root_decoder(dev))
+		cxld->platform_res.name = dev_name(dev);
+
 	return device_add(dev);
 }
 EXPORT_SYMBOL_NS_GPL(cxl_decoder_add, CXL);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index d39d45f4a770..ad816fb5bdcc 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -179,7 +179,8 @@ enum cxl_decoder_type {
  * struct cxl_decoder - CXL address range decode configuration
  * @dev: this decoder's device
  * @id: kernel device name id
- * @range: address range considered by this decoder
+ * @platform_res: address space resources considered by root decoder
+ * @decoder_range: address space resources considered by midlevel decoder
  * @interleave_ways: number of cxl_dports in this decode
  * @interleave_granularity: data stride per dport
  * @target_type: accelerator vs expander (type2 vs type3) selector
@@ -190,7 +191,10 @@ enum cxl_decoder_type {
 struct cxl_decoder {
 	struct device dev;
 	int id;
-	struct range range;
+	union {
+		struct resource platform_res;
+		struct range decoder_range;
+	};
 	int interleave_ways;
 	int interleave_granularity;
 	enum cxl_decoder_type target_type;
-- 
2.34.0


* [PATCH 11/23] cxl/core: Document and tighten up decoder APIs
  2021-11-20  0:02 [PATCH 00/23] Add drivers for CXL ports and mem devices Ben Widawsky
                   ` (9 preceding siblings ...)
  2021-11-20  0:02 ` [PATCH 10/23] cxl/core: Convert decoder range to resource Ben Widawsky
@ 2021-11-20  0:02 ` Ben Widawsky
  2021-11-22 16:13   ` Jonathan Cameron
  2021-11-24 22:55   ` Dan Williams
  2021-11-20  0:02 ` [PATCH 12/23] cxl: Introduce endpoint decoders Ben Widawsky
                   ` (11 subsequent siblings)
  22 siblings, 2 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-20  0:02 UTC (permalink / raw)
  To: linux-cxl, linux-pci
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Since the code to add decoders for switches and endpoints is on the
horizon it helps to have properly documented APIs. In addition, the
decoder APIs will never need to support a negative count for downstream
targets as the spec explicitly starts numbering them at 1, i.e. even 0 is
an "invalid" value which can be used as a sentinel.
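
A minimal sketch of the tightened check, assuming an upper limit of 16
purely for illustration (DEMO_DECODER_MAX_INTERLEAVE and
demo_check_nr_targets() are hypothetical names):

```c
#include <errno.h>

/* Assumed limit for illustration; the kernel defines its own value. */
#define DEMO_DECODER_MAX_INTERLEAVE 16

/*
 * With an unsigned count there is no negative case to test for: a value
 * that was negative as an int becomes very large and trips the upper
 * bound, leaving 0 as the only other invalid input.
 */
static int demo_check_nr_targets(unsigned int nr_targets)
{
	if (nr_targets > DEMO_DECODER_MAX_INTERLEAVE || nr_targets == 0)
		return -EINVAL;

	return 0;
}
```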

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

---

This is respun from a previous incantation here:
https://lore.kernel.org/linux-cxl/20210915155946.308339-1-ben.widawsky@intel.com/
---
 drivers/cxl/core/bus.c | 33 +++++++++++++++++++++++++++++++--
 drivers/cxl/cxl.h      |  3 ++-
 2 files changed, 33 insertions(+), 3 deletions(-)

diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
index 8e80e85350b1..1ee12a60f3f4 100644
--- a/drivers/cxl/core/bus.c
+++ b/drivers/cxl/core/bus.c
@@ -495,7 +495,20 @@ static int decoder_populate_targets(struct cxl_decoder *cxld,
 	return rc;
 }
 
-struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
+/**
+ * cxl_decoder_alloc - Allocate a new CXL decoder
+ * @port: owning port of this decoder
+ * @nr_targets: downstream targets accessible by this decoder. All upstream
+ *		ports and root ports must have at least 1 target.
+ *
+ * A port should contain one or more decoders. Each of those decoders enable
+ * some address space for CXL.mem utilization. A decoder is expected to be
+ * configured by the caller before registering.
+ *
+ * Return: A new cxl decoder to be registered by cxl_decoder_add()
+ */
+struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
+				      unsigned int nr_targets)
 {
 	struct cxl_decoder *cxld, cxld_const_init = {
 		.nr_targets = nr_targets,
@@ -503,7 +516,7 @@ struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
 	struct device *dev;
 	int rc = 0;
 
-	if (nr_targets > CXL_DECODER_MAX_INTERLEAVE || nr_targets < 1)
+	if (nr_targets > CXL_DECODER_MAX_INTERLEAVE || nr_targets == 0)
 		return ERR_PTR(-EINVAL);
 
 	cxld = kzalloc(struct_size(cxld, target, nr_targets), GFP_KERNEL);
@@ -535,6 +548,22 @@ struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_decoder_alloc, CXL);
 
+/**
+ * cxl_decoder_add - Add a decoder with targets
+ * @cxld: The cxl decoder allocated by cxl_decoder_alloc()
+ * @target_map: A list of downstream ports that this decoder can direct memory
+ *              traffic to. These numbers should correspond with the port number
+ *              in the PCIe Link Capabilities structure.
+ *
+ * Certain types of decoders may not have any targets. The main example of this
+ * is an endpoint device. A more awkward example is a hostbridge whose root
+ * ports get hot added (technically possible, though unlikely).
+ *
+ * Context: Process context. Takes and releases the cxld's device lock.
+ *
+ * Return: Negative error code if the decoder wasn't properly configured; else
+ *	   returns 0.
+ */
 int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
 {
 	struct cxl_port *port;
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index ad816fb5bdcc..b66ed8f241c6 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -288,7 +288,8 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport, int port_id,
 
 struct cxl_decoder *to_cxl_decoder(struct device *dev);
 bool is_root_decoder(struct device *dev);
-struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets);
+struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
+				      unsigned int nr_targets);
 int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map);
 int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld);
 
-- 
2.34.0


* [PATCH 12/23] cxl: Introduce endpoint decoders
  2021-11-20  0:02 [PATCH 00/23] Add drivers for CXL ports and mem devices Ben Widawsky
                   ` (10 preceding siblings ...)
  2021-11-20  0:02 ` [PATCH 11/23] cxl/core: Document and tighten up decoder APIs Ben Widawsky
@ 2021-11-20  0:02 ` Ben Widawsky
  2021-11-22 16:20   ` Jonathan Cameron
  2021-11-20  0:02 ` [PATCH 13/23] cxl/core: Move target population locking to caller Ben Widawsky
                   ` (10 subsequent siblings)
  22 siblings, 1 reply; 133+ messages in thread
From: Ben Widawsky @ 2021-11-20  0:02 UTC (permalink / raw)
  To: linux-cxl, linux-pci
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Endpoints have decoders too. It is useful to share the same
infrastructure from cxl_core. Endpoints do not have dports (downstream
targets), only the underlying physical medium. As a result, some special
casing is needed.

No functional change is introduced, as endpoints do not actually
enumerate decoders yet.
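
The special casing boils down to how the decoder's device type is picked.
A userspace sketch of that decision, with hypothetical demo_* names
standing in for the device_type comparisons done in cxl_decoder_alloc():

```c
#include <stdbool.h>

enum demo_decoder_type {
	DEMO_DECODER_ROOT,
	DEMO_DECODER_SWITCH,
	DEMO_DECODER_ENDPOINT,
};

/*
 * A decoder with no targets is an endpoint decoder. Otherwise, a port
 * whose parent is itself a cxl_port hosts switch decoders, and anything
 * else is treated as a root decoder.
 */
static enum demo_decoder_type demo_pick_type(unsigned int nr_targets,
					      bool parent_is_cxl_port)
{
	if (nr_targets == 0)
		return DEMO_DECODER_ENDPOINT;
	if (parent_is_cxl_port)
		return DEMO_DECODER_SWITCH;
	return DEMO_DECODER_ROOT;
}
```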

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/core/bus.c | 41 +++++++++++++++++++++++++++++++++--------
 1 file changed, 33 insertions(+), 8 deletions(-)

diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
index 1ee12a60f3f4..16b15f54fb62 100644
--- a/drivers/cxl/core/bus.c
+++ b/drivers/cxl/core/bus.c
@@ -187,6 +187,12 @@ static const struct attribute_group *cxl_decoder_switch_attribute_groups[] = {
 	NULL,
 };
 
+static const struct attribute_group *cxl_decoder_endpoint_attribute_groups[] = {
+	&cxl_decoder_base_attribute_group,
+	&cxl_base_attribute_group,
+	NULL,
+};
+
 static void cxl_decoder_release(struct device *dev)
 {
 	struct cxl_decoder *cxld = to_cxl_decoder(dev);
@@ -196,6 +202,12 @@ static void cxl_decoder_release(struct device *dev)
 	kfree(cxld);
 }
 
+static const struct device_type cxl_decoder_endpoint_type = {
+	.name = "cxl_decoder_endpoint",
+	.release = cxl_decoder_release,
+	.groups = cxl_decoder_endpoint_attribute_groups,
+};
+
 static const struct device_type cxl_decoder_switch_type = {
 	.name = "cxl_decoder_switch",
 	.release = cxl_decoder_release,
@@ -208,6 +220,11 @@ static const struct device_type cxl_decoder_root_type = {
 	.groups = cxl_decoder_root_attribute_groups,
 };
 
+static bool is_endpoint_decoder(struct device *dev)
+{
+	return dev->type == &cxl_decoder_endpoint_type;
+}
+
 bool is_root_decoder(struct device *dev)
 {
 	return dev->type == &cxl_decoder_root_type;
@@ -499,7 +516,9 @@ static int decoder_populate_targets(struct cxl_decoder *cxld,
  * cxl_decoder_alloc - Allocate a new CXL decoder
  * @port: owning port of this decoder
  * @nr_targets: downstream targets accessible by this decoder. All upstream
- *		ports and root ports must have at least 1 target.
+ *		ports and root ports must have at least 1 target. Endpoint
+ *		devices will have 0 targets. Callers wishing to register an
+ *		endpoint device should specify 0.
  *
  * A port should contain one or more decoders. Each of those decoders enable
  * some address space for CXL.mem utilization. A decoder is expected to be
@@ -516,7 +535,7 @@ struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
 	struct device *dev;
 	int rc = 0;
 
-	if (nr_targets > CXL_DECODER_MAX_INTERLEAVE || nr_targets == 0)
+	if (nr_targets > CXL_DECODER_MAX_INTERLEAVE)
 		return ERR_PTR(-EINVAL);
 
 	cxld = kzalloc(struct_size(cxld, target, nr_targets), GFP_KERNEL);
@@ -535,8 +554,11 @@ struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
 	dev->parent = &port->dev;
 	dev->bus = &cxl_bus_type;
 
+	/* Endpoints don't have a target list */
+	if (nr_targets == 0)
+		dev->type = &cxl_decoder_endpoint_type;
 	/* root ports do not have a cxl_port_type parent */
-	if (port->dev.parent->type == &cxl_port_type)
+	else if (port->dev.parent->type == &cxl_port_type)
 		dev->type = &cxl_decoder_switch_type;
 	else
 		dev->type = &cxl_decoder_root_type;
@@ -579,12 +601,15 @@ int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
 	if (cxld->interleave_ways < 1)
 		return -EINVAL;
 
-	port = to_cxl_port(cxld->dev.parent);
-	rc = decoder_populate_targets(cxld, port, target_map);
-	if (rc)
-		return rc;
-
 	dev = &cxld->dev;
+
+	port = to_cxl_port(cxld->dev.parent);
+	if (!is_endpoint_decoder(dev)) {
+		rc = decoder_populate_targets(cxld, port, target_map);
+		if (rc)
+			return rc;
+	}
+
 	rc = dev_set_name(dev, "decoder%d.%d", port->id, cxld->id);
 	if (rc)
 		return rc;
-- 
2.34.0


* [PATCH 13/23] cxl/core: Move target population locking to caller
  2021-11-20  0:02 [PATCH 00/23] Add drivers for CXL ports and mem devices Ben Widawsky
                   ` (11 preceding siblings ...)
  2021-11-20  0:02 ` [PATCH 12/23] cxl: Introduce endpoint decoders Ben Widawsky
@ 2021-11-20  0:02 ` Ben Widawsky
  2021-11-22 16:33   ` Jonathan Cameron
  2021-11-25  0:34   ` Dan Williams
  2021-11-20  0:02 ` [PATCH 14/23] cxl: Introduce topology host registration Ben Widawsky
                   ` (9 subsequent siblings)
  22 siblings, 2 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-20  0:02 UTC (permalink / raw)
  To: linux-cxl, linux-pci
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

In preparation for a port driver that enumerates a descendant port +
decoder hierarchy, arrange for an unlocked version of cxl_decoder_add().
Otherwise a port-driver that adds a child decoder will deadlock on the
device_lock() in ->probe().
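
The split follows the usual "_locked plus wrapper" shape. Below is a
userspace sketch of that shape only; the boolean flag is a hypothetical
stand-in for the port's device lock and all demo_* names are invented for
illustration.

```c
#include <errno.h>
#include <stdbool.h>

struct demo_port {
	bool locked;		/* stand-in for the port's device lock */
	int nr_decoders;
};

static void demo_lock(struct demo_port *port)
{
	port->locked = true;
}

static void demo_unlock(struct demo_port *port)
{
	port->locked = false;
}

/* Caller must already hold the lock, e.g. because it runs from ->probe(). */
static int demo_decoder_add_locked(struct demo_port *port)
{
	if (!port->locked)
		return -EPERM;	/* the kernel would use device_lock_assert() */

	port->nr_decoders++;
	return 0;
}

/* Convenience wrapper for callers that do not hold the lock. */
static int demo_decoder_add(struct demo_port *port)
{
	int rc;

	demo_lock(port);
	rc = demo_decoder_add_locked(port);
	demo_unlock(port);

	return rc;
}
```

A caller that already holds the lock uses the _locked variant directly;
calling the wrapper from that context is what would deadlock on a real,
non-recursive lock.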

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

---

Changes since RFCv2:
- Reword commit message (Dan)
- Move decoder API changes into this patch (Dan)
---
 drivers/cxl/core/bus.c | 59 +++++++++++++++++++++++++++++++-----------
 drivers/cxl/cxl.h      |  1 +
 2 files changed, 45 insertions(+), 15 deletions(-)

diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
index 16b15f54fb62..cd6fe7823c69 100644
--- a/drivers/cxl/core/bus.c
+++ b/drivers/cxl/core/bus.c
@@ -487,28 +487,22 @@ static int decoder_populate_targets(struct cxl_decoder *cxld,
 {
 	int rc = 0, i;
 
+	device_lock_assert(&port->dev);
+
 	if (!target_map)
 		return 0;
 
-	device_lock(&port->dev);
-	if (list_empty(&port->dports)) {
-		rc = -EINVAL;
-		goto out_unlock;
-	}
+	if (list_empty(&port->dports))
+		return -EINVAL;
 
 	for (i = 0; i < cxld->nr_targets; i++) {
 		struct cxl_dport *dport = find_dport(port, target_map[i]);
 
-		if (!dport) {
-			rc = -ENXIO;
-			goto out_unlock;
-		}
+		if (!dport)
+			return -ENXIO;
 		cxld->target[i] = dport;
 	}
 
-out_unlock:
-	device_unlock(&port->dev);
-
 	return rc;
 }
 
@@ -571,7 +565,7 @@ struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
 EXPORT_SYMBOL_NS_GPL(cxl_decoder_alloc, CXL);
 
 /**
- * cxl_decoder_add - Add a decoder with targets
+ * cxl_decoder_add_locked - Add a decoder with targets
  * @cxld: The cxl decoder allocated by cxl_decoder_alloc()
  * @target_map: A list of downstream ports that this decoder can direct memory
  *              traffic to. These numbers should correspond with the port number
@@ -581,12 +575,14 @@ EXPORT_SYMBOL_NS_GPL(cxl_decoder_alloc, CXL);
  * is an endpoint device. A more awkward example is a hostbridge whose root
  * ports get hot added (technically possible, though unlikely).
  *
- * Context: Process context. Takes and releases the cxld's device lock.
+ * This is the locked variant of cxl_decoder_add().
+ *
+ * Context: Process context. Expects the cxld's device lock to be held.
  *
  * Return: Negative error code if the decoder wasn't properly configured; else
  *	   returns 0.
  */
-int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
+int cxl_decoder_add_locked(struct cxl_decoder *cxld, int *target_map)
 {
 	struct cxl_port *port;
 	struct device *dev;
@@ -619,6 +615,39 @@ int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
 
 	return device_add(dev);
 }
+EXPORT_SYMBOL_NS_GPL(cxl_decoder_add_locked, CXL);
+
+/**
+ * cxl_decoder_add - Add a decoder with targets
+ * @cxld: The cxl decoder allocated by cxl_decoder_alloc()
+ * @target_map: A list of downstream ports that this decoder can direct memory
+ *              traffic to. These numbers should correspond with the port number
+ *              in the PCIe Link Capabilities structure.
+ *
+ * This is the unlocked variant of cxl_decoder_add_locked().
+ * See cxl_decoder_add_locked().
+ *
+ * Context: Process context. Takes and releases the cxld's device lock.
+ */
+int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
+{
+	struct cxl_port *port;
+	int rc;
+
+	if (WARN_ON_ONCE(!cxld))
+		return -EINVAL;
+
+	if (WARN_ON_ONCE(IS_ERR(cxld)))
+		return PTR_ERR(cxld);
+
+	port = to_cxl_port(cxld->dev.parent);
+
+	device_lock(&port->dev);
+	rc = cxl_decoder_add_locked(cxld, target_map);
+	device_unlock(&port->dev);
+
+	return rc;
+}
 EXPORT_SYMBOL_NS_GPL(cxl_decoder_add, CXL);
 
 static void cxld_unregister(void *dev)
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index b66ed8f241c6..2c5627fa8a34 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -290,6 +290,7 @@ struct cxl_decoder *to_cxl_decoder(struct device *dev);
 bool is_root_decoder(struct device *dev);
 struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
 				      unsigned int nr_targets);
+int cxl_decoder_add_locked(struct cxl_decoder *cxld, int *target_map);
 int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map);
 int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld);
 
-- 
2.34.0


* [PATCH 14/23] cxl: Introduce topology host registration
  2021-11-20  0:02 [PATCH 00/23] Add drivers for CXL ports and mem devices Ben Widawsky
                   ` (12 preceding siblings ...)
  2021-11-20  0:02 ` [PATCH 13/23] cxl/core: Move target population locking to caller Ben Widawsky
@ 2021-11-20  0:02 ` Ben Widawsky
  2021-11-22 18:20   ` Jonathan Cameron
  2021-11-25  1:09   ` Dan Williams
  2021-11-20  0:02 ` [PATCH 15/23] cxl/core: Store global list of root ports Ben Widawsky
                   ` (8 subsequent siblings)
  22 siblings, 2 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-20  0:02 UTC (permalink / raw)
  To: linux-cxl, linux-pci
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

The description of the CXL topology will be conveyed by a platform
specific entity that is expected to be a singleton. For ACPI based
systems, this is ACPI0017. When the topology host goes away, which as of
now can only be triggered by module unload, it is desirable to have the
entire topology cleaned up. Regular devm unwinding handles most
situations already, but what's missing is the ability to tear down the
root port. Since the root port is platform specific, the core needs a
set of APIs to allow platform specific drivers to register their root
ports. With that, all the automatic teardown can occur.

cxl_test makes for an interesting case. cxl_test creates an alternate
universe where there are possibly two root topology hosts (a real
ACPI0017, and a fake ACPI0017). For this to work in the future, cxl_acpi
(or some future platform host driver) will need to be unloaded first.
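
The singleton behaviour can be sketched in userspace as follows. The
demo_* names are hypothetical and the locking done with the rwsem in the
patch is deliberately left out.

```c
#include <errno.h>
#include <stddef.h>

/* At most one topology host may be registered at a time. */
static const void *demo_topology_host;

static int demo_register_topology_host(const void *host)
{
	if (demo_topology_host)
		return -EBUSY;	/* somebody else already owns the topology */

	demo_topology_host = host;
	return 0;
}

static void demo_unregister_topology_host(const void *host)
{
	/* Only the current owner may clear the registration. */
	if (demo_topology_host == host)
		demo_topology_host = NULL;
}
```

A second host, such as the one cxl_test would provide, can only register
after the first one has unregistered.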

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
The topology lock can be used for more things. I'd like to save that as
an add-on patch since it's extra risk for no reward, at this point.
---
 drivers/cxl/acpi.c     | 18 ++++++++++---
 drivers/cxl/core/bus.c | 57 +++++++++++++++++++++++++++++++++++++++---
 drivers/cxl/cxl.h      |  5 +++-
 3 files changed, 73 insertions(+), 7 deletions(-)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index 3415184a2e61..82cc42ab38c6 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -224,8 +224,7 @@ static int add_host_bridge_uport(struct device *match, void *arg)
 		return 0;
 	}
 
-	port = devm_cxl_add_port(host, match, dport->component_reg_phys,
-				 root_port);
+	port = devm_cxl_add_port(match, dport->component_reg_phys, root_port);
 	if (IS_ERR(port))
 		return PTR_ERR(port);
 	dev_dbg(host, "%s: add: %s\n", dev_name(match), dev_name(&port->dev));
@@ -376,6 +375,11 @@ static int add_root_nvdimm_bridge(struct device *match, void *data)
 	return 1;
 }
 
+static void clear_topology_host(void *data)
+{
+	cxl_unregister_topology_host(data);
+}
+
 static int cxl_acpi_probe(struct platform_device *pdev)
 {
 	int rc;
@@ -384,7 +388,15 @@ static int cxl_acpi_probe(struct platform_device *pdev)
 	struct acpi_device *adev = ACPI_COMPANION(host);
 	struct cxl_cfmws_context ctx;
 
-	root_port = devm_cxl_add_port(host, host, CXL_RESOURCE_NONE, NULL);
+	rc = cxl_register_topology_host(host);
+	if (rc)
+		return rc;
+
+	rc = devm_add_action_or_reset(host, clear_topology_host, host);
+	if (rc)
+		return rc;
+
+	root_port = devm_cxl_add_port(host, CXL_RESOURCE_NONE, root_port);
 	if (IS_ERR(root_port))
 		return PTR_ERR(root_port);
 	dev_dbg(host, "add: %s\n", dev_name(&root_port->dev));
diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
index cd6fe7823c69..2ad38167796d 100644
--- a/drivers/cxl/core/bus.c
+++ b/drivers/cxl/core/bus.c
@@ -25,6 +25,53 @@
  */
 
 static DEFINE_IDA(cxl_port_ida);
+static DECLARE_RWSEM(topology_host_sem);
+
+static struct device *cxl_topology_host;
+
+int cxl_register_topology_host(struct device *host)
+{
+	down_write(&topology_host_sem);
+	if (cxl_topology_host) {
+		up_write(&topology_host_sem);
+		pr_warn("%s host currently in use. Please try unloading %s",
+			dev_name(cxl_topology_host), host->driver->name);
+		return -EBUSY;
+	}
+
+	cxl_topology_host = host;
+	up_write(&topology_host_sem);
+
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_register_topology_host, CXL);
+
+void cxl_unregister_topology_host(struct device *host)
+{
+	down_write(&topology_host_sem);
+	if (cxl_topology_host == host)
+		cxl_topology_host = NULL;
+	else
+		pr_warn("topology host in use by %s\n",
+			cxl_topology_host->driver->name);
+	up_write(&topology_host_sem);
+}
+EXPORT_SYMBOL_NS_GPL(cxl_unregister_topology_host, CXL);
+
+static struct device *get_cxl_topology_host(void)
+{
+	down_read(&topology_host_sem);
+	if (cxl_topology_host)
+		return cxl_topology_host;
+	up_read(&topology_host_sem);
+	return NULL;
+}
+
+static void put_cxl_topology_host(struct device *dev)
+{
+	WARN_ON(dev != cxl_topology_host);
+	up_read(&topology_host_sem);
+}
 
 static ssize_t devtype_show(struct device *dev, struct device_attribute *attr,
 			    char *buf)
@@ -362,17 +409,16 @@ static struct cxl_port *cxl_port_alloc(struct device *uport,
 
 /**
  * devm_cxl_add_port - register a cxl_port in CXL memory decode hierarchy
- * @host: host device for devm operations
  * @uport: "physical" device implementing this upstream port
  * @component_reg_phys: (optional) for configurable cxl_port instances
  * @parent_port: next hop up in the CXL memory decode hierarchy
  */
-struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport,
+struct cxl_port *devm_cxl_add_port(struct device *uport,
 				   resource_size_t component_reg_phys,
 				   struct cxl_port *parent_port)
 {
+	struct device *dev, *host;
 	struct cxl_port *port;
-	struct device *dev;
 	int rc;
 
 	port = cxl_port_alloc(uport, component_reg_phys, parent_port);
@@ -391,7 +437,12 @@ struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport,
 	if (rc)
 		goto err;
 
+	host = get_cxl_topology_host();
+	if (!host)
+		return ERR_PTR(-ENODEV);
+
 	rc = devm_add_action_or_reset(host, unregister_port, port);
+	put_cxl_topology_host(host);
 	if (rc)
 		return ERR_PTR(rc);
 
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 2c5627fa8a34..6fac4826d22b 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -152,6 +152,9 @@ int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
 #define CXL_RESOURCE_NONE ((resource_size_t) -1)
 #define CXL_TARGET_STRLEN 20
 
+int cxl_register_topology_host(struct device *host);
+void cxl_unregister_topology_host(struct device *host);
+
 /*
  * cxl_decoder flags that define the type of memory / devices this
  * decoder supports as well as configuration lock status See "CXL 2.0
@@ -279,7 +282,7 @@ struct cxl_dport {
 };
 
 struct cxl_port *to_cxl_port(struct device *dev);
-struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport,
+struct cxl_port *devm_cxl_add_port(struct device *uport,
 				   resource_size_t component_reg_phys,
 				   struct cxl_port *parent_port);
 
-- 
2.34.0



* [PATCH 15/23] cxl/core: Store global list of root ports
  2021-11-20  0:02 [PATCH 00/23] Add drivers for CXL ports and mem devices Ben Widawsky
                   ` (13 preceding siblings ...)
  2021-11-20  0:02 ` [PATCH 14/23] cxl: Introduce topology host registration Ben Widawsky
@ 2021-11-20  0:02 ` Ben Widawsky
  2021-11-22 18:22   ` Jonathan Cameron
  2021-11-20  0:02 ` [PATCH 16/23] cxl/pci: Cache device DVSEC offset Ben Widawsky
                   ` (7 subsequent siblings)
  22 siblings, 1 reply; 133+ messages in thread
From: Ben Widawsky @ 2021-11-20  0:02 UTC (permalink / raw)
  To: linux-cxl, linux-pci
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

CXL root ports (the downstream port to a host bridge) are to be
enumerated by a platform specific driver. In the case of ACPI compliant
systems, this is the cxl_acpi driver. Root ports are the first
CXL spec defined component that can be "found" by that platform specific
driver.

By storing a list of these root ports, components in lower levels of the
topology (switches and endpoints) have a mechanism to walk up their
device hierarchy to find an enumerated root port. This will be necessary
for region programming.
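
For illustration only, the walk described above can be sketched in
userspace C. This is not the kernel code: struct dev_node, struct
root_dport, get_root_dport() and find_root_dport() are hypothetical
stand-ins for struct device, struct cxl_dport and the helpers in this
patch, and locking is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device: only the parent link matters. */
struct dev_node {
	struct dev_node *parent;	/* next device up the hierarchy */
};

/* Hypothetical stand-in for a cxl_dport that sits on the global list. */
struct root_dport {
	struct dev_node *dport;		/* device backing this root port */
	struct root_dport *next;	/* link in the global root port list */
};

/* Return the list entry whose backing device is @dev, or NULL. */
static struct root_dport *get_root_dport(struct root_dport *head,
					 struct dev_node *dev)
{
	for (; head; head = head->next)
		if (head->dport == dev)
			return head;
	return NULL;
}

/* Walk up from @dev until some ancestor is found on the root port list. */
static struct root_dport *find_root_dport(struct root_dport *head,
					  struct dev_node *dev)
{
	assert(dev);
	for (; dev; dev = dev->parent) {
		struct root_dport *rp = get_root_dport(head, dev);

		if (rp)
			return rp;
	}
	return NULL;
}
```

An endpoint below a switch below a root port resolves to that root
port; a device with no enumerated ancestor resolves to NULL.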

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

---
Dan points out in review that this is possible to do without a new global
list. While I agree, I was unable to get it working in a reasonable
amount of time. Will punt on that for now.
---
 drivers/cxl/acpi.c            |  4 ++--
 drivers/cxl/core/bus.c        | 34 +++++++++++++++++++++++++++++++++-
 drivers/cxl/cxl.h             |  5 ++++-
 tools/testing/cxl/mock_acpi.c |  4 ++--
 4 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index 82cc42ab38c6..c12e4fed7941 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -159,7 +159,7 @@ __mock int match_add_root_ports(struct pci_dev *pdev, void *data)
 		creg = cxl_reg_block(pdev, &map);
 
 	port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
-	rc = cxl_add_dport(port, &pdev->dev, port_num, creg);
+	rc = cxl_add_dport(port, &pdev->dev, port_num, creg, true);
 	if (rc) {
 		ctx->error = rc;
 		return rc;
@@ -341,7 +341,7 @@ static int add_host_bridge_dport(struct device *match, void *arg)
 		return 0;
 	}
 
-	rc = cxl_add_dport(root_port, match, uid, ctx.chbcr);
+	rc = cxl_add_dport(root_port, match, uid, ctx.chbcr, false);
 	if (rc) {
 		dev_err(host, "failed to add downstream port: %s\n",
 			dev_name(match));
diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
index 2ad38167796d..9e0d7d5d9298 100644
--- a/drivers/cxl/core/bus.c
+++ b/drivers/cxl/core/bus.c
@@ -26,6 +26,8 @@
 
 static DEFINE_IDA(cxl_port_ida);
 static DECLARE_RWSEM(topology_host_sem);
+static LIST_HEAD(cxl_root_ports);
+static DECLARE_RWSEM(root_port_sem);
 
 static struct device *cxl_topology_host;
 
@@ -326,12 +328,31 @@ struct cxl_port *to_cxl_port(struct device *dev)
 	return container_of(dev, struct cxl_port, dev);
 }
 
+struct cxl_dport *cxl_get_root_dport(struct device *dev)
+{
+	struct cxl_dport *ret = NULL;
+	struct cxl_dport *dport;
+
+	down_read(&root_port_sem);
+	list_for_each_entry(dport, &cxl_root_ports, root_port_link) {
+		if (dport->dport == dev) {
+			ret = dport;
+			break;
+		}
+	}
+
+	up_read(&root_port_sem);
+	return ret;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_get_root_dport, CXL);
+
 static void unregister_port(void *_port)
 {
 	struct cxl_port *port = _port;
 	struct cxl_dport *dport;
 
 	device_lock(&port->dev);
+	down_write(&root_port_sem);
 	list_for_each_entry(dport, &port->dports, list) {
 		char link_name[CXL_TARGET_STRLEN];
 
@@ -339,7 +360,10 @@ static void unregister_port(void *_port)
 			     dport->port_id) >= CXL_TARGET_STRLEN)
 			continue;
 		sysfs_remove_link(&port->dev.kobj, link_name);
+
+		list_del_init(&dport->root_port_link);
 	}
+	up_write(&root_port_sem);
 	device_unlock(&port->dev);
 	device_unregister(&port->dev);
 }
@@ -493,12 +517,13 @@ static int add_dport(struct cxl_port *port, struct cxl_dport *new)
  * @dport_dev: firmware or PCI device representing the dport
  * @port_id: identifier for this dport in a decoder's target list
  * @component_reg_phys: optional location of CXL component registers
+ * @root_port: is this a root port (hostbridge downstream)
  *
  * Note that all allocations and links are undone by cxl_port deletion
  * and release.
  */
 int cxl_add_dport(struct cxl_port *port, struct device *dport_dev, int port_id,
-		  resource_size_t component_reg_phys)
+		  resource_size_t component_reg_phys, bool root_port)
 {
 	char link_name[CXL_TARGET_STRLEN];
 	struct cxl_dport *dport;
@@ -513,6 +538,7 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport_dev, int port_id,
 		return -ENOMEM;
 
 	INIT_LIST_HEAD(&dport->list);
+	INIT_LIST_HEAD(&dport->root_port_link);
 	dport->dport = get_device(dport_dev);
 	dport->port_id = port_id;
 	dport->component_reg_phys = component_reg_phys;
@@ -526,6 +552,12 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport_dev, int port_id,
 	if (rc)
 		goto err;
 
+	if (root_port) {
+		down_write(&root_port_sem);
+		list_add_tail(&dport->root_port_link, &cxl_root_ports);
+		up_write(&root_port_sem);
+	}
+
 	return 0;
 err:
 	cxl_dport_release(dport);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 6fac4826d22b..3962a5e6a950 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -272,6 +272,7 @@ struct cxl_port {
  * @component_reg_phys: downstream port component registers
  * @port: reference to cxl_port that contains this downstream port
  * @list: node for a cxl_port's list of cxl_dport instances
+ * @root_port_link: node for global list of root ports
  */
 struct cxl_dport {
 	struct device *dport;
@@ -279,6 +280,7 @@ struct cxl_dport {
 	resource_size_t component_reg_phys;
 	struct cxl_port *port;
 	struct list_head list;
+	struct list_head root_port_link;
 };
 
 struct cxl_port *to_cxl_port(struct device *dev);
@@ -287,7 +289,8 @@ struct cxl_port *devm_cxl_add_port(struct device *uport,
 				   struct cxl_port *parent_port);
 
 int cxl_add_dport(struct cxl_port *port, struct device *dport, int port_id,
-		  resource_size_t component_reg_phys);
+		  resource_size_t component_reg_phys, bool root_port);
+struct cxl_dport *cxl_get_root_dport(struct device *dev);
 
 struct cxl_decoder *to_cxl_decoder(struct device *dev);
 bool is_root_decoder(struct device *dev);
diff --git a/tools/testing/cxl/mock_acpi.c b/tools/testing/cxl/mock_acpi.c
index 4c8a493ace56..ddefc4345f36 100644
--- a/tools/testing/cxl/mock_acpi.c
+++ b/tools/testing/cxl/mock_acpi.c
@@ -57,7 +57,7 @@ static int match_add_root_port(struct pci_dev *pdev, void *data)
 
 	/* TODO walk DVSEC to find component register base */
 	port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
-	rc = cxl_add_dport(port, &pdev->dev, port_num, CXL_RESOURCE_NONE);
+	rc = cxl_add_dport(port, &pdev->dev, port_num, CXL_RESOURCE_NONE, true);
 	if (rc) {
 		dev_err(dev, "failed to add dport: %s (%d)\n",
 			dev_name(&pdev->dev), rc);
@@ -78,7 +78,7 @@ static int mock_add_root_port(struct platform_device *pdev, void *data)
 	struct device *dev = ctx->dev;
 	int rc;
 
-	rc = cxl_add_dport(port, &pdev->dev, pdev->id, CXL_RESOURCE_NONE);
+	rc = cxl_add_dport(port, &pdev->dev, pdev->id, CXL_RESOURCE_NONE, true);
 	if (rc) {
 		dev_err(dev, "failed to add dport: %s (%d)\n",
 			dev_name(&pdev->dev), rc);
-- 
2.34.0



* [PATCH 16/23] cxl/pci: Cache device DVSEC offset
  2021-11-20  0:02 [PATCH 00/23] Add drivers for CXL ports and mem devices Ben Widawsky
                   ` (14 preceding siblings ...)
  2021-11-20  0:02 ` [PATCH 15/23] cxl/core: Store global list of root ports Ben Widawsky
@ 2021-11-20  0:02 ` Ben Widawsky
  2021-11-22 16:46   ` Jonathan Cameron
  2021-11-20  0:02 ` [PATCH 17/23] cxl: Cache and pass DVSEC ranges Ben Widawsky
                   ` (6 subsequent siblings)
  22 siblings, 1 reply; 133+ messages in thread
From: Ben Widawsky @ 2021-11-20  0:02 UTC (permalink / raw)
  To: linux-cxl, linux-pci
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

The PCIe device DVSEC, defined in the CXL 2.0 spec, section 8.1.3, is
required to be implemented by CXL 2.0 endpoint devices. Since the
information contained within this DVSEC will be critically important for
region configuration, it makes sense to find and cache its offset early.

Since this DVSEC is not strictly required for mailbox functionality,
failure to find this information does not result in the driver failing
to bind.
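
As a rough userspace illustration of what the lookup amounts to, the
sketch below walks the PCIe extended capability list in a config space
snapshot and returns the offset of a matching DVSEC, or 0 if none is
found. It is not the kernel's pci_find_dvsec_capability();
cfg_read32(), cfg_write32() and find_dvsec() are hypothetical helpers
that operate on a plain byte buffer.

```c
#include <assert.h>
#include <stdint.h>

#define CFG_SPACE_SIZE		4096
#define EXT_CAP_START		0x100	/* first extended capability */
#define EXT_CAP_ID_DVSEC	0x23	/* extended capability ID for DVSEC */

/* Read a little-endian dword from a config space snapshot. */
static uint32_t cfg_read32(const uint8_t *cfg, unsigned int off)
{
	assert(off + 4 <= CFG_SPACE_SIZE);
	return (uint32_t)cfg[off] | (uint32_t)cfg[off + 1] << 8 |
	       (uint32_t)cfg[off + 2] << 16 | (uint32_t)cfg[off + 3] << 24;
}

/* Write a little-endian dword; only used to build mock snapshots. */
static void cfg_write32(uint8_t *cfg, unsigned int off, uint32_t val)
{
	assert(off + 4 <= CFG_SPACE_SIZE);
	cfg[off] = val & 0xff;
	cfg[off + 1] = (val >> 8) & 0xff;
	cfg[off + 2] = (val >> 16) & 0xff;
	cfg[off + 3] = (val >> 24) & 0xff;
}

/* Return the offset of the DVSEC matching @vendor and @id, or 0. */
static unsigned int find_dvsec(const uint8_t *cfg, uint16_t vendor,
			       uint16_t id)
{
	unsigned int pos = EXT_CAP_START;
	int ttl = (CFG_SPACE_SIZE - EXT_CAP_START) / 8;	/* bound the walk */

	while (ttl-- > 0) {
		uint32_t hdr;

		if (pos < EXT_CAP_START || pos + 12 > CFG_SPACE_SIZE)
			break;
		hdr = cfg_read32(cfg, pos);
		if (hdr == 0 || hdr == 0xffffffffu)
			break;
		if ((hdr & 0xffff) == EXT_CAP_ID_DVSEC) {
			/* DVSEC header 1: vendor ID in bits 15:0 */
			uint32_t h1 = cfg_read32(cfg, pos + 4);
			/* DVSEC header 2: DVSEC ID in bits 15:0 */
			uint32_t h2 = cfg_read32(cfg, pos + 8);

			if ((h1 & 0xffff) == vendor && (h2 & 0xffff) == id)
				return pos;
		}
		pos = (hdr >> 20) & 0xffc;	/* next capability offset */
	}
	return 0;
}
```

A return value of 0 corresponds to the "Device DVSEC not present" case
that the driver only warns about.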

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/cxlmem.h | 2 ++
 drivers/cxl/pci.c    | 7 +++++++
 2 files changed, 9 insertions(+)

diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 8d96d009ad90..3ef3c652599e 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -98,6 +98,7 @@ struct cxl_mbox_cmd {
  *
  * @dev: The device associated with this CXL state
  * @regs: Parsed register blocks
+ * @device_dvsec: Offset to the PCIe device DVSEC
  * @payload_size: Size of space for payload
  *                (CXL 2.0 8.2.8.4.3 Mailbox Capabilities Register)
  * @lsa_size: Size of Label Storage Area
@@ -125,6 +126,7 @@ struct cxl_dev_state {
 	struct device *dev;
 
 	struct cxl_regs regs;
+	int device_dvsec;
 
 	size_t payload_size;
 	size_t lsa_size;
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index d2c743a31b0c..f3872cbee7f8 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -474,6 +474,13 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (IS_ERR(cxlds))
 		return PTR_ERR(cxlds);
 
+	cxlds->device_dvsec = pci_find_dvsec_capability(pdev,
+							PCI_DVSEC_VENDOR_ID_CXL,
+							CXL_DVSEC_PCIE_DEVICE);
+	if (!cxlds->device_dvsec)
+		dev_warn(&pdev->dev,
+			 "Device DVSEC not present. Expect limited functionality.\n");
+
 	rc = cxl_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
 	if (rc)
 		return rc;
-- 
2.34.0



* [PATCH 17/23] cxl: Cache and pass DVSEC ranges
  2021-11-20  0:02 [PATCH 00/23] Add drivers for CXL ports and mem devices Ben Widawsky
                   ` (15 preceding siblings ...)
  2021-11-20  0:02 ` [PATCH 16/23] cxl/pci: Cache device DVSEC offset Ben Widawsky
@ 2021-11-20  0:02 ` Ben Widawsky
  2021-11-20  4:29     ` kernel test robot
                     ` (2 more replies)
  2021-11-20  0:02 ` [PATCH 18/23] cxl/pci: Implement wait for media active Ben Widawsky
                   ` (5 subsequent siblings)
  22 siblings, 3 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-20  0:02 UTC (permalink / raw)
  To: linux-cxl, linux-pci
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

The CXL 1.1 specification provides a mechanism for mapping an address
space of a CXL device. That functionality is known as a "range" and can
be programmed through the PCIe DVSEC. In addition to this, the
specification defines an active bit which a device will expose through
the same DVSEC to notify system software that memory is initialized and
ready.

While CXL 2.0 introduces a more powerful mechanism called HDM decoders
that are controlled by MMIO behind a PCIe BAR, the spec does allow the
1.1 style mapping to still be present. In such a case, when the CXL
driver takes over, if it were to enable HDM decoding and there was an
actively used range, things would likely blow up, in particular if it
wasn't an identical mapping.

This patch caches the relevant information which the cxl_mem driver will
need to make the proper decision and passes it along.
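
To make the register layout concrete, here is a small userspace sketch
of how a 64-bit range size or base is assembled from its high and low
dwords. It assumes, as the masks added by this patch do, that only bits
31:28 of the low dword carry address bits and that the lower bits hold
flags. range_combine(), range_fill() and struct dvsec_range are
hypothetical names, not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 31:28 of the low dword are address bits; lower bits are flags. */
#define RANGE_LOW_MASK	0xf0000000u
#define MEM_INFO_VALID	(1u << 0)
#define MEM_ACTIVE	(1u << 1)

struct dvsec_range {
	uint64_t base;
	uint64_t size;
};

/* Combine a high/low register pair into one 64-bit value. */
static uint64_t range_combine(uint32_t high, uint32_t low)
{
	return ((uint64_t)high << 32) | (low & RANGE_LOW_MASK);
}

/* Flag helpers for the Size Low register. */
static int mem_info_valid(uint32_t size_low)
{
	return !!(size_low & MEM_INFO_VALID);
}

static int mem_active(uint32_t size_low)
{
	return !!(size_low & MEM_ACTIVE);
}

/* Fill @r from the four raw register values of one range. */
static void range_fill(struct dvsec_range *r, uint32_t size_hi,
		       uint32_t size_lo, uint32_t base_hi, uint32_t base_lo)
{
	assert(r);
	r->size = range_combine(size_hi, size_lo);
	r->base = range_combine(base_hi, base_lo);
}
```

Masking the low dword keeps the valid and active flags from leaking
into the cached size and base.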

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/cxlmem.h |  19 +++++++
 drivers/cxl/pci.c    | 126 +++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/pci.h    |  13 +++++
 3 files changed, 158 insertions(+)

diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 3ef3c652599e..eac5528ccaae 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -89,6 +89,22 @@ struct cxl_mbox_cmd {
  */
 #define CXL_CAPACITY_MULTIPLIER SZ_256M
 
+/**
+ * struct cxl_endpoint_dvsec_info - Cached DVSEC info
+ * @mem_enabled: cached value of mem_enabled in the DVSEC, PCIE_DEVICE
+ * @ranges: Number of HDM ranges this device contains.
+ * @range.base: cached value of the range base in the DVSEC, PCIE_DEVICE
+ * @range.size: cached value of the range size in the DVSEC, PCIE_DEVICE
+ */
+struct cxl_endpoint_dvsec_info {
+	bool mem_enabled;
+	int ranges;
+	struct {
+		u64 base;
+		u64 size;
+	} range[2];
+};
+
 /**
  * struct cxl_dev_state - The driver device state
  *
@@ -117,6 +133,7 @@ struct cxl_mbox_cmd {
  * @active_persistent_bytes: sum of hard + soft persistent
  * @next_volatile_bytes: volatile capacity change pending device reset
  * @next_persistent_bytes: persistent capacity change pending device reset
+ * @info: Cached DVSEC information about the device.
  * @mbox_send: @dev specific transport for transmitting mailbox commands
  *
  * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
@@ -147,6 +164,8 @@ struct cxl_dev_state {
 	u64 next_volatile_bytes;
 	u64 next_persistent_bytes;
 
+	struct cxl_endpoint_dvsec_info *info;
+
 	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
 };
 
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index f3872cbee7f8..b3f46045bf3e 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -452,8 +452,126 @@ static int cxl_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
 	return rc;
 }
 
+#define CDPD(cxlds, which)                                                     \
+	cxlds->device_dvsec + CXL_DVSEC_PCIE_DEVICE_##which##_OFFSET
+
+#define CDPDR(cxlds, which, sorb, lohi)                                        \
+	cxlds->device_dvsec +                                                  \
+		CXL_DVSEC_PCIE_DEVICE_RANGE_##sorb##_##lohi##_OFFSET(which)
+
+static int wait_for_valid(struct cxl_dev_state *cxlds)
+{
+	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
+	const unsigned long timeout = jiffies + HZ;
+	bool valid;
+
+	do {
+		u64 size;
+		u32 temp;
+		int rc;
+
+		rc = pci_read_config_dword(pdev, CDPDR(cxlds, 0, SIZE, HIGH),
+					   &temp);
+		if (rc)
+			return -ENXIO;
+		size = (u64)temp << 32;
+
+		rc = pci_read_config_dword(pdev, CDPDR(cxlds, 0, SIZE, LOW),
+					   &temp);
+		if (rc)
+			return -ENXIO;
+		size |= temp & CXL_DVSEC_PCIE_DEVICE_MEM_SIZE_LOW_MASK;
+
+		/*
+		 * Memory_Info_Valid: When set, indicates that the CXL Range 1
+		 * Size high and Size Low registers are valid. Must be set
+		 * within 1 second of deassertion of reset to CXL device.
+		 */
+		valid = FIELD_GET(CXL_DVSEC_PCIE_DEVICE_MEM_INFO_VALID, temp);
+		if (valid)
+			break;
+		cpu_relax();
+	} while (!time_after(jiffies, timeout));
+
+	return valid ? 0 : -ETIMEDOUT;
+}
+
+static struct cxl_endpoint_dvsec_info *dvsec_ranges(struct cxl_dev_state *cxlds)
+{
+	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
+	struct cxl_endpoint_dvsec_info *info;
+	int hdm_count, rc, i;
+	u16 cap, ctrl;
+
+	rc = pci_read_config_word(pdev, CDPD(cxlds, CAP), &cap);
+	if (rc)
+		return ERR_PTR(-ENXIO);
+	rc = pci_read_config_word(pdev, CDPD(cxlds, CTRL), &ctrl);
+	if (rc)
+		return ERR_PTR(-ENXIO);
+
+	if (!(cap & CXL_DVSEC_PCIE_DEVICE_MEM_CAPABLE))
+		return ERR_PTR(-ENODEV);
+
+	/*
+	 * It is not allowed by spec for MEM.capable to be set and have 0 HDM
+	 * decoders. Therefore, the device is not considered CXL.mem capable.
+	 */
+	hdm_count = FIELD_GET(CXL_DVSEC_PCIE_DEVICE_HDM_COUNT_MASK, cap);
+	if (!hdm_count || hdm_count > 2)
+		return ERR_PTR(-EINVAL);
+
+	rc = wait_for_valid(cxlds);
+	if (rc)
+		return ERR_PTR(rc);
+
+	info = devm_kzalloc(cxlds->dev, sizeof(*info), GFP_KERNEL);
+	if (!info)
+		return ERR_PTR(-ENOMEM);
+
+	info->mem_enabled = FIELD_GET(CXL_DVSEC_PCIE_DEVICE_MEM_ENABLE, ctrl);
+
+	for (i = 0; i < hdm_count; i++) {
+		u64 base, size;
+		u32 temp;
+
+		rc = pci_read_config_dword(pdev, CDPDR(cxlds, i, SIZE, HIGH),
+					   &temp);
+		if (rc)
+			continue;
+		size = (u64)temp << 32;
+
+		rc = pci_read_config_dword(pdev, CDPDR(cxlds, i, SIZE, LOW),
+					   &temp);
+		if (rc)
+			continue;
+		size |= temp & CXL_DVSEC_PCIE_DEVICE_MEM_SIZE_LOW_MASK;
+
+		rc = pci_read_config_dword(pdev, CDPDR(cxlds, i, BASE, HIGH),
+					   &temp);
+		if (rc)
+			continue;
+		base = (u64)temp << 32;
+
+		rc = pci_read_config_dword(pdev, CDPDR(cxlds, i, BASE, LOW),
+					   &temp);
+		if (rc)
+			continue;
+		base |= temp & CXL_DVSEC_PCIE_DEVICE_MEM_BASE_LOW_MASK;
+
+		info->range[i].base = base;
+		info->range[i].size = size;
+		info->ranges++;
+	}
+
+	return info;
+}
+
+#undef CDPDR
+
 static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 {
+	struct cxl_endpoint_dvsec_info *info;
 	struct cxl_register_map map;
 	struct cxl_memdev *cxlmd;
 	struct cxl_dev_state *cxlds;
@@ -505,6 +623,14 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (rc)
 		return rc;
 
+	info = dvsec_ranges(cxlds);
+	if (IS_ERR(info))
+		dev_err(&pdev->dev,
+			"Failed to get DVSEC range information (%ld)\n",
+			PTR_ERR(info));
+	else
+		cxlds->info = info;
+
 	cxlmd = devm_cxl_add_memdev(cxlds);
 	if (IS_ERR(cxlmd))
 		return PTR_ERR(cxlmd);
diff --git a/drivers/cxl/pci.h b/drivers/cxl/pci.h
index a4b506bb37d1..2a48cd65bf59 100644
--- a/drivers/cxl/pci.h
+++ b/drivers/cxl/pci.h
@@ -15,6 +15,19 @@
 
 /* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
 #define CXL_DVSEC_PCIE_DEVICE					0
+#define   CXL_DVSEC_PCIE_DEVICE_CAP_OFFSET			0xA
+#define     CXL_DVSEC_PCIE_DEVICE_MEM_CAPABLE			BIT(2)
+#define     CXL_DVSEC_PCIE_DEVICE_HDM_COUNT_MASK		GENMASK(5, 4)
+#define   CXL_DVSEC_PCIE_DEVICE_CTRL_OFFSET			0xC
+#define     CXL_DVSEC_PCIE_DEVICE_MEM_ENABLE			BIT(2)
+#define   CXL_DVSEC_PCIE_DEVICE_RANGE_SIZE_HIGH_OFFSET(i)	(0x18 + ((i) * 0x10))
+#define   CXL_DVSEC_PCIE_DEVICE_RANGE_SIZE_LOW_OFFSET(i)	(0x1C + ((i) * 0x10))
+#define     CXL_DVSEC_PCIE_DEVICE_MEM_INFO_VALID		BIT(0)
+#define     CXL_DVSEC_PCIE_DEVICE_MEM_ACTIVE			BIT(1)
+#define     CXL_DVSEC_PCIE_DEVICE_MEM_SIZE_LOW_MASK		GENMASK(31, 28)
+#define   CXL_DVSEC_PCIE_DEVICE_RANGE_BASE_HIGH_OFFSET(i)	(0x20 + ((i) * 0x10))
+#define   CXL_DVSEC_PCIE_DEVICE_RANGE_BASE_LOW_OFFSET(i)	(0x24 + ((i) * 0x10))
+#define     CXL_DVSEC_PCIE_DEVICE_MEM_BASE_LOW_MASK		GENMASK(31, 28)
 
 /* CXL 2.0 8.1.4: Non-CXL Function Map DVSEC */
 #define CXL_DVSEC_FUNCTION_MAP					2
-- 
2.34.0



* [PATCH 18/23] cxl/pci: Implement wait for media active
  2021-11-20  0:02 [PATCH 00/23] Add drivers for CXL ports and mem devices Ben Widawsky
                   ` (16 preceding siblings ...)
  2021-11-20  0:02 ` [PATCH 17/23] cxl: Cache and pass DVSEC ranges Ben Widawsky
@ 2021-11-20  0:02 ` Ben Widawsky
  2021-11-22 17:03   ` Jonathan Cameron
  2021-11-20  0:02 ` [PATCH 19/23] cxl/pci: Store component register base in cxlds Ben Widawsky
                   ` (4 subsequent siblings)
  22 siblings, 1 reply; 133+ messages in thread
From: Ben Widawsky @ 2021-11-20  0:02 UTC (permalink / raw)
  To: linux-cxl, linux-pci
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

CXL 2.0 8.1.3.8.2 defines "Memory_Active: When set, indicates that the
CXL Range 1 memory is fully initialized and available for software use.
Must be set within Range 1.Memory_Active_Timeout of deassertion of reset
to CXL device if CXL.mem HwInit Mode=1". The CXL* Type 3 Memory Device
Software Guide (Revision 1.0) further describes the need to check this
bit before using HDM.

Unfortunately, Memory_Active can take quite a long time depending on
media size (up to 256s per 2.0 spec). Since the cxl_pci driver doesn't
care about this, a callback is exported as part of driver state for use
by drivers that do care.
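
The shape of such a wait can be sketched in userspace as a bounded poll
loop. This is only an illustration under simplifying assumptions: the
timeout is expressed as a poll count rather than wall-clock time, and
read_size_low_fn, poll_mem_active() and mock_read() are hypothetical
names rather than driver code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MEM_INFO_VALID	(1u << 0)
#define MEM_ACTIVE	(1u << 1)

/* Hypothetical accessor for the Range 1 Size Low register. */
typedef int (*read_size_low_fn)(void *ctx, uint32_t *val);

/*
 * Poll until Memory_Active is set, giving up after @max_polls reads.
 * Returns 0 on success, -ENXIO on a read failure, -ETIMEDOUT otherwise.
 */
static int poll_mem_active(read_size_low_fn read_reg, void *ctx,
			   unsigned int max_polls)
{
	unsigned int i;

	assert(read_reg);
	for (i = 0; i < max_polls; i++) {
		uint32_t val;

		if (read_reg(ctx, &val))
			return -ENXIO;
		if (val & MEM_ACTIVE)
			return 0;
		/* a real caller would sleep here between polls */
	}
	return -ETIMEDOUT;
}

/* Mock device: reports Memory_Active once the counter in @ctx hits 0. */
static int mock_read(void *ctx, uint32_t *val)
{
	unsigned int *remaining = ctx;

	*val = MEM_INFO_VALID;
	if (*remaining == 0)
		*val |= MEM_ACTIVE;
	else
		(*remaining)--;
	return 0;
}
```

Keeping the register read behind a callback mirrors the intent of the
patch: the wait logic lives with whoever knows how to reach the DVSEC,
and callers only see success, an I/O error, or a timeout.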

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
This patch did not exist in RFCv2
---
 drivers/cxl/cxlmem.h |  1 +
 drivers/cxl/pci.c    | 56 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 57 insertions(+)

diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index eac5528ccaae..a9424dd4e5c3 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -167,6 +167,7 @@ struct cxl_dev_state {
 	struct cxl_endpoint_dvsec_info *info;
 
 	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
+	int (*wait_media_ready)(struct cxl_dev_state *cxlds);
 };
 
 enum cxl_opcode {
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index b3f46045bf3e..f1a68bfe5f77 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -496,6 +496,60 @@ static int wait_for_valid(struct cxl_dev_state *cxlds)
 	return valid ? 0 : -ETIMEDOUT;
 }
 
+/*
+ * Implements Figure 43 of the CXL Type 3 Memory Device Software Guide. Waits a
+ * full 256s no matter what the device reports.
+ */
+static int wait_for_media_ready(struct cxl_dev_state *cxlds)
+{
+	const unsigned long timeout = jiffies + (256 * HZ);
+	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
+	u64 md_status;
+	bool active;
+	int rc;
+
+	rc = wait_for_valid(cxlds);
+	if (rc)
+		return rc;
+
+	do {
+		u64 size;
+		u32 temp;
+		int rc;
+
+		rc = pci_read_config_dword(pdev, CDPDR(cxlds, 0, SIZE, HIGH),
+					   &temp);
+		if (rc)
+			return -ENXIO;
+		size = (u64)temp << 32;
+
+		rc = pci_read_config_dword(pdev, CDPDR(cxlds, 0, SIZE, LOW),
+					   &temp);
+		if (rc)
+			return -ENXIO;
+		size |= temp & CXL_DVSEC_PCIE_DEVICE_MEM_SIZE_LOW_MASK;
+
+		active = FIELD_GET(CXL_DVSEC_PCIE_DEVICE_MEM_ACTIVE, temp);
+		if (active)
+			break;
+		cpu_relax();
+		mdelay(100);
+	} while (!time_after(jiffies, timeout));
+
+	if (!active)
+		return -ETIMEDOUT;
+
+	rc = check_device_status(cxlds);
+	if (rc)
+		return rc;
+
+	md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
+	if (!CXLMDEV_READY(md_status))
+		return -EIO;
+
+	return 0;
+}
+
 static struct cxl_endpoint_dvsec_info *dvsec_ranges(struct cxl_dev_state *cxlds)
 {
 	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
@@ -598,6 +652,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (!cxlds->device_dvsec)
 		dev_warn(&pdev->dev,
 			 "Device DVSEC not present. Expect limited functionality.\n");
+	else
+		cxlds->wait_media_ready = wait_for_media_ready;
 
 	rc = cxl_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
 	if (rc)
-- 
2.34.0



* [PATCH 19/23] cxl/pci: Store component register base in cxlds
  2021-11-20  0:02 [PATCH 00/23] Add drivers for CXL ports and mem devices Ben Widawsky
                   ` (17 preceding siblings ...)
  2021-11-20  0:02 ` [PATCH 18/23] cxl/pci: Implement wait for media active Ben Widawsky
@ 2021-11-20  0:02 ` Ben Widawsky
  2021-11-20  7:28     ` kernel test robot
  2021-11-22 17:11   ` Jonathan Cameron
  2021-11-20  0:02 ` [PATCH 20/23] cxl/port: Introduce a port driver Ben Widawsky
                   ` (3 subsequent siblings)
  22 siblings, 2 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-20  0:02 UTC (permalink / raw)
  To: linux-cxl, linux-pci
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

The component register base address is enumerated and stored in the
cxl_dev_state structure for use by the endpoint driver.

Component register base addresses are obtained through PCI mechanisms.
As such it makes most sense for the cxl_pci driver to obtain that
address. In order to reuse the port driver for enumerating decoder
resources for an endpoint, it is desirable to be able to add the
endpoint as a port. The endpoint does know of the cxlds and can pull the
component register base from there and pass it along to port creation.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
Changes since RFCv2:
This patch was originally named, "cxl/core: Store component register
base for memdevs". It plumbed the component registers through memdev
creation. After more work, it became apparent we needed to plumb other
stuff from the pci driver, so going forward, cxlds will just be
referenced by the cxl_mem driver. This also allows us to ignore the
change needed to cxl_test.

- Rework patch to store the base in cxlds
- Remove defunct comment (Dan)
---
 drivers/cxl/cxlmem.h |  2 ++
 drivers/cxl/pci.c    | 11 +++++++++++
 2 files changed, 13 insertions(+)

diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index a9424dd4e5c3..b1d753541f4e 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -134,6 +134,7 @@ struct cxl_endpoint_dvsec_info {
  * @next_volatile_bytes: volatile capacity change pending device reset
  * @next_persistent_bytes: persistent capacity change pending device reset
  * @info: Cached DVSEC information about the device.
+ * @component_reg_phys: register base of component registers
  * @mbox_send: @dev specific transport for transmitting mailbox commands
  *
  * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
@@ -165,6 +166,7 @@ struct cxl_dev_state {
 	u64 next_persistent_bytes;
 
 	struct cxl_endpoint_dvsec_info *info;
+	resource_size_t component_reg_phys;
 
 	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
 	int (*wait_media_ready)(struct cxl_dev_state *cxlds);
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index f1a68bfe5f77..a8e375950514 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -663,6 +663,17 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (rc)
 		return rc;
 
+	/*
+	 * If the component registers can't be found, the cxl_pci driver may
+	 * still be useful for management functions so don't return an error.
+	 */
+	cxlds->component_reg_phys = CXL_RESOURCE_NONE;
+	rc = cxl_setup_regs(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
+	if (rc)
+		dev_warn(&pdev->dev, "No component registers (%d)\n", rc);
+	else
+		cxlds->component_reg_phys = cxl_reg_block(pdev, &map);
+
 	rc = cxl_pci_setup_mailbox(cxlds);
 	if (rc)
 		return rc;
-- 
2.34.0



* [PATCH 20/23] cxl/port: Introduce a port driver
  2021-11-20  0:02 [PATCH 00/23] Add drivers for CXL ports and mem devices Ben Widawsky
                   ` (18 preceding siblings ...)
  2021-11-20  0:02 ` [PATCH 19/23] cxl/pci: Store component register base in cxlds Ben Widawsky
@ 2021-11-20  0:02 ` Ben Widawsky
  2021-11-20  3:14     ` kernel test robot
                     ` (3 more replies)
  2021-11-20  0:02 ` [PATCH 21/23] cxl: Unify port enumeration for decoders Ben Widawsky
                   ` (2 subsequent siblings)
  22 siblings, 4 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-20  0:02 UTC (permalink / raw)
  To: linux-cxl, linux-pci
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

The CXL port driver is responsible for managing the decoder resources
contained within the port. It will also provide APIs that other drivers
will consume for managing these resources.

There are 4 types of ports in a system:
1. Platform port. This is a non-programmable entity. Such a port is
   named rootX. It is enumerated by cxl_acpi in an ACPI based system.
2. Hostbridge port. This port's register access is defined in a platform
   specific way (CHBS for ACPI platforms). It has n downstream ports,
   each of which is known as a CXL 2.0 root port. Once the platform
   specific mechanism to get the offset to the registers is obtained it
   operates just like other CXL components. The enumeration of this
   component is started by cxl_acpi and completed by cxl_port.
3. Switch port. A switch port is similar to a hostbridge port except
   register access is defined in the CXL specification in a platform
   agnostic way. The downstream ports for a switch are simply known as
   downstream ports. The enumeration of these is entirely contained
   within cxl_port.
4. Endpoint port. Endpoint ports are similar to switch ports with the
   exception that they have no downstream ports, only the underlying
   media on the device. The enumeration of these is started by cxl_pci,
   and completed by cxl_port.
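
The division of labor in the list above can be summarized as a small
lookup table. This is a userspace sketch for orientation only, assuming
an ACPI based system; enum port_type, struct port_enum and the helpers
are hypothetical and do not exist in the driver.

```c
#include <assert.h>
#include <string.h>

enum port_type {
	PORT_PLATFORM,
	PORT_HOSTBRIDGE,
	PORT_SWITCH,
	PORT_ENDPOINT,
	PORT_TYPE_MAX,
};

/* Which driver starts and which driver completes enumeration. */
struct port_enum {
	const char *started_by;
	const char *completed_by;
};

static const struct port_enum port_enum_table[PORT_TYPE_MAX] = {
	[PORT_PLATFORM]   = { "cxl_acpi", "cxl_acpi" },
	[PORT_HOSTBRIDGE] = { "cxl_acpi", "cxl_port" },
	[PORT_SWITCH]     = { "cxl_port", "cxl_port" },
	[PORT_ENDPOINT]   = { "cxl_pci",  "cxl_port" },
};

static const struct port_enum *port_enum_lookup(enum port_type type)
{
	assert((unsigned int)type < PORT_TYPE_MAX);
	return &port_enum_table[type];
}

/* True when enumeration is handed from one driver to another. */
static int port_enum_is_split(enum port_type type)
{
	const struct port_enum *pe = port_enum_lookup(type);

	return strcmp(pe->started_by, pe->completed_by) != 0;
}
```

Hostbridge and endpoint ports are the two cases where enumeration
starts in one driver and finishes in cxl_port.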

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

---
Changes since RFCv2:
- Reword commit message tense (Dan)
- Reword commit message
- Drop SOFTDEP since it's not needed yet (Dan)
- Add CONFIG_CXL_PORT (Dan)
- s/CXL_DECODER_F_EN/CXL_DECODER_F_ENABLE (Dan)
- rename cxl_hdm_decoder_ functions to "to_" (Dan)
- remove useless inline (Dan)
- Check endpoint decoder based on dport list instead of driver id (Dan)
- Use range instead of resource per dependent patch change
- Use clever union packing for target list (Dan)
- Only check NULL from devm_cxl_iomap_block (Dan)
- Properly parent the created cxl decoders
- Move bus rescanning from cxl_acpi to here (Dan)
- Remove references to "CFMWS" in core (Dan)
- Use macro to help keep within 80 character lines
---
 .../driver-api/cxl/memory-devices.rst         |   5 +
 drivers/cxl/Kconfig                           |  22 ++
 drivers/cxl/Makefile                          |   2 +
 drivers/cxl/core/bus.c                        |  67 ++++
 drivers/cxl/core/regs.c                       |   6 +-
 drivers/cxl/cxl.h                             |  34 +-
 drivers/cxl/port.c                            | 323 ++++++++++++++++++
 7 files changed, 450 insertions(+), 9 deletions(-)
 create mode 100644 drivers/cxl/port.c

diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst
index 3b8f41395f6b..fbf0393cdddc 100644
--- a/Documentation/driver-api/cxl/memory-devices.rst
+++ b/Documentation/driver-api/cxl/memory-devices.rst
@@ -28,6 +28,11 @@ CXL Memory Device
 .. kernel-doc:: drivers/cxl/pci.c
    :internal:
 
+CXL Port
+--------
+.. kernel-doc:: drivers/cxl/port.c
+   :doc: cxl port
+
 CXL Core
 --------
 .. kernel-doc:: drivers/cxl/cxl.h
diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index ef05e96f8f97..3aeb33bba5a3 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -77,4 +77,26 @@ config CXL_PMEM
 	  provisioning the persistent memory capacity of CXL memory expanders.
 
 	  If unsure say 'm'.
+
+config CXL_MEM
+	tristate "CXL.mem: Memory Devices"
+	select CXL_PORT
+	depends on CXL_PCI
+	default CXL_BUS
+	help
+	  The CXL.mem protocol allows a device to act as a provider of "System
+	  RAM" and/or "Persistent Memory" that is fully coherent as if the
+	  memory was attached to the typical CPU memory controller.  This is
+	  known as HDM "Host-managed Device Memory".
+
+	  Say 'y/m' to enable a driver that will attach to CXL.mem devices for
+	  memory expansion and control of HDM. See Chapter 9.13 in the CXL 2.0
+	  specification for a detailed description of HDM.
+
+	  If unsure say 'm'.
+
+
+config CXL_PORT
+	tristate
+
 endif
diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
index cf07ae6cea17..56fcac2323cb 100644
--- a/drivers/cxl/Makefile
+++ b/drivers/cxl/Makefile
@@ -3,7 +3,9 @@ obj-$(CONFIG_CXL_BUS) += core/
 obj-$(CONFIG_CXL_PCI) += cxl_pci.o
 obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
 obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
+obj-$(CONFIG_CXL_PORT) += cxl_port.o
 
 cxl_pci-y := pci.o
 cxl_acpi-y := acpi.o
 cxl_pmem-y := pmem.o
+cxl_port-y := port.o
diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
index 9e0d7d5d9298..46a06cfe79bd 100644
--- a/drivers/cxl/core/bus.c
+++ b/drivers/cxl/core/bus.c
@@ -31,6 +31,8 @@ static DECLARE_RWSEM(root_port_sem);
 
 static struct device *cxl_topology_host;
 
+static bool is_cxl_decoder(struct device *dev);
+
 int cxl_register_topology_host(struct device *host)
 {
 	down_write(&topology_host_sem);
@@ -75,6 +77,45 @@ static void put_cxl_topology_host(struct device *dev)
 	up_read(&topology_host_sem);
 }
 
+static int decoder_match(struct device *dev, void *data)
+{
+	struct resource *theirs = (struct resource *)data;
+	struct cxl_decoder *cxld;
+
+	if (!is_cxl_decoder(dev))
+		return 0;
+
+	cxld = to_cxl_decoder(dev);
+	if (theirs->start <= cxld->decoder_range.start &&
+	    theirs->end >= cxld->decoder_range.end)
+		return 1;
+
+	return 0;
+}
+
+static struct cxl_decoder *cxl_find_root_decoder(resource_size_t base,
+						 resource_size_t size)
+{
+	struct cxl_decoder *cxld = NULL;
+	struct device *cxldd;
+	struct device *host;
+	struct resource res = (struct resource){
+		.start = base,
+		.end = base + size - 1,
+	};
+
+	host = get_cxl_topology_host();
+	if (!host)
+		return NULL;
+
+	cxldd = device_find_child(host, &res, decoder_match);
+	if (cxldd)
+		cxld = to_cxl_decoder(cxldd);
+
+	put_cxl_topology_host(host);
+	return cxld;
+}
+
 static ssize_t devtype_show(struct device *dev, struct device_attribute *attr,
 			    char *buf)
 {
@@ -280,6 +321,11 @@ bool is_root_decoder(struct device *dev)
 }
 EXPORT_SYMBOL_NS_GPL(is_root_decoder, CXL);
 
+static bool is_cxl_decoder(struct device *dev)
+{
+	return dev->type->release == cxl_decoder_release;
+}
+
 struct cxl_decoder *to_cxl_decoder(struct device *dev)
 {
 	if (dev_WARN_ONCE(dev, dev->type->release != cxl_decoder_release,
@@ -327,6 +373,7 @@ struct cxl_port *to_cxl_port(struct device *dev)
 		return NULL;
 	return container_of(dev, struct cxl_port, dev);
 }
+EXPORT_SYMBOL_NS_GPL(to_cxl_port, CXL);
 
 struct cxl_dport *cxl_get_root_dport(struct device *dev)
 {
@@ -735,6 +782,24 @@ EXPORT_SYMBOL_NS_GPL(cxl_decoder_add, CXL);
 
 static void cxld_unregister(void *dev)
 {
+	struct cxl_decoder *plat_decoder, *cxld = to_cxl_decoder(dev);
+	resource_size_t base, size;
+
+	if (is_root_decoder(dev)) {
+		device_unregister(dev);
+		return;
+	}
+
+	base = cxld->decoder_range.start;
+	size = range_len(&cxld->decoder_range);
+
+	if (size) {
+		plat_decoder = cxl_find_root_decoder(base, size);
+		if (plat_decoder)
+			__release_region(&plat_decoder->platform_res, base,
+					 size);
+	}
+
 	device_unregister(dev);
 }
 
@@ -789,6 +854,8 @@ static int cxl_device_id(struct device *dev)
 		return CXL_DEVICE_NVDIMM_BRIDGE;
 	if (dev->type == &cxl_nvdimm_type)
 		return CXL_DEVICE_NVDIMM;
+	if (dev->type == &cxl_port_type)
+		return CXL_DEVICE_PORT;
 	return 0;
 }
 
diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
index 41a0245867ea..f191b0c995a7 100644
--- a/drivers/cxl/core/regs.c
+++ b/drivers/cxl/core/regs.c
@@ -159,9 +159,8 @@ void cxl_probe_device_regs(struct device *dev, void __iomem *base,
 }
 EXPORT_SYMBOL_NS_GPL(cxl_probe_device_regs, CXL);
 
-static void __iomem *devm_cxl_iomap_block(struct device *dev,
-					  resource_size_t addr,
-					  resource_size_t length)
+void __iomem *devm_cxl_iomap_block(struct device *dev, resource_size_t addr,
+				   resource_size_t length)
 {
 	void __iomem *ret_val;
 	struct resource *res;
@@ -180,6 +179,7 @@ static void __iomem *devm_cxl_iomap_block(struct device *dev,
 
 	return ret_val;
 }
+EXPORT_SYMBOL_NS_GPL(devm_cxl_iomap_block, CXL);
 
 int cxl_map_component_regs(struct pci_dev *pdev,
 			   struct cxl_component_regs *regs,
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 3962a5e6a950..24fa16157d5e 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -17,6 +17,9 @@
  * (port-driver, region-driver, nvdimm object-drivers... etc).
  */
 
+/* CXL 2.0 8.2.4 CXL Component Register Layout and Definition */
+#define CXL_COMPONENT_REG_BLOCK_SIZE SZ_64K
+
 /* CXL 2.0 8.2.5 CXL.cache and CXL.mem Registers*/
 #define CXL_CM_OFFSET 0x1000
 #define CXL_CM_CAP_HDR_OFFSET 0x0
@@ -36,11 +39,22 @@
 #define CXL_HDM_DECODER_CAP_OFFSET 0x0
 #define   CXL_HDM_DECODER_COUNT_MASK GENMASK(3, 0)
 #define   CXL_HDM_DECODER_TARGET_COUNT_MASK GENMASK(7, 4)
-#define CXL_HDM_DECODER0_BASE_LOW_OFFSET 0x10
-#define CXL_HDM_DECODER0_BASE_HIGH_OFFSET 0x14
-#define CXL_HDM_DECODER0_SIZE_LOW_OFFSET 0x18
-#define CXL_HDM_DECODER0_SIZE_HIGH_OFFSET 0x1c
-#define CXL_HDM_DECODER0_CTRL_OFFSET 0x20
+#define   CXL_HDM_DECODER_INTERLEAVE_11_8 BIT(8)
+#define   CXL_HDM_DECODER_INTERLEAVE_14_12 BIT(9)
+#define CXL_HDM_DECODER_CTRL_OFFSET 0x4
+#define   CXL_HDM_DECODER_ENABLE BIT(1)
+#define CXL_HDM_DECODER0_BASE_LOW_OFFSET(i) (0x20 * (i) + 0x10)
+#define CXL_HDM_DECODER0_BASE_HIGH_OFFSET(i) (0x20 * (i) + 0x14)
+#define CXL_HDM_DECODER0_SIZE_LOW_OFFSET(i) (0x20 * (i) + 0x18)
+#define CXL_HDM_DECODER0_SIZE_HIGH_OFFSET(i) (0x20 * (i) + 0x1c)
+#define CXL_HDM_DECODER0_CTRL_OFFSET(i) (0x20 * (i) + 0x20)
+#define   CXL_HDM_DECODER0_CTRL_IG_MASK GENMASK(3, 0)
+#define   CXL_HDM_DECODER0_CTRL_IW_MASK GENMASK(7, 4)
+#define   CXL_HDM_DECODER0_CTRL_COMMIT BIT(9)
+#define   CXL_HDM_DECODER0_CTRL_COMMITTED BIT(10)
+#define   CXL_HDM_DECODER0_CTRL_TYPE BIT(12)
+#define CXL_HDM_DECODER0_TL_LOW(i) (0x20 * (i) + 0x24)
+#define CXL_HDM_DECODER0_TL_HIGH(i) (0x20 * (i) + 0x28)
 
 static inline int cxl_hdm_decoder_count(u32 cap_hdr)
 {
@@ -148,6 +162,8 @@ int cxl_map_device_regs(struct pci_dev *pdev,
 enum cxl_regloc_type;
 int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
 		      struct cxl_register_map *map);
+void __iomem *devm_cxl_iomap_block(struct device *dev, resource_size_t addr,
+				   resource_size_t length);
 
 #define CXL_RESOURCE_NONE ((resource_size_t) -1)
 #define CXL_TARGET_STRLEN 20
@@ -165,7 +181,8 @@ void cxl_unregister_topology_host(struct device *host);
 #define CXL_DECODER_F_TYPE2 BIT(2)
 #define CXL_DECODER_F_TYPE3 BIT(3)
 #define CXL_DECODER_F_LOCK  BIT(4)
-#define CXL_DECODER_F_MASK  GENMASK(4, 0)
+#define CXL_DECODER_F_ENABLE    BIT(5)
+#define CXL_DECODER_F_MASK  GENMASK(5, 0)
 
 enum cxl_decoder_type {
        CXL_DECODER_ACCELERATOR = 2,
@@ -255,6 +272,8 @@ struct cxl_walk_context {
  * @dports: cxl_dport instances referenced by decoders
  * @decoder_ida: allocator for decoder ids
  * @component_reg_phys: component register capability base address (optional)
+ * @rescan_work: worker object for bus rescans after port additions
+ * @data: opaque data with driver specific usage
  */
 struct cxl_port {
 	struct device dev;
@@ -263,6 +282,8 @@ struct cxl_port {
 	struct list_head dports;
 	struct ida decoder_ida;
 	resource_size_t component_reg_phys;
+	struct work_struct rescan_work;
+	void *data;
 };
 
 /**
@@ -325,6 +346,7 @@ void cxl_driver_unregister(struct cxl_driver *cxl_drv);
 
 #define CXL_DEVICE_NVDIMM_BRIDGE	1
 #define CXL_DEVICE_NVDIMM		2
+#define CXL_DEVICE_PORT			3
 
 #define MODULE_ALIAS_CXL(type) MODULE_ALIAS("cxl:t" __stringify(type) "*")
 #define CXL_MODALIAS_FMT "cxl:t%d"
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
new file mode 100644
index 000000000000..3c03131517af
--- /dev/null
+++ b/drivers/cxl/port.c
@@ -0,0 +1,323 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+
+#include "cxlmem.h"
+
+/**
+ * DOC: cxl port
+ *
+ * The port driver implements the set of functionality needed to allow full
+ * decoder enumeration and routing. A CXL port is an abstraction of a CXL
+ * component that implements some amount of CXL decoding of CXL.mem traffic.
+ * As of the CXL 2.0 spec, this includes:
+ *
+ *	.. list-table:: CXL Components w/ Ports
+ *		:widths: 25 25 50
+ *		:header-rows: 1
+ *
+ *		* - component
+ *		  - upstream
+ *		  - downstream
+ *		* - Hostbridge
+ *		  - ACPI0016
+ *		  - root port
+ *		* - Switch
+ *		  - Switch Upstream Port
+ *		  - Switch Downstream Port
+ *		* - Endpoint
+ *		  - Endpoint Port
+ *		  - N/A
+ *
+ * The primary service this driver provides is enumerating HDM decoders and
+ * presenting APIs to other drivers to utilize the decoders.
+ */
+
+static struct workqueue_struct *cxl_port_wq;
+
+struct cxl_port_data {
+	struct cxl_component_regs regs;
+
+	struct port_caps {
+		unsigned int count;
+		unsigned int tc;
+		unsigned int interleave11_8;
+		unsigned int interleave14_12;
+	} caps;
+};
+
+static inline int to_interleave_granularity(u32 ctrl)
+{
+	int val = FIELD_GET(CXL_HDM_DECODER0_CTRL_IG_MASK, ctrl);
+
+	return 256 << val;
+}
+
+static inline int to_interleave_ways(u32 ctrl)
+{
+	int val = FIELD_GET(CXL_HDM_DECODER0_CTRL_IW_MASK, ctrl);
+
+	return 1 << val;
+}
+
+static void get_caps(struct cxl_port *port, struct cxl_port_data *cpd)
+{
+	void __iomem *hdm_decoder = cpd->regs.hdm_decoder;
+	struct port_caps *caps = &cpd->caps;
+	u32 hdm_cap;
+
+	hdm_cap = readl(hdm_decoder + CXL_HDM_DECODER_CAP_OFFSET);
+
+	caps->count = cxl_hdm_decoder_count(hdm_cap);
+	caps->tc = FIELD_GET(CXL_HDM_DECODER_TARGET_COUNT_MASK, hdm_cap);
+	caps->interleave11_8 =
+		FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_11_8, hdm_cap);
+	caps->interleave14_12 =
+		FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_14_12, hdm_cap);
+}
+
+static int map_regs(struct cxl_port *port, void __iomem *crb,
+		    struct cxl_port_data *cpd)
+{
+	struct cxl_register_map map;
+	struct cxl_component_reg_map *comp_map = &map.component_map;
+
+	cxl_probe_component_regs(&port->dev, crb, comp_map);
+	if (!comp_map->hdm_decoder.valid) {
+		dev_err(&port->dev, "HDM decoder registers invalid\n");
+		return -ENXIO;
+	}
+
+	cpd->regs.hdm_decoder = crb + comp_map->hdm_decoder.offset;
+
+	return 0;
+}
+
+static u64 get_decoder_size(void __iomem *hdm_decoder, int n)
+{
+	u32 ctrl = readl(hdm_decoder + CXL_HDM_DECODER0_CTRL_OFFSET(n));
+
+	if (!(ctrl & CXL_HDM_DECODER0_CTRL_COMMITTED))
+		return 0;
+
+	return ioread64_hi_lo(hdm_decoder +
+			      CXL_HDM_DECODER0_SIZE_LOW_OFFSET(n));
+}
+
+static bool is_endpoint_port(struct cxl_port *port)
+{
+	/* Endpoints can't be ports... yet! */
+	return false;
+}
+
+static void rescan_ports(struct work_struct *work)
+{
+	if (bus_rescan_devices(&cxl_bus_type))
+		pr_warn("Failed to rescan\n");
+}
+
+/* Minor layering violation */
+static int dvsec_range_used(struct cxl_port *port)
+{
+	struct cxl_endpoint_dvsec_info *info;
+	int i, ret = 0;
+
+	if (!is_endpoint_port(port))
+		return 0;
+
+	info = port->data;
+	for (i = 0; i < info->ranges; i++)
+		if (info->range[i].size)
+			ret |= 1 << i;
+
+	return ret;
+}
+
+static int enumerate_hdm_decoders(struct cxl_port *port,
+				  struct cxl_port_data *portdata)
+{
+	void __iomem *hdm_decoder = portdata->regs.hdm_decoder;
+	bool global_enable;
+	u32 global_ctrl;
+	int i = 0;
+
+	global_ctrl = readl(hdm_decoder + CXL_HDM_DECODER_CTRL_OFFSET);
+	global_enable = global_ctrl & CXL_HDM_DECODER_ENABLE;
+	if (!global_enable) {
+		i = dvsec_range_used(port);
+		if (i) {
+			dev_err(&port->dev,
+				"Couldn't add port because device is using DVSEC range registers\n");
+			return -EBUSY;
+		}
+	}
+
+	for (i = 0; i < portdata->caps.count; i++) {
+		int rc, target_count = portdata->caps.tc;
+		struct cxl_decoder *cxld;
+		int *target_map = NULL;
+		u64 size;
+
+		if (is_endpoint_port(port))
+			target_count = 0;
+
+		cxld = cxl_decoder_alloc(port, target_count);
+		if (IS_ERR(cxld)) {
+			dev_warn(&port->dev,
+				 "Failed to allocate the decoder\n");
+			return PTR_ERR(cxld);
+		}
+
+		cxld->target_type = CXL_DECODER_EXPANDER;
+		cxld->interleave_ways = 1;
+		cxld->interleave_granularity = 0;
+
+		size = get_decoder_size(hdm_decoder, i);
+		if (size != 0) {
+#define decoderN(reg, n) (hdm_decoder + CXL_HDM_DECODER0_##reg(n))
+			int temp[CXL_DECODER_MAX_INTERLEAVE];
+			u64 base;
+			u32 ctrl;
+			int j;
+			union {
+				u64 value;
+				char target_id[8];
+			} target_list;
+
+			target_map = temp;
+			ctrl = readl(decoderN(CTRL_OFFSET, i));
+			base = ioread64_hi_lo(decoderN(BASE_LOW_OFFSET, i));
+
+			cxld->decoder_range = (struct range){
+				.start = base,
+				.end = base + size - 1
+			};
+
+			cxld->flags = CXL_DECODER_F_ENABLE;
+			cxld->interleave_ways = to_interleave_ways(ctrl);
+			cxld->interleave_granularity =
+				to_interleave_granularity(ctrl);
+
+			if (FIELD_GET(CXL_HDM_DECODER0_CTRL_TYPE, ctrl) == 0)
+				cxld->target_type = CXL_DECODER_ACCELERATOR;
+
+			target_list.value = ioread64_hi_lo(decoderN(TL_LOW, i));
+			for (j = 0; j < cxld->interleave_ways; j++)
+				target_map[j] = target_list.target_id[j];
+#undef decoderN
+		}
+
+		rc = cxl_decoder_add_locked(cxld, target_map);
+		if (rc)
+			put_device(&cxld->dev);
+		else
+			rc = cxl_decoder_autoremove(&port->dev, cxld);
+		if (rc)
+			dev_err(&port->dev, "Failed to add decoder\n");
+		else
+			dev_dbg(&cxld->dev, "Added to port %s\n",
+				dev_name(&port->dev));
+	}
+
+	/*
+	 * Turn on global enable now since DVSEC ranges aren't being used and
+	 * we'll eventually want the decoder enabled.
+	 */
+	if (!global_enable) {
+		dev_dbg(&port->dev, "Enabling HDM decode\n");
+		writel(global_ctrl | CXL_HDM_DECODER_ENABLE, hdm_decoder + CXL_HDM_DECODER_CTRL_OFFSET);
+	}
+
+	return 0;
+}
+
+static int cxl_port_probe(struct device *dev)
+{
+	struct cxl_port *port = to_cxl_port(dev);
+	struct cxl_port_data *portdata;
+	void __iomem *crb;
+	int rc;
+
+	if (port->component_reg_phys == CXL_RESOURCE_NONE)
+		return 0;
+
+	portdata = devm_kzalloc(dev, sizeof(*portdata), GFP_KERNEL);
+	if (!portdata)
+		return -ENOMEM;
+
+	crb = devm_cxl_iomap_block(&port->dev, port->component_reg_phys,
+				   CXL_COMPONENT_REG_BLOCK_SIZE);
+	if (!crb) {
+		dev_err(&port->dev, "No component registers mapped\n");
+		return -ENXIO;
+	}
+
+	rc = map_regs(port, crb, portdata);
+	if (rc)
+		return rc;
+
+	get_caps(port, portdata);
+	if (portdata->caps.count == 0) {
+		dev_err(&port->dev, "Spec violation. Caps invalid\n");
+		return -ENXIO;
+	}
+
+	rc = enumerate_hdm_decoders(port, portdata);
+	if (rc) {
+		dev_err(&port->dev, "Couldn't enumerate decoders (%d)\n", rc);
+		return rc;
+	}
+
+	/*
+	 * Bus rescan is done in a workqueue so that we can do so with the
+	 * device lock dropped.
+	 *
+	 * Why do we need to rescan? There is a race between cxl_acpi and
+	 * cxl_mem (which depends on cxl_pci). cxl_mem will only create a port
+	 * if it can establish a path up to a root port, which is enumerated by
+	 * a platform specific driver (i.e. cxl_acpi) and bound by this driver.
+	 * While cxl_acpi could do the rescan, it makes sense to be here as
+	 * other platform drivers might require the same functionality.
+	 */
+	INIT_WORK(&port->rescan_work, rescan_ports);
+	queue_work(cxl_port_wq, &port->rescan_work);
+
+	return 0;
+}
+
+static struct cxl_driver cxl_port_driver = {
+	.name = "cxl_port",
+	.probe = cxl_port_probe,
+	.id = CXL_DEVICE_PORT,
+};
+
+static __init int cxl_port_init(void)
+{
+	int rc;
+
+	cxl_port_wq = alloc_ordered_workqueue("cxl_port", 0);
+	if (!cxl_port_wq)
+		return -ENOMEM;
+
+	rc = cxl_driver_register(&cxl_port_driver);
+	if (rc) {
+		destroy_workqueue(cxl_port_wq);
+		return rc;
+	}
+
+	return 0;
+}
+
+static __exit void cxl_port_exit(void)
+{
+	cxl_driver_unregister(&cxl_port_driver);
+	destroy_workqueue(cxl_port_wq);
+}
+
+module_init(cxl_port_init);
+module_exit(cxl_port_exit);
+MODULE_LICENSE("GPL v2");
+MODULE_IMPORT_NS(CXL);
+MODULE_ALIAS_CXL(CXL_DEVICE_PORT);
-- 
2.34.0



* [PATCH 21/23] cxl: Unify port enumeration for decoders
  2021-11-20  0:02 [PATCH 00/23] Add drivers for CXL ports and mem devices Ben Widawsky
                   ` (19 preceding siblings ...)
  2021-11-20  0:02 ` [PATCH 20/23] cxl/port: Introduce a port driver Ben Widawsky
@ 2021-11-20  0:02 ` Ben Widawsky
  2021-11-22 17:48   ` Jonathan Cameron
  2021-11-20  0:02 ` [PATCH 22/23] cxl/mem: Introduce cxl_mem driver Ben Widawsky
  2021-11-20  0:02 ` [PATCH 23/23] cxl/mem: Disable switch hierarchies for now Ben Widawsky
  22 siblings, 1 reply; 133+ messages in thread
From: Ben Widawsky @ 2021-11-20  0:02 UTC (permalink / raw)
  To: linux-cxl, linux-pci
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

The port driver exists to do proper enumeration of the component
registers for ports, including HDM decoder resources. Any port which
follows the CXL specification to implement HDM decoder registers should
be handled by the port driver. This includes host bridge registers that
are currently handled within the cxl_acpi driver.
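
As an illustration of what enumerating HDM decoder resources involves, the
following user-space sketch mirrors the per-decoder register offsets and the
control field decoding that the previous patch adds to cxl.h and port.c. It is
not kernel code: the helper names (hdm_base_low_offset(), hdm_ctrl_offset(),
hdm_interleave_ways(), hdm_interleave_granularity(), hdm_committed()) are
hypothetical, and the masks are open-coded instead of using GENMASK() and
FIELD_GET().

```c
#include <stdint.h>

/* Per-decoder register layout: each decoder occupies a 0x20 byte stride. */
#define HDM_DECODER_STRIDE 0x20u
#define HDM_BASE_LOW       0x10u
#define HDM_CTRL           0x20u

/* Control register fields: IG in bits 3:0, IW in bits 7:4, COMMITTED bit 10. */
#define HDM_CTRL_IG_MASK   0x0000000fu
#define HDM_CTRL_IW_MASK   0x000000f0u
#define HDM_CTRL_IW_SHIFT  4
#define HDM_CTRL_COMMITTED (1u << 10)

/* Offset of the base-low register of decoder i within the HDM capability. */
static inline uint32_t hdm_base_low_offset(unsigned int i)
{
	return HDM_DECODER_STRIDE * i + HDM_BASE_LOW;
}

/* Offset of the control register of decoder i within the HDM capability. */
static inline uint32_t hdm_ctrl_offset(unsigned int i)
{
	return HDM_DECODER_STRIDE * i + HDM_CTRL;
}

/* Encoded interleave ways: the field value n means 2^n ways. */
static inline unsigned int hdm_interleave_ways(uint32_t ctrl)
{
	return 1u << ((ctrl & HDM_CTRL_IW_MASK) >> HDM_CTRL_IW_SHIFT);
}

/* Encoded granularity: the field value n means 256 << n bytes. */
static inline unsigned int hdm_interleave_granularity(uint32_t ctrl)
{
	return 256u << (ctrl & HDM_CTRL_IG_MASK);
}

/* Non-zero when the decoder reports that its programming was committed. */
static inline int hdm_committed(uint32_t ctrl)
{
	return (ctrl & HDM_CTRL_COMMITTED) != 0;
}
```

A control value with IW = 1 and IG = 2, for example, describes a 2-way
interleave at 1024 byte granularity.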

In moving the responsibility from cxl_acpi to cxl_port, three primary
things are accomplished here:
1. Multi-port host bridges are now handled by the port driver
2. Single port host bridges are handled by the port driver
3. Single port switches without component registers will be handled by
   the port driver.
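
The resulting decision in cxl_port_probe() can be summarized with a small
user-space model. This is only a sketch of the control flow in this patch:
struct model_port, enum port_action and port_probe_action() are hypothetical
names, and the three booleans stand in for list_is_singular(&port->dports),
the comparison of port->uport against the topology host, and the
CXL_RESOURCE_NONE check.

```c
#include <stdbool.h>
#include <stddef.h>

enum port_action {
	PORT_SKIP,          /* left to the platform driver, e.g. cxl_acpi */
	PORT_PASSTHROUGH,   /* add one passthrough decoder */
	PORT_NO_REGS,       /* no component registers, nothing to enumerate */
	PORT_ENUMERATE_HDM, /* map component registers, enumerate decoders */
};

/* Hypothetical stand-in for the fields of struct cxl_port used by probe. */
struct model_port {
	size_t nr_dports;
	bool uport_is_topology_host;
	bool has_component_regs;
};

static inline enum port_action port_probe_action(const struct model_port *p)
{
	/* Single dport: passthrough decode, unless this is the root port. */
	if (p->nr_dports == 1)
		return p->uport_is_topology_host ? PORT_SKIP : PORT_PASSTHROUGH;

	/* Multiple dports but no register block: probe succeeds, adds nothing. */
	if (!p->has_component_regs)
		return PORT_NO_REGS;

	return PORT_ENUMERATE_HDM;
}
```

Note that in this model, as in the patch, a single-dport port takes the
passthrough path before its component registers are considered.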

While it's tempting to remove decoder APIs from cxl_core entirely, it is
still required that platform specific drivers are able to add decoders
which aren't specified in CXL 2.0+. An example of this is the CFMWS
parsing which is implemented in cxl_acpi.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

---
Changes since RFCv2:
- Renamed subject
- Reworded commit message
- Handle *all* host bridge port enumeration in cxl_port (Dan)
  - Handle passthrough decoding in cxl_port
---
 drivers/cxl/acpi.c     | 41 +++-----------------------------
 drivers/cxl/core/bus.c |  6 +++--
 drivers/cxl/cxl.h      |  2 ++
 drivers/cxl/port.c     | 54 +++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 62 insertions(+), 41 deletions(-)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index c12e4fed7941..c85a04ecbf7f 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -210,8 +210,6 @@ static int add_host_bridge_uport(struct device *match, void *arg)
 	struct acpi_device *bridge = to_cxl_host_bridge(host, match);
 	struct acpi_pci_root *pci_root;
 	struct cxl_walk_context ctx;
-	int single_port_map[1], rc;
-	struct cxl_decoder *cxld;
 	struct cxl_dport *dport;
 	struct cxl_port *port;
 
@@ -245,43 +243,9 @@ static int add_host_bridge_uport(struct device *match, void *arg)
 		return -ENODEV;
 	if (ctx.error)
 		return ctx.error;
-	if (ctx.count > 1)
-		return 0;
 
-	/* TODO: Scan CHBCR for HDM Decoder resources */
-
-	/*
-	 * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability
-	 * Structure) single ported host-bridges need not publish a decoder
-	 * capability when a passthrough decode can be assumed, i.e. all
-	 * transactions that the uport sees are claimed and passed to the single
-	 * dport. Disable the range until the first CXL region is enumerated /
-	 * activated.
-	 */
-	cxld = cxl_decoder_alloc(port, 1);
-	if (IS_ERR(cxld))
-		return PTR_ERR(cxld);
-
-	cxld->interleave_ways = 1;
-	cxld->interleave_granularity = PAGE_SIZE;
-	cxld->target_type = CXL_DECODER_EXPANDER;
-	cxld->platform_res = (struct resource)DEFINE_RES_MEM(0, 0);
-
-	device_lock(&port->dev);
-	dport = list_first_entry(&port->dports, typeof(*dport), list);
-	device_unlock(&port->dev);
-
-	single_port_map[0] = dport->port_id;
-
-	rc = cxl_decoder_add(cxld, single_port_map);
-	if (rc)
-		put_device(&cxld->dev);
-	else
-		rc = cxl_decoder_autoremove(host, cxld);
-
-	if (rc == 0)
-		dev_dbg(host, "add: %s\n", dev_name(&cxld->dev));
-	return rc;
+	/* Host bridge ports are enumerated by the port driver. */
+	return 0;
 }
 
 struct cxl_chbs_context {
@@ -448,3 +412,4 @@ module_platform_driver(cxl_acpi_driver);
 MODULE_LICENSE("GPL v2");
 MODULE_IMPORT_NS(CXL);
 MODULE_IMPORT_NS(ACPI);
+MODULE_SOFTDEP("pre: cxl_port");
diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
index 46a06cfe79bd..acfa212eea21 100644
--- a/drivers/cxl/core/bus.c
+++ b/drivers/cxl/core/bus.c
@@ -62,7 +62,7 @@ void cxl_unregister_topology_host(struct device *host)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_unregister_topology_host, CXL);
 
-static struct device *get_cxl_topology_host(void)
+struct device *get_cxl_topology_host(void)
 {
 	down_read(&topology_host_sem);
 	if (cxl_topology_host)
@@ -70,12 +70,14 @@ static struct device *get_cxl_topology_host(void)
 	up_read(&topology_host_sem);
 	return NULL;
 }
+EXPORT_SYMBOL_NS_GPL(get_cxl_topology_host, CXL);
 
-static void put_cxl_topology_host(struct device *dev)
+void put_cxl_topology_host(struct device *dev)
 {
 	WARN_ON(dev != cxl_topology_host);
 	up_read(&topology_host_sem);
 }
+EXPORT_SYMBOL_NS_GPL(put_cxl_topology_host, CXL);
 
 static int decoder_match(struct device *dev, void *data)
 {
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 24fa16157d5e..f8354241c5a3 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -170,6 +170,8 @@ void __iomem *devm_cxl_iomap_block(struct device *dev, resource_size_t addr,
 
 int cxl_register_topology_host(struct device *host);
 void cxl_unregister_topology_host(struct device *host);
+struct device *get_cxl_topology_host(void);
+void put_cxl_topology_host(struct device *dev);
 
 /*
  * cxl_decoder flags that define the type of memory / devices this
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index 3c03131517af..7a1fc726fe9f 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -233,12 +233,64 @@ static int enumerate_hdm_decoders(struct cxl_port *port,
 	return 0;
 }
 
+/*
+ * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability Structure)
+ * single ported host-bridges need not publish a decoder capability when a
+ * passthrough decode can be assumed, i.e. all transactions that the uport sees
+ * are claimed and passed to the single dport. Disable the range until the first
+ * CXL region is enumerated / activated.
+ */
+static int add_passthrough_decoder(struct cxl_port *port)
+{
+	int single_port_map[1], rc;
+	struct cxl_decoder *cxld;
+	struct cxl_dport *dport;
+
+	device_lock_assert(&port->dev);
+
+	cxld = cxl_decoder_alloc(port, 1);
+	if (IS_ERR(cxld))
+		return PTR_ERR(cxld);
+
+	cxld->interleave_ways = 1;
+	cxld->interleave_granularity = PAGE_SIZE;
+	cxld->target_type = CXL_DECODER_EXPANDER;
+	cxld->platform_res = (struct resource)DEFINE_RES_MEM(0, 0);
+
+	dport = list_first_entry(&port->dports, typeof(*dport), list);
+	single_port_map[0] = dport->port_id;
+
+	rc = cxl_decoder_add_locked(cxld, single_port_map);
+	if (rc)
+		put_device(&cxld->dev);
+	else
+		rc = cxl_decoder_autoremove(&port->dev, cxld);
+
+	if (rc == 0)
+		dev_dbg(&port->dev, "add: %s\n", dev_name(&cxld->dev));
+
+	return rc;
+}
+
 static int cxl_port_probe(struct device *dev)
 {
 	struct cxl_port *port = to_cxl_port(dev);
 	struct cxl_port_data *portdata;
 	void __iomem *crb;
-	int rc;
+	int rc = 0;
+
+	if (list_is_singular(&port->dports)) {
+		struct device *host_dev = get_cxl_topology_host();
+
+		/*
+		 * Root ports (single host bridge downstream) are handled by
+		 * platform driver
+		 */
+		if (port->uport != host_dev)
+			rc = add_passthrough_decoder(port);
+		put_cxl_topology_host(host_dev);
+		return rc;
+	}
 
 	if (port->component_reg_phys == CXL_RESOURCE_NONE)
 		return 0;
-- 
2.34.0



* [PATCH 22/23] cxl/mem: Introduce cxl_mem driver
  2021-11-20  0:02 [PATCH 00/23] Add drivers for CXL ports and mem devices Ben Widawsky
                   ` (20 preceding siblings ...)
  2021-11-20  0:02 ` [PATCH 21/23] cxl: Unify port enumeration for decoders Ben Widawsky
@ 2021-11-20  0:02 ` Ben Widawsky
  2021-11-20  0:40   ` Randy Dunlap
  2021-11-22 18:17   ` Jonathan Cameron
  2021-11-20  0:02 ` [PATCH 23/23] cxl/mem: Disable switch hierarchies for now Ben Widawsky
  22 siblings, 2 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-20  0:02 UTC (permalink / raw)
  To: linux-cxl, linux-pci
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Add a driver that is capable of determining whether a device is in a
CXL.mem routed part of the topology.

This driver allows a higher level driver - such as one controlling CXL
regions, each of which is itself a set of CXL devices - to easily determine
whether the CXL devices are CXL.mem capable by checking whether this driver
has bound.
CXL memory device services may also be provided by this driver though
none are needed as of yet. cxl_mem also plays the part of registering
itself as an endpoint port, which is a required step to enumerate the
device's HDM decoder resources.
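
A minimal sketch of the "has the driver bound" check, written as a user-space
model rather than against the real driver core: struct model_driver,
struct model_memdev and region_members_mem_capable() are hypothetical, with
the driver pointer standing in for dev->driver on each memory device. The
region driver itself is not part of this series, so this only illustrates the
idea.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct device_driver and a CXL memory device. */
struct model_driver {
	const char *name;
};

struct model_memdev {
	const struct model_driver *driver; /* NULL while no driver is bound */
};

/*
 * A region made of @n devices is usable for CXL.mem only if every member
 * has a bound driver. An empty set trivially passes the check.
 */
static inline bool region_members_mem_capable(const struct model_memdev *devs,
					      size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!devs[i].driver)
			return false;
	return true;
}
```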

Even though the cxl_mem driver is the only consumer of the new
cxl_scan_ports() introduced in cxl_core, that functionality is PCIe
specific and is therefore kept out of this driver.

As part of this patch, find_dport_by_dev() is promoted to cxl_core's set
of APIs, as cxl_find_dport_by_dev(), for use by the new driver.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

---
Changes since RFCv2:
- Squashed this in with previous patch
- Reworked parent port finding
- Added DVSEC check
- Added wait for active
---
 .../driver-api/cxl/memory-devices.rst         |   9 +
 drivers/cxl/Kconfig                           |  15 ++
 drivers/cxl/Makefile                          |   2 +
 drivers/cxl/acpi.c                            |  22 +-
 drivers/cxl/core/Makefile                     |   1 +
 drivers/cxl/core/bus.c                        | 135 +++++++++++-
 drivers/cxl/core/core.h                       |   3 +
 drivers/cxl/core/memdev.c                     |   2 +-
 drivers/cxl/core/pci.c                        | 119 +++++++++++
 drivers/cxl/cxl.h                             |   8 +-
 drivers/cxl/cxlmem.h                          |   3 +
 drivers/cxl/mem.c                             | 192 ++++++++++++++++++
 drivers/cxl/pci.h                             |   4 +
 drivers/cxl/port.c                            |  12 +-
 tools/testing/cxl/Kbuild                      |   1 +
 15 files changed, 503 insertions(+), 25 deletions(-)
 create mode 100644 drivers/cxl/core/pci.c
 create mode 100644 drivers/cxl/mem.c

diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst
index fbf0393cdddc..b4ff5f209c34 100644
--- a/Documentation/driver-api/cxl/memory-devices.rst
+++ b/Documentation/driver-api/cxl/memory-devices.rst
@@ -28,6 +28,9 @@ CXL Memory Device
 .. kernel-doc:: drivers/cxl/pci.c
    :internal:
 
+.. kernel-doc:: drivers/cxl/mem.c
+   :doc: cxl mem
+
 CXL Port
 --------
 .. kernel-doc:: drivers/cxl/port.c
@@ -47,6 +50,12 @@ CXL Core
 .. kernel-doc:: drivers/cxl/core/bus.c
    :identifiers:
 
+.. kernel-doc:: drivers/cxl/core/pci.c
+   :doc: cxl core pci
+
+.. kernel-doc:: drivers/cxl/core/pci.c
+   :identifiers:
+
 .. kernel-doc:: drivers/cxl/core/pmem.c
    :doc: cxl pmem
 
diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index 3aeb33bba5a3..f5553443ba2a 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -30,6 +30,21 @@ config CXL_PCI
 
 	  If unsure say 'm'.
 
+config CXL_MEM
+	tristate "CXL.mem: Memory Devices"
+	default CXL_BUS
+	help
+	  The CXL.mem protocol allows a device to act as a provider of
+	  "System RAM" and/or "Persistent Memory" that is fully coherent
+	  as if the memory was attached to the typical CPU memory controller.
+	  This is known as HDM "Host-managed Device Memory".
+
+	  Say 'y/m' to enable a driver that will attach to CXL.mem devices for
+	  memory expansion and control of HDM. See Chapter 9.13 in the CXL 2.0
+	  specification for a detailed description of HDM.
+
+	  If unsure say 'm'.
+
 config CXL_MEM_RAW_COMMANDS
 	bool "RAW Command Interface for Memory Devices"
 	depends on CXL_PCI
diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
index 56fcac2323cb..ce267ef11d93 100644
--- a/drivers/cxl/Makefile
+++ b/drivers/cxl/Makefile
@@ -1,10 +1,12 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_CXL_BUS) += core/
 obj-$(CONFIG_CXL_PCI) += cxl_pci.o
+obj-$(CONFIG_CXL_MEM) += cxl_mem.o
 obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
 obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
 obj-$(CONFIG_CXL_PORT) += cxl_port.o
 
+cxl_mem-y := mem.o
 cxl_pci-y := pci.o
 cxl_acpi-y := acpi.o
 cxl_pmem-y := pmem.o
diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index c85a04ecbf7f..d17aa7d8b1c9 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -171,21 +171,6 @@ __mock int match_add_root_ports(struct pci_dev *pdev, void *data)
 	return 0;
 }
 
-static struct cxl_dport *find_dport_by_dev(struct cxl_port *port, struct device *dev)
-{
-	struct cxl_dport *dport;
-
-	device_lock(&port->dev);
-	list_for_each_entry(dport, &port->dports, list)
-		if (dport->dport == dev) {
-			device_unlock(&port->dev);
-			return dport;
-		}
-
-	device_unlock(&port->dev);
-	return NULL;
-}
-
 __mock struct acpi_device *to_cxl_host_bridge(struct device *host,
 					      struct device *dev)
 {
@@ -216,13 +201,14 @@ static int add_host_bridge_uport(struct device *match, void *arg)
 	if (!bridge)
 		return 0;
 
-	dport = find_dport_by_dev(root_port, match);
+	dport = cxl_find_dport_by_dev(root_port, match);
 	if (!dport) {
 		dev_dbg(host, "host bridge expected and not found\n");
 		return 0;
 	}
 
-	port = devm_cxl_add_port(match, dport->component_reg_phys, root_port);
+	port = devm_cxl_add_port(match, dport->component_reg_phys, root_port,
+				 NULL);
 	if (IS_ERR(port))
 		return PTR_ERR(port);
 	dev_dbg(host, "%s: add: %s\n", dev_name(match), dev_name(&port->dev));
@@ -360,7 +346,7 @@ static int cxl_acpi_probe(struct platform_device *pdev)
 	if (rc)
 		return rc;
 
-	root_port = devm_cxl_add_port(host, CXL_RESOURCE_NONE, root_port);
+	root_port = devm_cxl_add_port(host, CXL_RESOURCE_NONE, root_port, NULL);
 	if (IS_ERR(root_port))
 		return PTR_ERR(root_port);
 	dev_dbg(host, "add: %s\n", dev_name(&root_port->dev));
diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
index 40ab50318daf..5b8ec478fb0b 100644
--- a/drivers/cxl/core/Makefile
+++ b/drivers/cxl/core/Makefile
@@ -7,3 +7,4 @@ cxl_core-y += pmem.o
 cxl_core-y += regs.o
 cxl_core-y += memdev.o
 cxl_core-y += mbox.o
+cxl_core-y += pci.o
diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
index acfa212eea21..cab3aabd5abc 100644
--- a/drivers/cxl/core/bus.c
+++ b/drivers/cxl/core/bus.c
@@ -8,6 +8,7 @@
 #include <linux/idr.h>
 #include <cxlmem.h>
 #include <cxl.h>
+#include <pci.h>
 #include "core.h"
 
 /**
@@ -436,7 +437,7 @@ static int devm_cxl_link_uport(struct device *host, struct cxl_port *port)
 
 static struct cxl_port *cxl_port_alloc(struct device *uport,
 				       resource_size_t component_reg_phys,
-				       struct cxl_port *parent_port)
+				       struct cxl_port *parent_port, void *data)
 {
 	struct cxl_port *port;
 	struct device *dev;
@@ -465,6 +466,7 @@ static struct cxl_port *cxl_port_alloc(struct device *uport,
 
 	port->uport = uport;
 	port->component_reg_phys = component_reg_phys;
+	port->data = data;
 	ida_init(&port->decoder_ida);
 	INIT_LIST_HEAD(&port->dports);
 
@@ -485,16 +487,17 @@ static struct cxl_port *cxl_port_alloc(struct device *uport,
  * @uport: "physical" device implementing this upstream port
  * @component_reg_phys: (optional) for configurable cxl_port instances
  * @parent_port: next hop up in the CXL memory decode hierarchy
+ * @data: opaque data to be used by the port driver
  */
 struct cxl_port *devm_cxl_add_port(struct device *uport,
 				   resource_size_t component_reg_phys,
-				   struct cxl_port *parent_port)
+				   struct cxl_port *parent_port, void *data)
 {
 	struct device *dev, *host;
 	struct cxl_port *port;
 	int rc;
 
-	port = cxl_port_alloc(uport, component_reg_phys, parent_port);
+	port = cxl_port_alloc(uport, component_reg_phys, parent_port, data);
 	if (IS_ERR(port))
 		return port;
 
@@ -531,6 +534,113 @@ struct cxl_port *devm_cxl_add_port(struct device *uport,
 }
 EXPORT_SYMBOL_NS_GPL(devm_cxl_add_port, CXL);
 
+static int add_upstream_port(struct device *host, struct pci_dev *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct cxl_port *parent_port;
+	struct cxl_register_map map;
+	struct cxl_port *port;
+	int rc;
+
+	/* A port is useless if there are no component registers */
+	rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
+	if (rc)
+		return rc;
+
+	parent_port = find_parent_cxl_port(pdev);
+	if (!parent_port)
+		return -ENODEV;
+
+	if (!parent_port->dev.driver) {
+		dev_dbg(dev, "Upstream port has no driver\n");
+		put_device(&parent_port->dev);
+		return -ENODEV;
+	}
+
+	port = devm_cxl_add_port(dev, cxl_reg_block(pdev, &map), parent_port,
+				 NULL);
+	put_device(&parent_port->dev);
+	if (IS_ERR(port))
+		dev_err(dev, "Failed to add upstream port %ld\n",
+			PTR_ERR(port));
+	else
+		dev_dbg(dev, "Added CXL port\n");
+
+	return rc;
+}
+
+static int add_downstream_port(struct pci_dev *pdev)
+{
+	resource_size_t creg = CXL_RESOURCE_NONE;
+	struct device *dev = &pdev->dev;
+	struct cxl_port *parent_port;
+	struct cxl_register_map map;
+	u32 lnkcap, port_num;
+	int rc;
+
+	/*
+	 * Ports are to be scanned from top down. Therefore, the upstream port
+	 * must already exist.
+	 */
+	parent_port = find_parent_cxl_port(pdev);
+	if (!parent_port)
+		return -ENODEV;
+
+	if (!parent_port->dev.driver) {
+		dev_dbg(dev, "Host port to dport has no driver\n");
+		put_device(&parent_port->dev);
+		return -ENODEV;
+	}
+
+	rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
+	if (rc)
+		creg = CXL_RESOURCE_NONE;
+	else
+		creg = cxl_reg_block(pdev, &map);
+
+	if (pci_read_config_dword(pdev, pci_pcie_cap(pdev) + PCI_EXP_LNKCAP,
+				  &lnkcap) != PCIBIOS_SUCCESSFUL)
+		return 1;
+	port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
+
+	rc = cxl_add_dport(parent_port, dev, port_num, creg, false);
+	put_device(&parent_port->dev);
+	if (rc)
+		dev_err(dev, "Failed to add downstream port to %s\n",
+			dev_name(&parent_port->dev));
+	else
+		dev_dbg(dev, "Added downstream port to %s\n",
+			dev_name(&parent_port->dev));
+
+	return rc;
+}
+
+static int match_add_ports(struct pci_dev *pdev, void *data)
+{
+	struct device *dev = &pdev->dev;
+	struct device *host = data;
+
+	if (is_cxl_switch_usp((dev)))
+		return add_upstream_port(host, pdev);
+	else if (is_cxl_switch_dsp((dev)))
+		return add_downstream_port(pdev);
+	else
+		return 0;
+}
+
+/**
+ * cxl_scan_ports() - Adds all ports for the subtree beginning with @dport
+ * @dport: Beginning node of the CXL topology
+ */
+void cxl_scan_ports(struct cxl_dport *dport)
+{
+	struct device *d = dport->dport;
+	struct pci_dev *pdev = to_pci_dev(d);
+
+	pci_walk_bus(pdev->bus, match_add_ports, &dport->port->dev);
+}
+EXPORT_SYMBOL_NS_GPL(cxl_scan_ports, CXL);
+
 static struct cxl_dport *find_dport(struct cxl_port *port, int id)
 {
 	struct cxl_dport *dport;
@@ -614,6 +724,23 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport_dev, int port_id,
 }
 EXPORT_SYMBOL_NS_GPL(cxl_add_dport, CXL);
 
+struct cxl_dport *cxl_find_dport_by_dev(struct cxl_port *port,
+					struct device *dev)
+{
+	struct cxl_dport *dport;
+
+	device_lock(&port->dev);
+	list_for_each_entry(dport, &port->dports, list)
+		if (dport->dport == dev) {
+			device_unlock(&port->dev);
+			return dport;
+		}
+
+	device_unlock(&port->dev);
+	return NULL;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_find_dport_by_dev, CXL);
+
 static int decoder_populate_targets(struct cxl_decoder *cxld,
 				    struct cxl_port *port, int *target_map)
 {
@@ -858,6 +985,8 @@ static int cxl_device_id(struct device *dev)
 		return CXL_DEVICE_NVDIMM;
 	if (dev->type == &cxl_port_type)
 		return CXL_DEVICE_PORT;
+	if (dev->type == &cxl_memdev_type)
+		return CXL_DEVICE_MEMORY_EXPANDER;
 	return 0;
 }
 
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index e0c9aacc4e9c..c5836f071eaa 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -6,6 +6,7 @@
 
 extern const struct device_type cxl_nvdimm_bridge_type;
 extern const struct device_type cxl_nvdimm_type;
+extern const struct device_type cxl_memdev_type;
 
 extern struct attribute_group cxl_base_attribute_group;
 
@@ -20,4 +21,6 @@ void cxl_memdev_exit(void);
 void cxl_mbox_init(void);
 void cxl_mbox_exit(void);
 
+struct cxl_port *find_parent_cxl_port(struct pci_dev *pdev);
+
 #endif /* __CXL_CORE_H__ */
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 61029cb7ac62..149665fd2d3f 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -127,7 +127,7 @@ static const struct attribute_group *cxl_memdev_attribute_groups[] = {
 	NULL,
 };
 
-static const struct device_type cxl_memdev_type = {
+const struct device_type cxl_memdev_type = {
 	.name = "cxl_memdev",
 	.release = cxl_memdev_release,
 	.devnode = cxl_memdev_devnode,
diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
new file mode 100644
index 000000000000..818e30571e4d
--- /dev/null
+++ b/drivers/cxl/core/pci.c
@@ -0,0 +1,119 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
+#include <linux/device.h>
+#include <linux/pci.h>
+#include <cxl.h>
+#include <pci.h>
+#include "core.h"
+
+/**
+ * DOC: cxl core pci
+ *
+ * Compute Express Link protocols are layered on top of PCIe. CXL core provides
+ * a set of helpers for CXL interactions which occur via PCIe.
+ */
+
+/**
+ * find_parent_cxl_port() - Finds parent port through PCIe mechanisms
+ * @pdev: PCIe USP or DSP to find an upstream port for
+ *
+ * Once all CXL ports are enumerated, there is no need to reference the PCIe
+ * parallel universe as all downstream ports are contained in a linked list, and
+ * all upstream ports are accessible via pointer. During the enumeration, it is
+ * very convenient to be able to peek up one level in the hierarchy without
+ * needing the established relationship between data structures so that the
+ * parenting can be done as the ports/dports are created.
+ *
+ * A reference is kept to the found port.
+ */
+struct cxl_port *find_parent_cxl_port(struct pci_dev *pdev)
+{
+	struct device *parent_dev, *gparent_dev;
+
+	/* Parent is either a downstream port, or root port */
+	parent_dev = get_device(pdev->dev.parent);
+
+	if (is_cxl_switch_usp(&pdev->dev)) {
+		if (dev_WARN_ONCE(&pdev->dev,
+				  pci_pcie_type(pdev) !=
+						  PCI_EXP_TYPE_DOWNSTREAM &&
+					  pci_pcie_type(pdev) !=
+						  PCI_EXP_TYPE_ROOT_PORT,
+				  "Parent not downstream\n"))
+			goto err;
+
+		/*
+		 * Grandparent is either an upstream port or a platform device that has
+		 * been added as a cxl_port already.
+		 */
+		gparent_dev = get_device(parent_dev->parent);
+		put_device(parent_dev);
+
+		return to_cxl_port(gparent_dev);
+	} else if (is_cxl_switch_dsp(&pdev->dev)) {
+		if (dev_WARN_ONCE(&pdev->dev,
+				  pci_pcie_type(pdev) != PCI_EXP_TYPE_UPSTREAM,
+				  "Parent not upstream\n"))
+			goto err;
+		return to_cxl_port(parent_dev);
+	}
+
+err:
+	dev_WARN(&pdev->dev, "Invalid topology\n");
+	put_device(parent_dev);
+	return NULL;
+}
+
+/*
+ * Unlike endpoints, switches don't discern CXL.mem capability. Simply finding
+ * the DVSEC is sufficient.
+ */
+static bool is_cxl_switch(struct pci_dev *pdev)
+{
+	return pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
+					 CXL_DVSEC_PORT_EXTENSIONS);
+}
+
+/**
+ * is_cxl_switch_usp() - Is the device a CXL.mem enabled switch upstream port
+ * @dev: Device to query for switch type
+ *
+ * If the device is a CXL.mem capable upstream switch port return true;
+ * otherwise return false.
+ */
+bool is_cxl_switch_usp(struct device *dev)
+{
+	struct pci_dev *pdev;
+
+	if (!dev_is_pci(dev))
+		return false;
+
+	pdev = to_pci_dev(dev);
+
+	return pci_is_pcie(pdev) &&
+	       pci_pcie_type(pdev) == PCI_EXP_TYPE_UPSTREAM &&
+	       is_cxl_switch(pdev);
+}
+EXPORT_SYMBOL_NS_GPL(is_cxl_switch_usp, CXL);
+
+/**
+ * is_cxl_switch_dsp() - Is the device a CXL.mem enabled switch downstream port
+ * @dev: Device to query for switch type
+ *
+ * If the device is a CXL.mem capable downstream switch port return true;
+ * otherwise return false.
+ */
+bool is_cxl_switch_dsp(struct device *dev)
+{
+	struct pci_dev *pdev;
+
+	if (!dev_is_pci(dev))
+		return false;
+
+	pdev = to_pci_dev(dev);
+
+	return pci_is_pcie(pdev) &&
+	       pci_pcie_type(pdev) == PCI_EXP_TYPE_DOWNSTREAM &&
+	       is_cxl_switch(pdev);
+}
+EXPORT_SYMBOL_NS_GPL(is_cxl_switch_dsp, CXL);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index f8354241c5a3..3bda806f4244 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -296,6 +296,7 @@ struct cxl_port {
  * @port: reference to cxl_port that contains this downstream port
  * @list: node for a cxl_port's list of cxl_dport instances
  * @root_port_link: node for global list of root ports
+ * @data: Opaque data passed by other drivers, used by port driver
  */
 struct cxl_dport {
 	struct device *dport;
@@ -304,16 +305,20 @@ struct cxl_dport {
 	struct cxl_port *port;
 	struct list_head list;
 	struct list_head root_port_link;
+	void *data;
 };
 
 struct cxl_port *to_cxl_port(struct device *dev);
 struct cxl_port *devm_cxl_add_port(struct device *uport,
 				   resource_size_t component_reg_phys,
-				   struct cxl_port *parent_port);
+				   struct cxl_port *parent_port, void *data);
+void cxl_scan_ports(struct cxl_dport *root_port);
 
 int cxl_add_dport(struct cxl_port *port, struct device *dport, int port_id,
 		  resource_size_t component_reg_phys, bool root_port);
 struct cxl_dport *cxl_get_root_dport(struct device *dev);
+struct cxl_dport *cxl_find_dport_by_dev(struct cxl_port *port,
+					struct device *dev);
 
 struct cxl_decoder *to_cxl_decoder(struct device *dev);
 bool is_root_decoder(struct device *dev);
@@ -349,6 +354,7 @@ void cxl_driver_unregister(struct cxl_driver *cxl_drv);
 #define CXL_DEVICE_NVDIMM_BRIDGE	1
 #define CXL_DEVICE_NVDIMM		2
 #define CXL_DEVICE_PORT			3
+#define CXL_DEVICE_MEMORY_EXPANDER	4
 
 #define MODULE_ALIAS_CXL(type) MODULE_ALIAS("cxl:t" __stringify(type) "*")
 #define CXL_MODALIAS_FMT "cxl:t%d"
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index b1d753541f4e..de8f6fce74b5 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -35,12 +35,15 @@
  * @cdev: char dev core object for ioctl operations
  * @cxlds: The device state backing this device
  * @id: id number of this memdev instance.
+ * @component_reg_phys: register base of component registers
+ * @root_port: Hostbridge's root port connected to this endpoint
  */
 struct cxl_memdev {
 	struct device dev;
 	struct cdev cdev;
 	struct cxl_dev_state *cxlds;
 	int id;
+	struct cxl_dport *root_port;
 };
 
 static inline struct cxl_memdev *to_cxl_memdev(struct device *dev)
diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
new file mode 100644
index 000000000000..e954144af4b8
--- /dev/null
+++ b/drivers/cxl/mem.c
@@ -0,0 +1,192 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+
+#include "cxlmem.h"
+#include "pci.h"
+
+/**
+ * DOC: cxl mem
+ *
+ * CXL memory endpoint devices and switches are CXL capable devices that are
+ * participating in CXL.mem protocol. Their functionality builds on top of the
+ * CXL.io protocol that allows enumerating and configuring components via
+ * standard PCI mechanisms.
+ *
+ * The cxl_mem driver owns kicking off the enumeration of this CXL.mem
+ * capability. With the detection of a CXL capable endpoint, the driver will
+ * walk up to find the platform specific port it is connected to, and determine
+ * if there are intervening switches in the path. If there are switches, a
+ * secondary action enumerates them (implemented in cxl_core). Finally, the
+ * cxl_mem driver will add the device it is bound to as a CXL port for use in
+ * higher level operations.
+ */
+
+struct walk_ctx {
+	struct cxl_dport *root_port;
+	bool has_switch;
+};
+
+/**
+ * walk_to_root_port() - Walk up to root port
+ * @dev: Device to walk up from
+ * @ctx: Information to populate while walking
+ *
+ * A platform specific driver such as cxl_acpi is responsible for scanning CXL
+ * topologies in a top-down fashion. If the CXL memory device is directly
+ * connected to the top level hostbridge, nothing else needs to be done. If
+ * however there are CXL components (i.e. a CXL switch) in between an endpoint
+ * and a hostbridge the platform specific driver must be notified after all the
+ * components are enumerated.
+ */
+static void walk_to_root_port(struct device *dev, struct walk_ctx *ctx)
+{
+	struct cxl_dport *root_port;
+
+	if (!dev->parent)
+		return;
+
+	root_port = cxl_get_root_dport(dev);
+	if (root_port)
+		ctx->root_port = root_port;
+
+	if (is_cxl_switch_usp(dev))
+		ctx->has_switch = true;
+
+	walk_to_root_port(dev->parent, ctx);
+}
+
+static void remove_endpoint(void *_cxlmd)
+{
+	struct cxl_memdev *cxlmd = _cxlmd;
+
+	if (cxlmd->root_port)
+		sysfs_remove_link(&cxlmd->dev.kobj, "root_port");
+}
+
+static int wait_for_media(struct cxl_memdev *cxlmd)
+{
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	struct cxl_endpoint_dvsec_info *info = cxlds->info;
+	int rc;
+
+	if (!info)
+		return -ENXIO;
+
+	if (!info->mem_enabled)
+		return -EBUSY;
+
+	rc = cxlds->wait_media_ready(cxlds);
+	if (rc)
+		return rc;
+
+	/*
+	 * The device is now known to be active and enabled. If any ranges are
+	 * non-zero, they must be checked later, before adding the port, since
+	 * the port owns the HDM decoder registers.
+	 */
+	return 0;
+}
+
+static int create_endpoint(struct device *dev, struct cxl_port *parent,
+			   struct cxl_dport *dport)
+{
+	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	struct cxl_endpoint_dvsec_info *info = cxlds->info;
+	struct cxl_port *endpoint;
+	int rc;
+
+	endpoint =
+		devm_cxl_add_port(dev, cxlds->component_reg_phys, parent, info);
+	if (IS_ERR(endpoint))
+		return PTR_ERR(endpoint);
+
+	rc = sysfs_create_link(&cxlmd->dev.kobj, &dport->dport->kobj,
+			       "root_port");
+	if (rc) {
+		device_del(&endpoint->dev);
+		return rc;
+	}
+	dev_dbg(dev, "add: %s\n", dev_name(&endpoint->dev));
+
+	return devm_add_action_or_reset(dev, remove_endpoint, cxlmd);
+}
+
+static int cxl_mem_probe(struct device *dev)
+{
+	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
+	struct cxl_port *hostbridge, *parent_port;
+	struct walk_ctx ctx = { NULL, false };
+	struct cxl_dport *dport;
+	int rc;
+
+	rc = wait_for_media(cxlmd);
+	if (rc) {
+		dev_err(dev, "Media not active (%d)\n", rc);
+		return rc;
+	}
+
+	walk_to_root_port(dev, &ctx);
+
+	/*
+	 * Couldn't find a CXL capable root port. This may happen even with a
+	 * CXL capable topology if cxl_acpi hasn't completed yet. A rescan will
+	 * occur.
+	 */
+	if (!ctx.root_port)
+		return -ENODEV;
+
+	hostbridge = ctx.root_port->port;
+	device_lock(&hostbridge->dev);
+
+	/* hostbridge has no port driver, the topology isn't enabled yet */
+	if (!hostbridge->dev.driver) {
+		device_unlock(&hostbridge->dev);
+		return -ENODEV;
+	}
+
+	/* No switch + found root port means we're done */
+	if (!ctx.has_switch) {
+		parent_port = to_cxl_port(&hostbridge->dev);
+		dport = ctx.root_port;
+		goto out;
+	}
+
+	/* Walk down from the root port and add all switches */
+	cxl_scan_ports(ctx.root_port);
+
+	/* If parent is a dport the endpoint is good to go. */
+	parent_port = to_cxl_port(dev->parent->parent);
+	dport = cxl_find_dport_by_dev(parent_port, dev->parent);
+	if (!dport) {
+		rc = -ENODEV;
+		goto err_out;
+	}
+
+out:
+	rc = create_endpoint(dev, parent_port, dport);
+	if (rc)
+		goto err_out;
+
+	cxlmd->root_port = ctx.root_port;
+
+err_out:
+	device_unlock(&hostbridge->dev);
+	return rc;
+}
+
+static struct cxl_driver cxl_mem_driver = {
+	.name = "cxl_mem",
+	.probe = cxl_mem_probe,
+	.id = CXL_DEVICE_MEMORY_EXPANDER,
+};
+
+module_cxl_driver(cxl_mem_driver);
+
+MODULE_LICENSE("GPL v2");
+MODULE_IMPORT_NS(CXL);
+MODULE_ALIAS_CXL(CXL_DEVICE_MEMORY_EXPANDER);
+MODULE_SOFTDEP("pre: cxl_port");
diff --git a/drivers/cxl/pci.h b/drivers/cxl/pci.h
index 2a48cd65bf59..3fd0909522f2 100644
--- a/drivers/cxl/pci.h
+++ b/drivers/cxl/pci.h
@@ -15,6 +15,7 @@
 
 /* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
 #define CXL_DVSEC_PCIE_DEVICE					0
+
 #define   CXL_DVSEC_PCIE_DEVICE_CAP_OFFSET			0xA
 #define     CXL_DVSEC_PCIE_DEVICE_MEM_CAPABLE			BIT(2)
 #define     CXL_DVSEC_PCIE_DEVICE_HDM_COUNT_MASK		GENMASK(5, 4)
@@ -64,4 +65,7 @@ enum cxl_regloc_type {
 	((resource_size_t)(pci_resource_start(pdev, (map)->barno) +            \
 			   (map)->block_offset))
 
+bool is_cxl_switch_usp(struct device *dev);
+bool is_cxl_switch_dsp(struct device *dev);
+
 #endif /* __CXL_PCI_H__ */
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index 7a1fc726fe9f..02b1c8cf7567 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -108,8 +108,16 @@ static u64 get_decoder_size(void __iomem *hdm_decoder, int n)
 
 static bool is_endpoint_port(struct cxl_port *port)
 {
-	/* Endpoints can't be ports... yet! */
-	return false;
+	/*
+	 * It's tempting to just check list_empty(port->dports) here, but this
+	 * might get called before dports are setup for a port.
+	 */
+
+	if (!port->uport->driver)
+		return false;
+
+	return to_cxl_drv(port->uport->driver)->id ==
+	       CXL_DEVICE_MEMORY_EXPANDER;
 }
 
 static void rescan_ports(struct work_struct *work)
diff --git a/tools/testing/cxl/Kbuild b/tools/testing/cxl/Kbuild
index 1acdf2fc31c5..4c2359772f3c 100644
--- a/tools/testing/cxl/Kbuild
+++ b/tools/testing/cxl/Kbuild
@@ -30,6 +30,7 @@ cxl_core-y += $(CXL_CORE_SRC)/pmem.o
 cxl_core-y += $(CXL_CORE_SRC)/regs.o
 cxl_core-y += $(CXL_CORE_SRC)/memdev.o
 cxl_core-y += $(CXL_CORE_SRC)/mbox.o
+cxl_core-y += $(CXL_CORE_SRC)/pci.o
 cxl_core-y += config_check.o
 
 cxl_core-y += mock_pmem.o
-- 
2.34.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 23/23] cxl/mem: Disable switch hierarchies for now
  2021-11-20  0:02 [PATCH 00/23] Add drivers for CXL ports and mem devices Ben Widawsky
                   ` (21 preceding siblings ...)
  2021-11-20  0:02 ` [PATCH 22/23] cxl/mem: Introduce cxl_mem driver Ben Widawsky
@ 2021-11-20  0:02 ` Ben Widawsky
  2021-11-22 18:19   ` Jonathan Cameron
  22 siblings, 1 reply; 133+ messages in thread
From: Ben Widawsky @ 2021-11-20  0:02 UTC (permalink / raw)
  To: linux-cxl, linux-pci
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Switches aren't supported by the region driver yet. If a device finds
itself under a switch it will not bind a driver so that it cannot be
used later for region creation/configuration.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/mem.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index e954144af4b8..997898e78d63 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -155,6 +155,11 @@ static int cxl_mem_probe(struct device *dev)
 		goto out;
 	}
 
+	/* FIXME: Add true switch support */
+	dev_err(dev, "Devices behind switches are currently unsupported\n");
+	rc = -ENODEV;
+	goto err_out;
+
 	/* Walk down from the root port and add all switches */
 	cxl_scan_ports(ctx.root_port);
 
-- 
2.34.0


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* Re: [PATCH 22/23] cxl/mem: Introduce cxl_mem driver
  2021-11-20  0:02 ` [PATCH 22/23] cxl/mem: Introduce cxl_mem driver Ben Widawsky
@ 2021-11-20  0:40   ` Randy Dunlap
  2021-11-21  3:55     ` Ben Widawsky
  2021-11-22 18:17   ` Jonathan Cameron
  1 sibling, 1 reply; 133+ messages in thread
From: Randy Dunlap @ 2021-11-20  0:40 UTC (permalink / raw)
  To: Ben Widawsky, linux-cxl, linux-pci
  Cc: Alison Schofield, Dan Williams, Ira Weiny, Jonathan Cameron,
	Vishal Verma

On 11/19/21 4:02 PM, Ben Widawsky wrote:
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index 3aeb33bba5a3..f5553443ba2a 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -30,6 +30,21 @@ config CXL_PCI
>   
>   	  If unsure say 'm'.
>   
> +config CXL_MEM
> +	tristate "CXL.mem: Memory Devices"
> +	default CXL_BUS
> +        help
> +          The CXL.mem protocol allows a device to act as a provider of
> +	  "System RAM" and/or "Persistent Memory" that is fully coherent
> +	  as if the memory was attached to the typical CPU memory controller.
> +	  This is known as HDM "Host-managed Device Memory".
> +
> +	  Say 'y/m' to enable a driver that will attach to CXL.mem devices for
> +	  memory expansion and control of HDM. See Chapter 9.13 in the CXL 2.0
> +	  specification for a detailed description of HDM.
> +
> +	  If unsure say 'm'.

Hi Ben,

Both patch 20 and patch 22 add a "new" CXL_MEM config symbol.
Is one of them a typo?

thanks.
-- 
~Randy

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 20/23] cxl/port: Introduce a port driver
  2021-11-20  0:02 ` [PATCH 20/23] cxl/port: Introduce a port driver Ben Widawsky
@ 2021-11-20  3:14     ` kernel test robot
  2021-11-20  5:38     ` kernel test robot
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 133+ messages in thread
From: kernel test robot @ 2021-11-20  3:14 UTC (permalink / raw)
  To: Ben Widawsky, linux-cxl, linux-pci
  Cc: kbuild-all, Ben Widawsky, Alison Schofield, Dan Williams,
	Ira Weiny, Jonathan Cameron, Vishal Verma

[-- Attachment #1: Type: text/plain, Size: 2037 bytes --]

Hi Ben,

I love your patch! Yet something to improve:

[auto build test ERROR on 53989fad1286e652ea3655ae3367ba698da8d2ff]

url:    https://github.com/0day-ci/linux/commits/Ben-Widawsky/Add-drivers-for-CXL-ports-and-mem-devices/20211120-080513
base:   53989fad1286e652ea3655ae3367ba698da8d2ff
config: alpha-randconfig-r026-20211118 (attached as .config)
compiler: alpha-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/8ff43502e84dd4fa1296a131cb0cc82146389db4
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Ben-Widawsky/Add-drivers-for-CXL-ports-and-mem-devices/20211120-080513
        git checkout 8ff43502e84dd4fa1296a131cb0cc82146389db4
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross ARCH=alpha 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   drivers/cxl/port.c: In function 'get_decoder_size':
>> drivers/cxl/port.c:105:16: error: implicit declaration of function 'ioread64_hi_lo' [-Werror=implicit-function-declaration]
     105 |         return ioread64_hi_lo(hdm_decoder +
         |                ^~~~~~~~~~~~~~
   cc1: some warnings being treated as errors


vim +/ioread64_hi_lo +105 drivers/cxl/port.c

    97	
    98	static u64 get_decoder_size(void __iomem *hdm_decoder, int n)
    99	{
   100		u32 ctrl = readl(hdm_decoder + CXL_HDM_DECODER0_CTRL_OFFSET(n));
   101	
   102		if (ctrl & CXL_HDM_DECODER0_CTRL_COMMITTED)
   103			return 0;
   104	
 > 105		return ioread64_hi_lo(hdm_decoder +
   106				      CXL_HDM_DECODER0_SIZE_LOW_OFFSET(n));
   107	}
   108	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 38253 bytes --]

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 17/23] cxl: Cache and pass DVSEC ranges
  2021-11-20  0:02 ` [PATCH 17/23] cxl: Cache and pass DVSEC ranges Ben Widawsky
@ 2021-11-20  4:29     ` kernel test robot
  2021-11-22 17:00   ` Jonathan Cameron
  2021-11-26 11:37   ` Jonathan Cameron
  2 siblings, 0 replies; 133+ messages in thread
From: kernel test robot @ 2021-11-20  4:29 UTC (permalink / raw)
  To: Ben Widawsky, linux-cxl, linux-pci
  Cc: llvm, kbuild-all, Ben Widawsky, Alison Schofield, Dan Williams,
	Ira Weiny, Jonathan Cameron, Vishal Verma

[-- Attachment #1: Type: text/plain, Size: 3171 bytes --]

Hi Ben,

I love your patch! Perhaps something to improve:

[auto build test WARNING on 53989fad1286e652ea3655ae3367ba698da8d2ff]

url:    https://github.com/0day-ci/linux/commits/Ben-Widawsky/Add-drivers-for-CXL-ports-and-mem-devices/20211120-080513
base:   53989fad1286e652ea3655ae3367ba698da8d2ff
config: riscv-buildonly-randconfig-r001-20211119 (attached as .config)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install riscv cross compiling tool for clang build
        # apt-get install binutils-riscv64-linux-gnu
        # https://github.com/0day-ci/linux/commit/cfdf51e15fc8229a494ee59d05bc7459ab5eecd8
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Ben-Widawsky/Add-drivers-for-CXL-ports-and-mem-devices/20211120-080513
        git checkout cfdf51e15fc8229a494ee59d05bc7459ab5eecd8
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 ARCH=riscv 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> drivers/cxl/pci.c:469:7: warning: variable 'size' set but not used [-Wunused-but-set-variable]
                   u64 size;
                       ^
   1 warning generated.


vim +/size +469 drivers/cxl/pci.c

   454	
   455	#define CDPD(cxlds, which)                                                     \
   456		cxlds->device_dvsec + CXL_DVSEC_PCIE_DEVICE_##which##_OFFSET
   457	
   458	#define CDPDR(cxlds, which, sorb, lohi)                                        \
   459		cxlds->device_dvsec +                                                  \
   460			CXL_DVSEC_PCIE_DEVICE_RANGE_##sorb##_##lohi##_OFFSET(which)
   461	
   462	static int wait_for_valid(struct cxl_dev_state *cxlds)
   463	{
   464		struct pci_dev *pdev = to_pci_dev(cxlds->dev);
   465		const unsigned long timeout = jiffies + HZ;
   466		bool valid;
   467	
   468		do {
 > 469			u64 size;
   470			u32 temp;
   471			int rc;
   472	
   473			rc = pci_read_config_dword(pdev, CDPDR(cxlds, 0, SIZE, HIGH),
   474						   &temp);
   475			if (rc)
   476				return -ENXIO;
   477			size = (u64)temp << 32;
   478	
   479			rc = pci_read_config_dword(pdev, CDPDR(cxlds, 0, SIZE, LOW),
   480						   &temp);
   481			if (rc)
   482				return -ENXIO;
   483			size |= temp & CXL_DVSEC_PCIE_DEVICE_MEM_SIZE_LOW_MASK;
   484	
   485			/*
   486			 * Memory_Info_Valid: When set, indicates that the CXL Range 1
   487			 * Size high and Size Low registers are valid. Must be set
   488			 * within 1 second of deassertion of reset to CXL device.
   489			 */
   490			valid = FIELD_GET(CXL_DVSEC_PCIE_DEVICE_MEM_INFO_VALID, temp);
   491			if (valid)
   492				break;
   493			cpu_relax();
   494		} while (!time_after(jiffies, timeout));
   495	
   496		return valid ? 0 : -ETIMEDOUT;
   497	}
   498	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 37677 bytes --]

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 17/23] cxl: Cache and pass DVSEC ranges
@ 2021-11-20  4:29     ` kernel test robot
  0 siblings, 0 replies; 133+ messages in thread
From: kernel test robot @ 2021-11-20  4:29 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 3254 bytes --]

Hi Ben,

I love your patch! Perhaps something to improve:

[auto build test WARNING on 53989fad1286e652ea3655ae3367ba698da8d2ff]

url:    https://github.com/0day-ci/linux/commits/Ben-Widawsky/Add-drivers-for-CXL-ports-and-mem-devices/20211120-080513
base:   53989fad1286e652ea3655ae3367ba698da8d2ff
config: riscv-buildonly-randconfig-r001-20211119 (attached as .config)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install riscv cross compiling tool for clang build
        # apt-get install binutils-riscv64-linux-gnu
        # https://github.com/0day-ci/linux/commit/cfdf51e15fc8229a494ee59d05bc7459ab5eecd8
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Ben-Widawsky/Add-drivers-for-CXL-ports-and-mem-devices/20211120-080513
        git checkout cfdf51e15fc8229a494ee59d05bc7459ab5eecd8
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 ARCH=riscv 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> drivers/cxl/pci.c:469:7: warning: variable 'size' set but not used [-Wunused-but-set-variable]
                   u64 size;
                       ^
   1 warning generated.


vim +/size +469 drivers/cxl/pci.c

   454	
   455	#define CDPD(cxlds, which)                                                     \
   456		cxlds->device_dvsec + CXL_DVSEC_PCIE_DEVICE_##which##_OFFSET
   457	
   458	#define CDPDR(cxlds, which, sorb, lohi)                                        \
   459		cxlds->device_dvsec +                                                  \
   460			CXL_DVSEC_PCIE_DEVICE_RANGE_##sorb##_##lohi##_OFFSET(which)
   461	
   462	static int wait_for_valid(struct cxl_dev_state *cxlds)
   463	{
   464		struct pci_dev *pdev = to_pci_dev(cxlds->dev);
   465		const unsigned long timeout = jiffies + HZ;
   466		bool valid;
   467	
   468		do {
 > 469			u64 size;
   470			u32 temp;
   471			int rc;
   472	
   473			rc = pci_read_config_dword(pdev, CDPDR(cxlds, 0, SIZE, HIGH),
   474						   &temp);
   475			if (rc)
   476				return -ENXIO;
   477			size = (u64)temp << 32;
   478	
   479			rc = pci_read_config_dword(pdev, CDPDR(cxlds, 0, SIZE, LOW),
   480						   &temp);
   481			if (rc)
   482				return -ENXIO;
   483			size |= temp & CXL_DVSEC_PCIE_DEVICE_MEM_SIZE_LOW_MASK;
   484	
   485			/*
   486			 * Memory_Info_Valid: When set, indicates that the CXL Range 1
   487			 * Size high and Size Low registers are valid. Must be set
   488			 * within 1 second of deassertion of reset to CXL device.
   489			 */
   490			valid = FIELD_GET(CXL_DVSEC_PCIE_DEVICE_MEM_INFO_VALID, temp);
   491			if (valid)
   492				break;
   493			cpu_relax();
   494		} while (!time_after(jiffies, timeout));
   495	
   496		return valid ? 0 : -ETIMEDOUT;
   497	}
   498	
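
In the listing above, only the Size Low read feeds Memory_Info_Valid; 'size' is assembled but never consumed, which is what clang flags. A minimal userspace sketch of the same poll loop without the unused variable follows. It is a model, not the driver: read_dword_fn, struct fake_dev, fake_read_size_low() and wait_for_valid_model() are hypothetical names, the bit position of MEM_INFO_VALID is an assumption, and a poll count stands in for the jiffies-based one second timeout.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed bit position of Memory_Info_Valid in the Range Size Low dword. */
#define MEM_INFO_VALID (1u << 0)

/* Hypothetical stand-in for pci_read_config_dword() on Range Size Low. */
typedef int (*read_dword_fn)(void *ctx, uint32_t *val);

/* Fake device: reports "not valid" for the first reads_until_valid reads. */
struct fake_dev {
	unsigned int reads_until_valid;
};

static int fake_read_size_low(void *ctx, uint32_t *val)
{
	struct fake_dev *dev = ctx;

	if (dev->reads_until_valid > 0) {
		dev->reads_until_valid--;
		*val = 0;
	} else {
		*val = MEM_INFO_VALID;
	}
	return 0;
}

/*
 * Poll only the register that carries the valid bit. Nothing here
 * accumulates a 'size', so there is nothing left set but unused.
 */
static int wait_for_valid_model(read_dword_fn read_size_low, void *ctx,
				unsigned int max_polls)
{
	for (unsigned int i = 0; i < max_polls; i++) {
		uint32_t temp;

		if (read_size_low(ctx, &temp))
			return -ENXIO;
		if (temp & MEM_INFO_VALID)
			return 0;
	}
	return -ETIMEDOUT;
}
```

If a later patch needs the range size in this loop, keeping both reads and actually using 'size' would be the other way to resolve the warning.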

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 37677 bytes --]

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 20/23] cxl/port: Introduce a port driver
  2021-11-20  0:02 ` [PATCH 20/23] cxl/port: Introduce a port driver Ben Widawsky
@ 2021-11-20  5:38     ` kernel test robot
  2021-11-20  5:38     ` kernel test robot
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 133+ messages in thread
From: kernel test robot @ 2021-11-20  5:38 UTC (permalink / raw)
  To: Ben Widawsky, linux-cxl, linux-pci
  Cc: llvm, kbuild-all, Ben Widawsky, Alison Schofield, Dan Williams,
	Ira Weiny, Jonathan Cameron, Vishal Verma

[-- Attachment #1: Type: text/plain, Size: 2244 bytes --]

Hi Ben,

I love your patch! Yet something to improve:

[auto build test ERROR on 53989fad1286e652ea3655ae3367ba698da8d2ff]

url:    https://github.com/0day-ci/linux/commits/Ben-Widawsky/Add-drivers-for-CXL-ports-and-mem-devices/20211120-080513
base:   53989fad1286e652ea3655ae3367ba698da8d2ff
config: arm-randconfig-r034-20211119 (attached as .config)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install arm cross compiling tool for clang build
        # apt-get install binutils-arm-linux-gnueabi
        # https://github.com/0day-ci/linux/commit/8ff43502e84dd4fa1296a131cb0cc82146389db4
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Ben-Widawsky/Add-drivers-for-CXL-ports-and-mem-devices/20211120-080513
        git checkout 8ff43502e84dd4fa1296a131cb0cc82146389db4
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 ARCH=arm 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

>> drivers/cxl/port.c:105:9: error: implicit declaration of function 'ioread64_hi_lo' [-Werror,-Wimplicit-function-declaration]
           return ioread64_hi_lo(hdm_decoder +
                  ^
   drivers/cxl/port.c:191:11: error: implicit declaration of function 'ioread64_hi_lo' [-Werror,-Wimplicit-function-declaration]
                           base = ioread64_hi_lo(decoderN(BASE_LOW_OFFSET, i));
                                  ^
   2 errors generated.


vim +/ioread64_hi_lo +105 drivers/cxl/port.c

    97	
    98	static u64 get_decoder_size(void __iomem *hdm_decoder, int n)
    99	{
   100		u32 ctrl = readl(hdm_decoder + CXL_HDM_DECODER0_CTRL_OFFSET(n));
   101	
   102		if (ctrl & CXL_HDM_DECODER0_CTRL_COMMITTED)
   103			return 0;
   104	
 > 105		return ioread64_hi_lo(hdm_decoder +
   106				      CXL_HDM_DECODER0_SIZE_LOW_OFFSET(n));
   107	}
   108	
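
The error is what one would expect when a 32-bit config gets no declaration of ioread64_hi_lo() from the headers port.c already includes. Including <linux/io-64-nonatomic-hi-lo.h> in port.c is the usual way to obtain one, though that fix is an assumption here and is not stated in this report. The userspace sketch below only illustrates what a "hi_lo" 64-bit read means on a bus limited to 32-bit accesses; model_read64_hi_lo() is a hypothetical name and plain memory stands in for MMIO.

```c
#include <stdint.h>

/*
 * Model of a non-atomic 64-bit register read built from two 32-bit reads,
 * upper dword first. The register is laid out as { low, high } at
 * consecutive 32-bit offsets.
 */
static uint64_t model_read64_hi_lo(const volatile uint32_t *reg)
{
	uint32_t high = reg[1];	/* upper 32 bits are read first */
	uint32_t low = reg[0];	/* then the lower 32 bits */

	return ((uint64_t)high << 32) | low;
}
```

Because the two halves are read separately, the result is not atomic with respect to hardware updating the register between the reads.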

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 38342 bytes --]

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 19/23] cxl/pci: Store component register base in cxlds
  2021-11-20  0:02 ` [PATCH 19/23] cxl/pci: Store component register base in cxlds Ben Widawsky
@ 2021-11-20  7:28     ` kernel test robot
  2021-11-22 17:11   ` Jonathan Cameron
  1 sibling, 0 replies; 133+ messages in thread
From: kernel test robot @ 2021-11-20  7:28 UTC (permalink / raw)
  To: Ben Widawsky, linux-cxl, linux-pci
  Cc: llvm, kbuild-all, Ben Widawsky, Alison Schofield, Dan Williams,
	Ira Weiny, Jonathan Cameron, Vishal Verma

[-- Attachment #1: Type: text/plain, Size: 5236 bytes --]

Hi Ben,

I love your patch! Perhaps something to improve:

[auto build test WARNING on 53989fad1286e652ea3655ae3367ba698da8d2ff]

url:    https://github.com/0day-ci/linux/commits/Ben-Widawsky/Add-drivers-for-CXL-ports-and-mem-devices/20211120-080513
base:   53989fad1286e652ea3655ae3367ba698da8d2ff
config: riscv-buildonly-randconfig-r001-20211119 (attached as .config)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install riscv cross compiling tool for clang build
        # apt-get install binutils-riscv64-linux-gnu
        # https://github.com/0day-ci/linux/commit/3f74c99d751a24a4c12ba76c23b68c2832f805e1
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Ben-Widawsky/Add-drivers-for-CXL-ports-and-mem-devices/20211120-080513
        git checkout 3f74c99d751a24a4c12ba76c23b68c2832f805e1
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 ARCH=riscv 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   drivers/cxl/pci.c:469:7: warning: variable 'size' set but not used [-Wunused-but-set-variable]
                   u64 size;
                       ^
   drivers/cxl/pci.c:516:7: warning: variable 'size' set but not used [-Wunused-but-set-variable]
                   u64 size;
                       ^
>> drivers/cxl/pci.c:673:13: warning: variable 'cxlmd' is uninitialized when used here [-Wuninitialized]
                   dev_warn(&cxlmd->dev, "No component registers (%d)\n", rc);
                             ^~~~~
   include/linux/dev_printk.h:146:49: note: expanded from macro 'dev_warn'
           dev_printk_index_wrap(_dev_warn, KERN_WARNING, dev, dev_fmt(fmt), ##__VA_ARGS__)
                                                          ^~~
   include/linux/dev_printk.h:110:11: note: expanded from macro 'dev_printk_index_wrap'
                   _p_func(dev, fmt, ##__VA_ARGS__);                       \
                           ^~~
   drivers/cxl/pci.c:630:26: note: initialize the variable 'cxlmd' to silence this warning
           struct cxl_memdev *cxlmd;
                                   ^
                                    = NULL
   3 warnings generated.


vim +/cxlmd +673 drivers/cxl/pci.c

   625	
   626	static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
   627	{
   628		struct cxl_endpoint_dvsec_info *info;
   629		struct cxl_register_map map;
   630		struct cxl_memdev *cxlmd;
   631		struct cxl_dev_state *cxlds;
   632		int rc;
   633	
   634		/*
   635		 * Double check the anonymous union trickery in struct cxl_regs
   636		 * FIXME switch to struct_group()
   637		 */
   638		BUILD_BUG_ON(offsetof(struct cxl_regs, memdev) !=
   639			     offsetof(struct cxl_regs, device_regs.memdev));
   640	
   641		rc = pcim_enable_device(pdev);
   642		if (rc)
   643			return rc;
   644	
   645		cxlds = cxl_dev_state_create(&pdev->dev);
   646		if (IS_ERR(cxlds))
   647			return PTR_ERR(cxlds);
   648	
   649		cxlds->device_dvsec = pci_find_dvsec_capability(pdev,
   650								PCI_DVSEC_VENDOR_ID_CXL,
   651								CXL_DVSEC_PCIE_DEVICE);
   652		if (!cxlds->device_dvsec)
   653			dev_warn(&pdev->dev,
   654				 "Device DVSEC not present. Expect limited functionality.\n");
   655		else
   656			cxlds->wait_media_ready = wait_for_media_ready;
   657	
   658		rc = cxl_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
   659		if (rc)
   660			return rc;
   661	
   662		rc = cxl_map_regs(cxlds, &map);
   663		if (rc)
   664			return rc;
   665	
   666		/*
   667		 * If the component registers can't be found, the cxl_pci driver may
   668		 * still be useful for management functions so don't return an error.
   669		 */
   670		cxlds->component_reg_phys = CXL_RESOURCE_NONE;
   671		rc = cxl_setup_regs(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
   672		if (rc)
 > 673			dev_warn(&cxlmd->dev, "No component registers (%d)\n", rc);
   674		else
   675			cxlds->component_reg_phys = cxl_reg_block(pdev, &map);
   676	
   677		rc = cxl_pci_setup_mailbox(cxlds);
   678		if (rc)
   679			return rc;
   680	
   681		rc = cxl_enumerate_cmds(cxlds);
   682		if (rc)
   683			return rc;
   684	
   685		rc = cxl_dev_state_identify(cxlds);
   686		if (rc)
   687			return rc;
   688	
   689		rc = cxl_mem_create_range_info(cxlds);
   690		if (rc)
   691			return rc;
   692	
   693		info = dvsec_ranges(cxlds);
   694		if (IS_ERR(info))
   695			dev_err(&pdev->dev,
   696				"Failed to get DVSEC range information (%ld)\n",
   697				PTR_ERR(info));
   698		else
   699			cxlds->info = info;
   700	
   701		cxlmd = devm_cxl_add_memdev(cxlds);
   702		if (IS_ERR(cxlmd))
   703			return PTR_ERR(cxlmd);
   704	
   705		if (range_len(&cxlds->pmem_range) && IS_ENABLED(CONFIG_CXL_PMEM))
   706			rc = devm_cxl_add_nvdimm(&pdev->dev, cxlmd);
   707	
   708		return rc;
   709	}
   710	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 37677 bytes --]

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 22/23] cxl/mem: Introduce cxl_mem driver
  2021-11-20  0:40   ` Randy Dunlap
@ 2021-11-21  3:55     ` Ben Widawsky
  0 siblings, 0 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-21  3:55 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-11-19 16:40:09, Randy Dunlap wrote:
> On 11/19/21 4:02 PM, Ben Widawsky wrote:
> > diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> > index 3aeb33bba5a3..f5553443ba2a 100644
> > --- a/drivers/cxl/Kconfig
> > +++ b/drivers/cxl/Kconfig
> > @@ -30,6 +30,21 @@ config CXL_PCI
> >   	  If unsure say 'm'.
> > +config CXL_MEM
> > +	tristate "CXL.mem: Memory Devices"
> > +	default CXL_BUS
> > +        help
> > +          The CXL.mem protocol allows a device to act as a provider of
> > +	  "System RAM" and/or "Persistent Memory" that is fully coherent
> > +	  as if the memory was attached to the typical CPU memory controller.
> > +	  This is known as HDM "Host-managed Device Memory".
> > +
> > +	  Say 'y/m' to enable a driver that will attach to CXL.mem devices for
> > +	  memory expansion and control of HDM. See Chapter 9.13 in the CXL 2.0
> > +	  specification for a detailed description of HDM.
> > +
> > +	  If unsure say 'm'.
> 
> Hi Ben,
> 
> Both patch 20 and patch 22 add a "new" CXL_MEM config symbol.
> Is one of them a typo?
> 
> thanks.

Yep. Thank you. Fixed for v2 and I added a reported-by tag from you.

> -- 
> ~Randy

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 01/23] cxl: Rename CXL_MEM to CXL_PCI
  2021-11-20  0:02 ` [PATCH 01/23] cxl: Rename CXL_MEM to CXL_PCI Ben Widawsky
@ 2021-11-22 14:47   ` Jonathan Cameron
  2021-11-24  4:15   ` Dan Williams
  1 sibling, 0 replies; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-22 14:47 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Dan Williams, Alison Schofield, Ira Weiny,
	Vishal Verma

On Fri, 19 Nov 2021 16:02:28 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> The cxl_mem module was renamed cxl_pci in commit 21e9f76733a8 ("cxl:
> Rename mem to pci"). In preparation for adding an ancillary driver for
> cxl_memdev devices (registered on the cxl bus by cxl_pci), go ahead and
> rename CONFIG_CXL_MEM to CONFIG_CXL_PCI. Free up the CXL_MEM name for
> that new driver to manage CXL.mem endpoint operations.
> 
> Suggested-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Makes sense to me, particularly as it brings it in line with the file name.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> 
> ---
> Changes since RFCv2:
> - Reword commit message (Dan)
> - Reword Kconfig description (Dan)
> ---
>  drivers/cxl/Kconfig  | 23 ++++++++++++-----------
>  drivers/cxl/Makefile |  2 +-
>  2 files changed, 13 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index 67c91378f2dd..ef05e96f8f97 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -13,25 +13,26 @@ menuconfig CXL_BUS
>  
>  if CXL_BUS
>  
> -config CXL_MEM
> -	tristate "CXL.mem: Memory Devices"
> +config CXL_PCI
> +	tristate "PCI manageability"
>  	default CXL_BUS
>  	help
> -	  The CXL.mem protocol allows a device to act as a provider of
> -	  "System RAM" and/or "Persistent Memory" that is fully coherent
> -	  as if the memory was attached to the typical CPU memory
> -	  controller.
> +	  The CXL specification defines a "CXL memory device" sub-class in the
> +	  PCI "memory controller" base class of devices. Device's identified by
> +	  this class code provide support for volatile and / or persistent
> +	  memory to be mapped into the system address map (Host-managed Device
> +	  Memory (HDM)).
>  
> -	  Say 'y/m' to enable a driver that will attach to CXL.mem devices for
> -	  configuration and management primarily via the mailbox interface. See
> -	  Chapter 2.3 Type 3 CXL Device in the CXL 2.0 specification for more
> -	  details.
> +	  Say 'y/m' to enable a driver that will attach to CXL memory expander
> +	  devices enumerated by the memory device class code for configuration
> +	  and management primarily via the mailbox interface. See Chapter 2.3
> +	  Type 3 CXL Device in the CXL 2.0 specification for more details.
>  
>  	  If unsure say 'm'.
>  
>  config CXL_MEM_RAW_COMMANDS
>  	bool "RAW Command Interface for Memory Devices"
> -	depends on CXL_MEM
> +	depends on CXL_PCI
>  	help
>  	  Enable CXL RAW command interface.
>  
> diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
> index d1aaabc940f3..cf07ae6cea17 100644
> --- a/drivers/cxl/Makefile
> +++ b/drivers/cxl/Makefile
> @@ -1,6 +1,6 @@
>  # SPDX-License-Identifier: GPL-2.0
>  obj-$(CONFIG_CXL_BUS) += core/
> -obj-$(CONFIG_CXL_MEM) += cxl_pci.o
> +obj-$(CONFIG_CXL_PCI) += cxl_pci.o
>  obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
>  obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
>  


^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 02/23] cxl: Flesh out register names
  2021-11-20  0:02 ` [PATCH 02/23] cxl: Flesh out register names Ben Widawsky
@ 2021-11-22 14:49   ` Jonathan Cameron
  2021-11-24  4:24   ` Dan Williams
  1 sibling, 0 replies; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-22 14:49 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On Fri, 19 Nov 2021 16:02:29 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> Get a better naming scheme in place for upcoming additions.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

Seems like a good balance to me between the different directions this
sort of naming gets dragged in.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
> Changes since RFCv2:
> Use some abbreviations (Jonathan)
> Prefix everything with CXL (Jonathan)
> Remove new additions (Dan)
> 
> Original discussion motivating this occurred here:
> https://lore.kernel.org/linux-pci/20210913190131.xiiszmno46qie7v5@intel.com/
> ---
>  drivers/cxl/pci.c | 14 +++++++-------
>  drivers/cxl/pci.h | 19 ++++++++++---------
>  2 files changed, 17 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 8dc91fd3396a..a6ea9811a05b 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -403,10 +403,10 @@ static int cxl_map_regs(struct cxl_dev_state *cxlds, struct cxl_register_map *ma
>  static void cxl_decode_regblock(u32 reg_lo, u32 reg_hi,
>  				struct cxl_register_map *map)
>  {
> -	map->block_offset =
> -		((u64)reg_hi << 32) | (reg_lo & CXL_REGLOC_ADDR_MASK);
> -	map->barno = FIELD_GET(CXL_REGLOC_BIR_MASK, reg_lo);
> -	map->reg_type = FIELD_GET(CXL_REGLOC_RBI_MASK, reg_lo);
> +	map->block_offset = ((u64)reg_hi << 32) |
> +			    (reg_lo & CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK);
> +	map->barno = FIELD_GET(CXL_DVSEC_REG_LOCATOR_BIR_MASK, reg_lo);
> +	map->reg_type = FIELD_GET(CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK, reg_lo);
>  }
>  
>  /**
> @@ -427,15 +427,15 @@ static int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
>  	int regloc, i;
>  
>  	regloc = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
> -					   PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID);
> +					   CXL_DVSEC_REG_LOCATOR);
>  	if (!regloc)
>  		return -ENXIO;
>  
>  	pci_read_config_dword(pdev, regloc + PCI_DVSEC_HEADER1, &regloc_size);
>  	regloc_size = FIELD_GET(PCI_DVSEC_HEADER1_LENGTH_MASK, regloc_size);
>  
> -	regloc += PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET;
> -	regblocks = (regloc_size - PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET) / 8;
> +	regloc += CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET;
> +	regblocks = (regloc_size - CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET) / 8;
>  
>  	for (i = 0; i < regblocks; i++, regloc += 8) {
>  		u32 reg_lo, reg_hi;
> diff --git a/drivers/cxl/pci.h b/drivers/cxl/pci.h
> index 7d3e4bf06b45..29b8eaef3a0a 100644
> --- a/drivers/cxl/pci.h
> +++ b/drivers/cxl/pci.h
> @@ -7,17 +7,21 @@
>  
>  /*
>   * See section 8.1 Configuration Space Registers in the CXL 2.0
> - * Specification
> + * Specification. Names are taken straight from the specification with "CXL" and
> + * "DVSEC" redundancies removed. When obvious, abbreviations may be used.
>   */
>  #define PCI_DVSEC_HEADER1_LENGTH_MASK	GENMASK(31, 20)
>  #define PCI_DVSEC_VENDOR_ID_CXL		0x1E98
> -#define PCI_DVSEC_ID_CXL		0x0
>  
> -#define PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID	0x8
> -#define PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET	0xC
> +/* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
> +#define CXL_DVSEC_PCIE_DEVICE					0
>  
> -/* BAR Indicator Register (BIR) */
> -#define CXL_REGLOC_BIR_MASK GENMASK(2, 0)
> +/* CXL 2.0 8.1.9: Register Locator DVSEC */
> +#define CXL_DVSEC_REG_LOCATOR					8
> +#define   CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET			0xC
> +#define     CXL_DVSEC_REG_LOCATOR_BIR_MASK			GENMASK(2, 0)
> +#define	    CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK			GENMASK(15, 8)
> +#define     CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK		GENMASK(31, 16)
>  
>  /* Register Block Identifier (RBI) */
>  enum cxl_regloc_type {
> @@ -28,7 +32,4 @@ enum cxl_regloc_type {
>  	CXL_REGLOC_RBI_TYPES
>  };
>  
> -#define CXL_REGLOC_RBI_MASK GENMASK(15, 8)
> -#define CXL_REGLOC_ADDR_MASK GENMASK(31, 16)
> -
>  #endif /* __CXL_PCI_H__ */
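
For anyone checking the renamed masks, here is a minimal userspace sketch of the decode that cxl_decode_regblock() performs in the patch above. The three field positions are taken from the quoted pci.h hunk; GENMASK32, struct regblock and decode_regblock() are hypothetical names used only for this illustration.

```c
#include <stdint.h>

/* 32-bit mask covering bits h..l inclusive, like the kernel's GENMASK(). */
#define GENMASK32(h, l) ((~0u >> (31 - (h))) & (~0u << (l)))

#define REG_LOCATOR_BIR_MASK		GENMASK32(2, 0)
#define REG_LOCATOR_BLOCK_ID_MASK	GENMASK32(15, 8)
#define REG_LOCATOR_BLOCK_OFF_LOW_MASK	GENMASK32(31, 16)

struct regblock {
	uint64_t block_offset;	/* offset of the register block in the BAR */
	unsigned int barno;	/* BAR Indicator Register (BIR) */
	unsigned int reg_type;	/* Register Block Identifier (RBI) */
};

/* Split one Register Locator entry (low and high dword) into its fields. */
static struct regblock decode_regblock(uint32_t reg_lo, uint32_t reg_hi)
{
	struct regblock map;

	map.block_offset = ((uint64_t)reg_hi << 32) |
			   (reg_lo & REG_LOCATOR_BLOCK_OFF_LOW_MASK);
	map.barno = reg_lo & REG_LOCATOR_BIR_MASK;
	map.reg_type = (reg_lo & REG_LOCATOR_BLOCK_ID_MASK) >> 8;
	return map;
}
```

With reg_lo = 0x00010302 and reg_hi = 0x1, this yields BAR 2, block identifier 3 and a block offset of 0x100010000.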


^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 04/23] cxl/pci: Implement Interface Ready Timeout
  2021-11-20  0:02 ` [PATCH 04/23] cxl/pci: Implement Interface Ready Timeout Ben Widawsky
@ 2021-11-22 15:02   ` Jonathan Cameron
  2021-11-22 17:17     ` Ben Widawsky
  0 siblings, 1 reply; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-22 15:02 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On Fri, 19 Nov 2021 16:02:31 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> The original driver implementation used the doorbell timeout for the
> Mailbox Interface Ready bit to piggy back off of, since the latter
> doesn't have a defined timeout. This functionality, introduced in
> 8adaf747c9f0 ("cxl/mem: Find device capabilities"), can now be improved
> since a timeout has been defined with an ECN to the 2.0 spec.
> 
> While devices implemented prior to the ECN could have an arbitrarily
> long wait and still be within spec, the max ECN value (256s) is chosen
> as the default for all devices. All vendors in the consortium agreed to
> this amount and so it is reasonable to assume no devices made will
> exceed this amount.

Optimistic :)

> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
> This patch did not exist in RFCv2
> ---
>  drivers/cxl/pci.c | 29 +++++++++++++++++++++++++++++
>  1 file changed, 29 insertions(+)
> 
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 6c8d09fb3a17..2cef9fec8599 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -2,6 +2,7 @@
>  /* Copyright(c) 2020 Intel Corporation. All rights reserved. */
>  #include <linux/io-64-nonatomic-lo-hi.h>
>  #include <linux/module.h>
> +#include <linux/delay.h>
>  #include <linux/sizes.h>
>  #include <linux/mutex.h>
>  #include <linux/list.h>
> @@ -298,6 +299,34 @@ static int cxl_pci_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *c
>  static int cxl_pci_setup_mailbox(struct cxl_dev_state *cxlds)
>  {
>  	const int cap = readl(cxlds->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
> +	unsigned long timeout;
> +	u64 md_status;
> +	int rc;
> +
> +	/*
> +	 * CXL 2.0 ECN "Add Mailbox Ready Time" defines a capability field to
> +	 * dictate how long to wait for the mailbox to become ready. For
> +	 * simplicity, and to handle devices that might have been implemented

I'm not keen on the 'for simplicity' argument here.  If the device is advertising
a lower value, then that is what we should use.  It's fine to wait the max time
if nothing is specified.  It'll cost us a few lines of code at most unless
I am missing something...

Jonathan

> +	 * prior to the ECN, wait the max amount of time no matter what the
> +	 * device says.
> +	 */
> +	timeout = jiffies + 256 * HZ;
> +
> +	rc = check_device_status(cxlds);
> +	if (rc)
> +		return rc;
> +
> +	do {
> +		md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
> +		if (md_status & CXLMDEV_MBOX_IF_READY)
> +			break;
> +		if (msleep_interruptible(100))
> +			break;
> +	} while (!time_after(jiffies, timeout));
> +
> +	/* It's assumed that once the interface is ready, it will remain ready. */
> +	if (!(md_status & CXLMDEV_MBOX_IF_READY))
> +		return -EIO;
>  
>  	cxlds->mbox_send = cxl_pci_mbox_send;
>  	cxlds->payload_size =


^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 03/23] cxl/pci: Extract device status check
  2021-11-20  0:02 ` [PATCH 03/23] cxl/pci: Extract device status check Ben Widawsky
@ 2021-11-22 15:03   ` Jonathan Cameron
  2021-11-24 19:30   ` Dan Williams
  1 sibling, 0 replies; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-22 15:03 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On Fri, 19 Nov 2021 16:02:30 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> The Memory Device Status register is inspected in the same way for at
> least two flows in the CXL Type 3 Memory Device Software Guide
> (Revision: 1.0): 2.13.9 Device discovery and mailbox ready sequence,
> and 2.13.10 Media ready sequence. Extract this common functionality for
> use by both.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
> This patch did not exist in RFCv2
> ---
>  drivers/cxl/pci.c | 33 +++++++++++++++++++++++++--------
>  1 file changed, 25 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index a6ea9811a05b..6c8d09fb3a17 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -182,6 +182,27 @@ static int __cxl_pci_mbox_send_cmd(struct cxl_dev_state *cxlds,
>  	return 0;
>  }
>  
> +/*
> + * Implements roughly the bottom half of Figure 42 of the CXL Type 3 Memory
> + * Device Software Guide
> + */
> +static int check_device_status(struct cxl_dev_state *cxlds)
> +{
> +	const u64 md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
> +
> +	if (md_status & CXLMDEV_DEV_FATAL) {
> +		dev_err(cxlds->dev, "Fatal: replace device\n");
> +		return -EIO;
> +	}
> +
> +	if (md_status & CXLMDEV_FW_HALT) {
> +		dev_err(cxlds->dev, "FWHalt: reset or replace device\n");
> +		return -EBUSY;
> +	}
> +
> +	return 0;
> +}
> +
>  /**
>   * cxl_pci_mbox_get() - Acquire exclusive access to the mailbox.
>   * @cxlds: The device state to gain access to.
> @@ -231,17 +252,13 @@ static int cxl_pci_mbox_get(struct cxl_dev_state *cxlds)
>  	 * Hardware shouldn't allow a ready status but also have failure bits
>  	 * set. Spit out an error, this should be a bug report
>  	 */
> -	rc = -EFAULT;
> -	if (md_status & CXLMDEV_DEV_FATAL) {
> -		dev_err(dev, "mbox: reported ready, but fatal\n");
> +	rc = check_device_status(cxlds);
> +	if (rc)
>  		goto out;
> -	}
> -	if (md_status & CXLMDEV_FW_HALT) {
> -		dev_err(dev, "mbox: reported ready, but halted\n");
> -		goto out;
> -	}
> +
>  	if (CXLMDEV_RESET_NEEDED(md_status)) {
>  		dev_err(dev, "mbox: reported ready, but reset needed\n");
> +		rc = -EFAULT;
>  		goto out;
>  	}
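
As a rough user-space model of the extracted helper, the sketch below takes the status word as a parameter instead of reading it from the device. The bit positions are illustrative assumptions (the real masks are defined in the driver headers), and `probe_wait_ready_once()` is a hypothetical caller. One visible effect of the quoted diff is that the mailbox-acquire path now reports -EIO or -EBUSY for these two conditions where it previously reported -EFAULT.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Illustrative bit positions, assumed for this sketch only. */
#define DEV_FATAL     (1ull << 0)
#define FW_HALT       (1ull << 1)
#define MBOX_IF_READY (1ull << 4)

/* Shared check: fatal and firmware-halt conditions map to distinct errors. */
static int check_device_status(uint64_t md_status)
{
	if (md_status & DEV_FATAL)
		return -EIO;	/* "Fatal: replace device" */
	if (md_status & FW_HALT)
		return -EBUSY;	/* "FWHalt: reset or replace device" */
	return 0;
}

/* Hypothetical caller: one iteration of a "wait for mailbox ready" flow. */
static int probe_wait_ready_once(uint64_t md_status)
{
	int rc = check_device_status(md_status);

	if (rc)
		return rc;
	return (md_status & MBOX_IF_READY) ? 0 : -EIO;
}
```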
>  



* Re: [PATCH 05/23] cxl/pci: Don't poll doorbell for mailbox access
  2021-11-20  0:02 ` [PATCH 05/23] cxl/pci: Don't poll doorbell for mailbox access Ben Widawsky
@ 2021-11-22 15:11   ` Jonathan Cameron
  2021-11-22 17:24     ` Ben Widawsky
  2021-11-24 21:55   ` Dan Williams
  1 sibling, 1 reply; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-22 15:11 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On Fri, 19 Nov 2021 16:02:32 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> The expectation is that the mailbox interface ready bit is the first
> step in access through the mailbox interface.

Reword this? Perhaps
"The expectation is that the mailbox interface ready bit will be set
 at the start of any access through the mailbox interface."

> Therefore, waiting for the
> doorbell busy bit to be clear would imply that the mailbox interface is
> ready. The original driver implementation used the doorbell timeout for
> the Mailbox Interface Ready bit to piggyback off of, since the latter
> doesn't have a defined timeout (introduced in 8adaf747c9f0 ("cxl/mem:
> Find device capabilities"), a timeout has since been defined with an ECN
> to the 2.0 spec). With the current driver waiting for mailbox interface
> ready as a part of probe() it's no longer necessary to use the
> piggyback.
> 
> With the piggybacking no longer necessary it doesn't make sense to check
> doorbell status when acquiring the mailbox. It will be checked during
> the normal mailbox exchange protocol.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Trivial comment inline. With that fixed, either by calling it out or by
pulling it out of this patch:

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
> This patch did not exist in RFCv2
> ---
>  drivers/cxl/pci.c | 25 ++++++-------------------
>  1 file changed, 6 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 2cef9fec8599..869b4fc18e27 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -221,27 +221,14 @@ static int cxl_pci_mbox_get(struct cxl_dev_state *cxlds)
>  
>  	/*
>  	 * XXX: There is some amount of ambiguity in the 2.0 version of the spec
> -	 * around the mailbox interface ready (8.2.8.5.1.1).  The purpose of the
> +	 * around the mailbox interface ready (8.2.8.5.1.1). The purpose of the

Whilst it's trivial, I'd prefer white space cleanup in separate patches.
I guess this one is obvious enough to just call out in the patch description
though.

>  	 * bit is to allow firmware running on the device to notify the driver
> -	 * that it's ready to receive commands. It is unclear if the bit needs
> -	 * to be read for each transaction mailbox, ie. the firmware can switch
> -	 * it on and off as needed. Second, there is no defined timeout for
> -	 * mailbox ready, like there is for the doorbell interface.
> -	 *
> -	 * Assumptions:
> -	 * 1. The firmware might toggle the Mailbox Interface Ready bit, check
> -	 *    it for every command.
> -	 *
> -	 * 2. If the doorbell is clear, the firmware should have first set the
> -	 *    Mailbox Interface Ready bit. Therefore, waiting for the doorbell
> -	 *    to be ready is sufficient.
> +	 * that it's ready to receive commands. The spec does not clearly define
> +	 * under what conditions the bit may get set or cleared. As of the 2.0
> +	 * base specification there was no defined timeout for mailbox ready,
> +	 * like there is for the doorbell interface. This was fixed with an ECN,
> +	 * but it's possible early devices implemented this before the ECN.
>  	 */
> -	rc = cxl_pci_mbox_wait_for_doorbell(cxlds);
> -	if (rc) {
> -		dev_warn(dev, "Mailbox interface not ready\n");
> -		goto out;
> -	}
> -
>  	md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
>  	if (!(md_status & CXLMDEV_MBOX_IF_READY && CXLMDEV_READY(md_status))) {
>  		dev_err(dev, "mbox: reported doorbell ready, but not mbox ready\n");



* Re: [PATCH 06/23] cxl/pci: Don't check media status for mbox access
  2021-11-20  0:02 ` [PATCH 06/23] cxl/pci: Don't check media status for mbox access Ben Widawsky
@ 2021-11-22 15:19   ` Jonathan Cameron
  2021-11-24 21:58   ` Dan Williams
  1 sibling, 0 replies; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-22 15:19 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On Fri, 19 Nov 2021 16:02:33 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> Media status is necessary for using HDM contained in a CXL device but is
> not needed for mailbox accesses. Therefore remove this check. It will be
> necessary to have this check (in a different place) when enabling HDM.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

One could perhaps argue this was a bug, but it was hopefully harmless on real devices.
Out of curiosity, did you find a clear definition of what 'User data' includes?
Obviously includes any data in the HDM region, but what about region meta data?

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
> This patch did not exist in RFCv2
> ---
>  drivers/cxl/pci.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 869b4fc18e27..711bf4514480 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -230,7 +230,7 @@ static int cxl_pci_mbox_get(struct cxl_dev_state *cxlds)
>  	 * but it's possible early devices implemented this before the ECN.
>  	 */
>  	md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
> -	if (!(md_status & CXLMDEV_MBOX_IF_READY && CXLMDEV_READY(md_status))) {
> +	if (!(md_status & CXLMDEV_MBOX_IF_READY)) {
>  		dev_err(dev, "mbox: reported doorbell ready, but not mbox ready\n");
>  		rc = -EBUSY;
>  		goto out;



* Re: [PATCH 07/23] cxl/pci: Add new DVSEC definitions
  2021-11-20  0:02 ` [PATCH 07/23] cxl/pci: Add new DVSEC definitions Ben Widawsky
@ 2021-11-22 15:22   ` Jonathan Cameron
  2021-11-22 17:32     ` Ben Widawsky
  0 siblings, 1 reply; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-22 15:22 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On Fri, 19 Nov 2021 16:02:34 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> While the new definitions are not yet necessary at this point, they are
> introduced at this point to help solidify the newly minted schema for
> naming registers.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> 
> ---
> This was split from
> https://lore.kernel.org/linux-cxl/20211103170552.55ae5u7uvurkync6@intel.com/T/#u
> per Dan's request.
> ---
>  drivers/cxl/pci.h | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/drivers/cxl/pci.h b/drivers/cxl/pci.h
> index 29b8eaef3a0a..8ae2b4adc59d 100644
> --- a/drivers/cxl/pci.h
> +++ b/drivers/cxl/pci.h
> @@ -16,6 +16,21 @@
>  /* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
>  #define CXL_DVSEC_PCIE_DEVICE					0
>  
> +/* CXL 2.0 8.1.4: Non-CXL Function Map DVSEC */
> +#define CXL_DVSEC_FUNCTION_MAP					2
> +
> +/* CXL 2.0 8.1.5: CXL 2.0 Extensions DVSEC for Ports */
> +#define CXL_DVSEC_PORT_EXTENSIONS				3
> +
> +/* CXL 2.0 8.1.6: GPF DVSEC for CXL Port */
> +#define CXL_DVSEC_PORT_GPF					4
> +
> +/* CXL 2.0 8.1.7: GPF DVSEC for CXL Device */
> +#define CXL_DVSEC_DEVICE_GPF					5
> +
> +/* CXL 2.0 8.1.8: PCIe DVSEC for Flex Bus Port */
> +#define CXL_DVSEC_PCIE_FLEXBUS_PORT				7
> +
>  /* CXL 2.0 8.1.9: Register Locator DVSEC */
>  #define CXL_DVSEC_REG_LOCATOR					8
>  #define   CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET			0xC



* Re: [PATCH 08/23] cxl/acpi: Map component registers for Root Ports
  2021-11-20  0:02 ` [PATCH 08/23] cxl/acpi: Map component registers for Root Ports Ben Widawsky
@ 2021-11-22 15:51   ` Jonathan Cameron
  2021-11-22 19:28     ` Ben Widawsky
  2021-11-24 22:18   ` Dan Williams
  1 sibling, 1 reply; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-22 15:51 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On Fri, 19 Nov 2021 16:02:35 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> This implements the TODO in cxl_acpi for mapping component registers.
> cxl_acpi becomes the second consumer of CXL register block enumeration
> (cxl_pci being the first). Moving the functionality to cxl_core allows
> both of these drivers to use the functionality. Equally importantly it
> allows cxl_core to use the functionality in the future.
> 
> CXL 2.0 root ports are similar to CXL 2.0 Downstream Ports with the main
> distinction being they're a part of the CXL 2.0 host bridge. While
> mapping their component registers is not immediately useful for the CXL
> drivers, the movement of register block enumeration into core is a vital
> step towards HDM decoder programming.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

A few minor comments below.

Jonathan

> 
> ---
> Changes since RFCv2:
> - Squash commits together (Dan)
> - Reword commit message to account for above.
> ---
>  drivers/cxl/acpi.c      | 10 ++++++--
>  drivers/cxl/core/regs.c | 54 +++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxl.h       |  4 +++
>  drivers/cxl/pci.c       | 52 ---------------------------------------
>  drivers/cxl/pci.h       |  4 +++
>  5 files changed, 70 insertions(+), 54 deletions(-)
> 
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index 3163167ecc3a..7cfa8b568013 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -7,6 +7,7 @@
>  #include <linux/acpi.h>
>  #include <linux/pci.h>
>  #include "cxl.h"
> +#include "pci.h"
>  
>  /* Encode defined in CXL 2.0 8.2.5.12.7 HDM Decoder Control Register */
>  #define CFMWS_INTERLEAVE_WAYS(x)	(1 << (x)->interleave_ways)
> @@ -134,11 +135,13 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
>  
>  __mock int match_add_root_ports(struct pci_dev *pdev, void *data)
>  {
> +	resource_size_t creg = CXL_RESOURCE_NONE;
>  	struct cxl_walk_context *ctx = data;
>  	struct pci_bus *root_bus = ctx->root;
>  	struct cxl_port *port = ctx->port;
>  	int type = pci_pcie_type(pdev);
>  	struct device *dev = ctx->dev;
> +	struct cxl_register_map map;
>  	u32 lnkcap, port_num;
>  	int rc;
>  
> @@ -152,9 +155,12 @@ __mock int match_add_root_ports(struct pci_dev *pdev, void *data)
>  				  &lnkcap) != PCIBIOS_SUCCESSFUL)
>  		return 0;
>  
> -	/* TODO walk DVSEC to find component register base */
> +	rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);

Perhaps a comment to explain why this is optional?

> +	if (!rc)
> +		creg = cxl_reg_block(pdev, &map);
> +
>  	port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
> -	rc = cxl_add_dport(port, &pdev->dev, port_num, CXL_RESOURCE_NONE);
> +	rc = cxl_add_dport(port, &pdev->dev, port_num, creg);
>  	if (rc) {
>  		ctx->error = rc;
>  		return rc;
> diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
> index e37e23bf4355..41a0245867ea 100644
> --- a/drivers/cxl/core/regs.c
> +++ b/drivers/cxl/core/regs.c
> @@ -5,6 +5,7 @@
>  #include <linux/slab.h>
>  #include <linux/pci.h>
>  #include <cxlmem.h>
> +#include <pci.h>
>  
>  /**
>   * DOC: cxl registers
> @@ -247,3 +248,56 @@ int cxl_map_device_regs(struct pci_dev *pdev,
>  	return 0;
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_map_device_regs, CXL);
> +
> +static void cxl_decode_regblock(u32 reg_lo, u32 reg_hi,
> +				struct cxl_register_map *map)
> +{
> +	map->block_offset = ((u64)reg_hi << 32) |
> +			    (reg_lo & CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK);
> +	map->barno = FIELD_GET(CXL_DVSEC_REG_LOCATOR_BIR_MASK, reg_lo);
> +	map->reg_type = FIELD_GET(CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK, reg_lo);
> +}
> +
> +/**
> + * cxl_find_regblock() - Locate register blocks by type
> + * @pdev: The CXL PCI device to enumerate.
> + * @type: Register Block Indicator id
> + * @map: Enumeration output, clobbered on error
> + *
> + * Return: 0 if register block enumerated, negative error code otherwise
> + *
> + * A CXL DVSEC may additional point one or more register blocks, search

Why additional?  I'm not sure what this means.

point to one or more additional register blocks perhaps?

> + * for them by @type.
> + */
> +int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
> +		      struct cxl_register_map *map)
> +{
> +	u32 regloc_size, regblocks;
> +	int regloc, i;
> +
> +	regloc = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
> +					   CXL_DVSEC_REG_LOCATOR);
> +	if (!regloc)
> +		return -ENXIO;
> +
> +	pci_read_config_dword(pdev, regloc + PCI_DVSEC_HEADER1, &regloc_size);
> +	regloc_size = FIELD_GET(PCI_DVSEC_HEADER1_LENGTH_MASK, regloc_size);
> +
> +	regloc += CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET;
> +	regblocks = (regloc_size - CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET) / 8;
> +
> +	for (i = 0; i < regblocks; i++, regloc += 8) {
> +		u32 reg_lo, reg_hi;
> +
> +		pci_read_config_dword(pdev, regloc, &reg_lo);
> +		pci_read_config_dword(pdev, regloc + 4, &reg_hi);
> +
> +		cxl_decode_regblock(reg_lo, reg_hi, map);
> +
> +		if (map->reg_type == type)
> +			return 0;
> +	}
> +
> +	return -ENODEV;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_find_regblock, CXL);
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index ab4596f0b751..7150a9694f66 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -145,6 +145,10 @@ int cxl_map_device_regs(struct pci_dev *pdev,
>  			struct cxl_device_regs *regs,
>  			struct cxl_register_map *map);
>  
> +enum cxl_regloc_type;
> +int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
> +		      struct cxl_register_map *map);
> +
>  #define CXL_RESOURCE_NONE ((resource_size_t) -1)
>  #define CXL_TARGET_STRLEN 20
>  
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 711bf4514480..d2c743a31b0c 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -433,58 +433,6 @@ static int cxl_map_regs(struct cxl_dev_state *cxlds, struct cxl_register_map *ma
>  	return 0;
>  }
>  
> -static void cxl_decode_regblock(u32 reg_lo, u32 reg_hi,
> -				struct cxl_register_map *map)
> -{
> -	map->block_offset = ((u64)reg_hi << 32) |
> -			    (reg_lo & CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK);
> -	map->barno = FIELD_GET(CXL_DVSEC_REG_LOCATOR_BIR_MASK, reg_lo);
> -	map->reg_type = FIELD_GET(CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK, reg_lo);
> -}
> -
> -/**
> - * cxl_find_regblock() - Locate register blocks by type
> - * @pdev: The CXL PCI device to enumerate.
> - * @type: Register Block Indicator id
> - * @map: Enumeration output, clobbered on error
> - *
> - * Return: 0 if register block enumerated, negative error code otherwise
> - *
> - * A CXL DVSEC may point to one or more register blocks, search for them
> - * by @type.
> - */
> -static int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
> -			     struct cxl_register_map *map)
> -{
> -	u32 regloc_size, regblocks;
> -	int regloc, i;
> -
> -	regloc = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
> -					   CXL_DVSEC_REG_LOCATOR);
> -	if (!regloc)
> -		return -ENXIO;
> -
> -	pci_read_config_dword(pdev, regloc + PCI_DVSEC_HEADER1, &regloc_size);
> -	regloc_size = FIELD_GET(PCI_DVSEC_HEADER1_LENGTH_MASK, regloc_size);
> -
> -	regloc += CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET;
> -	regblocks = (regloc_size - CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET) / 8;
> -
> -	for (i = 0; i < regblocks; i++, regloc += 8) {
> -		u32 reg_lo, reg_hi;
> -
> -		pci_read_config_dword(pdev, regloc, &reg_lo);
> -		pci_read_config_dword(pdev, regloc + 4, &reg_hi);
> -
> -		cxl_decode_regblock(reg_lo, reg_hi, map);
> -
> -		if (map->reg_type == type)
> -			return 0;
> -	}
> -
> -	return -ENODEV;
> -}
> -
>  static int cxl_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
>  			  struct cxl_register_map *map)
>  {
> diff --git a/drivers/cxl/pci.h b/drivers/cxl/pci.h
> index 8ae2b4adc59d..a4b506bb37d1 100644
> --- a/drivers/cxl/pci.h
> +++ b/drivers/cxl/pci.h
> @@ -47,4 +47,8 @@ enum cxl_regloc_type {
>  	CXL_REGLOC_RBI_TYPES
>  };
>  
> +#define cxl_reg_block(pdev, map)                                               \
> +	((resource_size_t)(pci_resource_start(pdev, (map)->barno) +            \
> +			   (map)->block_offset))
> +
>  #endif /* __CXL_PCI_H__ */
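
The lookup quoted above can be modeled in user-space C as follows. This is a sketch under assumptions: the field layout (BAR indicator in bits 2:0, block identifier in bits 15:8, low offset bits in 31:16) is assumed for illustration, the entries are passed as an array instead of being read from config space, and `component_reg_base()` is a hypothetical helper showing why a miss can be treated as optional by falling back to a "no resource" value.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed field layout of the low dword of a Register Locator entry. */
#define REGLOC_BIR_MASK      0x00000007u
#define REGLOC_BLOCK_ID_MASK 0x0000ff00u
#define REGLOC_OFF_LOW_MASK  0xffff0000u
#define REGLOC_BLOCK1_OFFSET 0xCu	/* first entry offset, as in the quoted pci.h */
#define RESOURCE_NONE        UINT64_MAX	/* models CXL_RESOURCE_NONE */

struct regblock {
	uint64_t block_offset;
	uint8_t barno;
	uint8_t reg_type;
};

static struct regblock decode_regblock(uint32_t reg_lo, uint32_t reg_hi)
{
	struct regblock rb;

	rb.block_offset = ((uint64_t)reg_hi << 32) |
			  (reg_lo & REGLOC_OFF_LOW_MASK);
	rb.barno = reg_lo & REGLOC_BIR_MASK;
	rb.reg_type = (reg_lo & REGLOC_BLOCK_ID_MASK) >> 8;
	return rb;
}

/* Each entry is two dwords (8 bytes) following the DVSEC header. */
static unsigned int regblock_count(uint32_t dvsec_len)
{
	return (dvsec_len - REGLOC_BLOCK1_OFFSET) / 8;
}

/* 'entries' holds lo/hi dword pairs; '*out' is clobbered even on a miss. */
static int find_regblock(const uint32_t *entries, unsigned int n,
			 uint8_t type, struct regblock *out)
{
	for (unsigned int i = 0; i < n; i++) {
		*out = decode_regblock(entries[2 * i], entries[2 * i + 1]);
		if (out->reg_type == type)
			return 0;
	}
	return -ENODEV;
}

/* Optional lookup: a missing block yields RESOURCE_NONE, not an error. */
static uint64_t component_reg_base(const uint32_t *entries, unsigned int n,
				   uint8_t type, const uint64_t bar_start[8])
{
	struct regblock rb;

	if (find_regblock(entries, n, type, &rb))
		return RESOURCE_NONE;
	return bar_start[rb.barno] + rb.block_offset;
}
```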



* Re: [PATCH 09/23] cxl: Introduce module_cxl_driver
  2021-11-20  0:02 ` [PATCH 09/23] cxl: Introduce module_cxl_driver Ben Widawsky
@ 2021-11-22 15:54   ` Jonathan Cameron
  2021-11-24 22:22   ` Dan Williams
  1 sibling, 0 replies; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-22 15:54 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Dan Williams, Alison Schofield, Ira Weiny,
	Vishal Verma

On Fri, 19 Nov 2021 16:02:36 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> Many CXL drivers simply want to register and unregister themselves.
> module_driver already supported this. A simple wrapper around that
> reduces a decent amount of boilerplate in upcoming patches.
> 
> Suggested-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

Makes sense, although compared with using the module_driver macro
directly it's a fairly minor reduction in boilerplate.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
>  drivers/cxl/cxl.h | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 7150a9694f66..d39d45f4a770 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -308,6 +308,9 @@ int __cxl_driver_register(struct cxl_driver *cxl_drv, struct module *owner,
>  #define cxl_driver_register(x) __cxl_driver_register(x, THIS_MODULE, KBUILD_MODNAME)
>  void cxl_driver_unregister(struct cxl_driver *cxl_drv);
>  
> +#define module_cxl_driver(__cxl_driver) \
> +	module_driver(__cxl_driver, cxl_driver_register, cxl_driver_unregister)
> +
>  #define CXL_DEVICE_NVDIMM_BRIDGE	1
>  #define CXL_DEVICE_NVDIMM		2
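
To see what the wrapper saves, here is a user-space imitation of the pattern. It is not the kernel macro: `demo_module_driver()`, `struct demo_driver` and the register/unregister functions are hypothetical stand-ins, and the generated functions are plain C functions rather than module init/exit hooks.

```c
#include <assert.h>

struct demo_driver {
	const char *name;
	int registered;
};

static int demo_driver_register(struct demo_driver *drv)
{
	drv->registered = 1;
	return 0;
}

static void demo_driver_unregister(struct demo_driver *drv)
{
	drv->registered = 0;
}

/* Stand-in for module_driver(): generates <drv>_init() and <drv>_exit(). */
#define demo_module_driver(drv, reg, unreg)	\
static int drv##_init(void)			\
{						\
	return reg(&(drv));			\
}						\
static void drv##_exit(void)			\
{						\
	unreg(&(drv));				\
}

/* Bus-specific wrapper, analogous in shape to module_cxl_driver(). */
#define module_demo_driver(drv) \
	demo_module_driver(drv, demo_driver_register, demo_driver_unregister)

static struct demo_driver port_driver = { .name = "demo_port" };
module_demo_driver(port_driver)
```

Each driver then needs one line instead of a hand-written init/exit pair, which matches the "fairly minor" saving noted in the review.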
>  



* Re: [PATCH 10/23] cxl/core: Convert decoder range to resource
  2021-11-20  0:02 ` [PATCH 10/23] cxl/core: Convert decoder range to resource Ben Widawsky
@ 2021-11-22 16:08   ` Jonathan Cameron
  2021-11-24 22:41   ` Dan Williams
  1 sibling, 0 replies; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-22 16:08 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On Fri, 19 Nov 2021 16:02:37 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> CXL decoders manage address ranges in a hierarchical fashion whereby a
> leaf is a unique subregion of its parent decoder (midlevel or root). It
> therefore makes sense to use the resource API for handling this.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
LGTM

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> ---
> Changes since RFCv2
> - Switch to DEFINE_RES_MEM from NAMED variant (Dan)
> - Differentiate CFMWS resources and other decoder resources (Ben)
> - Make decoder resources be range, not resource (Dan)
> - Set decoder name in cxl_decoder_add() (Dan)
> ---
>  drivers/cxl/acpi.c     | 16 ++++++----------
>  drivers/cxl/core/bus.c | 19 +++++++++++++++++--
>  drivers/cxl/cxl.h      |  8 ++++++--
>  3 files changed, 29 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index 7cfa8b568013..3415184a2e61 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -108,10 +108,8 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
>  
>  	cxld->flags = cfmws_to_decoder_flags(cfmws->restrictions);
>  	cxld->target_type = CXL_DECODER_EXPANDER;
> -	cxld->range = (struct range){
> -		.start = cfmws->base_hpa,
> -		.end = cfmws->base_hpa + cfmws->window_size - 1,
> -	};
> +	cxld->platform_res = (struct resource)DEFINE_RES_MEM(cfmws->base_hpa,
> +							     cfmws->window_size);
>  	cxld->interleave_ways = CFMWS_INTERLEAVE_WAYS(cfmws);
>  	cxld->interleave_granularity = CFMWS_INTERLEAVE_GRANULARITY(cfmws);
>  
> @@ -127,8 +125,9 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
>  		return 0;
>  	}
>  	dev_dbg(dev, "add: %s node: %d range %#llx-%#llx\n",
> -		dev_name(&cxld->dev), phys_to_target_node(cxld->range.start),
> -		cfmws->base_hpa, cfmws->base_hpa + cfmws->window_size - 1);
> +		dev_name(&cxld->dev),
> +		phys_to_target_node(cxld->platform_res.start), cfmws->base_hpa,
> +		cfmws->base_hpa + cfmws->window_size - 1);
>  
>  	return 0;
>  }
> @@ -267,10 +266,7 @@ static int add_host_bridge_uport(struct device *match, void *arg)
>  	cxld->interleave_ways = 1;
>  	cxld->interleave_granularity = PAGE_SIZE;
>  	cxld->target_type = CXL_DECODER_EXPANDER;
> -	cxld->range = (struct range) {
> -		.start = 0,
> -		.end = -1,
> -	};
> +	cxld->platform_res = (struct resource)DEFINE_RES_MEM(0, 0);
>  
>  	device_lock(&port->dev);
>  	dport = list_first_entry(&port->dports, typeof(*dport), list);
> diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> index 17a4fff029f8..8e80e85350b1 100644
> --- a/drivers/cxl/core/bus.c
> +++ b/drivers/cxl/core/bus.c
> @@ -46,8 +46,14 @@ static ssize_t start_show(struct device *dev, struct device_attribute *attr,
>  			  char *buf)
>  {
>  	struct cxl_decoder *cxld = to_cxl_decoder(dev);
> +	u64 start = 0;
>  
> -	return sysfs_emit(buf, "%#llx\n", cxld->range.start);
> +	if (is_root_decoder(dev))
> +		start = cxld->platform_res.start;
> +	else
> +		start = cxld->decoder_range.start;
> +
> +	return sysfs_emit(buf, "%#llx\n", start);
>  }
>  static DEVICE_ATTR_RO(start);
>  
> @@ -55,8 +61,14 @@ static ssize_t size_show(struct device *dev, struct device_attribute *attr,
>  			char *buf)
>  {
>  	struct cxl_decoder *cxld = to_cxl_decoder(dev);
> +	u64 size = 0;
>  
> -	return sysfs_emit(buf, "%#llx\n", range_len(&cxld->range));
> +	if (is_root_decoder(dev))
> +		size = resource_size(&cxld->platform_res);
> +	else
> +		size = range_len(&cxld->decoder_range);
> +
> +	return sysfs_emit(buf, "%#llx\n", size);
>  }
>  static DEVICE_ATTR_RO(size);
>  
> @@ -548,6 +560,9 @@ int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
>  	if (rc)
>  		return rc;
>  
> +	if (is_root_decoder(dev))
> +		cxld->platform_res.name = dev_name(dev);
> +
>  	return device_add(dev);
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_decoder_add, CXL);
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index d39d45f4a770..ad816fb5bdcc 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -179,7 +179,8 @@ enum cxl_decoder_type {
>   * struct cxl_decoder - CXL address range decode configuration
>   * @dev: this decoder's device
>   * @id: kernel device name id
> - * @range: address range considered by this decoder
> + * @platform_res: address space resources considered by root decoder
> + * @decoder_range: address space resources considered by midlevel decoder
>   * @interleave_ways: number of cxl_dports in this decode
>   * @interleave_granularity: data stride per dport
>   * @target_type: accelerator vs expander (type2 vs type3) selector
> @@ -190,7 +191,10 @@ enum cxl_decoder_type {
>  struct cxl_decoder {
>  	struct device dev;
>  	int id;
> -	struct range range;
> +	union {
> +		struct resource platform_res;
> +		struct range decoder_range;
> +	};
>  	int interleave_ways;
>  	int interleave_granularity;
>  	enum cxl_decoder_type target_type;
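
The arithmetic behind the conversion can be checked with a small user-space model. The struct and function names below are hypothetical, and 64-bit addresses are assumed. The model mirrors the fact that a memory resource defined from (start, size) ends at start + size - 1, so a (0, 0) definition wraps to the same all-ones end value as the old `{ .start = 0, .end = -1 }` range.

```c
#include <assert.h>
#include <stdint.h>

/* Models struct resource as used here: inclusive end, optional name. */
struct resource_model {
	uint64_t start;
	uint64_t end;
	const char *name;
};

/* Models struct range: inclusive end. */
struct range_model {
	uint64_t start;
	uint64_t end;
};

/* Same end computation as a (start, size) memory resource definition. */
static struct resource_model res_mem(uint64_t start, uint64_t size)
{
	struct resource_model r = { .start = start, .end = start + size - 1 };

	return r;
}

static uint64_t res_size(const struct resource_model *r)
{
	return r->end - r->start + 1;
}

static uint64_t range_len_model(const struct range_model *r)
{
	return r->end - r->start + 1;
}
```

Note that in this model the wrapped (0, 0) case reports a size of 0, since the full 64-bit span overflows the length computation.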



* Re: [PATCH 11/23] cxl/core: Document and tighten up decoder APIs
  2021-11-20  0:02 ` [PATCH 11/23] cxl/core: Document and tighten up decoder APIs Ben Widawsky
@ 2021-11-22 16:13   ` Jonathan Cameron
  2021-11-24 22:55   ` Dan Williams
  1 sibling, 0 replies; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-22 16:13 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On Fri, 19 Nov 2021 16:02:38 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> Since the code to add decoders for switches and endpoints is on the
> horizon it helps to have properly documented APIs. In addition, the
> decoder APIs will never need to support a negative count for downstream
> targets as the spec explicitly starts numbering them at 1, ie. even 0 is
> an "invalid" value which can be used as a sentinel.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> 
> ---
> 
> This is respun from a previous incantation here:
> https://lore.kernel.org/linux-cxl/20210915155946.308339-1-ben.widawsky@intel.com/
> ---
>  drivers/cxl/core/bus.c | 33 +++++++++++++++++++++++++++++++--
>  drivers/cxl/cxl.h      |  3 ++-
>  2 files changed, 33 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> index 8e80e85350b1..1ee12a60f3f4 100644
> --- a/drivers/cxl/core/bus.c
> +++ b/drivers/cxl/core/bus.c
> @@ -495,7 +495,20 @@ static int decoder_populate_targets(struct cxl_decoder *cxld,
>  	return rc;
>  }
>  
> -struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
> +/**
> + * cxl_decoder_alloc - Allocate a new CXL decoder
> + * @port: owning port of this decoder
> + * @nr_targets: downstream targets accessible by this decoder. All upstream
> + *		ports and root ports must have at least 1 target.
> + *
> + * A port should contain one or more decoders. Each of those decoders enable
> + * some address space for CXL.mem utilization. A decoder is expected to be
> + * configured by the caller before registering.
> + *
> + * Return: A new cxl decoder to be registered by cxl_decoder_add()
> + */
> +struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
> +				      unsigned int nr_targets)
>  {
>  	struct cxl_decoder *cxld, cxld_const_init = {
>  		.nr_targets = nr_targets,
> @@ -503,7 +516,7 @@ struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
>  	struct device *dev;
>  	int rc = 0;
>  
> -	if (nr_targets > CXL_DECODER_MAX_INTERLEAVE || nr_targets < 1)
> +	if (nr_targets > CXL_DECODER_MAX_INTERLEAVE || nr_targets == 0)
>  		return ERR_PTR(-EINVAL);
>  
>  	cxld = kzalloc(struct_size(cxld, target, nr_targets), GFP_KERNEL);
> @@ -535,6 +548,22 @@ struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_decoder_alloc, CXL);
>  
> +/**
> + * cxl_decoder_add - Add a decoder with targets
> + * @cxld: The cxl decoder allocated by cxl_decoder_alloc()
> + * @target_map: A list of downstream ports that this decoder can direct memory
> + *              traffic to. These numbers should correspond with the port number
> + *              in the PCIe Link Capabilities structure.
> + *
> + * Certain types of decoders may not have any targets. The main example of this
> + * is an endpoint device. A more awkward example is a hostbridge whose root
> + * ports get hot added (technically possible, though unlikely).
> + *
> + * Context: Process context. Takes and releases the cxld's device lock.
> + *
> + * Return: Negative error code if the decoder wasn't properly configured; else
> + *	   returns 0.
> + */
>  int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
>  {
>  	struct cxl_port *port;
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index ad816fb5bdcc..b66ed8f241c6 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -288,7 +288,8 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport, int port_id,
>  
>  struct cxl_decoder *to_cxl_decoder(struct device *dev);
>  bool is_root_decoder(struct device *dev);
> -struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets);
> +struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
> +				      unsigned int nr_targets);
>  int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map);
>  int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld);
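
A short sketch of the tightened check, assuming a maximum interleave of 16 (an assumed value for `CXL_DECODER_MAX_INTERLEAVE`) and using a hypothetical `validate_nr_targets()` helper. With an unsigned parameter, a negative count from a careless caller converts to a very large value and is rejected by the upper bound, so only zero needs an explicit test.

```c
#include <assert.h>
#include <errno.h>

/* Assumed value of CXL_DECODER_MAX_INTERLEAVE for this sketch. */
#define MAX_INTERLEAVE 16u

/* Mirrors the range check in the quoted cxl_decoder_alloc() at this patch. */
static int validate_nr_targets(unsigned int nr_targets)
{
	if (nr_targets > MAX_INTERLEAVE || nr_targets == 0)
		return -EINVAL;
	return 0;
}
```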
>  



* Re: [PATCH 12/23] cxl: Introduce endpoint decoders
  2021-11-20  0:02 ` [PATCH 12/23] cxl: Introduce endpoint decoders Ben Widawsky
@ 2021-11-22 16:20   ` Jonathan Cameron
  2021-11-22 19:37     ` Ben Widawsky
  0 siblings, 1 reply; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-22 16:20 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On Fri, 19 Nov 2021 16:02:39 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> Endpoints have decoders too. It is useful to share the same
> infrastructure from cxl_core. Endpoints do not have dports (downstream
> targets), only the underlying physical medium. As a result, some special
> casing is needed.
> 
> There is no functional change introduced yet as endpoints don't actually
> enumerate decoders yet.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

I'm not a fan of special values like using 0 here to indicate endpoint
device.  I'd rather see a base cxl_decode_alloc(..., bool ep)
and possibly wrappers for the non ep case and ep one.

Jonathan
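
One possible shape for that suggestion, as a user-space sketch: a base allocator that takes an explicit endpoint flag, plus two thin wrappers. All names here (`decoder_alloc_base()`, `switch_decoder_alloc()`, `endpoint_decoder_alloc()`, `struct decoder_model`) are hypothetical, and the maximum interleave of 16 is an assumed value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_INTERLEAVE 16u	/* assumed limit for this sketch */

struct decoder_model {
	bool endpoint;
	unsigned int nr_targets;
	int target[];		/* flexible array, empty for endpoints */
};

/* Base allocator: the endpoint case is explicit instead of encoded as 0. */
static struct decoder_model *decoder_alloc_base(unsigned int nr_targets,
						bool ep)
{
	struct decoder_model *d;

	if (ep) {
		if (nr_targets != 0)
			return NULL;
	} else if (nr_targets == 0 || nr_targets > MAX_INTERLEAVE) {
		return NULL;
	}

	d = calloc(1, sizeof(*d) + nr_targets * sizeof(d->target[0]));
	if (!d)
		return NULL;
	d->endpoint = ep;
	d->nr_targets = nr_targets;
	return d;
}

/* Wrapper for root/switch decoders: at least one target is required. */
static struct decoder_model *switch_decoder_alloc(unsigned int nr_targets)
{
	return decoder_alloc_base(nr_targets, false);
}

/* Wrapper for endpoint decoders: no target list at all. */
static struct decoder_model *endpoint_decoder_alloc(void)
{
	return decoder_alloc_base(0, true);
}
```

The wrappers keep call sites readable while the zero-target special case stays inside one function.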

> ---
>  drivers/cxl/core/bus.c | 41 +++++++++++++++++++++++++++++++++--------
>  1 file changed, 33 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> index 1ee12a60f3f4..16b15f54fb62 100644
> --- a/drivers/cxl/core/bus.c
> +++ b/drivers/cxl/core/bus.c
> @@ -187,6 +187,12 @@ static const struct attribute_group *cxl_decoder_switch_attribute_groups[] = {
>  	NULL,
>  };
>  
> +static const struct attribute_group *cxl_decoder_endpoint_attribute_groups[] = {
> +	&cxl_decoder_base_attribute_group,
> +	&cxl_base_attribute_group,
> +	NULL,
> +};
> +
>  static void cxl_decoder_release(struct device *dev)
>  {
>  	struct cxl_decoder *cxld = to_cxl_decoder(dev);
> @@ -196,6 +202,12 @@ static void cxl_decoder_release(struct device *dev)
>  	kfree(cxld);
>  }
>  
> +static const struct device_type cxl_decoder_endpoint_type = {
> +	.name = "cxl_decoder_endpoint",
> +	.release = cxl_decoder_release,
> +	.groups = cxl_decoder_endpoint_attribute_groups,
> +};
> +
>  static const struct device_type cxl_decoder_switch_type = {
>  	.name = "cxl_decoder_switch",
>  	.release = cxl_decoder_release,
> @@ -208,6 +220,11 @@ static const struct device_type cxl_decoder_root_type = {
>  	.groups = cxl_decoder_root_attribute_groups,
>  };
>  
> +static bool is_endpoint_decoder(struct device *dev)
> +{
> +	return dev->type == &cxl_decoder_endpoint_type;
> +}
> +
>  bool is_root_decoder(struct device *dev)
>  {
>  	return dev->type == &cxl_decoder_root_type;
> @@ -499,7 +516,9 @@ static int decoder_populate_targets(struct cxl_decoder *cxld,
>   * cxl_decoder_alloc - Allocate a new CXL decoder
>   * @port: owning port of this decoder
>   * @nr_targets: downstream targets accessible by this decoder. All upstream
> - *		ports and root ports must have at least 1 target.
> + *		ports and root ports must have at least 1 target. Endpoint
> + *		devices will have 0 targets. Callers wishing to register an
> + *		endpoint device should specify 0.
>   *
>   * A port should contain one or more decoders. Each of those decoders enable
>   * some address space for CXL.mem utilization. A decoder is expected to be
> @@ -516,7 +535,7 @@ struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
>  	struct device *dev;
>  	int rc = 0;
>  
> -	if (nr_targets > CXL_DECODER_MAX_INTERLEAVE || nr_targets == 0)
> +	if (nr_targets > CXL_DECODER_MAX_INTERLEAVE)
>  		return ERR_PTR(-EINVAL);
>  
>  	cxld = kzalloc(struct_size(cxld, target, nr_targets), GFP_KERNEL);
> @@ -535,8 +554,11 @@ struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
>  	dev->parent = &port->dev;
>  	dev->bus = &cxl_bus_type;
>  
> +	/* Endpoints don't have a target list */
> +	if (nr_targets == 0)
> +		dev->type = &cxl_decoder_endpoint_type;
>  	/* root ports do not have a cxl_port_type parent */
> -	if (port->dev.parent->type == &cxl_port_type)
> +	else if (port->dev.parent->type == &cxl_port_type)
>  		dev->type = &cxl_decoder_switch_type;
>  	else
>  		dev->type = &cxl_decoder_root_type;
> @@ -579,12 +601,15 @@ int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
>  	if (cxld->interleave_ways < 1)
>  		return -EINVAL;
>  
> -	port = to_cxl_port(cxld->dev.parent);
> -	rc = decoder_populate_targets(cxld, port, target_map);
> -	if (rc)
> -		return rc;
> -
>  	dev = &cxld->dev;
> +
> +	port = to_cxl_port(cxld->dev.parent);
> +	if (!is_endpoint_decoder(dev)) {
> +		rc = decoder_populate_targets(cxld, port, target_map);
> +		if (rc)
> +			return rc;
> +	}
> +
>  	rc = dev_set_name(dev, "decoder%d.%d", port->id, cxld->id);
>  	if (rc)
>  		return rc;



* Re: [PATCH 13/23] cxl/core: Move target population locking to caller
  2021-11-20  0:02 ` [PATCH 13/23] cxl/core: Move target population locking to caller Ben Widawsky
@ 2021-11-22 16:33   ` Jonathan Cameron
  2021-11-22 21:58     ` Ben Widawsky
  2021-11-25  0:34   ` Dan Williams
  1 sibling, 1 reply; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-22 16:33 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On Fri, 19 Nov 2021 16:02:40 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> In preparation for a port driver that enumerates a descendant port +
> decoder hierarchy, arrange for an unlocked version of cxl_decoder_add().
> Otherwise a port-driver that adds a child decoder will deadlock on the
> device_lock() in ->probe().
> 

I think this description should call out that the lock was originally taken
for a much shorter time in decoder_populate_targets() but is moved
up one layer.
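
A minimal userspace sketch of the locked/unlocked split under discussion,
with a toy flag standing in for device_lock(). All names here (port_lock(),
decoder_add_locked(), port_probe(), ...) are illustrative assumptions, not
the cxl_core functions:

```c
#include <assert.h>
#include <stdbool.h>

struct fake_lock { bool held; };

struct port {
	struct fake_lock lock;
	int ndecoders;
};

void port_lock(struct port *p)
{
	/* Taking the lock twice would be the deadlock described above. */
	assert(!p->lock.held);
	p->lock.held = true;
}

void port_unlock(struct port *p)
{
	assert(p->lock.held);
	p->lock.held = false;
}

/* Caller must already hold the port lock. */
int decoder_add_locked(struct port *p)
{
	assert(p->lock.held);	/* plays the role of device_lock_assert() */
	p->ndecoders++;
	return 0;
}

/* Convenience wrapper for callers that do not hold the lock. */
int decoder_add(struct port *p)
{
	int rc;

	port_lock(p);
	rc = decoder_add_locked(p);
	port_unlock(p);
	return rc;
}

/* Simulates ->probe(), which runs with the device lock already held. */
int port_probe(struct port *p)
{
	int rc;

	port_lock(p);			/* done by the driver core in reality */
	rc = decoder_add_locked(p);	/* decoder_add() here would self-deadlock */
	port_unlock(p);
	return rc;
}
```

The lock now covers the whole add rather than just the target lookup, which
is the change in hold time the commit message could mention.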

One other query inline.  Seems like the WARN_ON stuff is a bit
over-paranoid given what's visible in this patch.  If there is a
good reason for that, then add something to the patch description to
justify it.
 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> 
> ---
> 
> Changes since RFCv2:
> - Reword commit message (Dan)
> - Move decoder API changes into this patch (Dan)
> ---
>  drivers/cxl/core/bus.c | 59 +++++++++++++++++++++++++++++++-----------
>  drivers/cxl/cxl.h      |  1 +
>  2 files changed, 45 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> index 16b15f54fb62..cd6fe7823c69 100644
> --- a/drivers/cxl/core/bus.c
> +++ b/drivers/cxl/core/bus.c
> @@ -487,28 +487,22 @@ static int decoder_populate_targets(struct cxl_decoder *cxld,
>  {
>  	int rc = 0, i;
>  
> +	device_lock_assert(&port->dev);
> +
>  	if (!target_map)
>  		return 0;
>  
> -	device_lock(&port->dev);
> -	if (list_empty(&port->dports)) {
> -		rc = -EINVAL;
> -		goto out_unlock;
> -	}
> +	if (list_empty(&port->dports))
> +		return -EINVAL;
>  
>  	for (i = 0; i < cxld->nr_targets; i++) {
>  		struct cxl_dport *dport = find_dport(port, target_map[i]);
>  
> -		if (!dport) {
> -			rc = -ENXIO;
> -			goto out_unlock;
> -		}
> +		if (!dport)
> +			return -ENXIO;
>  		cxld->target[i] = dport;
>  	}
>  
> -out_unlock:
> -	device_unlock(&port->dev);
> -
>  	return rc;
>  }
>  
> @@ -571,7 +565,7 @@ struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
>  EXPORT_SYMBOL_NS_GPL(cxl_decoder_alloc, CXL);
>  
>  /**
> - * cxl_decoder_add - Add a decoder with targets
> + * cxl_decoder_add_locked - Add a decoder with targets
>   * @cxld: The cxl decoder allocated by cxl_decoder_alloc()
>   * @target_map: A list of downstream ports that this decoder can direct memory
>   *              traffic to. These numbers should correspond with the port number
> @@ -581,12 +575,14 @@ EXPORT_SYMBOL_NS_GPL(cxl_decoder_alloc, CXL);
>   * is an endpoint device. A more awkward example is a hostbridge whose root
>   * ports get hot added (technically possible, though unlikely).
>   *
> - * Context: Process context. Takes and releases the cxld's device lock.
> + * This is the locked variant of cxl_decoder_add().
> + *
> + * Context: Process context. Expects the cxld's device lock to be held.
>   *
>   * Return: Negative error code if the decoder wasn't properly configured; else
>   *	   returns 0.
>   */
> -int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
> +int cxl_decoder_add_locked(struct cxl_decoder *cxld, int *target_map)
>  {
>  	struct cxl_port *port;
>  	struct device *dev;
> @@ -619,6 +615,39 @@ int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
>  
>  	return device_add(dev);
>  }
> +EXPORT_SYMBOL_NS_GPL(cxl_decoder_add_locked, CXL);
> +
> +/**
> + * cxl_decoder_add - Add a decoder with targets
> + * @cxld: The cxl decoder allocated by cxl_decoder_alloc()
> + * @target_map: A list of downstream ports that this decoder can direct memory
> + *              traffic to. These numbers should correspond with the port number
> + *              in the PCIe Link Capabilities structure.
> + *
> + * This is the unlocked variant of cxl_decoder_add_locked().
> + * See cxl_decoder_add_locked().
> + *
> + * Context: Process context. Takes and releases the cxld's device lock.
> + */
> +int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
> +{
> +	struct cxl_port *port;
> +	int rc;
> +
> +	if (WARN_ON_ONCE(!cxld))
> +		return -EINVAL;

Why do we now need these protections but didn't before?


> +
> +	if (WARN_ON_ONCE(IS_ERR(cxld)))
> +		return PTR_ERR(cxld);
> +
> +	port = to_cxl_port(cxld->dev.parent);
> +
> +	device_lock(&port->dev);
> +	rc = cxl_decoder_add_locked(cxld, target_map);
> +	device_unlock(&port->dev);
> +
> +	return rc;
> +}
>  EXPORT_SYMBOL_NS_GPL(cxl_decoder_add, CXL);
>  
>  static void cxld_unregister(void *dev)
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index b66ed8f241c6..2c5627fa8a34 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -290,6 +290,7 @@ struct cxl_decoder *to_cxl_decoder(struct device *dev);
>  bool is_root_decoder(struct device *dev);
>  struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
>  				      unsigned int nr_targets);
> +int cxl_decoder_add_locked(struct cxl_decoder *cxld, int *target_map);
>  int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map);
>  int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld);
>  



* Re: [PATCH 16/23] cxl/pci: Cache device DVSEC offset
  2021-11-20  0:02 ` [PATCH 16/23] cxl/pci: Cache device DVSEC offset Ben Widawsky
@ 2021-11-22 16:46   ` Jonathan Cameron
  2021-11-22 22:34     ` Ben Widawsky
  0 siblings, 1 reply; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-22 16:46 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On Fri, 19 Nov 2021 16:02:43 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> The PCIe device DVSEC, defined in the CXL 2.0 spec, 8.1.3 is required to
> be implemented by CXL 2.0 endpoint devices. Since the information
> contained within this DVSEC will be critically important for region
> configuration, it makes sense to find the value early.
> 
> Since this DVSEC is not strictly required for mailbox functionality,
> failure to find this information does not result in the driver failing
> to bind.

That feels like a path we are going to forget to test sometime in the
future.  Given it's a specification requirement, I'd treat it as
an error and make our lives easier going forwards!
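
As a sketch of the fatal variant (userspace mock, hypothetical names; the
only thing taken from the patch is that an offset of 0 means the DVSEC was
not found):

```c
#include <errno.h>
#include <stdint.h>

struct dev_state {
	uint16_t device_dvsec;	/* 0 means "not found" */
};

/*
 * found_offset stands in for the result of the DVSEC lookup.
 * Instead of warning and carrying on, fail the probe outright.
 */
int probe_dvsec(struct dev_state *st, uint16_t found_offset)
{
	if (!found_offset)
		return -ENODEV;

	st->device_dvsec = found_offset;
	return 0;
}
```

Later code can then assume device_dvsec is always valid and drop the
"limited functionality" path entirely.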

Otherwise looks good to me.

> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
>  drivers/cxl/cxlmem.h | 2 ++
>  drivers/cxl/pci.c    | 7 +++++++
>  2 files changed, 9 insertions(+)
> 
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 8d96d009ad90..3ef3c652599e 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -98,6 +98,7 @@ struct cxl_mbox_cmd {
>   *
>   * @dev: The device associated with this CXL state
>   * @regs: Parsed register blocks
> + * @device_dvsec: Offset to the PCIe device DVSEC
>   * @payload_size: Size of space for payload
>   *                (CXL 2.0 8.2.8.4.3 Mailbox Capabilities Register)
>   * @lsa_size: Size of Label Storage Area
> @@ -125,6 +126,7 @@ struct cxl_dev_state {
>  	struct device *dev;
>  
>  	struct cxl_regs regs;
> +	int device_dvsec;
>  
>  	size_t payload_size;
>  	size_t lsa_size;
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index d2c743a31b0c..f3872cbee7f8 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -474,6 +474,13 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (IS_ERR(cxlds))
>  		return PTR_ERR(cxlds);
>  
> +	cxlds->device_dvsec = pci_find_dvsec_capability(pdev,
> +							PCI_DVSEC_VENDOR_ID_CXL,
> +							CXL_DVSEC_PCIE_DEVICE);
> +	if (!cxlds->device_dvsec)
> +		dev_warn(&pdev->dev,
> +			 "Device DVSEC not present. Expect limited functionality.\n");
> +
>  	rc = cxl_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
>  	if (rc)
>  		return rc;



* Re: [PATCH 17/23] cxl: Cache and pass DVSEC ranges
  2021-11-20  0:02 ` [PATCH 17/23] cxl: Cache and pass DVSEC ranges Ben Widawsky
  2021-11-20  4:29     ` kernel test robot
@ 2021-11-22 17:00   ` Jonathan Cameron
  2021-11-22 22:50     ` Ben Widawsky
  2021-11-26 11:37   ` Jonathan Cameron
  2 siblings, 1 reply; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-22 17:00 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On Fri, 19 Nov 2021 16:02:44 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> CXL 1.1 specification provided a mechanism for mapping an address space
> of a CXL device. That functionality is known as a "range" and can be
> programmed through PCIe DVSEC. In addition to this, the specification
> defines an active bit which a device will expose through the same DVSEC
> to notify system software that memory is initialized and ready.
> 
> While CXL 2.0 introduces a more powerful mechanism called HDM decoders
> that are controlled by MMIO behind a PCIe BAR, the spec does allow the
> 1.1 style mapping to still be present. In such a case, when the CXL
> driver takes over, if it were to enable HDM decoding and there was an
> actively used range, things would likely blow up, in particular if it
> wasn't an identical mapping.
> 
> This patch caches the relevant information which the cxl_mem driver will
> need to make the proper decision and passes it along.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

0-day spotted issues in the same code as I did. See below.

This is another case where I'd treat failure as fatal.  Anything that fails
is either dead or not spec compliant, so don't bother loading the driver
if that happens. Fewer paths to test etc...

> ---
>  drivers/cxl/cxlmem.h |  19 +++++++
>  drivers/cxl/pci.c    | 126 +++++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/pci.h    |  13 +++++
>  3 files changed, 158 insertions(+)
> 
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 3ef3c652599e..eac5528ccaae 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -89,6 +89,22 @@ struct cxl_mbox_cmd {
>   */
>  #define CXL_CAPACITY_MULTIPLIER SZ_256M
>  
> +/**
> + * struct cxl_endpoint_dvsec_info - Cached DVSEC info
> + * @mem_enabled: cached value of mem_enabled in the DVSEC, PCIE_DEVICE
> + * @ranges: Number of HDM ranges this device contains.
> + * @range.base: cached value of the range base in the DVSEC, PCIE_DEVICE
> + * @range.size: cached value of the range size in the DVSEC, PCIE_DEVICE
> + */
> +struct cxl_endpoint_dvsec_info {
> +	bool mem_enabled;
> +	int ranges;
> +	struct {
> +		u64 base;
> +		u64 size;
> +	} range[2];
> +};
> +
>  /**
>   * struct cxl_dev_state - The driver device state
>   *
> @@ -117,6 +133,7 @@ struct cxl_mbox_cmd {
>   * @active_persistent_bytes: sum of hard + soft persistent
>   * @next_volatile_bytes: volatile capacity change pending device reset
>   * @next_persistent_bytes: persistent capacity change pending device reset
> + * @info: Cached DVSEC information about the device.
>   * @mbox_send: @dev specific transport for transmitting mailbox commands
>   *
>   * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
> @@ -147,6 +164,8 @@ struct cxl_dev_state {
>  	u64 next_volatile_bytes;
>  	u64 next_persistent_bytes;
>  
> +	struct cxl_endpoint_dvsec_info *info;
> +
>  	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
>  };
>  
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index f3872cbee7f8..b3f46045bf3e 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -452,8 +452,126 @@ static int cxl_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
>  	return rc;
>  }
>  
> +#define CDPD(cxlds, which)                                                     \
> +	cxlds->device_dvsec + CXL_DVSEC_PCIE_DEVICE_##which##_OFFSET

My usual grumble :)  I personally find macros like this a bit of a pain when
reviewing.  I'd really like to see things spelled out in the code so I
can immediately see what register we are dealing with even if it does
seem rather repetitive in the code.

> +
> +#define CDPDR(cxlds, which, sorb, lohi)                                        \
> +	cxlds->device_dvsec +                                                  \
> +		CXL_DVSEC_PCIE_DEVICE_RANGE_##sorb##_##lohi##_OFFSET(which)
> +
> +static int wait_for_valid(struct cxl_dev_state *cxlds)
> +{
> +	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
> +	const unsigned long timeout = jiffies + HZ;
> +	bool valid;
> +
> +	do {
> +		u64 size;
> +		u32 temp;
> +		int rc;
> +
> +		rc = pci_read_config_dword(pdev, CDPDR(cxlds, 0, SIZE, HIGH),
> +					   &temp);
> +		if (rc)
> +			return -ENXIO;
> +		size = (u64)temp << 32;
> +
> +		rc = pci_read_config_dword(pdev, CDPDR(cxlds, 0, SIZE, LOW),
> +					   &temp);
> +		if (rc)
> +			return -ENXIO;
> +		size |= temp & CXL_DVSEC_PCIE_DEVICE_MEM_SIZE_LOW_MASK;
> +
> +		/*
> +		 * Memory_Info_Valid: When set, indicates that the CXL Range 1
> +		 * Size high and Size Low registers are valid. Must be set
> +		 * within 1 second of deassertion of reset to CXL device.
> +		 */
> +		valid = FIELD_GET(CXL_DVSEC_PCIE_DEVICE_MEM_INFO_VALID, temp);
> +		if (valid)
> +			break;

I think there is a race here.  What if you read the high part, get garbage, and then
read the low part, which is now valid...

Swap this around so you read the low part (the one with the valid bit) first
and it will be fine.

However, given that size isn't used (as 0-day pointed out), this is fine anyway
(subject to removing the pointless code).
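
If the size were actually needed, the reordered read could look roughly like
this. It is a userspace mock: struct fake_dev and the fake_read_*() helpers
are made up for the example, and the bit position and mask are assumptions
standing in for CXL_DVSEC_PCIE_DEVICE_MEM_INFO_VALID and
CXL_DVSEC_PCIE_DEVICE_MEM_SIZE_LOW_MASK:

```c
#include <stdint.h>

#define MEM_INFO_VALID	(1u << 0)	/* assumed bit position */
#define SIZE_LOW_MASK	0xF0000000u	/* assumed mask */

struct fake_dev {
	unsigned int polls_until_valid;	/* low-register reads before valid */
	uint32_t final_high;
	uint32_t final_low;
};

/* Low register: carries the valid bit, which sets after a few polls. */
uint32_t fake_read_low(struct fake_dev *d)
{
	if (d->polls_until_valid) {
		d->polls_until_valid--;
		return 0;
	}
	return d->final_low | MEM_INFO_VALID;
}

/* High register: garbage until the device has become valid. */
uint32_t fake_read_high(const struct fake_dev *d)
{
	return d->polls_until_valid ? 0xdeadbeefu : d->final_high;
}

/* Returns 0 and fills *size once valid, -1 if max_polls is exhausted. */
int read_range_size(struct fake_dev *d, unsigned int max_polls, uint64_t *size)
{
	while (max_polls--) {
		uint32_t low = fake_read_low(d);

		if (!(low & MEM_INFO_VALID))
			continue;

		/* Valid was observed first, so the high half is stable now. */
		*size = ((uint64_t)fake_read_high(d) << 32) |
			(low & SIZE_LOW_MASK);
		return 0;
	}
	return -1;
}
```

The high half is only read after the valid bit has been seen, so a stale
high value can never be paired with a valid low value.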

> +		cpu_relax();
> +	} while (!time_after(jiffies, timeout));
> +
> +	return valid ? 0 : -ETIMEDOUT;
> +}
> +
> +static struct cxl_endpoint_dvsec_info *dvsec_ranges(struct cxl_dev_state *cxlds)
> +{
> +	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
> +	struct cxl_endpoint_dvsec_info *info;
> +	int hdm_count, rc, i;
> +	u16 cap, ctrl;
> +
> +	rc = pci_read_config_word(pdev, CDPD(cxlds, CAP), &cap);
> +	if (rc)
> +		return ERR_PTR(-ENXIO);
> +	rc = pci_read_config_word(pdev, CDPD(cxlds, CTRL), &ctrl);
> +	if (rc)
> +		return ERR_PTR(-ENXIO);
> +
> +	if (!(cap & CXL_DVSEC_PCIE_DEVICE_MEM_CAPABLE))
> +		return ERR_PTR(-ENODEV);
> +
> +	/*
> +	 * It is not allowed by spec for MEM.capable to be set and have 0 HDM
> +	 * decoders. Therefore, the device is not considered CXL.mem capable.
> +	 */
> +	hdm_count = FIELD_GET(CXL_DVSEC_PCIE_DEVICE_HDM_COUNT_MASK, cap);
> +	if (!hdm_count || hdm_count > 2)
> +		return ERR_PTR(-EINVAL);
> +
> +	rc = wait_for_valid(cxlds);
> +	if (rc)
> +		return ERR_PTR(rc);
> +
> +	info = devm_kzalloc(cxlds->dev, sizeof(*info), GFP_KERNEL);
> +	if (!info)
> +		return ERR_PTR(-ENOMEM);
> +
> +	info->mem_enabled = FIELD_GET(CXL_DVSEC_PCIE_DEVICE_MEM_ENABLE, ctrl);
> +
> +	for (i = 0; i < hdm_count; i++) {
> +		u64 base, size;
> +		u32 temp;
> +
> +		rc = pci_read_config_dword(pdev, CDPDR(cxlds, i, SIZE, HIGH),
> +					   &temp);
> +		if (rc)
> +			continue;
> +		size = (u64)temp << 32;
> +
> +		rc = pci_read_config_dword(pdev, CDPDR(cxlds, i, SIZE, LOW),
> +					   &temp);
> +		if (rc)
> +			continue;
> +		size |= temp & CXL_DVSEC_PCIE_DEVICE_MEM_SIZE_LOW_MASK;
> +
> +		rc = pci_read_config_dword(pdev, CDPDR(cxlds, i, BASE, HIGH),
> +					   &temp);
> +		if (rc)
> +			continue;
> +		base = (u64)temp << 32;
> +
> +		rc = pci_read_config_dword(pdev, CDPDR(cxlds, i, BASE, LOW),
> +					   &temp);
> +		if (rc)
> +			continue;
> +		base |= temp & CXL_DVSEC_PCIE_DEVICE_MEM_BASE_LOW_MASK;
> +
> +		info->range[i].base = base;
> +		info->range[i].size = size;
> +		info->ranges++;
> +	}
> +
> +	return info;
> +}
> +
> +#undef CDPDR
> +
>  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  {
> +	struct cxl_endpoint_dvsec_info *info;
>  	struct cxl_register_map map;
>  	struct cxl_memdev *cxlmd;
>  	struct cxl_dev_state *cxlds;
> @@ -505,6 +623,14 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (rc)
>  		return rc;
>  
> +	info = dvsec_ranges(cxlds);
> +	if (IS_ERR(info))
> +		dev_err(&pdev->dev,
> +			"Failed to get DVSEC range information (%ld)\n",
> +			PTR_ERR(info));
> +	else
> +		cxlds->info = info;
> +
>  	cxlmd = devm_cxl_add_memdev(cxlds);
>  	if (IS_ERR(cxlmd))
>  		return PTR_ERR(cxlmd);




* Re: [PATCH 18/23] cxl/pci: Implement wait for media active
  2021-11-20  0:02 ` [PATCH 18/23] cxl/pci: Implement wait for media active Ben Widawsky
@ 2021-11-22 17:03   ` Jonathan Cameron
  2021-11-22 22:57     ` Ben Widawsky
  2021-11-26 11:36     ` Jonathan Cameron
  0 siblings, 2 replies; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-22 17:03 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On Fri, 19 Nov 2021 16:02:45 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> CXL 2.0 8.1.3.8.2 defines "Memory_Active: When set, indicates that the
> CXL Range 1 memory is fully initialized and available for software use.
> Must be set within Range 1. Memory_Active_Timeout of deassertion of

Range 1?

> reset to CXL device if CXL.mem HwInit Mode=1" The CXL* Type 3 Memory
> Device Software Guide (Revision 1.0) further describes the need to check
> this bit before using HDM.
> 
> Unfortunately, Memory_Active can take quite a long time depending on
> media size (up to 256s per 2.0 spec). Since the cxl_pci driver doesn't
> care about this, a callback is exported as part of driver state for use
> by drivers that do care.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

Same thing about size not being used...

> ---
> This patch did not exist in RFCv2
> ---
>  drivers/cxl/cxlmem.h |  1 +
>  drivers/cxl/pci.c    | 56 ++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 57 insertions(+)
> 
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index eac5528ccaae..a9424dd4e5c3 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -167,6 +167,7 @@ struct cxl_dev_state {
>  	struct cxl_endpoint_dvsec_info *info;
>  
>  	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
> +	int (*wait_media_ready)(struct cxl_dev_state *cxlds);
>  };
>  
>  enum cxl_opcode {
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index b3f46045bf3e..f1a68bfe5f77 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -496,6 +496,60 @@ static int wait_for_valid(struct cxl_dev_state *cxlds)
>  	return valid ? 0 : -ETIMEDOUT;
>  }
>  
> +/*
> + * Implements Figure 43 of the CXL Type 3 Memory Device Software Guide. Waits a
> + * full 256s no matter what the device reports.
> + */
> +static int wait_for_media_ready(struct cxl_dev_state *cxlds)
> +{
> +	const unsigned long timeout = jiffies + (256 * HZ);
> +	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
> +	u64 md_status;
> +	bool active;
> +	int rc;
> +
> +	rc = wait_for_valid(cxlds);
> +	if (rc)
> +		return rc;
> +
> +	do {
> +		u64 size;
> +		u32 temp;
> +		int rc;
> +
> +		rc = pci_read_config_dword(pdev, CDPDR(cxlds, 0, SIZE, HIGH),
> +					   &temp);
> +		if (rc)
> +			return -ENXIO;
> +		size = (u64)temp << 32;
> +
> +		rc = pci_read_config_dword(pdev, CDPDR(cxlds, 0, SIZE, LOW),
> +					   &temp);
> +		if (rc)
> +			return -ENXIO;
> +		size |= temp & CXL_DVSEC_PCIE_DEVICE_MEM_SIZE_LOW_MASK;
> +
> +		active = FIELD_GET(CXL_DVSEC_PCIE_DEVICE_MEM_ACTIVE, temp);

Only the Size Low register needs to be read to get the active bit for this
particular functionality.

> +		if (active)
> +			break;
> +		cpu_relax();
> +		mdelay(100);
> +	} while (!time_after(jiffies, timeout));
> +
> +	if (!active)
> +		return -ETIMEDOUT;
> +
> +	rc = check_device_status(cxlds);
> +	if (rc)
> +		return rc;
> +
> +	md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
> +	if (!CXLMDEV_READY(md_status))
> +		return -EIO;
> +
> +	return 0;
> +}
> +
>  static struct cxl_endpoint_dvsec_info *dvsec_ranges(struct cxl_dev_state *cxlds)
>  {
>  	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
> @@ -598,6 +652,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (!cxlds->device_dvsec)
>  		dev_warn(&pdev->dev,
>  			 "Device DVSEC not present. Expect limited functionality.\n");
> +	else
> +		cxlds->wait_media_ready = wait_for_media_ready;
>  
>  	rc = cxl_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
>  	if (rc)



* Re: [PATCH 19/23] cxl/pci: Store component register base in cxlds
  2021-11-20  0:02 ` [PATCH 19/23] cxl/pci: Store component register base in cxlds Ben Widawsky
  2021-11-20  7:28     ` kernel test robot
@ 2021-11-22 17:11   ` Jonathan Cameron
  2021-11-22 23:01     ` Ben Widawsky
  1 sibling, 1 reply; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-22 17:11 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On Fri, 19 Nov 2021 16:02:46 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> The component register base address is enumerated and stored in the
> cxl_device_state structure for use by the endpoint driver.
> 
> Component register base addresses are obtained through PCI mechanisms.
> As such it makes most sense for the cxl_pci driver to obtain that
> address. In order to reuse the port driver for enumerating decoder
> resources for an endpoint, it is desirable to be able to add the
> endpoint as a port. The endpoint does know of the cxlds and can pull the
> component register base from there and pass it along to port creation.

This feels like a lot of explanation for the trivial caching of an address.
I'm not sure you need to be that detailed, though I guess it does no real
harm.

Another one where I'm unsure why we are muddling on after an error...


> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
> Changes since RFCv2:
> This patch was originally named, "cxl/core: Store component register
> base for memdevs". It plumbed the component registers through memdev
> creation. After more work, it became apparent we needed to plumb other
> stuff from the pci driver, so going forward, cxlds will just be
> referenced by the cxl_mem driver. This also allows us to ignore the
> change needed to cxl_test
> 
> - Rework patch to store the base in cxlds
> - Remove defunct comment (Dan)
> ---
>  drivers/cxl/cxlmem.h |  2 ++
>  drivers/cxl/pci.c    | 11 +++++++++++
>  2 files changed, 13 insertions(+)
> 
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index a9424dd4e5c3..b1d753541f4e 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -134,6 +134,7 @@ struct cxl_endpoint_dvsec_info {
>   * @next_volatile_bytes: volatile capacity change pending device reset
>   * @next_persistent_bytes: persistent capacity change pending device reset
>   * @info: Cached DVSEC information about the device.
> + * @component_reg_phys: register base of component registers
>   * @mbox_send: @dev specific transport for transmitting mailbox commands
>   *
>   * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
> @@ -165,6 +166,7 @@ struct cxl_dev_state {
>  	u64 next_persistent_bytes;
>  
>  	struct cxl_endpoint_dvsec_info *info;
> +	resource_size_t component_reg_phys;
>  
>  	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
>  	int (*wait_media_ready)(struct cxl_dev_state *cxlds);
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index f1a68bfe5f77..a8e375950514 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -663,6 +663,17 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (rc)
>  		return rc;
>  
> +	/*
> +	 * If the component registers can't be found, the cxl_pci driver may
> +	 * still be useful for management functions so don't return an error.
> +	 */
> +	cxlds->component_reg_phys = CXL_RESOURCE_NONE;
> +	rc = cxl_setup_regs(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
> +	if (rc)
> +		dev_warn(&cxlmd->dev, "No component registers (%d)\n", rc);
> +	else
> +		cxlds->component_reg_phys = cxl_reg_block(pdev, &map);
> +
>  	rc = cxl_pci_setup_mailbox(cxlds);
>  	if (rc)
>  		return rc;



* Re: [PATCH 04/23] cxl/pci: Implement Interface Ready Timeout
  2021-11-22 15:02   ` Jonathan Cameron
@ 2021-11-22 17:17     ` Ben Widawsky
  2021-11-22 17:53       ` Jonathan Cameron
  0 siblings, 1 reply; 133+ messages in thread
From: Ben Widawsky @ 2021-11-22 17:17 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On 21-11-22 15:02:27, Jonathan Cameron wrote:
> On Fri, 19 Nov 2021 16:02:31 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > The original driver implementation used the doorbell timeout for the
> > Mailbox Interface Ready bit to piggy back off of, since the latter
> > doesn't have a defined timeout. This functionality, introduced in
> > 8adaf747c9f0 ("cxl/mem: Find device capabilities"), can now be improved
> > since a timeout has been defined with an ECN to the 2.0 spec.
> > 
> > While devices implemented prior to the ECN could have an arbitrarily
> > long wait and still be within spec, the max ECN value (256s) is chosen
> > as the default for all devices. All vendors in the consortium agreed to
> > this amount and so it is reasonable to assume no devices made will
> > exceed this amount.
> 
> Optimistic :)
> 

Reasonable to assume is certainly not the same as "in reality". I can soften
this wording.

> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > ---
> > This patch did not exist in RFCv2
> > ---
> >  drivers/cxl/pci.c | 29 +++++++++++++++++++++++++++++
> >  1 file changed, 29 insertions(+)
> > 
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index 6c8d09fb3a17..2cef9fec8599 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -2,6 +2,7 @@
> >  /* Copyright(c) 2020 Intel Corporation. All rights reserved. */
> >  #include <linux/io-64-nonatomic-lo-hi.h>
> >  #include <linux/module.h>
> > +#include <linux/delay.h>
> >  #include <linux/sizes.h>
> >  #include <linux/mutex.h>
> >  #include <linux/list.h>
> > @@ -298,6 +299,34 @@ static int cxl_pci_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *c
> >  static int cxl_pci_setup_mailbox(struct cxl_dev_state *cxlds)
> >  {
> >  	const int cap = readl(cxlds->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
> > +	unsigned long timeout;
> > +	u64 md_status;
> > +	int rc;
> > +
> > +	/*
> > +	 * CXL 2.0 ECN "Add Mailbox Ready Time" defines a capability field to
> > +	 * dictate how long to wait for the mailbox to become ready. For
> > +	 * simplicity, and to handle devices that might have been implemented
> 
> I'm not keen on the 'for simplicity' argument here.  If the device is advertising
> a lower value, then that is what we should use.  It's fine to wait the max time
> if nothing is specified.  It'll cost us a few lines of code at most unless
> I am missing something...
> 
> Jonathan
> 

Let me pose it a different way: if a device advertises 1s, but for whatever
reason takes 4s to come up, should we penalize it over the device advertising
256s? The way this field is defined in the spec would [IMHO] lead vendors to
simply put the max value in there to game the driver, so why not start off
with just insisting they don't?
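
To make the two policies concrete, here is a small sketch. The helper name
and its parameters are hypothetical, and treating an advertised value of 0 as
"nothing specified" is an assumption for the example rather than something
taken from the ECN text:

```c
#include <stdbool.h>

#define MBOX_READY_TIMEOUT_MAX_S 256u	/* max value discussed in this thread */

/*
 * advertised_s:     ready time the device reports, 0 if none (assumption)
 * honor_advertised: true  = use the device's value when it gives one
 *                   false = always wait the maximum, as the patch does
 */
unsigned int mbox_ready_timeout_s(unsigned int advertised_s,
				  bool honor_advertised)
{
	if (!honor_advertised)
		return MBOX_READY_TIMEOUT_MAX_S;
	if (advertised_s == 0 || advertised_s > MBOX_READY_TIMEOUT_MAX_S)
		return MBOX_READY_TIMEOUT_MAX_S;
	return advertised_s;
}
```

With honor_advertised set, the device from the question above (advertises 1s,
takes 4s) would time out, while one advertising 256s would not.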

> > +	 * prior to the ECN, wait the max amount of time no matter what the
> > +	 * device says.
> > +	 */
> > +	timeout = jiffies + 256 * HZ;
> > +
> > +	rc = check_device_status(cxlds);
> > +	if (rc)
> > +		return rc;
> > +
> > +	do {
> > +		md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
> > +		if (md_status & CXLMDEV_MBOX_IF_READY)
> > +			break;
> > +		if (msleep_interruptible(100))
> > +			break;
> > +	} while (!time_after(jiffies, timeout));
> > +
> > +	/* It's assumed that once the interface is ready, it will remain ready. */
> > +	if (!(md_status & CXLMDEV_MBOX_IF_READY))
> > +		return -EIO;
> >  
> >  	cxlds->mbox_send = cxl_pci_mbox_send;
> >  	cxlds->payload_size =
> 


* Re: [PATCH 05/23] cxl/pci: Don't poll doorbell for mailbox access
  2021-11-22 15:11   ` Jonathan Cameron
@ 2021-11-22 17:24     ` Ben Widawsky
  0 siblings, 0 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-22 17:24 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On 21-11-22 15:11:31, Jonathan Cameron wrote:
> On Fri, 19 Nov 2021 16:02:32 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > The expectation is that the mailbox interface ready bit is the first
> > step in access through the mailbox interface.
> 
> Reword this? Perhaps
> "The expectation is that the mailbox interface ready bit will be set
>  at the start of any access through the mailbox interface."
> 
> > Therefore, waiting for the
> > doorbell busy bit to be clear would imply that the mailbox interface is
> > ready. The original driver implementation used the doorbell timeout for
> > the Mailbox Interface Ready bit to piggyback off of, since the latter
> > doesn't have a defined timeout (introduced in 8adaf747c9f0 ("cxl/mem:
> > Find device capabilities"), a timeout has since been defined with an ECN
> > to the 2.0 spec). With the current driver waiting for mailbox interface
> > ready as a part of probe() it's no longer necessary to use the
> > piggyback.
> > 
> > With the piggybacking no longer necessary it doesn't make sense to check
> > doorbell status when acquiring the mailbox. It will be checked during
> > the normal mailbox exchange protocol.
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> Trivial comment inline - with that fixed either by calling it out, or by
> pulling it out of this patch.
> 
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> > ---
> > This patch did not exist in RFCv2
> > ---
> >  drivers/cxl/pci.c | 25 ++++++-------------------
> >  1 file changed, 6 insertions(+), 19 deletions(-)
> > 
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index 2cef9fec8599..869b4fc18e27 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -221,27 +221,14 @@ static int cxl_pci_mbox_get(struct cxl_dev_state *cxlds)
> >  
> >  	/*
> >  	 * XXX: There is some amount of ambiguity in the 2.0 version of the spec
> > -	 * around the mailbox interface ready (8.2.8.5.1.1).  The purpose of the
> > +	 * around the mailbox interface ready (8.2.8.5.1.1). The purpose of the
> 
> Whilst it's trivial, I'd prefer white space cleanup in separate patches.
> I guess this one is obvious enough to just call out in the patch description
> though.
> 

Okay. I'll keep this in mind for the future, and just fixup the commit messages
with your suggestion and this, now.

Thanks.
Ben

> >  	 * bit is to allow firmware running on the device to notify the driver
> > -	 * that it's ready to receive commands. It is unclear if the bit needs
> > -	 * to be read for each transaction mailbox, ie. the firmware can switch
> > -	 * it on and off as needed. Second, there is no defined timeout for
> > -	 * mailbox ready, like there is for the doorbell interface.
> > -	 *
> > -	 * Assumptions:
> > -	 * 1. The firmware might toggle the Mailbox Interface Ready bit, check
> > -	 *    it for every command.
> > -	 *
> > -	 * 2. If the doorbell is clear, the firmware should have first set the
> > -	 *    Mailbox Interface Ready bit. Therefore, waiting for the doorbell
> > -	 *    to be ready is sufficient.
> > +	 * that it's ready to receive commands. The spec does not clearly define
> > +	 * under what conditions the bit may get set or cleared. As of the 2.0
> > +	 * base specification there was no defined timeout for mailbox ready,
> > +	 * like there is for the doorbell interface. This was fixed with an ECN,
> > +	 * but it's possible early devices implemented this before the ECN.
> >  	 */
> > -	rc = cxl_pci_mbox_wait_for_doorbell(cxlds);
> > -	if (rc) {
> > -		dev_warn(dev, "Mailbox interface not ready\n");
> > -		goto out;
> > -	}
> > -
> >  	md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
> >  	if (!(md_status & CXLMDEV_MBOX_IF_READY && CXLMDEV_READY(md_status))) {
> >  		dev_err(dev, "mbox: reported doorbell ready, but not mbox ready\n");
> 


* Re: [PATCH 07/23] cxl/pci: Add new DVSEC definitions
  2021-11-22 15:22   ` Jonathan Cameron
@ 2021-11-22 17:32     ` Ben Widawsky
  2021-11-24 22:03       ` Dan Williams
  0 siblings, 1 reply; 133+ messages in thread
From: Ben Widawsky @ 2021-11-22 17:32 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On 21-11-22 15:22:24, Jonathan Cameron wrote:
> On Fri, 19 Nov 2021 16:02:34 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > While the new definitions are yet necessary at this point, they are
> > introduced at this point to help solidify the newly minted schema for
> > naming registers.
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> 
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huwei.com>

Thanks. I realized on re-reading this I didn't like the commit message. I
reworded to this:

While the new definitions are not yet necessary at this point, they are
introduced to help solidify the newly minted schema for naming
registers.

Please let me know if you'd like me to drop your reviewed-by tag.

> 
> > 
> > ---
> > This was split from
> > https://lore.kernel.org/linux-cxl/20211103170552.55ae5u7uvurkync6@intel.com/T/#u
> > per Dan's request.
> > ---
> >  drivers/cxl/pci.h | 15 +++++++++++++++
> >  1 file changed, 15 insertions(+)
> > 
> > diff --git a/drivers/cxl/pci.h b/drivers/cxl/pci.h
> > index 29b8eaef3a0a..8ae2b4adc59d 100644
> > --- a/drivers/cxl/pci.h
> > +++ b/drivers/cxl/pci.h
> > @@ -16,6 +16,21 @@
> >  /* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
> >  #define CXL_DVSEC_PCIE_DEVICE					0
> >  
> > +/* CXL 2.0 8.1.4: Non-CXL Function Map DVSEC */
> > +#define CXL_DVSEC_FUNCTION_MAP					2
> > +
> > +/* CXL 2.0 8.1.5: CXL 2.0 Extensions DVSEC for Ports */
> > +#define CXL_DVSEC_PORT_EXTENSIONS				3
> > +
> > +/* CXL 2.0 8.1.6: GPF DVSEC for CXL Port */
> > +#define CXL_DVSEC_PORT_GPF					4
> > +
> > +/* CXL 2.0 8.1.7: GPF DVSEC for CXL Device */
> > +#define CXL_DVSEC_DEVICE_GPF					5
> > +
> > +/* CXL 2.0 8.1.8: PCIe DVSEC for Flex Bus Port */
> > +#define CXL_DVSEC_PCIE_FLEXBUS_PORT				7
> > +
> >  /* CXL 2.0 8.1.9: Register Locator DVSEC */
> >  #define CXL_DVSEC_REG_LOCATOR					8
> >  #define   CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET			0xC
> 


* Re: [PATCH 20/23] cxl/port: Introduce a port driver
  2021-11-20  0:02 ` [PATCH 20/23] cxl/port: Introduce a port driver Ben Widawsky
  2021-11-20  3:14     ` kernel test robot
  2021-11-20  5:38     ` kernel test robot
@ 2021-11-22 17:41   ` Jonathan Cameron
  2021-11-22 23:38     ` Ben Widawsky
  2021-11-23 18:21   ` Bjorn Helgaas
  3 siblings, 1 reply; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-22 17:41 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On Fri, 19 Nov 2021 16:02:47 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> The CXL port driver is responsible for managing the decoder resources
> contained within the port. It will also provide APIs that other drivers
> will consume for managing these resources.
> 
> There are 4 types of ports in a system:
> 1. Platform port. This is a non-programmable entity. Such a port is
>    named rootX. It is enumerated by cxl_acpi in an ACPI based system.
> 2. Hostbridge port. This ports register access is defined in a platform

port's 

>    specific way (CHBS for ACPI platforms). It has n downstream ports,
>    each of which are known as CXL 2.0 root ports. Once the platform
>    specific mechanism to get the offset to the registers is obtained it
>    operates just like other CXL components. The enumeration of this
>    component is started by cxl_acpi and completed by cxl_port.
> 3. Switch port. A switch port is similar to a hostbridge port except
>    register access is defined in the CXL specification in a platform
>    agnostic way. The downstream ports for a switch are simply known as
>    downstream ports. The enumeration of these are entirely contained
>    within cxl_port.
> 4. Endpoint port. Endpoint ports are similar to switch ports with the
>    exception that they have no downstream ports, only the underlying
>    media on the device. The enumeration of these are started by cxl_pci,
>    and completed by cxl_port.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
A few comments inline, including what looks to me like memory on the stack
which has gone out of scope when it's accessed.

Jonathan

> 
> ---
> Changes since RFCv2:
> - Reword commit message tense (Dan)
> - Reword commit message
> - Drop SOFTDEP since it's not needed yet (Dan)
> - Add CONFIG_CXL_PORT (Dan)
> - s/CXL_DECODER_F_EN/CXL_DECODER_F_ENABLE (Dan)
> - rename cxl_hdm_decoder_ functions to "to_" (Dan)
> - remove useless inline (Dan)
> - Check endpoint decoder based on dport list instead of driver id (Dan)
> - Use range instead of resource per dependent patch change
> - Use clever union packing for target list (Dan)
> - Only check NULL from devm_cxl_iomap_block (Dan)
> - Properly parent the created cxl decoders
> - Move bus rescanning from cxl_acpi to here (Dan)
> - Remove references to "CFMWS" in core (Dan)
> - Use macro to help keep within 80 character lines
> ---
>  .../driver-api/cxl/memory-devices.rst         |   5 +
>  drivers/cxl/Kconfig                           |  22 ++
>  drivers/cxl/Makefile                          |   2 +
>  drivers/cxl/core/bus.c                        |  67 ++++
>  drivers/cxl/core/regs.c                       |   6 +-
>  drivers/cxl/cxl.h                             |  34 +-
>  drivers/cxl/port.c                            | 323 ++++++++++++++++++
>  7 files changed, 450 insertions(+), 9 deletions(-)
>  create mode 100644 drivers/cxl/port.c
> 
> diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst
> index 3b8f41395f6b..fbf0393cdddc 100644
> --- a/Documentation/driver-api/cxl/memory-devices.rst
> +++ b/Documentation/driver-api/cxl/memory-devices.rst
> @@ -28,6 +28,11 @@ CXL Memory Device
>  .. kernel-doc:: drivers/cxl/pci.c
>     :internal:
>  
> +CXL Port
> +--------
> +.. kernel-doc:: drivers/cxl/port.c
> +   :doc: cxl port
> +
>  CXL Core
>  --------
>  .. kernel-doc:: drivers/cxl/cxl.h
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index ef05e96f8f97..3aeb33bba5a3 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -77,4 +77,26 @@ config CXL_PMEM
>  	  provisioning the persistent memory capacity of CXL memory expanders.
>  
>  	  If unsure say 'm'.
> +
> +config CXL_MEM
> +	tristate "CXL.mem: Memory Devices"
> +	select CXL_PORT
> +	depends on CXL_PCI
> +	default CXL_BUS
> +	help
> +	  The CXL.mem protocol allows a device to act as a provider of "System
> +	  RAM" and/or "Persistent Memory" that is fully coherent as if the
> +	  memory was attached to the typical CPU memory controller.  This is
> +	  known as HDM "Host-managed Device Memory".
> +
> +	  Say 'y/m' to enable a driver that will attach to CXL.mem devices for
> +	  memory expansion and control of HDM. See Chapter 9.13 in the CXL 2.0
> +	  specification for a detailed description of HDM.
> +
> +	  If unsure say 'm'.
> +
> +
> +config CXL_PORT
> +	tristate
> +
>  endif
> diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
> index cf07ae6cea17..56fcac2323cb 100644
> --- a/drivers/cxl/Makefile
> +++ b/drivers/cxl/Makefile
> @@ -3,7 +3,9 @@ obj-$(CONFIG_CXL_BUS) += core/
>  obj-$(CONFIG_CXL_PCI) += cxl_pci.o
>  obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
>  obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
> +obj-$(CONFIG_CXL_PORT) += cxl_port.o
>  
>  cxl_pci-y := pci.o
>  cxl_acpi-y := acpi.o
>  cxl_pmem-y := pmem.o
> +cxl_port-y := port.o
> diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> index 9e0d7d5d9298..46a06cfe79bd 100644
> --- a/drivers/cxl/core/bus.c
> +++ b/drivers/cxl/core/bus.c
> @@ -31,6 +31,8 @@ static DECLARE_RWSEM(root_port_sem);
>  
>  static struct device *cxl_topology_host;
>  
> +static bool is_cxl_decoder(struct device *dev);
> +
>  int cxl_register_topology_host(struct device *host)
>  {
>  	down_write(&topology_host_sem);
> @@ -75,6 +77,45 @@ static void put_cxl_topology_host(struct device *dev)
>  	up_read(&topology_host_sem);
>  }
>  
> +static int decoder_match(struct device *dev, void *data)
> +{
> +	struct resource *theirs = (struct resource *)data;
> +	struct cxl_decoder *cxld;
> +
> +	if (!is_cxl_decoder(dev))
> +		return 0;
> +
> +	cxld = to_cxl_decoder(dev);
> +	if (theirs->start <= cxld->decoder_range.start &&
> +	    theirs->end >= cxld->decoder_range.end)
> +		return 1;
> +
> +	return 0;
> +}
> +
> +static struct cxl_decoder *cxl_find_root_decoder(resource_size_t base,
> +						 resource_size_t size)
> +{
> +	struct cxl_decoder *cxld = NULL;
> +	struct device *cxldd;
> +	struct device *host;
> +	struct resource res = (struct resource){
> +		.start = base,
> +		.end = base + size - 1,
> +	};
> +
> +	host = get_cxl_topology_host();
> +	if (!host)
> +		return NULL;
> +
> +	cxldd = device_find_child(host, &res, decoder_match);
> +	if (cxldd)
> +		cxld = to_cxl_decoder(cxldd);
> +
> +	put_cxl_topology_host(host);
> +	return cxld;
> +}
> +
>  static ssize_t devtype_show(struct device *dev, struct device_attribute *attr,
>  			    char *buf)
>  {
> @@ -280,6 +321,11 @@ bool is_root_decoder(struct device *dev)
>  }
>  EXPORT_SYMBOL_NS_GPL(is_root_decoder, CXL);
>  
> +static bool is_cxl_decoder(struct device *dev)
> +{
> +	return dev->type->release == cxl_decoder_release;
> +}
> +
>  struct cxl_decoder *to_cxl_decoder(struct device *dev)
>  {
>  	if (dev_WARN_ONCE(dev, dev->type->release != cxl_decoder_release,
> @@ -327,6 +373,7 @@ struct cxl_port *to_cxl_port(struct device *dev)
>  		return NULL;
>  	return container_of(dev, struct cxl_port, dev);
>  }
> +EXPORT_SYMBOL_NS_GPL(to_cxl_port, CXL);
>  
>  struct cxl_dport *cxl_get_root_dport(struct device *dev)
>  {
> @@ -735,6 +782,24 @@ EXPORT_SYMBOL_NS_GPL(cxl_decoder_add, CXL);
>  
>  static void cxld_unregister(void *dev)
>  {
> +	struct cxl_decoder *plat_decoder, *cxld = to_cxl_decoder(dev);
> +	resource_size_t base, size;
> +
> +	if (is_root_decoder(dev)) {
> +		device_unregister(dev);
> +		return;
> +	}
> +
> +	base = cxld->decoder_range.start;
> +	size = range_len(&cxld->decoder_range);
> +
> +	if (size) {
> +		plat_decoder = cxl_find_root_decoder(base, size);
> +		if (plat_decoder)
> +			__release_region(&plat_decoder->platform_res, base,
> +					 size);
> +	}
> +
>  	device_unregister(dev);
>  }
>  
> @@ -789,6 +854,8 @@ static int cxl_device_id(struct device *dev)
>  		return CXL_DEVICE_NVDIMM_BRIDGE;
>  	if (dev->type == &cxl_nvdimm_type)
>  		return CXL_DEVICE_NVDIMM;
> +	if (dev->type == &cxl_port_type)
> +		return CXL_DEVICE_PORT;
>  	return 0;
>  }
>  
> diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
> index 41a0245867ea..f191b0c995a7 100644
> --- a/drivers/cxl/core/regs.c
> +++ b/drivers/cxl/core/regs.c
> @@ -159,9 +159,8 @@ void cxl_probe_device_regs(struct device *dev, void __iomem *base,
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_probe_device_regs, CXL);
>  
> -static void __iomem *devm_cxl_iomap_block(struct device *dev,
> -					  resource_size_t addr,
> -					  resource_size_t length)
> +void __iomem *devm_cxl_iomap_block(struct device *dev, resource_size_t addr,
> +				   resource_size_t length)
>  {
>  	void __iomem *ret_val;
>  	struct resource *res;
> @@ -180,6 +179,7 @@ static void __iomem *devm_cxl_iomap_block(struct device *dev,
>  
>  	return ret_val;
>  }
> +EXPORT_SYMBOL_NS_GPL(devm_cxl_iomap_block, CXL);
>  
>  int cxl_map_component_regs(struct pci_dev *pdev,
>  			   struct cxl_component_regs *regs,
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 3962a5e6a950..24fa16157d5e 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -17,6 +17,9 @@
>   * (port-driver, region-driver, nvdimm object-drivers... etc).
>   */
>  
> +/* CXL 2.0 8.2.4 CXL Component Register Layout and Definition */
> +#define CXL_COMPONENT_REG_BLOCK_SIZE SZ_64K
> +
>  /* CXL 2.0 8.2.5 CXL.cache and CXL.mem Registers*/
>  #define CXL_CM_OFFSET 0x1000
>  #define CXL_CM_CAP_HDR_OFFSET 0x0
> @@ -36,11 +39,22 @@
>  #define CXL_HDM_DECODER_CAP_OFFSET 0x0
>  #define   CXL_HDM_DECODER_COUNT_MASK GENMASK(3, 0)
>  #define   CXL_HDM_DECODER_TARGET_COUNT_MASK GENMASK(7, 4)
> -#define CXL_HDM_DECODER0_BASE_LOW_OFFSET 0x10
> -#define CXL_HDM_DECODER0_BASE_HIGH_OFFSET 0x14
> -#define CXL_HDM_DECODER0_SIZE_LOW_OFFSET 0x18
> -#define CXL_HDM_DECODER0_SIZE_HIGH_OFFSET 0x1c
> -#define CXL_HDM_DECODER0_CTRL_OFFSET 0x20
> +#define   CXL_HDM_DECODER_INTERLEAVE_11_8 BIT(8)
> +#define   CXL_HDM_DECODER_INTERLEAVE_14_12 BIT(9)
> +#define CXL_HDM_DECODER_CTRL_OFFSET 0x4
> +#define   CXL_HDM_DECODER_ENABLE BIT(1)
> +#define CXL_HDM_DECODER0_BASE_LOW_OFFSET(i) (0x20 * (i) + 0x10)
> +#define CXL_HDM_DECODER0_BASE_HIGH_OFFSET(i) (0x20 * (i) + 0x14)
> +#define CXL_HDM_DECODER0_SIZE_LOW_OFFSET(i) (0x20 * (i) + 0x18)
> +#define CXL_HDM_DECODER0_SIZE_HIGH_OFFSET(i) (0x20 * (i) + 0x1c)
> +#define CXL_HDM_DECODER0_CTRL_OFFSET(i) (0x20 * (i) + 0x20)
> +#define   CXL_HDM_DECODER0_CTRL_IG_MASK GENMASK(3, 0)
> +#define   CXL_HDM_DECODER0_CTRL_IW_MASK GENMASK(7, 4)
> +#define   CXL_HDM_DECODER0_CTRL_COMMIT BIT(9)
> +#define   CXL_HDM_DECODER0_CTRL_COMMITTED BIT(10)
> +#define   CXL_HDM_DECODER0_CTRL_TYPE BIT(12)
> +#define CXL_HDM_DECODER0_TL_LOW(i) (0x20 * (i) + 0x24)
> +#define CXL_HDM_DECODER0_TL_HIGH(i) (0x20 * (i) + 0x28)
>  
>  static inline int cxl_hdm_decoder_count(u32 cap_hdr)
>  {
> @@ -148,6 +162,8 @@ int cxl_map_device_regs(struct pci_dev *pdev,
>  enum cxl_regloc_type;
>  int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
>  		      struct cxl_register_map *map);
> +void __iomem *devm_cxl_iomap_block(struct device *dev, resource_size_t addr,
> +				   resource_size_t length);
>  
>  #define CXL_RESOURCE_NONE ((resource_size_t) -1)
>  #define CXL_TARGET_STRLEN 20
> @@ -165,7 +181,8 @@ void cxl_unregister_topology_host(struct device *host);
>  #define CXL_DECODER_F_TYPE2 BIT(2)
>  #define CXL_DECODER_F_TYPE3 BIT(3)
>  #define CXL_DECODER_F_LOCK  BIT(4)
> -#define CXL_DECODER_F_MASK  GENMASK(4, 0)
> +#define CXL_DECODER_F_ENABLE    BIT(5)
> +#define CXL_DECODER_F_MASK  GENMASK(5, 0)
>  
>  enum cxl_decoder_type {
>         CXL_DECODER_ACCELERATOR = 2,
> @@ -255,6 +272,8 @@ struct cxl_walk_context {
>   * @dports: cxl_dport instances referenced by decoders
>   * @decoder_ida: allocator for decoder ids
>   * @component_reg_phys: component register capability base address (optional)
> + * @rescan_work: worker object for bus rescans after port additions
> + * @data: opaque data with driver specific usage
>   */
>  struct cxl_port {
>  	struct device dev;
> @@ -263,6 +282,8 @@ struct cxl_port {
>  	struct list_head dports;
>  	struct ida decoder_ida;
>  	resource_size_t component_reg_phys;
> +	struct work_struct rescan_work;
> +	void *data;
>  };
>  
>  /**
> @@ -325,6 +346,7 @@ void cxl_driver_unregister(struct cxl_driver *cxl_drv);
>  
>  #define CXL_DEVICE_NVDIMM_BRIDGE	1
>  #define CXL_DEVICE_NVDIMM		2
> +#define CXL_DEVICE_PORT			3
>  
>  #define MODULE_ALIAS_CXL(type) MODULE_ALIAS("cxl:t" __stringify(type) "*")
>  #define CXL_MODALIAS_FMT "cxl:t%d"
> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
> new file mode 100644
> index 000000000000..3c03131517af
> --- /dev/null
> +++ b/drivers/cxl/port.c
> @@ -0,0 +1,323 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
> +#include <linux/device.h>
> +#include <linux/module.h>
> +#include <linux/slab.h>
> +
> +#include "cxlmem.h"
> +
> +/**
> + * DOC: cxl port
> + *
> + * The port driver implements the set of functionality needed to allow full
> + * decoder enumeration and routing. A CXL port is an abstraction of a CXL
> + * component that implements some amount of CXL decoding of CXL.mem traffic.
> + * As of the CXL 2.0 spec, this includes:
> + *
> + *	.. list-table:: CXL Components w/ Ports
> + *		:widths: 25 25 50
> + *		:header-rows: 1
> + *
> + *		* - component
> + *		  - upstream
> + *		  - downstream
> + *		* - Hostbridge
> + *		  - ACPI0016
> + *		  - root port
> + *		* - Switch
> + *		  - Switch Upstream Port
> + *		  - Switch Downstream Port
> + *		* - Endpoint
> + *		  - Endpoint Port
> + *		  - N/A
> + *
> + * The primary service this driver provides is enumerating HDM decoders and
> + * presenting APIs to other drivers to utilize the decoders.
> + */
> +
> +static struct workqueue_struct *cxl_port_wq;
> +
> +struct cxl_port_data {
> +	struct cxl_component_regs regs;
> +
> +	struct port_caps {
> +		unsigned int count;
> +		unsigned int tc;
> +		unsigned int interleave11_8;
> +		unsigned int interleave14_12;
> +	} caps;
> +};
> +
> +static inline int to_interleave_granularity(u32 ctrl)
> +{
> +	int val = FIELD_GET(CXL_HDM_DECODER0_CTRL_IG_MASK, ctrl);
> +
> +	return 256 << val;
> +}
> +
> +static inline int to_interleave_ways(u32 ctrl)
> +{
> +	int val = FIELD_GET(CXL_HDM_DECODER0_CTRL_IW_MASK, ctrl);
> +
> +	return 1 << val;
> +}
> +
> +static void get_caps(struct cxl_port *port, struct cxl_port_data *cpd)
> +{
> +	void __iomem *hdm_decoder = cpd->regs.hdm_decoder;
> +	struct port_caps *caps = &cpd->caps;
> +	u32 hdm_cap;
> +
> +	hdm_cap = readl(hdm_decoder + CXL_HDM_DECODER_CAP_OFFSET);
> +
> +	caps->count = cxl_hdm_decoder_count(hdm_cap);
> +	caps->tc = FIELD_GET(CXL_HDM_DECODER_TARGET_COUNT_MASK, hdm_cap);
> +	caps->interleave11_8 =
> +		FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_11_8, hdm_cap);
> +	caps->interleave14_12 =
> +		FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_14_12, hdm_cap);
> +}
> +
> +static int map_regs(struct cxl_port *port, void __iomem *crb,
> +		    struct cxl_port_data *cpd)
> +{
> +	struct cxl_register_map map;
> +	struct cxl_component_reg_map *comp_map = &map.component_map;
> +
> +	cxl_probe_component_regs(&port->dev, crb, comp_map);
> +	if (!comp_map->hdm_decoder.valid) {
> +		dev_err(&port->dev, "HDM decoder registers invalid\n");
> +		return -ENXIO;
> +	}
> +
> +	cpd->regs.hdm_decoder = crb + comp_map->hdm_decoder.offset;
> +
> +	return 0;
> +}
> +
> +static u64 get_decoder_size(void __iomem *hdm_decoder, int n)
> +{
> +	u32 ctrl = readl(hdm_decoder + CXL_HDM_DECODER0_CTRL_OFFSET(n));
> +
> +	if (ctrl & CXL_HDM_DECODER0_CTRL_COMMITTED)
> +		return 0;
> +
> +	return ioread64_hi_lo(hdm_decoder +
> +			      CXL_HDM_DECODER0_SIZE_LOW_OFFSET(n));
> +}
> +
> +static bool is_endpoint_port(struct cxl_port *port)
> +{
> +	/* Endpoints can't be ports... yet! */
> +	return false;
> +}
> +
> +static void rescan_ports(struct work_struct *work)
> +{
> +	if (bus_rescan_devices(&cxl_bus_type))
> +		pr_warn("Failed to rescan\n");
> +}
> +
> +/* Minor layering violation */
> +static int dvsec_range_used(struct cxl_port *port)
> +{
> +	struct cxl_endpoint_dvsec_info *info;
> +	int i, ret = 0;
> +
> +	if (!is_endpoint_port(port))
> +		return 0;
> +
> +	info = port->data;
> +	for (i = 0; i < info->ranges; i++)
> +		if (info->range[i].size)
> +			ret |= 1 << i;
> +
> +	return ret;
> +}
> +
> +static int enumerate_hdm_decoders(struct cxl_port *port,
> +				  struct cxl_port_data *portdata)
> +{
> +	void __iomem *hdm_decoder = portdata->regs.hdm_decoder;
> +	bool global_enable;
> +	u32 global_ctrl;
> +	int i = 0;
> +
> +	global_ctrl = readl(hdm_decoder + CXL_HDM_DECODER_CTRL_OFFSET);
> +	global_enable = global_ctrl & CXL_HDM_DECODER_ENABLE;
> +	if (!global_enable) {
> +		i = dvsec_range_used(port);
> +		if (i) {
> +			dev_err(&port->dev,
> +				"Couldn't add port because device is using DVSEC range registers\n");
> +			return -EBUSY;
> +		}
> +	}
> +
> +	for (i = 0; i < portdata->caps.count; i++) {
> +		int rc, target_count = portdata->caps.tc;
> +		struct cxl_decoder *cxld;
> +		int *target_map = NULL;
> +		u64 size;
> +
> +		if (is_endpoint_port(port))
> +			target_count = 0;
> +
> +		cxld = cxl_decoder_alloc(port, target_count);
> +		if (IS_ERR(cxld)) {
> +			dev_warn(&port->dev,
> +				 "Failed to allocate the decoder\n");
> +			return PTR_ERR(cxld);
> +		}
> +
> +		cxld->target_type = CXL_DECODER_EXPANDER;
> +		cxld->interleave_ways = 1;
> +		cxld->interleave_granularity = 0;
> +
> +		size = get_decoder_size(hdm_decoder, i);
> +		if (size != 0) {
> +#define decoderN(reg, n) hdm_decoder + CXL_HDM_DECODER0_##reg(n)

Perhaps this block in the if (size != 0) would be more readable if broken out
to a utility function?
As normal I'm not keen on the macro magic if we can avoid it.
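
As a rough sketch of what a macro-free version could look like: a small
function that computes the per-decoder register offset from the 0x20 stride
used by the CXL_HDM_DECODER0_*(i) definitions in the cxl.h hunk of this patch.
The function name and the enum below are hypothetical, not proposed API.

```c
#include <stddef.h>

/* Stride between consecutive decoder register blocks (from cxl.h hunk). */
#define HDM_DECODER_STRIDE 0x20

/* Offsets of decoder 0's registers, mirroring the cxl.h definitions. */
enum hdm_decoder_reg {
	HDM_BASE_LOW  = 0x10,
	HDM_BASE_HIGH = 0x14,
	HDM_SIZE_LOW  = 0x18,
	HDM_SIZE_HIGH = 0x1c,
	HDM_CTRL      = 0x20,
	HDM_TL_LOW    = 0x24,
	HDM_TL_HIGH   = 0x28,
};

/* Hypothetical helper: byte offset of register 'reg' for decoder 'n'. */
static size_t hdm_decoder_reg_offset(unsigned int n, enum hdm_decoder_reg reg)
{
	return (size_t)HDM_DECODER_STRIDE * n + (size_t)reg;
}
```

A caller would then add the returned offset to the hdm_decoder base, so the
token pasting goes away and the compiler type-checks the register argument.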


> +			int temp[CXL_DECODER_MAX_INTERLEAVE];
> +			u64 base;
> +			u32 ctrl;
> +			int j;
> +			union {
> +				u64 value;
> +				char target_id[8];
> +			} target_list;

I thought we tried to avoid this union usage in the kernel because of the whole
thing about C compilers being able to do what they like with it...
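
One alternative, sketched here under the assumption that target j lives in
byte j of the 64-bit target list with byte 0 being the least significant
(which is what the union yields on a little-endian host): pull each id out
with a shift. The helper name is hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper: extract target id 'j' (0..7) from the 64-bit
 * target list value. Byte 0 is assumed to be the least significant
 * byte; the cast truncates to the low 8 bits after the shift.
 */
static uint8_t target_id_from_list(uint64_t target_list, unsigned int j)
{
	return (uint8_t)(target_list >> (8 * j));
}
```

Unlike the union, the result does not depend on host byte order.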

> +
> +			target_map = temp;

This is set to a variable that goes out of scope whilst target_map is still in
use.
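
A reduced illustration of the fix I have in mind, outside the kernel: declare
the array in the same scope as the pointer that refers to it, so the storage
is still alive when the pointer is consumed. All names here are made up for
the example, and the array bound of 16 is only a stand-in for
CXL_DECODER_MAX_INTERLEAVE.

```c
#include <stddef.h>

#define MAX_INTERLEAVE 16	/* stand-in bound, assumed for the example */

/* Stand-in for the consumer of target_map; NULL means "no targets". */
static int consume_target_map(const int *target_map)
{
	return target_map ? target_map[0] : -1;
}

static int enumerate_one(unsigned long long size, int first_target)
{
	/* Same scope as target_map, so it outlives the if block below. */
	int temp[MAX_INTERLEAVE];
	int *target_map = NULL;

	if (size != 0) {
		temp[0] = first_target;
		target_map = temp;
	}

	/* temp is still in scope here, so the access is well defined. */
	return consume_target_map(target_map);
}
```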

> +			ctrl = readl(decoderN(CTRL_OFFSET, i));
> +			base = ioread64_hi_lo(decoderN(BASE_LOW_OFFSET, i));
> +
> +			cxld->decoder_range = (struct range){
> +				.start = base,
> +				.end = base + size - 1
> +			};
> +
> +			cxld->flags = CXL_DECODER_F_ENABLE;
> +			cxld->interleave_ways = to_interleave_ways(ctrl);
> +			cxld->interleave_granularity =
> +				to_interleave_granularity(ctrl);
> +
> +			if (FIELD_GET(CXL_HDM_DECODER0_CTRL_TYPE, ctrl) == 0)
> +				cxld->target_type = CXL_DECODER_ACCELERATOR;
> +
> +			target_list.value = ioread64_hi_lo(decoderN(TL_LOW, i));
> +			for (j = 0; j < cxld->interleave_ways; j++)
> +				target_map[j] = target_list.target_id[j];
> +#undef decoderN
> +		}
> +
> +		rc = cxl_decoder_add_locked(cxld, target_map);
> +		if (rc)
> +			put_device(&cxld->dev);
> +		else
> +			rc = cxl_decoder_autoremove(&port->dev, cxld);
> +		if (rc)
> +			dev_err(&port->dev, "Failed to add decoder\n");

If that fails on the autoremove registration (unlikely) this message
will be rather confusing - as the add was fine...

This nest of carrying on when we have an error is getting ever deeper...

> +		else
> +			dev_dbg(&cxld->dev, "Added to port %s\n",
> +				dev_name(&port->dev));
> +	}
> +
> +	/*
> +	 * Turn on global enable now since DVSEC ranges aren't being used and
> +	 * we'll eventually want the decoder enabled.
> +	 */
> +	if (!global_enable) {
> +		dev_dbg(&port->dev, "Enabling HDM decode\n");
> +		writel(global_ctrl | CXL_HDM_DECODER_ENABLE, hdm_decoder + CXL_HDM_DECODER_CTRL_OFFSET);
> +	}
> +
> +	return 0;
> +}
> +


* Re: [PATCH 21/23] cxl: Unify port enumeration for decoders
  2021-11-20  0:02 ` [PATCH 21/23] cxl: Unify port enumeration for decoders Ben Widawsky
@ 2021-11-22 17:48   ` Jonathan Cameron
  2021-11-22 23:44     ` Ben Widawsky
  0 siblings, 1 reply; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-22 17:48 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On Fri, 19 Nov 2021 16:02:48 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> The port driver exists to do proper enumeration of the component
> registers for ports, including HDM decoder resources. Any port which
> follows the CXL specification to implement HDM decoder registers should
> be handled by the port driver. This includes host bridge registers that
> are currently handled within the cxl_acpi driver.
> 
> In moving the responsibility from cxl_acpi to cxl_port, three primary
> things are accomplished here:
> 1. Multi-port host bridges are now handled by the port driver
> 2. Single port host bridges are handled by the port driver
> 3. Single port switches without component registers will be handled by
>    the port driver.
> 
> While it's tempting to remove decoder APIs from cxl_core entirely, it is
> still required that platform specific drivers are able to add decoders
> which aren't specified in CXL 2.0+. An example of this is the CFMWS
> parsing which is implemented in cxl_acpi.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
One trivial suggestion inline, but looks fine to me otherwise.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> 
> ---
> Changes since RFCv2:
> - Renamed subject
> - Reworded commit message
> - Handle *all* host bridge port enumeration in cxl_port (Dan)
>   - Handle passthrough decoding in cxl_port
> ---
>  drivers/cxl/acpi.c     | 41 +++-----------------------------
>  drivers/cxl/core/bus.c |  6 +++--
>  drivers/cxl/cxl.h      |  2 ++
>  drivers/cxl/port.c     | 54 +++++++++++++++++++++++++++++++++++++++++-
>  4 files changed, 62 insertions(+), 41 deletions(-)
> 
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index c12e4fed7941..c85a04ecbf7f 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -210,8 +210,6 @@ static int add_host_bridge_uport(struct device *match, void *arg)
>  	struct acpi_device *bridge = to_cxl_host_bridge(host, match);
>  	struct acpi_pci_root *pci_root;
>  	struct cxl_walk_context ctx;
> -	int single_port_map[1], rc;
> -	struct cxl_decoder *cxld;
>  	struct cxl_dport *dport;
>  	struct cxl_port *port;
>  
> @@ -245,43 +243,9 @@ static int add_host_bridge_uport(struct device *match, void *arg)
>  		return -ENODEV;
>  	if (ctx.error)
>  		return ctx.error;
> -	if (ctx.count > 1)
> -		return 0;
>  
> -	/* TODO: Scan CHBCR for HDM Decoder resources */
> -
> -	/*
> -	 * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability
> -	 * Structure) single ported host-bridges need not publish a decoder
> -	 * capability when a passthrough decode can be assumed, i.e. all
> -	 * transactions that the uport sees are claimed and passed to the single
> -	 * dport. Disable the range until the first CXL region is enumerated /
> -	 * activated.
> -	 */
> -	cxld = cxl_decoder_alloc(port, 1);
> -	if (IS_ERR(cxld))
> -		return PTR_ERR(cxld);
> -
> -	cxld->interleave_ways = 1;
> -	cxld->interleave_granularity = PAGE_SIZE;
> -	cxld->target_type = CXL_DECODER_EXPANDER;
> -	cxld->platform_res = (struct resource)DEFINE_RES_MEM(0, 0);
> -
> -	device_lock(&port->dev);
> -	dport = list_first_entry(&port->dports, typeof(*dport), list);
> -	device_unlock(&port->dev);
> -
> -	single_port_map[0] = dport->port_id;
> -
> -	rc = cxl_decoder_add(cxld, single_port_map);
> -	if (rc)
> -		put_device(&cxld->dev);
> -	else
> -		rc = cxl_decoder_autoremove(host, cxld);
> -
> -	if (rc == 0)
> -		dev_dbg(host, "add: %s\n", dev_name(&cxld->dev));
> -	return rc;
> +	/* Host bridge ports are enumerated by the port driver. */
> +	return 0;
>  }
>  
>  struct cxl_chbs_context {
> @@ -448,3 +412,4 @@ module_platform_driver(cxl_acpi_driver);
>  MODULE_LICENSE("GPL v2");
>  MODULE_IMPORT_NS(CXL);
>  MODULE_IMPORT_NS(ACPI);
> +MODULE_SOFTDEP("pre: cxl_port");
> diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> index 46a06cfe79bd..acfa212eea21 100644
> --- a/drivers/cxl/core/bus.c
> +++ b/drivers/cxl/core/bus.c
> @@ -62,7 +62,7 @@ void cxl_unregister_topology_host(struct device *host)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_unregister_topology_host, CXL);
>  
> -static struct device *get_cxl_topology_host(void)
> +struct device *get_cxl_topology_host(void)
>  {
>  	down_read(&topology_host_sem);
>  	if (cxl_topology_host)
> @@ -70,12 +70,14 @@ static struct device *get_cxl_topology_host(void)
>  	up_read(&topology_host_sem);
>  	return NULL;
>  }
> +EXPORT_SYMBOL_NS_GPL(get_cxl_topology_host, CXL);
>  
> -static void put_cxl_topology_host(struct device *dev)
> +void put_cxl_topology_host(struct device *dev)
>  {
>  	WARN_ON(dev != cxl_topology_host);
>  	up_read(&topology_host_sem);
>  }
> +EXPORT_SYMBOL_NS_GPL(put_cxl_topology_host, CXL);
>  
>  static int decoder_match(struct device *dev, void *data)
>  {
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 24fa16157d5e..f8354241c5a3 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -170,6 +170,8 @@ void __iomem *devm_cxl_iomap_block(struct device *dev, resource_size_t addr,
>  
>  int cxl_register_topology_host(struct device *host);
>  void cxl_unregister_topology_host(struct device *host);
> +struct device *get_cxl_topology_host(void);
> +void put_cxl_topology_host(struct device *dev);
>  
>  /*
>   * cxl_decoder flags that define the type of memory / devices this
> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
> index 3c03131517af..7a1fc726fe9f 100644
> --- a/drivers/cxl/port.c
> +++ b/drivers/cxl/port.c
> @@ -233,12 +233,64 @@ static int enumerate_hdm_decoders(struct cxl_port *port,
>  	return 0;
>  }
>  
> +/*
> + * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability Structure)
> + * single ported host-bridges need not publish a decoder capability when a
> + * passthrough decode can be assumed, i.e. all transactions that the uport sees
> + * are claimed and passed to the single dport. Disable the range until the first
> + * CXL region is enumerated / activated.
> + */
> +static int add_passthrough_decoder(struct cxl_port *port)
> +{
> +	int single_port_map[1], rc;
> +	struct cxl_decoder *cxld;
> +	struct cxl_dport *dport;
> +
> +	device_lock_assert(&port->dev);
> +
> +	cxld = cxl_decoder_alloc(port, 1);
> +	if (IS_ERR(cxld))
> +		return PTR_ERR(cxld);
> +
> +	cxld->interleave_ways = 1;
> +	cxld->interleave_granularity = PAGE_SIZE;
> +	cxld->target_type = CXL_DECODER_EXPANDER;
> +	cxld->platform_res = (struct resource)DEFINE_RES_MEM(0, 0);
> +
> +	dport = list_first_entry(&port->dports, typeof(*dport), list);
> +	single_port_map[0] = dport->port_id;
> +
> +	rc = cxl_decoder_add_locked(cxld, single_port_map);
> +	if (rc)
> +		put_device(&cxld->dev);

I would handle this error path entirely here, or use a goto rather
than messing up the good path with conditionals on each element,
particularly as there isn't much to do in the error paths.
I guess this might get more complicated in later patches though.

Obviously that tidy up would make this more complex than simply moving
the code. (I might have commented on this before, but too long ago to remember ;)

	if (rc) {
		put_device(&cxld->dev);
		return rc;
	}
	rc = cxl_decoder...
	if (rc)
		return rc;

	dev_dbg(..

	return 0;
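
For reference, a fuller sketch of that early-return shape as a self-contained
userspace model. Everything named fake_* and add_passthrough_sketch() is a
hypothetical stand-in, not the CXL core API, and it assumes (as the original
flow implies by not calling put_device() there) that the autoremove step
cleans up after itself on failure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a decoder; not the real struct cxl_decoder. */
struct fake_decoder {
	int refs;	/* models the reference dropped by put_device() */
	int added;	/* set once the "add" step succeeded */
	int managed;	/* set once the "autoremove" step succeeded */
};

/* Models cxl_decoder_add_locked(); 'fail' forces the error path. */
static int fake_add(struct fake_decoder *d, int fail)
{
	if (fail)
		return -EINVAL;
	d->added = 1;
	return 0;
}

/* Models cxl_decoder_autoremove(); assumed to clean up on its own failure. */
static int fake_autoremove(struct fake_decoder *d, int fail)
{
	if (fail)
		return -ENOMEM;
	d->managed = 1;
	return 0;
}

/* Models put_device(). */
static void fake_put(struct fake_decoder *d)
{
	d->refs--;
}

/* Each error is handled where it occurs, so the good path reads straight down. */
static int add_passthrough_sketch(struct fake_decoder *d, int fail_add,
				  int fail_autoremove)
{
	int rc;

	assert(d != NULL);

	rc = fake_add(d, fail_add);
	if (rc) {
		fake_put(d);
		return rc;
	}

	rc = fake_autoremove(d, fail_autoremove);
	if (rc)
		return rc;

	/* the dev_dbg() "add: ..." message would go here */
	return 0;
}
```

The reference is dropped only when the add step fails, which matches the
behaviour of the if/else version in the patch.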

> +	else
> +		rc = cxl_decoder_autoremove(&port->dev, cxld);
> +
> +	if (rc == 0)
> +		dev_dbg(&port->dev, "add: %s\n", dev_name(&cxld->dev));
> +
> +	return rc;
> +}
> +

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 04/23] cxl/pci: Implement Interface Ready Timeout
  2021-11-22 17:17     ` Ben Widawsky
@ 2021-11-22 17:53       ` Jonathan Cameron
  2021-11-24 19:56         ` Dan Williams
  0 siblings, 1 reply; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-22 17:53 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On Mon, 22 Nov 2021 09:17:31 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> On 21-11-22 15:02:27, Jonathan Cameron wrote:
> > On Fri, 19 Nov 2021 16:02:31 -0800
> > Ben Widawsky <ben.widawsky@intel.com> wrote:
> >   
> > > The original driver implementation used the doorbell timeout for the
> > > Mailbox Interface Ready bit to piggy back off of, since the latter
> > > doesn't have a defined timeout. This functionality, introduced in
> > > 8adaf747c9f0 ("cxl/mem: Find device capabilities"), can now be improved
> > > since a timeout has been defined with an ECN to the 2.0 spec.
> > > 
> > > While devices implemented prior to the ECN could have an arbitrarily
> > > long wait and still be within spec, the max ECN value (256s) is chosen
> > > as the default for all devices. All vendors in the consortium agreed to
> > > this amount and so it is reasonable to assume no devices made will
> > > exceed this amount.  
> > 
> > Optimistic :)
> >   
> 
> Reasonable to assume is certainly not the same as "in reality". I can soften
> this wording.
> 
> > > 
> > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > ---
> > > This patch did not exist in RFCv2
> > > ---
> > >  drivers/cxl/pci.c | 29 +++++++++++++++++++++++++++++
> > >  1 file changed, 29 insertions(+)
> > > 
> > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > index 6c8d09fb3a17..2cef9fec8599 100644
> > > --- a/drivers/cxl/pci.c
> > > +++ b/drivers/cxl/pci.c
> > > @@ -2,6 +2,7 @@
> > >  /* Copyright(c) 2020 Intel Corporation. All rights reserved. */
> > >  #include <linux/io-64-nonatomic-lo-hi.h>
> > >  #include <linux/module.h>
> > > +#include <linux/delay.h>
> > >  #include <linux/sizes.h>
> > >  #include <linux/mutex.h>
> > >  #include <linux/list.h>
> > > @@ -298,6 +299,34 @@ static int cxl_pci_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *c
> > >  static int cxl_pci_setup_mailbox(struct cxl_dev_state *cxlds)
> > >  {
> > >  	const int cap = readl(cxlds->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
> > > +	unsigned long timeout;
> > > +	u64 md_status;
> > > +	int rc;
> > > +
> > > +	/*
> > > +	 * CXL 2.0 ECN "Add Mailbox Ready Time" defines a capability field to
> > > +	 * dictate how long to wait for the mailbox to become ready. For
> > > +	 * simplicity, and to handle devices that might have been implemented  
> > 
> > I'm not keen on the 'for simplicity' argument here.  If the device is advertising
> > a lower value, then that is what we should use.  It's fine to wait the max time
> > if nothing is specified.  It'll cost us a few lines of code at most unless
> > I am missing something...
> > 
> > Jonathan
> >   
> 
> Let me pose it a different way: if a device advertises 1s, but for whatever
> reason takes 4s to come up, should we penalize it over the device advertising 256s?

Yes, because it is buggy.  A compliance test should have failed on this anyway.

> The
> way this field is defined in the spec would [IMHO] lead vendors to simply put
> the max field in there to game the driver, so why not start off with just
> insisting they don't?

Given that reading this value and getting a big number implies that
the device is meant to be really slow to initialize, I'd expect that to push
vendors a little in the direction of putting realistic values in.

Maybe we should print the value in the log to make them look silly ;)

Jonathan
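
To make the two positions concrete, here is a minimal userspace sketch of the
"honour the advertised value, fall back to the max" policy. It is a model
only: the helper names, the treatment of 0 as "nothing advertised", and the
clamp to 256s are assumptions for illustration, and time is simulated rather
than slept. The patch under review always waits the full 256s.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

#define MBOX_READY_TIMEOUT_MAX_SECS 256U	/* max ECN value quoted in the thread */

/* Assumption: 0 means the device advertises nothing; clamp anything larger than the max. */
static unsigned int mbox_ready_timeout_secs(unsigned int advertised_secs)
{
	if (advertised_secs == 0 || advertised_secs > MBOX_READY_TIMEOUT_MAX_SECS)
		return MBOX_READY_TIMEOUT_MAX_SECS;
	return advertised_secs;
}

/* Callback reporting whether the (simulated) interface is ready at a given time. */
typedef int (*ready_fn)(unsigned int elapsed_ms, void *ctx);

/* Simulated device: becomes ready once *ctx milliseconds have elapsed. */
static int ready_after(unsigned int elapsed_ms, void *ctx)
{
	return elapsed_ms >= *(unsigned int *)ctx;
}

/* Poll until ready or until the chosen timeout has elapsed. */
static int wait_mbox_ready(ready_fn ready, void *ctx,
			   unsigned int advertised_secs, unsigned int poll_ms)
{
	unsigned int timeout_ms = mbox_ready_timeout_secs(advertised_secs) * 1000U;
	unsigned int elapsed_ms = 0;

	assert(ready != NULL && poll_ms > 0);

	for (;;) {
		if (ready(elapsed_ms, ctx))
			return 0;
		if (elapsed_ms >= timeout_ms)
			break;
		elapsed_ms += poll_ms;	/* a driver would sleep here instead */
	}

	/* Logging the advertised value makes a misreporting device easy to spot. */
	fprintf(stderr, "mailbox not ready: advertised %us, waited %ums\n",
		advertised_secs, elapsed_ms);
	return -EIO;
}
```

With this policy the device from the example above (advertises 1s, takes 4s)
fails to come up, while a device that advertises nothing gets the full 256s.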

> 
> > > +	 * prior to the ECN, wait the max amount of time no matter what the
> > > +	 * device says.
> > > +	 */
> > > +	timeout = jiffies + 256 * HZ;
> > > +
> > > +	rc = check_device_status(cxlds);
> > > +	if (rc)
> > > +		return rc;
> > > +
> > > +	do {
> > > +		md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
> > > +		if (md_status & CXLMDEV_MBOX_IF_READY)
> > > +			break;
> > > +		if (msleep_interruptible(100))
> > > +			break;
> > > +	} while (!time_after(jiffies, timeout));
> > > +
> > > +	/* It's assumed that once the interface is ready, it will remain ready. */
> > > +	if (!(md_status & CXLMDEV_MBOX_IF_READY))
> > > +		return -EIO;
> > >  
> > >  	cxlds->mbox_send = cxl_pci_mbox_send;
> > >  	cxlds->payload_size =  
> >   


^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 22/23] cxl/mem: Introduce cxl_mem driver
  2021-11-20  0:02 ` [PATCH 22/23] cxl/mem: Introduce cxl_mem driver Ben Widawsky
  2021-11-20  0:40   ` Randy Dunlap
@ 2021-11-22 18:17   ` Jonathan Cameron
  2021-11-23  0:05     ` Ben Widawsky
  1 sibling, 1 reply; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-22 18:17 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On Fri, 19 Nov 2021 16:02:49 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> Add a driver that is capable of determining whether a device is in a
> CXL.mem routed part of the topology.
> 
> This driver allows a higher level driver - such as one controlling CXL
> regions, which is itself a set of CXL devices - to easily determine if
> the CXL devices are CXL.mem capable by checking if the driver has bound.
> CXL memory device services may also be provided by this driver though
> none are needed as of yet. cxl_mem also plays the part of registering
> itself as an endpoint port, which is a required step to enumerate the
> device's HDM decoder resources.
> 
> Even though cxl_mem driver is the only consumer of the new
> cxl_scan_ports() introduced in cxl_core, because that functionality has
> PCIe specificity it is kept out of this driver.
> 
> As part of this patch, find_dport_by_dev() is promoted to the cxl_core's
> set of APIs for use by the new driver.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> 
Main thing in here is the mysterious private data. I'd drop
that until we have patches that set it in the same series.




...

> diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> index acfa212eea21..cab3aabd5abc 100644
> --- a/drivers/cxl/core/bus.c
> +++ b/drivers/cxl/core/bus.c
> @@ -8,6 +8,7 @@
>  #include <linux/idr.h>
>  #include <cxlmem.h>
>  #include <cxl.h>
> +#include <pci.h>
>  #include "core.h"
>  
>  /**
> @@ -436,7 +437,7 @@ static int devm_cxl_link_uport(struct device *host, struct cxl_port *port)
>  
>  static struct cxl_port *cxl_port_alloc(struct device *uport,
>  				       resource_size_t component_reg_phys,
> -				       struct cxl_port *parent_port)
> +				       struct cxl_port *parent_port, void *data)
>  {
>  	struct cxl_port *port;
>  	struct device *dev;
> @@ -465,6 +466,7 @@ static struct cxl_port *cxl_port_alloc(struct device *uport,
>  
>  	port->uport = uport;
>  	port->component_reg_phys = component_reg_phys;
> +	port->data = data;
>  	ida_init(&port->decoder_ida);
>  	INIT_LIST_HEAD(&port->dports);
>  
> @@ -485,16 +487,17 @@ static struct cxl_port *cxl_port_alloc(struct device *uport,
>   * @uport: "physical" device implementing this upstream port
>   * @component_reg_phys: (optional) for configurable cxl_port instances
>   * @parent_port: next hop up in the CXL memory decode hierarchy
> + * @data: opaque data to be used by the port driver
>   */
>  struct cxl_port *devm_cxl_add_port(struct device *uport,
>  				   resource_size_t component_reg_phys,
> -				   struct cxl_port *parent_port)
> +				   struct cxl_port *parent_port, void *data)
>  {
>  	struct device *dev, *host;
>  	struct cxl_port *port;
>  	int rc;
>  
> -	port = cxl_port_alloc(uport, component_reg_phys, parent_port);
> +	port = cxl_port_alloc(uport, component_reg_phys, parent_port, data);
>  	if (IS_ERR(port))
>  		return port;
>  
> @@ -531,6 +534,113 @@ struct cxl_port *devm_cxl_add_port(struct device *uport,
>  }
>  EXPORT_SYMBOL_NS_GPL(devm_cxl_add_port, CXL);
>  


...

> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> new file mode 100644
> index 000000000000..818e30571e4d
> --- /dev/null
> +++ b/drivers/cxl/core/pci.c
> @@ -0,0 +1,119 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
> +#include <linux/device.h>
> +#include <linux/pci.h>
> +#include <cxl.h>
> +#include <pci.h>
> +#include "core.h"
> +
> +/**
> + * DOC: cxl core pci
> + *
> + * Compute Express Link protocols are layered on top of PCIe. CXL core provides
> + * a set of helpers for CXL interactions which occur via PCIe.
> + */
> +
> +/**
> + * find_parent_cxl_port() - Finds parent port through PCIe mechanisms
> + * @pdev: PCIe USP or DSP to find an upstream port for
> + *
> + * Once all CXL ports are enumerated, there is no need to reference the PCIe
> + * parallel universe as all downstream ports are contained in a linked list, and
> + * all upstream ports are accessible via pointer. During the enumeration, it is
> + * very convenient to be able to peek up one level in the hierarchy without
> + * needing the established relationship between data structures so that the
> + * parenting can be done as the ports/dports are created.
> + *
> + * A reference is kept to the found port.
> + */
> +struct cxl_port *find_parent_cxl_port(struct pci_dev *pdev)
> +{
> +	struct device *parent_dev, *gparent_dev;
> +
> +	/* Parent is either a downstream port, or root port */
> +	parent_dev = get_device(pdev->dev.parent);
> +
> +	if (is_cxl_switch_usp(&pdev->dev)) {
> +		if (dev_WARN_ONCE(&pdev->dev,

Maybe put the condition in a local variable to reduce the indent and get something
more readable?
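
A small sketch of what that could look like, as a standalone model. The enum
values are local stand-ins for the PCI_EXP_TYPE_* constants and
check_parent_type() is a hypothetical helper, not part of the patch; in the
driver the boolean would be handed to dev_WARN_ONCE() rather than tested
directly.

```c
#include <stdbool.h>

/* Local stand-ins for PCI_EXP_TYPE_ROOT_PORT / _UPSTREAM / _DOWNSTREAM. */
enum {
	TYPE_ROOT_PORT = 0x4,
	TYPE_UPSTREAM = 0x5,
	TYPE_DOWNSTREAM = 0x6,
};

/* Returns 0 when the type is one of the two expected values, -1 otherwise. */
static int check_parent_type(int type)
{
	/* Naming the condition keeps the warning call on a single short line. */
	bool bad_type = type != TYPE_DOWNSTREAM && type != TYPE_ROOT_PORT;

	if (bad_type)	/* the patch would use dev_WARN_ONCE(dev, bad_type, ...) */
		return -1;

	return 0;
}
```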

> +				  pci_pcie_type(pdev) !=
> +						  PCI_EXP_TYPE_DOWNSTREAM &&
> +					  pci_pcie_type(pdev) !=
> +						  PCI_EXP_TYPE_ROOT_PORT,
> +				  "Parent not downstream\n"))
> +			goto err;
> +
> +		/*
> +		 * Grandparent is either an upstream port or a platform device that has
> +		 * been added as a cxl_port already.
> +		 */
> +		gparent_dev = get_device(parent_dev->parent);
> +		put_device(parent_dev);
> +
> +		return to_cxl_port(gparent_dev);
> +	} else if (is_cxl_switch_dsp(&pdev->dev)) {
> +		if (dev_WARN_ONCE(&pdev->dev,
> +				  pci_pcie_type(pdev) != PCI_EXP_TYPE_UPSTREAM,
> +				  "Parent not upstream"))
> +			goto err;
> +		return to_cxl_port(parent_dev);
> +	}
> +
> +err:
> +	dev_WARN(&pdev->dev, "Invalid topology\n");
> +	put_device(parent_dev);
> +	return NULL;
> +}
> +

...

> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index f8354241c5a3..3bda806f4244 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -296,6 +296,7 @@ struct cxl_port {
>   * @port: reference to cxl_port that contains this downstream port
>   * @list: node for a cxl_port's list of cxl_dport instances
>   * @root_port_link: node for global list of root ports
> + * @data: Opaque data passed by other drivers, used by port driver

Is this used yet? Possibly leave introducing this until we need it,
as it's not obvious here what it will be for.

>   */
>  struct cxl_dport {
>  	struct device *dport;
> @@ -304,16 +305,20 @@ struct cxl_dport {
>  	struct cxl_port *port;
>  	struct list_head list;
>  	struct list_head root_port_link;
> +	void *data;
>  };
>  
>  struct cxl_port *to_cxl_port(struct device *dev);
>  struct cxl_port *devm_cxl_add_port(struct device *uport,
>  				   resource_size_t component_reg_phys,
> -				   struct cxl_port *parent_port);
> +				   struct cxl_port *parent_port, void *data);
> +void cxl_scan_ports(struct cxl_dport *root_port);

...

> +
> +static int create_endpoint(struct device *dev, struct cxl_port *parent,
> +			   struct cxl_dport *dport)
> +{
> +	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> +	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> +	struct cxl_endpoint_dvsec_info *info = cxlds->info;
> +	struct cxl_port *endpoint;
> +	int rc;
> +
> +	endpoint =
> +		devm_cxl_add_port(dev, cxlds->component_reg_phys, parent, info);

I'd just have that on one line, or break it somewhere in the parameter list.
Right now it just looks odd and saves maybe 4 characters in line length.

> +	if (IS_ERR(endpoint))
> +		return PTR_ERR(endpoint);
> +
> +	rc = sysfs_create_link(&cxlmd->dev.kobj, &dport->dport->kobj,
> +			       "root_port");

Not obvious to me what this link is for.  Maybe needs a docs update
somewhere?

> +	if (rc) {
> +		device_del(&endpoint->dev);
> +		return rc;
> +	}
> +	dev_dbg(dev, "add: %s\n", dev_name(&endpoint->dev));
> +
> +	return devm_add_action_or_reset(dev, remove_endpoint, cxlmd);
> +}
> +
> +static int cxl_mem_probe(struct device *dev)
> +{
> +	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> +	struct cxl_port *hostbridge, *parent_port;
> +	struct walk_ctx ctx = { NULL, false };

Perhaps use C99-style designated initializers to show what is being initialized
inside the walk ctx. Will make it more obvious these are the right values.
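
For illustration, a sketch of the designated-initializer form. The struct
layout is an assumption inferred from the ctx.root_port and ctx.has_switch
uses in the probe function; the real definition of struct walk_ctx is not
shown in this excerpt, and walk_ctx_init() is a hypothetical wrapper so the
snippet stands alone.

```c
#include <stdbool.h>
#include <stddef.h>

struct cxl_dport;	/* opaque here; only a pointer to it is stored */

/* Assumed layout, inferred from how ctx is used in cxl_mem_probe(). */
struct walk_ctx {
	struct cxl_dport *root_port;
	bool has_switch;
};

static struct walk_ctx walk_ctx_init(void)
{
	/* Each field is named, so the values stay correct if members are reordered. */
	struct walk_ctx ctx = {
		.root_port = NULL,
		.has_switch = false,
	};

	return ctx;
}
```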

> +	struct cxl_dport *dport;
> +	int rc;
> +
> +	rc = wait_for_media(cxlmd);
> +	if (rc) {
> +		dev_err(dev, "Media not active (%d)\n", rc);
> +		return rc;
> +	}
> +
> +	walk_to_root_port(dev, &ctx);
> +
> +	/*
> +	 * Couldn't find a CXL capable root port. This may happen even with a
> +	 * CXL capable topology if cxl_acpi hasn't completed yet. A rescan will
> +	 * occur.
> +	 */
> +	if (!ctx.root_port)
> +		return -ENODEV;
> +
> +	hostbridge = ctx.root_port->port;
> +	device_lock(&hostbridge->dev);
> +
> +	/* hostbridge has no port driver, the topology isn't enabled yet */
> +	if (!hostbridge->dev.driver) {
> +		device_unlock(&hostbridge->dev);
> +		return -ENODEV;
> +	}
> +
> +	/* No switch + found root port means we're done */
> +	if (!ctx.has_switch) {
> +		parent_port = to_cxl_port(&hostbridge->dev);
> +		dport = ctx.root_port;
> +		goto out;
> +	}
> +
> +	/* Walk down from the root port and add all switches */
> +	cxl_scan_ports(ctx.root_port);
> +
> +	/* If parent is a dport the endpoint is good to go. */
> +	parent_port = to_cxl_port(dev->parent->parent);
> +	dport = cxl_find_dport_by_dev(parent_port, dev->parent);
> +	if (!dport) {
> +		rc = -ENODEV;
> +		goto err_out;
> +	}
> +
> +out:
> +	rc = create_endpoint(dev, parent_port, dport);
> +	if (rc)
> +		goto err_out;
> +
> +	cxlmd->root_port = ctx.root_port;
> +
> +err_out:
> +	device_unlock(&hostbridge->dev);
> +	return rc;
> +}
> +

> diff --git a/drivers/cxl/pci.h b/drivers/cxl/pci.h
> index 2a48cd65bf59..3fd0909522f2 100644
> --- a/drivers/cxl/pci.h
> +++ b/drivers/cxl/pci.h
> @@ -15,6 +15,7 @@
>  
>  /* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
>  #define CXL_DVSEC_PCIE_DEVICE					0
> +

Noise that shouldn't be in this patch.

>  #define   CXL_DVSEC_PCIE_DEVICE_CAP_OFFSET			0xA
>  #define     CXL_DVSEC_PCIE_DEVICE_MEM_CAPABLE			BIT(2)
>  #define     CXL_DVSEC_PCIE_DEVICE_HDM_COUNT_MASK		GENMASK(5, 4)
> @@ -64,4 +65,7 @@ enum cxl_regloc_type {
>  	((resource_size_t)(pci_resource_start(pdev, (map)->barno) +            \
>  			   (map)->block_offset))
>  
> +bool is_cxl_switch_usp(struct device *dev);
> +bool is_cxl_switch_dsp(struct device *dev);
> +
>  #endif /* __CXL_PCI_H__ */



^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 23/23] cxl/mem: Disable switch hierarchies for now
  2021-11-20  0:02 ` [PATCH 23/23] cxl/mem: Disable switch hierarchies for now Ben Widawsky
@ 2021-11-22 18:19   ` Jonathan Cameron
  2021-11-22 19:17     ` Ben Widawsky
  0 siblings, 1 reply; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-22 18:19 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On Fri, 19 Nov 2021 16:02:50 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> Switches aren't supported by the region driver yet. If a device finds
> itself under a switch it will not bind a driver so that it cannot be
> used later for region creation/configuration.

What's the reasoning for having this in this patch set rather than the region one?

I was rather hoping you'd have it working when the region set is ready :)

Jonathan

> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
>  drivers/cxl/mem.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
> index e954144af4b8..997898e78d63 100644
> --- a/drivers/cxl/mem.c
> +++ b/drivers/cxl/mem.c
> @@ -155,6 +155,11 @@ static int cxl_mem_probe(struct device *dev)
>  		goto out;
>  	}
>  
> +	/* FIXME: Add true switch support */
> +	dev_err(dev, "Devices behind switches are currently unsupported\n");
> +	rc = -ENODEV;
> +	goto err_out;
> +
>  	/* Walk down from the root port and add all switches */
>  	cxl_scan_ports(ctx.root_port);
>  


^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 14/23] cxl: Introduce topology host registration
  2021-11-20  0:02 ` [PATCH 14/23] cxl: Introduce topology host registration Ben Widawsky
@ 2021-11-22 18:20   ` Jonathan Cameron
  2021-11-22 22:30     ` Ben Widawsky
  2021-11-25  1:09   ` Dan Williams
  1 sibling, 1 reply; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-22 18:20 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On Fri, 19 Nov 2021 16:02:41 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> The description of the CXL topology will be conveyed by a platform
> specific entity that is expected to be a singleton. For ACPI based
> systems, this is ACPI0017. When the topology host goes away, which as of
> now can only be triggered by module unload, it is desirable to have the
> entire topology cleaned up. Regular devm unwinding handles most
> situations already, but what's missing is the ability to teardown the
> root port. Since the root port is platform specific, the core needs a
> set of APIs to allow platform specific drivers to register their root
> ports. With that, all the automatic teardown can occur.
> 
> cxl_test makes for an interesting case. cxl_test creates an alternate
> universe where there are possibly two root topology hosts (a real
> ACPI0017, and a fake ACPI0017). For this to work in the future, cxl_acpi
> (or some future platform host driver) will need to be unloaded first.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
This is a little unusual looking but having followed through how it is used
it seems like a sensible approach to me.
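
The unusual part is that the get helper returns with the read side of the
semaphore still held on success, and the put helper is what releases it. A
single-threaded userspace model of that contract is sketched below; struct
model_rwsem and the *_host() helpers are illustrative stand-ins, not the
kernel rw_semaphore or the functions in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy, single-threaded stand-in for an rw_semaphore: just counts holders. */
struct model_rwsem {
	int readers;
	int writer;
};

static struct model_rwsem topology_sem;
static const char *topology_host;	/* stands in for a struct device pointer */

/* Only one host may be registered at a time. */
static int register_host(const char *host)
{
	/* a real down_write() would block here rather than assert */
	assert(!topology_sem.readers && !topology_sem.writer);
	topology_sem.writer = 1;
	if (topology_host) {
		topology_sem.writer = 0;
		return -EBUSY;
	}
	topology_host = host;
	topology_sem.writer = 0;
	return 0;
}

/* On success the read lock stays held until put_host() is called. */
static const char *get_host(void)
{
	topology_sem.readers++;		/* models down_read() */
	if (topology_host)
		return topology_host;
	topology_sem.readers--;		/* models up_read() on the failure path */
	return NULL;
}

static void put_host(const char *host)
{
	assert(host == topology_host);
	topology_sem.readers--;		/* models up_read() */
}
```

In this model, as in the patch, a caller that gets a non-NULL host must pair
it with a put, since that is the only place the read side is released.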

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
> The topology lock can be used for more things. I'd like to save that as
> an add-on patch since it's extra risk for no reward, at this point.
> ---
>  drivers/cxl/acpi.c     | 18 ++++++++++---
>  drivers/cxl/core/bus.c | 57 +++++++++++++++++++++++++++++++++++++++---
>  drivers/cxl/cxl.h      |  5 +++-
>  3 files changed, 73 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index 3415184a2e61..82cc42ab38c6 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -224,8 +224,7 @@ static int add_host_bridge_uport(struct device *match, void *arg)
>  		return 0;
>  	}
>  
> -	port = devm_cxl_add_port(host, match, dport->component_reg_phys,
> -				 root_port);
> +	port = devm_cxl_add_port(match, dport->component_reg_phys, root_port);
>  	if (IS_ERR(port))
>  		return PTR_ERR(port);
>  	dev_dbg(host, "%s: add: %s\n", dev_name(match), dev_name(&port->dev));
> @@ -376,6 +375,11 @@ static int add_root_nvdimm_bridge(struct device *match, void *data)
>  	return 1;
>  }
>  
> +static void clear_topology_host(void *data)
> +{
> +	cxl_unregister_topology_host(data);
> +}
> +
>  static int cxl_acpi_probe(struct platform_device *pdev)
>  {
>  	int rc;
> @@ -384,7 +388,15 @@ static int cxl_acpi_probe(struct platform_device *pdev)
>  	struct acpi_device *adev = ACPI_COMPANION(host);
>  	struct cxl_cfmws_context ctx;
>  
> -	root_port = devm_cxl_add_port(host, host, CXL_RESOURCE_NONE, NULL);
> +	rc = cxl_register_topology_host(host);
> +	if (rc)
> +		return rc;
> +
> +	rc = devm_add_action_or_reset(host, clear_topology_host, host);
> +	if (rc)
> +		return rc;
> +
> +	root_port = devm_cxl_add_port(host, CXL_RESOURCE_NONE, root_port);
>  	if (IS_ERR(root_port))
>  		return PTR_ERR(root_port);
>  	dev_dbg(host, "add: %s\n", dev_name(&root_port->dev));
> diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> index cd6fe7823c69..2ad38167796d 100644
> --- a/drivers/cxl/core/bus.c
> +++ b/drivers/cxl/core/bus.c
> @@ -25,6 +25,53 @@
>   */
>  
>  static DEFINE_IDA(cxl_port_ida);
> +static DECLARE_RWSEM(topology_host_sem);
> +
> +static struct device *cxl_topology_host;
> +
> +int cxl_register_topology_host(struct device *host)
> +{
> +	down_write(&topology_host_sem);
> +	if (cxl_topology_host) {
> +		up_write(&topology_host_sem);
> +		pr_warn("%s host currently in use. Please try unloading %s",
> +			dev_name(cxl_topology_host), host->driver->name);
> +		return -EBUSY;
> +	}
> +
> +	cxl_topology_host = host;
> +	up_write(&topology_host_sem);
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_register_topology_host, CXL);
> +
> +void cxl_unregister_topology_host(struct device *host)
> +{
> +	down_write(&topology_host_sem);
> +	if (cxl_topology_host == host)
> +		cxl_topology_host = NULL;
> +	else
> +		pr_warn("topology host in use by %s\n",
> +			cxl_topology_host->driver->name);
> +	up_write(&topology_host_sem);
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_unregister_topology_host, CXL);
> +
> +static struct device *get_cxl_topology_host(void)
> +{
> +	down_read(&topology_host_sem);
> +	if (cxl_topology_host)
> +		return cxl_topology_host;
> +	up_read(&topology_host_sem);
> +	return NULL;
> +}
> +
> +static void put_cxl_topology_host(struct device *dev)
> +{
> +	WARN_ON(dev != cxl_topology_host);
> +	up_read(&topology_host_sem);
> +}
>  
>  static ssize_t devtype_show(struct device *dev, struct device_attribute *attr,
>  			    char *buf)
> @@ -362,17 +409,16 @@ static struct cxl_port *cxl_port_alloc(struct device *uport,
>  
>  /**
>   * devm_cxl_add_port - register a cxl_port in CXL memory decode hierarchy
> - * @host: host device for devm operations
>   * @uport: "physical" device implementing this upstream port
>   * @component_reg_phys: (optional) for configurable cxl_port instances
>   * @parent_port: next hop up in the CXL memory decode hierarchy
>   */
> -struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport,
> +struct cxl_port *devm_cxl_add_port(struct device *uport,
>  				   resource_size_t component_reg_phys,
>  				   struct cxl_port *parent_port)
>  {
> +	struct device *dev, *host;
>  	struct cxl_port *port;
> -	struct device *dev;
>  	int rc;
>  
>  	port = cxl_port_alloc(uport, component_reg_phys, parent_port);
> @@ -391,7 +437,12 @@ struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport,
>  	if (rc)
>  		goto err;
>  
> +	host = get_cxl_topology_host();
> +	if (!host)
> +		return ERR_PTR(-ENODEV);
> +
>  	rc = devm_add_action_or_reset(host, unregister_port, port);
> +	put_cxl_topology_host(host);
>  	if (rc)
>  		return ERR_PTR(rc);
>  
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 2c5627fa8a34..6fac4826d22b 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -152,6 +152,9 @@ int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
>  #define CXL_RESOURCE_NONE ((resource_size_t) -1)
>  #define CXL_TARGET_STRLEN 20
>  
> +int cxl_register_topology_host(struct device *host);
> +void cxl_unregister_topology_host(struct device *host);
> +
>  /*
>   * cxl_decoder flags that define the type of memory / devices this
>   * decoder supports as well as configuration lock status See "CXL 2.0
> @@ -279,7 +282,7 @@ struct cxl_dport {
>  };
>  
>  struct cxl_port *to_cxl_port(struct device *dev);
> -struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport,
> +struct cxl_port *devm_cxl_add_port(struct device *uport,
>  				   resource_size_t component_reg_phys,
>  				   struct cxl_port *parent_port);
>  


^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 15/23] cxl/core: Store global list of root ports
  2021-11-20  0:02 ` [PATCH 15/23] cxl/core: Store global list of root ports Ben Widawsky
@ 2021-11-22 18:22   ` Jonathan Cameron
  2021-11-22 22:32     ` Ben Widawsky
  0 siblings, 1 reply; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-22 18:22 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On Fri, 19 Nov 2021 16:02:42 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> CXL root ports (the downstream port to a host bridge) are to be
> enumerated by a platform specific driver. In the case of ACPI compliant
> systems, this is like the cxl_acpi driver. Root ports are the first
> CXL spec defined component that can be "found" by that platform specific
> driver.
> 
> By storing a list of these root ports, components in lower levels of the
> topology (switches and endpoints) have a mechanism to walk up their
> device hierarchy to find an enumerated root port. This will be necessary
> for region programming.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> 
> ---
> Dan points out in review this is possible to do without a new global
> list. While I agree, I was unable to get it working in a reasonable
> amount of time. Will punt on that for now.

This has made me curious.  Is this a punt it for v1, or a punt it for longer
term and maybe revisit later?

What you have looks like it should work fine to me.


> ---
>  drivers/cxl/acpi.c            |  4 ++--
>  drivers/cxl/core/bus.c        | 34 +++++++++++++++++++++++++++++++++-
>  drivers/cxl/cxl.h             |  5 ++++-
>  tools/testing/cxl/mock_acpi.c |  4 ++--
>  4 files changed, 41 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index 82cc42ab38c6..c12e4fed7941 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -159,7 +159,7 @@ __mock int match_add_root_ports(struct pci_dev *pdev, void *data)
>  		creg = cxl_reg_block(pdev, &map);
>  
>  	port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
> -	rc = cxl_add_dport(port, &pdev->dev, port_num, creg);
> +	rc = cxl_add_dport(port, &pdev->dev, port_num, creg, true);
>  	if (rc) {
>  		ctx->error = rc;
>  		return rc;
> @@ -341,7 +341,7 @@ static int add_host_bridge_dport(struct device *match, void *arg)
>  		return 0;
>  	}
>  
> -	rc = cxl_add_dport(root_port, match, uid, ctx.chbcr);
> +	rc = cxl_add_dport(root_port, match, uid, ctx.chbcr, false);
>  	if (rc) {
>  		dev_err(host, "failed to add downstream port: %s\n",
>  			dev_name(match));
> diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> index 2ad38167796d..9e0d7d5d9298 100644
> --- a/drivers/cxl/core/bus.c
> +++ b/drivers/cxl/core/bus.c
> @@ -26,6 +26,8 @@
>  
>  static DEFINE_IDA(cxl_port_ida);
>  static DECLARE_RWSEM(topology_host_sem);
> +static LIST_HEAD(cxl_root_ports);
> +static DECLARE_RWSEM(root_port_sem);
>  
>  static struct device *cxl_topology_host;
>  
> @@ -326,12 +328,31 @@ struct cxl_port *to_cxl_port(struct device *dev)
>  	return container_of(dev, struct cxl_port, dev);
>  }
>  
> +struct cxl_dport *cxl_get_root_dport(struct device *dev)
> +{
> +	struct cxl_dport *ret = NULL;
> +	struct cxl_dport *dport;
> +
> +	down_read(&root_port_sem);
> +	list_for_each_entry(dport, &cxl_root_ports, root_port_link) {
> +		if (dport->dport == dev) {
> +			ret = dport;
> +			break;
> +		}
> +	}
> +
> +	up_read(&root_port_sem);
> +	return ret;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_get_root_dport, CXL);
> +
>  static void unregister_port(void *_port)
>  {
>  	struct cxl_port *port = _port;
>  	struct cxl_dport *dport;
>  
>  	device_lock(&port->dev);
> +	down_read(&root_port_sem);
>  	list_for_each_entry(dport, &port->dports, list) {
>  		char link_name[CXL_TARGET_STRLEN];
>  
> @@ -339,7 +360,10 @@ static void unregister_port(void *_port)
>  			     dport->port_id) >= CXL_TARGET_STRLEN)
>  			continue;
>  		sysfs_remove_link(&port->dev.kobj, link_name);
> +
> +		list_del_init(&dport->root_port_link);
>  	}
> +	up_read(&root_port_sem);
>  	device_unlock(&port->dev);
>  	device_unregister(&port->dev);
>  }
> @@ -493,12 +517,13 @@ static int add_dport(struct cxl_port *port, struct cxl_dport *new)
>   * @dport_dev: firmware or PCI device representing the dport
>   * @port_id: identifier for this dport in a decoder's target list
>   * @component_reg_phys: optional location of CXL component registers
> + * @root_port: is this a root port (hostbridge downstream)
>   *
>   * Note that all allocations and links are undone by cxl_port deletion
>   * and release.
>   */
>  int cxl_add_dport(struct cxl_port *port, struct device *dport_dev, int port_id,
> -		  resource_size_t component_reg_phys)
> +		  resource_size_t component_reg_phys, bool root_port)
>  {
>  	char link_name[CXL_TARGET_STRLEN];
>  	struct cxl_dport *dport;
> @@ -513,6 +538,7 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport_dev, int port_id,
>  		return -ENOMEM;
>  
>  	INIT_LIST_HEAD(&dport->list);
> +	INIT_LIST_HEAD(&dport->root_port_link);
>  	dport->dport = get_device(dport_dev);
>  	dport->port_id = port_id;
>  	dport->component_reg_phys = component_reg_phys;
> @@ -526,6 +552,12 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport_dev, int port_id,
>  	if (rc)
>  		goto err;
>  
> +	if (root_port) {
> +		down_write(&root_port_sem);
> +		list_add_tail(&dport->root_port_link, &cxl_root_ports);
> +		up_write(&root_port_sem);
> +	}
> +
>  	return 0;
>  err:
>  	cxl_dport_release(dport);
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 6fac4826d22b..3962a5e6a950 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -272,6 +272,7 @@ struct cxl_port {
>   * @component_reg_phys: downstream port component registers
>   * @port: reference to cxl_port that contains this downstream port
>   * @list: node for a cxl_port's list of cxl_dport instances
> + * @root_port_link: node for global list of root ports
>   */
>  struct cxl_dport {
>  	struct device *dport;
> @@ -279,6 +280,7 @@ struct cxl_dport {
>  	resource_size_t component_reg_phys;
>  	struct cxl_port *port;
>  	struct list_head list;
> +	struct list_head root_port_link;
>  };
>  
>  struct cxl_port *to_cxl_port(struct device *dev);
> @@ -287,7 +289,8 @@ struct cxl_port *devm_cxl_add_port(struct device *uport,
>  				   struct cxl_port *parent_port);
>  
>  int cxl_add_dport(struct cxl_port *port, struct device *dport, int port_id,
> -		  resource_size_t component_reg_phys);
> +		  resource_size_t component_reg_phys, bool root_port);
> +struct cxl_dport *cxl_get_root_dport(struct device *dev);
>  
>  struct cxl_decoder *to_cxl_decoder(struct device *dev);
>  bool is_root_decoder(struct device *dev);
> diff --git a/tools/testing/cxl/mock_acpi.c b/tools/testing/cxl/mock_acpi.c
> index 4c8a493ace56..ddefc4345f36 100644
> --- a/tools/testing/cxl/mock_acpi.c
> +++ b/tools/testing/cxl/mock_acpi.c
> @@ -57,7 +57,7 @@ static int match_add_root_port(struct pci_dev *pdev, void *data)
>  
>  	/* TODO walk DVSEC to find component register base */
>  	port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
> -	rc = cxl_add_dport(port, &pdev->dev, port_num, CXL_RESOURCE_NONE);
> +	rc = cxl_add_dport(port, &pdev->dev, port_num, CXL_RESOURCE_NONE, true);
>  	if (rc) {
>  		dev_err(dev, "failed to add dport: %s (%d)\n",
>  			dev_name(&pdev->dev), rc);
> @@ -78,7 +78,7 @@ static int mock_add_root_port(struct platform_device *pdev, void *data)
>  	struct device *dev = ctx->dev;
>  	int rc;
>  
> -	rc = cxl_add_dport(port, &pdev->dev, pdev->id, CXL_RESOURCE_NONE);
> +	rc = cxl_add_dport(port, &pdev->dev, pdev->id, CXL_RESOURCE_NONE, true);
>  	if (rc) {
>  		dev_err(dev, "failed to add dport: %s (%d)\n",
>  			dev_name(&pdev->dev), rc);



* Re: [PATCH 23/23] cxl/mem: Disable switch hierarchies for now
  2021-11-22 18:19   ` Jonathan Cameron
@ 2021-11-22 19:17     ` Ben Widawsky
  0 siblings, 0 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-22 19:17 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On 21-11-22 18:19:01, Jonathan Cameron wrote:
> On Fri, 19 Nov 2021 16:02:50 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > Switches aren't supported by the region driver yet. If a device finds
> > itself under a switch it will not bind a driver so that it cannot be
> > used later for region creation/configuration.
> 
> What's the reasoning in having this in this patch set rather than the region one?
> 
> I was rather hoping you'd have it working when the region set is ready :)
> 
> Jonathan
> 

I'm uncomfortable enabling it until I actually have an environment to test it
in. If Dan et al. don't mind however, I can drop this patch.

> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > ---
> >  drivers/cxl/mem.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
> > index e954144af4b8..997898e78d63 100644
> > --- a/drivers/cxl/mem.c
> > +++ b/drivers/cxl/mem.c
> > @@ -155,6 +155,11 @@ static int cxl_mem_probe(struct device *dev)
> >  		goto out;
> >  	}
> >  
> > +	/* FIXME: Add true switch support */
> > +	dev_err(dev, "Devices behind switches are currently unsupported\n");
> > +	rc = -ENODEV;
> > +	goto err_out;
> > +
> >  	/* Walk down from the root port and add all switches */
> >  	cxl_scan_ports(ctx.root_port);
> >  
> 


* Re: [PATCH 08/23] cxl/acpi: Map component registers for Root Ports
  2021-11-22 15:51   ` Jonathan Cameron
@ 2021-11-22 19:28     ` Ben Widawsky
  0 siblings, 0 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-22 19:28 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On 21-11-22 15:51:47, Jonathan Cameron wrote:
> On Fri, 19 Nov 2021 16:02:35 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > This implements the TODO in cxl_acpi for mapping component registers.
> > cxl_acpi becomes the second consumer of CXL register block enumeration
> > (cxl_pci being the first). Moving the functionality to cxl_core allows
> > both of these drivers to use the functionality. Equally importantly it
> > allows cxl_core to use the functionality in the future.
> > 
> > CXL 2.0 root ports are similar to CXL 2.0 Downstream Ports with the main
> > distinction being they're a part of the CXL 2.0 host bridge. While
> > mapping their component registers is not immediately useful for the CXL
> > drivers, the movement of register block enumeration into core is a vital
> > step towards HDM decoder programming.
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> 
> A few minor comments below.
> 
> Jonathan
> 
> > 
> > ---
> > Changes since RFCv2:
> > - Squash commits together (Dan)
> > - Reword commit message to account for above.
> > ---
> >  drivers/cxl/acpi.c      | 10 ++++++--
> >  drivers/cxl/core/regs.c | 54 +++++++++++++++++++++++++++++++++++++++++
> >  drivers/cxl/cxl.h       |  4 +++
> >  drivers/cxl/pci.c       | 52 ---------------------------------------
> >  drivers/cxl/pci.h       |  4 +++
> >  5 files changed, 70 insertions(+), 54 deletions(-)
> > 
> > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> > index 3163167ecc3a..7cfa8b568013 100644
> > --- a/drivers/cxl/acpi.c
> > +++ b/drivers/cxl/acpi.c
> > @@ -7,6 +7,7 @@
> >  #include <linux/acpi.h>
> >  #include <linux/pci.h>
> >  #include "cxl.h"
> > +#include "pci.h"
> >  
> >  /* Encode defined in CXL 2.0 8.2.5.12.7 HDM Decoder Control Register */
> >  #define CFMWS_INTERLEAVE_WAYS(x)	(1 << (x)->interleave_ways)
> > @@ -134,11 +135,13 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
> >  
> >  __mock int match_add_root_ports(struct pci_dev *pdev, void *data)
> >  {
> > +	resource_size_t creg = CXL_RESOURCE_NONE;
> >  	struct cxl_walk_context *ctx = data;
> >  	struct pci_bus *root_bus = ctx->root;
> >  	struct cxl_port *port = ctx->port;
> >  	int type = pci_pcie_type(pdev);
> >  	struct device *dev = ctx->dev;
> > +	struct cxl_register_map map;
> >  	u32 lnkcap, port_num;
> >  	int rc;
> >  
> > @@ -152,9 +155,12 @@ __mock int match_add_root_ports(struct pci_dev *pdev, void *data)
> >  				  &lnkcap) != PCIBIOS_SUCCESSFUL)
> >  		return 0;
> >  
> > -	/* TODO walk DVSEC to find component register base */
> > +	rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
> 
> Perhaps a comment to explain why this is optional?
> 

Got it. I've also added a dev_info if the regblock isn't found because at some
point in the future it might be confusing should we ever want to use those
registers.

> > +	if (!rc)
> > +		creg = cxl_reg_block(pdev, &map);
> > +
> >  	port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
> > -	rc = cxl_add_dport(port, &pdev->dev, port_num, CXL_RESOURCE_NONE);
> > +	rc = cxl_add_dport(port, &pdev->dev, port_num, creg);
> >  	if (rc) {
> >  		ctx->error = rc;
> >  		return rc;
> > diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
> > index e37e23bf4355..41a0245867ea 100644
> > --- a/drivers/cxl/core/regs.c
> > +++ b/drivers/cxl/core/regs.c
> > @@ -5,6 +5,7 @@
> >  #include <linux/slab.h>
> >  #include <linux/pci.h>
> >  #include <cxlmem.h>
> > +#include <pci.h>
> >  
> >  /**
> >   * DOC: cxl registers
> > @@ -247,3 +248,56 @@ int cxl_map_device_regs(struct pci_dev *pdev,
> >  	return 0;
> >  }
> >  EXPORT_SYMBOL_NS_GPL(cxl_map_device_regs, CXL);
> > +
> > +static void cxl_decode_regblock(u32 reg_lo, u32 reg_hi,
> > +				struct cxl_register_map *map)
> > +{
> > +	map->block_offset = ((u64)reg_hi << 32) |
> > +			    (reg_lo & CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK);
> > +	map->barno = FIELD_GET(CXL_DVSEC_REG_LOCATOR_BIR_MASK, reg_lo);
> > +	map->reg_type = FIELD_GET(CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK, reg_lo);
> > +}
> > +
> > +/**
> > + * cxl_find_regblock() - Locate register blocks by type
> > + * @pdev: The CXL PCI device to enumerate.
> > + * @type: Register Block Indicator id
> > + * @map: Enumeration output, clobbered on error
> > + *
> > + * Return: 0 if register block enumerated, negative error code otherwise
> > + *
> > + * A CXL DVSEC may additional point one or more register blocks, search
> 
> Why additional?  I'm not sure what this means.
> 
> point to one or more additional register blocks perhaps?
> 

Was a typo from how I rebased. Good catch.

> > + * for them by @type.
> > + */
> > +int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
> > +		      struct cxl_register_map *map)
> > +{
> > +	u32 regloc_size, regblocks;
> > +	int regloc, i;
> > +
> > +	regloc = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
> > +					   CXL_DVSEC_REG_LOCATOR);
> > +	if (!regloc)
> > +		return -ENXIO;
> > +
> > +	pci_read_config_dword(pdev, regloc + PCI_DVSEC_HEADER1, &regloc_size);
> > +	regloc_size = FIELD_GET(PCI_DVSEC_HEADER1_LENGTH_MASK, regloc_size);
> > +
> > +	regloc += CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET;
> > +	regblocks = (regloc_size - CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET) / 8;
> > +
> > +	for (i = 0; i < regblocks; i++, regloc += 8) {
> > +		u32 reg_lo, reg_hi;
> > +
> > +		pci_read_config_dword(pdev, regloc, &reg_lo);
> > +		pci_read_config_dword(pdev, regloc + 4, &reg_hi);
> > +
> > +		cxl_decode_regblock(reg_lo, reg_hi, map);
> > +
> > +		if (map->reg_type == type)
> > +			return 0;
> > +	}
> > +
> > +	return -ENODEV;
> > +}
> > +EXPORT_SYMBOL_NS_GPL(cxl_find_regblock, CXL);
> > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > index ab4596f0b751..7150a9694f66 100644
> > --- a/drivers/cxl/cxl.h
> > +++ b/drivers/cxl/cxl.h
> > @@ -145,6 +145,10 @@ int cxl_map_device_regs(struct pci_dev *pdev,
> >  			struct cxl_device_regs *regs,
> >  			struct cxl_register_map *map);
> >  
> > +enum cxl_regloc_type;
> > +int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
> > +		      struct cxl_register_map *map);
> > +
> >  #define CXL_RESOURCE_NONE ((resource_size_t) -1)
> >  #define CXL_TARGET_STRLEN 20
> >  
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index 711bf4514480..d2c743a31b0c 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -433,58 +433,6 @@ static int cxl_map_regs(struct cxl_dev_state *cxlds, struct cxl_register_map *ma
> >  	return 0;
> >  }
> >  
> > -static void cxl_decode_regblock(u32 reg_lo, u32 reg_hi,
> > -				struct cxl_register_map *map)
> > -{
> > -	map->block_offset = ((u64)reg_hi << 32) |
> > -			    (reg_lo & CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK);
> > -	map->barno = FIELD_GET(CXL_DVSEC_REG_LOCATOR_BIR_MASK, reg_lo);
> > -	map->reg_type = FIELD_GET(CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK, reg_lo);
> > -}
> > -
> > -/**
> > - * cxl_find_regblock() - Locate register blocks by type
> > - * @pdev: The CXL PCI device to enumerate.
> > - * @type: Register Block Indicator id
> > - * @map: Enumeration output, clobbered on error
> > - *
> > - * Return: 0 if register block enumerated, negative error code otherwise
> > - *
> > - * A CXL DVSEC may point to one or more register blocks, search for them
> > - * by @type.
> > - */
> > -static int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
> > -			     struct cxl_register_map *map)
> > -{
> > -	u32 regloc_size, regblocks;
> > -	int regloc, i;
> > -
> > -	regloc = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
> > -					   CXL_DVSEC_REG_LOCATOR);
> > -	if (!regloc)
> > -		return -ENXIO;
> > -
> > -	pci_read_config_dword(pdev, regloc + PCI_DVSEC_HEADER1, &regloc_size);
> > -	regloc_size = FIELD_GET(PCI_DVSEC_HEADER1_LENGTH_MASK, regloc_size);
> > -
> > -	regloc += CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET;
> > -	regblocks = (regloc_size - CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET) / 8;
> > -
> > -	for (i = 0; i < regblocks; i++, regloc += 8) {
> > -		u32 reg_lo, reg_hi;
> > -
> > -		pci_read_config_dword(pdev, regloc, &reg_lo);
> > -		pci_read_config_dword(pdev, regloc + 4, &reg_hi);
> > -
> > -		cxl_decode_regblock(reg_lo, reg_hi, map);
> > -
> > -		if (map->reg_type == type)
> > -			return 0;
> > -	}
> > -
> > -	return -ENODEV;
> > -}
> > -
> >  static int cxl_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
> >  			  struct cxl_register_map *map)
> >  {
> > diff --git a/drivers/cxl/pci.h b/drivers/cxl/pci.h
> > index 8ae2b4adc59d..a4b506bb37d1 100644
> > --- a/drivers/cxl/pci.h
> > +++ b/drivers/cxl/pci.h
> > @@ -47,4 +47,8 @@ enum cxl_regloc_type {
> >  	CXL_REGLOC_RBI_TYPES
> >  };
> >  
> > +#define cxl_reg_block(pdev, map)                                               \
> > +	((resource_size_t)(pci_resource_start(pdev, (map)->barno) +            \
> > +			   (map)->block_offset))
> > +
> >  #endif /* __CXL_PCI_H__ */
> 
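
For readers following the register locator walk that this patch moves into cxl_core, the decode step can be sketched in user space as below. This is a minimal sketch, not the kernel code: it assumes the low dword of each Register Locator entry carries the BAR indicator in bits 2:0, the register block identifier in bits 15:8 and the low offset bits in bits 31:16, and the names regloc_entry, regblock, find_regblock() and reg_block_phys() are hypothetical stand-ins that operate on an in-memory array instead of PCI config space.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed field layout of the low dword of a Register Locator entry. */
#define REGLOC_BIR_MASK      0x00000007u /* bits 2:0:   BAR indicator   */
#define REGLOC_BLOCK_ID_MASK 0x0000ff00u /* bits 15:8:  register block id */
#define REGLOC_OFF_LOW_MASK  0xffff0000u /* bits 31:16: offset, low part */

/* Hypothetical in-memory copy of one 8-byte locator entry. */
struct regloc_entry {
	uint32_t lo;
	uint32_t hi;
};

/* Hypothetical counterpart of the decoded map in the patch. */
struct regblock {
	uint64_t block_offset;
	uint8_t barno;
	uint8_t reg_type;
};

/* Split one entry into offset, BAR number and block type. */
void decode_regblock(uint32_t reg_lo, uint32_t reg_hi, struct regblock *map)
{
	map->block_offset = ((uint64_t)reg_hi << 32) |
			    (reg_lo & REGLOC_OFF_LOW_MASK);
	map->barno = (uint8_t)(reg_lo & REGLOC_BIR_MASK);
	map->reg_type = (uint8_t)((reg_lo & REGLOC_BLOCK_ID_MASK) >> 8);
}

/*
 * Return 0 and leave the match in @map, or -ENODEV if no entry has the
 * requested type. As in the patch, @map is clobbered on failure.
 */
int find_regblock(const struct regloc_entry *entries, size_t n,
		  uint8_t type, struct regblock *map)
{
	for (size_t i = 0; i < n; i++) {
		decode_regblock(entries[i].lo, entries[i].hi, map);
		if (map->reg_type == type)
			return 0;
	}
	return -ENODEV;
}

/* Physical address of the block: start of the BAR plus the decoded offset. */
uint64_t reg_block_phys(uint64_t bar_start, const struct regblock *map)
{
	return bar_start + map->block_offset;
}
```

A caller that treats the block as optional, as match_add_root_ports() does, would start from a "no resource" value and only overwrite it when find_regblock() returns 0.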


* Re: [PATCH 12/23] cxl: Introduce endpoint decoders
  2021-11-22 16:20   ` Jonathan Cameron
@ 2021-11-22 19:37     ` Ben Widawsky
  2021-11-25  0:07       ` Dan Williams
  0 siblings, 1 reply; 133+ messages in thread
From: Ben Widawsky @ 2021-11-22 19:37 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On 21-11-22 16:20:39, Jonathan Cameron wrote:
> On Fri, 19 Nov 2021 16:02:39 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > Endpoints have decoders too. It is useful to share the same
> > infrastructure from cxl_core. Endpoints do not have dports (downstream
> > targets), only the underlying physical medium. As a result, some special
> > casing is needed.
> > 
> > There is no functional change introduced yet as endpoints don't actually
> > enumerate decoders yet.
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> 
> I'm not a fan of special values like using 0 here to indicate endpoint
> device.  I'd rather see a base cxl_decoder_alloc(..., bool ep)
> and possibly wrappers for the non ep case and ep one.
> 
> Jonathan
> 

My inclination is the opposite. However, I think you and Dan both brought up
something to this effect in the previous RFCs.

Dan, do you have a preference here?

> > ---
> >  drivers/cxl/core/bus.c | 41 +++++++++++++++++++++++++++++++++--------
> >  1 file changed, 33 insertions(+), 8 deletions(-)
> > 
> > diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> > index 1ee12a60f3f4..16b15f54fb62 100644
> > --- a/drivers/cxl/core/bus.c
> > +++ b/drivers/cxl/core/bus.c
> > @@ -187,6 +187,12 @@ static const struct attribute_group *cxl_decoder_switch_attribute_groups[] = {
> >  	NULL,
> >  };
> >  
> > +static const struct attribute_group *cxl_decoder_endpoint_attribute_groups[] = {
> > +	&cxl_decoder_base_attribute_group,
> > +	&cxl_base_attribute_group,
> > +	NULL,
> > +};
> > +
> >  static void cxl_decoder_release(struct device *dev)
> >  {
> >  	struct cxl_decoder *cxld = to_cxl_decoder(dev);
> > @@ -196,6 +202,12 @@ static void cxl_decoder_release(struct device *dev)
> >  	kfree(cxld);
> >  }
> >  
> > +static const struct device_type cxl_decoder_endpoint_type = {
> > +	.name = "cxl_decoder_endpoint",
> > +	.release = cxl_decoder_release,
> > +	.groups = cxl_decoder_endpoint_attribute_groups,
> > +};
> > +
> >  static const struct device_type cxl_decoder_switch_type = {
> >  	.name = "cxl_decoder_switch",
> >  	.release = cxl_decoder_release,
> > @@ -208,6 +220,11 @@ static const struct device_type cxl_decoder_root_type = {
> >  	.groups = cxl_decoder_root_attribute_groups,
> >  };
> >  
> > +static bool is_endpoint_decoder(struct device *dev)
> > +{
> > +	return dev->type == &cxl_decoder_endpoint_type;
> > +}
> > +
> >  bool is_root_decoder(struct device *dev)
> >  {
> >  	return dev->type == &cxl_decoder_root_type;
> > @@ -499,7 +516,9 @@ static int decoder_populate_targets(struct cxl_decoder *cxld,
> >   * cxl_decoder_alloc - Allocate a new CXL decoder
> >   * @port: owning port of this decoder
> >   * @nr_targets: downstream targets accessible by this decoder. All upstream
> > - *		ports and root ports must have at least 1 target.
> > + *		ports and root ports must have at least 1 target. Endpoint
> > + *		devices will have 0 targets. Callers wishing to register an
> > + *		endpoint device should specify 0.
> >   *
> >   * A port should contain one or more decoders. Each of those decoders enable
> >   * some address space for CXL.mem utilization. A decoder is expected to be
> > @@ -516,7 +535,7 @@ struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
> >  	struct device *dev;
> >  	int rc = 0;
> >  
> > -	if (nr_targets > CXL_DECODER_MAX_INTERLEAVE || nr_targets == 0)
> > +	if (nr_targets > CXL_DECODER_MAX_INTERLEAVE)
> >  		return ERR_PTR(-EINVAL);
> >  
> >  	cxld = kzalloc(struct_size(cxld, target, nr_targets), GFP_KERNEL);
> > @@ -535,8 +554,11 @@ struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
> >  	dev->parent = &port->dev;
> >  	dev->bus = &cxl_bus_type;
> >  
> > +	/* Endpoints don't have a target list */
> > +	if (nr_targets == 0)
> > +		dev->type = &cxl_decoder_endpoint_type;
> >  	/* root ports do not have a cxl_port_type parent */
> > -	if (port->dev.parent->type == &cxl_port_type)
> > +	else if (port->dev.parent->type == &cxl_port_type)
> >  		dev->type = &cxl_decoder_switch_type;
> >  	else
> >  		dev->type = &cxl_decoder_root_type;
> > @@ -579,12 +601,15 @@ int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
> >  	if (cxld->interleave_ways < 1)
> >  		return -EINVAL;
> >  
> > -	port = to_cxl_port(cxld->dev.parent);
> > -	rc = decoder_populate_targets(cxld, port, target_map);
> > -	if (rc)
> > -		return rc;
> > -
> >  	dev = &cxld->dev;
> > +
> > +	port = to_cxl_port(cxld->dev.parent);
> > +	if (!is_endpoint_decoder(dev)) {
> > +		rc = decoder_populate_targets(cxld, port, target_map);
> > +		if (rc)
> > +			return rc;
> > +	}
> > +
> >  	rc = dev_set_name(dev, "decoder%d.%d", port->id, cxld->id);
> >  	if (rc)
> >  		return rc;
> 
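
The alternative Jonathan describes above, an explicit flag on a common allocator plus thin wrappers, might look roughly like the following user-space sketch. All names here (decoder_alloc_common(), switch_decoder_alloc(), endpoint_decoder_alloc(), MAX_INTERLEAVE) are hypothetical and do not appear in this series; the sketch only shows how the "0 targets means endpoint" convention could be hidden from callers.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Illustrative limit, standing in for CXL_DECODER_MAX_INTERLEAVE. */
#define MAX_INTERLEAVE 16

enum decoder_kind { DECODER_SWITCH, DECODER_ENDPOINT };

/* Hypothetical, heavily reduced decoder object. */
struct decoder {
	enum decoder_kind kind;
	unsigned int nr_targets;
	int target[]; /* one slot per downstream target */
};

/* Common allocator: the endpoint case is stated explicitly via @ep. */
struct decoder *decoder_alloc_common(unsigned int nr_targets, bool ep)
{
	struct decoder *d;

	/* Endpoints have no target list; everything else needs 1..MAX. */
	if (ep && nr_targets != 0)
		return NULL;
	if (!ep && (nr_targets == 0 || nr_targets > MAX_INTERLEAVE))
		return NULL;

	d = calloc(1, sizeof(*d) + nr_targets * sizeof(d->target[0]));
	if (!d)
		return NULL;

	d->kind = ep ? DECODER_ENDPOINT : DECODER_SWITCH;
	d->nr_targets = nr_targets;
	return d;
}

/* Wrapper for ports that route to downstream targets. */
struct decoder *switch_decoder_alloc(unsigned int nr_targets)
{
	return decoder_alloc_common(nr_targets, false);
}

/* Wrapper for endpoints: callers never pass a magic 0. */
struct decoder *endpoint_decoder_alloc(void)
{
	return decoder_alloc_common(0, true);
}
```

With this shape the nr_targets == 0 check stays an error for non-endpoint callers, which is the property the patch as posted gives up.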


* Re: [PATCH 13/23] cxl/core: Move target population locking to caller
  2021-11-22 16:33   ` Jonathan Cameron
@ 2021-11-22 21:58     ` Ben Widawsky
  2021-11-23 11:05       ` Jonathan Cameron
  0 siblings, 1 reply; 133+ messages in thread
From: Ben Widawsky @ 2021-11-22 21:58 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On 21-11-22 16:33:02, Jonathan Cameron wrote:
> On Fri, 19 Nov 2021 16:02:40 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > In preparation for a port driver that enumerates a descendant port +
> > decoder hierarchy, arrange for an unlocked version of cxl_decoder_add().
> > Otherwise a port-driver that adds a child decoder will deadlock on the
> > device_lock() in ->probe().
> > 
> 
> I think this description should call out that the lock was originally taken
> for a much shorter time in decoder_populate_targets() but is moved
> up one layer.

Sounds good.

> 
> One other query inline.  Seems like the WARN_ON stuff is a bit
> over paranoid given what's visible in this patch.  If there is a
> good reason for that, then add something to the patch description to
> justify it.
>  
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > 
> > ---
> > 
> > Changes since RFCv2:
> > - Reword commit message (Dan)
> > - Move decoder API changes into this patch (Dan)
> > ---
> >  drivers/cxl/core/bus.c | 59 +++++++++++++++++++++++++++++++-----------
> >  drivers/cxl/cxl.h      |  1 +
> >  2 files changed, 45 insertions(+), 15 deletions(-)
> > 
> > diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> > index 16b15f54fb62..cd6fe7823c69 100644
> > --- a/drivers/cxl/core/bus.c
> > +++ b/drivers/cxl/core/bus.c
> > @@ -487,28 +487,22 @@ static int decoder_populate_targets(struct cxl_decoder *cxld,
> >  {
> >  	int rc = 0, i;
> >  
> > +	device_lock_assert(&port->dev);
> > +
> >  	if (!target_map)
> >  		return 0;
> >  
> > -	device_lock(&port->dev);
> > -	if (list_empty(&port->dports)) {
> > -		rc = -EINVAL;
> > -		goto out_unlock;
> > -	}
> > +	if (list_empty(&port->dports))
> > +		return -EINVAL;
> >  
> >  	for (i = 0; i < cxld->nr_targets; i++) {
> >  		struct cxl_dport *dport = find_dport(port, target_map[i]);
> >  
> > -		if (!dport) {
> > -			rc = -ENXIO;
> > -			goto out_unlock;
> > -		}
> > +		if (!dport)
> > +			return -ENXIO;
> >  		cxld->target[i] = dport;
> >  	}
> >  
> > -out_unlock:
> > -	device_unlock(&port->dev);
> > -
> >  	return rc;
> >  }
> >  
> > @@ -571,7 +565,7 @@ struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
> >  EXPORT_SYMBOL_NS_GPL(cxl_decoder_alloc, CXL);
> >  
> >  /**
> > - * cxl_decoder_add - Add a decoder with targets
> > + * cxl_decoder_add_locked - Add a decoder with targets
> >   * @cxld: The cxl decoder allocated by cxl_decoder_alloc()
> >   * @target_map: A list of downstream ports that this decoder can direct memory
> >   *              traffic to. These numbers should correspond with the port number
> > @@ -581,12 +575,14 @@ EXPORT_SYMBOL_NS_GPL(cxl_decoder_alloc, CXL);
> >   * is an endpoint device. A more awkward example is a hostbridge whose root
> >   * ports get hot added (technically possible, though unlikely).
> >   *
> > - * Context: Process context. Takes and releases the cxld's device lock.
> > + * This is the locked variant of cxl_decoder_add().
> > + *
> > + * Context: Process context. Expects the cxld's device lock to be held.
> >   *
> >   * Return: Negative error code if the decoder wasn't properly configured; else
> >   *	   returns 0.
> >   */
> > -int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
> > +int cxl_decoder_add_locked(struct cxl_decoder *cxld, int *target_map)
> >  {
> >  	struct cxl_port *port;
> >  	struct device *dev;
> > @@ -619,6 +615,39 @@ int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
> >  
> >  	return device_add(dev);
> >  }
> > +EXPORT_SYMBOL_NS_GPL(cxl_decoder_add_locked, CXL);
> > +
> > +/**
> > + * cxl_decoder_add - Add a decoder with targets
> > + * @cxld: The cxl decoder allocated by cxl_decoder_alloc()
> > + * @target_map: A list of downstream ports that this decoder can direct memory
> > + *              traffic to. These numbers should correspond with the port number
> > + *              in the PCIe Link Capabilities structure.
> > + *
> > + * This is the unlocked variant of cxl_decoder_add_locked().
> > + * See cxl_decoder_add_locked().
> > + *
> > + * Context: Process context. Takes and releases the cxld's device lock.
> > + */
> > +int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
> > +{
> > +	struct cxl_port *port;
> > +	int rc;
> > +
> > +	if (WARN_ON_ONCE(!cxld))
> > +		return -EINVAL;
> 
> Why do we now need these protections but didn't before?

I don't quite understand what you're trying to point out.

Prior to this patch, cxl_decoder_add() checks:
- !cxld
- IS_ERR(cxld)
- cxld->interleave_ways != 0

After this patch, cxl_decoder_add() checks:
- !cxld
- IS_ERR(cxld)
- (and then calls cxl_decoder_add_locked())

And cxl_decoder_add_locked() checks:
- !cxld
- IS_ERR(cxld)
- cxld->interleave_ways != 0

Ultimately we want to check all 3, and since cxl_decoder_add() calls
cxl_decoder_add_locked(), we're good there. The problem is that to get from a
cxld to a port you need a valid cxld, and the API previously allowed !cxld and
IS_ERR(cxld) to be passed in. So the checks are duplicated if you call
cxl_decoder_add(), but other than that I don't see any new protections.

> 
> 
> > +
> > +	if (WARN_ON_ONCE(IS_ERR(cxld)))
> > +		return PTR_ERR(cxld);
> > +
> > +	port = to_cxl_port(cxld->dev.parent);
> > +
> > +	device_lock(&port->dev);
> > +	rc = cxl_decoder_add_locked(cxld, target_map);
> > +	device_unlock(&port->dev);
> > +
> > +	return rc;
> > +}
> >  EXPORT_SYMBOL_NS_GPL(cxl_decoder_add, CXL);
> >  
> >  static void cxld_unregister(void *dev)
> > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > index b66ed8f241c6..2c5627fa8a34 100644
> > --- a/drivers/cxl/cxl.h
> > +++ b/drivers/cxl/cxl.h
> > @@ -290,6 +290,7 @@ struct cxl_decoder *to_cxl_decoder(struct device *dev);
> >  bool is_root_decoder(struct device *dev);
> >  struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
> >  				      unsigned int nr_targets);
> > +int cxl_decoder_add_locked(struct cxl_decoder *cxld, int *target_map);
> >  int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map);
> >  int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld);
> >  
> 
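
The split discussed in this message, a variant that expects the lock to be held plus a wrapper that takes it, can be sketched in user space as follows. This is only a model under stated assumptions: the lock is represented by a plain flag rather than a real mutex, -EPERM stands in for what device_lock_assert() would flag, and every name (port, decoder, decoder_add_locked(), decoder_add()) is hypothetical. It illustrates why the wrapper has to validate its argument before it can even find the port to lock.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical port whose lock is modelled as a flag, not a real mutex. */
struct port {
	bool locked;
};

/* Hypothetical, heavily reduced decoder. */
struct decoder {
	struct port *parent;
	int interleave_ways;
	bool added;
};

void port_lock(struct port *p)   { p->locked = true; }
void port_unlock(struct port *p) { p->locked = false; }

/* Caller must already hold the parent port's lock. */
int decoder_add_locked(struct decoder *d)
{
	if (!d)
		return -EINVAL;
	/* Stand-in for a lock assertion: refuse to run unlocked. */
	if (!d->parent->locked)
		return -EPERM;
	if (d->interleave_ways < 1)
		return -EINVAL;

	d->added = true;
	return 0;
}

/* Takes and releases the parent port's lock around the locked variant. */
int decoder_add(struct decoder *d)
{
	int rc;

	/* d must be valid before d->parent can be dereferenced to lock it. */
	if (!d)
		return -EINVAL;

	port_lock(d->parent);
	rc = decoder_add_locked(d);
	port_unlock(d->parent);
	return rc;
}
```

The NULL check therefore appears twice on the decoder_add() path, which matches the duplication described above; the locked variant keeps its own check because it is also exported for direct callers.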


* Re: [PATCH 14/23] cxl: Introduce topology host registration
  2021-11-22 18:20   ` Jonathan Cameron
@ 2021-11-22 22:30     ` Ben Widawsky
  0 siblings, 0 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-22 22:30 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On 21-11-22 18:20:15, Jonathan Cameron wrote:
> On Fri, 19 Nov 2021 16:02:41 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > The description of the CXL topology will be conveyed by a platform
> > specific entity that is expected to be a singleton. For ACPI based
> > systems, this is ACPI0017. When the topology host goes away, which as of
> > now can only be triggered by module unload, it is desirable to have the
> > entire topology cleaned up. Regular devm unwinding handles most
> > situations already, but what's missing is the ability to teardown the
> > root port. Since the root port is platform specific, the core needs a
> > set of APIs to allow platform specific drivers to register their root
> > ports. With that, all the automatic teardown can occur.
> > 
> > cxl_test makes for an interesting case. cxl_test creates an alternate
> > universe where there are possibly two root topology hosts (a real
> > ACPI0016, and a fake ACPI0016). For this to work in the future, cxl_acpi
> > (or some future platform host driver) will need to be unloaded first.
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> This is a little unusual looking but having followed through how it is used
> it seems like a sensible approach to me.
> 
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 

Thanks. I noticed another commit message bug, s/ACPI0016/ACPI0017\/CEDT above.

> > ---
> > The topology lock can be used for more things. I'd like to save that as
> > an add-on patch since it's extra risk for no reward, at this point.
> > ---
> >  drivers/cxl/acpi.c     | 18 ++++++++++---
> >  drivers/cxl/core/bus.c | 57 +++++++++++++++++++++++++++++++++++++++---
> >  drivers/cxl/cxl.h      |  5 +++-
> >  3 files changed, 73 insertions(+), 7 deletions(-)
> > 
> > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> > index 3415184a2e61..82cc42ab38c6 100644
> > --- a/drivers/cxl/acpi.c
> > +++ b/drivers/cxl/acpi.c
> > @@ -224,8 +224,7 @@ static int add_host_bridge_uport(struct device *match, void *arg)
> >  		return 0;
> >  	}
> >  
> > -	port = devm_cxl_add_port(host, match, dport->component_reg_phys,
> > -				 root_port);
> > +	port = devm_cxl_add_port(match, dport->component_reg_phys, root_port);
> >  	if (IS_ERR(port))
> >  		return PTR_ERR(port);
> >  	dev_dbg(host, "%s: add: %s\n", dev_name(match), dev_name(&port->dev));
> > @@ -376,6 +375,11 @@ static int add_root_nvdimm_bridge(struct device *match, void *data)
> >  	return 1;
> >  }
> >  
> > +static void clear_topology_host(void *data)
> > +{
> > +	cxl_unregister_topology_host(data);
> > +}
> > +
> >  static int cxl_acpi_probe(struct platform_device *pdev)
> >  {
> >  	int rc;
> > @@ -384,7 +388,15 @@ static int cxl_acpi_probe(struct platform_device *pdev)
> >  	struct acpi_device *adev = ACPI_COMPANION(host);
> >  	struct cxl_cfmws_context ctx;
> >  
> > -	root_port = devm_cxl_add_port(host, host, CXL_RESOURCE_NONE, NULL);
> > +	rc = cxl_register_topology_host(host);
> > +	if (rc)
> > +		return rc;
> > +
> > +	rc = devm_add_action_or_reset(host, clear_topology_host, host);
> > +	if (rc)
> > +		return rc;
> > +
> > +	root_port = devm_cxl_add_port(host, CXL_RESOURCE_NONE, root_port);
> >  	if (IS_ERR(root_port))
> >  		return PTR_ERR(root_port);
> >  	dev_dbg(host, "add: %s\n", dev_name(&root_port->dev));
> > diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> > index cd6fe7823c69..2ad38167796d 100644
> > --- a/drivers/cxl/core/bus.c
> > +++ b/drivers/cxl/core/bus.c
> > @@ -25,6 +25,53 @@
> >   */
> >  
> >  static DEFINE_IDA(cxl_port_ida);
> > +static DECLARE_RWSEM(topology_host_sem);
> > +
> > +static struct device *cxl_topology_host;
> > +
> > +int cxl_register_topology_host(struct device *host)
> > +{
> > +	down_write(&topology_host_sem);
> > +	if (cxl_topology_host) {
> > +		up_write(&topology_host_sem);
> > +		pr_warn("%s host currently in use. Please try unloading %s",
> > +			dev_name(cxl_topology_host), host->driver->name);
> > +		return -EBUSY;
> > +	}
> > +
> > +	cxl_topology_host = host;
> > +	up_write(&topology_host_sem);
> > +
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL_NS_GPL(cxl_register_topology_host, CXL);
> > +
> > +void cxl_unregister_topology_host(struct device *host)
> > +{
> > +	down_write(&topology_host_sem);
> > +	if (cxl_topology_host == host)
> > +		cxl_topology_host = NULL;
> > +	else
> > +		pr_warn("topology host in use by %s\n",
> > +			cxl_topology_host->driver->name);
> > +	up_write(&topology_host_sem);
> > +}
> > +EXPORT_SYMBOL_NS_GPL(cxl_unregister_topology_host, CXL);
> > +
> > +static struct device *get_cxl_topology_host(void)
> > +{
> > +	down_read(&topology_host_sem);
> > +	if (cxl_topology_host)
> > +		return cxl_topology_host;
> > +	up_read(&topology_host_sem);
> > +	return NULL;
> > +}
> > +
> > +static void put_cxl_topology_host(struct device *dev)
> > +{
> > +	WARN_ON(dev != cxl_topology_host);
> > +	up_read(&topology_host_sem);
> > +}
> >  
> >  static ssize_t devtype_show(struct device *dev, struct device_attribute *attr,
> >  			    char *buf)
> > @@ -362,17 +409,16 @@ static struct cxl_port *cxl_port_alloc(struct device *uport,
> >  
> >  /**
> >   * devm_cxl_add_port - register a cxl_port in CXL memory decode hierarchy
> > - * @host: host device for devm operations
> >   * @uport: "physical" device implementing this upstream port
> >   * @component_reg_phys: (optional) for configurable cxl_port instances
> >   * @parent_port: next hop up in the CXL memory decode hierarchy
> >   */
> > -struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport,
> > +struct cxl_port *devm_cxl_add_port(struct device *uport,
> >  				   resource_size_t component_reg_phys,
> >  				   struct cxl_port *parent_port)
> >  {
> > +	struct device *dev, *host;
> >  	struct cxl_port *port;
> > -	struct device *dev;
> >  	int rc;
> >  
> >  	port = cxl_port_alloc(uport, component_reg_phys, parent_port);
> > @@ -391,7 +437,12 @@ struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport,
> >  	if (rc)
> >  		goto err;
> >  
> > +	host = get_cxl_topology_host();
> > +	if (!host)
> > +		return ERR_PTR(-ENODEV);
> > +
> >  	rc = devm_add_action_or_reset(host, unregister_port, port);
> > +	put_cxl_topology_host(host);
> >  	if (rc)
> >  		return ERR_PTR(rc);
> >  
> > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > index 2c5627fa8a34..6fac4826d22b 100644
> > --- a/drivers/cxl/cxl.h
> > +++ b/drivers/cxl/cxl.h
> > @@ -152,6 +152,9 @@ int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
> >  #define CXL_RESOURCE_NONE ((resource_size_t) -1)
> >  #define CXL_TARGET_STRLEN 20
> >  
> > +int cxl_register_topology_host(struct device *host);
> > +void cxl_unregister_topology_host(struct device *host);
> > +
> >  /*
> >   * cxl_decoder flags that define the type of memory / devices this
> >   * decoder supports as well as configuration lock status See "CXL 2.0
> > @@ -279,7 +282,7 @@ struct cxl_dport {
> >  };
> >  
> >  struct cxl_port *to_cxl_port(struct device *dev);
> > -struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport,
> > +struct cxl_port *devm_cxl_add_port(struct device *uport,
> >  				   resource_size_t component_reg_phys,
> >  				   struct cxl_port *parent_port);
> >  
> 


* Re: [PATCH 15/23] cxl/core: Store global list of root ports
  2021-11-22 18:22   ` Jonathan Cameron
@ 2021-11-22 22:32     ` Ben Widawsky
  0 siblings, 0 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-22 22:32 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On 21-11-22 18:22:31, Jonathan Cameron wrote:
> On Fri, 19 Nov 2021 16:02:42 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > CXL root ports (the downstream port to a host bridge) are to be
> > enumerated by a platform specific driver. In the case of ACPI compliant
> > systems, this is like the cxl_acpi driver. Root ports are the first
> > CXL spec defined component that can be "found" by that platform specific
> > driver.
> > 
> > By storing a list of these root ports, components in lower levels of the
> > topology (switches and endpoints) have a mechanism to walk up their
> > device hierarchy to find an enumerated root port. This will be necessary
> > for region programming.
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > 
> > ---
> > Dan points out in review this is possible to do without a new global
> > list. While I agree, I was unable to get it working in a reasonable
> > amount of time. Will punt on that for now.
> 
> This has made me curious.  Is this a punt it for v1, or a punt it for longer
> term and maybe revisit later?
> 
> What you have looks like it should work fine to me.
> 

Dan said he was going to take a crack at it. I'll leave it up to him whether he
wants to make it happen before merge. Either way is fine by me.

> 
> > ---
> >  drivers/cxl/acpi.c            |  4 ++--
> >  drivers/cxl/core/bus.c        | 34 +++++++++++++++++++++++++++++++++-
> >  drivers/cxl/cxl.h             |  5 ++++-
> >  tools/testing/cxl/mock_acpi.c |  4 ++--
> >  4 files changed, 41 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> > index 82cc42ab38c6..c12e4fed7941 100644
> > --- a/drivers/cxl/acpi.c
> > +++ b/drivers/cxl/acpi.c
> > @@ -159,7 +159,7 @@ __mock int match_add_root_ports(struct pci_dev *pdev, void *data)
> >  		creg = cxl_reg_block(pdev, &map);
> >  
> >  	port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
> > -	rc = cxl_add_dport(port, &pdev->dev, port_num, creg);
> > +	rc = cxl_add_dport(port, &pdev->dev, port_num, creg, true);
> >  	if (rc) {
> >  		ctx->error = rc;
> >  		return rc;
> > @@ -341,7 +341,7 @@ static int add_host_bridge_dport(struct device *match, void *arg)
> >  		return 0;
> >  	}
> >  
> > -	rc = cxl_add_dport(root_port, match, uid, ctx.chbcr);
> > +	rc = cxl_add_dport(root_port, match, uid, ctx.chbcr, false);
> >  	if (rc) {
> >  		dev_err(host, "failed to add downstream port: %s\n",
> >  			dev_name(match));
> > diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> > index 2ad38167796d..9e0d7d5d9298 100644
> > --- a/drivers/cxl/core/bus.c
> > +++ b/drivers/cxl/core/bus.c
> > @@ -26,6 +26,8 @@
> >  
> >  static DEFINE_IDA(cxl_port_ida);
> >  static DECLARE_RWSEM(topology_host_sem);
> > +static LIST_HEAD(cxl_root_ports);
> > +static DECLARE_RWSEM(root_port_sem);
> >  
> >  static struct device *cxl_topology_host;
> >  
> > @@ -326,12 +328,31 @@ struct cxl_port *to_cxl_port(struct device *dev)
> >  	return container_of(dev, struct cxl_port, dev);
> >  }
> >  
> > +struct cxl_dport *cxl_get_root_dport(struct device *dev)
> > +{
> > +	struct cxl_dport *ret = NULL;
> > +	struct cxl_dport *dport;
> > +
> > +	down_read(&root_port_sem);
> > +	list_for_each_entry(dport, &cxl_root_ports, root_port_link) {
> > +		if (dport->dport == dev) {
> > +			ret = dport;
> > +			break;
> > +		}
> > +	}
> > +
> > +	up_read(&root_port_sem);
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL_NS_GPL(cxl_get_root_dport, CXL);
> > +
> >  static void unregister_port(void *_port)
> >  {
> >  	struct cxl_port *port = _port;
> >  	struct cxl_dport *dport;
> >  
> >  	device_lock(&port->dev);
> > +	down_write(&root_port_sem);
> >  	list_for_each_entry(dport, &port->dports, list) {
> >  		char link_name[CXL_TARGET_STRLEN];
> >  
> > @@ -339,7 +360,10 @@ static void unregister_port(void *_port)
> >  			     dport->port_id) >= CXL_TARGET_STRLEN)
> >  			continue;
> >  		sysfs_remove_link(&port->dev.kobj, link_name);
> > +
> > +		list_del_init(&dport->root_port_link);
> >  	}
> > +	up_write(&root_port_sem);
> >  	device_unlock(&port->dev);
> >  	device_unregister(&port->dev);
> >  }
> > @@ -493,12 +517,13 @@ static int add_dport(struct cxl_port *port, struct cxl_dport *new)
> >   * @dport_dev: firmware or PCI device representing the dport
> >   * @port_id: identifier for this dport in a decoder's target list
> >   * @component_reg_phys: optional location of CXL component registers
> > + * @root_port: is this a root port (hostbridge downstream)
> >   *
> >   * Note that all allocations and links are undone by cxl_port deletion
> >   * and release.
> >   */
> >  int cxl_add_dport(struct cxl_port *port, struct device *dport_dev, int port_id,
> > -		  resource_size_t component_reg_phys)
> > +		  resource_size_t component_reg_phys, bool root_port)
> >  {
> >  	char link_name[CXL_TARGET_STRLEN];
> >  	struct cxl_dport *dport;
> > @@ -513,6 +538,7 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport_dev, int port_id,
> >  		return -ENOMEM;
> >  
> >  	INIT_LIST_HEAD(&dport->list);
> > +	INIT_LIST_HEAD(&dport->root_port_link);
> >  	dport->dport = get_device(dport_dev);
> >  	dport->port_id = port_id;
> >  	dport->component_reg_phys = component_reg_phys;
> > @@ -526,6 +552,12 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport_dev, int port_id,
> >  	if (rc)
> >  		goto err;
> >  
> > +	if (root_port) {
> > +		down_write(&root_port_sem);
> > +		list_add_tail(&dport->root_port_link, &cxl_root_ports);
> > +		up_write(&root_port_sem);
> > +	}
> > +
> >  	return 0;
> >  err:
> >  	cxl_dport_release(dport);
> > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > index 6fac4826d22b..3962a5e6a950 100644
> > --- a/drivers/cxl/cxl.h
> > +++ b/drivers/cxl/cxl.h
> > @@ -272,6 +272,7 @@ struct cxl_port {
> >   * @component_reg_phys: downstream port component registers
> >   * @port: reference to cxl_port that contains this downstream port
> >   * @list: node for a cxl_port's list of cxl_dport instances
> > + * @root_port_link: node for global list of root ports
> >   */
> >  struct cxl_dport {
> >  	struct device *dport;
> > @@ -279,6 +280,7 @@ struct cxl_dport {
> >  	resource_size_t component_reg_phys;
> >  	struct cxl_port *port;
> >  	struct list_head list;
> > +	struct list_head root_port_link;
> >  };
> >  
> >  struct cxl_port *to_cxl_port(struct device *dev);
> > @@ -287,7 +289,8 @@ struct cxl_port *devm_cxl_add_port(struct device *uport,
> >  				   struct cxl_port *parent_port);
> >  
> >  int cxl_add_dport(struct cxl_port *port, struct device *dport, int port_id,
> > -		  resource_size_t component_reg_phys);
> > +		  resource_size_t component_reg_phys, bool root_port);
> > +struct cxl_dport *cxl_get_root_dport(struct device *dev);
> >  
> >  struct cxl_decoder *to_cxl_decoder(struct device *dev);
> >  bool is_root_decoder(struct device *dev);
> > diff --git a/tools/testing/cxl/mock_acpi.c b/tools/testing/cxl/mock_acpi.c
> > index 4c8a493ace56..ddefc4345f36 100644
> > --- a/tools/testing/cxl/mock_acpi.c
> > +++ b/tools/testing/cxl/mock_acpi.c
> > @@ -57,7 +57,7 @@ static int match_add_root_port(struct pci_dev *pdev, void *data)
> >  
> >  	/* TODO walk DVSEC to find component register base */
> >  	port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
> > -	rc = cxl_add_dport(port, &pdev->dev, port_num, CXL_RESOURCE_NONE);
> > +	rc = cxl_add_dport(port, &pdev->dev, port_num, CXL_RESOURCE_NONE, true);
> >  	if (rc) {
> >  		dev_err(dev, "failed to add dport: %s (%d)\n",
> >  			dev_name(&pdev->dev), rc);
> > @@ -78,7 +78,7 @@ static int mock_add_root_port(struct platform_device *pdev, void *data)
> >  	struct device *dev = ctx->dev;
> >  	int rc;
> >  
> > -	rc = cxl_add_dport(port, &pdev->dev, pdev->id, CXL_RESOURCE_NONE);
> > +	rc = cxl_add_dport(port, &pdev->dev, pdev->id, CXL_RESOURCE_NONE, true);
> >  	if (rc) {
> >  		dev_err(dev, "failed to add dport: %s (%d)\n",
> >  			dev_name(&pdev->dev), rc);
> 


* Re: [PATCH 16/23] cxl/pci: Cache device DVSEC offset
  2021-11-22 16:46   ` Jonathan Cameron
@ 2021-11-22 22:34     ` Ben Widawsky
  0 siblings, 0 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-22 22:34 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On 21-11-22 16:46:21, Jonathan Cameron wrote:
> On Fri, 19 Nov 2021 16:02:43 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > The PCIe device DVSEC, defined in the CXL 2.0 spec, 8.1.3 is required to
> > be implemented by CXL 2.0 endpoint devices. Since the information
> > contained within this DVSEC will be critically important for region
> > configuration, it makes sense to find the value early.
> > 
> > Since this DVSEC is not strictly required for mailbox functionality,
> > failure to find this information does not result in the driver failing
> > to bind.
> 
> That feels like a path we are going to forget to test sometime in the
> future.  Given it's a specification requirement, I'd treat it as
> an error and make our lives easier going forwards!
> 
> Otherwise looks good to me.
> 

Agreed. This is silly. Basically nothing will work if the device DVSEC can't be
found. I don't remember what I was thinking...
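
For reference, here is a rough userspace sketch of what "find the device DVSEC
or fail" amounts to. It is not the patch code: the kernel does the walk with
pci_find_dvsec_capability(), while find_dvsec(), probe_sketch(), cfg_read32()
and cfg_write32() below are made-up names operating on a plain byte buffer that
stands in for config space. The constants (DVSEC extended capability ID 0x23,
CXL vendor ID 0x1e98, DVSEC ID 0 for the PCIe device DVSEC) are my reading of
the PCIe and CXL 2.0 specs, so treat them as assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define CFG_SPACE_SIZE		4096
#define EXT_CAP_START		0x100	/* extended capabilities begin here */
#define EXT_CAP_ID_DVSEC	0x23	/* Designated Vendor-Specific ext cap */
#define DVSEC_HEADER1		0x4	/* bits 15:0: DVSEC vendor ID */
#define DVSEC_HEADER2		0x8	/* bits 15:0: DVSEC ID */
#define VENDOR_ID_CXL		0x1e98
#define DVSEC_ID_PCIE_DEVICE	0

/* Little-endian 32-bit read from the fake config space buffer. */
uint32_t cfg_read32(const uint8_t *cfg, unsigned int off)
{
	assert(off + 4 <= CFG_SPACE_SIZE);
	return (uint32_t)cfg[off] | ((uint32_t)cfg[off + 1] << 8) |
	       ((uint32_t)cfg[off + 2] << 16) | ((uint32_t)cfg[off + 3] << 24);
}

/* Little-endian 32-bit write, only needed to build test fixtures. */
void cfg_write32(uint8_t *cfg, unsigned int off, uint32_t val)
{
	assert(off + 4 <= CFG_SPACE_SIZE);
	cfg[off] = val & 0xff;
	cfg[off + 1] = (val >> 8) & 0xff;
	cfg[off + 2] = (val >> 16) & 0xff;
	cfg[off + 3] = (val >> 24) & 0xff;
}

/* Walk the extended capability list; return the DVSEC offset or 0. */
unsigned int find_dvsec(const uint8_t *cfg, uint16_t vendor, uint16_t id)
{
	unsigned int pos = EXT_CAP_START;
	int ttl = (CFG_SPACE_SIZE - EXT_CAP_START) / 8;	/* guard against loops */

	while (pos >= EXT_CAP_START && pos + 12 <= CFG_SPACE_SIZE && ttl-- > 0) {
		uint32_t hdr = cfg_read32(cfg, pos);

		if (hdr == 0 || hdr == 0xffffffff)
			break;
		if ((hdr & 0xffff) == EXT_CAP_ID_DVSEC) {
			uint16_t v = cfg_read32(cfg, pos + DVSEC_HEADER1) & 0xffff;
			uint16_t i = cfg_read32(cfg, pos + DVSEC_HEADER2) & 0xffff;

			if (v == vendor && i == id)
				return pos;
		}
		pos = (hdr >> 20) & 0xffc;	/* next capability pointer */
	}
	return 0;
}

/* Treat a missing device DVSEC as a hard error instead of a warning. */
int probe_sketch(const uint8_t *cfg, unsigned int *dvsec)
{
	unsigned int pos = find_dvsec(cfg, VENDOR_ID_CXL, DVSEC_ID_PCIE_DEVICE);

	if (!pos)
		return -ENODEV;
	*dvsec = pos;
	return 0;
}
```

With the lookup failing hard, the "limited functionality" path disappears and
there is one less configuration to test.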

> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > ---
> >  drivers/cxl/cxlmem.h | 2 ++
> >  drivers/cxl/pci.c    | 7 +++++++
> >  2 files changed, 9 insertions(+)
> > 
> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > index 8d96d009ad90..3ef3c652599e 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> > @@ -98,6 +98,7 @@ struct cxl_mbox_cmd {
> >   *
> >   * @dev: The device associated with this CXL state
> >   * @regs: Parsed register blocks
> > + * @device_dvsec: Offset to the PCIe device DVSEC
> >   * @payload_size: Size of space for payload
> >   *                (CXL 2.0 8.2.8.4.3 Mailbox Capabilities Register)
> >   * @lsa_size: Size of Label Storage Area
> > @@ -125,6 +126,7 @@ struct cxl_dev_state {
> >  	struct device *dev;
> >  
> >  	struct cxl_regs regs;
> > +	int device_dvsec;
> >  
> >  	size_t payload_size;
> >  	size_t lsa_size;
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index d2c743a31b0c..f3872cbee7f8 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -474,6 +474,13 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  	if (IS_ERR(cxlds))
> >  		return PTR_ERR(cxlds);
> >  
> > +	cxlds->device_dvsec = pci_find_dvsec_capability(pdev,
> > +							PCI_DVSEC_VENDOR_ID_CXL,
> > +							CXL_DVSEC_PCIE_DEVICE);
> > +	if (!cxlds->device_dvsec)
> > +		dev_warn(&pdev->dev,
> > +			 "Device DVSEC not present. Expect limited functionality.\n");
> > +
> >  	rc = cxl_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
> >  	if (rc)
> >  		return rc;
> 


* Re: [PATCH 17/23] cxl: Cache and pass DVSEC ranges
  2021-11-22 17:00   ` Jonathan Cameron
@ 2021-11-22 22:50     ` Ben Widawsky
  0 siblings, 0 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-22 22:50 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On 21-11-22 17:00:56, Jonathan Cameron wrote:
> On Fri, 19 Nov 2021 16:02:44 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > CXL 1.1 specification provided a mechanism for mapping an address space
> > of a CXL device. That functionality is known as a "range" and can be
> > programmed through PCIe DVSEC. In addition to this, the specification
> > defines an active bit which a device will expose through the same DVSEC
> > to notify system software that memory is initialized and ready.
> > 
> > While CXL 2.0 introduces a more powerful mechanism called HDM decoders
> > that are controlled by MMIO behind a PCIe BAR, the spec does allow the
> > 1.1 style mapping to still be present. In such a case, when the CXL
> > driver takes over, if it were to enable HDM decoding and there was an
> > actively used range, things would likely blow up, in particular if it
> > wasn't an identical mapping.
> > 
> > This patch caches the relevant information which the cxl_mem driver will
> > need to make the proper decision and passes it along.
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> 
> 0-day spotted issues in same code as me. See below.
> 
> This is another case where I'd treat failure as fatal.  Anything that fails
> is either dead, or non spec compliant so don't bother loading the driver
> if that happens. Fewer paths to test etc...

I disagree about treating failure as fatal. The failure here only forbids using
the memory on the device. I can envision firmware bugs or the like where these
things might fail, but the mailbox interfaces still work perfectly fine, or at
least fine enough to upgrade the firmware and try again.

Dan and I had talked about a modparam (or perhaps sysfs on some higher level,
like CXL bus) to control the length of the timeout, so that one doesn't have to
wait forever to deal with this usage. Figured we'd cross that bridge when we
came to it.

> 
> > ---
> >  drivers/cxl/cxlmem.h |  19 +++++++
> >  drivers/cxl/pci.c    | 126 +++++++++++++++++++++++++++++++++++++++++++
> >  drivers/cxl/pci.h    |  13 +++++
> >  3 files changed, 158 insertions(+)
> > 
> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > index 3ef3c652599e..eac5528ccaae 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> > @@ -89,6 +89,22 @@ struct cxl_mbox_cmd {
> >   */
> >  #define CXL_CAPACITY_MULTIPLIER SZ_256M
> >  
> > +/**
> > + * struct cxl_endpoint_dvsec_info - Cached DVSEC info
> > + * @mem_enabled: cached value of mem_enabled in the DVSEC, PCIE_DEVICE
> > + * @ranges: Number of HDM ranges this device contains.
> > + * @range.base: cached value of the range base in the DVSEC, PCIE_DEVICE
> > + * @range.size: cached value of the range size in the DVSEC, PCIE_DEVICE
> > + */
> > +struct cxl_endpoint_dvsec_info {
> > +	bool mem_enabled;
> > +	int ranges;
> > +	struct {
> > +		u64 base;
> > +		u64 size;
> > +	} range[2];
> > +};
> > +
> >  /**
> >   * struct cxl_dev_state - The driver device state
> >   *
> > @@ -117,6 +133,7 @@ struct cxl_mbox_cmd {
> >   * @active_persistent_bytes: sum of hard + soft persistent
> >   * @next_volatile_bytes: volatile capacity change pending device reset
> >   * @next_persistent_bytes: persistent capacity change pending device reset
> > + * @info: Cached DVSEC information about the device.
> >   * @mbox_send: @dev specific transport for transmitting mailbox commands
> >   *
> >   * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
> > @@ -147,6 +164,8 @@ struct cxl_dev_state {
> >  	u64 next_volatile_bytes;
> >  	u64 next_persistent_bytes;
> >  
> > +	struct cxl_endpoint_dvsec_info *info;
> > +
> >  	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
> >  };
> >  
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index f3872cbee7f8..b3f46045bf3e 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -452,8 +452,126 @@ static int cxl_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
> >  	return rc;
> >  }
> >  
> > +#define CDPD(cxlds, which)                                                     \
> > +	cxlds->device_dvsec + CXL_DVSEC_PCIE_DEVICE_##which##_OFFSET
> 
> My usual grumble :)  I personally find macros like this a bit of a pain when
> reviewing.  I'd really like to see things spelled out in the code so I
> can immediately see what register we are dealing with even if it does
> seem rather repetitive in the code.

I understand. It's this or have every line look strange because of character
limits.  I agree that whoever writes this stuff can reason it out better, and
that makes it harder on reviewers. I don't mind changing it back, I'd like to
hear any other opinions before I do though.
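
To make the trade-off concrete, here is a small standalone sketch of the
token-pasting form next to the spelled-out form. The offset values are my
assumption from the CXL 2.0 device DVSEC layout (Range 1 Size High at 0x18,
0x10 bytes per range) and are not copied from the patch; struct
cxl_dev_state_sketch and the two helper functions are hypothetical. The macro
here is also wrapped in parentheses, which the version in the patch is not.

```c
#include <assert.h>

/* Assumed register offsets relative to the start of the device DVSEC. */
#define CXL_DVSEC_PCIE_DEVICE_RANGE_SIZE_HIGH_OFFSET(i)	(0x18 + ((i) * 0x10))
#define CXL_DVSEC_PCIE_DEVICE_RANGE_SIZE_LOW_OFFSET(i)	(0x1c + ((i) * 0x10))
#define CXL_DVSEC_PCIE_DEVICE_RANGE_BASE_HIGH_OFFSET(i)	(0x20 + ((i) * 0x10))
#define CXL_DVSEC_PCIE_DEVICE_RANGE_BASE_LOW_OFFSET(i)	(0x24 + ((i) * 0x10))

/* Hypothetical stand-in for the one field of cxl_dev_state used here. */
struct cxl_dev_state_sketch {
	int device_dvsec;
};

/* Token-pasting form: short at the call site, but the register is implicit. */
#define CDPDR(cxlds, which, sorb, lohi)					\
	((cxlds)->device_dvsec +					\
	 CXL_DVSEC_PCIE_DEVICE_RANGE_##sorb##_##lohi##_OFFSET(which))

int range_size_high_macro(const struct cxl_dev_state_sketch *cxlds, int i)
{
	return CDPDR(cxlds, i, SIZE, HIGH);
}

/* Spelled-out form: longer, but greppable and obvious in review. */
int range_size_high_spelled(const struct cxl_dev_state_sketch *cxlds, int i)
{
	return cxlds->device_dvsec +
	       CXL_DVSEC_PCIE_DEVICE_RANGE_SIZE_HIGH_OFFSET(i);
}
```

Both forms compute the same config space offset. The difference is only that a
grep for the full register name finds the second one and not the first.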

> 
> > +
> > +#define CDPDR(cxlds, which, sorb, lohi)                                        \
> > +	cxlds->device_dvsec +                                                  \
> > +		CXL_DVSEC_PCIE_DEVICE_RANGE_##sorb##_##lohi##_OFFSET(which)
> > +
> > +static int wait_for_valid(struct cxl_dev_state *cxlds)
> > +{
> > +	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
> > +	const unsigned long timeout = jiffies + HZ;
> > +	bool valid;
> > +
> > +	do {
> > +		u64 size;
> > +		u32 temp;
> > +		int rc;
> > +
> > +		rc = pci_read_config_dword(pdev, CDPDR(cxlds, 0, SIZE, HIGH),
> > +					   &temp);
> > +		if (rc)
> > +			return -ENXIO;
> > +		size = (u64)temp << 32;
> > +
> > +		rc = pci_read_config_dword(pdev, CDPDR(cxlds, 0, SIZE, LOW),
> > +					   &temp);
> > +		if (rc)
> > +			return -ENXIO;
> > +		size |= temp & CXL_DVSEC_PCIE_DEVICE_MEM_SIZE_LOW_MASK;
> > +
> > +		/*
> > +		 * Memory_Info_Valid: When set, indicates that the CXL Range 1
> > +		 * Size high and Size Low registers are valid. Must be set
> > +		 * within 1 second of deassertion of reset to CXL device.
> > +		 */
> > +		valid = FIELD_GET(CXL_DVSEC_PCIE_DEVICE_MEM_INFO_VALID, temp);
> > +		if (valid)
> > +			break;
> 
> I think there is a race here.  What if you read the high part, get garbage and then
> read the low part which is now valid...
> 
> Swap this around so you read this one first and it will be fine.
> 
> However, given that size isn't used (as 0-day pointed out), this is fine
> anyway (subject to removing the pointless code).
> 

Yes. I've fixed that. And you're right, size is potentially invalid in this case
even if it found the valid bit.
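
The ordering fix can be sketched in userspace roughly as below: read Size Low
first, check Memory_Info_Valid, and only then read Size High. Everything here
is illustrative and not the patch code. read_range_size(), struct fake_dev and
fake_cfg_read32() are made-up names, the fake device returns garbage in Size
High until it becomes valid, and the bit positions (valid in bit 0, size in
bits 31:28 of Size Low) plus the 0x18/0x1c offsets are my assumptions from the
CXL 2.0 register layout.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SIZE_HI_OFF		0x18		/* assumed Range 1 Size High */
#define SIZE_LO_OFF		0x1c		/* assumed Range 1 Size Low */
#define MEM_INFO_VALID		(1u << 0)	/* Size Low bit 0 */
#define MEM_SIZE_LOW_MASK	0xf0000000u	/* Size Low bits 31:28 */

/* Fake device: registers hold garbage until reads_until_valid hits zero. */
struct fake_dev {
	uint32_t size_hi;
	uint32_t size_lo;
	int reads_until_valid;
};

int fake_cfg_read32(void *ctx, unsigned int off, uint32_t *val)
{
	struct fake_dev *dev = ctx;
	bool valid = dev->reads_until_valid == 0;

	if (off == SIZE_HI_OFF)
		*val = valid ? dev->size_hi : 0xdeadbeef;
	else if (off == SIZE_LO_OFF)
		*val = valid ? (dev->size_lo | MEM_INFO_VALID) : 0;
	else
		return -1;

	if (dev->reads_until_valid > 0)
		dev->reads_until_valid--;
	return 0;
}

/*
 * Returns 0 and fills *size once the valid bit is seen, -EAGAIN if the
 * device is not valid yet, -ENXIO if a config read fails. Size High is
 * only read after the valid bit was observed in Size Low.
 */
int read_range_size(int (*rd)(void *, unsigned int, uint32_t *), void *ctx,
		    uint64_t *size)
{
	uint32_t lo, hi;

	assert(rd && size);

	if (rd(ctx, SIZE_LO_OFF, &lo))
		return -ENXIO;
	if (!(lo & MEM_INFO_VALID))
		return -EAGAIN;
	if (rd(ctx, SIZE_HI_OFF, &hi))
		return -ENXIO;

	*size = ((uint64_t)hi << 32) | (lo & MEM_SIZE_LOW_MASK);
	return 0;
}
```

With the original high-then-low order, a device that becomes valid between the
two reads would pair a garbage high dword with a valid low dword. Reading low
first means the high dword is never sampled before the valid bit is set.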

> > +		cpu_relax();
> > +	} while (!time_after(jiffies, timeout));
> > +
> > +	return valid ? 0 : -ETIMEDOUT;
> > +}
> > +
> > +static struct cxl_endpoint_dvsec_info *dvsec_ranges(struct cxl_dev_state *cxlds)
> > +{
> > +	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
> > +	struct cxl_endpoint_dvsec_info *info;
> > +	int hdm_count, rc, i;
> > +	u16 cap, ctrl;
> > +
> > +	rc = pci_read_config_word(pdev, CDPD(cxlds, CAP), &cap);
> > +	if (rc)
> > +		return ERR_PTR(-ENXIO);
> > +	rc = pci_read_config_word(pdev, CDPD(cxlds, CTRL), &ctrl);
> > +	if (rc)
> > +		return ERR_PTR(-ENXIO);
> > +
> > +	if (!(cap & CXL_DVSEC_PCIE_DEVICE_MEM_CAPABLE))
> > +		return ERR_PTR(-ENODEV);
> > +
> > +	/*
> > +	 * It is not allowed by spec for MEM.capable to be set and have 0 HDM
> > +	 * decoders. Therefore, the device is not considered CXL.mem capable.
> > +	 */
> > +	hdm_count = FIELD_GET(CXL_DVSEC_PCIE_DEVICE_HDM_COUNT_MASK, cap);
> > +	if (!hdm_count || hdm_count > 2)
> > +		return ERR_PTR(-EINVAL);
> > +
> > +	rc = wait_for_valid(cxlds);
> > +	if (rc)
> > +		return ERR_PTR(rc);
> > +
> > +	info = devm_kzalloc(cxlds->dev, sizeof(*info), GFP_KERNEL);
> > +	if (!info)
> > +		return ERR_PTR(-ENOMEM);
> > +
> > +	info->mem_enabled = FIELD_GET(CXL_DVSEC_PCIE_DEVICE_MEM_ENABLE, ctrl);
> > +
> > +	for (i = 0; i < hdm_count; i++) {
> > +		u64 base, size;
> > +		u32 temp;
> > +
> > +		rc = pci_read_config_dword(pdev, CDPDR(cxlds, i, SIZE, HIGH),
> > +					   &temp);
> > +		if (rc)
> > +			continue;
> > +		size = (u64)temp << 32;
> > +
> > +		rc = pci_read_config_dword(pdev, CDPDR(cxlds, i, SIZE, LOW),
> > +					   &temp);
> > +		if (rc)
> > +			continue;
> > +		size |= temp & CXL_DVSEC_PCIE_DEVICE_MEM_SIZE_LOW_MASK;
> > +
> > +		rc = pci_read_config_dword(pdev, CDPDR(cxlds, i, BASE, HIGH),
> > +					   &temp);
> > +		if (rc)
> > +			continue;
> > +		base = (u64)temp << 32;
> > +
> > +		rc = pci_read_config_dword(pdev, CDPDR(cxlds, i, BASE, LOW),
> > +					   &temp);
> > +		if (rc)
> > +			continue;
> > +		base |= temp & CXL_DVSEC_PCIE_DEVICE_MEM_BASE_LOW_MASK;
> > +
> > +		info->range[i].base = base;
> > +		info->range[i].size = size;
> > +		info->ranges++;
> > +	}
> > +
> > +	return info;
> > +}
> > +
> > +#undef CDPDR
> > +
> >  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  {
> > +	struct cxl_endpoint_dvsec_info *info;
> >  	struct cxl_register_map map;
> >  	struct cxl_memdev *cxlmd;
> >  	struct cxl_dev_state *cxlds;
> > @@ -505,6 +623,14 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  	if (rc)
> >  		return rc;
> >  
> > +	info = dvsec_ranges(cxlds);
> > +	if (IS_ERR(info))
> > +		dev_err(&pdev->dev,
> > +			"Failed to get DVSEC range information (%ld)\n",
> > +			PTR_ERR(info));
> > +	else
> > +		cxlds->info = info;
> > +
> >  	cxlmd = devm_cxl_add_memdev(cxlds);
> >  	if (IS_ERR(cxlmd))
> >  		return PTR_ERR(cxlmd);
> 
> 


* Re: [PATCH 18/23] cxl/pci: Implement wait for media active
  2021-11-22 17:03   ` Jonathan Cameron
@ 2021-11-22 22:57     ` Ben Widawsky
  2021-11-23 11:09       ` Jonathan Cameron
  2021-11-26 11:36     ` Jonathan Cameron
  1 sibling, 1 reply; 133+ messages in thread
From: Ben Widawsky @ 2021-11-22 22:57 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On 21-11-22 17:03:35, Jonathan Cameron wrote:
> On Fri, 19 Nov 2021 16:02:45 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > CXL 2.0 8.1.3.8.2 defines "Memory_Active: When set, indicates that the
> > CXL Range 1 memory is fully initialized and available for software use.
> > Must be set within Range 1. Memory_Active_Timeout of deassertion of
> 
> Range 1?
> 

Not my numbering... It's the first DVSEC range.

> > reset to CXL device if CXL.mem HwInit Mode=1" The CXL* Type 3 Memory
> > Device Software Guide (Revision 1.0) further describes the need to check
> > this bit before using HDM.
> > 
> > Unfortunately, Memory_Active can take quite a long time depending on
> > media size (up to 256s per 2.0 spec). Since the cxl_pci driver doesn't
> > care about this, a callback is exported as part of driver state for use
> > by drivers that do care.
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> 
> Same thing about size not being used...
> 

Yep, got it.
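
With the size reads dropped, the wait loop reduces to polling one register
against a deadline. Below is a userspace sketch of that shape, not the patch
code. wait_for_active(), struct poll_ops and the fake_* helpers are
hypothetical names, time is simulated so the sketch can be exercised without
actually sleeping, and Memory_Active being bit 1 of Size Low is my assumption
from the spec. The 256s ceiling and 100ms interval are the values used in the
patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MEM_ACTIVE	(1u << 1)	/* assumed: Size Low bit 1 */

/* Callbacks so that the register read and the delay can be faked. */
struct poll_ops {
	int (*read_size_low)(void *ctx, uint32_t *val);
	void (*sleep_ms)(void *ctx, unsigned int ms);
};

/* Fake media that reports active once enough simulated time has passed. */
struct fake_media {
	unsigned int elapsed_ms;
	unsigned int active_after_ms;
};

int fake_read_size_low(void *ctx, uint32_t *val)
{
	struct fake_media *m = ctx;

	*val = m->elapsed_ms >= m->active_after_ms ? MEM_ACTIVE : 0;
	return 0;
}

void fake_sleep_ms(void *ctx, unsigned int ms)
{
	struct fake_media *m = ctx;

	m->elapsed_ms += ms;
}

/*
 * Poll Size Low until Memory_Active is set. Returns 0 when active,
 * -ETIMEDOUT once timeout_ms has been waited, -ENXIO on read failure.
 */
int wait_for_active(const struct poll_ops *ops, void *ctx,
		    unsigned int timeout_ms, unsigned int interval_ms)
{
	unsigned int waited = 0;

	assert(interval_ms > 0);	/* otherwise the loop never ends */

	for (;;) {
		uint32_t lo;

		if (ops->read_size_low(ctx, &lo))
			return -ENXIO;
		if (lo & MEM_ACTIVE)
			return 0;
		if (waited >= timeout_ms)
			return -ETIMEDOUT;

		ops->sleep_ms(ctx, interval_ms);
		waited += interval_ms;
	}
}
```

A timeout parameter like this is also where a module parameter or sysfs knob
for the wait length could plug in later.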

> > ---
> > This patch did not exist in RFCv2
> > ---
> >  drivers/cxl/cxlmem.h |  1 +
> >  drivers/cxl/pci.c    | 56 ++++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 57 insertions(+)
> > 
> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > index eac5528ccaae..a9424dd4e5c3 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> > @@ -167,6 +167,7 @@ struct cxl_dev_state {
> >  	struct cxl_endpoint_dvsec_info *info;
> >  
> >  	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
> > +	int (*wait_media_ready)(struct cxl_dev_state *cxlds);
> >  };
> >  
> >  enum cxl_opcode {
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index b3f46045bf3e..f1a68bfe5f77 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -496,6 +496,60 @@ static int wait_for_valid(struct cxl_dev_state *cxlds)
> >  	return valid ? 0 : -ETIMEDOUT;
> >  }
> >  
> > +/*
> > + * Implements Figure 43 of the CXL Type 3 Memory Device Software Guide. Waits a
> > + * full 256s no matter what the device reports.
> > + */
> > +static int wait_for_media_ready(struct cxl_dev_state *cxlds)
> > +{
> > +	const unsigned long timeout = jiffies + (256 * HZ);
> > +	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
> > +	u64 md_status;
> > +	bool active;
> > +	int rc;
> > +
> > +	rc = wait_for_valid(cxlds);
> > +	if (rc)
> > +		return rc;
> > +
> > +	do {
> > +		u64 size;
> > +		u32 temp;
> > +		int rc;
> > +
> > +		rc = pci_read_config_dword(pdev, CDPDR(cxlds, 0, SIZE, HIGH),
> > +					   &temp);
> > +		if (rc)
> > +			return -ENXIO;
> > +		size = (u64)temp << 32;
> > +
> > +		rc = pci_read_config_dword(pdev, CDPDR(cxlds, 0, SIZE, LOW),
> > +					   &temp);
> > +		if (rc)
> > +			return -ENXIO;
> > +		size |= temp & CXL_DVSEC_PCIE_DEVICE_MEM_SIZE_LOW_MASK;
> > +
> > +		active = FIELD_GET(CXL_DVSEC_PCIE_DEVICE_MEM_ACTIVE, temp);
> 
> Only need to read the register to get active for this particular functionality.
> 
> > +		if (active)
> > +			break;
> > +		cpu_relax();
> > +		mdelay(100);
> > +	} while (!time_after(jiffies, timeout));
> > +
> > +	if (!active)
> > +		return -ETIMEDOUT;
> > +
> > +	rc = check_device_status(cxlds);
> > +	if (rc)
> > +		return rc;
> > +
> > +	md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
> > +	if (!CXLMDEV_READY(md_status))
> > +		return -EIO;
> > +
> > +	return 0;
> > +}
> > +
> >  static struct cxl_endpoint_dvsec_info *dvsec_ranges(struct cxl_dev_state *cxlds)
> >  {
> >  	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
> > @@ -598,6 +652,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  	if (!cxlds->device_dvsec)
> >  		dev_warn(&pdev->dev,
> >  			 "Device DVSEC not present. Expect limited functionality.\n");
> > +	else
> > +		cxlds->wait_media_ready = wait_for_media_ready;
> >  
> >  	rc = cxl_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
> >  	if (rc)
> 


* Re: [PATCH 19/23] cxl/pci: Store component register base in cxlds
  2021-11-22 17:11   ` Jonathan Cameron
@ 2021-11-22 23:01     ` Ben Widawsky
  0 siblings, 0 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-22 23:01 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On 21-11-22 17:11:42, Jonathan Cameron wrote:
> On Fri, 19 Nov 2021 16:02:46 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > The component register base address is enumerated and stored in the
> > cxl_device_state structure for use by the endpoint driver.
> > 
> > Component register base addresses are obtained through PCI mechanisms.
> > As such it makes most sense for the cxl_pci driver to obtain that
> > address. In order to reuse the port driver for enumerating decoder
> > resources for an endpoint, it is desirable to be able to add the
> > endpoint as a port. The endpoint does know of the cxlds and can pull the
> > component register base from there and pass it along to port creation.
> 
> This feels like a lot of explanation for trivial caching of an address.
> I'm not sure you need to be that detailed, though I guess it does no real
> harm.

It is mostly to articulate that cxl_pci is responsible for PCI stuff, and
cxl_mem is responsible for CXL.mem stuff, and that's why we didn't just do this
work in cxl_mem (which indeed was what I originally did).

> 
> Another one where I'm unsure why we are muddling on after an error...

Same motivation as the other - mailbox can still be used even if the media isn't
available.

> 
> 
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > ---
> > Changes since RFCv2:
> > This patch was originally named, "cxl/core: Store component register
> > base for memdevs". It plumbed the component registers through memdev
> > creation. After more work, it became apparent we needed to plumb other
> > stuff from the pci driver, so going forward, cxlds will just be
> > referenced by the cxl_mem driver. This also allows us to ignore the
> > change needed to cxl_test
> > 
> > - Rework patch to store the base in cxlds
> > - Remove defunct comment (Dan)
> > ---
> >  drivers/cxl/cxlmem.h |  2 ++
> >  drivers/cxl/pci.c    | 11 +++++++++++
> >  2 files changed, 13 insertions(+)
> > 
> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > index a9424dd4e5c3..b1d753541f4e 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> > @@ -134,6 +134,7 @@ struct cxl_endpoint_dvsec_info {
> >   * @next_volatile_bytes: volatile capacity change pending device reset
> >   * @next_persistent_bytes: persistent capacity change pending device reset
> >   * @info: Cached DVSEC information about the device.
> > + * @component_reg_phys: register base of component registers
> >   * @mbox_send: @dev specific transport for transmitting mailbox commands
> >   *
> >   * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
> > @@ -165,6 +166,7 @@ struct cxl_dev_state {
> >  	u64 next_persistent_bytes;
> >  
> >  	struct cxl_endpoint_dvsec_info *info;
> > +	resource_size_t component_reg_phys;
> >  
> >  	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
> >  	int (*wait_media_ready)(struct cxl_dev_state *cxlds);
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index f1a68bfe5f77..a8e375950514 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -663,6 +663,17 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  	if (rc)
> >  		return rc;
> >  
> > +	/*
> > +	 * If the component registers can't be found, the cxl_pci driver may
> > +	 * still be useful for management functions so don't return an error.
> > +	 */
> > +	cxlds->component_reg_phys = CXL_RESOURCE_NONE;
> > +	rc = cxl_setup_regs(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
> > +	if (rc)
> > +		dev_warn(&cxlmd->dev, "No component registers (%d)\n", rc);
> > +	else
> > +		cxlds->component_reg_phys = cxl_reg_block(pdev, &map);
> > +
> >  	rc = cxl_pci_setup_mailbox(cxlds);
> >  	if (rc)
> >  		return rc;
> 


* Re: [PATCH 20/23] cxl/port: Introduce a port driver
  2021-11-22 17:41   ` Jonathan Cameron
@ 2021-11-22 23:38     ` Ben Widawsky
  2021-11-23 11:38       ` Jonathan Cameron
  0 siblings, 1 reply; 133+ messages in thread
From: Ben Widawsky @ 2021-11-22 23:38 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On 21-11-22 17:41:32, Jonathan Cameron wrote:
> On Fri, 19 Nov 2021 16:02:47 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > The CXL port driver is responsible for managing the decoder resources
> > contained within the port. It will also provide APIs that other drivers
> > will consume for managing these resources.
> > 
> > There are 4 types of ports in a system:
> > 1. Platform port. This is a non-programmable entity. Such a port is
> >    named rootX. It is enumerated by cxl_acpi in an ACPI based system.
> > 2. Hostbridge port. This ports register access is defined in a platform
> 
> port's 
> 
> >    specific way (CHBS for ACPI platforms). It has n downstream ports,
> >    each of which are known as CXL 2.0 root ports. Once the platform
> >    specific mechanism to get the offset to the registers is obtained it
> >    operates just like other CXL components. The enumeration of this
> >    component is started by cxl_acpi and completed by cxl_port.
> > 3. Switch port. A switch port is similar to a hostbridge port except
> >    register access is defined in the CXL specification in a platform
> >    agnostic way. The downstream ports for a switch are simply known as
> >    downstream ports. The enumeration of these is entirely contained
> >    within cxl_port.
> > 4. Endpoint port. Endpoint ports are similar to switch ports with the
> >    exception that they have no downstream ports, only the underlying
> >    media on the device. The enumeration of these is started by cxl_pci,
> >    and completed by cxl_port.
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> A few comments inline, including what looks to me like memory on the stack that
> has gone out of scope by the time it is accessed.
> 
> Jonathan
> 
> > 
> > ---
> > Changes since RFCv2:
> > - Reword commit message tense (Dan)
> > - Reword commit message
> > - Drop SOFTDEP since it's not needed yet (Dan)
> > - Add CONFIG_CXL_PORT (Dan)
> > - s/CXL_DECODER_F_EN/CXL_DECODER_F_ENABLE (Dan)
> > - rename cxl_hdm_decoder_ functions to "to_" (Dan)
> > - remove useless inline (Dan)
> > - Check endpoint decoder based on dport list instead of driver id (Dan)
> > - Use range instead of resource per dependent patch change
> > - Use clever union packing for target list (Dan)
> > - Only check NULL from devm_cxl_iomap_block (Dan)
> > - Properly parent the created cxl decoders
> > - Move bus rescanning from cxl_acpi to here (Dan)
> > - Remove references to "CFMWS" in core (Dan)
> > - Use macro to help keep within 80 character lines
> > ---
> >  .../driver-api/cxl/memory-devices.rst         |   5 +
> >  drivers/cxl/Kconfig                           |  22 ++
> >  drivers/cxl/Makefile                          |   2 +
> >  drivers/cxl/core/bus.c                        |  67 ++++
> >  drivers/cxl/core/regs.c                       |   6 +-
> >  drivers/cxl/cxl.h                             |  34 +-
> >  drivers/cxl/port.c                            | 323 ++++++++++++++++++
> >  7 files changed, 450 insertions(+), 9 deletions(-)
> >  create mode 100644 drivers/cxl/port.c
> > 
> > diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst
> > index 3b8f41395f6b..fbf0393cdddc 100644
> > --- a/Documentation/driver-api/cxl/memory-devices.rst
> > +++ b/Documentation/driver-api/cxl/memory-devices.rst
> > @@ -28,6 +28,11 @@ CXL Memory Device
> >  .. kernel-doc:: drivers/cxl/pci.c
> >     :internal:
> >  
> > +CXL Port
> > +--------
> > +.. kernel-doc:: drivers/cxl/port.c
> > +   :doc: cxl port
> > +
> >  CXL Core
> >  --------
> >  .. kernel-doc:: drivers/cxl/cxl.h
> > diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> > index ef05e96f8f97..3aeb33bba5a3 100644
> > --- a/drivers/cxl/Kconfig
> > +++ b/drivers/cxl/Kconfig
> > @@ -77,4 +77,26 @@ config CXL_PMEM
> >  	  provisioning the persistent memory capacity of CXL memory expanders.
> >  
> >  	  If unsure say 'm'.
> > +
> > +config CXL_MEM
> > +	tristate "CXL.mem: Memory Devices"
> > +	select CXL_PORT
> > +	depends on CXL_PCI
> > +	default CXL_BUS
> > +	help
> > +	  The CXL.mem protocol allows a device to act as a provider of "System
> > +	  RAM" and/or "Persistent Memory" that is fully coherent as if the
> > +	  memory was attached to the typical CPU memory controller.  This is
> > +	  known as HDM "Host-managed Device Memory".
> > +
> > +	  Say 'y/m' to enable a driver that will attach to CXL.mem devices for
> > +	  memory expansion and control of HDM. See Chapter 9.13 in the CXL 2.0
> > +	  specification for a detailed description of HDM.
> > +
> > +	  If unsure say 'm'.
> > +
> > +
> > +config CXL_PORT
> > +	tristate
> > +
> >  endif
> > diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
> > index cf07ae6cea17..56fcac2323cb 100644
> > --- a/drivers/cxl/Makefile
> > +++ b/drivers/cxl/Makefile
> > @@ -3,7 +3,9 @@ obj-$(CONFIG_CXL_BUS) += core/
> >  obj-$(CONFIG_CXL_PCI) += cxl_pci.o
> >  obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
> >  obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
> > +obj-$(CONFIG_CXL_PORT) += cxl_port.o
> >  
> >  cxl_pci-y := pci.o
> >  cxl_acpi-y := acpi.o
> >  cxl_pmem-y := pmem.o
> > +cxl_port-y := port.o
> > diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> > index 9e0d7d5d9298..46a06cfe79bd 100644
> > --- a/drivers/cxl/core/bus.c
> > +++ b/drivers/cxl/core/bus.c
> > @@ -31,6 +31,8 @@ static DECLARE_RWSEM(root_port_sem);
> >  
> >  static struct device *cxl_topology_host;
> >  
> > +static bool is_cxl_decoder(struct device *dev);
> > +
> >  int cxl_register_topology_host(struct device *host)
> >  {
> >  	down_write(&topology_host_sem);
> > @@ -75,6 +77,45 @@ static void put_cxl_topology_host(struct device *dev)
> >  	up_read(&topology_host_sem);
> >  }
> >  
> > +static int decoder_match(struct device *dev, void *data)
> > +{
> > +	struct resource *theirs = (struct resource *)data;
> > +	struct cxl_decoder *cxld;
> > +
> > +	if (!is_cxl_decoder(dev))
> > +		return 0;
> > +
> > +	cxld = to_cxl_decoder(dev);
> > +	if (theirs->start <= cxld->decoder_range.start &&
> > +	    theirs->end >= cxld->decoder_range.end)
> > +		return 1;
> > +
> > +	return 0;
> > +}
> > +
> > +static struct cxl_decoder *cxl_find_root_decoder(resource_size_t base,
> > +						 resource_size_t size)
> > +{
> > +	struct cxl_decoder *cxld = NULL;
> > +	struct device *cxldd;
> > +	struct device *host;
> > +	struct resource res = (struct resource){
> > +		.start = base,
> > +		.end = base + size - 1,
> > +	};
> > +
> > +	host = get_cxl_topology_host();
> > +	if (!host)
> > +		return NULL;
> > +
> > +	cxldd = device_find_child(host, &res, decoder_match);
> > +	if (cxldd)
> > +		cxld = to_cxl_decoder(cxldd);
> > +
> > +	put_cxl_topology_host(host);
> > +	return cxld;
> > +}
> > +
> >  static ssize_t devtype_show(struct device *dev, struct device_attribute *attr,
> >  			    char *buf)
> >  {
> > @@ -280,6 +321,11 @@ bool is_root_decoder(struct device *dev)
> >  }
> >  EXPORT_SYMBOL_NS_GPL(is_root_decoder, CXL);
> >  
> > +static bool is_cxl_decoder(struct device *dev)
> > +{
> > +	return dev->type->release == cxl_decoder_release;
> > +}
> > +
> >  struct cxl_decoder *to_cxl_decoder(struct device *dev)
> >  {
> >  	if (dev_WARN_ONCE(dev, dev->type->release != cxl_decoder_release,
> > @@ -327,6 +373,7 @@ struct cxl_port *to_cxl_port(struct device *dev)
> >  		return NULL;
> >  	return container_of(dev, struct cxl_port, dev);
> >  }
> > +EXPORT_SYMBOL_NS_GPL(to_cxl_port, CXL);
> >  
> >  struct cxl_dport *cxl_get_root_dport(struct device *dev)
> >  {
> > @@ -735,6 +782,24 @@ EXPORT_SYMBOL_NS_GPL(cxl_decoder_add, CXL);
> >  
> >  static void cxld_unregister(void *dev)
> >  {
> > +	struct cxl_decoder *plat_decoder, *cxld = to_cxl_decoder(dev);
> > +	resource_size_t base, size;
> > +
> > +	if (is_root_decoder(dev)) {
> > +		device_unregister(dev);
> > +		return;
> > +	}
> > +
> > +	base = cxld->decoder_range.start;
> > +	size = range_len(&cxld->decoder_range);
> > +
> > +	if (size) {
> > +		plat_decoder = cxl_find_root_decoder(base, size);
> > +		if (plat_decoder)
> > +			__release_region(&plat_decoder->platform_res, base,
> > +					 size);
> > +	}
> > +
> >  	device_unregister(dev);
> >  }
> >  
> > @@ -789,6 +854,8 @@ static int cxl_device_id(struct device *dev)
> >  		return CXL_DEVICE_NVDIMM_BRIDGE;
> >  	if (dev->type == &cxl_nvdimm_type)
> >  		return CXL_DEVICE_NVDIMM;
> > +	if (dev->type == &cxl_port_type)
> > +		return CXL_DEVICE_PORT;
> >  	return 0;
> >  }
> >  
> > diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
> > index 41a0245867ea..f191b0c995a7 100644
> > --- a/drivers/cxl/core/regs.c
> > +++ b/drivers/cxl/core/regs.c
> > @@ -159,9 +159,8 @@ void cxl_probe_device_regs(struct device *dev, void __iomem *base,
> >  }
> >  EXPORT_SYMBOL_NS_GPL(cxl_probe_device_regs, CXL);
> >  
> > -static void __iomem *devm_cxl_iomap_block(struct device *dev,
> > -					  resource_size_t addr,
> > -					  resource_size_t length)
> > +void __iomem *devm_cxl_iomap_block(struct device *dev, resource_size_t addr,
> > +				   resource_size_t length)
> >  {
> >  	void __iomem *ret_val;
> >  	struct resource *res;
> > @@ -180,6 +179,7 @@ static void __iomem *devm_cxl_iomap_block(struct device *dev,
> >  
> >  	return ret_val;
> >  }
> > +EXPORT_SYMBOL_NS_GPL(devm_cxl_iomap_block, CXL);
> >  
> >  int cxl_map_component_regs(struct pci_dev *pdev,
> >  			   struct cxl_component_regs *regs,
> > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > index 3962a5e6a950..24fa16157d5e 100644
> > --- a/drivers/cxl/cxl.h
> > +++ b/drivers/cxl/cxl.h
> > @@ -17,6 +17,9 @@
> >   * (port-driver, region-driver, nvdimm object-drivers... etc).
> >   */
> >  
> > +/* CXL 2.0 8.2.4 CXL Component Register Layout and Definition */
> > +#define CXL_COMPONENT_REG_BLOCK_SIZE SZ_64K
> > +
> >  /* CXL 2.0 8.2.5 CXL.cache and CXL.mem Registers*/
> >  #define CXL_CM_OFFSET 0x1000
> >  #define CXL_CM_CAP_HDR_OFFSET 0x0
> > @@ -36,11 +39,22 @@
> >  #define CXL_HDM_DECODER_CAP_OFFSET 0x0
> >  #define   CXL_HDM_DECODER_COUNT_MASK GENMASK(3, 0)
> >  #define   CXL_HDM_DECODER_TARGET_COUNT_MASK GENMASK(7, 4)
> > -#define CXL_HDM_DECODER0_BASE_LOW_OFFSET 0x10
> > -#define CXL_HDM_DECODER0_BASE_HIGH_OFFSET 0x14
> > -#define CXL_HDM_DECODER0_SIZE_LOW_OFFSET 0x18
> > -#define CXL_HDM_DECODER0_SIZE_HIGH_OFFSET 0x1c
> > -#define CXL_HDM_DECODER0_CTRL_OFFSET 0x20
> > +#define   CXL_HDM_DECODER_INTERLEAVE_11_8 BIT(8)
> > +#define   CXL_HDM_DECODER_INTERLEAVE_14_12 BIT(9)
> > +#define CXL_HDM_DECODER_CTRL_OFFSET 0x4
> > +#define   CXL_HDM_DECODER_ENABLE BIT(1)
> > +#define CXL_HDM_DECODER0_BASE_LOW_OFFSET(i) (0x20 * (i) + 0x10)
> > +#define CXL_HDM_DECODER0_BASE_HIGH_OFFSET(i) (0x20 * (i) + 0x14)
> > +#define CXL_HDM_DECODER0_SIZE_LOW_OFFSET(i) (0x20 * (i) + 0x18)
> > +#define CXL_HDM_DECODER0_SIZE_HIGH_OFFSET(i) (0x20 * (i) + 0x1c)
> > +#define CXL_HDM_DECODER0_CTRL_OFFSET(i) (0x20 * (i) + 0x20)
> > +#define   CXL_HDM_DECODER0_CTRL_IG_MASK GENMASK(3, 0)
> > +#define   CXL_HDM_DECODER0_CTRL_IW_MASK GENMASK(7, 4)
> > +#define   CXL_HDM_DECODER0_CTRL_COMMIT BIT(9)
> > +#define   CXL_HDM_DECODER0_CTRL_COMMITTED BIT(10)
> > +#define   CXL_HDM_DECODER0_CTRL_TYPE BIT(12)
> > +#define CXL_HDM_DECODER0_TL_LOW(i) (0x20 * (i) + 0x24)
> > +#define CXL_HDM_DECODER0_TL_HIGH(i) (0x20 * (i) + 0x28)
> >  
> >  static inline int cxl_hdm_decoder_count(u32 cap_hdr)
> >  {
> > @@ -148,6 +162,8 @@ int cxl_map_device_regs(struct pci_dev *pdev,
> >  enum cxl_regloc_type;
> >  int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
> >  		      struct cxl_register_map *map);
> > +void __iomem *devm_cxl_iomap_block(struct device *dev, resource_size_t addr,
> > +				   resource_size_t length);
> >  
> >  #define CXL_RESOURCE_NONE ((resource_size_t) -1)
> >  #define CXL_TARGET_STRLEN 20
> > @@ -165,7 +181,8 @@ void cxl_unregister_topology_host(struct device *host);
> >  #define CXL_DECODER_F_TYPE2 BIT(2)
> >  #define CXL_DECODER_F_TYPE3 BIT(3)
> >  #define CXL_DECODER_F_LOCK  BIT(4)
> > -#define CXL_DECODER_F_MASK  GENMASK(4, 0)
> > +#define CXL_DECODER_F_ENABLE    BIT(5)
> > +#define CXL_DECODER_F_MASK  GENMASK(5, 0)
> >  
> >  enum cxl_decoder_type {
> >         CXL_DECODER_ACCELERATOR = 2,
> > @@ -255,6 +272,8 @@ struct cxl_walk_context {
> >   * @dports: cxl_dport instances referenced by decoders
> >   * @decoder_ida: allocator for decoder ids
> >   * @component_reg_phys: component register capability base address (optional)
> > + * @rescan_work: worker object for bus rescans after port additions
> > + * @data: opaque data with driver specific usage
> >   */
> >  struct cxl_port {
> >  	struct device dev;
> > @@ -263,6 +282,8 @@ struct cxl_port {
> >  	struct list_head dports;
> >  	struct ida decoder_ida;
> >  	resource_size_t component_reg_phys;
> > +	struct work_struct rescan_work;
> > +	void *data;
> >  };
> >  
> >  /**
> > @@ -325,6 +346,7 @@ void cxl_driver_unregister(struct cxl_driver *cxl_drv);
> >  
> >  #define CXL_DEVICE_NVDIMM_BRIDGE	1
> >  #define CXL_DEVICE_NVDIMM		2
> > +#define CXL_DEVICE_PORT			3
> >  
> >  #define MODULE_ALIAS_CXL(type) MODULE_ALIAS("cxl:t" __stringify(type) "*")
> >  #define CXL_MODALIAS_FMT "cxl:t%d"
> > diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
> > new file mode 100644
> > index 000000000000..3c03131517af
> > --- /dev/null
> > +++ b/drivers/cxl/port.c
> > @@ -0,0 +1,323 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
> > +#include <linux/device.h>
> > +#include <linux/module.h>
> > +#include <linux/slab.h>
> > +
> > +#include "cxlmem.h"
> > +
> > +/**
> > + * DOC: cxl port
> > + *
> > + * The port driver implements the set of functionality needed to allow full
> > + * decoder enumeration and routing. A CXL port is an abstraction of a CXL
> > + * component that implements some amount of CXL decoding of CXL.mem traffic.
> > + * As of the CXL 2.0 spec, this includes:
> > + *
> > + *	.. list-table:: CXL Components w/ Ports
> > + *		:widths: 25 25 50
> > + *		:header-rows: 1
> > + *
> > + *		* - component
> > + *		  - upstream
> > + *		  - downstream
> > + *		* - Hostbridge
> > + *		  - ACPI0016
> > + *		  - root port
> > + *		* - Switch
> > + *		  - Switch Upstream Port
> > + *		  - Switch Downstream Port
> > + *		* - Endpoint
> > + *		  - Endpoint Port
> > + *		  - N/A
> > + *
> > + * The primary service this driver provides is enumerating HDM decoders and
> > + * presenting APIs to other drivers to utilize the decoders.
> > + */
> > +
> > +static struct workqueue_struct *cxl_port_wq;
> > +
> > +struct cxl_port_data {
> > +	struct cxl_component_regs regs;
> > +
> > +	struct port_caps {
> > +		unsigned int count;
> > +		unsigned int tc;
> > +		unsigned int interleave11_8;
> > +		unsigned int interleave14_12;
> > +	} caps;
> > +};
> > +
> > +static inline int to_interleave_granularity(u32 ctrl)
> > +{
> > +	int val = FIELD_GET(CXL_HDM_DECODER0_CTRL_IG_MASK, ctrl);
> > +
> > +	return 256 << val;
> > +}
> > +
> > +static inline int to_interleave_ways(u32 ctrl)
> > +{
> > +	int val = FIELD_GET(CXL_HDM_DECODER0_CTRL_IW_MASK, ctrl);
> > +
> > +	return 1 << val;
> > +}
> > +
> > +static void get_caps(struct cxl_port *port, struct cxl_port_data *cpd)
> > +{
> > +	void __iomem *hdm_decoder = cpd->regs.hdm_decoder;
> > +	struct port_caps *caps = &cpd->caps;
> > +	u32 hdm_cap;
> > +
> > +	hdm_cap = readl(hdm_decoder + CXL_HDM_DECODER_CAP_OFFSET);
> > +
> > +	caps->count = cxl_hdm_decoder_count(hdm_cap);
> > +	caps->tc = FIELD_GET(CXL_HDM_DECODER_TARGET_COUNT_MASK, hdm_cap);
> > +	caps->interleave11_8 =
> > +		FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_11_8, hdm_cap);
> > +	caps->interleave14_12 =
> > +		FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_14_12, hdm_cap);
> > +}
> > +
> > +static int map_regs(struct cxl_port *port, void __iomem *crb,
> > +		    struct cxl_port_data *cpd)
> > +{
> > +	struct cxl_register_map map;
> > +	struct cxl_component_reg_map *comp_map = &map.component_map;
> > +
> > +	cxl_probe_component_regs(&port->dev, crb, comp_map);
> > +	if (!comp_map->hdm_decoder.valid) {
> > +		dev_err(&port->dev, "HDM decoder registers invalid\n");
> > +		return -ENXIO;
> > +	}
> > +
> > +	cpd->regs.hdm_decoder = crb + comp_map->hdm_decoder.offset;
> > +
> > +	return 0;
> > +}
> > +
> > +static u64 get_decoder_size(void __iomem *hdm_decoder, int n)
> > +{
> > +	u32 ctrl = readl(hdm_decoder + CXL_HDM_DECODER0_CTRL_OFFSET(n));
> > +
> > +	if (ctrl & CXL_HDM_DECODER0_CTRL_COMMITTED)
> > +		return 0;
> > +
> > +	return ioread64_hi_lo(hdm_decoder +
> > +			      CXL_HDM_DECODER0_SIZE_LOW_OFFSET(n));
> > +}
> > +
> > +static bool is_endpoint_port(struct cxl_port *port)
> > +{
> > +	/* Endpoints can't be ports... yet! */
> > +	return false;
> > +}
> > +
> > +static void rescan_ports(struct work_struct *work)
> > +{
> > +	if (bus_rescan_devices(&cxl_bus_type))
> > +		pr_warn("Failed to rescan\n");
> > +}
> > +
> > +/* Minor layering violation */
> > +static int dvsec_range_used(struct cxl_port *port)
> > +{
> > +	struct cxl_endpoint_dvsec_info *info;
> > +	int i, ret = 0;
> > +
> > +	if (!is_endpoint_port(port))
> > +		return 0;
> > +
> > +	info = port->data;
> > +	for (i = 0; i < info->ranges; i++)
> > +		if (info->range[i].size)
> > +			ret |= 1 << i;
> > +
> > +	return ret;
> > +}
> > +
> > +static int enumerate_hdm_decoders(struct cxl_port *port,
> > +				  struct cxl_port_data *portdata)
> > +{
> > +	void __iomem *hdm_decoder = portdata->regs.hdm_decoder;
> > +	bool global_enable;
> > +	u32 global_ctrl;
> > +	int i = 0;
> > +
> > +	global_ctrl = readl(hdm_decoder + CXL_HDM_DECODER_CTRL_OFFSET);
> > +	global_enable = global_ctrl & CXL_HDM_DECODER_ENABLE;
> > +	if (!global_enable) {
> > +		i = dvsec_range_used(port);
> > +		if (i) {
> > +			dev_err(&port->dev,
> > +				"Couldn't add port because device is using DVSEC range registers\n");
> > +			return -EBUSY;
> > +		}
> > +	}
> > +
> > +	for (i = 0; i < portdata->caps.count; i++) {
> > +		int rc, target_count = portdata->caps.tc;
> > +		struct cxl_decoder *cxld;
> > +		int *target_map = NULL;
> > +		u64 size;
> > +
> > +		if (is_endpoint_port(port))
> > +			target_count = 0;
> > +
> > +		cxld = cxl_decoder_alloc(port, target_count);
> > +		if (IS_ERR(cxld)) {
> > +			dev_warn(&port->dev,
> > +				 "Failed to allocate the decoder\n");
> > +			return PTR_ERR(cxld);
> > +		}
> > +
> > +		cxld->target_type = CXL_DECODER_EXPANDER;
> > +		cxld->interleave_ways = 1;
> > +		cxld->interleave_granularity = 0;
> > +
> > +		size = get_decoder_size(hdm_decoder, i);
> > +		if (size != 0) {
> > +#define decoderN(reg, n) hdm_decoder + CXL_HDM_DECODER0_##reg(n)
> 
> Perhaps this block in the if (size != 0) would be more readable if broken out
> to a utility function?

I don't follow this comment; there is already get_decoder_size(). Can you please
elaborate?

> As normal I'm not keen on the macro magic if we can avoid it.
> 

Yeah - just trying to not have to deal with wrapping long lines.

> 
> > +			int temp[CXL_DECODER_MAX_INTERLEAVE];
> > +			u64 base;
> > +			u32 ctrl;
> > +			int j;
> > +			union {
> > +				u64 value;
> > +				char target_id[8];
> > +			} target_list;
> 
> I thought we tried to avoid this union usage in the kernel because of the whole
> thing about C compilers being able to do what they like with it...
> 

I wasn't aware of the restriction. I can change it back if it's required. It
does look a lot nicer this way. Is there a reference to this issue somewhere?
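
For reference, one union-free alternative is to pull each target id out of the
64-bit target list with a shift and a mask. This is only a userspace sketch: it
assumes target N sits in byte N of the register value, least significant byte
first (which is what the union gives on a little-endian host), and the helper
name target_id_from_list() is made up for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: return target id 'n' (0..7)
 * from a 64-bit target list value, assuming target 0 is the least
 * significant byte.
 */
static int target_id_from_list(uint64_t target_list, unsigned int n)
{
	return (int)((target_list >> (n * 8)) & 0xff);
}
```

Unlike the char array in the union, this gives the same answer regardless of
host byte order, and it never yields a negative id where char is signed.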

> > +
> > +			target_map = temp;
> 
> This is set to a variable that goes out of scope whilst target_map is still in
> use.
> 

Yikes. I'm pretty surprised the compiler didn't catch this.

> > +			ctrl = readl(decoderN(CTRL_OFFSET, i));
> > +			base = ioread64_hi_lo(decoderN(BASE_LOW_OFFSET, i));
> > +
> > +			cxld->decoder_range = (struct range){
> > +				.start = base,
> > +				.end = base + size - 1
> > +			};
> > +
> > +			cxld->flags = CXL_DECODER_F_ENABLE;
> > +			cxld->interleave_ways = to_interleave_ways(ctrl);
> > +			cxld->interleave_granularity =
> > +				to_interleave_granularity(ctrl);
> > +
> > +			if (FIELD_GET(CXL_HDM_DECODER0_CTRL_TYPE, ctrl) == 0)
> > +				cxld->target_type = CXL_DECODER_ACCELERATOR;
> > +
> > +			target_list.value = ioread64_hi_lo(decoderN(TL_LOW, i));
> > +			for (j = 0; j < cxld->interleave_ways; j++)
> > +				target_map[j] = target_list.target_id[j];
> > +#undef decoderN
> > +		}
> > +
> > +		rc = cxl_decoder_add_locked(cxld, target_map);
> > +		if (rc)
> > +			put_device(&cxld->dev);
> > +		else
> > +			rc = cxl_decoder_autoremove(&port->dev, cxld);
> > +		if (rc)
> > +			dev_err(&port->dev, "Failed to add decoder\n");
> 
> If that fails on the autoremove registration (unlikely) this message
> will be rather confusing - as the add was fine...
> 
> This nest of carrying on when we have an error is getting ever deeper...
> 

Yeah, this is not great. I will clean it up.
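
Roughly the shape I have in mind, as a userspace sketch with stubs in place of
cxl_decoder_add_locked() and cxl_decoder_autoremove(). The struct, the stubs
and the message text are all placeholders for illustration.

```c
#include <errno.h>
#include <stdio.h>

struct decoder { int refs; };

/* Stubs: each fails on request so both error paths can be exercised. */
static int stub_add(struct decoder *d, int fail)
{
	(void)d;
	return fail ? -EINVAL : 0;
}

static int stub_autoremove(struct decoder *d, int fail)
{
	(void)d;
	return fail ? -ENOMEM : 0;
}

static int add_one(struct decoder *d, int fail_add, int fail_autoremove)
{
	int rc;

	rc = stub_add(d, fail_add);
	if (rc) {
		d->refs--;	/* stands in for put_device() */
		fprintf(stderr, "Failed to add decoder (%d)\n", rc);
		return rc;
	}

	rc = stub_autoremove(d, fail_autoremove);
	if (rc) {
		fprintf(stderr, "Failed to register decoder removal (%d)\n", rc);
		return rc;
	}

	return 0;
}
```

Each failure returns immediately with its own message, so the success path
stays unindented and the autoremove case no longer reports a failed add.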

Thanks.

> > +		else
> > +			dev_dbg(&cxld->dev, "Added to port %s\n",
> > +				dev_name(&port->dev));
> > +	}
> > +
> > +	/*
> > +	 * Turn on global enable now since DVSEC ranges aren't being used and
> > +	 * we'll eventually want the decoder enabled.
> > +	 */
> > +	if (!global_enable) {
> > +		dev_dbg(&port->dev, "Enabling HDM decode\n");
> > +		writel(global_ctrl | CXL_HDM_DECODER_ENABLE, hdm_decoder + CXL_HDM_DECODER_CTRL_OFFSET);
> > +	}
> > +
> > +	return 0;
> > +}
> > +


* Re: [PATCH 21/23] cxl: Unify port enumeration for decoders
  2021-11-22 17:48   ` Jonathan Cameron
@ 2021-11-22 23:44     ` Ben Widawsky
  0 siblings, 0 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-22 23:44 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On 21-11-22 17:48:16, Jonathan Cameron wrote:
> On Fri, 19 Nov 2021 16:02:48 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > The port driver exists to do proper enumeration of the component
> > registers for ports, including HDM decoder resources. Any port which
> > follows the CXL specification to implement HDM decoder registers should
> > be handled by the port driver. This includes host bridge registers that
> > are currently handled within the cxl_acpi driver.
> > 
> > In moving the responsibility from cxl_acpi to cxl_port, three primary
> > things are accomplished here:
> > 1. Multi-port host bridges are now handled by the port driver
> > 2. Single port host bridges are handled by the port driver
> > 3. Single port switches without component registers will be handled by
> >    the port driver.
> > 
> > While it's tempting to remove decoder APIs from cxl_core entirely, it is
> > still required that platform specific drivers are able to add decoders
> > which aren't specified in CXL 2.0+. An example of this is the CFMWS
> > parsing which is implemented in cxl_acpi.
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> One trivial suggestion inline, but looks fine to me otherwise.
> 
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> > 
> > ---
> > Changes since RFCv2:
> > - Renamed subject
> > - Reworded commit message
> > - Handle *all* host bridge port enumeration in cxl_port (Dan)
> >   - Handle passthrough decoding in cxl_port
> > ---
> >  drivers/cxl/acpi.c     | 41 +++-----------------------------
> >  drivers/cxl/core/bus.c |  6 +++--
> >  drivers/cxl/cxl.h      |  2 ++
> >  drivers/cxl/port.c     | 54 +++++++++++++++++++++++++++++++++++++++++-
> >  4 files changed, 62 insertions(+), 41 deletions(-)
> > 
> > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> > index c12e4fed7941..c85a04ecbf7f 100644
> > --- a/drivers/cxl/acpi.c
> > +++ b/drivers/cxl/acpi.c
> > @@ -210,8 +210,6 @@ static int add_host_bridge_uport(struct device *match, void *arg)
> >  	struct acpi_device *bridge = to_cxl_host_bridge(host, match);
> >  	struct acpi_pci_root *pci_root;
> >  	struct cxl_walk_context ctx;
> > -	int single_port_map[1], rc;
> > -	struct cxl_decoder *cxld;
> >  	struct cxl_dport *dport;
> >  	struct cxl_port *port;
> >  
> > @@ -245,43 +243,9 @@ static int add_host_bridge_uport(struct device *match, void *arg)
> >  		return -ENODEV;
> >  	if (ctx.error)
> >  		return ctx.error;
> > -	if (ctx.count > 1)
> > -		return 0;
> >  
> > -	/* TODO: Scan CHBCR for HDM Decoder resources */
> > -
> > -	/*
> > -	 * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability
> > -	 * Structure) single ported host-bridges need not publish a decoder
> > -	 * capability when a passthrough decode can be assumed, i.e. all
> > -	 * transactions that the uport sees are claimed and passed to the single
> > -	 * dport. Disable the range until the first CXL region is enumerated /
> > -	 * activated.
> > -	 */
> > -	cxld = cxl_decoder_alloc(port, 1);
> > -	if (IS_ERR(cxld))
> > -		return PTR_ERR(cxld);
> > -
> > -	cxld->interleave_ways = 1;
> > -	cxld->interleave_granularity = PAGE_SIZE;
> > -	cxld->target_type = CXL_DECODER_EXPANDER;
> > -	cxld->platform_res = (struct resource)DEFINE_RES_MEM(0, 0);
> > -
> > -	device_lock(&port->dev);
> > -	dport = list_first_entry(&port->dports, typeof(*dport), list);
> > -	device_unlock(&port->dev);
> > -
> > -	single_port_map[0] = dport->port_id;
> > -
> > -	rc = cxl_decoder_add(cxld, single_port_map);
> > -	if (rc)
> > -		put_device(&cxld->dev);
> > -	else
> > -		rc = cxl_decoder_autoremove(host, cxld);
> > -
> > -	if (rc == 0)
> > -		dev_dbg(host, "add: %s\n", dev_name(&cxld->dev));
> > -	return rc;
> > +	/* Host bridge ports are enumerated by the port driver. */
> > +	return 0;
> >  }
> >  
> >  struct cxl_chbs_context {
> > @@ -448,3 +412,4 @@ module_platform_driver(cxl_acpi_driver);
> >  MODULE_LICENSE("GPL v2");
> >  MODULE_IMPORT_NS(CXL);
> >  MODULE_IMPORT_NS(ACPI);
> > +MODULE_SOFTDEP("pre: cxl_port");
> > diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> > index 46a06cfe79bd..acfa212eea21 100644
> > --- a/drivers/cxl/core/bus.c
> > +++ b/drivers/cxl/core/bus.c
> > @@ -62,7 +62,7 @@ void cxl_unregister_topology_host(struct device *host)
> >  }
> >  EXPORT_SYMBOL_NS_GPL(cxl_unregister_topology_host, CXL);
> >  
> > -static struct device *get_cxl_topology_host(void)
> > +struct device *get_cxl_topology_host(void)
> >  {
> >  	down_read(&topology_host_sem);
> >  	if (cxl_topology_host)
> > @@ -70,12 +70,14 @@ static struct device *get_cxl_topology_host(void)
> >  	up_read(&topology_host_sem);
> >  	return NULL;
> >  }
> > +EXPORT_SYMBOL_NS_GPL(get_cxl_topology_host, CXL);
> >  
> > -static void put_cxl_topology_host(struct device *dev)
> > +void put_cxl_topology_host(struct device *dev)
> >  {
> >  	WARN_ON(dev != cxl_topology_host);
> >  	up_read(&topology_host_sem);
> >  }
> > +EXPORT_SYMBOL_NS_GPL(put_cxl_topology_host, CXL);
> >  
> >  static int decoder_match(struct device *dev, void *data)
> >  {
> > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > index 24fa16157d5e..f8354241c5a3 100644
> > --- a/drivers/cxl/cxl.h
> > +++ b/drivers/cxl/cxl.h
> > @@ -170,6 +170,8 @@ void __iomem *devm_cxl_iomap_block(struct device *dev, resource_size_t addr,
> >  
> >  int cxl_register_topology_host(struct device *host);
> >  void cxl_unregister_topology_host(struct device *host);
> > +struct device *get_cxl_topology_host(void);
> > +void put_cxl_topology_host(struct device *dev);
> >  
> >  /*
> >   * cxl_decoder flags that define the type of memory / devices this
> > diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
> > index 3c03131517af..7a1fc726fe9f 100644
> > --- a/drivers/cxl/port.c
> > +++ b/drivers/cxl/port.c
> > @@ -233,12 +233,64 @@ static int enumerate_hdm_decoders(struct cxl_port *port,
> >  	return 0;
> >  }
> >  
> > +/*
> > + * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability Structure)
> > + * single ported host-bridges need not publish a decoder capability when a
> > + * passthrough decode can be assumed, i.e. all transactions that the uport sees
> > + * are claimed and passed to the single dport. Disable the range until the first
> > + * CXL region is enumerated / activated.
> > + */
> > +static int add_passthrough_decoder(struct cxl_port *port)
> > +{
> > +	int single_port_map[1], rc;
> > +	struct cxl_decoder *cxld;
> > +	struct cxl_dport *dport;
> > +
> > +	device_lock_assert(&port->dev);
> > +
> > +	cxld = cxl_decoder_alloc(port, 1);
> > +	if (IS_ERR(cxld))
> > +		return PTR_ERR(cxld);
> > +
> > +	cxld->interleave_ways = 1;
> > +	cxld->interleave_granularity = PAGE_SIZE;
> > +	cxld->target_type = CXL_DECODER_EXPANDER;
> > +	cxld->platform_res = (struct resource)DEFINE_RES_MEM(0, 0);
> > +
> > +	dport = list_first_entry(&port->dports, typeof(*dport), list);
> > +	single_port_map[0] = dport->port_id;
> > +
> > +	rc = cxl_decoder_add_locked(cxld, single_port_map);
> > +	if (rc)
> > +		put_device(&cxld->dev);
> 
> I would handle this error path entirely here, or use a goto rather
> than messing up the good path with conditionals on each element,
> particularly as there isn't much to do in the error paths.
> I guess this might get more complicated in later patches though.
> 
> Obviously that tidy up would make this more complex than simply moving
> the code. (I might have commented on this before, but too long ago to remember ;)
> 
> 	if (rc) {
> 		put_device(&cxld->dev);
> 		return rc;
> 	}
> 	rc = cxl_decoder...
> 	if (rc)
> 		return rc;
> 
> 	dev_dbg(..
> 
> 	return 0;
> 

Since I changed this in v2 for the port driver, in the last patch, I think it
makes sense to do the move first, and then clean up.

Thanks.

> > +	else
> > +		rc = cxl_decoder_autoremove(&port->dev, cxld);
> > +
> > +	if (rc == 0)
> > +		dev_dbg(&port->dev, "add: %s\n", dev_name(&cxld->dev));
> > +
> > +	return rc;
> > +}
> > +


* Re: [PATCH 22/23] cxl/mem: Introduce cxl_mem driver
  2021-11-22 18:17   ` Jonathan Cameron
@ 2021-11-23  0:05     ` Ben Widawsky
  0 siblings, 0 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-23  0:05 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On 21-11-22 18:17:34, Jonathan Cameron wrote:
> On Fri, 19 Nov 2021 16:02:49 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > Add a driver that is capable of determining whether a device is in a
> > CXL.mem routed part of the topology.
> > 
> > This driver allows a higher level driver - such as one controlling CXL
> > regions, which is itself a set of CXL devices - to easily determine if
> > the CXL devices are CXL.mem capable by checking if the driver has bound.
> > CXL memory device services may also be provided by this driver though
> > none are needed as of yet. cxl_mem also plays the part of registering
> > itself as an endpoint port, which is a required step to enumerate the
> > device's HDM decoder resources.
> > 
> > Even though cxl_mem driver is the only consumer of the new
> > cxl_scan_ports() introduced in cxl_core, because that functionality has
> > PCIe specificity it is kept out of this driver.
> > 
> > As part of this patch, find_dport_by_dev() is promoted to the cxl_core's
> > set of APIs for use by the new driver.
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > 
> Main thing in here is the mysterious private data. I'd drop
> that until we have patches that set it in the same series.

It's used, it just got leaked into the wrong patch (b37c9a7eca3f ("cxl/port:
Introduce a port driver")). I'll fix it up so it gets added here.

/* Minor layering violation */
static int dvsec_range_used(struct cxl_port *port)
{
        struct cxl_endpoint_dvsec_info *info;
        int i, ret = 0;

        if (!is_endpoint_port(port))
                return 0;

        info = port->data;
        for (i = 0; i < info->ranges; i++)
                if (info->range[i].size)
                        ret |= 1 << i;

        return ret;
}


> 
> 
> 
> 
> ...
> 
> > diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> > index acfa212eea21..cab3aabd5abc 100644
> > --- a/drivers/cxl/core/bus.c
> > +++ b/drivers/cxl/core/bus.c
> > @@ -8,6 +8,7 @@
> >  #include <linux/idr.h>
> >  #include <cxlmem.h>
> >  #include <cxl.h>
> > +#include <pci.h>
> >  #include "core.h"
> >  
> >  /**
> > @@ -436,7 +437,7 @@ static int devm_cxl_link_uport(struct device *host, struct cxl_port *port)
> >  
> >  static struct cxl_port *cxl_port_alloc(struct device *uport,
> >  				       resource_size_t component_reg_phys,
> > -				       struct cxl_port *parent_port)
> > +				       struct cxl_port *parent_port, void *data)
> >  {
> >  	struct cxl_port *port;
> >  	struct device *dev;
> > @@ -465,6 +466,7 @@ static struct cxl_port *cxl_port_alloc(struct device *uport,
> >  
> >  	port->uport = uport;
> >  	port->component_reg_phys = component_reg_phys;
> > +	port->data = data;
> >  	ida_init(&port->decoder_ida);
> >  	INIT_LIST_HEAD(&port->dports);
> >  
> > @@ -485,16 +487,17 @@ static struct cxl_port *cxl_port_alloc(struct device *uport,
> >   * @uport: "physical" device implementing this upstream port
> >   * @component_reg_phys: (optional) for configurable cxl_port instances
> >   * @parent_port: next hop up in the CXL memory decode hierarchy
> > + * @data: opaque data to be used by the port driver
> >   */
> >  struct cxl_port *devm_cxl_add_port(struct device *uport,
> >  				   resource_size_t component_reg_phys,
> > -				   struct cxl_port *parent_port)
> > +				   struct cxl_port *parent_port, void *data)
> >  {
> >  	struct device *dev, *host;
> >  	struct cxl_port *port;
> >  	int rc;
> >  
> > -	port = cxl_port_alloc(uport, component_reg_phys, parent_port);
> > +	port = cxl_port_alloc(uport, component_reg_phys, parent_port, data);
> >  	if (IS_ERR(port))
> >  		return port;
> >  
> > @@ -531,6 +534,113 @@ struct cxl_port *devm_cxl_add_port(struct device *uport,
> >  }
> >  EXPORT_SYMBOL_NS_GPL(devm_cxl_add_port, CXL);
> >  
> 
> 
> ...
> 
> > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> > new file mode 100644
> > index 000000000000..818e30571e4d
> > --- /dev/null
> > +++ b/drivers/cxl/core/pci.c
> > @@ -0,0 +1,119 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
> > +#include <linux/device.h>
> > +#include <linux/pci.h>
> > +#include <cxl.h>
> > +#include <pci.h>
> > +#include "core.h"
> > +
> > +/**
> > + * DOC: cxl core pci
> > + *
> > + * Compute Express Link protocols are layered on top of PCIe. CXL core provides
> > + * a set of helpers for CXL interactions which occur via PCIe.
> > + */
> > +
> > +/**
> > + * find_parent_cxl_port() - Finds parent port through PCIe mechanisms
> > + * @pdev: PCIe USP or DSP to find an upstream port for
> > + *
> > + * Once all CXL ports are enumerated, there is no need to reference the PCIe
> > + * parallel universe as all downstream ports are contained in a linked list, and
> > + * all upstream ports are accessible via pointer. During the enumeration, it is
> > + * very convenient to be able to peek up one level in the hierarchy without
> > + * needing the established relationship between data structures so that the
> > + * parenting can be done as the ports/dports are created.
> > + *
> > + * A reference is kept to the found port.
> > + */
> > +struct cxl_port *find_parent_cxl_port(struct pci_dev *pdev)
> > +{
> > +	struct device *parent_dev, *gparent_dev;
> > +
> > +	/* Parent is either a downstream port, or root port */
> > +	parent_dev = get_device(pdev->dev.parent);
> > +
> > +	if (is_cxl_switch_usp(&pdev->dev)) {
> > +		if (dev_WARN_ONCE(&pdev->dev,
> 
> maybe put the condition var in a local variable to reduce the indent and get something
> more readable?
> 
> > +				  pci_pcie_type(pdev) !=
> > +						  PCI_EXP_TYPE_DOWNSTREAM &&
> > +					  pci_pcie_type(pdev) !=
> > +						  PCI_EXP_TYPE_ROOT_PORT,
> > +				  "Parent not downstream\n"))
> > +			goto err;
> > +
> > +		/*
> > +		 * Grandparent is either an upstream port or a platform device that has
> > +		 * been added as a cxl_port already.
> > +		 */
> > +		gparent_dev = get_device(parent_dev->parent);
> > +		put_device(parent_dev);
> > +
> > +		return to_cxl_port(gparent_dev);
> > +	} else if (is_cxl_switch_dsp(&pdev->dev)) {
> > +		if (dev_WARN_ONCE(&pdev->dev,
> > +				  pci_pcie_type(pdev) != PCI_EXP_TYPE_UPSTREAM,
> > +				  "Parent not upstream"))
> > +			goto err;
> > +		return to_cxl_port(parent_dev);
> > +	}
> > +
> > +err:
> > +	dev_WARN(&pdev->dev, "Invalid topology\n");
> > +	put_device(parent_dev);
> > +	return NULL;
> > +}
> > +
> 
> ...
> 
> > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > index f8354241c5a3..3bda806f4244 100644
> > --- a/drivers/cxl/cxl.h
> > +++ b/drivers/cxl/cxl.h
> > @@ -296,6 +296,7 @@ struct cxl_port {
> >   * @port: reference to cxl_port that contains this downstream port
> >   * @list: node for a cxl_port's list of cxl_dport instances
> >   * @root_port_link: node for global list of root ports
> > + * @data: Opaque data passed by other drivers, used by port driver
> 
> Is this used yet? Possibly leave introducing this until we need it,
> as it's not obvious here what it will be for.
> 

Yep...

> >   */
> >  struct cxl_dport {
> >  	struct device *dport;
> > @@ -304,16 +305,20 @@ struct cxl_dport {
> >  	struct cxl_port *port;
> >  	struct list_head list;
> >  	struct list_head root_port_link;
> > +	void *data;
> >  };
> >  
> >  struct cxl_port *to_cxl_port(struct device *dev);
> >  struct cxl_port *devm_cxl_add_port(struct device *uport,
> >  				   resource_size_t component_reg_phys,
> > -				   struct cxl_port *parent_port);
> > +				   struct cxl_port *parent_port, void *data);
> > +void cxl_scan_ports(struct cxl_dport *root_port);
> 
> ...
> 
> > +
> > +static int create_endpoint(struct device *dev, struct cxl_port *parent,
> > +			   struct cxl_dport *dport)
> > +{
> > +	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> > +	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> > +	struct cxl_endpoint_dvsec_info *info = cxlds->info;
> > +	struct cxl_port *endpoint;
> > +	int rc;
> > +
> > +	endpoint =
> > +		devm_cxl_add_port(dev, cxlds->component_reg_phys, parent, info);
> 
> I'd just have that on one line, or break it somewhere in the parameter list.
> Right now it just looks odd and saves maybe 4 characters in line length.
> 
> > +	if (IS_ERR(endpoint))
> > +		return PTR_ERR(endpoint);
> > +
> > +	rc = sysfs_create_link(&cxlmd->dev.kobj, &dport->dport->kobj,
> > +			       "root_port");
> 
> Not obvious to me what this link is for.  Maybe needs a docs update
> somewhere?
> 

Okay. IIRC this was a request from Dan to help userspace tooling but I might be
mistaken on that.

> > +	if (rc) {
> > +		device_del(&endpoint->dev);
> > +		return rc;
> > +	}
> > +	dev_dbg(dev, "add: %s\n", dev_name(&endpoint->dev));
> > +
> > +	return devm_add_action_or_reset(dev, remove_endpoint, cxlmd);
> > +}
> > +
> > +static int cxl_mem_probe(struct device *dev)
> > +{
> > +	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> > +	struct cxl_port *hostbridge, *parent_port;
> > +	struct walk_ctx ctx = { NULL, false };
> 
> Perhaps use c99 style to show what is being initialized inside the walk ctx.
> Will make it more obvious these are the right values.
> 
> > +	struct cxl_dport *dport;
> > +	int rc;
> > +
> > +	rc = wait_for_media(cxlmd);
> > +	if (rc) {
> > +		dev_err(dev, "Media not active (%d)\n", rc);
> > +		return rc;
> > +	}
> > +
> > +	walk_to_root_port(dev, &ctx);
> > +
> > +	/*
> > +	 * Couldn't find a CXL capable root port. This may happen even with a
> > +	 * CXL capable topology if cxl_acpi hasn't completed yet. A rescan will
> > +	 * occur.
> > +	 */
> > +	if (!ctx.root_port)
> > +		return -ENODEV;
> > +
> > +	hostbridge = ctx.root_port->port;
> > +	device_lock(&hostbridge->dev);
> > +
> > +	/* hostbridge has no port driver, the topology isn't enabled yet */
> > +	if (!hostbridge->dev.driver) {
> > +		device_unlock(&hostbridge->dev);
> > +		return -ENODEV;
> > +	}
> > +
> > +	/* No switch + found root port means we're done */
> > +	if (!ctx.has_switch) {
> > +		parent_port = to_cxl_port(&hostbridge->dev);
> > +		dport = ctx.root_port;
> > +		goto out;
> > +	}
> > +
> > +	/* Walk down from the root port and add all switches */
> > +	cxl_scan_ports(ctx.root_port);
> > +
> > +	/* If parent is a dport the endpoint is good to go. */
> > +	parent_port = to_cxl_port(dev->parent->parent);
> > +	dport = cxl_find_dport_by_dev(parent_port, dev->parent);
> > +	if (!dport) {
> > +		rc = -ENODEV;
> > +		goto err_out;
> > +	}
> > +
> > +out:
> > +	rc = create_endpoint(dev, parent_port, dport);
> > +	if (rc)
> > +		goto err_out;
> > +
> > +	cxlmd->root_port = ctx.root_port;
> > +
> > +err_out:
> > +	device_unlock(&hostbridge->dev);
> > +	return rc;
> > +}
> > +
> 
> > diff --git a/drivers/cxl/pci.h b/drivers/cxl/pci.h
> > index 2a48cd65bf59..3fd0909522f2 100644
> > --- a/drivers/cxl/pci.h
> > +++ b/drivers/cxl/pci.h
> > @@ -15,6 +15,7 @@
> >  
> >  /* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
> >  #define CXL_DVSEC_PCIE_DEVICE					0
> > +
> 
> Noise that shouldn't be in this patch.
> 

I wish there was some automation I had to catch this kind of thing... Thanks.

> >  #define   CXL_DVSEC_PCIE_DEVICE_CAP_OFFSET			0xA
> >  #define     CXL_DVSEC_PCIE_DEVICE_MEM_CAPABLE			BIT(2)
> >  #define     CXL_DVSEC_PCIE_DEVICE_HDM_COUNT_MASK		GENMASK(5, 4)
> > @@ -64,4 +65,7 @@ enum cxl_regloc_type {
> >  	((resource_size_t)(pci_resource_start(pdev, (map)->barno) +            \
> >  			   (map)->block_offset))
> >  
> > +bool is_cxl_switch_usp(struct device *dev);
> > +bool is_cxl_switch_dsp(struct device *dev);
> > +
> >  #endif /* __CXL_PCI_H__ */
> 
> 


* Re: [PATCH 13/23] cxl/core: Move target population locking to caller
  2021-11-22 21:58     ` Ben Widawsky
@ 2021-11-23 11:05       ` Jonathan Cameron
  0 siblings, 0 replies; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-23 11:05 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On Mon, 22 Nov 2021 13:58:01 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> On 21-11-22 16:33:02, Jonathan Cameron wrote:
> > On Fri, 19 Nov 2021 16:02:40 -0800
> > Ben Widawsky <ben.widawsky@intel.com> wrote:
> >   
> > > In preparation for a port driver that enumerates a descendant port +
> > > decoder hierarchy, arrange for an unlocked version of cxl_decoder_add().
> > > Otherwise a port-driver that adds a child decoder will deadlock on the
> > > device_lock() in ->probe().
> > >   
> > 
> > I think this description should call out that the lock was originally taken
> > for a much shorter time in decoder_populate_targets() but is moved
> > up one layer.  
> 
> Sounds good.
With that added and below discussion resolved.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> 
> > 
> > One other query inline.  Seems like the WARN_ON stuff is a bit
> > overly paranoid given what's visible in this patch.  If there is a
> > good reason for that, then add something to the patch description to
> > justify it.
> >    
> > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > 
> > > ---
> > > 
> > > Changes since RFCv2:
> > > - Reword commit message (Dan)
> > > - Move decoder API changes into this patch (Dan)
> > > ---
> > >  drivers/cxl/core/bus.c | 59 +++++++++++++++++++++++++++++++-----------
> > >  drivers/cxl/cxl.h      |  1 +
> > >  2 files changed, 45 insertions(+), 15 deletions(-)
> > > 
> > > diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> > > index 16b15f54fb62..cd6fe7823c69 100644
> > > --- a/drivers/cxl/core/bus.c
> > > +++ b/drivers/cxl/core/bus.c
> > > @@ -487,28 +487,22 @@ static int decoder_populate_targets(struct cxl_decoder *cxld,
> > >  {
> > >  	int rc = 0, i;
> > >  
> > > +	device_lock_assert(&port->dev);
> > > +
> > >  	if (!target_map)
> > >  		return 0;
> > >  
> > > -	device_lock(&port->dev);
> > > -	if (list_empty(&port->dports)) {
> > > -		rc = -EINVAL;
> > > -		goto out_unlock;
> > > -	}
> > > +	if (list_empty(&port->dports))
> > > +		return -EINVAL;
> > >  
> > >  	for (i = 0; i < cxld->nr_targets; i++) {
> > >  		struct cxl_dport *dport = find_dport(port, target_map[i]);
> > >  
> > > -		if (!dport) {
> > > -			rc = -ENXIO;
> > > -			goto out_unlock;
> > > -		}
> > > +		if (!dport)
> > > +			return -ENXIO;
> > >  		cxld->target[i] = dport;
> > >  	}
> > >  
> > > -out_unlock:
> > > -	device_unlock(&port->dev);
> > > -
> > >  	return rc;
> > >  }
> > >  
> > > @@ -571,7 +565,7 @@ struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
> > >  EXPORT_SYMBOL_NS_GPL(cxl_decoder_alloc, CXL);
> > >  
> > >  /**
> > > - * cxl_decoder_add - Add a decoder with targets
> > > + * cxl_decoder_add_locked - Add a decoder with targets
> > >   * @cxld: The cxl decoder allocated by cxl_decoder_alloc()
> > >   * @target_map: A list of downstream ports that this decoder can direct memory
> > >   *              traffic to. These numbers should correspond with the port number
> > > @@ -581,12 +575,14 @@ EXPORT_SYMBOL_NS_GPL(cxl_decoder_alloc, CXL);
> > >   * is an endpoint device. A more awkward example is a hostbridge whose root
> > >   * ports get hot added (technically possible, though unlikely).
> > >   *
> > > - * Context: Process context. Takes and releases the cxld's device lock.
> > > + * This is the locked variant of cxl_decoder_add().
> > > + *
> > > + * Context: Process context. Expects the cxld's device lock to be held.
> > >   *
> > >   * Return: Negative error code if the decoder wasn't properly configured; else
> > >   *	   returns 0.
> > >   */
> > > -int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
> > > +int cxl_decoder_add_locked(struct cxl_decoder *cxld, int *target_map)
> > >  {
> > >  	struct cxl_port *port;
> > >  	struct device *dev;
> > > @@ -619,6 +615,39 @@ int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
> > >  
> > >  	return device_add(dev);
> > >  }
> > > +EXPORT_SYMBOL_NS_GPL(cxl_decoder_add_locked, CXL);
> > > +
> > > +/**
> > > + * cxl_decoder_add - Add a decoder with targets
> > > + * @cxld: The cxl decoder allocated by cxl_decoder_alloc()
> > > + * @target_map: A list of downstream ports that this decoder can direct memory
> > > + *              traffic to. These numbers should correspond with the port number
> > > + *              in the PCIe Link Capabilities structure.
> > > + *
> > > + * This is the unlocked variant of cxl_decoder_add_locked().
> > > + * See cxl_decoder_add_locked().
> > > + *
> > > + * Context: Process context. Takes and releases the cxld's device lock.
> > > + */
> > > +int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
> > > +{
> > > +	struct cxl_port *port;
> > > +	int rc;
> > > +
> > > +	if (WARN_ON_ONCE(!cxld))
> > > +		return -EINVAL;  
> > 
> > Why do we now need these protections but didn't before?  
> 
> I don't quite understand what you're trying to point out.
> 
> Prior to this patch, cxl_decoder_add() checks:
> - !cxld
> - IS_ERR(cxld)
> - cxld->interleave_ways != 0
> 
> After this patch, cxl_decoder_add() checks:
> - !cxld
> - IS_ERR(cxld)
> - (and then calls cxl_decoder_add_locked())
> 
> And cxl_decoder_add_locked() checks:
> - !cxld
> - IS_ERR(cxld)
> - cxld->interleave_ways != 0
> 
> Ultimately we want to check all 3, and since cxl_decoder_add() calls
> cxl_decoder_add_locked(), we're good there. The problem is to get from a cxld to
> a port, you need to make sure you have a valid cxld, and the API previously
> allowed !cxld and IS_ERR(cxld). So there are duplicative checks if you call
> cxl_decoder_add(), but other than that I don't see any new protections.

Ah. It was the duplication that I didn't follow.

Fair enough

J
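
For anyone else reading along, the shape being discussed can be sketched in
userspace as below. This is only an illustration under assumptions: the
_sketch names are hypothetical, a pthread mutex stands in for device_lock(),
and the validity check is reduced to "interleave_ways is non-zero". The point
is that the unlocked entry point has to validate the decoder before it can
reach the parent's lock, so the pointer checks appear in both functions.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's cxl_port / cxl_decoder. */
struct port_sketch {
	pthread_mutex_t lock;
	int nr_decoders;
};

struct decoder_sketch {
	struct port_sketch *parent;
	int interleave_ways;
};

/* Caller must already hold d->parent->lock. Performs the full checks. */
static int decoder_add_locked_sketch(struct decoder_sketch *d)
{
	if (!d || !d->parent)
		return -EINVAL;
	if (d->interleave_ways < 1)
		return -EINVAL;

	d->parent->nr_decoders++;
	return 0;
}

/* Takes and releases the parent's lock around the locked variant. */
static int decoder_add_sketch(struct decoder_sketch *d)
{
	int rc;

	/* Duplicated on purpose: needed before d->parent can be dereferenced. */
	if (!d || !d->parent)
		return -EINVAL;

	pthread_mutex_lock(&d->parent->lock);
	rc = decoder_add_locked_sketch(d);
	pthread_mutex_unlock(&d->parent->lock);

	return rc;
}
```

A caller that already holds the lock, such as a probe routine, would use the
_locked variant directly; everyone else goes through the wrapper.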
> 
> > 
> >   
> > > +
> > > +	if (WARN_ON_ONCE(IS_ERR(cxld)))
> > > +		return PTR_ERR(cxld);
> > > +
> > > +	port = to_cxl_port(cxld->dev.parent);
> > > +
> > > +	device_lock(&port->dev);
> > > +	rc = cxl_decoder_add_locked(cxld, target_map);
> > > +	device_unlock(&port->dev);
> > > +
> > > +	return rc;
> > > +}
> > >  EXPORT_SYMBOL_NS_GPL(cxl_decoder_add, CXL);
> > >  
> > >  static void cxld_unregister(void *dev)
> > > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > > index b66ed8f241c6..2c5627fa8a34 100644
> > > --- a/drivers/cxl/cxl.h
> > > +++ b/drivers/cxl/cxl.h
> > > @@ -290,6 +290,7 @@ struct cxl_decoder *to_cxl_decoder(struct device *dev);
> > >  bool is_root_decoder(struct device *dev);
> > >  struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
> > >  				      unsigned int nr_targets);
> > > +int cxl_decoder_add_locked(struct cxl_decoder *cxld, int *target_map);
> > >  int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map);
> > >  int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld);
> > >    
> >   



* Re: [PATCH 18/23] cxl/pci: Implement wait for media active
  2021-11-22 22:57     ` Ben Widawsky
@ 2021-11-23 11:09       ` Jonathan Cameron
  2021-11-23 16:04         ` Ben Widawsky
  0 siblings, 1 reply; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-23 11:09 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On Mon, 22 Nov 2021 14:57:51 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> On 21-11-22 17:03:35, Jonathan Cameron wrote:
> > On Fri, 19 Nov 2021 16:02:45 -0800
> > Ben Widawsky <ben.widawsky@intel.com> wrote:
> >   
> > > CXL 2.0 8.1.3.8.2 defines "Memory_Active: When set, indicates that the
> > > CXL Range 1 memory is fully initialized and available for software use.
> > > Must be set within Range 1. Memory_Active_Timeout of deassertion of  
> > 
> > Range 1?
> >   
> 
> Not my numbering... It's the first DVSEC range.

Ah, got it. Maybe Range 1: Memory Active timeout ?

> 
> > > reset to CXL device if CXL.mem HwInit Mode=1" The CXL* Type 3 Memory
> > > Device Software Guide (Revision 1.0) further describes the need to check
> > > this bit before using HDM.
> > > 
> > > Unfortunately, Memory_Active can take quite a long time depending on
> > > media size (up to 256s per 2.0 spec). Since the cxl_pci driver doesn't
> > > care about this, a callback is exported as part of driver state for use
> > > by drivers that do care.
> > > 
> > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>  
> > 
> > Same thing about size not being used...
> >   
> 
> Yep, got it.
> 
> > > ---
> > > This patch did not exist in RFCv2
> > > ---
> > >  drivers/cxl/cxlmem.h |  1 +
> > >  drivers/cxl/pci.c    | 56 ++++++++++++++++++++++++++++++++++++++++++++
> > >  2 files changed, 57 insertions(+)
> > > 
> > > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > > index eac5528ccaae..a9424dd4e5c3 100644
> > > --- a/drivers/cxl/cxlmem.h
> > > +++ b/drivers/cxl/cxlmem.h
> > > @@ -167,6 +167,7 @@ struct cxl_dev_state {
> > >  	struct cxl_endpoint_dvsec_info *info;
> > >  
> > >  	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
> > > +	int (*wait_media_ready)(struct cxl_dev_state *cxlds);
> > >  };
> > >  
> > >  enum cxl_opcode {
> > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > index b3f46045bf3e..f1a68bfe5f77 100644
> > > --- a/drivers/cxl/pci.c
> > > +++ b/drivers/cxl/pci.c
> > > @@ -496,6 +496,60 @@ static int wait_for_valid(struct cxl_dev_state *cxlds)
> > >  	return valid ? 0 : -ETIMEDOUT;
> > >  }
> > >  
> > > +/*
> > > + * Implements Figure 43 of the CXL Type 3 Memory Device Software Guide. Waits a
> > > + * full 256s no matter what the device reports.
> > > + */
> > > +static int wait_for_media_ready(struct cxl_dev_state *cxlds)
> > > +{
> > > +	const unsigned long timeout = jiffies + (256 * HZ);
> > > +	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
> > > +	u64 md_status;
> > > +	bool active;
> > > +	int rc;
> > > +
> > > +	rc = wait_for_valid(cxlds);
> > > +	if (rc)
> > > +		return rc;
> > > +
> > > +	do {
> > > +		u64 size;
> > > +		u32 temp;
> > > +		int rc;
> > > +
> > > +		rc = pci_read_config_dword(pdev, CDPDR(cxlds, 0, SIZE, HIGH),
> > > +					   &temp);
> > > +		if (rc)
> > > +			return -ENXIO;
> > > +		size = (u64)temp << 32;
> > > +
> > > +		rc = pci_read_config_dword(pdev, CDPDR(cxlds, 0, SIZE, LOW),
> > > +					   &temp);
> > > +		if (rc)
> > > +			return -ENXIO;
> > > +		size |= temp & CXL_DVSEC_PCIE_DEVICE_MEM_SIZE_LOW_MASK;
> > > +
> > > +		active = FIELD_GET(CXL_DVSEC_PCIE_DEVICE_MEM_ACTIVE, temp);  
> > 
> > Only need to read the register to get active for this particular functionality.
> >   
> > > +		if (active)
> > > +			break;
> > > +		cpu_relax();
> > > +		mdelay(100);
> > > +	} while (!time_after(jiffies, timeout));
> > > +
> > > +	if (!active)
> > > +		return -ETIMEDOUT;
> > > +
> > > +	rc = check_device_status(cxlds);
> > > +	if (rc)
> > > +		return rc;
> > > +
> > > +	md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
> > > +	if (!CXLMDEV_READY(md_status))
> > > +		return -EIO;
> > > +
> > > +	return 0;
> > > +}
> > > +
> > >  static struct cxl_endpoint_dvsec_info *dvsec_ranges(struct cxl_dev_state *cxlds)
> > >  {
> > >  	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
> > > @@ -598,6 +652,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > >  	if (!cxlds->device_dvsec)
> > >  		dev_warn(&pdev->dev,
> > >  			 "Device DVSEC not present. Expect limited functionality.\n");
> > > +	else
> > > +		cxlds->wait_media_ready = wait_for_media_ready;
> > >  
> > >  	rc = cxl_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
> > >  	if (rc)  
> >   



* Re: [PATCH 20/23] cxl/port: Introduce a port driver
  2021-11-22 23:38     ` Ben Widawsky
@ 2021-11-23 11:38       ` Jonathan Cameron
  2021-11-23 16:14         ` Ben Widawsky
  0 siblings, 1 reply; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-23 11:38 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On Mon, 22 Nov 2021 15:38:20 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

...

> > > +static int enumerate_hdm_decoders(struct cxl_port *port,
> > > +				  struct cxl_port_data *portdata)
> > > +{
> > > +	void __iomem *hdm_decoder = portdata->regs.hdm_decoder;
> > > +	bool global_enable;
> > > +	u32 global_ctrl;
> > > +	int i = 0;
> > > +
> > > +	global_ctrl = readl(hdm_decoder + CXL_HDM_DECODER_CTRL_OFFSET);
> > > +	global_enable = global_ctrl & CXL_HDM_DECODER_ENABLE;
> > > +	if (!global_enable) {
> > > +		i = dvsec_range_used(port);
> > > +		if (i) {
> > > +			dev_err(&port->dev,
> > > +				"Couldn't add port because device is using DVSEC range registers\n");
> > > +			return -EBUSY;
> > > +		}
> > > +	}
> > > +
> > > +	for (i = 0; i < portdata->caps.count; i++) {
> > > +		int rc, target_count = portdata->caps.tc;
> > > +		struct cxl_decoder *cxld;
> > > +		int *target_map = NULL;
> > > +		u64 size;
> > > +
> > > +		if (is_endpoint_port(port))
> > > +			target_count = 0;
> > > +
> > > +		cxld = cxl_decoder_alloc(port, target_count);
> > > +		if (IS_ERR(cxld)) {
> > > +			dev_warn(&port->dev,
> > > +				 "Failed to allocate the decoder\n");
> > > +			return PTR_ERR(cxld);
> > > +		}
> > > +
> > > +		cxld->target_type = CXL_DECODER_EXPANDER;
> > > +		cxld->interleave_ways = 1;
> > > +		cxld->interleave_granularity = 0;
> > > +
> > > +		size = get_decoder_size(hdm_decoder, i);
> > > +		if (size != 0) {
> > > +#define decoderN(reg, n) hdm_decoder + CXL_HDM_DECODER0_##reg(n)  
> > 
> > Perhaps this block in the if (size != 0) would be more readable if broken out
> > to a utility function?  
> 
> I don't get this comment, there is already get_decoder_size(). Can you please
> elaborate?

Sure.  Just talking about having something like

		if (size != 0)
			init_decoder() // or something better named

as an alternative to this deep nesting. 
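
To be a bit more concrete, a rough userspace sketch of that shape follows. All
names ending in _sketch are made up for illustration, the register contents
are modelled as an already-read snapshot rather than MMIO, and the struct
layouts are assumptions. The helper fills the decoder from the snapshot, and
the target map buffer is owned by the caller so it stays valid after the
helper returns.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_INTERLEAVE_SKETCH 8

/* Hypothetical, simplified decoder state; not struct cxl_decoder. */
struct decoder_sketch {
	uint64_t start;
	uint64_t end;
	int interleave_ways;
	int enabled;
};

/* Hypothetical snapshot of one decoder's registers, already read. */
struct decoder_regs_sketch {
	uint64_t base;
	uint64_t size;
	int ways;
	unsigned char target_id[MAX_INTERLEAVE_SKETCH];
};

/* Fill @d and the caller-owned @target_map from a programmed decoder. */
static void init_decoder_sketch(struct decoder_sketch *d,
				const struct decoder_regs_sketch *regs,
				int *target_map)
{
	int j;

	d->start = regs->base;
	d->end = regs->base + regs->size - 1;
	d->interleave_ways = regs->ways;
	d->enabled = 1;

	for (j = 0; j < regs->ways && j < MAX_INTERLEAVE_SKETCH; j++)
		target_map[j] = regs->target_id[j];
}

/*
 * Caller side: defaults first, then one guarded call instead of a nested
 * block. Returns the target map to use, or NULL for an unprogrammed decoder.
 */
static int *setup_decoder_sketch(struct decoder_sketch *d,
				 const struct decoder_regs_sketch *regs,
				 int *target_buf)
{
	d->start = 0;
	d->end = 0;
	d->interleave_ways = 1;
	d->enabled = 0;

	if (regs->size == 0)
		return NULL;

	init_decoder_sketch(d, regs, target_buf);
	return target_buf;
}
```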

> 
> > As normal I'm not keen on the macro magic if we can avoid it.
> >   
> 
> Yeah - just trying to not have to deal with wrapping long lines.
> 
> >   
> > > +			int temp[CXL_DECODER_MAX_INTERLEAVE];
> > > +			u64 base;
> > > +			u32 ctrl;
> > > +			int j;
> > > +			union {
> > > +				u64 value;
> > > +				char target_id[8];
> > > +			} target_list;  
> > 
> > I thought we tried to avoid this union usage in kernel because of the whole
> > thing about c compilers being able to do what they like with it...
> >   
> 
> I wasn't aware of the restriction. I can change it back if it's required. It
> does look a lot nicer this way. Is there a reference to this issue somewhere?

Hmm. Seems I was out of date on this.  There is a mess in the c99 standard that
contradicts itself on whether you can do this or not.

https://davmac.wordpress.com/2010/02/26/c99-revisited/

The pull request for a patch from Andy got a Linus response...

https://lore.kernel.org/all/CAJZ5v0jq45atkapwSjJ4DkHhB1bfOA-Sh1TiA3dPXwKyFTBheA@mail.gmail.com/
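
For comparison, the shift-based alternative would look something like the
sketch below. The function name is hypothetical, and it assumes target j lives
in byte j of the 64-bit target list value, lowest byte first, which is what
the union gives you on a little-endian host. Unlike the union, the shifts do
not depend on host byte order.

```c
#include <stdint.h>

/*
 * Unpack up to 8 one-byte target IDs from a 64-bit target list value.
 * Assumes target j occupies bits [8*j + 7 : 8*j].
 */
static void unpack_target_list_sketch(uint64_t value, int ways,
				      int *target_map)
{
	int j;

	for (j = 0; j < ways && j < 8; j++)
		target_map[j] = (int)((value >> (8 * j)) & 0xff);
}
```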


> 
> > > +
> > > +			target_map = temp;  
> > 
> > This is set to a variable that goes out of scope whilst target_map is still in
> > use.
> >   
> 
> Yikes. I'm pretty surprised the compiler didn't catch this.
> 
> > > +			ctrl = readl(decoderN(CTRL_OFFSET, i));
> > > +			base = ioread64_hi_lo(decoderN(BASE_LOW_OFFSET, i));
> > > +
> > > +			cxld->decoder_range = (struct range){
> > > +				.start = base,
> > > +				.end = base + size - 1
> > > +			};
> > > +
> > > +			cxld->flags = CXL_DECODER_F_ENABLE;
> > > +			cxld->interleave_ways = to_interleave_ways(ctrl);
> > > +			cxld->interleave_granularity =
> > > +				to_interleave_granularity(ctrl);
> > > +
> > > +			if (FIELD_GET(CXL_HDM_DECODER0_CTRL_TYPE, ctrl) == 0)
> > > +				cxld->target_type = CXL_DECODER_ACCELERATOR;
> > > +
> > > +			target_list.value = ioread64_hi_lo(decoderN(TL_LOW, i));
> > > +			for (j = 0; j < cxld->interleave_ways; j++)
> > > +				target_map[j] = target_list.target_id[j];
> > > +#undef decoderN
> > > +		}
> > > +
> > > +		rc = cxl_decoder_add_locked(cxld, target_map);
> > > +		if (rc)
> > > +			put_device(&cxld->dev);
> > > +		else
> > > +			rc = cxl_decoder_autoremove(&port->dev, cxld);
> > > +		if (rc)
> > > +			dev_err(&port->dev, "Failed to add decoder\n");  
> > 
> > If that fails on the autoremove registration (unlikely) this message
> > will be rather confusing - as the add was fine...
> > 
> > This nest of carrying on when we have an error is getting ever deeper...
> >   
> 
> Yeah, this is not great. I will clean it up.
> 
> Thanks.
> 
> > > +		else
> > > +			dev_dbg(&cxld->dev, "Added to port %s\n",
> > > +				dev_name(&port->dev));
> > > +	}
> > > +
> > > +	/*
> > > +	 * Turn on global enable now since DVSEC ranges aren't being used and
> > > +	 * we'll eventually want the decoder enabled.
> > > +	 */
> > > +	if (!global_enable) {
> > > +		dev_dbg(&port->dev, "Enabling HDM decode\n");
> > > +		writel(global_ctrl | CXL_HDM_DECODER_ENABLE, hdm_decoder + CXL_HDM_DECODER_CTRL_OFFSET);
> > > +	}
> > > +
> > > +	return 0;
> > > +}
> > > +  



* Re: [PATCH 18/23] cxl/pci: Implement wait for media active
  2021-11-23 11:09       ` Jonathan Cameron
@ 2021-11-23 16:04         ` Ben Widawsky
  2021-11-23 17:48           ` Bjorn Helgaas
  0 siblings, 1 reply; 133+ messages in thread
From: Ben Widawsky @ 2021-11-23 16:04 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On 21-11-23 11:09:34, Jonathan Cameron wrote:
> On Mon, 22 Nov 2021 14:57:51 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > On 21-11-22 17:03:35, Jonathan Cameron wrote:
> > > On Fri, 19 Nov 2021 16:02:45 -0800
> > > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >   
> > > > CXL 2.0 8.1.3.8.2 defines "Memory_Active: When set, indicates that the
> > > > CXL Range 1 memory is fully initialized and available for software use.
> > > > Must be set within Range 1. Memory_Active_Timeout of deassertion of  
> > > 
> > > Range 1?
> > >   
> > 
> > Not my numbering... It's the first DVSEC range.
> 
> Ah, got it. Maybe Range 1: Memory Active timeout ?

I can, but this is just quoted from the spec. Would this be better:

The CXL Type 3 Memory Device Software Guide (Revision 1.0) describes the
need to check media active before using HDM. CXL 2.0 8.1.3.8.2 states:

  Memory_Active: When set, indicates that the CXL Range 1 memory is
  fully initialized and available for software use. Must be set within
  Range 1. Memory_Active_Timeout of deassertion of reset to CXL device
  if CXL.mem HwInit Mode=1

Unfortunately, Memory_Active can take quite a long time depending on
media size (up to 256s per 2.0 spec). Since the cxl_pci driver doesn't
care about this, a callback is exported as part of driver state for use
by drivers that do care.
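
With the unused size read dropped, the wait itself reduces to polling one
register until the active bit is set or the timeout expires. A minimal
userspace sketch of that loop is below, under stated assumptions: all _sketch
names are hypothetical, the bit position of Memory_Active is assumed, and the
register access and the delay are passed in as callbacks so the logic can be
exercised against a simulated device instead of PCI config space.

```c
#include <errno.h>
#include <stdint.h>

#define MEM_ACTIVE_SKETCH (1u << 1)	/* assumed bit position */

/* Hypothetical accessors: read returns 0 on success and fills *val. */
typedef int (*read_size_low_fn)(void *ctx, uint32_t *val);
typedef void (*sleep_ms_fn)(void *ctx, unsigned int ms);

/* Poll until the active bit is set, a read fails, or timeout_ms elapses. */
static int wait_media_active_sketch(read_size_low_fn read_reg,
				    sleep_ms_fn sleep_ms, void *ctx,
				    unsigned int timeout_ms,
				    unsigned int poll_ms)
{
	unsigned int waited = 0;
	uint32_t val;

	for (;;) {
		if (read_reg(ctx, &val))
			return -ENXIO;
		if (val & MEM_ACTIVE_SKETCH)
			return 0;
		if (waited >= timeout_ms)
			return -ETIMEDOUT;
		sleep_ms(ctx, poll_ms);
		waited += poll_ms;
	}
}

/* Simulated device: reports active once now_ms reaches ready_after_ms. */
struct fake_dev_sketch {
	unsigned int now_ms;
	unsigned int ready_after_ms;
};

static int fake_read_sketch(void *ctx, uint32_t *val)
{
	struct fake_dev_sketch *dev = ctx;

	*val = dev->now_ms >= dev->ready_after_ms ? MEM_ACTIVE_SKETCH : 0;
	return 0;
}

/* Advances simulated time instead of actually sleeping. */
static void fake_sleep_sketch(void *ctx, unsigned int ms)
{
	struct fake_dev_sketch *dev = ctx;

	dev->now_ms += ms;
}
```

In the driver the timeout would be the 256s from the spec and the poll
interval the 100ms used in the patch; the simulated clock here only exists to
keep the sketch self-contained.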

> 
> > 
> > > > reset to CXL device if CXL.mem HwInit Mode=1" The CXL* Type 3 Memory
> > > > Device Software Guide (Revision 1.0) further describes the need to check
> > > > this bit before using HDM.
> > > > 
> > > > Unfortunately, Memory_Active can take quite a long time depending on
> > > > media size (up to 256s per 2.0 spec). Since the cxl_pci driver doesn't
> > > > care about this, a callback is exported as part of driver state for use
> > > > by drivers that do care.
> > > > 
> > > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>  
> > > 
> > > Same thing about size not being used...
> > >   
> > 
> > Yep, got it.
> > 
> > > > ---
> > > > This patch did not exist in RFCv2
> > > > ---
> > > >  drivers/cxl/cxlmem.h |  1 +
> > > >  drivers/cxl/pci.c    | 56 ++++++++++++++++++++++++++++++++++++++++++++
> > > >  2 files changed, 57 insertions(+)
> > > > 
> > > > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > > > index eac5528ccaae..a9424dd4e5c3 100644
> > > > --- a/drivers/cxl/cxlmem.h
> > > > +++ b/drivers/cxl/cxlmem.h
> > > > @@ -167,6 +167,7 @@ struct cxl_dev_state {
> > > >  	struct cxl_endpoint_dvsec_info *info;
> > > >  
> > > >  	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
> > > > +	int (*wait_media_ready)(struct cxl_dev_state *cxlds);
> > > >  };
> > > >  
> > > >  enum cxl_opcode {
> > > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > > index b3f46045bf3e..f1a68bfe5f77 100644
> > > > --- a/drivers/cxl/pci.c
> > > > +++ b/drivers/cxl/pci.c
> > > > @@ -496,6 +496,60 @@ static int wait_for_valid(struct cxl_dev_state *cxlds)
> > > >  	return valid ? 0 : -ETIMEDOUT;
> > > >  }
> > > >  
> > > > +/*
> > > > + * Implements Figure 43 of the CXL Type 3 Memory Device Software Guide. Waits a
> > > > + * full 256s no matter what the device reports.
> > > > + */
> > > > +static int wait_for_media_ready(struct cxl_dev_state *cxlds)
> > > > +{
> > > > +	const unsigned long timeout = jiffies + (256 * HZ);
> > > > +	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
> > > > +	u64 md_status;
> > > > +	bool active;
> > > > +	int rc;
> > > > +
> > > > +	rc = wait_for_valid(cxlds);
> > > > +	if (rc)
> > > > +		return rc;
> > > > +
> > > > +	do {
> > > > +		u64 size;
> > > > +		u32 temp;
> > > > +		int rc;
> > > > +
> > > > +		rc = pci_read_config_dword(pdev, CDPDR(cxlds, 0, SIZE, HIGH),
> > > > +					   &temp);
> > > > +		if (rc)
> > > > +			return -ENXIO;
> > > > +		size = (u64)temp << 32;
> > > > +
> > > > +		rc = pci_read_config_dword(pdev, CDPDR(cxlds, 0, SIZE, LOW),
> > > > +					   &temp);
> > > > +		if (rc)
> > > > +			return -ENXIO;
> > > > +		size |= temp & CXL_DVSEC_PCIE_DEVICE_MEM_SIZE_LOW_MASK;
> > > > +
> > > > +		active = FIELD_GET(CXL_DVSEC_PCIE_DEVICE_MEM_ACTIVE, temp);  
> > > 
> > > Only need to read the register to get active for this particular functionality.
> > >   
> > > > +		if (active)
> > > > +			break;
> > > > +		cpu_relax();
> > > > +		mdelay(100);
> > > > +	} while (!time_after(jiffies, timeout));
> > > > +
> > > > +	if (!active)
> > > > +		return -ETIMEDOUT;
> > > > +
> > > > +	rc = check_device_status(cxlds);
> > > > +	if (rc)
> > > > +		return rc;
> > > > +
> > > > +	md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
> > > > +	if (!CXLMDEV_READY(md_status))
> > > > +		return -EIO;
> > > > +
> > > > +	return 0;
> > > > +}
> > > > +
> > > >  static struct cxl_endpoint_dvsec_info *dvsec_ranges(struct cxl_dev_state *cxlds)
> > > >  {
> > > >  	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
> > > > @@ -598,6 +652,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > > >  	if (!cxlds->device_dvsec)
> > > >  		dev_warn(&pdev->dev,
> > > >  			 "Device DVSEC not present. Expect limited functionality.\n");
> > > > +	else
> > > > +		cxlds->wait_media_ready = wait_for_media_ready;
> > > >  
> > > >  	rc = cxl_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
> > > >  	if (rc)  
> > >   
> 


* Re: [PATCH 20/23] cxl/port: Introduce a port driver
  2021-11-23 11:38       ` Jonathan Cameron
@ 2021-11-23 16:14         ` Ben Widawsky
  0 siblings, 0 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-23 16:14 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On 21-11-23 11:38:23, Jonathan Cameron wrote:
> On Mon, 22 Nov 2021 15:38:20 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> ...
> 
> > > > +static int enumerate_hdm_decoders(struct cxl_port *port,
> > > > +				  struct cxl_port_data *portdata)
> > > > +{
> > > > +	void __iomem *hdm_decoder = portdata->regs.hdm_decoder;
> > > > +	bool global_enable;
> > > > +	u32 global_ctrl;
> > > > +	int i = 0;
> > > > +
> > > > +	global_ctrl = readl(hdm_decoder + CXL_HDM_DECODER_CTRL_OFFSET);
> > > > +	global_enable = global_ctrl & CXL_HDM_DECODER_ENABLE;
> > > > +	if (!global_enable) {
> > > > +		i = dvsec_range_used(port);
> > > > +		if (i) {
> > > > +			dev_err(&port->dev,
> > > > +				"Couldn't add port because device is using DVSEC range registers\n");
> > > > +			return -EBUSY;
> > > > +		}
> > > > +	}
> > > > +
> > > > +	for (i = 0; i < portdata->caps.count; i++) {
> > > > +		int rc, target_count = portdata->caps.tc;
> > > > +		struct cxl_decoder *cxld;
> > > > +		int *target_map = NULL;
> > > > +		u64 size;
> > > > +
> > > > +		if (is_endpoint_port(port))
> > > > +			target_count = 0;
> > > > +
> > > > +		cxld = cxl_decoder_alloc(port, target_count);
> > > > +		if (IS_ERR(cxld)) {
> > > > +			dev_warn(&port->dev,
> > > > +				 "Failed to allocate the decoder\n");
> > > > +			return PTR_ERR(cxld);
> > > > +		}
> > > > +
> > > > +		cxld->target_type = CXL_DECODER_EXPANDER;
> > > > +		cxld->interleave_ways = 1;
> > > > +		cxld->interleave_granularity = 0;
> > > > +
> > > > +		size = get_decoder_size(hdm_decoder, i);
> > > > +		if (size != 0) {
> > > > +#define decoderN(reg, n) hdm_decoder + CXL_HDM_DECODER0_##reg(n)  
> > > 
> > > Perhaps this block in the if (size != 0) would be more readable if broken out
> > > to a utility function?  
> > 
> > I don't get this comment, there is already get_decoder_size(). Can you please
> > elaborate?
> 
> Sure.  Just talking about having something like
> 
> 		if (size != 0)
> 			init_decoder() // or something better named
> 
> as an alternative to this deep nesting. 
> 

Sounds good. I can combine it with the similar initialization done in cxl_acpi.

> > 
> > > As normal I'm not keen on the macro magic if we can avoid it.
> > >   
> > 
> > Yeah - just trying to not have to deal with wrapping long lines.
> > 
> > >   
> > > > +			int temp[CXL_DECODER_MAX_INTERLEAVE];
> > > > +			u64 base;
> > > > +			u32 ctrl;
> > > > +			int j;
> > > > +			union {
> > > > +				u64 value;
> > > > +				char target_id[8];
> > > > +			} target_list;  
> > > 
> > > I thought we tried to avoid this union usage in kernel because of the whole
> > > thing about c compilers being able to do what they like with it...
> > >   
> > 
> > I wasn't aware of the restriction. I can change it back if it's required. It
> > does look a lot nicer this way. Is there a reference to this issue somewhere?
> 
> Hmm. Seems I was out of date on this.  There is a mess in the c99 standard that
> contradicts itself on whether you can do this or not.
> 
> https://davmac.wordpress.com/2010/02/26/c99-revisited/

Thanks for the link.

> 
> The pull request for a patch from Andy got a Linus response...
> 
> https://lore.kernel.org/all/CAJZ5v0jq45atkapwSjJ4DkHhB1bfOA-Sh1TiA3dPXwKyFTBheA@mail.gmail.com/
> 

That was a fun read :-)

I'll defer to Dan on this. This was actually his code that he suggested in
review of the RFC.
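
For reference, a minimal userspace sketch of a union-free way to pull the
target IDs out of the 64-bit target list. It assumes one 8-bit target ID per
byte with the lowest byte first, which is what the union in the patch reads on
a little-endian host. The helper name unpack_target_ids() and MAX_TARGETS are
hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_TARGETS 8 /* one 8-bit target ID per byte of the 64-bit list */

/*
 * Hypothetical helper, not from the patch: extract target IDs with shifts
 * instead of reading them back through a union. The result does not depend
 * on host byte order or on how the compiler treats union type punning.
 */
static void unpack_target_ids(uint64_t target_list, int ways, int *target_map)
{
	assert(ways >= 0 && ways <= MAX_TARGETS);

	for (int j = 0; j < ways; j++)
		target_map[j] = (int)((target_list >> (j * 8)) & 0xff);
}
```

One visible difference from the union in the patch: target_id[] there is plain
char, so on ABIs where char is signed an ID above 0x7f would be sign-extended
when copied into the int map. The shift-and-mask form always yields 0..255.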

> 
> > 
> > > > +
> > > > +			target_map = temp;  
> > > 
> > > This is set to a variable that goes out of scope whilst target_map is still in
> > > use.
> > >   
> > 
> > Yikes. I'm pretty surprised the compiler didn't catch this.
> > 
> > > > +			ctrl = readl(decoderN(CTRL_OFFSET, i));
> > > > +			base = ioread64_hi_lo(decoderN(BASE_LOW_OFFSET, i));
> > > > +
> > > > +			cxld->decoder_range = (struct range){
> > > > +				.start = base,
> > > > +				.end = base + size - 1
> > > > +			};
> > > > +
> > > > +			cxld->flags = CXL_DECODER_F_ENABLE;
> > > > +			cxld->interleave_ways = to_interleave_ways(ctrl);
> > > > +			cxld->interleave_granularity =
> > > > +				to_interleave_granularity(ctrl);
> > > > +
> > > > +			if (FIELD_GET(CXL_HDM_DECODER0_CTRL_TYPE, ctrl) == 0)
> > > > +				cxld->target_type = CXL_DECODER_ACCELERATOR;
> > > > +
> > > > +			target_list.value = ioread64_hi_lo(decoderN(TL_LOW, i));
> > > > +			for (j = 0; j < cxld->interleave_ways; j++)
> > > > +				target_map[j] = target_list.target_id[j];
> > > > +#undef decoderN
> > > > +		}
> > > > +
> > > > +		rc = cxl_decoder_add_locked(cxld, target_map);
> > > > +		if (rc)
> > > > +			put_device(&cxld->dev);
> > > > +		else
> > > > +			rc = cxl_decoder_autoremove(&port->dev, cxld);
> > > > +		if (rc)
> > > > +			dev_err(&port->dev, "Failed to add decoder\n");  
> > > 
> > > If that fails on the autoremove registration (unlikely) this message
> > > will be rather confusing - as the add was fine...
> > > 
> > > This nest of carrying on when we have an error is getting ever deeper...
> > >   
> > 
> > Yeah, this is not great. I will clean it up.
> > 
> > Thanks.
> > 
> > > > +		else
> > > > +			dev_dbg(&cxld->dev, "Added to port %s\n",
> > > > +				dev_name(&port->dev));
> > > > +	}
> > > > +
> > > > +	/*
> > > > +	 * Turn on global enable now since DVSEC ranges aren't being used and
> > > > +	 * we'll eventually want the decoder enabled.
> > > > +	 */
> > > > +	if (!global_enable) {
> > > > +		dev_dbg(&port->dev, "Enabling HDM decode\n");
> > > > +		writel(global_ctrl | CXL_HDM_DECODER_ENABLE, hdm_decoder + CXL_HDM_DECODER_CTRL_OFFSET);
> > > > +	}
> > > > +
> > > > +	return 0;
> > > > +}
> > > > +  
> 


* Re: [PATCH 18/23] cxl/pci: Implement wait for media active
  2021-11-23 16:04         ` Ben Widawsky
@ 2021-11-23 17:48           ` Bjorn Helgaas
  2021-11-23 19:37             ` Ben Widawsky
  0 siblings, 1 reply; 133+ messages in thread
From: Bjorn Helgaas @ 2021-11-23 17:48 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: Jonathan Cameron, linux-cxl, linux-pci, Alison Schofield,
	Dan Williams, Ira Weiny, Vishal Verma

On Tue, Nov 23, 2021 at 08:04:13AM -0800, Ben Widawsky wrote:
> On 21-11-23 11:09:34, Jonathan Cameron wrote:
> > On Mon, 22 Nov 2021 14:57:51 -0800
> > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > 
> > > On 21-11-22 17:03:35, Jonathan Cameron wrote:
> > > > On Fri, 19 Nov 2021 16:02:45 -0800
> > > > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > >   
> > > > > CXL 2.0 8.1.3.8.2 defines "Memory_Active: When set,
> > > > > indicates that the CXL Range 1 memory is fully initialized
> > > > > and available for software use.  Must be set within Range 1.
> > > > > Memory_Active_Timeout of deassertion of  
> > ...
> > Ah, got it. Maybe Range 1: Memory Active timeout ?
> 
> I can, but this is just quoted from the spec. Would this be better:
> 
> The CXL Type 3 Memory Device Software Guide (Revision 1.0) describes the
> need to check media active before using HDM. CXL 2.0 8.1.3.8.2 states:
> 
>   Memory_Active: When set, indicates that the CXL Range 1 memory is
>   fully initialized and available for software use. Must be set within
>   Range 1. Memory_Active_Timeout of deassertion of reset to CXL device
>   if CXL.mem HwInit Mode=1

That is some weird wording.  I stumbled over that, too.  I like the
quote format better, but I still don't know what it means.

That last piece ("Memory_Active_Timeout of deassertion ...") purports
to be a sentence, but is not.


* Re: [PATCH 20/23] cxl/port: Introduce a port driver
  2021-11-20  0:02 ` [PATCH 20/23] cxl/port: Introduce a port driver Ben Widawsky
                     ` (2 preceding siblings ...)
  2021-11-22 17:41   ` Jonathan Cameron
@ 2021-11-23 18:21   ` Bjorn Helgaas
  2021-11-23 22:03     ` Ben Widawsky
  3 siblings, 1 reply; 133+ messages in thread
From: Bjorn Helgaas @ 2021-11-23 18:21 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Nov 19, 2021 at 04:02:47PM -0800, Ben Widawsky wrote:
> The CXL port driver is responsible for managing the decoder resources
> contained within the port. It will also provide APIs that other drivers
> will consume for managing these resources.
> 
> There are 4 types of ports in a system:
> 1. Platform port. This is a non-programmable entity. Such a port is
>    named rootX. It is enumerated by cxl_acpi in an ACPI based system.

Can you mention the ACPI source (static table, namespace PNP ID) here?

> 2. Hostbridge port. 

Is "hostbridge" styled as a single word in the spec?  I've only seen
"host bridge" elsewhere.

>    This ports register access is defined in a platform
>    specific way (CHBS for ACPI platforms). 

s/This ports/This port's/

This doesn't really make sense, though.  Are you saying the register
access *mechanism* is platform specific?  Or merely that the
enumeration method (ACPI table, ACPI namespace, DT, etc) is
platform-specific?

I assume "CHBS" is an ACPI static table?

>    It has n downstream ports,
>    each of which are known as CXL 2.0 root ports.

This sounds like a "host bridge port *contains* these root ports."
That doesn't sound right.

>    Once the platform
>    specific mechanism to get the offset to the registers is obtained it
>    operates just like other CXL components. The enumeration of this
>    component is started by cxl_acpi and completed by cxl_port.

> 3. Switch port. A switch port is similar to a hostbridge port except
>    register access is defined in the CXL specification in a platform
>    agnostic way. The downstream ports for a switch are simply known as
>    downstream ports. The enumeration of these are entirely contained
>    within cxl_port.

In PCIe, "Downstream Port" includes both Root Ports and Switch
Downstream Ports.  It sounds like that's not the case for CXL?

Well, except above you say that a Host Bridge Port has N Downstream
Ports, so I guess "Downstream Port" probably includes both Host Bridge
Ports and Switch Downstream Ports.

Maybe you should just omit the "The downstream ports for a switch are
simply known as downstream ports" sentence.

> 4. Endpoint port. Endpoint ports are similar to switch ports with the
>    exception that they have no downstream ports, only the underlying
>    media on the device. The enumeration of these are started by cxl_pci,
>    and completed by cxl_port.

Does CXL use an "Upstream Port" concept similar to PCIe?  In PCIe,
"Upstream Port" includes both Switch Upstream Ports and the Upstream
Port on an Endpoint.

I hope this driver is not modeled on the PCI portdrv.  IMO, that was a
design error, and the "port service drivers" (PME, hotplug, AER, etc)
should be directly integrated into the PCI core instead of pretending
to be independent drivers.

> --- a/Documentation/driver-api/cxl/memory-devices.rst
> +++ b/Documentation/driver-api/cxl/memory-devices.rst
> @@ -28,6 +28,11 @@ CXL Memory Device
>  .. kernel-doc:: drivers/cxl/pci.c
>     :internal:
>  
> +CXL Port
> +--------
> +.. kernel-doc:: drivers/cxl/port.c
> +   :doc: cxl port

s/cxl port/CXL Port/ ?  I don't know exactly how this gets rendered by
ReST.

>  CXL Core
>  --------
>  .. kernel-doc:: drivers/cxl/cxl.h
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index ef05e96f8f97..3aeb33bba5a3 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -77,4 +77,26 @@ config CXL_PMEM
>  	  provisioning the persistent memory capacity of CXL memory expanders.
>  
>  	  If unsure say 'm'.
> +
> +config CXL_MEM
> +	tristate "CXL.mem: Memory Devices"
> +	select CXL_PORT
> +	depends on CXL_PCI
> +	default CXL_BUS
> +	help
> +	  The CXL.mem protocol allows a device to act as a provider of "System
> +	  RAM" and/or "Persistent Memory" that is fully coherent as if the
> +	  memory was attached to the typical CPU memory controller.  This is
> +	  known as HDM "Host-managed Device Memory".

s/was attached/were attached/

> +	  Say 'y/m' to enable a driver that will attach to CXL.mem devices for
> +	  memory expansion and control of HDM. See Chapter 9.13 in the CXL 2.0
> +	  specification for a detailed description of HDM.
> +
> +	  If unsure say 'm'.
> +
> +

Spurious blank line.

> +config CXL_PORT
> +	tristate
> +
>  endif

> --- /dev/null
> +++ b/drivers/cxl/port.c
> @@ -0,0 +1,323 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
> +#include <linux/device.h>
> +#include <linux/module.h>
> +#include <linux/slab.h>
> +
> +#include "cxlmem.h"
> +
> +/**
> + * DOC: cxl port

s/cxl port/CXL port/ (I'm assuming this should match usage below)
or maybe "CXL Port" both places to match typical PCIe spec usage?

Capitalization is a useful hint that this term is defined by a spec.

> + *
> + * The port driver implements the set of functionality needed to allow full
> + * decoder enumeration and routing. A CXL port is an abstraction of a CXL
> + * component that implements some amount of CXL decoding of CXL.mem traffic.
> + * As of the CXL 2.0 spec, this includes:
> + *
> + *	.. list-table:: CXL Components w/ Ports
> + *		:widths: 25 25 50
> + *		:header-rows: 1
> + *
> + *		* - component
> + *		  - upstream
> + *		  - downstream
> + *		* - Hostbridge

s/Hostbridge/Host bridge/

> + *		  - ACPI0016
> + *		  - root port

s/root port/Root Port/ to match Switch Ports below (and spec usage).

> + *		* - Switch
> + *		  - Switch Upstream Port
> + *		  - Switch Downstream Port
> + *		* - Endpoint
> + *		  - Endpoint Port
> + *		  - N/A

What does "N/A" mean here?  Is it telling us something useful?

> +static void rescan_ports(struct work_struct *work)
> +{
> +	if (bus_rescan_devices(&cxl_bus_type))
> +		pr_warn("Failed to rescan\n");

Needs some context.  A bare "Failed to rescan" in the dmesg log
without a clue about who emitted it is really annoying.

Maybe you defined pr_fmt() somewhere; I couldn't find it easily.

Bjorn


* Re: [PATCH 18/23] cxl/pci: Implement wait for media active
  2021-11-23 17:48           ` Bjorn Helgaas
@ 2021-11-23 19:37             ` Ben Widawsky
  0 siblings, 0 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-23 19:37 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Jonathan Cameron, linux-cxl, linux-pci, Alison Schofield,
	Dan Williams, Ira Weiny, Vishal Verma

On 21-11-23 11:48:53, Bjorn Helgaas wrote:
> On Tue, Nov 23, 2021 at 08:04:13AM -0800, Ben Widawsky wrote:
> > On 21-11-23 11:09:34, Jonathan Cameron wrote:
> > > On Mon, 22 Nov 2021 14:57:51 -0800
> > > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > 
> > > > On 21-11-22 17:03:35, Jonathan Cameron wrote:
> > > > > On Fri, 19 Nov 2021 16:02:45 -0800
> > > > > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > > >   
> > > > > > CXL 2.0 8.1.3.8.2 defines "Memory_Active: When set,
> > > > > > indicates that the CXL Range 1 memory is fully initialized
> > > > > > and available for software use.  Must be set within Range 1.
> > > > > > Memory_Active_Timeout of deassertion of  
> > > ...
> > > Ah, got it. Maybe Range 1: Memory Active timeout ?
> > 
> > I can, but this is just quoted from the spec. Would this be better:
> > 
> > The CXL Type 3 Memory Device Software Guide (Revision 1.0) describes the
> > need to check media active before using HDM. CXL 2.0 8.1.3.8.2 states:
> > 
> >   Memory_Active: When set, indicates that the CXL Range 1 memory is
> >   fully initialized and available for software use. Must be set within
> >   Range 1. Memory_Active_Timeout of deassertion of reset to CXL device
> >   if CXL.mem HwInit Mode=1
> 
> That is some weird wording.  I stumbled over that, too.  I like the
> quote format better, but I still don't know what it means.
> 
> That last piece ("Memory_Active_Timeout of deassertion ...") purports
> to be a sentence, but is not.

I've reported this to the person locally who drives spec changes. Since it has
now confused multiple people, I will rewrite it with my interpretation if people
think that is preferable:

"Memory Active bit must be set within Memory_Active_Timeout amount of time
after reset."
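
Under that interpretation, the wait reduces to a bounded poll. A minimal
userspace sketch follows, assuming a caller-supplied callback that reports
whether Memory_Active is set. The names wait_for_active(), read_active_fn and
fake_read_active() are hypothetical, and max_polls stands in for
Memory_Active_Timeout divided by the polling interval. This is not the code
from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type: returns true once Memory_Active reads as set. */
typedef bool (*read_active_fn)(void *ctx);

/*
 * Poll until the bit is set or the poll budget is used up.
 * Returns 0 on success and -1 on timeout.
 */
static int wait_for_active(read_active_fn read_active, void *ctx,
			   unsigned int max_polls)
{
	assert(read_active != NULL);

	for (unsigned int i = 0; i < max_polls; i++) {
		if (read_active(ctx))
			return 0;
		/* a real driver would sleep for the polling interval here */
	}
	return -1;
}

/* Fake device for illustration: reports active after *ctx further reads. */
static bool fake_read_active(void *ctx)
{
	unsigned int *reads_left = ctx;

	if (*reads_left == 0)
		return true;
	(*reads_left)--;
	return false;
}
```

Only the active bit is needed to decide when to stop polling, which matches
the earlier review comment that the size fields do not need to be re-read on
every iteration.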


* Re: [PATCH 20/23] cxl/port: Introduce a port driver
  2021-11-23 18:21   ` Bjorn Helgaas
@ 2021-11-23 22:03     ` Ben Widawsky
  2021-11-23 22:36       ` Dan Williams
  2021-11-24 21:31       ` Bjorn Helgaas
  0 siblings, 2 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-23 22:03 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-11-23 12:21:28, Bjorn Helgaas wrote:
> On Fri, Nov 19, 2021 at 04:02:47PM -0800, Ben Widawsky wrote:
> > The CXL port driver is responsible for managing the decoder resources
> > contained within the port. It will also provide APIs that other drivers
> > will consume for managing these resources.
> > 
> > There are 4 types of ports in a system:
> > 1. Platform port. This is a non-programmable entity. Such a port is
> >    named rootX. It is enumerated by cxl_acpi in an ACPI based system.
> 
> Can you mention the ACPI source (static table, namespace PNP ID) here?

Done.

> 
> > 2. Hostbridge port. 
> 
> Is "hostbridge" styled as a single word in the spec?  I've only seen
> "host bridge" elsewhere.
> 

Two words. I'm sadly inconsistent with this particular term. The CXL spec
capitalizes it.

> >    This ports register access is defined in a platform
> >    specific way (CHBS for ACPI platforms). 
> 
> s/This ports/This port's/
> 
> This doesn't really make sense, though.  Are you saying the register
> access *mechanism* is platform specific?  Or merely that the
> enumeration method (ACPI table, ACPI namespace, DT, etc) is
> platform-specific?
> 
> I assume "CHBS" is an ACPI static table?
> 

The registers are spec defined. The base address of those registers is defined
in a platform specific manner. Enumeration is a better word. CHBS is an ACPI
static table, yes.

> >    It has n downstream ports,
> >    each of which are known as CXL 2.0 root ports.
> 
> This sounds like a "host bridge port *contains* these root ports."
> That doesn't sound right.
> 

What sounds better? A CXL 2.0 Root Port is CXL capabilities on top of the PCIe
component which has the PCI_EXP_TYPE_ROOT_PORT cap. In my mental model, a host
bridge does contain the root ports. Perhaps I am wrong about that?

> >    Once the platform
> >    specific mechanism to get the offset to the registers is obtained it
> >    operates just like other CXL components. The enumeration of this
> >    component is started by cxl_acpi and completed by cxl_port.
> 
> > 3. Switch port. A switch port is similar to a hostbridge port except
> >    register access is defined in the CXL specification in a platform
> >    agnostic way. The downstream ports for a switch are simply known as
> >    downstream ports. The enumeration of these are entirely contained
> >    within cxl_port.
> 
> In PCIe, "Downstream Port" includes both Root Ports and Switch
> Downstream Ports.  It sounds like that's not the case for CXL?
> 

In CXL 2.0, it's fairly close to true that switch downstream ports and root
ports are equivalent. From today's driver perspective they are equivalent. Root
ports have some capabilities which switch downstream ports do not.

> Well, except above you say that a Host Bridge Port has N Downstream
> Ports, so I guess "Downstream Port" probably includes both Host Bridge
> Ports and Switch Downstream Ports.

Yes, in that case I meant a port which is downstream - confusing indeed.

> 
> Maybe you should just omit the "The downstream ports for a switch are
> simply known as downstream ports" sentence.
> 

Sounds good.

> > 4. Endpoint port. Endpoint ports are similar to switch ports with the
> >    exception that they have no downstream ports, only the underlying
> >    media on the device. The enumeration of these are started by cxl_pci,
> >    and completed by cxl_port.
> 
> Does CXL use an "Upstream Port" concept similar to PCIe?  In PCIe,
> "Upstream Port" includes both Switch Upstream Ports and the Upstream
> Port on an Endpoint.

Not really. Endpoints aren't known as ports in the spec and they have a decent
amount of divergence from upstream ports. The main area where they do converge
is in the memory decoding capabilities. In fact, it might come to a point where
we find adding an endpoint as a port does not make sense, but for now it does.

> 
> I hope this driver is not modeled on the PCI portdrv.  IMO, that was a
> design error, and the "port service drivers" (PME, hotplug, AER, etc)
> should be directly integrated into the PCI core instead of pretending
> to be independent drivers.

I'd like to understand a bit about why you think it was a design error. The port
driver is intended to be a port services driver; however, I see the services
provided as quite different from the ones you mention. The primary service
cxl_port will provide from here on out is the ability to manage HDM decoder
resources for a port. Other independent drivers that want to use HDM decoder
resources would rely on the port driver for this.

It could be in CXL core just the same, but I don't see too much of a benefit
since the code would be almost identical. One nice aspect of having the port
driver outside of CXL core is it would allow CXL devices which do not need port
services (type-1 and probably type-2) to not need to load the port driver. We do
not have examples of such devices today but there's good evidence they are being
built. Whether or not they will even want CXL core is another topic up for
debate however.

I admit Dan and I did discuss putting this in either its own port driver, or
within core as a port driver. My inclination is, less is more in CXL core; but
perhaps you will be able to change my mind.

> 
> > --- a/Documentation/driver-api/cxl/memory-devices.rst
> > +++ b/Documentation/driver-api/cxl/memory-devices.rst
> > @@ -28,6 +28,11 @@ CXL Memory Device
> >  .. kernel-doc:: drivers/cxl/pci.c
> >     :internal:
> >  
> > +CXL Port
> > +--------
> > +.. kernel-doc:: drivers/cxl/port.c
> > +   :doc: cxl port
> 
> s/cxl port/CXL Port/ ?  I don't know exactly how this gets rendered by
> ReST.

I believe this is the specific tag as specified in the .c file. I can capitalize
it, but we haven't done this for other tags.

> 
> >  CXL Core
> >  --------
> >  .. kernel-doc:: drivers/cxl/cxl.h
> > diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> > index ef05e96f8f97..3aeb33bba5a3 100644
> > --- a/drivers/cxl/Kconfig
> > +++ b/drivers/cxl/Kconfig
> > @@ -77,4 +77,26 @@ config CXL_PMEM
> >  	  provisioning the persistent memory capacity of CXL memory expanders.
> >  
> >  	  If unsure say 'm'.
> > +
> > +config CXL_MEM
> > +	tristate "CXL.mem: Memory Devices"
> > +	select CXL_PORT
> > +	depends on CXL_PCI
> > +	default CXL_BUS
> > +	help
> > +	  The CXL.mem protocol allows a device to act as a provider of "System
> > +	  RAM" and/or "Persistent Memory" that is fully coherent as if the
> > +	  memory was attached to the typical CPU memory controller.  This is
> > +	  known as HDM "Host-managed Device Memory".
> 
> s/was attached/were attached/
> 
> > +	  Say 'y/m' to enable a driver that will attach to CXL.mem devices for
> > +	  memory expansion and control of HDM. See Chapter 9.13 in the CXL 2.0
> > +	  specification for a detailed description of HDM.
> > +
> > +	  If unsure say 'm'.
> > +
> > +
> 
> Spurious blank line.
> 
> > +config CXL_PORT
> > +	tristate
> > +
> >  endif
> 
> > --- /dev/null
> > +++ b/drivers/cxl/port.c
> > @@ -0,0 +1,323 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
> > +#include <linux/device.h>
> > +#include <linux/module.h>
> > +#include <linux/slab.h>
> > +
> > +#include "cxlmem.h"
> > +
> > +/**
> > + * DOC: cxl port
> 
> s/cxl port/CXL port/ (I'm assuming this should match usage below)
> or maybe "CXL Port" both places to match typical PCIe spec usage?
> 
> Capitalization is a useful hint that this term is defined by a spec.
> 

As above, I don't mind changing this at all, but this would be inconsistent with
the other tags we have defined.

> > + *
> > + * The port driver implements the set of functionality needed to allow full
> > + * decoder enumeration and routing. A CXL port is an abstraction of a CXL
> > + * component that implements some amount of CXL decoding of CXL.mem traffic.
> > + * As of the CXL 2.0 spec, this includes:
> > + *
> > + *	.. list-table:: CXL Components w/ Ports
> > + *		:widths: 25 25 50
> > + *		:header-rows: 1
> > + *
> > + *		* - component
> > + *		  - upstream
> > + *		  - downstream
> > + *		* - Hostbridge
> 
> s/Hostbridge/Host bridge/
> 
> > + *		  - ACPI0016
> > + *		  - root port
> 
> s/root port/Root Port/ to match Switch Ports below (and spec usage).
> 
> > + *		* - Switch
> > + *		  - Switch Upstream Port
> > + *		  - Switch Downstream Port
> > + *		* - Endpoint
> > + *		  - Endpoint Port
> > + *		  - N/A
> 
> What does "N/A" mean here?  Is it telling us something useful?

This gets rendered into a table that looks like the following:


| component  | upstream             | downstream             |
| ---------  | --------             | ----------             |
| Hostbridge | ACPI0016             | Root Port              |
| Switch     | Switch Upstream Port | Switch Downstream Port |
| Endpoint   | Endpoint Port        | N/A                    |


> 
> > +static void rescan_ports(struct work_struct *work)
> > +{
> > +	if (bus_rescan_devices(&cxl_bus_type))
> > +		pr_warn("Failed to rescan\n");
> 
> Needs some context.  A bare "Failed to rescan" in the dmesg log
> without a clue about who emitted it is really annoying.
> 
> Maybe you defined pr_fmt() somewhere; I couldn't find it easily.
> 

I actually didn't know about the pr_fmt() trick, so thanks for that. I'll improve
this message to be more useful and contextual.
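
For anyone else who has not used it: in the kernel, a source file defines
pr_fmt() before its includes so that every pr_*() call in that file gets a
common prefix, typically KBUILD_MODNAME ": ". Below is a userspace sketch of
the same idea. MODNAME and format_rescan_warning() are stand-ins chosen for
illustration, and the message text is an assumption, not what the next
revision will use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for the kernel's KBUILD_MODNAME; the value is an assumption. */
#define MODNAME "cxl_port"

/* Same shape as the kernel idiom: prefix the format string at compile time. */
#define pr_fmt(fmt) MODNAME ": " fmt

/*
 * Userspace stand-in for a pr_warn() call site: formats the prefixed
 * message into a caller-supplied buffer and returns snprintf()'s result.
 */
static int format_rescan_warning(char *buf, size_t len, int rc)
{
	assert(buf != NULL && len > 0);

	return snprintf(buf, len, pr_fmt("bus rescan failed: %d\n"), rc);
}
```

Because the prefix is pasted onto the string literal by the preprocessor, it
costs nothing at run time and cannot be forgotten at individual call sites.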

> Bjorn



* Re: [PATCH 20/23] cxl/port: Introduce a port driver
  2021-11-23 22:03     ` Ben Widawsky
@ 2021-11-23 22:36       ` Dan Williams
  2021-11-23 23:38         ` Ben Widawsky
  2021-11-23 23:55         ` Bjorn Helgaas
  2021-11-24 21:31       ` Bjorn Helgaas
  1 sibling, 2 replies; 133+ messages in thread
From: Dan Williams @ 2021-11-23 22:36 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: Bjorn Helgaas, linux-cxl, Linux PCI, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Tue, Nov 23, 2021 at 2:03 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
[..]
> > I hope this driver is not modeled on the PCI portdrv.  IMO, that was a
> > design error, and the "port service drivers" (PME, hotplug, AER, etc)
> > should be directly integrated into the PCI core instead of pretending
> > to be independent drivers.
>
> I'd like to understand a bit about why you think it was a design error. The port
> driver is intended to be a port services driver, however I see the services
> provided as quite different than the ones you mention. The primary service
> cxl_port will provide from here on out is the ability to manage HDM decoder
> resources for a port. Other independent drivers that want to use HDM decoder
> resources would rely on the port driver for this.
>
> It could be in CXL core just the same, but I don't see too much of a benefit
> since the code would be almost identical. One nice aspect of having the port
> driver outside of CXL core is it would allow CXL devices which do not need port
> services (type-1 and probably type-2) to not need to load the port driver. We do
> not have examples of such devices today but there's good evidence they are being
> built. Whether or not they will even want CXL core is another topic up for
> debate however.
>
> I admit Dan and I did discuss putting this in either its own port driver, or
> within core as a port driver. My inclination is, less is more in CXL core; but
> perhaps you will be able to change my mind.

No, I don't think Bjorn is talking about whether the driver is
integrated into cxl_core.ko vs its own cxl_port.ko. IIUC this goes
back to the original contention about having /sys/bus/cxl at all:

https://lore.kernel.org/r/CAPcyv4iu8D-hJoujLXw8a4myS7trOE1FcUhESLB_imGMECVfrg@mail.gmail.com

Unlike pcieportdrv, where the functionality is bounded to a single
device instance with relatively simple hierarchical coordination of
the memory space and services, CXL interleaving is both a foreign
concept to the PCIe core and an awkward fit for the pci_bus_type
device model. CXL uses the cxl_bus_type and bus drivers to coordinate
CXL operations that have cross PCI device implications.

Outside of that, I think the auxiliary device driver model, of which
the PCIe portdrv model is an open-coded form, is a useful construct
for separation of concerns. That said, I do want to hear more about
what trouble it has caused, to make sure that CXL does not trip
over the same issues longer term.

[..]
> > > +static void rescan_ports(struct work_struct *work)
> > > +{
> > > +   if (bus_rescan_devices(&cxl_bus_type))
> > > +           pr_warn("Failed to rescan\n");
> >
> > Needs some context.  A bare "Failed to rescan" in the dmesg log
> > without a clue about who emitted it is really annoying.
> >
> > Maybe you defined pr_fmt() somewhere; I couldn't find it easily.
> >
>
> I actually didn't know about the pr_fmt() trick, so thanks for that. I'll improve
> this message to be more useful and contextual.

I am skeptical that this implementation of the workqueue properly
synchronizes with workqueue shutdown concerns, but I have not had a
chance to dig in too deep on this patchset.

Regardless, it is not worth reporting a rescan failure, because those
are to be expected here. A rescan is attempted whenever a constraint
changes, but there is no guarantee that all constraints are met
just because one constraint changed, so rescan failures
(device_attach() failures) are not interesting to report. The
individual driver probe failure reporting is sufficient.


* Re: [PATCH 20/23] cxl/port: Introduce a port driver
  2021-11-23 22:36       ` Dan Williams
@ 2021-11-23 23:38         ` Ben Widawsky
  2021-11-23 23:55         ` Bjorn Helgaas
  1 sibling, 0 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-23 23:38 UTC (permalink / raw)
  To: Dan Williams
  Cc: Bjorn Helgaas, linux-cxl, Linux PCI, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-11-23 14:36:32, Dan Williams wrote:
> On Tue, Nov 23, 2021 at 2:03 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
> [..]
> > > I hope this driver is not modeled on the PCI portdrv.  IMO, that was a
> > > design error, and the "port service drivers" (PME, hotplug, AER, etc)
> > > should be directly integrated into the PCI core instead of pretending
> > > to be independent drivers.
> >
> > I'd like to understand a bit about why you think it was a design error. The port
> > driver is intended to be a port services driver, however I see the services
> > provided as quite different than the ones you mention. The primary service
> > cxl_port will provide from here on out is the ability to manage HDM decoder
> > resources for a port. Other independent drivers that want to use HDM decoder
> > resources would rely on the port driver for this.
> >
> > It could be in CXL core just the same, but I don't see too much of a benefit
> > since the code would be almost identical. One nice aspect of having the port
> > driver outside of CXL core is it would allow CXL devices which do not need port
> > services (type-1 and probably type-2) to not need to load the port driver. We do
> > not have examples of such devices today but there's good evidence they are being
> > built. Whether or not they will even want CXL core is another topic up for
> > debate however.
> >
> > I admit Dan and I did discuss putting this in either its own port driver, or
> > within core as a port driver. My inclination is, less is more in CXL core; but
> > perhaps you will be able to change my mind.
> 
> No, I don't think Bjorn is talking about whether the driver is
> integrated into cxl_core.ko vs its own cxl_port.ko. IIUC this goes
> back to the original contention about having /sys/bus/cxl at all:
> 
> https://lore.kernel.org/r/CAPcyv4iu8D-hJoujLXw8a4myS7trOE1FcUhESLB_imGMECVfrg@mail.gmail.com
> 
> Unlike pcieportdrv where the functionality is bounded to a single
> device instance with relatively simpler hierarchical coordination of
> the memory space and services. CXL interleaving is both a foreign
> concept to the PCIE core and an awkward fit for the pci_bus_type
> device model. CXL uses the cxl_bus_type and bus drivers to coordinate
> CXL operations that have cross PCI device implications.
> 
> Outside of that I think the auxiliary device driver model, of which
> the PCIE portdrv model is an open-coded form, is a useful construct
> for separation of concerns. That said, I do want to hear more about
> what trouble it has caused though to make sure that CXL does not trip
> over the same issues longer term.
> 
> [..]
> > > > +static void rescan_ports(struct work_struct *work)
> > > > +{
> > > > +   if (bus_rescan_devices(&cxl_bus_type))
> > > > +           pr_warn("Failed to rescan\n");
> > >
> > > Needs some context.  A bare "Failed to rescan" in the dmesg log
> > > without a clue about who emitted it is really annoying.
> > >
> > > Maybe you defined pr_fmt() somewhere; I couldn't find it easily.
> > >
> >
> > I actually didn't know about pr_fmt() trick, so thanks for that. I'll improve
> > this message to be more useful and contextual.
> 
> I am skeptical that this implementation of the workqueue properly
> synchronizes with workqueue shutdown concerns, but I have not had a
> chance to dig in too deep on this patchset.

Yeah, we discussed this. Please do check that. I'm happy to send out v2 with all
of Jonathan's fixes first, so you don't have to duplicate effort with what he
has already uncovered.

> 
> Regardless, it is not worth reporting a rescan failure, because those
> are to be expected here. A rescan is attempted whenever a
> constraint changes, but there is no guarantee that all constraints are met
> just because one constraint changed, so rescan failures
> (device_attach() failures) are not interesting to report. The
> individual driver probe failure reporting is sufficient.

Agreed.
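
For anyone else unfamiliar with the pr_fmt() trick mentioned above: the
pr_*() helpers pass their format string through pr_fmt(), so defining it
at the top of a file (before the printk header is pulled in) prefixes
every message emitted from that file. Below is a minimal userspace
sketch of the idea, not kernel code. MODNAME, pr_warn_to() and
format_rescan_warning() are hypothetical stand-ins made up for
illustration; kernel code would typically use KBUILD_MODNAME and the
real pr_warn().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for KBUILD_MODNAME, which kbuild supplies in the kernel. */
#define MODNAME "cxl_port"

/* In the kernel, define this before <linux/printk.h> is included. */
#define pr_fmt(fmt) MODNAME ": " fmt

/*
 * Hypothetical userspace stand-in for pr_warn(): it formats into a buffer
 * instead of the kernel log. fmt must be a string literal so that the
 * prefix can be concatenated at compile time.
 */
#define pr_warn_to(buf, len, fmt, ...) \
	snprintf(buf, len, pr_fmt(fmt), __VA_ARGS__)

/* Example caller: every message from this file picks up the prefix. */
static int format_rescan_warning(char *buf, size_t len, int rc)
{
	return pr_warn_to(buf, len, "Failed to rescan (%d)\n", rc);
}
```

With that in place the message identifies its origin without each call
site having to repeat the prefix.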

* Re: [PATCH 20/23] cxl/port: Introduce a port driver
  2021-11-23 22:36       ` Dan Williams
  2021-11-23 23:38         ` Ben Widawsky
@ 2021-11-23 23:55         ` Bjorn Helgaas
  2021-11-24  0:40           ` Dan Williams
  1 sibling, 1 reply; 133+ messages in thread
From: Bjorn Helgaas @ 2021-11-23 23:55 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ben Widawsky, linux-cxl, Linux PCI, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma, Christoph Hellwig,
	Greg Kroah-Hartman, Rafael J. Wysocki

[+cc Christoph, since he has opinions about portdrv;
Greg, Rafael, since they have good opinions about sysfs structure]

On Tue, Nov 23, 2021 at 02:36:32PM -0800, Dan Williams wrote:
> On Tue, Nov 23, 2021 at 2:03 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
> [..]
> > > I hope this driver is not modeled on the PCI portdrv.  IMO, that
> > > was a design error, and the "port service drivers" (PME,
> > > hotplug, AER, etc) should be directly integrated into the PCI
> > > core instead of pretending to be independent drivers.
> >
> > I'd like to understand a bit about why you think it was a design
> > error. The port driver is intended to be a port services driver,
> > however I see the services provided as quite different than the
> > ones you mention. The primary service cxl_port will provide from
> > here on out is the ability to manage HDM decoder resources for a
> > port. Other independent drivers that want to use HDM decoder
> > resources would rely on the port driver for this.
> >
> > It could be in CXL core just the same, but I don't see too much of
> > a benefit since the code would be almost identical. One nice
> > aspect of having the port driver outside of CXL core is it would
> > allow CXL devices which do not need port services (type-1 and
> > probably type-2) to not need to load the port driver. We do not
> > have examples of such devices today but there's good evidence they
> > are being built. Whether or not they will even want CXL core is
> > another topic up for debate however.
> >
> > I admit Dan and I did discuss putting this in either its own port
> > driver, or within core as a port driver. My inclination is, less
> > is more in CXL core; but perhaps you will be able to change my
> > mind.
> 
> No, I don't think Bjorn is talking about whether the driver is
> integrated into cxl_core.ko vs its own cxl_port.ko. IIUC this goes
> back to the original contention about having /sys/bus/cxl at all:
> 
> https://lore.kernel.org/r/CAPcyv4iu8D-hJoujLXw8a4myS7trOE1FcUhESLB_imGMECVfrg@mail.gmail.com

That question was about whether we want the same physical device to be
represented both in the /sys/bus/pci hierarchy and the /sys/bus/cxl
hierarchy.  That still seems a little weird to me, but I don't know
enough about the CXL constraints to really object to it.

My question here is more about whether you're going to use something
like the pcie_port_service_register() model for supporting multiple
drivers attached to the same physical device.

The PCIe portdrv creates a /sys/bus/pci_express hierarchy parallel to
the /sys/bus/pci hierarchy.  The pci_express hierarchy has a "device"
for every service (hotplug, AER, DPC, PME, etc) (see
pcie_device_init()).  This device creation is quite complicated and
depends on whether the Port advertises a Capability, whether the
platform has granted control to the OS, whether support is compiled
in, etc.
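
As a rough illustration of that dependency (a simplified sketch, not
the actual portdrv code), a service "device" only ends up being created
when every condition holds for that service. All names below are
hypothetical, and the real logic has more cases than the three modeled
here.

```c
#include <assert.h>

/* Hypothetical service bits, one per port service. */
enum port_service {
	SVC_PME     = 1 << 0,
	SVC_AER     = 1 << 1,
	SVC_HOTPLUG = 1 << 2,
	SVC_DPC     = 1 << 3,
};

/* Hypothetical summary of what is known about one Port. */
struct port_state {
	unsigned int advertised;  /* Capability is present on the Port */
	unsigned int os_granted;  /* platform has granted control to the OS */
	unsigned int compiled_in; /* support is built into this kernel */
};

/* A service gets a device only if all three conditions are met. */
static unsigned int services_to_create(const struct port_state *p)
{
	return p->advertised & p->os_granted & p->compiled_in;
}
```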

I think that was a mistake because these hotplug, AER, DPC, PME
"devices" are not independent things.  They are optional features that
can be added to a variety of devices, and those devices might have
their own drivers.  For example, we want to write drivers for
vendor-specific functionality like PMUs in switches, but we can't do
that cleanly because portdrv claims them.

The portdrv features are fully specified by the PCIe spec, so nothing
is vendor-specific.  They share interrupts.  They share power state.
They cannot be reset independently.  They are not addressable entities
in the usual bus/device/function model.  They can't be removed or
added like the underlying device can.  I wasn't there when they were
designed, but from reading the spec, it seems like they were designed
as optional features of a device, not as separate devices themselves.

> Unlike pcieportdrv where the functionality is bounded to a single
> device instance with relatively simpler hierarchical coordination of
> the memory space and services. CXL interleaving is both a foreign
> concept to the PCIE core and an awkward fit for the pci_bus_type
> device model. CXL uses the cxl_bus_type and bus drivers to
> coordinate CXL operations that have cross PCI device implications.
> 
> Outside of that I think the auxiliary device driver model, of which
> the PCIE portdrv model is an open-coded form, is a useful construct
> for separation of concerns. That said, I do want to hear more about
> what trouble it has caused though to make sure that CXL does not
> trip over the same issues longer term.

Bjorn

* Re: [PATCH 20/23] cxl/port: Introduce a port driver
  2021-11-23 23:55         ` Bjorn Helgaas
@ 2021-11-24  0:40           ` Dan Williams
  2021-11-24  6:33             ` Christoph Hellwig
  0 siblings, 1 reply; 133+ messages in thread
From: Dan Williams @ 2021-11-24  0:40 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Ben Widawsky, linux-cxl, Linux PCI, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma, Christoph Hellwig,
	Greg Kroah-Hartman, Rafael J. Wysocki

On Tue, Nov 23, 2021 at 3:56 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> [+cc Christoph, since he has opinions about portdrv;
> Greg, Rafael, since they have good opinions about sysfs structure]
>
> On Tue, Nov 23, 2021 at 02:36:32PM -0800, Dan Williams wrote:
> > On Tue, Nov 23, 2021 at 2:03 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > [..]
> > > > I hope this driver is not modeled on the PCI portdrv.  IMO, that
> > > > was a design error, and the "port service drivers" (PME,
> > > > hotplug, AER, etc) should be directly integrated into the PCI
> > > > core instead of pretending to be independent drivers.
> > >
> > > I'd like to understand a bit about why you think it was a design
> > > error. The port driver is intended to be a port services driver,
> > > however I see the services provided as quite different than the
> > > ones you mention. The primary service cxl_port will provide from
> > > here on out is the ability to manage HDM decoder resources for a
> > > port. Other independent drivers that want to use HDM decoder
> > > resources would rely on the port driver for this.
> > >
> > > It could be in CXL core just the same, but I don't see too much of
> > > a benefit since the code would be almost identical. One nice
> > > aspect of having the port driver outside of CXL core is it would
> > > allow CXL devices which do not need port services (type-1 and
> > > probably type-2) to not need to load the port driver. We do not
> > > have examples of such devices today but there's good evidence they
> > > are being built. Whether or not they will even want CXL core is
> > > another topic up for debate however.
> > >
> > > I admit Dan and I did discuss putting this in either its own port
> > > driver, or within core as a port driver. My inclination is, less
> > > is more in CXL core; but perhaps you will be able to change my
> > > mind.
> >
> > No, I don't think Bjorn is talking about whether the driver is
> > integrated into cxl_core.ko vs its own cxl_port.ko. IIUC this goes
> > back to the original contention about having /sys/bus/cxl at all:
> >
> > https://lore.kernel.org/r/CAPcyv4iu8D-hJoujLXw8a4myS7trOE1FcUhESLB_imGMECVfrg@mail.gmail.com
>
> That question was about whether we want the same physical device to be
> represented both in the /sys/bus/pci hierarchy and the /sys/bus/cxl
> hierarchy.  That still seems a little weird to me, but I don't know
> enough about the CXL constraints to really object to it.
>
> My question here is more about whether you're going to use something
> like the pcie_port_service_register() model for supporting multiple
> drivers attached to the same physical device.
>
> The PCIe portdrv creates a /sys/bus/pci_express hierarchy parallel to
> the /sys/bus/pci hierarchy.  The pci_express hierarchy has a "device"
> for every service (hotplug, AER, DPC, PME, etc) (see
> pcie_device_init()).  This device creation is quite complicated and
> depends on whether the Port advertises a Capability, whether the
> platform has granted control to the OS, whether support is compiled
> in, etc.
>
> I think that was a mistake because these hotplug, AER, DPC, PME
> "devices" are not independent things.  They are optional features that
> can be added to a variety of devices, and those devices might have
> their own drivers.  For example, we want to write drivers for
> vendor-specific functionality like PMUs in switches, but we can't do
> that cleanly because portdrv claims them.
>
> The portdrv features are fully specified by the PCIe spec, so nothing
> is vendor-specific.  They share interrupts.  They share power state.
> They cannot be reset independently.  They are not addressable entities
> in the usual bus/device/function model.  They can't be removed or
> added like the underlying device can.  I wasn't there when they were
> designed, but from reading the spec, it seems like they were designed
> as optional features of a device, not as separate devices themselves.

Let me ask a clarifying question coming from the other direction that
resulted in the creation of the auxiliary bus architecture. Some
background. RDMA is a protocol that may run on top of Ethernet.
Consider the case where you have multiple generations of Ethernet
adapter devices, but they all support common RDMA functionality. You
only have the one PCI device to attach a unique Ethernet driver. What
is an idiomatic way to deploy a module that automatically loads and
attaches to the exported common functionality across adapters that
otherwise have a unique native driver for the hardware device?

Another example, the Native PCIe Enclosure Management (NPEM)
specification defines a handful of registers that can appear anywhere
in the PCIe hierarchy. How can you write a common driver that is
generically applicable to any given NPEM instance?

The auxiliary bus answer to those questions is to register an
auxiliary device to be driven by a common auxiliary driver across
hard-device generations from the same vendor or even across vendors.

For your example about a PCIe port PMU driver it ultimately requires
something to enumerate that capability and a library of code to
exercise that functionality. What is a more natural fit than a Linux
"device" and a Linux driver to coordinate attaching enabling code to a
standalone hardware capability?

PCIe portdrv may be awkward because there was never a real native
driver for the device to begin with and it all could have been handled by
'pcie_portdriver' directly without registering more Linux devices, but
assigning self contained features to Linux devices otherwise seems a
useful idiom to me.

As for CXL, there is no analog of the PCIe portdrv pattern of just
having a device act as a multiplexer of features to other Linux
devices.

* Re: [PATCH 01/23] cxl: Rename CXL_MEM to CXL_PCI
  2021-11-20  0:02 ` [PATCH 01/23] cxl: Rename CXL_MEM to CXL_PCI Ben Widawsky
  2021-11-22 14:47   ` Jonathan Cameron
@ 2021-11-24  4:15   ` Dan Williams
  1 sibling, 0 replies; 133+ messages in thread
From: Dan Williams @ 2021-11-24  4:15 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Linux PCI, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Nov 19, 2021 at 4:03 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> The cxl_mem module was renamed cxl_pci in commit 21e9f76733a8 ("cxl:
> Rename mem to pci"). In preparation for adding an ancillary driver for
> cxl_memdev devices (registered on the cxl bus by cxl_pci), go ahead and
> rename CONFIG_CXL_MEM to CONFIG_CXL_PCI. Free up the CXL_MEM name for
> that new driver to manage CXL.mem endpoint operations.
>
> Suggested-by: Dan Williams <dan.j.williams@intel.com>

Reviewed-by: Dan Williams <dan.j.williams@intel.com>


> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
>
> ---
> Changes since RFCv2:
> - Reword commit message (Dan)
> - Reword Kconfig description (Dan)
> ---
>  drivers/cxl/Kconfig  | 23 ++++++++++++-----------
>  drivers/cxl/Makefile |  2 +-
>  2 files changed, 13 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index 67c91378f2dd..ef05e96f8f97 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -13,25 +13,26 @@ menuconfig CXL_BUS
>
>  if CXL_BUS
>
> -config CXL_MEM
> -       tristate "CXL.mem: Memory Devices"
> +config CXL_PCI
> +       tristate "PCI manageability"
>         default CXL_BUS
>         help
> -         The CXL.mem protocol allows a device to act as a provider of
> -         "System RAM" and/or "Persistent Memory" that is fully coherent
> -         as if the memory was attached to the typical CPU memory
> -         controller.
> +         The CXL specification defines a "CXL memory device" sub-class in the
> +         PCI "memory controller" base class of devices. Device's identified by
> +         this class code provide support for volatile and / or persistent
> +         memory to be mapped into the system address map (Host-managed Device
> +         Memory (HDM)).
>
> -         Say 'y/m' to enable a driver that will attach to CXL.mem devices for
> -         configuration and management primarily via the mailbox interface. See
> -         Chapter 2.3 Type 3 CXL Device in the CXL 2.0 specification for more
> -         details.
> +         Say 'y/m' to enable a driver that will attach to CXL memory expander
> +         devices enumerated by the memory device class code for configuration
> +         and management primarily via the mailbox interface. See Chapter 2.3
> +         Type 3 CXL Device in the CXL 2.0 specification for more details.
>
>           If unsure say 'm'.
>
>  config CXL_MEM_RAW_COMMANDS
>         bool "RAW Command Interface for Memory Devices"
> -       depends on CXL_MEM
> +       depends on CXL_PCI
>         help
>           Enable CXL RAW command interface.
>
> diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
> index d1aaabc940f3..cf07ae6cea17 100644
> --- a/drivers/cxl/Makefile
> +++ b/drivers/cxl/Makefile
> @@ -1,6 +1,6 @@
>  # SPDX-License-Identifier: GPL-2.0
>  obj-$(CONFIG_CXL_BUS) += core/
> -obj-$(CONFIG_CXL_MEM) += cxl_pci.o
> +obj-$(CONFIG_CXL_PCI) += cxl_pci.o
>  obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
>  obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
>
> --
> 2.34.0
>

* Re: [PATCH 02/23] cxl: Flesh out register names
  2021-11-20  0:02 ` [PATCH 02/23] cxl: Flesh out register names Ben Widawsky
  2021-11-22 14:49   ` Jonathan Cameron
@ 2021-11-24  4:24   ` Dan Williams
  1 sibling, 0 replies; 133+ messages in thread
From: Dan Williams @ 2021-11-24  4:24 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Linux PCI, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Nov 19, 2021 at 4:03 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> Get a better naming scheme in place for upcoming additions.

I prefer that subject lines and changelogs have at least one concrete
detail. Writing that out would make clear that "rename REGLOC to
REG_LOCATOR" is separate from "drop the unused PCI_DVSEC_ID_CXL
definition", which is different from "drop redundant prefixes of 'PCI'
and 'DVSEC' from defines".

With some concrete details added to the changelog you can add:

Reviewed-by: Dan Williams <dan.j.williams@intel.com>


>
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
> Changes since RFCv2:
> Use some abbreviations (Jonathan)
> Prefix everything with CXL (Jonathan)
> Remove new additions (Dan)
>
> Original discussion motivating this occurred here:
> https://lore.kernel.org/linux-pci/20210913190131.xiiszmno46qie7v5@intel.com/
> ---
>  drivers/cxl/pci.c | 14 +++++++-------
>  drivers/cxl/pci.h | 19 ++++++++++---------
>  2 files changed, 17 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 8dc91fd3396a..a6ea9811a05b 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -403,10 +403,10 @@ static int cxl_map_regs(struct cxl_dev_state *cxlds, struct cxl_register_map *ma
>  static void cxl_decode_regblock(u32 reg_lo, u32 reg_hi,
>                                 struct cxl_register_map *map)
>  {
> -       map->block_offset =
> -               ((u64)reg_hi << 32) | (reg_lo & CXL_REGLOC_ADDR_MASK);
> -       map->barno = FIELD_GET(CXL_REGLOC_BIR_MASK, reg_lo);
> -       map->reg_type = FIELD_GET(CXL_REGLOC_RBI_MASK, reg_lo);
> +       map->block_offset = ((u64)reg_hi << 32) |
> +                           (reg_lo & CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK);
> +       map->barno = FIELD_GET(CXL_DVSEC_REG_LOCATOR_BIR_MASK, reg_lo);
> +       map->reg_type = FIELD_GET(CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK, reg_lo);
>  }
>
>  /**
> @@ -427,15 +427,15 @@ static int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
>         int regloc, i;
>
>         regloc = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
> -                                          PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID);
> +                                          CXL_DVSEC_REG_LOCATOR);
>         if (!regloc)
>                 return -ENXIO;
>
>         pci_read_config_dword(pdev, regloc + PCI_DVSEC_HEADER1, &regloc_size);
>         regloc_size = FIELD_GET(PCI_DVSEC_HEADER1_LENGTH_MASK, regloc_size);
>
> -       regloc += PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET;
> -       regblocks = (regloc_size - PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET) / 8;
> +       regloc += CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET;
> +       regblocks = (regloc_size - CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET) / 8;
>
>         for (i = 0; i < regblocks; i++, regloc += 8) {
>                 u32 reg_lo, reg_hi;
> diff --git a/drivers/cxl/pci.h b/drivers/cxl/pci.h
> index 7d3e4bf06b45..29b8eaef3a0a 100644
> --- a/drivers/cxl/pci.h
> +++ b/drivers/cxl/pci.h
> @@ -7,17 +7,21 @@
>
>  /*
>   * See section 8.1 Configuration Space Registers in the CXL 2.0
> - * Specification
> + * Specification. Names are taken straight from the specification with "CXL" and
> + * "DVSEC" redundancies removed. When obvious, abbreviations may be used.
>   */
>  #define PCI_DVSEC_HEADER1_LENGTH_MASK  GENMASK(31, 20)
>  #define PCI_DVSEC_VENDOR_ID_CXL                0x1E98
> -#define PCI_DVSEC_ID_CXL               0x0
>
> -#define PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID       0x8
> -#define PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET  0xC
> +/* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
> +#define CXL_DVSEC_PCIE_DEVICE                                  0
>
> -/* BAR Indicator Register (BIR) */
> -#define CXL_REGLOC_BIR_MASK GENMASK(2, 0)
> +/* CXL 2.0 8.1.9: Register Locator DVSEC */
> +#define CXL_DVSEC_REG_LOCATOR                                  8
> +#define   CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET                  0xC
> +#define     CXL_DVSEC_REG_LOCATOR_BIR_MASK                     GENMASK(2, 0)
> +#define            CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK                 GENMASK(15, 8)
> +#define     CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK           GENMASK(31, 16)
>
>  /* Register Block Identifier (RBI) */
>  enum cxl_regloc_type {
> @@ -28,7 +32,4 @@ enum cxl_regloc_type {
>         CXL_REGLOC_RBI_TYPES
>  };
>
> -#define CXL_REGLOC_RBI_MASK GENMASK(15, 8)
> -#define CXL_REGLOC_ADDR_MASK GENMASK(31, 16)
> -
>  #endif /* __CXL_PCI_H__ */
> --
> 2.34.0
>
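
To make the renamed field layout concrete, here is a small userspace
sketch of the decode that cxl_decode_regblock() and cxl_find_regblock()
perform with these masks. It is an illustration under assumptions, not
the driver code: GENMASK32(), field_get32(), struct reg_map,
decode_regblock() and count_regblocks() are hypothetical stand-ins for
the kernel's GENMASK(), FIELD_GET() and struct cxl_register_map, while
the mask and offset values are the ones from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's GENMASK(): bits h..l set. */
#define GENMASK32(h, l) \
	((uint32_t)(((uint32_t)~0u >> (31 - (h))) & ((uint32_t)~0u << (l))))

/* Userspace stand-in for FIELD_GET(). */
static inline uint32_t field_get32(uint32_t mask, uint32_t val)
{
	/* Dividing by the lowest set bit of the mask shifts the field down. */
	return (val & mask) / (mask & (~mask + 1));
}

/* Values as defined in the patch. */
#define PCI_DVSEC_HEADER1_LENGTH_MASK             GENMASK32(31, 20)
#define CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET       0xC
#define CXL_DVSEC_REG_LOCATOR_BIR_MASK            GENMASK32(2, 0)
#define CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK       GENMASK32(15, 8)
#define CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK  GENMASK32(31, 16)

/* Hypothetical, trimmed-down version of struct cxl_register_map. */
struct reg_map {
	uint64_t block_offset;
	uint8_t barno;
	uint8_t reg_type;
};

/* Split one register block entry (low and high dword) into its fields. */
static void decode_regblock(uint32_t reg_lo, uint32_t reg_hi,
			    struct reg_map *map)
{
	map->block_offset = ((uint64_t)reg_hi << 32) |
			    (reg_lo & CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK);
	map->barno = field_get32(CXL_DVSEC_REG_LOCATOR_BIR_MASK, reg_lo);
	map->reg_type = field_get32(CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK, reg_lo);
}

/* Number of 8-byte register block entries that follow the DVSEC header. */
static uint32_t count_regblocks(uint32_t header1)
{
	uint32_t size = field_get32(PCI_DVSEC_HEADER1_LENGTH_MASK, header1);

	return (size - CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET) / 8;
}
```

For example, a DVSEC length of 0x24 bytes leaves (0x24 - 0xC) / 8 = 3
register block entries to walk.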

* Re: [PATCH 20/23] cxl/port: Introduce a port driver
  2021-11-24  0:40           ` Dan Williams
@ 2021-11-24  6:33             ` Christoph Hellwig
  2021-11-24  7:17               ` Dan Williams
  0 siblings, 1 reply; 133+ messages in thread
From: Christoph Hellwig @ 2021-11-24  6:33 UTC (permalink / raw)
  To: Dan Williams
  Cc: Bjorn Helgaas, Ben Widawsky, linux-cxl, Linux PCI,
	Alison Schofield, Ira Weiny, Jonathan Cameron, Vishal Verma,
	Christoph Hellwig, Greg Kroah-Hartman, Rafael J. Wysocki

On Tue, Nov 23, 2021 at 04:40:06PM -0800, Dan Williams wrote:
> Let me ask a clarifying question coming from the other direction that
> resulted in the creation of the auxiliary bus architecture. Some
> background. RDMA is a protocol that may run on top of Ethernet.

No, RDMA is a concept.  Linux supports 2 and a half RDMA protocols
that run over ethernet (RoCE v1 and v2 and iWarp).

> Consider the case where you have multiple generations of Ethernet
> adapter devices, but they all support common RDMA functionality. You
> only have the one PCI device to attach a unique Ethernet driver. What
> is an idiomatic way to deploy a module that automatically loads and
> attaches to the exported common functionality across adapters that
> otherwise have a unique native driver for the hardware device?

The whole aux bus drama is mostly because the intel design for these
is really fucked up.  All the sane HCAs do not use this model.  All
this attachment crap really should not be there.

> Another example, the Native PCIe Enclosure Management (NPEM)
> specification defines a handful of registers that can appear anywhere
> in the PCIe hierarchy. How can you write a common driver that is
> generically applicable to any given NPEM instance?

Another totally messed up spec.  But then pretty much everything coming
from the PCIe SIG in terms of interface tends to be really, really
broken lately.

* Re: [PATCH 20/23] cxl/port: Introduce a port driver
  2021-11-24  6:33             ` Christoph Hellwig
@ 2021-11-24  7:17               ` Dan Williams
  2021-11-24  7:28                 ` Christoph Hellwig
  2021-12-02 21:24                 ` Bjorn Helgaas
  0 siblings, 2 replies; 133+ messages in thread
From: Dan Williams @ 2021-11-24  7:17 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Bjorn Helgaas, Ben Widawsky, linux-cxl, Linux PCI,
	Alison Schofield, Ira Weiny, Jonathan Cameron, Vishal Verma,
	Greg Kroah-Hartman, Rafael J. Wysocki

On Tue, Nov 23, 2021 at 10:33 PM Christoph Hellwig <hch@lst.de> wrote:
>
> On Tue, Nov 23, 2021 at 04:40:06PM -0800, Dan Williams wrote:
> > Let me ask a clarifying question coming from the other direction that
> > resulted in the creation of the auxiliary bus architecture. Some
> > background. RDMA is a protocol that may run on top of Ethernet.
>
> No, RDMA is a concept.  Linux supports 2 and a half RDMA protocols
> that run over ethernet (RoCE v1 and v2 and iWarp).

Yes, I was being too coarse, point taken. However, I don't think that
changes the observation that multiple vendors are using aux bus to
share a feature driver across multiple base Ethernet drivers.

>
> > Consider the case where you have multiple generations of Ethernet
> > adapter devices, but they all support common RDMA functionality. You
> > only have the one PCI device to attach a unique Ethernet driver. What
> > is an idiomatic way to deploy a module that automatically loads and
> > attaches to the exported common functionality across adapters that
> > otherwise have a unique native driver for the hardware device?
>
> The whole aux bus drama is mostly because the intel design for these
> is really fucked up.  All the sane HCAs do not use this model.  All
> this attachment crap really should not be there.

I am missing the counter proposal in both Bjorn's and your distaste
for aux bus and PCIe portdrv?

> > Another example, the Native PCIe Enclosure Management (NPEM)
> > specification defines a handful of registers that can appear anywhere
> > in the PCIe hierarchy. How can you write a common driver that is
> > generically applicable to any given NPEM instance?
>
> Another totally messed up spec.  But then pretty much everything coming
> from the PCIe SIG in terms of interface tends to be really, really
> broken lately.

DVSEC and DOE are more of the same in terms of composing add-on
features into devices. Hardware vendors want to mix multiple hard-IPs
into a single device, aux bus is one response. Topology specific buses
like /sys/bus/cxl are another.

This CXL port driver is offering enumeration, link management, and
memory decode setup services to the rest of the topology. I see it as
similar to management protocol services offered by libsas.

* Re: [PATCH 20/23] cxl/port: Introduce a port driver
  2021-11-24  7:17               ` Dan Williams
@ 2021-11-24  7:28                 ` Christoph Hellwig
  2021-11-24  7:33                   ` Greg Kroah-Hartman
  2021-12-02 21:24                 ` Bjorn Helgaas
  1 sibling, 1 reply; 133+ messages in thread
From: Christoph Hellwig @ 2021-11-24  7:28 UTC (permalink / raw)
  To: Dan Williams
  Cc: Christoph Hellwig, Bjorn Helgaas, Ben Widawsky, linux-cxl,
	Linux PCI, Alison Schofield, Ira Weiny, Jonathan Cameron,
	Vishal Verma, Greg Kroah-Hartman, Rafael J. Wysocki

On Tue, Nov 23, 2021 at 11:17:55PM -0800, Dan Williams wrote:
> I am missing the counter proposal in both Bjorn's and your distaste
> for aux bus and PCIe portdrv?

Given that I've only been brought in in the last mail I have no idea what
the original proposal even is.

* Re: [PATCH 20/23] cxl/port: Introduce a port driver
  2021-11-24  7:28                 ` Christoph Hellwig
@ 2021-11-24  7:33                   ` Greg Kroah-Hartman
  2021-11-24  7:54                     ` Dan Williams
  0 siblings, 1 reply; 133+ messages in thread
From: Greg Kroah-Hartman @ 2021-11-24  7:33 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Dan Williams, Bjorn Helgaas, Ben Widawsky, linux-cxl, Linux PCI,
	Alison Schofield, Ira Weiny, Jonathan Cameron, Vishal Verma,
	Rafael J. Wysocki

On Wed, Nov 24, 2021 at 08:28:24AM +0100, Christoph Hellwig wrote:
> On Tue, Nov 23, 2021 at 11:17:55PM -0800, Dan Williams wrote:
> > I am missing the counter proposal in both Bjorn's and your distaste
> > for aux bus and PCIe portdrv?
> 
> Given that I've only been brought in in the last mail I have no idea what
> the original proposal even is.

Neither do I :(

* Re: [PATCH 20/23] cxl/port: Introduce a port driver
  2021-11-24  7:33                   ` Greg Kroah-Hartman
@ 2021-11-24  7:54                     ` Dan Williams
  2021-11-24  8:21                       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 133+ messages in thread
From: Dan Williams @ 2021-11-24  7:54 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Christoph Hellwig, Bjorn Helgaas, Ben Widawsky, linux-cxl,
	Linux PCI, Alison Schofield, Ira Weiny, Jonathan Cameron,
	Vishal Verma, Rafael J. Wysocki

On Tue, Nov 23, 2021 at 11:33 PM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Wed, Nov 24, 2021 at 08:28:24AM +0100, Christoph Hellwig wrote:
> > On Tue, Nov 23, 2021 at 11:17:55PM -0800, Dan Williams wrote:
> > > I am missing the counter proposal in both Bjorn's and your distaste
> > > for aux bus and PCIe portdrv?
> >
> > Given that I've only been brought in in the last mail I have no idea what
> > the original proposal even is.
>
> Neither do I :(

To be clear I am also trying to get to the root of Bjorn's concern.

The proposal in $SUBJECT is to build on / treat a CXL topology as a
Linux device topology on /sys/bus/cxl that references devices on
/sys/bus/platform (CXL ACPI topology root and Host Bridges) and
/sys/bus/pci (CXL Switches and Endpoints). This CXL port device
topology has already been shipping for a few kernel cycles. What is on
the table now is a driver for CXL port devices (a logical Linux
construct). The driver handles discovery of "component registers"
either by ACPI table or PCI DVSEC and offers services to provision CXL
regions. CXL 'regions' are also proposed as Linux devices that
represent an active CXL memory range which can interleave multiple
endpoints across multiple switches and host bridges.

* Re: [PATCH 20/23] cxl/port: Introduce a port driver
  2021-11-24  7:54                     ` Dan Williams
@ 2021-11-24  8:21                       ` Greg Kroah-Hartman
  2021-11-24 18:24                         ` Dan Williams
  0 siblings, 1 reply; 133+ messages in thread
From: Greg Kroah-Hartman @ 2021-11-24  8:21 UTC (permalink / raw)
  To: Dan Williams
  Cc: Christoph Hellwig, Bjorn Helgaas, Ben Widawsky, linux-cxl,
	Linux PCI, Alison Schofield, Ira Weiny, Jonathan Cameron,
	Vishal Verma, Rafael J. Wysocki

On Tue, Nov 23, 2021 at 11:54:03PM -0800, Dan Williams wrote:
> On Tue, Nov 23, 2021 at 11:33 PM Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > On Wed, Nov 24, 2021 at 08:28:24AM +0100, Christoph Hellwig wrote:
> > > On Tue, Nov 23, 2021 at 11:17:55PM -0800, Dan Williams wrote:
> > > > I am missing the counter proposal in both Bjorn's and your distaste
> > > > for aux bus and PCIe portdrv?
> > >
> > > Given that I've only been brought in in the last mail I have no idea what
> > > the original proposal even is.
> >
> > Neither do I :(
> 
> To be clear I am also trying to get to the root of Bjorn's concern.
> 
> The proposal in $SUBJECT is to build on / treat a CXL topology as a
> Linux device topology on /sys/bus/cxl that references devices on
> /sys/bus/platform (CXL ACPI topology root and Host Bridges) and
> /sys/bus/pci (CXL Switches and Endpoints).

Wait, I am confused.

A bus lists devices that are on that bus.  It does not list devices that
are on other busses.

Now a device in a bus list can have a parent be on different types of
busses, as the parent device does not matter, but the device itself can
not be of different types.

So I do not understand what you are describing here at all.  Do you have
an example output of sysfs that shows this situation?

> This CXL port device topology has already been shipping for a few
> kernel cycles.

So it's always been broken?  :)

> What is on
> the table now is a driver for CXL port devices (a logical Linux
> construct). The driver handles discovery of "component registers"
> either by ACPI table or PCI DVSEC and offers services to provision CXL
> regions.

So a normal bus controller device that creates new devices of a bus
type, right?  What is special about that?

> CXL 'regions' are also proposed as Linux devices that
> represent an active CXL memory range which can interleave multiple
> endpoints across multiple switches and host bridges.

As long as these 'devices' have drivers and do not mess with the
resources of any other device in the system, I do not understand the
problem here.

Or is the issue that you are again trying to carve up "real" devices
into tiny devices that then somehow need to be aware of the resources
being touched by different drivers at the same time?

I'm still confused, sorry.

greg k-h

* Re: [PATCH 20/23] cxl/port: Introduce a port driver
  2021-11-24  8:21                       ` Greg Kroah-Hartman
@ 2021-11-24 18:24                         ` Dan Williams
  0 siblings, 0 replies; 133+ messages in thread
From: Dan Williams @ 2021-11-24 18:24 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Christoph Hellwig, Bjorn Helgaas, Ben Widawsky, linux-cxl,
	Linux PCI, Alison Schofield, Ira Weiny, Jonathan Cameron,
	Vishal Verma, Rafael J. Wysocki

On Wed, Nov 24, 2021 at 12:22 AM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Tue, Nov 23, 2021 at 11:54:03PM -0800, Dan Williams wrote:
> > On Tue, Nov 23, 2021 at 11:33 PM Greg Kroah-Hartman
> > <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Wed, Nov 24, 2021 at 08:28:24AM +0100, Christoph Hellwig wrote:
> > > > On Tue, Nov 23, 2021 at 11:17:55PM -0800, Dan Williams wrote:
> > > > > I am missing the counter proposal in both Bjorn's and your distaste
> > > > > for aux bus and PCIe portdrv?
> > > >
> > > > Given that I've only been brought in in the last mail I have no idea what
> > > > the original proposal even is.
> > >
> > > Neither do I :(
> >
> > To be clear I am also trying to get to the root of Bjorn's concern.
> >
> > The proposal in $SUBJECT is to build on / treat a CXL topology as a
> > Linux device topology on /sys/bus/cxl that references devices on
> > /sys/bus/platform (CXL ACPI topology root and Host Bridges) and
> > /sys/bus/pci (CXL Switches and Endpoints).
>
> Wait, I am confused.
>
> A bus lists devices that are on that bus.  It does not list devices that
> are on other busses.
>
> Now a device in a bus list can have a parent be on different types of
> busses, as the parent device does not matter, but the device itself can
> not be of different types.
>
> So I do not understand what you are describing here at all.  Do you have
> an example output of sysfs that shows this situation?

Commit 40ba17afdfab ("cxl/acpi: Introduce cxl_decoder objects")

...has an example of the devices registered on the CXL bus by the
cxl_acpi driver.

> > This CXL port device topology has already been shipping for a few
> > kernel cycles.
>
> So it's always been broken?  :)

Kidding aside, CXL has different moving pieces than PCI and the Linux
device-driver model is how we are organizing that complexity. CXL is
also symbiotically attached to PCI as it borrows the enumeration
mechanism while building up a parallel CXL.mem universe to live
alongside PCI.config and PCI.mmio. The CXL subsystem is similar to MD
/ DM where virtual devices are built up from other devices.

> > What is on
> > the table now is a driver for CXL port devices (a logical Linux
> > construct). The driver handles discovery of "component registers"
> > either by ACPI table or PCI DVSEC and offers services to provision CXL
> > regions.
>
> So a normal bus controller device that creates new devices of a bus
> type, right?  What is special about that?

Yes, which is the root of my confusion about Bjorn's concern.

> > CXL 'regions' are also proposed as Linux devices that
> > represent an active CXL memory range which can interleave multiple
> > endpoints across multiple switches and host bridges.
>
> As long as these 'devices' have drivers and do not mess with the
> resources of any other device in the system, I do not understand the
> problem here.

Correct, these drivers manage CXL.mem resources and leave PCI.mmio
resource management to PCI.

> Or is the issue that you are again trying to carve up "real" devices
> into tiny devices that then somehow need to be aware of the resources
> being touched by different drivers at the same time?

No, there's no multiplexing of resources across devices / drivers that
requires cross-driver awareness.

> I'm still confused, sorry.

No worries, appreciate the attention to make sure this implementation
is idiomatic and avoids any architectural dragons.

* Re: [PATCH 03/23] cxl/pci: Extract device status check
  2021-11-20  0:02 ` [PATCH 03/23] cxl/pci: Extract device status check Ben Widawsky
  2021-11-22 15:03   ` Jonathan Cameron
@ 2021-11-24 19:30   ` Dan Williams
  1 sibling, 0 replies; 133+ messages in thread
From: Dan Williams @ 2021-11-24 19:30 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Linux PCI, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Nov 19, 2021 at 4:03 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> The Memory Device Status register is inspected in the same way for at
> least two flows in the CXL Type 3 Memory Device Software Guide
> (Revision: 1.0): 2.13.9 Device discovery and mailbox ready sequence,
> and 2.13.10 Media ready sequence. Extract this common functionality for
> use by both.

Can you translate this into CXL specification terms? See below for the
rationale...

>
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
> This patch did not exist in RFCv2
> ---
>  drivers/cxl/pci.c | 33 +++++++++++++++++++++++++--------
>  1 file changed, 25 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index a6ea9811a05b..6c8d09fb3a17 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -182,6 +182,27 @@ static int __cxl_pci_mbox_send_cmd(struct cxl_dev_state *cxlds,
>         return 0;
>  }
>
> +/*
> + * Implements roughly the bottom half of Figure 42 of the CXL Type 3 Memory
> + * Device Software Guide

I do appreciate that document for working through some of the concerns
that system software might have for various CXL flows, but at the same
time it's not authoritative. I.e. it is not a specification itself and
it depends on the CXL specification as the "source of truth". So for
Linux commentary I would translate the guide's recommendations back
into the base truth from the CXL specification.

There will be places where Linux goes a different direction than the
software guide so I do not want to set any expectations that those
excursions are a bug, or otherwise require someone to consult a
specific hardware vendor's software guide.

Especially in this case when the logic is simply "check a couple fatal
status flags", the base specification is sufficient and the original
code made no reference to the guide.

> + */
> +static int check_device_status(struct cxl_dev_state *cxlds)
> +{
> +       const u64 md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
> +
> +       if (md_status & CXLMDEV_DEV_FATAL) {
> +               dev_err(cxlds->dev, "Fatal: replace device\n");

The specification says "replace device", but I disagree that the kernel
should be recommending that the device be replaced. Just report what
the driver does, and that's probably easier if the error messages are
left to the caller.

        const u64 md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);

        if (md_status & (CXLMDEV_DEV_FATAL | CXLMDEV_FW_HALT)) {
                dev_err(dev, "mbox: failed to acquire, device state:%s%s\n",
                        md_status & CXLMDEV_DEV_FATAL ? " fatal" : "",
                        md_status & CXLMDEV_FW_HALT ? " firmware-halt" : "");
                return -EIO;
        }

...i.e. it's not clear to me the helper helps.
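
For illustration only, the combined check can be exercised outside the
kernel. The following is a minimal userspace sketch, assuming
hypothetical bit positions for the two flags and a hypothetical
sketch_check_status() name; it formats the message into a caller
supplied buffer instead of calling dev_err(), and is not the driver's
actual code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed bit positions, for illustration only. */
#define SKETCH_DEV_FATAL (UINT64_C(1) << 0)
#define SKETCH_FW_HALT   (UINT64_C(1) << 1)

/*
 * Report what was observed rather than recommending an action.
 * Returns 0 when neither flag is set (msg is left empty), or -EIO
 * with a description of the observed device state written to msg.
 */
int sketch_check_status(uint64_t md_status, char *msg, size_t len)
{
	assert(msg != NULL && len > 0);
	msg[0] = '\0';

	if (md_status & (SKETCH_DEV_FATAL | SKETCH_FW_HALT)) {
		snprintf(msg, len, "mbox: failed to acquire, device state:%s%s",
			 (md_status & SKETCH_DEV_FATAL) ? " fatal" : "",
			 (md_status & SKETCH_FW_HALT) ? " firmware-halt" : "");
		return -EIO;
	}

	return 0;
}
```

Both flags map to a single error code here, so a caller only has to
decide whether to log the message.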

> +               return -EIO;
> +       }
> +
> +       if (md_status & CXLMDEV_FW_HALT) {
> +               dev_err(cxlds->dev, "FWHalt: reset or replace device\n");
> +               return -EBUSY;
> +       }
> +
> +       return 0;
> +}
> +
>  /**
>   * cxl_pci_mbox_get() - Acquire exclusive access to the mailbox.
>   * @cxlds: The device state to gain access to.
> @@ -231,17 +252,13 @@ static int cxl_pci_mbox_get(struct cxl_dev_state *cxlds)
>          * Hardware shouldn't allow a ready status but also have failure bits
>          * set. Spit out an error, this should be a bug report
>          */
> -       rc = -EFAULT;
> -       if (md_status & CXLMDEV_DEV_FATAL) {
> -               dev_err(dev, "mbox: reported ready, but fatal\n");
> +       rc = check_device_status(cxlds);
> +       if (rc)
>                 goto out;
> -       }
> -       if (md_status & CXLMDEV_FW_HALT) {
> -               dev_err(dev, "mbox: reported ready, but halted\n");
> -               goto out;
> -       }
> +
>         if (CXLMDEV_RESET_NEEDED(md_status)) {

I think this check needs to go. If the reset is needed because of one
of the above failure statuses then the function will have already
error exited. If the reset is needed because media is disabled that
should not be fatal for mailbox operations. It could be useful to do
some interrogation of *why* media is disabled.

>                 dev_err(dev, "mbox: reported ready, but reset needed\n");
> +               rc = -EFAULT;
>                 goto out;
>         }
>
> --
> 2.34.0
>

* Re: [PATCH 04/23] cxl/pci: Implement Interface Ready Timeout
  2021-11-22 17:53       ` Jonathan Cameron
@ 2021-11-24 19:56         ` Dan Williams
  2021-11-25  6:17           ` Ben Widawsky
  0 siblings, 1 reply; 133+ messages in thread
From: Dan Williams @ 2021-11-24 19:56 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Ben Widawsky, linux-cxl, Linux PCI, Alison Schofield, Ira Weiny,
	Vishal Verma

On Mon, Nov 22, 2021 at 9:54 AM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Mon, 22 Nov 2021 09:17:31 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> > On 21-11-22 15:02:27, Jonathan Cameron wrote:
> > > On Fri, 19 Nov 2021 16:02:31 -0800
> > > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >
> > > > The original driver implementation used the doorbell timeout for the
> > > > Mailbox Interface Ready bit to piggy back off of, since the latter
> > > > doesn't have a defined timeout. This functionality, introduced in
> > > > 8adaf747c9f0 ("cxl/mem: Find device capabilities"), can now be improved
> > > > since a timeout has been defined with an ECN to the 2.0 spec.
> > > >
> > > > While devices implemented prior to the ECN could have an arbitrarily
> > > > long wait and still be within spec, the max ECN value (256s) is chosen
> > > > as the default for all devices. All vendors in the consortium agreed to
> > > > this amount and so it is reasonable to assume no devices made will
> > > > exceed this amount.
> > >
> > > Optimistic :)
> > >
> >
> > Reasonable to assume is certainly not the same as "in reality". I can soften
> > this wording.
> >
> > > >
> > > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > > ---
> > > > This patch did not exist in RFCv2
> > > > ---
> > > >  drivers/cxl/pci.c | 29 +++++++++++++++++++++++++++++
> > > >  1 file changed, 29 insertions(+)
> > > >
> > > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > > index 6c8d09fb3a17..2cef9fec8599 100644
> > > > --- a/drivers/cxl/pci.c
> > > > +++ b/drivers/cxl/pci.c
> > > > @@ -2,6 +2,7 @@
> > > >  /* Copyright(c) 2020 Intel Corporation. All rights reserved. */
> > > >  #include <linux/io-64-nonatomic-lo-hi.h>
> > > >  #include <linux/module.h>
> > > > +#include <linux/delay.h>
> > > >  #include <linux/sizes.h>
> > > >  #include <linux/mutex.h>
> > > >  #include <linux/list.h>
> > > > @@ -298,6 +299,34 @@ static int cxl_pci_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *c
> > > >  static int cxl_pci_setup_mailbox(struct cxl_dev_state *cxlds)
> > > >  {
> > > >   const int cap = readl(cxlds->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
> > > > + unsigned long timeout;
> > > > + u64 md_status;
> > > > + int rc;
> > > > +
> > > > + /*
> > > > +  * CXL 2.0 ECN "Add Mailbox Ready Time" defines a capability field to
> > > > +  * dictate how long to wait for the mailbox to become ready. For
> > > > +  * simplicity, and to handle devices that might have been implemented
> > >
> > > I'm not keen on the 'for simplicity' argument here.  If the device is advertising
> > > a lower value, then that is what we should use.  It's fine to wait the max time
> > > if nothing is specified.  It'll cost us a few lines of code at most unless
> > > I am missing something...
> > >
> > > Jonathan
> > >
> >
> > Let me pose it a different way, if a device advertises 1s, but for whatever
> > reason takes 4s to come up, should we penalize it over the device advertising 256s?
>
> Yes, because it is buggy.  A compliance test should have failed on this anyway.
>
> > The
> > way this field is defined in the spec would [IMHO] lead vendors to simply put
> > the max field in there to game the driver, so why not start off with just
> > insisting they don't?
>
> Given reading this value and getting a big number gives the implication that
> the device is meant to be really slow to initialize, I'd expect that to push
> vendors a little in the direction of putting realistic values in.
>
> Maybe we should print the value in the log to make them look silly ;)

A print message on the way to a static default timeout value is about
all a device's self-reported timeout is useful for.

"device not ready after waiting %d seconds, continuing to wait up to %d seconds"

It's also not clear to me that the Linux default timeout should be so
generous at 256 seconds. It might be suitable to just complain about
devices that are taking more than 60 seconds to initialize with an
option to override that timeout for odd outliers. Otherwise encourage
hardware implementations to beat the Linux timeout value to get
support out of the box.

I notice that not even libata waits more than a minute for a given
device to finish post-reset shenanigans, so might as well set 60
seconds as what the driver will tolerate out of the box.
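
As a rough sketch of that policy, assuming simulated one-second polling
and hypothetical names throughout (sketch_wait_ready(), the callback
type, and the 60 second constant are illustrations, not driver code):
the static limit decides success or failure, and the device's
advertised value only triggers the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Out-of-the-box tolerance suggested above; assumed, not from a spec. */
#define SKETCH_DEFAULT_TIMEOUT_S 60

/* Hypothetical probe: true once the interface is ready at elapsed_s. */
typedef bool (*sketch_ready_fn)(unsigned int elapsed_s, void *ctx);

/*
 * Poll once per simulated second up to and including limit_s.
 * Returns the number of seconds waited, or -1 if the limit passed.
 */
int sketch_wait_ready(sketch_ready_fn ready, void *ctx,
		      unsigned int advertised_s, unsigned int limit_s)
{
	bool warned = false;

	assert(ready != NULL);

	for (unsigned int t = 0; t <= limit_s; t++) {
		if (ready(t, ctx))
			return (int)t;

		/* The advertised value is informational only. */
		if (!warned && advertised_s < limit_s && t >= advertised_s) {
			fprintf(stderr,
				"device not ready after waiting %u seconds, continuing to wait up to %u seconds\n",
				advertised_s, limit_s);
			warned = true;
		}
		/* A real driver would sleep here; the sketch advances time. */
	}

	return -1;
}

/* Example device model: ready once *ctx seconds have elapsed. */
bool sketch_ready_after(unsigned int elapsed_s, void *ctx)
{
	return elapsed_s >= *(unsigned int *)ctx;
}
```

With this shape a device that advertises 1s but needs 4s still comes
up, it just earns a log line, while anything slower than the static
limit fails regardless of what it advertises.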

* Re: [PATCH 20/23] cxl/port: Introduce a port driver
  2021-11-23 22:03     ` Ben Widawsky
  2021-11-23 22:36       ` Dan Williams
@ 2021-11-24 21:31       ` Bjorn Helgaas
  1 sibling, 0 replies; 133+ messages in thread
From: Bjorn Helgaas @ 2021-11-24 21:31 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Tue, Nov 23, 2021 at 02:03:15PM -0800, Ben Widawsky wrote:
> On 21-11-23 12:21:28, Bjorn Helgaas wrote:
> > On Fri, Nov 19, 2021 at 04:02:47PM -0800, Ben Widawsky wrote:

> > > 2. Hostbridge port. 
> > > ...
> > >    It has n downstream ports,
> > >    each of which are known as CXL 2.0 root ports.
> > 
> > This sounds like a "host bridge port *contains* these root ports."
> > That doesn't sound right.
> 
> What sounds better? A CXL 2.0 Root Port is CXL capabilities on top
> of the PCIe component which has the PCI_EXP_TYPE_ROOT_PORT cap. In
> my mental model, a host bridge does contain the root ports. Perhaps
> I am wrong about that?

"A host bridge contains the root ports" makes sense to me.
"A host bridge *port* contains root ports" does not.

The PCIe spec would describe this as a "Root Complex may support one
or more [Root] Ports" (see PCIe r5.0, sec 1.3.1).

In PCIe, a "Port" is "an interface between a component and a PCI
Express Link."  It doesn't contain other Ports.

Sounds like CXL is the same here, and using the same terminology
(assuming that's what the CXL spec does) will reduce confusion.

> > > 3. Switch port. A switch port is similar to a hostbridge port except
> > >    register access is defined in the CXL specification in a platform
> > >    agnostic way. The downstream ports for a switch are simply known as
> > >    downstream ports. The enumeration of these are entirely contained
> > >    within cxl_port.
> > 
> > In PCIe, "Downstream Port" includes both Root Ports and Switch
> > Downstream Ports.  It sounds like that's not the case for CXL?
> 
> In CXL 2.0, it's fairly close to true that switch downstream ports
> and root ports are equivalent. From today's driver perspective they
> are equivalent. Root ports have some capabilities which switch
> downstream ports do not.

Same as PCIe.

> > > 4. Endpoint port. Endpoint ports are similar to switch ports with the
> > >    exception that they have no downstream ports, only the underlying
> > >    media on the device. The enumeration of these are started by cxl_pci,
> > >    and completed by cxl_port.
> > 
> > Does CXL use an "Upstream Port" concept similar to PCIe?  In PCIe,
> > "Upstream Port" includes both Switch Upstream Ports and the Upstream
> > Port on an Endpoint.
> 
> Not really. Endpoints aren't known as ports in the spec and they
> have a decent amount of divergence from upstream ports. The main
> area where they do converge is in the memory decoding capabilities.
> In fact, it might come to a point where we find adding an endpoint
> as a port does not make sense, but for now it does.

Since a PCIe "Port" refers to the interface between a component and a
Link, PCIe Endpoints have Upstream Ports just like switches do.  I'm
guessing CXL is the same.

The part about "endpoint ports have no downstream ports" is what
doesn't read well to me because ports don't have other ports.

This section is about the four types of ports in a system.  I'm
trying to match those up with spec terms, either PCIe or CXL.  It
sounds like you intend bullet 3 to include both Switch Upstream Ports
and Switch Downstream Ports, and bullet 4 to be only Endpoint Ports
(which are Upstream Ports).

> > I hope this driver is not modeled on the PCI portdrv.  IMO, that
> > was a design error, and the "port service drivers" (PME, hotplug,
> > AER, etc) should be directly integrated into the PCI core instead
> > of pretending to be independent drivers.
> 
> I'd like to understand a bit about why you think it was a design
> error. The port driver is intended to be a port services driver,
> however I see the services provided as quite different than the ones
> you mention. The primary service cxl_port will provide from here on
> out is the ability to manage HDM decoder resources for a port. Other
> independent drivers that want to use HDM decoder resources would
> rely on the port driver for this.

I'll continue this part in the later part of the thread.

> > > + *		* - Switch
> > > + *		  - Switch Upstream Port
> > > + *		  - Switch Downstream Port
> > > + *		* - Endpoint
> > > + *		  - Endpoint Port
> > > + *		  - N/A
> > 
> > What does "N/A" mean here?  Is it telling us something useful?
> 
> This gets rendered into a table that looks like the following:
> 
> | component  | upstream             | downstream             |
> | ---------  | --------             | ----------             |
> | Hostbridge | ACPI0016             | Root Port              |
> | Switch     | Switch Upstream Port | Switch Downstream Port |
> | Endpoint   | Endpoint Port        | N/A                    |

Makes sense, thanks.  I didn't know how to read the ReST and thought
this was just a list, where N/A wouldn't make much sense.

Bjorn

* Re: [PATCH 05/23] cxl/pci: Don't poll doorbell for mailbox access
  2021-11-20  0:02 ` [PATCH 05/23] cxl/pci: Don't poll doorbell for mailbox access Ben Widawsky
  2021-11-22 15:11   ` Jonathan Cameron
@ 2021-11-24 21:55   ` Dan Williams
  2021-11-29 18:33     ` Ben Widawsky
  1 sibling, 1 reply; 133+ messages in thread
From: Dan Williams @ 2021-11-24 21:55 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Linux PCI, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Nov 19, 2021 at 4:03 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> The expectation is that the mailbox interface ready bit is the first
> step in access through the mailbox interface. Therefore, waiting for the
> doorbell busy bit to be clear would imply that the mailbox interface is
> ready. The original driver implementation used the doorbell timeout for
> the Mailbox Interface Ready bit to piggyback off of, since the latter
> doesn't have a defined timeout (introduced in 8adaf747c9f0 ("cxl/mem:
> Find device capabilities"), a timeout has since been defined with an ECN
> to the 2.0 spec). With the current driver waiting for mailbox interface
> ready as a part of probe() it's no longer necessary to use the
> piggyback.
>
> With the piggybacking no longer necessary it doesn't make sense to check
> doorbell status when acquiring the mailbox. It will be checked during
> the normal mailbox exchange protocol.
>
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
> This patch did not exist in RFCv2
> ---
>  drivers/cxl/pci.c | 25 ++++++-------------------
>  1 file changed, 6 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 2cef9fec8599..869b4fc18e27 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -221,27 +221,14 @@ static int cxl_pci_mbox_get(struct cxl_dev_state *cxlds)
>
>         /*
>          * XXX: There is some amount of ambiguity in the 2.0 version of the spec
> -        * around the mailbox interface ready (8.2.8.5.1.1).  The purpose of the
> +        * around the mailbox interface ready (8.2.8.5.1.1). The purpose of the
>          * bit is to allow firmware running on the device to notify the driver
> -        * that it's ready to receive commands. It is unclear if the bit needs
> -        * to be read for each transaction mailbox, ie. the firmware can switch
> -        * it on and off as needed. Second, there is no defined timeout for
> -        * mailbox ready, like there is for the doorbell interface.
> -        *
> -        * Assumptions:
> -        * 1. The firmware might toggle the Mailbox Interface Ready bit, check
> -        *    it for every command.
> -        *
> -        * 2. If the doorbell is clear, the firmware should have first set the
> -        *    Mailbox Interface Ready bit. Therefore, waiting for the doorbell
> -        *    to be ready is sufficient.
> +        * that it's ready to receive commands. The spec does not clearly define
> +        * under what conditions the bit may get set or cleared. As of the 2.0
> +        * base specification there was no defined timeout for mailbox ready,
> +        * like there is for the doorbell interface. This was fixed with an ECN,
> +        * but it's possible early devices implemented this before the ECN.

Can we just drop the comment block altogether? Outside of
cxl_pci_setup_mailbox() the only time the mailbox status should be
checked is after a doorbell timeout after submitting a command.

>          */
> -       rc = cxl_pci_mbox_wait_for_doorbell(cxlds);
> -       if (rc) {
> -               dev_warn(dev, "Mailbox interface not ready\n");
> -               goto out;
> -       }
> -
>         md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
>         if (!(md_status & CXLMDEV_MBOX_IF_READY && CXLMDEV_READY(md_status))) {
>                 dev_err(dev, "mbox: reported doorbell ready, but not mbox ready\n");

This error message is obsolete since nothing is pre-checking the
mailbox anymore, and per above I see no problem waiting to check the
status until after the mailbox has failed to respond after a timeout.

* Re: [PATCH 06/23] cxl/pci: Don't check media status for mbox access
  2021-11-20  0:02 ` [PATCH 06/23] cxl/pci: Don't check media status for mbox access Ben Widawsky
  2021-11-22 15:19   ` Jonathan Cameron
@ 2021-11-24 21:58   ` Dan Williams
  1 sibling, 0 replies; 133+ messages in thread
From: Dan Williams @ 2021-11-24 21:58 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Linux PCI, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Nov 19, 2021 at 4:03 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> Media status is necessary for using HDM contained in a CXL device but is
> not needed for mailbox accesses. Therefore remove this check. It will be
> necessary to have this check (in a different place) when enabling HDM.
>
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
> This patch did not exist in RFCv2
> ---
>  drivers/cxl/pci.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 869b4fc18e27..711bf4514480 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -230,7 +230,7 @@ static int cxl_pci_mbox_get(struct cxl_dev_state *cxlds)
>          * but it's possible early devices implemented this before the ECN.
>          */
>         md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
> -       if (!(md_status & CXLMDEV_MBOX_IF_READY && CXLMDEV_READY(md_status))) {
> +       if (!(md_status & CXLMDEV_MBOX_IF_READY)) {
>                 dev_err(dev, "mbox: reported doorbell ready, but not mbox ready\n");
>                 rc = -EBUSY;
>                 goto out;

Per comment on last patch I think this whole 'if' block can go.
"Ready" need only be checked once at the beginning of time and then
only as a forensic step after a future failure.

* Re: [PATCH 07/23] cxl/pci: Add new DVSEC definitions
  2021-11-22 17:32     ` Ben Widawsky
@ 2021-11-24 22:03       ` Dan Williams
  0 siblings, 0 replies; 133+ messages in thread
From: Dan Williams @ 2021-11-24 22:03 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: Jonathan Cameron, linux-cxl, Linux PCI, Alison Schofield,
	Ira Weiny, Vishal Verma

On Mon, Nov 22, 2021 at 9:32 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-11-22 15:22:24, Jonathan Cameron wrote:
> > On Fri, 19 Nov 2021 16:02:34 -0800
> > Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > > While the new definitions are yet necessary at this point, they are
> > > introduced at this point to help solidify the newly minted schema for
> > > naming registers.
> > >
> > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> >
> > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>
> Thanks. I realized on re-reading this I didn't like the commit message. I
> reworded to this:
>
> While the new definitions are not yet necessary at this point, they are
> introduced to help solidify the newly minted schema for naming
> registers.
>
> Please let me know if you'd like me to drop your reviewed-by tag.

The typical changelog template for patches like this is:

"In preparation for adding features X, Y, and Z, add definitions for
A, B, and C."

Otherwise, patch looks good.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>

* Re: [PATCH 08/23] cxl/acpi: Map component registers for Root Ports
  2021-11-20  0:02 ` [PATCH 08/23] cxl/acpi: Map component registers for Root Ports Ben Widawsky
  2021-11-22 15:51   ` Jonathan Cameron
@ 2021-11-24 22:18   ` Dan Williams
  1 sibling, 0 replies; 133+ messages in thread
From: Dan Williams @ 2021-11-24 22:18 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Linux PCI, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Nov 19, 2021 at 4:03 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> This implements the TODO in cxl_acpi for mapping component registers.
> cxl_acpi becomes the second consumer of CXL register block enumeration
> (cxl_pci being the first). Moving the functionality to cxl_core allows
> both of these drivers to use the functionality. Equally importantly it
> allows cxl_core to use the functionality in the future.
>
> CXL 2.0 root ports are similar to CXL 2.0 Downstream Ports with the main
> distinction being they're a part of the CXL 2.0 host bridge. While
> mapping their component registers is not immediately useful for the CXL
> drivers, the movement of register block enumeration into core is a vital
> step towards HDM decoder programming.
>
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
>
> ---
> Changes since RFCv2:
> - Squash commits together (Dan)
> - Reword commit message to account for above.
> ---
>  drivers/cxl/acpi.c      | 10 ++++++--
>  drivers/cxl/core/regs.c | 54 +++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxl.h       |  4 +++
>  drivers/cxl/pci.c       | 52 ---------------------------------------
>  drivers/cxl/pci.h       |  4 +++
>  5 files changed, 70 insertions(+), 54 deletions(-)
>
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index 3163167ecc3a..7cfa8b568013 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -7,6 +7,7 @@
>  #include <linux/acpi.h>
>  #include <linux/pci.h>
>  #include "cxl.h"
> +#include "pci.h"
>
>  /* Encode defined in CXL 2.0 8.2.5.12.7 HDM Decoder Control Register */
>  #define CFMWS_INTERLEAVE_WAYS(x)       (1 << (x)->interleave_ways)
> @@ -134,11 +135,13 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
>
>  __mock int match_add_root_ports(struct pci_dev *pdev, void *data)
>  {
> +       resource_size_t creg = CXL_RESOURCE_NONE;
>         struct cxl_walk_context *ctx = data;
>         struct pci_bus *root_bus = ctx->root;
>         struct cxl_port *port = ctx->port;
>         int type = pci_pcie_type(pdev);
>         struct device *dev = ctx->dev;
> +       struct cxl_register_map map;
>         u32 lnkcap, port_num;
>         int rc;
>
> @@ -152,9 +155,12 @@ __mock int match_add_root_ports(struct pci_dev *pdev, void *data)
>                                   &lnkcap) != PCIBIOS_SUCCESSFUL)
>                 return 0;
>
> -       /* TODO walk DVSEC to find component register base */
> +       rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
> +       if (!rc)
> +               creg = cxl_reg_block(pdev, &map);

A couple comments: the difference between cxl_find_regblock() and
cxl_reg_block() is not obvious from the names. Setting aside why one
is regblock and the other is reg_block I would expect a name like
cxl_regmap_to_base() to be easier to read.

It occurs to me that if cxl_find_regblock() failures are optional it
would be nice if cxl_regmap_to_base() returns CXL_RESOURCE_NONE if the
map is not populated. Then this can unconditionally call
cxl_regmap_to_base().

> +
>         port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
> -       rc = cxl_add_dport(port, &pdev->dev, port_num, CXL_RESOURCE_NONE);
> +       rc = cxl_add_dport(port, &pdev->dev, port_num, creg);
>         if (rc) {
>                 ctx->error = rc;
>                 return rc;
> diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
> index e37e23bf4355..41a0245867ea 100644
> --- a/drivers/cxl/core/regs.c
> +++ b/drivers/cxl/core/regs.c
> @@ -5,6 +5,7 @@
>  #include <linux/slab.h>
>  #include <linux/pci.h>
>  #include <cxlmem.h>
> +#include <pci.h>
>
>  /**
>   * DOC: cxl registers
> @@ -247,3 +248,56 @@ int cxl_map_device_regs(struct pci_dev *pdev,
>         return 0;
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_map_device_regs, CXL);
> +
> +static void cxl_decode_regblock(u32 reg_lo, u32 reg_hi,
> +                               struct cxl_register_map *map)
> +{
> +       map->block_offset = ((u64)reg_hi << 32) |
> +                           (reg_lo & CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK);
> +       map->barno = FIELD_GET(CXL_DVSEC_REG_LOCATOR_BIR_MASK, reg_lo);
> +       map->reg_type = FIELD_GET(CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK, reg_lo);
> +}
> +
> +/**
> + * cxl_find_regblock() - Locate register blocks by type
> + * @pdev: The CXL PCI device to enumerate.
> + * @type: Register Block Indicator id
> + * @map: Enumeration output, clobbered on error
> + *
> + * Return: 0 if register block enumerated, negative error code otherwise
> + *
> + * A CXL DVSEC may additional point one or more register blocks, search
> + * for them by @type.
> + */
> +int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
> +                     struct cxl_register_map *map)
> +{
> +       u32 regloc_size, regblocks;
> +       int regloc, i;
> +
> +       regloc = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
> +                                          CXL_DVSEC_REG_LOCATOR);
> +       if (!regloc)
> +               return -ENXIO;
> +
> +       pci_read_config_dword(pdev, regloc + PCI_DVSEC_HEADER1, &regloc_size);
> +       regloc_size = FIELD_GET(PCI_DVSEC_HEADER1_LENGTH_MASK, regloc_size);
> +
> +       regloc += CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET;
> +       regblocks = (regloc_size - CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET) / 8;
> +
> +       for (i = 0; i < regblocks; i++, regloc += 8) {
> +               u32 reg_lo, reg_hi;
> +
> +               pci_read_config_dword(pdev, regloc, &reg_lo);
> +               pci_read_config_dword(pdev, regloc + 4, &reg_hi);
> +
> +               cxl_decode_regblock(reg_lo, reg_hi, map);
> +
> +               if (map->reg_type == type)
> +                       return 0;
> +       }
> +
> +       return -ENODEV;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_find_regblock, CXL);
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index ab4596f0b751..7150a9694f66 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -145,6 +145,10 @@ int cxl_map_device_regs(struct pci_dev *pdev,
>                         struct cxl_device_regs *regs,
>                         struct cxl_register_map *map);
>
> +enum cxl_regloc_type;
> +int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
> +                     struct cxl_register_map *map);
> +
>  #define CXL_RESOURCE_NONE ((resource_size_t) -1)
>  #define CXL_TARGET_STRLEN 20
>
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 711bf4514480..d2c743a31b0c 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -433,58 +433,6 @@ static int cxl_map_regs(struct cxl_dev_state *cxlds, struct cxl_register_map *ma
>         return 0;
>  }
>
> -static void cxl_decode_regblock(u32 reg_lo, u32 reg_hi,
> -                               struct cxl_register_map *map)
> -{
> -       map->block_offset = ((u64)reg_hi << 32) |
> -                           (reg_lo & CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK);
> -       map->barno = FIELD_GET(CXL_DVSEC_REG_LOCATOR_BIR_MASK, reg_lo);
> -       map->reg_type = FIELD_GET(CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK, reg_lo);
> -}
> -
> -/**
> - * cxl_find_regblock() - Locate register blocks by type
> - * @pdev: The CXL PCI device to enumerate.
> - * @type: Register Block Indicator id
> - * @map: Enumeration output, clobbered on error
> - *
> - * Return: 0 if register block enumerated, negative error code otherwise
> - *
> - * A CXL DVSEC may point to one or more register blocks, search for them
> - * by @type.
> - */
> -static int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
> -                            struct cxl_register_map *map)
> -{
> -       u32 regloc_size, regblocks;
> -       int regloc, i;
> -
> -       regloc = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
> -                                          CXL_DVSEC_REG_LOCATOR);
> -       if (!regloc)
> -               return -ENXIO;
> -
> -       pci_read_config_dword(pdev, regloc + PCI_DVSEC_HEADER1, &regloc_size);
> -       regloc_size = FIELD_GET(PCI_DVSEC_HEADER1_LENGTH_MASK, regloc_size);
> -
> -       regloc += CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET;
> -       regblocks = (regloc_size - CXL_DVSEC_REG_LOCATOR_BLOCK1_OFFSET) / 8;
> -
> -       for (i = 0; i < regblocks; i++, regloc += 8) {
> -               u32 reg_lo, reg_hi;
> -
> -               pci_read_config_dword(pdev, regloc, &reg_lo);
> -               pci_read_config_dword(pdev, regloc + 4, &reg_hi);
> -
> -               cxl_decode_regblock(reg_lo, reg_hi, map);
> -
> -               if (map->reg_type == type)
> -                       return 0;
> -       }
> -
> -       return -ENODEV;
> -}
> -
>  static int cxl_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
>                           struct cxl_register_map *map)
>  {
> diff --git a/drivers/cxl/pci.h b/drivers/cxl/pci.h
> index 8ae2b4adc59d..a4b506bb37d1 100644
> --- a/drivers/cxl/pci.h
> +++ b/drivers/cxl/pci.h
> @@ -47,4 +47,8 @@ enum cxl_regloc_type {
>         CXL_REGLOC_RBI_TYPES
>  };
>
> +#define cxl_reg_block(pdev, map)                                               \
> +       ((resource_size_t)(pci_resource_start(pdev, (map)->barno) +            \
> +                          (map)->block_offset))
> +

I see no reason for this to be a macro. It's also a bug time bomb if
someone in the future does something like:

    cxl_reg_block(pdev, map++);

...because the macro references its arguments more than once.
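
To make the hazard concrete, here is a userspace sketch with made-up
mock_* names (none of this is the kernel's code). The macro mirrors the
shape of cxl_reg_block() and expands its argument twice, while the
static inline evaluates it once. Note that with map++ the macro's two
increments are unsequenced, which is undefined behavior in C, so the
sketch counts calls to a lookup function instead:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t resource_size_t;

/* Hypothetical, simplified register map; not the kernel's struct. */
struct mock_map {
	int barno;
	uint64_t block_offset;
};

/* Pretend BAR start addresses, indexed by BAR number. */
static const resource_size_t mock_bar_start[] = { 0x1000, 0x2000 };

/* Macro form: 'map' appears twice, so its expression is evaluated twice. */
#define MOCK_REG_BLOCK_MACRO(map) \
	((resource_size_t)(mock_bar_start[(map)->barno] + (map)->block_offset))

/* Function form: the argument expression is evaluated exactly once. */
static inline resource_size_t mock_reg_block(const struct mock_map *map)
{
	assert(map->barno >= 0 && map->barno < 2);
	return mock_bar_start[map->barno] + map->block_offset;
}

/* An argument with a visible side effect: counts how often it is evaluated. */
static int mock_lookups;
static const struct mock_map mock_the_map = { .barno = 1, .block_offset = 0x20 };

static const struct mock_map *mock_lookup_map(void)
{
	mock_lookups++;
	return &mock_the_map;
}
```

Passing mock_lookup_map() to the macro bumps the counter twice; passing
it to the inline function bumps it once, for the same returned value.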


* Re: [PATCH 09/23] cxl: Introduce module_cxl_driver
  2021-11-20  0:02 ` [PATCH 09/23] cxl: Introduce module_cxl_driver Ben Widawsky
  2021-11-22 15:54   ` Jonathan Cameron
@ 2021-11-24 22:22   ` Dan Williams
  1 sibling, 0 replies; 133+ messages in thread
From: Dan Williams @ 2021-11-24 22:22 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Linux PCI, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Nov 19, 2021 at 4:03 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> Many CXL drivers simply want to register and unregister themselves.

s/Many/Some/?

> module_driver already supported this. A simple wrapper around that
> reduces a decent amount of boilerplate in upcoming patches.

Looks good.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>


* Re: [PATCH 10/23] cxl/core: Convert decoder range to resource
  2021-11-20  0:02 ` [PATCH 10/23] cxl/core: Convert decoder range to resource Ben Widawsky
  2021-11-22 16:08   ` Jonathan Cameron
@ 2021-11-24 22:41   ` Dan Williams
  1 sibling, 0 replies; 133+ messages in thread
From: Dan Williams @ 2021-11-24 22:41 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Linux PCI, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Nov 19, 2021 at 4:03 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> CXL decoders manage address ranges in a hierarchical fashion whereby a
> leaf is a unique subregion of its parent decoder (midlevel or root). It
> therefore makes sense to use the resource API for handling this.
>
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
>
> ---
> Changes since RFCv2
> - Switch to DEFINE_RES_MEM from NAMED variant (Dan)
> - Differentiate CFMWS resources and other decoder resources (Ben)
> - Make decoder resources be range, nor resource (Dan)
> - Set decoder name in cxl_decoder_add() (Dan)
> ---
>  drivers/cxl/acpi.c     | 16 ++++++----------
>  drivers/cxl/core/bus.c | 19 +++++++++++++++++--
>  drivers/cxl/cxl.h      |  8 ++++++--
>  3 files changed, 29 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index 7cfa8b568013..3415184a2e61 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -108,10 +108,8 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
>
>         cxld->flags = cfmws_to_decoder_flags(cfmws->restrictions);
>         cxld->target_type = CXL_DECODER_EXPANDER;
> -       cxld->range = (struct range){
> -               .start = cfmws->base_hpa,
> -               .end = cfmws->base_hpa + cfmws->window_size - 1,
> -       };
> +       cxld->platform_res = (struct resource)DEFINE_RES_MEM(cfmws->base_hpa,
> +                                                            cfmws->window_size);
>         cxld->interleave_ways = CFMWS_INTERLEAVE_WAYS(cfmws);
>         cxld->interleave_granularity = CFMWS_INTERLEAVE_GRANULARITY(cfmws);
>
> @@ -127,8 +125,9 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
>                 return 0;
>         }
>         dev_dbg(dev, "add: %s node: %d range %#llx-%#llx\n",
> -               dev_name(&cxld->dev), phys_to_target_node(cxld->range.start),
> -               cfmws->base_hpa, cfmws->base_hpa + cfmws->window_size - 1);
> +               dev_name(&cxld->dev),
> +               phys_to_target_node(cxld->platform_res.start), cfmws->base_hpa,
> +               cfmws->base_hpa + cfmws->window_size - 1);

Since you converted to a resource you can use %pr:

        dev_dbg(dev, "add: %s node: %d range %pr\n", dev_name(&cxld->dev),
                phys_to_target_node(cxld->platform_res.start),
                &cxld->platform_res);
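
For anyone checking the bounds: a resource built from a base and a size
ends at base + size - 1, which is what the old format arguments spelled
out by hand. A userspace sketch of that arithmetic follows; the mock_*
names are made up, and the 64-bit resource_size_t is an assumption (it
is configuration dependent in the kernel):

```c
#include <stdint.h>

typedef uint64_t resource_size_t;

/* Hypothetical stand-in for struct resource: bounds only, no flags or name. */
struct mock_resource {
	resource_size_t start;
	resource_size_t end;	/* inclusive */
};

/* Same start/end arithmetic as building a resource from (start, size). */
static struct mock_resource mock_res_mem(resource_size_t start,
					 resource_size_t size)
{
	struct mock_resource res = {
		.start = start,
		.end = start + size - 1,
	};

	return res;
}

/* Size of an inclusive [start, end] span. */
static resource_size_t mock_resource_size(const struct mock_resource *res)
{
	return res->end - res->start + 1;
}
```

One side effect worth noticing: a (0, 0) resource ends up with
end == (resource_size_t)-1 by wraparound, the same bounds as the old
{ .start = 0, .end = -1 } range in add_host_bridge_uport().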


>
>         return 0;
>  }
> @@ -267,10 +266,7 @@ static int add_host_bridge_uport(struct device *match, void *arg)
>         cxld->interleave_ways = 1;
>         cxld->interleave_granularity = PAGE_SIZE;
>         cxld->target_type = CXL_DECODER_EXPANDER;
> -       cxld->range = (struct range) {
> -               .start = 0,
> -               .end = -1,
> -       };
> +       cxld->platform_res = (struct resource)DEFINE_RES_MEM(0, 0);
>
>         device_lock(&port->dev);
>         dport = list_first_entry(&port->dports, typeof(*dport), list);
> diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> index 17a4fff029f8..8e80e85350b1 100644
> --- a/drivers/cxl/core/bus.c
> +++ b/drivers/cxl/core/bus.c
> @@ -46,8 +46,14 @@ static ssize_t start_show(struct device *dev, struct device_attribute *attr,
>                           char *buf)
>  {
>         struct cxl_decoder *cxld = to_cxl_decoder(dev);
> +       u64 start = 0;

No need to init to zero.

>
> -       return sysfs_emit(buf, "%#llx\n", cxld->range.start);
> +       if (is_root_decoder(dev))
> +               start = cxld->platform_res.start;
> +       else
> +               start = cxld->decoder_range.start;
> +
> +       return sysfs_emit(buf, "%#llx\n", start);
>  }
>  static DEVICE_ATTR_RO(start);
>
> @@ -55,8 +61,14 @@ static ssize_t size_show(struct device *dev, struct device_attribute *attr,
>                         char *buf)
>  {
>         struct cxl_decoder *cxld = to_cxl_decoder(dev);
> +       u64 size = 0;

Same "no init" comment.

>
> -       return sysfs_emit(buf, "%#llx\n", range_len(&cxld->range));
> +       if (is_root_decoder(dev))
> +               size = resource_size(&cxld->platform_res);
> +       else
> +               size = range_len(&cxld->decoder_range);
> +
> +       return sysfs_emit(buf, "%#llx\n", size);
>  }
>  static DEVICE_ATTR_RO(size);
>
> @@ -548,6 +560,9 @@ int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
>         if (rc)
>                 return rc;
>
> +       if (is_root_decoder(dev))
> +               cxld->platform_res.name = dev_name(dev);

Maybe a comment about why the resource wants the name of the decoder?
Just to help explain the motivation to separate this initialization
step from the rest of the resource init.


Other than that you can add:

Reviewed-by: Dan Williams <dan.j.williams@intel.com>


* Re: [PATCH 11/23] cxl/core: Document and tighten up decoder APIs
  2021-11-20  0:02 ` [PATCH 11/23] cxl/core: Document and tighten up decoder APIs Ben Widawsky
  2021-11-22 16:13   ` Jonathan Cameron
@ 2021-11-24 22:55   ` Dan Williams
  1 sibling, 0 replies; 133+ messages in thread
From: Dan Williams @ 2021-11-24 22:55 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Linux PCI, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Nov 19, 2021 at 4:03 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> Since the code to add decoders for switches and endpoints is on the
> horizon it helps to have properly documented APIs. In addition, the
> decoder APIs will never need to support a negative count for downstream
> targets as the spec explicitly starts numbering them at 1, ie. even 0 is
> an "invalid" value which can be used as a sentinel.

Looks good to me:

Reviewed-by: Dan Williams <dan.j.williams@intel.com>

>
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
>
> ---
>
> This is respun from a previous incantation here:
> https://lore.kernel.org/linux-cxl/20210915155946.308339-1-ben.widawsky@intel.com/
> ---
>  drivers/cxl/core/bus.c | 33 +++++++++++++++++++++++++++++++--
>  drivers/cxl/cxl.h      |  3 ++-
>  2 files changed, 33 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> index 8e80e85350b1..1ee12a60f3f4 100644
> --- a/drivers/cxl/core/bus.c
> +++ b/drivers/cxl/core/bus.c
> @@ -495,7 +495,20 @@ static int decoder_populate_targets(struct cxl_decoder *cxld,
>         return rc;
>  }
>
> -struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
> +/**
> + * cxl_decoder_alloc - Allocate a new CXL decoder
> + * @port: owning port of this decoder
> + * @nr_targets: downstream targets accessible by this decoder. All upstream
> + *             ports and root ports must have at least 1 target.
> + *
> + * A port should contain one or more decoders. Each of those decoders enable
> + * some address space for CXL.mem utilization. A decoder is expected to be
> + * configured by the caller before registering.
> + *
> + * Return: A new cxl decoder to be registered by cxl_decoder_add()
> + */
> +struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
> +                                     unsigned int nr_targets)
>  {
>         struct cxl_decoder *cxld, cxld_const_init = {
>                 .nr_targets = nr_targets,
> @@ -503,7 +516,7 @@ struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
>         struct device *dev;
>         int rc = 0;
>
> -       if (nr_targets > CXL_DECODER_MAX_INTERLEAVE || nr_targets < 1)
> +       if (nr_targets > CXL_DECODER_MAX_INTERLEAVE || nr_targets == 0)
>                 return ERR_PTR(-EINVAL);
>
>         cxld = kzalloc(struct_size(cxld, target, nr_targets), GFP_KERNEL);
> @@ -535,6 +548,22 @@ struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_decoder_alloc, CXL);
>
> +/**
> + * cxl_decoder_add - Add a decoder with targets
> + * @cxld: The cxl decoder allocated by cxl_decoder_alloc()
> + * @target_map: A list of downstream ports that this decoder can direct memory
> + *              traffic to. These numbers should correspond with the port number
> + *              in the PCIe Link Capabilities structure.
> + *
> + * Certain types of decoders may not have any targets. The main example of this
> + * is an endpoint device. A more awkward example is a hostbridge whose root
> + * ports get hot added (technically possible, though unlikely).
> + *
> + * Context: Process context. Takes and releases the cxld's device lock.
> + *
> + * Return: Negative error code if the decoder wasn't properly configured; else
> + *        returns 0.
> + */
>  int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
>  {
>         struct cxl_port *port;
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index ad816fb5bdcc..b66ed8f241c6 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -288,7 +288,8 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport, int port_id,
>
>  struct cxl_decoder *to_cxl_decoder(struct device *dev);
>  bool is_root_decoder(struct device *dev);
> -struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets);
> +struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
> +                                     unsigned int nr_targets);
>  int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map);
>  int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld);
>
> --
> 2.34.0
>


* Re: [PATCH 12/23] cxl: Introduce endpoint decoders
  2021-11-22 19:37     ` Ben Widawsky
@ 2021-11-25  0:07       ` Dan Williams
  2021-11-29 20:05         ` Ben Widawsky
  0 siblings, 1 reply; 133+ messages in thread
From: Dan Williams @ 2021-11-25  0:07 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: Jonathan Cameron, linux-cxl, Linux PCI, Alison Schofield,
	Ira Weiny, Vishal Verma

On Mon, Nov 22, 2021 at 11:38 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-11-22 16:20:39, Jonathan Cameron wrote:
> > On Fri, 19 Nov 2021 16:02:39 -0800
> > Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > > Endpoints have decoders too. It is useful to share the same
> > > infrastructure from cxl_core. Endpoints do not have dports (downstream
> > > targets), only the underlying physical medium. As a result, some special
> > > casing is needed.
> > >
> > > There is no functional change introduced yet as endpoints don't actually
> > > enumerate decoders yet.
> > >
> > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> >
> > I'm not a fan of special values like using 0 here to indicate endpoint
> > device.  I'd rather see a base cxl_decode_alloc(..., bool ep)
> > and possibly wrappers for the non ep case and ep one.
> >
> > Jonathan
> >
>
> My inclination is the opposite. However, I think you and Dan both brought up
> something to this effect in the previous RFCs.
>
> Dan, do you have a preference here?

I was thinking something along the lines of what Jonathan wants:
explicit per-type APIs, while code internal / private to the core can
use heuristics like nr_targets == 0 == endpoint.

So unexport cxl_decoder_alloc() and have separate:

cxl_root_decoder_alloc()
cxl_switch_decoder_alloc()
cxl_endpoint_decoder_alloc()

...APIs that use a static cxl_decoder_alloc() internally. It probably
also wants a cxl_endpoint_decoder_add() that drops the need to pass a
NULL @target_map.
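
Something like the following shape, as a userspace sketch only. All
mock_* names, the type enum, and the interleave limit of 16 are
assumptions for illustration, not the eventual kernel API:

```c
#include <stdlib.h>

/* Assumption: stands in for CXL_DECODER_MAX_INTERLEAVE. */
#define MOCK_MAX_INTERLEAVE 16

enum mock_decoder_type { MOCK_ROOT, MOCK_SWITCH, MOCK_ENDPOINT };

/* Hypothetical, pared-down decoder; the real one embeds a struct device. */
struct mock_decoder {
	enum mock_decoder_type type;
	unsigned int nr_targets;
};

/* Private to the core: only here does nr_targets == 0 mean "endpoint". */
static struct mock_decoder *mock_decoder_alloc(enum mock_decoder_type type,
					       unsigned int nr_targets)
{
	struct mock_decoder *cxld;

	if (nr_targets > MOCK_MAX_INTERLEAVE)
		return NULL;

	cxld = calloc(1, sizeof(*cxld));
	if (!cxld)
		return NULL;

	cxld->type = type;
	cxld->nr_targets = nr_targets;
	return cxld;
}

/* Root and switch decoders must have at least one downstream target. */
struct mock_decoder *mock_root_decoder_alloc(unsigned int nr_targets)
{
	if (nr_targets == 0)
		return NULL;
	return mock_decoder_alloc(MOCK_ROOT, nr_targets);
}

struct mock_decoder *mock_switch_decoder_alloc(unsigned int nr_targets)
{
	if (nr_targets == 0)
		return NULL;
	return mock_decoder_alloc(MOCK_SWITCH, nr_targets);
}

/* Endpoints have no dports, so callers never pass a target count. */
struct mock_decoder *mock_endpoint_decoder_alloc(void)
{
	return mock_decoder_alloc(MOCK_ENDPOINT, 0);
}
```

The sentinel stays an implementation detail: callers state which kind
of decoder they want and cannot create an endpoint decoder by accident
by passing 0.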


* Re: [PATCH 13/23] cxl/core: Move target population locking to caller
  2021-11-20  0:02 ` [PATCH 13/23] cxl/core: Move target population locking to caller Ben Widawsky
  2021-11-22 16:33   ` Jonathan Cameron
@ 2021-11-25  0:34   ` Dan Williams
  1 sibling, 0 replies; 133+ messages in thread
From: Dan Williams @ 2021-11-25  0:34 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Linux PCI, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Nov 19, 2021 at 4:03 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> In preparation for a port driver that enumerates a descendant port +
> decoder hierarchy, arrange for an unlocked version of cxl_decoder_add().
> Otherwise a port-driver that adds a child decoder will deadlock on the
> device_lock() in ->probe().
>
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
>
> ---
>
> Changes since RFCv2:
> - Reword commit message (Dan)
> - Move decoder API changes into this patch (Dan)
> ---
>  drivers/cxl/core/bus.c | 59 +++++++++++++++++++++++++++++++-----------
>  drivers/cxl/cxl.h      |  1 +
>  2 files changed, 45 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> index 16b15f54fb62..cd6fe7823c69 100644
> --- a/drivers/cxl/core/bus.c
> +++ b/drivers/cxl/core/bus.c
> @@ -487,28 +487,22 @@ static int decoder_populate_targets(struct cxl_decoder *cxld,
>  {
>         int rc = 0, i;
>
> +       device_lock_assert(&port->dev);
> +
>         if (!target_map)
>                 return 0;
>
> -       device_lock(&port->dev);
> -       if (list_empty(&port->dports)) {
> -               rc = -EINVAL;
> -               goto out_unlock;
> -       }
> +       if (list_empty(&port->dports))
> +               return -EINVAL;
>
>         for (i = 0; i < cxld->nr_targets; i++) {
>                 struct cxl_dport *dport = find_dport(port, target_map[i]);
>
> -               if (!dport) {
> -                       rc = -ENXIO;
> -                       goto out_unlock;
> -               }
> +               if (!dport)
> +                       return -ENXIO;
>                 cxld->target[i] = dport;
>         }
>
> -out_unlock:
> -       device_unlock(&port->dev);
> -
>         return rc;
>  }
>
> @@ -571,7 +565,7 @@ struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
>  EXPORT_SYMBOL_NS_GPL(cxl_decoder_alloc, CXL);
>
>  /**
> - * cxl_decoder_add - Add a decoder with targets
> + * cxl_decoder_add_locked - Add a decoder with targets
>   * @cxld: The cxl decoder allocated by cxl_decoder_alloc()
>   * @target_map: A list of downstream ports that this decoder can direct memory
>   *              traffic to. These numbers should correspond with the port number
> @@ -581,12 +575,14 @@ EXPORT_SYMBOL_NS_GPL(cxl_decoder_alloc, CXL);
>   * is an endpoint device. A more awkward example is a hostbridge whose root
>   * ports get hot added (technically possible, though unlikely).
>   *
> - * Context: Process context. Takes and releases the cxld's device lock.
> + * This is the locked variant of cxl_decoder_add().
> + *
> + * Context: Process context. Expects the cxld's device lock to be held.
>   *
>   * Return: Negative error code if the decoder wasn't properly configured; else
>   *        returns 0.
>   */
> -int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
> +int cxl_decoder_add_locked(struct cxl_decoder *cxld, int *target_map)
>  {
>         struct cxl_port *port;
>         struct device *dev;
> @@ -619,6 +615,39 @@ int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
>
>         return device_add(dev);
>  }
> +EXPORT_SYMBOL_NS_GPL(cxl_decoder_add_locked, CXL);
> +
> +/**
> + * cxl_decoder_add - Add a decoder with targets
> + * @cxld: The cxl decoder allocated by cxl_decoder_alloc()
> + * @target_map: A list of downstream ports that this decoder can direct memory
> + *              traffic to. These numbers should correspond with the port number
> + *              in the PCIe Link Capabilities structure.
> + *
> + * This is the unlocked variant of cxl_decoder_add_locked().
> + * See cxl_decoder_add_locked().
> + *
> + * Context: Process context. Takes and releases the cxld's device lock.

No, it takes the port's lock to walk its dport list.

Otherwise, looks good to me:

Reviewed-by: Dan Williams <dan.j.williams@intel.com>


* Re: [PATCH 14/23] cxl: Introduce topology host registration
  2021-11-20  0:02 ` [PATCH 14/23] cxl: Introduce topology host registration Ben Widawsky
  2021-11-22 18:20   ` Jonathan Cameron
@ 2021-11-25  1:09   ` Dan Williams
  2021-11-29 21:23     ` Ben Widawsky
  1 sibling, 1 reply; 133+ messages in thread
From: Dan Williams @ 2021-11-25  1:09 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Linux PCI, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Nov 19, 2021 at 4:03 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> The description of the CXL topology will be conveyed by a platform
> specific entity that is expected to be a singleton. For ACPI based
> systems, this is ACPI0017. When the topology host goes away, which as of
> now can only be triggered by module unload, it is desirable to have the
> entire topology cleaned up. Regular devm unwinding handles most
> situations already, but what's missing is the ability to teardown the
> root port. Since the root port is platform specific, the core needs a
> set of APIs to allow platform specific drivers to register their root
> ports. With that, all the automatic teardown can occur.

Wait, no, that was one of the original motivations, but then we
discussed here [1] that devm teardown of a topology can happen
naturally / hierarchically.

[1]: https://lore.kernel.org/r/CAPcyv4ikVFFqyfH2zLhBVJ28N1_gufGHd2gVbP2h+Rv2cZEpeA@mail.gmail.com

No, the reason for the cxl_topology_host is as a constraint for when
CXL.mem connectivity can be verified from root to endpoint. Given that
endpoints can attach at any point in time relative to when the root
arrives, CXL.mem connectivity needs to be revalidated at every topology
host arrival / departure event.


* Re: [PATCH 04/23] cxl/pci: Implement Interface Ready Timeout
  2021-11-24 19:56         ` Dan Williams
@ 2021-11-25  6:17           ` Ben Widawsky
  2021-11-25  7:14             ` Dan Williams
  0 siblings, 1 reply; 133+ messages in thread
From: Ben Widawsky @ 2021-11-25  6:17 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jonathan Cameron, linux-cxl, Linux PCI, Alison Schofield,
	Ira Weiny, Vishal Verma

On 21-11-24 11:56:36, Dan Williams wrote:
> On Mon, Nov 22, 2021 at 9:54 AM Jonathan Cameron
> <Jonathan.Cameron@huawei.com> wrote:
> >
> > On Mon, 22 Nov 2021 09:17:31 -0800
> > Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > > On 21-11-22 15:02:27, Jonathan Cameron wrote:
> > > > On Fri, 19 Nov 2021 16:02:31 -0800
> > > > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > >
> > > > > The original driver implementation used the doorbell timeout for the
> > > > > Mailbox Interface Ready bit to piggy back off of, since the latter
> > > > > doesn't have a defined timeout. This functionality, introduced in
> > > > > 8adaf747c9f0 ("cxl/mem: Find device capabilities"), can now be improved
> > > > > since a timeout has been defined with an ECN to the 2.0 spec.
> > > > >
> > > > > While devices implemented prior to the ECN could have an arbitrarily
> > > > > long wait and still be within spec, the max ECN value (256s) is chosen
> > > > > as the default for all devices. All vendors in the consortium agreed to
> > > > > this amount and so it is reasonable to assume no devices made will
> > > > > exceed this amount.
> > > >
> > > > Optimistic :)
> > > >
> > >
> > > Reasonable to assume is certainly not the same as "in reality". I can soften
> > > this wording.
> > >
> > > > >
> > > > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > > > ---
> > > > > This patch did not exist in RFCv2
> > > > > ---
> > > > >  drivers/cxl/pci.c | 29 +++++++++++++++++++++++++++++
> > > > >  1 file changed, 29 insertions(+)
> > > > >
> > > > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > > > index 6c8d09fb3a17..2cef9fec8599 100644
> > > > > --- a/drivers/cxl/pci.c
> > > > > +++ b/drivers/cxl/pci.c
> > > > > @@ -2,6 +2,7 @@
> > > > >  /* Copyright(c) 2020 Intel Corporation. All rights reserved. */
> > > > >  #include <linux/io-64-nonatomic-lo-hi.h>
> > > > >  #include <linux/module.h>
> > > > > +#include <linux/delay.h>
> > > > >  #include <linux/sizes.h>
> > > > >  #include <linux/mutex.h>
> > > > >  #include <linux/list.h>
> > > > > @@ -298,6 +299,34 @@ static int cxl_pci_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *c
> > > > >  static int cxl_pci_setup_mailbox(struct cxl_dev_state *cxlds)
> > > > >  {
> > > > >   const int cap = readl(cxlds->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
> > > > > + unsigned long timeout;
> > > > > + u64 md_status;
> > > > > + int rc;
> > > > > +
> > > > > + /*
> > > > > +  * CXL 2.0 ECN "Add Mailbox Ready Time" defines a capability field to
> > > > > +  * dictate how long to wait for the mailbox to become ready. For
> > > > > +  * simplicity, and to handle devices that might have been implemented
> > > >
> > > > I'm not keen on the 'for simplicity' argument here.  If the device is advertising
> > > > a lower value, then that is what we should use.  It's fine to wait the max time
> > > > if nothing is specified.  It'll cost us a few lines of code at most unless
> > > > I am missing something...
> > > >
> > > > Jonathan
> > > >
> > >
> > > Let me pose it a different way, if a device advertises 1s, but for whatever
> > > takes 4s to come up, should we penalize it over the device advertising 256s?
> >
> > Yes, because it is buggy.  A compliance test should have failed on this anyway.
> >
> > > The
> > > way this field is defined in the spec would [IMHO] lead vendors to simply put
> > > the max field in there to game the driver, so why not start off with just
> > > insisting they don't?
> >
> > Given reading this value and getting a big number gives the implication that
> > the device is meant to be really slow to initialize, I'd expect that to push
> > vendors a little in the directly of putting realistic values in).
> >
> > Maybe we should print the value in the log to make them look silly ;)
> 
> A print message on the way to a static default timeout value is about
> all a device's self reported timeout is useful.
> 
> "device not ready after waiting %d seconds, continuing to wait up to %d seconds"
> 
> It's also not clear to me that the Linux default timeout should be so
> generous at 256 seconds. It might be suitable to just complain about
> devices that are taking more than 60 seconds to initialize with an
> option to override that timeout for odd outliers. Otherwise encourage
> hardware implementations to beat the Linux timeout value to get
> support out of the box.
> 
> I notice that not even libata waits more than a minute for a given
> device to finish post-reset shenanigans, so might as well set 60
> seconds as what the driver will tolerate out of the box.

60s is infinity, so 4x * infinity doesn't really make much difference does it
:P?

In my opinion, if we're going to pick a limit, we might as well tie it to a
spec definition rather than 60s. Perhaps 60s has some relevance I'm unaware
of, but it seems equally arbitrary to me.


* Re: [PATCH 04/23] cxl/pci: Implement Interface Ready Timeout
  2021-11-25  6:17           ` Ben Widawsky
@ 2021-11-25  7:14             ` Dan Williams
  0 siblings, 0 replies; 133+ messages in thread
From: Dan Williams @ 2021-11-25  7:14 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: Jonathan Cameron, linux-cxl, Linux PCI, Alison Schofield,
	Ira Weiny, Vishal Verma

On Wed, Nov 24, 2021 at 10:17 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-11-24 11:56:36, Dan Williams wrote:
> > On Mon, Nov 22, 2021 at 9:54 AM Jonathan Cameron
> > <Jonathan.Cameron@huawei.com> wrote:
> > >
> > > On Mon, 22 Nov 2021 09:17:31 -0800
> > > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >
> > > > On 21-11-22 15:02:27, Jonathan Cameron wrote:
> > > > > On Fri, 19 Nov 2021 16:02:31 -0800
> > > > > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > > >
> > > > > > The original driver implementation used the doorbell timeout for the
> > > > > > Mailbox Interface Ready bit to piggy back off of, since the latter
> > > > > > doesn't have a defined timeout. This functionality, introduced in
> > > > > > 8adaf747c9f0 ("cxl/mem: Find device capabilities"), can now be improved
> > > > > > since a timeout has been defined with an ECN to the 2.0 spec.
> > > > > >
> > > > > > While devices implemented prior to the ECN could have an arbitrarily
> > > > > > long wait and still be within spec, the max ECN value (256s) is chosen
> > > > > > as the default for all devices. All vendors in the consortium agreed to
> > > > > > this amount and so it is reasonable to assume no devices made will
> > > > > > exceed this amount.
> > > > >
> > > > > Optimistic :)
> > > > >
> > > >
> > > > Reasonable to assume is certainly not the same as "in reality". I can soften
> > > > this wording.
> > > >
> > > > > >
> > > > > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > > > > ---
> > > > > > This patch did not exist in RFCv2
> > > > > > ---
> > > > > >  drivers/cxl/pci.c | 29 +++++++++++++++++++++++++++++
> > > > > >  1 file changed, 29 insertions(+)
> > > > > >
> > > > > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > > > > index 6c8d09fb3a17..2cef9fec8599 100644
> > > > > > --- a/drivers/cxl/pci.c
> > > > > > +++ b/drivers/cxl/pci.c
> > > > > > @@ -2,6 +2,7 @@
> > > > > >  /* Copyright(c) 2020 Intel Corporation. All rights reserved. */
> > > > > >  #include <linux/io-64-nonatomic-lo-hi.h>
> > > > > >  #include <linux/module.h>
> > > > > > +#include <linux/delay.h>
> > > > > >  #include <linux/sizes.h>
> > > > > >  #include <linux/mutex.h>
> > > > > >  #include <linux/list.h>
> > > > > > @@ -298,6 +299,34 @@ static int cxl_pci_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *c
> > > > > >  static int cxl_pci_setup_mailbox(struct cxl_dev_state *cxlds)
> > > > > >  {
> > > > > >   const int cap = readl(cxlds->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
> > > > > > + unsigned long timeout;
> > > > > > + u64 md_status;
> > > > > > + int rc;
> > > > > > +
> > > > > > + /*
> > > > > > +  * CXL 2.0 ECN "Add Mailbox Ready Time" defines a capability field to
> > > > > > +  * dictate how long to wait for the mailbox to become ready. For
> > > > > > +  * simplicity, and to handle devices that might have been implemented
> > > > >
> > > > > I'm not keen on the 'for simplicity' argument here.  If the device is advertising
> > > > > a lower value, then that is what we should use.  It's fine to wait the max time
> > > > > if nothing is specified.  It'll cost us a few lines of code at most unless
> > > > > I am missing something...
> > > > >
> > > > > Jonathan
> > > > >
> > > >
> > > > Let me pose it a different way: if a device advertises 1s, but for whatever
> > > > reason takes 4s to come up, should we penalize it over the device advertising 256s?
> > >
> > > Yes, because it is buggy.  A compliance test should have failed on this anyway.
> > >
> > > > The
> > > > way this field is defined in the spec would [IMHO] lead vendors to simply put
> > > > the max field in there to game the driver, so why not start off with just
> > > > insisting they don't?
> > >
> > > Given that reading this value and getting a big number implies that
> > > the device is meant to be really slow to initialize, I'd expect that to push
> > > vendors a little in the direction of putting realistic values in.
> > >
> > > Maybe we should print the value in the log to make them look silly ;)
> >
> > A print message on the way to a static default timeout value is about
> > all a device's self-reported timeout is useful for.
> >
> > "device not ready after waiting %d seconds, continuing to wait up to %d seconds"
> >
> > It's also not clear to me that the Linux default timeout should be so
> > generous at 256 seconds. It might be suitable to just complain about
> > devices that are taking more than 60 seconds to initialize with an
> > option to override that timeout for odd outliers. Otherwise encourage
> > hardware implementations to beat the Linux timeout value to get
> > support out of the box.
> >
> > I notice that not even libata waits more than a minute for a given
> > device to finish post-reset shenanigans, so might as well set 60
> > seconds as what the driver will tolerate out of the box.
>
> 60s is infinity, so 4x * infinity doesn't really make much difference does it
> :P?

1 minute is half the hung task timeout in case something accidentally
did an uninterruptible sleep on probe completion event. 4 minutes is
on the order of what it takes a large server to boot. A single device
needs as much time as a server to boot?

> In my opinion, if we're going to pick a limit, we might as well tie it to a spec
> definition rather than 60s. Perhaps 60s has some relevance I'm unaware of, but
> it seems equally arbitrary to me.

4 minutes just seems an unreasonable amount of time to wait to make a
decision that something is likely broken. If the industry actually
builds devices that nominally take multiple minutes to boot it's
already going to be in the realm of something custom in terms of
application expectations for when the server is ready. I'll buy you a
beverage of your choice if someone actually builds such a thing.
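
To make the proposed policy concrete, here is a minimal sketch, assuming a
driver-chosen limit (60 seconds here) caps the wait and the device's
self-reported value is only used to trigger a log message. The names fake_dev,
fake_dev_ready, wait_interface_ready and DEFAULT_READY_TIMEOUT_S are
hypothetical, and time is simulated in whole seconds rather than slept; none of
this is taken from the posted patch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical driver default discussed in this thread (not from the patch). */
#define DEFAULT_READY_TIMEOUT_S 60

/* Hypothetical stand-in for a device; a real driver would read a status register. */
struct fake_dev {
	int advertised_s;	/* self-reported time to become ready */
	int ready_at_s;		/* simulated time at which the ready bit is set */
};

static bool fake_dev_ready(const struct fake_dev *d, int elapsed_s)
{
	return elapsed_s >= d->ready_at_s;
}

/*
 * Poll once per simulated second up to limit_s.
 * Returns the number of seconds waited on success, or -1 on timeout.
 */
static int wait_interface_ready(const struct fake_dev *d, int limit_s)
{
	bool warned = false;

	for (int t = 0; t <= limit_s; t++) {
		if (fake_dev_ready(d, t))
			return t;

		/* The advertised value is advisory: complain once, keep waiting. */
		if (!warned && t >= d->advertised_s && d->advertised_s < limit_s) {
			printf("device not ready after waiting %d seconds, continuing to wait up to %d seconds\n",
			       d->advertised_s, limit_s);
			warned = true;
		}
		/* A real driver would sleep here; the simulation just advances t. */
	}
	return -1;
}
```

With this shape, a device that advertises 1s but takes 4s still comes up (with
one log line), while a device that needs longer than the driver limit is
treated as broken unless the limit is overridden by the caller.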

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 14/23] cxl: Introduce topology host registration
  2021-11-20  0:02 ` [PATCH 14/23] cxl: Introduce topology host registration Ben Widawsky
  2021-11-22 18:20   ` Jonathan Cameron
@ 2021-11-29 11:42 ` Dan Carpenter
  1 sibling, 0 replies; 133+ messages in thread
From: kernel test robot @ 2021-11-25 21:53 UTC (permalink / raw)
  To: kbuild

[-- Attachment #1: Type: text/plain, Size: 5316 bytes --]

CC: kbuild-all(a)lists.01.org
In-Reply-To: <20211120000250.1663391-15-ben.widawsky@intel.com>
References: <20211120000250.1663391-15-ben.widawsky@intel.com>
TO: Ben Widawsky <ben.widawsky@intel.com>
TO: linux-cxl(a)vger.kernel.org
TO: linux-pci(a)vger.kernel.org
CC: Ben Widawsky <ben.widawsky@intel.com>
CC: Alison Schofield <alison.schofield@intel.com>
CC: Dan Williams <dan.j.williams@intel.com>
CC: Ira Weiny <ira.weiny@intel.com>
CC: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
CC: Vishal Verma <vishal.l.verma@intel.com>

Hi Ben,

I love your patch! Perhaps something to improve:

[auto build test WARNING on 53989fad1286e652ea3655ae3367ba698da8d2ff]

url:    https://github.com/0day-ci/linux/commits/Ben-Widawsky/Add-drivers-for-CXL-ports-and-mem-devices/20211120-080513
base:   53989fad1286e652ea3655ae3367ba698da8d2ff
:::::: branch date: 6 days ago
:::::: commit date: 6 days ago
config: x86_64-randconfig-m001-20211118 (https://download.01.org/0day-ci/archive/20211126/202111260523.BAvGTRJR-lkp(a)intel.com/config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

smatch warnings:
drivers/cxl/acpi.c:399 cxl_acpi_probe() error: uninitialized symbol 'root_port'.

vim +/root_port +399 drivers/cxl/acpi.c

6b4661f8037e4f Ben Widawsky     2021-11-19  382  
4812be97c015bd Dan Williams     2021-06-09  383  static int cxl_acpi_probe(struct platform_device *pdev)
4812be97c015bd Dan Williams     2021-06-09  384  {
3b94ce7b7bc1b4 Dan Williams     2021-06-09  385  	int rc;
4812be97c015bd Dan Williams     2021-06-09  386  	struct cxl_port *root_port;
4812be97c015bd Dan Williams     2021-06-09  387  	struct device *host = &pdev->dev;
7d4b5ca2e2cb5d Dan Williams     2021-06-09  388  	struct acpi_device *adev = ACPI_COMPANION(host);
f4ce1f766f1ebf Dan Williams     2021-10-29  389  	struct cxl_cfmws_context ctx;
4812be97c015bd Dan Williams     2021-06-09  390  
6b4661f8037e4f Ben Widawsky     2021-11-19  391  	rc = cxl_register_topology_host(host);
6b4661f8037e4f Ben Widawsky     2021-11-19  392  	if (rc)
6b4661f8037e4f Ben Widawsky     2021-11-19  393  		return rc;
6b4661f8037e4f Ben Widawsky     2021-11-19  394  
6b4661f8037e4f Ben Widawsky     2021-11-19  395  	rc = devm_add_action_or_reset(host, clear_topology_host, host);
6b4661f8037e4f Ben Widawsky     2021-11-19  396  	if (rc)
6b4661f8037e4f Ben Widawsky     2021-11-19  397  		return rc;
6b4661f8037e4f Ben Widawsky     2021-11-19  398  
6b4661f8037e4f Ben Widawsky     2021-11-19 @399  	root_port = devm_cxl_add_port(host, CXL_RESOURCE_NONE, root_port);
4812be97c015bd Dan Williams     2021-06-09  400  	if (IS_ERR(root_port))
4812be97c015bd Dan Williams     2021-06-09  401  		return PTR_ERR(root_port);
4812be97c015bd Dan Williams     2021-06-09  402  	dev_dbg(host, "add: %s\n", dev_name(&root_port->dev));
4812be97c015bd Dan Williams     2021-06-09  403  
3b94ce7b7bc1b4 Dan Williams     2021-06-09  404  	rc = bus_for_each_dev(adev->dev.bus, NULL, root_port,
7d4b5ca2e2cb5d Dan Williams     2021-06-09  405  			      add_host_bridge_dport);
f4ce1f766f1ebf Dan Williams     2021-10-29  406  	if (rc < 0)
f4ce1f766f1ebf Dan Williams     2021-10-29  407  		return rc;
3b94ce7b7bc1b4 Dan Williams     2021-06-09  408  
f4ce1f766f1ebf Dan Williams     2021-10-29  409  	ctx = (struct cxl_cfmws_context) {
f4ce1f766f1ebf Dan Williams     2021-10-29  410  		.dev = host,
f4ce1f766f1ebf Dan Williams     2021-10-29  411  		.root_port = root_port,
f4ce1f766f1ebf Dan Williams     2021-10-29  412  	};
f4ce1f766f1ebf Dan Williams     2021-10-29  413  	acpi_table_parse_cedt(ACPI_CEDT_TYPE_CFMWS, cxl_parse_cfmws, &ctx);
3e23d17ce1980c Alison Schofield 2021-06-17  414  
3b94ce7b7bc1b4 Dan Williams     2021-06-09  415  	/*
3b94ce7b7bc1b4 Dan Williams     2021-06-09  416  	 * Root level scanned with host-bridge as dports, now scan host-bridges
3b94ce7b7bc1b4 Dan Williams     2021-06-09  417  	 * for their role as CXL uports to their CXL-capable PCIe Root Ports.
3b94ce7b7bc1b4 Dan Williams     2021-06-09  418  	 */
8fdcb1704f61a8 Dan Williams     2021-06-15  419  	rc = bus_for_each_dev(adev->dev.bus, NULL, root_port,
3b94ce7b7bc1b4 Dan Williams     2021-06-09  420  			      add_host_bridge_uport);
f4ce1f766f1ebf Dan Williams     2021-10-29  421  	if (rc < 0)
f4ce1f766f1ebf Dan Williams     2021-10-29  422  		return rc;
8fdcb1704f61a8 Dan Williams     2021-06-15  423  
8fdcb1704f61a8 Dan Williams     2021-06-15  424  	if (IS_ENABLED(CONFIG_CXL_PMEM))
8fdcb1704f61a8 Dan Williams     2021-06-15  425  		rc = device_for_each_child(&root_port->dev, root_port,
8fdcb1704f61a8 Dan Williams     2021-06-15  426  					   add_root_nvdimm_bridge);
8fdcb1704f61a8 Dan Williams     2021-06-15  427  	if (rc < 0)
8fdcb1704f61a8 Dan Williams     2021-06-15  428  		return rc;
f4ce1f766f1ebf Dan Williams     2021-10-29  429  
8fdcb1704f61a8 Dan Williams     2021-06-15  430  	return 0;
4812be97c015bd Dan Williams     2021-06-09  431  }
4812be97c015bd Dan Williams     2021-06-09  432  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 18/23] cxl/pci: Implement wait for media active
  2021-11-22 17:03   ` Jonathan Cameron
  2021-11-22 22:57     ` Ben Widawsky
@ 2021-11-26 11:36     ` Jonathan Cameron
  1 sibling, 0 replies; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-26 11:36 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On Mon, 22 Nov 2021 17:03:35 +0000
Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:

> On Fri, 19 Nov 2021 16:02:45 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > CXL 2.0 8.1.3.8.2 defines "Memory_Active: When set, indicates that the
> > CXL Range 1 memory is fully initialized and available for software use.
> > Must be set within Range 1. Memory_Active_Timeout of deassertion of  
> 
> Range 1?
> 
> > reset to CXL device if CXL.mem HwInit Mode=1" The CXL* Type 3 Memory
> > Device Software Guide (Revision 1.0) further describes the need to check
> > this bit before using HDM.
> > 
> > Unfortunately, Memory_Active can take quite a long time depending on
> > media size (up to 256s per 2.0 spec). Since the cxl_pci driver doesn't
> > care about this, a callback is exported as part of driver state for use
> > by drivers that do care.
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>  
> 
> Same thing about size not being used...
> 
> > ---
> > This patch did not exist in RFCv2
> > ---
> >  drivers/cxl/cxlmem.h |  1 +
> >  drivers/cxl/pci.c    | 56 ++++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 57 insertions(+)
> > 
> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > index eac5528ccaae..a9424dd4e5c3 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> > @@ -167,6 +167,7 @@ struct cxl_dev_state {
> >  	struct cxl_endpoint_dvsec_info *info;
> >  
> >  	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
> > +	int (*wait_media_ready)(struct cxl_dev_state *cxlds);

Missing docs for this. 

Jonathan
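
As an illustration of what the requested documentation and a consumer of the
callback might look like, here is a sketch under stated assumptions. The struct
is a cut-down mock, the kernel-doc wording is a guess, and the media_active
field, mock_wait_media_ready() and consumer_probe() are hypothetical; treating
a missing callback as "nothing to wait for" is also an assumption, not
something the patch states.

```c
#include <errno.h>
#include <stddef.h>

/**
 * struct cxl_dev_state - cut-down mock of the driver state (illustration only)
 * @media_active: simulated Memory_Active bit (hypothetical field)
 * @wait_media_ready: Callback that waits for the media to become active.
 *		      Returns 0 on success or a negative errno on timeout.
 *		      (Wording assumed; the posted patch has no docs yet.)
 */
struct cxl_dev_state {
	int media_active;
	int (*wait_media_ready)(struct cxl_dev_state *cxlds);
};

/* Hypothetical implementation: succeed only if the simulated bit is set. */
static int mock_wait_media_ready(struct cxl_dev_state *cxlds)
{
	return cxlds->media_active ? 0 : -ETIMEDOUT;
}

/* Hypothetical consumer: a driver that does care about media readiness. */
static int consumer_probe(struct cxl_dev_state *cxlds)
{
	/* Assumption: no callback means there is nothing to wait for. */
	if (!cxlds->wait_media_ready)
		return 0;

	return cxlds->wait_media_ready(cxlds);
}
```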

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 17/23] cxl: Cache and pass DVSEC ranges
  2021-11-20  0:02 ` [PATCH 17/23] cxl: Cache and pass DVSEC ranges Ben Widawsky
  2021-11-20  4:29     ` kernel test robot
  2021-11-22 17:00   ` Jonathan Cameron
@ 2021-11-26 11:37   ` Jonathan Cameron
  2 siblings, 0 replies; 133+ messages in thread
From: Jonathan Cameron @ 2021-11-26 11:37 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, linux-pci, Alison Schofield, Dan Williams, Ira Weiny,
	Vishal Verma

On Fri, 19 Nov 2021 16:02:44 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> CXL 1.1 specification provided a mechanism for mapping an address space
> of a CXL device. That functionality is known as a "range" and can be
> programmed through PCIe DVSEC. In addition to this, the specification
> defines an active bit which a device will expose through the same DVSEC
> to notify system software that memory is initialized and ready.
> 
> While CXL 2.0 introduces a more powerful mechanism called HDM decoders
> that are controlled by MMIO behind a PCIe BAR, the spec does allow the
> 1.1 style mapping to still be present. In such a case, when the CXL
> driver takes over, if it were to enable HDM decoding and there was an
> actively used range, things would likely blow up, in particular if it
> wasn't an identical mapping.
> 
> This patch caches the relevant information which the cxl_mem driver will
> need to make the proper decision and passes it along.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
>  drivers/cxl/cxlmem.h |  19 +++++++
>  drivers/cxl/pci.c    | 126 +++++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/pci.h    |  13 +++++
>  3 files changed, 158 insertions(+)
> 
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 3ef3c652599e..eac5528ccaae 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -89,6 +89,22 @@ struct cxl_mbox_cmd {
>   */
>  #define CXL_CAPACITY_MULTIPLIER SZ_256M
>  
> +/**
> + * struct cxl_endpoint_dvsec_info - Cached DVSEC info
> + * @mem_enabled: cached value of mem_enabled in the DVSEC, PCIE_DEVICE
> + * @ranges: Number of HDM ranges this device contains.
> + * @range.base: cached value of the range base in the DVSEC, PCIE_DEVICE
> + * @range.size: cached value of the range size in the DVSEC, PCIE_DEVICE
> + */
> +struct cxl_endpoint_dvsec_info {
> +	bool mem_enabled;
> +	int ranges;
> +	struct {
> +		u64 base;
> +		u64 size;
> +	} range[2];

kernel-doc wants documentation for range as well.

> +};
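
For reference, a sketch of the comment with the parent member documented too.
The @range wording is my own guess at suitable text, the u64 typedef is only
there so the snippet builds outside the kernel, and dvsec_total_size() is a
hypothetical helper added purely to exercise the struct.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t u64;	/* userspace stand-in for the kernel type */

/**
 * struct cxl_endpoint_dvsec_info - Cached DVSEC info
 * @mem_enabled: cached value of mem_enabled in the DVSEC, PCIE_DEVICE
 * @ranges: Number of HDM ranges this device contains.
 * @range: cached per-range base and size (wording assumed)
 * @range.base: cached value of the range base in the DVSEC, PCIE_DEVICE
 * @range.size: cached value of the range size in the DVSEC, PCIE_DEVICE
 */
struct cxl_endpoint_dvsec_info {
	bool mem_enabled;
	int ranges;
	struct {
		u64 base;
		u64 size;
	} range[2];
};

/* Hypothetical helper: sum the sizes of the populated ranges (at most 2). */
static u64 dvsec_total_size(const struct cxl_endpoint_dvsec_info *info)
{
	u64 total = 0;

	for (int i = 0; i < info->ranges && i < 2; i++)
		total += info->range[i].size;

	return total;
}
```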

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 14/23] cxl: Introduce topology host registration
@ 2021-11-29 11:42 ` Dan Carpenter
  0 siblings, 0 replies; 133+ messages in thread
From: Dan Carpenter @ 2021-11-29 11:42 UTC (permalink / raw)
  To: kbuild, Ben Widawsky, linux-cxl, linux-pci
  Cc: lkp, kbuild-all, Ben Widawsky, Alison Schofield, Dan Williams,
	Ira Weiny, Jonathan Cameron, Vishal Verma

Hi Ben,

url:    https://github.com/0day-ci/linux/commits/Ben-Widawsky/Add-drivers-for-CXL-ports-and-mem-devices/20211120-080513
base:   53989fad1286e652ea3655ae3367ba698da8d2ff
config: x86_64-randconfig-m001-20211118 (https://download.01.org/0day-ci/archive/20211126/202111260523.BAvGTRJR-lkp@intel.com/config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

smatch warnings:
drivers/cxl/acpi.c:399 cxl_acpi_probe() error: uninitialized symbol 'root_port'.

vim +/root_port +399 drivers/cxl/acpi.c

4812be97c015bd Dan Williams     2021-06-09  383  static int cxl_acpi_probe(struct platform_device *pdev)
4812be97c015bd Dan Williams     2021-06-09  384  {
3b94ce7b7bc1b4 Dan Williams     2021-06-09  385  	int rc;
4812be97c015bd Dan Williams     2021-06-09  386  	struct cxl_port *root_port;
4812be97c015bd Dan Williams     2021-06-09  387  	struct device *host = &pdev->dev;
7d4b5ca2e2cb5d Dan Williams     2021-06-09  388  	struct acpi_device *adev = ACPI_COMPANION(host);
f4ce1f766f1ebf Dan Williams     2021-10-29  389  	struct cxl_cfmws_context ctx;
4812be97c015bd Dan Williams     2021-06-09  390  
6b4661f8037e4f Ben Widawsky     2021-11-19  391  	rc = cxl_register_topology_host(host);
6b4661f8037e4f Ben Widawsky     2021-11-19  392  	if (rc)
6b4661f8037e4f Ben Widawsky     2021-11-19  393  		return rc;
6b4661f8037e4f Ben Widawsky     2021-11-19  394  
6b4661f8037e4f Ben Widawsky     2021-11-19  395  	rc = devm_add_action_or_reset(host, clear_topology_host, host);
6b4661f8037e4f Ben Widawsky     2021-11-19  396  	if (rc)
6b4661f8037e4f Ben Widawsky     2021-11-19  397  		return rc;
6b4661f8037e4f Ben Widawsky     2021-11-19  398  
6b4661f8037e4f Ben Widawsky     2021-11-19 @399  	root_port = devm_cxl_add_port(host, CXL_RESOURCE_NONE, root_port);
                                                                                                               ^^^^^^^^^^
Uninitialized.

4812be97c015bd Dan Williams     2021-06-09  400  	if (IS_ERR(root_port))
4812be97c015bd Dan Williams     2021-06-09  401  		return PTR_ERR(root_port);
4812be97c015bd Dan Williams     2021-06-09  402  	dev_dbg(host, "add: %s\n", dev_name(&root_port->dev));
4812be97c015bd Dan Williams     2021-06-09  403  
3b94ce7b7bc1b4 Dan Williams     2021-06-09  404  	rc = bus_for_each_dev(adev->dev.bus, NULL, root_port,
7d4b5ca2e2cb5d Dan Williams     2021-06-09  405  			      add_host_bridge_dport);
f4ce1f766f1ebf Dan Williams     2021-10-29  406  	if (rc < 0)
f4ce1f766f1ebf Dan Williams     2021-10-29  407  		return rc;
3b94ce7b7bc1b4 Dan Williams     2021-06-09  408  
f4ce1f766f1ebf Dan Williams     2021-10-29  409  	ctx = (struct cxl_cfmws_context) {
f4ce1f766f1ebf Dan Williams     2021-10-29  410  		.dev = host,
f4ce1f766f1ebf Dan Williams     2021-10-29  411  		.root_port = root_port,
f4ce1f766f1ebf Dan Williams     2021-10-29  412  	};
f4ce1f766f1ebf Dan Williams     2021-10-29  413  	acpi_table_parse_cedt(ACPI_CEDT_TYPE_CFMWS, cxl_parse_cfmws, &ctx);
3e23d17ce1980c Alison Schofield 2021-06-17  414  
3b94ce7b7bc1b4 Dan Williams     2021-06-09  415  	/*
3b94ce7b7bc1b4 Dan Williams     2021-06-09  416  	 * Root level scanned with host-bridge as dports, now scan host-bridges
3b94ce7b7bc1b4 Dan Williams     2021-06-09  417  	 * for their role as CXL uports to their CXL-capable PCIe Root Ports.
3b94ce7b7bc1b4 Dan Williams     2021-06-09  418  	 */
8fdcb1704f61a8 Dan Williams     2021-06-15  419  	rc = bus_for_each_dev(adev->dev.bus, NULL, root_port,
3b94ce7b7bc1b4 Dan Williams     2021-06-09  420  			      add_host_bridge_uport);
f4ce1f766f1ebf Dan Williams     2021-10-29  421  	if (rc < 0)
f4ce1f766f1ebf Dan Williams     2021-10-29  422  		return rc;
8fdcb1704f61a8 Dan Williams     2021-06-15  423  
8fdcb1704f61a8 Dan Williams     2021-06-15  424  	if (IS_ENABLED(CONFIG_CXL_PMEM))
8fdcb1704f61a8 Dan Williams     2021-06-15  425  		rc = device_for_each_child(&root_port->dev, root_port,
8fdcb1704f61a8 Dan Williams     2021-06-15  426  					   add_root_nvdimm_bridge);
8fdcb1704f61a8 Dan Williams     2021-06-15  427  	if (rc < 0)
8fdcb1704f61a8 Dan Williams     2021-06-15  428  		return rc;
f4ce1f766f1ebf Dan Williams     2021-10-29  429  
8fdcb1704f61a8 Dan Williams     2021-06-15  430  	return 0;
4812be97c015bd Dan Williams     2021-06-09  431  }
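
The pattern flagged at line 399 is a variable being passed as an argument in
the same statement that first assigns it. A minimal sketch of one way to avoid
it, assuming the root port has no parent and NULL is the intended argument
(the report does not confirm this). struct cxl_port here is a mock, and
mock_add_port() and probe_fixed() are hypothetical stand-ins, not the real
devm_cxl_add_port() signature.

```c
#include <stddef.h>

/* Mock port type for illustration; not the kernel's struct cxl_port. */
struct cxl_port {
	struct cxl_port *parent;
};

static struct cxl_port root_storage;

/* Hypothetical stand-in for the port-add helper: records the parent. */
static struct cxl_port *mock_add_port(struct cxl_port *parent)
{
	root_storage.parent = parent;
	return &root_storage;
}

static struct cxl_port *probe_fixed(void)
{
	struct cxl_port *root_port;

	/*
	 * Flagged form: root_port = mock_add_port(root_port);
	 * That reads root_port before it has a value. Assuming the root
	 * has no parent, pass NULL explicitly instead.
	 */
	root_port = mock_add_port(NULL);

	return root_port;
}
```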

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org


^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 05/23] cxl/pci: Don't poll doorbell for mailbox access
  2021-11-24 21:55   ` Dan Williams
@ 2021-11-29 18:33     ` Ben Widawsky
  2021-11-29 19:02       ` Dan Williams
  0 siblings, 1 reply; 133+ messages in thread
From: Ben Widawsky @ 2021-11-29 18:33 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Linux PCI, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-11-24 13:55:03, Dan Williams wrote:
> On Fri, Nov 19, 2021 at 4:03 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > The expectation is that the mailbox interface ready bit is the first
> > step in access through the mailbox interface. Therefore, waiting for the
> > doorbell busy bit to be clear would imply that the mailbox interface is
> > ready. The original driver implementation used the doorbell timeout for
> > the Mailbox Interface Ready bit to piggyback off of, since the latter
> > doesn't have a defined timeout (introduced in 8adaf747c9f0 ("cxl/mem:
> > Find device capabilities"), a timeout has since been defined with an ECN
> > to the 2.0 spec). With the current driver waiting for mailbox interface
> > ready as a part of probe() it's no longer necessary to use the
> > piggyback.
> >
> > With the piggybacking no longer necessary it doesn't make sense to check
> > doorbell status when acquiring the mailbox. It will be checked during
> > the normal mailbox exchange protocol.
> >
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > ---
> > This patch did not exist in RFCv2
> > ---
> >  drivers/cxl/pci.c | 25 ++++++-------------------
> >  1 file changed, 6 insertions(+), 19 deletions(-)
> >
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index 2cef9fec8599..869b4fc18e27 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -221,27 +221,14 @@ static int cxl_pci_mbox_get(struct cxl_dev_state *cxlds)
> >
> >         /*
> >          * XXX: There is some amount of ambiguity in the 2.0 version of the spec
> > -        * around the mailbox interface ready (8.2.8.5.1.1).  The purpose of the
> > +        * around the mailbox interface ready (8.2.8.5.1.1). The purpose of the
> >          * bit is to allow firmware running on the device to notify the driver
> > -        * that it's ready to receive commands. It is unclear if the bit needs
> > -        * to be read for each transaction mailbox, ie. the firmware can switch
> > -        * it on and off as needed. Second, there is no defined timeout for
> > -        * mailbox ready, like there is for the doorbell interface.
> > -        *
> > -        * Assumptions:
> > -        * 1. The firmware might toggle the Mailbox Interface Ready bit, check
> > -        *    it for every command.
> > -        *
> > -        * 2. If the doorbell is clear, the firmware should have first set the
> > -        *    Mailbox Interface Ready bit. Therefore, waiting for the doorbell
> > -        *    to be ready is sufficient.
> > +        * that it's ready to receive commands. The spec does not clearly define
> > +        * under what conditions the bit may get set or cleared. As of the 2.0
> > +        * base specification there was no defined timeout for mailbox ready,
> > +        * like there is for the doorbell interface. This was fixed with an ECN,
> > +        * but it's possible early devices implemented this before the ECN.
> 
> Can we just drop the comment block altogether? Outside of
> cxl_pci_setup_mailbox() the only time the mailbox status should be
> checked is after a doorbell timeout after submitting a command.
> 

Yes, I think it's fine to drop it.

> >          */
> > -       rc = cxl_pci_mbox_wait_for_doorbell(cxlds);
> > -       if (rc) {
> > -               dev_warn(dev, "Mailbox interface not ready\n");
> > -               goto out;
> > -       }
> > -
> >         md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
> >         if (!(md_status & CXLMDEV_MBOX_IF_READY && CXLMDEV_READY(md_status))) {
> >                 dev_err(dev, "mbox: reported doorbell ready, but not mbox ready\n");
> 
> This error message is obsolete since nothing is pre-checking the
> mailbox anymore, and per above I see no problem waiting to check the
> status until after the mailbox has failed to respond after a timeout.

The message is wrong, but I think the logic is still valuable. How about:
"mbox: reported interface ready, but mbox not ready"

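A sketch of the check with the reworded message, to show what the condition
actually tests: both the interface-ready bit and the media-ready state must be
set. The bit positions below are assumptions for illustration (interface ready
at bit 4, a 2-bit media status field at bits 3:2 with 1 meaning ready), and
media_ready() and mbox_check() are hypothetical helpers, not the driver's
macros.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed register layout, for illustration only. */
#define MBOX_IF_READY		(UINT64_C(1) << 4)
#define MEDIA_STATUS_SHIFT	2
#define MEDIA_STATUS_MASK	(UINT64_C(3) << MEDIA_STATUS_SHIFT)
#define MEDIA_STATUS_READY	UINT64_C(1)

static bool media_ready(uint64_t md_status)
{
	return ((md_status & MEDIA_STATUS_MASK) >> MEDIA_STATUS_SHIFT) ==
	       MEDIA_STATUS_READY;
}

/* Returns NULL when the mailbox is usable, else the message to log. */
static const char *mbox_check(uint64_t md_status)
{
	if (!((md_status & MBOX_IF_READY) && media_ready(md_status)))
		return "mbox: reported interface ready, but mbox not ready";

	return NULL;
}
```
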
^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 05/23] cxl/pci: Don't poll doorbell for mailbox access
  2021-11-29 18:33     ` Ben Widawsky
@ 2021-11-29 19:02       ` Dan Williams
  2021-11-29 19:11         ` Ben Widawsky
  0 siblings, 1 reply; 133+ messages in thread
From: Dan Williams @ 2021-11-29 19:02 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Linux PCI, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Mon, Nov 29, 2021 at 10:33 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-11-24 13:55:03, Dan Williams wrote:
> > On Fri, Nov 19, 2021 at 4:03 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >
> > > The expectation is that the mailbox interface ready bit is the first
> > > step in access through the mailbox interface. Therefore, waiting for the
> > > doorbell busy bit to be clear would imply that the mailbox interface is
> > > ready. The original driver implementation used the doorbell timeout for
> > > the Mailbox Interface Ready bit to piggyback off of, since the latter
> > > doesn't have a defined timeout (introduced in 8adaf747c9f0 ("cxl/mem:
> > > Find device capabilities"), a timeout has since been defined with an ECN
> > > to the 2.0 spec). With the current driver waiting for mailbox interface
> > > ready as a part of probe() it's no longer necessary to use the
> > > piggyback.
> > >
> > > With the piggybacking no longer necessary it doesn't make sense to check
> > > doorbell status when acquiring the mailbox. It will be checked during
> > > the normal mailbox exchange protocol.
> > >
> > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > ---
> > > This patch did not exist in RFCv2
> > > ---
> > >  drivers/cxl/pci.c | 25 ++++++-------------------
> > >  1 file changed, 6 insertions(+), 19 deletions(-)
> > >
> > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > index 2cef9fec8599..869b4fc18e27 100644
> > > --- a/drivers/cxl/pci.c
> > > +++ b/drivers/cxl/pci.c
> > > @@ -221,27 +221,14 @@ static int cxl_pci_mbox_get(struct cxl_dev_state *cxlds)
> > >
> > >         /*
> > >          * XXX: There is some amount of ambiguity in the 2.0 version of the spec
> > > -        * around the mailbox interface ready (8.2.8.5.1.1).  The purpose of the
> > > +        * around the mailbox interface ready (8.2.8.5.1.1). The purpose of the
> > >          * bit is to allow firmware running on the device to notify the driver
> > > -        * that it's ready to receive commands. It is unclear if the bit needs
> > > -        * to be read for each transaction mailbox, ie. the firmware can switch
> > > -        * it on and off as needed. Second, there is no defined timeout for
> > > -        * mailbox ready, like there is for the doorbell interface.
> > > -        *
> > > -        * Assumptions:
> > > -        * 1. The firmware might toggle the Mailbox Interface Ready bit, check
> > > -        *    it for every command.
> > > -        *
> > > -        * 2. If the doorbell is clear, the firmware should have first set the
> > > -        *    Mailbox Interface Ready bit. Therefore, waiting for the doorbell
> > > -        *    to be ready is sufficient.
> > > +        * that it's ready to receive commands. The spec does not clearly define
> > > +        * under what conditions the bit may get set or cleared. As of the 2.0
> > > +        * base specification there was no defined timeout for mailbox ready,
> > > +        * like there is for the doorbell interface. This was fixed with an ECN,
> > > +        * but it's possible early devices implemented this before the ECN.
> >
> > Can we just drop comment block altogether? Outside of
> > cxl_pci_setup_mailbox() the only time the mailbox status should be
> > checked is after a doorbell timeout after submitting a command.
> >
>
> Yes, I think it's fine to drop it.
>
> > >          */
> > > -       rc = cxl_pci_mbox_wait_for_doorbell(cxlds);
> > > -       if (rc) {
> > > -               dev_warn(dev, "Mailbox interface not ready\n");
> > > -               goto out;
> > > -       }
> > > -
> > >         md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
> > >         if (!(md_status & CXLMDEV_MBOX_IF_READY && CXLMDEV_READY(md_status))) {
> > >                 dev_err(dev, "mbox: reported doorbell ready, but not mbox ready\n");
> >
> > This error message is obsolete since nothing is pre-checking the
> > mailbox anymore, and per above I see no problem waiting to check the
> > status until after the mailbox has failed to respond after a timeout.
>
> The message is wrong, but I think the logic is still valuable. How about:
> "mbox: reported interface ready, but mbox not ready"

You mean check this every time even though the spec says the driver
only needs to check it once per-reset?

* Re: [PATCH 05/23] cxl/pci: Don't poll doorbell for mailbox access
  2021-11-29 19:02       ` Dan Williams
@ 2021-11-29 19:11         ` Ben Widawsky
  2021-11-29 19:18           ` Dan Williams
  0 siblings, 1 reply; 133+ messages in thread
From: Ben Widawsky @ 2021-11-29 19:11 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Linux PCI, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-11-29 11:02:41, Dan Williams wrote:

[snip]

> > > >         md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
> > > >         if (!(md_status & CXLMDEV_MBOX_IF_READY && CXLMDEV_READY(md_status))) {
> > > >                 dev_err(dev, "mbox: reported doorbell ready, but not mbox ready\n");
> > >
> > > This error message is obsolete since nothing is pre-checking the
> > > mailbox anymore, and per above I see no problem waiting to check the
> > > status until after the mailbox has failed to respond after a timeout.
> >
> > The message is wrong, but I think the logic is still valuable. How about:
> > "mbox: reported interface ready, but mbox not ready"
> 
> You mean check this every time even though the spec says the driver
> only needs to check it once per-reset?

Unfortunately it does not say that. "... it shall remain set until the next
reset or the device encounters an error that prevents any mailbox
communication."

Once we have real error checking in place, this could go away, though I see no
harm in leaving it.

* Re: [PATCH 05/23] cxl/pci: Don't poll doorbell for mailbox access
  2021-11-29 19:11         ` Ben Widawsky
@ 2021-11-29 19:18           ` Dan Williams
  2021-11-29 19:31             ` Ben Widawsky
  0 siblings, 1 reply; 133+ messages in thread
From: Dan Williams @ 2021-11-29 19:18 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Linux PCI, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Mon, Nov 29, 2021 at 11:11 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
[..]
> > > > >         md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
> > > > >         if (!(md_status & CXLMDEV_MBOX_IF_READY && CXLMDEV_READY(md_status))) {
> > > > >                 dev_err(dev, "mbox: reported doorbell ready, but not mbox ready\n");
> > > >
> > > > This error message is obsolete since nothing is pre-checking the
> > > > mailbox anymore, and per above I see no problem waiting to check the
> > > > status until after the mailbox has failed to respond after a timeout.
> > >
> > > The message is wrong, but I think the logic is still valuable. How about:
> > > "mbox: reported interface ready, but mbox not ready"
> >
> > You mean check this every time even though the spec says the driver
> > only needs to check it once per-reset?
>
> Unfortunately it does not say that. "... it shall remain set until the next
> reset or the device encounters an error that prevents any mailbox
> communication."
>
> Once we have real error checking in place, this could go away, though I see no
> harm in leaving it.

Right, there's no harm in the check, it just seems overly paranoid to
me if it was already checked once. Until a doorbell timeout happens
it's an extra MMIO cycle that can be saved for a "what happened?" check
after a timeout.

* Re: [PATCH 05/23] cxl/pci: Don't poll doorbell for mailbox access
  2021-11-29 19:18           ` Dan Williams
@ 2021-11-29 19:31             ` Ben Widawsky
  2021-11-29 19:37               ` Dan Williams
  0 siblings, 1 reply; 133+ messages in thread
From: Ben Widawsky @ 2021-11-29 19:31 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Linux PCI, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-11-29 11:18:36, Dan Williams wrote:

[snip]

> > > > > > -       rc = cxl_pci_mbox_wait_for_doorbell(cxlds);
> > > > > > -       if (rc) {
> > > > > > -               dev_warn(dev, "Mailbox interface not ready\n");
> > > > > > -               goto out;
> > > > > > -       }
> > > > > > -
> > > > > >         md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
> > > > > >         if (!(md_status & CXLMDEV_MBOX_IF_READY && CXLMDEV_READY(md_status))) {
> > > > > >                 dev_err(dev, "mbox: reported doorbell ready, but not mbox ready\n");
> > > > >
> > > > > This error message is obsolete since nothing is pre-checking the
> > > > > mailbox anymore, and per above I see no problem waiting to check the
> > > > > status until after the mailbox has failed to respond after a timeout.
> > > >
> > > > The message is wrong, but I think the logic is still valuable. How about:
> > > > "mbox: reported interface ready, but mbox not ready"
> > >
> > > You mean check this every time even though the spec says the driver
> > > only needs to check it once per-reset?
> >
> > Unfortunately it does not say that. "... it shall remain set until the next
> > reset or the device encounters an error that prevents any mailbox
> > communication."
> >
> > Once we have real error checking in place, this could go away, though I see no
> > harm in leaving it.
> 
> Right, there's no harm in the check, it just seems overly paranoid to
> me if it was already checked once. Until a doorbell timeout happens
> it's an extra MMIO cycle that can be saved for a "what happened?" check
> after a timeout.

Well I suspect we're just rearranging the deck chairs on the Titanic now, but...

I see doorbell timeouts as disconnected from whether or not the mailbox
interface is ready. If they were the same, we wouldn't need both bits and we
could just wait extra long for the doorbell when probing.

In other words, I expect if the interface goes unready, doorbell timeout will
occur, but I don't think we should assume if doorbell timeout occurs, the
interface is no longer ready. I don't purport to know why a doorbell timeout
might occur while the interface remains available (likely a firmware bug, I
presume).

It does seem interesting to check if the interface is no longer ready on timeout
though.

* Re: [PATCH 05/23] cxl/pci: Don't poll doorbell for mailbox access
  2021-11-29 19:31             ` Ben Widawsky
@ 2021-11-29 19:37               ` Dan Williams
  2021-11-29 19:50                 ` Ben Widawsky
  0 siblings, 1 reply; 133+ messages in thread
From: Dan Williams @ 2021-11-29 19:37 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Linux PCI, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Mon, Nov 29, 2021 at 11:32 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
[..]
> >
> > Right, there's no harm in the check, it just seems overly paranoid to
> > me if it was already checked once. Until a doorbell timeout happens
> > it's an extra MMIO cycle that can be saved for a "what happened?" check
> > after a timeout.
>
> Well I suspect we're just rearranging the deck chairs on the Titanic now, but...

Not so much, just trying to get this driver in line with other error
handling designs.

> I see doorbell timeouts as disconnected from whether or not the mailbox
> interface is ready. If they were the same, we wouldn't need both bits and we
> could just wait extra long for the doorbell when probing.
>
> In other words, I expect if the interface goes unready, doorbell timeout will
> occur, but I don't think we should assume if doorbell timeout occurs, the
> interface is no longer ready. I don't purport to know why a doorbell timeout
> might occur while the interface remains available (likely a firmware bug, I
> presume).
>
> It does seem interesting to check if the interface is no longer ready on timeout
> though.

So I'm just modeling this off of NVME error handling where there is a
Controller Fatal Status bit that could be checked every transaction,
but instead the driver waits until a command timeout to collect if the
device went fatal / not-ready.
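
A minimal userspace sketch of the check-on-timeout idea discussed above, assuming a simulated device rather than real MMIO. Every name here (`struct fake_dev`, `mbox_send()`, the result codes) is hypothetical and does not come from cxl_pci or the NVMe driver. The status register is read only after the doorbell wait fails, and a timeout with the interface still ready is reported separately from a timeout with the interface gone, since the thread treats those as distinct conditions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a device; these fields are illustrative only and
 * do not mirror the real cxl_pci register layout. */
struct fake_dev {
	bool if_ready;             /* models the Mailbox Interface Ready bit */
	int polls_until_done;      /* polls before the doorbell clears; < 0 means never */
	unsigned int status_reads; /* how many times the status register was read */
};

enum mbox_result {
	MBOX_OK,          /* doorbell cleared in time */
	MBOX_CMD_TIMEOUT, /* doorbell timed out, interface still reports ready */
	MBOX_IF_DOWN,     /* doorbell timed out and the interface is no longer ready */
};

/* Stand-in for polling the doorbell with a bounded number of attempts. */
static bool doorbell_cleared(const struct fake_dev *dev, int max_polls)
{
	return dev->polls_until_done >= 0 && dev->polls_until_done <= max_polls;
}

/* Stand-in for the extra MMIO read of the status register. */
static bool interface_ready(struct fake_dev *dev)
{
	dev->status_reads++;
	return dev->if_ready;
}

enum mbox_result mbox_send(struct fake_dev *dev, int max_polls)
{
	assert(dev != NULL);

	/* Fast path: no status read at all while commands complete. */
	if (doorbell_cleared(dev, max_polls))
		return MBOX_OK;

	/* Slow path: only now ask "what happened?". */
	return interface_ready(dev) ? MBOX_CMD_TIMEOUT : MBOX_IF_DOWN;
}
```

In this model a healthy device never pays for the status read; the read happens once per timeout, which is the trade-off being argued for.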

* Re: [PATCH 05/23] cxl/pci: Don't poll doorbell for mailbox access
  2021-11-29 19:37               ` Dan Williams
@ 2021-11-29 19:50                 ` Ben Widawsky
  0 siblings, 0 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-29 19:50 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Linux PCI, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-11-29 11:37:34, Dan Williams wrote:
> On Mon, Nov 29, 2021 at 11:32 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> [..]
> > >
> > > Right, there's no harm in the check, it just seems overly paranoid to
> > > me if it was already checked once. Until a doorbell timeout happens
> > > it's an extra MMIO cycle that can be saved for a "what happened?" check
> > > after a timeout.
> >
> > Well I suspect we're just rearranging the deck chairs on the Titanic now, but...
> 
> Not so much, just trying to get this driver in line with other error
> handling designs.

Okay. I shall remove it then.

> 
> > I see doorbell timeouts as disconnected from whether or not the mailbox
> > interface is ready. If they were the same, we wouldn't need both bits and we
> > could just wait extra long for the doorbell when probing.
> >
> > In other words, I expect if the interface goes unready, doorbell timeout will
> > occur, but I don't think we should assume if doorbell timeout occurs, the
> > interface is no longer ready. I don't purport to know why a doorbell timeout
> > might occur while the interface remains available (likely a firmware bug, I
> > presume).
> >
> > It does seem interesting to check if the interface is no longer ready on timeout
> > though.
> 
> So I'm just modeling this off of NVME error handling where there is a
> Controller Fatal Status bit that could be checked every transaction,
> but instead the driver waits until a command timeout to collect if the
> device went fatal / not-ready.

No error interrupts?

* Re: [PATCH 12/23] cxl: Introduce endpoint decoders
  2021-11-25  0:07       ` Dan Williams
@ 2021-11-29 20:05         ` Ben Widawsky
  2021-11-29 20:07           ` Dan Williams
  0 siblings, 1 reply; 133+ messages in thread
From: Ben Widawsky @ 2021-11-29 20:05 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jonathan Cameron, linux-cxl, Linux PCI, Alison Schofield,
	Ira Weiny, Vishal Verma

On 21-11-24 16:07:23, Dan Williams wrote:
> On Mon, Nov 22, 2021 at 11:38 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > On 21-11-22 16:20:39, Jonathan Cameron wrote:
> > > On Fri, 19 Nov 2021 16:02:39 -0800
> > > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >
> > > > Endpoints have decoders too. It is useful to share the same
> > > > infrastructure from cxl_core. Endpoints do not have dports (downstream
> > > > targets), only the underlying physical medium. As a result, some special
> > > > casing is needed.
> > > >
> > > > There is no functional change introduced yet as endpoints don't actually
> > > > enumerate decoders yet.
> > > >
> > > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > >
> > > I'm not a fan of special values like using 0 here to indicate endpoint
> > > device.  I'd rather see a base cxl_decoder_alloc(..., bool ep)
> > > and possibly wrappers for the non ep case and ep one.
> > >
> > > Jonathan
> > >
> >
> > My inclination is the opposite. However, I think you and Dan both brought up
> > something to this effect in the previous RFCs.
> >
> > Dan, do you have a preference here?
> 
> I was thinking something along the lines of what Jonathan wants,
> explicit per-type APIs, but internal / private to the core can use
> heuristics like nr_targets == 0 == endpoint.
> 
> So unexport cxl_decoder_alloc() and have separate:
> 
> cxl_root_decoder_alloc()
> cxl_switch_decoder_alloc()
> cxl_endpoint_decoder_alloc()
> 
> ...apis that use a static cxl_decoder_alloc() internally. Probably
> also wants a cxl_endpoint_decoder_add() that drops the need to pass a
> NULL @target_map.

Would you like a prep patch to set up the APIs first, or just do it all in
one?
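
For reference, a rough userspace sketch of the API shape suggested above: explicit per-type allocators as the public surface, with one static allocator underneath that may still rely on the "nr_targets == 0 means endpoint" heuristic. The names below (`struct decoder`, `root_decoder_alloc()` and friends) are hypothetical stand-ins, not the cxl_core symbols, and the zero-target rejection for root and switch decoders is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

enum decoder_type { DECODER_ROOT, DECODER_SWITCH, DECODER_ENDPOINT };

/* Hypothetical decoder object; the real structure carries far more state. */
struct decoder {
	enum decoder_type type;
	unsigned int nr_targets; /* 0 for endpoints, which have no dports */
};

/* Internal allocator: static, so callers cannot pass a magic 0 themselves. */
static struct decoder *decoder_alloc(enum decoder_type type,
				     unsigned int nr_targets)
{
	struct decoder *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;
	d->type = type;
	d->nr_targets = nr_targets;
	return d;
}

/* Root and switch decoders must name at least one downstream target. */
struct decoder *root_decoder_alloc(unsigned int nr_targets)
{
	if (nr_targets == 0)
		return NULL;
	return decoder_alloc(DECODER_ROOT, nr_targets);
}

struct decoder *switch_decoder_alloc(unsigned int nr_targets)
{
	if (nr_targets == 0)
		return NULL;
	return decoder_alloc(DECODER_SWITCH, nr_targets);
}

/* Endpoint decoders take no target count at all. */
struct decoder *endpoint_decoder_alloc(void)
{
	return decoder_alloc(DECODER_ENDPOINT, 0);
}

/* The heuristic stays private to the core: no targets implies endpoint. */
bool decoder_is_endpoint(const struct decoder *d)
{
	assert(d != NULL);
	return d->nr_targets == 0;
}
```

The point of the split is that a caller can no longer create an endpoint decoder by accident; the special value exists only behind the static helper.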

* Re: [PATCH 12/23] cxl: Introduce endpoint decoders
  2021-11-29 20:05         ` Ben Widawsky
@ 2021-11-29 20:07           ` Dan Williams
  2021-11-29 20:12             ` Ben Widawsky
  0 siblings, 1 reply; 133+ messages in thread
From: Dan Williams @ 2021-11-29 20:07 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: Jonathan Cameron, linux-cxl, Linux PCI, Alison Schofield,
	Ira Weiny, Vishal Verma

On Mon, Nov 29, 2021 at 12:05 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
[..]
>
> Would you like a prep patch to set up the APIs first, or just do it all in
> one?

Prep patch to switch over the current usages to the new style before
introducing more helpers sounds good to me.

* Re: [PATCH 12/23] cxl: Introduce endpoint decoders
  2021-11-29 20:07           ` Dan Williams
@ 2021-11-29 20:12             ` Ben Widawsky
  0 siblings, 0 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-29 20:12 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jonathan Cameron, linux-cxl, Linux PCI, Alison Schofield,
	Ira Weiny, Vishal Verma

On 21-11-29 12:07:00, Dan Williams wrote:

[snip]

> >
> > Would you like a prep patch to set up the APIs first, or just do it all in
> > one?
> 
> Prep patch to switch over the current usages to the new style before
> introducing more helpers sounds good to me.

Thanks for the suggestions... Looking again, I think it makes sense to squash it
into the patch before this which documents and tightens up this exact API.

* Re: [PATCH 14/23] cxl: Introduce topology host registration
  2021-11-25  1:09   ` Dan Williams
@ 2021-11-29 21:23     ` Ben Widawsky
  0 siblings, 0 replies; 133+ messages in thread
From: Ben Widawsky @ 2021-11-29 21:23 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Linux PCI, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-11-24 17:09:03, Dan Williams wrote:
> On Fri, Nov 19, 2021 at 4:03 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > The description of the CXL topology will be conveyed by a platform
> > specific entity that is expected to be a singleton. For ACPI based
> > systems, this is ACPI0017. When the topology host goes away, which as of
> > now can only be triggered by module unload, it is desirable to have the
> > entire topology cleaned up. Regular devm unwinding handles most
> > situations already, but what's missing is the ability to teardown the
> > root port. Since the root port is platform specific, the core needs a
> > set of APIs to allow platform specific drivers to register their root
> > ports. With that, all the automatic teardown can occur.
> 
> Wait, no, that was one of the original motivations, but then we
> discussed here [1] that devm teardown of a topology can happen
> naturally / hierarchically.
> 
> [1]: https://lore.kernel.org/r/CAPcyv4ikVFFqyfH2zLhBVJ28N1_gufGHd2gVbP2h+Rv2cZEpeA@mail.gmail.com
> 
> No, the reason for the cxl_topology_host is as a constraint for when
> CXL.mem connectivity can be verified from root to endpoint. Given that
> endpoints can attach at any point in time relative to when the root
> arrives CXL.mem connectivity needs to be revalidated at every topology
> host arrival / depart event.

Oops. I forgot to update the commit message, I will take what you wrote with
slight modification.



* Re: [PATCH 20/23] cxl/port: Introduce a port driver
  2021-11-24  7:17               ` Dan Williams
  2021-11-24  7:28                 ` Christoph Hellwig
@ 2021-12-02 21:24                 ` Bjorn Helgaas
  2021-12-03  1:38                   ` Dan Williams
  1 sibling, 1 reply; 133+ messages in thread
From: Bjorn Helgaas @ 2021-12-02 21:24 UTC (permalink / raw)
  To: Dan Williams
  Cc: Christoph Hellwig, Ben Widawsky, linux-cxl, Linux PCI,
	Alison Schofield, Ira Weiny, Jonathan Cameron, Vishal Verma,
	Greg Kroah-Hartman, Rafael J. Wysocki

On Tue, Nov 23, 2021 at 11:17:55PM -0800, Dan Williams wrote:
> On Tue, Nov 23, 2021 at 10:33 PM Christoph Hellwig <hch@lst.de> wrote:
> > On Tue, Nov 23, 2021 at 04:40:06PM -0800, Dan Williams wrote:
> > > Let me ask a clarifying question coming from the other direction that
> > > resulted in the creation of the auxiliary bus architecture. Some
> > > background. RDMA is a protocol that may run on top of Ethernet.
> >
> > No, RDMA is a concept.  Linux supports 2 and a half RDMA protocols
> > that run over ethernet (RoCE v1 and v2 and iWarp).
> 
> Yes, I was being too coarse, point taken. However, I don't think that
> changes the observation that multiple vendors are using aux bus to
> share a feature driver across multiple base Ethernet drivers.
> 
> > > Consider the case where you have multiple generations of Ethernet
> > > adapter devices, but they all support common RDMA functionality. You
> > > only have the one PCI device to attach a unique Ethernet driver. What
> > > is an idiomatic way to deploy a module that automatically loads and
> > > attaches to the exported common functionality across adapters that
> > > otherwise have a unique native driver for the hardware device?
> >
> > The whole aux bus drama is mostly because the intel design for these
> > is really fucked up.  All the sane HCAs do not use this model.  All
> > this attachment crap really should not be there.
> 
> I am missing the counter proposal in both Bjorn's and your distaste
> for aux bus and PCIe portdrv?

For the case of PCIe portdrv, the functionality involved is Power
Management Events (PME), Advanced Error Reporting (AER), PCIe native
hotplug, Downstream Port Containment (DPC), and Bandwidth
Notifications.

Currently each has a separate "port service driver" with .probe(),
.remove(), .suspend(), .resume(), etc.

The services share interrupt vectors.  It's quite complicated to set
them up, and it has to be done in the portdrv, not in the individual
drivers.

They also share power state (D0, D3hot, etc.).

In my mind these are not separate devices from the underlying PCI
device, and I don't think splitting the support into "service drivers"
made things better.  I think it would be simpler if these were just
added to pci_init_capabilities() like other optional pieces of PCI
functionality.

Sysfs looks like this:

  /sys/devices/pci0000:00/0000:00:1c.0/                       # Root Port
  /sys/devices/pci0000:00/0000:00:1c.0/0000:00:1c.0:pcie002/  # AER "device"
  /sys/devices/pci0000:00/0000:00:1c.0/0000:00:1c.0:pcie010/  # BW notif

  /sys/bus/pci/devices/0000:00:1c.0 -> ../../../devices/pci0000:00/0000:00:1c.0/
  /sys/bus/pci_express/devices/0000:00:1c.0:pcie002 -> ../../../devices/pci0000:00/0000:00:1c.0/0000:00:1c.0:pcie002/

The "pcie002" names (hex for PCIE_PORT_SERVICE_AER, etc.) are
unintelligible.  I don't know why we have a separate
/sys/bus/pci_express hierarchy.
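
For readers decoding those names, here is a minimal user-space sketch of how
a name like "0000:00:1c.0:pcie002" appears to be composed. It assumes the
descriptor is ((port type - 4) << 8) | service bit, which is my reading of
get_descriptor_id() in drivers/pci/pcie/portdrv_core.c; the constant values
are copied from memory and the helper names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed values, mirroring the kernel headers as I understand them. */
#define PCI_EXP_TYPE_ROOT_PORT     0x4
#define PCIE_PORT_SERVICE_PME      (1 << 0)
#define PCIE_PORT_SERVICE_AER      (1 << 1)
#define PCIE_PORT_SERVICE_HP       (1 << 2)
#define PCIE_PORT_SERVICE_DPC      (1 << 3)
#define PCIE_PORT_SERVICE_BWNOTIF  (1 << 4)

/* Hypothetical stand-in for get_descriptor_id(): port type in the upper
 * bits (biased by 4, the Root Port type), service bit in the low byte. */
static unsigned int descriptor_id(unsigned int port_type, unsigned int service)
{
	assert(port_type >= PCI_EXP_TYPE_ROOT_PORT); /* avoid unsigned wrap */
	return ((port_type - 4) << 8) | service;
}

/* Format "<pci name>:pcie<3 hex digits>"; returns the snprintf() length. */
static int port_service_name(char *buf, size_t len, const char *pci_name,
			     unsigned int port_type, unsigned int service)
{
	return snprintf(buf, len, "%s:pcie%03x", pci_name,
			descriptor_id(port_type, service));
}
```

Under those assumptions, AER on a Root Port yields descriptor 0x002 and
bandwidth notification yields 0x010, which lines up with the two sysfs
entries listed above.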

IIUC, CXL devices will be enumerated by the usual PCI enumeration, so
there will be a struct pci_dev for them, and they will appear under
/sys/devices/pci*/.

They will have the usual PCI Power Management, MSI, AER, DPC, and
similar Capabilities, so the PCI core will manage them.

CXL devices have lots of fancy additional features.  Does that merit
making a separate struct device and a separate sysfs hierarchy for
them?  I don't know.

> > > Another example, the Native PCIe Enclosure Management (NPEM)
> > > specification defines a handful of registers that can appear anywhere
> > > in the PCIe hierarchy. How can you write a common driver that is
> > > generically applicable to any given NPEM instance?
> >
> > Another totally messed up spec.  But then pretty much everything coming
> > from the PCIe SIG in terms of interface tends to be really, really
> > broken lately.

Hotplug is more central to PCI than NPEM is, but NPEM is a little bit
like PCIe native hotplug in concept: hotplug has a few registers that
control downstream indicators, interlock, and power controller; NPEM
has registers that control downstream indicators.

Both are prescribed by the PCIe spec and presumably designed to work
alongside the usual device-specific drivers for bridges, SSDs, etc.

I would at least explore the idea of doing common support by
integrating NPEM into the PCI core.  There would have to be some hook
for the enclosure-specific bits, but I think it's fair for the details
of sending commands and polling for command completed to be part of
the PCI core.

> DVSEC and DOE is more of the same in terms of composing add-on
> features into devices. Hardware vendors want to mix multiple hard-IPs
> into a single device, aux bus is one response. Topology specific buses
> like /sys/bus/cxl are another.

VSEC and DVSEC are pretty much wild cards since the PCIe spec says
nothing about what registers they may contain or how they should work.

DOE *is* specified by PCIe, at least in terms of the data transfer
protocol (interrupt usage, read/write mailbox, etc).  I think that,
and the fact that it's not specific to CXL, means we need some kind of
PCI core interface to do the transfers.

> This CXL port driver is offering enumeration, link management, and
> memory decode setup services to the rest of the topology. I see it as
> similar to management protocol services offered by libsas.

Bjorn

* Re: [PATCH 20/23] cxl/port: Introduce a port driver
  2021-12-02 21:24                 ` Bjorn Helgaas
@ 2021-12-03  1:38                   ` Dan Williams
  2021-12-03 22:03                     ` Bjorn Helgaas
  0 siblings, 1 reply; 133+ messages in thread
From: Dan Williams @ 2021-12-03  1:38 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Christoph Hellwig, Ben Widawsky, linux-cxl, Linux PCI,
	Alison Schofield, Ira Weiny, Jonathan Cameron, Vishal Verma,
	Greg Kroah-Hartman, Rafael J. Wysocki, Stuart Hayes

[ add Stuart for NPEM feedback ]

On Thu, Dec 2, 2021 at 1:24 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> On Tue, Nov 23, 2021 at 11:17:55PM -0800, Dan Williams wrote:
> > On Tue, Nov 23, 2021 at 10:33 PM Christoph Hellwig <hch@lst.de> wrote:
> > > On Tue, Nov 23, 2021 at 04:40:06PM -0800, Dan Williams wrote:
> > > > Let me ask a clarifying question coming from the other direction that
> > > > resulted in the creation of the auxiliary bus architecture. Some
> > > > background. RDMA is a protocol that may run on top of Ethernet.
> > >
> > > No, RDMA is a concept.  Linux supports 2 and a half RDMA protocols
> > > that run over ethernet (RoCE v1 and v2 and iWarp).
> >
> > Yes, I was being too coarse, point taken. However, I don't think that
> > changes the observation that multiple vendors are using aux bus to
> > share a feature driver across multiple base Ethernet drivers.
> >
> > > > Consider the case where you have multiple generations of Ethernet
> > > > adapter devices, but they all support common RDMA functionality. You
> > > > only have the one PCI device to attach a unique Ethernet driver. What
> > > > is an idiomatic way to deploy a module that automatically loads and
> > > > attaches to the exported common functionality across adapters that
> > > > otherwise have a unique native driver for the hardware device?
> > >
> > > The whole aux bus drama is mostly because the intel design for these
> > > is really fucked up.  All the sane HCAs do not use this model.  All
> > > > this attachment crap really should not be there.
> >
> > I am missing the counter proposal in both Bjorn's and your distaste
> > for aux bus and PCIe portdrv?
>
> For the case of PCIe portdrv, the functionality involved is Power
> Management Events (PME), Advanced Error Reporting (AER), PCIe native
> hotplug, Downstream Port Containment (DPC), and Bandwidth
> Notifications.
>
> Currently each has a separate "port service driver" with .probe(),
> .remove(), .suspend(), .resume(), etc.
>
> The services share interrupt vectors.  It's quite complicated to set
> them up, and it has to be done in the portdrv, not in the individual
> drivers.
>
> They also share power state (D0, D3hot, etc).
>
> In my mind these are not separate devices from the underlying PCI
> device, and I don't think splitting the support into "service drivers"
> made things better.  I think it would be simpler if these were just
> added to pci_init_capabilities() like other optional pieces of PCI
> functionality.
>
> Sysfs looks like this:
>
>   /sys/devices/pci0000:00/0000:00:1c.0/                       # Root Port
>   /sys/devices/pci0000:00/0000:00:1c.0/0000:00:1c.0:pcie002/  # AER "device"
>   /sys/devices/pci0000:00/0000:00:1c.0/0000:00:1c.0:pcie010/  # BW notif
>
>   /sys/bus/pci/devices/0000:00:1c.0 -> ../../../devices/pci0000:00/0000:00:1c.0/
>   /sys/bus/pci_express/devices/0000:00:1c.0:pcie002 -> ../../../devices/pci0000:00/0000:00:1c.0/0000:00:1c.0:pcie002/
>
> The "pcie002" names (hex for PCIE_PORT_SERVICE_AER, etc.) are
> unintelligible.  I don't know why we have a separate
> /sys/bus/pci_express hierarchy.
>
> IIUC, CXL devices will be enumerated by the usual PCI enumeration, so
> there will be a struct pci_dev for them, and they will appear under
> /sys/devices/pci*/.
>
> They will have the usual PCI Power Management, MSI, AER, DPC, and
> similar Capabilities, so the PCI core will manage them.
>
> CXL devices have lots of fancy additional features.  Does that merit
> making a separate struct device and a separate sysfs hierarchy for
> them?  I don't know.

Thank you, this framing really helps.

So, if I look at this through the lens of the "can this just be
handled by pci_init_capabilities()?" guardband, where the capability
is device-scoped and shares resources that a driver for the same
device would use, then yes, PCIe port services do not merit a separate
bus.

CXL memory expansion is distinctly not in that category. CXL is a
distinct specification (i.e. unlike PCIe port services which are
literally in the PCI-SIG purview), distinct transport/data layer
(effectively a different physical connection on the wire), and is
composable in ways that PCIe is not.

For example, if you trigger FLR on a PCI device you would expect the
entirety of pci_init_capabilities() needs to be saved / restored; CXL
state is not affected by FLR.

The separate link layer for CXL means that mere device visibility is
not sufficient to determine CXL operation. Whereas with a PCI driver
if you can see the device you know that the device is ready to be the
target of config, io, and mmio cycles, CXL.mem operation may not be
available when the device is visible to enumeration.

The composability of CXL places the highest demand for an independent
'struct bus_type' and registering CXL devices for their corresponding
PCIe devices. The bus is a rendezvous for all the moving pieces needed
to validate and provision a CXL memory region that may span multiple
endpoints, switches and host bridges. An action like reprogramming
memory decode of an endpoint needs to be coordinated across the entire
topology.

The fact that the PCI core remains blissfully unaware of all these
interdependencies is a feature because CXL is effectively a super-set
of PCIe for fast-path CXL.mem operation, even though it is derivative
for enumeration and slow-path manageability.

So I hope you see CXL's need to create some PCIe-symbiotic objects in
order to compose CXL objects (like interleaved memory regions) that
have no PCIe analog.

> > > > Another example, the Native PCIe Enclosure Management (NPEM)
> > > > specification defines a handful of registers that can appear anywhere
> > > > in the PCIe hierarchy. How can you write a common driver that is
> > > > generically applicable to any given NPEM instance?
> > >
> > > Another totally messed up spec.  But then pretty much everything coming
> > > from the PCIe SIG in terms of interface tends to be really, really
> > > broken lately.
>
> Hotplug is more central to PCI than NPEM is, but NPEM is a little bit
> like PCIe native hotplug in concept: hotplug has a few registers that
> control downstream indicators, interlock, and power controller; NPEM
> has registers that control downstream indicators.
>
> Both are prescribed by the PCIe spec and presumably designed to work
> alongside the usual device-specific drivers for bridges, SSDs, etc.
>
> I would at least explore the idea of doing common support by
> integrating NPEM into the PCI core.  There would have to be some hook
> for the enclosure-specific bits, but I think it's fair for the details
> of sending commands and polling for command completed to be part of
> the PCI core.

The problem with NPEM compared to hotplug LED signaling is that it has
the strange property that an NPEM higher in the topology might
override one lower in the topology. This is to support things like
NVMe enclosures where the NVMe device itself may have an NPEM
instance, but that instance is of course not suitable for signaling
that the device is not present. So, instead the system may be designed
to have an NPEM in the upstream switch port and disable the NPEM
instances in the device. Platform firmware decides which NPEM is in
use.

It also has the "fun" property of additionally being overridden by ACPI.

Stuart, have a look at collapsing the auxiliary-device approach into
pci_init_capabilities() and whether that can still coordinate with the
enclosure code.

One of the nice properties of the auxiliary organization is well
defined module boundaries. Will NPEM in the PCI core now force
CONFIG_ENCLOSURE_SERVICES to be built-in? That seems an unwanted side
effect to me.

> > DVSEC and DOE is more of the same in terms of composing add-on
> > features into devices. Hardware vendors want to mix multiple hard-IPs
> > into a single device, aux bus is one response. Topology specific buses
> > like /sys/bus/cxl are another.
>
> VSEC and DVSEC are pretty much wild cards since the PCIe spec says
> nothing about what registers they may contain or how they should work.
>
> DOE *is* specified by PCIe, at least in terms of the data transfer
> protocol (interrupt usage, read/write mailbox, etc).  I think that,
> and the fact that it's not specific to CXL, means we need some kind of
> PCI core interface to do the transfers.

DOE transfer code belongs in drivers/pci/ period, but does it really
need to be in drivers/pci/built-in.a for everyone regardless of
whether it is being used or not?

* Re: [PATCH 20/23] cxl/port: Introduce a port driver
  2021-12-03  1:38                   ` Dan Williams
@ 2021-12-03 22:03                     ` Bjorn Helgaas
  2021-12-04  1:24                       ` Dan Williams
  0 siblings, 1 reply; 133+ messages in thread
From: Bjorn Helgaas @ 2021-12-03 22:03 UTC (permalink / raw)
  To: Dan Williams
  Cc: Christoph Hellwig, Ben Widawsky, linux-cxl, Linux PCI,
	Alison Schofield, Ira Weiny, Jonathan Cameron, Vishal Verma,
	Greg Kroah-Hartman, Rafael J. Wysocki, Stuart Hayes

On Thu, Dec 02, 2021 at 05:38:17PM -0800, Dan Williams wrote:
> [ add Stuart for NPEM feedback ]
> 
> On Thu, Dec 2, 2021 at 1:24 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Tue, Nov 23, 2021 at 11:17:55PM -0800, Dan Williams wrote:
> > > On Tue, Nov 23, 2021 at 10:33 PM Christoph Hellwig <hch@lst.de> wrote:
> > > > On Tue, Nov 23, 2021 at 04:40:06PM -0800, Dan Williams wrote:
> > > > > Let me ask a clarifying question coming from the other direction that
> > > > > resulted in the creation of the auxiliary bus architecture. Some
> > > > > background. RDMA is a protocol that may run on top of Ethernet.
> > > >
> > > > No, RDMA is a concept.  Linux supports 2 and a half RDMA protocols
> > > > that run over ethernet (RoCE v1 and v2 and iWarp).
> > >
> > > Yes, I was being too coarse, point taken. However, I don't think that
> > > changes the observation that multiple vendors are using aux bus to
> > > share a feature driver across multiple base Ethernet drivers.
> > >
> > > > > Consider the case where you have multiple generations of Ethernet
> > > > > adapter devices, but they all support common RDMA functionality. You
> > > > > only have the one PCI device to attach a unique Ethernet driver. What
> > > > > is an idiomatic way to deploy a module that automatically loads and
> > > > > attaches to the exported common functionality across adapters that
> > > > > otherwise have a unique native driver for the hardware device?
> > > >
> > > > The whole aux bus drama is mostly because the intel design for these
> > > > is really fucked up.  All the sane HCAs do not use this model.  All
> > > > > this attachment crap really should not be there.
> > >
> > > I am missing the counter proposal in both Bjorn's and your distaste
> > > for aux bus and PCIe portdrv?
> >
> > For the case of PCIe portdrv, the functionality involved is Power
> > Management Events (PME), Advanced Error Reporting (AER), PCIe native
> > hotplug, Downstream Port Containment (DPC), and Bandwidth
> > Notifications.
> >
> > Currently each has a separate "port service driver" with .probe(),
> > .remove(), .suspend(), .resume(), etc.
> >
> > The services share interrupt vectors.  It's quite complicated to set
> > them up, and it has to be done in the portdrv, not in the individual
> > drivers.
> >
> > They also share power state (D0, D3hot, etc).
> >
> > In my mind these are not separate devices from the underlying PCI
> > device, and I don't think splitting the support into "service drivers"
> > made things better.  I think it would be simpler if these were just
> > added to pci_init_capabilities() like other optional pieces of PCI
> > functionality.
> >
> > Sysfs looks like this:
> >
> >   /sys/devices/pci0000:00/0000:00:1c.0/                       # Root Port
> >   /sys/devices/pci0000:00/0000:00:1c.0/0000:00:1c.0:pcie002/  # AER "device"
> >   /sys/devices/pci0000:00/0000:00:1c.0/0000:00:1c.0:pcie010/  # BW notif
> >
> >   /sys/bus/pci/devices/0000:00:1c.0 -> ../../../devices/pci0000:00/0000:00:1c.0/
> >   /sys/bus/pci_express/devices/0000:00:1c.0:pcie002 -> ../../../devices/pci0000:00/0000:00:1c.0/0000:00:1c.0:pcie002/
> >
> > The "pcie002" names (hex for PCIE_PORT_SERVICE_AER, etc.) are
> > unintelligible.  I don't know why we have a separate
> > /sys/bus/pci_express hierarchy.
> >
> > IIUC, CXL devices will be enumerated by the usual PCI enumeration, so
> > there will be a struct pci_dev for them, and they will appear under
> > /sys/devices/pci*/.
> >
> > They will have the usual PCI Power Management, MSI, AER, DPC, and
> > similar Capabilities, so the PCI core will manage them.
> >
> > CXL devices have lots of fancy additional features.  Does that merit
> > making a separate struct device and a separate sysfs hierarchy for
> > them?  I don't know.
> 
> Thank you, this framing really helps.
> 
> So, if I look at this through the lens of the "can this just be
> handled by pci_init_capabilities()?" guardband, where the capability
> is device-scoped and shares resources that a driver for the same
> device would use, then yes, PCIe port services do not merit a separate
> bus.
> 
> CXL memory expansion is distinctly not in that category. CXL is a
> distinct specification (i.e. unlike PCIe port services which are
> literally in the PCI-SIG purview), distinct transport/data layer
> (effectively a different physical connection on the wire), and is
> composable in ways that PCIe is not.
> 
> For example, if you trigger FLR on a PCI device you would expect the
> entirety of pci_init_capabilities() needs to be saved / restored; CXL
> state is not affected by FLR.
> 
> The separate link layer for CXL means that mere device visibility is
> not sufficient to determine CXL operation. Whereas with a PCI driver
> if you can see the device you know that the device is ready to be the
> target of config, io, and mmio cycles, 

Not strictly true.  A PCI device must respond to config transactions
within a short time after reset, but it does not respond to IO or MEM
transactions until a driver sets PCI_COMMAND_IO or PCI_COMMAND_MEMORY.
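
A small sketch of that distinction, written as plain user-space C over a
copy of the config header. PCI_COMMAND (offset 0x04), PCI_COMMAND_IO (0x1)
and PCI_COMMAND_MEMORY (0x2) are the standard register definitions; the
helper names are invented for this example and are not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_COMMAND         0x04  /* 16-bit Command register offset */
#define PCI_COMMAND_IO      0x1   /* respond to I/O space accesses */
#define PCI_COMMAND_MEMORY  0x2   /* respond to Memory space accesses */

/* Read a little-endian 16-bit field from a config-space snapshot. */
static uint16_t cfg_read16(const uint8_t *cfg, unsigned int off)
{
	assert(cfg != NULL);
	return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

/* Config cycles work once the device is visible; decode is opt-in. */
static bool io_decode_enabled(const uint8_t *cfg)
{
	return (cfg_read16(cfg, PCI_COMMAND) & PCI_COMMAND_IO) != 0;
}

static bool mem_decode_enabled(const uint8_t *cfg)
{
	return (cfg_read16(cfg, PCI_COMMAND) & PCI_COMMAND_MEMORY) != 0;
}
```

A freshly reset device would typically read back a Command register of 0
here, so both helpers return false even though the header itself is
readable.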

> ... CXL.mem operation may not be available when the device is
> visible to enumeration.

> The composability of CXL places the highest demand for an independent
> 'struct bus_type' and registering CXL devices for their corresponding
> PCIe devices. The bus is a rendezvous for all the moving pieces needed
> to validate and provision a CXL memory region that may span multiple
> endpoints, switches and host bridges. An action like reprogramming
> memory decode of an endpoint needs to be coordinated across the entire
> topology.

I guess software uses the CXL DVSEC to distinguish a CXL device from a
PCIe device, at least for 00.0.  Not sure about non-Dev 0 Func 0
devices; maybe this implies other functions in the same device are
part of the same CXL device?

A CXL device may comprise several PCIe devices, and I think they may
have non-CXL Capabilities, so I assume you need a struct pci_dev for
each PCIe device?

Looks like a single CXL DVSEC controls the entire CXL device
(including several PCIe devices), so I assume you want some kind of
struct cxl_dev to represent that aggregation?

> The fact that the PCI core remains blissfully unaware of all these
> interdependencies is a feature because CXL is effectively a super-set
> of PCIe for fast-path CXL.mem operation, even though it is derivative
> for enumeration and slow-path manageability.

I don't know how blissfully unaware PCI can be.  Can a user remove a
PCIe device that's part of a CXL device via sysfs?  Is the PCIe device
available for drivers to claim?  Do we need coordination between
cxl_driver_register() and pci_register_driver()?  Does CXL impose new
constraints on PCI power management?

> So I hope you see CXL's need to create some PCIe-symbiotic objects in
> order to compose CXL objects (like interleaved memory regions) that
> have no PCIe analog.

> > Hotplug is more central to PCI than NPEM is, but NPEM is a little bit
> > like PCIe native hotplug in concept: hotplug has a few registers that
> > control downstream indicators, interlock, and power controller; NPEM
> > has registers that control downstream indicators.
> >
> > Both are prescribed by the PCIe spec and presumably designed to work
> > alongside the usual device-specific drivers for bridges, SSDs, etc.
> >
> > I would at least explore the idea of doing common support by
> > integrating NPEM into the PCI core.  There would have to be some hook
> > for the enclosure-specific bits, but I think it's fair for the details
> > of sending commands and polling for command completed to be part of
> > the PCI core.
> 
> The problem with NPEM compared to hotplug LED signaling is that it has
> the strange property that an NPEM higher in the topology might
> override one lower in the topology. This is to support things like
> NVMe enclosures where the NVMe device itself may have an NPEM
> instance, but that instance is of course not suitable for signaling
> that the device is not present.

I would envision the PCI core providing only a bare-bones "send this
command and wait for it to complete" sort of interface.  Everything
else, including deciding which device to use, would go elsewhere.

> So, instead the system may be designed to have an NPEM in the
> upstream switch port and disable the NPEM instances in the device.
> Platform firmware decides which NPEM is in use.

Since you didn't mention a firmware interface to communicate which
NPEM is in use, I assume firmware does this by preventing other
devices from advertising NPEM support?

> It also has the "fun" property of additionally being overridden by ACPI.
> ...
> One of the nice properties of the auxiliary organization is well
> defined module boundaries. Will NPEM in the PCI core now force
> CONFIG_ENCLOSURE_SERVICES to be built-in? That seems an unwanted side
> effect to me.

If the PCI core provides only the mechanism, and the smarts of NPEM
are in something analogous to drivers/scsi/ses.c, I don't think it
would force CONFIG_ENCLOSURE_SERVICES to be built-in.

> > DOE *is* specified by PCIe, at least in terms of the data transfer
> > protocol (interrupt usage, read/write mailbox, etc).  I think that,
> > and the fact that it's not specific to CXL, means we need some kind of
> > PCI core interface to do the transfers.
> 
> DOE transfer code belongs in drivers/pci/ period, but does it really
> need to be in drivers/pci/built-in.a for everyone regardless of
> whether it is being used or not?

I think my opinion would depend on how small the DOE transfer code
could be made and how much it would complicate things to make it a
module.  And also how we could enforce whatever mutual exclusion we
need to make it safe.

Bjorn

* Re: [PATCH 20/23] cxl/port: Introduce a port driver
  2021-12-03 22:03                     ` Bjorn Helgaas
@ 2021-12-04  1:24                       ` Dan Williams
  2021-12-07  2:56                         ` Bjorn Helgaas
  0 siblings, 1 reply; 133+ messages in thread
From: Dan Williams @ 2021-12-04  1:24 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Christoph Hellwig, Ben Widawsky, linux-cxl, Linux PCI,
	Alison Schofield, Ira Weiny, Jonathan Cameron, Vishal Verma,
	Greg Kroah-Hartman, Rafael J. Wysocki, Stuart Hayes

On Fri, Dec 3, 2021 at 2:03 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Thu, Dec 02, 2021 at 05:38:17PM -0800, Dan Williams wrote:
> > [ add Stuart for NPEM feedback ]
> >
> > On Thu, Dec 2, 2021 at 1:24 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > On Tue, Nov 23, 2021 at 11:17:55PM -0800, Dan Williams wrote:
> > > > On Tue, Nov 23, 2021 at 10:33 PM Christoph Hellwig <hch@lst.de> wrote:
> > > > > On Tue, Nov 23, 2021 at 04:40:06PM -0800, Dan Williams wrote:
> > > > > > Let me ask a clarifying question coming from the other direction that
> > > > > > resulted in the creation of the auxiliary bus architecture. Some
> > > > > > background. RDMA is a protocol that may run on top of Ethernet.
> > > > >
> > > > > No, RDMA is a concept.  Linux supports 2 and a half RDMA protocols
> > > > > that run over ethernet (RoCE v1 and v2 and iWarp).
> > > >
> > > > Yes, I was being too coarse, point taken. However, I don't think that
> > > > changes the observation that multiple vendors are using aux bus to
> > > > share a feature driver across multiple base Ethernet drivers.
> > > >
> > > > > > Consider the case where you have multiple generations of Ethernet
> > > > > > adapter devices, but they all support common RDMA functionality. You
> > > > > > only have the one PCI device to attach a unique Ethernet driver. What
> > > > > > is an idiomatic way to deploy a module that automatically loads and
> > > > > > attaches to the exported common functionality across adapters that
> > > > > > otherwise have a unique native driver for the hardware device?
> > > > >
> > > > > The whole aux bus drama is mostly because the intel design for these
> > > > > is really fucked up.  All the sane HCAs do not use this model.  All
> > > > > > this attachment crap really should not be there.
> > > >
> > > > I am missing the counter proposal in both Bjorn's and your distaste
> > > > for aux bus and PCIe portdrv?
> > >
> > > For the case of PCIe portdrv, the functionality involved is Power
> > > Management Events (PME), Advanced Error Reporting (AER), PCIe native
> > > hotplug, Downstream Port Containment (DPC), and Bandwidth
> > > Notifications.
> > >
> > > Currently each has a separate "port service driver" with .probe(),
> > > .remove(), .suspend(), .resume(), etc.
> > >
> > > The services share interrupt vectors.  It's quite complicated to set
> > > them up, and it has to be done in the portdrv, not in the individual
> > > drivers.
> > >
> > > They also share power state (D0, D3hot, etc).
> > >
> > > In my mind these are not separate devices from the underlying PCI
> > > device, and I don't think splitting the support into "service drivers"
> > > made things better.  I think it would be simpler if these were just
> > > added to pci_init_capabilities() like other optional pieces of PCI
> > > functionality.
> > >
> > > Sysfs looks like this:
> > >
> > >   /sys/devices/pci0000:00/0000:00:1c.0/                       # Root Port
> > >   /sys/devices/pci0000:00/0000:00:1c.0/0000:00:1c.0:pcie002/  # AER "device"
> > >   /sys/devices/pci0000:00/0000:00:1c.0/0000:00:1c.0:pcie010/  # BW notif
> > >
> > >   /sys/bus/pci/devices/0000:00:1c.0 -> ../../../devices/pci0000:00/0000:00:1c.0/
> > >   /sys/bus/pci_express/devices/0000:00:1c.0:pcie002 -> ../../../devices/pci0000:00/0000:00:1c.0/0000:00:1c.0:pcie002/
> > >
> > > The "pcie002" names (hex for PCIE_PORT_SERVICE_AER, etc.) are
> > > unintelligible.  I don't know why we have a separate
> > > /sys/bus/pci_express hierarchy.
> > >
> > > IIUC, CXL devices will be enumerated by the usual PCI enumeration, so
> > > there will be a struct pci_dev for them, and they will appear under
> > > /sys/devices/pci*/.
> > >
> > > They will have the usual PCI Power Management, MSI, AER, DPC, and
> > > similar Capabilities, so the PCI core will manage them.
> > >
> > > CXL devices have lots of fancy additional features.  Does that merit
> > > making a separate struct device and a separate sysfs hierarchy for
> > > them?  I don't know.
> >
> > Thank you, this framing really helps.
> >
> > So, if I look at this through the lens of the "can this just be
> > handled by pci_init_capabilities()?" guardband, where the capability
> > is device-scoped and shares resources that a driver for the same
> > device would use, then yes, PCIe port services do not merit a separate
> > bus.
> >
> > CXL memory expansion is distinctly not in that category. CXL is a
> > distinct specification (i.e. unlike PCIe port services which are
> > literally in the PCI-SIG purview), distinct transport/data layer
> > (effectively a different physical connection on the wire), and is
> > composable in ways that PCIe is not.
> >
> > For example, if you trigger FLR on a PCI device you would expect the
> > entirety of pci_init_capabilities() needs to be saved / restored; CXL
> > state is not affected by FLR.
> >
> > The separate link layer for CXL means that mere device visibility is
> > not sufficient to determine CXL operation. Whereas with a PCI driver
> > if you can see the device you know that the device is ready to be the
> > target of config, io, and mmio cycles,
>
> Not strictly true.  A PCI device must respond to config transactions
> within a short time after reset, but it does not respond to IO or MEM
> transactions until a driver sets PCI_COMMAND_IO or PCI_COMMAND_MEMORY.

Right, what I was attempting to convey is that it's not like CXL.mem.
While flipping a bit on the device turns on PCI.mmio target support,
there's additional polling and status checking to be done after the
device is enabled to be a target for CXL.mem. I.e. the CXL.mem
configuration is done via PCI.mmio* (*for CXL 2.0 devices) and only
after the device negotiates a CXL link beyond the base PCIe link. It
is also the case that the device may not be ready for CXL.mem until
all the devices that compose the same range are available as well.

>
> > ... CXL.mem operation may not be available when the device is
> > visible to enumeration.
>
> > The composability of CXL places the highest demand for an independent
> > 'struct bus_type' and registering CXL devices for their corresponding
> > PCIe devices. The bus is a rendezvous for all the moving pieces needed
> > to validate and provision a CXL memory region that may span multiple
> > endpoints, switches and host bridges. An action like reprogramming
> > memory decode of an endpoint needs to be coordinated across the entire
> > topology.
>
> I guess software uses the CXL DVSEC to distinguish a CXL device from a
> PCIe device, at least for 00.0.

drivers/cxl/pci.c attaches to: "PCI_DEVICE_CLASS((PCI_CLASS_MEMORY_CXL
<< 8 | CXL_MEMORY_PROGIF), ~0)"

I am not aware of any restriction requiring that class code to appear at
function 0.
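
To make the matching rule concrete, here is a user-space sketch of the
class comparison that a PCI_DEVICE_CLASS() entry implies. It assumes
PCI_CLASS_MEMORY_CXL is 0x0502 and CXL_MEMORY_PROGIF is 0x10, and that the
core compares ((id class ^ device class) & mask) against zero; the function
names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PCI_CLASS_MEMORY_CXL  0x0502  /* assumed base class + sub-class */
#define CXL_MEMORY_PROGIF     0x10    /* assumed programming interface */

/* Same shape as the class test in PCI id matching: every bit selected by
 * the mask must agree between the table entry and the device. */
static bool class_matches(uint32_t dev_class, uint32_t id_class,
			  uint32_t id_mask)
{
	assert((dev_class >> 24) == 0); /* class code is only 24 bits wide */
	return ((id_class ^ dev_class) & id_mask) == 0;
}

/* A mask of ~0 means base class, sub-class and prog-if must all match. */
static bool is_cxl_memdev_class(uint32_t dev_class)
{
	return class_matches(dev_class,
			     PCI_CLASS_MEMORY_CXL << 8 | CXL_MEMORY_PROGIF,
			     ~0u);
}
```

Nothing in this comparison looks at the device or function number, which
is consistent with the class code not being tied to function 0.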

> Not sure about non-Dev 0 Func 0
> devices; maybe this implies other functions in the same device are
> part of the same CXL device?

DVSEC is really only telling us the layout of the MMIO register space.
A CXL Memory Device (CXL.mem endpoint) implements so-called "device"
registers and "component" registers. The "device" registers control
things that a traditional PCI driver would control like a mailbox
interface and device local status registers. The "component" registers
are what mediate access to the CXL.mem decode space.

It is somewhat confusing because CXL 1.1 devices used the DVSEC
configuration registers directly for CXL.mem configuration, but CXL
2.0 ditched that organization. The Linux driver is only targeting CXL
2.0+ devices as platform firmware owns setting up CXL 1.1 memory
expanders. As far as Linux is concerned CXL 1.1 looks like DDR i.e.
it's just address space set up by the BIOS and populated into EFI and
ACPI tables. CXL 1.1 is also implementing a 1:1 device-to-memory-range
association. CXL 2.0 allows for N:1 devices-to-memory-range and
dynamic configuration of that by the OS. CXL 1.1 platform firmware
locks out the OS from making configuration changes.
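
As a rough illustration of N:1 versus 1:1, the sketch below picks which
device in a region would service a given host physical address under a
simple modulo interleave. This is a simplification for discussion: the
struct, its fields and the function are hypothetical, and real HDM decoder
programming has more constraints than this arithmetic shows.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of one memory range and its interleave. */
struct region_sketch {
	uint64_t base;         /* first host physical address of the range */
	uint64_t size;         /* length of the range in bytes */
	unsigned int ways;     /* number of devices backing the range */
	uint64_t granularity;  /* bytes sent to one device before rotating */
};

/* Return the index of the device that backs @hpa. With ways == 1 this is
 * the 1:1 case; with ways > 1 consecutive chunks rotate across devices. */
static unsigned int target_for_hpa(const struct region_sketch *r,
				   uint64_t hpa)
{
	assert(r->ways > 0 && r->granularity > 0);
	assert(hpa >= r->base && hpa - r->base < r->size);
	return (unsigned int)(((hpa - r->base) / r->granularity) % r->ways);
}
```

The point of the sketch is that with more than one way, no single device
owns a contiguous view of the range, so changing one device's decode
affects every other participant.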

> A CXL device may comprise several PCIe devices, and I think they may
> have non-CXL Capabilities, so I assume you need a struct pci_dev for
> each PCIe device?
>
> Looks like a single CXL DVSEC controls the entire CXL device
> (including several PCIe devices), so I assume you want some kind of
> struct cxl_dev to represent that aggregation?

We have 3 classes of a 'cxl_dev' in drivers/cxl:

"mem" devices (PCIe endpoints)
"port" devices (PCIe Upstream Switch Ports, PCIe Root Ports, ACPI0016
Host Bridge devices, and ACPI0017 CXL Root devices)
"region" devices (aggregate devices composed of multiple mem devices).

The "mem" driver is tasked with enumerating all "ports" in the path
between itself and the ACPI0017 root and validating that a CXL link is
up at each hop before enumerating "region" devices.
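
A toy model of that walk, only to show the shape of the check. The struct
and function below are hypothetical and are not the drivers/cxl data
structures; they assume each port records its parent and whether a CXL
link was negotiated at that hop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical port node: parent is NULL at the ACPI0017 root. */
struct port_sketch {
	const char *name;
	const struct port_sketch *parent;
	bool cxl_link_up;
};

/* Walk from the port nearest the endpoint up to the root. Returns the
 * number of hops validated, or -1 as soon as one hop lacks a CXL link,
 * in which case region enumeration would not proceed. */
static int validate_path_to_root(const struct port_sketch *first)
{
	int hops = 0;

	assert(first != NULL);
	for (const struct port_sketch *p = first; p; p = p->parent) {
		if (!p->cxl_link_up)
			return -1;
		hops++;
	}
	return hops;
}
```

For a chain of root, host bridge and root port that are all up, this
returns 3; flipping any one of them to link-down makes the whole path
fail.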

>
> > The fact that the PCI core remains blissfully unaware of all these
> > interdependencies is a feature because CXL is effectively a super-set
> > of PCIe for fast-path CXL.mem operation, even though it is derivative
> > for enumeration and slow-path manageability.
>
> I don't know how blissfully unaware PCI can be.  Can a user remove a
> PCIe device that's part of a CXL device via sysfs?

Yes. If that PCIe device is an endpoint then the corresponding "mem"
driver will get a ->remove() event because it is registered as a child
of that parent PCIe device. That in turn will tear down the relevant
part of the CXL port topology.

However, a gap (deliberate design choice?) I currently see in the PCI
hotplug implementation is the lack of an ability for an endpoint PCI
driver to dynamically deny hotplug requests based on the state of the
device. pci_ignore_hotplug() seems inadequate for the job of making
sure that someone first disables all participation of a "mem" device
in all regions before asking for its physical removal. However, if
someone force removes a device the driver and CXL subsystem will do
the right thing and go fail that memory region for all users. I'm
working with other DAX developers on a range based memory_failure()
api for this case.

> Is the PCIe device available for drivers to claim?

drivers/cxl/pci.c *is* a native PCI driver for CXL memory expanders.
You might be thinking of CXL accelerator devices, where the CXL.cache
and CXL.mem capabilities are incremental capabilities for the
accelerator. So, for example, a GPU with CXL.mem capabilities, would
be mostly ignored by the drivers/cxl/ stack by default. Only if that
device+driver wanted to emulate a generic memory expander and share
its memory space with the host might it instantiate mem, port, and
region objects. Otherwise CXL for accelerators is mostly a transparent
capability that the OS may not need to manage. Rather than copy data
back and forth between host and device, a CXL-enabled GPU's driver can
just assign pointers to its local memory space and trust that it is
coherent. That's a gross oversimplification, but I want to convey that
there are CXL devices like accelerators that are the responsibility of
each accelerator PCI driver, vs CXL memory expander devices which are
generic providers of "System RAM" and "Persistent Memory" resources
and need the "mem", "port", "region" schema.

> Do we need coordination between cxl_driver_register() and pci_register_driver()?

They do not collide, and I think this goes back to the concern about
whether drivers/cxl/ is scanning for all CXL DVSECs. No, it only cares
about the CXL DVSEC in CXL memory expander endpoints with the
aforementioned class code, and the CXL DVSEC in every upstream switch
port in the ancestry up to the CXL root device (ACPI0017). CXL
accelerator drivers will always use pci_register_driver() and can
decide to register their DVSEC with the CXL core, or keep it private
to themselves. I imagine a GPU accelerator might register a "mem"
device if it needs to get CXL.mem decode set up after a hotplug event,
but if it's only using CXL.cache, or if its memory space was
established by platform firmware then drivers/cxl/ is not involved.
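
For reference, a minimal userspace sketch of the class-code match that
cxl_pci relies on. The struct and helper names are made up for
illustration, and the two constants are the values I recall from
include/linux/pci_ids.h and drivers/cxl/, so treat them as assumptions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, recalled from the kernel headers of this era. */
#define PCI_CLASS_MEMORY_CXL 0x0502
#define CXL_MEMORY_PROGIF    0x10

/* Hypothetical, pared-down stand-in for struct pci_device_id. */
struct class_id {
	uint32_t class;      /* base class, sub-class, prog-if (24 bits) */
	uint32_t class_mask; /* which bits of 'class' must match */
};

static const struct class_id cxl_mem_id = {
	.class = (PCI_CLASS_MEMORY_CXL << 8) | CXL_MEMORY_PROGIF,
	.class_mask = ~0u,
};

/* XOR-and-mask comparison of an ID entry against a device class. */
static bool class_matches(const struct class_id *id, uint32_t dev_class)
{
	return ((id->class ^ dev_class) & id->class_mask) == 0;
}
```

An endpoint reporting class 0x050210 matches. A device that carries a
CXL DVSEC under some other class code does not match, and stays with
its own PCI driver.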

> Does CXL impose new constraints on PCI power management?

Recall that CXL is built such that it could be supported by a legacy
operating system where platform firmware owns the setup of devices,
just like DDR memory does not need a driver. This is where CXL 1.1
played until CXL 2.0 added so much configuration complexity (hotplug,
interleave, persistent memory) that it started to need OS help. The
Linux PCIe PM code will not notice a difference, but behind the scenes
the device, if it is offering coherent memory to the CPU, will be
coordinating with the CPU like it was part of the CPU package and not
a discrete device. I do not see new PM software enabling required in
my reading of "Section 10.0 Power Management" of the CXL
specification.

> > So I hope you see CXL's need to create some PCIe-symbiotic objects in
> > order to compose CXL objects (like interleaved memory regions) that
> > have no PCIe analog.
>
> > > Hotplug is more central to PCI than NPEM is, but NPEM is a little bit
> > > like PCIe native hotplug in concept: hotplug has a few registers that
> > > control downstream indicators, interlock, and power controller; NPEM
> > > has registers that control downstream indicators.
> > >
> > > Both are prescribed by the PCIe spec and presumably designed to work
> > > alongside the usual device-specific drivers for bridges, SSDs, etc.
> > >
> > > I would at least explore the idea of doing common support by
> > > integrating NPEM into the PCI core.  There would have to be some hook
> > > for the enclosure-specific bits, but I think it's fair for the details
> > > of sending commands and polling for command completed to be part of
> > > the PCI core.
> >
> > The problem with NPEM compared to hotplug LED signaling is that it has
> > the strange property that an NPEM higher in the topology might
> > override one lower in the topology. This is to support things like
> > NVME enclosures where the NVME device itself may have an NPEM
> > instance, but that instance is of course not suitable for signaling
> > that the device is not present.
>
> I would envision the PCI core providing only a bare-bones "send this
> command and wait for it to complete" sort of interface.  Everything
> else, including deciding which device to use, would go elsewhere.
>
> > So, instead the system may be designed to have an NPEM in the
> > upstream switch port and disable the NPEM instances in the device.
> > Platform firmware decides which NPEM is in use.
>
> Since you didn't mention a firmware interface to communicate which
> NPEM is in use, I assume firmware does this by preventing other
> devices from advertising NPEM support?

That's also my assumption. If the OS sees a disabled NPEM in the
topology it just needs to assume firmware knew what it was doing when
it disabled it. I wish NPEM was better specified than "trust firmware
to do the right thing via an ambiguous signal".

>
> > It also has the "fun" property of additionally being overridden by ACPI.
> > ...
> > One of the nice properties of the auxiliary organization is well
> > defined module boundaries. Will NPEM in the PCI core now force
> > CONFIG_ENCLOSURE_SERVICES to be built-in? That seems an unwanted side
> > effect to me.
>
> If the PCI core provides only the mechanism, and the smarts of NPEM
> are in something analogous to drivers/scsi/ses.c, I don't think it
> would force CONFIG_ENCLOSURE_SERVICES to be built-in.

What is that dynamic thing that glues CONFIG_ENCLOSURE_SERVICES to the
PCI core that also does not require statically linking that glue to
every driver that wants to talk to NPEM? I don't mind that the base
hardware access mechanism library is in the PCI core, but what do
NVME and the CXL memory expander driver register to get NPEM service
and associate their block / memory device with an enclosure slot? To
me that glue for these one-off ancillary features is what aux-bus is
all about, but I'm open to it not being aux-bus in the end. Maybe
Stuart has a proposal here?

>
> > > DOE *is* specified by PCIe, at least in terms of the data transfer
> > > protocol (interrupt usage, read/write mailbox, etc).  I think that,
> > > and the fact that it's not specific to CXL, means we need some kind of
> > > PCI core interface to do the transfers.
> >
> > DOE transfer code belongs in drivers/pci/ period, but does it really
> > need to be in drivers/pci/built-in.a for everyone regardless of
> > whether it is being used or not?
>
> I think my opinion would depend on how small the DOE transfer code
> could be made and how much it would complicate things to make it a
> module.  And also how we could enforce whatever mutual exclusion we
> need to make it safe.

At least for the mutual exclusion aspect I'm thinking of typical
request_region() style exclusion where the aux-driver claims the
configuration address register range.

* Re: [PATCH 20/23] cxl/port: Introduce a port driver
  2021-12-04  1:24                       ` Dan Williams
@ 2021-12-07  2:56                         ` Bjorn Helgaas
  2021-12-07  4:48                           ` Dan Williams
  0 siblings, 1 reply; 133+ messages in thread
From: Bjorn Helgaas @ 2021-12-07  2:56 UTC (permalink / raw)
  To: Dan Williams
  Cc: Christoph Hellwig, Ben Widawsky, linux-cxl, Linux PCI,
	Alison Schofield, Ira Weiny, Jonathan Cameron, Vishal Verma,
	Greg Kroah-Hartman, Rafael J. Wysocki, Stuart Hayes

On Fri, Dec 03, 2021 at 05:24:34PM -0800, Dan Williams wrote:
> On Fri, Dec 3, 2021 at 2:03 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Thu, Dec 02, 2021 at 05:38:17PM -0800, Dan Williams wrote:

I'm cutting out a bunch of details, not because they're unimportant,
but because I don't know enough yet for them to make sense to me.

> > I guess software uses the CXL DVSEC to distinguish a CXL device
> > from a PCIe device, at least for 00.0.
> 
> drivers/cxl/pci.c attaches to: "PCI_DEVICE_CLASS((PCI_CLASS_MEMORY_CXL
> << 8 | CXL_MEMORY_PROGIF), ~0)"
> 
> I am not aware of any restriction for that class code to appear at
> function0?

Maybe it's not an actual restriction; I'm reading CXL r2.0, sec 8.1.3,
where it says:

  The PCIe configuration space of Device 0, Function 0 shall include
  the CXL PCI Express Designated Vendor-Specific Extended Capability
  (DVSEC) as shown in Figure 126. The capability, status and control
  fields in Device 0, Function 0 DVSEC control the CXL functionality
  of the entire CXL device.
  ...
  Software may use the presence of this DVSEC to differentiate between
  a CXL device and a PCIe device. As such, a standard PCIe device must
  not expose this DVSEC.

Sections 9.11.5 and 9.12.2 also talk about looking for CXL DVSEC on
dev 0, func 0 to identify CXL devices.

> > Not sure about non-Dev 0 Func 0 devices; maybe this implies other
> > functions in the same device are part of the same CXL device?
> 
> DVSEC is really only telling us the layout of the MMIO register
> space. ...

And DVSEC also apparently tells us that "this is a CXL device, not
just an ordinary PCIe device"?  It's not clear to me how you identify
other PCIe functions that are also part of the same CXL device.

> > > The fact that the PCI core remains blissfully unaware of all these
> > > interdependencies is a feature because CXL is effectively a super-set
> > > of PCIe for fast-path CXL.mem operation, even though it is derivative
> > > for enumeration and slow-path manageability.
> >
> > I don't know how blissfully unaware PCI can be.  Can a user remove a
> > PCIe device that's part of a CXL device via sysfs?
> 
> Yes. If that PCIe device is an endpoint then the corresponding "mem"
> driver will get a ->remove() event because it is registered as a child
> of that parent PCIe device. That in turn will tear down the relevant
> part of the CXL port topology.

The CXL device may include several PCIe devices.  "mem" is a CXL
driver that's bound to one of them (?)  Is that what you mean by "mem"
being a "child of the parent PCIe device"?

> However, a gap (deliberate design choice?) in the PCI hotplug
> implementation I currently see is an ability for an endpoint PCI
> driver to dynamically deny hotplug requests based on the state of the
> device. ...

PCI allows surprise remove, so drivers generally can't deny hot
unplugs.  PCIe *does* provide for an Electromechanical Interlock (see
PCI_EXP_SLTCTL_EIC), but we don't currently support it.

> > Is the PCIe device available for drivers to claim?
> 
> drivers/cxl/pci.c *is* a native PCI driver for CXL memory expanders.
> You might be thinking of CXL accelerator devices, where the CXL.cache
> and CXL.mem capabilities are incremental capabilities for the
> accelerator.  ...

No, I'm not nearly sophisticated enough to be thinking of specific
types of CXL things :)

> > Do we need coordination between cxl_driver_register() and
> > pci_register_driver()?
> 
> They do not collide, and I think this goes back to the concern about
> whether drivers/cxl/ is scanning for all CXL DVSECs. ...

Sorry, I don't remember what this concern was, and I don't know why
they don't collide.  I *would* know that if I knew that the set of
things cxl_driver_register() binds to doesn't intersect the set of
pci_devs, but it's not clear to me what those things are.

The PCI core enumerates devices by doing config reads of the Vendor ID
for every possible bus and device number.  It allocs a pci_dev for
each device it finds, and those are the things pci_register_driver()
binds drivers to based on Vendor ID, Device ID, etc.

How does CXL enumeration work?  Do you envision it being integrated
into PCI enumeration?  Does it build a list/tree/etc of cxl_devs?

cxl_driver_register() associates a driver with something.  What
exactly is the thing the driver is associated with?  A pci_dev?  A
cxl_dev?  Some kind of aggregate CXL device composed of several PCIe
devices?

I expected cxl_driver.probe() to take a "struct cxl_dev *" or similar,
but it takes a "struct device *".  I'm trying to apply my knowledge of
how other buses work in Linux, but obviously it's not working very
well.

> > Does CXL impose new constraints on PCI power management?
> 
> Recall that CXL is built such that it could be supported by a legacy
> operating system where platform firmware owns the setup of devices,
> just like DDR memory does not need a driver. This is where CXL 1.1
> played until CXL 2.0 added so much configuration complexity (hotplug,
> interleave, persistent memory) that it started to need OS help. The
> Linux PCIe PM code will not notice a difference, but behind the scenes
> the device, if it is offering coherent memory to the CPU, will be
> coordinating with the CPU like it was part of the CPU package and not
> a discrete device. I do not see new PM software enabling required in
> my reading of "Section 10.0 Power Management" of the CXL
> specification.

So if Linux PM decides to suspend a PCIe device that's part of a CXL
device and put it in D3hot, this is not a problem for CXL?  I guess if
a CXL driver binds to the PCIe device, it can control what PM will do.
But I thought CXL drivers would bind to a CXL thing, not a PCIe thing.

I see lots of mentions of LTR in sec 10, which really scares me
because I'm pretty confident that Linux handling of LTR is broken, and
I have no idea how to fix it.

> > > So, instead the system may be designed to have an NPEM in the
> > > upstream switch port and disable the NPEM instances in the device.
> > > Platform firmware decides which NPEM is in use.
> >
> > Since you didn't mention a firmware interface to communicate which
> > NPEM is in use, I assume firmware does this by preventing other
> > devices from advertising NPEM support?
> 
> That's also my assumption. If the OS sees a disabled NPEM in the
> topology it just needs to assume firmware knew what it was doing when
> it disabled it. I wish NPEM was better specified than "trust firmware
> to do the right thing via an ambiguous signal".

If we enumerate a device with a capability that is disabled, we
normally don't treat that as a signal that it cannot be enabled.
There are lots of enable bits in PCI capabilities, and I don't know of
any cases where Linux assumes "Enable == 0" means "don't use this
feature."  Absent some negotiation like _OSC or some documented
restriction, e.g., in the PCI Firmware spec, Linux normally just
enables features when it decides to use them.

Bjorn

* Re: [PATCH 20/23] cxl/port: Introduce a port driver
  2021-12-07  2:56                         ` Bjorn Helgaas
@ 2021-12-07  4:48                           ` Dan Williams
  0 siblings, 0 replies; 133+ messages in thread
From: Dan Williams @ 2021-12-07  4:48 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Christoph Hellwig, Ben Widawsky, linux-cxl, Linux PCI,
	Alison Schofield, Ira Weiny, Jonathan Cameron, Vishal Verma,
	Greg Kroah-Hartman, Rafael J. Wysocki, Stuart Hayes

On Mon, Dec 6, 2021 at 6:56 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Fri, Dec 03, 2021 at 05:24:34PM -0800, Dan Williams wrote:
> > On Fri, Dec 3, 2021 at 2:03 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > On Thu, Dec 02, 2021 at 05:38:17PM -0800, Dan Williams wrote:
>
> I'm cutting out a bunch of details, not because they're unimportant,
> but because I don't know enough yet for them to make sense to me.
>
> > > I guess software uses the CXL DVSEC to distinguish a CXL device
> > > from a PCIe device, at least for 00.0.
> >
> > drivers/cxl/pci.c attaches to: "PCI_DEVICE_CLASS((PCI_CLASS_MEMORY_CXL
> > << 8 | CXL_MEMORY_PROGIF), ~0)"
> >
> > I am not aware of any restriction for that class code to appear at
> > function0?
>
> Maybe it's not an actual restriction; I'm reading CXL r2.0, sec 8.1.3,
> where it says:
>
>   The PCIe configuration space of Device 0, Function 0 shall include
>   the CXL PCI Express Designated Vendor-Specific Extended Capability
>   (DVSEC) as shown in Figure 126. The capability, status and control
>   fields in Device 0, Function 0 DVSEC control the CXL functionality
>   of the entire CXL device.
>   ...
>   Software may use the presence of this DVSEC to differentiate between
>   a CXL device and a PCIe device. As such, a standard PCIe device must
>   not expose this DVSEC.
>
> Sections 9.11.5 and 9.12.2 also talk about looking for CXL DVSEC on
> dev 0, func 0 to identify CXL devices.

Ah, I did not internalize that when reading because if the DVSEC shows
up on other functions on the same device it should "just work" as far
as Linux is concerned, but I guess it simplifies implementations to
constrain where the capability appears.

> > > Not sure about non-Dev 0 Func 0 devices; maybe this implies other
> > > functions in the same device are part of the same CXL device?
> >
> > DVSEC is really only telling us the layout of the MMIO register
> > space. ...
>
> And DVSEC also apparently tells us that "this is a CXL device, not
> just an ordinary PCIe device"?  It's not clear to me how you identify
> other PCIe functions that are also part of the same CXL device.

I have not encountered any flows where the driver would care if it was
enabling another instance of the CXL DVSEC on the same device.

>
> > > > The fact that the PCI core remains blissfully unaware of all these
> > > > interdependencies is a feature because CXL is effectively a super-set
> > > > of PCIe for fast-path CXL.mem operation, even though it is derivative
> > > > for enumeration and slow-path manageability.
> > >
> > > I don't know how blissfully unaware PCI can be.  Can a user remove a
> > > PCIe device that's part of a CXL device via sysfs?
> >
> > Yes. If that PCIe device is an endpoint then the corresponding "mem"
> > driver will get a ->remove() event because it is registered as a child
> > of that parent PCIe device. That in turn will tear down the relevant
> > part of the CXL port topology.
>
> The CXL device may include several PCIe devices.  "mem" is a CXL
> driver that's bound to one of them (?)  Is that what you mean by "mem"
> being a "child of the parent PCIe device"?

Right, cxl_pci_probe() does device_add() of a "mem" device on the
cxl_bus_type, and cxl_pci_remove() does device_del() of that same
child device. This is just like xhci_pci_probe() arranging for
device_add() of a child USB device on the usb_bus_type.

>
> > However, a gap (deliberate design choice?) in the PCI hotplug
> > implementation I currently see is an ability for an endpoint PCI
> > driver to dynamically deny hotplug requests based on the state of the
> > device. ...
>
> PCI allows surprise remove, so drivers generally can't deny hot
> unplugs.  PCIe *does* provide for an Electromechanical Interlock (see
> PCI_EXP_SLTCTL_EIC), but we don't currently support it.

Ok, I guess people will just need to be careful then.

> > > Is the PCIe device available for drivers to claim?
> >
> > drivers/cxl/pci.c *is* a native PCI driver for CXL memory expanders.
> > You might be thinking of CXL accelerator devices, where the CXL.cache
> > and CXL.mem capabilities are incremental capabilities for the
> > accelerator.  ...
>
> No, I'm not nearly sophisticated enough to be thinking of specific
> types of CXL things :)

Ah, apologies, I was reading too much into your line of questions.

>
> > > Do we need coordination between cxl_driver_register() and
> > > pci_register_driver()?
> >
> > They do not collide, and I think this goes back to the concern about
> > whether drivers/cxl/ is scanning for all CXL DVSECs. ...
>
> Sorry, I don't remember what this concern was, and I don't know why
> they don't collide.  I *would* know that if I knew that the set of
> things cxl_driver_register() binds to doesn't intersect the set of
> pci_devs, but it's not clear to me what those things are.

cxl_driver_register() registers a driver on the cxl_bus_type; it
cannot bind to devices on the pci_bus_type.
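
A toy userspace model of why the two registration paths cannot
collide. The types below are hypothetical stand-ins, not the driver
core's real structures; the only point is that a driver is offered
devices from its own bus and nothing else:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for bus_type, device and device_driver. */
struct bus { const char *name; };
struct dev { const struct bus *bus; const char *name; };
struct drv { const struct bus *bus; const char *name; };

static const struct bus pci_bus = { "pci" };
static const struct bus cxl_bus = { "cxl" };

/* A driver may only be matched against devices on its own bus. */
static bool can_bind(const struct drv *drv, const struct dev *dev)
{
	return drv->bus == dev->bus;
}

/* Count the devices in 'devs' that 'drv' would even be offered. */
static size_t count_candidates(const struct drv *drv,
			       const struct dev *devs, size_t n)
{
	size_t hits = 0;

	for (size_t i = 0; i < n; i++)
		if (can_bind(drv, &devs[i]))
			hits++;
	return hits;
}
```

With one pci_dev and the "mem" child it registered in the same array,
a driver on cxl_bus sees only the "mem" device, and a driver on
pci_bus sees only the pci_dev.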

> The PCI core enumerates devices by doing config reads of the Vendor ID
> for every possible bus and device number.  It allocs a pci_dev for
> each device it finds, and those are the things pci_register_driver()
> binds drivers to based on Vendor ID, Device ID, etc.
>
> How does CXL enumeration work?  Do you envision it being integrated
> into PCI enumeration?  Does it build a list/tree/etc of cxl_devs?

The cxl_pci driver, a native PCI driver, attaches to endpoints that
emit the CXL Memory Device class code. That driver walks up the PCI
topology and adds a cxl_port device for each CXL DVSEC it finds in
each PCIe Upstream Port up to the host bridge. Additionally, there is
a cxl_acpi driver that enumerates platform CXL root resources and
registers the 'root' cxl_port. Once root-to-endpoint connectivity is
established, the endpoint is considered ready for CXL.mem operation.
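
Roughly, the walk looks like the following userspace sketch. The node
structure, its flags, and the choice to fail the whole walk when a hop
lacks the DVSEC are assumptions for illustration, not the actual
drivers/cxl/ code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one hop in the PCI topology. */
struct node {
	struct node *parent;  /* NULL above the host bridge */
	bool is_host_bridge;
	bool has_cxl_dvsec;   /* CXL DVSEC found in this upstream port */
};

/*
 * Walk from an endpoint towards the host bridge. Returns the number
 * of intermediate ports that would each get a cxl_port, or -1 if a
 * hop lacks the CXL DVSEC or no host bridge is ever reached.
 */
static int count_cxl_ports(const struct node *endpoint)
{
	int ports = 0;

	for (const struct node *n = endpoint->parent; n; n = n->parent) {
		if (n->is_host_bridge)
			return ports;
		if (!n->has_cxl_dvsec)
			return -1;
		ports++;
	}
	return -1;
}
```

An endpoint directly below the host bridge yields 0 intermediate
ports, and one behind a single CXL-capable upstream port yields 1.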

> cxl_driver_register() associates a driver with something.  What
> exactly is the thing the driver is associated with?  A pci_dev?  A
> cxl_dev?  Some kind of aggregate CXL device composed of several PCIe
> devices?

The aggregate device is a cxl_region composed of 1 or more endpoint
devices (registered by cxl_pci_probe()). Like a RAID array it
aggregates multiple devices to produce a new memory range.

> I expected cxl_driver.probe() to take a "struct cxl_dev *" or similar,
> but it takes a "struct device *".  I'm trying to apply my knowledge of
> how other buses work in Linux, but obviously it's not working very
> well.

There are no common attributes among "mem", "port", and "region"
devices, so the drivers just do container_of() on the device and skip
defining a generic 'struct cxl_dev'.
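
As a userspace illustration of that pattern: the struct layouts and
helper names below are invented for the example, and container_of()
is re-implemented with offsetof() rather than taken from the kernel:

```c
#include <assert.h>
#include <stddef.h>

/* Userspace re-implementation of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for the generic struct device. */
struct device { const char *name; };

/* Two unrelated object types that each embed a struct device. */
struct my_memdev { int id; struct device dev; };
struct my_port { struct device dev; int depth; };

static struct my_memdev *to_memdev(struct device *dev)
{
	return container_of(dev, struct my_memdev, dev);
}

static struct my_port *to_port(struct device *dev)
{
	return container_of(dev, struct my_port, dev);
}

/* A probe routine only receives the generic device pointer. */
static int memdev_probe(struct device *dev)
{
	return to_memdev(dev)->id;
}
```

Each driver knows which object type it registered, so recovering the
outer structure from the 'struct device *' is safe without a shared
base type.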

> > > Does CXL impose new constraints on PCI power management?
> >
> > Recall that CXL is built such that it could be supported by a legacy
> > operating system where platform firmware owns the setup of devices,
> > just like DDR memory does not need a driver. This is where CXL 1.1
> > played until CXL 2.0 added so much configuration complexity (hotplug,
> > interleave, persistent memory) that it started to need OS help. The
> > Linux PCIe PM code will not notice a difference, but behind the scenes
> > the device, if it is offering coherent memory to the CPU, will be
> > coordinating with the CPU like it was part of the CPU package and not
> > a discrete device. I do not see new PM software enabling required in
> > my reading of "Section 10.0 Power Management" of the CXL
> > specification.
>
> So if Linux PM decides to suspend a PCIe device that's part of a CXL
> device and put it in D3hot, this is not a problem for CXL?  I guess if
> a CXL driver binds to the PCIe device, it can control what PM will do.
> But I thought CXL drivers would bind to a CXL thing, not a PCIe thing.

Recall that this starts with a native PCI driver for endpoints; that
driver can coordinate PM for its children on the CXL bus if it is
needed.

> I see lots of mentions of LTR in sec 10, which really scares me
> because I'm pretty confident that Linux handling of LTR is broken, and
> I have no idea how to fix it.

Ok, I will keep that in mind.

> > > > So, instead the system may be designed to have an NPEM in the
> > > > upstream switch port and disable the NPEM instances in the device.
> > > > Platform firmware decides which NPEM is in use.
> > >
> > > Since you didn't mention a firmware interface to communicate which
> > > NPEM is in use, I assume firmware does this by preventing other
> > > devices from advertising NPEM support?
> >
> > That's also my assumption. If the OS sees a disabled NPEM in the
> > topology it just needs to assume firmware knew what it was doing when
> > it disabled it. I wish NPEM was better specified than "trust firmware
> > to do the right thing via an ambiguous signal".
>
> If we enumerate a device with a capability that is disabled, we
> normally don't treat that as a signal that it cannot be enabled.
> There are lots of enable bits in PCI capabilities, and I don't know of
> any cases where Linux assumes "Enable == 0" means "don't use this
> feature."  Absent some negotiation like _OSC or some documented
> restriction, e.g., in the PCI Firmware spec, Linux normally just
> enables features when it decides to use them.

If the LEDs are not routed to a given NPEM instance, no amount of
enabling can get that instance operational.

Thread overview: 133+ messages
2021-11-25 21:53 [PATCH 14/23] cxl: Introduce topology host registration kernel test robot
2021-11-29 11:42 ` Dan Carpenter
  -- strict thread matches above, loose matches on Subject: below --
2021-11-20  0:02 [PATCH 00/23] Add drivers for CXL ports and mem devices Ben Widawsky
2021-11-20  0:02 ` [PATCH 01/23] cxl: Rename CXL_MEM to CXL_PCI Ben Widawsky
2021-11-22 14:47   ` Jonathan Cameron
2021-11-24  4:15   ` Dan Williams
2021-11-20  0:02 ` [PATCH 02/23] cxl: Flesh out register names Ben Widawsky
2021-11-22 14:49   ` Jonathan Cameron
2021-11-24  4:24   ` Dan Williams
2021-11-20  0:02 ` [PATCH 03/23] cxl/pci: Extract device status check Ben Widawsky
2021-11-22 15:03   ` Jonathan Cameron
2021-11-24 19:30   ` Dan Williams
2021-11-20  0:02 ` [PATCH 04/23] cxl/pci: Implement Interface Ready Timeout Ben Widawsky
2021-11-22 15:02   ` Jonathan Cameron
2021-11-22 17:17     ` Ben Widawsky
2021-11-22 17:53       ` Jonathan Cameron
2021-11-24 19:56         ` Dan Williams
2021-11-25  6:17           ` Ben Widawsky
2021-11-25  7:14             ` Dan Williams
2021-11-20  0:02 ` [PATCH 05/23] cxl/pci: Don't poll doorbell for mailbox access Ben Widawsky
2021-11-22 15:11   ` Jonathan Cameron
2021-11-22 17:24     ` Ben Widawsky
2021-11-24 21:55   ` Dan Williams
2021-11-29 18:33     ` Ben Widawsky
2021-11-29 19:02       ` Dan Williams
2021-11-29 19:11         ` Ben Widawsky
2021-11-29 19:18           ` Dan Williams
2021-11-29 19:31             ` Ben Widawsky
2021-11-29 19:37               ` Dan Williams
2021-11-29 19:50                 ` Ben Widawsky
2021-11-20  0:02 ` [PATCH 06/23] cxl/pci: Don't check media status for mbox access Ben Widawsky
2021-11-22 15:19   ` Jonathan Cameron
2021-11-24 21:58   ` Dan Williams
2021-11-20  0:02 ` [PATCH 07/23] cxl/pci: Add new DVSEC definitions Ben Widawsky
2021-11-22 15:22   ` Jonathan Cameron
2021-11-22 17:32     ` Ben Widawsky
2021-11-24 22:03       ` Dan Williams
2021-11-20  0:02 ` [PATCH 08/23] cxl/acpi: Map component registers for Root Ports Ben Widawsky
2021-11-22 15:51   ` Jonathan Cameron
2021-11-22 19:28     ` Ben Widawsky
2021-11-24 22:18   ` Dan Williams
2021-11-20  0:02 ` [PATCH 09/23] cxl: Introduce module_cxl_driver Ben Widawsky
2021-11-22 15:54   ` Jonathan Cameron
2021-11-24 22:22   ` Dan Williams
2021-11-20  0:02 ` [PATCH 10/23] cxl/core: Convert decoder range to resource Ben Widawsky
2021-11-22 16:08   ` Jonathan Cameron
2021-11-24 22:41   ` Dan Williams
2021-11-20  0:02 ` [PATCH 11/23] cxl/core: Document and tighten up decoder APIs Ben Widawsky
2021-11-22 16:13   ` Jonathan Cameron
2021-11-24 22:55   ` Dan Williams
2021-11-20  0:02 ` [PATCH 12/23] cxl: Introduce endpoint decoders Ben Widawsky
2021-11-22 16:20   ` Jonathan Cameron
2021-11-22 19:37     ` Ben Widawsky
2021-11-25  0:07       ` Dan Williams
2021-11-29 20:05         ` Ben Widawsky
2021-11-29 20:07           ` Dan Williams
2021-11-29 20:12             ` Ben Widawsky
2021-11-20  0:02 ` [PATCH 13/23] cxl/core: Move target population locking to caller Ben Widawsky
2021-11-22 16:33   ` Jonathan Cameron
2021-11-22 21:58     ` Ben Widawsky
2021-11-23 11:05       ` Jonathan Cameron
2021-11-25  0:34   ` Dan Williams
2021-11-20  0:02 ` [PATCH 14/23] cxl: Introduce topology host registration Ben Widawsky
2021-11-22 18:20   ` Jonathan Cameron
2021-11-22 22:30     ` Ben Widawsky
2021-11-25  1:09   ` Dan Williams
2021-11-29 21:23     ` Ben Widawsky
2021-11-20  0:02 ` [PATCH 15/23] cxl/core: Store global list of root ports Ben Widawsky
2021-11-22 18:22   ` Jonathan Cameron
2021-11-22 22:32     ` Ben Widawsky
2021-11-20  0:02 ` [PATCH 16/23] cxl/pci: Cache device DVSEC offset Ben Widawsky
2021-11-22 16:46   ` Jonathan Cameron
2021-11-22 22:34     ` Ben Widawsky
2021-11-20  0:02 ` [PATCH 17/23] cxl: Cache and pass DVSEC ranges Ben Widawsky
2021-11-20  4:29   ` kernel test robot
2021-11-20  4:29     ` kernel test robot
2021-11-22 17:00   ` Jonathan Cameron
2021-11-22 22:50     ` Ben Widawsky
2021-11-26 11:37   ` Jonathan Cameron
2021-11-20  0:02 ` [PATCH 18/23] cxl/pci: Implement wait for media active Ben Widawsky
2021-11-22 17:03   ` Jonathan Cameron
2021-11-22 22:57     ` Ben Widawsky
2021-11-23 11:09       ` Jonathan Cameron
2021-11-23 16:04         ` Ben Widawsky
2021-11-23 17:48           ` Bjorn Helgaas
2021-11-23 19:37             ` Ben Widawsky
2021-11-26 11:36     ` Jonathan Cameron
2021-11-20  0:02 ` [PATCH 19/23] cxl/pci: Store component register base in cxlds Ben Widawsky
2021-11-20  7:28   ` kernel test robot
2021-11-20  7:28     ` kernel test robot
2021-11-22 17:11   ` Jonathan Cameron
2021-11-22 23:01     ` Ben Widawsky
2021-11-20  0:02 ` [PATCH 20/23] cxl/port: Introduce a port driver Ben Widawsky
2021-11-20  3:14   ` kernel test robot
2021-11-20  3:14     ` kernel test robot
2021-11-20  5:38   ` kernel test robot
2021-11-20  5:38     ` kernel test robot
2021-11-22 17:41   ` Jonathan Cameron
2021-11-22 23:38     ` Ben Widawsky
2021-11-23 11:38       ` Jonathan Cameron
2021-11-23 16:14         ` Ben Widawsky
2021-11-23 18:21   ` Bjorn Helgaas
2021-11-23 22:03     ` Ben Widawsky
2021-11-23 22:36       ` Dan Williams
2021-11-23 23:38         ` Ben Widawsky
2021-11-23 23:55         ` Bjorn Helgaas
2021-11-24  0:40           ` Dan Williams
2021-11-24  6:33             ` Christoph Hellwig
2021-11-24  7:17               ` Dan Williams
2021-11-24  7:28                 ` Christoph Hellwig
2021-11-24  7:33                   ` Greg Kroah-Hartman
2021-11-24  7:54                     ` Dan Williams
2021-11-24  8:21                       ` Greg Kroah-Hartman
2021-11-24 18:24                         ` Dan Williams
2021-12-02 21:24                 ` Bjorn Helgaas
2021-12-03  1:38                   ` Dan Williams
2021-12-03 22:03                     ` Bjorn Helgaas
2021-12-04  1:24                       ` Dan Williams
2021-12-07  2:56                         ` Bjorn Helgaas
2021-12-07  4:48                           ` Dan Williams
2021-11-24 21:31       ` Bjorn Helgaas
2021-11-20  0:02 ` [PATCH 21/23] cxl: Unify port enumeration for decoders Ben Widawsky
2021-11-22 17:48   ` Jonathan Cameron
2021-11-22 23:44     ` Ben Widawsky
2021-11-20  0:02 ` [PATCH 22/23] cxl/mem: Introduce cxl_mem driver Ben Widawsky
2021-11-20  0:40   ` Randy Dunlap
2021-11-21  3:55     ` Ben Widawsky
2021-11-22 18:17   ` Jonathan Cameron
2021-11-23  0:05     ` Ben Widawsky
2021-11-20  0:02 ` [PATCH 23/23] cxl/mem: Disable switch hierarchies for now Ben Widawsky
2021-11-22 18:19   ` Jonathan Cameron
2021-11-22 19:17     ` Ben Widawsky
