linux-cxl.vger.kernel.org archive mirror
* [PATCH v4 00/23] cxl: Add support for QTG ID retrieval for CXL subsystem
@ 2023-04-19 20:21 Dave Jiang
  2023-04-19 20:21 ` [PATCH v4 01/23] cxl: Export QTG ids from CFMWS to sysfs Dave Jiang
                   ` (22 more replies)
  0 siblings, 23 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-19 20:21 UTC (permalink / raw)
  To: linux-cxl, linux-acpi
  Cc: Ira Weiny, Dan Williams, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, lukas,
	Jonathan.Cameron

v4:
- Reworked PCIe link path latency calculation
- 0-day fixes
- Removed unused qos_list from cxl_memdev and its stray usages

v3:
- Please see specific patches for log entries addressing comments from v2.
- Refactor cxl_port_probe() additions. (Alison)
- Convert to use 'struct node_hmem_attrs'
- Refactor to use common code for genport target allocation.
- Add third array entry for target hmem_attrs to store genport locality data.
- Go back to per partition QTG ID. (Dan)

v2:
- Please see specific patches for log entries addressing comments from v1.
- Removed ACPICA code usages.
- Removed PCI subsystem helpers for latency and bandwidth.
- Add CXL switch CDAT parsing support (SSLBIS)
- Add generic port SRAT+HMAT support (ACPI)
- Export a single QTG ID via sysfs per memory device (Dan)
- Provide rest of DSMAS range info in debugfs (Dan)


Hi Rafael,
please review the relevant patches to ACPI: 13/23-16/23. Thank you!
If they are ok, Dan can take them through the CXL tree for upstream merging.
13 - Adds enum for memory_target hmem_attrs in order to enumerate the array index.
14 - Add generic port target allocation for SRAT parsing during HMAT init in order
to extract and store the device handle.
15 - Add a new index for hmem_attrs and save the locality data to the new hmem_attrs
array element with generic port data. The old array elements are preserved for later
when we want to store the calculated CXL memory target locality data.
16 - Add ACPI helper function to retrieve the locality data for generic port. Used by
CXL driver to calculate the full locality data for the CXL memory device.

This series adds the retrieval of QoS Throttling Group (QTG) IDs for the CXL Fixed
Memory Window Structure (CFMWS) and the CXL memory device. The QTG IDs are exported
to user space as guidance for placing the proper DPA range of a hot-plugged CXL
memory device under the appropriate CFMWS window.

The CFMWS structure contains a QTG ID that is associated with the memory window that the
structure exports. On Linux, the CFMWS is represented as a CXL root decoder. The QTG
ID will be attached to the CXL root decoder and exported as a sysfs attribute (qtg_id).

The QTG ID for a device is retrieved by evaluating a _DSM method on the ACPI0017 device.
The _DSM expects an input package of 4 DWORDs that contains the read latency, write
latency, read bandwidth, and write bandwidth. These are the calculated numbers for the
path between the CXL device and the CXL host bridge (HB). The QTG ID is also exported
as a sysfs attribute under the mem device memory partition type:
/sys/bus/cxl/devices/memX/ram/qtg_id
/sys/bus/cxl/devices/memX/pmem/qtg_id
Only the first QTG ID is exported. The rest of the information can be found under
/sys/kernel/debug/cxl/memX/qtgmap, where all the DPA ranges with their correlated QTG IDs
are displayed. Each DSMAS from the device CDAT will provide a DPA range.

The latency numbers are the aggregated latencies for the path between the CXL device and
the CPU. If a CXL device is directly attached to the CXL HB, the latency
would be the sum of the latency from the device Coherent Device Attribute Table (CDAT),
the calculated PCIe link latency between the device and the HB, and the generic port
latency from ACPI SRAT+HMAT. The bandwidth in this configuration would be the minimum of
the CDAT bandwidth number, the link bandwidth between the device and the HB, and the
bandwidth from the generic port data via ACPI SRAT+HMAT.

If a configuration has a switch in between, then the latency would be the aggregated
latencies from the device CDAT, the link latency between device and switch, the
latency from the switch CDAT, the link latency between switch and the HB, and the
generic port latency between the CPU and the CXL HB. The bandwidth calculation would be the
min of the device CDAT bandwidth, the link bandwidth between device and switch, the switch
CDAT bandwidth, the link bandwidth between switch and HB, and the generic port bandwidth.

There can be 0 or more switches between the CXL device and the CXL HB. There are detailed
examples on calculating bandwidth and latency in the CXL Memory Device Software Guide [4].
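
The same sketch generalizes to 0 or more switches (again purely illustrative, reusing
the made-up struct from the sketch above): each hop adds its link or switch CDAT
latency, and the bandwidth stays the minimum across the whole path.

/*
 * hops[] holds one entry per segment between the device and the HB
 * (a link's latency/bandwidth or a switch's CDAT values). With a
 * single device-to-HB link entry this reduces to the direct-attach
 * case above.
 */
static struct cxl_perf cxl_path_perf(struct cxl_perf cdat,
				     const struct cxl_perf *hops, int nr_hops,
				     struct cxl_perf genport)
{
	struct cxl_perf total = cdat;

	for (int i = 0; i < nr_hops; i++) {
		total.latency_ps += hops[i].latency_ps;
		if (hops[i].bandwidth < total.bandwidth)
			total.bandwidth = hops[i].bandwidth;
	}

	total.latency_ps += genport.latency_ps;
	if (genport.bandwidth < total.bandwidth)
		total.bandwidth = genport.bandwidth;

	return total;
}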

The CDAT provides Device Scoped Memory Affinity Structures (DSMAS) that contain the
Device Physical Address (DPA) range and the related Device Scoped Latency and Bandwidth
Information Structures (DSLBIS). Each DSLBIS provides a latency or bandwidth entry that is
tied to a DSMAS entry via a per-DSMAS unique DSMAD handle.

This series is based on Lukas's latest DOE changes [5]. Kernel branch with all the code can
be retrieved here [6] for convenience.

Test setup is done with the run_qemu genport support branch [7]. The setup provides 2 CXL HBs
with one HB having a CXL switch underneath. It also provides generic port support detailed
below.

A hacked up qemu branch is used to support generic port SRAT and HMAT [8].

To create the appropriate HMAT entries for generic port, the following qemu parameters must
be added:

-object genport,id=$X -numa node,genport=genport$X,nodeid=$Y,initiator=$Z
-numa hmat-lb,initiator=$Z,target=$X,hierarchy=memory,data-type=access-latency,latency=$latency
-numa hmat-lb,initiator=$Z,target=$X,hierarchy=memory,data-type=access-bandwidth,bandwidth=$bandwidthM
for ((i = 0; i < total_nodes; i++)); do
	for ((j = 0; j < cxl_hbs; j++ )); do	# 2 CXL HBs
		-numa dist,src=$i,dst=$X,val=$dist
	done
done

See the genport run_qemu branch for full details.

[1]: https://www.computeexpresslink.org/download-the-specification
[2]: https://uefi.org/sites/default/files/resources/Coherent%20Device%20Attribute%20Table_1.01.pdf
[3]: https://uefi.org/sites/default/files/resources/ACPI_Spec_6_5_Aug29.pdf
[4]: https://cdrdv2-public.intel.com/643805/643805_CXL%20Memory%20Device%20SW%20Guide_Rev1p0.pdf
[5]: https://lore.kernel.org/linux-cxl/20230313195530.GA1532686@bhelgaas/T/#t
[6]: https://git.kernel.org/pub/scm/linux/kernel/git/djiang/linux.git/log/?h=cxl-qtg
[7]: https://github.com/pmem/run_qemu/tree/djiang/genport
[8]: https://github.com/davejiang/qemu/tree/genport

---

Dave Jiang (23):
      cxl: Export QTG ids from CFMWS to sysfs
      cxl: Add checksum verification to CDAT from CXL
      cxl: Add support for reading CXL switch CDAT table
      cxl: Add common helpers for cdat parsing
      cxl: Add callback to parse the DSMAS subtables from CDAT
      cxl: Add callback to parse the DSLBIS subtable from CDAT
      cxl: Add callback to parse the SSLBIS subtable from CDAT
      cxl: Add support for _DSM Function for retrieving QTG ID
      cxl: Add helper function to retrieve ACPI handle of CXL root device
      cxl: Add helpers to calculate pci latency for the CXL device
      cxl: Add helper function that calculates QoS values for switches
      cxl: Add helper function that calculate QoS values for PCI path
      ACPI: NUMA: Create enum for memory_target hmem_attrs indexing
      ACPI: NUMA: Add genport target allocation to the HMAT parsing
      ACPI: NUMA: Add setting of generic port locality attributes
      ACPI: NUMA: Add helper function to retrieve the performance attributes
      cxl: Add helper function to retrieve generic port QoS
      cxl: Add latency and bandwidth calculations for the CXL path
      cxl: Wait Memory_Info_Valid before access memory related info
      cxl: Move identify and partition query from pci probe to port probe
      cxl: Store QTG IDs and related info to the CXL memory device context
      cxl: Export sysfs attributes for memory device QTG ID
      cxl/mem: Add debugfs output for QTG related data


 Documentation/ABI/testing/debugfs-cxl   |  11 +
 Documentation/ABI/testing/sysfs-bus-cxl |  31 +++
 drivers/acpi/numa/hmat.c                | 138 ++++++++++--
 drivers/cxl/acpi.c                      |   3 +
 drivers/cxl/core/Makefile               |   2 +
 drivers/cxl/core/acpi.c                 | 180 ++++++++++++++++
 drivers/cxl/core/cdat.c                 | 270 ++++++++++++++++++++++++
 drivers/cxl/core/mbox.c                 |   3 +
 drivers/cxl/core/memdev.c               |  26 +++
 drivers/cxl/core/pci.c                  | 187 ++++++++++++++--
 drivers/cxl/core/port.c                 | 183 ++++++++++++++++
 drivers/cxl/cxl.h                       |  27 +++
 drivers/cxl/cxlmem.h                    |  21 ++
 drivers/cxl/cxlpci.h                    | 117 ++++++++++
 drivers/cxl/mem.c                       |  17 ++
 drivers/cxl/pci.c                       |  21 --
 drivers/cxl/port.c                      | 155 +++++++++++++-
 include/acpi/actbl3.h                   |   2 +
 include/linux/acpi.h                    |   6 +
 19 files changed, 1348 insertions(+), 52 deletions(-)
 create mode 100644 Documentation/ABI/testing/debugfs-cxl
 create mode 100644 drivers/cxl/core/acpi.c
 create mode 100644 drivers/cxl/core/cdat.c

--



* [PATCH v4 01/23] cxl: Export QTG ids from CFMWS to sysfs
  2023-04-19 20:21 [PATCH v4 00/23] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
@ 2023-04-19 20:21 ` Dave Jiang
  2023-04-20  8:51   ` Jonathan Cameron
  2023-04-24 21:46   ` Dan Williams
  2023-04-19 20:21 ` [PATCH v4 02/23] cxl: Add checksum verification to CDAT from CXL Dave Jiang
                   ` (21 subsequent siblings)
  22 siblings, 2 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-19 20:21 UTC (permalink / raw)
  To: linux-cxl, linux-acpi
  Cc: Ira Weiny, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, rafael, lukas, Jonathan.Cameron

Export the QoS Throttling Group ID from the CXL Fixed Memory Window
Structure (CFMWS) under the root decoder sysfs attributes.
CXL rev3.0 9.17.1.3 CXL Fixed Memory Window Structure (CFMWS)

The cxl CLI will use this QTG ID to match against the _DSM-retrieved QTG ID for a
hot-plugged CXL memory device's DPA range, to make sure that the DPA range
is placed under the right CFMWS window.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>

---
v4:
- Change kernel version for documentation to v6.5
v2:
- Add explanation commit header (Jonathan)
---
 Documentation/ABI/testing/sysfs-bus-cxl |    9 +++++++++
 drivers/cxl/acpi.c                      |    3 +++
 drivers/cxl/core/port.c                 |   14 ++++++++++++++
 drivers/cxl/cxl.h                       |    3 +++
 4 files changed, 29 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
index 3acf2f17a73f..bd2b59784979 100644
--- a/Documentation/ABI/testing/sysfs-bus-cxl
+++ b/Documentation/ABI/testing/sysfs-bus-cxl
@@ -309,6 +309,15 @@ Description:
 		(WO) Write a string in the form 'regionZ' to delete that region,
 		provided it is currently idle / not bound to a driver.
 
+What:		/sys/bus/cxl/devices/decoderX.Y/qtg_id
+Date:		Jan, 2023
+KernelVersion:	v6.5
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		(RO) Shows the QoS Throttling Group ID. The QTG ID for a root
+		decoder comes from the CFMWS structure of the CEDT. A value of
+		-1 indicates that no QTG ID was retrieved. The QTG ID is used as
+		guidance to match against the QTG ID of a hot-plugged device.
 
 What:		/sys/bus/cxl/devices/regionZ/uuid
 Date:		May, 2022
diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index 7e1765b09e04..abc24137c291 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -289,6 +289,9 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
 			}
 		}
 	}
+
+	cxld->qtg_id = cfmws->qtg_id;
+
 	rc = cxl_decoder_add(cxld, target_map);
 err_xormap:
 	if (rc)
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 4d1f9c5b5029..024d4178f557 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -276,6 +276,16 @@ static ssize_t interleave_ways_show(struct device *dev,
 
 static DEVICE_ATTR_RO(interleave_ways);
 
+static ssize_t qtg_id_show(struct device *dev,
+			   struct device_attribute *attr, char *buf)
+{
+	struct cxl_decoder *cxld = to_cxl_decoder(dev);
+
+	return sysfs_emit(buf, "%d\n", cxld->qtg_id);
+}
+
+static DEVICE_ATTR_RO(qtg_id);
+
 static struct attribute *cxl_decoder_base_attrs[] = {
 	&dev_attr_start.attr,
 	&dev_attr_size.attr,
@@ -295,6 +305,7 @@ static struct attribute *cxl_decoder_root_attrs[] = {
 	&dev_attr_cap_type2.attr,
 	&dev_attr_cap_type3.attr,
 	&dev_attr_target_list.attr,
+	&dev_attr_qtg_id.attr,
 	SET_CXL_REGION_ATTR(create_pmem_region)
 	SET_CXL_REGION_ATTR(create_ram_region)
 	SET_CXL_REGION_ATTR(delete_region)
@@ -1625,6 +1636,7 @@ struct cxl_root_decoder *cxl_root_decoder_alloc(struct cxl_port *port,
 	}
 
 	atomic_set(&cxlrd->region_id, rc);
+	cxld->qtg_id = CXL_QTG_ID_INVALID;
 	return cxlrd;
 }
 EXPORT_SYMBOL_NS_GPL(cxl_root_decoder_alloc, CXL);
@@ -1662,6 +1674,7 @@ struct cxl_switch_decoder *cxl_switch_decoder_alloc(struct cxl_port *port,
 
 	cxld = &cxlsd->cxld;
 	cxld->dev.type = &cxl_decoder_switch_type;
+	cxld->qtg_id = CXL_QTG_ID_INVALID;
 	return cxlsd;
 }
 EXPORT_SYMBOL_NS_GPL(cxl_switch_decoder_alloc, CXL);
@@ -1694,6 +1707,7 @@ struct cxl_endpoint_decoder *cxl_endpoint_decoder_alloc(struct cxl_port *port)
 	}
 
 	cxld->dev.type = &cxl_decoder_endpoint_type;
+	cxld->qtg_id = CXL_QTG_ID_INVALID;
 	return cxled;
 }
 EXPORT_SYMBOL_NS_GPL(cxl_endpoint_decoder_alloc, CXL);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 044a92d9813e..278ab6952332 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -300,6 +300,7 @@ enum cxl_decoder_type {
  */
 #define CXL_DECODER_MAX_INTERLEAVE 16
 
+#define CXL_QTG_ID_INVALID	-1
 
 /**
  * struct cxl_decoder - Common CXL HDM Decoder Attributes
@@ -311,6 +312,7 @@ enum cxl_decoder_type {
  * @target_type: accelerator vs expander (type2 vs type3) selector
  * @region: currently assigned region for this decoder
  * @flags: memory type capabilities and locking
+ * @qtg_id: QoS Throttling Group ID
  * @commit: device/decoder-type specific callback to commit settings to hw
  * @reset: device/decoder-type specific callback to reset hw settings
 */
@@ -323,6 +325,7 @@ struct cxl_decoder {
 	enum cxl_decoder_type target_type;
 	struct cxl_region *region;
 	unsigned long flags;
+	int qtg_id;
 	int (*commit)(struct cxl_decoder *cxld);
 	int (*reset)(struct cxl_decoder *cxld);
 };




* [PATCH v4 02/23] cxl: Add checksum verification to CDAT from CXL
  2023-04-19 20:21 [PATCH v4 00/23] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
  2023-04-19 20:21 ` [PATCH v4 01/23] cxl: Export QTG ids from CFMWS to sysfs Dave Jiang
@ 2023-04-19 20:21 ` Dave Jiang
  2023-04-20  8:55   ` Jonathan Cameron
  2023-04-24 22:01   ` Dan Williams
  2023-04-19 20:21 ` [PATCH v4 03/23] cxl: Add support for reading CXL switch CDAT table Dave Jiang
                   ` (20 subsequent siblings)
  22 siblings, 2 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-19 20:21 UTC (permalink / raw)
  To: linux-cxl, linux-acpi
  Cc: Ira Weiny, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, rafael, lukas, Jonathan.Cameron

A CDAT table is available from a CXL device. The table is read by the
driver and cached in software. Since the CXL subsystem needs to parse the
CDAT table, the checksum should be verified. Add checksum verification
after the CDAT table is read from the device.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>

---
v3:
- Just return the final sum. (Alison)
v2:
- Drop ACPI checksum export and just use local verification. (Dan)
---
 drivers/cxl/core/pci.c |   16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index 25b7e8125d5d..9c7e2f69d9ca 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -528,6 +528,16 @@ static int cxl_cdat_read_table(struct device *dev,
 	return 0;
 }
 
+static unsigned char cdat_checksum(void *buf, size_t size)
+{
+	unsigned char sum, *data = buf;
+	size_t i;
+
+	for (sum = 0, i = 0; i < size; i++)
+		sum += data[i];
+	return sum;
+}
+
 /**
  * read_cdat_data - Read the CDAT data on this port
  * @port: Port to read data from
@@ -573,6 +583,12 @@ void read_cdat_data(struct cxl_port *port)
 	}
 
 	port->cdat.table = cdat_table + sizeof(__le32);
+	if (cdat_checksum(port->cdat.table, cdat_length)) {
+		/* Don't leave table data allocated on error */
+		devm_kfree(dev, cdat_table);
+		dev_err(dev, "CDAT data checksum error\n");
+	}
+
 	port->cdat.length = cdat_length;
 }
 EXPORT_SYMBOL_NS_GPL(read_cdat_data, CXL);




* [PATCH v4 03/23] cxl: Add support for reading CXL switch CDAT table
  2023-04-19 20:21 [PATCH v4 00/23] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
  2023-04-19 20:21 ` [PATCH v4 01/23] cxl: Export QTG ids from CFMWS to sysfs Dave Jiang
  2023-04-19 20:21 ` [PATCH v4 02/23] cxl: Add checksum verification to CDAT from CXL Dave Jiang
@ 2023-04-19 20:21 ` Dave Jiang
  2023-04-20  9:25   ` Jonathan Cameron
  2023-04-24 22:08   ` Dan Williams
  2023-04-19 20:21 ` [PATCH v4 04/23] cxl: Add common helpers for cdat parsing Dave Jiang
                   ` (19 subsequent siblings)
  22 siblings, 2 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-19 20:21 UTC (permalink / raw)
  To: linux-cxl, linux-acpi
  Cc: Ira Weiny, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, rafael, lukas, Jonathan.Cameron

Move read_cdat_data() from endpoint probe to general port probe to
allow reading of CDAT data for CXL switches as well as CXL devices.
Add wrapper support for cxl_test to bypass the cdat reading.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>

---
v4:
- Remove cxl_test wrapper. (Ira)
---
 drivers/cxl/core/pci.c |   20 +++++++++++++++-----
 drivers/cxl/port.c     |    6 +++---
 2 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index 9c7e2f69d9ca..1c415b26e866 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -546,16 +546,26 @@ static unsigned char cdat_checksum(void *buf, size_t size)
  */
 void read_cdat_data(struct cxl_port *port)
 {
-	struct pci_doe_mb *cdat_doe;
-	struct device *dev = &port->dev;
 	struct device *uport = port->uport;
-	struct cxl_memdev *cxlmd = to_cxl_memdev(uport);
-	struct cxl_dev_state *cxlds = cxlmd->cxlds;
-	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
+	struct device *dev = &port->dev;
+	struct cxl_dev_state *cxlds;
+	struct pci_doe_mb *cdat_doe;
+	struct cxl_memdev *cxlmd;
+	struct pci_dev *pdev;
 	size_t cdat_length;
 	void *cdat_table;
 	int rc;
 
+	if (is_cxl_memdev(uport)) {
+		cxlmd = to_cxl_memdev(uport);
+		cxlds = cxlmd->cxlds;
+		pdev = to_pci_dev(cxlds->dev);
+	} else if (dev_is_pci(uport)) {
+		pdev = to_pci_dev(uport);
+	} else {
+		return;
+	}
+
 	cdat_doe = pci_find_doe_mailbox(pdev, PCI_DVSEC_VENDOR_ID_CXL,
 					CXL_DOE_PROTOCOL_TABLE_ACCESS);
 	if (!cdat_doe) {
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index 22a7ab2bae7c..615e0ef6b440 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -93,9 +93,6 @@ static int cxl_endpoint_port_probe(struct cxl_port *port)
 	if (IS_ERR(cxlhdm))
 		return PTR_ERR(cxlhdm);
 
-	/* Cache the data early to ensure is_visible() works */
-	read_cdat_data(port);
-
 	get_device(&cxlmd->dev);
 	rc = devm_add_action_or_reset(&port->dev, schedule_detach, cxlmd);
 	if (rc)
@@ -135,6 +132,9 @@ static int cxl_port_probe(struct device *dev)
 {
 	struct cxl_port *port = to_cxl_port(dev);
 
+	/* Cache the data early to ensure is_visible() works */
+	read_cdat_data(port);
+
 	if (is_cxl_endpoint(port))
 		return cxl_endpoint_port_probe(port);
 	return cxl_switch_port_probe(port);




* [PATCH v4 04/23] cxl: Add common helpers for cdat parsing
  2023-04-19 20:21 [PATCH v4 00/23] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (2 preceding siblings ...)
  2023-04-19 20:21 ` [PATCH v4 03/23] cxl: Add support for reading CXL switch CDAT table Dave Jiang
@ 2023-04-19 20:21 ` Dave Jiang
  2023-04-20  9:41   ` Jonathan Cameron
  2023-04-24 22:33   ` Dan Williams
  2023-04-19 20:21 ` [PATCH v4 05/23] cxl: Add callback to parse the DSMAS subtables from CDAT Dave Jiang
                   ` (18 subsequent siblings)
  22 siblings, 2 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-19 20:21 UTC (permalink / raw)
  To: linux-cxl, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

Add helper functions to parse the CDAT table and provide a callback to
parse each sub-table. Helpers are provided for DSMAS, DSLBIS, and SSLBIS
sub-table parsing. The code is patterned after the ACPI table parsing helpers.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>

---
v2:
- Use local headers to handle LE instead of ACPI header
- Reduce complexity of parser function. (Jonathan)
- Directly access header type. (Jonathan)
- Simplify header ptr math. (Jonathan)
- Move parsed counter to the correct location. (Jonathan)
- Add LE to host conversion for entry length
---
 drivers/cxl/core/Makefile |    1 
 drivers/cxl/core/cdat.c   |  100 +++++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxlpci.h      |   29 +++++++++++++
 3 files changed, 130 insertions(+)
 create mode 100644 drivers/cxl/core/cdat.c

diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
index ca4ae31d8f57..867a8014b462 100644
--- a/drivers/cxl/core/Makefile
+++ b/drivers/cxl/core/Makefile
@@ -12,5 +12,6 @@ cxl_core-y += memdev.o
 cxl_core-y += mbox.o
 cxl_core-y += pci.o
 cxl_core-y += hdm.o
+cxl_core-y += cdat.o
 cxl_core-$(CONFIG_TRACING) += trace.o
 cxl_core-$(CONFIG_CXL_REGION) += region.o
diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
new file mode 100644
index 000000000000..210f4499bddb
--- /dev/null
+++ b/drivers/cxl/core/cdat.c
@@ -0,0 +1,100 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2023 Intel Corporation. All rights reserved. */
+#include "cxlpci.h"
+#include "cxl.h"
+
+static bool has_handler(struct cdat_subtable_proc *proc)
+{
+	return proc->handler;
+}
+
+static int call_handler(struct cdat_subtable_proc *proc,
+			struct cdat_subtable_entry *ent)
+{
+	if (has_handler(proc))
+		return proc->handler(ent->hdr, proc->arg);
+	return -EINVAL;
+}
+
+static bool cdat_is_subtable_match(struct cdat_subtable_entry *ent)
+{
+	return ent->hdr->type == ent->type;
+}
+
+static int cdat_table_parse_entries(enum cdat_type type,
+				    struct cdat_header *table_header,
+				    struct cdat_subtable_proc *proc)
+{
+	unsigned long table_end, entry_len;
+	struct cdat_subtable_entry entry;
+	int count = 0;
+	int rc;
+
+	if (!has_handler(proc))
+		return -EINVAL;
+
+	table_end = (unsigned long)table_header + table_header->length;
+
+	if (type >= CDAT_TYPE_RESERVED)
+		return -EINVAL;
+
+	entry.type = type;
+	entry.hdr = (struct cdat_entry_header *)(table_header + 1);
+
+	while ((unsigned long)entry.hdr < table_end) {
+		entry_len = le16_to_cpu(entry.hdr->length);
+
+		if ((unsigned long)entry.hdr + entry_len > table_end)
+			return -EINVAL;
+
+		if (entry_len == 0)
+			return -EINVAL;
+
+		if (cdat_is_subtable_match(&entry)) {
+			rc = call_handler(proc, &entry);
+			if (rc)
+				return rc;
+			count++;
+		}
+
+		entry.hdr = (struct cdat_entry_header *)((unsigned long)entry.hdr + entry_len);
+	}
+
+	return count;
+}
+
+int cdat_table_parse_dsmas(struct cdat_header *table,
+			   cdat_tbl_entry_handler handler, void *arg)
+{
+	struct cdat_subtable_proc proc = {
+		.handler	= handler,
+		.arg		= arg,
+	};
+
+	return cdat_table_parse_entries(CDAT_TYPE_DSMAS, table, &proc);
+}
+EXPORT_SYMBOL_NS_GPL(cdat_table_parse_dsmas, CXL);
+
+int cdat_table_parse_dslbis(struct cdat_header *table,
+			    cdat_tbl_entry_handler handler, void *arg)
+{
+	struct cdat_subtable_proc proc = {
+		.handler	= handler,
+		.arg		= arg,
+	};
+
+	return cdat_table_parse_entries(CDAT_TYPE_DSLBIS, table, &proc);
+}
+EXPORT_SYMBOL_NS_GPL(cdat_table_parse_dslbis, CXL);
+
+int cdat_table_parse_sslbis(struct cdat_header *table,
+			    cdat_tbl_entry_handler handler, void *arg)
+{
+	struct cdat_subtable_proc proc = {
+		.handler	= handler,
+		.arg		= arg,
+	};
+
+	return cdat_table_parse_entries(CDAT_TYPE_SSLBIS, table, &proc);
+}
+EXPORT_SYMBOL_NS_GPL(cdat_table_parse_sslbis, CXL);
diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
index 0465ef963cd6..45e2f2bf5ef8 100644
--- a/drivers/cxl/cxlpci.h
+++ b/drivers/cxl/cxlpci.h
@@ -76,12 +76,34 @@ struct cdat_header {
 	__le32 sequence;
 } __packed;
 
+enum cdat_type {
+	CDAT_TYPE_DSMAS = 0,
+	CDAT_TYPE_DSLBIS,
+	CDAT_TYPE_DSMSCIS,
+	CDAT_TYPE_DSIS,
+	CDAT_TYPE_DSEMTS,
+	CDAT_TYPE_SSLBIS,
+	CDAT_TYPE_RESERVED
+};
+
 struct cdat_entry_header {
 	u8 type;
 	u8 reserved;
 	__le16 length;
 } __packed;
 
+typedef int (*cdat_tbl_entry_handler)(struct cdat_entry_header *header, void *arg);
+
+struct cdat_subtable_proc {
+	cdat_tbl_entry_handler handler;
+	void *arg;
+};
+
+struct cdat_subtable_entry {
+	struct cdat_entry_header *hdr;
+	enum cdat_type type;
+};
+
 int devm_cxl_port_enumerate_dports(struct cxl_port *port);
 struct cxl_dev_state;
 int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
@@ -90,4 +112,11 @@ void read_cdat_data(struct cxl_port *port);
 void cxl_cor_error_detected(struct pci_dev *pdev);
 pci_ers_result_t cxl_error_detected(struct pci_dev *pdev,
 				    pci_channel_state_t state);
+
+#define cdat_table_parse(x) \
+int cdat_table_parse_##x(struct cdat_header *table, \
+			 cdat_tbl_entry_handler handler, void *arg)
+cdat_table_parse(dsmas);
+cdat_table_parse(dslbis);
+cdat_table_parse(sslbis);
 #endif /* __CXL_PCI_H__ */




* [PATCH v4 05/23] cxl: Add callback to parse the DSMAS subtables from CDAT
  2023-04-19 20:21 [PATCH v4 00/23] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (3 preceding siblings ...)
  2023-04-19 20:21 ` [PATCH v4 04/23] cxl: Add common helpers for cdat parsing Dave Jiang
@ 2023-04-19 20:21 ` Dave Jiang
  2023-04-20 11:33   ` Jonathan Cameron
                     ` (2 more replies)
  2023-04-19 20:21 ` [PATCH v4 06/23] cxl: Add callback to parse the DSLBIS subtable " Dave Jiang
                   ` (17 subsequent siblings)
  22 siblings, 3 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-19 20:21 UTC (permalink / raw)
  To: linux-cxl, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

Provide a callback function to the CDAT parser in order to parse the Device
Scoped Memory Affinity Structure (DSMAS). Each DSMAS entry contains a
DPA range and its associated attributes. See the CDAT
specification for details.

Coherent Device Attribute Table 1.03 2.1 Device Scoped Memory Affinity
Structure (DSMAS)

Signed-off-by: Dave Jiang <dave.jiang@intel.com>

---
v3:
- Add spec section number. (Alison)
- Remove cast from void *. (Alison)
- Refactor cxl_port_probe() block. (Alison)
- Move CDAT parse to cxl_endpoint_port_probe()

v2:
- Add DSMAS table size check. (Lukas)
- Use local DSMAS header for LE handling.
- Remove dsmas lock. (Jonathan)
- Fix handle size (Jonathan)
- Add LE to host conversion for DSMAS address and length.
- Make dsmas_list local
---
 drivers/cxl/core/cdat.c |   26 ++++++++++++++++++++++++++
 drivers/cxl/cxl.h       |    1 +
 drivers/cxl/cxlpci.h    |   18 ++++++++++++++++++
 drivers/cxl/port.c      |   22 ++++++++++++++++++++++
 4 files changed, 67 insertions(+)

diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
index 210f4499bddb..6f20af83a3ed 100644
--- a/drivers/cxl/core/cdat.c
+++ b/drivers/cxl/core/cdat.c
@@ -98,3 +98,29 @@ int cdat_table_parse_sslbis(struct cdat_header *table,
 	return cdat_table_parse_entries(CDAT_TYPE_SSLBIS, table, &proc);
 }
 EXPORT_SYMBOL_NS_GPL(cdat_table_parse_sslbis, CXL);
+
+int cxl_dsmas_parse_entry(struct cdat_entry_header *header, void *arg)
+{
+	struct cdat_dsmas *dsmas = (struct cdat_dsmas *)header;
+	struct list_head *dsmas_list = arg;
+	struct dsmas_entry *dent;
+
+	if (dsmas->hdr.length != sizeof(*dsmas)) {
+		pr_warn("Malformed DSMAS table length: (%lu:%u)\n",
+			 (unsigned long)sizeof(*dsmas), dsmas->hdr.length);
+		return -EINVAL;
+	}
+
+	dent = kzalloc(sizeof(*dent), GFP_KERNEL);
+	if (!dent)
+		return -ENOMEM;
+
+	dent->handle = dsmas->dsmad_handle;
+	dent->dpa_range.start = le64_to_cpu(dsmas->dpa_base_address);
+	dent->dpa_range.end = le64_to_cpu(dsmas->dpa_base_address) +
+			      le64_to_cpu(dsmas->dpa_length) - 1;
+	list_add_tail(&dent->list, dsmas_list);
+
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_dsmas_parse_entry, CXL);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 278ab6952332..18ca25c7e527 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -8,6 +8,7 @@
 #include <linux/bitfield.h>
 #include <linux/bitops.h>
 #include <linux/log2.h>
+#include <linux/list.h>
 #include <linux/io.h>
 
 /**
diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
index 45e2f2bf5ef8..9a2468a93d83 100644
--- a/drivers/cxl/cxlpci.h
+++ b/drivers/cxl/cxlpci.h
@@ -104,6 +104,22 @@ struct cdat_subtable_entry {
 	enum cdat_type type;
 };
 
+struct dsmas_entry {
+	struct list_head list;
+	struct range dpa_range;
+	u8 handle;
+};
+
+/* Sub-table 0: Device Scoped Memory Affinity Structure (DSMAS) */
+struct cdat_dsmas {
+	struct cdat_entry_header hdr;
+	u8 dsmad_handle;
+	u8 flags;
+	__u16 reserved;
+	__le64 dpa_base_address;
+	__le64 dpa_length;
+} __packed;
+
 int devm_cxl_port_enumerate_dports(struct cxl_port *port);
 struct cxl_dev_state;
 int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
@@ -119,4 +135,6 @@ int cdat_table_parse_##x(struct cdat_header *table, \
 cdat_table_parse(dsmas);
 cdat_table_parse(dslbis);
 cdat_table_parse(sslbis);
+
+int cxl_dsmas_parse_entry(struct cdat_entry_header *header, void *arg);
 #endif /* __CXL_PCI_H__ */
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index 615e0ef6b440..3022bdd52439 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -57,6 +57,16 @@ static int discover_region(struct device *dev, void *root)
 	return 0;
 }
 
+static void dsmas_list_destroy(struct list_head *dsmas_list)
+{
+	struct dsmas_entry *dentry, *n;
+
+	list_for_each_entry_safe(dentry, n, dsmas_list, list) {
+		list_del(&dentry->list);
+		kfree(dentry);
+	}
+}
+
 static int cxl_switch_port_probe(struct cxl_port *port)
 {
 	struct cxl_hdm *cxlhdm;
@@ -125,6 +135,18 @@ static int cxl_endpoint_port_probe(struct cxl_port *port)
 	device_for_each_child(&port->dev, root, discover_region);
 	put_device(&root->dev);
 
+	if (port->cdat.table) {
+		LIST_HEAD(dsmas_list);
+
+		rc = cdat_table_parse_dsmas(port->cdat.table,
+					    cxl_dsmas_parse_entry,
+					    (void *)&dsmas_list);
+		if (rc < 0)
+			dev_warn(&port->dev, "Failed to parse DSMAS: %d\n", rc);
+
+		dsmas_list_destroy(&dsmas_list);
+	}
+
 	return 0;
 }
 




* [PATCH v4 06/23] cxl: Add callback to parse the DSLBIS subtable from CDAT
  2023-04-19 20:21 [PATCH v4 00/23] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (4 preceding siblings ...)
  2023-04-19 20:21 ` [PATCH v4 05/23] cxl: Add callback to parse the DSMAS subtables from CDAT Dave Jiang
@ 2023-04-19 20:21 ` Dave Jiang
  2023-04-20 11:40   ` Jonathan Cameron
  2023-04-24 22:46   ` Dan Williams
  2023-04-19 20:21 ` [PATCH v4 07/23] cxl: Add callback to parse the SSLBIS " Dave Jiang
                   ` (16 subsequent siblings)
  22 siblings, 2 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-19 20:21 UTC (permalink / raw)
  To: linux-cxl, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

Provide a callback to parse the Device Scoped Latency and Bandwidth
Information Structure (DSLBIS) in the CDAT structures. The DSLBIS
contains the bandwidth and latency information that's tied to a DSMAS
handle. The driver will retrieve the read and write latency and
bandwidth associated with the DSMAS which is tied to a DPA range.

Coherent Device Attribute Table 1.03 2.1 Device Scoped Latency and
Bandwidth Information Structure (DSLBIS)

Signed-off-by: Dave Jiang <dave.jiang@intel.com>

---
v3:
- Added spec section in commit header. (Alison)
- Remove void * recast. (Alison)
- Rework comment. (Alison)
- Move CDAT parse to cxl_endpoint_port_probe()
- Convert to use 'struct node_hmem_attrs'

v2:
- Add size check to DSLIBIS table. (Lukas)
- Remove unnecessary entry type check. (Jonathan)
- Move data_type check to after match. (Jonathan)
- Skip unknown data type. (Jonathan)
- Add overflow check for unit multiply. (Jonathan)
- Use dev_warn() when entries parsing fail. (Jonathan)
---
 drivers/cxl/core/cdat.c |   68 +++++++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxlpci.h    |   34 +++++++++++++++++++++++-
 drivers/cxl/port.c      |   11 +++++++-
 3 files changed, 111 insertions(+), 2 deletions(-)

diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
index 6f20af83a3ed..e8b9bb99dfdf 100644
--- a/drivers/cxl/core/cdat.c
+++ b/drivers/cxl/core/cdat.c
@@ -1,5 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /* Copyright(c) 2023 Intel Corporation. All rights reserved. */
+#include <linux/overflow.h>
 #include "cxlpci.h"
 #include "cxl.h"
 
@@ -124,3 +125,70 @@ int cxl_dsmas_parse_entry(struct cdat_entry_header *header, void *arg)
 	return 0;
 }
 EXPORT_SYMBOL_NS_GPL(cxl_dsmas_parse_entry, CXL);
+
+static void cxl_hmem_attrs_set(struct node_hmem_attrs *attrs,
+			       int access, unsigned int val)
+{
+	switch (access) {
+	case HMAT_SLLBIS_ACCESS_LATENCY:
+		attrs->read_latency = val;
+		attrs->write_latency = val;
+		break;
+	case HMAT_SLLBIS_READ_LATENCY:
+		attrs->read_latency = val;
+		break;
+	case HMAT_SLLBIS_WRITE_LATENCY:
+		attrs->write_latency = val;
+		break;
+	case HMAT_SLLBIS_ACCESS_BANDWIDTH:
+		attrs->read_bandwidth = val;
+		attrs->write_bandwidth = val;
+		break;
+	case HMAT_SLLBIS_READ_BANDWIDTH:
+		attrs->read_bandwidth = val;
+		break;
+	case HMAT_SLLBIS_WRITE_BANDWIDTH:
+		attrs->write_bandwidth = val;
+		break;
+	}
+}
+
+int cxl_dslbis_parse_entry(struct cdat_entry_header *header, void *arg)
+{
+	struct cdat_dslbis *dslbis = (struct cdat_dslbis *)header;
+	struct list_head *dsmas_list = arg;
+	struct dsmas_entry *dent;
+
+	if (dslbis->hdr.length != sizeof(*dslbis)) {
+		pr_warn("Malformed DSLBIS table length: (%lu:%u)\n",
+			(unsigned long)sizeof(*dslbis), dslbis->hdr.length);
+		return -EINVAL;
+	}
+
+	/* Skip unrecognized data type */
+	if (dslbis->data_type >= HMAT_SLLBIS_DATA_TYPE_MAX)
+		return 0;
+
+	list_for_each_entry(dent, dsmas_list, list) {
+		u64 val;
+		int rc;
+
+		if (dslbis->handle != dent->handle)
+			continue;
+
+		/* Not a memory type, skip */
+		if ((dslbis->flags & DSLBIS_MEM_MASK) != DSLBIS_MEM_MEMORY)
+			return 0;
+
+		rc = check_mul_overflow(le64_to_cpu(dslbis->entry_base_unit),
+					le16_to_cpu(dslbis->entry[0]), &val);
+		if (unlikely(rc))
+			pr_warn("DSLBIS value overflowed.\n");
+
+		cxl_hmem_attrs_set(&dent->hmem_attrs, dslbis->data_type, val);
+		break;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_dslbis_parse_entry, CXL);
diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
index 9a2468a93d83..0f36fb23055c 100644
--- a/drivers/cxl/cxlpci.h
+++ b/drivers/cxl/cxlpci.h
@@ -2,6 +2,7 @@
 /* Copyright(c) 2020 Intel Corporation. All rights reserved. */
 #ifndef __CXL_PCI_H__
 #define __CXL_PCI_H__
+#include <linux/node.h>
 #include <linux/pci.h>
 #include "cxl.h"
 
@@ -104,10 +105,21 @@ struct cdat_subtable_entry {
 	enum cdat_type type;
 };
 
+enum {
+	HMAT_SLLBIS_ACCESS_LATENCY = 0,
+	HMAT_SLLBIS_READ_LATENCY,
+	HMAT_SLLBIS_WRITE_LATENCY,
+	HMAT_SLLBIS_ACCESS_BANDWIDTH,
+	HMAT_SLLBIS_READ_BANDWIDTH,
+	HMAT_SLLBIS_WRITE_BANDWIDTH,
+	HMAT_SLLBIS_DATA_TYPE_MAX
+};
+
 struct dsmas_entry {
 	struct list_head list;
 	struct range dpa_range;
 	u8 handle;
+	struct node_hmem_attrs hmem_attrs;
 };
 
 /* Sub-table 0: Device Scoped Memory Affinity Structure (DSMAS) */
@@ -120,6 +132,22 @@ struct cdat_dsmas {
 	__le64 dpa_length;
 } __packed;
 
+/* Sub-table 1: Device Scoped Latency and Bandwidth Information Structure (DSLBIS) */
+struct cdat_dslbis {
+	struct cdat_entry_header hdr;
+	u8 handle;
+	u8 flags;
+	u8 data_type;
+	u8 reserved;
+	__le64 entry_base_unit;
+	__le16 entry[3];
+	__le16 reserved2;
+} __packed;
+
+/* Flags for DSLBIS subtable */
+#define DSLBIS_MEM_MASK		GENMASK(3, 0)
+#define DSLBIS_MEM_MEMORY	0
+
 int devm_cxl_port_enumerate_dports(struct cxl_port *port);
 struct cxl_dev_state;
 int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
@@ -136,5 +164,9 @@ cdat_table_parse(dsmas);
 cdat_table_parse(dslbis);
 cdat_table_parse(sslbis);
 
-int cxl_dsmas_parse_entry(struct cdat_entry_header *header, void *arg);
+#define cxl_parse_entry(x) \
+int cxl_##x##_parse_entry(struct cdat_entry_header *header, void *arg)
+
+cxl_parse_entry(dsmas);
+cxl_parse_entry(dslbis);
 #endif /* __CXL_PCI_H__ */
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index 3022bdd52439..ab584b83010d 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -141,8 +141,17 @@ static int cxl_endpoint_port_probe(struct cxl_port *port)
 		rc = cdat_table_parse_dsmas(port->cdat.table,
 					    cxl_dsmas_parse_entry,
 					    (void *)&dsmas_list);
-		if (rc < 0)
+		if (rc > 0) {
+			rc = cdat_table_parse_dslbis(port->cdat.table,
+						     cxl_dslbis_parse_entry,
+						     (void *)&dsmas_list);
+			if (rc <= 0) {
+				dev_warn(&port->dev,
+					 "Failed to parse DSLBIS: %d\n", rc);
+			}
+		} else {
 			dev_warn(&port->dev, "Failed to parse DSMAS: %d\n", rc);
+		}
 
 		dsmas_list_destroy(&dsmas_list);
 	}




* [PATCH v4 07/23] cxl: Add callback to parse the SSLBIS subtable from CDAT
  2023-04-19 20:21 [PATCH v4 00/23] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (5 preceding siblings ...)
  2023-04-19 20:21 ` [PATCH v4 06/23] cxl: Add callback to parse the DSLBIS subtable " Dave Jiang
@ 2023-04-19 20:21 ` Dave Jiang
  2023-04-20 11:50   ` Jonathan Cameron
  2023-04-24 23:38   ` Dan Williams
  2023-04-19 20:21 ` [PATCH v4 08/23] cxl: Add support for _DSM Function for retrieving QTG ID Dave Jiang
                   ` (15 subsequent siblings)
  22 siblings, 2 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-19 20:21 UTC (permalink / raw)
  To: linux-cxl, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

Provide a callback to parse the Switch Scoped Latency and Bandwidth
Information Structure (SSLBIS) in the CDAT structures. The SSLBIS
contains the bandwidth and latency information that's tied to the
CXL switch that the data table has been read from. The extracted
values are indexed by the downstream port id. It is possible for
the downstream port id to be 0xffff, which is a wildcard value matching
any port id.

Coherent Device Attribute Table 1.03 2.1 Switch Scoped Latency
and Bandwidth Information Structure (SSLBIS)

Signed-off-by: Dave Jiang <dave.jiang@intel.com>

---
v3:
- Add spec section in commit header (Alison)
- Move CDAT parse to cxl_switch_port_probe()
- Use 'struct node_hmem_attrs'
---
 drivers/cxl/core/cdat.c |   76 +++++++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/core/port.c |    5 +++
 drivers/cxl/cxl.h       |    1 +
 drivers/cxl/cxlpci.h    |   20 ++++++++++++
 drivers/cxl/port.c      |   14 ++++++++-
 5 files changed, 115 insertions(+), 1 deletion(-)

diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
index e8b9bb99dfdf..ec3420dddf27 100644
--- a/drivers/cxl/core/cdat.c
+++ b/drivers/cxl/core/cdat.c
@@ -192,3 +192,79 @@ int cxl_dslbis_parse_entry(struct cdat_entry_header *header, void *arg)
 	return 0;
 }
 EXPORT_SYMBOL_NS_GPL(cxl_dslbis_parse_entry, CXL);
+
+int cxl_sslbis_parse_entry(struct cdat_entry_header *header, void *arg)
+{
+	struct cdat_sslbis *sslbis = (struct cdat_sslbis *)header;
+	struct xarray *sslbis_xa = arg;
+	int remain, entries, i;
+
+	remain = sslbis->hdr.length - sizeof(*sslbis);
+	if (!remain || remain % sizeof(struct sslbis_sslbe)) {
+		pr_warn("Malformed SSLBIS table length: (%u)\n",
+			sslbis->hdr.length);
+		return -EINVAL;
+	}
+
+	/* Unrecognized data type, we can skip */
+	if (sslbis->data_type >= HMAT_SLLBIS_DATA_TYPE_MAX)
+		return 0;
+
+	entries = remain / sizeof(*sslbis);
+
+	for (i = 0; i < entries; i++) {
+		struct sslbis_sslbe *sslbe = &sslbis->sslbe[i];
+		u16 x = le16_to_cpu(sslbe->port_x_id);
+		u16 y = le16_to_cpu(sslbe->port_y_id);
+		struct node_hmem_attrs *hmem_attrs;
+		u16 dsp_id;
+		u64 val;
+		int rc;
+
+		switch (x) {
+		case SSLBIS_US_PORT:
+			dsp_id = y;
+			break;
+		case SSLBIS_ANY_PORT:
+			switch (y) {
+			case SSLBIS_US_PORT:
+				dsp_id = x;
+				break;
+			case SSLBIS_ANY_PORT:
+				dsp_id = SSLBIS_ANY_PORT;
+				break;
+			default:
+				dsp_id = y;
+				break;
+			}
+			break;
+		default:
+			dsp_id = x;
+			break;
+		}
+
+		hmem_attrs = xa_load(sslbis_xa, dsp_id);
+		if (xa_is_err(hmem_attrs))
+			return xa_err(hmem_attrs);
+		if (!hmem_attrs) {
+			hmem_attrs = kzalloc(sizeof(*hmem_attrs), GFP_KERNEL);
+			if (!hmem_attrs)
+				return -ENOMEM;
+		}
+
+		rc = check_mul_overflow(le64_to_cpu(sslbis->entry_base_unit),
+					le16_to_cpu(sslbe->value), &val);
+		if (unlikely(rc))
+			pr_warn("SSLBIS value overflowed!\n");
+
+		cxl_hmem_attrs_set(hmem_attrs, sslbis->data_type, val);
+		rc = xa_insert(sslbis_xa, dsp_id, hmem_attrs, GFP_KERNEL);
+		if (rc < 0 && rc != -EBUSY) {
+			kfree(hmem_attrs);
+			return rc;
+		}
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_sslbis_parse_entry, CXL);
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 024d4178f557..3fedbabac1af 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -518,6 +518,7 @@ static void cxl_ep_remove(struct cxl_port *port, struct cxl_ep *ep)
 static void cxl_port_release(struct device *dev)
 {
 	struct cxl_port *port = to_cxl_port(dev);
+	struct node_hmem_attrs *hmem_attrs;
 	unsigned long index;
 	struct cxl_ep *ep;
 
@@ -526,6 +527,9 @@ static void cxl_port_release(struct device *dev)
 	xa_destroy(&port->endpoints);
 	xa_destroy(&port->dports);
 	xa_destroy(&port->regions);
+	xa_for_each(&port->cdat.sslbis_xa, index, hmem_attrs)
+		kfree(hmem_attrs);
+	xa_destroy(&port->cdat.sslbis_xa);
 	ida_free(&cxl_port_ida, port->id);
 	kfree(port);
 }
@@ -684,6 +688,7 @@ static struct cxl_port *cxl_port_alloc(struct device *uport,
 	xa_init(&port->dports);
 	xa_init(&port->endpoints);
 	xa_init(&port->regions);
+	xa_init(&port->cdat.sslbis_xa);
 
 	device_initialize(dev);
 	lockdep_set_class_and_subclass(&dev->mutex, &cxl_port_key, port->depth);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 18ca25c7e527..318aa051f65a 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -581,6 +581,7 @@ struct cxl_port {
 	struct cxl_cdat {
 		void *table;
 		size_t length;
+		struct xarray sslbis_xa;
 	} cdat;
 	bool cdat_available;
 };
diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
index 0f36fb23055c..1bca1c0e4b40 100644
--- a/drivers/cxl/cxlpci.h
+++ b/drivers/cxl/cxlpci.h
@@ -148,6 +148,25 @@ struct cdat_dslbis {
 #define DSLBIS_MEM_MASK		GENMASK(3, 0)
 #define DSLBIS_MEM_MEMORY	0
 
+struct sslbis_sslbe {
+	__le16 port_x_id;
+	__le16 port_y_id;
+	__le16 value;	/* latency or bandwidth */
+	__le16 reserved;
+} __packed;
+
+/* Sub-table 5: Switch Scoped Latency and Bandwidth Information Structure (SSLBIS) */
+struct cdat_sslbis {
+	struct cdat_entry_header hdr;
+	u8 data_type;
+	u8 reserved[3];
+	__le64 entry_base_unit;
+	struct sslbis_sslbe sslbe[];
+} __packed;
+
+#define SSLBIS_US_PORT		0x0100
+#define SSLBIS_ANY_PORT		0xffff
+
 int devm_cxl_port_enumerate_dports(struct cxl_port *port);
 struct cxl_dev_state;
 int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
@@ -169,4 +188,5 @@ int cxl_##x##_parse_entry(struct cdat_entry_header *header, void *arg)
 
 cxl_parse_entry(dsmas);
 cxl_parse_entry(dslbis);
+cxl_parse_entry(sslbis);
 #endif /* __CXL_PCI_H__ */
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index ab584b83010d..2d5b9ba13429 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -83,7 +83,19 @@ static int cxl_switch_port_probe(struct cxl_port *port)
 	if (IS_ERR(cxlhdm))
 		return PTR_ERR(cxlhdm);
 
-	return devm_cxl_enumerate_decoders(cxlhdm, NULL);
+	rc = devm_cxl_enumerate_decoders(cxlhdm, NULL);
+	if (rc < 0)
+		return rc;
+
+	if (port->cdat.table) {
+		rc = cdat_table_parse_sslbis(port->cdat.table,
+					     cxl_sslbis_parse_entry,
+					     (void *)&port->cdat.sslbis_xa);
+		if (rc <= 0)
+			dev_warn(&port->dev, "Failed to parse SSLBIS: %d\n", rc);
+	}
+
+	return 0;
 }
 
 static int cxl_endpoint_port_probe(struct cxl_port *port)




* [PATCH v4 08/23] cxl: Add support for _DSM Function for retrieving QTG ID
  2023-04-19 20:21 [PATCH v4 00/23] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (6 preceding siblings ...)
  2023-04-19 20:21 ` [PATCH v4 07/23] cxl: Add callback to parse the SSLBIS " Dave Jiang
@ 2023-04-19 20:21 ` Dave Jiang
  2023-04-20 12:00   ` Jonathan Cameron
  2023-04-25  0:12   ` Dan Williams
  2023-04-19 20:21 ` [PATCH v4 09/23] cxl: Add helper function to retrieve ACPI handle of CXL root device Dave Jiang
                   ` (14 subsequent siblings)
  22 siblings, 2 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-19 20:21 UTC (permalink / raw)
  To: linux-cxl, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

CXL spec v3.0 9.17.3 CXL Root Device Specific Methods (_DSM)

Add support to retrieve QTG ID via ACPI _DSM call. The _DSM call requires
an input of an ACPI package with 4 dwords (read latency, write latency,
read bandwidth, write bandwidth). The call returns a package with 1 WORD
that provides the max supported QTG ID and a package that may contain 0 or
more WORDs as the recommended QTG IDs in the recommended order.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>

---
v2:
- Reorder var declaration and use C99 style. (Jonathan)
- Allow >2 ACPI objects in package for future expansion. (Jonathan)
- Check QTG IDs against MAX QTG ID provided by output package. (Jonathan)
---
 drivers/cxl/core/Makefile |    1 
 drivers/cxl/core/acpi.c   |  116 +++++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h         |   16 ++++++
 3 files changed, 133 insertions(+)
 create mode 100644 drivers/cxl/core/acpi.c

diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
index 867a8014b462..30d61c8cae22 100644
--- a/drivers/cxl/core/Makefile
+++ b/drivers/cxl/core/Makefile
@@ -13,5 +13,6 @@ cxl_core-y += mbox.o
 cxl_core-y += pci.o
 cxl_core-y += hdm.o
 cxl_core-y += cdat.o
+cxl_core-y += acpi.o
 cxl_core-$(CONFIG_TRACING) += trace.o
 cxl_core-$(CONFIG_CXL_REGION) += region.o
diff --git a/drivers/cxl/core/acpi.c b/drivers/cxl/core/acpi.c
new file mode 100644
index 000000000000..6eda5cad8d59
--- /dev/null
+++ b/drivers/cxl/core/acpi.c
@@ -0,0 +1,116 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2022 Intel Corporation. All rights reserved. */
+#include <linux/module.h>
+#include <linux/device.h>
+#include <linux/kernel.h>
+#include <linux/acpi.h>
+#include <linux/pci.h>
+#include <asm/div64.h>
+#include "cxlpci.h"
+#include "cxl.h"
+
+const guid_t acpi_cxl_qtg_id_guid =
+	GUID_INIT(0xF365F9A6, 0xA7DE, 0x4071,
+		  0xA6, 0x6A, 0xB4, 0x0C, 0x0B, 0x4F, 0x8E, 0x52);
+
+/**
+ * cxl_acpi_evaluate_qtg_dsm - Retrieve QTG ids via ACPI _DSM
+ * @handle: ACPI handle
+ * @input: bandwidth and latency data
+ *
+ * Issue QTG _DSM with accompanied bandwidth and latency data in order to get
+ * the QTG IDs that falls within the performance data.
+ */
+struct qtg_dsm_output *cxl_acpi_evaluate_qtg_dsm(acpi_handle handle,
+						 struct qtg_dsm_input *input)
+{
+	union acpi_object *out_obj, *out_buf, *pkg;
+	union acpi_object in_buf = {
+		.buffer = {
+			.type = ACPI_TYPE_BUFFER,
+			.pointer = (u8 *)input,
+			.length = sizeof(u32) * 4,
+		},
+	};
+	union acpi_object in_obj = {
+		.package = {
+			.type = ACPI_TYPE_PACKAGE,
+			.count = 1,
+			.elements = &in_buf
+		},
+	};
+	struct qtg_dsm_output *output = NULL;
+	int len, rc, i;
+	u16 *max_qtg;
+
+	out_obj = acpi_evaluate_dsm(handle, &acpi_cxl_qtg_id_guid, 1, 1, &in_obj);
+	if (!out_obj)
+		return ERR_PTR(-ENXIO);
+
+	if (out_obj->type != ACPI_TYPE_PACKAGE) {
+		rc = -ENXIO;
+		goto err;
+	}
+
+	/* Check Max QTG ID */
+	pkg = &out_obj->package.elements[0];
+	if (pkg->type != ACPI_TYPE_BUFFER) {
+		rc = -ENXIO;
+		goto err;
+	}
+
+	if (pkg->buffer.length != sizeof(u16)) {
+		rc = -ENXIO;
+		goto err;
+	}
+	max_qtg = (u16 *)pkg->buffer.pointer;
+
+	/* Retrieve QTG IDs package */
+	pkg = &out_obj->package.elements[1];
+	if (pkg->type != ACPI_TYPE_PACKAGE) {
+		rc = -ENXIO;
+		goto err;
+	}
+
+	out_buf = &pkg->package.elements[0];
+	if (out_buf->type != ACPI_TYPE_BUFFER) {
+		rc = -ENXIO;
+		goto err;
+	}
+
+	len = out_buf->buffer.length;
+
+	/* It's legal to have 0 QTG entries */
+	if (len == 0)
+		goto out;
+
+	/* Malformed package, not multiple of WORD size */
+	if (len % sizeof(u16)) {
+		rc = -ENXIO;
+		goto out;
+	}
+
+	output = kmalloc(len + sizeof(*output), GFP_KERNEL);
+	if (!output) {
+		rc = -ENOMEM;
+		goto err;
+	}
+
+	output->nr = len / sizeof(u16);
+	memcpy(output->qtg_ids, out_buf->buffer.pointer, len);
+
+	for (i = 0; i < output->nr; i++) {
+		if (output->qtg_ids[i] > *max_qtg)
+			pr_warn("QTG ID %u greater than MAX %u\n",
+				output->qtg_ids[i], *max_qtg);
+	}
+
+out:
+	ACPI_FREE(out_obj);
+	return output;
+
+err:
+	ACPI_FREE(out_obj);
+	return ERR_PTR(rc);
+}
+EXPORT_SYMBOL_NS_GPL(cxl_acpi_evaluate_qtg_dsm, CXL);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 318aa051f65a..6426c4c22e28 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -7,6 +7,7 @@
 #include <linux/libnvdimm.h>
 #include <linux/bitfield.h>
 #include <linux/bitops.h>
+#include <linux/acpi.h>
 #include <linux/log2.h>
 #include <linux/list.h>
 #include <linux/io.h>
@@ -793,6 +794,21 @@ static inline struct cxl_dax_region *to_cxl_dax_region(struct device *dev)
 }
 #endif
 
+struct qtg_dsm_input {
+	u32 rd_lat;
+	u32 wr_lat;
+	u32 rd_bw;
+	u32 wr_bw;
+};
+
+struct qtg_dsm_output {
+	int nr;
+	u16 qtg_ids[];
+};
+
+struct qtg_dsm_output *cxl_acpi_evaluate_qtg_dsm(acpi_handle handle,
+						 struct qtg_dsm_input *input);
+
 /*
  * Unit test builds overrides this to __weak, find the 'strong' version
  * of these symbols in tools/testing/cxl/.




* [PATCH v4 09/23] cxl: Add helper function to retrieve ACPI handle of CXL root device
  2023-04-19 20:21 [PATCH v4 00/23] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (7 preceding siblings ...)
  2023-04-19 20:21 ` [PATCH v4 08/23] cxl: Add support for _DSM Function for retrieving QTG ID Dave Jiang
@ 2023-04-19 20:21 ` Dave Jiang
  2023-04-20 12:06   ` Jonathan Cameron
  2023-04-25  0:18   ` Dan Williams
  2023-04-19 20:22 ` [PATCH v4 10/23] cxl: Add helpers to calculate pci latency for the CXL device Dave Jiang
                   ` (13 subsequent siblings)
  22 siblings, 2 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-19 20:21 UTC (permalink / raw)
  To: linux-cxl, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

Provide a helper to find the ACPI0017 device in order to issue the _DSM.
The helper will take the 'struct device' from a cxl_port and iterate until
the root device is reached. The ACPI handle will be returned from the root
device.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>

---
v2:
- Fix commenting style. (Jonathan)
- Fix var declaration aligning. (Jonathan)
---
 drivers/cxl/core/acpi.c |   34 ++++++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h       |    1 +
 2 files changed, 35 insertions(+)

diff --git a/drivers/cxl/core/acpi.c b/drivers/cxl/core/acpi.c
index 6eda5cad8d59..191644d0ca6d 100644
--- a/drivers/cxl/core/acpi.c
+++ b/drivers/cxl/core/acpi.c
@@ -5,6 +5,7 @@
 #include <linux/kernel.h>
 #include <linux/acpi.h>
 #include <linux/pci.h>
+#include <linux/platform_device.h>
 #include <asm/div64.h>
 #include "cxlpci.h"
 #include "cxl.h"
@@ -13,6 +14,39 @@ const guid_t acpi_cxl_qtg_id_guid =
 	GUID_INIT(0xF365F9A6, 0xA7DE, 0x4071,
 		  0xA6, 0x6A, 0xB4, 0x0C, 0x0B, 0x4F, 0x8E, 0x52);
 
+/**
+ * cxl_acpi_get_rootdev_handle - get the ACPI handle of the CXL root device
+ * @dev: 'struct device' to start searching from. Should be from cxl_port->dev.
+ *
+ * Return: acpi_handle on success, errptr of errno on error.
+ *
+ * Looks for the ACPI0017 device and return the ACPI handle
+ **/
+acpi_handle cxl_acpi_get_rootdev_handle(struct device *dev)
+{
+	struct device *itr = dev;
+	struct device *root_dev;
+	acpi_handle handle;
+
+	if (!dev)
+		return ERR_PTR(-EINVAL);
+
+	while (itr->parent) {
+		root_dev = itr;
+		itr = itr->parent;
+	}
+
+	if (!dev_is_platform(root_dev))
+		return ERR_PTR(-ENODEV);
+
+	handle = ACPI_HANDLE(root_dev);
+	if (!handle)
+		return ERR_PTR(-ENODEV);
+
+	return handle;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_acpi_get_rootdev_handle, CXL);
+
 /**
  * cxl_acpi_evaluate_qtg_dsm - Retrieve QTG ids via ACPI _DSM
  * @handle: ACPI handle
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 6426c4c22e28..d7c1410a437c 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -808,6 +808,7 @@ struct qtg_dsm_output {
 
 struct qtg_dsm_output *cxl_acpi_evaluate_qtg_dsm(acpi_handle handle,
 						 struct qtg_dsm_input *input);
+acpi_handle cxl_acpi_get_rootdev_handle(struct device *dev);
 
 /*
  * Unit test builds overrides this to __weak, find the 'strong' version




* [PATCH v4 10/23] cxl: Add helpers to calculate pci latency for the CXL device
  2023-04-19 20:21 [PATCH v4 00/23] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (8 preceding siblings ...)
  2023-04-19 20:21 ` [PATCH v4 09/23] cxl: Add helper function to retrieve ACPI handle of CXL root device Dave Jiang
@ 2023-04-19 20:22 ` Dave Jiang
  2023-04-20 12:15   ` Jonathan Cameron
  2023-04-25  0:30   ` Dan Williams
  2023-04-19 20:22 ` [PATCH v4 11/23] cxl: Add helper function that calculates QoS values for switches Dave Jiang
                   ` (12 subsequent siblings)
  22 siblings, 2 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-19 20:22 UTC (permalink / raw)
  To: linux-cxl, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

The latency is calculated by dividing the flit size by the link bandwidth. Add
support to retrieve the flit size for the CXL device and calculate the
latency of the downstream link.
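
As a rough worked example (derived from the helpers below, assuming the link is
not in PCIe flit mode): PCIE_SPEED_32_0GT maps to 32000 Mb/s, i.e. 4000 MB/s,
so a 68B flit gives 68 * 1000000 / 4000 = 17000 ps (17 ns) of link latency.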

Signed-off-by: Dave Jiang <dave.jiang@intel.com>

---
v2:
- Fix commit log issues. (Jonathan)
- Fix var declaration issues. (Jonathan)
---
 drivers/cxl/core/pci.c |   68 ++++++++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxlpci.h   |   15 +++++++++++
 drivers/cxl/pci.c      |   13 ---------
 3 files changed, 83 insertions(+), 13 deletions(-)

diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index 1c415b26e866..bb58296b3e56 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -712,3 +712,71 @@ pci_ers_result_t cxl_error_detected(struct pci_dev *pdev,
 	return PCI_ERS_RESULT_NEED_RESET;
 }
 EXPORT_SYMBOL_NS_GPL(cxl_error_detected, CXL);
+
+static int pci_bus_speed_to_mbps(enum pci_bus_speed speed)
+{
+	switch (speed) {
+	case PCIE_SPEED_2_5GT:
+		return 2500;
+	case PCIE_SPEED_5_0GT:
+		return 5000;
+	case PCIE_SPEED_8_0GT:
+		return 8000;
+	case PCIE_SPEED_16_0GT:
+		return 16000;
+	case PCIE_SPEED_32_0GT:
+		return 32000;
+	case PCIE_SPEED_64_0GT:
+		return 64000;
+	default:
+		break;
+	}
+
+	return -EINVAL;
+}
+
+static int cxl_pci_mbits_to_mbytes(struct pci_dev *pdev)
+{
+	int mbits;
+
+	mbits = pci_bus_speed_to_mbps(pdev->bus->cur_bus_speed);
+	if (mbits < 0)
+		return mbits;
+
+	return mbits >> 3;
+}
+
+static int cxl_flit_size(struct pci_dev *pdev)
+{
+	if (cxl_pci_flit_256(pdev))
+		return 256;
+
+	return 68;
+}
+
+/**
+ * cxl_pci_get_latency - calculate the link latency for the PCIe link
+ * @pdev: PCI device
+ *
+ * Return: calculated latency or -errno
+ *
+ * CXL Memory Device SW Guide v1.0 2.11.4 Link latency calculation
+ * Link latency = LinkPropagationLatency + FlitLatency + RetimerLatency
+ * LinkPropagationLatency is negligible, so 0 will be used
+ * RetimerLatency is assumed to be negligible and 0 will be used
+ * FlitLatency = FlitSize / LinkBandwidth
+ * FlitSize is defined by spec. CXL rev3.0 4.2.1.
+ * A 68B flit is used up to 32GT/s; above 32GT/s, a 256B flit is used.
+ * The FlitLatency is converted to picoseconds.
+ */
+long cxl_pci_get_latency(struct pci_dev *pdev)
+{
+	long bw;
+
+	bw = cxl_pci_mbits_to_mbytes(pdev);
+	if (bw < 0)
+		return bw;
+
+	return cxl_flit_size(pdev) * 1000000L / bw;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_pci_get_latency, CXL);
diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
index 1bca1c0e4b40..795eba31fe29 100644
--- a/drivers/cxl/cxlpci.h
+++ b/drivers/cxl/cxlpci.h
@@ -167,6 +167,19 @@ struct cdat_sslbis {
 #define SSLBIS_US_PORT		0x0100
 #define SSLBIS_ANY_PORT		0xffff
 
+/*
+ * CXL v3.0 6.2.3 Table 6-4
+ * The table indicates that if PCIe Flit Mode is set, then CXL is in 256B flits
+ * mode, otherwise it's 68B flits mode.
+ */
+static inline bool cxl_pci_flit_256(struct pci_dev *pdev)
+{
+	u16 lnksta2;
+
+	pcie_capability_read_word(pdev, PCI_EXP_LNKSTA2, &lnksta2);
+	return lnksta2 & PCI_EXP_LNKSTA2_FLIT;
+}
+
 int devm_cxl_port_enumerate_dports(struct cxl_port *port);
 struct cxl_dev_state;
 int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
@@ -189,4 +202,6 @@ int cxl_##x##_parse_entry(struct cdat_entry_header *header, void *arg)
 cxl_parse_entry(dsmas);
 cxl_parse_entry(dslbis);
 cxl_parse_entry(sslbis);
+
+long cxl_pci_get_latency(struct pci_dev *pdev);
 #endif /* __CXL_PCI_H__ */
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index ea38bd49b0cf..ed39d133b70d 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -365,19 +365,6 @@ static bool is_cxl_restricted(struct pci_dev *pdev)
 	return pci_pcie_type(pdev) == PCI_EXP_TYPE_RC_END;
 }
 
-/*
- * CXL v3.0 6.2.3 Table 6-4
- * The table indicates that if PCIe Flit Mode is set, then CXL is in 256B flits
- * mode, otherwise it's 68B flits mode.
- */
-static bool cxl_pci_flit_256(struct pci_dev *pdev)
-{
-	u16 lnksta2;
-
-	pcie_capability_read_word(pdev, PCI_EXP_LNKSTA2, &lnksta2);
-	return lnksta2 & PCI_EXP_LNKSTA2_FLIT;
-}
-
 static int cxl_pci_ras_unmask(struct pci_dev *pdev)
 {
 	struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);



^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v4 11/23] cxl: Add helper function that calculates QoS values for switches
  2023-04-19 20:21 [PATCH v4 00/23] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (9 preceding siblings ...)
  2023-04-19 20:22 ` [PATCH v4 10/23] cxl: Add helpers to calculate pci latency for the CXL device Dave Jiang
@ 2023-04-19 20:22 ` Dave Jiang
  2023-04-20 12:26   ` Jonathan Cameron
  2023-04-25  0:33   ` Dan Williams
  2023-04-19 20:22 ` [PATCH v4 12/23] cxl: Add helper function that calculate QoS values for PCI path Dave Jiang
                   ` (11 subsequent siblings)
  22 siblings, 2 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-19 20:22 UTC (permalink / raw)
  To: linux-cxl, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

The CDAT information from the switch, the Switch Scoped Latency and Bandwidth
Information Structure (SSLBIS), is parsed and stored in an xarray under the
cxl_port. The QoS data are indexed by the downstream port id. Walk the CXL
ports from endpoint to root and retrieve the relevant QoS information
(bandwidth and latency) from the switch CDAT. If read or write QoS
values are not available, then use the access QoS value.
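
A minimal usage sketch (not from the patch) of how an endpoint-side caller
might consume this helper, treating -ENOENT (no switch in the path) as
non-fatal; the function name and debug message here are illustrative only:

  /* Sketch: aggregate switch CDAT QoS for an endpoint port */
  static int example_switch_qos(struct cxl_port *ep_port)
  {
          u64 rd_bw, rd_lat, wr_bw, wr_lat;
          int rc;

          rc = cxl_port_get_switch_qos(ep_port, &rd_bw, &rd_lat,
                                       &wr_bw, &wr_lat);
          if (rc == -ENOENT)
                  return 0;       /* no switch between endpoint and root */
          if (rc)
                  return rc;

          dev_dbg(&ep_port->dev, "switch QoS: rd %llu/%llu wr %llu/%llu (bw/lat)\n",
                  rd_bw, rd_lat, wr_bw, wr_lat);
          return 0;
  }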

Signed-off-by: Dave Jiang <dave.jiang@intel.com>

---
v3:
- Move to use 'struct node_hmem_attrs'
---
 drivers/cxl/core/port.c |   81 +++++++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h       |    2 +
 2 files changed, 83 insertions(+)

diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 3fedbabac1af..770b540d5325 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -1921,6 +1921,87 @@ bool schedule_cxl_memdev_detach(struct cxl_memdev *cxlmd)
 }
 EXPORT_SYMBOL_NS_GPL(schedule_cxl_memdev_detach, CXL);
 
+/**
+ * cxl_port_get_switch_qos - retrieve QoS data for CXL switches
+ * @port: endpoint cxl_port
+ * @rd_bw: writeback value for min read bandwidth
+ * @rd_lat: writeback value for total read latency
+ * @wr_bw: writeback value for min write bandwidth
+ * @wr_lat: writeback value for total write latency
+ *
+ * Return: Errno on failure, 0 on success. -ENOENT if no switch device
+ */
+int cxl_port_get_switch_qos(struct cxl_port *port, u64 *rd_bw, u64 *rd_lat,
+			    u64 *wr_bw, u64 *wr_lat)
+{
+	u64 min_rd_bw = ULONG_MAX;
+	u64 min_wr_bw = ULONG_MAX;
+	struct cxl_dport *dport;
+	struct cxl_port *nport;
+	u64 total_rd_lat = 0;
+	u64 total_wr_lat = 0;
+	struct device *next;
+	int switches = 0;
+	int rc = 0;
+
+	if (!is_cxl_endpoint(port))
+		return -EINVAL;
+
+	/* Skip the endpoint */
+	next = port->dev.parent;
+	nport = to_cxl_port(next);
+	dport = port->parent_dport;
+
+	do {
+		struct node_hmem_attrs *hmem_attrs;
+		u64 lat, bw;
+
+		if (!nport->cdat.table)
+			break;
+
+		if (!dev_is_pci(dport->dport))
+			break;
+
+		hmem_attrs = xa_load(&nport->cdat.sslbis_xa, dport->port_id);
+		if (xa_is_err(hmem_attrs))
+			return xa_err(hmem_attrs);
+
+		if (!hmem_attrs) {
+			hmem_attrs = xa_load(&nport->cdat.sslbis_xa, SSLBIS_ANY_PORT);
+			if (xa_is_err(hmem_attrs))
+				return xa_err(hmem_attrs);
+			if (!hmem_attrs)
+				return -ENXIO;
+		}
+
+		bw = hmem_attrs->write_bandwidth;
+		lat = hmem_attrs->write_latency;
+		min_wr_bw = min_t(u64, min_wr_bw, bw);
+		total_wr_lat += lat;
+
+		bw = hmem_attrs->read_bandwidth;
+		lat = hmem_attrs->read_latency;
+		min_rd_bw = min_t(u64, min_rd_bw, bw);
+		total_rd_lat += lat;
+
+		dport = nport->parent_dport;
+		next = next->parent;
+		nport = to_cxl_port(next);
+		switches++;
+	} while (next);
+
+	*wr_bw = min_wr_bw;
+	*wr_lat = total_wr_lat;
+	*rd_bw = min_rd_bw;
+	*rd_lat = total_rd_lat;
+
+	if (!switches)
+		return -ENOENT;
+
+	return rc;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_port_get_switch_qos, CXL);
+
 /* for user tooling to ensure port disable work has completed */
 static ssize_t flush_store(struct bus_type *bus, const char *buf, size_t count)
 {
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index d7c1410a437c..76ccc815134f 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -809,6 +809,8 @@ struct qtg_dsm_output {
 struct qtg_dsm_output *cxl_acpi_evaluate_qtg_dsm(acpi_handle handle,
 						 struct qtg_dsm_input *input);
 acpi_handle cxl_acpi_get_rootdev_handle(struct device *dev);
+int cxl_port_get_switch_qos(struct cxl_port *port, u64 *rd_bw, u64 *rd_lat,
+			    u64 *wr_bw, u64 *wr_lat);
 
 /*
  * Unit test builds overrides this to __weak, find the 'strong' version



^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v4 12/23] cxl: Add helper function that calculate QoS values for PCI path
  2023-04-19 20:21 [PATCH v4 00/23] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (10 preceding siblings ...)
  2023-04-19 20:22 ` [PATCH v4 11/23] cxl: Add helper function that calculates QoS values for switches Dave Jiang
@ 2023-04-19 20:22 ` Dave Jiang
  2023-04-20 12:32   ` Jonathan Cameron
  2023-04-25  0:45   ` Dan Williams
  2023-04-19 20:22 ` [PATCH v4 13/23] ACPI: NUMA: Create enum for memory_target hmem_attrs indexing Dave Jiang
                   ` (10 subsequent siblings)
  22 siblings, 2 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-19 20:22 UTC (permalink / raw)
  To: linux-cxl, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

Calculate the link bandwidth and latency for the PCIe path from the device
to the CXL Host Bridge. This does not include the CDAT data from the device
or the switch(es) in the path.
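
A minimal usage sketch (not from the patch), assuming an endpoint cxl_port;
the bandwidth is reported in MB/s and the summed link latency in picoseconds
(the unit used by cxl_pci_get_latency()):

  /* Sketch: link-level QoS for the PCIe path below the host bridge */
  static int example_downstream_qos(struct cxl_port *ep_port)
  {
          u64 bw, lat;
          int rc;

          rc = cxl_port_get_downstream_qos(ep_port, &bw, &lat);
          if (rc)
                  return rc;

          /* for a direct-attached device this is just the device's own link */
          dev_dbg(&ep_port->dev, "link QoS: %llu MB/s, %llu ps\n", bw, lat);
          return 0;
  }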

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
v4:
- 0-day fix, remove unused var. Fix checking < 0 for unsigned var.
- Rework port hierarchy walk to calculate the latencies correctly
---
 drivers/cxl/core/port.c |   83 +++++++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h       |    2 +
 2 files changed, 85 insertions(+)

diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 770b540d5325..8da437e038b9 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -2002,6 +2002,89 @@ int cxl_port_get_switch_qos(struct cxl_port *port, u64 *rd_bw, u64 *rd_lat,
 }
 EXPORT_SYMBOL_NS_GPL(cxl_port_get_switch_qos, CXL);
 
+/**
+ * cxl_port_get_downstream_qos - retrieve QoS data for the PCIe downstream path
+ * @port: endpoint cxl_port
+ * @bandwidth: writeback value for min bandwidth
+ * @latency: writeback value for total latency
+ *
+ * Return: Errno on failure, 0 on success.
+ */
+int cxl_port_get_downstream_qos(struct cxl_port *port, u64 *bandwidth,
+				u64 *latency)
+{
+	u64 min_bw = ULONG_MAX;
+	struct pci_dev *pdev;
+	struct cxl_port *p;
+	struct device *dev;
+	u64 total_lat = 0;
+	long lat;
+
+	*bandwidth = 0;
+	*latency = 0;
+
+	/* Grab the device that is the PCI device for CXL memdev */
+	dev = port->uport->parent;
+	/* Skip if it's not PCI, most likely a cxl_test device */
+	if (!dev_is_pci(dev))
+		return 0;
+
+	pdev = to_pci_dev(dev);
+	min_bw = pcie_bandwidth_available(pdev, NULL, NULL, NULL);
+	if (min_bw == 0)
+		return -ENXIO;
+
+	/* convert to MB/s from Mb/s */
+	min_bw >>= 3;
+
+	/*
+	 * Walk the cxl_port hierarchy to retrieve the link latencies for
+	 * each of the PCIe segments. The loop will obtain the link latency
+	 * via each of the switch downstream port.
+	 */
+	p = port;
+	do {
+		struct cxl_dport *dport = p->parent_dport;
+		struct device *dport_dev, *uport_dev;
+		struct pci_dev *dport_pdev;
+
+		if (!dport)
+			break;
+
+		dport_dev = dport->dport;
+		if (!dev_is_pci(dport_dev))
+			break;
+
+		p = dport->port;
+		uport_dev = p->uport;
+		if (!dev_is_pci(uport_dev))
+			break;
+
+		dport_pdev = to_pci_dev(dport_dev);
+		pdev = to_pci_dev(uport_dev);
+		lat = cxl_pci_get_latency(dport_pdev);
+		if (lat < 0)
+			return lat;
+
+		total_lat += lat;
+	} while (1);
+
+	/*
+	 * pdev would be either the cxl device if there are no switches, or the
+	 * upstream port of the last switch.
+	 */
+	lat = cxl_pci_get_latency(pdev);
+	if (lat < 0)
+		return lat;
+
+	total_lat += lat;
+	*bandwidth = min_bw;
+	*latency = total_lat;
+
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_port_get_downstream_qos, CXL);
+
 /* for user tooling to ensure port disable work has completed */
 static ssize_t flush_store(struct bus_type *bus, const char *buf, size_t count)
 {
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 76ccc815134f..6a6387a545db 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -811,6 +811,8 @@ struct qtg_dsm_output *cxl_acpi_evaluate_qtg_dsm(acpi_handle handle,
 acpi_handle cxl_acpi_get_rootdev_handle(struct device *dev);
 int cxl_port_get_switch_qos(struct cxl_port *port, u64 *rd_bw, u64 *rd_lat,
 			    u64 *wr_bw, u64 *wr_lat);
+int cxl_port_get_downstream_qos(struct cxl_port *port, u64 *bandwidth,
+				u64 *latency);
 
 /*
  * Unit test builds overrides this to __weak, find the 'strong' version



^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v4 13/23] ACPI: NUMA: Create enum for memory_target hmem_attrs indexing
  2023-04-19 20:21 [PATCH v4 00/23] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (11 preceding siblings ...)
  2023-04-19 20:22 ` [PATCH v4 12/23] cxl: Add helper function that calculate QoS values for PCI path Dave Jiang
@ 2023-04-19 20:22 ` Dave Jiang
  2023-04-19 20:22 ` [PATCH v4 14/23] ACPI: NUMA: Add genport target allocation to the HMAT parsing Dave Jiang
                   ` (9 subsequent siblings)
  22 siblings, 0 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-19 20:22 UTC (permalink / raw)
  To: linux-cxl, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

Create an enum to provide named indexing for the hmem_attrs array. This is in
preparation for adding generic port support which will add a third member
in the array to keep the generic port attributes separate from the memory
attributes.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/acpi/numa/hmat.c |   35 ++++++++++++++++++++++++-----------
 1 file changed, 24 insertions(+), 11 deletions(-)

diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
index bba268ecd802..4911b7b9e4dd 100644
--- a/drivers/acpi/numa/hmat.c
+++ b/drivers/acpi/numa/hmat.c
@@ -57,12 +57,18 @@ struct target_cache {
 	struct node_cache_attrs cache_attrs;
 };
 
+enum {
+	NODE_ACCESS_CLASS_0 = 0,
+	NODE_ACCESS_CLASS_1,
+	NODE_ACCESS_CLASS_MAX,
+};
+
 struct memory_target {
 	struct list_head node;
 	unsigned int memory_pxm;
 	unsigned int processor_pxm;
 	struct resource memregions;
-	struct node_hmem_attrs hmem_attrs[2];
+	struct node_hmem_attrs hmem_attrs[NODE_ACCESS_CLASS_MAX];
 	struct list_head caches;
 	struct node_cache_attrs cache_attrs;
 	bool registered;
@@ -338,10 +344,12 @@ static __init int hmat_parse_locality(union acpi_subtable_headers *header,
 			if (mem_hier == ACPI_HMAT_MEMORY) {
 				target = find_mem_target(targs[targ]);
 				if (target && target->processor_pxm == inits[init]) {
-					hmat_update_target_access(target, type, value, 0);
+					hmat_update_target_access(target, type, value,
+								  NODE_ACCESS_CLASS_0);
 					/* If the node has a CPU, update access 1 */
 					if (node_state(pxm_to_node(inits[init]), N_CPU))
-						hmat_update_target_access(target, type, value, 1);
+						hmat_update_target_access(target, type, value,
+									  NODE_ACCESS_CLASS_1);
 				}
 			}
 		}
@@ -600,10 +608,12 @@ static void hmat_register_target_initiators(struct memory_target *target)
 	 */
 	if (target->processor_pxm != PXM_INVAL) {
 		cpu_nid = pxm_to_node(target->processor_pxm);
-		register_memory_node_under_compute_node(mem_nid, cpu_nid, 0);
+		register_memory_node_under_compute_node(mem_nid, cpu_nid,
+							NODE_ACCESS_CLASS_0);
 		access0done = true;
 		if (node_state(cpu_nid, N_CPU)) {
-			register_memory_node_under_compute_node(mem_nid, cpu_nid, 1);
+			register_memory_node_under_compute_node(mem_nid, cpu_nid,
+								NODE_ACCESS_CLASS_1);
 			return;
 		}
 	}
@@ -644,12 +654,13 @@ static void hmat_register_target_initiators(struct memory_target *target)
 			}
 			if (best)
 				hmat_update_target_access(target, loc->hmat_loc->data_type,
-							  best, 0);
+							  best, NODE_ACCESS_CLASS_0);
 		}
 
 		for_each_set_bit(i, p_nodes, MAX_NUMNODES) {
 			cpu_nid = pxm_to_node(i);
-			register_memory_node_under_compute_node(mem_nid, cpu_nid, 0);
+			register_memory_node_under_compute_node(mem_nid, cpu_nid,
+								NODE_ACCESS_CLASS_0);
 		}
 	}
 
@@ -681,11 +692,13 @@ static void hmat_register_target_initiators(struct memory_target *target)
 				clear_bit(initiator->processor_pxm, p_nodes);
 		}
 		if (best)
-			hmat_update_target_access(target, loc->hmat_loc->data_type, best, 1);
+			hmat_update_target_access(target, loc->hmat_loc->data_type, best,
+						  NODE_ACCESS_CLASS_1);
 	}
 	for_each_set_bit(i, p_nodes, MAX_NUMNODES) {
 		cpu_nid = pxm_to_node(i);
-		register_memory_node_under_compute_node(mem_nid, cpu_nid, 1);
+		register_memory_node_under_compute_node(mem_nid, cpu_nid,
+							NODE_ACCESS_CLASS_1);
 	}
 }
 
@@ -746,8 +759,8 @@ static void hmat_register_target(struct memory_target *target)
 	if (!target->registered) {
 		hmat_register_target_initiators(target);
 		hmat_register_target_cache(target);
-		hmat_register_target_perf(target, 0);
-		hmat_register_target_perf(target, 1);
+		hmat_register_target_perf(target, NODE_ACCESS_CLASS_0);
+		hmat_register_target_perf(target, NODE_ACCESS_CLASS_1);
 		target->registered = true;
 	}
 	mutex_unlock(&target_lock);



^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v4 14/23] ACPI: NUMA: Add genport target allocation to the HMAT parsing
  2023-04-19 20:21 [PATCH v4 00/23] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (12 preceding siblings ...)
  2023-04-19 20:22 ` [PATCH v4 13/23] ACPI: NUMA: Create enum for memory_target hmem_attrs indexing Dave Jiang
@ 2023-04-19 20:22 ` Dave Jiang
  2023-04-19 20:22 ` [PATCH v4 15/23] ACPI: NUMA: Add setting of generic port locality attributes Dave Jiang
                   ` (8 subsequent siblings)
  22 siblings, 0 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-19 20:22 UTC (permalink / raw)
  To: linux-cxl, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

Add SRAT parsing for the HMAT init in order to collect the device handle
from the Generic Port Affinity Structure. The device handle will serve as
the key to search for target data.
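
For orientation only (not from this patch): a later patch in the series
composes the 16-byte handle for an ACPI device from the _HID and _UID
strings, so a hypothetical ACPI0016 host bridge key could be sketched as:

  /* Sketch: hypothetical generic port device handle for _HID "ACPI0016", _UID "0" */
  static void example_build_genport_handle(u8 *handle)
  {
          memset(handle, 0, ACPI_SRAT_DEVICE_HANDLE_SIZE);
          memcpy(handle, "ACPI0016", 8);  /* bytes 0-7: _HID */
          memcpy(&handle[8], "0", 1);     /* bytes 8-11: _UID */
  }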

Signed-off-by: Dave Jiang <dave.jiang@intel.com>

---
v3:
- Refactored to share common code
---
 drivers/acpi/numa/hmat.c |   52 +++++++++++++++++++++++++++++++++++++++++++---
 include/acpi/actbl3.h    |    2 ++
 2 files changed, 51 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
index 4911b7b9e4dd..d11b4231ae92 100644
--- a/drivers/acpi/numa/hmat.c
+++ b/drivers/acpi/numa/hmat.c
@@ -71,6 +71,7 @@ struct memory_target {
 	struct node_hmem_attrs hmem_attrs[NODE_ACCESS_CLASS_MAX];
 	struct list_head caches;
 	struct node_cache_attrs cache_attrs;
+	u8 device_handle[ACPI_SRAT_DEVICE_HANDLE_SIZE];
 	bool registered;
 };
 
@@ -125,8 +126,7 @@ static __init void alloc_memory_initiator(unsigned int cpu_pxm)
 	list_add_tail(&initiator->node, &initiators);
 }
 
-static __init void alloc_memory_target(unsigned int mem_pxm,
-		resource_size_t start, resource_size_t len)
+static __init struct memory_target *alloc_target(unsigned int mem_pxm)
 {
 	struct memory_target *target;
 
@@ -134,7 +134,7 @@ static __init void alloc_memory_target(unsigned int mem_pxm,
 	if (!target) {
 		target = kzalloc(sizeof(*target), GFP_KERNEL);
 		if (!target)
-			return;
+			return NULL;
 		target->memory_pxm = mem_pxm;
 		target->processor_pxm = PXM_INVAL;
 		target->memregions = (struct resource) {
@@ -147,6 +147,18 @@ static __init void alloc_memory_target(unsigned int mem_pxm,
 		INIT_LIST_HEAD(&target->caches);
 	}
 
+	return target;
+}
+
+static __init void alloc_memory_target(unsigned int mem_pxm,
+		resource_size_t start, resource_size_t len)
+{
+	struct memory_target *target;
+
+	target = alloc_target(mem_pxm);
+	if (!target)
+		return;
+
 	/*
 	 * There are potentially multiple ranges per PXM, so record each
 	 * in the per-target memregions resource tree.
@@ -157,6 +169,17 @@ static __init void alloc_memory_target(unsigned int mem_pxm,
 				start, start + len, mem_pxm);
 }
 
+static __init void alloc_genport_target(unsigned int mem_pxm, u8 *handle)
+{
+	struct memory_target *target;
+
+	target = alloc_target(mem_pxm);
+	if (!target)
+		return;
+
+	memcpy(target->device_handle, handle, ACPI_SRAT_DEVICE_HANDLE_SIZE);
+}
+
 static __init const char *hmat_data_type(u8 type)
 {
 	switch (type) {
@@ -498,6 +521,22 @@ static __init int srat_parse_mem_affinity(union acpi_subtable_headers *header,
 	return 0;
 }
 
+static __init int srat_parse_genport_affinity(union acpi_subtable_headers *header,
+					      const unsigned long end)
+{
+	struct acpi_srat_generic_affinity *ga = (void *)header;
+
+	if (!ga)
+		return -EINVAL;
+
+	if (!(ga->flags & ACPI_SRAT_GENERIC_AFFINITY_ENABLED))
+		return 0;
+
+	alloc_genport_target(ga->proximity_domain, (u8 *)ga->device_handle);
+
+	return 0;
+}
+
 static u32 hmat_initiator_perf(struct memory_target *target,
 			       struct memory_initiator *initiator,
 			       struct acpi_hmat_locality *hmat_loc)
@@ -848,6 +887,13 @@ static __init int hmat_init(void)
 				ACPI_SRAT_TYPE_MEMORY_AFFINITY,
 				srat_parse_mem_affinity, 0) < 0)
 		goto out_put;
+
+	if (acpi_table_parse_entries(ACPI_SIG_SRAT,
+				     sizeof(struct acpi_table_srat),
+				     ACPI_SRAT_TYPE_GENERIC_PORT_AFFINITY,
+				     srat_parse_genport_affinity, 0) < 0)
+		goto out_put;
+
 	acpi_put_table(tbl);
 
 	status = acpi_get_table(ACPI_SIG_HMAT, 0, &tbl);
diff --git a/include/acpi/actbl3.h b/include/acpi/actbl3.h
index 832c6464f063..0daf5a94f08a 100644
--- a/include/acpi/actbl3.h
+++ b/include/acpi/actbl3.h
@@ -289,6 +289,8 @@ struct acpi_srat_generic_affinity {
 	u32 reserved1;
 };
 
+#define ACPI_SRAT_DEVICE_HANDLE_SIZE		16
+
 /* Flags for struct acpi_srat_generic_affinity */
 
 #define ACPI_SRAT_GENERIC_AFFINITY_ENABLED     (1)	/* 00: Use affinity structure */



^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v4 15/23] ACPI: NUMA: Add setting of generic port locality attributes
  2023-04-19 20:21 [PATCH v4 00/23] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (13 preceding siblings ...)
  2023-04-19 20:22 ` [PATCH v4 14/23] ACPI: NUMA: Add genport target allocation to the HMAT parsing Dave Jiang
@ 2023-04-19 20:22 ` Dave Jiang
  2023-04-19 20:22 ` [PATCH v4 16/23] ACPI: NUMA: Add helper function to retrieve the performance attributes Dave Jiang
                   ` (7 subsequent siblings)
  22 siblings, 0 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-19 20:22 UTC (permalink / raw)
  To: linux-cxl, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

Add generic port support for the parsing of the HMAT locality sub-table. The
attributes will be added to the third array member of hmem_attrs in order to
not mix with the existing memory attributes. This entry only provides the
locality attributes from the initiator to the generic port targets and is
missing the rest of the data from the actual memory device.

The actual memory attributes will be updated when a memory device is
attached and the locality information is calculated end to end.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/acpi/numa/hmat.c |    7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
index d11b4231ae92..ad0cf21700a1 100644
--- a/drivers/acpi/numa/hmat.c
+++ b/drivers/acpi/numa/hmat.c
@@ -60,6 +60,7 @@ struct target_cache {
 enum {
 	NODE_ACCESS_CLASS_0 = 0,
 	NODE_ACCESS_CLASS_1,
+	NODE_ACCESS_CLASS_GENPORT,
 	NODE_ACCESS_CLASS_MAX,
 };
 
@@ -367,6 +368,12 @@ static __init int hmat_parse_locality(union acpi_subtable_headers *header,
 			if (mem_hier == ACPI_HMAT_MEMORY) {
 				target = find_mem_target(targs[targ]);
 				if (target && target->processor_pxm == inits[init]) {
+					if (*(target->device_handle)) {
+						hmat_update_target_access(target, type, value,
+								NODE_ACCESS_CLASS_GENPORT);
+						continue;
+					}
+
 					hmat_update_target_access(target, type, value,
 								  NODE_ACCESS_CLASS_0);
 					/* If the node has a CPU, update access 1 */



^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v4 16/23] ACPI: NUMA: Add helper function to retrieve the performance attributes
  2023-04-19 20:21 [PATCH v4 00/23] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (14 preceding siblings ...)
  2023-04-19 20:22 ` [PATCH v4 15/23] ACPI: NUMA: Add setting of generic port locality attributes Dave Jiang
@ 2023-04-19 20:22 ` Dave Jiang
  2023-04-19 20:22 ` [PATCH v4 17/23] cxl: Add helper function to retrieve generic port QoS Dave Jiang
                   ` (6 subsequent siblings)
  22 siblings, 0 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-19 20:22 UTC (permalink / raw)
  To: linux-cxl, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

Add a helper to retrieve the performance attributes based on the device
handle. The helper function is exported so the CXL driver can use it
to acquire the performance data between the CPU and the CXL host bridge.
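
A minimal usage sketch (not from the patch) of the exported helper; the
wrapper name is illustrative and the handle is assumed to be a 16-byte
generic port device handle as parsed from the SRAT:

  /* Sketch: fetch the read latency/bandwidth recorded for a generic port */
  static int example_genport_perf(u8 *handle, u64 *lat, u64 *bw)
  {
          int rc;

          rc = acpi_get_genport_attrs(handle, lat, ACPI_HMAT_READ_LATENCY);
          if (rc)
                  return rc;

          return acpi_get_genport_attrs(handle, bw, ACPI_HMAT_READ_BANDWIDTH);
  }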

Signed-off-by: Dave Jiang <dave.jiang@intel.com>

---
v3:
- Fix 0-day report on extra ';'
---
 drivers/acpi/numa/hmat.c |   44 ++++++++++++++++++++++++++++++++++++++++++++
 include/linux/acpi.h     |    6 ++++++
 2 files changed, 50 insertions(+)

diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
index ad0cf21700a1..768df2f3e6bc 100644
--- a/drivers/acpi/numa/hmat.c
+++ b/drivers/acpi/numa/hmat.c
@@ -107,6 +107,50 @@ static struct memory_target *find_mem_target(unsigned int mem_pxm)
 	return NULL;
 }
 
+static struct memory_target *acpi_find_genport_target(u8 *device_handle)
+{
+	struct memory_target *target;
+
+	list_for_each_entry(target, &targets, node) {
+		if (!memcmp(target->device_handle, device_handle,
+			    ACPI_SRAT_DEVICE_HANDLE_SIZE))
+			return target;
+	}
+
+	return NULL;
+}
+
+int acpi_get_genport_attrs(u8 *device_handle, u64 *val, int type)
+{
+	struct memory_target *target;
+
+	target = acpi_find_genport_target(device_handle);
+	if (!target)
+		return -ENOENT;
+
+	switch (type) {
+	case ACPI_HMAT_ACCESS_LATENCY:
+	case ACPI_HMAT_READ_LATENCY:
+		*val = target->hmem_attrs[NODE_ACCESS_CLASS_GENPORT].read_latency;
+		break;
+	case ACPI_HMAT_WRITE_LATENCY:
+		*val = target->hmem_attrs[NODE_ACCESS_CLASS_GENPORT].write_latency;
+		break;
+	case ACPI_HMAT_ACCESS_BANDWIDTH:
+	case ACPI_HMAT_READ_BANDWIDTH:
+		*val = target->hmem_attrs[NODE_ACCESS_CLASS_GENPORT].read_bandwidth;
+		break;
+	case ACPI_HMAT_WRITE_BANDWIDTH:
+		*val = target->hmem_attrs[NODE_ACCESS_CLASS_GENPORT].write_bandwidth;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(acpi_get_genport_attrs);
+
 static __init void alloc_memory_initiator(unsigned int cpu_pxm)
 {
 	struct memory_initiator *initiator;
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index efff750f326d..2dc49ef3e28a 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -451,6 +451,7 @@ extern bool acpi_osi_is_win8(void);
 #ifdef CONFIG_ACPI_NUMA
 int acpi_map_pxm_to_node(int pxm);
 int acpi_get_node(acpi_handle handle);
+int acpi_get_genport_attrs(u8 *device_handle, u64 *val, int type);
 
 /**
  * pxm_to_online_node - Map proximity ID to online node
@@ -485,6 +486,11 @@ static inline int acpi_get_node(acpi_handle handle)
 {
 	return 0;
 }
+
+static inline int acpi_get_genport_attrs(u8 *device_handle, u64 *val, int type)
+{
+	return -EOPNOTSUPP;
+}
 #endif
 extern int acpi_paddr_to_node(u64 start_addr, u64 size);
 



^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v4 17/23] cxl: Add helper function to retrieve generic port QoS
  2023-04-19 20:21 [PATCH v4 00/23] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (15 preceding siblings ...)
  2023-04-19 20:22 ` [PATCH v4 16/23] ACPI: NUMA: Add helper function to retrieve the performance attributes Dave Jiang
@ 2023-04-19 20:22 ` Dave Jiang
  2023-04-19 20:22 ` [PATCH v4 18/23] cxl: Add latency and bandwidth calculations for the CXL path Dave Jiang
                   ` (5 subsequent siblings)
  22 siblings, 0 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-19 20:22 UTC (permalink / raw)
  To: linux-cxl, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

Add a CXL helper function that retrieves the bandwidth and latency data of a
generic port by calling the acpi_get_genport_attrs() function. A device handle
constructed from the ACPI HID and UID of the CXL host bridge (ACPI0016)
device is passed in.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/cxl/core/acpi.c |   30 ++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h       |    1 +
 2 files changed, 31 insertions(+)

diff --git a/drivers/cxl/core/acpi.c b/drivers/cxl/core/acpi.c
index 191644d0ca6d..41eeaa8c272e 100644
--- a/drivers/cxl/core/acpi.c
+++ b/drivers/cxl/core/acpi.c
@@ -148,3 +148,33 @@ struct qtg_dsm_output *cxl_acpi_evaluate_qtg_dsm(acpi_handle handle,
 	return ERR_PTR(rc);
 }
 EXPORT_SYMBOL_NS_GPL(cxl_acpi_evaluate_qtg_dsm, CXL);
+
+/**
+ * cxl_acpi_get_hb_qos - retrieve QoS data for generic port
+ * @host: 'struct device' of the CXL host bridge
+ * @latency: genport latency data
+ * @bandwidth: genport bandwidth data
+ *
+ * Return: Errno on failure, 0 on success.
+ */
+int cxl_acpi_get_hb_qos(struct device *host, u64 *latency, u64 *bandwidth)
+{
+	u8 handle[ACPI_SRAT_DEVICE_HANDLE_SIZE] = { 0 };
+	struct acpi_device *adev = ACPI_COMPANION(host);
+	int rc;
+
+	/* ACPI spec 6.5 Table 5.65 */
+	memcpy(handle, acpi_device_hid(adev), 8);
+	memcpy(&handle[8], acpi_device_uid(adev), 4);
+
+	rc = acpi_get_genport_attrs(handle, latency, ACPI_HMAT_ACCESS_LATENCY);
+	if (rc)
+		return rc;
+
+	rc = acpi_get_genport_attrs(handle, bandwidth, ACPI_HMAT_ACCESS_BANDWIDTH);
+	if (rc)
+		return rc;
+
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_acpi_get_hb_qos, CXL);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 6a6387a545db..f9b9ce2e1647 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -813,6 +813,7 @@ int cxl_port_get_switch_qos(struct cxl_port *port, u64 *rd_bw, u64 *rd_lat,
 			    u64 *wr_bw, u64 *wr_lat);
 int cxl_port_get_downstream_qos(struct cxl_port *port, u64 *bandwidth,
 				u64 *latency);
+int cxl_acpi_get_hb_qos(struct device *host, u64 *latency, u64 *bandwidth);
 
 /*
  * Unit test builds overrides this to __weak, find the 'strong' version



^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v4 18/23] cxl: Add latency and bandwidth calculations for the CXL path
  2023-04-19 20:21 [PATCH v4 00/23] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (16 preceding siblings ...)
  2023-04-19 20:22 ` [PATCH v4 17/23] cxl: Add helper function to retrieve generic port QoS Dave Jiang
@ 2023-04-19 20:22 ` Dave Jiang
  2023-04-19 20:22 ` [PATCH v4 19/23] cxl: Wait Memory_Info_Valid before access memory related info Dave Jiang
                   ` (4 subsequent siblings)
  22 siblings, 0 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-19 20:22 UTC (permalink / raw)
  To: linux-cxl, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

CXL Memory Device SW Guide rev1.0 2.11.2 provides instruction on how to
calculate latency and bandwidth for a CXL memory device. Calculate the minimum
bandwidth and total latency for the path from the CXL device to the root
port. The retrieved QTG ID is stored with the DSMAS entry for the CXL device.
A worked example with made-up numbers follows the formulas below.

For example for a device that is directly attached to a host bus:
Total Latency = Device Latency (from CDAT) + Dev to Host Bus (HB) Link
		Latency + Generic Port Latency
Min Bandwidth = Min bandwidth for link bandwidth between HB
		and CXL device, device CDAT bandwidth, and Generic Port
		Bandwidth

For a device that has a switch in between host bus and CXL device:
Total Latency = Device (CDAT) Latency + Dev to Switch Link Latency +
		Switch (CDAT) Latency + Switch to HB Link Latency +
		Generic Port Latency
Min Bandwidth = Min bandwidth for link bandwidth between CXL device
		to CXL switch, CXL device CDAT bandwidth, CXL switch CDAT
		bandwidth, CXL switch to HB bandwidth, and Generic Port
		Bandwidth.
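
A made-up worked example (values are illustrative only and not from any real
device; all latencies are assumed to be in the same unit):

  /* Sketch: combining the QoS contributions for a direct-attached device */
  u64 dev_lat = 150, link_lat = 17, genport_lat = 100;
  u64 dev_bw = 8000, link_bw = 4000, genport_bw = 16000;  /* MB/s */
  u64 total_lat = dev_lat + link_lat + genport_lat;       /* 267 */
  u64 min_bw = min3(dev_bw, link_bw, genport_bw);         /* 4000 MB/s */

These are the values that would then be handed to the QTG _DSM as the latency
and bandwidth inputs.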

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/cxl/cxlpci.h |    1 +
 drivers/cxl/port.c   |   61 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 62 insertions(+)

diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
index 795eba31fe29..ff4c2d10ca4a 100644
--- a/drivers/cxl/cxlpci.h
+++ b/drivers/cxl/cxlpci.h
@@ -120,6 +120,7 @@ struct dsmas_entry {
 	struct range dpa_range;
 	u8 handle;
 	struct node_hmem_attrs hmem_attrs;
+	u16 qtg_id;
 };
 
 /* Sub-table 0: Device Scoped Memory Affinity Structure (DSMAS) */
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index 2d5b9ba13429..8e6e49ca8c7d 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -67,6 +67,63 @@ static void dsmas_list_destroy(struct list_head *dsmas_list)
 	}
 }
 
+static int cxl_port_qos_calculate(struct cxl_port *port,
+				  struct list_head *dsmas_list)
+{
+	u64 sw_wr_bw, sw_wr_lat, sw_rd_bw, sw_rd_lat;
+	u64 min_rd_bw, total_rd_lat, min_wr_bw, total_wr_lat;
+	struct qtg_dsm_output *output;
+	struct qtg_dsm_input input;
+	struct dsmas_entry *dent;
+	acpi_handle handle;
+	u64 gp_bw, gp_lat;
+	u64 ds_bw, ds_lat;
+	int rc;
+
+	rc = cxl_port_get_downstream_qos(port, &ds_bw, &ds_lat);
+	if (rc)
+		return rc;
+
+	rc = cxl_port_get_switch_qos(port, &sw_rd_bw, &sw_rd_lat,
+				     &sw_wr_bw, &sw_wr_lat);
+	if (rc && rc != -ENOENT)
+		return rc;
+
+	rc = cxl_acpi_get_hb_qos(port->host_bridge, &gp_lat, &gp_bw);
+	if (rc)
+		return rc;
+
+	min_rd_bw = min_t(u64, ds_bw, sw_rd_bw);
+	min_rd_bw = min_t(u64, gp_bw, min_rd_bw);
+	total_rd_lat = ds_lat + gp_lat + sw_rd_lat;
+
+	min_wr_bw = min_t(u64, ds_bw, sw_wr_bw);
+	min_wr_bw = min_t(u64, gp_bw, min_wr_bw);
+	total_wr_lat = ds_lat + gp_lat + sw_wr_lat;
+
+	handle = cxl_acpi_get_rootdev_handle(&port->dev);
+	if (IS_ERR(handle))
+		return PTR_ERR(handle);
+
+	list_for_each_entry(dent, dsmas_list, list) {
+		input.rd_lat = dent->hmem_attrs.read_latency + total_rd_lat;
+		input.wr_lat = dent->hmem_attrs.write_latency + total_wr_lat;
+		input.rd_bw = min_t(int, min_rd_bw,
+				    dent->hmem_attrs.read_bandwidth);
+		input.wr_bw = min_t(int, min_wr_bw,
+				    dent->hmem_attrs.write_bandwidth);
+
+		output = cxl_acpi_evaluate_qtg_dsm(handle, &input);
+		if (IS_ERR(output))
+			continue;
+
+		dent->qtg_id = output->qtg_ids[0];
+		kfree(output);
+	}
+
+	return 0;
+}
+
 static int cxl_switch_port_probe(struct cxl_port *port)
 {
 	struct cxl_hdm *cxlhdm;
@@ -165,6 +222,10 @@ static int cxl_endpoint_port_probe(struct cxl_port *port)
 			dev_warn(&port->dev, "Failed to parse DSMAS: %d\n", rc);
 		}
 
+		rc = cxl_port_qos_calculate(port, &dsmas_list);
+		if (rc)
+			dev_dbg(&port->dev, "Failed to do QoS calculations\n");
+
 		dsmas_list_destroy(&dsmas_list);
 	}
 



^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v4 19/23] cxl: Wait Memory_Info_Valid before access memory related info
  2023-04-19 20:21 [PATCH v4 00/23] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (17 preceding siblings ...)
  2023-04-19 20:22 ` [PATCH v4 18/23] cxl: Add latency and bandwidth calculations for the CXL path Dave Jiang
@ 2023-04-19 20:22 ` Dave Jiang
  2023-04-19 20:23 ` [PATCH v4 20/23] cxl: Move identify and partition query from pci probe to port probe Dave Jiang
                   ` (3 subsequent siblings)
  22 siblings, 0 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-19 20:22 UTC (permalink / raw)
  To: linux-cxl, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

CXL rev3.0 8.1.3.8.2 Memory_Info_valid field

The Memory_Info_Valid bit indicates that the CXL Range Size High and Size
Low registers are valid. The bit must be set within 1 second of reset
deassertion to the device. Check the valid bit before checking the
Memory_Active bit in cxl_await_media_ready() to ensure that the memory
info is valid for consumption.

Fixes: 2e4ba0ec9783 ("cxl/pci: Move cxl_await_media_ready() to the core")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>

---
v2:
- Check both ranges. (Jonathan)
---
 drivers/cxl/core/pci.c |   83 +++++++++++++++++++++++++++++++++++++++++++-----
 drivers/cxl/cxlpci.h   |    2 +
 2 files changed, 77 insertions(+), 8 deletions(-)

diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index bb58296b3e56..84d6b1472a92 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -101,21 +101,55 @@ int devm_cxl_port_enumerate_dports(struct cxl_port *port)
 }
 EXPORT_SYMBOL_NS_GPL(devm_cxl_port_enumerate_dports, CXL);
 
-/*
- * Wait up to @media_ready_timeout for the device to report memory
- * active.
- */
-int cxl_await_media_ready(struct cxl_dev_state *cxlds)
+static int cxl_dvsec_mem_range_valid(struct cxl_dev_state *cxlds, int id)
+{
+	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
+	int d = cxlds->cxl_dvsec;
+	bool valid = false;
+	int rc, i;
+	u32 temp;
+
+	if (id > CXL_DVSEC_RANGE_MAX)
+		return -EINVAL;
+
+	/* Check MEM INFO VALID bit first, give up after 1s */
+	i = 1;
+	do {
+		rc = pci_read_config_dword(pdev,
+					   d + CXL_DVSEC_RANGE_SIZE_LOW(id),
+					   &temp);
+		if (rc)
+			return rc;
+
+		valid = FIELD_GET(CXL_DVSEC_MEM_INFO_VALID, temp);
+		if (valid)
+			break;
+		msleep(1000);
+	} while (i--);
+
+	if (!valid) {
+		dev_err(&pdev->dev,
+			"Timeout awaiting memory range %d valid after 1s.\n",
+			id);
+		return -ETIMEDOUT;
+	}
+
+	return 0;
+}
+
+static int cxl_dvsec_mem_range_active(struct cxl_dev_state *cxlds, int id)
 {
 	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
 	int d = cxlds->cxl_dvsec;
 	bool active = false;
-	u64 md_status;
 	int rc, i;
+	u32 temp;
 
-	for (i = media_ready_timeout; i; i--) {
-		u32 temp;
+	if (id > CXL_DVSEC_RANGE_MAX)
+		return -EINVAL;
 
+	/* Check MEM ACTIVE bit, up to 60s timeout by default */
+	for (i = media_ready_timeout; i; i--) {
 		rc = pci_read_config_dword(
 			pdev, d + CXL_DVSEC_RANGE_SIZE_LOW(0), &temp);
 		if (rc)
@@ -134,6 +168,39 @@ int cxl_await_media_ready(struct cxl_dev_state *cxlds)
 		return -ETIMEDOUT;
 	}
 
+	return 0;
+}
+
+/*
+ * Wait up to @media_ready_timeout for the device to report memory
+ * active.
+ */
+int cxl_await_media_ready(struct cxl_dev_state *cxlds)
+{
+	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
+	int d = cxlds->cxl_dvsec;
+	int rc, i, hdm_count;
+	u64 md_status;
+	u16 cap;
+
+	rc = pci_read_config_word(pdev,
+				  d + CXL_DVSEC_CAP_OFFSET, &cap);
+	if (rc)
+		return rc;
+
+	hdm_count = FIELD_GET(CXL_DVSEC_HDM_COUNT_MASK, cap);
+	for (i = 0; i < hdm_count; i++) {
+		rc = cxl_dvsec_mem_range_valid(cxlds, i);
+		if (rc)
+			return rc;
+	}
+
+	for (i = 0; i < hdm_count; i++) {
+		rc = cxl_dvsec_mem_range_active(cxlds, i);
+		if (rc)
+			return rc;
+	}
+
 	md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
 	if (!CXLMDEV_READY(md_status))
 		return -EIO;
diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
index ff4c2d10ca4a..cca86a76bd75 100644
--- a/drivers/cxl/cxlpci.h
+++ b/drivers/cxl/cxlpci.h
@@ -32,6 +32,8 @@
 #define   CXL_DVSEC_RANGE_BASE_LOW(i)	(0x24 + (i * 0x10))
 #define     CXL_DVSEC_MEM_BASE_LOW_MASK	GENMASK(31, 28)
 
+#define CXL_DVSEC_RANGE_MAX		2
+
 /* CXL 2.0 8.1.4: Non-CXL Function Map DVSEC */
 #define CXL_DVSEC_FUNCTION_MAP					2
 



^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v4 20/23] cxl: Move identify and partition query from pci probe to port probe
  2023-04-19 20:21 [PATCH v4 00/23] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (18 preceding siblings ...)
  2023-04-19 20:22 ` [PATCH v4 19/23] cxl: Wait Memory_Info_Valid before access memory related info Dave Jiang
@ 2023-04-19 20:23 ` Dave Jiang
  2023-04-19 20:23 ` [PATCH v4 21/23] cxl: Store QTG IDs and related info to the CXL memory device context Dave Jiang
                   ` (2 subsequent siblings)
  22 siblings, 0 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-19 20:23 UTC (permalink / raw)
  To: linux-cxl, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

Move the enumeration of device capacity to cxl_port_probe() from
cxl_pci_probe(). The size and capacity information should be read
after cxl_await_media_ready() so the data is valid.

Fixes: 5e2411ae8071 ("cxl/memdev: Change cxl_mem to a more descriptive name")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/cxl/pci.c  |    8 --------
 drivers/cxl/port.c |    8 ++++++++
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index ed39d133b70d..06324266eae8 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -707,14 +707,6 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (rc)
 		return rc;
 
-	rc = cxl_dev_state_identify(cxlds);
-	if (rc)
-		return rc;
-
-	rc = cxl_mem_create_range_info(cxlds);
-	if (rc)
-		return rc;
-
 	rc = cxl_alloc_irq_vectors(pdev);
 	if (rc)
 		return rc;
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index 8e6e49ca8c7d..82c24a4c85a2 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -187,6 +187,14 @@ static int cxl_endpoint_port_probe(struct cxl_port *port)
 		return rc;
 	}
 
+	rc = cxl_dev_state_identify(cxlds);
+	if (rc)
+		return rc;
+
+	rc = cxl_mem_create_range_info(cxlds);
+	if (rc)
+		return rc;
+
 	rc = devm_cxl_enumerate_decoders(cxlhdm, &info);
 	if (rc)
 		return rc;



^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v4 21/23] cxl: Store QTG IDs and related info to the CXL memory device context
  2023-04-19 20:21 [PATCH v4 00/23] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (19 preceding siblings ...)
  2023-04-19 20:23 ` [PATCH v4 20/23] cxl: Move identify and partition query from pci probe to port probe Dave Jiang
@ 2023-04-19 20:23 ` Dave Jiang
  2023-04-19 20:23 ` [PATCH v4 22/23] cxl: Export sysfs attributes for memory device QTG ID Dave Jiang
  2023-04-19 20:23 ` [PATCH v4 23/23] cxl/mem: Add debugfs output for QTG related data Dave Jiang
  22 siblings, 0 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-19 20:23 UTC (permalink / raw)
  To: linux-cxl, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

Once the QTG ID _DSM is executed successfully, the QTG ID is retrieved from
the return package. Create a list of entries in the cxl_dev_state context and
store the QTG ID and the associated DPA range. This information can be
exposed to user space via sysfs in order to help region setup for
hot-plugged CXL memory devices.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>

---
v4:
- Remove unused qos_list from cxl_md
v3:
- Move back to QTG ID per partition
---
 drivers/cxl/core/mbox.c |    3 +++
 drivers/cxl/cxlmem.h    |   21 +++++++++++++++++++++
 drivers/cxl/port.c      |   35 +++++++++++++++++++++++++++++++++++
 3 files changed, 59 insertions(+)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index f2addb457172..0352bd36d47e 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -1120,6 +1120,9 @@ struct cxl_dev_state *cxl_dev_state_create(struct device *dev)
 	mutex_init(&cxlds->mbox_mutex);
 	mutex_init(&cxlds->event.log_lock);
 	cxlds->dev = dev;
+	INIT_LIST_HEAD(&cxlds->qos_list);
+	cxlds->ram_qtg_id = CXL_QTG_ID_INVALID;
+	cxlds->pmem_qtg_id = CXL_QTG_ID_INVALID;
 
 	return cxlds;
 }
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 001dabf0231b..9a23e13ce796 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -5,6 +5,7 @@
 #include <uapi/linux/cxl_mem.h>
 #include <linux/cdev.h>
 #include <linux/uuid.h>
+#include <linux/node.h>
 #include "cxl.h"
 
 /* CXL 2.0 8.2.8.5.1.1 Memory Device Status Register */
@@ -215,6 +216,19 @@ struct cxl_event_state {
 	struct mutex log_lock;
 };
 
+/**
+ * struct qos_prop_entry - QoS property entry
+ * @list: list entry
+ * @dpa_range: DPA address range
+ * @qtg_id: QoS Throttling Group ID
+ */
+struct qos_prop_entry {
+	struct list_head list;
+	struct range dpa_range;
+	u16 qtg_id;
+	struct node_hmem_attrs hmem_attrs;
+};
+
 /**
  * struct cxl_dev_state - The driver device state
  *
@@ -251,6 +265,9 @@ struct cxl_event_state {
  * @serial: PCIe Device Serial Number
  * @event: event log driver state
  * @mbox_send: @dev specific transport for transmitting mailbox commands
+ * @ram_qtg_id: QTG ID for volatile region
+ * @pmem_qtg_id: QTG ID for persistent region
+ * @qos_list: QTG ID related list of entries
  *
  * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
  * details on capacity parameters.
@@ -283,6 +300,10 @@ struct cxl_dev_state {
 	u64 next_volatile_bytes;
 	u64 next_persistent_bytes;
 
+	int ram_qtg_id;
+	int pmem_qtg_id;
+	struct list_head qos_list;
+
 	resource_size_t component_reg_phys;
 	u64 serial;
 
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index 82c24a4c85a2..0bf6ed7a05a8 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -124,6 +124,40 @@ static int cxl_port_qos_calculate(struct cxl_port *port,
 	return 0;
 }
 
+static void cxl_memdev_set_qtg(struct cxl_dev_state *cxlds, struct list_head *dsmas_list)
+{
+	struct range pmem_range = {
+		.start = cxlds->pmem_res.start,
+		.end = cxlds->pmem_res.end,
+	};
+	struct range ram_range = {
+		.start = cxlds->ram_res.start,
+		.end = cxlds->ram_res.end,
+	};
+	struct qos_prop_entry *qos;
+	struct dsmas_entry *dent;
+
+	list_for_each_entry(dent, dsmas_list, list) {
+		qos = devm_kzalloc(cxlds->dev, sizeof(*qos), GFP_KERNEL);
+		if (!qos)
+			return;
+
+		qos->dpa_range = dent->dpa_range;
+		qos->qtg_id = dent->qtg_id;
+		qos->hmem_attrs = dent->hmem_attrs;
+		list_add_tail(&qos->list, &cxlds->qos_list);
+
+		if (resource_size(&cxlds->ram_res) &&
+		    range_contains(&ram_range, &dent->dpa_range) &&
+		    cxlds->ram_qtg_id == CXL_QTG_ID_INVALID)
+			cxlds->ram_qtg_id = dent->qtg_id;
+		else if (resource_size(&cxlds->pmem_res) &&
+			 range_contains(&pmem_range, &dent->dpa_range) &&
+			 cxlds->pmem_qtg_id == CXL_QTG_ID_INVALID)
+			cxlds->pmem_qtg_id = dent->qtg_id;
+	}
+}
+
 static int cxl_switch_port_probe(struct cxl_port *port)
 {
 	struct cxl_hdm *cxlhdm;
@@ -234,6 +268,7 @@ static int cxl_endpoint_port_probe(struct cxl_port *port)
 		if (rc)
 			dev_dbg(&port->dev, "Failed to do QoS calculations\n");
 
+		cxl_memdev_set_qtg(cxlds, &dsmas_list);
 		dsmas_list_destroy(&dsmas_list);
 	}
 



^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v4 22/23] cxl: Export sysfs attributes for memory device QTG ID
  2023-04-19 20:21 [PATCH v4 00/23] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (20 preceding siblings ...)
  2023-04-19 20:23 ` [PATCH v4 21/23] cxl: Store QTG IDs and related info to the CXL memory device context Dave Jiang
@ 2023-04-19 20:23 ` Dave Jiang
  2023-04-19 20:23 ` [PATCH v4 23/23] cxl/mem: Add debugfs output for QTG related data Dave Jiang
  22 siblings, 0 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-19 20:23 UTC (permalink / raw)
  To: linux-cxl, linux-acpi
  Cc: Dan Williams, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, rafael, lukas, Jonathan.Cameron

Export qtg_id sysfs attributes for the CXL memory device. The QTG ID
should show up as /sys/bus/cxl/devices/memX/ram/qtg_id for the volatile
partition and /sys/bus/cxl/devices/memX/pmem/qtg_id for the persistent
partition. The QTG ID is retrieved via _DSM after supplying the
calculated bandwidth and latency for the entire CXL path from device to
the CPU. This ID is used to match up to the root decoder QTG ID to
determine which CFMWS the memory range of a hotplugged CXL mem device
should be assigned under.

While there may be multiple DSMAS entries exported by the device CDAT, the
driver will only expose the first QTG ID per partition in sysfs for now. In
the future, when multiple QTG IDs are necessary, they can be exposed. [1]

[1]: https://lore.kernel.org/linux-cxl/167571650007.587790.10040913293130712882.stgit@djiang5-mobl3.local/T/#md2a47b1ead3e1ba08f50eab29a4af1aed1d215ab

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>

---
v4:
- Change kernel version for documentation to v6.5
v3:
- Expand description of qtg_id. (Alison)
---
 Documentation/ABI/testing/sysfs-bus-cxl |   22 ++++++++++++++++++++++
 drivers/cxl/core/memdev.c               |   26 ++++++++++++++++++++++++++
 2 files changed, 48 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
index bd2b59784979..ff93d61ee23e 100644
--- a/Documentation/ABI/testing/sysfs-bus-cxl
+++ b/Documentation/ABI/testing/sysfs-bus-cxl
@@ -28,6 +28,17 @@ Description:
 		Payload in the CXL-2.0 specification.
 
 
+What:		/sys/bus/cxl/devices/memX/ram/qtg_id
+Date:		January, 2023
+KernelVersion:	v6.5
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		(RO) Shows calculated QoS Throttling Group ID for the
+		"Volatile Only Capacity" DPA range. When creating regions,
+		the qtg_id for the memory range should match the root
+		decoder's qtg_id to have optimal performance.
+
+
 What:		/sys/bus/cxl/devices/memX/pmem/size
 Date:		December, 2020
 KernelVersion:	v5.12
@@ -38,6 +49,17 @@ Description:
 		Payload in the CXL-2.0 specification.
 
 
+What:		/sys/bus/cxl/devices/memX/pmem/qtg_id
+Date:		January, 2023
+KernelVersion:	v6.5
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		(RO) Shows calculated QoS Throttling Group ID for the
+		"Persistent Only Capacity" DPA range. When creating regions,
+		the qtg_id for the memory range should match the root
+		decoder's qtg_id to have optimal performance.
+
+
 What:		/sys/bus/cxl/devices/memX/serial
 Date:		January, 2022
 KernelVersion:	v5.18
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 28a05f2fe32d..04058ec5fcff 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -76,6 +76,18 @@ static ssize_t ram_size_show(struct device *dev, struct device_attribute *attr,
 static struct device_attribute dev_attr_ram_size =
 	__ATTR(size, 0444, ram_size_show, NULL);
 
+static ssize_t ram_qtg_id_show(struct device *dev, struct device_attribute *attr,
+			       char *buf)
+{
+	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+
+	return sysfs_emit(buf, "%d\n", cxlds->ram_qtg_id);
+}
+
+static struct device_attribute dev_attr_ram_qtg_id =
+	__ATTR(qtg_id, 0444, ram_qtg_id_show, NULL);
+
 static ssize_t pmem_size_show(struct device *dev, struct device_attribute *attr,
 			      char *buf)
 {
@@ -89,6 +101,18 @@ static ssize_t pmem_size_show(struct device *dev, struct device_attribute *attr,
 static struct device_attribute dev_attr_pmem_size =
 	__ATTR(size, 0444, pmem_size_show, NULL);
 
+static ssize_t pmem_qtg_id_show(struct device *dev, struct device_attribute *attr,
+				char *buf)
+{
+	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+
+	return sysfs_emit(buf, "%d\n", cxlds->pmem_qtg_id);
+}
+
+static struct device_attribute dev_attr_pmem_qtg_id =
+	__ATTR(qtg_id, 0444, pmem_qtg_id_show, NULL);
+
 static ssize_t serial_show(struct device *dev, struct device_attribute *attr,
 			   char *buf)
 {
@@ -117,11 +141,13 @@ static struct attribute *cxl_memdev_attributes[] = {
 
 static struct attribute *cxl_memdev_pmem_attributes[] = {
 	&dev_attr_pmem_size.attr,
+	&dev_attr_pmem_qtg_id.attr,
 	NULL,
 };
 
 static struct attribute *cxl_memdev_ram_attributes[] = {
 	&dev_attr_ram_size.attr,
+	&dev_attr_ram_qtg_id.attr,
 	NULL,
 };
 



^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v4 23/23] cxl/mem: Add debugfs output for QTG related data
  2023-04-19 20:21 [PATCH v4 00/23] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (21 preceding siblings ...)
  2023-04-19 20:23 ` [PATCH v4 22/23] cxl: Export sysfs attributes for memory device QTG ID Dave Jiang
@ 2023-04-19 20:23 ` Dave Jiang
  22 siblings, 0 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-19 20:23 UTC (permalink / raw)
  To: linux-cxl, linux-acpi
  Cc: Dan Williams, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, rafael, lukas, Jonathan.Cameron

Add debugfs output to /sys/kernel/debug/cxl/memX/qtgmap. The debugfs
attribute will dump out all the DSMAS ranges from the CXL device CDAT and
the QTG ID associated with each range.
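
Given the format used by cxl_mem_qtg_show() in the patch below, the output
for a hypothetical device with two DSMAS ranges might look like (addresses
and IDs are made up):

  00000000-3fffffff : QTG ID 0
  40000000-7fffffff : QTG ID 1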

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>

---
v4:
- Use cxlds->qos_list instead of the stray cxlmd->qos_list
---
 Documentation/ABI/testing/debugfs-cxl |   11 +++++++++++
 drivers/cxl/mem.c                     |   17 +++++++++++++++++
 2 files changed, 28 insertions(+)
 create mode 100644 Documentation/ABI/testing/debugfs-cxl

diff --git a/Documentation/ABI/testing/debugfs-cxl b/Documentation/ABI/testing/debugfs-cxl
new file mode 100644
index 000000000000..0f36eeb7e59b
--- /dev/null
+++ b/Documentation/ABI/testing/debugfs-cxl
@@ -0,0 +1,11 @@
+What:		/sys/kernel/debug/cxl/memX/qtgmap
+Date:		Mar, 2023
+KernelVersion:	v6.4
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		(RO) Entries of all Device Physical Address (DPA) ranges
+		provided by the device Coherent Device Attributes Table (CDAT)
+		Device Scoped Memory Affinity Structure (DSMAS) entries with
+		the matching QoS Throttling Group (QTG) id calculated from the
+		latency and bandwidth of the CXL path from the memory device
+		to the CPU.
diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index 39c4b54f0715..9a23875975eb 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -45,6 +45,22 @@ static int cxl_mem_dpa_show(struct seq_file *file, void *data)
 	return 0;
 }
 
+static int cxl_mem_qtg_show(struct seq_file *file, void *data)
+{
+	struct device *dev = file->private;
+	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	struct qos_prop_entry *qos;
+
+	list_for_each_entry(qos, &cxlds->qos_list, list) {
+		seq_printf(file, "%08llx-%08llx : QTG ID %u\n",
+			   qos->dpa_range.start, qos->dpa_range.end,
+			   qos->qtg_id);
+	}
+
+	return 0;
+}
+
 static int devm_cxl_add_endpoint(struct device *host, struct cxl_memdev *cxlmd,
 				 struct cxl_dport *parent_dport)
 {
@@ -117,6 +133,7 @@ static int cxl_mem_probe(struct device *dev)
 
 	dentry = cxl_debugfs_create_dir(dev_name(dev));
 	debugfs_create_devm_seqfile(dev, "dpamem", dentry, cxl_mem_dpa_show);
+	debugfs_create_devm_seqfile(dev, "qtgmap", dentry, cxl_mem_qtg_show);
 	rc = devm_add_action_or_reset(dev, remove_debugfs, dentry);
 	if (rc)
 		return rc;



^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 01/23] cxl: Export QTG ids from CFMWS to sysfs
  2023-04-19 20:21 ` [PATCH v4 01/23] cxl: Export QTG ids from CFMWS to sysfs Dave Jiang
@ 2023-04-20  8:51   ` Jonathan Cameron
  2023-04-20 20:53     ` Dave Jiang
  2023-04-24 21:46   ` Dan Williams
  1 sibling, 1 reply; 70+ messages in thread
From: Jonathan Cameron @ 2023-04-20  8:51 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-acpi, Ira Weiny, dan.j.williams, vishal.l.verma,
	alison.schofield, rafael, lukas

On Wed, 19 Apr 2023 13:21:07 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> Export the QoS Throttling Group ID from the CXL Fixed Memory Window
> Structure (CFMWS) under the root decoder sysfs attributes.
> CXL rev3.0 9.17.1.3 CXL Fixed Memory Window Structure (CFMWS)
> 
> cxl cli will use this QTG ID to match with the _DSM retrieved QTG ID for a
> hot-plugged CXL memory device DPA memory range to make sure that the DPA range
> is under the right CFMWS window.
> 
> Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> 

Bikeshedding alert: 

Why not just call it qtg?  What does the _id add?

I don't really care either way...

LGTM
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

One (more) completely trivial comment inline.

Jonathan

> ---
> v4:
> - Change kernel version for documentation to v6.5
> v2:
> - Add explanation commit header (Jonathan)
> ---
>  Documentation/ABI/testing/sysfs-bus-cxl |    9 +++++++++
>  drivers/cxl/acpi.c                      |    3 +++
>  drivers/cxl/core/port.c                 |   14 ++++++++++++++
>  drivers/cxl/cxl.h                       |    3 +++
>  4 files changed, 29 insertions(+)
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
> index 3acf2f17a73f..bd2b59784979 100644
> --- a/Documentation/ABI/testing/sysfs-bus-cxl
> +++ b/Documentation/ABI/testing/sysfs-bus-cxl
> @@ -309,6 +309,15 @@ Description:
>  		(WO) Write a string in the form 'regionZ' to delete that region,
>  		provided it is currently idle / not bound to a driver.
>  
> +What:		/sys/bus/cxl/devices/decoderX.Y/qtg_id
> +Date:		Jan, 2023
> +KernelVersion:	v6.5
> +Contact:	linux-cxl@vger.kernel.org
> +Description:
> +		(RO) Shows the QoS Throttling Group ID. The QTG ID for a root
> +		decoder comes from the CFMWS structure of the CEDT. A value of
> +		-1 indicates that no QTG ID was retrieved. The QTG ID is used as
> +		guidance to match against the QTG ID of a hot-plugged device.
>  
>  What:		/sys/bus/cxl/devices/regionZ/uuid
>  Date:		May, 2022
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index 7e1765b09e04..abc24137c291 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -289,6 +289,9 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
>  			}
>  		}
>  	}
> +
> +	cxld->qtg_id = cfmws->qtg_id;
> +
>  	rc = cxl_decoder_add(cxld, target_map);
>  err_xormap:
>  	if (rc)
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 4d1f9c5b5029..024d4178f557 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -276,6 +276,16 @@ static ssize_t interleave_ways_show(struct device *dev,
>  
>  static DEVICE_ATTR_RO(interleave_ways);
>  
> +static ssize_t qtg_id_show(struct device *dev,
> +			   struct device_attribute *attr, char *buf)
> +{
> +	struct cxl_decoder *cxld = to_cxl_decoder(dev);
> +
> +	return sysfs_emit(buf, "%d\n", cxld->qtg_id);
> +}
> +

No blank line here would be more consistent with local style (based on 
a really quick look).

> +static DEVICE_ATTR_RO(qtg_id);
> +
>  static struct attribute *cxl_decoder_base_attrs[] = {
>  	&dev_attr_start.attr,
>  	&dev_attr_size.attr,
> @@ -295,6 +305,7 @@ static struct attribute *cxl_decoder_root_attrs[] = {
>  	&dev_attr_cap_type2.attr,
>  	&dev_attr_cap_type3.attr,
>  	&dev_attr_target_list.attr,
> +	&dev_attr_qtg_id.attr,
>  	SET_CXL_REGION_ATTR(create_pmem_region)
>  	SET_CXL_REGION_ATTR(create_ram_region)
>  	SET_CXL_REGION_ATTR(delete_region)
> @@ -1625,6 +1636,7 @@ struct cxl_root_decoder *cxl_root_decoder_alloc(struct cxl_port *port,
>  	}
>  
>  	atomic_set(&cxlrd->region_id, rc);
> +	cxld->qtg_id = CXL_QTG_ID_INVALID;
>  	return cxlrd;
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_root_decoder_alloc, CXL);
> @@ -1662,6 +1674,7 @@ struct cxl_switch_decoder *cxl_switch_decoder_alloc(struct cxl_port *port,
>  
>  	cxld = &cxlsd->cxld;
>  	cxld->dev.type = &cxl_decoder_switch_type;
> +	cxld->qtg_id = CXL_QTG_ID_INVALID;
>  	return cxlsd;
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_switch_decoder_alloc, CXL);
> @@ -1694,6 +1707,7 @@ struct cxl_endpoint_decoder *cxl_endpoint_decoder_alloc(struct cxl_port *port)
>  	}
>  
>  	cxld->dev.type = &cxl_decoder_endpoint_type;
> +	cxld->qtg_id = CXL_QTG_ID_INVALID;
>  	return cxled;
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_endpoint_decoder_alloc, CXL);
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 044a92d9813e..278ab6952332 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -300,6 +300,7 @@ enum cxl_decoder_type {
>   */
>  #define CXL_DECODER_MAX_INTERLEAVE 16
>  
> +#define CXL_QTG_ID_INVALID	-1
>  
>  /**
>   * struct cxl_decoder - Common CXL HDM Decoder Attributes
> @@ -311,6 +312,7 @@ enum cxl_decoder_type {
>   * @target_type: accelerator vs expander (type2 vs type3) selector
>   * @region: currently assigned region for this decoder
>   * @flags: memory type capabilities and locking
> + * @qtg_id: QoS Throttling Group ID
>   * @commit: device/decoder-type specific callback to commit settings to hw
>   * @reset: device/decoder-type specific callback to reset hw settings
>  */
> @@ -323,6 +325,7 @@ struct cxl_decoder {
>  	enum cxl_decoder_type target_type;
>  	struct cxl_region *region;
>  	unsigned long flags;
> +	int qtg_id;
>  	int (*commit)(struct cxl_decoder *cxld);
>  	int (*reset)(struct cxl_decoder *cxld);
>  };
> 
> 
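
For reference, the cxl-cli matching step mentioned in the commit message would
consume the new attribute with a plain sysfs read, e.g. (output illustrative):

	# cat /sys/bus/cxl/devices/decoder0.0/qtg_id
	1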


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 02/23] cxl: Add checksum verification to CDAT from CXL
  2023-04-19 20:21 ` [PATCH v4 02/23] cxl: Add checksum verification to CDAT from CXL Dave Jiang
@ 2023-04-20  8:55   ` Jonathan Cameron
  2023-04-24 22:01   ` Dan Williams
  1 sibling, 0 replies; 70+ messages in thread
From: Jonathan Cameron @ 2023-04-20  8:55 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-acpi, Ira Weiny, dan.j.williams, vishal.l.verma,
	alison.schofield, rafael, lukas

On Wed, 19 Apr 2023 13:21:13 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> A CDAT table is available from a CXL device. The table is read by the
> driver and cached in software. With the CXL subsystem needing to parse the
> CDAT table, the checksum should be verified. Add checksum verification
> after the CDAT table is read from device.
> 
> Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> 
> ---
> v3:
> - Just return the final sum. (Alison)
> v2:
> - Drop ACPI checksum export and just use local verification. (Dan)
> ---
>  drivers/cxl/core/pci.c |   16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index 25b7e8125d5d..9c7e2f69d9ca 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -528,6 +528,16 @@ static int cxl_cdat_read_table(struct device *dev,
>  	return 0;
>  }
>  
> +static unsigned char cdat_checksum(void *buf, size_t size)
> +{
> +	unsigned char sum, *data = buf;
> +	size_t i;
> +
> +	for (sum = 0, i = 0; i < size; i++)
> +		sum += data[i];
> +	return sum;
> +}
> +
>  /**
>   * read_cdat_data - Read the CDAT data on this port
>   * @port: Port to read data from
> @@ -573,6 +583,12 @@ void read_cdat_data(struct cxl_port *port)
>  	}
>  
>  	port->cdat.table = cdat_table + sizeof(__le32);
> +	if (cdat_checksum(port->cdat.table, cdat_length)) {
> +		/* Don't leave table data allocated on error */
> +		devm_kfree(dev, cdat_table);
> +		dev_err(dev, "CDAT data checksum error\n");
> +	}
> +
>  	port->cdat.length = cdat_length;
>  }
>  EXPORT_SYMBOL_NS_GPL(read_cdat_data, CXL);
> 
> 
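
As with ACPI tables, the CDAT checksum byte is chosen so that all bytes of the
table, checksum included, sum to zero modulo 256; a non-zero return from
cdat_checksum() above therefore indicates a corrupted table. For example, a
(hypothetical) four-byte table {0x10, 0x20, 0x30, 0xa0} sums to 0x100, whose
low byte is 0, so it would pass the check.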


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 03/23] cxl: Add support for reading CXL switch CDAT table
  2023-04-19 20:21 ` [PATCH v4 03/23] cxl: Add support for reading CXL switch CDAT table Dave Jiang
@ 2023-04-20  9:25   ` Jonathan Cameron
  2023-04-24 22:08   ` Dan Williams
  1 sibling, 0 replies; 70+ messages in thread
From: Jonathan Cameron @ 2023-04-20  9:25 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-acpi, Ira Weiny, dan.j.williams, vishal.l.verma,
	alison.schofield, rafael, lukas

On Wed, 19 Apr 2023 13:21:19 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> Move read_cdat_data() from endpoint probe to general port probe to
> allow reading of CDAT data for CXL switches as well as CXL device.
> Add wrapper support for cxl_test to bypass the cdat reading.
> 
> Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>

It might be worth a wrapper at some point for that dance
from port to the PCI device with the actual DOE etc.

Such a wrapper would provide somewhere to add a bit of
documentation on why the uport might be a platform device
(memX) or might be a PCI device, thus explaining the two
different ways it is handled.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> 
> ---
> v4:
> - Remove cxl_test wrapper. (Ira)
> ---
>  drivers/cxl/core/pci.c |   20 +++++++++++++++-----
>  drivers/cxl/port.c     |    6 +++---
>  2 files changed, 18 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index 9c7e2f69d9ca..1c415b26e866 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -546,16 +546,26 @@ static unsigned char cdat_checksum(void *buf, size_t size)
>   */
>  void read_cdat_data(struct cxl_port *port)
>  {
> -	struct pci_doe_mb *cdat_doe;
> -	struct device *dev = &port->dev;
>  	struct device *uport = port->uport;
> -	struct cxl_memdev *cxlmd = to_cxl_memdev(uport);
> -	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> -	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
> +	struct device *dev = &port->dev;
> +	struct cxl_dev_state *cxlds;
> +	struct pci_doe_mb *cdat_doe;
> +	struct cxl_memdev *cxlmd;
> +	struct pci_dev *pdev;
>  	size_t cdat_length;
>  	void *cdat_table;
>  	int rc;
>  
> +	if (is_cxl_memdev(uport)) {
> +		cxlmd = to_cxl_memdev(uport);
> +		cxlds = cxlmd->cxlds;          
> +		pdev = to_pci_dev(cxlds->dev); 
> +	} else if (dev_is_pci(uport)) {
> +		pdev = to_pci_dev(uport);
> +	} else {
> +		return;
> +	}
> +
>  	cdat_doe = pci_find_doe_mailbox(pdev, PCI_DVSEC_VENDOR_ID_CXL,
>  					CXL_DOE_PROTOCOL_TABLE_ACCESS);
>  	if (!cdat_doe) {
> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
> index 22a7ab2bae7c..615e0ef6b440 100644
> --- a/drivers/cxl/port.c
> +++ b/drivers/cxl/port.c
> @@ -93,9 +93,6 @@ static int cxl_endpoint_port_probe(struct cxl_port *port)
>  	if (IS_ERR(cxlhdm))
>  		return PTR_ERR(cxlhdm);
>  
> -	/* Cache the data early to ensure is_visible() works */
> -	read_cdat_data(port);
> -
>  	get_device(&cxlmd->dev);
>  	rc = devm_add_action_or_reset(&port->dev, schedule_detach, cxlmd);
>  	if (rc)
> @@ -135,6 +132,9 @@ static int cxl_port_probe(struct device *dev)
>  {
>  	struct cxl_port *port = to_cxl_port(dev);
>  
> +	/* Cache the data early to ensure is_visible() works */
> +	read_cdat_data(port);
> +
>  	if (is_cxl_endpoint(port))
>  		return cxl_endpoint_port_probe(port);
>  	return cxl_switch_port_probe(port);
> 
> 


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 04/23] cxl: Add common helpers for cdat parsing
  2023-04-19 20:21 ` [PATCH v4 04/23] cxl: Add common helpers for cdat parsing Dave Jiang
@ 2023-04-20  9:41   ` Jonathan Cameron
  2023-04-20 21:05     ` Dave Jiang
  2023-04-24 22:33   ` Dan Williams
  1 sibling, 1 reply; 70+ messages in thread
From: Jonathan Cameron @ 2023-04-20  9:41 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-acpi, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, rafael, lukas

On Wed, 19 Apr 2023 13:21:25 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> Add helper functions to parse the CDAT table and provide a callback to
> parse the sub-table. Helpers are provided for DSMAS and DSLBIS sub-table
> parsing. The code is patterned after the ACPI table parsing helpers.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> 
A few minor things inline.   More than possible you addressed them
in earlier versions though.

Jonathan

> ---
> v2:
> - Use local headers to handle LE instead of ACPI header
> - Reduce complexity of parser function. (Jonathan)
> - Directly access header type. (Jonathan)
> - Simplify header ptr math. (Jonathan)
> - Move parsed counter to the correct location. (Jonathan)
> - Add LE to host conversion for entry length
> ---
>  drivers/cxl/core/Makefile |    1 
>  drivers/cxl/core/cdat.c   |  100 +++++++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxlpci.h      |   29 +++++++++++++
>  3 files changed, 130 insertions(+)
>  create mode 100644 drivers/cxl/core/cdat.c
> 
> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> index ca4ae31d8f57..867a8014b462 100644
> --- a/drivers/cxl/core/Makefile
> +++ b/drivers/cxl/core/Makefile
> @@ -12,5 +12,6 @@ cxl_core-y += memdev.o
>  cxl_core-y += mbox.o
>  cxl_core-y += pci.o
>  cxl_core-y += hdm.o
> +cxl_core-y += cdat.o
>  cxl_core-$(CONFIG_TRACING) += trace.o
>  cxl_core-$(CONFIG_CXL_REGION) += region.o
> diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
> new file mode 100644
> index 000000000000..210f4499bddb
> --- /dev/null
> +++ b/drivers/cxl/core/cdat.c
> @@ -0,0 +1,100 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2023 Intel Corporation. All rights reserved. */
> +#include "cxlpci.h"
> +#include "cxl.h"
> +
> +static bool has_handler(struct cdat_subtable_proc *proc)

Even though they are static, I'd add a cxl_ or cdat_ prefix
to these to make it clear they are local.

> +{
> +	return proc->handler;
> +}
> +
> +static int call_handler(struct cdat_subtable_proc *proc,
> +			struct cdat_subtable_entry *ent)
> +{
> +	if (has_handler(proc))

Do we need to check this again? It's checked in the parse_entries code
well before this point.

Also, if moving to checking it once, then is it worth the
little wrapper functions?


> +		return proc->handler(ent->hdr, proc->arg);
> +	return -EINVAL;
> +}
> +
> +static bool cdat_is_subtable_match(struct cdat_subtable_entry *ent)
> +{
> +	return ent->hdr->type == ent->type;
> +}
> +
> +static int cdat_table_parse_entries(enum cdat_type type,
> +				    struct cdat_header *table_header,
> +				    struct cdat_subtable_proc *proc)
> +{
> +	unsigned long table_end, entry_len;
> +	struct cdat_subtable_entry entry;
> +	int count = 0;
> +	int rc;
> +
> +	if (!has_handler(proc))
> +		return -EINVAL;
> +
> +	table_end = (unsigned long)table_header + table_header->length;
> +
> +	if (type >= CDAT_TYPE_RESERVED)
> +		return -EINVAL;
> +
> +	entry.type = type;
> +	entry.hdr = (struct cdat_entry_header *)(table_header + 1);
> +
> +	while ((unsigned long)entry.hdr < table_end) {
> +		entry_len = le16_to_cpu(entry.hdr->length);
> +
> +		if ((unsigned long)entry.hdr + entry_len > table_end)
> +			return -EINVAL;
> +
> +		if (entry_len == 0)
> +			return -EINVAL;
> +
> +		if (cdat_is_subtable_match(&entry)) {
> +			rc = call_handler(proc, &entry);
> +			if (rc)
> +				return rc;
> +			count++;
> +		}
> +
> +		entry.hdr = (struct cdat_entry_header *)((unsigned long)entry.hdr + entry_len);
> +	}
> +
> +	return count;
> +}

...

> +int cdat_table_parse_sslbis(struct cdat_header *table,
> +			    cdat_tbl_entry_handler handler, void *arg)

Feels like these ones should take a typed arg.  Sure you'll lose
that again to use the generic handling code, but at this level we can 
do it I think.

> +{
> +	struct cdat_subtable_proc proc = {
> +		.handler	= handler,
> +		.arg		= arg,
> +	};
> +
> +	return cdat_table_parse_entries(CDAT_TYPE_SSLBIS, table, &proc);
> +}
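
For context, the intended consumer pattern for these helpers (taken up in
patches 5-7 of the series) looks roughly like the following, with the callback
accumulating parsed entries into caller-owned storage:

	LIST_HEAD(dsmas_list);
	int rc;

	rc = cdat_table_parse_dsmas(port->cdat.table, cxl_dsmas_parse_entry,
				    &dsmas_list);
	if (rc < 0)
		dev_warn(&port->dev, "Failed to parse DSMAS: %d\n", rc);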


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 05/23] cxl: Add callback to parse the DSMAS subtables from CDAT
  2023-04-19 20:21 ` [PATCH v4 05/23] cxl: Add callback to parse the DSMAS subtables from CDAT Dave Jiang
@ 2023-04-20 11:33   ` Jonathan Cameron
  2023-04-20 11:35     ` Jonathan Cameron
  2023-04-24 22:38   ` Dan Williams
  2023-04-26  3:44   ` Li, Ming
  2 siblings, 1 reply; 70+ messages in thread
From: Jonathan Cameron @ 2023-04-20 11:33 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-acpi, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, rafael, lukas

On Wed, 19 Apr 2023 13:21:31 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> Provide a callback function to the CDAT parser in order to parse the Device
> Scoped Memory Affinity Structure (DSMAS). Each DSMAS structure contains the
> DPA range and its associated attributes in each entry. See the CDAT
> specification for details.
> 
> Coherent Device Attribute Table 1.03 2.1 Device Scoped memory Affinity
> Structure (DSMAS)

I'm not sure what the purpose of this is. If it's just detecting problems
with the entry because we aren't interested in the content yet, then fine
but good to make that clear in patch intro.

Maybe I'm missing something!

Thanks,

Jonathan

> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> 
> ---
> v3:
> - Add spec section number. (Alison)
> - Remove cast from void *. (Alison)
> - Refactor cxl_port_probe() block. (Alison)
> - Move CDAT parse to cxl_endpoint_port_probe()
> 
> v2:
> - Add DSMAS table size check. (Lukas)
> - Use local DSMAS header for LE handling.
> - Remove dsmas lock. (Jonathan)
> - Fix handle size (Jonathan)
> - Add LE to host conversion for DSMAS address and length.
> - Make dsmas_list local


> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
> index 615e0ef6b440..3022bdd52439 100644
> --- a/drivers/cxl/port.c
> +++ b/drivers/cxl/port.c
> @@ -57,6 +57,16 @@ static int discover_region(struct device *dev, void *root)
>  	return 0;
>  }

>  static int cxl_switch_port_probe(struct cxl_port *port)
>  {
>  	struct cxl_hdm *cxlhdm;
> @@ -125,6 +135,18 @@ static int cxl_endpoint_port_probe(struct cxl_port *port)
>  	device_for_each_child(&port->dev, root, discover_region);
>  	put_device(&root->dev);
>  
> +	if (port->cdat.table) {
> +		LIST_HEAD(dsmas_list);
> +
> +		rc = cdat_table_parse_dsmas(port->cdat.table,
> +					    cxl_dsmas_parse_entry,
> +					    (void *)&dsmas_list);
> +		if (rc < 0)
> +			dev_warn(&port->dev, "Failed to parse DSMAS: %d\n", rc);
> +
> +		dsmas_list_destroy(&dsmas_list);

I'm a little confused here.  What's the point?  Parse them then throw the info away?
Maybe a comment if all we are trying to do is warn about CDAT problems.


> +	}
> +
>  	return 0;
>  }
>  
> 
> 


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 05/23] cxl: Add callback to parse the DSMAS subtables from CDAT
  2023-04-20 11:33   ` Jonathan Cameron
@ 2023-04-20 11:35     ` Jonathan Cameron
  2023-04-20 23:25       ` Dave Jiang
  0 siblings, 1 reply; 70+ messages in thread
From: Jonathan Cameron @ 2023-04-20 11:35 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-acpi, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, rafael, lukas

On Thu, 20 Apr 2023 12:33:50 +0100
Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:

> On Wed, 19 Apr 2023 13:21:31 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
> 
> > Provide a callback function to the CDAT parser in order to parse the Device
> > Scoped Memory Affinity Structure (DSMAS). Each DSMAS structure contains the
> > DPA range and its associated attributes in each entry. See the CDAT
> > specification for details.
> > 
> > Coherent Device Attribute Table 1.03 2.1 Device Scoped memory Affinity
> > Structure (DSMAS)  
> 
> I'm not sure what the purpose of this is. If it's just detecting problems
> with the entry because we aren't interested in the content yet, then fine
> but good to make that clear in patch intro.
> 
> Maybe I'm missing something!
> 
Ah. Got to next patch.  Perhaps a forwards reference to that will avoid
anyone else wondering what is going on here!

> Thanks,
> 
> Jonathan
> 
> > 
> > Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> > 
> > ---
> > v3:
> > - Add spec section number. (Alison)
> > - Remove cast from void *. (Alison)
> > - Refactor cxl_port_probe() block. (Alison)
> > - Move CDAT parse to cxl_endpoint_port_probe()
> > 
> > v2:
> > - Add DSMAS table size check. (Lukas)
> > - Use local DSMAS header for LE handling.
> > - Remove dsmas lock. (Jonathan)
> > - Fix handle size (Jonathan)
> > - Add LE to host conversion for DSMAS address and length.
> > - Make dsmas_list local  
> 
> 
> > diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
> > index 615e0ef6b440..3022bdd52439 100644
> > --- a/drivers/cxl/port.c
> > +++ b/drivers/cxl/port.c
> > @@ -57,6 +57,16 @@ static int discover_region(struct device *dev, void *root)
> >  	return 0;
> >  }  
> 
> >  static int cxl_switch_port_probe(struct cxl_port *port)
> >  {
> >  	struct cxl_hdm *cxlhdm;
> > @@ -125,6 +135,18 @@ static int cxl_endpoint_port_probe(struct cxl_port *port)
> >  	device_for_each_child(&port->dev, root, discover_region);
> >  	put_device(&root->dev);
> >  
> > +	if (port->cdat.table) {
> > +		LIST_HEAD(dsmas_list);
> > +
> > +		rc = cdat_table_parse_dsmas(port->cdat.table,
> > +					    cxl_dsmas_parse_entry,
> > +					    (void *)&dsmas_list);
> > +		if (rc < 0)
> > +			dev_warn(&port->dev, "Failed to parse DSMAS: %d\n", rc);
> > +
> > +		dsmas_list_destroy(&dsmas_list);  
> 
> I'm a little confused here.  What's the point?  Parse them then throw the info away?
> Maybe a comment if all we are trying to do is warn about CDAT problems.
> 
> 
> > +	}
> > +
> >  	return 0;
> >  }
> >  
> > 
> >   
> 


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 06/23] cxl: Add callback to parse the DSLBIS subtable from CDAT
  2023-04-19 20:21 ` [PATCH v4 06/23] cxl: Add callback to parse the DSLBIS subtable " Dave Jiang
@ 2023-04-20 11:40   ` Jonathan Cameron
  2023-04-20 23:25     ` Dave Jiang
  2023-04-24 22:46   ` Dan Williams
  1 sibling, 1 reply; 70+ messages in thread
From: Jonathan Cameron @ 2023-04-20 11:40 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-acpi, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, rafael, lukas

On Wed, 19 Apr 2023 13:21:37 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> Provide a callback to parse the Device Scoped Latency and Bandwidth
> Information Structure (DSLBIS) in the CDAT structures. The DSLBIS
> contains the bandwidth and latency information that's tied to a DSMAS
> handle. The driver will retrieve the read and write latency and
> bandwidth associated with the DSMAS which is tied to a DPA range.
> 
> Coherent Device Attribute Table 1.03 2.1 Device Scoped Latency and
> Bandwidth Information Structure (DSLBIS)
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> 

One comment inline.

> +/* Flags for DSLBIS subtable */
> +#define DSLBIS_MEM_MASK		GENMASK(3, 0)
> +#define DSLBIS_MEM_MEMORY	0
> +
>  int devm_cxl_port_enumerate_dports(struct cxl_port *port);
>  struct cxl_dev_state;
>  int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> @@ -136,5 +164,9 @@ cdat_table_parse(dsmas);
>  cdat_table_parse(dslbis);
>  cdat_table_parse(sslbis);
>  
> -int cxl_dsmas_parse_entry(struct cdat_entry_header *header, void *arg);
> +#define cxl_parse_entry(x) \
> +int cxl_##x##_parse_entry(struct cdat_entry_header *header, void *arg)
I'm not sure this is worthwhile. What was your reasoning for it?
Also wrecks typing that arg argument as I suggested earlier...

> +
> +cxl_parse_entry(dsmas);
> +cxl_parse_entry(dslbis);


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 07/23] cxl: Add callback to parse the SSLBIS subtable from CDAT
  2023-04-19 20:21 ` [PATCH v4 07/23] cxl: Add callback to parse the SSLBIS " Dave Jiang
@ 2023-04-20 11:50   ` Jonathan Cameron
  2023-04-24 23:38   ` Dan Williams
  1 sibling, 0 replies; 70+ messages in thread
From: Jonathan Cameron @ 2023-04-20 11:50 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-acpi, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, rafael, lukas

On Wed, 19 Apr 2023 13:21:43 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> Provide a callback to parse the Switched Scoped Latency and Bandwidth
> Information Structure (SSLBIS) in the CDAT structures. The SSLBIS
> contains the bandwidth and latency information that's tied to the
> CLX switch that the data table has been read from. The extracted

CLX? :)

> values are indexed by the downstream port id. It is possible
> the downstream port id is 0xffff which is a wildcard value for any
> port id.
> 
> Coherent Device Attribute Table 1.03 2.1 Switched Scoped Latency
> and Bandwidth Information Structure (SSLBIS)
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> 
LGTM subject to same comment on typing arg.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>




^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 08/23] cxl: Add support for _DSM Function for retrieving QTG ID
  2023-04-19 20:21 ` [PATCH v4 08/23] cxl: Add support for _DSM Function for retrieving QTG ID Dave Jiang
@ 2023-04-20 12:00   ` Jonathan Cameron
  2023-04-21  0:11     ` Dave Jiang
  2023-04-25  0:12   ` Dan Williams
  1 sibling, 1 reply; 70+ messages in thread
From: Jonathan Cameron @ 2023-04-20 12:00 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-acpi, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, rafael, lukas

On Wed, 19 Apr 2023 13:21:49 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> CXL spec v3.0 9.17.3 CXL Root Device Specific Methods (_DSM)
> 
> Add support to retrieve QTG ID via ACPI _DSM call. The _DSM call requires
> an input of an ACPI package with 4 dwords (read latency, write latency,
> read bandwidth, write bandwidth). The call returns a package with 1 WORD
> that provides the max supported QTG ID and a package that may contain 0 or
> more WORDs as the recommended QTG IDs in the recommended order.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> 

A few minor comments inline. 


> +/**
> + * cxl_acpi_evaluate_qtg_dsm - Retrieve QTG ids via ACPI _DSM
> + * @handle: ACPI handle
> + * @input: bandwidth and latency data
> + *
> + * Issue QTG _DSM with accompanied bandwidth and latency data in order to get
> + * the QTG IDs that falls within the performance data.
Falls within is a little vague.  Perhaps something like

the QTG IDs that are suitable for the performance point in order of most suitable
to least suitable.

> + */
> +struct qtg_dsm_output *cxl_acpi_evaluate_qtg_dsm(acpi_handle handle,
> +						 struct qtg_dsm_input *input)
> +{
> +	union acpi_object *out_obj, *out_buf, *pkg;
> +	union acpi_object in_buf = {
> +		.buffer = {
> +			.type = ACPI_TYPE_BUFFER,
> +			.pointer = (u8 *)input,
> +			.length = sizeof(u32) * 4,

sizeof(*input)?

Also, ACPI structures are always little endian. Do we need to be careful of that
here?

> +		},
> +	};
> +	union acpi_object in_obj = {
> +		.package = {
> +			.type = ACPI_TYPE_PACKAGE,
> +			.count = 1,
> +			.elements = &in_buf
> +		},
> +	};
> +	struct qtg_dsm_output *output = NULL;
> +	int len, rc, i;
> +	u16 *max_qtg;
> +
> +	out_obj = acpi_evaluate_dsm(handle, &acpi_cxl_qtg_id_guid, 1, 1, &in_obj);
> +	if (!out_obj)
> +		return ERR_PTR(-ENXIO);
> +
> +	if (out_obj->type != ACPI_TYPE_PACKAGE) {
> +		rc = -ENXIO;
> +		goto err;
> +	}
> +
> +	/* Check Max QTG ID */
> +	pkg = &out_obj->package.elements[0];
> +	if (pkg->type != ACPI_TYPE_BUFFER) {
> +		rc = -ENXIO;
> +		goto err;
> +	}
> +
> +	if (pkg->buffer.length != sizeof(u16)) {
> +		rc = -ENXIO;
> +		goto err;
> +	}
> +	max_qtg = (u16 *)pkg->buffer.pointer;
> +
> +	/* Retrieve QTG IDs package */
> +	pkg = &out_obj->package.elements[1];
> +	if (pkg->type != ACPI_TYPE_PACKAGE) {
> +		rc = -ENXIO;
> +		goto err;
> +	}
> +
> +	out_buf = &pkg->package.elements[0];
> +	if (out_buf->type != ACPI_TYPE_BUFFER) {
> +		rc = -ENXIO;
> +		goto err;
> +	}
> +
> +	len = out_buf->buffer.length;
> +
> +	/* It's legal to have 0 QTG entries */
> +	if (len == 0)
> +		goto out;
> +
> +	/* Malformed package, not multiple of WORD size */
> +	if (len % sizeof(u16)) {
> +		rc = -ENXIO;
> +		goto out;
> +	}
> +
> +	output = kmalloc(len + sizeof(*output), GFP_KERNEL);
> +	if (!output) {
> +		rc = -ENOMEM;
> +		goto err;
> +	}
> +
> +	output->nr = len / sizeof(u16);
> +	memcpy(output->qtg_ids, out_buf->buffer.pointer, len);
> +
> +	for (i = 0; i < output->nr; i++) {
> +		if (output->qtg_ids[i] > *max_qtg)
> +			pr_warn("QTG ID %u greater than MAX %u\n",
> +				output->qtg_ids[i], *max_qtg);
> +	}
> +
> +out:
> +	ACPI_FREE(out_obj);
> +	return output;
> +
> +err:
> +	ACPI_FREE(out_obj);
> +	return ERR_PTR(rc);

Why not combine these with something like

	return IS_ERR(rc) ? ERR_PTR(rc) : output;

I'm fine with leaving as it is, if this is common style for these
sorts of ACPI functions.

> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_acpi_evaluate_qtg_dsm, CXL);
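
For illustration, the single-exit shape hinted at above could look roughly like
this (sketch only; it assumes rc is initialized to 0 and only set on failure,
so the success and error paths can share one ACPI_FREE()):

out:
	ACPI_FREE(out_obj);
	return rc ? ERR_PTR(rc) : output;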


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 09/23] cxl: Add helper function to retrieve ACPI handle of CXL root device
  2023-04-19 20:21 ` [PATCH v4 09/23] cxl: Add helper function to retrieve ACPI handle of CXL root device Dave Jiang
@ 2023-04-20 12:06   ` Jonathan Cameron
  2023-04-21 23:24     ` Dave Jiang
  2023-04-25  0:18   ` Dan Williams
  1 sibling, 1 reply; 70+ messages in thread
From: Jonathan Cameron @ 2023-04-20 12:06 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-acpi, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, rafael, lukas

On Wed, 19 Apr 2023 13:21:55 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> Provide a helper to find the ACPI0017 device in order to issue the _DSM.
> The helper will take the 'struct device' from a cxl_port and iterate until
> the root device is reached. The ACPI handle will be returned from the root
> device.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>

Question inline.  If the answer is no then this looks fine to me.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>


> +/**
> + * cxl_acpi_get_rootdev_handle - get the ACPI handle of the CXL root device
> + * @dev: 'struct device' to start searching from. Should be from cxl_port->dev.
> + *
> + * Return: acpi_handle on success, errptr of errno on error.
> + *
> + * Looks for the ACPI0017 device and return the ACPI handle
> + **/

Could we implement this in terms of find_cxl_root()?  I think that will
end up giving you the same device though I haven't tested it.

> +acpi_handle cxl_acpi_get_rootdev_handle(struct device *dev)
> +{
> +	struct device *itr = dev;
> +	struct device *root_dev;
> +	acpi_handle handle;
> +
> +	if (!dev)
> +		return ERR_PTR(-EINVAL);
> +
> +	while (itr->parent) {
> +		root_dev = itr;
> +		itr = itr->parent;
> +	}
> +
> +	if (!dev_is_platform(root_dev))
> +		return ERR_PTR(-ENODEV);
> +
> +	handle = ACPI_HANDLE(root_dev);
> +	if (!handle)
> +		return ERR_PTR(-ENODEV);
> +
> +	return handle;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_acpi_get_rootdev_handle, CXL);
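
A sketch of the find_cxl_root() alternative raised above, assuming the helper
accepts the starting device and returns the root cxl_port with a reference held
(both assumptions; its signature has shifted between kernel versions), and that
the root port's uport is the ACPI0017 platform device:

	struct cxl_port *root_port = find_cxl_root(dev);	/* assumed form */
	acpi_handle handle;

	if (!root_port)
		return ERR_PTR(-ENODEV);
	handle = ACPI_HANDLE(root_port->uport);
	put_device(&root_port->dev);
	if (!handle)
		return ERR_PTR(-ENODEV);
	return handle;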



^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 10/23] cxl: Add helpers to calculate pci latency for the CXL device
  2023-04-19 20:22 ` [PATCH v4 10/23] cxl: Add helpers to calculate pci latency for the CXL device Dave Jiang
@ 2023-04-20 12:15   ` Jonathan Cameron
  2023-04-25  0:30   ` Dan Williams
  1 sibling, 0 replies; 70+ messages in thread
From: Jonathan Cameron @ 2023-04-20 12:15 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-acpi, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, rafael, lukas

On Wed, 19 Apr 2023 13:22:01 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> The latency is calculated by dividing the flit size by the bandwidth. Add
> support to retrieve the flit size for the CXL device and calculate the
> latency of the downstream link.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>

Totally trivial stuff about using defines that exist for the various multipliers.
Otherwise looks good

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> 
> ---
> v2:
> - Fix commit log issues. (Jonathan)
> - Fix var declaration issues. (Jonathan)
> ---
>  drivers/cxl/core/pci.c |   68 ++++++++++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxlpci.h   |   15 +++++++++++
>  drivers/cxl/pci.c      |   13 ---------
>  3 files changed, 83 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index 1c415b26e866..bb58296b3e56 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c


> +static int cxl_pci_mbits_to_mbytes(struct pci_dev *pdev)
> +{
> +	int mbits;
> +
> +	mbits = pci_bus_speed_to_mbps(pdev->bus->cur_bus_speed);
> +	if (mbits < 0)
> +		return mbits;
> +
> +	return mbits >> 3;

mbits / BITS_PER_BYTE; from linux/bits.h

maybe.

> +}

> +/**
> + * cxl_pci_get_latency - calculate the link latency for the PCIe link
> + * @pdev - PCI device
> + *
> + * return: calculated latency or -errno
> + *
> + * CXL Memory Device SW Guide v1.0 2.11.4 Link latency calculation
> + * Link latency = LinkPropagationLatency + FlitLatency + RetimerLatency
> + * LinkPropagationLatency is negligible, so 0 will be used
> + * RetimerLatency is assumed to be negligible and 0 will be used
> + * FlitLatency = FlitSize / LinkBandwidth
> + * FlitSize is defined by spec. CXL rev3.0 4.2.1.
> + * 68B flit is used up to 32GT/s. >32GT/s, 256B flit size is used.
> + * The FlitLatency is converted to picoseconds.
> + */
> +long cxl_pci_get_latency(struct pci_dev *pdev)
> +{
> +	long bw;
> +
> +	bw = cxl_pci_mbits_to_mbytes(pdev);
> +	if (bw < 0)
> +		return bw;
> +
> +	return cxl_flit_size(pdev) * 1000000L / bw;

MEGA from include/linux/units.h perhaps, though it's an oddity because the output of this
is picoseconds, so maybe it needs to be PICO / MEGA to act as documentation of why.

> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_pci_get_latency, CXL);
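
A rough worked example of the formula in the kernel-doc above (numbers
illustrative; they assume pci_bus_speed_to_mbps() reports a 32 GT/s link as
32000 Mb/s): bandwidth = 32000 / 8 = 4000 MB/s, and with the 68B flit used at
up to 32 GT/s, FlitLatency = 68 * 1000000 / 4000 = 17000 ps, i.e. roughly 17 ns
for that hop.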



^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 11/23] cxl: Add helper function that calculates QoS values for switches
  2023-04-19 20:22 ` [PATCH v4 11/23] cxl: Add helper function that calculates QoS values for switches Dave Jiang
@ 2023-04-20 12:26   ` Jonathan Cameron
  2023-04-24 17:09     ` Dave Jiang
  2023-04-25  0:33   ` Dan Williams
  1 sibling, 1 reply; 70+ messages in thread
From: Jonathan Cameron @ 2023-04-20 12:26 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-acpi, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, rafael, lukas

On Wed, 19 Apr 2023 13:22:07 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> The CDAT information from the switch, Switch Scoped Latency and Bandwidth
> Information Structure (SSLBIS), is parsed and stored in an xarray under the
> cxl_port. The QoS data are indexed by the downstream port id.  Walk the CXL
> ports from endpoint to root and retrieve the relevant QoS information
> (bandwidth and latency) that are from the switch CDAT. If read or write QoS
> values are not available, then use the access QoS value.

I'd drop the access reference.  You already did that mapping from access to read
and write in an earlier patch. Now we have no concept of access, so mentioning
it will only potentially cause confusion.

> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> 
> ---
> v3:
> - Move to use 'struct node_hmem_attrs'
> ---
>  drivers/cxl/core/port.c |   81 +++++++++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxl.h       |    2 +
>  2 files changed, 83 insertions(+)
> 
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 3fedbabac1af..770b540d5325 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -1921,6 +1921,87 @@ bool schedule_cxl_memdev_detach(struct cxl_memdev *cxlmd)
>  }
>  EXPORT_SYMBOL_NS_GPL(schedule_cxl_memdev_detach, CXL);
>  
> +/**
> + * cxl_port_get_switch_qos - retrieve QoS data for CXL switches

Hmm. Terminology-wise, this isn't called QoS data in either the CXL spec
or the HMAT stuff it came from, so I'd avoid that term here.
It might also get confused with the QoS telemetry stuff from the CXL
spec, which is totally different, or the QoS controls on an MLD,
which are perhaps only indirectly related to these.

QoS only gets involved once these are mapped to a QTG - assumption
being that a given QoS policy should apply to devices of similar access
characteristics.

Other than that bikeshedding.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>



> + * @port: endpoint cxl_port
> + * @rd_bw: writeback value for min read bandwidth
> + * @rd_lat: writeback value for total read latency
> + * @wr_bw: writeback value for min write bandwidth
> + * @wr_lat: writeback value for total write latency
> + *
> + * Return: Errno on failure, 0 on success. -ENOENT if no switch device
> + */
> +int cxl_port_get_switch_qos(struct cxl_port *port, u64 *rd_bw, u64 *rd_lat,
> +			    u64 *wr_bw, u64 *wr_lat)
> +{
> +	u64 min_rd_bw = ULONG_MAX;
> +	u64 min_wr_bw = ULONG_MAX;
> +	struct cxl_dport *dport;
> +	struct cxl_port *nport;
> +	u64 total_rd_lat = 0;
> +	u64 total_wr_lat = 0;
> +	struct device *next;
> +	int switches = 0;
> +	int rc = 0;
> +
> +	if (!is_cxl_endpoint(port))
> +		return -EINVAL;
> +
> +	/* Skip the endpoint */
> +	next = port->dev.parent;
> +	nport = to_cxl_port(next);
> +	dport = port->parent_dport;
> +
> +	do {
> +		struct node_hmem_attrs *hmem_attrs;
> +		u64 lat, bw;
> +
> +		if (!nport->cdat.table)
> +			break;
> +
> +		if (!dev_is_pci(dport->dport))
> +			break;
> +
> +		hmem_attrs = xa_load(&nport->cdat.sslbis_xa, dport->port_id);
> +		if (xa_is_err(hmem_attrs))
> +			return xa_err(hmem_attrs);
> +
> +		if (!hmem_attrs) {
> +			hmem_attrs = xa_load(&nport->cdat.sslbis_xa, SSLBIS_ANY_PORT);
> +			if (xa_is_err(hmem_attrs))
> +				return xa_err(hmem_attrs);
> +			if (!hmem_attrs)
> +				return -ENXIO;
> +		}
> +
> +		bw = hmem_attrs->write_bandwidth;
> +		lat = hmem_attrs->write_latency;
> +		min_wr_bw = min_t(u64, min_wr_bw, bw);
> +		total_wr_lat += lat;
> +
> +		bw = hmem_attrs->read_bandwidth;
> +		lat = hmem_attrs->read_latency;
> +		min_rd_bw = min_t(u64, min_rd_bw, bw);
> +		total_rd_lat += lat;
> +
> +		dport = nport->parent_dport;
> +		next = next->parent;
> +		nport = to_cxl_port(next);
> +		switches++;
> +	} while (next);
> +
> +	*wr_bw = min_wr_bw;
> +	*wr_lat = total_wr_lat;
> +	*rd_bw = min_rd_bw;
> +	*rd_lat = total_rd_lat;
> +
> +	if (!switches)
> +		return -ENOENT;
> +
> +	return rc;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_port_get_switch_qos, CXL);
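
As a quick illustration of the aggregation above (numbers made up): two
switches on the path reporting read latencies of 100 and 150 and read
bandwidths of 8000 and 4000 yield total_rd_lat = 250 and min_rd_bw = 4000,
i.e. latencies add up along the path while bandwidth is limited by the
narrowest hop.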



^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 12/23] cxl: Add helper function that calculate QoS values for PCI path
  2023-04-19 20:22 ` [PATCH v4 12/23] cxl: Add helper function that calculate QoS values for PCI path Dave Jiang
@ 2023-04-20 12:32   ` Jonathan Cameron
  2023-04-25  0:45   ` Dan Williams
  1 sibling, 0 replies; 70+ messages in thread
From: Jonathan Cameron @ 2023-04-20 12:32 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-acpi, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, rafael, lukas

On Wed, 19 Apr 2023 13:22:13 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> Calculate the link bandwidth and latency for the PCIe path from the device
> to the CXL Host Bridge. This does not include the CDAT data from the device
> or the switch(es) in the path.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>

Same comment on _qos naming and one trivial comment inline.


> ---
> v4:
> - 0-day fix, remove unused var. Fix checking < 0 for unsigned var.
> - Rework port hierarchy walk to calculate the latencies correctly
> ---
>  drivers/cxl/core/port.c |   83 +++++++++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxl.h       |    2 +
>  2 files changed, 85 insertions(+)
> 
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 770b540d5325..8da437e038b9 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -2002,6 +2002,89 @@ int cxl_port_get_switch_qos(struct cxl_port *port, u64 *rd_bw, u64 *rd_lat,
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_port_get_switch_qos, CXL);
>  
> +/**
> + * cxl_port_get_downstream_qos - retrieve QoS data for PCIE downstream path
> + * @port: endpoint cxl_port
> + * @bandwidth: writeback value for min bandwidth
> + * @latency: writeback value for total latency
> + *
> + * Return: Errno on failure, 0 on success.
> + */
> +int cxl_port_get_downstream_qos(struct cxl_port *port, u64 *bandwidth,
> +				u64 *latency)
> +{
> +	u64 min_bw = ULONG_MAX;
> +	struct pci_dev *pdev;
> +	struct cxl_port *p;
> +	struct device *dev;
> +	u64 total_lat = 0;
> +	long lat;
> +
> +	*bandwidth = 0;
> +	*latency = 0;
> +
> +	/* Grab the device that is the PCI device for CXL memdev */
> +	dev = port->uport->parent;
> +	/* Skip if it's not PCI, most likely a cxl_test device */
> +	if (!dev_is_pci(dev))
> +		return 0;
> +
> +	pdev = to_pci_dev(dev);
> +	min_bw = pcie_bandwidth_available(pdev, NULL, NULL, NULL);
> +	if (min_bw == 0)
> +		return -ENXIO;
> +
> +	/* convert to MB/s from Mb/s */
> +	min_bw >>= 3;

/ BITS_PER_BYTE; (well MEGABITS_PER_MEGABYTE but still better than >>= 3;)

> +
> +	/*
> +	 * Walk the cxl_port hierarchy to retrieve the link latencies for
> +	 * each of the PCIe segments. The loop will obtain the link latency
> +	 * via each of the switch downstream port.
> +	 */
> +	p = port;
> +	do {
> +		struct cxl_dport *dport = p->parent_dport;
> +		struct device *dport_dev, *uport_dev;
> +		struct pci_dev *dport_pdev;
> +
> +		if (!dport)
> +			break;
> +
> +		dport_dev = dport->dport;
> +		if (!dev_is_pci(dport_dev))
> +			break;
> +
> +		p = dport->port;
> +		uport_dev = p->uport;
> +		if (!dev_is_pci(uport_dev))
> +			break;
> +
> +		dport_pdev = to_pci_dev(dport_dev);
> +		pdev = to_pci_dev(uport_dev);
> +		lat = cxl_pci_get_latency(dport_pdev);
> +		if (lat < 0)
> +			return lat;
> +
> +		total_lat += lat;
> +	} while (1);
> +
> +	/*
> +	 * pdev would be either the cxl device if there are no switches, or the
> +	 * upstream port of the last switch.
> +	 */
> +	lat = cxl_pci_get_latency(pdev);
> +	if (lat < 0)
> +		return lat;
> +
> +	total_lat += lat;
> +	*bandwidth = min_bw;
> +	*latency = total_lat;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_port_get_downstream_qos, CXL);
> +
>  /* for user tooling to ensure port disable work has completed */
>  static ssize_t flush_store(struct bus_type *bus, const char *buf, size_t count)
>  {
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 76ccc815134f..6a6387a545db 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -811,6 +811,8 @@ struct qtg_dsm_output *cxl_acpi_evaluate_qtg_dsm(acpi_handle handle,
>  acpi_handle cxl_acpi_get_rootdev_handle(struct device *dev);
>  int cxl_port_get_switch_qos(struct cxl_port *port, u64 *rd_bw, u64 *rd_lat,
>  			    u64 *wr_bw, u64 *wr_lat);
> +int cxl_port_get_downstream_qos(struct cxl_port *port, u64 *bandwidth,
> +				u64 *latency);
>  
>  /*
>   * Unit test builds overrides this to __weak, find the 'strong' version
> 
> 


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 01/23] cxl: Export QTG ids from CFMWS to sysfs
  2023-04-20  8:51   ` Jonathan Cameron
@ 2023-04-20 20:53     ` Dave Jiang
  0 siblings, 0 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-20 20:53 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-acpi, Ira Weiny, dan.j.williams, vishal.l.verma,
	alison.schofield, rafael, lukas



On 4/20/23 1:51 AM, Jonathan Cameron wrote:
> On Wed, 19 Apr 2023 13:21:07 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
> 
>> Export the QoS Throttling Group ID from the CXL Fixed Memory Window
>> Structure (CFMWS) under the root decoder sysfs attributes.
>> CXL rev3.0 9.17.1.3 CXL Fixed Memory Window Structure (CFMWS)
>>
>> cxl cli will use this QTG ID to match with the _DSM retrieved QTG ID for a
>> hot-plugged CXL memory device DPA memory range to make sure that the DPA range
>> is under the right CFMWS window.
>>
>> Reviewed-by: Ira Weiny <ira.weiny@intel.com>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>>
> 
> Bikeshedding alert:
> 
> Why not just call it qtg?  What does the _id add?
> 
> I don't really care either way...
> 
> LGTM
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> One (more) completely trivial comment inline.
> 
> Jonathan
> 
>> ---
>> v4:
>> - Change kernel version for documentation to v6.5
>> v2:
>> - Add explanation commit header (Jonathan)
>> ---
>>   Documentation/ABI/testing/sysfs-bus-cxl |    9 +++++++++
>>   drivers/cxl/acpi.c                      |    3 +++
>>   drivers/cxl/core/port.c                 |   14 ++++++++++++++
>>   drivers/cxl/cxl.h                       |    3 +++
>>   4 files changed, 29 insertions(+)
>>
>> diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
>> index 3acf2f17a73f..bd2b59784979 100644
>> --- a/Documentation/ABI/testing/sysfs-bus-cxl
>> +++ b/Documentation/ABI/testing/sysfs-bus-cxl
>> @@ -309,6 +309,15 @@ Description:
>>   		(WO) Write a string in the form 'regionZ' to delete that region,
>>   		provided it is currently idle / not bound to a driver.
>>   
>> +What:		/sys/bus/cxl/devices/decoderX.Y/qtg_id
>> +Date:		Jan, 2023
>> +KernelVersion:	v6.5
>> +Contact:	linux-cxl@vger.kernel.org
>> +Description:
>> +		(RO) Shows the QoS Throttling Group ID. The QTG ID for a root
>> +		decoder comes from the CFMWS structure of the CEDT. A value of
>> +		-1 indicates that no QTG ID was retrieved. The QTG ID is used as
>> +		guidance to match against the QTG ID of a hot-plugged device.
>>   
>>   What:		/sys/bus/cxl/devices/regionZ/uuid
>>   Date:		May, 2022
>> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
>> index 7e1765b09e04..abc24137c291 100644
>> --- a/drivers/cxl/acpi.c
>> +++ b/drivers/cxl/acpi.c
>> @@ -289,6 +289,9 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
>>   			}
>>   		}
>>   	}
>> +
>> +	cxld->qtg_id = cfmws->qtg_id;
>> +
>>   	rc = cxl_decoder_add(cxld, target_map);
>>   err_xormap:
>>   	if (rc)
>> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
>> index 4d1f9c5b5029..024d4178f557 100644
>> --- a/drivers/cxl/core/port.c
>> +++ b/drivers/cxl/core/port.c
>> @@ -276,6 +276,16 @@ static ssize_t interleave_ways_show(struct device *dev,
>>   
>>   static DEVICE_ATTR_RO(interleave_ways);
>>   
>> +static ssize_t qtg_id_show(struct device *dev,
>> +			   struct device_attribute *attr, char *buf)
>> +{
>> +	struct cxl_decoder *cxld = to_cxl_decoder(dev);
>> +
>> +	return sysfs_emit(buf, "%d\n", cxld->qtg_id);
>> +}
>> +
> 
> No blank line here would be more consistent with local style (based on
> a really quick look).

I see it going both ways in that file. But I'll delete the line.

> 
>> +static DEVICE_ATTR_RO(qtg_id);
>> +
>>   static struct attribute *cxl_decoder_base_attrs[] = {
>>   	&dev_attr_start.attr,
>>   	&dev_attr_size.attr,
>> @@ -295,6 +305,7 @@ static struct attribute *cxl_decoder_root_attrs[] = {
>>   	&dev_attr_cap_type2.attr,
>>   	&dev_attr_cap_type3.attr,
>>   	&dev_attr_target_list.attr,
>> +	&dev_attr_qtg_id.attr,
>>   	SET_CXL_REGION_ATTR(create_pmem_region)
>>   	SET_CXL_REGION_ATTR(create_ram_region)
>>   	SET_CXL_REGION_ATTR(delete_region)
>> @@ -1625,6 +1636,7 @@ struct cxl_root_decoder *cxl_root_decoder_alloc(struct cxl_port *port,
>>   	}
>>   
>>   	atomic_set(&cxlrd->region_id, rc);
>> +	cxld->qtg_id = CXL_QTG_ID_INVALID;
>>   	return cxlrd;
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_root_decoder_alloc, CXL);
>> @@ -1662,6 +1674,7 @@ struct cxl_switch_decoder *cxl_switch_decoder_alloc(struct cxl_port *port,
>>   
>>   	cxld = &cxlsd->cxld;
>>   	cxld->dev.type = &cxl_decoder_switch_type;
>> +	cxld->qtg_id = CXL_QTG_ID_INVALID;
>>   	return cxlsd;
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_switch_decoder_alloc, CXL);
>> @@ -1694,6 +1707,7 @@ struct cxl_endpoint_decoder *cxl_endpoint_decoder_alloc(struct cxl_port *port)
>>   	}
>>   
>>   	cxld->dev.type = &cxl_decoder_endpoint_type;
>> +	cxld->qtg_id = CXL_QTG_ID_INVALID;
>>   	return cxled;
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_endpoint_decoder_alloc, CXL);
>> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
>> index 044a92d9813e..278ab6952332 100644
>> --- a/drivers/cxl/cxl.h
>> +++ b/drivers/cxl/cxl.h
>> @@ -300,6 +300,7 @@ enum cxl_decoder_type {
>>    */
>>   #define CXL_DECODER_MAX_INTERLEAVE 16
>>   
>> +#define CXL_QTG_ID_INVALID	-1
>>   
>>   /**
>>    * struct cxl_decoder - Common CXL HDM Decoder Attributes
>> @@ -311,6 +312,7 @@ enum cxl_decoder_type {
>>    * @target_type: accelerator vs expander (type2 vs type3) selector
>>    * @region: currently assigned region for this decoder
>>    * @flags: memory type capabilities and locking
>> + * @qtg_id: QoS Throttling Group ID
>>    * @commit: device/decoder-type specific callback to commit settings to hw
>>    * @reset: device/decoder-type specific callback to reset hw settings
>>   */
>> @@ -323,6 +325,7 @@ struct cxl_decoder {
>>   	enum cxl_decoder_type target_type;
>>   	struct cxl_region *region;
>>   	unsigned long flags;
>> +	int qtg_id;
>>   	int (*commit)(struct cxl_decoder *cxld);
>>   	int (*reset)(struct cxl_decoder *cxld);
>>   };
>>
>>
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 04/23] cxl: Add common helpers for cdat parsing
  2023-04-20  9:41   ` Jonathan Cameron
@ 2023-04-20 21:05     ` Dave Jiang
  2023-04-21 16:06       ` Jonathan Cameron
  0 siblings, 1 reply; 70+ messages in thread
From: Dave Jiang @ 2023-04-20 21:05 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-acpi, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, rafael, lukas



On 4/20/23 2:41 AM, Jonathan Cameron wrote:
> On Wed, 19 Apr 2023 13:21:25 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
> 
>> Add helper functions to parse the CDAT table and provide a callback to
>> parse the sub-table. Helpers are provided for DSMAS and DSLBIS sub-table
>> parsing. The code is patterned after the ACPI table parsing helpers.
>>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>>
> A few minor things inline.   More than possible you addressed them
> in earlier versions though.
> 
> Jonathan
> 
>> ---
>> v2:
>> - Use local headers to handle LE instead of ACPI header
>> - Reduce complexity of parser function. (Jonathan)
>> - Directly access header type. (Jonathan)
>> - Simplify header ptr math. (Jonathan)
>> - Move parsed counter to the correct location. (Jonathan)
>> - Add LE to host conversion for entry length
>> ---
>>   drivers/cxl/core/Makefile |    1
>>   drivers/cxl/core/cdat.c   |  100 +++++++++++++++++++++++++++++++++++++++++++++
>>   drivers/cxl/cxlpci.h      |   29 +++++++++++++
>>   3 files changed, 130 insertions(+)
>>   create mode 100644 drivers/cxl/core/cdat.c
>>
>> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
>> index ca4ae31d8f57..867a8014b462 100644
>> --- a/drivers/cxl/core/Makefile
>> +++ b/drivers/cxl/core/Makefile
>> @@ -12,5 +12,6 @@ cxl_core-y += memdev.o
>>   cxl_core-y += mbox.o
>>   cxl_core-y += pci.o
>>   cxl_core-y += hdm.o
>> +cxl_core-y += cdat.o
>>   cxl_core-$(CONFIG_TRACING) += trace.o
>>   cxl_core-$(CONFIG_CXL_REGION) += region.o
>> diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
>> new file mode 100644
>> index 000000000000..210f4499bddb
>> --- /dev/null
>> +++ b/drivers/cxl/core/cdat.c
>> @@ -0,0 +1,100 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/* Copyright(c) 2023 Intel Corporation. All rights reserved. */
>> +#include "cxlpci.h"
>> +#include "cxl.h"
>> +
>> +static bool has_handler(struct cdat_subtable_proc *proc)
> 
> Even though they are static, I'd add a cxl_ or cdat_ prefix
> to these to make it clear they are local.

Ok I'll change to cdat_*

> 
>> +{
>> +	return proc->handler;
>> +}
>> +
>> +static int call_handler(struct cdat_subtable_proc *proc,
>> +			struct cdat_subtable_entry *ent)
>> +{
>> +	if (has_handler(proc))
> 
> Do we need to check this again? It's checked in the parse_entries code
> well before this point.
> 
> Also, if moving to checking it once, then is it worth the
> little wrapper functions?

Ok I'll call it directly and remove the wrapper.

> 
> 
>> +		return proc->handler(ent->hdr, proc->arg);
>> +	return -EINVAL;
>> +}
>> +
>> +static bool cdat_is_subtable_match(struct cdat_subtable_entry *ent)
>> +{
>> +	return ent->hdr->type == ent->type;
>> +}
>> +
>> +static int cdat_table_parse_entries(enum cdat_type type,
>> +				    struct cdat_header *table_header,
>> +				    struct cdat_subtable_proc *proc)
>> +{
>> +	unsigned long table_end, entry_len;
>> +	struct cdat_subtable_entry entry;
>> +	int count = 0;
>> +	int rc;
>> +
>> +	if (!has_handler(proc))
>> +		return -EINVAL;
>> +
>> +	table_end = (unsigned long)table_header + table_header->length;
>> +
>> +	if (type >= CDAT_TYPE_RESERVED)
>> +		return -EINVAL;
>> +
>> +	entry.type = type;
>> +	entry.hdr = (struct cdat_entry_header *)(table_header + 1);
>> +
>> +	while ((unsigned long)entry.hdr < table_end) {
>> +		entry_len = le16_to_cpu(entry.hdr->length);
>> +
>> +		if ((unsigned long)entry.hdr + entry_len > table_end)
>> +			return -EINVAL;
>> +
>> +		if (entry_len == 0)
>> +			return -EINVAL;
>> +
>> +		if (cdat_is_subtable_match(&entry)) {
>> +			rc = call_handler(proc, &entry);
>> +			if (rc)
>> +				return rc;
>> +			count++;
>> +		}
>> +
>> +		entry.hdr = (struct cdat_entry_header *)((unsigned long)entry.hdr + entry_len);
>> +	}
>> +
>> +	return count;
>> +}
> 
> ...
> 
>> +int cdat_table_parse_sslbis(struct cdat_header *table,
>> +			    cdat_tbl_entry_handler handler, void *arg)
> 
> Feels like these ones should take a typed arg.  Sure you'll lose
> that again to use the generic handling code, but at this level we can
> do it I think.

while DSMAS and DSLBIS take a list_head, SSLBIS takes an xarray. I can
create a union.

> 
>> +{
>> +	struct cdat_subtable_proc proc = {
>> +		.handler	= handler,
>> +		.arg		= arg,
>> +	};
>> +
>> +	return cdat_table_parse_entries(CDAT_TYPE_SSLBIS, table, &proc);
>> +}
> 
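
The union mentioned above might look something like this (name and layout are
hypothetical, purely to illustrate a typed arg shared by the three parsers):

	union cdat_parse_arg {
		struct list_head *list;	/* DSMAS/DSLBIS accumulation */
		struct xarray *xa;	/* SSLBIS, indexed by dport id */
	};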

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 05/23] cxl: Add callback to parse the DSMAS subtables from CDAT
  2023-04-20 11:35     ` Jonathan Cameron
@ 2023-04-20 23:25       ` Dave Jiang
  0 siblings, 0 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-20 23:25 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-acpi, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, rafael, lukas



On 4/20/23 4:35 AM, Jonathan Cameron wrote:
> On Thu, 20 Apr 2023 12:33:50 +0100
> Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
> 
>> On Wed, 19 Apr 2023 13:21:31 -0700
>> Dave Jiang <dave.jiang@intel.com> wrote:
>>
>>> Provide a callback function to the CDAT parser in order to parse the Device
>>> Scoped Memory Affinity Structure (DSMAS). Each DSMAS structure contains the
>>> DPA range and its associated attributes in each entry. See the CDAT
>>> specification for details.
>>>
>>> Coherent Device Attribute Table 1.03 2.1 Device Scoped memory Affinity
>>> Structure (DSMAS)
>>
>> I'm not sure what the purpose of this is. If it's just detecting problems
>> with the entry because we aren't interested in the content yet, then fine
>> but good to make that clear in patch intro.
>>
>> Maybe I'm missing something!
>>
> Ah. Got to next patch.  Perhaps a forwards reference to that will avoid
> anyone else wondering what is going on here!

Ok I'll clarify.

> 
>> Thanks,
>>
>> Jonathan
>>
>>>
>>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>>>
>>> ---
>>> v3:
>>> - Add spec section number. (Alison)
>>> - Remove cast from void *. (Alison)
>>> - Refactor cxl_port_probe() block. (Alison)
>>> - Move CDAT parse to cxl_endpoint_port_probe()
>>>
>>> v2:
>>> - Add DSMAS table size check. (Lukas)
>>> - Use local DSMAS header for LE handling.
>>> - Remove dsmas lock. (Jonathan)
>>> - Fix handle size (Jonathan)
>>> - Add LE to host conversion for DSMAS address and length.
>>> - Make dsmas_list local
>>
>>
>>> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
>>> index 615e0ef6b440..3022bdd52439 100644
>>> --- a/drivers/cxl/port.c
>>> +++ b/drivers/cxl/port.c
>>> @@ -57,6 +57,16 @@ static int discover_region(struct device *dev, void *root)
>>>   	return 0;
>>>   }
>>
>>>   static int cxl_switch_port_probe(struct cxl_port *port)
>>>   {
>>>   	struct cxl_hdm *cxlhdm;
>>> @@ -125,6 +135,18 @@ static int cxl_endpoint_port_probe(struct cxl_port *port)
>>>   	device_for_each_child(&port->dev, root, discover_region);
>>>   	put_device(&root->dev);
>>>   
>>> +	if (port->cdat.table) {
>>> +		LIST_HEAD(dsmas_list);
>>> +
>>> +		rc = cdat_table_parse_dsmas(port->cdat.table,
>>> +					    cxl_dsmas_parse_entry,
>>> +					    (void *)&dsmas_list);
>>> +		if (rc < 0)
>>> +			dev_warn(&port->dev, "Failed to parse DSMAS: %d\n", rc);
>>> +
>>> +		dsmas_list_destroy(&dsmas_list);
>>
>> I'm a little confused here.  What's the point?  Parse them then throw the info away?
>> Maybe a comment if all we are trying to do is warn about CDAT problems.
>>
>>
>>> +	}
>>> +
>>>   	return 0;
>>>   }
>>>   
>>>
>>>    
>>
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 06/23] cxl: Add callback to parse the DSLBIS subtable from CDAT
  2023-04-20 11:40   ` Jonathan Cameron
@ 2023-04-20 23:25     ` Dave Jiang
  0 siblings, 0 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-20 23:25 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-acpi, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, rafael, lukas



On 4/20/23 4:40 AM, Jonathan Cameron wrote:
> On Wed, 19 Apr 2023 13:21:37 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
> 
>> Provide a callback to parse the Device Scoped Latency and Bandwidth
>> Information Structure (DSLBIS) in the CDAT structures. The DSLBIS
>> contains the bandwidth and latency information that's tied to a DSMAS
>> handle. The driver will retrieve the read and write latency and
>> bandwidth associated with the DSMAS which is tied to a DPA range.
>>
>> Coherent Device Attribute Table 1.03 2.1 Device Scoped Latency and
>> Bandwidth Information Structure (DSLBIS)
>>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>>
> 
> One comment inline.
> 
>> +/* Flags for DSLBIS subtable */
>> +#define DSLBIS_MEM_MASK		GENMASK(3, 0)
>> +#define DSLBIS_MEM_MEMORY	0
>> +
>>   int devm_cxl_port_enumerate_dports(struct cxl_port *port);
>>   struct cxl_dev_state;
>>   int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
>> @@ -136,5 +164,9 @@ cdat_table_parse(dsmas);
>>   cdat_table_parse(dslbis);
>>   cdat_table_parse(sslbis);
>>   
>> -int cxl_dsmas_parse_entry(struct cdat_entry_header *header, void *arg);
>> +#define cxl_parse_entry(x) \
>> +int cxl_##x##_parse_entry(struct cdat_entry_header *header, void *arg)
> I'm not sure this is worthwhile. What was your reasoning for it?
> Also wrecks the typing of that arg argument as I suggested earlier...

I can remove the macros. They are patterned after the code in ACPI.
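
For reference, dropping the macro would just leave the two explicit
prototypes in the header (sketch only, same handler signature as above):

	int cxl_dsmas_parse_entry(struct cdat_entry_header *header, void *arg);
	int cxl_dslbis_parse_entry(struct cdat_entry_header *header, void *arg);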

> 
>> +
>> +cxl_parse_entry(dsmas);
>> +cxl_parse_entry(dslbis);
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 08/23] cxl: Add support for _DSM Function for retrieving QTG ID
  2023-04-20 12:00   ` Jonathan Cameron
@ 2023-04-21  0:11     ` Dave Jiang
  2023-04-21 16:07       ` Jonathan Cameron
  0 siblings, 1 reply; 70+ messages in thread
From: Dave Jiang @ 2023-04-21  0:11 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-acpi, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, rafael, lukas



On 4/20/23 5:00 AM, Jonathan Cameron wrote:
> On Wed, 19 Apr 2023 13:21:49 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
> 
>> CXL spec v3.0 9.17.3 CXL Root Device Specific Methods (_DSM)
>>
>> Add support to retrieve QTG ID via ACPI _DSM call. The _DSM call requires
>> an input of an ACPI package with 4 dwords (read latency, write latency,
>> read bandwidth, write bandwidth). The call returns a package with 1 WORD
>> that provides the max supported QTG ID and a package that may contain 0 or
>> more WORDs as the recommended QTG IDs in the recommended order.
>>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>>
> 
> A few minor comments inline.
> 
> 
>> +/**
>> + * cxl_acpi_evaluate_qtg_dsm - Retrieve QTG ids via ACPI _DSM
>> + * @handle: ACPI handle
>> + * @input: bandwidth and latency data
>> + *
>> + * Issue QTG _DSM with accompanied bandwidth and latency data in order to get
>> + * the QTG IDs that falls within the performance data.
> Falls within is a little vague.  Perhaps something like
> 
> the QTG IDs that are suitable for the performance point in order of most suitable
> to least suitable.

Thanks. I will update with your suggestion.

> 
>> + */
>> +struct qtg_dsm_output *cxl_acpi_evaluate_qtg_dsm(acpi_handle handle,
>> +						 struct qtg_dsm_input *input)
>> +{
>> +	union acpi_object *out_obj, *out_buf, *pkg;
>> +	union acpi_object in_buf = {
>> +		.buffer = {
>> +			.type = ACPI_TYPE_BUFFER,
>> +			.pointer = (u8 *)input,
>> +			.length = sizeof(u32) * 4,
> 
> sizeof(*input)?

ok

> 
> Also, ACPI structures are always little endian. Do we need to be careful of that
> here?

Sigh, yes. I will add in endianness handling for the patch.
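
Probably something along these lines, as a rough sketch only -- the
rd_lat/wr_lat/rd_bw/wr_bw member names are placeholders, not necessarily
what 'struct qtg_dsm_input' ends up using:

	/* Sketch: convert the four input DWORDs to LE before the _DSM call */
	struct {
		__le32 rd_lat;
		__le32 wr_lat;
		__le32 rd_bw;
		__le32 wr_bw;
	} le_input = {
		.rd_lat = cpu_to_le32(input->rd_lat),
		.wr_lat = cpu_to_le32(input->wr_lat),
		.rd_bw = cpu_to_le32(input->rd_bw),
		.wr_bw = cpu_to_le32(input->wr_bw),
	};
	union acpi_object in_buf = {
		.buffer = {
			.type = ACPI_TYPE_BUFFER,
			.pointer = (u8 *)&le_input,
			.length = sizeof(le_input),
		},
	};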

> 
>> +		},
>> +	};
>> +	union acpi_object in_obj = {
>> +		.package = {
>> +			.type = ACPI_TYPE_PACKAGE,
>> +			.count = 1,
>> +			.elements = &in_buf
>> +		},
>> +	};
>> +	struct qtg_dsm_output *output = NULL;
>> +	int len, rc, i;
>> +	u16 *max_qtg;
>> +
>> +	out_obj = acpi_evaluate_dsm(handle, &acpi_cxl_qtg_id_guid, 1, 1, &in_obj);
>> +	if (!out_obj)
>> +		return ERR_PTR(-ENXIO);
>> +
>> +	if (out_obj->type != ACPI_TYPE_PACKAGE) {
>> +		rc = -ENXIO;
>> +		goto err;
>> +	}
>> +
>> +	/* Check Max QTG ID */
>> +	pkg = &out_obj->package.elements[0];
>> +	if (pkg->type != ACPI_TYPE_BUFFER) {
>> +		rc = -ENXIO;
>> +		goto err;
>> +	}
>> +
>> +	if (pkg->buffer.length != sizeof(u16)) {
>> +		rc = -ENXIO;
>> +		goto err;
>> +	}
>> +	max_qtg = (u16 *)pkg->buffer.pointer;
>> +
>> +	/* Retrieve QTG IDs package */
>> +	pkg = &out_obj->package.elements[1];
>> +	if (pkg->type != ACPI_TYPE_PACKAGE) {
>> +		rc = -ENXIO;
>> +		goto err;
>> +	}
>> +
>> +	out_buf = &pkg->package.elements[0];
>> +	if (out_buf->type != ACPI_TYPE_BUFFER) {
>> +		rc = -ENXIO;
>> +		goto err;
>> +	}
>> +
>> +	len = out_buf->buffer.length;
>> +
>> +	/* It's legal to have 0 QTG entries */
>> +	if (len == 0)
>> +		goto out;
>> +
>> +	/* Malformed package, not multiple of WORD size */
>> +	if (len % sizeof(u16)) {
>> +		rc = -ENXIO;
>> +		goto out;
>> +	}
>> +
>> +	output = kmalloc(len + sizeof(*output), GFP_KERNEL);
>> +	if (!output) {
>> +		rc = -ENOMEM;
>> +		goto err;
>> +	}
>> +
>> +	output->nr = len / sizeof(u16);
>> +	memcpy(output->qtg_ids, out_buf->buffer.pointer, len);
>> +
>> +	for (i = 0; i < output->nr; i++) {
>> +		if (output->qtg_ids[i] > *max_qtg)
>> +			pr_warn("QTG ID %u greater than MAX %u\n",
>> +				output->qtg_ids[i], *max_qtg);
>> +	}
>> +
>> +out:
>> +	ACPI_FREE(out_obj);
>> +	return output;
>> +
>> +err:
>> +	ACPI_FREE(out_obj);
>> +	return ERR_PTR(rc);
> 
> Why not combine these with something like
> 
> 	return IS_ERR(rc) ? ERR_PTR(rc) : output;
> 
> I'm fine with leaving as it is, if this is common style for these
> sorts of ACPI functions.

I'll combine it. But if I just set output to ERR_PTR(errno) for all the 
error cases then we can just return output directly without checking?
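
i.e. roughly (sketch of the combined exit path, based on the function
above):

	struct qtg_dsm_output *output = NULL;

	out_obj = acpi_evaluate_dsm(handle, &acpi_cxl_qtg_id_guid, 1, 1, &in_obj);
	if (!out_obj)
		return ERR_PTR(-ENXIO);

	if (out_obj->type != ACPI_TYPE_PACKAGE) {
		output = ERR_PTR(-ENXIO);
		goto out;
	}

	/* ... remaining checks set output = ERR_PTR(...) and goto out ... */

out:
	ACPI_FREE(out_obj);
	/* NULL (no entries), ERR_PTR() on failure, or allocated output */
	return output;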

> 
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_acpi_evaluate_qtg_dsm, CXL);
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 04/23] cxl: Add common helpers for cdat parsing
  2023-04-20 21:05     ` Dave Jiang
@ 2023-04-21 16:06       ` Jonathan Cameron
  2023-04-21 16:12         ` Dave Jiang
  0 siblings, 1 reply; 70+ messages in thread
From: Jonathan Cameron @ 2023-04-21 16:06 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-acpi, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, rafael, lukas


> >> +int cdat_table_parse_sslbis(struct cdat_header *table,
> >> +			    cdat_tbl_entry_handler handler, void *arg)  
> > 
> >> Feels like these ones should take a typed arg.  Sure you'll lose
> > that again to use the generic handling code, but at this level we can
> > do it I think.  
> 
> while DSMAS and DSLBIS takes a list_head, SSLBIS takes an xarray. I can 
> create a union.

I don't understand why. If you drop the macro usage introduced in
a later patch you can just have each one take the right thing.
That macro isn't a huge saving anyway.

Jonathan

> 
> >   
> >> +{
> >> +	struct cdat_subtable_proc proc = {
> >> +		.handler	= handler,
> >> +		.arg		= arg,
> >> +	};
> >> +
> >> +	return cdat_table_parse_entries(CDAT_TYPE_SSLBIS, table, &proc);
> >> +}  
> >   
> 


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 08/23] cxl: Add support for _DSM Function for retrieving QTG ID
  2023-04-21  0:11     ` Dave Jiang
@ 2023-04-21 16:07       ` Jonathan Cameron
  0 siblings, 0 replies; 70+ messages in thread
From: Jonathan Cameron @ 2023-04-21 16:07 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-acpi, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, rafael, lukas

> >> +	}
> >> +
> >> +out:
> >> +	ACPI_FREE(out_obj);
> >> +	return output;
> >> +
> >> +err:
> >> +	ACPI_FREE(out_obj);
> >> +	return ERR_PTR(rc);  
> > 
> > Why not combine these with something like
> > 
> > 	return IS_ERR(rc) ? ERR_PTR(rc) : output;
> > 
> > I'm fine with leaving as it is, if this is common style for these
> > sorts of ACPI functions.  
> 
> I'll combine it. But if I just set output to ERR_PTR(errno) for all the 
> error cases then we can just return output directly without checking?

Excellent point.

> 
> >   
> >> +}
> >> +EXPORT_SYMBOL_NS_GPL(cxl_acpi_evaluate_qtg_dsm, CXL);  
> >   


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 04/23] cxl: Add common helpers for cdat parsing
  2023-04-21 16:06       ` Jonathan Cameron
@ 2023-04-21 16:12         ` Dave Jiang
  0 siblings, 0 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-21 16:12 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-acpi, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, rafael, lukas



On 4/21/23 9:06 AM, Jonathan Cameron wrote:
> 
>>>> +int cdat_table_parse_sslbis(struct cdat_header *table,
>>>> +			    cdat_tbl_entry_handler handler, void *arg)
>>>
>>> Feels like these ones should take a typed arg.  Sure you'll lose
>>> that again to use the generic handling code, but at this level we can
>>> do it I think.
>>
>> while DSMAS and DSLBIS takes a list_head, SSLBIS takes an xarray. I can
>> create a union.
> 
> I don't understand why. If you drop the macro usage introduced in
> a later patch you can just have each one take the right thing.
> That macro isn't a huge saving anyway.

Oh, I think I understand what you are getting at. Ok, I'll update.
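
So, roughly, the entry points would become (prototype sketch only; they
would still funnel into the common void * based helper internally):

	int cdat_table_parse_dsmas(struct cdat_header *table,
				   cdat_tbl_entry_handler handler,
				   struct list_head *dsmas_list);
	int cdat_table_parse_dslbis(struct cdat_header *table,
				    cdat_tbl_entry_handler handler,
				    struct list_head *dsmas_list);
	int cdat_table_parse_sslbis(struct cdat_header *table,
				    cdat_tbl_entry_handler handler,
				    struct xarray *sslbis_xa);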
> 
> Jonathan
> 
>>
>>>    
>>>> +{
>>>> +	struct cdat_subtable_proc proc = {
>>>> +		.handler	= handler,
>>>> +		.arg		= arg,
>>>> +	};
>>>> +
>>>> +	return cdat_table_parse_entries(CDAT_TYPE_SSLBIS, table, &proc);
>>>> +}
>>>    
>>
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 09/23] cxl: Add helper function to retrieve ACPI handle of CXL root device
  2023-04-20 12:06   ` Jonathan Cameron
@ 2023-04-21 23:24     ` Dave Jiang
  0 siblings, 0 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-21 23:24 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-acpi, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, rafael, lukas



On 4/20/23 5:06 AM, Jonathan Cameron wrote:
> On Wed, 19 Apr 2023 13:21:55 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
> 
>> Provide a helper to find the ACPI0017 device in order to issue the _DSM.
>> The helper will take the 'struct device' from a cxl_port and iterate until
>> the root device is reached. The ACPI handle will be returned from the root
>> device.
>>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> 
> Question inline.  If the answer is no then this looks fine to me.
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> 
>> +/**
>> + * cxl_acpi_get_rootdev_handle - get the ACPI handle of the CXL root device
>> + * @dev: 'struct device' to start searching from. Should be from cxl_port->dev.
>> + *
>> + * Return: acpi_handle on success, errptr of errno on error.
>> + *
>> + * Looks for the ACPI0017 device and returns the ACPI handle
>> + **/
> 
> Could we implement this in terms of find_cxl_root()?  I think that will
> end up giving you the same device though I haven't tested it.

Yes I can simplify this. Thanks.
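
Something like the sketch below, assuming find_cxl_root() can take the
endpoint's struct device here and returns the root cxl_port with a
reference held, and that the root port's uport is the ACPI0017 device
(both assumptions worth double-checking):

	acpi_handle cxl_acpi_get_rootdev_handle(struct device *dev)
	{
		struct cxl_port *root_port;
		acpi_handle handle;

		if (!dev)
			return ERR_PTR(-EINVAL);

		root_port = find_cxl_root(dev);
		if (!root_port)
			return ERR_PTR(-ENODEV);

		handle = ACPI_HANDLE(root_port->uport);
		put_device(&root_port->dev);	/* assumes find_cxl_root() took a ref */
		if (!handle)
			return ERR_PTR(-ENODEV);

		return handle;
	}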

> 
>> +acpi_handle cxl_acpi_get_rootdev_handle(struct device *dev)
>> +{
>> +	struct device *itr = dev;
>> +	struct device *root_dev;
>> +	acpi_handle handle;
>> +
>> +	if (!dev)
>> +		return ERR_PTR(-EINVAL);
>> +
>> +	while (itr->parent) {
>> +		root_dev = itr;
>> +		itr = itr->parent;
>> +	}
>> +
>> +	if (!dev_is_platform(root_dev))
>> +		return ERR_PTR(-ENODEV);
>> +
>> +	handle = ACPI_HANDLE(root_dev);
>> +	if (!handle)
>> +		return ERR_PTR(-ENODEV);
>> +
>> +	return handle;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_acpi_get_rootdev_handle, CXL);
> 
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 11/23] cxl: Add helper function that calculates QoS values for switches
  2023-04-20 12:26   ` Jonathan Cameron
@ 2023-04-24 17:09     ` Dave Jiang
  2023-04-24 17:31       ` Dave Jiang
  0 siblings, 1 reply; 70+ messages in thread
From: Dave Jiang @ 2023-04-24 17:09 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-acpi, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, rafael, lukas



On 4/20/23 5:26 AM, Jonathan Cameron wrote:
> On Wed, 19 Apr 2023 13:22:07 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
> 
>> The CDAT information from the switch, Switch Scoped Latency and Bandwidth
>> Information Structure (SSLBIS), is parsed and stored in an xarray under the
>> cxl_port. The QoS data are indexed by the downstream port id.  Walk the CXL
>> ports from endpoint to root and retrieve the relevant QoS information
>> (bandwidth and latency) that are from the switch CDAT. If read or write QoS
>> values are not available, then use the access QoS value.
> 
> I'd drop the access reference.  You already did that mapping from access to read
> and write in earlier patch. Now we have no concept of access so mentioning
> it will only potentially cause confusion.

ok

> 
>>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>>
>> ---
>> v3:
>> - Move to use 'struct node_hmem_attrs'
>> ---
>>   drivers/cxl/core/port.c |   81 +++++++++++++++++++++++++++++++++++++++++++++++
>>   drivers/cxl/cxl.h       |    2 +
>>   2 files changed, 83 insertions(+)
>>
>> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
>> index 3fedbabac1af..770b540d5325 100644
>> --- a/drivers/cxl/core/port.c
>> +++ b/drivers/cxl/core/port.c
>> @@ -1921,6 +1921,87 @@ bool schedule_cxl_memdev_detach(struct cxl_memdev *cxlmd)
>>   }
>>   EXPORT_SYMBOL_NS_GPL(schedule_cxl_memdev_detach, CXL);
>>   
>> +/**
>> + * cxl_port_get_switch_qos - retrieve QoS data for CXL switches
> 
>> Hmm. Terminology-wise, this isn't called QoS data in either the CXL spec
> or the HMAT stuff it came from.  I'd avoid that term here.
> Might also get confused with the QoS telemetry stuff from the CXL
> spec which is totally different or the QoS controls on an MLD
> which are perhaps indirectly related to these.
> 
> QoS only gets involved once these are mapped to a QTG - assumption
> being that a given QoS policy should apply to devices of similar access
> characteristics.

locality_info?


> 
> Other than that bikeshedding.
> 
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> 
> 
>> + * @port: endpoint cxl_port
>> + * @rd_bw: writeback value for min read bandwidth
>> + * @rd_lat: writeback value for total read latency
>> + * @wr_bw: writeback value for min write bandwidth
>> + * @wr_lat: writeback value for total write latency
>> + *
>> + * Return: Errno on failure, 0 on success. -ENOENT if no switch device
>> + */
>> +int cxl_port_get_switch_qos(struct cxl_port *port, u64 *rd_bw, u64 *rd_lat,
>> +			    u64 *wr_bw, u64 *wr_lat)
>> +{
>> +	u64 min_rd_bw = ULONG_MAX;
>> +	u64 min_wr_bw = ULONG_MAX;
>> +	struct cxl_dport *dport;
>> +	struct cxl_port *nport;
>> +	u64 total_rd_lat = 0;
>> +	u64 total_wr_lat = 0;
>> +	struct device *next;
>> +	int switches = 0;
>> +	int rc = 0;
>> +
>> +	if (!is_cxl_endpoint(port))
>> +		return -EINVAL;
>> +
>> +	/* Skip the endpoint */
>> +	next = port->dev.parent;
>> +	nport = to_cxl_port(next);
>> +	dport = port->parent_dport;
>> +
>> +	do {
>> +		struct node_hmem_attrs *hmem_attrs;
>> +		u64 lat, bw;
>> +
>> +		if (!nport->cdat.table)
>> +			break;
>> +
>> +		if (!dev_is_pci(dport->dport))
>> +			break;
>> +
>> +		hmem_attrs = xa_load(&nport->cdat.sslbis_xa, dport->port_id);
>> +		if (xa_is_err(hmem_attrs))
>> +			return xa_err(hmem_attrs);
>> +
>> +		if (!hmem_attrs) {
>> +			hmem_attrs = xa_load(&nport->cdat.sslbis_xa, SSLBIS_ANY_PORT);
>> +			if (xa_is_err(hmem_attrs))
>> +				return xa_err(hmem_attrs);
>> +			if (!hmem_attrs)
>> +				return -ENXIO;
>> +		}
>> +
>> +		bw = hmem_attrs->write_bandwidth;
>> +		lat = hmem_attrs->write_latency;
>> +		min_wr_bw = min_t(u64, min_wr_bw, bw);
>> +		total_wr_lat += lat;
>> +
>> +		bw = hmem_attrs->read_bandwidth;
>> +		lat = hmem_attrs->read_latency;
>> +		min_rd_bw = min_t(u64, min_rd_bw, bw);
>> +		total_rd_lat += lat;
>> +
>> +		dport = nport->parent_dport;
>> +		next = next->parent;
>> +		nport = to_cxl_port(next);
>> +		switches++;
>> +	} while (next);
>> +
>> +	*wr_bw = min_wr_bw;
>> +	*wr_lat = total_wr_lat;
>> +	*rd_bw = min_rd_bw;
>> +	*rd_lat = total_rd_lat;
>> +
>> +	if (!switches)
>> +		return -ENOENT;
>> +
>> +	return rc;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_port_get_switch_qos, CXL);
> 
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 11/23] cxl: Add helper function that calculates QoS values for switches
  2023-04-24 17:09     ` Dave Jiang
@ 2023-04-24 17:31       ` Dave Jiang
  2023-04-24 21:59         ` Jonathan Cameron
  0 siblings, 1 reply; 70+ messages in thread
From: Dave Jiang @ 2023-04-24 17:31 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-acpi, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, rafael, lukas



On 4/24/23 10:09 AM, Dave Jiang wrote:
> 
> 
> On 4/20/23 5:26 AM, Jonathan Cameron wrote:
>> On Wed, 19 Apr 2023 13:22:07 -0700
>> Dave Jiang <dave.jiang@intel.com> wrote:
>>
>>> The CDAT information from the switch, Switch Scoped Latency and 
>>> Bandwidth
>>> Information Structure (SSLBIS), is parsed and stored in an xarray 
>>> under the
>>> cxl_port. The QoS data are indexed by the downstream port id.  Walk 
>>> the CXL
>>> ports from endpoint to root and retrieve the relevant QoS information
>>> (bandwidth and latency) that are from the switch CDAT. If read or 
>>> write QoS
>>> values are not available, then use the access QoS value.
>>
>> I'd drop the access reference.  You already did that mapping from 
>> access to read
>> and write in earlier patch. Now we have no concept of access so 
>> mentioning
>> it will only potentially cause confusion.
> 
> ok
> 
>>
>>>
>>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>>>
>>> ---
>>> v3:
>>> - Move to use 'struct node_hmem_attrs'
>>> ---
>>>   drivers/cxl/core/port.c |   81 
>>> +++++++++++++++++++++++++++++++++++++++++++++++
>>>   drivers/cxl/cxl.h       |    2 +
>>>   2 files changed, 83 insertions(+)
>>>
>>> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
>>> index 3fedbabac1af..770b540d5325 100644
>>> --- a/drivers/cxl/core/port.c
>>> +++ b/drivers/cxl/core/port.c
>>> @@ -1921,6 +1921,87 @@ bool schedule_cxl_memdev_detach(struct 
>>> cxl_memdev *cxlmd)
>>>   }
>>>   EXPORT_SYMBOL_NS_GPL(schedule_cxl_memdev_detach, CXL);
>>> +/**
>>> + * cxl_port_get_switch_qos - retrieve QoS data for CXL switches
>>
>> Hmm. Terminology-wise, this isn't called QoS data in either the CXL spec
>> or the HMAT stuff it came from.  I'd avoid that term here.
>> Might also get confused with the QoS telemetry stuff from the CXL
>> spec which is totally different or the QoS controls on an MLD
>> which are perhaps indirectly related to these.
>>
>> QoS only gets involved once these are mapped to a QTG - assumption
>> being that a given QoS policy should apply to devices of similar access
>> characteristics.
> 
> locality_info?

Or perf_data in accordance with this doc:
https://www.kernel.org/doc/html/v5.9/admin-guide/mm/numaperf.html

> 
> 
>>
>> Other than that bikeshedding.
>>
>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>>
>>
>>
>>> + * @port: endpoint cxl_port
>>> + * @rd_bw: writeback value for min read bandwidth
>>> + * @rd_lat: writeback value for total read latency
>>> + * @wr_bw: writeback value for min write bandwidth
>>> + * @wr_lat: writeback value for total write latency
>>> + *
>>> + * Return: Errno on failure, 0 on success. -ENOENT if no switch device
>>> + */
>>> +int cxl_port_get_switch_qos(struct cxl_port *port, u64 *rd_bw, u64 
>>> *rd_lat,
>>> +                u64 *wr_bw, u64 *wr_lat)
>>> +{
>>> +    u64 min_rd_bw = ULONG_MAX;
>>> +    u64 min_wr_bw = ULONG_MAX;
>>> +    struct cxl_dport *dport;
>>> +    struct cxl_port *nport;
>>> +    u64 total_rd_lat = 0;
>>> +    u64 total_wr_lat = 0;
>>> +    struct device *next;
>>> +    int switches = 0;
>>> +    int rc = 0;
>>> +
>>> +    if (!is_cxl_endpoint(port))
>>> +        return -EINVAL;
>>> +
>>> +    /* Skip the endpoint */
>>> +    next = port->dev.parent;
>>> +    nport = to_cxl_port(next);
>>> +    dport = port->parent_dport;
>>> +
>>> +    do {
>>> +        struct node_hmem_attrs *hmem_attrs;
>>> +        u64 lat, bw;
>>> +
>>> +        if (!nport->cdat.table)
>>> +            break;
>>> +
>>> +        if (!dev_is_pci(dport->dport))
>>> +            break;
>>> +
>>> +        hmem_attrs = xa_load(&nport->cdat.sslbis_xa, dport->port_id);
>>> +        if (xa_is_err(hmem_attrs))
>>> +            return xa_err(hmem_attrs);
>>> +
>>> +        if (!hmem_attrs) {
>>> +            hmem_attrs = xa_load(&nport->cdat.sslbis_xa, 
>>> SSLBIS_ANY_PORT);
>>> +            if (xa_is_err(hmem_attrs))
>>> +                return xa_err(hmem_attrs);
>>> +            if (!hmem_attrs)
>>> +                return -ENXIO;
>>> +        }
>>> +
>>> +        bw = hmem_attrs->write_bandwidth;
>>> +        lat = hmem_attrs->write_latency;
>>> +        min_wr_bw = min_t(u64, min_wr_bw, bw);
>>> +        total_wr_lat += lat;
>>> +
>>> +        bw = hmem_attrs->read_bandwidth;
>>> +        lat = hmem_attrs->read_latency;
>>> +        min_rd_bw = min_t(u64, min_rd_bw, bw);
>>> +        total_rd_lat += lat;
>>> +
>>> +        dport = nport->parent_dport;
>>> +        next = next->parent;
>>> +        nport = to_cxl_port(next);
>>> +        switches++;
>>> +    } while (next);
>>> +
>>> +    *wr_bw = min_wr_bw;
>>> +    *wr_lat = total_wr_lat;
>>> +    *rd_bw = min_rd_bw;
>>> +    *rd_lat = total_rd_lat;
>>> +
>>> +    if (!switches)
>>> +        return -ENOENT;
>>> +
>>> +    return rc;
>>> +}
>>> +EXPORT_SYMBOL_NS_GPL(cxl_port_get_switch_qos, CXL);
>>
>>

^ permalink raw reply	[flat|nested] 70+ messages in thread

* RE: [PATCH v4 01/23] cxl: Export QTG ids from CFMWS to sysfs
  2023-04-19 20:21 ` [PATCH v4 01/23] cxl: Export QTG ids from CFMWS to sysfs Dave Jiang
  2023-04-20  8:51   ` Jonathan Cameron
@ 2023-04-24 21:46   ` Dan Williams
  2023-04-26 23:14     ` Dave Jiang
  1 sibling, 1 reply; 70+ messages in thread
From: Dan Williams @ 2023-04-24 21:46 UTC (permalink / raw)
  To: Dave Jiang, linux-cxl, linux-acpi
  Cc: Ira Weiny, dan.j.williams, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

Dave Jiang wrote:
> Export the QoS Throttling Group ID from the CXL Fixed Memory Window
> Structure (CFMWS) under the root decoder sysfs attributes.
> CXL rev3.0 9.17.1.3 CXL Fixed Memory Window Structure (CFMWS)
> 
> cxl cli will use this QTG ID to match with the _DSM retrieved QTG ID for a
> hot-plugged CXL memory device DPA memory range to make sure that the DPA range
> is under the right CFMWS window.
> 
> Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> 
> ---
> v4:
> - Change kernel version for documentation to v6.5
> v2:
> - Add explanation commit header (Jonathan)
> ---
>  Documentation/ABI/testing/sysfs-bus-cxl |    9 +++++++++
>  drivers/cxl/acpi.c                      |    3 +++
>  drivers/cxl/core/port.c                 |   14 ++++++++++++++
>  drivers/cxl/cxl.h                       |    3 +++
>  4 files changed, 29 insertions(+)
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
> index 3acf2f17a73f..bd2b59784979 100644
> --- a/Documentation/ABI/testing/sysfs-bus-cxl
> +++ b/Documentation/ABI/testing/sysfs-bus-cxl
> @@ -309,6 +309,15 @@ Description:
>  		(WO) Write a string in the form 'regionZ' to delete that region,
>  		provided it is currently idle / not bound to a driver.
>  
> +What:		/sys/bus/cxl/devices/decoderX.Y/qtg_id
> +Date:		Jan, 2023
> +KernelVersion:	v6.5
> +Contact:	linux-cxl@vger.kernel.org
> +Description:
> +		(RO) Shows the QoS Throttling Group ID. The QTG ID for a root
> +		decoder comes from the CFMWS structure of the CEDT. A value of
> +		-1 indicates that no QTG ID was retrieved. The QTG ID is used as
> +		guidance to match against the QTG ID of a hot-plugged device.

For user documentation I do not expect someone to know the relevance
of those ACPI table names. Also, looking at this from a future-proofing
perspective, even though there is yet to be a non-ACPI CXL host
definition, I do not want to tie ourselves to ACPI-specific terms here.

The CXL generic concept here is a "class" as defined in CXL 3.0 3.3.4
QoS Telemetry for Memory, and that mentions an optional platform
facility to group memory regions by their performance.  So QTG-ID is an
ACPI.CFMWS-specific response to that CXL QoS class and grouping
concept. See CXL 3.0 3.3.4.3 Memory Device Support for QoS Telemetry for
its usage of "class".

So let's call the user-facing attribute a "qos_class". Then the
description can be something like the below. Note that I call it a
"cookie" since the value has no meaning besides just an id for
matching purposes.

---

What:		/sys/bus/cxl/devices/decoderX.Y/qos_class
Description:
		(RO) For CXL host platforms that support "QoS Telemetry" this
		root-decoder-only attribute conveys a platform specific cookie
		that identifies a QoS performance class for the CXL Window.
		This class-id can be compared against a similar "qos_class"
		published for each memory-type that an endpoint supports. While
		it is not required that endpoints map their local memory-class
		to a matching platform class, mismatches are not recommended and
		there are platform specific side-effects that may result.

>  
>  What:		/sys/bus/cxl/devices/regionZ/uuid
>  Date:		May, 2022
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index 7e1765b09e04..abc24137c291 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -289,6 +289,9 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
>  			}
>  		}
>  	}
> +
> +	cxld->qtg_id = cfmws->qtg_id;
> +
>  	rc = cxl_decoder_add(cxld, target_map);
>  err_xormap:
>  	if (rc)
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 4d1f9c5b5029..024d4178f557 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -276,6 +276,16 @@ static ssize_t interleave_ways_show(struct device *dev,
>  
>  static DEVICE_ATTR_RO(interleave_ways);
>  
> +static ssize_t qtg_id_show(struct device *dev,
> +			   struct device_attribute *attr, char *buf)
> +{
> +	struct cxl_decoder *cxld = to_cxl_decoder(dev);
> +
> +	return sysfs_emit(buf, "%d\n", cxld->qtg_id);
> +}
> +
> +static DEVICE_ATTR_RO(qtg_id);
> +
>  static struct attribute *cxl_decoder_base_attrs[] = {
>  	&dev_attr_start.attr,
>  	&dev_attr_size.attr,
> @@ -295,6 +305,7 @@ static struct attribute *cxl_decoder_root_attrs[] = {
>  	&dev_attr_cap_type2.attr,
>  	&dev_attr_cap_type3.attr,
>  	&dev_attr_target_list.attr,
> +	&dev_attr_qtg_id.attr,
>  	SET_CXL_REGION_ATTR(create_pmem_region)
>  	SET_CXL_REGION_ATTR(create_ram_region)
>  	SET_CXL_REGION_ATTR(delete_region)
> @@ -1625,6 +1636,7 @@ struct cxl_root_decoder *cxl_root_decoder_alloc(struct cxl_port *port,
>  	}
>  
>  	atomic_set(&cxlrd->region_id, rc);
> +	cxld->qtg_id = CXL_QTG_ID_INVALID;

If qtg_id needs to stay in 'struct cxl_decoder' why not move this to
cxl_decoder_init() and do it once?

>  	return cxlrd;
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_root_decoder_alloc, CXL);
> @@ -1662,6 +1674,7 @@ struct cxl_switch_decoder *cxl_switch_decoder_alloc(struct cxl_port *port,
>  
>  	cxld = &cxlsd->cxld;
>  	cxld->dev.type = &cxl_decoder_switch_type;
> +	cxld->qtg_id = CXL_QTG_ID_INVALID;
>  	return cxlsd;
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_switch_decoder_alloc, CXL);
> @@ -1694,6 +1707,7 @@ struct cxl_endpoint_decoder *cxl_endpoint_decoder_alloc(struct cxl_port *port)
>  	}
>  
>  	cxld->dev.type = &cxl_decoder_endpoint_type;
> +	cxld->qtg_id = CXL_QTG_ID_INVALID;
>  	return cxled;
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_endpoint_decoder_alloc, CXL);
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 044a92d9813e..278ab6952332 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -300,6 +300,7 @@ enum cxl_decoder_type {
>   */
>  #define CXL_DECODER_MAX_INTERLEAVE 16
>  
> +#define CXL_QTG_ID_INVALID	-1
>  
>  /**
>   * struct cxl_decoder - Common CXL HDM Decoder Attributes
> @@ -311,6 +312,7 @@ enum cxl_decoder_type {
>   * @target_type: accelerator vs expander (type2 vs type3) selector
>   * @region: currently assigned region for this decoder
>   * @flags: memory type capabilities and locking
> + * @qtg_id: QoS Throttling Group ID
>   * @commit: device/decoder-type specific callback to commit settings to hw
>   * @reset: device/decoder-type specific callback to reset hw settings
>  */
> @@ -323,6 +325,7 @@ struct cxl_decoder {
>  	enum cxl_decoder_type target_type;
>  	struct cxl_region *region;
>  	unsigned long flags;
> +	int qtg_id;

Why not just keep this limited to 'struct cxl_root_decoder'?

>  	int (*commit)(struct cxl_decoder *cxld);
>  	int (*reset)(struct cxl_decoder *cxld);
>  };
> 
> 



^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 11/23] cxl: Add helper function that calculates QoS values for switches
  2023-04-24 17:31       ` Dave Jiang
@ 2023-04-24 21:59         ` Jonathan Cameron
  0 siblings, 0 replies; 70+ messages in thread
From: Jonathan Cameron @ 2023-04-24 21:59 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-acpi, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, rafael, lukas

On Mon, 24 Apr 2023 10:31:02 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> On 4/24/23 10:09 AM, Dave Jiang wrote:
> > 
> > 
> > On 4/20/23 5:26 AM, Jonathan Cameron wrote:  
> >> On Wed, 19 Apr 2023 13:22:07 -0700
> >> Dave Jiang <dave.jiang@intel.com> wrote:
> >>  
> >>> The CDAT information from the switch, Switch Scoped Latency and 
> >>> Bandwidth
> >>> Information Structure (SSLBIS), is parsed and stored in an xarray 
> >>> under the
> >>> cxl_port. The QoS data are indexed by the downstream port id.  Walk 
> >>> the CXL
> >>> ports from endpoint to root and retrieve the relevant QoS information
> >>> (bandwidth and latency) that are from the switch CDAT. If read or 
> >>> write QoS
> >>> values are not available, then use the access QoS value.  
> >>
> >> I'd drop the access reference.  You already did that mapping from 
> >> access to read
> >> and write in earlier patch. Now we have no concept of access so 
> >> mentioning
> >> it will only potentially cause confusion.  
> > 
> > ok
> >   
> >>  
> >>>
> >>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> >>>
> >>> ---
> >>> v3:
> >>> - Move to use 'struct node_hmem_attrs'
> >>> ---
> >>>   drivers/cxl/core/port.c |   81 
> >>> +++++++++++++++++++++++++++++++++++++++++++++++
> >>>   drivers/cxl/cxl.h       |    2 +
> >>>   2 files changed, 83 insertions(+)
> >>>
> >>> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> >>> index 3fedbabac1af..770b540d5325 100644
> >>> --- a/drivers/cxl/core/port.c
> >>> +++ b/drivers/cxl/core/port.c
> >>> @@ -1921,6 +1921,87 @@ bool schedule_cxl_memdev_detach(struct 
> >>> cxl_memdev *cxlmd)
> >>>   }
> >>>   EXPORT_SYMBOL_NS_GPL(schedule_cxl_memdev_detach, CXL);
> >>> +/**
> >>> + * cxl_port_get_switch_qos - retrieve QoS data for CXL switches  
> >>
> >> Hmm. Terminology-wise, this isn't called QoS data in either the CXL spec
> >> or the HMAT stuff it came from.  I'd avoid that term here.
> >> Might also get confused with the QoS telemetry stuff from the CXL
> >> spec which is totally different or the QoS controls on an MLD
> >> which are perhaps indirectly related to these.
> >>
> >> QoS only gets involved once these are mapped to a QTG - assumption
> >> being that a given QoS policy should apply to devices of similar access
> >> characteristics.  
> > 
> > locality_info?  
> 
> Or perf_data in accordance with this doc:
> https://www.kernel.org/doc/html/v5.9/admin-guide/mm/numaperf.html

Either would be fine as far as I'm concerned.
I guess perf_data puts less of a spin on what it 'means' than
locality_info - so I'd slightly prefer that.

Jonathan

> 
> > 
> >   
> >>
> >> Other than that bikeshedding.
> >>
> >> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> >>
> >>
> >>  
> >>> + * @port: endpoint cxl_port
> >>> + * @rd_bw: writeback value for min read bandwidth
> >>> + * @rd_lat: writeback value for total read latency
> >>> + * @wr_bw: writeback value for min write bandwidth
> >>> + * @wr_lat: writeback value for total write latency
> >>> + *
> >>> + * Return: Errno on failure, 0 on success. -ENOENT if no switch device
> >>> + */
> >>> +int cxl_port_get_switch_qos(struct cxl_port *port, u64 *rd_bw, u64 
> >>> *rd_lat,
> >>> +                u64 *wr_bw, u64 *wr_lat)
> >>> +{
> >>> +    u64 min_rd_bw = ULONG_MAX;
> >>> +    u64 min_wr_bw = ULONG_MAX;
> >>> +    struct cxl_dport *dport;
> >>> +    struct cxl_port *nport;
> >>> +    u64 total_rd_lat = 0;
> >>> +    u64 total_wr_lat = 0;
> >>> +    struct device *next;
> >>> +    int switches = 0;
> >>> +    int rc = 0;
> >>> +
> >>> +    if (!is_cxl_endpoint(port))
> >>> +        return -EINVAL;
> >>> +
> >>> +    /* Skip the endpoint */
> >>> +    next = port->dev.parent;
> >>> +    nport = to_cxl_port(next);
> >>> +    dport = port->parent_dport;
> >>> +
> >>> +    do {
> >>> +        struct node_hmem_attrs *hmem_attrs;
> >>> +        u64 lat, bw;
> >>> +
> >>> +        if (!nport->cdat.table)
> >>> +            break;
> >>> +
> >>> +        if (!dev_is_pci(dport->dport))
> >>> +            break;
> >>> +
> >>> +        hmem_attrs = xa_load(&nport->cdat.sslbis_xa, dport->port_id);
> >>> +        if (xa_is_err(hmem_attrs))
> >>> +            return xa_err(hmem_attrs);
> >>> +
> >>> +        if (!hmem_attrs) {
> >>> +            hmem_attrs = xa_load(&nport->cdat.sslbis_xa, 
> >>> SSLBIS_ANY_PORT);
> >>> +            if (xa_is_err(hmem_attrs))
> >>> +                return xa_err(hmem_attrs);
> >>> +            if (!hmem_attrs)
> >>> +                return -ENXIO;
> >>> +        }
> >>> +
> >>> +        bw = hmem_attrs->write_bandwidth;
> >>> +        lat = hmem_attrs->write_latency;
> >>> +        min_wr_bw = min_t(u64, min_wr_bw, bw);
> >>> +        total_wr_lat += lat;
> >>> +
> >>> +        bw = hmem_attrs->read_bandwidth;
> >>> +        lat = hmem_attrs->read_latency;
> >>> +        min_rd_bw = min_t(u64, min_rd_bw, bw);
> >>> +        total_rd_lat += lat;
> >>> +
> >>> +        dport = nport->parent_dport;
> >>> +        next = next->parent;
> >>> +        nport = to_cxl_port(next);
> >>> +        switches++;
> >>> +    } while (next);
> >>> +
> >>> +    *wr_bw = min_wr_bw;
> >>> +    *wr_lat = total_wr_lat;
> >>> +    *rd_bw = min_rd_bw;
> >>> +    *rd_lat = total_rd_lat;
> >>> +
> >>> +    if (!switches)
> >>> +        return -ENOENT;
> >>> +
> >>> +    return rc;
> >>> +}
> >>> +EXPORT_SYMBOL_NS_GPL(cxl_port_get_switch_qos, CXL);  
> >>
> >>  


^ permalink raw reply	[flat|nested] 70+ messages in thread

* RE: [PATCH v4 02/23] cxl: Add checksum verification to CDAT from CXL
  2023-04-19 20:21 ` [PATCH v4 02/23] cxl: Add checksum verification to CDAT from CXL Dave Jiang
  2023-04-20  8:55   ` Jonathan Cameron
@ 2023-04-24 22:01   ` Dan Williams
  2023-04-26 23:24     ` Dave Jiang
  1 sibling, 1 reply; 70+ messages in thread
From: Dan Williams @ 2023-04-24 22:01 UTC (permalink / raw)
  To: Dave Jiang, linux-cxl, linux-acpi
  Cc: Ira Weiny, dan.j.williams, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

Dave Jiang wrote:
> A CDAT table is available from a CXL device. The table is read by the
> driver and cached in software. With the CXL subsystem needing to parse the
> CDAT table, the checksum should be verified. Add checksum verification
> after the CDAT table is read from device.
> 
> Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> 
> ---
> v3:
> - Just return the final sum. (Alison)
> v2:
> - Drop ACPI checksum export and just use local verification. (Dan)
> ---
>  drivers/cxl/core/pci.c |   16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index 25b7e8125d5d..9c7e2f69d9ca 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -528,6 +528,16 @@ static int cxl_cdat_read_table(struct device *dev,
>  	return 0;
>  }
>  
> +static unsigned char cdat_checksum(void *buf, size_t size)
> +{
> +	unsigned char sum, *data = buf;
> +	size_t i;
> +
> +	for (sum = 0, i = 0; i < size; i++)
> +		sum += data[i];
> +	return sum;
> +}
> +
>  /**
>   * read_cdat_data - Read the CDAT data on this port
>   * @port: Port to read data from
> @@ -573,6 +583,12 @@ void read_cdat_data(struct cxl_port *port)
>  	}
>  
>  	port->cdat.table = cdat_table + sizeof(__le32);
> +	if (cdat_checksum(port->cdat.table, cdat_length)) {
> +		/* Don't leave table data allocated on error */
> +		devm_kfree(dev, cdat_table);
> +		dev_err(dev, "CDAT data checksum error\n");
> +	}
> +
>  	port->cdat.length = cdat_length;

I think read_cdat_data() is confused about error cases. I note that
/sys/firmware/acpi/tables does not emit the entry if the table has bad
length or bad checksum. If you want to have a debug mode then maybe make
it a compile time option, but I otherwise do not see the benefit of
publishing known bad tables to userspace.
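
In other words something like the sketch below, based on the hunk above,
so a bad table is never published:

	if (cdat_checksum(cdat_table + sizeof(__le32), cdat_length)) {
		dev_err(dev, "CDAT data checksum error\n");
		devm_kfree(dev, cdat_table);
		return;	/* leave port->cdat empty, nothing for userspace */
	}

	port->cdat.table = cdat_table + sizeof(__le32);
	port->cdat.length = cdat_length;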

^ permalink raw reply	[flat|nested] 70+ messages in thread

* RE: [PATCH v4 03/23] cxl: Add support for reading CXL switch CDAT table
  2023-04-19 20:21 ` [PATCH v4 03/23] cxl: Add support for reading CXL switch CDAT table Dave Jiang
  2023-04-20  9:25   ` Jonathan Cameron
@ 2023-04-24 22:08   ` Dan Williams
  2023-04-27 15:55     ` Dave Jiang
  1 sibling, 1 reply; 70+ messages in thread
From: Dan Williams @ 2023-04-24 22:08 UTC (permalink / raw)
  To: Dave Jiang, linux-cxl, linux-acpi
  Cc: Ira Weiny, dan.j.williams, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

Dave Jiang wrote:
> Move read_cdat_data() from endpoint probe to general port probe to
> allow reading of CDAT data for CXL switches as well as CXL device.
> Add wrapper support for cxl_test to bypass the cdat reading.
> 
> Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> 
> ---
> v4:
> - Remove cxl_test wrapper. (Ira)
> ---
>  drivers/cxl/core/pci.c |   20 +++++++++++++++-----
>  drivers/cxl/port.c     |    6 +++---
>  2 files changed, 18 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index 9c7e2f69d9ca..1c415b26e866 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -546,16 +546,26 @@ static unsigned char cdat_checksum(void *buf, size_t size)
>   */
>  void read_cdat_data(struct cxl_port *port)
>  {
> -	struct pci_doe_mb *cdat_doe;
> -	struct device *dev = &port->dev;
>  	struct device *uport = port->uport;
> -	struct cxl_memdev *cxlmd = to_cxl_memdev(uport);
> -	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> -	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
> +	struct device *dev = &port->dev;
> +	struct cxl_dev_state *cxlds;
> +	struct pci_doe_mb *cdat_doe;
> +	struct cxl_memdev *cxlmd;
> +	struct pci_dev *pdev;
>  	size_t cdat_length;
>  	void *cdat_table;
>  	int rc;
>  
> +	if (is_cxl_memdev(uport)) {
> +		cxlmd = to_cxl_memdev(uport);
> +		cxlds = cxlmd->cxlds;
> +		pdev = to_pci_dev(cxlds->dev);

Per this fix [1], there's no need to reference cxlds; the parent of the
memory device is the device this wants, and it needs to be careful that not
all 'struct cxl_memdev' instances are hosted by PCI devices.

[1]: http://lore.kernel.org/r/168213190748.708404.16215095414060364800.stgit@dwillia2-xfh.jf.intel.com

Otherwise, looks good to me.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* RE: [PATCH v4 04/23] cxl: Add common helpers for cdat parsing
  2023-04-19 20:21 ` [PATCH v4 04/23] cxl: Add common helpers for cdat parsing Dave Jiang
  2023-04-20  9:41   ` Jonathan Cameron
@ 2023-04-24 22:33   ` Dan Williams
  2023-04-25 16:00     ` Dave Jiang
  1 sibling, 1 reply; 70+ messages in thread
From: Dan Williams @ 2023-04-24 22:33 UTC (permalink / raw)
  To: Dave Jiang, linux-cxl, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

Dave Jiang wrote:
> Add helper functions to parse the CDAT table and provide a callback to
> parse the sub-table. Helpers are provided for DSMAS and DSLBIS sub-table
> parsing. The code is patterned after the ACPI table parsing helpers.

It seems a shame that CDAT is so ACPI-like, but can't reuse the ACPI
table parsing infrastructure. Can this not be achieved by modifying some
of the helpers in drivers/acpi/tables.c to take a passed-in
@table_header?

^ permalink raw reply	[flat|nested] 70+ messages in thread

* RE: [PATCH v4 05/23] cxl: Add callback to parse the DSMAS subtables from CDAT
  2023-04-19 20:21 ` [PATCH v4 05/23] cxl: Add callback to parse the DSMAS subtables from CDAT Dave Jiang
  2023-04-20 11:33   ` Jonathan Cameron
@ 2023-04-24 22:38   ` Dan Williams
  2023-04-26  3:44   ` Li, Ming
  2 siblings, 0 replies; 70+ messages in thread
From: Dan Williams @ 2023-04-24 22:38 UTC (permalink / raw)
  To: Dave Jiang, linux-cxl, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

Dave Jiang wrote:
> Provide a callback function to the CDAT parser in order to parse the Device
> Scoped Memory Affinity Structure (DSMAS). Each DSMAS structure contains the
> DPA range and its associated attributes in each entry. See the CDAT
> specification for details.
> 
> Coherent Device Attribute Table 1.03 2.1 Device Scoped Memory Affinity
> Structure (DSMAS)
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> 
> ---
> v3:
> - Add spec section number. (Alison)
> - Remove cast from void *. (Alison)
> - Refactor cxl_port_probe() block. (Alison)
> - Move CDAT parse to cxl_endpoint_port_probe()
> 
> v2:
> - Add DSMAS table size check. (Lukas)
> - Use local DSMAS header for LE handling.
> - Remove dsmas lock. (Jonathan)
> - Fix handle size (Jonathan)
> - Add LE to host conversion for DSMAS address and length.
> - Make dsmas_list local
> ---
>  drivers/cxl/core/cdat.c |   26 ++++++++++++++++++++++++++
>  drivers/cxl/cxl.h       |    1 +
>  drivers/cxl/cxlpci.h    |   18 ++++++++++++++++++
>  drivers/cxl/port.c      |   22 ++++++++++++++++++++++
>  4 files changed, 67 insertions(+)
> 
> diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
> index 210f4499bddb..6f20af83a3ed 100644
> --- a/drivers/cxl/core/cdat.c
> +++ b/drivers/cxl/core/cdat.c
> @@ -98,3 +98,29 @@ int cdat_table_parse_sslbis(struct cdat_header *table,
>  	return cdat_table_parse_entries(CDAT_TYPE_SSLBIS, table, &proc);
>  }
>  EXPORT_SYMBOL_NS_GPL(cdat_table_parse_sslbis, CXL);
> +
> +int cxl_dsmas_parse_entry(struct cdat_entry_header *header, void *arg)
> +{
> +	struct cdat_dsmas *dsmas = (struct cdat_dsmas *)header;
> +	struct list_head *dsmas_list = arg;
> +	struct dsmas_entry *dent;
> +
> +	if (dsmas->hdr.length != sizeof(*dsmas)) {
> +		pr_warn("Malformed DSMAS table length: (%lu:%u)\n",
> +			 (unsigned long)sizeof(*dsmas), dsmas->hdr.length);
> +		return -EINVAL;
> +	}
> +
> +	dent = kzalloc(sizeof(*dent), GFP_KERNEL);
> +	if (!dent)
> +		return -ENOMEM;
> +
> +	dent->handle = dsmas->dsmad_handle;
> +	dent->dpa_range.start = le64_to_cpu(dsmas->dpa_base_address);
> +	dent->dpa_range.end = le64_to_cpu(dsmas->dpa_base_address) +
> +			      le64_to_cpu(dsmas->dpa_length) - 1;
> +	list_add_tail(&dent->list, dsmas_list);
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_dsmas_parse_entry, CXL);
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 278ab6952332..18ca25c7e527 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -8,6 +8,7 @@
>  #include <linux/bitfield.h>
>  #include <linux/bitops.h>
>  #include <linux/log2.h>
> +#include <linux/list.h>
>  #include <linux/io.h>
>  
>  /**
> diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> index 45e2f2bf5ef8..9a2468a93d83 100644
> --- a/drivers/cxl/cxlpci.h
> +++ b/drivers/cxl/cxlpci.h
> @@ -104,6 +104,22 @@ struct cdat_subtable_entry {
>  	enum cdat_type type;
>  };
>  
> +struct dsmas_entry {
> +	struct list_head list;
> +	struct range dpa_range;
> +	u8 handle;
> +};
> +
> +/* Sub-table 0: Device Scoped Memory Affinity Structure (DSMAS) */
> +struct cdat_dsmas {
> +	struct cdat_entry_header hdr;
> +	u8 dsmad_handle;
> +	u8 flags;
> +	__u16 reserved;
> +	__le64 dpa_base_address;
> +	__le64 dpa_length;
> +} __packed;
> +
>  int devm_cxl_port_enumerate_dports(struct cxl_port *port);
>  struct cxl_dev_state;
>  int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> @@ -119,4 +135,6 @@ int cdat_table_parse_##x(struct cdat_header *table, \
>  cdat_table_parse(dsmas);
>  cdat_table_parse(dslbis);
>  cdat_table_parse(sslbis);
> +
> +int cxl_dsmas_parse_entry(struct cdat_entry_header *header, void *arg);
>  #endif /* __CXL_PCI_H__ */
> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
> index 615e0ef6b440..3022bdd52439 100644
> --- a/drivers/cxl/port.c
> +++ b/drivers/cxl/port.c
> @@ -57,6 +57,16 @@ static int discover_region(struct device *dev, void *root)
>  	return 0;
>  }
>  
> +static void dsmas_list_destroy(struct list_head *dsmas_list)
> +{
> +	struct dsmas_entry *dentry, *n;
> +
> +	list_for_each_entry_safe(dentry, n, dsmas_list, list) {
> +		list_del(&dentry->list);
> +		kfree(dentry);
> +	}
> +}
> +
>  static int cxl_switch_port_probe(struct cxl_port *port)
>  {
>  	struct cxl_hdm *cxlhdm;
> @@ -125,6 +135,18 @@ static int cxl_endpoint_port_probe(struct cxl_port *port)
>  	device_for_each_child(&port->dev, root, discover_region);
>  	put_device(&root->dev);
>  
> +	if (port->cdat.table) {
> +		LIST_HEAD(dsmas_list);
> +
> +		rc = cdat_table_parse_dsmas(port->cdat.table,
> +					    cxl_dsmas_parse_entry,
> +					    (void *)&dsmas_list);
> +		if (rc < 0)
> +			dev_warn(&port->dev, "Failed to parse DSMAS: %d\n", rc);
> +
> +		dsmas_list_destroy(&dsmas_list);
> +	}
> +

Do these entries need to be cached in a list? For example all the table
walking in drivers/acpi/numa/srat.c injects the data directly into their
final kernel data structure location, it's all self contained in the
parse handler.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* RE: [PATCH v4 06/23] cxl: Add callback to parse the DSLBIS subtable from CDAT
  2023-04-19 20:21 ` [PATCH v4 06/23] cxl: Add callback to parse the DSLBIS subtable " Dave Jiang
  2023-04-20 11:40   ` Jonathan Cameron
@ 2023-04-24 22:46   ` Dan Williams
  2023-04-24 22:59     ` Dave Jiang
  1 sibling, 1 reply; 70+ messages in thread
From: Dan Williams @ 2023-04-24 22:46 UTC (permalink / raw)
  To: Dave Jiang, linux-cxl, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

Dave Jiang wrote:
> Provide a callback to parse the Device Scoped Latency and Bandwidth
> Information Structure (DSLBIS) in the CDAT structures. The DSLBIS
> contains the bandwidth and latency information that's tied to a DSMAS
> handle. The driver will retrieve the read and write latency and
> bandwidth associated with the DSMAS which is tied to a DPA range.
> 
> Coherent Device Attribute Table 1.03 2.1 Device Scoped Latency and
> Bandwidth Information Structure (DSLBIS)
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> 
> ---
> v3:
> - Added spec section in commit header. (Alison)
> - Remove void * recast. (Alison)
> - Rework comment. (Alison)
> - Move CDAT parse to cxl_endpoint_port_probe()
> - Convert to use 'struct node_hmem_attrs'
> 
> v2:
> - Add size check to DSLIBIS table. (Lukas)
> - Remove unnecessary entry type check. (Jonathan)
> - Move data_type check to after match. (Jonathan)
> - Skip unknown data type. (Jonathan)
> - Add overflow check for unit multiply. (Jonathan)
> - Use dev_warn() when entries parsing fail. (Jonathan)
> ---
>  drivers/cxl/core/cdat.c |   68 +++++++++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxlpci.h    |   34 +++++++++++++++++++++++-
>  drivers/cxl/port.c      |   11 +++++++-
>  3 files changed, 111 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
> index 6f20af83a3ed..e8b9bb99dfdf 100644
> --- a/drivers/cxl/core/cdat.c
> +++ b/drivers/cxl/core/cdat.c
> @@ -1,5 +1,6 @@
>  // SPDX-License-Identifier: GPL-2.0-only
>  /* Copyright(c) 2023 Intel Corporation. All rights reserved. */
> +#include <linux/overflow.h>
>  #include "cxlpci.h"
>  #include "cxl.h"
>  
> @@ -124,3 +125,70 @@ int cxl_dsmas_parse_entry(struct cdat_entry_header *header, void *arg)
>  	return 0;
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_dsmas_parse_entry, CXL);
> +
> +static void cxl_hmem_attrs_set(struct node_hmem_attrs *attrs,
> +			       int access, unsigned int val)
> +{
> +	switch (access) {
> +	case HMAT_SLLBIS_ACCESS_LATENCY:
> +		attrs->read_latency = val;
> +		attrs->write_latency = val;
> +		break;
> +	case HMAT_SLLBIS_READ_LATENCY:
> +		attrs->read_latency = val;
> +		break;
> +	case HMAT_SLLBIS_WRITE_LATENCY:
> +		attrs->write_latency = val;
> +		break;
> +	case HMAT_SLLBIS_ACCESS_BANDWIDTH:
> +		attrs->read_bandwidth = val;
> +		attrs->write_bandwidth = val;
> +		break;
> +	case HMAT_SLLBIS_READ_BANDWIDTH:
> +		attrs->read_bandwidth = val;
> +		break;
> +	case HMAT_SLLBIS_WRITE_BANDWIDTH:
> +		attrs->write_bandwidth = val;
> +		break;
> +	}
> +}
> +
> +int cxl_dslbis_parse_entry(struct cdat_entry_header *header, void *arg)
> +{
> +	struct cdat_dslbis *dslbis = (struct cdat_dslbis *)header;
> +	struct list_head *dsmas_list = arg;
> +	struct dsmas_entry *dent;
> +
> +	if (dslbis->hdr.length != sizeof(*dslbis)) {
> +		pr_warn("Malformed DSLBIS table length: (%lu:%u)\n",
> +			(unsigned long)sizeof(*dslbis), dslbis->hdr.length);
> +		return -EINVAL;
> +	}
> +
> +	/* Skip unrecognized data type */
> +	if (dslbis->data_type >= HMAT_SLLBIS_DATA_TYPE_MAX)
> +		return 0;
> +
> +	list_for_each_entry(dent, dsmas_list, list) {
> +		u64 val;
> +		int rc;
> +
> +		if (dslbis->handle != dent->handle)
> +			continue;

Oh, now I see why the list is needed. Update the changelog of the
previous patch to indicate that the entries are cached to a list so they
can be cross-referenced during DSLBIS parsing. At least that would have
saved me from picking on it.


> +
> +		/* Not a memory type, skip */
> +		if ((dslbis->flags & DSLBIS_MEM_MASK) != DSLBIS_MEM_MEMORY)
> +			return 0;
> +
> +		rc = check_mul_overflow(le64_to_cpu(dslbis->entry_base_unit),
> +					le16_to_cpu(dslbis->entry[0]), &val);
> +		if (unlikely(rc))

Don't use likely() / unlikely() without performance numbers. The
compiler generally does a better job and this is not a hot path.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 06/23] cxl: Add callback to parse the DSLBIS subtable from CDAT
  2023-04-24 22:46   ` Dan Williams
@ 2023-04-24 22:59     ` Dave Jiang
  0 siblings, 0 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-24 22:59 UTC (permalink / raw)
  To: Dan Williams, linux-cxl, linux-acpi
  Cc: ira.weiny, vishal.l.verma, alison.schofield, rafael, lukas,
	Jonathan.Cameron



On 4/24/23 3:46 PM, Dan Williams wrote:
> Dave Jiang wrote:
>> Provide a callback to parse the Device Scoped Latency and Bandwidth
>> Information Structure (DSLBIS) in the CDAT structures. The DSLBIS
>> contains the bandwidth and latency information that's tied to a DSMAS
>> handle. The driver will retrieve the read and write latency and
>> bandwidth associated with the DSMAS which is tied to a DPA range.
>>
>> Coherent Device Attribute Table 1.03 2.1 Device Scoped Latency and
>> Bandwidth Information Structure (DSLBIS)
>>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>>
>> ---
>> v3:
>> - Added spec section in commit header. (Alison)
>> - Remove void * recast. (Alison)
>> - Rework comment. (Alison)
>> - Move CDAT parse to cxl_endpoint_port_probe()
>> - Convert to use 'struct node_hmem_attrs'
>>
>> v2:
>> - Add size check to DSLIBIS table. (Lukas)
>> - Remove unnecessary entry type check. (Jonathan)
>> - Move data_type check to after match. (Jonathan)
>> - Skip unknown data type. (Jonathan)
>> - Add overflow check for unit multiply. (Jonathan)
>> - Use dev_warn() when entries parsing fail. (Jonathan)
>> ---
>>   drivers/cxl/core/cdat.c |   68 +++++++++++++++++++++++++++++++++++++++++++++++
>>   drivers/cxl/cxlpci.h    |   34 +++++++++++++++++++++++-
>>   drivers/cxl/port.c      |   11 +++++++-
>>   3 files changed, 111 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
>> index 6f20af83a3ed..e8b9bb99dfdf 100644
>> --- a/drivers/cxl/core/cdat.c
>> +++ b/drivers/cxl/core/cdat.c
>> @@ -1,5 +1,6 @@
>>   // SPDX-License-Identifier: GPL-2.0-only
>>   /* Copyright(c) 2023 Intel Corporation. All rights reserved. */
>> +#include <linux/overflow.h>
>>   #include "cxlpci.h"
>>   #include "cxl.h"
>>   
>> @@ -124,3 +125,70 @@ int cxl_dsmas_parse_entry(struct cdat_entry_header *header, void *arg)
>>   	return 0;
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_dsmas_parse_entry, CXL);
>> +
>> +static void cxl_hmem_attrs_set(struct node_hmem_attrs *attrs,
>> +			       int access, unsigned int val)
>> +{
>> +	switch (access) {
>> +	case HMAT_SLLBIS_ACCESS_LATENCY:
>> +		attrs->read_latency = val;
>> +		attrs->write_latency = val;
>> +		break;
>> +	case HMAT_SLLBIS_READ_LATENCY:
>> +		attrs->read_latency = val;
>> +		break;
>> +	case HMAT_SLLBIS_WRITE_LATENCY:
>> +		attrs->write_latency = val;
>> +		break;
>> +	case HMAT_SLLBIS_ACCESS_BANDWIDTH:
>> +		attrs->read_bandwidth = val;
>> +		attrs->write_bandwidth = val;
>> +		break;
>> +	case HMAT_SLLBIS_READ_BANDWIDTH:
>> +		attrs->read_bandwidth = val;
>> +		break;
>> +	case HMAT_SLLBIS_WRITE_BANDWIDTH:
>> +		attrs->write_bandwidth = val;
>> +		break;
>> +	}
>> +}
>> +
>> +int cxl_dslbis_parse_entry(struct cdat_entry_header *header, void *arg)
>> +{
>> +	struct cdat_dslbis *dslbis = (struct cdat_dslbis *)header;
>> +	struct list_head *dsmas_list = arg;
>> +	struct dsmas_entry *dent;
>> +
>> +	if (dslbis->hdr.length != sizeof(*dslbis)) {
>> +		pr_warn("Malformed DSLBIS table length: (%lu:%u)\n",
>> +			(unsigned long)sizeof(*dslbis), dslbis->hdr.length);
>> +		return -EINVAL;
>> +	}
>> +
>> +	/* Skip unrecognized data type */
>> +	if (dslbis->data_type >= HMAT_SLLBIS_DATA_TYPE_MAX)
>> +		return 0;
>> +
>> +	list_for_each_entry(dent, dsmas_list, list) {
>> +		u64 val;
>> +		int rc;
>> +
>> +		if (dslbis->handle != dent->handle)
>> +			continue;
> 
> Oh, now I see why the list is needed. Update the changelog of the
> previous patch to indicate that the entries are cached to a list so they
> can be cross referenced during dslbis parsing. At least that would have
> saved me from picking on it.

Jonathan had the same comment. It'll be updated for the next rev to make 
the connection.

> 
> 
>> +
>> +		/* Not a memory type, skip */
>> +		if ((dslbis->flags & DSLBIS_MEM_MASK) != DSLBIS_MEM_MEMORY)
>> +			return 0;
>> +
>> +		rc = check_mul_overflow(le64_to_cpu(dslbis->entry_base_unit),
>> +					le16_to_cpu(dslbis->entry[0]), &val);
>> +		if (unlikely(rc))
> 
> Don't use likely() / unlikely() without performance numbers. The
> compiler generally does a better job and this is not a hot path.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* RE: [PATCH v4 07/23] cxl: Add callback to parse the SSLBIS subtable from CDAT
  2023-04-19 20:21 ` [PATCH v4 07/23] cxl: Add callback to parse the SSLBIS " Dave Jiang
  2023-04-20 11:50   ` Jonathan Cameron
@ 2023-04-24 23:38   ` Dan Williams
  1 sibling, 0 replies; 70+ messages in thread
From: Dan Williams @ 2023-04-24 23:38 UTC (permalink / raw)
  To: Dave Jiang, linux-cxl, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

Dave Jiang wrote:
> Provide a callback to parse the Switch Scoped Latency and Bandwidth
> Information Structure (SSLBIS) in the CDAT structures. The SSLBIS
> contains the bandwidth and latency information that's tied to the
> CLX switch that the data table has been read from. The extracted

s/CLX/CXL/

> values are indexed by the downstream port id.

For other readers of this patch it might be worth mentioning that this
corresponds to 'struct cxl_dport::port_id'.

> It is possible the downstream port id is 0xffff which is a wildcard
> value for any port id.
> 
> Coherent Device Attribute Table 1.03 2.1 Switch Scoped Latency
> and Bandwidth Information Structure (SSLBIS)
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> 
> ---
> v3:
> - Add spec section in commit header (Alison)
> - Move CDAT parse to cxl_switch_port_probe()
> - Use 'struct node_hmem_attrs'
> ---
>  drivers/cxl/core/cdat.c |   76 +++++++++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/core/port.c |    5 +++
>  drivers/cxl/cxl.h       |    1 +
>  drivers/cxl/cxlpci.h    |   20 ++++++++++++
>  drivers/cxl/port.c      |   14 ++++++++-
>  5 files changed, 115 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
> index e8b9bb99dfdf..ec3420dddf27 100644
> --- a/drivers/cxl/core/cdat.c
> +++ b/drivers/cxl/core/cdat.c
> @@ -192,3 +192,79 @@ int cxl_dslbis_parse_entry(struct cdat_entry_header *header, void *arg)
>  	return 0;
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_dslbis_parse_entry, CXL);
> +
> +int cxl_sslbis_parse_entry(struct cdat_entry_header *header, void *arg)
> +{
> +	struct cdat_sslbis *sslbis = (struct cdat_sslbis *)header;
> +	struct xarray *sslbis_xa = arg;
> +	int remain, entries, i;
> +
> +	remain = sslbis->hdr.length - sizeof(*sslbis);
> +	if (!remain || remain % sizeof(struct sslbis_sslbe)) {
> +		pr_warn("Malformed SSLBIS table length: (%u)\n",
> +			sslbis->hdr.length);
> +		return -EINVAL;
> +	}
> +
> +	/* Unrecognized data type, we can skip */
> +	if (sslbis->data_type >= HMAT_SLLBIS_DATA_TYPE_MAX)
> +		return 0;
> +
> +	entries = remain / sizeof(*sslbis);
> +
> +	for (i = 0; i < entries; i++) {
> +		struct sslbis_sslbe *sslbe = &sslbis->sslbe[i];
> +		u16 x = le16_to_cpu(sslbe->port_x_id);
> +		u16 y = le16_to_cpu(sslbe->port_y_id);
> +		struct node_hmem_attrs *hmem_attrs;

The more "node_hmem_attrs" gets reused, the more it sticks out as no
longer a good name. There are no Linux "nodes" to consider in this code,
no hmem since this is switch path and not a memory node, and no sysfs
attributes (which are typically named with "_attrs"). This data
structure is just a container for passing a tuple of r/w-latency and
r/w-bandwidth numbers. It's a performance coordinate that just happens
to get reused by the hmem sysfs nodes and now CXL cdat. Perhaps 'struct
access_coordinate'?

That would also make this code more readable:

void node_set_perf_attrs(unsigned int nid, struct node_hmem_attrs *hmem_attrs,
                         unsigned access);

...vs:

void node_set_perf_attrs(unsigned int nid, struct access_coordinate *coord,
                         unsigned access);


...at least that seems more readable to me.
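
For reference, a minimal sketch of what I have in mind, purely
illustrative (the struct name is the proposal above, the fields just
mirror 'struct node_hmem_attrs'):

struct access_coordinate {
	unsigned int read_bandwidth;	/* same units as node_hmem_attrs */
	unsigned int write_bandwidth;
	unsigned int read_latency;
	unsigned int write_latency;
};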


> +		u16 dsp_id;
> +		u64 val;
> +		int rc;
> +
> +		switch (x) {
> +		case SSLBIS_US_PORT:
> +			dsp_id = y;
> +			break;
> +		case SSLBIS_ANY_PORT:
> +			switch (y) {
> +			case SSLBIS_US_PORT:
> +				dsp_id = x;
> +				break;
> +			case SSLBIS_ANY_PORT:
> +				dsp_id = SSLBIS_ANY_PORT;
> +				break;
> +			default:
> +				dsp_id = y;
> +				break;
> +			}
> +			break;
> +		default:
> +			dsp_id = x;
> +			break;
> +		}
> +
> +		hmem_attrs = xa_load(sslbis_xa, dsp_id);
> +		if (xa_is_err(hmem_attrs))
> +			return xa_err(hmem_attrs);
> +		if (!hmem_attrs) {
> +			hmem_attrs = kzalloc(sizeof(*hmem_attrs), GFP_KERNEL);
> +			if (!hmem_attrs)
> +				return -ENOMEM;
> +		}
> +
> +		rc = check_mul_overflow(le64_to_cpu(sslbis->entry_base_unit),
> +					le16_to_cpu(sslbe->value), &val);
> +		if (unlikely(rc))
> +			pr_warn("SSLBIS value overflowed!\n");
> +
> +		cxl_hmem_attrs_set(hmem_attrs, sslbis->data_type, val);
> +		rc = xa_insert(sslbis_xa, dsp_id, hmem_attrs, GFP_KERNEL);

I'm confused why an xarray is needed. If the sslbis indicates the access
parameters from the upstream port to the downstream port, just record
that access_coordinate and point each downstream port to the same
coordinate. Why keep an xarray full of these around?

In other words just add 'struct access_coordinate' to 'struct cxl_dport'
rather than maintaining this parallel array of per-downstream port data.

When / if p2p support comes along then we can worry about dport-to-dport
performance, but for this patchset those sslbis entries are 'don't care'.
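
Something like the below is what I am picturing (the 'sw_coord' field
name is made up):

struct cxl_dport {
	struct device *dport;
	int port_id;
	...
	/* hypothetical: SSLBIS-derived perf data for this downstream port */
	struct access_coordinate sw_coord;
};

...with cxl_sslbis_parse_entry() resolving dsp_id to the matching dport
(or all dports for SSLBIS_ANY_PORT) and filling that in directly.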

> +		if (rc < 0 && rc != -EBUSY) {
> +			kfree(hmem_attrs);
> +			return rc;
> +		}
> +	}
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_sslbis_parse_entry, CXL);

^ permalink raw reply	[flat|nested] 70+ messages in thread

* RE: [PATCH v4 08/23] cxl: Add support for _DSM Function for retrieving QTG ID
  2023-04-19 20:21 ` [PATCH v4 08/23] cxl: Add support for _DSM Function for retrieving QTG ID Dave Jiang
  2023-04-20 12:00   ` Jonathan Cameron
@ 2023-04-25  0:12   ` Dan Williams
  1 sibling, 0 replies; 70+ messages in thread
From: Dan Williams @ 2023-04-25  0:12 UTC (permalink / raw)
  To: Dave Jiang, linux-cxl, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

Dave Jiang wrote:
> CXL spec v3.0 9.17.3 CXL Root Device Specific Methods (_DSM)
> 
> Add support to retrieve QTG ID via ACPI _DSM call. The _DSM call requires
> an input of an ACPI package with 4 dwords (read latency, write latency,
> read bandwidth, write bandwidth). The call returns a package with 1 WORD
> that provides the max supported QTG ID and a package that may contain 0 or
> more WORDs as the recommended QTG IDs in the recommended order.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> 
> ---
> v2:
> - Reorder var declaration and use C99 style. (Jonathan)
> - Allow >2 ACPI objects in package for future expansion. (Jonathan)
> - Check QTG IDs against MAX QTG ID provided by output package. (Jonathan)
> ---
>  drivers/cxl/core/Makefile |    1 
>  drivers/cxl/core/acpi.c   |  116 +++++++++++++++++++++++++++++++++++++++++++++

Why a new core file? This seems like something that only drivers/cxl/acpi.c
could ever care about.

Similar to the @calc_hb callback for root decoders, this is another
platform specific callback; the endpoint drivers need not care whether
it is an ACPI platform or not. They just ask their 'root' cxl_port
implementation for a qos_class, and whether that is ACPI or not is
hidden.

>  drivers/cxl/cxl.h         |   16 ++++++
>  3 files changed, 133 insertions(+)
>  create mode 100644 drivers/cxl/core/acpi.c
> 
> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> index 867a8014b462..30d61c8cae22 100644
> --- a/drivers/cxl/core/Makefile
> +++ b/drivers/cxl/core/Makefile
> @@ -13,5 +13,6 @@ cxl_core-y += mbox.o
>  cxl_core-y += pci.o
>  cxl_core-y += hdm.o
>  cxl_core-y += cdat.o
> +cxl_core-y += acpi.o
>  cxl_core-$(CONFIG_TRACING) += trace.o
>  cxl_core-$(CONFIG_CXL_REGION) += region.o
> diff --git a/drivers/cxl/core/acpi.c b/drivers/cxl/core/acpi.c
> new file mode 100644
> index 000000000000..6eda5cad8d59
> --- /dev/null
> +++ b/drivers/cxl/core/acpi.c
> @@ -0,0 +1,116 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2022 Intel Corporation. All rights reserved. */
> +#include <linux/module.h>
> +#include <linux/device.h>
> +#include <linux/kernel.h>
> +#include <linux/acpi.h>
> +#include <linux/pci.h>
> +#include <asm/div64.h>
> +#include "cxlpci.h"
> +#include "cxl.h"
> +
> +const guid_t acpi_cxl_qtg_id_guid =

static?

> +	GUID_INIT(0xF365F9A6, 0xA7DE, 0x4071,
> +		  0xA6, 0x6A, 0xB4, 0x0C, 0x0B, 0x4F, 0x8E, 0x52);
> +
> +/**
> + * cxl_acpi_evaluate_qtg_dsm - Retrieve QTG ids via ACPI _DSM
> + * @handle: ACPI handle
> + * @input: bandwidth and latency data
> + *
> + * Issue QTG _DSM with accompanied bandwidth and latency data in order to get
> + * the QTG IDs that falls within the performance data.
> + */
> +struct qtg_dsm_output *cxl_acpi_evaluate_qtg_dsm(acpi_handle handle,
> +						 struct qtg_dsm_input *input)
> +{
> +	union acpi_object *out_obj, *out_buf, *pkg;
> +	union acpi_object in_buf = {
> +		.buffer = {
> +			.type = ACPI_TYPE_BUFFER,
> +			.pointer = (u8 *)input,
> +			.length = sizeof(u32) * 4,
> +		},
> +	};
> +	union acpi_object in_obj = {
> +		.package = {
> +			.type = ACPI_TYPE_PACKAGE,
> +			.count = 1,
> +			.elements = &in_buf
> +		},
> +	};
> +	struct qtg_dsm_output *output = NULL;
> +	int len, rc, i;
> +	u16 *max_qtg;
> +
> +	out_obj = acpi_evaluate_dsm(handle, &acpi_cxl_qtg_id_guid, 1, 1, &in_obj);
> +	if (!out_obj)
> +		return ERR_PTR(-ENXIO);
> +
> +	if (out_obj->type != ACPI_TYPE_PACKAGE) {
> +		rc = -ENXIO;
> +		goto err;
> +	}
> +
> +	/* Check Max QTG ID */
> +	pkg = &out_obj->package.elements[0];
> +	if (pkg->type != ACPI_TYPE_BUFFER) {
> +		rc = -ENXIO;
> +		goto err;
> +	}
> +
> +	if (pkg->buffer.length != sizeof(u16)) {
> +		rc = -ENXIO;
> +		goto err;
> +	}
> +	max_qtg = (u16 *)pkg->buffer.pointer;
> +
> +	/* Retrieve QTG IDs package */
> +	pkg = &out_obj->package.elements[1];
> +	if (pkg->type != ACPI_TYPE_PACKAGE) {
> +		rc = -ENXIO;
> +		goto err;
> +	}
> +
> +	out_buf = &pkg->package.elements[0];
> +	if (out_buf->type != ACPI_TYPE_BUFFER) {
> +		rc = -ENXIO;
> +		goto err;
> +	}
> +
> +	len = out_buf->buffer.length;
> +
> +	/* It's legal to have 0 QTG entries */
> +	if (len == 0)
> +		goto out;
> +
> +	/* Malformed package, not multiple of WORD size */
> +	if (len % sizeof(u16)) {
> +		rc = -ENXIO;
> +		goto out;
> +	}
> +
> +	output = kmalloc(len + sizeof(*output), GFP_KERNEL);

This feels more complicated than it needs to be. The only output from
this function that matters is a qos_class number for a given input. The
backup qtg-ids are not yet interesting without a real world example of
where selecting from the backup list vs any other id matters. In other
words the only recommendation is match or non-match. Whether a non-match
is in the backup list is still a platform-specific consideration that
Linux as of today has nothing to point to that says this distinction
matters.

That will be an end user call to their platform vendor to ask "there's
not enough capacity left in QoS class X what are the implications for
picking performance class Y", or "please increase capacity of the
window for QoS class X".
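
In other words, something with the shape of the below is all that seems
needed (illustrative only, not a signature proposal):

/* sketch: collapse the output handling to "first recommended id or -ENOENT" */
static int cxl_acpi_qos_class(acpi_handle handle, struct qtg_dsm_input *input)
{
	struct qtg_dsm_output *output;
	int qos_class;

	output = cxl_acpi_evaluate_qtg_dsm(handle, input);
	if (IS_ERR(output))
		return PTR_ERR(output);
	if (!output || !output->nr)
		qos_class = -ENOENT;
	else
		qos_class = output->qtg_ids[0];
	kfree(output);
	return qos_class;
}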

> +	if (!output) {
> +		rc = -ENOMEM;
> +		goto err;
> +	}
> +
> +	output->nr = len / sizeof(u16);
> +	memcpy(output->qtg_ids, out_buf->buffer.pointer, len);
> +
> +	for (i = 0; i < output->nr; i++) {
> +		if (output->qtg_ids[i] > *max_qtg)
> +			pr_warn("QTG ID %u greater than MAX %u\n",
> +				output->qtg_ids[i], *max_qtg);
> +	}
> +
> +out:
> +	ACPI_FREE(out_obj);
> +	return output;
> +
> +err:
> +	ACPI_FREE(out_obj);
> +	return ERR_PTR(rc);
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_acpi_evaluate_qtg_dsm, CXL);
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 318aa051f65a..6426c4c22e28 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -7,6 +7,7 @@
>  #include <linux/libnvdimm.h>
>  #include <linux/bitfield.h>
>  #include <linux/bitops.h>
> +#include <linux/acpi.h>
>  #include <linux/log2.h>
>  #include <linux/list.h>
>  #include <linux/io.h>
> @@ -793,6 +794,21 @@ static inline struct cxl_dax_region *to_cxl_dax_region(struct device *dev)
>  }
>  #endif
>  
> +struct qtg_dsm_input {
> +	u32 rd_lat;
> +	u32 wr_lat;
> +	u32 rd_bw;
> +	u32 wr_bw;
> +};
> +
> +struct qtg_dsm_output {
> +	int nr;
> +	u16 qtg_ids[];
> +};
> +
> +struct qtg_dsm_output *cxl_acpi_evaluate_qtg_dsm(acpi_handle handle,
> +						 struct qtg_dsm_input *input);
> +
>  /*
>   * Unit test builds overrides this to __weak, find the 'strong' version
>   * of these symbols in tools/testing/cxl/.
> 
> 



^ permalink raw reply	[flat|nested] 70+ messages in thread

* RE: [PATCH v4 09/23] cxl: Add helper function to retrieve ACPI handle of CXL root device
  2023-04-19 20:21 ` [PATCH v4 09/23] cxl: Add helper function to retrieve ACPI handle of CXL root device Dave Jiang
  2023-04-20 12:06   ` Jonathan Cameron
@ 2023-04-25  0:18   ` Dan Williams
  1 sibling, 0 replies; 70+ messages in thread
From: Dan Williams @ 2023-04-25  0:18 UTC (permalink / raw)
  To: Dave Jiang, linux-cxl, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

Dave Jiang wrote:
> Provide a helper to find the ACPI0017 device in order to issue the _DSM.
> The helper will take the 'struct device' from a cxl_port and iterate until
> the root device is reached. The ACPI handle will be returned from the root
> device.

Following on from the last patch this should all be self contained to
drivers/cxl/acpi.c with something like:

struct cxl_root {
	struct cxl_port port;
	cxl_qos_class_fn qos_class;
};

...and then the caller does:

port = find_cxl_root(...);
root = container_of(port, typeof(*root), port);
class = root->qos_class(port, ...);

Yes, it means finally creating a formal 'struct cxl_root' type, but I
expect it will not be the last root method that gets added.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* RE: [PATCH v4 10/23] cxl: Add helpers to calculate pci latency for the CXL device
  2023-04-19 20:22 ` [PATCH v4 10/23] cxl: Add helpers to calculate pci latency for the CXL device Dave Jiang
  2023-04-20 12:15   ` Jonathan Cameron
@ 2023-04-25  0:30   ` Dan Williams
  2023-05-01 16:29     ` Dave Jiang
  1 sibling, 1 reply; 70+ messages in thread
From: Dan Williams @ 2023-04-25  0:30 UTC (permalink / raw)
  To: Dave Jiang, linux-cxl, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

Dave Jiang wrote:
> The latency is calculated by dividing the flit size over the bandwidth. Add
> support to retrieve the flit size for the CXL device and calculate the
> latency of the downstream link.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> 
> ---
> v2:
> - Fix commit log issues. (Jonathan)
> - Fix var declaration issues. (Jonathan)
> ---
>  drivers/cxl/core/pci.c |   68 ++++++++++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxlpci.h   |   15 +++++++++++
>  drivers/cxl/pci.c      |   13 ---------
>  3 files changed, 83 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index 1c415b26e866..bb58296b3e56 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -712,3 +712,71 @@ pci_ers_result_t cxl_error_detected(struct pci_dev *pdev,
>  	return PCI_ERS_RESULT_NEED_RESET;
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_error_detected, CXL);
> +
> +static int pci_bus_speed_to_mbps(enum pci_bus_speed speed)
> +{
> +	switch (speed) {
> +	case PCIE_SPEED_2_5GT:
> +		return 2500;
> +	case PCIE_SPEED_5_0GT:
> +		return 5000;
> +	case PCIE_SPEED_8_0GT:
> +		return 8000;
> +	case PCIE_SPEED_16_0GT:
> +		return 16000;
> +	case PCIE_SPEED_32_0GT:
> +		return 32000;
> +	case PCIE_SPEED_64_0GT:
> +		return 64000;
> +	default:
> +		break;
> +	}
> +
> +	return -EINVAL;
> +}
> +
> +static int cxl_pci_mbits_to_mbytes(struct pci_dev *pdev)
> +{
> +	int mbits;
> +
> +	mbits = pci_bus_speed_to_mbps(pdev->bus->cur_bus_speed);
> +	if (mbits < 0)
> +		return mbits;
> +
> +	return mbits >> 3;

Why not just return mbits directly and skip the conversion? Otherwise a
"/ 8" requires a bit less cleverness to read than ">> 3".

> +}
> +
> +static int cxl_flit_size(struct pci_dev *pdev)

This seems like something that might be worth caching in 'struct cxl_port'
rather than re-reading the configuration register each call. Depends on
how often it is used.

> +{
> +	if (cxl_pci_flit_256(pdev))
> +		return 256;
> +
> +	return 68;
> +}
> +
> +/**
> + * cxl_pci_get_latency - calculate the link latency for the PCIe link
> + * @pdev - PCI device
> + *
> + * return: calculated latency or -errno
> + *
> + * CXL Memory Device SW Guide v1.0 2.11.4 Link latency calculation
> + * Link latency = LinkPropagationLatency + FlitLatency + RetimerLatency
> + * LinkPropagationLatency is negligible, so 0 will be used
> + * RetimerLatency is assumed to be negligible and 0 will be used
> + * FlitLatency = FlitSize / LinkBandwidth
> + * FlitSize is defined by spec. CXL rev3.0 4.2.1.
> + * 68B flit is used up to 32GT/s. >32GT/s, 256B flit size is used.
> + * The FlitLatency is converted to picoseconds.
> + */
> +long cxl_pci_get_latency(struct pci_dev *pdev)
> +{
> +	long bw;
> +
> +	bw = cxl_pci_mbits_to_mbytes(pdev);

This function looks misnamed when I read it here, it's retrieving the
bus speed in MiBs not doing a conversion.
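
(For what it's worth, tracing the quoted math end to end: a 32 GT/s link
maps to 32000 Mb/s, 4000 MB/s after the divide-by-8, and assuming the
link is not in flit mode the 68B flit gives 68 * 1000000 / 4000 =
17000 ps, i.e. 17 ns of flit latency.)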

> +	if (bw < 0)
> +		return bw;
> +
> +	return cxl_flit_size(pdev) * 1000000L / bw;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_pci_get_latency, CXL);
> diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> index 1bca1c0e4b40..795eba31fe29 100644
> --- a/drivers/cxl/cxlpci.h
> +++ b/drivers/cxl/cxlpci.h
> @@ -167,6 +167,19 @@ struct cdat_sslbis {
>  #define SSLBIS_US_PORT		0x0100
>  #define SSLBIS_ANY_PORT		0xffff
>  
> +/*
> + * CXL v3.0 6.2.3 Table 6-4
> + * The table indicates that if PCIe Flit Mode is set, then CXL is in 256B flits
> + * mode, otherwise it's 68B flits mode.
> + */
> +static inline bool cxl_pci_flit_256(struct pci_dev *pdev)
> +{
> +	u16 lnksta2;
> +
> +	pcie_capability_read_word(pdev, PCI_EXP_LNKSTA2, &lnksta2);
> +	return lnksta2 & PCI_EXP_LNKSTA2_FLIT;
> +}
> +
>  int devm_cxl_port_enumerate_dports(struct cxl_port *port);
>  struct cxl_dev_state;
>  int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> @@ -189,4 +202,6 @@ int cxl_##x##_parse_entry(struct cdat_entry_header *header, void *arg)
>  cxl_parse_entry(dsmas);
>  cxl_parse_entry(dslbis);
>  cxl_parse_entry(sslbis);
> +
> +long cxl_pci_get_latency(struct pci_dev *pdev);
>  #endif /* __CXL_PCI_H__ */
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index ea38bd49b0cf..ed39d133b70d 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -365,19 +365,6 @@ static bool is_cxl_restricted(struct pci_dev *pdev)
>  	return pci_pcie_type(pdev) == PCI_EXP_TYPE_RC_END;
>  }
>  
> -/*
> - * CXL v3.0 6.2.3 Table 6-4
> - * The table indicates that if PCIe Flit Mode is set, then CXL is in 256B flits
> - * mode, otherwise it's 68B flits mode.
> - */
> -static bool cxl_pci_flit_256(struct pci_dev *pdev)
> -{
> -	u16 lnksta2;
> -
> -	pcie_capability_read_word(pdev, PCI_EXP_LNKSTA2, &lnksta2);
> -	return lnksta2 & PCI_EXP_LNKSTA2_FLIT;
> -}
> -
>  static int cxl_pci_ras_unmask(struct pci_dev *pdev)
>  {
>  	struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
> 
> 



^ permalink raw reply	[flat|nested] 70+ messages in thread

* RE: [PATCH v4 11/23] cxl: Add helper function that calculates QoS values for switches
  2023-04-19 20:22 ` [PATCH v4 11/23] cxl: Add helper function that calculates QoS values for switches Dave Jiang
  2023-04-20 12:26   ` Jonathan Cameron
@ 2023-04-25  0:33   ` Dan Williams
  1 sibling, 0 replies; 70+ messages in thread
From: Dan Williams @ 2023-04-25  0:33 UTC (permalink / raw)
  To: Dave Jiang, linux-cxl, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

Dave Jiang wrote:
> The CDAT information from the switch, Switch Scoped Latency and Bandwidth
> Information Structure (SSLBIS), is parsed and stored in an xarray under the
> cxl_port. The QoS data are indexed by the downstream port id.  Walk the CXL
> ports from endpoint to root and retrieve the relevant QoS information
> (bandwidth and latency) that are from the switch CDAT. If read or write QoS
> values are not available, then use the access QoS value.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> 
> ---
> v3:
> - Move to use 'struct node_hmem_attrs'
> ---
>  drivers/cxl/core/port.c |   81 +++++++++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxl.h       |    2 +
>  2 files changed, 83 insertions(+)
> 
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 3fedbabac1af..770b540d5325 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -1921,6 +1921,87 @@ bool schedule_cxl_memdev_detach(struct cxl_memdev *cxlmd)
>  }
>  EXPORT_SYMBOL_NS_GPL(schedule_cxl_memdev_detach, CXL);
>  
> +/**
> + * cxl_port_get_switch_qos - retrieve QoS data for CXL switches
> + * @port: endpoint cxl_port
> + * @rd_bw: writeback value for min read bandwidth
> + * @rd_lat: writeback value for total read latency
> + * @wr_bw: writeback value for min write bandwidth
> + * @wr_lat: writeback value for total write latency
> + *
> + * Return: Errno on failure, 0 on success. -ENOENT if no switch device
> + */
> +int cxl_port_get_switch_qos(struct cxl_port *port, u64 *rd_bw, u64 *rd_lat,
> +			    u64 *wr_bw, u64 *wr_lat)
> +{
> +	u64 min_rd_bw = ULONG_MAX;
> +	u64 min_wr_bw = ULONG_MAX;
> +	struct cxl_dport *dport;
> +	struct cxl_port *nport;
> +	u64 total_rd_lat = 0;
> +	u64 total_wr_lat = 0;
> +	struct device *next;
> +	int switches = 0;
> +	int rc = 0;
> +
> +	if (!is_cxl_endpoint(port))
> +		return -EINVAL;
> +
> +	/* Skip the endpoint */
> +	next = port->dev.parent;
> +	nport = to_cxl_port(next);
> +	dport = port->parent_dport;
> +
> +	do {
> +		struct node_hmem_attrs *hmem_attrs;
> +		u64 lat, bw;
> +
> +		if (!nport->cdat.table)
> +			break;
> +
> +		if (!dev_is_pci(dport->dport))
> +			break;
> +
> +		hmem_attrs = xa_load(&nport->cdat.sslbis_xa, dport->port_id);
> +		if (xa_is_err(hmem_attrs))
> +			return xa_err(hmem_attrs);
> +
> +		if (!hmem_attrs) {
> +			hmem_attrs = xa_load(&nport->cdat.sslbis_xa, SSLBIS_ANY_PORT);
> +			if (xa_is_err(hmem_attrs))
> +				return xa_err(hmem_attrs);
> +			if (!hmem_attrs)
> +				return -ENXIO;
> +		}

Yeah, I think my comment from a few patches back stands. There appears
to be no need to maintain the xarray if each dport just resolves and
caches its relative access coordinate at init time.
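
The lookup in the loop then becomes a direct dereference, roughly (the
'sw_coord' field name is made up):

		struct access_coordinate *coord = &dport->sw_coord;

		total_rd_lat += coord->read_latency;
		total_wr_lat += coord->write_latency;
		min_rd_bw = min_t(u64, min_rd_bw, coord->read_bandwidth);
		min_wr_bw = min_t(u64, min_wr_bw, coord->write_bandwidth);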

^ permalink raw reply	[flat|nested] 70+ messages in thread

* RE: [PATCH v4 12/23] cxl: Add helper function that calculate QoS values for PCI path
  2023-04-19 20:22 ` [PATCH v4 12/23] cxl: Add helper function that calculate QoS values for PCI path Dave Jiang
  2023-04-20 12:32   ` Jonathan Cameron
@ 2023-04-25  0:45   ` Dan Williams
  1 sibling, 0 replies; 70+ messages in thread
From: Dan Williams @ 2023-04-25  0:45 UTC (permalink / raw)
  To: Dave Jiang, linux-cxl, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron

Dave Jiang wrote:
> Calculate the link bandwidth and latency for the PCIe path from the device
> to the CXL Host Bridge. This does not include the CDAT data from the device
> or the switch(es) in the path.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> ---
> v4:
> - 0-day fix, remove unused var. Fix checking < 0 for unsigned var.
> - Rework port hierarchy walk to calculate the latencies correctly
> ---
>  drivers/cxl/core/port.c |   83 +++++++++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxl.h       |    2 +
>  2 files changed, 85 insertions(+)
> 
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 770b540d5325..8da437e038b9 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -2002,6 +2002,89 @@ int cxl_port_get_switch_qos(struct cxl_port *port, u64 *rd_bw, u64 *rd_lat,
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_port_get_switch_qos, CXL);
>  
> +/**
> + * cxl_port_get_downstream_qos - retrieve QoS data for PCIE downstream path
> + * @port: endpoint cxl_port
> + * @bandwidth: writeback value for min bandwidth
> + * @latency: writeback value for total latency
> + *
> + * Return: Errno on failure, 0 on success.
> + */
> +int cxl_port_get_downstream_qos(struct cxl_port *port, u64 *bandwidth,
> +				u64 *latency)
> +{
> +	u64 min_bw = ULONG_MAX;
> +	struct pci_dev *pdev;
> +	struct cxl_port *p;
> +	struct device *dev;
> +	u64 total_lat = 0;
> +	long lat;
> +
> +	*bandwidth = 0;
> +	*latency = 0;
> +
> +	/* Grab the device that is the PCI device for CXL memdev */
> +	dev = port->uport->parent;

The prototype for this function makes it seem like it can apply to
either switch downstream ports or endpoint ports. Given that this is
retrieving link speed can it not just retrieve the speed from the switch
downstream port status rather than the endpoint status?

> +	/* Skip if it's not PCI, most likely a cxl_test device */
> +	if (!dev_is_pci(dev))
> +		return 0;
> +
> +	pdev = to_pci_dev(dev);
> +	min_bw = pcie_bandwidth_available(pdev, NULL, NULL, NULL);
> +	if (min_bw == 0)
> +		return -ENXIO;
> +
> +	/* convert to MB/s from Mb/s */
> +	min_bw >>= 3;
> +
> +	/*
> +	 * Walk the cxl_port hierarchy to retrieve the link latencies for
> +	 * each of the PCIe segments. The loop will obtain the link latency
> +	 * via each of the switch downstream port.
> +	 */

If performance data is cached in 'struct cxl_dport' at init then the PCI
device topology need not be walked later. All of the dev_is_pci() calls
can be removed from cxl_port_get_downstream_qos().
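
For example, if the link latency were snapshotted when the dport is
added (the 'link_lat' field is hypothetical, error handling elided):

	/* at dport add time, e.g. in devm_cxl_add_dport() */
	if (dev_is_pci(dport_dev))
		dport->link_lat = cxl_pci_get_latency(to_pci_dev(dport_dev));

...then the walk here reduces to summing dport->link_lat up the
port->parent_dport chain, with the endpoint's own upstream link handled
as before.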

> +	p = port;
> +	do {
> +		struct cxl_dport *dport = p->parent_dport;
> +		struct device *dport_dev, *uport_dev;
> +		struct pci_dev *dport_pdev;
> +
> +		if (!dport)
> +			break;
> +
> +		dport_dev = dport->dport;
> +		if (!dev_is_pci(dport_dev))
> +			break;
> +
> +		p = dport->port;
> +		uport_dev = p->uport;
> +		if (!dev_is_pci(uport_dev))
> +			break;
> +
> +		dport_pdev = to_pci_dev(dport_dev);
> +		pdev = to_pci_dev(uport_dev);
> +		lat = cxl_pci_get_latency(dport_pdev);
> +		if (lat < 0)
> +			return lat;
> +
> +		total_lat += lat;
> +	} while (1);
> +
> +	/*
> +	 * pdev would be either the cxl device if there are no switches, or the
> +	 * upstream port of the last switch.
> +	 */
> +	lat = cxl_pci_get_latency(pdev);
> +	if (lat < 0)
> +		return lat;
> +
> +	total_lat += lat;
> +	*bandwidth = min_bw;
> +	*latency = total_lat;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_port_get_downstream_qos, CXL);
> +
>  /* for user tooling to ensure port disable work has completed */
>  static ssize_t flush_store(struct bus_type *bus, const char *buf, size_t count)
>  {
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 76ccc815134f..6a6387a545db 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -811,6 +811,8 @@ struct qtg_dsm_output *cxl_acpi_evaluate_qtg_dsm(acpi_handle handle,
>  acpi_handle cxl_acpi_get_rootdev_handle(struct device *dev);
>  int cxl_port_get_switch_qos(struct cxl_port *port, u64 *rd_bw, u64 *rd_lat,
>  			    u64 *wr_bw, u64 *wr_lat);
> +int cxl_port_get_downstream_qos(struct cxl_port *port, u64 *bandwidth,
> +				u64 *latency);
>  
>  /*
>   * Unit test builds overrides this to __weak, find the 'strong' version
> 
> 



^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 04/23] cxl: Add common helpers for cdat parsing
  2023-04-24 22:33   ` Dan Williams
@ 2023-04-25 16:00     ` Dave Jiang
  2023-04-27  0:09       ` Dan Williams
  0 siblings, 1 reply; 70+ messages in thread
From: Dave Jiang @ 2023-04-25 16:00 UTC (permalink / raw)
  To: Dan Williams, linux-cxl, linux-acpi
  Cc: ira.weiny, vishal.l.verma, alison.schofield, rafael, lukas,
	Jonathan.Cameron



On 4/24/23 3:33 PM, Dan Williams wrote:
> Dave Jiang wrote:
>> Add helper functions to parse the CDAT table and provide a callback to
>> parse the sub-table. Helpers are provided for DSMAS and DSLBIS sub-table
>> parsing. The code is patterned after the ACPI table parsing helpers.
> 
> It seems a shame that CDAT is so ACPI-like, but can't reuse the ACPI
> table parsing infrastructure. Can this not be achieved by modifying some
> of the helpers in drivers/acpi/tables.c to take a passed in
> @table_header?

Rafael,
Do you have any issues with adding some endianness support in
drivers/acpi/tables.c in order to support CDAT parsing by BE hosts? To 
start off with something like below?

diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c
index 7b4680da57d7..e63e2daf151d 100644
--- a/drivers/acpi/tables.c
+++ b/drivers/acpi/tables.c
@@ -287,6 +287,12 @@ acpi_get_subtable_type(char *id)
         return ACPI_SUBTABLE_COMMON;
  }

+static unsigned long __init_or_acpilib
+acpi_table_get_length(struct acpi_table_header *hdr)
+{
+       return le32_to_cpu((__force __le32)hdr->length);
+}
+
  static __init_or_acpilib bool has_handler(struct acpi_subtable_proc *proc)
  {
         return proc->handler || proc->handler_arg;
@@ -337,7 +343,8 @@ static int __init_or_acpilib acpi_parse_entries_array(
         int errs = 0;
         int i;

-       table_end = (unsigned long)table_header + table_header->length;
+       table_end = (unsigned long)table_header +
+                   acpi_table_get_length(table_header);

         /* Parse all entries looking for a match. */


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 05/23] cxl: Add callback to parse the DSMAS subtables from CDAT
  2023-04-19 20:21 ` [PATCH v4 05/23] cxl: Add callback to parse the DSMAS subtables from CDAT Dave Jiang
  2023-04-20 11:33   ` Jonathan Cameron
  2023-04-24 22:38   ` Dan Williams
@ 2023-04-26  3:44   ` Li, Ming
  2023-04-26 18:27     ` Dave Jiang
  2 siblings, 1 reply; 70+ messages in thread
From: Li, Ming @ 2023-04-26  3:44 UTC (permalink / raw)
  To: Dave Jiang
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron, linux-cxl, linux-acpi

On 4/20/2023 4:21 AM, Dave Jiang wrote:
> Provide a callback function to the CDAT parser in order to parse the Device
> Scoped Memory Affinity Structure (DSMAS). Each DSMAS structure contains the
> DPA range and its associated attributes in each entry. See the CDAT
> specification for details.
> 
> Coherent Device Attribute Table 1.03 2.1 Device Scoped memory Affinity
> Structure (DSMAS)
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> 
> ---
> v3:
> - Add spec section number. (Alison)
> - Remove cast from void *. (Alison)
> - Refactor cxl_port_probe() block. (Alison)
> - Move CDAT parse to cxl_endpoint_port_probe()
> 
> v2:
> - Add DSMAS table size check. (Lukas)
> - Use local DSMAS header for LE handling.
> - Remove dsmas lock. (Jonathan)
> - Fix handle size (Jonathan)
> - Add LE to host conversion for DSMAS address and length.
> - Make dsmas_list local
> ---
>  drivers/cxl/core/cdat.c |   26 ++++++++++++++++++++++++++
>  drivers/cxl/cxl.h       |    1 +
>  drivers/cxl/cxlpci.h    |   18 ++++++++++++++++++
>  drivers/cxl/port.c      |   22 ++++++++++++++++++++++
>  4 files changed, 67 insertions(+)
> 
> diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
> index 210f4499bddb..6f20af83a3ed 100644
> --- a/drivers/cxl/core/cdat.c
> +++ b/drivers/cxl/core/cdat.c
> @@ -98,3 +98,29 @@ int cdat_table_parse_sslbis(struct cdat_header *table,
>  	return cdat_table_parse_entries(CDAT_TYPE_SSLBIS, table, &proc);
>  }
>  EXPORT_SYMBOL_NS_GPL(cdat_table_parse_sslbis, CXL);
> +
> +int cxl_dsmas_parse_entry(struct cdat_entry_header *header, void *arg)
> +{
> +	struct cdat_dsmas *dsmas = (struct cdat_dsmas *)header;
> +	struct list_head *dsmas_list = arg;
> +	struct dsmas_entry *dent;
> +
> +	if (dsmas->hdr.length != sizeof(*dsmas)) {
> +		pr_warn("Malformed DSMAS table length: (%lu:%u)\n",
> +			 (unsigned long)sizeof(*dsmas), dsmas->hdr.length);
> +		return -EINVAL;
> +	}
> +
> +	dent = kzalloc(sizeof(*dent), GFP_KERNEL);
> +	if (!dent)
> +		return -ENOMEM;
> +
> +	dent->handle = dsmas->dsmad_handle;
> +	dent->dpa_range.start = le64_to_cpu(dsmas->dpa_base_address);
> +	dent->dpa_range.end = le64_to_cpu(dsmas->dpa_base_address) +
> +			      le64_to_cpu(dsmas->dpa_length) - 1;

Hi Dave,

I saw you didn't store flags field into dent, it is not needed or missed?

Thanks
Ming



^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 05/23] cxl: Add callback to parse the DSMAS subtables from CDAT
  2023-04-26  3:44   ` Li, Ming
@ 2023-04-26 18:27     ` Dave Jiang
  0 siblings, 0 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-26 18:27 UTC (permalink / raw)
  To: Li, Ming
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, lukas, Jonathan.Cameron, linux-cxl, linux-acpi



On 4/25/23 8:44 PM, Li, Ming wrote:
> On 4/20/2023 4:21 AM, Dave Jiang wrote:
>> Provide a callback function to the CDAT parser in order to parse the Device
>> Scoped Memory Affinity Structure (DSMAS). Each DSMAS structure contains the
>> DPA range and its associated attributes in each entry. See the CDAT
>> specification for details.
>>
>> Coherent Device Attribute Table 1.03 2.1 Device Scoped memory Affinity
>> Structure (DSMAS)
>>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>>
>> ---
>> v3:
>> - Add spec section number. (Alison)
>> - Remove cast from void *. (Alison)
>> - Refactor cxl_port_probe() block. (Alison)
>> - Move CDAT parse to cxl_endpoint_port_probe()
>>
>> v2:
>> - Add DSMAS table size check. (Lukas)
>> - Use local DSMAS header for LE handling.
>> - Remove dsmas lock. (Jonathan)
>> - Fix handle size (Jonathan)
>> - Add LE to host conversion for DSMAS address and length.
>> - Make dsmas_list local
>> ---
>>   drivers/cxl/core/cdat.c |   26 ++++++++++++++++++++++++++
>>   drivers/cxl/cxl.h       |    1 +
>>   drivers/cxl/cxlpci.h    |   18 ++++++++++++++++++
>>   drivers/cxl/port.c      |   22 ++++++++++++++++++++++
>>   4 files changed, 67 insertions(+)
>>
>> diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
>> index 210f4499bddb..6f20af83a3ed 100644
>> --- a/drivers/cxl/core/cdat.c
>> +++ b/drivers/cxl/core/cdat.c
>> @@ -98,3 +98,29 @@ int cdat_table_parse_sslbis(struct cdat_header *table,
>>   	return cdat_table_parse_entries(CDAT_TYPE_SSLBIS, table, &proc);
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cdat_table_parse_sslbis, CXL);
>> +
>> +int cxl_dsmas_parse_entry(struct cdat_entry_header *header, void *arg)
>> +{
>> +	struct cdat_dsmas *dsmas = (struct cdat_dsmas *)header;
>> +	struct list_head *dsmas_list = arg;
>> +	struct dsmas_entry *dent;
>> +
>> +	if (dsmas->hdr.length != sizeof(*dsmas)) {
>> +		pr_warn("Malformed DSMAS table length: (%lu:%u)\n",
>> +			 (unsigned long)sizeof(*dsmas), dsmas->hdr.length);
>> +		return -EINVAL;
>> +	}
>> +
>> +	dent = kzalloc(sizeof(*dent), GFP_KERNEL);
>> +	if (!dent)
>> +		return -ENOMEM;
>> +
>> +	dent->handle = dsmas->dsmad_handle;
>> +	dent->dpa_range.start = le64_to_cpu(dsmas->dpa_base_address);
>> +	dent->dpa_range.end = le64_to_cpu(dsmas->dpa_base_address) +
>> +			      le64_to_cpu(dsmas->dpa_length) - 1;
> 
> Hi Dave,
> 
> I saw you didn't store flags field into dent, it is not needed or missed?

Hi Ming,
I didn't have a need for it for this patch set. But I think it may be 
needed in the future for DCD. I figured when we do, we can add it.

> 
> Thanks
> Ming
> 
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 01/23] cxl: Export QTG ids from CFMWS to sysfs
  2023-04-24 21:46   ` Dan Williams
@ 2023-04-26 23:14     ` Dave Jiang
  0 siblings, 0 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-26 23:14 UTC (permalink / raw)
  To: Dan Williams, linux-cxl, linux-acpi
  Cc: Ira Weiny, vishal.l.verma, alison.schofield, rafael, lukas,
	Jonathan.Cameron



On 4/24/23 2:46 PM, Dan Williams wrote:
> Dave Jiang wrote:
>> Export the QoS Throttling Group ID from the CXL Fixed Memory Window
>> Structure (CFMWS) under the root decoder sysfs attributes.
>> CXL rev3.0 9.17.1.3 CXL Fixed Memory Window Structure (CFMWS)
>>
>> cxl cli will use this QTG ID to match with the _DSM retrieved QTG ID for a
>> hot-plugged CXL memory device DPA memory range to make sure that the DPA range
>> is under the right CFMWS window.
>>
>> Reviewed-by: Ira Weiny <ira.weiny@intel.com>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>>
>> ---
>> v4:
>> - Change kernel version for documentation to v6.5
>> v2:
>> - Add explanation commit header (Jonathan)
>> ---
>>   Documentation/ABI/testing/sysfs-bus-cxl |    9 +++++++++
>>   drivers/cxl/acpi.c                      |    3 +++
>>   drivers/cxl/core/port.c                 |   14 ++++++++++++++
>>   drivers/cxl/cxl.h                       |    3 +++
>>   4 files changed, 29 insertions(+)
>>
>> diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
>> index 3acf2f17a73f..bd2b59784979 100644
>> --- a/Documentation/ABI/testing/sysfs-bus-cxl
>> +++ b/Documentation/ABI/testing/sysfs-bus-cxl
>> @@ -309,6 +309,15 @@ Description:
>>   		(WO) Write a string in the form 'regionZ' to delete that region,
>>   		provided it is currently idle / not bound to a driver.
>>   
>> +What:		/sys/bus/cxl/devices/decoderX.Y/qtg_id
>> +Date:		Jan, 2023
>> +KernelVersion:	v6.5
>> +Contact:	linux-cxl@vger.kernel.org
>> +Description:
>> +		(RO) Shows the QoS Throttling Group ID. The QTG ID for a root
>> +		decoder comes from the CFMWS structure of the CEDT. A value of
>> +		-1 indicates that no QTG ID was retrieved. The QTG ID is used as
>> +		guidance to match against the QTG ID of a hot-plugged device.
> 
> For user documentation I do not expect someone to know the relevance
> of those ACPI table names. Also, looking at this from a future proofing
> perspective, even though there is yet to be a non-ACPI CXL host
> definition I do not want to tie ourselves to ACPI-specific terms here.
> 
> The CXL generic concept here is a "class" as defined in CXL 3.0 3.3.4
> QoS Telemetry for Memory, and that mentions an optional platform
> facility to group memory regions by their performance.  So QTG-ID is an
> ACPI.CFMWS specific response to that CXL QoS class and grouping
> concept. See CXL 3.0 3.3.4.3 Memory Device Support for QoS Telemetry for
> its usage of "class").
> 
> So lets call the user-facing attribute a "qos_class". Then the
> description can be something like the below. Note that I call it a
> "cookie" since the value has no meaning besides just an id for
> matching purposes.
> 
> ---
> 
> What:		/sys/bus/cxl/devices/decoderX.Y/qos_class
> Description:
> 		(RO) For CXL host platforms that support "QoS Telemetry" this
> 		root-decoder-only attribute conveys a platform specific cookie
> 		that identifies a QoS performance class for the CXL Window.
> 		This class-id can be compared against a similar "qos_class"
> 		published for each memory-type that an endpoint supports. While
> 		it is not required that endpoints map their local memory-class
> 		to a matching platform class, mismatches are not recommended and
> 		there are platform specific side-effects that may result.

Ok I'll update.

> 
>>   
>>   What:		/sys/bus/cxl/devices/regionZ/uuid
>>   Date:		May, 2022
>> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
>> index 7e1765b09e04..abc24137c291 100644
>> --- a/drivers/cxl/acpi.c
>> +++ b/drivers/cxl/acpi.c
>> @@ -289,6 +289,9 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
>>   			}
>>   		}
>>   	}
>> +
>> +	cxld->qtg_id = cfmws->qtg_id;
>> +
>>   	rc = cxl_decoder_add(cxld, target_map);
>>   err_xormap:
>>   	if (rc)
>> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
>> index 4d1f9c5b5029..024d4178f557 100644
>> --- a/drivers/cxl/core/port.c
>> +++ b/drivers/cxl/core/port.c
>> @@ -276,6 +276,16 @@ static ssize_t interleave_ways_show(struct device *dev,
>>   
>>   static DEVICE_ATTR_RO(interleave_ways);
>>   
>> +static ssize_t qtg_id_show(struct device *dev,
>> +			   struct device_attribute *attr, char *buf)
>> +{
>> +	struct cxl_decoder *cxld = to_cxl_decoder(dev);
>> +
>> +	return sysfs_emit(buf, "%d\n", cxld->qtg_id);
>> +}
>> +
>> +static DEVICE_ATTR_RO(qtg_id);
>> +
>>   static struct attribute *cxl_decoder_base_attrs[] = {
>>   	&dev_attr_start.attr,
>>   	&dev_attr_size.attr,
>> @@ -295,6 +305,7 @@ static struct attribute *cxl_decoder_root_attrs[] = {
>>   	&dev_attr_cap_type2.attr,
>>   	&dev_attr_cap_type3.attr,
>>   	&dev_attr_target_list.attr,
>> +	&dev_attr_qtg_id.attr,
>>   	SET_CXL_REGION_ATTR(create_pmem_region)
>>   	SET_CXL_REGION_ATTR(create_ram_region)
>>   	SET_CXL_REGION_ATTR(delete_region)
>> @@ -1625,6 +1636,7 @@ struct cxl_root_decoder *cxl_root_decoder_alloc(struct cxl_port *port,
>>   	}
>>   
>>   	atomic_set(&cxlrd->region_id, rc);
>> +	cxld->qtg_id = CXL_QTG_ID_INVALID;
> 
> If qtg_id needs to stay in 'struct cxl_decoder' why not move this to
> cxl_decoder_init() and do it once?
> 
>>   	return cxlrd;
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_root_decoder_alloc, CXL);
>> @@ -1662,6 +1674,7 @@ struct cxl_switch_decoder *cxl_switch_decoder_alloc(struct cxl_port *port,
>>   
>>   	cxld = &cxlsd->cxld;
>>   	cxld->dev.type = &cxl_decoder_switch_type;
>> +	cxld->qtg_id = CXL_QTG_ID_INVALID;
>>   	return cxlsd;
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_switch_decoder_alloc, CXL);
>> @@ -1694,6 +1707,7 @@ struct cxl_endpoint_decoder *cxl_endpoint_decoder_alloc(struct cxl_port *port)
>>   	}
>>   
>>   	cxld->dev.type = &cxl_decoder_endpoint_type;
>> +	cxld->qtg_id = CXL_QTG_ID_INVALID;
>>   	return cxled;
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_endpoint_decoder_alloc, CXL);
>> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
>> index 044a92d9813e..278ab6952332 100644
>> --- a/drivers/cxl/cxl.h
>> +++ b/drivers/cxl/cxl.h
>> @@ -300,6 +300,7 @@ enum cxl_decoder_type {
>>    */
>>   #define CXL_DECODER_MAX_INTERLEAVE 16
>>   
>> +#define CXL_QTG_ID_INVALID	-1
>>   
>>   /**
>>    * struct cxl_decoder - Common CXL HDM Decoder Attributes
>> @@ -311,6 +312,7 @@ enum cxl_decoder_type {
>>    * @target_type: accelerator vs expander (type2 vs type3) selector
>>    * @region: currently assigned region for this decoder
>>    * @flags: memory type capabilities and locking
>> + * @qtg_id: QoS Throttling Group ID
>>    * @commit: device/decoder-type specific callback to commit settings to hw
>>    * @reset: device/decoder-type specific callback to reset hw settings
>>   */
>> @@ -323,6 +325,7 @@ struct cxl_decoder {
>>   	enum cxl_decoder_type target_type;
>>   	struct cxl_region *region;
>>   	unsigned long flags;
>> +	int qtg_id;
> 
> Why not just keep this limited to 'struct cxl_root_decoder'?

Will move to cxl_root_decoder
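
Something like the below is what I'm thinking for the show function once
it moves (sketch, assuming the attribute also picks up the qos_class
rename and the field lives in cxl_root_decoder):

static ssize_t qos_class_show(struct device *dev,
			      struct device_attribute *attr, char *buf)
{
	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(dev);

	return sysfs_emit(buf, "%d\n", cxlrd->qos_class);
}
static DEVICE_ATTR_RO(qos_class);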

> 
>>   	int (*commit)(struct cxl_decoder *cxld);
>>   	int (*reset)(struct cxl_decoder *cxld);
>>   };
>>
>>
> 
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 02/23] cxl: Add checksum verification to CDAT from CXL
  2023-04-24 22:01   ` Dan Williams
@ 2023-04-26 23:24     ` Dave Jiang
  0 siblings, 0 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-26 23:24 UTC (permalink / raw)
  To: Dan Williams, linux-cxl, linux-acpi
  Cc: Ira Weiny, vishal.l.verma, alison.schofield, rafael, lukas,
	Jonathan.Cameron



On 4/24/23 3:01 PM, Dan Williams wrote:
> Dave Jiang wrote:
>> A CDAT table is available from a CXL device. The table is read by the
>> driver and cached in software. With the CXL subsystem needing to parse the
>> CDAT table, the checksum should be verified. Add checksum verification
>> after the CDAT table is read from device.
>>
>> Reviewed-by: Ira Weiny <ira.weiny@intel.com>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>>
>> ---
>> v3:
>> - Just return the final sum. (Alison)
>> v2:
>> - Drop ACPI checksum export and just use local verification. (Dan)
>> ---
>>   drivers/cxl/core/pci.c |   16 ++++++++++++++++
>>   1 file changed, 16 insertions(+)
>>
>> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
>> index 25b7e8125d5d..9c7e2f69d9ca 100644
>> --- a/drivers/cxl/core/pci.c
>> +++ b/drivers/cxl/core/pci.c
>> @@ -528,6 +528,16 @@ static int cxl_cdat_read_table(struct device *dev,
>>   	return 0;
>>   }
>>   
>> +static unsigned char cdat_checksum(void *buf, size_t size)
>> +{
>> +	unsigned char sum, *data = buf;
>> +	size_t i;
>> +
>> +	for (sum = 0, i = 0; i < size; i++)
>> +		sum += data[i];
>> +	return sum;
>> +}
>> +
>>   /**
>>    * read_cdat_data - Read the CDAT data on this port
>>    * @port: Port to read data from
>> @@ -573,6 +583,12 @@ void read_cdat_data(struct cxl_port *port)
>>   	}
>>   
>>   	port->cdat.table = cdat_table + sizeof(__le32);
>> +	if (cdat_checksum(port->cdat.table, cdat_length)) {
>> +		/* Don't leave table data allocated on error */
>> +		devm_kfree(dev, cdat_table);
>> +		dev_err(dev, "CDAT data checksum error\n");
>> +	}
>> +
>>   	port->cdat.length = cdat_length;
> 
> I think read_cdat_data() is confused about error cases. I note that
> /sys/firmware/acpi/tables does not emit the entry if the table has bad
> length or bad checksum. If you want to have a debug mode then maybe make
> it a compile time option, but I otherwise do not see the benefit of
> publishing known bad tables to userspace.

I'll have it return on errors.
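
i.e. something along these lines (sketch):

	if (cdat_checksum(cdat_table + sizeof(__le32), cdat_length)) {
		/* don't publish a known-bad table */
		dev_err(dev, "CDAT data checksum error\n");
		devm_kfree(dev, cdat_table);
		return;
	}

	port->cdat.table = cdat_table + sizeof(__le32);
	port->cdat.length = cdat_length;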

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 04/23] cxl: Add common helpers for cdat parsing
  2023-04-25 16:00     ` Dave Jiang
@ 2023-04-27  0:09       ` Dan Williams
  0 siblings, 0 replies; 70+ messages in thread
From: Dan Williams @ 2023-04-27  0:09 UTC (permalink / raw)
  To: Dave Jiang, Dan Williams, linux-cxl, linux-acpi
  Cc: ira.weiny, vishal.l.verma, alison.schofield, rafael, lukas,
	Jonathan.Cameron

Dave Jiang wrote:
> 
> 
> On 4/24/23 3:33 PM, Dan Williams wrote:
> > Dave Jiang wrote:
> >> Add helper functions to parse the CDAT table and provide a callback to
> >> parse the sub-table. Helpers are provided for DSMAS and DSLBIS sub-table
> >> parsing. The code is patterned after the ACPI table parsing helpers.
> > 
> > It seems a shame that CDAT is so ACPI-like, but can't reuse the ACPI
> > table parsing infrastructure. Can this not be achieved by modifying some
> > of the helpers in drivers/acpi/tables.c to take a passed in
> > @table_header?
> 
> Rafael,
> Do you have any issues with adding some endianness support in
> drivers/acpi/tables.c in order to support CDAT parsing by BE hosts? To 
> start off with something like below?

Some additional background: recall that CDAT is an ACPI-like data
structure that lives on endpoint CXL devices to describe the access
characteristics of the device's memory, similar to SRAT+HMAT for
host memory. Unlike ACPI, which is guaranteed to be deployed on a
little-endian host CPU, a big-endian host might also encounter CXL
endpoints.

This reuse ends up at ~50 lines and duplication ends up at ~100 lines.
Not a huge win, but a win nonetheless.

> 
> diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c
> index 7b4680da57d7..e63e2daf151d 100644
> --- a/drivers/acpi/tables.c
> +++ b/drivers/acpi/tables.c
> @@ -287,6 +287,12 @@ acpi_get_subtable_type(char *id)
>          return ACPI_SUBTABLE_COMMON;
>   }
> 
> +static unsigned long __init_or_acpilib
> +acpi_table_get_length(struct acpi_table_header *hdr)
> +{
> +       return le32_to_cpu((__force __le32)hdr->length);
> +}
> +
>   static __init_or_acpilib bool has_handler(struct acpi_subtable_proc *proc)
>   {
>          return proc->handler || proc->handler_arg;
> @@ -337,7 +343,8 @@ static int __init_or_acpilib acpi_parse_entries_array(
>          int errs = 0;
>          int i;
> 
> -       table_end = (unsigned long)table_header + table_header->length;
> +       table_end = (unsigned long)table_header +
> +                   acpi_table_get_length(table_header);
> 
>          /* Parse all entries looking for a match. */
> 



^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 03/23] cxl: Add support for reading CXL switch CDAT table
  2023-04-24 22:08   ` Dan Williams
@ 2023-04-27 15:55     ` Dave Jiang
  0 siblings, 0 replies; 70+ messages in thread
From: Dave Jiang @ 2023-04-27 15:55 UTC (permalink / raw)
  To: Dan Williams, linux-cxl, linux-acpi
  Cc: Ira Weiny, vishal.l.verma, alison.schofield, rafael, lukas,
	Jonathan.Cameron



On 4/24/23 3:08 PM, Dan Williams wrote:
> Dave Jiang wrote:
>> Move read_cdat_data() from endpoint probe to general port probe to
>> allow reading of CDAT data for CXL switches as well as CXL device.
>> Add wrapper support for cxl_test to bypass the cdat reading.
>>
>> Reviewed-by: Ira Weiny <ira.weiny@intel.com>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>>
>> ---
>> v4:
>> - Remove cxl_test wrapper. (Ira)
>> ---
>>   drivers/cxl/core/pci.c |   20 +++++++++++++++-----
>>   drivers/cxl/port.c     |    6 +++---
>>   2 files changed, 18 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
>> index 9c7e2f69d9ca..1c415b26e866 100644
>> --- a/drivers/cxl/core/pci.c
>> +++ b/drivers/cxl/core/pci.c
>> @@ -546,16 +546,26 @@ static unsigned char cdat_checksum(void *buf, size_t size)
>>    */
>>   void read_cdat_data(struct cxl_port *port)
>>   {
>> -	struct pci_doe_mb *cdat_doe;
>> -	struct device *dev = &port->dev;
>>   	struct device *uport = port->uport;
>> -	struct cxl_memdev *cxlmd = to_cxl_memdev(uport);
>> -	struct cxl_dev_state *cxlds = cxlmd->cxlds;
>> -	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
>> +	struct device *dev = &port->dev;
>> +	struct cxl_dev_state *cxlds;
>> +	struct pci_doe_mb *cdat_doe;
>> +	struct cxl_memdev *cxlmd;
>> +	struct pci_dev *pdev;
>>   	size_t cdat_length;
>>   	void *cdat_table;
>>   	int rc;
>>   
>> +	if (is_cxl_memdev(uport)) {
>> +		cxlmd = to_cxl_memdev(uport);
>> +		cxlds = cxlmd->cxlds;
>> +		pdev = to_pci_dev(cxlds->dev);
> 
> Per this fix [1], there's no need to reference cxlds; the parent of the
> memory device is the device this wants. It also needs to be careful that not
> all 'struct cxl_memdev' instances are hosted by PCI devices.
> 
> [1]: http://lore.kernel.org/r/168213190748.708404.16215095414060364800.stgit@dwillia2-xfh.jf.intel.com

Ok will pull this fix patch in and rebase against that.
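
i.e. roughly keying off the memdev's parent instead (a sketch based on
my reading of that fix):

	if (is_cxl_memdev(uport)) {
		struct device *mdev_parent = uport->parent;

		/* e.g. cxl_test memdevs are not PCI-backed */
		if (!dev_is_pci(mdev_parent))
			return;
		pdev = to_pci_dev(mdev_parent);
	}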
> 
> Otherwise, looks good to me.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 10/23] cxl: Add helpers to calculate pci latency for the CXL device
  2023-04-25  0:30   ` Dan Williams
@ 2023-05-01 16:29     ` Dave Jiang
  0 siblings, 0 replies; 70+ messages in thread
From: Dave Jiang @ 2023-05-01 16:29 UTC (permalink / raw)
  To: Dan Williams, linux-cxl, linux-acpi
  Cc: ira.weiny, vishal.l.verma, alison.schofield, rafael, lukas,
	Jonathan.Cameron



On 4/24/23 5:30 PM, Dan Williams wrote:
> Dave Jiang wrote:
>> The latency is calculated by dividing the flit size over the bandwidth. Add
>> support to retrieve the flit size for the CXL device and calculate the
>> latency of the downstream link.
>>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>>
>> ---
>> v2:
>> - Fix commit log issues. (Jonathan)
>> - Fix var declaration issues. (Jonathan)
>> ---
>>   drivers/cxl/core/pci.c |   68 ++++++++++++++++++++++++++++++++++++++++++++++++
>>   drivers/cxl/cxlpci.h   |   15 +++++++++++
>>   drivers/cxl/pci.c      |   13 ---------
>>   3 files changed, 83 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
>> index 1c415b26e866..bb58296b3e56 100644
>> --- a/drivers/cxl/core/pci.c
>> +++ b/drivers/cxl/core/pci.c
>> @@ -712,3 +712,71 @@ pci_ers_result_t cxl_error_detected(struct pci_dev *pdev,
>>   	return PCI_ERS_RESULT_NEED_RESET;
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_error_detected, CXL);
>> +
>> +static int pci_bus_speed_to_mbps(enum pci_bus_speed speed)
>> +{
>> +	switch (speed) {
>> +	case PCIE_SPEED_2_5GT:
>> +		return 2500;
>> +	case PCIE_SPEED_5_0GT:
>> +		return 5000;
>> +	case PCIE_SPEED_8_0GT:
>> +		return 8000;
>> +	case PCIE_SPEED_16_0GT:
>> +		return 16000;
>> +	case PCIE_SPEED_32_0GT:
>> +		return 32000;
>> +	case PCIE_SPEED_64_0GT:
>> +		return 64000;
>> +	default:
>> +		break;
>> +	}
>> +
>> +	return -EINVAL;
>> +}
>> +
>> +static int cxl_pci_mbits_to_mbytes(struct pci_dev *pdev)
>> +{
>> +	int mbits;
>> +
>> +	mbits = pci_bus_speed_to_mbps(pdev->bus->cur_bus_speed);
>> +	if (mbits < 0)
>> +		return mbits;
>> +
>> +	return mbits >> 3;
> 
> Why not just return mbits directly and skip the conversion? Otherwise a
> "/ 8" requires a bit less cleverness to read than ">> 3".

You mean just move the math to the caller()?
> 
>> +}
>> +
>> +static int cxl_flit_size(struct pci_dev *pdev)
> 
> This seems like something that might be worth caching in 'struct cxl_port'
> rather than re-reading the configuration register each call. Depends on
> how often it is used.

You mean we just calculate it during cxl_port creation? I think the 
calculations for a switch upstream segment towards the root complex may 
be used multiple times. Downstream towards the device, it's used once or
more, depending on how many partitions there are. But probably not a big
deal to just cache it.

> 
>> +{
>> +	if (cxl_pci_flit_256(pdev))
>> +		return 256;
>> +
>> +	return 68;
>> +}
>> +
>> +/**
>> + * cxl_pci_get_latency - calculate the link latency for the PCIe link
>> + * @pdev - PCI device
>> + *
>> + * return: calculated latency or -errno
>> + *
>> + * CXL Memory Device SW Guide v1.0 2.11.4 Link latency calculation
>> + * Link latency = LinkPropagationLatency + FlitLatency + RetimerLatency
>> + * LinkPropagationLatency is negligible, so 0 will be used
>> + * RetimerLatency is assumed to be negligible and 0 will be used
>> + * FlitLatency = FlitSize / LinkBandwidth
>> + * FlitSize is defined by spec. CXL rev3.0 4.2.1.
>> + * 68B flit is used up to 32GT/s. >32GT/s, 256B flit size is used.
>> + * The FlitLatency is converted to picoseconds.
>> + */
>> +long cxl_pci_get_latency(struct pci_dev *pdev)
>> +{
>> +	long bw;
>> +
>> +	bw = cxl_pci_mbits_to_mbytes(pdev);
> 
> This function looks misnamed when I read it here, it's retrieving the
> bus speed in MiBs not doing a conversion.
> 
>> +	if (bw < 0)
>> +		return bw;
>> +
>> +	return cxl_flit_size(pdev) * 1000000L / bw;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_pci_get_latency, CXL);
>> diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
>> index 1bca1c0e4b40..795eba31fe29 100644
>> --- a/drivers/cxl/cxlpci.h
>> +++ b/drivers/cxl/cxlpci.h
>> @@ -167,6 +167,19 @@ struct cdat_sslbis {
>>   #define SSLBIS_US_PORT		0x0100
>>   #define SSLBIS_ANY_PORT		0xffff
>>   
>> +/*
>> + * CXL v3.0 6.2.3 Table 6-4
>> + * The table indicates that if PCIe Flit Mode is set, then CXL is in 256B flits
>> + * mode, otherwise it's 68B flits mode.
>> + */
>> +static inline bool cxl_pci_flit_256(struct pci_dev *pdev)
>> +{
>> +	u16 lnksta2;
>> +
>> +	pcie_capability_read_word(pdev, PCI_EXP_LNKSTA2, &lnksta2);
>> +	return lnksta2 & PCI_EXP_LNKSTA2_FLIT;
>> +}
>> +
>>   int devm_cxl_port_enumerate_dports(struct cxl_port *port);
>>   struct cxl_dev_state;
>>   int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
>> @@ -189,4 +202,6 @@ int cxl_##x##_parse_entry(struct cdat_entry_header *header, void *arg)
>>   cxl_parse_entry(dsmas);
>>   cxl_parse_entry(dslbis);
>>   cxl_parse_entry(sslbis);
>> +
>> +long cxl_pci_get_latency(struct pci_dev *pdev);
>>   #endif /* __CXL_PCI_H__ */
>> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
>> index ea38bd49b0cf..ed39d133b70d 100644
>> --- a/drivers/cxl/pci.c
>> +++ b/drivers/cxl/pci.c
>> @@ -365,19 +365,6 @@ static bool is_cxl_restricted(struct pci_dev *pdev)
>>   	return pci_pcie_type(pdev) == PCI_EXP_TYPE_RC_END;
>>   }
>>   
>> -/*
>> - * CXL v3.0 6.2.3 Table 6-4
>> - * The table indicates that if PCIe Flit Mode is set, then CXL is in 256B flits
>> - * mode, otherwise it's 68B flits mode.
>> - */
>> -static bool cxl_pci_flit_256(struct pci_dev *pdev)
>> -{
>> -	u16 lnksta2;
>> -
>> -	pcie_capability_read_word(pdev, PCI_EXP_LNKSTA2, &lnksta2);
>> -	return lnksta2 & PCI_EXP_LNKSTA2_FLIT;
>> -}
>> -
>>   static int cxl_pci_ras_unmask(struct pci_dev *pdev)
>>   {
>>   	struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
>>
>>
> 
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

end of thread, other threads:[~2023-05-01 16:29 UTC | newest]

Thread overview: 70+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-19 20:21 [PATCH v4 00/23] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
2023-04-19 20:21 ` [PATCH v4 01/23] cxl: Export QTG ids from CFMWS to sysfs Dave Jiang
2023-04-20  8:51   ` Jonathan Cameron
2023-04-20 20:53     ` Dave Jiang
2023-04-24 21:46   ` Dan Williams
2023-04-26 23:14     ` Dave Jiang
2023-04-19 20:21 ` [PATCH v4 02/23] cxl: Add checksum verification to CDAT from CXL Dave Jiang
2023-04-20  8:55   ` Jonathan Cameron
2023-04-24 22:01   ` Dan Williams
2023-04-26 23:24     ` Dave Jiang
2023-04-19 20:21 ` [PATCH v4 03/23] cxl: Add support for reading CXL switch CDAT table Dave Jiang
2023-04-20  9:25   ` Jonathan Cameron
2023-04-24 22:08   ` Dan Williams
2023-04-27 15:55     ` Dave Jiang
2023-04-19 20:21 ` [PATCH v4 04/23] cxl: Add common helpers for cdat parsing Dave Jiang
2023-04-20  9:41   ` Jonathan Cameron
2023-04-20 21:05     ` Dave Jiang
2023-04-21 16:06       ` Jonathan Cameron
2023-04-21 16:12         ` Dave Jiang
2023-04-24 22:33   ` Dan Williams
2023-04-25 16:00     ` Dave Jiang
2023-04-27  0:09       ` Dan Williams
2023-04-19 20:21 ` [PATCH v4 05/23] cxl: Add callback to parse the DSMAS subtables from CDAT Dave Jiang
2023-04-20 11:33   ` Jonathan Cameron
2023-04-20 11:35     ` Jonathan Cameron
2023-04-20 23:25       ` Dave Jiang
2023-04-24 22:38   ` Dan Williams
2023-04-26  3:44   ` Li, Ming
2023-04-26 18:27     ` Dave Jiang
2023-04-19 20:21 ` [PATCH v4 06/23] cxl: Add callback to parse the DSLBIS subtable " Dave Jiang
2023-04-20 11:40   ` Jonathan Cameron
2023-04-20 23:25     ` Dave Jiang
2023-04-24 22:46   ` Dan Williams
2023-04-24 22:59     ` Dave Jiang
2023-04-19 20:21 ` [PATCH v4 07/23] cxl: Add callback to parse the SSLBIS " Dave Jiang
2023-04-20 11:50   ` Jonathan Cameron
2023-04-24 23:38   ` Dan Williams
2023-04-19 20:21 ` [PATCH v4 08/23] cxl: Add support for _DSM Function for retrieving QTG ID Dave Jiang
2023-04-20 12:00   ` Jonathan Cameron
2023-04-21  0:11     ` Dave Jiang
2023-04-21 16:07       ` Jonathan Cameron
2023-04-25  0:12   ` Dan Williams
2023-04-19 20:21 ` [PATCH v4 09/23] cxl: Add helper function to retrieve ACPI handle of CXL root device Dave Jiang
2023-04-20 12:06   ` Jonathan Cameron
2023-04-21 23:24     ` Dave Jiang
2023-04-25  0:18   ` Dan Williams
2023-04-19 20:22 ` [PATCH v4 10/23] cxl: Add helpers to calculate pci latency for the CXL device Dave Jiang
2023-04-20 12:15   ` Jonathan Cameron
2023-04-25  0:30   ` Dan Williams
2023-05-01 16:29     ` Dave Jiang
2023-04-19 20:22 ` [PATCH v4 11/23] cxl: Add helper function that calculates QoS values for switches Dave Jiang
2023-04-20 12:26   ` Jonathan Cameron
2023-04-24 17:09     ` Dave Jiang
2023-04-24 17:31       ` Dave Jiang
2023-04-24 21:59         ` Jonathan Cameron
2023-04-25  0:33   ` Dan Williams
2023-04-19 20:22 ` [PATCH v4 12/23] cxl: Add helper function that calculate QoS values for PCI path Dave Jiang
2023-04-20 12:32   ` Jonathan Cameron
2023-04-25  0:45   ` Dan Williams
2023-04-19 20:22 ` [PATCH v4 13/23] ACPI: NUMA: Create enum for memory_target hmem_attrs indexing Dave Jiang
2023-04-19 20:22 ` [PATCH v4 14/23] ACPI: NUMA: Add genport target allocation to the HMAT parsing Dave Jiang
2023-04-19 20:22 ` [PATCH v4 15/23] ACPI: NUMA: Add setting of generic port locality attributes Dave Jiang
2023-04-19 20:22 ` [PATCH v4 16/23] ACPI: NUMA: Add helper function to retrieve the performance attributes Dave Jiang
2023-04-19 20:22 ` [PATCH v4 17/23] cxl: Add helper function to retrieve generic port QoS Dave Jiang
2023-04-19 20:22 ` [PATCH v4 18/23] cxl: Add latency and bandwidth calculations for the CXL path Dave Jiang
2023-04-19 20:22 ` [PATCH v4 19/23] cxl: Wait Memory_Info_Valid before access memory related info Dave Jiang
2023-04-19 20:23 ` [PATCH v4 20/23] cxl: Move identify and partition query from pci probe to port probe Dave Jiang
2023-04-19 20:23 ` [PATCH v4 21/23] cxl: Store QTG IDs and related info to the CXL memory device context Dave Jiang
2023-04-19 20:23 ` [PATCH v4 22/23] cxl: Export sysfs attributes for memory device QTG ID Dave Jiang
2023-04-19 20:23 ` [PATCH v4 23/23] cxl/mem: Add debugfs output for QTG related data Dave Jiang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).