* [PATCH 00/18] cxl: Add support for QTG ID retrieval for CXL subsystem
@ 2023-02-06 20:49 Dave Jiang
  2023-02-06 20:49 ` [PATCH 01/18] cxl: Export QTG ids from CFMWS to sysfs Dave Jiang
                   ` (17 more replies)
  0 siblings, 18 replies; 65+ messages in thread
From: Dave Jiang @ 2023-02-06 20:49 UTC (permalink / raw)
  To: linux-cxl, linux-pci, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, bhelgaas, robert.moore

Hi Bjorn, please review the patches relevant to the PCI subsystem: 10/18 and 11/18. Thank you!
pcie_get_speed() and pcie_get_width() are created in order to allow the CXL driver to calculate
the bandwidth and latency of the PCIe links.

Hi Rafael, please review the patches relevant to ACPI: 2/18 and 5/18. Thank you!
acpi_ut_verify_cdat_checksum() is exported to allow usage by a driver.

This series adds retrieval of QoS Throttling Group (QTG) IDs for the CXL Fixed
Memory Window Structure (CFMWS) and the CXL memory device. It exposes the QTG IDs
to user space to give some guidance for putting the proper DPA range under the
appropriate CFMWS window.

The CFMWS structure contains a QTG ID that is associated with the memory window that the
structure exports. On Linux, the CFMWS is represented as a CXL root decoder. The QTG
ID will be attached to the CXL root decoder and exported as a sysfs attribute (qtg_id).

The QTG ID for a device is retrieved by invoking a _DSM method on the ACPI0017 device.
The _DSM expects an input package of 4 DWORDs that contains the read latency, write
latency, read bandwidth, and write bandwidth. These are the calculated numbers for the
path between the CXL device and the CXL host bridge (HB). The QTG ID is also exported
as a sysfs attribute under the mem device memory type:
/sys/bus/cxl/devices/memX/ram/qtg_id or /sys/bus/cxl/devices/memX/pmem/qtg_id
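As an illustration of the 4-DWORD input package described above, the following
userspace-style sketch shows one plausible ordering of the fields (the enum and
function names here are invented for this sketch; they are not the kernel
implementation in this series):

```c
#include <stdint.h>

/* Hypothetical ordering of the four DWORDs handed to the QTG _DSM
 * (read latency, write latency, read bandwidth, write bandwidth). */
enum qtg_dsm_input {
	QTG_DSM_READ_LATENCY,
	QTG_DSM_WRITE_LATENCY,
	QTG_DSM_READ_BANDWIDTH,
	QTG_DSM_WRITE_BANDWIDTH,
	QTG_DSM_INPUT_COUNT,
};

/* Fill the input package from the calculated path numbers. */
static void qtg_fill_dsm_input(uint32_t out[QTG_DSM_INPUT_COUNT],
			       uint32_t rd_lat, uint32_t wr_lat,
			       uint32_t rd_bw, uint32_t wr_bw)
{
	out[QTG_DSM_READ_LATENCY] = rd_lat;
	out[QTG_DSM_WRITE_LATENCY] = wr_lat;
	out[QTG_DSM_READ_BANDWIDTH] = rd_bw;
	out[QTG_DSM_WRITE_BANDWIDTH] = wr_bw;
}
```

In the kernel the values are marshalled into an ACPI package object rather
than a flat array; this sketch only shows which four quantities go in.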

The latency numbers are the aggregated latencies for the path between the CXL device and
the CXL HB. If a CXL device is directly attached to the CXL HB, the latency
would be the device latencies from the device Coherent Device Attribute Table (CDAT) plus
the calculated PCIe link latency between the device and the HB. The bandwidth in this
configuration would be the minimum of the CDAT bandwidth number and the link bandwidth
between the device and the HB.

If a configuration has a switch in between, then the latency would be the aggregated
latencies from the device CDAT, the link latency between device and switch, the
latencies from the switch CDAT, and the link latency between switch and HB. Given
that the switch CDAT is not easily retrieved on Linux currently, an estimated
constant will be used for the calculation. The bandwidth calculation would be the
minimum of the device CDAT bandwidth, the link bandwidth between device and switch,
the switch CDAT bandwidth, and the link bandwidth between switch and HB. Without
the switch CDAT, the switch CDAT bandwidth contribution is skipped.
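The aggregation rule above (latencies add along the path, bandwidth is the
minimum across hops, unknown values such as a missing switch CDAT are skipped)
can be sketched as follows. This is an illustrative model only; the struct and
function names are invented here and do not match the helpers added in this
series:

```c
#include <stdint.h>

/* One hop's contribution to the device-to-HB path. A bandwidth of 0
 * means "unknown" (e.g. no switch CDAT) and is skipped in the min. */
struct path_perf {
	uint64_t latency;   /* additive along the path */
	uint64_t bandwidth; /* min of known values */
};

static uint64_t min_nonzero(uint64_t a, uint64_t b)
{
	if (a == 0)
		return b;
	if (b == 0)
		return a;
	return a < b ? a : b;
}

/* Fold one hop (device CDAT, link, switch CDAT, ...) into the totals. */
static void path_perf_combine(struct path_perf *total,
			      const struct path_perf *hop)
{
	total->latency += hop->latency;
	total->bandwidth = min_nonzero(total->bandwidth, hop->bandwidth);
}
```

For a device behind one switch, combining device CDAT, device-to-switch link,
switch (bandwidth unknown), and switch-to-HB link in sequence yields the
summed latency and the minimum of the known bandwidths.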

There can be 0 or more switches between the CXL device and the CXL HB. There are detailed
examples on calculating bandwidth and latency in the CXL Memory Device Software Guide [4].

The CDAT provides Device Scoped Memory Affinity Structures (DSMAS) that contain the
Device Physical Address (DPA) range and the related Device Scoped Latency and Bandwidth
Information Structures (DSLBIS). Each DSLBIS provides a latency or bandwidth entry that is
tied to a DSMAS entry via a per-DSMAS unique DSMAD handle.

[1]: https://www.computeexpresslink.org/download-the-specification
[2]: https://uefi.org/sites/default/files/resources/Coherent%20Device%20Attribute%20Table_1.01.pdf
[3]: https://uefi.org/sites/default/files/resources/ACPI_Spec_6_5_Aug29.pdf
[4]: https://cdrdv2-public.intel.com/643805/643805_CXL%20Memory%20Device%20SW%20Guide_Rev1p0.pdf

---

Dave Jiang (18):
      cxl: Export QTG ids from CFMWS to sysfs
      ACPICA: Export acpi_ut_verify_cdat_checksum()
      cxl: Add checksum verification to CDAT from CXL
      cxl: Add common helpers for cdat parsing
      ACPICA: Fix 'struct acpi_cdat_dsmas' spelling mistake
      cxl: Add callback to parse the DSMAS subtables from CDAT
      cxl: Add callback to parse the DSLBIS subtable from CDAT
      cxl: Add support for _DSM Function for retrieving QTG ID
      cxl: Add helper function to retrieve ACPI handle of CXL root device
      PCI: Export pcie_get_speed() using the code from sysfs PCI link speed show function
      PCI: Export pcie_get_width() using the code from sysfs PCI link width show function
      cxl: Add helpers to calculate pci latency for the CXL device
      cxl: Add latency and bandwidth calculations for the CXL path
      cxl: Wait Memory_Info_Valid before access memory related info
      cxl: Move identify and partition query from pci probe to port probe
      cxl: Move reading of CDAT data from device to after media is ready
      cxl: Attach QTG IDs to the DPA ranges for the device
      cxl: Export sysfs attributes for device QTG IDs


 Documentation/ABI/testing/sysfs-bus-cxl |  22 ++++
 drivers/acpi/acpica/utcksum.c           |   4 +-
 drivers/cxl/acpi.c                      |   3 +
 drivers/cxl/core/Makefile               |   2 +
 drivers/cxl/core/acpi.c                 | 129 +++++++++++++++++++
 drivers/cxl/core/cdat.c                 | 157 ++++++++++++++++++++++++
 drivers/cxl/core/cdat.h                 |  15 +++
 drivers/cxl/core/mbox.c                 |   2 +
 drivers/cxl/core/memdev.c               |  26 ++++
 drivers/cxl/core/pci.c                  | 103 +++++++++++++++-
 drivers/cxl/core/port.c                 |  76 ++++++++++++
 drivers/cxl/cxl.h                       |  50 ++++++++
 drivers/cxl/cxlmem.h                    |   2 +
 drivers/cxl/cxlpci.h                    |  14 +++
 drivers/cxl/pci.c                       |   8 --
 drivers/cxl/port.c                      | 105 +++++++++++++++-
 drivers/pci/pci-sysfs.c                 |  21 +---
 drivers/pci/pci.c                       |  40 ++++++
 include/acpi/actbl1.h                   |   7 +-
 include/linux/acpi.h                    |   7 ++
 include/linux/pci.h                     |   2 +
 21 files changed, 760 insertions(+), 35 deletions(-)
 create mode 100644 drivers/cxl/core/acpi.c
 create mode 100644 drivers/cxl/core/cdat.c
 create mode 100644 drivers/cxl/core/cdat.h

--


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH 01/18] cxl: Export QTG ids from CFMWS to sysfs
  2023-02-06 20:49 [PATCH 00/18] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
@ 2023-02-06 20:49 ` Dave Jiang
  2023-02-09 11:15   ` Jonathan Cameron
  2023-02-06 20:49 ` [PATCH 02/18] ACPICA: Export acpi_ut_verify_cdat_checksum() Dave Jiang
                   ` (16 subsequent siblings)
  17 siblings, 1 reply; 65+ messages in thread
From: Dave Jiang @ 2023-02-06 20:49 UTC (permalink / raw)
  To: linux-cxl, linux-pci, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, bhelgaas, robert.moore

Export the QoS Throttling Group ID from the CXL Fixed Memory Window
Structure (CFMWS) under the root decoder sysfs attributes.
CXL rev3.0 9.17.1.3 CXL Fixed Memory Window Structure (CFMWS)

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 Documentation/ABI/testing/sysfs-bus-cxl |    7 +++++++
 drivers/cxl/acpi.c                      |    3 +++
 drivers/cxl/core/port.c                 |   14 ++++++++++++++
 drivers/cxl/cxl.h                       |    3 +++
 4 files changed, 27 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
index 8494ef27e8d2..0932c2f6fbf4 100644
--- a/Documentation/ABI/testing/sysfs-bus-cxl
+++ b/Documentation/ABI/testing/sysfs-bus-cxl
@@ -294,6 +294,13 @@ Description:
 		(WO) Write a string in the form 'regionZ' to delete that region,
 		provided it is currently idle / not bound to a driver.
 
+What:		/sys/bus/cxl/devices/decoderX.Y/qtg_id
+Date:		Jan, 2023
+KernelVersion:	v6.3
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		(RO) Shows the QoS Throttling Group ID. The QTG ID for a root
+		decoder comes from the CFMWS structure of the CEDT.
 
 What:		/sys/bus/cxl/devices/regionZ/uuid
 Date:		May, 2022
diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index 13cde44c6086..7a71bb5041c7 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -289,6 +289,9 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
 			}
 		}
 	}
+
+	cxld->qtg_id = cfmws->qtg_id;
+
 	rc = cxl_decoder_add(cxld, target_map);
 err_xormap:
 	if (rc)
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index b631a0520456..fe78daf7e7c8 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -284,6 +284,16 @@ static ssize_t interleave_ways_show(struct device *dev,
 
 static DEVICE_ATTR_RO(interleave_ways);
 
+static ssize_t qtg_id_show(struct device *dev,
+			   struct device_attribute *attr, char *buf)
+{
+	struct cxl_decoder *cxld = to_cxl_decoder(dev);
+
+	return sysfs_emit(buf, "%d\n", cxld->qtg_id);
+}
+
+static DEVICE_ATTR_RO(qtg_id);
+
 static struct attribute *cxl_decoder_base_attrs[] = {
 	&dev_attr_start.attr,
 	&dev_attr_size.attr,
@@ -303,6 +313,7 @@ static struct attribute *cxl_decoder_root_attrs[] = {
 	&dev_attr_cap_type2.attr,
 	&dev_attr_cap_type3.attr,
 	&dev_attr_target_list.attr,
+	&dev_attr_qtg_id.attr,
 	SET_CXL_REGION_ATTR(create_pmem_region)
 	SET_CXL_REGION_ATTR(delete_region)
 	NULL,
@@ -1606,6 +1617,7 @@ struct cxl_root_decoder *cxl_root_decoder_alloc(struct cxl_port *port,
 	}
 
 	atomic_set(&cxlrd->region_id, rc);
+	cxld->qtg_id = CXL_QTG_ID_INVALID;
 	return cxlrd;
 }
 EXPORT_SYMBOL_NS_GPL(cxl_root_decoder_alloc, CXL);
@@ -1643,6 +1655,7 @@ struct cxl_switch_decoder *cxl_switch_decoder_alloc(struct cxl_port *port,
 
 	cxld = &cxlsd->cxld;
 	cxld->dev.type = &cxl_decoder_switch_type;
+	cxld->qtg_id = CXL_QTG_ID_INVALID;
 	return cxlsd;
 }
 EXPORT_SYMBOL_NS_GPL(cxl_switch_decoder_alloc, CXL);
@@ -1675,6 +1688,7 @@ struct cxl_endpoint_decoder *cxl_endpoint_decoder_alloc(struct cxl_port *port)
 	}
 
 	cxld->dev.type = &cxl_decoder_endpoint_type;
+	cxld->qtg_id = CXL_QTG_ID_INVALID;
 	return cxled;
 }
 EXPORT_SYMBOL_NS_GPL(cxl_endpoint_decoder_alloc, CXL);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 1b1cf459ac77..f558bbfc0332 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -279,6 +279,7 @@ enum cxl_decoder_type {
  */
 #define CXL_DECODER_MAX_INTERLEAVE 16
 
+#define CXL_QTG_ID_INVALID	-1
 
 /**
  * struct cxl_decoder - Common CXL HDM Decoder Attributes
@@ -290,6 +291,7 @@ enum cxl_decoder_type {
  * @target_type: accelerator vs expander (type2 vs type3) selector
  * @region: currently assigned region for this decoder
  * @flags: memory type capabilities and locking
+ * @qtg_id: QoS Throttling Group ID
  * @commit: device/decoder-type specific callback to commit settings to hw
  * @reset: device/decoder-type specific callback to reset hw settings
 */
@@ -302,6 +304,7 @@ struct cxl_decoder {
 	enum cxl_decoder_type target_type;
 	struct cxl_region *region;
 	unsigned long flags;
+	int qtg_id;
 	int (*commit)(struct cxl_decoder *cxld);
 	int (*reset)(struct cxl_decoder *cxld);
 };




* [PATCH 02/18] ACPICA: Export acpi_ut_verify_cdat_checksum()
  2023-02-06 20:49 [PATCH 00/18] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
  2023-02-06 20:49 ` [PATCH 01/18] cxl: Export QTG ids from CFMWS to sysfs Dave Jiang
@ 2023-02-06 20:49 ` Dave Jiang
  2023-02-07 14:19   ` Rafael J. Wysocki
  2023-02-06 20:49 ` [PATCH 03/18] cxl: Add checksum verification to CDAT from CXL Dave Jiang
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 65+ messages in thread
From: Dave Jiang @ 2023-02-06 20:49 UTC (permalink / raw)
  To: linux-cxl, linux-pci, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, bhelgaas, robert.moore

Export the CDAT checksum verify function so the CXL driver can use it to verify
the CDAT coming from CXL devices.

Given that this function isn't actually being used by ACPI internals, remove
the ACPI_CHECKSUM_ABORT define check so the function returns failure on a
checksum mismatch, since the driver will need to know.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/acpi/acpica/utcksum.c |    4 +---
 include/linux/acpi.h          |    7 +++++++
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/acpica/utcksum.c b/drivers/acpi/acpica/utcksum.c
index c166e4c05ab6..c0f98c8f9a0b 100644
--- a/drivers/acpi/acpica/utcksum.c
+++ b/drivers/acpi/acpica/utcksum.c
@@ -102,15 +102,13 @@ acpi_ut_verify_cdat_checksum(struct acpi_table_cdat *cdat_table, u32 length)
 				   "should be 0x%2.2X",
 				   acpi_gbl_CDAT, cdat_table->checksum,
 				   checksum));
-
-#if (ACPI_CHECKSUM_ABORT)
 		return (AE_BAD_CHECKSUM);
-#endif
 	}
 
 	cdat_table->checksum = checksum;
 	return (AE_OK);
 }
+EXPORT_SYMBOL_GPL(acpi_ut_verify_cdat_checksum);
 
 /*******************************************************************************
  *
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 5e6a876e17ba..09b44afef7df 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -1504,9 +1504,16 @@ static inline void acpi_init_ffh(void) { }
 #ifdef CONFIG_ACPI
 extern void acpi_device_notify(struct device *dev);
 extern void acpi_device_notify_remove(struct device *dev);
+extern acpi_status
+acpi_ut_verify_cdat_checksum(struct acpi_table_cdat *cdat_table, u32 length);
 #else
 static inline void acpi_device_notify(struct device *dev) { }
 static inline void acpi_device_notify_remove(struct device *dev) { }
+static inline acpi_status
+acpi_ut_verify_cdat_checksum(struct acpi_table_cdat *cdat_table, u32 length)
+{
+	return (AE_NOT_CONFIGURED);
+}
 #endif
 
 #endif	/*_LINUX_ACPI_H*/




* [PATCH 03/18] cxl: Add checksum verification to CDAT from CXL
  2023-02-06 20:49 [PATCH 00/18] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
  2023-02-06 20:49 ` [PATCH 01/18] cxl: Export QTG ids from CFMWS to sysfs Dave Jiang
  2023-02-06 20:49 ` [PATCH 02/18] ACPICA: Export acpi_ut_verify_cdat_checksum() Dave Jiang
@ 2023-02-06 20:49 ` Dave Jiang
  2023-02-09 11:34   ` Jonathan Cameron
  2023-02-06 20:49 ` [PATCH 04/18] cxl: Add common helpers for cdat parsing Dave Jiang
                   ` (14 subsequent siblings)
  17 siblings, 1 reply; 65+ messages in thread
From: Dave Jiang @ 2023-02-06 20:49 UTC (permalink / raw)
  To: linux-cxl, linux-pci, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, bhelgaas, robert.moore

A CDAT table is available from a CXL device. The table is read by the
driver and cached in software. Since the CXL subsystem needs to parse the
CDAT table, the checksum should be verified. Add checksum verification
after the CDAT table is read from the device.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/cxl/core/pci.c |   11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index 57764e9cd19d..a24dac36bedd 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -3,6 +3,7 @@
 #include <linux/io-64-nonatomic-lo-hi.h>
 #include <linux/device.h>
 #include <linux/delay.h>
+#include <linux/acpi.h>
 #include <linux/pci.h>
 #include <linux/pci-doe.h>
 #include <cxlpci.h>
@@ -592,6 +593,7 @@ void read_cdat_data(struct cxl_port *port)
 	struct device *dev = &port->dev;
 	struct device *uport = port->uport;
 	size_t cdat_length;
+	acpi_status status;
 	int rc;
 
 	cdat_doe = find_cdat_doe(uport);
@@ -620,5 +622,14 @@ void read_cdat_data(struct cxl_port *port)
 		port->cdat.length = 0;
 		dev_err(dev, "CDAT data read error\n");
 	}
+
+	status = acpi_ut_verify_cdat_checksum(port->cdat.table, port->cdat.length);
+	if (status != AE_OK) {
+		/* Don't leave table data allocated on error */
+		devm_kfree(dev, port->cdat.table);
+		port->cdat.table = NULL;
+		port->cdat.length = 0;
+		dev_err(dev, "CDAT data checksum error\n");
+	}
 }
 EXPORT_SYMBOL_NS_GPL(read_cdat_data, CXL);




* [PATCH 04/18] cxl: Add common helpers for cdat parsing
  2023-02-06 20:49 [PATCH 00/18] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (2 preceding siblings ...)
  2023-02-06 20:49 ` [PATCH 03/18] cxl: Add checksum verification to CDAT from CXL Dave Jiang
@ 2023-02-06 20:49 ` Dave Jiang
  2023-02-09 11:58   ` Jonathan Cameron
  2023-02-06 20:50 ` [PATCH 05/18] ACPICA: Fix 'struct acpi_cdat_dsmas' spelling mistake Dave Jiang
                   ` (13 subsequent siblings)
  17 siblings, 1 reply; 65+ messages in thread
From: Dave Jiang @ 2023-02-06 20:49 UTC (permalink / raw)
  To: linux-cxl, linux-pci, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, bhelgaas, robert.moore

Add helper functions to parse the CDAT table, invoking a callback for each
sub-table entry. Helpers are provided for DSMAS and DSLBIS sub-table
parsing. The code is patterned after the ACPI table parsing helpers.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/cxl/core/Makefile |    1 
 drivers/cxl/core/cdat.c   |   98 +++++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/core/cdat.h   |   15 +++++++
 drivers/cxl/cxl.h         |    9 ++++
 4 files changed, 123 insertions(+)
 create mode 100644 drivers/cxl/core/cdat.c
 create mode 100644 drivers/cxl/core/cdat.h

diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
index 79c7257f4107..438ce27faf77 100644
--- a/drivers/cxl/core/Makefile
+++ b/drivers/cxl/core/Makefile
@@ -10,4 +10,5 @@ cxl_core-y += memdev.o
 cxl_core-y += mbox.o
 cxl_core-y += pci.o
 cxl_core-y += hdm.o
+cxl_core-y += cdat.o
 cxl_core-$(CONFIG_CXL_REGION) += region.o
diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
new file mode 100644
index 000000000000..be09c8a690f5
--- /dev/null
+++ b/drivers/cxl/core/cdat.c
@@ -0,0 +1,98 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2023 Intel Corporation. All rights reserved. */
+#include "cxl.h"
+#include "cdat.h"
+
+static u8 cdat_get_subtable_entry_type(struct cdat_subtable_entry *entry)
+{
+	return entry->hdr->type;
+}
+
+static u16 cdat_get_subtable_entry_length(struct cdat_subtable_entry *entry)
+{
+	return entry->hdr->length;
+}
+
+static bool has_handler(struct cdat_subtable_proc *proc)
+{
+	return proc->handler;
+}
+
+static int call_handler(struct cdat_subtable_proc *proc,
+			struct cdat_subtable_entry *ent)
+{
+	if (proc->handler)
+		return proc->handler(ent->hdr, proc->arg);
+	return -EINVAL;
+}
+
+static int cdat_table_parse_entries(enum acpi_cdat_type type,
+				    struct acpi_table_cdat *table_header,
+				    struct cdat_subtable_proc *proc,
+				    unsigned int max_entries)
+{
+	struct cdat_subtable_entry entry;
+	unsigned long table_end, entry_len;
+	int count = 0;
+	int rc;
+
+	if (!has_handler(proc))
+		return -EINVAL;
+
+	table_end = (unsigned long)table_header + table_header->length;
+
+	if (type >= ACPI_CDAT_TYPE_RESERVED)
+		return -EINVAL;
+
+	entry.type = type;
+	entry.hdr = (struct acpi_cdat_header *)((unsigned long)table_header +
+					       sizeof(*table_header));
+
+	while ((unsigned long)entry.hdr < table_end) {
+		entry_len = cdat_get_subtable_entry_length(&entry);
+
+		if ((unsigned long)entry.hdr + entry_len > table_end)
+			return -EINVAL;
+
+		if (max_entries && count >= max_entries)
+			break;
+
+		if (entry_len == 0)
+			return -EINVAL;
+
+		if (cdat_get_subtable_entry_type(&entry) == type) {
+			rc = call_handler(proc, &entry);
+			if (rc)
+				return rc;
+		}
+
+		entry.hdr = (struct acpi_cdat_header *)((unsigned long)entry.hdr + entry_len);
+		count++;
+	}
+
+	return count;
+}
+
+int cdat_table_parse_dsmas(void *table, cdat_tbl_entry_handler handler, void *arg)
+{
+	struct acpi_table_cdat *header = (struct acpi_table_cdat *)table;
+	struct cdat_subtable_proc proc = {
+		.handler	= handler,
+		.arg		= arg,
+	};
+
+	return cdat_table_parse_entries(ACPI_CDAT_TYPE_DSMAS, header, &proc, 0);
+}
+EXPORT_SYMBOL_NS_GPL(cdat_table_parse_dsmas, CXL);
+
+int cdat_table_parse_dslbis(void *table, cdat_tbl_entry_handler handler, void *arg)
+{
+	struct acpi_table_cdat *header = (struct acpi_table_cdat *)table;
+	struct cdat_subtable_proc proc = {
+		.handler	= handler,
+		.arg		= arg,
+	};
+
+	return cdat_table_parse_entries(ACPI_CDAT_TYPE_DSLBIS, header, &proc, 0);
+}
+EXPORT_SYMBOL_NS_GPL(cdat_table_parse_dslbis, CXL);
diff --git a/drivers/cxl/core/cdat.h b/drivers/cxl/core/cdat.h
new file mode 100644
index 000000000000..f690325e82a6
--- /dev/null
+++ b/drivers/cxl/core/cdat.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright(c) 2023 Intel Corporation. */
+#ifndef __CXL_CDAT_H__
+#define __CXL_CDAT_H__
+
+struct cdat_subtable_proc {
+	cdat_tbl_entry_handler handler;
+	void *arg;
+};
+
+struct cdat_subtable_entry {
+	struct acpi_cdat_header *hdr;
+	enum acpi_cdat_type type;
+};
+#endif
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index f558bbfc0332..839a121c1997 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -9,6 +9,7 @@
 #include <linux/bitops.h>
 #include <linux/log2.h>
 #include <linux/io.h>
+#include <linux/acpi.h>
 
 /**
  * DOC: cxl objects
@@ -697,6 +698,14 @@ static inline struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev)
 }
 #endif
 
+typedef int (*cdat_tbl_entry_handler)(struct acpi_cdat_header *header, void *arg);
+
+u8 cdat_table_checksum(u8 *buffer, u32 length);
+int cdat_table_parse_dsmas(void *table, cdat_tbl_entry_handler handler,
+			   void *arg);
+int cdat_table_parse_dslbis(void *table, cdat_tbl_entry_handler handler,
+			    void *arg);
+
 /*
  * Unit test builds overrides this to __weak, find the 'strong' version
  * of these symbols in tools/testing/cxl/.




* [PATCH 05/18] ACPICA: Fix 'struct acpi_cdat_dsmas' spelling mistake
  2023-02-06 20:49 [PATCH 00/18] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (3 preceding siblings ...)
  2023-02-06 20:49 ` [PATCH 04/18] cxl: Add common helpers for cdat parsing Dave Jiang
@ 2023-02-06 20:50 ` Dave Jiang
  2023-02-06 22:00   ` Lukas Wunner
  2023-02-06 20:50 ` [PATCH 06/18] cxl: Add callback to parse the DSMAS subtables from CDAT Dave Jiang
                   ` (12 subsequent siblings)
  17 siblings, 1 reply; 65+ messages in thread
From: Dave Jiang @ 2023-02-06 20:50 UTC (permalink / raw)
  To: linux-cxl, linux-pci, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, bhelgaas, robert.moore

'struct acpi_cadt_dsmas' => 'struct acpi_cdat_dsmas'

Fixes: 51aad1a6723b ("ACPICA: Finish support for the CDAT table")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 include/acpi/actbl1.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/acpi/actbl1.h b/include/acpi/actbl1.h
index 4175dce3967c..e8297cefde09 100644
--- a/include/acpi/actbl1.h
+++ b/include/acpi/actbl1.h
@@ -344,7 +344,7 @@ enum acpi_cdat_type {
 
 /* Subtable 0: Device Scoped Memory Affinity Structure (DSMAS) */
 
-struct acpi_cadt_dsmas {
+struct acpi_cdat_dsmas {
 	u8 dsmad_handle;
 	u8 flags;
 	u16 reserved;




* [PATCH 06/18] cxl: Add callback to parse the DSMAS subtables from CDAT
  2023-02-06 20:49 [PATCH 00/18] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (4 preceding siblings ...)
  2023-02-06 20:50 ` [PATCH 05/18] ACPICA: Fix 'struct acpi_cdat_dsmas' spelling mistake Dave Jiang
@ 2023-02-06 20:50 ` Dave Jiang
  2023-02-09 13:29   ` Jonathan Cameron
  2023-02-06 20:50 ` [PATCH 07/18] cxl: Add callback to parse the DSLBIS subtable " Dave Jiang
                   ` (11 subsequent siblings)
  17 siblings, 1 reply; 65+ messages in thread
From: Dave Jiang @ 2023-02-06 20:50 UTC (permalink / raw)
  To: linux-cxl, linux-pci, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, bhelgaas, robert.moore

Provide a callback function to the CDAT parser in order to parse the Device
Scoped Memory Affinity Structure (DSMAS). Each DSMAS entry contains a DPA
range and its associated attributes. See the CDAT specification for details.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/cxl/core/cdat.c |   25 +++++++++++++++++++++++++
 drivers/cxl/core/port.c |    2 ++
 drivers/cxl/cxl.h       |   11 +++++++++++
 drivers/cxl/port.c      |    8 ++++++++
 4 files changed, 46 insertions(+)

diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
index be09c8a690f5..f9a64a0f1ee4 100644
--- a/drivers/cxl/core/cdat.c
+++ b/drivers/cxl/core/cdat.c
@@ -96,3 +96,28 @@ int cdat_table_parse_dslbis(void *table, cdat_tbl_entry_handler handler, void *a
 	return cdat_table_parse_entries(ACPI_CDAT_TYPE_DSLBIS, header, &proc, 0);
 }
 EXPORT_SYMBOL_NS_GPL(cdat_table_parse_dslbis, CXL);
+
+int cxl_dsmas_parse_entry(struct acpi_cdat_header *header, void *arg)
+{
+	struct cxl_port *port = (struct cxl_port *)arg;
+	struct dsmas_entry *dent;
+	struct acpi_cdat_dsmas *dsmas;
+
+	if (header->type != ACPI_CDAT_TYPE_DSMAS)
+		return -EINVAL;
+
+	dent = devm_kzalloc(&port->dev, sizeof(*dent), GFP_KERNEL);
+	if (!dent)
+		return -ENOMEM;
+
+	dsmas = (struct acpi_cdat_dsmas *)((unsigned long)header + sizeof(*header));
+	dent->handle = dsmas->dsmad_handle;
+	dent->dpa_range.start = dsmas->dpa_base_address;
+	dent->dpa_range.end = dsmas->dpa_base_address + dsmas->dpa_length - 1;
+
+	mutex_lock(&port->cdat.dsmas_lock);
+	list_add_tail(&dent->list, &port->cdat.dsmas_list);
+	mutex_unlock(&port->cdat.dsmas_lock);
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_dsmas_parse_entry, CXL);
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index fe78daf7e7c8..2b27319cfd42 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -660,6 +660,8 @@ static struct cxl_port *cxl_port_alloc(struct device *uport,
 	device_set_pm_not_required(dev);
 	dev->bus = &cxl_bus_type;
 	dev->type = &cxl_port_type;
+	INIT_LIST_HEAD(&port->cdat.dsmas_list);
+	mutex_init(&port->cdat.dsmas_lock);
 
 	return port;
 
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 839a121c1997..1e5e69f08480 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -8,6 +8,7 @@
 #include <linux/bitfield.h>
 #include <linux/bitops.h>
 #include <linux/log2.h>
+#include <linux/list.h>
 #include <linux/io.h>
 #include <linux/acpi.h>
 
@@ -520,6 +521,8 @@ struct cxl_port {
 	struct cxl_cdat {
 		void *table;
 		size_t length;
+		struct list_head dsmas_list;
+		struct mutex dsmas_lock; /* lock for dsmas_list */
 	} cdat;
 	bool cdat_available;
 };
@@ -698,6 +701,12 @@ static inline struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev)
 }
 #endif
 
+struct dsmas_entry {
+	struct list_head list;
+	struct range dpa_range;
+	u16 handle;
+};
+
 typedef int (*cdat_tbl_entry_handler)(struct acpi_cdat_header *header, void *arg);
 
 u8 cdat_table_checksum(u8 *buffer, u32 length);
@@ -706,6 +715,8 @@ int cdat_table_parse_dsmas(void *table, cdat_tbl_entry_handler handler,
 int cdat_table_parse_dslbis(void *table, cdat_tbl_entry_handler handler,
 			    void *arg);
 
+int cxl_dsmas_parse_entry(struct acpi_cdat_header *header, void *arg);
+
 /*
  * Unit test builds overrides this to __weak, find the 'strong' version
  * of these symbols in tools/testing/cxl/.
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index 5453771bf330..b1da73e99bab 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -61,6 +61,14 @@ static int cxl_port_probe(struct device *dev)
 		if (rc)
 			return rc;
 
+		if (port->cdat.table) {
+			rc = cdat_table_parse_dsmas(port->cdat.table,
+						    cxl_dsmas_parse_entry,
+						    (void *)port);
+			if (rc < 0)
+				dev_dbg(dev, "Failed to parse DSMAS: %d\n", rc);
+		}
+
 		rc = cxl_hdm_decode_init(cxlds, cxlhdm);
 		if (rc)
 			return rc;




* [PATCH 07/18] cxl: Add callback to parse the DSLBIS subtable from CDAT
  2023-02-06 20:49 [PATCH 00/18] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (5 preceding siblings ...)
  2023-02-06 20:50 ` [PATCH 06/18] cxl: Add callback to parse the DSMAS subtables from CDAT Dave Jiang
@ 2023-02-06 20:50 ` Dave Jiang
  2023-02-09 13:50   ` Jonathan Cameron
  2023-02-06 20:50 ` [PATCH 08/18] cxl: Add support for _DSM Function for retrieving QTG ID Dave Jiang
                   ` (10 subsequent siblings)
  17 siblings, 1 reply; 65+ messages in thread
From: Dave Jiang @ 2023-02-06 20:50 UTC (permalink / raw)
  To: linux-cxl, linux-pci, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, bhelgaas, robert.moore

Provide a callback to parse the Device Scoped Latency and Bandwidth
Information Structure (DSLBIS) in the CDAT structures. The DSLBIS
contains the bandwidth and latency information that's tied to a DSMAS
handle. The driver will retrieve the read and write latency and
bandwidth associated with the DSMAS which is tied to a DPA range.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/cxl/core/cdat.c |   34 ++++++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h       |    2 ++
 drivers/cxl/port.c      |    9 ++++++++-
 include/acpi/actbl1.h   |    5 +++++
 4 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
index f9a64a0f1ee4..3c8f3956487e 100644
--- a/drivers/cxl/core/cdat.c
+++ b/drivers/cxl/core/cdat.c
@@ -121,3 +121,37 @@ int cxl_dsmas_parse_entry(struct acpi_cdat_header *header, void *arg)
 	return 0;
 }
 EXPORT_SYMBOL_NS_GPL(cxl_dsmas_parse_entry, CXL);
+
+int cxl_dslbis_parse_entry(struct acpi_cdat_header *header, void *arg)
+{
+	struct cxl_port *port = (struct cxl_port *)arg;
+	struct dsmas_entry *dent;
+	struct acpi_cdat_dslbis *dslbis;
+	u64 val;
+
+	if (header->type != ACPI_CDAT_TYPE_DSLBIS)
+		return -EINVAL;
+
+	dslbis = (struct acpi_cdat_dslbis *)((unsigned long)header + sizeof(*header));
+	if ((dslbis->flags & ACPI_CEDT_DSLBIS_MEM_MASK) !=
+	     ACPI_CEDT_DSLBIS_MEM_MEMORY)
+		return 0;
+
+	if (dslbis->data_type > ACPI_HMAT_WRITE_BANDWIDTH)
+		return -ENXIO;
+
+	/* Value calculation with base_unit, see ACPI Spec 6.5 5.2.28.4 */
+	val = dslbis->entry[0] * dslbis->entry_base_unit;
+
+	mutex_lock(&port->cdat.dsmas_lock);
+	list_for_each_entry(dent, &port->cdat.dsmas_list, list) {
+		if (dslbis->handle == dent->handle) {
+			dent->qos[dslbis->data_type] = val;
+			break;
+		}
+	}
+	mutex_unlock(&port->cdat.dsmas_lock);
+
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_dslbis_parse_entry, CXL);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 1e5e69f08480..849b22236f1d 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -705,6 +705,7 @@ struct dsmas_entry {
 	struct list_head list;
 	struct range dpa_range;
 	u16 handle;
+	u64 qos[ACPI_HMAT_WRITE_BANDWIDTH + 1];
 };
 
 typedef int (*cdat_tbl_entry_handler)(struct acpi_cdat_header *header, void *arg);
@@ -716,6 +717,7 @@ int cdat_table_parse_dslbis(void *table, cdat_tbl_entry_handler handler,
 			    void *arg);
 
 int cxl_dsmas_parse_entry(struct acpi_cdat_header *header, void *arg);
+int cxl_dslbis_parse_entry(struct acpi_cdat_header *header, void *arg);
 
 /*
  * Unit test builds overrides this to __weak, find the 'strong' version
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index b1da73e99bab..8de311208b37 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -65,8 +65,15 @@ static int cxl_port_probe(struct device *dev)
 			rc = cdat_table_parse_dsmas(port->cdat.table,
 						    cxl_dsmas_parse_entry,
 						    (void *)port);
-			if (rc < 0)
+			if (rc > 0) {
+				rc = cdat_table_parse_dslbis(port->cdat.table,
+							     cxl_dslbis_parse_entry,
+							     (void *)port);
+				if (rc <= 0)
+					dev_dbg(dev, "Failed to parse DSLBIS: %d\n", rc);
+			} else {
 				dev_dbg(dev, "Failed to parse DSMAS: %d\n", rc);
+			}
 		}
 
 		rc = cxl_hdm_decode_init(cxlds, cxlhdm);
diff --git a/include/acpi/actbl1.h b/include/acpi/actbl1.h
index e8297cefde09..ff6092e45196 100644
--- a/include/acpi/actbl1.h
+++ b/include/acpi/actbl1.h
@@ -369,6 +369,11 @@ struct acpi_cdat_dslbis {
 	u16 reserved2;
 };
 
+/* Flags for subtable above */
+
+#define ACPI_CEDT_DSLBIS_MEM_MASK	GENMASK(3, 0)
+#define ACPI_CEDT_DSLBIS_MEM_MEMORY	0
+
 /* Subtable 2: Device Scoped Memory Side Cache Information Structure (DSMSCIS) */
 
 struct acpi_cdat_dsmscis {



^ permalink raw reply related	[flat|nested] 65+ messages in thread
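[Editorial aside] The DSLBIS value calculation in the patch above multiplies the entry by its base unit (ACPI 6.5 section 5.2.28.4). A minimal standalone sketch of that arithmetic, with illustrative numbers and a hypothetical helper name (not from the patch):

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sketch of the calculation in cxl_dslbis_parse_entry():
 * value = entry * entry_base_unit (ACPI 6.5 section 5.2.28.4).
 * The helper name is hypothetical. */
static uint64_t dslbis_value(uint16_t entry, uint64_t entry_base_unit)
{
	return (uint64_t)entry * entry_base_unit;
}
```

For example, a latency entry of 150 with a base unit of 100 picoseconds yields 15000 ps.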

* [PATCH 08/18] cxl: Add support for _DSM Function for retrieving QTG ID
  2023-02-06 20:49 [PATCH 00/18] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (6 preceding siblings ...)
  2023-02-06 20:50 ` [PATCH 07/18] cxl: Add callback to parse the DSLBIS subtable " Dave Jiang
@ 2023-02-06 20:50 ` Dave Jiang
  2023-02-09 14:02   ` Jonathan Cameron
  2023-02-06 20:50 ` [PATCH 09/18] cxl: Add helper function to retrieve ACPI handle of CXL root device Dave Jiang
                   ` (9 subsequent siblings)
  17 siblings, 1 reply; 65+ messages in thread
From: Dave Jiang @ 2023-02-06 20:50 UTC (permalink / raw)
  To: linux-cxl, linux-pci, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, bhelgaas, robert.moore

CXL spec v3.0 9.17.3 CXL Root Device Specific Methods (_DSM)

Add support to retrieve the QTG ID via an ACPI _DSM call. The _DSM call
requires an input of an ACPI package with 4 DWORDs (read latency, write
latency, read bandwidth, write bandwidth). The call returns a package
containing one WORD that provides the maximum QTG ID supported by the
platform, and a second package that may contain zero or more WORDs with
the recommended QTG IDs, in recommended order.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/cxl/core/Makefile |    1 
 drivers/cxl/core/acpi.c   |   99 +++++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h         |   15 +++++++
 3 files changed, 115 insertions(+)
 create mode 100644 drivers/cxl/core/acpi.c

diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
index 438ce27faf77..11ccc2016ab7 100644
--- a/drivers/cxl/core/Makefile
+++ b/drivers/cxl/core/Makefile
@@ -11,4 +11,5 @@ cxl_core-y += mbox.o
 cxl_core-y += pci.o
 cxl_core-y += hdm.o
 cxl_core-y += cdat.o
+cxl_core-y += acpi.o
 cxl_core-$(CONFIG_CXL_REGION) += region.o
diff --git a/drivers/cxl/core/acpi.c b/drivers/cxl/core/acpi.c
new file mode 100644
index 000000000000..86dc6c9c1f24
--- /dev/null
+++ b/drivers/cxl/core/acpi.c
@@ -0,0 +1,99 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2022 Intel Corporation. All rights reserved. */
+#include <linux/module.h>
+#include <linux/device.h>
+#include <linux/kernel.h>
+#include <linux/acpi.h>
+#include <linux/pci.h>
+#include <asm/div64.h>
+#include "cxlpci.h"
+#include "cxl.h"
+
+const guid_t acpi_cxl_qtg_id_guid =
+	GUID_INIT(0xF365F9A6, 0xA7DE, 0x4071,
+		  0xA6, 0x6A, 0xB4, 0x0C, 0x0B, 0x4F, 0x8E, 0x52);
+
+/**
+ * cxl_acpi_evaluate_qtg_dsm - Retrieve QTG ids via ACPI _DSM
+ * @handle: ACPI handle
+ * @input: bandwidth and latency data
+ *
+ * Issue the QTG _DSM with the accompanying bandwidth and latency data in
+ * order to get the QTG IDs that fall within the performance data.
+ */
+struct qtg_dsm_output *cxl_acpi_evaluate_qtg_dsm(acpi_handle handle,
+						 struct qtg_dsm_input *input)
+{
+	struct qtg_dsm_output *output;
+	union acpi_object *out_obj, *out_buf, *pkg, in_buf, in_obj;
+	int len;
+	int rc;
+
+	in_obj.type = ACPI_TYPE_PACKAGE;
+	in_obj.package.count = 1;
+	in_obj.package.elements = &in_buf;
+	in_buf.type = ACPI_TYPE_BUFFER;
+	in_buf.buffer.pointer = (u8 *)input;
+	in_buf.buffer.length = sizeof(u32) * 4;
+
+	out_obj = acpi_evaluate_dsm(handle, &acpi_cxl_qtg_id_guid, 1, 1, &in_obj);
+	if (!out_obj)
+		return ERR_PTR(-ENXIO);
+
+	if (out_obj->type != ACPI_TYPE_PACKAGE) {
+		rc = -ENXIO;
+		goto err;
+	}
+
+	/*
+	 * CXL spec v3.0 9.17.3.1
+	 * There should be 2 elements in the package. 1 WORD for max QTG ID supported
+	 * by the platform, and the other a package of recommended QTGs
+	 */
+	if (out_obj->package.count != 2) {
+		rc = -ENXIO;
+		goto err;
+	}
+
+	pkg = &out_obj->package.elements[1];
+	if (pkg->type != ACPI_TYPE_PACKAGE) {
+		rc = -ENXIO;
+		goto err;
+	}
+
+	out_buf = &pkg->package.elements[0];
+	if (out_buf->type != ACPI_TYPE_BUFFER) {
+		rc = -ENXIO;
+		goto err;
+	}
+
+	len = out_buf->buffer.length;
+	output = kmalloc(len + sizeof(*output), GFP_KERNEL);
+	if (!output) {
+		rc = -ENOMEM;
+		goto err;
+	}
+
+	/* It's legal to have 0 QTG entries */
+	if (len == 0) {
+		output->nr = 0;
+		goto out;
+	}
+
+	/* Malformed package, not multiple of WORD size */
+	if (len % sizeof(u16)) {
+		rc = -ENXIO;
+		goto out;
+	}
+
+	output->nr = len / sizeof(u16);
+	memcpy(output->qtg_ids, out_buf->buffer.pointer, len);
+out:
+	ACPI_FREE(out_obj);
+	return output;
+
+err:
+	ACPI_FREE(out_obj);
+	return ERR_PTR(rc);
+}
+EXPORT_SYMBOL_NS_GPL(cxl_acpi_evaluate_qtg_dsm, CXL);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 849b22236f1d..e70df07f9b4b 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -719,6 +719,21 @@ int cdat_table_parse_dslbis(void *table, cdat_tbl_entry_handler handler,
 int cxl_dsmas_parse_entry(struct acpi_cdat_header *header, void *arg);
 int cxl_dslbis_parse_entry(struct acpi_cdat_header *header, void *arg);
 
+struct qtg_dsm_input {
+	u32 rd_lat;
+	u32 wr_lat;
+	u32 rd_bw;
+	u32 wr_bw;
+};
+
+struct qtg_dsm_output {
+	int nr;
+	u16 qtg_ids[];
+};
+
+struct qtg_dsm_output *cxl_acpi_evaluate_qtg_dsm(acpi_handle handle,
+						 struct qtg_dsm_input *input);
+
 /*
  * Unit test builds overrides this to __weak, find the 'strong' version
  * of these symbols in tools/testing/cxl/.



^ permalink raw reply related	[flat|nested] 65+ messages in thread
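[Editorial aside] The _DSM payload sizing described above can be sketched as follows: the input package carries four DWORDs (16 bytes), and the returned QTG ID buffer carries WORD-sized entries, so a buffer of length len holds len / 2 IDs. The struct mirrors the patch; the length helper is hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Input package layout: 4 DWORDs, mirroring struct qtg_dsm_input above. */
struct qtg_dsm_input {
	uint32_t rd_lat;
	uint32_t wr_lat;
	uint32_t rd_bw;
	uint32_t wr_bw;
};

/* Number of QTG IDs in a returned buffer of 'len' bytes; the patch
 * treats a length that is not a multiple of WORD size as malformed. */
static int qtg_count_from_len(int len)
{
	if (len % (int)sizeof(uint16_t))
		return -1;
	return len / (int)sizeof(uint16_t);
}
```

A 6-byte buffer therefore carries three recommended QTG IDs; a zero-length buffer is legal and carries none.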

* [PATCH 09/18] cxl: Add helper function to retrieve ACPI handle of CXL root device
  2023-02-06 20:49 [PATCH 00/18] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (7 preceding siblings ...)
  2023-02-06 20:50 ` [PATCH 08/18] cxl: Add support for _DSM Function for retrieving QTG ID Dave Jiang
@ 2023-02-06 20:50 ` Dave Jiang
  2023-02-09 14:10   ` Jonathan Cameron
  2023-02-06 20:50 ` [PATCH 10/18] PCI: Export pcie_get_speed() using the code from sysfs PCI link speed show function Dave Jiang
                   ` (8 subsequent siblings)
  17 siblings, 1 reply; 65+ messages in thread
From: Dave Jiang @ 2023-02-06 20:50 UTC (permalink / raw)
  To: linux-cxl, linux-pci, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, bhelgaas, robert.moore

Provide a helper to find the ACPI0017 device in order to issue the _DSM.
The helper will take the 'struct device' from a cxl_port and iterate until
the root device is reached. The ACPI handle will be returned from the root
device.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/cxl/core/acpi.c |   30 ++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h       |    1 +
 2 files changed, 31 insertions(+)

diff --git a/drivers/cxl/core/acpi.c b/drivers/cxl/core/acpi.c
index 86dc6c9c1f24..05fcd4751619 100644
--- a/drivers/cxl/core/acpi.c
+++ b/drivers/cxl/core/acpi.c
@@ -5,6 +5,7 @@
 #include <linux/kernel.h>
 #include <linux/acpi.h>
 #include <linux/pci.h>
+#include <linux/platform_device.h>
 #include <asm/div64.h>
 #include "cxlpci.h"
 #include "cxl.h"
@@ -13,6 +14,35 @@ const guid_t acpi_cxl_qtg_id_guid =
 	GUID_INIT(0xF365F9A6, 0xA7DE, 0x4071,
 		  0xA6, 0x6A, 0xB4, 0x0C, 0x0B, 0x4F, 0x8E, 0x52);
 
+/**
+ * cxl_acpi_get_rootdev_handle - get the ACPI handle of the CXL root device
+ * @dev: 'struct device' to start searching from. Should be from cxl_port->dev.
+ * Looks for the ACPI0017 device and returns its ACPI handle.
+ */
+acpi_handle cxl_acpi_get_rootdev_handle(struct device *dev)
+{
+	struct device *itr = dev, *root_dev;
+	acpi_handle handle;
+
+	if (!dev)
+		return ERR_PTR(-EINVAL);
+
+	while (itr->parent) {
+		root_dev = itr;
+		itr = itr->parent;
+	}
+
+	if (!dev_is_platform(root_dev))
+		return ERR_PTR(-ENODEV);
+
+	handle = ACPI_HANDLE(root_dev);
+	if (!handle)
+		return ERR_PTR(-ENODEV);
+
+	return handle;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_acpi_get_rootdev_handle, CXL);
+
 /**
  * cxl_acpi_evaluate_qtg_dsm - Retrieve QTG ids via ACPI _DSM
  * @handle: ACPI handle
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index e70df07f9b4b..ac6ea550ab0a 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -733,6 +733,7 @@ struct qtg_dsm_output {
 
 struct qtg_dsm_output *cxl_acpi_evaluate_qtg_dsm(acpi_handle handle,
 						 struct qtg_dsm_input *input);
+acpi_handle cxl_acpi_get_rootdev_handle(struct device *dev);
 
 /*
  * Unit test builds overrides this to __weak, find the 'strong' version



^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 10/18] PCI: Export pcie_get_speed() using the code from sysfs PCI link speed show function
  2023-02-06 20:49 [PATCH 00/18] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (8 preceding siblings ...)
  2023-02-06 20:50 ` [PATCH 09/18] cxl: Add helper function to retrieve ACPI handle of CXL root device Dave Jiang
@ 2023-02-06 20:50 ` Dave Jiang
  2023-02-06 22:27   ` Lukas Wunner
  2023-02-06 20:51 ` [PATCH 11/18] PCI: Export pcie_get_width() using the code from sysfs PCI link width " Dave Jiang
                   ` (7 subsequent siblings)
  17 siblings, 1 reply; 65+ messages in thread
From: Dave Jiang @ 2023-02-06 20:50 UTC (permalink / raw)
  To: linux-cxl, linux-pci, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, bhelgaas, robert.moore

Move the logic in current_link_speed_show() to a common function and export
that function as pcie_get_speed() to allow other drivers to retrieve the
current negotiated link speed.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/pci/pci-sysfs.c |   12 ++----------
 drivers/pci/pci.c       |   20 ++++++++++++++++++++
 include/linux/pci.h     |    1 +
 3 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index dd0d9d9bc509..0217bb5ca8fa 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -205,17 +205,9 @@ static ssize_t current_link_speed_show(struct device *dev,
 				       struct device_attribute *attr, char *buf)
 {
 	struct pci_dev *pci_dev = to_pci_dev(dev);
-	u16 linkstat;
-	int err;
-	enum pci_bus_speed speed;
-
-	err = pcie_capability_read_word(pci_dev, PCI_EXP_LNKSTA, &linkstat);
-	if (err)
-		return -EINVAL;
 
-	speed = pcie_link_speed[linkstat & PCI_EXP_LNKSTA_CLS];
-
-	return sysfs_emit(buf, "%s\n", pci_speed_string(speed));
+	return sysfs_emit(buf, "%s\n",
+			  pci_speed_string(pcie_get_speed(pci_dev)));
 }
 static DEVICE_ATTR_RO(current_link_speed);
 
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index fba95486caaf..d0131b5623b1 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -6215,6 +6215,26 @@ enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev)
 }
 EXPORT_SYMBOL(pcie_get_width_cap);
 
+/**
+ * pcie_get_speed - query for the PCI device's current link speed
+ * @dev: PCI device to query
+ *
+ * Query the PCI device's current negotiated link speed.
+ */
+enum pci_bus_speed pcie_get_speed(struct pci_dev *dev)
+{
+	u16 linkstat, cls;
+	int err;
+
+	err = pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &linkstat);
+	if (err)
+		return PCI_SPEED_UNKNOWN;
+
+	cls = FIELD_GET(PCI_EXP_LNKSTA_CLS, linkstat);
+	return pcie_link_speed[cls];
+}
+EXPORT_SYMBOL(pcie_get_speed);
+
 /**
  * pcie_bandwidth_capable - calculate a PCI device's link bandwidth capability
  * @dev: PCI device
diff --git a/include/linux/pci.h b/include/linux/pci.h
index adffd65e84b4..6a065986ff8f 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -303,6 +303,7 @@ enum pci_bus_speed {
 	PCI_SPEED_UNKNOWN		= 0xff,
 };
 
+enum pci_bus_speed pcie_get_speed(struct pci_dev *dev);
 enum pci_bus_speed pcie_get_speed_cap(struct pci_dev *dev);
 enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev);
 



^ permalink raw reply related	[flat|nested] 65+ messages in thread
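[Editorial aside] The FIELD_GET()-based extraction that pcie_get_speed() (and pcie_get_width() in the next patch) performs on the Link Status register can be sketched in userspace as follows. The mask values are the standard PCIe Link Status field masks; field_get() is a simplified stand-in for the kernel macro:

```c
#include <assert.h>
#include <stdint.h>

#define PCI_EXP_LNKSTA_CLS  0x000fu  /* Current Link Speed */
#define PCI_EXP_LNKSTA_NLW  0x03f0u  /* Negotiated Link Width */

/* Simplified FIELD_GET(): mask the register, then shift the result
 * down by the mask's lowest set bit. */
static unsigned int field_get(unsigned int mask, unsigned int reg)
{
	return (reg & mask) / (mask & -mask);
}
```

A link status value of 0x0084 decodes to speed code 4 (16.0 GT/s) and a negotiated width of 8 lanes.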

* [PATCH 11/18] PCI: Export pcie_get_width() using the code from sysfs PCI link width show function
  2023-02-06 20:49 [PATCH 00/18] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (9 preceding siblings ...)
  2023-02-06 20:50 ` [PATCH 10/18] PCI: Export pcie_get_speed() using the code from sysfs PCI link speed show function Dave Jiang
@ 2023-02-06 20:51 ` Dave Jiang
  2023-02-06 22:43   ` Bjorn Helgaas
  2023-02-06 20:51 ` [PATCH 12/18] cxl: Add helpers to calculate pci latency for the CXL device Dave Jiang
                   ` (6 subsequent siblings)
  17 siblings, 1 reply; 65+ messages in thread
From: Dave Jiang @ 2023-02-06 20:51 UTC (permalink / raw)
  To: linux-cxl, linux-pci, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, bhelgaas, robert.moore

Move the logic in current_link_width_show() to a common function and export
that function as pcie_get_width() to allow other drivers to retrieve the
current negotiated link width.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/pci/pci-sysfs.c |    9 +--------
 drivers/pci/pci.c       |   20 ++++++++++++++++++++
 include/linux/pci.h     |    1 +
 3 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 0217bb5ca8fa..139096c39380 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -215,15 +215,8 @@ static ssize_t current_link_width_show(struct device *dev,
 				       struct device_attribute *attr, char *buf)
 {
 	struct pci_dev *pci_dev = to_pci_dev(dev);
-	u16 linkstat;
-	int err;
 
-	err = pcie_capability_read_word(pci_dev, PCI_EXP_LNKSTA, &linkstat);
-	if (err)
-		return -EINVAL;
-
-	return sysfs_emit(buf, "%u\n",
-		(linkstat & PCI_EXP_LNKSTA_NLW) >> PCI_EXP_LNKSTA_NLW_SHIFT);
+	return sysfs_emit(buf, "%u\n", pcie_get_width(pci_dev));
 }
 static DEVICE_ATTR_RO(current_link_width);
 
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index d0131b5623b1..0858fa2f1c2d 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -6235,6 +6235,26 @@ enum pci_bus_speed pcie_get_speed(struct pci_dev *dev)
 }
 EXPORT_SYMBOL(pcie_get_speed);
 
+/**
+ * pcie_get_width - query for the PCI device's current link width
+ * @dev: PCI device to query
+ *
+ * Query the PCI device's current negotiated link width.
+ */
+
+enum pcie_link_width pcie_get_width(struct pci_dev *dev)
+{
+	u16 linkstat;
+	int err;
+
+	err = pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &linkstat);
+	if (err)
+		return PCIE_LNK_WIDTH_UNKNOWN;
+
+	return FIELD_GET(PCI_EXP_LNKSTA_NLW, linkstat);
+}
+EXPORT_SYMBOL(pcie_get_width);
+
 /**
  * pcie_bandwidth_capable - calculate a PCI device's link bandwidth capability
  * @dev: PCI device
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 6a065986ff8f..21eca09a98e2 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -305,6 +305,7 @@ enum pci_bus_speed {
 
 enum pci_bus_speed pcie_get_speed(struct pci_dev *dev);
 enum pci_bus_speed pcie_get_speed_cap(struct pci_dev *dev);
+enum pcie_link_width pcie_get_width(struct pci_dev *dev);
 enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev);
 
 struct pci_vpd {



^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 12/18] cxl: Add helpers to calculate pci latency for the CXL device
  2023-02-06 20:49 [PATCH 00/18] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (10 preceding siblings ...)
  2023-02-06 20:51 ` [PATCH 11/18] PCI: Export pcie_get_width() using the code from sysfs PCI link width " Dave Jiang
@ 2023-02-06 20:51 ` Dave Jiang
  2023-02-06 22:39   ` Bjorn Helgaas
  2023-02-09 15:16   ` Jonathan Cameron
  2023-02-06 20:51 ` [PATCH 13/18] cxl: Add latency and bandwidth calculations for the CXL path Dave Jiang
                   ` (5 subsequent siblings)
  17 siblings, 2 replies; 65+ messages in thread
From: Dave Jiang @ 2023-02-06 20:51 UTC (permalink / raw)
  To: linux-cxl, linux-pci, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, bhelgaas, robert.moore

The latency is calculated by dividing the FLIT size by the link bandwidth.
Add support to retrieve the FLIT size for the CXL device and calculate the
latency of the downstream link.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/cxl/core/pci.c |   67 ++++++++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxlpci.h   |   14 ++++++++++
 2 files changed, 81 insertions(+)

diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index a24dac36bedd..54ac6f8825ff 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -633,3 +633,70 @@ void read_cdat_data(struct cxl_port *port)
 	}
 }
 EXPORT_SYMBOL_NS_GPL(read_cdat_data, CXL);
+
+static int pcie_speed_to_mbps(enum pci_bus_speed speed)
+{
+	switch (speed) {
+	case PCIE_SPEED_2_5GT:
+		return 2500;
+	case PCIE_SPEED_5_0GT:
+		return 5000;
+	case PCIE_SPEED_8_0GT:
+		return 8000;
+	case PCIE_SPEED_16_0GT:
+		return 16000;
+	case PCIE_SPEED_32_0GT:
+		return 32000;
+	case PCIE_SPEED_64_0GT:
+		return 64000;
+	default:
+		break;
+	}
+
+	return -EINVAL;
+}
+
+static int cxl_pci_mbits_to_mbytes(struct pci_dev *pdev)
+{
+	int mbits;
+
+	mbits = pcie_speed_to_mbps(pcie_get_speed(pdev));
+	if (mbits < 0)
+		return mbits;
+
+	return mbits >> 3;
+}
+
+static int cxl_get_flit_size(struct pci_dev *pdev)
+{
+	if (cxl_pci_flit_256(pdev))
+		return 256;
+
+	return 66;
+}
+
+/**
+ * cxl_pci_get_latency - calculate the link latency for the PCIe link
+ * @pdev: PCI device
+ *
+ * CXL Memory Device SW Guide v1.0 2.11.4 Link latency calculation
+ * Link latency = LinkPropagationLatency + FlitLatency + RetimerLatency
+ * LinkPropagationLatency is negligible, so 0 will be used
+ * RetimerLatency is assumed to be negligible and 0 will be used
+ * FlitLatency = FlitSize / LinkBandwidth
+ * FlitSize is defined by spec. CXL v3.0 4.2.1.
+ * The 68B flit is used up to 32GT/s; above 32GT/s, the 256B flit size is used.
+ * The FlitLatency is converted to picoseconds.
+ */
+long cxl_pci_get_latency(struct pci_dev *pdev)
+{
+	long bw, flit_size;
+
+	bw = cxl_pci_mbits_to_mbytes(pdev);
+	if (bw < 0)
+		return bw;
+
+	flit_size = cxl_get_flit_size(pdev);
+	return flit_size * 1000000L / bw;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_pci_get_latency, CXL);
diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
index 920909791bb9..d64a3e0458ab 100644
--- a/drivers/cxl/cxlpci.h
+++ b/drivers/cxl/cxlpci.h
@@ -62,8 +62,22 @@ enum cxl_regloc_type {
 	CXL_REGLOC_RBI_TYPES
 };
 
+/*
+ * CXL v3.0 6.2.3 Table 6-4
+ * The table indicates that if PCIe Flit Mode is set, then CXL is in 256B flits
+ * mode, otherwise it's 68B flits mode.
+ */
+static inline bool cxl_pci_flit_256(struct pci_dev *pdev)
+{
+	u32 lnksta2;
+
+	pcie_capability_read_dword(pdev, PCI_EXP_LNKSTA2, &lnksta2);
+	return lnksta2 & BIT(10);
+}
+
 int devm_cxl_port_enumerate_dports(struct cxl_port *port);
 struct cxl_dev_state;
 int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm);
 void read_cdat_data(struct cxl_port *port);
+long cxl_pci_get_latency(struct pci_dev *pdev);
 #endif /* __CXL_PCI_H__ */



^ permalink raw reply related	[flat|nested] 65+ messages in thread
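[Editorial aside] The FlitLatency arithmetic in the patch above works out as follows. This is a standalone sketch (the function name is hypothetical) mirroring cxl_pci_get_latency()'s math: the raw link speed in Mb/s is divided by 8 to get MB/s, then bytes * 10^6 / (MB/s) yields picoseconds:

```c
#include <assert.h>

/* FlitLatency = FlitSize / LinkBandwidth, expressed in picoseconds.
 * link_mbps is the raw link speed in Mb/s; >> 3 converts to MB/s. */
static long flit_latency_ps(long flit_bytes, long link_mbps)
{
	long mbytes_per_s = link_mbps >> 3;

	return flit_bytes * 1000000L / mbytes_per_s;
}
```

For a 68B flit at 32 GT/s (32000 Mb/s, i.e. 4000 MB/s) this gives 17000 ps, or 17 ns per flit.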

* [PATCH 13/18] cxl: Add latency and bandwidth calculations for the CXL path
  2023-02-06 20:49 [PATCH 00/18] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (11 preceding siblings ...)
  2023-02-06 20:51 ` [PATCH 12/18] cxl: Add helpers to calculate pci latency for the CXL device Dave Jiang
@ 2023-02-06 20:51 ` Dave Jiang
  2023-02-09 15:24   ` Jonathan Cameron
  2023-02-06 20:51 ` [PATCH 14/18] cxl: Wait Memory_Info_Valid before access memory related info Dave Jiang
                   ` (4 subsequent siblings)
  17 siblings, 1 reply; 65+ messages in thread
From: Dave Jiang @ 2023-02-06 20:51 UTC (permalink / raw)
  To: linux-cxl, linux-pci, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, bhelgaas, robert.moore

CXL Memory Device SW Guide rev1.0 2.11.2 provides instructions on how to
calculate latency and bandwidth for a CXL memory device. Calculate the
minimum bandwidth and total latency for the path from the CXL device to the
root port. The calculated values are stored in the cached DSMAS entries
attached to the cxl_port of the CXL device.

For example for a device that is directly attached to a host bus:
Total Latency = Device Latency (from CDAT) + Dev to Host Bus (HB) Link
		Latency
Min Bandwidth = Link Bandwidth between Host Bus and CXL device

For a device that has a switch in between host bus and CXL device:
Total Latency = Device (CDAT) Latency + Dev to Switch Link Latency +
		Switch (CDAT) Latency + Switch to HB Link Latency
Min Bandwidth = min(dev to switch bandwidth, switch to HB bandwidth)

The internal latency for a switch can be retrieved from the CDAT of the
switch PCI device. However, since there's no easy way to retrieve that
right now on Linux, a guesstimated constant is used per switch to simplify
the driver code.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/cxl/core/port.c |   60 +++++++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h       |    9 +++++++
 drivers/cxl/port.c      |   42 +++++++++++++++++++++++++++++++++
 3 files changed, 111 insertions(+)

diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 2b27319cfd42..aa260361ba7d 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -1899,6 +1899,66 @@ bool schedule_cxl_memdev_detach(struct cxl_memdev *cxlmd)
 }
 EXPORT_SYMBOL_NS_GPL(schedule_cxl_memdev_detach, CXL);
 
+int cxl_port_get_downstream_qos(struct cxl_port *port, long *bw, long *lat)
+{
+	long total_lat = 0, latency;
+	long min_bw = INT_MAX;
+	struct pci_dev *pdev;
+	struct cxl_port *p;
+	struct device *dev;
+	int devices = 0;
+
+	/* Grab the device that is the PCI device for CXL memdev */
+	dev = port->uport->parent;
+	/* Skip if it's not PCI, most likely a cxl_test device */
+	if (!dev_is_pci(dev))
+		return 0;
+
+	pdev = to_pci_dev(dev);
+	min_bw = pcie_bandwidth_available(pdev, NULL, NULL, NULL);
+	if (min_bw == 0)
+		return -ENXIO;
+
+	/* convert to MB/s from Mb/s */
+	min_bw >>= 3;
+
+	p = port;
+	do {
+		struct cxl_dport *dport;
+
+		latency = cxl_pci_get_latency(pdev);
+		if (latency < 0)
+			return latency;
+
+		total_lat += latency;
+		devices++;
+
+		dport = p->parent_dport;
+		if (!dport)
+			break;
+
+		p = dport->port;
+		dev = p->uport;
+		if (!dev_is_pci(dev))
+			break;
+		pdev = to_pci_dev(dev);
+	} while (1);
+
+	/*
+	 * Add an approximate latency to the switch. Currently there
+	 * is no easy mechanism to read the CDAT for switches. 'devices'
+	 * should account for all the PCI devices encountered minus the
+	 * root device. So the number of switches would be 'devices - 1'
+	 * to account for the CXL device.
+	 */
+	total_lat += CXL_SWITCH_APPROX_LAT * (devices - 1);
+
+	*bw = min_bw;
+	*lat = total_lat;
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_port_get_downstream_qos, CXL);
+
 /* for user tooling to ensure port disable work has completed */
 static ssize_t flush_store(struct bus_type *bus, const char *buf, size_t count)
 {
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index ac6ea550ab0a..86668fab6e91 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -480,6 +480,13 @@ struct cxl_pmem_region {
 	struct cxl_pmem_region_mapping mapping[];
 };
 
+/*
+ * Set in picoseconds per ACPI spec 6.5 Table 5.148 Entry Base Unit.
+ * This is an approximate constant to use for switch latency calculation
+ * until there's a way to access switch CDAT.
+ */
+#define CXL_SWITCH_APPROX_LAT	5000
+
 /**
  * struct cxl_port - logical collection of upstream port devices and
  *		     downstream port devices to construct a CXL memory
@@ -706,6 +713,7 @@ struct dsmas_entry {
 	struct range dpa_range;
 	u16 handle;
 	u64 qos[ACPI_HMAT_WRITE_BANDWIDTH + 1];
+	int qtg_id;
 };
 
 typedef int (*cdat_tbl_entry_handler)(struct acpi_cdat_header *header, void *arg);
@@ -734,6 +742,7 @@ struct qtg_dsm_output {
 struct qtg_dsm_output *cxl_acpi_evaluate_qtg_dsm(acpi_handle handle,
 						 struct qtg_dsm_input *input);
 acpi_handle cxl_acpi_get_rootdev_handle(struct device *dev);
+int cxl_port_get_downstream_qos(struct cxl_port *port, long *bw, long *lat);
 
 /*
  * Unit test builds overrides this to __weak, find the 'strong' version
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index 8de311208b37..d72e38f9ae44 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -30,6 +30,44 @@ static void schedule_detach(void *cxlmd)
 	schedule_cxl_memdev_detach(cxlmd);
 }
 
+static int cxl_port_qos_calculate(struct cxl_port *port)
+{
+	struct qtg_dsm_output *output;
+	struct qtg_dsm_input input;
+	struct dsmas_entry *dent;
+	long min_bw, total_lat;
+	acpi_handle handle;
+	int rc;
+
+	rc = cxl_port_get_downstream_qos(port, &min_bw, &total_lat);
+	if (rc)
+		return rc;
+
+	handle = cxl_acpi_get_rootdev_handle(&port->dev);
+	if (IS_ERR(handle))
+		return PTR_ERR(handle);
+
+	mutex_lock(&port->cdat.dsmas_lock);
+	list_for_each_entry(dent, &port->cdat.dsmas_list, list) {
+		input.rd_lat = dent->qos[ACPI_HMAT_READ_LATENCY] + total_lat;
+		input.wr_lat = dent->qos[ACPI_HMAT_WRITE_LATENCY] + total_lat;
+		input.rd_bw = min_t(int, min_bw,
+				    dent->qos[ACPI_HMAT_READ_BANDWIDTH]);
+		input.wr_bw = min_t(int, min_bw,
+				    dent->qos[ACPI_HMAT_WRITE_BANDWIDTH]);
+
+		output = cxl_acpi_evaluate_qtg_dsm(handle, &input);
+		if (IS_ERR(output))
+			continue;
+
+		dent->qtg_id = output->qtg_ids[0];
+		kfree(output);
+	}
+	mutex_unlock(&port->cdat.dsmas_lock);
+
+	return 0;
+}
+
 static int cxl_port_probe(struct device *dev)
 {
 	struct cxl_port *port = to_cxl_port(dev);
@@ -74,6 +112,10 @@ static int cxl_port_probe(struct device *dev)
 			} else {
 				dev_dbg(dev, "Failed to parse DSMAS: %d\n", rc);
 			}
+
+			rc = cxl_port_qos_calculate(port);
+			if (rc)
+				dev_dbg(dev, "Failed to do QoS calculations\n");
 		}
 
 		rc = cxl_hdm_decode_init(cxlds, cxlhdm);



^ permalink raw reply related	[flat|nested] 65+ messages in thread
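[Editorial aside] The aggregation that cxl_port_get_downstream_qos() performs — total latency as the sum of per-hop link latencies plus an approximate per-switch constant, bandwidth as the path minimum — can be sketched as follows. The struct and values are illustrative, not kernel API:

```c
#include <assert.h>

#define SWITCH_APPROX_LAT_PS 5000  /* mirrors CXL_SWITCH_APPROX_LAT */

struct hop {
	long link_lat_ps;
	long link_bw_mbps;
};

/* Total latency is the sum of link latencies plus one approximate switch
 * latency per intermediate device; bandwidth is the path minimum. */
static void path_qos(const struct hop *hops, int n,
		     long *total_lat_ps, long *min_bw_mbps)
{
	long lat = 0, bw = hops[0].link_bw_mbps;
	int i;

	for (i = 0; i < n; i++) {
		lat += hops[i].link_lat_ps;
		if (hops[i].link_bw_mbps < bw)
			bw = hops[i].link_bw_mbps;
	}
	/* n links traversed -> n - 1 switches between device and host bridge */
	lat += SWITCH_APPROX_LAT_PS * (n - 1);

	*total_lat_ps = lat;
	*min_bw_mbps = bw;
}
```

For a device behind one switch (two links of 17000 ps at 32000 Mb/s and 33000 ps at 16000 Mb/s), this yields 55000 ps total latency and a 16000 Mb/s minimum bandwidth.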

* [PATCH 14/18] cxl: Wait Memory_Info_Valid before access memory related info
  2023-02-06 20:49 [PATCH 00/18] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (12 preceding siblings ...)
  2023-02-06 20:51 ` [PATCH 13/18] cxl: Add latency and bandwidth calculations for the CXL path Dave Jiang
@ 2023-02-06 20:51 ` Dave Jiang
  2023-02-09 15:29   ` Jonathan Cameron
  2023-02-06 20:51 ` [PATCH 15/18] cxl: Move identify and partition query from pci probe to port probe Dave Jiang
                   ` (3 subsequent siblings)
  17 siblings, 1 reply; 65+ messages in thread
From: Dave Jiang @ 2023-02-06 20:51 UTC (permalink / raw)
  To: linux-cxl, linux-pci, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, bhelgaas, robert.moore

CXL rev3.0 8.1.3.8.2 Memory_Info_Valid field

The Memory_Info_Valid bit indicates that the CXL Range Size High and Size
Low registers are valid. The bit must be set within 1 second of reset
deassertion to the device. Check the valid bit before checking the
Memory_Active bit in cxl_await_media_ready() to ensure that the memory
info is valid for consumption.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/cxl/core/pci.c |   25 +++++++++++++++++++++++--
 drivers/cxl/port.c     |   20 ++++++++++----------
 2 files changed, 33 insertions(+), 12 deletions(-)

diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index 54ac6f8825ff..79a1348e7b98 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -111,11 +111,32 @@ int cxl_await_media_ready(struct cxl_dev_state *cxlds)
 	int d = cxlds->cxl_dvsec;
 	bool active = false;
 	u64 md_status;
+	u32 temp;
 	int rc, i;
 
-	for (i = media_ready_timeout; i; i--) {
-		u32 temp;
+	/* Check MEM INFO VALID bit first, give up after 1s */
+	i = 1;
+	do {
+		rc = pci_read_config_dword(pdev,
+					   d + CXL_DVSEC_RANGE_SIZE_LOW(0),
+					   &temp);
+		if (rc)
+			return rc;
 
+		active = FIELD_GET(CXL_DVSEC_MEM_INFO_VALID, temp);
+		if (active)
+			break;
+		msleep(1000);
+	} while (i--);
+
+	if (!active) {
+		dev_err(&pdev->dev,
+			"timeout awaiting memory valid after 1 second.\n");
+		return -ETIMEDOUT;
+	}
+
+	/* Check MEM ACTIVE bit, up to 60s timeout by default */
+	for (i = media_ready_timeout; i; i--) {
 		rc = pci_read_config_dword(
 			pdev, d + CXL_DVSEC_RANGE_SIZE_LOW(0), &temp);
 		if (rc)
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index d72e38f9ae44..03380c18fc52 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -99,6 +99,16 @@ static int cxl_port_probe(struct device *dev)
 		if (rc)
 			return rc;
 
+		rc = cxl_hdm_decode_init(cxlds, cxlhdm);
+		if (rc)
+			return rc;
+
+		rc = cxl_await_media_ready(cxlds);
+		if (rc) {
+			dev_err(dev, "Media not active (%d)\n", rc);
+			return rc;
+		}
+
 		if (port->cdat.table) {
 			rc = cdat_table_parse_dsmas(port->cdat.table,
 						    cxl_dsmas_parse_entry,
@@ -117,16 +127,6 @@ static int cxl_port_probe(struct device *dev)
 			if (rc)
 				dev_dbg(dev, "Failed to do QoS calculations\n");
 		}
-
-		rc = cxl_hdm_decode_init(cxlds, cxlhdm);
-		if (rc)
-			return rc;
-
-		rc = cxl_await_media_ready(cxlds);
-		if (rc) {
-			dev_err(dev, "Media not active (%d)\n", rc);
-			return rc;
-		}
 	}
 
 	rc = devm_cxl_enumerate_decoders(cxlhdm);



^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 15/18] cxl: Move identify and partition query from pci probe to port probe
  2023-02-06 20:49 [PATCH 00/18] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (13 preceding siblings ...)
  2023-02-06 20:51 ` [PATCH 14/18] cxl: Wait Memory_Info_Valid before access memory related info Dave Jiang
@ 2023-02-06 20:51 ` Dave Jiang
  2023-02-09 15:29   ` Jonathan Cameron
  2023-02-06 20:51 ` [PATCH 16/18] cxl: Move reading of CDAT data from device to after media is ready Dave Jiang
                   ` (2 subsequent siblings)
  17 siblings, 1 reply; 65+ messages in thread
From: Dave Jiang @ 2023-02-06 20:51 UTC (permalink / raw)
  To: linux-cxl, linux-pci, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, bhelgaas, robert.moore

Move the enumeration of device capacity to cxl_port_probe() from
cxl_pci_probe(). The size and capacity information should be read
after cxl_await_media_ready() so the data is valid.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/cxl/pci.c  |    8 --------
 drivers/cxl/port.c |    8 ++++++++
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 258004f34281..e35ed250214e 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -484,14 +484,6 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (rc)
 		return rc;
 
-	rc = cxl_dev_state_identify(cxlds);
-	if (rc)
-		return rc;
-
-	rc = cxl_mem_create_range_info(cxlds);
-	if (rc)
-		return rc;
-
 	cxlmd = devm_cxl_add_memdev(cxlds);
 	if (IS_ERR(cxlmd))
 		return PTR_ERR(cxlmd);
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index 03380c18fc52..b7a4a1be2945 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -127,6 +127,14 @@ static int cxl_port_probe(struct device *dev)
 			if (rc)
 				dev_dbg(dev, "Failed to do QoS calculations\n");
 		}
+
+		rc = cxl_dev_state_identify(cxlds);
+		if (rc)
+			return rc;
+
+		rc = cxl_mem_create_range_info(cxlds);
+		if (rc)
+			return rc;
 	}
 
 	rc = devm_cxl_enumerate_decoders(cxlhdm);



^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 16/18] cxl: Move reading of CDAT data from device to after media is ready
  2023-02-06 20:49 [PATCH 00/18] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (14 preceding siblings ...)
  2023-02-06 20:51 ` [PATCH 15/18] cxl: Move identify and partition query from pci probe to port probe Dave Jiang
@ 2023-02-06 20:51 ` Dave Jiang
  2023-02-06 22:17   ` Lukas Wunner
  2023-02-09 15:31   ` Jonathan Cameron
  2023-02-06 20:51 ` [PATCH 17/18] cxl: Attach QTG IDs to the DPA ranges for the device Dave Jiang
  2023-02-06 20:52 ` [PATCH 18/18] cxl: Export sysfs attributes for device QTG IDs Dave Jiang
  17 siblings, 2 replies; 65+ messages in thread
From: Dave Jiang @ 2023-02-06 20:51 UTC (permalink / raw)
  To: linux-cxl, linux-pci, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, bhelgaas, robert.moore

The CDAT data is only valid after the hardware signals the media is ready.
Move the reading to after cxl_await_media_ready() has succeeded.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/cxl/port.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index b7a4a1be2945..6b2ad22487f5 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -91,9 +91,6 @@ static int cxl_port_probe(struct device *dev)
 		struct cxl_memdev *cxlmd = to_cxl_memdev(port->uport);
 		struct cxl_dev_state *cxlds = cxlmd->cxlds;
 
-		/* Cache the data early to ensure is_visible() works */
-		read_cdat_data(port);
-
 		get_device(&cxlmd->dev);
 		rc = devm_add_action_or_reset(dev, schedule_detach, cxlmd);
 		if (rc)
@@ -109,6 +106,8 @@ static int cxl_port_probe(struct device *dev)
 			return rc;
 		}
 
+		/* Cache the data early to ensure is_visible() works */
+		read_cdat_data(port);
 		if (port->cdat.table) {
 			rc = cdat_table_parse_dsmas(port->cdat.table,
 						    cxl_dsmas_parse_entry,



^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 17/18] cxl: Attach QTG IDs to the DPA ranges for the device
  2023-02-06 20:49 [PATCH 00/18] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (15 preceding siblings ...)
  2023-02-06 20:51 ` [PATCH 16/18] cxl: Move reading of CDAT data from device to after media is ready Dave Jiang
@ 2023-02-06 20:51 ` Dave Jiang
  2023-02-09 15:34   ` Jonathan Cameron
  2023-02-06 20:52 ` [PATCH 18/18] cxl: Export sysfs attributes for device QTG IDs Dave Jiang
  17 siblings, 1 reply; 65+ messages in thread
From: Dave Jiang @ 2023-02-06 20:51 UTC (permalink / raw)
  To: linux-cxl, linux-pci, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, bhelgaas, robert.moore

Match the DPA ranges of the mem device against the calculated DPA range
attached to the DSMAS. If a match is found, then assign the QTG ID to the
relevant DPA range of the memory device.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/cxl/core/mbox.c |    2 ++
 drivers/cxl/cxlmem.h    |    2 ++
 drivers/cxl/port.c      |   35 +++++++++++++++++++++++++++++++++++
 3 files changed, 39 insertions(+)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index b03fba212799..2a7b07d65010 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -869,6 +869,8 @@ struct cxl_dev_state *cxl_dev_state_create(struct device *dev)
 
 	mutex_init(&cxlds->mbox_mutex);
 	cxlds->dev = dev;
+	cxlds->pmem_qtg_id = -1;
+	cxlds->ram_qtg_id = -1;
 
 	return cxlds;
 }
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index ab138004f644..d88b88ecc807 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -251,6 +251,8 @@ struct cxl_dev_state {
 	struct resource dpa_res;
 	struct resource pmem_res;
 	struct resource ram_res;
+	int pmem_qtg_id;
+	int ram_qtg_id;
 	u64 total_bytes;
 	u64 volatile_only_bytes;
 	u64 persistent_only_bytes;
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index 6b2ad22487f5..c4cee69d6625 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -68,6 +68,39 @@ static int cxl_port_qos_calculate(struct cxl_port *port)
 	return 0;
 }
 
+static bool dpa_match_qtg_range(struct range *dpa, struct range *qtg)
+{
+	if (dpa->start >= qtg->start && dpa->end <= qtg->end)
+		return true;
+	return false;
+}
+
+static void cxl_dev_set_qtg(struct cxl_port *port, struct cxl_dev_state *cxlds)
+{
+	struct dsmas_entry *dent;
+	struct range ram_range = {
+		.start = cxlds->ram_res.start,
+		.end = cxlds->ram_res.end,
+	};
+	struct range pmem_range =  {
+		.start = cxlds->pmem_res.start,
+		.end = cxlds->pmem_res.end,
+	};
+
+	mutex_lock(&port->cdat.dsmas_lock);
+	list_for_each_entry(dent, &port->cdat.dsmas_list, list) {
+		if (dpa_match_qtg_range(&ram_range, &dent->dpa_range)) {
+			cxlds->ram_qtg_id = dent->qtg_id;
+			break;
+		}
+		if (dpa_match_qtg_range(&pmem_range, &dent->dpa_range)) {
+			cxlds->pmem_qtg_id = dent->qtg_id;
+			break;
+		}
+	}
+	mutex_unlock(&port->cdat.dsmas_lock);
+}
+
 static int cxl_port_probe(struct device *dev)
 {
 	struct cxl_port *port = to_cxl_port(dev);
@@ -134,6 +167,8 @@ static int cxl_port_probe(struct device *dev)
 		rc = cxl_mem_create_range_info(cxlds);
 		if (rc)
 			return rc;
+
+		cxl_dev_set_qtg(port, cxlds);
 	}
 
 	rc = devm_cxl_enumerate_decoders(cxlhdm);



^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 18/18] cxl: Export sysfs attributes for device QTG IDs
  2023-02-06 20:49 [PATCH 00/18] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (16 preceding siblings ...)
  2023-02-06 20:51 ` [PATCH 17/18] cxl: Attach QTG IDs to the DPA ranges for the device Dave Jiang
@ 2023-02-06 20:52 ` Dave Jiang
  2023-02-09 15:41   ` Jonathan Cameron
  17 siblings, 1 reply; 65+ messages in thread
From: Dave Jiang @ 2023-02-06 20:52 UTC (permalink / raw)
  To: linux-cxl, linux-pci, linux-acpi
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, bhelgaas, robert.moore

Export qtg_id sysfs attributes for the respective ram and pmem DPA range of
a CXL device. The QTG ID should show up as
/sys/bus/cxl/devices/memX/pmem/qtg_id for pmem or as
/sys/bus/cxl/devices/memX/ram/qtg_id for ram.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 Documentation/ABI/testing/sysfs-bus-cxl |   15 +++++++++++++++
 drivers/cxl/core/memdev.c               |   26 ++++++++++++++++++++++++++
 2 files changed, 41 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
index 0932c2f6fbf4..8133a13e118d 100644
--- a/Documentation/ABI/testing/sysfs-bus-cxl
+++ b/Documentation/ABI/testing/sysfs-bus-cxl
@@ -27,6 +27,14 @@ Description:
 		identically named field in the Identify Memory Device Output
 		Payload in the CXL-2.0 specification.
 
+What:		/sys/bus/cxl/devices/memX/ram/qtg_id
+Date:		January, 2023
+KernelVersion:	v6.3
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		(RO) Shows calculated QoS Throttling Group ID for the
+		"Volatile Only Capacity" DPA range.
+
 
 What:		/sys/bus/cxl/devices/memX/pmem/size
 Date:		December, 2020
@@ -37,6 +45,13 @@ Description:
 		identically named field in the Identify Memory Device Output
 		Payload in the CXL-2.0 specification.
 
+What:		/sys/bus/cxl/devices/memX/pmem/qtg_id
+Date:		January, 2023
+KernelVersion:	v6.3
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		(RO) Shows calculated QoS Throttling Group ID for the
+		"Persistent Only Capacity" DPA range.
 
 What:		/sys/bus/cxl/devices/memX/serial
 Date:		January, 2022
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index a74a93310d26..06f9ac929ef4 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -76,6 +76,18 @@ static ssize_t ram_size_show(struct device *dev, struct device_attribute *attr,
 static struct device_attribute dev_attr_ram_size =
 	__ATTR(size, 0444, ram_size_show, NULL);
 
+static ssize_t ram_qtg_id_show(struct device *dev, struct device_attribute *attr,
+			       char *buf)
+{
+	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+
+	return sysfs_emit(buf, "%d\n", cxlds->ram_qtg_id);
+}
+
+static struct device_attribute dev_attr_ram_qtg_id =
+	__ATTR(qtg_id, 0444, ram_qtg_id_show, NULL);
+
 static ssize_t pmem_size_show(struct device *dev, struct device_attribute *attr,
 			      char *buf)
 {
@@ -89,6 +101,18 @@ static ssize_t pmem_size_show(struct device *dev, struct device_attribute *attr,
 static struct device_attribute dev_attr_pmem_size =
 	__ATTR(size, 0444, pmem_size_show, NULL);
 
+static ssize_t pmem_qtg_id_show(struct device *dev, struct device_attribute *attr,
+				char *buf)
+{
+	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+
+	return sysfs_emit(buf, "%d\n", cxlds->pmem_qtg_id);
+}
+
+static struct device_attribute dev_attr_pmem_qtg_id =
+	__ATTR(qtg_id, 0444, pmem_qtg_id_show, NULL);
+
 static ssize_t serial_show(struct device *dev, struct device_attribute *attr,
 			   char *buf)
 {
@@ -117,11 +141,13 @@ static struct attribute *cxl_memdev_attributes[] = {
 
 static struct attribute *cxl_memdev_pmem_attributes[] = {
 	&dev_attr_pmem_size.attr,
+	&dev_attr_pmem_qtg_id.attr,
 	NULL,
 };
 
 static struct attribute *cxl_memdev_ram_attributes[] = {
 	&dev_attr_ram_size.attr,
+	&dev_attr_ram_qtg_id.attr,
 	NULL,
 };
 



^ permalink raw reply related	[flat|nested] 65+ messages in thread

* Re: [PATCH 05/18] ACPICA: Fix 'struct acpi_cdat_dsmas' spelling mistake
  2023-02-06 20:50 ` [PATCH 05/18] ACPICA: Fix 'struct acpi_cdat_dsmas' spelling mistake Dave Jiang
@ 2023-02-06 22:00   ` Lukas Wunner
  0 siblings, 0 replies; 65+ messages in thread
From: Lukas Wunner @ 2023-02-06 22:00 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore

On Mon, Feb 06, 2023 at 01:50:06PM -0700, Dave Jiang wrote:
> 'struct acpi_cadt_dsmas' => 'struct acpi_cdat_dsmas'
> 
> Fixes: 51aad1a6723b ("ACPICA: Finish support for the CDAT table")
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>

ACPICA changes need to go into upstream first (via a pull request on
GitHub).  Once it's merged, you can submit the same patch downstream
for the kernel and reference the ACPICA commit with a Link: tag.

I've already submitted a pull request for the exact same change more
than a week ago:

https://github.com/acpica/acpica/pull/830

The pull request has been approved but not merged.  Hopefully that'll
happen soon.

Thanks,

Lukas

> ---
>  include/acpi/actbl1.h |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/acpi/actbl1.h b/include/acpi/actbl1.h
> index 4175dce3967c..e8297cefde09 100644
> --- a/include/acpi/actbl1.h
> +++ b/include/acpi/actbl1.h
> @@ -344,7 +344,7 @@ enum acpi_cdat_type {
>  
>  /* Subtable 0: Device Scoped Memory Affinity Structure (DSMAS) */
>  
> -struct acpi_cadt_dsmas {
> +struct acpi_cdat_dsmas {
>  	u8 dsmad_handle;
>  	u8 flags;
>  	u16 reserved;
> 
> 

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 16/18] cxl: Move reading of CDAT data from device to after media is ready
  2023-02-06 20:51 ` [PATCH 16/18] cxl: Move reading of CDAT data from device to after media is ready Dave Jiang
@ 2023-02-06 22:17   ` Lukas Wunner
  2023-02-07 20:55     ` Dave Jiang
  2023-02-09 15:31   ` Jonathan Cameron
  1 sibling, 1 reply; 65+ messages in thread
From: Lukas Wunner @ 2023-02-06 22:17 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore

On Mon, Feb 06, 2023 at 01:51:46PM -0700, Dave Jiang wrote:
> --- a/drivers/cxl/port.c
> +++ b/drivers/cxl/port.c
> @@ -109,6 +106,8 @@ static int cxl_port_probe(struct device *dev)
>  			return rc;
>  		}
>  
> +		/* Cache the data early to ensure is_visible() works */
> +		read_cdat_data(port);
>  		if (port->cdat.table) {
>  			rc = cdat_table_parse_dsmas(port->cdat.table,
>  						    cxl_dsmas_parse_entry,

Which branch is this patch based on?  I'm not seeing a function
called cdat_table_parse_dsmas() in cxl/next.

cxl_cdat_read_table() could be amended with a switch/case ladder
which compares entry->type to acpi_cdat_type values and stores
a pointer to an entry of interest e.g. in port->cdat->dsmas.
Then you can use that pointer directly to find the dsmas in the
CDAT and parse it.

Note however that cxl_cdat_read_table() is refactored heavily by
my DOE rework series (will submit v3 later this week):

https://github.com/l1k/linux/commits/doe

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 10/18] PCI: Export pcie_get_speed() using the code from sysfs PCI link speed show function
  2023-02-06 20:50 ` [PATCH 10/18] PCI: Export pcie_get_speed() using the code from sysfs PCI link speed show function Dave Jiang
@ 2023-02-06 22:27   ` Lukas Wunner
  2023-02-07 20:29     ` Dave Jiang
  0 siblings, 1 reply; 65+ messages in thread
From: Lukas Wunner @ 2023-02-06 22:27 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore

On Mon, Feb 06, 2023 at 01:50:52PM -0700, Dave Jiang wrote:
> Move the logic in current_link_speed_show() to a common function and export
> that function as pcie_get_speed() to allow other drivers to retrieve
> the current negotiated link speed.
[...]
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -6215,6 +6215,26 @@ enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev)
>  }
>  EXPORT_SYMBOL(pcie_get_width_cap);
>  
> +/**
> + * pcie_get_speed - query for the PCI device's current link speed
> + * @dev: PCI device to query
> + *
> + * Query the PCI device current link speed.
> + */
> +enum pci_bus_speed pcie_get_speed(struct pci_dev *dev)
> +{
> +	u16 linkstat, cls;
> +	int err;
> +
> +	err = pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &linkstat);
> +	if (err)
> +		return PCI_SPEED_UNKNOWN;
> +
> +	cls = FIELD_GET(PCI_EXP_LNKSTA_CLS, linkstat);
> +	return pcie_link_speed[cls];
> +}
> +EXPORT_SYMBOL(pcie_get_speed);

It seems we're already caching the current speed in dev->bus->cur_bus_speed.
Is that not sufficient?  If it isn't, that should be explained in the
commit message.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 12/18] cxl: Add helpers to calculate pci latency for the CXL device
  2023-02-06 20:51 ` [PATCH 12/18] cxl: Add helpers to calculate pci latency for the CXL device Dave Jiang
@ 2023-02-06 22:39   ` Bjorn Helgaas
  2023-02-07 20:51     ` Dave Jiang
  2023-02-09 15:16   ` Jonathan Cameron
  1 sibling, 1 reply; 65+ messages in thread
From: Bjorn Helgaas @ 2023-02-06 22:39 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore

On Mon, Feb 06, 2023 at 01:51:10PM -0700, Dave Jiang wrote:
> The latency is calculated by dividing the FLIT size over the bandwidth. Add
> support to retrieve the FLIT size for the CXL device and calculate the
> latency of the downstream link.

s/FLIT/flit/ to match spec usage.

Most of this looks like PCIe, not necessarily CXL-specific.

I guess you only care about the latency of a single link, not the
entire path?

> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> ---
>  drivers/cxl/core/pci.c |   67 ++++++++++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxlpci.h   |   14 ++++++++++
>  2 files changed, 81 insertions(+)
> 
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index a24dac36bedd..54ac6f8825ff 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -633,3 +633,70 @@ void read_cdat_data(struct cxl_port *port)
>  	}
>  }
>  EXPORT_SYMBOL_NS_GPL(read_cdat_data, CXL);
> +
> +static int pcie_speed_to_mbps(enum pci_bus_speed speed)
> +{
> +	switch (speed) {
> +	case PCIE_SPEED_2_5GT:
> +		return 2500;
> +	case PCIE_SPEED_5_0GT:
> +		return 5000;
> +	case PCIE_SPEED_8_0GT:
> +		return 8000;
> +	case PCIE_SPEED_16_0GT:
> +		return 16000;
> +	case PCIE_SPEED_32_0GT:
> +		return 32000;
> +	case PCIE_SPEED_64_0GT:
> +		return 64000;
> +	default:
> +		break;
> +	}
> +
> +	return -EINVAL;
> +}
> +
> +static int cxl_pci_mbits_to_mbytes(struct pci_dev *pdev)
> +{
> +	int mbits;
> +
> +	mbits = pcie_speed_to_mbps(pcie_get_speed(pdev));
> +	if (mbits < 0)
> +		return mbits;
> +
> +	return mbits >> 3;
> +}
> +
> +static int cxl_get_flit_size(struct pci_dev *pdev)
> +{
> +	if (cxl_pci_flit_256(pdev))
> +		return 256;
> +
> +	return 66;

I don't know about the 66-byte flit format, maybe this part is
CXL-specific?

> + * cxl_pci_get_latency - calculate the link latency for the PCIe link
> + * @pdev - PCI device
> + *
> + * CXL Memory Device SW Guide v1.0 2.11.4 Link latency calculation
> + * Link latency = LinkPropagationLatency + FlitLatency + RetimerLatency
> + * LinkProgationLatency is negligible, so 0 will be used
> + * RetimerLatency is assumed to be neglibible and 0 will be used

s/neglibible/negligible/

> + * FlitLatency = FlitSize / LinkBandwidth
> + * FlitSize is defined by spec. CXL v3.0 4.2.1.
> + * 68B flit is used up to 32GT/s. >32GT/s, 256B flit size is used.
> + * The FlitLatency is converted to pico-seconds.

I guess this means cxl_pci_get_latency() actually *returns* a value in
picoseconds?

There are a couple instances of this written as "pico-seconds", but
most are "picoseconds".

> +long cxl_pci_get_latency(struct pci_dev *pdev)
> +{
> +	long bw, flit_size;
> +
> +	bw = cxl_pci_mbits_to_mbytes(pdev);
> +	if (bw < 0)
> +		return bw;
> +
> +	flit_size = cxl_get_flit_size(pdev);
> +	return flit_size * 1000000L / bw;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_pci_get_latency, CXL);
> diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> index 920909791bb9..d64a3e0458ab 100644
> --- a/drivers/cxl/cxlpci.h
> +++ b/drivers/cxl/cxlpci.h
> @@ -62,8 +62,22 @@ enum cxl_regloc_type {
>  	CXL_REGLOC_RBI_TYPES
>  };
>  
> +/*
> + * CXL v3.0 6.2.3 Table 6-4

The copy I have refers to *Revision 3.0, Version 1.0*, i.e.,
"Revision" is the major level and "Version" is the minor.  So I would
cite this as "CXL r3.0", not "CXL v3.0".  I suppose the same for CXL
Memory Device above, but I don't have that spec.

> + * The table indicates that if PCIe Flit Mode is set, then CXL is in 256B flits
> + * mode, otherwise it's 68B flits mode.
> + */
> +static inline bool cxl_pci_flit_256(struct pci_dev *pdev)
> +{
> +	u32 lnksta2;
> +
> +	pcie_capability_read_dword(pdev, PCI_EXP_LNKSTA2, &lnksta2);
> +	return lnksta2 & BIT(10);

Add a #define for the bit.

AFAICT, the PCIe spec defines this bit, and it only indicates the link
is or will be operating in Flit Mode; it doesn't actually say anything
about how large the flits are.  I suppose that's because PCIe only
talks about 256B flits, not 66B ones?

Bjorn

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 11/18] PCI: Export pcie_get_width() using the code from sysfs PCI link width show function
  2023-02-06 20:51 ` [PATCH 11/18] PCI: Export pcie_get_width() using the code from sysfs PCI link width " Dave Jiang
@ 2023-02-06 22:43   ` Bjorn Helgaas
  2023-02-07 20:35     ` Dave Jiang
  0 siblings, 1 reply; 65+ messages in thread
From: Bjorn Helgaas @ 2023-02-06 22:43 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore

On Mon, Feb 06, 2023 at 01:51:01PM -0700, Dave Jiang wrote:
> Move the logic in current_link_width_show() to a common function and export
> that functiuon as pcie_get_width() to allow other drivers to to retrieve
> the current negotiated link width.

s/a common function and export that functiuon and export that functiuon as//

I don't see the module caller of this, so not clear on why it needs to
be exported.

> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> ---
>  drivers/pci/pci-sysfs.c |    9 +--------
>  drivers/pci/pci.c       |   20 ++++++++++++++++++++
>  include/linux/pci.h     |    1 +
>  3 files changed, 22 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 0217bb5ca8fa..139096c39380 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -215,15 +215,8 @@ static ssize_t current_link_width_show(struct device *dev,
>  				       struct device_attribute *attr, char *buf)
>  {
>  	struct pci_dev *pci_dev = to_pci_dev(dev);
> -	u16 linkstat;
> -	int err;
>  
> -	err = pcie_capability_read_word(pci_dev, PCI_EXP_LNKSTA, &linkstat);
> -	if (err)
> -		return -EINVAL;
> -
> -	return sysfs_emit(buf, "%u\n",
> -		(linkstat & PCI_EXP_LNKSTA_NLW) >> PCI_EXP_LNKSTA_NLW_SHIFT);
> +	return sysfs_emit(buf, "%u\n", pcie_get_width(pci_dev));
>  }
>  static DEVICE_ATTR_RO(current_link_width);
>  
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index d0131b5623b1..0858fa2f1c2d 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -6235,6 +6235,26 @@ enum pci_bus_speed pcie_get_speed(struct pci_dev *dev)
>  }
>  EXPORT_SYMBOL(pcie_get_speed);
>  
> +/**
> + * pcie_get_width - query for the PCI device's current link width
> + * @dev: PCI device to query
> + *
> + * Query the PCI device current negoiated width.
> + */
> +
> +enum pcie_link_width pcie_get_width(struct pci_dev *dev)
> +{
> +	u16 linkstat;
> +	int err;
> +
> +	err = pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &linkstat);
> +	if (err)
> +		return PCIE_LNK_WIDTH_UNKNOWN;
> +
> +	return FIELD_GET(PCI_EXP_LNKSTA_NLW, linkstat);
> +}
> +EXPORT_SYMBOL(pcie_get_width);
> +
>  /**
>   * pcie_bandwidth_capable - calculate a PCI device's link bandwidth capability
>   * @dev: PCI device
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 6a065986ff8f..21eca09a98e2 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -305,6 +305,7 @@ enum pci_bus_speed {
>  
>  enum pci_bus_speed pcie_get_speed(struct pci_dev *dev);
>  enum pci_bus_speed pcie_get_speed_cap(struct pci_dev *dev);
> +enum pcie_link_width pcie_get_width(struct pci_dev *dev);
>  enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev);
>  
>  struct pci_vpd {
> 
> 

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 02/18] ACPICA: Export acpi_ut_verify_cdat_checksum()
  2023-02-06 20:49 ` [PATCH 02/18] ACPICA: Export acpi_ut_verify_cdat_checksum() Dave Jiang
@ 2023-02-07 14:19   ` Rafael J. Wysocki
  2023-02-07 15:47     ` Dave Jiang
  0 siblings, 1 reply; 65+ messages in thread
From: Rafael J. Wysocki @ 2023-02-07 14:19 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore

On Mon, Feb 6, 2023 at 9:49 PM Dave Jiang <dave.jiang@intel.com> wrote:
>
> Export the CDAT checksum verify function so CXL driver can use it to verify
> CDAT coming from the CXL devices.
>
> Given that this function isn't actually being used by ACPI internals,
> removing the define check of APCI_CHECKSUM_ABORT so the function would
> return failure on checksum fail since the driver will need to know.

If you want to make ACPICA changes, please first submit a pull request
to the upstream ACPICA project on GitHub.

Having done that, please resubmit the corresponding Linux patch with a
Link tag pointing to the upstream PR.

Thanks!

> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> ---
>  drivers/acpi/acpica/utcksum.c |    4 +---
>  include/linux/acpi.h          |    7 +++++++
>  2 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/acpi/acpica/utcksum.c b/drivers/acpi/acpica/utcksum.c
> index c166e4c05ab6..c0f98c8f9a0b 100644
> --- a/drivers/acpi/acpica/utcksum.c
> +++ b/drivers/acpi/acpica/utcksum.c
> @@ -102,15 +102,13 @@ acpi_ut_verify_cdat_checksum(struct acpi_table_cdat *cdat_table, u32 length)
>                                    "should be 0x%2.2X",
>                                    acpi_gbl_CDAT, cdat_table->checksum,
>                                    checksum));
> -
> -#if (ACPI_CHECKSUM_ABORT)
>                 return (AE_BAD_CHECKSUM);
> -#endif
>         }
>
>         cdat_table->checksum = checksum;
>         return (AE_OK);
>  }
> +EXPORT_SYMBOL_GPL(acpi_ut_verify_cdat_checksum);
>
>  /*******************************************************************************
>   *
> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
> index 5e6a876e17ba..09b44afef7df 100644
> --- a/include/linux/acpi.h
> +++ b/include/linux/acpi.h
> @@ -1504,9 +1504,16 @@ static inline void acpi_init_ffh(void) { }
>  #ifdef CONFIG_ACPI
>  extern void acpi_device_notify(struct device *dev);
>  extern void acpi_device_notify_remove(struct device *dev);
> +extern acpi_status
> +acpi_ut_verify_cdat_checksum(struct acpi_table_cdat *cdat_table, u32 length);
>  #else
>  static inline void acpi_device_notify(struct device *dev) { }
>  static inline void acpi_device_notify_remove(struct device *dev) { }
> +static inline acpi_status
> +acpi_ut_verify_cdat_checksum(struct acpi_table_cdat *cdat_table, u32 length)
> +{
> +       return (AE_NOT_CONFIGURED);
> +}
>  #endif
>
>  #endif /*_LINUX_ACPI_H*/
>
>

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 02/18] ACPICA: Export acpi_ut_verify_cdat_checksum()
  2023-02-07 14:19   ` Rafael J. Wysocki
@ 2023-02-07 15:47     ` Dave Jiang
  2023-02-09 11:30       ` Jonathan Cameron
  0 siblings, 1 reply; 65+ messages in thread
From: Dave Jiang @ 2023-02-07 15:47 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, bhelgaas, robert.moore



On 2/7/23 7:19 AM, Rafael J. Wysocki wrote:
> On Mon, Feb 6, 2023 at 9:49 PM Dave Jiang <dave.jiang@intel.com> wrote:
>>
>> Export the CDAT checksum verify function so CXL driver can use it to verify
>> CDAT coming from the CXL devices.
>>
>> Given that this function isn't actually being used by ACPI internals,
>> removing the define check of APCI_CHECKSUM_ABORT so the function would
>> return failure on checksum fail since the driver will need to know.
> 
> If you want to make ACPICA changes, please first submit a pull request
> to the upstream ACPICA project on GitHub.
> 
> Having done that, please resubmit the corresponding Linux patch with a
> Link tag pointing to the upstream PR.

Ok will do. Thanks!
> 
> Thanks!
> 
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>> ---
>>   drivers/acpi/acpica/utcksum.c |    4 +---
>>   include/linux/acpi.h          |    7 +++++++
>>   2 files changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/acpi/acpica/utcksum.c b/drivers/acpi/acpica/utcksum.c
>> index c166e4c05ab6..c0f98c8f9a0b 100644
>> --- a/drivers/acpi/acpica/utcksum.c
>> +++ b/drivers/acpi/acpica/utcksum.c
>> @@ -102,15 +102,13 @@ acpi_ut_verify_cdat_checksum(struct acpi_table_cdat *cdat_table, u32 length)
>>                                     "should be 0x%2.2X",
>>                                     acpi_gbl_CDAT, cdat_table->checksum,
>>                                     checksum));
>> -
>> -#if (ACPI_CHECKSUM_ABORT)
>>                  return (AE_BAD_CHECKSUM);
>> -#endif
>>          }
>>
>>          cdat_table->checksum = checksum;
>>          return (AE_OK);
>>   }
>> +EXPORT_SYMBOL_GPL(acpi_ut_verify_cdat_checksum);
>>
>>   /*******************************************************************************
>>    *
>> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
>> index 5e6a876e17ba..09b44afef7df 100644
>> --- a/include/linux/acpi.h
>> +++ b/include/linux/acpi.h
>> @@ -1504,9 +1504,16 @@ static inline void acpi_init_ffh(void) { }
>>   #ifdef CONFIG_ACPI
>>   extern void acpi_device_notify(struct device *dev);
>>   extern void acpi_device_notify_remove(struct device *dev);
>> +extern acpi_status
>> +acpi_ut_verify_cdat_checksum(struct acpi_table_cdat *cdat_table, u32 length);
>>   #else
>>   static inline void acpi_device_notify(struct device *dev) { }
>>   static inline void acpi_device_notify_remove(struct device *dev) { }
>> +static inline acpi_status
>> +acpi_ut_verify_cdat_checksum(struct acpi_table_cdat *cdat_table, u32 length)
>> +{
>> +       return (AE_NOT_CONFIGURED);
>> +}
>>   #endif
>>
>>   #endif /*_LINUX_ACPI_H*/
>>
>>

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 10/18] PCI: Export pcie_get_speed() using the code from sysfs PCI link speed show function
  2023-02-06 22:27   ` Lukas Wunner
@ 2023-02-07 20:29     ` Dave Jiang
  0 siblings, 0 replies; 65+ messages in thread
From: Dave Jiang @ 2023-02-07 20:29 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore



On 2/6/23 3:27 PM, Lukas Wunner wrote:
> On Mon, Feb 06, 2023 at 01:50:52PM -0700, Dave Jiang wrote:
>> Move the logic in current_link_speed_show() to a common function and export
>> that function as pcie_get_speed() to allow other drivers to retrieve
>> the current negotiated link speed.
> [...]
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -6215,6 +6215,26 @@ enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev)
>>   }
>>   EXPORT_SYMBOL(pcie_get_width_cap);
>>   
>> +/**
>> + * pcie_get_speed - query for the PCI device's current link speed
>> + * @dev: PCI device to query
>> + *
>> + * Query the PCI device current link speed.
>> + */
>> +enum pci_bus_speed pcie_get_speed(struct pci_dev *dev)
>> +{
>> +	u16 linkstat, cls;
>> +	int err;
>> +
>> +	err = pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &linkstat);
>> +	if (err)
>> +		return PCI_SPEED_UNKNOWN;
>> +
>> +	cls = FIELD_GET(PCI_EXP_LNKSTA_CLS, linkstat);
>> +	return pcie_link_speed[cls];
>> +}
>> +EXPORT_SYMBOL(pcie_get_speed);
> 
> It seems we're already caching the current speed in dev->bus->cur_bus_speed.
> Is that not sufficient?  If it isn't, that should be explained in the
> commit message.

I did not realize. That should work. Thanks. I'll drop patch.

> 
> Thanks,
> 
> Lukas

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 11/18] PCI: Export pcie_get_width() using the code from sysfs PCI link width show function
  2023-02-06 22:43   ` Bjorn Helgaas
@ 2023-02-07 20:35     ` Dave Jiang
  0 siblings, 0 replies; 65+ messages in thread
From: Dave Jiang @ 2023-02-07 20:35 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore



On 2/6/23 3:43 PM, Bjorn Helgaas wrote:
> On Mon, Feb 06, 2023 at 01:51:01PM -0700, Dave Jiang wrote:
>> Move the logic in current_link_width_show() to a common function and export
>> that functiuon as pcie_get_width() to allow other drivers to to retrieve
>> the current negotiated link width.
> 
> s/a common function and export that functiuon and export that functiuon as//
> 
> I don't see the module caller of this, so not clear on why it needs to
> be exported.

You are right. I think I was using it before I found 
pcie_bandwidth_available() call. I will drop.

> 
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>> ---
>>   drivers/pci/pci-sysfs.c |    9 +--------
>>   drivers/pci/pci.c       |   20 ++++++++++++++++++++
>>   include/linux/pci.h     |    1 +
>>   3 files changed, 22 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
>> index 0217bb5ca8fa..139096c39380 100644
>> --- a/drivers/pci/pci-sysfs.c
>> +++ b/drivers/pci/pci-sysfs.c
>> @@ -215,15 +215,8 @@ static ssize_t current_link_width_show(struct device *dev,
>>   				       struct device_attribute *attr, char *buf)
>>   {
>>   	struct pci_dev *pci_dev = to_pci_dev(dev);
>> -	u16 linkstat;
>> -	int err;
>>   
>> -	err = pcie_capability_read_word(pci_dev, PCI_EXP_LNKSTA, &linkstat);
>> -	if (err)
>> -		return -EINVAL;
>> -
>> -	return sysfs_emit(buf, "%u\n",
>> -		(linkstat & PCI_EXP_LNKSTA_NLW) >> PCI_EXP_LNKSTA_NLW_SHIFT);
>> +	return sysfs_emit(buf, "%u\n", pcie_get_width(pci_dev));
>>   }
>>   static DEVICE_ATTR_RO(current_link_width);
>>   
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index d0131b5623b1..0858fa2f1c2d 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -6235,6 +6235,26 @@ enum pci_bus_speed pcie_get_speed(struct pci_dev *dev)
>>   }
>>   EXPORT_SYMBOL(pcie_get_speed);
>>   
>> +/**
>> + * pcie_get_width - query for the PCI device's current link width
>> + * @dev: PCI device to query
>> + *
>> + * Query the PCI device current negoiated width.
>> + */
>> +
>> +enum pcie_link_width pcie_get_width(struct pci_dev *dev)
>> +{
>> +	u16 linkstat;
>> +	int err;
>> +
>> +	err = pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &linkstat);
>> +	if (err)
>> +		return PCIE_LNK_WIDTH_UNKNOWN;
>> +
>> +	return FIELD_GET(PCI_EXP_LNKSTA_NLW, linkstat);
>> +}
>> +EXPORT_SYMBOL(pcie_get_width);
>> +
>>   /**
>>    * pcie_bandwidth_capable - calculate a PCI device's link bandwidth capability
>>    * @dev: PCI device
>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index 6a065986ff8f..21eca09a98e2 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -305,6 +305,7 @@ enum pci_bus_speed {
>>   
>>   enum pci_bus_speed pcie_get_speed(struct pci_dev *dev);
>>   enum pci_bus_speed pcie_get_speed_cap(struct pci_dev *dev);
>> +enum pcie_link_width pcie_get_width(struct pci_dev *dev);
>>   enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev);
>>   
>>   struct pci_vpd {
>>
>>

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 12/18] cxl: Add helpers to calculate pci latency for the CXL device
  2023-02-06 22:39   ` Bjorn Helgaas
@ 2023-02-07 20:51     ` Dave Jiang
  2023-02-08 22:15       ` Bjorn Helgaas
  0 siblings, 1 reply; 65+ messages in thread
From: Dave Jiang @ 2023-02-07 20:51 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore



On 2/6/23 3:39 PM, Bjorn Helgaas wrote:
> On Mon, Feb 06, 2023 at 01:51:10PM -0700, Dave Jiang wrote:
>> The latency is calculated by dividing the FLIT size over the bandwidth. Add
>> support to retrieve the FLIT size for the CXL device and calculate the
>> latency of the downstream link.
> 
> s/FLIT/flit/ to match spec usage.

ok will fix.

> 
> Most of this looks like PCIe, not necessarily CXL-specific.
> 
> I guess you only care about the latency of a single link, not the
> entire path?

I am adding each of the links individually in the next patch. 
Are you suggesting a function similar to pcie_bandwidth_available(), 
but for the latency of the entire path?
> 
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>> ---
>>   drivers/cxl/core/pci.c |   67 ++++++++++++++++++++++++++++++++++++++++++++++++
>>   drivers/cxl/cxlpci.h   |   14 ++++++++++
>>   2 files changed, 81 insertions(+)
>>
>> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
>> index a24dac36bedd..54ac6f8825ff 100644
>> --- a/drivers/cxl/core/pci.c
>> +++ b/drivers/cxl/core/pci.c
>> @@ -633,3 +633,70 @@ void read_cdat_data(struct cxl_port *port)
>>   	}
>>   }
>>   EXPORT_SYMBOL_NS_GPL(read_cdat_data, CXL);
>> +
>> +static int pcie_speed_to_mbps(enum pci_bus_speed speed)
>> +{
>> +	switch (speed) {
>> +	case PCIE_SPEED_2_5GT:
>> +		return 2500;
>> +	case PCIE_SPEED_5_0GT:
>> +		return 5000;
>> +	case PCIE_SPEED_8_0GT:
>> +		return 8000;
>> +	case PCIE_SPEED_16_0GT:
>> +		return 16000;
>> +	case PCIE_SPEED_32_0GT:
>> +		return 32000;
>> +	case PCIE_SPEED_64_0GT:
>> +		return 64000;
>> +	default:
>> +		break;
>> +	}
>> +
>> +	return -EINVAL;
>> +}
>> +
>> +static int cxl_pci_mbits_to_mbytes(struct pci_dev *pdev)
>> +{
>> +	int mbits;
>> +
>> +	mbits = pcie_speed_to_mbps(pcie_get_speed(pdev));
>> +	if (mbits < 0)
>> +		return mbits;
>> +
>> +	return mbits >> 3;
>> +}
>> +
>> +static int cxl_get_flit_size(struct pci_dev *pdev)
>> +{
>> +	if (cxl_pci_flit_256(pdev))
>> +		return 256;
>> +
>> +	return 66;
> 
> I don't know about the 66-byte flit format, maybe this part is
> CXL-specific?

68-byte flit format. Looks like this is a typo from me.

> 
>> + * cxl_pci_get_latency - calculate the link latency for the PCIe link
>> + * @pdev - PCI device
>> + *
>> + * CXL Memory Device SW Guide v1.0 2.11.4 Link latency calculation
>> + * Link latency = LinkPropagationLatency + FlitLatency + RetimerLatency
>> + * LinkProgationLatency is negligible, so 0 will be used
>> + * RetimerLatency is assumed to be neglibible and 0 will be used
> 
> s/neglibible/negligible/

thank you will fix.
> 
>> + * FlitLatency = FlitSize / LinkBandwidth
>> + * FlitSize is defined by spec. CXL v3.0 4.2.1.
>> + * 68B flit is used up to 32GT/s. >32GT/s, 256B flit size is used.
>> + * The FlitLatency is converted to pico-seconds.
> 
> I guess this means cxl_pci_get_latency() actually *returns* a value in
> picoseconds?

yes

> 
> There are a couple instances of this written as "pico-seconds", but
> most are "picoseconds".

ok will fix.

> 
>> +long cxl_pci_get_latency(struct pci_dev *pdev)
>> +{
>> +	long bw, flit_size;
>> +
>> +	bw = cxl_pci_mbits_to_mbytes(pdev);
>> +	if (bw < 0)
>> +		return bw;
>> +
>> +	flit_size = cxl_get_flit_size(pdev);
>> +	return flit_size * 1000000L / bw;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_pci_get_latency, CXL);
>> diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
>> index 920909791bb9..d64a3e0458ab 100644
>> --- a/drivers/cxl/cxlpci.h
>> +++ b/drivers/cxl/cxlpci.h
>> @@ -62,8 +62,22 @@ enum cxl_regloc_type {
>>   	CXL_REGLOC_RBI_TYPES
>>   };
>>   
>> +/*
>> + * CXL v3.0 6.2.3 Table 6-4
> 
> The copy I have refers to *Revision 3.0, Version 1.0*, i.e.,
> "Revision" is the major level and "Version" is the minor.  So I would
> cite this as "CXL r3.0", not "CXL v3.0".  I suppose the same for CXL
> Memory Device above, but I don't have that spec.

Ok will fix.

> 
>> + * The table indicates that if PCIe Flit Mode is set, then CXL is in 256B flits
>> + * mode, otherwise it's 68B flits mode.
>> + */
>> +static inline bool cxl_pci_flit_256(struct pci_dev *pdev)
>> +{
>> +	u32 lnksta2;
>> +
>> +	pcie_capability_read_dword(pdev, PCI_EXP_LNKSTA2, &lnksta2);
>> +	return lnksta2 & BIT(10);
> 
> Add a #define for the bit.

ok will add.

> 
> AFAICT, the PCIe spec defines this bit, and it only indicates the link
> is or will be operating in Flit Mode; it doesn't actually say anything
> about how large the flits are.  I suppose that's because PCIe only
> talks about 256B flits, not 66B ones?

Looking at CXL v1.0 rev3.0 6.2.3 "256B Flit Mode", table 6-4, it shows 
that when PCIe Flit Mode is set, then CXL is in 256B flits mode, 
otherwise, it is 68B flits. So an assumption is made here regarding the 
flit size based on the table.

> 
> Bjorn

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 16/18] cxl: Move reading of CDAT data from device to after media is ready
  2023-02-06 22:17   ` Lukas Wunner
@ 2023-02-07 20:55     ` Dave Jiang
  0 siblings, 0 replies; 65+ messages in thread
From: Dave Jiang @ 2023-02-07 20:55 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore



On 2/6/23 3:17 PM, Lukas Wunner wrote:
> On Mon, Feb 06, 2023 at 01:51:46PM -0700, Dave Jiang wrote:
>> --- a/drivers/cxl/port.c
>> +++ b/drivers/cxl/port.c
>> @@ -109,6 +106,8 @@ static int cxl_port_probe(struct device *dev)
>>   			return rc;
>>   		}
>>   
>> +		/* Cache the data early to ensure is_visible() works */
>> +		read_cdat_data(port);
>>   		if (port->cdat.table) {
>>   			rc = cdat_table_parse_dsmas(port->cdat.table,
>>   						    cxl_dsmas_parse_entry,
> 
> Which branch is this patch based on?  I'm not seeing a function
> called cdat_table_parse_dsmas() in cxl/next.

v6.2-rc7. See patch 4/18. That's where it's introduced. I adapted it 
from the ACPI entry parsing code.

> 
> cxl_cdat_read_table() could be amended with a switch/case ladder
> which compares entry->type to acpi_cdat_type values and stores
> a pointer to an entry of interest e.g. in port->cdat->dsmas.
> Then you can use that pointer directly to find the dsmas in the
> CDAT and parse it.

Yes, but we may have more than one DSMAS, right? Plus, with the DSLBIS 
entries needing to be parsed as well, it may be better to just have a 
common parsing routine to deal with all of that.
> 
> Note however that cxl_cdat_read_table() is refactored heavily by
> my DOE rework series (will submit v3 later this week):
> 
> https://github.com/l1k/linux/commits/doe
> 
> Thanks,
> 
> Lukas

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 12/18] cxl: Add helpers to calculate pci latency for the CXL device
  2023-02-07 20:51     ` Dave Jiang
@ 2023-02-08 22:15       ` Bjorn Helgaas
  2023-02-08 23:56         ` Dave Jiang
  0 siblings, 1 reply; 65+ messages in thread
From: Bjorn Helgaas @ 2023-02-08 22:15 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore

On Tue, Feb 07, 2023 at 01:51:17PM -0700, Dave Jiang wrote:
> 
> 
> On 2/6/23 3:39 PM, Bjorn Helgaas wrote:
> > On Mon, Feb 06, 2023 at 01:51:10PM -0700, Dave Jiang wrote:
> > > The latency is calculated by dividing the FLIT size over the
> > > bandwidth. Add support to retrieve the FLIT size for the CXL
> > > device and calculate the latency of the downstream link.

> > I guess you only care about the latency of a single link, not the
> > entire path?
> 
> I am adding each of the link individually together in the next
> patch. Are you suggesting a similar function like
> pcie_bandwidth_available() but for latency for the entire path?

Only a clarifying question.

> > > +static int cxl_get_flit_size(struct pci_dev *pdev)
> > > +{
> > > +	if (cxl_pci_flit_256(pdev))
> > > +		return 256;
> > > +
> > > +	return 66;
> > 
> > I don't know about the 66-byte flit format, maybe this part is
> > CXL-specific?
> 
> 68-byte flit format. Looks like this is a typo from me.

This part must be CXL-specific, since I don't think PCIe mentions
68-byte flits.

> > > + * The table indicates that if PCIe Flit Mode is set, then CXL is in 256B flits
> > > + * mode, otherwise it's 68B flits mode.
> > > + */
> > > +static inline bool cxl_pci_flit_256(struct pci_dev *pdev)
> > > +{
> > > +	u32 lnksta2;
> > > +
> > > +	pcie_capability_read_dword(pdev, PCI_EXP_LNKSTA2, &lnksta2);
> > > +	return lnksta2 & BIT(10);
> > 
> > Add a #define for the bit.
> 
> ok will add.
> 
> > 
> > AFAICT, the PCIe spec defines this bit, and it only indicates the link
> > is or will be operating in Flit Mode; it doesn't actually say anything
> > about how large the flits are.  I suppose that's because PCIe only
> > talks about 256B flits, not 66B ones?
> 
> Looking at CXL v1.0 rev3.0 6.2.3 "256B Flit Mode", table 6-4, it shows that
> when PCIe Flit Mode is set, then CXL is in 256B flits mode, otherwise, it is
> 68B flits. So an assumption is made here regarding the flit size based on
> the table.

So reading PCI_EXP_LNKSTA2 and extracting the Flit Mode bit is
PCIe-generic, but the interpretation of "PCIe Flit Mode not enabled
means 68-byte flits" is CXL-specific?

This sounds wrong, but I don't know quite how.  How would the PCI core
manage links where Flit Mode being cleared really means Flit Mode is
*enabled* but with a different size?  Seems like something could go
wrong there.

Bjorn

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 12/18] cxl: Add helpers to calculate pci latency for the CXL device
  2023-02-08 22:15       ` Bjorn Helgaas
@ 2023-02-08 23:56         ` Dave Jiang
  2023-02-09 15:10           ` Jonathan Cameron
  0 siblings, 1 reply; 65+ messages in thread
From: Dave Jiang @ 2023-02-08 23:56 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore



On 2/8/23 3:15 PM, Bjorn Helgaas wrote:
> On Tue, Feb 07, 2023 at 01:51:17PM -0700, Dave Jiang wrote:
>>
>>
>> On 2/6/23 3:39 PM, Bjorn Helgaas wrote:
>>> On Mon, Feb 06, 2023 at 01:51:10PM -0700, Dave Jiang wrote:
>>>> The latency is calculated by dividing the FLIT size over the
>>>> bandwidth. Add support to retrieve the FLIT size for the CXL
>>>> device and calculate the latency of the downstream link.
> 
>>> I guess you only care about the latency of a single link, not the
>>> entire path?
>>
>> I am adding each of the link individually together in the next
>> patch. Are you suggesting a similar function like
>> pcie_bandwidth_available() but for latency for the entire path?
> 
> Only a clarifying question.
> 
>>>> +static int cxl_get_flit_size(struct pci_dev *pdev)
>>>> +{
>>>> +	if (cxl_pci_flit_256(pdev))
>>>> +		return 256;
>>>> +
>>>> +	return 66;
>>>
>>> I don't know about the 66-byte flit format, maybe this part is
>>> CXL-specific?
>>
>> 68-byte flit format. Looks like this is a typo from me.
> 
> This part must be CXL-specific, since I don't think PCIe mentions
> 68-byte flits.
> 
>>>> + * The table indicates that if PCIe Flit Mode is set, then CXL is in 256B flits
>>>> + * mode, otherwise it's 68B flits mode.
>>>> + */
>>>> +static inline bool cxl_pci_flit_256(struct pci_dev *pdev)
>>>> +{
>>>> +	u32 lnksta2;
>>>> +
>>>> +	pcie_capability_read_dword(pdev, PCI_EXP_LNKSTA2, &lnksta2);
>>>> +	return lnksta2 & BIT(10);
>>>
>>> Add a #define for the bit.
>>
>> ok will add.
>>
>>>
>>> AFAICT, the PCIe spec defines this bit, and it only indicates the link
>>> is or will be operating in Flit Mode; it doesn't actually say anything
>>> about how large the flits are.  I suppose that's because PCIe only
>>> talks about 256B flits, not 66B ones?
>>
>> Looking at CXL v1.0 rev3.0 6.2.3 "256B Flit Mode", table 6-4, it shows that
>> when PCIe Flit Mode is set, then CXL is in 256B flits mode, otherwise, it is
>> 68B flits. So an assumption is made here regarding the flit side based on
>> the table.
> 
> So reading PCI_EXP_LNKSTA2 and extracting the Flit Mode bit is
> PCIe-generic, but the interpretation of "PCIe Flit Mode not enabled
> means 68-byte flits" is CXL-specific?
> 
> This sounds wrong, but I don't know quite how.  How would the PCI core
> manage links where Flit Mode being cleared really means Flit Mode is
> *enabled* but with a different size?  Seems like something could go
> wrong there.

Looking at the PCIe base spec and the CXL spec, that seemed to be the 
only way to infer the flit size for a CXL device, as far as I can 
tell. I've yet to find a better way to make that determination. Dan?


> 
> Bjorn

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 01/18] cxl: Export QTG ids from CFMWS to sysfs
  2023-02-06 20:49 ` [PATCH 01/18] cxl: Export QTG ids from CFMWS to sysfs Dave Jiang
@ 2023-02-09 11:15   ` Jonathan Cameron
  2023-02-09 17:28     ` Dave Jiang
  0 siblings, 1 reply; 65+ messages in thread
From: Jonathan Cameron @ 2023-02-09 11:15 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-pci, linux-acpi, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, rafael, bhelgaas, robert.moore

On Mon, 06 Feb 2023 13:49:30 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> Export the QoS Throttling Group ID from the CXL Fixed Memory Window
> Structure (CFMWS) under the root decoder sysfs attributes.
> CXL rev3.0 9.17.1.3 CXL Fixed Memory Window Structure (CFMWS)
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>

Hi Dave,


I've no objection to this, but it would be good to say why this
might be of use to userspace.  What tooling needs it?

One comment on docs inline. With those two things tidied up
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>


> ---
>  Documentation/ABI/testing/sysfs-bus-cxl |    7 +++++++
>  drivers/cxl/acpi.c                      |    3 +++
>  drivers/cxl/core/port.c                 |   14 ++++++++++++++
>  drivers/cxl/cxl.h                       |    3 +++
>  4 files changed, 27 insertions(+)
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
> index 8494ef27e8d2..0932c2f6fbf4 100644
> --- a/Documentation/ABI/testing/sysfs-bus-cxl
> +++ b/Documentation/ABI/testing/sysfs-bus-cxl
> @@ -294,6 +294,13 @@ Description:
>  		(WO) Write a string in the form 'regionZ' to delete that region,
>  		provided it is currently idle / not bound to a driver.
>  
> +What:		/sys/bus/cxl/devices/decoderX.Y/qtg_id
> +Date:		Jan, 2023
> +KernelVersion:	v6.3
> +Contact:	linux-cxl@vger.kernel.org
> +Description:
> +		(RO) Shows the QoS Throttling Group ID. The QTG ID for a root
> +		decoder comes from the CFMWS structure of the CEDT.

Document the -1 value for no ID in here. Hopefully people will write
their userspace against this document and we want them to know about that
corner case!

>  
>  What:		/sys/bus/cxl/devices/regionZ/uuid
>  Date:		May, 2022
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index 13cde44c6086..7a71bb5041c7 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -289,6 +289,9 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
>  			}
>  		}
>  	}
> +
> +	cxld->qtg_id = cfmws->qtg_id;
> +
>  	rc = cxl_decoder_add(cxld, target_map);
>  err_xormap:
>  	if (rc)
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index b631a0520456..fe78daf7e7c8 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -284,6 +284,16 @@ static ssize_t interleave_ways_show(struct device *dev,
>  
>  static DEVICE_ATTR_RO(interleave_ways);
>  
> +static ssize_t qtg_id_show(struct device *dev,
> +			   struct device_attribute *attr, char *buf)
> +{
> +	struct cxl_decoder *cxld = to_cxl_decoder(dev);
> +
> +	return sysfs_emit(buf, "%d\n", cxld->qtg_id);
> +}
> +
> +static DEVICE_ATTR_RO(qtg_id);
> +
>  static struct attribute *cxl_decoder_base_attrs[] = {
>  	&dev_attr_start.attr,
>  	&dev_attr_size.attr,
> @@ -303,6 +313,7 @@ static struct attribute *cxl_decoder_root_attrs[] = {
>  	&dev_attr_cap_type2.attr,
>  	&dev_attr_cap_type3.attr,
>  	&dev_attr_target_list.attr,
> +	&dev_attr_qtg_id.attr,
>  	SET_CXL_REGION_ATTR(create_pmem_region)
>  	SET_CXL_REGION_ATTR(delete_region)
>  	NULL,
> @@ -1606,6 +1617,7 @@ struct cxl_root_decoder *cxl_root_decoder_alloc(struct cxl_port *port,
>  	}
>  
>  	atomic_set(&cxlrd->region_id, rc);
> +	cxld->qtg_id = CXL_QTG_ID_INVALID;
>  	return cxlrd;
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_root_decoder_alloc, CXL);
> @@ -1643,6 +1655,7 @@ struct cxl_switch_decoder *cxl_switch_decoder_alloc(struct cxl_port *port,
>  
>  	cxld = &cxlsd->cxld;
>  	cxld->dev.type = &cxl_decoder_switch_type;
> +	cxld->qtg_id = CXL_QTG_ID_INVALID;
>  	return cxlsd;
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_switch_decoder_alloc, CXL);
> @@ -1675,6 +1688,7 @@ struct cxl_endpoint_decoder *cxl_endpoint_decoder_alloc(struct cxl_port *port)
>  	}
>  
>  	cxld->dev.type = &cxl_decoder_endpoint_type;
> +	cxld->qtg_id = CXL_QTG_ID_INVALID;
>  	return cxled;
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_endpoint_decoder_alloc, CXL);
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 1b1cf459ac77..f558bbfc0332 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -279,6 +279,7 @@ enum cxl_decoder_type {
>   */
>  #define CXL_DECODER_MAX_INTERLEAVE 16
>  
> +#define CXL_QTG_ID_INVALID	-1
>  
>  /**
>   * struct cxl_decoder - Common CXL HDM Decoder Attributes
> @@ -290,6 +291,7 @@ enum cxl_decoder_type {
>   * @target_type: accelerator vs expander (type2 vs type3) selector
>   * @region: currently assigned region for this decoder
>   * @flags: memory type capabilities and locking
> + * @qtg_id: QoS Throttling Group ID
>   * @commit: device/decoder-type specific callback to commit settings to hw
>   * @reset: device/decoder-type specific callback to reset hw settings
>  */
> @@ -302,6 +304,7 @@ struct cxl_decoder {
>  	enum cxl_decoder_type target_type;
>  	struct cxl_region *region;
>  	unsigned long flags;
> +	int qtg_id;
>  	int (*commit)(struct cxl_decoder *cxld);
>  	int (*reset)(struct cxl_decoder *cxld);
>  };
> 
> 


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 02/18] ACPICA: Export acpi_ut_verify_cdat_checksum()
  2023-02-07 15:47     ` Dave Jiang
@ 2023-02-09 11:30       ` Jonathan Cameron
  0 siblings, 0 replies; 65+ messages in thread
From: Jonathan Cameron @ 2023-02-09 11:30 UTC (permalink / raw)
  To: Dave Jiang
  Cc: Rafael J. Wysocki, linux-cxl, linux-pci, linux-acpi,
	dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	bhelgaas, robert.moore

On Tue, 7 Feb 2023 08:47:58 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> On 2/7/23 7:19 AM, Rafael J. Wysocki wrote:
> > On Mon, Feb 6, 2023 at 9:49 PM Dave Jiang <dave.jiang@intel.com> wrote:  
> >>
> >> Export the CDAT checksum verify function so CXL driver can use it to verify
> >> CDAT coming from the CXL devices.
> >>
> >> Given that this function isn't actually being used by ACPI internals,
> >> removing the define check of APCI_CHECKSUM_ABORT so the function would
> >> return failure on checksum fail since the driver will need to know.  

This seems likely to cause problems for users of
AcpiUtVerifyCdatChecksum in the upstream ACPICA code, so you may need
to leave that alone.

You will probably want to export a Linux wrapper rather than
the ACPICA function.  That should let you avoid an ACPICA change, I think.
There are no exports from within the ACPICA code.

Jonathan


> > 
> > If you want to make ACPICA changes, please first submit a pull request
> > to the upstream ACPICA project on GitHub.
> > 
> > Having done that, please resubmit the corresponding Linux patch with a
> > Link tag pointing to the upstream PR.  
> 
> Ok will do. Thanks!
> > 
> > Thanks!
> >   
> >> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> >> ---
> >>   drivers/acpi/acpica/utcksum.c |    4 +---
> >>   include/linux/acpi.h          |    7 +++++++
> >>   2 files changed, 8 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/acpi/acpica/utcksum.c b/drivers/acpi/acpica/utcksum.c
> >> index c166e4c05ab6..c0f98c8f9a0b 100644
> >> --- a/drivers/acpi/acpica/utcksum.c
> >> +++ b/drivers/acpi/acpica/utcksum.c
> >> @@ -102,15 +102,13 @@ acpi_ut_verify_cdat_checksum(struct acpi_table_cdat *cdat_table, u32 length)
> >>                                     "should be 0x%2.2X",
> >>                                     acpi_gbl_CDAT, cdat_table->checksum,
> >>                                     checksum));
> >> -
> >> -#if (ACPI_CHECKSUM_ABORT)
> >>                  return (AE_BAD_CHECKSUM);
> >> -#endif
> >>          }
> >>
> >>          cdat_table->checksum = checksum;
> >>          return (AE_OK);
> >>   }
> >> +EXPORT_SYMBOL_GPL(acpi_ut_verify_cdat_checksum);
> >>
> >>   /*******************************************************************************
> >>    *
> >> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
> >> index 5e6a876e17ba..09b44afef7df 100644
> >> --- a/include/linux/acpi.h
> >> +++ b/include/linux/acpi.h
> >> @@ -1504,9 +1504,16 @@ static inline void acpi_init_ffh(void) { }
> >>   #ifdef CONFIG_ACPI
> >>   extern void acpi_device_notify(struct device *dev);
> >>   extern void acpi_device_notify_remove(struct device *dev);
> >> +extern acpi_status
> >> +acpi_ut_verify_cdat_checksum(struct acpi_table_cdat *cdat_table, u32 length);
> >>   #else
> >>   static inline void acpi_device_notify(struct device *dev) { }
> >>   static inline void acpi_device_notify_remove(struct device *dev) { }
> >> +static inline acpi_status
> >> +acpi_ut_verify_cdat_checksum(struct acpi_table_cdat *cdat_table, u32 length)
> >> +{
> >> +       return (AE_NOT_CONFIGURED);
> >> +}
> >>   #endif
> >>
> >>   #endif /*_LINUX_ACPI_H*/
> >>
> >>  


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 03/18] cxl: Add checksum verification to CDAT from CXL
  2023-02-06 20:49 ` [PATCH 03/18] cxl: Add checksum verification to CDAT from CXL Dave Jiang
@ 2023-02-09 11:34   ` Jonathan Cameron
  2023-02-09 17:31     ` Dave Jiang
  0 siblings, 1 reply; 65+ messages in thread
From: Jonathan Cameron @ 2023-02-09 11:34 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore

On Mon, 06 Feb 2023 13:49:48 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> A CDAT table is available from a CXL device. The table is read by the
> driver and cached in software. With the CXL subsystem needing to parse the
> CDAT table, the checksum should be verified. Add checksum verification
> after the CDAT table is read from device.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Hi Dave,

Some comments on this follow on from the previous patch, so they may
not be relevant once you've updated how that is done.

Jonathan

> ---
>  drivers/cxl/core/pci.c |   11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index 57764e9cd19d..a24dac36bedd 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -3,6 +3,7 @@
>  #include <linux/io-64-nonatomic-lo-hi.h>
>  #include <linux/device.h>
>  #include <linux/delay.h>
> +#include <linux/acpi.h>
>  #include <linux/pci.h>
>  #include <linux/pci-doe.h>
>  #include <cxlpci.h>
> @@ -592,6 +593,7 @@ void read_cdat_data(struct cxl_port *port)
>  	struct device *dev = &port->dev;
>  	struct device *uport = port->uport;
>  	size_t cdat_length;
> +	acpi_status status;
>  	int rc;
>  
>  	cdat_doe = find_cdat_doe(uport);
> @@ -620,5 +622,14 @@ void read_cdat_data(struct cxl_port *port)
>  		port->cdat.length = 0;
>  		dev_err(dev, "CDAT data read error\n");
>  	}
> +
> +	status = acpi_ut_verify_cdat_checksum(port->cdat.table, port->cdat.length);
> +	if (status != AE_OK) {

if (ACPI_FAILURE(acpi_ut...))  or, better still, put that in the wrapper I suggested
in the previous patch so that we have normal kernel return-code handling out here.


> +		/* Don't leave table data allocated on error */
> +		devm_kfree(dev, port->cdat.table);
> +		port->cdat.table = NULL;
> +		port->cdat.length = 0;

I'd rather see us manipulate local copies of cdat_length and cdat_table,
and then only assign them to the port->cdat fields on the success path,
rather than setting them and then unsetting them on error.

The diff will be bigger, but the resulting code will be nicer (and
hopefully the diff won't be too big!)


> +		dev_err(dev, "CDAT data checksum error\n");
> +	}
>  }
>  EXPORT_SYMBOL_NS_GPL(read_cdat_data, CXL);
> 
> 


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 04/18] cxl: Add common helpers for cdat parsing
  2023-02-06 20:49 ` [PATCH 04/18] cxl: Add common helpers for cdat parsing Dave Jiang
@ 2023-02-09 11:58   ` Jonathan Cameron
  2023-02-09 22:57     ` Dave Jiang
  0 siblings, 1 reply; 65+ messages in thread
From: Jonathan Cameron @ 2023-02-09 11:58 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore

On Mon, 06 Feb 2023 13:49:58 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> Add helper functions to parse the CDAT table and provide a callback to
> parse the sub-table. Helpers are provided for DSMAS and DSLBIS sub-table
> parsing. The code is patterned after the ACPI table parsing helpers.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> ---
>  drivers/cxl/core/Makefile |    1 
>  drivers/cxl/core/cdat.c   |   98 +++++++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/core/cdat.h   |   15 +++++++
>  drivers/cxl/cxl.h         |    9 ++++
>  4 files changed, 123 insertions(+)
>  create mode 100644 drivers/cxl/core/cdat.c
>  create mode 100644 drivers/cxl/core/cdat.h
> 
> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> index 79c7257f4107..438ce27faf77 100644
> --- a/drivers/cxl/core/Makefile
> +++ b/drivers/cxl/core/Makefile
> @@ -10,4 +10,5 @@ cxl_core-y += memdev.o
>  cxl_core-y += mbox.o
>  cxl_core-y += pci.o
>  cxl_core-y += hdm.o
> +cxl_core-y += cdat.o
>  cxl_core-$(CONFIG_CXL_REGION) += region.o
> diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
> new file mode 100644
> index 000000000000..be09c8a690f5
> --- /dev/null
> +++ b/drivers/cxl/core/cdat.c
> @@ -0,0 +1,98 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2023 Intel Corporation. All rights reserved. */
> +#include "cxl.h"
> +#include "cdat.h"
> +
> +static u8 cdat_get_subtable_entry_type(struct cdat_subtable_entry *entry)
> +{
> +	return entry->hdr->type;
> +}

Are these all worthwhile, given that the resulting function name is longer
than accessing the field directly?  If the aim is to move the details of
struct cdat_subtable_entry away from being exposed to the caller, then
fair enough, but if that is the plan I'd expect to see something about
it in the patch description.

Feels like some premature abstraction, but I don't feel particularly
strongly about this.


> +
> +static u16 cdat_get_subtable_entry_length(struct cdat_subtable_entry *entry)
> +{
> +	return entry->hdr->length;
> +}
> +
> +static bool has_handler(struct cdat_subtable_proc *proc)
> +{
> +	return proc->handler;
> +}
> +
> +static int call_handler(struct cdat_subtable_proc *proc,
> +			struct cdat_subtable_entry *ent)
> +{
> +	if (proc->handler)

Use your wrapper...

> +		return proc->handler(ent->hdr, proc->arg);
> +	return -EINVAL;
> +}
> +
> +static int cdat_table_parse_entries(enum acpi_cdat_type type,
> +				    struct acpi_table_cdat *table_header,
> +				    struct cdat_subtable_proc *proc,
> +				    unsigned int max_entries)

Documentation needed.  max_entries wasn't what I was expecting.
I would have expected it to be a cap on the number of entries of
matching type, whereas it seems to be the number of entries of any type.

Also, max_entries == 0 is a non-obvious parameter value.


> +{
> +	struct cdat_subtable_entry entry;
> +	unsigned long table_end, entry_len;
> +	int count = 0;
> +	int rc;
> +
> +	if (!has_handler(proc))
> +		return -EINVAL;
> +
> +	table_end = (unsigned long)table_header + table_header->length;
> +
> +	if (type >= ACPI_CDAT_TYPE_RESERVED)
> +		return -EINVAL;
> +
> +	entry.type = type;
> +	entry.hdr = (struct acpi_cdat_header *)((unsigned long)table_header +
> +					       sizeof(*table_header));

The common idiom for this is:

	entry.hdr = (struct acpi_cdat_header *)(table_header + 1);

> +
> +	while ((unsigned long)entry.hdr < table_end) {
> +		entry_len = cdat_get_subtable_entry_length(&entry);
> +
> +		if ((unsigned long)entry.hdr + entry_len > table_end)
> +			return -EINVAL;
> +
> +		if (max_entries && count >= max_entries)
> +			break;
> +
> +		if (entry_len == 0)
> +			return -EINVAL;
> +
> +		if (cdat_get_subtable_entry_type(&entry) == type) {

This is a little odd, as we set entry.type == type above, but
the match here is on the value in entry.hdr.

That's not particularly intuitive. Not sure what a good solution
would be, though.  Maybe just

		if (cdat_is_subtable_match(&entry))

> +			rc = call_handler(proc, &entry);
> +			if (rc)
> +				return rc;
> +		}

As above.  Maybe intended, but my initial assumption would have been that
count is not incremented unless there was a match (so put the increment in
this if block, not below).

> +
> +		entry.hdr = (struct acpi_cdat_header *)((unsigned long)entry.hdr + entry_len);
> +		count++;
> +	}
> +
> +	return count;
> +}
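The max_entries/count semantics discussed above can be sketched standalone. This is a hypothetical simplification, not the patch code: the header layout is a two-byte stand-in for the real CDAT subtable header, and it shows the variant where only subtables of the requested type are counted, with max_entries == 0 meaning 'no limit':

```c
#include <string.h>

/* Illustrative stand-in for the subtable header, not the ACPI layout. */
struct cdat_hdr {
	unsigned char type;
	unsigned char length;	/* total subtable length, header included */
};

/*
 * Walk a buffer of variable-length subtables, counting only those whose
 * type matches; max_entries caps the number of *matching* entries
 * processed (0 means no limit).  Returns the count, or -1 if a subtable
 * is malformed (zero length or running past the end of the buffer).
 */
static int count_matching(const unsigned char *buf, size_t len,
			  unsigned char type, unsigned int max_entries)
{
	size_t off = 0;
	int count = 0;

	while (off + sizeof(struct cdat_hdr) <= len) {
		struct cdat_hdr h;

		memcpy(&h, buf + off, sizeof(h));	/* avoid alignment issues */
		if (h.length < sizeof(h) || off + h.length > len)
			return -1;
		if (h.type == type) {
			/* a per-entry handler would be called here */
			count++;
			if (max_entries && count >= (int)max_entries)
				break;
		}
		off += h.length;
	}

	return count;
}
```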
> +
> +int cdat_table_parse_dsmas(void *table, cdat_tbl_entry_handler handler, void *arg)
> +{
> +	struct acpi_table_cdat *header = (struct acpi_table_cdat *)table;

Now that struct acpi_table_cdat exists, maybe just move to using
that type for all references.  That will make a mess of the range-checking
efforts the hardening folk are working on, though, as we will index off the
end of it and it doesn't have a trailing variable-length array element.

Random musing follows...
We could add a variable-length element to that struct
definition and the magic to associate it with the length parameter,
and get range protection if the relevant hardening is turned on.

The structure definition comes (I think) from scripts in ACPICA, so
we would need to push such changes into ACPICA, and I'm not sure
they will be keen, even though it would be good for the kernel
to have the protections.

https://people.kernel.org/kees/bounded-flexible-arrays-in-c
for Kees Cook's blog on this stuff.  The last bit needs
the 'coming soon' part.

> +	struct cdat_subtable_proc proc = {
> +		.handler	= handler,
> +		.arg		= arg,
> +	};
> +
> +	return cdat_table_parse_entries(ACPI_CDAT_TYPE_DSMAS, header, &proc, 0);
> +}
> +EXPORT_SYMBOL_NS_GPL(cdat_table_parse_dsmas, CXL);
> +
> +int cdat_table_parse_dslbis(void *table, cdat_tbl_entry_handler handler, void *arg)
> +{
> +	struct acpi_table_cdat *header = (struct acpi_table_cdat *)table;
> +	struct cdat_subtable_proc proc = {
> +		.handler	= handler,
> +		.arg		= arg,
> +	};
> +
> +	return cdat_table_parse_entries(ACPI_CDAT_TYPE_DSLBIS, header, &proc, 0);
> +}
> +EXPORT_SYMBOL_NS_GPL(cdat_table_parse_dslbis, CXL);
> diff --git a/drivers/cxl/core/cdat.h b/drivers/cxl/core/cdat.h
> new file mode 100644
> index 000000000000..f690325e82a6
> --- /dev/null
> +++ b/drivers/cxl/core/cdat.h
> @@ -0,0 +1,15 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/* Copyright(c) 2023 Intel Corporation. */
> +#ifndef __CXL_CDAT_H__
> +#define __CXL_CDAT_H__
> +
> +struct cdat_subtable_proc {
> +	cdat_tbl_entry_handler handler;
> +	void *arg;
> +};
> +
> +struct cdat_subtable_entry {
> +	struct acpi_cdat_header *hdr;
> +	enum acpi_cdat_type type;
> +};
> +#endif
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index f558bbfc0332..839a121c1997 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -9,6 +9,7 @@
>  #include <linux/bitops.h>
>  #include <linux/log2.h>
>  #include <linux/io.h>
> +#include <linux/acpi.h>
>  
>  /**
>   * DOC: cxl objects
> @@ -697,6 +698,14 @@ static inline struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev)
>  }
>  #endif
>  
> +typedef int (*cdat_tbl_entry_handler)(struct acpi_cdat_header *header, void *arg);
> +
> +u8 cdat_table_checksum(u8 *buffer, u32 length);
> +int cdat_table_parse_dsmas(void *table, cdat_tbl_entry_handler handler,
> +			   void *arg);
> +int cdat_table_parse_dslbis(void *table, cdat_tbl_entry_handler handler,
> +			    void *arg);
> +
>  /*
>   * Unit test builds overrides this to __weak, find the 'strong' version
>   * of these symbols in tools/testing/cxl/.
> 
> 


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 06/18] cxl: Add callback to parse the DSMAS subtables from CDAT
  2023-02-06 20:50 ` [PATCH 06/18] cxl: Add callback to parse the DSMAS subtables from CDAT Dave Jiang
@ 2023-02-09 13:29   ` Jonathan Cameron
  2023-02-13 22:55     ` Dave Jiang
  0 siblings, 1 reply; 65+ messages in thread
From: Jonathan Cameron @ 2023-02-09 13:29 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore

On Mon, 06 Feb 2023 13:50:15 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> Provide a callback function to the CDAT parser in order to parse the Device
> Scoped Memory Affinity Structure (DSMAS). Each DSMAS structure contains the
> DPA range and its associated attributes in each entry. See the CDAT
> specification for details.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Hi Dave,

A few minor questions / comments inline,

Jonathan

> ---
>  drivers/cxl/core/cdat.c |   25 +++++++++++++++++++++++++
>  drivers/cxl/core/port.c |    2 ++
>  drivers/cxl/cxl.h       |   11 +++++++++++
>  drivers/cxl/port.c      |    8 ++++++++
>  4 files changed, 46 insertions(+)
> 
> diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
> index be09c8a690f5..f9a64a0f1ee4 100644
> --- a/drivers/cxl/core/cdat.c
> +++ b/drivers/cxl/core/cdat.c
> @@ -96,3 +96,28 @@ int cdat_table_parse_dslbis(void *table, cdat_tbl_entry_handler handler, void *a
>  	return cdat_table_parse_entries(ACPI_CDAT_TYPE_DSLBIS, header, &proc, 0);
>  }
>  EXPORT_SYMBOL_NS_GPL(cdat_table_parse_dslbis, CXL);
> +
> +int cxl_dsmas_parse_entry(struct acpi_cdat_header *header, void *arg)
> +{
> +	struct cxl_port *port = (struct cxl_port *)arg;
> +	struct dsmas_entry *dent;
> +	struct acpi_cdat_dsmas *dsmas;
> +
> +	if (header->type != ACPI_CDAT_TYPE_DSMAS)
> +		return -EINVAL;
> +
> +	dent = devm_kzalloc(&port->dev, sizeof(*dent), GFP_KERNEL);
> +	if (!dent)
> +		return -ENOMEM;
> +
> +	dsmas = (struct acpi_cdat_dsmas *)((unsigned long)header + sizeof(*header));

I'd prefer header + 1


> +	dent->handle = dsmas->dsmad_handle;
> +	dent->dpa_range.start = dsmas->dpa_base_address;
> +	dent->dpa_range.end = dsmas->dpa_base_address + dsmas->dpa_length - 1;
> +
> +	mutex_lock(&port->cdat.dsmas_lock);
> +	list_add_tail(&dent->list, &port->cdat.dsmas_list);
> +	mutex_unlock(&port->cdat.dsmas_lock);
> +	return 0;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_dsmas_parse_entry, CXL);
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index fe78daf7e7c8..2b27319cfd42 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -660,6 +660,8 @@ static struct cxl_port *cxl_port_alloc(struct device *uport,
>  	device_set_pm_not_required(dev);
>  	dev->bus = &cxl_bus_type;
>  	dev->type = &cxl_port_type;
> +	INIT_LIST_HEAD(&port->cdat.dsmas_list);
> +	mutex_init(&port->cdat.dsmas_lock);
>  
>  	return port;
>  
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 839a121c1997..1e5e69f08480 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -8,6 +8,7 @@
>  #include <linux/bitfield.h>
>  #include <linux/bitops.h>
>  #include <linux/log2.h>
> +#include <linux/list.h>
>  #include <linux/io.h>
>  #include <linux/acpi.h>
>  
> @@ -520,6 +521,8 @@ struct cxl_port {
>  	struct cxl_cdat {
>  		void *table;
>  		size_t length;
> +		struct list_head dsmas_list;
> +		struct mutex dsmas_lock; /* lock for dsmas_list */

I'm curious, what might race with the dsmas_list changing and hence what is lock for?

>  	} cdat;
>  	bool cdat_available;
>  };
> @@ -698,6 +701,12 @@ static inline struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev)
>  }
>  #endif
>  
> +struct dsmas_entry {
> +	struct list_head list;
> +	struct range dpa_range;
> +	u16 handle;

handle is 1 byte in the spec. Why larger here?

> +};
> +
>  typedef int (*cdat_tbl_entry_handler)(struct acpi_cdat_header *header, void *arg);
>  
>  u8 cdat_table_checksum(u8 *buffer, u32 length);
> @@ -706,6 +715,8 @@ int cdat_table_parse_dsmas(void *table, cdat_tbl_entry_handler handler,
>  int cdat_table_parse_dslbis(void *table, cdat_tbl_entry_handler handler,
>  			    void *arg);
>  
> +int cxl_dsmas_parse_entry(struct acpi_cdat_header *header, void *arg);
> +
>  /*
>   * Unit test builds overrides this to __weak, find the 'strong' version
>   * of these symbols in tools/testing/cxl/.
> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
> index 5453771bf330..b1da73e99bab 100644
> --- a/drivers/cxl/port.c
> +++ b/drivers/cxl/port.c
> @@ -61,6 +61,14 @@ static int cxl_port_probe(struct device *dev)
>  		if (rc)
>  			return rc;
>  
> +		if (port->cdat.table) {
> +			rc = cdat_table_parse_dsmas(port->cdat.table,
> +						    cxl_dsmas_parse_entry,
> +						    (void *)port);
> +			if (rc < 0)
> +				dev_dbg(dev, "Failed to parse DSMAS: %d\n", rc);
> +		}
> +
>  		rc = cxl_hdm_decode_init(cxlds, cxlhdm);
>  		if (rc)
>  			return rc;
> 
> 



* Re: [PATCH 07/18] cxl: Add callback to parse the DSLBIS subtable from CDAT
  2023-02-06 20:50 ` [PATCH 07/18] cxl: Add callback to parse the DSLBIS subtable " Dave Jiang
@ 2023-02-09 13:50   ` Jonathan Cameron
  2023-02-14  0:24     ` Dave Jiang
  0 siblings, 1 reply; 65+ messages in thread
From: Jonathan Cameron @ 2023-02-09 13:50 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore

On Mon, 06 Feb 2023 13:50:23 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> Provide a callback to parse the Device Scoped Latency and Bandwidth
> Information Structure (DSLBIS) in the CDAT structures. The DSLBIS
> contains the bandwidth and latency information that's tied to a DSMAS
> handle. The driver will retrieve the read and write latency and
> bandwidth associated with the DSMAS which is tied to a DPA range.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
A few comments inline,

Thanks,

Jonathan

> ---
>  drivers/cxl/core/cdat.c |   34 ++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxl.h       |    2 ++
>  drivers/cxl/port.c      |    9 ++++++++-
>  include/acpi/actbl1.h   |    5 +++++
>  4 files changed, 49 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
> index f9a64a0f1ee4..3c8f3956487e 100644
> --- a/drivers/cxl/core/cdat.c
> +++ b/drivers/cxl/core/cdat.c
> @@ -121,3 +121,37 @@ int cxl_dsmas_parse_entry(struct acpi_cdat_header *header, void *arg)
>  	return 0;
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_dsmas_parse_entry, CXL);
> +
> +int cxl_dslbis_parse_entry(struct acpi_cdat_header *header, void *arg)
> +{
> +	struct cxl_port *port = (struct cxl_port *)arg;
> +	struct dsmas_entry *dent;
> +	struct acpi_cdat_dslbis *dslbis;

Perhaps reorder to maintain the pretty upside-down Christmas trees
(I don't care :)

> +	u64 val;
> +
> +	if (header->type != ACPI_CDAT_TYPE_DSLBIS)
> +		return -EINVAL;

Isn't this guaranteed by the caller?  Seems overkill to do it twice,
and I don't think these will ever be called outside of the wrapper that
loops over the entries. I could be wrong though!

> +
> +	dslbis = (struct acpi_cdat_dslbis *)((unsigned long)header + sizeof(*header));
header + 1

> +	if ((dslbis->flags & ACPI_CEDT_DSLBIS_MEM_MASK) !=

This field 'must be ignored' if the DSMAS handle isn't a match
(as it's an initiator-only entry).  Odd though it may seem, I think we
might see one of those on a type 3 device, and we are probably going to
have other users of this function anyway.

I think you need to do the walk below to check that we have a DSMAS match
before running this check.

> +	     ACPI_CEDT_DSLBIS_MEM_MEMORY)
> +		return 0;
> +
> +	if (dslbis->data_type > ACPI_HMAT_WRITE_BANDWIDTH)
> +		return -ENXIO;

This would probably imply a new HMAT spec value, so probably just
log it and ignore rather than error out.

> +
> +	/* Value calculation with base_unit, see ACPI Spec 6.5 5.2.28.4 */
> +	val = dslbis->entry[0] * dslbis->entry_base_unit;

In theory this might overflow, as it's a u64 * u16 multiply.
I doubt it will ever happen in reality, but maybe add a check and a debug
print if it does?
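A sketch of such a guard. In kernel code this would be check_mul_overflow() from <linux/overflow.h>; the GCC/Clang builtin used here is what backs that helper, so the sketch runs standalone (the function name is illustrative, not from the patch):

```c
/*
 * Compute val = base_unit * entry, reporting failure instead of silently
 * wrapping if the u64 multiply would overflow.
 */
static int dslbis_entry_value(unsigned long long base_unit,
			      unsigned short entry,
			      unsigned long long *val)
{
	if (__builtin_mul_overflow(base_unit, (unsigned long long)entry, val))
		return -1;	/* overflowed u64: log and skip this entry */

	return 0;
}
```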

> +
> +	mutex_lock(&port->cdat.dsmas_lock);
> +	list_for_each_entry(dent, &port->cdat.dsmas_list, list) {
> +		if (dslbis->handle == dent->handle) {
> +			dent->qos[dslbis->data_type] = val;
> +			break;
> +		}
> +	}
> +	mutex_unlock(&port->cdat.dsmas_lock);
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_dslbis_parse_entry, CXL);
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 1e5e69f08480..849b22236f1d 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -705,6 +705,7 @@ struct dsmas_entry {
>  	struct list_head list;
>  	struct range dpa_range;
>  	u16 handle;
> +	u64 qos[ACPI_HMAT_WRITE_BANDWIDTH + 1];
>  };
>  
>  typedef int (*cdat_tbl_entry_handler)(struct acpi_cdat_header *header, void *arg);
> @@ -716,6 +717,7 @@ int cdat_table_parse_dslbis(void *table, cdat_tbl_entry_handler handler,
>  			    void *arg);
>  
>  int cxl_dsmas_parse_entry(struct acpi_cdat_header *header, void *arg);
> +int cxl_dslbis_parse_entry(struct acpi_cdat_header *header, void *arg);
>  
>  /*
>   * Unit test builds overrides this to __weak, find the 'strong' version
> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
> index b1da73e99bab..8de311208b37 100644
> --- a/drivers/cxl/port.c
> +++ b/drivers/cxl/port.c
> @@ -65,8 +65,15 @@ static int cxl_port_probe(struct device *dev)
>  			rc = cdat_table_parse_dsmas(port->cdat.table,
>  						    cxl_dsmas_parse_entry,
>  						    (void *)port);
> -			if (rc < 0)
> +			if (rc > 0) {
> +				rc = cdat_table_parse_dslbis(port->cdat.table,
> +							     cxl_dslbis_parse_entry,
> +							     (void *)port);
> +				if (rc <= 0)
> +					dev_dbg(dev, "Failed to parse DSLBIS: %d\n", rc);

If we have entries and they won't parse, I think we should be screaming louder.
dev_warn() would be my preference for this and the one in the previous patch.
Sure we can carry on, but something on the device is not working as expected.

> +			} else {
>  				dev_dbg(dev, "Failed to parse DSMAS: %d\n", rc);
> +			}
>  		}
>  
>  		rc = cxl_hdm_decode_init(cxlds, cxlhdm);
> diff --git a/include/acpi/actbl1.h b/include/acpi/actbl1.h
> index e8297cefde09..ff6092e45196 100644
> --- a/include/acpi/actbl1.h
> +++ b/include/acpi/actbl1.h
> @@ -369,6 +369,11 @@ struct acpi_cdat_dslbis {
>  	u16 reserved2;
>  };
>  
> +/* Flags for subtable above */
> +
> +#define ACPI_CEDT_DSLBIS_MEM_MASK	GENMASK(3, 0)
> +#define ACPI_CEDT_DSLBIS_MEM_MEMORY	0
> +
>  /* Subtable 2: Device Scoped Memory Side Cache Information Structure (DSMSCIS) */
>  
>  struct acpi_cdat_dsmscis {
> 
> 



* Re: [PATCH 08/18] cxl: Add support for _DSM Function for retrieving QTG ID
  2023-02-06 20:50 ` [PATCH 08/18] cxl: Add support for _DSM Function for retrieving QTG ID Dave Jiang
@ 2023-02-09 14:02   ` Jonathan Cameron
  2023-02-14 21:07     ` Dave Jiang
  0 siblings, 1 reply; 65+ messages in thread
From: Jonathan Cameron @ 2023-02-09 14:02 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore

On Mon, 06 Feb 2023 13:50:33 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> CXL spec v3.0 9.17.3 CXL Root Device Specific Methods (_DSM)
> 
> Add support to retrieve QTG ID via ACPI _DSM call. The _DSM call requires
> an input of an ACPI package with 4 dwords (read latency, write latency,
> read bandwidth, write bandwidth). The call returns a package with 1 WORD
> that provides the max supported QTG ID and a package that may contain 0 or
> more WORDs as the recommended QTG IDs in the recommended order.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
A few minor bits inline.

Jonathan

> ---
>  drivers/cxl/core/Makefile |    1 
>  drivers/cxl/core/acpi.c   |   99 +++++++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxl.h         |   15 +++++++
>  3 files changed, 115 insertions(+)
>  create mode 100644 drivers/cxl/core/acpi.c
> 
> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> index 438ce27faf77..11ccc2016ab7 100644
> --- a/drivers/cxl/core/Makefile
> +++ b/drivers/cxl/core/Makefile
> @@ -11,4 +11,5 @@ cxl_core-y += mbox.o
>  cxl_core-y += pci.o
>  cxl_core-y += hdm.o
>  cxl_core-y += cdat.o
> +cxl_core-y += acpi.o
>  cxl_core-$(CONFIG_CXL_REGION) += region.o
> diff --git a/drivers/cxl/core/acpi.c b/drivers/cxl/core/acpi.c
> new file mode 100644
> index 000000000000..86dc6c9c1f24
> --- /dev/null
> +++ b/drivers/cxl/core/acpi.c
> @@ -0,0 +1,99 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2022 Intel Corporation. All rights reserved. */
> +#include <linux/module.h>
> +#include <linux/device.h>
> +#include <linux/kernel.h>
> +#include <linux/acpi.h>
> +#include <linux/pci.h>
> +#include <asm/div64.h>
> +#include "cxlpci.h"
> +#include "cxl.h"
> +
> +const guid_t acpi_cxl_qtg_id_guid =
> +	GUID_INIT(0xF365F9A6, 0xA7DE, 0x4071,
> +		  0xA6, 0x6A, 0xB4, 0x0C, 0x0B, 0x4F, 0x8E, 0x52);
> +
> +/**
> + * cxl_acpi_evaluate_qtg_dsm - Retrieve QTG ids via ACPI _DSM
> + * @handle: ACPI handle
> + * @input: bandwidth and latency data
> + *
> + * Issue QTG _DSM with accompanied bandwidth and latency data in order to get
> + * the QTG IDs that falls within the performance data.
> + */
> +struct qtg_dsm_output *cxl_acpi_evaluate_qtg_dsm(acpi_handle handle,
> +						 struct qtg_dsm_input *input)
> +{
> +	struct qtg_dsm_output *output;
> +	union acpi_object *out_obj, *out_buf, *pkg, in_buf, in_obj;

Reorder to reverse Xmas tree perhaps.

> +	int len;
> +	int rc;
Might as well put those on one line.

> +
> +	in_obj.type = ACPI_TYPE_PACKAGE;
> +	in_obj.package.count = 1;
> +	in_obj.package.elements = &in_buf;
> +	in_buf.type = ACPI_TYPE_BUFFER;
> +	in_buf.buffer.pointer = (u8 *)input;
> +	in_buf.buffer.length = sizeof(u32) * 4;
C99-style designated initializers are nicer to read:

	union acpi_object in_obj = {
		.type = ACPI_TYPE_PACKAGE,
		.package.count = 1,
		.package.elements = &in_buf,
	};
> +
> +	out_obj = acpi_evaluate_dsm(handle, &acpi_cxl_qtg_id_guid, 1, 1, &in_obj);
> +	if (!out_obj)
> +		return ERR_PTR(-ENXIO);
> +
> +	if (out_obj->type != ACPI_TYPE_PACKAGE) {
> +		rc = -ENXIO;
> +		goto err;
> +	}
> +
> +	/*
> +	 * CXL spec v3.0 9.17.3.1
> +	 * There should be 2 elements in the package. 1 WORD for max QTG ID supported
> +	 * by the platform, and the other a package of recommended QTGs
> +	 */
> +	if (out_obj->package.count != 2) {

This stuff is usually designed to be extensible - that tends to be explicitly
allowed in the ACPI spec (not mentioned AFAICT in the CXL spec).  So I'd be
tempted to allow more than 2 elements and just not read the extras:

	if (out_obj->package.count < 2) {
> +		rc = -ENXIO;
> +		goto err;
> +	}
> +
> +	pkg = &out_obj->package.elements[1];
> +	if (pkg->type != ACPI_TYPE_PACKAGE) {
> +		rc = -ENXIO;
> +		goto err;
> +	}
> +
> +	out_buf = &pkg->package.elements[0];
> +	if (out_buf->type != ACPI_TYPE_BUFFER) {
> +		rc = -ENXIO;
> +		goto err;
> +	}
> +
> +	len = out_buf->buffer.length;
> +	output = kmalloc(len + sizeof(*output), GFP_KERNEL);
> +	if (!output) {
> +		rc = -ENOMEM;
> +		goto err;
> +	}
> +
> +	/* It's legal to have 0 QTG entries */
> +	if (len == 0) {
> +		output->nr = 0;
> +		goto out;
> +	}
> +
> +	/* Malformed package, not multiple of WORD size */
> +	if (len % sizeof(u16)) {
> +		rc = -ENXIO;
> +		goto out;
> +	}
> +
> +	output->nr = len / sizeof(u16);
> +	memcpy(output->qtg_ids, out_buf->buffer.pointer, len);

Worth checking them against the Max Supported QTG ID as provided in the
outer package?  Obviously if they are greater than that there is
a bug, but meh.
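A sketch of the suggested sanity check, with illustrative names (not from the patch):

```c
/*
 * Every recommended QTG ID returned by the _DSM should be <= the max
 * supported QTG ID reported in the outer package; anything larger
 * indicates a firmware bug.
 */
static int qtg_ids_valid(const unsigned short *ids, int nr,
			 unsigned short max_id)
{
	int i;

	for (i = 0; i < nr; i++)
		if (ids[i] > max_id)
			return 0;	/* ID beyond the advertised maximum */

	return 1;
}
```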

> +out:
> +	ACPI_FREE(out_obj);
> +	return output;
> +
> +err:
> +	ACPI_FREE(out_obj);
> +	return ERR_PTR(rc);
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_acpi_evaluate_qtg_dsm, CXL);
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 849b22236f1d..e70df07f9b4b 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -719,6 +719,21 @@ int cdat_table_parse_dslbis(void *table, cdat_tbl_entry_handler handler,
>  int cxl_dsmas_parse_entry(struct acpi_cdat_header *header, void *arg);
>  int cxl_dslbis_parse_entry(struct acpi_cdat_header *header, void *arg);
>  
> +struct qtg_dsm_input {
> +	u32 rd_lat;
> +	u32 wr_lat;
> +	u32 rd_bw;
> +	u32 wr_bw;
> +};
> +
> +struct qtg_dsm_output {
> +	int nr;
> +	u16 qtg_ids[];
> +};
> +
> +struct qtg_dsm_output *cxl_acpi_evaluate_qtg_dsm(acpi_handle handle,
> +						 struct qtg_dsm_input *input);
> +
>  /*
>   * Unit test builds overrides this to __weak, find the 'strong' version
>   * of these symbols in tools/testing/cxl/.
> 
> 



* Re: [PATCH 09/18] cxl: Add helper function to retrieve ACPI handle of CXL root device
  2023-02-06 20:50 ` [PATCH 09/18] cxl: Add helper function to retrieve ACPI handle of CXL root device Dave Jiang
@ 2023-02-09 14:10   ` Jonathan Cameron
  2023-02-14 21:29     ` Dave Jiang
  0 siblings, 1 reply; 65+ messages in thread
From: Jonathan Cameron @ 2023-02-09 14:10 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore

On Mon, 06 Feb 2023 13:50:42 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> Provide a helper to find the ACPI0017 device in order to issue the _DSM.
> The helper will take the 'struct device' from a cxl_port and iterate until
> the root device is reached. The ACPI handle will be returned from the root
> device.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> ---
>  drivers/cxl/core/acpi.c |   30 ++++++++++++++++++++++++++++++
>  drivers/cxl/cxl.h       |    1 +
>  2 files changed, 31 insertions(+)
> 
> diff --git a/drivers/cxl/core/acpi.c b/drivers/cxl/core/acpi.c
> index 86dc6c9c1f24..05fcd4751619 100644
> --- a/drivers/cxl/core/acpi.c
> +++ b/drivers/cxl/core/acpi.c
> @@ -5,6 +5,7 @@
>  #include <linux/kernel.h>
>  #include <linux/acpi.h>
>  #include <linux/pci.h>
> +#include <linux/platform_device.h>
>  #include <asm/div64.h>
>  #include "cxlpci.h"
>  #include "cxl.h"
> @@ -13,6 +14,35 @@ const guid_t acpi_cxl_qtg_id_guid =
>  	GUID_INIT(0xF365F9A6, 0xA7DE, 0x4071,
>  		  0xA6, 0x6A, 0xB4, 0x0C, 0x0B, 0x4F, 0x8E, 0x52);
>  
> +/**
> + * cxl_acpi_get_root_acpi_handle - get the ACPI handle of the CXL root device
> + * @dev: 'struct device' to start searching from. Should be from cxl_port->dev.
> + * Looks for the ACPI0017 device and return the ACPI handle
> + **/

Inconsistent comment style.

> +acpi_handle cxl_acpi_get_rootdev_handle(struct device *dev)
> +{
> +	struct device *itr = dev, *root_dev;

Not nice for readability to have an assignment in a list of definitions
all on the same line.

> +	acpi_handle handle;
> +
> +	if (!dev)
> +		return ERR_PTR(-EINVAL);
> +
> +	while (itr->parent) {
> +		root_dev = itr;
> +		itr = itr->parent;
> +	}
> +
> +	if (!dev_is_platform(root_dev))
> +		return ERR_PTR(-ENODEV);
> +
> +	handle = ACPI_HANDLE(root_dev);
> +	if (!handle)
> +		return ERR_PTR(-ENODEV);
> +
> +	return handle;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_acpi_get_rootdev_handle, CXL);
> +
>  /**
>   * cxl_acpi_evaluate_qtg_dsm - Retrieve QTG ids via ACPI _DSM
>   * @handle: ACPI handle
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index e70df07f9b4b..ac6ea550ab0a 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -733,6 +733,7 @@ struct qtg_dsm_output {
>  
>  struct qtg_dsm_output *cxl_acpi_evaluate_qtg_dsm(acpi_handle handle,
>  						 struct qtg_dsm_input *input);
> +acpi_handle cxl_acpi_get_rootdev_handle(struct device *dev);
>  
>  /*
>   * Unit test builds overrides this to __weak, find the 'strong' version
> 
> 



* Re: [PATCH 12/18] cxl: Add helpers to calculate pci latency for the CXL device
  2023-02-08 23:56         ` Dave Jiang
@ 2023-02-09 15:10           ` Jonathan Cameron
  2023-02-14 22:22             ` Dave Jiang
  0 siblings, 1 reply; 65+ messages in thread
From: Jonathan Cameron @ 2023-02-09 15:10 UTC (permalink / raw)
  To: Dave Jiang
  Cc: Bjorn Helgaas, linux-cxl, linux-pci, linux-acpi, dan.j.williams,
	ira.weiny, vishal.l.verma, alison.schofield, rafael, bhelgaas,
	robert.moore

On Wed, 8 Feb 2023 16:56:30 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> On 2/8/23 3:15 PM, Bjorn Helgaas wrote:
> > On Tue, Feb 07, 2023 at 01:51:17PM -0700, Dave Jiang wrote:  
> >>
> >>
> >> On 2/6/23 3:39 PM, Bjorn Helgaas wrote:  
> >>> On Mon, Feb 06, 2023 at 01:51:10PM -0700, Dave Jiang wrote:  
> >>>> The latency is calculated by dividing the FLIT size over the
> >>>> bandwidth. Add support to retrieve the FLIT size for the CXL
> >>>> device and calculate the latency of the downstream link.  
> >   
> >>> I guess you only care about the latency of a single link, not the
> >>> entire path?  
> >>
> >> I am adding each of the link individually together in the next
> >> patch. Are you suggesting a similar function like
> >> pcie_bandwidth_available() but for latency for the entire path?  
> > 
> > Only a clarifying question.
> >   
> >>>> +static int cxl_get_flit_size(struct pci_dev *pdev)
> >>>> +{
> >>>> +	if (cxl_pci_flit_256(pdev))
> >>>> +		return 256;
> >>>> +
> >>>> +	return 66;  
> >>>
> >>> I don't know about the 66-byte flit format, maybe this part is
> >>> CXL-specific?  
> >>
> >> 68-byte flit format. Looks like this is a typo from me.  
> > 
> > This part must be CXL-specific, since I don't think PCIe mentions
> > 68-byte flits.
> >   
> >>>> + * The table indicates that if PCIe Flit Mode is set, then CXL is in 256B flits
> >>>> + * mode, otherwise it's 68B flits mode.
> >>>> + */
> >>>> +static inline bool cxl_pci_flit_256(struct pci_dev *pdev)
> >>>> +{
> >>>> +	u32 lnksta2;
> >>>> +
> >>>> +	pcie_capability_read_dword(pdev, PCI_EXP_LNKSTA2, &lnksta2);
> >>>> +	return lnksta2 & BIT(10);  
> >>>
> >>> Add a #define for the bit.  
> >>
> >> ok will add.
> >>  
> >>>
> >>> AFAICT, the PCIe spec defines this bit, and it only indicates the link
> >>> is or will be operating in Flit Mode; it doesn't actually say anything
> >>> about how large the flits are.  I suppose that's because PCIe only
> >>> talks about 256B flits, not 66B ones?  
> >>
> >> Looking at CXL v1.0 rev3.0 6.2.3 "256B Flit Mode", table 6-4, it shows that
> >> when PCIe Flit Mode is set, then CXL is in 256B flits mode, otherwise, it is
> >> 68B flits. So an assumption is made here regarding the flit side based on
> >> the table.  
> > 
> > So reading PCI_EXP_LNKSTA2 and extracting the Flit Mode bit is
> > PCIe-generic, but the interpretation of "PCIe Flit Mode not enabled
> > means 68-byte flits" is CXL-specific?
> > 
> > This sounds wrong, but I don't know quite how.  How would the PCI core
> > manage links where Flit Mode being cleared really means Flit Mode is
> > *enabled* but with a different size?  Seems like something could go
> > wrong there.  
> 
Looking at the PCIe base spec and the CXL spec, that seemed to be the
only way to infer the flit size for a CXL device, as far as I can tell.
I've yet to find a better way to make that determination. Dan?

So a given CXL port has either trained up in:
* normal PCI (in which case all the normal PCI stuff applies), and we'll
  fail some of the other checks in the CXL driver and never get here
  - I 'think' the driver will load for the PCI device to enable things
  like firmware upgrade, but we won't register the CXL port devices
  that ultimately call this stuff.
  It's perfectly possible to have a driver that will cope with this,
  but it's pretty meaningless for most of the CXL type 3 driver.
* 68 byte flit (which was CXL precursor to PCI going flit based)
  Can be queried via CXL DVSEC Flex Bus Port Status CXL r3.0 8.2.1.3.3
* 256 byte flits (may or may not be compatible with PCIe ones as there
  are some optional latency optimizations)

So if the 68-byte flit mode is enabled, the 256-byte one should never be,
and the CXL description overrides the old PCIe one.

Hence I think we should have an additional check on the Flex Bus DVSEC,
even though it should be consistent with your assumption above.

Hmm. That does raise the question of how we take the latency-optimized
flits into account, or indeed some of the other latency-impacting things
that may or may not be running - IDE in its various modes, for example.

For latency optimized, we can query the relevant bit in the Flex Bus Port
Status.  IDE info will be somewhere, I guess, though I have no idea if there
is a way to know the latency impact.

Jonathan

> 
> 
> > 
> > Bjorn  



* Re: [PATCH 12/18] cxl: Add helpers to calculate pci latency for the CXL device
  2023-02-06 20:51 ` [PATCH 12/18] cxl: Add helpers to calculate pci latency for the CXL device Dave Jiang
  2023-02-06 22:39   ` Bjorn Helgaas
@ 2023-02-09 15:16   ` Jonathan Cameron
  1 sibling, 0 replies; 65+ messages in thread
From: Jonathan Cameron @ 2023-02-09 15:16 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore

On Mon, 06 Feb 2023 13:51:10 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> The latency is calculated by dividing the FLIT size over the bandwidth. Add
> support to retrieve the FLIT size for the CXL device and calculate the
> latency of the downstream link.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>

I'd like to see some approximate numbers in this patch description. What
sort of level is each component at?  It's hard to be sure the neglected parts
don't matter without that sort of back-of-the-envelope number.

Jonathan

> ---
>  drivers/cxl/core/pci.c |   67 ++++++++++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxlpci.h   |   14 ++++++++++
>  2 files changed, 81 insertions(+)
> 
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index a24dac36bedd..54ac6f8825ff 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -633,3 +633,70 @@ void read_cdat_data(struct cxl_port *port)
>  	}
>  }
>  EXPORT_SYMBOL_NS_GPL(read_cdat_data, CXL);
> +
> +static int pcie_speed_to_mbps(enum pci_bus_speed speed)
> +{
> +	switch (speed) {
> +	case PCIE_SPEED_2_5GT:
> +		return 2500;
> +	case PCIE_SPEED_5_0GT:
> +		return 5000;
> +	case PCIE_SPEED_8_0GT:
> +		return 8000;
> +	case PCIE_SPEED_16_0GT:
> +		return 16000;
> +	case PCIE_SPEED_32_0GT:
> +		return 32000;
> +	case PCIE_SPEED_64_0GT:
> +		return 64000;
> +	default:
> +		break;
> +	}
> +
> +	return -EINVAL;
> +}
> +
> +static int cxl_pci_mbits_to_mbytes(struct pci_dev *pdev)
> +{
> +	int mbits;
> +
> +	mbits = pcie_speed_to_mbps(pcie_get_speed(pdev));
> +	if (mbits < 0)
> +		return mbits;
> +
> +	return mbits >> 3;
> +}
> +
> +static int cxl_get_flit_size(struct pci_dev *pdev)
> +{
> +	if (cxl_pci_flit_256(pdev))
> +		return 256;
> +
> +	return 66;
> +}
> +
> +/**
> + * cxl_pci_get_latency - calculate the link latency for the PCIe link
> + * @pdev - PCI device
> + *
> + * CXL Memory Device SW Guide v1.0 2.11.4 Link latency calculation
> + * Link latency = LinkPropagationLatency + FlitLatency + RetimerLatency
> + * LinkProgationLatency is negligible, so 0 will be used
> + * RetimerLatency is assumed to be neglibible and 0 will be used
> + * FlitLatency = FlitSize / LinkBandwidth
> + * FlitSize is defined by spec. CXL v3.0 4.2.1.
> + * 68B flit is used up to 32GT/s. >32GT/s, 256B flit size is used.
> + * The FlitLatency is converted to pico-seconds.
> + */
> +long cxl_pci_get_latency(struct pci_dev *pdev)
> +{
> +	long bw, flit_size;
> +
> +	bw = cxl_pci_mbits_to_mbytes(pdev);
> +	if (bw < 0)
> +		return bw;
> +
> +	flit_size = cxl_get_flit_size(pdev);

So, if latency optimized, it's 128 bytes (approx)

> +	return flit_size * 1000000L / bw;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_pci_get_latency, CXL);
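Back-of-the-envelope numbers for the FlitLatency = FlitSize / LinkBandwidth formula above, mirroring the arithmetic in the quoted cxl_pci_get_latency() (speed in Mb/s, result in picoseconds). This is a standalone sketch, not the kernel code:

```c
/*
 * flit_bytes: flit size in bytes (68 or 256 per the patch).
 * link_mbps:  per-lane link speed in Mb/s (e.g. 32000 for 32 GT/s).
 * Returns the flit transfer latency in picoseconds, using the same
 * Mb/s -> MB/s shift and ps conversion as the patch.
 */
static long flit_latency_ps(long flit_bytes, long link_mbps)
{
	long mbytes = link_mbps >> 3;	/* link bandwidth in MB/s */

	return flit_bytes * 1000000L / mbytes;
}
```

E.g. a 68B flit at 32 GT/s (~4000 MB/s by this arithmetic) comes out to 17000 ps (17 ns), and a 256B flit at 64 GT/s to 32000 ps (32 ns), which gives a feel for the magnitudes being discussed.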
> diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> index 920909791bb9..d64a3e0458ab 100644
> --- a/drivers/cxl/cxlpci.h
> +++ b/drivers/cxl/cxlpci.h
> @@ -62,8 +62,22 @@ enum cxl_regloc_type {
>  	CXL_REGLOC_RBI_TYPES
>  };
>  
> +/*
> + * CXL v3.0 6.2.3 Table 6-4
> + * The table indicates that if PCIe Flit Mode is set, then CXL is in 256B flits
> + * mode, otherwise it's 68B flits mode.
> + */
> +static inline bool cxl_pci_flit_256(struct pci_dev *pdev)
> +{

As per the other branch of the thread, I'd like to see the 68-byte assumption
confirmed by checking the Flex Bus DVSEC.  Sure, it should always match your
assumption (as we shouldn't be in normal PCI at this stage), but we might be
if this code gets called via other paths than currently intended.


> +	u32 lnksta2;
> +
> +	pcie_capability_read_dword(pdev, PCI_EXP_LNKSTA2, &lnksta2);
> +	return lnksta2 & BIT(10);
> +}
> +
>  int devm_cxl_port_enumerate_dports(struct cxl_port *port);
>  struct cxl_dev_state;
>  int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm);
>  void read_cdat_data(struct cxl_port *port);
> +long cxl_pci_get_latency(struct pci_dev *pdev);
>  #endif /* __CXL_PCI_H__ */
> 
> 


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 13/18] cxl: Add latency and bandwidth calculations for the CXL path
  2023-02-06 20:51 ` [PATCH 13/18] cxl: Add latency and bandwidth calculations for the CXL path Dave Jiang
@ 2023-02-09 15:24   ` Jonathan Cameron
  2023-02-14 23:03     ` Dave Jiang
  0 siblings, 1 reply; 65+ messages in thread
From: Jonathan Cameron @ 2023-02-09 15:24 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore

On Mon, 06 Feb 2023 13:51:19 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> CXL Memory Device SW Guide rev1.0 2.11.2 provides instruction on how to
> caluclate latency and bandwidth for CXL memory device. Calculate minimum

Spell check your descriptions (I often forget to do this as well!)
> bandwidth and total latency for the path from the CXL device to the root
> port. The calculates values are stored in the cached DSMAS entries attached
> to the cxl_port of the CXL device.
> 
> For example for a device that is directly attached to a host bus:
> Total Latency = Device Latency (from CDAT) + Dev to Host Bus (HB) Link
> 		Latency
> Min Bandwidth = Link Bandwidth between Host Bus and CXL device
> 
> For a device that has a switch in between host bus and CXL device:
> Total Latency = Device (CDAT) Latency + Dev to Switch Link Latency +
> 		Switch (CDAT) Latency + Switch to HB Link Latency

For QTG purposes, are we also supposed to take into account HB to
system interconnect latency (or maybe to the nearest CPU)?
That is likely to be non-trivial.

> Min Bandwidth = min(dev to switch bandwidth, switch to HB bandwidth)
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>

Stray sign off.

> 
> The internal latency for a switch can be retrieved from the CDAT of the
> switch PCI device. However, since there's no easy way to retrieve that
> right now on Linux, a guesstimated constant is used per switch to simplify
> the driver code.

I'd like to see that gap closed asap. I think it is fairly obvious how to do
it, so shouldn't be too hard, just needs a dance to get the DOE for a switch
port using Lukas' updated handling of DOE mailboxes. 

> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> ---
>  drivers/cxl/core/port.c |   60 +++++++++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxl.h       |    9 +++++++
>  drivers/cxl/port.c      |   42 +++++++++++++++++++++++++++++++++
>  3 files changed, 111 insertions(+)
> 
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 2b27319cfd42..aa260361ba7d 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -1899,6 +1899,66 @@ bool schedule_cxl_memdev_detach(struct cxl_memdev *cxlmd)
>  }
>  EXPORT_SYMBOL_NS_GPL(schedule_cxl_memdev_detach, CXL);
>  
> +int cxl_port_get_downstream_qos(struct cxl_port *port, long *bw, long *lat)
> +{
> +	long total_lat = 0, latency;

Similar to before, it's not good for readability to hide assignments in a declaration list all on one line.



* Re: [PATCH 14/18] cxl: Wait Memory_Info_Valid before access memory related info
  2023-02-06 20:51 ` [PATCH 14/18] cxl: Wait Memory_Info_Valid before access memory related info Dave Jiang
@ 2023-02-09 15:29   ` Jonathan Cameron
  0 siblings, 0 replies; 65+ messages in thread
From: Jonathan Cameron @ 2023-02-09 15:29 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore

On Mon, 06 Feb 2023 13:51:28 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> CXL rev3.0 8.1.3.8.2 Memory_Info_valid field
> 
> The Memory_Info_Valid bit indicates that the CXL Range Size High and Size
> Low registers are valid. The bit must be set within 1 second of reset
> deassertion to the device. Check valid bit before we check the
> Memory_Active bit when waiting for cxl_await_media_ready() to ensure that
> the memory info is valid for consumption.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>

Fix?

> ---
>  drivers/cxl/core/pci.c |   25 +++++++++++++++++++++++--
>  drivers/cxl/port.c     |   20 ++++++++++----------
>  2 files changed, 33 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index 54ac6f8825ff..79a1348e7b98 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -111,11 +111,32 @@ int cxl_await_media_ready(struct cxl_dev_state *cxlds)
>  	int d = cxlds->cxl_dvsec;
>  	bool active = false;
>  	u64 md_status;
> +	u32 temp;
>  	int rc, i;
>  
> -	for (i = media_ready_timeout; i; i--) {
> -		u32 temp;
> +	/* Check MEM INFO VALID bit first, give up after 1s */
> +	i = 1;
> +	do {
> +		rc = pci_read_config_dword(pdev,
> +					   d + CXL_DVSEC_RANGE_SIZE_LOW(0),
> +					   &temp);
> +		if (rc)
> +			return rc;
>  
> +		active = FIELD_GET(CXL_DVSEC_MEM_INFO_VALID, temp);
> +		if (active)
> +			break;
> +		msleep(1000);
> +	} while (i--);

If HDM_Count > 1, there is a second range to check and I think we
need both to be valid here.



> +
> +	if (!active) {
> +		dev_err(&pdev->dev,
> +			"timeout awaiting memory valid after 1 second.\n");
> +		return -ETIMEDOUT;
> +	}
> +
> +	/* Check MEM ACTIVE bit, up to 60s timeout by default */
> +	for (i = media_ready_timeout; i; i--) {
>  		rc = pci_read_config_dword(
>  			pdev, d + CXL_DVSEC_RANGE_SIZE_LOW(0), &temp);
>  		if (rc)
> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
> index d72e38f9ae44..03380c18fc52 100644
> --- a/drivers/cxl/port.c
> +++ b/drivers/cxl/port.c
> @@ -99,6 +99,16 @@ static int cxl_port_probe(struct device *dev)
>  		if (rc)
>  			return rc;
>  
> +		rc = cxl_hdm_decode_init(cxlds, cxlhdm);
> +		if (rc)
> +			return rc;
> +
> +		rc = cxl_await_media_ready(cxlds);
> +		if (rc) {
> +			dev_err(dev, "Media not active (%d)\n", rc);
> +			return rc;
> +		}
> +
>  		if (port->cdat.table) {
>  			rc = cdat_table_parse_dsmas(port->cdat.table,
>  						    cxl_dsmas_parse_entry,
> @@ -117,16 +127,6 @@ static int cxl_port_probe(struct device *dev)
>  			if (rc)
>  				dev_dbg(dev, "Failed to do QoS calculations\n");
>  		}
> -
> -		rc = cxl_hdm_decode_init(cxlds, cxlhdm);
> -		if (rc)
> -			return rc;
> -
> -		rc = cxl_await_media_ready(cxlds);
> -		if (rc) {
> -			dev_err(dev, "Media not active (%d)\n", rc);
> -			return rc;
> -		}
>  	}
>  
>  	rc = devm_cxl_enumerate_decoders(cxlhdm);
> 
> 



* Re: [PATCH 15/18] cxl: Move identify and partition query from pci probe to port probe
  2023-02-06 20:51 ` [PATCH 15/18] cxl: Move identify and partition query from pci probe to port probe Dave Jiang
@ 2023-02-09 15:29   ` Jonathan Cameron
  0 siblings, 0 replies; 65+ messages in thread
From: Jonathan Cameron @ 2023-02-09 15:29 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore

On Mon, 06 Feb 2023 13:51:37 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> Move the enumeration of device capacity to cxl_port_probe() from
> cxl_pci_probe(). The size and capacity information should be read
> after cxl_await_media_ready() so the data is valid.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Fix?

> ---
>  drivers/cxl/pci.c  |    8 --------
>  drivers/cxl/port.c |    8 ++++++++
>  2 files changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 258004f34281..e35ed250214e 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -484,14 +484,6 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (rc)
>  		return rc;
>  
> -	rc = cxl_dev_state_identify(cxlds);
> -	if (rc)
> -		return rc;
> -
> -	rc = cxl_mem_create_range_info(cxlds);
> -	if (rc)
> -		return rc;
> -
>  	cxlmd = devm_cxl_add_memdev(cxlds);
>  	if (IS_ERR(cxlmd))
>  		return PTR_ERR(cxlmd);
> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
> index 03380c18fc52..b7a4a1be2945 100644
> --- a/drivers/cxl/port.c
> +++ b/drivers/cxl/port.c
> @@ -127,6 +127,14 @@ static int cxl_port_probe(struct device *dev)
>  			if (rc)
>  				dev_dbg(dev, "Failed to do QoS calculations\n");
>  		}
> +
> +		rc = cxl_dev_state_identify(cxlds);
> +		if (rc)
> +			return rc;
> +
> +		rc = cxl_mem_create_range_info(cxlds);
> +		if (rc)
> +			return rc;
>  	}
>  
>  	rc = devm_cxl_enumerate_decoders(cxlhdm);
> 
> 



* Re: [PATCH 16/18] cxl: Move reading of CDAT data from device to after media is ready
  2023-02-06 20:51 ` [PATCH 16/18] cxl: Move reading of CDAT data from device to after media is ready Dave Jiang
  2023-02-06 22:17   ` Lukas Wunner
@ 2023-02-09 15:31   ` Jonathan Cameron
  1 sibling, 0 replies; 65+ messages in thread
From: Jonathan Cameron @ 2023-02-09 15:31 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore

On Mon, 06 Feb 2023 13:51:46 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> The CDAT data is only valid after the hardware signals the media is ready.
> Move the reading to after cxl_await_media_ready() has succeeded.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Fix?  Though I doubt we care about backporting this one, as until
after this patch series CDAT was mostly informational, so hopefully
no one relies on it.

Jonathan

> ---
>  drivers/cxl/port.c |    5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
> index b7a4a1be2945..6b2ad22487f5 100644
> --- a/drivers/cxl/port.c
> +++ b/drivers/cxl/port.c
> @@ -91,9 +91,6 @@ static int cxl_port_probe(struct device *dev)
>  		struct cxl_memdev *cxlmd = to_cxl_memdev(port->uport);
>  		struct cxl_dev_state *cxlds = cxlmd->cxlds;
>  
> -		/* Cache the data early to ensure is_visible() works */
> -		read_cdat_data(port);
> -
>  		get_device(&cxlmd->dev);
>  		rc = devm_add_action_or_reset(dev, schedule_detach, cxlmd);
>  		if (rc)
> @@ -109,6 +106,8 @@ static int cxl_port_probe(struct device *dev)
>  			return rc;
>  		}
>  
> +		/* Cache the data early to ensure is_visible() works */
> +		read_cdat_data(port);
>  		if (port->cdat.table) {
>  			rc = cdat_table_parse_dsmas(port->cdat.table,
>  						    cxl_dsmas_parse_entry,
> 
> 



* Re: [PATCH 17/18] cxl: Attach QTG IDs to the DPA ranges for the device
  2023-02-06 20:51 ` [PATCH 17/18] cxl: Attach QTG IDs to the DPA ranges for the device Dave Jiang
@ 2023-02-09 15:34   ` Jonathan Cameron
  0 siblings, 0 replies; 65+ messages in thread
From: Jonathan Cameron @ 2023-02-09 15:34 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore

On Mon, 06 Feb 2023 13:51:55 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> Match the DPA ranges of the mem device and the calcuated DPA range attached
> to the DSMAS. If a match is found, then assign the QTG ID to the relevant
> DPA range of the memory device.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> ---
>  drivers/cxl/core/mbox.c |    2 ++
>  drivers/cxl/cxlmem.h    |    2 ++
>  drivers/cxl/port.c      |   35 +++++++++++++++++++++++++++++++++++
>  3 files changed, 39 insertions(+)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index b03fba212799..2a7b07d65010 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -869,6 +869,8 @@ struct cxl_dev_state *cxl_dev_state_create(struct device *dev)
>  
>  	mutex_init(&cxlds->mbox_mutex);
>  	cxlds->dev = dev;
> +	cxlds->pmem_qtg_id = -1;
> +	cxlds->ram_qtg_id = -1;
>  
>  	return cxlds;
>  }
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index ab138004f644..d88b88ecc807 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -251,6 +251,8 @@ struct cxl_dev_state {
>  	struct resource dpa_res;
>  	struct resource pmem_res;
>  	struct resource ram_res;
> +	int pmem_qtg_id;
> +	int ram_qtg_id;
>  	u64 total_bytes;
>  	u64 volatile_only_bytes;
>  	u64 persistent_only_bytes;
> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
> index 6b2ad22487f5..c4cee69d6625 100644
> --- a/drivers/cxl/port.c
> +++ b/drivers/cxl/port.c
> @@ -68,6 +68,39 @@ static int cxl_port_qos_calculate(struct cxl_port *port)
>  	return 0;
>  }
>  
> +static bool dpa_match_qtg_range(struct range *dpa, struct range *qtg)
> +{
> +	if (dpa->start >= qtg->start && dpa->end <= qtg->end)
> +		return true;
> +	return false;

	return dpa->start >= qtg->start && dpa->end <= qtg->end;

> +}
> +
> +static void cxl_dev_set_qtg(struct cxl_port *port, struct cxl_dev_state *cxlds)
> +{
> +	struct dsmas_entry *dent;
> +	struct range ram_range = {
> +		.start = cxlds->ram_res.start,
> +		.end = cxlds->ram_res.end,
> +	};
> +	struct range pmem_range =  {
> +		.start = cxlds->pmem_res.start,
> +		.end = cxlds->pmem_res.end,
> +	};
> +
> +	mutex_lock(&port->cdat.dsmas_lock);
> +	list_for_each_entry(dent, &port->cdat.dsmas_list, list) {
> +		if (dpa_match_qtg_range(&ram_range, &dent->dpa_range)) {
> +			cxlds->ram_qtg_id = dent->qtg_id;
> +			break;
> +		}
> +		if (dpa_match_qtg_range(&pmem_range, &dent->dpa_range)) {
> +			cxlds->pmem_qtg_id = dent->qtg_id;

Could be multiple ranges in ram and pmem. I guess we can leave that for
future work.

> +			break;
> +		}
> +	}
> +	mutex_unlock(&port->cdat.dsmas_lock);
> +}
> +
>  static int cxl_port_probe(struct device *dev)
>  {
>  	struct cxl_port *port = to_cxl_port(dev);
> @@ -134,6 +167,8 @@ static int cxl_port_probe(struct device *dev)
>  		rc = cxl_mem_create_range_info(cxlds);
>  		if (rc)
>  			return rc;
> +
> +		cxl_dev_set_qtg(port, cxlds);
>  	}
>  
>  	rc = devm_cxl_enumerate_decoders(cxlhdm);
> 
> 



* Re: [PATCH 18/18] cxl: Export sysfs attributes for device QTG IDs
  2023-02-06 20:52 ` [PATCH 18/18] cxl: Export sysfs attributes for device QTG IDs Dave Jiang
@ 2023-02-09 15:41   ` Jonathan Cameron
  2023-03-23 23:20     ` Dan Williams
  0 siblings, 1 reply; 65+ messages in thread
From: Jonathan Cameron @ 2023-02-09 15:41 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore

On Mon, 06 Feb 2023 13:52:05 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> Export qtg_id sysfs attributes for the respective ram and pmem DPA range of
> a CXL device. The QTG ID should show up as
> /sys/bus/cxl/devices/memX/pmem/qtg_id for pmem or as
> /sys/bus/cxl/devices/memX/ram/qtg_id for ram.

This doesn't extend to devices with say multiple DSMAS regions
for RAM with different access characteristics.  Think of a device
with HBM and DDR for example, or a mix of DDR4 and DDR5.

Once we are dealing with memory pools of significant size there
are very likely to be DPA regions with different characteristics.

So minimum I'd suggest is leave space for an ABI that might look like.

mem/range0_qtg_id
mem/range1_qtg_id
mem/range0_base
mem/range0_length
mem/range1_base
mem/range1_length
etc., but with the flexibility to not present the rangeX_base/length attributes
if there is only one range.  For now, just present range0_qtg_id.

I'm fine if you want to implement multiple ranges from the start though.

As with previous ABI patch, I'd like to see a little description in the patch
header of what this stuff is for as well.  Obvious to some of us perhaps, but
better to call it out for anyone who is wondering why userspace needs to know.

I'm guessing you have a nice QEMU patch adding the DSM etc?

Thanks,

Jonathan


> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> ---
>  Documentation/ABI/testing/sysfs-bus-cxl |   15 +++++++++++++++
>  drivers/cxl/core/memdev.c               |   26 ++++++++++++++++++++++++++
>  2 files changed, 41 insertions(+)
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
> index 0932c2f6fbf4..8133a13e118d 100644
> --- a/Documentation/ABI/testing/sysfs-bus-cxl
> +++ b/Documentation/ABI/testing/sysfs-bus-cxl
> @@ -27,6 +27,14 @@ Description:
>  		identically named field in the Identify Memory Device Output
>  		Payload in the CXL-2.0 specification.
>  
> +What:		/sys/bus/cxl/devices/memX/ram/qtg_id
> +Date:		January, 2023
> +KernelVersion:	v6.3
> +Contact:	linux-cxl@vger.kernel.org
> +Description:
> +		(RO) Shows calculated QoS Throttling Group ID for the
> +		"Volatile Only Capacity" DPA range.
> +
>  
>  What:		/sys/bus/cxl/devices/memX/pmem/size
>  Date:		December, 2020
> @@ -37,6 +45,13 @@ Description:
>  		identically named field in the Identify Memory Device Output
>  		Payload in the CXL-2.0 specification.
>  
> +What:		/sys/bus/cxl/devices/memX/pmem/qtg_id
> +Date:		January, 2023
> +KernelVersion:	v6.3
> +Contact:	linux-cxl@vger.kernel.org
> +Description:
> +		(RO) Shows calculated QoS Throttling Group ID for the
> +		"Persistent Only Capacity" DPA range.
>  
>  What:		/sys/bus/cxl/devices/memX/serial
>  Date:		January, 2022
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index a74a93310d26..06f9ac929ef4 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -76,6 +76,18 @@ static ssize_t ram_size_show(struct device *dev, struct device_attribute *attr,
>  static struct device_attribute dev_attr_ram_size =
>  	__ATTR(size, 0444, ram_size_show, NULL);
>  
> +static ssize_t ram_qtg_id_show(struct device *dev, struct device_attribute *attr,
> +			       char *buf)
> +{
> +	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> +	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> +
> +	return sysfs_emit(buf, "%d\n", cxlds->ram_qtg_id);
> +}
> +
> +static struct device_attribute dev_attr_ram_qtg_id =
> +	__ATTR(qtg_id, 0444, ram_qtg_id_show, NULL);
> +
>  static ssize_t pmem_size_show(struct device *dev, struct device_attribute *attr,
>  			      char *buf)
>  {
> @@ -89,6 +101,18 @@ static ssize_t pmem_size_show(struct device *dev, struct device_attribute *attr,
>  static struct device_attribute dev_attr_pmem_size =
>  	__ATTR(size, 0444, pmem_size_show, NULL);
>  
> +static ssize_t pmem_qtg_id_show(struct device *dev, struct device_attribute *attr,
> +				char *buf)
> +{
> +	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> +	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> +
> +	return sysfs_emit(buf, "%d\n", cxlds->pmem_qtg_id);
> +}
> +
> +static struct device_attribute dev_attr_pmem_qtg_id =
> +	__ATTR(qtg_id, 0444, pmem_qtg_id_show, NULL);
> +
>  static ssize_t serial_show(struct device *dev, struct device_attribute *attr,
>  			   char *buf)
>  {
> @@ -117,11 +141,13 @@ static struct attribute *cxl_memdev_attributes[] = {
>  
>  static struct attribute *cxl_memdev_pmem_attributes[] = {
>  	&dev_attr_pmem_size.attr,
> +	&dev_attr_pmem_qtg_id.attr,
>  	NULL,
>  };
>  
>  static struct attribute *cxl_memdev_ram_attributes[] = {
>  	&dev_attr_ram_size.attr,
> +	&dev_attr_ram_qtg_id.attr,
>  	NULL,
>  };
>  
> 
> 



* Re: [PATCH 01/18] cxl: Export QTG ids from CFMWS to sysfs
  2023-02-09 11:15   ` Jonathan Cameron
@ 2023-02-09 17:28     ` Dave Jiang
  0 siblings, 0 replies; 65+ messages in thread
From: Dave Jiang @ 2023-02-09 17:28 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-pci, linux-acpi, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, rafael, bhelgaas, robert.moore



On 2/9/23 4:15 AM, Jonathan Cameron wrote:
> On Mon, 06 Feb 2023 13:49:30 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
> 
>> Export the QoS Throttling Group ID from the CXL Fixed Memory Window
>> Structure (CFMWS) under the root decoder sysfs attributes.
>> CXL rev3.0 9.17.1.3 CXL Fixed Memory Window Structure (CFMWS)
>>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> 
> Hi Dave,
> 
> 
> I've no objection to this, but it would be good to say why this
> might be of use to userspace.  What tooling needs it?

Will do.

> 
> One comment on docs inline. With those two things tidied up
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> 
>> ---
>>   Documentation/ABI/testing/sysfs-bus-cxl |    7 +++++++
>>   drivers/cxl/acpi.c                      |    3 +++
>>   drivers/cxl/core/port.c                 |   14 ++++++++++++++
>>   drivers/cxl/cxl.h                       |    3 +++
>>   4 files changed, 27 insertions(+)
>>
>> diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
>> index 8494ef27e8d2..0932c2f6fbf4 100644
>> --- a/Documentation/ABI/testing/sysfs-bus-cxl
>> +++ b/Documentation/ABI/testing/sysfs-bus-cxl
>> @@ -294,6 +294,13 @@ Description:
>>   		(WO) Write a string in the form 'regionZ' to delete that region,
>>   		provided it is currently idle / not bound to a driver.
>>   
>> +What:		/sys/bus/cxl/devices/decoderX.Y/qtg_id
>> +Date:		Jan, 2023
>> +KernelVersion:	v6.3
>> +Contact:	linux-cxl@vger.kernel.org
>> +Description:
>> +		(RO) Shows the QoS Throttling Group ID. The QTG ID for a root
>> +		decoder comes from the CFMWS structure of the CEDT.
> 
> Document the -1 value for no ID in here. Hopefully people will write
> their userspace against this document and we want them to know about that
> corner case!

Ok I will add.

> 
>>   
>>   What:		/sys/bus/cxl/devices/regionZ/uuid
>>   Date:		May, 2022
>> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
>> index 13cde44c6086..7a71bb5041c7 100644
>> --- a/drivers/cxl/acpi.c
>> +++ b/drivers/cxl/acpi.c
>> @@ -289,6 +289,9 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
>>   			}
>>   		}
>>   	}
>> +
>> +	cxld->qtg_id = cfmws->qtg_id;
>> +
>>   	rc = cxl_decoder_add(cxld, target_map);
>>   err_xormap:
>>   	if (rc)
>> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
>> index b631a0520456..fe78daf7e7c8 100644
>> --- a/drivers/cxl/core/port.c
>> +++ b/drivers/cxl/core/port.c
>> @@ -284,6 +284,16 @@ static ssize_t interleave_ways_show(struct device *dev,
>>   
>>   static DEVICE_ATTR_RO(interleave_ways);
>>   
>> +static ssize_t qtg_id_show(struct device *dev,
>> +			   struct device_attribute *attr, char *buf)
>> +{
>> +	struct cxl_decoder *cxld = to_cxl_decoder(dev);
>> +
>> +	return sysfs_emit(buf, "%d\n", cxld->qtg_id);
>> +}
>> +
>> +static DEVICE_ATTR_RO(qtg_id);
>> +
>>   static struct attribute *cxl_decoder_base_attrs[] = {
>>   	&dev_attr_start.attr,
>>   	&dev_attr_size.attr,
>> @@ -303,6 +313,7 @@ static struct attribute *cxl_decoder_root_attrs[] = {
>>   	&dev_attr_cap_type2.attr,
>>   	&dev_attr_cap_type3.attr,
>>   	&dev_attr_target_list.attr,
>> +	&dev_attr_qtg_id.attr,
>>   	SET_CXL_REGION_ATTR(create_pmem_region)
>>   	SET_CXL_REGION_ATTR(delete_region)
>>   	NULL,
>> @@ -1606,6 +1617,7 @@ struct cxl_root_decoder *cxl_root_decoder_alloc(struct cxl_port *port,
>>   	}
>>   
>>   	atomic_set(&cxlrd->region_id, rc);
>> +	cxld->qtg_id = CXL_QTG_ID_INVALID;
>>   	return cxlrd;
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_root_decoder_alloc, CXL);
>> @@ -1643,6 +1655,7 @@ struct cxl_switch_decoder *cxl_switch_decoder_alloc(struct cxl_port *port,
>>   
>>   	cxld = &cxlsd->cxld;
>>   	cxld->dev.type = &cxl_decoder_switch_type;
>> +	cxld->qtg_id = CXL_QTG_ID_INVALID;
>>   	return cxlsd;
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_switch_decoder_alloc, CXL);
>> @@ -1675,6 +1688,7 @@ struct cxl_endpoint_decoder *cxl_endpoint_decoder_alloc(struct cxl_port *port)
>>   	}
>>   
>>   	cxld->dev.type = &cxl_decoder_endpoint_type;
>> +	cxld->qtg_id = CXL_QTG_ID_INVALID;
>>   	return cxled;
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_endpoint_decoder_alloc, CXL);
>> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
>> index 1b1cf459ac77..f558bbfc0332 100644
>> --- a/drivers/cxl/cxl.h
>> +++ b/drivers/cxl/cxl.h
>> @@ -279,6 +279,7 @@ enum cxl_decoder_type {
>>    */
>>   #define CXL_DECODER_MAX_INTERLEAVE 16
>>   
>> +#define CXL_QTG_ID_INVALID	-1
>>   
>>   /**
>>    * struct cxl_decoder - Common CXL HDM Decoder Attributes
>> @@ -290,6 +291,7 @@ enum cxl_decoder_type {
>>    * @target_type: accelerator vs expander (type2 vs type3) selector
>>    * @region: currently assigned region for this decoder
>>    * @flags: memory type capabilities and locking
>> + * @qtg_id: QoS Throttling Group ID
>>    * @commit: device/decoder-type specific callback to commit settings to hw
>>    * @reset: device/decoder-type specific callback to reset hw settings
>>   */
>> @@ -302,6 +304,7 @@ struct cxl_decoder {
>>   	enum cxl_decoder_type target_type;
>>   	struct cxl_region *region;
>>   	unsigned long flags;
>> +	int qtg_id;
>>   	int (*commit)(struct cxl_decoder *cxld);
>>   	int (*reset)(struct cxl_decoder *cxld);
>>   };
>>
>>
> 


* Re: [PATCH 03/18] cxl: Add checksum verification to CDAT from CXL
  2023-02-09 11:34   ` Jonathan Cameron
@ 2023-02-09 17:31     ` Dave Jiang
  0 siblings, 0 replies; 65+ messages in thread
From: Dave Jiang @ 2023-02-09 17:31 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore



On 2/9/23 4:34 AM, Jonathan Cameron wrote:
> On Mon, 06 Feb 2023 13:49:48 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
> 
>> A CDAT table is available from a CXL device. The table is read by the
>> driver and cached in software. With the CXL subsystem needing to parse the
>> CDAT table, the checksum should be verified. Add checksum verification
>> after the CDAT table is read from device.
>>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> Hi Dave,
> 
> Some comments on this follow on from previous patch so may not
> be relevant once you've updated how that is done.

Dan advised dropping the ACPICA changes and just doing it locally, since 
the verification code is tiny and simple.

> 
> Jonathan
> 
>> ---
>>   drivers/cxl/core/pci.c |   11 +++++++++++
>>   1 file changed, 11 insertions(+)
>>
>> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
>> index 57764e9cd19d..a24dac36bedd 100644
>> --- a/drivers/cxl/core/pci.c
>> +++ b/drivers/cxl/core/pci.c
>> @@ -3,6 +3,7 @@
>>   #include <linux/io-64-nonatomic-lo-hi.h>
>>   #include <linux/device.h>
>>   #include <linux/delay.h>
>> +#include <linux/acpi.h>
>>   #include <linux/pci.h>
>>   #include <linux/pci-doe.h>
>>   #include <cxlpci.h>
>> @@ -592,6 +593,7 @@ void read_cdat_data(struct cxl_port *port)
>>   	struct device *dev = &port->dev;
>>   	struct device *uport = port->uport;
>>   	size_t cdat_length;
>> +	acpi_status status;
>>   	int rc;
>>   
>>   	cdat_doe = find_cdat_doe(uport);
>> @@ -620,5 +622,14 @@ void read_cdat_data(struct cxl_port *port)
>>   		port->cdat.length = 0;
>>   		dev_err(dev, "CDAT data read error\n");
>>   	}
>> +
>> +	status = acpi_ut_verify_cdat_checksum(port->cdat.table, port->cdat.length);
>> +	if (status != AE_OK) {
> 
> if (ACPI_FAILURE(acpi_ut...))  or better still put that in the wrapper I
> suggested in the previous patch so that we have normal kernel return code
> handling out here.
> 
> 
>> +		/* Don't leave table data allocated on error */
>> +		devm_kfree(dev, port->cdat.table);
>> +		port->cdat.table = NULL;
>> +		port->cdat.length = 0;
> 
> I'd rather see us manipulate a local copy of cdat_length and cdat_table,
> then only assign them to the port->cdat fields on the success path, rather
> than setting them and then unsetting them on error.
> 
> Diff will be bigger, but nicer resulting code (and hopefully diff
> won't be too big!)

Ok, I'll create a prep patch to change this as you suggested.

> 
> 
>> +		dev_err(dev, "CDAT data checksum error\n");
>> +	}
>>   }
>>   EXPORT_SYMBOL_NS_GPL(read_cdat_data, CXL);
>>
>>
> 


* Re: [PATCH 04/18] cxl: Add common helpers for cdat parsing
  2023-02-09 11:58   ` Jonathan Cameron
@ 2023-02-09 22:57     ` Dave Jiang
  2023-02-11 10:18       ` Lukas Wunner
  0 siblings, 1 reply; 65+ messages in thread
From: Dave Jiang @ 2023-02-09 22:57 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore



On 2/9/23 4:58 AM, Jonathan Cameron wrote:
> On Mon, 06 Feb 2023 13:49:58 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
> 
>> Add helper functions to parse the CDAT table and provide a callback to
>> parse the sub-table. Helpers are provided for DSMAS and DSLBIS sub-table
>> parsing. The code is patterned after the ACPI table parsing helpers.
>>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>> ---
>>   drivers/cxl/core/Makefile |    1
>>   drivers/cxl/core/cdat.c   |   98 +++++++++++++++++++++++++++++++++++++++++++++
>>   drivers/cxl/core/cdat.h   |   15 +++++++
>>   drivers/cxl/cxl.h         |    9 ++++
>>   4 files changed, 123 insertions(+)
>>   create mode 100644 drivers/cxl/core/cdat.c
>>   create mode 100644 drivers/cxl/core/cdat.h
>>
>> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
>> index 79c7257f4107..438ce27faf77 100644
>> --- a/drivers/cxl/core/Makefile
>> +++ b/drivers/cxl/core/Makefile
>> @@ -10,4 +10,5 @@ cxl_core-y += memdev.o
>>   cxl_core-y += mbox.o
>>   cxl_core-y += pci.o
>>   cxl_core-y += hdm.o
>> +cxl_core-y += cdat.o
>>   cxl_core-$(CONFIG_CXL_REGION) += region.o
>> diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
>> new file mode 100644
>> index 000000000000..be09c8a690f5
>> --- /dev/null
>> +++ b/drivers/cxl/core/cdat.c
>> @@ -0,0 +1,98 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/* Copyright(c) 2023 Intel Corporation. All rights reserved. */
>> +#include "cxl.h"
>> +#include "cdat.h"
>> +
>> +static u8 cdat_get_subtable_entry_type(struct cdat_subtable_entry *entry)
>> +{
>> +	return entry->hdr->type;
>> +}
> 
> Are these all worthwhile, given the resulting function name is longer
> than accessing the field directly?  If the aim is to move the details of
> struct cdat_subtable_entry away from being exposed at the caller, then
> fair enough, but if that is the plan I'd expect to see something about
> that in the patch description.
> 
> Feels like some premature abstraction, but I don't feel particularly
> strongly about this.

I'll drop them. The code was adapted from the ACPI table parsing code, but 
we can simplify it for our usage.
> 
> 
>> +
>> +static u16 cdat_get_subtable_entry_length(struct cdat_subtable_entry *entry)
>> +{
>> +	return entry->hdr->length;
>> +}
>> +
>> +static bool has_handler(struct cdat_subtable_proc *proc)
>> +{
>> +	return proc->handler;
>> +}
>> +
>> +static int call_handler(struct cdat_subtable_proc *proc,
>> +			struct cdat_subtable_entry *ent)
>> +{
>> +	if (proc->handler)
> 
> Use your wrapper...

ok
> 
>> +		return proc->handler(ent->hdr, proc->arg);
>> +	return -EINVAL;
>> +}
>> +
>> +static int cdat_table_parse_entries(enum acpi_cdat_type type,
>> +				    struct acpi_table_cdat *table_header,
>> +				    struct cdat_subtable_proc *proc,
>> +				    unsigned int max_entries)
> 
> Documentation needed.  max_entries wasn't what I was expecting.
> I would have expected it to be a cap on number of entries of
> matching type, whereas it seems to be number of entries of any type.
> 
> Also, max_entries == 0 non obvious parameter value.

I'll drop max_entries. Code came from ACPI, but I don't think we need it.
> 
> 
>> +{
>> +	struct cdat_subtable_entry entry;
>> +	unsigned long table_end, entry_len;
>> +	int count = 0;
>> +	int rc;
>> +
>> +	if (!has_handler(proc))
>> +		return -EINVAL;
>> +
>> +	table_end = (unsigned long)table_header + table_header->length;
>> +
>> +	if (type >= ACPI_CDAT_TYPE_RESERVED)
>> +		return -EINVAL;
>> +
>> +	entry.type = type;
>> +	entry.hdr = (struct acpi_cdat_header *)((unsigned long)table_header +
>> +					       sizeof(*table_header));
> 
> Common idiom for this is.
> 
> 	entry.hdr = (struct acpi_cdat_header *)(table_header + 1);
>
ok.


>> +
>> +	while ((unsigned long)entry.hdr < table_end) {
>> +		entry_len = cdat_get_subtable_entry_length(&entry);
>> +
>> +		if ((unsigned long)entry.hdr + entry_len > table_end)
>> +			return -EINVAL;
>> +
>> +		if (max_entries && count >= max_entries)
>> +			break;
>> +
>> +		if (entry_len == 0)
>> +			return -EINVAL;
>> +
>> +		if (cdat_get_subtable_entry_type(&entry) == type) {
> 
> This is a little odd as we set entry.type above == type, but
> the match here is on the value in the one in entry.hdr.
> 
> That's not particularly intuitive. Not sure on what a good solution
> would be though.  Maybe just
> 
> 		if (cdat_is_subtable_match(&entry))
ok

> 
>> +			rc = call_handler(proc, &entry);
>> +			if (rc)
>> +				return rc;
>> +		}
> 
> As above.  Maybe intent, but my initial assumption would have had
> count not incremented unless there was a match. (so put it in this if block
> not below)

right ok.

> 
>> +
>> +		entry.hdr = (struct acpi_cdat_header *)((unsigned long)entry.hdr + entry_len);
>> +		count++;
>> +	}
>> +
>> +	return count;
>> +}
>> +
>> +int cdat_table_parse_dsmas(void *table, cdat_tbl_entry_handler handler, void *arg)
>> +{
>> +	struct acpi_table_cdat *header = (struct acpi_table_cdat *)table;
> 
> Now struct acpi_table_cdata exists, maybe just move to using
> that type for all references.  Will make a mess of the range checking
> efforts the hardening folk are working on as we will index off end of
> it and it doesn't have a variable length array trailing element.
> 
> Random musing follows...
> We could add a variable length element to that struct
> definition and the magic to associate that with the length parameter
> and get range protection if relevant hardening is turned on.
> 
> Structure definition comes (I think) from scripts in acpica so
> would need to push such changes into acpica and I'm not sure
> they will be keen even though it would be good for the kernel
> to have the protections.


Lukas actually noticed that the ACPI data structs are unsuitable for 
kernel usage because they don't designate the data as little-endian (LE). 
He has created local structs with the correct data types. We can expand 
on top of those for our usage.

https://github.com/l1k/linux/commit/d376a53a45da2fff219799a02f216962123f9fd0

I see what you are saying. But I'm not sure how easily we can do this 
for the CDAT table due to endianness. Is this what you had in mind?

From:
struct cdat_entry_header {
	u8 type;
	u8 reserved;
	__le16 length;
} __packed;

To:
struct cdat_entry_header {
	u8 type;
	u8 reserved;
	__le16 length;
	DECLARE_BOUNDED_ARRAY(u8, body, le16_to_cpu(length));
} __packed;

> 
> https://people.kernel.org/kees/bounded-flexible-arrays-in-c
> for Kees Cook's blog on this stuff.  The last bit needs
> the 'coming soon' part.
> 
>> +	struct cdat_subtable_proc proc = {
>> +		.handler	= handler,
>> +		.arg		= arg,
>> +	};
>> +
>> +	return cdat_table_parse_entries(ACPI_CDAT_TYPE_DSMAS, header, &proc, 0);
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cdat_table_parse_dsmas, CXL);
>> +
>> +int cdat_table_parse_dslbis(void *table, cdat_tbl_entry_handler handler, void *arg)
>> +{
>> +	struct acpi_table_cdat *header = (struct acpi_table_cdat *)table;
>> +	struct cdat_subtable_proc proc = {
>> +		.handler	= handler,
>> +		.arg		= arg,
>> +	};
>> +
>> +	return cdat_table_parse_entries(ACPI_CDAT_TYPE_DSLBIS, header, &proc, 0);
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cdat_table_parse_dslbis, CXL);
>> diff --git a/drivers/cxl/core/cdat.h b/drivers/cxl/core/cdat.h
>> new file mode 100644
>> index 000000000000..f690325e82a6
>> --- /dev/null
>> +++ b/drivers/cxl/core/cdat.h
>> @@ -0,0 +1,15 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/* Copyright(c) 2023 Intel Corporation. */
>> +#ifndef __CXL_CDAT_H__
>> +#define __CXL_CDAT_H__
>> +
>> +struct cdat_subtable_proc {
>> +	cdat_tbl_entry_handler handler;
>> +	void *arg;
>> +};
>> +
>> +struct cdat_subtable_entry {
>> +	struct acpi_cdat_header *hdr;
>> +	enum acpi_cdat_type type;
>> +};
>> +#endif
>> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
>> index f558bbfc0332..839a121c1997 100644
>> --- a/drivers/cxl/cxl.h
>> +++ b/drivers/cxl/cxl.h
>> @@ -9,6 +9,7 @@
>>   #include <linux/bitops.h>
>>   #include <linux/log2.h>
>>   #include <linux/io.h>
>> +#include <linux/acpi.h>
>>   
>>   /**
>>    * DOC: cxl objects
>> @@ -697,6 +698,14 @@ static inline struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev)
>>   }
>>   #endif
>>   
>> +typedef int (*cdat_tbl_entry_handler)(struct acpi_cdat_header *header, void *arg);
>> +
>> +u8 cdat_table_checksum(u8 *buffer, u32 length);
>> +int cdat_table_parse_dsmas(void *table, cdat_tbl_entry_handler handler,
>> +			   void *arg);
>> +int cdat_table_parse_dslbis(void *table, cdat_tbl_entry_handler handler,
>> +			    void *arg);
>> +
>>   /*
>>    * Unit test builds overrides this to __weak, find the 'strong' version
>>    * of these symbols in tools/testing/cxl/.
>>
>>
> 

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 04/18] cxl: Add common helpers for cdat parsing
  2023-02-09 22:57     ` Dave Jiang
@ 2023-02-11 10:18       ` Lukas Wunner
  2023-02-14 13:17         ` Jonathan Cameron
  2023-02-14 20:36         ` Dave Jiang
  0 siblings, 2 replies; 65+ messages in thread
From: Lukas Wunner @ 2023-02-11 10:18 UTC (permalink / raw)
  To: Dave Jiang
  Cc: Jonathan Cameron, linux-cxl, linux-pci, linux-acpi,
	dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, bhelgaas, robert.moore

On Thu, Feb 09, 2023 at 03:57:32PM -0700, Dave Jiang wrote:
> On 2/9/23 4:58 AM, Jonathan Cameron wrote:
> > On Mon, 06 Feb 2023 13:49:58 -0700 Dave Jiang <dave.jiang@intel.com> wrote:
> > > Add helper functions to parse the CDAT table and provide a callback to
> > > parse the sub-table. Helpers are provided for DSMAS and DSLBIS sub-table
> > > parsing. The code is patterned after the ACPI table parsing helpers.
[...]
> > Are these all worthwhile given the resulting function name is longer
> > than accessing it directly.  If aim is to move the details of the
> > struct cdat_subtable_entry away from being exposed at caller, then
> > fair enough, but if that is the plan I'd expect to see something about
> > that in the patch description.
> > 
> > Feels like some premature abstraction, but I don't feel particularly
> > strongly about this.
> 
> I'll drop them. The code was adapted from ACPI table parsing code. But we
> can simplify for our usages.

Yes just iterating over the CDAT entries and directly calling the
appropriate parser function for the entry seems more straightforward.


> > Random musing follows...
> > We could add a variable length element to that struct
> > definition and the magic to associate that with the length parameter
> > and get range protection if relevant hardening is turned on.
> > 
> > Structure definition comes (I think) from scripts in acpica so
> > would need to push such changes into acpica and I'm not sure
> > they will be keen even though it would be good for the kernel
> > to have the protections.
[...]
> I see what you are saying. But I'm not sure how easily we can do this for
> the CDAT table due to endianness. Is this what you had in mind?
> 
> From:
> struct cdat_entry_header {
> 	u8 type;
> 	u8 reserved;
> 	__le16 length;
> } __packed;
> 
> To:
> struct cdat_entry_header {
> 	u8 type;
> 	u8 reserved;
> 	__le16 length;
> 	DECLARE_BOUNDED_ARRAY(u8, body, le16_to_cpu(length));
> } __packed;

I think this is backwards.  I'd suggest creating a struct for each
CDAT entry which includes the header.  The kernel switched to
-std=gnu11 a while ago, so you should be able to use an unnamed field
for the header:

struct cdat_dsmas {
	struct cdat_entry_header;
	u8 dsmad_handle;
	u8 flags;
	u8 reserved[2];
	__le64 dpa_base;
	__le64 dpa_length;
}

Note that in my commit "cxl/pci: Handle truncated CDAT entries",
I'm only verifying that the number of bytes received via DOE
matches the length field in the cdat_entry_header.  I do not
verify in cxl_cdat_read_table() whether that length is correct
for the specific CDAT structure.  I think that's the job of
the function parsing that particular structure type.

In other words, at the top of your DSMAS parsing function,
you need to check:

	struct cdat_dsmas *dsmas;

	if (le16_to_cpu(dsmas->length) != sizeof(*dsmas)) {
		dev_err(...);
		return -EINVAL;
	}


Note how the check is simplified by the header being part of
struct cdat_dsmas.  If the header wasn't part of struct cdat_dsmas,
an addition would be needed here.

Thanks,

Lukas


* Re: [PATCH 06/18] cxl: Add callback to parse the DSMAS subtables from CDAT
  2023-02-09 13:29   ` Jonathan Cameron
@ 2023-02-13 22:55     ` Dave Jiang
  0 siblings, 0 replies; 65+ messages in thread
From: Dave Jiang @ 2023-02-13 22:55 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore



On 2/9/23 6:29 AM, Jonathan Cameron wrote:
> On Mon, 06 Feb 2023 13:50:15 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
> 
>> Provide a callback function to the CDAT parser in order to parse the Device
>> Scoped Memory Affinity Structure (DSMAS). Each DSMAS structure contains the
>> DPA range and its associated attributes in each entry. See the CDAT
>> specification for details.
>>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> Hi Dave,
> 
> A few minor questions / comments inline,
> 
> Jonathan
> 
>> ---
>>   drivers/cxl/core/cdat.c |   25 +++++++++++++++++++++++++
>>   drivers/cxl/core/port.c |    2 ++
>>   drivers/cxl/cxl.h       |   11 +++++++++++
>>   drivers/cxl/port.c      |    8 ++++++++
>>   4 files changed, 46 insertions(+)
>>
>> diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
>> index be09c8a690f5..f9a64a0f1ee4 100644
>> --- a/drivers/cxl/core/cdat.c
>> +++ b/drivers/cxl/core/cdat.c
>> @@ -96,3 +96,28 @@ int cdat_table_parse_dslbis(void *table, cdat_tbl_entry_handler handler, void *a
>>   	return cdat_table_parse_entries(ACPI_CDAT_TYPE_DSLBIS, header, &proc, 0);
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cdat_table_parse_dslbis, CXL);
>> +
>> +int cxl_dsmas_parse_entry(struct acpi_cdat_header *header, void *arg)
>> +{
>> +	struct cxl_port *port = (struct cxl_port *)arg;
>> +	struct dsmas_entry *dent;
>> +	struct acpi_cdat_dsmas *dsmas;
>> +
>> +	if (header->type != ACPI_CDAT_TYPE_DSMAS)
>> +		return -EINVAL;
>> +
>> +	dent = devm_kzalloc(&port->dev, sizeof(*dent), GFP_KERNEL);
>> +	if (!dent)
>> +		return -ENOMEM;
>> +
>> +	dsmas = (struct acpi_cdat_dsmas *)((unsigned long)header + sizeof(*header));
> 
> I'd prefer header + 1

It's simpler. Will update.

> 
> 
>> +	dent->handle = dsmas->dsmad_handle;
>> +	dent->dpa_range.start = dsmas->dpa_base_address;
>> +	dent->dpa_range.end = dsmas->dpa_base_address + dsmas->dpa_length - 1;
>> +
>> +	mutex_lock(&port->cdat.dsmas_lock);
>> +	list_add_tail(&dent->list, &port->cdat.dsmas_list);
>> +	mutex_unlock(&port->cdat.dsmas_lock);
>> +	return 0;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_dsmas_parse_entry, CXL);
>> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
>> index fe78daf7e7c8..2b27319cfd42 100644
>> --- a/drivers/cxl/core/port.c
>> +++ b/drivers/cxl/core/port.c
>> @@ -660,6 +660,8 @@ static struct cxl_port *cxl_port_alloc(struct device *uport,
>>   	device_set_pm_not_required(dev);
>>   	dev->bus = &cxl_bus_type;
>>   	dev->type = &cxl_port_type;
>> +	INIT_LIST_HEAD(&port->cdat.dsmas_list);
>> +	mutex_init(&port->cdat.dsmas_lock);
>>   
>>   	return port;
>>   
>> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
>> index 839a121c1997..1e5e69f08480 100644
>> --- a/drivers/cxl/cxl.h
>> +++ b/drivers/cxl/cxl.h
>> @@ -8,6 +8,7 @@
>>   #include <linux/bitfield.h>
>>   #include <linux/bitops.h>
>>   #include <linux/log2.h>
>> +#include <linux/list.h>
>>   #include <linux/io.h>
>>   #include <linux/acpi.h>
>>   
>> @@ -520,6 +521,8 @@ struct cxl_port {
>>   	struct cxl_cdat {
>>   		void *table;
>>   		size_t length;
>> +		struct list_head dsmas_list;
>> +		struct mutex dsmas_lock; /* lock for dsmas_list */
> 
> I'm curious, what might race with the dsmas_list changing and hence what is lock for?

It should be dropped. The latest implementation does all the access 
during port probe, so locking is no longer needed.

> 
>>   	} cdat;
>>   	bool cdat_available;
>>   };
>> @@ -698,6 +701,12 @@ static inline struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev)
>>   }
>>   #endif
>>   
>> +struct dsmas_entry {
>> +	struct list_head list;
>> +	struct range dpa_range;
>> +	u16 handle;
> 
> handle is 1 byte in the spec. Why larger here?

It should be u8. Oops. Thanks for the catch.


> 
>> +};
>> +
>>   typedef int (*cdat_tbl_entry_handler)(struct acpi_cdat_header *header, void *arg);
>>   
>>   u8 cdat_table_checksum(u8 *buffer, u32 length);
>> @@ -706,6 +715,8 @@ int cdat_table_parse_dsmas(void *table, cdat_tbl_entry_handler handler,
>>   int cdat_table_parse_dslbis(void *table, cdat_tbl_entry_handler handler,
>>   			    void *arg);
>>   
>> +int cxl_dsmas_parse_entry(struct acpi_cdat_header *header, void *arg);
>> +
>>   /*
>>    * Unit test builds overrides this to __weak, find the 'strong' version
>>    * of these symbols in tools/testing/cxl/.
>> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
>> index 5453771bf330..b1da73e99bab 100644
>> --- a/drivers/cxl/port.c
>> +++ b/drivers/cxl/port.c
>> @@ -61,6 +61,14 @@ static int cxl_port_probe(struct device *dev)
>>   		if (rc)
>>   			return rc;
>>   
>> +		if (port->cdat.table) {
>> +			rc = cdat_table_parse_dsmas(port->cdat.table,
>> +						    cxl_dsmas_parse_entry,
>> +						    (void *)port);
>> +			if (rc < 0)
>> +				dev_dbg(dev, "Failed to parse DSMAS: %d\n", rc);
>> +		}
>> +
>>   		rc = cxl_hdm_decode_init(cxlds, cxlhdm);
>>   		if (rc)
>>   			return rc;
>>
>>
> 


* Re: [PATCH 07/18] cxl: Add callback to parse the DSLBIS subtable from CDAT
  2023-02-09 13:50   ` Jonathan Cameron
@ 2023-02-14  0:24     ` Dave Jiang
  0 siblings, 0 replies; 65+ messages in thread
From: Dave Jiang @ 2023-02-14  0:24 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore



On 2/9/23 6:50 AM, Jonathan Cameron wrote:
> On Mon, 06 Feb 2023 13:50:23 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
> 
>> Provide a callback to parse the Device Scoped Latency and Bandwidth
>> Information Structure (DSLBIS) in the CDAT structures. The DSLBIS
>> contains the bandwidth and latency information that's tied to a DSMAS
>> handle. The driver will retrieve the read and write latency and
>> bandwidth associated with the DSMAS which is tied to a DPA range.
>>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> A few comments inline,
> 
> Thanks,
> 
> Jonathan
> 
>> ---
>>   drivers/cxl/core/cdat.c |   34 ++++++++++++++++++++++++++++++++++
>>   drivers/cxl/cxl.h       |    2 ++
>>   drivers/cxl/port.c      |    9 ++++++++-
>>   include/acpi/actbl1.h   |    5 +++++
>>   4 files changed, 49 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
>> index f9a64a0f1ee4..3c8f3956487e 100644
>> --- a/drivers/cxl/core/cdat.c
>> +++ b/drivers/cxl/core/cdat.c
>> @@ -121,3 +121,37 @@ int cxl_dsmas_parse_entry(struct acpi_cdat_header *header, void *arg)
>>   	return 0;
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_dsmas_parse_entry, CXL);
>> +
>> +int cxl_dslbis_parse_entry(struct acpi_cdat_header *header, void *arg)
>> +{
>> +	struct cxl_port *port = (struct cxl_port *)arg;
>> +	struct dsmas_entry *dent;
>> +	struct acpi_cdat_dslbis *dslbis;
> 
> Perhaps reorder to maintain the pretty upside-down Christmas trees
> (I don't care :)

will fix
> 
>> +	u64 val;
>> +
>> +	if (header->type != ACPI_CDAT_TYPE_DSLBIS)
>> +		return -EINVAL;
> 
> Isn't this guaranteed by the caller?  Seems overkill do it twice
> and I don't think these will ever be called outside of that wrapper that
> loops over the entries. I could be wrong though!
>

ok will remove


>> +
>> +	dslbis = (struct acpi_cdat_dslbis *)((unsigned long)header + sizeof(*header));
> header + 1
> 
>> +	if ((dslbis->flags & ACPI_CEDT_DSLBIS_MEM_MASK) !=
> 
> This field 'must be ignored' if the DSMAS handle isn't a match
> (as it's an initiator only entry) Odd though it may seem I think we
> might see one of those on a type 3 device and we are probably going to
> have other users of this function anyway.
> 
> I think you need to do the walk below to check we have a DSMAS match, before
> running this check.

ok, will move down to where entry is matched

> 
>> +	     ACPI_CEDT_DSLBIS_MEM_MEMORY)
>> +		return 0;
>> +
>> +	if (dslbis->data_type > ACPI_HMAT_WRITE_BANDWIDTH)
>> +		return -ENXIO;
> 
> This would probably imply a new HMAT spec value, so probably just
> log it and ignore rather than error out.

ok

> 
>> +
>> +	/* Value calculation with base_unit, see ACPI Spec 6.5 5.2.28.4 */
>> +	val = dslbis->entry[0] * dslbis->entry_base_unit;
> 
> In theory this might overflow as u64 * u16.
> Doubt it will ever happen in reality, but maybe a check and debug print if it does?

ok will use check_mul_overflow()
> 
>> +
>> +	mutex_lock(&port->cdat.dsmas_lock);
>> +	list_for_each_entry(dent, &port->cdat.dsmas_list, list) {
>> +		if (dslbis->handle == dent->handle) {
>> +			dent->qos[dslbis->data_type] = val;
>> +			break;
>> +		}
>> +	}
>> +	mutex_unlock(&port->cdat.dsmas_lock);
>> +
>> +	return 0;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_dslbis_parse_entry, CXL);
>> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
>> index 1e5e69f08480..849b22236f1d 100644
>> --- a/drivers/cxl/cxl.h
>> +++ b/drivers/cxl/cxl.h
>> @@ -705,6 +705,7 @@ struct dsmas_entry {
>>   	struct list_head list;
>>   	struct range dpa_range;
>>   	u16 handle;
>> +	u64 qos[ACPI_HMAT_WRITE_BANDWIDTH + 1];
>>   };
>>   
>>   typedef int (*cdat_tbl_entry_handler)(struct acpi_cdat_header *header, void *arg);
>> @@ -716,6 +717,7 @@ int cdat_table_parse_dslbis(void *table, cdat_tbl_entry_handler handler,
>>   			    void *arg);
>>   
>>   int cxl_dsmas_parse_entry(struct acpi_cdat_header *header, void *arg);
>> +int cxl_dslbis_parse_entry(struct acpi_cdat_header *header, void *arg);
>>   
>>   /*
>>    * Unit test builds overrides this to __weak, find the 'strong' version
>> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
>> index b1da73e99bab..8de311208b37 100644
>> --- a/drivers/cxl/port.c
>> +++ b/drivers/cxl/port.c
>> @@ -65,8 +65,15 @@ static int cxl_port_probe(struct device *dev)
>>   			rc = cdat_table_parse_dsmas(port->cdat.table,
>>   						    cxl_dsmas_parse_entry,
>>   						    (void *)port);
>> -			if (rc < 0)
>> +			if (rc > 0) {
>> +				rc = cdat_table_parse_dslbis(port->cdat.table,
>> +							     cxl_dslbis_parse_entry,
>> +							     (void *)port);
>> +				if (rc <= 0)
>> +					dev_dbg(dev, "Failed to parse DSLBIS: %d\n", rc);
> 
> If we have entries and they won't parse, I think we should be screaming louder.
> dev_warn() would be my preference for this and the one in the previous patch.
> Sure we can carry on, but something on the device is not working as expected.

ok will fix this one and previous.

> 
>> +			} else {
>>   				dev_dbg(dev, "Failed to parse DSMAS: %d\n", rc);
>> +			}
>>   		}
>>   
>>   		rc = cxl_hdm_decode_init(cxlds, cxlhdm);
>> diff --git a/include/acpi/actbl1.h b/include/acpi/actbl1.h
>> index e8297cefde09..ff6092e45196 100644
>> --- a/include/acpi/actbl1.h
>> +++ b/include/acpi/actbl1.h
>> @@ -369,6 +369,11 @@ struct acpi_cdat_dslbis {
>>   	u16 reserved2;
>>   };
>>   
>> +/* Flags for subtable above */
>> +
>> +#define ACPI_CEDT_DSLBIS_MEM_MASK	GENMASK(3, 0)
>> +#define ACPI_CEDT_DSLBIS_MEM_MEMORY	0
>> +
>>   /* Subtable 2: Device Scoped Memory Side Cache Information Structure (DSMSCIS) */
>>   
>>   struct acpi_cdat_dsmscis {
>>
>>
> 


* Re: [PATCH 04/18] cxl: Add common helpers for cdat parsing
  2023-02-11 10:18       ` Lukas Wunner
@ 2023-02-14 13:17         ` Jonathan Cameron
  2023-02-14 20:36         ` Dave Jiang
  1 sibling, 0 replies; 65+ messages in thread
From: Jonathan Cameron @ 2023-02-14 13:17 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Dave Jiang, linux-cxl, linux-pci, linux-acpi, dan.j.williams,
	ira.weiny, vishal.l.verma, alison.schofield, rafael, bhelgaas,
	robert.moore

On Sat, 11 Feb 2023 11:18:33 +0100
Lukas Wunner <lukas@wunner.de> wrote:

> On Thu, Feb 09, 2023 at 03:57:32PM -0700, Dave Jiang wrote:
> > On 2/9/23 4:58 AM, Jonathan Cameron wrote:  
> > > On Mon, 06 Feb 2023 13:49:58 -0700 Dave Jiang <dave.jiang@intel.com> wrote:  
> > > > Add helper functions to parse the CDAT table and provide a callback to
> > > > parse the sub-table. Helpers are provided for DSMAS and DSLBIS sub-table
> > > > parsing. The code is patterned after the ACPI table parsing helpers.  
> [...]
> > > Are these all worthwhile given the resulting function name is longer
> > > than accessing it directly.  If aim is to move the details of the
> > > struct cdat_subtable_entry away from being exposed at caller, then
> > > fair enough, but if that is the plan I'd expect to see something about
> > > that in the patch description.
> > > 
> > > Feels like some premature abstraction, but I don't feel particularly
> > > strongly about this.  
> > 
> > I'll drop them. The code was adapted from ACPI table parsing code. But we
> > can simplify for our usages.  
> 
> Yes just iterating over the CDAT entries and directly calling the
> appropriate parser function for the entry seems more straightforward.
> 
> 
> > > Random musing follows...
> > > We could add a variable length element to that struct
> > > definition and the magic to associate that with the length parameter
> > > and get range protection if relevant hardening is turned on.
> > > 
> > > Structure definition comes (I think) from scripts in acpica so
> > > would need to push such changes into acpica and I'm not sure
> > > they will be keen even though it would be good for the kernel
> > > to have the protections.  
> [...]
> > I see what you are saying. But I'm not sure how easily we can do this for
> > the CDAT table due to endianness. Is this what you had in mind?
> > 
> > From:
> > struct cdat_entry_header {
> > 	u8 type;
> > 	u8 reserved;
> > 	__le16 length;
> > } __packed;
> > 
> > To:
> > struct cdat_entry_header {
> > 	u8 type;
> > 	u8 reserved;
> > 	__le16 length;
> > 	DECLARE_BOUNDED_ARRAY(u8, body, le16_to_cpu(length));
> > } __packed;  
> 
> I think this is backwards.  I'd suggest creating a struct for each
> CDAT entry which includes the header.  The kernel switched to
> -std=gnu11 a while ago, so you should be able to use an unnamed field
> for the header:
> 
> struct cdat_dsmas {
> 	struct cdat_entry_header;
> 	u8 dsmad_handle;
> 	u8 flags;
> 	u8 reserved[2];
> 	__le64 dpa_base;
> 	__le64 dpa_length;
> }

This is indeed better given we always know the type before accessing.

The above trick might be useful for any code that treats it as
generic entries though a straight forwards check might be easier
and is already present in Lukas' latest code.

> 
> Note that in my commit "cxl/pci: Handle truncated CDAT entries",
> I'm only verifying that the number of bytes received via DOE
> matches the length field in the cdat_entry_header.  I do not
> verify in cxl_cdat_read_table() whether that length is correct
> for the specific CDAT structure.  I think that's the job of
> the function parsing that particular structure type.
> 
> In other words, at the top of your DSMAS parsing function,
> you need to check:
> 
> 	struct cdat_dsmas *dsmas;
> 
> 	if (le16_to_cpu(dsmas->length) != sizeof(*dsmas)) {
> 		dev_err(...);
> 		return -EINVAL;
> 	}
> 
> 
> Note how the check is simplified by the header being part of
> struct cdat_dsmas.  If the header wasn't part of struct cdat_dsmas,
> an addition would be needed here.
> 
> Thanks,
> 
> Lukas
> 



* Re: [PATCH 04/18] cxl: Add common helpers for cdat parsing
  2023-02-11 10:18       ` Lukas Wunner
  2023-02-14 13:17         ` Jonathan Cameron
@ 2023-02-14 20:36         ` Dave Jiang
  1 sibling, 0 replies; 65+ messages in thread
From: Dave Jiang @ 2023-02-14 20:36 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Jonathan Cameron, linux-cxl, linux-pci, linux-acpi,
	dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	rafael, bhelgaas, robert.moore



On 2/11/23 3:18 AM, Lukas Wunner wrote:
> On Thu, Feb 09, 2023 at 03:57:32PM -0700, Dave Jiang wrote:
>> On 2/9/23 4:58 AM, Jonathan Cameron wrote:
>>> On Mon, 06 Feb 2023 13:49:58 -0700 Dave Jiang <dave.jiang@intel.com> wrote:
>>>> Add helper functions to parse the CDAT table and provide a callback to
>>>> parse the sub-table. Helpers are provided for DSMAS and DSLBIS sub-table
>>>> parsing. The code is patterned after the ACPI table parsing helpers.
> [...]
>>> Are these all worthwhile given the resulting function name is longer
>>> than accessing it directly.  If aim is to move the details of the
>>> struct cdat_subtable_entry away from being exposed at caller, then
>>> fair enough, but if that is the plan I'd expect to see something about
>>> that in the patch description.
>>>
>>> Feels like some premature abstraction, but I don't feel particularly
>>> strongly about this.
>>
>> I'll drop them. The code was adapted from ACPI table parsing code. But we
>> can simplify for our usages.
> 
> Yes just iterating over the CDAT entries and directly calling the
> appropriate parser function for the entry seems more straightforward.
> 
> 
>>> Random musing follows...
>>> We could add a variable length element to that struct
>>> definition and the magic to associate that with the length parameter
>>> and get range protection if relevant hardening is turned on.
>>>
>>> Structure definition comes (I think) from scripts in acpica so
>>> would need to push such changes into acpica and I'm not sure
>>> they will be keen even though it would be good for the kernel
>>> to have the protections.
> [...]
>> I see what you are saying. But I'm not sure how easily we can do this for
>> the CDAT table due to endianness. Is this what you had in mind?
>>
>> From:
>> struct cdat_entry_header {
>> 	u8 type;
>> 	u8 reserved;
>> 	__le16 length;
>> } __packed;
>>
>> To:
>> struct cdat_entry_header {
>> 	u8 type;
>> 	u8 reserved;
>> 	__le16 length;
>> 	DECLARE_BOUNDED_ARRAY(u8, body, le16_to_cpu(length));
>> } __packed;
> 
> I think this is backwards.  I'd suggest creating a struct for each
> CDAT entry which includes the header.  The kernel switched to
> -std=gnu11 a while ago, so you should be able to use an unnamed field
> for the header:
> 
> struct cdat_dsmas {
> 	struct cdat_entry_header;
> 	u8 dsmad_handle;
> 	u8 flags;
> 	u8 reserved[2];
> 	__le64 dpa_base;
> 	__le64 dpa_length;
> }

In file included from drivers/cxl/pci.c:14:
drivers/cxl/cxlpci.h:109:33: warning: declaration does not declare anything
   109 |         struct cdat_entry_header;
       |                                 ^

Does not seem to be happy about the unnamed field.


> 
> Note that in my commit "cxl/pci: Handle truncated CDAT entries",
> I'm only verifying that the number of bytes received via DOE
> matches the length field in the cdat_entry_header.  I do not
> verify in cxl_cdat_read_table() whether that length is correct
> for the specific CDAT structure.  I think that's the job of
> the function parsing that particular structure type.
> 
> In other words, at the top of your DSMAS parsing function,
> you need to check:
> 
> 	struct cdat_dsmas *dsmas;
> 
> 	if (le16_to_cpu(dsmas->length) != sizeof(*dsmas)) {
> 		dev_err(...);
> 		return -EINVAL;
> 	}
> 
> 
> Note how the check is simplified by the header being part of
> struct cdat_dsmas.  If the header wasn't part of struct cdat_dsmas,
> an addition would be needed here.
> 
> Thanks,
> 
> Lukas


* Re: [PATCH 08/18] cxl: Add support for _DSM Function for retrieving QTG ID
  2023-02-09 14:02   ` Jonathan Cameron
@ 2023-02-14 21:07     ` Dave Jiang
  0 siblings, 0 replies; 65+ messages in thread
From: Dave Jiang @ 2023-02-14 21:07 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore



On 2/9/23 7:02 AM, Jonathan Cameron wrote:
> On Mon, 06 Feb 2023 13:50:33 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
> 
>> CXL spec v3.0 9.17.3 CXL Root Device Specific Methods (_DSM)
>>
>> Add support to retrieve QTG ID via ACPI _DSM call. The _DSM call requires
>> an input of an ACPI package with 4 dwords (read latency, write latency,
>> read bandwidth, write bandwidth). The call returns a package with 1 WORD
>> that provides the max supported QTG ID and a package that may contain 0 or
>> more WORDs as the recommended QTG IDs in the recommended order.
>>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> A few minor bits inline.
> 
> Jonathan
> 
>> ---
>>   drivers/cxl/core/Makefile |    1
>>   drivers/cxl/core/acpi.c   |   99 +++++++++++++++++++++++++++++++++++++++++++++
>>   drivers/cxl/cxl.h         |   15 +++++++
>>   3 files changed, 115 insertions(+)
>>   create mode 100644 drivers/cxl/core/acpi.c
>>
>> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
>> index 438ce27faf77..11ccc2016ab7 100644
>> --- a/drivers/cxl/core/Makefile
>> +++ b/drivers/cxl/core/Makefile
>> @@ -11,4 +11,5 @@ cxl_core-y += mbox.o
>>   cxl_core-y += pci.o
>>   cxl_core-y += hdm.o
>>   cxl_core-y += cdat.o
>> +cxl_core-y += acpi.o
>>   cxl_core-$(CONFIG_CXL_REGION) += region.o
>> diff --git a/drivers/cxl/core/acpi.c b/drivers/cxl/core/acpi.c
>> new file mode 100644
>> index 000000000000..86dc6c9c1f24
>> --- /dev/null
>> +++ b/drivers/cxl/core/acpi.c
>> @@ -0,0 +1,99 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/* Copyright(c) 2022 Intel Corporation. All rights reserved. */
>> +#include <linux/module.h>
>> +#include <linux/device.h>
>> +#include <linux/kernel.h>
>> +#include <linux/acpi.h>
>> +#include <linux/pci.h>
>> +#include <asm/div64.h>
>> +#include "cxlpci.h"
>> +#include "cxl.h"
>> +
>> +const guid_t acpi_cxl_qtg_id_guid =
>> +	GUID_INIT(0xF365F9A6, 0xA7DE, 0x4071,
>> +		  0xA6, 0x6A, 0xB4, 0x0C, 0x0B, 0x4F, 0x8E, 0x52);
>> +
>> +/**
>> + * cxl_acpi_evaluate_qtg_dsm - Retrieve QTG ids via ACPI _DSM
>> + * @handle: ACPI handle
>> + * @input: bandwidth and latency data
>> + *
>> + * Issue QTG _DSM with accompanied bandwidth and latency data in order to get
>> + * the QTG IDs that falls within the performance data.
>> + */
>> +struct qtg_dsm_output *cxl_acpi_evaluate_qtg_dsm(acpi_handle handle,
>> +						 struct qtg_dsm_input *input)
>> +{
>> +	struct qtg_dsm_output *output;
>> +	union acpi_object *out_obj, *out_buf, *pkg, in_buf, in_obj;
> 
> Reorder to reverse Xmas tree perhaps.

Ok
> 
>> +	int len;
>> +	int rc;
> Might as well put those on one line.

Ok

> 
>> +
>> +	in_obj.type = ACPI_TYPE_PACKAGE;
>> +	in_obj.package.count = 1;
>> +	in_obj.package.elements = &in_buf;
>> +	in_buf.type = ACPI_TYPE_BUFFER;
>> +	in_buf.buffer.pointer = (u8 *)input;
>> +	in_buf.buffer.length = sizeof(u32) * 4;
> c99 style is nicer to read.

Ok

> 
> 	union acpi_object in_obj = {
> 		.type =
> 
> 	}
>> +
>> +	out_obj = acpi_evaluate_dsm(handle, &acpi_cxl_qtg_id_guid, 1, 1, &in_obj);
>> +	if (!out_obj)
>> +		return ERR_PTR(-ENXIO);
>> +
>> +	if (out_obj->type != ACPI_TYPE_PACKAGE) {
>> +		rc = -ENXIO;
>> +		goto err;
>> +	}
>> +
>> +	/*
>> +	 * CXL spec v3.0 9.17.3.1
>> +	 * There should be 2 elements in the package. 1 WORD for max QTG ID supported
>> +	 * by the platform, and the other a package of recommended QTGs
>> +	 */
>> +	if (out_obj->package.count != 2) {
> 
> This stuff is usually designed to be extensible - tends to be explicitly allowed in
> stuff in the ACPI spec (not mentioned AFAICT in the CXL spec).  So I'd be tempted to allow
> >2, just don't read them.

Will remove check.

> 
> 	if (out_obj->package.count < 2) {
>> +		rc = -ENXIO;
>> +		goto err;
>> +	}
>> +
>> +	pkg = &out_obj->package.elements[1];
>> +	if (pkg->type != ACPI_TYPE_PACKAGE) {
>> +		rc = -ENXIO;
>> +		goto err;
>> +	}
>> +
>> +	out_buf = &pkg->package.elements[0];
>> +	if (out_buf->type != ACPI_TYPE_BUFFER) {
>> +		rc = -ENXIO;
>> +		goto err;
>> +	}
>> +
>> +	len = out_buf->buffer.length;
>> +	output = kmalloc(len + sizeof(*output), GFP_KERNEL);
>> +	if (!output) {
>> +		rc = -ENOMEM;
>> +		goto err;
>> +	}
>> +
>> +	/* It's legal to have 0 QTG entries */
>> +	if (len == 0) {
>> +		output->nr = 0;
>> +		goto out;
>> +	}
>> +
>> +	/* Malformed package, not multiple of WORD size */
>> +	if (len % sizeof(u16)) {
>> +		rc = -ENXIO;
>> +		goto out;
>> +	}
>> +
>> +	output->nr = len / sizeof(u16);
>> +	memcpy(output->qtg_ids, out_buf->buffer.pointer, len);
> 
> Worth checking them against Max Support QTG ID as provided in the
> outer package?  Obviously if they are greater than that there is
> a bug, but meh.

Ok will add check and warn.

> 
>> +out:
>> +	ACPI_FREE(out_obj);
>> +	return output;
>> +
>> +err:
>> +	ACPI_FREE(out_obj);
>> +	return ERR_PTR(rc);
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_acpi_evaluate_qtg_dsm, CXL);
>> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
>> index 849b22236f1d..e70df07f9b4b 100644
>> --- a/drivers/cxl/cxl.h
>> +++ b/drivers/cxl/cxl.h
>> @@ -719,6 +719,21 @@ int cdat_table_parse_dslbis(void *table, cdat_tbl_entry_handler handler,
>>   int cxl_dsmas_parse_entry(struct acpi_cdat_header *header, void *arg);
>>   int cxl_dslbis_parse_entry(struct acpi_cdat_header *header, void *arg);
>>   
>> +struct qtg_dsm_input {
>> +	u32 rd_lat;
>> +	u32 wr_lat;
>> +	u32 rd_bw;
>> +	u32 wr_bw;
>> +};
>> +
>> +struct qtg_dsm_output {
>> +	int nr;
>> +	u16 qtg_ids[];
>> +};
>> +
>> +struct qtg_dsm_output *cxl_acpi_evaluate_qtg_dsm(acpi_handle handle,
>> +						 struct qtg_dsm_input *input);
>> +
>>   /*
>>    * Unit test builds overrides this to __weak, find the 'strong' version
>>    * of these symbols in tools/testing/cxl/.
>>
>>
> 

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 09/18] cxl: Add helper function to retrieve ACPI handle of CXL root device
  2023-02-09 14:10   ` Jonathan Cameron
@ 2023-02-14 21:29     ` Dave Jiang
  0 siblings, 0 replies; 65+ messages in thread
From: Dave Jiang @ 2023-02-14 21:29 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore



On 2/9/23 7:10 AM, Jonathan Cameron wrote:
> On Mon, 06 Feb 2023 13:50:42 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
> 
>> Provide a helper to find the ACPI0017 device in order to issue the _DSM.
>> The helper will take the 'struct device' from a cxl_port and iterate until
>> the root device is reached. The ACPI handle will be returned from the root
>> device.
>>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>> ---
>>   drivers/cxl/core/acpi.c |   30 ++++++++++++++++++++++++++++++
>>   drivers/cxl/cxl.h       |    1 +
>>   2 files changed, 31 insertions(+)
>>
>> diff --git a/drivers/cxl/core/acpi.c b/drivers/cxl/core/acpi.c
>> index 86dc6c9c1f24..05fcd4751619 100644
>> --- a/drivers/cxl/core/acpi.c
>> +++ b/drivers/cxl/core/acpi.c
>> @@ -5,6 +5,7 @@
>>   #include <linux/kernel.h>
>>   #include <linux/acpi.h>
>>   #include <linux/pci.h>
>> +#include <linux/platform_device.h>
>>   #include <asm/div64.h>
>>   #include "cxlpci.h"
>>   #include "cxl.h"
>> @@ -13,6 +14,35 @@ const guid_t acpi_cxl_qtg_id_guid =
>>   	GUID_INIT(0xF365F9A6, 0xA7DE, 0x4071,
>>   		  0xA6, 0x6A, 0xB4, 0x0C, 0x0B, 0x4F, 0x8E, 0x52);
>>   
>> +/**
>> + * cxl_acpi_get_root_acpi_handle - get the ACPI handle of the CXL root device
>> + * @dev: 'struct device' to start searching from. Should be from cxl_port->dev.
>> + * Looks for the ACPI0017 device and return the ACPI handle
>> + **/
> 
> Inconsistent comment style.

ok
> 
>> +acpi_handle cxl_acpi_get_rootdev_handle(struct device *dev)
>> +{
>> +	struct device *itr = dev, *root_dev;
> 
> Not nice for readability to have an assignment in a list of definitions
> all on the same line.

ok
> 
>> +	acpi_handle handle;
>> +
>> +	if (!dev)
>> +		return ERR_PTR(-EINVAL);
>> +
>> +	while (itr->parent) {
>> +		root_dev = itr;
>> +		itr = itr->parent;
>> +	}
>> +
>> +	if (!dev_is_platform(root_dev))
>> +		return ERR_PTR(-ENODEV);
>> +
>> +	handle = ACPI_HANDLE(root_dev);
>> +	if (!handle)
>> +		return ERR_PTR(-ENODEV);
>> +
>> +	return handle;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_acpi_get_rootdev_handle, CXL);
>> +
>>   /**
>>    * cxl_acpi_evaluate_qtg_dsm - Retrieve QTG ids via ACPI _DSM
>>    * @handle: ACPI handle
>> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
>> index e70df07f9b4b..ac6ea550ab0a 100644
>> --- a/drivers/cxl/cxl.h
>> +++ b/drivers/cxl/cxl.h
>> @@ -733,6 +733,7 @@ struct qtg_dsm_output {
>>   
>>   struct qtg_dsm_output *cxl_acpi_evaluate_qtg_dsm(acpi_handle handle,
>>   						 struct qtg_dsm_input *input);
>> +acpi_handle cxl_acpi_get_rootdev_handle(struct device *dev);
>>   
>>   /*
>>    * Unit test builds overrides this to __weak, find the 'strong' version
>>
>>
> 


* Re: [PATCH 12/18] cxl: Add helpers to calculate pci latency for the CXL device
  2023-02-09 15:10           ` Jonathan Cameron
@ 2023-02-14 22:22             ` Dave Jiang
  2023-02-15 12:13               ` Jonathan Cameron
  0 siblings, 1 reply; 65+ messages in thread
From: Dave Jiang @ 2023-02-14 22:22 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Bjorn Helgaas, linux-cxl, linux-pci, linux-acpi, dan.j.williams,
	ira.weiny, vishal.l.verma, alison.schofield, rafael, bhelgaas,
	robert.moore



On 2/9/23 8:10 AM, Jonathan Cameron wrote:
> On Wed, 8 Feb 2023 16:56:30 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
> 
>> On 2/8/23 3:15 PM, Bjorn Helgaas wrote:
>>> On Tue, Feb 07, 2023 at 01:51:17PM -0700, Dave Jiang wrote:
>>>>
>>>>
>>>> On 2/6/23 3:39 PM, Bjorn Helgaas wrote:
>>>>> On Mon, Feb 06, 2023 at 01:51:10PM -0700, Dave Jiang wrote:
>>>>>> The latency is calculated by dividing the FLIT size over the
>>>>>> bandwidth. Add support to retrieve the FLIT size for the CXL
>>>>>> device and calculate the latency of the downstream link.
>>>    
>>>>> I guess you only care about the latency of a single link, not the
>>>>> entire path?
>>>>
>>>> I am adding each of the link individually together in the next
>>>> patch. Are you suggesting a similar function like
>>>> pcie_bandwidth_available() but for latency for the entire path?
>>>
>>> Only a clarifying question.
>>>    
>>>>>> +static int cxl_get_flit_size(struct pci_dev *pdev)
>>>>>> +{
>>>>>> +	if (cxl_pci_flit_256(pdev))
>>>>>> +		return 256;
>>>>>> +
>>>>>> +	return 66;
>>>>>
>>>>> I don't know about the 66-byte flit format, maybe this part is
>>>>> CXL-specific?
>>>>
>>>> 68-byte flit format. Looks like this is a typo from me.
>>>
>>> This part must be CXL-specific, since I don't think PCIe mentions
>>> 68-byte flits.
>>>    
>>>>>> + * The table indicates that if PCIe Flit Mode is set, then CXL is in 256B flits
>>>>>> + * mode, otherwise it's 68B flits mode.
>>>>>> + */
>>>>>> +static inline bool cxl_pci_flit_256(struct pci_dev *pdev)
>>>>>> +{
>>>>>> +	u32 lnksta2;
>>>>>> +
>>>>>> +	pcie_capability_read_dword(pdev, PCI_EXP_LNKSTA2, &lnksta2);
>>>>>> +	return lnksta2 & BIT(10);
>>>>>
>>>>> Add a #define for the bit.
>>>>
>>>> ok will add.
>>>>   
>>>>>
>>>>> AFAICT, the PCIe spec defines this bit, and it only indicates the link
>>>>> is or will be operating in Flit Mode; it doesn't actually say anything
>>>>> about how large the flits are.  I suppose that's because PCIe only
>>>>> talks about 256B flits, not 66B ones?
>>>>
>>>> Looking at CXL v1.0 rev3.0 6.2.3 "256B Flit Mode", table 6-4, it shows that
>>>> when PCIe Flit Mode is set, then CXL is in 256B flits mode, otherwise, it is
>>>> 68B flits. So an assumption is made here regarding the flit side based on
>>>> the table.
>>>
>>> So reading PCI_EXP_LNKSTA2 and extracting the Flit Mode bit is
>>> PCIe-generic, but the interpretation of "PCIe Flit Mode not enabled
>>> means 68-byte flits" is CXL-specific?
>>>
>>> This sounds wrong, but I don't know quite how.  How would the PCI core
>>> manage links where Flit Mode being cleared really means Flit Mode is
>>> *enabled* but with a different size?  Seems like something could go
>>> wrong there.
>>
>> Looking at the PCIe base spec and the CXL spec, that seemed to be the
>> only way that implies the flit size for a CXL device as far as I can
>> tell. I've yet to find a good way to make that determination. Dan?
> 
> So a given CXL port has either trained up in:
> * normal PCI (in which case all the normal PCI stuff applies) and we'll
>    fail some of the other checks in the CXL driver and never get here
>    - I 'think' the driver will load for the PCI device to enable things
>    like firmware upgrade, but we won't register the CXL Port devices
>    that ultimately call this stuff.
>    It's perfectly possible to have a driver that will cope with this
>    but it's pretty meaningless for a lot of cxl type 3 driver.
> * 68 byte flit (which was CXL precursor to PCI going flit based)
>    Can be queried via CXL DVSEC Flex Bus Port Status CXL r3.0 8.2.1.3.3
> * 256 byte flits (may or may not be compatible with PCIe ones as there
>    are some optional latency optimizations)
> 
> So if the 68 byte flit is enabled the 256 byte one should never be and
> CXL description is overriding the old PCIe
> 
> Hence I think we should have the additional check on the flex bus
> dvsec even though it should be consistent with your assumption above.

So I'm trying to understand the CXL DVSEC Port Status "68B Flit and VH 
Enabled" bit. If this bit is set, does it mean we are in 68B flit mode and 
VH mode? Do we just ignore RCH/RCD calculations since RCD doesn't support 
hotplug? Does this bit get cleared in 256B flit mode? It's not clear to 
me.

> 
> Hmm. That does raise a question of how we take the latency optimized
> flits into account or indeed some of the other latency impacting things
> that may or may not be running - IDE in it's various modes for example.
> 
> For latency optimized we can query relevant bit in the flex bus port status.
> IDE info will be somewhere I guess though no idea if there is a way to
> know the latency impacts.

Should we deal with latency optimized flits and IDE in a later step?

> 
> Jonathan
> 
>>
>>
>>>
>>> Bjorn
> 


* Re: [PATCH 13/18] cxl: Add latency and bandwidth calculations for the CXL path
  2023-02-09 15:24   ` Jonathan Cameron
@ 2023-02-14 23:03     ` Dave Jiang
  2023-02-15 13:17       ` Jonathan Cameron
  0 siblings, 1 reply; 65+ messages in thread
From: Dave Jiang @ 2023-02-14 23:03 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore



On 2/9/23 8:24 AM, Jonathan Cameron wrote:
> On Mon, 06 Feb 2023 13:51:19 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
> 
>> CXL Memory Device SW Guide rev1.0 2.11.2 provides instruction on how to
>> caluclate latency and bandwidth for CXL memory device. Calculate minimum
> 
> Spell check your descriptions (I often forget to do this as well!)
>> bandwidth and total latency for the path from the CXL device to the root
>> port. The calculates values are stored in the cached DSMAS entries attached
>> to the cxl_port of the CXL device.
>>
>> For example for a device that is directly attached to a host bus:
>> Total Latency = Device Latency (from CDAT) + Dev to Host Bus (HB) Link
>> 		Latency
>> Min Bandwidth = Link Bandwidth between Host Bus and CXL device
>>
>> For a device that has a switch in between host bus and CXL device:
>> Total Latency = Device (CDAT) Latency + Dev to Switch Link Latency +
>> 		Switch (CDAT) Latency + Switch to HB Link Latency
> 
> For QTG purposes, are we also supposed to take into account HB to
> system interconnect type latency (or maybe nearest CPU?).
> That is likely to be non trivial.

Dan brought this ECN [1] to my attention. We can add this if we can find 
a BIOS that implements the ECN. Or should we code a placeholder for it 
until this is available?

[1] https://lore.kernel.org/linux-cxl/e1a52da9aec90766da5de51b1b839fd95d63a5af.camel@intel.com/

> 
>> Min Bandwidth = min(dev to switch bandwidth, switch to HB bandwidth)
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> 
> Stray sign off.
> 
>>
>> The internal latency for a switch can be retrieved from the CDAT of the
>> switch PCI device. However, since there's no easy way to retrieve that
>> right now on Linux, a guesstimated constant is used per switch to simplify
>> the driver code.
> 
> I'd like to see that gap closed asap. I think it is fairly obvious how to do
> it, so shouldn't be too hard, just needs a dance to get the DOE for a switch
> port using Lukas' updated handling of DOE mailboxes.

Talked to Lukas and this may not be difficult with his latest changes. I 
can take a look. Do we support switch CDAT in QEMU yet?

> 
>>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>> ---
>>   drivers/cxl/core/port.c |   60 +++++++++++++++++++++++++++++++++++++++++++++++
>>   drivers/cxl/cxl.h       |    9 +++++++
>>   drivers/cxl/port.c      |   42 +++++++++++++++++++++++++++++++++
>>   3 files changed, 111 insertions(+)
>>
>> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
>> index 2b27319cfd42..aa260361ba7d 100644
>> --- a/drivers/cxl/core/port.c
>> +++ b/drivers/cxl/core/port.c
>> @@ -1899,6 +1899,66 @@ bool schedule_cxl_memdev_detach(struct cxl_memdev *cxlmd)
>>   }
>>   EXPORT_SYMBOL_NS_GPL(schedule_cxl_memdev_detach, CXL);
>>   
>> +int cxl_port_get_downstream_qos(struct cxl_port *port, long *bw, long *lat)
>> +{
>> +	long total_lat = 0, latency;
> 
> Similar to before, not good for readability to hide asignments in a list all on one line.
> 


* Re: [PATCH 12/18] cxl: Add helpers to calculate pci latency for the CXL device
  2023-02-14 22:22             ` Dave Jiang
@ 2023-02-15 12:13               ` Jonathan Cameron
  2023-02-22 17:54                 ` Dave Jiang
  0 siblings, 1 reply; 65+ messages in thread
From: Jonathan Cameron @ 2023-02-15 12:13 UTC (permalink / raw)
  To: Dave Jiang
  Cc: Bjorn Helgaas, linux-cxl, linux-pci, linux-acpi, dan.j.williams,
	ira.weiny, vishal.l.verma, alison.schofield, rafael, bhelgaas,
	robert.moore

On Tue, 14 Feb 2023 15:22:42 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> On 2/9/23 8:10 AM, Jonathan Cameron wrote:
> > On Wed, 8 Feb 2023 16:56:30 -0700
> > Dave Jiang <dave.jiang@intel.com> wrote:
> >   
> >> On 2/8/23 3:15 PM, Bjorn Helgaas wrote:  
> >>> On Tue, Feb 07, 2023 at 01:51:17PM -0700, Dave Jiang wrote:  
> >>>>
> >>>>
> >>>> On 2/6/23 3:39 PM, Bjorn Helgaas wrote:  
> >>>>> On Mon, Feb 06, 2023 at 01:51:10PM -0700, Dave Jiang wrote:  
> >>>>>> The latency is calculated by dividing the FLIT size over the
> >>>>>> bandwidth. Add support to retrieve the FLIT size for the CXL
> >>>>>> device and calculate the latency of the downstream link.  
> >>>      
> >>>>> I guess you only care about the latency of a single link, not the
> >>>>> entire path?  
> >>>>
> >>>> I am adding each of the link individually together in the next
> >>>> patch. Are you suggesting a similar function like
> >>>> pcie_bandwidth_available() but for latency for the entire path?  
> >>>
> >>> Only a clarifying question.
> >>>      
> >>>>>> +static int cxl_get_flit_size(struct pci_dev *pdev)
> >>>>>> +{
> >>>>>> +	if (cxl_pci_flit_256(pdev))
> >>>>>> +		return 256;
> >>>>>> +
> >>>>>> +	return 66;  
> >>>>>
> >>>>> I don't know about the 66-byte flit format, maybe this part is
> >>>>> CXL-specific?  
> >>>>
> >>>> 68-byte flit format. Looks like this is a typo from me.  
> >>>
> >>> This part must be CXL-specific, since I don't think PCIe mentions
> >>> 68-byte flits.
> >>>      
> >>>>>> + * The table indicates that if PCIe Flit Mode is set, then CXL is in 256B flits
> >>>>>> + * mode, otherwise it's 68B flits mode.
> >>>>>> + */
> >>>>>> +static inline bool cxl_pci_flit_256(struct pci_dev *pdev)
> >>>>>> +{
> >>>>>> +	u32 lnksta2;
> >>>>>> +
> >>>>>> +	pcie_capability_read_dword(pdev, PCI_EXP_LNKSTA2, &lnksta2);
> >>>>>> +	return lnksta2 & BIT(10);  
> >>>>>
> >>>>> Add a #define for the bit.  
> >>>>
> >>>> ok will add.
> >>>>     
> >>>>>
> >>>>> AFAICT, the PCIe spec defines this bit, and it only indicates the link
> >>>>> is or will be operating in Flit Mode; it doesn't actually say anything
> >>>>> about how large the flits are.  I suppose that's because PCIe only
> >>>>> talks about 256B flits, not 66B ones?  
> >>>>
> >>>> Looking at CXL v1.0 rev3.0 6.2.3 "256B Flit Mode", table 6-4, it shows that
> >>>> when PCIe Flit Mode is set, then CXL is in 256B flits mode, otherwise, it is
> >>>> 68B flits. So an assumption is made here regarding the flit side based on
> >>>> the table.  
> >>>
> >>> So reading PCI_EXP_LNKSTA2 and extracting the Flit Mode bit is
> >>> PCIe-generic, but the interpretation of "PCIe Flit Mode not enabled
> >>> means 68-byte flits" is CXL-specific?
> >>>
> >>> This sounds wrong, but I don't know quite how.  How would the PCI core
> >>> manage links where Flit Mode being cleared really means Flit Mode is
> >>> *enabled* but with a different size?  Seems like something could go
> >>> wrong there.  
> >>
> >> Looking at the PCIe base spec and the CXL spec, that seemed to be the
> >> only way that implies the flit size for a CXL device as far as I can
> >> tell. I've yet to find a good way to make that determination. Dan?  
> > 
> > So a given CXL port has either trained up in:
> > * normal PCI (in which case all the normal PCI stuff applies) and we'll
> >    fail some of the other checks in the CXL driver and never get here
> >    - I 'think' the driver will load for the PCI device to enable things
> >    like firmware upgrade, but we won't register the CXL Port devices
> >    that ultimately call this stuff.
> >    It's perfectly possible to have a driver that will cope with this
> >    but it's pretty meaningless for a lot of cxl type 3 driver.
> > * 68 byte flit (which was CXL precursor to PCI going flit based)
> >    Can be queried via CXL DVSEC Flex Bus Port Status CXL r3.0 8.2.1.3.3
> > * 256 byte flits (may or may not be compatible with PCIe ones as there
> >    are some optional latency optimizations)
> > 
> > So if the 68 byte flit is enabled the 256 byte one should never be and
> > CXL description is overriding the old PCIe
> > 
> > Hence I think we should have the additional check on the flex bus
> > dvsec even though it should be consistent with your assumption above.  
> 
> So I'm trying to understand the CXL DVSEC Port status "68B flit and VH 
> Enabled bit". If this bit is set, it means we are in 68B flit mode and 
> VH mode? 

I think so. 

> Do we just ignore RCH/RCD calculations since it doesn't support 
> hotplug?

Agreed. An impdef solution for RCH etc might be possible but there
isn't enough in the spec to do it.

> Does this bit get cleared for 256B flit mode? It's not clear to 
> me.

I think so.  I think once we are in 256B we know we are CXL 3.0 so
VH is true.  There is some compliance test coverage 14.6.11 but
it only talks about checking Link Status 2 to confirm link has
trained in 256B Flit Mode.  Not sure if there is a gap there to close
or not.  One to poke your spec folk on perhaps (I'm not making this
one my problem ;)


> 
> > 
> > Hmm. That does raise a question of how we take the latency optimized
> > flits into account or indeed some of the other latency impacting things
> > that may or may not be running - IDE in it's various modes for example.
> > 
> > For latency optimized we can query relevant bit in the flex bus port status.
> > IDE info will be somewhere I guess though no idea if there is a way to
> > know the latency impacts.  
> 
> Should we deal with latency optimized flits and IDE in a later step?

No fun :)

Sure.

Jonathan

> 
> > 
> > Jonathan
> >   
> >>
> >>  
> >>>
> >>> Bjorn  
> >   



* Re: [PATCH 13/18] cxl: Add latency and bandwidth calculations for the CXL path
  2023-02-14 23:03     ` Dave Jiang
@ 2023-02-15 13:17       ` Jonathan Cameron
  2023-02-15 16:38         ` Dave Jiang
  0 siblings, 1 reply; 65+ messages in thread
From: Jonathan Cameron @ 2023-02-15 13:17 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore

On Tue, 14 Feb 2023 16:03:27 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> On 2/9/23 8:24 AM, Jonathan Cameron wrote:
> > On Mon, 06 Feb 2023 13:51:19 -0700
> > Dave Jiang <dave.jiang@intel.com> wrote:
> >   
> >> CXL Memory Device SW Guide rev1.0 2.11.2 provides instruction on how to
> >> caluclate latency and bandwidth for CXL memory device. Calculate minimum  
> > 
> > Spell check your descriptions (I often forget to do this as well!)
> >> bandwidth and total latency for the path from the CXL device to the root
> >> port. The calculates values are stored in the cached DSMAS entries attached
> >> to the cxl_port of the CXL device.
> >>
> >> For example for a device that is directly attached to a host bus:
> >> Total Latency = Device Latency (from CDAT) + Dev to Host Bus (HB) Link
> >> 		Latency
> >> Min Bandwidth = Link Bandwidth between Host Bus and CXL device
> >>
> >> For a device that has a switch in between host bus and CXL device:
> >> Total Latency = Device (CDAT) Latency + Dev to Switch Link Latency +
> >> 		Switch (CDAT) Latency + Switch to HB Link Latency  
> > 
> > For QTG purposes, are we also supposed to take into account HB to
> > system interconnect type latency (or maybe nearest CPU?).
> > That is likely to be non trivial.  
> 
> Dan brought this ECN [1] to my attention. We can add this if we can find 
> a BIOS that implements the ECN. Or should we code a place holder for it 
> until this is available?
> 
> https://lore.kernel.org/linux-cxl/e1a52da9aec90766da5de51b1b839fd95d63a5af.camel@intel.com/

I've had Generic Ports on my list to add to QEMU for a while but not been
high enough priority to either do it myself, or make it someone else's problem.
I suspect the biggest barrier in QEMU is going to be the interface to add
these to the NUMA description.

It's easy enough to hand build and inject a SRAT /SLIT/HMAT tables with
these in (that's how we developed the Generic Initiator support in Linux before
any BIOS support).  

So I'd like to see it soon, but I'm not hugely bothered if that element
follows this patch set. However, we are potentially going to see different
decisions made when that detail is added so it 'might' count as ABI
breakage if it's not there from the start. I think we are fine as probably
no BIOSes implement it yet though.

> 
> >   
> >> Min Bandwidth = min(dev to switch bandwidth, switch to HB bandwidth)
> >> Signed-off-by: Dave Jiang <dave.jiang@intel.com>  
> > 
> > Stray sign off.
> >   
> >>
> >> The internal latency for a switch can be retrieved from the CDAT of the
> >> switch PCI device. However, since there's no easy way to retrieve that
> >> right now on Linux, a guesstimated constant is used per switch to simplify
> >> the driver code.  
> > 
> > I'd like to see that gap closed asap. I think it is fairly obvious how to do
> > it, so shouldn't be too hard, just needs a dance to get the DOE for a switch
> > port using Lukas' updated handling of DOE mailboxes.  
> 
> Talked to Lukas and this may not be difficult with his latest changes. I 
> can take a look. Do we support switch CDAT in QEMU yet?

I started typing no, then thought I'd just check.  Seems I did write support
for CDAT on switches (and then completely forgot about it ;)
It's upstream and everything!
https://elixir.bootlin.com/qemu/latest/source/hw/pci-bridge/cxl_upstream.c#L194



* Re: [PATCH 13/18] cxl: Add latency and bandwidth calculations for the CXL path
  2023-02-15 13:17       ` Jonathan Cameron
@ 2023-02-15 16:38         ` Dave Jiang
  0 siblings, 0 replies; 65+ messages in thread
From: Dave Jiang @ 2023-02-15 16:38 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore



On 2/15/23 6:17 AM, Jonathan Cameron wrote:
> On Tue, 14 Feb 2023 16:03:27 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
> 
>> On 2/9/23 8:24 AM, Jonathan Cameron wrote:
>>> On Mon, 06 Feb 2023 13:51:19 -0700
>>> Dave Jiang <dave.jiang@intel.com> wrote:
>>>    
>>>> CXL Memory Device SW Guide rev1.0 2.11.2 provides instruction on how to
>>>> caluclate latency and bandwidth for CXL memory device. Calculate minimum
>>>
>>> Spell check your descriptions (I often forget to do this as well!)
>>>> bandwidth and total latency for the path from the CXL device to the root
>>>> port. The calculates values are stored in the cached DSMAS entries attached
>>>> to the cxl_port of the CXL device.
>>>>
>>>> For example for a device that is directly attached to a host bus:
>>>> Total Latency = Device Latency (from CDAT) + Dev to Host Bus (HB) Link
>>>> 		Latency
>>>> Min Bandwidth = Link Bandwidth between Host Bus and CXL device
>>>>
>>>> For a device that has a switch in between host bus and CXL device:
>>>> Total Latency = Device (CDAT) Latency + Dev to Switch Link Latency +
>>>> 		Switch (CDAT) Latency + Switch to HB Link Latency
>>>
>>> For QTG purposes, are we also supposed to take into account HB to
>>> system interconnect type latency (or maybe nearest CPU?).
>>> That is likely to be non trivial.
>>
>> Dan brought this ECN [1] to my attention. We can add this if we can find
>> a BIOS that implements the ECN. Or should we code a place holder for it
>> until this is available?
>>
>> https://lore.kernel.org/linux-cxl/e1a52da9aec90766da5de51b1b839fd95d63a5af.camel@intel.com/
> 
> I've had Generic Ports on my list to add to QEMU for a while but not been
> high enough priority to either do it myself, or make it someone else's problem.
> I suspect the biggest barrier in QEMU is going to be the interface to add
> these to the NUMA description.
> 
> It's easy enough to hand build and inject a SRAT /SLIT/HMAT tables with
> these in (that's how we developed the Generic Initiator support in Linux before
> any BIOS support).
> 
> So I'd like to see it soon, but I'm not hugely bothered if that element
> follows this patch set. However, we are potentially going to see different
> decisions made when that detail is added so it 'might' count as ABI
> breakage if it's not there from the start. I think we are fine as probably
> no BIOS' yet though.
> 
>>
>>>    
>>>> Min Bandwidth = min(dev to switch bandwidth, switch to HB bandwidth)
>>>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>>>
>>> Stray sign off.
>>>    
>>>>
>>>> The internal latency for a switch can be retrieved from the CDAT of the
>>>> switch PCI device. However, since there's no easy way to retrieve that
>>>> right now on Linux, a guesstimated constant is used per switch to simplify
>>>> the driver code.
>>>
>>> I'd like to see that gap closed asap. I think it is fairly obvious how to do
>>> it, so shouldn't be too hard, just needs a dance to get the DOE for a switch
>>> port using Lukas' updated handling of DOE mailboxes.
>>
>> Talked to Lukas and this may not be difficult with his latest changes. I
>> can take a look. Do we support switch CDAT in QEMU yet?
> 
> I started typing no, then thought I'd just check.  Seems I did write support
> for CDAT on switches (and then completely forgot about it ;)
> It's upstream and everything!
> https://elixir.bootlin.com/qemu/latest/source/hw/pci-bridge/cxl_upstream.c#L194
> 
Awesome! I'll go poke around a bit. Also, it's very helpful to see the 
creation code. It helped me realize that I need to support parsing of 
the SSLBIS sub-table for switches. Thanks!


* Re: [PATCH 12/18] cxl: Add helpers to calculate pci latency for the CXL device
  2023-02-15 12:13               ` Jonathan Cameron
@ 2023-02-22 17:54                 ` Dave Jiang
  0 siblings, 0 replies; 65+ messages in thread
From: Dave Jiang @ 2023-02-22 17:54 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Bjorn Helgaas, linux-cxl, linux-pci, linux-acpi, dan.j.williams,
	ira.weiny, vishal.l.verma, alison.schofield, rafael, bhelgaas,
	robert.moore



On 2/15/23 5:13 AM, Jonathan Cameron wrote:
> On Tue, 14 Feb 2023 15:22:42 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
> 
>> On 2/9/23 8:10 AM, Jonathan Cameron wrote:
>>> On Wed, 8 Feb 2023 16:56:30 -0700
>>> Dave Jiang <dave.jiang@intel.com> wrote:
>>>    
>>>> On 2/8/23 3:15 PM, Bjorn Helgaas wrote:
>>>>> On Tue, Feb 07, 2023 at 01:51:17PM -0700, Dave Jiang wrote:
>>>>>>
>>>>>>
>>>>>> On 2/6/23 3:39 PM, Bjorn Helgaas wrote:
>>>>>>> On Mon, Feb 06, 2023 at 01:51:10PM -0700, Dave Jiang wrote:
>>>>>>>> The latency is calculated by dividing the FLIT size over the
>>>>>>>> bandwidth. Add support to retrieve the FLIT size for the CXL
>>>>>>>> device and calculate the latency of the downstream link.
>>>>>       
>>>>>>> I guess you only care about the latency of a single link, not the
>>>>>>> entire path?
>>>>>>
>>>>>> I am adding each of the link individually together in the next
>>>>>> patch. Are you suggesting a similar function like
>>>>>> pcie_bandwidth_available() but for latency for the entire path?
>>>>>
>>>>> Only a clarifying question.
>>>>>       
>>>>>>>> +static int cxl_get_flit_size(struct pci_dev *pdev)
>>>>>>>> +{
>>>>>>>> +	if (cxl_pci_flit_256(pdev))
>>>>>>>> +		return 256;
>>>>>>>> +
>>>>>>>> +	return 66;
>>>>>>>
>>>>>>> I don't know about the 66-byte flit format, maybe this part is
>>>>>>> CXL-specific?
>>>>>>
>>>>>> 68-byte flit format. Looks like this is a typo from me.
>>>>>
>>>>> This part must be CXL-specific, since I don't think PCIe mentions
>>>>> 68-byte flits.
>>>>>       
>>>>>>>> + * The table indicates that if PCIe Flit Mode is set, then CXL is in 256B flits
>>>>>>>> + * mode, otherwise it's 68B flits mode.
>>>>>>>> + */
>>>>>>>> +static inline bool cxl_pci_flit_256(struct pci_dev *pdev)
>>>>>>>> +{
>>>>>>>> +	u32 lnksta2;
>>>>>>>> +
>>>>>>>> +	pcie_capability_read_dword(pdev, PCI_EXP_LNKSTA2, &lnksta2);
>>>>>>>> +	return lnksta2 & BIT(10);
>>>>>>>
>>>>>>> Add a #define for the bit.
>>>>>>
>>>>>> ok will add.
>>>>>>      
>>>>>>>
>>>>>>> AFAICT, the PCIe spec defines this bit, and it only indicates the link
>>>>>>> is or will be operating in Flit Mode; it doesn't actually say anything
>>>>>>> about how large the flits are.  I suppose that's because PCIe only
>>>>>>> talks about 256B flits, not 66B ones?
>>>>>>
>>>>>> Looking at CXL v1.0 rev3.0 6.2.3 "256B Flit Mode", table 6-4, it shows that
>>>>>> when PCIe Flit Mode is set, then CXL is in 256B flits mode, otherwise, it is
>>>>>> 68B flits. So an assumption is made here regarding the flit size based on
>>>>>> the table.
>>>>>
>>>>> So reading PCI_EXP_LNKSTA2 and extracting the Flit Mode bit is
>>>>> PCIe-generic, but the interpretation of "PCIe Flit Mode not enabled
>>>>> means 68-byte flits" is CXL-specific?
>>>>>
>>>>> This sounds wrong, but I don't know quite how.  How would the PCI core
>>>>> manage links where Flit Mode being cleared really means Flit Mode is
>>>>> *enabled* but with a different size?  Seems like something could go
>>>>> wrong there.
>>>>
>>>> Looking at the PCIe base spec and the CXL spec, that seemed to be the
>>>> only way to infer the flit size for a CXL device as far as I can
>>>> tell. I've yet to find a better way to make that determination. Dan?
>>>
>>> So a given CXL port has either trained up in:
>>> * normal PCI (in which case all the normal PCI stuff applies) and we'll
>>>     fail some of the other checks in the CXL driver and never get here
>>>     - I 'think' the driver will load for the PCI device to enable things
>>>     like firmware upgrade, but we won't register the CXL Port devices
>>>     that ultimately call this stuff.
>>>     It's perfectly possible to have a driver that will cope with this,
>>>     but it's pretty meaningless for most of the cxl type 3 driver.
>>> * 68 byte flit (which was CXL precursor to PCI going flit based)
>>>     Can be queried via CXL DVSEC Flex Bus Port Status CXL r3.0 8.2.1.3.3
>>> * 256 byte flits (may or may not be compatible with PCIe ones as there
>>>     are some optional latency optimizations)
>>>
>>> So if the 68 byte flit mode is enabled, the 256 byte one should never be,
>>> and the CXL description overrides the old PCIe one.
>>>
>>> Hence I think we should have the additional check on the flex bus
>>> dvsec even though it should be consistent with your assumption above.
>>
>> So I'm trying to understand the CXL DVSEC Port status "68B flit and VH
>> Enabled bit". If this bit is set, it means we are in 68B flit mode and
>> VH mode?
> 
> I think so.
> 
>> Do we just ignore RCH/RCD calculations since it doesn't support
>> hotplug?
> 
> Agreed. An impdef solution for RCH etc might be possible but there
> isn't enough in the spec to do it.
> 
>> Does this bit get cleared for 256B flit mode? It's not clear to
>> me.
> 
> I think so.  I think once we are in 256B we know we are CXL 3.0 so
> VH is true.  There is some compliance test coverage 14.6.11 but
> it only talks about checking Link Status 2 to confirm link has
> trained in 256B Flit Mode.  Not sure if there is a gap there to close
> or not.  One to poke your spec folk on perhaps (I'm not making this
> one my problem ;)

According to our spec guy, the PCIe Flit Mode bit from PCIe LNKSTA2 is 
sufficient to determine whether CXL is in 256B or 68B mode, as the table 
implied.

DJ


> 
> 
>>
>>>
>>> Hmm. That does raise a question of how we take the latency optimized
>>> flits into account, or indeed some of the other latency impacting things
>>> that may or may not be running - IDE in its various modes for example.
>>>
>>> For latency optimized we can query relevant bit in the flex bus port status.
>>> IDE info will be somewhere I guess though no idea if there is a way to
>>> know the latency impacts.
>>
>> Should we deal with latency optimized flits and IDE in a later step?
> 
> No fun :)
> 
> Sure.
> 
> Jonathan
> 
>>
>>>
>>> Jonathan
>>>    
>>>>
>>>>   
>>>>>
>>>>> Bjorn
>>>    
> 

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 18/18] cxl: Export sysfs attributes for device QTG IDs
  2023-02-09 15:41   ` Jonathan Cameron
@ 2023-03-23 23:20     ` Dan Williams
  0 siblings, 0 replies; 65+ messages in thread
From: Dan Williams @ 2023-03-23 23:20 UTC (permalink / raw)
  To: Jonathan Cameron, Dave Jiang
  Cc: linux-cxl, linux-pci, linux-acpi, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, rafael, bhelgaas, robert.moore

Jonathan Cameron wrote:
> On Mon, 06 Feb 2023 13:52:05 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
> 
> > Export qtg_id sysfs attributes for the respective ram and pmem DPA range of
> > a CXL device. The QTG ID should show up as
> > /sys/bus/cxl/devices/memX/pmem/qtg_id for pmem or as
> > /sys/bus/cxl/devices/memX/ram/qtg_id for ram.
> 
> This doesn't extend to devices with say multiple DSMAS regions
> for RAM with different access characteristics.  Think of a device
> with HBM and DDR for example, or a mix of DDR4 and DDR5.
> 
> Once we are dealing with memory pools of significant size there
> are very likely to be DPA regions with different characteristics.
> 
> So minimum I'd suggest is leave space for an ABI that might look like.
> 
> mem/range0_qtg_id
> mem/range1_qtg_id
> mem/range0_base
> mem/range0_length
> mem/range1_base
> mem/range1_length
> etc but with the flexibility to not present the rangeX_base/length stuff if there
> is only one presented.  For now just present the range0_qtg_id

I do agree that there should be some mechanism to dump this information,
I am just not yet sure we should prioritize the case where someone
builds multiple performance classes per partition type. There would seem
to be design pressure against that, given you cannot allocate regions
out of DPA order without stranding capacity.

So I am thinking something like a debugfs interface to dump all the
ranges but otherwise leave memX/{ram,pmem,dcd[0-7]} with a single
qtg-id each.

If it turns out later that devices really call for multiple qtg-ids
per-partition as a first-class ABI then there's the option of something
like:

memX/ram/qtg_id
memX/ram/qtg_id1
memX/ram/qtg_id2

memX/ram/qtg_range/
memX/ram/qtg1_range/
memX/ram/qtg2_range/

...but I hope the primary use case for devices with multiple performance
ranges is due to having 'pmem' or 'dcd' in addition to 'ram'.

^ permalink raw reply	[flat|nested] 65+ messages in thread

end of thread, other threads:[~2023-03-23 23:21 UTC | newest]

Thread overview: 65+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-06 20:49 [PATCH 00/18] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
2023-02-06 20:49 ` [PATCH 01/18] cxl: Export QTG ids from CFMWS to sysfs Dave Jiang
2023-02-09 11:15   ` Jonathan Cameron
2023-02-09 17:28     ` Dave Jiang
2023-02-06 20:49 ` [PATCH 02/18] ACPICA: Export acpi_ut_verify_cdat_checksum() Dave Jiang
2023-02-07 14:19   ` Rafael J. Wysocki
2023-02-07 15:47     ` Dave Jiang
2023-02-09 11:30       ` Jonathan Cameron
2023-02-06 20:49 ` [PATCH 03/18] cxl: Add checksum verification to CDAT from CXL Dave Jiang
2023-02-09 11:34   ` Jonathan Cameron
2023-02-09 17:31     ` Dave Jiang
2023-02-06 20:49 ` [PATCH 04/18] cxl: Add common helpers for cdat parsing Dave Jiang
2023-02-09 11:58   ` Jonathan Cameron
2023-02-09 22:57     ` Dave Jiang
2023-02-11 10:18       ` Lukas Wunner
2023-02-14 13:17         ` Jonathan Cameron
2023-02-14 20:36         ` Dave Jiang
2023-02-06 20:50 ` [PATCH 05/18] ACPICA: Fix 'struct acpi_cdat_dsmas' spelling mistake Dave Jiang
2023-02-06 22:00   ` Lukas Wunner
2023-02-06 20:50 ` [PATCH 06/18] cxl: Add callback to parse the DSMAS subtables from CDAT Dave Jiang
2023-02-09 13:29   ` Jonathan Cameron
2023-02-13 22:55     ` Dave Jiang
2023-02-06 20:50 ` [PATCH 07/18] cxl: Add callback to parse the DSLBIS subtable " Dave Jiang
2023-02-09 13:50   ` Jonathan Cameron
2023-02-14  0:24     ` Dave Jiang
2023-02-06 20:50 ` [PATCH 08/18] cxl: Add support for _DSM Function for retrieving QTG ID Dave Jiang
2023-02-09 14:02   ` Jonathan Cameron
2023-02-14 21:07     ` Dave Jiang
2023-02-06 20:50 ` [PATCH 09/18] cxl: Add helper function to retrieve ACPI handle of CXL root device Dave Jiang
2023-02-09 14:10   ` Jonathan Cameron
2023-02-14 21:29     ` Dave Jiang
2023-02-06 20:50 ` [PATCH 10/18] PCI: Export pcie_get_speed() using the code from sysfs PCI link speed show function Dave Jiang
2023-02-06 22:27   ` Lukas Wunner
2023-02-07 20:29     ` Dave Jiang
2023-02-06 20:51 ` [PATCH 11/18] PCI: Export pcie_get_width() using the code from sysfs PCI link width " Dave Jiang
2023-02-06 22:43   ` Bjorn Helgaas
2023-02-07 20:35     ` Dave Jiang
2023-02-06 20:51 ` [PATCH 12/18] cxl: Add helpers to calculate pci latency for the CXL device Dave Jiang
2023-02-06 22:39   ` Bjorn Helgaas
2023-02-07 20:51     ` Dave Jiang
2023-02-08 22:15       ` Bjorn Helgaas
2023-02-08 23:56         ` Dave Jiang
2023-02-09 15:10           ` Jonathan Cameron
2023-02-14 22:22             ` Dave Jiang
2023-02-15 12:13               ` Jonathan Cameron
2023-02-22 17:54                 ` Dave Jiang
2023-02-09 15:16   ` Jonathan Cameron
2023-02-06 20:51 ` [PATCH 13/18] cxl: Add latency and bandwidth calculations for the CXL path Dave Jiang
2023-02-09 15:24   ` Jonathan Cameron
2023-02-14 23:03     ` Dave Jiang
2023-02-15 13:17       ` Jonathan Cameron
2023-02-15 16:38         ` Dave Jiang
2023-02-06 20:51 ` [PATCH 14/18] cxl: Wait Memory_Info_Valid before access memory related info Dave Jiang
2023-02-09 15:29   ` Jonathan Cameron
2023-02-06 20:51 ` [PATCH 15/18] cxl: Move identify and partition query from pci probe to port probe Dave Jiang
2023-02-09 15:29   ` Jonathan Cameron
2023-02-06 20:51 ` [PATCH 16/18] cxl: Move reading of CDAT data from device to after media is ready Dave Jiang
2023-02-06 22:17   ` Lukas Wunner
2023-02-07 20:55     ` Dave Jiang
2023-02-09 15:31   ` Jonathan Cameron
2023-02-06 20:51 ` [PATCH 17/18] cxl: Attach QTG IDs to the DPA ranges for the device Dave Jiang
2023-02-09 15:34   ` Jonathan Cameron
2023-02-06 20:52 ` [PATCH 18/18] cxl: Export sysfs attributes for device QTG IDs Dave Jiang
2023-02-09 15:41   ` Jonathan Cameron
2023-03-23 23:20     ` Dan Williams
