All of lore.kernel.org
* [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem
@ 2023-10-11  1:04 Dave Jiang
  2023-10-11  1:05 ` [PATCH v10 01/22] cxl: Export QTG ids from CFMWS to sysfs as qos_class attribute Dave Jiang
                   ` (22 more replies)
  0 siblings, 23 replies; 36+ messages in thread
From: Dave Jiang @ 2023-10-11  1:04 UTC (permalink / raw)
  To: linux-cxl
  Cc: Hanjun Guo, Len Brown, Ira Weiny, Dan Williams,
	Greg Kroah-Hartman, Rafael J. Wysocki, Davidlohr Bueso,
	Rafael J. Wysocki, Jonathan Cameron, Jonathan Cameron,
	dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	Jonathan.Cameron, dave

v10:
- Remove memory allocation in DSM handler. Add input parameters to request number of ids
  to return. (Dan)
- Only export a single qos_class sysfs attribute, qos_class0 for devices. (Dan)
- Add sanity check of device qos_class against root decoders (Dan)
- Recombine all relevant patches for cxl upstream merge
- Rebased against v6.6-rc4

v9:
- Correct DSM input package to use integers. (Dan)
- Remove all endian conversions. (Dan)

v8:
- Correct DSM return package parsing to use integers

v7:
- Minor changes. Please see specific patches for log entries addressing comments
  from v6.

v6:
- Please see specific patches for log entries addressing comments from v5.
- Use CDAT sub-table structs as is, adjust w/o common header. Changing causes
  significant ACPICA code changes.
- Deal with table header now that is a union. (Jonathan)
- Move DSMAS list destroy to core/cdat.c. (Jonathan)
- Retain and display entire QTG ID list from _DSM. (Jonathan)

v5:
- Please see specific patches for log entries addressing comments from v4.
- Split out ACPI and generic node code to send separately to respective maintainers
- Reworked to use ACPI tables code for CDAT parsing (Dan)
- Cache relevant perf data under dports (Dan)
- Add cxl root callback for QTG ID _DSM (Dan)
- Rename 'node_hmem_attr' to 'access_coordinate' (Dan)
- Change qtg_id sysfs attrib to qos_class (Dan)

v4:
- Reworked PCIe link path latency calculation
- 0-day fixes
- Removed unused qos_list from cxl_memdev and its stray usages

v3:
- Please see specific patches for log entries addressing comments from v2.
- Refactor cxl_port_probe() additions. (Alison)
- Convert to use 'struct node_hmem_attrs'
- Refactor to use common code for genport target allocation.
- Add third array entry for target hmem_attrs to store genport locality data.
- Go back to per partition QTG ID. (Dan)

v2:
- Please see specific patches for log entries addressing comments from v1.
- Removed ACPICA code usages.
- Removed PCI subsystem helpers for latency and bandwidth.
- Add CXL switch CDAT parsing support (SSLBIS)
- Add generic port SRAT+HMAT support (ACPI)
- Export a single QTG ID via sysfs per memory device (Dan)
- Provide rest of DSMAS range info in debugfs (Dan)

Hi Dan,
Please consider taking the entire series including the CXL bits for the next
convenient merge window. Thanks!

This series adds retrieval of QoS Throttling Group (QTG) IDs for the CXL Fixed
Memory Window Structure (CFMWS) and the CXL memory device. The QTG IDs are
provided to user space as guidance for placing the proper DPA range under the
appropriate CFMWS window for a hot-plugged CXL memory device.

The CFMWS structure contains a QTG ID that is associated with the memory window that the
structure exports. On Linux, the CFMWS is represented as a CXL root decoder. The QTG
ID will be attached to the CXL root decoder and exported as a sysfs attribute (qos_class).

The QTG IDs for a device are retrieved by invoking a _DSM method on the ACPI0017
device. The _DSM expects an input package of 4 DWORDs containing the read latency,
write latency, read bandwidth, and write bandwidth. These are the calculated numbers
for the path between the CXL device and the CPU. The list of QTG IDs is also exported
as a sysfs attribute under the mem device memory partition type:
/sys/bus/cxl/devices/memX/ram/qos_class
/sys/bus/cxl/devices/memX/pmem/qos_class
A mapping of DPA ranges and their correlated QTG IDs can be found under
/sys/kernel/debug/cxl/memX/qtgmap. Each DSMAS from the device CDAT provides a DPA
range.

The latency numbers are the aggregated latencies for the path between the CXL device and
the CPU. If a CXL device is directly attached to the CXL HB, the latency
would be the aggregate of the latencies from the device Coherent Device Attribute Table
(CDAT), the calculated PCIe link latency between the device and the HB, and the generic
port data from ACPI SRAT+HMAT. The bandwidth in this configuration would be the minimum
of the CDAT bandwidth number, the link bandwidth between the device and the HB, and the
bandwidth data from the generic port data via ACPI SRAT+HMAT.
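The combination rule for the direct-attached case can be sketched in plain C. This is a simplified illustration only; the struct and function names below are invented for the example and are not the series' actual 'struct access_coordinate' plumbing:

```c
#include <stddef.h>

/* Hypothetical, simplified performance tuple; the actual series tracks
 * read and write latency/bandwidth separately. */
struct perf_coord {
	unsigned int latency;	/* e.g. picoseconds */
	unsigned int bandwidth;	/* e.g. MB/s */
};

/* Direct-attach case: latencies add up along the path, while bandwidth
 * is limited by the narrowest segment. */
static struct perf_coord combine_coords(struct perf_coord cdat,
					struct perf_coord link,
					struct perf_coord genport)
{
	struct perf_coord out;

	out.latency = cdat.latency + link.latency + genport.latency;
	out.bandwidth = cdat.bandwidth;
	if (link.bandwidth < out.bandwidth)
		out.bandwidth = link.bandwidth;
	if (genport.bandwidth < out.bandwidth)
		out.bandwidth = genport.bandwidth;
	return out;
}
```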

If a configuration has a switch in between, then the latency would be the aggregate of
the latencies from the device CDAT, the link latency between the device and the switch,
the latency from the switch CDAT, the link latency between the switch and the HB, and
the generic port latency between the CPU and the CXL HB. The bandwidth calculation would
be the minimum of the device CDAT bandwidth, the link bandwidth between the device and
the switch, the switch CDAT bandwidth, the link bandwidth between the switch and the HB,
and the generic port bandwidth.

There can be 0 or more switches between the CXL device and the CXL HB. There are detailed
examples on calculating bandwidth and latency in the CXL Memory Device Software Guide [4].
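Whatever the hop count, the rule stays the same: latencies add, and bandwidth is the minimum across hops. A minimal C sketch of that walk (types and names invented for illustration, not taken from the kernel code):

```c
#include <stddef.h>
#include <limits.h>

/* Hypothetical hop record; each hop is a CDAT-described component
 * (device or switch), a PCIe link, or the generic port segment. */
struct hop_perf {
	unsigned int latency;	/* e.g. picoseconds */
	unsigned int bandwidth;	/* e.g. MB/s */
};

/* Walk a path with 0..N switches: sum latencies, take min bandwidth. */
static struct hop_perf path_perf(const struct hop_perf *hops, size_t nr)
{
	struct hop_perf total = { 0, UINT_MAX };
	size_t i;

	for (i = 0; i < nr; i++) {
		total.latency += hops[i].latency;
		if (hops[i].bandwidth < total.bandwidth)
			total.bandwidth = hops[i].bandwidth;
	}
	return total;
}
```

With one switch, the hop array would be: device CDAT, device-to-switch link, switch CDAT, switch-to-HB link, generic port.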

The CDAT provides Device Scoped Memory Affinity Structures (DSMAS) that contain the
Device Physical Address (DPA) range, and the related Device Scoped Latency and Bandwidth
Information Structures (DSLBIS). Each DSLBIS provides a latency or bandwidth entry that
is tied to a DSMAS entry via a unique per-DSMAS handle.
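The handle matching can be illustrated with a pared-down sketch. The field names here are invented for the example; the real subtable layouts come from the CDAT specification:

```c
#include <stddef.h>

/* Hypothetical, pared-down view of a DSMAS subtable entry: a handle
 * plus the DPA range it describes. */
struct dsmas_entry {
	unsigned char handle;
	unsigned long long dpa_base;
	unsigned long long dpa_len;
};

/* Hypothetical, pared-down view of a DSLBIS subtable entry: it points
 * back at one DSMAS via the same handle value. */
struct dslbis_entry {
	unsigned char handle;
	unsigned int value;	/* one latency or bandwidth figure */
};

/* Find the DSMAS entry a given DSLBIS entry applies to. */
static const struct dsmas_entry *
dsmas_for_dslbis(const struct dsmas_entry *dsmas, size_t nr,
		 const struct dslbis_entry *dslbis)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (dsmas[i].handle == dslbis->handle)
			return &dsmas[i];
	return NULL;
}
```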

Test setup is done with runqemu genport support branch [6]. The setup provides 2 CXL HBs
with one HB having a CXL switch underneath. It also provides generic port support detailed
below.

A hacked up qemu branch is used to support generic port SRAT and HMAT [7].

To create the appropriate HMAT entries for a generic port, the following qemu parameters
must be added:

-object genport,id=$X -numa node,genport=genport$X,nodeid=$Y,initiator=$Z
-numa hmat-lb,initiator=$Z,target=$X,hierarchy=memory,data-type=access-latency,latency=$latency
-numa hmat-lb,initiator=$Z,target=$X,hierarchy=memory,data-type=access-bandwidth,bandwidth=$bandwidthM
for ((i = 0; i < total_nodes; i++)); do
	for ((j = 0; j < cxl_hbs; j++ )); do	# 2 CXL HBs
		-numa dist,src=$i,dst=$X,val=$dist
	done
done

See the genport run_qemu branch for full details.

[1]: https://www.computeexpresslink.org/download-the-specification
[2]: https://uefi.org/sites/default/files/resources/Coherent%20Device%20Attribute%20Table_1.01.pdf
[3]: https://uefi.org/sites/default/files/resources/ACPI_Spec_6_5_Aug29.pdf
[4]: https://cdrdv2-public.intel.com/643805/643805_CXL%20Memory%20Device%20SW%20Guide_Rev1p0.pdf
[5]: https://lore.kernel.org/linux-cxl/20230313195530.GA1532686@bhelgaas/T/#t
[6]: https://git.kernel.org/pub/scm/linux/kernel/git/djiang/linux.git/log/?h=cxl-qtg
[7]: https://github.com/pmem/run_qemu/tree/djiang/genport
[8]: https://github.com/davejiang/qemu/tree/genport

---

base-commit: 178e1ea6a68f12967ee0e9afc4d79a2939acd43c

---
Dave Jiang (22):
      cxl: Export QTG ids from CFMWS to sysfs as qos_class attribute
      cxl: Add checksum verification to CDAT from CXL
      cxl: Add support for reading CXL switch CDAT table
      acpi: Move common tables helper functions to common lib
      lib/firmware_table: tables: Add CDAT table parsing support
      base/node / acpi: Change 'node_hmem_attrs' to 'access_coordinates'
      acpi: numa: Create enum for memory_target access coordinates indexing
      acpi: numa: Add genport target allocation to the HMAT parsing
      acpi: Break out nesting for hmat_parse_locality()
      acpi: numa: Add setting of generic port system locality attributes
      acpi: numa: Add helper function to retrieve the performance attributes
      cxl: Add callback to parse the DSMAS subtables from CDAT
      cxl: Add callback to parse the DSLBIS subtable from CDAT
      cxl: Add callback to parse the SSLBIS subtable from CDAT
      cxl: Add support for _DSM Function for retrieving QTG ID
      cxl: Calculate and store PCI link latency for the downstream ports
      cxl: Store the access coordinates for the generic ports
      cxl: Add helper function that calculate performance data for downstream ports
      cxl: Compute the entire CXL path latency and bandwidth data
      cxl: Store QTG IDs and related info to the CXL memory device context
      cxl: Export sysfs attributes for memory device QoS class
      cxl: Check qos_class validity on memdev probe


 Documentation/ABI/testing/sysfs-bus-cxl |  49 +++++
 MAINTAINERS                             |   2 +
 drivers/acpi/Kconfig                    |   1 +
 drivers/acpi/numa/hmat.c                | 172 +++++++++++++---
 drivers/acpi/tables.c                   | 178 +----------------
 drivers/base/node.c                     |  12 +-
 drivers/cxl/Kconfig                     |   1 +
 drivers/cxl/acpi.c                      | 151 +++++++++++++-
 drivers/cxl/core/Makefile               |   1 +
 drivers/cxl/core/cdat.c                 | 249 ++++++++++++++++++++++++
 drivers/cxl/core/mbox.c                 |   1 +
 drivers/cxl/core/memdev.c               |  34 ++++
 drivers/cxl/core/pci.c                  | 125 ++++++++++--
 drivers/cxl/core/port.c                 | 124 +++++++++++-
 drivers/cxl/cxl.h                       |  74 +++++++
 drivers/cxl/cxlmem.h                    |  23 +++
 drivers/cxl/cxlpci.h                    |  15 ++
 drivers/cxl/mem.c                       |  68 +++++++
 drivers/cxl/port.c                      | 109 +++++++++++
 include/linux/acpi.h                    |  54 +++--
 include/linux/fw_table.h                |  52 +++++
 include/linux/node.h                    |   8 +-
 lib/Kconfig                             |   3 +
 lib/Makefile                            |   2 +
 lib/fw_table.c                          | 237 ++++++++++++++++++++++
 25 files changed, 1478 insertions(+), 267 deletions(-)
 create mode 100644 drivers/cxl/core/cdat.c
 create mode 100644 include/linux/fw_table.h
 create mode 100644 lib/fw_table.c

--



* [PATCH v10 01/22] cxl: Export QTG ids from CFMWS to sysfs as qos_class attribute
  2023-10-11  1:04 [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
@ 2023-10-11  1:05 ` Dave Jiang
  2023-10-11  1:05 ` [PATCH v10 02/22] cxl: Add checksum verification to CDAT from CXL Dave Jiang
                   ` (21 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Dave Jiang @ 2023-10-11  1:05 UTC (permalink / raw)
  To: linux-cxl
  Cc: Davidlohr Bueso, Ira Weiny, Jonathan Cameron, dan.j.williams,
	ira.weiny, vishal.l.verma, alison.schofield, Jonathan.Cameron,
	dave

Export the QoS Throttling Group ID from the CXL Fixed Memory Window
Structure (CFMWS) under the root decoder sysfs attributes as qos_class.

CXL rev3.0 9.17.1.3 CXL Fixed Memory Window Structure (CFMWS)

The cxl cli will use this ID to match against the _DSM-retrieved ID for the
DPA memory range of a hot-plugged CXL memory device, to make sure that the
DPA range is placed under the right CFMWS window.

Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 Documentation/ABI/testing/sysfs-bus-cxl |   15 +++++++++++++++
 drivers/cxl/acpi.c                      |    3 +++
 drivers/cxl/core/port.c                 |   11 +++++++++++
 drivers/cxl/cxl.h                       |    3 +++
 4 files changed, 32 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
index 087f762ebfd5..44ffbbb36654 100644
--- a/Documentation/ABI/testing/sysfs-bus-cxl
+++ b/Documentation/ABI/testing/sysfs-bus-cxl
@@ -369,6 +369,21 @@ Description:
 		provided it is currently idle / not bound to a driver.
 
 
+What:		/sys/bus/cxl/devices/decoderX.Y/qos_class
+Date:		May, 2023
+KernelVersion:	v6.5
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		(RO) For CXL host platforms that support "QoS Telemetry" this
+		root-decoder-only attribute conveys a platform specific cookie
+		that identifies a QoS performance class for the CXL Window.
+		This class-id can be compared against a similar "qos_class"
+		published for each memory-type that an endpoint supports. While
+		it is not required that endpoints map their local memory-class
+		to a matching platform class, mismatches are not recommended and
+		there are platform specific side-effects that may result.
+
+
 What:		/sys/bus/cxl/devices/regionZ/uuid
 Date:		May, 2022
 KernelVersion:	v6.0
diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index 40d055560e52..2034eb4ce83f 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -289,6 +289,9 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
 			}
 		}
 	}
+
+	cxlrd->qos_class = cfmws->qtg_id;
+
 	rc = cxl_decoder_add(cxld, target_map);
 err_xormap:
 	if (rc)
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 7ca01a834e18..fc31ff8954c0 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -278,6 +278,15 @@ static ssize_t interleave_ways_show(struct device *dev,
 
 static DEVICE_ATTR_RO(interleave_ways);
 
+static ssize_t qos_class_show(struct device *dev,
+			      struct device_attribute *attr, char *buf)
+{
+	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(dev);
+
+	return sysfs_emit(buf, "%d\n", cxlrd->qos_class);
+}
+static DEVICE_ATTR_RO(qos_class);
+
 static struct attribute *cxl_decoder_base_attrs[] = {
 	&dev_attr_start.attr,
 	&dev_attr_size.attr,
@@ -297,6 +306,7 @@ static struct attribute *cxl_decoder_root_attrs[] = {
 	&dev_attr_cap_type2.attr,
 	&dev_attr_cap_type3.attr,
 	&dev_attr_target_list.attr,
+	&dev_attr_qos_class.attr,
 	SET_CXL_REGION_ATTR(create_pmem_region)
 	SET_CXL_REGION_ATTR(create_ram_region)
 	SET_CXL_REGION_ATTR(delete_region)
@@ -1691,6 +1701,7 @@ struct cxl_root_decoder *cxl_root_decoder_alloc(struct cxl_port *port,
 	}
 
 	atomic_set(&cxlrd->region_id, rc);
+	cxlrd->qos_class = CXL_QOS_CLASS_INVALID;
 	return cxlrd;
 }
 EXPORT_SYMBOL_NS_GPL(cxl_root_decoder_alloc, CXL);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 76d92561af29..eb7924648cb0 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -321,6 +321,7 @@ enum cxl_decoder_type {
  */
 #define CXL_DECODER_MAX_INTERLEAVE 16
 
+#define CXL_QOS_CLASS_INVALID -1
 
 /**
  * struct cxl_decoder - Common CXL HDM Decoder Attributes
@@ -432,6 +433,7 @@ typedef struct cxl_dport *(*cxl_calc_hb_fn)(struct cxl_root_decoder *cxlrd,
  * @calc_hb: which host bridge covers the n'th position by granularity
  * @platform_data: platform specific configuration data
  * @range_lock: sync region autodiscovery by address range
+ * @qos_class: QoS performance class cookie
  * @cxlsd: base cxl switch decoder
  */
 struct cxl_root_decoder {
@@ -440,6 +442,7 @@ struct cxl_root_decoder {
 	cxl_calc_hb_fn calc_hb;
 	void *platform_data;
 	struct mutex range_lock;
+	int qos_class;
 	struct cxl_switch_decoder cxlsd;
 };
 




* [PATCH v10 02/22] cxl: Add checksum verification to CDAT from CXL
  2023-10-11  1:04 [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
  2023-10-11  1:05 ` [PATCH v10 01/22] cxl: Export QTG ids from CFMWS to sysfs as qos_class attribute Dave Jiang
@ 2023-10-11  1:05 ` Dave Jiang
  2023-10-11  1:05 ` [PATCH v10 03/22] cxl: Add support for reading CXL switch CDAT table Dave Jiang
                   ` (20 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Dave Jiang @ 2023-10-11  1:05 UTC (permalink / raw)
  To: linux-cxl
  Cc: Ira Weiny, Jonathan Cameron, Davidlohr Bueso, dan.j.williams,
	ira.weiny, vishal.l.verma, alison.schofield, Jonathan.Cameron,
	dave

A CDAT table is available from a CXL device. The table is read by the
driver and cached in software. With the CXL subsystem needing to parse the
CDAT table, the checksum should be verified. Add checksum verification
after the CDAT table is read from the device.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/cxl/core/pci.c |   30 +++++++++++++++++++++++-------
 1 file changed, 23 insertions(+), 7 deletions(-)

diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index c7a7887ebdcf..9321a3865c30 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -595,6 +595,16 @@ static int cxl_cdat_read_table(struct device *dev,
 	return 0;
 }
 
+static unsigned char cdat_checksum(void *buf, size_t size)
+{
+	unsigned char sum, *data = buf;
+	size_t i;
+
+	for (sum = 0, i = 0; i < size; i++)
+		sum += data[i];
+	return sum;
+}
+
 /**
  * read_cdat_data - Read the CDAT data on this port
  * @port: Port to read data from
@@ -634,15 +644,21 @@ void read_cdat_data(struct cxl_port *port)
 		return;
 
 	rc = cxl_cdat_read_table(dev, cdat_doe, cdat_table, &cdat_length);
-	if (rc) {
-		/* Don't leave table data allocated on error */
-		devm_kfree(dev, cdat_table);
-		dev_err(dev, "CDAT data read error\n");
-		return;
-	}
+	if (rc)
+		goto err;
 
-	port->cdat.table = cdat_table + sizeof(__le32);
+	cdat_table = cdat_table + sizeof(__le32);
+	if (cdat_checksum(cdat_table, cdat_length))
+		goto err;
+
+	port->cdat.table = cdat_table;
 	port->cdat.length = cdat_length;
+	return;
+
+err:
+	/* Don't leave table data allocated on error */
+	devm_kfree(dev, cdat_table);
+	dev_err(dev, "Failed to read/validate CDAT.\n");
 }
 EXPORT_SYMBOL_NS_GPL(read_cdat_data, CXL);
 




* [PATCH v10 03/22] cxl: Add support for reading CXL switch CDAT table
  2023-10-11  1:04 [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
  2023-10-11  1:05 ` [PATCH v10 01/22] cxl: Export QTG ids from CFMWS to sysfs as qos_class attribute Dave Jiang
  2023-10-11  1:05 ` [PATCH v10 02/22] cxl: Add checksum verification to CDAT from CXL Dave Jiang
@ 2023-10-11  1:05 ` Dave Jiang
  2023-10-11  1:05 ` [PATCH v10 04/22] acpi: Move common tables helper functions to common lib Dave Jiang
                   ` (19 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Dave Jiang @ 2023-10-11  1:05 UTC (permalink / raw)
  To: linux-cxl
  Cc: Davidlohr Bueso, Ira Weiny, Jonathan Cameron, dan.j.williams,
	ira.weiny, vishal.l.verma, alison.schofield, Jonathan.Cameron,
	dave

Add a read_cdat_data() call in cxl_switch_port_probe() to allow reading
of CDAT data for CXL switches. read_cdat_data() needs to be adjusted to
retrieve the backing PCIe device differently depending on whether the
passed-in port is an endpoint or a switch.

Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/cxl/core/pci.c |   22 +++++++++++++++++-----
 drivers/cxl/port.c     |    3 +++
 2 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index 9321a3865c30..c86bcc181a5d 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -613,18 +613,30 @@ static unsigned char cdat_checksum(void *buf, size_t size)
  */
 void read_cdat_data(struct cxl_port *port)
 {
-	struct cxl_memdev *cxlmd = to_cxl_memdev(port->uport_dev);
-	struct device *host = cxlmd->dev.parent;
+	struct device *uport = port->uport_dev;
 	struct device *dev = &port->dev;
 	struct pci_doe_mb *cdat_doe;
+	struct pci_dev *pdev = NULL;
+	struct cxl_memdev *cxlmd;
 	size_t cdat_length;
 	void *cdat_table;
 	int rc;
 
-	if (!dev_is_pci(host))
+	if (is_cxl_memdev(uport)) {
+		struct device *host;
+
+		cxlmd = to_cxl_memdev(uport);
+		host = cxlmd->dev.parent;
+		if (dev_is_pci(host))
+			pdev = to_pci_dev(host);
+	} else if (dev_is_pci(uport)) {
+		pdev = to_pci_dev(uport);
+	}
+
+	if (!pdev)
 		return;
-	cdat_doe = pci_find_doe_mailbox(to_pci_dev(host),
-					PCI_DVSEC_VENDOR_ID_CXL,
+
+	cdat_doe = pci_find_doe_mailbox(pdev, PCI_DVSEC_VENDOR_ID_CXL,
 					CXL_DOE_PROTOCOL_TABLE_ACCESS);
 	if (!cdat_doe) {
 		dev_dbg(dev, "No CDAT mailbox\n");
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index 6240e05b9542..47bc8e0b8590 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -62,6 +62,9 @@ static int cxl_switch_port_probe(struct cxl_port *port)
 	struct cxl_hdm *cxlhdm;
 	int rc;
 
+	/* Cache the data early to ensure is_visible() works */
+	read_cdat_data(port);
+
 	rc = devm_cxl_port_enumerate_dports(port);
 	if (rc < 0)
 		return rc;




* [PATCH v10 04/22] acpi: Move common tables helper functions to common lib
  2023-10-11  1:04 [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (2 preceding siblings ...)
  2023-10-11  1:05 ` [PATCH v10 03/22] cxl: Add support for reading CXL switch CDAT table Dave Jiang
@ 2023-10-11  1:05 ` Dave Jiang
  2023-10-11 12:55   ` Jonathan Cameron
  2023-10-11  1:05 ` [PATCH v10 05/22] lib/firmware_table: tables: Add CDAT table parsing support Dave Jiang
                   ` (18 subsequent siblings)
  22 siblings, 1 reply; 36+ messages in thread
From: Dave Jiang @ 2023-10-11  1:05 UTC (permalink / raw)
  To: linux-cxl
  Cc: Rafael J. Wysocki, Hanjun Guo, Rafael J. Wysocki, dan.j.williams,
	ira.weiny, vishal.l.verma, alison.schofield, Jonathan.Cameron,
	dave

Some of the routines in the ACPI driver/acpi/tables.c can be shared with
CDAT parsing. CDAT is a device-provided data structure that is formatted
similarly to a platform-provided ACPI table. CDAT is used by CXL and can
exist on platforms that do not use ACPI. Split out the common routines
from ACPI to accommodate platforms that do not support ACPI, and move them
to /lib. The common routines can be built outside of ACPI if
FIRMWARE_TABLE is selected.

Link: https://lore.kernel.org/linux-cxl/CAJZ5v0jipbtTNnsA0-o5ozOk8ZgWnOg34m34a9pPenTyRLj=6A@mail.gmail.com/
Suggested-by: Rafael J. Wysocki <rafael@kernel.org>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 MAINTAINERS              |    2 
 drivers/acpi/Kconfig     |    1 
 drivers/acpi/tables.c    |  173 ------------------------------------------
 include/linux/acpi.h     |   42 +++-------
 include/linux/fw_table.h |   43 ++++++++++
 lib/Kconfig              |    3 +
 lib/Makefile             |    2 
 lib/fw_table.c           |  189 ++++++++++++++++++++++++++++++++++++++++++++++
 8 files changed, 251 insertions(+), 204 deletions(-)
 create mode 100644 include/linux/fw_table.h
 create mode 100644 lib/fw_table.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 35977b269d5e..c6b08ec50109 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -294,6 +294,8 @@ F:	drivers/pnp/pnpacpi/
 F:	include/acpi/
 F:	include/linux/acpi.h
 F:	include/linux/fwnode.h
+F:	include/linux/fw_table.h
+F:	lib/fw_table.c
 F:	tools/power/acpi/
 
 ACPI APEI
diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index cee82b473dc5..c0a2d43f8451 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -12,6 +12,7 @@ menuconfig ACPI
 	select PNP
 	select NLS
 	select CRC32
+	select FIRMWARE_TABLE
 	default y if X86
 	help
 	  Advanced Configuration and Power Interface (ACPI) support for 
diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c
index 8ab0a82b4da4..c1516337f668 100644
--- a/drivers/acpi/tables.c
+++ b/drivers/acpi/tables.c
@@ -37,18 +37,6 @@ static struct acpi_table_desc initial_tables[ACPI_MAX_TABLES] __initdata;
 
 static int acpi_apic_instance __initdata_or_acpilib;
 
-enum acpi_subtable_type {
-	ACPI_SUBTABLE_COMMON,
-	ACPI_SUBTABLE_HMAT,
-	ACPI_SUBTABLE_PRMT,
-	ACPI_SUBTABLE_CEDT,
-};
-
-struct acpi_subtable_entry {
-	union acpi_subtable_headers *hdr;
-	enum acpi_subtable_type type;
-};
-
 /*
  * Disable table checksum verification for the early stage due to the size
  * limitation of the current x86 early mapping implementation.
@@ -237,167 +225,6 @@ void acpi_table_print_madt_entry(struct acpi_subtable_header *header)
 	}
 }
 
-static unsigned long __init_or_acpilib
-acpi_get_entry_type(struct acpi_subtable_entry *entry)
-{
-	switch (entry->type) {
-	case ACPI_SUBTABLE_COMMON:
-		return entry->hdr->common.type;
-	case ACPI_SUBTABLE_HMAT:
-		return entry->hdr->hmat.type;
-	case ACPI_SUBTABLE_PRMT:
-		return 0;
-	case ACPI_SUBTABLE_CEDT:
-		return entry->hdr->cedt.type;
-	}
-	return 0;
-}
-
-static unsigned long __init_or_acpilib
-acpi_get_entry_length(struct acpi_subtable_entry *entry)
-{
-	switch (entry->type) {
-	case ACPI_SUBTABLE_COMMON:
-		return entry->hdr->common.length;
-	case ACPI_SUBTABLE_HMAT:
-		return entry->hdr->hmat.length;
-	case ACPI_SUBTABLE_PRMT:
-		return entry->hdr->prmt.length;
-	case ACPI_SUBTABLE_CEDT:
-		return entry->hdr->cedt.length;
-	}
-	return 0;
-}
-
-static unsigned long __init_or_acpilib
-acpi_get_subtable_header_length(struct acpi_subtable_entry *entry)
-{
-	switch (entry->type) {
-	case ACPI_SUBTABLE_COMMON:
-		return sizeof(entry->hdr->common);
-	case ACPI_SUBTABLE_HMAT:
-		return sizeof(entry->hdr->hmat);
-	case ACPI_SUBTABLE_PRMT:
-		return sizeof(entry->hdr->prmt);
-	case ACPI_SUBTABLE_CEDT:
-		return sizeof(entry->hdr->cedt);
-	}
-	return 0;
-}
-
-static enum acpi_subtable_type __init_or_acpilib
-acpi_get_subtable_type(char *id)
-{
-	if (strncmp(id, ACPI_SIG_HMAT, 4) == 0)
-		return ACPI_SUBTABLE_HMAT;
-	if (strncmp(id, ACPI_SIG_PRMT, 4) == 0)
-		return ACPI_SUBTABLE_PRMT;
-	if (strncmp(id, ACPI_SIG_CEDT, 4) == 0)
-		return ACPI_SUBTABLE_CEDT;
-	return ACPI_SUBTABLE_COMMON;
-}
-
-static __init_or_acpilib bool has_handler(struct acpi_subtable_proc *proc)
-{
-	return proc->handler || proc->handler_arg;
-}
-
-static __init_or_acpilib int call_handler(struct acpi_subtable_proc *proc,
-					  union acpi_subtable_headers *hdr,
-					  unsigned long end)
-{
-	if (proc->handler)
-		return proc->handler(hdr, end);
-	if (proc->handler_arg)
-		return proc->handler_arg(hdr, proc->arg, end);
-	return -EINVAL;
-}
-
-/**
- * acpi_parse_entries_array - for each proc_num find a suitable subtable
- *
- * @id: table id (for debugging purposes)
- * @table_size: size of the root table
- * @table_header: where does the table start?
- * @proc: array of acpi_subtable_proc struct containing entry id
- *        and associated handler with it
- * @proc_num: how big proc is?
- * @max_entries: how many entries can we process?
- *
- * For each proc_num find a subtable with proc->id and run proc->handler
- * on it. Assumption is that there's only single handler for particular
- * entry id.
- *
- * The table_size is not the size of the complete ACPI table (the length
- * field in the header struct), but only the size of the root table; i.e.,
- * the offset from the very first byte of the complete ACPI table, to the
- * first byte of the very first subtable.
- *
- * On success returns sum of all matching entries for all proc handlers.
- * Otherwise, -ENODEV or -EINVAL is returned.
- */
-static int __init_or_acpilib acpi_parse_entries_array(
-	char *id, unsigned long table_size,
-	struct acpi_table_header *table_header, struct acpi_subtable_proc *proc,
-	int proc_num, unsigned int max_entries)
-{
-	struct acpi_subtable_entry entry;
-	unsigned long table_end, subtable_len, entry_len;
-	int count = 0;
-	int errs = 0;
-	int i;
-
-	table_end = (unsigned long)table_header + table_header->length;
-
-	/* Parse all entries looking for a match. */
-
-	entry.type = acpi_get_subtable_type(id);
-	entry.hdr = (union acpi_subtable_headers *)
-	    ((unsigned long)table_header + table_size);
-	subtable_len = acpi_get_subtable_header_length(&entry);
-
-	while (((unsigned long)entry.hdr) + subtable_len  < table_end) {
-		if (max_entries && count >= max_entries)
-			break;
-
-		for (i = 0; i < proc_num; i++) {
-			if (acpi_get_entry_type(&entry) != proc[i].id)
-				continue;
-			if (!has_handler(&proc[i]) ||
-			    (!errs &&
-			     call_handler(&proc[i], entry.hdr, table_end))) {
-				errs++;
-				continue;
-			}
-
-			proc[i].count++;
-			break;
-		}
-		if (i != proc_num)
-			count++;
-
-		/*
-		 * If entry->length is 0, break from this loop to avoid
-		 * infinite loop.
-		 */
-		entry_len = acpi_get_entry_length(&entry);
-		if (entry_len == 0) {
-			pr_err("[%4.4s:0x%02x] Invalid zero length\n", id, proc->id);
-			return -EINVAL;
-		}
-
-		entry.hdr = (union acpi_subtable_headers *)
-		    ((unsigned long)entry.hdr + entry_len);
-	}
-
-	if (max_entries && count > max_entries) {
-		pr_warn("[%4.4s:0x%02x] found the maximum %i entries\n",
-			id, proc->id, count);
-	}
-
-	return errs ? -EINVAL : count;
-}
-
 int __init_or_acpilib acpi_table_parse_entries_array(
 	char *id, unsigned long table_size, struct acpi_subtable_proc *proc,
 	int proc_num, unsigned int max_entries)
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index a73246c3c35e..0d334975c0c9 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -15,6 +15,7 @@
 #include <linux/mod_devicetable.h>
 #include <linux/property.h>
 #include <linux/uuid.h>
+#include <linux/fw_table.h>
 
 struct irq_domain;
 struct irq_domain_ops;
@@ -24,6 +25,16 @@ struct irq_domain_ops;
 #endif
 #include <acpi/acpi.h>
 
+#ifdef CONFIG_ACPI_TABLE_LIB
+#define EXPORT_SYMBOL_ACPI_LIB(x) EXPORT_SYMBOL_NS_GPL(x, ACPI)
+#define __init_or_acpilib
+#define __initdata_or_acpilib
+#else
+#define EXPORT_SYMBOL_ACPI_LIB(x)
+#define __init_or_acpilib __init
+#define __initdata_or_acpilib __initdata
+#endif
+
 #ifdef	CONFIG_ACPI
 
 #include <linux/list.h>
@@ -119,21 +130,8 @@ enum acpi_address_range_id {
 
 
 /* Table Handlers */
-union acpi_subtable_headers {
-	struct acpi_subtable_header common;
-	struct acpi_hmat_structure hmat;
-	struct acpi_prmt_module_header prmt;
-	struct acpi_cedt_header cedt;
-};
-
 typedef int (*acpi_tbl_table_handler)(struct acpi_table_header *table);
 
-typedef int (*acpi_tbl_entry_handler)(union acpi_subtable_headers *header,
-				      const unsigned long end);
-
-typedef int (*acpi_tbl_entry_handler_arg)(union acpi_subtable_headers *header,
-					  void *arg, const unsigned long end);
-
 /* Debugger support */
 
 struct acpi_debugger_ops {
@@ -207,14 +205,6 @@ static inline int acpi_debugger_notify_command_complete(void)
 		(!entry) || (unsigned long)entry + sizeof(*entry) > end ||  \
 		((struct acpi_subtable_header *)entry)->length < sizeof(*entry))
 
-struct acpi_subtable_proc {
-	int id;
-	acpi_tbl_entry_handler handler;
-	acpi_tbl_entry_handler_arg handler_arg;
-	void *arg;
-	int count;
-};
-
 void __iomem *__acpi_map_table(unsigned long phys, unsigned long size);
 void __acpi_unmap_table(void __iomem *map, unsigned long size);
 int early_acpi_boot_init(void);
@@ -229,16 +219,6 @@ void acpi_reserve_initial_tables (void);
 void acpi_table_init_complete (void);
 int acpi_table_init (void);
 
-#ifdef CONFIG_ACPI_TABLE_LIB
-#define EXPORT_SYMBOL_ACPI_LIB(x) EXPORT_SYMBOL_NS_GPL(x, ACPI)
-#define __init_or_acpilib
-#define __initdata_or_acpilib
-#else
-#define EXPORT_SYMBOL_ACPI_LIB(x)
-#define __init_or_acpilib __init
-#define __initdata_or_acpilib __initdata
-#endif
-
 int acpi_table_parse(char *id, acpi_tbl_table_handler handler);
 int __init_or_acpilib acpi_table_parse_entries(char *id,
 		unsigned long table_size, int entry_id,
diff --git a/include/linux/fw_table.h b/include/linux/fw_table.h
new file mode 100644
index 000000000000..ff8fa58d5818
--- /dev/null
+++ b/include/linux/fw_table.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ *  fw_tables.h - Parsing support for ACPI and ACPI-like tables provided by
+ *                platform or device firmware
+ *
+ *  Copyright (C) 2001 Paul Diefenbaugh <paul.s.diefenbaugh@intel.com>
+ *  Copyright (C) 2023 Intel Corp.
+ */
+#ifndef _FW_TABLE_H_
+#define _FW_TABLE_H_
+
+union acpi_subtable_headers;
+
+typedef int (*acpi_tbl_entry_handler)(union acpi_subtable_headers *header,
+				      const unsigned long end);
+
+typedef int (*acpi_tbl_entry_handler_arg)(union acpi_subtable_headers *header,
+					  void *arg, const unsigned long end);
+
+struct acpi_subtable_proc {
+	int id;
+	acpi_tbl_entry_handler handler;
+	acpi_tbl_entry_handler_arg handler_arg;
+	void *arg;
+	int count;
+};
+
+#include <linux/acpi.h>
+#include <acpi/acpi.h>
+
+union acpi_subtable_headers {
+	struct acpi_subtable_header common;
+	struct acpi_hmat_structure hmat;
+	struct acpi_prmt_module_header prmt;
+	struct acpi_cedt_header cedt;
+};
+
+int acpi_parse_entries_array(char *id, unsigned long table_size,
+			     struct acpi_table_header *table_header,
+			     struct acpi_subtable_proc *proc,
+			     int proc_num, unsigned int max_entries);
+
+#endif
diff --git a/lib/Kconfig b/lib/Kconfig
index c686f4adc124..d0474e251da5 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -764,3 +764,6 @@ config ASN1_ENCODER
 
 config POLYNOMIAL
        tristate
+
+config FIRMWARE_TABLE
+	bool
diff --git a/lib/Makefile b/lib/Makefile
index 740109b6e2c8..f113b44bc4e4 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -405,6 +405,8 @@ obj-$(CONFIG_SIPHASH_KUNIT_TEST) += siphash_kunit.o
 
 obj-$(CONFIG_GENERIC_LIB_DEVMEM_IS_ALLOWED) += devmem_is_allowed.o
 
+obj-$(CONFIG_FIRMWARE_TABLE) += fw_table.o
+
 # FORTIFY_SOURCE compile-time behavior tests
 TEST_FORTIFY_SRCS = $(wildcard $(srctree)/$(src)/test_fortify/*-*.c)
 TEST_FORTIFY_LOGS = $(patsubst $(srctree)/$(src)/%.c, %.log, $(TEST_FORTIFY_SRCS))
diff --git a/lib/fw_table.c b/lib/fw_table.c
new file mode 100644
index 000000000000..e84bd0866e10
--- /dev/null
+++ b/lib/fw_table.c
@@ -0,0 +1,189 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ *  fw_tables.c - Parsing support for ACPI and ACPI-like tables provided by
+ *                platform or device firmware
+ *
+ *  Copyright (C) 2001 Paul Diefenbaugh <paul.s.diefenbaugh@intel.com>
+ *  Copyright (C) 2023 Intel Corp.
+ */
+#include <linux/errno.h>
+#include <linux/fw_table.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/types.h>
+
+enum acpi_subtable_type {
+	ACPI_SUBTABLE_COMMON,
+	ACPI_SUBTABLE_HMAT,
+	ACPI_SUBTABLE_PRMT,
+	ACPI_SUBTABLE_CEDT,
+};
+
+struct acpi_subtable_entry {
+	union acpi_subtable_headers *hdr;
+	enum acpi_subtable_type type;
+};
+
+static unsigned long __init_or_acpilib
+acpi_get_entry_type(struct acpi_subtable_entry *entry)
+{
+	switch (entry->type) {
+	case ACPI_SUBTABLE_COMMON:
+		return entry->hdr->common.type;
+	case ACPI_SUBTABLE_HMAT:
+		return entry->hdr->hmat.type;
+	case ACPI_SUBTABLE_PRMT:
+		return 0;
+	case ACPI_SUBTABLE_CEDT:
+		return entry->hdr->cedt.type;
+	}
+	return 0;
+}
+
+static unsigned long __init_or_acpilib
+acpi_get_entry_length(struct acpi_subtable_entry *entry)
+{
+	switch (entry->type) {
+	case ACPI_SUBTABLE_COMMON:
+		return entry->hdr->common.length;
+	case ACPI_SUBTABLE_HMAT:
+		return entry->hdr->hmat.length;
+	case ACPI_SUBTABLE_PRMT:
+		return entry->hdr->prmt.length;
+	case ACPI_SUBTABLE_CEDT:
+		return entry->hdr->cedt.length;
+	}
+	return 0;
+}
+
+static unsigned long __init_or_acpilib
+acpi_get_subtable_header_length(struct acpi_subtable_entry *entry)
+{
+	switch (entry->type) {
+	case ACPI_SUBTABLE_COMMON:
+		return sizeof(entry->hdr->common);
+	case ACPI_SUBTABLE_HMAT:
+		return sizeof(entry->hdr->hmat);
+	case ACPI_SUBTABLE_PRMT:
+		return sizeof(entry->hdr->prmt);
+	case ACPI_SUBTABLE_CEDT:
+		return sizeof(entry->hdr->cedt);
+	}
+	return 0;
+}
+
+static enum acpi_subtable_type __init_or_acpilib
+acpi_get_subtable_type(char *id)
+{
+	if (strncmp(id, ACPI_SIG_HMAT, 4) == 0)
+		return ACPI_SUBTABLE_HMAT;
+	if (strncmp(id, ACPI_SIG_PRMT, 4) == 0)
+		return ACPI_SUBTABLE_PRMT;
+	if (strncmp(id, ACPI_SIG_CEDT, 4) == 0)
+		return ACPI_SUBTABLE_CEDT;
+	return ACPI_SUBTABLE_COMMON;
+}
+
+static __init_or_acpilib bool has_handler(struct acpi_subtable_proc *proc)
+{
+	return proc->handler || proc->handler_arg;
+}
+
+static __init_or_acpilib int call_handler(struct acpi_subtable_proc *proc,
+					  union acpi_subtable_headers *hdr,
+					  unsigned long end)
+{
+	if (proc->handler)
+		return proc->handler(hdr, end);
+	if (proc->handler_arg)
+		return proc->handler_arg(hdr, proc->arg, end);
+	return -EINVAL;
+}
+
+/**
+ * acpi_parse_entries_array - for each proc_num find a suitable subtable
+ *
+ * @id: table id (for debugging purposes)
+ * @table_size: size of the root table
+ * @table_header: where does the table start?
+ * @proc: array of acpi_subtable_proc struct containing entry id
+ *        and associated handler with it
+ * @proc_num: how big proc is?
+ * @max_entries: how many entries can we process?
+ *
+ * For each proc_num find a subtable with proc->id and run proc->handler
+ * on it. The assumption is that there is only a single handler for a
+ * particular entry id.
+ *
+ * The table_size is not the size of the complete ACPI table (the length
+ * field in the header struct), but only the size of the root table; i.e.,
+ * the offset from the very first byte of the complete ACPI table, to the
+ * first byte of the very first subtable.
+ *
+ * On success returns sum of all matching entries for all proc handlers.
+ * Otherwise, -ENODEV or -EINVAL is returned.
+ */
+int __init_or_acpilib
+acpi_parse_entries_array(char *id, unsigned long table_size,
+			 struct acpi_table_header *table_header,
+			 struct acpi_subtable_proc *proc,
+			 int proc_num, unsigned int max_entries)
+{
+	unsigned long table_end, subtable_len, entry_len;
+	struct acpi_subtable_entry entry;
+	int count = 0;
+	int errs = 0;
+	int i;
+
+	table_end = (unsigned long)table_header + table_header->length;
+
+	/* Parse all entries looking for a match. */
+
+	entry.type = acpi_get_subtable_type(id);
+	entry.hdr = (union acpi_subtable_headers *)
+	    ((unsigned long)table_header + table_size);
+	subtable_len = acpi_get_subtable_header_length(&entry);
+
+	while (((unsigned long)entry.hdr) + subtable_len  < table_end) {
+		if (max_entries && count >= max_entries)
+			break;
+
+		for (i = 0; i < proc_num; i++) {
+			if (acpi_get_entry_type(&entry) != proc[i].id)
+				continue;
+			if (!has_handler(&proc[i]) ||
+			    (!errs &&
+			     call_handler(&proc[i], entry.hdr, table_end))) {
+				errs++;
+				continue;
+			}
+
+			proc[i].count++;
+			break;
+		}
+		if (i != proc_num)
+			count++;
+
+		/*
+		 * If entry->length is 0, break from this loop to avoid
+		 * infinite loop.
+		 */
+		entry_len = acpi_get_entry_length(&entry);
+		if (entry_len == 0) {
+			pr_err("[%4.4s:0x%02x] Invalid zero length\n", id, proc->id);
+			return -EINVAL;
+		}
+
+		entry.hdr = (union acpi_subtable_headers *)
+		    ((unsigned long)entry.hdr + entry_len);
+	}
+
+	if (max_entries && count > max_entries) {
+		pr_warn("[%4.4s:0x%02x] found the maximum %i entries\n",
+			id, proc->id, count);
+	}
+
+	return errs ? -EINVAL : count;
+}
+EXPORT_SYMBOL_GPL(acpi_parse_entries_array);



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v10 05/22] lib/firmware_table: tables: Add CDAT table parsing support
  2023-10-11  1:04 [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (3 preceding siblings ...)
  2023-10-11  1:05 ` [PATCH v10 04/22] acpi: Move common tables helper functions to common lib Dave Jiang
@ 2023-10-11  1:05 ` Dave Jiang
  2023-10-11  1:05 ` [PATCH v10 06/22] base/node / acpi: Change 'node_hmem_attrs' to 'access_coordinates' Dave Jiang
                   ` (17 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Dave Jiang @ 2023-10-11  1:05 UTC (permalink / raw)
  To: linux-cxl
  Cc: Rafael J. Wysocki, Len Brown, Jonathan Cameron,
	Rafael J. Wysocki, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, Jonathan.Cameron, dave

The CDAT table is very similar to ACPI tables when it comes to sub-table
and entry structures. The helper functions can also be used to parse the
CDAT table. Add support to the helper functions to deal with an external
CDAT table, and also handle the endianness since CDAT can be processed by a
BE host. Export a function cdat_table_parse() for the CXL driver to parse
a CDAT table.

In order to minimize ACPICA code changes, __force is utilized to deal
with the case of a big endian (BE) host parsing a CDAT. All CDAT data
structure variables are force-cast to __leX types as appropriate.

Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Len Brown <lenb@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/acpi/tables.c    |    5 +++-
 include/linux/fw_table.h |   11 +++++++++
 lib/fw_table.c           |   54 +++++++++++++++++++++++++++++++++++++++++++---
 3 files changed, 64 insertions(+), 6 deletions(-)

diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c
index c1516337f668..b07f7d091d13 100644
--- a/drivers/acpi/tables.c
+++ b/drivers/acpi/tables.c
@@ -251,8 +251,9 @@ int __init_or_acpilib acpi_table_parse_entries_array(
 		return -ENODEV;
 	}
 
-	count = acpi_parse_entries_array(id, table_size, table_header,
-			proc, proc_num, max_entries);
+	count = acpi_parse_entries_array(id, table_size,
+					 (union fw_table_header *)table_header,
+					 proc, proc_num, max_entries);
 
 	acpi_put_table(table_header);
 	return count;
diff --git a/include/linux/fw_table.h b/include/linux/fw_table.h
index ff8fa58d5818..a97d368df520 100644
--- a/include/linux/fw_table.h
+++ b/include/linux/fw_table.h
@@ -28,16 +28,25 @@ struct acpi_subtable_proc {
 #include <linux/acpi.h>
 #include <acpi/acpi.h>
 
+union fw_table_header {
+	struct acpi_table_header acpi;
+	struct acpi_table_cdat cdat;
+};
+
 union acpi_subtable_headers {
 	struct acpi_subtable_header common;
 	struct acpi_hmat_structure hmat;
 	struct acpi_prmt_module_header prmt;
 	struct acpi_cedt_header cedt;
+	struct acpi_cdat_header cdat;
 };
 
 int acpi_parse_entries_array(char *id, unsigned long table_size,
-			     struct acpi_table_header *table_header,
+			     union fw_table_header *table_header,
 			     struct acpi_subtable_proc *proc,
 			     int proc_num, unsigned int max_entries);
 
+int cdat_table_parse(enum acpi_cdat_type type,
+		     acpi_tbl_entry_handler_arg handler_arg, void *arg,
+		     struct acpi_table_cdat *table_header);
 #endif
diff --git a/lib/fw_table.c b/lib/fw_table.c
index e84bd0866e10..108f1db3f1a6 100644
--- a/lib/fw_table.c
+++ b/lib/fw_table.c
@@ -18,6 +18,7 @@ enum acpi_subtable_type {
 	ACPI_SUBTABLE_HMAT,
 	ACPI_SUBTABLE_PRMT,
 	ACPI_SUBTABLE_CEDT,
+	CDAT_SUBTABLE,
 };
 
 struct acpi_subtable_entry {
@@ -37,6 +38,8 @@ acpi_get_entry_type(struct acpi_subtable_entry *entry)
 		return 0;
 	case ACPI_SUBTABLE_CEDT:
 		return entry->hdr->cedt.type;
+	case CDAT_SUBTABLE:
+		return entry->hdr->cdat.type;
 	}
 	return 0;
 }
@@ -53,6 +56,11 @@ acpi_get_entry_length(struct acpi_subtable_entry *entry)
 		return entry->hdr->prmt.length;
 	case ACPI_SUBTABLE_CEDT:
 		return entry->hdr->cedt.length;
+	case CDAT_SUBTABLE: {
+		__le16 length = (__force __le16)entry->hdr->cdat.length;
+
+		return le16_to_cpu(length);
+	}
 	}
 	return 0;
 }
@@ -69,6 +77,8 @@ acpi_get_subtable_header_length(struct acpi_subtable_entry *entry)
 		return sizeof(entry->hdr->prmt);
 	case ACPI_SUBTABLE_CEDT:
 		return sizeof(entry->hdr->cedt);
+	case CDAT_SUBTABLE:
+		return sizeof(entry->hdr->cdat);
 	}
 	return 0;
 }
@@ -82,9 +92,24 @@ acpi_get_subtable_type(char *id)
 		return ACPI_SUBTABLE_PRMT;
 	if (strncmp(id, ACPI_SIG_CEDT, 4) == 0)
 		return ACPI_SUBTABLE_CEDT;
+	if (strncmp(id, ACPI_SIG_CDAT, 4) == 0)
+		return CDAT_SUBTABLE;
 	return ACPI_SUBTABLE_COMMON;
 }
 
+static unsigned long __init_or_acpilib
+acpi_table_get_length(enum acpi_subtable_type type,
+		      union fw_table_header *header)
+{
+	if (type == CDAT_SUBTABLE) {
+		__le32 length = (__force __le32)header->cdat.length;
+
+		return le32_to_cpu(length);
+	}
+
+	return header->acpi.length;
+}
+
 static __init_or_acpilib bool has_handler(struct acpi_subtable_proc *proc)
 {
 	return proc->handler || proc->handler_arg;
@@ -126,21 +151,24 @@ static __init_or_acpilib int call_handler(struct acpi_subtable_proc *proc,
  */
 int __init_or_acpilib
 acpi_parse_entries_array(char *id, unsigned long table_size,
-			 struct acpi_table_header *table_header,
+			 union fw_table_header *table_header,
 			 struct acpi_subtable_proc *proc,
 			 int proc_num, unsigned int max_entries)
 {
 	unsigned long table_end, subtable_len, entry_len;
 	struct acpi_subtable_entry entry;
+	enum acpi_subtable_type type;
 	int count = 0;
 	int errs = 0;
 	int i;
 
-	table_end = (unsigned long)table_header + table_header->length;
+	type = acpi_get_subtable_type(id);
+	table_end = (unsigned long)table_header +
+		    acpi_table_get_length(type, table_header);
 
 	/* Parse all entries looking for a match. */
 
-	entry.type = acpi_get_subtable_type(id);
+	entry.type = type;
 	entry.hdr = (union acpi_subtable_headers *)
 	    ((unsigned long)table_header + table_size);
 	subtable_len = acpi_get_subtable_header_length(&entry);
@@ -187,3 +215,23 @@ acpi_parse_entries_array(char *id, unsigned long table_size,
 	return errs ? -EINVAL : count;
 }
 EXPORT_SYMBOL_GPL(acpi_parse_entries_array);
+
+int cdat_table_parse(enum acpi_cdat_type type,
+		     acpi_tbl_entry_handler_arg handler_arg, void *arg,
+		     struct acpi_table_cdat *table_header)
+{
+	struct acpi_subtable_proc proc = {
+		.id		= type,
+		.handler_arg	= handler_arg,
+		.arg		= arg,
+	};
+
+	if (!table_header)
+		return -EINVAL;
+
+	return acpi_parse_entries_array(ACPI_SIG_CDAT,
+					sizeof(struct acpi_table_cdat),
+					(union fw_table_header *)table_header,
+					&proc, 1, 0);
+}
+EXPORT_SYMBOL_NS_GPL(cdat_table_parse, CXL);




* [PATCH v10 06/22] base/node / acpi: Change 'node_hmem_attrs' to 'access_coordinates'
  2023-10-11  1:04 [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (4 preceding siblings ...)
  2023-10-11  1:05 ` [PATCH v10 05/22] lib/firmware_table: tables: Add CDAT table parsing support Dave Jiang
@ 2023-10-11  1:05 ` Dave Jiang
  2023-10-11 12:57   ` Jonathan Cameron
  2023-10-11  1:05 ` [PATCH v10 07/22] acpi: numa: Create enum for memory_target access coordinates indexing Dave Jiang
                   ` (16 subsequent siblings)
  22 siblings, 1 reply; 36+ messages in thread
From: Dave Jiang @ 2023-10-11  1:05 UTC (permalink / raw)
  To: linux-cxl
  Cc: Dan Williams, Dan Williams, Greg Kroah-Hartman, dan.j.williams,
	ira.weiny, vishal.l.verma, alison.schofield, Jonathan.Cameron,
	dave

Dan Williams suggested changing the struct 'node_hmem_attrs' to
'access_coordinates' [1]. The struct is a container of r/w-latency and
r/w-bandwidth numbers. Moving forward, this container will also be used by
CXL to store the performance characteristics of each link hop in
the PCIe/CXL topology. So, where node_hmem_attrs covers just the access
parameters of a memory-node, access_coordinates applies more broadly
to hardware topology characteristics. The observation is that this seemed
like an exercise in having the application identify "where" it falls on a
spectrum of bandwidth and latency needs. For the tuple of read/write-latency
and read/write-bandwidth, "coordinates" is not a perfect fit. Sometimes it
is just conveying values in isolation and not a "location" relative to
other performance points, but in the end this data is used to identify the
performance operating point of a given memory-node. [2]

Link: http://lore.kernel.org/r/64471313421f7_1b66294d5@dwillia2-xfh.jf.intel.com.notmuch/
Link: https://lore.kernel.org/linux-cxl/645e6215ee0de_1e6f2945e@dwillia2-xfh.jf.intel.com.notmuch/
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/acpi/numa/hmat.c |   20 ++++++++++----------
 drivers/base/node.c      |   12 ++++++------
 include/linux/node.h     |    8 ++++----
 3 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
index bba268ecd802..f9ff992038fa 100644
--- a/drivers/acpi/numa/hmat.c
+++ b/drivers/acpi/numa/hmat.c
@@ -62,7 +62,7 @@ struct memory_target {
 	unsigned int memory_pxm;
 	unsigned int processor_pxm;
 	struct resource memregions;
-	struct node_hmem_attrs hmem_attrs[2];
+	struct access_coordinate coord[2];
 	struct list_head caches;
 	struct node_cache_attrs cache_attrs;
 	bool registered;
@@ -227,24 +227,24 @@ static void hmat_update_target_access(struct memory_target *target,
 {
 	switch (type) {
 	case ACPI_HMAT_ACCESS_LATENCY:
-		target->hmem_attrs[access].read_latency = value;
-		target->hmem_attrs[access].write_latency = value;
+		target->coord[access].read_latency = value;
+		target->coord[access].write_latency = value;
 		break;
 	case ACPI_HMAT_READ_LATENCY:
-		target->hmem_attrs[access].read_latency = value;
+		target->coord[access].read_latency = value;
 		break;
 	case ACPI_HMAT_WRITE_LATENCY:
-		target->hmem_attrs[access].write_latency = value;
+		target->coord[access].write_latency = value;
 		break;
 	case ACPI_HMAT_ACCESS_BANDWIDTH:
-		target->hmem_attrs[access].read_bandwidth = value;
-		target->hmem_attrs[access].write_bandwidth = value;
+		target->coord[access].read_bandwidth = value;
+		target->coord[access].write_bandwidth = value;
 		break;
 	case ACPI_HMAT_READ_BANDWIDTH:
-		target->hmem_attrs[access].read_bandwidth = value;
+		target->coord[access].read_bandwidth = value;
 		break;
 	case ACPI_HMAT_WRITE_BANDWIDTH:
-		target->hmem_attrs[access].write_bandwidth = value;
+		target->coord[access].write_bandwidth = value;
 		break;
 	default:
 		break;
@@ -701,7 +701,7 @@ static void hmat_register_target_cache(struct memory_target *target)
 static void hmat_register_target_perf(struct memory_target *target, int access)
 {
 	unsigned mem_nid = pxm_to_node(target->memory_pxm);
-	node_set_perf_attrs(mem_nid, &target->hmem_attrs[access], access);
+	node_set_perf_attrs(mem_nid, &target->coord[access], access);
 }
 
 static void hmat_register_target_devices(struct memory_target *target)
diff --git a/drivers/base/node.c b/drivers/base/node.c
index 493d533f8375..cb2b6cc7f6e6 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -74,14 +74,14 @@ static BIN_ATTR_RO(cpulist, CPULIST_FILE_MAX_BYTES);
  * @dev:	Device for this memory access class
  * @list_node:	List element in the node's access list
  * @access:	The access class rank
- * @hmem_attrs: Heterogeneous memory performance attributes
+ * @coord:	Heterogeneous memory performance coordinates
  */
 struct node_access_nodes {
 	struct device		dev;
 	struct list_head	list_node;
 	unsigned int		access;
 #ifdef CONFIG_HMEM_REPORTING
-	struct node_hmem_attrs	hmem_attrs;
+	struct access_coordinate	coord;
 #endif
 };
 #define to_access_nodes(dev) container_of(dev, struct node_access_nodes, dev)
@@ -167,7 +167,7 @@ static ssize_t property##_show(struct device *dev,			\
 			   char *buf)					\
 {									\
 	return sysfs_emit(buf, "%u\n",					\
-			  to_access_nodes(dev)->hmem_attrs.property);	\
+			  to_access_nodes(dev)->coord.property);	\
 }									\
 static DEVICE_ATTR_RO(property)
 
@@ -187,10 +187,10 @@ static struct attribute *access_attrs[] = {
 /**
  * node_set_perf_attrs - Set the performance values for given access class
  * @nid: Node identifier to be set
- * @hmem_attrs: Heterogeneous memory performance attributes
+ * @coord: Heterogeneous memory performance coordinates
  * @access: The access class the for the given attributes
  */
-void node_set_perf_attrs(unsigned int nid, struct node_hmem_attrs *hmem_attrs,
+void node_set_perf_attrs(unsigned int nid, struct access_coordinate *coord,
 			 unsigned int access)
 {
 	struct node_access_nodes *c;
@@ -205,7 +205,7 @@ void node_set_perf_attrs(unsigned int nid, struct node_hmem_attrs *hmem_attrs,
 	if (!c)
 		return;
 
-	c->hmem_attrs = *hmem_attrs;
+	c->coord = *coord;
 	for (i = 0; access_attrs[i] != NULL; i++) {
 		if (sysfs_add_file_to_group(&c->dev.kobj, access_attrs[i],
 					    "initiators")) {
diff --git a/include/linux/node.h b/include/linux/node.h
index 427a5975cf40..25b66d705ee2 100644
--- a/include/linux/node.h
+++ b/include/linux/node.h
@@ -20,14 +20,14 @@
 #include <linux/list.h>
 
 /**
- * struct node_hmem_attrs - heterogeneous memory performance attributes
+ * struct access_coordinate - generic performance coordinates container
  *
  * @read_bandwidth:	Read bandwidth in MB/s
  * @write_bandwidth:	Write bandwidth in MB/s
  * @read_latency:	Read latency in nanoseconds
  * @write_latency:	Write latency in nanoseconds
  */
-struct node_hmem_attrs {
+struct access_coordinate {
 	unsigned int read_bandwidth;
 	unsigned int write_bandwidth;
 	unsigned int read_latency;
@@ -65,7 +65,7 @@ struct node_cache_attrs {
 
 #ifdef CONFIG_HMEM_REPORTING
 void node_add_cache(unsigned int nid, struct node_cache_attrs *cache_attrs);
-void node_set_perf_attrs(unsigned int nid, struct node_hmem_attrs *hmem_attrs,
+void node_set_perf_attrs(unsigned int nid, struct access_coordinate *coord,
 			 unsigned access);
 #else
 static inline void node_add_cache(unsigned int nid,
@@ -74,7 +74,7 @@ static inline void node_add_cache(unsigned int nid,
 }
 
 static inline void node_set_perf_attrs(unsigned int nid,
-				       struct node_hmem_attrs *hmem_attrs,
+				       struct access_coordinate *coord,
 				       unsigned access)
 {
 }




* [PATCH v10 07/22] acpi: numa: Create enum for memory_target access coordinates indexing
  2023-10-11  1:04 [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (5 preceding siblings ...)
  2023-10-11  1:05 ` [PATCH v10 06/22] base/node / acpi: Change 'node_hmem_attrs' to 'access_coordinates' Dave Jiang
@ 2023-10-11  1:05 ` Dave Jiang
  2023-10-11  1:05 ` [PATCH v10 08/22] acpi: numa: Add genport target allocation to the HMAT parsing Dave Jiang
                   ` (15 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Dave Jiang @ 2023-10-11  1:05 UTC (permalink / raw)
  To: linux-cxl
  Cc: Jonathan Cameron, Rafael J. Wysocki, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, Jonathan.Cameron, dave

Create enums to provide named indexing for the access coordinate array.
This is in preparation for adding generic port support which will add a
third index in the array to keep the generic port attributes separate from
the memory attributes.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/acpi/numa/hmat.c |   35 ++++++++++++++++++++++++-----------
 1 file changed, 24 insertions(+), 11 deletions(-)

diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
index f9ff992038fa..abed728bf09d 100644
--- a/drivers/acpi/numa/hmat.c
+++ b/drivers/acpi/numa/hmat.c
@@ -57,12 +57,18 @@ struct target_cache {
 	struct node_cache_attrs cache_attrs;
 };
 
+enum {
+	NODE_ACCESS_CLASS_0 = 0,
+	NODE_ACCESS_CLASS_1,
+	NODE_ACCESS_CLASS_MAX,
+};
+
 struct memory_target {
 	struct list_head node;
 	unsigned int memory_pxm;
 	unsigned int processor_pxm;
 	struct resource memregions;
-	struct access_coordinate coord[2];
+	struct access_coordinate coord[NODE_ACCESS_CLASS_MAX];
 	struct list_head caches;
 	struct node_cache_attrs cache_attrs;
 	bool registered;
@@ -338,10 +344,12 @@ static __init int hmat_parse_locality(union acpi_subtable_headers *header,
 			if (mem_hier == ACPI_HMAT_MEMORY) {
 				target = find_mem_target(targs[targ]);
 				if (target && target->processor_pxm == inits[init]) {
-					hmat_update_target_access(target, type, value, 0);
+					hmat_update_target_access(target, type, value,
+								  NODE_ACCESS_CLASS_0);
 					/* If the node has a CPU, update access 1 */
 					if (node_state(pxm_to_node(inits[init]), N_CPU))
-						hmat_update_target_access(target, type, value, 1);
+						hmat_update_target_access(target, type, value,
+									  NODE_ACCESS_CLASS_1);
 				}
 			}
 		}
@@ -600,10 +608,12 @@ static void hmat_register_target_initiators(struct memory_target *target)
 	 */
 	if (target->processor_pxm != PXM_INVAL) {
 		cpu_nid = pxm_to_node(target->processor_pxm);
-		register_memory_node_under_compute_node(mem_nid, cpu_nid, 0);
+		register_memory_node_under_compute_node(mem_nid, cpu_nid,
+							NODE_ACCESS_CLASS_0);
 		access0done = true;
 		if (node_state(cpu_nid, N_CPU)) {
-			register_memory_node_under_compute_node(mem_nid, cpu_nid, 1);
+			register_memory_node_under_compute_node(mem_nid, cpu_nid,
+								NODE_ACCESS_CLASS_1);
 			return;
 		}
 	}
@@ -644,12 +654,13 @@ static void hmat_register_target_initiators(struct memory_target *target)
 			}
 			if (best)
 				hmat_update_target_access(target, loc->hmat_loc->data_type,
-							  best, 0);
+							  best, NODE_ACCESS_CLASS_0);
 		}
 
 		for_each_set_bit(i, p_nodes, MAX_NUMNODES) {
 			cpu_nid = pxm_to_node(i);
-			register_memory_node_under_compute_node(mem_nid, cpu_nid, 0);
+			register_memory_node_under_compute_node(mem_nid, cpu_nid,
+								NODE_ACCESS_CLASS_0);
 		}
 	}
 
@@ -681,11 +692,13 @@ static void hmat_register_target_initiators(struct memory_target *target)
 				clear_bit(initiator->processor_pxm, p_nodes);
 		}
 		if (best)
-			hmat_update_target_access(target, loc->hmat_loc->data_type, best, 1);
+			hmat_update_target_access(target, loc->hmat_loc->data_type, best,
+						  NODE_ACCESS_CLASS_1);
 	}
 	for_each_set_bit(i, p_nodes, MAX_NUMNODES) {
 		cpu_nid = pxm_to_node(i);
-		register_memory_node_under_compute_node(mem_nid, cpu_nid, 1);
+		register_memory_node_under_compute_node(mem_nid, cpu_nid,
+							NODE_ACCESS_CLASS_1);
 	}
 }
 
@@ -746,8 +759,8 @@ static void hmat_register_target(struct memory_target *target)
 	if (!target->registered) {
 		hmat_register_target_initiators(target);
 		hmat_register_target_cache(target);
-		hmat_register_target_perf(target, 0);
-		hmat_register_target_perf(target, 1);
+		hmat_register_target_perf(target, NODE_ACCESS_CLASS_0);
+		hmat_register_target_perf(target, NODE_ACCESS_CLASS_1);
 		target->registered = true;
 	}
 	mutex_unlock(&target_lock);




* [PATCH v10 08/22] acpi: numa: Add genport target allocation to the HMAT parsing
  2023-10-11  1:04 [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (6 preceding siblings ...)
  2023-10-11  1:05 ` [PATCH v10 07/22] acpi: numa: Create enum for memory_target access coordinates indexing Dave Jiang
@ 2023-10-11  1:05 ` Dave Jiang
  2023-10-11  1:05 ` [PATCH v10 09/22] acpi: Break out nesting for hmat_parse_locality() Dave Jiang
                   ` (14 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Dave Jiang @ 2023-10-11  1:05 UTC (permalink / raw)
  To: linux-cxl
  Cc: Jonathan Cameron, Rafael J. Wysocki, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, Jonathan.Cameron, dave

Add SRAT parsing to the HMAT init in order to collect the device handle
from the Generic Port Affinity Structure. The device handle will serve as
the key to search for target data.

Consolidate the common code with alloc_memory_target() into a helper
function, alloc_target().

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/acpi/numa/hmat.c |   55 +++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 52 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
index abed728bf09d..8380460cfee2 100644
--- a/drivers/acpi/numa/hmat.c
+++ b/drivers/acpi/numa/hmat.c
@@ -71,6 +71,7 @@ struct memory_target {
 	struct access_coordinate coord[NODE_ACCESS_CLASS_MAX];
 	struct list_head caches;
 	struct node_cache_attrs cache_attrs;
+	u8 gen_port_device_handle[ACPI_SRAT_DEVICE_HANDLE_SIZE];
 	bool registered;
 };
 
@@ -125,8 +126,7 @@ static __init void alloc_memory_initiator(unsigned int cpu_pxm)
 	list_add_tail(&initiator->node, &initiators);
 }
 
-static __init void alloc_memory_target(unsigned int mem_pxm,
-		resource_size_t start, resource_size_t len)
+static __init struct memory_target *alloc_target(unsigned int mem_pxm)
 {
 	struct memory_target *target;
 
@@ -134,7 +134,7 @@ static __init void alloc_memory_target(unsigned int mem_pxm,
 	if (!target) {
 		target = kzalloc(sizeof(*target), GFP_KERNEL);
 		if (!target)
-			return;
+			return NULL;
 		target->memory_pxm = mem_pxm;
 		target->processor_pxm = PXM_INVAL;
 		target->memregions = (struct resource) {
@@ -147,6 +147,19 @@ static __init void alloc_memory_target(unsigned int mem_pxm,
 		INIT_LIST_HEAD(&target->caches);
 	}
 
+	return target;
+}
+
+static __init void alloc_memory_target(unsigned int mem_pxm,
+				       resource_size_t start,
+				       resource_size_t len)
+{
+	struct memory_target *target;
+
+	target = alloc_target(mem_pxm);
+	if (!target)
+		return;
+
 	/*
 	 * There are potentially multiple ranges per PXM, so record each
 	 * in the per-target memregions resource tree.
@@ -157,6 +170,18 @@ static __init void alloc_memory_target(unsigned int mem_pxm,
 				start, start + len, mem_pxm);
 }
 
+static __init void alloc_genport_target(unsigned int mem_pxm, u8 *handle)
+{
+	struct memory_target *target;
+
+	target = alloc_target(mem_pxm);
+	if (!target)
+		return;
+
+	memcpy(target->gen_port_device_handle, handle,
+	       ACPI_SRAT_DEVICE_HANDLE_SIZE);
+}
+
 static __init const char *hmat_data_type(u8 type)
 {
 	switch (type) {
@@ -498,6 +523,23 @@ static __init int srat_parse_mem_affinity(union acpi_subtable_headers *header,
 	return 0;
 }
 
+static __init int srat_parse_genport_affinity(union acpi_subtable_headers *header,
+					      const unsigned long end)
+{
+	struct acpi_srat_generic_affinity *ga = (void *)header;
+
+	if (!ga)
+		return -EINVAL;
+
+	if (!(ga->flags & ACPI_SRAT_GENERIC_AFFINITY_ENABLED))
+		return 0;
+
+	alloc_genport_target(ga->proximity_domain,
+			     (u8 *)ga->device_handle);
+
+	return 0;
+}
+
 static u32 hmat_initiator_perf(struct memory_target *target,
 			       struct memory_initiator *initiator,
 			       struct acpi_hmat_locality *hmat_loc)
@@ -848,6 +890,13 @@ static __init int hmat_init(void)
 				ACPI_SRAT_TYPE_MEMORY_AFFINITY,
 				srat_parse_mem_affinity, 0) < 0)
 		goto out_put;
+
+	if (acpi_table_parse_entries(ACPI_SIG_SRAT,
+				     sizeof(struct acpi_table_srat),
+				     ACPI_SRAT_TYPE_GENERIC_PORT_AFFINITY,
+				     srat_parse_genport_affinity, 0) < 0)
+		goto out_put;
+
 	acpi_put_table(tbl);
 
 	status = acpi_get_table(ACPI_SIG_HMAT, 0, &tbl);




* [PATCH v10 09/22] acpi: Break out nesting for hmat_parse_locality()
  2023-10-11  1:04 [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (7 preceding siblings ...)
  2023-10-11  1:05 ` [PATCH v10 08/22] acpi: numa: Add genport target allocation to the HMAT parsing Dave Jiang
@ 2023-10-11  1:05 ` Dave Jiang
  2023-10-11  1:05 ` [PATCH v10 10/22] acpi: numa: Add setting of generic port system locality attributes Dave Jiang
                   ` (13 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Dave Jiang @ 2023-10-11  1:05 UTC (permalink / raw)
  To: linux-cxl
  Cc: Jonathan Cameron, Jonathan Cameron, Rafael J. Wysocki,
	dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	Jonathan.Cameron, dave

Refactor hmat_parse_locality() to break up the deep nesting of the
function.

Suggested-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/acpi/numa/hmat.c |   32 ++++++++++++++++++++------------
 1 file changed, 20 insertions(+), 12 deletions(-)

diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
index 8380460cfee2..957a38137c7e 100644
--- a/drivers/acpi/numa/hmat.c
+++ b/drivers/acpi/numa/hmat.c
@@ -321,11 +321,28 @@ static __init void hmat_add_locality(struct acpi_hmat_locality *hmat_loc)
 	}
 }
 
+static __init void hmat_update_target(unsigned int tgt_pxm, unsigned int init_pxm,
+				      u8 mem_hier, u8 type, u32 value)
+{
+	struct memory_target *target = find_mem_target(tgt_pxm);
+
+	if (mem_hier != ACPI_HMAT_MEMORY)
+		return;
+
+	if (target && target->processor_pxm == init_pxm) {
+		hmat_update_target_access(target, type, value,
+					  NODE_ACCESS_CLASS_0);
+		/* If the node has a CPU, update access 1 */
+		if (node_state(pxm_to_node(init_pxm), N_CPU))
+			hmat_update_target_access(target, type, value,
+						  NODE_ACCESS_CLASS_1);
+	}
+}
+
 static __init int hmat_parse_locality(union acpi_subtable_headers *header,
 				      const unsigned long end)
 {
 	struct acpi_hmat_locality *hmat_loc = (void *)header;
-	struct memory_target *target;
 	unsigned int init, targ, total_size, ipds, tpds;
 	u32 *inits, *targs, value;
 	u16 *entries;
@@ -366,17 +383,8 @@ static __init int hmat_parse_locality(union acpi_subtable_headers *header,
 				inits[init], targs[targ], value,
 				hmat_data_type_suffix(type));
 
-			if (mem_hier == ACPI_HMAT_MEMORY) {
-				target = find_mem_target(targs[targ]);
-				if (target && target->processor_pxm == inits[init]) {
-					hmat_update_target_access(target, type, value,
-								  NODE_ACCESS_CLASS_0);
-					/* If the node has a CPU, update access 1 */
-					if (node_state(pxm_to_node(inits[init]), N_CPU))
-						hmat_update_target_access(target, type, value,
-									  NODE_ACCESS_CLASS_1);
-				}
-			}
+			hmat_update_target(targs[targ], inits[init],
+					   mem_hier, type, value);
 		}
 	}
 




* [PATCH v10 10/22] acpi: numa: Add setting of generic port system locality attributes
  2023-10-11  1:04 [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (8 preceding siblings ...)
  2023-10-11  1:05 ` [PATCH v10 09/22] acpi: Break out nesting for hmat_parse_locality() Dave Jiang
@ 2023-10-11  1:05 ` Dave Jiang
  2023-10-11  1:06 ` [PATCH v10 11/22] acpi: numa: Add helper function to retrieve the performance attributes Dave Jiang
                   ` (12 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Dave Jiang @ 2023-10-11  1:05 UTC (permalink / raw)
  To: linux-cxl
  Cc: Jonathan Cameron, Rafael J. Wysocki, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, Jonathan.Cameron, dave

Add generic port support to the parsing of the HMAT system locality
sub-table. The attributes are added to the third array member of the
access coordinates in order to not mix with the existing memory
attributes. This only provides the system locality attributes from the
initiator to the generic port target and does not yet include the rest
of the path to the actual memory device.

The complete attributes will be updated when a memory device is
attached and the system locality information is calculated end to end.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/acpi/numa/hmat.c |    5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
index 957a38137c7e..aabd89c79e26 100644
--- a/drivers/acpi/numa/hmat.c
+++ b/drivers/acpi/numa/hmat.c
@@ -60,6 +60,7 @@ struct target_cache {
 enum {
 	NODE_ACCESS_CLASS_0 = 0,
 	NODE_ACCESS_CLASS_1,
+	NODE_ACCESS_CLASS_GENPORT_SINK,
 	NODE_ACCESS_CLASS_MAX,
 };
 
@@ -336,6 +337,10 @@ static __init void hmat_update_target(unsigned int tgt_pxm, unsigned int init_px
 		if (node_state(pxm_to_node(init_pxm), N_CPU))
 			hmat_update_target_access(target, type, value,
 						  NODE_ACCESS_CLASS_1);
+		/* Update access from generic port target */
+		if (*target->gen_port_device_handle)
+			hmat_update_target_access(target, type, value,
+						  NODE_ACCESS_CLASS_GENPORT_SINK);
 	}
 }
 




* [PATCH v10 11/22] acpi: numa: Add helper function to retrieve the performance attributes
  2023-10-11  1:04 [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (9 preceding siblings ...)
  2023-10-11  1:05 ` [PATCH v10 10/22] acpi: numa: Add setting of generic port system locality attributes Dave Jiang
@ 2023-10-11  1:06 ` Dave Jiang
  2023-10-11  1:06 ` [PATCH v10 12/22] cxl: Add callback to parse the DSMAS subtables from CDAT Dave Jiang
                   ` (11 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Dave Jiang @ 2023-10-11  1:06 UTC (permalink / raw)
  To: linux-cxl
  Cc: Jonathan Cameron, Rafael J. Wysocki, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, Jonathan.Cameron, dave

Add a helper to retrieve the performance attributes based on the device
handle.  The helper function is exported so the CXL driver can use it
to acquire the performance data between the CPU and the CXL host bridge.
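To illustrate the lookup the helper performs, here is a minimal user-space
sketch of matching a target by its full SRAT device handle. The struct and
HANDLE_SIZE here are simplified stand-ins for the kernel's memory_target
and ACPI_SRAT_DEVICE_HANDLE_SIZE, not the real types:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HANDLE_SIZE 16	/* stand-in for ACPI_SRAT_DEVICE_HANDLE_SIZE */

struct genport_target {
	uint8_t handle[HANDLE_SIZE];
	int coord;	/* placeholder for the access_coordinate payload */
};

/* Linear scan comparing the full device handle, as the helper does */
static const struct genport_target *
find_genport(const struct genport_target *t, int n, const uint8_t *handle)
{
	for (int i = 0; i < n; i++)
		if (!memcmp(t[i].handle, handle, HANDLE_SIZE))
			return &t[i];
	return NULL;
}
```

A miss returns NULL, which the kernel helper maps to -ENOENT.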

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/acpi/numa/hmat.c |   35 +++++++++++++++++++++++++++++++++++
 include/linux/acpi.h     |   12 ++++++++++++
 2 files changed, 47 insertions(+)

diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
index aabd89c79e26..e3a29ddacbca 100644
--- a/drivers/acpi/numa/hmat.c
+++ b/drivers/acpi/numa/hmat.c
@@ -107,6 +107,41 @@ static struct memory_target *find_mem_target(unsigned int mem_pxm)
 	return NULL;
 }
 
+static struct memory_target *acpi_find_genport_target(u8 *device_handle)
+{
+	struct memory_target *target;
+
+	list_for_each_entry(target, &targets, node) {
+		if (!memcmp(target->gen_port_device_handle, device_handle,
+			    ACPI_SRAT_DEVICE_HANDLE_SIZE))
+			return target;
+	}
+
+	return NULL;
+}
+
+/**
+ * acpi_get_genport_coordinates - Retrieve the access coordinates for a generic port
+ * @device_handle: Device handle string (ACPI or PCI) to match up to the gen port
+ * @coord: The access coordinates written back out for the generic port
+ *
+ * Return: 0 on success. Errno on failure.
+ */
+int acpi_get_genport_coordinates(u8 *device_handle,
+				 struct access_coordinate *coord)
+{
+	struct memory_target *target;
+
+	target = acpi_find_genport_target(device_handle);
+	if (!target)
+		return -ENOENT;
+
+	*coord = target->coord[NODE_ACCESS_CLASS_GENPORT_SINK];
+
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(acpi_get_genport_coordinates, CXL);
+
 static __init void alloc_memory_initiator(unsigned int cpu_pxm)
 {
 	struct memory_initiator *initiator;
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 0d334975c0c9..7ffe054befa2 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -16,6 +16,7 @@
 #include <linux/property.h>
 #include <linux/uuid.h>
 #include <linux/fw_table.h>
+#include <linux/node.h>
 
 struct irq_domain;
 struct irq_domain_ops;
@@ -419,6 +420,17 @@ extern int acpi_blacklisted(void);
 extern void acpi_osi_setup(char *str);
 extern bool acpi_osi_is_win8(void);
 
+#ifdef CONFIG_ACPI_HMAT
+int acpi_get_genport_coordinates(u8 *device_handle,
+				 struct access_coordinate *coord);
+#else
+static inline int acpi_get_genport_coordinates(u8 *device_handle,
+					       struct access_coordinate *coord)
+{
+	return -EOPNOTSUPP;
+}
+#endif
+
 #ifdef CONFIG_ACPI_NUMA
 int acpi_map_pxm_to_node(int pxm);
 int acpi_get_node(acpi_handle handle);




* [PATCH v10 12/22] cxl: Add callback to parse the DSMAS subtables from CDAT
  2023-10-11  1:04 [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (10 preceding siblings ...)
  2023-10-11  1:06 ` [PATCH v10 11/22] acpi: numa: Add helper function to retrieve the performance attributes Dave Jiang
@ 2023-10-11  1:06 ` Dave Jiang
  2023-10-11  1:06 ` [PATCH v10 13/22] cxl: Add callback to parse the DSLBIS subtable " Dave Jiang
                   ` (10 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Dave Jiang @ 2023-10-11  1:06 UTC (permalink / raw)
  To: linux-cxl
  Cc: Jonathan Cameron, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, Jonathan.Cameron, dave

Provide a callback function to the CDAT parser in order to parse the Device
Scoped Memory Affinity Structure (DSMAS). Each DSMAS entry contains a
DPA range and its associated attributes. See the CDAT specification for
details. The device handle and the DPA range are saved to be associated
with the DSLBIS locality data when the DSLBIS entries are parsed. The
list is local to the caller; once the total path performance data is
calculated and stored, the list can be discarded.

Coherent Device Attribute Table 1.03 2.1 Device Scoped Memory Affinity
Structure (DSMAS)
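As a quick illustration of the range math the handler performs, here is a
user-space sketch of converting a DSMAS entry's base/length pair into the
inclusive DPA range that gets cached. The struct here is a simplified
stand-in, not the kernel's acpi_cdat_dsmas:

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the DSMAS fields the handler consumes */
struct dsmas_fields {
	uint8_t  dsmad_handle;
	uint64_t dpa_base_address;
	uint64_t dpa_length;
};

struct dpa_range {
	uint64_t start;
	uint64_t end;	/* inclusive, hence the "- 1" below */
};

static struct dpa_range dsmas_to_range(const struct dsmas_fields *d)
{
	struct dpa_range r = {
		.start = d->dpa_base_address,
		.end = d->dpa_base_address + d->dpa_length - 1,
	};
	return r;
}
```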

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/cxl/Kconfig       |    1 +
 drivers/cxl/core/Makefile |    1 +
 drivers/cxl/core/cdat.c   |   58 +++++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h         |   23 ++++++++++++++++++
 drivers/cxl/port.c        |   12 +++++++++
 5 files changed, 95 insertions(+)
 create mode 100644 drivers/cxl/core/cdat.c

diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index 8ea1d340e438..a1ae35bd1afc 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -5,6 +5,7 @@ menuconfig CXL_BUS
 	select FW_LOADER
 	select FW_UPLOAD
 	select PCI_DOE
+	select FIRMWARE_TABLE
 	help
 	  CXL is a bus that is electrically compatible with PCI Express, but
 	  layers three protocols on that signalling (CXL.io, CXL.cache, and
diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
index 1f66b5d4d935..f4eabfdd50ce 100644
--- a/drivers/cxl/core/Makefile
+++ b/drivers/cxl/core/Makefile
@@ -13,5 +13,6 @@ cxl_core-y += mbox.o
 cxl_core-y += pci.o
 cxl_core-y += hdm.o
 cxl_core-y += pmu.o
+cxl_core-$(CONFIG_FIRMWARE_TABLE) += cdat.o
 cxl_core-$(CONFIG_TRACING) += trace.o
 cxl_core-$(CONFIG_CXL_REGION) += region.o
diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
new file mode 100644
index 000000000000..a858fd1b5744
--- /dev/null
+++ b/drivers/cxl/core/cdat.c
@@ -0,0 +1,58 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2023 Intel Corporation. All rights reserved. */
+#include <linux/acpi.h>
+#include <linux/fw_table.h>
+#include "cxlpci.h"
+#include "cxl.h"
+
+static int cdat_dsmas_handler(union acpi_subtable_headers *header, void *arg,
+			      const unsigned long end)
+{
+	struct acpi_cdat_header *hdr = &header->cdat;
+	struct acpi_cdat_dsmas *dsmas;
+	int size = sizeof(*hdr) + sizeof(*dsmas);
+	struct list_head *dsmas_list = arg;
+	struct dsmas_entry *dent;
+	u16 len;
+
+	len = le16_to_cpu((__force __le16)hdr->length);
+	if (len != size || (unsigned long)hdr + len > end) {
+		pr_warn("Malformed DSMAS table length: (%u:%u)\n", size, len);
+		return -EINVAL;
+	}
+
+	/* Skip common header */
+	dsmas = (struct acpi_cdat_dsmas *)(hdr + 1);
+
+	dent = kzalloc(sizeof(*dent), GFP_KERNEL);
+	if (!dent)
+		return -ENOMEM;
+
+	dent->handle = dsmas->dsmad_handle;
+	dent->dpa_range.start = le64_to_cpu((__force __le64)dsmas->dpa_base_address);
+	dent->dpa_range.end = le64_to_cpu((__force __le64)dsmas->dpa_base_address) +
+			      le64_to_cpu((__force __le64)dsmas->dpa_length) - 1;
+	list_add_tail(&dent->list, dsmas_list);
+
+	return 0;
+}
+
+int cxl_cdat_endpoint_process(struct cxl_port *port, struct list_head *list)
+{
+	return cdat_table_parse(ACPI_CDAT_TYPE_DSMAS, cdat_dsmas_handler,
+				list, port->cdat.table);
+}
+EXPORT_SYMBOL_NS_GPL(cxl_cdat_endpoint_process, CXL);
+
+void cxl_cdat_dsmas_list_destroy(struct list_head *dsmas_list)
+{
+	struct dsmas_entry *dentry, *n;
+
+	list_for_each_entry_safe(dentry, n, dsmas_list, list) {
+		list_del(&dentry->list);
+		kfree(dentry);
+	}
+}
+EXPORT_SYMBOL_NS_GPL(cxl_cdat_dsmas_list_destroy, CXL);
+
+MODULE_IMPORT_NS(CXL);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index eb7924648cb0..be2fd8dcb37b 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -7,6 +7,7 @@
 #include <linux/libnvdimm.h>
 #include <linux/bitfield.h>
 #include <linux/bitops.h>
+#include <linux/list.h>
 #include <linux/log2.h>
 #include <linux/io.h>
 
@@ -824,6 +825,28 @@ static inline struct cxl_dax_region *to_cxl_dax_region(struct device *dev)
 }
 #endif
 
+/* CDAT related bits */
+struct dsmas_entry {
+	struct list_head list;
+	struct range dpa_range;
+	u8 handle;
+};
+
+#ifdef CONFIG_FIRMWARE_TABLE
+int cxl_cdat_endpoint_process(struct cxl_port *port, struct list_head *list);
+void cxl_cdat_dsmas_list_destroy(struct list_head *dsmas_list);
+#else
+static inline int cxl_cdat_endpoint_process(struct cxl_port *port,
+					    struct list_head *list)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline void cxl_cdat_dsmas_list_destroy(struct list_head *dsmas_list)
+{
+}
+#endif
+
 /*
  * Unit test builds overrides this to __weak, find the 'strong' version
  * of these symbols in tools/testing/cxl/.
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index 47bc8e0b8590..cceee9dbae2c 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -136,6 +136,18 @@ static int cxl_endpoint_port_probe(struct cxl_port *port)
 	device_for_each_child(&port->dev, root, discover_region);
 	put_device(&root->dev);
 
+	if (port->cdat.table) {
+		LIST_HEAD(dsmas_list);
+
+		rc = cxl_cdat_endpoint_process(port, &dsmas_list);
+		if (rc < 0)
+			dev_dbg(&port->dev, "Failed to parse CDAT: %d\n", rc);
+
+		/* Performance data processing */
+
+		cxl_cdat_dsmas_list_destroy(&dsmas_list);
+	}
+
 	return 0;
 }
 




* [PATCH v10 13/22] cxl: Add callback to parse the DSLBIS subtable from CDAT
  2023-10-11  1:04 [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (11 preceding siblings ...)
  2023-10-11  1:06 ` [PATCH v10 12/22] cxl: Add callback to parse the DSMAS subtables from CDAT Dave Jiang
@ 2023-10-11  1:06 ` Dave Jiang
  2023-10-11  1:06 ` [PATCH v10 14/22] cxl: Add callback to parse the SSLBIS " Dave Jiang
                   ` (9 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Dave Jiang @ 2023-10-11  1:06 UTC (permalink / raw)
  To: linux-cxl
  Cc: Jonathan Cameron, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, Jonathan.Cameron, dave

Provide a callback to parse the Device Scoped Latency and Bandwidth
Information Structure (DSLBIS) in the CDAT structures. The DSLBIS
contains the bandwidth and latency information that's tied to a DSMAS
handle. The driver will retrieve the read and write latency and
bandwidth associated with the DSMAS which is tied to a DPA range.

Coherent Device Attribute Table 1.03 2.1 Device Scoped Latency and
Bandwidth Information Structure (DSLBIS)
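The mapping from DSLBIS data type to the read/write latency/bandwidth
fields can be sketched in user space like this; the enum values mirror the
ACPI_HMAT_* data-type codes (0..5) and a DSLBIS value is the entry's
base unit multiplied by its first entry field (the overflow check the
driver does is omitted here for brevity):

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the ACPI_HMAT_* data-type codes 0..5 */
enum { HMAT_ACCESS_LAT, HMAT_READ_LAT, HMAT_WRITE_LAT,
       HMAT_ACCESS_BW, HMAT_READ_BW, HMAT_WRITE_BW };

struct coord { uint64_t read_lat, write_lat, read_bw, write_bw; };

/* DSLBIS encodes each value as entry_base_unit * entry[0] */
static void coord_set(struct coord *c, int type, uint64_t base, uint16_t entry)
{
	uint64_t val = base * entry;

	switch (type) {
	case HMAT_ACCESS_LAT:
		c->read_lat = c->write_lat = val;
		break;
	case HMAT_READ_LAT:
		c->read_lat = val;
		break;
	case HMAT_WRITE_LAT:
		c->write_lat = val;
		break;
	case HMAT_ACCESS_BW:
		c->read_bw = c->write_bw = val;
		break;
	case HMAT_READ_BW:
		c->read_bw = val;
		break;
	case HMAT_WRITE_BW:
		c->write_bw = val;
		break;
	}
}
```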

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/cxl/core/cdat.c |  102 ++++++++++++++++++++++++++++++++++++++++++++++-
 drivers/cxl/cxl.h       |    2 +
 2 files changed, 102 insertions(+), 2 deletions(-)

diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
index a858fd1b5744..361983393666 100644
--- a/drivers/cxl/core/cdat.c
+++ b/drivers/cxl/core/cdat.c
@@ -2,6 +2,7 @@
 /* Copyright(c) 2023 Intel Corporation. All rights reserved. */
 #include <linux/acpi.h>
 #include <linux/fw_table.h>
+#include <linux/overflow.h>
 #include "cxlpci.h"
 #include "cxl.h"
 
@@ -37,10 +38,107 @@ static int cdat_dsmas_handler(union acpi_subtable_headers *header, void *arg,
 	return 0;
 }
 
+static void cxl_access_coordinate_set(struct access_coordinate *coord,
+				      int access, unsigned int val)
+{
+	switch (access) {
+	case ACPI_HMAT_ACCESS_LATENCY:
+		coord->read_latency = val;
+		coord->write_latency = val;
+		break;
+	case ACPI_HMAT_READ_LATENCY:
+		coord->read_latency = val;
+		break;
+	case ACPI_HMAT_WRITE_LATENCY:
+		coord->write_latency = val;
+		break;
+	case ACPI_HMAT_ACCESS_BANDWIDTH:
+		coord->read_bandwidth = val;
+		coord->write_bandwidth = val;
+		break;
+	case ACPI_HMAT_READ_BANDWIDTH:
+		coord->read_bandwidth = val;
+		break;
+	case ACPI_HMAT_WRITE_BANDWIDTH:
+		coord->write_bandwidth = val;
+		break;
+	}
+}
+
+static int cdat_dslbis_handler(union acpi_subtable_headers *header, void *arg,
+			       const unsigned long end)
+{
+	struct acpi_cdat_header *hdr = &header->cdat;
+	struct acpi_cdat_dslbis *dslbis;
+	int size = sizeof(*hdr) + sizeof(*dslbis);
+	struct list_head *dsmas_list = arg;
+	struct dsmas_entry *dent;
+	u16 len;
+
+	len = le16_to_cpu((__force __le16)hdr->length);
+	if (len != size || (unsigned long)hdr + len > end) {
+		pr_warn("Malformed DSLBIS table length: (%u:%u)\n", size, len);
+		return -EINVAL;
+	}
+
+	/* Skip common header */
+	dslbis = (struct acpi_cdat_dslbis *)(hdr + 1);
+
+	/* Skip unrecognized data type */
+	if (dslbis->data_type > ACPI_HMAT_WRITE_BANDWIDTH)
+		return 0;
+
+	list_for_each_entry(dent, dsmas_list, list) {
+		__le64 le_base;
+		__le16 le_val;
+		u64 val;
+		int rc;
+
+		if (dslbis->handle != dent->handle)
+			continue;
+
+		/* Not a memory type, skip */
+		if ((dslbis->flags & ACPI_HMAT_MEMORY_HIERARCHY) !=
+		    ACPI_HMAT_MEMORY)
+			return 0;
+
+		le_base = (__force __le64)dslbis->entry_base_unit;
+		le_val = (__force __le16)dslbis->entry[0];
+		rc = check_mul_overflow(le64_to_cpu(le_base),
+					le16_to_cpu(le_val), &val);
+		if (rc)
+			pr_warn("DSLBIS value overflowed.\n");
+
+		cxl_access_coordinate_set(&dent->coord, dslbis->data_type, val);
+		break;
+	}
+
+	return 0;
+}
+
+static int cdat_table_parse_output(int rc)
+{
+	if (rc < 0)
+		return rc;
+	if (rc == 0)
+		return -ENOENT;
+
+	return 0;
+}
+
 int cxl_cdat_endpoint_process(struct cxl_port *port, struct list_head *list)
 {
-	return cdat_table_parse(ACPI_CDAT_TYPE_DSMAS, cdat_dsmas_handler,
-				list, port->cdat.table);
+	int rc;
+
+	rc = cdat_table_parse(ACPI_CDAT_TYPE_DSMAS, cdat_dsmas_handler,
+			      list, port->cdat.table);
+	rc = cdat_table_parse_output(rc);
+	if (rc)
+		return rc;
+
+	rc = cdat_table_parse(ACPI_CDAT_TYPE_DSLBIS, cdat_dslbis_handler,
+			      list, port->cdat.table);
+	return cdat_table_parse_output(rc);
 }
 EXPORT_SYMBOL_NS_GPL(cxl_cdat_endpoint_process, CXL);
 
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index be2fd8dcb37b..dc3b3b746c70 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -8,6 +8,7 @@
 #include <linux/bitfield.h>
 #include <linux/bitops.h>
 #include <linux/list.h>
+#include <linux/node.h>
 #include <linux/log2.h>
 #include <linux/io.h>
 
@@ -830,6 +831,7 @@ struct dsmas_entry {
 	struct list_head list;
 	struct range dpa_range;
 	u8 handle;
+	struct access_coordinate coord;
 };
 
 #ifdef CONFIG_FIRMWARE_TABLE




* [PATCH v10 14/22] cxl: Add callback to parse the SSLBIS subtable from CDAT
  2023-10-11  1:04 [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (12 preceding siblings ...)
  2023-10-11  1:06 ` [PATCH v10 13/22] cxl: Add callback to parse the DSLBIS subtable " Dave Jiang
@ 2023-10-11  1:06 ` Dave Jiang
  2023-10-11  1:06 ` [PATCH v10 15/22] cxl: Add support for _DSM Function for retrieving QTG ID Dave Jiang
                   ` (8 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Dave Jiang @ 2023-10-11  1:06 UTC (permalink / raw)
  To: linux-cxl
  Cc: Jonathan Cameron, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, Jonathan.Cameron, dave

Provide a callback to parse the Switch Scoped Latency and Bandwidth
Information Structure (SSLBIS) in the CDAT structures. The SSLBIS
contains the bandwidth and latency information tied to the CXL switch
that the data table has been read from. The extracted values are
stored in the cxl_dport matched by the port_id in each SSLBIS entry.

Coherent Device Attribute Table 1.03 2.1 Switch Scoped Latency
and Bandwidth Information Structure (SSLBIS)
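The downstream-port selection in the handler below reduces to a small pure
function: each SSLBIS entry names a (port X, port Y) pair, and the
downstream switch port id is whichever side is not the upstream-port
marker. The two constants are illustrative stand-ins for the
ACPI_CDAT_SSLBIS_US_PORT/ANY_PORT values:

```c
#include <assert.h>
#include <stdint.h>

#define SSLBIS_US_PORT   0x0100	/* stand-in for ACPI_CDAT_SSLBIS_US_PORT  */
#define SSLBIS_ANY_PORT  0xffff	/* stand-in for ACPI_CDAT_SSLBIS_ANY_PORT */

/* Pick the downstream switch port id from an SSLBIS (X, Y) pair */
static uint16_t sslbis_dsp_id(uint16_t x, uint16_t y)
{
	switch (x) {
	case SSLBIS_US_PORT:
		return y;
	case SSLBIS_ANY_PORT:
		switch (y) {
		case SSLBIS_US_PORT:
			return x;	/* x is ANY_PORT here */
		case SSLBIS_ANY_PORT:
			return SSLBIS_ANY_PORT;
		default:
			return y;
		}
	default:
		return x;
	}
}
```

An ANY_PORT result makes the handler apply the value to every dport in the
port's xarray rather than a single port_id match.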

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/cxl/core/cdat.c |   93 +++++++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h       |    8 ++++
 drivers/cxl/port.c      |    6 +++
 3 files changed, 107 insertions(+)

diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
index 361983393666..f233cebca37a 100644
--- a/drivers/cxl/core/cdat.c
+++ b/drivers/cxl/core/cdat.c
@@ -153,4 +153,97 @@ void cxl_cdat_dsmas_list_destroy(struct list_head *dsmas_list)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_cdat_dsmas_list_destroy, CXL);
 
+static int cdat_sslbis_handler(union acpi_subtable_headers *header, void *arg,
+			       const unsigned long end)
+{
+	struct acpi_cdat_sslbis *sslbis;
+	int size = sizeof(header->cdat) + sizeof(*sslbis);
+	struct cxl_port *port = arg;
+	struct device *dev = &port->dev;
+	struct acpi_cdat_sslbe *entry;
+	int remain, entries, i;
+	u16 len;
+
+	len = le16_to_cpu((__force __le16)header->cdat.length);
+	remain = len - size;
+	if (!remain || remain % sizeof(*entry) ||
+	    (unsigned long)header + len > end) {
+		dev_warn(dev, "Malformed SSLBIS table length: (%u)\n", len);
+		return -EINVAL;
+	}
+
+	/* Skip common header */
+	sslbis = (struct acpi_cdat_sslbis *)((unsigned long)header +
+					     sizeof(header->cdat));
+
+	/* Unrecognized data type, we can skip */
+	if (sslbis->data_type > ACPI_HMAT_WRITE_BANDWIDTH)
+		return 0;
+
+	entries = remain / sizeof(*entry);
+	entry = (struct acpi_cdat_sslbe *)((unsigned long)header + sizeof(*sslbis));
+
+	for (i = 0; i < entries; i++) {
+		u16 x = le16_to_cpu((__force __le16)entry->portx_id);
+		u16 y = le16_to_cpu((__force __le16)entry->porty_id);
+		__le64 le_base;
+		__le16 le_val;
+		struct cxl_dport *dport;
+		unsigned long index;
+		u16 dsp_id;
+		u64 val;
+
+		switch (x) {
+		case ACPI_CDAT_SSLBIS_US_PORT:
+			dsp_id = y;
+			break;
+		case ACPI_CDAT_SSLBIS_ANY_PORT:
+			switch (y) {
+			case ACPI_CDAT_SSLBIS_US_PORT:
+				dsp_id = x;
+				break;
+			case ACPI_CDAT_SSLBIS_ANY_PORT:
+				dsp_id = ACPI_CDAT_SSLBIS_ANY_PORT;
+				break;
+			default:
+				dsp_id = y;
+				break;
+			}
+			break;
+		default:
+			dsp_id = x;
+			break;
+		}
+
+		le_base = (__force __le64)sslbis->entry_base_unit;
+		le_val = (__force __le16)entry->latency_or_bandwidth;
+
+		if (check_mul_overflow(le64_to_cpu(le_base),
+				       le16_to_cpu(le_val), &val))
+			dev_warn(dev, "SSLBIS value overflowed!\n");
+
+		xa_for_each(&port->dports, index, dport) {
+			if (dsp_id == ACPI_CDAT_SSLBIS_ANY_PORT ||
+			    dsp_id == dport->port_id)
+				cxl_access_coordinate_set(&dport->coord,
+							  sslbis->data_type,
+							  val);
+		}
+
+		entry++;
+	}
+
+	return 0;
+}
+
+int cxl_cdat_switch_process(struct cxl_port *port)
+{
+	int rc;
+
+	rc = cdat_table_parse(ACPI_CDAT_TYPE_SSLBIS, cdat_sslbis_handler,
+			      port, port->cdat.table);
+	return cdat_table_parse_output(rc);
+}
+EXPORT_SYMBOL_NS_GPL(cxl_cdat_switch_process, CXL);
+
 MODULE_IMPORT_NS(CXL);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index dc3b3b746c70..2d1fd6d0cce5 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -630,6 +630,7 @@ struct cxl_rcrb_info {
  * @rcrb: Data about the Root Complex Register Block layout
  * @rch: Indicate whether this dport was enumerated in RCH or VH mode
  * @port: reference to cxl_port that contains this downstream port
+ * @coord: access coordinates (performance) for switch from CDAT
  */
 struct cxl_dport {
 	struct device *dport_dev;
@@ -638,6 +639,7 @@ struct cxl_dport {
 	struct cxl_rcrb_info rcrb;
 	bool rch;
 	struct cxl_port *port;
+	struct access_coordinate coord;
 };
 
 /**
@@ -837,6 +839,7 @@ struct dsmas_entry {
 #ifdef CONFIG_FIRMWARE_TABLE
 int cxl_cdat_endpoint_process(struct cxl_port *port, struct list_head *list);
 void cxl_cdat_dsmas_list_destroy(struct list_head *dsmas_list);
+int cxl_cdat_switch_process(struct cxl_port *port);
 #else
 static inline int cxl_cdat_endpoint_process(struct cxl_port *port,
 					    struct list_head *list)
@@ -847,6 +850,11 @@ static inline int cxl_cdat_endpoint_process(struct cxl_port *port,
 static inline void cxl_cdat_dsmas_list_destroy(struct list_head *dsmas_list)
 {
 }
+
+static inline int cxl_cdat_switch_process(struct cxl_port *port)
+{
+	return -EOPNOTSUPP;
+}
 #endif
 
 /*
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index cceee9dbae2c..f5e2be79f217 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -78,6 +78,12 @@ static int cxl_switch_port_probe(struct cxl_port *port)
 		return PTR_ERR(cxlhdm);
 	}
 
+	if (port->cdat.table) {
+		rc = cxl_cdat_switch_process(port);
+		if (rc < 0)
+			dev_warn(&port->dev, "Failed to parse SSLBIS: %d\n", rc);
+	}
+
 	if (rc == 1) {
 		dev_dbg(&port->dev, "Fallback to passthrough decoder\n");
 		return devm_cxl_add_passthrough_decoder(port);




* [PATCH v10 15/22] cxl: Add support for _DSM Function for retrieving QTG ID
  2023-10-11  1:04 [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (13 preceding siblings ...)
  2023-10-11  1:06 ` [PATCH v10 14/22] cxl: Add callback to parse the SSLBIS " Dave Jiang
@ 2023-10-11  1:06 ` Dave Jiang
  2023-10-11 13:10   ` Jonathan Cameron
  2023-10-11  1:06 ` [PATCH v10 16/22] cxl: Calculate and store PCI link latency for the downstream ports Dave Jiang
                   ` (7 subsequent siblings)
  22 siblings, 1 reply; 36+ messages in thread
From: Dave Jiang @ 2023-10-11  1:06 UTC (permalink / raw)
  To: linux-cxl
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	Jonathan.Cameron, dave

CXL spec v3.0 9.17.3 CXL Root Device Specific Methods (_DSM)

Add support to retrieve QTG ID via ACPI _DSM call. The _DSM call requires
an input of an ACPI package with 4 dwords (read latency, write latency,
read bandwidth, write bandwidth). The call returns a package with 1 WORD
that provides the max supported QTG ID and a package that may contain 0 or
more WORDs as the recommended QTG IDs in the recommended order.
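The output-package handling described above can be sketched in user space:
copy up to the requested number of recommended QTG IDs, warning (but not
failing) when an ID exceeds the advertised maximum. This is a simplified
stand-in operating on plain arrays rather than acpi_object packages:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Mimics the _DSM output handling: copy up to 'entries' recommended
 * QTG IDs into qos_class[], returning how many were copied.
 */
static int collect_qtg_ids(uint16_t max_qtg, const uint16_t *ids, int nids,
			   int entries, int *qos_class)
{
	int count = nids < entries ? nids : entries;

	for (int i = 0; i < count; i++) {
		if (ids[i] > max_qtg)
			fprintf(stderr, "QTG ID %u greater than MAX %u\n",
				ids[i], max_qtg);
		qos_class[i] = ids[i];
	}

	return count;
}
```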

Create a cxl_root container for the root cxl_port and provide a callback
->get_qos_class() in order to retrieve the QoS class. For the ACPI case,
the _DSM helper is used to retrieve the QTG ID and returned. A
devm_cxl_add_root() function is added for root port setup and registration
of the cxl_root callback operation(s).

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
v10:
- Remove allocation in _DSM handler. Change input parameter to request
  number of ids to return.
- Removed Jonathan's review tag due to significant changes
v9:
- Fix input to 4 ints instead of using buffers. (Dan)
- Remove all endien conversion inside acpi handling code. (Dan)
v8:
- Change DSM return package parsing to use integers.
v7:
- Fix stray lines in commit log. (Jonathan)
v5:
- Make the helper a callback for the CXL root. (Dan)
- Drop the addition of core/acpi.c. (Dan)
- Add endiness handling. (Jonathan)
- Refactor error exits. (Jonathan)
- Update evaluate function description. (Jonathan)
- Make uuid static. (Dan)
v2:
- Reorder var declaration and use C99 style. (Jonathan)
- Allow >2 ACPI objects in package for future expansion. (Jonathan)
- Check QTG IDs against MAX QTG ID provided by output package. (Jonathan)
---
 drivers/cxl/acpi.c      |  131 ++++++++++++++++++++++++++++++++++++++++++++++-
 drivers/cxl/core/port.c |   41 +++++++++++++--
 drivers/cxl/cxl.h       |   25 +++++++++
 3 files changed, 189 insertions(+), 8 deletions(-)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index 2034eb4ce83f..a84fef73f8ce 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -6,6 +6,7 @@
 #include <linux/kernel.h>
 #include <linux/acpi.h>
 #include <linux/pci.h>
+#include <linux/node.h>
 #include <asm/div64.h>
 #include "cxlpci.h"
 #include "cxl.h"
@@ -17,6 +18,10 @@ struct cxl_cxims_data {
 	u64 xormaps[] __counted_by(nr_maps);
 };
 
+static const guid_t acpi_cxl_qtg_id_guid =
+	GUID_INIT(0xF365F9A6, 0xA7DE, 0x4071,
+		  0xA6, 0x6A, 0xB4, 0x0C, 0x0B, 0x4F, 0x8E, 0x52);
+
 /*
  * Find a targets entry (n) in the host bridge interleave list.
  * CXL Specification 3.0 Table 9-22
@@ -194,6 +199,124 @@ struct cxl_cfmws_context {
 	int id;
 };
 
+/**
+ * cxl_acpi_evaluate_qtg_dsm - Retrieve QTG ids via ACPI _DSM
+ * @handle: ACPI handle
+ * @coord: performance access coordinates
+ * @entries: number of QTG IDs to return
+ * @qos_class: int array provided by caller to return QTG IDs
+ *
+ * Return: number of QTG IDs returned, or -errno for errors
+ *
+ * Issue QTG _DSM with accompanied bandwidth and latency data in order to get
+ * the QTG IDs that are suitable for the performance point in order of most
+ * suitable to least suitable.
+ */
+static int
+cxl_acpi_evaluate_qtg_dsm(acpi_handle handle, struct access_coordinate *coord,
+			  int entries, int *qos_class)
+{
+	union acpi_object *out_obj, *out_buf, *pkg;
+	union acpi_object in_array[4] = {
+		[0].integer = { ACPI_TYPE_INTEGER, coord->read_latency },
+		[1].integer = { ACPI_TYPE_INTEGER, coord->write_latency },
+		[2].integer = { ACPI_TYPE_INTEGER, coord->read_bandwidth },
+		[3].integer = { ACPI_TYPE_INTEGER, coord->write_bandwidth },
+	};
+	union acpi_object in_obj = {
+		.package = {
+			.type = ACPI_TYPE_PACKAGE,
+			.count = 4,
+			.elements = in_array,
+		},
+	};
+	int count, pkg_entries, i;
+	u16 max_qtg;
+	int rc = 0;
+
+	if (!entries)
+		return -EINVAL;
+
+	out_obj = acpi_evaluate_dsm(handle, &acpi_cxl_qtg_id_guid, 1, 1, &in_obj);
+	if (!out_obj)
+		return -ENXIO;
+
+	if (out_obj->type != ACPI_TYPE_PACKAGE) {
+		rc = -ENXIO;
+		goto out;
+	}
+
+	/* Check Max QTG ID */
+	pkg = &out_obj->package.elements[0];
+	if (pkg->type != ACPI_TYPE_INTEGER) {
+		rc = -ENXIO;
+		goto out;
+	}
+
+	max_qtg = pkg->integer.value;
+
+	/* It's legal to have 0 QTG entries */
+	pkg_entries = out_obj->package.count;
+	if (pkg_entries <= 1) {
+		rc = -EEXIST;
+		goto out;
+	}
+
+	/* Retrieve QTG IDs package */
+	pkg = &out_obj->package.elements[1];
+	if (pkg->type != ACPI_TYPE_PACKAGE) {
+		rc = -ENXIO;
+		goto out;
+	}
+
+	pkg_entries = pkg->package.count;
+	count = min(entries, pkg_entries);
+	for (i = 0; i < count; i++) {
+		u16 qtg_id;
+
+		out_buf = &pkg->package.elements[i];
+		if (out_buf->type != ACPI_TYPE_INTEGER) {
+			rc = -ENXIO;
+			goto out;
+		}
+
+		qtg_id = out_buf->integer.value;
+		if (qtg_id > max_qtg)
+			pr_warn("QTG ID %u greater than MAX %u\n",
+				qtg_id, max_qtg);
+
+		qos_class[i] = qtg_id;
+	}
+	rc = count;
+
+out:
+	ACPI_FREE(out_obj);
+	return rc;
+}
+
+static int cxl_acpi_get_qos_class(struct cxl_port *root_port,
+				  struct access_coordinate *coord,
+				  int entries, int *qos_class)
+{
+	acpi_handle handle;
+	struct device *dev;
+
+	dev = root_port->uport_dev;
+
+	if (!dev_is_platform(dev))
+		return -ENODEV;
+
+	handle = ACPI_HANDLE(dev);
+	if (!handle)
+		return -ENODEV;
+
+	return cxl_acpi_evaluate_qtg_dsm(handle, coord, entries, qos_class);
+}
+
+static const struct cxl_root_ops acpi_root_ops = {
+	.get_qos_class = cxl_acpi_get_qos_class,
+};
+
 static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
 			   const unsigned long end)
 {
@@ -656,6 +779,7 @@ static int cxl_acpi_probe(struct platform_device *pdev)
 {
 	int rc;
 	struct resource *cxl_res;
+	struct cxl_root *cxl_root;
 	struct cxl_port *root_port;
 	struct device *host = &pdev->dev;
 	struct acpi_device *adev = ACPI_COMPANION(host);
@@ -675,9 +799,10 @@ static int cxl_acpi_probe(struct platform_device *pdev)
 	cxl_res->end = -1;
 	cxl_res->flags = IORESOURCE_MEM;
 
-	root_port = devm_cxl_add_port(host, host, CXL_RESOURCE_NONE, NULL);
-	if (IS_ERR(root_port))
-		return PTR_ERR(root_port);
+	cxl_root = devm_cxl_add_root(host, &acpi_root_ops);
+	if (IS_ERR(cxl_root))
+		return PTR_ERR(cxl_root);
+	root_port = &cxl_root->port;
 
 	rc = bus_for_each_dev(adev->dev.bus, NULL, root_port,
 			      add_host_bridge_dport);
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index fc31ff8954c0..23403538505b 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -528,7 +528,10 @@ static void cxl_port_release(struct device *dev)
 	xa_destroy(&port->dports);
 	xa_destroy(&port->regions);
 	ida_free(&cxl_port_ida, port->id);
-	kfree(port);
+	if (is_cxl_root(port))
+		kfree(to_cxl_root(port));
+	else
+		kfree(port);
 }
 
 static const struct attribute_group *cxl_port_attribute_groups[] = {
@@ -632,13 +635,22 @@ static struct cxl_port *cxl_port_alloc(struct device *uport_dev,
 				       resource_size_t component_reg_phys,
 				       struct cxl_dport *parent_dport)
 {
+	struct cxl_root *cxl_root = NULL;
 	struct cxl_port *port;
 	struct device *dev;
 	int rc;
 
-	port = kzalloc(sizeof(*port), GFP_KERNEL);
-	if (!port)
-		return ERR_PTR(-ENOMEM);
+	/* No parent_dport, root cxl_port */
+	if (!parent_dport) {
+		cxl_root = kzalloc(sizeof(*cxl_root), GFP_KERNEL);
+		if (!cxl_root)
+			return ERR_PTR(-ENOMEM);
+		port = &cxl_root->port;
+	} else {
+		port = kzalloc(sizeof(*port), GFP_KERNEL);
+		if (!port)
+			return ERR_PTR(-ENOMEM);
+	}
 
 	rc = ida_alloc(&cxl_port_ida, GFP_KERNEL);
 	if (rc < 0)
@@ -697,7 +709,10 @@ static struct cxl_port *cxl_port_alloc(struct device *uport_dev,
 	return port;
 
 err:
-	kfree(port);
+	if (cxl_root)
+		kfree(cxl_root);
+	else
+		kfree(port);
 	return ERR_PTR(rc);
 }
 
@@ -821,6 +836,22 @@ struct cxl_port *devm_cxl_add_port(struct device *host,
 }
 EXPORT_SYMBOL_NS_GPL(devm_cxl_add_port, CXL);
 
+struct cxl_root *devm_cxl_add_root(struct device *host,
+				   const struct cxl_root_ops *ops)
+{
+	struct cxl_root *cxl_root;
+	struct cxl_port *port;
+
+	port = devm_cxl_add_port(host, host, CXL_RESOURCE_NONE, NULL);
+	if (IS_ERR(port))
+		return (struct cxl_root *)port;
+
+	cxl_root = to_cxl_root(port);
+	cxl_root->ops = ops;
+	return cxl_root;
+}
+EXPORT_SYMBOL_NS_GPL(devm_cxl_add_root, CXL);
+
 struct pci_bus *cxl_port_to_pci_bus(struct cxl_port *port)
 {
 	/* There is no pci_bus associated with a CXL platform-root port */
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 2d1fd6d0cce5..587677fc597c 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -611,6 +611,29 @@ struct cxl_port {
 	bool cdat_available;
 };
 
+struct cxl_root_ops {
+	int (*get_qos_class)(struct cxl_port *root_port,
+			     struct access_coordinate *coord,
+			     int entries, int *qos_class);
+};
+
+/**
+ * struct cxl_root - logical collection of root cxl_port items
+ *
+ * @port: cxl_port member
+ * @ops: cxl root operations
+ */
+struct cxl_root {
+	struct cxl_port port;
+	const struct cxl_root_ops *ops;
+};
+
+static inline struct cxl_root *
+to_cxl_root(const struct cxl_port *port)
+{
+	return container_of(port, struct cxl_root, port);
+}
+
 static inline struct cxl_dport *
 cxl_find_dport_by_dev(struct cxl_port *port, const struct device *dport_dev)
 {
@@ -696,6 +719,8 @@ struct cxl_port *devm_cxl_add_port(struct device *host,
 				   struct device *uport_dev,
 				   resource_size_t component_reg_phys,
 				   struct cxl_dport *parent_dport);
+struct cxl_root *devm_cxl_add_root(struct device *host,
+				   const struct cxl_root_ops *ops);
 struct cxl_port *find_cxl_root(struct cxl_port *port);
 int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd);
 void cxl_bus_rescan(void);



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v10 16/22] cxl: Calculate and store PCI link latency for the downstream ports
  2023-10-11  1:04 [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (14 preceding siblings ...)
  2023-10-11  1:06 ` [PATCH v10 15/22] cxl: Add support for _DSM Function for retrieving QTG ID Dave Jiang
@ 2023-10-11  1:06 ` Dave Jiang
  2023-10-11  1:06 ` [PATCH v10 17/22] cxl: Store the access coordinates for the generic ports Dave Jiang
                   ` (6 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Dave Jiang @ 2023-10-11  1:06 UTC (permalink / raw)
  To: linux-cxl
  Cc: Jonathan Cameron, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, Jonathan.Cameron, dave

The latency is calculated by dividing the flit size by the link bandwidth. Add
support to retrieve the flit size for the CXL switch device and calculate
the latency of the PCIe link. Cache the latency number with cxl_dport.
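The arithmetic can be restated as a small userspace sketch (all names here are
illustrative, not kernel symbols; the 68B/256B flit sizes and the MEGA scaling
to picoseconds follow the patch below):

```c
#include <assert.h>

#define MEGA		1000000L
#define BITS_PER_BYTE	8

/*
 * FlitLatency = FlitSize / LinkBandwidth, expressed in picoseconds.
 * speed_mbps is the raw link rate in Mb/s (e.g. 32 GT/s -> 32000).
 */
static long link_latency_ps(long speed_mbps, long flit_bytes)
{
	long bw_mb_per_s = speed_mbps / BITS_PER_BYTE;

	return flit_bytes * MEGA / bw_mb_per_s;
}
```

For a 68B flit at 32 GT/s this works out to 17000 ps (17 ns).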

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/cxl/core/pci.c  |   73 +++++++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/core/port.c |    6 ++++
 drivers/cxl/cxl.h       |    4 +++
 drivers/cxl/cxlpci.h    |   15 ++++++++++
 4 files changed, 98 insertions(+)

diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index c86bcc181a5d..bf244ce1d11f 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -1,5 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /* Copyright(c) 2021 Intel Corporation. All rights reserved. */
+#include <linux/units.h>
 #include <linux/io-64-nonatomic-lo-hi.h>
 #include <linux/device.h>
 #include <linux/delay.h>
@@ -783,3 +784,75 @@ pci_ers_result_t cxl_error_detected(struct pci_dev *pdev,
 	return PCI_ERS_RESULT_NEED_RESET;
 }
 EXPORT_SYMBOL_NS_GPL(cxl_error_detected, CXL);
+
+extern const unsigned char pcie_link_speed[];
+
+static enum pci_bus_speed get_link_speed(struct pci_dev *pdev)
+{
+	u16 linkstat;
+	int err;
+
+	err = pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &linkstat);
+	if (err)
+		return -EINVAL;
+
+	return pcie_link_speed[linkstat & PCI_EXP_LNKSTA_CLS];
+}
+
+static int pci_bus_speed_to_mbps(enum pci_bus_speed speed)
+{
+	switch (speed) {
+	case PCIE_SPEED_2_5GT:
+		return 2500;
+	case PCIE_SPEED_5_0GT:
+		return 5000;
+	case PCIE_SPEED_8_0GT:
+		return 8000;
+	case PCIE_SPEED_16_0GT:
+		return 16000;
+	case PCIE_SPEED_32_0GT:
+		return 32000;
+	case PCIE_SPEED_64_0GT:
+		return 64000;
+	default:
+		break;
+	}
+
+	return -EINVAL;
+}
+
+static int cxl_flit_size(struct pci_dev *pdev)
+{
+	if (cxl_pci_flit_256(pdev))
+		return 256;
+
+	return 68;
+}
+
+/**
+ * cxl_pci_get_latency - calculate the link latency for the PCIe link
+ * @pdev: PCI device
+ *
+ * Return: calculated latency or 0 for no latency
+ *
+ * CXL Memory Device SW Guide v1.0 2.11.4 Link latency calculation
+ * Link latency = LinkPropagationLatency + FlitLatency + RetimerLatency
+ * LinkPropagationLatency is negligible, so 0 will be used
+ * RetimerLatency is assumed to be negligible and 0 will be used
+ * FlitLatency = FlitSize / LinkBandwidth
+ * FlitSize is defined by spec. CXL rev3.0 4.2.1.
+ * A 68B flit is used up to 32GT/s; above 32GT/s, a 256B flit is used.
+ * The FlitLatency is converted to picoseconds.
+ */
+long cxl_pci_get_latency(struct pci_dev *pdev)
+{
+	long bw;
+
+	bw = pci_bus_speed_to_mbps(get_link_speed(pdev));
+	if (bw < 0)
+		return 0;
+	bw /= BITS_PER_BYTE;
+
+	return cxl_flit_size(pdev) * MEGA / bw;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_pci_get_latency, CXL);
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 23403538505b..3b45da2b425e 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -793,6 +793,9 @@ static struct cxl_port *__devm_cxl_add_port(struct device *host,
 	if (rc)
 		return ERR_PTR(rc);
 
+	if (parent_dport && dev_is_pci(uport_dev))
+		port->pci_latency = cxl_pci_get_latency(to_pci_dev(uport_dev));
+
 	return port;
 
 err:
@@ -1067,6 +1070,9 @@ __devm_cxl_add_dport(struct cxl_port *port, struct device *dport_dev,
 	if (rc)
 		return ERR_PTR(rc);
 
+	if (dev_is_pci(dport_dev))
+		dport->link_latency = cxl_pci_get_latency(to_pci_dev(dport_dev));
+
 	return dport;
 }
 
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 587677fc597c..f1c43f02d65e 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -586,6 +586,7 @@ struct cxl_dax_region {
  * @depth: How deep this port is relative to the root. depth 0 is the root.
  * @cdat: Cached CDAT data
  * @cdat_available: Should a CDAT attribute be available in sysfs
+ * @pci_latency: Upstream latency in picoseconds
  */
 struct cxl_port {
 	struct device dev;
@@ -609,6 +610,7 @@ struct cxl_port {
 		size_t length;
 	} cdat;
 	bool cdat_available;
+	long pci_latency;
 };
 
 struct cxl_root_ops {
@@ -654,6 +656,7 @@ struct cxl_rcrb_info {
  * @rch: Indicate whether this dport was enumerated in RCH or VH mode
  * @port: reference to cxl_port that contains this downstream port
  * @coord: access coordinates (performance) for switch from CDAT
+ * @link_latency: calculated PCIe downstream latency
  */
 struct cxl_dport {
 	struct device *dport_dev;
@@ -663,6 +666,7 @@ struct cxl_dport {
 	bool rch;
 	struct cxl_port *port;
 	struct access_coordinate coord;
+	long link_latency;
 };
 
 /**
diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
index 0fa4799ea316..466563536a92 100644
--- a/drivers/cxl/cxlpci.h
+++ b/drivers/cxl/cxlpci.h
@@ -85,6 +85,19 @@ struct cdat_entry_header {
 	__le16 length;
 } __packed;
 
+/*
+ * CXL v3.0 6.2.3 Table 6-4
+ * The table indicates that if PCIe Flit Mode is set, then CXL is in 256B flits
+ * mode, otherwise it's 68B flits mode.
+ */
+static inline bool cxl_pci_flit_256(struct pci_dev *pdev)
+{
+	u16 lnksta2;
+
+	pcie_capability_read_word(pdev, PCI_EXP_LNKSTA2, &lnksta2);
+	return lnksta2 & PCI_EXP_LNKSTA2_FLIT;
+}
+
 int devm_cxl_port_enumerate_dports(struct cxl_port *port);
 struct cxl_dev_state;
 int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
@@ -93,4 +106,6 @@ void read_cdat_data(struct cxl_port *port);
 void cxl_cor_error_detected(struct pci_dev *pdev);
 pci_ers_result_t cxl_error_detected(struct pci_dev *pdev,
 				    pci_channel_state_t state);
+long cxl_pci_get_latency(struct pci_dev *pdev);
+
 #endif /* __CXL_PCI_H__ */



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v10 17/22] cxl: Store the access coordinates for the generic ports
  2023-10-11  1:04 [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (15 preceding siblings ...)
  2023-10-11  1:06 ` [PATCH v10 16/22] cxl: Calculate and store PCI link latency for the downstream ports Dave Jiang
@ 2023-10-11  1:06 ` Dave Jiang
  2023-10-11  1:06 ` [PATCH v10 18/22] cxl: Add helper function that calculates performance data for downstream ports Dave Jiang
                   ` (5 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Dave Jiang @ 2023-10-11  1:06 UTC (permalink / raw)
  To: linux-cxl
  Cc: Jonathan Cameron, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, Jonathan.Cameron, dave

Each CXL host bridge is represented by an ACPI0016 device. The generic port
device handle for an ACPI device is the concatenation of the ACPI0016
device HID and UID strings. Create the device handle from the ACPI device
and retrieve the access coordinates from the stored memory targets. The
access coordinates are stored under the cxl_dport that is associated with
the CXL host bridge.

The access coordinates struct is stored under cxl_dport so that code
later on can detect whether the data exists or not.
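The handle layout can be sketched in userspace C (the 16-byte handle size is
assumed here, matching ACPI_SRAT_DEVICE_HANDLE_SIZE; 8 bytes of HID followed
by 4 bytes of UID, per ACPI 6.5 table 5.65):

```c
#include <assert.h>
#include <string.h>

#define DEVICE_HANDLE_SIZE 16	/* assumed value of ACPI_SRAT_DEVICE_HANDLE_SIZE */

/* Fill @handle with 8 bytes of HID, then 4 bytes of UID, zero padded. */
static void build_genport_handle(unsigned char handle[DEVICE_HANDLE_SIZE],
				 const char *hid, const char *uid)
{
	memset(handle, 0, DEVICE_HANDLE_SIZE);
	strncpy((char *)handle, hid, 8);
	strncpy((char *)&handle[8], uid, 4);
}
```

A host bridge with HID "ACPI0016" and UID "0" thus yields the bytes
"ACPI0016" followed by '0' and zero padding.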

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/cxl/acpi.c |   17 +++++++++++++++++
 drivers/cxl/cxl.h  |    2 ++
 2 files changed, 19 insertions(+)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index a84fef73f8ce..b149be97c233 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -512,8 +512,21 @@ static int cxl_get_chbs(struct device *dev, struct acpi_device *hb,
 	return 0;
 }
 
+static int get_genport_coordinates(struct device *dev, struct cxl_dport *dport)
+{
+	struct acpi_device *hb = to_cxl_host_bridge(NULL, dev);
+	u8 handle[ACPI_SRAT_DEVICE_HANDLE_SIZE] = { 0 };
+
+	/* ACPI spec 6.5 table 5.65 */
+	strncpy(handle, acpi_device_hid(hb), 8);
+	strncpy(&handle[8], acpi_device_uid(hb), 4);
+
+	return acpi_get_genport_coordinates(handle, &dport->hb_access);
+}
+
 static int add_host_bridge_dport(struct device *match, void *arg)
 {
+	int ret;
 	acpi_status rc;
 	struct device *bridge;
 	struct cxl_dport *dport;
@@ -563,6 +576,10 @@ static int add_host_bridge_dport(struct device *match, void *arg)
 	if (IS_ERR(dport))
 		return PTR_ERR(dport);
 
+	ret = get_genport_coordinates(match, dport);
+	if (ret)
+		dev_dbg(match, "Failed to get generic port perf coordinates.\n");
+
 	return 0;
 }
 
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index f1c43f02d65e..f7bbd57d9bcf 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -655,6 +655,7 @@ struct cxl_rcrb_info {
  * @rcrb: Data about the Root Complex Register Block layout
  * @rch: Indicate whether this dport was enumerated in RCH or VH mode
  * @port: reference to cxl_port that contains this downstream port
+ * @hb_access: access coordinates (performance) from ACPI generic port
  * @coord: access coordinates (performance) for switch from CDAT
  * @link_latency: calculated PCIe downstream latency
  */
@@ -665,6 +666,7 @@ struct cxl_dport {
 	struct cxl_rcrb_info rcrb;
 	bool rch;
 	struct cxl_port *port;
+	struct access_coordinate hb_access;
 	struct access_coordinate coord;
 	long link_latency;
 };



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v10 18/22] cxl: Add helper function that calculates performance data for downstream ports
  2023-10-11  1:04 [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (16 preceding siblings ...)
  2023-10-11  1:06 ` [PATCH v10 17/22] cxl: Store the access coordinates for the generic ports Dave Jiang
@ 2023-10-11  1:06 ` Dave Jiang
  2023-10-11  1:06 ` [PATCH v10 19/22] cxl: Compute the entire CXL path latency and bandwidth data Dave Jiang
                   ` (4 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Dave Jiang @ 2023-10-11  1:06 UTC (permalink / raw)
  To: linux-cxl
  Cc: Jonathan Cameron, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, Jonathan.Cameron, dave

The CDAT information from the switch, the Switch Scoped Latency and Bandwidth
Information Structure (SSLBIS), is parsed and stored under a cxl_dport
based on the correlated downstream port id from the SSLBIS entry. Walk
the entire CXL port path and collect all the performance data. Also
pick up the link latency number that is stored under the dports. The
PCIe bandwidth of the entire path can be retrieved using the
pcie_bandwidth_available() call.
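The per-hop combine rule (latencies accumulate along the path, bandwidth is
capped by the narrowest hop) can be restated standalone; the struct layout
mirrors access_coordinate from this series, as a sketch rather than the
kernel definition:

```c
#include <assert.h>

struct access_coordinate {
	unsigned int read_bandwidth;	/* MB/s */
	unsigned int write_bandwidth;	/* MB/s */
	unsigned int read_latency;	/* ps */
	unsigned int write_latency;	/* ps */
};

#define MIN(a, b)	((a) < (b) ? (a) : (b))

/* Latency adds up hop by hop; bandwidth is limited by the narrowest hop.
 * A zero bandwidth means "no data for this hop" and is skipped. */
static void combine_coordinates(struct access_coordinate *c1,
				const struct access_coordinate *c2)
{
	if (c2->write_bandwidth)
		c1->write_bandwidth = MIN(c1->write_bandwidth,
					  c2->write_bandwidth);
	c1->write_latency += c2->write_latency;

	if (c2->read_bandwidth)
		c1->read_bandwidth = MIN(c1->read_bandwidth,
					 c2->read_bandwidth);
	c1->read_latency += c2->read_latency;
}
```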

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/cxl/core/port.c |   66 +++++++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h       |    3 ++
 2 files changed, 69 insertions(+)

diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 3b45da2b425e..f2b4f9048f2e 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -9,6 +9,7 @@
 #include <linux/pci.h>
 #include <linux/slab.h>
 #include <linux/idr.h>
+#include <linux/node.h>
 #include <cxlmem.h>
 #include <cxlpci.h>
 #include <cxl.h>
@@ -2016,6 +2017,71 @@ bool schedule_cxl_memdev_detach(struct cxl_memdev *cxlmd)
 }
 EXPORT_SYMBOL_NS_GPL(schedule_cxl_memdev_detach, CXL);
 
+static void combine_coordinates(struct access_coordinate *c1,
+				struct access_coordinate *c2)
+{
+		if (c2->write_bandwidth)
+			c1->write_bandwidth = min(c1->write_bandwidth,
+						  c2->write_bandwidth);
+		c1->write_latency += c2->write_latency;
+
+		if (c2->read_bandwidth)
+			c1->read_bandwidth = min(c1->read_bandwidth,
+						 c2->read_bandwidth);
+		c1->read_latency += c2->read_latency;
+}
+
+/**
+ * cxl_endpoint_get_perf_coordinates - Retrieve performance numbers stored in dports
+ *				   of CXL path
+ * @port: endpoint cxl_port
+ * @coord: output performance data
+ *
+ * Return: errno on failure, 0 on success.
+ */
+int cxl_endpoint_get_perf_coordinates(struct cxl_port *port,
+				      struct access_coordinate *coord)
+{
+	struct access_coordinate c = {
+		.read_bandwidth = UINT_MAX,
+		.write_bandwidth = UINT_MAX,
+	};
+	struct cxl_port *iter = port;
+	struct cxl_dport *dport;
+	struct pci_dev *pdev;
+	unsigned int bw;
+
+	if (!is_cxl_endpoint(port))
+		return -EINVAL;
+
+	dport = iter->parent_dport;
+	while (iter && !is_cxl_root(iter)) {
+		combine_coordinates(&c, &dport->coord);
+		c.write_latency += dport->link_latency;
+		c.read_latency += dport->link_latency;
+
+		combine_coordinates(&c, &dport->hb_access);
+
+		iter = to_cxl_port(iter->dev.parent);
+		dport = iter->parent_dport;
+	}
+
+	/* Get the calculated PCI paths bandwidth */
+	pdev = to_pci_dev(port->uport_dev->parent);
+	bw = pcie_bandwidth_available(pdev, NULL, NULL, NULL);
+	if (bw == 0)
+		return -ENXIO;
+	bw /= BITS_PER_BYTE;
+
+	c.write_bandwidth = min(c.write_bandwidth, bw);
+	c.read_bandwidth = min(c.read_bandwidth, bw);
+
+	*coord = c;
+
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_endpoint_get_perf_coordinates, CXL);
+
 /* for user tooling to ensure port disable work has completed */
 static ssize_t flush_store(const struct bus_type *bus, const char *buf, size_t count)
 {
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index f7bbd57d9bcf..61b58b25834d 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -888,6 +888,9 @@ static inline int cxl_cdat_switch_process(struct cxl_port *port)
 }
 #endif
 
+int cxl_endpoint_get_perf_coordinates(struct cxl_port *port,
+				      struct access_coordinate *coord);
+
 /*
  * Unit test builds overrides this to __weak, find the 'strong' version
  * of these symbols in tools/testing/cxl/.



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v10 19/22] cxl: Compute the entire CXL path latency and bandwidth data
  2023-10-11  1:04 [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (17 preceding siblings ...)
  2023-10-11  1:06 ` [PATCH v10 18/22] cxl: Add helper function that calculates performance data for downstream ports Dave Jiang
@ 2023-10-11  1:06 ` Dave Jiang
  2023-10-11  1:06 ` [PATCH v10 20/22] cxl: Store QTG IDs and related info to the CXL memory device context Dave Jiang
                   ` (3 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Dave Jiang @ 2023-10-11  1:06 UTC (permalink / raw)
  To: linux-cxl
  Cc: Jonathan Cameron, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, Jonathan.Cameron, dave

CXL Memory Device SW Guide [1] rev1.0 2.11.2 provides instructions on how to
calculate latency and bandwidth for a CXL memory device. Calculate the minimum
bandwidth and total latency for the path from the CXL device to the root
port. The QTG id is retrieved by providing the performance data as input
and calling the root port callback ->get_qos_class(). The retrieved id is
stored with the cxl_port of the CXL device.

For example for a device that is directly attached to a host bus:
Total Latency = Device Latency (from CDAT) + Dev to Host Bus (HB) Link
		Latency + Generic Port Latency
Min Bandwidth = Min bandwidth for link bandwidth between HB
		and CXL device, device CDAT bandwidth, and Generic Port
		Bandwidth

For a device that has a switch in between host bus and CXL device:
Total Latency = Device (CDAT) Latency + Dev to Switch Link Latency +
		Switch (CDAT) Latency + Switch to HB Link Latency +
		Generic Port Latency
Min Bandwidth = Min bandwidth for link bandwidth between CXL device
		to CXL switch, CXL device CDAT bandwidth, CXL switch CDAT
		bandwidth, CXL switch to HB bandwidth, and Generic Port
		Bandwidth.

[1]: https://cdrdv2-public.intel.com/643805/643805_CXL%20Memory%20Device%20SW%20Guide_Rev1p0.pdf
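With hypothetical numbers (latencies in picoseconds, bandwidths in MB/s; none
of these values come from the patch), the switch-attached formulas above work
out as:

```c
#include <assert.h>

#define MIN(a, b)	((a) < (b) ? (a) : (b))

/* Device CDAT latency + dev-to-switch link + switch CDAT latency +
 * switch-to-HB link + generic port latency. */
static long total_latency_ps(void)
{
	return 1000 + 17000 + 250 + 17000 + 500;
}

/* Narrowest of the five bandwidth contributors listed above. */
static long min_bandwidth_mbps(void)
{
	long bw = MIN(4000L, 8000L);	/* dev-to-switch link vs device CDAT */

	bw = MIN(bw, 8000L);		/* switch CDAT */
	bw = MIN(bw, 8000L);		/* switch-to-HB link */
	return MIN(bw, 6000L);		/* generic port */
}
```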

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>

---
v10:
- Update to new get_qos_class() API call. Only retrieve 1 QTG ID since
  this is what the kernel is using. Can expand to retrieve additional IDs
  in the future when needed. (Dan)
---
 drivers/cxl/cxl.h  |    4 ++++
 drivers/cxl/port.c |   58 +++++++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 59 insertions(+), 3 deletions(-)

diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 61b58b25834d..9569ff70ea6e 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -859,12 +859,16 @@ static inline struct cxl_dax_region *to_cxl_dax_region(struct device *dev)
 }
 #endif
 
+
 /* CDAT related bits */
 struct dsmas_entry {
 	struct list_head list;
 	struct range dpa_range;
 	u8 handle;
 	struct access_coordinate coord;
+
+	int entries;
+	int qos_class;
 };
 
 #ifdef CONFIG_FIRMWARE_TABLE
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index f5e2be79f217..99a619360bc5 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -57,6 +57,54 @@ static int discover_region(struct device *dev, void *root)
 	return 0;
 }
 
+static int cxl_port_perf_data_calculate(struct cxl_port *port,
+					struct list_head *dsmas_list)
+{
+	struct access_coordinate c;
+	struct cxl_port *root_port;
+	struct cxl_root *cxl_root;
+	struct dsmas_entry *dent;
+	int valid_entries = 0;
+	int rc;
+
+	rc = cxl_endpoint_get_perf_coordinates(port, &c);
+	if (rc) {
+		dev_dbg(&port->dev, "Failed to retrieve perf coordinates.\n");
+		return rc;
+	}
+
+	root_port = find_cxl_root(port);
+	cxl_root = to_cxl_root(root_port);
+	if (!cxl_root->ops || !cxl_root->ops->get_qos_class)
+		return -EOPNOTSUPP;
+
+	list_for_each_entry(dent, dsmas_list, list) {
+		int qos_class;
+
+		dent->coord.read_latency = dent->coord.read_latency +
+					   c.read_latency;
+		dent->coord.write_latency = dent->coord.write_latency +
+					    c.write_latency;
+		dent->coord.read_bandwidth = min_t(int, c.read_bandwidth,
+						   dent->coord.read_bandwidth);
+		dent->coord.write_bandwidth = min_t(int, c.write_bandwidth,
+						    dent->coord.write_bandwidth);
+
+		dent->entries = 1;
+		rc = cxl_root->ops->get_qos_class(root_port, &dent->coord, 1, &qos_class);
+		if (rc != 1)
+			continue;
+
+		valid_entries++;
+		dent->qos_class = qos_class;
+	}
+
+	if (!valid_entries)
+		return -ENOENT;
+
+	return 0;
+}
+
 static int cxl_switch_port_probe(struct cxl_port *port)
 {
 	struct cxl_hdm *cxlhdm;
@@ -146,10 +194,14 @@ static int cxl_endpoint_port_probe(struct cxl_port *port)
 		LIST_HEAD(dsmas_list);
 
 		rc = cxl_cdat_endpoint_process(port, &dsmas_list);
-		if (rc < 0)
+		if (rc < 0) {
 			dev_dbg(&port->dev, "Failed to parse CDAT: %d\n", rc);
-
-		/* Performance data processing */
+		} else {
+			rc = cxl_port_perf_data_calculate(port, &dsmas_list);
+			if (rc)
+				dev_dbg(&port->dev,
+					"Failed to do perf coord calculations.\n");
+		}
 
 		cxl_cdat_dsmas_list_destroy(&dsmas_list);
 	}



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v10 20/22] cxl: Store QTG IDs and related info to the CXL memory device context
  2023-10-11  1:04 [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (18 preceding siblings ...)
  2023-10-11  1:06 ` [PATCH v10 19/22] cxl: Compute the entire CXL path latency and bandwidth data Dave Jiang
@ 2023-10-11  1:06 ` Dave Jiang
  2023-10-11 13:19   ` Jonathan Cameron
  2023-10-11  1:07 ` [PATCH v10 21/22] cxl: Export sysfs attributes for memory device QoS class Dave Jiang
                   ` (2 subsequent siblings)
  22 siblings, 1 reply; 36+ messages in thread
From: Dave Jiang @ 2023-10-11  1:06 UTC (permalink / raw)
  To: linux-cxl
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	Jonathan.Cameron, dave

Once the QTG ID _DSM is executed successfully, the QTG ID is retrieved from
the return package. Create a list of entries in the cxl_memdev context and
store the QTG ID as the qos_class token along with the associated DPA range.
This information can be exposed to user space via sysfs in order to help
region setup for hot-plugged CXL memory devices.
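A sketch of how a DSMAS DPA range is attributed to a partition;
range_contains() and the enum here are illustrative stand-ins, not the
kernel's definitions:

```c
#include <assert.h>
#include <stdbool.h>

struct range {
	long long start;
	long long end;	/* inclusive */
};

/* True if @r1 fully covers @r2 (both bounds inclusive). */
static bool range_contains(const struct range *r1, const struct range *r2)
{
	return r1->start <= r2->start && r1->end >= r2->end;
}

enum partition { PART_NONE, PART_RAM, PART_PMEM };

/* Pick the partition whose DPA range covers the DSMAS entry, if any. */
static enum partition classify(const struct range *ram,
			       const struct range *pmem,
			       const struct range *dsmas)
{
	if (range_contains(ram, dsmas))
		return PART_RAM;
	if (range_contains(pmem, dsmas))
		return PART_PMEM;
	return PART_NONE;
}
```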

Signed-off-by: Dave Jiang <dave.jiang@intel.com>

---
v10:
- Store single qos_class value. (Dan)
- Rename cxl_memdev_set_qtg() to cxl_memdev_set_qos_class()
- Removed Jonathan's review tag due to code changes.
---
 drivers/cxl/core/mbox.c |    1 +
 drivers/cxl/cxlmem.h    |   23 +++++++++++++++++++++++
 drivers/cxl/port.c      |   36 ++++++++++++++++++++++++++++++++++++
 3 files changed, 60 insertions(+)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 4df4f614f490..6193b8d57469 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -1378,6 +1378,7 @@ struct cxl_memdev_state *cxl_memdev_state_create(struct device *dev)
 	mutex_init(&mds->event.log_lock);
 	mds->cxlds.dev = dev;
 	mds->cxlds.type = CXL_DEVTYPE_CLASSMEM;
+	INIT_LIST_HEAD(&mds->perf_list);
 
 	return mds;
 }
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 706f8a6d1ef4..a310a51a3fa4 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -6,6 +6,7 @@
 #include <linux/cdev.h>
 #include <linux/uuid.h>
 #include <linux/rcuwait.h>
+#include <linux/node.h>
 #include "cxl.h"
 
 /* CXL 2.0 8.2.8.5.1.1 Memory Device Status Register */
@@ -388,6 +389,20 @@ enum cxl_devtype {
 	CXL_DEVTYPE_CLASSMEM,
 };
 
+/**
+ * struct perf_prop_entry - performance property entry
+ * @list: list entry
+ * @dpa_range: range for DPA address
+ * @coord: QoS performance data (i.e. latency, bandwidth)
+ * @qos_class: QoS Class cookie
+ */
+struct perf_prop_entry {
+	struct list_head list;
+	struct range dpa_range;
+	struct access_coordinate coord;
+	int qos_class;
+};
+
 /**
  * struct cxl_dev_state - The driver device state
  *
@@ -452,6 +467,9 @@ struct cxl_dev_state {
  * @security: security driver state info
  * @fw: firmware upload / activation state
  * @mbox_send: @dev specific transport for transmitting mailbox commands
+ * @ram_qos_class: QoS class cookies for volatile region
+ * @pmem_qos_class: QoS class cookies for persistent region
+ * @perf_list: performance data entries list
  *
  * See CXL 3.0 8.2.9.8.2 Capacity Configuration and Label Storage for
  * details on capacity parameters.
@@ -472,6 +490,11 @@ struct cxl_memdev_state {
 	u64 active_persistent_bytes;
 	u64 next_volatile_bytes;
 	u64 next_persistent_bytes;
+
+	int ram_qos_class;
+	int pmem_qos_class;
+	struct list_head perf_list;
+
 	struct cxl_event_state event;
 	struct cxl_poison_state poison;
 	struct cxl_security_state security;
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index 99a619360bc5..049a16b7eb1f 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -105,6 +105,40 @@ static int cxl_port_perf_data_calculate(struct cxl_port *port,
 	return 0;
 }
 
+static void cxl_memdev_set_qos_class(struct cxl_dev_state *cxlds,
+				     struct list_head *dsmas_list)
+{
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
+	struct range pmem_range = {
+		.start = cxlds->pmem_res.start,
+		.end = cxlds->pmem_res.end,
+	};
+	struct range ram_range = {
+		.start = cxlds->ram_res.start,
+		.end = cxlds->ram_res.end,
+	};
+	struct perf_prop_entry *perf;
+	struct dsmas_entry *dent;
+
+	list_for_each_entry(dent, dsmas_list, list) {
+		perf = devm_kzalloc(cxlds->dev, sizeof(*perf), GFP_KERNEL);
+		if (!perf)
+			return;
+
+		perf->dpa_range = dent->dpa_range;
+		perf->coord = dent->coord;
+		perf->qos_class = dent->qos_class;
+		list_add_tail(&perf->list, &mds->perf_list);
+
+		if (resource_size(&cxlds->ram_res) &&
+		    range_contains(&ram_range, &dent->dpa_range))
+			mds->ram_qos_class = perf->qos_class;
+		else if (resource_size(&cxlds->pmem_res) &&
+			 range_contains(&pmem_range, &dent->dpa_range))
+			mds->pmem_qos_class = perf->qos_class;
+	}
+}
+
 static int cxl_switch_port_probe(struct cxl_port *port)
 {
 	struct cxl_hdm *cxlhdm;
@@ -201,6 +235,8 @@ static int cxl_endpoint_port_probe(struct cxl_port *port)
 			if (rc)
 				dev_dbg(&port->dev,
 					"Failed to do perf coord calculations.\n");
+			else
+				cxl_memdev_set_qos_class(cxlds, &dsmas_list);
 		}
 
 		cxl_cdat_dsmas_list_destroy(&dsmas_list);



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v10 21/22] cxl: Export sysfs attributes for memory device QoS class
  2023-10-11  1:04 [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (19 preceding siblings ...)
  2023-10-11  1:06 ` [PATCH v10 20/22] cxl: Store QTG IDs and related info to the CXL memory device context Dave Jiang
@ 2023-10-11  1:07 ` Dave Jiang
  2023-10-11 13:26   ` Jonathan Cameron
  2023-10-11  1:07 ` [PATCH v10 22/22] cxl: Check qos_class validity on memdev probe Dave Jiang
  2023-10-11 12:59 ` [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem Jonathan Cameron
  22 siblings, 1 reply; 36+ messages in thread
From: Dave Jiang @ 2023-10-11  1:07 UTC (permalink / raw)
  To: linux-cxl
  Cc: Dan Williams, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, Jonathan.Cameron, dave

Export qos_class sysfs attributes for the CXL memory device. The QoS class
should show up as /sys/bus/cxl/devices/memX/ram/qos_class0 for the volatile
partition and /sys/bus/cxl/devices/memX/pmem/qos_class0 for the persistent
partition. The QTG ID is retrieved via _DSM after supplying the
calculated bandwidth and latency for the entire CXL path from device to
the CPU. This ID is used to match up to the root decoder QoS class to
determine which CFMWS the memory range of a hotplugged CXL mem device
should be assigned under.

While there may be multiple DSMAS exported by the device CDAT, the driver
will only expose the first QTG ID per partition in sysfs for now. In the
future when multiple QTG IDs are necessary, they can be exposed. [1]

[1]: https://lore.kernel.org/linux-cxl/167571650007.587790.10040913293130712882.stgit@djiang5-mobl3.local/T/#md2a47b1ead3e1ba08f50eab29a4af1aed1d215ab

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>

---
v10:
- Export only qos_class0, the first entry. Additional qos_class entries can be
  exported later as needed. (Dan)
- Have the sysfs attrib return -ENOENT unless driver is attached. (Dan)
- Removed Jonathan's review tag due to code changes.
---
 Documentation/ABI/testing/sysfs-bus-cxl |   34 +++++++++++++++++++++++++++++++
 drivers/cxl/core/memdev.c               |   34 +++++++++++++++++++++++++++++++
 2 files changed, 68 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
index 44ffbbb36654..dd613f5987b5 100644
--- a/Documentation/ABI/testing/sysfs-bus-cxl
+++ b/Documentation/ABI/testing/sysfs-bus-cxl
@@ -28,6 +28,23 @@ Description:
 		Payload in the CXL-2.0 specification.
 
 
+What:		/sys/bus/cxl/devices/memX/ram/qos_class0
+Date:		May, 2023
+KernelVersion:	v6.7
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		(RO) For CXL host platforms that support "QoS Telemetry"
+		this attribute conveys a comma delimited list of platform
+		specific cookies that identify a QoS performance class
+		for the volatile partition of the CXL mem device. These
+		class-ids can be compared against a similar "qos_class"
+		published for a root decoder. While it is not required
+		that the endpoints map their local memory-class to a
+		matching platform class, mismatches are not recommended
+		and there are platform specific performance related
+		side-effects that may result. Only the first class-id is shown.
+
+
 What:		/sys/bus/cxl/devices/memX/pmem/size
 Date:		December, 2020
 KernelVersion:	v5.12
@@ -38,6 +55,23 @@ Description:
 		Payload in the CXL-2.0 specification.
 
 
+What:		/sys/bus/cxl/devices/memX/pmem/qos_class0
+Date:		May, 2023
+KernelVersion:	v6.7
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		(RO) For CXL host platforms that support "QoS Telemetry"
+		this attribute conveys a comma delimited list of platform
+		specific cookies that identify a QoS performance class
+		for the persistent partition of the CXL mem device. These
+		class-ids can be compared against a similar "qos_class"
+		published for a root decoder. While it is not required
+		that the endpoints map their local memory-class to a
+		matching platform class, mismatches are not recommended
+		and there are platform specific performance related
+		side-effects that may result. Only the first class-id is shown.
+
+
 What:		/sys/bus/cxl/devices/memX/serial
 Date:		January, 2022
 KernelVersion:	v5.18
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 14b547c07f54..bfecc9da2561 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -88,6 +88,22 @@ static ssize_t ram_size_show(struct device *dev, struct device_attribute *attr,
 static struct device_attribute dev_attr_ram_size =
 	__ATTR(size, 0444, ram_size_show, NULL);
 
+static ssize_t ram_qos_class0_show(struct device *dev,
+				  struct device_attribute *attr, char *buf)
+{
+	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
+
+	if (!dev->driver)
+		return -ENOENT;
+
+	return sysfs_emit(buf, "%d\n", mds->ram_qos_class);
+}
+
+static struct device_attribute dev_attr_ram_qos_class0 =
+	__ATTR(qos_class0, 0444, ram_qos_class0_show, NULL);
+
 static ssize_t pmem_size_show(struct device *dev, struct device_attribute *attr,
 			      char *buf)
 {
@@ -101,6 +117,22 @@ static ssize_t pmem_size_show(struct device *dev, struct device_attribute *attr,
 static struct device_attribute dev_attr_pmem_size =
 	__ATTR(size, 0444, pmem_size_show, NULL);
 
+static ssize_t pmem_qos_class0_show(struct device *dev,
+				    struct device_attribute *attr, char *buf)
+{
+	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
+
+	if (!dev->driver)
+		return -ENOENT;
+
+	return sysfs_emit(buf, "%d\n", mds->pmem_qos_class);
+}
+
+static struct device_attribute dev_attr_pmem_qos_class0 =
+	__ATTR(qos_class0, 0444, pmem_qos_class0_show, NULL);
+
 static ssize_t serial_show(struct device *dev, struct device_attribute *attr,
 			   char *buf)
 {
@@ -439,11 +471,13 @@ static struct attribute *cxl_memdev_attributes[] = {
 
 static struct attribute *cxl_memdev_pmem_attributes[] = {
 	&dev_attr_pmem_size.attr,
+	&dev_attr_pmem_qos_class0.attr,
 	NULL,
 };
 
 static struct attribute *cxl_memdev_ram_attributes[] = {
 	&dev_attr_ram_size.attr,
+	&dev_attr_ram_qos_class0.attr,
 	NULL,
 };
 



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v10 22/22] cxl: Check qos_class validity on memdev probe
  2023-10-11  1:04 [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (20 preceding siblings ...)
  2023-10-11  1:07 ` [PATCH v10 21/22] cxl: Export sysfs attributes for memory device QoS class Dave Jiang
@ 2023-10-11  1:07 ` Dave Jiang
  2023-10-11 13:29   ` Jonathan Cameron
  2023-10-11 12:59 ` [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem Jonathan Cameron
  22 siblings, 1 reply; 36+ messages in thread
From: Dave Jiang @ 2023-10-11  1:07 UTC (permalink / raw)
  To: linux-cxl
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	Jonathan.Cameron, dave

Add a check to make sure the qos_class for the device matches one of
the root decoders' qos_class. If no match is found, the qos_class for
the device is set to invalid.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/cxl/mem.c |   68 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)

diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index 317c7548e4e9..3495119d2edf 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -104,6 +104,70 @@ static int cxl_debugfs_poison_clear(void *data, u64 dpa)
 DEFINE_DEBUGFS_ATTRIBUTE(cxl_poison_clear_fops, NULL,
 			 cxl_debugfs_poison_clear, "%llx\n");
 
+struct qos_class_ctx {
+	bool matched;
+	int dev_qos_class;
+};
+
+static int match_cxlrd_qos_class(struct device *dev, void *data)
+{
+	struct qos_class_ctx *ctx = data;
+	struct cxl_root_decoder *cxlrd;
+
+	if (ctx->matched)
+		return 0;
+
+	if (!is_root_decoder(dev))
+		return 0;
+
+	cxlrd = to_cxl_root_decoder(dev);
+	if (cxlrd->qos_class == CXL_QOS_CLASS_INVALID ||
+	    ctx->dev_qos_class == CXL_QOS_CLASS_INVALID)
+		return 0;
+
+	if (cxlrd->qos_class == ctx->dev_qos_class)
+		ctx->matched = 1;
+
+	return 0;
+}
+
+static int cxl_qos_class_verify(struct cxl_memdev *cxlmd)
+{
+	struct device *dev = &cxlmd->dev;
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
+	struct qos_class_ctx ctx;
+	int rc;
+
+	if (mds->ram_qos_class != CXL_QOS_CLASS_INVALID) {
+		ctx.matched = false;
+		ctx.dev_qos_class = mds->ram_qos_class;
+		rc = bus_for_each_dev(dev->bus, NULL, &ctx, match_cxlrd_qos_class);
+		if (rc)
+			return rc;
+
+		if (!ctx.matched) {
+			mds->ram_qos_class =
+				CXL_QOS_CLASS_INVALID;
+		}
+	}
+
+	if (mds->pmem_qos_class != CXL_QOS_CLASS_INVALID) {
+		ctx.matched = false;
+		ctx.dev_qos_class = mds->pmem_qos_class;
+		rc = bus_for_each_dev(dev->bus, NULL, &ctx, match_cxlrd_qos_class);
+		if (rc)
+			return rc;
+
+		if (!ctx.matched) {
+			mds->pmem_qos_class =
+				CXL_QOS_CLASS_INVALID;
+		}
+	}
+
+	return 0;
+}
+
 static int cxl_mem_probe(struct device *dev)
 {
 	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
@@ -181,6 +245,10 @@ static int cxl_mem_probe(struct device *dev)
 			return rc;
 	}
 
+	rc = cxl_qos_class_verify(cxlmd);
+	if (rc)
+		return rc;
+
 	/*
 	 * The kernel may be operating out of CXL memory on this device,
 	 * there is no spec defined way to determine whether this device



^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH v10 04/22] acpi: Move common tables helper functions to common lib
  2023-10-11  1:05 ` [PATCH v10 04/22] acpi: Move common tables helper functions to common lib Dave Jiang
@ 2023-10-11 12:55   ` Jonathan Cameron
  0 siblings, 0 replies; 36+ messages in thread
From: Jonathan Cameron @ 2023-10-11 12:55 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, Rafael J. Wysocki, Hanjun Guo, Rafael J. Wysocki,
	dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	dave

On Tue, 10 Oct 2023 18:05:21 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> Some of the routines in ACPI driver/acpi/tables.c can be shared with
> parsing CDAT. CDAT is a device-provided data structure that is formatted
> similar to a platform provided ACPI table. CDAT is used by CXL and can
> exist on platforms that do not use ACPI. Split out the common routine
> from ACPI to accommodate platforms that do not support ACPI and move that
> to /lib. The common routines can be built outside of ACPI if
> FIRMWARE_TABLES is selected.
> 
> Link: https://lore.kernel.org/linux-cxl/CAJZ5v0jipbtTNnsA0-o5ozOk8ZgWnOg34m34a9pPenTyRLj=6A@mail.gmail.com/
> Suggested-by: Rafael J. Wysocki <rafael@kernel.org>
> Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
It's a simple code move, so all I've really checked is that no obvious differences
snuck in.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Mostly so I remember I checked if you do a v11 :)

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v10 06/22] base/node / acpi: Change 'node_hmem_attrs' to 'access_coordinates'
  2023-10-11  1:05 ` [PATCH v10 06/22] base/node / acpi: Change 'node_hmem_attrs' to 'access_coordinates' Dave Jiang
@ 2023-10-11 12:57   ` Jonathan Cameron
  0 siblings, 0 replies; 36+ messages in thread
From: Jonathan Cameron @ 2023-10-11 12:57 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, Dan Williams, Greg Kroah-Hartman, ira.weiny,
	vishal.l.verma, alison.schofield, dave

On Tue, 10 Oct 2023 18:05:33 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> Dan Williams suggested changing the struct 'node_hmem_attrs' to
> 'access_coordinates' [1]. The struct is a container of r/w-latency and
> r/w-bandwidth numbers. Moving forward, this container will also be used by
> CXL to store the performance characteristics of each link hop in
> the PCIE/CXL topology. So, where node_hmem_attrs is just the access
> parameters of a memory-node, access_coordinates applies more broadly
> to hardware topology characteristics. The observation is that this seemed like
> an exercise in having the application identify "where" it falls on a
> spectrum of bandwidth and latency needs. For the tuple of read/write-latency
> and read/write-bandwidth, "coordinates" is not a perfect fit. Sometimes it
> is just conveying values in isolation and not a "location" relative to
> other performance points, but in the end this data is used to identify the
> performance operation point of a given memory-node. [2]
> 
> Link: http://lore.kernel.org/r/64471313421f7_1b66294d5@dwillia2-xfh.jf.intel.com.notmuch/
> Link: https://lore.kernel.org/linux-cxl/645e6215ee0de_1e6f2945e@dwillia2-xfh.jf.intel.com.notmuch/
> Suggested-by: Dan Williams <dan.j.williams@intel.com>
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Whilst the coordinates analogy is a bit weak in my mind, I guess I'll get used
to it.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
>  drivers/acpi/numa/hmat.c |   20 ++++++++++----------
>  drivers/base/node.c      |   12 ++++++------
>  include/linux/node.h     |    8 ++++----
>  3 files changed, 20 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
> index bba268ecd802..f9ff992038fa 100644
> --- a/drivers/acpi/numa/hmat.c
> +++ b/drivers/acpi/numa/hmat.c
> @@ -62,7 +62,7 @@ struct memory_target {
>  	unsigned int memory_pxm;
>  	unsigned int processor_pxm;
>  	struct resource memregions;
> -	struct node_hmem_attrs hmem_attrs[2];
> +	struct access_coordinate coord[2];
>  	struct list_head caches;
>  	struct node_cache_attrs cache_attrs;
>  	bool registered;
> @@ -227,24 +227,24 @@ static void hmat_update_target_access(struct memory_target *target,
>  {
>  	switch (type) {
>  	case ACPI_HMAT_ACCESS_LATENCY:
> -		target->hmem_attrs[access].read_latency = value;
> -		target->hmem_attrs[access].write_latency = value;
> +		target->coord[access].read_latency = value;
> +		target->coord[access].write_latency = value;
>  		break;
>  	case ACPI_HMAT_READ_LATENCY:
> -		target->hmem_attrs[access].read_latency = value;
> +		target->coord[access].read_latency = value;
>  		break;
>  	case ACPI_HMAT_WRITE_LATENCY:
> -		target->hmem_attrs[access].write_latency = value;
> +		target->coord[access].write_latency = value;
>  		break;
>  	case ACPI_HMAT_ACCESS_BANDWIDTH:
> -		target->hmem_attrs[access].read_bandwidth = value;
> -		target->hmem_attrs[access].write_bandwidth = value;
> +		target->coord[access].read_bandwidth = value;
> +		target->coord[access].write_bandwidth = value;
>  		break;
>  	case ACPI_HMAT_READ_BANDWIDTH:
> -		target->hmem_attrs[access].read_bandwidth = value;
> +		target->coord[access].read_bandwidth = value;
>  		break;
>  	case ACPI_HMAT_WRITE_BANDWIDTH:
> -		target->hmem_attrs[access].write_bandwidth = value;
> +		target->coord[access].write_bandwidth = value;
>  		break;
>  	default:
>  		break;
> @@ -701,7 +701,7 @@ static void hmat_register_target_cache(struct memory_target *target)
>  static void hmat_register_target_perf(struct memory_target *target, int access)
>  {
>  	unsigned mem_nid = pxm_to_node(target->memory_pxm);
> -	node_set_perf_attrs(mem_nid, &target->hmem_attrs[access], access);
> +	node_set_perf_attrs(mem_nid, &target->coord[access], access);
>  }
>  
>  static void hmat_register_target_devices(struct memory_target *target)
> diff --git a/drivers/base/node.c b/drivers/base/node.c
> index 493d533f8375..cb2b6cc7f6e6 100644
> --- a/drivers/base/node.c
> +++ b/drivers/base/node.c
> @@ -74,14 +74,14 @@ static BIN_ATTR_RO(cpulist, CPULIST_FILE_MAX_BYTES);
>   * @dev:	Device for this memory access class
>   * @list_node:	List element in the node's access list
>   * @access:	The access class rank
> - * @hmem_attrs: Heterogeneous memory performance attributes
> + * @coord:	Heterogeneous memory performance coordinates
>   */
>  struct node_access_nodes {
>  	struct device		dev;
>  	struct list_head	list_node;
>  	unsigned int		access;
>  #ifdef CONFIG_HMEM_REPORTING
> -	struct node_hmem_attrs	hmem_attrs;
> +	struct access_coordinate	coord;
>  #endif
>  };
>  #define to_access_nodes(dev) container_of(dev, struct node_access_nodes, dev)
> @@ -167,7 +167,7 @@ static ssize_t property##_show(struct device *dev,			\
>  			   char *buf)					\
>  {									\
>  	return sysfs_emit(buf, "%u\n",					\
> -			  to_access_nodes(dev)->hmem_attrs.property);	\
> +			  to_access_nodes(dev)->coord.property);	\
>  }									\
>  static DEVICE_ATTR_RO(property)
>  
> @@ -187,10 +187,10 @@ static struct attribute *access_attrs[] = {
>  /**
>   * node_set_perf_attrs - Set the performance values for given access class
>   * @nid: Node identifier to be set
> - * @hmem_attrs: Heterogeneous memory performance attributes
> + * @coord: Heterogeneous memory performance coordinates
>   * @access: The access class the for the given attributes
>   */
> -void node_set_perf_attrs(unsigned int nid, struct node_hmem_attrs *hmem_attrs,
> +void node_set_perf_attrs(unsigned int nid, struct access_coordinate *coord,
>  			 unsigned int access)
>  {
>  	struct node_access_nodes *c;
> @@ -205,7 +205,7 @@ void node_set_perf_attrs(unsigned int nid, struct node_hmem_attrs *hmem_attrs,
>  	if (!c)
>  		return;
>  
> -	c->hmem_attrs = *hmem_attrs;
> +	c->coord = *coord;
>  	for (i = 0; access_attrs[i] != NULL; i++) {
>  		if (sysfs_add_file_to_group(&c->dev.kobj, access_attrs[i],
>  					    "initiators")) {
> diff --git a/include/linux/node.h b/include/linux/node.h
> index 427a5975cf40..25b66d705ee2 100644
> --- a/include/linux/node.h
> +++ b/include/linux/node.h
> @@ -20,14 +20,14 @@
>  #include <linux/list.h>
>  
>  /**
> - * struct node_hmem_attrs - heterogeneous memory performance attributes
> + * struct access_coordinate - generic performance coordinates container
>   *
>   * @read_bandwidth:	Read bandwidth in MB/s
>   * @write_bandwidth:	Write bandwidth in MB/s
>   * @read_latency:	Read latency in nanoseconds
>   * @write_latency:	Write latency in nanoseconds
>   */
> -struct node_hmem_attrs {
> +struct access_coordinate {
>  	unsigned int read_bandwidth;
>  	unsigned int write_bandwidth;
>  	unsigned int read_latency;
> @@ -65,7 +65,7 @@ struct node_cache_attrs {
>  
>  #ifdef CONFIG_HMEM_REPORTING
>  void node_add_cache(unsigned int nid, struct node_cache_attrs *cache_attrs);
> -void node_set_perf_attrs(unsigned int nid, struct node_hmem_attrs *hmem_attrs,
> +void node_set_perf_attrs(unsigned int nid, struct access_coordinate *coord,
>  			 unsigned access);
>  #else
>  static inline void node_add_cache(unsigned int nid,
> @@ -74,7 +74,7 @@ static inline void node_add_cache(unsigned int nid,
>  }
>  
>  static inline void node_set_perf_attrs(unsigned int nid,
> -				       struct node_hmem_attrs *hmem_attrs,
> +				       struct access_coordinate *coord,
>  				       unsigned access)
>  {
>  }
> 
> 
> 


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem
  2023-10-11  1:04 [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem Dave Jiang
                   ` (21 preceding siblings ...)
  2023-10-11  1:07 ` [PATCH v10 22/22] cxl: Check qos_class validity on memdev probe Dave Jiang
@ 2023-10-11 12:59 ` Jonathan Cameron
  2023-10-11 16:31   ` Dave Jiang
  22 siblings, 1 reply; 36+ messages in thread
From: Jonathan Cameron @ 2023-10-11 12:59 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, Hanjun Guo, Len Brown, Ira Weiny, Dan Williams,
	Greg Kroah-Hartman, Rafael J. Wysocki, Davidlohr Bueso,
	Rafael J. Wysocki, vishal.l.verma, alison.schofield

On Tue, 10 Oct 2023 18:04:56 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

Awkward to rename a patch series, but it feels like the QTG ID part may be the
last thing added; there is a whole bunch of enabling work in between that
isn't covered by the one-line description.

J

> v10:
> - Remove memory allocation in DSM handler. Add input parameters to request number of ids
>   to return. (Dan)
> - Only export a single qos_class sysfs attribute, qos_class0 for devices. (Dan)
> - Add sanity check of device qos_class against root decoders (Dan)
> - Recombine all relevant patches for cxl upstream merge
> - Rebased against v6.6-rc4
> 
> v9:
> - Correct DSM input package to use integers. (Dan)
> - Remove all endien conversions. (Dan)
> 
> v8:
> - Correct DSM return package parsing to use integers
> 
> v7:
> - Minor changes. Please see specific patches for log entries addressing comments
>   from v6.
> 
> v6:
> - Please see specific patches for log entries addressing comments from v5.
> - Use CDAT sub-table structs as is, adjust w/o common header. Changing causes
>   significant ACPICA code changes.
> - Deal with table header now that is a union. (Jonathan)
> - Move DSMAS list destroy to core/cdat.c. (Jonathan)
> - Retain and display entire QTG ID list from _DSM. (Jonathan)
> 
> v5:
> - Please see specific patches for log entries addressing comments from v4.
> - Split out ACPI and generic node code to send separately to respective maintainers
> - Reworked to use ACPI tables code for CDAT parsing (Dan)
> - Cache relevant perf data under dports (Dan)
> - Add cxl root callback for QTG ID _DSM (Dan)
> - Rename 'node_hmem_attr' to 'access_coordinate' (Dan)
> - Change qtg_id sysfs attrib to qos_class (Dan)
> 
> v4:
> - Reworked PCIe link path latency calculation
> - 0-day fixes
> - Removed unused qos_list from cxl_memdev and its stray usages
> 
> v3:
> - Please see specific patches for log entries addressing comments from v2.
> - Refactor cxl_port_probe() additions. (Alison)
> - Convert to use 'struct node_hmem_attrs'
> - Refactor to use common code for genport target allocation.
> - Add third array entry for target hmem_attrs to store genport locality data.
> - Go back to per partition QTG ID. (Dan)
> 
> v2:
> - Please see specific patches for log entries addressing comments from v1.
> - Removed ACPICA code usages.
> - Removed PCI subsystem helpers for latency and bandwidth.
> - Add CXL switch CDAT parsing support (SSLBIS)
> - Add generic port SRAT+HMAT support (ACPI)
> - Export a single QTG ID via sysfs per memory device (Dan)
> - Provide rest of DSMAS range info in debugfs (Dan)
> 
> Hi Dan,
> Please consider taking the entire series including the CXL bits for the next
> convenient merge window. Thanks!
> 
> This series adds the retrieval of QoS Throttling Group (QTG) IDs for the CXL Fixed
> Memory Window Structure (CFMWS) and the CXL memory device. It provides the QTG IDs
> to user space to give some guidance for putting the proper DPA range under the
> appropriate CFMWS window for a hot-plugged CXL memory device.
> 
> The CFMWS structure contains a QTG ID that is associated with the memory window that the
> structure exports. On Linux, the CFMWS is represented as a CXL root decoder. The QTG
> ID will be attached to the CXL root decoder and exported as a sysfs attribute (qos_class).
> 
> The QTG IDs for a device are retrieved by sending a _DSM method to the ACPI0017 device.
> The _DSM expects an input package of 4 DWORDs that contains the read latency, write
> latency, read bandwidth, and write bandwidth. These are the calculated numbers for the
> path between the CXL device and the CPU. The list of QTG IDs are also exported as a sysfs
> attribute under the mem device memory partition type:
> /sys/bus/cxl/devices/memX/ram/qos_class
> /sys/bus/cxl/devices/memX/pmem/qos_class
> A mapping of DPA ranges and their correlated QTG IDs can be found under
> /sys/kernel/debug/cxl/memX/qtgmap. Each DSMAS from the device CDAT will provide a DPA
> range.
> 
> The latency numbers are the aggregated latencies for the path between the CXL device and
> the CPU. If a CXL device is directly attached to the CXL HB, the latency
> would be the aggregated latencies from the device Coherent Device Attribute Table (CDAT),
> the calculated PCIe link latency between the device and the HB, and the generic port data
> from ACPI SRAT+HMAT. The bandwidth in this configuration would be the minimum between the
> CDAT bandwidth number, link bandwidth between the device and the HB, and the bandwidth data
> from the generic port data via ACPI SRAT+HMAT.
> 
> If a configuration has a switch in between then the latency would be the aggregated
> latencies from the device CDAT, the link latency between device and switch, the
> latency from the switch CDAT, the link latency between switch and the HB, and the
> generic port latency between the CPU and the CXL HB. The bandwidth calculation would be the
> min of device CDAT bandwidth, link bandwidth between device and switch, switch CDAT
> bandwidth, the link bandwidth between switch and HB, and the generic port bandwidth.
> 
> There can be 0 or more switches between the CXL device and the CXL HB. There are detailed
> examples on calculating bandwidth and latency in the CXL Memory Device Software Guide [4].
> 
> The CDAT provides Device Scoped Memory Affinity Structures (DSMAS) that contain the
> Device Physical Address (DPA) range and the related Device Scoped Latency and Bandwidth
> Information Structures (DSLBIS). Each DSLBIS provides a latency or bandwidth entry that is
> tied to a DSMAS entry via a per-DSMAS unique DSMAS handle.
> 
> Test setup is done with runqemu genport support branch [6]. The setup provides 2 CXL HBs
> with one HB having a CXL switch underneath. It also provides generic port support detailed
> below.
> 
> A hacked up qemu branch is used to support generic port SRAT and HMAT [7].
> 
> To create the appropriate HMAT entries for generic port, the following QEMU parameters must
> be added:
> 
> -object genport,id=$X -numa node,genport=genport$X,nodeid=$Y,initiator=$Z
> -numa hmat-lb,initiator=$Z,target=$X,hierarchy=memory,data-type=access-latency,latency=$latency
> -numa hmat-lb,initiator=$Z,target=$X,hierarchy=memory,data-type=access-bandwidth,bandwidth=$bandwidthM
> for ((i = 0; i < total_nodes; i++)); do
> 	for ((j = 0; j < cxl_hbs; j++ )); do	# 2 CXL HBs
> 		-numa dist,src=$i,dst=$X,val=$dist
> 	done
> done
> 
> See the genport run_qemu branch for full details.
> 
> [1]: https://www.computeexpresslink.org/download-the-specification
> [2]: https://uefi.org/sites/default/files/resources/Coherent%20Device%20Attribute%20Table_1.01.pdf
> [3]: https://uefi.org/sites/default/files/resources/ACPI_Spec_6_5_Aug29.pdf
> [4]: https://cdrdv2-public.intel.com/643805/643805_CXL%20Memory%20Device%20SW%20Guide_Rev1p0.pdf
> [5]: https://lore.kernel.org/linux-cxl/20230313195530.GA1532686@bhelgaas/T/#t
> [6]: https://git.kernel.org/pub/scm/linux/kernel/git/djiang/linux.git/log/?h=cxl-qtg
> [7]: https://github.com/pmem/run_qemu/tree/djiang/genport
> [8]: https://github.com/davejiang/qemu/tree/genport
> 
> ---
> 
> base-commit: 178e1ea6a68f12967ee0e9afc4d79a2939acd43c
> 
> ---
> Dave Jiang (22):
>       cxl: Export QTG ids from CFMWS to sysfs as qos_class attribute
>       cxl: Add checksum verification to CDAT from CXL
>       cxl: Add support for reading CXL switch CDAT table
>       acpi: Move common tables helper functions to common lib
>       lib/firmware_table: tables: Add CDAT table parsing support
>       base/node / acpi: Change 'node_hmem_attrs' to 'access_coordinates'
>       acpi: numa: Create enum for memory_target access coordinates indexing
>       acpi: numa: Add genport target allocation to the HMAT parsing
>       acpi: Break out nesting for hmat_parse_locality()
>       acpi: numa: Add setting of generic port system locality attributes
>       acpi: numa: Add helper function to retrieve the performance attributes
>       cxl: Add callback to parse the DSMAS subtables from CDAT
>       cxl: Add callback to parse the DSLBIS subtable from CDAT
>       cxl: Add callback to parse the SSLBIS subtable from CDAT
>       cxl: Add support for _DSM Function for retrieving QTG ID
>       cxl: Calculate and store PCI link latency for the downstream ports
>       cxl: Store the access coordinates for the generic ports
>       cxl: Add helper function that calculate performance data for downstream ports
>       cxl: Compute the entire CXL path latency and bandwidth data
>       cxl: Store QTG IDs and related info to the CXL memory device context
>       cxl: Export sysfs attributes for memory device QoS class
>       cxl: Check qos_class validity on memdev probe
> 
> 
>  Documentation/ABI/testing/sysfs-bus-cxl |  49 +++++
>  MAINTAINERS                             |   2 +
>  drivers/acpi/Kconfig                    |   1 +
>  drivers/acpi/numa/hmat.c                | 172 +++++++++++++---
>  drivers/acpi/tables.c                   | 178 +----------------
>  drivers/base/node.c                     |  12 +-
>  drivers/cxl/Kconfig                     |   1 +
>  drivers/cxl/acpi.c                      | 151 +++++++++++++-
>  drivers/cxl/core/Makefile               |   1 +
>  drivers/cxl/core/cdat.c                 | 249 ++++++++++++++++++++++++
>  drivers/cxl/core/mbox.c                 |   1 +
>  drivers/cxl/core/memdev.c               |  34 ++++
>  drivers/cxl/core/pci.c                  | 125 ++++++++++--
>  drivers/cxl/core/port.c                 | 124 +++++++++++-
>  drivers/cxl/cxl.h                       |  74 +++++++
>  drivers/cxl/cxlmem.h                    |  23 +++
>  drivers/cxl/cxlpci.h                    |  15 ++
>  drivers/cxl/mem.c                       |  68 +++++++
>  drivers/cxl/port.c                      | 109 +++++++++++
>  include/linux/acpi.h                    |  54 +++--
>  include/linux/fw_table.h                |  52 +++++
>  include/linux/node.h                    |   8 +-
>  lib/Kconfig                             |   3 +
>  lib/Makefile                            |   2 +
>  lib/fw_table.c                          | 237 ++++++++++++++++++++++
>  25 files changed, 1478 insertions(+), 267 deletions(-)
>  create mode 100644 drivers/cxl/core/cdat.c
>  create mode 100644 include/linux/fw_table.h
>  create mode 100644 lib/fw_table.c
> 
> --
> 
> 


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v10 15/22] cxl: Add support for _DSM Function for retrieving QTG ID
  2023-10-11  1:06 ` [PATCH v10 15/22] cxl: Add support for _DSM Function for retrieving QTG ID Dave Jiang
@ 2023-10-11 13:10   ` Jonathan Cameron
  2023-10-11 15:37     ` Dave Jiang
  0 siblings, 1 reply; 36+ messages in thread
From: Jonathan Cameron @ 2023-10-11 13:10 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, dave

On Tue, 10 Oct 2023 18:06:28 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> CXL spec v3.0 9.17.3 CXL Root Device Specific Methods (_DSM)
> 
> Add support to retrieve QTG ID via ACPI _DSM call. The _DSM call requires
> an input of an ACPI package with 4 dwords (read latency, write latency,
> read bandwidth, write bandwidth). The call returns a package with 1 WORD
> that provides the max supported QTG ID and a package that may contain 0 or
> more WORDs as the recommended QTG IDs in the recommended order.
> 
> Create a cxl_root container for the root cxl_port and provide a callback
> ->get_qos_class() in order to retrieve the QoS class. For the ACPI case,  
> the _DSM helper is used to retrieve the QTG ID and returned. A
> devm_cxl_add_root() function is added for root port setup and registration
> of the cxl_root callback operation(s).
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>

A few comments inline. Main one that needs fixing is out dated docs.

Jonathan

> ---
> v10:
> - Remove allocation in _DSM handler. Change input parameter to request
>   number of ids to return.
> - Removed Jonathan's review tag due to significant changes
> v9:
> - Fix input to 4 ints instead of using buffers. (Dan)
> - Remove all endien conversion inside acpi handling code. (Dan)
> v8:
> - Change DSM return package parsing to use integers.
> v7:
> - Fix stray lines in commit log. (Jonathan)
> v5:
> - Make the helper a callback for the CXL root. (Dan)
> - Drop the addition of core/acpi.c. (Dan)
> - Add endiness handling. (Jonathan)
> - Refactor error exits. (Jonathan)
> - Update evaluate function description. (Jonathan)
> - Make uuid static. (Dan)
> v2:
> - Reorder var declaration and use C99 style. (Jonathan)
> - Allow >2 ACPI objects in package for future expansion. (Jonathan)
> - Check QTG IDs against MAX QTG ID provided by output package. (Jonathan)
> ---
>  drivers/cxl/acpi.c      |  131 ++++++++++++++++++++++++++++++++++++++++++++++-
>  drivers/cxl/core/port.c |   41 +++++++++++++--
>  drivers/cxl/cxl.h       |   25 +++++++++
>  3 files changed, 189 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index 2034eb4ce83f..a84fef73f8ce 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c


> /**
> + * cxl_acpi_evaluate_qtg_dsm - Retrieve QTG ids via ACPI _DSM
> + * @handle: ACPI handle
> + * @coord: performance access coordinates
> + * @entries: number of QTG IDs to return
> + * @qos_class: int array provided by caller to return QTG IDs
> + *
> + * Return: number of QTG IDs returned, or -errno for errors
> + *
> + * Issue QTG _DSM with accompanied bandwidth and latency data in order to get
> + * the QTG IDs that are suitable for the performance point in order of most
> + * suitable to least suitable. Return first QTG ID.

Return count of QTG IDs I think..

> + */
> +static int
> +cxl_acpi_evaluate_qtg_dsm(acpi_handle handle, struct access_coordinate *coord,
> +			  int entries, int *qos_class)
> +{
> +	union acpi_object *out_obj, *out_buf, *pkg;
> +	union acpi_object in_array[4] = {
> +		[0].integer = { ACPI_TYPE_INTEGER, coord->read_latency },
> +		[1].integer = { ACPI_TYPE_INTEGER, coord->write_latency },
> +		[2].integer = { ACPI_TYPE_INTEGER, coord->read_bandwidth },
> +		[3].integer = { ACPI_TYPE_INTEGER, coord->write_bandwidth },
> +	};
> +	union acpi_object in_obj = {
> +		.package = {
> +			.type = ACPI_TYPE_PACKAGE,
> +			.count = 4,
> +			.elements = in_array,
> +		},
> +	};
> +	int count, pkg_entries, i;
> +	u16 max_qtg;
> +	int rc = 0;

Trivial but I think it's always set below so doesn't need init here.


> +
> +	if (!entries)
> +		return -EINVAL;
> +
> +	out_obj = acpi_evaluate_dsm(handle, &acpi_cxl_qtg_id_guid, 1, 1, &in_obj);
> +	if (!out_obj)
> +		return -ENXIO;
> +
> +	if (out_obj->type != ACPI_TYPE_PACKAGE) {
> +		rc = -ENXIO;
> +		goto out;
> +	}
> +
> +	/* Check Max QTG ID */
> +	pkg = &out_obj->package.elements[0];
> +	if (pkg->type != ACPI_TYPE_INTEGER) {
> +		rc = -ENXIO;
> +		goto out;
> +	}
> +
> +	max_qtg = pkg->integer.value;
> +
> +	/* It's legal to have 0 QTG entries */

If that's the case, why not return 0 and make it the callers problem to deal
with this. Seems odd to have the retrieval function return an error for
something the spec says is fine.

> +	pkg_entries = out_obj->package.count;
> +	if (pkg_entries <= 1) {
> +		rc = -EEXIST;
> +		goto out;
> +	}
> +
> +	/* Retrieve QTG IDs package */
> +	pkg = &out_obj->package.elements[1];
> +	if (pkg->type != ACPI_TYPE_PACKAGE) {
> +		rc = -ENXIO;
> +		goto out;
> +	}
> +
> +	pkg_entries = pkg->package.count;
> +	count = min(entries, pkg_entries);
> +	for (i = 0; i < count; i++) {
> +		u16 qtg_id;
> +
> +		out_buf = &pkg->package.elements[i];
> +		if (out_buf->type != ACPI_TYPE_INTEGER) {
> +			rc = -ENXIO;
> +			goto out;
> +		}
> +
> +		qtg_id = out_buf->integer.value;
> +		if (qtg_id > max_qtg)
> +			pr_warn("QTG ID %u greater than MAX %u\n",
> +				qtg_id, max_qtg);
> +
> +		qos_class[i] = qtg_id;
> +	}
> +	rc = count;
> +
> +out:
> +	ACPI_FREE(out_obj);
> +	return rc;
> +}
> +


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v10 20/22] cxl: Store QTG IDs and related info to the CXL memory device context
  2023-10-11  1:06 ` [PATCH v10 20/22] cxl: Store QTG IDs and related info to the CXL memory device context Dave Jiang
@ 2023-10-11 13:19   ` Jonathan Cameron
  2023-10-11 16:04     ` Dave Jiang
  0 siblings, 1 reply; 36+ messages in thread
From: Jonathan Cameron @ 2023-10-11 13:19 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, dave

On Tue, 10 Oct 2023 18:06:59 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> Once the QTG ID _DSM is executed successfully, the QTG ID is retrieved from
> the return package. Create a list of entries in the cxl_memdev context and
> store the QTG ID as qos_class token and the associated DPA range. This
> information can be exposed to user space via sysfs in order to help region
> setup for hot-plugged CXL memory devices.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> 
> ---
> v10:
> - Store single qos_class value. (Dan)

I'm fine with doing this, but I'd like a print if more than one is
provided by the DSMAS (e.g. multiple DSMAS entries for a given region)
Mostly so we have something to let us know if anyone is shipping such a
device.

Otherwise a couple of minor comments inline.

Jonathan


> - Rename cxl_memdev_set_qtg() to cxl_memdev_set_qos_class()
> - Removed Jonathan's review tag due to code changes.

> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
> index 99a619360bc5..049a16b7eb1f 100644
> --- a/drivers/cxl/port.c
> +++ b/drivers/cxl/port.c
> @@ -105,6 +105,40 @@ static int cxl_port_perf_data_calculate(struct cxl_port *port,
>  	return 0;
>  }
>  
> +static void cxl_memdev_set_qos_class(struct cxl_dev_state *cxlds,
> +				     struct list_head *dsmas_list)
> +{
> +	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
> +	struct range pmem_range = {
> +		.start = cxlds->pmem_res.start,
> +		.end = cxlds->pmem_res.end,
> +	};
> +	struct range ram_range = {
> +		.start = cxlds->ram_res.start,
> +		.end = cxlds->ram_res.end,
> +	};
> +	struct perf_prop_entry *perf;
> +	struct dsmas_entry *dent;
> +
> +	list_for_each_entry(dent, dsmas_list, list) {
> +		perf = devm_kzalloc(cxlds->dev, sizeof(*perf), GFP_KERNEL);
> +		if (!perf)
> +			return;
> +
> +		perf->dpa_range = dent->dpa_range;
> +		perf->coord = dent->coord;
> +		perf->qos_class = dent->qos_class;
> +		list_add_tail(&perf->list, &mds->perf_list);
> +
> +		if (resource_size(&cxlds->ram_res) &&
> +		    range_contains(&ram_range, &dent->dpa_range))
> +			mds->ram_qos_class = perf->qos_class;

So this assumes one DSMAS per memory type.
Not an unreasonable starting place, but I think this should 
print something to the log if it does see more than one.

> +		else if (resource_size(&cxlds->pmem_res) &&
> +			 range_contains(&pmem_range, &dent->dpa_range))
> +			mds->pmem_qos_class = perf->qos_class;
> +	}
> +}
> +
>  static int cxl_switch_port_probe(struct cxl_port *port)
>  {
>  	struct cxl_hdm *cxlhdm;
> @@ -201,6 +235,8 @@ static int cxl_endpoint_port_probe(struct cxl_port *port)
>  			if (rc)
>  				dev_dbg(&port->dev,
>  					"Failed to do perf coord calculations.\n");
> +			else
> +				cxl_memdev_set_qos_class(cxlds, &dsmas_list);

This is getting a bit deeply nested.  Perhaps a follow up patch to factor
it out makes sense so we can use a goto to clean up the dsmas_list without
that label being nested as well.

>  		}
>  
>  		cxl_cdat_dsmas_list_destroy(&dsmas_list);
> 
> 


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v10 21/22] cxl: Export sysfs attributes for memory device QoS class
  2023-10-11  1:07 ` [PATCH v10 21/22] cxl: Export sysfs attributes for memory device QoS class Dave Jiang
@ 2023-10-11 13:26   ` Jonathan Cameron
  2023-10-11 21:43     ` Dave Jiang
  0 siblings, 1 reply; 36+ messages in thread
From: Jonathan Cameron @ 2023-10-11 13:26 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, Dan Williams, ira.weiny, vishal.l.verma,
	alison.schofield, dave

On Tue, 10 Oct 2023 18:07:06 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> Export qos_class sysfs attributes for the CXL memory device. The QoS class
> should show up as /sys/bus/cxl/devices/memX/ram/qos_class0 for the volatile
> partition and /sys/bus/cxl/devices/memX/pmem/qos_class0 for the persistent
> partition. The QTG ID is retrieved via _DSM after supplying the
> calculated bandwidth and latency for the entire CXL path from device to
> the CPU. This ID is used to match up to the root decoder QoS class to
> determine which CFMWS the memory range of a hotplugged CXL mem device
> should be assigned under.
> 
> While there may be multiple DSMAS exported by the device CDAT, the driver
> will only expose the first QTG ID per partition in sysfs for now. In the
> future when multiple QTG IDs are necessary, they can be exposed. [1]

I'm not sure this will extend cleanly if we get a two dimensional set to describe
1) Multiple DSMAS entries for RAM (so multiple inputs to pass to the _DSM)
   One nice thing here might be to ensure we have the first one seen.
   So if in future we do need to extend it, this corresponds to the 0th one
   described.
2) Want to describe less ideal QTG values from _DSM


Maybe it's too early to come to any conclusion and the single 0 is enough.
The cynic in me suggests we call it qos_class0_0 though, to give us the space.
If we need DSMAS ranges, then we describe those using the first index,
and the second is the priority index if we have multiple answers from _DSM.
For now it's always 0_0


Jonathan

> 
> [1]: https://lore.kernel.org/linux-cxl/167571650007.587790.10040913293130712882.stgit@djiang5-mobl3.local/T/#md2a47b1ead3e1ba08f50eab29a4af1aed1d215ab
> 
> Suggested-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> 
> ---
> v10:
> - Export only qos_class0, the first entry. Additional qos_class entries can be
>   exported later as needed. (Dan)
> - Have the sysfs attrib return -ENOENT unless driver is attached. (Dan)
> - Removed Jonathan's review tag due to code changes.
> ---
>  Documentation/ABI/testing/sysfs-bus-cxl |   34 +++++++++++++++++++++++++++++++
>  drivers/cxl/core/memdev.c               |   34 +++++++++++++++++++++++++++++++
>  2 files changed, 68 insertions(+)
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
> index 44ffbbb36654..dd613f5987b5 100644
> --- a/Documentation/ABI/testing/sysfs-bus-cxl
> +++ b/Documentation/ABI/testing/sysfs-bus-cxl
> @@ -28,6 +28,23 @@ Description:
>  		Payload in the CXL-2.0 specification.
>  
>  
> +What:		/sys/bus/cxl/devices/memX/ram/qos_class0
> +Date:		May, 2023
> +KernelVersion:	v6.7
> +Contact:	linux-cxl@vger.kernel.org
> +Description:
> +		(RO) For CXL host platforms that support "QoS Telemetry"
> +		this attribute conveys a comma delimited list of platform
> +		specific cookies that identifies a QoS performance class
> +		for the volatile partition of the CXL mem device. These
> +		class-ids can be compared against a similar "qos_class"
> +		published for a root decoder. While it is not required
> +		that the endpoints map their local memory-class to a
> +		matching platform class, mismatches are not recommended
> +		and there are platform specific performance related
> +		side-effects that may result. First class-id is displayed.
> +
> +
>  What:		/sys/bus/cxl/devices/memX/pmem/size
>  Date:		December, 2020
>  KernelVersion:	v5.12
> @@ -38,6 +55,23 @@ Description:
>  		Payload in the CXL-2.0 specification.
>  
>  
> +What:		/sys/bus/cxl/devices/memX/pmem/qos_class0
> +Date:		May, 2023
> +KernelVersion:	v6.7
> +Contact:	linux-cxl@vger.kernel.org
> +Description:
> +		(RO) For CXL host platforms that support "QoS Telemetry"
> +		this attribute conveys a comma delimited list of platform
> +		specific cookies that identifies a QoS performance class
> +		for the persistent partition of the CXL mem device. These
> +		class-ids can be compared against a similar "qos_class"
> +		published for a root decoder. While it is not required
> +		that the endpoints map their local memory-class to a
> +		matching platform class, mismatches are not recommended
> +		and there are platform specific performance related
> +		side-effects that may result. First class-id is displayed.
> +
> +



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v10 22/22] cxl: Check qos_class validity on memdev probe
  2023-10-11  1:07 ` [PATCH v10 22/22] cxl: Check qos_class validity on memdev probe Dave Jiang
@ 2023-10-11 13:29   ` Jonathan Cameron
  2023-10-11 16:28     ` Dave Jiang
  0 siblings, 1 reply; 36+ messages in thread
From: Jonathan Cameron @ 2023-10-11 13:29 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, dave

On Tue, 10 Oct 2023 18:07:11 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> Add a check to make sure the qos_class for the device will match one of
> the root decoders' qos_class. If no match is found, then the qos_class for
> the device is set to invalid.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> ---
>  drivers/cxl/mem.c |   68 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 68 insertions(+)
> 
> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
> index 317c7548e4e9..3495119d2edf 100644
> --- a/drivers/cxl/mem.c
> +++ b/drivers/cxl/mem.c
> @@ -104,6 +104,70 @@ static int cxl_debugfs_poison_clear(void *data, u64 dpa)
>  DEFINE_DEBUGFS_ATTRIBUTE(cxl_poison_clear_fops, NULL,
>  			 cxl_debugfs_poison_clear, "%llx\n");
>  
> +struct qos_class_ctx {
> +	bool matched;
> +	int dev_qos_class;
> +};
> +
> +static int match_cxlrd_qos_class(struct device *dev, void *data)
> +{
> +	struct qos_class_ctx *ctx = data;
> +	struct cxl_root_decoder *cxlrd;
> +
> +	if (ctx->matched)
> +		return 0;
> +
> +	if (!is_root_decoder(dev))
> +		return 0;
> +
> +	cxlrd = to_cxl_root_decoder(dev);
> +	if (cxlrd->qos_class == CXL_QOS_CLASS_INVALID ||
> +	    ctx->dev_qos_class == CXL_QOS_CLASS_INVALID)
> +		return 0;
> +
> +	if (cxlrd->qos_class == ctx->dev_qos_class)
> +		ctx->matched = 1;
If matched, why not terminate the bus_for_each_dev()
That is return 1 and amend check to be if (rc < 0)
Not that this ever returns < 0 anyway.  It might in
future though so that test makes sense as defensive measure.


> +
> +	return 0;
> +}
> +
> +static int cxl_qos_class_verify(struct cxl_memdev *cxlmd)
> +{
> +	struct device *dev = &cxlmd->dev;
> +	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> +	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
> +	struct qos_class_ctx ctx;
> +	int rc;
> +
> +	if (mds->ram_qos_class != CXL_QOS_CLASS_INVALID) {
> +		ctx.matched = false;
> +		ctx.dev_qos_class =  mds->ram_qos_class;
> +		rc = bus_for_each_dev(dev->bus, NULL, &ctx, match_cxlrd_qos_class);
> +		if (rc)
> +			return rc;
> +
> +		if (ctx.matched)
> +			return 0;
Early return doesn't make sense to me given we've not checked the pmem one yet.
> +
> +		mds->ram_qos_class = CXL_QOS_CLASS_INVALID;
> +	}
> +
> +	if (mds->pmem_qos_class != CXL_QOS_CLASS_INVALID) {
> +		ctx.matched = false;
> +		ctx.dev_qos_class = mds->pmem_qos_class;
> +		rc = bus_for_each_dev(dev->bus, NULL, &ctx, match_cxlrd_qos_class);
> +		if (rc)
> +			return rc;
> +
> +		if (ctx.matched)
> +			return 0;
> +
> +		mds->ram_qos_class = CXL_QOS_CLASS_INVALID;

pmem_qos_class?

> +	}
> +
> +	return 0;
> +}
> +
>  static int cxl_mem_probe(struct device *dev)
>  {
>  	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> @@ -181,6 +245,10 @@ static int cxl_mem_probe(struct device *dev)
>  			return rc;
>  	}
>  
> +	rc = cxl_qos_class_verify(cxlmd);
> +	if (rc)
> +		return rc;
> +
>  	/*
>  	 * The kernel may be operating out of CXL memory on this device,
>  	 * there is no spec defined way to determine whether this device
> 
> 


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v10 15/22] cxl: Add support for _DSM Function for retrieving QTG ID
  2023-10-11 13:10   ` Jonathan Cameron
@ 2023-10-11 15:37     ` Dave Jiang
  0 siblings, 0 replies; 36+ messages in thread
From: Dave Jiang @ 2023-10-11 15:37 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, dave



On 10/11/23 06:10, Jonathan Cameron wrote:
> On Tue, 10 Oct 2023 18:06:28 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
> 
>> CXL spec v3.0 9.17.3 CXL Root Device Specific Methods (_DSM)
>>
>> Add support to retrieve QTG ID via ACPI _DSM call. The _DSM call requires
>> an input of an ACPI package with 4 dwords (read latency, write latency,
>> read bandwidth, write bandwidth). The call returns a package with 1 WORD
>> that provides the max supported QTG ID and a package that may contain 0 or
>> more WORDs as the recommended QTG IDs in the recommended order.
>>
>> Create a cxl_root container for the root cxl_port and provide a callback
>> ->get_qos_class() in order to retrieve the QoS class. For the ACPI case,  
>> the _DSM helper is used to retrieve the QTG ID and returned. A
>> devm_cxl_add_root() function is added for root port setup and registration
>> of the cxl_root callback operation(s).
>>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> 
> A few comments inline. Main one that needs fixing is out dated docs.
> 
> Jonathan
> 
>> ---
>> v10:
>> - Remove allocation in _DSM handler. Change input parameter to request
>>   number of ids to return.
>> - Removed Jonathan's review tag due to significant changes
>> v9:
>> - Fix input to 4 ints instead of using buffers. (Dan)
>> - Remove all endian conversion inside acpi handling code. (Dan)
>> v8:
>> - Change DSM return package parsing to use integers.
>> v7:
>> - Fix stray lines in commit log. (Jonathan)
>> v5:
>> - Make the helper a callback for the CXL root. (Dan)
>> - Drop the addition of core/acpi.c. (Dan)
>> - Add endianness handling. (Jonathan)
>> - Refactor error exits. (Jonathan)
>> - Update evaluate function description. (Jonathan)
>> - Make uuid static. (Dan)
>> v2:
>> - Reorder var declaration and use C99 style. (Jonathan)
>> - Allow >2 ACPI objects in package for future expansion. (Jonathan)
>> - Check QTG IDs against MAX QTG ID provided by output package. (Jonathan)
>> ---
>>  drivers/cxl/acpi.c      |  131 ++++++++++++++++++++++++++++++++++++++++++++++-
>>  drivers/cxl/core/port.c |   41 +++++++++++++--
>>  drivers/cxl/cxl.h       |   25 +++++++++
>>  3 files changed, 189 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
>> index 2034eb4ce83f..a84fef73f8ce 100644
>> --- a/drivers/cxl/acpi.c
>> +++ b/drivers/cxl/acpi.c
> 
> 
>> /**
>> + * cxl_acpi_evaluate_qtg_dsm - Retrieve QTG ids via ACPI _DSM
>> + * @handle: ACPI handle
>> + * @coord: performance access coordinates
>> + * @entries: number of QTG IDs to return
>> + * @qos_class: int array provided by caller to return QTG IDs
>> + *
>> + * Return: number of QTG IDs returned, or -errno for errors
>> + *
>> + * Issue QTG _DSM with accompanied bandwidth and latency data in order to get
>> + * the QTG IDs that are suitable for the performance point in order of most
>> + * suitable to least suitable. Return first QTG ID.
> 
> Return count of QTG IDs I think..

I'll fix that
> 
>> + */
>> +static int
>> +cxl_acpi_evaluate_qtg_dsm(acpi_handle handle, struct access_coordinate *coord,
>> +			  int entries, int *qos_class)
>> +{
>> +	union acpi_object *out_obj, *out_buf, *pkg;
>> +	union acpi_object in_array[4] = {
>> +		[0].integer = { ACPI_TYPE_INTEGER, coord->read_latency },
>> +		[1].integer = { ACPI_TYPE_INTEGER, coord->write_latency },
>> +		[2].integer = { ACPI_TYPE_INTEGER, coord->read_bandwidth },
>> +		[3].integer = { ACPI_TYPE_INTEGER, coord->write_bandwidth },
>> +	};
>> +	union acpi_object in_obj = {
>> +		.package = {
>> +			.type = ACPI_TYPE_PACKAGE,
>> +			.count = 4,
>> +			.elements = in_array,
>> +		},
>> +	};
>> +	int count, pkg_entries, i;
>> +	u16 max_qtg;
>> +	int rc = 0;
> 
> Trivial but I think it's always set below so doesn't need init here.

Will remove.
> 
> 
>> +
>> +	if (!entries)
>> +		return -EINVAL;
>> +
>> +	out_obj = acpi_evaluate_dsm(handle, &acpi_cxl_qtg_id_guid, 1, 1, &in_obj);
>> +	if (!out_obj)
>> +		return -ENXIO;
>> +
>> +	if (out_obj->type != ACPI_TYPE_PACKAGE) {
>> +		rc = -ENXIO;
>> +		goto out;
>> +	}
>> +
>> +	/* Check Max QTG ID */
>> +	pkg = &out_obj->package.elements[0];
>> +	if (pkg->type != ACPI_TYPE_INTEGER) {
>> +		rc = -ENXIO;
>> +		goto out;
>> +	}
>> +
>> +	max_qtg = pkg->integer.value;
>> +
>> +	/* It's legal to have 0 QTG entries */
> 
> If that's the case, why not return 0 and make it the callers problem to deal
> with this. Seems odd to have the retrieval function return an error for
> something the spec says is fine.
> 

Will return 0.

>> +	pkg_entries = out_obj->package.count;
>> +	if (pkg_entries <= 1) {
>> +		rc = -EEXIST;
>> +		goto out;
>> +	}
>> +
>> +	/* Retrieve QTG IDs package */
>> +	pkg = &out_obj->package.elements[1];
>> +	if (pkg->type != ACPI_TYPE_PACKAGE) {
>> +		rc = -ENXIO;
>> +		goto out;
>> +	}
>> +
>> +	pkg_entries = pkg->package.count;
>> +	count = min(entries, pkg_entries);
>> +	for (i = 0; i < count; i++) {
>> +		u16 qtg_id;
>> +
>> +		out_buf = &pkg->package.elements[i];
>> +		if (out_buf->type != ACPI_TYPE_INTEGER) {
>> +			rc = -ENXIO;
>> +			goto out;
>> +		}
>> +
>> +		qtg_id = out_buf->integer.value;
>> +		if (qtg_id > max_qtg)
>> +			pr_warn("QTG ID %u greater than MAX %u\n",
>> +				qtg_id, max_qtg);
>> +
>> +		qos_class[i] = qtg_id;
>> +	}
>> +	rc = count;
>> +
>> +out:
>> +	ACPI_FREE(out_obj);
>> +	return rc;
>> +}
>> +
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v10 20/22] cxl: Store QTG IDs and related info to the CXL memory device context
  2023-10-11 13:19   ` Jonathan Cameron
@ 2023-10-11 16:04     ` Dave Jiang
  0 siblings, 0 replies; 36+ messages in thread
From: Dave Jiang @ 2023-10-11 16:04 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, dave



On 10/11/23 06:19, Jonathan Cameron wrote:
> On Tue, 10 Oct 2023 18:06:59 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
> 
>> Once the QTG ID _DSM is executed successfully, the QTG ID is retrieved from
>> the return package. Create a list of entries in the cxl_memdev context and
>> store the QTG ID as qos_class token and the associated DPA range. This
>> information can be exposed to user space via sysfs in order to help region
>> setup for hot-plugged CXL memory devices.
>>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>>
>> ---
>> v10:
>> - Store single qos_class value. (Dan)
> 
> I'm fine with doing this, but I'd like a print if more than one is
> provided by the DSMAS (e.g. multiple DSMAS entries for a given region)
> Mostly so we have something to let us know if anyone is shipping such a
> device.
> 
> Otherwise a couple of minor comments inline.
> 
> Jonathan
> 
> 
>> - Rename cxl_memdev_set_qtg() to cxl_memdev_set_qos_class()
>> - Removed Jonathan's review tag due to code changes.
> 
>> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
>> index 99a619360bc5..049a16b7eb1f 100644
>> --- a/drivers/cxl/port.c
>> +++ b/drivers/cxl/port.c
>> @@ -105,6 +105,40 @@ static int cxl_port_perf_data_calculate(struct cxl_port *port,
>>  	return 0;
>>  }
>>  
>> +static void cxl_memdev_set_qos_class(struct cxl_dev_state *cxlds,
>> +				     struct list_head *dsmas_list)
>> +{
>> +	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
>> +	struct range pmem_range = {
>> +		.start = cxlds->pmem_res.start,
>> +		.end = cxlds->pmem_res.end,
>> +	};
>> +	struct range ram_range = {
>> +		.start = cxlds->ram_res.start,
>> +		.end = cxlds->ram_res.end,
>> +	};
>> +	struct perf_prop_entry *perf;
>> +	struct dsmas_entry *dent;
>> +
>> +	list_for_each_entry(dent, dsmas_list, list) {
>> +		perf = devm_kzalloc(cxlds->dev, sizeof(*perf), GFP_KERNEL);
>> +		if (!perf)
>> +			return;
>> +
>> +		perf->dpa_range = dent->dpa_range;
>> +		perf->coord = dent->coord;
>> +		perf->qos_class = dent->qos_class;
>> +		list_add_tail(&perf->list, &mds->perf_list);
>> +
>> +		if (resource_size(&cxlds->ram_res) &&
>> +		    range_contains(&ram_range, &dent->dpa_range))
>> +			mds->ram_qos_class = perf->qos_class;
> 
> So this assumes one DSMAS per memory type.
> Not an unreasonable starting place, but I think this should 
> print something to the log if it does see more than one.

I'll add that. Also I seem to have dropped the check from before for multiple entries so we aren't clobbering the previous set value.

> 
>> +		else if (resource_size(&cxlds->pmem_res) &&
>> +			 range_contains(&pmem_range, &dent->dpa_range))
>> +			mds->pmem_qos_class = perf->qos_class;
>> +	}
>> +}
>> +
>>  static int cxl_switch_port_probe(struct cxl_port *port)
>>  {
>>  	struct cxl_hdm *cxlhdm;
>> @@ -201,6 +235,8 @@ static int cxl_endpoint_port_probe(struct cxl_port *port)
>>  			if (rc)
>>  				dev_dbg(&port->dev,
>>  					"Failed to do perf coord calculations.\n");
>> +			else
>> +				cxl_memdev_set_qos_class(cxlds, &dsmas_list);
> 
> This is getting a bit deeply nested.  Perhaps a follow up patch to factor
> it out makes sense so we can use a goto to clean up the dsmas_list without
> that label being nested as well.

I'll just fix it in place. It doesn't look too bad.

> 
>>  		}
>>  
>>  		cxl_cdat_dsmas_list_destroy(&dsmas_list);
>>
>>
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v10 22/22] cxl: Check qos_class validity on memdev probe
  2023-10-11 13:29   ` Jonathan Cameron
@ 2023-10-11 16:28     ` Dave Jiang
  0 siblings, 0 replies; 36+ messages in thread
From: Dave Jiang @ 2023-10-11 16:28 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, dave



On 10/11/23 06:29, Jonathan Cameron wrote:
> On Tue, 10 Oct 2023 18:07:11 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
> 
>> Add a check to make sure the qos_class for the device will match one of
>> the root decoders' qos_class. If no match is found, then the qos_class for
>> the device is set to invalid.
>>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>> ---
>>  drivers/cxl/mem.c |   68 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 68 insertions(+)
>>
>> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
>> index 317c7548e4e9..3495119d2edf 100644
>> --- a/drivers/cxl/mem.c
>> +++ b/drivers/cxl/mem.c
>> @@ -104,6 +104,70 @@ static int cxl_debugfs_poison_clear(void *data, u64 dpa)
>>  DEFINE_DEBUGFS_ATTRIBUTE(cxl_poison_clear_fops, NULL,
>>  			 cxl_debugfs_poison_clear, "%llx\n");
>>  
>> +struct qos_class_ctx {
>> +	bool matched;
>> +	int dev_qos_class;
>> +};
>> +
>> +static int match_cxlrd_qos_class(struct device *dev, void *data)
>> +{
>> +	struct qos_class_ctx *ctx = data;
>> +	struct cxl_root_decoder *cxlrd;
>> +
>> +	if (ctx->matched)
>> +		return 0;
>> +
>> +	if (!is_root_decoder(dev))
>> +		return 0;
>> +
>> +	cxlrd = to_cxl_root_decoder(dev);
>> +	if (cxlrd->qos_class == CXL_QOS_CLASS_INVALID ||
>> +	    ctx->dev_qos_class == CXL_QOS_CLASS_INVALID)
>> +		return 0;
>> +
>> +	if (cxlrd->qos_class == ctx->dev_qos_class)
>> +		ctx->matched = 1;
> If matched, why not terminate the bus_for_each_dev()
> That is return 1 and amend check to be if (rc < 0)
> Not that this ever returns < 0 anyway.  It might in
> future though so that test makes sense as defensive measure.

Will update.

> 
> 
>> +
>> +	return 0;
>> +}
>> +
>> +static int cxl_qos_class_verify(struct cxl_memdev *cxlmd)
>> +{
>> +	struct device *dev = &cxlmd->dev;
>> +	struct cxl_dev_state *cxlds = cxlmd->cxlds;
>> +	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
>> +	struct qos_class_ctx ctx;
>> +	int rc;
>> +
>> +	if (mds->ram_qos_class != CXL_QOS_CLASS_INVALID) {
>> +		ctx.matched = false;
>> +		ctx.dev_qos_class =  mds->ram_qos_class;
>> +		rc = bus_for_each_dev(dev->bus, NULL, &ctx, match_cxlrd_qos_class);
>> +		if (rc)
>> +			return rc;
>> +
>> +		if (ctx.matched)
>> +			return 0;
> Early return doesn't make sense to me given we've not checked the pmem one yet.

You are right. Will fix.

>> +
>> +		mds->ram_qos_class = CXL_QOS_CLASS_INVALID;
>> +	}
>> +
>> +	if (mds->pmem_qos_class != CXL_QOS_CLASS_INVALID) {
>> +		ctx.matched = false;
>> +		ctx.dev_qos_class = mds->pmem_qos_class;
>> +		rc = bus_for_each_dev(dev->bus, NULL, &ctx, match_cxlrd_qos_class);
>> +		if (rc)
>> +			return rc;
>> +
>> +		if (ctx.matched)
>> +			return 0;
>> +
>> +		mds->ram_qos_class = CXL_QOS_CLASS_INVALID;
> 
> pmem_qos_class?

copy/paste error. will fix.

> 
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>>  static int cxl_mem_probe(struct device *dev)
>>  {
>>  	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
>> @@ -181,6 +245,10 @@ static int cxl_mem_probe(struct device *dev)
>>  			return rc;
>>  	}
>>  
>> +	rc = cxl_qos_class_verify(cxlmd);
>> +	if (rc)
>> +		return rc;
>> +
>>  	/*
>>  	 * The kernel may be operating out of CXL memory on this device,
>>  	 * there is no spec defined way to determine whether this device
>>
>>
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem
  2023-10-11 12:59 ` [PATCH v10 00/22] cxl: Add support for QTG ID retrieval for CXL subsystem Jonathan Cameron
@ 2023-10-11 16:31   ` Dave Jiang
  0 siblings, 0 replies; 36+ messages in thread
From: Dave Jiang @ 2023-10-11 16:31 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, Hanjun Guo, Len Brown, Ira Weiny, Dan Williams,
	Greg Kroah-Hartman, Rafael J. Wysocki, Davidlohr Bueso,
	Rafael J. Wysocki, vishal.l.verma, alison.schofield



On 10/11/23 05:59, Jonathan Cameron wrote:
> On Tue, 10 Oct 2023 18:04:56 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
> 
> Awkward to rename a patch series, but it feels like the QTG ID part may be the last
> thing added, while there is a whole bunch of stuff in between to enable it that
> isn't covered by the one-line description.

Yes... casualty of recombining everything minus things that are upstreamed already after having the series split. Just wanted to keep everything together so it's convenient for Dan's merging. Most of the changes should be the last few patches.
 
> 
> J
> 
>> v10:
>> - Remove memory allocation in DSM handler. Add input parameters to request number of ids
>>   to return. (Dan)
>> - Only export a single qos_class sysfs attribute, qos_class0 for devices. (Dan)
>> - Add sanity check of device qos_class against root decoders (Dan)
>> - Recombine all relevant patches for cxl upstream merge
>> - Rebased against v6.6-rc4
>>
>> v9:
>> - Correct DSM input package to use integers. (Dan)
>> - Remove all endian conversions. (Dan)
>>
>> v8:
>> - Correct DSM return package parsing to use integers
>>
>> v7:
>> - Minor changes. Please see specific patches for log entries addressing comments
>>   from v6.
>>
>> v6:
>> - Please see specific patches for log entries addressing comments from v5.
>> - Use CDAT sub-table structs as is, adjust w/o common header. Changing causes
>>   significant ACPICA code changes.
>> - Deal with table header now that is a union. (Jonathan)
>> - Move DSMAS list destroy to core/cdat.c. (Jonathan)
>> - Retain and display entire QTG ID list from _DSM. (Jonathan)
>>
>> v5:
>> - Please see specific patches for log entries addressing comments from v4.
>> - Split out ACPI and generic node code to send separately to respective maintainers
>> - Reworked to use ACPI tables code for CDAT parsing (Dan)
>> - Cache relevant perf data under dports (Dan)
>> - Add cxl root callback for QTG ID _DSM (Dan)
>> - Rename 'node_hmem_attr' to 'access_coordinate' (Dan)
>> - Change qtg_id sysfs attrib to qos_class (Dan)
>>
>> v4:
>> - Reworked PCIe link path latency calculation
>> - 0-day fixes
>> - Removed unused qos_list from cxl_memdev and its stray usages
>>
>> v3:
>> - Please see specific patches for log entries addressing comments from v2.
>> - Refactor cxl_port_probe() additions. (Alison)
>> - Convert to use 'struct node_hmem_attrs'
>> - Refactor to use common code for genport target allocation.
>> - Add third array entry for target hmem_attrs to store genport locality data.
>> - Go back to per partition QTG ID. (Dan)
>>
>> v2:
>> - Please see specific patches for log entries addressing comments from v1.
>> - Removed ACPICA code usages.
>> - Removed PCI subsystem helpers for latency and bandwidth.
>> - Add CXL switch CDAT parsing support (SSLBIS)
>> - Add generic port SRAT+HMAT support (ACPI)
>> - Export a single QTG ID via sysfs per memory device (Dan)
>> - Provide rest of DSMAS range info in debugfs (Dan)
>>
>> Hi Dan,
>> Please consider taking the entire series including the CXL bits for the next
>> convenient merge window. Thanks!
>>
>> This series adds the retrieval of QoS Throttling Group (QTG) IDs for the CXL Fixed
>> Memory Window Structure (CFMWS) and the CXL memory device. It exposes the QTG IDs
>> to user space to give some guidance on putting the proper DPA range under the
>> appropriate CFMWS window for a hot-plugged CXL memory device.
>>
>> The CFMWS structure contains a QTG ID that is associated with the memory window that the
>> structure exports. On Linux, the CFMWS is represented as a CXL root decoder. The QTG
>> ID will be attached to the CXL root decoder and exported as a sysfs attribute (qos_class).
>>
>> The QTG IDs for a device are retrieved by sending a _DSM method to the ACPI0017 device.
>> The _DSM expects an input package of 4 DWORDs that contains the read latency, write
>> latency, read bandwidth, and write bandwidth. These are the calculated numbers for the
>> path between the CXL device and the CPU. The list of QTG IDs are also exported as a sysfs
>> attribute under the mem device memory partition type:
>> /sys/bus/cxl/devices/memX/ram/qos_class
>> /sys/bus/cxl/devices/memX/pmem/qos_class
>> A mapping of DPA ranges and their correlated QTG IDs is found under
>> /sys/kernel/debug/cxl/memX/qtgmap. Each DSMAS from the device CDAT will provide a DPA
>> range.
>>
>> The latency numbers are the aggregated latencies for the path between the CXL device and
>> the CPU. If a CXL device is directly attached to the CXL HB, the latency
>> would be the aggregated latencies from the device Coherent Device Attribute Table (CDAT),
>> the calculated PCIe link latency between the device and the HB, and the generic port data
>> from ACPI SRAT+HMAT. The bandwidth in this configuration would be the minimum between the
>> CDAT bandwidth number, link bandwidth between the device and the HB, and the bandwidth data
>> from the generic port data via ACPI SRAT+HMAT.
>>
>> If a configuration has a switch in between, then the latency would be the aggregated
>> latencies from the device CDAT, the link latency between device and switch, the
>> latency from the switch CDAT, the link latency between switch and the HB, and the
>> generic port latency between the CPU and the CXL HB. The bandwidth calculation would be the
>> min of device CDAT bandwidth, link bandwidth between device and switch, switch CDAT
>> bandwidth, the link bandwidth between switch and HB, and the generic port bandwidth.
>>
>> There can be 0 or more switches between the CXL device and the CXL HB. There are detailed
>> examples on calculating bandwidth and latency in the CXL Memory Device Software Guide [4].
>>
>> The CDAT provides Device Scoped Memory Affinity Structures (DSMAS) that contain the
>> Device Physical Address (DPA) range and the related Device Scoped Latency and Bandwidth
>> Information Structures (DSLBIS). Each DSLBIS provides a latency or bandwidth entry that is
>> tied to a DSMAS entry via a per-DSMAS unique DSMAS handle.
>>
>> Test setup is done with runqemu genport support branch [6]. The setup provides 2 CXL HBs
>> with one HB having a CXL switch underneath. It also provides generic port support detailed
>> below.
>>
>> A hacked up qemu branch is used to support generic port SRAT and HMAT [7].
>>
>> To create the appropriate HMAT entries for generic port, the following QEMU parameters must
>> be added:
>>
>> -object genport,id=$X -numa node,genport=genport$X,nodeid=$Y,initiator=$Z
>> -numa hmat-lb,initiator=$Z,target=$X,hierarchy=memory,data-type=access-latency,latency=$latency
>> -numa hmat-lb,initiator=$Z,target=$X,hierarchy=memory,data-type=access-bandwidth,bandwidth=$bandwidthM
>> for ((i = 0; i < total_nodes; i++)); do
>> 	for ((j = 0; j < cxl_hbs; j++ )); do	# 2 CXL HBs
>> 		-numa dist,src=$i,dst=$X,val=$dist
>> 	done
>> done
>>
>> See the genport run_qemu branch for full details.
>>
>> [1]: https://www.computeexpresslink.org/download-the-specification
>> [2]: https://uefi.org/sites/default/files/resources/Coherent%20Device%20Attribute%20Table_1.01.pdf
>> [3]: https://uefi.org/sites/default/files/resources/ACPI_Spec_6_5_Aug29.pdf
>> [4]: https://cdrdv2-public.intel.com/643805/643805_CXL%20Memory%20Device%20SW%20Guide_Rev1p0.pdf
>> [5]: https://lore.kernel.org/linux-cxl/20230313195530.GA1532686@bhelgaas/T/#t
>> [6]: https://git.kernel.org/pub/scm/linux/kernel/git/djiang/linux.git/log/?h=cxl-qtg
>> [7]: https://github.com/pmem/run_qemu/tree/djiang/genport
>> [8]: https://github.com/davejiang/qemu/tree/genport
>>
>> ---
>>
>> base-commit: 178e1ea6a68f12967ee0e9afc4d79a2939acd43c
>>
>> ---
>> Dave Jiang (22):
>>       cxl: Export QTG ids from CFMWS to sysfs as qos_class attribute
>>       cxl: Add checksum verification to CDAT from CXL
>>       cxl: Add support for reading CXL switch CDAT table
>>       acpi: Move common tables helper functions to common lib
>>       lib/firmware_table: tables: Add CDAT table parsing support
>>       base/node / acpi: Change 'node_hmem_attrs' to 'access_coordinates'
>>       acpi: numa: Create enum for memory_target access coordinates indexing
>>       acpi: numa: Add genport target allocation to the HMAT parsing
>>       acpi: Break out nesting for hmat_parse_locality()
>>       acpi: numa: Add setting of generic port system locality attributes
>>       acpi: numa: Add helper function to retrieve the performance attributes
>>       cxl: Add callback to parse the DSMAS subtables from CDAT
>>       cxl: Add callback to parse the DSLBIS subtable from CDAT
>>       cxl: Add callback to parse the SSLBIS subtable from CDAT
>>       cxl: Add support for _DSM Function for retrieving QTG ID
>>       cxl: Calculate and store PCI link latency for the downstream ports
>>       cxl: Store the access coordinates for the generic ports
>>       cxl: Add helper function that calculate performance data for downstream ports
>>       cxl: Compute the entire CXL path latency and bandwidth data
>>       cxl: Store QTG IDs and related info to the CXL memory device context
>>       cxl: Export sysfs attributes for memory device QoS class
>>       cxl: Check qos_class validity on memdev probe
>>
>>
>>  Documentation/ABI/testing/sysfs-bus-cxl |  49 +++++
>>  MAINTAINERS                             |   2 +
>>  drivers/acpi/Kconfig                    |   1 +
>>  drivers/acpi/numa/hmat.c                | 172 +++++++++++++---
>>  drivers/acpi/tables.c                   | 178 +----------------
>>  drivers/base/node.c                     |  12 +-
>>  drivers/cxl/Kconfig                     |   1 +
>>  drivers/cxl/acpi.c                      | 151 +++++++++++++-
>>  drivers/cxl/core/Makefile               |   1 +
>>  drivers/cxl/core/cdat.c                 | 249 ++++++++++++++++++++++++
>>  drivers/cxl/core/mbox.c                 |   1 +
>>  drivers/cxl/core/memdev.c               |  34 ++++
>>  drivers/cxl/core/pci.c                  | 125 ++++++++++--
>>  drivers/cxl/core/port.c                 | 124 +++++++++++-
>>  drivers/cxl/cxl.h                       |  74 +++++++
>>  drivers/cxl/cxlmem.h                    |  23 +++
>>  drivers/cxl/cxlpci.h                    |  15 ++
>>  drivers/cxl/mem.c                       |  68 +++++++
>>  drivers/cxl/port.c                      | 109 +++++++++++
>>  include/linux/acpi.h                    |  54 +++--
>>  include/linux/fw_table.h                |  52 +++++
>>  include/linux/node.h                    |   8 +-
>>  lib/Kconfig                             |   3 +
>>  lib/Makefile                            |   2 +
>>  lib/fw_table.c                          | 237 ++++++++++++++++++++++
>>  25 files changed, 1478 insertions(+), 267 deletions(-)
>>  create mode 100644 drivers/cxl/core/cdat.c
>>  create mode 100644 include/linux/fw_table.h
>>  create mode 100644 lib/fw_table.c
>>
>> --
>>
>>
> 
> 


* Re: [PATCH v10 21/22] cxl: Export sysfs attributes for memory device QoS class
  2023-10-11 13:26   ` Jonathan Cameron
@ 2023-10-11 21:43     ` Dave Jiang
  2023-10-12 11:04       ` Jonathan Cameron
  0 siblings, 1 reply; 36+ messages in thread
From: Dave Jiang @ 2023-10-11 21:43 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, Dan Williams, ira.weiny, vishal.l.verma,
	alison.schofield, dave



On 10/11/23 06:26, Jonathan Cameron wrote:
> On Tue, 10 Oct 2023 18:07:06 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
> 
>> Export qos_class sysfs attributes for the CXL memory device. The QoS class
>> should show up as /sys/bus/cxl/devices/memX/ram/qos_class0 for the volatile
>> partition and /sys/bus/cxl/devices/memX/pmem/qos_class0 for the persistent
>> partition. The QTG ID is retrieved via _DSM after supplying the
>> calculated bandwidth and latency for the entire CXL path from device to
>> the CPU. This ID is used to match up to the root decoder QoS class to
>> determine which CFMWS the memory range of a hotplugged CXL mem device
>> should be assigned under.
>>
>> While there may be multiple DSMAS exported by the device CDAT, the driver
>> will only expose the first QTG ID per partition in sysfs for now. In the
>> future when multiple QTG IDs are necessary, they can be exposed. [1]
> 
> I'm not sure this will extend cleanly if we get a two-dimensional set to describe
> 1) Multiple DSMAS entries for RAM (so multiple inputs to pass to the _DSM)
>    One nice thing here might be to ensure we have the first one seen.
>    So if in future we do need to extend it, this corresponds to the 0th one
>    described.
> 2) Want to describe less ideal QTG values from _DSM 
> 
> 
> Maybe it's too early to come to any conclusion and the single 0 is enough.
> The cynic in me suggests we call it qos_class0_0, though, to give us the space.
> If we need DSMAS ranges, then we describe those using the first index,
> and the second is the priority index if we have multiple answers from _DSM.
> For now it's always 0_0

I talked to Dan and it seems he prefers the simplest form for the current version until we have a need to move towards something more complex. So qos_class0 -> qos_class. We can move to qos_classN_M when there is a need.


> 
> 
> Jonathan
> 
>>
>> [1]: https://lore.kernel.org/linux-cxl/167571650007.587790.10040913293130712882.stgit@djiang5-mobl3.local/T/#md2a47b1ead3e1ba08f50eab29a4af1aed1d215ab
>>
>> Suggested-by: Dan Williams <dan.j.williams@intel.com>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>>
>> ---
>> v10:
>> - Export only qos_class0, the first entry. Additional qos_class entries can be
>>   exported later as needed. (Dan)
>> - Have the sysfs attrib return -ENOENT unless driver is attached. (Dan)
>> - Removed Jonathan's review tag due to code changes.
>> ---
>>  Documentation/ABI/testing/sysfs-bus-cxl |   34 +++++++++++++++++++++++++++++++
>>  drivers/cxl/core/memdev.c               |   34 +++++++++++++++++++++++++++++++
>>  2 files changed, 68 insertions(+)
>>
>> diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
>> index 44ffbbb36654..dd613f5987b5 100644
>> --- a/Documentation/ABI/testing/sysfs-bus-cxl
>> +++ b/Documentation/ABI/testing/sysfs-bus-cxl
>> @@ -28,6 +28,23 @@ Description:
>>  		Payload in the CXL-2.0 specification.
>>  
>>  
>> +What:		/sys/bus/cxl/devices/memX/ram/qos_class0
>> +Date:		May, 2023
>> +KernelVersion:	v6.7
>> +Contact:	linux-cxl@vger.kernel.org
>> +Description:
>> +		(RO) For CXL host platforms that support "QoS Telemetry"
>> +		this attribute conveys a comma delimited list of platform
>> +		specific cookies that identify a QoS performance class
>> +		for the volatile partition of the CXL mem device. These
>> +		class-ids can be compared against a similar "qos_class"
>> +		published for a root decoder. While it is not required
>> +		that the endpoints map their local memory-class to a
>> +		matching platform class, mismatches are not recommended
>> +		and there are platform specific performance related
>> +		side-effects that may result. First class-id is displayed.
>> +
>> +
>>  What:		/sys/bus/cxl/devices/memX/pmem/size
>>  Date:		December, 2020
>>  KernelVersion:	v5.12
>> @@ -38,6 +55,23 @@ Description:
>>  		Payload in the CXL-2.0 specification.
>>  
>>  
>> +What:		/sys/bus/cxl/devices/memX/pmem/qos_class0
>> +Date:		May, 2023
>> +KernelVersion:	v6.7
>> +Contact:	linux-cxl@vger.kernel.org
>> +Description:
>> +		(RO) For CXL host platforms that support "QoS Telemetry"
>> +		this attribute conveys a comma delimited list of platform
>> +		specific cookies that identify a QoS performance class
>> +		for the persistent partition of the CXL mem device. These
>> +		class-ids can be compared against a similar "qos_class"
>> +		published for a root decoder. While it is not required
>> +		that the endpoints map their local memory-class to a
>> +		matching platform class, mismatches are not recommended
>> +		and there are platform specific performance related
>> +		side-effects that may result. First class-id is displayed.
>> +
>> +
> 
> 


* Re: [PATCH v10 21/22] cxl: Export sysfs attributes for memory device QoS class
  2023-10-11 21:43     ` Dave Jiang
@ 2023-10-12 11:04       ` Jonathan Cameron
  0 siblings, 0 replies; 36+ messages in thread
From: Jonathan Cameron @ 2023-10-12 11:04 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, Dan Williams, ira.weiny, vishal.l.verma,
	alison.schofield, dave

On Wed, 11 Oct 2023 14:43:14 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> On 10/11/23 06:26, Jonathan Cameron wrote:
> > On Tue, 10 Oct 2023 18:07:06 -0700
> > Dave Jiang <dave.jiang@intel.com> wrote:
> >   
> >> Export qos_class sysfs attributes for the CXL memory device. The QoS class
> >> should show up as /sys/bus/cxl/devices/memX/ram/qos_class0 for the volatile
> >> partition and /sys/bus/cxl/devices/memX/pmem/qos_class0 for the persistent
> >> partition. The QTG ID is retrieved via _DSM after supplying the
> >> calculated bandwidth and latency for the entire CXL path from device to
> >> the CPU. This ID is used to match up to the root decoder QoS class to
> >> determine which CFMWS the memory range of a hotplugged CXL mem device
> >> should be assigned under.
> >>
> >> While there may be multiple DSMAS exported by the device CDAT, the driver
> >> will only expose the first QTG ID per partition in sysfs for now. In the
> >> future when multiple QTG IDs are necessary, they can be exposed. [1]  
> > 
> > I'm not sure this will extend cleanly if we get a two-dimensional set to describe
> > 1) Multiple DSMAS entries for RAM (so multiple inputs to pass to the _DSM)
> >    One nice thing here might be to ensure we have the first one seen.
> >    So if in future we do need to extend it, this corresponds to the 0th one
> >    described.
> > 2) Want to describe less ideal QTG values from _DSM 
> > 
> > 
> > Maybe it's too early to come to any conclusion and the single 0 is enough.
> > The cynic in me suggests we call it qos_class0_0, though, to give us the space.
> > If we need DSMAS ranges, then we describe those using the first index,
> > and the second is the priority index if we have multiple answers from _DSM.
> > For now it's always 0_0  
> 
> I talked to Dan and it seems he prefers the simplest form for the current version until we have a need to move towards something more complex. So qos_class0 -> qos_class. We can move to qos_classN_M when there is a need.
> 
> 
Ok.   We would need to maintain the qos_class interface as whatever it
is today, but that's fine.

J
