All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v7 0/2] acpi: report numa nodes for device memory using GI
@ 2024-02-23 12:42 ankita
  2024-02-23 12:42 ` [PATCH v7 1/2] qom: new object to associate device to numa node ankita
  2024-02-23 12:42 ` [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure ankita
  0 siblings, 2 replies; 37+ messages in thread
From: ankita @ 2024-02-23 12:42 UTC (permalink / raw)
  To: ankita, jgg, alex.williamson, clg, shannon.zhaosl, peter.maydell,
	ani, berrange, eduardo, imammedo, mst, eblake, armbru, david,
	gshan, Jonathan.Cameron, zhiw, mochs, pbonzini
  Cc: aniketa, cjia, kwankhede, targupta, vsethi, acurrid, dnigam,
	udhoke, qemu-arm, qemu-devel

From: Ankit Agrawal <ankita@nvidia.com>

There are upcoming devices which allow CPU to cache coherently access
their memory. It is sensible to expose such memory as NUMA nodes separate
from the sysmem node to the OS. The ACPI spec provides a scheme in SRAT
called Generic Initiator Affinity Structure [1] to allow an association
between a Proximity Domain (PXM) and a Generic Initiator (GI) (e.g.
heterogeneous processors and accelerators, GPUs, and I/O devices with
integrated compute or DMA engines).

While a single node per device may cover several use cases, it is however
insufficient for a full utilization of the NVIDIA GPUs MIG
(Mult-Instance GPUs) [2] feature. The feature allows partitioning of the
GPU device resources (including device memory) into several (upto 8)
isolated instances. Each of the partitioned memory requires a dedicated NUMA
node to operate. The partitions are not fixed and they can be created/deleted
at runtime.

Linux OS does not provide a means to dynamically create/destroy NUMA nodes
and such feature implementation is expected to be non-trivial. The nodes
that OS discovers at the boot time while parsing SRAT remains fixed. So we
utilize the GI Affinity structures that allows association between nodes
and devices. Multiple GI structures per device/BDF is possible, allowing
creation of multiple nodes in the VM by exposing unique PXM in each of these
structures.

Implement the mechanism to build the GI affinity structures as Qemu currently
does not. Introduce a new acpi-generic-initiator object to allow host admin
link a device with an associated NUMA node. Qemu maintains this association
and use this object to build the requisite GI Affinity Structure.

When multiple numa nodes are associated with a device, it is required to
create those many number of acpi-generic-initiator objects, each representing
a unique device:node association.

Following is one of a decoded GI affinity structure in VM ACPI SRAT.
[0C8h 0200   1]                Subtable Type : 05 [Generic Initiator Affinity]
[0C9h 0201   1]                       Length : 20

[0CAh 0202   1]                    Reserved1 : 00
[0CBh 0203   1]           Device Handle Type : 01
[0CCh 0204   4]             Proximity Domain : 00000007
[0D0h 0208  16]                Device Handle : 00 00 20 00 00 00 00 00 00 00 00
00 00 00 00 00
[0E0h 0224   4]        Flags (decoded below) : 00000001
                                     Enabled : 1
[0E4h 0228   4]                    Reserved2 : 00000000

[0E8h 0232   1]                Subtable Type : 05 [Generic Initiator Affinity]
[0E9h 0233   1]                       Length : 20

On Grace Hopper systems, an admin will create a range of 8 nodes and associate
them with the device using the acpi-generic-initiator object. While a
configuration of less than 8 nodes per device is allowed, such configuration
will prevent utilization of the feature to the fullest. This setting is
applicable to all the Grace+Hopper systems. The following is an example of
the Qemu command line arguments to create 8 nodes and link them to the device
'dev0':

-numa node,nodeid=2 -numa node,nodeid=3 -numa node,nodeid=4 \
-numa node,nodeid=5 -numa node,nodeid=6 -numa node,nodeid=7 \
-numa node,nodeid=8 -numa node,nodeid=9 \
-device vfio-pci-nohotplug,host=0009:01:00.0,bus=pcie.0,addr=04.0,rombar=0,id=dev0 \
-object acpi-generic-initiator,id=gi0,pci-dev=dev0,node=2 \
-object acpi-generic-initiator,id=gi1,pci-dev=dev0,node=3 \
-object acpi-generic-initiator,id=gi2,pci-dev=dev0,node=4 \
-object acpi-generic-initiator,id=gi3,pci-dev=dev0,node=5 \
-object acpi-generic-initiator,id=gi4,pci-dev=dev0,node=6 \
-object acpi-generic-initiator,id=gi5,pci-dev=dev0,node=7 \
-object acpi-generic-initiator,id=gi6,pci-dev=dev0,node=8 \
-object acpi-generic-initiator,id=gi7,pci-dev=dev0,node=9 \

The performance benefits can be realized by providing the NUMA node distances
appropriately (through libvirt tags or Qemu params). The admin can get the
distance among nodes in hardware using `numactl -H`.

This series goes along with the vfio-pci variant driver [3] under review.

Applied over v8.2.1.
base commit: f48c205fb42be48e2e47b7e1cd9a2802e5ca17b0

[1] ACPI Spec 6.3, Section 5.2.16.6
[2] https://www.nvidia.com/en-in/technologies/multi-instance-gpu
[3] https://lore.kernel.org/all/20240220115055.23546-4-ankita@nvidia.com/

Link for v6:
https://lore.kernel.org/all/20231225045603.7654-1-ankita@nvidia.com/

v6 -> v7
- Updated code and the commit message to make acpi-generic-initiator
  define a 1:1 relationship between device and node based on
  Jonathan Cameron's suggestion.
- Updated commit message to include the decoded GI entry in the SRAT.
- Rebased to v8.2.1.

v5 -> v6
- Updated commit message for the [1/2] and the cover letter.
- Updated the acpi-generic-initiator object comment description for
  clarity on the input host-nodes.
- Rebased to v8.2.0-rc4.

v4 -> v5
- Removed acpi-dev option until full support.
- The numa nodes are saved as bitmap instead of uint16List.
- Replaced asserts to exit calls.
- Addressed other miscellaneous comments.

v3 -> v4
- changed the ':' delimited way to a uint16 array to communicate the
nodes associated with the device.
- added asserts to handle invalid inputs.
- addressed other miscellaneous v3 comments.

v2 -> v3
- changed param to accept a ':' delimited list of numa nodes, instead
of a range.
- Removed nvidia-acpi-generic-initiator object.
- Addressed miscellaneous comments in v2.

v1 -> v2
- Removed dependency on sysfs to communicate the feature with variant module.
- Use GI Affinity SRAT structure instead of Memory Affinity.
- No DSDT entries needed to communicate the PXM for the device. SRAT GI
structure is used instead.
- New objects introduced to establish link between device and nodes.

Ankit Agrawal (2):
  qom: new object to associate device to numa node
  hw/acpi: Implement the SRAT GI affinity structure

 hw/acpi/acpi-generic-initiator.c         | 154 +++++++++++++++++++++++
 hw/acpi/meson.build                      |   1 +
 hw/arm/virt-acpi-build.c                 |   3 +
 include/hw/acpi/acpi-generic-initiator.h |  58 +++++++++
 qapi/qom.json                            |  17 +++
 5 files changed, 233 insertions(+)
 create mode 100644 hw/acpi/acpi-generic-initiator.c
 create mode 100644 include/hw/acpi/acpi-generic-initiator.h

-- 
2.34.1



^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v7 1/2] qom: new object to associate device to numa node
  2024-02-23 12:42 [PATCH v7 0/2] acpi: report numa nodes for device memory using GI ankita
@ 2024-02-23 12:42 ` ankita
  2024-02-27 13:00   ` Jonathan Cameron via
  2024-02-28  7:35   ` Markus Armbruster
  2024-02-23 12:42 ` [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure ankita
  1 sibling, 2 replies; 37+ messages in thread
From: ankita @ 2024-02-23 12:42 UTC (permalink / raw)
  To: ankita, jgg, alex.williamson, clg, shannon.zhaosl, peter.maydell,
	ani, berrange, eduardo, imammedo, mst, eblake, armbru, david,
	gshan, Jonathan.Cameron, zhiw, mochs, pbonzini
  Cc: aniketa, cjia, kwankhede, targupta, vsethi, acurrid, dnigam,
	udhoke, qemu-arm, qemu-devel

From: Ankit Agrawal <ankita@nvidia.com>

NVIDIA GPU's support MIG (Mult-Instance GPUs) feature [1], which allows
partitioning of the GPU device resources (including device memory) into
several (upto 8) isolated instances. Each of the partitioned memory needs
a dedicated NUMA node to operate. The partitions are not fixed and they
can be created/deleted at runtime.

Unfortunately Linux OS does not provide a means to dynamically create/destroy
NUMA nodes and such feature implementation is not expected to be trivial. The
nodes that OS discovers at the boot time while parsing SRAT remains fixed. So
we utilize the Generic Initiator Affinity structures that allows association
between nodes and devices. Multiple GI structures per BDF is possible,
allowing creation of multiple nodes by exposing unique PXM in each of these
structures.

Implement the mechanism to build the GI affinity structures as Qemu currently
does not. Introduce a new acpi-generic-initiator object to allow host admin
link a device with an associated NUMA node. Qemu maintains this association
and use this object to build the requisite GI Affinity Structure.

When multiple numa nodes are associated with a device, it is required to
create those many number of acpi-generic-initiator objects, each representing
a unique device:node association.

Following is one of a decoded GI affinity structure in VM ACPI SRAT.
[0C8h 0200   1]                Subtable Type : 05 [Generic Initiator Affinity]
[0C9h 0201   1]                       Length : 20

[0CAh 0202   1]                    Reserved1 : 00
[0CBh 0203   1]           Device Handle Type : 01
[0CCh 0204   4]             Proximity Domain : 00000007
[0D0h 0208  16]                Device Handle : 00 00 20 00 00 00 00 00 00 00 00
00 00 00 00 00
[0E0h 0224   4]        Flags (decoded below) : 00000001
                                     Enabled : 1
[0E4h 0228   4]                    Reserved2 : 00000000

[0E8h 0232   1]                Subtable Type : 05 [Generic Initiator Affinity]
[0E9h 0233   1]                       Length : 20

An admin can provide a range of acpi-generic-initiator objects, each
associating a device (by providing the id through pci-dev argument)
to the desired numa node (using the node argument). Currently, only PCI
device is supported.

For the grace hopper system, create a range of 8 nodes and associate that
with the device using the acpi-generic-initiator object. While a configuration
of less than 8 nodes per device is allowed, such configuration will prevent
utilization of the feature to the fullest. The following sample creates 8
nodes per PCI device for a VM with 2 PCI devices and link them to the
respecitve PCI device using acpi-generic-initiator objects:

-numa node,nodeid=2 -numa node,nodeid=3 -numa node,nodeid=4 \
-numa node,nodeid=5 -numa node,nodeid=6 -numa node,nodeid=7 \
-numa node,nodeid=8 -numa node,nodeid=9 \
-device vfio-pci-nohotplug,host=0009:01:00.0,bus=pcie.0,addr=04.0,rombar=0,id=dev0 \
-object acpi-generic-initiator,id=gi0,pci-dev=dev0,node=2 \
-object acpi-generic-initiator,id=gi1,pci-dev=dev0,node=3 \
-object acpi-generic-initiator,id=gi2,pci-dev=dev0,node=4 \
-object acpi-generic-initiator,id=gi3,pci-dev=dev0,node=5 \
-object acpi-generic-initiator,id=gi4,pci-dev=dev0,node=6 \
-object acpi-generic-initiator,id=gi5,pci-dev=dev0,node=7 \
-object acpi-generic-initiator,id=gi6,pci-dev=dev0,node=8 \
-object acpi-generic-initiator,id=gi7,pci-dev=dev0,node=9 \

-numa node,nodeid=10 -numa node,nodeid=11 -numa node,nodeid=12 \
-numa node,nodeid=13 -numa node,nodeid=14 -numa node,nodeid=15 \
-numa node,nodeid=16 -numa node,nodeid=17 \
-device vfio-pci-nohotplug,host=0009:01:01.0,bus=pcie.0,addr=05.0,rombar=0,id=dev1 \
-object acpi-generic-initiator,id=gi8,pci-dev=dev1,node=10 \
-object acpi-generic-initiator,id=gi9,pci-dev=dev1,node=11 \
-object acpi-generic-initiator,id=gi10,pci-dev=dev1,node=12 \
-object acpi-generic-initiator,id=gi11,pci-dev=dev1,node=13 \
-object acpi-generic-initiator,id=gi12,pci-dev=dev1,node=14 \
-object acpi-generic-initiator,id=gi13,pci-dev=dev1,node=15 \
-object acpi-generic-initiator,id=gi14,pci-dev=dev1,node=16 \
-object acpi-generic-initiator,id=gi15,pci-dev=dev1,node=17 \

The performance benefits can be realized by providing the NUMA node distances
appropriately (through libvirt tags or Qemu params). The admin can get the
distance among nodes in hardware using `numactl -H`.

[1] https://www.nvidia.com/en-in/technologies/multi-instance-gpu

Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
---
 hw/acpi/acpi-generic-initiator.c         | 70 ++++++++++++++++++++++++
 hw/acpi/meson.build                      |  1 +
 include/hw/acpi/acpi-generic-initiator.h | 32 +++++++++++
 qapi/qom.json                            | 17 ++++++
 4 files changed, 120 insertions(+)
 create mode 100644 hw/acpi/acpi-generic-initiator.c
 create mode 100644 include/hw/acpi/acpi-generic-initiator.h

diff --git a/hw/acpi/acpi-generic-initiator.c b/hw/acpi/acpi-generic-initiator.c
new file mode 100644
index 0000000000..1ade2f723f
--- /dev/null
+++ b/hw/acpi/acpi-generic-initiator.c
@@ -0,0 +1,70 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved
+ */
+
+#include "qemu/osdep.h"
+#include "hw/acpi/acpi-generic-initiator.h"
+#include "hw/pci/pci_device.h"
+#include "qapi/error.h"
+#include "qapi/qapi-builtin-visit.h"
+#include "qapi/visitor.h"
+#include "qemu/error-report.h"
+
+OBJECT_DEFINE_TYPE_WITH_INTERFACES(AcpiGenericInitiator, acpi_generic_initiator,
+                   ACPI_GENERIC_INITIATOR, OBJECT,
+                   { TYPE_USER_CREATABLE },
+                   { NULL })
+
+OBJECT_DECLARE_SIMPLE_TYPE(AcpiGenericInitiator, ACPI_GENERIC_INITIATOR)
+
+static void acpi_generic_initiator_init(Object *obj)
+{
+    AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj);
+
+    gi->node = MAX_NODES;
+    gi->pci_dev = NULL;
+}
+
+static void acpi_generic_initiator_finalize(Object *obj)
+{
+    AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj);
+
+    g_free(gi->pci_dev);
+}
+
+static void acpi_generic_initiator_set_pci_device(Object *obj, const char *val,
+                                                  Error **errp)
+{
+    AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj);
+
+    gi->pci_dev = g_strdup(val);
+}
+
+static void acpi_generic_initiator_set_node(Object *obj, Visitor *v,
+                                            const char *name, void *opaque,
+                                            Error **errp)
+{
+    AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj);
+    uint32_t value;
+
+    if (!visit_type_uint32(v, name, &value, errp)) {
+        return;
+    }
+
+    if (value >= MAX_NODES) {
+        error_printf("%s: Invalid NUMA node specified\n",
+                     TYPE_ACPI_GENERIC_INITIATOR);
+        exit(1);
+    }
+
+    gi->node = value;
+}
+
+static void acpi_generic_initiator_class_init(ObjectClass *oc, void *data)
+{
+    object_class_property_add_str(oc, "pci-dev", NULL,
+        acpi_generic_initiator_set_pci_device);
+    object_class_property_add(oc, "node", "int", NULL,
+        acpi_generic_initiator_set_node, NULL, NULL);
+}
diff --git a/hw/acpi/meson.build b/hw/acpi/meson.build
index fc1b952379..2268589519 100644
--- a/hw/acpi/meson.build
+++ b/hw/acpi/meson.build
@@ -1,5 +1,6 @@
 acpi_ss = ss.source_set()
 acpi_ss.add(files(
+  'acpi-generic-initiator.c',
   'acpi_interface.c',
   'aml-build.c',
   'bios-linker-loader.c',
diff --git a/include/hw/acpi/acpi-generic-initiator.h b/include/hw/acpi/acpi-generic-initiator.h
new file mode 100644
index 0000000000..2f183b029a
--- /dev/null
+++ b/include/hw/acpi/acpi-generic-initiator.h
@@ -0,0 +1,32 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved
+ */
+
+#ifndef ACPI_GENERIC_INITIATOR_H
+#define ACPI_GENERIC_INITIATOR_H
+
+#include "hw/mem/pc-dimm.h"
+#include "hw/acpi/bios-linker-loader.h"
+#include "hw/acpi/aml-build.h"
+#include "sysemu/numa.h"
+#include "qemu/uuid.h"
+#include "qom/object.h"
+#include "qom/object_interfaces.h"
+
+#define TYPE_ACPI_GENERIC_INITIATOR "acpi-generic-initiator"
+
+typedef struct AcpiGenericInitiator {
+    /* private */
+    Object parent;
+
+    /* public */
+    char *pci_dev;
+    uint16_t node;
+} AcpiGenericInitiator;
+
+typedef struct AcpiGenericInitiatorClass {
+        ObjectClass parent_class;
+} AcpiGenericInitiatorClass;
+
+#endif
diff --git a/qapi/qom.json b/qapi/qom.json
index c53ef978ff..7efa0e14f6 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -794,6 +794,21 @@
 { 'struct': 'VfioUserServerProperties',
   'data': { 'socket': 'SocketAddress', 'device': 'str' } }
 
+##
+# @AcpiGenericInitiatorProperties:
+#
+# Properties for acpi-generic-initiator objects.
+#
+# @pci-dev: PCI device ID to be associated with the node
+#
+# @node: numa node associated with the PCI device
+#
+# Since: 9.0
+##
+{ 'struct': 'AcpiGenericInitiatorProperties',
+  'data': { 'pci-dev': 'str',
+            'node': 'uint32' } }
+
 ##
 # @RngProperties:
 #
@@ -911,6 +926,7 @@
 ##
 { 'enum': 'ObjectType',
   'data': [
+    'acpi-generic-initiator',
     'authz-list',
     'authz-listfile',
     'authz-pam',
@@ -981,6 +997,7 @@
             'id': 'str' },
   'discriminator': 'qom-type',
   'data': {
+      'acpi-generic-initiator':     'AcpiGenericInitiatorProperties',
       'authz-list':                 'AuthZListProperties',
       'authz-listfile':             'AuthZListFileProperties',
       'authz-pam':                  'AuthZPAMProperties',
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure
  2024-02-23 12:42 [PATCH v7 0/2] acpi: report numa nodes for device memory using GI ankita
  2024-02-23 12:42 ` [PATCH v7 1/2] qom: new object to associate device to numa node ankita
@ 2024-02-23 12:42 ` ankita
  2024-02-26 16:34   ` Jonathan Cameron via
                     ` (2 more replies)
  1 sibling, 3 replies; 37+ messages in thread
From: ankita @ 2024-02-23 12:42 UTC (permalink / raw)
  To: ankita, jgg, alex.williamson, clg, shannon.zhaosl, peter.maydell,
	ani, berrange, eduardo, imammedo, mst, eblake, armbru, david,
	gshan, Jonathan.Cameron, zhiw, mochs, pbonzini
  Cc: aniketa, cjia, kwankhede, targupta, vsethi, acurrid, dnigam,
	udhoke, qemu-arm, qemu-devel

From: Ankit Agrawal <ankita@nvidia.com>

ACPI spec provides a scheme to associate "Generic Initiators" [1]
(e.g. heterogeneous processors and accelerators, GPUs, and I/O devices with
integrated compute or DMA engines GPUs) with Proximity Domains. This is
achieved using Generic Initiator Affinity Structure in SRAT. During bootup,
Linux kernel parse the ACPI SRAT to determine the PXM ids and create a NUMA
node for each unique PXM ID encountered. Qemu currently do not implement
these structures while building SRAT.

Add GI structures while building VM ACPI SRAT. The association between
device and node are stored using acpi-generic-initiator object. Lookup
presence of all such objects and use them to build these structures.

The structure needs a PCI device handle [2] that consists of the device BDF.
The vfio-pci device corresponding to the acpi-generic-initiator object is
located to determine the BDF.

[1] ACPI Spec 6.3, Section 5.2.16.6
[2] ACPI Spec 6.3, Table 5.80

Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
---
 hw/acpi/acpi-generic-initiator.c         | 84 ++++++++++++++++++++++++
 hw/arm/virt-acpi-build.c                 |  3 +
 include/hw/acpi/acpi-generic-initiator.h | 26 ++++++++
 3 files changed, 113 insertions(+)

diff --git a/hw/acpi/acpi-generic-initiator.c b/hw/acpi/acpi-generic-initiator.c
index 1ade2f723f..d78382bc63 100644
--- a/hw/acpi/acpi-generic-initiator.c
+++ b/hw/acpi/acpi-generic-initiator.c
@@ -68,3 +68,87 @@ static void acpi_generic_initiator_class_init(ObjectClass *oc, void *data)
     object_class_property_add(oc, "node", "int", NULL,
         acpi_generic_initiator_set_node, NULL, NULL);
 }
+
+static int acpi_generic_initiator_list(Object *obj, void *opaque)
+{
+    GSList **list = opaque;
+
+    if (object_dynamic_cast(obj, TYPE_ACPI_GENERIC_INITIATOR)) {
+        *list = g_slist_append(*list, ACPI_GENERIC_INITIATOR(obj));
+    }
+
+    object_child_foreach(obj, acpi_generic_initiator_list, opaque);
+    return 0;
+}
+
+/*
+ * Identify Generic Initiator objects and link them into the list which is
+ * returned to the caller.
+ *
+ * Note: it is the caller's responsibility to free the list to avoid
+ * memory leak.
+ */
+static GSList *acpi_generic_initiator_get_list(void)
+{
+    GSList *list = NULL;
+
+    object_child_foreach(object_get_root(),
+                         acpi_generic_initiator_list, &list);
+    return list;
+}
+
+/*
+ * ACPI 6.3:
+ * Table 5-78 Generic Initiator Affinity Structure
+ */
+static void
+build_srat_generic_pci_initiator_affinity(GArray *table_data, int node,
+                                          PCIDeviceHandle *handle)
+{
+    uint8_t index;
+
+    build_append_int_noprefix(table_data, 5, 1);  /* Type */
+    build_append_int_noprefix(table_data, 32, 1); /* Length */
+    build_append_int_noprefix(table_data, 0, 1);  /* Reserved */
+    build_append_int_noprefix(table_data, 1, 1);  /* Device Handle Type: PCI */
+    build_append_int_noprefix(table_data, node, 4);  /* Proximity Domain */
+
+    /* Device Handle - PCI */
+    build_append_int_noprefix(table_data, handle->segment, 2);
+    build_append_int_noprefix(table_data, handle->bdf, 2);
+    for (index = 0; index < 12; index++) {
+        build_append_int_noprefix(table_data, 0, 1);
+    }
+
+    build_append_int_noprefix(table_data, GEN_AFFINITY_ENABLED, 4); /* Flags */
+    build_append_int_noprefix(table_data, 0, 4);     /* Reserved */
+}
+
+void build_srat_generic_pci_initiator(GArray *table_data)
+{
+    GSList *gi_list, *list = acpi_generic_initiator_get_list();
+    AcpiGenericInitiator *gi;
+
+    for (gi_list = list; gi_list; gi_list = gi_list->next) {
+        PCIDeviceHandle dev_handle;
+        PCIDevice *pci_dev;
+        Object *o;
+
+        gi = gi_list->data;
+
+        o = object_resolve_path_type(gi->pci_dev, TYPE_PCI_DEVICE, NULL);
+        if (!o) {
+            error_printf("Specified device must be a PCI device.\n");
+            exit(1);
+        }
+        pci_dev = PCI_DEVICE(o);
+
+        dev_handle.segment = 0;
+        dev_handle.bdf = PCI_BUILD_BDF(pci_bus_num(pci_get_bus(pci_dev)),
+                                                   pci_dev->devfn);
+        build_srat_generic_pci_initiator_affinity(table_data,
+                                                  gi->node, &dev_handle);
+    }
+
+    g_slist_free(list);
+}
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 8bc35a483c..00d77327e0 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -58,6 +58,7 @@
 #include "migration/vmstate.h"
 #include "hw/acpi/ghes.h"
 #include "hw/acpi/viot.h"
+#include "hw/acpi/acpi-generic-initiator.h"
 
 #define ARM_SPI_BASE 32
 
@@ -558,6 +559,8 @@ build_srat(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
         }
     }
 
+    build_srat_generic_pci_initiator(table_data);
+
     if (ms->nvdimms_state->is_enabled) {
         nvdimm_build_srat(table_data);
     }
diff --git a/include/hw/acpi/acpi-generic-initiator.h b/include/hw/acpi/acpi-generic-initiator.h
index 2f183b029a..213545e614 100644
--- a/include/hw/acpi/acpi-generic-initiator.h
+++ b/include/hw/acpi/acpi-generic-initiator.h
@@ -29,4 +29,30 @@ typedef struct AcpiGenericInitiatorClass {
         ObjectClass parent_class;
 } AcpiGenericInitiatorClass;
 
+/*
+ * ACPI 6.3:
+ * Table 5-81 Flags – Generic Initiator Affinity Structure
+ */
+typedef enum {
+    GEN_AFFINITY_ENABLED = (1 << 0), /*
+                                      * If clear, the OSPM ignores the contents
+                                      * of the Generic Initiator/Port Affinity
+                                      * Structure. This allows system firmware
+                                      * to populate the SRAT with a static
+                                      * number of structures, but only enable
+                                      * them as necessary.
+                                      */
+} GenericAffinityFlags;
+
+/*
+ * ACPI 6.3:
+ * Table 5-80 Device Handle - PCI
+ */
+typedef struct PCIDeviceHandle {
+    uint16_t segment;
+    uint16_t bdf;
+} PCIDeviceHandle;
+
+void build_srat_generic_pci_initiator(GArray *table_data);
+
 #endif
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure
  2024-02-23 12:42 ` [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure ankita
@ 2024-02-26 16:34   ` Jonathan Cameron via
  2024-02-26 16:42   ` Jonathan Cameron via
  2024-02-27 12:53   ` Jonathan Cameron via
  2 siblings, 0 replies; 37+ messages in thread
From: Jonathan Cameron via @ 2024-02-26 16:34 UTC (permalink / raw)
  To: ankita
  Cc: jgg, alex.williamson, clg, shannon.zhaosl, peter.maydell, ani,
	berrange, eduardo, imammedo, mst, eblake, armbru, david, gshan,
	zhiw, mochs, pbonzini, aniketa, cjia, kwankhede, targupta,
	vsethi, acurrid, dnigam, udhoke, qemu-arm, qemu-devel

On Fri, 23 Feb 2024 12:42:23 +0000
<ankita@nvidia.com> wrote:

> From: Ankit Agrawal <ankita@nvidia.com>
> 
> ACPI spec provides a scheme to associate "Generic Initiators" [1]
> (e.g. heterogeneous processors and accelerators, GPUs, and I/O devices with
> integrated compute or DMA engines GPUs) with Proximity Domains. This is
> achieved using Generic Initiator Affinity Structure in SRAT. During bootup,
> Linux kernel parse the ACPI SRAT to determine the PXM ids and create a NUMA
> node for each unique PXM ID encountered. Qemu currently do not implement
> these structures while building SRAT.
> 
> Add GI structures while building VM ACPI SRAT. The association between
> device and node are stored using acpi-generic-initiator object. Lookup
> presence of all such objects and use them to build these structures.
> 
> The structure needs a PCI device handle [2] that consists of the device BDF.
> The vfio-pci device corresponding to the acpi-generic-initiator object is
> located to determine the BDF.
> 
> [1] ACPI Spec 6.3, Section 5.2.16.6
> [2] ACPI Spec 6.3, Table 5.80
> 
> Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
Hi Ankit,

As the code stands the use of a list seems overkill.

Otherwise looks good to me.  I need Generic Ports support for CXL
stuff so will copy your approach for that as it's ended up nice
and simple.

Jonathan

> ---
>  hw/acpi/acpi-generic-initiator.c         | 84 ++++++++++++++++++++++++
>  hw/arm/virt-acpi-build.c                 |  3 +
>  include/hw/acpi/acpi-generic-initiator.h | 26 ++++++++
>  3 files changed, 113 insertions(+)
> 
> diff --git a/hw/acpi/acpi-generic-initiator.c b/hw/acpi/acpi-generic-initiator.c
> index 1ade2f723f..d78382bc63 100644
> --- a/hw/acpi/acpi-generic-initiator.c
> +++ b/hw/acpi/acpi-generic-initiator.c
> @@ -68,3 +68,87 @@ static void acpi_generic_initiator_class_init(ObjectClass *oc, void *data)
>      object_class_property_add(oc, "node", "int", NULL,
>          acpi_generic_initiator_set_node, NULL, NULL);
>  }
> +
> +static int acpi_generic_initiator_list(Object *obj, void *opaque)
> +{
> +    GSList **list = opaque;
> +
> +    if (object_dynamic_cast(obj, TYPE_ACPI_GENERIC_INITIATOR)) {
> +        *list = g_slist_append(*list, ACPI_GENERIC_INITIATOR(obj));
> +    }
> +
> +    object_child_foreach(obj, acpi_generic_initiator_list, opaque);

See below.  There is a recursive helper that avoids need for this.

> +    return 0;
> +}
> +
> +/*
> + * Identify Generic Initiator objects and link them into the list which is
> + * returned to the caller.
> + *
> + * Note: it is the caller's responsibility to free the list to avoid
> + * memory leak.
> + */
> +static GSList *acpi_generic_initiator_get_list(void)
> +{
> +    GSList *list = NULL;
> +
> +    object_child_foreach(object_get_root(),
> +                         acpi_generic_initiator_list, &list);

I think you can use object_child_foreach_recursive() and skip the manual
calling above?

> +    return list;
> +}
> +
> +/*
> + * ACPI 6.3:
> + * Table 5-78 Generic Initiator Affinity Structure
> + */
> +static void
> +build_srat_generic_pci_initiator_affinity(GArray *table_data, int node,
> +                                          PCIDeviceHandle *handle)
> +{
> +    uint8_t index;
> +
> +    build_append_int_noprefix(table_data, 5, 1);  /* Type */
> +    build_append_int_noprefix(table_data, 32, 1); /* Length */
> +    build_append_int_noprefix(table_data, 0, 1);  /* Reserved */
> +    build_append_int_noprefix(table_data, 1, 1);  /* Device Handle Type: PCI */
> +    build_append_int_noprefix(table_data, node, 4);  /* Proximity Domain */
> +
> +    /* Device Handle - PCI */
> +    build_append_int_noprefix(table_data, handle->segment, 2);
> +    build_append_int_noprefix(table_data, handle->bdf, 2);
> +    for (index = 0; index < 12; index++) {
> +        build_append_int_noprefix(table_data, 0, 1);
> +    }
> +
> +    build_append_int_noprefix(table_data, GEN_AFFINITY_ENABLED, 4); /* Flags */
> +    build_append_int_noprefix(table_data, 0, 4);     /* Reserved */
> +}
> +
> +void build_srat_generic_pci_initiator(GArray *table_data)
> +{
> +    GSList *gi_list, *list = acpi_generic_initiator_get_list();


Did you consider just have the functional called in the scan do this?
Not sure you need anything as a parameter beyond the GArray *table_data

Something like...

static int acpi_generic_initiator_list(Object *obj, void *opaque)
{
    uint8_t index;
    AcpiGenericInitiator *gi;
    GArray *table_data = opaque;
    PCIDeviceHandle dev_handle;
    PCIDevice *pci_dev;
    Object *o;

    if (!object_dynamic_cast(obj, TYPE_ACPI_GENERIC_INITIATOR)) {
        return 0;
    }

    gi = ACPI_GENERIC_INITIATOR(obj);
    o = object_resolve_path_type(gi->pci_dev, TYPE_PCI_DEVICE, NULL);
    if (!o) {
	error_setg(&error_abort, "GI: Specified device must be a PCI device.\n")
	return 1;
    }
    pci_dev = PCI_DEVICE(o);

    dev_handle.segment = 0;
    dev_handle.bdf = PCI_BUILD_BDF(pci_bus_num(pci_get_bus(pci_dev)),
                                               pci_dev->devfn);
    build_srat_generic_pci_initiator_affinity(table_data,
                                              gi->node, &dev_handle);
}

+ a call to.
    object_child_foreach_recursive(object_get_root(),
                                   acpi_generic_srat, table_data);	

> +    AcpiGenericInitiator *gi;
> +
> +    for (gi_list = list; gi_list; gi_list = gi_list->next) {
> +        PCIDeviceHandle dev_handle;
> +        PCIDevice *pci_dev;
> +        Object *o;
> +
> +        gi = gi_list->data;
> +
> +        o = object_resolve_path_type(gi->pci_dev, TYPE_PCI_DEVICE, NULL);
> +        if (!o) {
> +            error_printf("Specified device must be a PCI device.\n");
as above, use an errp rather than exit(1);
> +            exit(1);
> +        }
> +        pci_dev = PCI_DEVICE(o);
> +
> +        dev_handle.segment = 0;
> +        dev_handle.bdf = PCI_BUILD_BDF(pci_bus_num(pci_get_bus(pci_dev)),
> +                                                   pci_dev->devfn);
> +        build_srat_generic_pci_initiator_affinity(table_data,
> +                                                  gi->node, &dev_handle);
Should we check for consistency of gi->node and
-numa node,id=X entries?

Maybe just check less than numa_state->num_nodes as that's the variable
used to walk the other structures when building srat.

> +    }
> +
> +    g_slist_free(list);
> +}
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 8bc35a483c..00d77327e0 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -58,6 +58,7 @@
>  #include "migration/vmstate.h"
>  #include "hw/acpi/ghes.h"
>  #include "hw/acpi/viot.h"
> +#include "hw/acpi/acpi-generic-initiator.h"
>  
>  #define ARM_SPI_BASE 32
>  
> @@ -558,6 +559,8 @@ build_srat(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>          }
>      }
>  
> +    build_srat_generic_pci_initiator(table_data);
Perhaps passing in a suitable Error ** would be sensible.

> +
>      if (ms->nvdimms_state->is_enabled) {
>          nvdimm_build_srat(table_data);
>      }
> diff --git a/include/hw/acpi/acpi-generic-initiator.h b/include/hw/acpi/acpi-generic-initiator.h
> index 2f183b029a..213545e614 100644
> --- a/include/hw/acpi/acpi-generic-initiator.h
> +++ b/include/hw/acpi/acpi-generic-initiator.h
> @@ -29,4 +29,30 @@ typedef struct AcpiGenericInitiatorClass {
>          ObjectClass parent_class;
>  } AcpiGenericInitiatorClass;
>  
> +/*
> + * ACPI 6.3:
> + * Table 5-81 Flags – Generic Initiator Affinity Structure
> + */
> +typedef enum {
> +    GEN_AFFINITY_ENABLED = (1 << 0), /*
> +                                      * If clear, the OSPM ignores the contents
> +                                      * of the Generic Initiator/Port Affinity
> +                                      * Structure. This allows system firmware
> +                                      * to populate the SRAT with a static
> +                                      * number of structures, but only enable
> +                                      * them as necessary.
> +                                      */
I'd put the comment above the definition to avoid wrapping so much!
> +} GenericAffinityFlags;
> +
> +/*
> + * ACPI 6.3:
> + * Table 5-80 Device Handle - PCI
> + */
> +typedef struct PCIDeviceHandle {
> +    uint16_t segment;
> +    uint16_t bdf;
> +} PCIDeviceHandle;
> +
> +void build_srat_generic_pci_initiator(GArray *table_data);
> +
>  #endif



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure
  2024-02-23 12:42 ` [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure ankita
  2024-02-26 16:34   ` Jonathan Cameron via
@ 2024-02-26 16:42   ` Jonathan Cameron via
  2024-02-27  8:37     ` Ankit Agrawal
                       ` (3 more replies)
  2024-02-27 12:53   ` Jonathan Cameron via
  2 siblings, 4 replies; 37+ messages in thread
From: Jonathan Cameron via @ 2024-02-26 16:42 UTC (permalink / raw)
  To: ankita
  Cc: jgg, alex.williamson, clg, shannon.zhaosl, peter.maydell, ani,
	berrange, eduardo, imammedo, mst, eblake, armbru, david, gshan,
	zhiw, mochs, pbonzini, aniketa, cjia, kwankhede, targupta,
	vsethi, acurrid, dnigam, udhoke, qemu-arm, qemu-devel

On Fri, 23 Feb 2024 12:42:23 +0000
<ankita@nvidia.com> wrote:

> From: Ankit Agrawal <ankita@nvidia.com>
> 
> ACPI spec provides a scheme to associate "Generic Initiators" [1]
> (e.g. heterogeneous processors and accelerators, GPUs, and I/O devices with
> integrated compute or DMA engines GPUs) with Proximity Domains. This is
> achieved using Generic Initiator Affinity Structure in SRAT. During bootup,
> Linux kernel parse the ACPI SRAT to determine the PXM ids and create a NUMA
> node for each unique PXM ID encountered. Qemu currently do not implement
> these structures while building SRAT.
> 
> Add GI structures while building VM ACPI SRAT. The association between
> device and node are stored using acpi-generic-initiator object. Lookup
> presence of all such objects and use them to build these structures.
> 
> The structure needs a PCI device handle [2] that consists of the device BDF.
> The vfio-pci device corresponding to the acpi-generic-initiator object is
> located to determine the BDF.
> 
> [1] ACPI Spec 6.3, Section 5.2.16.6
> [2] ACPI Spec 6.3, Table 5.80
> 
> Signed-off-by: Ankit Agrawal <ankita@nvidia.com>

One thing I forgot.

Please add a test.  tests/qtest/bios-tables-test.c
+ relevant table dumps.

Could also hook this up for x86 with a oneline addition and improve
test coverage.  If not, I'll do it when I add Generic Ports as annoyingly
people still care about x86 for some reason.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure
  2024-02-26 16:42   ` Jonathan Cameron via
@ 2024-02-27  8:37     ` Ankit Agrawal
  2024-02-27 17:11       ` Jonathan Cameron via
  2024-02-29 11:43     ` Ankit Agrawal
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 37+ messages in thread
From: Ankit Agrawal @ 2024-02-27  8:37 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Jason Gunthorpe, alex.williamson, clg, shannon.zhaosl,
	peter.maydell, ani, berrange, eduardo, imammedo, mst, eblake,
	armbru, david, gshan, Zhi Wang, Matt Ochs, pbonzini,
	Aniket Agashe, Neo Jia, Kirti Wankhede, Tarun Gupta (SW-GPU),
	Vikram Sethi, Andy Currid, Dheeraj Nigam, Uday Dhoke, qemu-arm,
	qemu-devel

Thanks Jonathan for reviewing the change.

Comments inline.

>> The structure needs a PCI device handle [2] that consists of the device BDF.
>> The vfio-pci device corresponding to the acpi-generic-initiator object is
>> located to determine the BDF.
>>
>> [1] ACPI Spec 6.3, Section 5.2.16.6
>> [2] ACPI Spec 6.3, Table 5.80
>>
>> Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
>Hi Ankit,
>
> As the code stands the use of a list seems overkill.

Yeah, I will try out your suggestion.

> Otherwise looks good to me.  I need Generic Ports support for CXL
> stuff so will copy your approach for that as it's ended up nice
> and simple.
> 
> Jonathan

Nice, would be good to have uniform implementations.

>> +/*
>> + * Identify Generic Initiator objects and link them into the list which is
>> + * returned to the caller.
>> + *
>> + * Note: it is the caller's responsibility to free the list to avoid
>> + * memory leak.
>> + */
>> +static GSList *acpi_generic_initiator_get_list(void)
>> +{
>> +    GSList *list = NULL;
>> +
>> +    object_child_foreach(object_get_root(),
>> +                         acpi_generic_initiator_list, &list);
>
> I think you can use object_child_foreach_recursive() and skip the manual
> calling above?

Great suggestion, will give that a try.

>> + * ACPI 6.3:
>> + * Table 5-78 Generic Initiator Affinity Structure
>> + */
>> +static void
>> +build_srat_generic_pci_initiator_affinity(GArray *table_data, int node,
>> +                                          PCIDeviceHandle *handle)
>> +{
>> +    uint8_t index;
>> +
>> +    build_append_int_noprefix(table_data, 5, 1);  /* Type */
>> +    build_append_int_noprefix(table_data, 32, 1); /* Length */
>> +    build_append_int_noprefix(table_data, 0, 1);  /* Reserved */
>> +    build_append_int_noprefix(table_data, 1, 1);  /* Device Handle Type: PCI */
>> +    build_append_int_noprefix(table_data, node, 4);  /* Proximity Domain */
>> +
>> +    /* Device Handle - PCI */
>> +    build_append_int_noprefix(table_data, handle->segment, 2);
>> +    build_append_int_noprefix(table_data, handle->bdf, 2);
>> +    for (index = 0; index < 12; index++) {
>> +        build_append_int_noprefix(table_data, 0, 1);
>> +    }
>> +
>> +    build_append_int_noprefix(table_data, GEN_AFFINITY_ENABLED, 4); /* Flags */
>> +    build_append_int_noprefix(table_data, 0, 4);     /* Reserved */
>> +}
>> +
>> +void build_srat_generic_pci_initiator(GArray *table_data)
>> +{
>> +    GSList *gi_list, *list = acpi_generic_initiator_get_list();
>
>
> Did you consider just have the functional called in the scan do this?
> Not sure you need anything as a parameter beyond the GArray *table_data
>
> Something like...
>
> static int acpi_generic_initiator_list(Object *obj, void *opaque)
> {
>    uint8_t index;
>    AcpiGenericInitiator *gi;
>    GArray *table_data = opaque;
>    PCIDeviceHandle dev_handle;
>    PCIDevice *pci_dev;
>    Object *o;
>
>    if (!object_dynamic_cast(obj, TYPE_ACPI_GENERIC_INITIATOR)) {
>        return 0;
>    }
>
>    gi = ACPI_GENERIC_INITIATOR(obj);
>    o = object_resolve_path_type(gi->pci_dev, TYPE_PCI_DEVICE, NULL);
>    if (!o) {
>        error_setg(&error_abort, "GI: Specified device must be a PCI device.\n")
>        return 1;
>    }
>    pci_dev = PCI_DEVICE(o);
>
>    dev_handle.segment = 0;
>    dev_handle.bdf = PCI_BUILD_BDF(pci_bus_num(pci_get_bus(pci_dev)),
>                                               pci_dev->devfn);
>    build_srat_generic_pci_initiator_affinity(table_data,
>                                              gi->node, &dev_handle);
> }
>
> + a call to.
>    object_child_foreach_recursive(object_get_root(),
>                                   acpi_generic_srat, table_data);
>

Thanks. That does seem better.

>> +        pci_dev = PCI_DEVICE(o);
>> +
>> +        dev_handle.segment = 0;
>> +        dev_handle.bdf = PCI_BUILD_BDF(pci_bus_num(pci_get_bus(pci_dev)),
>> +                                                   pci_dev->devfn);
>> +        build_srat_generic_pci_initiator_affinity(table_data,
>> +                                                  gi->node, &dev_handle);
> Should we check for consistency of gi->node and
> -numa node,id=X entries?
>
> Maybe just check less than numa_state->num_nodes as that's the variable
> used to walk the other structures when building srat.

Ack, will add a check to compare against numa_state->num_nodes.

>> +/*
>> + * ACPI 6.3:
>> + * Table 5-81 Flags – Generic Initiator Affinity Structure
>> + */
>> +typedef enum {
>> +    GEN_AFFINITY_ENABLED = (1 << 0), /*
>> +                                      * If clear, the OSPM ignores the contents
>> +                                      * of the Generic Initiator/Port Affinity
>> +                                      * Structure. This allows system firmware
>> +                                      * to populate the SRAT with a static
>> +                                      * number of structures, but only enable
>> +                                      * them as necessary.
>> +                                      */
> I'd put the comment above the definition to avoid wrapping so much!

Yeah, it does look kind of silly. Will fix it.

>> +} GenericAffinityFlags;
>> +

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure
  2024-02-23 12:42 ` [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure ankita
  2024-02-26 16:34   ` Jonathan Cameron via
  2024-02-26 16:42   ` Jonathan Cameron via
@ 2024-02-27 12:53   ` Jonathan Cameron via
  2024-02-29 11:46     ` Ankit Agrawal
  2 siblings, 1 reply; 37+ messages in thread
From: Jonathan Cameron via @ 2024-02-27 12:53 UTC (permalink / raw)
  To: ankita
  Cc: jgg, alex.williamson, clg, shannon.zhaosl, peter.maydell, ani,
	berrange, eduardo, imammedo, mst, eblake, armbru, david, gshan,
	zhiw, mochs, pbonzini, aniketa, cjia, kwankhede, targupta,
	vsethi, acurrid, dnigam, udhoke, qemu-arm, qemu-devel

On Fri, 23 Feb 2024 12:42:23 +0000
<ankita@nvidia.com> wrote:

> From: Ankit Agrawal <ankita@nvidia.com>
> 
> ACPI spec provides a scheme to associate "Generic Initiators" [1]
> (e.g. heterogeneous processors and accelerators, GPUs, and I/O devices with
> integrated compute or DMA engines GPUs) with Proximity Domains. This is
> achieved using Generic Initiator Affinity Structure in SRAT. During bootup,
> Linux kernel parse the ACPI SRAT to determine the PXM ids and create a NUMA
> node for each unique PXM ID encountered. Qemu currently do not implement
> these structures while building SRAT.
> 
> Add GI structures while building VM ACPI SRAT. The association between
> device and node are stored using acpi-generic-initiator object. Lookup
> presence of all such objects and use them to build these structures.
> 
> The structure needs a PCI device handle [2] that consists of the device BDF.
> The vfio-pci device corresponding to the acpi-generic-initiator object is
> located to determine the BDF.
> 
> [1] ACPI Spec 6.3, Section 5.2.16.6
> [2] ACPI Spec 6.3, Table 5.80
> 
> Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
> ---
>  hw/acpi/acpi-generic-initiator.c         | 84 ++++++++++++++++++++++++
>  hw/arm/virt-acpi-build.c                 |  3 +
>  include/hw/acpi/acpi-generic-initiator.h | 26 ++++++++
A few more comments.

Maybe _ rather than - as more common for acpi include naming.

I also wonder if we need the acpi prefix for file names given context?



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 1/2] qom: new object to associate device to numa node
  2024-02-23 12:42 ` [PATCH v7 1/2] qom: new object to associate device to numa node ankita
@ 2024-02-27 13:00   ` Jonathan Cameron via
  2024-02-28  5:35     ` Ankit Agrawal
  2024-02-28  7:35   ` Markus Armbruster
  1 sibling, 1 reply; 37+ messages in thread
From: Jonathan Cameron via @ 2024-02-27 13:00 UTC (permalink / raw)
  To: ankita
  Cc: jgg, alex.williamson, clg, shannon.zhaosl, peter.maydell, ani,
	berrange, eduardo, imammedo, mst, eblake, armbru, david, gshan,
	zhiw, mochs, pbonzini, aniketa, cjia, kwankhede, targupta,
	vsethi, acurrid, dnigam, udhoke, qemu-arm, qemu-devel


> diff --git a/include/hw/acpi/acpi-generic-initiator.h b/include/hw/acpi/acpi-generic-initiator.h
> new file mode 100644
> index 0000000000..2f183b029a
> --- /dev/null
> +++ b/include/hw/acpi/acpi-generic-initiator.h

> +typedef struct AcpiGenericInitiatorClass {
> +        ObjectClass parent_class;
Too indented.
> +} AcpiGenericInitiatorClass;



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure
  2024-02-27  8:37     ` Ankit Agrawal
@ 2024-02-27 17:11       ` Jonathan Cameron via
  2024-02-27 17:36         ` Jonathan Cameron via
  0 siblings, 1 reply; 37+ messages in thread
From: Jonathan Cameron via @ 2024-02-27 17:11 UTC (permalink / raw)
  To: Ankit Agrawal
  Cc: Jason Gunthorpe, alex.williamson, clg, shannon.zhaosl,
	peter.maydell, ani, berrange, eduardo, imammedo, mst, eblake,
	armbru, david, gshan, Zhi Wang, Matt Ochs, pbonzini,
	Aniket Agashe, Neo Jia, Kirti Wankhede, Tarun Gupta (SW-GPU),
	Vikram Sethi, Andy Currid, Dheeraj Nigam, Uday Dhoke, qemu-arm,
	qemu-devel

On Tue, 27 Feb 2024 08:37:15 +0000
Ankit Agrawal <ankita@nvidia.com> wrote:

> Thanks Jonathan for reviewing the change.
> 
> Comments inline.
> 
> >> The structure needs a PCI device handle [2] that consists of the device BDF.
> >> The vfio-pci device corresponding to the acpi-generic-initiator object is
> >> located to determine the BDF.
> >>
> >> [1] ACPI Spec 6.3, Section 5.2.16.6
> >> [2] ACPI Spec 6.3, Table 5.80
> >>
> >> Signed-off-by: Ankit Agrawal <ankita@nvidia.com>  
> >Hi Ankit,
> >
> > As the code stands the use of a list seems overkill.  
> 
> Yeah, I will try out your suggestion.
> 
> > Otherwise looks good to me.  I need Generic Ports support for CXL
> > stuff so will copy your approach for that as it's ended up nice
> > and simple.
> > 
> > Jonathan  
> 
> Nice, would be good to have uniform implementations.

I've been messing around with this today.

They differ only very trivially.
2 Options.
1) Have acpi-generic-port inherit from acpi-generic-initiator.
   Works but implies a relationship that isn't really true.
2) Add an abstract base class. I've called it acpi-generic-node
   and have bother acpi-generic-initiator and acpi-generic-port
   inherit from it.

The second feels more natural but is a tiny bit more code (a few
more empty init / finalize functions.

If we are going to end up with an abstract base 'object' it
will be cleaner to do this all as one series if you don't mind
carrying the generic port stuff as well? Or I can smash the
two series together and send out an updated version that hopefully
meets both our requirements (+ tests etc).

I'm just running tests against the CXL qos / generic port code
but assuming all goes well can share my additional changes
in next day or two.

Jonathan




^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure
  2024-02-27 17:11       ` Jonathan Cameron via
@ 2024-02-27 17:36         ` Jonathan Cameron via
  2024-02-29 15:59           ` Jonathan Cameron via
  0 siblings, 1 reply; 37+ messages in thread
From: Jonathan Cameron via @ 2024-02-27 17:36 UTC (permalink / raw)
  To: Jonathan Cameron via
  Cc: Jonathan Cameron, Ankit Agrawal, Jason Gunthorpe,
	alex.williamson, clg, shannon.zhaosl, peter.maydell, ani,
	berrange, eduardo, imammedo, mst, eblake, armbru, david, gshan,
	Zhi Wang, Matt Ochs, pbonzini, Aniket Agashe, Neo Jia,
	Kirti Wankhede, Tarun Gupta (SW-GPU),
	Vikram Sethi, Andy Currid, Dheeraj Nigam, Uday Dhoke, qemu-arm

On Tue, 27 Feb 2024 17:11:15 +0000
Jonathan Cameron via <qemu-devel@nongnu.org> wrote:

> On Tue, 27 Feb 2024 08:37:15 +0000
> Ankit Agrawal <ankita@nvidia.com> wrote:
> 
> > Thanks Jonathan for reviewing the change.
> > 
> > Comments inline.
> >   
> > >> The structure needs a PCI device handle [2] that consists of the device BDF.
> > >> The vfio-pci device corresponding to the acpi-generic-initiator object is
> > >> located to determine the BDF.
> > >>
> > >> [1] ACPI Spec 6.3, Section 5.2.16.6
> > >> [2] ACPI Spec 6.3, Table 5.80
> > >>
> > >> Signed-off-by: Ankit Agrawal <ankita@nvidia.com>    
> > >Hi Ankit,
> > >
> > > As the code stands the use of a list seems overkill.    
> > 
> > Yeah, I will try out your suggestion.
> >   
> > > Otherwise looks good to me.  I need Generic Ports support for CXL
> > > stuff so will copy your approach for that as it's ended up nice
> > > and simple.
> > > 
> > > Jonathan    
> > 
> > Nice, would be good to have uniform implementations.  
> 
> I've been messing around with this today.
> 
> They differ only very trivially.
> 2 Options.
> 1) Have acpi-generic-port inherit from acpi-generic-initiator.
>    Works but implies a relationship that isn't really true.
> 2) Add an abstract base class. I've called it acpi-generic-node
>    and have bother acpi-generic-initiator and acpi-generic-port
>    inherit from it.
> 
> The second feels more natural but is a tiny bit more code (a few
> more empty init / finalize functions.
> 
> If we are going to end up with an abstract base 'object' it
> will be cleaner to do this all as one series if you don't mind
> carrying the generic port stuff as well? Or I can smash the
> two series together and send out an updated version that hopefully
> meets both our requirements (+ tests etc).
> 
> I'm just running tests against the CXL qos / generic port code
> but assuming all goes well can share my additional changes
> in next day or two.
> 
> Jonathan

One more thing.  Right now we can't use Generic Initiators as
HMAT initiators.  That also wants fixing given that's their
normal usecase rather than what you are using them for so it
should 'work'.

Jonathan

> 
> 
> 



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 1/2] qom: new object to associate device to numa node
  2024-02-27 13:00   ` Jonathan Cameron via
@ 2024-02-28  5:35     ` Ankit Agrawal
  0 siblings, 0 replies; 37+ messages in thread
From: Ankit Agrawal @ 2024-02-28  5:35 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Jason Gunthorpe, alex.williamson, clg, shannon.zhaosl,
	peter.maydell, ani, berrange, eduardo, imammedo, mst, eblake,
	armbru, david, gshan, Zhi Wang, Matt Ochs, pbonzini,
	Aniket Agashe, Neo Jia, Kirti Wankhede, Tarun Gupta (SW-GPU),
	Vikram Sethi, Andy Currid, Dheeraj Nigam, Uday Dhoke, qemu-arm,
	qemu-devel

>> diff --git a/include/hw/acpi/acpi-generic-initiator.h b/include/hw/acpi/acpi-generic-initiator.h
>> new file mode 100644
>> index 0000000000..2f183b029a
>> --- /dev/null
>> +++ b/include/hw/acpi/acpi-generic-initiator.h
>
>> +typedef struct AcpiGenericInitiatorClass {
>> +        ObjectClass parent_class;
>Too indented.

Yes, will fix it.

>> +} AcpiGenericInitiatorClass;


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 1/2] qom: new object to associate device to numa node
  2024-02-23 12:42 ` [PATCH v7 1/2] qom: new object to associate device to numa node ankita
  2024-02-27 13:00   ` Jonathan Cameron via
@ 2024-02-28  7:35   ` Markus Armbruster
  2024-02-28 13:55     ` Jonathan Cameron via
  1 sibling, 1 reply; 37+ messages in thread
From: Markus Armbruster @ 2024-02-28  7:35 UTC (permalink / raw)
  To: ankita
  Cc: jgg, alex.williamson, clg, shannon.zhaosl, peter.maydell, ani,
	berrange, eduardo, imammedo, mst, eblake, armbru, david, gshan,
	Jonathan.Cameron, zhiw, mochs, pbonzini, aniketa, cjia,
	kwankhede, targupta, vsethi, acurrid, dnigam, udhoke, qemu-arm,
	qemu-devel

<ankita@nvidia.com> writes:

> From: Ankit Agrawal <ankita@nvidia.com>
>
> NVIDIA GPU's support MIG (Mult-Instance GPUs) feature [1], which allows
> partitioning of the GPU device resources (including device memory) into
> several (upto 8) isolated instances. Each of the partitioned memory needs
> a dedicated NUMA node to operate. The partitions are not fixed and they
> can be created/deleted at runtime.
>
> Unfortunately Linux OS does not provide a means to dynamically create/destroy
> NUMA nodes and such feature implementation is not expected to be trivial. The
> nodes that OS discovers at the boot time while parsing SRAT remains fixed. So
> we utilize the Generic Initiator Affinity structures that allows association
> between nodes and devices. Multiple GI structures per BDF is possible,
> allowing creation of multiple nodes by exposing unique PXM in each of these
> structures.
>
> Implement the mechanism to build the GI affinity structures as Qemu currently
> does not. Introduce a new acpi-generic-initiator object to allow host admin
> link a device with an associated NUMA node. Qemu maintains this association
> and use this object to build the requisite GI Affinity Structure.
>
> When multiple numa nodes are associated with a device, it is required to
> create those many number of acpi-generic-initiator objects, each representing
> a unique device:node association.
>
> Following is one of a decoded GI affinity structure in VM ACPI SRAT.
> [0C8h 0200   1]                Subtable Type : 05 [Generic Initiator Affinity]
> [0C9h 0201   1]                       Length : 20
>
> [0CAh 0202   1]                    Reserved1 : 00
> [0CBh 0203   1]           Device Handle Type : 01
> [0CCh 0204   4]             Proximity Domain : 00000007
> [0D0h 0208  16]                Device Handle : 00 00 20 00 00 00 00 00 00 00 00
> 00 00 00 00 00
> [0E0h 0224   4]        Flags (decoded below) : 00000001
>                                      Enabled : 1
> [0E4h 0228   4]                    Reserved2 : 00000000
>
> [0E8h 0232   1]                Subtable Type : 05 [Generic Initiator Affinity]
> [0E9h 0233   1]                       Length : 20
>
> An admin can provide a range of acpi-generic-initiator objects, each
> associating a device (by providing the id through pci-dev argument)
> to the desired numa node (using the node argument). Currently, only PCI
> device is supported.
>
> For the grace hopper system, create a range of 8 nodes and associate that
> with the device using the acpi-generic-initiator object. While a configuration
> of less than 8 nodes per device is allowed, such configuration will prevent
> utilization of the feature to the fullest. The following sample creates 8
> nodes per PCI device for a VM with 2 PCI devices and link them to the
> respecitve PCI device using acpi-generic-initiator objects:
>
> -numa node,nodeid=2 -numa node,nodeid=3 -numa node,nodeid=4 \
> -numa node,nodeid=5 -numa node,nodeid=6 -numa node,nodeid=7 \
> -numa node,nodeid=8 -numa node,nodeid=9 \
> -device vfio-pci-nohotplug,host=0009:01:00.0,bus=pcie.0,addr=04.0,rombar=0,id=dev0 \
> -object acpi-generic-initiator,id=gi0,pci-dev=dev0,node=2 \
> -object acpi-generic-initiator,id=gi1,pci-dev=dev0,node=3 \
> -object acpi-generic-initiator,id=gi2,pci-dev=dev0,node=4 \
> -object acpi-generic-initiator,id=gi3,pci-dev=dev0,node=5 \
> -object acpi-generic-initiator,id=gi4,pci-dev=dev0,node=6 \
> -object acpi-generic-initiator,id=gi5,pci-dev=dev0,node=7 \
> -object acpi-generic-initiator,id=gi6,pci-dev=dev0,node=8 \
> -object acpi-generic-initiator,id=gi7,pci-dev=dev0,node=9 \
>
> -numa node,nodeid=10 -numa node,nodeid=11 -numa node,nodeid=12 \
> -numa node,nodeid=13 -numa node,nodeid=14 -numa node,nodeid=15 \
> -numa node,nodeid=16 -numa node,nodeid=17 \
> -device vfio-pci-nohotplug,host=0009:01:01.0,bus=pcie.0,addr=05.0,rombar=0,id=dev1 \
> -object acpi-generic-initiator,id=gi8,pci-dev=dev1,node=10 \
> -object acpi-generic-initiator,id=gi9,pci-dev=dev1,node=11 \
> -object acpi-generic-initiator,id=gi10,pci-dev=dev1,node=12 \
> -object acpi-generic-initiator,id=gi11,pci-dev=dev1,node=13 \
> -object acpi-generic-initiator,id=gi12,pci-dev=dev1,node=14 \
> -object acpi-generic-initiator,id=gi13,pci-dev=dev1,node=15 \
> -object acpi-generic-initiator,id=gi14,pci-dev=dev1,node=16 \
> -object acpi-generic-initiator,id=gi15,pci-dev=dev1,node=17 \
>
> The performance benefits can be realized by providing the NUMA node distances
> appropriately (through libvirt tags or Qemu params). The admin can get the
> distance among nodes in hardware using `numactl -H`.
>
> [1] https://www.nvidia.com/en-in/technologies/multi-instance-gpu
>
> Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
> ---
>  hw/acpi/acpi-generic-initiator.c         | 70 ++++++++++++++++++++++++
>  hw/acpi/meson.build                      |  1 +
>  include/hw/acpi/acpi-generic-initiator.h | 32 +++++++++++
>  qapi/qom.json                            | 17 ++++++
>  4 files changed, 120 insertions(+)
>  create mode 100644 hw/acpi/acpi-generic-initiator.c
>  create mode 100644 include/hw/acpi/acpi-generic-initiator.h
>
> diff --git a/hw/acpi/acpi-generic-initiator.c b/hw/acpi/acpi-generic-initiator.c
> new file mode 100644
> index 0000000000..1ade2f723f
> --- /dev/null
> +++ b/hw/acpi/acpi-generic-initiator.c
> @@ -0,0 +1,70 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved
> + */
> +
> +#include "qemu/osdep.h"
> +#include "hw/acpi/acpi-generic-initiator.h"
> +#include "hw/pci/pci_device.h"
> +#include "qapi/error.h"
> +#include "qapi/qapi-builtin-visit.h"
> +#include "qapi/visitor.h"
> +#include "qemu/error-report.h"
> +
> +OBJECT_DEFINE_TYPE_WITH_INTERFACES(AcpiGenericInitiator, acpi_generic_initiator,
> +                   ACPI_GENERIC_INITIATOR, OBJECT,
> +                   { TYPE_USER_CREATABLE },
> +                   { NULL })
> +
> +OBJECT_DECLARE_SIMPLE_TYPE(AcpiGenericInitiator, ACPI_GENERIC_INITIATOR)
> +
> +static void acpi_generic_initiator_init(Object *obj)
> +{
> +    AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj);
> +
> +    gi->node = MAX_NODES;
> +    gi->pci_dev = NULL;
> +}
> +
> +static void acpi_generic_initiator_finalize(Object *obj)
> +{
> +    AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj);
> +
> +    g_free(gi->pci_dev);
> +}
> +
> +static void acpi_generic_initiator_set_pci_device(Object *obj, const char *val,
> +                                                  Error **errp)
> +{
> +    AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj);
> +
> +    gi->pci_dev = g_strdup(val);
> +}
> +
> +static void acpi_generic_initiator_set_node(Object *obj, Visitor *v,
> +                                            const char *name, void *opaque,
> +                                            Error **errp)
> +{
> +    AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj);
> +    uint32_t value;
> +
> +    if (!visit_type_uint32(v, name, &value, errp)) {
> +        return;
> +    }
> +
> +    if (value >= MAX_NODES) {
> +        error_printf("%s: Invalid NUMA node specified\n",
> +                     TYPE_ACPI_GENERIC_INITIATOR);
> +        exit(1);
> +    }
> +
> +    gi->node = value;
> +}
> +
> +static void acpi_generic_initiator_class_init(ObjectClass *oc, void *data)
> +{
> +    object_class_property_add_str(oc, "pci-dev", NULL,
> +        acpi_generic_initiator_set_pci_device);
> +    object_class_property_add(oc, "node", "int", NULL,
> +        acpi_generic_initiator_set_node, NULL, NULL);
> +}
> diff --git a/hw/acpi/meson.build b/hw/acpi/meson.build
> index fc1b952379..2268589519 100644
> --- a/hw/acpi/meson.build
> +++ b/hw/acpi/meson.build
> @@ -1,5 +1,6 @@
>  acpi_ss = ss.source_set()
>  acpi_ss.add(files(
> +  'acpi-generic-initiator.c',
>    'acpi_interface.c',
>    'aml-build.c',
>    'bios-linker-loader.c',
> diff --git a/include/hw/acpi/acpi-generic-initiator.h b/include/hw/acpi/acpi-generic-initiator.h
> new file mode 100644
> index 0000000000..2f183b029a
> --- /dev/null
> +++ b/include/hw/acpi/acpi-generic-initiator.h
> @@ -0,0 +1,32 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved
> + */
> +
> +#ifndef ACPI_GENERIC_INITIATOR_H
> +#define ACPI_GENERIC_INITIATOR_H
> +
> +#include "hw/mem/pc-dimm.h"
> +#include "hw/acpi/bios-linker-loader.h"
> +#include "hw/acpi/aml-build.h"
> +#include "sysemu/numa.h"
> +#include "qemu/uuid.h"
> +#include "qom/object.h"
> +#include "qom/object_interfaces.h"
> +
> +#define TYPE_ACPI_GENERIC_INITIATOR "acpi-generic-initiator"
> +
> +typedef struct AcpiGenericInitiator {
> +    /* private */
> +    Object parent;
> +
> +    /* public */
> +    char *pci_dev;
> +    uint16_t node;
> +} AcpiGenericInitiator;
> +
> +typedef struct AcpiGenericInitiatorClass {
> +        ObjectClass parent_class;
> +} AcpiGenericInitiatorClass;
> +
> +#endif
> diff --git a/qapi/qom.json b/qapi/qom.json
> index c53ef978ff..7efa0e14f6 100644
> --- a/qapi/qom.json
> +++ b/qapi/qom.json
> @@ -794,6 +794,21 @@
>  { 'struct': 'VfioUserServerProperties',
>    'data': { 'socket': 'SocketAddress', 'device': 'str' } }
>  
> +##
> +# @AcpiGenericInitiatorProperties:
> +#
> +# Properties for acpi-generic-initiator objects.
> +#
> +# @pci-dev: PCI device ID to be associated with the node
> +#
> +# @node: numa node associated with the PCI device

NUMA

> +#
> +# Since: 9.0
> +##
> +{ 'struct': 'AcpiGenericInitiatorProperties',
> +  'data': { 'pci-dev': 'str',
> +            'node': 'uint32' } }
> +
>  ##
>  # @RngProperties:
>  #
> @@ -911,6 +926,7 @@
>  ##
>  { 'enum': 'ObjectType',
>    'data': [
> +    'acpi-generic-initiator',
>      'authz-list',
>      'authz-listfile',
>      'authz-pam',
> @@ -981,6 +997,7 @@
>              'id': 'str' },
>    'discriminator': 'qom-type',
>    'data': {
> +      'acpi-generic-initiator':     'AcpiGenericInitiatorProperties',
>        'authz-list':                 'AuthZListProperties',
>        'authz-listfile':             'AuthZListFileProperties',
>        'authz-pam':                  'AuthZPAMProperties',

Jonathan, you pointed out interface design issues in your review of v2.
Are you fully satisfied with the interface in v3?



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 1/2] qom: new object to associate device to numa node
  2024-02-28  7:35   ` Markus Armbruster
@ 2024-02-28 13:55     ` Jonathan Cameron via
  2024-02-28 16:08       ` Markus Armbruster
  0 siblings, 1 reply; 37+ messages in thread
From: Jonathan Cameron via @ 2024-02-28 13:55 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: ankita, jgg, alex.williamson, clg, shannon.zhaosl, peter.maydell,
	ani, berrange, eduardo, imammedo, mst, eblake, david, gshan,
	zhiw, mochs, pbonzini, aniketa, cjia, kwankhede, targupta,
	vsethi, acurrid, dnigam, udhoke, qemu-arm, qemu-devel


> >  ##
> >  # @RngProperties:
> >  #
> > @@ -911,6 +926,7 @@
> >  ##
> >  { 'enum': 'ObjectType',
> >    'data': [
> > +    'acpi-generic-initiator',
> >      'authz-list',
> >      'authz-listfile',
> >      'authz-pam',
> > @@ -981,6 +997,7 @@
> >              'id': 'str' },
> >    'discriminator': 'qom-type',
> >    'data': {
> > +      'acpi-generic-initiator':     'AcpiGenericInitiatorProperties',
> >        'authz-list':                 'AuthZListProperties',
> >        'authz-listfile':             'AuthZListFileProperties',
> >        'authz-pam':                  'AuthZPAMProperties',  
> 
> Jonathan, you pointed out interface design issues in your review of v2.
> Are you fully satisfied with the interface in v3?

Yes. I'm fine with the interface in this version (though it's v7, so I'm lost
on v2 vs v3!)

Jonathan

> 



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 1/2] qom: new object to associate device to numa node
  2024-02-28 13:55     ` Jonathan Cameron via
@ 2024-02-28 16:08       ` Markus Armbruster
  2024-02-28 16:50         ` Ankit Agrawal
  0 siblings, 1 reply; 37+ messages in thread
From: Markus Armbruster @ 2024-02-28 16:08 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: ankita, jgg, alex.williamson, clg, shannon.zhaosl, peter.maydell,
	ani, berrange, eduardo, imammedo, mst, eblake, david, gshan,
	zhiw, mochs, pbonzini, aniketa, cjia, kwankhede, targupta,
	vsethi, acurrid, dnigam, udhoke, qemu-arm, qemu-devel

Jonathan Cameron <Jonathan.Cameron@Huawei.com> writes:

>> >  ##
>> >  # @RngProperties:
>> >  #
>> > @@ -911,6 +926,7 @@
>> >  ##
>> >  { 'enum': 'ObjectType',
>> >    'data': [
>> > +    'acpi-generic-initiator',
>> >      'authz-list',
>> >      'authz-listfile',
>> >      'authz-pam',
>> > @@ -981,6 +997,7 @@
>> >              'id': 'str' },
>> >    'discriminator': 'qom-type',
>> >    'data': {
>> > +      'acpi-generic-initiator':     'AcpiGenericInitiatorProperties',
>> >        'authz-list':                 'AuthZListProperties',
>> >        'authz-listfile':             'AuthZListFileProperties',
>> >        'authz-pam':                  'AuthZPAMProperties',  
>> 
>> Jonathan, you pointed out interface design issues in your review of v2.
>> Are you fully satisfied with the interface in v3?
>
> Yes. I'm fine with the interface in this version (though it's v7, so I'm lost
> on v2 vs v3!)

Looks like I can't count to 7!

With NUMA capitalized in the doc comment, QAPI schema
Acked-by: Markus Armbruster <armbru@redhat.com>

Thanks!



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 1/2] qom: new object to associate device to numa node
  2024-02-28 16:08       ` Markus Armbruster
@ 2024-02-28 16:50         ` Ankit Agrawal
  2024-02-29 10:22           ` Jonathan Cameron via
  0 siblings, 1 reply; 37+ messages in thread
From: Ankit Agrawal @ 2024-02-28 16:50 UTC (permalink / raw)
  To: Markus Armbruster, Jonathan Cameron
  Cc: Jason Gunthorpe, alex.williamson, clg, shannon.zhaosl,
	peter.maydell, ani, berrange, eduardo, imammedo, mst, eblake,
	david, gshan, Zhi Wang, Matt Ochs, pbonzini, Aniket Agashe,
	Neo Jia, Kirti Wankhede, Tarun Gupta (SW-GPU),
	Vikram Sethi, Andy Currid, Dheeraj Nigam, Uday Dhoke, qemu-arm,
	qemu-devel


>>> Jonathan, you pointed out interface design issues in your review of v2.>
>> Are you fully satisfied with the interface in v3?
>>
>> Yes. I'm fine with the interface in this version (though it's v7, so I'm lost
>> on v2 vs v3!)
>
> Looks like I can't count to 7!
>
> With NUMA capitalized in the doc comment, QAPI schema
> Acked-by: Markus Armbruster <armbru@redhat.com>
>
> Thanks!

Thanks! Will fix that in the next version.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 1/2] qom: new object to associate device to numa node
  2024-02-28 16:50         ` Ankit Agrawal
@ 2024-02-29 10:22           ` Jonathan Cameron via
  2024-02-29 13:00             ` Ankit Agrawal
  0 siblings, 1 reply; 37+ messages in thread
From: Jonathan Cameron via @ 2024-02-29 10:22 UTC (permalink / raw)
  To: Ankit Agrawal
  Cc: Markus Armbruster, Jason Gunthorpe, alex.williamson, clg,
	shannon.zhaosl, peter.maydell, ani, berrange, eduardo, imammedo,
	mst, eblake, david, gshan, Zhi Wang, Matt Ochs, pbonzini,
	Aniket Agashe, Neo Jia, Kirti Wankhede, Tarun Gupta (SW-GPU),
	Vikram Sethi, Andy Currid, Dheeraj Nigam, Uday Dhoke, qemu-arm,
	qemu-devel

On Wed, 28 Feb 2024 16:50:30 +0000
Ankit Agrawal <ankita@nvidia.com> wrote:

> >>> Jonathan, you pointed out interface design issues in your review of v2.>  
> >> Are you fully satisfied with the interface in v3?
> >>
> >> Yes. I'm fine with the interface in this version (though it's v7, so I'm lost
> >> on v2 vs v3!)  
> >
> > Looks like I can't count to 7!
> >
> > With NUMA capitalized in the doc comment, QAPI schema
> > Acked-by: Markus Armbruster <armbru@redhat.com>
> >
> > Thanks!  
> 
> Thanks! Will fix that in the next version.

The following is really me arguing with myself, so can probably be
ignored, but maybe it will spark an idea from someone else!

One trivial tweak that might make our life easier if anyone adds
support in the future for the other device handle type might be to go
with simply dev rather than pci-dev.

There is a sticky corner though if a device is a PCI device
and in ACPI DSDT so maybe we are better off adding acpi-dev
to take either pci-dev or acpi-dev?

Annoyingly for generic ports, (I'm reusing this infrastructure here)
the kernel code currently only deals with the ACPI form (for CXL host
bridges).  Given I point that at the bus of a PXB_CXL it is both
a PCI device, and the only handle we have for getting to the
Root Bridge ACPI handle.

So I think I've argued myself around to thinking we need to extend
the interface with another optional parameter if we ever do support
the ACPI handle for generic initiators :(

Jonathan


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure
  2024-02-26 16:42   ` Jonathan Cameron via
  2024-02-27  8:37     ` Ankit Agrawal
@ 2024-02-29 11:43     ` Ankit Agrawal
  2024-02-29 12:17       ` Jonathan Cameron via
  2024-03-05  5:59     ` Ankit Agrawal
  2024-03-06  9:12     ` Jonathan Cameron via
  3 siblings, 1 reply; 37+ messages in thread
From: Ankit Agrawal @ 2024-02-29 11:43 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Jason Gunthorpe, alex.williamson, clg, shannon.zhaosl,
	peter.maydell, ani, berrange, eduardo, imammedo, mst, eblake,
	armbru, david, gshan, Zhi Wang, Matt Ochs, pbonzini,
	Aniket Agashe, Neo Jia, Kirti Wankhede, Tarun Gupta (SW-GPU),
	Vikram Sethi, Andy Currid, Dheeraj Nigam, Uday Dhoke, qemu-arm,
	qemu-devel

> One thing I forgot.
>
> Please add a test.  tests/qtest/bios-tables-test.c

IIUC, we need to add a test for aarch64 to test the interface with the
acpi-generic-initiator object.

> + relevant table dumps.

Sorry it isn't clear where do you want me to add this. In the git commit
message?


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure
  2024-02-27 12:53   ` Jonathan Cameron via
@ 2024-02-29 11:46     ` Ankit Agrawal
  2024-02-29 12:20       ` Jonathan Cameron via
  0 siblings, 1 reply; 37+ messages in thread
From: Ankit Agrawal @ 2024-02-29 11:46 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Jason Gunthorpe, alex.williamson, clg, shannon.zhaosl,
	peter.maydell, ani, berrange, eduardo, imammedo, mst, eblake,
	armbru, david, gshan, Zhi Wang, Matt Ochs, pbonzini,
	Aniket Agashe, Neo Jia, Kirti Wankhede, Tarun Gupta (SW-GPU),
	Vikram Sethi, Andy Currid, Dheeraj Nigam, Uday Dhoke, qemu-arm,
	qemu-devel

>> ---
>>  hw/acpi/acpi-generic-initiator.c         | 84 ++++++++++++++++++++++++
>>  hw/arm/virt-acpi-build.c                 |  3 +
>>  include/hw/acpi/acpi-generic-initiator.h | 26 ++++++++
> A few more comments.
>
> Maybe _ rather than - as more common for acpi include naming.

Ack. will change the name.

> I also wonder if we need the acpi prefix for file names given context?

I tried to keep it to match the object name. If you have preference for
not having it, I can change that too.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure
  2024-02-29 11:43     ` Ankit Agrawal
@ 2024-02-29 12:17       ` Jonathan Cameron via
  2024-02-29 12:24         ` Ankit Agrawal
  0 siblings, 1 reply; 37+ messages in thread
From: Jonathan Cameron via @ 2024-02-29 12:17 UTC (permalink / raw)
  To: Ankit Agrawal
  Cc: Jason Gunthorpe, alex.williamson, clg, shannon.zhaosl,
	peter.maydell, ani, berrange, eduardo, imammedo, mst, eblake,
	armbru, david, gshan, Zhi Wang, Matt Ochs, pbonzini,
	Aniket Agashe, Neo Jia, Kirti Wankhede, Tarun Gupta (SW-GPU),
	Vikram Sethi, Andy Currid, Dheeraj Nigam, Uday Dhoke, qemu-arm,
	qemu-devel

On Thu, 29 Feb 2024 11:43:44 +0000
Ankit Agrawal <ankita@nvidia.com> wrote:

> > One thing I forgot.
> >
> > Please add a test.  tests/qtest/bios-tables-test.c  
> 
> IIUC, we need to add a test for aarch64 to test the interface with the
> acpi-generic-initiator object.
> 
> > + relevant table dumps.  
> 
> Sorry it isn't clear where do you want me to add this. In the git commit
> message?

I meant as part of the test.  Typically you then include the relevant snippet
as part of the commit message to show what the key part of the test is.

Thanks,

Jonathan


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure
  2024-02-29 11:46     ` Ankit Agrawal
@ 2024-02-29 12:20       ` Jonathan Cameron via
  0 siblings, 0 replies; 37+ messages in thread
From: Jonathan Cameron via @ 2024-02-29 12:20 UTC (permalink / raw)
  To: Ankit Agrawal
  Cc: Jason Gunthorpe, alex.williamson, clg, shannon.zhaosl,
	peter.maydell, ani, berrange, eduardo, imammedo, mst, eblake,
	armbru, david, gshan, Zhi Wang, Matt Ochs, pbonzini,
	Aniket Agashe, Neo Jia, Kirti Wankhede, Tarun Gupta (SW-GPU),
	Vikram Sethi, Andy Currid, Dheeraj Nigam, Uday Dhoke, qemu-arm,
	qemu-devel

On Thu, 29 Feb 2024 11:46:27 +0000
Ankit Agrawal <ankita@nvidia.com> wrote:

> >> ---
> >>  hw/acpi/acpi-generic-initiator.c         | 84 ++++++++++++++++++++++++
> >>  hw/arm/virt-acpi-build.c                 |  3 +
> >>  include/hw/acpi/acpi-generic-initiator.h | 26 ++++++++  
> > A few more comments.
> >
> > Maybe _ rather than - as more common for acpi include naming.  
> 
> Ack. will change the name.
> 
> > I also wonder if we need the acpi prefix for file names given context?  
> 
> I tried to keep it to match the object name. If you have preference for
> not having it, I can change that too.

Not my area, so up to maintainers of hw/acpi to comment if they care!

I'd not wait for that though before v8.

*impatient user who has some other stuff queued on top of this!*
:)

Jonathan


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure
  2024-02-29 12:17       ` Jonathan Cameron via
@ 2024-02-29 12:24         ` Ankit Agrawal
  0 siblings, 0 replies; 37+ messages in thread
From: Ankit Agrawal @ 2024-02-29 12:24 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Jason Gunthorpe, alex.williamson, clg, shannon.zhaosl,
	peter.maydell, ani, berrange, eduardo, imammedo, mst, eblake,
	armbru, david, gshan, Zhi Wang, Matt Ochs, pbonzini,
	Aniket Agashe, Neo Jia, Kirti Wankhede, Tarun Gupta (SW-GPU),
	Vikram Sethi, Andy Currid, Dheeraj Nigam, Uday Dhoke, qemu-arm,
	qemu-devel

>> > One thing I forgot.
>> >
>> > Please add a test.  tests/qtest/bios-tables-test.c
>>
>> IIUC, we need to add a test for aarch64 to test the interface with the
>> acpi-generic-initiator object.
>>
>> > + relevant table dumps.
>>
>> Sorry it isn't clear where do you want me to add this. In the git commit
>> message?
>
> I meant as part of the test.  Typically you then include the relevant snippet
> as part of the commit message to show what the key part of the test is.

Got it, thanks!

> Thanks,
>
> Jonathan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 1/2] qom: new object to associate device to numa node
  2024-02-29 10:22           ` Jonathan Cameron via
@ 2024-02-29 13:00             ` Ankit Agrawal
  2024-02-29 16:32               ` Jonathan Cameron via
  0 siblings, 1 reply; 37+ messages in thread
From: Ankit Agrawal @ 2024-02-29 13:00 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Markus Armbruster, Jason Gunthorpe, alex.williamson, clg,
	shannon.zhaosl, peter.maydell, ani, berrange, eduardo, imammedo,
	mst, eblake, david, gshan, Zhi Wang, Matt Ochs, pbonzini,
	Aniket Agashe, Neo Jia, Kirti Wankhede, Tarun Gupta (SW-GPU),
	Vikram Sethi, Andy Currid, Dheeraj Nigam, Uday Dhoke, qemu-arm,
	qemu-devel

>> >>> Jonathan, you pointed out interface design issues in your review of v2.>
>> >> Are you fully satisfied with the interface in v3?
>> >>
>> >> Yes. I'm fine with the interface in this version (though it's v7, so I'm lost
>> >> on v2 vs v3!)
>> >
>> > Looks like I can't count to 7!
>> >
>> > With NUMA capitalized in the doc comment, QAPI schema
>> > Acked-by: Markus Armbruster <armbru@redhat.com>
>> >
>> > Thanks!
>>
>> Thanks! Will fix that in the next version.
>
> The following is really me arguing with myself, so can probably be
> ignored, but maybe it will spark an idea from someone else!
>
> One trivial tweak that might make our life easier if anyone adds
> support in the future for the other device handle type might be to go
> with simply dev rather than pci-dev.
>
> There is a sticky corner though if a device is a PCI device
> and in ACPI DSDT so maybe we are better off adding acpi-dev
> to take either pci-dev or acpi-dev?

That use case does complicate the situation. Do you of any such
use case for generic initiator?

As for your suggestion of using acpi-dev as the arg to take both
pci-dev and acpi-dev.. Would that mean sending a pure pci device
(not the corner case you mentioned) through the acpi-dev argument
as well? Not sure if that would appropriate.

> Annoyingly for generic ports, (I'm reusing this infrastructure here)
> the kernel code currently only deals with the ACPI form (for CXL host
> bridges).  Given I point that at the bus of a PXB_CXL it is both
> a PCI device, and the only handle we have for getting to the
> Root Bridge ACPI handle.

So IIUC, you need to pass a PCI device to the generic port object, but use
that to reach the ACPI handle and build the Generic port affinity structure
for an ACPI device?

> So I think I've argued myself around to thinking we need to extend
> the interface with another optional parameter if we ever do support
> the ACPI handle for generic initiators :(
>
> Jonathan

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure
  2024-02-27 17:36         ` Jonathan Cameron via
@ 2024-02-29 15:59           ` Jonathan Cameron via
  2024-03-01  8:30             ` Ankit Agrawal
  0 siblings, 1 reply; 37+ messages in thread
From: Jonathan Cameron via @ 2024-02-29 15:59 UTC (permalink / raw)
  To: Jonathan Cameron via
  Cc: Ankit Agrawal, Jason Gunthorpe, alex.williamson, clg,
	shannon.zhaosl, peter.maydell, ani, berrange, eduardo, imammedo,
	mst, eblake, armbru, david, gshan, Zhi Wang, Matt Ochs, pbonzini,
	Aniket Agashe, Neo Jia, Kirti Wankhede, Tarun Gupta (SW-GPU),
	Vikram Sethi, Andy Currid, Dheeraj Nigam, Uday Dhoke, qemu-arm

On Tue, 27 Feb 2024 17:36:21 +0000
Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:

> On Tue, 27 Feb 2024 17:11:15 +0000
> Jonathan Cameron via <qemu-devel@nongnu.org> wrote:
> 
> > On Tue, 27 Feb 2024 08:37:15 +0000
> > Ankit Agrawal <ankita@nvidia.com> wrote:
> >   
> > > Thanks Jonathan for reviewing the change.
> > > 
> > > Comments inline.
> > >     
> > > >> The structure needs a PCI device handle [2] that consists of the device BDF.
> > > >> The vfio-pci device corresponding to the acpi-generic-initiator object is
> > > >> located to determine the BDF.
> > > >>
> > > >> [1] ACPI Spec 6.3, Section 5.2.16.6
> > > >> [2] ACPI Spec 6.3, Table 5.80
> > > >>
> > > >> Signed-off-by: Ankit Agrawal <ankita@nvidia.com>      
> > > >Hi Ankit,
> > > >
> > > > As the code stands the use of a list seems overkill.      
> > > 
> > > Yeah, I will try out your suggestion.
> > >     
> > > > Otherwise looks good to me.  I need Generic Ports support for CXL
> > > > stuff so will copy your approach for that as it's ended up nice
> > > > and simple.
> > > > 
> > > > Jonathan      
> > > 
> > > Nice, would be good to have uniform implementations.    
> > 
> > I've been messing around with this today.
> > 
> > They differ only very trivially.
> > 2 Options.
> > 1) Have acpi-generic-port inherit from acpi-generic-initiator.
> >    Works but implies a relationship that isn't really true.
> > 2) Add an abstract base class. I've called it acpi-generic-node
> >    and have bother acpi-generic-initiator and acpi-generic-port
> >    inherit from it.
> > 
> > The second feels more natural but is a tiny bit more code (a few
> > more empty init / finalize functions.
> > 
> > If we are going to end up with an abstract base 'object' it
> > will be cleaner to do this all as one series if you don't mind
> > carrying the generic port stuff as well? Or I can smash the
> > two series together and send out an updated version that hopefully
> > meets both our requirements (+ tests etc).
> > 
> > I'm just running tests against the CXL qos / generic port code
> > but assuming all goes well can share my additional changes
> > in next day or two.
> > 
> > Jonathan  
> 
> One more thing.  Right now we can't use Generic Initiators as
> HMAT initiators.  That also wants fixing given that's their
> normal usecase rather than what you are using them for so it
> should 'work'.

Something along the lines of this will do the job.


diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
index 4173ef2afa..825cfe86bc 100644
--- a/include/sysemu/numa.h
+++ b/include/sysemu/numa.h
@@ -41,6 +41,7 @@ struct NodeInfo {
     struct HostMemoryBackend *node_memdev;
     bool present;
     bool has_cpu;
+    bool has_gi;
     uint8_t lb_info_provided;
     uint16_t initiator;
     uint8_t distance[MAX_NODES];
diff --git a/hw/acpi/acpi-generic-initiator.c b/hw/acpi/acpi-generic-initiator.c
index 9179590a42..8a67300320 100644
--- a/hw/acpi/acpi-generic-initiator.c
+++ b/hw/acpi/acpi-generic-initiator.c
@@ -6,6 +6,7 @@
 #include "qemu/osdep.h"
 #include "hw/acpi/acpi-generic-initiator.h"
 #include "hw/pci/pci_device.h"
+#include "hw/boards.h"
 #include "qapi/error.h"
 #include "qapi/qapi-builtin-visit.h"
 #include "qapi/visitor.h"
@@ -58,6 +59,7 @@ static void acpi_generic_node_set_node(Object *obj, Visitor *v,
                                        const char *name, void *opaque,
                                        Error **errp)
 {
+    MachineState *ms = MACHINE(qdev_get_machine());
     AcpiGenericNode *gn = ACPI_GENERIC_NODE(obj);
     uint32_t value;
 
@@ -72,6 +74,10 @@ static void acpi_generic_node_set_node(Object *obj, Visitor *v,
     }
 
     gn->node = value;
+
+    if (object_dynamic_cast(obj, TYPE_ACPI_GENERIC_INITIATOR)) {
+        ms->numa_state->nodes[gn->node].has_gi = true;
+    }
 }
 
 static void acpi_generic_node_class_init(ObjectClass *oc, void *data)
diff --git a/hw/acpi/hmat.c b/hw/acpi/hmat.c
index b933ae3c06..9b1662b6b8 100644
--- a/hw/acpi/hmat.c
+++ b/hw/acpi/hmat.c
@@ -225,7 +225,7 @@ static void hmat_build_table_structs(GArray *table_data, NumaState *numa_state)
     }
 
     for (i = 0; i < numa_state->num_nodes; i++) {
-        if (numa_state->nodes[i].has_cpu) {
+        if (numa_state->nodes[i].has_cpu || numa_state->nodes[i].has_gi) {
             initiator_list[num_initiator++] = i;
         }
     }
diff --git a/hw/core/numa.c b/hw/core/numa.c
index f08956ddb0..58a32f1564 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -229,7 +229,8 @@ void parse_numa_hmat_lb(NumaState *numa_state, NumaHmatLBOptions *node,
                    node->target, numa_state->num_nodes);
         return;
     }
-    if (!numa_info[node->initiator].has_cpu) {
+    if (!numa_info[node->initiator].has_cpu &&
+        !numa_info[node->initiator].has_gi) {
         error_setg(errp, "Invalid initiator=%d, it isn't an "
                    "initiator proximity domain", node->initiator);
         return;

> 
> Jonathan
> 
> > 
> > 
> >   
> 



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 1/2] qom: new object to associate device to numa node
  2024-02-29 13:00             ` Ankit Agrawal
@ 2024-02-29 16:32               ` Jonathan Cameron via
  2024-03-01  8:33                 ` Ankit Agrawal
  0 siblings, 1 reply; 37+ messages in thread
From: Jonathan Cameron via @ 2024-02-29 16:32 UTC (permalink / raw)
  To: Ankit Agrawal
  Cc: Markus Armbruster, Jason Gunthorpe, alex.williamson, clg,
	shannon.zhaosl, peter.maydell, ani, berrange, eduardo, imammedo,
	mst, eblake, david, gshan, Zhi Wang, Matt Ochs, pbonzini,
	Aniket Agashe, Neo Jia, Kirti Wankhede, Tarun Gupta (SW-GPU),
	Vikram Sethi, Andy Currid, Dheeraj Nigam, Uday Dhoke, qemu-arm,
	qemu-devel

On Thu, 29 Feb 2024 13:00:27 +0000
Ankit Agrawal <ankita@nvidia.com> wrote:

> >> >>> Jonathan, you pointed out interface design issues in your review of v2.>  
> >> >> Are you fully satisfied with the interface in v3?
> >> >>
> >> >> Yes. I'm fine with the interface in this version (though it's v7, so I'm lost
> >> >> on v2 vs v3!)  
> >> >
> >> > Looks like I can't count to 7!
> >> >
> >> > With NUMA capitalized in the doc comment, QAPI schema
> >> > Acked-by: Markus Armbruster <armbru@redhat.com>
> >> >
> >> > Thanks!  
> >>
> >> Thanks! Will fix that in the next version.  
> >
> > The following is really me arguing with myself, so can probably be
> > ignored, but maybe it will spark an idea from someone else!
> >
> > One trivial tweak that might make our life easier if anyone adds
> > support in the future for the other device handle type might be to go
> > with simply dev rather than pci-dev.
> >
> > There is a sticky corner though if a device is a PCI device
> > and in ACPI DSDT so maybe we are better off adding acpi-dev
> > to take either pci-dev or acpi-dev?  
> 
> That use case does complicate the situation. Do you of any such
> use case for generic initiator?

In physical systems yes - in QEMU not yet, though it's a quirk
of the available ids to get to the ACPI devices (which oddly
are PCI devices :()

> 
> As for your suggestion of using acpi-dev as the arg to take both
> pci-dev and acpi-dev.. Would that mean sending a pure pci device
> (not the corner case you mentioned) through the acpi-dev argument
> as well? Not sure if that would appropriate.

Ah, looking up my description is unclear. I meant two optional parameters
pci-dev or acpi-dev - which one was supplied would indicate the type
of handle to be used.

> 
> > Annoyingly for generic ports, (I'm reusing this infrastructure here)
> > the kernel code currently only deals with the ACPI form (for CXL host
> > bridges).  Given I point that at the bus of a PXB_CXL it is both
> > a PCI device, and the only handle we have for getting to the
> > Root Bridge ACPI handle.  
> 
> So IIUC, you need to pass a PCI device to the generic port object, but use
> that to reach the ACPI handle and build the Generic port affinity structure
> for an ACPI device?

Yes.  Slightly shortcut is that the UID is the bus number for all the
relevant devices so I can abuse that.  QEMU doesn't keep track of
the ACPI handles directly so this is the current cleanest solution.

> 
> > So I think I've argued myself around to thinking we need to extend
> > the interface with another optional parameter if we ever do support
> > the ACPI handle for generic initiators :(
> >
> > Jonatha  



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure
  2024-02-29 15:59           ` Jonathan Cameron via
@ 2024-03-01  8:30             ` Ankit Agrawal
  0 siblings, 0 replies; 37+ messages in thread
From: Ankit Agrawal @ 2024-03-01  8:30 UTC (permalink / raw)
  To: Jonathan Cameron, Jonathan Cameron via
  Cc: Jason Gunthorpe, alex.williamson, clg, shannon.zhaosl,
	peter.maydell, ani, berrange, eduardo, imammedo, mst, eblake,
	armbru, david, gshan, Zhi Wang, Matt Ochs, pbonzini,
	Aniket Agashe, Neo Jia, Kirti Wankhede, Tarun Gupta (SW-GPU),
	Vikram Sethi, Andy Currid, Dheeraj Nigam, Uday Dhoke, qemu-arm

>>
>> One more thing.  Right now we can't use Generic Initiators as
>> HMAT initiators.  That also wants fixing given that's their
>> normal usecase rather than what you are using them for so it
>> should 'work'.
>
> Something along the lines of this will do the job.

Thanks! Will incorporate the patch in the next posting.

>
> diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
> index 4173ef2afa..825cfe86bc 100644
> --- a/include/sysemu/numa.h
> +++ b/include/sysemu/numa.h
> @@ -41,6 +41,7 @@ struct NodeInfo {
>     struct HostMemoryBackend *node_memdev;
>     bool present;
>     bool has_cpu;
>+    bool has_gi;
>     uint8_t lb_info_provided;
>     uint16_t initiator;
>     uint8_t distance[MAX_NODES];
> diff --git a/hw/acpi/acpi-generic-initiator.c b/hw/acpi/acpi-generic-initiator.c
> index 9179590a42..8a67300320 100644
> --- a/hw/acpi/acpi-generic-initiator.c
> +++ b/hw/acpi/acpi-generic-initiator.c
> @@ -6,6 +6,7 @@
> #include "qemu/osdep.h"
> #include "hw/acpi/acpi-generic-initiator.h"
> #include "hw/pci/pci_device.h"
> +#include "hw/boards.h"
> #include "qapi/error.h"
> #include "qapi/qapi-builtin-visit.h"
> #include "qapi/visitor.h"
> @@ -58,6 +59,7 @@ static void acpi_generic_node_set_node(Object *obj, Visitor *v,
>                                        const char *name, void *opaque,
>                                        Error **errp)
> {
> +    MachineState *ms = MACHINE(qdev_get_machine());
>     AcpiGenericNode *gn = ACPI_GENERIC_NODE(obj);
>     uint32_t value;
>
> @@ -72,6 +74,10 @@ static void acpi_generic_node_set_node(Object *obj, Visitor *v,
>     }
>
>     gn->node = value;
> +
> +    if (object_dynamic_cast(obj, TYPE_ACPI_GENERIC_INITIATOR)) {
> +        ms->numa_state->nodes[gn->node].has_gi = true;
> +    }
> }
>
> static void acpi_generic_node_class_init(ObjectClass *oc, void *data)
> diff --git a/hw/acpi/hmat.c b/hw/acpi/hmat.c
> index b933ae3c06..9b1662b6b8 100644
> --- a/hw/acpi/hmat.c
> +++ b/hw/acpi/hmat.c
> @@ -225,7 +225,7 @@ static void hmat_build_table_structs(GArray *table_data, NumaState *numa_state)
>     }
>
>     for (i = 0; i < numa_state->num_nodes; i++) {
> -        if (numa_state->nodes[i].has_cpu) {
> +        if (numa_state->nodes[i].has_cpu || numa_state->nodes[i].has_gi) {
>             initiator_list[num_initiator++] = i;
>         }
>     }
> diff --git a/hw/core/numa.c b/hw/core/numa.c
> index f08956ddb0..58a32f1564 100644
> --- a/hw/core/numa.c
> +++ b/hw/core/numa.c
> @@ -229,7 +229,8 @@ void parse_numa_hmat_lb(NumaState *numa_state, NumaHmatLBOptions *node,
>                    node->target, numa_state->num_nodes);
>         return;
>     }
> -    if (!numa_info[node->initiator].has_cpu) {
> +    if (!numa_info[node->initiator].has_cpu &&
> +        !numa_info[node->initiator].has_gi) {
>         error_setg(errp, "Invalid initiator=%d, it isn't an "
>                    "initiator proximity domain", node->initiator);
>         return;
>
> Jonathan
>
> >
> >
> >
>


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 1/2] qom: new object to associate device to numa node
  2024-02-29 16:32               ` Jonathan Cameron via
@ 2024-03-01  8:33                 ` Ankit Agrawal
  2024-03-01 16:13                   ` Alex Williamson
  0 siblings, 1 reply; 37+ messages in thread
From: Ankit Agrawal @ 2024-03-01  8:33 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Markus Armbruster, Jason Gunthorpe, alex.williamson, clg,
	shannon.zhaosl, peter.maydell, ani, berrange, eduardo, imammedo,
	mst, eblake, david, gshan, Zhi Wang, Matt Ochs, pbonzini,
	Aniket Agashe, Neo Jia, Kirti Wankhede, Tarun Gupta (SW-GPU),
	Vikram Sethi, Andy Currid, Dheeraj Nigam, Uday Dhoke, qemu-arm,
	qemu-devel


>> As for your suggestion of using acpi-dev as the arg to take both
>> pci-dev and acpi-dev.. Would that mean sending a pure pci device
>> (not the corner case you mentioned) through the acpi-dev argument
>> as well? Not sure if that would appropriate.
>
> Ah, looking up my description is unclear. I meant two optional parameters
> pci-dev or acpi-dev - which one was supplied would indicate the type
> of handle to be used.

Yes, that makes sense. But for now only have pci-dev until we have any
acpi-dev related code added? IIRC, this is what we discussed earlier.



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 1/2] qom: new object to associate device to numa node
  2024-03-01  8:33                 ` Ankit Agrawal
@ 2024-03-01 16:13                   ` Alex Williamson
  0 siblings, 0 replies; 37+ messages in thread
From: Alex Williamson @ 2024-03-01 16:13 UTC (permalink / raw)
  To: Ankit Agrawal
  Cc: Jonathan Cameron, Markus Armbruster, Jason Gunthorpe, clg,
	shannon.zhaosl, peter.maydell, ani, berrange, eduardo, imammedo,
	mst, eblake, david, gshan, Zhi Wang, Matt Ochs, pbonzini,
	Aniket Agashe, Neo Jia, Kirti Wankhede, Tarun Gupta (SW-GPU),
	Vikram Sethi, Andy Currid, Dheeraj Nigam, Uday Dhoke, qemu-arm,
	qemu-devel

On Fri, 1 Mar 2024 08:33:42 +0000
Ankit Agrawal <ankita@nvidia.com> wrote:

> >> As for your suggestion of using acpi-dev as the arg to take both
> >> pci-dev and acpi-dev.. Would that mean sending a pure pci device
> >> (not the corner case you mentioned) through the acpi-dev argument
> >> as well? Not sure if that would appropriate.  
> >
> > Ah, looking up my description is unclear. I meant two optional parameters
> > pci-dev or acpi-dev - which one was supplied would indicate the type
> > of handle to be used.  
> 
> Yes, that makes sense. But for now only have pci-dev until we have any
> acpi-dev related code added? IIRC, this is what we discussed earlier.

Yep, I think we acknowledged that this device supports either PCI or
ACPI devices and we only currently have a use case for PCI devices, so
that's what's implemented and what the interface expects.  A separate
ACPI device interface could be added later or derived from updating the
interface to accept a generic device object.  Thanks,

Alex



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure
  2024-02-26 16:42   ` Jonathan Cameron via
  2024-02-27  8:37     ` Ankit Agrawal
  2024-02-29 11:43     ` Ankit Agrawal
@ 2024-03-05  5:59     ` Ankit Agrawal
  2024-03-05  7:11       ` Cédric Le Goater
  2024-03-05 21:06       ` Alex Williamson
  2024-03-06  9:12     ` Jonathan Cameron via
  3 siblings, 2 replies; 37+ messages in thread
From: Ankit Agrawal @ 2024-03-05  5:59 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Jason Gunthorpe, alex.williamson, clg, shannon.zhaosl,
	peter.maydell, ani, berrange, eduardo, imammedo, mst, eblake,
	armbru, david, gshan, Zhi Wang, Matt Ochs, pbonzini,
	Aniket Agashe, Neo Jia, Kirti Wankhede, Tarun Gupta (SW-GPU),
	Vikram Sethi, Andy Currid, Dheeraj Nigam, Uday Dhoke, qemu-arm,
	qemu-devel

> One thing I forgot.
>
> Please add a test.  tests/qtest/bios-tables-test.c
> + relevant table dumps.

Here I need to add a test that creates a vfio-pci device and numa
nodes and link using the acpi-generic-initiator object. One thing
here is that the -device vfio-pci needs a host=<bdf> argument. I
probably cannot provide the device bdf from my local setup. So
I am not sure how can I add this test to tests/qtest/bios-tables-test.c.
FYI, the following is a sample args we use for the
acpi-generic-initiator object.

       -numa node,nodeid=2
       -device vfio-pci-nohotplug,bus=pcie.0,addr=04.0,rombar=0,id=dev0 \
       -object acpi-generic-initiator,id=gi0,pci-dev=dev0,node=2 \

Moreover based on a quick grep, I don't see any other test that
have -device vfio-pci argument.

Jonathan, Alex, do you know how we may add tests that is dependent
on the vfio-pci device?

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure
  2024-03-05  5:59     ` Ankit Agrawal
@ 2024-03-05  7:11       ` Cédric Le Goater
  2024-03-05  8:17         ` Ankit Agrawal
  2024-03-05 21:06       ` Alex Williamson
  1 sibling, 1 reply; 37+ messages in thread
From: Cédric Le Goater @ 2024-03-05  7:11 UTC (permalink / raw)
  To: Ankit Agrawal, Jonathan Cameron
  Cc: Jason Gunthorpe, alex.williamson, shannon.zhaosl, peter.maydell,
	ani, berrange, eduardo, imammedo, mst, eblake, armbru, david,
	gshan, Zhi Wang, Matt Ochs, pbonzini, Aniket Agashe, Neo Jia,
	Kirti Wankhede, Tarun Gupta (SW-GPU),
	Vikram Sethi, Andy Currid, Dheeraj Nigam, Uday Dhoke, qemu-arm,
	qemu-devel

On 3/5/24 06:59, Ankit Agrawal wrote:
>> One thing I forgot.
>>
>> Please add a test.  tests/qtest/bios-tables-test.c
>> + relevant table dumps.
> 
> Here I need to add a test that creates a vfio-pci device and numa
> nodes and link using the acpi-generic-initiator object. One thing
> here is that the -device vfio-pci needs a host=<bdf> argument. I
> probably cannot provide the device bdf from my local setup. So
> I am not sure how can I add this test to tests/qtest/bios-tables-test.c.
> FYI, the following is a sample args we use for the
> acpi-generic-initiator object.
> 
>         -numa node,nodeid=2
>         -device vfio-pci-nohotplug,bus=pcie.0,addr=04.0,rombar=0,id=dev0 \
>         -object acpi-generic-initiator,id=gi0,pci-dev=dev0,node=2 \
> 
> Moreover based on a quick grep, I don't see any other test that
> have -device vfio-pci argument.
> 
> Jonathan, Alex, do you know how we may add tests that is dependent
> on the vfio-pci device?

There are none.

This would require a host device always available for passthrough and
there is no simple solution for this problem. Such tests would need to
run in a nested environment under avocado: a pc/virt machine with an
igb device and use the PF and/or VFs to check device assignment in a
nested guests.

PPC just introduced new tests to check nested guest support on two
different HV implementations. If you have time, please take a look
at tests/avocado/ppc_hv_tests.py for the framework.

I will try to propose a new test when I am done with the reviews,
not before 9.0 soft freeze though.

Thanks,

C.




^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure
  2024-03-05  7:11       ` Cédric Le Goater
@ 2024-03-05  8:17         ` Ankit Agrawal
  2024-03-05 10:38           ` Jonathan Cameron via
  0 siblings, 1 reply; 37+ messages in thread
From: Ankit Agrawal @ 2024-03-05  8:17 UTC (permalink / raw)
  To: Cédric Le Goater, Jonathan Cameron
  Cc: Jason Gunthorpe, alex.williamson, shannon.zhaosl, peter.maydell,
	ani, berrange, eduardo, imammedo, mst, eblake, armbru, david,
	gshan, Zhi Wang, Matt Ochs, pbonzini, Aniket Agashe, Neo Jia,
	Kirti Wankhede, Tarun Gupta (SW-GPU),
	Vikram Sethi, Andy Currid, Dheeraj Nigam, Uday Dhoke, qemu-arm,
	qemu-devel


>>> Please add a test.  tests/qtest/bios-tables-test.c
>>> + relevant table dumps.
>>
>> Here I need to add a test that creates a vfio-pci device and numa
>> nodes and link using the acpi-generic-initiator object. One thing
>> here is that the -device vfio-pci needs a host=<bdf> argument. I
>> probably cannot provide the device bdf from my local setup. So
>> I am not sure how can I add this test to tests/qtest/bios-tables-test.c.
>> FYI, the following is a sample args we use for the
>> acpi-generic-initiator object.
>>
>>         -numa node,nodeid=2
>>         -device vfio-pci-nohotplug,bus=pcie.0,addr=04.0,rombar=0,id=dev0 \
>>         -object acpi-generic-initiator,id=gi0,pci-dev=dev0,node=2 \
>>
>> Moreover based on a quick grep, I don't see any other test that
>> have -device vfio-pci argument.
>>
>> Jonathan, Alex, do you know how we may add tests that is dependent
>> on the vfio-pci device?
>
> There are none.
>
> This would require a host device always available for passthrough and
> there is no simple solution for this problem. Such tests would need to
> run in a nested environment under avocado: a pc/virt machine with an
> igb device and use the PF and/or VFs to check device assignment in a
> nested guests.
>
> PPC just introduced new tests to check nested guest support on two
> different HV implementations. If you have time, please take a look
> at tests/avocado/ppc_hv_tests.py for the framework.
>
> I will try to propose a new test when I am done with the reviews,
> not before 9.0 soft freeze though.

Thanks for the information. As part of this patch, I'll leave out
this test change then.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure
  2024-03-05  8:17         ` Ankit Agrawal
@ 2024-03-05 10:38           ` Jonathan Cameron via
  2024-03-06 10:33             ` Ankit Agrawal
  0 siblings, 1 reply; 37+ messages in thread
From: Jonathan Cameron via @ 2024-03-05 10:38 UTC (permalink / raw)
  To: Ankit Agrawal
  Cc: Cédric Le Goater, Jason Gunthorpe, alex.williamson,
	shannon.zhaosl, peter.maydell, ani, berrange, eduardo, imammedo,
	mst, eblake, armbru, david, gshan, Zhi Wang, Matt Ochs, pbonzini,
	Aniket Agashe, Neo Jia, Kirti Wankhede, Tarun Gupta (SW-GPU),
	Vikram Sethi, Andy Currid, Dheeraj Nigam, Uday Dhoke, qemu-arm,
	qemu-devel

On Tue, 5 Mar 2024 08:17:18 +0000
Ankit Agrawal <ankita@nvidia.com> wrote:

> >>> Please add a test.  tests/qtest/bios-tables-test.c
> >>> + relevant table dumps.  
> >>
> >> Here I need to add a test that creates a vfio-pci device and numa
> >> nodes and link using the acpi-generic-initiator object. One thing
> >> here is that the -device vfio-pci needs a host=<bdf> argument. I
> >> probably cannot provide the device bdf from my local setup. So
> >> I am not sure how can I add this test to tests/qtest/bios-tables-test.c.
> >> FYI, the following is a sample args we use for the
> >> acpi-generic-initiator object.
> >>
> >>         -numa node,nodeid=2
> >>         -device vfio-pci-nohotplug,bus=pcie.0,addr=04.0,rombar=0,id=dev0 \
> >>         -object acpi-generic-initiator,id=gi0,pci-dev=dev0,node=2 \
> >>
> >> Moreover based on a quick grep, I don't see any other test that
> >> have -device vfio-pci argument.
> >>
> >> Jonathan, Alex, do you know how we may add tests that is dependent
> >> on the vfio-pci device?  
> >
> > There are none.
> >
> > This would require a host device always available for passthrough and
> > there is no simple solution for this problem. Such tests would need to
> > run in a nested environment under avocado: a pc/virt machine with an
> > igb device and use the PF and/or VFs to check device assignment in a
> > nested guests.
> >
> > PPC just introduced new tests to check nested guest support on two
> > different HV implementations. If you have time, please take a look
> > at tests/avocado/ppc_hv_tests.py for the framework.
> >
> > I will try to propose a new test when I am done with the reviews,
> > not before 9.0 soft freeze though.  
> 
> Thanks for the information. As part of this patch, I'll leave out
> this test change then.

For BIOS table purposes it can be any PCI device. I've been testing
this with a virtio-net-pci but something like virtio-rng-pci will
do fine.  The table contents doesn't care if it's vfio or not.

I can spin a test as part of the follow up Generic Port series that
incorporates both and pushes the limits of the hmat code in general.
Current tests are too tame ;)

Given I don't think we have clarification from ACPI spec side on
the many to one mapping you are using, I'd just use a 1-1 in any
test.


Jonathan


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure
  2024-03-05  5:59     ` Ankit Agrawal
  2024-03-05  7:11       ` Cédric Le Goater
@ 2024-03-05 21:06       ` Alex Williamson
  2024-03-06 10:36         ` Ankit Agrawal
  1 sibling, 1 reply; 37+ messages in thread
From: Alex Williamson @ 2024-03-05 21:06 UTC (permalink / raw)
  To: Ankit Agrawal
  Cc: Jonathan Cameron, Jason Gunthorpe, clg, shannon.zhaosl,
	peter.maydell, ani, berrange, eduardo, imammedo, mst, eblake,
	armbru, david, gshan, Zhi Wang, Matt Ochs, pbonzini,
	Aniket Agashe, Neo Jia, Kirti Wankhede, Tarun Gupta (SW-GPU),
	Vikram Sethi, Andy Currid, Dheeraj Nigam, Uday Dhoke, qemu-arm,
	qemu-devel

On Tue, 5 Mar 2024 05:59:46 +0000
Ankit Agrawal <ankita@nvidia.com> wrote:

> > One thing I forgot.
> >
> > Please add a test.  tests/qtest/bios-tables-test.c
> > + relevant table dumps.  
> 
> Here I need to add a test that creates a vfio-pci device and numa
> nodes and link using the acpi-generic-initiator object. One thing
> here is that the -device vfio-pci needs a host=<bdf> argument. I
> probably cannot provide the device bdf from my local setup. So
> I am not sure how can I add this test to tests/qtest/bios-tables-test.c.
> FYI, the following is a sample args we use for the
> acpi-generic-initiator object.
> 
>        -numa node,nodeid=2
>        -device vfio-pci-nohotplug,bus=pcie.0,addr=04.0,rombar=0,id=dev0 \
>        -object acpi-generic-initiator,id=gi0,pci-dev=dev0,node=2 \
> 
> Moreover based on a quick grep, I don't see any other test that
> have -device vfio-pci argument.
> 
> Jonathan, Alex, do you know how we may add tests that is dependent
> on the vfio-pci device?
> 

As Jonathan notes, we've decoupled this from vfio-pci, the pci-dev= arg
can point to any PCI device.  For example, any emulated PCI NIC could
be a stand-in for the vfio-pci device used in practice.  Thanks,

Alex



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure
  2024-02-26 16:42   ` Jonathan Cameron via
                       ` (2 preceding siblings ...)
  2024-03-05  5:59     ` Ankit Agrawal
@ 2024-03-06  9:12     ` Jonathan Cameron via
  2024-03-06 10:41       ` Ankit Agrawal
  3 siblings, 1 reply; 37+ messages in thread
From: Jonathan Cameron via @ 2024-03-06  9:12 UTC (permalink / raw)
  To: Jonathan Cameron via
  Cc: Jonathan Cameron, ankita, jgg, alex.williamson, clg,
	shannon.zhaosl, peter.maydell, ani, berrange, eduardo, imammedo,
	mst, eblake, armbru, david, gshan, zhiw, mochs, pbonzini,
	aniketa, cjia, kwankhede, targupta, vsethi, acurrid, dnigam,
	udhoke, qemu-arm

On Mon, 26 Feb 2024 16:42:29 +0000
Jonathan Cameron via <qemu-devel@nongnu.org> wrote:

> On Fri, 23 Feb 2024 12:42:23 +0000
> <ankita@nvidia.com> wrote:
> 
> > From: Ankit Agrawal <ankita@nvidia.com>
> > 
> > ACPI spec provides a scheme to associate "Generic Initiators" [1]
> > (e.g. heterogeneous processors and accelerators, GPUs, and I/O devices with
> > integrated compute or DMA engines GPUs) with Proximity Domains. This is
> > achieved using Generic Initiator Affinity Structure in SRAT. During bootup,
> > Linux kernel parse the ACPI SRAT to determine the PXM ids and create a NUMA
> > node for each unique PXM ID encountered. Qemu currently do not implement
> > these structures while building SRAT.
> > 
> > Add GI structures while building VM ACPI SRAT. The association between
> > device and node are stored using acpi-generic-initiator object. Lookup
> > presence of all such objects and use them to build these structures.
> > 
> > The structure needs a PCI device handle [2] that consists of the device BDF.
> > The vfio-pci device corresponding to the acpi-generic-initiator object is
> > located to determine the BDF.
> > 
> > [1] ACPI Spec 6.3, Section 5.2.16.6
> > [2] ACPI Spec 6.3, Table 5.80
> > 
> > Signed-off-by: Ankit Agrawal <ankita@nvidia.com>  
> 
> One thing I forgot.
And another :)

It might be nice to also support x86 from the start (apparently people still
care about that old architecture)

https://gitlab.com/jic23/qemu/-/commit/ccfb4fe22167e035173390cf147d9c226951b9b6
is what I'm carrying for this (see below)

We could do this later as part of the generic ports series (which is also on
that tree if you are curious).

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 2d2bb0a325b83f5e3fb4666a462a693aea1a2220..54462d3a46c379a4159b4d71d7689a107745fa4c 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -68,6 +68,7 @@
 #include "hw/acpi/utils.h"
 #include "hw/acpi/pci.h"
 #include "hw/acpi/cxl.h"
+#include "hw/acpi/acpi-generic-initiator.h"
 
 #include "qom/qom-qobject.h"
 #include "hw/i386/amd_iommu.h"
@@ -2097,6 +2098,7 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
         }
     }
 
+    build_srat_generic_pci_initiator(table_data);
     if (machine->nvdimms_state->is_enabled) {
         nvdimm_build_srat(table_data);
     }




> 
> Please add a test.  tests/qtest/bios-tables-test.c
> + relevant table dumps.
> 
> Could also hook this up for x86 with a oneline addition and improve
> test coverage.  If not, I'll do it when I add Generic Ports as annoyingly
> people still care about x86 for some reason.
> 



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure
  2024-03-05 10:38           ` Jonathan Cameron via
@ 2024-03-06 10:33             ` Ankit Agrawal
  2024-03-06 11:46               ` Jonathan Cameron via
  0 siblings, 1 reply; 37+ messages in thread
From: Ankit Agrawal @ 2024-03-06 10:33 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Cédric Le Goater, Jason Gunthorpe, alex.williamson,
	shannon.zhaosl, peter.maydell, ani, berrange, eduardo, imammedo,
	mst, eblake, armbru, david, gshan, Zhi Wang, Matt Ochs, pbonzini,
	Aniket Agashe, Neo Jia, Kirti Wankhede, Tarun Gupta (SW-GPU),
	Vikram Sethi, Andy Currid, Dheeraj Nigam, Uday Dhoke, qemu-arm,
	qemu-devel

>> >> Jonathan, Alex, do you know how we may add tests that is dependent
>> >> on the vfio-pci device?
>> >
>> > There are none.
>> >
>> > This would require a host device always available for passthrough and
>> > there is no simple solution for this problem. Such tests would need to
>> > run in a nested environment under avocado: a pc/virt machine with an
>> > igb device and use the PF and/or VFs to check device assignment in a
>> > nested guests.
>> >
>> > PPC just introduced new tests to check nested guest support on two
>> > different HV implementations. If you have time, please take a look
>> > at tests/avocado/ppc_hv_tests.py for the framework.
>> >
>> > I will try to propose a new test when I am done with the reviews,
>> > not before 9.0 soft freeze though.
>>
>> Thanks for the information. As part of this patch, I'll leave out
>> this test change then.
>
> For BIOS table purposes it can be any PCI device. I've been testing
> this with a virtio-net-pci but something like virtio-rng-pci will
> do fine.  The table contents doesn't care if it's vfio or not.

Thanks, I was able to work this out with the virtio-rng-pci device.

> I can spin a test as part of the follow up Generic Port series that
> incorporates both and pushes the limits of the hmat code in general.
> Current tests are too tame ;)

Sure, that is fine by me.
FYI, this is how the test change looked like in case you were wondering.

diff --git a/tests/qtest/bios-tables-test.c b/tests/qtest/bios-tables-test.c
index fe6a9a8563..8ac8e3f048 100644
--- a/tests/qtest/bios-tables-test.c
+++ b/tests/qtest/bios-tables-test.c
@@ -2157,6 +2157,29 @@ static void test_acpi_virt_oem_fields(void)
     g_free(args);
 }

+static void test_acpi_virt_srat_gi(void)
+{
+    test_data data = {
+        .machine = "virt",
+        .tcg_only = true,
+        .uefi_fl1 = "pc-bios/edk2-aarch64-code.fd",
+        .uefi_fl2 = "pc-bios/edk2-arm-vars.fd",
+        .cd = "tests/data/uefi-boot-images/bios-tables-test.aarch64.iso.qcow2",
+        .ram_start = 0x40000000ULL,
+        .scan_len = 128ULL * 1024 * 1024,
+    };
+
+    data.variant = ".gi";
+    test_acpi_one(" -cpu cortex-a57"
+                  " -object memory-backend-ram,id=ram0,size=128M"
+                  " -numa node,memdev=ram0,nodeid=0"
+                  " -numa node,nodeid=1"
+                  " -device virtio-rng-pci,id=dev0"
+                  " -object acpi-generic-initiator,id=gi0,pci-dev=dev0,node=1",
+                  &data);
+
+    free_test_data(&data);
+}

 int main(int argc, char *argv[])
 {
@@ -2312,6 +2335,7 @@ int main(int argc, char *argv[])
             if (qtest_has_device("virtio-iommu-pci")) {
                 qtest_add_func("acpi/virt/viot", test_acpi_virt_viot);
             }
+            qtest_add_func("acpi/virt/gi", test_acpi_virt_srat_gi);
         }
     }
     ret = g_test_run();
--
2.34.1

> Given I don't think we have clarification from ACPI spec side on
> the many to one mapping you are using, I'd just use a 1-1 in any
> test.

Ack.


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure
  2024-03-05 21:06       ` Alex Williamson
@ 2024-03-06 10:36         ` Ankit Agrawal
  0 siblings, 0 replies; 37+ messages in thread
From: Ankit Agrawal @ 2024-03-06 10:36 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Jonathan Cameron, Jason Gunthorpe, clg, shannon.zhaosl,
	peter.maydell, ani, berrange, eduardo, imammedo, mst, eblake,
	armbru, david, gshan, Zhi Wang, Matt Ochs, pbonzini,
	Aniket Agashe, Neo Jia, Kirti Wankhede, Tarun Gupta (SW-GPU),
	Vikram Sethi, Andy Currid, Dheeraj Nigam, Uday Dhoke, qemu-arm,
	qemu-devel

>>
>> Jonathan, Alex, do you know how we may add tests that is dependent
>> on the vfio-pci device?
>>
>
> As Jonathan notes, we've decoupled this from vfio-pci, the pci-dev= arg
> can point to any PCI device.  For example, any emulated PCI NIC could
> be a stand-in for the vfio-pci device used in practice.  Thanks,

Yeah, -device virtio-rng-pci could be used as suggested by Jonathan.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure
  2024-03-06  9:12     ` Jonathan Cameron via
@ 2024-03-06 10:41       ` Ankit Agrawal
  0 siblings, 0 replies; 37+ messages in thread
From: Ankit Agrawal @ 2024-03-06 10:41 UTC (permalink / raw)
  To: Jonathan Cameron, Jonathan Cameron via
  Cc: Jason Gunthorpe, alex.williamson, clg, shannon.zhaosl,
	peter.maydell, ani, berrange, eduardo, imammedo, mst, eblake,
	armbru, david, gshan, Zhi Wang, Matt Ochs, pbonzini,
	Aniket Agashe, Neo Jia, Kirti Wankhede, Tarun Gupta (SW-GPU),
	Vikram Sethi, Andy Currid, Dheeraj Nigam, Uday Dhoke, qemu-arm

>> > The structure needs a PCI device handle [2] that consists of the device BDF.
>> > The vfio-pci device corresponding to the acpi-generic-initiator object is
>> > located to determine the BDF.
>> > 
>> > [1] ACPI Spec 6.3, Section 5.2.16.6
>> > [2] ACPI Spec 6.3, Table 5.80
>> > 
>> > Signed-off-by: Ankit Agrawal <ankita@nvidia.com>  
>> 
>> One thing I forgot.
> And another :)
>
> It might be nice to also support x86 from the start (apparently people still
> care about that old architecture)
>
> https://gitlab.com/jic23/qemu/-/commit/ccfb4fe22167e035173390cf147d9c226951b9b6
> is what I'm carrying for this (see below)

Ack.

> We could do this later as part of the generic ports series (which is also on
> that tree if you are curious).

Yeah, that sounds fine to me.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure
  2024-03-06 10:33             ` Ankit Agrawal
@ 2024-03-06 11:46               ` Jonathan Cameron via
  0 siblings, 0 replies; 37+ messages in thread
From: Jonathan Cameron via @ 2024-03-06 11:46 UTC (permalink / raw)
  To: Ankit Agrawal
  Cc: Cédric Le Goater, Jason Gunthorpe, alex.williamson,
	shannon.zhaosl, peter.maydell, ani, berrange, eduardo, imammedo,
	mst, eblake, armbru, david, gshan, Zhi Wang, Matt Ochs, pbonzini,
	Aniket Agashe, Neo Jia, Kirti Wankhede, Tarun Gupta (SW-GPU),
	Vikram Sethi, Andy Currid, Dheeraj Nigam, Uday Dhoke, qemu-arm,
	qemu-devel

On Wed, 6 Mar 2024 10:33:17 +0000
Ankit Agrawal <ankita@nvidia.com> wrote:

> >> >> Jonathan, Alex, do you know how we may add tests that is dependent
> >> >> on the vfio-pci device?  
> >> >
> >> > There are none.
> >> >
> >> > This would require a host device always available for passthrough and
> >> > there is no simple solution for this problem. Such tests would need to
> >> > run in a nested environment under avocado: a pc/virt machine with an
> >> > igb device and use the PF and/or VFs to check device assignment in a
> >> > nested guests.
> >> >
> >> > PPC just introduced new tests to check nested guest support on two
> >> > different HV implementations. If you have time, please take a look
> >> > at tests/avocado/ppc_hv_tests.py for the framework.
> >> >
> >> > I will try to propose a new test when I am done with the reviews,
> >> > not before 9.0 soft freeze though.  
> >>
> >> Thanks for the information. As part of this patch, I'll leave out
> >> this test change then.  
> >
> > For BIOS table purposes it can be any PCI device. I've been testing
> > this with a virtio-net-pci but something like virtio-rng-pci will
> > do fine.  The table contents doesn't care if it's vfio or not.  
> 
> Thanks, I was able to work this out with the virtio-rng-pci device.
> 
> > I can spin a test as part of the follow up Generic Port series that
> > incorporates both and pushes the limits of the hmat code in general.
> > Current tests are too tame ;)  
> 
> Sure, that is fine by me.
> FYI, this is how the test change looked like in case you were wondering.

Looks good as a starting point.
Ideally I'd like HMAT + a few bandwidth and latency values
so we test that GIs work with that as well part.

Think you'd just need
"-machine hmat=on "
//some values for cpu to local memory
"-numa hmat-lb,initiator=0,target=0,hierarchy-memory,data-type=access_latency,latency=10"
"-numa hmat-lb,initiator=0,target=0,hierarchy-memory,data-type=access_bandwidth,bandwidth=10G"
//some values for the GI node to main memory.
"-numa hmat-lb,initiator=1,target=0,hierarchy-memory,data-type=access_latency,latency=200"
"-numa hmat-lb,initiator=1,target=0,hierarchy-memory,data-type=access_bandwidth,bandwidth=5G"




^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2024-03-06 11:47 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-23 12:42 [PATCH v7 0/2] acpi: report numa nodes for device memory using GI ankita
2024-02-23 12:42 ` [PATCH v7 1/2] qom: new object to associate device to numa node ankita
2024-02-27 13:00   ` Jonathan Cameron via
2024-02-28  5:35     ` Ankit Agrawal
2024-02-28  7:35   ` Markus Armbruster
2024-02-28 13:55     ` Jonathan Cameron via
2024-02-28 16:08       ` Markus Armbruster
2024-02-28 16:50         ` Ankit Agrawal
2024-02-29 10:22           ` Jonathan Cameron via
2024-02-29 13:00             ` Ankit Agrawal
2024-02-29 16:32               ` Jonathan Cameron via
2024-03-01  8:33                 ` Ankit Agrawal
2024-03-01 16:13                   ` Alex Williamson
2024-02-23 12:42 ` [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure ankita
2024-02-26 16:34   ` Jonathan Cameron via
2024-02-26 16:42   ` Jonathan Cameron via
2024-02-27  8:37     ` Ankit Agrawal
2024-02-27 17:11       ` Jonathan Cameron via
2024-02-27 17:36         ` Jonathan Cameron via
2024-02-29 15:59           ` Jonathan Cameron via
2024-03-01  8:30             ` Ankit Agrawal
2024-02-29 11:43     ` Ankit Agrawal
2024-02-29 12:17       ` Jonathan Cameron via
2024-02-29 12:24         ` Ankit Agrawal
2024-03-05  5:59     ` Ankit Agrawal
2024-03-05  7:11       ` Cédric Le Goater
2024-03-05  8:17         ` Ankit Agrawal
2024-03-05 10:38           ` Jonathan Cameron via
2024-03-06 10:33             ` Ankit Agrawal
2024-03-06 11:46               ` Jonathan Cameron via
2024-03-05 21:06       ` Alex Williamson
2024-03-06 10:36         ` Ankit Agrawal
2024-03-06  9:12     ` Jonathan Cameron via
2024-03-06 10:41       ` Ankit Agrawal
2024-02-27 12:53   ` Jonathan Cameron via
2024-02-29 11:46     ` Ankit Agrawal
2024-02-29 12:20       ` Jonathan Cameron via

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.