* [PATCH v2 0/3] hw/arm/virt: Fix CPU's default NUMA node ID
@ 2022-03-03  3:11 Gavin Shan
  2022-03-03  3:11 ` [PATCH v2 1/3] " Gavin Shan
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Gavin Shan @ 2022-03-03  3:11 UTC (permalink / raw)
  To: qemu-arm
  Cc: peter.maydell, drjones, richard.henderson, qemu-devel, zhenyzha,
	wangyanan55, shan.gavin, imammedo

When the CPU-to-NUMA association isn't provided by the user, the default NUMA
node ID for a given CPU is returned from virt_get_default_cpu_node_id().
Unfortunately, the default NUMA node ID breaks the socket boundary and leads to
broken CPU topology warnings in the Linux guest. This series intends to fix
the issue.

PATCH[1/3]: Fixes the broken CPU topology by considering the socket boundary
            when the default NUMA node ID is calculated.
PATCH[2/3]: Uses the existing CPU topology to build the PPTT table. However,
            the cluster ID has to be calculated dynamically because there is
            no corresponding information in the CPU instance properties.
PATCH[3/3]: Takes the thread ID as the ACPI processor ID in the MADT and SRAT
            tables.

Changelog
=========
v2:
   * Populate the CPU topology in virt_possible_cpu_arch_ids() so that it
     can be reused in virt_get_default_cpu_node_id()                          (Igor)
   * Added PATCH[2/3] to use the existing CPU topology when the PPTT table
     is built                                                                 (Igor)
   * Added PATCH[3/3] to take the thread ID as the ACPI processor ID in
     the MADT and SRAT tables                                                 (Gavin)

Gavin Shan (3):
  hw/arm/virt: Fix CPU's default NUMA node ID
  hw/acpi/aml-build: Use existing CPU topology to build PPTT table
  hw/arm/virt: Unify ACPI processor ID in MADT and SRAT table

 hw/acpi/aml-build.c      | 106 ++++++++++++++++++++++++++++++---------
 hw/arm/virt-acpi-build.c |  12 +++--
 hw/arm/virt.c            |  17 ++++++-
 3 files changed, 107 insertions(+), 28 deletions(-)

-- 
2.23.0



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v2 1/3] hw/arm/virt: Fix CPU's default NUMA node ID
  2022-03-03  3:11 [PATCH v2 0/3] hw/arm/virt: Fix CPU's default NUMA node ID Gavin Shan
@ 2022-03-03  3:11 ` Gavin Shan
  2022-03-18  6:23   ` wangyanan (Y) via
  2022-03-03  3:11 ` [PATCH v2 2/3] hw/acpi/aml-build: Use existing CPU topology to build PPTT table Gavin Shan
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: Gavin Shan @ 2022-03-03  3:11 UTC (permalink / raw)
  To: qemu-arm
  Cc: peter.maydell, drjones, richard.henderson, qemu-devel, zhenyzha,
	wangyanan55, shan.gavin, imammedo

The default CPU-to-NUMA association is given by mc->get_default_cpu_node_id()
when it isn't provided explicitly. However, the CPU topology isn't fully
considered in the default association, and this causes broken CPU topology
warnings when booting the Linux guest.

For example, the following warning messages are observed when the Linux
guest is booted with the command line below.

  /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
  -accel kvm -machine virt,gic-version=host               \
  -cpu host                                               \
  -smp 6,sockets=2,cores=3,threads=1                      \
  -m 1024M,slots=16,maxmem=64G                            \
  -object memory-backend-ram,id=mem0,size=128M            \
  -object memory-backend-ram,id=mem1,size=128M            \
  -object memory-backend-ram,id=mem2,size=128M            \
  -object memory-backend-ram,id=mem3,size=128M            \
  -object memory-backend-ram,id=mem4,size=128M            \
  -object memory-backend-ram,id=mem5,size=384M            \
  -numa node,nodeid=0,memdev=mem0                         \
  -numa node,nodeid=1,memdev=mem1                         \
  -numa node,nodeid=2,memdev=mem2                         \
  -numa node,nodeid=3,memdev=mem3                         \
  -numa node,nodeid=4,memdev=mem4                         \
  -numa node,nodeid=5,memdev=mem5
         :
  alternatives: patching kernel code
  BUG: arch topology borken
  the CLS domain not a subset of the MC domain
  <the above error log repeats>
  BUG: arch topology borken
  the DIE domain not a subset of the NODE domain

With the current implementation of mc->get_default_cpu_node_id(), CPU#0 to CPU#5
are associated with NODE#0 to NODE#5 respectively. That's incorrect because
CPU#0/1/2 should be associated with the same NUMA node, as they're seated
in the same socket.

This fixes the issue by populating the CPU topology in virt_possible_cpu_arch_ids()
and considering the socket ID when the default CPU-to-NUMA association is given
in virt_get_default_cpu_node_id(). With this applied, no more broken CPU topology
warnings are seen from the Linux guest. The 6 CPUs are associated with NODE#0/1,
and no CPUs are associated with NODE#2/3/4/5.
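
For illustration only (based on the formula in this patch, not new code), with
the example command line above (sockets=2, cores=3, threads=1 and 6 NUMA nodes)
the default association becomes:

  socket_id = cpu_index / (dies * clusters * cores * threads) = cpu_index / 3
  node_id   = socket_id % num_nodes

  CPU#0/1/2 -> socket 0 -> NODE#0
  CPU#3/4/5 -> socket 1 -> NODE#1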

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 hw/arm/virt.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 46bf7ceddf..dee02b60fc 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2488,7 +2488,9 @@ virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
 
 static int64_t virt_get_default_cpu_node_id(const MachineState *ms, int idx)
 {
-    return idx % ms->numa_state->num_nodes;
+    int64_t socket_id = ms->possible_cpus->cpus[idx].props.socket_id;
+
+    return socket_id % ms->numa_state->num_nodes;
 }
 
 static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
@@ -2496,6 +2498,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
     int n;
     unsigned int max_cpus = ms->smp.max_cpus;
     VirtMachineState *vms = VIRT_MACHINE(ms);
+    MachineClass *mc = MACHINE_GET_CLASS(vms);
 
     if (ms->possible_cpus) {
         assert(ms->possible_cpus->len == max_cpus);
@@ -2509,6 +2512,18 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
         ms->possible_cpus->cpus[n].type = ms->cpu_type;
         ms->possible_cpus->cpus[n].arch_id =
             virt_cpu_mp_affinity(vms, n);
+
+        ms->possible_cpus->cpus[n].props.has_socket_id = true;
+        ms->possible_cpus->cpus[n].props.socket_id =
+            n / (ms->smp.dies * ms->smp.clusters *
+                ms->smp.cores * ms->smp.threads);
+        if (mc->smp_props.dies_supported) {
+            ms->possible_cpus->cpus[n].props.has_die_id = true;
+            ms->possible_cpus->cpus[n].props.die_id =
+                n / (ms->smp.clusters * ms->smp.cores * ms->smp.threads);
+        }
+        ms->possible_cpus->cpus[n].props.has_core_id = true;
+        ms->possible_cpus->cpus[n].props.core_id = n / ms->smp.threads;
         ms->possible_cpus->cpus[n].props.has_thread_id = true;
         ms->possible_cpus->cpus[n].props.thread_id = n;
     }
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 2/3] hw/acpi/aml-build: Use existing CPU topology to build PPTT table
  2022-03-03  3:11 [PATCH v2 0/3] hw/arm/virt: Fix CPU's default NUMA node ID Gavin Shan
  2022-03-03  3:11 ` [PATCH v2 1/3] " Gavin Shan
@ 2022-03-03  3:11 ` Gavin Shan
  2022-03-18  6:34   ` wangyanan (Y) via
  2022-03-03  3:11 ` [PATCH v2 3/3] hw/arm/virt: Unify ACPI processor ID in MADT and SRAT table Gavin Shan
  2022-03-14  6:24 ` [PATCH v2 0/3] hw/arm/virt: Fix CPU's default NUMA node ID Gavin Shan
  3 siblings, 1 reply; 15+ messages in thread
From: Gavin Shan @ 2022-03-03  3:11 UTC (permalink / raw)
  To: qemu-arm
  Cc: peter.maydell, drjones, richard.henderson, qemu-devel, zhenyzha,
	wangyanan55, shan.gavin, imammedo

When the PPTT table is built, the CPU topology is re-calculated, but this is
unnecessary because the CPU topology, except for the cluster IDs, has already
been populated in virt_possible_cpu_arch_ids() on the arm/virt machine.

This avoids re-calculating the CPU topology by reusing the existing one in
ms->possible_cpus. However, the cluster ID for each CPU instance has to be
calculated dynamically because there is no corresponding field in struct
CpuInstanceProperties. Currently, the only user of build_pptt() is the
arm/virt machine.
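
As a rough illustration of the dynamic calculation (the topology below is made
up for the example), the cluster ID is derived from the existing instance
properties as:

  cluster_id = thread_id / (cores * threads)

With sockets=2, clusters=2, cores=2, threads=2 and the arm/virt assignment
thread_id == cpu_index, CPU#5 gets cluster_id = 5 / 4 = 1, i.e. the second
cluster of socket 0.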

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 hw/acpi/aml-build.c | 106 ++++++++++++++++++++++++++++++++++----------
 1 file changed, 82 insertions(+), 24 deletions(-)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 8966e16320..572cf5fc00 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -2002,18 +2002,27 @@ void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
                 const char *oem_id, const char *oem_table_id)
 {
     MachineClass *mc = MACHINE_GET_CLASS(ms);
+    CPUArchIdList *cpus = ms->possible_cpus;
+    GQueue *socket_list = g_queue_new();
+    GQueue *cluster_list = g_queue_new();
+    GQueue *core_list = g_queue_new();
     GQueue *list = g_queue_new();
     guint pptt_start = table_data->len;
     guint parent_offset;
     guint length, i;
-    int uid = 0;
-    int socket;
+    int n, id, socket_id, cluster_id, core_id, thread_id;
     AcpiTable table = { .sig = "PPTT", .rev = 2,
                         .oem_id = oem_id, .oem_table_id = oem_table_id };
 
     acpi_table_begin(&table, table_data);
 
-    for (socket = 0; socket < ms->smp.sockets; socket++) {
+    for (n = 0; n < cpus->len; n++) {
+        socket_id = cpus->cpus[n].props.socket_id;
+        if (g_queue_find(socket_list, GUINT_TO_POINTER(socket_id))) {
+            continue;
+        }
+
+        g_queue_push_tail(socket_list, GUINT_TO_POINTER(socket_id));
         g_queue_push_tail(list,
             GUINT_TO_POINTER(table_data->len - pptt_start));
         build_processor_hierarchy_node(
@@ -2023,65 +2032,114 @@ void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
              * of a physical package
              */
             (1 << 0),
-            0, socket, NULL, 0);
+            0, socket_id, NULL, 0);
     }
 
     if (mc->smp_props.clusters_supported) {
         length = g_queue_get_length(list);
         for (i = 0; i < length; i++) {
-            int cluster;
-
             parent_offset = GPOINTER_TO_UINT(g_queue_pop_head(list));
-            for (cluster = 0; cluster < ms->smp.clusters; cluster++) {
+            socket_id = GPOINTER_TO_UINT(g_queue_pop_head(socket_list));
+
+            for (n = 0; n < cpus->len; n++) {
+                if (cpus->cpus[n].props.socket_id != socket_id) {
+                    continue;
+                }
+
+                /*
+                 * We have to calculate the cluster ID because it isn't
+                 * available in the CPU instance properties.
+                 */
+                cluster_id = cpus->cpus[n].props.thread_id /
+                             (ms->smp.cores * ms->smp.threads);
+                if (g_queue_find(cluster_list, GUINT_TO_POINTER(cluster_id))) {
+                    continue;
+                }
+
+                g_queue_push_tail(cluster_list, GUINT_TO_POINTER(cluster_id));
                 g_queue_push_tail(list,
                     GUINT_TO_POINTER(table_data->len - pptt_start));
                 build_processor_hierarchy_node(
                     table_data,
                     (0 << 0), /* not a physical package */
-                    parent_offset, cluster, NULL, 0);
+                    parent_offset, cluster_id, NULL, 0);
             }
         }
     }
 
     length = g_queue_get_length(list);
     for (i = 0; i < length; i++) {
-        int core;
-
         parent_offset = GPOINTER_TO_UINT(g_queue_pop_head(list));
-        for (core = 0; core < ms->smp.cores; core++) {
-            if (ms->smp.threads > 1) {
-                g_queue_push_tail(list,
-                    GUINT_TO_POINTER(table_data->len - pptt_start));
-                build_processor_hierarchy_node(
-                    table_data,
-                    (0 << 0), /* not a physical package */
-                    parent_offset, core, NULL, 0);
-            } else {
+        if (!mc->smp_props.clusters_supported) {
+            socket_id = GPOINTER_TO_UINT(g_queue_pop_head(socket_list));
+        } else {
+            cluster_id = GPOINTER_TO_UINT(g_queue_pop_head(cluster_list));
+        }
+
+        for (n = 0; n < cpus->len; n++) {
+            if (!mc->smp_props.clusters_supported &&
+                cpus->cpus[n].props.socket_id != socket_id) {
+                continue;
+            }
+
+            /*
+             * We have to calculate the cluster ID because it isn't
+             * available in the CPU instance properties.
+             */
+            id = cpus->cpus[n].props.thread_id /
+                (ms->smp.cores * ms->smp.threads);
+            if (mc->smp_props.clusters_supported && id != cluster_id) {
+                continue;
+            }
+
+            core_id = cpus->cpus[n].props.core_id;
+            if (ms->smp.threads <= 1) {
                 build_processor_hierarchy_node(
                     table_data,
                     (1 << 1) | /* ACPI Processor ID valid */
                     (1 << 3),  /* Node is a Leaf */
-                    parent_offset, uid++, NULL, 0);
+                    parent_offset, core_id, NULL, 0);
+                continue;
             }
+
+            if (g_queue_find(core_list, GUINT_TO_POINTER(core_id))) {
+                continue;
+            }
+
+            g_queue_push_tail(core_list, GUINT_TO_POINTER(core_id));
+            g_queue_push_tail(list,
+                GUINT_TO_POINTER(table_data->len - pptt_start));
+            build_processor_hierarchy_node(
+                table_data,
+                (0 << 0), /* not a physical package */
+                parent_offset, core_id, NULL, 0);
         }
     }
 
     length = g_queue_get_length(list);
     for (i = 0; i < length; i++) {
-        int thread;
-
         parent_offset = GPOINTER_TO_UINT(g_queue_pop_head(list));
-        for (thread = 0; thread < ms->smp.threads; thread++) {
+        core_id = GPOINTER_TO_UINT(g_queue_pop_head(core_list));
+
+        for (n = 0; n < cpus->len; n++) {
+            if (cpus->cpus[n].props.core_id != core_id) {
+                continue;
+            }
+
+            thread_id = cpus->cpus[n].props.thread_id;
             build_processor_hierarchy_node(
                 table_data,
                 (1 << 1) | /* ACPI Processor ID valid */
                 (1 << 2) | /* Processor is a Thread */
                 (1 << 3),  /* Node is a Leaf */
-                parent_offset, uid++, NULL, 0);
+                parent_offset, thread_id, NULL, 0);
         }
     }
 
     g_queue_free(list);
+    g_queue_free(core_list);
+    g_queue_free(cluster_list);
+    g_queue_free(socket_list);
     acpi_table_end(linker, &table);
 }
 
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 3/3] hw/arm/virt: Unify ACPI processor ID in MADT and SRAT table
  2022-03-03  3:11 [PATCH v2 0/3] hw/arm/virt: Fix CPU's default NUMA node ID Gavin Shan
  2022-03-03  3:11 ` [PATCH v2 1/3] " Gavin Shan
  2022-03-03  3:11 ` [PATCH v2 2/3] hw/acpi/aml-build: Use existing CPU topology to build PPTT table Gavin Shan
@ 2022-03-03  3:11 ` Gavin Shan
  2022-03-14  6:24 ` [PATCH v2 0/3] hw/arm/virt: Fix CPU's default NUMA node ID Gavin Shan
  3 siblings, 0 replies; 15+ messages in thread
From: Gavin Shan @ 2022-03-03  3:11 UTC (permalink / raw)
  To: qemu-arm
  Cc: peter.maydell, drjones, richard.henderson, qemu-devel, zhenyzha,
	wangyanan55, shan.gavin, imammedo

The value of the following field is already used in the ACPI PPTT table
to identify the corresponding processor. This patch uses the same field
as the ACPI processor ID in the MADT and SRAT tables.

  ms->possible_cpus->cpus[i].props.thread_id
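
For illustration, assuming the arm/virt assignment from PATCH[1/3] where
props.thread_id == cpu index, the ID emitted for CPU#i becomes consistent
across the three tables:

  PPTT leaf node:        ACPI Processor ID  = thread_id = i
  MADT GICC structure:   ACPI Processor UID = thread_id = i
  SRAT GICC Affinity:    ACPI Processor UID = thread_id = i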

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 hw/arm/virt-acpi-build.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 449fab0080..7fedb56eea 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -534,13 +534,16 @@ build_srat(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
 
     for (i = 0; i < cpu_list->len; ++i) {
         uint32_t nodeid = cpu_list->cpus[i].props.node_id;
+        uint32_t thread_id = cpu_list->cpus[i].props.thread_id;
+
         /*
          * 5.2.16.4 GICC Affinity Structure
          */
         build_append_int_noprefix(table_data, 3, 1);      /* Type */
         build_append_int_noprefix(table_data, 18, 1);     /* Length */
         build_append_int_noprefix(table_data, nodeid, 4); /* Proximity Domain */
-        build_append_int_noprefix(table_data, i, 4); /* ACPI Processor UID */
+        build_append_int_noprefix(table_data,
+                                  thread_id, 4); /* ACPI Processor UID */
         /* Flags, Table 5-76 */
         build_append_int_noprefix(table_data, 1 /* Enabled */, 4);
         build_append_int_noprefix(table_data, 0, 4); /* Clock Domain */
@@ -704,6 +707,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
 {
     int i;
     VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
+    MachineState *ms = MACHINE(vms);
     const MemMapEntry *memmap = vms->memmap;
     AcpiTable table = { .sig = "APIC", .rev = 3, .oem_id = vms->oem_id,
                         .oem_table_id = vms->oem_table_id };
@@ -725,8 +729,9 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
     build_append_int_noprefix(table_data, vms->gic_version, 1);
     build_append_int_noprefix(table_data, 0, 3);   /* Reserved */
 
-    for (i = 0; i < MACHINE(vms)->smp.cpus; i++) {
+    for (i = 0; i < ms->smp.cpus; i++) {
         ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(i));
+        uint32_t thread_id = ms->possible_cpus->cpus[i].props.thread_id;
         uint64_t physical_base_address = 0, gich = 0, gicv = 0;
         uint32_t vgic_interrupt = vms->virt ? PPI(ARCH_GIC_MAINT_IRQ) : 0;
         uint32_t pmu_interrupt = arm_feature(&armcpu->env, ARM_FEATURE_PMU) ?
@@ -743,7 +748,8 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
         build_append_int_noprefix(table_data, 76, 1);   /* Length */
         build_append_int_noprefix(table_data, 0, 2);    /* Reserved */
         build_append_int_noprefix(table_data, i, 4);    /* GIC ID */
-        build_append_int_noprefix(table_data, i, 4);    /* ACPI Processor UID */
+        build_append_int_noprefix(table_data,
+                                  thread_id, 4);        /* ACPI Processor UID */
         /* Flags */
         build_append_int_noprefix(table_data, 1, 4);    /* Enabled */
         /* Parking Protocol Version */
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 0/3] hw/arm/virt: Fix CPU's default NUMA node ID
  2022-03-03  3:11 [PATCH v2 0/3] hw/arm/virt: Fix CPU's default NUMA node ID Gavin Shan
                   ` (2 preceding siblings ...)
  2022-03-03  3:11 ` [PATCH v2 3/3] hw/arm/virt: Unify ACPI processor ID in MADT and SRAT table Gavin Shan
@ 2022-03-14  6:24 ` Gavin Shan
  3 siblings, 0 replies; 15+ messages in thread
From: Gavin Shan @ 2022-03-14  6:24 UTC (permalink / raw)
  To: qemu-arm
  Cc: peter.maydell, drjones, richard.henderson, qemu-devel, zhenyzha,
	wangyanan55, shan.gavin, imammedo

Hi Igor,

On 3/3/22 11:11 AM, Gavin Shan wrote:
> When the CPU-to-NUMA association isn't provided by user, the default NUMA
> node ID for the specific CPU is returned from virt_get_default_cpu_node_id().
> Unfortunately, the default NUMA node ID breaks socket boundary and leads to
> the broken CPU topology warning message in Linux guest. This series intends
> to fix the issue.
> 
> PATCH[1/3]: Fixes the broken CPU topology by considering the socket boundary
>              when the default NUMA node ID is calculated.
> PATCH[2/3]: Use the existing CPU topology to build PPTT table. However, the
>              cluster ID has to be calculated dynamically because there is no
>              corresponding information in CPU instance properties.
> PATCH[3/3]: Take thread ID as the ACPI processor ID in MDAT and SRAT tables.
> 
> Changelog
> =========
> v2:
>     * Populate the CPU topology in virt_possible_cpu_arch_ids() so that it
>       can be reused in virt_get_default_cpu_node_id()                          (Igor)
>     * Added PATCH[2/3] to use the existing CPU topology when PPTT table
>       is built                                                                 (Igor)
>     * Added PATCH[3/3] to take thread ID as ACPI processor ID in MADT and
>       SRAT table                                                               (Gavin)
> 

Kindly ping. Could you help review this series when you have free cycles? :)

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/3] hw/arm/virt: Fix CPU's default NUMA node ID
  2022-03-03  3:11 ` [PATCH v2 1/3] " Gavin Shan
@ 2022-03-18  6:23   ` wangyanan (Y) via
  2022-03-18  9:56     ` Igor Mammedov
  0 siblings, 1 reply; 15+ messages in thread
From: wangyanan (Y) via @ 2022-03-18  6:23 UTC (permalink / raw)
  To: Gavin Shan, qemu-arm
  Cc: qemu-devel, imammedo, drjones, peter.maydell, richard.henderson,
	shan.gavin, zhenyzha

Hi Gavin,

On 2022/3/3 11:11, Gavin Shan wrote:
> The default CPU-to-NUMA association is given by mc->get_default_cpu_node_id()
> when it isn't provided explicitly. However, the CPU topology isn't fully
> considered in the default association and it causes CPU topology broken
> warnings on booting Linux guest.
>
> For example, the following warning messages are observed when the Linux guest
> is booted with the following command lines.
>
>    /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
>    -accel kvm -machine virt,gic-version=host               \
>    -cpu host                                               \
>    -smp 6,sockets=2,cores=3,threads=1                      \
>    -m 1024M,slots=16,maxmem=64G                            \
>    -object memory-backend-ram,id=mem0,size=128M            \
>    -object memory-backend-ram,id=mem1,size=128M            \
>    -object memory-backend-ram,id=mem2,size=128M            \
>    -object memory-backend-ram,id=mem3,size=128M            \
>    -object memory-backend-ram,id=mem4,size=128M            \
>    -object memory-backend-ram,id=mem4,size=384M            \
>    -numa node,nodeid=0,memdev=mem0                         \
>    -numa node,nodeid=1,memdev=mem1                         \
>    -numa node,nodeid=2,memdev=mem2                         \
>    -numa node,nodeid=3,memdev=mem3                         \
>    -numa node,nodeid=4,memdev=mem4                         \
>    -numa node,nodeid=5,memdev=mem5
>           :
>    alternatives: patching kernel code
>    BUG: arch topology borken
>    the CLS domain not a subset of the MC domain
>    <the above error log repeats>
>    BUG: arch topology borken
>    the DIE domain not a subset of the NODE domain
>
> With current implementation of mc->get_default_cpu_node_id(), CPU#0 to CPU#5
> are associated with NODE#0 to NODE#5 separately. That's incorrect because
> CPU#0/1/2 should be associated with same NUMA node because they're seated
> in same socket.
>
> This fixes the issue by populating the CPU topology in virt_possible_cpu_arch_ids()
> and considering the socket index when default CPU-to-NUMA association is given
> in virt_possible_cpu_arch_ids(). With this applied, no more CPU topology broken
> warnings are seen from the Linux guest. The 6 CPUs are associated with NODE#0/1,
> but there are no CPUs associated with NODE#2/3/4/5.
It may be better to split this patch into two: one extends
virt_possible_cpu_arch_ids(), and the other fixes the NUMA node ID issue.
>
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> ---
>   hw/arm/virt.c | 17 ++++++++++++++++-
>   1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 46bf7ceddf..dee02b60fc 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -2488,7 +2488,9 @@ virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
>   
>   static int64_t virt_get_default_cpu_node_id(const MachineState *ms, int idx)
>   {
> -    return idx % ms->numa_state->num_nodes;
> +    int64_t socket_id = ms->possible_cpus->cpus[idx].props.socket_id;
> +
> +    return socket_id % ms->numa_state->num_nodes;
>   }
>   
>   static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
> @@ -2496,6 +2498,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>       int n;
>       unsigned int max_cpus = ms->smp.max_cpus;
>       VirtMachineState *vms = VIRT_MACHINE(ms);
> +    MachineClass *mc = MACHINE_GET_CLASS(vms);
>   
>       if (ms->possible_cpus) {
>           assert(ms->possible_cpus->len == max_cpus);
> @@ -2509,6 +2512,18 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>           ms->possible_cpus->cpus[n].type = ms->cpu_type;
>           ms->possible_cpus->cpus[n].arch_id =
>               virt_cpu_mp_affinity(vms, n);
> +
> +        ms->possible_cpus->cpus[n].props.has_socket_id = true;
> +        ms->possible_cpus->cpus[n].props.socket_id =
> +            n / (ms->smp.dies * ms->smp.clusters *
> +                ms->smp.cores * ms->smp.threads);
> +        if (mc->smp_props.dies_supported) {
> +            ms->possible_cpus->cpus[n].props.has_die_id = true;
> +            ms->possible_cpus->cpus[n].props.die_id =
> +                n / (ms->smp.clusters * ms->smp.cores * ms->smp.threads);
> +        }
I still don't think we need to consider dies if they're certainly not
supported yet; IOW, we will never enter the if-branch. We are populating
arm-specific topology info rather than the generic one, so we can probably
update this part uniformly, together with the other necessary places, when
we decide to support dies for the arm virt machine in the future. :)
> +        ms->possible_cpus->cpus[n].props.has_core_id = true;
> +        ms->possible_cpus->cpus[n].props.core_id = n / ms->smp.threads;
>           ms->possible_cpus->cpus[n].props.has_thread_id = true;
>           ms->possible_cpus->cpus[n].props.thread_id = n;
>       }
Maybe we should use the same algorithm as x86_topo_ids_from_idx to populate
the IDs, so that the scope of socket-id is [0, total_sockets), the scope of
thread-id is [0, threads_per_core), and so on. Then a group of
socket/cluster/core/thread IDs determines a CPU.

Suggestion: for the long term, is it necessary to add similar topology info
infrastructure for ARM now, such as X86CPUTopoInfo, X86CPUTopoIDs and
x86_topo_ids_from_idx?
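
A minimal sketch of what such a helper could look like (ARMCPUTopoIDs and
arm_topo_ids_from_idx are made-up names mirroring the x86 helpers, not an
existing interface; dies are omitted since the arm virt machine doesn't
support them yet):

  typedef struct ARMCPUTopoIDs {
      unsigned socket_id;
      unsigned cluster_id;
      unsigned core_id;
      unsigned thread_id;
  } ARMCPUTopoIDs;

  /* Derive hierarchical IDs from a flat CPU index using the -smp geometry. */
  static inline void arm_topo_ids_from_idx(const CpuTopology *smp,
                                           unsigned cpu_index,
                                           ARMCPUTopoIDs *ids)
  {
      ids->thread_id  = cpu_index % smp->threads;
      ids->core_id    = (cpu_index / smp->threads) % smp->cores;
      ids->cluster_id = (cpu_index / (smp->threads * smp->cores)) %
                        smp->clusters;
      ids->socket_id  = cpu_index / (smp->threads * smp->cores *
                                     smp->clusters);
  }

With that, socket-id falls in [0, sockets), thread-id in [0, threads) per
core, and a (socket, cluster, core, thread) tuple uniquely names a CPU.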

Thanks,
Yanan


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/3] hw/acpi/aml-build: Use existing CPU topology to build PPTT table
  2022-03-03  3:11 ` [PATCH v2 2/3] hw/acpi/aml-build: Use existing CPU topology to build PPTT table Gavin Shan
@ 2022-03-18  6:34   ` wangyanan (Y) via
  2022-03-18 13:28     ` Igor Mammedov
  0 siblings, 1 reply; 15+ messages in thread
From: wangyanan (Y) via @ 2022-03-18  6:34 UTC (permalink / raw)
  To: Gavin Shan, qemu-arm
  Cc: qemu-devel, imammedo, drjones, peter.maydell, richard.henderson,
	shan.gavin, zhenyzha

Hi Gavin,

On 2022/3/3 11:11, Gavin Shan wrote:
> When the PPTT table is built, the CPU topology is re-calculated, but
> it's unecessary because the CPU topology, except the cluster IDs,
> has been populated in virt_possible_cpu_arch_ids() on arm/virt machine.
>
> This avoids to re-calculate the CPU topology by reusing the existing
> one in ms->possible_cpus. However, the cluster ID for the CPU instance
> has to be calculated dynamically because there is no corresponding
> field in struct CpuInstanceProperties. Currently, the only user of
> build_pptt() is arm/virt machine.
>
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> ---
>   hw/acpi/aml-build.c | 106 ++++++++++++++++++++++++++++++++++----------
>   1 file changed, 82 insertions(+), 24 deletions(-)
>
> diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
> index 8966e16320..572cf5fc00 100644
> --- a/hw/acpi/aml-build.c
> +++ b/hw/acpi/aml-build.c
> @@ -2002,18 +2002,27 @@ void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
>                   const char *oem_id, const char *oem_table_id)
>   {
>       MachineClass *mc = MACHINE_GET_CLASS(ms);
> +    CPUArchIdList *cpus = ms->possible_cpus;
> +    GQueue *socket_list = g_queue_new();
> +    GQueue *cluster_list = g_queue_new();
> +    GQueue *core_list = g_queue_new();
>       GQueue *list = g_queue_new();
>       guint pptt_start = table_data->len;
>       guint parent_offset;
>       guint length, i;
> -    int uid = 0;
> -    int socket;
> +    int n, id, socket_id, cluster_id, core_id, thread_id;
>       AcpiTable table = { .sig = "PPTT", .rev = 2,
>                           .oem_id = oem_id, .oem_table_id = oem_table_id };
>   
>       acpi_table_begin(&table, table_data);
>   
> -    for (socket = 0; socket < ms->smp.sockets; socket++) {
> +    for (n = 0; n < cpus->len; n++) {
> +        socket_id = cpus->cpus[n].props.socket_id;
> +        if (g_queue_find(socket_list, GUINT_TO_POINTER(socket_id))) {
> +            continue;
> +        }
> +
> +        g_queue_push_tail(socket_list, GUINT_TO_POINTER(socket_id));
>           g_queue_push_tail(list,
>               GUINT_TO_POINTER(table_data->len - pptt_start));
>           build_processor_hierarchy_node(
> @@ -2023,65 +2032,114 @@ void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
>                * of a physical package
>                */
>               (1 << 0),
> -            0, socket, NULL, 0);
> +            0, socket_id, NULL, 0);
>       }
>   
>       if (mc->smp_props.clusters_supported) {
>           length = g_queue_get_length(list);
>           for (i = 0; i < length; i++) {
> -            int cluster;
> -
>               parent_offset = GPOINTER_TO_UINT(g_queue_pop_head(list));
> -            for (cluster = 0; cluster < ms->smp.clusters; cluster++) {
> +            socket_id = GPOINTER_TO_UINT(g_queue_pop_head(socket_list));
> +
> +            for (n = 0; n < cpus->len; n++) {
> +                if (cpus->cpus[n].props.socket_id != socket_id) {
> +                    continue;
> +                }
> +
> +                /*
> +                 * We have to calculate the cluster ID because it isn't
> +                 * available in the CPU instance properties.
> +                 */
Since we need the cluster ID now, maybe we can simply make it supported
in the CPU instance properties.
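
A minimal sketch of that idea, assuming a has_cluster_id/cluster_id pair
were added to CpuInstanceProperties (it does not exist in the current code),
might be, in virt_possible_cpu_arch_ids():

  ms->possible_cpus->cpus[n].props.has_cluster_id = true;
  ms->possible_cpus->cpus[n].props.cluster_id =
      n / (ms->smp.cores * ms->smp.threads);

Then build_pptt() could read the cluster ID directly instead of re-deriving
it from thread_id.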

Thanks,
Yanan
> +                cluster_id = cpus->cpus[n].props.thread_id /
> +                             (ms->smp.cores * ms->smp.threads);
> +                if (g_queue_find(cluster_list, GUINT_TO_POINTER(cluster_id))) {
> +                    continue;
> +                }
> +
> +                g_queue_push_tail(cluster_list, GUINT_TO_POINTER(cluster_id));
>                   g_queue_push_tail(list,
>                       GUINT_TO_POINTER(table_data->len - pptt_start));
>                   build_processor_hierarchy_node(
>                       table_data,
>                       (0 << 0), /* not a physical package */
> -                    parent_offset, cluster, NULL, 0);
> +                    parent_offset, cluster_id, NULL, 0);
>               }
>           }
>       }
>   
>       length = g_queue_get_length(list);
>       for (i = 0; i < length; i++) {
> -        int core;
> -
>           parent_offset = GPOINTER_TO_UINT(g_queue_pop_head(list));
> -        for (core = 0; core < ms->smp.cores; core++) {
> -            if (ms->smp.threads > 1) {
> -                g_queue_push_tail(list,
> -                    GUINT_TO_POINTER(table_data->len - pptt_start));
> -                build_processor_hierarchy_node(
> -                    table_data,
> -                    (0 << 0), /* not a physical package */
> -                    parent_offset, core, NULL, 0);
> -            } else {
> +        if (!mc->smp_props.clusters_supported) {
> +            socket_id = GPOINTER_TO_UINT(g_queue_pop_head(socket_list));
> +        } else {
> +            cluster_id = GPOINTER_TO_UINT(g_queue_pop_head(cluster_list));
> +        }
> +
> +        for (n = 0; n < cpus->len; n++) {
> +            if (!mc->smp_props.clusters_supported &&
> +                cpus->cpus[n].props.socket_id != socket_id) {
> +                continue;
> +            }
> +
> +            /*
> +             * We have to calculate the cluster ID because it isn't
> +             * available in the CPU instance properties.
> +             */
> +            id = cpus->cpus[n].props.thread_id /
> +                (ms->smp.cores * ms->smp.threads);
> +            if (mc->smp_props.clusters_supported && id != cluster_id) {
> +                continue;
> +            }
> +
> +            core_id = cpus->cpus[n].props.core_id;
> +            if (ms->smp.threads <= 1) {
>                   build_processor_hierarchy_node(
>                       table_data,
>                       (1 << 1) | /* ACPI Processor ID valid */
>                       (1 << 3),  /* Node is a Leaf */
> -                    parent_offset, uid++, NULL, 0);
> +                    parent_offset, core_id, NULL, 0);
> +                continue;
>               }
> +
> +            if (g_queue_find(core_list, GUINT_TO_POINTER(core_id))) {
> +                continue;
> +            }
> +
> +            g_queue_push_tail(core_list, GUINT_TO_POINTER(core_id));
> +            g_queue_push_tail(list,
> +                GUINT_TO_POINTER(table_data->len - pptt_start));
> +            build_processor_hierarchy_node(
> +                table_data,
> +                (0 << 0), /* not a physical package */
> +                parent_offset, core_id, NULL, 0);
>           }
>       }
>   
>       length = g_queue_get_length(list);
>       for (i = 0; i < length; i++) {
> -        int thread;
> -
>           parent_offset = GPOINTER_TO_UINT(g_queue_pop_head(list));
> -        for (thread = 0; thread < ms->smp.threads; thread++) {
> +        core_id = GPOINTER_TO_UINT(g_queue_pop_head(core_list));
> +
> +        for (n = 0; n < cpus->len; n++) {
> +            if (cpus->cpus[n].props.core_id != core_id) {
> +                continue;
> +            }
> +
> +            thread_id = cpus->cpus[n].props.thread_id;
>               build_processor_hierarchy_node(
>                   table_data,
>                   (1 << 1) | /* ACPI Processor ID valid */
>                   (1 << 2) | /* Processor is a Thread */
>                   (1 << 3),  /* Node is a Leaf */
> -                parent_offset, uid++, NULL, 0);
> +                parent_offset, thread_id, NULL, 0);
>           }
>       }
>   
>       g_queue_free(list);
> +    g_queue_free(core_list);
> +    g_queue_free(cluster_list);
> +    g_queue_free(socket_list);
>       acpi_table_end(linker, &table);
>   }
>   



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/3] hw/arm/virt: Fix CPU's default NUMA node ID
  2022-03-18  6:23   ` wangyanan (Y) via
@ 2022-03-18  9:56     ` Igor Mammedov
  2022-03-18 13:00       ` wangyanan (Y) via
  0 siblings, 1 reply; 15+ messages in thread
From: Igor Mammedov @ 2022-03-18  9:56 UTC (permalink / raw)
  To: wangyanan (Y)
  Cc: peter.maydell, drjones, Gavin Shan, richard.henderson,
	qemu-devel, zhenyzha, qemu-arm, shan.gavin

On Fri, 18 Mar 2022 14:23:34 +0800
"wangyanan (Y)" <wangyanan55@huawei.com> wrote:

> Hi Gavin,
> 
> On 2022/3/3 11:11, Gavin Shan wrote:
> > The default CPU-to-NUMA association is given by mc->get_default_cpu_node_id()
> > when it isn't provided explicitly. However, the CPU topology isn't fully
> > considered in the default association and it causes CPU topology broken
> > warnings on booting Linux guest.
> >
> > For example, the following warning messages are observed when the Linux guest
> > is booted with the following command lines.
> >
> >    /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
> >    -accel kvm -machine virt,gic-version=host               \
> >    -cpu host                                               \
> >    -smp 6,sockets=2,cores=3,threads=1                      \
> >    -m 1024M,slots=16,maxmem=64G                            \
> >    -object memory-backend-ram,id=mem0,size=128M            \
> >    -object memory-backend-ram,id=mem1,size=128M            \
> >    -object memory-backend-ram,id=mem2,size=128M            \
> >    -object memory-backend-ram,id=mem3,size=128M            \
> >    -object memory-backend-ram,id=mem4,size=128M            \
> >    -object memory-backend-ram,id=mem4,size=384M            \
> >    -numa node,nodeid=0,memdev=mem0                         \
> >    -numa node,nodeid=1,memdev=mem1                         \
> >    -numa node,nodeid=2,memdev=mem2                         \
> >    -numa node,nodeid=3,memdev=mem3                         \
> >    -numa node,nodeid=4,memdev=mem4                         \
> >    -numa node,nodeid=5,memdev=mem5
> >           :
> >    alternatives: patching kernel code
> >    BUG: arch topology borken
> >    the CLS domain not a subset of the MC domain
> >    <the above error log repeats>
> >    BUG: arch topology borken
> >    the DIE domain not a subset of the NODE domain
> >
> > With current implementation of mc->get_default_cpu_node_id(), CPU#0 to CPU#5
> > are associated with NODE#0 to NODE#5 separately. That's incorrect because
> > CPU#0/1/2 should be associated with same NUMA node because they're seated
> > in same socket.
> >
> > This fixes the issue by populating the CPU topology in virt_possible_cpu_arch_ids()
> > and considering the socket index when default CPU-to-NUMA association is given
> > in virt_possible_cpu_arch_ids(). With this applied, no more CPU topology broken
> > warnings are seen from the Linux guest. The 6 CPUs are associated with NODE#0/1,
> > but there are no CPUs associated with NODE#2/3/4/5.  
> It may be better to split this patch into two. One extends 
> virt_possible_cpu_arch_ids,
> and the other fixes the numa node ID issue.
> >
> > Signed-off-by: Gavin Shan <gshan@redhat.com>
> > ---
> >   hw/arm/virt.c | 17 ++++++++++++++++-
> >   1 file changed, 16 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index 46bf7ceddf..dee02b60fc 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -2488,7 +2488,9 @@ virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
> >   
> >   static int64_t virt_get_default_cpu_node_id(const MachineState *ms, int idx)
> >   {
> > -    return idx % ms->numa_state->num_nodes;
> > +    int64_t socket_id = ms->possible_cpus->cpus[idx].props.socket_id;
> > +
> > +    return socket_id % ms->numa_state->num_nodes;
> >   }
> >   
> >   static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
> > @@ -2496,6 +2498,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
> >       int n;
> >       unsigned int max_cpus = ms->smp.max_cpus;
> >       VirtMachineState *vms = VIRT_MACHINE(ms);
> > +    MachineClass *mc = MACHINE_GET_CLASS(vms);
> >   
> >       if (ms->possible_cpus) {
> >           assert(ms->possible_cpus->len == max_cpus);
> > @@ -2509,6 +2512,18 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
> >           ms->possible_cpus->cpus[n].type = ms->cpu_type;
> >           ms->possible_cpus->cpus[n].arch_id =
> >               virt_cpu_mp_affinity(vms, n);
> > +
> > +        ms->possible_cpus->cpus[n].props.has_socket_id = true;
> > +        ms->possible_cpus->cpus[n].props.socket_id =
> > +            n / (ms->smp.dies * ms->smp.clusters *
> > +                ms->smp.cores * ms->smp.threads);
> > +        if (mc->smp_props.dies_supported) {
> > +            ms->possible_cpus->cpus[n].props.has_die_id = true;
> > +            ms->possible_cpus->cpus[n].props.die_id =
> > +                n / (ms->smp.clusters * ms->smp.cores * ms->smp.threads);
> > +        }  
> I still don't think we need to consider dies if it's certainly not
> supported yet, IOW, we will never come into the if-branch.
> We are populating arm-specific topo info instead of the generic,
> we can probably uniformly update this part together with other
> necessary places when we decide to support dies for arm virt
> machine in the future. :)

It seems we do support dies, and they are supposed to be a NUMA boundary too,
so perhaps we should account for them when generating the node ID.
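
(As a purely illustrative assumption of what that could mean, not a concrete
proposal from this thread: if dies were enabled, the default mapping might
become something like
node_id = (socket_id * ms->smp.dies + die_id) % num_nodes.)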

> > +        ms->possible_cpus->cpus[n].props.has_core_id = true;
> > +        ms->possible_cpus->cpus[n].props.core_id = n / ms->smp.threads;
> >           ms->possible_cpus->cpus[n].props.has_thread_id = true;
> >           ms->possible_cpus->cpus[n].props.thread_id = n;
> >       }  
> Maybe we should use the same algorithm in x86_topo_ids_from_idx
> to populate the IDs, so that scope of socket-id will be [0, total_sockets),
> scope of thread-id is [0, threads_per_core), and so on. Then with a
> group of socket/cluster/core/thread-id, we determine a CPU.
> 
> Suggestion: For the long term, is it necessary now to add similar topo
> info infrastructure for ARM, such as X86CPUTopoInfo, X86CPUTopoIDs,
> x86_topo_ids_from_idx?
> 
> Thanks,
> Yanan
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/3] hw/arm/virt: Fix CPU's default NUMA node ID
  2022-03-18  9:56     ` Igor Mammedov
@ 2022-03-18 13:00       ` wangyanan (Y) via
  2022-03-18 13:27         ` Igor Mammedov
  0 siblings, 1 reply; 15+ messages in thread
From: wangyanan (Y) via @ 2022-03-18 13:00 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Gavin Shan, qemu-arm, qemu-devel, drjones, peter.maydell,
	richard.henderson, shan.gavin, zhenyzha

On 2022/3/18 17:56, Igor Mammedov wrote:
> On Fri, 18 Mar 2022 14:23:34 +0800
> "wangyanan (Y)" <wangyanan55@huawei.com> wrote:
>
>> Hi Gavin,
>>
>> On 2022/3/3 11:11, Gavin Shan wrote:
>>> The default CPU-to-NUMA association is given by mc->get_default_cpu_node_id()
>>> when it isn't provided explicitly. However, the CPU topology isn't fully
>>> considered in the default association and it causes CPU topology broken
>>> warnings on booting Linux guest.
>>>
>>> For example, the following warning messages are observed when the Linux guest
>>> is booted with the following command lines.
>>>
>>>     /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
>>>     -accel kvm -machine virt,gic-version=host               \
>>>     -cpu host                                               \
>>>     -smp 6,sockets=2,cores=3,threads=1                      \
>>>     -m 1024M,slots=16,maxmem=64G                            \
>>>     -object memory-backend-ram,id=mem0,size=128M            \
>>>     -object memory-backend-ram,id=mem1,size=128M            \
>>>     -object memory-backend-ram,id=mem2,size=128M            \
>>>     -object memory-backend-ram,id=mem3,size=128M            \
>>>     -object memory-backend-ram,id=mem4,size=128M            \
>>>     -object memory-backend-ram,id=mem4,size=384M            \
>>>     -numa node,nodeid=0,memdev=mem0                         \
>>>     -numa node,nodeid=1,memdev=mem1                         \
>>>     -numa node,nodeid=2,memdev=mem2                         \
>>>     -numa node,nodeid=3,memdev=mem3                         \
>>>     -numa node,nodeid=4,memdev=mem4                         \
>>>     -numa node,nodeid=5,memdev=mem5
>>>            :
>>>     alternatives: patching kernel code
>>>     BUG: arch topology borken
>>>     the CLS domain not a subset of the MC domain
>>>     <the above error log repeats>
>>>     BUG: arch topology borken
>>>     the DIE domain not a subset of the NODE domain
>>>
>>> With current implementation of mc->get_default_cpu_node_id(), CPU#0 to CPU#5
>>> are associated with NODE#0 to NODE#5 separately. That's incorrect because
>>> CPU#0/1/2 should be associated with same NUMA node because they're seated
>>> in same socket.
>>>
>>> This fixes the issue by populating the CPU topology in virt_possible_cpu_arch_ids()
>>> and considering the socket index when default CPU-to-NUMA association is given
>>> in virt_possible_cpu_arch_ids(). With this applied, no more CPU topology broken
>>> warnings are seen from the Linux guest. The 6 CPUs are associated with NODE#0/1,
>>> but there are no CPUs associated with NODE#2/3/4/5.
>> It may be better to split this patch into two. One extends
>> virt_possible_cpu_arch_ids,
>> and the other fixes the numa node ID issue.
>>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>>> ---
>>>    hw/arm/virt.c | 17 ++++++++++++++++-
>>>    1 file changed, 16 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>>> index 46bf7ceddf..dee02b60fc 100644
>>> --- a/hw/arm/virt.c
>>> +++ b/hw/arm/virt.c
>>> @@ -2488,7 +2488,9 @@ virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
>>>    
>>>    static int64_t virt_get_default_cpu_node_id(const MachineState *ms, int idx)
>>>    {
>>> -    return idx % ms->numa_state->num_nodes;
>>> +    int64_t socket_id = ms->possible_cpus->cpus[idx].props.socket_id;
>>> +
>>> +    return socket_id % ms->numa_state->num_nodes;
>>>    }
>>>    
>>>    static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>>> @@ -2496,6 +2498,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>>>        int n;
>>>        unsigned int max_cpus = ms->smp.max_cpus;
>>>        VirtMachineState *vms = VIRT_MACHINE(ms);
>>> +    MachineClass *mc = MACHINE_GET_CLASS(vms);
>>>    
>>>        if (ms->possible_cpus) {
>>>            assert(ms->possible_cpus->len == max_cpus);
>>> @@ -2509,6 +2512,18 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>>>            ms->possible_cpus->cpus[n].type = ms->cpu_type;
>>>            ms->possible_cpus->cpus[n].arch_id =
>>>                virt_cpu_mp_affinity(vms, n);
>>> +
>>> +        ms->possible_cpus->cpus[n].props.has_socket_id = true;
>>> +        ms->possible_cpus->cpus[n].props.socket_id =
>>> +            n / (ms->smp.dies * ms->smp.clusters *
>>> +                ms->smp.cores * ms->smp.threads);
>>> +        if (mc->smp_props.dies_supported) {
>>> +            ms->possible_cpus->cpus[n].props.has_die_id = true;
>>> +            ms->possible_cpus->cpus[n].props.die_id =
>>> +                n / (ms->smp.clusters * ms->smp.cores * ms->smp.threads);
>>> +        }
>> I still don't think we need to consider dies if it's certainly not
>> supported yet, IOW, we will never come into the if-branch.
>> We are populating arm-specific topo info instead of the generic,
>> we can probably uniformly update this part together with other
>> necessary places when we decide to support dies for arm virt
>> machine in the future. :)
> it seems we do support dies and they are supposed to be numa boundary too,
> so perhaps we should account for it when generating node-id.
Sorry, I actually meant that we currently don't support dies for arm, so
that we will always have "mc->smp_props.dies_supported == false" here,
which makes the code a bit unnecessary. Dies are only supported for x86
for now. :)

Thanks,
Yanan
>>> +        ms->possible_cpus->cpus[n].props.has_core_id = true;
>>> +        ms->possible_cpus->cpus[n].props.core_id = n / ms->smp.threads;
>>>            ms->possible_cpus->cpus[n].props.has_thread_id = true;
>>>            ms->possible_cpus->cpus[n].props.thread_id = n;
>>>        }
>> Maybe we should use the same algorithm in x86_topo_ids_from_idx
>> to populate the IDs, so that scope of socket-id will be [0, total_sockets),
>> scope of thread-id is [0, threads_per_core), and so on. Then with a
>> group of socket/cluster/core/thread-id, we determine a CPU.
>>
>> Suggestion: For the long term, is it necessary now to add similar topo
>> info infrastructure for ARM, such as X86CPUTopoInfo, X86CPUTopoIDs,
>> x86_topo_ids_from_idx?
>>
>> Thanks,
>> Yanan
>>
> .



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/3] hw/arm/virt: Fix CPU's default NUMA node ID
  2022-03-18 13:00       ` wangyanan (Y) via
@ 2022-03-18 13:27         ` Igor Mammedov
  2022-03-21  2:28           ` wangyanan (Y) via
  0 siblings, 1 reply; 15+ messages in thread
From: Igor Mammedov @ 2022-03-18 13:27 UTC (permalink / raw)
  To: wangyanan (Y)
  Cc: peter.maydell, drjones, Gavin Shan, richard.henderson,
	qemu-devel, zhenyzha, qemu-arm, shan.gavin

On Fri, 18 Mar 2022 21:00:35 +0800
"wangyanan (Y)" <wangyanan55@huawei.com> wrote:

> On 2022/3/18 17:56, Igor Mammedov wrote:
> > On Fri, 18 Mar 2022 14:23:34 +0800
> > "wangyanan (Y)" <wangyanan55@huawei.com> wrote:
> >  
> >> Hi Gavin,
> >>
> >> On 2022/3/3 11:11, Gavin Shan wrote:  
> >>> The default CPU-to-NUMA association is given by mc->get_default_cpu_node_id()
> >>> when it isn't provided explicitly. However, the CPU topology isn't fully
> >>> considered in the default association and it causes CPU topology broken
> >>> warnings on booting Linux guest.
> >>>
> >>> For example, the following warning messages are observed when the Linux guest
> >>> is booted with the following command lines.
> >>>
> >>>     /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
> >>>     -accel kvm -machine virt,gic-version=host               \
> >>>     -cpu host                                               \
> >>>     -smp 6,sockets=2,cores=3,threads=1                      \
> >>>     -m 1024M,slots=16,maxmem=64G                            \
> >>>     -object memory-backend-ram,id=mem0,size=128M            \
> >>>     -object memory-backend-ram,id=mem1,size=128M            \
> >>>     -object memory-backend-ram,id=mem2,size=128M            \
> >>>     -object memory-backend-ram,id=mem3,size=128M            \
> >>>     -object memory-backend-ram,id=mem4,size=128M            \
> >>>     -object memory-backend-ram,id=mem4,size=384M            \
> >>>     -numa node,nodeid=0,memdev=mem0                         \
> >>>     -numa node,nodeid=1,memdev=mem1                         \
> >>>     -numa node,nodeid=2,memdev=mem2                         \
> >>>     -numa node,nodeid=3,memdev=mem3                         \
> >>>     -numa node,nodeid=4,memdev=mem4                         \
> >>>     -numa node,nodeid=5,memdev=mem5
> >>>            :
> >>>     alternatives: patching kernel code
> >>>     BUG: arch topology borken
> >>>     the CLS domain not a subset of the MC domain
> >>>     <the above error log repeats>
> >>>     BUG: arch topology borken
> >>>     the DIE domain not a subset of the NODE domain
> >>>
> >>> With current implementation of mc->get_default_cpu_node_id(), CPU#0 to CPU#5
> >>> are associated with NODE#0 to NODE#5 separately. That's incorrect because
> >>> CPU#0/1/2 should be associated with same NUMA node because they're seated
> >>> in same socket.
> >>>
> >>> This fixes the issue by populating the CPU topology in virt_possible_cpu_arch_ids()
> >>> and considering the socket index when default CPU-to-NUMA association is given
> >>> in virt_possible_cpu_arch_ids(). With this applied, no more CPU topology broken
> >>> warnings are seen from the Linux guest. The 6 CPUs are associated with NODE#0/1,
> >>> but there are no CPUs associated with NODE#2/3/4/5.  
> >> It may be better to split this patch into two. One extends
> >> virt_possible_cpu_arch_ids,
> >> and the other fixes the numa node ID issue.  
> >>> Signed-off-by: Gavin Shan <gshan@redhat.com>
> >>> ---
> >>>    hw/arm/virt.c | 17 ++++++++++++++++-
> >>>    1 file changed, 16 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> >>> index 46bf7ceddf..dee02b60fc 100644
> >>> --- a/hw/arm/virt.c
> >>> +++ b/hw/arm/virt.c
> >>> @@ -2488,7 +2488,9 @@ virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
> >>>    
> >>>    static int64_t virt_get_default_cpu_node_id(const MachineState *ms, int idx)
> >>>    {
> >>> -    return idx % ms->numa_state->num_nodes;
> >>> +    int64_t socket_id = ms->possible_cpus->cpus[idx].props.socket_id;
> >>> +
> >>> +    return socket_id % ms->numa_state->num_nodes;
> >>>    }
> >>>    
> >>>    static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
> >>> @@ -2496,6 +2498,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
> >>>        int n;
> >>>        unsigned int max_cpus = ms->smp.max_cpus;
> >>>        VirtMachineState *vms = VIRT_MACHINE(ms);
> >>> +    MachineClass *mc = MACHINE_GET_CLASS(vms);
> >>>    
> >>>        if (ms->possible_cpus) {
> >>>            assert(ms->possible_cpus->len == max_cpus);
> >>> @@ -2509,6 +2512,18 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
> >>>            ms->possible_cpus->cpus[n].type = ms->cpu_type;
> >>>            ms->possible_cpus->cpus[n].arch_id =
> >>>                virt_cpu_mp_affinity(vms, n);
> >>> +
> >>> +        ms->possible_cpus->cpus[n].props.has_socket_id = true;
> >>> +        ms->possible_cpus->cpus[n].props.socket_id =
> >>> +            n / (ms->smp.dies * ms->smp.clusters *
> >>> +                ms->smp.cores * ms->smp.threads);
> >>> +        if (mc->smp_props.dies_supported) {
> >>> +            ms->possible_cpus->cpus[n].props.has_die_id = true;
> >>> +            ms->possible_cpus->cpus[n].props.die_id =
> >>> +                n / (ms->smp.clusters * ms->smp.cores * ms->smp.threads);
> >>> +        }  
> >> I still don't think we need to consider dies if it's certainly not
> >> supported yet, IOW, we will never come into the if-branch.
> >> We are populating arm-specific topo info instead of the generic,
> >> we can probably uniformly update this part together with other
> >> necessary places when we decide to support dies for arm virt
> >> machine in the future. :)  
> > it seems we do support dies and they are supposed to be numa boundary too,
> > so perhaps we should account for it when generating node-id.  
> Sorry, I actually meant that we currently don't support dies for arm, so 
> that
> we will always have "mc->smp_props.dies_supported == False" here, which
> makes the code a bit unnecessary.  dies are only supported for x86 for 
> now. :)
> 

then perhaps add an assert() here, so that we would notice and fix this
place when dies_supported becomes true.
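
For example, a minimal sketch of that suggestion:

  /*
   * Dies are not modelled on the arm virt machine yet; revisit this ID
   * assignment if dies_supported ever becomes true here.
   */
  assert(!mc->smp_props.dies_supported);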

> Thanks,
> Yanan
> >>> +        ms->possible_cpus->cpus[n].props.has_core_id = true;
> >>> +        ms->possible_cpus->cpus[n].props.core_id = n / ms->smp.threads;
> >>>            ms->possible_cpus->cpus[n].props.has_thread_id = true;
> >>>            ms->possible_cpus->cpus[n].props.thread_id = n;
> >>>        }  
> >> Maybe we should use the same algorithm in x86_topo_ids_from_idx
> >> to populate the IDs, so that scope of socket-id will be [0, total_sockets),
> >> scope of thread-id is [0, threads_per_core), and so on. Then with a
> >> group of socket/cluster/core/thread-id, we determine a CPU.
> >>
> >> Suggestion: For the long term, is it necessary now to add similar topo
> >> info infrastructure for ARM, such as X86CPUTopoInfo, X86CPUTopoIDs,
> >> x86_topo_ids_from_idx?
> >>
> >> Thanks,
> >> Yanan
> >>  
> > .  
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/3] hw/acpi/aml-build: Use existing CPU topology to build PPTT table
  2022-03-18  6:34   ` wangyanan (Y) via
@ 2022-03-18 13:28     ` Igor Mammedov
  2022-03-23  3:31       ` Gavin Shan
  0 siblings, 1 reply; 15+ messages in thread
From: Igor Mammedov @ 2022-03-18 13:28 UTC (permalink / raw)
  To: wangyanan (Y)
  Cc: peter.maydell, drjones, Gavin Shan, richard.henderson,
	qemu-devel, zhenyzha, qemu-arm, shan.gavin

On Fri, 18 Mar 2022 14:34:12 +0800
"wangyanan (Y)" <wangyanan55@huawei.com> wrote:

> Hi Gavin,
> 
> On 2022/3/3 11:11, Gavin Shan wrote:
> > When the PPTT table is built, the CPU topology is re-calculated, but
> > it's unecessary because the CPU topology, except the cluster IDs,
> > has been populated in virt_possible_cpu_arch_ids() on arm/virt machine.
> >
> > This avoids to re-calculate the CPU topology by reusing the existing
> > one in ms->possible_cpus. However, the cluster ID for the CPU instance
> > has to be calculated dynamically because there is no corresponding
> > field in struct CpuInstanceProperties. Currently, the only user of
> > build_pptt() is arm/virt machine.
> >
> > Signed-off-by: Gavin Shan <gshan@redhat.com>
> > ---
> >   hw/acpi/aml-build.c | 106 ++++++++++++++++++++++++++++++++++----------
> >   1 file changed, 82 insertions(+), 24 deletions(-)
> >
> > diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
> > index 8966e16320..572cf5fc00 100644
> > --- a/hw/acpi/aml-build.c
> > +++ b/hw/acpi/aml-build.c
> > @@ -2002,18 +2002,27 @@ void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
> >                   const char *oem_id, const char *oem_table_id)
> >   {
> >       MachineClass *mc = MACHINE_GET_CLASS(ms);
> > +    CPUArchIdList *cpus = ms->possible_cpus;
> > +    GQueue *socket_list = g_queue_new();
> > +    GQueue *cluster_list = g_queue_new();
> > +    GQueue *core_list = g_queue_new();
> >       GQueue *list = g_queue_new();
> >       guint pptt_start = table_data->len;
> >       guint parent_offset;
> >       guint length, i;
> > -    int uid = 0;
> > -    int socket;
> > +    int n, id, socket_id, cluster_id, core_id, thread_id;
> >       AcpiTable table = { .sig = "PPTT", .rev = 2,
> >                           .oem_id = oem_id, .oem_table_id = oem_table_id };
> >   
> >       acpi_table_begin(&table, table_data);
> >   
> > -    for (socket = 0; socket < ms->smp.sockets; socket++) {
> > +    for (n = 0; n < cpus->len; n++) {
> > +        socket_id = cpus->cpus[n].props.socket_id;
> > +        if (g_queue_find(socket_list, GUINT_TO_POINTER(socket_id))) {
> > +            continue;
> > +        }
> > +
> > +        g_queue_push_tail(socket_list, GUINT_TO_POINTER(socket_id));
> >           g_queue_push_tail(list,
> >               GUINT_TO_POINTER(table_data->len - pptt_start));
> >           build_processor_hierarchy_node(
> > @@ -2023,65 +2032,114 @@ void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
> >                * of a physical package
> >                */
> >               (1 << 0),
> > -            0, socket, NULL, 0);
> > +            0, socket_id, NULL, 0);
> >       }
> >   
> >       if (mc->smp_props.clusters_supported) {
> >           length = g_queue_get_length(list);
> >           for (i = 0; i < length; i++) {
> > -            int cluster;
> > -
> >               parent_offset = GPOINTER_TO_UINT(g_queue_pop_head(list));
> > -            for (cluster = 0; cluster < ms->smp.clusters; cluster++) {
> > +            socket_id = GPOINTER_TO_UINT(g_queue_pop_head(socket_list));
> > +
> > +            for (n = 0; n < cpus->len; n++) {
> > +                if (cpus->cpus[n].props.socket_id != socket_id) {
> > +                    continue;
> > +                }
> > +
> > +                /*
> > +                 * We have to calculate the cluster ID because it isn't
> > +                 * available in the CPU instance properties.
> > +                 */  
> Since we need cluster ID now, maybe we can simply make it supported
> in the CPU instance properties.

agreed
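
If cluster_id were exposed there, the dynamic calculation above could
collapse to something like the following (a sketch only, assuming a
cluster_id field is added to CpuInstanceProperties):

    cluster_id = cpus->cpus[n].props.cluster_id;
    if (g_queue_find(cluster_list, GUINT_TO_POINTER(cluster_id))) {
        continue;
    }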

> 
> Thanks,
> Yanan
> > +                cluster_id = cpus->cpus[n].props.thread_id /
> > +                             (ms->smp.cores * ms->smp.threads);
> > +                if (g_queue_find(cluster_list, GUINT_TO_POINTER(cluster_id))) {
> > +                    continue;
> > +                }
> > +
> > +                g_queue_push_tail(cluster_list, GUINT_TO_POINTER(cluster_id));
> >                   g_queue_push_tail(list,
> >                       GUINT_TO_POINTER(table_data->len - pptt_start));
> >                   build_processor_hierarchy_node(
> >                       table_data,
> >                       (0 << 0), /* not a physical package */
> > -                    parent_offset, cluster, NULL, 0);
> > +                    parent_offset, cluster_id, NULL, 0);
> >               }
> >           }
> >       }
> >   
> >       length = g_queue_get_length(list);
> >       for (i = 0; i < length; i++) {
> > -        int core;
> > -
> >           parent_offset = GPOINTER_TO_UINT(g_queue_pop_head(list));
> > -        for (core = 0; core < ms->smp.cores; core++) {
> > -            if (ms->smp.threads > 1) {
> > -                g_queue_push_tail(list,
> > -                    GUINT_TO_POINTER(table_data->len - pptt_start));
> > -                build_processor_hierarchy_node(
> > -                    table_data,
> > -                    (0 << 0), /* not a physical package */
> > -                    parent_offset, core, NULL, 0);
> > -            } else {
> > +        if (!mc->smp_props.clusters_supported) {
> > +            socket_id = GPOINTER_TO_UINT(g_queue_pop_head(socket_list));
> > +        } else {
> > +            cluster_id = GPOINTER_TO_UINT(g_queue_pop_head(cluster_list));
> > +        }
> > +
> > +        for (n = 0; n < cpus->len; n++) {
> > +            if (!mc->smp_props.clusters_supported &&
> > +                cpus->cpus[n].props.socket_id != socket_id) {
> > +                continue;
> > +            }
> > +
> > +            /*
> > +             * We have to calculate the cluster ID because it isn't
> > +             * available in the CPU instance properties.
> > +             */
> > +            id = cpus->cpus[n].props.thread_id /
> > +                (ms->smp.cores * ms->smp.threads);
> > +            if (mc->smp_props.clusters_supported && id != cluster_id) {
> > +                continue;
> > +            }
> > +
> > +            core_id = cpus->cpus[n].props.core_id;
> > +            if (ms->smp.threads <= 1) {
> >                   build_processor_hierarchy_node(
> >                       table_data,
> >                       (1 << 1) | /* ACPI Processor ID valid */
> >                       (1 << 3),  /* Node is a Leaf */
> > -                    parent_offset, uid++, NULL, 0);
> > +                    parent_offset, core_id, NULL, 0);
> > +                continue;
> >               }
> > +
> > +            if (g_queue_find(core_list, GUINT_TO_POINTER(core_id))) {
> > +                continue;
> > +            }
> > +
> > +            g_queue_push_tail(core_list, GUINT_TO_POINTER(core_id));
> > +            g_queue_push_tail(list,
> > +                GUINT_TO_POINTER(table_data->len - pptt_start));
> > +            build_processor_hierarchy_node(
> > +                table_data,
> > +                (0 << 0), /* not a physical package */
> > +                parent_offset, core_id, NULL, 0);
> >           }
> >       }
> >   
> >       length = g_queue_get_length(list);
> >       for (i = 0; i < length; i++) {
> > -        int thread;
> > -
> >           parent_offset = GPOINTER_TO_UINT(g_queue_pop_head(list));
> > -        for (thread = 0; thread < ms->smp.threads; thread++) {
> > +        core_id = GPOINTER_TO_UINT(g_queue_pop_head(core_list));
> > +
> > +        for (n = 0; n < cpus->len; n++) {
> > +            if (cpus->cpus[n].props.core_id != core_id) {
> > +                continue;
> > +            }
> > +
> > +            thread_id = cpus->cpus[n].props.thread_id;
> >               build_processor_hierarchy_node(
> >                   table_data,
> >                   (1 << 1) | /* ACPI Processor ID valid */
> >                   (1 << 2) | /* Processor is a Thread */
> >                   (1 << 3),  /* Node is a Leaf */
> > -                parent_offset, uid++, NULL, 0);
> > +                parent_offset, thread_id, NULL, 0);
> >           }
> >       }
> >   
> >       g_queue_free(list);
> > +    g_queue_free(core_list);
> > +    g_queue_free(cluster_list);
> > +    g_queue_free(socket_list);
> >       acpi_table_end(linker, &table);
> >   }
> >     
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/3] hw/arm/virt: Fix CPU's default NUMA node ID
  2022-03-18 13:27         ` Igor Mammedov
@ 2022-03-21  2:28           ` wangyanan (Y) via
  2022-03-23  3:26             ` Gavin Shan
  2022-03-23  3:29             ` Gavin Shan
  0 siblings, 2 replies; 15+ messages in thread
From: wangyanan (Y) via @ 2022-03-21  2:28 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Gavin Shan, qemu-arm, qemu-devel, drjones, peter.maydell,
	richard.henderson, shan.gavin, zhenyzha

On 2022/3/18 21:27, Igor Mammedov wrote:
> On Fri, 18 Mar 2022 21:00:35 +0800
> "wangyanan (Y)" <wangyanan55@huawei.com> wrote:
>
>> On 2022/3/18 17:56, Igor Mammedov wrote:
>>> On Fri, 18 Mar 2022 14:23:34 +0800
>>> "wangyanan (Y)" <wangyanan55@huawei.com> wrote:
>>>   
>>>> Hi Gavin,
>>>>
>>>> On 2022/3/3 11:11, Gavin Shan wrote:
>>>>> The default CPU-to-NUMA association is given by mc->get_default_cpu_node_id()
>>>>> when it isn't provided explicitly. However, the CPU topology isn't fully
>>>>> considered in the default association and it causes CPU topology broken
>>>>> warnings on booting Linux guest.
>>>>>
>>>>> For example, the following warning messages are observed when the Linux guest
>>>>> is booted with the following command lines.
>>>>>
>>>>>      /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
>>>>>      -accel kvm -machine virt,gic-version=host               \
>>>>>      -cpu host                                               \
>>>>>      -smp 6,sockets=2,cores=3,threads=1                      \
>>>>>      -m 1024M,slots=16,maxmem=64G                            \
>>>>>      -object memory-backend-ram,id=mem0,size=128M            \
>>>>>      -object memory-backend-ram,id=mem1,size=128M            \
>>>>>      -object memory-backend-ram,id=mem2,size=128M            \
>>>>>      -object memory-backend-ram,id=mem3,size=128M            \
>>>>>      -object memory-backend-ram,id=mem4,size=128M            \
>>>>>      -object memory-backend-ram,id=mem5,size=384M            \
>>>>>      -numa node,nodeid=0,memdev=mem0                         \
>>>>>      -numa node,nodeid=1,memdev=mem1                         \
>>>>>      -numa node,nodeid=2,memdev=mem2                         \
>>>>>      -numa node,nodeid=3,memdev=mem3                         \
>>>>>      -numa node,nodeid=4,memdev=mem4                         \
>>>>>      -numa node,nodeid=5,memdev=mem5
>>>>>             :
>>>>>      alternatives: patching kernel code
>>>>>      BUG: arch topology borken
>>>>>      the CLS domain not a subset of the MC domain
>>>>>      <the above error log repeats>
>>>>>      BUG: arch topology borken
>>>>>      the DIE domain not a subset of the NODE domain
>>>>>
>>>>> With current implementation of mc->get_default_cpu_node_id(), CPU#0 to CPU#5
>>>>> are associated with NODE#0 to NODE#5 separately. That's incorrect because
>>>>> CPU#0/1/2 should be associated with same NUMA node because they're seated
>>>>> in same socket.
>>>>>
>>>>> This fixes the issue by populating the CPU topology in virt_possible_cpu_arch_ids()
>>>>> and considering the socket index when default CPU-to-NUMA association is given
>>>>> in virt_possible_cpu_arch_ids(). With this applied, no more CPU topology broken
>>>>> warnings are seen from the Linux guest. The 6 CPUs are associated with NODE#0/1,
>>>>> but there are no CPUs associated with NODE#2/3/4/5.
>>>> It may be better to split this patch into two. One extends
>>>> virt_possible_cpu_arch_ids,
>>>> and the other fixes the numa node ID issue.
>>>>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>>>>> ---
>>>>>     hw/arm/virt.c | 17 ++++++++++++++++-
>>>>>     1 file changed, 16 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>>>>> index 46bf7ceddf..dee02b60fc 100644
>>>>> --- a/hw/arm/virt.c
>>>>> +++ b/hw/arm/virt.c
>>>>> @@ -2488,7 +2488,9 @@ virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
>>>>>     
>>>>>     static int64_t virt_get_default_cpu_node_id(const MachineState *ms, int idx)
>>>>>     {
>>>>> -    return idx % ms->numa_state->num_nodes;
>>>>> +    int64_t socket_id = ms->possible_cpus->cpus[idx].props.socket_id;
>>>>> +
>>>>> +    return socket_id % ms->numa_state->num_nodes;
>>>>>     }
>>>>>     
>>>>>     static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>>>>> @@ -2496,6 +2498,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>>>>>         int n;
>>>>>         unsigned int max_cpus = ms->smp.max_cpus;
>>>>>         VirtMachineState *vms = VIRT_MACHINE(ms);
>>>>> +    MachineClass *mc = MACHINE_GET_CLASS(vms);
>>>>>     
>>>>>         if (ms->possible_cpus) {
>>>>>             assert(ms->possible_cpus->len == max_cpus);
>>>>> @@ -2509,6 +2512,18 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>>>>>             ms->possible_cpus->cpus[n].type = ms->cpu_type;
>>>>>             ms->possible_cpus->cpus[n].arch_id =
>>>>>                 virt_cpu_mp_affinity(vms, n);
>>>>> +
>>>>> +        ms->possible_cpus->cpus[n].props.has_socket_id = true;
>>>>> +        ms->possible_cpus->cpus[n].props.socket_id =
>>>>> +            n / (ms->smp.dies * ms->smp.clusters *
>>>>> +                ms->smp.cores * ms->smp.threads);
>>>>> +        if (mc->smp_props.dies_supported) {
>>>>> +            ms->possible_cpus->cpus[n].props.has_die_id = true;
>>>>> +            ms->possible_cpus->cpus[n].props.die_id =
>>>>> +                n / (ms->smp.clusters * ms->smp.cores * ms->smp.threads);
>>>>> +        }
>>>> I still don't think we need to consider dies if it's certainly not
>>>> supported yet, IOW, we will never come into the if-branch.
>>>> We are populating arm-specific topo info instead of the generic,
>>>> we can probably uniformly update this part together with other
>>>> necessary places when we decide to support dies for arm virt
>>>> machine in the future. :)
>>> it seems we do support dies and they are supposed to be numa boundary too,
>>> so perhaps we should account for it when generating node-id.
>> Sorry, I actually meant that we currently don't support dies for arm, so
>> that
>> we will always have "mc->smp_props.dies_supported == False" here, which
>> makes the code a bit unnecessary.  dies are only supported for x86 for
>> now. :)
>>
> then perhaps add an assert() here, so that we would notice and fix this
> place when dies_supported becomes true.
A simple assert() works here, I think.

Thanks,
Yanan
>> Thanks,
>> Yanan
>>>>> +        ms->possible_cpus->cpus[n].props.has_core_id = true;
>>>>> +        ms->possible_cpus->cpus[n].props.core_id = n / ms->smp.threads;
>>>>>             ms->possible_cpus->cpus[n].props.has_thread_id = true;
>>>>>             ms->possible_cpus->cpus[n].props.thread_id = n;
>>>>>         }
>>>> Maybe we should use the same algorithm in x86_topo_ids_from_idx
>>>> to populate the IDs, so that scope of socket-id will be [0, total_sockets),
>>>> scope of thread-id is [0, threads_per_core), and so on. Then with a
>>>> group of socket/cluster/core/thread-id, we determine a CPU.
>>>>
>>>> Suggestion: For the long term, is it necessary now to add similar topo
>>>> info infrastructure for ARM, such as X86CPUTopoInfo, X86CPUTopoIDs,
>>>> x86_topo_ids_from_idx?
>>>>
>>>> Thanks,
>>>> Yanan
>>>>   
>>> .
> .



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/3] hw/arm/virt: Fix CPU's default NUMA node ID
  2022-03-21  2:28           ` wangyanan (Y) via
@ 2022-03-23  3:26             ` Gavin Shan
  2022-03-23  3:29             ` Gavin Shan
  1 sibling, 0 replies; 15+ messages in thread
From: Gavin Shan @ 2022-03-23  3:26 UTC (permalink / raw)
  To: wangyanan (Y), Igor Mammedov
  Cc: peter.maydell, drjones, richard.henderson, qemu-devel, zhenyzha,
	qemu-arm, shan.gavin

Hi Yanan and Igor,

On 3/21/22 10:28 AM, wangyanan (Y) wrote:
> On 2022/3/18 21:27, Igor Mammedov wrote:
>> On Fri, 18 Mar 2022 21:00:35 +0800
>> "wangyanan (Y)" <wangyanan55@huawei.com> wrote:
>>> On 2022/3/18 17:56, Igor Mammedov wrote:
>>>> On Fri, 18 Mar 2022 14:23:34 +0800
>>>> "wangyanan (Y)" <wangyanan55@huawei.com> wrote:
>>>>> On 2022/3/3 11:11, Gavin Shan wrote:
>>>>>> The default CPU-to-NUMA association is given by mc->get_default_cpu_node_id()
>>>>>> when it isn't provided explicitly. However, the CPU topology isn't fully
>>>>>> considered in the default association and it causes CPU topology broken
>>>>>> warnings on booting Linux guest.
>>>>>>
>>>>>> For example, the following warning messages are observed when the Linux guest
>>>>>> is booted with the following command lines.
>>>>>>
>>>>>>      /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
>>>>>>      -accel kvm -machine virt,gic-version=host               \
>>>>>>      -cpu host                                               \
>>>>>>      -smp 6,sockets=2,cores=3,threads=1                      \
>>>>>>      -m 1024M,slots=16,maxmem=64G                            \
>>>>>>      -object memory-backend-ram,id=mem0,size=128M            \
>>>>>>      -object memory-backend-ram,id=mem1,size=128M            \
>>>>>>      -object memory-backend-ram,id=mem2,size=128M            \
>>>>>>      -object memory-backend-ram,id=mem3,size=128M            \
>>>>>>      -object memory-backend-ram,id=mem4,size=128M            \
>>>>>>      -object memory-backend-ram,id=mem5,size=384M            \
>>>>>>      -numa node,nodeid=0,memdev=mem0                         \
>>>>>>      -numa node,nodeid=1,memdev=mem1                         \
>>>>>>      -numa node,nodeid=2,memdev=mem2                         \
>>>>>>      -numa node,nodeid=3,memdev=mem3                         \
>>>>>>      -numa node,nodeid=4,memdev=mem4                         \
>>>>>>      -numa node,nodeid=5,memdev=mem5
>>>>>>             :
>>>>>>      alternatives: patching kernel code
>>>>>>      BUG: arch topology borken
>>>>>>      the CLS domain not a subset of the MC domain
>>>>>>      <the above error log repeats>
>>>>>>      BUG: arch topology borken
>>>>>>      the DIE domain not a subset of the NODE domain
>>>>>>
>>>>>> With current implementation of mc->get_default_cpu_node_id(), CPU#0 to CPU#5
>>>>>> are associated with NODE#0 to NODE#5 separately. That's incorrect because
>>>>>> CPU#0/1/2 should be associated with same NUMA node because they're seated
>>>>>> in same socket.
>>>>>>
>>>>>> This fixes the issue by populating the CPU topology in virt_possible_cpu_arch_ids()
>>>>>> and considering the socket index when default CPU-to-NUMA association is given
>>>>>> in virt_possible_cpu_arch_ids(). With this applied, no more CPU topology broken
>>>>>> warnings are seen from the Linux guest. The 6 CPUs are associated with NODE#0/1,
>>>>>> but there are no CPUs associated with NODE#2/3/4/5.
>>>>> It may be better to split this patch into two. One extends
>>>>> virt_possible_cpu_arch_ids,
>>>>> and the other fixes the numa node ID issue.
>>>>>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>>>>>> ---
>>>>>>     hw/arm/virt.c | 17 ++++++++++++++++-
>>>>>>     1 file changed, 16 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>>>>>> index 46bf7ceddf..dee02b60fc 100644
>>>>>> --- a/hw/arm/virt.c
>>>>>> +++ b/hw/arm/virt.c
>>>>>> @@ -2488,7 +2488,9 @@ virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
>>>>>>     static int64_t virt_get_default_cpu_node_id(const MachineState *ms, int idx)
>>>>>>     {
>>>>>> -    return idx % ms->numa_state->num_nodes;
>>>>>> +    int64_t socket_id = ms->possible_cpus->cpus[idx].props.socket_id;
>>>>>> +
>>>>>> +    return socket_id % ms->numa_state->num_nodes;
>>>>>>     }
>>>>>>     static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>>>>>> @@ -2496,6 +2498,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>>>>>>         int n;
>>>>>>         unsigned int max_cpus = ms->smp.max_cpus;
>>>>>>         VirtMachineState *vms = VIRT_MACHINE(ms);
>>>>>> +    MachineClass *mc = MACHINE_GET_CLASS(vms);
>>>>>>         if (ms->possible_cpus) {
>>>>>>             assert(ms->possible_cpus->len == max_cpus);
>>>>>> @@ -2509,6 +2512,18 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>>>>>>             ms->possible_cpus->cpus[n].type = ms->cpu_type;
>>>>>>             ms->possible_cpus->cpus[n].arch_id =
>>>>>>                 virt_cpu_mp_affinity(vms, n);
>>>>>> +
>>>>>> +        ms->possible_cpus->cpus[n].props.has_socket_id = true;
>>>>>> +        ms->possible_cpus->cpus[n].props.socket_id =
>>>>>> +            n / (ms->smp.dies * ms->smp.clusters *
>>>>>> +                ms->smp.cores * ms->smp.threads);
>>>>>> +        if (mc->smp_props.dies_supported) {
>>>>>> +            ms->possible_cpus->cpus[n].props.has_die_id = true;
>>>>>> +            ms->possible_cpus->cpus[n].props.die_id =
>>>>>> +                n / (ms->smp.clusters * ms->smp.cores * ms->smp.threads);
>>>>>> +        }
>>>>> I still don't think we need to consider dies if it's certainly not
>>>>> supported yet, IOW, we will never come into the if-branch.
>>>>> We are populating arm-specific topo info instead of the generic,
>>>>> we can probably uniformly update this part together with other
>>>>> necessary places when we decide to support dies for arm virt
>>>>> machine in the future. :)
>>>> it seems we do support dies and they are supposed to be numa boundary too,
>>>> so perhaps we should account for it when generating node-id.
>>> Sorry, I actually meant that we currently don't support dies for arm, so
>>> that
>>> we will always have "mc->smp_props.dies_supported == False" here, which
>>> makes the code a bit unnecessary.  dies are only supported for x86 for
>>> now. :)
>>>
>> then perhaps add an assert() here, so that we would notice and fix this
>> place when dies_supported becomes true.
> A simple assert() works here, I think.
> 

Ok. I will make the changes in v3. ms->smp.dies won't be included in
the calculation and assert(!mc->smp_props.dies_supported) will be
added.

>>>>>> +        ms->possible_cpus->cpus[n].props.has_core_id = true;
>>>>>> +        ms->possible_cpus->cpus[n].props.core_id = n / ms->smp.threads;
>>>>>>             ms->possible_cpus->cpus[n].props.has_thread_id = true;
>>>>>>             ms->possible_cpus->cpus[n].props.thread_id = n;
>>>>>>         }
>>>>> Maybe we should use the same algorithm in x86_topo_ids_from_idx
>>>>> to populate the IDs, so that scope of socket-id will be [0, total_sockets),
>>>>> scope of thread-id is [0, threads_per_core), and so on. Then with a
>>>>> group of socket/cluster/core/thread-id, we determine a CPU.
>>>>>
>>>>> Suggestion: For the long term, is it necessary now to add similar topo
>>>>> info infrastructure for ARM, such as X86CPUTopoInfo, X86CPUTopoIDs,
>>>>> x86_topo_ids_from_idx?
>>>>>

It's a good idea, but I think it's something for the future. Once
dies are supported, we may have a generic mechanism to generate
the CPU topology from the CPU index or thread ID. It would be
nice if the mechanism could be shared by various architectures.
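
As a rough illustration of what such a shared helper might look like (a
sketch only; virt_topo_ids_from_idx does not exist in QEMU and simply
mirrors what x86_topo_ids_from_idx does for x86):

    static void virt_topo_ids_from_idx(const MachineState *ms, int n,
                                       int *socket_id, int *cluster_id,
                                       int *core_id, int *thread_id)
    {
        unsigned int threads = ms->smp.threads;
        unsigned int cores = ms->smp.cores;
        unsigned int clusters = ms->smp.clusters;

        /* Per-level IDs, each confined to its own range. */
        *thread_id = n % threads;
        *core_id = (n / threads) % cores;
        *cluster_id = (n / (threads * cores)) % clusters;
        *socket_id = n / (threads * cores * clusters);
    }

With that, socket-id stays in [0, sockets), cluster-id in [0, clusters),
core-id in [0, cores) and thread-id in [0, threads), which matches the
scoping suggested above.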

In a guest booted with the command lines given in the commit log,
the CPUs are associated with NUMA node#0/1 and no CPUs are associated
with node#2/3/4/5 after the patch is applied to the arm/virt machine.
x86 shows the same behavior.
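
To spell out the arithmetic (a worked example based on the command line in
the commit log, not additional test output): with sockets=2, cores=3 and
threads=1, CPU#0-2 get socket_id 0 and CPU#3-5 get socket_id 1, so
socket_id % num_nodes (with num_nodes = 6) maps CPU#0-2 to node 0 and
CPU#3-5 to node 1, leaving node 2-5 without any CPUs.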

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/3] hw/arm/virt: Fix CPU's default NUMA node ID
  2022-03-21  2:28           ` wangyanan (Y) via
  2022-03-23  3:26             ` Gavin Shan
@ 2022-03-23  3:29             ` Gavin Shan
  1 sibling, 0 replies; 15+ messages in thread
From: Gavin Shan @ 2022-03-23  3:29 UTC (permalink / raw)
  To: wangyanan (Y), Igor Mammedov
  Cc: peter.maydell, drjones, richard.henderson, qemu-devel, zhenyzha,
	qemu-arm, shan.gavin

Hi Yanan,

On 3/21/22 10:28 AM, wangyanan (Y) wrote:
> On 2022/3/18 21:27, Igor Mammedov wrote:
>> On Fri, 18 Mar 2022 21:00:35 +0800
>> "wangyanan (Y)" <wangyanan55@huawei.com> wrote:
>>
>>> On 2022/3/18 17:56, Igor Mammedov wrote:
>>>> On Fri, 18 Mar 2022 14:23:34 +0800
>>>> "wangyanan (Y)" <wangyanan55@huawei.com> wrote:
>>>>> Hi Gavin,
>>>>>
>>>>> On 2022/3/3 11:11, Gavin Shan wrote:
>>>>>> The default CPU-to-NUMA association is given by mc->get_default_cpu_node_id()
>>>>>> when it isn't provided explicitly. However, the CPU topology isn't fully
>>>>>> considered in the default association and it causes CPU topology broken
>>>>>> warnings on booting Linux guest.
>>>>>>
>>>>>> For example, the following warning messages are observed when the Linux guest
>>>>>> is booted with the following command lines.
>>>>>>
>>>>>>      /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
>>>>>>      -accel kvm -machine virt,gic-version=host               \
>>>>>>      -cpu host                                               \
>>>>>>      -smp 6,sockets=2,cores=3,threads=1                      \
>>>>>>      -m 1024M,slots=16,maxmem=64G                            \
>>>>>>      -object memory-backend-ram,id=mem0,size=128M            \
>>>>>>      -object memory-backend-ram,id=mem1,size=128M            \
>>>>>>      -object memory-backend-ram,id=mem2,size=128M            \
>>>>>>      -object memory-backend-ram,id=mem3,size=128M            \
>>>>>>      -object memory-backend-ram,id=mem4,size=128M            \
>>>>>>      -object memory-backend-ram,id=mem5,size=384M            \
>>>>>>      -numa node,nodeid=0,memdev=mem0                         \
>>>>>>      -numa node,nodeid=1,memdev=mem1                         \
>>>>>>      -numa node,nodeid=2,memdev=mem2                         \
>>>>>>      -numa node,nodeid=3,memdev=mem3                         \
>>>>>>      -numa node,nodeid=4,memdev=mem4                         \
>>>>>>      -numa node,nodeid=5,memdev=mem5
>>>>>>             :
>>>>>>      alternatives: patching kernel code
>>>>>>      BUG: arch topology borken
>>>>>>      the CLS domain not a subset of the MC domain
>>>>>>      <the above error log repeats>
>>>>>>      BUG: arch topology borken
>>>>>>      the DIE domain not a subset of the NODE domain
>>>>>>
>>>>>> With current implementation of mc->get_default_cpu_node_id(), CPU#0 to CPU#5
>>>>>> are associated with NODE#0 to NODE#5 separately. That's incorrect because
>>>>>> CPU#0/1/2 should be associated with same NUMA node because they're seated
>>>>>> in same socket.
>>>>>>
>>>>>> This fixes the issue by populating the CPU topology in virt_possible_cpu_arch_ids()
>>>>>> and considering the socket index when default CPU-to-NUMA association is given
>>>>>> in virt_possible_cpu_arch_ids(). With this applied, no more CPU topology broken
>>>>>> warnings are seen from the Linux guest. The 6 CPUs are associated with NODE#0/1,
>>>>>> but there are no CPUs associated with NODE#2/3/4/5.
>>>>> It may be better to split this patch into two. One extends
>>>>> virt_possible_cpu_arch_ids,

Agreed, I will do that in v3. Sorry that I forgot to mention it in my last reply.

Thanks,
Gavin

>>>>> and the other fixes the numa node ID issue.
>>>>>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>>>>>> ---
>>>>>>     hw/arm/virt.c | 17 ++++++++++++++++-
>>>>>>     1 file changed, 16 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>>>>>> index 46bf7ceddf..dee02b60fc 100644
>>>>>> --- a/hw/arm/virt.c
>>>>>> +++ b/hw/arm/virt.c
>>>>>> @@ -2488,7 +2488,9 @@ virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
>>>>>>     static int64_t virt_get_default_cpu_node_id(const MachineState *ms, int idx)
>>>>>>     {
>>>>>> -    return idx % ms->numa_state->num_nodes;
>>>>>> +    int64_t socket_id = ms->possible_cpus->cpus[idx].props.socket_id;
>>>>>> +
>>>>>> +    return socket_id % ms->numa_state->num_nodes;
>>>>>>     }
>>>>>>     static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>>>>>> @@ -2496,6 +2498,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>>>>>>         int n;
>>>>>>         unsigned int max_cpus = ms->smp.max_cpus;
>>>>>>         VirtMachineState *vms = VIRT_MACHINE(ms);
>>>>>> +    MachineClass *mc = MACHINE_GET_CLASS(vms);
>>>>>>         if (ms->possible_cpus) {
>>>>>>             assert(ms->possible_cpus->len == max_cpus);
>>>>>> @@ -2509,6 +2512,18 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>>>>>>             ms->possible_cpus->cpus[n].type = ms->cpu_type;
>>>>>>             ms->possible_cpus->cpus[n].arch_id =
>>>>>>                 virt_cpu_mp_affinity(vms, n);
>>>>>> +
>>>>>> +        ms->possible_cpus->cpus[n].props.has_socket_id = true;
>>>>>> +        ms->possible_cpus->cpus[n].props.socket_id =
>>>>>> +            n / (ms->smp.dies * ms->smp.clusters *
>>>>>> +                ms->smp.cores * ms->smp.threads);
>>>>>> +        if (mc->smp_props.dies_supported) {
>>>>>> +            ms->possible_cpus->cpus[n].props.has_die_id = true;
>>>>>> +            ms->possible_cpus->cpus[n].props.die_id =
>>>>>> +                n / (ms->smp.clusters * ms->smp.cores * ms->smp.threads);
>>>>>> +        }
>>>>> I still don't think we need to consider dies if it's certainly not
>>>>> supported yet, IOW, we will never come into the if-branch.
>>>>> We are populating arm-specific topo info instead of the generic,
>>>>> we can probably uniformly update this part together with other
>>>>> necessary places when we decide to support dies for arm virt
>>>>> machine in the future. :)
>>>> it seems we do support dies and they are supposed to be numa boundary too,
>>>> so perhaps we should account for it when generating node-id.
>>> Sorry, I actually meant that we currently don't support dies for arm, so
>>> that
>>> we will always have "mc->smp_props.dies_supported == False" here, which
>>> makes the code a bit unnecessary.  dies are only supported for x86 for
>>> now. :)
>>>
>> then perhaps add an assert() here, so that we would notice and fix this
>> place when dies_supported becomes true.
> A simple assert() works here, I think.
> 
> Thanks,
> Yanan
>>> Thanks,
>>> Yanan
>>>>>> +        ms->possible_cpus->cpus[n].props.has_core_id = true;
>>>>>> +        ms->possible_cpus->cpus[n].props.core_id = n / ms->smp.threads;
>>>>>>             ms->possible_cpus->cpus[n].props.has_thread_id = true;
>>>>>>             ms->possible_cpus->cpus[n].props.thread_id = n;
>>>>>>         }
>>>>> Maybe we should use the same algorithm in x86_topo_ids_from_idx
>>>>> to populate the IDs, so that scope of socket-id will be [0, total_sockets),
>>>>> scope of thread-id is [0, threads_per_core), and so on. Then with a
>>>>> group of socket/cluster/core/thread-id, we determine a CPU.
>>>>>
>>>>> Suggestion: For the long term, is it necessary now to add similar topo
>>>>> info infrastructure for ARM, such as X86CPUTopoInfo, X86CPUTopoIDs,
>>>>> x86_topo_ids_from_idx?
>>>>>
>>>>> Thanks,
>>>>> Yanan
>>>> .
>> .
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/3] hw/acpi/aml-build: Use existing CPU topology to build PPTT table
  2022-03-18 13:28     ` Igor Mammedov
@ 2022-03-23  3:31       ` Gavin Shan
  0 siblings, 0 replies; 15+ messages in thread
From: Gavin Shan @ 2022-03-23  3:31 UTC (permalink / raw)
  To: Igor Mammedov, wangyanan (Y)
  Cc: peter.maydell, drjones, richard.henderson, qemu-devel, zhenyzha,
	qemu-arm, shan.gavin

Hi Igor and Yanan,

On 3/18/22 9:28 PM, Igor Mammedov wrote:
> On Fri, 18 Mar 2022 14:34:12 +0800
> "wangyanan (Y)" <wangyanan55@huawei.com> wrote:
>> On 2022/3/3 11:11, Gavin Shan wrote:
>>> When the PPTT table is built, the CPU topology is re-calculated, but
>>> it's unnecessary because the CPU topology, except the cluster IDs,
>>> has been populated in virt_possible_cpu_arch_ids() on arm/virt machine.
>>>
>>> This avoids re-calculating the CPU topology by reusing the existing
>>> one in ms->possible_cpus. However, the cluster ID for the CPU instance
>>> has to be calculated dynamically because there is no corresponding
>>> field in struct CpuInstanceProperties. Currently, the only user of
>>> build_pptt() is arm/virt machine.
>>>
>>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>>> ---
>>>    hw/acpi/aml-build.c | 106 ++++++++++++++++++++++++++++++++++----------
>>>    1 file changed, 82 insertions(+), 24 deletions(-)
>>>
>>> diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
>>> index 8966e16320..572cf5fc00 100644
>>> --- a/hw/acpi/aml-build.c
>>> +++ b/hw/acpi/aml-build.c
>>> @@ -2002,18 +2002,27 @@ void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
>>>                    const char *oem_id, const char *oem_table_id)
>>>    {
>>>        MachineClass *mc = MACHINE_GET_CLASS(ms);
>>> +    CPUArchIdList *cpus = ms->possible_cpus;
>>> +    GQueue *socket_list = g_queue_new();
>>> +    GQueue *cluster_list = g_queue_new();
>>> +    GQueue *core_list = g_queue_new();
>>>        GQueue *list = g_queue_new();
>>>        guint pptt_start = table_data->len;
>>>        guint parent_offset;
>>>        guint length, i;
>>> -    int uid = 0;
>>> -    int socket;
>>> +    int n, id, socket_id, cluster_id, core_id, thread_id;
>>>        AcpiTable table = { .sig = "PPTT", .rev = 2,
>>>                            .oem_id = oem_id, .oem_table_id = oem_table_id };
>>>    
>>>        acpi_table_begin(&table, table_data);
>>>    
>>> -    for (socket = 0; socket < ms->smp.sockets; socket++) {
>>> +    for (n = 0; n < cpus->len; n++) {
>>> +        socket_id = cpus->cpus[n].props.socket_id;
>>> +        if (g_queue_find(socket_list, GUINT_TO_POINTER(socket_id))) {
>>> +            continue;
>>> +        }
>>> +
>>> +        g_queue_push_tail(socket_list, GUINT_TO_POINTER(socket_id));
>>>            g_queue_push_tail(list,
>>>                GUINT_TO_POINTER(table_data->len - pptt_start));
>>>            build_processor_hierarchy_node(
>>> @@ -2023,65 +2032,114 @@ void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
>>>                 * of a physical package
>>>                 */
>>>                (1 << 0),
>>> -            0, socket, NULL, 0);
>>> +            0, socket_id, NULL, 0);
>>>        }
>>>    
>>>        if (mc->smp_props.clusters_supported) {
>>>            length = g_queue_get_length(list);
>>>            for (i = 0; i < length; i++) {
>>> -            int cluster;
>>> -
>>>                parent_offset = GPOINTER_TO_UINT(g_queue_pop_head(list));
>>> -            for (cluster = 0; cluster < ms->smp.clusters; cluster++) {
>>> +            socket_id = GPOINTER_TO_UINT(g_queue_pop_head(socket_list));
>>> +
>>> +            for (n = 0; n < cpus->len; n++) {
>>> +                if (cpus->cpus[n].props.socket_id != socket_id) {
>>> +                    continue;
>>> +                }
>>> +
>>> +                /*
>>> +                 * We have to calculate the cluster ID because it isn't
>>> +                 * available in the CPU instance properties.
>>> +                 */
>> Since we need cluster ID now, maybe we can simply make it supported
>> in the CPU instance properties.
> 
> agreed
> 

Thanks for your review. I will add it in v3. FYI, the addition needs to
be done in PATCH[v3 01/04], where the CPU topology is populated. :)
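
For reference, a rough sketch of that population step (illustrative only;
it assumes a has_cluster_id/cluster_id pair is added to
CpuInstanceProperties, which is not part of this v2 series, and the exact
numbering may differ in v3):

    ms->possible_cpus->cpus[n].props.has_cluster_id = true;
    ms->possible_cpus->cpus[n].props.cluster_id =
        (n / (ms->smp.cores * ms->smp.threads)) % ms->smp.clusters;

build_pptt() could then read props.cluster_id directly instead of
re-deriving it from thread_id.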

Thanks,
Gavin

>>> +                cluster_id = cpus->cpus[n].props.thread_id /
>>> +                             (ms->smp.cores * ms->smp.threads);
>>> +                if (g_queue_find(cluster_list, GUINT_TO_POINTER(cluster_id))) {
>>> +                    continue;
>>> +                }
>>> +
>>> +                g_queue_push_tail(cluster_list, GUINT_TO_POINTER(cluster_id));
>>>                    g_queue_push_tail(list,
>>>                        GUINT_TO_POINTER(table_data->len - pptt_start));
>>>                    build_processor_hierarchy_node(
>>>                        table_data,
>>>                        (0 << 0), /* not a physical package */
>>> -                    parent_offset, cluster, NULL, 0);
>>> +                    parent_offset, cluster_id, NULL, 0);
>>>                }
>>>            }
>>>        }
>>>    
>>>        length = g_queue_get_length(list);
>>>        for (i = 0; i < length; i++) {
>>> -        int core;
>>> -
>>>            parent_offset = GPOINTER_TO_UINT(g_queue_pop_head(list));
>>> -        for (core = 0; core < ms->smp.cores; core++) {
>>> -            if (ms->smp.threads > 1) {
>>> -                g_queue_push_tail(list,
>>> -                    GUINT_TO_POINTER(table_data->len - pptt_start));
>>> -                build_processor_hierarchy_node(
>>> -                    table_data,
>>> -                    (0 << 0), /* not a physical package */
>>> -                    parent_offset, core, NULL, 0);
>>> -            } else {
>>> +        if (!mc->smp_props.clusters_supported) {
>>> +            socket_id = GPOINTER_TO_UINT(g_queue_pop_head(socket_list));
>>> +        } else {
>>> +            cluster_id = GPOINTER_TO_UINT(g_queue_pop_head(cluster_list));
>>> +        }
>>> +
>>> +        for (n = 0; n < cpus->len; n++) {
>>> +            if (!mc->smp_props.clusters_supported &&
>>> +                cpus->cpus[n].props.socket_id != socket_id) {
>>> +                continue;
>>> +            }
>>> +
>>> +            /*
>>> +             * We have to calculate the cluster ID because it isn't
>>> +             * available in the CPU instance properties.
>>> +             */
>>> +            id = cpus->cpus[n].props.thread_id /
>>> +                (ms->smp.cores * ms->smp.threads);
>>> +            if (mc->smp_props.clusters_supported && id != cluster_id) {
>>> +                continue;
>>> +            }
>>> +
>>> +            core_id = cpus->cpus[n].props.core_id;
>>> +            if (ms->smp.threads <= 1) {
>>>                    build_processor_hierarchy_node(
>>>                        table_data,
>>>                        (1 << 1) | /* ACPI Processor ID valid */
>>>                        (1 << 3),  /* Node is a Leaf */
>>> -                    parent_offset, uid++, NULL, 0);
>>> +                    parent_offset, core_id, NULL, 0);
>>> +                continue;
>>>                }
>>> +
>>> +            if (g_queue_find(core_list, GUINT_TO_POINTER(core_id))) {
>>> +                continue;
>>> +            }
>>> +
>>> +            g_queue_push_tail(core_list, GUINT_TO_POINTER(core_id));
>>> +            g_queue_push_tail(list,
>>> +                GUINT_TO_POINTER(table_data->len - pptt_start));
>>> +            build_processor_hierarchy_node(
>>> +                table_data,
>>> +                (0 << 0), /* not a physical package */
>>> +                parent_offset, core_id, NULL, 0);
>>>            }
>>>        }
>>>    
>>>        length = g_queue_get_length(list);
>>>        for (i = 0; i < length; i++) {
>>> -        int thread;
>>> -
>>>            parent_offset = GPOINTER_TO_UINT(g_queue_pop_head(list));
>>> -        for (thread = 0; thread < ms->smp.threads; thread++) {
>>> +        core_id = GPOINTER_TO_UINT(g_queue_pop_head(core_list));
>>> +
>>> +        for (n = 0; n < cpus->len; n++) {
>>> +            if (cpus->cpus[n].props.core_id != core_id) {
>>> +                continue;
>>> +            }
>>> +
>>> +            thread_id = cpus->cpus[n].props.thread_id;
>>>                build_processor_hierarchy_node(
>>>                    table_data,
>>>                    (1 << 1) | /* ACPI Processor ID valid */
>>>                    (1 << 2) | /* Processor is a Thread */
>>>                    (1 << 3),  /* Node is a Leaf */
>>> -                parent_offset, uid++, NULL, 0);
>>> +                parent_offset, thread_id, NULL, 0);
>>>            }
>>>        }
>>>    
>>>        g_queue_free(list);
>>> +    g_queue_free(core_list);
>>> +    g_queue_free(cluster_list);
>>> +    g_queue_free(socket_list);
>>>        acpi_table_end(linker, &table);
>>>    }
>>>      
>>
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2022-03-23  3:33 UTC | newest]

Thread overview: 15+ messages
2022-03-03  3:11 [PATCH v2 0/3] hw/arm/virt: Fix CPU's default NUMA node ID Gavin Shan
2022-03-03  3:11 ` [PATCH v2 1/3] " Gavin Shan
2022-03-18  6:23   ` wangyanan (Y) via
2022-03-18  9:56     ` Igor Mammedov
2022-03-18 13:00       ` wangyanan (Y) via
2022-03-18 13:27         ` Igor Mammedov
2022-03-21  2:28           ` wangyanan (Y) via
2022-03-23  3:26             ` Gavin Shan
2022-03-23  3:29             ` Gavin Shan
2022-03-03  3:11 ` [PATCH v2 2/3] hw/acpi/aml-build: Use existing CPU topology to build PPTT table Gavin Shan
2022-03-18  6:34   ` wangyanan (Y) via
2022-03-18 13:28     ` Igor Mammedov
2022-03-23  3:31       ` Gavin Shan
2022-03-03  3:11 ` [PATCH v2 3/3] hw/arm/virt: Unify ACPI processor ID in MADT and SRAT table Gavin Shan
2022-03-14  6:24 ` [PATCH v2 0/3] hw/arm/virt: Fix CPU's default NUMA node ID Gavin Shan
