qemu-devel.nongnu.org archive mirror
* [PATCH v4 0/3] hw/arm/virt: Fix CPU's default NUMA node ID
@ 2022-04-03 11:00 Gavin Shan
  2022-04-03 11:00 ` [PATCH v4 1/3] hw/arm/virt: Consider SMP configuration in CPU topology Gavin Shan
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Gavin Shan @ 2022-04-03 11:00 UTC (permalink / raw)
  To: qemu-arm
  Cc: peter.maydell, drjones, richard.henderson, qemu-devel, zhenyzha,
	wangyanan55, shan.gavin, imammedo

When the CPU-to-NUMA association isn't provided by the user, the default NUMA
node ID for a specific CPU is returned from virt_get_default_cpu_node_id().
Unfortunately, the default NUMA node ID breaks the socket boundary and leads to
broken CPU topology warning messages in the Linux guest. This series intends
to fix the issue.

  PATCH[1/3] Uses the SMP configuration to populate the CPU topology
  PATCH[2/3] Fixes the broken CPU topology by considering the socket boundary
             when the default NUMA node ID is given
  PATCH[3/3] Uses the populated CPU topology to build the PPTT table, instead of
             calculating it again

Changelog
=========
v4:
   * Apply '% ms->smp.{clusters, cores, threads}' as x86 does
     in virt_possible_cpu_arch_ids()                            (Igor)
   * s/within cluster/within cluster\/die/ for 'core-id' in
     qapi/machine.json                                          (Igor)
   * Use [0 - possible_cpus->len] as ACPI processor UID to
     build PPTT table and PATCH[v3 4/4] is dropped              (Igor)
v3:
   * Split PATCH[v2 1/3] to PATCH[v3 1/4] and PATCH[v3 2/4]     (Yanan)
   * Don't take account of die ID in CPU topology population
     and added assert(!mc->smp_props.dies_supported)            (Yanan/Igor)
   * Assign cluster_id and use it when building PPTT table      (Yanan/Igor)
v2:
   * Populate the CPU topology in virt_possible_cpu_arch_ids()
     so that it can be reused in virt_get_default_cpu_node_id() (Igor)
   * Added PATCH[2/3] to use the existing CPU topology when the
     PPTT table is built                                        (Igor)
   * Added PATCH[3/3] to take thread ID as ACPI processor ID
     in MADT and SRAT table                                     (Gavin)


Gavin Shan (3):
  hw/arm/virt: Consider SMP configuration in CPU topology
  hw/arm/virt: Fix CPU's default NUMA node ID
  hw/acpi/aml-build: Use existing CPU topology to build PPTT table

 hw/acpi/aml-build.c | 95 +++++++++++++++++++++++++++++++++------------
 hw/arm/virt.c       | 19 ++++++++-
 qapi/machine.json   |  6 ++-
 3 files changed, 92 insertions(+), 28 deletions(-)

-- 
2.23.0




* [PATCH v4 1/3] hw/arm/virt: Consider SMP configuration in CPU topology
  2022-04-03 11:00 [PATCH v4 0/3] hw/arm/virt: Fix CPU's default NUMA node ID Gavin Shan
@ 2022-04-03 11:00 ` Gavin Shan
  2022-04-03 11:00 ` [PATCH v4 2/3] hw/arm/virt: Fix CPU's default NUMA node ID Gavin Shan
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Gavin Shan @ 2022-04-03 11:00 UTC (permalink / raw)
  To: qemu-arm
  Cc: peter.maydell, drjones, richard.henderson, qemu-devel, zhenyzha,
	wangyanan55, shan.gavin, imammedo

Currently, the SMP configuration isn't considered when the CPU
topology is populated. In this case, it's impossible to provide
the default CPU-to-NUMA mapping or association based on the socket
ID of the given CPU.

This patch takes the SMP configuration into account when the CPU
topology is populated. The die ID for the given CPU isn't assigned
since it's not yet supported on the arm/virt machine. The cluster
ID for the given CPU is assigned, however, because it is already
supported on the arm/virt machine.
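
As an illustration (not part of the patch itself), assume the guest
is started with '-smp 16,sockets=2,clusters=2,cores=2,threads=2'.
The formulas in the hunk below would then derive the following IDs
for CPU#11:

  socket_id  = 11 / (2 * 2 * 2)   = 1
  cluster_id = (11 / (2 * 2)) % 2 = 0
  core_id    = (11 / 2) % 2       = 1
  thread_id  = 11 % 2             = 1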

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 hw/arm/virt.c     | 15 ++++++++++++++-
 qapi/machine.json |  6 ++++--
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index d2e5ecd234..f628e86f78 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2505,6 +2505,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
     int n;
     unsigned int max_cpus = ms->smp.max_cpus;
     VirtMachineState *vms = VIRT_MACHINE(ms);
+    MachineClass *mc = MACHINE_GET_CLASS(vms);
 
     if (ms->possible_cpus) {
         assert(ms->possible_cpus->len == max_cpus);
@@ -2518,8 +2519,20 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
         ms->possible_cpus->cpus[n].type = ms->cpu_type;
         ms->possible_cpus->cpus[n].arch_id =
             virt_cpu_mp_affinity(vms, n);
+
+        assert(!mc->smp_props.dies_supported);
+        ms->possible_cpus->cpus[n].props.has_socket_id = true;
+        ms->possible_cpus->cpus[n].props.socket_id =
+            n / (ms->smp.clusters * ms->smp.cores * ms->smp.threads);
+        ms->possible_cpus->cpus[n].props.has_cluster_id = true;
+        ms->possible_cpus->cpus[n].props.cluster_id =
+            (n / (ms->smp.cores * ms->smp.threads)) % ms->smp.clusters;
+        ms->possible_cpus->cpus[n].props.has_core_id = true;
+        ms->possible_cpus->cpus[n].props.core_id =
+            (n / ms->smp.threads) % ms->smp.cores;
         ms->possible_cpus->cpus[n].props.has_thread_id = true;
-        ms->possible_cpus->cpus[n].props.thread_id = n;
+        ms->possible_cpus->cpus[n].props.thread_id =
+            n % ms->smp.threads;
     }
     return ms->possible_cpus;
 }
diff --git a/qapi/machine.json b/qapi/machine.json
index 9c460ec450..ea22b574b0 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -868,10 +868,11 @@
 # @node-id: NUMA node ID the CPU belongs to
 # @socket-id: socket number within node/board the CPU belongs to
 # @die-id: die number within socket the CPU belongs to (since 4.1)
-# @core-id: core number within die the CPU belongs to
+# @cluster-id: cluster number within die the CPU belongs to
+# @core-id: core number within cluster/die the CPU belongs to
 # @thread-id: thread number within core the CPU belongs to
 #
-# Note: currently there are 5 properties that could be present
+# Note: currently there are 6 properties that could be present
 #       but management should be prepared to pass through other
 #       properties with device_add command to allow for future
 #       interface extension. This also requires the filed names to be kept in
@@ -883,6 +884,7 @@
   'data': { '*node-id': 'int',
             '*socket-id': 'int',
             '*die-id': 'int',
+            '*cluster-id': 'int',
             '*core-id': 'int',
             '*thread-id': 'int'
   }
-- 
2.23.0




* [PATCH v4 2/3] hw/arm/virt: Fix CPU's default NUMA node ID
  2022-04-03 11:00 [PATCH v4 0/3] hw/arm/virt: Fix CPU's default NUMA node ID Gavin Shan
  2022-04-03 11:00 ` [PATCH v4 1/3] hw/arm/virt: Consider SMP configuration in CPU topology Gavin Shan
@ 2022-04-03 11:00 ` Gavin Shan
  2022-04-03 11:00 ` [PATCH v4 3/3] hw/acpi/aml-build: Use existing CPU topology to build PPTT table Gavin Shan
  2022-04-03 11:58 ` [PATCH v4 0/3] hw/arm/virt: Fix CPU's default NUMA node ID Gavin Shan
  3 siblings, 0 replies; 5+ messages in thread
From: Gavin Shan @ 2022-04-03 11:00 UTC (permalink / raw)
  To: qemu-arm
  Cc: peter.maydell, drjones, richard.henderson, qemu-devel, zhenyzha,
	wangyanan55, shan.gavin, imammedo

When the CPU-to-NUMA association isn't explicitly provided by users,
the default one is given by mc->get_default_cpu_node_id(). However,
the CPU topology isn't fully considered in the default association,
and this causes CPU topology broken warnings when booting the Linux
guest.

For example, the following warning messages are observed when the
Linux guest is booted with the command line below:

  /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
  -accel kvm -machine virt,gic-version=host               \
  -cpu host                                               \
  -smp 6,sockets=2,cores=3,threads=1                      \
  -m 1024M,slots=16,maxmem=64G                            \
  -object memory-backend-ram,id=mem0,size=128M            \
  -object memory-backend-ram,id=mem1,size=128M            \
  -object memory-backend-ram,id=mem2,size=128M            \
  -object memory-backend-ram,id=mem3,size=128M            \
  -object memory-backend-ram,id=mem4,size=128M            \
  -object memory-backend-ram,id=mem5,size=384M            \
  -numa node,nodeid=0,memdev=mem0                         \
  -numa node,nodeid=1,memdev=mem1                         \
  -numa node,nodeid=2,memdev=mem2                         \
  -numa node,nodeid=3,memdev=mem3                         \
  -numa node,nodeid=4,memdev=mem4                         \
  -numa node,nodeid=5,memdev=mem5
         :
  alternatives: patching kernel code
  BUG: arch topology borken
  the CLS domain not a subset of the MC domain
  <the above error log repeats>
  BUG: arch topology borken
  the DIE domain not a subset of the NODE domain

With the current implementation of mc->get_default_cpu_node_id(),
CPU#0 to CPU#5 are associated with NODE#0 to NODE#5 respectively.
That's incorrect because CPU#0/1/2 should be associated with the
same NUMA node, since they sit in the same socket.

This patch fixes the issue by considering the socket ID, which is
populated in virt_possible_cpu_arch_ids(), when the default
CPU-to-NUMA association is given. With this applied, no more CPU
topology broken warnings are seen from the Linux guest. The 6 CPUs
are associated with NODE#0/1, and no CPUs are associated with
NODE#2/3/4/5.
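
A minimal sketch of the resulting association (illustrative, for
the command line above, where 'clusters' defaults to 1):

  socket_id = cpu_index / (clusters * cores * threads) = cpu_index / 3
  node_id   = socket_id % num_nodes

  CPU#0/1/2: socket_id = 0 -> NODE#0
  CPU#3/4/5: socket_id = 1 -> NODE#1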

Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
---
 hw/arm/virt.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index f628e86f78..558bd59e8b 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2497,7 +2497,9 @@ virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
 
 static int64_t virt_get_default_cpu_node_id(const MachineState *ms, int idx)
 {
-    return idx % ms->numa_state->num_nodes;
+    int64_t socket_id = ms->possible_cpus->cpus[idx].props.socket_id;
+
+    return socket_id % ms->numa_state->num_nodes;
 }
 
 static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
-- 
2.23.0




* [PATCH v4 3/3] hw/acpi/aml-build: Use existing CPU topology to build PPTT table
  2022-04-03 11:00 [PATCH v4 0/3] hw/arm/virt: Fix CPU's default NUMA node ID Gavin Shan
  2022-04-03 11:00 ` [PATCH v4 1/3] hw/arm/virt: Consider SMP configuration in CPU topology Gavin Shan
  2022-04-03 11:00 ` [PATCH v4 2/3] hw/arm/virt: Fix CPU's default NUMA node ID Gavin Shan
@ 2022-04-03 11:00 ` Gavin Shan
  2022-04-03 11:58 ` [PATCH v4 0/3] hw/arm/virt: Fix CPU's default NUMA node ID Gavin Shan
  3 siblings, 0 replies; 5+ messages in thread
From: Gavin Shan @ 2022-04-03 11:00 UTC (permalink / raw)
  To: qemu-arm
  Cc: peter.maydell, drjones, richard.henderson, qemu-devel, zhenyzha,
	wangyanan55, shan.gavin, imammedo

When the PPTT table is built, the CPU topology is re-calculated, which
is unnecessary because the CPU topology has already been populated in
virt_possible_cpu_arch_ids() on the arm/virt machine.

This patch avoids re-calculating the CPU topology by reusing the
existing one in ms->possible_cpus. Currently, the only user of
build_pptt() is the arm/virt machine.
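
A minimal sketch of the resulting hierarchy (illustrative, assuming
'-smp 4,sockets=1,clusters=2,cores=2,threads=1'; the index into
ms->possible_cpus serves as the ACPI processor UID of each leaf):

  Socket#0 (physical package)
    Cluster#0
      leaf core, UID 0
      leaf core, UID 1
    Cluster#1
      leaf core, UID 2
      leaf core, UID 3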

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 hw/acpi/aml-build.c | 95 +++++++++++++++++++++++++++++++++------------
 1 file changed, 71 insertions(+), 24 deletions(-)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 4086879ebf..8f02d77375 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -2002,18 +2002,27 @@ void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
                 const char *oem_id, const char *oem_table_id)
 {
     MachineClass *mc = MACHINE_GET_CLASS(ms);
+    CPUArchIdList *cpus = ms->possible_cpus;
+    GQueue *socket_list = g_queue_new();
+    GQueue *cluster_list = g_queue_new();
+    GQueue *core_list = g_queue_new();
     GQueue *list = g_queue_new();
     guint pptt_start = table_data->len;
     guint parent_offset;
     guint length, i;
-    int uid = 0;
-    int socket;
+    int n, socket_id, cluster_id, core_id;
     AcpiTable table = { .sig = "PPTT", .rev = 2,
                         .oem_id = oem_id, .oem_table_id = oem_table_id };
 
     acpi_table_begin(&table, table_data);
 
-    for (socket = 0; socket < ms->smp.sockets; socket++) {
+    for (n = 0; n < cpus->len; n++) {
+        socket_id = cpus->cpus[n].props.socket_id;
+        if (g_queue_find(socket_list, GUINT_TO_POINTER(socket_id))) {
+            continue;
+        }
+
+        g_queue_push_tail(socket_list, GUINT_TO_POINTER(socket_id));
         g_queue_push_tail(list,
             GUINT_TO_POINTER(table_data->len - pptt_start));
         build_processor_hierarchy_node(
@@ -2023,65 +2032,103 @@ void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
              * of a physical package
              */
             (1 << 0),
-            0, socket, NULL, 0);
+            0, socket_id, NULL, 0);
     }
 
     if (mc->smp_props.clusters_supported) {
         length = g_queue_get_length(list);
         for (i = 0; i < length; i++) {
-            int cluster;
-
             parent_offset = GPOINTER_TO_UINT(g_queue_pop_head(list));
-            for (cluster = 0; cluster < ms->smp.clusters; cluster++) {
+            socket_id = GPOINTER_TO_UINT(g_queue_pop_head(socket_list));
+
+            for (n = 0; n < cpus->len; n++) {
+                if (cpus->cpus[n].props.socket_id != socket_id) {
+                    continue;
+                }
+
+                cluster_id = cpus->cpus[n].props.cluster_id;
+                if (g_queue_find(cluster_list, GUINT_TO_POINTER(cluster_id))) {
+                    continue;
+                }
+
+                g_queue_push_tail(cluster_list, GUINT_TO_POINTER(cluster_id));
                 g_queue_push_tail(list,
                     GUINT_TO_POINTER(table_data->len - pptt_start));
                 build_processor_hierarchy_node(
                     table_data,
                     (0 << 0), /* not a physical package */
-                    parent_offset, cluster, NULL, 0);
+                    parent_offset, cluster_id, NULL, 0);
             }
         }
     }
 
     length = g_queue_get_length(list);
     for (i = 0; i < length; i++) {
-        int core;
-
         parent_offset = GPOINTER_TO_UINT(g_queue_pop_head(list));
-        for (core = 0; core < ms->smp.cores; core++) {
-            if (ms->smp.threads > 1) {
-                g_queue_push_tail(list,
-                    GUINT_TO_POINTER(table_data->len - pptt_start));
-                build_processor_hierarchy_node(
-                    table_data,
-                    (0 << 0), /* not a physical package */
-                    parent_offset, core, NULL, 0);
-            } else {
+        if (!mc->smp_props.clusters_supported) {
+            socket_id = GPOINTER_TO_UINT(g_queue_pop_head(socket_list));
+        } else {
+            cluster_id = GPOINTER_TO_UINT(g_queue_pop_head(cluster_list));
+        }
+
+        for (n = 0; n < cpus->len; n++) {
+            if (!mc->smp_props.clusters_supported &&
+                cpus->cpus[n].props.socket_id != socket_id) {
+                continue;
+            }
+
+            if (mc->smp_props.clusters_supported &&
+                cpus->cpus[n].props.cluster_id != cluster_id) {
+                continue;
+            }
+
+            if (ms->smp.threads <= 1) {
                 build_processor_hierarchy_node(
                     table_data,
                     (1 << 1) | /* ACPI Processor ID valid */
                     (1 << 3),  /* Node is a Leaf */
-                    parent_offset, uid++, NULL, 0);
+                    parent_offset, n, NULL, 0);
+                continue;
+            }
+
+            core_id = cpus->cpus[n].props.core_id;
+            if (g_queue_find(core_list, GUINT_TO_POINTER(core_id))) {
+                continue;
             }
+
+            g_queue_push_tail(core_list, GUINT_TO_POINTER(core_id));
+            g_queue_push_tail(list,
+                GUINT_TO_POINTER(table_data->len - pptt_start));
+            build_processor_hierarchy_node(
+                table_data,
+                (0 << 0), /* not a physical package */
+                parent_offset, core_id, NULL, 0);
         }
     }
 
     length = g_queue_get_length(list);
     for (i = 0; i < length; i++) {
-        int thread;
-
         parent_offset = GPOINTER_TO_UINT(g_queue_pop_head(list));
-        for (thread = 0; thread < ms->smp.threads; thread++) {
+        core_id = GPOINTER_TO_UINT(g_queue_pop_head(core_list));
+
+        for (n = 0; n < cpus->len; n++) {
+            if (cpus->cpus[n].props.core_id != core_id) {
+                continue;
+            }
+
             build_processor_hierarchy_node(
                 table_data,
                 (1 << 1) | /* ACPI Processor ID valid */
                 (1 << 2) | /* Processor is a Thread */
                 (1 << 3),  /* Node is a Leaf */
-                parent_offset, uid++, NULL, 0);
+                parent_offset, n, NULL, 0);
         }
     }
 
     g_queue_free(list);
+    g_queue_free(core_list);
+    g_queue_free(cluster_list);
+    g_queue_free(socket_list);
     acpi_table_end(linker, &table);
 }
 
-- 
2.23.0




* Re: [PATCH v4 0/3] hw/arm/virt: Fix CPU's default NUMA node ID
  2022-04-03 11:00 [PATCH v4 0/3] hw/arm/virt: Fix CPU's default NUMA node ID Gavin Shan
                   ` (2 preceding siblings ...)
  2022-04-03 11:00 ` [PATCH v4 3/3] hw/acpi/aml-build: Use existing CPU topology to build PPTT table Gavin Shan
@ 2022-04-03 11:58 ` Gavin Shan
  3 siblings, 0 replies; 5+ messages in thread
From: Gavin Shan @ 2022-04-03 11:58 UTC (permalink / raw)
  To: qemu-arm
  Cc: peter.maydell, drjones, richard.henderson, qemu-devel, zhenyzha,
	wangyanan55, shan.gavin, imammedo

Hi Igor and Yanan,

On 4/3/22 7:00 PM, Gavin Shan wrote:
> When the CPU-to-NUMA association isn't provided by the user, the default NUMA
> node ID for a specific CPU is returned from virt_get_default_cpu_node_id().
> Unfortunately, the default NUMA node ID breaks the socket boundary and leads to
> broken CPU topology warning messages in the Linux guest. This series intends
> to fix the issue.
> 
>    PATCH[1/3] Uses the SMP configuration to populate the CPU topology
>    PATCH[2/3] Fixes the broken CPU topology by considering the socket boundary
>               when the default NUMA node ID is given
>    PATCH[3/3] Uses the populated CPU topology to build the PPTT table, instead of
>               calculating it again
> 

Please ignore this series (v4) and review v5 directly. I missed some of your
comments when v4 was posted; they will be addressed in v5.

> Changelog
> =========
> v4:
>     * Apply '% ms->smp.{clusters, cores, threads}' as x86 does
>       in virt_possible_cpu_arch_ids()                            (Igor)
>     * s/within cluster/within cluster\/die/ for 'core-id' in
>       qapi/machine.json                                          (Igor)
>     * Use [0 - possible_cpus->len] as ACPI processor UID to
>       build PPTT table and PATCH[v3 4/4] is dropped              (Igor)
> v3:
>     * Split PATCH[v2 1/3] to PATCH[v3 1/4] and PATCH[v3 2/4]     (Yanan)
>     * Don't take account of die ID in CPU topology population
>       and added assert(!mc->smp_props.dies_supported)            (Yanan/Igor)
>     * Assign cluster_id and use it when building PPTT table      (Yanan/Igor)
> v2:
>     * Populate the CPU topology in virt_possible_cpu_arch_ids()
>       so that it can be reused in virt_get_default_cpu_node_id() (Igor)
>     * Added PATCH[2/3] to use the existing CPU topology when the
>       PPTT table is built                                        (Igor)
>     * Added PATCH[3/3] to take thread ID as ACPI processor ID
>       in MADT and SRAT table                                     (Gavin)
> 
> 

Thanks,
Gavin

> Gavin Shan (3):
>    hw/arm/virt: Consider SMP configuration in CPU topology
>    hw/arm/virt: Fix CPU's default NUMA node ID
>    hw/acpi/aml-build: Use existing CPU topology to build PPTT table
> 
>   hw/acpi/aml-build.c | 95 +++++++++++++++++++++++++++++++++------------
>   hw/arm/virt.c       | 19 ++++++++-
>   qapi/machine.json   |  6 ++-
>   3 files changed, 92 insertions(+), 28 deletions(-)
> 




