* [PATCH v2 0/3] Fix a couple of issues with AMD topology
@ 2020-07-01 17:30 Babu Moger
  2020-07-01 17:31 ` [PATCH v2 1/3] hw/i386: Initialize topo_ids from CpuInstanceProperties Babu Moger
                   ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Babu Moger @ 2020-07-01 17:30 UTC (permalink / raw)
  To: pbonzini, rth, ehabkost, imammedo; +Cc: qemu-devel

This series fixes a couple of issues in the recent topology-related code:
1. Maintain consistency while building the topology: use the NUMA
   information passed from the user to build the apic_id.
2. Fix uninitialized memory with -device and CPU hotplug
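As a rough illustration of the consistency problem in (1), using made-up numbers rather than QEMU's actual EPYC encoding: if the node component of the APIC ID is derived from cpu_index alone, it can disagree with the node the user assigned via -numa.

```c
#include <assert.h>

/* Hypothetical sketch, not QEMU code: derive a node id from cpu_index
 * alone by striping CPUs evenly across the nodes of a package. */
static unsigned node_from_index(unsigned cpu_index, unsigned cpus_per_node,
                                unsigned nodes_per_pkg)
{
    return (cpu_index / cpus_per_node) % nodes_per_pkg;
}
```

With 2 CPUs per node and 2 nodes per package, cpu_index 3 lands on node 1; if the user's -numa mapping put that CPU on node 0, the generated APIC ID and the NUMA layout disagree. Building the APIC ID from the user-provided node_id (patch 2) removes that mismatch.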

Here are the discussion threads:
https://lore.kernel.org/qemu-devel/159164739269.20543.3074052993891532749.stgit@naples-babu.amd.com/
https://lore.kernel.org/qemu-devel/20200602175212.GH577771@habkost.net/

Fixes:
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1828750
---

v2:
 - Used the NUMA information from CpuInstanceProperties for building
   the apic_id, as suggested by Igor.
 - Also did some minor code rearrangement to accommodate the changes.
 - Dropped the patch "Simplify CPUID_8000_001E" from v1. Will send
   it later.

v1:
 https://lore.kernel.org/qemu-devel/159164739269.20543.3074052993891532749.stgit@naples-babu.amd.com
   
Babu Moger (3):
      hw/i386: Initialize topo_ids from CpuInstanceProperties
      hw/i386: Build apic_id from CpuInstanceProperties
      hw/i386: Fix uninitialized memory with -device and CPU hotplug


 hw/i386/pc.c               |   16 +++++++++++++++-
 hw/i386/x86.c              |   19 +++++++++++++------
 include/hw/i386/topology.h |   33 ++++++++++++++++++++++++++++++---
 include/hw/i386/x86.h      |    6 ++++--
 tests/test-x86-cpuid.c     |   39 ++++++++++++++++++++-------------------
 5 files changed, 82 insertions(+), 31 deletions(-)

--
Signature


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v2 1/3] hw/i386: Initialize topo_ids from CpuInstanceProperties
  2020-07-01 17:30 [PATCH v2 0/3] Fix a couple of issues with AMD topology Babu Moger
@ 2020-07-01 17:31 ` Babu Moger
  2020-07-13  9:08   ` Igor Mammedov
  2020-07-01 17:31 ` [PATCH v2 2/3] hw/i386: Build apic_id " Babu Moger
  2020-07-01 17:31 ` [PATCH v2 3/3] hw/i386: Fix uninitialized memory with -device and CPU hotplug Babu Moger
  2 siblings, 1 reply; 23+ messages in thread
From: Babu Moger @ 2020-07-01 17:31 UTC (permalink / raw)
  To: pbonzini, rth, ehabkost, imammedo; +Cc: qemu-devel

This is in preparation for building the apic_id from user-provided
topology information.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 include/hw/i386/topology.h |   19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index 07239f95f4..7cb21e9c82 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -40,6 +40,7 @@
 
 
 #include "qemu/bitops.h"
+#include "qapi/qapi-types-machine.h"
 
 /* APIC IDs can be 32-bit, but beware: APIC IDs > 255 require x2APIC support
  */
@@ -196,6 +197,24 @@ static inline void x86_topo_ids_from_apicid_epyc(apic_id_t apicid,
     topo_ids->pkg_id = apicid >> apicid_pkg_offset_epyc(topo_info);
 }
 
+
+/*
+ * Initialize topo_ids from CpuInstanceProperties
+ * node_id in CpuInstanceProperties(or in CPU device) is a sequential
+ * number, but while building the topology we need to separate it for
+ * each socket(mod nodes_per_pkg).
+ */
+static inline void x86_init_topo_ids(X86CPUTopoInfo *topo_info,
+                                     CpuInstanceProperties props,
+                                     X86CPUTopoIDs *topo_ids)
+{
+    topo_ids->smt_id = props.has_thread_id ? props.thread_id : 0;
+    topo_ids->core_id = props.has_core_id ? props.core_id : 0;
+    topo_ids->die_id = props.has_die_id ? props.die_id : 0;
+    topo_ids->node_id = props.has_node_id ?
+                        props.node_id % MAX(topo_info->nodes_per_pkg, 1) : 0;
+    topo_ids->pkg_id = props.has_socket_id ? props.socket_id : 0;
+}
 /*
  * Make APIC ID for the CPU 'cpu_index'
  *
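The mod-nodes_per_pkg reduction above can be sketched as a standalone helper (with simplified stand-ins for the QEMU structs; the MAX(..., 1) guard keeps the modulo well-defined when nodes_per_pkg is 0, i.e. no NUMA nodes configured):

```c
#include <assert.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Reduced stand-ins for X86CPUTopoInfo / CpuInstanceProperties,
 * for illustration only. */
typedef struct { unsigned nodes_per_pkg; } TopoInfo;
typedef struct { int has_node_id; unsigned node_id; } Props;

/* Mirrors the patch's node_id handling: the sequential, machine-wide
 * node_id is reduced to a per-socket node index. */
static unsigned per_socket_node_id(const TopoInfo *info, Props props)
{
    if (!props.has_node_id) {
        return 0;
    }
    return props.node_id % MAX(info->nodes_per_pkg, 1);
}
```

For example, with 4 nodes per package the machine-wide node 6 becomes node 2 within its socket; with nodes_per_pkg of 0 the guard turns the modulo into % 1, yielding 0.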




* [PATCH v2 2/3] hw/i386: Build apic_id from CpuInstanceProperties
  2020-07-01 17:30 [PATCH v2 0/3] Fix a couple of issues with AMD topology Babu Moger
  2020-07-01 17:31 ` [PATCH v2 1/3] hw/i386: Initialize topo_ids from CpuInstanceProperties Babu Moger
@ 2020-07-01 17:31 ` Babu Moger
  2020-07-13  9:15   ` Igor Mammedov
  2020-07-01 17:31 ` [PATCH v2 3/3] hw/i386: Fix uninitialized memory with -device and CPU hotplug Babu Moger
  2 siblings, 1 reply; 23+ messages in thread
From: Babu Moger @ 2020-07-01 17:31 UTC (permalink / raw)
  To: pbonzini, rth, ehabkost, imammedo; +Cc: qemu-devel

Build apic_id from CpuInstanceProperties when NUMA is configured.
Use the node_id from the user-provided NUMA information. This
avoids conflicts between the NUMA information and the generated
apic_id.

Re-arranged the code a little to make sure CpuInstanceProperties
is initialized before calling.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 hw/i386/pc.c               |    6 +++++-
 hw/i386/x86.c              |   19 +++++++++++++------
 include/hw/i386/topology.h |   14 +++++++++++---
 include/hw/i386/x86.h      |    6 ++++--
 tests/test-x86-cpuid.c     |   39 ++++++++++++++++++++-------------------
 5 files changed, 53 insertions(+), 31 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index d103b8c0ab..e613b2299f 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -800,13 +800,17 @@ void pc_smp_parse(MachineState *ms, QemuOpts *opts)
 void pc_hot_add_cpu(MachineState *ms, const int64_t id, Error **errp)
 {
     X86MachineState *x86ms = X86_MACHINE(ms);
-    int64_t apic_id = x86_cpu_apic_id_from_index(x86ms, id);
+    CpuInstanceProperties props;
+    int64_t apic_id;
     Error *local_err = NULL;
 
     if (id < 0) {
         error_setg(errp, "Invalid CPU id: %" PRIi64, id);
         return;
     }
+    props = ms->possible_cpus->cpus[id].props;
+
+    apic_id = x86_cpu_apic_id_from_index(x86ms, id, props);
 
     if (apic_id >= ACPI_CPU_HOTPLUG_ID_LIMIT) {
         error_setg(errp, "Unable to add CPU: %" PRIi64
diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 34229b45c7..7554416ae0 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -93,7 +93,8 @@ static void x86_set_epyc_topo_handlers(MachineState *machine)
  * all CPUs up to max_cpus.
  */
 uint32_t x86_cpu_apic_id_from_index(X86MachineState *x86ms,
-                                    unsigned int cpu_index)
+                                    unsigned int cpu_index,
+                                    CpuInstanceProperties props)
 {
     X86MachineClass *x86mc = X86_MACHINE_GET_CLASS(x86ms);
     X86CPUTopoInfo topo_info;
@@ -102,7 +103,7 @@ uint32_t x86_cpu_apic_id_from_index(X86MachineState *x86ms,
 
     init_topo_info(&topo_info, x86ms);
 
-    correct_id = x86ms->apicid_from_cpu_idx(&topo_info, cpu_index);
+    correct_id = x86ms->apicid_from_cpu_idx(&topo_info, cpu_index, props);
     if (x86mc->compat_apic_id_mode) {
         if (cpu_index != correct_id && !warned && !qtest_enabled()) {
             error_report("APIC IDs set in compatibility mode, "
@@ -136,6 +137,8 @@ void x86_cpus_init(X86MachineState *x86ms, int default_cpu_version)
     const CPUArchIdList *possible_cpus;
     MachineState *ms = MACHINE(x86ms);
     MachineClass *mc = MACHINE_GET_CLASS(x86ms);
+    CpuInstanceProperties props;
+
 
     /* Check for apicid encoding */
     if (cpu_x86_use_epyc_apic_id_encoding(ms->cpu_type)) {
@@ -144,6 +147,8 @@ void x86_cpus_init(X86MachineState *x86ms, int default_cpu_version)
 
     x86_cpu_set_default_version(default_cpu_version);
 
+    possible_cpus = mc->possible_cpu_arch_ids(ms);
+
     /*
      * Calculates the limit to CPU APIC ID values
      *
@@ -152,13 +157,15 @@ void x86_cpus_init(X86MachineState *x86ms, int default_cpu_version)
      *
      * This is used for FW_CFG_MAX_CPUS. See comments on fw_cfg_arch_create().
      */
-    x86ms->apic_id_limit = x86_cpu_apic_id_from_index(x86ms,
-                                                      ms->smp.max_cpus - 1) + 1;
-    possible_cpus = mc->possible_cpu_arch_ids(ms);
+    props = ms->possible_cpus->cpus[ms->smp.max_cpus - 1].props;
 
+    x86ms->apic_id_limit = x86_cpu_apic_id_from_index(x86ms,
+                                                      ms->smp.max_cpus - 1,
+                                                      props) + 1;
     for (i = 0; i < ms->possible_cpus->len; i++) {
+        props = ms->possible_cpus->cpus[i].props;
         ms->possible_cpus->cpus[i].arch_id =
-            x86_cpu_apic_id_from_index(x86ms, i);
+            x86_cpu_apic_id_from_index(x86ms, i, props);
     }
 
     for (i = 0; i < ms->smp.cpus; i++) {
diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index 7cb21e9c82..a800fc905f 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -221,10 +221,17 @@ static inline void x86_init_topo_ids(X86CPUTopoInfo *topo_info,
  * 'cpu_index' is a sequential, contiguous ID for the CPU.
  */
 static inline apic_id_t x86_apicid_from_cpu_idx_epyc(X86CPUTopoInfo *topo_info,
-                                                     unsigned cpu_index)
+                                                     unsigned cpu_index,
+                                                     CpuInstanceProperties props)
 {
     X86CPUTopoIDs topo_ids;
-    x86_topo_ids_from_idx_epyc(topo_info, cpu_index, &topo_ids);
+
+    if (props.has_node_id) {
+        x86_init_topo_ids(topo_info, props, &topo_ids);
+    } else {
+        x86_topo_ids_from_idx_epyc(topo_info, cpu_index, &topo_ids);
+    }
+
     return x86_apicid_from_topo_ids_epyc(topo_info, &topo_ids);
 }
 /* Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
@@ -280,7 +287,8 @@ static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
  * 'cpu_index' is a sequential, contiguous ID for the CPU.
  */
 static inline apic_id_t x86_apicid_from_cpu_idx(X86CPUTopoInfo *topo_info,
-                                                unsigned cpu_index)
+                                                unsigned cpu_index,
+                                                CpuInstanceProperties props)
 {
     X86CPUTopoIDs topo_ids;
     x86_topo_ids_from_idx(topo_info, cpu_index, &topo_ids);
diff --git a/include/hw/i386/x86.h b/include/hw/i386/x86.h
index b79f24e285..3109f39554 100644
--- a/include/hw/i386/x86.h
+++ b/include/hw/i386/x86.h
@@ -65,7 +65,8 @@ typedef struct {
 
     /* Apic id specific handlers */
     uint32_t (*apicid_from_cpu_idx)(X86CPUTopoInfo *topo_info,
-                                    unsigned cpu_index);
+                                    unsigned cpu_index,
+                                    CpuInstanceProperties props);
     void (*topo_ids_from_apicid)(apic_id_t apicid, X86CPUTopoInfo *topo_info,
                                  X86CPUTopoIDs *topo_ids);
     apic_id_t (*apicid_from_topo_ids)(X86CPUTopoInfo *topo_info,
@@ -93,7 +94,8 @@ typedef struct {
 void init_topo_info(X86CPUTopoInfo *topo_info, const X86MachineState *x86ms);
 
 uint32_t x86_cpu_apic_id_from_index(X86MachineState *pcms,
-                                    unsigned int cpu_index);
+                                    unsigned int cpu_index,
+                                    CpuInstanceProperties props);
 
 void x86_cpu_new(X86MachineState *pcms, int64_t apic_id, Error **errp);
 void x86_cpus_init(X86MachineState *pcms, int default_cpu_version);
diff --git a/tests/test-x86-cpuid.c b/tests/test-x86-cpuid.c
index 049030a50e..a1308e214b 100644
--- a/tests/test-x86-cpuid.c
+++ b/tests/test-x86-cpuid.c
@@ -29,6 +29,7 @@
 static void test_topo_bits(void)
 {
     X86CPUTopoInfo topo_info = {0};
+    CpuInstanceProperties props = {0};
 
     /* simple tests for 1 thread per core, 1 core per die, 1 die per package */
     topo_info = (X86CPUTopoInfo) {0, 1, 1, 1};
@@ -37,10 +38,10 @@ static void test_topo_bits(void)
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
 
     topo_info = (X86CPUTopoInfo) {0, 1, 1, 1};
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 3), ==, 3);
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0, props), ==, 0);
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1, props), ==, 1);
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2, props), ==, 2);
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 3, props), ==, 3);
 
 
     /* Test field width calculation for multiple values
@@ -92,38 +93,38 @@ static void test_topo_bits(void)
     g_assert_cmpuint(apicid_pkg_offset(&topo_info), ==, 5);
 
     topo_info = (X86CPUTopoInfo) {0, 1, 6, 3};
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0, props), ==, 0);
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1, props), ==, 1);
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2, props), ==, 2);
 
     topo_info = (X86CPUTopoInfo) {0, 1, 6, 3};
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 0), ==,
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 0, props), ==,
                      (1 << 2) | 0);
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 1), ==,
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 1, props), ==,
                      (1 << 2) | 1);
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 2), ==,
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 2, props), ==,
                      (1 << 2) | 2);
 
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2 * 3 + 0), ==,
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2 * 3 + 0, props), ==,
                      (2 << 2) | 0);
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2 * 3 + 1), ==,
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2 * 3 + 1, props), ==,
                      (2 << 2) | 1);
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2 * 3 + 2), ==,
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2 * 3 + 2, props), ==,
                      (2 << 2) | 2);
 
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 5 * 3 + 0), ==,
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 5 * 3 + 0, props), ==,
                      (5 << 2) | 0);
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 5 * 3 + 1), ==,
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 5 * 3 + 1, props), ==,
                      (5 << 2) | 1);
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 5 * 3 + 2), ==,
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 5 * 3 + 2, props), ==,
                      (5 << 2) | 2);
 
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info,
-                     1 * 6 * 3 + 0 * 3 + 0), ==, (1 << 5));
+                     1 * 6 * 3 + 0 * 3 + 0, props), ==, (1 << 5));
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info,
-                     1 * 6 * 3 + 1 * 3 + 1), ==, (1 << 5) | (1 << 2) | 1);
+                     1 * 6 * 3 + 1 * 3 + 1, props), ==, (1 << 5) | (1 << 2) | 1);
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info,
-                     3 * 6 * 3 + 5 * 3 + 2), ==, (3 << 5) | (5 << 2) | 2);
+                     3 * 6 * 3 + 5 * 3 + 2, props), ==, (3 << 5) | (5 << 2) | 2);
 }
 
 int main(int argc, char **argv)
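The expected values in test-x86-cpuid.c follow from the bit layout for 1 die, 6 cores per die and 3 threads per core: the SMT field needs 2 bits to hold 3 threads, so the core field starts at bit 2, and 6 cores need 3 bits, so the package field starts at bit 5 (matching the asserted apicid_pkg_offset of 5 above). A minimal sketch of that packing, with the offsets hard-coded for this 6x3 case only, not a general implementation:

```c
#include <assert.h>
#include <stdint.h>

/* Decompose a sequential cpu_index into (pkg, core, smt) and pack the
 * APIC ID with the offsets for 6 cores x 3 threads: smt in bits 0-1,
 * core starting at bit 2, package starting at bit 5. */
static uint32_t apicid_from_idx_6x3(unsigned cpu_index)
{
    unsigned smt  = cpu_index % 3;
    unsigned core = (cpu_index / 3) % 6;
    unsigned pkg  = cpu_index / (6 * 3);
    return (pkg << 5) | (core << 2) | smt;
}
```

This reproduces, for instance, the test's (1 << 5) | (1 << 2) | 1 for cpu_index 1 * 6 * 3 + 1 * 3 + 1.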




* [PATCH v2 3/3] hw/i386: Fix uninitialized memory with -device and CPU hotplug
  2020-07-01 17:30 [PATCH v2 0/3] Fix couple of issues with AMD topology Babu Moger
  2020-07-01 17:31 ` [PATCH v2 1/3] hw/i386: Initialize topo_ids from CpuInstanceProperties Babu Moger
  2020-07-01 17:31 ` [PATCH v2 2/3] hw/i386: Build apic_id " Babu Moger
@ 2020-07-01 17:31 ` Babu Moger
  2020-07-27 16:36   ` Igor Mammedov
  2 siblings, 1 reply; 23+ messages in thread
From: Babu Moger @ 2020-07-01 17:31 UTC (permalink / raw)
  To: pbonzini, rth, ehabkost, imammedo; +Cc: qemu-devel

Noticed the following command failure while testing CPU hotplug.

$ qemu-system-x86_64 -machine q35,accel=kvm -smp 1,maxcpus=2,
  cores=1,threads=1,sockets=2 -cpu EPYC -device EPYC-x86_64-
  cpu,core-id=0,socket-id=1,thread-id=0

  qemu-system-x86_64: -device EPYC-x86_64-cpu,core-id=0,socket-id=1,
  thread-id=0: Invalid CPU [socket: 21855, die: 0, core: 0, thread: 0]
  with APIC ID 21855, valid index range 0:1

This happens because the APIC ID is calculated from uninitialized memory.
The problem appeared after the addition of the new node_id field to the
X86CPUTopoIDs structure: node_id is left uninitialized when calling
apicid_from_topo_ids. The problem is discussed in the thread below.
https://lore.kernel.org/qemu-devel/20200602171838.GG577771@habkost.net/

Fix the problem by initializing the node_id from the device being added.
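The bug class can be sketched in isolation (toy structs and a made-up encoding, not QEMU's actual code): a stack-allocated topo-IDs struct whose fields are filled in one by one leaves any skipped field, node_id here before this patch, holding stack garbage, which then leaks into the computed APIC ID.

```c
#include <assert.h>

/* Toy stand-in for X86CPUTopoIDs with a made-up bit encoding. */
typedef struct {
    unsigned pkg_id;
    unsigned node_id;
    unsigned core_id;
    unsigned smt_id;
} TopoIDs;

static unsigned toy_apicid(const TopoIDs *t)
{
    /* made-up field offsets, for illustration only */
    return (t->pkg_id << 8) | (t->node_id << 5) |
           (t->core_id << 1) | t->smt_id;
}

/* The fixed path: default an unset node-id (modelled as -1) to 0, and
 * zero-initialize the struct so that no field is ever left as garbage. */
static unsigned apicid_for_device(unsigned socket, int node_id)
{
    TopoIDs t = {0};
    t.pkg_id = socket;
    t.node_id = node_id < 0 ? 0 : (unsigned)node_id;
    return toy_apicid(&t);
}
```

Before the fix, the equivalent of apicid_for_device read node_id from an uninitialized slot, producing garbage IDs such as the 21855 in the error above.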

Fixes:
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1828750

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 hw/i386/pc.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index e613b2299f..aa9fb48834 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1553,6 +1553,15 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
             cpu->die_id = 0;
         }
 
+        /*
+         * If node_id is not set, initialize it to zero for now. If the user
+         * does not pass the correct node in case of numa configuration, it
+         * will be rejected eventually.
+         */
+        if (cpu->node_id < 0) {
+            cpu->node_id = 0;
+        }
+
         if (cpu->socket_id < 0) {
             error_setg(errp, "CPU socket-id is not set");
             return;
@@ -1587,6 +1596,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
         }
 
         topo_ids.pkg_id = cpu->socket_id;
+        topo_ids.node_id = cpu->node_id;
         topo_ids.die_id = cpu->die_id;
         topo_ids.core_id = cpu->core_id;
         topo_ids.smt_id = cpu->thread_id;




* Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from CpuInstanceProperties
  2020-07-01 17:31 ` [PATCH v2 1/3] hw/i386: Initialize topo_ids from CpuInstanceProperties Babu Moger
@ 2020-07-13  9:08   ` Igor Mammedov
  2020-07-13 15:02     ` Babu Moger
  0 siblings, 1 reply; 23+ messages in thread
From: Igor Mammedov @ 2020-07-13  9:08 UTC (permalink / raw)
  To: Babu Moger; +Cc: qemu-devel, pbonzini, ehabkost, rth

On Wed, 01 Jul 2020 12:31:01 -0500
Babu Moger <babu.moger@amd.com> wrote:

> This is in preparation for building the apic_id from user-provided
> topology information.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
>  include/hw/i386/topology.h |   19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> index 07239f95f4..7cb21e9c82 100644
> --- a/include/hw/i386/topology.h
> +++ b/include/hw/i386/topology.h
> @@ -40,6 +40,7 @@
>  
>  
>  #include "qemu/bitops.h"
> +#include "qapi/qapi-types-machine.h"
>  
>  /* APIC IDs can be 32-bit, but beware: APIC IDs > 255 require x2APIC support
>   */
> @@ -196,6 +197,24 @@ static inline void x86_topo_ids_from_apicid_epyc(apic_id_t apicid,
>      topo_ids->pkg_id = apicid >> apicid_pkg_offset_epyc(topo_info);
>  }
>  
> +
> +/*
> + * Initialize topo_ids from CpuInstanceProperties
> + * node_id in CpuInstanceProperties(or in CPU device) is a sequential
> + * number, but while building the topology 

> we need to separate it for
> + * each socket(mod nodes_per_pkg).
could you clarify a bit more on why this is necessary?


> + */
> +static inline void x86_init_topo_ids(X86CPUTopoInfo *topo_info,
> +                                     CpuInstanceProperties props,
> +                                     X86CPUTopoIDs *topo_ids)
> +{
> +    topo_ids->smt_id = props.has_thread_id ? props.thread_id : 0;
> +    topo_ids->core_id = props.has_core_id ? props.core_id : 0;
> +    topo_ids->die_id = props.has_die_id ? props.die_id : 0;
> +    topo_ids->node_id = props.has_node_id ?
> +                        props.node_id % MAX(topo_info->nodes_per_pkg, 1) : 0;
> +    topo_ids->pkg_id = props.has_socket_id ? props.socket_id : 0;
> +}
>  /*
>   * Make APIC ID for the CPU 'cpu_index'
>   *
> 




* Re: [PATCH v2 2/3] hw/i386: Build apic_id from CpuInstanceProperties
  2020-07-01 17:31 ` [PATCH v2 2/3] hw/i386: Build apic_id " Babu Moger
@ 2020-07-13  9:15   ` Igor Mammedov
  2020-07-13 15:00     ` Babu Moger
  0 siblings, 1 reply; 23+ messages in thread
From: Igor Mammedov @ 2020-07-13  9:15 UTC (permalink / raw)
  To: Babu Moger; +Cc: qemu-devel, pbonzini, ehabkost, rth

On Wed, 01 Jul 2020 12:31:08 -0500
Babu Moger <babu.moger@amd.com> wrote:

> Build apic_id from CpuInstanceProperties when NUMA is configured.
> Use the node_id from the user-provided NUMA information. This
> avoids conflicts between the NUMA information and the generated
> apic_id.
> 
> Re-arranged the code a little to make sure CpuInstanceProperties
> is initialized before calling.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
>  hw/i386/pc.c               |    6 +++++-
>  hw/i386/x86.c              |   19 +++++++++++++------
>  include/hw/i386/topology.h |   14 +++++++++++---
>  include/hw/i386/x86.h      |    6 ++++--
>  tests/test-x86-cpuid.c     |   39 ++++++++++++++++++++-------------------
>  5 files changed, 53 insertions(+), 31 deletions(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index d103b8c0ab..e613b2299f 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -800,13 +800,17 @@ void pc_smp_parse(MachineState *ms, QemuOpts *opts)
>  void pc_hot_add_cpu(MachineState *ms, const int64_t id, Error **errp)
>  {
>      X86MachineState *x86ms = X86_MACHINE(ms);
> -    int64_t apic_id = x86_cpu_apic_id_from_index(x86ms, id);
> +    CpuInstanceProperties props;
> +    int64_t apic_id;
>      Error *local_err = NULL;
>  
>      if (id < 0) {
>          error_setg(errp, "Invalid CPU id: %" PRIi64, id);
>          return;
>      }
> +    props = ms->possible_cpus->cpus[id].props;
> +
> +    apic_id = x86_cpu_apic_id_from_index(x86ms, id, props);
>  
>      if (apic_id >= ACPI_CPU_HOTPLUG_ID_LIMIT) {
>          error_setg(errp, "Unable to add CPU: %" PRIi64
> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> index 34229b45c7..7554416ae0 100644
> --- a/hw/i386/x86.c
> +++ b/hw/i386/x86.c
> @@ -93,7 +93,8 @@ static void x86_set_epyc_topo_handlers(MachineState *machine)
>   * all CPUs up to max_cpus.
>   */
>  uint32_t x86_cpu_apic_id_from_index(X86MachineState *x86ms,
> -                                    unsigned int cpu_index)
> +                                    unsigned int cpu_index,
> +                                    CpuInstanceProperties props)
>  {
>      X86MachineClass *x86mc = X86_MACHINE_GET_CLASS(x86ms);
>      X86CPUTopoInfo topo_info;
> @@ -102,7 +103,7 @@ uint32_t x86_cpu_apic_id_from_index(X86MachineState *x86ms,
>  
>      init_topo_info(&topo_info, x86ms);
>  
> -    correct_id = x86ms->apicid_from_cpu_idx(&topo_info, cpu_index);
> +    correct_id = x86ms->apicid_from_cpu_idx(&topo_info, cpu_index, props);
>      if (x86mc->compat_apic_id_mode) {
>          if (cpu_index != correct_id && !warned && !qtest_enabled()) {
>              error_report("APIC IDs set in compatibility mode, "
> @@ -136,6 +137,8 @@ void x86_cpus_init(X86MachineState *x86ms, int default_cpu_version)
>      const CPUArchIdList *possible_cpus;
>      MachineState *ms = MACHINE(x86ms);
>      MachineClass *mc = MACHINE_GET_CLASS(x86ms);
> +    CpuInstanceProperties props;
> +
>  
>      /* Check for apicid encoding */
>      if (cpu_x86_use_epyc_apic_id_encoding(ms->cpu_type)) {
> @@ -144,6 +147,8 @@ void x86_cpus_init(X86MachineState *x86ms, int default_cpu_version)
>  
>      x86_cpu_set_default_version(default_cpu_version);
>  
> +    possible_cpus = mc->possible_cpu_arch_ids(ms);
> +
>      /*
>       * Calculates the limit to CPU APIC ID values
>       *
> @@ -152,13 +157,15 @@ void x86_cpus_init(X86MachineState *x86ms, int default_cpu_version)
>       *
>       * This is used for FW_CFG_MAX_CPUS. See comments on fw_cfg_arch_create().
>       */
> -    x86ms->apic_id_limit = x86_cpu_apic_id_from_index(x86ms,
> -                                                      ms->smp.max_cpus - 1) + 1;
> -    possible_cpus = mc->possible_cpu_arch_ids(ms);
> +    props = ms->possible_cpus->cpus[ms->smp.max_cpus - 1].props;
>  
> +    x86ms->apic_id_limit = x86_cpu_apic_id_from_index(x86ms,
> +                                                      ms->smp.max_cpus - 1,
> +                                                      props) + 1;
>      for (i = 0; i < ms->possible_cpus->len; i++) {
> +        props = ms->possible_cpus->cpus[i].props;
>          ms->possible_cpus->cpus[i].arch_id =
> -            x86_cpu_apic_id_from_index(x86ms, i);
> +            x86_cpu_apic_id_from_index(x86ms, i, props);
>      }
>  
>      for (i = 0; i < ms->smp.cpus; i++) {
> diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> index 7cb21e9c82..a800fc905f 100644
> --- a/include/hw/i386/topology.h
> +++ b/include/hw/i386/topology.h
> @@ -221,10 +221,17 @@ static inline void x86_init_topo_ids(X86CPUTopoInfo *topo_info,
>   * 'cpu_index' is a sequential, contiguous ID for the CPU.
>   */
>  static inline apic_id_t x86_apicid_from_cpu_idx_epyc(X86CPUTopoInfo *topo_info,
> -                                                     unsigned cpu_index)
> +                                                     unsigned cpu_index,
> +                                                     CpuInstanceProperties props)
>  {
>      X86CPUTopoIDs topo_ids;
> -    x86_topo_ids_from_idx_epyc(topo_info, cpu_index, &topo_ids);
> +
> +    if (props.has_node_id) {
> +        x86_init_topo_ids(topo_info, props, &topo_ids);
> +    } else {
> +        x86_topo_ids_from_idx_epyc(topo_info, cpu_index, &topo_ids);
why this branch is needed?

> +    }
> +
>      return x86_apicid_from_topo_ids_epyc(topo_info, &topo_ids);
>  }
[...]


^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH v2 2/3] hw/i386: Build apic_id from CpuInstanceProperties
  2020-07-13  9:15   ` Igor Mammedov
@ 2020-07-13 15:00     ` Babu Moger
  0 siblings, 0 replies; 23+ messages in thread
From: Babu Moger @ 2020-07-13 15:00 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: qemu-devel, pbonzini, ehabkost, rth



> -----Original Message-----
> From: Igor Mammedov <imammedo@redhat.com>
> Sent: Monday, July 13, 2020 4:15 AM
> To: Moger, Babu <Babu.Moger@amd.com>
> Cc: pbonzini@redhat.com; rth@twiddle.net; ehabkost@redhat.com; qemu-
> devel@nongnu.org
> Subject: Re: [PATCH v2 2/3] hw/i386: Build apic_id from CpuInstanceProperties
> 
> On Wed, 01 Jul 2020 12:31:08 -0500
> Babu Moger <babu.moger@amd.com> wrote:
> 
> > Build apic_id from CpuInstanceProperties if numa configured.
> > Use the node_id from user provided numa information. This will avoid
> > conflicts between numa information and apic_id generated.
> >
> > Re-arranged the code little bit to make sure CpuInstanceProperties is
> > initialized before calling.
> >
> > Signed-off-by: Babu Moger <babu.moger@amd.com>
> > ---
> >  hw/i386/pc.c               |    6 +++++-
> >  hw/i386/x86.c              |   19 +++++++++++++------
> >  include/hw/i386/topology.h |   14 +++++++++++---
> >  include/hw/i386/x86.h      |    6 ++++--
> >  tests/test-x86-cpuid.c     |   39 ++++++++++++++++++++-------------------
> >  5 files changed, 53 insertions(+), 31 deletions(-)
> >
> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > index d103b8c0ab..e613b2299f 100644
> > --- a/hw/i386/pc.c
> > +++ b/hw/i386/pc.c
> > @@ -800,13 +800,17 @@ void pc_smp_parse(MachineState *ms, QemuOpts
> > *opts)  void pc_hot_add_cpu(MachineState *ms, const int64_t id, Error
> > **errp)  {
> >      X86MachineState *x86ms = X86_MACHINE(ms);
> > -    int64_t apic_id = x86_cpu_apic_id_from_index(x86ms, id);
> > +    CpuInstanceProperties props;
> > +    int64_t apic_id;
> >      Error *local_err = NULL;
> >
> >      if (id < 0) {
> >          error_setg(errp, "Invalid CPU id: %" PRIi64, id);
> >          return;
> >      }
> > +    props = ms->possible_cpus->cpus[id].props;
> > +
> > +    apic_id = x86_cpu_apic_id_from_index(x86ms, id, props);
> >
> >      if (apic_id >= ACPI_CPU_HOTPLUG_ID_LIMIT) {
> >          error_setg(errp, "Unable to add CPU: %" PRIi64
> >
> > diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> > index 34229b45c7..7554416ae0 100644
> > --- a/hw/i386/x86.c
> > +++ b/hw/i386/x86.c
> > @@ -93,7 +93,8 @@ static void x86_set_epyc_topo_handlers(MachineState
> *machine)
> >   * all CPUs up to max_cpus.
> >   */
> >  uint32_t x86_cpu_apic_id_from_index(X86MachineState *x86ms,
> > -                                    unsigned int cpu_index)
> > +                                    unsigned int cpu_index,
> > +                                    CpuInstanceProperties props)
> >  {
> >      X86MachineClass *x86mc = X86_MACHINE_GET_CLASS(x86ms);
> >      X86CPUTopoInfo topo_info;
> > @@ -102,7 +103,7 @@ uint32_t
> > x86_cpu_apic_id_from_index(X86MachineState *x86ms,
> >
> >      init_topo_info(&topo_info, x86ms);
> >
> > -    correct_id = x86ms->apicid_from_cpu_idx(&topo_info, cpu_index);
> > +    correct_id = x86ms->apicid_from_cpu_idx(&topo_info, cpu_index,
> > + props);
> >      if (x86mc->compat_apic_id_mode) {
> >          if (cpu_index != correct_id && !warned && !qtest_enabled()) {
> >              error_report("APIC IDs set in compatibility mode, "
> > @@ -136,6 +137,8 @@ void x86_cpus_init(X86MachineState *x86ms, int
> default_cpu_version)
> >      const CPUArchIdList *possible_cpus;
> >      MachineState *ms = MACHINE(x86ms);
> >      MachineClass *mc = MACHINE_GET_CLASS(x86ms);
> > +    CpuInstanceProperties props;
> > +
> >
> >      /* Check for apicid encoding */
> >      if (cpu_x86_use_epyc_apic_id_encoding(ms->cpu_type)) { @@ -144,6
> > +147,8 @@ void x86_cpus_init(X86MachineState *x86ms, int
> > default_cpu_version)
> >
> >      x86_cpu_set_default_version(default_cpu_version);
> >
> > +    possible_cpus = mc->possible_cpu_arch_ids(ms);
> > +
> >      /*
> >       * Calculates the limit to CPU APIC ID values
> >       *
> > @@ -152,13 +157,15 @@ void x86_cpus_init(X86MachineState *x86ms, int
> default_cpu_version)
> >       *
> >       * This is used for FW_CFG_MAX_CPUS. See comments on
> fw_cfg_arch_create().
> >       */
> > -    x86ms->apic_id_limit = x86_cpu_apic_id_from_index(x86ms,
> > -                                                      ms->smp.max_cpus - 1) + 1;
> > -    possible_cpus = mc->possible_cpu_arch_ids(ms);
> > +    props = ms->possible_cpus->cpus[ms->smp.max_cpus - 1].props;
> >
> > +    x86ms->apic_id_limit = x86_cpu_apic_id_from_index(x86ms,
> > +                                                      ms->smp.max_cpus - 1,
> > +                                                      props) + 1;
> >      for (i = 0; i < ms->possible_cpus->len; i++) {
> > +        props = ms->possible_cpus->cpus[i].props;
> >          ms->possible_cpus->cpus[i].arch_id =
> > -            x86_cpu_apic_id_from_index(x86ms, i);
> > +            x86_cpu_apic_id_from_index(x86ms, i, props);
> >      }
> >
> >      for (i = 0; i < ms->smp.cpus; i++) {
> >
> > diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> > index 7cb21e9c82..a800fc905f 100644
> > --- a/include/hw/i386/topology.h
> > +++ b/include/hw/i386/topology.h
> > @@ -221,10 +221,17 @@ static inline void
> x86_init_topo_ids(X86CPUTopoInfo *topo_info,
> >   * 'cpu_index' is a sequential, contiguous ID for the CPU.
> >   */
> >  static inline apic_id_t x86_apicid_from_cpu_idx_epyc(X86CPUTopoInfo
> *topo_info,
> > -                                                     unsigned cpu_index)
> > +                                                     unsigned cpu_index,
> > +
> > + CpuInstanceProperties props)
> >  {
> >      X86CPUTopoIDs topo_ids;
> > -    x86_topo_ids_from_idx_epyc(topo_info, cpu_index, &topo_ids);
> > +
> > +    if (props.has_node_id) {
> > +        x86_init_topo_ids(topo_info, props, &topo_ids);
> > +    } else {
> > +        x86_topo_ids_from_idx_epyc(topo_info, cpu_index, &topo_ids);
> why this branch is needed?

I added it to handle the case where this function is called without props
being initialized. I will check whether that case can actually happen.

> 
> > +    }
> > +
> >      return x86_apicid_from_topo_ids_epyc(topo_info, &topo_ids);  }
> >  /* Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID @@
> > -280,7 +287,8 @@ static inline void x86_topo_ids_from_apicid(apic_id_t
> apicid,
> >   * 'cpu_index' is a sequential, contiguous ID for the CPU.
> >   */
> >  static inline apic_id_t x86_apicid_from_cpu_idx(X86CPUTopoInfo *topo_info,
> > -                                                unsigned cpu_index)
> > +                                                unsigned cpu_index,
> > +                                                CpuInstanceProperties
> > + props)
> >  {
> >      X86CPUTopoIDs topo_ids;
> >      x86_topo_ids_from_idx(topo_info, cpu_index, &topo_ids);
> >
> > diff --git a/include/hw/i386/x86.h b/include/hw/i386/x86.h
> > index b79f24e285..3109f39554 100644
> > --- a/include/hw/i386/x86.h
> > +++ b/include/hw/i386/x86.h
> > @@ -65,7 +65,8 @@ typedef struct {
> >
> >      /* Apic id specific handlers */
> >      uint32_t (*apicid_from_cpu_idx)(X86CPUTopoInfo *topo_info,
> > -                                    unsigned cpu_index);
> > +                                    unsigned cpu_index,
> > +                                    CpuInstanceProperties props);
> >      void (*topo_ids_from_apicid)(apic_id_t apicid, X86CPUTopoInfo
> *topo_info,
> >                                   X86CPUTopoIDs *topo_ids);
> >      apic_id_t (*apicid_from_topo_ids)(X86CPUTopoInfo *topo_info, @@
> > -93,7 +94,8 @@ typedef struct {  void init_topo_info(X86CPUTopoInfo
> > *topo_info, const X86MachineState *x86ms);
> >
> >  uint32_t x86_cpu_apic_id_from_index(X86MachineState *pcms,
> > -                                    unsigned int cpu_index);
> > +                                    unsigned int cpu_index,
> > +                                    CpuInstanceProperties props);
> >
> >  void x86_cpu_new(X86MachineState *pcms, int64_t apic_id, Error **errp);
> >  void x86_cpus_init(X86MachineState *pcms, int default_cpu_version);
> >
> > diff --git a/tests/test-x86-cpuid.c b/tests/test-x86-cpuid.c
> > index 049030a50e..a1308e214b 100644
> > --- a/tests/test-x86-cpuid.c
> > +++ b/tests/test-x86-cpuid.c
> > @@ -29,6 +29,7 @@
> >  static void test_topo_bits(void)
> >  {
> >      X86CPUTopoInfo topo_info = {0};
> > +    CpuInstanceProperties props = {0};
> >
> >      /* simple tests for 1 thread per core, 1 core per die, 1 die per package */
> >      topo_info = (X86CPUTopoInfo) {0, 1, 1, 1}; @@ -37,10 +38,10 @@
> > static void test_topo_bits(void)
> >      g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
> >
> >      topo_info = (X86CPUTopoInfo) {0, 1, 1, 1};
> > -    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
> > -    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
> > -    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
> > -    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 3), ==, 3);
> > +    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0, props), ==, 0);
> > +    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1, props), ==, 1);
> > +    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2, props), ==, 2);
> > +    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 3, props),
> > + ==, 3);
> >
> >
> >      /* Test field width calculation for multiple values @@ -92,38
> > +93,38 @@ static void test_topo_bits(void)
> >      g_assert_cmpuint(apicid_pkg_offset(&topo_info), ==, 5);
> >
> >      topo_info = (X86CPUTopoInfo) {0, 1, 6, 3};
> > -    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
> > -    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
> > -    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
> > +    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0, props), ==, 0);
> > +    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1, props), ==, 1);
> > +    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2, props),
> > + ==, 2);
> >
> >      topo_info = (X86CPUTopoInfo) {0, 1, 6, 3};
> > -    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 0), ==,
> > +    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 0,
> > + props), ==,
> >                       (1 << 2) | 0);
> > -    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 1), ==,
> > +    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 1,
> > + props), ==,
> >                       (1 << 2) | 1);
> > -    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 2), ==,
> > +    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 2,
> > + props), ==,
> >                       (1 << 2) | 2);
> >
> > -    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2 * 3 + 0), ==,
> > +    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2 * 3 + 0,
> > + props), ==,
> >                       (2 << 2) | 0);
> > -    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2 * 3 + 1), ==,
> > +    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2 * 3 + 1,
> > + props), ==,
> >                       (2 << 2) | 1);
> > -    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2 * 3 + 2), ==,
> > +    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2 * 3 + 2,
> > + props), ==,
> >                       (2 << 2) | 2);
> >
> > -    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 5 * 3 + 0), ==,
> > +    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 5 * 3 + 0,
> > + props), ==,
> >                       (5 << 2) | 0);
> > -    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 5 * 3 + 1), ==,
> > +    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 5 * 3 + 1,
> > + props), ==,
> >                       (5 << 2) | 1);
> > -    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 5 * 3 + 2), ==,
> > +    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 5 * 3 + 2,
> > + props), ==,
> >                       (5 << 2) | 2);
> >
> >      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info,
> > -                     1 * 6 * 3 + 0 * 3 + 0), ==, (1 << 5));
> > +                     1 * 6 * 3 + 0 * 3 + 0, props), ==, (1 << 5));
> >      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info,
> > -                     1 * 6 * 3 + 1 * 3 + 1), ==, (1 << 5) | (1 << 2) | 1);
> > +                     1 * 6 * 3 + 1 * 3 + 1, props), ==, (1 << 5) | (1
> > + << 2) | 1);
> >      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info,
> > -                     3 * 6 * 3 + 5 * 3 + 2), ==, (3 << 5) | (5 << 2) | 2);
> > +                     3 * 6 * 3 + 5 * 3 + 2, props), ==, (3 << 5) | (5
> > + << 2) | 2);
> >  }
> >
> >  int main(int argc, char **argv)
> >



^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH v2 1/3] hw/i386: Initialize topo_ids from CpuInstanceProperties
  2020-07-13  9:08   ` Igor Mammedov
@ 2020-07-13 15:02     ` Babu Moger
  2020-07-13 16:17       ` Igor Mammedov
  0 siblings, 1 reply; 23+ messages in thread
From: Babu Moger @ 2020-07-13 15:02 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: qemu-devel, pbonzini, ehabkost, rth



> -----Original Message-----
> From: Igor Mammedov <imammedo@redhat.com>
> Sent: Monday, July 13, 2020 4:08 AM
> To: Moger, Babu <Babu.Moger@amd.com>
> Cc: pbonzini@redhat.com; rth@twiddle.net; ehabkost@redhat.com; qemu-
> devel@nongnu.org
> Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> CpuInstanceProperties
> 
> On Wed, 01 Jul 2020 12:31:01 -0500
> Babu Moger <babu.moger@amd.com> wrote:
> 
> > This is in preparation to build the apic_id from user provided
> > topology information.
> >
> > Signed-off-by: Babu Moger <babu.moger@amd.com>
> > ---
> >  include/hw/i386/topology.h |   19 +++++++++++++++++++
> >  1 file changed, 19 insertions(+)
> >
> > diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> > index 07239f95f4..7cb21e9c82 100644
> > --- a/include/hw/i386/topology.h
> > +++ b/include/hw/i386/topology.h
> > @@ -40,6 +40,7 @@
> >
> >
> >  #include "qemu/bitops.h"
> > +#include "qapi/qapi-types-machine.h"
> >
> >  /* APIC IDs can be 32-bit, but beware: APIC IDs > 255 require x2APIC support
> >   */
> > @@ -196,6 +197,24 @@ static inline void
> x86_topo_ids_from_apicid_epyc(apic_id_t apicid,
> >      topo_ids->pkg_id = apicid >> apicid_pkg_offset_epyc(topo_info);
> >  }
> >
> > +
> > +/*
> > + * Initialize topo_ids from CpuInstanceProperties
> > + * node_id in CpuInstanceProperties(or in CPU device) is a sequential
> > + * number, but while building the topology
> 
> >we need to separate it for
> > + * each socket(mod nodes_per_pkg).
> could you clarify a bit more on why this is necessary?

If you have two sockets and 4 NUMA nodes, node_id in CpuInstanceProperties
will be numbered sequentially as 0, 1, 2, 3. But in the EPYC topology it will
be 0, 1, 0, 1 (basically node_id mod the number of nodes per socket).


> > + */
> > +static inline void x86_init_topo_ids(X86CPUTopoInfo *topo_info,
> > +                                     CpuInstanceProperties props,
> > +                                     X86CPUTopoIDs *topo_ids) {
> > +    topo_ids->smt_id = props.has_thread_id ? props.thread_id : 0;
> > +    topo_ids->core_id = props.has_core_id ? props.core_id : 0;
> > +    topo_ids->die_id = props.has_die_id ? props.die_id : 0;
> > +    topo_ids->node_id = props.has_node_id ?
> > +                        props.node_id % MAX(topo_info->nodes_per_pkg, 1) : 0;
> > +    topo_ids->pkg_id = props.has_socket_id ? props.socket_id : 0; }
> >  /*
> >   * Make APIC ID for the CPU 'cpu_index'
> >   *
> >



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from CpuInstanceProperties
  2020-07-13 15:02     ` Babu Moger
@ 2020-07-13 16:17       ` Igor Mammedov
  2020-07-13 16:43         ` Babu Moger
  0 siblings, 1 reply; 23+ messages in thread
From: Igor Mammedov @ 2020-07-13 16:17 UTC (permalink / raw)
  To: Babu Moger; +Cc: qemu-devel, pbonzini, ehabkost, rth

On Mon, 13 Jul 2020 10:02:22 -0500
Babu Moger <babu.moger@amd.com> wrote:

> > -----Original Message-----
> > From: Igor Mammedov <imammedo@redhat.com>
> > Sent: Monday, July 13, 2020 4:08 AM
> > To: Moger, Babu <Babu.Moger@amd.com>
> > Cc: pbonzini@redhat.com; rth@twiddle.net; ehabkost@redhat.com; qemu-
> > devel@nongnu.org
> > Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> > CpuInstanceProperties
[...]
> > > +
> > > +/*
> > > + * Initialize topo_ids from CpuInstanceProperties
> > > + * node_id in CpuInstanceProperties(or in CPU device) is a sequential
> > > + * number, but while building the topology  
> >   
> > >we need to separate it for
> > > + * each socket(mod nodes_per_pkg).  
> > could you clarify a bit more on why this is necessary?  
> 
> If you have two sockets and 4 numa nodes, node_id in CpuInstanceProperties
> will be number sequentially as 0, 1, 2, 3.  But in EPYC topology, it will
> be  0, 1, 0, 1( Basically mod % number of nodes per socket).

I'm confused, let's suppose we have 2 EPYC sockets with 2 nodes per socket
so the APIC id would be composed like:

 1st socket
   pkg_id(0) | node_id(0)
   pkg_id(0) | node_id(1)

 2nd socket
   pkg_id(1) | node_id(0)
   pkg_id(1) | node_id(1)
  
if that's the case, then EPYC's node_id here doesn't look like
a NUMA node in the sense it's usually used
(above config would have 4 different memory controllers => 4 conventional NUMA nodes).

I wonder if linux guest actually uses node_id encoded in apic id for
configuring/checking numa structures, or it just uses whatever ACPI SRAT
table provided.
 
> > > + */
> > > +static inline void x86_init_topo_ids(X86CPUTopoInfo *topo_info,
> > > +                                     CpuInstanceProperties props,
> > > +                                     X86CPUTopoIDs *topo_ids) {
> > > +    topo_ids->smt_id = props.has_thread_id ? props.thread_id : 0;
> > > +    topo_ids->core_id = props.has_core_id ? props.core_id : 0;
> > > +    topo_ids->die_id = props.has_die_id ? props.die_id : 0;
> > > +    topo_ids->node_id = props.has_node_id ?
> > > +                        props.node_id % MAX(topo_info->nodes_per_pkg, 1) : 0;
> > > +    topo_ids->pkg_id = props.has_socket_id ? props.socket_id : 0; }
> > >  /*
> > >   * Make APIC ID for the CPU 'cpu_index'
> > >   *
> > >  
> 



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from CpuInstanceProperties
  2020-07-13 16:17       ` Igor Mammedov
@ 2020-07-13 16:43         ` Babu Moger
  2020-07-13 17:32           ` Igor Mammedov
  0 siblings, 1 reply; 23+ messages in thread
From: Babu Moger @ 2020-07-13 16:43 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: qemu-devel, pbonzini, ehabkost, rth



On 7/13/20 11:17 AM, Igor Mammedov wrote:
> On Mon, 13 Jul 2020 10:02:22 -0500
> Babu Moger <babu.moger@amd.com> wrote:
> 
>>> -----Original Message-----
>>> From: Igor Mammedov <imammedo@redhat.com>
>>> Sent: Monday, July 13, 2020 4:08 AM
>>> To: Moger, Babu <Babu.Moger@amd.com>
>>> Cc: pbonzini@redhat.com; rth@twiddle.net; ehabkost@redhat.com; qemu-
>>> devel@nongnu.org
>>> Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
>>> CpuInstanceProperties
> [...]
>>>> +
>>>> +/*
>>>> + * Initialize topo_ids from CpuInstanceProperties
>>>> + * node_id in CpuInstanceProperties(or in CPU device) is a sequential
>>>> + * number, but while building the topology  
>>>   
>>>> we need to separate it for
>>>> + * each socket(mod nodes_per_pkg).  
>>> could you clarify a bit more on why this is necessary?  
>>
>> If you have two sockets and 4 numa nodes, node_id in CpuInstanceProperties
>> will be numbered sequentially as 0, 1, 2, 3. But in the EPYC topology, it will
>> be 0, 1, 0, 1 (basically node_id mod the number of nodes per socket).
> 
> I'm confused, let's suppose we have 2 EPYC sockets with 2 nodes per socket
> so the APIC id would be composed like:
> 
>  1st socket
>    pkg_id(0) | node_id(0)
>    pkg_id(0) | node_id(1)
> 
>  2nd socket
>    pkg_id(1) | node_id(0)
>    pkg_id(1) | node_id(1)
>   
> if that's the case, then EPYC's node_id here doesn't look like
> a NUMA node in the sense it's usually used
> (above config would have 4 different memory controllers => 4 conventional NUMA nodes).

The EPYC model uses a combination of socket id and node id to identify the
NUMA nodes. So it internally uses all the information.

> 
> I wonder if linux guest actually uses node_id encoded in apic id for
> configuring/checking numa structures, or it just uses whatever ACPI SRAT
> table provided.
>  
>>>> + */
>>>> +static inline void x86_init_topo_ids(X86CPUTopoInfo *topo_info,
>>>> +                                     CpuInstanceProperties props,
>>>> +                                     X86CPUTopoIDs *topo_ids) {
>>>> +    topo_ids->smt_id = props.has_thread_id ? props.thread_id : 0;
>>>> +    topo_ids->core_id = props.has_core_id ? props.core_id : 0;
>>>> +    topo_ids->die_id = props.has_die_id ? props.die_id : 0;
>>>> +    topo_ids->node_id = props.has_node_id ?
>>>> +                        props.node_id % MAX(topo_info->nodes_per_pkg, 1) : 0;
>>>> +    topo_ids->pkg_id = props.has_socket_id ? props.socket_id : 0; }
>>>>  /*
>>>>   * Make APIC ID for the CPU 'cpu_index'
>>>>   *
>>>>  
>>
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from CpuInstanceProperties
  2020-07-13 16:43         ` Babu Moger
@ 2020-07-13 17:32           ` Igor Mammedov
  2020-07-13 19:30             ` Babu Moger
  0 siblings, 1 reply; 23+ messages in thread
From: Igor Mammedov @ 2020-07-13 17:32 UTC (permalink / raw)
  To: Babu Moger; +Cc: qemu-devel, pbonzini, ehabkost, rth

On Mon, 13 Jul 2020 11:43:33 -0500
Babu Moger <babu.moger@amd.com> wrote:

> On 7/13/20 11:17 AM, Igor Mammedov wrote:
> > On Mon, 13 Jul 2020 10:02:22 -0500
> > Babu Moger <babu.moger@amd.com> wrote:
> >   
> >>> -----Original Message-----
> >>> From: Igor Mammedov <imammedo@redhat.com>
> >>> Sent: Monday, July 13, 2020 4:08 AM
> >>> To: Moger, Babu <Babu.Moger@amd.com>
> >>> Cc: pbonzini@redhat.com; rth@twiddle.net; ehabkost@redhat.com; qemu-
> >>> devel@nongnu.org
> >>> Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> >>> CpuInstanceProperties  
> > [...]  
> >>>> +
> >>>> +/*
> >>>> + * Initialize topo_ids from CpuInstanceProperties
> >>>> + * node_id in CpuInstanceProperties(or in CPU device) is a sequential
> >>>> + * number, but while building the topology    
> >>>     
> >>>> we need to separate it for
> >>>> + * each socket(mod nodes_per_pkg).    
> >>> could you clarify a bit more on why this is necessary?    
> >>
> >> If you have two sockets and 4 numa nodes, node_id in CpuInstanceProperties
> >> will be numbered sequentially as 0, 1, 2, 3. But in the EPYC topology, it will
> >> be 0, 1, 0, 1 (basically node_id mod the number of nodes per socket).  
> > 
> > I'm confused, let's suppose we have 2 EPYC sockets with 2 nodes per socket
> > so APIC id woulbe be composed like:
> > 
> >  1st socket
> >    pkg_id(0) | node_id(0)
> >    pkg_id(0) | node_id(1)
> > 
> >  2nd socket
> >    pkg_id(1) | node_id(0)
> >    pkg_id(1) | node_id(1)
> >   
> > if that's the case, then EPYC's node_id here doesn't look like
> > a NUMA node in the sense it's usually used
> > (above config would have 4 different memory controllers => 4 conventional NUMA nodes).  
> 
> The EPYC model uses a combination of socket id and node id to identify the
> NUMA nodes. So it internally uses all the information.

Well, with the above values EPYC's node_id doesn't look like it's specifying
a machine NUMA node, but rather a node index within a single socket. In that
case it doesn't make much sense to call it a NUMA node_id; it's rather some
index within a socket. (It starts looking like the terminology is all mixed up.)

If you have access to a multi-socket EPYC machine, can you dump and post
its APIC IDs here, please?

> 
> > 
> > I wonder if linux guest actually uses node_id encoded in apic id for
> > configuring/checking numa structures, or it just uses whatever ACPI SRAT
> > table provided.
> >    
> >>>> + */
> >>>> +static inline void x86_init_topo_ids(X86CPUTopoInfo *topo_info,
> >>>> +                                     CpuInstanceProperties props,
> >>>> +                                     X86CPUTopoIDs *topo_ids) {
> >>>> +    topo_ids->smt_id = props.has_thread_id ? props.thread_id : 0;
> >>>> +    topo_ids->core_id = props.has_core_id ? props.core_id : 0;
> >>>> +    topo_ids->die_id = props.has_die_id ? props.die_id : 0;
> >>>> +    topo_ids->node_id = props.has_node_id ?
> >>>> +                        props.node_id % MAX(topo_info->nodes_per_pkg, 1) : 0;
> >>>> +    topo_ids->pkg_id = props.has_socket_id ? props.socket_id : 0; }
> >>>>  /*
> >>>>   * Make APIC ID for the CPU 'cpu_index'
> >>>>   *
> >>>>    
> >>  
> >   
> 



^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH v2 1/3] hw/i386: Initialize topo_ids from CpuInstanceProperties
  2020-07-13 17:32           ` Igor Mammedov
@ 2020-07-13 19:30             ` Babu Moger
  2020-07-14 16:41               ` Igor Mammedov
  2020-07-24 17:05               ` Igor Mammedov
  0 siblings, 2 replies; 23+ messages in thread
From: Babu Moger @ 2020-07-13 19:30 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: qemu-devel, pbonzini, ehabkost, rth



> -----Original Message-----
> From: Igor Mammedov <imammedo@redhat.com>
> Sent: Monday, July 13, 2020 12:32 PM
> To: Moger, Babu <Babu.Moger@amd.com>
> Cc: pbonzini@redhat.com; rth@twiddle.net; ehabkost@redhat.com; qemu-
> devel@nongnu.org
> Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> CpuInstanceProperties
> 
> On Mon, 13 Jul 2020 11:43:33 -0500
> Babu Moger <babu.moger@amd.com> wrote:
> 
> > On 7/13/20 11:17 AM, Igor Mammedov wrote:
> > > On Mon, 13 Jul 2020 10:02:22 -0500
> > > Babu Moger <babu.moger@amd.com> wrote:
> > >
> > >>> -----Original Message-----
> > >>> From: Igor Mammedov <imammedo@redhat.com>
> > >>> Sent: Monday, July 13, 2020 4:08 AM
> > >>> To: Moger, Babu <Babu.Moger@amd.com>
> > >>> Cc: pbonzini@redhat.com; rth@twiddle.net; ehabkost@redhat.com;
> > >>> qemu- devel@nongnu.org
> > >>> Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> > >>> CpuInstanceProperties
> > > [...]
> > >>>> +
> > >>>> +/*
> > >>>> + * Initialize topo_ids from CpuInstanceProperties
> > >>>> + * node_id in CpuInstanceProperties(or in CPU device) is a sequential
> > >>>> + * number, but while building the topology
> > >>>
> > >>>> we need to separate it for
> > >>>> + * each socket(mod nodes_per_pkg).
> > >>> could you clarify a bit more on why this is necessary?
> > >>
> > >> If you have two sockets and 4 numa nodes, node_id in
> > >> CpuInstanceProperties will be numbered sequentially as 0, 1, 2, 3.
> > >> But in the EPYC topology, it will be 0, 1, 0, 1 (basically node_id mod
> > >> the number of nodes per socket).
> > >
> > > I'm confused, let's suppose we have 2 EPYC sockets with 2 nodes per
> > > socket so the APIC id would be composed like:
> > >
> > >  1st socket
> > >    pkg_id(0) | node_id(0)
> > >    pkg_id(0) | node_id(1)
> > >
> > >  2nd socket
> > >    pkg_id(1) | node_id(0)
> > >    pkg_id(1) | node_id(1)
> > >
> > > if that's the case, then EPYC's node_id here doesn't look like a
> > > NUMA node in the sense it's usually used (above config would have 4
> > > different memory controllers => 4 conventional NUMA nodes).
> >
> > The EPYC model uses a combination of socket id and node id to identify
> > the NUMA nodes. So it internally uses all the information.
> 
> Well, with the above values EPYC's node_id doesn't look like it's specifying
> a machine NUMA node, but rather a node index within a single socket. In that case
> it doesn't make much sense to call it a NUMA node_id; it's rather some index
> within a socket. (It starts looking like the terminology is all mixed up.)
> 
> If you have access to a multi-socket EPYC machine, can you dump and post
> its APIC IDs here, please?

Here is the output from my EPYC machine with 2 sockets and 8 nodes in total
(SMT disabled). CPUs 0-31 are in socket 0 and CPUs 32-63 in socket 1.

# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              64
On-line CPU(s) list: 0-63
Thread(s) per core:  1
Core(s) per socket:  32
Socket(s):           2
NUMA node(s):        8
Vendor ID:           AuthenticAMD
CPU family:          23
Model:               1
Model name:          AMD Eng Sample: 1S1901A4VIHF5_30/19_N
Stepping:            2
CPU MHz:             2379.233
CPU max MHz:         1900.0000
CPU min MHz:         1200.0000
BogoMIPS:            3792.81
Virtualization:      AMD-V
L1d cache:           32K
L1i cache:           64K
L2 cache:            512K
L3 cache:            8192K
NUMA node0 CPU(s):   0-7
NUMA node1 CPU(s):   8-15
NUMA node2 CPU(s):   16-23
NUMA node3 CPU(s):   24-31
NUMA node4 CPU(s):   32-39
NUMA node5 CPU(s):   40-47
NUMA node6 CPU(s):   48-55
NUMA node7 CPU(s):   56-63

Here is the output of "# cpuid -l 0x8000001e -r":

You may want to refer to
https://www.amd.com/system/files/TechDocs/54945_3.03_ppr_ZP_B2_pub.zip
(section 2.1.12.2.1.3, ApicId Enumeration Requirements).
Note that this is a general guideline. We tried to generalize it in QEMU as
much as possible. It is a bit complex.

CPU 0:
   0x8000001e 0x00: eax=0x00000000 ebx=0x00000100 ecx=0x00000300
edx=0x00000000
CPU 1:
   0x8000001e 0x00: eax=0x00000002 ebx=0x00000101 ecx=0x00000300
edx=0x00000000
CPU 2:
   0x8000001e 0x00: eax=0x00000004 ebx=0x00000102 ecx=0x00000300
edx=0x00000000
CPU 3:
   0x8000001e 0x00: eax=0x00000006 ebx=0x00000103 ecx=0x00000300
edx=0x00000000
CPU 4:
   0x8000001e 0x00: eax=0x00000008 ebx=0x00000104 ecx=0x00000300
edx=0x00000000
CPU 5:
   0x8000001e 0x00: eax=0x0000000a ebx=0x00000105 ecx=0x00000300
edx=0x00000000
CPU 6:
   0x8000001e 0x00: eax=0x0000000c ebx=0x00000106 ecx=0x00000300
edx=0x00000000
CPU 7:
   0x8000001e 0x00: eax=0x0000000e ebx=0x00000107 ecx=0x00000300
edx=0x00000000
CPU 8:
   0x8000001e 0x00: eax=0x00000010 ebx=0x00000108 ecx=0x00000301
edx=0x00000000
CPU 9:
   0x8000001e 0x00: eax=0x00000012 ebx=0x00000109 ecx=0x00000301
edx=0x00000000
CPU 10:
   0x8000001e 0x00: eax=0x00000014 ebx=0x0000010a ecx=0x00000301
edx=0x00000000
CPU 11:
   0x8000001e 0x00: eax=0x00000016 ebx=0x0000010b ecx=0x00000301
edx=0x00000000
CPU 12:
   0x8000001e 0x00: eax=0x00000018 ebx=0x0000010c ecx=0x00000301
edx=0x00000000
CPU 13:
   0x8000001e 0x00: eax=0x0000001a ebx=0x0000010d ecx=0x00000301
edx=0x00000000
CPU 14:
   0x8000001e 0x00: eax=0x0000001c ebx=0x0000010e ecx=0x00000301
edx=0x00000000
CPU 15:
   0x8000001e 0x00: eax=0x0000001e ebx=0x0000010f ecx=0x00000301
edx=0x00000000
CPU 16:
   0x8000001e 0x00: eax=0x00000020 ebx=0x00000110 ecx=0x00000302
edx=0x00000000
CPU 17:
   0x8000001e 0x00: eax=0x00000022 ebx=0x00000111 ecx=0x00000302
edx=0x00000000
CPU 18:
   0x8000001e 0x00: eax=0x00000024 ebx=0x00000112 ecx=0x00000302
edx=0x00000000
CPU 19:
   0x8000001e 0x00: eax=0x00000026 ebx=0x00000113 ecx=0x00000302
edx=0x00000000
CPU 20:
   0x8000001e 0x00: eax=0x00000028 ebx=0x00000114 ecx=0x00000302
edx=0x00000000
CPU 21:
   0x8000001e 0x00: eax=0x0000002a ebx=0x00000115 ecx=0x00000302
edx=0x00000000
CPU 22:
   0x8000001e 0x00: eax=0x0000002c ebx=0x00000116 ecx=0x00000302
edx=0x00000000
CPU 23:
   0x8000001e 0x00: eax=0x0000002e ebx=0x00000117 ecx=0x00000302
edx=0x00000000
CPU 24:
   0x8000001e 0x00: eax=0x00000030 ebx=0x00000118 ecx=0x00000303
edx=0x00000000
CPU 25:
   0x8000001e 0x00: eax=0x00000032 ebx=0x00000119 ecx=0x00000303
edx=0x00000000
CPU 26:
   0x8000001e 0x00: eax=0x00000034 ebx=0x0000011a ecx=0x00000303
edx=0x00000000
CPU 27:
   0x8000001e 0x00: eax=0x00000036 ebx=0x0000011b ecx=0x00000303
edx=0x00000000
CPU 28:
   0x8000001e 0x00: eax=0x00000038 ebx=0x0000011c ecx=0x00000303
edx=0x00000000
CPU 29:
   0x8000001e 0x00: eax=0x0000003a ebx=0x0000011d ecx=0x00000303
edx=0x00000000
CPU 30:
   0x8000001e 0x00: eax=0x0000003c ebx=0x0000011e ecx=0x00000303
edx=0x00000000
CPU 31:
   0x8000001e 0x00: eax=0x0000003e ebx=0x0000011f ecx=0x00000303
edx=0x00000000
CPU 32:
   0x8000001e 0x00: eax=0x00000040 ebx=0x00000100 ecx=0x00000304
edx=0x00000000
CPU 33:
   0x8000001e 0x00: eax=0x00000042 ebx=0x00000101 ecx=0x00000304
edx=0x00000000
CPU 34:
   0x8000001e 0x00: eax=0x00000044 ebx=0x00000102 ecx=0x00000304
edx=0x00000000
CPU 35:
   0x8000001e 0x00: eax=0x00000046 ebx=0x00000103 ecx=0x00000304
edx=0x00000000
CPU 36:
   0x8000001e 0x00: eax=0x00000048 ebx=0x00000104 ecx=0x00000304
edx=0x00000000
CPU 37:
   0x8000001e 0x00: eax=0x0000004a ebx=0x00000105 ecx=0x00000304
edx=0x00000000
CPU 38:
   0x8000001e 0x00: eax=0x0000004c ebx=0x00000106 ecx=0x00000304
edx=0x00000000
CPU 39:
   0x8000001e 0x00: eax=0x0000004e ebx=0x00000107 ecx=0x00000304
edx=0x00000000
CPU 40:
   0x8000001e 0x00: eax=0x00000050 ebx=0x00000108 ecx=0x00000305
edx=0x00000000
CPU 41:
   0x8000001e 0x00: eax=0x00000052 ebx=0x00000109 ecx=0x00000305
edx=0x00000000
CPU 42:
   0x8000001e 0x00: eax=0x00000054 ebx=0x0000010a ecx=0x00000305
edx=0x00000000
CPU 43:
   0x8000001e 0x00: eax=0x00000056 ebx=0x0000010b ecx=0x00000305
edx=0x00000000
CPU 44:
   0x8000001e 0x00: eax=0x00000058 ebx=0x0000010c ecx=0x00000305
edx=0x00000000
CPU 45:
   0x8000001e 0x00: eax=0x0000005a ebx=0x0000010d ecx=0x00000305
edx=0x00000000
CPU 46:
   0x8000001e 0x00: eax=0x0000005c ebx=0x0000010e ecx=0x00000305
edx=0x00000000
CPU 47:
   0x8000001e 0x00: eax=0x0000005e ebx=0x0000010f ecx=0x00000305
edx=0x00000000
CPU 48:
   0x8000001e 0x00: eax=0x00000060 ebx=0x00000110 ecx=0x00000306
edx=0x00000000
CPU 49:
   0x8000001e 0x00: eax=0x00000062 ebx=0x00000111 ecx=0x00000306
edx=0x00000000
CPU 50:
   0x8000001e 0x00: eax=0x00000064 ebx=0x00000112 ecx=0x00000306
edx=0x00000000
CPU 51:
   0x8000001e 0x00: eax=0x00000066 ebx=0x00000113 ecx=0x00000306
edx=0x00000000
CPU 52:
   0x8000001e 0x00: eax=0x00000068 ebx=0x00000114 ecx=0x00000306
edx=0x00000000
CPU 53:
   0x8000001e 0x00: eax=0x0000006a ebx=0x00000115 ecx=0x00000306
edx=0x00000000
CPU 54:
   0x8000001e 0x00: eax=0x0000006c ebx=0x00000116 ecx=0x00000306
edx=0x00000000
CPU 55:
   0x8000001e 0x00: eax=0x0000006e ebx=0x00000117 ecx=0x00000306
edx=0x00000000
CPU 56:
   0x8000001e 0x00: eax=0x00000070 ebx=0x00000118 ecx=0x00000307
edx=0x00000000
CPU 57:
   0x8000001e 0x00: eax=0x00000072 ebx=0x00000119 ecx=0x00000307
edx=0x00000000
CPU 58:
   0x8000001e 0x00: eax=0x00000074 ebx=0x0000011a ecx=0x00000307
edx=0x00000000
CPU 59:
   0x8000001e 0x00: eax=0x00000076 ebx=0x0000011b ecx=0x00000307
edx=0x00000000
CPU 60:
   0x8000001e 0x00: eax=0x00000078 ebx=0x0000011c ecx=0x00000307
edx=0x00000000
CPU 61:
   0x8000001e 0x00: eax=0x0000007a ebx=0x0000011d ecx=0x00000307
edx=0x00000000
CPU 62:
   0x8000001e 0x00: eax=0x0000007c ebx=0x0000011e ecx=0x00000307
edx=0x00000000
CPU 63:
   0x8000001e 0x00: eax=0x0000007e ebx=0x0000011f ecx=0x00000307
edx=0x00000000

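The raw registers above can be unpacked by hand. As a rough sketch (assuming the Fn8000_001E field layout from the family 17h PPR referenced earlier: EAX = extended APIC ID, EBX[7:0] = core id, ECX[7:0] = node id, ECX[10:8] = nodes per processor minus one):

```python
# Decode one row of the "cpuid -l 0x8000001e -r" dump above.
# Field positions are taken from the AMD family 17h PPR; treat this as
# an illustrative sketch, not an authoritative decoder.
def decode_8000001e(eax, ebx, ecx):
    return {
        "ext_apic_id": eax,                              # EAX[31:0]
        "core_id": ebx & 0xff,                           # EBX[7:0]
        "threads_per_core": ((ebx >> 8) & 0xff) + 1,     # EBX[15:8] + 1
        "node_id": ecx & 0xff,                           # ECX[7:0]
        "nodes_per_processor": ((ecx >> 8) & 0x7) + 1,   # ECX[10:8] + 1
    }

# CPU 8 from the dump: eax=0x00000010 ebx=0x00000108 ecx=0x00000301
print(decode_8000001e(0x10, 0x108, 0x301))
```

Applied to the dump, this gives node ids 0..7 spanning both sockets (CPU 8 is node 1, CPU 32 is node 4), with 4 nodes reported per processor.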
> 
> >
> > >
> > > I wonder if linux guest actually uses node_id encoded in apic id for
> > > configuring/checking numa structures, or it just uses whatever ACPI
> > > SRAT table provided.
> > >
> > >>>> + */
> > >>>> +static inline void x86_init_topo_ids(X86CPUTopoInfo *topo_info,
> > >>>> +                                     CpuInstanceProperties props,
> > >>>> +                                     X86CPUTopoIDs *topo_ids) {
> > >>>> +    topo_ids->smt_id = props.has_thread_id ? props.thread_id : 0;
> > >>>> +    topo_ids->core_id = props.has_core_id ? props.core_id : 0;
> > >>>> +    topo_ids->die_id = props.has_die_id ? props.die_id : 0;
> > >>>> +    topo_ids->node_id = props.has_node_id ?
> > >>>> +                        props.node_id % MAX(topo_info->nodes_per_pkg, 1) : 0;
> > >>>> +    topo_ids->pkg_id = props.has_socket_id ? props.socket_id :
> > >>>> +0; }
> > >>>>  /*
> > >>>>   * Make APIC ID for the CPU 'cpu_index'
> > >>>>   *
> > >>>>
> > >>
> > >
> >
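The node_id handling quoted above boils down to folding a machine-wide sequential node number into a per-socket index. A minimal sketch of that mapping (the 2-socket/2-nodes-per-socket layout is just an illustrative example):

```python
# Mirror of the "node_id % nodes_per_pkg" step in the quoted
# x86_init_topo_ids(): QEMU numbers NUMA nodes sequentially across the
# machine, the APIC id needs a node index relative to its socket.
def topo_ids_from_props(socket_id, seq_node_id, nodes_per_pkg):
    return {
        "pkg_id": socket_id,
        "node_id": seq_node_id % max(nodes_per_pkg, 1),
    }

# 2 sockets, 2 nodes per socket: sequential nodes 0,1,2,3 become
# (pkg 0, node 0), (pkg 0, node 1), (pkg 1, node 0), (pkg 1, node 1)
for seq in range(4):
    print(topo_ids_from_props(seq // 2, seq, nodes_per_pkg=2))
```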



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from CpuInstanceProperties
  2020-07-13 19:30             ` Babu Moger
@ 2020-07-14 16:41               ` Igor Mammedov
  2020-07-14 17:26                 ` Babu Moger
  2020-07-24 17:05               ` Igor Mammedov
  1 sibling, 1 reply; 23+ messages in thread
From: Igor Mammedov @ 2020-07-14 16:41 UTC (permalink / raw)
  To: Babu Moger; +Cc: pbonzini, rth, qemu-devel, ehabkost

On Mon, 13 Jul 2020 14:30:29 -0500
Babu Moger <babu.moger@amd.com> wrote:

> > -----Original Message-----
> > From: Igor Mammedov <imammedo@redhat.com>
> > Sent: Monday, July 13, 2020 12:32 PM
> > To: Moger, Babu <Babu.Moger@amd.com>
> > Cc: pbonzini@redhat.com; rth@twiddle.net; ehabkost@redhat.com; qemu-
> > devel@nongnu.org
> > Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> > CpuInstanceProperties
> > 
> > On Mon, 13 Jul 2020 11:43:33 -0500
> > Babu Moger <babu.moger@amd.com> wrote:
> >   
> > > On 7/13/20 11:17 AM, Igor Mammedov wrote:  
> > > > On Mon, 13 Jul 2020 10:02:22 -0500
> > > > Babu Moger <babu.moger@amd.com> wrote:
> > > >  
> > > >>> -----Original Message-----
> > > >>> From: Igor Mammedov <imammedo@redhat.com>
> > > >>> Sent: Monday, July 13, 2020 4:08 AM
> > > >>> To: Moger, Babu <Babu.Moger@amd.com>
> > > >>> Cc: pbonzini@redhat.com; rth@twiddle.net; ehabkost@redhat.com;
> > > >>> qemu- devel@nongnu.org
> > > >>> Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> > > >>> CpuInstanceProperties  
> > > > [...]  
> > > >>>> +
> > > >>>> +/*
> > > >>>> + * Initialize topo_ids from CpuInstanceProperties
> > > >>>> + * node_id in CpuInstanceProperties(or in CPU device) is a sequential
> > > >>>> + * number, but while building the topology  
> > > >>>  
> > > >>>> we need to separate it for
> > > >>>> + * each socket(mod nodes_per_pkg).  
> > > >>> could you clarify a bit more on why this is necessary?  
> > > >>
> > > >> If you have two sockets and 4 numa nodes, node_id in
> > > >> CpuInstanceProperties will be number sequentially as 0, 1, 2, 3.
> > > >> But in EPYC topology, it will be 0, 1, 0, 1 (basically node_id mod
> > > >> number of nodes per socket).
> > > >
> > > > I'm confused, let's suppose we have 2 EPYC sockets with 2 nodes per
> > > > socket so APIC id would be composed like:
> > > >
> > > >  1st socket
> > > >    pkg_id(0) | node_id(0)
> > > >    pkg_id(0) | node_id(1)
> > > >
> > > >  2nd socket
> > > >    pkg_id(1) | node_id(0)
> > > >    pkg_id(1) | node_id(1)
> > > >
> > > > if that's the case, then EPYC's node_id here doesn't look like a
> > > > NUMA node in the sense it's usually used (above config would have 4
> > > > different memory controllers => 4 conventional NUMA nodes).  
> > >
> > > EPYC model uses a combination of socket id and node id to identify the
> > > numa nodes. So, it internally uses all the information.
> > 
> > well with above values, EPYC's node_id doesn't look like it's specifying a
> > machine numa node, but rather a node index within single socket. In which case,
> > it doesn't make much sense calling it NUMA node_id, it's rather some index
> > within a socket. (it starts looking like terminology is all mixed up)
> > 
> > If you have access to a multi-socket EPYC machine, can you dump and post here
> > its apic ids, pls?  
> 
> Here is the output from my EPYC machine with 2 sockets and 8 nodes in
> total (SMT disabled). The cpus 0-31 are in socket 0 and cpus 32-63 in
> socket 1.
> 
> # lscpu
> Architecture:        x86_64
> CPU op-mode(s):      32-bit, 64-bit
> Byte Order:          Little Endian
> CPU(s):              64
> On-line CPU(s) list: 0-63
> Thread(s) per core:  1
> Core(s) per socket:  32
> Socket(s):           2
> NUMA node(s):        8
> Vendor ID:           AuthenticAMD
> CPU family:          23
> Model:               1
> Model name:          AMD Eng Sample: 1S1901A4VIHF5_30/19_N
> Stepping:            2
> CPU MHz:             2379.233
> CPU max MHz:         1900.0000
> CPU min MHz:         1200.0000
> BogoMIPS:            3792.81
> Virtualization:      AMD-V
> L1d cache:           32K
> L1i cache:           64K
> L2 cache:            512K
> L3 cache:            8192K
> NUMA node0 CPU(s):   0-7
> NUMA node1 CPU(s):   8-15
> NUMA node2 CPU(s):   16-23
> NUMA node3 CPU(s):   24-31
> NUMA node4 CPU(s):   32-39
> NUMA node5 CPU(s):   40-47
> NUMA node6 CPU(s):   48-55
> NUMA node7 CPU(s):   56-63
> 
> Here is the output of #cpuid  -l 0x8000001e  -r
> 
> You may want to refer to
> https://www.amd.com/system/files/TechDocs/54945_3.03_ppr_ZP_B2_pub.zip
> (section 2.1.12.2.1.3 ApicId Enumeration Requirements).
> Note that this is a general guideline. We tried to generalize in qemu as
> much as possible. It is a bit complex.
Thanks, I'll look into it.
Can you also dump SRAT table from that machine, please?
I'd like to see CPUs relation to NUMA nodes described in SRAT.

> 
> CPU 0:
>    0x8000001e 0x00: eax=0x00000000 ebx=0x00000100 ecx=0x00000300
> edx=0x00000000
> CPU 1:
>    0x8000001e 0x00: eax=0x00000002 ebx=0x00000101 ecx=0x00000300
> edx=0x00000000
> CPU 2:
>    0x8000001e 0x00: eax=0x00000004 ebx=0x00000102 ecx=0x00000300
> edx=0x00000000
> CPU 3:
>    0x8000001e 0x00: eax=0x00000006 ebx=0x00000103 ecx=0x00000300
> edx=0x00000000
> CPU 4:
>    0x8000001e 0x00: eax=0x00000008 ebx=0x00000104 ecx=0x00000300
> edx=0x00000000
> CPU 5:
>    0x8000001e 0x00: eax=0x0000000a ebx=0x00000105 ecx=0x00000300
> edx=0x00000000
> CPU 6:
>    0x8000001e 0x00: eax=0x0000000c ebx=0x00000106 ecx=0x00000300
> edx=0x00000000
> CPU 7:
>    0x8000001e 0x00: eax=0x0000000e ebx=0x00000107 ecx=0x00000300
> edx=0x00000000
> CPU 8:
>    0x8000001e 0x00: eax=0x00000010 ebx=0x00000108 ecx=0x00000301
> edx=0x00000000
> CPU 9:
>    0x8000001e 0x00: eax=0x00000012 ebx=0x00000109 ecx=0x00000301
> edx=0x00000000
> CPU 10:
>    0x8000001e 0x00: eax=0x00000014 ebx=0x0000010a ecx=0x00000301
> edx=0x00000000
> CPU 11:
>    0x8000001e 0x00: eax=0x00000016 ebx=0x0000010b ecx=0x00000301
> edx=0x00000000
> CPU 12:
>    0x8000001e 0x00: eax=0x00000018 ebx=0x0000010c ecx=0x00000301
> edx=0x00000000
> CPU 13:
>    0x8000001e 0x00: eax=0x0000001a ebx=0x0000010d ecx=0x00000301
> edx=0x00000000
> CPU 14:
>    0x8000001e 0x00: eax=0x0000001c ebx=0x0000010e ecx=0x00000301
> edx=0x00000000
> CPU 15:
>    0x8000001e 0x00: eax=0x0000001e ebx=0x0000010f ecx=0x00000301
> edx=0x00000000
> CPU 16:
>    0x8000001e 0x00: eax=0x00000020 ebx=0x00000110 ecx=0x00000302
> edx=0x00000000
> CPU 17:
>    0x8000001e 0x00: eax=0x00000022 ebx=0x00000111 ecx=0x00000302
> edx=0x00000000
> CPU 18:
>    0x8000001e 0x00: eax=0x00000024 ebx=0x00000112 ecx=0x00000302
> edx=0x00000000
> CPU 19:
>    0x8000001e 0x00: eax=0x00000026 ebx=0x00000113 ecx=0x00000302
> edx=0x00000000
> CPU 20:
>    0x8000001e 0x00: eax=0x00000028 ebx=0x00000114 ecx=0x00000302
> edx=0x00000000
> CPU 21:
>    0x8000001e 0x00: eax=0x0000002a ebx=0x00000115 ecx=0x00000302
> edx=0x00000000
> CPU 22:
>    0x8000001e 0x00: eax=0x0000002c ebx=0x00000116 ecx=0x00000302
> edx=0x00000000
> CPU 23:
>    0x8000001e 0x00: eax=0x0000002e ebx=0x00000117 ecx=0x00000302
> edx=0x00000000
> CPU 24:
>    0x8000001e 0x00: eax=0x00000030 ebx=0x00000118 ecx=0x00000303
> edx=0x00000000
> CPU 25:
>    0x8000001e 0x00: eax=0x00000032 ebx=0x00000119 ecx=0x00000303
> edx=0x00000000
> CPU 26:
>    0x8000001e 0x00: eax=0x00000034 ebx=0x0000011a ecx=0x00000303
> edx=0x00000000
> CPU 27:
>    0x8000001e 0x00: eax=0x00000036 ebx=0x0000011b ecx=0x00000303
> edx=0x00000000
> CPU 28:
>    0x8000001e 0x00: eax=0x00000038 ebx=0x0000011c ecx=0x00000303
> edx=0x00000000
> CPU 29:
>    0x8000001e 0x00: eax=0x0000003a ebx=0x0000011d ecx=0x00000303
> edx=0x00000000
> CPU 30:
>    0x8000001e 0x00: eax=0x0000003c ebx=0x0000011e ecx=0x00000303
> edx=0x00000000
> CPU 31:
>    0x8000001e 0x00: eax=0x0000003e ebx=0x0000011f ecx=0x00000303
> edx=0x00000000
> CPU 32:
>    0x8000001e 0x00: eax=0x00000040 ebx=0x00000100 ecx=0x00000304
> edx=0x00000000
> CPU 33:
>    0x8000001e 0x00: eax=0x00000042 ebx=0x00000101 ecx=0x00000304
> edx=0x00000000
> CPU 34:
>    0x8000001e 0x00: eax=0x00000044 ebx=0x00000102 ecx=0x00000304
> edx=0x00000000
> CPU 35:
>    0x8000001e 0x00: eax=0x00000046 ebx=0x00000103 ecx=0x00000304
> edx=0x00000000
> CPU 36:
>    0x8000001e 0x00: eax=0x00000048 ebx=0x00000104 ecx=0x00000304
> edx=0x00000000
> CPU 37:
>    0x8000001e 0x00: eax=0x0000004a ebx=0x00000105 ecx=0x00000304
> edx=0x00000000
> CPU 38:
>    0x8000001e 0x00: eax=0x0000004c ebx=0x00000106 ecx=0x00000304
> edx=0x00000000
> CPU 39:
>    0x8000001e 0x00: eax=0x0000004e ebx=0x00000107 ecx=0x00000304
> edx=0x00000000
> CPU 40:
>    0x8000001e 0x00: eax=0x00000050 ebx=0x00000108 ecx=0x00000305
> edx=0x00000000
> CPU 41:
>    0x8000001e 0x00: eax=0x00000052 ebx=0x00000109 ecx=0x00000305
> edx=0x00000000
> CPU 42:
>    0x8000001e 0x00: eax=0x00000054 ebx=0x0000010a ecx=0x00000305
> edx=0x00000000
> CPU 43:
>    0x8000001e 0x00: eax=0x00000056 ebx=0x0000010b ecx=0x00000305
> edx=0x00000000
> CPU 44:
>    0x8000001e 0x00: eax=0x00000058 ebx=0x0000010c ecx=0x00000305
> edx=0x00000000
> CPU 45:
>    0x8000001e 0x00: eax=0x0000005a ebx=0x0000010d ecx=0x00000305
> edx=0x00000000
> CPU 46:
>    0x8000001e 0x00: eax=0x0000005c ebx=0x0000010e ecx=0x00000305
> edx=0x00000000
> CPU 47:
>    0x8000001e 0x00: eax=0x0000005e ebx=0x0000010f ecx=0x00000305
> edx=0x00000000
> CPU 48:
>    0x8000001e 0x00: eax=0x00000060 ebx=0x00000110 ecx=0x00000306
> edx=0x00000000
> CPU 49:
>    0x8000001e 0x00: eax=0x00000062 ebx=0x00000111 ecx=0x00000306
> edx=0x00000000
> CPU 50:
>    0x8000001e 0x00: eax=0x00000064 ebx=0x00000112 ecx=0x00000306
> edx=0x00000000
> CPU 51:
>    0x8000001e 0x00: eax=0x00000066 ebx=0x00000113 ecx=0x00000306
> edx=0x00000000
> CPU 52:
>    0x8000001e 0x00: eax=0x00000068 ebx=0x00000114 ecx=0x00000306
> edx=0x00000000
> CPU 53:
>    0x8000001e 0x00: eax=0x0000006a ebx=0x00000115 ecx=0x00000306
> edx=0x00000000
> CPU 54:
>    0x8000001e 0x00: eax=0x0000006c ebx=0x00000116 ecx=0x00000306
> edx=0x00000000
> CPU 55:
>    0x8000001e 0x00: eax=0x0000006e ebx=0x00000117 ecx=0x00000306
> edx=0x00000000
> CPU 56:
>    0x8000001e 0x00: eax=0x00000070 ebx=0x00000118 ecx=0x00000307
> edx=0x00000000
> CPU 57:
>    0x8000001e 0x00: eax=0x00000072 ebx=0x00000119 ecx=0x00000307
> edx=0x00000000
> CPU 58:
>    0x8000001e 0x00: eax=0x00000074 ebx=0x0000011a ecx=0x00000307
> edx=0x00000000
> CPU 59:
>    0x8000001e 0x00: eax=0x00000076 ebx=0x0000011b ecx=0x00000307
> edx=0x00000000
> CPU 60:
>    0x8000001e 0x00: eax=0x00000078 ebx=0x0000011c ecx=0x00000307
> edx=0x00000000
> CPU 61:
>    0x8000001e 0x00: eax=0x0000007a ebx=0x0000011d ecx=0x00000307
> edx=0x00000000
> CPU 62:
>    0x8000001e 0x00: eax=0x0000007c ebx=0x0000011e ecx=0x00000307
> edx=0x00000000
> CPU 63:
>    0x8000001e 0x00: eax=0x0000007e ebx=0x0000011f ecx=0x00000307
> edx=0x00000000
> 
> >   
> > >  
> > > >
> > > > I wonder if linux guest actually uses node_id encoded in apic id for
> > > > configuring/checking numa structures, or it just uses whatever ACPI
> > > > SRAT table provided.
> > > >  
> > > >>>> + */
> > > >>>> +static inline void x86_init_topo_ids(X86CPUTopoInfo *topo_info,
> > > >>>> +                                     CpuInstanceProperties props,
> > > >>>> +                                     X86CPUTopoIDs *topo_ids) {
> > > >>>> +    topo_ids->smt_id = props.has_thread_id ? props.thread_id : 0;
> > > >>>> +    topo_ids->core_id = props.has_core_id ? props.core_id : 0;
> > > >>>> +    topo_ids->die_id = props.has_die_id ? props.die_id : 0;
> > > >>>> +    topo_ids->node_id = props.has_node_id ?
> > > >>>> +                        props.node_id % MAX(topo_info->nodes_per_pkg, 1) : 0;
> > > >>>> +    topo_ids->pkg_id = props.has_socket_id ? props.socket_id :
> > > >>>> +0; }
> > > >>>>  /*
> > > >>>>   * Make APIC ID for the CPU 'cpu_index'
> > > >>>>   *
> > > >>>>  
> > > >>  
> > > >  
> > >  
> 
> 



^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH v2 1/3] hw/i386: Initialize topo_ids from CpuInstanceProperties
  2020-07-14 16:41               ` Igor Mammedov
@ 2020-07-14 17:26                 ` Babu Moger
  2020-07-14 18:26                   ` Igor Mammedov
  0 siblings, 1 reply; 23+ messages in thread
From: Babu Moger @ 2020-07-14 17:26 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: pbonzini, rth, qemu-devel, ehabkost



> -----Original Message-----
> From: Igor Mammedov <imammedo@redhat.com>
> Sent: Tuesday, July 14, 2020 11:42 AM
> To: Moger, Babu <Babu.Moger@amd.com>
> Cc: qemu-devel@nongnu.org; pbonzini@redhat.com; ehabkost@redhat.com;
> rth@twiddle.net
> Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> CpuInstanceProperties
> 
> On Mon, 13 Jul 2020 14:30:29 -0500
> Babu Moger <babu.moger@amd.com> wrote:
> 
> > > -----Original Message-----
> > > From: Igor Mammedov <imammedo@redhat.com>
> > > Sent: Monday, July 13, 2020 12:32 PM
> > > To: Moger, Babu <Babu.Moger@amd.com>
> > > Cc: pbonzini@redhat.com; rth@twiddle.net; ehabkost@redhat.com; qemu-
> > > devel@nongnu.org
> > > Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> > > CpuInstanceProperties
> > >
> > > On Mon, 13 Jul 2020 11:43:33 -0500
> > > Babu Moger <babu.moger@amd.com> wrote:
> > >
> > > > On 7/13/20 11:17 AM, Igor Mammedov wrote:
> > > > > On Mon, 13 Jul 2020 10:02:22 -0500 Babu Moger
> > > > > <babu.moger@amd.com> wrote:
> > > > >
> > > > >>> -----Original Message-----
> > > > >>> From: Igor Mammedov <imammedo@redhat.com>
> > > > >>> Sent: Monday, July 13, 2020 4:08 AM
> > > > >>> To: Moger, Babu <Babu.Moger@amd.com>
> > > > >>> Cc: pbonzini@redhat.com; rth@twiddle.net; ehabkost@redhat.com;
> > > > >>> qemu- devel@nongnu.org
> > > > >>> Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> > > > >>> CpuInstanceProperties
> > > > > [...]
> > > > >>>> +
> > > > >>>> +/*
> > > > >>>> + * Initialize topo_ids from CpuInstanceProperties
> > > > >>>> + * node_id in CpuInstanceProperties(or in CPU device) is a
> > > > >>>> +sequential
> > > > >>>> + * number, but while building the topology
> > > > >>>
> > > > >>>> we need to separate it for
> > > > >>>> + * each socket(mod nodes_per_pkg).
> > > > >>> could you clarify a bit more on why this is necessary?
> > > > >>
> > > > >> If you have two sockets and 4 numa nodes, node_id in
> > > > >> CpuInstanceProperties will be number sequentially as 0, 1, 2, 3.
> > > > >> But in EPYC topology, it will be 0, 1, 0, 1 (basically node_id
> > > > >> mod number of nodes per socket).
> > > > >
> > > > > I'm confused, let's suppose we have 2 EPYC sockets with 2 nodes
> > > > > per socket so APIC id would be composed like:
> > > > >
> > > > >  1st socket
> > > > >    pkg_id(0) | node_id(0)
> > > > >    pkg_id(0) | node_id(1)
> > > > >
> > > > >  2nd socket
> > > > >    pkg_id(1) | node_id(0)
> > > > >    pkg_id(1) | node_id(1)
> > > > >
> > > > > if that's the case, then EPYC's node_id here doesn't look like a
> > > > > NUMA node in the sense it's usually used (above config would
> > > > > have 4 different memory controllers => 4 conventional NUMA nodes).
> > > >
> > > > EPYC model uses a combination of socket id and node id to identify
> > > > the numa nodes. So, it internally uses all the information.
> > >
> > > well with above values, EPYC's node_id doesn't look like it's
> > > specifying a machine numa node, but rather a node index within
> > > single socket. In which case, it doesn't make much sense calling it
> > > NUMA node_id, it's rather some index within a socket. (it starts
> > > looking like terminology is all mixed up)
> > >
> > > If you have access to a multi-socket EPYC machine, can you dump and
> > > post here its apic ids, pls?
> >
> > Here is the output from my EPYC machine with 2 sockets and 8 nodes in
> > total (SMT disabled). The cpus 0-31 are in socket 0 and cpus 32-63 in
> > socket 1.
> >
> > # lscpu
> > Architecture:        x86_64
> > CPU op-mode(s):      32-bit, 64-bit
> > Byte Order:          Little Endian
> > CPU(s):              64
> > On-line CPU(s) list: 0-63
> > Thread(s) per core:  1
> > Core(s) per socket:  32
> > Socket(s):           2
> > NUMA node(s):        8
> > Vendor ID:           AuthenticAMD
> > CPU family:          23
> > Model:               1
> > Model name:          AMD Eng Sample: 1S1901A4VIHF5_30/19_N
> > Stepping:            2
> > CPU MHz:             2379.233
> > CPU max MHz:         1900.0000
> > CPU min MHz:         1200.0000
> > BogoMIPS:            3792.81
> > Virtualization:      AMD-V
> > L1d cache:           32K
> > L1i cache:           64K
> > L2 cache:            512K
> > L3 cache:            8192K
> > NUMA node0 CPU(s):   0-7
> > NUMA node1 CPU(s):   8-15
> > NUMA node2 CPU(s):   16-23
> > NUMA node3 CPU(s):   24-31
> > NUMA node4 CPU(s):   32-39
> > NUMA node5 CPU(s):   40-47
> > NUMA node6 CPU(s):   48-55
> > NUMA node7 CPU(s):   56-63
> >
> > Here is the output of #cpuid  -l 0x8000001e  -r
> >
> > You may want to refer to
> > https://www.amd.com/system/files/TechDocs/54945_3.03_ppr_ZP_B2_pub.zip
> > (section 2.1.12.2.1.3 ApicId Enumeration Requirements).
> > Note that this is a general guideline. We tried to generalize in qemu
> > as much as possible. It is a bit complex.
> Thanks, I'll look into it.
> Can you also dump SRAT table from that machine, please?
> I'd like to see CPUs relation to NUMA nodes described in SRAT.

Hope this helps. Got this from the acpidump tool.
Let me know if you have any other tool.

SRAT @ 0x0000000000000000
    0000: 53 52 41 54 C0 09 00 00 03 64 41 4D 44 00 00 00 SRAT.....dAMD...
    0010: 44 49 45 53 45 4C 20 20 01 00 00 00 41 4D 44 20  DIESEL  ....AMD
    0020: 01 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
    0030: 00 10 00 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
    0040: 00 10 00 01 01 00 00 00 00 00 00 00 00 00 00 00  ................
    0050: 00 10 00 02 01 00 00 00 00 00 00 00 00 00 00 00  ................
    0060: 00 10 00 03 01 00 00 00 00 00 00 00 00 00 00 00  ................
    0070: 00 10 00 04 01 00 00 00 00 00 00 00 00 00 00 00  ................
    0080: 00 10 00 05 01 00 00 00 00 00 00 00 00 00 00 00  ................
    0090: 00 10 00 06 01 00 00 00 00 00 00 00 00 00 00 00  ................
    00A0: 00 10 00 07 01 00 00 00 00 00 00 00 00 00 00 00  ................
    00B0: 00 10 00 08 01 00 00 00 00 00 00 00 00 00 00 00  ................
    00C0: 00 10 00 09 01 00 00 00 00 00 00 00 00 00 00 00  ................
    00D0: 00 10 00 0A 01 00 00 00 00 00 00 00 00 00 00 00  ................
    00E0: 00 10 00 0B 01 00 00 00 00 00 00 00 00 00 00 00  ................
    00F0: 00 10 00 0C 01 00 00 00 00 00 00 00 00 00 00 00  ................
    0100: 00 10 00 0D 01 00 00 00 00 00 00 00 00 00 00 00  ................
    0110: 00 10 00 0E 01 00 00 00 00 00 00 00 00 00 00 00  ................
    0120: 00 10 00 0F 01 00 00 00 00 00 00 00 00 00 00 00  ................
    0130: 00 10 01 10 01 00 00 00 00 00 00 00 00 00 00 00  ................
    0140: 00 10 01 11 01 00 00 00 00 00 00 00 00 00 00 00  ................
    0150: 00 10 01 12 01 00 00 00 00 00 00 00 00 00 00 00  ................
    0160: 00 10 01 13 01 00 00 00 00 00 00 00 00 00 00 00  ................
    0170: 00 10 01 14 01 00 00 00 00 00 00 00 00 00 00 00  ................
    0180: 00 10 01 15 01 00 00 00 00 00 00 00 00 00 00 00  ................
    0190: 00 10 01 16 01 00 00 00 00 00 00 00 00 00 00 00  ................
    01A0: 00 10 01 17 01 00 00 00 00 00 00 00 00 00 00 00  ................
    01B0: 00 10 01 18 01 00 00 00 00 00 00 00 00 00 00 00  ................
    01C0: 00 10 01 19 01 00 00 00 00 00 00 00 00 00 00 00  ................
    01D0: 00 10 01 1A 01 00 00 00 00 00 00 00 00 00 00 00  ................
    01E0: 00 10 01 1B 01 00 00 00 00 00 00 00 00 00 00 00  ................
    01F0: 00 10 01 1C 01 00 00 00 00 00 00 00 00 00 00 00  ................
    0200: 00 10 01 1D 01 00 00 00 00 00 00 00 00 00 00 00  ................
    0210: 00 10 01 1E 01 00 00 00 00 00 00 00 00 00 00 00  ................
    0220: 00 10 01 1F 01 00 00 00 00 00 00 00 00 00 00 00  ................
    0230: 00 10 02 20 01 00 00 00 00 00 00 00 00 00 00 00  ... ............
    0240: 00 10 02 21 01 00 00 00 00 00 00 00 00 00 00 00  ...!............
    0250: 00 10 02 22 01 00 00 00 00 00 00 00 00 00 00 00  ..."............
    0260: 00 10 02 23 01 00 00 00 00 00 00 00 00 00 00 00  ...#............
    0270: 00 10 02 24 01 00 00 00 00 00 00 00 00 00 00 00  ...$............
    0280: 00 10 02 25 01 00 00 00 00 00 00 00 00 00 00 00  ...%............
    0290: 00 10 02 26 01 00 00 00 00 00 00 00 00 00 00 00  ...&............
    02A0: 00 10 02 27 01 00 00 00 00 00 00 00 00 00 00 00  ...'............
    02B0: 00 10 02 28 01 00 00 00 00 00 00 00 00 00 00 00  ...(............
    02C0: 00 10 02 29 01 00 00 00 00 00 00 00 00 00 00 00  ...)............
    02D0: 00 10 02 2A 01 00 00 00 00 00 00 00 00 00 00 00  ...*............
    02E0: 00 10 02 2B 01 00 00 00 00 00 00 00 00 00 00 00  ...+............
    02F0: 00 10 02 2C 01 00 00 00 00 00 00 00 00 00 00 00  ...,............
    0300: 00 10 02 2D 01 00 00 00 00 00 00 00 00 00 00 00  ...-............
    0310: 00 10 02 2E 01 00 00 00 00 00 00 00 00 00 00 00  ................
    0320: 00 10 02 2F 01 00 00 00 00 00 00 00 00 00 00 00  .../............
    0330: 00 10 03 30 01 00 00 00 00 00 00 00 00 00 00 00  ...0............
    0340: 00 10 03 31 01 00 00 00 00 00 00 00 00 00 00 00  ...1............
    0350: 00 10 03 32 01 00 00 00 00 00 00 00 00 00 00 00  ...2............
    0360: 00 10 03 33 01 00 00 00 00 00 00 00 00 00 00 00  ...3............
    0370: 00 10 03 34 01 00 00 00 00 00 00 00 00 00 00 00  ...4............
    0380: 00 10 03 35 01 00 00 00 00 00 00 00 00 00 00 00  ...5............
    0390: 00 10 03 36 01 00 00 00 00 00 00 00 00 00 00 00  ...6............
    03A0: 00 10 03 37 01 00 00 00 00 00 00 00 00 00 00 00  ...7............
    03B0: 00 10 03 38 01 00 00 00 00 00 00 00 00 00 00 00  ...8............
    03C0: 00 10 03 39 01 00 00 00 00 00 00 00 00 00 00 00  ...9............
    03D0: 00 10 03 3A 01 00 00 00 00 00 00 00 00 00 00 00  ...:............
    03E0: 00 10 03 3B 01 00 00 00 00 00 00 00 00 00 00 00  ...;............
    03F0: 00 10 03 3C 01 00 00 00 00 00 00 00 00 00 00 00  ...<............
    0400: 00 10 03 3D 01 00 00 00 00 00 00 00 00 00 00 00  ...=............
    0410: 00 10 03 3E 01 00 00 00 00 00 00 00 00 00 00 00  ...>............
    0420: 00 10 03 3F 01 00 00 00 00 00 00 00 00 00 00 00  ...?............
    0430: 00 10 04 40 01 00 00 00 00 00 00 00 00 00 00 00  ...@............
    0440: 00 10 04 41 01 00 00 00 00 00 00 00 00 00 00 00  ...A............
    0450: 00 10 04 42 01 00 00 00 00 00 00 00 00 00 00 00  ...B............
    0460: 00 10 04 43 01 00 00 00 00 00 00 00 00 00 00 00  ...C............
    0470: 00 10 04 44 01 00 00 00 00 00 00 00 00 00 00 00  ...D............
    0480: 00 10 04 45 01 00 00 00 00 00 00 00 00 00 00 00  ...E............
    0490: 00 10 04 46 01 00 00 00 00 00 00 00 00 00 00 00  ...F............
    04A0: 00 10 04 47 01 00 00 00 00 00 00 00 00 00 00 00  ...G............
    04B0: 00 10 04 48 01 00 00 00 00 00 00 00 00 00 00 00  ...H............
    04C0: 00 10 04 49 01 00 00 00 00 00 00 00 00 00 00 00  ...I............
    04D0: 00 10 04 4A 01 00 00 00 00 00 00 00 00 00 00 00  ...J............
    04E0: 00 10 04 4B 01 00 00 00 00 00 00 00 00 00 00 00  ...K............
    04F0: 00 10 04 4C 01 00 00 00 00 00 00 00 00 00 00 00  ...L............
    0500: 00 10 04 4D 01 00 00 00 00 00 00 00 00 00 00 00  ...M............
    0510: 00 10 04 4E 01 00 00 00 00 00 00 00 00 00 00 00  ...N............
    0520: 00 10 04 4F 01 00 00 00 00 00 00 00 00 00 00 00  ...O............
    0530: 00 10 05 50 01 00 00 00 00 00 00 00 00 00 00 00  ...P............
    0540: 00 10 05 51 01 00 00 00 00 00 00 00 00 00 00 00  ...Q............
    0550: 00 10 05 52 01 00 00 00 00 00 00 00 00 00 00 00  ...R............
    0560: 00 10 05 53 01 00 00 00 00 00 00 00 00 00 00 00  ...S............
    0570: 00 10 05 54 01 00 00 00 00 00 00 00 00 00 00 00  ...T............
    0580: 00 10 05 55 01 00 00 00 00 00 00 00 00 00 00 00  ...U............
    0590: 00 10 05 56 01 00 00 00 00 00 00 00 00 00 00 00  ...V............
    05A0: 00 10 05 57 01 00 00 00 00 00 00 00 00 00 00 00  ...W............
    05B0: 00 10 05 58 01 00 00 00 00 00 00 00 00 00 00 00  ...X............
    05C0: 00 10 05 59 01 00 00 00 00 00 00 00 00 00 00 00  ...Y............
    05D0: 00 10 05 5A 01 00 00 00 00 00 00 00 00 00 00 00  ...Z............
    05E0: 00 10 05 5B 01 00 00 00 00 00 00 00 00 00 00 00  ...[............
    05F0: 00 10 05 5C 01 00 00 00 00 00 00 00 00 00 00 00  ...\............
    0600: 00 10 05 5D 01 00 00 00 00 00 00 00 00 00 00 00  ...]............
    0610: 00 10 05 5E 01 00 00 00 00 00 00 00 00 00 00 00  ...^............
    0620: 00 10 05 5F 01 00 00 00 00 00 00 00 00 00 00 00  ..._............
    0630: 00 10 06 60 01 00 00 00 00 00 00 00 00 00 00 00  ...`............
    0640: 00 10 06 61 01 00 00 00 00 00 00 00 00 00 00 00  ...a............
    0650: 00 10 06 62 01 00 00 00 00 00 00 00 00 00 00 00  ...b............
    0660: 00 10 06 63 01 00 00 00 00 00 00 00 00 00 00 00  ...c............
    0670: 00 10 06 64 01 00 00 00 00 00 00 00 00 00 00 00  ...d............
    0680: 00 10 06 65 01 00 00 00 00 00 00 00 00 00 00 00  ...e............
    0690: 00 10 06 66 01 00 00 00 00 00 00 00 00 00 00 00  ...f............
    06A0: 00 10 06 67 01 00 00 00 00 00 00 00 00 00 00 00  ...g............
    06B0: 00 10 06 68 01 00 00 00 00 00 00 00 00 00 00 00  ...h............
    06C0: 00 10 06 69 01 00 00 00 00 00 00 00 00 00 00 00  ...i............
    06D0: 00 10 06 6A 01 00 00 00 00 00 00 00 00 00 00 00  ...j............
    06E0: 00 10 06 6B 01 00 00 00 00 00 00 00 00 00 00 00  ...k............
    06F0: 00 10 06 6C 01 00 00 00 00 00 00 00 00 00 00 00  ...l............
    0700: 00 10 06 6D 01 00 00 00 00 00 00 00 00 00 00 00  ...m............
    0710: 00 10 06 6E 01 00 00 00 00 00 00 00 00 00 00 00  ...n............
    0720: 00 10 06 6F 01 00 00 00 00 00 00 00 00 00 00 00  ...o............
    0730: 00 10 07 70 01 00 00 00 00 00 00 00 00 00 00 00  ...p............
    0740: 00 10 07 71 01 00 00 00 00 00 00 00 00 00 00 00  ...q............
    0750: 00 10 07 72 01 00 00 00 00 00 00 00 00 00 00 00  ...r............
    0760: 00 10 07 73 01 00 00 00 00 00 00 00 00 00 00 00  ...s............
    0770: 00 10 07 74 01 00 00 00 00 00 00 00 00 00 00 00  ...t............
    0780: 00 10 07 75 01 00 00 00 00 00 00 00 00 00 00 00  ...u............
    0790: 00 10 07 76 01 00 00 00 00 00 00 00 00 00 00 00  ...v............
    07A0: 00 10 07 77 01 00 00 00 00 00 00 00 00 00 00 00  ...w............
    07B0: 00 10 07 78 01 00 00 00 00 00 00 00 00 00 00 00  ...x............
    07C0: 00 10 07 79 01 00 00 00 00 00 00 00 00 00 00 00  ...y............
    07D0: 00 10 07 7A 01 00 00 00 00 00 00 00 00 00 00 00  ...z............
    07E0: 00 10 07 7B 01 00 00 00 00 00 00 00 00 00 00 00  ...{............
    07F0: 00 10 07 7C 01 00 00 00 00 00 00 00 00 00 00 00  ...|............
    0800: 00 10 07 7D 01 00 00 00 00 00 00 00 00 00 00 00  ...}............
    0810: 00 10 07 7E 01 00 00 00 00 00 00 00 00 00 00 00  ...~............
    0820: 00 10 07 7F 01 00 00 00 00 00 00 00 00 00 00 00  ................
    0830: 01 28 00 00 00 00 00 00 00 00 00 00 00 00 00 00  .(..............
    0840: 00 00 0A 00 00 00 00 00 00 00 00 00 01 00 00 00  ................
    0850: 00 00 00 00 00 00 00 00 01 28 00 00 00 00 00 00  .........(......
    0860: 00 00 10 00 00 00 00 00 00 00 F0 7F 00 00 00 00  ................
    0870: 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
    0880: 01 28 00 00 00 00 00 00 00 00 00 00 01 00 00 00  .(..............
    0890: 00 00 00 80 03 00 00 00 00 00 00 00 01 00 00 00  ................
    08A0: 00 00 00 00 00 00 00 00 01 28 01 00 00 00 00 00  .........(......
    08B0: 00 00 00 80 04 00 00 00 00 00 00 00 04 00 00 00  ................
    08C0: 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
    08D0: 01 28 02 00 00 00 00 00 00 00 00 80 08 00 00 00  .(..............
    08E0: 00 00 00 00 04 00 00 00 00 00 00 00 01 00 00 00  ................
    08F0: 00 00 00 00 00 00 00 00 01 28 03 00 00 00 00 00  .........(......
    0900: 00 00 00 80 0C 00 00 00 00 00 00 00 04 00 00 00  ................
    0910: 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
    0920: 01 28 04 00 00 00 00 00 00 00 00 80 10 00 00 00  .(..............
    0930: 00 00 00 00 04 00 00 00 00 00 00 00 01 00 00 00  ................
    0940: 00 00 00 00 00 00 00 00 01 28 05 00 00 00 00 00  .........(......
    0950: 00 00 00 80 14 00 00 00 00 00 00 00 04 00 00 00  ................
    0960: 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
    0970: 01 28 06 00 00 00 00 00 00 00 00 80 18 00 00 00  .(..............
    0980: 00 00 00 00 04 00 00 00 00 00 00 00 01 00 00 00  ................
    0990: 00 00 00 00 00 00 00 00 01 28 07 00 00 00 00 00  .........(......
    09A0: 00 00 00 80 1C 00 00 00 00 00 00 00 04 00 00 00  ................
    09B0: 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
> 
> >
> > CPU 0:
> >    0x8000001e 0x00: eax=0x00000000 ebx=0x00000100 ecx=0x00000300
> > edx=0x00000000
> > CPU 1:
> >    0x8000001e 0x00: eax=0x00000002 ebx=0x00000101 ecx=0x00000300
> > edx=0x00000000
> > CPU 2:
> >    0x8000001e 0x00: eax=0x00000004 ebx=0x00000102 ecx=0x00000300
> > edx=0x00000000
> > CPU 3:
> >    0x8000001e 0x00: eax=0x00000006 ebx=0x00000103 ecx=0x00000300
> > edx=0x00000000
> > CPU 4:
> >    0x8000001e 0x00: eax=0x00000008 ebx=0x00000104 ecx=0x00000300
> > edx=0x00000000
> > CPU 5:
> >    0x8000001e 0x00: eax=0x0000000a ebx=0x00000105 ecx=0x00000300
> > edx=0x00000000
> > CPU 6:
> >    0x8000001e 0x00: eax=0x0000000c ebx=0x00000106 ecx=0x00000300
> > edx=0x00000000
> > CPU 7:
> >    0x8000001e 0x00: eax=0x0000000e ebx=0x00000107 ecx=0x00000300
> > edx=0x00000000
> > CPU 8:
> >    0x8000001e 0x00: eax=0x00000010 ebx=0x00000108 ecx=0x00000301
> > edx=0x00000000
> > CPU 9:
> >    0x8000001e 0x00: eax=0x00000012 ebx=0x00000109 ecx=0x00000301
> > edx=0x00000000
> > CPU 10:
> >    0x8000001e 0x00: eax=0x00000014 ebx=0x0000010a ecx=0x00000301
> > edx=0x00000000
> > CPU 11:
> >    0x8000001e 0x00: eax=0x00000016 ebx=0x0000010b ecx=0x00000301
> > edx=0x00000000
> > CPU 12:
> >    0x8000001e 0x00: eax=0x00000018 ebx=0x0000010c ecx=0x00000301
> > edx=0x00000000
> > CPU 13:
> >    0x8000001e 0x00: eax=0x0000001a ebx=0x0000010d ecx=0x00000301
> > edx=0x00000000
> > CPU 14:
> >    0x8000001e 0x00: eax=0x0000001c ebx=0x0000010e ecx=0x00000301
> > edx=0x00000000
> > CPU 15:
> >    0x8000001e 0x00: eax=0x0000001e ebx=0x0000010f ecx=0x00000301
> > edx=0x00000000
> > CPU 16:
> >    0x8000001e 0x00: eax=0x00000020 ebx=0x00000110 ecx=0x00000302
> > edx=0x00000000
> > CPU 17:
> >    0x8000001e 0x00: eax=0x00000022 ebx=0x00000111 ecx=0x00000302
> > edx=0x00000000
> > CPU 18:
> >    0x8000001e 0x00: eax=0x00000024 ebx=0x00000112 ecx=0x00000302
> > edx=0x00000000
> > CPU 19:
> >    0x8000001e 0x00: eax=0x00000026 ebx=0x00000113 ecx=0x00000302
> > edx=0x00000000
> > CPU 20:
> >    0x8000001e 0x00: eax=0x00000028 ebx=0x00000114 ecx=0x00000302
> > edx=0x00000000
> > CPU 21:
> >    0x8000001e 0x00: eax=0x0000002a ebx=0x00000115 ecx=0x00000302
> > edx=0x00000000
> > CPU 22:
> >    0x8000001e 0x00: eax=0x0000002c ebx=0x00000116 ecx=0x00000302
> > edx=0x00000000
> > CPU 23:
> >    0x8000001e 0x00: eax=0x0000002e ebx=0x00000117 ecx=0x00000302
> > edx=0x00000000
> > CPU 24:
> >    0x8000001e 0x00: eax=0x00000030 ebx=0x00000118 ecx=0x00000303
> > edx=0x00000000
> > CPU 25:
> >    0x8000001e 0x00: eax=0x00000032 ebx=0x00000119 ecx=0x00000303
> > edx=0x00000000
> > CPU 26:
> >    0x8000001e 0x00: eax=0x00000034 ebx=0x0000011a ecx=0x00000303
> > edx=0x00000000
> > CPU 27:
> >    0x8000001e 0x00: eax=0x00000036 ebx=0x0000011b ecx=0x00000303
> > edx=0x00000000
> > CPU 28:
> >    0x8000001e 0x00: eax=0x00000038 ebx=0x0000011c ecx=0x00000303
> > edx=0x00000000
> > CPU 29:
> >    0x8000001e 0x00: eax=0x0000003a ebx=0x0000011d ecx=0x00000303
> > edx=0x00000000
> > CPU 30:
> >    0x8000001e 0x00: eax=0x0000003c ebx=0x0000011e ecx=0x00000303
> > edx=0x00000000
> > CPU 31:
> >    0x8000001e 0x00: eax=0x0000003e ebx=0x0000011f ecx=0x00000303
> > edx=0x00000000
> > CPU 32:
> >    0x8000001e 0x00: eax=0x00000040 ebx=0x00000100 ecx=0x00000304
> > edx=0x00000000
> > CPU 33:
> >    0x8000001e 0x00: eax=0x00000042 ebx=0x00000101 ecx=0x00000304
> > edx=0x00000000
> > CPU 34:
> >    0x8000001e 0x00: eax=0x00000044 ebx=0x00000102 ecx=0x00000304
> > edx=0x00000000
> > CPU 35:
> >    0x8000001e 0x00: eax=0x00000046 ebx=0x00000103 ecx=0x00000304
> > edx=0x00000000
> > CPU 36:
> >    0x8000001e 0x00: eax=0x00000048 ebx=0x00000104 ecx=0x00000304
> > edx=0x00000000
> > CPU 37:
> >    0x8000001e 0x00: eax=0x0000004a ebx=0x00000105 ecx=0x00000304
> > edx=0x00000000
> > CPU 38:
> >    0x8000001e 0x00: eax=0x0000004c ebx=0x00000106 ecx=0x00000304
> > edx=0x00000000
> > CPU 39:
> >    0x8000001e 0x00: eax=0x0000004e ebx=0x00000107 ecx=0x00000304
> > edx=0x00000000
> > CPU 40:
> >    0x8000001e 0x00: eax=0x00000050 ebx=0x00000108 ecx=0x00000305
> > edx=0x00000000
> > CPU 41:
> >    0x8000001e 0x00: eax=0x00000052 ebx=0x00000109 ecx=0x00000305
> > edx=0x00000000
> > CPU 42:
> >    0x8000001e 0x00: eax=0x00000054 ebx=0x0000010a ecx=0x00000305
> > edx=0x00000000
> > CPU 43:
> >    0x8000001e 0x00: eax=0x00000056 ebx=0x0000010b ecx=0x00000305
> > edx=0x00000000
> > CPU 44:
> >    0x8000001e 0x00: eax=0x00000058 ebx=0x0000010c ecx=0x00000305
> > edx=0x00000000
> > CPU 45:
> >    0x8000001e 0x00: eax=0x0000005a ebx=0x0000010d ecx=0x00000305
> > edx=0x00000000
> > CPU 46:
> >    0x8000001e 0x00: eax=0x0000005c ebx=0x0000010e ecx=0x00000305
> > edx=0x00000000
> > CPU 47:
> >    0x8000001e 0x00: eax=0x0000005e ebx=0x0000010f ecx=0x00000305
> > edx=0x00000000
> > CPU 48:
> >    0x8000001e 0x00: eax=0x00000060 ebx=0x00000110 ecx=0x00000306
> > edx=0x00000000
> > CPU 49:
> >    0x8000001e 0x00: eax=0x00000062 ebx=0x00000111 ecx=0x00000306
> > edx=0x00000000
> > CPU 50:
> >    0x8000001e 0x00: eax=0x00000064 ebx=0x00000112 ecx=0x00000306
> > edx=0x00000000
> > CPU 51:
> >    0x8000001e 0x00: eax=0x00000066 ebx=0x00000113 ecx=0x00000306
> > edx=0x00000000
> > CPU 52:
> >    0x8000001e 0x00: eax=0x00000068 ebx=0x00000114 ecx=0x00000306
> > edx=0x00000000
> > CPU 53:
> >    0x8000001e 0x00: eax=0x0000006a ebx=0x00000115 ecx=0x00000306
> > edx=0x00000000
> > CPU 54:
> >    0x8000001e 0x00: eax=0x0000006c ebx=0x00000116 ecx=0x00000306
> > edx=0x00000000
> > CPU 55:
> >    0x8000001e 0x00: eax=0x0000006e ebx=0x00000117 ecx=0x00000306
> > edx=0x00000000
> > CPU 56:
> >    0x8000001e 0x00: eax=0x00000070 ebx=0x00000118 ecx=0x00000307
> > edx=0x00000000
> > CPU 57:
> >    0x8000001e 0x00: eax=0x00000072 ebx=0x00000119 ecx=0x00000307
> > edx=0x00000000
> > CPU 58:
> >    0x8000001e 0x00: eax=0x00000074 ebx=0x0000011a ecx=0x00000307
> > edx=0x00000000
> > CPU 59:
> >    0x8000001e 0x00: eax=0x00000076 ebx=0x0000011b ecx=0x00000307
> > edx=0x00000000
> > CPU 60:
> >    0x8000001e 0x00: eax=0x00000078 ebx=0x0000011c ecx=0x00000307
> > edx=0x00000000
> > CPU 61:
> >    0x8000001e 0x00: eax=0x0000007a ebx=0x0000011d ecx=0x00000307
> > edx=0x00000000
> > CPU 62:
> >    0x8000001e 0x00: eax=0x0000007c ebx=0x0000011e ecx=0x00000307
> > edx=0x00000000
> > CPU 63:
> >    0x8000001e 0x00: eax=0x0000007e ebx=0x0000011f ecx=0x00000307
> > edx=0x00000000
> >
> > >
> > > >
> > > > >
> > > > > I wonder if the linux guest actually uses the node_id encoded in
> > > > > the apic id for configuring/checking numa structures, or if it
> > > > > just uses whatever the ACPI SRAT table provides.
> > > > >
> > > > >>>> + */
> > > > >>>> +static inline void x86_init_topo_ids(X86CPUTopoInfo *topo_info,
> > > > >>>> +                                     CpuInstanceProperties props,
> > > > >>>> +                                     X86CPUTopoIDs *topo_ids)
> > > > >>>> +{
> > > > >>>> +    topo_ids->smt_id = props.has_thread_id ? props.thread_id : 0;
> > > > >>>> +    topo_ids->core_id = props.has_core_id ? props.core_id : 0;
> > > > >>>> +    topo_ids->die_id = props.has_die_id ? props.die_id : 0;
> > > > >>>> +    topo_ids->node_id = props.has_node_id ?
> > > > >>>> +        props.node_id % MAX(topo_info->nodes_per_pkg, 1) : 0;
> > > > >>>> +    topo_ids->pkg_id = props.has_socket_id ? props.socket_id : 0;
> > > > >>>> +}
> > > > >>>>  /*
> > > > >>>>   * Make APIC ID for the CPU 'cpu_index'
> > > > >>>>   *
> > > > >>>>
> > > > >>
> > > > >
> > > >
> >
> >



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from CpuInstanceProperties
  2020-07-14 17:26                 ` Babu Moger
@ 2020-07-14 18:26                   ` Igor Mammedov
  0 siblings, 0 replies; 23+ messages in thread
From: Igor Mammedov @ 2020-07-14 18:26 UTC (permalink / raw)
  To: Babu Moger; +Cc: pbonzini, rth, qemu-devel, ehabkost

On Tue, 14 Jul 2020 12:26:56 -0500
Babu Moger <babu.moger@amd.com> wrote:

> > -----Original Message-----
> > From: Igor Mammedov <imammedo@redhat.com>
> > Sent: Tuesday, July 14, 2020 11:42 AM
> > To: Moger, Babu <Babu.Moger@amd.com>
> > Cc: qemu-devel@nongnu.org; pbonzini@redhat.com; ehabkost@redhat.com;
> > rth@twiddle.net
> > Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> > CpuInstanceProperties
> > 
> > On Mon, 13 Jul 2020 14:30:29 -0500
> > Babu Moger <babu.moger@amd.com> wrote:
> >   
> > > > -----Original Message-----
> > > > From: Igor Mammedov <imammedo@redhat.com>
> > > > Sent: Monday, July 13, 2020 12:32 PM
> > > > To: Moger, Babu <Babu.Moger@amd.com>
> > > > Cc: pbonzini@redhat.com; rth@twiddle.net; ehabkost@redhat.com; qemu-
> > > > devel@nongnu.org
> > > > Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> > > > CpuInstanceProperties
> > > >
> > > > On Mon, 13 Jul 2020 11:43:33 -0500
> > > > Babu Moger <babu.moger@amd.com> wrote:
> > > >  
> > > > > On 7/13/20 11:17 AM, Igor Mammedov wrote:  
> > > > > > On Mon, 13 Jul 2020 10:02:22 -0500 Babu Moger
> > > > > > <babu.moger@amd.com> wrote:
> > > > > >  
> > > > > >>> -----Original Message-----
> > > > > >>> From: Igor Mammedov <imammedo@redhat.com>
> > > > > >>> Sent: Monday, July 13, 2020 4:08 AM
> > > > > >>> To: Moger, Babu <Babu.Moger@amd.com>
> > > > > >>> Cc: pbonzini@redhat.com; rth@twiddle.net; ehabkost@redhat.com;
> > > > > >>> qemu- devel@nongnu.org
> > > > > >>> Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> > > > > >>> CpuInstanceProperties  
> > > > > > [...]  
> > > > > >>>> +
> > > > > >>>> +/*
> > > > > >>>> + * Initialize topo_ids from CpuInstanceProperties
> > > > > >>>> + * node_id in CpuInstanceProperties(or in CPU device) is a sequential
> > > > > >>>> + * number, but while building the topology  
> > > > > >>>  
> > > > > >>>> we need to separate it for
> > > > > >>>> + * each socket(mod nodes_per_pkg).  
> > > > > >>> could you clarify a bit more on why this is necessary?  
> > > > > >>
> > > > > >> If you have two sockets and 4 numa nodes, node_id in
> > > > > >> CpuInstanceProperties will be numbered sequentially as 0, 1, 2, 3.
> > > > > >> But in EPYC topology, it will be 0, 1, 0, 1 (basically node_id mod
> > > > > >> the number of nodes per socket).
> > > > > >
> > > > > > I'm confused, let's suppose we have 2 EPYC sockets with 2 nodes
> > > > > > per socket, so the APIC id would be composed like:
> > > > > >
> > > > > >  1st socket
> > > > > >    pkg_id(0) | node_id(0)
> > > > > >    pkg_id(0) | node_id(1)
> > > > > >
> > > > > >  2nd socket
> > > > > >    pkg_id(1) | node_id(0)
> > > > > >    pkg_id(1) | node_id(1)
> > > > > >
> > > > > > if that's the case, then EPYC's node_id here doesn't look like a
> > > > > > NUMA node in the sense it's usually used (above config would
> > > > > > have 4 different memory controllers => 4 conventional NUMA nodes).  
> > > > >
> > > > > The EPYC model uses a combination of socket id and node id to
> > > > > identify the numa nodes. So, it internally uses all the information.
> > > >
> > > > well, with the above values, EPYC's node_id doesn't look like it's
> > > > specifying a machine NUMA node, but rather a node index within a
> > > > single socket. In that case, it doesn't make much sense to call it a
> > > > NUMA node_id; it's rather some index within a socket. (it starts
> > > > looking like terminology is all mixed up)
> > > >
> > > > If you have access to a multi-socket EPYC machine, can you dump and
> > > > post here its apic ids, pls?  
> > >
> > > Here is the output from my EPYC machine with 2 sockets and 8 nodes
> > > in total (SMT disabled). CPUs 0-31 are in socket 0 and CPUs 32-63 in
> > > socket 1.
> > >
> > > # lscpu
> > > Architecture:        x86_64
> > > CPU op-mode(s):      32-bit, 64-bit
> > > Byte Order:          Little Endian
> > > CPU(s):              64
> > > On-line CPU(s) list: 0-63
> > > Thread(s) per core:  1
> > > Core(s) per socket:  32
> > > Socket(s):           2
> > > NUMA node(s):        8
> > > Vendor ID:           AuthenticAMD
> > > CPU family:          23
> > > Model:               1
> > > Model name:          AMD Eng Sample: 1S1901A4VIHF5_30/19_N
> > > Stepping:            2
> > > CPU MHz:             2379.233
> > > CPU max MHz:         1900.0000
> > > CPU min MHz:         1200.0000
> > > BogoMIPS:            3792.81
> > > Virtualization:      AMD-V
> > > L1d cache:           32K
> > > L1i cache:           64K
> > > L2 cache:            512K
> > > L3 cache:            8192K
> > > NUMA node0 CPU(s):   0-7
> > > NUMA node1 CPU(s):   8-15
> > > NUMA node2 CPU(s):   16-23
> > > NUMA node3 CPU(s):   24-31
> > > NUMA node4 CPU(s):   32-39
> > > NUMA node5 CPU(s):   40-47
> > > NUMA node6 CPU(s):   48-55
> > > NUMA node7 CPU(s):   56-63
> > >
> > > Here is the output of "# cpuid -l 0x8000001e -r".
> > >
> > > You may want to refer to
> > > https://www.amd.com/system/files/TechDocs/54945_3.03_ppr_ZP_B2_pub.zip
> > > (section 2.1.12.2.1.3 ApicId Enumeration Requirements).
> > > Note that this is a general guideline. We tried to generalize in qemu
> > > as much as possible. It is a bit complex.  
> > Thanks, I'll look into it.
> > Can you also dump the SRAT table from that machine, please?
> > I'd like to see the CPUs' relation to NUMA nodes described in SRAT.  
> 
> Hope this helps. Got this from the acpidump tool.
> Let me know if you have any other tool.

# acpidump -b
and then
# iasl -d dumped_table_blob
that will produce a human-readable table

> 
> SRAT @ 0x0000000000000000
>     0000: 53 52 41 54 C0 09 00 00 03 64 41 4D 44 00 00 00 SRAT.....dAMD...
>     0010: 44 49 45 53 45 4C 20 20 01 00 00 00 41 4D 44 20  DIESEL  ....AMD
>     0020: 01 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     0030: 00 10 00 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     0040: 00 10 00 01 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     0050: 00 10 00 02 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     0060: 00 10 00 03 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     0070: 00 10 00 04 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     0080: 00 10 00 05 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     0090: 00 10 00 06 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     00A0: 00 10 00 07 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     00B0: 00 10 00 08 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     00C0: 00 10 00 09 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     00D0: 00 10 00 0A 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     00E0: 00 10 00 0B 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     00F0: 00 10 00 0C 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     0100: 00 10 00 0D 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     0110: 00 10 00 0E 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     0120: 00 10 00 0F 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     0130: 00 10 01 10 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     0140: 00 10 01 11 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     0150: 00 10 01 12 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     0160: 00 10 01 13 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     0170: 00 10 01 14 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     0180: 00 10 01 15 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     0190: 00 10 01 16 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     01A0: 00 10 01 17 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     01B0: 00 10 01 18 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     01C0: 00 10 01 19 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     01D0: 00 10 01 1A 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     01E0: 00 10 01 1B 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     01F0: 00 10 01 1C 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     0200: 00 10 01 1D 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     0210: 00 10 01 1E 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     0220: 00 10 01 1F 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     0230: 00 10 02 20 01 00 00 00 00 00 00 00 00 00 00 00  ... ............
>     0240: 00 10 02 21 01 00 00 00 00 00 00 00 00 00 00 00  ...!............
>     0250: 00 10 02 22 01 00 00 00 00 00 00 00 00 00 00 00  ..."............
>     0260: 00 10 02 23 01 00 00 00 00 00 00 00 00 00 00 00  ...#............
>     0270: 00 10 02 24 01 00 00 00 00 00 00 00 00 00 00 00  ...$............
>     0280: 00 10 02 25 01 00 00 00 00 00 00 00 00 00 00 00  ...%............
>     0290: 00 10 02 26 01 00 00 00 00 00 00 00 00 00 00 00  ...&............
>     02A0: 00 10 02 27 01 00 00 00 00 00 00 00 00 00 00 00  ...'............
>     02B0: 00 10 02 28 01 00 00 00 00 00 00 00 00 00 00 00  ...(............
>     02C0: 00 10 02 29 01 00 00 00 00 00 00 00 00 00 00 00  ...)............
>     02D0: 00 10 02 2A 01 00 00 00 00 00 00 00 00 00 00 00  ...*............
>     02E0: 00 10 02 2B 01 00 00 00 00 00 00 00 00 00 00 00  ...+............
>     02F0: 00 10 02 2C 01 00 00 00 00 00 00 00 00 00 00 00  ...,............
>     0300: 00 10 02 2D 01 00 00 00 00 00 00 00 00 00 00 00  ...-............
>     0310: 00 10 02 2E 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     0320: 00 10 02 2F 01 00 00 00 00 00 00 00 00 00 00 00  .../............
>     0330: 00 10 03 30 01 00 00 00 00 00 00 00 00 00 00 00  ...0............
>     0340: 00 10 03 31 01 00 00 00 00 00 00 00 00 00 00 00  ...1............
>     0350: 00 10 03 32 01 00 00 00 00 00 00 00 00 00 00 00  ...2............
>     0360: 00 10 03 33 01 00 00 00 00 00 00 00 00 00 00 00  ...3............
>     0370: 00 10 03 34 01 00 00 00 00 00 00 00 00 00 00 00  ...4............
>     0380: 00 10 03 35 01 00 00 00 00 00 00 00 00 00 00 00  ...5............
>     0390: 00 10 03 36 01 00 00 00 00 00 00 00 00 00 00 00  ...6............
>     03A0: 00 10 03 37 01 00 00 00 00 00 00 00 00 00 00 00  ...7............
>     03B0: 00 10 03 38 01 00 00 00 00 00 00 00 00 00 00 00  ...8............
>     03C0: 00 10 03 39 01 00 00 00 00 00 00 00 00 00 00 00  ...9............
>     03D0: 00 10 03 3A 01 00 00 00 00 00 00 00 00 00 00 00  ...:............
>     03E0: 00 10 03 3B 01 00 00 00 00 00 00 00 00 00 00 00  ...;............
>     03F0: 00 10 03 3C 01 00 00 00 00 00 00 00 00 00 00 00  ...<............
>     0400: 00 10 03 3D 01 00 00 00 00 00 00 00 00 00 00 00  ...=............
>     0410: 00 10 03 3E 01 00 00 00 00 00 00 00 00 00 00 00  ...>............
>     0420: 00 10 03 3F 01 00 00 00 00 00 00 00 00 00 00 00  ...?............
>     0430: 00 10 04 40 01 00 00 00 00 00 00 00 00 00 00 00  ...@............
>     0440: 00 10 04 41 01 00 00 00 00 00 00 00 00 00 00 00  ...A............
>     0450: 00 10 04 42 01 00 00 00 00 00 00 00 00 00 00 00  ...B............
>     0460: 00 10 04 43 01 00 00 00 00 00 00 00 00 00 00 00  ...C............
>     0470: 00 10 04 44 01 00 00 00 00 00 00 00 00 00 00 00  ...D............
>     0480: 00 10 04 45 01 00 00 00 00 00 00 00 00 00 00 00  ...E............
>     0490: 00 10 04 46 01 00 00 00 00 00 00 00 00 00 00 00  ...F............
>     04A0: 00 10 04 47 01 00 00 00 00 00 00 00 00 00 00 00  ...G............
>     04B0: 00 10 04 48 01 00 00 00 00 00 00 00 00 00 00 00  ...H............
>     04C0: 00 10 04 49 01 00 00 00 00 00 00 00 00 00 00 00  ...I............
>     04D0: 00 10 04 4A 01 00 00 00 00 00 00 00 00 00 00 00  ...J............
>     04E0: 00 10 04 4B 01 00 00 00 00 00 00 00 00 00 00 00  ...K............
>     04F0: 00 10 04 4C 01 00 00 00 00 00 00 00 00 00 00 00  ...L............
>     0500: 00 10 04 4D 01 00 00 00 00 00 00 00 00 00 00 00  ...M............
>     0510: 00 10 04 4E 01 00 00 00 00 00 00 00 00 00 00 00  ...N............
>     0520: 00 10 04 4F 01 00 00 00 00 00 00 00 00 00 00 00  ...O............
>     0530: 00 10 05 50 01 00 00 00 00 00 00 00 00 00 00 00  ...P............
>     0540: 00 10 05 51 01 00 00 00 00 00 00 00 00 00 00 00  ...Q............
>     0550: 00 10 05 52 01 00 00 00 00 00 00 00 00 00 00 00  ...R............
>     0560: 00 10 05 53 01 00 00 00 00 00 00 00 00 00 00 00  ...S............
>     0570: 00 10 05 54 01 00 00 00 00 00 00 00 00 00 00 00  ...T............
>     0580: 00 10 05 55 01 00 00 00 00 00 00 00 00 00 00 00  ...U............
>     0590: 00 10 05 56 01 00 00 00 00 00 00 00 00 00 00 00  ...V............
>     05A0: 00 10 05 57 01 00 00 00 00 00 00 00 00 00 00 00  ...W............
>     05B0: 00 10 05 58 01 00 00 00 00 00 00 00 00 00 00 00  ...X............
>     05C0: 00 10 05 59 01 00 00 00 00 00 00 00 00 00 00 00  ...Y............
>     05D0: 00 10 05 5A 01 00 00 00 00 00 00 00 00 00 00 00  ...Z............
>     05E0: 00 10 05 5B 01 00 00 00 00 00 00 00 00 00 00 00  ...[............
>     05F0: 00 10 05 5C 01 00 00 00 00 00 00 00 00 00 00 00  ...\............
>     0600: 00 10 05 5D 01 00 00 00 00 00 00 00 00 00 00 00  ...]............
>     0610: 00 10 05 5E 01 00 00 00 00 00 00 00 00 00 00 00  ...^............
>     0620: 00 10 05 5F 01 00 00 00 00 00 00 00 00 00 00 00  ..._............
>     0630: 00 10 06 60 01 00 00 00 00 00 00 00 00 00 00 00  ...`............
>     0640: 00 10 06 61 01 00 00 00 00 00 00 00 00 00 00 00  ...a............
>     0650: 00 10 06 62 01 00 00 00 00 00 00 00 00 00 00 00  ...b............
>     0660: 00 10 06 63 01 00 00 00 00 00 00 00 00 00 00 00  ...c............
>     0670: 00 10 06 64 01 00 00 00 00 00 00 00 00 00 00 00  ...d............
>     0680: 00 10 06 65 01 00 00 00 00 00 00 00 00 00 00 00  ...e............
>     0690: 00 10 06 66 01 00 00 00 00 00 00 00 00 00 00 00  ...f............
>     06A0: 00 10 06 67 01 00 00 00 00 00 00 00 00 00 00 00  ...g............
>     06B0: 00 10 06 68 01 00 00 00 00 00 00 00 00 00 00 00  ...h............
>     06C0: 00 10 06 69 01 00 00 00 00 00 00 00 00 00 00 00  ...i............
>     06D0: 00 10 06 6A 01 00 00 00 00 00 00 00 00 00 00 00  ...j............
>     06E0: 00 10 06 6B 01 00 00 00 00 00 00 00 00 00 00 00  ...k............
>     06F0: 00 10 06 6C 01 00 00 00 00 00 00 00 00 00 00 00  ...l............
>     0700: 00 10 06 6D 01 00 00 00 00 00 00 00 00 00 00 00  ...m............
>     0710: 00 10 06 6E 01 00 00 00 00 00 00 00 00 00 00 00  ...n............
>     0720: 00 10 06 6F 01 00 00 00 00 00 00 00 00 00 00 00  ...o............
>     0730: 00 10 07 70 01 00 00 00 00 00 00 00 00 00 00 00  ...p............
>     0740: 00 10 07 71 01 00 00 00 00 00 00 00 00 00 00 00  ...q............
>     0750: 00 10 07 72 01 00 00 00 00 00 00 00 00 00 00 00  ...r............
>     0760: 00 10 07 73 01 00 00 00 00 00 00 00 00 00 00 00  ...s............
>     0770: 00 10 07 74 01 00 00 00 00 00 00 00 00 00 00 00  ...t............
>     0780: 00 10 07 75 01 00 00 00 00 00 00 00 00 00 00 00  ...u............
>     0790: 00 10 07 76 01 00 00 00 00 00 00 00 00 00 00 00  ...v............
>     07A0: 00 10 07 77 01 00 00 00 00 00 00 00 00 00 00 00  ...w............
>     07B0: 00 10 07 78 01 00 00 00 00 00 00 00 00 00 00 00  ...x............
>     07C0: 00 10 07 79 01 00 00 00 00 00 00 00 00 00 00 00  ...y............
>     07D0: 00 10 07 7A 01 00 00 00 00 00 00 00 00 00 00 00  ...z............
>     07E0: 00 10 07 7B 01 00 00 00 00 00 00 00 00 00 00 00  ...{............
>     07F0: 00 10 07 7C 01 00 00 00 00 00 00 00 00 00 00 00  ...|............
>     0800: 00 10 07 7D 01 00 00 00 00 00 00 00 00 00 00 00  ...}............
>     0810: 00 10 07 7E 01 00 00 00 00 00 00 00 00 00 00 00  ...~............
>     0820: 00 10 07 7F 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     0830: 01 28 00 00 00 00 00 00 00 00 00 00 00 00 00 00  .(..............
>     0840: 00 00 0A 00 00 00 00 00 00 00 00 00 01 00 00 00  ................
>     0850: 00 00 00 00 00 00 00 00 01 28 00 00 00 00 00 00  .........(......
>     0860: 00 00 10 00 00 00 00 00 00 00 F0 7F 00 00 00 00  ................
>     0870: 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     0880: 01 28 00 00 00 00 00 00 00 00 00 00 01 00 00 00  .(..............
>     0890: 00 00 00 80 03 00 00 00 00 00 00 00 01 00 00 00  ................
>     08A0: 00 00 00 00 00 00 00 00 01 28 01 00 00 00 00 00  .........(......
>     08B0: 00 00 00 80 04 00 00 00 00 00 00 00 04 00 00 00  ................
>     08C0: 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     08D0: 01 28 02 00 00 00 00 00 00 00 00 80 08 00 00 00  .(..............
>     08E0: 00 00 00 00 04 00 00 00 00 00 00 00 01 00 00 00  ................
>     08F0: 00 00 00 00 00 00 00 00 01 28 03 00 00 00 00 00  .........(......
>     0900: 00 00 00 80 0C 00 00 00 00 00 00 00 04 00 00 00  ................
>     0910: 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     0920: 01 28 04 00 00 00 00 00 00 00 00 80 10 00 00 00  .(..............
>     0930: 00 00 00 00 04 00 00 00 00 00 00 00 01 00 00 00  ................
>     0940: 00 00 00 00 00 00 00 00 01 28 05 00 00 00 00 00  .........(......
>     0950: 00 00 00 80 14 00 00 00 00 00 00 00 04 00 00 00  ................
>     0960: 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
>     0970: 01 28 06 00 00 00 00 00 00 00 00 80 18 00 00 00  .(..............
>     0980: 00 00 00 00 04 00 00 00 00 00 00 00 01 00 00 00  ................
>     0990: 00 00 00 00 00 00 00 00 01 28 07 00 00 00 00 00  .........(......
>     09A0: 00 00 00 80 1C 00 00 00 00 00 00 00 04 00 00 00  ................
>     09B0: 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
> >   
> > >
> > > CPU 0:
> > >    0x8000001e 0x00: eax=0x00000000 ebx=0x00000100 ecx=0x00000300
> > > edx=0x00000000
> > > CPU 1:
> > >    0x8000001e 0x00: eax=0x00000002 ebx=0x00000101 ecx=0x00000300
> > > edx=0x00000000
> > > CPU 2:
> > >    0x8000001e 0x00: eax=0x00000004 ebx=0x00000102 ecx=0x00000300
> > > edx=0x00000000
> > > CPU 3:
> > >    0x8000001e 0x00: eax=0x00000006 ebx=0x00000103 ecx=0x00000300
> > > edx=0x00000000
> > > CPU 4:
> > >    0x8000001e 0x00: eax=0x00000008 ebx=0x00000104 ecx=0x00000300
> > > edx=0x00000000
> > > CPU 5:
> > >    0x8000001e 0x00: eax=0x0000000a ebx=0x00000105 ecx=0x00000300
> > > edx=0x00000000
> > > CPU 6:
> > >    0x8000001e 0x00: eax=0x0000000c ebx=0x00000106 ecx=0x00000300
> > > edx=0x00000000
> > > CPU 7:
> > >    0x8000001e 0x00: eax=0x0000000e ebx=0x00000107 ecx=0x00000300
> > > edx=0x00000000
> > > CPU 8:
> > >    0x8000001e 0x00: eax=0x00000010 ebx=0x00000108 ecx=0x00000301
> > > edx=0x00000000
> > > CPU 9:
> > >    0x8000001e 0x00: eax=0x00000012 ebx=0x00000109 ecx=0x00000301
> > > edx=0x00000000
> > > CPU 10:
> > >    0x8000001e 0x00: eax=0x00000014 ebx=0x0000010a ecx=0x00000301
> > > edx=0x00000000
> > > CPU 11:
> > >    0x8000001e 0x00: eax=0x00000016 ebx=0x0000010b ecx=0x00000301
> > > edx=0x00000000
> > > CPU 12:
> > >    0x8000001e 0x00: eax=0x00000018 ebx=0x0000010c ecx=0x00000301
> > > edx=0x00000000
> > > CPU 13:
> > >    0x8000001e 0x00: eax=0x0000001a ebx=0x0000010d ecx=0x00000301
> > > edx=0x00000000
> > > CPU 14:
> > >    0x8000001e 0x00: eax=0x0000001c ebx=0x0000010e ecx=0x00000301
> > > edx=0x00000000
> > > CPU 15:
> > >    0x8000001e 0x00: eax=0x0000001e ebx=0x0000010f ecx=0x00000301
> > > edx=0x00000000
> > > CPU 16:
> > >    0x8000001e 0x00: eax=0x00000020 ebx=0x00000110 ecx=0x00000302
> > > edx=0x00000000
> > > CPU 17:
> > >    0x8000001e 0x00: eax=0x00000022 ebx=0x00000111 ecx=0x00000302
> > > edx=0x00000000
> > > CPU 18:
> > >    0x8000001e 0x00: eax=0x00000024 ebx=0x00000112 ecx=0x00000302
> > > edx=0x00000000
> > > CPU 19:
> > >    0x8000001e 0x00: eax=0x00000026 ebx=0x00000113 ecx=0x00000302
> > > edx=0x00000000
> > > CPU 20:
> > >    0x8000001e 0x00: eax=0x00000028 ebx=0x00000114 ecx=0x00000302
> > > edx=0x00000000
> > > CPU 21:
> > >    0x8000001e 0x00: eax=0x0000002a ebx=0x00000115 ecx=0x00000302
> > > edx=0x00000000
> > > CPU 22:
> > >    0x8000001e 0x00: eax=0x0000002c ebx=0x00000116 ecx=0x00000302
> > > edx=0x00000000
> > > CPU 23:
> > >    0x8000001e 0x00: eax=0x0000002e ebx=0x00000117 ecx=0x00000302
> > > edx=0x00000000
> > > CPU 24:
> > >    0x8000001e 0x00: eax=0x00000030 ebx=0x00000118 ecx=0x00000303
> > > edx=0x00000000
> > > CPU 25:
> > >    0x8000001e 0x00: eax=0x00000032 ebx=0x00000119 ecx=0x00000303
> > > edx=0x00000000
> > > CPU 26:
> > >    0x8000001e 0x00: eax=0x00000034 ebx=0x0000011a ecx=0x00000303
> > > edx=0x00000000
> > > CPU 27:
> > >    0x8000001e 0x00: eax=0x00000036 ebx=0x0000011b ecx=0x00000303
> > > edx=0x00000000
> > > CPU 28:
> > >    0x8000001e 0x00: eax=0x00000038 ebx=0x0000011c ecx=0x00000303
> > > edx=0x00000000
> > > CPU 29:
> > >    0x8000001e 0x00: eax=0x0000003a ebx=0x0000011d ecx=0x00000303
> > > edx=0x00000000
> > > CPU 30:
> > >    0x8000001e 0x00: eax=0x0000003c ebx=0x0000011e ecx=0x00000303
> > > edx=0x00000000
> > > CPU 31:
> > >    0x8000001e 0x00: eax=0x0000003e ebx=0x0000011f ecx=0x00000303
> > > edx=0x00000000
> > > CPU 32:
> > >    0x8000001e 0x00: eax=0x00000040 ebx=0x00000100 ecx=0x00000304
> > > edx=0x00000000
> > > CPU 33:
> > >    0x8000001e 0x00: eax=0x00000042 ebx=0x00000101 ecx=0x00000304
> > > edx=0x00000000
> > > CPU 34:
> > >    0x8000001e 0x00: eax=0x00000044 ebx=0x00000102 ecx=0x00000304
> > > edx=0x00000000
> > > CPU 35:
> > >    0x8000001e 0x00: eax=0x00000046 ebx=0x00000103 ecx=0x00000304
> > > edx=0x00000000
> > > CPU 36:
> > >    0x8000001e 0x00: eax=0x00000048 ebx=0x00000104 ecx=0x00000304
> > > edx=0x00000000
> > > CPU 37:
> > >    0x8000001e 0x00: eax=0x0000004a ebx=0x00000105 ecx=0x00000304
> > > edx=0x00000000
> > > CPU 38:
> > >    0x8000001e 0x00: eax=0x0000004c ebx=0x00000106 ecx=0x00000304
> > > edx=0x00000000
> > > CPU 39:
> > >    0x8000001e 0x00: eax=0x0000004e ebx=0x00000107 ecx=0x00000304
> > > edx=0x00000000
> > > CPU 40:
> > >    0x8000001e 0x00: eax=0x00000050 ebx=0x00000108 ecx=0x00000305
> > > edx=0x00000000
> > > CPU 41:
> > >    0x8000001e 0x00: eax=0x00000052 ebx=0x00000109 ecx=0x00000305
> > > edx=0x00000000
> > > CPU 42:
> > >    0x8000001e 0x00: eax=0x00000054 ebx=0x0000010a ecx=0x00000305
> > > edx=0x00000000
> > > CPU 43:
> > >    0x8000001e 0x00: eax=0x00000056 ebx=0x0000010b ecx=0x00000305
> > > edx=0x00000000
> > > CPU 44:
> > >    0x8000001e 0x00: eax=0x00000058 ebx=0x0000010c ecx=0x00000305
> > > edx=0x00000000
> > > CPU 45:
> > >    0x8000001e 0x00: eax=0x0000005a ebx=0x0000010d ecx=0x00000305
> > > edx=0x00000000
> > > CPU 46:
> > >    0x8000001e 0x00: eax=0x0000005c ebx=0x0000010e ecx=0x00000305
> > > edx=0x00000000
> > > CPU 47:
> > >    0x8000001e 0x00: eax=0x0000005e ebx=0x0000010f ecx=0x00000305
> > > edx=0x00000000
> > > CPU 48:
> > >    0x8000001e 0x00: eax=0x00000060 ebx=0x00000110 ecx=0x00000306
> > > edx=0x00000000
> > > CPU 49:
> > >    0x8000001e 0x00: eax=0x00000062 ebx=0x00000111 ecx=0x00000306
> > > edx=0x00000000
> > > CPU 50:
> > >    0x8000001e 0x00: eax=0x00000064 ebx=0x00000112 ecx=0x00000306
> > > edx=0x00000000
> > > CPU 51:
> > >    0x8000001e 0x00: eax=0x00000066 ebx=0x00000113 ecx=0x00000306
> > > edx=0x00000000
> > > CPU 52:
> > >    0x8000001e 0x00: eax=0x00000068 ebx=0x00000114 ecx=0x00000306
> > > edx=0x00000000
> > > CPU 53:
> > >    0x8000001e 0x00: eax=0x0000006a ebx=0x00000115 ecx=0x00000306
> > > edx=0x00000000
> > > CPU 54:
> > >    0x8000001e 0x00: eax=0x0000006c ebx=0x00000116 ecx=0x00000306
> > > edx=0x00000000
> > > CPU 55:
> > >    0x8000001e 0x00: eax=0x0000006e ebx=0x00000117 ecx=0x00000306
> > > edx=0x00000000
> > > CPU 56:
> > >    0x8000001e 0x00: eax=0x00000070 ebx=0x00000118 ecx=0x00000307
> > > edx=0x00000000
> > > CPU 57:
> > >    0x8000001e 0x00: eax=0x00000072 ebx=0x00000119 ecx=0x00000307
> > > edx=0x00000000
> > > CPU 58:
> > >    0x8000001e 0x00: eax=0x00000074 ebx=0x0000011a ecx=0x00000307
> > > edx=0x00000000
> > > CPU 59:
> > >    0x8000001e 0x00: eax=0x00000076 ebx=0x0000011b ecx=0x00000307
> > > edx=0x00000000
> > > CPU 60:
> > >    0x8000001e 0x00: eax=0x00000078 ebx=0x0000011c ecx=0x00000307
> > > edx=0x00000000
> > > CPU 61:
> > >    0x8000001e 0x00: eax=0x0000007a ebx=0x0000011d ecx=0x00000307
> > > edx=0x00000000
> > > CPU 62:
> > >    0x8000001e 0x00: eax=0x0000007c ebx=0x0000011e ecx=0x00000307
> > > edx=0x00000000
> > > CPU 63:
> > >    0x8000001e 0x00: eax=0x0000007e ebx=0x0000011f ecx=0x00000307
> > > edx=0x00000000
> > >  
> > > >  
> > > > >  
> > > > > >
> > > > > > I wonder if linux guest actually uses node_id encoded in apic id
> > > > > > for configuring/checking numa structures, or it just uses
> > > > > > whatever ACPI SRAT table provided.
> > > > > >  
> > > > > >>>> + */
> > > > > >>>> +static inline void x86_init_topo_ids(X86CPUTopoInfo *topo_info,
> > > > > >>>> +                                     CpuInstanceProperties props,
> > > > > >>>> +                                     X86CPUTopoIDs *topo_ids) {
> > > > > >>>> +    topo_ids->smt_id = props.has_thread_id ? props.thread_id : 0;
> > > > > >>>> +    topo_ids->core_id = props.has_core_id ? props.core_id : 0;
> > > > > >>>> +    topo_ids->die_id = props.has_die_id ? props.die_id : 0;
> > > > > >>>> +    topo_ids->node_id = props.has_node_id ?
> > > > > >>>> +                        props.node_id % MAX(topo_info->nodes_per_pkg, 1) :  
> > 0;  
> > > > > >>>> +    topo_ids->pkg_id = props.has_socket_id ? props.socket_id :
> > > > > >>>> +0; }
> > > > > >>>>  /*
> > > > > >>>>   * Make APIC ID for the CPU 'cpu_index'
> > > > > >>>>   *
> > > > > >>>>  
> > > > > >>  
> > > > > >  
> > > > >  
> > >
> > >  
> 



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from CpuInstanceProperties
  2020-07-13 19:30             ` Babu Moger
  2020-07-14 16:41               ` Igor Mammedov
@ 2020-07-24 17:05               ` Igor Mammedov
  2020-07-27 15:49                 ` Babu Moger
  1 sibling, 1 reply; 23+ messages in thread
From: Igor Mammedov @ 2020-07-24 17:05 UTC (permalink / raw)
  To: Babu Moger; +Cc: pbonzini, rth, qemu-devel, ehabkost

On Mon, 13 Jul 2020 14:30:29 -0500
Babu Moger <babu.moger@amd.com> wrote:

> > -----Original Message-----
> > From: Igor Mammedov <imammedo@redhat.com>
> > Sent: Monday, July 13, 2020 12:32 PM
> > To: Moger, Babu <Babu.Moger@amd.com>
> > Cc: pbonzini@redhat.com; rth@twiddle.net; ehabkost@redhat.com; qemu-
> > devel@nongnu.org
> > Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> > CpuInstanceProperties
> > 
> > On Mon, 13 Jul 2020 11:43:33 -0500
> > Babu Moger <babu.moger@amd.com> wrote:
> >   
> > > On 7/13/20 11:17 AM, Igor Mammedov wrote:  
> > > > On Mon, 13 Jul 2020 10:02:22 -0500
> > > > Babu Moger <babu.moger@amd.com> wrote:
> > > >  
> > > >>> -----Original Message-----
> > > >>> From: Igor Mammedov <imammedo@redhat.com>
> > > >>> Sent: Monday, July 13, 2020 4:08 AM
> > > >>> To: Moger, Babu <Babu.Moger@amd.com>
> > > >>> Cc: pbonzini@redhat.com; rth@twiddle.net; ehabkost@redhat.com;
> > > >>> qemu- devel@nongnu.org
> > > >>> Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> > > >>> CpuInstanceProperties  
> > > > [...]  
> > > >>>> +
> > > >>>> +/*
> > > >>>> + * Initialize topo_ids from CpuInstanceProperties
> > > >>>> + * node_id in CpuInstanceProperties(or in CPU device) is a sequential
> > > >>>> + * number, but while building the topology  
> > > >>>  
> > > >>>> we need to separate it for
> > > >>>> + * each socket(mod nodes_per_pkg).  
> > > >>> could you clarify a bit more on why this is necessary?  
> > > >>
> > > >> If you have two sockets and 4 numa nodes, node_id in
> > > >> CpuInstanceProperties will be numbered sequentially as 0, 1, 2, 3.
> > > >> But in EPYC topology, it will be  0, 1, 0, 1( Basically mod % number of nodes  
> > per socket).  
> > > >
> > > > I'm confused, let's suppose we have 2 EPYC sockets with 2 nodes per
> > > > socket so APIC id would be composed like:
> > > >
> > > >  1st socket
> > > >    pkg_id(0) | node_id(0)
> > > >    pkg_id(0) | node_id(1)
> > > >
> > > >  2nd socket
> > > >    pkg_id(1) | node_id(0)
> > > >    pkg_id(1) | node_id(1)
> > > >
> > > > if that's the case, then EPYC's node_id here doesn't look like a
> > > > NUMA node in the sense it's usually used (above config would have 4
> > > > different memory controllers => 4 conventional NUMA nodes).  
> > >
> > > EPYC model uses a combination of socket id and node id to identify the
> > > numa nodes. So, it internally uses all the information.  
> > 
> > well with above values, EPYC's node_id doesn't look like it's specifying a
> > machine numa node, but rather a node index within single socket. In which case,
> > it doesn't make much sense calling it NUMA node_id, it's rather some index
> > within a socket. (it starts looking like terminology is all mixed up)
> > 
> > If you have access to a multi-socket EPYC machine, can you dump and post here
> > its apic ids, pls?  
> 
> Here is the output from my EPYC machine with 2 sockets and totally 8
> nodes(SMT disabled). The cpus 0-31 are in socket 0 and  cpus 32-63 in
> socket 1.
> 
> # lscpu
> Architecture:        x86_64
> CPU op-mode(s):      32-bit, 64-bit
> Byte Order:          Little Endian
> CPU(s):              64
> On-line CPU(s) list: 0-63
> Thread(s) per core:  1
> Core(s) per socket:  32
> Socket(s):           2
> NUMA node(s):        8
> Vendor ID:           AuthenticAMD
> CPU family:          23
> Model:               1
> Model name:          AMD Eng Sample: 1S1901A4VIHF5_30/19_N
> Stepping:            2
> CPU MHz:             2379.233
> CPU max MHz:         1900.0000
> CPU min MHz:         1200.0000
> BogoMIPS:            3792.81
> Virtualization:      AMD-V
> L1d cache:           32K
> L1i cache:           64K
> L2 cache:            512K
> L3 cache:            8192K
> NUMA node0 CPU(s):   0-7
> NUMA node1 CPU(s):   8-15
> NUMA node2 CPU(s):   16-23
> NUMA node3 CPU(s):   24-31
> NUMA node4 CPU(s):   32-39
> NUMA node5 CPU(s):   40-47
> NUMA node6 CPU(s):   48-55
> NUMA node7 CPU(s):   56-63
> 
> Here is the output of #cpuid  -l 0x8000001e  -r


(1)
> You may want to refer
> https://www.amd.com/system/files/TechDocs/54945_3.03_ppr_ZP_B2_pub.zip
> (section 2.1.12.2.1.3 ApicId Enumeration Requirements).
> Note that this is a general guideline. We tried to generalize in qemu as
> much as possible. It is a bit complex.


 
> CPU 0:
>    0x8000001e 0x00: eax=0x00000000 ebx=0x00000100 ecx=0x00000300
> edx=0x00000000
[...]
> CPU 63:
>    0x8000001e 0x00: eax=0x0000007e ebx=0x0000011f ecx=0x00000307
> edx=0x00000000
> 
> >   
> > >  
> > > >
> > > > I wonder if linux guest actually uses node_id encoded in apic id for
> > > > configuring/checking numa structures, or it just uses whatever ACPI
> > > > SRAT table provided.
> > > >  
> > > >>>> + */
> > > >>>> +static inline void x86_init_topo_ids(X86CPUTopoInfo *topo_info,
> > > >>>> +                                     CpuInstanceProperties props,
> > > >>>> +                                     X86CPUTopoIDs *topo_ids) {
> > > >>>> +    topo_ids->smt_id = props.has_thread_id ? props.thread_id : 0;
> > > >>>> +    topo_ids->core_id = props.has_core_id ? props.core_id : 0;
> > > >>>> +    topo_ids->die_id = props.has_die_id ? props.die_id : 0;
> > > >>>> +    topo_ids->node_id = props.has_node_id ?
> > > >>>> +                        props.node_id % MAX(topo_info->nodes_per_pkg, 1) : 0;

It looks like I was wrong pushing the system-wide NUMA node-id into the APIC ID
(the chosen naming is a bit confusing); per [1] mentioned above, EPYC's node-id is:

• ApicId[6] = Socket ID.
• ApicId[5:4] = Node ID.
• ApicId[3] = Logical CCX L3 complex ID
• ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} : {1'b0,LogicalCoreID[1:0]}

which can hold only values 0-3, and is defined as:

"A node, is an integrated circuit device that includes one to 8 cores (one or two Core Complexes)."

spec also mentions it indirectly as die-id if one looks at
CPUID_Fn8000001E_EBX [Core Identifiers] (Core::X86::Cpuid::CoreId)
...
  CoreId = ({2'b0, DieId[1:0], LogicalComplexId[0], LogicalThreadId[2:0]} >> SMT

and in
(2)
CPUID_Fn8000001E_ECX [Node Identifiers] (Core::X86::Cpuid::NodeId)
...
  {5'b00000,1'b[SOCKET_ID],2'b[DIE_ID]}

Question is why we did not reuse topo_ids->die_id instead of adding
confusing topo_ids->node_id in the first place?

Also, looking at the APIC ID and SRAT table provided here,
CPUID_Fn8000001E_ECX corresponds to the NUMA node id (i.e. what -numa in QEMU is used for)
and the Node ID embedded into ApicId[5:4] is basically a die-id.

The difference between the die-id implemented in QEMU and EPYC's die id (topo_ids->node_id)
is that the former doesn't require a numa config (maybe it should, but that ship's already sailed)
and gets the number of dies from the '-smp dies=X' CLI option, while for EPYC we calculate
topo_ids->node_id implicitly from the number of numa nodes and sockets, which implicitly
requires that the machine 'must' be configured with -numa options.

Maybe we should drop this implicit calculation along with topo_ids->node_id
and reuse '-smp dies=X', plus extra checks for EPYC to ask for -numa if there
is more than 1 die, and, if we need to be really strict, add checks for the limits of
dies/cores within a socket/die according to spec [2] so the encoded APIC ID and CPUID_8000001E
match the spec.



> > > >>>> +    topo_ids->pkg_id = props.has_socket_id ? props.socket_id :
> > > >>>> +0; }
> > > >>>>  /*
> > > >>>>   * Make APIC ID for the CPU 'cpu_index'
> > > >>>>   *
> > > >>>>  
> > > >>  
> > > >  
> > >  
> 
> 



^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH v2 1/3] hw/i386: Initialize topo_ids from CpuInstanceProperties
  2020-07-24 17:05               ` Igor Mammedov
@ 2020-07-27 15:49                 ` Babu Moger
  2020-07-27 17:14                   ` Igor Mammedov
  0 siblings, 1 reply; 23+ messages in thread
From: Babu Moger @ 2020-07-27 15:49 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: pbonzini, rth, qemu-devel, ehabkost



> -----Original Message-----
> From: Igor Mammedov <imammedo@redhat.com>
> Sent: Friday, July 24, 2020 12:05 PM
> To: Moger, Babu <Babu.Moger@amd.com>
> Cc: qemu-devel@nongnu.org; pbonzini@redhat.com; ehabkost@redhat.com;
> rth@twiddle.net
> Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> CpuInstanceProperties
> 
> On Mon, 13 Jul 2020 14:30:29 -0500
> Babu Moger <babu.moger@amd.com> wrote:
> 
> > > -----Original Message-----
> > > From: Igor Mammedov <imammedo@redhat.com>
> > > Sent: Monday, July 13, 2020 12:32 PM
> > > To: Moger, Babu <Babu.Moger@amd.com>
> > > Cc: pbonzini@redhat.com; rth@twiddle.net; ehabkost@redhat.com; qemu-
> > > devel@nongnu.org
> > > Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> > > CpuInstanceProperties
> > >
> > > On Mon, 13 Jul 2020 11:43:33 -0500
> > > Babu Moger <babu.moger@amd.com> wrote:
> > >
> > > > On 7/13/20 11:17 AM, Igor Mammedov wrote:
> > > > > On Mon, 13 Jul 2020 10:02:22 -0500 Babu Moger
> > > > > <babu.moger@amd.com> wrote:
> > > > >
> > > > >>> -----Original Message-----
> > > > >>> From: Igor Mammedov <imammedo@redhat.com>
> > > > >>> Sent: Monday, July 13, 2020 4:08 AM
> > > > >>> To: Moger, Babu <Babu.Moger@amd.com>
> > > > >>> Cc: pbonzini@redhat.com; rth@twiddle.net; ehabkost@redhat.com;
> > > > >>> qemu- devel@nongnu.org
> > > > >>> Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> > > > >>> CpuInstanceProperties
> > > > > [...]
> > > > >>>> +
> > > > >>>> +/*
> > > > >>>> + * Initialize topo_ids from CpuInstanceProperties
> > > > >>>> + * node_id in CpuInstanceProperties(or in CPU device) is a
> > > > >>>> +sequential
> > > > >>>> + * number, but while building the topology
> > > > >>>
> > > > >>>> we need to separate it for
> > > > >>>> + * each socket(mod nodes_per_pkg).
> > > > >>> could you clarify a bit more on why this is necessary?
> > > > >>
> > > > >> If you have two sockets and 4 numa nodes, node_id in
> > > > >> CpuInstanceProperties will be numbered sequentially as 0, 1, 2, 3.
> > > > >> But in EPYC topology, it will be  0, 1, 0, 1( Basically mod %
> > > > >> number of nodes
> > > per socket).
> > > > >
> > > > > I'm confused, let's suppose we have 2 EPYC sockets with 2 nodes
> > > > > per socket so APIC id would be composed like:
> > > > >
> > > > >  1st socket
> > > > >    pkg_id(0) | node_id(0)
> > > > >    pkg_id(0) | node_id(1)
> > > > >
> > > > >  2nd socket
> > > > >    pkg_id(1) | node_id(0)
> > > > >    pkg_id(1) | node_id(1)
> > > > >
> > > > > if that's the case, then EPYC's node_id here doesn't look like a
> > > > > NUMA node in the sense it's usually used (above config would
> > > > > have 4 different memory controllers => 4 conventional NUMA nodes).
> > > >
> > > > EPYC model uses a combination of socket id and node id to identify
> > > > the numa nodes. So, it internally uses all the information.
> > >
> > > well with above values, EPYC's node_id doesn't look like it's
> > > specifying a machine numa node, but rather a node index within
> > > single socket. In which case, it doesn't make much sense calling it
> > > NUMA node_id, it's rather some index within a socket. (it starts
> > > looking like terminology is all mixed up)
> > >
> > > If you have access to a multi-socket EPYC machine, can you dump and
> > > post here its apic ids, pls?
> >
> > Here is the output from my EPYC machine with 2 sockets and totally 8
> > nodes(SMT disabled). The cpus 0-31 are in socket 0 and  cpus 32-63 in
> > socket 1.
> >
> > # lscpu
> > Architecture:        x86_64
> > CPU op-mode(s):      32-bit, 64-bit
> > Byte Order:          Little Endian
> > CPU(s):              64
> > On-line CPU(s) list: 0-63
> > Thread(s) per core:  1
> > Core(s) per socket:  32
> > Socket(s):           2
> > NUMA node(s):        8
> > Vendor ID:           AuthenticAMD
> > CPU family:          23
> > Model:               1
> > Model name:          AMD Eng Sample: 1S1901A4VIHF5_30/19_N
> > Stepping:            2
> > CPU MHz:             2379.233
> > CPU max MHz:         1900.0000
> > CPU min MHz:         1200.0000
> > BogoMIPS:            3792.81
> > Virtualization:      AMD-V
> > L1d cache:           32K
> > L1i cache:           64K
> > L2 cache:            512K
> > L3 cache:            8192K
> > NUMA node0 CPU(s):   0-7
> > NUMA node1 CPU(s):   8-15
> > NUMA node2 CPU(s):   16-23
> > NUMA node3 CPU(s):   24-31
> > NUMA node4 CPU(s):   32-39
> > NUMA node5 CPU(s):   40-47
> > NUMA node6 CPU(s):   48-55
> > NUMA node7 CPU(s):   56-63
> >
> > Here is the output of #cpuid  -l 0x8000001e  -r
> 
> 
> (1)
> > You may want to refer
> > https://www.amd.com/system/files/TechDocs/54945_3.03_ppr_ZP_B2_pub.zip
> > (section 2.1.12.2.1.3 ApicId Enumeration Requirements).
> > Note that this is a general guideline. We tried to generalize in qemu
> > as much as possible. It is a bit complex.
> 
> 
> 
> > CPU 0:
> >    0x8000001e 0x00: eax=0x00000000 ebx=0x00000100 ecx=0x00000300
> > edx=0x00000000
> [...]
> > CPU 63:
> >    0x8000001e 0x00: eax=0x0000007e ebx=0x0000011f ecx=0x00000307
> > edx=0x00000000
> >
> > >
> > > >
> > > > >
> > > > > I wonder if linux guest actually uses node_id encoded in apic id
> > > > > for configuring/checking numa structures, or it just uses
> > > > > whatever ACPI SRAT table provided.
> > > > >
> > > > >>>> + */
> > > > >>>> +static inline void x86_init_topo_ids(X86CPUTopoInfo *topo_info,
> > > > >>>> +                                     CpuInstanceProperties props,
> > > > >>>> +                                     X86CPUTopoIDs *topo_ids) {
> > > > >>>> +    topo_ids->smt_id = props.has_thread_id ? props.thread_id : 0;
> > > > >>>> +    topo_ids->core_id = props.has_core_id ? props.core_id : 0;
> > > > >>>> +    topo_ids->die_id = props.has_die_id ? props.die_id : 0;
> > > > >>>> +    topo_ids->node_id = props.has_node_id ?
> > > > >>>> +                        props.node_id %
> > > > >>>> +MAX(topo_info->nodes_per_pkg, 1) : 0;
> 
> It looks like I was wrong pushing the system-wide NUMA node-id into the APIC ID
> (the chosen naming is a bit confusing); per [1] mentioned above, EPYC's node-id is:
> 
> • ApicId[6] = Socket ID.
> • ApicId[5:4] = Node ID.
> • ApicId[3] = Logical CCX L3 complex ID
> • ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} : {1'b0,LogicalCoreID[1:0]}
> 
> which can hold only values 0-3, and is defined as:
> 
> "A node, is an integrated circuit device that includes one to 8 cores (one or two
> Core Complexes)."
> 
> spec also mentions it indirectly as die-id if one looks at CPUID_Fn8000001E_EBX
> [Core Identifiers] (Core::X86::Cpuid::CoreId) ...
>   CoreId = ({2'b0, DieId[1:0], LogicalComplexId[0], LogicalThreadId[2:0]} >> SMT
> 
> and in
> (2)
> CPUID_Fn8000001E_ECX [Node Identifiers] (Core::X86::Cpuid::NodeId) ...
>   {5'b00000,1'b[SOCKET_ID],2'b[DIE_ID]}
> 
> Question is why we did not reuse topo_ids->die_id instead of adding confusing
> topo_ids->node_id in the first place?

Initially, I thought about it. But Intel uses die_id differently than AMD,
so I did not want to complicate things.
If we take that route then we need to re-arrange the numa code, as we need
the numa information to calculate the die id. So, I did not want to mix up
things.

> 
> Also, looking at the APIC ID and SRAT table provided here, CPUID_Fn8000001E_ECX
> corresponds to the NUMA node id (i.e. what -numa in QEMU is used for) and the Node ID
> embedded into ApicId[5:4] is basically a die-id.
> 
> The difference between the die-id implemented in QEMU and EPYC's die id
> (topo_ids->node_id) is that the former doesn't require a numa config (maybe it
> should, but that ship's already sailed) and gets the number of dies from the
> '-smp dies=X' CLI option, while for EPYC we calculate topo_ids->node_id
> implicitly from the number of numa nodes and sockets, which implicitly requires
> that the machine 'must' be configured with -numa options.
> 
> Maybe we should drop this implicit calculation along with topo_ids->node_id
> and reuse '-smp dies=X', plus extra checks for EPYC to ask for -numa if there is
> more than 1 die, and, if we need to be really strict, add checks for the limits of
> dies/cores within a socket/die according to spec [2] so the encoded APIC ID and
> CPUID_8000001E match the spec.

There will be complications when the user configures both die_id and
numa node_id. It will complicate things further. I will have to look closely at
the code to see if it is feasible.

> 
> 
> 
> > > > >>>> +    topo_ids->pkg_id = props.has_socket_id ? props.socket_id :
> > > > >>>> +0; }
> > > > >>>>  /*
> > > > >>>>   * Make APIC ID for the CPU 'cpu_index'
> > > > >>>>   *
> > > > >>>>
> > > > >>
> > > > >
> > > >
> >
> >



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 3/3] hw/386: Fix uninitialized memory with -device and CPU hotplug
  2020-07-01 17:31 ` [PATCH v2 3/3] hw/386: Fix uninitialized memory with -device and CPU hotplug Babu Moger
@ 2020-07-27 16:36   ` Igor Mammedov
  0 siblings, 0 replies; 23+ messages in thread
From: Igor Mammedov @ 2020-07-27 16:36 UTC (permalink / raw)
  To: Babu Moger; +Cc: qemu-devel, pbonzini, ehabkost, rth

On Wed, 01 Jul 2020 12:31:15 -0500
Babu Moger <babu.moger@amd.com> wrote:

> Noticed the following command failure while testing CPU hotplug.
> 
> $ qemu-system-x86_64 -machine q35,accel=kvm -smp 1,maxcpus=2,
>   cores=1, threads=1,sockets=2 -cpu EPYC -device EPYC-x86_64-
>   cpu,core-id=0,socket-id=1,thread-id=0
> 
>   qemu-system-x86_64: -device EPYC-x86_64-cpu,core-id=0,socket-id=1,
>   thread-id=0: Invalid CPU [socket: 21855, die: 0, core: 0, thread: 0]
>   with APIC ID 21855, valid index range 0:1
> 
> This happens because APIC ID is calculated using uninitialized memory.
> This is happening after the addition of new field node_id in X86CPUTopoIDs
> structure. The node_id field is uninitialized while calling
> apicid_from_topo_ids. The problem is discussed in the thread below.
> https://lore.kernel.org/qemu-devel/20200602171838.GG577771@habkost.net/
> 
> Fix the problem by initializing the node_id from the device being added.
> 
> Fixes:
> Link: https://bugzilla.redhat.com/show_bug.cgi?id=1828750
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
>  hw/i386/pc.c |   10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index e613b2299f..aa9fb48834 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1553,6 +1553,15 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>              cpu->die_id = 0;
>          }
>  
so this is from 
 'if (cpu->apic_id == UNASSIGNED_APIC_ID) {'
branch, meaning cpu comes from -device/device_add

> +        /*
> +         * If node_id is not set, initialize it to zero for now. If the user
> +         * does not pass the correct node in case of numa configuration, it
> +         * will be rejected eventually.
> +         */
> +        if (cpu->node_id < 0) {
which means that the user hasn't provided 'node-id',
in which case we should error out, asking for a NUMA node-id along with the other options

(1)
However that's not enough, since by contract query-hotpluggable-cpus shall provide all attributes
necessary to hotplug a CPU, which makes node-id non-optional in case of an EPYC cpu.
So we need to initialize ms->possible_cpus->cpus[i].props.[has_]node_id by the time
we start creating CPUs.

here are 2 variants:
(2)
  * single node:
      nodes_per_pkg 1 and ms->smp.sockets == 1

    since it's the only node and the mapping of RAM/CPU is unambiguous,
    we can deal with it by moving auto_enable_numa into MachineState
    and enabling it in case EPYC CPU is used

  * multiple nodes:
      - ms->smp.sockets > 1
      - nodes_per_pkg > 1
    we can't make up NUMA nodes automatically, and have to ask the user to use -numa options
    to provide nodes along with the CPU/RAM mapping. So in case NUMA wasn't configured
    explicitly, we can only error out. (that also applies to CPUs created implicitly by board '-smp X')

(3) Once the user has supplied a mapping, we need to check that it matches the EPYC topology.


(4) As for CPUID, the current code for CPUID_Fn8000001E_ECX is
      if (nodes <= 4) { /* here 'nodes' is nodes_per_pkg */
         /* strictly follows the spec */
         *ecx = ((nodes - 1) << 8) | (topo_ids.pkg_id << 2) | topo_ids.node_id;
         /* makes up system wide NUMA node IDs which happen to match the system wide
            NUMA node IDs created by -numa, when -smp + -numa produces a nodes_per_pkg
            in that range; basically the user has no idea when this happens
          */
      } else {
         /* makes up a new algorithm for system wide NUMA node ID generation,
            for out of spec behaviour */
      }
     
    the problem with both branches is that they might lead to inconsistency between the
    system wide NUMA node id in CPUID_Fn8000001E_ECX and the one configured with -numa,
    which goes to the SRAT ACPI table and should go to the CPU::node-id property.

    Considering that out-of-spec behaviour is allowed, we probably should replace both
    branches with
       *ecx = ((nodes - 1) << 8) | cpu->node_id;
    which ensures consistency of system wide NUMA node ids, and add checks for max
    nodes / max node id.

checks could be done early in cpu's realize() function.



> +            cpu->node_id = 0;
> +        }

>          if (cpu->socket_id < 0) {
>              error_setg(errp, "CPU socket-id is not set");
>              return;
> @@ -1587,6 +1596,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>          }
>  
>          topo_ids.pkg_id = cpu->socket_id;
> +        topo_ids.node_id = cpu->node_id;
>          topo_ids.die_id = cpu->die_id;
>          topo_ids.core_id = cpu->core_id;
>          topo_ids.smt_id = cpu->thread_id;
> 



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from CpuInstanceProperties
  2020-07-27 15:49                 ` Babu Moger
@ 2020-07-27 17:14                   ` Igor Mammedov
  2020-07-27 23:59                     ` Babu Moger
  0 siblings, 1 reply; 23+ messages in thread
From: Igor Mammedov @ 2020-07-27 17:14 UTC (permalink / raw)
  To: Babu Moger; +Cc: pbonzini, rth, qemu-devel, ehabkost

On Mon, 27 Jul 2020 10:49:08 -0500
Babu Moger <babu.moger@amd.com> wrote:

> > -----Original Message-----
> > From: Igor Mammedov <imammedo@redhat.com>
> > Sent: Friday, July 24, 2020 12:05 PM
> > To: Moger, Babu <Babu.Moger@amd.com>
> > Cc: qemu-devel@nongnu.org; pbonzini@redhat.com; ehabkost@redhat.com;
> > rth@twiddle.net
> > Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> > CpuInstanceProperties
> > 
> > On Mon, 13 Jul 2020 14:30:29 -0500
> > Babu Moger <babu.moger@amd.com> wrote:
> >   
> > > > -----Original Message-----
> > > > From: Igor Mammedov <imammedo@redhat.com>
> > > > Sent: Monday, July 13, 2020 12:32 PM
> > > > To: Moger, Babu <Babu.Moger@amd.com>
> > > > Cc: pbonzini@redhat.com; rth@twiddle.net; ehabkost@redhat.com; qemu-
> > > > devel@nongnu.org
> > > > Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> > > > CpuInstanceProperties
> > > >
> > > > On Mon, 13 Jul 2020 11:43:33 -0500
> > > > Babu Moger <babu.moger@amd.com> wrote:
> > > >  
> > > > > On 7/13/20 11:17 AM, Igor Mammedov wrote:  
> > > > > > On Mon, 13 Jul 2020 10:02:22 -0500 Babu Moger
> > > > > > <babu.moger@amd.com> wrote:
> > > > > >  
> > > > > >>> -----Original Message-----
> > > > > >>> From: Igor Mammedov <imammedo@redhat.com>
> > > > > >>> Sent: Monday, July 13, 2020 4:08 AM
> > > > > >>> To: Moger, Babu <Babu.Moger@amd.com>
> > > > > >>> Cc: pbonzini@redhat.com; rth@twiddle.net; ehabkost@redhat.com;
> > > > > >>> qemu- devel@nongnu.org
> > > > > >>> Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> > > > > >>> CpuInstanceProperties  
> > > > > > [...]  
> > > > > >>>> +
> > > > > >>>> +/*
> > > > > >>>> + * Initialize topo_ids from CpuInstanceProperties
> > > > > >>>> + * node_id in CpuInstanceProperties(or in CPU device) is a
> > > > > >>>> +sequential
> > > > > >>>> + * number, but while building the topology  
> > > > > >>>  
> > > > > >>>> we need to separate it for
> > > > > >>>> + * each socket(mod nodes_per_pkg).  
> > > > > >>> could you clarify a bit more on why this is necessary?  
> > > > > >>
> > > > > >> If you have two sockets and 4 numa nodes, node_id in
> > > > > >> CpuInstanceProperties will be numbered sequentially as 0, 1, 2, 3.
> > > > > >> But in EPYC topology, it will be  0, 1, 0, 1( Basically mod %
> > > > > >> number of nodes  
> > > > per socket).  
> > > > > >
> > > > > > I'm confused, let's suppose we have 2 EPYC sockets with 2 nodes
> > > > > > per socket so APIC id would be composed like:
> > > > > >
> > > > > >  1st socket
> > > > > >    pkg_id(0) | node_id(0)
> > > > > >    pkg_id(0) | node_id(1)
> > > > > >
> > > > > >  2nd socket
> > > > > >    pkg_id(1) | node_id(0)
> > > > > >    pkg_id(1) | node_id(1)
> > > > > >
> > > > > > if that's the case, then EPYC's node_id here doesn't look like a
> > > > > > NUMA node in the sense it's usually used (above config would
> > > > > > have 4 different memory controllers => 4 conventional NUMA nodes).  
> > > > >
> > > > > EPYC model uses a combination of socket id and node id to identify
> > > > > the numa nodes. So, it internally uses all the information.  
> > > >
> > > > well with above values, EPYC's node_id doesn't look like it's
> > > > specifying a machine numa node, but rather a node index within
> > > > single socket. In which case, it doesn't make much sense calling it
> > > > NUMA node_id, it's rather some index within a socket. (it starts
> > > > looking like terminology is all mixed up)
> > > >
> > > > If you have access to a multi-socket EPYC machine, can you dump and
> > > > post here its apic ids, pls?  
> > >
> > > Here is the output from my EPYC machine with 2 sockets and totally 8
> > > nodes(SMT disabled). The cpus 0-31 are in socket 0 and  cpus 32-63 in
> > > socket 1.
> > >
> > > # lscpu
> > > Architecture:        x86_64
> > > CPU op-mode(s):      32-bit, 64-bit
> > > Byte Order:          Little Endian
> > > CPU(s):              64
> > > On-line CPU(s) list: 0-63
> > > Thread(s) per core:  1
> > > Core(s) per socket:  32
> > > Socket(s):           2
> > > NUMA node(s):        8
> > > Vendor ID:           AuthenticAMD
> > > CPU family:          23
> > > Model:               1
> > > Model name:          AMD Eng Sample: 1S1901A4VIHF5_30/19_N
> > > Stepping:            2
> > > CPU MHz:             2379.233
> > > CPU max MHz:         1900.0000
> > > CPU min MHz:         1200.0000
> > > BogoMIPS:            3792.81
> > > Virtualization:      AMD-V
> > > L1d cache:           32K
> > > L1i cache:           64K
> > > L2 cache:            512K
> > > L3 cache:            8192K
> > > NUMA node0 CPU(s):   0-7
> > > NUMA node1 CPU(s):   8-15
> > > NUMA node2 CPU(s):   16-23
> > > NUMA node3 CPU(s):   24-31
> > > NUMA node4 CPU(s):   32-39
> > > NUMA node5 CPU(s):   40-47
> > > NUMA node6 CPU(s):   48-55
> > > NUMA node7 CPU(s):   56-63
> > >
> > > Here is the output of #cpuid  -l 0x8000001e  -r  
> > 
> > 
> > (1)  
> > > You may want to refer
> > > https://www.amd.com/system/files/TechDocs/54945_3.03_ppr_ZP_B2_pub.zip
> > > (section 2.1.12.2.1.3 ApicId Enumeration Requirements).
> > > Note that this is a general guideline. We tried to generalize in qemu
> > > as much as possible. It is a bit complex.
> > 
> > 
> >   
> > > CPU 0:
> > >    0x8000001e 0x00: eax=0x00000000 ebx=0x00000100 ecx=0x00000300
> > > edx=0x00000000  
> > [...]  
> > > CPU 63:
> > >    0x8000001e 0x00: eax=0x0000007e ebx=0x0000011f ecx=0x00000307
> > > edx=0x00000000
> > >  
> > > >  
> > > > >  
> > > > > >
> > > > > > I wonder if linux guest actually uses node_id encoded in apic id
> > > > > > for configuring/checking numa structures, or it just uses
> > > > > > whatever ACPI SRAT table provided.
> > > > > >  
> > > > > >>>> + */
> > > > > >>>> +static inline void x86_init_topo_ids(X86CPUTopoInfo *topo_info,
> > > > > >>>> +                                     CpuInstanceProperties props,
> > > > > >>>> +                                     X86CPUTopoIDs *topo_ids) {
> > > > > >>>> +    topo_ids->smt_id = props.has_thread_id ? props.thread_id : 0;
> > > > > >>>> +    topo_ids->core_id = props.has_core_id ? props.core_id : 0;
> > > > > >>>> +    topo_ids->die_id = props.has_die_id ? props.die_id : 0;
> > > > > >>>> +    topo_ids->node_id = props.has_node_id ?
> > > > > >>>> +                        props.node_id %
> > > > > >>>> +MAX(topo_info->nodes_per_pkg, 1) : 0;  
> > 
> > It looks like I was wrong pushing the system-wide NUMA node-id into the APIC ID
> > (the chosen naming is a bit confusing); per [1] mentioned above, EPYC's node-id is:
> > 
> > • ApicId[6] = Socket ID.
> > • ApicId[5:4] = Node ID.
> > • ApicId[3] = Logical CCX L3 complex ID
> > • ApicId[2:0] = (SMT) ? {LogicalCoreID[1:0],ThreadId} : {1'b0,LogicalCoreID[1:0]}
> > 
> > which can hold only values 0-3, and is defined as:
> > 
> > "A node, is an integrated circuit device that includes one to 8 cores (one or two
> > Core Complexes)."
> > 
> > spec also mentions it indirectly as die-id if one looks at CPUID_Fn8000001E_EBX
> > [Core Identifiers] (Core::X86::Cpuid::CoreId) ...
> >   CoreId = ({2'b0, DieId[1:0], LogicalComplexId[0], LogicalThreadId[2:0]} >> SMT
> > 
> > and in
> > (2)
> > CPUID_Fn8000001E_ECX [Node Identifiers] (Core::X86::Cpuid::NodeId) ...
> >   {5'b00000,1'b[SOCKET_ID],2'b[DIE_ID]}
> > 
> > The question is why we did not reuse topo_ids->die_id instead of adding the
> > confusing topo_ids->node_id in the first place?
> 
> Initially, I thought about it. But Intel uses die_id differently than AMD,
> so I did not want to complicate things.
> If we take that route then we need to re-arrange the numa code, as we need
> the numa information to calculate the die id. So, I did not want to mix
> things up.
> 
> > 
> > Also, looking at the APIC ID and SRAT table provided here, CPUID_Fn8000001E_ECX
> > corresponds to the NUMA node id (i.e. what -numa in QEMU is used for) and the
> > Node ID embedded into ApicId[5:4] is basically the die-id.
> > 
> > The difference between the die-id implemented in QEMU and EPYC's die id
> > (topo_ids->node_id) is that the former doesn't require a numa config (maybe it
> > should, but that ship's already sailed) and gets the number of dies from the
> > '-smp dies=X' CLI option, while for EPYC we calculate topo_ids->node_id
> > implicitly from the number of numa nodes and sockets, which implicitly requires
> > that the machine 'must' be configured with -numa options.
> > 
> > Maybe we should drop this implicit calculation along with topo_ids->node_id
> > and reuse '-smp dies=X' plus extra checks for EPYC to ask for -numa if there is
> > more than 1 die, and if we need to be really strict, add checks for the limit
> > of dies/cores within a socket/die according to spec [2] so the encoded APIC ID
> > and CPUID_8000001E match the spec.
> 
> There will be complications when the user configures both die_id and
> numa_id. It will complicate things further. I will have to look closely at
> the code to see if it is feasible.

It's worth a try.
Conceptually, die_id in intel/amd is the same. Most likely intel has a dedicated
memory controller on each die, so it still should form a NUMA node. But that aspect
probably was ignored while implementing it in QEMU, so the burden of configuring QEMU
right is on the user's shoulders (there are no checks whatsoever that a cpu belonging
to a specific die is on the right NUMA node).

What AMD has implemented on top of that in CPU hw is to expose the NUMA node id in
CPUID_8000001E. I don't know why it was done, as usually it's the ACPI tables that
describe relations between nodes, so for the OS this info is almost useless (I'd guess
it's faster to use CPUID instead of fetching a per-cpu variable, but that's pretty
much it from the OS point of view).

> 
> > 
> > 
> >   
> > > > > >>>> +    topo_ids->pkg_id = props.has_socket_id ? props.socket_id :
> > > > > >>>> +0; }
> > > > > >>>>  /*
> > > > > >>>>   * Make APIC ID for the CPU 'cpu_index'
> > > > > >>>>   *
> > > > > >>>>  
> > > > > >>  
> > > > > >  
> > > > >  
> > >
> > >  
> 



^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH v2 1/3] hw/i386: Initialize topo_ids from CpuInstanceProperties
  2020-07-27 17:14                   ` Igor Mammedov
@ 2020-07-27 23:59                     ` Babu Moger
  2020-07-29 14:12                       ` Igor Mammedov
  0 siblings, 1 reply; 23+ messages in thread
From: Babu Moger @ 2020-07-27 23:59 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: pbonzini, rth, qemu-devel, ehabkost



> -----Original Message-----
> From: Igor Mammedov <imammedo@redhat.com>
> Sent: Monday, July 27, 2020 12:14 PM
> To: Moger, Babu <Babu.Moger@amd.com>
> Cc: qemu-devel@nongnu.org; pbonzini@redhat.com; ehabkost@redhat.com;
> rth@twiddle.net
> Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> CpuInstanceProperties
> 
> On Mon, 27 Jul 2020 10:49:08 -0500
> Babu Moger <babu.moger@amd.com> wrote:
> 
> > > -----Original Message-----
> > > From: Igor Mammedov <imammedo@redhat.com>
> > > Sent: Friday, July 24, 2020 12:05 PM
> > > To: Moger, Babu <Babu.Moger@amd.com>
> > > Cc: qemu-devel@nongnu.org; pbonzini@redhat.com;
> ehabkost@redhat.com;
> > > rth@twiddle.net
> > > Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> > > CpuInstanceProperties
> > >
> > > On Mon, 13 Jul 2020 14:30:29 -0500
> > > Babu Moger <babu.moger@amd.com> wrote:
> > >
> > > > > -----Original Message-----
> > > > > From: Igor Mammedov <imammedo@redhat.com>
> > > > > Sent: Monday, July 13, 2020 12:32 PM
> > > > > To: Moger, Babu <Babu.Moger@amd.com>
> > > > > Cc: pbonzini@redhat.com; rth@twiddle.net; ehabkost@redhat.com;
> > > > > qemu- devel@nongnu.org
> > > > > Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> > > > > CpuInstanceProperties
> > > > >
> > > > > On Mon, 13 Jul 2020 11:43:33 -0500 Babu Moger
> > > > > <babu.moger@amd.com> wrote:
> > > > >
> > > > > > On 7/13/20 11:17 AM, Igor Mammedov wrote:
> > > > > > > On Mon, 13 Jul 2020 10:02:22 -0500 Babu Moger
> > > > > > > <babu.moger@amd.com> wrote:
> > > > > > >
> > > > > > >>> -----Original Message-----
> > > > > > >>> From: Igor Mammedov <imammedo@redhat.com>
> > > > > > >>> Sent: Monday, July 13, 2020 4:08 AM
> > > > > > >>> To: Moger, Babu <Babu.Moger@amd.com>
> > > > > > >>> Cc: pbonzini@redhat.com; rth@twiddle.net;
> > > > > > >>> ehabkost@redhat.com;
> > > > > > >>> qemu- devel@nongnu.org
> > > > > > >>> Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids
> > > > > > >>> from CpuInstanceProperties
> > > > > > > [...]
> > > > > > >>>> +
> > > > > > >>>> +/*
> > > > > > >>>> + * Initialize topo_ids from CpuInstanceProperties
> > > > > > >>>> + * node_id in CpuInstanceProperties(or in CPU device) is
> > > > > > >>>> +a sequential
> > > > > > >>>> + * number, but while building the topology
> > > > > > >>>
> > > > > > >>>> we need to separate it for
> > > > > > >>>> + * each socket(mod nodes_per_pkg).
> > > > > > >>> could you clarify a bit more on why this is necessary?
> > > > > > >>
> > > > > > >> If you have two sockets and 4 numa nodes, node_id in
> > > > > > >> CpuInstanceProperties will be numbered sequentially as 0, 1, 2, 3.
> > > > > > >> But in EPYC topology, it will be 0, 1, 0, 1 (basically mod
> > > > > > >> % number of nodes per socket).
> > > > > > >
> > > > > > > I'm confused, let's suppose we have 2 EPYC sockets with 2
> > > > > > > nodes per socket so APIC id would be composed like:
> > > > > > >
> > > > > > >  1st socket
> > > > > > >    pkg_id(0) | node_id(0)
> > > > > > >    pkg_id(0) | node_id(1)
> > > > > > >
> > > > > > >  2nd socket
> > > > > > >    pkg_id(1) | node_id(0)
> > > > > > >    pkg_id(1) | node_id(1)
> > > > > > >
> > > > > > > if that's the case, then EPYC's node_id here doesn't look
> > > > > > > like a NUMA node in the sense it's usually used (above
> > > > > > > config would have 4 different memory controllers => 4
> > > > > > > conventional NUMA nodes).
> > > > > >
> > > > > > EPYC model uses a combination of socket id and node id to
> > > > > > identify the numa nodes. So, internally it uses all the information.
> > > > >
> > > > > well, with the above values, EPYC's node_id doesn't look like it's
> > > > > specifying a machine numa node, but rather a node index within a
> > > > > single socket. In which case, it doesn't make much sense calling
> > > > > it NUMA node_id; it's rather some index within a socket. (it
> > > > > starts looking like the terminology is all mixed up)
> > > > >
> > > > > If you have access to a multi-socket EPYC machine, can you dump
> > > > > and post its apic ids here, please?
> > > >
> > > > Here is the output from my EPYC machine with 2 sockets and 8 nodes
> > > > in total (SMT disabled). The cpus 0-31 are in socket 0 and cpus
> > > > 32-63 are in socket 1.
> > > >
> > > > # lscpu
> > > > Architecture:        x86_64
> > > > CPU op-mode(s):      32-bit, 64-bit
> > > > Byte Order:          Little Endian
> > > > CPU(s):              64
> > > > On-line CPU(s) list: 0-63
> > > > Thread(s) per core:  1
> > > > Core(s) per socket:  32
> > > > Socket(s):           2
> > > > NUMA node(s):        8
> > > > Vendor ID:           AuthenticAMD
> > > > CPU family:          23
> > > > Model:               1
> > > > Model name:          AMD Eng Sample: 1S1901A4VIHF5_30/19_N
> > > > Stepping:            2
> > > > CPU MHz:             2379.233
> > > > CPU max MHz:         1900.0000
> > > > CPU min MHz:         1200.0000
> > > > BogoMIPS:            3792.81
> > > > Virtualization:      AMD-V
> > > > L1d cache:           32K
> > > > L1i cache:           64K
> > > > L2 cache:            512K
> > > > L3 cache:            8192K
> > > > NUMA node0 CPU(s):   0-7
> > > > NUMA node1 CPU(s):   8-15
> > > > NUMA node2 CPU(s):   16-23
> > > > NUMA node3 CPU(s):   24-31
> > > > NUMA node4 CPU(s):   32-39
> > > > NUMA node5 CPU(s):   40-47
> > > > NUMA node6 CPU(s):   48-55
> > > > NUMA node7 CPU(s):   56-63
> > > >
> > > > Here is the output of #cpuid  -l 0x8000001e  -r
> > >
> > >
> > > (1)
> > > > You may want to refer
> > > > https://www.amd.com/system/files/TechDocs/54945_3.03_ppr_ZP_B2_pub.zip
> > > > (section 2.1.12.2.1.3 ApicId Enumeration Requirements).
> > > > Note that this is a general guideline. We tried to generalize in
> > > > qemu as much as possible. It is a bit complex.
> > >
> > >
> > >
> > > > CPU 0:
> > > >    0x8000001e 0x00: eax=0x00000000 ebx=0x00000100 ecx=0x00000300
> > > > edx=0x00000000
> > > [...]
> > > > CPU 63:
> > > >    0x8000001e 0x00: eax=0x0000007e ebx=0x0000011f ecx=0x00000307
> > > > edx=0x00000000
> > > >
> > > > >
> > > > > >
> > > > > > >
> > > > > > > I wonder if linux guest actually uses node_id encoded in
> > > > > > > apic id for configuring/checking numa structures, or it just
> > > > > > > uses whatever ACPI SRAT table provided.
> > > > > > >
> > > > > > >>>> + */
> > > > > > >>>> +static inline void x86_init_topo_ids(X86CPUTopoInfo
> *topo_info,
> > > > > > >>>> +                                     CpuInstanceProperties props,
> > > > > > >>>> +                                     X86CPUTopoIDs *topo_ids) {
> > > > > > >>>> +    topo_ids->smt_id = props.has_thread_id ? props.thread_id : 0;
> > > > > > >>>> +    topo_ids->core_id = props.has_core_id ? props.core_id : 0;
> > > > > > >>>> +    topo_ids->die_id = props.has_die_id ? props.die_id : 0;
> > > > > > >>>> +    topo_ids->node_id = props.has_node_id ?
> > > > > > >>>> +                        props.node_id %
> > > > > > >>>> +MAX(topo_info->nodes_per_pkg, 1) : 0;
> > >
> > > It looks like I was wrong pushing the system-wide NUMA node-id into the
> > > APIC ID (the chosen naming is a bit confusing); per [1] mentioned above,
> > > EPYC's node-id is:
> > >
> > > • ApicId[6] = Socket ID.
> > > • ApicId[5:4] = Node ID.
> > > • ApicId[3] = Logical CCX L3 complex ID
> > > • ApicId[2:0] = (SMT) ? {LogicalCoreID[1:0],ThreadId} : {1'b0,LogicalCoreID[1:0]}
> > >
> > > which can hold only values 0-3, and is defined as:
> > >
> > > "A node, is an integrated circuit device that includes one to 8
> > > cores (one or two Core Complexes)."
> > >
> > > spec also mentions it indirectly as die-id if one looks at
> > > CPUID_Fn8000001E_EBX [Core Identifiers] (Core::X86::Cpuid::CoreId) ...
> > >   CoreId = ({2'b0, DieId[1:0], LogicalComplexId[0],
> > > LogicalThreadId[2:0]} >> SMT
> > >
> > > and in
> > > (2)
> > > CPUID_Fn8000001E_ECX [Node Identifiers] (Core::X86::Cpuid::NodeId) ...
> > >   {5'b00000,1'b[SOCKET_ID],2'b[DIE_ID]}
> > >
> > > The question is why we did not reuse topo_ids->die_id instead of adding
> > > the confusing topo_ids->node_id in the first place?
> >
> > Initially, I thought about it. But Intel uses die_id differently than AMD,
> > so I did not want to complicate things.
> > If we take that route then we need to re-arrange the numa code, as we
> > need the numa information to calculate the die id. So, I did not want to
> > mix things up.
> >
> > >
> > > Also, looking at the APIC ID and SRAT table provided here,
> > > CPUID_Fn8000001E_ECX corresponds to the NUMA node id (i.e. what -numa in
> > > QEMU is used for) and the Node ID embedded into ApicId[5:4] is basically
> > > the die-id.
> > >
> > > The difference between the die-id implemented in QEMU and EPYC's die id
> > > (topo_ids->node_id) is that the former doesn't require a numa config
> > > (maybe it should, but that ship's already sailed) and gets the number of
> > > dies from the '-smp dies=X' CLI option, while for EPYC we calculate
> > > topo_ids->node_id implicitly from the number of numa nodes and sockets,
> > > which implicitly requires that the machine 'must' be configured with
> > > -numa options.
> > >
> > > Maybe we should drop this implicit calculation along with
> > > topo_ids->node_id and reuse '-smp dies=X' plus extra checks for EPYC
> > > to ask for -numa if there is more than 1 die, and if we need to be
> > > really strict, add checks for the limit of dies/cores within a
> > > socket/die according to spec [2] so the encoded APIC ID and
> > > CPUID_8000001E match the spec.
> >
> > There will be complications when the user configures both die_id and
> > numa_id. It will complicate things further. I will have to look
> > closely at the code to see if it is feasible.
> 
> It's worth a try.
> Conceptually, die_id in intel/amd is the same. Most likely intel has a dedicated
> memory controller on each die, so it still should form a NUMA node. But that
> aspect probably was ignored while implementing it in QEMU, so the burden of
> configuring QEMU right is on the user's shoulders (there are no checks whatsoever
> that a cpu belonging to a specific die is on the right NUMA node).

So you are suggesting using die_id to build the topology for EPYC: also
initialize die_id based on the numa information, re-arrange the numa code
to make sure we have all the information before we build the topology, and
then remove the node_id inside X86CPUTopoIDs. Is that the plan?
> 
> > What AMD has implemented on top of that in CPU hw is to expose the NUMA node
> > id in CPUID_8000001E. I don't know why it was done, as usually it's the ACPI
> > tables that describe relations between nodes, so for the OS this info is
> > almost useless (I'd guess it's faster to use CPUID instead of fetching a
> > per-cpu variable, but that's pretty much it from the OS point of view).
> 
> >
> > >
> > >
> > >
> > > > > > >>>> +    topo_ids->pkg_id = props.has_socket_id ? props.socket_id :
> > > > > > >>>> +0; }
> > > > > > >>>>  /*
> > > > > > >>>>   * Make APIC ID for the CPU 'cpu_index'
> > > > > > >>>>   *
> > > > > > >>>>
> > > > > > >>
> > > > > > >
> > > > > >
> > > >
> > > >
> >



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from CpuInstanceProperties
  2020-07-27 23:59                     ` Babu Moger
@ 2020-07-29 14:12                       ` Igor Mammedov
  2020-07-29 21:22                         ` Babu Moger
  0 siblings, 1 reply; 23+ messages in thread
From: Igor Mammedov @ 2020-07-29 14:12 UTC (permalink / raw)
  To: Babu Moger; +Cc: pbonzini, qemu-devel, ehabkost, rth

On Mon, 27 Jul 2020 18:59:42 -0500
Babu Moger <babu.moger@amd.com> wrote:

> > -----Original Message-----
> > From: Igor Mammedov <imammedo@redhat.com>
> > Sent: Monday, July 27, 2020 12:14 PM
> > To: Moger, Babu <Babu.Moger@amd.com>
> > Cc: qemu-devel@nongnu.org; pbonzini@redhat.com; ehabkost@redhat.com;
> > rth@twiddle.net
> > Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> > CpuInstanceProperties
> > 
> > On Mon, 27 Jul 2020 10:49:08 -0500
> > Babu Moger <babu.moger@amd.com> wrote:
> >   
> > > > -----Original Message-----
> > > > From: Igor Mammedov <imammedo@redhat.com>
> > > > Sent: Friday, July 24, 2020 12:05 PM
> > > > To: Moger, Babu <Babu.Moger@amd.com>
> > > > Cc: qemu-devel@nongnu.org; pbonzini@redhat.com;  
> > ehabkost@redhat.com;  
> > > > rth@twiddle.net
> > > > Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> > > > CpuInstanceProperties
> > > >
> > > > On Mon, 13 Jul 2020 14:30:29 -0500
> > > > Babu Moger <babu.moger@amd.com> wrote:
> > > >  
> > > > > > -----Original Message-----
> > > > > > From: Igor Mammedov <imammedo@redhat.com>
> > > > > > Sent: Monday, July 13, 2020 12:32 PM
> > > > > > To: Moger, Babu <Babu.Moger@amd.com>
> > > > > > Cc: pbonzini@redhat.com; rth@twiddle.net; ehabkost@redhat.com;
> > > > > > qemu- devel@nongnu.org
> > > > > > Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> > > > > > CpuInstanceProperties
> > > > > >
> > > > > > On Mon, 13 Jul 2020 11:43:33 -0500 Babu Moger
> > > > > > <babu.moger@amd.com> wrote:
> > > > > >  
> > > > > > > On 7/13/20 11:17 AM, Igor Mammedov wrote:  
> > > > > > > > On Mon, 13 Jul 2020 10:02:22 -0500 Babu Moger
> > > > > > > > <babu.moger@amd.com> wrote:
> > > > > > > >  
> > > > > > > >>> -----Original Message-----
> > > > > > > >>> From: Igor Mammedov <imammedo@redhat.com>
> > > > > > > >>> Sent: Monday, July 13, 2020 4:08 AM
> > > > > > > >>> To: Moger, Babu <Babu.Moger@amd.com>
> > > > > > > >>> Cc: pbonzini@redhat.com; rth@twiddle.net;
> > > > > > > >>> ehabkost@redhat.com;
> > > > > > > >>> qemu- devel@nongnu.org
> > > > > > > >>> Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids
> > > > > > > >>> from CpuInstanceProperties  
> > > > > > > > [...]  
> > > > > > > >>>> +
> > > > > > > >>>> +/*
> > > > > > > >>>> + * Initialize topo_ids from CpuInstanceProperties
> > > > > > > >>>> + * node_id in CpuInstanceProperties(or in CPU device) is
> > > > > > > >>>> +a sequential
> > > > > > > >>>> + * number, but while building the topology  
> > > > > > > >>>  
> > > > > > > >>>> we need to separate it for
> > > > > > > >>>> + * each socket(mod nodes_per_pkg).  
> > > > > > > >>> could you clarify a bit more on why this is necessary?  
> > > > > > > >>
> > > > > > > >> If you have two sockets and 4 numa nodes, node_id in
> > > > > > > >> CpuInstanceProperties will be numbered sequentially as 0, 1, 2, 3.
> > > > > > > >> But in EPYC topology, it will be 0, 1, 0, 1 (basically mod
> > > > > > > >> % number of nodes per socket).
> > > > > > > >
> > > > > > > > I'm confused, let's suppose we have 2 EPYC sockets with 2
> > > > > > > > nodes per socket so APIC id would be composed like:
> > > > > > > >
> > > > > > > >  1st socket
> > > > > > > >    pkg_id(0) | node_id(0)
> > > > > > > >    pkg_id(0) | node_id(1)
> > > > > > > >
> > > > > > > >  2nd socket
> > > > > > > >    pkg_id(1) | node_id(0)
> > > > > > > >    pkg_id(1) | node_id(1)
> > > > > > > >
> > > > > > > > if that's the case, then EPYC's node_id here doesn't look
> > > > > > > > like a NUMA node in the sense it's usually used (above
> > > > > > > > config would have 4 different memory controllers => 4
> > > > > > > > conventional NUMA nodes).
> > > > > > >
> > > > > > > EPYC model uses a combination of socket id and node id to
> > > > > > > identify the numa nodes. So, internally it uses all the information.
> > > > > >
> > > > > > well, with the above values, EPYC's node_id doesn't look like it's
> > > > > > specifying a machine numa node, but rather a node index within a
> > > > > > single socket. In which case, it doesn't make much sense calling
> > > > > > it NUMA node_id; it's rather some index within a socket. (it
> > > > > > starts looking like the terminology is all mixed up)
> > > > > >
> > > > > > If you have access to a multi-socket EPYC machine, can you dump
> > > > > > and post its apic ids here, please?
> > > > >
> > > > > Here is the output from my EPYC machine with 2 sockets and 8 nodes
> > > > > in total (SMT disabled). The cpus 0-31 are in socket 0 and cpus
> > > > > 32-63 are in socket 1.
> > > > >
> > > > > # lscpu
> > > > > Architecture:        x86_64
> > > > > CPU op-mode(s):      32-bit, 64-bit
> > > > > Byte Order:          Little Endian
> > > > > CPU(s):              64
> > > > > On-line CPU(s) list: 0-63
> > > > > Thread(s) per core:  1
> > > > > Core(s) per socket:  32
> > > > > Socket(s):           2
> > > > > NUMA node(s):        8
> > > > > Vendor ID:           AuthenticAMD
> > > > > CPU family:          23
> > > > > Model:               1
> > > > > Model name:          AMD Eng Sample: 1S1901A4VIHF5_30/19_N
> > > > > Stepping:            2
> > > > > CPU MHz:             2379.233
> > > > > CPU max MHz:         1900.0000
> > > > > CPU min MHz:         1200.0000
> > > > > BogoMIPS:            3792.81
> > > > > Virtualization:      AMD-V
> > > > > L1d cache:           32K
> > > > > L1i cache:           64K
> > > > > L2 cache:            512K
> > > > > L3 cache:            8192K
> > > > > NUMA node0 CPU(s):   0-7
> > > > > NUMA node1 CPU(s):   8-15
> > > > > NUMA node2 CPU(s):   16-23
> > > > > NUMA node3 CPU(s):   24-31
> > > > > NUMA node4 CPU(s):   32-39
> > > > > NUMA node5 CPU(s):   40-47
> > > > > NUMA node6 CPU(s):   48-55
> > > > > NUMA node7 CPU(s):   56-63
> > > > >
> > > > > Here is the output of #cpuid  -l 0x8000001e  -r  
> > > >
> > > >
> > > > (1)  
> > > > > You may want to refer
> > > > > https://www.amd.com/system/files/TechDocs/54945_3.03_ppr_ZP_B2_pub.zip
> > > > > (section 2.1.12.2.1.3 ApicId Enumeration Requirements).
> > > > > Note that this is a general guideline. We tried to generalize in
> > > > > qemu as much as possible. It is a bit complex.
> > > >
> > > >
> > > >  
> > > > > CPU 0:
> > > > >    0x8000001e 0x00: eax=0x00000000 ebx=0x00000100 ecx=0x00000300
> > > > > edx=0x00000000  
> > > > [...]  
> > > > > CPU 63:
> > > > >    0x8000001e 0x00: eax=0x0000007e ebx=0x0000011f ecx=0x00000307
> > > > > edx=0x00000000
> > > > >  
> > > > > >  
> > > > > > >  
> > > > > > > >
> > > > > > > > I wonder if linux guest actually uses node_id encoded in
> > > > > > > > apic id for configuring/checking numa structures, or it just
> > > > > > > > uses whatever ACPI SRAT table provided.
> > > > > > > >  
> > > > > > > >>>> + */
> > > > > > > >>>> +static inline void x86_init_topo_ids(X86CPUTopoInfo  
> > *topo_info,  
> > > > > > > >>>> +                                     CpuInstanceProperties props,
> > > > > > > >>>> +                                     X86CPUTopoIDs *topo_ids) {
> > > > > > > >>>> +    topo_ids->smt_id = props.has_thread_id ? props.thread_id : 0;
> > > > > > > >>>> +    topo_ids->core_id = props.has_core_id ? props.core_id : 0;
> > > > > > > >>>> +    topo_ids->die_id = props.has_die_id ? props.die_id : 0;
> > > > > > > >>>> +    topo_ids->node_id = props.has_node_id ?
> > > > > > > >>>> +                        props.node_id %
> > > > > > > >>>> +MAX(topo_info->nodes_per_pkg, 1) : 0;  
> > > >
> > > > It looks like I was wrong pushing the system-wide NUMA node-id into the
> > > > APIC ID (the chosen naming is a bit confusing); per [1] mentioned above,
> > > > EPYC's node-id is:
> > > >
> > > > • ApicId[6] = Socket ID.
> > > > • ApicId[5:4] = Node ID.
> > > > • ApicId[3] = Logical CCX L3 complex ID
> > > > • ApicId[2:0] = (SMT) ? {LogicalCoreID[1:0],ThreadId} : {1'b0,LogicalCoreID[1:0]}
> > > >
> > > > which can hold only values 0-3, and is defined as:
> > > >
> > > > "A node, is an integrated circuit device that includes one to 8
> > > > cores (one or two Core Complexes)."
> > > >
> > > > spec also mentions it indirectly as die-id if one looks at
> > > > CPUID_Fn8000001E_EBX [Core Identifiers] (Core::X86::Cpuid::CoreId) ...
> > > >   CoreId = ({2'b0, DieId[1:0], LogicalComplexId[0],
> > > > LogicalThreadId[2:0]} >> SMT
> > > >
> > > > and in
> > > > (2)
> > > > CPUID_Fn8000001E_ECX [Node Identifiers] (Core::X86::Cpuid::NodeId) ...
> > > >   {5'b00000,1'b[SOCKET_ID],2'b[DIE_ID]}
> > > >
> > > > The question is why we did not reuse topo_ids->die_id instead of adding
> > > > the confusing topo_ids->node_id in the first place?
> > >
> > > Initially, I thought about it. But Intel uses die_id differently than AMD,
> > > so I did not want to complicate things.
> > > If we take that route then we need to re-arrange the numa code, as we
> > > need the numa information to calculate the die id. So, I did not want to
> > > mix things up.
> > >  
> > > >
> > > > Also, looking at the APIC ID and SRAT table provided here,
> > > > CPUID_Fn8000001E_ECX corresponds to the NUMA node id (i.e. what -numa in
> > > > QEMU is used for) and the Node ID embedded into ApicId[5:4] is basically
> > > > the die-id.
> > > >
> > > > The difference between the die-id implemented in QEMU and EPYC's die id
> > > > (topo_ids->node_id) is that the former doesn't require a numa config
> > > > (maybe it should, but that ship's already sailed) and gets the number of
> > > > dies from the '-smp dies=X' CLI option, while for EPYC we calculate
> > > > topo_ids->node_id implicitly from the number of numa nodes and sockets,
> > > > which implicitly requires that the machine 'must' be configured with
> > > > -numa options.
> > > >
> > > > Maybe we should drop this implicit calculation along with
> > > > topo_ids->node_id and reuse '-smp dies=X' plus extra checks for EPYC
> > > > to ask for -numa if there is more than 1 die, and if we need to be
> > > > really strict, add checks for the limit of dies/cores within a
> > > > socket/die according to spec [2] so the encoded APIC ID and
> > > > CPUID_8000001E match the spec.
> > >
> > > There will be complications when the user configures both die_id and
> > > numa_id. It will complicate things further. I will have to look
> > > closely at the code to see if it is feasible.
> > 
> > It's worth a try.
> > Conceptually, die_id in intel/amd is the same. Most likely intel has a dedicated
> > memory controller on each die, so it still should form a NUMA node. But that
> > aspect probably was ignored while implementing it in QEMU, so the burden of
> > configuring QEMU right is on the user's shoulders (there are no checks
> > whatsoever that a cpu belonging to a specific die is on the right NUMA node).
> 
> So you are suggesting using die_id to build the topology for EPYC: also
> initialize die_id based on the numa information, re-arrange the numa code
> to make sure we have all the information before we build the topology, and
> then remove the node_id inside X86CPUTopoIDs. Is that the plan?
Reusing die_id might simplify the logic, and at the very least we won't have 2 very
similar fields to deal with. With die_id it should be conditional on EPYC.

Regardless of using die_id, we should

(1) error out if the topology will require more than 1 numa node and no numa config
    was provided.
(2) for 1 NUMA node, use autonuma to create 1 node implicitly; that requires
    converting the static MachineClass::auto_enable_numa into an instance-specific
    value, i.e. moving it into MachineState, so that we can change it at runtime
    depending on the CPU type.
(3) use the NUMA id from CPU::node-id for CPUID_8000001E and have checks that
    ensure the used value fits in the CPUID leaf.

    


> > What AMD has implemented on top of that in CPU hw is to expose the NUMA node
> > id in CPUID_8000001E. I don't know why it was done, as usually it's the ACPI
> > tables that describe relations between nodes, so for the OS this info is
> > almost useless (I'd guess it's faster to use CPUID instead of fetching a
> > per-cpu variable, but that's pretty much it from the OS point of view).
> >   
> > >  
> > > >
> > > >
> > > >  
> > > > > > > >>>> +    topo_ids->pkg_id = props.has_socket_id ? props.socket_id :
> > > > > > > >>>> +0; }
> > > > > > > >>>>  /*
> > > > > > > >>>>   * Make APIC ID for the CPU 'cpu_index'
> > > > > > > >>>>   *
> > > > > > > >>>>  
> > > > > > > >>  
> > > > > > > >  
> > > > > > >  
> > > > >
> > > > >  
> > >  
> 
> 



^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH v2 1/3] hw/i386: Initialize topo_ids from CpuInstanceProperties
  2020-07-29 14:12                       ` Igor Mammedov
@ 2020-07-29 21:22                         ` Babu Moger
  2020-07-30 11:27                           ` Igor Mammedov
  0 siblings, 1 reply; 23+ messages in thread
From: Babu Moger @ 2020-07-29 21:22 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: pbonzini, qemu-devel, ehabkost, rth

Igor,
Sorry, a few more questions.

> -----Original Message-----
> From: Igor Mammedov <imammedo@redhat.com>
> Sent: Wednesday, July 29, 2020 9:12 AM
> To: Moger, Babu <Babu.Moger@amd.com>
> Cc: pbonzini@redhat.com; rth@twiddle.net; qemu-devel@nongnu.org;
> ehabkost@redhat.com
> Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> CpuInstanceProperties
> 
> On Mon, 27 Jul 2020 18:59:42 -0500
> Babu Moger <babu.moger@amd.com> wrote:
> 
> > > -----Original Message-----
> > > From: Igor Mammedov <imammedo@redhat.com>
> > > Sent: Monday, July 27, 2020 12:14 PM
> > > To: Moger, Babu <Babu.Moger@amd.com>
> > > Cc: qemu-devel@nongnu.org; pbonzini@redhat.com;
> ehabkost@redhat.com;
> > > rth@twiddle.net
> > > Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> > > CpuInstanceProperties
> > >
> > > On Mon, 27 Jul 2020 10:49:08 -0500
> > > Babu Moger <babu.moger@amd.com> wrote:
> > >
> > > > > -----Original Message-----
> > > > > From: Igor Mammedov <imammedo@redhat.com>
> > > > > Sent: Friday, July 24, 2020 12:05 PM
> > > > > To: Moger, Babu <Babu.Moger@amd.com>
> > > > > Cc: qemu-devel@nongnu.org; pbonzini@redhat.com;
> > > ehabkost@redhat.com;
> > > > > rth@twiddle.net
> > > > > Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> > > > > CpuInstanceProperties
> > > > >
> > > > > On Mon, 13 Jul 2020 14:30:29 -0500 Babu Moger
> > > > > <babu.moger@amd.com> wrote:
> > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Igor Mammedov <imammedo@redhat.com>
> > > > > > > Sent: Monday, July 13, 2020 12:32 PM
> > > > > > > To: Moger, Babu <Babu.Moger@amd.com>
> > > > > > > Cc: pbonzini@redhat.com; rth@twiddle.net;
> > > > > > > ehabkost@redhat.com;
> > > > > > > qemu- devel@nongnu.org
> > > > > > > Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids
> > > > > > > from CpuInstanceProperties
> > > > > > >
> > > > > > > On Mon, 13 Jul 2020 11:43:33 -0500 Babu Moger
> > > > > > > <babu.moger@amd.com> wrote:
> > > > > > >
> > > > > > > > On 7/13/20 11:17 AM, Igor Mammedov wrote:
> > > > > > > > > On Mon, 13 Jul 2020 10:02:22 -0500 Babu Moger
> > > > > > > > > <babu.moger@amd.com> wrote:
> > > > > > > > >
> > > > > > > > >>> -----Original Message-----
> > > > > > > > >>> From: Igor Mammedov <imammedo@redhat.com>
> > > > > > > > >>> Sent: Monday, July 13, 2020 4:08 AM
> > > > > > > > >>> To: Moger, Babu <Babu.Moger@amd.com>
> > > > > > > > >>> Cc: pbonzini@redhat.com; rth@twiddle.net;
> > > > > > > > >>> ehabkost@redhat.com;
> > > > > > > > >>> qemu- devel@nongnu.org
> > > > > > > > >>> Subject: Re: [PATCH v2 1/3] hw/i386: Initialize
> > > > > > > > >>> topo_ids from CpuInstanceProperties
> > > > > > > > > [...]
> > > > > > > > >>>> +
> > > > > > > > >>>> +/*
> > > > > > > > >>>> + * Initialize topo_ids from CpuInstanceProperties
> > > > > > > > >>>> + * node_id in CpuInstanceProperties(or in CPU
> > > > > > > > >>>> +device) is a sequential
> > > > > > > > >>>> + * number, but while building the topology
> > > > > > > > >>>
> > > > > > > > >>>> we need to separate it for
> > > > > > > > >>>> + * each socket(mod nodes_per_pkg).
> > > > > > > > >>> could you clarify a bit more on why this is necessary?
> > > > > > > > >>
> > > > > > > > >> If you have two sockets and 4 numa nodes, node_id in
> > > > > > > > >> CpuInstanceProperties will be number sequentially as 0, 1, 2, 3.
> > > > > > > > >> But in EPYC topology, it will be  0, 1, 0, 1( Basically
> > > > > > > > >> mod % number of nodes
> > > > > > > per socket).
> > > > > > > > >
> > > > > > > > > I'm confused, let's suppose we have 2 EPYC sockets with
> > > > > > > > > 2 nodes per socket so APIC id woulbe be composed like:
> > > > > > > > >
> > > > > > > > >  1st socket
> > > > > > > > >    pkg_id(0) | node_id(0)
> > > > > > > > >    pkg_id(0) | node_id(1)
> > > > > > > > >
> > > > > > > > >  2nd socket
> > > > > > > > >    pkg_id(1) | node_id(0)
> > > > > > > > >    pkg_id(1) | node_id(1)
> > > > > > > > >
> > > > > > > > > if that's the case, then EPYC's node_id here doesn't
> > > > > > > > > look like a NUMA node in the sense it's usually used
> > > > > > > > > (above config would have 4 different memory controllers
> > > > > > > > > => 4 conventional
> > > NUMA nodes).
> > > > > > > >
> > > > > > > > EPIC model uses combination of socket id and node id to
> > > > > > > > identify the numa nodes. So, it internally uses all the information.
> > > > > > >
> > > > > > > well with above values, EPYC's node_id doesn't look like
> > > > > > > it's specifying a machine numa node, but rather a node index
> > > > > > > within single socket. In which case, it doesn't make much
> > > > > > > sense calling it NUMA node_id, it's rather some index within
> > > > > > > a socket. (it starts looking like terminology is all mixed
> > > > > > > up)
> > > > > > >
> > > > > > > If you have access to a milti-socket EPYC machine, can you
> > > > > > > dump and post here its apic ids, pls?
> > > > > >
> > > > > > Here is the output from my EPYC machine with 2 sockets and
> > > > > > totally
> > > > > > 8 nodes(SMT disabled). The cpus 0-31 are in socket 0 and  cpus
> > > > > > 32-63 in socket 1.
> > > > > >
> > > > > > # lscpu
> > > > > > Architecture:        x86_64
> > > > > > CPU op-mode(s):      32-bit, 64-bit
> > > > > > Byte Order:          Little Endian
> > > > > > CPU(s):              64
> > > > > > On-line CPU(s) list: 0-63
> > > > > > Thread(s) per core:  1
> > > > > > Core(s) per socket:  32
> > > > > > Socket(s):           2
> > > > > > NUMA node(s):        8
> > > > > > Vendor ID:           AuthenticAMD
> > > > > > CPU family:          23
> > > > > > Model:               1
> > > > > > Model name:          AMD Eng Sample: 1S1901A4VIHF5_30/19_N
> > > > > > Stepping:            2
> > > > > > CPU MHz:             2379.233
> > > > > > CPU max MHz:         1900.0000
> > > > > > CPU min MHz:         1200.0000
> > > > > > BogoMIPS:            3792.81
> > > > > > Virtualization:      AMD-V
> > > > > > L1d cache:           32K
> > > > > > L1i cache:           64K
> > > > > > L2 cache:            512K
> > > > > > L3 cache:            8192K
> > > > > > NUMA node0 CPU(s):   0-7
> > > > > > NUMA node1 CPU(s):   8-15
> > > > > > NUMA node2 CPU(s):   16-23
> > > > > > NUMA node3 CPU(s):   24-31
> > > > > > NUMA node4 CPU(s):   32-39
> > > > > > NUMA node5 CPU(s):   40-47
> > > > > > NUMA node6 CPU(s):   48-55
> > > > > > NUMA node7 CPU(s):   56-63
> > > > > >
> > > > > > Here is the output of #cpuid  -l 0x8000001e  -r
> > > > >
> > > > >
> > > > > (1)
> > > > > > You may want to refer to
> > > > > > https://www.amd.com/system/files/TechDocs/54945_3.03_ppr_ZP_B2_pub.zip
> > > > > > (section 2.1.12.2.1.3 ApicId Enumeration Requirements).
> > > > > > Note that this is a general guideline. We tried to generalize
> > > > > > in qemu as much as possible. It is bit complex.
> > > > >
> > > > >
> > > > >
> > > > > > CPU 0:
> > > > > >    0x8000001e 0x00: eax=0x00000000 ebx=0x00000100
> > > > > > ecx=0x00000300
> > > > > > edx=0x00000000
> > > > > [...]
> > > > > > CPU 63:
> > > > > >    0x8000001e 0x00: eax=0x0000007e ebx=0x0000011f
> > > > > > ecx=0x00000307
> > > > > > edx=0x00000000
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > I wonder if linux guest actually uses node_id encoded in
> > > > > > > > > apic id for configuring/checking numa structures, or it
> > > > > > > > > just uses whatever ACPI SRAT table provided.
> > > > > > > > >
> > > > > > > > >>>> + */
> > > > > > > > >>>> +static inline void x86_init_topo_ids(X86CPUTopoInfo
> > > *topo_info,
> > > > > > > > >>>> +                                     CpuInstanceProperties props,
> > > > > > > > >>>> +                                     X86CPUTopoIDs *topo_ids) {
> > > > > > > > >>>> +    topo_ids->smt_id = props.has_thread_id ? props.thread_id
> : 0;
> > > > > > > > >>>> +    topo_ids->core_id = props.has_core_id ? props.core_id : 0;
> > > > > > > > >>>> +    topo_ids->die_id = props.has_die_id ? props.die_id : 0;
> > > > > > > > >>>> +    topo_ids->node_id = props.has_node_id ?
> > > > > > > > >>>> +                        props.node_id %
> > > > > > > > >>>> +MAX(topo_info->nodes_per_pkg, 1) : 0;
> > > > >
> > > > > It looks like I was wrong pushing system wide NUMA node-id into
> > > > > APIC ID (choosen naming is confusing a bit), per [1] mentioned
> > > > > above, EPYC's
> > > node-id is:
> > > > >
> > > > > • ApicId[6] = Socket ID.
> > > > > * ApicId[5:4]= Node ID.
> > > > > • ApicId[3] = Logical CCX L3 complex ID • ApicId[2:0]= (SMT) ?
> > > > > {LogicalCoreID[1:0],ThreadId} : {1'b0,LogicalCoreID[1:0]}
> > > > >
> > > > > which is can hold only 0-3 values, and defined as:
> > > > >
> > > > > "A node, is an integrated circuit device that includes one to 8
> > > > > cores (one or two Core Complexes)."
> > > > >
> > > > > spec also mentions it indirectly as die-id if one looks at
> > > > > CPUID_Fn8000001E_EBX [Core Identifiers] (Core::X86::Cpuid::CoreId) ...
> > > > >   CoreId = ({2'b0, DieId[1:0], LogicalComplexId[0],
> > > > > LogicalThreadId[2:0]} >> SMT
> > > > >
> > > > > and in
> > > > > (2)
> > > > > CPUID_Fn8000001E_ECX [Node Identifiers] (Core::X86::Cpuid::NodeId) ...
> > > > >   {5'b00000,1'b[SOCKET_ID],2'b[DIE_ID]}
> > > > >
> > > > > Question is why we did not reuse topo_ids->die_id instead of
> > > > > adding confusing topo_ids->node_id in the first place?
> > > >
> > > > Initially, I thought about it. But Intel uses die_id differently than AMD.
> > > > So, did not want complicate things.
> > > > If we take that route then we need to re-arrange the numa code as
> > > > we need to numa information to calculate the die id. So, did not
> > > > want to mix up things.
> > > >
> > > > >
> > > > > Also looking APIC ID and SRAT table provided here,
> > > > > CPUID_Fn8000001E_ECX corresponds to NUMA node id (i.e. what
> > > > > -numa in QEMU used for) and Node ID embeded into ApicId[5:4] is
> basically die-id.
> > > > >
> > > > > Difference between die-id implemented in QEMU and EPYC's die id
> > > > > (topo_ids-
> > > > > >node_id) is that the former doesn't require numa config (maybe
> > > > > >it should, but
> > > > > ship'salready sailed) and gets number of dies from '-smp dies=X'
> > > > > CLI option, while for EPYC we calculate topo_ids->node_id
> > > > > implicitly from number of numa nodes and sockets, which
> > > > > implicitly requires that machine 'must' be configured with -numa options.
> > > > >
> > > > > Maybe we should drop this implicit calculation along with
> > > > > topo_ids->node_id and reuse '-smp dies=X' plus extra checks for
> > > > > EPYC to ask for -numa if there is more than 1 die and if we need
> > > > > to be really strict, add checks for limit of dies/cores within
> > > > > socket/die according to spec[2] so encoded APIC ID and
> > > > > CPUID_8000001E match the
> > > spec.
> > > >
> > > > There will be complications when user configures with both die_id
> > > > and numa_id. It will complicate things further. I will have to
> > > > look closely at the code if it is feasible.
> > >
> > > it's worth a try.
> > > conseptionally die_id in intel/amd is the same. Most likely intel
> > > has a dedicated memory controller on each die so it still should
> > > form a NUMA node. But that aspect probably was ignored while
> > > implementing it in QEMU so ping of configuring QEMU right is on
> > > user's shoulders (there is no checks whatsoever if cpu belonging to specific
> die is on right NUMA node).
> >
> > So you are suggesting to use die_id to build the topology for EPYC.
> > Also initialize die_id based on the numa information. Re-arrange the
> > numa code to make sure we have all the information before we build the
> > topology. And then remove the node_id inside X86CPUTopoIDs. Is that the
> plan?
> reusing die_id might simplify logic and at very least we won't have 2 very similar
> fields to deal with. With die_id it should be conditional on EPYC.

Not convinced that using die_id will solve the problem here, but I am
going to investigate it a little.
> > Regardless of using die_id, we should
> 
> (1) error out if tolopolgy will require more than 1 numa node and no numa
> config was provided.

We already have a node_id check in numa_cpu_pre_plug. Do you want me to
bring this check into pc_cpu_pre_plug?

> (2) for 1 NUMA node use autonuma to create 1 node implicitly, that requres
> converting static MachineClass::auto_enable_numa into an instance specific
> value, i.e. moving it into MachineState, so that we can change it at runtime
> depending on CPU type.

Isn't it already taken care of in numa_complete_configuration when
num_nodes == 0? Where does this change go, if it is required at all?

> (3) use NUMA id from CPU::node-id for CPUID_8000001E and have a checks that
> will ensure
>     that used value is possible to fit in CPUID leaf.

Node_id is already part of apic_id. We can easily extract it from apic_id.
I have already sent the patch to simplify CPUID_8000001E. I will make it
part of this series.
https://lore.kernel.org/qemu-devel/159164753686.20543.4158548114186964547.stgit@naples-babu.amd.com/
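For reference, the APIC IDs dumped earlier in the thread decompose per the quoted layout (ApicId[6] = socket, ApicId[5:4] = node within socket, ApicId[3] = CCX, ApicId[2:0] = core/thread). A minimal sketch of that extraction — the fixed shifts and widths here are assumptions that only hold for this particular 2-socket/8-node part; real code derives the offsets from CPUID:

```c
#include <stdint.h>

/* Sketch only: hard-codes the ApicId layout quoted above for this
 * 2-socket EPYC part.  CPU 63's apic id 0x7e decodes to socket 1,
 * node 3 (4th node within the socket), CCX 1, core 6, matching the
 * lscpu/cpuid dump in the thread. */
static inline unsigned apicid_socket(uint32_t apicid) { return (apicid >> 6) & 0x1; }
static inline unsigned apicid_node(uint32_t apicid)   { return (apicid >> 4) & 0x3; }
static inline unsigned apicid_ccx(uint32_t apicid)    { return (apicid >> 3) & 0x1; }
static inline unsigned apicid_core(uint32_t apicid)   { return apicid & 0x7; }
```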

> 
> 
> 
> 
> > > What AMD has implemented on top of that in CPU hw, is to expose NUMA
> > > node id in CPUID_8000001E. I don't know why it was done as usually
> > > it's ACPI tables that describe relations between nodes so for OS
> > > this info almost useless (I'd guess it's faster to use CPUID instead
> > > of fetching pre-cpu variable but that's pretty much it from OS point
> > > of view)
> > >
> > > >
> > > > >
> > > > >
> > > > >
> > > > > > > > >>>> +    topo_ids->pkg_id = props.has_socket_id ? props.socket_id
> :
> > > > > > > > >>>> +0; }
> > > > > > > > >>>>  /*
> > > > > > > > >>>>   * Make APIC ID for the CPU 'cpu_index'
> > > > > > > > >>>>   *
> > > > > > > > >>>>
> > > > > > > > >>
> > > > > > > > >
> > > > > > > >
> > > > > >
> > > > > >
> > > >
> >
> >



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from CpuInstanceProperties
  2020-07-29 21:22                         ` Babu Moger
@ 2020-07-30 11:27                           ` Igor Mammedov
  0 siblings, 0 replies; 23+ messages in thread
From: Igor Mammedov @ 2020-07-30 11:27 UTC (permalink / raw)
  To: Babu Moger; +Cc: pbonzini, qemu-devel, ehabkost, rth

On Wed, 29 Jul 2020 16:22:32 -0500
Babu Moger <babu.moger@amd.com> wrote:

> Igor,
> Sorry. Few more questions.
> 
> > -----Original Message-----
> > From: Igor Mammedov <imammedo@redhat.com>
> > Sent: Wednesday, July 29, 2020 9:12 AM
> > To: Moger, Babu <Babu.Moger@amd.com>
> > Cc: pbonzini@redhat.com; rth@twiddle.net; qemu-devel@nongnu.org;
> > ehabkost@redhat.com
> > Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> > CpuInstanceProperties
> > 
> > On Mon, 27 Jul 2020 18:59:42 -0500
> > Babu Moger <babu.moger@amd.com> wrote:
> >   
> > > > -----Original Message-----
> > > > From: Igor Mammedov <imammedo@redhat.com>
> > > > Sent: Monday, July 27, 2020 12:14 PM
> > > > To: Moger, Babu <Babu.Moger@amd.com>
> > > > Cc: qemu-devel@nongnu.org; pbonzini@redhat.com;  
> > ehabkost@redhat.com;  
> > > > rth@twiddle.net
> > > > Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> > > > CpuInstanceProperties
> > > >
> > > > On Mon, 27 Jul 2020 10:49:08 -0500
> > > > Babu Moger <babu.moger@amd.com> wrote:
> > > >  
> > > > > > -----Original Message-----
> > > > > > From: Igor Mammedov <imammedo@redhat.com>
> > > > > > Sent: Friday, July 24, 2020 12:05 PM
> > > > > > To: Moger, Babu <Babu.Moger@amd.com>
> > > > > > Cc: qemu-devel@nongnu.org; pbonzini@redhat.com;  
> > > > ehabkost@redhat.com;  
> > > > > > rth@twiddle.net
> > > > > > Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> > > > > > CpuInstanceProperties
> > > > > >
> > > > > > On Mon, 13 Jul 2020 14:30:29 -0500 Babu Moger
> > > > > > <babu.moger@amd.com> wrote:
> > > > > >  
> > > > > > > > -----Original Message-----
> > > > > > > > From: Igor Mammedov <imammedo@redhat.com>
> > > > > > > > Sent: Monday, July 13, 2020 12:32 PM
> > > > > > > > To: Moger, Babu <Babu.Moger@amd.com>
> > > > > > > > Cc: pbonzini@redhat.com; rth@twiddle.net;
> > > > > > > > ehabkost@redhat.com;
> > > > > > > > qemu- devel@nongnu.org
> > > > > > > > Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids
> > > > > > > > from CpuInstanceProperties
> > > > > > > >
> > > > > > > > On Mon, 13 Jul 2020 11:43:33 -0500 Babu Moger
> > > > > > > > <babu.moger@amd.com> wrote:
> > > > > > > >  
> > > > > > > > > On 7/13/20 11:17 AM, Igor Mammedov wrote:  
> > > > > > > > > > On Mon, 13 Jul 2020 10:02:22 -0500 Babu Moger
> > > > > > > > > > <babu.moger@amd.com> wrote:
> > > > > > > > > >  
> > > > > > > > > >>> -----Original Message-----
> > > > > > > > > >>> From: Igor Mammedov <imammedo@redhat.com>
> > > > > > > > > >>> Sent: Monday, July 13, 2020 4:08 AM
> > > > > > > > > >>> To: Moger, Babu <Babu.Moger@amd.com>
> > > > > > > > > >>> Cc: pbonzini@redhat.com; rth@twiddle.net;
> > > > > > > > > >>> ehabkost@redhat.com;
> > > > > > > > > >>> qemu- devel@nongnu.org
> > > > > > > > > >>> Subject: Re: [PATCH v2 1/3] hw/i386: Initialize
> > > > > > > > > >>> topo_ids from CpuInstanceProperties  
[...]

> > > > > There will be complications when user configures with both die_id
> > > > > and numa_id. It will complicate things further. I will have to
> > > > > look closely at the code if it is feasible.  
> > > >
> > > > it's worth a try.
> > > > conseptionally die_id in intel/amd is the same. Most likely intel
> > > > has a dedicated memory controller on each die so it still should
> > > > form a NUMA node. But that aspect probably was ignored while
> > > > implementing it in QEMU so ping of configuring QEMU right is on
> > > > user's shoulders (there is no checks whatsoever if cpu belonging to specific  
> > die is on right NUMA node).  
> > >
> > > So you are suggesting to use die_id to build the topology for EPYC.
> > > Also initialize die_id based on the numa information. Re-arrange the
> > > numa code to make sure we have all the information before we build the
> > > topology. And then remove the node_id inside X86CPUTopoIDs. Is that the  
> > plan?
> > reusing die_id might simplify logic and at very least we won't have 2 very similar
> > fields to deal with. With die_id it should be conditional on EPYC.  
> 
> Not convinced if the using the die_id will solve the problem here. But
> going to investigate this little bit.
it allows us to drop the nodes_per_pkg calculation and its dependency on numa,
since the value is provided by the user with -smp dies=

We might need a sanity check that the user-provided value is valid in the EPYC case, though.

> > > Regardless of using die_id, we should  
> > 
> > (1) error out if tolopolgy will require more than 1 numa node and no numa
> > config was provided.  
> 
> We already have node_id check in numa_cpu_pre_plug. Do you want me to
> bring this check in pc_cpu_pre_plug?

There are several checks there, including validation of per-CPU node-id
values and workarounds for broken libvirt.

Here I'm talking more about the number of numa nodes required as a function
of (-smp dies, -cpu EPYC), like:
  if (dies > 1 && epyc && nb_numa_nodes != sockets * dies)
     error_setg("chosen cpu model ... and -smp ... parameters require X numa nodes being configured")
     error_append_hint("use -numa options to create the required number of numa nodes")

I'm not sure where to put it for now; we can try putting it into x86_cpus_init()
for starters and later see if there is a more suitable place for it.


> > (2) for 1 NUMA node use autonuma to create 1 node implicitly, that requres
> > converting static MachineClass::auto_enable_numa into an instance specific
> > value, i.e. moving it into MachineState, so that we can change it at runtime
> > depending on CPU type.  
> 
> Isn't it already taken care in numa_complete_configuration when num_nodes
> = 0? Where does this change go if at all required?
numa_complete_configuration()
...
    if (ms->numa_state->num_nodes == 0 &&                                        
        ((ms->ram_slots && mc->auto_enable_numa_with_memhp) ||                   
         (ms->maxram_size > ms->ram_size && mc->auto_enable_numa_with_memdev) || 
         mc->auto_enable_numa)) {                                                
            NumaNodeOptions node = { };                                          
            parse_numa_node(ms, &node, &error_abort);                            
            numa_info[0].node_mem = ram_size;                                    
    }

that is the fragment that takes care of implicit creation of the single numa node.
Let's ignore the *numa_with_* cases and look at mc->auto_enable_numa.
It is a MachineClass field and we are not supposed to change it at runtime,
but we need to enable it when the options specify a 1-node config, i.e.

  "-cpu epyc -smp x,sockets=1,dies=1"

so we need to trigger auto_enable_numa depending on the above condition.
Looking at the current code, there is no good place to put it.

we can try to replace MachineClass::auto_enable_numa with a callback
  bool MachineClass::auto_enable_numa_cb(MachineState *ms)
so we can change the logic at runtime where needed.
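A minimal sketch of that callback conversion. The struct layouts and field names below are stand-ins invented for illustration (the real MachineState/MachineClass are far larger); only the shape of the change — flag replaced by a per-machine predicate — reflects the proposal:

```c
#include <stdbool.h>

/* Sketch of turning the static MachineClass::auto_enable_numa flag into a
 * callback, so the decision can depend on runtime state (CPU model, -smp).
 * All names here are illustrative, not the actual QEMU declarations. */
typedef struct MachineState {
    int num_nodes;      /* stands in for ms->numa_state->num_nodes */
    int sockets;
    int dies;
    bool cpu_is_epyc;
} MachineState;

typedef struct MachineClass {
    /* was: bool auto_enable_numa; */
    bool (*auto_enable_numa_cb)(MachineState *ms);
} MachineClass;

/* EPYC variant: implicitly create the single NUMA node only when the
 * topology needs exactly one, e.g. "-cpu epyc -smp x,sockets=1,dies=1". */
static bool epyc_auto_enable_numa(MachineState *ms)
{
    return ms->cpu_is_epyc && ms->num_nodes == 0 &&
           ms->sockets * ms->dies == 1;
}
```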

> 
> > (3) use NUMA id from CPU::node-id for CPUID_8000001E and have a checks that
> > will ensure
> >     that used value is possible to fit in CPUID leaf.  
> 
> Node_id is already part of apic_id. We can easily extract it from apic_id.
> I have already sent the patch to simplify CPUID_8000001E. I will make it
> part of this series.
> https://lore.kernel.org/qemu-devel/159164753686.20543.4158548114186964547.stgit@naples-babu.amd.com/
that's where the confusion in naming gets in the way:
let's fix the following definitions for the purpose of this discussion/QEMU
   node_id = system-wide NUMA node id
   AMD's ApicId[5:4] = die_id

What AMD encodes in the APIC ID is not a node_id but rather the index of a node within the package.
Even in the spec it's called "Node ID" in one place but referred to as DIE_ID in another.

With that cleared up: the following CPUID field is defined as
CPUID_Fn8000001E_ECX[7:0] = NodeId
but it's not the same as ApicId[5:4]; the description says it's {5'b00000,1'b[SOCKET_ID],2'b[DIE_ID]},
and the CPUID dump from the EPYC machine confirms that. It also matches the system-wide NUMA node ids
encoded in the SRAT table.
So the patch pointed to above is incorrect wrt CPUID_Fn8000001E_ECX.

Considering we allow more nodes than existing EPYCs can have,
I'd rather it took the value of the NUMA node id from the CPU's "node-id" property,
with a check at realize time that it fits within the CPUID_Fn8000001E_ECX[7:0] space,
to ensure that NUMA node ids are consistent with what the user provides and with the ACPI tables.
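The realize-time check suggested here could be sketched as follows. This is a hypothetical helper, not the actual QEMU code; only the NodeId[7:0] field is modeled, and the real implementation would report failures through error_setg():

```c
#include <stdbool.h>
#include <stdint.h>

/* Sketch of the realize-time check suggested above: take the NUMA node id
 * from the CPU's "node-id" property and refuse values that do not fit in
 * CPUID_Fn8000001E_ECX[7:0].  Illustrative only. */
static bool encode_cpuid_8000001e_node_id(int64_t node_id, uint32_t *ecx)
{
    if (node_id < 0 || node_id > 0xff) {
        return false;   /* caller would report this via error_setg() */
    }
    /* Only NodeId[7:0] is shown; ECX[10:8] (NodesPerProcessor - 1) and
     * the reserved bits are left out of this sketch. */
    *ecx = (uint32_t)node_id & 0xff;
    return true;
}
```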

   
> > 
> > 
> > 
> >   
> > > > What AMD has implemented on top of that in CPU hw, is to expose NUMA
> > > > node id in CPUID_8000001E. I don't know why it was done as usually
> > > > it's ACPI tables that describe relations between nodes so for OS
> > > > this info almost useless (I'd guess it's faster to use CPUID instead
> > > > of fetching pre-cpu variable but that's pretty much it from OS point
> > > > of view)
> > > >  
> > > > >  
> > > > > >
> > > > > >
> > > > > >  
> > > > > > > > > >>>> +    topo_ids->pkg_id = props.has_socket_id ? props.socket_id  
> > :  
> > > > > > > > > >>>> +0; }
> > > > > > > > > >>>>  /*
> > > > > > > > > >>>>   * Make APIC ID for the CPU 'cpu_index'
> > > > > > > > > >>>>   *
> > > > > > > > > >>>>  
> > > > > > > > > >>  
> > > > > > > > > >  
> > > > > > > > >  
> > > > > > >
> > > > > > >  
> > > > >  
> > >
> > >  
> 



^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2020-07-30 11:31 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-01 17:30 [PATCH v2 0/3] Fix couple of issues with AMD topology Babu Moger
2020-07-01 17:31 ` [PATCH v2 1/3] hw/i386: Initialize topo_ids from CpuInstanceProperties Babu Moger
2020-07-13  9:08   ` Igor Mammedov
2020-07-13 15:02     ` Babu Moger
2020-07-13 16:17       ` Igor Mammedov
2020-07-13 16:43         ` Babu Moger
2020-07-13 17:32           ` Igor Mammedov
2020-07-13 19:30             ` Babu Moger
2020-07-14 16:41               ` Igor Mammedov
2020-07-14 17:26                 ` Babu Moger
2020-07-14 18:26                   ` Igor Mammedov
2020-07-24 17:05               ` Igor Mammedov
2020-07-27 15:49                 ` Babu Moger
2020-07-27 17:14                   ` Igor Mammedov
2020-07-27 23:59                     ` Babu Moger
2020-07-29 14:12                       ` Igor Mammedov
2020-07-29 21:22                         ` Babu Moger
2020-07-30 11:27                           ` Igor Mammedov
2020-07-01 17:31 ` [PATCH v2 2/3] hw/i386: Build apic_id " Babu Moger
2020-07-13  9:15   ` Igor Mammedov
2020-07-13 15:00     ` Babu Moger
2020-07-01 17:31 ` [PATCH v2 3/3] hw/386: Fix uninitialized memory with -device and CPU hotplug Babu Moger
2020-07-27 16:36   ` Igor Mammedov
