* [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
@ 2020-08-21 22:12 Babu Moger
  2020-08-21 22:12 ` [PATCH v5 1/8] hw/i386: Remove node_id, nr_nodes and nodes_per_pkg from topology Babu Moger
                   ` (9 more replies)
  0 siblings, 10 replies; 51+ messages in thread
From: Babu Moger @ 2020-08-21 22:12 UTC (permalink / raw)
  To: pbonzini, rth, ehabkost, imammedo; +Cc: qemu-devel, mst

To support some of the more complex topologies, we introduced EPYC mode apicid
decode. However, the EPYC mode decode is running into problems, and it could
become a maintenance burden in the future. So it was decided to remove that
code and use the generic decode, which works for the majority of topologies.
Most SPECed configurations will work just fine. With some non-SPECed user
inputs, it will create a sub-optimal configuration.
Here is the discussion thread:
https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/

This series removes all the EPYC mode specific apicid changes and uses the
generic apicid decode.
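
For readers new to the thread, here is a rough sketch of what the generic
decode amounts to: the APIC ID is packed as pkg | die | core | smt, with each
field just wide enough for its count. This is illustrative only. Every
sketch_/Sketch name below is made up for this note and is not a QEMU symbol;
the real helpers live in include/hw/i386/topology.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical types; they only mirror the generic topology layout. */
typedef struct SketchTopoInfo {
    unsigned dies_per_pkg;
    unsigned cores_per_die;
    unsigned threads_per_core;
} SketchTopoInfo;

typedef struct SketchTopoIDs {
    unsigned pkg_id;
    unsigned die_id;
    unsigned core_id;
    unsigned smt_id;
} SketchTopoIDs;

/* Bits needed to represent IDs in the range 0..count-1 */
static unsigned sketch_bitwidth_for_count(unsigned count)
{
    unsigned width = 0;

    assert(count >= 1);
    for (count -= 1; count; count >>= 1) {
        width++;
    }
    return width;
}

/* The SMT field starts at bit 0, so the core field starts after it */
static unsigned sketch_core_offset(const SketchTopoInfo *t)
{
    return sketch_bitwidth_for_count(t->threads_per_core);
}

static unsigned sketch_die_offset(const SketchTopoInfo *t)
{
    return sketch_core_offset(t) +
           sketch_bitwidth_for_count(t->cores_per_die);
}

static unsigned sketch_pkg_offset(const SketchTopoInfo *t)
{
    return sketch_die_offset(t) +
           sketch_bitwidth_for_count(t->dies_per_pkg);
}

/* cpu_index -> APIC ID: split the index into IDs, then pack the fields */
static uint32_t sketch_apicid_from_cpu_idx(const SketchTopoInfo *t,
                                           unsigned cpu_index)
{
    unsigned nr_dies = t->dies_per_pkg;
    unsigned nr_cores = t->cores_per_die;
    unsigned nr_threads = t->threads_per_core;
    SketchTopoIDs ids = {
        .pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads),
        .die_id = cpu_index / (nr_cores * nr_threads) % nr_dies,
        .core_id = cpu_index / nr_threads % nr_cores,
        .smt_id = cpu_index % nr_threads,
    };

    return ((uint32_t)ids.pkg_id << sketch_pkg_offset(t)) |
           ((uint32_t)ids.die_id << sketch_die_offset(t)) |
           ((uint32_t)ids.core_id << sketch_core_offset(t)) |
           ids.smt_id;
}

/* APIC ID -> IDs: shift each field down and mask it to its width */
static SketchTopoIDs sketch_topo_ids_from_apicid(uint32_t apicid,
                                                 const SketchTopoInfo *t)
{
    unsigned smt_w = sketch_bitwidth_for_count(t->threads_per_core);
    unsigned core_w = sketch_bitwidth_for_count(t->cores_per_die);
    unsigned die_w = sketch_bitwidth_for_count(t->dies_per_pkg);
    SketchTopoIDs ids;

    ids.smt_id = apicid & ((1u << smt_w) - 1);
    ids.core_id = (apicid >> sketch_core_offset(t)) & ((1u << core_w) - 1);
    ids.die_id = (apicid >> sketch_die_offset(t)) & ((1u << die_w) - 1);
    ids.pkg_id = apicid >> sketch_pkg_offset(t);
    return ids;
}
```

With 1 die, 6 cores and 3 threads (the "weird topology" case in
tests/test-x86-cpuid.c), this layout uses 2 bits for the thread ID and 3 bits
for the core ID, so the package offset is 5 and cpu index 1 * 3 + 0 maps to
APIC ID (1 << 2) | 0.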

---
v5:
 Revert EPYC specific decode.
 Simplify CPUID_8000_001E

v4:
  https://lore.kernel.org/qemu-devel/159744083536.39197.13827776633866601278.stgit@naples-babu.amd.com/
  Not much of a change. Just added a few text changes.
  Error out instead of warning if dies are not configured for EPYC.
  A few other text changes to clarify the removal of node_id, nr_nodes and nodes_per_pkg.

v3:
  https://lore.kernel.org/qemu-devel/159681772267.9679.1334429994189974662.stgit@naples-babu.amd.com/#r
  Added a new check to pass the dies for EPYC numa configuration.
  Added Simplify CPUID_8000_001E patch with some changes suggested by Igor.
  Dropped the patch to build the topology from CpuInstanceProperties.
  TODO: Not sure if we still need the Autonuma changes Igor mentioned.
  Needs more clarity on that.

v2:
  https://lore.kernel.org/qemu-devel/159362436285.36204.986406297373871949.stgit@naples-babu.amd.com/
  Used the numa information from CpuInstanceProperties for building
  the apic_id, as suggested by Igor.
  Also did some minor code rearrangement to take care of the changes.
  Dropped the patch "Simplify CPUID_8000_001E" from v1. Will send
  it later.

v1:
 https://lore.kernel.org/qemu-devel/159164739269.20543.3074052993891532749.stgit@naples-babu.amd.com

Babu Moger (8):
      hw/i386: Remove node_id, nr_nodes and nodes_per_pkg from topology
      Revert "i386: Fix pkg_id offset for EPYC cpu models"
      Revert "target/i386: Enable new apic id encoding for EPYC based cpus models"
      Revert "hw/i386: Move arch_id decode inside x86_cpus_init"
      Revert "i386: Introduce use_epyc_apic_id_encoding in X86CPUDefinition"
      Revert "hw/i386: Introduce apicid functions inside X86MachineState"
      Revert "hw/386: Add EPYC mode topology decoding functions"
      i386: Simplify CPUID_8000_001E for AMD


 hw/i386/pc.c               |    8 +--
 hw/i386/x86.c              |   43 +++-------------
 include/hw/i386/topology.h |  101 ---------------------------------------
 include/hw/i386/x86.h      |    9 ---
 target/i386/cpu.c          |  115 ++++++++++++++++----------------------------
 target/i386/cpu.h          |    3 -
 tests/test-x86-cpuid.c     |   40 ++++++++-------
 7 files changed, 73 insertions(+), 246 deletions(-)




* [PATCH v5 1/8] hw/i386: Remove node_id, nr_nodes and nodes_per_pkg from topology
From: Babu Moger @ 2020-08-21 22:12 UTC (permalink / raw)
  To: pbonzini, rth, ehabkost, imammedo; +Cc: qemu-devel, mst

Remove node_id, nr_nodes and nodes_per_pkg from topology. Use
die_id, nr_dies and dies_per_pkg, which are already available.
This removes the confusion over the two sets of variables.

With node_id removed from the topology, the uninitialized memory issue
with -device and CPU hotplug is fixed.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=1828750
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 hw/i386/pc.c               |    1 -
 hw/i386/x86.c              |    1 -
 include/hw/i386/topology.h |   40 +++++++++-------------------------------
 target/i386/cpu.c          |   24 ++++++++++--------------
 target/i386/cpu.h          |    1 -
 tests/test-x86-cpuid.c     |   40 ++++++++++++++++++++--------------------
 6 files changed, 39 insertions(+), 68 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 47c5ca3e34..0ae5cb3af4 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1498,7 +1498,6 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
     init_topo_info(&topo_info, x86ms);
 
     env->nr_dies = x86ms->smp_dies;
-    env->nr_nodes = topo_info.nodes_per_pkg;
     env->pkg_offset = x86ms->apicid_pkg_offset(&topo_info);
 
     /*
diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 67bee1bcb8..f6eab947df 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -62,7 +62,6 @@ inline void init_topo_info(X86CPUTopoInfo *topo_info,
 {
     MachineState *ms = MACHINE(x86ms);
 
-    topo_info->nodes_per_pkg = ms->numa_state->num_nodes / ms->smp.sockets;
     topo_info->dies_per_pkg = x86ms->smp_dies;
     topo_info->cores_per_die = ms->smp.cores;
     topo_info->threads_per_core = ms->smp.threads;
diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index 07239f95f4..05ddde7ba0 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -47,14 +47,12 @@ typedef uint32_t apic_id_t;
 
 typedef struct X86CPUTopoIDs {
     unsigned pkg_id;
-    unsigned node_id;
     unsigned die_id;
     unsigned core_id;
     unsigned smt_id;
 } X86CPUTopoIDs;
 
 typedef struct X86CPUTopoInfo {
-    unsigned nodes_per_pkg;
     unsigned dies_per_pkg;
     unsigned cores_per_die;
     unsigned threads_per_core;
@@ -89,11 +87,6 @@ static inline unsigned apicid_die_width(X86CPUTopoInfo *topo_info)
     return apicid_bitwidth_for_count(topo_info->dies_per_pkg);
 }
 
-/* Bit width of the node_id field per socket */
-static inline unsigned apicid_node_width_epyc(X86CPUTopoInfo *topo_info)
-{
-    return apicid_bitwidth_for_count(MAX(topo_info->nodes_per_pkg, 1));
-}
 /* Bit offset of the Core_ID field
  */
 static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
@@ -114,30 +107,23 @@ static inline unsigned apicid_pkg_offset(X86CPUTopoInfo *topo_info)
     return apicid_die_offset(topo_info) + apicid_die_width(topo_info);
 }
 
-#define NODE_ID_OFFSET 3 /* Minimum node_id offset if numa configured */
+#define EPYC_DIE_OFFSET 3 /* Minimum die_id offset if numa configured */
 
 /*
- * Bit offset of the node_id field
- *
- * Make sure nodes_per_pkg >  0 if numa configured else zero.
+ * Bit offset of the die_id field
  */
-static inline unsigned apicid_node_offset_epyc(X86CPUTopoInfo *topo_info)
+static inline unsigned apicid_die_offset_epyc(X86CPUTopoInfo *topo_info)
 {
-    unsigned offset = apicid_die_offset(topo_info) +
-                      apicid_die_width(topo_info);
+    unsigned offset = apicid_core_offset(topo_info) +
+                      apicid_core_width(topo_info);
 
-    if (topo_info->nodes_per_pkg) {
-        return MAX(NODE_ID_OFFSET, offset);
-    } else {
-        return offset;
-    }
+    return MAX(EPYC_DIE_OFFSET, offset);
 }
 
 /* Bit offset of the Pkg_ID (socket ID) field */
 static inline unsigned apicid_pkg_offset_epyc(X86CPUTopoInfo *topo_info)
 {
-    return apicid_node_offset_epyc(topo_info) +
-           apicid_node_width_epyc(topo_info);
+    return apicid_die_offset_epyc(topo_info) + apicid_die_width(topo_info);
 }
 
 /*
@@ -150,8 +136,7 @@ x86_apicid_from_topo_ids_epyc(X86CPUTopoInfo *topo_info,
                               const X86CPUTopoIDs *topo_ids)
 {
     return (topo_ids->pkg_id  << apicid_pkg_offset_epyc(topo_info)) |
-           (topo_ids->node_id << apicid_node_offset_epyc(topo_info)) |
-           (topo_ids->die_id  << apicid_die_offset(topo_info)) |
+           (topo_ids->die_id  << apicid_die_offset_epyc(topo_info)) |
            (topo_ids->core_id << apicid_core_offset(topo_info)) |
            topo_ids->smt_id;
 }
@@ -160,15 +145,11 @@ static inline void x86_topo_ids_from_idx_epyc(X86CPUTopoInfo *topo_info,
                                               unsigned cpu_index,
                                               X86CPUTopoIDs *topo_ids)
 {
-    unsigned nr_nodes = MAX(topo_info->nodes_per_pkg, 1);
     unsigned nr_dies = topo_info->dies_per_pkg;
     unsigned nr_cores = topo_info->cores_per_die;
     unsigned nr_threads = topo_info->threads_per_core;
-    unsigned cores_per_node = DIV_ROUND_UP((nr_dies * nr_cores * nr_threads),
-                                            nr_nodes);
 
     topo_ids->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
-    topo_ids->node_id = (cpu_index / cores_per_node) % nr_nodes;
     topo_ids->die_id = cpu_index / (nr_cores * nr_threads) % nr_dies;
     topo_ids->core_id = cpu_index / nr_threads % nr_cores;
     topo_ids->smt_id = cpu_index % nr_threads;
@@ -188,11 +169,8 @@ static inline void x86_topo_ids_from_apicid_epyc(apic_id_t apicid,
             (apicid >> apicid_core_offset(topo_info)) &
             ~(0xFFFFFFFFUL << apicid_core_width(topo_info));
     topo_ids->die_id =
-            (apicid >> apicid_die_offset(topo_info)) &
+            (apicid >> apicid_die_offset_epyc(topo_info)) &
             ~(0xFFFFFFFFUL << apicid_die_width(topo_info));
-    topo_ids->node_id =
-            (apicid >> apicid_node_offset_epyc(topo_info)) &
-            ~(0xFFFFFFFFUL << apicid_node_width_epyc(topo_info));
     topo_ids->pkg_id = apicid >> apicid_pkg_offset_epyc(topo_info);
 }
 
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 588f32e136..3c58af1f43 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -345,7 +345,6 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
                                        uint32_t *ecx, uint32_t *edx)
 {
     uint32_t l3_cores;
-    unsigned nodes = MAX(topo_info->nodes_per_pkg, 1);
 
     assert(cache->size == cache->line_size * cache->associativity *
                           cache->partitions * cache->sets);
@@ -355,10 +354,9 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
 
     /* L3 is shared among multiple cores */
     if (cache->level == 3) {
-        l3_cores = DIV_ROUND_UP((topo_info->dies_per_pkg *
-                                 topo_info->cores_per_die *
+        l3_cores = DIV_ROUND_UP((topo_info->cores_per_die *
                                  topo_info->threads_per_core),
-                                 nodes);
+                                 topo_info->dies_per_pkg);
         *eax |= (l3_cores - 1) << 14;
     } else {
         *eax |= ((topo_info->threads_per_core - 1) << 14);
@@ -387,7 +385,7 @@ static void encode_topo_cpuid8000001e(X86CPUTopoInfo *topo_info, X86CPU *cpu,
                                        uint32_t *ecx, uint32_t *edx)
 {
     X86CPUTopoIDs topo_ids = {0};
-    unsigned long nodes = MAX(topo_info->nodes_per_pkg, 1);
+    unsigned long dies = topo_info->dies_per_pkg;
     int shift;
 
     x86_topo_ids_from_apicid_epyc(cpu->apic_id, topo_info, &topo_ids);
@@ -408,7 +406,7 @@ static void encode_topo_cpuid8000001e(X86CPUTopoInfo *topo_info, X86CPU *cpu,
      *             3 Core complex id
      *           1:0 Core id
      */
-    *ebx = ((topo_info->threads_per_core - 1) << 8) | (topo_ids.node_id << 3) |
+    *ebx = ((topo_info->threads_per_core - 1) << 8) | (topo_ids.die_id << 3) |
             (topo_ids.core_id);
     /*
      * CPUID_Fn8000001E_ECX
@@ -418,8 +416,8 @@ static void encode_topo_cpuid8000001e(X86CPUTopoInfo *topo_info, X86CPU *cpu,
      *         2  Socket id
      *       1:0  Node id
      */
-    if (nodes <= 4) {
-        *ecx = ((nodes - 1) << 8) | (topo_ids.pkg_id << 2) | topo_ids.node_id;
+    if (dies <= 4) {
+        *ecx = ((dies - 1) << 8) | (topo_ids.pkg_id << 2) | topo_ids.die_id;
     } else {
         /*
          * Node id fix up. Actual hardware supports up to 4 nodes. But with
@@ -434,10 +432,10 @@ static void encode_topo_cpuid8000001e(X86CPUTopoInfo *topo_info, X86CPU *cpu,
          * number of nodes. find_last_bit returns last set bit(0 based). Left
          * shift(+1) the socket id to represent all the nodes.
          */
-        nodes -= 1;
-        shift = find_last_bit(&nodes, 8);
-        *ecx = (nodes << 8) | (topo_ids.pkg_id << (shift + 1)) |
-               topo_ids.node_id;
+        dies -= 1;
+        shift = find_last_bit(&dies, 8);
+        *ecx = (dies << 8) | (topo_ids.pkg_id << (shift + 1)) |
+               topo_ids.die_id;
     }
     *edx = 0;
 }
@@ -5489,7 +5487,6 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
     uint32_t signature[3];
     X86CPUTopoInfo topo_info;
 
-    topo_info.nodes_per_pkg = env->nr_nodes;
     topo_info.dies_per_pkg = env->nr_dies;
     topo_info.cores_per_die = cs->nr_cores;
     topo_info.threads_per_core = cs->nr_threads;
@@ -6949,7 +6946,6 @@ static void x86_cpu_initfn(Object *obj)
     FeatureWord w;
 
     env->nr_dies = 1;
-    env->nr_nodes = 1;
     cpu_set_cpustate_pointers(cpu);
 
     object_property_add(obj, "family", "int",
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index e1a5c174dc..4c89bee8d1 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1629,7 +1629,6 @@ typedef struct CPUX86State {
     TPRAccess tpr_access_type;
 
     unsigned nr_dies;
-    unsigned nr_nodes;
     unsigned pkg_offset;
 } CPUX86State;
 
diff --git a/tests/test-x86-cpuid.c b/tests/test-x86-cpuid.c
index 049030a50e..bfabc0403a 100644
--- a/tests/test-x86-cpuid.c
+++ b/tests/test-x86-cpuid.c
@@ -31,12 +31,12 @@ static void test_topo_bits(void)
     X86CPUTopoInfo topo_info = {0};
 
     /* simple tests for 1 thread per core, 1 core per die, 1 die per package */
-    topo_info = (X86CPUTopoInfo) {0, 1, 1, 1};
+    topo_info = (X86CPUTopoInfo) {1, 1, 1};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 0);
     g_assert_cmpuint(apicid_core_width(&topo_info), ==, 0);
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
 
-    topo_info = (X86CPUTopoInfo) {0, 1, 1, 1};
+    topo_info = (X86CPUTopoInfo) {1, 1, 1};
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
@@ -45,39 +45,39 @@ static void test_topo_bits(void)
 
     /* Test field width calculation for multiple values
      */
-    topo_info = (X86CPUTopoInfo) {0, 1, 1, 2};
+    topo_info = (X86CPUTopoInfo) {1, 1, 2};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 1);
-    topo_info = (X86CPUTopoInfo) {0, 1, 1, 3};
+    topo_info = (X86CPUTopoInfo) {1, 1, 3};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
-    topo_info = (X86CPUTopoInfo) {0, 1, 1, 4};
+    topo_info = (X86CPUTopoInfo) {1, 1, 4};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
 
-    topo_info = (X86CPUTopoInfo) {0, 1, 1, 14};
+    topo_info = (X86CPUTopoInfo) {1, 1, 14};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
-    topo_info = (X86CPUTopoInfo) {0, 1, 1, 15};
+    topo_info = (X86CPUTopoInfo) {1, 1, 15};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
-    topo_info = (X86CPUTopoInfo) {0, 1, 1, 16};
+    topo_info = (X86CPUTopoInfo) {1, 1, 16};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
-    topo_info = (X86CPUTopoInfo) {0, 1, 1, 17};
+    topo_info = (X86CPUTopoInfo) {1, 1, 17};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 5);
 
 
-    topo_info = (X86CPUTopoInfo) {0, 1, 30, 2};
+    topo_info = (X86CPUTopoInfo) {1, 30, 2};
     g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
-    topo_info = (X86CPUTopoInfo) {0, 1, 31, 2};
+    topo_info = (X86CPUTopoInfo) {1, 31, 2};
     g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
-    topo_info = (X86CPUTopoInfo) {0, 1, 32, 2};
+    topo_info = (X86CPUTopoInfo) {1, 32, 2};
     g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
-    topo_info = (X86CPUTopoInfo) {0, 1, 33, 2};
+    topo_info = (X86CPUTopoInfo) {1, 33, 2};
     g_assert_cmpuint(apicid_core_width(&topo_info), ==, 6);
 
-    topo_info = (X86CPUTopoInfo) {0, 1, 30, 2};
+    topo_info = (X86CPUTopoInfo) {1, 30, 2};
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
-    topo_info = (X86CPUTopoInfo) {0, 2, 30, 2};
+    topo_info = (X86CPUTopoInfo) {2, 30, 2};
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 1);
-    topo_info = (X86CPUTopoInfo) {0, 3, 30, 2};
+    topo_info = (X86CPUTopoInfo) {3, 30, 2};
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
-    topo_info = (X86CPUTopoInfo) {0, 4, 30, 2};
+    topo_info = (X86CPUTopoInfo) {4, 30, 2};
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
 
     /* build a weird topology and see if IDs are calculated correctly
@@ -85,18 +85,18 @@ static void test_topo_bits(void)
 
     /* This will use 2 bits for thread ID and 3 bits for core ID
      */
-    topo_info = (X86CPUTopoInfo) {0, 1, 6, 3};
+    topo_info = (X86CPUTopoInfo) {1, 6, 3};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
     g_assert_cmpuint(apicid_core_offset(&topo_info), ==, 2);
     g_assert_cmpuint(apicid_die_offset(&topo_info), ==, 5);
     g_assert_cmpuint(apicid_pkg_offset(&topo_info), ==, 5);
 
-    topo_info = (X86CPUTopoInfo) {0, 1, 6, 3};
+    topo_info = (X86CPUTopoInfo) {1, 6, 3};
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
 
-    topo_info = (X86CPUTopoInfo) {0, 1, 6, 3};
+    topo_info = (X86CPUTopoInfo) {1, 6, 3};
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 0), ==,
                      (1 << 2) | 0);
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 1), ==,




* [PATCH v5 2/8] Revert "i386: Fix pkg_id offset for EPYC cpu models"
From: Babu Moger @ 2020-08-21 22:12 UTC (permalink / raw)
  To: pbonzini, rth, ehabkost, imammedo; +Cc: qemu-devel, mst

Remove the EPYC specific apicid decoding and use the generic
default decoding.

This reverts commit 7b225762c8c05fd31d4c2be116aedfbc00383f8b.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 hw/i386/pc.c      |    1 -
 target/i386/cpu.c |    6 +++---
 target/i386/cpu.h |    1 -
 3 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 0ae5cb3af4..e74b3cb1d8 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1498,7 +1498,6 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
     init_topo_info(&topo_info, x86ms);
 
     env->nr_dies = x86ms->smp_dies;
-    env->pkg_offset = x86ms->apicid_pkg_offset(&topo_info);
 
     /*
      * If APIC ID is not set,
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 3c58af1f43..83acbce3e9 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5675,7 +5675,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             *ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
             break;
         case 1:
-            *eax = env->pkg_offset;
+            *eax = apicid_pkg_offset(&topo_info);
             *ebx = cs->nr_cores * cs->nr_threads;
             *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
             break;
@@ -5709,7 +5709,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
             break;
         case 2:
-            *eax = env->pkg_offset;
+            *eax = apicid_pkg_offset(&topo_info);
             *ebx = env->nr_dies * cs->nr_cores * cs->nr_threads;
             *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
             break;
@@ -5890,7 +5890,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
              * CPUX86State::pkg_offset.
              * Bits 7:0 is "The number of threads in the package is NC+1"
              */
-            *ecx = (env->pkg_offset << 12) |
+            *ecx = (apicid_pkg_offset(&topo_info) << 12) |
                    ((cs->nr_cores * cs->nr_threads) - 1);
         } else {
             *ecx = 0;
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 4c89bee8d1..a345172afd 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1629,7 +1629,6 @@ typedef struct CPUX86State {
     TPRAccess tpr_access_type;
 
     unsigned nr_dies;
-    unsigned pkg_offset;
 } CPUX86State;
 
 struct kvm_msrs;




* [PATCH v5 3/8] Revert "target/i386: Enable new apic id encoding for EPYC based cpus models"
From: Babu Moger @ 2020-08-21 22:12 UTC (permalink / raw)
  To: pbonzini, rth, ehabkost, imammedo; +Cc: qemu-devel, mst

Remove the EPYC specific apicid decoding and use the generic
default decoding.

This reverts commit 247b18c593ec298446645af8d5d28911daf653b1.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 target/i386/cpu.c |    2 --
 1 file changed, 2 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 83acbce3e9..567d864051 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -3993,7 +3993,6 @@ static X86CPUDefinition builtin_x86_defs[] = {
         .xlevel = 0x8000001E,
         .model_id = "AMD EPYC Processor",
         .cache_info = &epyc_cache_info,
-        .use_epyc_apic_id_encoding = 1,
         .versions = (X86CPUVersionDefinition[]) {
             { .version = 1 },
             {
@@ -4121,7 +4120,6 @@ static X86CPUDefinition builtin_x86_defs[] = {
         .xlevel = 0x8000001E,
         .model_id = "AMD EPYC-Rome Processor",
         .cache_info = &epyc_rome_cache_info,
-        .use_epyc_apic_id_encoding = 1,
     },
 };
 




* [PATCH v5 4/8] Revert "hw/i386: Move arch_id decode inside x86_cpus_init"
From: Babu Moger @ 2020-08-21 22:12 UTC (permalink / raw)
  To: pbonzini, rth, ehabkost, imammedo; +Cc: qemu-devel, mst

Remove the EPYC specific apicid decoding and use the generic
default decoding.

This reverts commit 2e26f4ab3bf8390a2677d3afd9b1a04f015d7721.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 hw/i386/pc.c  |    6 +++---
 hw/i386/x86.c |   37 +++++++------------------------------
 2 files changed, 10 insertions(+), 33 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index e74b3cb1d8..6abe7723e0 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1552,14 +1552,14 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
         topo_ids.die_id = cpu->die_id;
         topo_ids.core_id = cpu->core_id;
         topo_ids.smt_id = cpu->thread_id;
-        cpu->apic_id = x86ms->apicid_from_topo_ids(&topo_info, &topo_ids);
+        cpu->apic_id = x86_apicid_from_topo_ids(&topo_info, &topo_ids);
     }
 
     cpu_slot = pc_find_cpu_slot(MACHINE(pcms), cpu->apic_id, &idx);
     if (!cpu_slot) {
         MachineState *ms = MACHINE(pcms);
 
-        x86ms->topo_ids_from_apicid(cpu->apic_id, &topo_info, &topo_ids);
+        x86_topo_ids_from_apicid(cpu->apic_id, &topo_info, &topo_ids);
         error_setg(errp,
             "Invalid CPU [socket: %u, die: %u, core: %u, thread: %u] with"
             " APIC ID %" PRIu32 ", valid index range 0:%d",
@@ -1580,7 +1580,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
     /* TODO: move socket_id/core_id/thread_id checks into x86_cpu_realizefn()
      * once -smp refactoring is complete and there will be CPU private
      * CPUState::nr_cores and CPUState::nr_threads fields instead of globals */
-    x86ms->topo_ids_from_apicid(cpu->apic_id, &topo_info, &topo_ids);
+    x86_topo_ids_from_apicid(cpu->apic_id, &topo_info, &topo_ids);
     if (cpu->socket_id != -1 && cpu->socket_id != topo_ids.pkg_id) {
         error_setg(errp, "property socket-id: %u doesn't match set apic-id:"
             " 0x%x (socket-id: %u)", cpu->socket_id, cpu->apic_id,
diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index f6eab947df..0a7cf8336c 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -67,22 +67,6 @@ inline void init_topo_info(X86CPUTopoInfo *topo_info,
     topo_info->threads_per_core = ms->smp.threads;
 }
 
-/*
- * Set up with the new EPYC topology handlers
- *
- * AMD uses different apic id encoding for EPYC based cpus. Override
- * the default topo handlers with EPYC encoding handlers.
- */
-static void x86_set_epyc_topo_handlers(MachineState *machine)
-{
-    X86MachineState *x86ms = X86_MACHINE(machine);
-
-    x86ms->apicid_from_cpu_idx = x86_apicid_from_cpu_idx_epyc;
-    x86ms->topo_ids_from_apicid = x86_topo_ids_from_apicid_epyc;
-    x86ms->apicid_from_topo_ids = x86_apicid_from_topo_ids_epyc;
-    x86ms->apicid_pkg_offset = apicid_pkg_offset_epyc;
-}
-
 /*
  * Calculates initial APIC ID for a specific CPU index
  *
@@ -101,7 +85,7 @@ uint32_t x86_cpu_apic_id_from_index(X86MachineState *x86ms,
 
     init_topo_info(&topo_info, x86ms);
 
-    correct_id = x86ms->apicid_from_cpu_idx(&topo_info, cpu_index);
+    correct_id = x86_apicid_from_cpu_idx(&topo_info, cpu_index);
     if (x86mc->compat_apic_id_mode) {
         if (cpu_index != correct_id && !warned && !qtest_enabled()) {
             error_report("APIC IDs set in compatibility mode, "
@@ -135,11 +119,6 @@ void x86_cpus_init(X86MachineState *x86ms, int default_cpu_version)
     MachineState *ms = MACHINE(x86ms);
     MachineClass *mc = MACHINE_GET_CLASS(x86ms);
 
-    /* Check for apicid encoding */
-    if (cpu_x86_use_epyc_apic_id_encoding(ms->cpu_type)) {
-        x86_set_epyc_topo_handlers(ms);
-    }
-
     x86_cpu_set_default_version(default_cpu_version);
 
     /*
@@ -153,12 +132,6 @@ void x86_cpus_init(X86MachineState *x86ms, int default_cpu_version)
     x86ms->apic_id_limit = x86_cpu_apic_id_from_index(x86ms,
                                                       ms->smp.max_cpus - 1) + 1;
     possible_cpus = mc->possible_cpu_arch_ids(ms);
-
-    for (i = 0; i < ms->possible_cpus->len; i++) {
-        ms->possible_cpus->cpus[i].arch_id =
-            x86_cpu_apic_id_from_index(x86ms, i);
-    }
-
     for (i = 0; i < ms->smp.cpus; i++) {
         x86_cpu_new(x86ms, possible_cpus->cpus[i].arch_id, &error_fatal);
     }
@@ -183,7 +156,8 @@ int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx)
    init_topo_info(&topo_info, x86ms);
 
    assert(idx < ms->possible_cpus->len);
-   x86_topo_ids_from_idx(&topo_info, idx, &topo_ids);
+   x86_topo_ids_from_apicid(ms->possible_cpus->cpus[idx].arch_id,
+                            &topo_info, &topo_ids);
    return topo_ids.pkg_id % ms->numa_state->num_nodes;
 }
 
@@ -214,7 +188,10 @@ const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
 
         ms->possible_cpus->cpus[i].type = ms->cpu_type;
         ms->possible_cpus->cpus[i].vcpus_count = 1;
-        x86_topo_ids_from_idx(&topo_info, i, &topo_ids);
+        ms->possible_cpus->cpus[i].arch_id =
+            x86_cpu_apic_id_from_index(x86ms, i);
+        x86_topo_ids_from_apicid(ms->possible_cpus->cpus[i].arch_id,
+                                 &topo_info, &topo_ids);
         ms->possible_cpus->cpus[i].props.has_socket_id = true;
         ms->possible_cpus->cpus[i].props.socket_id = topo_ids.pkg_id;
         if (x86ms->smp_dies > 1) {




* [PATCH v5 5/8] Revert "i386: Introduce use_epyc_apic_id_encoding in X86CPUDefinition"
From: Babu Moger @ 2020-08-21 22:12 UTC (permalink / raw)
  To: pbonzini, rth, ehabkost, imammedo; +Cc: qemu-devel, mst

Remove the EPYC specific apicid decoding and use the generic
default decoding.

This reverts commit 0c1538cb1a26287c072645f4759b9872b1596d79.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 target/i386/cpu.c |   16 ----------------
 target/i386/cpu.h |    1 -
 2 files changed, 17 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 567d864051..19198e3e7f 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1636,10 +1636,6 @@ typedef struct X86CPUDefinition {
     FeatureWordArray features;
     const char *model_id;
     CPUCaches *cache_info;
-
-    /* Use AMD EPYC encoding for apic id */
-    bool use_epyc_apic_id_encoding;
-
     /*
      * Definitions for alternative versions of CPU model.
      * List is terminated by item with version == 0.
@@ -1681,18 +1677,6 @@ static const X86CPUVersionDefinition *x86_cpu_def_get_versions(X86CPUDefinition
     return def->versions ?: default_version_list;
 }
 
-bool cpu_x86_use_epyc_apic_id_encoding(const char *cpu_type)
-{
-    X86CPUClass *xcc = X86_CPU_CLASS(object_class_by_name(cpu_type));
-
-    assert(xcc);
-    if (xcc->model && xcc->model->cpudef) {
-        return xcc->model->cpudef->use_epyc_apic_id_encoding;
-    } else {
-        return false;
-    }
-}
-
 static CPUCaches epyc_cache_info = {
     .l1d_cache = &(CPUCacheInfo) {
         .type = DATA_CACHE,
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index a345172afd..d3097be6a5 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1917,7 +1917,6 @@ void cpu_clear_apic_feature(CPUX86State *env);
 void host_cpuid(uint32_t function, uint32_t count,
                 uint32_t *eax, uint32_t *ebx, uint32_t *ecx, uint32_t *edx);
 void host_vendor_fms(char *vendor, int *family, int *model, int *stepping);
-bool cpu_x86_use_epyc_apic_id_encoding(const char *cpu_type);
 
 /* helper.c */
 bool x86_cpu_tlb_fill(CPUState *cs, vaddr address, int size,




* [PATCH v5 6/8] Revert "hw/i386: Introduce apicid functions inside X86MachineState"
  2020-08-21 22:12 [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode Babu Moger
                   ` (4 preceding siblings ...)
  2020-08-21 22:12 ` [PATCH v5 5/8] Revert "i386: Introduce use_epyc_apic_id_encoding in X86CPUDefinition" Babu Moger
@ 2020-08-21 22:12 ` Babu Moger
  2020-08-21 22:13 ` [PATCH v5 7/8] Revert "hw/386: Add EPYC mode topology decoding functions" Babu Moger
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 51+ messages in thread
From: Babu Moger @ 2020-08-21 22:12 UTC (permalink / raw)
  To: pbonzini, rth, ehabkost, imammedo; +Cc: qemu-devel, mst

Remove the EPYC specific apicid decoding and use the generic
default decoding.

This reverts commit 6121c7fbfd98dbc3af1b00b56ff2eef66df87828.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 hw/i386/x86.c         |    5 -----
 include/hw/i386/x86.h |    9 ---------
 2 files changed, 14 deletions(-)

diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 0a7cf8336c..c9dba0811a 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -895,11 +895,6 @@ static void x86_machine_initfn(Object *obj)
     x86ms->smm = ON_OFF_AUTO_AUTO;
     x86ms->acpi = ON_OFF_AUTO_AUTO;
     x86ms->smp_dies = 1;
-
-    x86ms->apicid_from_cpu_idx = x86_apicid_from_cpu_idx;
-    x86ms->topo_ids_from_apicid = x86_topo_ids_from_apicid;
-    x86ms->apicid_from_topo_ids = x86_apicid_from_topo_ids;
-    x86ms->apicid_pkg_offset = apicid_pkg_offset;
 }
 
 static void x86_machine_class_init(ObjectClass *oc, void *data)
diff --git a/include/hw/i386/x86.h b/include/hw/i386/x86.h
index b79f24e285..4d9a26326d 100644
--- a/include/hw/i386/x86.h
+++ b/include/hw/i386/x86.h
@@ -63,15 +63,6 @@ typedef struct {
     OnOffAuto smm;
     OnOffAuto acpi;
 
-    /* Apic id specific handlers */
-    uint32_t (*apicid_from_cpu_idx)(X86CPUTopoInfo *topo_info,
-                                    unsigned cpu_index);
-    void (*topo_ids_from_apicid)(apic_id_t apicid, X86CPUTopoInfo *topo_info,
-                                 X86CPUTopoIDs *topo_ids);
-    apic_id_t (*apicid_from_topo_ids)(X86CPUTopoInfo *topo_info,
-                                      const X86CPUTopoIDs *topo_ids);
-    uint32_t (*apicid_pkg_offset)(X86CPUTopoInfo *topo_info);
-
     /*
      * Address space used by IOAPIC device. All IOAPIC interrupts
      * will be translated to MSI messages in the address space.



^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v5 7/8] Revert "hw/386: Add EPYC mode topology decoding functions"
  2020-08-21 22:12 [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode Babu Moger
                   ` (5 preceding siblings ...)
  2020-08-21 22:12 ` [PATCH v5 6/8] Revert "hw/i386: Introduce apicid functions inside X86MachineState" Babu Moger
@ 2020-08-21 22:13 ` Babu Moger
  2020-08-28 17:27   ` Eduardo Habkost
  2020-08-21 22:13 ` [PATCH v5 8/8] i386: Simplify CPUID_8000_001E for AMD Babu Moger
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 51+ messages in thread
From: Babu Moger @ 2020-08-21 22:13 UTC (permalink / raw)
  To: pbonzini, rth, ehabkost, imammedo; +Cc: qemu-devel, mst

Remove the EPYC specific apicid decoding and use the generic
default decoding.

This reverts commit 7568b205555a6405042f62c64af3268f4330aed5.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 include/hw/i386/topology.h |   79 --------------------------------------------
 target/i386/cpu.c          |    2 +
 2 files changed, 1 insertion(+), 80 deletions(-)

diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index 05ddde7ba0..81573f6cfd 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -107,85 +107,6 @@ static inline unsigned apicid_pkg_offset(X86CPUTopoInfo *topo_info)
     return apicid_die_offset(topo_info) + apicid_die_width(topo_info);
 }
 
-#define EPYC_DIE_OFFSET 3 /* Minimum die_id offset if numa configured */
-
-/*
- * Bit offset of the die_id field
- */
-static inline unsigned apicid_die_offset_epyc(X86CPUTopoInfo *topo_info)
-{
-    unsigned offset = apicid_core_offset(topo_info) +
-                      apicid_core_width(topo_info);
-
-    return MAX(EPYC_DIE_OFFSET, offset);
-}
-
-/* Bit offset of the Pkg_ID (socket ID) field */
-static inline unsigned apicid_pkg_offset_epyc(X86CPUTopoInfo *topo_info)
-{
-    return apicid_die_offset_epyc(topo_info) + apicid_die_width(topo_info);
-}
-
-/*
- * Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
- *
- * The caller must make sure core_id < nr_cores and smt_id < nr_threads.
- */
-static inline apic_id_t
-x86_apicid_from_topo_ids_epyc(X86CPUTopoInfo *topo_info,
-                              const X86CPUTopoIDs *topo_ids)
-{
-    return (topo_ids->pkg_id  << apicid_pkg_offset_epyc(topo_info)) |
-           (topo_ids->die_id  << apicid_die_offset_epyc(topo_info)) |
-           (topo_ids->core_id << apicid_core_offset(topo_info)) |
-           topo_ids->smt_id;
-}
-
-static inline void x86_topo_ids_from_idx_epyc(X86CPUTopoInfo *topo_info,
-                                              unsigned cpu_index,
-                                              X86CPUTopoIDs *topo_ids)
-{
-    unsigned nr_dies = topo_info->dies_per_pkg;
-    unsigned nr_cores = topo_info->cores_per_die;
-    unsigned nr_threads = topo_info->threads_per_core;
-
-    topo_ids->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
-    topo_ids->die_id = cpu_index / (nr_cores * nr_threads) % nr_dies;
-    topo_ids->core_id = cpu_index / nr_threads % nr_cores;
-    topo_ids->smt_id = cpu_index % nr_threads;
-}
-
-/*
- * Calculate thread/core/package IDs for a specific topology,
- * based on APIC ID
- */
-static inline void x86_topo_ids_from_apicid_epyc(apic_id_t apicid,
-                                            X86CPUTopoInfo *topo_info,
-                                            X86CPUTopoIDs *topo_ids)
-{
-    topo_ids->smt_id = apicid &
-            ~(0xFFFFFFFFUL << apicid_smt_width(topo_info));
-    topo_ids->core_id =
-            (apicid >> apicid_core_offset(topo_info)) &
-            ~(0xFFFFFFFFUL << apicid_core_width(topo_info));
-    topo_ids->die_id =
-            (apicid >> apicid_die_offset_epyc(topo_info)) &
-            ~(0xFFFFFFFFUL << apicid_die_width(topo_info));
-    topo_ids->pkg_id = apicid >> apicid_pkg_offset_epyc(topo_info);
-}
-
-/*
- * Make APIC ID for the CPU 'cpu_index'
- *
- * 'cpu_index' is a sequential, contiguous ID for the CPU.
- */
-static inline apic_id_t x86_apicid_from_cpu_idx_epyc(X86CPUTopoInfo *topo_info,
-                                                     unsigned cpu_index)
-{
-    X86CPUTopoIDs topo_ids;
-    x86_topo_ids_from_idx_epyc(topo_info, cpu_index, &topo_ids);
-    return x86_apicid_from_topo_ids_epyc(topo_info, &topo_ids);
-}
 /* Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
  *
  * The caller must make sure core_id < nr_cores and smt_id < nr_threads.
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 19198e3e7f..b29686220e 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -388,7 +388,7 @@ static void encode_topo_cpuid8000001e(X86CPUTopoInfo *topo_info, X86CPU *cpu,
     unsigned long dies = topo_info->dies_per_pkg;
     int shift;
 
-    x86_topo_ids_from_apicid_epyc(cpu->apic_id, topo_info, &topo_ids);
+    x86_topo_ids_from_apicid(cpu->apic_id, topo_info, &topo_ids);
 
     *eax = cpu->apic_id;
     /*



^ permalink raw reply related	[flat|nested] 51+ messages in thread
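
The generic decode that patch 7/8 switches to can be pictured with a small standalone sketch. This is not the QEMU code: TopoInfo, TopoIDs and the helper names below are simplified, hypothetical stand-ins for X86CPUTopoInfo, X86CPUTopoIDs and the helpers in include/hw/i386/topology.h, and the sketch assumes every count is at least 1 and small enough that all field widths stay well below 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for X86CPUTopoInfo. */
typedef struct {
    unsigned dies_per_pkg;
    unsigned cores_per_die;
    unsigned threads_per_core;
} TopoInfo;

/* Hypothetical, simplified stand-in for X86CPUTopoIDs. */
typedef struct {
    unsigned pkg_id;
    unsigned die_id;
    unsigned core_id;
    unsigned smt_id;
} TopoIDs;

/* Bits needed to hold IDs 0..count-1 (0 bits when count == 1). */
static inline unsigned bitwidth_for_count(unsigned count)
{
    unsigned width = 0;

    assert(count >= 1);
    while ((1u << width) < count) {
        width++;
    }
    return width;
}

/* Mask with the low 'width' bits set; assumes width < 32. */
static inline uint32_t low_bits(unsigned width)
{
    return (1u << width) - 1;
}

/* Field offsets: smt_id sits at bit 0, then core_id, die_id, pkg_id. */
static inline unsigned core_offset(const TopoInfo *t)
{
    return bitwidth_for_count(t->threads_per_core);
}

static inline unsigned die_offset(const TopoInfo *t)
{
    return core_offset(t) + bitwidth_for_count(t->cores_per_die);
}

static inline unsigned pkg_offset(const TopoInfo *t)
{
    return die_offset(t) + bitwidth_for_count(t->dies_per_pkg);
}

/* Pack the topology IDs into an APIC ID. */
static inline uint32_t apicid_from_topo_ids(const TopoInfo *t,
                                            const TopoIDs *ids)
{
    return (ids->pkg_id << pkg_offset(t)) |
           (ids->die_id << die_offset(t)) |
           (ids->core_id << core_offset(t)) |
           ids->smt_id;
}

/* Unpack an APIC ID back into topology IDs. */
static inline void topo_ids_from_apicid(uint32_t apicid, const TopoInfo *t,
                                        TopoIDs *ids)
{
    ids->smt_id = apicid & low_bits(bitwidth_for_count(t->threads_per_core));
    ids->core_id = (apicid >> core_offset(t)) &
                   low_bits(bitwidth_for_count(t->cores_per_die));
    ids->die_id = (apicid >> die_offset(t)) &
                  low_bits(bitwidth_for_count(t->dies_per_pkg));
    ids->pkg_id = apicid >> pkg_offset(t);
}
```

With dies_per_pkg=2, cores_per_die=4 and threads_per_core=2, the tuple pkg_id=1, die_id=1, core_id=3, smt_id=1 packs to APIC ID 31 and decodes back to the same four IDs.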

* [PATCH v5 8/8] i386: Simplify CPUID_8000_001E for AMD
  2020-08-21 22:12 [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode Babu Moger
                   ` (6 preceding siblings ...)
  2020-08-21 22:13 ` [PATCH v5 7/8] Revert "hw/386: Add EPYC mode topology decoding functions" Babu Moger
@ 2020-08-21 22:13 ` Babu Moger
  2020-08-26 12:19   ` Igor Mammedov
  2020-08-24 18:41 ` [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode Dr. David Alan Gilbert
  2020-08-26 12:38 ` Igor Mammedov
  9 siblings, 1 reply; 51+ messages in thread
From: Babu Moger @ 2020-08-21 22:13 UTC (permalink / raw)
  To: pbonzini, rth, ehabkost, imammedo; +Cc: qemu-devel, mst

apic_id contains all the information required to build
CPUID_8000_001E. core_id and node_id are already part of the
apic_id and are decoded by x86_topo_ids_from_apicid.
Also remove the restriction on the number of bits used for
core_id and node_id.

Remove all the hardcoded values and replace with generalized
fields.

Refer to the Processor Programming Reference (PPR) documentation
available from the bugzilla link below.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 target/i386/cpu.c |   81 ++++++++++++++++++++++++-----------------------------
 1 file changed, 37 insertions(+), 44 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index b29686220e..bea2822923 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -385,58 +385,51 @@ static void encode_topo_cpuid8000001e(X86CPUTopoInfo *topo_info, X86CPU *cpu,
                                        uint32_t *ecx, uint32_t *edx)
 {
     X86CPUTopoIDs topo_ids = {0};
-    unsigned long dies = topo_info->dies_per_pkg;
-    int shift;
 
     x86_topo_ids_from_apicid(cpu->apic_id, topo_info, &topo_ids);
 
     *eax = cpu->apic_id;
+
     /*
-     * CPUID_Fn8000001E_EBX
-     * 31:16 Reserved
-     * 15:8  Threads per core (The number of threads per core is
-     *       Threads per core + 1)
-     *  7:0  Core id (see bit decoding below)
-     *       SMT:
-     *           4:3 node id
-     *             2 Core complex id
-     *           1:0 Core id
-     *       Non SMT:
-     *           5:4 node id
-     *             3 Core complex id
-     *           1:0 Core id
+     * CPUID_Fn8000001E_EBX [Core Identifiers] (CoreId)
+     * Read-only. Reset: 0000_XXXXh.
+     * See Core::X86::Cpuid::ExtApicId.
+     * Core::X86::Cpuid::CoreId_lthree[1:0]_core[3:0]_thread[1:0];
+     * Bits Description
+     * 31:16 Reserved.
+     * 15:8 ThreadsPerCore: threads per core. Read-only. Reset: XXh.
+     *      The number of threads per core is ThreadsPerCore+1.
+     *  7:0 CoreId: core ID. Read-only. Reset: XXh.
+     *
+     *  NOTE: CoreId is already part of apic_id. Just use it. We can
+     *  use all the 8 bits to represent the core_id here.
      */
-    *ebx = ((topo_info->threads_per_core - 1) << 8) | (topo_ids.die_id << 3) |
-            (topo_ids.core_id);
+    *ebx = ((topo_info->threads_per_core - 1) << 8) | (topo_ids.core_id & 0xFF);
+
     /*
-     * CPUID_Fn8000001E_ECX
-     * 31:11 Reserved
-     * 10:8  Nodes per processor (Nodes per processor is number of nodes + 1)
-     *  7:0  Node id (see bit decoding below)
-     *         2  Socket id
-     *       1:0  Node id
+     * CPUID_Fn8000001E_ECX [Node Identifiers] (NodeId)
+     * Read-only. Reset: 0000_0XXXh.
+     * Core::X86::Cpuid::NodeId_lthree[1:0]_core[3:0]_thread[1:0];
+     * Bits Description
+     * 31:11 Reserved.
+     * 10:8 NodesPerProcessor: Node per processor. Read-only. Reset: XXXb.
+     *      ValidValues:
+     *      Value Description
+     *      000b  1 node per processor.
+     *      001b  2 nodes per processor.
+     *      010b Reserved.
+     *      011b 4 nodes per processor.
+     *      111b-100b Reserved.
+     *  7:0 NodeId: Node ID. Read-only. Reset: XXh.
+     *
+     * NOTE: Hardware reserves 3 bits for number of nodes per processor.
+     * But users can create more nodes than the actual hardware can
+     * support. To generalize we can use all the upper 8 bits for nodes.
+     * NodeId is combination of node and socket_id which is already decoded
+     * in apic_id. Just use it by shifting.
      */
-    if (dies <= 4) {
-        *ecx = ((dies - 1) << 8) | (topo_ids.pkg_id << 2) | topo_ids.die_id;
-    } else {
-        /*
-         * Node id fix up. Actual hardware supports up to 4 nodes. But with
-         * more than 32 cores, we may end up with more than 4 nodes.
-         * Node id is a combination of socket id and node id. Only requirement
-         * here is that this number should be unique accross the system.
-         * Shift the socket id to accommodate more nodes. We dont expect both
-         * socket id and node id to be big number at the same time. This is not
-         * an ideal config but we need to to support it. Max nodes we can have
-         * is 32 (255/8) with 8 cores per node and 255 max cores. We only need
-         * 5 bits for nodes. Find the left most set bit to represent the total
-         * number of nodes. find_last_bit returns last set bit(0 based). Left
-         * shift(+1) the socket id to represent all the nodes.
-         */
-        dies -= 1;
-        shift = find_last_bit(&dies, 8);
-        *ecx = (dies << 8) | (topo_ids.pkg_id << (shift + 1)) |
-               topo_ids.die_id;
-    }
+    *ecx = ((topo_info->dies_per_pkg - 1) << 8) |
+           ((cpu->apic_id >> apicid_die_offset(topo_info)) & 0xFF);
     *edx = 0;
 }
 



^ permalink raw reply related	[flat|nested] 51+ messages in thread
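
As a rough illustration of the simplified encoding in patch 8/8, the sketch below computes the four registers from an APIC ID and the topology counts. It is a standalone approximation, not the QEMU function: TopoInfo and the helper names are hypothetical stand-ins, and it assumes the generic APIC ID layout (smt, core, die, pkg from bit 0 upwards) with counts of at least 1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for X86CPUTopoInfo. */
typedef struct {
    unsigned dies_per_pkg;
    unsigned cores_per_die;
    unsigned threads_per_core;
} TopoInfo;

/* Bits needed to hold IDs 0..count-1 (0 bits when count == 1). */
static inline unsigned bitwidth_for_count(unsigned count)
{
    unsigned width = 0;

    assert(count >= 1);
    while ((1u << width) < count) {
        width++;
    }
    return width;
}

/*
 * Sketch of the CPUID Fn8000_001E encoding after the simplification:
 *   EAX: the APIC ID itself
 *   EBX: [15:8] threads per core - 1, [7:0] core_id taken from the APIC ID
 *   ECX: [10:8] dies per package - 1, [7:0] APIC ID bits from the die
 *        offset upwards, i.e. die_id and pkg_id combined
 *   EDX: 0
 */
static inline void encode_cpuid8000001e(const TopoInfo *t, uint32_t apic_id,
                                        uint32_t *eax, uint32_t *ebx,
                                        uint32_t *ecx, uint32_t *edx)
{
    unsigned core_off = bitwidth_for_count(t->threads_per_core);
    unsigned core_width = bitwidth_for_count(t->cores_per_die);
    unsigned die_off = core_off + core_width;
    uint32_t core_id = (apic_id >> core_off) & ((1u << core_width) - 1);

    *eax = apic_id;
    *ebx = ((t->threads_per_core - 1) << 8) | (core_id & 0xFF);
    *ecx = ((t->dies_per_pkg - 1) << 8) | ((apic_id >> die_off) & 0xFF);
    *edx = 0;
}
```

For dies_per_pkg=2, cores_per_die=4, threads_per_core=2 and APIC ID 31 (pkg 1, die 1, core 3, thread 1), this gives EBX = 0x103 and ECX = 0x103: the NodeId field is 3 because it combines pkg_id 1 and die_id 1.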

* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-21 22:12 [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode Babu Moger
                   ` (7 preceding siblings ...)
  2020-08-21 22:13 ` [PATCH v5 8/8] i386: Simplify CPUID_8000_001E for AMD Babu Moger
@ 2020-08-24 18:41 ` Dr. David Alan Gilbert
  2020-08-24 19:20   ` Babu Moger
  2020-08-26 12:38 ` Igor Mammedov
  9 siblings, 1 reply; 51+ messages in thread
From: Dr. David Alan Gilbert @ 2020-08-24 18:41 UTC (permalink / raw)
  To: Babu Moger; +Cc: ehabkost, mst, qemu-devel, imammedo, pbonzini, rth

* Babu Moger (babu.moger@amd.com) wrote:
> To support some of the complex topology, we introduced EPYC mode apicid decode.
> But, EPYC mode decode is running into problems. Also it can become quite a
> maintenance problem in the future. So, it was decided to remove that code and
> use the generic decode which works for majority of the topology. Most of the
> SPECed configuration would work just fine. With some non-SPECed user inputs,
> it will create some sub-optimal configuration.
> Here is the discussion thread.
> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> 
> This series removes all the EPYC mode specific apicid changes and use the generic
> apicid decode.

Hi Babu,
  This does simplify things a lot!
One worry, what happens about a live migration of a VM from an old qemu
that was using the node-id to a qemu with this new scheme?

Dave

> ---
> v5:
>  Revert EPYC specific decode.
>  Simplify CPUID_8000_001E
> 
> v4:
>   https://lore.kernel.org/qemu-devel/159744083536.39197.13827776633866601278.stgit@naples-babu.amd.com/
>   Not much of a change. Just added few text changes.
>   Error out configuration instead of warning if dies are not configured in EPYC.
>   Few other text changes to clarify the removal of node_id, nr_nodes and nodes_per_pkg.
> 
> v3:
>   https://lore.kernel.org/qemu-devel/159681772267.9679.1334429994189974662.stgit@naples-babu.amd.com/#r
>   Added a new check to pass the dies for EPYC numa configuration.
>   Added Simplify CPUID_8000_001E patch with some changes suggested by Igor.
>   Dropped the patch to build the topology from CpuInstanceProperties.
>   TODO: Not sure if we still need the Autonuma changes Igor mentioned.
>   Needs more clarity on that.
> 
> v2:
>   https://lore.kernel.org/qemu-devel/159362436285.36204.986406297373871949.stgit@naples-babu.amd.com/
>   Used the numa information from CpuInstanceProperties for building
>   the apic_id suggested by Igor.
>   Also did some minor code re-aarangement to take care of changes.
>   Dropped the patch "Simplify CPUID_8000_001E" from v1. Will send
>   it later.
> 
> v1:
>  https://lore.kernel.org/qemu-devel/159164739269.20543.3074052993891532749.stgit@naples-babu.amd.com
> 
> Babu Moger (8):
>       hw/i386: Remove node_id, nr_nodes and nodes_per_pkg from topology
>       Revert "i386: Fix pkg_id offset for EPYC cpu models"
>       Revert "target/i386: Enable new apic id encoding for EPYC based cpus models"
>       Revert "hw/i386: Move arch_id decode inside x86_cpus_init"
>       Revert "i386: Introduce use_epyc_apic_id_encoding in X86CPUDefinition"
>       Revert "hw/i386: Introduce apicid functions inside X86MachineState"
>       Revert "hw/386: Add EPYC mode topology decoding functions"
>       i386: Simplify CPUID_8000_001E for AMD
> 
> 
>  hw/i386/pc.c               |    8 +--
>  hw/i386/x86.c              |   43 +++-------------
>  include/hw/i386/topology.h |  101 ---------------------------------------
>  include/hw/i386/x86.h      |    9 ---
>  target/i386/cpu.c          |  115 ++++++++++++++++----------------------------
>  target/i386/cpu.h          |    3 -
>  tests/test-x86-cpuid.c     |   40 ++++++++-------
>  7 files changed, 73 insertions(+), 246 deletions(-)
> 
> --
> Signature
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-24 18:41 ` [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode Dr. David Alan Gilbert
@ 2020-08-24 19:20   ` Babu Moger
  2020-08-25  8:15     ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 51+ messages in thread
From: Babu Moger @ 2020-08-24 19:20 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: ehabkost, mst, qemu-devel, imammedo, pbonzini, rth

Hi Dave,

On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:
> * Babu Moger (babu.moger@amd.com) wrote:
>> To support some of the complex topology, we introduced EPYC mode apicid decode.
>> But, EPYC mode decode is running into problems. Also it can become quite a
>> maintenance problem in the future. So, it was decided to remove that code and
>> use the generic decode which works for majority of the topology. Most of the
>> SPECed configuration would work just fine. With some non-SPECed user inputs,
>> it will create some sub-optimal configuration.
>> Here is the discussion thread.
>> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
>>
>> This series removes all the EPYC mode specific apicid changes and use the generic
>> apicid decode.
> 
> Hi Babu,
>   This does simplify things a lot!
> One worry, what happens about a live migration of a VM from an old qemu
> that was using the node-id to a qemu with this new scheme?

The node_id which we introduced was only used internally. This wasn't
exposed outside. I don't think live migration will be an issue.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-24 19:20   ` Babu Moger
@ 2020-08-25  8:15     ` Dr. David Alan Gilbert
  2020-08-25 14:38       ` Igor Mammedov
  0 siblings, 1 reply; 51+ messages in thread
From: Dr. David Alan Gilbert @ 2020-08-25  8:15 UTC (permalink / raw)
  To: Babu Moger; +Cc: ehabkost, mst, qemu-devel, imammedo, pbonzini, rth

* Babu Moger (babu.moger@amd.com) wrote:
> Hi Dave,
> 
> On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:
> > * Babu Moger (babu.moger@amd.com) wrote:
> >> To support some of the complex topology, we introduced EPYC mode apicid decode.
> >> But, EPYC mode decode is running into problems. Also it can become quite a
> >> maintenance problem in the future. So, it was decided to remove that code and
> >> use the generic decode which works for majority of the topology. Most of the
> >> SPECed configuration would work just fine. With some non-SPECed user inputs,
> >> it will create some sub-optimal configuration.
> >> Here is the discussion thread.
> >> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> >>
> >> This series removes all the EPYC mode specific apicid changes and use the generic
> >> apicid decode.
> > 
> > Hi Babu,
> >   This does simplify things a lot!
> > One worry, what happens about a live migration of a VM from an old qemu
> > that was using the node-id to a qemu with this new scheme?
> 
> The node_id which we introduced was only used internally. This wasn't
> exposed outside. I don't think live migration will be an issue.

Didn't it become part of the APIC ID visible to the guest?

Dave

-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-25  8:15     ` Dr. David Alan Gilbert
@ 2020-08-25 14:38       ` Igor Mammedov
  2020-08-25 15:25         ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 51+ messages in thread
From: Igor Mammedov @ 2020-08-25 14:38 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: ehabkost, mst, qemu-devel, Babu Moger, pbonzini, rth

On Tue, 25 Aug 2020 09:15:04 +0100
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:

> * Babu Moger (babu.moger@amd.com) wrote:
> > Hi Dave,
> > 
> > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:  
> > > * Babu Moger (babu.moger@amd.com) wrote:  
> > >> To support some of the complex topology, we introduced EPYC mode apicid decode.
> > >> But, EPYC mode decode is running into problems. Also it can become quite a
> > >> maintenance problem in the future. So, it was decided to remove that code and
> > >> use the generic decode which works for majority of the topology. Most of the
> > >> SPECed configuration would work just fine. With some non-SPECed user inputs,
> > >> it will create some sub-optimal configuration.
> > >> Here is the discussion thread.
> > >> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > >>
> > >> This series removes all the EPYC mode specific apicid changes and use the generic
> > >> apicid decode.  
> > > 
> > > Hi Babu,
> > >   This does simplify things a lot!
> > > One worry, what happens about a live migration of a VM from an old qemu
> > > that was using the node-id to a qemu with this new scheme?  
> > 
> > The node_id which we introduced was only used internally. This wasn't
> > exposed outside. I don't think live migration will be an issue.  
> 
> Didn't it become part of the APIC ID visible to the guest?

Daniel asked a similar question wrt the hard error on start up
when the CLI is not sufficient to create an EPYC cpu.

https://www.mail-archive.com/qemu-devel@nongnu.org/msg728536.html

Migration might fall into the same category.
Also looking at the history, 5.0 commit 
  247b18c593ec29 target/i386: Enable new apic id encoding for EPYC based cpus models
silently broke APIC ID (without versioning) for all EPYC models (there were 1 new and 1 old one).

(I'm not aware of somebody complaining about it)

Another commit ed78467a21459, changed CPUID_8000_001E without versioning as well.


With the current EPYC apicid code, if all stars align (no numa or only 1 numa node on
the CLI and no -smp dies=) it might produce a valid CPU (apicid+CPUID_8000_001E).
No numa is a gray area, since the EPYC spec implies that it has to be a numa machine in the case of real EPYC cpus.
Multi-node configs would be correct only if the user assigns cpus to numa nodes
by duplicating the internal node_id algorithm that this series removes.

There might be other broken cases that I don't recall anymore
(should be mentioned in previous versions of this series)


To summarize from migration pov (ignoring ed78467a21459 change):

 1) old qemu pre-5.0 ==>  qemu 5.0, 5.1 - broken migration
 2) with this series (lets call it qemu 5.2)
     pre-5.0 ==> qemu 5.2 - should work, as the series basically rolls the current code back to pre-5.0
     qemu 5.0, 5.1 ==> qemu 5.2 - broken

It's all about picking which poison to choose.
I'd prefer the 2nd case, as it lets us drop a lot of complicated code that
doesn't work as expected.

PS:
 I didn't review it yet, but with this series we aren't
 making up internal node_ids that should match user provided numa node ids somehow.
 It seems series lost the patch that would enforce numa in case -smp dies>1,
 but otherwise it heads in the right direction.

> 
> Dave
> 



^ permalink raw reply	[flat|nested] 51+ messages in thread
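
The guest-visible difference behind the migration cases above can be sketched by comparing the two die offsets. The code is a standalone approximation based on the helpers removed in patch 7/8 (the EPYC variant clamps the die_id offset to a minimum of 3): TopoInfo, the function names and the 'epyc' flag are hypothetical, and counts are assumed to be at least 1.

```c
#include <assert.h>
#include <stdint.h>

#define EPYC_DIE_OFFSET 3 /* minimum die_id offset used by the removed code */

/* Hypothetical, simplified stand-in for X86CPUTopoInfo. */
typedef struct {
    unsigned dies_per_pkg;
    unsigned cores_per_die;
    unsigned threads_per_core;
} TopoInfo;

/* Bits needed to hold IDs 0..count-1 (0 bits when count == 1). */
static inline unsigned bitwidth_for_count(unsigned count)
{
    unsigned width = 0;

    assert(count >= 1);
    while ((1u << width) < count) {
        width++;
    }
    return width;
}

/* Generic layout: die_id starts right after the core_id field. */
static inline unsigned die_offset_generic(const TopoInfo *t)
{
    return bitwidth_for_count(t->threads_per_core) +
           bitwidth_for_count(t->cores_per_die);
}

/* Removed EPYC layout: same, but never below bit EPYC_DIE_OFFSET. */
static inline unsigned die_offset_epyc(const TopoInfo *t)
{
    unsigned offset = die_offset_generic(t);

    return offset > EPYC_DIE_OFFSET ? offset : EPYC_DIE_OFFSET;
}

/* APIC ID for a sequential cpu_index; 'epyc' selects the removed layout. */
static inline uint32_t apicid_from_cpu_idx(const TopoInfo *t,
                                           unsigned cpu_index, int epyc)
{
    unsigned nr_dies = t->dies_per_pkg;
    unsigned nr_cores = t->cores_per_die;
    unsigned nr_threads = t->threads_per_core;

    unsigned pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
    unsigned die_id = cpu_index / (nr_cores * nr_threads) % nr_dies;
    unsigned core_id = cpu_index / nr_threads % nr_cores;
    unsigned smt_id = cpu_index % nr_threads;

    unsigned core_off = bitwidth_for_count(nr_threads);
    unsigned die_off = epyc ? die_offset_epyc(t) : die_offset_generic(t);
    unsigned pkg_off = die_off + bitwidth_for_count(nr_dies);

    return (pkg_id << pkg_off) | (die_id << die_off) |
           (core_id << core_off) | smt_id;
}
```

With dies=2, cores=2 and threads=1, CPU index 2 gets APIC ID 2 in the generic layout but 8 in the removed EPYC layout, so a guest would see different APIC IDs on the two sides of a migration. With dies=2, cores=8 and threads=2 the die offset is already 4, and both layouts yield the same IDs.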

* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-25 14:38       ` Igor Mammedov
@ 2020-08-25 15:25         ` Dr. David Alan Gilbert
  2020-08-26 12:43           ` Igor Mammedov
  0 siblings, 1 reply; 51+ messages in thread
From: Dr. David Alan Gilbert @ 2020-08-25 15:25 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: ehabkost, mst, qemu-devel, Babu Moger, pbonzini, rth

* Igor Mammedov (imammedo@redhat.com) wrote:
> On Tue, 25 Aug 2020 09:15:04 +0100
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> 
> > * Babu Moger (babu.moger@amd.com) wrote:
> > > Hi Dave,
> > > 
> > > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:  
> > > > * Babu Moger (babu.moger@amd.com) wrote:  
> > > >> To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > >> But, EPYC mode decode is running into problems. Also it can become quite a
> > > >> maintenance problem in the future. So, it was decided to remove that code and
> > > >> use the generic decode which works for majority of the topology. Most of the
> > > >> SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > >> it will create some sub-optimal configuration.
> > > >> Here is the discussion thread.
> > > >> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > >>
> > > >> This series removes all the EPYC mode specific apicid changes and use the generic
> > > >> apicid decode.  
> > > > 
> > > > Hi Babu,
> > > >   This does simplify things a lot!
> > > > One worry, what happens about a live migration of a VM from an old qemu
> > > > that was using the node-id to a qemu with this new scheme?  
> > > 
> > > The node_id which we introduced was only used internally. This wasn't
> > > exposed outside. I don't think live migration will be an issue.  
> > 
> > Didn't it become part of the APIC ID visible to the guest?
> 
> Daniel asked a similar question wrt the hard error on start up
> when the CLI is not sufficient to create an EPYC cpu.
> 
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg728536.html
> 
> Migration might fall into the same category.
> Also looking at the history, 5.0 commit 
>   247b18c593ec29 target/i386: Enable new apic id encoding for EPYC based cpus models
> silently broke APIC ID (without versioning) for all EPYC models (there were 1 new and 1 old one).
> 
> (I'm not aware of somebody complaining about it)
> 
> Another commit ed78467a21459, changed CPUID_8000_001E without versioning as well.
> 
> 
> With the current EPYC apicid code, if all stars align (no numa or only 1 numa node on
> the CLI and no -smp dies=) it might produce a valid CPU (apicid+CPUID_8000_001E).
> No numa is a gray area, since the EPYC spec implies that it has to be a numa machine in the case of real EPYC cpus.
> Multi-node configs would be correct only if the user assigns cpus to numa nodes
> by duplicating the internal node_id algorithm that this series removes.
> 
> There might be other broken cases that I don't recall anymore
> (should be mentioned in previous versions of this series)
> 
> 
> To summarize from migration pov (ignoring ed78467a21459 change):
> 
>  1) old qemu pre-5.0 ==>  qemu 5.0, 5.1 - broken migration

Oh ....

>  2) with this series (lets call it qemu 5.2)
>      pre-5.0 ==> qemu 5.2 - should work, as the series basically rolls the current code back to pre-5.0
>      qemu 5.0, 5.1 ==> qemu 5.2 - broken
> 
> It's all about picking which poison to choose.
> I'd prefer the 2nd case, as it lets us drop a lot of complicated code that
> doesn't work as expected.

I think that would make our lives easier for other reasons; so I'm happy
to go with that.

> PS:
>  I didn't review it yet, but with this series we aren't
>  making up internal node_ids that should match user provided numa node ids somehow.
>  It seems series lost the patch that would enforce numa in case -smp dies>1,
>  but otherwise it heads in the right direction.

Dave

> > 
> > Dave
> > 
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v5 1/8] hw/i386: Remove node_id, nr_nodes and nodes_per_pkg from topology
  2020-08-21 22:12 ` [PATCH v5 1/8] hw/i386: Remove node_id, nr_nodes and nodes_per_pkg from topology Babu Moger
@ 2020-08-26  9:57   ` Igor Mammedov
  2020-08-26 17:31     ` Babu Moger
  0 siblings, 1 reply; 51+ messages in thread
From: Igor Mammedov @ 2020-08-26  9:57 UTC (permalink / raw)
  To: Babu Moger; +Cc: qemu-devel, pbonzini, mst, ehabkost, rth

On Fri, 21 Aug 2020 17:12:25 -0500
Babu Moger <babu.moger@amd.com> wrote:

> Remove node_id, nr_nodes and nodes_per_pkg from topology. Use
> die_id, nr_dies and dies_per_pkg which is already available.
> Removes the confusion over two variables.
> 
> With node_id removed in topology the uninitialized memory issue
> with -device and CPU hotplug will be fixed.
> 

It would be easier to review if you put all the reverts first
and then this patch on top.

To avoid the same request from others, I'd suggest just doing
that, which also avoids issues with merging.

Other than that, patches 2-7/8 look good to me.

> Link: https://bugzilla.redhat.com/show_bug.cgi?id=1828750
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
>  hw/i386/pc.c               |    1 -
>  hw/i386/x86.c              |    1 -
>  include/hw/i386/topology.h |   40 +++++++++-------------------------------
>  target/i386/cpu.c          |   24 ++++++++++--------------
>  target/i386/cpu.h          |    1 -
>  tests/test-x86-cpuid.c     |   40 ++++++++++++++++++++--------------------
>  6 files changed, 39 insertions(+), 68 deletions(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 47c5ca3e34..0ae5cb3af4 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1498,7 +1498,6 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>      init_topo_info(&topo_info, x86ms);
>  
>      env->nr_dies = x86ms->smp_dies;
> -    env->nr_nodes = topo_info.nodes_per_pkg;
>      env->pkg_offset = x86ms->apicid_pkg_offset(&topo_info);
>  
>      /*
> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> index 67bee1bcb8..f6eab947df 100644
> --- a/hw/i386/x86.c
> +++ b/hw/i386/x86.c
> @@ -62,7 +62,6 @@ inline void init_topo_info(X86CPUTopoInfo *topo_info,
>  {
>      MachineState *ms = MACHINE(x86ms);
>  
> -    topo_info->nodes_per_pkg = ms->numa_state->num_nodes / ms->smp.sockets;
>      topo_info->dies_per_pkg = x86ms->smp_dies;
>      topo_info->cores_per_die = ms->smp.cores;
>      topo_info->threads_per_core = ms->smp.threads;
> diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> index 07239f95f4..05ddde7ba0 100644
> --- a/include/hw/i386/topology.h
> +++ b/include/hw/i386/topology.h
> @@ -47,14 +47,12 @@ typedef uint32_t apic_id_t;
>  
>  typedef struct X86CPUTopoIDs {
>      unsigned pkg_id;
> -    unsigned node_id;
>      unsigned die_id;
>      unsigned core_id;
>      unsigned smt_id;
>  } X86CPUTopoIDs;
>  
>  typedef struct X86CPUTopoInfo {
> -    unsigned nodes_per_pkg;
>      unsigned dies_per_pkg;
>      unsigned cores_per_die;
>      unsigned threads_per_core;
> @@ -89,11 +87,6 @@ static inline unsigned apicid_die_width(X86CPUTopoInfo *topo_info)
>      return apicid_bitwidth_for_count(topo_info->dies_per_pkg);
>  }
>  
> -/* Bit width of the node_id field per socket */
> -static inline unsigned apicid_node_width_epyc(X86CPUTopoInfo *topo_info)
> -{
> -    return apicid_bitwidth_for_count(MAX(topo_info->nodes_per_pkg, 1));
> -}
>  /* Bit offset of the Core_ID field
>   */
>  static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
> @@ -114,30 +107,23 @@ static inline unsigned apicid_pkg_offset(X86CPUTopoInfo *topo_info)
>      return apicid_die_offset(topo_info) + apicid_die_width(topo_info);
>  }
>  
> -#define NODE_ID_OFFSET 3 /* Minimum node_id offset if numa configured */
> +#define EPYC_DIE_OFFSET 3 /* Minimum die_id offset if numa configured */
>  
>  /*
> - * Bit offset of the node_id field
> - *
> - * Make sure nodes_per_pkg >  0 if numa configured else zero.
> + * Bit offset of the die_id field
>   */
> -static inline unsigned apicid_node_offset_epyc(X86CPUTopoInfo *topo_info)
> +static inline unsigned apicid_die_offset_epyc(X86CPUTopoInfo *topo_info)
>  {
> -    unsigned offset = apicid_die_offset(topo_info) +
> -                      apicid_die_width(topo_info);
> +    unsigned offset = apicid_core_offset(topo_info) +
> +                      apicid_core_width(topo_info);
>  
> -    if (topo_info->nodes_per_pkg) {
> -        return MAX(NODE_ID_OFFSET, offset);
> -    } else {
> -        return offset;
> -    }
> +    return MAX(EPYC_DIE_OFFSET, offset);
>  }
>  
>  /* Bit offset of the Pkg_ID (socket ID) field */
>  static inline unsigned apicid_pkg_offset_epyc(X86CPUTopoInfo *topo_info)
>  {
> -    return apicid_node_offset_epyc(topo_info) +
> -           apicid_node_width_epyc(topo_info);
> +    return apicid_die_offset_epyc(topo_info) + apicid_die_width(topo_info);
>  }
>  
>  /*
> @@ -150,8 +136,7 @@ x86_apicid_from_topo_ids_epyc(X86CPUTopoInfo *topo_info,
>                                const X86CPUTopoIDs *topo_ids)
>  {
>      return (topo_ids->pkg_id  << apicid_pkg_offset_epyc(topo_info)) |
> -           (topo_ids->node_id << apicid_node_offset_epyc(topo_info)) |
> -           (topo_ids->die_id  << apicid_die_offset(topo_info)) |
> +           (topo_ids->die_id  << apicid_die_offset_epyc(topo_info)) |
>             (topo_ids->core_id << apicid_core_offset(topo_info)) |
>             topo_ids->smt_id;
>  }
> @@ -160,15 +145,11 @@ static inline void x86_topo_ids_from_idx_epyc(X86CPUTopoInfo *topo_info,
>                                                unsigned cpu_index,
>                                                X86CPUTopoIDs *topo_ids)
>  {
> -    unsigned nr_nodes = MAX(topo_info->nodes_per_pkg, 1);
>      unsigned nr_dies = topo_info->dies_per_pkg;
>      unsigned nr_cores = topo_info->cores_per_die;
>      unsigned nr_threads = topo_info->threads_per_core;
> -    unsigned cores_per_node = DIV_ROUND_UP((nr_dies * nr_cores * nr_threads),
> -                                            nr_nodes);
>  
>      topo_ids->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
> -    topo_ids->node_id = (cpu_index / cores_per_node) % nr_nodes;
>      topo_ids->die_id = cpu_index / (nr_cores * nr_threads) % nr_dies;
>      topo_ids->core_id = cpu_index / nr_threads % nr_cores;
>      topo_ids->smt_id = cpu_index % nr_threads;
> @@ -188,11 +169,8 @@ static inline void x86_topo_ids_from_apicid_epyc(apic_id_t apicid,
>              (apicid >> apicid_core_offset(topo_info)) &
>              ~(0xFFFFFFFFUL << apicid_core_width(topo_info));
>      topo_ids->die_id =
> -            (apicid >> apicid_die_offset(topo_info)) &
> +            (apicid >> apicid_die_offset_epyc(topo_info)) &
>              ~(0xFFFFFFFFUL << apicid_die_width(topo_info));
> -    topo_ids->node_id =
> -            (apicid >> apicid_node_offset_epyc(topo_info)) &
> -            ~(0xFFFFFFFFUL << apicid_node_width_epyc(topo_info));
>      topo_ids->pkg_id = apicid >> apicid_pkg_offset_epyc(topo_info);
>  }
>  
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 588f32e136..3c58af1f43 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -345,7 +345,6 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
>                                         uint32_t *ecx, uint32_t *edx)
>  {
>      uint32_t l3_cores;
> -    unsigned nodes = MAX(topo_info->nodes_per_pkg, 1);
>  
>      assert(cache->size == cache->line_size * cache->associativity *
>                            cache->partitions * cache->sets);
> @@ -355,10 +354,9 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
>  
>      /* L3 is shared among multiple cores */
>      if (cache->level == 3) {
> -        l3_cores = DIV_ROUND_UP((topo_info->dies_per_pkg *
> -                                 topo_info->cores_per_die *
> +        l3_cores = DIV_ROUND_UP((topo_info->cores_per_die *
>                                   topo_info->threads_per_core),
> -                                 nodes);
> +                                 topo_info->dies_per_pkg);
>          *eax |= (l3_cores - 1) << 14;
>      } else {
>          *eax |= ((topo_info->threads_per_core - 1) << 14);
> @@ -387,7 +385,7 @@ static void encode_topo_cpuid8000001e(X86CPUTopoInfo *topo_info, X86CPU *cpu,
>                                         uint32_t *ecx, uint32_t *edx)
>  {
>      X86CPUTopoIDs topo_ids = {0};
> -    unsigned long nodes = MAX(topo_info->nodes_per_pkg, 1);
> +    unsigned long dies = topo_info->dies_per_pkg;
>      int shift;
>  
>      x86_topo_ids_from_apicid_epyc(cpu->apic_id, topo_info, &topo_ids);
> @@ -408,7 +406,7 @@ static void encode_topo_cpuid8000001e(X86CPUTopoInfo *topo_info, X86CPU *cpu,
>       *             3 Core complex id
>       *           1:0 Core id
>       */
> -    *ebx = ((topo_info->threads_per_core - 1) << 8) | (topo_ids.node_id << 3) |
> +    *ebx = ((topo_info->threads_per_core - 1) << 8) | (topo_ids.die_id << 3) |
>              (topo_ids.core_id);
>      /*
>       * CPUID_Fn8000001E_ECX
> @@ -418,8 +416,8 @@ static void encode_topo_cpuid8000001e(X86CPUTopoInfo *topo_info, X86CPU *cpu,
>       *         2  Socket id
>       *       1:0  Node id
>       */
> -    if (nodes <= 4) {
> -        *ecx = ((nodes - 1) << 8) | (topo_ids.pkg_id << 2) | topo_ids.node_id;
> +    if (dies <= 4) {
> +        *ecx = ((dies - 1) << 8) | (topo_ids.pkg_id << 2) | topo_ids.die_id;
>      } else {
>          /*
>           * Node id fix up. Actual hardware supports up to 4 nodes. But with
> @@ -434,10 +432,10 @@ static void encode_topo_cpuid8000001e(X86CPUTopoInfo *topo_info, X86CPU *cpu,
>           * number of nodes. find_last_bit returns last set bit(0 based). Left
>           * shift(+1) the socket id to represent all the nodes.
>           */
> -        nodes -= 1;
> -        shift = find_last_bit(&nodes, 8);
> -        *ecx = (nodes << 8) | (topo_ids.pkg_id << (shift + 1)) |
> -               topo_ids.node_id;
> +        dies -= 1;
> +        shift = find_last_bit(&dies, 8);
> +        *ecx = (dies << 8) | (topo_ids.pkg_id << (shift + 1)) |
> +               topo_ids.die_id;
>      }
>      *edx = 0;
>  }
> @@ -5489,7 +5487,6 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>      uint32_t signature[3];
>      X86CPUTopoInfo topo_info;
>  
> -    topo_info.nodes_per_pkg = env->nr_nodes;
>      topo_info.dies_per_pkg = env->nr_dies;
>      topo_info.cores_per_die = cs->nr_cores;
>      topo_info.threads_per_core = cs->nr_threads;
> @@ -6949,7 +6946,6 @@ static void x86_cpu_initfn(Object *obj)
>      FeatureWord w;
>  
>      env->nr_dies = 1;
> -    env->nr_nodes = 1;
>      cpu_set_cpustate_pointers(cpu);
>  
>      object_property_add(obj, "family", "int",
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index e1a5c174dc..4c89bee8d1 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1629,7 +1629,6 @@ typedef struct CPUX86State {
>      TPRAccess tpr_access_type;
>  
>      unsigned nr_dies;
> -    unsigned nr_nodes;
>      unsigned pkg_offset;
>  } CPUX86State;
>  
> diff --git a/tests/test-x86-cpuid.c b/tests/test-x86-cpuid.c
> index 049030a50e..bfabc0403a 100644
> --- a/tests/test-x86-cpuid.c
> +++ b/tests/test-x86-cpuid.c
> @@ -31,12 +31,12 @@ static void test_topo_bits(void)
>      X86CPUTopoInfo topo_info = {0};
>  
>      /* simple tests for 1 thread per core, 1 core per die, 1 die per package */
> -    topo_info = (X86CPUTopoInfo) {0, 1, 1, 1};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1};
>      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 0);
>      g_assert_cmpuint(apicid_core_width(&topo_info), ==, 0);
>      g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
>  
> -    topo_info = (X86CPUTopoInfo) {0, 1, 1, 1};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1};
>      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
>      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
>      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
> @@ -45,39 +45,39 @@ static void test_topo_bits(void)
>  
>      /* Test field width calculation for multiple values
>       */
> -    topo_info = (X86CPUTopoInfo) {0, 1, 1, 2};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 2};
>      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 1);
> -    topo_info = (X86CPUTopoInfo) {0, 1, 1, 3};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 3};
>      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
> -    topo_info = (X86CPUTopoInfo) {0, 1, 1, 4};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 4};
>      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
>  
> -    topo_info = (X86CPUTopoInfo) {0, 1, 1, 14};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 14};
>      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
> -    topo_info = (X86CPUTopoInfo) {0, 1, 1, 15};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 15};
>      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
> -    topo_info = (X86CPUTopoInfo) {0, 1, 1, 16};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 16};
>      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
> -    topo_info = (X86CPUTopoInfo) {0, 1, 1, 17};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 17};
>      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 5);
>  
>  
> -    topo_info = (X86CPUTopoInfo) {0, 1, 30, 2};
> +    topo_info = (X86CPUTopoInfo) {1, 30, 2};
>      g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
> -    topo_info = (X86CPUTopoInfo) {0, 1, 31, 2};
> +    topo_info = (X86CPUTopoInfo) {1, 31, 2};
>      g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
> -    topo_info = (X86CPUTopoInfo) {0, 1, 32, 2};
> +    topo_info = (X86CPUTopoInfo) {1, 32, 2};
>      g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
> -    topo_info = (X86CPUTopoInfo) {0, 1, 33, 2};
> +    topo_info = (X86CPUTopoInfo) {1, 33, 2};
>      g_assert_cmpuint(apicid_core_width(&topo_info), ==, 6);
>  
> -    topo_info = (X86CPUTopoInfo) {0, 1, 30, 2};
> +    topo_info = (X86CPUTopoInfo) {1, 30, 2};
>      g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
> -    topo_info = (X86CPUTopoInfo) {0, 2, 30, 2};
> +    topo_info = (X86CPUTopoInfo) {2, 30, 2};
>      g_assert_cmpuint(apicid_die_width(&topo_info), ==, 1);
> -    topo_info = (X86CPUTopoInfo) {0, 3, 30, 2};
> +    topo_info = (X86CPUTopoInfo) {3, 30, 2};
>      g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
> -    topo_info = (X86CPUTopoInfo) {0, 4, 30, 2};
> +    topo_info = (X86CPUTopoInfo) {4, 30, 2};
>      g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
>  
>      /* build a weird topology and see if IDs are calculated correctly
> @@ -85,18 +85,18 @@ static void test_topo_bits(void)
>  
>      /* This will use 2 bits for thread ID and 3 bits for core ID
>       */
> -    topo_info = (X86CPUTopoInfo) {0, 1, 6, 3};
> +    topo_info = (X86CPUTopoInfo) {1, 6, 3};
>      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
>      g_assert_cmpuint(apicid_core_offset(&topo_info), ==, 2);
>      g_assert_cmpuint(apicid_die_offset(&topo_info), ==, 5);
>      g_assert_cmpuint(apicid_pkg_offset(&topo_info), ==, 5);
>  
> -    topo_info = (X86CPUTopoInfo) {0, 1, 6, 3};
> +    topo_info = (X86CPUTopoInfo) {1, 6, 3};
>      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
>      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
>      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
>  
> -    topo_info = (X86CPUTopoInfo) {0, 1, 6, 3};
> +    topo_info = (X86CPUTopoInfo) {1, 6, 3};
>      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 0), ==,
>                       (1 << 2) | 0);
>      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 1), ==,
> 
> 
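
The width checks in the quoted test can be reproduced outside QEMU. Below is a minimal standalone sketch of the generic (non-EPYC) encoding using the three-field topology this patch leaves behind. The type and function names (TopoInfo, TopoIDs, bitwidth_for_count and the helpers built on it) are stand-ins invented for illustration, not the QEMU definitions. The offset and index arithmetic follows the quoted diff; the bit-width helper is an assumption about how a count maps to a field width, chosen to agree with the expectations in the quoted test.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for X86CPUTopoInfo once nodes_per_pkg is removed. */
typedef struct TopoInfo {
    unsigned dies_per_pkg;
    unsigned cores_per_die;
    unsigned threads_per_core;
} TopoInfo;

/* Hypothetical stand-in for X86CPUTopoIDs once node_id is removed. */
typedef struct TopoIDs {
    unsigned pkg_id;
    unsigned die_id;
    unsigned core_id;
    unsigned smt_id;
} TopoIDs;

/* Number of bits needed to represent the IDs 0 .. count-1. */
static unsigned bitwidth_for_count(unsigned count)
{
    unsigned width = 0;

    assert(count >= 1);
    count -= 1;
    while (count) {
        width++;
        count >>= 1;
    }
    return width;
}

static unsigned smt_width(const TopoInfo *t)
{
    return bitwidth_for_count(t->threads_per_core);
}

static unsigned core_width(const TopoInfo *t)
{
    return bitwidth_for_count(t->cores_per_die);
}

static unsigned die_width(const TopoInfo *t)
{
    return bitwidth_for_count(t->dies_per_pkg);
}

/* Fields are packed from the low bits up: smt, core, die, pkg. */
static unsigned core_offset(const TopoInfo *t)
{
    return smt_width(t);
}

static unsigned die_offset(const TopoInfo *t)
{
    return core_offset(t) + core_width(t);
}

static unsigned pkg_offset(const TopoInfo *t)
{
    return die_offset(t) + die_width(t);
}

/* Split a linear cpu index into per-level IDs. */
static void topo_ids_from_idx(const TopoInfo *t, unsigned cpu_index,
                              TopoIDs *ids)
{
    unsigned nr_dies = t->dies_per_pkg;
    unsigned nr_cores = t->cores_per_die;
    unsigned nr_threads = t->threads_per_core;

    ids->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
    ids->die_id = cpu_index / (nr_cores * nr_threads) % nr_dies;
    ids->core_id = cpu_index / nr_threads % nr_cores;
    ids->smt_id = cpu_index % nr_threads;
}

/* Pack the per-level IDs into an APIC ID. */
static uint32_t apicid_from_topo_ids(const TopoInfo *t, const TopoIDs *ids)
{
    return (ids->pkg_id << pkg_offset(t)) |
           (ids->die_id << die_offset(t)) |
           (ids->core_id << core_offset(t)) |
           ids->smt_id;
}

static uint32_t apicid_from_cpu_idx(const TopoInfo *t, unsigned cpu_index)
{
    TopoIDs ids;

    topo_ids_from_idx(t, cpu_index, &ids);
    return apicid_from_topo_ids(t, &ids);
}
```

With {1, 6, 3} (1 die, 6 cores, 3 threads) this gives a 2-bit SMT field and a 3-bit core field, so cpu index 1 * 3 + 0 maps to APIC ID (1 << 2) | 0, in line with the assertions quoted above.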




* Re: [PATCH v5 8/8] i386: Simplify CPUID_8000_001E for AMD
  2020-08-21 22:13 ` [PATCH v5 8/8] i386: Simplify CPUID_8000_001E for AMD Babu Moger
@ 2020-08-26 12:19   ` Igor Mammedov
  0 siblings, 0 replies; 51+ messages in thread
From: Igor Mammedov @ 2020-08-26 12:19 UTC (permalink / raw)
  To: Babu Moger; +Cc: qemu-devel, pbonzini, mst, ehabkost, rth

On Fri, 21 Aug 2020 17:13:09 -0500
Babu Moger <babu.moger@amd.com> wrote:

> apic_id contains all the information required to build
> CPUID_8000_001E. core_id and node_id is already part of
> apic_id generated by x86_topo_ids_from_apicid_epyc.
> Also remove the restriction on number bits on core_id and
> node_id.
> 
> Remove all the hardcoded values and replace with generalized
> fields.
> 
> Refer the Processor Programming Reference (PPR) documentation
> available from the bugzilla Link below.
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
>  target/i386/cpu.c |   81 ++++++++++++++++++++++++-----------------------------
>  1 file changed, 37 insertions(+), 44 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index b29686220e..bea2822923 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -385,58 +385,51 @@ static void encode_topo_cpuid8000001e(X86CPUTopoInfo *topo_info, X86CPU *cpu,
>                                         uint32_t *ecx, uint32_t *edx)
>  {
>      X86CPUTopoIDs topo_ids = {0};
> -    unsigned long dies = topo_info->dies_per_pkg;
> -    int shift;
>  
>      x86_topo_ids_from_apicid(cpu->apic_id, topo_info, &topo_ids);
>  
>      *eax = cpu->apic_id;
> +
>      /*
> -     * CPUID_Fn8000001E_EBX
> -     * 31:16 Reserved
> -     * 15:8  Threads per core (The number of threads per core is
> -     *       Threads per core + 1)
> -     *  7:0  Core id (see bit decoding below)
> -     *       SMT:
> -     *           4:3 node id
> -     *             2 Core complex id
> -     *           1:0 Core id
> -     *       Non SMT:
> -     *           5:4 node id
> -     *             3 Core complex id
> -     *           1:0 Core id
> +     * CPUID_Fn8000001E_EBX [Core Identifiers] (CoreId)
> +     * Read-only. Reset: 0000_XXXXh.
> +     * See Core::X86::Cpuid::ExtApicId.
> +     * Core::X86::Cpuid::CoreId_lthree[1:0]_core[3:0]_thread[1:0];
> +     * Bits Description
> +     * 31:16 Reserved.
> +     * 15:8 ThreadsPerCore: threads per core. Read-only. Reset: XXh.
> +     *      The number of threads per core is ThreadsPerCore+1.
> +     *  7:0 CoreId: core ID. Read-only. Reset: XXh.
> +     *
> +     *  NOTE: CoreId is already part of apic_id. Just use it. We can
> +     *  use all the 8 bits to represent the core_id here.
>       */
> -    *ebx = ((topo_info->threads_per_core - 1) << 8) | (topo_ids.die_id << 3) |
> -            (topo_ids.core_id);
> +    *ebx = ((topo_info->threads_per_core - 1) << 8) | (topo_ids.core_id & 0xFF);
> +
>      /*
> -     * CPUID_Fn8000001E_ECX
> -     * 31:11 Reserved
> -     * 10:8  Nodes per processor (Nodes per processor is number of nodes + 1)
> -     *  7:0  Node id (see bit decoding below)
> -     *         2  Socket id
> -     *       1:0  Node id
> +     * CPUID_Fn8000001E_ECX [Node Identifiers] (NodeId)
> +     * Read-only. Reset: 0000_0XXXh.
> +     * Core::X86::Cpuid::NodeId_lthree[1:0]_core[3:0]_thread[1:0];
> +     * Bits Description
> +     * 31:11 Reserved.
> +     * 10:8 NodesPerProcessor: Node per processor. Read-only. Reset: XXXb.
> +     *      ValidValues:
> +     *      Value Description
> +     *      000b  1 node per processor.
> +     *      001b  2 nodes per processor.
> +     *      010b Reserved.
> +     *      011b 4 nodes per processor.
> +     *      111b-100b Reserved.
> +     *  7:0 NodeId: Node ID. Read-only. Reset: XXh.
> +     *
> +     * NOTE: Hardware reserves 3 bits for number of nodes per processor.
> +     * But users can create more nodes than the actual hardware can
> +     * support. To genaralize we can use all the upper 8 bits for nodes.
> +     * NodeId is combination of node and socket_id which is already decoded
> +     * in apic_id. Just use it by shifting.
>       */
> -    if (dies <= 4) {
> -        *ecx = ((dies - 1) << 8) | (topo_ids.pkg_id << 2) | topo_ids.die_id;
> -    } else {
> -        /*
> -         * Node id fix up. Actual hardware supports up to 4 nodes. But with
> -         * more than 32 cores, we may end up with more than 4 nodes.
> -         * Node id is a combination of socket id and node id. Only requirement
> -         * here is that this number should be unique accross the system.
> -         * Shift the socket id to accommodate more nodes. We dont expect both
> -         * socket id and node id to be big number at the same time. This is not
> -         * an ideal config but we need to to support it. Max nodes we can have
> -         * is 32 (255/8) with 8 cores per node and 255 max cores. We only need
> -         * 5 bits for nodes. Find the left most set bit to represent the total
> -         * number of nodes. find_last_bit returns last set bit(0 based). Left
> -         * shift(+1) the socket id to represent all the nodes.
> -         */
> -        dies -= 1;
> -        shift = find_last_bit(&dies, 8);
> -        *ecx = (dies << 8) | (topo_ids.pkg_id << (shift + 1)) |
> -               topo_ids.die_id;
> -    }
> +    *ecx = ((topo_info->dies_per_pkg - 1) << 8) |
> +           ((cpu->apic_id >> apicid_die_offset(topo_info)) & 0xFF);

I'd prefer the approach used in "[PATCH v4 1/3] i386: Simplify CPUID_8000_001E for AMD";
that way the numa node id in this leaf will always be consistent with the -numa CLI.
 


>      *edx = 0;
>  }
>  
> 
> 
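
To make the review comment concrete, here is a small standalone sketch of what the quoted v5 hunk computes for EBX and ECX. It is not the QEMU code: TopoInfo, bitwidth_for_count and the two encode_* functions are hypothetical names, and the field offsets assume the generic smt/core/die/pkg packing of the APIC ID.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for X86CPUTopoInfo. */
typedef struct TopoInfo {
    unsigned dies_per_pkg;
    unsigned cores_per_die;
    unsigned threads_per_core;
} TopoInfo;

/* Number of bits needed to represent the IDs 0 .. count-1. */
static unsigned bitwidth_for_count(unsigned count)
{
    unsigned width = 0;

    assert(count >= 1);
    count -= 1;
    while (count) {
        width++;
        count >>= 1;
    }
    return width;
}

/*
 * EBX: bits 15:8 hold ThreadsPerCore - 1, bits 7:0 hold CoreId.
 * The core id is recovered from the APIC ID (assumes a core field
 * narrower than 32 bits).
 */
static uint32_t encode_8000001e_ebx(const TopoInfo *t, uint32_t apic_id)
{
    unsigned core_off = bitwidth_for_count(t->threads_per_core);
    unsigned core_w = bitwidth_for_count(t->cores_per_die);
    uint32_t core_id = (apic_id >> core_off) & ((1u << core_w) - 1);

    return ((t->threads_per_core - 1) << 8) | (core_id & 0xFF);
}

/*
 * ECX: bits 10:8 hold NodesPerProcessor - 1 (here: dies - 1),
 * bits 7:0 hold NodeId, taken as everything above the core field
 * of the APIC ID, i.e. pkg_id and die_id combined.
 */
static uint32_t encode_8000001e_ecx(const TopoInfo *t, uint32_t apic_id)
{
    unsigned die_off = bitwidth_for_count(t->threads_per_core) +
                       bitwidth_for_count(t->cores_per_die);

    return ((t->dies_per_pkg - 1) << 8) | ((apic_id >> die_off) & 0xFF);
}
```

For 2 dies, 4 cores and 2 threads, the CPU at pkg 1, die 1, core 2, thread 1 has APIC ID 29, which yields EBX 0x102 and ECX 0x103 (NodeId 3). In this sketch NodeId is derived purely from the APIC ID, so it agrees with -numa only when each die happens to be assigned to the node with the same number, which appears to be the inconsistency the comment above is pointing at.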




* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-21 22:12 [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode Babu Moger
                   ` (8 preceding siblings ...)
  2020-08-24 18:41 ` [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode Dr. David Alan Gilbert
@ 2020-08-26 12:38 ` Igor Mammedov
  2020-08-26 12:50   ` Daniel P. Berrangé
  2020-08-26 14:04   ` Eduardo Habkost
  9 siblings, 2 replies; 51+ messages in thread
From: Igor Mammedov @ 2020-08-26 12:38 UTC (permalink / raw)
  To: Babu Moger
  Cc: Daniel P. Berrangé,
	ehabkost, mst, Michal Privoznik, qemu-devel, pbonzini, rth

On Fri, 21 Aug 2020 17:12:19 -0500
Babu Moger <babu.moger@amd.com> wrote:

> To support some of the complex topology, we introduced EPYC mode apicid decode.
> But, EPYC mode decode is running into problems. Also it can become quite a
> maintenance problem in the future. So, it was decided to remove that code and
> use the generic decode which works for majority of the topology. Most of the
> SPECed configuration would work just fine. With some non-SPECed user inputs,
> it will create some sub-optimal configuration.
> Here is the discussion thread.
> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> 
> This series removes all the EPYC mode specific apicid changes and use the generic
> apicid decode.

the main difference between EPYC and all other CPUs is that
it requires numa configuration (it's not optional),
so we need an extra patch on top of this series to enforce that, i.e.:

 if (epyc && !numa) 
    error("EPYC cpu requires numa to be configured")

I think there was a patch in previous revisions that aimed for this.
The simplest form would be the snippet above.

A more complex one would be moving auto_enable_numa from MachineClass to
MachineState so we can change it at runtime if EPYC is used. That should
take care of the use case where the user hasn't provided -numa.


Eduardo,
 is there any way to tell management that a particular CPU type requires
 -numa ?

> ---
> v5:
>  Revert EPYC specific decode.
>  Simplify CPUID_8000_001E
> 
> v4:
>   https://lore.kernel.org/qemu-devel/159744083536.39197.13827776633866601278.stgit@naples-babu.amd.com/
>   Not much of a change. Just added few text changes.
>   Error out configuration instead of warning if dies are not configured in EPYC.
>   Few other text changes to clarify the removal of node_id, nr_nodes and nodes_per_pkg.
> 
> v3:
>   https://lore.kernel.org/qemu-devel/159681772267.9679.1334429994189974662.stgit@naples-babu.amd.com/#r
>   Added a new check to pass the dies for EPYC numa configuration.
>   Added Simplify CPUID_8000_001E patch with some changes suggested by Igor.
>   Dropped the patch to build the topology from CpuInstanceProperties.
>   TODO: Not sure if we still need the Autonuma changes Igor mentioned.
>   Needs more clarity on that.
> 
> v2:
>   https://lore.kernel.org/qemu-devel/159362436285.36204.986406297373871949.stgit@naples-babu.amd.com/
>   Used the numa information from CpuInstanceProperties for building
>   the apic_id suggested by Igor.
>   Also did some minor code re-aarangement to take care of changes.
>   Dropped the patch "Simplify CPUID_8000_001E" from v1. Will send
>   it later.
> 
> v1:
>  https://lore.kernel.org/qemu-devel/159164739269.20543.3074052993891532749.stgit@naples-babu.amd.com
> 
> Babu Moger (8):
>       hw/i386: Remove node_id, nr_nodes and nodes_per_pkg from topology
>       Revert "i386: Fix pkg_id offset for EPYC cpu models"
>       Revert "target/i386: Enable new apic id encoding for EPYC based cpus models"
>       Revert "hw/i386: Move arch_id decode inside x86_cpus_init"
>       Revert "i386: Introduce use_epyc_apic_id_encoding in X86CPUDefinition"
>       Revert "hw/i386: Introduce apicid functions inside X86MachineState"
>       Revert "hw/386: Add EPYC mode topology decoding functions"
>       i386: Simplify CPUID_8000_001E for AMD
> 
> 
>  hw/i386/pc.c               |    8 +--
>  hw/i386/x86.c              |   43 +++-------------
>  include/hw/i386/topology.h |  101 ---------------------------------------
>  include/hw/i386/x86.h      |    9 ---
>  target/i386/cpu.c          |  115 ++++++++++++++++----------------------------
>  target/i386/cpu.h          |    3 -
>  tests/test-x86-cpuid.c     |   40 ++++++++-------
>  7 files changed, 73 insertions(+), 246 deletions(-)
> 
> --
> Signature
> 




* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-25 15:25         ` Dr. David Alan Gilbert
@ 2020-08-26 12:43           ` Igor Mammedov
  2020-08-26 14:10             ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 51+ messages in thread
From: Igor Mammedov @ 2020-08-26 12:43 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: ehabkost, mst, qemu-devel, Babu Moger, pbonzini, rth

On Tue, 25 Aug 2020 16:25:21 +0100
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:

> * Igor Mammedov (imammedo@redhat.com) wrote:
> > On Tue, 25 Aug 2020 09:15:04 +0100
> > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> >   
> > > * Babu Moger (babu.moger@amd.com) wrote:  
> > > > Hi Dave,
> > > > 
> > > > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:    
> > > > > * Babu Moger (babu.moger@amd.com) wrote:    
> > > > >> To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > >> But, EPYC mode decode is running into problems. Also it can become quite a
> > > > >> maintenance problem in the future. So, it was decided to remove that code and
> > > > >> use the generic decode which works for majority of the topology. Most of the
> > > > >> SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > >> it will create some sub-optimal configuration.
> > > > >> Here is the discussion thread.
> > > > >> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > >>
> > > > >> This series removes all the EPYC mode specific apicid changes and use the generic
> > > > >> apicid decode.    
> > > > > 
> > > > > Hi Babu,
> > > > >   This does simplify things a lot!
> > > > > One worry, what happens about a live migration of a VM from an old qemu
> > > > > that was using the node-id to a qemu with this new scheme?    
> > > > 
> > > > The node_id which we introduced was only used internally. This wasn't
> > > > exposed outside. I don't think live migration will be an issue.    
> > > 
> > > Didn't it become part of the APIC ID visible to the guest?  
> > 
> > Daniel asked similar question wrt hard error on start up,
> > when CLI is not sufficient to create EPYC cpu.
> > 
> > https://www.mail-archive.com/qemu-devel@nongnu.org/msg728536.html
> > 
> > Migration might fall into the same category.
> > Also looking at the history, 5.0 commit 
> >   247b18c593ec29 target/i386: Enable new apic id encoding for EPYC based cpus models
> > silently broke APIC ID (without versioning) for all EPYC models (there were 1 new and 1 old one).
> > 
> > (I'm not aware of somebody complaining about it)
> > 
> > Another commit ed78467a21459, changed CPUID_8000_001E without versioning as well.
> > 
> > 
> > With the current EPYC apicid code, if all stars align (no numa or only 1 numa node on
> > CLI and no -smp dies=) it might produce a valid CPU (apicid+CPUID_8000_001E).
> > No numa is a gray area, since the EPYC spec implies that it has to be a numa machine in case of real EPYC cpus.
> > Multi-node configs would be correct only if the user assigns cpus to numa nodes
> > by duplicating the internal node_id algorithm that this series removes.
> > 
> > There might be other broken cases that I don't recall anymore
> > (should be mentioned in previous versions of this series)
> > 
> > 
> > To summarize from migration pov (ignoring ed78467a21459 change):
> > 
> >  1) old qemu pre-5.0 ==>  qemu 5.0, 5.1 - broken migration  
> 
> Oh ....
> 
> >  2) with this series (let's call it qemu 5.2)
> >      pre-5.0 ==> qemu 5.2 - should work, as the series basically rolls the current code back to pre-5.0
> >      qemu 5.0, 5.1 ==> qemu 5.2 - broken
> > 
> > It's all about picking which poison to choose,
> > I'd prefer the 2nd case as it lets us drop a lot of complicated code that
> > doesn't work as expected.
> 
> I think that would make our lives easier for other reasons; so I'm happy
> to go with that.

to make things less painful for users, I wonder if there is a way
to block migration if EPYC and specific QEMU versions are used?

> > PS:
> >  I didn't review it yet, but with this series we aren't
> >  making up internal node_ids that should match user provided numa node ids somehow.
> >  It seems series lost the patch that would enforce numa in case -smp dies>1,
> >  but otherwise it heads in the right direction.  
> 
> Dave
> 
> > > 
> > > Dave
> > >   
> >   




* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-26 12:38 ` Igor Mammedov
@ 2020-08-26 12:50   ` Daniel P. Berrangé
  2020-08-26 13:30     ` Igor Mammedov
  2020-08-26 14:04   ` Eduardo Habkost
  1 sibling, 1 reply; 51+ messages in thread
From: Daniel P. Berrangé @ 2020-08-26 12:50 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: ehabkost, mst, Michal Privoznik, qemu-devel, Babu Moger, pbonzini, rth

On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> On Fri, 21 Aug 2020 17:12:19 -0500
> Babu Moger <babu.moger@amd.com> wrote:
> 
> > To support some of the complex topology, we introduced EPYC mode apicid decode.
> > But, EPYC mode decode is running into problems. Also it can become quite a
> > maintenance problem in the future. So, it was decided to remove that code and
> > use the generic decode which works for majority of the topology. Most of the
> > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > it will create some sub-optimal configuration.
> > Here is the discussion thread.
> > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > 
> > This series removes all the EPYC mode specific apicid changes and use the generic
> > apicid decode.
> 
> the main difference between EPYC and all other CPUs is that
> it requires numa configuration (it's not optional)
> so we need an extra patch on top of this series to enforce that, i.e:
> 
>  if (epyc && !numa) 
>     error("EPYC cpu requires numa to be configured")

Please no. This will break 90% of current usage of the EPYC CPU in
real world QEMU deployments. That is way too user hostile to introduce
as a requirement.

Why do we need to force this?  People have been successfully using
EPYC CPUs without NUMA in QEMU for years now.

It might not match the behaviour of bare metal silicon, but that obviously
hasn't caused the world to come crashing down.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-26 12:50   ` Daniel P. Berrangé
@ 2020-08-26 13:30     ` Igor Mammedov
  2020-08-26 13:36       ` Daniel P. Berrangé
  2020-08-26 17:17       ` Babu Moger
  0 siblings, 2 replies; 51+ messages in thread
From: Igor Mammedov @ 2020-08-26 13:30 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: ehabkost, mst, Michal Privoznik, qemu-devel, Babu Moger, pbonzini, rth

On Wed, 26 Aug 2020 13:50:59 +0100
Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> > On Fri, 21 Aug 2020 17:12:19 -0500
> > Babu Moger <babu.moger@amd.com> wrote:
> >   
> > > To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > But, EPYC mode decode is running into problems. Also it can become quite a
> > > maintenance problem in the future. So, it was decided to remove that code and
> > > use the generic decode which works for majority of the topology. Most of the
> > > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > it will create some sub-optimal configuration.
> > > Here is the discussion thread.
> > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > 
> > > This series removes all the EPYC mode specific apicid changes and use the generic
> > > apicid decode.  
> > 
> > the main difference between EPYC and all other CPUs is that
> > it requires numa configuration (it's not optional)
> > so we need an extra patch on top of this series to enforce that, i.e:
> > 
> >  if (epyc && !numa) 
> >     error("EPYC cpu requires numa to be configured")  
> 
> Please no. This will break 90% of current usage of the EPYC CPU in
> real world QEMU deployments. That is way too user hostile to introduce
> as a requirement.
> 
> Why do we need to force this ?  People have been successfully using
> EPYC CPUs without NUMA in QEMU for years now.
> 
> It might not match behaviour of bare metal silicon, but that hasn't
> obviously caused the world to come crashing down.
So far it produces a warning in the Linux kernel (RHBZ1728166) and the
resulting performance might be suboptimal, but I haven't seen
anyone reporting crashes yet.


What other options do we have?
Perhaps we can turn on a strict check for new machine types only,
so old configs can keep the broken topology (CPUID),
while new ones would require -numa and produce a correct topology.


> 
> Regards,
> Daniel



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-26 13:30     ` Igor Mammedov
@ 2020-08-26 13:36       ` Daniel P. Berrangé
  2020-08-26 14:02         ` Igor Mammedov
  2020-08-26 17:17       ` Babu Moger
  1 sibling, 1 reply; 51+ messages in thread
From: Daniel P. Berrangé @ 2020-08-26 13:36 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: ehabkost, mst, Michal Privoznik, qemu-devel, Babu Moger, pbonzini, rth

On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> On Wed, 26 Aug 2020 13:50:59 +0100
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
> > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > Babu Moger <babu.moger@amd.com> wrote:
> > >   
> > > > To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > But, EPYC mode decode is running into problems. Also it can become quite a
> > > > maintenance problem in the future. So, it was decided to remove that code and
> > > > use the generic decode which works for majority of the topology. Most of the
> > > > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > it will create some sub-optimal configuration.
> > > > Here is the discussion thread.
> > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > 
> > > > This series removes all the EPYC mode specific apicid changes and use the generic
> > > > apicid decode.  
> > > 
> > > the main difference between EPYC and all other CPUs is that
> > > it requires numa configuration (it's not optional)
> > > so we need an extra patch on top of this series to enforce that, i.e:
> > > 
> > >  if (epyc && !numa) 
> > >     error("EPYC cpu requires numa to be configured")  
> > 
> > Please no. This will break 90% of current usage of the EPYC CPU in
> > real world QEMU deployments. That is way too user hostile to introduce
> > as a requirement.
> > 
> > Why do we need to force this ?  People have been successfully using
> > EPYC CPUs without NUMA in QEMU for years now.
> > 
> > It might not match behaviour of bare metal silicon, but that hasn't
> > obviously caused the world to come crashing down.
> So far it produces warning in linux kernel (RHBZ1728166),
> (resulting performance might be suboptimal), but I haven't seen
> anyone reporting crashes yet.
> 
> 
> What other options do we have?
> Perhaps we can turn on strict check for new machine types only,
> so old configs can keep broken topology (CPUID),
> while new ones would require -numa and produce correct topology.

No, tying this to machine types is not viable either. That is still
going to break essentially every single management application that
exists today using QEMU.

Breaking existing apps is not acceptable for something that is
merely reporting sub-optimal performance. That's simply a documentation
task to highlight best practice to app developers.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-26 13:36       ` Daniel P. Berrangé
@ 2020-08-26 14:02         ` Igor Mammedov
  2020-08-26 15:03           ` Daniel P. Berrangé
  0 siblings, 1 reply; 51+ messages in thread
From: Igor Mammedov @ 2020-08-26 14:02 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: ehabkost, mst, Michal Privoznik, qemu-devel, Babu Moger, pbonzini, rth

On Wed, 26 Aug 2020 14:36:38 +0100
Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > On Wed, 26 Aug 2020 13:50:59 +0100
> > Daniel P. Berrangé <berrange@redhat.com> wrote:
> >   
> > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > Babu Moger <babu.moger@amd.com> wrote:
> > > >     
> > > > > To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > > But, EPYC mode decode is running into problems. Also it can become quite a
> > > > > maintenance problem in the future. So, it was decided to remove that code and
> > > > > use the generic decode which works for majority of the topology. Most of the
> > > > > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > > it will create some sub-optimal configuration.
> > > > > Here is the discussion thread.
> > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > 
> > > > > This series removes all the EPYC mode specific apicid changes and use the generic
> > > > > apicid decode.    
> > > > 
> > > > the main difference between EPYC and all other CPUs is that
> > > > it requires numa configuration (it's not optional)
> > > > so we need an extra patch on top of this series to enforce that, i.e:
> > > > 
> > > >  if (epyc && !numa) 
> > > >     error("EPYC cpu requires numa to be configured")    
> > > 
> > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > real world QEMU deployments. That is way too user hostile to introduce
> > > as a requirement.
> > > 
> > > Why do we need to force this ?  People have been successfully using
> > > EPYC CPUs without NUMA in QEMU for years now.
> > > 
> > > It might not match behaviour of bare metal silicon, but that hasn't
> > > obviously caused the world to come crashing down.  
> > So far it produces warning in linux kernel (RHBZ1728166),
> > (resulting performance might be suboptimal), but I haven't seen
> > anyone reporting crashes yet.
> > 
> > 
> > What other options do we have?
> > Perhaps we can turn on strict check for new machine types only,
> > so old configs can keep broken topology (CPUID),
> > while new ones would require -numa and produce correct topology.  
> 
> No, tying this to machine types is not viable either. That is still
> going to break essentially every single management application that
> exists today using QEMU.
For that we have the deprecation process, so users could switch to the
new CLI that would be required.


> Breaking existing apps is not acceptable for something that is
> merely reporting sub-optimal performance. That's simply a documentation
> task to highlight best practice to app developers.
> 
> Regards,
> Daniel



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-26 12:38 ` Igor Mammedov
  2020-08-26 12:50   ` Daniel P. Berrangé
@ 2020-08-26 14:04   ` Eduardo Habkost
  1 sibling, 0 replies; 51+ messages in thread
From: Eduardo Habkost @ 2020-08-26 14:04 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Daniel P. Berrangé,
	mst, Michal Privoznik, qemu-devel, Babu Moger, pbonzini, rth

On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> On Fri, 21 Aug 2020 17:12:19 -0500
> Babu Moger <babu.moger@amd.com> wrote:
> 
> > To support some of the complex topology, we introduced EPYC mode apicid decode.
> > But, EPYC mode decode is running into problems. Also it can become quite a
> > maintenance problem in the future. So, it was decided to remove that code and
> > use the generic decode which works for majority of the topology. Most of the
> > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > it will create some sub-optimal configuration.
> > Here is the discussion thread.
> > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > 
> > This series removes all the EPYC mode specific apicid changes and use the generic
> > apicid decode.
> 
> the main difference between EPYC and all other CPUs is that
> it requires numa configuration (it's not optional)
> so we need an extra patch on top of this series to enforce that, i.e:
> 
>  if (epyc && !numa) 
>     error("EPYC cpu requires numa to be configured")
> 
> I think there was a patch in previous revisions that aimed for this.
> Simplest form would be above snippet.
> 
> More complex one, would be moving auto_enable_numa from MachineClass to
> MachineState so we can change it at runtime if EPYC is used. That should
> take care of use case where user hasn't provided -numa.

This sounds like a good solution.  It actually sounds simpler
than the alternatives (which just move the complexity to other
components).

We can keep MachineClass::auto_enable_numa as is, and just use it
to initialize the default value of MachineState::auto_enable_numa.
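A minimal sketch of that idea, assuming cut-down stand-ins for the real
structures: SketchMachineClass, SketchMachineState and the three helper
functions below are hypothetical names for illustration, not QEMU code.
The class field only seeds the per-instance field, which can then be
flipped at runtime once the CPU model is known.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for MachineClass: holds the per-type default. */
typedef struct SketchMachineClass {
    bool auto_enable_numa;
} SketchMachineClass;

/* Hypothetical stand-in for MachineState: holds the per-instance value. */
typedef struct SketchMachineState {
    bool auto_enable_numa;   /* copied from the class, may change later */
    int num_numa_nodes;      /* nodes requested with -numa, 0 if none */
} SketchMachineState;

/* The class default only initializes the instance field. */
static void sketch_machine_init(SketchMachineState *ms,
                                const SketchMachineClass *mc,
                                int user_numa_nodes)
{
    assert(ms != NULL && mc != NULL);
    ms->auto_enable_numa = mc->auto_enable_numa;
    ms->num_numa_nodes = user_numa_nodes;
}

/* Runtime override: an EPYC CPU model turns the instance flag on. */
static void sketch_select_cpu(SketchMachineState *ms, bool cpu_is_epyc)
{
    if (cpu_is_epyc) {
        ms->auto_enable_numa = true;
    }
}

/* If the user gave no -numa and the flag is set, create one implicit node. */
static void sketch_numa_complete(SketchMachineState *ms)
{
    if (ms->num_numa_nodes == 0 && ms->auto_enable_numa) {
        ms->num_numa_nodes = 1;
    }
}
```

With this shape, a machine type whose class default is false still ends
up with one implicit node when an EPYC model is selected, and the class
default itself is left untouched.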

-- 
Eduardo



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-26 12:43           ` Igor Mammedov
@ 2020-08-26 14:10             ` Dr. David Alan Gilbert
  2020-08-27 21:19               ` Igor Mammedov
  0 siblings, 1 reply; 51+ messages in thread
From: Dr. David Alan Gilbert @ 2020-08-26 14:10 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: ehabkost, mst, qemu-devel, Babu Moger, pbonzini, rth

* Igor Mammedov (imammedo@redhat.com) wrote:
> On Tue, 25 Aug 2020 16:25:21 +0100
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> 
> > * Igor Mammedov (imammedo@redhat.com) wrote:
> > > On Tue, 25 Aug 2020 09:15:04 +0100
> > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > >   
> > > > * Babu Moger (babu.moger@amd.com) wrote:  
> > > > > Hi Dave,
> > > > > 
> > > > > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:    
> > > > > > * Babu Moger (babu.moger@amd.com) wrote:    
> > > > > >> To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > > >> But, EPYC mode decode is running into problems. Also it can become quite a
> > > > > >> maintenance problem in the future. So, it was decided to remove that code and
> > > > > >> use the generic decode which works for majority of the topology. Most of the
> > > > > >> SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > > >> it will create some sub-optimal configuration.
> > > > > >> Here is the discussion thread.
> > > > > >> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > >>
> > > > > >> This series removes all the EPYC mode specific apicid changes and use the generic
> > > > > >> apicid decode.    
> > > > > > 
> > > > > > Hi Babu,
> > > > > >   This does simplify things a lot!
> > > > > > One worry, what happens about a live migration of a VM from an old qemu
> > > > > > that was using the node-id to a qemu with this new scheme?    
> > > > > 
> > > > > The node_id which we introduced was only used internally. This wasn't
> > > > > exposed outside. I don't think live migration will be an issue.    
> > > > 
> > > > Didn't it become part of the APIC ID visible to the guest?  
> > > 
> > > Daniel asked similar question wrt hard error on start up,
> > > when CLI is not sufficient to create EPYC cpu.
> > > 
> > > https://www.mail-archive.com/qemu-devel@nongnu.org/msg728536.html
> > > 
> > > Migration might fall into the same category.
> > > Also looking at the history, 5.0 commit 
> > >   247b18c593ec29 target/i386: Enable new apic id encoding for EPYC based cpus models
> > > silently broke APIC ID (without versioning), for all EPYC models (that's were 1 new and 1 old one).
> > > 
> > > (I'm not aware of somebody complaining about it)
> > > 
> > > Another commit ed78467a21459, changed CPUID_8000_001E without versioning as well.
> > > 
> > > 
> > > With current EPYC apicid code, if all stars align (no numa or 1 numa node only on
> > > CLI and no -smp dies=) it might produce a valid CPU (apicid+CPUID_8000_001E).
> > > No numa is gray area, since EPYC spec implies that it has to be numa machine in case of real EPYC cpus.
> > > Multi-node configs would be correct only if user assigns cpus to numa nodes
> > > by duplicating internal node_id algorithm that this series removes.
> > > 
> > > There might be other broken cases that I don't recall anymore
> > > (should be mentioned in previous versions of this series)
> > > 
> > > 
> > > To summarize from migration pov (ignoring ed78467a21459 change):
> > > 
> > >  1) old qemu pre-5.0 ==>  qemu 5.0, 5.1 - broken migration  
> > 
> > Oh ....
> > 
> > >  2) with this series (lets call it qemu 5.2)
> > >      pre-5.0 ==> qemu 5.2 - should work as series basically rollbacks current code to pre-5.0
> > >      qemu 5.0, 5.1 ==> qemu 5.2 - broken
> > > 
> > > It's all about picking which poison to choose,
> > > I'd prefer the 2nd case as it lets us drop a lot of complicated code that
> > > doesn't work as expected.  
> > 
> > I think that would make our lives easier for other reasons; so I'm happy
> > to go with that.
> 
> to make things less painful for users, me wonders if there is a way
> to block migration if epyc and specific QEMU versions are used?

We have no way to block based on version - and that's a pretty painful
thing to do; we can block based on machine type.

But before we get there; can we understand in which combinations that
things break and why exactly - would it break on a 1 or 2 vCPU guest -
or would it only break when we get to the point the upper bits start
being used for example?  Why exactly would it break - i.e. is it going
to change the name of sections in the migration stream - or are the
values we need actually going to migrate OK?
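One way to explore the "which combinations" part is to compute the
APIC ID both ways for small topologies. The sketch below is an
assumption-laden model, not QEMU code: it follows the formulas visible
in the code that patch 1/8 removes (a node_id field placed at bit 3 or
higher when NUMA is configured) and the generic pkg/die/core/smt
layout. All Sketch* and sketch_* names are hypothetical.

```c
#include <assert.h>

/* Hypothetical, cut-down topology description in the 5.0/5.1 shape. */
typedef struct SketchTopo {
    unsigned nodes_per_pkg;    /* 0 when no -numa was given */
    unsigned dies_per_pkg;
    unsigned cores_per_die;
    unsigned threads_per_core;
} SketchTopo;

/* Bits needed to hold IDs 0 .. count-1. */
static unsigned sketch_bitwidth(unsigned count)
{
    unsigned bits = 0;

    assert(count >= 1);
    for (count -= 1; count; count >>= 1) {
        bits++;
    }
    return bits;
}

/* smt, core and die fields, which both layouts place identically. */
static unsigned sketch_low_fields(const SketchTopo *t, unsigned idx)
{
    unsigned smt_w = sketch_bitwidth(t->threads_per_core);
    unsigned core_w = sketch_bitwidth(t->cores_per_die);
    unsigned smt = idx % t->threads_per_core;
    unsigned core = idx / t->threads_per_core % t->cores_per_die;
    unsigned die = idx / (t->cores_per_die * t->threads_per_core)
                   % t->dies_per_pkg;

    return (die << (smt_w + core_w)) | (core << smt_w) | smt;
}

/* Combined width of the smt, core and die fields. */
static unsigned sketch_low_width(const SketchTopo *t)
{
    return sketch_bitwidth(t->threads_per_core) +
           sketch_bitwidth(t->cores_per_die) +
           sketch_bitwidth(t->dies_per_pkg);
}

static unsigned sketch_cpus_per_pkg(const SketchTopo *t)
{
    return t->dies_per_pkg * t->cores_per_die * t->threads_per_core;
}

/* Generic layout: pkg_id sits directly above the die field. */
static unsigned sketch_apicid_generic(const SketchTopo *t, unsigned idx)
{
    unsigned pkg = idx / sketch_cpus_per_pkg(t);

    return (pkg << sketch_low_width(t)) | sketch_low_fields(t, idx);
}

/* 5.0/5.1-style EPYC layout: node_id at bit >= 3 when NUMA is configured. */
static unsigned sketch_apicid_epyc(const SketchTopo *t, unsigned idx)
{
    unsigned nr_nodes = t->nodes_per_pkg ? t->nodes_per_pkg : 1;
    unsigned node_off = sketch_low_width(t);
    unsigned per_pkg = sketch_cpus_per_pkg(t);
    unsigned per_node = (per_pkg + nr_nodes - 1) / nr_nodes;
    unsigned node = idx / per_node % nr_nodes;
    unsigned pkg = idx / per_pkg;

    if (t->nodes_per_pkg && node_off < 3) {
        node_off = 3;
    }
    return (pkg << (node_off + sketch_bitwidth(nr_nodes))) |
           (node << node_off) | sketch_low_fields(t, idx);
}
```

Under this model a single-socket, two-core guest with one NUMA node
gets APIC ID 1 for cpu index 1 in both layouts, while a two-socket,
one-core-per-socket guest with NUMA configured gets 8 from the EPYC
formula and 1 from the generic one. So the difference appears as soon
as the package field is used, not only with large vCPU counts.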

Dave


> > > PS:
> > >  I didn't review it yet, but with this series we aren't
> > >  making up internal node_ids that should match user provided numa node ids somehow.
> > >  It seems series lost the patch that would enforce numa in case -smp dies>1,
> > >  but otherwise it heads in the right direction.  
> > 
> > Dave
> > 
> > > > 
> > > > Dave
> > > >   
> > >   
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-26 14:02         ` Igor Mammedov
@ 2020-08-26 15:03           ` Daniel P. Berrangé
  2020-08-26 15:18             ` Eduardo Habkost
  2020-08-27 17:03             ` Igor Mammedov
  0 siblings, 2 replies; 51+ messages in thread
From: Daniel P. Berrangé @ 2020-08-26 15:03 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: ehabkost, mst, Michal Privoznik, qemu-devel, Babu Moger, pbonzini, rth

On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> On Wed, 26 Aug 2020 14:36:38 +0100
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
> > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > >   
> > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > Babu Moger <babu.moger@amd.com> wrote:
> > > > >     
> > > > > > To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > > > But, EPYC mode decode is running into problems. Also it can become quite a
> > > > > > maintenance problem in the future. So, it was decided to remove that code and
> > > > > > use the generic decode which works for majority of the topology. Most of the
> > > > > > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > > > it will create some sub-optimal configuration.
> > > > > > Here is the discussion thread.
> > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > 
> > > > > > This series removes all the EPYC mode specific apicid changes and use the generic
> > > > > > apicid decode.    
> > > > > 
> > > > > the main difference between EPYC and all other CPUs is that
> > > > > it requires numa configuration (it's not optional)
> > > > > so we need an extra patch on top of this series to enforce that, i.e:
> > > > > 
> > > > >  if (epyc && !numa) 
> > > > >     error("EPYC cpu requires numa to be configured")    
> > > > 
> > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > real world QEMU deployments. That is way too user hostile to introduce
> > > > as a requirement.
> > > > 
> > > > Why do we need to force this ?  People have been successfully using
> > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > 
> > > > It might not match behaviour of bare metal silicon, but that hasn't
> > > > obviously caused the world to come crashing down.  
> > > So far it produces warning in linux kernel (RHBZ1728166),
> > > (resulting performance might be suboptimal), but I haven't seen
> > > anyone reporting crashes yet.
> > > 
> > > 
> > > What other options do we have?
> > > Perhaps we can turn on strict check for new machine types only,
> > > so old configs can keep broken topology (CPUID),
> > > while new ones would require -numa and produce correct topology.  
> > 
> > No, tying this to machine types is not viable either. That is still
> > going to break essentially every single management application that
> > exists today using QEMU.
> for that we have deprecation process, so users could switch to new CLI
> that would be required.

We could, but I don't find the cost/benefit tradeoff compelling.

There are so many places where we diverge from what bare metal would
do, that I don't see a good reason to introduce this breakage, even
if we notify users via a deprecation message. 

If QEMU wants to require NUMA for EPYC, then QEMU could internally
create a single NUMA node if none was specified for new machine
types, such that there is no visible change or breakage to any
mgmt apps.  


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-26 15:03           ` Daniel P. Berrangé
@ 2020-08-26 15:18             ` Eduardo Habkost
  2020-08-27 17:03             ` Igor Mammedov
  1 sibling, 0 replies; 51+ messages in thread
From: Eduardo Habkost @ 2020-08-26 15:18 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: mst, Michal Privoznik, qemu-devel, Babu Moger, pbonzini,
	Igor Mammedov, rth

On Wed, Aug 26, 2020 at 04:03:40PM +0100, Daniel P. Berrangé wrote:
> On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> > On Wed, 26 Aug 2020 14:36:38 +0100
> > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > 
> > > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > >   
> > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > > Babu Moger <babu.moger@amd.com> wrote:
> > > > > >     
> > > > > > > To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > > > > But, EPYC mode decode is running into problems. Also it can become quite a
> > > > > > > maintenance problem in the future. So, it was decided to remove that code and
> > > > > > > use the generic decode which works for majority of the topology. Most of the
> > > > > > > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > > > > it will create some sub-optimal configuration.
> > > > > > > Here is the discussion thread.
> > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > 
> > > > > > > This series removes all the EPYC mode specific apicid changes and use the generic
> > > > > > > apicid decode.    
> > > > > > 
> > > > > > the main difference between EPYC and all other CPUs is that
> > > > > > it requires numa configuration (it's not optional)
> > > > > > so we need an extra patch on top of this series to enforce that, i.e:
> > > > > > 
> > > > > >  if (epyc && !numa) 
> > > > > >     error("EPYC cpu requires numa to be configured")    
> > > > > 
> > > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > > real world QEMU deployments. That is way too user hostile to introduce
> > > > > as a requirement.
> > > > > 
> > > > > Why do we need to force this ?  People have been successfully using
> > > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > > 
> > > > > It might not match behaviour of bare metal silicon, but that hasn't
> > > > > obviously caused the world to come crashing down.  
> > > > So far it produces warning in linux kernel (RHBZ1728166),
> > > > (resulting performance might be suboptimal), but I haven't seen
> > > > anyone reporting crashes yet.
> > > > 
> > > > 
> > > > What other options do we have?
> > > > Perhaps we can turn on strict check for new machine types only,
> > > > so old configs can keep broken topology (CPUID),
> > > > while new ones would require -numa and produce correct topology.  
> > > 
> > > No, tying this to machine types is not viable either. That is still
> > > going to break essentially every single management application that
> > > exists today using QEMU.
> > for that we have deprecation process, so users could switch to new CLI
> > that would be required.
> 
> We could, but I don't find the cost/benefit tradeoff is compelling.
> 
> There are so many places where we diverge from what bare metal would
> do, that I don't see a good reason to introduce this breakage, even
> if we notify users via a deprecation message. 
> 
> If QEMU wants to require NUMA for EPYC, then QEMU could internally
> create a single NUMA node if none was specified for new machine
> types, such that there is no visible change or breakage to any
> mgmt apps.  

Is anything expected to break if we just set
auto_enable_numa=true unconditionally on pc-*-5.2 and newer?
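To make the question concrete, here is a rough sketch of what
version-gating the default could look like. SketchMachineType and both
helpers are hypothetical names, and the major/minor comparison is only
an assumption standing in for however the real machine-type classes
would set the flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of a versioned machine type. */
typedef struct SketchMachineType {
    const char *name;
    int major;
    int minor;
} SketchMachineType;

/* Assumed policy: 5.2 and newer machine types get the implicit node. */
static bool sketch_auto_enable_numa(const SketchMachineType *mt)
{
    assert(mt != NULL);
    return mt->major > 5 || (mt->major == 5 && mt->minor >= 2);
}

/* Node count the guest ends up with; explicit -numa always wins. */
static int sketch_effective_numa_nodes(const SketchMachineType *mt,
                                       int user_numa_nodes)
{
    if (user_numa_nodes > 0) {
        return user_numa_nodes;
    }
    return sketch_auto_enable_numa(mt) ? 1 : 0;
}
```

In this model older machine types keep their current behaviour with no
-numa, newer ones get one implicit node, and a command line that
already passes -numa is unaffected either way.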

-- 
Eduardo



^ permalink raw reply	[flat|nested] 51+ messages in thread

* RE: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-26 13:30     ` Igor Mammedov
  2020-08-26 13:36       ` Daniel P. Berrangé
@ 2020-08-26 17:17       ` Babu Moger
  2020-08-26 18:33         ` Dr. David Alan Gilbert
  1 sibling, 1 reply; 51+ messages in thread
From: Babu Moger @ 2020-08-26 17:17 UTC (permalink / raw)
  To: Igor Mammedov, Daniel P. Berrangé
  Cc: ehabkost, mst, Michal Privoznik, qemu-devel, pbonzini, rth


> -----Original Message-----
> From: Igor Mammedov <imammedo@redhat.com>
> Sent: Wednesday, August 26, 2020 8:31 AM
> To: Daniel P. Berrangé <berrange@redhat.com>
> Cc: Moger, Babu <Babu.Moger@amd.com>; pbonzini@redhat.com;
> rth@twiddle.net; ehabkost@redhat.com; qemu-devel@nongnu.org;
> mst@redhat.com; Michal Privoznik <mprivozn@redhat.com>
> Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic
> decode
> 
> On Wed, 26 Aug 2020 13:50:59 +0100
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
> > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > Babu Moger <babu.moger@amd.com> wrote:
> > >
> > > > To support some of the complex topology, we introduced EPYC mode
> > > > apicid decode.
> > > > But, EPYC mode decode is running into problems. Also it can become
> > > > quite a maintenance problem in the future. So, it was decided to
> > > > remove that code and use the generic decode which works for
> > > > majority of the topology. Most of the SPECed configuration would
> > > > work just fine. With some non-SPECed user inputs, it will create
> > > > some sub-optimal configuration.
> > > > Here is the discussion thread.
> > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > >
> > > > This series removes all the EPYC mode specific apicid changes and
> > > > use the generic apicid decode.
> > >
> > > the main difference between EPYC and all other CPUs is that it
> > > requires numa configuration (it's not optional) so we need an extra
No, that is not true. Because of that assumption we made all these apicid
changes. And here we are now.

AMD supports various mixed configurations. In case of EPYC-Rome, we have
NPS1, NPS2 and NPS4 (NUMA nodes per socket). In case of NPS1, basically we
have all the cores in a socket under one NUMA node. This is a non-NUMA
configuration.
Looking at the various configurations and also discussing internally, it
is not advisable to have the (epyc && !numa) check.
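As a small worked illustration of the NPS settings, the sketch below
computes the node count and cores per node. The helper names are
hypothetical, and the even split of cores across nodes is a simplifying
assumption for illustration rather than a statement about the hardware.

```c
#include <assert.h>

/* NPS = NUMA nodes per socket; 1, 2 or 4 as listed above. */
static unsigned sketch_total_nodes(unsigned sockets, unsigned nps)
{
    assert(nps == 1 || nps == 2 || nps == 4);
    return sockets * nps;
}

/* Assumes the socket's cores divide evenly among its nodes. */
static unsigned sketch_cores_per_node(unsigned cores_per_socket, unsigned nps)
{
    assert(nps == 1 || nps == 2 || nps == 4);
    assert(cores_per_socket % nps == 0);
    return cores_per_socket / nps;
}
```

Taking a 64-core socket as an example value, NPS1 on one socket gives a
single node holding all 64 cores, which is why it behaves like a
non-NUMA machine, while NPS4 gives four nodes of 16 cores each.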

> > > patch on top of this series to enforce that, i.e:
> > >
> > >  if (epyc && !numa)
> > >     error("EPYC cpu requires numa to be configured")
> >
> > Please no. This will break 90% of current usage of the EPYC CPU in
> > real world QEMU deployments. That is way too user hostile to introduce
> > as a requirement.
> >
> > Why do we need to force this ?  People have been successfully using
> > EPYC CPUs without NUMA in QEMU for years now.
> >
> > It might not match behaviour of bare metal silicon, but that hasn't
> > obviously caused the world to come crashing down.
> So far it produces warning in linux kernel (RHBZ1728166), (resulting performance
> might be suboptimal), but I haven't seen anyone reporting crashes yet.
> 
> 
> What other options do we have?
> Perhaps we can turn on strict check for new machine types only, so old configs
> can keep broken topology (CPUID), while new ones would require -numa and
> produce correct topology.
> 
> 
> >
> > Regards,
> > Daniel



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v5 1/8] hw/i386: Remove node_id, nr_nodes and nodes_per_pkg from topology
  2020-08-26  9:57   ` Igor Mammedov
@ 2020-08-26 17:31     ` Babu Moger
  0 siblings, 0 replies; 51+ messages in thread
From: Babu Moger @ 2020-08-26 17:31 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: qemu-devel, pbonzini, mst, ehabkost, rth



On 8/26/20 4:57 AM, Igor Mammedov wrote:
> On Fri, 21 Aug 2020 17:12:25 -0500
> Babu Moger <babu.moger@amd.com> wrote:
> 
>> Remove node_id, nr_nodes and nodes_per_pkg from topology. Use
>> die_id, nr_dies and dies_per_pkg which is already available.
>> Removes the confusion over two variables.
>>
>> With node_id removed in topology the uninitialized memory issue
>> with -device and CPU hotplug will be fixed.
>>
> 
> it would be easier to review if you first put all reverts,
> and then this patch on top.

Ok. I will change it in next revision.

> 
> well, to actually avoid that request from others,
> I'd suggest just to do that to avoid issue with merging.
> 
> other than that patches 2-7/8 look good to me.

> 
>> Link: https://bugzilla.redhat.com/show_bug.cgi?id=1828750
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>>  hw/i386/pc.c               |    1 -
>>  hw/i386/x86.c              |    1 -
>>  include/hw/i386/topology.h |   40 +++++++++-------------------------------
>>  target/i386/cpu.c          |   24 ++++++++++--------------
>>  target/i386/cpu.h          |    1 -
>>  tests/test-x86-cpuid.c     |   40 ++++++++++++++++++++--------------------
>>  6 files changed, 39 insertions(+), 68 deletions(-)
>>
>> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
>> index 47c5ca3e34..0ae5cb3af4 100644
>> --- a/hw/i386/pc.c
>> +++ b/hw/i386/pc.c
>> @@ -1498,7 +1498,6 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>>      init_topo_info(&topo_info, x86ms);
>>  
>>      env->nr_dies = x86ms->smp_dies;
>> -    env->nr_nodes = topo_info.nodes_per_pkg;
>>      env->pkg_offset = x86ms->apicid_pkg_offset(&topo_info);
>>  
>>      /*
>> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
>> index 67bee1bcb8..f6eab947df 100644
>> --- a/hw/i386/x86.c
>> +++ b/hw/i386/x86.c
>> @@ -62,7 +62,6 @@ inline void init_topo_info(X86CPUTopoInfo *topo_info,
>>  {
>>      MachineState *ms = MACHINE(x86ms);
>>  
>> -    topo_info->nodes_per_pkg = ms->numa_state->num_nodes / ms->smp.sockets;
>>      topo_info->dies_per_pkg = x86ms->smp_dies;
>>      topo_info->cores_per_die = ms->smp.cores;
>>      topo_info->threads_per_core = ms->smp.threads;
>> diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
>> index 07239f95f4..05ddde7ba0 100644
>> --- a/include/hw/i386/topology.h
>> +++ b/include/hw/i386/topology.h
>> @@ -47,14 +47,12 @@ typedef uint32_t apic_id_t;
>>  
>>  typedef struct X86CPUTopoIDs {
>>      unsigned pkg_id;
>> -    unsigned node_id;
>>      unsigned die_id;
>>      unsigned core_id;
>>      unsigned smt_id;
>>  } X86CPUTopoIDs;
>>  
>>  typedef struct X86CPUTopoInfo {
>> -    unsigned nodes_per_pkg;
>>      unsigned dies_per_pkg;
>>      unsigned cores_per_die;
>>      unsigned threads_per_core;
>> @@ -89,11 +87,6 @@ static inline unsigned apicid_die_width(X86CPUTopoInfo *topo_info)
>>      return apicid_bitwidth_for_count(topo_info->dies_per_pkg);
>>  }
>>  
>> -/* Bit width of the node_id field per socket */
>> -static inline unsigned apicid_node_width_epyc(X86CPUTopoInfo *topo_info)
>> -{
>> -    return apicid_bitwidth_for_count(MAX(topo_info->nodes_per_pkg, 1));
>> -}
>>  /* Bit offset of the Core_ID field
>>   */
>>  static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
>> @@ -114,30 +107,23 @@ static inline unsigned apicid_pkg_offset(X86CPUTopoInfo *topo_info)
>>      return apicid_die_offset(topo_info) + apicid_die_width(topo_info);
>>  }
>>  
>> -#define NODE_ID_OFFSET 3 /* Minimum node_id offset if numa configured */
>> +#define EPYC_DIE_OFFSET 3 /* Minimum die_id offset if numa configured */
>>  
>>  /*
>> - * Bit offset of the node_id field
>> - *
>> - * Make sure nodes_per_pkg >  0 if numa configured else zero.
>> + * Bit offset of the die_id field
>>   */
>> -static inline unsigned apicid_node_offset_epyc(X86CPUTopoInfo *topo_info)
>> +static inline unsigned apicid_die_offset_epyc(X86CPUTopoInfo *topo_info)
>>  {
>> -    unsigned offset = apicid_die_offset(topo_info) +
>> -                      apicid_die_width(topo_info);
>> +    unsigned offset = apicid_core_offset(topo_info) +
>> +                      apicid_core_width(topo_info);
>>  
>> -    if (topo_info->nodes_per_pkg) {
>> -        return MAX(NODE_ID_OFFSET, offset);
>> -    } else {
>> -        return offset;
>> -    }
>> +    return MAX(EPYC_DIE_OFFSET, offset);
>>  }
>>  
>>  /* Bit offset of the Pkg_ID (socket ID) field */
>>  static inline unsigned apicid_pkg_offset_epyc(X86CPUTopoInfo *topo_info)
>>  {
>> -    return apicid_node_offset_epyc(topo_info) +
>> -           apicid_node_width_epyc(topo_info);
>> +    return apicid_die_offset_epyc(topo_info) + apicid_die_width(topo_info);
>>  }
>>  
>>  /*
>> @@ -150,8 +136,7 @@ x86_apicid_from_topo_ids_epyc(X86CPUTopoInfo *topo_info,
>>                                const X86CPUTopoIDs *topo_ids)
>>  {
>>      return (topo_ids->pkg_id  << apicid_pkg_offset_epyc(topo_info)) |
>> -           (topo_ids->node_id << apicid_node_offset_epyc(topo_info)) |
>> -           (topo_ids->die_id  << apicid_die_offset(topo_info)) |
>> +           (topo_ids->die_id  << apicid_die_offset_epyc(topo_info)) |
>>             (topo_ids->core_id << apicid_core_offset(topo_info)) |
>>             topo_ids->smt_id;
>>  }
>> @@ -160,15 +145,11 @@ static inline void x86_topo_ids_from_idx_epyc(X86CPUTopoInfo *topo_info,
>>                                                unsigned cpu_index,
>>                                                X86CPUTopoIDs *topo_ids)
>>  {
>> -    unsigned nr_nodes = MAX(topo_info->nodes_per_pkg, 1);
>>      unsigned nr_dies = topo_info->dies_per_pkg;
>>      unsigned nr_cores = topo_info->cores_per_die;
>>      unsigned nr_threads = topo_info->threads_per_core;
>> -    unsigned cores_per_node = DIV_ROUND_UP((nr_dies * nr_cores * nr_threads),
>> -                                            nr_nodes);
>>  
>>      topo_ids->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
>> -    topo_ids->node_id = (cpu_index / cores_per_node) % nr_nodes;
>>      topo_ids->die_id = cpu_index / (nr_cores * nr_threads) % nr_dies;
>>      topo_ids->core_id = cpu_index / nr_threads % nr_cores;
>>      topo_ids->smt_id = cpu_index % nr_threads;
>> @@ -188,11 +169,8 @@ static inline void x86_topo_ids_from_apicid_epyc(apic_id_t apicid,
>>              (apicid >> apicid_core_offset(topo_info)) &
>>              ~(0xFFFFFFFFUL << apicid_core_width(topo_info));
>>      topo_ids->die_id =
>> -            (apicid >> apicid_die_offset(topo_info)) &
>> +            (apicid >> apicid_die_offset_epyc(topo_info)) &
>>              ~(0xFFFFFFFFUL << apicid_die_width(topo_info));
>> -    topo_ids->node_id =
>> -            (apicid >> apicid_node_offset_epyc(topo_info)) &
>> -            ~(0xFFFFFFFFUL << apicid_node_width_epyc(topo_info));
>>      topo_ids->pkg_id = apicid >> apicid_pkg_offset_epyc(topo_info);
>>  }
>>  
>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>> index 588f32e136..3c58af1f43 100644
>> --- a/target/i386/cpu.c
>> +++ b/target/i386/cpu.c
>> @@ -345,7 +345,6 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
>>                                         uint32_t *ecx, uint32_t *edx)
>>  {
>>      uint32_t l3_cores;
>> -    unsigned nodes = MAX(topo_info->nodes_per_pkg, 1);
>>  
>>      assert(cache->size == cache->line_size * cache->associativity *
>>                            cache->partitions * cache->sets);
>> @@ -355,10 +354,9 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
>>  
>>      /* L3 is shared among multiple cores */
>>      if (cache->level == 3) {
>> -        l3_cores = DIV_ROUND_UP((topo_info->dies_per_pkg *
>> -                                 topo_info->cores_per_die *
>> +        l3_cores = DIV_ROUND_UP((topo_info->cores_per_die *
>>                                   topo_info->threads_per_core),
>> -                                 nodes);
>> +                                 topo_info->dies_per_pkg);
>>          *eax |= (l3_cores - 1) << 14;
>>      } else {
>>          *eax |= ((topo_info->threads_per_core - 1) << 14);
>> @@ -387,7 +385,7 @@ static void encode_topo_cpuid8000001e(X86CPUTopoInfo *topo_info, X86CPU *cpu,
>>                                         uint32_t *ecx, uint32_t *edx)
>>  {
>>      X86CPUTopoIDs topo_ids = {0};
>> -    unsigned long nodes = MAX(topo_info->nodes_per_pkg, 1);
>> +    unsigned long dies = topo_info->dies_per_pkg;
>>      int shift;
>>  
>>      x86_topo_ids_from_apicid_epyc(cpu->apic_id, topo_info, &topo_ids);
>> @@ -408,7 +406,7 @@ static void encode_topo_cpuid8000001e(X86CPUTopoInfo *topo_info, X86CPU *cpu,
>>       *             3 Core complex id
>>       *           1:0 Core id
>>       */
>> -    *ebx = ((topo_info->threads_per_core - 1) << 8) | (topo_ids.node_id << 3) |
>> +    *ebx = ((topo_info->threads_per_core - 1) << 8) | (topo_ids.die_id << 3) |
>>              (topo_ids.core_id);
>>      /*
>>       * CPUID_Fn8000001E_ECX
>> @@ -418,8 +416,8 @@ static void encode_topo_cpuid8000001e(X86CPUTopoInfo *topo_info, X86CPU *cpu,
>>       *         2  Socket id
>>       *       1:0  Node id
>>       */
>> -    if (nodes <= 4) {
>> -        *ecx = ((nodes - 1) << 8) | (topo_ids.pkg_id << 2) | topo_ids.node_id;
>> +    if (dies <= 4) {
>> +        *ecx = ((dies - 1) << 8) | (topo_ids.pkg_id << 2) | topo_ids.die_id;
>>      } else {
>>          /*
>>           * Node id fix up. Actual hardware supports up to 4 nodes. But with
>> @@ -434,10 +432,10 @@ static void encode_topo_cpuid8000001e(X86CPUTopoInfo *topo_info, X86CPU *cpu,
>>           * number of nodes. find_last_bit returns last set bit(0 based). Left
>>           * shift(+1) the socket id to represent all the nodes.
>>           */
>> -        nodes -= 1;
>> -        shift = find_last_bit(&nodes, 8);
>> -        *ecx = (nodes << 8) | (topo_ids.pkg_id << (shift + 1)) |
>> -               topo_ids.node_id;
>> +        dies -= 1;
>> +        shift = find_last_bit(&dies, 8);
>> +        *ecx = (dies << 8) | (topo_ids.pkg_id << (shift + 1)) |
>> +               topo_ids.die_id;
>>      }
>>      *edx = 0;
>>  }
>> @@ -5489,7 +5487,6 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>>      uint32_t signature[3];
>>      X86CPUTopoInfo topo_info;
>>  
>> -    topo_info.nodes_per_pkg = env->nr_nodes;
>>      topo_info.dies_per_pkg = env->nr_dies;
>>      topo_info.cores_per_die = cs->nr_cores;
>>      topo_info.threads_per_core = cs->nr_threads;
>> @@ -6949,7 +6946,6 @@ static void x86_cpu_initfn(Object *obj)
>>      FeatureWord w;
>>  
>>      env->nr_dies = 1;
>> -    env->nr_nodes = 1;
>>      cpu_set_cpustate_pointers(cpu);
>>  
>>      object_property_add(obj, "family", "int",
>> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
>> index e1a5c174dc..4c89bee8d1 100644
>> --- a/target/i386/cpu.h
>> +++ b/target/i386/cpu.h
>> @@ -1629,7 +1629,6 @@ typedef struct CPUX86State {
>>      TPRAccess tpr_access_type;
>>  
>>      unsigned nr_dies;
>> -    unsigned nr_nodes;
>>      unsigned pkg_offset;
>>  } CPUX86State;
>>  
>> diff --git a/tests/test-x86-cpuid.c b/tests/test-x86-cpuid.c
>> index 049030a50e..bfabc0403a 100644
>> --- a/tests/test-x86-cpuid.c
>> +++ b/tests/test-x86-cpuid.c
>> @@ -31,12 +31,12 @@ static void test_topo_bits(void)
>>      X86CPUTopoInfo topo_info = {0};
>>  
>>      /* simple tests for 1 thread per core, 1 core per die, 1 die per package */
>> -    topo_info = (X86CPUTopoInfo) {0, 1, 1, 1};
>> +    topo_info = (X86CPUTopoInfo) {1, 1, 1};
>>      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 0);
>>      g_assert_cmpuint(apicid_core_width(&topo_info), ==, 0);
>>      g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
>>  
>> -    topo_info = (X86CPUTopoInfo) {0, 1, 1, 1};
>> +    topo_info = (X86CPUTopoInfo) {1, 1, 1};
>>      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
>>      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
>>      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
>> @@ -45,39 +45,39 @@ static void test_topo_bits(void)
>>  
>>      /* Test field width calculation for multiple values
>>       */
>> -    topo_info = (X86CPUTopoInfo) {0, 1, 1, 2};
>> +    topo_info = (X86CPUTopoInfo) {1, 1, 2};
>>      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 1);
>> -    topo_info = (X86CPUTopoInfo) {0, 1, 1, 3};
>> +    topo_info = (X86CPUTopoInfo) {1, 1, 3};
>>      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
>> -    topo_info = (X86CPUTopoInfo) {0, 1, 1, 4};
>> +    topo_info = (X86CPUTopoInfo) {1, 1, 4};
>>      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
>>  
>> -    topo_info = (X86CPUTopoInfo) {0, 1, 1, 14};
>> +    topo_info = (X86CPUTopoInfo) {1, 1, 14};
>>      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
>> -    topo_info = (X86CPUTopoInfo) {0, 1, 1, 15};
>> +    topo_info = (X86CPUTopoInfo) {1, 1, 15};
>>      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
>> -    topo_info = (X86CPUTopoInfo) {0, 1, 1, 16};
>> +    topo_info = (X86CPUTopoInfo) {1, 1, 16};
>>      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
>> -    topo_info = (X86CPUTopoInfo) {0, 1, 1, 17};
>> +    topo_info = (X86CPUTopoInfo) {1, 1, 17};
>>      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 5);
>>  
>>  
>> -    topo_info = (X86CPUTopoInfo) {0, 1, 30, 2};
>> +    topo_info = (X86CPUTopoInfo) {1, 30, 2};
>>      g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
>> -    topo_info = (X86CPUTopoInfo) {0, 1, 31, 2};
>> +    topo_info = (X86CPUTopoInfo) {1, 31, 2};
>>      g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
>> -    topo_info = (X86CPUTopoInfo) {0, 1, 32, 2};
>> +    topo_info = (X86CPUTopoInfo) {1, 32, 2};
>>      g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
>> -    topo_info = (X86CPUTopoInfo) {0, 1, 33, 2};
>> +    topo_info = (X86CPUTopoInfo) {1, 33, 2};
>>      g_assert_cmpuint(apicid_core_width(&topo_info), ==, 6);
>>  
>> -    topo_info = (X86CPUTopoInfo) {0, 1, 30, 2};
>> +    topo_info = (X86CPUTopoInfo) {1, 30, 2};
>>      g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
>> -    topo_info = (X86CPUTopoInfo) {0, 2, 30, 2};
>> +    topo_info = (X86CPUTopoInfo) {2, 30, 2};
>>      g_assert_cmpuint(apicid_die_width(&topo_info), ==, 1);
>> -    topo_info = (X86CPUTopoInfo) {0, 3, 30, 2};
>> +    topo_info = (X86CPUTopoInfo) {3, 30, 2};
>>      g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
>> -    topo_info = (X86CPUTopoInfo) {0, 4, 30, 2};
>> +    topo_info = (X86CPUTopoInfo) {4, 30, 2};
>>      g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
>>  
>>      /* build a weird topology and see if IDs are calculated correctly
>> @@ -85,18 +85,18 @@ static void test_topo_bits(void)
>>  
>>      /* This will use 2 bits for thread ID and 3 bits for core ID
>>       */
>> -    topo_info = (X86CPUTopoInfo) {0, 1, 6, 3};
>> +    topo_info = (X86CPUTopoInfo) {1, 6, 3};
>>      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
>>      g_assert_cmpuint(apicid_core_offset(&topo_info), ==, 2);
>>      g_assert_cmpuint(apicid_die_offset(&topo_info), ==, 5);
>>      g_assert_cmpuint(apicid_pkg_offset(&topo_info), ==, 5);
>>  
>> -    topo_info = (X86CPUTopoInfo) {0, 1, 6, 3};
>> +    topo_info = (X86CPUTopoInfo) {1, 6, 3};
>>      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
>>      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
>>      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
>>  
>> -    topo_info = (X86CPUTopoInfo) {0, 1, 6, 3};
>> +    topo_info = (X86CPUTopoInfo) {1, 6, 3};
>>      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 0), ==,
>>                       (1 << 2) | 0);
>>      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 1), ==,
>>
>>
> 


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-26 17:17       ` Babu Moger
@ 2020-08-26 18:33         ` Dr. David Alan Gilbert
  2020-08-26 18:45           ` Babu Moger
  0 siblings, 1 reply; 51+ messages in thread
From: Dr. David Alan Gilbert @ 2020-08-26 18:33 UTC (permalink / raw)
  To: Babu Moger
  Cc: Daniel P. Berrangé,
	ehabkost, mst, Michal Privoznik, qemu-devel, pbonzini,
	Igor Mammedov, rth

* Babu Moger (babu.moger@amd.com) wrote:
> 
> > -----Original Message-----
> > From: Igor Mammedov <imammedo@redhat.com>
> > Sent: Wednesday, August 26, 2020 8:31 AM
> > To: Daniel P. Berrangé <berrange@redhat.com>
> > Cc: Moger, Babu <Babu.Moger@amd.com>; pbonzini@redhat.com;
> > rth@twiddle.net; ehabkost@redhat.com; qemu-devel@nongnu.org;
> > mst@redhat.com; Michal Privoznik <mprivozn@redhat.com>
> > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic
> > decode
> > 
> > On Wed, 26 Aug 2020 13:50:59 +0100
> > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > 
> > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > Babu Moger <babu.moger@amd.com> wrote:
> > > >
> > > > > To support some of the complex topology, we introduced EPYC mode
> > > > > apicid decode.
> > > > > But, EPYC mode decode is running into problems. Also it can become
> > > > > quite a maintenance problem in the future. So, it was decided to
> > > > > remove that code and use the generic decode which works for
> > > > > majority of the topology. Most of the SPECed configuration would
> > > > > work just fine. With some non-SPECed user inputs, it will create some
> > > > > sub-optimal configuration.
> > > > > Here is the discussion thread.
> > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > >
> > > > > This series removes all the EPYC mode specific apicid changes and
> > > > > use the generic apicid decode.
> > > >
> > > > the main difference between EPYC and all other CPUs is that it
> > > > requires numa configuration (it's not optional) so we need an extra
> No, That is not true. Because of that assumption we made all these apicid
> changes. And here we are now.
> 
> AMD supports various mixed configurations. In the case of EPYC-Rome, we have
> NPS1, NPS2 and NPS4 (NUMA nodes per socket). In the case of NPS1, basically we
> have all the cores in a socket under one NUMA node. This is a non-NUMA
> configuration.
> Looking at the various configurations and also discussing internally, it
> is not advisable to have (epyc && !numa) check.

Indeed on real hardware, I don't think we always see NUMA; my single
socket, 16 core/32 thread 7302P Dell box, shows the kernel printing
'No NUMA configuration found...Faking a node.'

So if real hardware hasn't got a NUMA node, what's the real problem?

Dave

> > > > patch on top of this series to enforce that, i.e.:
> > > >
> > > >  if (epyc && !numa)
> > > >     error("EPYC cpu requires numa to be configured")
> > >
> > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > real world QEMU deployments. That is way too user hostile to introduce
> > > as a requirement.
> > >
> > > Why do we need to force this?  People have been successfully using
> > > EPYC CPUs without NUMA in QEMU for years now.
> > >
> > > It might not match behaviour of bare metal silicon, but that hasn't
> > > obviously caused the world to come crashing down.
> > So far it produces warning in linux kernel (RHBZ1728166), (resulting performance
> > might be suboptimal), but I haven't seen anyone reporting crashes yet.
> > 
> > 
> > What other options do we have?
> > Perhaps we can turn on strict check for new machine types only, so old configs
> > can keep broken topology (CPUID), while new ones would require -numa and
> > produce correct topology.
> > 
> > 
> > >
> > > Regards,
> > > Daniel
> 
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 51+ messages in thread

* RE: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-26 18:33         ` Dr. David Alan Gilbert
@ 2020-08-26 18:45           ` Babu Moger
  2020-08-27 20:21             ` Igor Mammedov
  0 siblings, 1 reply; 51+ messages in thread
From: Babu Moger @ 2020-08-26 18:45 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Daniel P. Berrangé,
	ehabkost, mst, Michal Privoznik, qemu-devel, pbonzini,
	Igor Mammedov, rth



> -----Original Message-----
> From: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Sent: Wednesday, August 26, 2020 1:34 PM
> To: Moger, Babu <Babu.Moger@amd.com>
> Cc: Igor Mammedov <imammedo@redhat.com>; Daniel P. Berrangé
> <berrange@redhat.com>; ehabkost@redhat.com; mst@redhat.com; Michal
> Privoznik <mprivozn@redhat.com>; qemu-devel@nongnu.org;
> pbonzini@redhat.com; rth@twiddle.net
> Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic
> decode
> 
> * Babu Moger (babu.moger@amd.com) wrote:
> >
> > > -----Original Message-----
> > > From: Igor Mammedov <imammedo@redhat.com>
> > > Sent: Wednesday, August 26, 2020 8:31 AM
> > > To: Daniel P. Berrangé <berrange@redhat.com>
> > > Cc: Moger, Babu <Babu.Moger@amd.com>; pbonzini@redhat.com;
> > > rth@twiddle.net; ehabkost@redhat.com; qemu-devel@nongnu.org;
> > > mst@redhat.com; Michal Privoznik <mprivozn@redhat.com>
> > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> > > generic decode
> > >
> > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > >
> > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> > > > > On Fri, 21 Aug 2020 17:12:19 -0500 Babu Moger
> > > > > <babu.moger@amd.com> wrote:
> > > > >
> > > > > > > To support some of the complex topology, we introduced EPYC
> > > > > > > mode apicid decode.
> > > > > > > But, EPYC mode decode is running into problems. Also it can
> > > > > > > become quite a maintenance problem in the future. So, it was
> > > > > > > decided to remove that code and use the generic decode which
> > > > > > > works for majority of the topology. Most of the SPECed
> > > > > > > configuration would work just fine. With some non-SPECed user
> > > > > > > inputs, it will create some sub-optimal configuration.
> > > > > > > Here is the discussion thread.
> > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > >
> > > > > > This series removes all the EPYC mode specific apicid changes
> > > > > > and use the generic apicid decode.
> > > > >
> > > > > the main difference between EPYC and all other CPUs is that it
> > > > > requires numa configuration (it's not optional) so we need an
> > > > > extra
> > No, That is not true. Because of that assumption we made all these
> > apicid changes. And here we are now.
> >
> > AMD supports various mixed configurations. In the case of EPYC-Rome, we
> > have NPS1, NPS2 and NPS4 (NUMA nodes per socket). In the case of NPS1,
> > basically we have all the cores in a socket under one NUMA node. This
> > is a non-NUMA configuration.
> > Looking at the various configurations and also discussing internally,
> > it is not advisable to have (epyc && !numa) check.
> 
> Indeed on real hardware, I don't think we always see NUMA; my single socket,
> 16 core/32 thread 7302P Dell box, shows the kernel printing 'No NUMA
> configuration found...Faking a node.'
> 
> So if real hardware hasn't got a NUMA node, what's the real problem?

I don't see any problem once we revert all these changes (patches 1-7).
We don't need the if (epyc && !numa) error check, and we don't need to set
auto_enable_numa=true unconditionally.

> 
> Dave
> 
> > > > > patch on top of this series to enforce that, i.e.:
> > > > >
> > > > >  if (epyc && !numa)
> > > > >     error("EPYC cpu requires numa to be configured")
> > > >
> > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > real world QEMU deployments. That is way too user hostile to
> > > > introduce as a requirement.
> > > >
> > > > Why do we need to force this?  People have been successfully using
> > > > EPYC CPUs without NUMA in QEMU for years now.
> > > >
> > > > It might not match behaviour of bare metal silicon, but that
> > > > hasn't obviously caused the world to come crashing down.
> > > So far it produces warning in linux kernel (RHBZ1728166), (resulting
> > > performance might be suboptimal), but I haven't seen anyone reporting
> > > crashes yet.
> > >
> > >
> > > What other options do we have?
> > > Perhaps we can turn on strict check for new machine types only, so
> > > old configs can keep broken topology (CPUID), while new ones would
> > > require -numa and produce correct topology.
> > >
> > >
> > > >
> > > > Regards,
> > > > Daniel
> >
> >
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-26 15:03           ` Daniel P. Berrangé
  2020-08-26 15:18             ` Eduardo Habkost
@ 2020-08-27 17:03             ` Igor Mammedov
  2020-08-27 19:07               ` Eduardo Habkost
  1 sibling, 1 reply; 51+ messages in thread
From: Igor Mammedov @ 2020-08-27 17:03 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: ehabkost, mst, Michal Privoznik, qemu-devel, Babu Moger, pbonzini, rth

On Wed, 26 Aug 2020 16:03:40 +0100
Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> > On Wed, 26 Aug 2020 14:36:38 +0100
> > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > 
> > > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > >   
> > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > > Babu Moger <babu.moger@amd.com> wrote:
> > > > > >     
> > > > > > > To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > > > > But, EPYC mode decode is running into problems. Also it can become quite a
> > > > > > > maintenance problem in the future. So, it was decided to remove that code and
> > > > > > > use the generic decode which works for majority of the topology. Most of the
> > > > > > > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > > > > it will create some sub-optimal configuration.
> > > > > > > Here is the discussion thread.
> > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > 
> > > > > > > This series removes all the EPYC mode specific apicid changes and use the generic
> > > > > > > apicid decode.    
> > > > > > 
> > > > > > the main difference between EPYC and all other CPUs is that
> > > > > > it requires numa configuration (it's not optional)
> > > > > > so we need an extra patch on top of this series to enforce that, i.e.:
> > > > > > 
> > > > > >  if (epyc && !numa) 
> > > > > >     error("EPYC cpu requires numa to be configured")    
> > > > > 
> > > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > > real world QEMU deployments. That is way too user hostile to introduce
> > > > > as a requirement.
> > > > > 
> > > > > Why do we need to force this?  People have been successfully using
> > > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > > 
> > > > > It might not match behaviour of bare metal silicon, but that hasn't
> > > > > obviously caused the world to come crashing down.  
> > > > So far it produces warning in linux kernel (RHBZ1728166),
> > > > (resulting performance might be suboptimal), but I haven't seen
> > > > anyone reporting crashes yet.
> > > > 
> > > > 
> > > > What other options do we have?
> > > > Perhaps we can turn on strict check for new machine types only,
> > > > so old configs can keep broken topology (CPUID),
> > > > while new ones would require -numa and produce correct topology.  
> > > 
> > > No, tying this to machine types is not viable either. That is still
> > > going to break essentially every single management application that
> > > exists today using QEMU.
> > for that we have deprecation process, so users could switch to new CLI
> > that would be required.
> 
> We could, but I don't find the cost/benefit tradeoff is compelling.
> 
> There are so many places where we diverge from what bare metal would
> do, that I don't see a good reason to introduce this breakage, even
> if we notify users via a deprecation message. 
I find (3) and (4) good enough reasons to use deprecation.

> If QEMU wants to require NUMA for EPYC, then QEMU could internally
> create a single NUMA node if none was specified for new machine
> types, such that there is no visible change or breakage to any
> mgmt apps.  

(1) For configs started without -numa and/or without -smp dies>1,
      QEMU can do just that (enable auto_enable_numa).

(2) As for configs that are out of spec, I do not care much (junk in - junk out)
(though not having to spend time on bug reports and debug issues, just to say
it's not supported in the end, makes deprecation sound like a reasonable
choice)

(3) However, if the config matches bare metal, i.e. the CPU has more than 1 die and is
within the die limits (spec-wise), QEMU has to produce valid CPUs.
In this case QEMU can't make up multiple NUMA nodes and mappings of RAM/CPUs
on the user's behalf. That's where we have to error out and ask for an explicit
NUMA configuration.

For such configs, the current code (since 5.0) will in the best case produce
performance issues due to mismatching data in the APIC ID, CPUID and ACPI tables;
in the worst case, issues might be related to an invalid APIC ID if running on an
EPYC host where the HW takes into account subfields of the APIC ID (according to
Babu, the real CPU uses die_id (aka node_id) internally).
I'd rather error out on nonsense configs early than debug such issues
and then error out anyway later (upsetting more users).

(4)
If I were a non-hobby user, I'd hate it if QEMU allowed me to start an invalid config
and I then had to spend time debugging issues (including performance ones),
instead of QEMU clearly telling me what's wrong and how the config should be corrected.
I'd probably jump to another hypervisor that does the job right,
instead of digging into the QEMU codebase and CPU specs to figure out how
to hack and configure it.
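
To make (1) and (3) concrete, here is a rough sketch of the check being
proposed. It is only an illustration: NumaAction and epyc_numa_action() are
hypothetical names, not existing QEMU identifiers, and the out-of-spec
configs from (2) are deliberately not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome codes; these are not QEMU identifiers. */
typedef enum {
    NUMA_ACTION_NONE,       /* non-EPYC model, or -numa was given explicitly */
    NUMA_ACTION_AUTO_NODE,  /* case (1): create a single implicit NUMA node */
    NUMA_ACTION_ERROR       /* case (3): error out, ask for explicit -numa */
} NumaAction;

/*
 * Decide what to do for a given CPU model and topology.
 * 'dies' is the -smp dies= value (1 when not specified).
 */
static NumaAction epyc_numa_action(bool is_epyc, unsigned dies,
                                   bool numa_configured)
{
    assert(dies >= 1);

    if (!is_epyc || numa_configured) {
        return NUMA_ACTION_NONE;
    }
    if (dies == 1) {
        /* One die per package: a single implicit node is unambiguous. */
        return NUMA_ACTION_AUTO_NODE;
    }
    /* Several dies: the RAM/CPU mapping cannot be guessed for the user. */
    return NUMA_ACTION_ERROR;
}
```

With this shape, an existing command line that uses an EPYC model with
neither -numa nor dies>1 keeps starting; only the multi-die case without
-numa is rejected.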


> 
> 
> Regards,
> Daniel



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-27 17:03             ` Igor Mammedov
@ 2020-08-27 19:07               ` Eduardo Habkost
  2020-08-27 20:55                 ` Igor Mammedov
  0 siblings, 1 reply; 51+ messages in thread
From: Eduardo Habkost @ 2020-08-27 19:07 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Daniel P. Berrangé,
	mst, Michal Privoznik, qemu-devel, Babu Moger, pbonzini, rth

On Thu, Aug 27, 2020 at 07:03:14PM +0200, Igor Mammedov wrote:
> On Wed, 26 Aug 2020 16:03:40 +0100
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
> > On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> > > On Wed, 26 Aug 2020 14:36:38 +0100
> > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > 
> > > > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > >   
> > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > > > Babu Moger <babu.moger@amd.com> wrote:
> > > > > > >     
> > > > > > > > To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > > > > > But, EPYC mode decode is running into problems. Also it can become quite a
> > > > > > > > maintenance problem in the future. So, it was decided to remove that code and
> > > > > > > > use the generic decode which works for majority of the topology. Most of the
> > > > > > > > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > > > > > it will create some sub-optimal configuration.
> > > > > > > > Here is the discussion thread.
> > > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > > 
> > > > > > > > This series removes all the EPYC mode specific apicid changes and use the generic
> > > > > > > > apicid decode.    
> > > > > > > 
> > > > > > > the main difference between EPYC and all other CPUs is that
> > > > > > > it requires numa configuration (it's not optional)
> > > > > > > so we need an extra patch on top of this series to enforce that, i.e.:
> > > > > > > 
> > > > > > >  if (epyc && !numa) 
> > > > > > >     error("EPYC cpu requires numa to be configured")    
> > > > > > 
> > > > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > > > real world QEMU deployments. That is way too user hostile to introduce
> > > > > > as a requirement.
> > > > > > 
> > > > > > Why do we need to force this?  People have been successfully using
> > > > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > > > 
> > > > > > It might not match behaviour of bare metal silicon, but that hasn't
> > > > > > obviously caused the world to come crashing down.  
> > > > > So far it produces warning in linux kernel (RHBZ1728166),
> > > > > (resulting performance might be suboptimal), but I haven't seen
> > > > > anyone reporting crashes yet.
> > > > > 
> > > > > 
> > > > > What other options do we have?
> > > > > Perhaps we can turn on strict check for new machine types only,
> > > > > so old configs can keep broken topology (CPUID),
> > > > > while new ones would require -numa and produce correct topology.  
> > > > 
> > > > No, tying this to machine types is not viable either. That is still
> > > > going to break essentially every single management application that
> > > > exists today using QEMU.
> > > for that we have deprecation process, so users could switch to new CLI
> > > that would be required.
> > 
> > We could, but I don't find the cost/benefit tradeoff is compelling.
> > 
> > There are so many places where we diverge from what bare metal would
> > do, that I don't see a good reason to introduce this breakage, even
> > if we notify users via a deprecation message. 
> I find (3) and (4) good enough reasons to use deprecation.
> 
> > If QEMU wants to require NUMA for EPYC, then QEMU could internally
> > create a single NUMA node if none was specified for new machine
> > types, such that there is no visible change or breakage to any
> > mgmt apps.  
> 
> (1) For configs started without -numa and/or without -smp dies>1,
>       QEMU can do just that (enable auto_enable_numa).

Why exactly do we need auto_enable_numa with dies=1?

If I understand correctly, Babu said earlier in this thread[1]
that we don't need auto_enable_numa.

[1] https://lore.kernel.org/qemu-devel/11489e5f-2285-ddb4-9c35-c9f522d603a0@amd.com/

> 
> (2) As for configs that are out of spec, I do not care much (junk in - junk out)
> (though not having to spend time on bug reports and debug issues, just to say
> it's not supported in the end, makes deprecation sound like a reasonable
> choice)
> 
> (3) However, if the config matches bare metal, i.e. the CPU has more than 1 die and is
> within the die limits (spec-wise), QEMU has to produce valid CPUs.
> In this case QEMU can't make up multiple NUMA nodes and mappings of RAM/CPUs
> on the user's behalf. That's where we have to error out and ask for an explicit
> NUMA configuration.
> 
> For such configs, the current code (since 5.0) will in the best case produce
> performance issues due to mismatching data in the APIC ID, CPUID and ACPI tables;
> in the worst case, issues might be related to an invalid APIC ID if running on an
> EPYC host where the HW takes into account subfields of the APIC ID (according to
> Babu, the real CPU uses die_id (aka node_id) internally).
> I'd rather error out on nonsense configs early than debug such issues
> and then error out anyway later (upsetting more users).
> 

The requirements are not clear to me.  Is this just about making
CPU die_id match the NUMA node ID, or are there additional
constraints?


> (4)
> If I were non hobby user, I'd hate if QEMU allowed me to start invalid config,
> that I'd have to spend time on debugging issues (including performance ones),
> instead of clearly telling me what's wrong and how config should be corrected.
> I'd probably jump to another hypervisor that does the job right,
> instead of digging into QEMU codebase and CPU specs to figure out how
> to hack and configure it.
> 

-- 
Eduardo




* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-26 18:45           ` Babu Moger
@ 2020-08-27 20:21             ` Igor Mammedov
  2020-08-28  8:58               ` Daniel P. Berrangé
  0 siblings, 1 reply; 51+ messages in thread
From: Igor Mammedov @ 2020-08-27 20:21 UTC (permalink / raw)
  To: Babu Moger
  Cc: Daniel P. Berrangé,
	ehabkost, mst, Michal Privoznik, qemu-devel,
	Dr. David Alan Gilbert, pbonzini, rth

On Wed, 26 Aug 2020 13:45:51 -0500
Babu Moger <babu.moger@amd.com> wrote:

> 
> 
> > -----Original Message-----
> > From: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > Sent: Wednesday, August 26, 2020 1:34 PM
> > To: Moger, Babu <Babu.Moger@amd.com>
> > Cc: Igor Mammedov <imammedo@redhat.com>; Daniel P. Berrangé
> > <berrange@redhat.com>; ehabkost@redhat.com; mst@redhat.com; Michal
> > Privoznik <mprivozn@redhat.com>; qemu-devel@nongnu.org;
> > pbonzini@redhat.com; rth@twiddle.net
> > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic
> > decode
> > 
> > * Babu Moger (babu.moger@amd.com) wrote:
> > >
> > > > -----Original Message-----
> > > > From: Igor Mammedov <imammedo@redhat.com>
> > > > Sent: Wednesday, August 26, 2020 8:31 AM
> > > > To: Daniel P. Berrangé <berrange@redhat.com>
> > > > Cc: Moger, Babu <Babu.Moger@amd.com>; pbonzini@redhat.com;
> > > > rth@twiddle.net; ehabkost@redhat.com; qemu-devel@nongnu.org;
> > > > mst@redhat.com; Michal Privoznik <mprivozn@redhat.com>
> > > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> > > > generic decode
> > > >
> > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > >
> > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> > > > > > On Fri, 21 Aug 2020 17:12:19 -0500 Babu Moger
> > > > > > <babu.moger@amd.com> wrote:
> > > > > >
> > > > > > > > To support some of the complex topology, we introduced EPYC
> > > > > > > > mode apicid decode.
> > > > > > > > But, EPYC mode decode is running into problems. Also it can
> > > > > > > > become quite a maintenance problem in the future. So, it was
> > > > > > > > decided to remove that code and use the generic decode which
> > > > > > > > works for majority of the topology. Most of the SPECed
> > > > > > > > configuration would work just fine. With some non-SPECed user
> > > > > > > > inputs, it will create some sub-optimal configuration.
> > > > > > > > Here is the discussion thread.
> > > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > >
> > > > > > > This series removes all the EPYC mode specific apicid changes
> > > > > > > and use the generic apicid decode.
> > > > > >
> > > > > > the main difference between EPYC and all other CPUs is that it
> > > > > > requires numa configuration (it's not optional) so we need an
> > > > > > extra
> > > No, That is not true. Because of that assumption we made all these
> > > apicid changes. And here we are now.
> > >
> > > AMD supports varies mixed configurations. In case of EPYC-Rome, we
> > > have NPS1, NPS2 and NPS4(Numa Nodes per socket). In case of NPS1,
> > > basically we have all the cores in a socket under one numa node. This
> > > is non-numa configuration.
> > > Looking at the various configurations and also discussing internally,
> > > it is not advisable to have (epyc && !numa) check.
> > 
> > Indeed on real hardware, I don't think we always see NUMA; my single socket,
> > 16 core/32 thread 7302P Dell box, shows the kernel printing 'No NUMA
> > configuration found...Faking a node.'
That looks like a firmware bug, or maybe it's a feature and there is a knob in
the firmware to turn it on/off in case the OS in use doesn't like it for some reason.


> > So if real hardware hasn't got a NUMA node, what's the real problem?
> 
> I don't see any problem once we revert all these changes(patch 1-7).
> We don't need if (epyc && !numa) error check or auto_enable_numa=true
> unconditionally.

We need the revert to unbreak migration from QEMU < 5.0;
everything else (fixes for CPUID_Fn8000001E) could go on top.

So what goes on top (the old code also wasn't correct when
CPUID_Fn8000001E is taken into account, that's why we are at this point):

When starting QEMU without -numa, we can indeed skip the
"if (epyc && !numa) error check or auto_enable_numa=true"
in the case where there is 1 die (NPS1).
(1) The user may however set a core/thread count bigger than the spec allows,
    in which case CPUID_Fn8000001E_EBX/CPUID_Fn8000001E_ECX will not be
    valid spec-wise and could trigger oopses in the guest kernel.
    Given that we allow going out of spec, perhaps we should add a warning at
    realize time saying that the -smp config used is not supported, since it
    doesn't match the AMD EPYC spec and might not work.

(2) Earlier we agreed that we can reuse the existing die_id instead of the
    internal one (topo_ids.node_id in current code).
    (The spec calls it DIE_ID and NODE ID interchangeably.)
    Same as (1): add a warning when '-smp dies' goes beyond spec limits.

(3) "-smp dies>1": ''if'' we allow running it without -numa,
    then the system-wide NUMA node id in CPUID_Fn8000001E_ECX probably doesn't
    matter. It could be something like what the spec describes, but taking the
    die offset into account to produce a unique id.

    Same here: add a warning that there is more than 1 die but numa is not
    enabled, and suggest enabling numa.

    The current code produces an invalid APIC ID for a valid '-smp' combination;
    however, if we revert it and switch to die_id, then it should produce
    a valid APIC ID once again (as in 4.2).
    Given that it produces an invalid APIC ID, maybe we should just ditch this
    case and fold it into (4) (i.e. require -numa if "-smp dies>1").

(4) -numa is used (RHBZ1728166):
    we need to ensure that socket*dies == ms->numa_state->num_nodes
    and make sure that CPUID_Fn8000001E_ECX is consistent with the
    cpu mapping provided with the "-numa cpu=" option.

Warnings won't help a lot, but at least they will point at a
possible problem when someone complains.
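
A rough sketch of the checks in (3) and (4), to make the cases concrete.
This is not existing QEMU code: the enum, the function name and the plain
integer parameters are all made up for illustration, and num_nodes == 0 is
assumed to mean "started without -numa".

```c
/* Hypothetical result codes, not part of QEMU. */
typedef enum {
    EPYC_TOPO_OK,            /* nothing to complain about */
    EPYC_TOPO_NEED_NUMA,     /* case (3): dies > 1 but no -numa given */
    EPYC_TOPO_NODE_MISMATCH  /* case (4): sockets * dies != number of nodes */
} EpycTopoCheck;

/*
 * sockets   : value of "-smp sockets="
 * dies      : value of "-smp dies=" (dies per socket)
 * num_nodes : number of -numa nodes, 0 if -numa was not used (assumption)
 */
static EpycTopoCheck epyc_check_topology(unsigned sockets, unsigned dies,
                                         unsigned num_nodes)
{
    if (num_nodes == 0) {
        /* Without -numa only the single-die case is unproblematic. */
        return dies > 1 ? EPYC_TOPO_NEED_NUMA : EPYC_TOPO_OK;
    }
    if (sockets * dies != num_nodes) {
        return EPYC_TOPO_NODE_MISMATCH;
    }
    return EPYC_TOPO_OK;
}
```

Whether a non-OK result becomes a warning or a hard error is exactly the
open question in this thread; the sketch only classifies the config.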

> > 
> > Dave
> > 
> > > > > > patch on top of this series to enfoce that, i.e:
> > > > > >
> > > > > >  if (epyc && !numa)
> > > > > >     error("EPYC cpu requires numa to be configured")
> > > > >
> > > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > > real world QEMU deployments. That is way too user hostile to
> > > > > introduce as a requirement.
> > > > >
> > > > > Why do we need to force this ?  People have been successfuly using
> > > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > >
> > > > > It might not match behaviour of bare metal silicon, but that
> > > > > hasn't obviously caused the world to come crashing down.
> > > > So far it produces warning in linux kernel (RHBZ1728166), (resulting
> > > > performance might be suboptimal), but I haven't seen anyone reporting
> > crashes yet.
> > > >
> > > >
> > > > What other options do we have?
> > > > Perhaps we can turn on strict check for new machine types only, so
> > > > old configs can keep broken topology (CPUID), while new ones would
> > > > require -numa and produce correct topology.
> > > >
> > > >
> > > > >
> > > > > Regards,
> > > > > Daniel
> > >
> > >
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 




* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-27 19:07               ` Eduardo Habkost
@ 2020-08-27 20:55                 ` Igor Mammedov
  2020-08-28  8:55                   ` Daniel P. Berrangé
  0 siblings, 1 reply; 51+ messages in thread
From: Igor Mammedov @ 2020-08-27 20:55 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Daniel P. Berrangé,
	mst, Michal Privoznik, qemu-devel, Babu Moger, pbonzini, rth

On Thu, 27 Aug 2020 15:07:52 -0400
Eduardo Habkost <ehabkost@redhat.com> wrote:

> On Thu, Aug 27, 2020 at 07:03:14PM +0200, Igor Mammedov wrote:
> > On Wed, 26 Aug 2020 16:03:40 +0100
> > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > 
> > > On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> > > > On Wed, 26 Aug 2020 14:36:38 +0100
> > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > 
> > > > > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > >   
> > > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > > > > Babu Moger <babu.moger@amd.com> wrote:
> > > > > > > >     
> > > > > > > > > To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > > > > > > But, EPYC mode decode is running into problems. Also it can become quite a
> > > > > > > > > maintenance problem in the future. So, it was decided to remove that code and
> > > > > > > > > use the generic decode which works for majority of the topology. Most of the
> > > > > > > > > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > > > > > > it will create some sub-optimal configuration.
> > > > > > > > > Here is the discussion thread.
> > > > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > > > 
> > > > > > > > > This series removes all the EPYC mode specific apicid changes and use the generic
> > > > > > > > > apicid decode.    
> > > > > > > > 
> > > > > > > > the main difference between EPYC and all other CPUs is that
> > > > > > > > it requires numa configuration (it's not optional)
> > > > > > > > so we need an extra patch on top of this series to enfoce that, i.e:
> > > > > > > > 
> > > > > > > >  if (epyc && !numa) 
> > > > > > > >     error("EPYC cpu requires numa to be configured")    
> > > > > > > 
> > > > > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > > > > real world QEMU deployments. That is way too user hostile to introduce
> > > > > > > as a requirement.
> > > > > > > 
> > > > > > > Why do we need to force this ?  People have been successfuly using
> > > > > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > > > > 
> > > > > > > It might not match behaviour of bare metal silicon, but that hasn't
> > > > > > > obviously caused the world to come crashing down.  
> > > > > > So far it produces warning in linux kernel (RHBZ1728166),
> > > > > > (resulting performance might be suboptimal), but I haven't seen
> > > > > > anyone reporting crashes yet.
> > > > > > 
> > > > > > 
> > > > > > What other options do we have?
> > > > > > Perhaps we can turn on strict check for new machine types only,
> > > > > > so old configs can keep broken topology (CPUID),
> > > > > > while new ones would require -numa and produce correct topology.  
> > > > > 
> > > > > No, tieing this to machine types is not viable either. That is still
> > > > > going to break essentially every single management application that
> > > > > exists today using QEMU.
> > > > for that we have deprecation process, so users could switch to new CLI
> > > > that would be required.
> > > 
> > > We could, but I don't find the cost/benefit tradeoff is compelling.
> > > 
> > > There are so many places where we diverge from what bare metal would
> > > do, that I don't see a good reason to introduce this breakage, even
> > > if we notify users via a deprecation message. 
> > I find (3) and (4) good enough reasons to use deprecation.
> > 
> > > If QEMU wants to require NUMA for EPYC, then QEMU could internally
> > > create a single NUMA node if none was specified for new machine
> > > types, such that there is no visible change or breakage to any
> > > mgmt apps.  
> > 
> > (1) for configs that started without -numa &&|| without -smp dies>1,
> >       QEMU can do just that (enable auto_enable_numa).
> 
> Why exactly do we need auto_enable_numa with dies=1?
> 
> If I understand correctly, Babu said earlier in this thread[1]
> that we don't need auto_enable_numa.
> 
> [1] https://lore.kernel.org/qemu-devel/11489e5f-2285-ddb4-9c35-c9f522d603a0@amd.com/

In the case of 1 die, -numa is not a must-have, as there is only one numa node.
Having auto_enable_numa, though, would allow reusing the CPU.node-id property
to compose CPUID_Fn8000001E_ECX, i.e. only one code path instead of numa|non-numa variants.

 
> > (2) As for configs that are out of spec, I do not care much (junk in - junk out)
> > (though not having to spend time on bug reports and debug issues, just to say
> > it's not supported in the end, makes deprecation sound like a reasonable
> > choice)
> > 
> > (3) However if config matches bare metal i.e. CPU has more than 1 die and within
> > dies limits (spec wise), QEMU has to produce valid CPUs.
> > In this case QEMU can't make up multiple numa nodes and mappings of RAM/CPUs
> > on user's behalf. That's where we have to error out and ask for explicit
> > numa configuration.
> > 
> > For such configs, current code (since 5.0), will produce in the best case
> > performance issues  due to mismatching data in APICID, CPUID and ACPI tables,
> > in the worst case issues might be related to invalid APIC ID if running on EPYC host
> > and HW takes in account subfields of APIC ID (according to Babu real CPU uses
> > die_id(aka node_id) internally).
> > I'd rather error out on nonsense configs earlier than debug such issues
> > and than error out anyways later (upsetting more users).
> > 
> 
> The requirements are not clear to me.  Is this just about making
> CPU die_id match the NUMA node ID, or are there additional
> constraints?
die_id is a per-socket numa node index, so it's not a numa node id in
the sense we use it in qemu
(that's where all the confusion that led to the current code started).

I understood that each die in an EPYC chip is a numa node, which encodes
the NUMA node ID (system-wide) in CPUID_Fn8000001E_ECX; that's why I
wrote earlier that EPYC makes -numa non-optional.

In the case of only one die, we can either use auto_enable_numa to ensure
that we have consistent code, or special-case it and just hardcode the
CPUID_Fn8000001E_ECX value, which is hackish but will let us avoid
enabling numa (explicitly or implicitly).

In the case of multiple dies, CPUID_Fn8000001E_ECX (which encodes the number
of nodes + the system-wide numa node id, looking at the CPUID of a real EPYC
machine) shall match the -numa mapping (otherwise it's a bug where CPUID and
ACPI mismatch).
Here we can go two ways:
  1) ask the user to provide a sane config with -numa (I'd prefer that)
     and use that info to fill in CPUID_Fn8000001E_ECX
  2) pretend that it's a non-numa machine, skip the ACPI SRAT table
     but make up CPUID_Fn8000001E (i.e. another special case)
     (requires another code path in addition to the -numa one)



> 
> 
> > (4)
> > If I were non hobby user, I'd hate if QEMU allowed me to start invalid config,
> > that I'd have to spend time on debugging issues (including performance ones),
> > instead of clearly telling me what's wrong and how config should be corrected.
> > I'd probably jump to another hypervisor that does the job right,
> > instead of digging into QEMU codebase and CPU specs to figure out how
> > to hack and configure it.
> > 
> 




* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-26 14:10             ` Dr. David Alan Gilbert
@ 2020-08-27 21:19               ` Igor Mammedov
  2020-08-27 22:58                 ` Babu Moger
  2020-08-28  8:48                 ` Dr. David Alan Gilbert
  0 siblings, 2 replies; 51+ messages in thread
From: Igor Mammedov @ 2020-08-27 21:19 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: ehabkost, mst, qemu-devel, Babu Moger, pbonzini, rth

On Wed, 26 Aug 2020 15:10:46 +0100
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:

> * Igor Mammedov (imammedo@redhat.com) wrote:
> > On Tue, 25 Aug 2020 16:25:21 +0100
> > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > 
> > > * Igor Mammedov (imammedo@redhat.com) wrote:
> > > > On Tue, 25 Aug 2020 09:15:04 +0100
> > > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > > >   
> > > > > * Babu Moger (babu.moger@amd.com) wrote:  
> > > > > > Hi Dave,
> > > > > > 
> > > > > > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:    
> > > > > > > * Babu Moger (babu.moger@amd.com) wrote:    
> > > > > > >> To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > > > >> But, EPYC mode decode is running into problems. Also it can become quite a
> > > > > > >> maintenance problem in the future. So, it was decided to remove that code and
> > > > > > >> use the generic decode which works for majority of the topology. Most of the
> > > > > > >> SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > > > >> it will create some sub-optimal configuration.
> > > > > > >> Here is the discussion thread.
> > > > > > > >> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > >>
> > > > > > >> This series removes all the EPYC mode specific apicid changes and use the generic
> > > > > > >> apicid decode.    
> > > > > > > 
> > > > > > > Hi Babu,
> > > > > > >   This does simplify things a lot!
> > > > > > > One worry, what happens about a live migration of a VM from an old qemu
> > > > > > > that was using the node-id to a qemu with this new scheme?    
> > > > > > 
> > > > > > The node_id which we introduced was only used internally. This wasn't
> > > > > > exposed outside. I don't think live migration will be an issue.    
> > > > > 
> > > > > Didn't it become part of the APIC ID visible to the guest?  
> > > > 
> > > > Daniel asked similar question wrt hard error on start up,
> > > > when CLI is not sufficient to create EPYC cpu.
> > > > 
> > > > https://www.mail-archive.com/qemu-devel@nongnu.org/msg728536.html
> > > > 
> > > > Migration might fall into the same category.
> > > > Also looking at the history, 5.0 commit 
> > > >   247b18c593ec29 target/i386: Enable new apic id encoding for EPYC based cpus models
> > > > silently broke APIC ID (without versioning), for all EPYC models (that's were 1 new and 1 old one).
> > > > 
> > > > (I'm not aware of somebody complaining about it)
> > > > 
> > > > Another commit ed78467a21459, changed CPUID_8000_001E without versioning as well.
> > > > 
> > > > 
> > > > With current EPYC apicid code, if all starts align (no numa or 1 numa node only on
> > > > CLI and no -smp dies=) it might produce a valid CPU (apicid+CPUID_8000_001E).
> > > > No numa is gray area, since EPYC spec implies that it has to be numa machine in case of real EPYC cpus.
> > > > Multi-node configs would be correct only if user assigns cpus to numa nodes
> > > > by duplicating internal node_id algorithm that this series removes.
> > > > 
> > > > There might be other broken cases that I don't recall anymore
> > > > (should be mentioned in previous versions of this series)
> > > > 
> > > > 
> > > > To summarize from migration pov (ignoring ed78467a21459 change):
> > > > 
> > > >  1) old qemu pre-5.0 ==>  qemu 5.0, 5.1 - broken migration  
> > > 
> > > Oh ....
> > > 
> > > >  2) with this series (lets call it qemu 5.2)
> > > >      pre-5.0 ==> qemu 5.2 - should work as series basically rollbacks current code to pre-5.0
> > > >      qemu 5.0, 5.1 ==> qemu 5.2 - broken
> > > > 
> > > > It's all about picking which poison to choose,
> > > > I'd preffer 2nd case as it lets drop a lot of complicated code that
> > > > doesn't work as expected.  
> > > 
> > > I think that would make our lives easier for other reasons; so I'm happy
> > > to go with that.
> > 
> > to make things less painful for users, me wonders if there is a way
> > to block migration if epyc and specific QEMU versions are used?
> 
> We have no way to block based on version - and that's a pretty painful
> thing to do; we can block based on machine type.
> 
> But before we get there; can we understand in which combinations that
> things break and why exactly - would it break on a 1 or 2 vCPU guest -
> or would it only break when we get to the point the upper bits start
> being used for example?  Why exaclty would it break - i.e. is it going
> to change the name of sections in the migration stream - or are the
> values we need actually going to migrate OK?

It's the values of the APIC ID: QEMU 4.2 and 5.0 use different values
if numa is enabled.
I'd expect the guest to be very confused when this happens.

here is an example:
qemu-4.2 -cpu EPYC -smp 8,sockets=1,cores=8 -numa node,cpus=0-3 -numa node,cpus=4-7

(QEMU) qom-get path=/machine/unattached/device[8] property=apic-id
{
    "return": 7
}

vs

qemu-5.1 -cpu EPYC -smp 8,sockets=1,cores=8 -numa node,cpus=0-3 -numa node,cpus=4-7
(QEMU) qom-get path=/machine/unattached/device[8] property=apic-id
{
    "return": 15
}

We probably can't do anything based on machine type versions, as
4.2 and older machine versions, when run on qemu-5.0 and newer, use a different
algorithm to calculate the apic-id.

Hence the suggestion to leave 5.0/5.1 with the broken apic id and revert to the
4.2 algorithm, which should encode the APIC ID correctly when '-smp dies' is used.
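
To make the two numbers above easier to follow, here is a minimal sketch of
the generic packing of topology ids into an APIC ID (smt, core, die, package,
lowest bits first). It only mirrors the idea: the function names and the
plain integer parameters are made up for the example and are not the QEMU
helpers.

```c
#include <stdint.h>

/* Bits needed to represent ids 0..count-1 (0 bits when count <= 1). */
static unsigned bits_for_count(unsigned count)
{
    unsigned bits = 0;

    while ((1u << bits) < count) {
        bits++;
    }
    return bits;
}

/*
 * Generic packing: smt_id in the low bits, then core_id, die_id and
 * pkg_id, each field as wide as the configured count requires.
 */
static uint32_t apicid_generic(unsigned dies_per_pkg, unsigned cores_per_die,
                               unsigned threads_per_core,
                               unsigned pkg_id, unsigned die_id,
                               unsigned core_id, unsigned smt_id)
{
    unsigned core_offset = bits_for_count(threads_per_core);
    unsigned die_offset = core_offset + bits_for_count(cores_per_die);
    unsigned pkg_offset = die_offset + bits_for_count(dies_per_pkg);

    return (pkg_id << pkg_offset) | (die_id << die_offset) |
           (core_id << core_offset) | smt_id;
}
```

With -smp 8,sockets=1,cores=8 the last vCPU has core_id 7, and
apicid_generic(1, 8, 1, 0, 0, 7, 0) gives 7, matching the 4.2 output.
The 15 reported by 5.1 is 7 with bit 3 additionally set, which would be
consistent with an extra node field above the core bits; that is only a
reading of the two numbers, not a statement about the code being removed.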


> Dave
> 
> 
> > > > PS:
> > > >  I didn't review it yet, but with this series we aren't
> > > >  making up internal node_ids that should match user provided numa node ids somehow.
> > > >  It seems series lost the patch that would enforce numa in case -smp dies>1,
> > > >  but otherwise it heads in the right direction.  
> > > 
> > > Dave
> > > 
> > > > > 
> > > > > Dave
> > > > >   
> > > >   
> > 




* RE: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-27 21:19               ` Igor Mammedov
@ 2020-08-27 22:58                 ` Babu Moger
  2020-08-28  8:42                   ` Igor Mammedov
  2020-08-28  8:48                 ` Dr. David Alan Gilbert
  1 sibling, 1 reply; 51+ messages in thread
From: Babu Moger @ 2020-08-27 22:58 UTC (permalink / raw)
  To: Igor Mammedov, Dr. David Alan Gilbert
  Cc: qemu-devel, pbonzini, rth, ehabkost, mst



> -----Original Message-----
> From: Igor Mammedov <imammedo@redhat.com>
> Sent: Thursday, August 27, 2020 4:19 PM
> To: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Cc: ehabkost@redhat.com; mst@redhat.com; qemu-devel@nongnu.org;
> Moger, Babu <Babu.Moger@amd.com>; pbonzini@redhat.com;
> rth@twiddle.net
> Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> generic decode
> 
> On Wed, 26 Aug 2020 15:10:46 +0100
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> 
> > * Igor Mammedov (imammedo@redhat.com) wrote:
> > > On Tue, 25 Aug 2020 16:25:21 +0100
> > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > >
> > > > * Igor Mammedov (imammedo@redhat.com) wrote:
> > > > > On Tue, 25 Aug 2020 09:15:04 +0100 "Dr. David Alan Gilbert"
> > > > > <dgilbert@redhat.com> wrote:
> > > > >
> > > > > > * Babu Moger (babu.moger@amd.com) wrote:
> > > > > > > Hi Dave,
> > > > > > >
> > > > > > > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:
> > > > > > > > * Babu Moger (babu.moger@amd.com) wrote:
> > > > > > > > >> To support some of the complex topology, we introduced EPYC
> > > > > > > > >> mode apicid decode.
> > > > > > > > >> But, EPYC mode decode is running into problems. Also it
> > > > > > > > >> can become quite a maintenance problem in the future. So,
> > > > > > > > >> it was decided to remove that code and use the generic
> > > > > > > > >> decode which works for majority of the topology. Most of
> > > > > > > > >> the SPECed configuration would work just fine. With some
> > > > > > > > >> non-SPECed user inputs, it will create some sub-optimal configuration.
> > > > > > > > >> Here is the discussion thread.
> > > > > > > > >> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > > >>
> > > > > > > > >> This series removes all the EPYC mode specific apicid changes
> > > > > > > > >> and use the generic apicid decode.
> > > > > > > >
> > > > > > > > Hi Babu,
> > > > > > > >   This does simplify things a lot!
> > > > > > > > One worry, what happens about a live migration of a VM from
> an old qemu
> > > > > > > > that was using the node-id to a qemu with this new scheme?
> > > > > > >
> > > > > > > The node_id which we introduced was only used internally. This
> wasn't
> > > > > > > exposed outside. I don't think live migration will be an issue.
> > > > > >
> > > > > > Didn't it become part of the APIC ID visible to the guest?
> > > > >
> > > > > Daniel asked similar question wrt hard error on start up, when
> > > > > CLI is not sufficient to create EPYC cpu.
> > > > >
> > > > >
> > > > > > https://www.mail-archive.com/qemu-devel@nongnu.org/msg728536.html
> > > > >
> > > > > Migration might fall into the same category.
> > > > > Also looking at the history, 5.0 commit
> > > > >   247b18c593ec29 target/i386: Enable new apic id encoding for
> > > > > EPYC based cpus models silently broke APIC ID (without versioning),
> for all EPYC models (that's were 1 new and 1 old one).
> > > > >
> > > > > (I'm not aware of somebody complaining about it)
> > > > >
> > > > > Another commit ed78467a21459, changed CPUID_8000_001E without
> versioning as well.
> > > > >
> > > > >
> > > > > With current EPYC apicid code, if all starts align (no numa or 1
> > > > > numa node only on CLI and no -smp dies=) it might produce a valid
> CPU (apicid+CPUID_8000_001E).
> > > > > No numa is gray area, since EPYC spec implies that it has to be numa
> machine in case of real EPYC cpus.
> > > > > Multi-node configs would be correct only if user assigns cpus to
> > > > > numa nodes by duplicating internal node_id algorithm that this series
> removes.
> > > > >
> > > > > There might be other broken cases that I don't recall anymore
> > > > > (should be mentioned in previous versions of this series)
> > > > >
> > > > >
> > > > > To summarize from migration pov (ignoring ed78467a21459 change):
> > > > >
> > > > >  1) old qemu pre-5.0 ==>  qemu 5.0, 5.1 - broken migration
> > > >
> > > > Oh ....
> > > >
> > > > >  2) with this series (lets call it qemu 5.2)
> > > > >      pre-5.0 ==> qemu 5.2 - should work as series basically rollbacks
> current code to pre-5.0
> > > > >      qemu 5.0, 5.1 ==> qemu 5.2 - broken
> > > > >
> > > > > It's all about picking which poison to choose, I'd preffer 2nd
> > > > > case as it lets drop a lot of complicated code that doesn't work
> > > > > as expected.
> > > >
> > > > I think that would make our lives easier for other reasons; so I'm
> > > > happy to go with that.
> > >
> > > to make things less painful for users, me wonders if there is a way
> > > to block migration if epyc and specific QEMU versions are used?
> >
> > We have no way to block based on version - and that's a pretty painful
> > thing to do; we can block based on machine type.
> >
> > But before we get there; can we understand in which combinations that
> > things break and why exactly - would it break on a 1 or 2 vCPU guest -
> > or would it only break when we get to the point the upper bits start
> > being used for example?  Why exaclty would it break - i.e. is it going
> > to change the name of sections in the migration stream - or are the
> > values we need actually going to migrate OK?
> 
> it's values of APIC ID, where 4.2 and 5.0 QEMU use different values if numa is
> enabled.
> I'd expect guest to be very confused in when this happens.
> 
> here is an example:
> qemu-4.2 -cpu EPYC -smp 8,sockets=1,cores=8 -numa node,cpus=0-3 -numa node,cpus=4-7
> 
> (QEMU) qom-get path=/machine/unattached/device[8] property=apic-id
> {
>     "return": 7
> }
> 
> vs
> 
> qemu-5.1 -cpu EPYC -smp 8,sockets=1,cores=8 -numa node,cpus=0-3 -numa node,cpus=4-7
> (QEMU) qom-get path=/machine/unattached/device[8] property=apic-id
> {
>     "return": 15
> }
> 
> we probably can't do anything based on machine type versions, as
> 4.2 and older versions on qemu-5.0 and newer use different algorithm to
> calculate apic-id.
> 
> Hence was suggestion to leave 5.0/5.1 with broken apic id and revert back to
> 4.2 algorithm, which should encode APIC ID correctly when '-smp dies' is
> used.

That is correct. When we revert all the node_id related changes, we will
go back to the 4.2 algorithm. It will work fine with the user passing "-smp
dies=n". It also keeps the code simple. That is why I kept the decoding of
0x8000001e as shown below. This will also match the apicid decoding.

*ecx = ((topo_info->dies_per_pkg - 1) << 8) |
       ((cpu->apic_id >> apicid_die_offset(topo_info)) & 0xFF);
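
For anyone who wants to try the numbers outside QEMU, the same expression as
a standalone sketch. The function name and the plain integer parameters are
hypothetical stand-ins for topo_info->dies_per_pkg, cpu->apic_id and
apicid_die_offset(topo_info); the field meanings in the comments follow the
description earlier in this thread (number of nodes + node id).

```c
#include <stdint.h>

/*
 * dies_per_pkg : dies per socket, as given with "-smp dies="
 * apic_id      : APIC ID of the vCPU
 * die_offset   : bit position of the die field inside the APIC ID
 */
static uint32_t cpuid_8000001e_ecx(unsigned dies_per_pkg, uint32_t apic_id,
                                   unsigned die_offset)
{
    /* Bits 8 and up: number of nodes (dies) per package, minus one. */
    uint32_t nodes_field = (uint32_t)(dies_per_pkg - 1) << 8;
    /* Bits 7:0: everything above the core bits, i.e. package and die id. */
    uint32_t node_id = (apic_id >> die_offset) & 0xFF;

    return nodes_field | node_id;
}
```

For example, with dies=2, 4 cores per die and 1 thread per core the die
field starts at bit 2, so APIC ID 7 gives 0x101 (2 nodes, node id 1) and
APIC ID 12 (socket 1, die 1) gives 0x103. Because the package bits are
included in the shifted value, the node id comes out unique system-wide.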


It is still not clear whether we need to add a warning when numa nodes != dies.
I am worried about adding that check and then removing it again later.

What about auto_enable_numa? Do we still need it?

I can send the patches tomorrow if these things are clarified.
Thanks

> 
> 
> > Dave
> >
> >
> > > > > PS:
> > > > >  I didn't review it yet, but with this series we aren't  making
> > > > > up internal node_ids that should match user provided numa node ids
> somehow.
> > > > >  It seems series lost the patch that would enforce numa in case
> > > > > -smp dies>1,  but otherwise it heads in the right direction.
> > > >
> > > > Dave
> > > >
> > > > > >
> > > > > > Dave
> > > > > >
> > > > >
> > >



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-27 22:58                 ` Babu Moger
@ 2020-08-28  8:42                   ` Igor Mammedov
  2020-08-28 14:22                     ` Babu Moger
  0 siblings, 1 reply; 51+ messages in thread
From: Igor Mammedov @ 2020-08-28  8:42 UTC (permalink / raw)
  To: Babu Moger
  Cc: ehabkost, mst, qemu-devel, Dr. David Alan Gilbert, pbonzini, rth

On Thu, 27 Aug 2020 17:58:01 -0500
Babu Moger <babu.moger@amd.com> wrote:

> > -----Original Message-----
> > From: Igor Mammedov <imammedo@redhat.com>
> > Sent: Thursday, August 27, 2020 4:19 PM
> > To: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > Cc: ehabkost@redhat.com; mst@redhat.com; qemu-devel@nongnu.org;
> > Moger, Babu <Babu.Moger@amd.com>; pbonzini@redhat.com;
> > rth@twiddle.net
> > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> > generic decode
> > 
> > On Wed, 26 Aug 2020 15:10:46 +0100
> > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> >   
> > > * Igor Mammedov (imammedo@redhat.com) wrote:  
> > > > On Tue, 25 Aug 2020 16:25:21 +0100
> > > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > > >  
> > > > > * Igor Mammedov (imammedo@redhat.com) wrote:  
> > > > > > On Tue, 25 Aug 2020 09:15:04 +0100 "Dr. David Alan Gilbert"
> > > > > > <dgilbert@redhat.com> wrote:
> > > > > >  
> > > > > > > * Babu Moger (babu.moger@amd.com) wrote:  
> > > > > > > > Hi Dave,
> > > > > > > >
> > > > > > > > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:  
> > > > > > > > > * Babu Moger (babu.moger@amd.com) wrote:  
> > > > > > > > > >> To support some of the complex topology, we introduced EPYC
> > > > > > > > > >> mode apicid decode.
> > > > > > > > > >> But, EPYC mode decode is running into problems. Also it
> > > > > > > > > >> can become quite a maintenance problem in the future. So,
> > > > > > > > > >> it was decided to remove that code and use the generic
> > > > > > > > > >> decode which works for majority of the topology. Most of
> > > > > > > > > >> the SPECed configuration would work just fine. With some
> > > > > > > > > >> non-SPECed user inputs, it will create some sub-optimal
> > > > > > > > > >> configuration.
> > > > > > > > > >> Here is the discussion thread.
> > > > > > > > > >> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > > > >>
> > > > > > > > > >> This series removes all the EPYC mode specific apicid changes
> > > > > > > > > >> and use the generic apicid decode.
> > > > > > > > >
> > > > > > > > > Hi Babu,
> > > > > > > > >   This does simplify things a lot!
> > > > > > > > > > One worry, what happens about a live migration of a VM from an old qemu
> > > > > > > > > > that was using the node-id to a qemu with this new scheme?
> > > > > > > > >
> > > > > > > > > The node_id which we introduced was only used internally. This wasn't
> > > > > > > > > exposed outside. I don't think live migration will be an issue.
> > > > > > >
> > > > > > > Didn't it become part of the APIC ID visible to the guest?  
> > > > > >
> > > > > > Daniel asked similar question wrt hard error on start up, when
> > > > > > CLI is not sufficient to create EPYC cpu.
> > > > > >
> > > > > >  
> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%  
> > > > > > 2Fwww.mail-archive.com%2Fqemu-  
> > devel%40nongnu.org%2Fmsg728536.htm  
> > > > > >  
> > l&amp;data=02%7C01%7Cbabu.moger%40amd.com%7C9b15ee395daa49356
> > 404  
> > > > > >  
> > 08d84acedf13%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C63734
> > 1  
> > > > > >  
> > 599663177545&amp;sdata=OnHz23W4F4TdYwlxPZwC%2B8YRY1K3qJ5U9Sfdo
> > Oc  
> > > > > > GXtw%3D&amp;reserved=0
> > > > > >
> > > > > > Migration might fall into the same category.
> > > > > > > Also looking at the history, 5.0 commit
> > > > > > >   247b18c593ec29 target/i386: Enable new apic id encoding for EPYC based cpus models
> > > > > > > silently broke APIC ID (without versioning), for all EPYC models (that's were 1 new and 1 old one).
> > > > > >
> > > > > > (I'm not aware of somebody complaining about it)
> > > > > >
> > > > > > > Another commit ed78467a21459, changed CPUID_8000_001E without versioning as well.
> > > > > >
> > > > > >
> > > > > > > With current EPYC apicid code, if all stars align (no numa or 1
> > > > > > > numa node only on CLI and no -smp dies=) it might produce a valid
> > > > > > > CPU (apicid+CPUID_8000_001E).
> > > > > > > No numa is gray area, since EPYC spec implies that it has to be numa
> > > > > > > machine in case of real EPYC cpus.
> > > > > > > Multi-node configs would be correct only if user assigns cpus to
> > > > > > > numa nodes by duplicating internal node_id algorithm that this series
> > > > > > > removes.
> > > > > >
> > > > > > There might be other broken cases that I don't recall anymore
> > > > > > (should be mentioned in previous versions of this series)
> > > > > >
> > > > > >
> > > > > > To summarize from migration pov (ignoring ed78467a21459 change):
> > > > > >
> > > > > >  1) old qemu pre-5.0 ==>  qemu 5.0, 5.1 - broken migration  
> > > > >
> > > > > Oh ....
> > > > >  
> > > > > >  2) with this series (lets call it qemu 5.2)
> > > > > > >      pre-5.0 ==> qemu 5.2 - should work as series basically rollbacks current code to pre-5.0
> > > > > >      qemu 5.0, 5.1 ==> qemu 5.2 - broken
> > > > > >
> > > > > > > It's all about picking which poison to choose, I'd prefer 2nd
> > > > > > case as it lets drop a lot of complicated code that doesn't work
> > > > > > as expected.  
> > > > >
> > > > > I think that would make our lives easier for other reasons; so I'm
> > > > > happy to go with that.  
> > > >
> > > > to make things less painful for users, me wonders if there is a way
> > > > to block migration if epyc and specific QEMU versions are used?  
> > >
> > > We have no way to block based on version - and that's a pretty painful
> > > thing to do; we can block based on machine type.
> > >
> > > But before we get there; can we understand in which combinations that
> > > things break and why exactly - would it break on a 1 or 2 vCPU guest -
> > > or would it only break when we get to the point the upper bits start
> > > > being used for example?  Why exactly would it break - i.e. is it going
> > > to change the name of sections in the migration stream - or are the
> > > values we need actually going to migrate OK?  
> > 
> > it's values of APIC ID, where 4.2 and 5.0 QEMU use different values if numa is
> > enabled.
> > > I'd expect the guest to be very confused when this happens.
> > 
> > here is an example:
> > qemu-4.2 -cpu EPYC -smp 8,sockets=1,cores=8 -numa node,cpus=0-3 -numa
> > node,cpus=4-7
> > 
> > (QEMU) qom-get path=/machine/unattached/device[8] property=apic-id {
> >     "return": 7
> > }
> > 
> > vs
> > 
> > qemu-5.1 -cpu EPYC -smp 8,sockets=1,cores=8 -numa node,cpus=0-3 -numa
> > node,cpus=4-7
> > (QEMU) qom-get path=/machine/unattached/device[8] property=apic-id {
> >     "return": 15
> > }
> > 
> > we probably can't do anything based on machine type versions, as
> > 4.2 and older versions on qemu-5.0 and newer use different algorithm to
> > calculate apic-id.
> > 
> > Hence was suggestion to leave 5.0/5.1 with broken apic id and revert back to
> > 4.2 algorithm, which should encode APIC ID correctly when '-smp dies' is
> > used.  
> 
> That is correct. When we revert all the node_id related changes, we will
> go back to 4.2 algorithm. It will work fine with user passing "-smp
> dies=n". It also keeps the code simple. That is why I kept the decoding of
> 0x8000001e like this below. This will also match apicid decoding.
> 
> *ecx = ((topo_info->dies_per_pkg - 1) << 8) |  ((cpu->apic_id >>
> apicid_die_offset(topo_info)) & 0xFF);
That will work when there is no -numa on the CLI; when -numa is used,
we should use the node id that the user provided,
like you did in the previous revision
   "[PATCH v4 1/3] i386: Simplify CPUID_8000_001E for AMD"
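
Roughly what I mean, as a standalone sketch rather than the v4 code itself.
NODE_ID_UNASSIGNED and both function names are hypothetical; the fallback
branch is the expression quoted above, and the die offset is passed in so
the example stays self-contained.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical marker meaning "no -numa mapping exists for this CPU". */
#define NODE_ID_UNASSIGNED (-1)

/*
 * Pick the node id reported in the low byte of ECX:
 *  - with -numa: the node id the user assigned to this CPU
 *  - without -numa: the die (and package) bits taken from the APIC ID
 */
static uint32_t ecx_node_id(int numa_node_id, uint32_t apic_id,
                            unsigned die_offset)
{
    assert(die_offset < 32);
    if (numa_node_id != NODE_ID_UNASSIGNED) {
        assert(numa_node_id >= 0);
        return (uint32_t)numa_node_id & 0xFF;
    }
    return (apic_id >> die_offset) & 0xFF;
}

/* Compose ECX in the same shape as the expression quoted above. */
static uint32_t cpuid_8000001e_ecx(unsigned dies_per_pkg, int numa_node_id,
                                   uint32_t apic_id, unsigned die_offset)
{
    return ((dies_per_pkg - 1) << 8) |
           ecx_node_id(numa_node_id, apic_id, die_offset);
}
```

With auto_enable_numa every CPU would have a node id assigned, so only the
first branch would remain.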

> Still not clear if we need to add a warning when numa nodes != dies.
> Worried about adding that check and remove it again later.
Since there is an objection to making it an error, I'd go with a warning for now;
it makes life a bit easier for whoever has to figure out what's wrong.
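
Something along these lines, as a sketch only. The function names and the
message text are made up for illustration, and it covers just the case
where -numa is used (num_numa_nodes == 0 stands for "no -numa on the CLI").

```c
#include <stdbool.h>
#include <stdio.h>

/* True when the NUMA node count is consistent with sockets * dies. */
static bool numa_nodes_match_dies(unsigned sockets, unsigned dies_per_pkg,
                                  unsigned num_numa_nodes)
{
    if (num_numa_nodes == 0) {
        return true;            /* no -numa: nothing to compare against */
    }
    return num_numa_nodes == sockets * dies_per_pkg;
}

/* Warn but keep going; returns true when a warning was printed. */
static bool warn_on_numa_die_mismatch(unsigned sockets, unsigned dies_per_pkg,
                                      unsigned num_numa_nodes)
{
    if (numa_nodes_match_dies(sockets, dies_per_pkg, num_numa_nodes)) {
        return false;
    }
    fprintf(stderr,
            "warning: %u NUMA node(s) configured, but sockets * dies = %u; "
            "guest topology may be inconsistent\n",
            num_numa_nodes, sockets * dies_per_pkg);
    return true;
}
```

For the example from earlier in the thread (sockets=1, no dies=, two -numa
nodes) this would print the warning and still start the guest.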

> What about auto_enable_numa? Do we still need it?
>
> 
> I can send the patches tomorrow if these things are clarified.
> Thanks
With auto_enable_numa it would be cleaner, as you can reuse
the same numa code to set 0x8000001e.ecx instead of hardcoding it as above.

Maybe post the series without auto_enable_numa so we fix the migration
regression ASAP, and then switch to auto_enable_numa on top.


> 
> > 
> >   
> > > Dave
> > >
> > >  
> > > > > > PS:
> > > > > > >  I didn't review it yet, but with this series we aren't  making
> > > > > > > up internal node_ids that should match user provided numa node ids somehow.
> > > > > >  It seems series lost the patch that would enforce numa in case
> > > > > > -smp dies>1,  but otherwise it heads in the right direction.  
> > > > >
> > > > > Dave
> > > > >  
> > > > > > >
> > > > > > > Dave
> > > > > > >  
> > > > > >  
> > > >  
> 



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-27 21:19               ` Igor Mammedov
  2020-08-27 22:58                 ` Babu Moger
@ 2020-08-28  8:48                 ` Dr. David Alan Gilbert
  2020-08-28 11:36                   ` Igor Mammedov
  1 sibling, 1 reply; 51+ messages in thread
From: Dr. David Alan Gilbert @ 2020-08-28  8:48 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: ehabkost, mst, qemu-devel, Babu Moger, pbonzini, rth

* Igor Mammedov (imammedo@redhat.com) wrote:
> On Wed, 26 Aug 2020 15:10:46 +0100
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> 
> > * Igor Mammedov (imammedo@redhat.com) wrote:
> > > On Tue, 25 Aug 2020 16:25:21 +0100
> > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > > 
> > > > * Igor Mammedov (imammedo@redhat.com) wrote:
> > > > > On Tue, 25 Aug 2020 09:15:04 +0100
> > > > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > > > >   
> > > > > > * Babu Moger (babu.moger@amd.com) wrote:  
> > > > > > > Hi Dave,
> > > > > > > 
> > > > > > > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:    
> > > > > > > > * Babu Moger (babu.moger@amd.com) wrote:    
> > > > > > > >> To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > > > > >> But, EPYC mode decode is running into problems. Also it can become quite a
> > > > > > > >> maintenance problem in the future. So, it was decided to remove that code and
> > > > > > > >> use the generic decode which works for majority of the topology. Most of the
> > > > > > > >> SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > > > > >> it will create some sub-optimal configuration.
> > > > > > > >> Here is the discussion thread.
> > > > > > > > >> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > >>
> > > > > > > >> This series removes all the EPYC mode specific apicid changes and use the generic
> > > > > > > >> apicid decode.    
> > > > > > > > 
> > > > > > > > Hi Babu,
> > > > > > > >   This does simplify things a lot!
> > > > > > > > One worry, what happens about a live migration of a VM from an old qemu
> > > > > > > > that was using the node-id to a qemu with this new scheme?    
> > > > > > > 
> > > > > > > The node_id which we introduced was only used internally. This wasn't
> > > > > > > exposed outside. I don't think live migration will be an issue.    
> > > > > > 
> > > > > > Didn't it become part of the APIC ID visible to the guest?  
> > > > > 
> > > > > Daniel asked similar question wrt hard error on start up,
> > > > > when CLI is not sufficient to create EPYC cpu.
> > > > > 
> > > > > https://www.mail-archive.com/qemu-devel@nongnu.org/msg728536.html
> > > > > 
> > > > > Migration might fall into the same category.
> > > > > Also looking at the history, 5.0 commit 
> > > > >   247b18c593ec29 target/i386: Enable new apic id encoding for EPYC based cpus models
> > > > > silently broke APIC ID (without versioning), for all EPYC models (that's were 1 new and 1 old one).
> > > > > 
> > > > > (I'm not aware of somebody complaining about it)
> > > > > 
> > > > > Another commit ed78467a21459, changed CPUID_8000_001E without versioning as well.
> > > > > 
> > > > > 
> > > > > > With current EPYC apicid code, if all stars align (no numa or 1 numa node only on
> > > > > CLI and no -smp dies=) it might produce a valid CPU (apicid+CPUID_8000_001E).
> > > > > No numa is gray area, since EPYC spec implies that it has to be numa machine in case of real EPYC cpus.
> > > > > Multi-node configs would be correct only if user assigns cpus to numa nodes
> > > > > by duplicating internal node_id algorithm that this series removes.
> > > > > 
> > > > > There might be other broken cases that I don't recall anymore
> > > > > (should be mentioned in previous versions of this series)
> > > > > 
> > > > > 
> > > > > To summarize from migration pov (ignoring ed78467a21459 change):
> > > > > 
> > > > >  1) old qemu pre-5.0 ==>  qemu 5.0, 5.1 - broken migration  
> > > > 
> > > > Oh ....
> > > > 
> > > > >  2) with this series (lets call it qemu 5.2)
> > > > >      pre-5.0 ==> qemu 5.2 - should work as series basically rollbacks current code to pre-5.0
> > > > >      qemu 5.0, 5.1 ==> qemu 5.2 - broken
> > > > > 
> > > > > It's all about picking which poison to choose,
> > > > > > I'd prefer 2nd case as it lets drop a lot of complicated code that
> > > > > doesn't work as expected.  
> > > > 
> > > > I think that would make our lives easier for other reasons; so I'm happy
> > > > to go with that.
> > > 
> > > to make things less painful for users, me wonders if there is a way
> > > to block migration if epyc and specific QEMU versions are used?
> > 
> > We have no way to block based on version - and that's a pretty painful
> > thing to do; we can block based on machine type.
> > 
> > But before we get there; can we understand in which combinations that
> > things break and why exactly - would it break on a 1 or 2 vCPU guest -
> > or would it only break when we get to the point the upper bits start
> > > being used for example?  Why exactly would it break - i.e. is it going
> > to change the name of sections in the migration stream - or are the
> > values we need actually going to migrate OK?
> 
> it's values of APIC ID, where 4.2 and 5.0 QEMU use different values
> if numa is enabled.
> > I'd expect the guest to be very confused when this happens.
> 
> here is an example:
> qemu-4.2 -cpu EPYC -smp 8,sockets=1,cores=8 -numa node,cpus=0-3 -numa node,cpus=4-7

OK, but it'll probably be OK on small VMs with a single NUMA node?
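
To check my understanding, a toy model of the two encodings. This is an
assumption, not the actual QEMU code: it takes one socket, one thread per
core and no dies, and it guesses that the 5.0/5.1 EPYC encoding puts a
node_id field directly above the core bits.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits needed to hold IDs 0..count-1 (0 bits when count == 1). */
static unsigned bitwidth_for_count(unsigned count)
{
    unsigned bits = 0;

    assert(count >= 1 && count <= (1u << 16));
    while ((1u << bits) < count) {
        bits++;
    }
    return bits;
}

/* Generic (4.2-style) encoding for a single socket: just the core id. */
static uint32_t apicid_generic(unsigned cores, unsigned cpu_index)
{
    assert(cpu_index < cores);
    return cpu_index;
}

/* Assumed 5.0/5.1 EPYC-style encoding: node_id sits above the core bits. */
static uint32_t apicid_with_node(unsigned cores, unsigned nodes,
                                 unsigned cpu_index)
{
    unsigned core_bits = bitwidth_for_count(cores);
    unsigned cores_per_node = (cores + nodes - 1) / nodes;
    unsigned node_id = (cpu_index / cores_per_node) % nodes;

    assert(cpu_index < cores);
    return (node_id << core_bits) | cpu_index;
}
```

Under those assumptions vCPU index 7 of 8 cores gets 7 from the generic
encoding and 15 with two nodes, which matches the values quoted here, while
with a single node the node_id field is zero and both encodings agree.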

Dave

> (QEMU) qom-get path=/machine/unattached/device[8] property=apic-id
> {
>     "return": 7
> }
> 
> vs
> 
> qemu-5.1 -cpu EPYC -smp 8,sockets=1,cores=8 -numa node,cpus=0-3 -numa node,cpus=4-7
> (QEMU) qom-get path=/machine/unattached/device[8] property=apic-id
> {
>     "return": 15
> }
> 
> we probably can't do anything based on machine type versions, as
> 4.2 and older versions on qemu-5.0 and newer use different algorithm to calculate apic-id.
> 
> Hence was suggestion to leave 5.0/5.1 with broken apic id and revert back to
> 4.2 algorithm, which should encode APIC ID correctly when '-smp dies' is used. 
> 
> 
> > Dave
> > 
> > 
> > > > > PS:
> > > > >  I didn't review it yet, but with this series we aren't
> > > > >  making up internal node_ids that should match user provided numa node ids somehow.
> > > > >  It seems series lost the patch that would enforce numa in case -smp dies>1,
> > > > >  but otherwise it heads in the right direction.  
> > > > 
> > > > Dave
> > > > 
> > > > > > 
> > > > > > Dave
> > > > > >   
> > > > >   
> > > 
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-27 20:55                 ` Igor Mammedov
@ 2020-08-28  8:55                   ` Daniel P. Berrangé
  2020-08-28 16:29                     ` Eduardo Habkost
  0 siblings, 1 reply; 51+ messages in thread
From: Daniel P. Berrangé @ 2020-08-28  8:55 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Eduardo Habkost, mst, Michal Privoznik, qemu-devel, Babu Moger,
	pbonzini, rth

On Thu, Aug 27, 2020 at 10:55:26PM +0200, Igor Mammedov wrote:
> On Thu, 27 Aug 2020 15:07:52 -0400
> Eduardo Habkost <ehabkost@redhat.com> wrote:
> 
> > On Thu, Aug 27, 2020 at 07:03:14PM +0200, Igor Mammedov wrote:
> > > On Wed, 26 Aug 2020 16:03:40 +0100
> > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > 
> > > > On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> > > > > On Wed, 26 Aug 2020 14:36:38 +0100
> > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > 
> > > > > > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > > > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > > >   
> > > > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > > > > > Babu Moger <babu.moger@amd.com> wrote:
> > > > > > > > >     
> > > > > > > > > > To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > > > > > > > But, EPYC mode decode is running into problems. Also it can become quite a
> > > > > > > > > > maintenance problem in the future. So, it was decided to remove that code and
> > > > > > > > > > use the generic decode which works for majority of the topology. Most of the
> > > > > > > > > > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > > > > > > > it will create some sub-optimal configuration.
> > > > > > > > > > Here is the discussion thread.
> > > > > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > > > > 
> > > > > > > > > > This series removes all the EPYC mode specific apicid changes and use the generic
> > > > > > > > > > apicid decode.    
> > > > > > > > > 
> > > > > > > > > the main difference between EPYC and all other CPUs is that
> > > > > > > > > it requires numa configuration (it's not optional)
> > > > > > > > > > so we need an extra patch on top of this series to enforce that, i.e:
> > > > > > > > > 
> > > > > > > > >  if (epyc && !numa) 
> > > > > > > > >     error("EPYC cpu requires numa to be configured")    
> > > > > > > > 
> > > > > > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > > > > > real world QEMU deployments. That is way too user hostile to introduce
> > > > > > > > as a requirement.
> > > > > > > > 
> > > > > > > > > Why do we need to force this ?  People have been successfully using
> > > > > > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > > > > > 
> > > > > > > > It might not match behaviour of bare metal silicon, but that hasn't
> > > > > > > > obviously caused the world to come crashing down.  
> > > > > > > So far it produces warning in linux kernel (RHBZ1728166),
> > > > > > > (resulting performance might be suboptimal), but I haven't seen
> > > > > > > anyone reporting crashes yet.
> > > > > > > 
> > > > > > > 
> > > > > > > What other options do we have?
> > > > > > > Perhaps we can turn on strict check for new machine types only,
> > > > > > > so old configs can keep broken topology (CPUID),
> > > > > > > while new ones would require -numa and produce correct topology.  
> > > > > > 
> > > > > > > No, tying this to machine types is not viable either. That is still
> > > > > > going to break essentially every single management application that
> > > > > > exists today using QEMU.
> > > > > for that we have deprecation process, so users could switch to new CLI
> > > > > that would be required.
> > > > 
> > > > We could, but I don't find the cost/benefit tradeoff is compelling.
> > > > 
> > > > There are so many places where we diverge from what bare metal would
> > > > do, that I don't see a good reason to introduce this breakage, even
> > > > if we notify users via a deprecation message. 
> > > I find (3) and (4) good enough reasons to use deprecation.
> > > 
> > > > If QEMU wants to require NUMA for EPYC, then QEMU could internally
> > > > create a single NUMA node if none was specified for new machine
> > > > types, such that there is no visible change or breakage to any
> > > > mgmt apps.  
> > > 
> > > (1) for configs that started without -numa &&|| without -smp dies>1,
> > >       QEMU can do just that (enable auto_enable_numa).
> > 
> > Why exactly do we need auto_enable_numa with dies=1?
> > 
> > If I understand correctly, Babu said earlier in this thread[1]
> > that we don't need auto_enable_numa.
> > 
> > [1] https://lore.kernel.org/qemu-devel/11489e5f-2285-ddb4-9c35-c9f522d603a0@amd.com/
> 
> in case of 1 die, -numa is not must have as it's one numa node only.
> Though having auto_enable_numa, will allow to reuse the CPU.node-id property
> to compose CPUID_Fn8000001E_ECX. i.e only code one path vs numa|non-numa variant.
> 
>  
> > > (2) As for configs that are out of spec, I do not care much (junk in - junk out)
> > > (though not having to spend time on bug reports and debug issues, just to say
> > > it's not supported in the end, makes deprecation sound like a reasonable
> > > choice)
> > > 
> > > (3) However if config matches bare metal i.e. CPU has more than 1 die and within
> > > dies limits (spec wise), QEMU has to produce valid CPUs.
> > > In this case QEMU can't make up multiple numa nodes and mappings of RAM/CPUs
> > > on user's behalf. That's where we have to error out and ask for explicit
> > > numa configuration.
> > > 
> > > For such configs, current code (since 5.0), will produce in the best case
> > > performance issues  due to mismatching data in APICID, CPUID and ACPI tables,
> > > in the worst case issues might be related to invalid APIC ID if running on EPYC host
> > > and HW takes in account subfields of APIC ID (according to Babu real CPU uses
> > > die_id(aka node_id) internally).
> > > I'd rather error out on nonsense configs earlier than debug such issues
> > > > and then error out anyways later (upsetting more users).
> > > 
> > 
> > The requirements are not clear to me.  Is this just about making
> > CPU die_id match the NUMA node ID, or are there additional
> > constraints?
> die_id is per socket numa node index, so it's not numa node id in
> a sense we use it in qemu
> (that's where all the confusion started that led to current code)
> 
> I understood that each die in EPYC chip is a numa node, which encodes
> NUMA node ID (system wide) in CPUID_Fn8000001E_ECX, that's why I
> wrote earlier that EPYC makes -numa non optional.

AFAIK, that isn't a hard requirement.  On the bare metal EPYC machine I
have used, the BIOS lets you choose whether the dies are exposed as
1, 2 or 4 NUMA nodes. So there's no fixed die == numa node mapping
that I see.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-27 20:21             ` Igor Mammedov
@ 2020-08-28  8:58               ` Daniel P. Berrangé
  2020-08-28 11:24                 ` Igor Mammedov
  0 siblings, 1 reply; 51+ messages in thread
From: Daniel P. Berrangé @ 2020-08-28  8:58 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: ehabkost, mst, Michal Privoznik, qemu-devel,
	Dr. David Alan Gilbert, Babu Moger, pbonzini, rth

On Thu, Aug 27, 2020 at 10:21:10PM +0200, Igor Mammedov wrote:
> On Wed, 26 Aug 2020 13:45:51 -0500
> Babu Moger <babu.moger@amd.com> wrote:
> 
> > 
> > 
> > > -----Original Message-----
> > > From: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > Sent: Wednesday, August 26, 2020 1:34 PM
> > > To: Moger, Babu <Babu.Moger@amd.com>
> > > Cc: Igor Mammedov <imammedo@redhat.com>; Daniel P. Berrangé
> > > <berrange@redhat.com>; ehabkost@redhat.com; mst@redhat.com; Michal
> > > Privoznik <mprivozn@redhat.com>; qemu-devel@nongnu.org;
> > > pbonzini@redhat.com; rth@twiddle.net
> > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic
> > > decode
> > > 
> > > * Babu Moger (babu.moger@amd.com) wrote:
> > > >
> > > > > -----Original Message-----
> > > > > From: Igor Mammedov <imammedo@redhat.com>
> > > > > Sent: Wednesday, August 26, 2020 8:31 AM
> > > > > To: Daniel P. Berrangé <berrange@redhat.com>
> > > > > Cc: Moger, Babu <Babu.Moger@amd.com>; pbonzini@redhat.com;
> > > > > rth@twiddle.net; ehabkost@redhat.com; qemu-devel@nongnu.org;
> > > > > mst@redhat.com; Michal Privoznik <mprivozn@redhat.com>
> > > > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> > > > > generic decode
> > > > >
> > > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > >
> > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500 Babu Moger
> > > > > > > <babu.moger@amd.com> wrote:
> > > > > > >
> > > > > > > > To support some of the complex topology, we introduced EPYC
> > > > > > > > mode
> > > > > apicid decode.
> > > > > > > > But, EPYC mode decode is running into problems. Also it can
> > > > > > > > become quite a maintenance problem in the future. So, it was
> > > > > > > > decided to remove that code and use the generic decode which
> > > > > > > > works for majority of the topology. Most of the SPECed
> > > > > > > > configuration would work just fine. With some non-SPECed user
> > > > > > > > inputs, it will create some sub-
> > > > > optimal configuration.
> > > > > > > > Here is the discussion thread.
> > > > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > >
> > > > > > > > This series removes all the EPYC mode specific apicid changes
> > > > > > > > and use the generic apicid decode.
> > > > > > >
> > > > > > > the main difference between EPYC and all other CPUs is that it
> > > > > > > requires numa configuration (it's not optional) so we need an
> > > > > > > extra
> > > > No, That is not true. Because of that assumption we made all these
> > > > apicid changes. And here we are now.
> > > >
> > > > > AMD supports various mixed configurations. In case of EPYC-Rome, we
> > > > have NPS1, NPS2 and NPS4(Numa Nodes per socket). In case of NPS1,
> > > > basically we have all the cores in a socket under one numa node. This
> > > > is non-numa configuration.
> > > > Looking at the various configurations and also discussing internally,
> > > > it is not advisable to have (epyc && !numa) check.
> > > 
> > > Indeed on real hardware, I don't think we always see NUMA; my single socket,
> > > 16 core/32 thread 7302P Dell box, shows the kernel printing 'No NUMA
> > > configuration found...Faking a node.'
> looks like firmware bug or maybe it's feature and there is a knob in fw
> to turn it on/off in case used OS doesn't like it for some reason.
> 
> 
> > > So if real hardware hasn't got a NUMA node, what's the real problem?
> > 
> > I don't see any problem once we revert all these changes(patch 1-7).
> > We don't need if (epyc && !numa) error check or auto_enable_numa=true
> > unconditionally.
> 
> We need revert to unbreak migration from QEMU < 5.0,
> everything else (fixes for CPUID_Fn8000001E) could go on top.
> 
> So what's on top (because old code also wasn't correct when
> CPUID_Fn8000001E is taken in account, that's why we are at this point),
> 
> When starting QEMU without -numa
> Indeed we can skip "if (epyc && !numa) error check or auto_enable_numa=true",
> in case where there is 1 die (NPS1).
> (1) User however may set core/threads number bigger than possible by spec,
>     in which case CPUID_Fn8000001E_EBX/CPUID_Fn8000001E_ECX will not be
>     valid spec-wise and could trigger oopses in guest kernel.
>     Given we allow go out of spec, perhaps we should add a warning at
>     realize time saying that used -smp config is not supported since it
>     doesn't match AMD EPYC spec and might not work.
> 
> (2) Earlier we agreed that we can reuse existing die_id instead of internal
>     (topo_ids.node_id in current code)
>     (It is called DIE_ID and NODE ID in spec interchangeably)
>     Same as (1) add a warning when '-smp dies' goes beyond spec limits.
>     
> (3) "-smp dies>1" ''if'' we allow to run it without -numa,
>     then system wide NUMA node id in CPUID_Fn8000001E_ECX probably doesn't matter.
>     could be something like in spec but taking in account die offset, to produce
>     unique id.
> 
>     Same, add a warning that there are more than 1 dies but numa is not enabled,
>     suggest to enable numa.
> 
>     With current code it produces invalid APIC ID for valid '-smp' combination,
>     however if we revert it and switch to die_id than it should produce
>     valid APIC ID once again (as in 4.2).
>     Given it produces invalid APIC id, maybe we should just ditch the case and
>     fold it in (4) (i.e. require -numa if "-smp dies>1")
> 
> (4) -numa is used (RHBZ1728166)
>     we need to ensure that socket*dies == ms->numa_state->num_nodes
>      and make sure that CPUID_Fn8000001E_ECX consistent with
>     cpu mapping provided with "-numa cpu=" option.

Why do we need socket*dies == ms->numa_state->num_nodes ? That doesn't
seem to be the case on the bare metal EPYC nodes I've used, which let you
configure the number of NUMA nodes in firmware.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-28  8:58               ` Daniel P. Berrangé
@ 2020-08-28 11:24                 ` Igor Mammedov
  2020-08-28 14:17                   ` Babu Moger
  0 siblings, 1 reply; 51+ messages in thread
From: Igor Mammedov @ 2020-08-28 11:24 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: ehabkost, mst, Michal Privoznik, qemu-devel,
	Dr. David Alan Gilbert, Babu Moger, pbonzini, rth

On Fri, 28 Aug 2020 09:58:03 +0100
Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Thu, Aug 27, 2020 at 10:21:10PM +0200, Igor Mammedov wrote:
> > On Wed, 26 Aug 2020 13:45:51 -0500
> > Babu Moger <babu.moger@amd.com> wrote:
> >   
> > > 
> > >   
> > > > -----Original Message-----
> > > > From: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > > Sent: Wednesday, August 26, 2020 1:34 PM
> > > > To: Moger, Babu <Babu.Moger@amd.com>
> > > > Cc: Igor Mammedov <imammedo@redhat.com>; Daniel P. Berrangé
> > > > <berrange@redhat.com>; ehabkost@redhat.com; mst@redhat.com; Michal
> > > > Privoznik <mprivozn@redhat.com>; qemu-devel@nongnu.org;
> > > > pbonzini@redhat.com; rth@twiddle.net
> > > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic
> > > > decode
> > > > 
> > > > * Babu Moger (babu.moger@amd.com) wrote:  
> > > > >  
> > > > > > -----Original Message-----
> > > > > > From: Igor Mammedov <imammedo@redhat.com>
> > > > > > Sent: Wednesday, August 26, 2020 8:31 AM
> > > > > > To: Daniel P. Berrangé <berrange@redhat.com>
> > > > > > Cc: Moger, Babu <Babu.Moger@amd.com>; pbonzini@redhat.com;
> > > > > > rth@twiddle.net; ehabkost@redhat.com; qemu-devel@nongnu.org;
> > > > > > mst@redhat.com; Michal Privoznik <mprivozn@redhat.com>
> > > > > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> > > > > > generic decode
> > > > > >
> > > > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > >  
> > > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500 Babu Moger
> > > > > > > > <babu.moger@amd.com> wrote:
> > > > > > > >  
> > > > > > > > > > To support some of the complex topology, we introduced EPYC
> > > > > > > > > > mode apicid decode.
> > > > > > > > > > But, EPYC mode decode is running into problems. Also it can
> > > > > > > > > > become quite a maintenance problem in the future. So, it was
> > > > > > > > > > decided to remove that code and use the generic decode which
> > > > > > > > > > works for majority of the topology. Most of the SPECed
> > > > > > > > > > configuration would work just fine. With some non-SPECed user
> > > > > > > > > > inputs, it will create some sub-optimal configuration.
> > > > > > > > > > Here is the discussion thread.
> > > > > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > > > >
> > > > > > > > > > This series removes all the EPYC mode specific apicid changes
> > > > > > > > > > and use the generic apicid decode.
> > > > > > > >
> > > > > > > > the main difference between EPYC and all other CPUs is that it
> > > > > > > > requires numa configuration (it's not optional) so we need an
> > > > > > > > extra  
> > > > > No, That is not true. Because of that assumption we made all these
> > > > > apicid changes. And here we are now.
> > > > >
> > > > > AMD supports varies mixed configurations. In case of EPYC-Rome, we
> > > > > have NPS1, NPS2 and NPS4(Numa Nodes per socket). In case of NPS1,
> > > > > basically we have all the cores in a socket under one numa node. This
> > > > > is non-numa configuration.
> > > > > Looking at the various configurations and also discussing internally,
> > > > > it is not advisable to have (epyc && !numa) check.  
> > > > 
> > > > Indeed on real hardware, I don't think we always see NUMA; my single socket,
> > > > 16 core/32 thread 7302P Dell box, shows the kernel printing 'No NUMA
> > > > configuration found...Faking a node.'  
> > looks like firmware bug or maybe it's feature and there is a knob in fw
> > to turn it on/off in case used OS doesn't like it for some reason.
> > 
> >   
> > > > So if real hardware hasn't got a NUMA node, what's the real problem?  
> > > 
> > > I don't see any problem once we revert all these changes(patch 1-7).
> > > We don't need if (epyc && !numa) error check or auto_enable_numa=true
> > > unconditionally.  
> > 
> > We need revert to unbreak migration from QEMU < 5.0,
> > everything else (fixes for CPUID_Fn8000001E) could go on top.
> > 
> > So what's on top (because old code also wasn't correct when
> > CPUID_Fn8000001E is taken in account, tha's why we are at this point),
> > 
> > When starting QEMU without -numa
> > Indeed we can skip "if (epyc && !numa) error check or auto_enable_numa=true",
> > in case where there is 1 die (NPS1).
> > (1) User however may set core/threads number bigger than possible by spec,
> >     in which case CPUID_Fn8000001E_EBX/CPUID_Fn8000001E_ECX will not be
> >     valid spec vise and could trigger OPPs in guest kernel.
> >     Given we allow go out of spec, perhaps we should add a warning at
> >     realize time saying that used -smp config is not supported since it
> >     doesn't match AMD EPYC spec and might not work.
> > 
> > (2) Earlier we agreed that we can reuse existing die_id instead of internal
> >     (topo_ids.node_id in current code)
> >     (It's is called DIE_ID and NODE ID in spec interchangeably)
> >     Same as (1) add a warning when '-smp dies' goes beyond spec limits.
> >     
> > (3) "-smp dies>1" ''if'' we allow to run it without -numa,
> >     then system wide NUMA node id in CPUID_Fn8000001E_ECX probably doesn't matter.
> >     could be something like in spec but taking in account die offset, to produce
> >     unique id.
> > 
> >     Same, add a warning that there are more than 1 dies but numa is not enabled,
> >     suggest to enable numa.
> > 
> >     With current code it produces invalid APIC ID for valid '-smp' combination,
> >     however if we revert it and switch to die_id than it should produce
> >     valid APIC ID once again (as in 4.2).
> >     Given it produces invalid APIC id, maybe we should just ditch the case and
> >     fold it in (4) (i.e. require -numa if "-smp dies>1")
> > 
> > (4) -numa is used (RHBZ1728166)
> >     we need to ensure that socket*dies == ms->numa_state->num_nodes
> >      and make sure that CPUID_Fn8000001E_ECX consistent with
> >     cpu mapping provided with "-numa cpu=" option.  
> 
> Why do we need to socket*dies == ms->numa_state->num_nodes ? That doesn't
> seem to be the case in bare metal EPYC nodes I've used which lets you
> configure how many NUMA nodes in firmware.

(From dumps Babu has provided earlier, it was dies == nodes and
CPUID_Fn8000001E_ECX == numa node ids in SRAT.)

Dumping CPUID_Fn8000001E and the SRAT table for such configs will help us
figure out whether we need socket*dies != nodes and how to compose a config
where SRAT differs from CPUID_Fn8000001E_ECX.

Babu, can you provide CPUID_Fn8000001E and SRAT dumps for the above config
combinations? Or point to some spec/guide that describes how it should be.
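
For reference, a minimal sketch of the check under discussion, written as a
warning rather than a hard error. The function and parameter names are
hypothetical and not taken from QEMU, and the sketch assumes that a run
without -numa (zero configured nodes) is not checked at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper, not QEMU code: returns true (and prints a warning)
 * when the number of NUMA nodes given on the command line differs from
 * sockets * dies. num_nodes == 0 means -numa was not used, so nothing is
 * checked in that case (an assumption of this sketch).
 */
static bool warn_if_nodes_mismatch(unsigned sockets, unsigned dies,
                                   unsigned num_nodes)
{
    assert(sockets >= 1 && dies >= 1);

    if (num_nodes == 0 || sockets * dies == num_nodes) {
        return false;
    }
    fprintf(stderr,
            "warning: %u NUMA nodes configured, but sockets * dies = %u\n",
            num_nodes, sockets * dies);
    return true;
}
```

With sockets=1,dies=2 and two -numa nodes the helper stays silent; with
sockets=1,dies=1 and two nodes it warns. Whether that second case deserves a
warning at all is what the requested dumps should tell us.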


> 
> 
> Regards,
> Daniel



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-28  8:48                 ` Dr. David Alan Gilbert
@ 2020-08-28 11:36                   ` Igor Mammedov
  0 siblings, 0 replies; 51+ messages in thread
From: Igor Mammedov @ 2020-08-28 11:36 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: ehabkost, mst, qemu-devel, Babu Moger, pbonzini, rth

On Fri, 28 Aug 2020 09:48:30 +0100
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:

> * Igor Mammedov (imammedo@redhat.com) wrote:
> > On Wed, 26 Aug 2020 15:10:46 +0100
> > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> >   
> > > * Igor Mammedov (imammedo@redhat.com) wrote:  
> > > > On Tue, 25 Aug 2020 16:25:21 +0100
> > > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > > >   
> > > > > * Igor Mammedov (imammedo@redhat.com) wrote:  
> > > > > > On Tue, 25 Aug 2020 09:15:04 +0100
> > > > > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > > > > >     
> > > > > > > * Babu Moger (babu.moger@amd.com) wrote:    
> > > > > > > > Hi Dave,
> > > > > > > > 
> > > > > > > > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:      
> > > > > > > > > * Babu Moger (babu.moger@amd.com) wrote:      
> > > > > > > > >> To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > > > > > >> But, EPYC mode decode is running into problems. Also it can become quite a
> > > > > > > > >> maintenance problem in the future. So, it was decided to remove that code and
> > > > > > > > >> use the generic decode which works for majority of the topology. Most of the
> > > > > > > > >> SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > > > > > >> it will create some sub-optimal configuration.
> > > > > > > > >> Here is the discussion thread.
> > > > > > > > > >> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > > >>
> > > > > > > > >> This series removes all the EPYC mode specific apicid changes and use the generic
> > > > > > > > >> apicid decode.      
> > > > > > > > > 
> > > > > > > > > Hi Babu,
> > > > > > > > >   This does simplify things a lot!
> > > > > > > > > One worry, what happens about a live migration of a VM from an old qemu
> > > > > > > > > that was using the node-id to a qemu with this new scheme?      
> > > > > > > > 
> > > > > > > > The node_id which we introduced was only used internally. This wasn't
> > > > > > > > exposed outside. I don't think live migration will be an issue.      
> > > > > > > 
> > > > > > > Didn't it become part of the APIC ID visible to the guest?    
> > > > > > 
> > > > > > Daniel asked similar question wrt hard error on start up,
> > > > > > when CLI is not sufficient to create EPYC cpu.
> > > > > > 
> > > > > > https://www.mail-archive.com/qemu-devel@nongnu.org/msg728536.html
> > > > > > 
> > > > > > Migration might fall into the same category.
> > > > > > Also looking at the history, 5.0 commit 
> > > > > >   247b18c593ec29 target/i386: Enable new apic id encoding for EPYC based cpus models
> > > > > > silently broke APIC ID (without versioning), for all EPYC models (that's were 1 new and 1 old one).
> > > > > > 
> > > > > > (I'm not aware of somebody complaining about it)
> > > > > > 
> > > > > > Another commit ed78467a21459, changed CPUID_8000_001E without versioning as well.
> > > > > > 
> > > > > > 
> > > > > > With current EPYC apicid code, if all starts align (no numa or 1 numa node only on
> > > > > > CLI and no -smp dies=) it might produce a valid CPU (apicid+CPUID_8000_001E).
> > > > > > No numa is gray area, since EPYC spec implies that it has to be numa machine in case of real EPYC cpus.
> > > > > > Multi-node configs would be correct only if user assigns cpus to numa nodes
> > > > > > by duplicating internal node_id algorithm that this series removes.
> > > > > > 
> > > > > > There might be other broken cases that I don't recall anymore
> > > > > > (should be mentioned in previous versions of this series)
> > > > > > 
> > > > > > 
> > > > > > To summarize from migration pov (ignoring ed78467a21459 change):
> > > > > > 
> > > > > >  1) old qemu pre-5.0 ==>  qemu 5.0, 5.1 - broken migration    
> > > > > 
> > > > > Oh ....
> > > > >   
> > > > > >  2) with this series (lets call it qemu 5.2)
> > > > > >      pre-5.0 ==> qemu 5.2 - should work as series basically rollbacks current code to pre-5.0
> > > > > >      qemu 5.0, 5.1 ==> qemu 5.2 - broken
> > > > > > 
> > > > > > It's all about picking which poison to choose,
> > > > > > I'd preffer 2nd case as it lets drop a lot of complicated code that
> > > > > > doesn't work as expected.    
> > > > > 
> > > > > I think that would make our lives easier for other reasons; so I'm happy
> > > > > to go with that.  
> > > > 
> > > > to make things less painful for users, me wonders if there is a way
> > > > to block migration if epyc and specific QEMU versions are used?  
> > > 
> > > We have no way to block based on version - and that's a pretty painful
> > > thing to do; we can block based on machine type.
> > > 
> > > But before we get there; can we understand in which combinations that
> > > things break and why exactly - would it break on a 1 or 2 vCPU guest -
> > > or would it only break when we get to the point the upper bits start
> > > being used for example?  Why exaclty would it break - i.e. is it going
> > > to change the name of sections in the migration stream - or are the
> > > values we need actually going to migrate OK?  
> > 
> > it's values of APIC ID, where 4.2 and 5.0 QEMU use different values
> > if numa is enabled.
> > I'd expect guest to be very confused in when this happens.
> > 
> > here is an example:
> > qemu-4.2 -cpu EPYC -smp 8,sockets=1,cores=8 -numa node,cpus=0-3 -numa node,cpus=4-7  
> 
> OK, but it'll probably be OK on small VMs with a single NUMA node?

it should be fine if -numa isn't used.
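
To illustrate why -numa is the deciding factor, here is a minimal sketch that
reproduces the two apic-id values reported in this thread for
"-smp 8,sockets=1,cores=8" with two NUMA nodes (7 on 4.2, 15 on 5.1). It is
not the QEMU implementation: the function names are made up, and the
node-aware layout simply assumes that the node id is derived from the CPU
index and placed directly above the core bits.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits needed to encode the values 0..count-1 (0 if count == 1). */
static unsigned bits_for(unsigned count)
{
    unsigned width = 0;

    assert(count >= 1);
    while ((1u << width) < count) {
        width++;
    }
    return width;
}

/* Generic layout, assumed here for 4.2: [ pkg | core | thread ]. */
static uint32_t apicid_generic(unsigned cores, unsigned threads,
                               unsigned cpu_index)
{
    unsigned smt_id = cpu_index % threads;
    unsigned core_id = (cpu_index / threads) % cores;
    unsigned pkg_id = cpu_index / (cores * threads);
    unsigned core_off = bits_for(threads);
    unsigned pkg_off = core_off + bits_for(cores);

    return (pkg_id << pkg_off) | (core_id << core_off) | smt_id;
}

/* Hypothetical node-aware layout: [ pkg | node | core | thread ]. */
static uint32_t apicid_with_node(unsigned cores, unsigned threads,
                                 unsigned nodes, unsigned cpu_index)
{
    unsigned cpus_per_pkg = cores * threads;
    unsigned cpus_per_node = (cpus_per_pkg + nodes - 1) / nodes;
    unsigned smt_id = cpu_index % threads;
    unsigned core_id = (cpu_index / threads) % cores;
    unsigned node_id = (cpu_index / cpus_per_node) % nodes;
    unsigned pkg_id = cpu_index / cpus_per_pkg;
    unsigned core_off = bits_for(threads);
    unsigned node_off = core_off + bits_for(cores);
    unsigned pkg_off = node_off + bits_for(nodes);

    return (pkg_id << pkg_off) | (node_id << node_off) |
           (core_id << core_off) | smt_id;
}
```

For the last vCPU (index 7), apicid_generic(8, 1, 7) gives 7, while
apicid_with_node(8, 1, 2, 7) gives 15, because the node id 1 lands in bit 3.
CPUs on node 0 get the same value from both layouts, which is why the
difference only shows up once a second node is configured.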
 
> Dave
> 
> > (QEMU) qom-get path=/machine/unattached/device[8] property=apic-id
> > {
> >     "return": 7
> > }
> > 
> > vs
> > 
> > qemu-5.1 -cpu EPYC -smp 8,sockets=1,cores=8 -numa node,cpus=0-3 -numa node,cpus=4-7
> > (QEMU) qom-get path=/machine/unattached/device[8] property=apic-id
> > {
> >     "return": 15
> > }
> > 
> > we probably can't do anything based on machine type versions, as
> > 4.2 and older versions on qemu-5.0 and newer use different algorithm to calculate apic-id.
> > 
> > Hence was suggestion to leave 5.0/5.1 with broken apic id and revert back to
> > 4.2 algorithm, which should encode APIC ID correctly when '-smp dies' is used. 
> > 
> >   
> > > Dave
> > > 
> > >   
> > > > > > PS:
> > > > > >  I didn't review it yet, but with this series we aren't
> > > > > >  making up internal node_ids that should match user provided numa node ids somehow.
> > > > > >  It seems series lost the patch that would enforce numa in case -smp dies>1,
> > > > > >  but otherwise it heads in the right direction.    
> > > > > 
> > > > > Dave
> > > > >   
> > > > > > > 
> > > > > > > Dave
> > > > > > >     
> > > > > >     
> > > >   
> >   



^ permalink raw reply	[flat|nested] 51+ messages in thread

* RE: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-28 11:24                 ` Igor Mammedov
@ 2020-08-28 14:17                   ` Babu Moger
  2020-08-28 14:48                     ` Igor Mammedov
  0 siblings, 1 reply; 51+ messages in thread
From: Babu Moger @ 2020-08-28 14:17 UTC (permalink / raw)
  To: Igor Mammedov, Daniel P. Berrangé
  Cc: ehabkost, mst, Michal Privoznik, qemu-devel,
	Dr. David Alan Gilbert, pbonzini, rth



> -----Original Message-----
> From: Igor Mammedov <imammedo@redhat.com>
> Sent: Friday, August 28, 2020 6:25 AM
> To: Daniel P. Berrangé <berrange@redhat.com>
> Cc: Moger, Babu <Babu.Moger@amd.com>; Dr. David Alan Gilbert
> <dgilbert@redhat.com>; ehabkost@redhat.com; mst@redhat.com; Michal
> Privoznik <mprivozn@redhat.com>; qemu-devel@nongnu.org;
> pbonzini@redhat.com; rth@twiddle.net
> Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> generic decode
> 
> On Fri, 28 Aug 2020 09:58:03 +0100
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
> > On Thu, Aug 27, 2020 at 10:21:10PM +0200, Igor Mammedov wrote:
> > > On Wed, 26 Aug 2020 13:45:51 -0500
> > > Babu Moger <babu.moger@amd.com> wrote:
> > >
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > > > Sent: Wednesday, August 26, 2020 1:34 PM
> > > > > To: Moger, Babu <Babu.Moger@amd.com>
> > > > > Cc: Igor Mammedov <imammedo@redhat.com>; Daniel P. Berrangé
> > > > > <berrange@redhat.com>; ehabkost@redhat.com; mst@redhat.com;
> > > > > Michal Privoznik <mprivozn@redhat.com>; qemu-
> devel@nongnu.org;
> > > > > pbonzini@redhat.com; rth@twiddle.net
> > > > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and
> > > > > use generic decode
> > > > >
> > > > > * Babu Moger (babu.moger@amd.com) wrote:
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Igor Mammedov <imammedo@redhat.com>
> > > > > > > Sent: Wednesday, August 26, 2020 8:31 AM
> > > > > > > To: Daniel P. Berrangé <berrange@redhat.com>
> > > > > > > Cc: Moger, Babu <Babu.Moger@amd.com>;
> pbonzini@redhat.com;
> > > > > > > rth@twiddle.net; ehabkost@redhat.com; qemu-
> devel@nongnu.org;
> > > > > > > mst@redhat.com; Michal Privoznik <mprivozn@redhat.com>
> > > > > > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode
> > > > > > > and use generic decode
> > > > > > >
> > > > > > > On Wed, 26 Aug 2020 13:50:59 +0100 Daniel P. Berrangé
> > > > > > > <berrange@redhat.com> wrote:
> > > > > > >
> > > > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov
> wrote:
> > > > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500 Babu Moger
> > > > > > > > > <babu.moger@amd.com> wrote:
> > > > > > > > >
> > > > > > > > > > > To support some of the complex topology, we introduced
> > > > > > > > > > > EPYC mode apicid decode.
> > > > > > > > > > > But, EPYC mode decode is running into problems. Also
> > > > > > > > > > > it can become quite a maintenance problem in the
> > > > > > > > > > > future. So, it was decided to remove that code and use
> > > > > > > > > > > the generic decode which works for majority of the
> > > > > > > > > > > topology. Most of the SPECed configuration would work
> > > > > > > > > > > just fine. With some non-SPECed user inputs, it will
> > > > > > > > > > > create some sub-optimal configuration.
> > > > > > > > > > > Here is the discussion thread.
> > > > > > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > > > > >
> > > > > > > > > > > This series removes all the EPYC mode specific apicid
> > > > > > > > > > > changes and use the generic apicid decode.
> > > > > > > > >
> > > > > > > > > the main difference between EPYC and all other CPUs is
> > > > > > > > > that it requires numa configuration (it's not optional)
> > > > > > > > > so we need an extra
> > > > > > No, That is not true. Because of that assumption we made all
> > > > > > these apicid changes. And here we are now.
> > > > > >
> > > > > > AMD supports varies mixed configurations. In case of
> > > > > > EPYC-Rome, we have NPS1, NPS2 and NPS4(Numa Nodes per
> socket).
> > > > > > In case of NPS1, basically we have all the cores in a socket
> > > > > > under one numa node. This is non-numa configuration.
> > > > > > Looking at the various configurations and also discussing
> > > > > > internally, it is not advisable to have (epyc && !numa) check.
> > > > >
> > > > > Indeed on real hardware, I don't think we always see NUMA; my
> > > > > single socket,
> > > > > 16 core/32 thread 7302P Dell box, shows the kernel printing 'No
> > > > > NUMA configuration found...Faking a node.'
> > > looks like firmware bug or maybe it's feature and there is a knob in
> > > fw to turn it on/off in case used OS doesn't like it for some reason.
> > >
> > >
> > > > > So if real hardware hasn't got a NUMA node, what's the real problem?
> > > >
> > > > I don't see any problem once we revert all these changes(patch 1-7).
> > > > We don't need if (epyc && !numa) error check or
> > > > auto_enable_numa=true unconditionally.
> > >
> > > We need revert to unbreak migration from QEMU < 5.0, everything else
> > > (fixes for CPUID_Fn8000001E) could go on top.
> > >
> > > So what's on top (because old code also wasn't correct when
> > > CPUID_Fn8000001E is taken in account, tha's why we are at this
> > > point),
> > >
> > > When starting QEMU without -numa
> > > Indeed we can skip "if (epyc && !numa) error check or
> > > auto_enable_numa=true", in case where there is 1 die (NPS1).
> > > > (1) User however may set core/threads number bigger than possible by spec,
> > > >     in which case CPUID_Fn8000001E_EBX/CPUID_Fn8000001E_ECX will not be
> > > >     valid spec vise and could trigger OPPs in guest kernel.
> > > >     Given we allow go out of spec, perhaps we should add a warning at
> > > >     realize time saying that used -smp config is not supported since it
> > > >     doesn't match AMD EPYC spec and might not work.
> > > >
> > > > (2) Earlier we agreed that we can reuse existing die_id instead of internal
> > > >     (topo_ids.node_id in current code)
> > > >     (It's is called DIE_ID and NODE ID in spec interchangeably)
> > > >     Same as (1) add a warning when '-smp dies' goes beyond spec limits.
> > > >
> > > > (3) "-smp dies>1" ''if'' we allow to run it without -numa,
> > > >     then system wide NUMA node id in CPUID_Fn8000001E_ECX probably doesn't matter.
> > > >     could be something like in spec but taking in account die offset, to produce
> > > >     unique id.
> > > >
> > > >     Same, add a warning that there are more than 1 dies but numa is not enabled,
> > > >     suggest to enable numa.
> > > >
> > > >     With current code it produces invalid APIC ID for valid '-smp' combination,
> > > >     however if we revert it and switch to die_id than it should produce
> > > >     valid APIC ID once again (as in 4.2).
> > > >     Given it produces invalid APIC id, maybe we should just ditch the case and
> > > >     fold it in (4) (i.e. require -numa if "-smp dies>1")
> > >
> > > (4) -numa is used (RHBZ1728166)
> > >     we need to ensure that socket*dies == ms->numa_state->num_nodes
> > >      and make sure that CPUID_Fn8000001E_ECX consistent with
> > >     cpu mapping provided with "-numa cpu=" option.
> >
> > Why do we need to socket*dies == ms->numa_state->num_nodes ? That
> > doesn't seem to be the case in bare metal EPYC nodes I've used which
> > lets you configure how many NUMA nodes in firmware.
> 
> (From dumps Babu has provided earlier, it was dies == nodes and
> CPUID_Fn8000001E_ECX == numa node ids in SRAT.)

Yes, that is correct. In most cases dies == nodes.

But that is going to change. In the future (even on EPYC-Rome), a new
firmware/BIOS option will let users configure the number of NUMA nodes:
NPS1, NPS2 or NPS4 (Nodes Per Socket). In those cases dies and nodes will
not match. That is why I wanted to keep them separate. Users can change
dies or -numa to match their BIOS config.

> 
> dumping CPUID_Fn8000001E and SRAT table for such configs will help us to
> figure out if we need socket*dies != nodes and how to compose config were
> SRAT differs from CPUID_Fn8000001E_ECX.
> 
> Babu, can you provide CPUID_Fn8000001E and SRAT dumps for above configs
> combinations? Or to some spec/guide how it should be.

I don't have the config right now, but I will try to get one.

> 
> 
> >
> >
> > Regards,
> > Daniel



^ permalink raw reply	[flat|nested] 51+ messages in thread

* RE: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-28  8:42                   ` Igor Mammedov
@ 2020-08-28 14:22                     ` Babu Moger
  0 siblings, 0 replies; 51+ messages in thread
From: Babu Moger @ 2020-08-28 14:22 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: ehabkost, mst, qemu-devel, Dr. David Alan Gilbert, pbonzini, rth



> -----Original Message-----
> From: Igor Mammedov <imammedo@redhat.com>
> Sent: Friday, August 28, 2020 3:43 AM
> To: Moger, Babu <Babu.Moger@amd.com>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>; ehabkost@redhat.com;
> mst@redhat.com; qemu-devel@nongnu.org; pbonzini@redhat.com;
> rth@twiddle.net
> Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> generic decode
> 
> On Thu, 27 Aug 2020 17:58:01 -0500
> Babu Moger <babu.moger@amd.com> wrote:
> 
> > > -----Original Message-----
> > > From: Igor Mammedov <imammedo@redhat.com>
> > > Sent: Thursday, August 27, 2020 4:19 PM
> > > To: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > Cc: ehabkost@redhat.com; mst@redhat.com; qemu-devel@nongnu.org;
> > > Moger, Babu <Babu.Moger@amd.com>; pbonzini@redhat.com;
> > > rth@twiddle.net
> > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> > > generic decode
> > >
> > > On Wed, 26 Aug 2020 15:10:46 +0100
> > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > >
> > > > * Igor Mammedov (imammedo@redhat.com) wrote:
> > > > > On Tue, 25 Aug 2020 16:25:21 +0100 "Dr. David Alan Gilbert"
> > > > > <dgilbert@redhat.com> wrote:
> > > > >
> > > > > > * Igor Mammedov (imammedo@redhat.com) wrote:
> > > > > > > On Tue, 25 Aug 2020 09:15:04 +0100 "Dr. David Alan Gilbert"
> > > > > > > <dgilbert@redhat.com> wrote:
> > > > > > >
> > > > > > > > * Babu Moger (babu.moger@amd.com) wrote:
> > > > > > > > > Hi Dave,
> > > > > > > > >
> > > > > > > > > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:
> > > > > > > > > > * Babu Moger (babu.moger@amd.com) wrote:
> > > > > > > > > > >> To support some of the complex topology, we introduced EPYC
> > > > > > > > > > >> mode apicid decode.
> > > > > > > > > > >> But, EPYC mode decode is running into problems. Also
> > > > > > > > > > >> it can become quite a maintenance problem in the
> > > > > > > > > > >> future. So, it was decided to remove that code and
> > > > > > > > > > >> use the generic decode which works for majority of
> > > > > > > > > > >> the topology. Most of the SPECed configuration would
> > > > > > > > > > >> work just fine. With some non-SPECed user inputs, it
> > > > > > > > > > >> will create some sub-optimal configuration.
> > > > > > > > > > >> Here is the discussion thread.
> > > > > > > > > > >> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > > > > >>
> > > > > > > > > > >> This series removes all the EPYC mode specific apicid
> > > > > > > > > > >> changes and use the generic apicid decode.
> > > > > > > > > >
> > > > > > > > > > Hi Babu,
> > > > > > > > > >   This does simplify things a lot!
> > > > > > > > > > > One worry, what happens about a live migration of a VM from an old qemu
> > > > > > > > > > > that was using the node-id to a qemu with this new scheme?
> > > > > > > > > >
> > > > > > > > > > The node_id which we introduced was only used internally. This wasn't
> > > > > > > > > > exposed outside. I don't think live migration will be an issue.
> > > > > > > >
> > > > > > > > Didn't it become part of the APIC ID visible to the guest?
> > > > > > >
> > > > > > > Daniel asked similar question wrt hard error on start up,
> > > > > > > when CLI is not sufficient to create EPYC cpu.
> > > > > > >
> > > > > > >
> > > > > > > > https://www.mail-archive.com/qemu-devel@nongnu.org/msg728536.html
> > > > > > >
> > > > > > > Migration might fall into the same category.
> > > > > > > > Also looking at the history, 5.0 commit
> > > > > > > >   247b18c593ec29 target/i386: Enable new apic id encoding for EPYC based cpus models
> > > > > > > > silently broke APIC ID (without versioning), for all EPYC models (that's were 1 new and 1 old one).
> > > > > > > >
> > > > > > > > (I'm not aware of somebody complaining about it)
> > > > > > > >
> > > > > > > > Another commit ed78467a21459, changed CPUID_8000_001E without versioning as well.
> > > > > > > >
> > > > > > > >
> > > > > > > > With current EPYC apicid code, if all starts align (no numa or 1 numa node only on
> > > > > > > > CLI and no -smp dies=) it might produce a valid CPU (apicid+CPUID_8000_001E).
> > > > > > > > No numa is gray area, since EPYC spec implies that it has to be numa machine in case of real EPYC cpus.
> > > > > > > > Multi-node configs would be correct only if user assigns cpus to numa nodes
> > > > > > > > by duplicating internal node_id algorithm that this series removes.
> > > > > > > >
> > > > > > > > There might be other broken cases that I don't recall anymore
> > > > > > > > (should be mentioned in previous versions of this series)
> > > > > > > >
> > > > > > > >
> > > > > > > > To summarize from migration pov (ignoring ed78467a21459 change):
> > > > > > >
> > > > > > >  1) old qemu pre-5.0 ==>  qemu 5.0, 5.1 - broken migration
> > > > > >
> > > > > > Oh ....
> > > > > >
> > > > > > > >  2) with this series (lets call it qemu 5.2)
> > > > > > > >      pre-5.0 ==> qemu 5.2 - should work as series basically rollbacks current code to pre-5.0
> > > > > > > >      qemu 5.0, 5.1 ==> qemu 5.2 - broken
> > > > > > >
> > > > > > > It's all about picking which poison to choose, I'd preffer
> > > > > > > 2nd case as it lets drop a lot of complicated code that
> > > > > > > doesn't work as expected.
> > > > > >
> > > > > > I think that would make our lives easier for other reasons; so
> > > > > > I'm happy to go with that.
> > > > >
> > > > > to make things less painful for users, me wonders if there is a
> > > > > way to block migration if epyc and specific QEMU versions are used?
> > > >
> > > > We have no way to block based on version - and that's a pretty
> > > > painful thing to do; we can block based on machine type.
> > > >
> > > > But before we get there; can we understand in which combinations
> > > > that things break and why exactly - would it break on a 1 or 2
> > > > vCPU guest - or would it only break when we get to the point the
> > > > upper bits start being used for example?  Why exaclty would it
> > > > break - i.e. is it going to change the name of sections in the
> > > > migration stream - or are the values we need actually going to migrate
> OK?
> > >
> > > it's values of APIC ID, where 4.2 and 5.0 QEMU use different values
> > > if numa is enabled.
> > > I'd expect guest to be very confused in when this happens.
> > >
> > > here is an example:
> > > > qemu-4.2 -cpu EPYC -smp 8,sockets=1,cores=8 -numa node,cpus=0-3 -numa node,cpus=4-7
> > > >
> > > > (QEMU) qom-get path=/machine/unattached/device[8] property=apic-id
> > > > {
> > > >     "return": 7
> > > > }
> > > >
> > > > vs
> > > >
> > > > qemu-5.1 -cpu EPYC -smp 8,sockets=1,cores=8 -numa node,cpus=0-3 -numa node,cpus=4-7
> > > > (QEMU) qom-get path=/machine/unattached/device[8] property=apic-id
> > > > {
> > > >     "return": 15
> > > > }
> > >
> > > we probably can't do anything based on machine type versions, as
> > > 4.2 and older versions on qemu-5.0 and newer use different algorithm
> > > to calculate apic-id.
> > >
> > > Hence was suggestion to leave 5.0/5.1 with broken apic id and revert
> > > back to 4.2 algorithm, which should encode APIC ID correctly when
> > > '-smp dies' is used.
> >
> > That is correct. When we revert all the node_id related changes, we
> > will go back to 4.2 algorithm. It will work fine with user passing
> > "-smp dies=n". It also keeps the code simple. That is why I kept the
> > decoding of 0x8000001e like this below. This will also match apicid decoding.
> >
> > *ecx = ((topo_info->dies_per_pkg - 1) << 8) |
> >        ((cpu->apic_id >> apicid_die_offset(topo_info)) & 0xFF);
> that will work when there is no -numa on CLI; when -numa is used, we
> should use the node id that the user provided,
> like you did in previous revision
>    "[PATCH v4 1/3] i386: Simplify CPUID_8000_001E for AMD"

This might be a problem in the future with new BIOS options to change the
NPS (Nodes per Socket). Nodes and dies may not match. Then we will end up
with wrong CPUID_8000_001E encoding. That is why I wanted to keep both of
them separate. Users have the option to configure them so that they match
their BIOS config.
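
To make the encoding under discussion concrete, here is a minimal sketch of
the generic (4.2-style) APIC ID layout and of the 0x8000001e ECX formula
quoted above. It is not the QEMU code: struct topo_sketch and the helper
names are hypothetical stand-ins for X86CPUTopoInfo and the apicid_*_offset()
helpers, and it assumes each field is simply padded to a power of two.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for QEMU's X86CPUTopoInfo. */
struct topo_sketch {
    unsigned dies_per_pkg;
    unsigned cores_per_die;
    unsigned threads_per_core;
};

/* Number of bits needed to hold the IDs 0..count-1 (count >= 1). */
static unsigned bits_for_count(unsigned count)
{
    unsigned bits = 0;

    while ((1u << bits) < count) {
        bits++;
    }
    return bits;
}

/* Bit position of the die field: above the SMT and core fields. */
static unsigned die_offset(const struct topo_sketch *t)
{
    return bits_for_count(t->threads_per_core) +
           bits_for_count(t->cores_per_die);
}

/* Bit position of the package field: above the die field. */
static unsigned pkg_offset(const struct topo_sketch *t)
{
    return die_offset(t) + bits_for_count(t->dies_per_pkg);
}

/* Generic APIC ID layout: [ pkg | die | core | smt ]. */
static uint32_t apicid_from_ids(const struct topo_sketch *t, unsigned pkg,
                                unsigned die, unsigned core, unsigned smt)
{
    unsigned core_offset = bits_for_count(t->threads_per_core);

    return ((uint32_t)pkg << pkg_offset(t)) |
           ((uint32_t)die << die_offset(t)) |
           ((uint32_t)core << core_offset) |
           smt;
}

/* ECX of CPUID 0x8000001e, following the formula quoted in this thread. */
static uint32_t cpuid_8000001e_ecx(const struct topo_sketch *t,
                                   uint32_t apic_id)
{
    return ((t->dies_per_pkg - 1) << 8) |
           ((apic_id >> die_offset(t)) & 0xFF);
}
```

With one die of 8 single-threaded cores, the last CPU gets APIC ID 7, which
is the value the qemu-4.2 example earlier in the thread reports. With
dies=2 and 4 cores per die, the last CPU also gets APIC ID 7 and the sketch
yields ECX 0x101 (two nodes per package, node 1).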


> 
> > Still not clear if we need to add a warning when numa nodes != dies.
> > Worried about adding that check and remove it again later.
> Since there is an objection to making it an error, I'd go with a warning for
> now; it makes life a bit easier for the person who has to figure out what's wrong.
> 
> > What about auto_enable_numa? Do we still need it?
> >
> >
> > I can send the patches tomorrow if these things are clarified.
> > Thanks
> With auto_enable_numa it would be cleaner as you can reuse the same
> numa code to set 0x8000001e.ecx vs hardcoding it as above.
> 
> Maybe post series without auto_enable_numa so we fix migration
> regression ASAP and then switch to auto_enable_numa on top.
> 
> 
> >
> > >
> > >
> > > > Dave
> > > >
> > > >
> > > > > > > PS:
> > > > > > >  I didn't review it yet, but with this series we aren't
> > > > > > > > making up internal node_ids that should match user provided
> > > > > > > > numa node ids somehow.
> > > > > > >  It seems series lost the patch that would enforce numa in
> > > > > > > case -smp dies>1,  but otherwise it heads in the right direction.
> > > > > >
> > > > > > Dave
> > > > > >
> > > > > > > >
> > > > > > > > Dave
> > > > > > > >
> > > > > > >
> > > > >
> >




* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-28 14:17                   ` Babu Moger
@ 2020-08-28 14:48                     ` Igor Mammedov
  0 siblings, 0 replies; 51+ messages in thread
From: Igor Mammedov @ 2020-08-28 14:48 UTC (permalink / raw)
  To: Babu Moger
  Cc: Daniel P. Berrangé,
	ehabkost, mst, Michal Privoznik, qemu-devel,
	Dr. David Alan Gilbert, pbonzini, rth

On Fri, 28 Aug 2020 09:17:42 -0500
Babu Moger <babu.moger@amd.com> wrote:

> > -----Original Message-----
> > From: Igor Mammedov <imammedo@redhat.com>
> > Sent: Friday, August 28, 2020 6:25 AM
> > To: Daniel P. Berrangé <berrange@redhat.com>
> > Cc: Moger, Babu <Babu.Moger@amd.com>; Dr. David Alan Gilbert
> > <dgilbert@redhat.com>; ehabkost@redhat.com; mst@redhat.com; Michal
> > Privoznik <mprivozn@redhat.com>; qemu-devel@nongnu.org;
> > pbonzini@redhat.com; rth@twiddle.net
> > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> > generic decode
> > 
> > On Fri, 28 Aug 2020 09:58:03 +0100
> > Daniel P. Berrangé <berrange@redhat.com> wrote:
> >   
> > > On Thu, Aug 27, 2020 at 10:21:10PM +0200, Igor Mammedov wrote:  
> > > > On Wed, 26 Aug 2020 13:45:51 -0500
> > > > Babu Moger <babu.moger@amd.com> wrote:
> > > >  
> > > > >
> > > > >  
> > > > > > -----Original Message-----
> > > > > > From: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > > > > Sent: Wednesday, August 26, 2020 1:34 PM
> > > > > > To: Moger, Babu <Babu.Moger@amd.com>
> > > > > > Cc: Igor Mammedov <imammedo@redhat.com>; Daniel P. Berrangé
> > > > > > <berrange@redhat.com>; ehabkost@redhat.com; mst@redhat.com;
> > > > > > > Michal Privoznik <mprivozn@redhat.com>; qemu-devel@nongnu.org;
> > > > > > > pbonzini@redhat.com; rth@twiddle.net
> > > > > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and
> > > > > > use generic decode
> > > > > >
> > > > > > * Babu Moger (babu.moger@amd.com) wrote:  
> > > > > > >  
> > > > > > > > -----Original Message-----
> > > > > > > > From: Igor Mammedov <imammedo@redhat.com>
> > > > > > > > Sent: Wednesday, August 26, 2020 8:31 AM
> > > > > > > > To: Daniel P. Berrangé <berrange@redhat.com>
> > > > > > > > > Cc: Moger, Babu <Babu.Moger@amd.com>; pbonzini@redhat.com;
> > > > > > > > > rth@twiddle.net; ehabkost@redhat.com; qemu-devel@nongnu.org;
> > > > > > > > > mst@redhat.com; Michal Privoznik <mprivozn@redhat.com>
> > > > > > > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode
> > > > > > > > and use generic decode
> > > > > > > >
> > > > > > > > On Wed, 26 Aug 2020 13:50:59 +0100 Daniel P. Berrangé
> > > > > > > > <berrange@redhat.com> wrote:
> > > > > > > >  
> > > > > > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> > > > > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500 Babu Moger
> > > > > > > > > > <babu.moger@amd.com> wrote:
> > > > > > > > > >  
> > > > > > > > > > > > To support some of the complex topology, we introduced
> > > > > > > > > > > > EPYC mode apicid decode.
> > > > > > > > > > > > But, EPYC mode decode is running into problems. Also
> > > > > > > > > > > > it can become quite a maintenance problem in the
> > > > > > > > > > > > future. So, it was decided to remove that code and use
> > > > > > > > > > > > the generic decode which works for majority of the
> > > > > > > > > > > > topology. Most of the SPECed configuration would work
> > > > > > > > > > > > just fine. With some non-SPECed user inputs, it will
> > > > > > > > > > > > create some sub-optimal configuration.
> > > > > > > > > > > > Here is the discussion thread.
> > > > > > > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > > > > >
> > > > > > > > > > > This series removes all the EPYC mode specific apicid
> > > > > > > > > > > changes and use the generic apicid decode.  
> > > > > > > > > >
> > > > > > > > > > the main difference between EPYC and all other CPUs is
> > > > > > > > > > that it requires numa configuration (it's not optional)
> > > > > > > > > > so we need an extra  
> > > > > > > No, That is not true. Because of that assumption we made all
> > > > > > > these apicid changes. And here we are now.
> > > > > > >
> > > > > > > AMD supports various mixed configurations. In case of
> > > > > > > EPYC-Rome, we have NPS1, NPS2 and NPS4 (NUMA Nodes per socket).
> > > > > > > In case of NPS1, basically we have all the cores in a socket
> > > > > > > under one numa node. This is non-numa configuration.
> > > > > > > Looking at the various configurations and also discussing
> > > > > > > internally, it is not advisable to have (epyc && !numa) check.  
> > > > > >
> > > > > > Indeed on real hardware, I don't think we always see NUMA; my
> > > > > > single socket,
> > > > > > 16 core/32 thread 7302P Dell box, shows the kernel printing 'No
> > > > > > NUMA configuration found...Faking a node.'  
> > > > looks like firmware bug or maybe it's feature and there is a knob in
> > > > fw to turn it on/off in case used OS doesn't like it for some reason.
> > > >
> > > >  
> > > > > > So if real hardware hasn't got a NUMA node, what's the real problem?  
> > > > >
> > > > > I don't see any problem once we revert all these changes(patch 1-7).
> > > > > We don't need if (epyc && !numa) error check or
> > > > > auto_enable_numa=true unconditionally.  
> > > >
> > > > We need revert to unbreak migration from QEMU < 5.0, everything else
> > > > (fixes for CPUID_Fn8000001E) could go on top.
> > > >
> > > > So what's on top (because old code also wasn't correct when
> > > > CPUID_Fn8000001E is taken in account, that's why we are at this
> > > > point),
> > > >
> > > > When starting QEMU without -numa
> > > > Indeed we can skip "if (epyc && !numa) error check or
> > > > auto_enable_numa=true", in case where there is 1 die (NPS1).
> > > > (1) User however may set core/threads number bigger than possible by spec,
> > > >     in which case CPUID_Fn8000001E_EBX/CPUID_Fn8000001E_ECX will not be
> > > >     valid spec-wise and could trigger oopses in guest kernel.
> > > >     Given we allow going out of spec, perhaps we should add a warning at
> > > >     realize time saying that used -smp config is not supported since it
> > > >     doesn't match AMD EPYC spec and might not work.
> > > >
> > > > (2) Earlier we agreed that we can reuse existing die_id instead of internal
> > > >     (topo_ids.node_id in current code)
> > > >     (It is called DIE_ID and NODE ID in spec interchangeably)
> > > >     Same as (1) add a warning when '-smp dies' goes beyond spec limits.
> > > >
> > > > (3) "-smp dies>1" ''if'' we allow to run it without -numa,
> > > >     then system wide NUMA node id in CPUID_Fn8000001E_ECX probably doesn't matter.
> > > >     could be something like in spec but taking in account die offset, to produce
> > > >     unique id.
> > > >
> > > >     Same, add a warning that there are more than 1 dies but numa is not enabled,
> > > >     suggest to enable numa.
> > > >
> > > >     With current code it produces invalid APIC ID for valid '-smp' combination,
> > > >     however if we revert it and switch to die_id then it should produce
> > > >     valid APIC ID once again (as in 4.2).
> > > >     Given it produces invalid APIC id, maybe we should just ditch the case and
> > > >     fold it in (4) (i.e. require -numa if "-smp dies>1")
> > > >
> > > > (4) -numa is used (RHBZ1728166)
> > > >     we need to ensure that socket*dies == ms->numa_state->num_nodes
> > > >      and make sure that CPUID_Fn8000001E_ECX consistent with
> > > >     cpu mapping provided with "-numa cpu=" option.  
> > >
> > > Why do we need to socket*dies == ms->numa_state->num_nodes ? That
> > > doesn't seem to be the case in bare metal EPYC nodes I've used which
> > > lets you configure how many NUMA nodes in firmware.  
> > 
> > (From dumps Babu has provided earlier, it was dies == nodes and
> > CPUID_Fn8000001E_ECX == numa node ids in SRAT.)  
> 
> Yes, That is correct. In most cases dies == nodes.
> 
> But that is going to change. In future (even in EPYC-Rome) with new f/w
> BIOS option, users can configure their numa node. It will give the option
> to keep NPS1, NPS2 or NPS4 (Nodes per socket). In those cases dies and
> nodes will not match. That is why I wanted to keep them separate. User can
> change dies or -numa to match their bios config.

if real hw will do that, then that's fine.
it will be hw vendor who will be fixing issues if any when it comes to guest OS
(i.e. Windows)
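
For reference, the consistency check debated in this thread (sockets * dies
versus the number of NUMA nodes, reported as a warning rather than an error)
could look roughly like the sketch below. The function name and parameters
are hypothetical; inside QEMU the message would presumably go through
warn_report() rather than fprintf(), which is used here only to keep the
example self-contained.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: returns true when the -smp topology and the -numa
 * configuration agree, and prints a warning (not an error) when they do not.
 * numa_nodes == 0 means no -numa option was given, so there is nothing to
 * compare against.
 */
static bool epyc_numa_matches_dies(unsigned sockets, unsigned dies_per_socket,
                                   unsigned numa_nodes)
{
    if (numa_nodes == 0) {
        return true;
    }
    if (sockets * dies_per_socket == numa_nodes) {
        return true;
    }
    fprintf(stderr,
            "warning: %u NUMA node(s) configured but -smp describes "
            "%u socket(s) x %u die(s); CPUID 0x8000001e may not match "
            "the NUMA layout\n",
            numa_nodes, sockets, dies_per_socket);
    return false;
}
```

The example from earlier in the thread (-smp 8,sockets=1,cores=8 with two
-numa nodes and dies left at 1) would trigger the warning, while the same
command line with dies=2 would not.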


> > dumping CPUID_Fn8000001E and SRAT table for such configs will help us to
> > figure out if we need socket*dies != nodes and how to compose config where
> > SRAT differs from CPUID_Fn8000001E_ECX.
> > 
> > Babu, can you provide CPUID_Fn8000001E and SRAT dumps for above configs
> > combinations? Or point to some spec/guide how it should be.
> 
> I don't have the config right now. But I will try to get one.
> 
> > 
> >   
> > >
> > >
> > > Regards,
> > > Daniel  
> 
> 




* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-28  8:55                   ` Daniel P. Berrangé
@ 2020-08-28 16:29                     ` Eduardo Habkost
  2020-08-28 16:32                       ` Daniel P. Berrangé
  0 siblings, 1 reply; 51+ messages in thread
From: Eduardo Habkost @ 2020-08-28 16:29 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: mst, Michal Privoznik, qemu-devel, Babu Moger, pbonzini,
	Igor Mammedov, rth

On Fri, Aug 28, 2020 at 09:55:33AM +0100, Daniel P. Berrangé wrote:
> On Thu, Aug 27, 2020 at 10:55:26PM +0200, Igor Mammedov wrote:
> > On Thu, 27 Aug 2020 15:07:52 -0400
> > Eduardo Habkost <ehabkost@redhat.com> wrote:
> > 
> > > On Thu, Aug 27, 2020 at 07:03:14PM +0200, Igor Mammedov wrote:
> > > > On Wed, 26 Aug 2020 16:03:40 +0100
> > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > 
> > > > > On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> > > > > > On Wed, 26 Aug 2020 14:36:38 +0100
> > > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > > 
> > > > > > > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > > > > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > > > >   
> > > > > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > > > > > > Babu Moger <babu.moger@amd.com> wrote:
> > > > > > > > > >     
> > > > > > > > > > > To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > > > > > > > > But, EPYC mode decode is running into problems. Also it can become quite a
> > > > > > > > > > > maintenance problem in the future. So, it was decided to remove that code and
> > > > > > > > > > > use the generic decode which works for majority of the topology. Most of the
> > > > > > > > > > > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > > > > > > > > it will create some sub-optimal configuration.
> > > > > > > > > > > Here is the discussion thread.
> > > > > > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > > > > > 
> > > > > > > > > > > This series removes all the EPYC mode specific apicid changes and use the generic
> > > > > > > > > > > apicid decode.    
> > > > > > > > > > 
> > > > > > > > > > the main difference between EPYC and all other CPUs is that
> > > > > > > > > > it requires numa configuration (it's not optional)
> > > > > > > > > > > so we need an extra patch on top of this series to enforce that, i.e:
> > > > > > > > > > 
> > > > > > > > > >  if (epyc && !numa) 
> > > > > > > > > >     error("EPYC cpu requires numa to be configured")    
> > > > > > > > > 
> > > > > > > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > > > > > > real world QEMU deployments. That is way too user hostile to introduce
> > > > > > > > > as a requirement.
> > > > > > > > > 
> > > > > > > > > > Why do we need to force this ?  People have been successfully using
> > > > > > > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > > > > > > 
> > > > > > > > > It might not match behaviour of bare metal silicon, but that hasn't
> > > > > > > > > obviously caused the world to come crashing down.  
> > > > > > > > So far it produces warning in linux kernel (RHBZ1728166),
> > > > > > > > (resulting performance might be suboptimal), but I haven't seen
> > > > > > > > anyone reporting crashes yet.
> > > > > > > > 
> > > > > > > > 
> > > > > > > > What other options do we have?
> > > > > > > > Perhaps we can turn on strict check for new machine types only,
> > > > > > > > so old configs can keep broken topology (CPUID),
> > > > > > > > while new ones would require -numa and produce correct topology.  
> > > > > > > 
> > > > > > > > No, tying this to machine types is not viable either. That is still
> > > > > > > going to break essentially every single management application that
> > > > > > > exists today using QEMU.
> > > > > > for that we have deprecation process, so users could switch to new CLI
> > > > > > that would be required.
> > > > > 
> > > > > We could, but I don't find the cost/benefit tradeoff is compelling.
> > > > > 
> > > > > There are so many places where we diverge from what bare metal would
> > > > > do, that I don't see a good reason to introduce this breakage, even
> > > > > if we notify users via a deprecation message. 
> > > > I find (3) and (4) good enough reasons to use deprecation.
> > > > 
> > > > > If QEMU wants to require NUMA for EPYC, then QEMU could internally
> > > > > create a single NUMA node if none was specified for new machine
> > > > > types, such that there is no visible change or breakage to any
> > > > > mgmt apps.  
> > > > 
> > > > (1) for configs that started without -numa &&|| without -smp dies>1,
> > > >       QEMU can do just that (enable auto_enable_numa).
> > > 
> > > Why exactly do we need auto_enable_numa with dies=1?
> > > 
> > > If I understand correctly, Babu said earlier in this thread[1]
> > > that we don't need auto_enable_numa.
> > > 
> > > [1] https://lore.kernel.org/qemu-devel/11489e5f-2285-ddb4-9c35-c9f522d603a0@amd.com/
> > 
> > in case of 1 die, -numa is not must have as it's one numa node only.
> > Though having auto_enable_numa, will allow to reuse the CPU.node-id property
> > to compose CPUID_Fn8000001E_ECX. i.e only code one path vs numa|non-numa variant.
> > 
> >  
> > > > (2) As for configs that are out of spec, I do not care much (junk in - junk out)
> > > > (though not having to spend time on bug reports and debug issues, just to say
> > > > it's not supported in the end, makes deprecation sound like a reasonable
> > > > choice)
> > > > 
> > > > (3) However if config matches bare metal i.e. CPU has more than 1 die and within
> > > > dies limits (spec wise), QEMU has to produce valid CPUs.
> > > > In this case QEMU can't make up multiple numa nodes and mappings of RAM/CPUs
> > > > on user's behalf. That's where we have to error out and ask for explicit
> > > > numa configuration.
> > > > 
> > > > For such configs, current code (since 5.0), will produce in the best case
> > > > performance issues  due to mismatching data in APICID, CPUID and ACPI tables,
> > > > in the worst case issues might be related to invalid APIC ID if running on EPYC host
> > > > and HW takes in account subfields of APIC ID (according to Babu real CPU uses
> > > > die_id(aka node_id) internally).
> > > > I'd rather error out on nonsense configs earlier than debug such issues
> > > > > and then error out anyways later (upsetting more users).
> > > > 
> > > 
> > > The requirements are not clear to me.  Is this just about making
> > > CPU die_id match the NUMA node ID, or are there additional
> > > constraints?
> > die_id is per socket numa node index, so it's not numa node id in
> > a sense we use it in qemu
> > (that's where all the confusion started that led to current code)
> > 
> > I understood that each die in EPYC chip is a numa node, which encodes
> > NUMA node ID (system wide) in CPUID_Fn8000001E_ECX, that's why I
> > wrote earlier that EPYC makes -numa non optional.
> 
> AFAIK, that isn't a hard requirement.  In bare metal EPYC machine I
> have used, the BIOS lets you choose whether the dies are exposed as
> 1, 2 or 4 NUMA nodes. So there's no fixed  die == numa node mapping
> that I see.

If you change that setting, will all CPUID bits be kept the same,
or will the die topology seen by the OS change?

-- 
Eduardo




* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-28 16:29                     ` Eduardo Habkost
@ 2020-08-28 16:32                       ` Daniel P. Berrangé
  2020-08-28 16:45                         ` Eduardo Habkost
  0 siblings, 1 reply; 51+ messages in thread
From: Daniel P. Berrangé @ 2020-08-28 16:32 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: mst, Michal Privoznik, qemu-devel, Babu Moger, pbonzini,
	Igor Mammedov, rth

On Fri, Aug 28, 2020 at 12:29:31PM -0400, Eduardo Habkost wrote:
> On Fri, Aug 28, 2020 at 09:55:33AM +0100, Daniel P. Berrangé wrote:
> > On Thu, Aug 27, 2020 at 10:55:26PM +0200, Igor Mammedov wrote:
> > > On Thu, 27 Aug 2020 15:07:52 -0400
> > > Eduardo Habkost <ehabkost@redhat.com> wrote:
> > > 
> > > > On Thu, Aug 27, 2020 at 07:03:14PM +0200, Igor Mammedov wrote:
> > > > > On Wed, 26 Aug 2020 16:03:40 +0100
> > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > 
> > > > > > On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> > > > > > > On Wed, 26 Aug 2020 14:36:38 +0100
> > > > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > > > 
> > > > > > > > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > > > > > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > > > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > > > > >   
> > > > > > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > > > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > > > > > > > Babu Moger <babu.moger@amd.com> wrote:
> > > > > > > > > > >     
> > > > > > > > > > > > To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > > > > > > > > > But, EPYC mode decode is running into problems. Also it can become quite a
> > > > > > > > > > > > maintenance problem in the future. So, it was decided to remove that code and
> > > > > > > > > > > > use the generic decode which works for majority of the topology. Most of the
> > > > > > > > > > > > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > > > > > > > > > it will create some sub-optimal configuration.
> > > > > > > > > > > > Here is the discussion thread.
> > > > > > > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > > > > > > 
> > > > > > > > > > > > This series removes all the EPYC mode specific apicid changes and use the generic
> > > > > > > > > > > > apicid decode.    
> > > > > > > > > > > 
> > > > > > > > > > > the main difference between EPYC and all other CPUs is that
> > > > > > > > > > > it requires numa configuration (it's not optional)
> > > > > > > > > > > > so we need an extra patch on top of this series to enforce that, i.e:
> > > > > > > > > > > 
> > > > > > > > > > >  if (epyc && !numa) 
> > > > > > > > > > >     error("EPYC cpu requires numa to be configured")    
> > > > > > > > > > 
> > > > > > > > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > > > > > > > real world QEMU deployments. That is way too user hostile to introduce
> > > > > > > > > > as a requirement.
> > > > > > > > > > 
> > > > > > > > > > > Why do we need to force this ?  People have been successfully using
> > > > > > > > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > > > > > > > 
> > > > > > > > > > It might not match behaviour of bare metal silicon, but that hasn't
> > > > > > > > > > obviously caused the world to come crashing down.  
> > > > > > > > > So far it produces warning in linux kernel (RHBZ1728166),
> > > > > > > > > (resulting performance might be suboptimal), but I haven't seen
> > > > > > > > > anyone reporting crashes yet.
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > What other options do we have?
> > > > > > > > > Perhaps we can turn on strict check for new machine types only,
> > > > > > > > > so old configs can keep broken topology (CPUID),
> > > > > > > > > while new ones would require -numa and produce correct topology.  
> > > > > > > > 
> > > > > > > > > No, tying this to machine types is not viable either. That is still
> > > > > > > > going to break essentially every single management application that
> > > > > > > > exists today using QEMU.
> > > > > > > for that we have deprecation process, so users could switch to new CLI
> > > > > > > that would be required.
> > > > > > 
> > > > > > We could, but I don't find the cost/benefit tradeoff is compelling.
> > > > > > 
> > > > > > There are so many places where we diverge from what bare metal would
> > > > > > do, that I don't see a good reason to introduce this breakage, even
> > > > > > if we notify users via a deprecation message. 
> > > > > I find (3) and (4) good enough reasons to use deprecation.
> > > > > 
> > > > > > If QEMU wants to require NUMA for EPYC, then QEMU could internally
> > > > > > create a single NUMA node if none was specified for new machine
> > > > > > types, such that there is no visible change or breakage to any
> > > > > > mgmt apps.  
> > > > > 
> > > > > (1) for configs that started without -numa &&|| without -smp dies>1,
> > > > >       QEMU can do just that (enable auto_enable_numa).
> > > > 
> > > > Why exactly do we need auto_enable_numa with dies=1?
> > > > 
> > > > If I understand correctly, Babu said earlier in this thread[1]
> > > > that we don't need auto_enable_numa.
> > > > 
> > > > [1] https://lore.kernel.org/qemu-devel/11489e5f-2285-ddb4-9c35-c9f522d603a0@amd.com/
> > > 
> > > in case of 1 die, -numa is not must have as it's one numa node only.
> > > Though having auto_enable_numa, will allow to reuse the CPU.node-id property
> > > to compose CPUID_Fn8000001E_ECX. i.e only code one path vs numa|non-numa variant.
> > > 
> > >  
> > > > > (2) As for configs that are out of spec, I do not care much (junk in - junk out)
> > > > > (though not having to spend time on bug reports and debug issues, just to say
> > > > > it's not supported in the end, makes deprecation sound like a reasonable
> > > > > choice)
> > > > > 
> > > > > (3) However if config matches bare metal i.e. CPU has more than 1 die and within
> > > > > dies limits (spec wise), QEMU has to produce valid CPUs.
> > > > > In this case QEMU can't make up multiple numa nodes and mappings of RAM/CPUs
> > > > > on user's behalf. That's where we have to error out and ask for explicit
> > > > > numa configuration.
> > > > > 
> > > > > For such configs, current code (since 5.0), will produce in the best case
> > > > > performance issues  due to mismatching data in APICID, CPUID and ACPI tables,
> > > > > in the worst case issues might be related to invalid APIC ID if running on EPYC host
> > > > > and HW takes in account subfields of APIC ID (according to Babu real CPU uses
> > > > > die_id(aka node_id) internally).
> > > > > I'd rather error out on nonsense configs earlier than debug such issues
> > > > > > and then error out anyways later (upsetting more users).
> > > > > 
> > > > 
> > > > The requirements are not clear to me.  Is this just about making
> > > > CPU die_id match the NUMA node ID, or are there additional
> > > > constraints?
> > > die_id is per socket numa node index, so it's not numa node id in
> > > a sense we use it in qemu
> > > (that's where all the confusion started that led to current code)
> > > 
> > > I understood that each die in EPYC chip is a numa node, which encodes
> > > NUMA node ID (system wide) in CPUID_Fn8000001E_ECX, that's why I
> > > wrote earlier that EPYC makes -numa non optional.
> > 
> > AFAIK, that isn't a hard requirement.  In bare metal EPYC machine I
> > have used, the BIOS lets you choose whether the dies are exposed as
> > 1, 2 or 4 NUMA nodes. So there's no fixed  die == numa node mapping
> > that I see.
> 
> If you change that setting, will all CPUID bits be kept the same,
> or will the die topology seen by the OS change?

I don't know offhand, and don't currently have access to the hardware.
All I know is that I was able to change between 1, 2 and 4 NUMA nodes
and that was reflected in the numactl display. I didn't check the CPUID
when I was testing previously.
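
For whoever does get access to such a machine, comparing the raw
CPUID 0x8000001e registers across the NPS settings would answer the question.
Below is a small sketch that decodes the fields relevant here. The bit layout
is my reading of AMD's Family 17h documentation and should be treated as an
assumption to verify against the spec; the struct and function names are made
up for illustration. On a host, the raw register values could be obtained
with something like __get_cpuid() from GCC's <cpuid.h>.

```c
#include <stdint.h>

/* Decoded view of CPUID leaf 0x8000001e (field layout is an assumption). */
struct leaf_8000001e {
    uint32_t ext_apic_id;       /* EAX: extended APIC ID */
    unsigned core_id;           /* EBX[7:0] */
    unsigned threads_per_core;  /* EBX[15:8] + 1 */
    unsigned node_id;           /* ECX[7:0] */
    unsigned nodes_per_pkg;     /* ECX[10:8] + 1 */
};

/* Split the raw register values into the fields discussed in this thread. */
static struct leaf_8000001e decode_8000001e(uint32_t eax, uint32_t ebx,
                                            uint32_t ecx)
{
    struct leaf_8000001e d;

    d.ext_apic_id = eax;
    d.core_id = ebx & 0xFF;
    d.threads_per_core = ((ebx >> 8) & 0xFF) + 1;
    d.node_id = ecx & 0xFF;
    d.nodes_per_pkg = ((ecx >> 8) & 0x7) + 1;
    return d;
}
```

If node_id and nodes_per_pkg change when the BIOS switches between 1, 2 and
4 NUMA nodes while the APIC IDs stay the same, that would suggest the node
encoding is independent of the die topology.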


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-28 16:32                       ` Daniel P. Berrangé
@ 2020-08-28 16:45                         ` Eduardo Habkost
  2020-08-28 18:00                           ` Babu Moger
  0 siblings, 1 reply; 51+ messages in thread
From: Eduardo Habkost @ 2020-08-28 16:45 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: mst, Michal Privoznik, qemu-devel, Babu Moger, pbonzini,
	Igor Mammedov, rth

On Fri, Aug 28, 2020 at 05:32:51PM +0100, Daniel P. Berrangé wrote:
> On Fri, Aug 28, 2020 at 12:29:31PM -0400, Eduardo Habkost wrote:
> > On Fri, Aug 28, 2020 at 09:55:33AM +0100, Daniel P. Berrangé wrote:
> > > On Thu, Aug 27, 2020 at 10:55:26PM +0200, Igor Mammedov wrote:
> > > > On Thu, 27 Aug 2020 15:07:52 -0400
> > > > Eduardo Habkost <ehabkost@redhat.com> wrote:
> > > > 
> > > > > On Thu, Aug 27, 2020 at 07:03:14PM +0200, Igor Mammedov wrote:
> > > > > > On Wed, 26 Aug 2020 16:03:40 +0100
> > > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > > 
> > > > > > > On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> > > > > > > > On Wed, 26 Aug 2020 14:36:38 +0100
> > > > > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > > > > 
> > > > > > > > > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > > > > > > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > > > > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > > > > > >   
> > > > > > > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > > > > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > > > > > > > > Babu Moger <babu.moger@amd.com> wrote:
> > > > > > > > > > > >     
> > > > > > > > > > > > > To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > > > > > > > > > > But, EPYC mode decode is running into problems. Also it can become quite a
> > > > > > > > > > > > > maintenance problem in the future. So, it was decided to remove that code and
> > > > > > > > > > > > > use the generic decode which works for majority of the topology. Most of the
> > > > > > > > > > > > > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > > > > > > > > > > it will create some sub-optimal configuration.
> > > > > > > > > > > > > Here is the discussion thread.
> > > > > > > > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > > > > > > > 
> > > > > > > > > > > > > This series removes all the EPYC mode specific apicid changes and use the generic
> > > > > > > > > > > > > apicid decode.    
> > > > > > > > > > > > 
> > > > > > > > > > > > the main difference between EPYC and all other CPUs is that
> > > > > > > > > > > > it requires numa configuration (it's not optional)
> > > > > > > > > > > > > so we need an extra patch on top of this series to enforce that, i.e:
> > > > > > > > > > > > 
> > > > > > > > > > > >  if (epyc && !numa) 
> > > > > > > > > > > >     error("EPYC cpu requires numa to be configured")    
> > > > > > > > > > > 
> > > > > > > > > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > > > > > > > > real world QEMU deployments. That is way too user hostile to introduce
> > > > > > > > > > > as a requirement.
> > > > > > > > > > > 
> > > > > > > > > > > > Why do we need to force this ?  People have been successfully using
> > > > > > > > > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > > > > > > > > 
> > > > > > > > > > > It might not match behaviour of bare metal silicon, but that hasn't
> > > > > > > > > > > obviously caused the world to come crashing down.  
> > > > > > > > > > So far it produces warning in linux kernel (RHBZ1728166),
> > > > > > > > > > (resulting performance might be suboptimal), but I haven't seen
> > > > > > > > > > anyone reporting crashes yet.
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > What other options do we have?
> > > > > > > > > > Perhaps we can turn on strict check for new machine types only,
> > > > > > > > > > so old configs can keep broken topology (CPUID),
> > > > > > > > > > while new ones would require -numa and produce correct topology.  
> > > > > > > > > 
> > > > > > > > > > No, tying this to machine types is not viable either. That is still
> > > > > > > > > going to break essentially every single management application that
> > > > > > > > > exists today using QEMU.
> > > > > > > > for that we have deprecation process, so users could switch to new CLI
> > > > > > > > that would be required.
> > > > > > > 
> > > > > > > We could, but I don't find the cost/benefit tradeoff is compelling.
> > > > > > > 
> > > > > > > There are so many places where we diverge from what bare metal would
> > > > > > > do, that I don't see a good reason to introduce this breakage, even
> > > > > > > if we notify users via a deprecation message. 
> > > > > > I find (3) and (4) good enough reasons to use deprecation.
> > > > > > 
> > > > > > > If QEMU wants to require NUMA for EPYC, then QEMU could internally
> > > > > > > create a single NUMA node if none was specified for new machine
> > > > > > > types, such that there is no visible change or breakage to any
> > > > > > > mgmt apps.  
> > > > > > 
> > > > > > (1) for configs that started without -numa &&|| without -smp dies>1,
> > > > > >       QEMU can do just that (enable auto_enable_numa).
> > > > > 
> > > > > Why exactly do we need auto_enable_numa with dies=1?
> > > > > 
> > > > > If I understand correctly, Babu said earlier in this thread[1]
> > > > > that we don't need auto_enable_numa.
> > > > > 
> > > > > [1] https://lore.kernel.org/qemu-devel/11489e5f-2285-ddb4-9c35-c9f522d603a0@amd.com/
> > > > 
> > > > in case of 1 die, -numa is not a must-have as it's one numa node only.
> > > > Though having auto_enable_numa, will allow to reuse the CPU.node-id property
> > > > to compose CPUID_Fn8000001E_ECX. i.e only code one path vs numa|non-numa variant.
> > > > 
> > > >  
> > > > > > (2) As for configs that are out of spec, I do not care much (junk in - junk out)
> > > > > > (though not having to spend time on bug reports and debug issues, just to say
> > > > > > it's not supported in the end, makes deprecation sound like a reasonable
> > > > > > choice)
> > > > > > 
> > > > > > (3) However if config matches bare metal i.e. CPU has more than 1 die and within
> > > > > > dies limits (spec wise), QEMU has to produce valid CPUs.
> > > > > > In this case QEMU can't make up multiple numa nodes and mappings of RAM/CPUs
> > > > > > on user's behalf. That's where we have to error out and ask for explicit
> > > > > > numa configuration.
> > > > > > 
> > > > > > For such configs, current code (since 5.0), will produce in the best case
> > > > > > performance issues  due to mismatching data in APICID, CPUID and ACPI tables,
> > > > > > in the worst case issues might be related to invalid APIC ID if running on EPYC host
> > > > > > and HW takes in account subfields of APIC ID (according to Babu real CPU uses
> > > > > > die_id(aka node_id) internally).
> > > > > > I'd rather error out on nonsense configs earlier than debug such issues
> > > > > > > and then error out anyway later (upsetting more users).
> > > > > > 
> > > > > 
> > > > > The requirements are not clear to me.  Is this just about making
> > > > > CPU die_id match the NUMA node ID, or are there additional
> > > > > constraints?
> > > > die_id is per socket numa node index, so it's not numa node id in
> > > > a sense we use it in qemu
> > > > (that's where all the confusion started that led to current code)
> > > > 
> > > > I understood that each die in EPYC chip is a numa node, which encodes
> > > > NUMA node ID (system wide) in CPUID_Fn8000001E_ECX, that's why I
> > > > wrote earlier that EPYC makes -numa non optional.
> > > 
> > > AFAIK, that isn't a hard requirement.  In bare metal EPYC machine I
> > > have used, the BIOS lets you choose whether the dies are exposed as
> > > 1, 2 or 4 NUMA nodes. So there's no fixed  die == numa node mapping
> > > that I see.
> > 
> > If you change that setting, will all CPUID bits be kept the same,
> > or the die topology seen by the OS will change?
> 
> I don't know offhand, and don't currently have access to the hardware.
> All I know is that I was able to change between 1, 2 and 4 NUMA nodes
> and that was reflected in numactl display, I didn't check the CPUID
> when I was testing previously.

Babu, do you know the answer here?

If CPUID is kept the same with 1, 2 and 4 NUMA nodes, then having
NUMA configured is not a requirement at all.

-- 
Eduardo




* Re: [PATCH v5 7/8] Revert "hw/386: Add EPYC mode topology decoding functions"
  2020-08-21 22:13 ` [PATCH v5 7/8] Revert "hw/386: Add EPYC mode topology decoding functions" Babu Moger
@ 2020-08-28 17:27   ` Eduardo Habkost
  2020-08-28 18:03     ` Babu Moger
  0 siblings, 1 reply; 51+ messages in thread
From: Eduardo Habkost @ 2020-08-28 17:27 UTC (permalink / raw)
  To: Babu Moger; +Cc: mst, qemu-devel, imammedo, pbonzini, rth

On Fri, Aug 21, 2020 at 05:13:03PM -0500, Babu Moger wrote:
> Remove the EPYC specific apicid decoding and use the generic
> default decoding.
> 
> This reverts commit 7568b205555a6405042f62c64af3268f4330aed5.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
[...]
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 19198e3e7f..b29686220e 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -388,7 +388,7 @@ static void encode_topo_cpuid8000001e(X86CPUTopoInfo *topo_info, X86CPU *cpu,
>      unsigned long dies = topo_info->dies_per_pkg;
>      int shift;
>  
> -    x86_topo_ids_from_apicid_epyc(cpu->apic_id, topo_info, &topo_ids);
> +    x86_topo_ids_from_apicid(cpu->apic_id, topo_info, &topo_ids);

This was not part of commit 7568b205555a6405042f62c64af3268f4330aed5.
I suggest doing this change as a separate patch, to make review easier.

That line was added by commit dd08ef0318e2
("target/i386: Cleanup and use the EPYC mode topology functions").
Wouldn't it be simpler to revert that commit?  If there are parts
of commit dd08ef0318e2 we want to keep, they can be re-added
in a separate patch.
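
For reference, the generic decode that the hunk above switches to splits
the APIC ID into SMT, core, die and package fields. The following is only
a sketch of that idea, under the assumption that each field is just wide
enough to hold its count and that the fields are packed from the least
significant bit upwards. The type and function names are hypothetical and
do not reproduce QEMU's actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical topology description (not QEMU's X86CPUTopoInfo). */
struct topo_info {
    unsigned dies_per_pkg;
    unsigned cores_per_die;
    unsigned threads_per_core;
};

/* Hypothetical decoded IDs (not QEMU's X86CPUTopoIDs). */
struct topo_ids {
    unsigned pkg_id;
    unsigned die_id;
    unsigned core_id;
    unsigned smt_id;
};

/* Bits needed to encode IDs 0..count-1; 0 bits when count is 1. */
static unsigned bitwidth_for_count(unsigned count)
{
    unsigned width = 0;

    assert(count >= 1);
    count -= 1;
    while (count) {
        width++;
        count >>= 1;
    }
    return width;
}

/* Mask with the low 'width' bits set (width is assumed to be < 32). */
static uint32_t low_mask(unsigned width)
{
    return (UINT32_C(1) << width) - 1;
}

/* Split an APIC ID into SMT, core, die and package IDs, lowest bits first. */
static void topo_ids_from_apicid(uint32_t apicid,
                                 const struct topo_info *ti,
                                 struct topo_ids *ids)
{
    unsigned smt_w, core_w, die_w;

    assert(ti != NULL && ids != NULL);
    smt_w = bitwidth_for_count(ti->threads_per_core);
    core_w = bitwidth_for_count(ti->cores_per_die);
    die_w = bitwidth_for_count(ti->dies_per_pkg);

    ids->smt_id = apicid & low_mask(smt_w);
    ids->core_id = (apicid >> smt_w) & low_mask(core_w);
    ids->die_id = (apicid >> (smt_w + core_w)) & low_mask(die_w);
    ids->pkg_id = apicid >> (smt_w + core_w + die_w);
}
```

With 2 threads per core, 64 cores per die and 1 die per package, APIC ID
0x41 splits into package 0, die 0, core 32, thread 1 under this layout.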

-- 
Eduardo




* Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
  2020-08-28 16:45                         ` Eduardo Habkost
@ 2020-08-28 18:00                           ` Babu Moger
  0 siblings, 0 replies; 51+ messages in thread
From: Babu Moger @ 2020-08-28 18:00 UTC (permalink / raw)
  To: ehabkost, babu.moger
  Cc: berrange, mst, mprivozn, qemu-devel, pbonzini, imammedo, rth

Responding to Eduardo's question. Some emails are not coming to my
mailbox for some reason. Responding using git send-email --in-reply-to.


>> > > > I understood that each die in EPYC chip is a numa node, which encodes
>> > > > NUMA node ID (system wide) in CPUID_Fn8000001E_ECX, that's why I
>> > > > wrote earlier that EPYC makes -numa non optional.
>> > > 
>> > > AFAIK, that isn't a hard requirement.  In bare metal EPYC machine I
>> > > have used, the BIOS lets you choose whether the dies are exposed as
>> > > 1, 2 or 4 NUMA nodes. So there's no fixed  die == numa node mapping
>> > > that I see.
>> > 
>> > If you change that setting, will all CPUID bits be kept the same,
>> > or the die topology seen by the OS will change?
>> 
>> I don't know offhand, and don't currently have access to the hardware.
>> All I know is that I was able to change between 1, 2 and 4 NUMA nodes
>> and that was reflected in numactl display, I didn't check the CPUID
>> when I was testing previously.
>
>Babu, do you know the answer here?
>
>If CPUID is kept the same with 1, 2 and 4 NUMA nodes, then having
>NUMA configured is not a requirement at all.

Yes. The CPUID values are kept the same with 1, 2 and 4 NUMA nodes.
So, having NUMA configured is not a requirement. Following are the
details of NPS2 and NPS4. I am seeing the same behaviour with NPS1 also.

NPS2:
================================
#lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              256
On-line CPU(s) list: 0-255
Thread(s) per core:  2
Core(s) per socket:  64
Socket(s):           2
NUMA node(s):        4
Vendor ID:           AuthenticAMD
CPU family:          23
Model:               49
Model name:          AMD EPYC 7742 64-Core Processor
Stepping:            0
CPU MHz:             1785.033
CPU max MHz:         2250.0000
CPU min MHz:         1500.0000
BogoMIPS:            4491.51
Virtualization:      AMD-V
L1d cache:           32K
L1i cache:           32K
L2 cache:            512K
L3 cache:            16384K
NUMA node0 CPU(s):   0-31,128-159
NUMA node1 CPU(s):   32-63,160-191
NUMA node2 CPU(s):   64-95,192-223
NUMA node3 CPU(s):   96-127,224-255

#cpuid -l 0x8000001e -r

CPU 0:
   0x8000001e 0x00: eax=0x00000000 ebx=0x00000100 ecx=0x00000000 edx=0x00000000
CPU 1:
   0x8000001e 0x00: eax=0x00000002 ebx=0x00000101 ecx=0x00000000 edx=0x00000000
CPU 2:
   0x8000001e 0x00: eax=0x00000004 ebx=0x00000102 ecx=0x00000000 edx=0x00000000
CPU 3:
   0x8000001e 0x00: eax=0x00000006 ebx=0x00000103 ecx=0x00000000 edx=0x00000000
CPU 4:
   0x8000001e 0x00: eax=0x00000008 ebx=0x00000104 ecx=0x00000000 edx=0x00000000
CPU 5:
   0x8000001e 0x00: eax=0x0000000a ebx=0x00000105 ecx=0x00000000 edx=0x00000000
CPU 6:
   0x8000001e 0x00: eax=0x0000000c ebx=0x00000106 ecx=0x00000000 edx=0x00000000
CPU 7:
   0x8000001e 0x00: eax=0x0000000e ebx=0x00000107 ecx=0x00000000 edx=0x00000000
CPU 8:
   0x8000001e 0x00: eax=0x00000010 ebx=0x00000108 ecx=0x00000000 edx=0x00000000
CPU 9:
   0x8000001e 0x00: eax=0x00000012 ebx=0x00000109 ecx=0x00000000 edx=0x00000000
CPU 10:
   0x8000001e 0x00: eax=0x00000014 ebx=0x0000010a ecx=0x00000000 edx=0x00000000
CPU 11:
   0x8000001e 0x00: eax=0x00000016 ebx=0x0000010b ecx=0x00000000 edx=0x00000000
CPU 12:
   0x8000001e 0x00: eax=0x00000018 ebx=0x0000010c ecx=0x00000000 edx=0x00000000
CPU 13:
   0x8000001e 0x00: eax=0x0000001a ebx=0x0000010d ecx=0x00000000 edx=0x00000000
CPU 14:
   0x8000001e 0x00: eax=0x0000001c ebx=0x0000010e ecx=0x00000000 edx=0x00000000
CPU 15:
   0x8000001e 0x00: eax=0x0000001e ebx=0x0000010f ecx=0x00000000 edx=0x00000000
CPU 16:
   0x8000001e 0x00: eax=0x00000020 ebx=0x00000110 ecx=0x00000000 edx=0x00000000
CPU 17:
   0x8000001e 0x00: eax=0x00000022 ebx=0x00000111 ecx=0x00000000 edx=0x00000000
CPU 18:
   0x8000001e 0x00: eax=0x00000024 ebx=0x00000112 ecx=0x00000000 edx=0x00000000
CPU 19:
   0x8000001e 0x00: eax=0x00000026 ebx=0x00000113 ecx=0x00000000 edx=0x00000000
CPU 20:
   0x8000001e 0x00: eax=0x00000028 ebx=0x00000114 ecx=0x00000000 edx=0x00000000
CPU 21:
   0x8000001e 0x00: eax=0x0000002a ebx=0x00000115 ecx=0x00000000 edx=0x00000000
CPU 22:
   0x8000001e 0x00: eax=0x0000002c ebx=0x00000116 ecx=0x00000000 edx=0x00000000
CPU 23:
   0x8000001e 0x00: eax=0x0000002e ebx=0x00000117 ecx=0x00000000 edx=0x00000000
CPU 24:
   0x8000001e 0x00: eax=0x00000030 ebx=0x00000118 ecx=0x00000000 edx=0x00000000
CPU 25:
   0x8000001e 0x00: eax=0x00000032 ebx=0x00000119 ecx=0x00000000 edx=0x00000000
CPU 26:
   0x8000001e 0x00: eax=0x00000034 ebx=0x0000011a ecx=0x00000000 edx=0x00000000
CPU 27:
   0x8000001e 0x00: eax=0x00000036 ebx=0x0000011b ecx=0x00000000 edx=0x00000000
CPU 28:
   0x8000001e 0x00: eax=0x00000038 ebx=0x0000011c ecx=0x00000000 edx=0x00000000
CPU 29:
   0x8000001e 0x00: eax=0x0000003a ebx=0x0000011d ecx=0x00000000 edx=0x00000000
CPU 30:
   0x8000001e 0x00: eax=0x0000003c ebx=0x0000011e ecx=0x00000000 edx=0x00000000
CPU 31:
   0x8000001e 0x00: eax=0x0000003e ebx=0x0000011f ecx=0x00000000 edx=0x00000000
CPU 32:
   0x8000001e 0x00: eax=0x00000040 ebx=0x00000120 ecx=0x00000000 edx=0x00000000
CPU 33:
   0x8000001e 0x00: eax=0x00000042 ebx=0x00000121 ecx=0x00000000 edx=0x00000000
CPU 34:
   0x8000001e 0x00: eax=0x00000044 ebx=0x00000122 ecx=0x00000000 edx=0x00000000
CPU 35:
   0x8000001e 0x00: eax=0x00000046 ebx=0x00000123 ecx=0x00000000 edx=0x00000000
CPU 36:
   0x8000001e 0x00: eax=0x00000048 ebx=0x00000124 ecx=0x00000000 edx=0x00000000
CPU 37:
   0x8000001e 0x00: eax=0x0000004a ebx=0x00000125 ecx=0x00000000 edx=0x00000000
CPU 38:
   0x8000001e 0x00: eax=0x0000004c ebx=0x00000126 ecx=0x00000000 edx=0x00000000
CPU 39:
   0x8000001e 0x00: eax=0x0000004e ebx=0x00000127 ecx=0x00000000 edx=0x00000000
CPU 40:
   0x8000001e 0x00: eax=0x00000050 ebx=0x00000128 ecx=0x00000000 edx=0x00000000
CPU 41:
   0x8000001e 0x00: eax=0x00000052 ebx=0x00000129 ecx=0x00000000 edx=0x00000000
CPU 42:
   0x8000001e 0x00: eax=0x00000054 ebx=0x0000012a ecx=0x00000000 edx=0x00000000
CPU 43:
   0x8000001e 0x00: eax=0x00000056 ebx=0x0000012b ecx=0x00000000 edx=0x00000000
CPU 44:
   0x8000001e 0x00: eax=0x00000058 ebx=0x0000012c ecx=0x00000000 edx=0x00000000
CPU 45:
   0x8000001e 0x00: eax=0x0000005a ebx=0x0000012d ecx=0x00000000 edx=0x00000000
CPU 46:
   0x8000001e 0x00: eax=0x0000005c ebx=0x0000012e ecx=0x00000000 edx=0x00000000
CPU 47:
   0x8000001e 0x00: eax=0x0000005e ebx=0x0000012f ecx=0x00000000 edx=0x00000000
CPU 48:
   0x8000001e 0x00: eax=0x00000060 ebx=0x00000130 ecx=0x00000000 edx=0x00000000
CPU 49:
   0x8000001e 0x00: eax=0x00000062 ebx=0x00000131 ecx=0x00000000 edx=0x00000000
CPU 50:
   0x8000001e 0x00: eax=0x00000064 ebx=0x00000132 ecx=0x00000000 edx=0x00000000
CPU 51:
   0x8000001e 0x00: eax=0x00000066 ebx=0x00000133 ecx=0x00000000 edx=0x00000000
CPU 52:
   0x8000001e 0x00: eax=0x00000068 ebx=0x00000134 ecx=0x00000000 edx=0x00000000
CPU 53:
   0x8000001e 0x00: eax=0x0000006a ebx=0x00000135 ecx=0x00000000 edx=0x00000000
CPU 54:
   0x8000001e 0x00: eax=0x0000006c ebx=0x00000136 ecx=0x00000000 edx=0x00000000
CPU 55:
   0x8000001e 0x00: eax=0x0000006e ebx=0x00000137 ecx=0x00000000 edx=0x00000000
CPU 56:
   0x8000001e 0x00: eax=0x00000070 ebx=0x00000138 ecx=0x00000000 edx=0x00000000
CPU 57:
   0x8000001e 0x00: eax=0x00000072 ebx=0x00000139 ecx=0x00000000 edx=0x00000000
CPU 58:
   0x8000001e 0x00: eax=0x00000074 ebx=0x0000013a ecx=0x00000000 edx=0x00000000
CPU 59:
   0x8000001e 0x00: eax=0x00000076 ebx=0x0000013b ecx=0x00000000 edx=0x00000000
CPU 60:
   0x8000001e 0x00: eax=0x00000078 ebx=0x0000013c ecx=0x00000000 edx=0x00000000
CPU 61:
   0x8000001e 0x00: eax=0x0000007a ebx=0x0000013d ecx=0x00000000 edx=0x00000000
CPU 62:
   0x8000001e 0x00: eax=0x0000007c ebx=0x0000013e ecx=0x00000000 edx=0x00000000
CPU 63:
   0x8000001e 0x00: eax=0x0000007e ebx=0x0000013f ecx=0x00000000 edx=0x00000000
CPU 64:
   0x8000001e 0x00: eax=0x00000080 ebx=0x00000100 ecx=0x00000001 edx=0x00000000
CPU 65:
   0x8000001e 0x00: eax=0x00000082 ebx=0x00000101 ecx=0x00000001 edx=0x00000000
CPU 66:
   0x8000001e 0x00: eax=0x00000084 ebx=0x00000102 ecx=0x00000001 edx=0x00000000
CPU 67:
   0x8000001e 0x00: eax=0x00000086 ebx=0x00000103 ecx=0x00000001 edx=0x00000000
CPU 68:
   0x8000001e 0x00: eax=0x00000088 ebx=0x00000104 ecx=0x00000001 edx=0x00000000
CPU 69:
   0x8000001e 0x00: eax=0x0000008a ebx=0x00000105 ecx=0x00000001 edx=0x00000000
CPU 70:
   0x8000001e 0x00: eax=0x0000008c ebx=0x00000106 ecx=0x00000001 edx=0x00000000
CPU 71:
   0x8000001e 0x00: eax=0x0000008e ebx=0x00000107 ecx=0x00000001 edx=0x00000000
CPU 72:
   0x8000001e 0x00: eax=0x00000090 ebx=0x00000108 ecx=0x00000001 edx=0x00000000
CPU 73:
   0x8000001e 0x00: eax=0x00000092 ebx=0x00000109 ecx=0x00000001 edx=0x00000000
CPU 74:
   0x8000001e 0x00: eax=0x00000094 ebx=0x0000010a ecx=0x00000001 edx=0x00000000
CPU 75:
   0x8000001e 0x00: eax=0x00000096 ebx=0x0000010b ecx=0x00000001 edx=0x00000000
CPU 76:
   0x8000001e 0x00: eax=0x00000098 ebx=0x0000010c ecx=0x00000001 edx=0x00000000
CPU 77:
   0x8000001e 0x00: eax=0x0000009a ebx=0x0000010d ecx=0x00000001 edx=0x00000000
CPU 78:
   0x8000001e 0x00: eax=0x0000009c ebx=0x0000010e ecx=0x00000001 edx=0x00000000
CPU 79:
   0x8000001e 0x00: eax=0x0000009e ebx=0x0000010f ecx=0x00000001 edx=0x00000000
CPU 80:
   0x8000001e 0x00: eax=0x000000a0 ebx=0x00000110 ecx=0x00000001 edx=0x00000000
CPU 81:
   0x8000001e 0x00: eax=0x000000a2 ebx=0x00000111 ecx=0x00000001 edx=0x00000000
CPU 82:
   0x8000001e 0x00: eax=0x000000a4 ebx=0x00000112 ecx=0x00000001 edx=0x00000000
CPU 83:
   0x8000001e 0x00: eax=0x000000a6 ebx=0x00000113 ecx=0x00000001 edx=0x00000000
CPU 84:
   0x8000001e 0x00: eax=0x000000a8 ebx=0x00000114 ecx=0x00000001 edx=0x00000000
CPU 85:
   0x8000001e 0x00: eax=0x000000aa ebx=0x00000115 ecx=0x00000001 edx=0x00000000
CPU 86:
   0x8000001e 0x00: eax=0x000000ac ebx=0x00000116 ecx=0x00000001 edx=0x00000000
CPU 87:
   0x8000001e 0x00: eax=0x000000ae ebx=0x00000117 ecx=0x00000001 edx=0x00000000
CPU 88:
   0x8000001e 0x00: eax=0x000000b0 ebx=0x00000118 ecx=0x00000001 edx=0x00000000
CPU 89:
   0x8000001e 0x00: eax=0x000000b2 ebx=0x00000119 ecx=0x00000001 edx=0x00000000
CPU 90:
   0x8000001e 0x00: eax=0x000000b4 ebx=0x0000011a ecx=0x00000001 edx=0x00000000
CPU 91:
   0x8000001e 0x00: eax=0x000000b6 ebx=0x0000011b ecx=0x00000001 edx=0x00000000
CPU 92:
   0x8000001e 0x00: eax=0x000000b8 ebx=0x0000011c ecx=0x00000001 edx=0x00000000
CPU 93:
   0x8000001e 0x00: eax=0x000000ba ebx=0x0000011d ecx=0x00000001 edx=0x00000000
CPU 94:
   0x8000001e 0x00: eax=0x000000bc ebx=0x0000011e ecx=0x00000001 edx=0x00000000
CPU 95:
   0x8000001e 0x00: eax=0x000000be ebx=0x0000011f ecx=0x00000001 edx=0x00000000
CPU 96:
   0x8000001e 0x00: eax=0x000000c0 ebx=0x00000120 ecx=0x00000001 edx=0x00000000
CPU 97:
   0x8000001e 0x00: eax=0x000000c2 ebx=0x00000121 ecx=0x00000001 edx=0x00000000
CPU 98:
   0x8000001e 0x00: eax=0x000000c4 ebx=0x00000122 ecx=0x00000001 edx=0x00000000
CPU 99:
   0x8000001e 0x00: eax=0x000000c6 ebx=0x00000123 ecx=0x00000001 edx=0x00000000
CPU 100:
   0x8000001e 0x00: eax=0x000000c8 ebx=0x00000124 ecx=0x00000001 edx=0x00000000
CPU 101:
   0x8000001e 0x00: eax=0x000000ca ebx=0x00000125 ecx=0x00000001 edx=0x00000000
CPU 102:
   0x8000001e 0x00: eax=0x000000cc ebx=0x00000126 ecx=0x00000001 edx=0x00000000
CPU 103:
   0x8000001e 0x00: eax=0x000000ce ebx=0x00000127 ecx=0x00000001 edx=0x00000000
CPU 104:
   0x8000001e 0x00: eax=0x000000d0 ebx=0x00000128 ecx=0x00000001 edx=0x00000000
CPU 105:
   0x8000001e 0x00: eax=0x000000d2 ebx=0x00000129 ecx=0x00000001 edx=0x00000000
CPU 106:
   0x8000001e 0x00: eax=0x000000d4 ebx=0x0000012a ecx=0x00000001 edx=0x00000000
CPU 107:
   0x8000001e 0x00: eax=0x000000d6 ebx=0x0000012b ecx=0x00000001 edx=0x00000000
CPU 108:
   0x8000001e 0x00: eax=0x000000d8 ebx=0x0000012c ecx=0x00000001 edx=0x00000000
CPU 109:
   0x8000001e 0x00: eax=0x000000da ebx=0x0000012d ecx=0x00000001 edx=0x00000000
CPU 110:
   0x8000001e 0x00: eax=0x000000dc ebx=0x0000012e ecx=0x00000001 edx=0x00000000
CPU 111:
   0x8000001e 0x00: eax=0x000000de ebx=0x0000012f ecx=0x00000001 edx=0x00000000
CPU 112:
   0x8000001e 0x00: eax=0x000000e0 ebx=0x00000130 ecx=0x00000001 edx=0x00000000
CPU 113:
   0x8000001e 0x00: eax=0x000000e2 ebx=0x00000131 ecx=0x00000001 edx=0x00000000
CPU 114:
   0x8000001e 0x00: eax=0x000000e4 ebx=0x00000132 ecx=0x00000001 edx=0x00000000
CPU 115:
   0x8000001e 0x00: eax=0x000000e6 ebx=0x00000133 ecx=0x00000001 edx=0x00000000
CPU 116:
   0x8000001e 0x00: eax=0x000000e8 ebx=0x00000134 ecx=0x00000001 edx=0x00000000
CPU 117:
   0x8000001e 0x00: eax=0x000000ea ebx=0x00000135 ecx=0x00000001 edx=0x00000000
CPU 118:
   0x8000001e 0x00: eax=0x000000ec ebx=0x00000136 ecx=0x00000001 edx=0x00000000
CPU 119:
   0x8000001e 0x00: eax=0x000000ee ebx=0x00000137 ecx=0x00000001 edx=0x00000000
CPU 120:
   0x8000001e 0x00: eax=0x000000f0 ebx=0x00000138 ecx=0x00000001 edx=0x00000000
CPU 121:
   0x8000001e 0x00: eax=0x000000f2 ebx=0x00000139 ecx=0x00000001 edx=0x00000000
CPU 122:
   0x8000001e 0x00: eax=0x000000f4 ebx=0x0000013a ecx=0x00000001 edx=0x00000000
CPU 123:
   0x8000001e 0x00: eax=0x000000f6 ebx=0x0000013b ecx=0x00000001 edx=0x00000000
CPU 124:
   0x8000001e 0x00: eax=0x000000f8 ebx=0x0000013c ecx=0x00000001 edx=0x00000000
CPU 125:
   0x8000001e 0x00: eax=0x000000fa ebx=0x0000013d ecx=0x00000001 edx=0x00000000
CPU 126:
   0x8000001e 0x00: eax=0x000000fc ebx=0x0000013e ecx=0x00000001 edx=0x00000000
CPU 127:
   0x8000001e 0x00: eax=0x000000fe ebx=0x0000013f ecx=0x00000001 edx=0x00000000
CPU 128:
   0x8000001e 0x00: eax=0x00000001 ebx=0x00000100 ecx=0x00000000 edx=0x00000000
CPU 129:
   0x8000001e 0x00: eax=0x00000003 ebx=0x00000101 ecx=0x00000000 edx=0x00000000
CPU 130:
   0x8000001e 0x00: eax=0x00000005 ebx=0x00000102 ecx=0x00000000 edx=0x00000000
CPU 131:
   0x8000001e 0x00: eax=0x00000007 ebx=0x00000103 ecx=0x00000000 edx=0x00000000
CPU 132:
   0x8000001e 0x00: eax=0x00000009 ebx=0x00000104 ecx=0x00000000 edx=0x00000000
CPU 133:
   0x8000001e 0x00: eax=0x0000000b ebx=0x00000105 ecx=0x00000000 edx=0x00000000
CPU 134:
   0x8000001e 0x00: eax=0x0000000d ebx=0x00000106 ecx=0x00000000 edx=0x00000000
CPU 135:
   0x8000001e 0x00: eax=0x0000000f ebx=0x00000107 ecx=0x00000000 edx=0x00000000
CPU 136:
   0x8000001e 0x00: eax=0x00000011 ebx=0x00000108 ecx=0x00000000 edx=0x00000000
CPU 137:
   0x8000001e 0x00: eax=0x00000013 ebx=0x00000109 ecx=0x00000000 edx=0x00000000
CPU 138:
   0x8000001e 0x00: eax=0x00000015 ebx=0x0000010a ecx=0x00000000 edx=0x00000000
CPU 139:
   0x8000001e 0x00: eax=0x00000017 ebx=0x0000010b ecx=0x00000000 edx=0x00000000
CPU 140:
   0x8000001e 0x00: eax=0x00000019 ebx=0x0000010c ecx=0x00000000 edx=0x00000000
CPU 141:
   0x8000001e 0x00: eax=0x0000001b ebx=0x0000010d ecx=0x00000000 edx=0x00000000
CPU 142:
   0x8000001e 0x00: eax=0x0000001d ebx=0x0000010e ecx=0x00000000 edx=0x00000000
CPU 143:
   0x8000001e 0x00: eax=0x0000001f ebx=0x0000010f ecx=0x00000000 edx=0x00000000
CPU 144:
   0x8000001e 0x00: eax=0x00000021 ebx=0x00000110 ecx=0x00000000 edx=0x00000000
CPU 145:
   0x8000001e 0x00: eax=0x00000023 ebx=0x00000111 ecx=0x00000000 edx=0x00000000
CPU 146:
   0x8000001e 0x00: eax=0x00000025 ebx=0x00000112 ecx=0x00000000 edx=0x00000000
CPU 147:
   0x8000001e 0x00: eax=0x00000027 ebx=0x00000113 ecx=0x00000000 edx=0x00000000
CPU 148:
   0x8000001e 0x00: eax=0x00000029 ebx=0x00000114 ecx=0x00000000 edx=0x00000000
CPU 149:
   0x8000001e 0x00: eax=0x0000002b ebx=0x00000115 ecx=0x00000000 edx=0x00000000
CPU 150:
   0x8000001e 0x00: eax=0x0000002d ebx=0x00000116 ecx=0x00000000 edx=0x00000000
CPU 151:
   0x8000001e 0x00: eax=0x0000002f ebx=0x00000117 ecx=0x00000000 edx=0x00000000
CPU 152:
   0x8000001e 0x00: eax=0x00000031 ebx=0x00000118 ecx=0x00000000 edx=0x00000000
CPU 153:
   0x8000001e 0x00: eax=0x00000033 ebx=0x00000119 ecx=0x00000000 edx=0x00000000
CPU 154:
   0x8000001e 0x00: eax=0x00000035 ebx=0x0000011a ecx=0x00000000 edx=0x00000000
CPU 155:
   0x8000001e 0x00: eax=0x00000037 ebx=0x0000011b ecx=0x00000000 edx=0x00000000
CPU 156:
   0x8000001e 0x00: eax=0x00000039 ebx=0x0000011c ecx=0x00000000 edx=0x00000000
CPU 157:
   0x8000001e 0x00: eax=0x0000003b ebx=0x0000011d ecx=0x00000000 edx=0x00000000
CPU 158:
   0x8000001e 0x00: eax=0x0000003d ebx=0x0000011e ecx=0x00000000 edx=0x00000000
CPU 159:
   0x8000001e 0x00: eax=0x0000003f ebx=0x0000011f ecx=0x00000000 edx=0x00000000
CPU 160:
   0x8000001e 0x00: eax=0x00000041 ebx=0x00000120 ecx=0x00000000 edx=0x00000000
CPU 161:
   0x8000001e 0x00: eax=0x00000043 ebx=0x00000121 ecx=0x00000000 edx=0x00000000
CPU 162:
   0x8000001e 0x00: eax=0x00000045 ebx=0x00000122 ecx=0x00000000 edx=0x00000000
CPU 163:
   0x8000001e 0x00: eax=0x00000047 ebx=0x00000123 ecx=0x00000000 edx=0x00000000
CPU 164:
   0x8000001e 0x00: eax=0x00000049 ebx=0x00000124 ecx=0x00000000 edx=0x00000000
CPU 165:
   0x8000001e 0x00: eax=0x0000004b ebx=0x00000125 ecx=0x00000000 edx=0x00000000
CPU 166:
   0x8000001e 0x00: eax=0x0000004d ebx=0x00000126 ecx=0x00000000 edx=0x00000000
CPU 167:
   0x8000001e 0x00: eax=0x0000004f ebx=0x00000127 ecx=0x00000000 edx=0x00000000
CPU 168:
   0x8000001e 0x00: eax=0x00000051 ebx=0x00000128 ecx=0x00000000 edx=0x00000000
CPU 169:
   0x8000001e 0x00: eax=0x00000053 ebx=0x00000129 ecx=0x00000000 edx=0x00000000
CPU 170:
   0x8000001e 0x00: eax=0x00000055 ebx=0x0000012a ecx=0x00000000 edx=0x00000000
CPU 171:
   0x8000001e 0x00: eax=0x00000057 ebx=0x0000012b ecx=0x00000000 edx=0x00000000
CPU 172:
   0x8000001e 0x00: eax=0x00000059 ebx=0x0000012c ecx=0x00000000 edx=0x00000000
CPU 173:
   0x8000001e 0x00: eax=0x0000005b ebx=0x0000012d ecx=0x00000000 edx=0x00000000
CPU 174:
   0x8000001e 0x00: eax=0x0000005d ebx=0x0000012e ecx=0x00000000 edx=0x00000000
CPU 175:
   0x8000001e 0x00: eax=0x0000005f ebx=0x0000012f ecx=0x00000000 edx=0x00000000
CPU 176:
   0x8000001e 0x00: eax=0x00000061 ebx=0x00000130 ecx=0x00000000 edx=0x00000000
CPU 177:
   0x8000001e 0x00: eax=0x00000063 ebx=0x00000131 ecx=0x00000000 edx=0x00000000
CPU 178:
   0x8000001e 0x00: eax=0x00000065 ebx=0x00000132 ecx=0x00000000 edx=0x00000000
CPU 179:
   0x8000001e 0x00: eax=0x00000067 ebx=0x00000133 ecx=0x00000000 edx=0x00000000
CPU 180:
   0x8000001e 0x00: eax=0x00000069 ebx=0x00000134 ecx=0x00000000 edx=0x00000000
CPU 181:
   0x8000001e 0x00: eax=0x0000006b ebx=0x00000135 ecx=0x00000000 edx=0x00000000
CPU 182:
   0x8000001e 0x00: eax=0x0000006d ebx=0x00000136 ecx=0x00000000 edx=0x00000000
CPU 183:
   0x8000001e 0x00: eax=0x0000006f ebx=0x00000137 ecx=0x00000000 edx=0x00000000
CPU 184:
   0x8000001e 0x00: eax=0x00000071 ebx=0x00000138 ecx=0x00000000 edx=0x00000000
CPU 185:
   0x8000001e 0x00: eax=0x00000073 ebx=0x00000139 ecx=0x00000000 edx=0x00000000
CPU 186:
   0x8000001e 0x00: eax=0x00000075 ebx=0x0000013a ecx=0x00000000 edx=0x00000000
CPU 187:
   0x8000001e 0x00: eax=0x00000077 ebx=0x0000013b ecx=0x00000000 edx=0x00000000
CPU 188:
   0x8000001e 0x00: eax=0x00000079 ebx=0x0000013c ecx=0x00000000 edx=0x00000000
CPU 189:
   0x8000001e 0x00: eax=0x0000007b ebx=0x0000013d ecx=0x00000000 edx=0x00000000
CPU 190:
   0x8000001e 0x00: eax=0x0000007d ebx=0x0000013e ecx=0x00000000 edx=0x00000000
CPU 191:
   0x8000001e 0x00: eax=0x0000007f ebx=0x0000013f ecx=0x00000000 edx=0x00000000
CPU 192:
   0x8000001e 0x00: eax=0x00000081 ebx=0x00000100 ecx=0x00000001 edx=0x00000000
CPU 193:
   0x8000001e 0x00: eax=0x00000083 ebx=0x00000101 ecx=0x00000001 edx=0x00000000
CPU 194:
   0x8000001e 0x00: eax=0x00000085 ebx=0x00000102 ecx=0x00000001 edx=0x00000000
CPU 195:
   0x8000001e 0x00: eax=0x00000087 ebx=0x00000103 ecx=0x00000001 edx=0x00000000
CPU 196:
   0x8000001e 0x00: eax=0x00000089 ebx=0x00000104 ecx=0x00000001 edx=0x00000000
CPU 197:
   0x8000001e 0x00: eax=0x0000008b ebx=0x00000105 ecx=0x00000001 edx=0x00000000
CPU 198:
   0x8000001e 0x00: eax=0x0000008d ebx=0x00000106 ecx=0x00000001 edx=0x00000000
CPU 199:
   0x8000001e 0x00: eax=0x0000008f ebx=0x00000107 ecx=0x00000001 edx=0x00000000
CPU 200:
   0x8000001e 0x00: eax=0x00000091 ebx=0x00000108 ecx=0x00000001 edx=0x00000000
CPU 201:
   0x8000001e 0x00: eax=0x00000093 ebx=0x00000109 ecx=0x00000001 edx=0x00000000
CPU 202:
   0x8000001e 0x00: eax=0x00000095 ebx=0x0000010a ecx=0x00000001 edx=0x00000000
CPU 203:
   0x8000001e 0x00: eax=0x00000097 ebx=0x0000010b ecx=0x00000001 edx=0x00000000
CPU 204:
   0x8000001e 0x00: eax=0x00000099 ebx=0x0000010c ecx=0x00000001 edx=0x00000000
CPU 205:
   0x8000001e 0x00: eax=0x0000009b ebx=0x0000010d ecx=0x00000001 edx=0x00000000
CPU 206:
   0x8000001e 0x00: eax=0x0000009d ebx=0x0000010e ecx=0x00000001 edx=0x00000000
CPU 207:
   0x8000001e 0x00: eax=0x0000009f ebx=0x0000010f ecx=0x00000001 edx=0x00000000
CPU 208:
   0x8000001e 0x00: eax=0x000000a1 ebx=0x00000110 ecx=0x00000001 edx=0x00000000
CPU 209:
   0x8000001e 0x00: eax=0x000000a3 ebx=0x00000111 ecx=0x00000001 edx=0x00000000
CPU 210:
   0x8000001e 0x00: eax=0x000000a5 ebx=0x00000112 ecx=0x00000001 edx=0x00000000
CPU 211:
   0x8000001e 0x00: eax=0x000000a7 ebx=0x00000113 ecx=0x00000001 edx=0x00000000
CPU 212:
   0x8000001e 0x00: eax=0x000000a9 ebx=0x00000114 ecx=0x00000001 edx=0x00000000
CPU 213:
   0x8000001e 0x00: eax=0x000000ab ebx=0x00000115 ecx=0x00000001 edx=0x00000000
CPU 214:
   0x8000001e 0x00: eax=0x000000ad ebx=0x00000116 ecx=0x00000001 edx=0x00000000
CPU 215:
   0x8000001e 0x00: eax=0x000000af ebx=0x00000117 ecx=0x00000001 edx=0x00000000
CPU 216:
   0x8000001e 0x00: eax=0x000000b1 ebx=0x00000118 ecx=0x00000001 edx=0x00000000
CPU 217:
   0x8000001e 0x00: eax=0x000000b3 ebx=0x00000119 ecx=0x00000001 edx=0x00000000
CPU 218:
   0x8000001e 0x00: eax=0x000000b5 ebx=0x0000011a ecx=0x00000001 edx=0x00000000
CPU 219:
   0x8000001e 0x00: eax=0x000000b7 ebx=0x0000011b ecx=0x00000001 edx=0x00000000
CPU 220:
   0x8000001e 0x00: eax=0x000000b9 ebx=0x0000011c ecx=0x00000001 edx=0x00000000
CPU 221:
   0x8000001e 0x00: eax=0x000000bb ebx=0x0000011d ecx=0x00000001 edx=0x00000000
CPU 222:
   0x8000001e 0x00: eax=0x000000bd ebx=0x0000011e ecx=0x00000001 edx=0x00000000
CPU 223:
   0x8000001e 0x00: eax=0x000000bf ebx=0x0000011f ecx=0x00000001 edx=0x00000000
CPU 224:
   0x8000001e 0x00: eax=0x000000c1 ebx=0x00000120 ecx=0x00000001 edx=0x00000000
CPU 225:
   0x8000001e 0x00: eax=0x000000c3 ebx=0x00000121 ecx=0x00000001 edx=0x00000000
CPU 226:
   0x8000001e 0x00: eax=0x000000c5 ebx=0x00000122 ecx=0x00000001 edx=0x00000000
CPU 227:
   0x8000001e 0x00: eax=0x000000c7 ebx=0x00000123 ecx=0x00000001 edx=0x00000000
CPU 228:
   0x8000001e 0x00: eax=0x000000c9 ebx=0x00000124 ecx=0x00000001 edx=0x00000000
CPU 229:
   0x8000001e 0x00: eax=0x000000cb ebx=0x00000125 ecx=0x00000001 edx=0x00000000
CPU 230:
   0x8000001e 0x00: eax=0x000000cd ebx=0x00000126 ecx=0x00000001 edx=0x00000000
CPU 231:
   0x8000001e 0x00: eax=0x000000cf ebx=0x00000127 ecx=0x00000001 edx=0x00000000
CPU 232:
   0x8000001e 0x00: eax=0x000000d1 ebx=0x00000128 ecx=0x00000001 edx=0x00000000
CPU 233:
   0x8000001e 0x00: eax=0x000000d3 ebx=0x00000129 ecx=0x00000001 edx=0x00000000
CPU 234:
   0x8000001e 0x00: eax=0x000000d5 ebx=0x0000012a ecx=0x00000001 edx=0x00000000
CPU 235:
   0x8000001e 0x00: eax=0x000000d7 ebx=0x0000012b ecx=0x00000001 edx=0x00000000
CPU 236:
   0x8000001e 0x00: eax=0x000000d9 ebx=0x0000012c ecx=0x00000001 edx=0x00000000
CPU 237:
   0x8000001e 0x00: eax=0x000000db ebx=0x0000012d ecx=0x00000001 edx=0x00000000
CPU 238:
   0x8000001e 0x00: eax=0x000000dd ebx=0x0000012e ecx=0x00000001 edx=0x00000000
CPU 239:
   0x8000001e 0x00: eax=0x000000df ebx=0x0000012f ecx=0x00000001 edx=0x00000000
CPU 240:
   0x8000001e 0x00: eax=0x000000e1 ebx=0x00000130 ecx=0x00000001 edx=0x00000000
CPU 241:
   0x8000001e 0x00: eax=0x000000e3 ebx=0x00000131 ecx=0x00000001 edx=0x00000000
CPU 242:
   0x8000001e 0x00: eax=0x000000e5 ebx=0x00000132 ecx=0x00000001 edx=0x00000000
CPU 243:
   0x8000001e 0x00: eax=0x000000e7 ebx=0x00000133 ecx=0x00000001 edx=0x00000000
CPU 244:
   0x8000001e 0x00: eax=0x000000e9 ebx=0x00000134 ecx=0x00000001 edx=0x00000000
CPU 245:
   0x8000001e 0x00: eax=0x000000eb ebx=0x00000135 ecx=0x00000001 edx=0x00000000
CPU 246:
   0x8000001e 0x00: eax=0x000000ed ebx=0x00000136 ecx=0x00000001 edx=0x00000000
CPU 247:
   0x8000001e 0x00: eax=0x000000ef ebx=0x00000137 ecx=0x00000001 edx=0x00000000
CPU 248:
   0x8000001e 0x00: eax=0x000000f1 ebx=0x00000138 ecx=0x00000001 edx=0x00000000
CPU 249:
   0x8000001e 0x00: eax=0x000000f3 ebx=0x00000139 ecx=0x00000001 edx=0x00000000
CPU 250:
   0x8000001e 0x00: eax=0x000000f5 ebx=0x0000013a ecx=0x00000001 edx=0x00000000
CPU 251:
   0x8000001e 0x00: eax=0x000000f7 ebx=0x0000013b ecx=0x00000001 edx=0x00000000
CPU 252:
   0x8000001e 0x00: eax=0x000000f9 ebx=0x0000013c ecx=0x00000001 edx=0x00000000
CPU 253:
   0x8000001e 0x00: eax=0x000000fb ebx=0x0000013d ecx=0x00000001 edx=0x00000000
CPU 254:
   0x8000001e 0x00: eax=0x000000fd ebx=0x0000013e ecx=0x00000001 edx=0x00000000
CPU 255:
   0x8000001e 0x00: eax=0x000000ff ebx=0x0000013f ecx=0x00000001 edx=0x00000000
[root@rome NPS]#


NPS4:
================================
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              256
On-line CPU(s) list: 0-255
Thread(s) per core:  2
Core(s) per socket:  64
Socket(s):           2
NUMA node(s):        8
Vendor ID:           AuthenticAMD
CPU family:          23
Model:               49
Model name:          AMD EPYC 7742 64-Core Processor
Stepping:            0
CPU MHz:             1862.249
CPU max MHz:         2250.0000
CPU min MHz:         1500.0000
BogoMIPS:            4491.24
Virtualization:      AMD-V
L1d cache:           32K
L1i cache:           32K
L2 cache:            512K
L3 cache:            16384K
NUMA node0 CPU(s):   0-15,128-143
NUMA node1 CPU(s):   16-31,144-159
NUMA node2 CPU(s):   32-47,160-175
NUMA node3 CPU(s):   48-63,176-191
NUMA node4 CPU(s):   64-79,192-207
NUMA node5 CPU(s):   80-95,208-223
NUMA node6 CPU(s):   96-111,224-239
NUMA node7 CPU(s):   112-127,240-255

#cpuid -l 0x8000001e -r

CPU 0:
   0x8000001e 0x00: eax=0x00000000 ebx=0x00000100 ecx=0x00000000 edx=0x00000000
CPU 1:
   0x8000001e 0x00: eax=0x00000002 ebx=0x00000101 ecx=0x00000000 edx=0x00000000
CPU 2:
   0x8000001e 0x00: eax=0x00000004 ebx=0x00000102 ecx=0x00000000 edx=0x00000000
CPU 3:
   0x8000001e 0x00: eax=0x00000006 ebx=0x00000103 ecx=0x00000000 edx=0x00000000
CPU 4:
   0x8000001e 0x00: eax=0x00000008 ebx=0x00000104 ecx=0x00000000 edx=0x00000000
CPU 5:
   0x8000001e 0x00: eax=0x0000000a ebx=0x00000105 ecx=0x00000000 edx=0x00000000
CPU 6:
   0x8000001e 0x00: eax=0x0000000c ebx=0x00000106 ecx=0x00000000 edx=0x00000000
CPU 7:
   0x8000001e 0x00: eax=0x0000000e ebx=0x00000107 ecx=0x00000000 edx=0x00000000
CPU 8:
   0x8000001e 0x00: eax=0x00000010 ebx=0x00000108 ecx=0x00000000 edx=0x00000000
CPU 9:
   0x8000001e 0x00: eax=0x00000012 ebx=0x00000109 ecx=0x00000000 edx=0x00000000
CPU 10:
   0x8000001e 0x00: eax=0x00000014 ebx=0x0000010a ecx=0x00000000 edx=0x00000000
CPU 11:
   0x8000001e 0x00: eax=0x00000016 ebx=0x0000010b ecx=0x00000000 edx=0x00000000
CPU 12:
   0x8000001e 0x00: eax=0x00000018 ebx=0x0000010c ecx=0x00000000 edx=0x00000000
CPU 13:
   0x8000001e 0x00: eax=0x0000001a ebx=0x0000010d ecx=0x00000000 edx=0x00000000
CPU 14:
   0x8000001e 0x00: eax=0x0000001c ebx=0x0000010e ecx=0x00000000 edx=0x00000000
CPU 15:
   0x8000001e 0x00: eax=0x0000001e ebx=0x0000010f ecx=0x00000000 edx=0x00000000
CPU 16:
   0x8000001e 0x00: eax=0x00000020 ebx=0x00000110 ecx=0x00000000 edx=0x00000000
CPU 17:
   0x8000001e 0x00: eax=0x00000022 ebx=0x00000111 ecx=0x00000000 edx=0x00000000
CPU 18:
   0x8000001e 0x00: eax=0x00000024 ebx=0x00000112 ecx=0x00000000 edx=0x00000000
CPU 19:
   0x8000001e 0x00: eax=0x00000026 ebx=0x00000113 ecx=0x00000000 edx=0x00000000
CPU 20:
   0x8000001e 0x00: eax=0x00000028 ebx=0x00000114 ecx=0x00000000 edx=0x00000000
CPU 21:
   0x8000001e 0x00: eax=0x0000002a ebx=0x00000115 ecx=0x00000000 edx=0x00000000
CPU 22:
   0x8000001e 0x00: eax=0x0000002c ebx=0x00000116 ecx=0x00000000 edx=0x00000000
CPU 23:
   0x8000001e 0x00: eax=0x0000002e ebx=0x00000117 ecx=0x00000000 edx=0x00000000
CPU 24:
   0x8000001e 0x00: eax=0x00000030 ebx=0x00000118 ecx=0x00000000 edx=0x00000000
CPU 25:
   0x8000001e 0x00: eax=0x00000032 ebx=0x00000119 ecx=0x00000000 edx=0x00000000
CPU 26:
   0x8000001e 0x00: eax=0x00000034 ebx=0x0000011a ecx=0x00000000 edx=0x00000000
CPU 27:
   0x8000001e 0x00: eax=0x00000036 ebx=0x0000011b ecx=0x00000000 edx=0x00000000
CPU 28:
   0x8000001e 0x00: eax=0x00000038 ebx=0x0000011c ecx=0x00000000 edx=0x00000000
CPU 29:
   0x8000001e 0x00: eax=0x0000003a ebx=0x0000011d ecx=0x00000000 edx=0x00000000
CPU 30:
   0x8000001e 0x00: eax=0x0000003c ebx=0x0000011e ecx=0x00000000 edx=0x00000000
CPU 31:
   0x8000001e 0x00: eax=0x0000003e ebx=0x0000011f ecx=0x00000000 edx=0x00000000
CPU 32:
   0x8000001e 0x00: eax=0x00000040 ebx=0x00000120 ecx=0x00000000 edx=0x00000000
CPU 33:
   0x8000001e 0x00: eax=0x00000042 ebx=0x00000121 ecx=0x00000000 edx=0x00000000
CPU 34:
   0x8000001e 0x00: eax=0x00000044 ebx=0x00000122 ecx=0x00000000 edx=0x00000000
CPU 35:
   0x8000001e 0x00: eax=0x00000046 ebx=0x00000123 ecx=0x00000000 edx=0x00000000
CPU 36:
   0x8000001e 0x00: eax=0x00000048 ebx=0x00000124 ecx=0x00000000 edx=0x00000000
CPU 37:
   0x8000001e 0x00: eax=0x0000004a ebx=0x00000125 ecx=0x00000000 edx=0x00000000
CPU 38:
   0x8000001e 0x00: eax=0x0000004c ebx=0x00000126 ecx=0x00000000 edx=0x00000000
CPU 39:
   0x8000001e 0x00: eax=0x0000004e ebx=0x00000127 ecx=0x00000000 edx=0x00000000
CPU 40:
   0x8000001e 0x00: eax=0x00000050 ebx=0x00000128 ecx=0x00000000 edx=0x00000000
CPU 41:
   0x8000001e 0x00: eax=0x00000052 ebx=0x00000129 ecx=0x00000000 edx=0x00000000
CPU 42:
   0x8000001e 0x00: eax=0x00000054 ebx=0x0000012a ecx=0x00000000 edx=0x00000000
CPU 43:
   0x8000001e 0x00: eax=0x00000056 ebx=0x0000012b ecx=0x00000000 edx=0x00000000
CPU 44:
   0x8000001e 0x00: eax=0x00000058 ebx=0x0000012c ecx=0x00000000 edx=0x00000000
CPU 45:
   0x8000001e 0x00: eax=0x0000005a ebx=0x0000012d ecx=0x00000000 edx=0x00000000
CPU 46:
   0x8000001e 0x00: eax=0x0000005c ebx=0x0000012e ecx=0x00000000 edx=0x00000000
CPU 47:
   0x8000001e 0x00: eax=0x0000005e ebx=0x0000012f ecx=0x00000000 edx=0x00000000
CPU 48:
   0x8000001e 0x00: eax=0x00000060 ebx=0x00000130 ecx=0x00000000 edx=0x00000000
CPU 49:
   0x8000001e 0x00: eax=0x00000062 ebx=0x00000131 ecx=0x00000000 edx=0x00000000
CPU 50:
   0x8000001e 0x00: eax=0x00000064 ebx=0x00000132 ecx=0x00000000 edx=0x00000000
CPU 51:
   0x8000001e 0x00: eax=0x00000066 ebx=0x00000133 ecx=0x00000000 edx=0x00000000
CPU 52:
   0x8000001e 0x00: eax=0x00000068 ebx=0x00000134 ecx=0x00000000 edx=0x00000000
CPU 53:
   0x8000001e 0x00: eax=0x0000006a ebx=0x00000135 ecx=0x00000000 edx=0x00000000
CPU 54:
   0x8000001e 0x00: eax=0x0000006c ebx=0x00000136 ecx=0x00000000 edx=0x00000000
CPU 55:
   0x8000001e 0x00: eax=0x0000006e ebx=0x00000137 ecx=0x00000000 edx=0x00000000
CPU 56:
   0x8000001e 0x00: eax=0x00000070 ebx=0x00000138 ecx=0x00000000 edx=0x00000000
CPU 57:
   0x8000001e 0x00: eax=0x00000072 ebx=0x00000139 ecx=0x00000000 edx=0x00000000
CPU 58:
   0x8000001e 0x00: eax=0x00000074 ebx=0x0000013a ecx=0x00000000 edx=0x00000000
CPU 59:
   0x8000001e 0x00: eax=0x00000076 ebx=0x0000013b ecx=0x00000000 edx=0x00000000
CPU 60:
   0x8000001e 0x00: eax=0x00000078 ebx=0x0000013c ecx=0x00000000 edx=0x00000000
CPU 61:
   0x8000001e 0x00: eax=0x0000007a ebx=0x0000013d ecx=0x00000000 edx=0x00000000
CPU 62:
   0x8000001e 0x00: eax=0x0000007c ebx=0x0000013e ecx=0x00000000 edx=0x00000000
CPU 63:
   0x8000001e 0x00: eax=0x0000007e ebx=0x0000013f ecx=0x00000000 edx=0x00000000
CPU 64:
   0x8000001e 0x00: eax=0x00000080 ebx=0x00000100 ecx=0x00000001 edx=0x00000000
CPU 65:
   0x8000001e 0x00: eax=0x00000082 ebx=0x00000101 ecx=0x00000001 edx=0x00000000
CPU 66:
   0x8000001e 0x00: eax=0x00000084 ebx=0x00000102 ecx=0x00000001 edx=0x00000000
CPU 67:
   0x8000001e 0x00: eax=0x00000086 ebx=0x00000103 ecx=0x00000001 edx=0x00000000
CPU 68:
   0x8000001e 0x00: eax=0x00000088 ebx=0x00000104 ecx=0x00000001 edx=0x00000000
CPU 69:
   0x8000001e 0x00: eax=0x0000008a ebx=0x00000105 ecx=0x00000001 edx=0x00000000
CPU 70:
   0x8000001e 0x00: eax=0x0000008c ebx=0x00000106 ecx=0x00000001 edx=0x00000000
CPU 71:
   0x8000001e 0x00: eax=0x0000008e ebx=0x00000107 ecx=0x00000001 edx=0x00000000
CPU 72:
   0x8000001e 0x00: eax=0x00000090 ebx=0x00000108 ecx=0x00000001 edx=0x00000000
CPU 73:
   0x8000001e 0x00: eax=0x00000092 ebx=0x00000109 ecx=0x00000001 edx=0x00000000
CPU 74:
   0x8000001e 0x00: eax=0x00000094 ebx=0x0000010a ecx=0x00000001 edx=0x00000000
CPU 75:
   0x8000001e 0x00: eax=0x00000096 ebx=0x0000010b ecx=0x00000001 edx=0x00000000
CPU 76:
   0x8000001e 0x00: eax=0x00000098 ebx=0x0000010c ecx=0x00000001 edx=0x00000000
CPU 77:
   0x8000001e 0x00: eax=0x0000009a ebx=0x0000010d ecx=0x00000001 edx=0x00000000
CPU 78:
   0x8000001e 0x00: eax=0x0000009c ebx=0x0000010e ecx=0x00000001 edx=0x00000000
CPU 79:
   0x8000001e 0x00: eax=0x0000009e ebx=0x0000010f ecx=0x00000001 edx=0x00000000
CPU 80:
   0x8000001e 0x00: eax=0x000000a0 ebx=0x00000110 ecx=0x00000001 edx=0x00000000
CPU 81:
   0x8000001e 0x00: eax=0x000000a2 ebx=0x00000111 ecx=0x00000001 edx=0x00000000
CPU 82:
   0x8000001e 0x00: eax=0x000000a4 ebx=0x00000112 ecx=0x00000001 edx=0x00000000
CPU 83:
   0x8000001e 0x00: eax=0x000000a6 ebx=0x00000113 ecx=0x00000001 edx=0x00000000
CPU 84:
   0x8000001e 0x00: eax=0x000000a8 ebx=0x00000114 ecx=0x00000001 edx=0x00000000
CPU 85:
   0x8000001e 0x00: eax=0x000000aa ebx=0x00000115 ecx=0x00000001 edx=0x00000000
CPU 86:
   0x8000001e 0x00: eax=0x000000ac ebx=0x00000116 ecx=0x00000001 edx=0x00000000
CPU 87:
   0x8000001e 0x00: eax=0x000000ae ebx=0x00000117 ecx=0x00000001 edx=0x00000000
CPU 88:
   0x8000001e 0x00: eax=0x000000b0 ebx=0x00000118 ecx=0x00000001 edx=0x00000000
CPU 89:
   0x8000001e 0x00: eax=0x000000b2 ebx=0x00000119 ecx=0x00000001 edx=0x00000000
CPU 90:
   0x8000001e 0x00: eax=0x000000b4 ebx=0x0000011a ecx=0x00000001 edx=0x00000000
CPU 91:
   0x8000001e 0x00: eax=0x000000b6 ebx=0x0000011b ecx=0x00000001 edx=0x00000000
CPU 92:
   0x8000001e 0x00: eax=0x000000b8 ebx=0x0000011c ecx=0x00000001 edx=0x00000000
CPU 93:
   0x8000001e 0x00: eax=0x000000ba ebx=0x0000011d ecx=0x00000001 edx=0x00000000
CPU 94:
   0x8000001e 0x00: eax=0x000000bc ebx=0x0000011e ecx=0x00000001 edx=0x00000000
CPU 95:
   0x8000001e 0x00: eax=0x000000be ebx=0x0000011f ecx=0x00000001 edx=0x00000000
CPU 96:
   0x8000001e 0x00: eax=0x000000c0 ebx=0x00000120 ecx=0x00000001 edx=0x00000000
CPU 97:
   0x8000001e 0x00: eax=0x000000c2 ebx=0x00000121 ecx=0x00000001 edx=0x00000000
CPU 98:
   0x8000001e 0x00: eax=0x000000c4 ebx=0x00000122 ecx=0x00000001 edx=0x00000000
CPU 99:
   0x8000001e 0x00: eax=0x000000c6 ebx=0x00000123 ecx=0x00000001 edx=0x00000000
CPU 100:
   0x8000001e 0x00: eax=0x000000c8 ebx=0x00000124 ecx=0x00000001 edx=0x00000000
CPU 101:
   0x8000001e 0x00: eax=0x000000ca ebx=0x00000125 ecx=0x00000001 edx=0x00000000
CPU 102:
   0x8000001e 0x00: eax=0x000000cc ebx=0x00000126 ecx=0x00000001 edx=0x00000000
CPU 103:
   0x8000001e 0x00: eax=0x000000ce ebx=0x00000127 ecx=0x00000001 edx=0x00000000
CPU 104:
   0x8000001e 0x00: eax=0x000000d0 ebx=0x00000128 ecx=0x00000001 edx=0x00000000
CPU 105:
   0x8000001e 0x00: eax=0x000000d2 ebx=0x00000129 ecx=0x00000001 edx=0x00000000
CPU 106:
   0x8000001e 0x00: eax=0x000000d4 ebx=0x0000012a ecx=0x00000001 edx=0x00000000
CPU 107:
   0x8000001e 0x00: eax=0x000000d6 ebx=0x0000012b ecx=0x00000001 edx=0x00000000
CPU 108:
   0x8000001e 0x00: eax=0x000000d8 ebx=0x0000012c ecx=0x00000001 edx=0x00000000
CPU 109:
   0x8000001e 0x00: eax=0x000000da ebx=0x0000012d ecx=0x00000001 edx=0x00000000
CPU 110:
   0x8000001e 0x00: eax=0x000000dc ebx=0x0000012e ecx=0x00000001 edx=0x00000000
CPU 111:
   0x8000001e 0x00: eax=0x000000de ebx=0x0000012f ecx=0x00000001 edx=0x00000000
CPU 112:
   0x8000001e 0x00: eax=0x000000e0 ebx=0x00000130 ecx=0x00000001 edx=0x00000000
CPU 113:
   0x8000001e 0x00: eax=0x000000e2 ebx=0x00000131 ecx=0x00000001 edx=0x00000000
CPU 114:
   0x8000001e 0x00: eax=0x000000e4 ebx=0x00000132 ecx=0x00000001 edx=0x00000000
CPU 115:
   0x8000001e 0x00: eax=0x000000e6 ebx=0x00000133 ecx=0x00000001 edx=0x00000000
CPU 116:
   0x8000001e 0x00: eax=0x000000e8 ebx=0x00000134 ecx=0x00000001 edx=0x00000000
CPU 117:
   0x8000001e 0x00: eax=0x000000ea ebx=0x00000135 ecx=0x00000001 edx=0x00000000
CPU 118:
   0x8000001e 0x00: eax=0x000000ec ebx=0x00000136 ecx=0x00000001 edx=0x00000000
CPU 119:
   0x8000001e 0x00: eax=0x000000ee ebx=0x00000137 ecx=0x00000001 edx=0x00000000
CPU 120:
   0x8000001e 0x00: eax=0x000000f0 ebx=0x00000138 ecx=0x00000001 edx=0x00000000
CPU 121:
   0x8000001e 0x00: eax=0x000000f2 ebx=0x00000139 ecx=0x00000001 edx=0x00000000
CPU 122:
   0x8000001e 0x00: eax=0x000000f4 ebx=0x0000013a ecx=0x00000001 edx=0x00000000
CPU 123:
   0x8000001e 0x00: eax=0x000000f6 ebx=0x0000013b ecx=0x00000001 edx=0x00000000
CPU 124:
   0x8000001e 0x00: eax=0x000000f8 ebx=0x0000013c ecx=0x00000001 edx=0x00000000
CPU 125:
   0x8000001e 0x00: eax=0x000000fa ebx=0x0000013d ecx=0x00000001 edx=0x00000000
CPU 126:
   0x8000001e 0x00: eax=0x000000fc ebx=0x0000013e ecx=0x00000001 edx=0x00000000
CPU 127:
   0x8000001e 0x00: eax=0x000000fe ebx=0x0000013f ecx=0x00000001 edx=0x00000000
CPU 128:
   0x8000001e 0x00: eax=0x00000001 ebx=0x00000100 ecx=0x00000000 edx=0x00000000
CPU 129:
   0x8000001e 0x00: eax=0x00000003 ebx=0x00000101 ecx=0x00000000 edx=0x00000000
CPU 130:
   0x8000001e 0x00: eax=0x00000005 ebx=0x00000102 ecx=0x00000000 edx=0x00000000
CPU 131:
   0x8000001e 0x00: eax=0x00000007 ebx=0x00000103 ecx=0x00000000 edx=0x00000000
CPU 132:
   0x8000001e 0x00: eax=0x00000009 ebx=0x00000104 ecx=0x00000000 edx=0x00000000
CPU 133:
   0x8000001e 0x00: eax=0x0000000b ebx=0x00000105 ecx=0x00000000 edx=0x00000000
CPU 134:
   0x8000001e 0x00: eax=0x0000000d ebx=0x00000106 ecx=0x00000000 edx=0x00000000
CPU 135:
   0x8000001e 0x00: eax=0x0000000f ebx=0x00000107 ecx=0x00000000 edx=0x00000000
CPU 136:
   0x8000001e 0x00: eax=0x00000011 ebx=0x00000108 ecx=0x00000000 edx=0x00000000
CPU 137:
   0x8000001e 0x00: eax=0x00000013 ebx=0x00000109 ecx=0x00000000 edx=0x00000000
CPU 138:
   0x8000001e 0x00: eax=0x00000015 ebx=0x0000010a ecx=0x00000000 edx=0x00000000
CPU 139:
   0x8000001e 0x00: eax=0x00000017 ebx=0x0000010b ecx=0x00000000 edx=0x00000000
CPU 140:
   0x8000001e 0x00: eax=0x00000019 ebx=0x0000010c ecx=0x00000000 edx=0x00000000
CPU 141:
   0x8000001e 0x00: eax=0x0000001b ebx=0x0000010d ecx=0x00000000 edx=0x00000000
CPU 142:
   0x8000001e 0x00: eax=0x0000001d ebx=0x0000010e ecx=0x00000000 edx=0x00000000
CPU 143:
   0x8000001e 0x00: eax=0x0000001f ebx=0x0000010f ecx=0x00000000 edx=0x00000000
CPU 144:
   0x8000001e 0x00: eax=0x00000021 ebx=0x00000110 ecx=0x00000000 edx=0x00000000
CPU 145:
   0x8000001e 0x00: eax=0x00000023 ebx=0x00000111 ecx=0x00000000 edx=0x00000000
CPU 146:
   0x8000001e 0x00: eax=0x00000025 ebx=0x00000112 ecx=0x00000000 edx=0x00000000
CPU 147:
   0x8000001e 0x00: eax=0x00000027 ebx=0x00000113 ecx=0x00000000 edx=0x00000000
CPU 148:
   0x8000001e 0x00: eax=0x00000029 ebx=0x00000114 ecx=0x00000000 edx=0x00000000
CPU 149:
   0x8000001e 0x00: eax=0x0000002b ebx=0x00000115 ecx=0x00000000 edx=0x00000000
CPU 150:
   0x8000001e 0x00: eax=0x0000002d ebx=0x00000116 ecx=0x00000000 edx=0x00000000
CPU 151:
   0x8000001e 0x00: eax=0x0000002f ebx=0x00000117 ecx=0x00000000 edx=0x00000000
CPU 152:
   0x8000001e 0x00: eax=0x00000031 ebx=0x00000118 ecx=0x00000000 edx=0x00000000
CPU 153:
   0x8000001e 0x00: eax=0x00000033 ebx=0x00000119 ecx=0x00000000 edx=0x00000000
CPU 154:
   0x8000001e 0x00: eax=0x00000035 ebx=0x0000011a ecx=0x00000000 edx=0x00000000
CPU 155:
   0x8000001e 0x00: eax=0x00000037 ebx=0x0000011b ecx=0x00000000 edx=0x00000000
CPU 156:
   0x8000001e 0x00: eax=0x00000039 ebx=0x0000011c ecx=0x00000000 edx=0x00000000
CPU 157:
   0x8000001e 0x00: eax=0x0000003b ebx=0x0000011d ecx=0x00000000 edx=0x00000000
CPU 158:
   0x8000001e 0x00: eax=0x0000003d ebx=0x0000011e ecx=0x00000000 edx=0x00000000
CPU 159:
   0x8000001e 0x00: eax=0x0000003f ebx=0x0000011f ecx=0x00000000 edx=0x00000000
CPU 160:
   0x8000001e 0x00: eax=0x00000041 ebx=0x00000120 ecx=0x00000000 edx=0x00000000
CPU 161:
   0x8000001e 0x00: eax=0x00000043 ebx=0x00000121 ecx=0x00000000 edx=0x00000000
CPU 162:
   0x8000001e 0x00: eax=0x00000045 ebx=0x00000122 ecx=0x00000000 edx=0x00000000
CPU 163:
   0x8000001e 0x00: eax=0x00000047 ebx=0x00000123 ecx=0x00000000 edx=0x00000000
CPU 164:
   0x8000001e 0x00: eax=0x00000049 ebx=0x00000124 ecx=0x00000000 edx=0x00000000
CPU 165:
   0x8000001e 0x00: eax=0x0000004b ebx=0x00000125 ecx=0x00000000 edx=0x00000000
CPU 166:
   0x8000001e 0x00: eax=0x0000004d ebx=0x00000126 ecx=0x00000000 edx=0x00000000
CPU 167:
   0x8000001e 0x00: eax=0x0000004f ebx=0x00000127 ecx=0x00000000 edx=0x00000000
CPU 168:
   0x8000001e 0x00: eax=0x00000051 ebx=0x00000128 ecx=0x00000000 edx=0x00000000
CPU 169:
   0x8000001e 0x00: eax=0x00000053 ebx=0x00000129 ecx=0x00000000 edx=0x00000000
CPU 170:
   0x8000001e 0x00: eax=0x00000055 ebx=0x0000012a ecx=0x00000000 edx=0x00000000
CPU 171:
   0x8000001e 0x00: eax=0x00000057 ebx=0x0000012b ecx=0x00000000 edx=0x00000000
CPU 172:
   0x8000001e 0x00: eax=0x00000059 ebx=0x0000012c ecx=0x00000000 edx=0x00000000
CPU 173:
   0x8000001e 0x00: eax=0x0000005b ebx=0x0000012d ecx=0x00000000 edx=0x00000000
CPU 174:
   0x8000001e 0x00: eax=0x0000005d ebx=0x0000012e ecx=0x00000000 edx=0x00000000
CPU 175:
   0x8000001e 0x00: eax=0x0000005f ebx=0x0000012f ecx=0x00000000 edx=0x00000000
CPU 176:
   0x8000001e 0x00: eax=0x00000061 ebx=0x00000130 ecx=0x00000000 edx=0x00000000
CPU 177:
   0x8000001e 0x00: eax=0x00000063 ebx=0x00000131 ecx=0x00000000 edx=0x00000000
CPU 178:
   0x8000001e 0x00: eax=0x00000065 ebx=0x00000132 ecx=0x00000000 edx=0x00000000
CPU 179:
   0x8000001e 0x00: eax=0x00000067 ebx=0x00000133 ecx=0x00000000 edx=0x00000000
CPU 180:
   0x8000001e 0x00: eax=0x00000069 ebx=0x00000134 ecx=0x00000000 edx=0x00000000
CPU 181:
   0x8000001e 0x00: eax=0x0000006b ebx=0x00000135 ecx=0x00000000 edx=0x00000000
CPU 182:
   0x8000001e 0x00: eax=0x0000006d ebx=0x00000136 ecx=0x00000000 edx=0x00000000
CPU 183:
   0x8000001e 0x00: eax=0x0000006f ebx=0x00000137 ecx=0x00000000 edx=0x00000000
CPU 184:
   0x8000001e 0x00: eax=0x00000071 ebx=0x00000138 ecx=0x00000000 edx=0x00000000
CPU 185:
   0x8000001e 0x00: eax=0x00000073 ebx=0x00000139 ecx=0x00000000 edx=0x00000000
CPU 186:
   0x8000001e 0x00: eax=0x00000075 ebx=0x0000013a ecx=0x00000000 edx=0x00000000
CPU 187:
   0x8000001e 0x00: eax=0x00000077 ebx=0x0000013b ecx=0x00000000 edx=0x00000000
CPU 188:
   0x8000001e 0x00: eax=0x00000079 ebx=0x0000013c ecx=0x00000000 edx=0x00000000
CPU 189:
   0x8000001e 0x00: eax=0x0000007b ebx=0x0000013d ecx=0x00000000 edx=0x00000000
CPU 190:
   0x8000001e 0x00: eax=0x0000007d ebx=0x0000013e ecx=0x00000000 edx=0x00000000
CPU 191:
   0x8000001e 0x00: eax=0x0000007f ebx=0x0000013f ecx=0x00000000 edx=0x00000000
CPU 192:
   0x8000001e 0x00: eax=0x00000081 ebx=0x00000100 ecx=0x00000001 edx=0x00000000
CPU 193:
   0x8000001e 0x00: eax=0x00000083 ebx=0x00000101 ecx=0x00000001 edx=0x00000000
CPU 194:
   0x8000001e 0x00: eax=0x00000085 ebx=0x00000102 ecx=0x00000001 edx=0x00000000
CPU 195:
   0x8000001e 0x00: eax=0x00000087 ebx=0x00000103 ecx=0x00000001 edx=0x00000000
CPU 196:
   0x8000001e 0x00: eax=0x00000089 ebx=0x00000104 ecx=0x00000001 edx=0x00000000
CPU 197:
   0x8000001e 0x00: eax=0x0000008b ebx=0x00000105 ecx=0x00000001 edx=0x00000000
CPU 198:
   0x8000001e 0x00: eax=0x0000008d ebx=0x00000106 ecx=0x00000001 edx=0x00000000
CPU 199:
   0x8000001e 0x00: eax=0x0000008f ebx=0x00000107 ecx=0x00000001 edx=0x00000000
CPU 200:
   0x8000001e 0x00: eax=0x00000091 ebx=0x00000108 ecx=0x00000001 edx=0x00000000
CPU 201:
   0x8000001e 0x00: eax=0x00000093 ebx=0x00000109 ecx=0x00000001 edx=0x00000000
CPU 202:
   0x8000001e 0x00: eax=0x00000095 ebx=0x0000010a ecx=0x00000001 edx=0x00000000
CPU 203:
   0x8000001e 0x00: eax=0x00000097 ebx=0x0000010b ecx=0x00000001 edx=0x00000000
CPU 204:
   0x8000001e 0x00: eax=0x00000099 ebx=0x0000010c ecx=0x00000001 edx=0x00000000
CPU 205:
   0x8000001e 0x00: eax=0x0000009b ebx=0x0000010d ecx=0x00000001 edx=0x00000000
CPU 206:
   0x8000001e 0x00: eax=0x0000009d ebx=0x0000010e ecx=0x00000001 edx=0x00000000
CPU 207:
   0x8000001e 0x00: eax=0x0000009f ebx=0x0000010f ecx=0x00000001 edx=0x00000000
CPU 208:
   0x8000001e 0x00: eax=0x000000a1 ebx=0x00000110 ecx=0x00000001 edx=0x00000000
CPU 209:
   0x8000001e 0x00: eax=0x000000a3 ebx=0x00000111 ecx=0x00000001 edx=0x00000000
CPU 210:
   0x8000001e 0x00: eax=0x000000a5 ebx=0x00000112 ecx=0x00000001 edx=0x00000000
CPU 211:
   0x8000001e 0x00: eax=0x000000a7 ebx=0x00000113 ecx=0x00000001 edx=0x00000000
CPU 212:
   0x8000001e 0x00: eax=0x000000a9 ebx=0x00000114 ecx=0x00000001 edx=0x00000000
CPU 213:
   0x8000001e 0x00: eax=0x000000ab ebx=0x00000115 ecx=0x00000001 edx=0x00000000
CPU 214:
   0x8000001e 0x00: eax=0x000000ad ebx=0x00000116 ecx=0x00000001 edx=0x00000000
CPU 215:
   0x8000001e 0x00: eax=0x000000af ebx=0x00000117 ecx=0x00000001 edx=0x00000000
CPU 216:
   0x8000001e 0x00: eax=0x000000b1 ebx=0x00000118 ecx=0x00000001 edx=0x00000000
CPU 217:
   0x8000001e 0x00: eax=0x000000b3 ebx=0x00000119 ecx=0x00000001 edx=0x00000000
CPU 218:
   0x8000001e 0x00: eax=0x000000b5 ebx=0x0000011a ecx=0x00000001 edx=0x00000000
CPU 219:
   0x8000001e 0x00: eax=0x000000b7 ebx=0x0000011b ecx=0x00000001 edx=0x00000000
CPU 220:
   0x8000001e 0x00: eax=0x000000b9 ebx=0x0000011c ecx=0x00000001 edx=0x00000000
CPU 221:
   0x8000001e 0x00: eax=0x000000bb ebx=0x0000011d ecx=0x00000001 edx=0x00000000
CPU 222:
   0x8000001e 0x00: eax=0x000000bd ebx=0x0000011e ecx=0x00000001 edx=0x00000000
CPU 223:
   0x8000001e 0x00: eax=0x000000bf ebx=0x0000011f ecx=0x00000001 edx=0x00000000
CPU 224:
   0x8000001e 0x00: eax=0x000000c1 ebx=0x00000120 ecx=0x00000001 edx=0x00000000
CPU 225:
   0x8000001e 0x00: eax=0x000000c3 ebx=0x00000121 ecx=0x00000001 edx=0x00000000
CPU 226:
   0x8000001e 0x00: eax=0x000000c5 ebx=0x00000122 ecx=0x00000001 edx=0x00000000
CPU 227:
   0x8000001e 0x00: eax=0x000000c7 ebx=0x00000123 ecx=0x00000001 edx=0x00000000
CPU 228:
   0x8000001e 0x00: eax=0x000000c9 ebx=0x00000124 ecx=0x00000001 edx=0x00000000
CPU 229:
   0x8000001e 0x00: eax=0x000000cb ebx=0x00000125 ecx=0x00000001 edx=0x00000000
CPU 230:
   0x8000001e 0x00: eax=0x000000cd ebx=0x00000126 ecx=0x00000001 edx=0x00000000
CPU 231:
   0x8000001e 0x00: eax=0x000000cf ebx=0x00000127 ecx=0x00000001 edx=0x00000000
CPU 232:
   0x8000001e 0x00: eax=0x000000d1 ebx=0x00000128 ecx=0x00000001 edx=0x00000000
CPU 233:
   0x8000001e 0x00: eax=0x000000d3 ebx=0x00000129 ecx=0x00000001 edx=0x00000000
CPU 234:
   0x8000001e 0x00: eax=0x000000d5 ebx=0x0000012a ecx=0x00000001 edx=0x00000000
CPU 235:
   0x8000001e 0x00: eax=0x000000d7 ebx=0x0000012b ecx=0x00000001 edx=0x00000000
CPU 236:
   0x8000001e 0x00: eax=0x000000d9 ebx=0x0000012c ecx=0x00000001 edx=0x00000000
CPU 237:
   0x8000001e 0x00: eax=0x000000db ebx=0x0000012d ecx=0x00000001 edx=0x00000000
CPU 238:
   0x8000001e 0x00: eax=0x000000dd ebx=0x0000012e ecx=0x00000001 edx=0x00000000
CPU 239:
   0x8000001e 0x00: eax=0x000000df ebx=0x0000012f ecx=0x00000001 edx=0x00000000
CPU 240:
   0x8000001e 0x00: eax=0x000000e1 ebx=0x00000130 ecx=0x00000001 edx=0x00000000
CPU 241:
   0x8000001e 0x00: eax=0x000000e3 ebx=0x00000131 ecx=0x00000001 edx=0x00000000
CPU 242:
   0x8000001e 0x00: eax=0x000000e5 ebx=0x00000132 ecx=0x00000001 edx=0x00000000
CPU 243:
   0x8000001e 0x00: eax=0x000000e7 ebx=0x00000133 ecx=0x00000001 edx=0x00000000
CPU 244:
   0x8000001e 0x00: eax=0x000000e9 ebx=0x00000134 ecx=0x00000001 edx=0x00000000
CPU 245:
   0x8000001e 0x00: eax=0x000000eb ebx=0x00000135 ecx=0x00000001 edx=0x00000000
CPU 246:
   0x8000001e 0x00: eax=0x000000ed ebx=0x00000136 ecx=0x00000001 edx=0x00000000
CPU 247:
   0x8000001e 0x00: eax=0x000000ef ebx=0x00000137 ecx=0x00000001 edx=0x00000000
CPU 248:
   0x8000001e 0x00: eax=0x000000f1 ebx=0x00000138 ecx=0x00000001 edx=0x00000000
CPU 249:
   0x8000001e 0x00: eax=0x000000f3 ebx=0x00000139 ecx=0x00000001 edx=0x00000000
CPU 250:
   0x8000001e 0x00: eax=0x000000f5 ebx=0x0000013a ecx=0x00000001 edx=0x00000000
CPU 251:
   0x8000001e 0x00: eax=0x000000f7 ebx=0x0000013b ecx=0x00000001 edx=0x00000000
CPU 252:
   0x8000001e 0x00: eax=0x000000f9 ebx=0x0000013c ecx=0x00000001 edx=0x00000000
CPU 253:
   0x8000001e 0x00: eax=0x000000fb ebx=0x0000013d ecx=0x00000001 edx=0x00000000
CPU 254:
   0x8000001e 0x00: eax=0x000000fd ebx=0x0000013e ecx=0x00000001 edx=0x00000000
CPU 255:
   0x8000001e 0x00: eax=0x000000ff ebx=0x0000013f ecx=0x00000001 edx=0x00000000
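
For readers who want to decode the raw registers above, here is a minimal sketch. It assumes the Family 17h layout of CPUID Fn8000_001E as I understand it from the AMD documentation: EAX is the extended APIC ID, EBX[7:0] is the core ID, EBX[15:8] is threads per core minus one, ECX[7:0] is the node ID, and ECX[10:8] is nodes per processor minus one. The names Topo8000001E and decode_line are made up for illustration and are not part of QEMU or the cpuid tool.

```python
import re
from typing import NamedTuple


class Topo8000001E(NamedTuple):
    """Fields decoded from one CPUID 0x8000001e line (assumed Family 17h layout)."""
    apic_id: int
    core_id: int
    threads_per_core: int
    node_id: int
    nodes_per_processor: int


# Matches the eax/ebx/ecx fields of a "cpuid -l 0x8000001e -r" output line.
LINE_RE = re.compile(
    r"eax=0x([0-9a-f]+)\s+ebx=0x([0-9a-f]+)\s+ecx=0x([0-9a-f]+)",
    re.IGNORECASE,
)


def decode_line(line: str) -> Topo8000001E:
    """Decode one raw cpuid line into topology fields."""
    m = LINE_RE.search(line)
    if m is None:
        raise ValueError("not a raw cpuid register line: %r" % line)
    eax, ebx, ecx = (int(g, 16) for g in m.groups())
    return Topo8000001E(
        apic_id=eax,                                 # EAX[31:0]
        core_id=ebx & 0xFF,                          # EBX[7:0]
        threads_per_core=((ebx >> 8) & 0xFF) + 1,    # EBX[15:8] stores count - 1
        node_id=ecx & 0xFF,                          # ECX[7:0]
        nodes_per_processor=((ecx >> 8) & 0x7) + 1,  # ECX[10:8] stores count - 1
    )


if __name__ == "__main__":
    # Last line of the NPS4 dump above (CPU 255).
    sample = ("   0x8000001e 0x00: eax=0x000000ff ebx=0x0000013f "
              "ecx=0x00000001 edx=0x00000000")
    print(decode_line(sample))
```

Applied to the NPS4 dump, this gives node ID 0 for CPUs 0-63 and 128-191 and node ID 1 for the rest, even though lscpu reports 8 NUMA nodes for the same boot.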



* Re: [PATCH v5 7/8] Revert "hw/386: Add EPYC mode topology decoding functions"
  2020-08-28 17:27   ` Eduardo Habkost
@ 2020-08-28 18:03     ` Babu Moger
  0 siblings, 0 replies; 51+ messages in thread
From: Babu Moger @ 2020-08-28 18:03 UTC (permalink / raw)
  To: Eduardo Habkost; +Cc: mst, qemu-devel, imammedo, pbonzini, rth



On 8/28/20 12:27 PM, Eduardo Habkost wrote:
> On Fri, Aug 21, 2020 at 05:13:03PM -0500, Babu Moger wrote:
>> Remove the EPYC specific apicid decoding and use the generic
>> default decoding.
>>
>> This reverts commit 7568b205555a6405042f62c64af3268f4330aed5.
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
> [...]
>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>> index 19198e3e7f..b29686220e 100644
>> --- a/target/i386/cpu.c
>> +++ b/target/i386/cpu.c
>> @@ -388,7 +388,7 @@ static void encode_topo_cpuid8000001e(X86CPUTopoInfo *topo_info, X86CPU *cpu,
>>      unsigned long dies = topo_info->dies_per_pkg;
>>      int shift;
>>  
>> -    x86_topo_ids_from_apicid_epyc(cpu->apic_id, topo_info, &topo_ids);
>> +    x86_topo_ids_from_apicid(cpu->apic_id, topo_info, &topo_ids);
> 
> This was not part of commit 7568b205555a6405042f62c64af3268f4330aed5.
> I suggest doing this change as a separate patch, to make review easier.
> 
> That line was added by commit dd08ef0318e2
> ("target/i386: Cleanup and use the EPYC mode topology functions").
> Wouldn't it be simpler to revert that commit?  If there are parts
> of commit dd08ef0318e2 we want to keep, they can be re-added
> in a separate patch.
> 

Sure. Will take care of it in the next revision. Thanks.



Thread overview: 51+ messages
2020-08-21 22:12 [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode Babu Moger
2020-08-21 22:12 ` [PATCH v5 1/8] hw/i386: Remove node_id, nr_nodes and nodes_per_pkg from topology Babu Moger
2020-08-26  9:57   ` Igor Mammedov
2020-08-26 17:31     ` Babu Moger
2020-08-21 22:12 ` [PATCH v5 2/8] Revert "i386: Fix pkg_id offset for EPYC cpu models" Babu Moger
2020-08-21 22:12 ` [PATCH v5 3/8] Revert "target/i386: Enable new apic id encoding for EPYC based cpus models" Babu Moger
2020-08-21 22:12 ` [PATCH v5 4/8] Revert "hw/i386: Move arch_id decode inside x86_cpus_init" Babu Moger
2020-08-21 22:12 ` [PATCH v5 5/8] Revert "i386: Introduce use_epyc_apic_id_encoding in X86CPUDefinition" Babu Moger
2020-08-21 22:12 ` [PATCH v5 6/8] Revert "hw/i386: Introduce apicid functions inside X86MachineState" Babu Moger
2020-08-21 22:13 ` [PATCH v5 7/8] Revert "hw/386: Add EPYC mode topology decoding functions" Babu Moger
2020-08-28 17:27   ` Eduardo Habkost
2020-08-28 18:03     ` Babu Moger
2020-08-21 22:13 ` [PATCH v5 8/8] i386: Simplify CPUID_8000_001E for AMD Babu Moger
2020-08-26 12:19   ` Igor Mammedov
2020-08-24 18:41 ` [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode Dr. David Alan Gilbert
2020-08-24 19:20   ` Babu Moger
2020-08-25  8:15     ` Dr. David Alan Gilbert
2020-08-25 14:38       ` Igor Mammedov
2020-08-25 15:25         ` Dr. David Alan Gilbert
2020-08-26 12:43           ` Igor Mammedov
2020-08-26 14:10             ` Dr. David Alan Gilbert
2020-08-27 21:19               ` Igor Mammedov
2020-08-27 22:58                 ` Babu Moger
2020-08-28  8:42                   ` Igor Mammedov
2020-08-28 14:22                     ` Babu Moger
2020-08-28  8:48                 ` Dr. David Alan Gilbert
2020-08-28 11:36                   ` Igor Mammedov
2020-08-26 12:38 ` Igor Mammedov
2020-08-26 12:50   ` Daniel P. Berrangé
2020-08-26 13:30     ` Igor Mammedov
2020-08-26 13:36       ` Daniel P. Berrangé
2020-08-26 14:02         ` Igor Mammedov
2020-08-26 15:03           ` Daniel P. Berrangé
2020-08-26 15:18             ` Eduardo Habkost
2020-08-27 17:03             ` Igor Mammedov
2020-08-27 19:07               ` Eduardo Habkost
2020-08-27 20:55                 ` Igor Mammedov
2020-08-28  8:55                   ` Daniel P. Berrangé
2020-08-28 16:29                     ` Eduardo Habkost
2020-08-28 16:32                       ` Daniel P. Berrangé
2020-08-28 16:45                         ` Eduardo Habkost
2020-08-28 18:00                           ` Babu Moger
2020-08-26 17:17       ` Babu Moger
2020-08-26 18:33         ` Dr. David Alan Gilbert
2020-08-26 18:45           ` Babu Moger
2020-08-27 20:21             ` Igor Mammedov
2020-08-28  8:58               ` Daniel P. Berrangé
2020-08-28 11:24                 ` Igor Mammedov
2020-08-28 14:17                   ` Babu Moger
2020-08-28 14:48                     ` Igor Mammedov
2020-08-26 14:04   ` Eduardo Habkost
