* [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models
@ 2019-12-04  0:36 Babu Moger
  2019-12-04  0:37 ` [PATCH v3 01/18] hw/i386: Rename X86CPUTopoInfo structure to X86CPUTopoIDs Babu Moger
                   ` (18 more replies)
  0 siblings, 19 replies; 53+ messages in thread
From: Babu Moger @ 2019-12-04  0:36 UTC (permalink / raw)
  To: ehabkost, marcel.apfelbaum, mst, pbonzini, rth, eblake, armbru, imammedo
  Cc: babu.moger, qemu-devel

This series fixes APIC ID encoding problems on AMD EPYC CPUs.
https://bugzilla.redhat.com/show_bug.cgi?id=1728166

Currently, the APIC ID is decoded based on the sequence
sockets->dies->cores->threads. This works for most standard configurations
from AMD and other vendors, but it does not strictly follow AMD's APIC ID
enumeration. In some cases this causes CPU topology inconsistencies: when
booting a guest VM, the kernel validates the topology and finds it
inconsistent with the enumeration of EPYC CPU models.
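
To make the sequence above concrete, here is a minimal sketch of packing
sockets->dies->cores->threads into an APIC ID. The names TopoCounts,
bitwidth_for_count and apicid_from_index are hypothetical and chosen for
illustration; they are not QEMU's functions, although they follow the same
idea as include/hw/i386/topology.h. The sketch assumes each field occupies
just enough bits to hold its count.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the per-level counts (illustration only). */
typedef struct TopoCounts {
    unsigned dies_per_pkg;
    unsigned cores_per_die;
    unsigned threads_per_core;
} TopoCounts;

/* Number of bits needed to represent the IDs 0..count-1. */
static unsigned bitwidth_for_count(unsigned count)
{
    unsigned width = 0;

    assert(count >= 1);
    for (count -= 1; count != 0; count >>= 1) {
        width++;
    }
    return width;
}

/* Pack a contiguous CPU index into an APIC ID laid out as pkg|die|core|smt. */
static uint32_t apicid_from_index(const TopoCounts *t, unsigned cpu_index)
{
    unsigned smt_width = bitwidth_for_count(t->threads_per_core);
    unsigned core_width = bitwidth_for_count(t->cores_per_die);
    unsigned die_width = bitwidth_for_count(t->dies_per_pkg);

    unsigned smt_id = cpu_index % t->threads_per_core;
    unsigned core_id = cpu_index / t->threads_per_core % t->cores_per_die;
    unsigned die_id = cpu_index / (t->cores_per_die * t->threads_per_core)
                      % t->dies_per_pkg;
    unsigned pkg_id = cpu_index /
        (t->dies_per_pkg * t->cores_per_die * t->threads_per_core);

    return ((uint32_t)pkg_id << (smt_width + core_width + die_width)) |
           ((uint32_t)die_id << (smt_width + core_width)) |
           ((uint32_t)core_id << smt_width) |
           smt_id;
}
```

With 1 die, 6 cores and 2 threads, the core field needs 3 bits, so CPU index
11 maps to APIC ID 11 while CPU index 12 (the first thread of the second
socket) maps to APIC ID 16. The resulting IDs are not contiguous.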

To fix the problem we need to build the topology as per the Processor
Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1
Processors. It is available at https://www.amd.com/system/files/TechDocs/55570-B1_PUB.zip

Here is the text from the PPR.
Operating systems are expected to use Core::X86::Cpuid::SizeId[ApicIdSize], the
number of least significant bits in the Initial APIC ID that indicate core ID
within a processor, in constructing per-core CPUID masks.
Core::X86::Cpuid::SizeId[ApicIdSize] determines the maximum number of cores
(MNC) that the processor could theoretically support, not the actual number of
cores that are actually implemented or enabled on the processor, as indicated
by Core::X86::Cpuid::SizeId[NC].
Each Core::X86::Apic::ApicId[ApicId] register is preset as follows:
• ApicId[6] = Socket ID.
• ApicId[5:4] = Node ID.
• ApicId[3] = Logical CCX L3 complex ID
• ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} : {1'b0,LogicalCoreID[1:0]}
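
The bit layout quoted above can be sketched as a small decoder. This is an
illustration only, assuming exactly the 7-bit layout listed in the PPR text;
the type EpycApicFields and the function epyc_decode_apicid are hypothetical
names and are not part of this series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type; field names are for illustration only. */
typedef struct EpycApicFields {
    unsigned socket_id;   /* ApicId[6] */
    unsigned node_id;     /* ApicId[5:4] */
    unsigned ccx_id;      /* ApicId[3] */
    unsigned core_id;     /* LogicalCoreID[1:0] */
    unsigned thread_id;   /* ThreadId, always 0 when SMT is off */
} EpycApicFields;

/* Split an APIC ID into the fields of the quoted PPR layout. */
static EpycApicFields epyc_decode_apicid(uint32_t apicid, bool smt)
{
    EpycApicFields f;

    f.socket_id = (apicid >> 6) & 0x1;
    f.node_id = (apicid >> 4) & 0x3;
    f.ccx_id = (apicid >> 3) & 0x1;
    if (smt) {
        /* ApicId[2:0] = {LogicalCoreID[1:0], ThreadId} */
        f.core_id = (apicid >> 1) & 0x3;
        f.thread_id = apicid & 0x1;
    } else {
        /* ApicId[2:0] = {1'b0, LogicalCoreID[1:0]} */
        f.core_id = apicid & 0x3;
        f.thread_id = 0;
    }
    return f;
}
```

For example, APIC ID 0x5B (binary 1011011) decodes to socket 1, node 1, CCX 1;
with SMT the low three bits give core 1, thread 1, and without SMT they give
core 3.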

v3:
  1. Consolidated the topology information in structure X86CPUTopoInfo.
  2. Changed the ccx_id to llc_id as commented by upstream.
  3. Generalized the apic id decoding. It is mostly similar to the current apic id
     except that it adds a new field llc_id when numa is configured. Removes all the
     hardcoded values.
  4. Removed the earlier parse_numa split and moved the numa node initialization
     inside numa_complete_configuration. This is a bit cleaner, as commented by
     Eduardo.
  5. Added new function init_apicid_fn inside machine_class structure. This
     will be used to update the apic id handler specific to cpu model.
  6. Updated the cpuid unit tests.
  7. TODO : Need to figure out how to dynamically update the handlers using cpu models.
     I might need some guidance on that.

v2:
  https://lore.kernel.org/qemu-devel/156779689013.21957.1631551572950676212.stgit@localhost.localdomain/
  1. Introduced the new property epyc to enable new epyc mode.
  2. Separated the epyc mode and non epyc mode function.
  3. Introduced function pointers in PCMachineState to handle the
     differences.
  4. Mildly tested different combinations to make sure things are working as expected.
  5. TODO : Setting the epyc feature bit needs to be worked out. This feature is
     supported only on AMD EPYC models. I may need some guidance on that.

v1:
  https://lore.kernel.org/qemu-devel/20190731232032.51786-1-babu.moger@amd.com/

---

Babu Moger (18):
      hw/i386: Rename X86CPUTopoInfo structure to X86CPUTopoIDs
      hw/i386: Introduce X86CPUTopoInfo to contain topology info
      hw/i386: Consolidate topology functions
      hw/i386: Introduce initialize_topo_info to initialize X86CPUTopoInfo
      machine: Add SMP Sockets in CpuTopology
      hw/core: Add core complex id in X86CPU topology
      machine: Add a new function init_apicid_fn in MachineClass
      hw/i386: Update structures for nodes_per_pkg
      i386: Add CPUX86Family type in CPUX86State
      hw/386: Add EPYC mode topology decoding functions
      i386: Cleanup and use the EPYC mode topology functions
      numa: Split the numa initialization
      hw/i386: Introduce apicid_from_cpu_idx in PCMachineState
      hw/i386: Introduce topo_ids_from_apicid handler PCMachineState
      hw/i386: Introduce apic_id_from_topo_ids handler in PCMachineState
      hw/i386: Introduce EPYC mode function handlers
      i386: Fix pkg_id offset for epyc mode
      tests: Update the Unit tests


 hw/core/machine-hmp-cmds.c |    3 +
 hw/core/machine.c          |   14 +++
 hw/core/numa.c             |   62 +++++++++----
 hw/i386/pc.c               |  132 +++++++++++++++++++---------
 include/hw/boards.h        |    3 +
 include/hw/i386/pc.h       |    9 ++
 include/hw/i386/topology.h |  209 +++++++++++++++++++++++++++++++-------------
 include/sysemu/numa.h      |    5 +
 qapi/machine.json          |    7 +
 target/i386/cpu.c          |  196 ++++++++++++-----------------------------
 target/i386/cpu.h          |    9 ++
 tests/test-x86-cpuid.c     |  115 ++++++++++++++----------
 vl.c                       |    4 +
 13 files changed, 455 insertions(+), 313 deletions(-)

--



* [PATCH v3 01/18] hw/i386: Rename X86CPUTopoInfo structure to X86CPUTopoIDs
  2019-12-04  0:36 [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models Babu Moger
@ 2019-12-04  0:37 ` Babu Moger
  2020-02-03 15:08   ` Igor Mammedov
  2019-12-04  0:37 ` [PATCH v3 02/18] hw/i386: Introduce X86CPUTopoInfo to contain topology info Babu Moger
                   ` (17 subsequent siblings)
  18 siblings, 1 reply; 53+ messages in thread
From: Babu Moger @ 2019-12-04  0:37 UTC (permalink / raw)
  To: ehabkost, marcel.apfelbaum, mst, pbonzini, rth, eblake, armbru, imammedo
  Cc: babu.moger, qemu-devel

Rename a few data structures related to X86 topology. X86CPUTopoIDs will
hold the individual arch IDs. The next patch introduces X86CPUTopoInfo, which
will hold all the topology information (such as cores, threads, etc.).

Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
---
 hw/i386/pc.c               |   60 ++++++++++++++++++++++----------------------
 include/hw/i386/topology.h |   40 +++++++++++++++--------------
 2 files changed, 50 insertions(+), 50 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 51b72439b4..5bd2ffccb7 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2212,7 +2212,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
     int idx;
     CPUState *cs;
     CPUArchId *cpu_slot;
-    X86CPUTopoInfo topo;
+    X86CPUTopoIDs topo_ids;
     X86CPU *cpu = X86_CPU(dev);
     CPUX86State *env = &cpu->env;
     MachineState *ms = MACHINE(hotplug_dev);
@@ -2277,12 +2277,12 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
             return;
         }
 
-        topo.pkg_id = cpu->socket_id;
-        topo.die_id = cpu->die_id;
-        topo.core_id = cpu->core_id;
-        topo.smt_id = cpu->thread_id;
+        topo_ids.pkg_id = cpu->socket_id;
+        topo_ids.die_id = cpu->die_id;
+        topo_ids.core_id = cpu->core_id;
+        topo_ids.smt_id = cpu->thread_id;
         cpu->apic_id = apicid_from_topo_ids(pcms->smp_dies, smp_cores,
-                                            smp_threads, &topo);
+                                            smp_threads, &topo_ids);
     }
 
     cpu_slot = pc_find_cpu_slot(MACHINE(pcms), cpu->apic_id, &idx);
@@ -2290,11 +2290,11 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
         MachineState *ms = MACHINE(pcms);
 
         x86_topo_ids_from_apicid(cpu->apic_id, pcms->smp_dies,
-                                 smp_cores, smp_threads, &topo);
+                                 smp_cores, smp_threads, &topo_ids);
         error_setg(errp,
             "Invalid CPU [socket: %u, die: %u, core: %u, thread: %u] with"
             " APIC ID %" PRIu32 ", valid index range 0:%d",
-            topo.pkg_id, topo.die_id, topo.core_id, topo.smt_id,
+            topo_ids.pkg_id, topo_ids.die_id, topo_ids.core_id, topo_ids.smt_id,
             cpu->apic_id, ms->possible_cpus->len - 1);
         return;
     }
@@ -2312,34 +2312,34 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
      * once -smp refactoring is complete and there will be CPU private
      * CPUState::nr_cores and CPUState::nr_threads fields instead of globals */
     x86_topo_ids_from_apicid(cpu->apic_id, pcms->smp_dies,
-                             smp_cores, smp_threads, &topo);
-    if (cpu->socket_id != -1 && cpu->socket_id != topo.pkg_id) {
+                             smp_cores, smp_threads, &topo_ids);
+    if (cpu->socket_id != -1 && cpu->socket_id != topo_ids.pkg_id) {
         error_setg(errp, "property socket-id: %u doesn't match set apic-id:"
-            " 0x%x (socket-id: %u)", cpu->socket_id, cpu->apic_id, topo.pkg_id);
+            " 0x%x (socket-id: %u)", cpu->socket_id, cpu->apic_id, topo_ids.pkg_id);
         return;
     }
-    cpu->socket_id = topo.pkg_id;
+    cpu->socket_id = topo_ids.pkg_id;
 
-    if (cpu->die_id != -1 && cpu->die_id != topo.die_id) {
+    if (cpu->die_id != -1 && cpu->die_id != topo_ids.die_id) {
         error_setg(errp, "property die-id: %u doesn't match set apic-id:"
-            " 0x%x (die-id: %u)", cpu->die_id, cpu->apic_id, topo.die_id);
+            " 0x%x (die-id: %u)", cpu->die_id, cpu->apic_id, topo_ids.die_id);
         return;
     }
-    cpu->die_id = topo.die_id;
+    cpu->die_id = topo_ids.die_id;
 
-    if (cpu->core_id != -1 && cpu->core_id != topo.core_id) {
+    if (cpu->core_id != -1 && cpu->core_id != topo_ids.core_id) {
         error_setg(errp, "property core-id: %u doesn't match set apic-id:"
-            " 0x%x (core-id: %u)", cpu->core_id, cpu->apic_id, topo.core_id);
+            " 0x%x (core-id: %u)", cpu->core_id, cpu->apic_id, topo_ids.core_id);
         return;
     }
-    cpu->core_id = topo.core_id;
+    cpu->core_id = topo_ids.core_id;
 
-    if (cpu->thread_id != -1 && cpu->thread_id != topo.smt_id) {
+    if (cpu->thread_id != -1 && cpu->thread_id != topo_ids.smt_id) {
         error_setg(errp, "property thread-id: %u doesn't match set apic-id:"
-            " 0x%x (thread-id: %u)", cpu->thread_id, cpu->apic_id, topo.smt_id);
+            " 0x%x (thread-id: %u)", cpu->thread_id, cpu->apic_id, topo_ids.smt_id);
         return;
     }
-    cpu->thread_id = topo.smt_id;
+    cpu->thread_id = topo_ids.smt_id;
 
     if (hyperv_feat_enabled(cpu, HYPERV_FEAT_VPINDEX) &&
         !kvm_hv_vpindex_settable()) {
@@ -2692,14 +2692,14 @@ pc_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
 
 static int64_t pc_get_default_cpu_node_id(const MachineState *ms, int idx)
 {
-   X86CPUTopoInfo topo;
+   X86CPUTopoIDs topo_ids;
    PCMachineState *pcms = PC_MACHINE(ms);
 
    assert(idx < ms->possible_cpus->len);
    x86_topo_ids_from_apicid(ms->possible_cpus->cpus[idx].arch_id,
                             pcms->smp_dies, ms->smp.cores,
-                            ms->smp.threads, &topo);
-   return topo.pkg_id % ms->numa_state->num_nodes;
+                            ms->smp.threads, &topo_ids);
+   return topo_ids.pkg_id % ms->numa_state->num_nodes;
 }
 
 static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
@@ -2721,24 +2721,24 @@ static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
                                   sizeof(CPUArchId) * max_cpus);
     ms->possible_cpus->len = max_cpus;
     for (i = 0; i < ms->possible_cpus->len; i++) {
-        X86CPUTopoInfo topo;
+        X86CPUTopoIDs topo_ids;
 
         ms->possible_cpus->cpus[i].type = ms->cpu_type;
         ms->possible_cpus->cpus[i].vcpus_count = 1;
         ms->possible_cpus->cpus[i].arch_id = x86_cpu_apic_id_from_index(pcms, i);
         x86_topo_ids_from_apicid(ms->possible_cpus->cpus[i].arch_id,
                                  pcms->smp_dies, ms->smp.cores,
-                                 ms->smp.threads, &topo);
+                                 ms->smp.threads, &topo_ids);
         ms->possible_cpus->cpus[i].props.has_socket_id = true;
-        ms->possible_cpus->cpus[i].props.socket_id = topo.pkg_id;
+        ms->possible_cpus->cpus[i].props.socket_id = topo_ids.pkg_id;
         if (pcms->smp_dies > 1) {
             ms->possible_cpus->cpus[i].props.has_die_id = true;
-            ms->possible_cpus->cpus[i].props.die_id = topo.die_id;
+            ms->possible_cpus->cpus[i].props.die_id = topo_ids.die_id;
         }
         ms->possible_cpus->cpus[i].props.has_core_id = true;
-        ms->possible_cpus->cpus[i].props.core_id = topo.core_id;
+        ms->possible_cpus->cpus[i].props.core_id = topo_ids.core_id;
         ms->possible_cpus->cpus[i].props.has_thread_id = true;
-        ms->possible_cpus->cpus[i].props.thread_id = topo.smt_id;
+        ms->possible_cpus->cpus[i].props.thread_id = topo_ids.smt_id;
     }
     return ms->possible_cpus;
 }
diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index 4ff5b2da6c..6c184f3115 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -45,12 +45,12 @@
  */
 typedef uint32_t apic_id_t;
 
-typedef struct X86CPUTopoInfo {
+typedef struct X86CPUTopoIDs {
     unsigned pkg_id;
     unsigned die_id;
     unsigned core_id;
     unsigned smt_id;
-} X86CPUTopoInfo;
+} X86CPUTopoIDs;
 
 /* Return the bit width needed for 'count' IDs
  */
@@ -122,12 +122,12 @@ static inline unsigned apicid_pkg_offset(unsigned nr_dies,
 static inline apic_id_t apicid_from_topo_ids(unsigned nr_dies,
                                              unsigned nr_cores,
                                              unsigned nr_threads,
-                                             const X86CPUTopoInfo *topo)
+                                             const X86CPUTopoIDs *topo_ids)
 {
-    return (topo->pkg_id  << apicid_pkg_offset(nr_dies, nr_cores, nr_threads)) |
-           (topo->die_id  << apicid_die_offset(nr_dies, nr_cores, nr_threads)) |
-          (topo->core_id << apicid_core_offset(nr_dies, nr_cores, nr_threads)) |
-           topo->smt_id;
+    return (topo_ids->pkg_id  << apicid_pkg_offset(nr_dies, nr_cores, nr_threads)) |
+           (topo_ids->die_id  << apicid_die_offset(nr_dies, nr_cores, nr_threads)) |
+           (topo_ids->core_id << apicid_core_offset(nr_dies, nr_cores, nr_threads)) |
+           topo_ids->smt_id;
 }
 
 /* Calculate thread/core/package IDs for a specific topology,
@@ -137,12 +137,12 @@ static inline void x86_topo_ids_from_idx(unsigned nr_dies,
                                          unsigned nr_cores,
                                          unsigned nr_threads,
                                          unsigned cpu_index,
-                                         X86CPUTopoInfo *topo)
+                                         X86CPUTopoIDs *topo_ids)
 {
-    topo->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
-    topo->die_id = cpu_index / (nr_cores * nr_threads) % nr_dies;
-    topo->core_id = cpu_index / nr_threads % nr_cores;
-    topo->smt_id = cpu_index % nr_threads;
+    topo_ids->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
+    topo_ids->die_id = cpu_index / (nr_cores * nr_threads) % nr_dies;
+    topo_ids->core_id = cpu_index / nr_threads % nr_cores;
+    topo_ids->smt_id = cpu_index % nr_threads;
 }
 
 /* Calculate thread/core/package IDs for a specific topology,
@@ -152,17 +152,17 @@ static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
                                             unsigned nr_dies,
                                             unsigned nr_cores,
                                             unsigned nr_threads,
-                                            X86CPUTopoInfo *topo)
+                                            X86CPUTopoIDs *topo_ids)
 {
-    topo->smt_id = apicid &
+    topo_ids->smt_id = apicid &
             ~(0xFFFFFFFFUL << apicid_smt_width(nr_dies, nr_cores, nr_threads));
-    topo->core_id =
+    topo_ids->core_id =
             (apicid >> apicid_core_offset(nr_dies, nr_cores, nr_threads)) &
             ~(0xFFFFFFFFUL << apicid_core_width(nr_dies, nr_cores, nr_threads));
-    topo->die_id =
+    topo_ids->die_id =
             (apicid >> apicid_die_offset(nr_dies, nr_cores, nr_threads)) &
             ~(0xFFFFFFFFUL << apicid_die_width(nr_dies, nr_cores, nr_threads));
-    topo->pkg_id = apicid >> apicid_pkg_offset(nr_dies, nr_cores, nr_threads);
+    topo_ids->pkg_id = apicid >> apicid_pkg_offset(nr_dies, nr_cores, nr_threads);
 }
 
 /* Make APIC ID for the CPU 'cpu_index'
@@ -174,9 +174,9 @@ static inline apic_id_t x86_apicid_from_cpu_idx(unsigned nr_dies,
                                                 unsigned nr_threads,
                                                 unsigned cpu_index)
 {
-    X86CPUTopoInfo topo;
-    x86_topo_ids_from_idx(nr_dies, nr_cores, nr_threads, cpu_index, &topo);
-    return apicid_from_topo_ids(nr_dies, nr_cores, nr_threads, &topo);
+    X86CPUTopoIDs topo_ids;
+    x86_topo_ids_from_idx(nr_dies, nr_cores, nr_threads, cpu_index, &topo_ids);
+    return apicid_from_topo_ids(nr_dies, nr_cores, nr_threads, &topo_ids);
 }
 
 #endif /* HW_I386_TOPOLOGY_H */




* [PATCH v3 02/18] hw/i386: Introduce X86CPUTopoInfo to contain topology info
  2019-12-04  0:36 [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models Babu Moger
  2019-12-04  0:37 ` [PATCH v3 01/18] hw/i386: Rename X86CPUTopoInfo structure to X86CPUTopoIDs Babu Moger
@ 2019-12-04  0:37 ` Babu Moger
  2020-01-28 15:44   ` Igor Mammedov
  2019-12-04  0:37 ` [PATCH v3 03/18] hw/i386: Consolidate topology functions Babu Moger
                   ` (16 subsequent siblings)
  18 siblings, 1 reply; 53+ messages in thread
From: Babu Moger @ 2019-12-04  0:37 UTC (permalink / raw)
  To: ehabkost, marcel.apfelbaum, mst, pbonzini, rth, eblake, armbru, imammedo
  Cc: babu.moger, qemu-devel

This is an effort to rearrange a few data structures for better readability.
Add X86CPUTopoInfo, which will hold all the topology information required
to build the CPU topology. There are no functional changes.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 hw/i386/pc.c               |   40 +++++++++++++++++++++++++++-------------
 include/hw/i386/topology.h |   38 ++++++++++++++++++++++++--------------
 2 files changed, 51 insertions(+), 27 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 5bd2ffccb7..8c23b1e8c9 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -878,11 +878,15 @@ static uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
 {
     MachineState *ms = MACHINE(pcms);
     PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
+    X86CPUTopoInfo topo_info;
     uint32_t correct_id;
     static bool warned;
 
-    correct_id = x86_apicid_from_cpu_idx(pcms->smp_dies, ms->smp.cores,
-                                         ms->smp.threads, cpu_index);
+    topo_info.dies_per_pkg = pcms->smp_dies;
+    topo_info.cores_per_die = ms->smp.cores;
+    topo_info.threads_per_core = ms->smp.threads;
+
+    correct_id = x86_apicid_from_cpu_idx(&topo_info, cpu_index);
     if (pcmc->compat_apic_id_mode) {
         if (cpu_index != correct_id && !warned && !qtest_enabled()) {
             error_report("APIC IDs set in compatibility mode, "
@@ -2219,6 +2223,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
     PCMachineState *pcms = PC_MACHINE(hotplug_dev);
     unsigned int smp_cores = ms->smp.cores;
     unsigned int smp_threads = ms->smp.threads;
+    X86CPUTopoInfo topo_info;
 
     if(!object_dynamic_cast(OBJECT(cpu), ms->cpu_type)) {
         error_setg(errp, "Invalid CPU type, expected cpu type: '%s'",
@@ -2226,6 +2231,10 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
         return;
     }
 
+    topo_info.dies_per_pkg = pcms->smp_dies;
+    topo_info.cores_per_die = smp_cores;
+    topo_info.threads_per_core = smp_threads;
+
     env->nr_dies = pcms->smp_dies;
 
     /*
@@ -2281,16 +2290,14 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
         topo_ids.die_id = cpu->die_id;
         topo_ids.core_id = cpu->core_id;
         topo_ids.smt_id = cpu->thread_id;
-        cpu->apic_id = apicid_from_topo_ids(pcms->smp_dies, smp_cores,
-                                            smp_threads, &topo_ids);
+        cpu->apic_id = apicid_from_topo_ids(&topo_info, &topo_ids);
     }
 
     cpu_slot = pc_find_cpu_slot(MACHINE(pcms), cpu->apic_id, &idx);
     if (!cpu_slot) {
         MachineState *ms = MACHINE(pcms);
 
-        x86_topo_ids_from_apicid(cpu->apic_id, pcms->smp_dies,
-                                 smp_cores, smp_threads, &topo_ids);
+        x86_topo_ids_from_apicid(cpu->apic_id, &topo_info, &topo_ids);
         error_setg(errp,
             "Invalid CPU [socket: %u, die: %u, core: %u, thread: %u] with"
             " APIC ID %" PRIu32 ", valid index range 0:%d",
@@ -2311,8 +2318,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
     /* TODO: move socket_id/core_id/thread_id checks into x86_cpu_realizefn()
      * once -smp refactoring is complete and there will be CPU private
      * CPUState::nr_cores and CPUState::nr_threads fields instead of globals */
-    x86_topo_ids_from_apicid(cpu->apic_id, pcms->smp_dies,
-                             smp_cores, smp_threads, &topo_ids);
+    x86_topo_ids_from_apicid(cpu->apic_id, &topo_info, &topo_ids);
     if (cpu->socket_id != -1 && cpu->socket_id != topo_ids.pkg_id) {
         error_setg(errp, "property socket-id: %u doesn't match set apic-id:"
             " 0x%x (socket-id: %u)", cpu->socket_id, cpu->apic_id, topo_ids.pkg_id);
@@ -2694,19 +2700,28 @@ static int64_t pc_get_default_cpu_node_id(const MachineState *ms, int idx)
 {
    X86CPUTopoIDs topo_ids;
    PCMachineState *pcms = PC_MACHINE(ms);
+   X86CPUTopoInfo topo_info;
+
+   topo_info.dies_per_pkg = pcms->smp_dies;
+   topo_info.cores_per_die = ms->smp.cores;
+   topo_info.threads_per_core = ms->smp.threads;
 
    assert(idx < ms->possible_cpus->len);
    x86_topo_ids_from_apicid(ms->possible_cpus->cpus[idx].arch_id,
-                            pcms->smp_dies, ms->smp.cores,
-                            ms->smp.threads, &topo_ids);
+                            &topo_info, &topo_ids);
    return topo_ids.pkg_id % ms->numa_state->num_nodes;
 }
 
 static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
 {
     PCMachineState *pcms = PC_MACHINE(ms);
-    int i;
     unsigned int max_cpus = ms->smp.max_cpus;
+    X86CPUTopoInfo topo_info;
+    int i;
+
+    topo_info.dies_per_pkg = pcms->smp_dies;
+    topo_info.cores_per_die = ms->smp.cores;
+    topo_info.threads_per_core = ms->smp.threads;
 
     if (ms->possible_cpus) {
         /*
@@ -2727,8 +2742,7 @@ static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
         ms->possible_cpus->cpus[i].vcpus_count = 1;
         ms->possible_cpus->cpus[i].arch_id = x86_cpu_apic_id_from_index(pcms, i);
         x86_topo_ids_from_apicid(ms->possible_cpus->cpus[i].arch_id,
-                                 pcms->smp_dies, ms->smp.cores,
-                                 ms->smp.threads, &topo_ids);
+                                 &topo_info, &topo_ids);
         ms->possible_cpus->cpus[i].props.has_socket_id = true;
         ms->possible_cpus->cpus[i].props.socket_id = topo_ids.pkg_id;
         if (pcms->smp_dies > 1) {
diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index 6c184f3115..cf1935d548 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -52,6 +52,12 @@ typedef struct X86CPUTopoIDs {
     unsigned smt_id;
 } X86CPUTopoIDs;
 
+typedef struct X86CPUTopoInfo {
+    unsigned dies_per_pkg;
+    unsigned cores_per_die;
+    unsigned threads_per_core;
+} X86CPUTopoInfo;
+
 /* Return the bit width needed for 'count' IDs
  */
 static unsigned apicid_bitwidth_for_count(unsigned count)
@@ -119,11 +125,13 @@ static inline unsigned apicid_pkg_offset(unsigned nr_dies,
  *
  * The caller must make sure core_id < nr_cores and smt_id < nr_threads.
  */
-static inline apic_id_t apicid_from_topo_ids(unsigned nr_dies,
-                                             unsigned nr_cores,
-                                             unsigned nr_threads,
+static inline apic_id_t apicid_from_topo_ids(X86CPUTopoInfo *topo_info,
                                              const X86CPUTopoIDs *topo_ids)
 {
+    unsigned nr_dies = topo_info->dies_per_pkg;
+    unsigned nr_cores = topo_info->cores_per_die;
+    unsigned nr_threads = topo_info->threads_per_core;
+
     return (topo_ids->pkg_id  << apicid_pkg_offset(nr_dies, nr_cores, nr_threads)) |
            (topo_ids->die_id  << apicid_die_offset(nr_dies, nr_cores, nr_threads)) |
            (topo_ids->core_id << apicid_core_offset(nr_dies, nr_cores, nr_threads)) |
@@ -133,12 +141,14 @@ static inline apic_id_t apicid_from_topo_ids(unsigned nr_dies,
 /* Calculate thread/core/package IDs for a specific topology,
  * based on (contiguous) CPU index
  */
-static inline void x86_topo_ids_from_idx(unsigned nr_dies,
-                                         unsigned nr_cores,
-                                         unsigned nr_threads,
+static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
                                          unsigned cpu_index,
                                          X86CPUTopoIDs *topo_ids)
 {
+    unsigned nr_dies = topo_info->dies_per_pkg;
+    unsigned nr_cores = topo_info->cores_per_die;
+    unsigned nr_threads = topo_info->threads_per_core;
+
     topo_ids->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
     topo_ids->die_id = cpu_index / (nr_cores * nr_threads) % nr_dies;
     topo_ids->core_id = cpu_index / nr_threads % nr_cores;
@@ -149,11 +159,13 @@ static inline void x86_topo_ids_from_idx(unsigned nr_dies,
  * based on APIC ID
  */
 static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
-                                            unsigned nr_dies,
-                                            unsigned nr_cores,
-                                            unsigned nr_threads,
+                                            X86CPUTopoInfo *topo_info,
                                             X86CPUTopoIDs *topo_ids)
 {
+    unsigned nr_dies = topo_info->dies_per_pkg;
+    unsigned nr_cores = topo_info->cores_per_die;
+    unsigned nr_threads = topo_info->threads_per_core;
+
     topo_ids->smt_id = apicid &
             ~(0xFFFFFFFFUL << apicid_smt_width(nr_dies, nr_cores, nr_threads));
     topo_ids->core_id =
@@ -169,14 +181,12 @@ static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
  *
  * 'cpu_index' is a sequential, contiguous ID for the CPU.
  */
-static inline apic_id_t x86_apicid_from_cpu_idx(unsigned nr_dies,
-                                                unsigned nr_cores,
-                                                unsigned nr_threads,
+static inline apic_id_t x86_apicid_from_cpu_idx(X86CPUTopoInfo *topo_info,
                                                 unsigned cpu_index)
 {
     X86CPUTopoIDs topo_ids;
-    x86_topo_ids_from_idx(nr_dies, nr_cores, nr_threads, cpu_index, &topo_ids);
-    return apicid_from_topo_ids(nr_dies, nr_cores, nr_threads, &topo_ids);
+    x86_topo_ids_from_idx(topo_info, cpu_index, &topo_ids);
+    return apicid_from_topo_ids(topo_info, &topo_ids);
 }
 
 #endif /* HW_I386_TOPOLOGY_H */




* [PATCH v3 03/18] hw/i386: Consolidate topology functions
  2019-12-04  0:36 [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models Babu Moger
  2019-12-04  0:37 ` [PATCH v3 01/18] hw/i386: Rename X86CPUTopoInfo structure to X86CPUTopoIDs Babu Moger
  2019-12-04  0:37 ` [PATCH v3 02/18] hw/i386: Introduce X86CPUTopoInfo to contain topology info Babu Moger
@ 2019-12-04  0:37 ` Babu Moger
  2020-01-28 15:46   ` Igor Mammedov
  2019-12-04  0:37 ` [PATCH v3 04/18] hw/i386: Introduce initialize_topo_info to initialize X86CPUTopoInfo Babu Moger
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 53+ messages in thread
From: Babu Moger @ 2019-12-04  0:37 UTC (permalink / raw)
  To: ehabkost, marcel.apfelbaum, mst, pbonzini, rth, eblake, armbru, imammedo
  Cc: babu.moger, qemu-devel

Now that we have all the parameters in X86CPUTopoInfo, we can just pass the
structure to calculate the offsets and widths.
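
As a rough illustration of why a single structure is enough, the sketch below
derives every width and offset from one info argument and uses them to split
an APIC ID back into its IDs. TopoInfo, TopoIDs, width_for_count and
topo_ids_from_apicid are hypothetical stand-ins, not the functions in
topology.h, and the sketch assumes every field width is below 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the topology structures (illustration only). */
typedef struct TopoInfo {
    unsigned dies_per_pkg;
    unsigned cores_per_die;
    unsigned threads_per_core;
} TopoInfo;

typedef struct TopoIDs {
    unsigned pkg_id;
    unsigned die_id;
    unsigned core_id;
    unsigned smt_id;
} TopoIDs;

/* Number of bits needed to represent the IDs 0..count-1. */
static unsigned width_for_count(unsigned count)
{
    unsigned width = 0;

    assert(count >= 1);
    for (count -= 1; count != 0; count >>= 1) {
        width++;
    }
    return width;
}

/* Split an APIC ID into per-level IDs; all widths come from 'info'. */
static void topo_ids_from_apicid(uint32_t apicid, const TopoInfo *info,
                                 TopoIDs *ids)
{
    unsigned smt_width = width_for_count(info->threads_per_core);
    unsigned core_width = width_for_count(info->cores_per_die);
    unsigned die_width = width_for_count(info->dies_per_pkg);

    /* Each field starts where the previous, lower field ends. */
    unsigned core_offset = smt_width;
    unsigned die_offset = core_offset + core_width;
    unsigned pkg_offset = die_offset + die_width;

    ids->smt_id = apicid & ((1u << smt_width) - 1);
    ids->core_id = (apicid >> core_offset) & ((1u << core_width) - 1);
    ids->die_id = (apicid >> die_offset) & ((1u << die_width) - 1);
    ids->pkg_id = apicid >> pkg_offset;
}
```

With 2 dies, 4 cores per die and 2 threads per core, APIC ID 29 (binary 11101)
splits into pkg 1, die 1, core 2, thread 1.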

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 include/hw/i386/topology.h |   64 ++++++++++++++------------------------------
 target/i386/cpu.c          |   23 ++++++++--------
 2 files changed, 32 insertions(+), 55 deletions(-)

diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index cf1935d548..ba52d49079 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -69,56 +69,42 @@ static unsigned apicid_bitwidth_for_count(unsigned count)
 
 /* Bit width of the SMT_ID (thread ID) field on the APIC ID
  */
-static inline unsigned apicid_smt_width(unsigned nr_dies,
-                                        unsigned nr_cores,
-                                        unsigned nr_threads)
+static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
 {
-    return apicid_bitwidth_for_count(nr_threads);
+    return apicid_bitwidth_for_count(topo_info->threads_per_core);
 }
 
 /* Bit width of the Core_ID field
  */
-static inline unsigned apicid_core_width(unsigned nr_dies,
-                                         unsigned nr_cores,
-                                         unsigned nr_threads)
+static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
 {
-    return apicid_bitwidth_for_count(nr_cores);
+    return apicid_bitwidth_for_count(topo_info->cores_per_die);
 }
 
 /* Bit width of the Die_ID field */
-static inline unsigned apicid_die_width(unsigned nr_dies,
-                                        unsigned nr_cores,
-                                        unsigned nr_threads)
+static inline unsigned apicid_die_width(X86CPUTopoInfo *topo_info)
 {
-    return apicid_bitwidth_for_count(nr_dies);
+    return apicid_bitwidth_for_count(topo_info->dies_per_pkg);
 }
 
 /* Bit offset of the Core_ID field
  */
-static inline unsigned apicid_core_offset(unsigned nr_dies,
-                                          unsigned nr_cores,
-                                          unsigned nr_threads)
+static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
 {
-    return apicid_smt_width(nr_dies, nr_cores, nr_threads);
+    return apicid_smt_width(topo_info);
 }
 
 /* Bit offset of the Die_ID field */
-static inline unsigned apicid_die_offset(unsigned nr_dies,
-                                          unsigned nr_cores,
-                                           unsigned nr_threads)
+static inline unsigned apicid_die_offset(X86CPUTopoInfo *topo_info)
 {
-    return apicid_core_offset(nr_dies, nr_cores, nr_threads) +
-           apicid_core_width(nr_dies, nr_cores, nr_threads);
+    return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
 }
 
 /* Bit offset of the Pkg_ID (socket ID) field
  */
-static inline unsigned apicid_pkg_offset(unsigned nr_dies,
-                                         unsigned nr_cores,
-                                         unsigned nr_threads)
+static inline unsigned apicid_pkg_offset(X86CPUTopoInfo *topo_info)
 {
-    return apicid_die_offset(nr_dies, nr_cores, nr_threads) +
-           apicid_die_width(nr_dies, nr_cores, nr_threads);
+    return apicid_die_offset(topo_info) + apicid_die_width(topo_info);
 }
 
 /* Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
@@ -128,13 +114,9 @@ static inline unsigned apicid_pkg_offset(unsigned nr_dies,
 static inline apic_id_t apicid_from_topo_ids(X86CPUTopoInfo *topo_info,
                                              const X86CPUTopoIDs *topo_ids)
 {
-    unsigned nr_dies = topo_info->dies_per_pkg;
-    unsigned nr_cores = topo_info->cores_per_die;
-    unsigned nr_threads = topo_info->threads_per_core;
-
-    return (topo_ids->pkg_id  << apicid_pkg_offset(nr_dies, nr_cores, nr_threads)) |
-           (topo_ids->die_id  << apicid_die_offset(nr_dies, nr_cores, nr_threads)) |
-           (topo_ids->core_id << apicid_core_offset(nr_dies, nr_cores, nr_threads)) |
+    return (topo_ids->pkg_id  << apicid_pkg_offset(topo_info)) |
+           (topo_ids->die_id  << apicid_die_offset(topo_info)) |
+           (topo_ids->core_id << apicid_core_offset(topo_info)) |
            topo_ids->smt_id;
 }
 
@@ -162,19 +144,15 @@ static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
                                             X86CPUTopoInfo *topo_info,
                                             X86CPUTopoIDs *topo_ids)
 {
-    unsigned nr_dies = topo_info->dies_per_pkg;
-    unsigned nr_cores = topo_info->cores_per_die;
-    unsigned nr_threads = topo_info->threads_per_core;
-
     topo_ids->smt_id = apicid &
-            ~(0xFFFFFFFFUL << apicid_smt_width(nr_dies, nr_cores, nr_threads));
+            ~(0xFFFFFFFFUL << apicid_smt_width(topo_info));
     topo_ids->core_id =
-            (apicid >> apicid_core_offset(nr_dies, nr_cores, nr_threads)) &
-            ~(0xFFFFFFFFUL << apicid_core_width(nr_dies, nr_cores, nr_threads));
+            (apicid >> apicid_core_offset(topo_info)) &
+            ~(0xFFFFFFFFUL << apicid_core_width(topo_info));
     topo_ids->die_id =
-            (apicid >> apicid_die_offset(nr_dies, nr_cores, nr_threads)) &
-            ~(0xFFFFFFFFUL << apicid_die_width(nr_dies, nr_cores, nr_threads));
-    topo_ids->pkg_id = apicid >> apicid_pkg_offset(nr_dies, nr_cores, nr_threads);
+            (apicid >> apicid_die_offset(topo_info)) &
+            ~(0xFFFFFFFFUL << apicid_die_width(topo_info));
+    topo_ids->pkg_id = apicid >> apicid_pkg_offset(topo_info);
 }
 
 /* Make APIC ID for the CPU 'cpu_index'
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 07cf562d89..bc9b491557 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -4551,6 +4551,11 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
     uint32_t die_offset;
     uint32_t limit;
     uint32_t signature[3];
+    X86CPUTopoInfo topo_info;
+
+    topo_info.dies_per_pkg = env->nr_dies;
+    topo_info.cores_per_die = cs->nr_cores;
+    topo_info.threads_per_core = cs->nr_threads;
 
     /* Calculate & apply limits for different index ranges */
     if (index >= 0xC0000000) {
@@ -4637,8 +4642,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
                                     eax, ebx, ecx, edx);
                 break;
             case 3: /* L3 cache info */
-                die_offset = apicid_die_offset(env->nr_dies,
-                                        cs->nr_cores, cs->nr_threads);
+                die_offset = apicid_die_offset(&topo_info);
                 if (cpu->enable_l3_cache) {
                     encode_cache_cpuid4(env->cache_info_cpuid4.l3_cache,
                                         (1 << die_offset), cs->nr_cores,
@@ -4729,14 +4733,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
 
         switch (count) {
         case 0:
-            *eax = apicid_core_offset(env->nr_dies,
-                                      cs->nr_cores, cs->nr_threads);
+            *eax = apicid_core_offset(&topo_info);
             *ebx = cs->nr_threads;
             *ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
             break;
         case 1:
-            *eax = apicid_pkg_offset(env->nr_dies,
-                                     cs->nr_cores, cs->nr_threads);
+            *eax = apicid_pkg_offset(&topo_info);
             *ebx = cs->nr_cores * cs->nr_threads;
             *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
             break;
@@ -4760,20 +4762,17 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         *edx = cpu->apic_id;
         switch (count) {
         case 0:
-            *eax = apicid_core_offset(env->nr_dies, cs->nr_cores,
-                                                    cs->nr_threads);
+            *eax = apicid_core_offset(&topo_info);
             *ebx = cs->nr_threads;
             *ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
             break;
         case 1:
-            *eax = apicid_die_offset(env->nr_dies, cs->nr_cores,
-                                                   cs->nr_threads);
+            *eax = apicid_die_offset(&topo_info);
             *ebx = cs->nr_cores * cs->nr_threads;
             *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
             break;
         case 2:
-            *eax = apicid_pkg_offset(env->nr_dies, cs->nr_cores,
-                                                   cs->nr_threads);
+            *eax = apicid_pkg_offset(&topo_info);
             *ebx = env->nr_dies * cs->nr_cores * cs->nr_threads;
             *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
             break;




* [PATCH v3 04/18] hw/i386: Introduce initialize_topo_info to initialize X86CPUTopoInfo
  2019-12-04  0:36 [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models Babu Moger
                   ` (2 preceding siblings ...)
  2019-12-04  0:37 ` [PATCH v3 03/18] hw/i386: Consolidate topology functions Babu Moger
@ 2019-12-04  0:37 ` Babu Moger
  2020-01-28 15:49   ` Igor Mammedov
  2019-12-04  0:37 ` [PATCH v3 05/18] machine: Add SMP Sockets in CpuTopology Babu Moger
                   ` (14 subsequent siblings)
  18 siblings, 1 reply; 53+ messages in thread
From: Babu Moger @ 2019-12-04  0:37 UTC (permalink / raw)
  To: ehabkost, marcel.apfelbaum, mst, pbonzini, rth, eblake, armbru, imammedo
  Cc: babu.moger, qemu-devel

Initialize all the parameters in one function initialize_topo_info.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
---
 hw/i386/pc.c |   28 +++++++++++++++-------------
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 8c23b1e8c9..cafbdafa76 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -866,6 +866,15 @@ static void handle_a20_line_change(void *opaque, int irq, int level)
     x86_cpu_set_a20(cpu, level);
 }
 
+static inline void initialize_topo_info(X86CPUTopoInfo *topo_info,
+                                        PCMachineState *pcms,
+                                        const MachineState *ms)
+{
+    topo_info->dies_per_pkg = pcms->smp_dies;
+    topo_info->cores_per_die = ms->smp.cores;
+    topo_info->threads_per_core = ms->smp.threads;
+}
+
 /* Calculates initial APIC ID for a specific CPU index
  *
  * Currently we need to be able to calculate the APIC ID from the CPU index
@@ -882,9 +891,7 @@ static uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
     uint32_t correct_id;
     static bool warned;
 
-    topo_info.dies_per_pkg = pcms->smp_dies;
-    topo_info.cores_per_die = ms->smp.cores;
-    topo_info.threads_per_core = ms->smp.threads;
+    initialize_topo_info(&topo_info, pcms, ms);
 
     correct_id = x86_apicid_from_cpu_idx(&topo_info, cpu_index);
     if (pcmc->compat_apic_id_mode) {
@@ -2231,9 +2238,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
         return;
     }
 
-    topo_info.dies_per_pkg = pcms->smp_dies;
-    topo_info.cores_per_die = smp_cores;
-    topo_info.threads_per_core = smp_threads;
+    initialize_topo_info(&topo_info, pcms, ms);
 
     env->nr_dies = pcms->smp_dies;
 
@@ -2702,9 +2707,7 @@ static int64_t pc_get_default_cpu_node_id(const MachineState *ms, int idx)
    PCMachineState *pcms = PC_MACHINE(ms);
    X86CPUTopoInfo topo_info;
 
-   topo_info.dies_per_pkg = pcms->smp_dies;
-   topo_info.cores_per_die = ms->smp.cores;
-   topo_info.threads_per_core = ms->smp.threads;
+   initialize_topo_info(&topo_info, pcms, ms);
 
    assert(idx < ms->possible_cpus->len);
    x86_topo_ids_from_apicid(ms->possible_cpus->cpus[idx].arch_id,
@@ -2719,10 +2722,6 @@ static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
     X86CPUTopoInfo topo_info;
     int i;
 
-    topo_info.dies_per_pkg = pcms->smp_dies;
-    topo_info.cores_per_die = ms->smp.cores;
-    topo_info.threads_per_core = ms->smp.threads;
-
     if (ms->possible_cpus) {
         /*
          * make sure that max_cpus hasn't changed since the first use, i.e.
@@ -2734,6 +2733,9 @@ static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
 
     ms->possible_cpus = g_malloc0(sizeof(CPUArchIdList) +
                                   sizeof(CPUArchId) * max_cpus);
+
+    initialize_topo_info(&topo_info, pcms, ms);
+
     ms->possible_cpus->len = max_cpus;
     for (i = 0; i < ms->possible_cpus->len; i++) {
         X86CPUTopoIDs topo_ids;




* [PATCH v3 05/18] machine: Add SMP Sockets in CpuTopology
  2019-12-04  0:36 [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models Babu Moger
                   ` (3 preceding siblings ...)
  2019-12-04  0:37 ` [PATCH v3 04/18] hw/i386: Introduce initialize_topo_info to initialize X86CPUTopoInfo Babu Moger
@ 2019-12-04  0:37 ` Babu Moger
  2019-12-04  0:37 ` [PATCH v3 06/18] hw/core: Add core complex id in X86CPU topology Babu Moger
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 53+ messages in thread
From: Babu Moger @ 2019-12-04  0:37 UTC (permalink / raw)
  To: ehabkost, marcel.apfelbaum, mst, pbonzini, rth, eblake, armbru, imammedo
  Cc: babu.moger, qemu-devel

Store the SMP socket count in CpuTopology. The socket information is required
to build the APIC ID in EPYC mode, but it is currently not passed down when
decoding the APIC ID. Add the socket information here.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
---
 hw/core/machine.c   |    1 +
 hw/i386/pc.c        |    1 +
 include/hw/boards.h |    2 ++
 vl.c                |    1 +
 4 files changed, 5 insertions(+)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index 1689ad3bf8..e59b181ead 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -784,6 +784,7 @@ static void smp_parse(MachineState *ms, QemuOpts *opts)
         ms->smp.cpus = cpus;
         ms->smp.cores = cores;
         ms->smp.threads = threads;
+        ms->smp.sockets = sockets;
     }
 
     if (ms->smp.cpus > 1) {
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index cafbdafa76..17de152a77 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1473,6 +1473,7 @@ void pc_smp_parse(MachineState *ms, QemuOpts *opts)
         ms->smp.cpus = cpus;
         ms->smp.cores = cores;
         ms->smp.threads = threads;
+        ms->smp.sockets = sockets;
         pcms->smp_dies = dies;
     }
 
diff --git a/include/hw/boards.h b/include/hw/boards.h
index de45087f34..d4fab218e6 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -256,12 +256,14 @@ typedef struct DeviceMemoryState {
  * @cpus: the number of present logical processors on the machine
  * @cores: the number of cores in one package
  * @threads: the number of threads in one core
+ * @sockets: the number of sockets on the machine
  * @max_cpus: the maximum number of logical processors on the machine
  */
 typedef struct CpuTopology {
     unsigned int cpus;
     unsigned int cores;
     unsigned int threads;
+    unsigned int sockets;
     unsigned int max_cpus;
 } CpuTopology;
 
diff --git a/vl.c b/vl.c
index 4489cfb2bb..a42c24a77f 100644
--- a/vl.c
+++ b/vl.c
@@ -3962,6 +3962,7 @@ int main(int argc, char **argv, char **envp)
     current_machine->smp.max_cpus = machine_class->default_cpus;
     current_machine->smp.cores = 1;
     current_machine->smp.threads = 1;
+    current_machine->smp.sockets = 1;
 
     machine_class->smp_parse(current_machine,
         qemu_opts_find(qemu_find_opts("smp-opts"), NULL));




* [PATCH v3 06/18] hw/core: Add core complex id in X86CPU topology
  2019-12-04  0:36 [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models Babu Moger
                   ` (4 preceding siblings ...)
  2019-12-04  0:37 ` [PATCH v3 05/18] machine: Add SMP Sockets in CpuTopology Babu Moger
@ 2019-12-04  0:37 ` Babu Moger
  2020-01-28 16:27   ` Igor Mammedov
  2020-01-28 16:31   ` Eric Blake
  2019-12-04  0:37 ` [PATCH v3 07/18] machine: Add a new function init_apicid_fn in MachineClass Babu Moger
                   ` (12 subsequent siblings)
  18 siblings, 2 replies; 53+ messages in thread
From: Babu Moger @ 2019-12-04  0:37 UTC (permalink / raw)
  To: ehabkost, marcel.apfelbaum, mst, pbonzini, rth, eblake, armbru, imammedo
  Cc: babu.moger, qemu-devel

Introduce the last level cache ID (llc_id) in the X86CPU topology. This
information is required to build the topology in EPYC mode.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 hw/core/machine-hmp-cmds.c |    3 +++
 hw/core/machine.c          |   13 +++++++++++++
 hw/i386/pc.c               |   10 ++++++++++
 include/hw/i386/topology.h |    1 +
 qapi/machine.json          |    7 +++++--
 target/i386/cpu.c          |    2 ++
 target/i386/cpu.h          |    1 +
 7 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/hw/core/machine-hmp-cmds.c b/hw/core/machine-hmp-cmds.c
index cd970cc4c5..59c91d1ce1 100644
--- a/hw/core/machine-hmp-cmds.c
+++ b/hw/core/machine-hmp-cmds.c
@@ -90,6 +90,9 @@ void hmp_hotpluggable_cpus(Monitor *mon, const QDict *qdict)
         if (c->has_die_id) {
             monitor_printf(mon, "    die-id: \"%" PRIu64 "\"\n", c->die_id);
         }
+        if (c->has_llc_id) {
+            monitor_printf(mon, "    llc-id: \"%" PRIu64 "\"\n", c->llc_id);
+        }
         if (c->has_core_id) {
             monitor_printf(mon, "    core-id: \"%" PRIu64 "\"\n", c->core_id);
         }
diff --git a/hw/core/machine.c b/hw/core/machine.c
index e59b181ead..ff991e6ab5 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -683,6 +683,11 @@ void machine_set_cpu_numa_node(MachineState *machine,
             return;
         }
 
+        if (props->has_llc_id && !slot->props.has_llc_id) {
+            error_setg(errp, "llc-id is not supported");
+            return;
+        }
+
         /* skip slots with explicit mismatch */
         if (props->has_thread_id && props->thread_id != slot->props.thread_id) {
                 continue;
@@ -696,6 +701,10 @@ void machine_set_cpu_numa_node(MachineState *machine,
                 continue;
         }
 
+        if (props->has_llc_id && props->llc_id != slot->props.llc_id) {
+                continue;
+        }
+
         if (props->has_socket_id && props->socket_id != slot->props.socket_id) {
                 continue;
         }
@@ -1034,6 +1043,10 @@ static char *cpu_slot_to_string(const CPUArchId *cpu)
     if (cpu->props.has_die_id) {
         g_string_append_printf(s, "die-id: %"PRId64, cpu->props.die_id);
     }
+
+    if (cpu->props.has_llc_id) {
+        g_string_append_printf(s, "llc-id: %"PRId64, cpu->props.llc_id);
+    }
     if (cpu->props.has_core_id) {
         if (s->len) {
             g_string_append_printf(s, ", ");
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 17de152a77..df5339c102 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2294,6 +2294,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 
         topo_ids.pkg_id = cpu->socket_id;
         topo_ids.die_id = cpu->die_id;
+        topo_ids.llc_id = cpu->llc_id;
         topo_ids.core_id = cpu->core_id;
         topo_ids.smt_id = cpu->thread_id;
         cpu->apic_id = apicid_from_topo_ids(&topo_info, &topo_ids);
@@ -2339,6 +2340,13 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
     }
     cpu->die_id = topo_ids.die_id;
 
+    if (cpu->llc_id != -1 && cpu->llc_id != topo_ids.llc_id) {
+        error_setg(errp, "property llc-id: %u doesn't match set apic-id:"
+            " 0x%x (llc-id: %u)", cpu->llc_id, cpu->apic_id, topo_ids.llc_id);
+        return;
+    }
+    cpu->llc_id = topo_ids.llc_id;
+
     if (cpu->core_id != -1 && cpu->core_id != topo_ids.core_id) {
         error_setg(errp, "property core-id: %u doesn't match set apic-id:"
             " 0x%x (core-id: %u)", cpu->core_id, cpu->apic_id, topo_ids.core_id);
@@ -2752,6 +2760,8 @@ static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
             ms->possible_cpus->cpus[i].props.has_die_id = true;
             ms->possible_cpus->cpus[i].props.die_id = topo_ids.die_id;
         }
+        ms->possible_cpus->cpus[i].props.has_llc_id = true;
+        ms->possible_cpus->cpus[i].props.llc_id = topo_ids.llc_id;
         ms->possible_cpus->cpus[i].props.has_core_id = true;
         ms->possible_cpus->cpus[i].props.core_id = topo_ids.core_id;
         ms->possible_cpus->cpus[i].props.has_thread_id = true;
diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index ba52d49079..1238006208 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -48,6 +48,7 @@ typedef uint32_t apic_id_t;
 typedef struct X86CPUTopoIDs {
     unsigned pkg_id;
     unsigned die_id;
+    unsigned llc_id;
     unsigned core_id;
     unsigned smt_id;
 } X86CPUTopoIDs;
diff --git a/qapi/machine.json b/qapi/machine.json
index ca26779f1a..1ca5b73418 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -646,9 +646,11 @@
 # @node-id: NUMA node ID the CPU belongs to
 # @socket-id: socket number within node/board the CPU belongs to
 # @die-id: die number within node/board the CPU belongs to (Since 4.1)
-# @core-id: core number within die the CPU belongs to# @thread-id: thread number within core the CPU belongs to
+# @llc-id: last level cache number within node/board the CPU belongs to (Since 4.2)
+# @core-id: core number within die the CPU belongs to
+# @thread-id: thread number within core the CPU belongs to
 #
-# Note: currently there are 5 properties that could be present
+# Note: currently there are 6 properties that could be present
 # but management should be prepared to pass through other
 # properties with device_add command to allow for future
 # interface extension. This also requires the filed names to be kept in
@@ -660,6 +662,7 @@
   'data': { '*node-id': 'int',
             '*socket-id': 'int',
             '*die-id': 'int',
+            '*llc-id': 'int',
             '*core-id': 'int',
             '*thread-id': 'int'
   }
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index bc9b491557..3c81aa3ecd 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6222,12 +6222,14 @@ static Property x86_cpu_properties[] = {
     DEFINE_PROP_INT32("thread-id", X86CPU, thread_id, 0),
     DEFINE_PROP_INT32("core-id", X86CPU, core_id, 0),
     DEFINE_PROP_INT32("die-id", X86CPU, die_id, 0),
+    DEFINE_PROP_INT32("llc-id", X86CPU, llc_id, 0),
     DEFINE_PROP_INT32("socket-id", X86CPU, socket_id, 0),
 #else
     DEFINE_PROP_UINT32("apic-id", X86CPU, apic_id, UNASSIGNED_APIC_ID),
     DEFINE_PROP_INT32("thread-id", X86CPU, thread_id, -1),
     DEFINE_PROP_INT32("core-id", X86CPU, core_id, -1),
     DEFINE_PROP_INT32("die-id", X86CPU, die_id, -1),
+    DEFINE_PROP_INT32("llc-id", X86CPU, llc_id, -1),
     DEFINE_PROP_INT32("socket-id", X86CPU, socket_id, -1),
 #endif
     DEFINE_PROP_INT32("node-id", X86CPU, node_id, CPU_UNSET_NUMA_NODE_ID),
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index af57fda8e5..a56d44e405 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1711,6 +1711,7 @@ struct X86CPU {
     int32_t node_id; /* NUMA node this CPU belongs to */
     int32_t socket_id;
     int32_t die_id;
+    int32_t llc_id;
     int32_t core_id;
     int32_t thread_id;
 




* [PATCH v3 07/18] machine: Add a new function init_apicid_fn in MachineClass
  2019-12-04  0:36 [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models Babu Moger
                   ` (5 preceding siblings ...)
  2019-12-04  0:37 ` [PATCH v3 06/18] hw/core: Add core complex id in X86CPU topology Babu Moger
@ 2019-12-04  0:37 ` Babu Moger
  2020-01-28 16:29   ` Igor Mammedov
  2019-12-04  0:37 ` [PATCH v3 08/18] hw/i386: Update structures for nodes_per_pkg Babu Moger
                   ` (11 subsequent siblings)
  18 siblings, 1 reply; 53+ messages in thread
From: Babu Moger @ 2019-12-04  0:37 UTC (permalink / raw)
  To: ehabkost, marcel.apfelbaum, mst, pbonzini, rth, eblake, armbru, imammedo
  Cc: babu.moger, qemu-devel

Add a new function init_apicid_fn in MachineClass to initialize the
mode-specific handlers that decode the APIC IDs.
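
For illustration only, the optional-hook pattern can be sketched as below.
This is not QEMU code: MachineSketch, MachineClassSketch, run_apicid_hook and
select_epyc_handlers are hypothetical stand-ins. The sketch only shows that a
NULL hook is skipped and a registered hook is called with the machine state.

```c
#include <stddef.h>

/* Hypothetical stand-in for MachineState; only one flag is modelled. */
typedef struct MachineSketch {
    int epyc_mode;
} MachineSketch;

/* Hypothetical stand-in for MachineClass with the optional hook. */
typedef struct MachineClassSketch {
    void (*init_apicid_fn)(MachineSketch *ms);
} MachineClassSketch;

/* Call the hook only when the machine class provides one. */
static void run_apicid_hook(const MachineClassSketch *mc, MachineSketch *ms)
{
    if (mc->init_apicid_fn) {
        mc->init_apicid_fn(ms);
    }
}

/* Example hook (assumption): switch the machine to EPYC handlers. */
static void select_epyc_handlers(MachineSketch *ms)
{
    ms->epyc_mode = 1;
}
```

Note that in the patch the hook is invoked only inside the cpu_option branch,
so a machine started without -cpu never reaches it.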

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 include/hw/boards.h |    1 +
 vl.c                |    3 +++
 2 files changed, 4 insertions(+)

diff --git a/include/hw/boards.h b/include/hw/boards.h
index d4fab218e6..ce5aa365cb 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -238,6 +238,7 @@ struct MachineClass {
                                                          unsigned cpu_index);
     const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState *machine);
     int64_t (*get_default_cpu_node_id)(const MachineState *ms, int idx);
+    void (*init_apicid_fn)(MachineState *ms);
 };
 
 /**
diff --git a/vl.c b/vl.c
index a42c24a77f..b6af604e11 100644
--- a/vl.c
+++ b/vl.c
@@ -4318,6 +4318,9 @@ int main(int argc, char **argv, char **envp)
     current_machine->cpu_type = machine_class->default_cpu_type;
     if (cpu_option) {
         current_machine->cpu_type = parse_cpu_option(cpu_option);
+        if (machine_class->init_apicid_fn) {
+            machine_class->init_apicid_fn(current_machine);
+        }
     }
     parse_numa_opts(current_machine);
 




* [PATCH v3 08/18] hw/i386: Update structures for nodes_per_pkg
  2019-12-04  0:36 [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models Babu Moger
                   ` (6 preceding siblings ...)
  2019-12-04  0:37 ` [PATCH v3 07/18] machine: Add a new function init_apicid_fn in MachineClass Babu Moger
@ 2019-12-04  0:37 ` Babu Moger
  2019-12-04  0:37 ` [PATCH v3 09/18] i386: Add CPUX86Family type in CPUX86State Babu Moger
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 53+ messages in thread
From: Babu Moger @ 2019-12-04  0:37 UTC (permalink / raw)
  To: ehabkost, marcel.apfelbaum, mst, pbonzini, rth, eblake, armbru, imammedo
  Cc: babu.moger, qemu-devel

Update the structures X86CPUTopoInfo and CPUX86State to hold the number of
nodes per package (nodes_per_pkg and nr_nodes, respectively). This is required
to build the EPYC mode topology.
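
As a rough illustration of the arithmetic (a sketch, not QEMU code; the helper
name nodes_per_pkg_sketch is hypothetical), the per-package node count is the
integer quotient of the NUMA node count and the socket count. It therefore
truncates, and it is 0 when no NUMA nodes are configured.

```c
/*
 * Hypothetical helper mirroring the expression
 * ms->numa_state->num_nodes / ms->smp.sockets used in this patch.
 * Assumes sockets >= 1, which patch 05 sets as the default in vl.c.
 */
static unsigned nodes_per_pkg_sketch(unsigned num_nodes, unsigned sockets)
{
    return num_nodes / sockets; /* integer division, rounds down */
}
```

For example, 4 nodes on 2 sockets gives 2 nodes per package, while 3 nodes on
2 sockets gives 1 because the remainder is dropped.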

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 hw/i386/pc.c               |    4 ++++
 include/hw/i386/topology.h |    1 +
 target/i386/cpu.c          |    1 +
 target/i386/cpu.h          |    1 +
 4 files changed, 7 insertions(+)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index df5339c102..5dc11df922 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -870,6 +870,7 @@ static inline void initialize_topo_info(X86CPUTopoInfo *topo_info,
                                         PCMachineState *pcms,
                                         const MachineState *ms)
 {
+    topo_info->nodes_per_pkg = ms->numa_state->num_nodes / ms->smp.sockets;
     topo_info->dies_per_pkg = pcms->smp_dies;
     topo_info->cores_per_die = ms->smp.cores;
     topo_info->threads_per_core = ms->smp.threads;
@@ -1390,11 +1391,13 @@ static void pc_new_cpu(PCMachineState *pcms, int64_t apic_id, Error **errp)
     Object *cpu = NULL;
     Error *local_err = NULL;
     CPUX86State *env = NULL;
+    MachineState *ms = MACHINE(pcms);
 
     cpu = object_new(MACHINE(pcms)->cpu_type);
 
     env = &X86_CPU(cpu)->env;
     env->nr_dies = pcms->smp_dies;
+    env->nr_nodes = ms->numa_state->num_nodes / ms->smp.sockets;
 
     object_property_set_uint(cpu, apic_id, "apic-id", &local_err);
     object_property_set_bool(cpu, true, "realized", &local_err);
@@ -2242,6 +2245,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
     initialize_topo_info(&topo_info, pcms, ms);
 
     env->nr_dies = pcms->smp_dies;
+    env->nr_nodes = ms->numa_state->num_nodes / ms->smp.sockets;
 
     /*
      * If APIC ID is not set,
diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index 1238006208..cfb09312fe 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -54,6 +54,7 @@ typedef struct X86CPUTopoIDs {
 } X86CPUTopoIDs;
 
 typedef struct X86CPUTopoInfo {
+    unsigned nodes_per_pkg;
     unsigned dies_per_pkg;
     unsigned cores_per_die;
     unsigned threads_per_core;
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 3c81aa3ecd..9b2608a4c8 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5996,6 +5996,7 @@ static void x86_cpu_initfn(Object *obj)
     FeatureWord w;
 
     env->nr_dies = 1;
+    env->nr_nodes = 1;
     cpu_set_cpustate_pointers(cpu);
 
     object_property_add(obj, "family", "int",
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index a56d44e405..0ef4fdb55f 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1591,6 +1591,7 @@ typedef struct CPUX86State {
     TPRAccess tpr_access_type;
 
     unsigned nr_dies;
+    unsigned nr_nodes;
 } CPUX86State;
 
 struct kvm_msrs;




* [PATCH v3 09/18] i386: Add CPUX86Family type in CPUX86State
  2019-12-04  0:36 [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models Babu Moger
                   ` (7 preceding siblings ...)
  2019-12-04  0:37 ` [PATCH v3 08/18] hw/i386: Update structures for nodes_per_pkg Babu Moger
@ 2019-12-04  0:37 ` Babu Moger
  2019-12-04  0:38 ` [PATCH v3 10/18] hw/386: Add EPYC mode topology decoding functions Babu Moger
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 53+ messages in thread
From: Babu Moger @ 2019-12-04  0:37 UTC (permalink / raw)
  To: ehabkost, marcel.apfelbaum, mst, pbonzini, rth, eblake, armbru, imammedo
  Cc: babu.moger, qemu-devel

Add a CPUX86Family type in CPUX86State. This will be used to differentiate
between generic x86 and EPYC-based CPU models.
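
A minimal sketch of the prefix check, assuming the CPU type string starts with
the model name. FamilySketch and family_from_cpu_type are hypothetical names,
and the type strings in the usage note are illustrative. The point is that
strncmp() returns 0 on a match, so the EPYC value sits in the else branch of
the conditional.

```c
#include <string.h>

/* Hypothetical mirror of the CPUX86Family enum added by this patch. */
typedef enum FamilySketch {
    FAMILY_SKETCH_DEFAULT = 0,
    FAMILY_SKETCH_EPYC,
} FamilySketch;

/* Classify a CPU type string by its first four characters. */
static FamilySketch family_from_cpu_type(const char *cpu_type)
{
    /* strncmp() yields 0 (false) when the prefix is exactly "EPYC". */
    return strncmp(cpu_type, "EPYC", 4) ? FAMILY_SKETCH_DEFAULT
                                        : FAMILY_SKETCH_EPYC;
}
```

With this check, "EPYC-x86_64-cpu" and "EPYC-IBPB-x86_64-cpu" both map to the
EPYC family, while "qemu64-x86_64-cpu" maps to the default. The comparison is
case-sensitive and matches any name that begins with "EPYC".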

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 hw/i386/pc.c      |    4 ++++
 target/i386/cpu.c |    1 +
 target/i386/cpu.h |    7 +++++++
 3 files changed, 12 insertions(+)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 5dc11df922..7f30104a6b 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1398,6 +1398,8 @@ static void pc_new_cpu(PCMachineState *pcms, int64_t apic_id, Error **errp)
     env = &X86_CPU(cpu)->env;
     env->nr_dies = pcms->smp_dies;
     env->nr_nodes = ms->numa_state->num_nodes / ms->smp.sockets;
+    env->family_type = strncmp(ms->cpu_type, "EPYC", 4) ? CPUX86FAMILY_DEFAULT :
+                                                          CPUX86FAMILY_EPYC;
 
     object_property_set_uint(cpu, apic_id, "apic-id", &local_err);
     object_property_set_bool(cpu, true, "realized", &local_err);
@@ -2246,6 +2248,8 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 
     env->nr_dies = pcms->smp_dies;
     env->nr_nodes = ms->numa_state->num_nodes / ms->smp.sockets;
+    env->family_type = strncmp(ms->cpu_type, "EPYC", 4) ? CPUX86FAMILY_DEFAULT :
+                                                          CPUX86FAMILY_EPYC;
 
     /*
      * If APIC ID is not set,
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 9b2608a4c8..5629c6d4c1 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5997,6 +5997,7 @@ static void x86_cpu_initfn(Object *obj)
 
     env->nr_dies = 1;
     env->nr_nodes = 1;
+    env->family_type = CPUX86FAMILY_DEFAULT;
     cpu_set_cpustate_pointers(cpu);
 
     object_property_add(obj, "family", "int",
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 0ef4fdb55f..105744430b 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1284,6 +1284,11 @@ typedef enum TPRAccess {
     TPR_ACCESS_WRITE,
 } TPRAccess;
 
+typedef enum CPUX86Family {
+    CPUX86FAMILY_DEFAULT = 0,
+    CPUX86FAMILY_EPYC,
+} CPUX86Family;
+
 /* Cache information data structures: */
 
 enum CacheType {
@@ -1590,6 +1595,8 @@ typedef struct CPUX86State {
 
     TPRAccess tpr_access_type;
 
+    CPUX86Family family_type;
+
     unsigned nr_dies;
     unsigned nr_nodes;
 } CPUX86State;




* [PATCH v3 10/18] hw/386: Add EPYC mode topology decoding functions
  2019-12-04  0:36 [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models Babu Moger
                   ` (8 preceding siblings ...)
  2019-12-04  0:37 ` [PATCH v3 09/18] i386: Add CPUX86Family type in CPUX86State Babu Moger
@ 2019-12-04  0:38 ` Babu Moger
  2019-12-04  0:38 ` [PATCH v3 11/18] i386: Cleanup and use the EPYC mode topology functions Babu Moger
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 53+ messages in thread
From: Babu Moger @ 2019-12-04  0:38 UTC (permalink / raw)
  To: ehabkost, marcel.apfelbaum, mst, pbonzini, rth, eblake, armbru, imammedo
  Cc: babu.moger, qemu-devel

These functions add support for building the EPYC mode topology from the SMP
details: NUMA nodes, cores, threads and sockets.

The new APIC ID decoding is mostly similar to the current APIC ID decoding,
except that it adds a new field, llc_id, when NUMA is configured. It also
removes all the hardcoded values. Subsequent patches will use these functions
to build the topology.

The following functions are added:
apicid_llc_width_epyc
apicid_llc_offset_epyc
apicid_pkg_offset_epyc
apicid_from_topo_ids_epyc
x86_topo_ids_from_idx_epyc
x86_topo_ids_from_apicid_epyc
x86_apicid_from_cpu_idx_epyc

The topology details are available in Processor Programming Reference (PPR)
for AMD Family 17h Model 01h, Revision B1 Processors.
https://www.amd.com/system/files/TechDocs/55570-B1_PUB.zip
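
The resulting bit layout can be sketched in standalone form as below. This is
an illustration under assumptions, not the code added by the patch: the
TopoInfoSketch and TopoIDsSketch types and the *_sketch function names are
hypothetical, and bitwidth_sketch is a simple loop standing in for
apicid_bitwidth_for_count. The field order and the minimum llc offset of 3
follow the functions in this patch.

```c
#include <stdint.h>

/* Hypothetical mirrors of X86CPUTopoInfo and X86CPUTopoIDs. */
typedef struct TopoInfoSketch {
    unsigned nodes_per_pkg, dies_per_pkg, cores_per_die, threads_per_core;
} TopoInfoSketch;

typedef struct TopoIDsSketch {
    unsigned pkg_id, llc_id, die_id, core_id, smt_id;
} TopoIDsSketch;

#define LLC_OFFSET_SKETCH 3 /* minimum llc offset when NUMA is configured */

/* Smallest width such that (1 << width) >= count; assumes small counts. */
static unsigned bitwidth_sketch(unsigned count)
{
    unsigned width = 0;

    while ((1u << width) < count) {
        width++;
    }
    return width;
}

/* Bit offset of the llc_id field: above smt, core and die fields. */
static unsigned llc_offset_sketch(const TopoInfoSketch *t)
{
    unsigned offset = bitwidth_sketch(t->threads_per_core) +
                      bitwidth_sketch(t->cores_per_die) +
                      bitwidth_sketch(t->dies_per_pkg);

    if (t->nodes_per_pkg && offset < LLC_OFFSET_SKETCH) {
        return LLC_OFFSET_SKETCH;
    }
    return offset;
}

/* Bit offset of the pkg_id field: directly above the llc_id field. */
static unsigned pkg_offset_sketch(const TopoInfoSketch *t)
{
    unsigned nodes = t->nodes_per_pkg ? t->nodes_per_pkg : 1;

    return llc_offset_sketch(t) + bitwidth_sketch(nodes);
}

/* Pack the IDs into an APIC ID; callers must pass in-range IDs. */
static uint32_t apicid_epyc_sketch(const TopoInfoSketch *t,
                                   const TopoIDsSketch *ids)
{
    unsigned core_offset = bitwidth_sketch(t->threads_per_core);
    unsigned die_offset = core_offset + bitwidth_sketch(t->cores_per_die);

    return (ids->pkg_id << pkg_offset_sketch(t)) |
           (ids->llc_id << llc_offset_sketch(t)) |
           (ids->die_id << die_offset) |
           (ids->core_id << core_offset) |
           ids->smt_id;
}
```

For 2 nodes per package, 1 die, 4 cores per die and 2 threads per core, the
fields are smt at bit 0, core at bits 2:1, llc at bit 3 and pkg from bit 4
upward. With 2 cores and 1 thread the lower fields need only one bit, but
llc_id still starts at bit 3 because of the minimum offset.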

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 include/hw/i386/topology.h |   93 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 93 insertions(+)

diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index cfb09312fe..adb92fe9ce 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -89,6 +89,11 @@ static inline unsigned apicid_die_width(X86CPUTopoInfo *topo_info)
     return apicid_bitwidth_for_count(topo_info->dies_per_pkg);
 }
 
+/* Bit width of the llc_ID field per socket */
+static inline unsigned apicid_llc_width_epyc(X86CPUTopoInfo *topo_info)
+{
+    return apicid_bitwidth_for_count(MAX(topo_info->nodes_per_pkg, 1));
+}
 /* Bit offset of the Core_ID field
  */
 static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
@@ -109,6 +114,94 @@ static inline unsigned apicid_pkg_offset(X86CPUTopoInfo *topo_info)
     return apicid_die_offset(topo_info) + apicid_die_width(topo_info);
 }
 
+#define LLC_OFFSET 3 /* Minimum LLC offset if numa configured */
+
+/* Bit offset of the llc_ID field */
+static inline unsigned apicid_llc_offset_epyc(X86CPUTopoInfo *topo_info)
+{
+    unsigned offset = apicid_die_offset(topo_info) +
+                      apicid_die_width(topo_info);
+
+    if (topo_info->nodes_per_pkg) {
+        return MAX(LLC_OFFSET, offset);
+    } else {
+        return offset;
+    }
+}
+
+/* Bit offset of the Pkg_ID (socket ID) field */
+static inline unsigned apicid_pkg_offset_epyc(X86CPUTopoInfo *topo_info)
+{
+    return apicid_llc_offset_epyc(topo_info) + apicid_llc_width_epyc(topo_info);
+}
+
+/*
+ * Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
+ *
+ * The caller must make sure core_id < nr_cores and smt_id < nr_threads.
+ */
+static inline apic_id_t apicid_from_topo_ids_epyc(X86CPUTopoInfo *topo_info,
+                                                  const X86CPUTopoIDs *topo_ids)
+{
+    return (topo_ids->pkg_id  << apicid_pkg_offset_epyc(topo_info)) |
+           (topo_ids->llc_id << apicid_llc_offset_epyc(topo_info)) |
+           (topo_ids->die_id  << apicid_die_offset(topo_info)) |
+           (topo_ids->core_id << apicid_core_offset(topo_info)) |
+           topo_ids->smt_id;
+}
+
+static inline void x86_topo_ids_from_idx_epyc(X86CPUTopoInfo *topo_info,
+                                              unsigned cpu_index,
+                                              X86CPUTopoIDs *topo_ids)
+{
+    unsigned nr_nodes = MAX(topo_info->nodes_per_pkg, 1);
+    unsigned nr_dies = topo_info->dies_per_pkg;
+    unsigned nr_cores = topo_info->cores_per_die;
+    unsigned nr_threads = topo_info->threads_per_core;
+    unsigned cores_per_node = DIV_ROUND_UP((nr_dies * nr_cores * nr_threads),
+                                            nr_nodes);
+
+    topo_ids->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
+    topo_ids->llc_id = (cpu_index / cores_per_node) % nr_nodes;
+    topo_ids->die_id = cpu_index / (nr_cores * nr_threads) % nr_dies;
+    topo_ids->core_id = cpu_index / nr_threads % nr_cores;
+    topo_ids->smt_id = cpu_index % nr_threads;
+}
+
+/*
+ * Calculate thread/core/package IDs for a specific topology,
+ * based on APIC ID
+ */
+static inline void x86_topo_ids_from_apicid_epyc(apic_id_t apicid,
+                                            X86CPUTopoInfo *topo_info,
+                                            X86CPUTopoIDs *topo_ids)
+{
+    topo_ids->smt_id = apicid &
+            ~(0xFFFFFFFFUL << apicid_smt_width(topo_info));
+    topo_ids->core_id =
+            (apicid >> apicid_core_offset(topo_info)) &
+            ~(0xFFFFFFFFUL << apicid_core_width(topo_info));
+    topo_ids->die_id =
+            (apicid >> apicid_die_offset(topo_info)) &
+            ~(0xFFFFFFFFUL << apicid_die_width(topo_info));
+    topo_ids->llc_id =
+            (apicid >> apicid_llc_offset_epyc(topo_info)) &
+            ~(0xFFFFFFFFUL << apicid_llc_width_epyc(topo_info));
+    topo_ids->pkg_id = apicid >> apicid_pkg_offset_epyc(topo_info);
+}
+
+/*
+ * Make APIC ID for the CPU 'cpu_index'
+ *
+ * 'cpu_index' is a sequential, contiguous ID for the CPU.
+ */
+static inline apic_id_t x86_apicid_from_cpu_idx_epyc(X86CPUTopoInfo *topo_info,
+                                                     unsigned cpu_index)
+{
+    X86CPUTopoIDs topo_ids;
+    x86_topo_ids_from_idx_epyc(topo_info, cpu_index, &topo_ids);
+    return apicid_from_topo_ids_epyc(topo_info, &topo_ids);
+}
 /* Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
  *
  * The caller must make sure core_id < nr_cores and smt_id < nr_threads.




* [PATCH v3 11/18] i386: Cleanup and use the EPYC mode topology functions
  2019-12-04  0:36 [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models Babu Moger
                   ` (9 preceding siblings ...)
  2019-12-04  0:38 ` [PATCH v3 10/18] hw/386: Add EPYC mode topology decoding functions Babu Moger
@ 2019-12-04  0:38 ` Babu Moger
  2019-12-04  0:38 ` [PATCH v3 12/18] numa: Split the numa initialization Babu Moger
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 53+ messages in thread
From: Babu Moger @ 2019-12-04  0:38 UTC (permalink / raw)
  To: ehabkost, marcel.apfelbaum, mst, pbonzini, rth, eblake, armbru, imammedo
  Cc: babu.moger, qemu-devel

Use the new functions from topology.h and delete the unused code. Given the
sockets, nodes, cores and threads, the new functions generate the apic id for
EPYC mode. This removes all the hardcoded values (MAX_CCX, MAX_CORES_IN_CCX,
MAX_CORES_IN_NODE and MAX_NODES_PER_SOCKET).
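
For readers who want to follow the arithmetic, here is a minimal standalone
sketch of how an EPYC-mode apic id is composed from a cpu index. It mirrors
the field order used by the apicid_*_epyc helpers in topology.h (smt, core,
die, llc, pkg), but it is not the QEMU code: TopoInfo, TopoIDs, max_u,
bitwidth_for_count and the other function names are hypothetical stand-ins
chosen for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for X86CPUTopoInfo and X86CPUTopoIDs. */
typedef struct {
    unsigned nodes_per_pkg, dies_per_pkg, cores_per_die, threads_per_core;
} TopoInfo;

typedef struct {
    unsigned pkg_id, llc_id, die_id, core_id, smt_id;
} TopoIDs;

#define LLC_OFFSET_MIN 3 /* minimum llc_id offset when NUMA is configured */

static unsigned max_u(unsigned a, unsigned b)
{
    return a > b ? a : b;
}

/* Number of bits needed to represent the values 0 .. count - 1. */
static unsigned bitwidth_for_count(unsigned count)
{
    unsigned width = 0;

    assert(count >= 1);
    for (count -= 1; count; count >>= 1) {
        width++;
    }
    return width;
}

static unsigned core_offset(const TopoInfo *t)
{
    return bitwidth_for_count(t->threads_per_core);
}

static unsigned die_offset(const TopoInfo *t)
{
    return core_offset(t) + bitwidth_for_count(t->cores_per_die);
}

static unsigned llc_offset(const TopoInfo *t)
{
    unsigned off = die_offset(t) + bitwidth_for_count(t->dies_per_pkg);

    return t->nodes_per_pkg ? max_u(LLC_OFFSET_MIN, off) : off;
}

static unsigned pkg_offset(const TopoInfo *t)
{
    return llc_offset(t) + bitwidth_for_count(max_u(t->nodes_per_pkg, 1));
}

/* Split a sequential cpu index into per-level IDs. */
static void topo_ids_from_idx(const TopoInfo *t, unsigned cpu_index,
                              TopoIDs *ids)
{
    unsigned nr_nodes = max_u(t->nodes_per_pkg, 1);
    unsigned per_die = t->cores_per_die * t->threads_per_core;
    unsigned per_pkg = t->dies_per_pkg * per_die;
    unsigned per_node = (per_pkg + nr_nodes - 1) / nr_nodes; /* round up */

    assert(per_pkg > 0);
    ids->pkg_id = cpu_index / per_pkg;
    ids->llc_id = (cpu_index / per_node) % nr_nodes;
    ids->die_id = cpu_index / per_die % t->dies_per_pkg;
    ids->core_id = cpu_index / t->threads_per_core % t->cores_per_die;
    ids->smt_id = cpu_index % t->threads_per_core;
}

/* Pack the per-level IDs into one apic id. */
static uint32_t apicid_from_cpu_idx(const TopoInfo *t, unsigned cpu_index)
{
    TopoIDs ids;

    topo_ids_from_idx(t, cpu_index, &ids);
    return (ids.pkg_id << pkg_offset(t)) | (ids.llc_id << llc_offset(t)) |
           (ids.die_id << die_offset(t)) | (ids.core_id << core_offset(t)) |
           ids.smt_id;
}
```

With 4 nodes, 1 die, 16 cores and 2 threads per package, cpu index 9 maps to
llc_id 1, core_id 4 and smt_id 1, which packs to apic id 41 (0x29) in this
sketch. The first cpu of the second package (index 32) lands at 128, because
the pkg field starts at bit 7.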

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 target/i386/cpu.c |  162 +++++++++++------------------------------------------
 1 file changed, 35 insertions(+), 127 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 5629c6d4c1..e87487bae3 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -338,68 +338,15 @@ static void encode_cache_cpuid80000006(CPUCacheInfo *l2,
     }
 }
 
-/*
- * Definitions used for building CPUID Leaf 0x8000001D and 0x8000001E
- * Please refer to the AMD64 Architecture Programmer’s Manual Volume 3.
- * Define the constants to build the cpu topology. Right now, TOPOEXT
- * feature is enabled only on EPYC. So, these constants are based on
- * EPYC supported configurations. We may need to handle the cases if
- * these values change in future.
- */
-/* Maximum core complexes in a node */
-#define MAX_CCX 2
-/* Maximum cores in a core complex */
-#define MAX_CORES_IN_CCX 4
-/* Maximum cores in a node */
-#define MAX_CORES_IN_NODE 8
-/* Maximum nodes in a socket */
-#define MAX_NODES_PER_SOCKET 4
-
-/*
- * Figure out the number of nodes required to build this config.
- * Max cores in a node is 8
- */
-static int nodes_in_socket(int nr_cores)
-{
-    int nodes;
-
-    nodes = DIV_ROUND_UP(nr_cores, MAX_CORES_IN_NODE);
-
-   /* Hardware does not support config with 3 nodes, return 4 in that case */
-    return (nodes == 3) ? 4 : nodes;
-}
-
-/*
- * Decide the number of cores in a core complex with the given nr_cores using
- * following set constants MAX_CCX, MAX_CORES_IN_CCX, MAX_CORES_IN_NODE and
- * MAX_NODES_PER_SOCKET. Maintain symmetry as much as possible
- * L3 cache is shared across all cores in a core complex. So, this will also
- * tell us how many cores are sharing the L3 cache.
- */
-static int cores_in_core_complex(int nr_cores)
-{
-    int nodes;
-
-    /* Check if we can fit all the cores in one core complex */
-    if (nr_cores <= MAX_CORES_IN_CCX) {
-        return nr_cores;
-    }
-    /* Get the number of nodes required to build this config */
-    nodes = nodes_in_socket(nr_cores);
-
-    /*
-     * Divide the cores accros all the core complexes
-     * Return rounded up value
-     */
-    return DIV_ROUND_UP(nr_cores, nodes * MAX_CCX);
-}
-
 /* Encode cache info for CPUID[8000001D] */
-static void encode_cache_cpuid8000001d(CPUCacheInfo *cache, CPUState *cs,
-                                uint32_t *eax, uint32_t *ebx,
-                                uint32_t *ecx, uint32_t *edx)
+static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
+                                       X86CPUTopoInfo *topo_info,
+                                       uint32_t *eax, uint32_t *ebx,
+                                       uint32_t *ecx, uint32_t *edx)
 {
     uint32_t l3_cores;
+    unsigned nodes = MAX(topo_info->nodes_per_pkg, 1);
+
     assert(cache->size == cache->line_size * cache->associativity *
                           cache->partitions * cache->sets);
 
@@ -408,10 +355,13 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache, CPUState *cs,
 
     /* L3 is shared among multiple cores */
     if (cache->level == 3) {
-        l3_cores = cores_in_core_complex(cs->nr_cores);
-        *eax |= ((l3_cores * cs->nr_threads) - 1) << 14;
+        l3_cores = DIV_ROUND_UP((topo_info->dies_per_pkg *
+                                 topo_info->cores_per_die *
+                                 topo_info->threads_per_core),
+                                 nodes);
+        *eax |= (l3_cores - 1) << 14;
     } else {
-        *eax |= ((cs->nr_threads - 1) << 14);
+        *eax |= ((topo_info->threads_per_core - 1) << 14);
     }
 
     assert(cache->line_size > 0);
@@ -431,55 +381,17 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache, CPUState *cs,
            (cache->complex_indexing ? CACHE_COMPLEX_IDX : 0);
 }
 
-/* Data structure to hold the configuration info for a given core index */
-struct core_topology {
-    /* core complex id of the current core index */
-    int ccx_id;
-    /*
-     * Adjusted core index for this core in the topology
-     * This can be 0,1,2,3 with max 4 cores in a core complex
-     */
-    int core_id;
-    /* Node id for this core index */
-    int node_id;
-    /* Number of nodes in this config */
-    int num_nodes;
-};
-
-/*
- * Build the configuration closely match the EPYC hardware. Using the EPYC
- * hardware configuration values (MAX_CCX, MAX_CORES_IN_CCX, MAX_CORES_IN_NODE)
- * right now. This could change in future.
- * nr_cores : Total number of cores in the config
- * core_id  : Core index of the current CPU
- * topo     : Data structure to hold all the config info for this core index
- */
-static void build_core_topology(int nr_cores, int core_id,
-                                struct core_topology *topo)
-{
-    int nodes, cores_in_ccx;
-
-    /* First get the number of nodes required */
-    nodes = nodes_in_socket(nr_cores);
-
-    cores_in_ccx = cores_in_core_complex(nr_cores);
-
-    topo->node_id = core_id / (cores_in_ccx * MAX_CCX);
-    topo->ccx_id = (core_id % (cores_in_ccx * MAX_CCX)) / cores_in_ccx;
-    topo->core_id = core_id % cores_in_ccx;
-    topo->num_nodes = nodes;
-}
-
 /* Encode cache info for CPUID[8000001E] */
-static void encode_topo_cpuid8000001e(CPUState *cs, X86CPU *cpu,
+static void encode_topo_cpuid8000001e(X86CPUTopoInfo *topo_info, X86CPU *cpu,
                                        uint32_t *eax, uint32_t *ebx,
                                        uint32_t *ecx, uint32_t *edx)
 {
-    struct core_topology topo = {0};
-    unsigned long nodes;
+    X86CPUTopoIDs topo_ids = {0};
+    unsigned long nodes = MAX(topo_info->nodes_per_pkg, 1);
     int shift;
 
-    build_core_topology(cs->nr_cores, cpu->core_id, &topo);
+    x86_topo_ids_from_apicid_epyc(cpu->apic_id, topo_info, &topo_ids);
+
     *eax = cpu->apic_id;
     /*
      * CPUID_Fn8000001E_EBX
@@ -496,12 +408,8 @@ static void encode_topo_cpuid8000001e(CPUState *cs, X86CPU *cpu,
      *             3 Core complex id
      *           1:0 Core id
      */
-    if (cs->nr_threads - 1) {
-        *ebx = ((cs->nr_threads - 1) << 8) | (topo.node_id << 3) |
-                (topo.ccx_id << 2) | topo.core_id;
-    } else {
-        *ebx = (topo.node_id << 4) | (topo.ccx_id << 3) | topo.core_id;
-    }
+    *ebx = ((topo_info->threads_per_core - 1) << 8) | (topo_ids.llc_id << 3) |
+            (topo_ids.core_id);
     /*
      * CPUID_Fn8000001E_ECX
      * 31:11 Reserved
@@ -510,9 +418,9 @@ static void encode_topo_cpuid8000001e(CPUState *cs, X86CPU *cpu,
      *         2  Socket id
      *       1:0  Node id
      */
-    if (topo.num_nodes <= 4) {
-        *ecx = ((topo.num_nodes - 1) << 8) | (cpu->socket_id << 2) |
-                topo.node_id;
+
+    if (nodes <= 4) {
+        *ecx = ((nodes - 1) << 8) | (topo_ids.pkg_id << 2) | topo_ids.llc_id;
     } else {
         /*
          * Node id fix up. Actual hardware supports up to 4 nodes. But with
@@ -527,10 +435,10 @@ static void encode_topo_cpuid8000001e(CPUState *cs, X86CPU *cpu,
          * number of nodes. find_last_bit returns last set bit(0 based). Left
          * shift(+1) the socket id to represent all the nodes.
          */
-        nodes = topo.num_nodes - 1;
+        nodes -= 1;
         shift = find_last_bit(&nodes, 8);
-        *ecx = ((topo.num_nodes - 1) << 8) | (cpu->socket_id << (shift + 1)) |
-                topo.node_id;
+        *ecx = (nodes << 8) | (topo_ids.pkg_id << (shift + 1)) |
+               topo_ids.llc_id;
     }
     *edx = 0;
 }
@@ -4553,6 +4461,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
     uint32_t signature[3];
     X86CPUTopoInfo topo_info;
 
+    topo_info.nodes_per_pkg = env->nr_nodes;
     topo_info.dies_per_pkg = env->nr_dies;
     topo_info.cores_per_die = cs->nr_cores;
     topo_info.threads_per_core = cs->nr_threads;
@@ -4972,20 +4881,20 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         }
         switch (count) {
         case 0: /* L1 dcache info */
-            encode_cache_cpuid8000001d(env->cache_info_amd.l1d_cache, cs,
-                                       eax, ebx, ecx, edx);
+            encode_cache_cpuid8000001d(env->cache_info_amd.l1d_cache,
+                                       &topo_info, eax, ebx, ecx, edx);
             break;
         case 1: /* L1 icache info */
-            encode_cache_cpuid8000001d(env->cache_info_amd.l1i_cache, cs,
-                                       eax, ebx, ecx, edx);
+            encode_cache_cpuid8000001d(env->cache_info_amd.l1i_cache,
+                                       &topo_info, eax, ebx, ecx, edx);
             break;
         case 2: /* L2 cache info */
-            encode_cache_cpuid8000001d(env->cache_info_amd.l2_cache, cs,
-                                       eax, ebx, ecx, edx);
+            encode_cache_cpuid8000001d(env->cache_info_amd.l2_cache,
+                                       &topo_info, eax, ebx, ecx, edx);
             break;
         case 3: /* L3 cache info */
-            encode_cache_cpuid8000001d(env->cache_info_amd.l3_cache, cs,
-                                       eax, ebx, ecx, edx);
+            encode_cache_cpuid8000001d(env->cache_info_amd.l3_cache,
+                                       &topo_info, eax, ebx, ecx, edx);
             break;
         default: /* end of info */
             *eax = *ebx = *ecx = *edx = 0;
@@ -4994,8 +4903,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         break;
     case 0x8000001E:
         assert(cpu->core_id <= 255);
-        encode_topo_cpuid8000001e(cs, cpu,
-                                  eax, ebx, ecx, edx);
+        encode_topo_cpuid8000001e(&topo_info, cpu, eax, ebx, ecx, edx);
         break;
     case 0xC0000000:
         *eax = env->cpuid_xlevel2;




* [PATCH v3 12/18] numa: Split the numa initialization
  2019-12-04  0:36 [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models Babu Moger
                   ` (10 preceding siblings ...)
  2019-12-04  0:38 ` [PATCH v3 11/18] i386: Cleanup and use the EPYC mode topology functions Babu Moger
@ 2019-12-04  0:38 ` Babu Moger
  2019-12-04  0:38 ` [PATCH v3 13/18] hw/i386: Introduce apicid_from_cpu_idx in PCMachineState Babu Moger
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 53+ messages in thread
From: Babu Moger @ 2019-12-04  0:38 UTC (permalink / raw)
  To: ehabkost, marcel.apfelbaum, mst, pbonzini, rth, eblake, armbru, imammedo
  Cc: babu.moger, qemu-devel

To generate the apic id for EPYC cpu models correctly, we need to know the
number of numa nodes in advance. At present, numa node initialization and cpu
initialization happen at the same time, and apic id generation happens during
cpu initialization. At that point it is not yet known how many numa nodes are
configured. So, save the cpu indexes while parsing and move the cpu
initialization into numa_complete_configuration. The per-node cpu
initialization is done in the new function numa_node_complete_configuration.
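
The ordering problem can be shown with a small standalone sketch of the
two-phase approach: the parse step only records the requested cpu indexes,
and the binding step runs once every node has been seen. This is an
illustration under simplified assumptions, not the numa.c code; Machine,
NodeCfg, parse_node, complete_configuration and the fixed-size arrays are
hypothetical names and simplifications for this example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 4
#define MAX_CPUS  8

/* Per-node record filled in while parsing (hypothetical type). */
typedef struct {
    unsigned cpu_indexes[MAX_CPUS];
    size_t nr_cpus;
} NodeCfg;

typedef struct {
    NodeCfg nodes[MAX_NODES];
    size_t num_nodes;
    int cpu_node[MAX_CPUS];               /* -1 while unbound */
    size_t nodes_known_at_bind[MAX_CPUS]; /* node count seen when bound */
} Machine;

static void machine_init(Machine *m)
{
    m->num_nodes = 0;
    for (size_t i = 0; i < MAX_NODES; i++) {
        m->nodes[i].nr_cpus = 0;
    }
    for (size_t i = 0; i < MAX_CPUS; i++) {
        m->cpu_node[i] = -1;
        m->nodes_known_at_bind[i] = 0;
    }
}

/* Phase 1: remember which cpus were requested, bind nothing yet. */
static void parse_node(Machine *m, const unsigned *cpus, size_t n)
{
    NodeCfg *node;

    assert(m->num_nodes < MAX_NODES && n <= MAX_CPUS);
    node = &m->nodes[m->num_nodes++];
    for (size_t i = 0; i < n; i++) {
        node->cpu_indexes[i] = cpus[i];
    }
    node->nr_cpus = n;
}

/*
 * Phase 2: bind cpus to nodes. By now num_nodes is final, so anything
 * derived from the node count can be computed correctly.
 * Returns 0 on success, -1 if a cpu index is out of range.
 */
static int complete_configuration(Machine *m)
{
    for (size_t nodenr = 0; nodenr < m->num_nodes; nodenr++) {
        const NodeCfg *node = &m->nodes[nodenr];

        for (size_t i = 0; i < node->nr_cpus; i++) {
            unsigned cpu = node->cpu_indexes[i];

            if (cpu >= MAX_CPUS) {
                return -1;
            }
            m->cpu_node[cpu] = (int)nodenr;
            m->nodes_known_at_bind[cpu] = m->num_nodes;
        }
    }
    return 0;
}
```

In this sketch a cpu listed under the first node is still bound with the full
node count visible, which is the property EPYC-mode apic id generation relies
on.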

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 hw/core/numa.c        |   62 ++++++++++++++++++++++++++++++++-----------------
 include/sysemu/numa.h |    5 ++++
 2 files changed, 46 insertions(+), 21 deletions(-)

diff --git a/hw/core/numa.c b/hw/core/numa.c
index 038c96d4ab..ba02a41421 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -33,6 +33,7 @@
 #include "qapi/error.h"
 #include "qapi/opts-visitor.h"
 #include "qapi/qapi-visit-machine.h"
+#include "qapi/clone-visitor.h"
 #include "sysemu/qtest.h"
 #include "hw/core/cpu.h"
 #include "hw/mem/pc-dimm.h"
@@ -59,11 +60,8 @@ static int max_numa_nodeid; /* Highest specified NUMA node ID, plus one.
 static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
                             Error **errp)
 {
-    Error *err = NULL;
     uint16_t nodenr;
-    uint16List *cpus = NULL;
     MachineClass *mc = MACHINE_GET_CLASS(ms);
-    unsigned int max_cpus = ms->smp.max_cpus;
     NodeInfo *numa_info = ms->numa_state->nodes;
 
     if (node->has_nodeid) {
@@ -87,24 +85,8 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
         error_setg(errp, "NUMA is not supported by this machine-type");
         return;
     }
-    for (cpus = node->cpus; cpus; cpus = cpus->next) {
-        CpuInstanceProperties props;
-        if (cpus->value >= max_cpus) {
-            error_setg(errp,
-                       "CPU index (%" PRIu16 ")"
-                       " should be smaller than maxcpus (%d)",
-                       cpus->value, max_cpus);
-            return;
-        }
-        props = mc->cpu_index_to_instance_props(ms, cpus->value);
-        props.node_id = nodenr;
-        props.has_node_id = true;
-        machine_set_cpu_numa_node(ms, &props, &err);
-        if (err) {
-            error_propagate(errp, err);
-            return;
-        }
-    }
+
+    numa_info[nodenr].cpu_indexes = QAPI_CLONE(uint16List, node->cpus);
 
     have_memdevs = have_memdevs ? : node->has_memdev;
     have_mem = have_mem ? : node->has_mem;
@@ -360,12 +342,50 @@ void numa_default_auto_assign_ram(MachineClass *mc, NodeInfo *nodes,
     nodes[i].node_mem = size - usedmem;
 }
 
+
+void numa_node_complete_configuration(MachineState *ms, NodeInfo *node,
+                                      uint16_t nodenr)
+{
+    Error *err = NULL;
+    uint16List *cpus = NULL;
+    MachineClass *mc = MACHINE_GET_CLASS(ms);
+    unsigned int max_cpus = ms->smp.max_cpus;
+
+    for (cpus = node->cpu_indexes; cpus; cpus = cpus->next) {
+        CpuInstanceProperties props;
+        if (cpus->value >= max_cpus) {
+            error_report("CPU index (%" PRIu16 ")"
+                         " should be smaller than maxcpus (%d)",
+                         cpus->value, max_cpus);
+            return;
+        }
+        props = mc->cpu_index_to_instance_props(ms, cpus->value);
+        props.node_id = nodenr;
+        props.has_node_id = true;
+        machine_set_cpu_numa_node(ms, &props, &err);
+        if (err) {
+            error_report("Numa node initialization failed");
+            return;
+        }
+    }
+}
+
 void numa_complete_configuration(MachineState *ms)
 {
     int i;
     MachineClass *mc = MACHINE_GET_CLASS(ms);
     NodeInfo *numa_info = ms->numa_state->nodes;
 
+    for (i = 0; i < ms->numa_state->num_nodes; i++) {
+        /*
+         * numa_node_complete_configuration() needs to be called after all
+         * nodes were already parsed, because to support new epyc mode, we
+         * need to know the number of numa nodes in advance to generate
+         * apic id correctly.
+         */
+        numa_node_complete_configuration(ms, &numa_info[i], i);
+    }
+
     /*
      * If memory hotplug is enabled (slots > 0) but without '-numa'
      * options explicitly on CLI, guestes will break.
diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
index ae9c41d02b..91794d685f 100644
--- a/include/sysemu/numa.h
+++ b/include/sysemu/numa.h
@@ -19,6 +19,9 @@ struct NodeInfo {
     struct HostMemoryBackend *node_memdev;
     bool present;
     uint8_t distance[MAX_NODES];
+
+    /* These indexes are saved for numa node initialization later */
+    uint16List *cpu_indexes;
 };
 
 struct NumaNodeMem {
@@ -41,6 +44,8 @@ typedef struct NumaState NumaState;
 void set_numa_options(MachineState *ms, NumaOptions *object, Error **errp);
 void parse_numa_opts(MachineState *ms);
 void numa_complete_configuration(MachineState *ms);
+void numa_node_complete_configuration(MachineState *ms, NodeInfo *node,
+                                      uint16_t nodenr);
 void query_numa_node_mem(NumaNodeMem node_mem[], MachineState *ms);
 extern QemuOptsList qemu_numa_opts;
 void numa_legacy_auto_assign_ram(MachineClass *mc, NodeInfo *nodes,




* [PATCH v3 13/18] hw/i386: Introduce apicid_from_cpu_idx in PCMachineState
  2019-12-04  0:36 [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models Babu Moger
                   ` (11 preceding siblings ...)
  2019-12-04  0:38 ` [PATCH v3 12/18] numa: Split the numa initialization Babu Moger
@ 2019-12-04  0:38 ` Babu Moger
  2019-12-04  0:38 ` [PATCH v3 14/18] hw/i386: Introduce topo_ids_from_apicid handler PCMachineState Babu Moger
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 53+ messages in thread
From: Babu Moger @ 2019-12-04  0:38 UTC (permalink / raw)
  To: ehabkost, marcel.apfelbaum, mst, pbonzini, rth, eblake, armbru, imammedo
  Cc: babu.moger, qemu-devel

Add a function pointer, apicid_from_cpu_idx, in PCMachineState to handle apic
id specific functionality. It will be initialized with the correct handler
based on the cpu model selected.

x86_apicid_from_cpu_idx will be the default handler.
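
A minimal sketch of the indirection, for illustration only: callers go
through a pointer stored in the machine state, the init function installs a
default, and a later step may swap in a different handler. All names here
(MachineSketch, default_apicid_from_cpu_idx, alt_apicid_from_cpu_idx) and
both placeholder encodings are hypothetical and are not QEMU's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Trimmed, hypothetical stand-in for X86CPUTopoInfo. */
typedef struct {
    unsigned threads_per_core;
} TopoInfo;

typedef uint32_t (*ApicIdFromIdxFn)(const TopoInfo *topo, unsigned cpu_index);

/* Hypothetical stand-in for the machine state holding the handler. */
typedef struct {
    ApicIdFromIdxFn apicid_from_cpu_idx;
} MachineSketch;

/* Placeholder default encoding: identity mapping. */
static uint32_t default_apicid_from_cpu_idx(const TopoInfo *topo,
                                            unsigned cpu_index)
{
    (void)topo;
    return cpu_index;
}

/* Placeholder alternative encoding: leave one low bit free per cpu. */
static uint32_t alt_apicid_from_cpu_idx(const TopoInfo *topo,
                                        unsigned cpu_index)
{
    (void)topo;
    return (uint32_t)cpu_index << 1;
}

/* Install the default handler, as an instance init function would. */
static void machine_initfn(MachineSketch *ms)
{
    ms->apicid_from_cpu_idx = default_apicid_from_cpu_idx;
}

/* Callers never name a concrete encoder; they go through the pointer. */
static uint32_t apic_id_from_index(const MachineSketch *ms,
                                   const TopoInfo *topo, unsigned cpu_index)
{
    assert(ms->apicid_from_cpu_idx != NULL);
    return ms->apicid_from_cpu_idx(topo, cpu_index);
}
```

The call site stays the same whichever encoder is installed, so only the
initialization code needs to know about cpu models.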

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 hw/i386/pc.c         |    5 ++++-
 include/hw/i386/pc.h |    5 +++++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 7f30104a6b..52aea4a652 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -894,7 +894,7 @@ static uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
 
     initialize_topo_info(&topo_info, pcms, ms);
 
-    correct_id = x86_apicid_from_cpu_idx(&topo_info, cpu_index);
+    correct_id = pcms->apicid_from_cpu_idx(&topo_info, cpu_index);
     if (pcmc->compat_apic_id_mode) {
         if (cpu_index != correct_id && !warned && !qtest_enabled()) {
             error_report("APIC IDs set in compatibility mode, "
@@ -2679,6 +2679,9 @@ static void pc_machine_initfn(Object *obj)
     pcms->pit_enabled = true;
     pcms->smp_dies = 1;
 
+    /* Initialize the apic id related handlers */
+    pcms->apicid_from_cpu_idx = x86_apicid_from_cpu_idx;
+
     pc_system_flash_create(pcms);
 }
 
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 37bfd95113..56aa0e45b5 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -16,6 +16,7 @@
 #include "hw/mem/pc-dimm.h"
 #include "hw/mem/nvdimm.h"
 #include "hw/acpi/acpi_dev_interface.h"
+#include "hw/i386/topology.h"
 
 #define HPET_INTCAP "hpet-intcap"
 
@@ -67,6 +68,10 @@ struct PCMachineState {
     uint64_t numa_nodes;
     uint64_t *node_mem;
 
+    /* Apic id specific handlers */
+    uint32_t (*apicid_from_cpu_idx)(X86CPUTopoInfo *topo_info,
+                                    unsigned cpu_index);
+
     /* Address space used by IOAPIC device. All IOAPIC interrupts
      * will be translated to MSI messages in the address space. */
     AddressSpace *ioapic_as;




* [PATCH v3 14/18] hw/i386: Introduce topo_ids_from_apicid handler PCMachineState
  2019-12-04  0:36 [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models Babu Moger
                   ` (12 preceding siblings ...)
  2019-12-04  0:38 ` [PATCH v3 13/18] hw/i386: Introduce apicid_from_cpu_idx in PCMachineState Babu Moger
@ 2019-12-04  0:38 ` Babu Moger
  2019-12-04  0:38 ` [PATCH v3 15/18] hw/i386: Introduce apic_id_from_topo_ids handler in PCMachineState Babu Moger
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 53+ messages in thread
From: Babu Moger @ 2019-12-04  0:38 UTC (permalink / raw)
  To: ehabkost, marcel.apfelbaum, mst, pbonzini, rth, eblake, armbru, imammedo
  Cc: babu.moger, qemu-devel

Add function pointer topo_ids_from_apicid in PCMachineState.
Initialize it with the correct handler based on the mode selected.
x86_topo_ids_from_apicid will be the default handler.
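
As a rough illustration of what a topo_ids_from_apicid handler does in EPYC
mode, the standalone sketch below unpacks an apic id into its fields using
the same shift-and-mask pattern as x86_topo_ids_from_apicid_epyc. TopoInfo,
TopoIDs, low_bits and the other helper names are hypothetical stand-ins for
this example, and the field layout is the one assumed by the EPYC helpers in
topology.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for X86CPUTopoInfo and X86CPUTopoIDs. */
typedef struct {
    unsigned nodes_per_pkg, dies_per_pkg, cores_per_die, threads_per_core;
} TopoInfo;

typedef struct {
    unsigned pkg_id, llc_id, die_id, core_id, smt_id;
} TopoIDs;

#define LLC_OFFSET_MIN 3 /* minimum llc_id offset when NUMA is configured */

/* Number of bits needed to represent the values 0 .. count - 1. */
static unsigned bitwidth_for_count(unsigned count)
{
    unsigned width = 0;

    assert(count >= 1);
    for (count -= 1; count; count >>= 1) {
        width++;
    }
    return width;
}

/* Keep only the 'width' least significant bits of 'value'. */
static uint32_t low_bits(uint32_t value, unsigned width)
{
    assert(width < 32);
    return value & ~(UINT32_MAX << width);
}

static void topo_ids_from_apicid(uint32_t apicid, const TopoInfo *t,
                                 TopoIDs *ids)
{
    unsigned nr_nodes = t->nodes_per_pkg ? t->nodes_per_pkg : 1;
    unsigned smt_w = bitwidth_for_count(t->threads_per_core);
    unsigned core_w = bitwidth_for_count(t->cores_per_die);
    unsigned die_w = bitwidth_for_count(t->dies_per_pkg);
    unsigned llc_w = bitwidth_for_count(nr_nodes);
    unsigned core_off = smt_w;
    unsigned die_off = core_off + core_w;
    unsigned llc_off = die_off + die_w;
    unsigned pkg_off;

    /* With NUMA configured, llc_id never starts below bit 3. */
    if (t->nodes_per_pkg && llc_off < LLC_OFFSET_MIN) {
        llc_off = LLC_OFFSET_MIN;
    }
    pkg_off = llc_off + llc_w;

    ids->smt_id = low_bits(apicid, smt_w);
    ids->core_id = low_bits(apicid >> core_off, core_w);
    ids->die_id = low_bits(apicid >> die_off, die_w);
    ids->llc_id = low_bits(apicid >> llc_off, llc_w);
    ids->pkg_id = apicid >> pkg_off;
}
```

With 4 nodes, 1 die, 16 cores and 2 threads per package, apic id 41 (0x29)
decodes to pkg 0, llc 1, die 0, core 4, smt 1 in this sketch. A zero-width
field, such as die_id here, always decodes to 0.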

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 hw/i386/pc.c         |   13 +++++++------
 include/hw/i386/pc.h |    2 ++
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 52aea4a652..b0d58515dd 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2312,7 +2312,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
     if (!cpu_slot) {
         MachineState *ms = MACHINE(pcms);
 
-        x86_topo_ids_from_apicid(cpu->apic_id, &topo_info, &topo_ids);
+        pcms->topo_ids_from_apicid(cpu->apic_id, &topo_info, &topo_ids);
         error_setg(errp,
             "Invalid CPU [socket: %u, die: %u, core: %u, thread: %u] with"
             " APIC ID %" PRIu32 ", valid index range 0:%d",
@@ -2333,7 +2333,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
     /* TODO: move socket_id/core_id/thread_id checks into x86_cpu_realizefn()
      * once -smp refactoring is complete and there will be CPU private
      * CPUState::nr_cores and CPUState::nr_threads fields instead of globals */
-    x86_topo_ids_from_apicid(cpu->apic_id, &topo_info, &topo_ids);
+    pcms->topo_ids_from_apicid(cpu->apic_id, &topo_info, &topo_ids);
     if (cpu->socket_id != -1 && cpu->socket_id != topo_ids.pkg_id) {
         error_setg(errp, "property socket-id: %u doesn't match set apic-id:"
             " 0x%x (socket-id: %u)", cpu->socket_id, cpu->apic_id, topo_ids.pkg_id);
@@ -2681,6 +2681,7 @@ static void pc_machine_initfn(Object *obj)
 
     /* Initialize the apic id related handlers */
     pcms->apicid_from_cpu_idx = x86_apicid_from_cpu_idx;
+    pcms->topo_ids_from_apicid = x86_topo_ids_from_apicid;
 
     pc_system_flash_create(pcms);
 }
@@ -2730,8 +2731,8 @@ static int64_t pc_get_default_cpu_node_id(const MachineState *ms, int idx)
    initialize_topo_info(&topo_info, pcms, ms);
 
    assert(idx < ms->possible_cpus->len);
-   x86_topo_ids_from_apicid(ms->possible_cpus->cpus[idx].arch_id,
-                            &topo_info, &topo_ids);
+   pcms->topo_ids_from_apicid(ms->possible_cpus->cpus[idx].arch_id,
+                              &topo_info, &topo_ids);
    return topo_ids.pkg_id % ms->numa_state->num_nodes;
 }
 
@@ -2763,8 +2764,8 @@ static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
         ms->possible_cpus->cpus[i].type = ms->cpu_type;
         ms->possible_cpus->cpus[i].vcpus_count = 1;
         ms->possible_cpus->cpus[i].arch_id = x86_cpu_apic_id_from_index(pcms, i);
-        x86_topo_ids_from_apicid(ms->possible_cpus->cpus[i].arch_id,
-                                 &topo_info, &topo_ids);
+        pcms->topo_ids_from_apicid(ms->possible_cpus->cpus[i].arch_id,
+                                   &topo_info, &topo_ids);
         ms->possible_cpus->cpus[i].props.has_socket_id = true;
         ms->possible_cpus->cpus[i].props.socket_id = topo_ids.pkg_id;
         if (pcms->smp_dies > 1) {
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 56aa0e45b5..ffc5c78164 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -71,6 +71,8 @@ struct PCMachineState {
     /* Apic id specific handlers */
     uint32_t (*apicid_from_cpu_idx)(X86CPUTopoInfo *topo_info,
                                     unsigned cpu_index);
+    void (*topo_ids_from_apicid)(apic_id_t apicid, X86CPUTopoInfo *topo_info,
+                                 X86CPUTopoIDs *topo_ids);
 
     /* Address space used by IOAPIC device. All IOAPIC interrupts
      * will be translated to MSI messages in the address space. */




* [PATCH v3 15/18] hw/i386: Introduce apic_id_from_topo_ids handler in PCMachineState
  2019-12-04  0:36 [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models Babu Moger
                   ` (13 preceding siblings ...)
  2019-12-04  0:38 ` [PATCH v3 14/18] hw/i386: Introduce topo_ids_from_apicid handler PCMachineState Babu Moger
@ 2019-12-04  0:38 ` Babu Moger
  2019-12-04  0:38 ` [PATCH v3 16/18] hw/i386: Introduce EPYC mode function handlers Babu Moger
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 53+ messages in thread
From: Babu Moger @ 2019-12-04  0:38 UTC (permalink / raw)
  To: ehabkost, marcel.apfelbaum, mst, pbonzini, rth, eblake, armbru, imammedo
  Cc: babu.moger, qemu-devel

Add function pointer apicid_from_topo_ids in PCMachineState.
Initialize it with the correct handler based on the mode selected.
Also rename the handler apicid_from_topo_ids to x86_apicid_from_topo_ids
for consistency. x86_apicid_from_topo_ids will be the default handler.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 hw/i386/pc.c               |    3 ++-
 include/hw/i386/pc.h       |    2 ++
 include/hw/i386/topology.h |    4 ++--
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index b0d58515dd..e6c8a458e7 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2305,7 +2305,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
         topo_ids.llc_id = cpu->llc_id;
         topo_ids.core_id = cpu->core_id;
         topo_ids.smt_id = cpu->thread_id;
-        cpu->apic_id = apicid_from_topo_ids(&topo_info, &topo_ids);
+        cpu->apic_id = pcms->apicid_from_topo_ids(&topo_info, &topo_ids);
     }
 
     cpu_slot = pc_find_cpu_slot(MACHINE(pcms), cpu->apic_id, &idx);
@@ -2682,6 +2682,7 @@ static void pc_machine_initfn(Object *obj)
     /* Initialize the apic id related handlers */
     pcms->apicid_from_cpu_idx = x86_apicid_from_cpu_idx;
     pcms->topo_ids_from_apicid = x86_topo_ids_from_apicid;
+    pcms->apicid_from_topo_ids = x86_apicid_from_topo_ids;
 
     pc_system_flash_create(pcms);
 }
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index ffc5c78164..0789f8b5ea 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -73,6 +73,8 @@ struct PCMachineState {
                                     unsigned cpu_index);
     void (*topo_ids_from_apicid)(apic_id_t apicid, X86CPUTopoInfo *topo_info,
                                  X86CPUTopoIDs *topo_ids);
+    apic_id_t (*apicid_from_topo_ids)(X86CPUTopoInfo *topo_info,
+                                      const X86CPUTopoIDs *topo_ids);
 
     /* Address space used by IOAPIC device. All IOAPIC interrupts
      * will be translated to MSI messages in the address space. */
diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index adb92fe9ce..b2b9e93a06 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -206,7 +206,7 @@ static inline apic_id_t x86_apicid_from_cpu_idx_epyc(X86CPUTopoInfo *topo_info,
  *
  * The caller must make sure core_id < nr_cores and smt_id < nr_threads.
  */
-static inline apic_id_t apicid_from_topo_ids(X86CPUTopoInfo *topo_info,
+static inline apic_id_t x86_apicid_from_topo_ids(X86CPUTopoInfo *topo_info,
                                              const X86CPUTopoIDs *topo_ids)
 {
     return (topo_ids->pkg_id  << apicid_pkg_offset(topo_info)) |
@@ -259,7 +259,7 @@ static inline apic_id_t x86_apicid_from_cpu_idx(X86CPUTopoInfo *topo_info,
 {
     X86CPUTopoIDs topo_ids;
     x86_topo_ids_from_idx(topo_info, cpu_index, &topo_ids);
-    return apicid_from_topo_ids(topo_info, &topo_ids);
+    return x86_apicid_from_topo_ids(topo_info, &topo_ids);
 }
 
 #endif /* HW_I386_TOPOLOGY_H */




* [PATCH v3 16/18] hw/i386: Introduce EPYC mode function handlers
  2019-12-04  0:36 [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models Babu Moger
                   ` (14 preceding siblings ...)
  2019-12-04  0:38 ` [PATCH v3 15/18] hw/i386: Introduce apic_id_from_topo_ids handler in PCMachineState Babu Moger
@ 2019-12-04  0:38 ` Babu Moger
  2020-01-28 20:04   ` Eduardo Habkost
  2019-12-04  0:38 ` [PATCH v3 17/18] i386: Fix pkg_id offset for epyc mode Babu Moger
                   ` (2 subsequent siblings)
  18 siblings, 1 reply; 53+ messages in thread
From: Babu Moger @ 2019-12-04  0:38 UTC (permalink / raw)
  To: ehabkost, marcel.apfelbaum, mst, pbonzini, rth, eblake, armbru, imammedo
  Cc: babu.moger, qemu-devel

Introduce the following handlers for the new epyc mode:
x86_apicid_from_cpu_idx_epyc: Generate apicid from cpu index.
x86_topo_ids_from_apicid_epyc: Generate topo ids from apic id.
x86_apicid_from_topo_ids_epyc: Generate apicid from topo ids.

pc_init_apicid_fn installs them when the cpu type name starts with "EPYC".
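
The selection step in this patch is a prefix match on the cpu type name. The
sketch below isolates that check so its behaviour is easy to see. ApicIdMode
and apicid_mode_for_cpu_type are hypothetical names for this example, and the
type name strings in the note after the code are assumptions about what
ms->cpu_type may contain.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical tag for which set of handlers to install. */
typedef enum {
    APICID_MODE_DEFAULT,
    APICID_MODE_EPYC
} ApicIdMode;

/*
 * Pick the handler set from the cpu type name. strncmp() compares at most
 * the first 4 characters, so this is a case-sensitive prefix match.
 */
static ApicIdMode apicid_mode_for_cpu_type(const char *cpu_type)
{
    assert(cpu_type != NULL);
    return strncmp(cpu_type, "EPYC", 4) == 0 ? APICID_MODE_EPYC
                                             : APICID_MODE_DEFAULT;
}
```

Assuming type names of the form "EPYC-x86_64-cpu", any name beginning with
"EPYC" selects the EPYC handlers, including variants such as
"EPYC-IBPB-x86_64-cpu", while "qemu64-x86_64-cpu" and a lowercase "epyc..."
keep the defaults.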

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 hw/i386/pc.c               |   12 ++++++++++++
 include/hw/i386/topology.h |    4 ++--
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index e6c8a458e7..64e3658873 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2819,6 +2819,17 @@ static bool pc_hotplug_allowed(MachineState *ms, DeviceState *dev, Error **errp)
     return true;
 }
 
+static void pc_init_apicid_fn(MachineState *ms)
+{
+    PCMachineState *pcms = PC_MACHINE(ms);
+
+    if (!strncmp(ms->cpu_type, "EPYC", 4)) {
+        pcms->apicid_from_cpu_idx = x86_apicid_from_cpu_idx_epyc;
+        pcms->topo_ids_from_apicid = x86_topo_ids_from_apicid_epyc;
+        pcms->apicid_from_topo_ids = x86_apicid_from_topo_ids_epyc;
+    }
+}
+
 static void pc_machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
@@ -2847,6 +2858,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
     mc->cpu_index_to_instance_props = pc_cpu_index_to_props;
     mc->get_default_cpu_node_id = pc_get_default_cpu_node_id;
     mc->possible_cpu_arch_ids = pc_possible_cpu_arch_ids;
+    mc->init_apicid_fn = pc_init_apicid_fn;
     mc->auto_enable_numa_with_memhp = true;
     mc->has_hotpluggable_cpus = true;
     mc->default_boot_order = "cad";
diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index b2b9e93a06..f028d2332a 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -140,7 +140,7 @@ static inline unsigned apicid_pkg_offset_epyc(X86CPUTopoInfo *topo_info)
  *
  * The caller must make sure core_id < nr_cores and smt_id < nr_threads.
  */
-static inline apic_id_t apicid_from_topo_ids_epyc(X86CPUTopoInfo *topo_info,
+static inline apic_id_t x86_apicid_from_topo_ids_epyc(X86CPUTopoInfo *topo_info,
                                                   const X86CPUTopoIDs *topo_ids)
 {
     return (topo_ids->pkg_id  << apicid_pkg_offset_epyc(topo_info)) |
@@ -200,7 +200,7 @@ static inline apic_id_t x86_apicid_from_cpu_idx_epyc(X86CPUTopoInfo *topo_info,
 {
     X86CPUTopoIDs topo_ids;
     x86_topo_ids_from_idx_epyc(topo_info, cpu_index, &topo_ids);
-    return apicid_from_topo_ids_epyc(topo_info, &topo_ids);
+    return x86_apicid_from_topo_ids_epyc(topo_info, &topo_ids);
 }
 /* Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
  *
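
For readers following the dispatch in pc_init_apicid_fn(), here is a minimal, self-contained sketch of the same idea: a handler pointer is set to a default and then replaced when the CPU type name begins with "EPYC". The Machine struct, the two encoder stubs and machine_init_apicid_fn() are hypothetical stand-ins and not QEMU code; installing the default inside the init function is also an assumption. Only the strncmp() prefix test mirrors the hunk above.

```c
#include <assert.h>
#include <string.h>

typedef unsigned apic_id;

/* Hypothetical stand-in for the default encoder: identity mapping. */
static apic_id apicid_default_stub(unsigned cpu_index)
{
    return cpu_index;
}

/* Hypothetical stand-in for the EPYC encoder: it only tags the value so
 * the two handlers can be told apart; it is not the real encoding. */
static apic_id apicid_epyc_stub(unsigned cpu_index)
{
    return cpu_index | 0x100u;
}

/* Hypothetical machine object holding one handler pointer. */
typedef struct Machine {
    const char *cpu_type;
    apic_id (*apicid_from_cpu_idx)(unsigned cpu_index);
} Machine;

/* Install the default handler, then override it for EPYC type names. */
static void machine_init_apicid_fn(Machine *m)
{
    assert(m != NULL);
    m->apicid_from_cpu_idx = apicid_default_stub;
    /* Prefix match on the first four characters, as in the patch. */
    if (m->cpu_type && !strncmp(m->cpu_type, "EPYC", 4)) {
        m->apicid_from_cpu_idx = apicid_epyc_stub;
    }
}
```

Because the comparison is a four-character prefix match, any type name that starts with "EPYC" selects the EPYC handler, and every other name keeps the default.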




* [PATCH v3 17/18] i386: Fix pkg_id offset for epyc mode
  2019-12-04  0:36 [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models Babu Moger
                   ` (15 preceding siblings ...)
  2019-12-04  0:38 ` [PATCH v3 16/18] hw/i386: Introduce EPYC mode function handlers Babu Moger
@ 2019-12-04  0:38 ` Babu Moger
  2019-12-04  0:39 ` [PATCH v3 18/18] tests: Update the Unit tests Babu Moger
  2020-02-03 14:59 ` [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models Igor Mammedov
  18 siblings, 0 replies; 53+ messages in thread
From: Babu Moger @ 2019-12-04  0:38 UTC (permalink / raw)
  To: ehabkost, marcel.apfelbaum, mst, pbonzini, rth, eblake, armbru, imammedo
  Cc: babu.moger, qemu-devel

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 target/i386/cpu.c |   11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index e87487bae3..0eaedeb848 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -4456,7 +4456,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
 {
     X86CPU *cpu = env_archcpu(env);
     CPUState *cs = env_cpu(env);
-    uint32_t die_offset;
+    uint32_t die_offset, pkg_offset;
     uint32_t limit;
     uint32_t signature[3];
     X86CPUTopoInfo topo_info;
@@ -4466,6 +4466,11 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
     topo_info.cores_per_die = cs->nr_cores;
     topo_info.threads_per_core = cs->nr_threads;
 
+    if (env->family_type == CPUX86FAMILY_EPYC)
+        pkg_offset = apicid_pkg_offset_epyc(&topo_info);
+    else
+        pkg_offset = apicid_pkg_offset(&topo_info);
+
     /* Calculate & apply limits for different index ranges */
     if (index >= 0xC0000000) {
         limit = env->cpuid_xlevel2;
@@ -4647,7 +4652,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             *ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
             break;
         case 1:
-            *eax = apicid_pkg_offset(&topo_info);
+            *eax = pkg_offset;
             *ebx = cs->nr_cores * cs->nr_threads;
             *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
             break;
@@ -4681,7 +4686,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
             break;
         case 2:
-            *eax = apicid_pkg_offset(&topo_info);
+            *eax = pkg_offset;
             *ebx = env->nr_dies * cs->nr_cores * cs->nr_threads;
             *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
             break;
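
To illustrate why the package offset has to depend on the mode, the sketch below computes it twice for the same topology. It is a rough sketch under stated assumptions: apicid_pkg_offset_epyc() is defined elsewhere in the series and is not shown here, so pkg_offset_epyc_sketch() simply assumes that an EPYC-style layout places a node-ID field directly below the package ID. TopoInfo, nodes_per_pkg as used here, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical topology description for this sketch only. */
typedef struct TopoInfo {
    unsigned nodes_per_pkg;     /* assumption: used only by the EPYC-style layout */
    unsigned dies_per_pkg;
    unsigned cores_per_die;
    unsigned threads_per_core;
} TopoInfo;

/* Bits needed to hold IDs 0..count-1 (0 bits when count == 1). */
static unsigned bitwidth_for_count(unsigned count)
{
    unsigned width = 0;

    assert(count >= 1);
    while (width < 32 && (1u << width) < count) {
        width++;
    }
    return width;
}

/* Default layout: package ID sits above the thread, core and die fields. */
static unsigned pkg_offset_default(const TopoInfo *t)
{
    return bitwidth_for_count(t->threads_per_core) +
           bitwidth_for_count(t->cores_per_die) +
           bitwidth_for_count(t->dies_per_pkg);
}

/* Assumed EPYC-style layout: thread, core and node fields below the package ID. */
static unsigned pkg_offset_epyc_sketch(const TopoInfo *t)
{
    return bitwidth_for_count(t->threads_per_core) +
           bitwidth_for_count(t->cores_per_die) +
           bitwidth_for_count(t->nodes_per_pkg);
}

/* Compute the offset once and reuse it wherever a CPUID leaf reports it. */
static unsigned select_pkg_offset(const TopoInfo *t, bool epyc_mode)
{
    return epyc_mode ? pkg_offset_epyc_sketch(t) : pkg_offset_default(t);
}
```

With 2 threads per core, 8 cores and 4 nodes, the assumed EPYC-style layout yields a package offset of 6, which agrees with ApicId[6] being the socket ID in the PPR text quoted in the cover letter; the default layout with one die yields 4.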




* [PATCH v3 18/18] tests: Update the Unit tests
  2019-12-04  0:36 [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models Babu Moger
                   ` (16 preceding siblings ...)
  2019-12-04  0:38 ` [PATCH v3 17/18] i386: Fix pkg_id offset for epyc mode Babu Moger
@ 2019-12-04  0:39 ` Babu Moger
  2020-02-03 14:59 ` [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models Igor Mammedov
  18 siblings, 0 replies; 53+ messages in thread
From: Babu Moger @ 2019-12-04  0:39 UTC (permalink / raw)
  To: ehabkost, marcel.apfelbaum, mst, pbonzini, rth, eblake, armbru, imammedo
  Cc: babu.moger, qemu-devel

Since the topology routines have changed, update
the unit tests to use the new APIs.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 tests/test-x86-cpuid.c |  115 ++++++++++++++++++++++++++++--------------------
 1 file changed, 68 insertions(+), 47 deletions(-)

diff --git a/tests/test-x86-cpuid.c b/tests/test-x86-cpuid.c
index 1942287f33..00553c1d77 100644
--- a/tests/test-x86-cpuid.c
+++ b/tests/test-x86-cpuid.c
@@ -28,79 +28,100 @@
 
 static void test_topo_bits(void)
 {
+    X86CPUTopoInfo topo_info = {0};
+
     /* simple tests for 1 thread per core, 1 core per die, 1 die per package */
-    g_assert_cmpuint(apicid_smt_width(1, 1, 1), ==, 0);
-    g_assert_cmpuint(apicid_core_width(1, 1, 1), ==, 0);
-    g_assert_cmpuint(apicid_die_width(1, 1, 1), ==, 0);
+    topo_info = (X86CPUTopoInfo) {0, 1, 1, 1};
+    g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 0);
+    g_assert_cmpuint(apicid_core_width(&topo_info), ==, 0);
+    g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
 
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 1, 1, 0), ==, 0);
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 1, 1, 1), ==, 1);
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 1, 1, 2), ==, 2);
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 1, 1, 3), ==, 3);
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 3), ==, 3);
 
 
     /* Test field width calculation for multiple values
      */
-    g_assert_cmpuint(apicid_smt_width(1, 1, 2), ==, 1);
-    g_assert_cmpuint(apicid_smt_width(1, 1, 3), ==, 2);
-    g_assert_cmpuint(apicid_smt_width(1, 1, 4), ==, 2);
-
-    g_assert_cmpuint(apicid_smt_width(1, 1, 14), ==, 4);
-    g_assert_cmpuint(apicid_smt_width(1, 1, 15), ==, 4);
-    g_assert_cmpuint(apicid_smt_width(1, 1, 16), ==, 4);
-    g_assert_cmpuint(apicid_smt_width(1, 1, 17), ==, 5);
-
-
-    g_assert_cmpuint(apicid_core_width(1, 30, 2), ==, 5);
-    g_assert_cmpuint(apicid_core_width(1, 31, 2), ==, 5);
-    g_assert_cmpuint(apicid_core_width(1, 32, 2), ==, 5);
-    g_assert_cmpuint(apicid_core_width(1, 33, 2), ==, 6);
-
-    g_assert_cmpuint(apicid_die_width(1, 30, 2), ==, 0);
-    g_assert_cmpuint(apicid_die_width(2, 30, 2), ==, 1);
-    g_assert_cmpuint(apicid_die_width(3, 30, 2), ==, 2);
-    g_assert_cmpuint(apicid_die_width(4, 30, 2), ==, 2);
+    topo_info = (X86CPUTopoInfo) {0, 1, 1, 2};
+    g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 1);
+    topo_info = (X86CPUTopoInfo) {0, 1, 1, 3};
+    g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
+    topo_info = (X86CPUTopoInfo) {0, 1, 1, 4};
+    g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
+
+    topo_info = (X86CPUTopoInfo) {0, 1, 1, 14};
+    g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
+    topo_info = (X86CPUTopoInfo) {0, 1, 1, 15};
+    g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
+    topo_info = (X86CPUTopoInfo) {0, 1, 1, 16};
+    g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
+    topo_info = (X86CPUTopoInfo) {0, 1, 1, 17};
+    g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 5);
+
+
+    topo_info = (X86CPUTopoInfo) {0, 1, 30, 2};
+    g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
+    topo_info = (X86CPUTopoInfo) {0, 1, 31, 2};
+    g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
+    topo_info = (X86CPUTopoInfo) {0, 1, 32, 2};
+    g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
+    topo_info = (X86CPUTopoInfo) {0, 1, 33, 2};
+    g_assert_cmpuint(apicid_core_width(&topo_info), ==, 6);
+
+    topo_info = (X86CPUTopoInfo) {0, 1, 30, 2};
+    g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
+    topo_info = (X86CPUTopoInfo) {0, 2, 30, 2};
+    g_assert_cmpuint(apicid_die_width(&topo_info), ==, 1);
+    topo_info = (X86CPUTopoInfo) {0, 3, 30, 2};
+    g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
+    topo_info = (X86CPUTopoInfo) {0, 4, 30, 2};
+    g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
 
     /* build a weird topology and see if IDs are calculated correctly
      */
 
     /* This will use 2 bits for thread ID and 3 bits for core ID
      */
-    g_assert_cmpuint(apicid_smt_width(1, 6, 3), ==, 2);
-    g_assert_cmpuint(apicid_core_offset(1, 6, 3), ==, 2);
-    g_assert_cmpuint(apicid_die_offset(1, 6, 3), ==, 5);
-    g_assert_cmpuint(apicid_pkg_offset(1, 6, 3), ==, 5);
-
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 6, 3, 0), ==, 0);
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 6, 3, 1), ==, 1);
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 6, 3, 2), ==, 2);
-
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 6, 3, 1 * 3 + 0), ==,
+    topo_info = (X86CPUTopoInfo) {0, 1, 6, 3};
+    g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
+    g_assert_cmpuint(apicid_core_offset(&topo_info), ==, 2);
+    g_assert_cmpuint(apicid_die_offset(&topo_info), ==, 5);
+    g_assert_cmpuint(apicid_pkg_offset(&topo_info), ==, 5);
+
+    topo_info = (X86CPUTopoInfo) {0, 1, 6, 3};
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
+
+    topo_info = (X86CPUTopoInfo) {0, 1, 6, 3};
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 0), ==,
                      (1 << 2) | 0);
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 6, 3, 1 * 3 + 1), ==,
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 1), ==,
                      (1 << 2) | 1);
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 6, 3, 1 * 3 + 2), ==,
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 2), ==,
                      (1 << 2) | 2);
 
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 6, 3, 2 * 3 + 0), ==,
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2 * 3 + 0), ==,
                      (2 << 2) | 0);
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 6, 3, 2 * 3 + 1), ==,
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2 * 3 + 1), ==,
                      (2 << 2) | 1);
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 6, 3, 2 * 3 + 2), ==,
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2 * 3 + 2), ==,
                      (2 << 2) | 2);
 
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 6, 3, 5 * 3 + 0), ==,
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 5 * 3 + 0), ==,
                      (5 << 2) | 0);
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 6, 3, 5 * 3 + 1), ==,
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 5 * 3 + 1), ==,
                      (5 << 2) | 1);
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 6, 3, 5 * 3 + 2), ==,
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 5 * 3 + 2), ==,
                      (5 << 2) | 2);
 
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 6, 3,
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info,
                      1 * 6 * 3 + 0 * 3 + 0), ==, (1 << 5));
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 6, 3,
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info,
                      1 * 6 * 3 + 1 * 3 + 1), ==, (1 << 5) | (1 << 2) | 1);
-    g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 6, 3,
+    g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info,
                      3 * 6 * 3 + 5 * 3 + 2), ==, (3 << 5) | (5 << 2) | 2);
 }
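
The width and offset values asserted in test_topo_bits() can be reproduced with a small standalone sketch. It assumes the width helper returns the number of bits needed to hold IDs 0..count-1, which is consistent with every value asserted above. TopoInfo and the function names are hypothetical; the sketch keeps only three fields, whereas the test initializers carry a fourth leading field that is presumably introduced elsewhere in the series.

```c
#include <assert.h>

/* Hypothetical three-field topology description for this sketch. */
typedef struct TopoInfo {
    unsigned dies_per_pkg;
    unsigned cores_per_die;
    unsigned threads_per_core;
} TopoInfo;

/* Bits needed to hold IDs 0..count-1 (0 bits when count == 1). */
static unsigned bitwidth_for_count(unsigned count)
{
    unsigned width = 0;

    assert(count >= 1);
    while (width < 32 && (1u << width) < count) {
        width++;
    }
    return width;
}

static unsigned smt_width(const TopoInfo *t)
{
    return bitwidth_for_count(t->threads_per_core);
}

static unsigned core_width(const TopoInfo *t)
{
    return bitwidth_for_count(t->cores_per_die);
}

static unsigned die_width(const TopoInfo *t)
{
    return bitwidth_for_count(t->dies_per_pkg);
}

/* Each field starts where the previous, less significant field ends. */
static unsigned core_offset(const TopoInfo *t)
{
    return smt_width(t);
}

static unsigned die_offset(const TopoInfo *t)
{
    return core_offset(t) + core_width(t);
}

static unsigned pkg_offset(const TopoInfo *t)
{
    return die_offset(t) + die_width(t);
}
```

For the "weird" topology of 1 die, 6 cores and 3 threads, this gives 2 bits for the thread ID and 3 bits for the core ID, so the die and package fields both start at bit 5, as the test expects.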
 




* Re: [PATCH v3 02/18] hw/i386: Introduce X86CPUTopoInfo to contain topology info
  2019-12-04  0:37 ` [PATCH v3 02/18] hw/i386: Introduce X86CPUTopoInfo to contain topology info Babu Moger
@ 2020-01-28 15:44   ` Igor Mammedov
  0 siblings, 0 replies; 53+ messages in thread
From: Igor Mammedov @ 2020-01-28 15:44 UTC (permalink / raw)
  To: Babu Moger; +Cc: ehabkost, mst, armbru, qemu-devel, pbonzini, rth

On Tue, 03 Dec 2019 18:37:08 -0600
Babu Moger <babu.moger@amd.com> wrote:

> This is an effort to rearrange a few data structures for better readability.
> Add X86CPUTopoInfo, which holds all the topology information required
> to build the CPU topology. There are no functional changes.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>

Reviewed-by: Igor Mammedov <imammedo@redhat.com>

> ---
>  hw/i386/pc.c               |   40 +++++++++++++++++++++++++++-------------
>  include/hw/i386/topology.h |   38 ++++++++++++++++++++++++--------------
>  2 files changed, 51 insertions(+), 27 deletions(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 5bd2ffccb7..8c23b1e8c9 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -878,11 +878,15 @@ static uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
>  {
>      MachineState *ms = MACHINE(pcms);
>      PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> +    X86CPUTopoInfo topo_info;
>      uint32_t correct_id;
>      static bool warned;
>  
> -    correct_id = x86_apicid_from_cpu_idx(pcms->smp_dies, ms->smp.cores,
> -                                         ms->smp.threads, cpu_index);
> +    topo_info.dies_per_pkg = pcms->smp_dies;
> +    topo_info.cores_per_die = ms->smp.cores;
> +    topo_info.threads_per_core = ms->smp.threads;
> +
> +    correct_id = x86_apicid_from_cpu_idx(&topo_info, cpu_index);
>      if (pcmc->compat_apic_id_mode) {
>          if (cpu_index != correct_id && !warned && !qtest_enabled()) {
>              error_report("APIC IDs set in compatibility mode, "
> @@ -2219,6 +2223,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>      PCMachineState *pcms = PC_MACHINE(hotplug_dev);
>      unsigned int smp_cores = ms->smp.cores;
>      unsigned int smp_threads = ms->smp.threads;
> +    X86CPUTopoInfo topo_info;
>  
>      if(!object_dynamic_cast(OBJECT(cpu), ms->cpu_type)) {
>          error_setg(errp, "Invalid CPU type, expected cpu type: '%s'",
> @@ -2226,6 +2231,10 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>          return;
>      }
>  
> +    topo_info.dies_per_pkg = pcms->smp_dies;
> +    topo_info.cores_per_die = smp_cores;
> +    topo_info.threads_per_core = smp_threads;
> +
>      env->nr_dies = pcms->smp_dies;
>  
>      /*
> @@ -2281,16 +2290,14 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>          topo_ids.die_id = cpu->die_id;
>          topo_ids.core_id = cpu->core_id;
>          topo_ids.smt_id = cpu->thread_id;
> -        cpu->apic_id = apicid_from_topo_ids(pcms->smp_dies, smp_cores,
> -                                            smp_threads, &topo_ids);
> +        cpu->apic_id = apicid_from_topo_ids(&topo_info, &topo_ids);
>      }
>  
>      cpu_slot = pc_find_cpu_slot(MACHINE(pcms), cpu->apic_id, &idx);
>      if (!cpu_slot) {
>          MachineState *ms = MACHINE(pcms);
>  
> -        x86_topo_ids_from_apicid(cpu->apic_id, pcms->smp_dies,
> -                                 smp_cores, smp_threads, &topo_ids);
> +        x86_topo_ids_from_apicid(cpu->apic_id, &topo_info, &topo_ids);
>          error_setg(errp,
>              "Invalid CPU [socket: %u, die: %u, core: %u, thread: %u] with"
>              " APIC ID %" PRIu32 ", valid index range 0:%d",
> @@ -2311,8 +2318,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>      /* TODO: move socket_id/core_id/thread_id checks into x86_cpu_realizefn()
>       * once -smp refactoring is complete and there will be CPU private
>       * CPUState::nr_cores and CPUState::nr_threads fields instead of globals */
> -    x86_topo_ids_from_apicid(cpu->apic_id, pcms->smp_dies,
> -                             smp_cores, smp_threads, &topo_ids);
> +    x86_topo_ids_from_apicid(cpu->apic_id, &topo_info, &topo_ids);
>      if (cpu->socket_id != -1 && cpu->socket_id != topo_ids.pkg_id) {
>          error_setg(errp, "property socket-id: %u doesn't match set apic-id:"
>              " 0x%x (socket-id: %u)", cpu->socket_id, cpu->apic_id, topo_ids.pkg_id);
> @@ -2694,19 +2700,28 @@ static int64_t pc_get_default_cpu_node_id(const MachineState *ms, int idx)
>  {
>     X86CPUTopoIDs topo_ids;
>     PCMachineState *pcms = PC_MACHINE(ms);
> +   X86CPUTopoInfo topo_info;
> +
> +   topo_info.dies_per_pkg = pcms->smp_dies;
> +   topo_info.cores_per_die = ms->smp.cores;
> +   topo_info.threads_per_core = ms->smp.threads;
>  
>     assert(idx < ms->possible_cpus->len);
>     x86_topo_ids_from_apicid(ms->possible_cpus->cpus[idx].arch_id,
> -                            pcms->smp_dies, ms->smp.cores,
> -                            ms->smp.threads, &topo_ids);
> +                            &topo_info, &topo_ids);
>     return topo_ids.pkg_id % ms->numa_state->num_nodes;
>  }
>  
>  static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
>  {
>      PCMachineState *pcms = PC_MACHINE(ms);
> -    int i;
>      unsigned int max_cpus = ms->smp.max_cpus;
> +    X86CPUTopoInfo topo_info;
> +    int i;
> +
> +    topo_info.dies_per_pkg = pcms->smp_dies;
> +    topo_info.cores_per_die = ms->smp.cores;
> +    topo_info.threads_per_core = ms->smp.threads;
>  
>      if (ms->possible_cpus) {
>          /*
> @@ -2727,8 +2742,7 @@ static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
>          ms->possible_cpus->cpus[i].vcpus_count = 1;
>          ms->possible_cpus->cpus[i].arch_id = x86_cpu_apic_id_from_index(pcms, i);
>          x86_topo_ids_from_apicid(ms->possible_cpus->cpus[i].arch_id,
> -                                 pcms->smp_dies, ms->smp.cores,
> -                                 ms->smp.threads, &topo_ids);
> +                                 &topo_info, &topo_ids);
>          ms->possible_cpus->cpus[i].props.has_socket_id = true;
>          ms->possible_cpus->cpus[i].props.socket_id = topo_ids.pkg_id;
>          if (pcms->smp_dies > 1) {
> diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> index 6c184f3115..cf1935d548 100644
> --- a/include/hw/i386/topology.h
> +++ b/include/hw/i386/topology.h
> @@ -52,6 +52,12 @@ typedef struct X86CPUTopoIDs {
>      unsigned smt_id;
>  } X86CPUTopoIDs;
>  
> +typedef struct X86CPUTopoInfo {
> +    unsigned dies_per_pkg;
> +    unsigned cores_per_die;
> +    unsigned threads_per_core;
> +} X86CPUTopoInfo;
> +
>  /* Return the bit width needed for 'count' IDs
>   */
>  static unsigned apicid_bitwidth_for_count(unsigned count)
> @@ -119,11 +125,13 @@ static inline unsigned apicid_pkg_offset(unsigned nr_dies,
>   *
>   * The caller must make sure core_id < nr_cores and smt_id < nr_threads.
>   */
> -static inline apic_id_t apicid_from_topo_ids(unsigned nr_dies,
> -                                             unsigned nr_cores,
> -                                             unsigned nr_threads,
> +static inline apic_id_t apicid_from_topo_ids(X86CPUTopoInfo *topo_info,
>                                               const X86CPUTopoIDs *topo_ids)
>  {
> +    unsigned nr_dies = topo_info->dies_per_pkg;
> +    unsigned nr_cores = topo_info->cores_per_die;
> +    unsigned nr_threads = topo_info->threads_per_core;
> +
>      return (topo_ids->pkg_id  << apicid_pkg_offset(nr_dies, nr_cores, nr_threads)) |
>             (topo_ids->die_id  << apicid_die_offset(nr_dies, nr_cores, nr_threads)) |
>             (topo_ids->core_id << apicid_core_offset(nr_dies, nr_cores, nr_threads)) |
> @@ -133,12 +141,14 @@ static inline apic_id_t apicid_from_topo_ids(unsigned nr_dies,
>  /* Calculate thread/core/package IDs for a specific topology,
>   * based on (contiguous) CPU index
>   */
> -static inline void x86_topo_ids_from_idx(unsigned nr_dies,
> -                                         unsigned nr_cores,
> -                                         unsigned nr_threads,
> +static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
>                                           unsigned cpu_index,
>                                           X86CPUTopoIDs *topo_ids)
>  {
> +    unsigned nr_dies = topo_info->dies_per_pkg;
> +    unsigned nr_cores = topo_info->cores_per_die;
> +    unsigned nr_threads = topo_info->threads_per_core;
> +
>      topo_ids->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
>      topo_ids->die_id = cpu_index / (nr_cores * nr_threads) % nr_dies;
>      topo_ids->core_id = cpu_index / nr_threads % nr_cores;
> @@ -149,11 +159,13 @@ static inline void x86_topo_ids_from_idx(unsigned nr_dies,
>   * based on APIC ID
>   */
>  static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
> -                                            unsigned nr_dies,
> -                                            unsigned nr_cores,
> -                                            unsigned nr_threads,
> +                                            X86CPUTopoInfo *topo_info,
>                                              X86CPUTopoIDs *topo_ids)
>  {
> +    unsigned nr_dies = topo_info->dies_per_pkg;
> +    unsigned nr_cores = topo_info->cores_per_die;
> +    unsigned nr_threads = topo_info->threads_per_core;
> +
>      topo_ids->smt_id = apicid &
>              ~(0xFFFFFFFFUL << apicid_smt_width(nr_dies, nr_cores, nr_threads));
>      topo_ids->core_id =
> @@ -169,14 +181,12 @@ static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
>   *
>   * 'cpu_index' is a sequential, contiguous ID for the CPU.
>   */
> -static inline apic_id_t x86_apicid_from_cpu_idx(unsigned nr_dies,
> -                                                unsigned nr_cores,
> -                                                unsigned nr_threads,
> +static inline apic_id_t x86_apicid_from_cpu_idx(X86CPUTopoInfo *topo_info,
>                                                  unsigned cpu_index)
>  {
>      X86CPUTopoIDs topo_ids;
> -    x86_topo_ids_from_idx(nr_dies, nr_cores, nr_threads, cpu_index, &topo_ids);
> -    return apicid_from_topo_ids(nr_dies, nr_cores, nr_threads, &topo_ids);
> +    x86_topo_ids_from_idx(topo_info, cpu_index, &topo_ids);
> +    return apicid_from_topo_ids(topo_info, &topo_ids);
>  }
>  
>  #endif /* HW_I386_TOPOLOGY_H */
> 
> 
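
As a worked illustration of the index arithmetic in x86_topo_ids_from_idx() quoted above, the following standalone sketch splits a contiguous CPU index into package, die, core and thread IDs. TopoInfo and TopoIDs are hypothetical local types that mirror the field names in the patch. The smt_id line is not visible in the quoted hunk and is assumed here to be the remainder modulo the thread count.

```c
#include <assert.h>

/* Hypothetical local copies of the structures named in the patch. */
typedef struct TopoInfo {
    unsigned dies_per_pkg;
    unsigned cores_per_die;
    unsigned threads_per_core;
} TopoInfo;

typedef struct TopoIDs {
    unsigned pkg_id;
    unsigned die_id;
    unsigned core_id;
    unsigned smt_id;
} TopoIDs;

/* Decompose a contiguous CPU index, least significant level last. */
static void topo_ids_from_idx(const TopoInfo *info, unsigned cpu_index,
                              TopoIDs *ids)
{
    unsigned nr_dies = info->dies_per_pkg;
    unsigned nr_cores = info->cores_per_die;
    unsigned nr_threads = info->threads_per_core;

    /* All counts must be non-zero, otherwise the divisions are undefined. */
    assert(nr_dies && nr_cores && nr_threads);

    ids->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
    ids->die_id = cpu_index / (nr_cores * nr_threads) % nr_dies;
    ids->core_id = cpu_index / nr_threads % nr_cores;
    ids->smt_id = cpu_index % nr_threads;   /* assumed, see note above */
}
```

With 2 dies, 4 cores per die and 2 threads per core, index 13 = 1*8 + 2*2 + 1 maps to package 0, die 1, core 2, thread 1.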




* Re: [PATCH v3 03/18] hw/i386: Consolidate topology functions
  2019-12-04  0:37 ` [PATCH v3 03/18] hw/i386: Consolidate topology functions Babu Moger
@ 2020-01-28 15:46   ` Igor Mammedov
  0 siblings, 0 replies; 53+ messages in thread
From: Igor Mammedov @ 2020-01-28 15:46 UTC (permalink / raw)
  To: Babu Moger; +Cc: ehabkost, mst, armbru, qemu-devel, pbonzini, rth

On Tue, 03 Dec 2019 18:37:15 -0600
Babu Moger <babu.moger@amd.com> wrote:

> Now that we have all the parameters in X86CPUTopoInfo, we can just pass the
> structure to calculate the offsets and width.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>

Reviewed-by: Igor Mammedov <imammedo@redhat.com>

> ---
>  include/hw/i386/topology.h |   64 ++++++++++++++------------------------------
>  target/i386/cpu.c          |   23 ++++++++--------
>  2 files changed, 32 insertions(+), 55 deletions(-)
> 
> diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> index cf1935d548..ba52d49079 100644
> --- a/include/hw/i386/topology.h
> +++ b/include/hw/i386/topology.h
> @@ -69,56 +69,42 @@ static unsigned apicid_bitwidth_for_count(unsigned count)
>  
>  /* Bit width of the SMT_ID (thread ID) field on the APIC ID
>   */
> -static inline unsigned apicid_smt_width(unsigned nr_dies,
> -                                        unsigned nr_cores,
> -                                        unsigned nr_threads)
> +static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
>  {
> -    return apicid_bitwidth_for_count(nr_threads);
> +    return apicid_bitwidth_for_count(topo_info->threads_per_core);
>  }
>  
>  /* Bit width of the Core_ID field
>   */
> -static inline unsigned apicid_core_width(unsigned nr_dies,
> -                                         unsigned nr_cores,
> -                                         unsigned nr_threads)
> +static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
>  {
> -    return apicid_bitwidth_for_count(nr_cores);
> +    return apicid_bitwidth_for_count(topo_info->cores_per_die);
>  }
>  
>  /* Bit width of the Die_ID field */
> -static inline unsigned apicid_die_width(unsigned nr_dies,
> -                                        unsigned nr_cores,
> -                                        unsigned nr_threads)
> +static inline unsigned apicid_die_width(X86CPUTopoInfo *topo_info)
>  {
> -    return apicid_bitwidth_for_count(nr_dies);
> +    return apicid_bitwidth_for_count(topo_info->dies_per_pkg);
>  }
>  
>  /* Bit offset of the Core_ID field
>   */
> -static inline unsigned apicid_core_offset(unsigned nr_dies,
> -                                          unsigned nr_cores,
> -                                          unsigned nr_threads)
> +static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
>  {
> -    return apicid_smt_width(nr_dies, nr_cores, nr_threads);
> +    return apicid_smt_width(topo_info);
>  }
>  
>  /* Bit offset of the Die_ID field */
> -static inline unsigned apicid_die_offset(unsigned nr_dies,
> -                                          unsigned nr_cores,
> -                                           unsigned nr_threads)
> +static inline unsigned apicid_die_offset(X86CPUTopoInfo *topo_info)
>  {
> -    return apicid_core_offset(nr_dies, nr_cores, nr_threads) +
> -           apicid_core_width(nr_dies, nr_cores, nr_threads);
> +    return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
>  }
>  
>  /* Bit offset of the Pkg_ID (socket ID) field
>   */
> -static inline unsigned apicid_pkg_offset(unsigned nr_dies,
> -                                         unsigned nr_cores,
> -                                         unsigned nr_threads)
> +static inline unsigned apicid_pkg_offset(X86CPUTopoInfo *topo_info)
>  {
> -    return apicid_die_offset(nr_dies, nr_cores, nr_threads) +
> -           apicid_die_width(nr_dies, nr_cores, nr_threads);
> +    return apicid_die_offset(topo_info) + apicid_die_width(topo_info);
>  }
>  
>  /* Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
> @@ -128,13 +114,9 @@ static inline unsigned apicid_pkg_offset(unsigned nr_dies,
>  static inline apic_id_t apicid_from_topo_ids(X86CPUTopoInfo *topo_info,
>                                               const X86CPUTopoIDs *topo_ids)
>  {
> -    unsigned nr_dies = topo_info->dies_per_pkg;
> -    unsigned nr_cores = topo_info->cores_per_die;
> -    unsigned nr_threads = topo_info->threads_per_core;
> -
> -    return (topo_ids->pkg_id  << apicid_pkg_offset(nr_dies, nr_cores, nr_threads)) |
> -           (topo_ids->die_id  << apicid_die_offset(nr_dies, nr_cores, nr_threads)) |
> -           (topo_ids->core_id << apicid_core_offset(nr_dies, nr_cores, nr_threads)) |
> +    return (topo_ids->pkg_id  << apicid_pkg_offset(topo_info)) |
> +           (topo_ids->die_id  << apicid_die_offset(topo_info)) |
> +           (topo_ids->core_id << apicid_core_offset(topo_info)) |
>             topo_ids->smt_id;
>  }
>  
> @@ -162,19 +144,15 @@ static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
>                                              X86CPUTopoInfo *topo_info,
>                                              X86CPUTopoIDs *topo_ids)
>  {
> -    unsigned nr_dies = topo_info->dies_per_pkg;
> -    unsigned nr_cores = topo_info->cores_per_die;
> -    unsigned nr_threads = topo_info->threads_per_core;
> -
>      topo_ids->smt_id = apicid &
> -            ~(0xFFFFFFFFUL << apicid_smt_width(nr_dies, nr_cores, nr_threads));
> +            ~(0xFFFFFFFFUL << apicid_smt_width(topo_info));
>      topo_ids->core_id =
> -            (apicid >> apicid_core_offset(nr_dies, nr_cores, nr_threads)) &
> -            ~(0xFFFFFFFFUL << apicid_core_width(nr_dies, nr_cores, nr_threads));
> +            (apicid >> apicid_core_offset(topo_info)) &
> +            ~(0xFFFFFFFFUL << apicid_core_width(topo_info));
>      topo_ids->die_id =
> -            (apicid >> apicid_die_offset(nr_dies, nr_cores, nr_threads)) &
> -            ~(0xFFFFFFFFUL << apicid_die_width(nr_dies, nr_cores, nr_threads));
> -    topo_ids->pkg_id = apicid >> apicid_pkg_offset(nr_dies, nr_cores, nr_threads);
> +            (apicid >> apicid_die_offset(topo_info)) &
> +            ~(0xFFFFFFFFUL << apicid_die_width(topo_info));
> +    topo_ids->pkg_id = apicid >> apicid_pkg_offset(topo_info);
>  }
>  
>  /* Make APIC ID for the CPU 'cpu_index'
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 07cf562d89..bc9b491557 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -4551,6 +4551,11 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>      uint32_t die_offset;
>      uint32_t limit;
>      uint32_t signature[3];
> +    X86CPUTopoInfo topo_info;
> +
> +    topo_info.dies_per_pkg = env->nr_dies;
> +    topo_info.cores_per_die = cs->nr_cores;
> +    topo_info.threads_per_core = cs->nr_threads;
>  
>      /* Calculate & apply limits for different index ranges */
>      if (index >= 0xC0000000) {
> @@ -4637,8 +4642,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>                                      eax, ebx, ecx, edx);
>                  break;
>              case 3: /* L3 cache info */
> -                die_offset = apicid_die_offset(env->nr_dies,
> -                                        cs->nr_cores, cs->nr_threads);
> +                die_offset = apicid_die_offset(&topo_info);
>                  if (cpu->enable_l3_cache) {
>                      encode_cache_cpuid4(env->cache_info_cpuid4.l3_cache,
>                                          (1 << die_offset), cs->nr_cores,
> @@ -4729,14 +4733,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>  
>          switch (count) {
>          case 0:
> -            *eax = apicid_core_offset(env->nr_dies,
> -                                      cs->nr_cores, cs->nr_threads);
> +            *eax = apicid_core_offset(&topo_info);
>              *ebx = cs->nr_threads;
>              *ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
>              break;
>          case 1:
> -            *eax = apicid_pkg_offset(env->nr_dies,
> -                                     cs->nr_cores, cs->nr_threads);
> +            *eax = apicid_pkg_offset(&topo_info);
>              *ebx = cs->nr_cores * cs->nr_threads;
>              *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
>              break;
> @@ -4760,20 +4762,17 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>          *edx = cpu->apic_id;
>          switch (count) {
>          case 0:
> -            *eax = apicid_core_offset(env->nr_dies, cs->nr_cores,
> -                                                    cs->nr_threads);
> +            *eax = apicid_core_offset(&topo_info);
>              *ebx = cs->nr_threads;
>              *ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
>              break;
>          case 1:
> -            *eax = apicid_die_offset(env->nr_dies, cs->nr_cores,
> -                                                   cs->nr_threads);
> +            *eax = apicid_die_offset(&topo_info);
>              *ebx = cs->nr_cores * cs->nr_threads;
>              *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
>              break;
>          case 2:
> -            *eax = apicid_pkg_offset(env->nr_dies, cs->nr_cores,
> -                                                   cs->nr_threads);
> +            *eax = apicid_pkg_offset(&topo_info);
>              *ebx = env->nr_dies * cs->nr_cores * cs->nr_threads;
>              *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
>              break;
> 
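
For readers following the refactoring quoted above: the helpers now take a single X86CPUTopoInfo pointer instead of three separate counts. Below is a minimal standalone sketch of how such offsets can be derived, assuming each topology level occupies just enough bits to encode its count. The struct is a stand-in that reuses the field names from the patch, and bitwidth_for_count is a hypothetical helper name, so this is an illustration rather than the code in the QEMU tree.

```c
#include <assert.h>

/* Stand-in for QEMU's X86CPUTopoInfo; field names follow the patch. */
typedef struct X86CPUTopoInfo {
    unsigned dies_per_pkg;
    unsigned cores_per_die;
    unsigned threads_per_core;
} X86CPUTopoInfo;

/* Hypothetical helper: bits needed to encode IDs 0..count-1. */
static unsigned bitwidth_for_count(unsigned count)
{
    unsigned width = 0;

    assert(count >= 1);
    while (width < 31 && (1u << width) < count) {
        width++;
    }
    return width;
}

/* The core ID field starts right above the SMT (thread) ID bits. */
static unsigned apicid_core_offset(const X86CPUTopoInfo *topo_info)
{
    return bitwidth_for_count(topo_info->threads_per_core);
}

/* The die ID field starts above the core ID bits. */
static unsigned apicid_die_offset(const X86CPUTopoInfo *topo_info)
{
    return apicid_core_offset(topo_info) +
           bitwidth_for_count(topo_info->cores_per_die);
}

/* The package ID field starts above the die ID bits. */
static unsigned apicid_pkg_offset(const X86CPUTopoInfo *topo_info)
{
    return apicid_die_offset(topo_info) +
           bitwidth_for_count(topo_info->dies_per_pkg);
}
```

Under these assumptions, 2 threads per core, 4 cores per die and 1 die per package give a core offset of 1, a die offset of 3 and a package offset of 3, since a single die needs no ID bits.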



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v3 04/18] hw/i386: Introduce initialize_topo_info to initialize X86CPUTopoInfo
  2019-12-04  0:37 ` [PATCH v3 04/18] hw/i386: Introduce initialize_topo_info to initialize X86CPUTopoInfo Babu Moger
@ 2020-01-28 15:49   ` Igor Mammedov
  2020-01-28 16:42     ` Babu Moger
  0 siblings, 1 reply; 53+ messages in thread
From: Igor Mammedov @ 2020-01-28 15:49 UTC (permalink / raw)
  To: Babu Moger; +Cc: ehabkost, mst, armbru, qemu-devel, pbonzini, rth

On Tue, 03 Dec 2019 18:37:21 -0600
Babu Moger <babu.moger@amd.com> wrote:

> Initialize all the parameters in one function initialize_topo_info.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
> ---
>  hw/i386/pc.c |   28 +++++++++++++++-------------
>  1 file changed, 15 insertions(+), 13 deletions(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 8c23b1e8c9..cafbdafa76 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -866,6 +866,15 @@ static void handle_a20_line_change(void *opaque, int irq, int level)
>      x86_cpu_set_a20(cpu, level);
>  }
>  
> +static inline void initialize_topo_info(X86CPUTopoInfo *topo_info,
> +                                        PCMachineState *pcms,

maybe use 'const'

> +                                        const MachineState *ms)
'ms' is the same thing as 'pcms', so why pass it around separately?

you can just do
   MachineState *ms = MACHINE(pcms)
inside of function

> +{
> +    topo_info->dies_per_pkg = pcms->smp_dies;
> +    topo_info->cores_per_die = ms->smp.cores;
> +    topo_info->threads_per_core = ms->smp.threads;
> +}
> +
>  /* Calculates initial APIC ID for a specific CPU index
>   *
>   * Currently we need to be able to calculate the APIC ID from the CPU index
> @@ -882,9 +891,7 @@ static uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
>      uint32_t correct_id;
>      static bool warned;
>  
> -    topo_info.dies_per_pkg = pcms->smp_dies;
> -    topo_info.cores_per_die = ms->smp.cores;
> -    topo_info.threads_per_core = ms->smp.threads;
> +    initialize_topo_info(&topo_info, pcms, ms);
>  
>      correct_id = x86_apicid_from_cpu_idx(&topo_info, cpu_index);
>      if (pcmc->compat_apic_id_mode) {
> @@ -2231,9 +2238,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>          return;
>      }
>  
> -    topo_info.dies_per_pkg = pcms->smp_dies;
> -    topo_info.cores_per_die = smp_cores;
> -    topo_info.threads_per_core = smp_threads;
> +    initialize_topo_info(&topo_info, pcms, ms);
>  
>      env->nr_dies = pcms->smp_dies;
>  
> @@ -2702,9 +2707,7 @@ static int64_t pc_get_default_cpu_node_id(const MachineState *ms, int idx)
>     PCMachineState *pcms = PC_MACHINE(ms);
>     X86CPUTopoInfo topo_info;
>  
> -   topo_info.dies_per_pkg = pcms->smp_dies;
> -   topo_info.cores_per_die = ms->smp.cores;
> -   topo_info.threads_per_core = ms->smp.threads;
> +   initialize_topo_info(&topo_info, pcms, ms);
>  
>     assert(idx < ms->possible_cpus->len);
>     x86_topo_ids_from_apicid(ms->possible_cpus->cpus[idx].arch_id,
> @@ -2719,10 +2722,6 @@ static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
>      X86CPUTopoInfo topo_info;
>      int i;
>  
> -    topo_info.dies_per_pkg = pcms->smp_dies;
> -    topo_info.cores_per_die = ms->smp.cores;
> -    topo_info.threads_per_core = ms->smp.threads;
> -
>      if (ms->possible_cpus) {
>          /*
>           * make sure that max_cpus hasn't changed since the first use, i.e.
> @@ -2734,6 +2733,9 @@ static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
>  
>      ms->possible_cpus = g_malloc0(sizeof(CPUArchIdList) +
>                                    sizeof(CPUArchId) * max_cpus);
> +
> +    initialize_topo_info(&topo_info, pcms, ms);
> +
>      ms->possible_cpus->len = max_cpus;
>      for (i = 0; i < ms->possible_cpus->len; i++) {
>          X86CPUTopoIDs topo_ids;
> 




* Re: [PATCH v3 06/18] hw/core: Add core complex id in X86CPU topology
  2019-12-04  0:37 ` [PATCH v3 06/18] hw/core: Add core complex id in X86CPU topology Babu Moger
@ 2020-01-28 16:27   ` Igor Mammedov
  2020-01-28 16:44     ` Babu Moger
  2020-01-28 16:31   ` Eric Blake
  1 sibling, 1 reply; 53+ messages in thread
From: Igor Mammedov @ 2020-01-28 16:27 UTC (permalink / raw)
  To: Babu Moger; +Cc: ehabkost, mst, armbru, qemu-devel, pbonzini, rth

On Tue, 03 Dec 2019 18:37:35 -0600
Babu Moger <babu.moger@amd.com> wrote:

> Introduce last level cache id(llc_id) in x86CPU topology.  This information is
> required to build the topology in EPYC mode.
can you add a reference to spec here so one could look for
detailed information about this?

 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
>  hw/core/machine-hmp-cmds.c |    3 +++
>  hw/core/machine.c          |   13 +++++++++++++
>  hw/i386/pc.c               |   10 ++++++++++
>  include/hw/i386/topology.h |    1 +
>  qapi/machine.json          |    7 +++++--
>  target/i386/cpu.c          |    2 ++
>  target/i386/cpu.h          |    1 +
>  7 files changed, 35 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/core/machine-hmp-cmds.c b/hw/core/machine-hmp-cmds.c
> index cd970cc4c5..59c91d1ce1 100644
> --- a/hw/core/machine-hmp-cmds.c
> +++ b/hw/core/machine-hmp-cmds.c
> @@ -90,6 +90,9 @@ void hmp_hotpluggable_cpus(Monitor *mon, const QDict *qdict)
>          if (c->has_die_id) {
>              monitor_printf(mon, "    die-id: \"%" PRIu64 "\"\n", c->die_id);
>          }
> +        if (c->has_llc_id) {
> +            monitor_printf(mon, "    llc-id: \"%" PRIu64 "\"\n", c->llc_id);
> +        }
>          if (c->has_core_id) {
>              monitor_printf(mon, "    core-id: \"%" PRIu64 "\"\n", c->core_id);
>          }
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index e59b181ead..ff991e6ab5 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -683,6 +683,11 @@ void machine_set_cpu_numa_node(MachineState *machine,
>              return;
>          }
>  
> +        if (props->has_llc_id && !slot->props.has_llc_id) {
> +            error_setg(errp, "llc-id is not supported");
> +            return;
> +        }
> +
>          /* skip slots with explicit mismatch */
>          if (props->has_thread_id && props->thread_id != slot->props.thread_id) {
>                  continue;
> @@ -696,6 +701,10 @@ void machine_set_cpu_numa_node(MachineState *machine,
>                  continue;
>          }
>  
> +        if (props->has_llc_id && props->llc_id != slot->props.llc_id) {
> +                continue;
> +        }
> +
>          if (props->has_socket_id && props->socket_id != slot->props.socket_id) {
>                  continue;
>          }
> @@ -1034,6 +1043,10 @@ static char *cpu_slot_to_string(const CPUArchId *cpu)
>      if (cpu->props.has_die_id) {
>          g_string_append_printf(s, "die-id: %"PRId64, cpu->props.die_id);
>      }
> +
> +    if (cpu->props.has_llc_id) {
> +        g_string_append_printf(s, "llc-id: %"PRId64, cpu->props.llc_id);
> +    }
>      if (cpu->props.has_core_id) {
>          if (s->len) {
>              g_string_append_printf(s, ", ");
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 17de152a77..df5339c102 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -2294,6 +2294,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>  
>          topo_ids.pkg_id = cpu->socket_id;
>          topo_ids.die_id = cpu->die_id;
> +        topo_ids.llc_id = cpu->llc_id;
>          topo_ids.core_id = cpu->core_id;
>          topo_ids.smt_id = cpu->thread_id;
>          cpu->apic_id = apicid_from_topo_ids(&topo_info, &topo_ids);
> @@ -2339,6 +2340,13 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>      }
>      cpu->die_id = topo_ids.die_id;
>  
> +    if (cpu->llc_id != -1 && cpu->llc_id != topo_ids.llc_id) {
> +        error_setg(errp, "property llc-id: %u doesn't match set apic-id:"
> +            " 0x%x (llc-id: %u)", cpu->llc_id, cpu->apic_id, topo_ids.llc_id);
> +        return;
> +    }
> +    cpu->llc_id = topo_ids.llc_id;
> +
>      if (cpu->core_id != -1 && cpu->core_id != topo_ids.core_id) {
>          error_setg(errp, "property core-id: %u doesn't match set apic-id:"
>              " 0x%x (core-id: %u)", cpu->core_id, cpu->apic_id, topo_ids.core_id);
> @@ -2752,6 +2760,8 @@ static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
>              ms->possible_cpus->cpus[i].props.has_die_id = true;
>              ms->possible_cpus->cpus[i].props.die_id = topo_ids.die_id;
>          }
> +        ms->possible_cpus->cpus[i].props.has_llc_id = true;
> +        ms->possible_cpus->cpus[i].props.llc_id = topo_ids.llc_id;
>          ms->possible_cpus->cpus[i].props.has_core_id = true;
>          ms->possible_cpus->cpus[i].props.core_id = topo_ids.core_id;
>          ms->possible_cpus->cpus[i].props.has_thread_id = true;
> diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> index ba52d49079..1238006208 100644
> --- a/include/hw/i386/topology.h
> +++ b/include/hw/i386/topology.h
> @@ -48,6 +48,7 @@ typedef uint32_t apic_id_t;
>  typedef struct X86CPUTopoIDs {
>      unsigned pkg_id;
>      unsigned die_id;
> +    unsigned llc_id;
>      unsigned core_id;
>      unsigned smt_id;
>  } X86CPUTopoIDs;
> diff --git a/qapi/machine.json b/qapi/machine.json
> index ca26779f1a..1ca5b73418 100644
> --- a/qapi/machine.json
> +++ b/qapi/machine.json
> @@ -646,9 +646,11 @@
>  # @node-id: NUMA node ID the CPU belongs to
>  # @socket-id: socket number within node/board the CPU belongs to
>  # @die-id: die number within node/board the CPU belongs to (Since 4.1)
> -# @core-id: core number within die the CPU belongs to# @thread-id: thread number within core the CPU belongs to
> +# @llc-id: last level cache number within node/board the CPU belongs to (Since 4.2)
> +# @core-id: core number within die the CPU belongs to
> +# @thread-id: thread number within core the CPU belongs to
>  #
> -# Note: currently there are 5 properties that could be present
> +# Note: currently there are 6 properties that could be present
>  # but management should be prepared to pass through other
>  # properties with device_add command to allow for future
>  # interface extension. This also requires the filed names to be kept in
> @@ -660,6 +662,7 @@
>    'data': { '*node-id': 'int',
>              '*socket-id': 'int',
>              '*die-id': 'int',
> +            '*llc-id': 'int',
>              '*core-id': 'int',
>              '*thread-id': 'int'
>    }
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index bc9b491557..3c81aa3ecd 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -6222,12 +6222,14 @@ static Property x86_cpu_properties[] = {
>      DEFINE_PROP_INT32("thread-id", X86CPU, thread_id, 0),
>      DEFINE_PROP_INT32("core-id", X86CPU, core_id, 0),
>      DEFINE_PROP_INT32("die-id", X86CPU, die_id, 0),
> +    DEFINE_PROP_INT32("llc-id", X86CPU, llc_id, 0),
>      DEFINE_PROP_INT32("socket-id", X86CPU, socket_id, 0),
>  #else
>      DEFINE_PROP_UINT32("apic-id", X86CPU, apic_id, UNASSIGNED_APIC_ID),
>      DEFINE_PROP_INT32("thread-id", X86CPU, thread_id, -1),
>      DEFINE_PROP_INT32("core-id", X86CPU, core_id, -1),
>      DEFINE_PROP_INT32("die-id", X86CPU, die_id, -1),
> +    DEFINE_PROP_INT32("llc-id", X86CPU, llc_id, -1),
>      DEFINE_PROP_INT32("socket-id", X86CPU, socket_id, -1),
>  #endif
>      DEFINE_PROP_INT32("node-id", X86CPU, node_id, CPU_UNSET_NUMA_NODE_ID),
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index af57fda8e5..a56d44e405 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1711,6 +1711,7 @@ struct X86CPU {
>      int32_t node_id; /* NUMA node this CPU belongs to */
>      int32_t socket_id;
>      int32_t die_id;
> +    int32_t llc_id;
>      int32_t core_id;
>      int32_t thread_id;
>  
> 
> 




* Re: [PATCH v3 07/18] machine: Add a new function init_apicid_fn in MachineClass
  2019-12-04  0:37 ` [PATCH v3 07/18] machine: Add a new function init_apicid_fn in MachineClass Babu Moger
@ 2020-01-28 16:29   ` Igor Mammedov
  2020-01-28 19:45     ` Babu Moger
  0 siblings, 1 reply; 53+ messages in thread
From: Igor Mammedov @ 2020-01-28 16:29 UTC (permalink / raw)
  To: Babu Moger; +Cc: ehabkost, mst, armbru, qemu-devel, pbonzini, rth

On Tue, 03 Dec 2019 18:37:42 -0600
Babu Moger <babu.moger@amd.com> wrote:

> Add a new function init_apicid_fn in MachineClass to initialize the mode
> specific handlers to decode the apic ids.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
>  include/hw/boards.h |    1 +
>  vl.c                |    3 +++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/include/hw/boards.h b/include/hw/boards.h
> index d4fab218e6..ce5aa365cb 100644
> --- a/include/hw/boards.h
> +++ b/include/hw/boards.h
> @@ -238,6 +238,7 @@ struct MachineClass {
>                                                           unsigned cpu_index);
>      const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState *machine);
>      int64_t (*get_default_cpu_node_id)(const MachineState *ms, int idx);
> +    void (*init_apicid_fn)(MachineState *ms);
it's x86 specific, so why wasn't it put into PCMachineClass?


>  };
>  
>  /**
> diff --git a/vl.c b/vl.c
> index a42c24a77f..b6af604e11 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -4318,6 +4318,9 @@ int main(int argc, char **argv, char **envp)
>      current_machine->cpu_type = machine_class->default_cpu_type;
>      if (cpu_option) {
>          current_machine->cpu_type = parse_cpu_option(cpu_option);
> +        if (machine_class->init_apicid_fn) {
> +            machine_class->init_apicid_fn(current_machine);
> +        }
>      }
>      parse_numa_opts(current_machine);
>  
> 
> 




* Re: [PATCH v3 06/18] hw/core: Add core complex id in X86CPU topology
  2019-12-04  0:37 ` [PATCH v3 06/18] hw/core: Add core complex id in X86CPU topology Babu Moger
  2020-01-28 16:27   ` Igor Mammedov
@ 2020-01-28 16:31   ` Eric Blake
  2020-01-28 16:44     ` Babu Moger
  1 sibling, 1 reply; 53+ messages in thread
From: Eric Blake @ 2020-01-28 16:31 UTC (permalink / raw)
  To: Babu Moger, ehabkost, marcel.apfelbaum, mst, pbonzini, rth,
	armbru, imammedo
  Cc: qemu-devel

On 12/3/19 6:37 PM, Babu Moger wrote:
> Introduce last level cache id(llc_id) in x86CPU topology.  This information is
> required to build the topology in EPYC mode.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---

> +++ b/qapi/machine.json
> @@ -646,9 +646,11 @@
>   # @node-id: NUMA node ID the CPU belongs to
>   # @socket-id: socket number within node/board the CPU belongs to
>   # @die-id: die number within node/board the CPU belongs to (Since 4.1)
> -# @core-id: core number within die the CPU belongs to# @thread-id: thread number within core the CPU belongs to
> +# @llc-id: last level cache number within node/board the CPU belongs to (Since 4.2)

s/4.2/5.0/

> +# @core-id: core number within die the CPU belongs to
> +# @thread-id: thread number within core the CPU belongs to
>   #
> -# Note: currently there are 5 properties that could be present
> +# Note: currently there are 6 properties that could be present
>   # but management should be prepared to pass through other
>   # properties with device_add command to allow for future
>   # interface extension. This also requires the filed names to be kept in
> @@ -660,6 +662,7 @@
>     'data': { '*node-id': 'int',
>               '*socket-id': 'int',
>               '*die-id': 'int',
> +            '*llc-id': 'int',
>               '*core-id': 'int',
>               '*thread-id': 'int'
>     }
-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




* Re: [PATCH v3 04/18] hw/i386: Introduce initialize_topo_info to initialize X86CPUTopoInfo
  2020-01-28 15:49   ` Igor Mammedov
@ 2020-01-28 16:42     ` Babu Moger
  0 siblings, 0 replies; 53+ messages in thread
From: Babu Moger @ 2020-01-28 16:42 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: ehabkost, mst, armbru, qemu-devel, pbonzini, rth

Igor,

On 1/28/20 9:49 AM, Igor Mammedov wrote:
> On Tue, 03 Dec 2019 18:37:21 -0600
> Babu Moger <babu.moger@amd.com> wrote:
> 
>> Initialize all the parameters in one function initialize_topo_info.
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
>> ---
>>  hw/i386/pc.c |   28 +++++++++++++++-------------
>>  1 file changed, 15 insertions(+), 13 deletions(-)
>>
>> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
>> index 8c23b1e8c9..cafbdafa76 100644
>> --- a/hw/i386/pc.c
>> +++ b/hw/i386/pc.c
>> @@ -866,6 +866,15 @@ static void handle_a20_line_change(void *opaque, int irq, int level)
>>      x86_cpu_set_a20(cpu, level);
>>  }
>>  
>> +static inline void initialize_topo_info(X86CPUTopoInfo *topo_info,
>> +                                        PCMachineState *pcms,
> 
> maybe use 'const'
> 
>> +                                        const MachineState *ms)
> 'ms' is the same thing as 'pcms', so why pass it around separately?
> 
> you can just do
>    MachineState *ms = MACHINE(pcms)
> inside of function

Yes. We can do that. Thanks
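
A minimal standalone sketch of what the single-argument form could look like. The struct definitions and the MACHINE() macro here are simplified stand-ins (in QEMU, MACHINE() is a checked QOM cast and the real structures carry many more fields), so treat this as an illustration of the suggestion, not the final patch.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the QEMU types; only the fields used here. */
typedef struct MachineState {
    struct {
        unsigned cores;
        unsigned threads;
    } smp;
} MachineState;

typedef struct PCMachineState {
    MachineState parent_obj;   /* assumed to be embedded as the parent */
    unsigned smp_dies;
} PCMachineState;

/* Stand-in for the QOM cast: a plain upcast to the embedded parent. */
#define MACHINE(obj) (&(obj)->parent_obj)

typedef struct X86CPUTopoInfo {
    unsigned dies_per_pkg;
    unsigned cores_per_die;
    unsigned threads_per_core;
} X86CPUTopoInfo;

/* Take only pcms and derive ms inside, as suggested in the review. */
static inline void initialize_topo_info(X86CPUTopoInfo *topo_info,
                                        const PCMachineState *pcms)
{
    const MachineState *ms = MACHINE(pcms);

    assert(topo_info != NULL);
    topo_info->dies_per_pkg = pcms->smp_dies;
    topo_info->cores_per_die = ms->smp.cores;
    topo_info->threads_per_core = ms->smp.threads;
}
```

Callers would then pass just &topo_info and pcms. Whether the real MACHINE() cast accepts a const pointer is not checked here, so the const qualifier may need to be dropped in the actual code.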

> 
>> +{
>> +    topo_info->dies_per_pkg = pcms->smp_dies;
>> +    topo_info->cores_per_die = ms->smp.cores;
>> +    topo_info->threads_per_core = ms->smp.threads;
>> +}
>> +
>>  /* Calculates initial APIC ID for a specific CPU index
>>   *
>>   * Currently we need to be able to calculate the APIC ID from the CPU index
>> @@ -882,9 +891,7 @@ static uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
>>      uint32_t correct_id;
>>      static bool warned;
>>  
>> -    topo_info.dies_per_pkg = pcms->smp_dies;
>> -    topo_info.cores_per_die = ms->smp.cores;
>> -    topo_info.threads_per_core = ms->smp.threads;
>> +    initialize_topo_info(&topo_info, pcms, ms);
>>  
>>      correct_id = x86_apicid_from_cpu_idx(&topo_info, cpu_index);
>>      if (pcmc->compat_apic_id_mode) {
>> @@ -2231,9 +2238,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>>          return;
>>      }
>>  
>> -    topo_info.dies_per_pkg = pcms->smp_dies;
>> -    topo_info.cores_per_die = smp_cores;
>> -    topo_info.threads_per_core = smp_threads;
>> +    initialize_topo_info(&topo_info, pcms, ms);
>>  
>>      env->nr_dies = pcms->smp_dies;
>>  
>> @@ -2702,9 +2707,7 @@ static int64_t pc_get_default_cpu_node_id(const MachineState *ms, int idx)
>>     PCMachineState *pcms = PC_MACHINE(ms);
>>     X86CPUTopoInfo topo_info;
>>  
>> -   topo_info.dies_per_pkg = pcms->smp_dies;
>> -   topo_info.cores_per_die = ms->smp.cores;
>> -   topo_info.threads_per_core = ms->smp.threads;
>> +   initialize_topo_info(&topo_info, pcms, ms);
>>  
>>     assert(idx < ms->possible_cpus->len);
>>     x86_topo_ids_from_apicid(ms->possible_cpus->cpus[idx].arch_id,
>> @@ -2719,10 +2722,6 @@ static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
>>      X86CPUTopoInfo topo_info;
>>      int i;
>>  
>> -    topo_info.dies_per_pkg = pcms->smp_dies;
>> -    topo_info.cores_per_die = ms->smp.cores;
>> -    topo_info.threads_per_core = ms->smp.threads;
>> -
>>      if (ms->possible_cpus) {
>>          /*
>>           * make sure that max_cpus hasn't changed since the first use, i.e.
>> @@ -2734,6 +2733,9 @@ static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
>>  
>>      ms->possible_cpus = g_malloc0(sizeof(CPUArchIdList) +
>>                                    sizeof(CPUArchId) * max_cpus);
>> +
>> +    initialize_topo_info(&topo_info, pcms, ms);
>> +
>>      ms->possible_cpus->len = max_cpus;
>>      for (i = 0; i < ms->possible_cpus->len; i++) {
>>          X86CPUTopoIDs topo_ids;
>>
> 



* Re: [PATCH v3 06/18] hw/core: Add core complex id in X86CPU topology
  2020-01-28 16:27   ` Igor Mammedov
@ 2020-01-28 16:44     ` Babu Moger
  0 siblings, 0 replies; 53+ messages in thread
From: Babu Moger @ 2020-01-28 16:44 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: ehabkost, mst, armbru, qemu-devel, pbonzini, rth



On 1/28/20 10:27 AM, Igor Mammedov wrote:
> On Tue, 03 Dec 2019 18:37:35 -0600
> Babu Moger <babu.moger@amd.com> wrote:
> 
>> Introduce last level cache id(llc_id) in x86CPU topology.  This information is
>> required to build the topology in EPYC mode.
> can you add a reference to spec here so one could look for
> detailed information about this?

Yes. Will add it in the next series.
> 
>  
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>>  hw/core/machine-hmp-cmds.c |    3 +++
>>  hw/core/machine.c          |   13 +++++++++++++
>>  hw/i386/pc.c               |   10 ++++++++++
>>  include/hw/i386/topology.h |    1 +
>>  qapi/machine.json          |    7 +++++--
>>  target/i386/cpu.c          |    2 ++
>>  target/i386/cpu.h          |    1 +
>>  7 files changed, 35 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/core/machine-hmp-cmds.c b/hw/core/machine-hmp-cmds.c
>> index cd970cc4c5..59c91d1ce1 100644
>> --- a/hw/core/machine-hmp-cmds.c
>> +++ b/hw/core/machine-hmp-cmds.c
>> @@ -90,6 +90,9 @@ void hmp_hotpluggable_cpus(Monitor *mon, const QDict *qdict)
>>          if (c->has_die_id) {
>>              monitor_printf(mon, "    die-id: \"%" PRIu64 "\"\n", c->die_id);
>>          }
>> +        if (c->has_llc_id) {
>> +            monitor_printf(mon, "    llc-id: \"%" PRIu64 "\"\n", c->llc_id);
>> +        }
>>          if (c->has_core_id) {
>>              monitor_printf(mon, "    core-id: \"%" PRIu64 "\"\n", c->core_id);
>>          }
>> diff --git a/hw/core/machine.c b/hw/core/machine.c
>> index e59b181ead..ff991e6ab5 100644
>> --- a/hw/core/machine.c
>> +++ b/hw/core/machine.c
>> @@ -683,6 +683,11 @@ void machine_set_cpu_numa_node(MachineState *machine,
>>              return;
>>          }
>>  
>> +        if (props->has_llc_id && !slot->props.has_llc_id) {
>> +            error_setg(errp, "llc-id is not supported");
>> +            return;
>> +        }
>> +
>>          /* skip slots with explicit mismatch */
>>          if (props->has_thread_id && props->thread_id != slot->props.thread_id) {
>>                  continue;
>> @@ -696,6 +701,10 @@ void machine_set_cpu_numa_node(MachineState *machine,
>>                  continue;
>>          }
>>  
>> +        if (props->has_llc_id && props->llc_id != slot->props.llc_id) {
>> +                continue;
>> +        }
>> +
>>          if (props->has_socket_id && props->socket_id != slot->props.socket_id) {
>>                  continue;
>>          }
>> @@ -1034,6 +1043,10 @@ static char *cpu_slot_to_string(const CPUArchId *cpu)
>>      if (cpu->props.has_die_id) {
>>          g_string_append_printf(s, "die-id: %"PRId64, cpu->props.die_id);
>>      }
>> +
>> +    if (cpu->props.has_llc_id) {
>> +        g_string_append_printf(s, "llc-id: %"PRId64, cpu->props.llc_id);
>> +    }
>>      if (cpu->props.has_core_id) {
>>          if (s->len) {
>>              g_string_append_printf(s, ", ");
>> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
>> index 17de152a77..df5339c102 100644
>> --- a/hw/i386/pc.c
>> +++ b/hw/i386/pc.c
>> @@ -2294,6 +2294,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>>  
>>          topo_ids.pkg_id = cpu->socket_id;
>>          topo_ids.die_id = cpu->die_id;
>> +        topo_ids.llc_id = cpu->llc_id;
>>          topo_ids.core_id = cpu->core_id;
>>          topo_ids.smt_id = cpu->thread_id;
>>          cpu->apic_id = apicid_from_topo_ids(&topo_info, &topo_ids);
>> @@ -2339,6 +2340,13 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>>      }
>>      cpu->die_id = topo_ids.die_id;
>>  
>> +    if (cpu->llc_id != -1 && cpu->llc_id != topo_ids.llc_id) {
>> +        error_setg(errp, "property llc-id: %u doesn't match set apic-id:"
>> +            " 0x%x (llc-id: %u)", cpu->llc_id, cpu->apic_id, topo_ids.llc_id);
>> +        return;
>> +    }
>> +    cpu->llc_id = topo_ids.llc_id;
>> +
>>      if (cpu->core_id != -1 && cpu->core_id != topo_ids.core_id) {
>>          error_setg(errp, "property core-id: %u doesn't match set apic-id:"
>>              " 0x%x (core-id: %u)", cpu->core_id, cpu->apic_id, topo_ids.core_id);
>> @@ -2752,6 +2760,8 @@ static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
>>              ms->possible_cpus->cpus[i].props.has_die_id = true;
>>              ms->possible_cpus->cpus[i].props.die_id = topo_ids.die_id;
>>          }
>> +        ms->possible_cpus->cpus[i].props.has_llc_id = true;
>> +        ms->possible_cpus->cpus[i].props.llc_id = topo_ids.llc_id;
>>          ms->possible_cpus->cpus[i].props.has_core_id = true;
>>          ms->possible_cpus->cpus[i].props.core_id = topo_ids.core_id;
>>          ms->possible_cpus->cpus[i].props.has_thread_id = true;
>> diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
>> index ba52d49079..1238006208 100644
>> --- a/include/hw/i386/topology.h
>> +++ b/include/hw/i386/topology.h
>> @@ -48,6 +48,7 @@ typedef uint32_t apic_id_t;
>>  typedef struct X86CPUTopoIDs {
>>      unsigned pkg_id;
>>      unsigned die_id;
>> +    unsigned llc_id;
>>      unsigned core_id;
>>      unsigned smt_id;
>>  } X86CPUTopoIDs;
>> diff --git a/qapi/machine.json b/qapi/machine.json
>> index ca26779f1a..1ca5b73418 100644
>> --- a/qapi/machine.json
>> +++ b/qapi/machine.json
>> @@ -646,9 +646,11 @@
>>  # @node-id: NUMA node ID the CPU belongs to
>>  # @socket-id: socket number within node/board the CPU belongs to
>>  # @die-id: die number within node/board the CPU belongs to (Since 4.1)
>> -# @core-id: core number within die the CPU belongs to# @thread-id: thread number within core the CPU belongs to
>> +# @llc-id: last level cache number within node/board the CPU belongs to (Since 4.2)
>> +# @core-id: core number within die the CPU belongs to
>> +# @thread-id: thread number within core the CPU belongs to
>>  #
>> -# Note: currently there are 5 properties that could be present
>> +# Note: currently there are 6 properties that could be present
>>  # but management should be prepared to pass through other
>>  # properties with device_add command to allow for future
>>  # interface extension. This also requires the filed names to be kept in
>> @@ -660,6 +662,7 @@
>>    'data': { '*node-id': 'int',
>>              '*socket-id': 'int',
>>              '*die-id': 'int',
>> +            '*llc-id': 'int',
>>              '*core-id': 'int',
>>              '*thread-id': 'int'
>>    }
>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>> index bc9b491557..3c81aa3ecd 100644
>> --- a/target/i386/cpu.c
>> +++ b/target/i386/cpu.c
>> @@ -6222,12 +6222,14 @@ static Property x86_cpu_properties[] = {
>>      DEFINE_PROP_INT32("thread-id", X86CPU, thread_id, 0),
>>      DEFINE_PROP_INT32("core-id", X86CPU, core_id, 0),
>>      DEFINE_PROP_INT32("die-id", X86CPU, die_id, 0),
>> +    DEFINE_PROP_INT32("llc-id", X86CPU, llc_id, 0),
>>      DEFINE_PROP_INT32("socket-id", X86CPU, socket_id, 0),
>>  #else
>>      DEFINE_PROP_UINT32("apic-id", X86CPU, apic_id, UNASSIGNED_APIC_ID),
>>      DEFINE_PROP_INT32("thread-id", X86CPU, thread_id, -1),
>>      DEFINE_PROP_INT32("core-id", X86CPU, core_id, -1),
>>      DEFINE_PROP_INT32("die-id", X86CPU, die_id, -1),
>> +    DEFINE_PROP_INT32("llc-id", X86CPU, llc_id, -1),
>>      DEFINE_PROP_INT32("socket-id", X86CPU, socket_id, -1),
>>  #endif
>>      DEFINE_PROP_INT32("node-id", X86CPU, node_id, CPU_UNSET_NUMA_NODE_ID),
>> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
>> index af57fda8e5..a56d44e405 100644
>> --- a/target/i386/cpu.h
>> +++ b/target/i386/cpu.h
>> @@ -1711,6 +1711,7 @@ struct X86CPU {
>>      int32_t node_id; /* NUMA node this CPU belongs to */
>>      int32_t socket_id;
>>      int32_t die_id;
>> +    int32_t llc_id;
>>      int32_t core_id;
>>      int32_t thread_id;
>>  
>>
>>
> 



* Re: [PATCH v3 06/18] hw/core: Add core complex id in X86CPU topology
  2020-01-28 16:31   ` Eric Blake
@ 2020-01-28 16:44     ` Babu Moger
  0 siblings, 0 replies; 53+ messages in thread
From: Babu Moger @ 2020-01-28 16:44 UTC (permalink / raw)
  To: Eric Blake, ehabkost, marcel.apfelbaum, mst, pbonzini, rth,
	armbru, imammedo
  Cc: qemu-devel



On 1/28/20 10:31 AM, Eric Blake wrote:
> On 12/3/19 6:37 PM, Babu Moger wrote:
>> Introduce last level cache id(llc_id) in x86CPU topology.  This information is
>> required to build the topology in EPYC mode.
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
> 
>> +++ b/qapi/machine.json
>> @@ -646,9 +646,11 @@
>>   # @node-id: NUMA node ID the CPU belongs to
>>   # @socket-id: socket number within node/board the CPU belongs to
>>   # @die-id: die number within node/board the CPU belongs to (Since 4.1)
>> -# @core-id: core number within die the CPU belongs to# @thread-id: thread number within core the CPU belongs to
>> +# @llc-id: last level cache number within node/board the CPU belongs to (Since 4.2)
> 
> s/4.2/5.0/

Sure. Will change it. Thanks

> 
>> +# @core-id: core number within die the CPU belongs to
>> +# @thread-id: thread number within core the CPU belongs to
>>   #
>> -# Note: currently there are 5 properties that could be present
>> +# Note: currently there are 6 properties that could be present
>>   # but management should be prepared to pass through other
>>   # properties with device_add command to allow for future
>>   # interface extension. This also requires the filed names to be kept in
>> @@ -660,6 +662,7 @@
>>     'data': { '*node-id': 'int',
>>               '*socket-id': 'int',
>>               '*die-id': 'int',
>> +            '*llc-id': 'int',
>>               '*core-id': 'int',
>>               '*thread-id': 'int'
>>     }



* Re: [PATCH v3 07/18] machine: Add a new function init_apicid_fn in MachineClass
  2020-01-28 16:29   ` Igor Mammedov
@ 2020-01-28 19:45     ` Babu Moger
  2020-01-28 20:12       ` Eduardo Habkost
  2020-01-29  9:14       ` Igor Mammedov
  0 siblings, 2 replies; 53+ messages in thread
From: Babu Moger @ 2020-01-28 19:45 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: ehabkost, mst, armbru, qemu-devel, pbonzini, rth



On 1/28/20 10:29 AM, Igor Mammedov wrote:
> On Tue, 03 Dec 2019 18:37:42 -0600
> Babu Moger <babu.moger@amd.com> wrote:
> 
>> Add a new function init_apicid_fn in MachineClass to initialize the mode
>> specific handlers to decode the apic ids.
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>>  include/hw/boards.h |    1 +
>>  vl.c                |    3 +++
>>  2 files changed, 4 insertions(+)
>>
>> diff --git a/include/hw/boards.h b/include/hw/boards.h
>> index d4fab218e6..ce5aa365cb 100644
>> --- a/include/hw/boards.h
>> +++ b/include/hw/boards.h
>> @@ -238,6 +238,7 @@ struct MachineClass {
>>                                                           unsigned cpu_index);
>>      const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState *machine);
>>      int64_t (*get_default_cpu_node_id)(const MachineState *ms, int idx);
>> +    void (*init_apicid_fn)(MachineState *ms);
> it's x86 specific, so why it wasn put into PCMachineClass?

Yes. It is x86 specific for now. I tried to make it a generic function so
other architectures can use it if required (like we have done with
possible_cpu_arch_ids). It initializes the functions required to build the
apicid for each CPU. We need these functions much earlier in the
initialization. They should be initialized before parse_numa_opts or
machine_run_board_init (in vl.c), which are called from generic context. We
cannot use PCMachineClass at this time.

> 
> 
>>  };
>>  
>>  /**
>> diff --git a/vl.c b/vl.c
>> index a42c24a77f..b6af604e11 100644
>> --- a/vl.c
>> +++ b/vl.c
>> @@ -4318,6 +4318,9 @@ int main(int argc, char **argv, char **envp)
>>      current_machine->cpu_type = machine_class->default_cpu_type;
>>      if (cpu_option) {
>>          current_machine->cpu_type = parse_cpu_option(cpu_option);
>> +        if (machine_class->init_apicid_fn) {
>> +            machine_class->init_apicid_fn(current_machine);
>> +        }
>>      }
>>      parse_numa_opts(current_machine);
>>  
>>
>>
> 



* Re: [PATCH v3 16/18] hw/i386: Introduce EPYC mode function handlers
  2019-12-04  0:38 ` [PATCH v3 16/18] hw/i386: Introduce EPYC mode function handlers Babu Moger
@ 2020-01-28 20:04   ` Eduardo Habkost
  2020-01-28 21:48     ` Babu Moger
  0 siblings, 1 reply; 53+ messages in thread
From: Eduardo Habkost @ 2020-01-28 20:04 UTC (permalink / raw)
  To: Babu Moger; +Cc: mst, armbru, qemu-devel, imammedo, pbonzini, rth

Hi,

Sorry for taking so long.  I was away from the office for a
month, and now I'm finally back.

On Tue, Dec 03, 2019 at 06:38:46PM -0600, Babu Moger wrote:
> Introduce following handlers for new epyc mode.
> x86_apicid_from_cpu_idx_epyc: Generate apicid from cpu index.
> x86_topo_ids_from_apicid_epyc: Generate topo ids from apic id.
> x86_apicid_from_topo_ids_epyc: Generate apicid from topo ids.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
>  hw/i386/pc.c               |   12 ++++++++++++
>  include/hw/i386/topology.h |    4 ++--
>  2 files changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index e6c8a458e7..64e3658873 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -2819,6 +2819,17 @@ static bool pc_hotplug_allowed(MachineState *ms, DeviceState *dev, Error **errp)
>      return true;
>  }
>  
> +static void pc_init_apicid_fn(MachineState *ms)
> +{
> +    PCMachineState *pcms = PC_MACHINE(ms);
> +
> +    if (!strncmp(ms->cpu_type, "EPYC", 4)) {

Please never use string comparison to introduce device-specific
behavior.  I had already pointed this out at
https://lore.kernel.org/qemu-devel/20190801192830.GD20035@habkost.net/

If you need a CPU model to provide special behavior,
you have two options:

* Add a method pointer to X86CPUClass and/or X86CPUDefinition
* Add a QOM property to enable/disable special behavior, and
  include the property in the CPU model definition.

The second option might be preferable long term, but might
require more work because the property would become visible in
query-cpu-model-expansion and in the command line.  The first
option may be acceptable to avoid extra user-visible complexity
in the first version.
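
To illustrate the first option, here is a minimal standalone sketch. It is
not QEMU code: DemoCPUClass, demo_class_by_name and the two encoders are
hypothetical stand-ins for X86CPUClass, a class registry lookup and the real
topology helpers. The point is only that the special behavior is selected by
a method pointer stored with the CPU model, so no caller ever tests the
model name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a per-CPU-model class; not the real X86CPUClass. */
typedef struct DemoCPUClass {
    const char *name;
    /* Per-model hook: build an APIC ID from a flat CPU index. */
    uint32_t (*apicid_from_cpu_idx)(unsigned cpu_index);
} DemoCPUClass;

/* Default encoding for the demo: the APIC ID equals the CPU index. */
static uint32_t demo_apicid_default(unsigned cpu_index)
{
    return cpu_index;
}

/* Illustrative alternative encoding; NOT the real EPYC layout. */
static uint32_t demo_apicid_alt(unsigned cpu_index)
{
    return (uint32_t)cpu_index << 1;
}

/* Each model definition carries its own hook. */
static const DemoCPUClass demo_classes[] = {
    { "demo-generic", demo_apicid_default },
    { "demo-epyc",    demo_apicid_alt },
};

/* Exact-name lookup, playing the role of a class registry. */
static const DemoCPUClass *demo_class_by_name(const char *name)
{
    for (size_t i = 0; i < sizeof(demo_classes) / sizeof(demo_classes[0]); i++) {
        if (strcmp(demo_classes[i].name, name) == 0) {
            return &demo_classes[i];
        }
    }
    return NULL;
}

/* Callers dispatch through the hook and never inspect the model name. */
static uint32_t demo_apicid_for(const DemoCPUClass *cc, unsigned cpu_index)
{
    assert(cc != NULL && cc->apicid_from_cpu_idx != NULL);
    return cc->apicid_from_cpu_idx(cpu_index);
}
```

The only string comparison left is the registry lookup itself. Adding
another model with special behavior means adding a table entry, not another
strncmp() in machine code.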



> +        pcms->apicid_from_cpu_idx = x86_apicid_from_cpu_idx_epyc;
> +        pcms->topo_ids_from_apicid = x86_topo_ids_from_apicid_epyc;
> +        pcms->apicid_from_topo_ids = x86_apicid_from_topo_ids_epyc;

Why do you need to override the function pointers in
PCMachineState instead of just looking up the relevant info at
X86CPUClass?

If both machine-types and CPU models are supposed to override the
APIC ID calculation functions, the interaction between
machine-type and CPU model needs to be better documented
(preferably with simple test cases) to ensure we won't break
compatibility later.

> +    }
> +}
> +
>  static void pc_machine_class_init(ObjectClass *oc, void *data)
>  {
>      MachineClass *mc = MACHINE_CLASS(oc);
> @@ -2847,6 +2858,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
>      mc->cpu_index_to_instance_props = pc_cpu_index_to_props;
>      mc->get_default_cpu_node_id = pc_get_default_cpu_node_id;
>      mc->possible_cpu_arch_ids = pc_possible_cpu_arch_ids;
> +    mc->init_apicid_fn = pc_init_apicid_fn;
>      mc->auto_enable_numa_with_memhp = true;
>      mc->has_hotpluggable_cpus = true;
>      mc->default_boot_order = "cad";
> diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> index b2b9e93a06..f028d2332a 100644
> --- a/include/hw/i386/topology.h
> +++ b/include/hw/i386/topology.h
> @@ -140,7 +140,7 @@ static inline unsigned apicid_pkg_offset_epyc(X86CPUTopoInfo *topo_info)
>   *
>   * The caller must make sure core_id < nr_cores and smt_id < nr_threads.
>   */
> -static inline apic_id_t apicid_from_topo_ids_epyc(X86CPUTopoInfo *topo_info,
> +static inline apic_id_t x86_apicid_from_topo_ids_epyc(X86CPUTopoInfo *topo_info,
>                                                    const X86CPUTopoIDs *topo_ids)
>  {
>      return (topo_ids->pkg_id  << apicid_pkg_offset_epyc(topo_info)) |
> @@ -200,7 +200,7 @@ static inline apic_id_t x86_apicid_from_cpu_idx_epyc(X86CPUTopoInfo *topo_info,
>  {
>      X86CPUTopoIDs topo_ids;
>      x86_topo_ids_from_idx_epyc(topo_info, cpu_index, &topo_ids);
> -    return apicid_from_topo_ids_epyc(topo_info, &topo_ids);
> +    return x86_apicid_from_topo_ids_epyc(topo_info, &topo_ids);
>  }
>  /* Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
>   *
> 
> 

-- 
Eduardo




* Re: [PATCH v3 07/18] machine: Add a new function init_apicid_fn in MachineClass
  2020-01-28 19:45     ` Babu Moger
@ 2020-01-28 20:12       ` Eduardo Habkost
  2020-01-29  9:14       ` Igor Mammedov
  1 sibling, 0 replies; 53+ messages in thread
From: Eduardo Habkost @ 2020-01-28 20:12 UTC (permalink / raw)
  To: Babu Moger; +Cc: mst, armbru, qemu-devel, pbonzini, Igor Mammedov, rth

On Tue, Jan 28, 2020 at 01:45:31PM -0600, Babu Moger wrote:
> 
> 
> On 1/28/20 10:29 AM, Igor Mammedov wrote:
> > On Tue, 03 Dec 2019 18:37:42 -0600
> > Babu Moger <babu.moger@amd.com> wrote:
> > 
> >> Add a new function init_apicid_fn in MachineClass to initialize the mode
> >> specific handlers to decode the apic ids.
> >>
> >> Signed-off-by: Babu Moger <babu.moger@amd.com>
> >> ---
> >>  include/hw/boards.h |    1 +
> >>  vl.c                |    3 +++
> >>  2 files changed, 4 insertions(+)
> >>
> >> diff --git a/include/hw/boards.h b/include/hw/boards.h
> >> index d4fab218e6..ce5aa365cb 100644
> >> --- a/include/hw/boards.h
> >> +++ b/include/hw/boards.h
> >> @@ -238,6 +238,7 @@ struct MachineClass {
> >>                                                           unsigned cpu_index);
> >>      const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState *machine);
> >>      int64_t (*get_default_cpu_node_id)(const MachineState *ms, int idx);
> >> +    void (*init_apicid_fn)(MachineState *ms);
> > it's x86 specific, so why it wasn put into PCMachineClass?
> 
> Yes. It is x86 specific for now. I tried to make it generic function so
> other OSes can use it if required(like we have done in
> possible_cpu_arch_ids). It initializes functions required to build the
> apicid for each CPUs. We need these functions much early in the
> initialization. It should be initialized before parse_numa_opts or
> machine_run_board_init(in v1.c) which are called from generic context. We
> cannot use PCMachineClass at this time.

Even if the only user of the new hook will be x86, you are
introducing a generic API, so an x86-specific name doesn't seem
appropriate.

I suggest using a generic name and documenting its rules and
intended usage explicitly.  Something like "pre_init" might be
good enough, as long as the rules are documented clearly (e.g. it
will be called before NUMA initialization, but after CPU model
lookup).
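
As a rough standalone sketch of what such a documented contract could look
like (all names here, DemoMachineClass, pre_init, demo_startup, are
hypothetical and not taken from QEMU), the hook below records the order in
which it runs relative to a stand-in for NUMA parsing. The sketch assumes the
hook is called unconditionally once the CPU type is known.

```c
#include <assert.h>
#include <stddef.h>

typedef struct DemoMachine DemoMachine;

typedef struct DemoMachineClass {
    /*
     * Optional hook. Contract assumed for this sketch: called after
     * cpu_type has been resolved and before NUMA options are parsed.
     */
    void (*pre_init)(DemoMachine *m);
} DemoMachineClass;

struct DemoMachine {
    const DemoMachineClass *mc;
    const char *cpu_type;
    int step;           /* incremented as each start-up phase runs */
    int pre_init_step;  /* phase number at which pre_init ran, 0 if never */
    int numa_step;      /* phase number at which NUMA parsing ran */
};

/* Example hook: relies on the contract that cpu_type is already set. */
static void demo_record_pre_init(DemoMachine *m)
{
    assert(m->cpu_type != NULL);
    m->pre_init_step = ++m->step;
}

/* Stand-in for NUMA option parsing. */
static void demo_parse_numa(DemoMachine *m)
{
    m->numa_step = ++m->step;
}

/* Generic start-up path: CPU model lookup, then pre_init, then NUMA. */
static void demo_startup(DemoMachine *m, const char *cpu_type)
{
    m->cpu_type = cpu_type;
    if (m->mc->pre_init) {
        m->mc->pre_init(m);
    }
    demo_parse_numa(m);
}
```

Writing the ordering rule next to the hook declaration lets an
implementation assert on it, as demo_record_pre_init() does, instead of
depending on call-site details.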

However, I believe we can implement the same functionality
without a new generic initialization hook.  See my reply to patch
16/18.

> 
> > 
> > 
> >>  };
> >>  
> >>  /**
> >> diff --git a/vl.c b/vl.c
> >> index a42c24a77f..b6af604e11 100644
> >> --- a/vl.c
> >> +++ b/vl.c
> >> @@ -4318,6 +4318,9 @@ int main(int argc, char **argv, char **envp)
> >>      current_machine->cpu_type = machine_class->default_cpu_type;
> >>      if (cpu_option) {
> >>          current_machine->cpu_type = parse_cpu_option(cpu_option);
> >> +        if (machine_class->init_apicid_fn) {
> >> +            machine_class->init_apicid_fn(current_machine);
> >> +        }
> >>      }
> >>      parse_numa_opts(current_machine);
> >>  
> >>
> >>
> > 
> 

-- 
Eduardo




* Re: [PATCH v3 16/18] hw/i386: Introduce EPYC mode function handlers
  2020-01-28 20:04   ` Eduardo Habkost
@ 2020-01-28 21:48     ` Babu Moger
  2020-01-29 16:41       ` Eduardo Habkost
  0 siblings, 1 reply; 53+ messages in thread
From: Babu Moger @ 2020-01-28 21:48 UTC (permalink / raw)
  To: Eduardo Habkost; +Cc: mst, armbru, qemu-devel, imammedo, pbonzini, rth



On 1/28/20 2:04 PM, Eduardo Habkost wrote:
> Hi,
> 
> Sorry for taking so long.  I was away from the office for a
> month, and now I'm finally back.

no worries.

> 
> On Tue, Dec 03, 2019 at 06:38:46PM -0600, Babu Moger wrote:
>> Introduce following handlers for new epyc mode.
>> x86_apicid_from_cpu_idx_epyc: Generate apicid from cpu index.
>> x86_topo_ids_from_apicid_epyc: Generate topo ids from apic id.
>> x86_apicid_from_topo_ids_epyc: Generate apicid from topo ids.
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>>  hw/i386/pc.c               |   12 ++++++++++++
>>  include/hw/i386/topology.h |    4 ++--
>>  2 files changed, 14 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
>> index e6c8a458e7..64e3658873 100644
>> --- a/hw/i386/pc.c
>> +++ b/hw/i386/pc.c
>> @@ -2819,6 +2819,17 @@ static bool pc_hotplug_allowed(MachineState *ms, DeviceState *dev, Error **errp)
>>      return true;
>>  }
>>  
>> +static void pc_init_apicid_fn(MachineState *ms)
>> +{
>> +    PCMachineState *pcms = PC_MACHINE(ms);
>> +
>> +    if (!strncmp(ms->cpu_type, "EPYC", 4)) {
> 
> Please never use string comparison to introduce device-specific
> behavior.  I had already pointed this out at

Yes, you did mention this before. I was not sure how to achieve this without
comparing the model string.

> 
> If you need a CPU model to provide special behavior,
> you have two options:
> 
> * Add a method pointer to X86CPUClass and/or X86CPUDefinition
> * Add a QOM property to enable/disable special behavior, and
>   include the property in the CPU model definition.
> 
> The second option might be preferable long term, but might
> require more work because the property would become visible in
> query-cpu-model-expansion and in the command line.  The first
> option may be acceptable to avoid extra user-visible complexity
> in the first version.

Yes. We need to have special behavior for a specific model.
I will look at both of the above approaches closely. The challenge is that
this needs to be done much earlier in the initialization (before
parse_numa_opts or machine_run_board_init). Will research more on this.

> 
> 
> 
>> +        pcms->apicid_from_cpu_idx = x86_apicid_from_cpu_idx_epyc;
>> +        pcms->topo_ids_from_apicid = x86_topo_ids_from_apicid_epyc;
>> +        pcms->apicid_from_topo_ids = x86_apicid_from_topo_ids_epyc;
> 
> Why do you need to override the function pointers in
> PCMachineState instead of just looking up the relevant info at
> X86CPUClass?
> 
> If both machine-types and CPU models are supposed to override the
> APIC ID calculation functions, the interaction between
> machine-type and CPU model needs to be better documented
> (preferably with simple test cases) to ensure we won't break
> compatibility later.
> 
>> +    }
>> +}
>> +
>>  static void pc_machine_class_init(ObjectClass *oc, void *data)
>>  {
>>      MachineClass *mc = MACHINE_CLASS(oc);
>> @@ -2847,6 +2858,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
>>      mc->cpu_index_to_instance_props = pc_cpu_index_to_props;
>>      mc->get_default_cpu_node_id = pc_get_default_cpu_node_id;
>>      mc->possible_cpu_arch_ids = pc_possible_cpu_arch_ids;
>> +    mc->init_apicid_fn = pc_init_apicid_fn;
>>      mc->auto_enable_numa_with_memhp = true;
>>      mc->has_hotpluggable_cpus = true;
>>      mc->default_boot_order = "cad";
>> diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
>> index b2b9e93a06..f028d2332a 100644
>> --- a/include/hw/i386/topology.h
>> +++ b/include/hw/i386/topology.h
>> @@ -140,7 +140,7 @@ static inline unsigned apicid_pkg_offset_epyc(X86CPUTopoInfo *topo_info)
>>   *
>>   * The caller must make sure core_id < nr_cores and smt_id < nr_threads.
>>   */
>> -static inline apic_id_t apicid_from_topo_ids_epyc(X86CPUTopoInfo *topo_info,
>> +static inline apic_id_t x86_apicid_from_topo_ids_epyc(X86CPUTopoInfo *topo_info,
>>                                                    const X86CPUTopoIDs *topo_ids)
>>  {
>>      return (topo_ids->pkg_id  << apicid_pkg_offset_epyc(topo_info)) |
>> @@ -200,7 +200,7 @@ static inline apic_id_t x86_apicid_from_cpu_idx_epyc(X86CPUTopoInfo *topo_info,
>>  {
>>      X86CPUTopoIDs topo_ids;
>>      x86_topo_ids_from_idx_epyc(topo_info, cpu_index, &topo_ids);
>> -    return apicid_from_topo_ids_epyc(topo_info, &topo_ids);
>> +    return x86_apicid_from_topo_ids_epyc(topo_info, &topo_ids);
>>  }
>>  /* Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
>>   *
>>
>>
> 



* Re: [PATCH v3 07/18] machine: Add a new function init_apicid_fn in MachineClass
  2020-01-28 19:45     ` Babu Moger
  2020-01-28 20:12       ` Eduardo Habkost
@ 2020-01-29  9:14       ` Igor Mammedov
  2020-01-29 16:17         ` Babu Moger
  2020-01-29 16:32         ` Babu Moger
  1 sibling, 2 replies; 53+ messages in thread
From: Igor Mammedov @ 2020-01-29  9:14 UTC (permalink / raw)
  To: Babu Moger; +Cc: ehabkost, mst, armbru, qemu-devel, pbonzini, rth

On Tue, 28 Jan 2020 13:45:31 -0600
Babu Moger <babu.moger@amd.com> wrote:

> On 1/28/20 10:29 AM, Igor Mammedov wrote:
> > On Tue, 03 Dec 2019 18:37:42 -0600
> > Babu Moger <babu.moger@amd.com> wrote:
> >   
> >> Add a new function init_apicid_fn in MachineClass to initialize the mode
> >> specific handlers to decode the apic ids.
> >>
> >> Signed-off-by: Babu Moger <babu.moger@amd.com>
> >> ---
> >>  include/hw/boards.h |    1 +
> >>  vl.c                |    3 +++
> >>  2 files changed, 4 insertions(+)
> >>
> >> diff --git a/include/hw/boards.h b/include/hw/boards.h
> >> index d4fab218e6..ce5aa365cb 100644
> >> --- a/include/hw/boards.h
> >> +++ b/include/hw/boards.h
> >> @@ -238,6 +238,7 @@ struct MachineClass {
> >>                                                           unsigned cpu_index);
> >>      const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState *machine);
> >>      int64_t (*get_default_cpu_node_id)(const MachineState *ms, int idx);
> >> +    void (*init_apicid_fn)(MachineState *ms);  
> > it's x86 specific, so why it wasn put into PCMachineClass?  
> 
> Yes. It is x86 specific for now. I tried to make it generic function so
> other OSes can use it if required(like we have done in
> possible_cpu_arch_ids). It initializes functions required to build the
> apicid for each CPUs. We need these functions much early in the
> initialization. It should be initialized before parse_numa_opts or
> machine_run_board_init(in v1.c) which are called from generic context. We
> cannot use PCMachineClass at this time.

could you point to specific patches in this series that require
apic ids being initialized before parse_numa_opts and elaborate why?

we already have possible_cpu_arch_ids() which could be called very
early and calculates APIC IDs in x86 case, so why not reuse it?

> 
> > 
> >   
> >>  };
> >>  
> >>  /**
> >> diff --git a/vl.c b/vl.c
> >> index a42c24a77f..b6af604e11 100644
> >> --- a/vl.c
> >> +++ b/vl.c
> >> @@ -4318,6 +4318,9 @@ int main(int argc, char **argv, char **envp)
> >>      current_machine->cpu_type = machine_class->default_cpu_type;
> >>      if (cpu_option) {
> >>          current_machine->cpu_type = parse_cpu_option(cpu_option);
> >> +        if (machine_class->init_apicid_fn) {
> >> +            machine_class->init_apicid_fn(current_machine);
> >> +        }
> >>      }
> >>      parse_numa_opts(current_machine);
> >>  
> >>
> >>  
> >   
> 




* Re: [PATCH v3 07/18] machine: Add a new function init_apicid_fn in MachineClass
  2020-01-29  9:14       ` Igor Mammedov
@ 2020-01-29 16:17         ` Babu Moger
  2020-02-03 15:17           ` Igor Mammedov
  2020-01-29 16:32         ` Babu Moger
  1 sibling, 1 reply; 53+ messages in thread
From: Babu Moger @ 2020-01-29 16:17 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: ehabkost, mst, armbru, qemu-devel, pbonzini, rth



On 1/29/20 3:14 AM, Igor Mammedov wrote:
> On Tue, 28 Jan 2020 13:45:31 -0600
> Babu Moger <babu.moger@amd.com> wrote:
> 
>> On 1/28/20 10:29 AM, Igor Mammedov wrote:
>>> On Tue, 03 Dec 2019 18:37:42 -0600
>>> Babu Moger <babu.moger@amd.com> wrote:
>>>   
>>>> Add a new function init_apicid_fn in MachineClass to initialize the mode
>>>> specific handlers to decode the apic ids.
>>>>
>>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>>> ---
>>>>  include/hw/boards.h |    1 +
>>>>  vl.c                |    3 +++
>>>>  2 files changed, 4 insertions(+)
>>>>
>>>> diff --git a/include/hw/boards.h b/include/hw/boards.h
>>>> index d4fab218e6..ce5aa365cb 100644
>>>> --- a/include/hw/boards.h
>>>> +++ b/include/hw/boards.h
>>>> @@ -238,6 +238,7 @@ struct MachineClass {
>>>>                                                           unsigned cpu_index);
>>>>      const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState *machine);
>>>>      int64_t (*get_default_cpu_node_id)(const MachineState *ms, int idx);
>>>> +    void (*init_apicid_fn)(MachineState *ms);  
>>> it's x86 specific, so why it wasn put into PCMachineClass?  
>>
>> Yes. It is x86 specific for now. I tried to make it generic function so
>> other OSes can use it if required(like we have done in
>> possible_cpu_arch_ids). It initializes functions required to build the
>> apicid for each CPUs. We need these functions much early in the
>> initialization. It should be initialized before parse_numa_opts or
>> machine_run_board_init(in v1.c) which are called from generic context. We
>> cannot use PCMachineClass at this time.
> 
> could you point to specific patches in this series that require
> apic ids being initialized before parse_numa_opts and elaborate why?
> 
> we already have possible_cpu_arch_ids() which could be called very
> early and calculates APIC IDs in x86 case, so why not reuse it?


The current code (before this series) parses the NUMA information and then
sequentially builds the apicid. Both are done together.

But this series separates the NUMA parsing and apicid generation. NUMA
parsing is done first, and after that the apicid is generated. The reason is
that we need to know the number of NUMA nodes in advance to decode the apicid.

Look at this patch.
https://lore.kernel.org/qemu-devel/157541988471.46157.6587693720990965800.stgit@naples-babu.amd.com/

+static inline apic_id_t apicid_from_topo_ids_epyc(X86CPUTopoInfo *topo_info,
+                                                  const X86CPUTopoIDs *topo_ids)
+{
+    return (topo_ids->pkg_id  << apicid_pkg_offset_epyc(topo_info)) |
+           (topo_ids->llc_id << apicid_llc_offset_epyc(topo_info)) |
+           (topo_ids->die_id  << apicid_die_offset(topo_info)) |
+           (topo_ids->core_id << apicid_core_offset(topo_info)) |
+           topo_ids->smt_id;
+}


The function apicid_from_topo_ids_epyc builds the apicid. The new decoding
adds llc_id (which is the NUMA node id here) to the current decoding. The
other fields mostly remain the same.
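
A simplified standalone sketch of why the node count has to be known first:
each field's offset is the sum of the bit widths of the fields below it, so
the width of the llc field shifts the package field. The types and helpers
below (DemoTopoInfo, DemoTopoIDs, demo_bitwidth_for_count,
demo_apicid_from_topo_ids) are hypothetical and only mirror the field order
of the function above. They are not the code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits needed to hold IDs in the range [0, count). */
static unsigned demo_bitwidth_for_count(unsigned count)
{
    unsigned width = 0;

    assert(count >= 1 && count <= (1u << 16));
    while ((1u << width) < count) {
        width++;
    }
    return width;
}

/* Hypothetical topology description; nr_llcs comes from NUMA parsing. */
typedef struct DemoTopoInfo {
    unsigned nr_llcs;     /* nodes (LLC groups) per package */
    unsigned nr_dies;
    unsigned nr_cores;
    unsigned nr_threads;
} DemoTopoInfo;

typedef struct DemoTopoIDs {
    unsigned pkg_id;
    unsigned llc_id;
    unsigned die_id;
    unsigned core_id;
    unsigned smt_id;
} DemoTopoIDs;

/* Pack the IDs from least significant (smt) to most significant (pkg). */
static uint32_t demo_apicid_from_topo_ids(const DemoTopoInfo *ti,
                                          const DemoTopoIDs *ids)
{
    unsigned core_off = demo_bitwidth_for_count(ti->nr_threads);
    unsigned die_off = core_off + demo_bitwidth_for_count(ti->nr_cores);
    unsigned llc_off = die_off + demo_bitwidth_for_count(ti->nr_dies);
    unsigned pkg_off = llc_off + demo_bitwidth_for_count(ti->nr_llcs);

    return (ids->pkg_id << pkg_off) |
           (ids->llc_id << llc_off) |
           (ids->die_id << die_off) |
           (ids->core_id << core_off) |
           ids->smt_id;
}
```

With 2 threads, 8 cores, 1 die and 4 nodes the package field starts at bit 6.
With 2 nodes it starts at bit 5, so the same package ID lands on a different
bit depending on the NUMA configuration.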


Details from the bug https://bugzilla.redhat.com/show_bug.cgi?id=1728166

Processor Programming Reference (PPR) for AMD Family 17h Model 01h,
Revision B1 Processors:

"""
2.1.10.2.1.3 ApicId Enumeration Requirements
Operating systems are expected to use
Core::X86::Cpuid::SizeId[ApicIdCoreIdSize], the number of least significant
bits in the Initial APIC ID that indicate core ID within a processor, in
constructing per-core CPUID masks. Core::X86::Cpuid::SizeId[ApicIdCoreIdSize]
determines the maximum number of cores (MNC) that the processor could
theoretically support, not the actual number of cores that are actually
implemented or enabled on the processor, as indicated by
Core::X86::Cpuid::SizeId[NC].
Each Core::X86::Apic::ApicId[ApicId] register is preset as follows:
• ApicId[6] = Socket ID.
• ApicId[5:4] = Node ID.
• ApicId[3] = Logical CCX L3 complex ID
• ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} : {1'b0,LogicalCoreID[1:0]}.
"""
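
The fixed layout quoted above can be written out as a small standalone
sketch. DemoEpycIds and demo_epyc_apicid are hypothetical names used only
for illustration. The bit positions are taken directly from the PPR excerpt,
and the sketch assumes the field ranges implied by those bit widths.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the fields named in the PPR excerpt. */
typedef struct DemoEpycIds {
    unsigned socket_id;   /* ApicId[6] */
    unsigned node_id;     /* ApicId[5:4] */
    unsigned ccx_id;      /* ApicId[3] */
    unsigned core_id;     /* LogicalCoreID[1:0] */
    unsigned thread_id;   /* ThreadId, used only when SMT is enabled */
} DemoEpycIds;

/* Compose the 7-bit APIC ID according to the quoted layout. */
static uint32_t demo_epyc_apicid(const DemoEpycIds *ids, bool smt)
{
    uint32_t low;

    /* Each field must fit the number of bits the layout gives it. */
    assert(ids->socket_id < 2 && ids->node_id < 4 && ids->ccx_id < 2);
    assert(ids->core_id < 4 && ids->thread_id < 2);

    /* ApicId[2:0]: {core[1:0], thread} with SMT, {0, core[1:0]} without. */
    low = smt ? ((ids->core_id << 1) | ids->thread_id) : ids->core_id;

    return (ids->socket_id << 6) |
           (ids->node_id << 4) |
           (ids->ccx_id << 3) |
           low;
}
```

For example, socket 1, node 2, CCX 1, core 3, thread 1 with SMT enabled
gives 0x6f, and socket 0, node 1, CCX 0, core 2 without SMT gives 0x12.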

> 
>>
>>>
>>>   
>>>>  };
>>>>  
>>>>  /**
>>>> diff --git a/vl.c b/vl.c
>>>> index a42c24a77f..b6af604e11 100644
>>>> --- a/vl.c
>>>> +++ b/vl.c
>>>> @@ -4318,6 +4318,9 @@ int main(int argc, char **argv, char **envp)
>>>>      current_machine->cpu_type = machine_class->default_cpu_type;
>>>>      if (cpu_option) {
>>>>          current_machine->cpu_type = parse_cpu_option(cpu_option);
>>>> +        if (machine_class->init_apicid_fn) {
>>>> +            machine_class->init_apicid_fn(current_machine);
>>>> +        }
>>>>      }
>>>>      parse_numa_opts(current_machine);
>>>>  
>>>>
>>>>  
>>>   
>>
> 



* Re: [PATCH v3 07/18] machine: Add a new function init_apicid_fn in MachineClass
  2020-01-29  9:14       ` Igor Mammedov
  2020-01-29 16:17         ` Babu Moger
@ 2020-01-29 16:32         ` Babu Moger
  2020-01-29 16:51           ` Eduardo Habkost
  1 sibling, 1 reply; 53+ messages in thread
From: Babu Moger @ 2020-01-29 16:32 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: ehabkost, mst, armbru, qemu-devel, pbonzini, rth



On 1/29/20 3:14 AM, Igor Mammedov wrote:
> On Tue, 28 Jan 2020 13:45:31 -0600
> Babu Moger <babu.moger@amd.com> wrote:
> 
>> On 1/28/20 10:29 AM, Igor Mammedov wrote:
>>> On Tue, 03 Dec 2019 18:37:42 -0600
>>> Babu Moger <babu.moger@amd.com> wrote:
>>>   
>>>> Add a new function init_apicid_fn in MachineClass to initialize the mode
>>>> specific handlers to decode the apic ids.
>>>>
>>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>>> ---
>>>>  include/hw/boards.h |    1 +
>>>>  vl.c                |    3 +++
>>>>  2 files changed, 4 insertions(+)
>>>>
>>>> diff --git a/include/hw/boards.h b/include/hw/boards.h
>>>> index d4fab218e6..ce5aa365cb 100644
>>>> --- a/include/hw/boards.h
>>>> +++ b/include/hw/boards.h
>>>> @@ -238,6 +238,7 @@ struct MachineClass {
>>>>                                                           unsigned cpu_index);
>>>>      const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState *machine);
>>>>      int64_t (*get_default_cpu_node_id)(const MachineState *ms, int idx);
>>>> +    void (*init_apicid_fn)(MachineState *ms);  
>>> it's x86 specific, so why it wasn put into PCMachineClass?  
>>
>> Yes. It is x86 specific for now. I tried to make it generic function so
>> other OSes can use it if required(like we have done in
>> possible_cpu_arch_ids). It initializes functions required to build the
>> apicid for each CPUs. We need these functions much early in the
>> initialization. It should be initialized before parse_numa_opts or
>> machine_run_board_init(in v1.c) which are called from generic context. We
>> cannot use PCMachineClass at this time.
> 
> could you point to specific patches in this series that require
> apic ids being initialized before parse_numa_opts and elaborate why?
> 
> we already have possible_cpu_arch_ids() which could be called very
> early and calculates APIC IDs in x86 case, so why not reuse it?

Forgot to respond to this. The possible_cpu_arch_ids does not use the numa
information to build the apic id. We cannot re-use it without changing it
drastically.

> 
>>
>>>
>>>   
>>>>  };
>>>>  
>>>>  /**
>>>> diff --git a/vl.c b/vl.c
>>>> index a42c24a77f..b6af604e11 100644
>>>> --- a/vl.c
>>>> +++ b/vl.c
>>>> @@ -4318,6 +4318,9 @@ int main(int argc, char **argv, char **envp)
>>>>      current_machine->cpu_type = machine_class->default_cpu_type;
>>>>      if (cpu_option) {
>>>>          current_machine->cpu_type = parse_cpu_option(cpu_option);
>>>> +        if (machine_class->init_apicid_fn) {
>>>> +            machine_class->init_apicid_fn(current_machine);
>>>> +        }
>>>>      }
>>>>      parse_numa_opts(current_machine);
>>>>  
>>>>
>>>>  
>>>   
>>
> 



* Re: [PATCH v3 16/18] hw/i386: Introduce EPYC mode function handlers
  2020-01-28 21:48     ` Babu Moger
@ 2020-01-29 16:41       ` Eduardo Habkost
  0 siblings, 0 replies; 53+ messages in thread
From: Eduardo Habkost @ 2020-01-29 16:41 UTC (permalink / raw)
  To: Babu Moger; +Cc: mst, armbru, qemu-devel, imammedo, pbonzini, rth

On Tue, Jan 28, 2020 at 03:48:15PM -0600, Babu Moger wrote:
> On 1/28/20 2:04 PM, Eduardo Habkost wrote:
[...]
> > If you need a CPU model to provide special behavior,
> > you have two options:
> > 
> > * Add a method pointer to X86CPUClass and/or X86CPUDefinition
> > * Add a QOM property to enable/disable special behavior, and
> >   include the property in the CPU model definition.
> > 
> > The second option might be preferable long term, but might
> > require more work because the property would become visible in
> > query-cpu-model-expansion and in the command line.  The first
> > option may be acceptable to avoid extra user-visible complexity
> > in the first version.
> 
> Yes. We need to have a special behavior for specific model.
> I will look at both these above approaches closely. Challenge is this
> needs to be done much early in the initialization(before parse_numa_opts
> or machine_run_board_init). Will research more on this.

You should be able to look up the requested CPU model using
object_class_by_name(machine->cpu_type).  If you do this inside
x86-specific code before calling
apicid_from_cpu_idx/topo_ids_from_apicid/apicid_from_topo_ids,
you probably won't need a init_apicid_fn hook.

-- 
Eduardo




* Re: [PATCH v3 07/18] machine: Add a new function init_apicid_fn in MachineClass
  2020-01-29 16:32         ` Babu Moger
@ 2020-01-29 16:51           ` Eduardo Habkost
  2020-01-29 17:05             ` Babu Moger
  0 siblings, 1 reply; 53+ messages in thread
From: Eduardo Habkost @ 2020-01-29 16:51 UTC (permalink / raw)
  To: Babu Moger; +Cc: mst, armbru, qemu-devel, pbonzini, Igor Mammedov, rth

On Wed, Jan 29, 2020 at 10:32:01AM -0600, Babu Moger wrote:
> 
> 
> On 1/29/20 3:14 AM, Igor Mammedov wrote:
> > On Tue, 28 Jan 2020 13:45:31 -0600
> > Babu Moger <babu.moger@amd.com> wrote:
> > 
> >> On 1/28/20 10:29 AM, Igor Mammedov wrote:
> >>> On Tue, 03 Dec 2019 18:37:42 -0600
> >>> Babu Moger <babu.moger@amd.com> wrote:
> >>>   
> >>>> Add a new function init_apicid_fn in MachineClass to initialize the mode
> >>>> specific handlers to decode the apic ids.
> >>>>
> >>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
> >>>> ---
> >>>>  include/hw/boards.h |    1 +
> >>>>  vl.c                |    3 +++
> >>>>  2 files changed, 4 insertions(+)
> >>>>
> >>>> diff --git a/include/hw/boards.h b/include/hw/boards.h
> >>>> index d4fab218e6..ce5aa365cb 100644
> >>>> --- a/include/hw/boards.h
> >>>> +++ b/include/hw/boards.h
> >>>> @@ -238,6 +238,7 @@ struct MachineClass {
> >>>>                                                           unsigned cpu_index);
> >>>>      const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState *machine);
> >>>>      int64_t (*get_default_cpu_node_id)(const MachineState *ms, int idx);
> >>>> +    void (*init_apicid_fn)(MachineState *ms);  
> >>> it's x86 specific, so why it wasn put into PCMachineClass?  
> >>
> >> Yes. It is x86 specific for now. I tried to make it generic function so
> >> other OSes can use it if required(like we have done in
> >> possible_cpu_arch_ids). It initializes functions required to build the
> >> apicid for each CPUs. We need these functions much early in the
> >> initialization. It should be initialized before parse_numa_opts or
> >> machine_run_board_init(in v1.c) which are called from generic context. We
> >> cannot use PCMachineClass at this time.
> > 
> > could you point to specific patches in this series that require
> > apic ids being initialized before parse_numa_opts and elaborate why?
> > 
> > we already have possible_cpu_arch_ids() which could be called very
> > early and calculates APIC IDs in x86 case, so why not reuse it?
> 
> Forgot to respond to this. The possible_cpu_arch_ids does not use the numa
> information to build the apic id. We cannot re-use it without changing it
> drastically.

I don't get it.  I see multiple patches in this series changing
pc_possible_cpu_arch_ids() (which is really expected, if you are
changing how APIC IDs are generated).

-- 
Eduardo




* Re: [PATCH v3 07/18] machine: Add a new function init_apicid_fn in MachineClass
  2020-01-29 16:51           ` Eduardo Habkost
@ 2020-01-29 17:05             ` Babu Moger
  0 siblings, 0 replies; 53+ messages in thread
From: Babu Moger @ 2020-01-29 17:05 UTC (permalink / raw)
  To: Eduardo Habkost; +Cc: mst, armbru, qemu-devel, pbonzini, Igor Mammedov, rth



On 1/29/20 10:51 AM, Eduardo Habkost wrote:
> On Wed, Jan 29, 2020 at 10:32:01AM -0600, Babu Moger wrote:
>>
>>
>> On 1/29/20 3:14 AM, Igor Mammedov wrote:
>>> On Tue, 28 Jan 2020 13:45:31 -0600
>>> Babu Moger <babu.moger@amd.com> wrote:
>>>
>>>> On 1/28/20 10:29 AM, Igor Mammedov wrote:
>>>>> On Tue, 03 Dec 2019 18:37:42 -0600
>>>>> Babu Moger <babu.moger@amd.com> wrote:
>>>>>   
>>>>>> Add a new function init_apicid_fn in MachineClass to initialize the mode
>>>>>> specific handlers to decode the apic ids.
>>>>>>
>>>>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>>>>> ---
>>>>>>  include/hw/boards.h |    1 +
>>>>>>  vl.c                |    3 +++
>>>>>>  2 files changed, 4 insertions(+)
>>>>>>
>>>>>> diff --git a/include/hw/boards.h b/include/hw/boards.h
>>>>>> index d4fab218e6..ce5aa365cb 100644
>>>>>> --- a/include/hw/boards.h
>>>>>> +++ b/include/hw/boards.h
>>>>>> @@ -238,6 +238,7 @@ struct MachineClass {
>>>>>>                                                           unsigned cpu_index);
>>>>>>      const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState *machine);
>>>>>>      int64_t (*get_default_cpu_node_id)(const MachineState *ms, int idx);
>>>>>> +    void (*init_apicid_fn)(MachineState *ms);  
>>>>> it's x86 specific, so why it wasn put into PCMachineClass?  
>>>>
>>>> Yes. It is x86 specific for now. I tried to make it generic function so
>>>> other OSes can use it if required(like we have done in
>>>> possible_cpu_arch_ids). It initializes functions required to build the
>>>> apicid for each CPUs. We need these functions much early in the
>>>> initialization. It should be initialized before parse_numa_opts or
>>>> machine_run_board_init(in v1.c) which are called from generic context. We
>>>> cannot use PCMachineClass at this time.
>>>
>>> could you point to specific patches in this series that require
>>> apic ids being initialized before parse_numa_opts and elaborate why?
>>>
>>> we already have possible_cpu_arch_ids() which could be called very
>>> early and calculates APIC IDs in x86 case, so why not reuse it?
>>
>> Forgot to respond to this. The possible_cpu_arch_ids does not use the numa
>> information to build the apic id. We cannot re-use it without changing it
>> drastically.
> 
> I don't get it.  I see multiple patches in this series changing
> pc_possible_cpu_arch_ids() (which is really expected, if you are
> changing how APIC IDs are generated).
> 

My bad, I misspoke on that. I should have said that the current decoding
logic (x86_apicid_from_cpu_idx, x86_topo_ids_from_apicid,
x86_apicid_from_topo_ids) cannot be used as is.



* Re: [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models
  2019-12-04  0:36 [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models Babu Moger
                   ` (17 preceding siblings ...)
  2019-12-04  0:39 ` [PATCH v3 18/18] tests: Update the Unit tests Babu Moger
@ 2020-02-03 14:59 ` Igor Mammedov
  2020-02-03 19:31   ` Babu Moger
  18 siblings, 1 reply; 53+ messages in thread
From: Igor Mammedov @ 2020-02-03 14:59 UTC (permalink / raw)
  To: Babu Moger; +Cc: ehabkost, mst, armbru, qemu-devel, pbonzini, rth

On Tue, 03 Dec 2019 18:36:54 -0600
Babu Moger <babu.moger@amd.com> wrote:

> This series fixes APIC ID encoding problems on AMD EPYC CPUs.
> https://bugzilla.redhat.com/show_bug.cgi?id=1728166
> 
> Currently, the APIC ID is decoded based on the sequence
> sockets->dies->cores->threads. This works for most standard AMD and other
> vendors' configurations, but this decoding sequence does not follow that of
> AMD's APIC ID enumeration strictly. In some cases this can cause CPU topology
> inconsistency.  When booting a guest VM, the kernel tries to validate the
> topology, and finds it inconsistent with the enumeration of EPYC cpu models.
> 
> To fix the problem we need to build the topology as per the Processor
> Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1
> Processors. It is available at https://www.amd.com/system/files/TechDocs/55570-B1_PUB.zip
> 
> Here is the text from the PPR.
> Operating systems are expected to use Core::X86::Cpuid::SizeId[ApicIdSize], the
> number of least significant bits in the Initial APIC ID that indicate core ID
> within a processor, in constructing per-core CPUID masks.
> Core::X86::Cpuid::SizeId[ApicIdSize] determines the maximum number of cores
> (MNC) that the processor could theoretically support, not the actual number of
> cores that are actually implemented or enabled on the processor, as indicated
> by Core::X86::Cpuid::SizeId[NC].
> Each Core::X86::Apic::ApicId[ApicId] register is preset as follows:
> • ApicId[6] = Socket ID.
> • ApicId[5:4] = Node ID.
> • ApicId[3] = Logical CCX L3 complex ID
> • ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} : {1'b0,LogicalCoreID[1:0]}
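
To make the quoted PPR layout concrete, here is a minimal sketch that packs
the fields exactly as the excerpt lists them. The helper name ppr_apic_id()
and its parameters are hypothetical and are not part of QEMU or of this
series; the field widths are the fixed ones from the Family 17h Model 01h
text above, not a general encoder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not QEMU code): pack an APIC ID using the fixed
 * bit positions quoted from the PPR for Family 17h Model 01h.
 */
static inline uint32_t ppr_apic_id(unsigned socket, unsigned node,
                                   unsigned ccx, unsigned core,
                                   unsigned thread, bool smt)
{
    /* Each field must fit the width the PPR excerpt gives it. */
    assert(socket < 2 && node < 4 && ccx < 2 && core < 4 && thread < 2);
    /* Without SMT there is only thread 0. */
    assert(smt || thread == 0);

    /* ApicId[2:0]: {core[1:0], thread} with SMT, {1'b0, core[1:0]} without */
    uint32_t low = smt ? ((core << 1) | thread) : core;

    /* ApicId[6] = socket, ApicId[5:4] = node, ApicId[3] = CCX */
    return ((uint32_t)socket << 6) | (node << 4) | (ccx << 3) | low;
}
```

For example, socket 1, node 2, CCX 1, core 3, thread 1 with SMT enabled
packs to 0x6f under this layout.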


After checking out all the patches and some pondering, the approach used
here looks too intrusive to me for the task at hand, especially where it
touches generic code.

(Skip to ==== for the suggested simplification if you don't want to read
the reasoning behind it first.)

Let's look for a way to simplify it a little bit.

So the problem we are trying to solve is:
 1: calculate APIC IDs based on the CPU type (to be more specific: for EPYC based CPUs)
 2: that calculation depends on knowing the total number of NUMA nodes.

Externally, the workflow looks like the following:
  1. The user provides -smp x,sockets,cores,...,maxcpus,
      which is used by the possible_cpu_arch_ids() singleton to build the list
      of possible CPUs (available to the user via the 'hotpluggable-cpus'
      command).

      The hook could be called very early, when the possible_cpus data might
      not be complete. It builds a list of possible CPUs which the user could
      modify later.

  2.1 The user uses "-numa cpu,node-id=x,..." or the legacy "-numa node,node_id=x,cpus="
      options to assign CPUs to nodes, which one way or another calls
      machine_set_cpu_numa_node(). The latter updates the 'possible_cpus' list
      with node information. This happens early, when the total number of nodes
      is not yet available.

  2.2 The user does not provide explicit node mappings for CPUs.
      QEMU steps in and assigns possible CPUs to nodes in machine_numa_finish_cpu_init()
      (using the same machine_set_cpu_numa_node()) right before calling the
      board-specific machine init(). At that time the total number of nodes is known.

In cases 1 and 2.1, 'arch_id' in the 'possible_cpus' list doesn't have to be
defined before the board's init() is run.

In case 2.2 it calls get_default_cpu_node_id() -> x86_get_default_cpu_node_id(),
which uses arch_id to calculate the NUMA node.
But then the question is: does it have to use the APIC ID, or could it infer the
'pkg_id' it's after from the ms->possible_cpus->cpus[i].props data?

With that out of the way, the APIC ID will be used only during the board's init(),
so the board could update possible_cpus with valid APIC IDs at the start of
x86_cpus_init().

====
In a nutshell, it would be much easier to do the following:

 1. make x86_get_default_cpu_node_id() independent of the APIC ID or,
    if that is impossible, recompute the APIC IDs there if the CPU
    type is EPYC based (since the number of nodes is already known)
 2. recompute the APIC IDs in x86_cpus_init() if the CPU type is EPYC based

This way one doesn't need to touch generic NUMA code or introduce an
x86 specific init_apicid_fn() hook into generic code, and the
x86/EPYC nuances stay contained within x86 code only.
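
A rough sketch of what item 1 could look like, assuming the socket ID is
already present in the per-CPU props by the time the default node is
computed. The Sketch* types are simplified stand-ins for CPUArchId and
CpuInstanceProperties, and sketch_default_cpu_node_id() is a hypothetical
name; only the "pkg_id % num_nodes" rule is taken from the existing
pc_get_default_cpu_node_id().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for CpuInstanceProperties (assumption). */
typedef struct {
    bool has_socket_id;
    int64_t socket_id;
} SketchCpuProps;

/* Simplified stand-in for CPUArchId (assumption). */
typedef struct {
    uint64_t arch_id;       /* APIC ID; deliberately not read below */
    SketchCpuProps props;
} SketchCpuSlot;

/*
 * Pick the default NUMA node from props.socket_id instead of decoding
 * the APIC ID, so arch_id need not be valid yet when this runs.
 */
static int64_t sketch_default_cpu_node_id(const SketchCpuSlot *cpus, int len,
                                          int idx, int num_nodes)
{
    assert(idx >= 0 && idx < len);
    assert(num_nodes > 0);
    assert(cpus[idx].props.has_socket_id);

    return cpus[idx].props.socket_id % num_nodes;
}
```

With something along these lines, the APIC IDs would only need to be valid
from the start of x86_cpus_init() onwards.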

> v3:
>   1. Consolidated the topology information in structure X86CPUTopoInfo.
>   2. Changed the ccx_id to llc_id as commented by upstream.
>   3. Generalized the apic id decoding. It is mostly similar to current apic id
>      except that it adds new field llc_id when numa configured. Removes all the
>      hardcoded values.
>   4. Removed the earlier parse_numa split. And moved the numa node initialization
>      inside the numa_complete_configuration. This is bit cleaner as commented by 
>      Eduardo.
>   5. Added new function init_apicid_fn inside machine_class structure. This
>      will be used to update the apic id handler specific to cpu model.
>   6. Updated the cpuid unit tests.
>   7. TODO : Need to figure out how to dynamically update the handlers using cpu models.
>      I might some guidance on that.
> 
> v2:
>   https://lore.kernel.org/qemu-devel/156779689013.21957.1631551572950676212.stgit@localhost.localdomain/
>   1. Introduced the new property epyc to enable new epyc mode.
>   2. Separated the epyc mode and non epyc mode function.
>   3. Introduced function pointers in PCMachineState to handle the
>      differences.
>   4. Mildly tested different combinations to make things are working as expected.
>   5. TODO : Setting the epyc feature bit needs to be worked out. This feature is
>      supported only on AMD EPYC models. I may need some guidance on that.
> 
> v1:
>   https://lore.kernel.org/qemu-devel/20190731232032.51786-1-babu.moger@amd.com/
> 
> ---
> 
> Babu Moger (18):
>       hw/i386: Rename X86CPUTopoInfo structure to X86CPUTopoIDs
>       hw/i386: Introduce X86CPUTopoInfo to contain topology info
>       hw/i386: Consolidate topology functions
>       hw/i386: Introduce initialize_topo_info to initialize X86CPUTopoInfo
>       machine: Add SMP Sockets in CpuTopology
>       hw/core: Add core complex id in X86CPU topology
>       machine: Add a new function init_apicid_fn in MachineClass
>       hw/i386: Update structures for nodes_per_pkg
>       i386: Add CPUX86Family type in CPUX86State
>       hw/386: Add EPYC mode topology decoding functions
>       i386: Cleanup and use the EPYC mode topology functions
>       numa: Split the numa initialization
>       hw/i386: Introduce apicid_from_cpu_idx in PCMachineState
>       hw/i386: Introduce topo_ids_from_apicid handler PCMachineState
>       hw/i386: Introduce apic_id_from_topo_ids handler in PCMachineState
>       hw/i386: Introduce EPYC mode function handlers
>       i386: Fix pkg_id offset for epyc mode
>       tests: Update the Unit tests
> 
> 
>  hw/core/machine-hmp-cmds.c |    3 +
>  hw/core/machine.c          |   14 +++
>  hw/core/numa.c             |   62 +++++++++----
>  hw/i386/pc.c               |  132 +++++++++++++++++++---------
>  include/hw/boards.h        |    3 +
>  include/hw/i386/pc.h       |    9 ++
>  include/hw/i386/topology.h |  209 +++++++++++++++++++++++++++++++-------------
>  include/sysemu/numa.h      |    5 +
>  qapi/machine.json          |    7 +
>  target/i386/cpu.c          |  196 ++++++++++++-----------------------------
>  target/i386/cpu.h          |    9 ++
>  tests/test-x86-cpuid.c     |  115 ++++++++++++++----------
>  vl.c                       |    4 +
>  13 files changed, 455 insertions(+), 313 deletions(-)
> 
> --
> 




* Re: [PATCH v3 01/18] hw/i386: Rename X86CPUTopoInfo structure to X86CPUTopoIDs
  2019-12-04  0:37 ` [PATCH v3 01/18] hw/i386: Rename X86CPUTopoInfo structure to X86CPUTopoIDs Babu Moger
@ 2020-02-03 15:08   ` Igor Mammedov
  2020-02-03 18:25     ` Babu Moger
  0 siblings, 1 reply; 53+ messages in thread
From: Igor Mammedov @ 2020-02-03 15:08 UTC (permalink / raw)
  To: Babu Moger; +Cc: ehabkost, mst, armbru, qemu-devel, pbonzini, rth

On Tue, 03 Dec 2019 18:37:01 -0600
Babu Moger <babu.moger@amd.com> wrote:

> Rename few data structures related to X86 topology.  X86CPUTopoIDs will
> have individual arch ids. Next patch introduces X86CPUTopoInfo which will
> have all topology information(like cores, threads etc..).

What commit was this series based on?
(It doesn't apply to master anymore.)


> Signed-off-by: Babu Moger <babu.moger@amd.com>
> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
> ---
>  hw/i386/pc.c               |   60 ++++++++++++++++++++++----------------------
>  include/hw/i386/topology.h |   40 +++++++++++++++--------------
>  2 files changed, 50 insertions(+), 50 deletions(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 51b72439b4..5bd2ffccb7 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -2212,7 +2212,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>      int idx;
>      CPUState *cs;
>      CPUArchId *cpu_slot;
> -    X86CPUTopoInfo topo;
> +    X86CPUTopoIDs topo_ids;
>      X86CPU *cpu = X86_CPU(dev);
>      CPUX86State *env = &cpu->env;
>      MachineState *ms = MACHINE(hotplug_dev);
> @@ -2277,12 +2277,12 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>              return;
>          }
>  
> -        topo.pkg_id = cpu->socket_id;
> -        topo.die_id = cpu->die_id;
> -        topo.core_id = cpu->core_id;
> -        topo.smt_id = cpu->thread_id;
> +        topo_ids.pkg_id = cpu->socket_id;
> +        topo_ids.die_id = cpu->die_id;
> +        topo_ids.core_id = cpu->core_id;
> +        topo_ids.smt_id = cpu->thread_id;
>          cpu->apic_id = apicid_from_topo_ids(pcms->smp_dies, smp_cores,
> -                                            smp_threads, &topo);
> +                                            smp_threads, &topo_ids);
>      }
>  
>      cpu_slot = pc_find_cpu_slot(MACHINE(pcms), cpu->apic_id, &idx);
> @@ -2290,11 +2290,11 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>          MachineState *ms = MACHINE(pcms);
>  
>          x86_topo_ids_from_apicid(cpu->apic_id, pcms->smp_dies,
> -                                 smp_cores, smp_threads, &topo);
> +                                 smp_cores, smp_threads, &topo_ids);
>          error_setg(errp,
>              "Invalid CPU [socket: %u, die: %u, core: %u, thread: %u] with"
>              " APIC ID %" PRIu32 ", valid index range 0:%d",
> -            topo.pkg_id, topo.die_id, topo.core_id, topo.smt_id,
> +            topo_ids.pkg_id, topo_ids.die_id, topo_ids.core_id, topo_ids.smt_id,
>              cpu->apic_id, ms->possible_cpus->len - 1);
>          return;
>      }
> @@ -2312,34 +2312,34 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>       * once -smp refactoring is complete and there will be CPU private
>       * CPUState::nr_cores and CPUState::nr_threads fields instead of globals */
>      x86_topo_ids_from_apicid(cpu->apic_id, pcms->smp_dies,
> -                             smp_cores, smp_threads, &topo);
> -    if (cpu->socket_id != -1 && cpu->socket_id != topo.pkg_id) {
> +                             smp_cores, smp_threads, &topo_ids);
> +    if (cpu->socket_id != -1 && cpu->socket_id != topo_ids.pkg_id) {
>          error_setg(errp, "property socket-id: %u doesn't match set apic-id:"
> -            " 0x%x (socket-id: %u)", cpu->socket_id, cpu->apic_id, topo.pkg_id);
> +            " 0x%x (socket-id: %u)", cpu->socket_id, cpu->apic_id, topo_ids.pkg_id);
>          return;
>      }
> -    cpu->socket_id = topo.pkg_id;
> +    cpu->socket_id = topo_ids.pkg_id;
>  
> -    if (cpu->die_id != -1 && cpu->die_id != topo.die_id) {
> +    if (cpu->die_id != -1 && cpu->die_id != topo_ids.die_id) {
>          error_setg(errp, "property die-id: %u doesn't match set apic-id:"
> -            " 0x%x (die-id: %u)", cpu->die_id, cpu->apic_id, topo.die_id);
> +            " 0x%x (die-id: %u)", cpu->die_id, cpu->apic_id, topo_ids.die_id);
>          return;
>      }
> -    cpu->die_id = topo.die_id;
> +    cpu->die_id = topo_ids.die_id;
>  
> -    if (cpu->core_id != -1 && cpu->core_id != topo.core_id) {
> +    if (cpu->core_id != -1 && cpu->core_id != topo_ids.core_id) {
>          error_setg(errp, "property core-id: %u doesn't match set apic-id:"
> -            " 0x%x (core-id: %u)", cpu->core_id, cpu->apic_id, topo.core_id);
> +            " 0x%x (core-id: %u)", cpu->core_id, cpu->apic_id, topo_ids.core_id);
>          return;
>      }
> -    cpu->core_id = topo.core_id;
> +    cpu->core_id = topo_ids.core_id;
>  
> -    if (cpu->thread_id != -1 && cpu->thread_id != topo.smt_id) {
> +    if (cpu->thread_id != -1 && cpu->thread_id != topo_ids.smt_id) {
>          error_setg(errp, "property thread-id: %u doesn't match set apic-id:"
> -            " 0x%x (thread-id: %u)", cpu->thread_id, cpu->apic_id, topo.smt_id);
> +            " 0x%x (thread-id: %u)", cpu->thread_id, cpu->apic_id, topo_ids.smt_id);
>          return;
>      }
> -    cpu->thread_id = topo.smt_id;
> +    cpu->thread_id = topo_ids.smt_id;
>  
>      if (hyperv_feat_enabled(cpu, HYPERV_FEAT_VPINDEX) &&
>          !kvm_hv_vpindex_settable()) {
> @@ -2692,14 +2692,14 @@ pc_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
>  
>  static int64_t pc_get_default_cpu_node_id(const MachineState *ms, int idx)
>  {
> -   X86CPUTopoInfo topo;
> +   X86CPUTopoIDs topo_ids;
>     PCMachineState *pcms = PC_MACHINE(ms);
>  
>     assert(idx < ms->possible_cpus->len);
>     x86_topo_ids_from_apicid(ms->possible_cpus->cpus[idx].arch_id,
>                              pcms->smp_dies, ms->smp.cores,
> -                            ms->smp.threads, &topo);
> -   return topo.pkg_id % ms->numa_state->num_nodes;
> +                            ms->smp.threads, &topo_ids);
> +   return topo_ids.pkg_id % ms->numa_state->num_nodes;
>  }
>  
>  static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
> @@ -2721,24 +2721,24 @@ static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
>                                    sizeof(CPUArchId) * max_cpus);
>      ms->possible_cpus->len = max_cpus;
>      for (i = 0; i < ms->possible_cpus->len; i++) {
> -        X86CPUTopoInfo topo;
> +        X86CPUTopoIDs topo_ids;
>  
>          ms->possible_cpus->cpus[i].type = ms->cpu_type;
>          ms->possible_cpus->cpus[i].vcpus_count = 1;
>          ms->possible_cpus->cpus[i].arch_id = x86_cpu_apic_id_from_index(pcms, i);
>          x86_topo_ids_from_apicid(ms->possible_cpus->cpus[i].arch_id,
>                                   pcms->smp_dies, ms->smp.cores,
> -                                 ms->smp.threads, &topo);
> +                                 ms->smp.threads, &topo_ids);
>          ms->possible_cpus->cpus[i].props.has_socket_id = true;
> -        ms->possible_cpus->cpus[i].props.socket_id = topo.pkg_id;
> +        ms->possible_cpus->cpus[i].props.socket_id = topo_ids.pkg_id;
>          if (pcms->smp_dies > 1) {
>              ms->possible_cpus->cpus[i].props.has_die_id = true;
> -            ms->possible_cpus->cpus[i].props.die_id = topo.die_id;
> +            ms->possible_cpus->cpus[i].props.die_id = topo_ids.die_id;
>          }
>          ms->possible_cpus->cpus[i].props.has_core_id = true;
> -        ms->possible_cpus->cpus[i].props.core_id = topo.core_id;
> +        ms->possible_cpus->cpus[i].props.core_id = topo_ids.core_id;
>          ms->possible_cpus->cpus[i].props.has_thread_id = true;
> -        ms->possible_cpus->cpus[i].props.thread_id = topo.smt_id;
> +        ms->possible_cpus->cpus[i].props.thread_id = topo_ids.smt_id;
>      }
>      return ms->possible_cpus;
>  }
> diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> index 4ff5b2da6c..6c184f3115 100644
> --- a/include/hw/i386/topology.h
> +++ b/include/hw/i386/topology.h
> @@ -45,12 +45,12 @@
>   */
>  typedef uint32_t apic_id_t;
>  
> -typedef struct X86CPUTopoInfo {
> +typedef struct X86CPUTopoIDs {
>      unsigned pkg_id;
>      unsigned die_id;
>      unsigned core_id;
>      unsigned smt_id;
> -} X86CPUTopoInfo;
> +} X86CPUTopoIDs;
>  
>  /* Return the bit width needed for 'count' IDs
>   */
> @@ -122,12 +122,12 @@ static inline unsigned apicid_pkg_offset(unsigned nr_dies,
>  static inline apic_id_t apicid_from_topo_ids(unsigned nr_dies,
>                                               unsigned nr_cores,
>                                               unsigned nr_threads,
> -                                             const X86CPUTopoInfo *topo)
> +                                             const X86CPUTopoIDs *topo_ids)
>  {
> -    return (topo->pkg_id  << apicid_pkg_offset(nr_dies, nr_cores, nr_threads)) |
> -           (topo->die_id  << apicid_die_offset(nr_dies, nr_cores, nr_threads)) |
> -          (topo->core_id << apicid_core_offset(nr_dies, nr_cores, nr_threads)) |
> -           topo->smt_id;
> +    return (topo_ids->pkg_id  << apicid_pkg_offset(nr_dies, nr_cores, nr_threads)) |
> +           (topo_ids->die_id  << apicid_die_offset(nr_dies, nr_cores, nr_threads)) |
> +           (topo_ids->core_id << apicid_core_offset(nr_dies, nr_cores, nr_threads)) |
> +           topo_ids->smt_id;
>  }
>  
>  /* Calculate thread/core/package IDs for a specific topology,
> @@ -137,12 +137,12 @@ static inline void x86_topo_ids_from_idx(unsigned nr_dies,
>                                           unsigned nr_cores,
>                                           unsigned nr_threads,
>                                           unsigned cpu_index,
> -                                         X86CPUTopoInfo *topo)
> +                                         X86CPUTopoIDs *topo_ids)
>  {
> -    topo->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
> -    topo->die_id = cpu_index / (nr_cores * nr_threads) % nr_dies;
> -    topo->core_id = cpu_index / nr_threads % nr_cores;
> -    topo->smt_id = cpu_index % nr_threads;
> +    topo_ids->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
> +    topo_ids->die_id = cpu_index / (nr_cores * nr_threads) % nr_dies;
> +    topo_ids->core_id = cpu_index / nr_threads % nr_cores;
> +    topo_ids->smt_id = cpu_index % nr_threads;
>  }
>  
>  /* Calculate thread/core/package IDs for a specific topology,
> @@ -152,17 +152,17 @@ static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
>                                              unsigned nr_dies,
>                                              unsigned nr_cores,
>                                              unsigned nr_threads,
> -                                            X86CPUTopoInfo *topo)
> +                                            X86CPUTopoIDs *topo_ids)
>  {
> -    topo->smt_id = apicid &
> +    topo_ids->smt_id = apicid &
>              ~(0xFFFFFFFFUL << apicid_smt_width(nr_dies, nr_cores, nr_threads));
> -    topo->core_id =
> +    topo_ids->core_id =
>              (apicid >> apicid_core_offset(nr_dies, nr_cores, nr_threads)) &
>              ~(0xFFFFFFFFUL << apicid_core_width(nr_dies, nr_cores, nr_threads));
> -    topo->die_id =
> +    topo_ids->die_id =
>              (apicid >> apicid_die_offset(nr_dies, nr_cores, nr_threads)) &
>              ~(0xFFFFFFFFUL << apicid_die_width(nr_dies, nr_cores, nr_threads));
> -    topo->pkg_id = apicid >> apicid_pkg_offset(nr_dies, nr_cores, nr_threads);
> +    topo_ids->pkg_id = apicid >> apicid_pkg_offset(nr_dies, nr_cores, nr_threads);
>  }
>  
>  /* Make APIC ID for the CPU 'cpu_index'
> @@ -174,9 +174,9 @@ static inline apic_id_t x86_apicid_from_cpu_idx(unsigned nr_dies,
>                                                  unsigned nr_threads,
>                                                  unsigned cpu_index)
>  {
> -    X86CPUTopoInfo topo;
> -    x86_topo_ids_from_idx(nr_dies, nr_cores, nr_threads, cpu_index, &topo);
> -    return apicid_from_topo_ids(nr_dies, nr_cores, nr_threads, &topo);
> +    X86CPUTopoIDs topo_ids;
> +    x86_topo_ids_from_idx(nr_dies, nr_cores, nr_threads, cpu_index, &topo_ids);
> +    return apicid_from_topo_ids(nr_dies, nr_cores, nr_threads, &topo_ids);
>  }
>  
>  #endif /* HW_I386_TOPOLOGY_H */
> 
> 
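
For readers following the renamed helpers, here is a standalone sketch of
the current (pre-EPYC) scheme: a CPU index is split into topology IDs and
then packed into an APIC ID. The sketch_* names are hypothetical. The index
arithmetic is copied from x86_topo_ids_from_idx() in the hunk above; the
offsets assume each field starts where the previous one ends, with a width
of ceil(log2(count)) bits, which is how I read topology.h.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    unsigned pkg_id, die_id, core_id, smt_id;
} SketchTopoIDs;

/* Bits needed to hold 'count' distinct IDs (0 when count == 1). */
static unsigned sketch_bitwidth(unsigned count)
{
    unsigned bits = 0;

    assert(count >= 1);
    for (count -= 1; count != 0; count >>= 1) {
        bits++;
    }
    return bits;
}

/* Same arithmetic as x86_topo_ids_from_idx() in the quoted patch. */
static void sketch_ids_from_idx(unsigned nr_dies, unsigned nr_cores,
                                unsigned nr_threads, unsigned cpu_index,
                                SketchTopoIDs *ids)
{
    ids->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
    ids->die_id = cpu_index / (nr_cores * nr_threads) % nr_dies;
    ids->core_id = cpu_index / nr_threads % nr_cores;
    ids->smt_id = cpu_index % nr_threads;
}

/* Pack the IDs; field offsets are an assumption (see lead-in). */
static uint32_t sketch_apicid_from_ids(unsigned nr_dies, unsigned nr_cores,
                                       unsigned nr_threads,
                                       const SketchTopoIDs *ids)
{
    unsigned core_off = sketch_bitwidth(nr_threads);
    unsigned die_off = core_off + sketch_bitwidth(nr_cores);
    unsigned pkg_off = die_off + sketch_bitwidth(nr_dies);

    return (ids->pkg_id << pkg_off) | (ids->die_id << die_off) |
           (ids->core_id << core_off) | ids->smt_id;
}
```

With 1 die, 3 cores and 2 threads, CPU index 7 lands on socket 1, core 0,
thread 1 and gets APIC ID 9: the IDs are not contiguous when a count is not
a power of two.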




* Re: [PATCH v3 07/18] machine: Add a new function init_apicid_fn in MachineClass
  2020-01-29 16:17         ` Babu Moger
@ 2020-02-03 15:17           ` Igor Mammedov
  2020-02-03 21:49             ` Babu Moger
  0 siblings, 1 reply; 53+ messages in thread
From: Igor Mammedov @ 2020-02-03 15:17 UTC (permalink / raw)
  To: Babu Moger; +Cc: ehabkost, mst, qemu-devel, armbru, pbonzini, rth

On Wed, 29 Jan 2020 10:17:11 -0600
Babu Moger <babu.moger@amd.com> wrote:

> On 1/29/20 3:14 AM, Igor Mammedov wrote:
> > On Tue, 28 Jan 2020 13:45:31 -0600
> > Babu Moger <babu.moger@amd.com> wrote:
> >   
> >> On 1/28/20 10:29 AM, Igor Mammedov wrote:  
> >>> On Tue, 03 Dec 2019 18:37:42 -0600
> >>> Babu Moger <babu.moger@amd.com> wrote:
> >>>     
> >>>> Add a new function init_apicid_fn in MachineClass to initialize the mode
> >>>> specific handlers to decode the apic ids.
> >>>>
> >>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
> >>>> ---
> >>>>  include/hw/boards.h |    1 +
> >>>>  vl.c                |    3 +++
> >>>>  2 files changed, 4 insertions(+)
> >>>>
> >>>> diff --git a/include/hw/boards.h b/include/hw/boards.h
> >>>> index d4fab218e6..ce5aa365cb 100644
> >>>> --- a/include/hw/boards.h
> >>>> +++ b/include/hw/boards.h
> >>>> @@ -238,6 +238,7 @@ struct MachineClass {
> >>>>                                                           unsigned cpu_index);
> >>>>      const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState *machine);
> >>>>      int64_t (*get_default_cpu_node_id)(const MachineState *ms, int idx);
> >>>> +    void (*init_apicid_fn)(MachineState *ms);    
> >>> it's x86 specific, so why wasn't it put into PCMachineClass?
> >>
> >> Yes. It is x86 specific for now. I tried to make it a generic function so
> >> other OSes can use it if required (like we have done in
> >> possible_cpu_arch_ids). It initializes the functions required to build the
> >> apicid for each CPU. We need these functions much earlier in the
> >> initialization. It should be initialized before parse_numa_opts or
> >> machine_run_board_init (in vl.c), which are called from generic context. We
> >> cannot use PCMachineClass at this time.
> > 
> > could you point to specific patches in this series that require
> > apic ids being initialized before parse_numa_opts and elaborate why?
> > 
> > we already have possible_cpu_arch_ids() which could be called very
> > early and calculates APIC IDs in x86 case, so why not reuse it?  
> 
> 
> The current code(before this series) parses the numa information and then
> sequentially builds the apicid. Both are done together.
> 
> But this series separates the numa parsing and apicid generation. Numa
> parsing is done first and after that the apicid is generated. Reason is we
> need to know the number of numa nodes in advance to decode the apicid.
> 
> Look at this patch.
> https://lore.kernel.org/qemu-devel/157541988471.46157.6587693720990965800.stgit@naples-babu.amd.com/
> 
> +static inline apic_id_t apicid_from_topo_ids_epyc(X86CPUTopoInfo *topo_info,
> +                                                  const X86CPUTopoIDs *topo_ids)
> +{
> +    return (topo_ids->pkg_id  << apicid_pkg_offset_epyc(topo_info)) |
> +           (topo_ids->llc_id << apicid_llc_offset_epyc(topo_info)) |
> +           (topo_ids->die_id  << apicid_die_offset(topo_info)) |
> +           (topo_ids->core_id << apicid_core_offset(topo_info)) |
> +           topo_ids->smt_id;
> +}
> 
> 
> The function apicid_from_topo_ids_epyc builds the apicid. The new decoding adds
> llc_id (which is the numa id here) to the current decoding. The other fields
> mostly remain the same.

If llc_id is the same as the numa id, why not reuse CpuInstanceProperties::node-id
instead of the llc_id you are adding in patch 6/18?
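
To show what I mean, a sketch of the EPYC-style packing with a plain node
ID taking the place of llc_id. Everything named Sketch*/sketch_* is
hypothetical, and the offsets are my assumption of how the *_offset_epyc()
helpers are laid out (node field between die and package); they are not
copied from the series.

```c
#include <assert.h>
#include <stdint.h>

/* Counts per level; stand-in for X86CPUTopoInfo (assumption). */
typedef struct {
    unsigned nodes_per_pkg, dies_per_pkg, cores_per_die, threads_per_core;
} SketchTopoInfo;

/* IDs per level, with node_id where the series uses llc_id. */
typedef struct {
    unsigned pkg_id, node_id, die_id, core_id, smt_id;
} SketchEpycIDs;

/* Bits needed to hold 'count' distinct IDs (0 when count == 1). */
static unsigned sketch_width(unsigned count)
{
    unsigned bits = 0;

    assert(count >= 1);
    for (count -= 1; count != 0; count >>= 1) {
        bits++;
    }
    return bits;
}

/* Pack pkg | node | die | core | smt, lowest field first. */
static uint32_t sketch_apicid_epyc(const SketchTopoInfo *info,
                                   const SketchEpycIDs *ids)
{
    unsigned core_off = sketch_width(info->threads_per_core);
    unsigned die_off = core_off + sketch_width(info->cores_per_die);
    unsigned node_off = die_off + sketch_width(info->dies_per_pkg);
    unsigned pkg_off = node_off + sketch_width(info->nodes_per_pkg);

    return (ids->pkg_id << pkg_off) | (ids->node_id << node_off) |
           (ids->die_id << die_off) | (ids->core_id << core_off) |
           ids->smt_id;
}
```

With 4 nodes per package, 1 die, 8 cores per node and 2 threads, this puts
the socket at bit 6 and the node at bits 5:4, i.e. the same positions as in
the PPR text quoted below.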


> 
> 
> Details from the bug https://bugzilla.redhat.com/show_bug.cgi?id=1728166
> 
> Processor Programming Reference (PPR) for AMD Family 17h Model 01h,
> Revision B1 Processors:
> 
> """
> 2.1.10.2.1.3
> ApicId Enumeration Requirements
> Operating systems are expected to use
> Core::X86::Cpuid::SizeId[ApicIdCoreIdSize], the number of least
> significant bits in the Initial APIC ID that indicate core ID within a
> processor, in constructing per-core CPUID
> masks. Core::X86::Cpuid::SizeId[ApicIdCoreIdSize] determines the maximum
> number of cores (MNC) that the
> processor could theoretically support, not the actual number of cores that
> are actually implemented or enabled on
> the processor, as indicated by Core::X86::Cpuid::SizeId[NC].
> Each Core::X86::Apic::ApicId[ApicId] register is preset as follows:
> • ApicId[6] = Socket ID.
> • ApicId[5:4] = Node ID.
> • ApicId[3] = Logical CCX L3 complex ID
> • ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} :
> {1'b0,LogicalCoreID[1:0]}.
> """
> 
> >   
> >>  
> >>>
> >>>     
> >>>>  };
> >>>>  
> >>>>  /**
> >>>> diff --git a/vl.c b/vl.c
> >>>> index a42c24a77f..b6af604e11 100644
> >>>> --- a/vl.c
> >>>> +++ b/vl.c
> >>>> @@ -4318,6 +4318,9 @@ int main(int argc, char **argv, char **envp)
> >>>>      current_machine->cpu_type = machine_class->default_cpu_type;
> >>>>      if (cpu_option) {
> >>>>          current_machine->cpu_type = parse_cpu_option(cpu_option);
> >>>> +        if (machine_class->init_apicid_fn) {
> >>>> +            machine_class->init_apicid_fn(current_machine);
> >>>> +        }
> >>>>      }
> >>>>      parse_numa_opts(current_machine);
> >>>>  
> >>>>
> >>>>    
> >>>     
> >>  
> >   
> 




* Re: [PATCH v3 01/18] hw/i386: Rename X86CPUTopoInfo structure to X86CPUTopoIDs
  2020-02-03 15:08   ` Igor Mammedov
@ 2020-02-03 18:25     ` Babu Moger
  0 siblings, 0 replies; 53+ messages in thread
From: Babu Moger @ 2020-02-03 18:25 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: ehabkost, mst, armbru, qemu-devel, pbonzini, rth



On 2/3/20 9:08 AM, Igor Mammedov wrote:
> On Tue, 03 Dec 2019 18:37:01 -0600
> Babu Moger <babu.moger@amd.com> wrote:
> 
>> Rename few data structures related to X86 topology.  X86CPUTopoIDs will
>> have individual arch ids. Next patch introduces X86CPUTopoInfo which will
>> have all topology information(like cores, threads etc..).
> 
> On what commit series was based on?
> (it doesn't apply to master anymore)

I used git://github.com/ehabkost/qemu.git (x86-next) to generate the
patches. It may be a bit out of date by now.

> 
> 
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
>> ---
>>  hw/i386/pc.c               |   60 ++++++++++++++++++++++----------------------
>>  include/hw/i386/topology.h |   40 +++++++++++++++--------------
>>  2 files changed, 50 insertions(+), 50 deletions(-)
>>
>> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
>> index 51b72439b4..5bd2ffccb7 100644
>> --- a/hw/i386/pc.c
>> +++ b/hw/i386/pc.c
>> @@ -2212,7 +2212,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>>      int idx;
>>      CPUState *cs;
>>      CPUArchId *cpu_slot;
>> -    X86CPUTopoInfo topo;
>> +    X86CPUTopoIDs topo_ids;
>>      X86CPU *cpu = X86_CPU(dev);
>>      CPUX86State *env = &cpu->env;
>>      MachineState *ms = MACHINE(hotplug_dev);
>> @@ -2277,12 +2277,12 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>>              return;
>>          }
>>  
>> -        topo.pkg_id = cpu->socket_id;
>> -        topo.die_id = cpu->die_id;
>> -        topo.core_id = cpu->core_id;
>> -        topo.smt_id = cpu->thread_id;
>> +        topo_ids.pkg_id = cpu->socket_id;
>> +        topo_ids.die_id = cpu->die_id;
>> +        topo_ids.core_id = cpu->core_id;
>> +        topo_ids.smt_id = cpu->thread_id;
>>          cpu->apic_id = apicid_from_topo_ids(pcms->smp_dies, smp_cores,
>> -                                            smp_threads, &topo);
>> +                                            smp_threads, &topo_ids);
>>      }
>>  
>>      cpu_slot = pc_find_cpu_slot(MACHINE(pcms), cpu->apic_id, &idx);
>> @@ -2290,11 +2290,11 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>>          MachineState *ms = MACHINE(pcms);
>>  
>>          x86_topo_ids_from_apicid(cpu->apic_id, pcms->smp_dies,
>> -                                 smp_cores, smp_threads, &topo);
>> +                                 smp_cores, smp_threads, &topo_ids);
>>          error_setg(errp,
>>              "Invalid CPU [socket: %u, die: %u, core: %u, thread: %u] with"
>>              " APIC ID %" PRIu32 ", valid index range 0:%d",
>> -            topo.pkg_id, topo.die_id, topo.core_id, topo.smt_id,
>> +            topo_ids.pkg_id, topo_ids.die_id, topo_ids.core_id, topo_ids.smt_id,
>>              cpu->apic_id, ms->possible_cpus->len - 1);
>>          return;
>>      }
>> @@ -2312,34 +2312,34 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>>       * once -smp refactoring is complete and there will be CPU private
>>       * CPUState::nr_cores and CPUState::nr_threads fields instead of globals */
>>      x86_topo_ids_from_apicid(cpu->apic_id, pcms->smp_dies,
>> -                             smp_cores, smp_threads, &topo);
>> -    if (cpu->socket_id != -1 && cpu->socket_id != topo.pkg_id) {
>> +                             smp_cores, smp_threads, &topo_ids);
>> +    if (cpu->socket_id != -1 && cpu->socket_id != topo_ids.pkg_id) {
>>          error_setg(errp, "property socket-id: %u doesn't match set apic-id:"
>> -            " 0x%x (socket-id: %u)", cpu->socket_id, cpu->apic_id, topo.pkg_id);
>> +            " 0x%x (socket-id: %u)", cpu->socket_id, cpu->apic_id, topo_ids.pkg_id);
>>          return;
>>      }
>> -    cpu->socket_id = topo.pkg_id;
>> +    cpu->socket_id = topo_ids.pkg_id;
>>  
>> -    if (cpu->die_id != -1 && cpu->die_id != topo.die_id) {
>> +    if (cpu->die_id != -1 && cpu->die_id != topo_ids.die_id) {
>>          error_setg(errp, "property die-id: %u doesn't match set apic-id:"
>> -            " 0x%x (die-id: %u)", cpu->die_id, cpu->apic_id, topo.die_id);
>> +            " 0x%x (die-id: %u)", cpu->die_id, cpu->apic_id, topo_ids.die_id);
>>          return;
>>      }
>> -    cpu->die_id = topo.die_id;
>> +    cpu->die_id = topo_ids.die_id;
>>  
>> -    if (cpu->core_id != -1 && cpu->core_id != topo.core_id) {
>> +    if (cpu->core_id != -1 && cpu->core_id != topo_ids.core_id) {
>>          error_setg(errp, "property core-id: %u doesn't match set apic-id:"
>> -            " 0x%x (core-id: %u)", cpu->core_id, cpu->apic_id, topo.core_id);
>> +            " 0x%x (core-id: %u)", cpu->core_id, cpu->apic_id, topo_ids.core_id);
>>          return;
>>      }
>> -    cpu->core_id = topo.core_id;
>> +    cpu->core_id = topo_ids.core_id;
>>  
>> -    if (cpu->thread_id != -1 && cpu->thread_id != topo.smt_id) {
>> +    if (cpu->thread_id != -1 && cpu->thread_id != topo_ids.smt_id) {
>>          error_setg(errp, "property thread-id: %u doesn't match set apic-id:"
>> -            " 0x%x (thread-id: %u)", cpu->thread_id, cpu->apic_id, topo.smt_id);
>> +            " 0x%x (thread-id: %u)", cpu->thread_id, cpu->apic_id, topo_ids.smt_id);
>>          return;
>>      }
>> -    cpu->thread_id = topo.smt_id;
>> +    cpu->thread_id = topo_ids.smt_id;
>>  
>>      if (hyperv_feat_enabled(cpu, HYPERV_FEAT_VPINDEX) &&
>>          !kvm_hv_vpindex_settable()) {
>> @@ -2692,14 +2692,14 @@ pc_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
>>  
>>  static int64_t pc_get_default_cpu_node_id(const MachineState *ms, int idx)
>>  {
>> -   X86CPUTopoInfo topo;
>> +   X86CPUTopoIDs topo_ids;
>>     PCMachineState *pcms = PC_MACHINE(ms);
>>  
>>     assert(idx < ms->possible_cpus->len);
>>     x86_topo_ids_from_apicid(ms->possible_cpus->cpus[idx].arch_id,
>>                              pcms->smp_dies, ms->smp.cores,
>> -                            ms->smp.threads, &topo);
>> -   return topo.pkg_id % ms->numa_state->num_nodes;
>> +                            ms->smp.threads, &topo_ids);
>> +   return topo_ids.pkg_id % ms->numa_state->num_nodes;
>>  }
>>  
>>  static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
>> @@ -2721,24 +2721,24 @@ static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
>>                                    sizeof(CPUArchId) * max_cpus);
>>      ms->possible_cpus->len = max_cpus;
>>      for (i = 0; i < ms->possible_cpus->len; i++) {
>> -        X86CPUTopoInfo topo;
>> +        X86CPUTopoIDs topo_ids;
>>  
>>          ms->possible_cpus->cpus[i].type = ms->cpu_type;
>>          ms->possible_cpus->cpus[i].vcpus_count = 1;
>>          ms->possible_cpus->cpus[i].arch_id = x86_cpu_apic_id_from_index(pcms, i);
>>          x86_topo_ids_from_apicid(ms->possible_cpus->cpus[i].arch_id,
>>                                   pcms->smp_dies, ms->smp.cores,
>> -                                 ms->smp.threads, &topo);
>> +                                 ms->smp.threads, &topo_ids);
>>          ms->possible_cpus->cpus[i].props.has_socket_id = true;
>> -        ms->possible_cpus->cpus[i].props.socket_id = topo.pkg_id;
>> +        ms->possible_cpus->cpus[i].props.socket_id = topo_ids.pkg_id;
>>          if (pcms->smp_dies > 1) {
>>              ms->possible_cpus->cpus[i].props.has_die_id = true;
>> -            ms->possible_cpus->cpus[i].props.die_id = topo.die_id;
>> +            ms->possible_cpus->cpus[i].props.die_id = topo_ids.die_id;
>>          }
>>          ms->possible_cpus->cpus[i].props.has_core_id = true;
>> -        ms->possible_cpus->cpus[i].props.core_id = topo.core_id;
>> +        ms->possible_cpus->cpus[i].props.core_id = topo_ids.core_id;
>>          ms->possible_cpus->cpus[i].props.has_thread_id = true;
>> -        ms->possible_cpus->cpus[i].props.thread_id = topo.smt_id;
>> +        ms->possible_cpus->cpus[i].props.thread_id = topo_ids.smt_id;
>>      }
>>      return ms->possible_cpus;
>>  }
>> diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
>> index 4ff5b2da6c..6c184f3115 100644
>> --- a/include/hw/i386/topology.h
>> +++ b/include/hw/i386/topology.h
>> @@ -45,12 +45,12 @@
>>   */
>>  typedef uint32_t apic_id_t;
>>  
>> -typedef struct X86CPUTopoInfo {
>> +typedef struct X86CPUTopoIDs {
>>      unsigned pkg_id;
>>      unsigned die_id;
>>      unsigned core_id;
>>      unsigned smt_id;
>> -} X86CPUTopoInfo;
>> +} X86CPUTopoIDs;
>>  
>>  /* Return the bit width needed for 'count' IDs
>>   */
>> @@ -122,12 +122,12 @@ static inline unsigned apicid_pkg_offset(unsigned nr_dies,
>>  static inline apic_id_t apicid_from_topo_ids(unsigned nr_dies,
>>                                               unsigned nr_cores,
>>                                               unsigned nr_threads,
>> -                                             const X86CPUTopoInfo *topo)
>> +                                             const X86CPUTopoIDs *topo_ids)
>>  {
>> -    return (topo->pkg_id  << apicid_pkg_offset(nr_dies, nr_cores, nr_threads)) |
>> -           (topo->die_id  << apicid_die_offset(nr_dies, nr_cores, nr_threads)) |
>> -          (topo->core_id << apicid_core_offset(nr_dies, nr_cores, nr_threads)) |
>> -           topo->smt_id;
>> +    return (topo_ids->pkg_id  << apicid_pkg_offset(nr_dies, nr_cores, nr_threads)) |
>> +           (topo_ids->die_id  << apicid_die_offset(nr_dies, nr_cores, nr_threads)) |
>> +           (topo_ids->core_id << apicid_core_offset(nr_dies, nr_cores, nr_threads)) |
>> +           topo_ids->smt_id;
>>  }
>>  
>>  /* Calculate thread/core/package IDs for a specific topology,
>> @@ -137,12 +137,12 @@ static inline void x86_topo_ids_from_idx(unsigned nr_dies,
>>                                           unsigned nr_cores,
>>                                           unsigned nr_threads,
>>                                           unsigned cpu_index,
>> -                                         X86CPUTopoInfo *topo)
>> +                                         X86CPUTopoIDs *topo_ids)
>>  {
>> -    topo->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
>> -    topo->die_id = cpu_index / (nr_cores * nr_threads) % nr_dies;
>> -    topo->core_id = cpu_index / nr_threads % nr_cores;
>> -    topo->smt_id = cpu_index % nr_threads;
>> +    topo_ids->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
>> +    topo_ids->die_id = cpu_index / (nr_cores * nr_threads) % nr_dies;
>> +    topo_ids->core_id = cpu_index / nr_threads % nr_cores;
>> +    topo_ids->smt_id = cpu_index % nr_threads;
>>  }
>>  
>>  /* Calculate thread/core/package IDs for a specific topology,
>> @@ -152,17 +152,17 @@ static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
>>                                              unsigned nr_dies,
>>                                              unsigned nr_cores,
>>                                              unsigned nr_threads,
>> -                                            X86CPUTopoInfo *topo)
>> +                                            X86CPUTopoIDs *topo_ids)
>>  {
>> -    topo->smt_id = apicid &
>> +    topo_ids->smt_id = apicid &
>>              ~(0xFFFFFFFFUL << apicid_smt_width(nr_dies, nr_cores, nr_threads));
>> -    topo->core_id =
>> +    topo_ids->core_id =
>>              (apicid >> apicid_core_offset(nr_dies, nr_cores, nr_threads)) &
>>              ~(0xFFFFFFFFUL << apicid_core_width(nr_dies, nr_cores, nr_threads));
>> -    topo->die_id =
>> +    topo_ids->die_id =
>>              (apicid >> apicid_die_offset(nr_dies, nr_cores, nr_threads)) &
>>              ~(0xFFFFFFFFUL << apicid_die_width(nr_dies, nr_cores, nr_threads));
>> -    topo->pkg_id = apicid >> apicid_pkg_offset(nr_dies, nr_cores, nr_threads);
>> +    topo_ids->pkg_id = apicid >> apicid_pkg_offset(nr_dies, nr_cores, nr_threads);
>>  }
>>  
>>  /* Make APIC ID for the CPU 'cpu_index'
>> @@ -174,9 +174,9 @@ static inline apic_id_t x86_apicid_from_cpu_idx(unsigned nr_dies,
>>                                                  unsigned nr_threads,
>>                                                  unsigned cpu_index)
>>  {
>> -    X86CPUTopoInfo topo;
>> -    x86_topo_ids_from_idx(nr_dies, nr_cores, nr_threads, cpu_index, &topo);
>> -    return apicid_from_topo_ids(nr_dies, nr_cores, nr_threads, &topo);
>> +    X86CPUTopoIDs topo_ids;
>> +    x86_topo_ids_from_idx(nr_dies, nr_cores, nr_threads, cpu_index, &topo_ids);
>> +    return apicid_from_topo_ids(nr_dies, nr_cores, nr_threads, &topo_ids);
>>  }
>>  
>>  #endif /* HW_I386_TOPOLOGY_H */
>>
>>
> 



* Re: [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models
  2020-02-03 14:59 ` [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models Igor Mammedov
@ 2020-02-03 19:31   ` Babu Moger
  2020-02-04  8:02     ` Igor Mammedov
  0 siblings, 1 reply; 53+ messages in thread
From: Babu Moger @ 2020-02-03 19:31 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: ehabkost, mst, armbru, qemu-devel, pbonzini, rth



On 2/3/20 8:59 AM, Igor Mammedov wrote:
> On Tue, 03 Dec 2019 18:36:54 -0600
> Babu Moger <babu.moger@amd.com> wrote:
> 
>> This series fixes APIC ID encoding problems on AMD EPYC CPUs.
>> https://bugzilla.redhat.com/show_bug.cgi?id=1728166
>>
>> Currently, the APIC ID is decoded based on the sequence
>> sockets->dies->cores->threads. This works for most standard AMD and other
>> vendors' configurations, but this decoding sequence does not follow that of
>> AMD's APIC ID enumeration strictly. In some cases this can cause CPU topology
>> inconsistency.  When booting a guest VM, the kernel tries to validate the
>> topology, and finds it inconsistent with the enumeration of EPYC cpu models.
>>
>> To fix the problem we need to build the topology as per the Processor
>> Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1
>> Processors. It is available at https://www.amd.com/system/files/TechDocs/55570-B1_PUB.zip
>>
>> Here is the text from the PPR.
>> Operating systems are expected to use Core::X86::Cpuid::SizeId[ApicIdSize], the
>> number of least significant bits in the Initial APIC ID that indicate core ID
>> within a processor, in constructing per-core CPUID masks.
>> Core::X86::Cpuid::SizeId[ApicIdSize] determines the maximum number of cores
>> (MNC) that the processor could theoretically support, not the actual number of
>> cores that are actually implemented or enabled on the processor, as indicated
>> by Core::X86::Cpuid::SizeId[NC].
>> Each Core::X86::Apic::ApicId[ApicId] register is preset as follows:
>> • ApicId[6] = Socket ID.
>> • ApicId[5:4] = Node ID.
>> • ApicId[3] = Logical CCX L3 complex ID
>> • ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} : {1'b0,LogicalCoreID[1:0]}
> 
> 
> After checking out all patches and some pondering, used here approach
> looks to me too intrusive for the task at hand especially where it
> comes to generic code.
> 
> (Ignore till ==== to see suggestion how to simplify without reading
> reasoning behind it first)
> 
> Lets look for a way to simplify it a little bit.
> 
> So problem we are trying to solve,
>  1: calculate APIC IDs based on cpu type (to be more specific: for EPYC based CPUs)
>  2: it depends on knowing total number of numa nodes.
> 
> Externally workflow looks like following:
>   1. user provides -smp x,sockets,cores,...,maxcpus
>       that's used by possible_cpu_arch_ids() singleton to build list of
>       possible CPUs (which is available to user via command 'hotpluggable-cpus')
> 
>       Hook could be called very early and possible_cpus data might be
>       not complete. It builds a list of possible CPUs which user could
>       modify later.
> 
>   2.1 user uses "-numa cpu,node-id=x,..." or legacy "-numa node,node_id=x,cpus="
>       options to assign cpus to nodes, which is one way or another calling
>       machine_set_cpu_numa_node(). The later updates 'possible_cpus' list
>       with node information. It happens early when total number of nodes
>       is not available.
> 
>   2.2 user does not provide explicit node mappings for CPUs.
>       QEMU steps in and assigns possible cpus to nodes in machine_numa_finish_cpu_init()
>       (using the same machine_set_cpu_numa_node()) right before calling boards
>       specific machine init(). At that time total number of nodes is known.
> 
> In 1 -- 2.1 cases, 'arch_id' in 'possible_cpus' list doesn't have to be defined before
> boards init() is run.
> 
> In 2.2 case it calls get_default_cpu_node_id() -> x86_get_default_cpu_node_id()
> which uses arch_id calculate numa node.
> But then question is: does it have to use APIC id or could it infer 'pkg_id',
> it's after, from ms->possible_cpus->cpus[i].props data?

Not sure if I got the question right. In this case, because the numa
information is not provided, all the cpus are assigned to only one node.
The apic id is used here to get the correct pkg_id.
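
For reference, the default mapping under discussion can be sketched as
below. This is a simplified, standalone illustration and not the QEMU code
itself: default_node_from_apicid() and its pkg_offset parameter are
hypothetical stand-ins for pc_get_default_cpu_node_id() and the value
returned by apicid_pkg_offset(). It only assumes what the quoted patch
shows, namely that the node is taken as pkg_id modulo the number of nodes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: pick the default NUMA node for a CPU from its
 * APIC ID. 'pkg_offset' is the bit position at which the package
 * (socket) field starts in the APIC ID.
 */
static unsigned default_node_from_apicid(uint32_t apicid,
                                         unsigned pkg_offset,
                                         unsigned num_nodes)
{
    unsigned pkg_id;

    assert(num_nodes > 0);
    pkg_id = apicid >> pkg_offset;   /* corresponds to topo_ids.pkg_id */
    return pkg_id % num_nodes;       /* spread packages across the nodes */
}
```

With 4 cores x 2 threads per socket (so pkg_offset = 3) and two nodes,
APIC ID 0x08 belongs to socket 1 and would land on node 1, while APIC ID
0x17 belongs to socket 2 and wraps around to node 0.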

>   
> With that out of the way APIC ID will be used only during board's init(),
> so board could update possible_cpus with valid APIC IDs at the start of
> x86_cpus_init().
> 
> ====
> in nutshell it would be much easier to do following:
> 
>  1. make x86_get_default_cpu_node_id() APIC ID independent or
>     if impossible as alternative recompute APIC IDs there if cpu
>     type is EPYC based (since number of nodes is already known)
>  2. recompute APIC IDs in x86_cpus_init() if cpu type is EPYC based
> 
> this way one doesn't need to touch generic numa code, introduce
> x86 specific init_apicid_fn() hook into generic code and keep
> x86/EPYC nuances contained within x86 code only.

I was already working in a similar direction for v4:
1. We have already split the numa initialization in patch #12 (numa: Split
the numa initialization). This way we know exactly how many numa nodes
there are beforehand.
2. I am planning to remove init_apicid_fn.
3. Insert the handlers inside X86CPUDefinition.
4. The EPYC model will have its own apic id handlers. Everything else will
be initialized with the default handlers (the current default handler).
5. The function pc_possible_cpu_arch_ids will load the model definition
and initialize the PCMachineState data structure with the model-specific
handlers.

Does that sound similar to what you are thinking? Thoughts?
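
A rough sketch of what "model-specific handlers" could look like is given
below. It is illustrative only: the type and function names (TopoIDs,
TopoWidths, ApicIdEncodeFn, encode_default, encode_epyc, select_encoder)
are hypothetical and do not exist in QEMU, the die field is left out for
brevity, and selecting by a model-name prefix is just an assumption about
how the choice might be made.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct TopoIDs {      /* hypothetical, modelled on X86CPUTopoIDs */
    unsigned pkg_id, node_id, core_id, smt_id;
} TopoIDs;

typedef struct TopoWidths {   /* hypothetical bit widths of each field */
    unsigned node_width, core_width, smt_width;
} TopoWidths;

typedef uint32_t (*ApicIdEncodeFn)(const TopoWidths *w, const TopoIDs *ids);

/* Default layout: pkg | core | smt (no node field). */
static uint32_t encode_default(const TopoWidths *w, const TopoIDs *ids)
{
    return (ids->pkg_id << (w->core_width + w->smt_width)) |
           (ids->core_id << w->smt_width) |
           ids->smt_id;
}

/* EPYC-style layout: pkg | node | core | smt. */
static uint32_t encode_epyc(const TopoWidths *w, const TopoIDs *ids)
{
    unsigned node_off = w->core_width + w->smt_width;

    return (ids->pkg_id << (node_off + w->node_width)) |
           (ids->node_id << node_off) |
           (ids->core_id << w->smt_width) |
           ids->smt_id;
}

/* Pick the encoder from the CPU model name (prefix match on "EPYC"). */
static ApicIdEncodeFn select_encoder(const char *model)
{
    return strncmp(model, "EPYC", 4) == 0 ? encode_epyc : encode_default;
}
```

With widths of 2 (node), 3 (core) and 1 (smt), the EPYC-style encoder puts
the socket at bit 6 and the node at bits 5:4, which lines up with the PPR
layout quoted at the top of this mail; the default encoder simply has no
node field.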

> 
>> v3:
>>   1. Consolidated the topology information in structure X86CPUTopoInfo.
>>   2. Changed the ccx_id to llc_id as commented by upstream.
>>   3. Generalized the apic id decoding. It is mostly similar to current apic id
>>      except that it adds new field llc_id when numa configured. Removes all the
>>      hardcoded values.
>>   4. Removed the earlier parse_numa split. And moved the numa node initialization
>>      inside the numa_complete_configuration. This is bit cleaner as commented by 
>>      Eduardo.
>>   5. Added new function init_apicid_fn inside machine_class structure. This
>>      will be used to update the apic id handler specific to cpu model.
>>   6. Updated the cpuid unit tests.
>>   7. TODO : Need to figure out how to dynamically update the handlers using cpu models.
>>      I might need some guidance on that.
>>
>> v2:
>>   https://lore.kernel.org/qemu-devel/156779689013.21957.1631551572950676212.stgit@localhost.localdomain/
>>   1. Introduced the new property epyc to enable new epyc mode.
>>   2. Separated the epyc mode and non epyc mode function.
>>   3. Introduced function pointers in PCMachineState to handle the
>>      differences.
>>   4. Mildly tested different combinations to make sure things are working as expected.
>>   5. TODO : Setting the epyc feature bit needs to be worked out. This feature is
>>      supported only on AMD EPYC models. I may need some guidance on that.
>>
>> v1:
>>   https://lore.kernel.org/qemu-devel/20190731232032.51786-1-babu.moger@amd.com/
>>
>> ---
>>
>> Babu Moger (18):
>>       hw/i386: Rename X86CPUTopoInfo structure to X86CPUTopoIDs
>>       hw/i386: Introduce X86CPUTopoInfo to contain topology info
>>       hw/i386: Consolidate topology functions
>>       hw/i386: Introduce initialize_topo_info to initialize X86CPUTopoInfo
>>       machine: Add SMP Sockets in CpuTopology
>>       hw/core: Add core complex id in X86CPU topology
>>       machine: Add a new function init_apicid_fn in MachineClass
>>       hw/i386: Update structures for nodes_per_pkg
>>       i386: Add CPUX86Family type in CPUX86State
>>       hw/386: Add EPYC mode topology decoding functions
>>       i386: Cleanup and use the EPYC mode topology functions
>>       numa: Split the numa initialization
>>       hw/i386: Introduce apicid_from_cpu_idx in PCMachineState
>>       hw/i386: Introduce topo_ids_from_apicid handler PCMachineState
>>       hw/i386: Introduce apic_id_from_topo_ids handler in PCMachineState
>>       hw/i386: Introduce EPYC mode function handlers
>>       i386: Fix pkg_id offset for epyc mode
>>       tests: Update the Unit tests
>>
>>
>>  hw/core/machine-hmp-cmds.c |    3 +
>>  hw/core/machine.c          |   14 +++
>>  hw/core/numa.c             |   62 +++++++++----
>>  hw/i386/pc.c               |  132 +++++++++++++++++++---------
>>  include/hw/boards.h        |    3 +
>>  include/hw/i386/pc.h       |    9 ++
>>  include/hw/i386/topology.h |  209 +++++++++++++++++++++++++++++++-------------
>>  include/sysemu/numa.h      |    5 +
>>  qapi/machine.json          |    7 +
>>  target/i386/cpu.c          |  196 ++++++++++++-----------------------------
>>  target/i386/cpu.h          |    9 ++
>>  tests/test-x86-cpuid.c     |  115 ++++++++++++++----------
>>  vl.c                       |    4 +
>>  13 files changed, 455 insertions(+), 313 deletions(-)
>>
>> --
>>
> 



* Re: [PATCH v3 07/18] machine: Add a new function init_apicid_fn in MachineClass
  2020-02-03 15:17           ` Igor Mammedov
@ 2020-02-03 21:49             ` Babu Moger
  2020-02-04  7:38               ` Igor Mammedov
  0 siblings, 1 reply; 53+ messages in thread
From: Babu Moger @ 2020-02-03 21:49 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: ehabkost, mst, qemu-devel, armbru, pbonzini, rth



On 2/3/20 9:17 AM, Igor Mammedov wrote:
> On Wed, 29 Jan 2020 10:17:11 -0600
> Babu Moger <babu.moger@amd.com> wrote:
> 
>> On 1/29/20 3:14 AM, Igor Mammedov wrote:
>>> On Tue, 28 Jan 2020 13:45:31 -0600
>>> Babu Moger <babu.moger@amd.com> wrote:
>>>   
>>>> On 1/28/20 10:29 AM, Igor Mammedov wrote:  
>>>>> On Tue, 03 Dec 2019 18:37:42 -0600
>>>>> Babu Moger <babu.moger@amd.com> wrote:
>>>>>     
>>>>>> Add a new function init_apicid_fn in MachineClass to initialize the mode
>>>>>> specific handlers to decode the apic ids.
>>>>>>
>>>>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>>>>> ---
>>>>>>  include/hw/boards.h |    1 +
>>>>>>  vl.c                |    3 +++
>>>>>>  2 files changed, 4 insertions(+)
>>>>>>
>>>>>> diff --git a/include/hw/boards.h b/include/hw/boards.h
>>>>>> index d4fab218e6..ce5aa365cb 100644
>>>>>> --- a/include/hw/boards.h
>>>>>> +++ b/include/hw/boards.h
>>>>>> @@ -238,6 +238,7 @@ struct MachineClass {
>>>>>>                                                           unsigned cpu_index);
>>>>>>      const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState *machine);
>>>>>>      int64_t (*get_default_cpu_node_id)(const MachineState *ms, int idx);
>>>>>> +    void (*init_apicid_fn)(MachineState *ms);    
>>>>> it's x86 specific, so why wasn't it put into PCMachineClass?
>>>>
>>>> Yes. It is x86 specific for now. I tried to make it generic function so
>>>> other OSes can use it if required(like we have done in
>>>> possible_cpu_arch_ids). It initializes functions required to build the
>>>> apicid for each CPUs. We need these functions much early in the
>>>> initialization. It should be initialized before parse_numa_opts or
>>>> machine_run_board_init(in vl.c) which are called from generic context. We
>>>> cannot use PCMachineClass at this time.  
>>>
>>> could you point to specific patches in this series that require
>>> apic ids being initialized before parse_numa_opts and elaborate why?
>>>
>>> we already have possible_cpu_arch_ids() which could be called very
>>> early and calculates APIC IDs in x86 case, so why not reuse it?  
>>
>>
>> The current code(before this series) parses the numa information and then
>> sequentially builds the apicid. Both are done together.
>>
>> But this series separates the numa parsing and apicid generation. Numa
>> parsing is done first and after that the apicid is generated. Reason is we
>> need to know the number of numa nodes in advance to decode the apicid.
>>
>> Look at this patch.
>> https://lore.kernel.org/qemu-devel/157541988471.46157.6587693720990965800.stgit@naples-babu.amd.com/
>>
>> static inline apic_id_t apicid_from_topo_ids_epyc(X86CPUTopoInfo *topo_info,
>> +                                                  const X86CPUTopoIDs *topo_ids)
>> +{
>> +    return (topo_ids->pkg_id  << apicid_pkg_offset_epyc(topo_info)) |
>> +           (topo_ids->llc_id << apicid_llc_offset_epyc(topo_info)) |
>> +           (topo_ids->die_id  << apicid_die_offset(topo_info)) |
>> +           (topo_ids->core_id << apicid_core_offset(topo_info)) |
>> +           topo_ids->smt_id;
>> +}
>>
>>
>> The function apicid_from_topo_ids_epyc builds the apicid. The new decode adds
>> llc_id (which is the numa id here) to the current decoding. The other fields
>> mostly remain the same.
> 
> If llc_id is the same as numa id, why not reuse CpuInstanceProperties::node-id
> instead of llc_id you are adding in previous patch 6/18?
> 
I tried to use that earlier, but dropped the idea as it required some
changes; I don't remember exactly which now. I am going to investigate again
whether we can use node_id for our purpose here. Will let you know if I run
into any issues.



* Re: [PATCH v3 07/18] machine: Add a new function init_apicid_fn in MachineClass
  2020-02-03 21:49             ` Babu Moger
@ 2020-02-04  7:38               ` Igor Mammedov
  0 siblings, 0 replies; 53+ messages in thread
From: Igor Mammedov @ 2020-02-04  7:38 UTC (permalink / raw)
  To: Babu Moger; +Cc: ehabkost, mst, qemu-devel, armbru, pbonzini, rth

On Mon, 3 Feb 2020 15:49:31 -0600
Babu Moger <babu.moger@amd.com> wrote:

> On 2/3/20 9:17 AM, Igor Mammedov wrote:
> > On Wed, 29 Jan 2020 10:17:11 -0600
> > Babu Moger <babu.moger@amd.com> wrote:
> >   
> >> On 1/29/20 3:14 AM, Igor Mammedov wrote:  
> >>> On Tue, 28 Jan 2020 13:45:31 -0600
> >>> Babu Moger <babu.moger@amd.com> wrote:
> >>>     
> >>>> On 1/28/20 10:29 AM, Igor Mammedov wrote:    
> >>>>> On Tue, 03 Dec 2019 18:37:42 -0600
> >>>>> Babu Moger <babu.moger@amd.com> wrote:
> >>>>>       
> >>>>>> Add a new function init_apicid_fn in MachineClass to initialize the mode
> >>>>>> specific handlers to decode the apic ids.
> >>>>>>
> >>>>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
> >>>>>> ---
> >>>>>>  include/hw/boards.h |    1 +
> >>>>>>  vl.c                |    3 +++
> >>>>>>  2 files changed, 4 insertions(+)
> >>>>>>
> >>>>>> diff --git a/include/hw/boards.h b/include/hw/boards.h
> >>>>>> index d4fab218e6..ce5aa365cb 100644
> >>>>>> --- a/include/hw/boards.h
> >>>>>> +++ b/include/hw/boards.h
> >>>>>> @@ -238,6 +238,7 @@ struct MachineClass {
> >>>>>>                                                           unsigned cpu_index);
> >>>>>>      const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState *machine);
> >>>>>>      int64_t (*get_default_cpu_node_id)(const MachineState *ms, int idx);
> >>>>>> +    void (*init_apicid_fn)(MachineState *ms);      
> >>>>> it's x86 specific, so why wasn't it put into PCMachineClass?
> >>>>
> >>>> Yes. It is x86 specific for now. I tried to make it generic function so
> >>>> other OSes can use it if required(like we have done in
> >>>> possible_cpu_arch_ids). It initializes functions required to build the
> >>>> apicid for each CPUs. We need these functions much early in the
> >>>> initialization. It should be initialized before parse_numa_opts or
> >>>> machine_run_board_init(in vl.c) which are called from generic context. We
> >>>> cannot use PCMachineClass at this time.    
> >>>
> >>> could you point to specific patches in this series that require
> >>> apic ids being initialized before parse_numa_opts and elaborate why?
> >>>
> >>> we already have possible_cpu_arch_ids() which could be called very
> >>> early and calculates APIC IDs in x86 case, so why not reuse it?    
> >>
> >>
> >> The current code(before this series) parses the numa information and then
> >> sequentially builds the apicid. Both are done together.
> >>
> >> But this series separates the numa parsing and apicid generation. Numa
> >> parsing is done first and after that the apicid is generated. Reason is we
> >> need to know the number of numa nodes in advance to decode the apicid.
> >>
> >> Look at this patch.
> >> https://lore.kernel.org/qemu-devel/157541988471.46157.6587693720990965800.stgit@naples-babu.amd.com/
> >>
> >> static inline apic_id_t apicid_from_topo_ids_epyc(X86CPUTopoInfo *topo_info,
> >> +                                                  const X86CPUTopoIDs *topo_ids)
> >> +{
> >> +    return (topo_ids->pkg_id  << apicid_pkg_offset_epyc(topo_info)) |
> >> +           (topo_ids->llc_id << apicid_llc_offset_epyc(topo_info)) |
> >> +           (topo_ids->die_id  << apicid_die_offset(topo_info)) |
> >> +           (topo_ids->core_id << apicid_core_offset(topo_info)) |
> >> +           topo_ids->smt_id;
> >> +}
> >>
> >>
> >> The function apicid_from_topo_ids_epyc builds the apicid. The new decode adds
> >> llc_id (which is the numa id here) to the current decoding. The other fields
> >> mostly remain the same.
> > 
> > If llc_id is the same as numa id, why not reuse CpuInstanceProperties::node-id
> > instead of llc_id you are adding in previous patch 6/18?
> >   
> I tried to use that earlier. But dropped the idea as it required some
> changes. Don't remember exactly now. I am going to investigate again if we
> can use the node_id for our purpose here. Will let you know if I have any
> issues.
The reason I'm asking not to add new properties here is that it expands
the interface visible to/used by management tools, and it's a maintenance
burden not only on QEMU but on the management side as well. So if you can
reuse node-id, it will work out of the box with existing users.

It should also be less confusing for us, since we don't have to keep in mind
(or figure out) that llc_id is the same as node id and wonder why the latter
wasn't used in the first place.





* Re: [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models
  2020-02-03 19:31   ` Babu Moger
@ 2020-02-04  8:02     ` Igor Mammedov
  2020-02-04 19:08       ` Babu Moger
  0 siblings, 1 reply; 53+ messages in thread
From: Igor Mammedov @ 2020-02-04  8:02 UTC (permalink / raw)
  To: Babu Moger; +Cc: ehabkost, mst, armbru, qemu-devel, pbonzini, rth

On Mon, 3 Feb 2020 13:31:29 -0600
Babu Moger <babu.moger@amd.com> wrote:

> On 2/3/20 8:59 AM, Igor Mammedov wrote:
> > On Tue, 03 Dec 2019 18:36:54 -0600
> > Babu Moger <babu.moger@amd.com> wrote:
> >   
> >> This series fixes APIC ID encoding problems on AMD EPYC CPUs.
> >> https://bugzilla.redhat.com/show_bug.cgi?id=1728166
> >>
> >> Currently, the APIC ID is decoded based on the sequence
> >> sockets->dies->cores->threads. This works for most standard AMD and other
> >> vendors' configurations, but this decoding sequence does not follow that of
> >> AMD's APIC ID enumeration strictly. In some cases this can cause CPU topology
> >> inconsistency.  When booting a guest VM, the kernel tries to validate the
> >> topology, and finds it inconsistent with the enumeration of EPYC cpu models.
> >>
> >> To fix the problem we need to build the topology as per the Processor
> >> Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1
> >> Processors. It is available at https://www.amd.com/system/files/TechDocs/55570-B1_PUB.zip
> >>
> >> Here is the text from the PPR.
> >> Operating systems are expected to use Core::X86::Cpuid::SizeId[ApicIdSize], the
> >> number of least significant bits in the Initial APIC ID that indicate core ID
> >> within a processor, in constructing per-core CPUID masks.
> >> Core::X86::Cpuid::SizeId[ApicIdSize] determines the maximum number of cores
> >> (MNC) that the processor could theoretically support, not the actual number of
> >> cores that are actually implemented or enabled on the processor, as indicated
> >> by Core::X86::Cpuid::SizeId[NC].
> >> Each Core::X86::Apic::ApicId[ApicId] register is preset as follows:
> >> • ApicId[6] = Socket ID.
> >> • ApicId[5:4] = Node ID.
> >> • ApicId[3] = Logical CCX L3 complex ID
> >> • ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} : {1'b0,LogicalCoreID[1:0]}  
> > 
> > 
> > After checking out all patches and some pondering, used here approach
> > looks to me too intrusive for the task at hand especially where it
> > comes to generic code.
> > 
> > (Ignore till ==== to see suggestion how to simplify without reading
> > reasoning behind it first)
> > 
> > Lets look for a way to simplify it a little bit.
> > 
> > So problem we are trying to solve,
> >  1: calculate APIC IDs based on cpu type (to be more specific: for EPYC based CPUs)
> >  2: it depends on knowing total number of numa nodes.
> > 
> > Externally workflow looks like following:
> >   1. user provides -smp x,sockets,cores,...,maxcpus
> >       that's used by possible_cpu_arch_ids() singleton to build list of
> >       possible CPUs (which is available to user via command 'hotpluggable-cpus')
> > 
> >       Hook could be called very early and possible_cpus data might be
> >       not complete. It builds a list of possible CPUs which user could
> >       modify later.
> > 
> >   2.1 user uses "-numa cpu,node-id=x,..." or legacy "-numa node,node_id=x,cpus="
> >       options to assign cpus to nodes, which is one way or another calling
> >       machine_set_cpu_numa_node(). The later updates 'possible_cpus' list
> >       with node information. It happens early when total number of nodes
> >       is not available.
> > 
> >   2.2 user does not provide explicit node mappings for CPUs.
> >       QEMU steps in and assigns possible cpus to nodes in machine_numa_finish_cpu_init()
> >       (using the same machine_set_cpu_numa_node()) right before calling boards
> >       specific machine init(). At that time total number of nodes is known.
> > 
> > In 1 -- 2.1 cases, 'arch_id' in 'possible_cpus' list doesn't have to be defined before
> > boards init() is run.
> > 
> > In 2.2 case it calls get_default_cpu_node_id() -> x86_get_default_cpu_node_id()
> > which uses arch_id calculate numa node.
> > But then question is: does it have to use APIC id or could it infer 'pkg_id',
> > it's after, from ms->possible_cpus->cpus[i].props data?  
> 
> Not sure if I got the question right. In this case because the numa
> information is not provided all the cpus are assigned to only one node.
> The apic id is used here to get the correct pkg_id.

The apicid was composed from the socket/core/thread[/die] tuple, which is
exactly what cpus[i].props holds.

The question is whether we can compose pkg_id directly from the same data,
without converting it to an apicid and then "reverse engineering" it back
into the original data.

Or, to ask more directly: is socket-id the same as pkg_id?


> 
> >   
> > With that out of the way APIC ID will be used only during board's init(),
> > so board could update possible_cpus with valid APIC IDs at the start of
> > x86_cpus_init().
> > 
> > ====
> > in nutshell it would be much easier to do following:
> > 
> >  1. make x86_get_default_cpu_node_id() APIC ID in-depended or
> >     if impossible as alternative recompute APIC IDs there if cpu
> >     type is EPYC based (since number of nodes is already known)
> >  2. recompute APIC IDs in x86_cpus_init() if cpu type is EPYC based
> > 
> > this way one doesn't need to touch generic numa code, introduce
> > x86 specific init_apicid_fn() hook into generic code and keep
> > x86/EPYC nuances contained within x86 code only.  
> 
> I was kind of already working in the similar direction in v4.
> 1. We already have split the numa initialization in patch #12(Split the
> numa initialization). This way we know exactly how many numa nodes are
> there before hand.

I suggest dropping that patch. It's the one that touches generic numa
code and adds more legacy-based extensions like cpu_indexes,
which I'd like to get rid of to begin with, so that only -numa cpu is left.

I don't think it's necessary to touch numa code at all for the purpose of
apicid generation, as I tried to explain above. We should be able to keep
this x86-only business.

> 2. Planning to remove init_apicid_fn
> 3. Insert the handlers inside X86CPUDefinition.
What handlers do you mean?

> 4. EPYC model will have its own apid id handlers. Everything else will be
> initialized with a default handlers(current default handler).
> 5. The function pc_possible_cpu_arch_ids will load the model definition
> and initialize the PCMachineState data structure with the model specific
> handlers.
I'm not sure what you mean here.
 
> Does that sound similar to what you are thinking. Thoughts?
If you have something to share and can push it to github, I can take a
look and check whether it has design issues, to spare you a round trip on
the list. (It won't be a proper review, but at least I can help pinpoint
the most problematic parts.)

> 
> >   
> >> v3:
> >>   1. Consolidated the topology information in structure X86CPUTopoInfo.
> >>   2. Changed the ccx_id to llc_id as commented by upstream.
> >>   3. Generalized the apic id decoding. It is mostly similar to current apic id
> >>      except that it adds new field llc_id when numa configured. Removes all the
> >>      hardcoded values.
> >>   4. Removed the earlier parse_numa split. And moved the numa node initialization
> >>      inside the numa_complete_configuration. This is bit cleaner as commented by 
> >>      Eduardo.
> >>   5. Added new function init_apicid_fn inside machine_class structure. This
> >>      will be used to update the apic id handler specific to cpu model.
> >>   6. Updated the cpuid unit tests.
> >>   7. TODO : Need to figure out how to dynamically update the handlers using cpu models.
> >>      I might some guidance on that.
> >>
> >> v2:
> >>   https://lore.kernel.org/qemu-devel/156779689013.21957.1631551572950676212.stgit@localhost.localdomain/
> >>   1. Introduced the new property epyc to enable new epyc mode.
> >>   2. Separated the epyc mode and non epyc mode function.
> >>   3. Introduced function pointers in PCMachineState to handle the
> >>      differences.
> >>   4. Mildly tested different combinations to make things are working as expected.
> >>   5. TODO : Setting the epyc feature bit needs to be worked out. This feature is
> >>      supported only on AMD EPYC models. I may need some guidance on that.
> >>
> >> v1:
> >>   https://lore.kernel.org/qemu-devel/20190731232032.51786-1-babu.moger@amd.com/
> >>
> >> ---
> >>
> >> Babu Moger (18):
> >>       hw/i386: Rename X86CPUTopoInfo structure to X86CPUTopoIDs
> >>       hw/i386: Introduce X86CPUTopoInfo to contain topology info
> >>       hw/i386: Consolidate topology functions
> >>       hw/i386: Introduce initialize_topo_info to initialize X86CPUTopoInfo
> >>       machine: Add SMP Sockets in CpuTopology
> >>       hw/core: Add core complex id in X86CPU topology
> >>       machine: Add a new function init_apicid_fn in MachineClass
> >>       hw/i386: Update structures for nodes_per_pkg
> >>       i386: Add CPUX86Family type in CPUX86State
> >>       hw/386: Add EPYC mode topology decoding functions
> >>       i386: Cleanup and use the EPYC mode topology functions
> >>       numa: Split the numa initialization
> >>       hw/i386: Introduce apicid_from_cpu_idx in PCMachineState
> >>       hw/i386: Introduce topo_ids_from_apicid handler PCMachineState
> >>       hw/i386: Introduce apic_id_from_topo_ids handler in PCMachineState
> >>       hw/i386: Introduce EPYC mode function handlers
> >>       i386: Fix pkg_id offset for epyc mode
> >>       tests: Update the Unit tests
> >>
> >>
> >>  hw/core/machine-hmp-cmds.c |    3 +
> >>  hw/core/machine.c          |   14 +++
> >>  hw/core/numa.c             |   62 +++++++++----
> >>  hw/i386/pc.c               |  132 +++++++++++++++++++---------
> >>  include/hw/boards.h        |    3 +
> >>  include/hw/i386/pc.h       |    9 ++
> >>  include/hw/i386/topology.h |  209 +++++++++++++++++++++++++++++++-------------
> >>  include/sysemu/numa.h      |    5 +
> >>  qapi/machine.json          |    7 +
> >>  target/i386/cpu.c          |  196 ++++++++++++-----------------------------
> >>  target/i386/cpu.h          |    9 ++
> >>  tests/test-x86-cpuid.c     |  115 ++++++++++++++----------
> >>  vl.c                       |    4 +
> >>  13 files changed, 455 insertions(+), 313 deletions(-)
> >>
> >> --
> >>  
> >   
> 



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models
  2020-02-04  8:02     ` Igor Mammedov
@ 2020-02-04 19:08       ` Babu Moger
  2020-02-05  9:38         ` Igor Mammedov
  0 siblings, 1 reply; 53+ messages in thread
From: Babu Moger @ 2020-02-04 19:08 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: ehabkost, mst, armbru, qemu-devel, pbonzini, rth



On 2/4/20 2:02 AM, Igor Mammedov wrote:
> On Mon, 3 Feb 2020 13:31:29 -0600
> Babu Moger <babu.moger@amd.com> wrote:
> 
>> On 2/3/20 8:59 AM, Igor Mammedov wrote:
>>> On Tue, 03 Dec 2019 18:36:54 -0600
>>> Babu Moger <babu.moger@amd.com> wrote:
>>>   
>>>> This series fixes APIC ID encoding problems on AMD EPYC CPUs.
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1728166
>>>>
>>>> Currently, the APIC ID is decoded based on the sequence
>>>> sockets->dies->cores->threads. This works for most standard AMD and other
>>>> vendors' configurations, but this decoding sequence does not follow that of
>>>> AMD's APIC ID enumeration strictly. In some cases this can cause CPU topology
>>>> inconsistency.  When booting a guest VM, the kernel tries to validate the
>>>> topology, and finds it inconsistent with the enumeration of EPYC cpu models.
>>>>
>>>> To fix the problem we need to build the topology as per the Processor
>>>> Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1
>>>> Processors. It is available at https://www.amd.com/system/files/TechDocs/55570-B1_PUB.zip
>>>>
>>>> Here is the text from the PPR.
>>>> Operating systems are expected to use Core::X86::Cpuid::SizeId[ApicIdSize], the
>>>> number of least significant bits in the Initial APIC ID that indicate core ID
>>>> within a processor, in constructing per-core CPUID masks.
>>>> Core::X86::Cpuid::SizeId[ApicIdSize] determines the maximum number of cores
>>>> (MNC) that the processor could theoretically support, not the actual number of
>>>> cores that are actually implemented or enabled on the processor, as indicated
>>>> by Core::X86::Cpuid::SizeId[NC].
>>>> Each Core::X86::Apic::ApicId[ApicId] register is preset as follows:
>>>> • ApicId[6] = Socket ID.
>>>> • ApicId[5:4] = Node ID.
>>>> • ApicId[3] = Logical CCX L3 complex ID
>>>> • ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} : {1'b0,LogicalCoreID[1:0]}  
>>>
>>>
>>> After checking out all patches and some pondering, used here approach
>>> looks to me too intrusive for the task at hand especially where it
>>> comes to generic code.
>>>
>>> (Ignore till ==== to see suggestion how to simplify without reading
>>> reasoning behind it first)
>>>
>>> Lets look for a way to simplify it a little bit.
>>>
>>> So problem we are trying to solve,
>>>  1: calculate APIC IDs based on cpu type (to e more specific: for EPYC based CPUs)
>>>  2: it depends on knowing total number of numa nodes.
>>>
>>> Externally workflow looks like following:
>>>   1. user provides -smp x,sockets,cores,...,maxcpus
>>>       that's used by possible_cpu_arch_ids() singleton to build list of
>>>       possible CPUs (which is available to user via command 'hotpluggable-cpus')
>>>
>>>       Hook could be called very early and possible_cpus data might be
>>>       not complete. It builds a list of possible CPUs which user could
>>>       modify later.
>>>
>>>   2.1 user uses "-numa cpu,node-id=x,..." or legacy "-numa node,node_id=x,cpus="
>>>       options to assign cpus to nodes, which is one way or another calling
>>>       machine_set_cpu_numa_node(). The later updates 'possible_cpus' list
>>>       with node information. It happens early when total number of nodes
>>>       is not available.
>>>
>>>   2.2 user does not provide explicit node mappings for CPUs.
>>>       QEMU steps in and assigns possible cpus to nodes in machine_numa_finish_cpu_init()
>>>       (using the same machine_set_cpu_numa_node()) right before calling boards
>>>       specific machine init(). At that time total number of nodes is known.
>>>
>>> In 1 -- 2.1 cases, 'arch_id' in 'possible_cpus' list doesn't have to be defined before
>>> boards init() is run.

In the case of 2.1, we need to have the arch_id already generated. This is
done inside possible_cpu_arch_ids. The arch_id is used by
machine_set_cpu_numa_node to assign the cpus to the correct numa node.

If we want to move the arch_id generation into board init(), then we need
to save the cpu indexes belonging to each node somewhere.

>>>
>>> In 2.2 case it calls get_default_cpu_node_id() -> x86_get_default_cpu_node_id()
>>> which uses arch_id calculate numa node.
>>> But then question is: does it have to use APIC id or could it infer 'pkg_id',
>>> it's after, from ms->possible_cpus->cpus[i].props data?  
>>
>> Not sure if I got the question right. In this case because the numa
>> information is not provided all the cpus are assigned to only one node.
>> The apic id is used here to get the correct pkg_id.
> 
> apicid was composed from socket/core/thread[/die] tuple which cpus[i].props is.
> 
> Question is if we can compose only pkg_id based on the same data without
> converting it to apicid and then "reverse engineering" it back
> original data?

Yes. It is possible.

> 
> Or more direct question: is socket-id the same as pkg_id?

Yes. socket_id and pkg_id are the same.
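
Since socket_id and pkg_id are the same, the default node could in
principle be derived from props alone. The following is only a minimal
sketch of that idea, not QEMU code: the struct and function names are
hypothetical, and the "package id modulo number of nodes" policy is an
assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the per-CPU 'props' tuple. */
typedef struct {
    bool has_socket_id;
    int64_t socket_id;
    int64_t core_id;
    int64_t thread_id;
} SketchProps;

/*
 * Pick a default NUMA node straight from the socket id, without building
 * an APIC ID and decoding it again. The modulo policy is an assumption.
 */
static int64_t sketch_default_node_id(const SketchProps *props,
                                      int64_t nb_numa_nodes)
{
    assert(props->has_socket_id);
    assert(props->socket_id >= 0);
    assert(nb_numa_nodes > 0);
    return props->socket_id % nb_numa_nodes;
}
```

With this shape, a CPU in socket 3 of a 2-node guest would land on node 1,
and no arch_id is needed at the time the default mapping is chosen.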

> 
> 
>>
>>>   
>>> With that out of the way APIC ID will be used only during board's init(),
>>> so board could update possible_cpus with valid APIC IDs at the start of
>>> x86_cpus_init().
>>>
>>> ====
>>> in nutshell it would be much easier to do following:
>>>
>>>  1. make x86_get_default_cpu_node_id() APIC ID in-depended or
>>>     if impossible as alternative recompute APIC IDs there if cpu
>>>     type is EPYC based (since number of nodes is already known)
>>>  2. recompute APIC IDs in x86_cpus_init() if cpu type is EPYC based
>>>
>>> this way one doesn't need to touch generic numa code, introduce
>>> x86 specific init_apicid_fn() hook into generic code and keep
>>> x86/EPYC nuances contained within x86 code only.  
>>
>> I was kind of already working in the similar direction in v4.
>> 1. We already have split the numa initialization in patch #12(Split the
>> numa initialization). This way we know exactly how many numa nodes are
>> there before hand.
> 
> I suggest to drop that patch, It's the one that touches generic numa
> code and adding more legacy based extensions like cpu_indexes.
> Which I'd like to get rid of to begin with, so only -numa cpu is left.
> 
> I think it's not necessary to touch numa code at all for apicid generation
> purpose, as I tried to explain above. We should be able to keep
> this x86 only business.

This is going to be difficult without touching the generic numa code.

> 
>> 2. Planning to remove init_apicid_fn
>> 3. Insert the handlers inside X86CPUDefinition.
> what handlers do you mean?

Apicid generation logic can be separated into 3 types of handlers.
x86_apicid_from_cpu_idx: Generate apicid from cpu index.
x86_topo_ids_from_apicid: Generate topo ids from apic id.
x86_apicid_from_topo_ids: Generate apicid from topo ids.

We should be able to generate one id from the other (see topology.h).

X86CPUDefinition will have the handlers specific to each model, like the
way we have features now. The above 3 handlers will be used as the default
handlers.


The EPYC model will have its own corresponding handlers.

x86_apicid_from_cpu_idx_epyc
x86_topo_ids_from_apicid_epyc
x86_apicid_from_topo_ids_epyc.
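
To make the three conversions concrete, here is a minimal sketch of a
default handler set grouped behind function pointers. All type and
function names are hypothetical, the topology is reduced to
package/core/thread, and the field widths are simply the number of bits
needed for each count; the real helpers in topology.h may differ. An EPYC
table would fill the same three slots with its own functions.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    unsigned cores_per_pkg;
    unsigned threads_per_core;
} SketchTopoInfo;

typedef struct {
    unsigned pkg_id;
    unsigned core_id;
    unsigned smt_id;
} SketchTopoIDs;

/* Number of bits needed to hold the values 0..count-1. */
static unsigned sketch_bits_for(unsigned count)
{
    unsigned bits = 0;

    assert(count >= 1);
    while ((1u << bits) < count) {
        bits++;
    }
    return bits;
}

static uint32_t sketch_apicid_from_topo_ids(const SketchTopoInfo *info,
                                            const SketchTopoIDs *ids)
{
    unsigned smt_w = sketch_bits_for(info->threads_per_core);
    unsigned core_w = sketch_bits_for(info->cores_per_pkg);

    return (ids->pkg_id << (core_w + smt_w)) |
           (ids->core_id << smt_w) | ids->smt_id;
}

static void sketch_topo_ids_from_apicid(uint32_t apicid,
                                        const SketchTopoInfo *info,
                                        SketchTopoIDs *ids)
{
    unsigned smt_w = sketch_bits_for(info->threads_per_core);
    unsigned core_w = sketch_bits_for(info->cores_per_pkg);

    ids->smt_id = apicid & ((1u << smt_w) - 1);
    ids->core_id = (apicid >> smt_w) & ((1u << core_w) - 1);
    ids->pkg_id = apicid >> (core_w + smt_w);
}

static uint32_t sketch_apicid_from_cpu_idx(const SketchTopoInfo *info,
                                           unsigned cpu_index)
{
    SketchTopoIDs ids = {
        .pkg_id = cpu_index / (info->cores_per_pkg * info->threads_per_core),
        .core_id = cpu_index / info->threads_per_core % info->cores_per_pkg,
        .smt_id = cpu_index % info->threads_per_core,
    };

    return sketch_apicid_from_topo_ids(info, &ids);
}

/* One slot per conversion; a model-specific table has the same shape. */
typedef struct {
    uint32_t (*apicid_from_cpu_idx)(const SketchTopoInfo *, unsigned);
    void (*topo_ids_from_apicid)(uint32_t, const SketchTopoInfo *,
                                 SketchTopoIDs *);
    uint32_t (*apicid_from_topo_ids)(const SketchTopoInfo *,
                                     const SketchTopoIDs *);
} SketchApicidHandlers;

static const SketchApicidHandlers sketch_default_handlers = {
    .apicid_from_cpu_idx = sketch_apicid_from_cpu_idx,
    .topo_ids_from_apicid = sketch_topo_ids_from_apicid,
    .apicid_from_topo_ids = sketch_apicid_from_topo_ids,
};
```

Note that with 3 cores per package the core field still takes 2 bits, so
the resulting APIC IDs are not contiguous; this is why the cpu index and
the APIC ID have to be kept apart.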

> 
>> 4. EPYC model will have its own apid id handlers. Everything else will be
>> initialized with a default handlers(current default handler).
>> 5. The function pc_possible_cpu_arch_ids will load the model definition
>> and initialize the PCMachineState data structure with the model specific
>> handlers.
> I'm not sure what do you mean here.

PCMachineState will have the function pointers to the above handlers.
I was going to load the correct handler based on the mode type.

>  
>> Does that sound similar to what you are thinking. Thoughts?
> If you have something to share and can push it on github,
> I can look at, whether it has design issues to spare you a round trip on a list.
> (it won't be proper review but at least I can help to pinpoint most problematic parts)
> 
My code for the current approach is kind of ready (yet to be tested). I can
send it as v3.1 if you want to look. Or we can wait for our discussion to
settle. I will post it after our discussion.


There is one more problem we need to address. I was going to address it
later in v4 or v5.

This works
-numa node,nodeid=0,cpus=0-3 -numa node,nodeid=1,cpus=4-7

This does not work
-numa node,nodeid=0,cpus=0-5 -numa node,nodeid=1,cpus=6-7

This requires the generic code to pass the node information to the x86
code which requires some handler changes. I was thinking my code will
simplify the changes to address this issue.

>>
>>>   
>>>> v3:
>>>>   1. Consolidated the topology information in structure X86CPUTopoInfo.
>>>>   2. Changed the ccx_id to llc_id as commented by upstream.
>>>>   3. Generalized the apic id decoding. It is mostly similar to current apic id
>>>>      except that it adds new field llc_id when numa configured. Removes all the
>>>>      hardcoded values.
>>>>   4. Removed the earlier parse_numa split. And moved the numa node initialization
>>>>      inside the numa_complete_configuration. This is bit cleaner as commented by 
>>>>      Eduardo.
>>>>   5. Added new function init_apicid_fn inside machine_class structure. This
>>>>      will be used to update the apic id handler specific to cpu model.
>>>>   6. Updated the cpuid unit tests.
>>>>   7. TODO : Need to figure out how to dynamically update the handlers using cpu models.
>>>>      I might some guidance on that.
>>>>
>>>> v2:
>>>>   https://lore.kernel.org/qemu-devel/156779689013.21957.1631551572950676212.stgit@localhost.localdomain/
>>>>   1. Introduced the new property epyc to enable new epyc mode.
>>>>   2. Separated the epyc mode and non epyc mode function.
>>>>   3. Introduced function pointers in PCMachineState to handle the
>>>>      differences.
>>>>   4. Mildly tested different combinations to make things are working as expected.
>>>>   5. TODO : Setting the epyc feature bit needs to be worked out. This feature is
>>>>      supported only on AMD EPYC models. I may need some guidance on that.
>>>>
>>>> v1:
>>>>   https://lore.kernel.org/qemu-devel/20190731232032.51786-1-babu.moger@amd.com/
>>>>
>>>> ---
>>>>
>>>> Babu Moger (18):
>>>>       hw/i386: Rename X86CPUTopoInfo structure to X86CPUTopoIDs
>>>>       hw/i386: Introduce X86CPUTopoInfo to contain topology info
>>>>       hw/i386: Consolidate topology functions
>>>>       hw/i386: Introduce initialize_topo_info to initialize X86CPUTopoInfo
>>>>       machine: Add SMP Sockets in CpuTopology
>>>>       hw/core: Add core complex id in X86CPU topology
>>>>       machine: Add a new function init_apicid_fn in MachineClass
>>>>       hw/i386: Update structures for nodes_per_pkg
>>>>       i386: Add CPUX86Family type in CPUX86State
>>>>       hw/386: Add EPYC mode topology decoding functions
>>>>       i386: Cleanup and use the EPYC mode topology functions
>>>>       numa: Split the numa initialization
>>>>       hw/i386: Introduce apicid_from_cpu_idx in PCMachineState
>>>>       hw/i386: Introduce topo_ids_from_apicid handler PCMachineState
>>>>       hw/i386: Introduce apic_id_from_topo_ids handler in PCMachineState
>>>>       hw/i386: Introduce EPYC mode function handlers
>>>>       i386: Fix pkg_id offset for epyc mode
>>>>       tests: Update the Unit tests
>>>>
>>>>
>>>>  hw/core/machine-hmp-cmds.c |    3 +
>>>>  hw/core/machine.c          |   14 +++
>>>>  hw/core/numa.c             |   62 +++++++++----
>>>>  hw/i386/pc.c               |  132 +++++++++++++++++++---------
>>>>  include/hw/boards.h        |    3 +
>>>>  include/hw/i386/pc.h       |    9 ++
>>>>  include/hw/i386/topology.h |  209 +++++++++++++++++++++++++++++++-------------
>>>>  include/sysemu/numa.h      |    5 +
>>>>  qapi/machine.json          |    7 +
>>>>  target/i386/cpu.c          |  196 ++++++++++++-----------------------------
>>>>  target/i386/cpu.h          |    9 ++
>>>>  tests/test-x86-cpuid.c     |  115 ++++++++++++++----------
>>>>  vl.c                       |    4 +
>>>>  13 files changed, 455 insertions(+), 313 deletions(-)
>>>>
>>>> --
>>>>  
>>>   
>>
> 


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models
  2020-02-04 19:08       ` Babu Moger
@ 2020-02-05  9:38         ` Igor Mammedov
  2020-02-05 16:10           ` Babu Moger
  0 siblings, 1 reply; 53+ messages in thread
From: Igor Mammedov @ 2020-02-05  9:38 UTC (permalink / raw)
  To: Babu Moger; +Cc: ehabkost, mst, armbru, qemu-devel, pbonzini, rth

On Tue, 4 Feb 2020 13:08:58 -0600
Babu Moger <babu.moger@amd.com> wrote:

> On 2/4/20 2:02 AM, Igor Mammedov wrote:
> > On Mon, 3 Feb 2020 13:31:29 -0600
> > Babu Moger <babu.moger@amd.com> wrote:
> >   
> >> On 2/3/20 8:59 AM, Igor Mammedov wrote:  
> >>> On Tue, 03 Dec 2019 18:36:54 -0600
> >>> Babu Moger <babu.moger@amd.com> wrote:
> >>>     
> >>>> This series fixes APIC ID encoding problems on AMD EPYC CPUs.
> >>>> https://bugzilla.redhat.com/show_bug.cgi?id=1728166
> >>>>
> >>>> Currently, the APIC ID is decoded based on the sequence
> >>>> sockets->dies->cores->threads. This works for most standard AMD and other
> >>>> vendors' configurations, but this decoding sequence does not follow that of
> >>>> AMD's APIC ID enumeration strictly. In some cases this can cause CPU topology
> >>>> inconsistency.  When booting a guest VM, the kernel tries to validate the
> >>>> topology, and finds it inconsistent with the enumeration of EPYC cpu models.
> >>>>
> >>>> To fix the problem we need to build the topology as per the Processor
> >>>> Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1
> >>>> Processors. It is available at https://www.amd.com/system/files/TechDocs/55570-B1_PUB.zip
> >>>>
> >>>> Here is the text from the PPR.
> >>>> Operating systems are expected to use Core::X86::Cpuid::SizeId[ApicIdSize], the
> >>>> number of least significant bits in the Initial APIC ID that indicate core ID
> >>>> within a processor, in constructing per-core CPUID masks.
> >>>> Core::X86::Cpuid::SizeId[ApicIdSize] determines the maximum number of cores
> >>>> (MNC) that the processor could theoretically support, not the actual number of
> >>>> cores that are actually implemented or enabled on the processor, as indicated
> >>>> by Core::X86::Cpuid::SizeId[NC].
> >>>> Each Core::X86::Apic::ApicId[ApicId] register is preset as follows:
> >>>> • ApicId[6] = Socket ID.
> >>>> • ApicId[5:4] = Node ID.
> >>>> • ApicId[3] = Logical CCX L3 complex ID
> >>>> • ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} : {1'b0,LogicalCoreID[1:0]}    
> >>>
> >>>
> >>> After checking out all patches and some pondering, used here approach
> >>> looks to me too intrusive for the task at hand especially where it
> >>> comes to generic code.
> >>>
> >>> (Ignore till ==== to see suggestion how to simplify without reading
> >>> reasoning behind it first)
> >>>
> >>> Lets look for a way to simplify it a little bit.
> >>>
> >>> So problem we are trying to solve,
> >>>  1: calculate APIC IDs based on cpu type (to e more specific: for EPYC based CPUs)
> >>>  2: it depends on knowing total number of numa nodes.
> >>>
> >>> Externally workflow looks like following:
> >>>   1. user provides -smp x,sockets,cores,...,maxcpus
> >>>       that's used by possible_cpu_arch_ids() singleton to build list of
> >>>       possible CPUs (which is available to user via command 'hotpluggable-cpus')
> >>>
> >>>       Hook could be called very early and possible_cpus data might be
> >>>       not complete. It builds a list of possible CPUs which user could
> >>>       modify later.
> >>>
> >>>   2.1 user uses "-numa cpu,node-id=x,..." or legacy "-numa node,node_id=x,cpus="
> >>>       options to assign cpus to nodes, which is one way or another calling
> >>>       machine_set_cpu_numa_node(). The later updates 'possible_cpus' list
> >>>       with node information. It happens early when total number of nodes
> >>>       is not available.
> >>>
> >>>   2.2 user does not provide explicit node mappings for CPUs.
> >>>       QEMU steps in and assigns possible cpus to nodes in machine_numa_finish_cpu_init()
> >>>       (using the same machine_set_cpu_numa_node()) right before calling boards
> >>>       specific machine init(). At that time total number of nodes is known.
> >>>
> >>> In 1 -- 2.1 cases, 'arch_id' in 'possible_cpus' list doesn't have to be defined before
> >>> boards init() is run.  
> 
> In case of 2.1, we need to have the arch_id already generated. This is
> done inside possible_cpu_arch_ids. The arch_id is used by
> machine_set_cpu_numa_node to assign the cpus to correct numa node.

I might have missed something, but I don't see arch_id itself being used in
machine_set_cpu_numa_node(). It only uses the props part of possible_cpus.
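
As an illustration of that point, here is a minimal sketch of node
assignment that matches on props only. The types and the function are
hypothetical stand-ins, not the actual QEMU definitions; the sketch
assumes that a field left unset in the request acts as a wildcard.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down model of one possible_cpus entry. */
typedef struct {
    bool has_socket_id;  int64_t socket_id;
    bool has_core_id;    int64_t core_id;
    bool has_thread_id;  int64_t thread_id;
    bool has_node_id;    int64_t node_id;
} SketchCpuProps;

typedef struct {
    uint64_t arch_id;          /* never read by the matcher below */
    SketchCpuProps props;
} SketchCpuSlot;

/*
 * Assign node_id to every slot whose props match 'want'. A field that is
 * not set in 'want' matches anything. Returns the number of slots updated.
 */
static size_t sketch_set_cpu_numa_node(SketchCpuSlot *slots, size_t len,
                                       const SketchCpuProps *want,
                                       int64_t node_id)
{
    size_t matched = 0;

    assert(node_id >= 0);
    for (size_t i = 0; i < len; i++) {
        SketchCpuProps *p = &slots[i].props;

        if (want->has_socket_id && want->socket_id != p->socket_id) {
            continue;
        }
        if (want->has_core_id && want->core_id != p->core_id) {
            continue;
        }
        if (want->has_thread_id && want->thread_id != p->thread_id) {
            continue;
        }
        p->node_id = node_id;
        p->has_node_id = true;
        matched++;
    }
    return matched;
}
```

In this model the arch_id field can stay uninitialized until board init,
because the mapping is keyed entirely on the socket/core/thread tuple.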

 
> If we want to move the arch_id generation into board init(), then we need
> to save the cpu indexes belonging to each node somewhere.

When cpus are assigned explicitly, the decision about which cpus go to
which nodes is up to the user, and the user-configured mapping is stored in
MachineState::possible_cpus, which is accessed via the
possible_cpu_arch_ids() callback.
Hence I don't see any reason to touch cpu indexes.

> 
> >>>
> >>> In 2.2 case it calls get_default_cpu_node_id() -> x86_get_default_cpu_node_id()
> >>> which uses arch_id calculate numa node.
> >>> But then question is: does it have to use APIC id or could it infer 'pkg_id',
> >>> it's after, from ms->possible_cpus->cpus[i].props data?    
> >>
> >> Not sure if I got the question right. In this case because the numa
> >> information is not provided all the cpus are assigned to only one node.
> >> The apic id is used here to get the correct pkg_id.  
> > 
> > apicid was composed from socket/core/thread[/die] tuple which cpus[i].props is.
> > 
> > Question is if we can compose only pkg_id based on the same data without
> > converting it to apicid and then "reverse engineering" it back
> > original data?  
> 
> Yes. It is possible.
> 
> > 
> > Or more direct question: is socket-id the same as pkg_id?  
> 
> Yes. Socket_id and pkg_id is same.
> 
> > 
> >   
> >>  
> >>>   
> >>> With that out of the way APIC ID will be used only during board's init(),
> >>> so board could update possible_cpus with valid APIC IDs at the start of
> >>> x86_cpus_init().
> >>>
> >>> ====
> >>> in nutshell it would be much easier to do following:
> >>>
> >>>  1. make x86_get_default_cpu_node_id() APIC ID in-depended or
> >>>     if impossible as alternative recompute APIC IDs there if cpu
> >>>     type is EPYC based (since number of nodes is already known)
> >>>  2. recompute APIC IDs in x86_cpus_init() if cpu type is EPYC based
> >>>
> >>> this way one doesn't need to touch generic numa code, introduce
> >>> x86 specific init_apicid_fn() hook into generic code and keep
> >>> x86/EPYC nuances contained within x86 code only.    
> >>
> >> I was kind of already working in the similar direction in v4.
> >> 1. We already have split the numa initialization in patch #12(Split the
> >> numa initialization). This way we know exactly how many numa nodes are
> >> there before hand.  
> > 
> > I suggest to drop that patch, It's the one that touches generic numa
> > code and adding more legacy based extensions like cpu_indexes.
> > Which I'd like to get rid of to begin with, so only -numa cpu is left.
> > 
> > I think it's not necessary to touch numa code at all for apicid generation
> > purpose, as I tried to explain above. We should be able to keep
> > this x86 only business.  
> 
> This is going to be difficult without touching the generic numa code.

Looking at the current code, I don't see why one would touch numa code.
Care to explain in more detail why you'd have to touch it?

> >> 2. Planning to remove init_apicid_fn
> >> 3. Insert the handlers inside X86CPUDefinition.  
> > what handlers do you mean?  
> 
> Apicid generation logic can be separated into 3 types of handlers.
> x86_apicid_from_cpu_idx: Generate apicid from cpu index.
> x86_topo_ids_from_apicid: Generate topo ids from apic id.
> x86_apicid_from_topo_ids: Generate apicid from topo ids.
> 
> We should be able to generate one id from other(you can see topology.h).
> 
> X86CPUDefinition will have the handlers specific to each model like the
> way we have features now. The above 3 handlers will be used as default
> handler.

It probably shouldn't be a part of X86CPUDefinition,
as it's the machine's responsibility to generate and set the APIC ID.

What you are doing with these topo functions in this version
looks more than enough to me.

> The EPYC model will have its corresponding handlers.
> 
> x86_apicid_from_cpu_idx_epyc
> x86_topo_ids_from_apicid_epyc
> x86_apicid_from_topo_ids_epyc.

The CPU could use callbacks, but does it have to?
I see cpu_x86_cpuid() uses these functions to decode apic_id back to topo
info and then composes various leaves based on it.
Within CPU code I'd just use
 if (i_am_epyc)
    x86_topo_ids_from_apicid_epyc()
 else
    x86_topo_ids_from_apicid()
It's easier to read, and one doesn't have to follow an
indirection chain to figure out what code is called.
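
For reference, a minimal sketch of what the EPYC branch of such an 'if'
might compute, using only the fixed Family 17h Model 01h layout from the
PPR text quoted at the top of this thread (ApicId[6] socket, [5:4] node,
[3] CCX, [2:0] core/thread). The struct and function names are
hypothetical, and the fixed bit positions are an assumption that holds
for that one layout only, not for a generalized encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    unsigned socket_id;
    unsigned node_id;
    unsigned ccx_id;
    unsigned core_id;
    unsigned thread_id;
} SketchEpycTopoIDs;

/* Decode a 7-bit APIC ID according to the fixed PPR layout. */
static void sketch_topo_ids_from_apicid_epyc(uint32_t apicid, bool smt,
                                             SketchEpycTopoIDs *ids)
{
    assert(apicid < 128);

    ids->socket_id = (apicid >> 6) & 0x1;   /* ApicId[6]   */
    ids->node_id   = (apicid >> 4) & 0x3;   /* ApicId[5:4] */
    ids->ccx_id    = (apicid >> 3) & 0x1;   /* ApicId[3]   */
    if (smt) {
        /* ApicId[2:0] = {LogicalCoreID[1:0], ThreadId} */
        ids->core_id   = (apicid >> 1) & 0x3;
        ids->thread_id = apicid & 0x1;
    } else {
        /* ApicId[2:0] = {1'b0, LogicalCoreID[1:0]} */
        ids->core_id   = apicid & 0x3;
        ids->thread_id = 0;
    }
}
```

For example, APIC ID 107 (binary 1101011) with SMT enabled decodes to
socket 1, node 2, CCX 1, core 1, thread 1.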
   
> >> 4. EPYC model will have its own apid id handlers. Everything else will be
> >> initialized with a default handlers(current default handler).
> >> 5. The function pc_possible_cpu_arch_ids will load the model definition
> >> and initialize the PCMachineState data structure with the model specific
> >> handlers.  
> > I'm not sure what do you mean here.  
> 
> PCMachineState will have the function pointers to the above handlers.
> I was going to load the correct handler based on the mode type.

It could be done like this, but considering that within the machine we need
to calculate apic_id only once, the same 'if' trick would be simpler:

x86_cpus_init() {

  if (cpu == epyc) {
     make_epyc_apic_ids(mc->possible_cpu_arch_ids(ms))
  }

  // go on with creating cpus ...
}
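
A minimal sketch of what such a one-time pass could look like, assuming
each possible CPU already carries its socket/node/CCX/core/thread ids and
that the fixed Family 17h Model 01h layout from the PPR applies. The
names SketchPossibleCpu and sketch_make_epyc_apic_ids are hypothetical
stand-ins; how node_id gets filled in is exactly the open question in
this thread and is not addressed here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, flattened model of one possible_cpus entry. */
typedef struct {
    unsigned socket_id;
    unsigned node_id;
    unsigned ccx_id;
    unsigned core_id;
    unsigned thread_id;
    uint64_t arch_id;      /* APIC ID, filled in by the pass below */
} SketchPossibleCpu;

/* Compose an APIC ID from topology ids using the fixed PPR layout. */
static uint32_t sketch_apicid_from_topo_ids_epyc(const SketchPossibleCpu *c,
                                                 bool smt)
{
    uint32_t low;

    assert(c->socket_id < 2 && c->node_id < 4 && c->ccx_id < 2);
    assert(c->core_id < 4 && c->thread_id < 2);
    assert(smt || c->thread_id == 0);

    /* ApicId[2:0]: {core, thread} with SMT, {0, core} without. */
    low = smt ? (c->core_id << 1) | c->thread_id : c->core_id;
    return (c->socket_id << 6) | (c->node_id << 4) | (c->ccx_id << 3) | low;
}

/* Recompute arch_id for every entry once, before CPUs are created. */
static void sketch_make_epyc_apic_ids(SketchPossibleCpu *cpus, size_t len,
                                      bool smt)
{
    for (size_t i = 0; i < len; i++) {
        cpus[i].arch_id = sketch_apicid_from_topo_ids_epyc(&cpus[i], smt);
    }
}
```

Because the pass runs once over the whole list, no function pointer has
to be stored in the machine state just to pick the encoding.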

> >> Does that sound similar to what you are thinking. Thoughts?  
> > If you have something to share and can push it on github,
> > I can look at, whether it has design issues to spare you a round trip on a list.
> > (it won't be proper review but at least I can help to pinpoint most problematic parts)
> >   
> My code for the current approach is kind of ready(yet to be tested). I can
> send it as v3.1 if you want to look. Or we can wait for our discussion to
> settle. I will post it after our discussion.
OK, let's wait until we finish this discussion.

> There is one more problem we need to address. I was going to address later
> in v4 or v5.
> 
> This works
> -numa node,nodeid=0,cpus=0-3 -numa node,nodeid=1,cpus=4-7
> 
> This does not work
> -numa node,nodeid=0,cpus=0-5 -numa node,nodeid=1,cpus=6-7
Is it supposed to work (i.e. can real hardware have such a topology)?

> This requires the generic code to pass the node information to the x86
> code which requires some handler changes. I was thinking my code will
> simplify the changes to address this issue.

Without more information, it's hard to comment on the issue and on whether
the extra complexity of callbacks is justified.

There could be 2 ways here: add the fixes to this series so we can see the
reason, or keep this series simple so that it solves the apic_id problem
only, and then send a second series on top of it that solves the other issue.

Considering that this series is already big/complicated enough,
personally I'd go for the 2nd option, as it's easier to describe what the
patches are doing and easier to review, which should result in reaching
consensus and merging faster.
[...]



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models
  2020-02-05  9:38         ` Igor Mammedov
@ 2020-02-05 16:10           ` Babu Moger
  2020-02-05 16:56             ` Igor Mammedov
  0 siblings, 1 reply; 53+ messages in thread
From: Babu Moger @ 2020-02-05 16:10 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: ehabkost, mst, armbru, qemu-devel, pbonzini, rth



On 2/5/20 3:38 AM, Igor Mammedov wrote:
> On Tue, 4 Feb 2020 13:08:58 -0600
> Babu Moger <babu.moger@amd.com> wrote:
> 
>> On 2/4/20 2:02 AM, Igor Mammedov wrote:
>>> On Mon, 3 Feb 2020 13:31:29 -0600
>>> Babu Moger <babu.moger@amd.com> wrote:
>>>   
>>>> On 2/3/20 8:59 AM, Igor Mammedov wrote:  
>>>>> On Tue, 03 Dec 2019 18:36:54 -0600
>>>>> Babu Moger <babu.moger@amd.com> wrote:
>>>>>     
>>>>>> This series fixes APIC ID encoding problems on AMD EPYC CPUs.
>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1728166
>>>>>>
>>>>>> Currently, the APIC ID is decoded based on the sequence
>>>>>> sockets->dies->cores->threads. This works for most standard AMD and other
>>>>>> vendors' configurations, but this decoding sequence does not follow that of
>>>>>> AMD's APIC ID enumeration strictly. In some cases this can cause CPU topology
>>>>>> inconsistency.  When booting a guest VM, the kernel tries to validate the
>>>>>> topology, and finds it inconsistent with the enumeration of EPYC cpu models.
>>>>>>
>>>>>> To fix the problem we need to build the topology as per the Processor
>>>>>> Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1
>>>>>> Processors. It is available at https://www.amd.com/system/files/TechDocs/55570-B1_PUB.zip
>>>>>>
>>>>>> Here is the text from the PPR.
>>>>>> Operating systems are expected to use Core::X86::Cpuid::SizeId[ApicIdSize], the
>>>>>> number of least significant bits in the Initial APIC ID that indicate core ID
>>>>>> within a processor, in constructing per-core CPUID masks.
>>>>>> Core::X86::Cpuid::SizeId[ApicIdSize] determines the maximum number of cores
>>>>>> (MNC) that the processor could theoretically support, not the actual number of
>>>>>> cores that are actually implemented or enabled on the processor, as indicated
>>>>>> by Core::X86::Cpuid::SizeId[NC].
>>>>>> Each Core::X86::Apic::ApicId[ApicId] register is preset as follows:
>>>>>> • ApicId[6] = Socket ID.
>>>>>> • ApicId[5:4] = Node ID.
>>>>>> • ApicId[3] = Logical CCX L3 complex ID
>>>>>> • ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} : {1'b0,LogicalCoreID[1:0]}    
>>>>>
>>>>>
>>>>> After checking out all patches and some pondering, used here approach
>>>>> looks to me too intrusive for the task at hand especially where it
>>>>> comes to generic code.
>>>>>
>>>>> (Ignore till ==== to see suggestion how to simplify without reading
>>>>> reasoning behind it first)
>>>>>
>>>>> Lets look for a way to simplify it a little bit.
>>>>>
>>>>> So problem we are trying to solve,
>>>>>  1: calculate APIC IDs based on cpu type (to e more specific: for EPYC based CPUs)
>>>>>  2: it depends on knowing total number of numa nodes.
>>>>>
>>>>> Externally workflow looks like following:
>>>>>   1. user provides -smp x,sockets,cores,...,maxcpus
>>>>>       that's used by possible_cpu_arch_ids() singleton to build list of
>>>>>       possible CPUs (which is available to user via command 'hotpluggable-cpus')
>>>>>
>>>>>       Hook could be called very early and possible_cpus data might be
>>>>>       not complete. It builds a list of possible CPUs which user could
>>>>>       modify later.
>>>>>
>>>>>   2.1 user uses "-numa cpu,node-id=x,..." or legacy "-numa node,node_id=x,cpus="
>>>>>       options to assign cpus to nodes, which is one way or another calling
>>>>>       machine_set_cpu_numa_node(). The later updates 'possible_cpus' list
>>>>>       with node information. It happens early when total number of nodes
>>>>>       is not available.
>>>>>
>>>>>   2.2 user does not provide explicit node mappings for CPUs.
>>>>>       QEMU steps in and assigns possible cpus to nodes in machine_numa_finish_cpu_init()
>>>>>       (using the same machine_set_cpu_numa_node()) right before calling boards
>>>>>       specific machine init(). At that time total number of nodes is known.
>>>>>
>>>>> In 1 -- 2.1 cases, 'arch_id' in 'possible_cpus' list doesn't have to be defined before
>>>>> boards init() is run.  
>>
>> In case of 2.1, we need to have the arch_id already generated. This is
>> done inside possible_cpu_arch_ids. The arch_id is used by
>> machine_set_cpu_numa_node to assign the cpus to correct numa node.
> 
> I might have missed something but I don't see arch_id itself being used in
> machine_set_cpu_numa_node(). It only uses props part of possible_cpus

Before calling machine_set_cpu_numa_node, we call
cpu_index_to_instance_props -> x86_cpu_index_to_props ->
possible_cpu_arch_ids -> x86_possible_cpu_arch_ids.

This sequence sets up the arch_id (in x86_cpu_apic_id_from_index) for all
the available cpus. Based on the arch_id, it also sets up the props.
These props values are then used to assign the nodes in
machine_set_cpu_numa_node.

At this point we are still parsing the numa nodes, so we don't know the
total number of numa nodes. Without that information, the arch_id
generated here will not be correct for EPYC models.

This is the reason for changing the generic numa code (patch #12, "Split the
numa initialization").
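
To show why the node information matters, here is a minimal standalone sketch
of the fixed APIC ID layout from the PPR text quoted earlier in this thread
(ApicId[6] socket, ApicId[5:4] node, ApicId[3] CCX, ApicId[2:0] core/thread).
This is not the code from the series; epyc_apicid_sketch() and its parameters
are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Compose an APIC ID following the PPR layout:
 *   bit  6    socket ID
 *   bits 5:4  node ID
 *   bit  3    logical CCX ID
 *   bits 2:0  SMT ? {core[1:0], thread} : {0, core[1:0]}
 */
static uint32_t epyc_apicid_sketch(unsigned socket_id, unsigned node_id,
                                   unsigned ccx_id, unsigned core_id,
                                   unsigned thread_id, bool smt)
{
    /* Each field must fit in the width the layout gives it. */
    assert(socket_id < 2 && node_id < 4 && ccx_id < 2);
    assert(core_id < 4 && thread_id < (smt ? 2u : 1u));

    uint32_t low = smt ? ((core_id << 1) | thread_id) : core_id;

    return (socket_id << 6) | (node_id << 4) | (ccx_id << 3) | low;
}
```

Since the node ID occupies its own bits, an ID computed before the node
layout is known cannot be trusted for these models.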

> 
>  
>> If we want to move the arch_id generation into board init(), then we need
>> to save the cpu indexes belonging to each node somewhere.
> 
> when cpus are assigned explicitly, decision what cpus go to what nodes is
> up to user and user configured mapping is stored in MachineState::possible_cpus
> which is accessed by via possible_cpu_arch_ids() callback.
> Hence I don see any reason to touch cpu indexes.

Please see my reasoning above.

> 
>>
>>>>>
>>>>> In 2.2 case it calls get_default_cpu_node_id() -> x86_get_default_cpu_node_id()
>>>>> which uses arch_id calculate numa node.
>>>>> But then question is: does it have to use APIC id or could it infer 'pkg_id',
>>>>> it's after, from ms->possible_cpus->cpus[i].props data?    
>>>>
>>>> Not sure if I got the question right. In this case because the numa
>>>> information is not provided all the cpus are assigned to only one node.
>>>> The apic id is used here to get the correct pkg_id.  
>>>
>>> apicid was composed from socket/core/thread[/die] tuple which cpus[i].props is.
>>>
>>> Question is if we can compose only pkg_id based on the same data without
>>> converting it to apicid and then "reverse engineering" it back
>>> original data?  
>>
>> Yes. It is possible.
>>
>>>
>>> Or more direct question: is socket-id the same as pkg_id?  
>>
>> Yes. Socket_id and pkg_id is same.
>>
>>>
>>>   
>>>>  
>>>>>   
>>>>> With that out of the way APIC ID will be used only during board's init(),
>>>>> so board could update possible_cpus with valid APIC IDs at the start of
>>>>> x86_cpus_init().
>>>>>
>>>>> ====
>>>>> in nutshell it would be much easier to do following:
>>>>>
>>>>>  1. make x86_get_default_cpu_node_id() APIC ID in-depended or
>>>>>     if impossible as alternative recompute APIC IDs there if cpu
>>>>>     type is EPYC based (since number of nodes is already known)
>>>>>  2. recompute APIC IDs in x86_cpus_init() if cpu type is EPYC based
>>>>>
>>>>> this way one doesn't need to touch generic numa code, introduce
>>>>> x86 specific init_apicid_fn() hook into generic code and keep
>>>>> x86/EPYC nuances contained within x86 code only.    
>>>>
>>>> I was kind of already working in the similar direction in v4.
>>>> 1. We already have split the numa initialization in patch #12(Split the
>>>> numa initialization). This way we know exactly how many numa nodes are
>>>> there before hand.  
>>>
>>> I suggest to drop that patch, It's the one that touches generic numa
>>> code and adding more legacy based extensions like cpu_indexes.
>>> Which I'd like to get rid of to begin with, so only -numa cpu is left.
>>>
>>> I think it's not necessary to touch numa code at all for apicid generation
>>> purpose, as I tried to explain above. We should be able to keep
>>> this x86 only business.  
>>
>> This is going to be difficult without touching the generic numa code.patch #12(Split the
>>>> numa initialization)
> 
> Looking at current code I don't see why one would touch numa code.
> Care to explain in more details why you'd have to touch it?

Please see the reasoning above.
> 
>>>> 2. Planning to remove init_apicid_fn
>>>> 3. Insert the handlers inside X86CPUDefinition.  
>>> what handlers do you mean?  
>>
>> Apicid generation logic can be separated into 3 types of handlers.
>> x86_apicid_from_cpu_idx: Generate apicid from cpu index.
>> x86_topo_ids_from_apicid: Generate topo ids from apic id.
>> x86_apicid_from_topo_ids: Generate apicid from topo ids.
>>
>> We should be able to generate one id from other(you can see topology.h).
>>
>> X86CPUDefinition will have the handlers specific to each model like the
>> way we have features now. The above 3 handlers will be used as default
>> handler.
> 
> it probably shouldn't be a part of X86CPUDefinition,
> as it's machines responsibility to generate and set APIC ID.
> 
> What you are doing with this topo functions in this version
> looks more that enough to me.

These are exactly the same topo functions. I am only making them the
handlers inside the X86CPUDefinition.

> 
>> The EPYC model will have its corresponding handlers.
>>
>> x86_apicid_from_cpu_idx_epyc
>> x86_topo_ids_from_apicid_epyc
>> x86_apicid_from_topo_ids_epyc.
> 
> CPU might use call backs, but does it have to?
> I see cpu_x86_cpuid() uses these functions to decode apic_id back to topo
> info and then compose various leaves based on it.
> Within CPU code I'd just use
>  if (i_am_epyc)
>     x86_topo_ids_from_apicid_epyc()
>  else
>     x86_topo_ids_from_apicid()
> it's easier to read and one doesn't have to go figure
> indirection chain to figure out what code is called.

Eduardo already commented on this idea. Anything specific to cpu models
should be part of the X86CPUDefinition. We should not compare against a
specific model here, because that does not scale. We are achieving
this by loading the model definition (similar to what we do in
x86_cpu_load_model).
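
As a rough illustration of the handler idea, here is a minimal standalone
sketch. It is not the draft patch: TopoInfo, TopoIds, CpuModelDef and the
function names are hypothetical, the default handler is a simplified
contiguous packing, and the EPYC handler assumes SMT is enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical topology types; not QEMU's actual structures. */
typedef struct TopoInfo {
    unsigned cores_per_pkg;
    unsigned threads_per_core;
} TopoInfo;

typedef struct TopoIds {
    unsigned pkg_id;
    unsigned node_id;  /* node within the package */
    unsigned core_id;  /* core within the package (default) or node (EPYC) */
    unsigned smt_id;
} TopoIds;

typedef uint32_t (*ApicIdFromTopoFn)(const TopoInfo *info, const TopoIds *ids);

/* Hypothetical per-model definition carrying an optional handler. */
typedef struct CpuModelDef {
    const char *name;
    ApicIdFromTopoFn apicid_from_topo_ids;  /* NULL: use the default */
} CpuModelDef;

/* Number of bits needed to hold values in [0, count). */
static unsigned bits_for(unsigned count)
{
    unsigned width = 0;
    while ((1u << width) < count) {
        width++;
    }
    return width;
}

/* Default: pack pkg/core/thread contiguously, ignoring the node. */
static uint32_t apicid_default(const TopoInfo *info, const TopoIds *ids)
{
    unsigned smt_w = bits_for(info->threads_per_core);
    unsigned core_w = bits_for(info->cores_per_pkg);

    return (ids->pkg_id << (smt_w + core_w)) |
           (ids->core_id << smt_w) | ids->smt_id;
}

/* EPYC sketch: socket at bit 6, node at 5:4, core at 3:1, thread at 0. */
static uint32_t apicid_epyc(const TopoInfo *info, const TopoIds *ids)
{
    (void)info;  /* fixed layout, counts are not needed */
    return (ids->pkg_id << 6) | (ids->node_id << 4) |
           (ids->core_id << 1) | ids->smt_id;
}

/* Callers ask the model definition; no model name is compared anywhere. */
static ApicIdFromTopoFn apicid_handler_for(const CpuModelDef *def)
{
    assert(def != NULL);
    return def->apicid_from_topo_ids ? def->apicid_from_topo_ids
                                     : apicid_default;
}
```

A new model with a different encoding would then only add its own handler to
its definition, and the callers stay unchanged.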

>    
>>>> 4. EPYC model will have its own apid id handlers. Everything else will be
>>>> initialized with a default handlers(current default handler).
>>>> 5. The function pc_possible_cpu_arch_ids will load the model definition
>>>> and initialize the PCMachineState data structure with the model specific
>>>> handlers.  
>>> I'm not sure what do you mean here.  
>>
>> PCMachineState will have the function pointers to the above handlers.
>> I was going to load the correct handler based on the mode type.
> 
> Could be done like this, but considering that within machine we need
> to calculate apic_id only once, the same 'if' trick would be simpler
> 
> x86_cpus_init() {
> 
>   if (cpu == epyc) {
>      make_epyc_apic_ids(mc->possible_cpu_arch_ids(ms))
>   }

Once again, this does not scale. Please see my response above.

> 
>   // go on with creating cpus ...
> }
> 
>>>> Does that sound similar to what you are thinking. Thoughts?  
>>> If you have something to share and can push it on github,
>>> I can look at, whether it has design issues to spare you a round trip on a list.
>>> (it won't be proper review but at least I can help to pinpoint most problematic parts)
>>>   
>> My code for the current approach is kind of ready(yet to be tested). I can
>> send it as v3.1 if you want to look. Or we can wait for our discussion to
>> settle. I will post it after our discussion.
> ok, lets wait till we finish this discussion

I can post my draft patch now to give you a better idea of what I am talking
about. Let me know.

> 
>> There is one more problem we need to address. I was going to address later
>> in v4 or v5.
>>
>> This works
>> -numa node,nodeid=0,cpus=0-3 -numa node,nodeid=1,cpus=4-7
>>
>> This does not work
>> -numa node,nodeid=0,cpus=0-5 -numa node,nodeid=1,cpus=6-7
> Is it supposed to work (i.e. can real hardware do such topology)?

Hardware does not support this configuration. That is why I did not think
it was serious enough to fix right now.

> 
>> This requires the generic code to pass the node information to the x86
>> code which requires some handler changes. I was thinking my code will
>> simplify the changes to address this issue.
> 
> without more information, it's hard to comment on issue and whether
> extra complexity of callbacks is justified.
> 
> There could be 2 ways here, add fixes to this series so we could see the reason
> or make this series simple to solve apic_id problem only and then on top of
> it send the second series that solves another issue.
> 
> Considering that this series is already big/complicated enough,
> personally I'd go for 2nd option. As it's easier to describe what patches are
> doing and easier to review => should result in faster reaching consensus and merging.
> [...]
> 


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models
  2020-02-05 16:10           ` Babu Moger
@ 2020-02-05 16:56             ` Igor Mammedov
  2020-02-05 19:07               ` Babu Moger
  0 siblings, 1 reply; 53+ messages in thread
From: Igor Mammedov @ 2020-02-05 16:56 UTC (permalink / raw)
  To: Babu Moger; +Cc: ehabkost, mst, armbru, qemu-devel, pbonzini, rth

On Wed, 5 Feb 2020 10:10:06 -0600
Babu Moger <babu.moger@amd.com> wrote:

> On 2/5/20 3:38 AM, Igor Mammedov wrote:
> > On Tue, 4 Feb 2020 13:08:58 -0600
> > Babu Moger <babu.moger@amd.com> wrote:
> >   
> >> On 2/4/20 2:02 AM, Igor Mammedov wrote:  
> >>> On Mon, 3 Feb 2020 13:31:29 -0600
> >>> Babu Moger <babu.moger@amd.com> wrote:
> >>>     
> >>>> On 2/3/20 8:59 AM, Igor Mammedov wrote:    
> >>>>> On Tue, 03 Dec 2019 18:36:54 -0600
> >>>>> Babu Moger <babu.moger@amd.com> wrote:
> >>>>>       
> >>>>>> This series fixes APIC ID encoding problems on AMD EPYC CPUs.
> >>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1728166
> >>>>>>
> >>>>>> Currently, the APIC ID is decoded based on the sequence
> >>>>>> sockets->dies->cores->threads. This works for most standard AMD and other
> >>>>>> vendors' configurations, but this decoding sequence does not follow that of
> >>>>>> AMD's APIC ID enumeration strictly. In some cases this can cause CPU topology
> >>>>>> inconsistency.  When booting a guest VM, the kernel tries to validate the
> >>>>>> topology, and finds it inconsistent with the enumeration of EPYC cpu models.
> >>>>>>
> >>>>>> To fix the problem we need to build the topology as per the Processor
> >>>>>> Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1
> >>>>>> Processors. It is available at https://www.amd.com/system/files/TechDocs/55570-B1_PUB.zip
> >>>>>>
> >>>>>> Here is the text from the PPR.
> >>>>>> Operating systems are expected to use Core::X86::Cpuid::SizeId[ApicIdSize], the
> >>>>>> number of least significant bits in the Initial APIC ID that indicate core ID
> >>>>>> within a processor, in constructing per-core CPUID masks.
> >>>>>> Core::X86::Cpuid::SizeId[ApicIdSize] determines the maximum number of cores
> >>>>>> (MNC) that the processor could theoretically support, not the actual number of
> >>>>>> cores that are actually implemented or enabled on the processor, as indicated
> >>>>>> by Core::X86::Cpuid::SizeId[NC].
> >>>>>> Each Core::X86::Apic::ApicId[ApicId] register is preset as follows:
> >>>>>> • ApicId[6] = Socket ID.
> >>>>>> • ApicId[5:4] = Node ID.
> >>>>>> • ApicId[3] = Logical CCX L3 complex ID
> >>>>>> • ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} : {1'b0,LogicalCoreID[1:0]}      
> >>>>>
> >>>>>
> >>>>> After checking out all patches and some pondering, used here approach
> >>>>> looks to me too intrusive for the task at hand especially where it
> >>>>> comes to generic code.
> >>>>>
> >>>>> (Ignore till ==== to see suggestion how to simplify without reading
> >>>>> reasoning behind it first)
> >>>>>
> >>>>> Lets look for a way to simplify it a little bit.
> >>>>>
> >>>>> So problem we are trying to solve,
> >>>>>  1: calculate APIC IDs based on cpu type (to e more specific: for EPYC based CPUs)
> >>>>>  2: it depends on knowing total number of numa nodes.
> >>>>>
> >>>>> Externally workflow looks like following:
> >>>>>   1. user provides -smp x,sockets,cores,...,maxcpus
> >>>>>       that's used by possible_cpu_arch_ids() singleton to build list of
> >>>>>       possible CPUs (which is available to user via command 'hotpluggable-cpus')
> >>>>>
> >>>>>       Hook could be called very early and possible_cpus data might be
> >>>>>       not complete. It builds a list of possible CPUs which user could
> >>>>>       modify later.
> >>>>>
> >>>>>   2.1 user uses "-numa cpu,node-id=x,..." or legacy "-numa node,node_id=x,cpus="
> >>>>>       options to assign cpus to nodes, which is one way or another calling
> >>>>>       machine_set_cpu_numa_node(). The later updates 'possible_cpus' list
> >>>>>       with node information. It happens early when total number of nodes
> >>>>>       is not available.
> >>>>>
> >>>>>   2.2 user does not provide explicit node mappings for CPUs.
> >>>>>       QEMU steps in and assigns possible cpus to nodes in machine_numa_finish_cpu_init()
> >>>>>       (using the same machine_set_cpu_numa_node()) right before calling boards
> >>>>>       specific machine init(). At that time total number of nodes is known.
> >>>>>
> >>>>> In 1 -- 2.1 cases, 'arch_id' in 'possible_cpus' list doesn't have to be defined before
> >>>>> boards init() is run.    
> >>
> >> In case of 2.1, we need to have the arch_id already generated. This is
> >> done inside possible_cpu_arch_ids. The arch_id is used by
> >> machine_set_cpu_numa_node to assign the cpus to correct numa node.  
> > 
> > I might have missed something but I don't see arch_id itself being used in
> > machine_set_cpu_numa_node(). It only uses props part of possible_cpus  
> 
> Before calling machine_set_cpu_numa_node, we call
> cpu_index_to_instance_props -> x86_cpu_index_to_props->
> possible_cpu_arch_ids->x86_possible_cpu_arch_ids.
> 
> This sequence sets up the arch_id(in x86_cpu_apic_id_from_index) for all
> the available cpus. Based on the arch_id, it also sets up the props.


x86_possible_cpu_arch_ids()
   arch_id = x86_cpu_apic_id_from_index(x86ms, i)
   x86_topo_ids_from_apicid(arch_id, x86ms->smp_dies, ms->smp.cores,  ms->smp.threads, &topo);
   // assign socket/die/core/thread from topo

So currently it uses an indirect way to convert an index in possible_cpus->cpus[]
to socket/die/core/thread ids.
Essentially, it takes the '-smp' options and a number in [0..max_cpus) as the
original data, converts them into an intermediate apic_id, and then reverse
engineers that back into topo info.

Why not use x86_topo_ids_from_idx() directly to get rid of the 'props' dependency on apic_id?
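
For illustration, the direct conversion could look roughly like the minimal
standalone sketch below. It is not a copy of the helper in topology.h:
TopoIds and topo_ids_from_index() are hypothetical names, and the only inputs
are the '-smp' counts and the cpu index.

```c
#include <assert.h>

/* Hypothetical stand-in for the socket/die/core/thread tuple kept in props. */
typedef struct TopoIds {
    unsigned pkg_id;
    unsigned die_id;
    unsigned core_id;
    unsigned smt_id;
} TopoIds;

/*
 * Split a linear cpu index into topology ids using only the -smp counts.
 * No APIC ID is involved, so the result does not depend on the cpu model
 * or on the number of numa nodes.
 */
static void topo_ids_from_index(unsigned nr_dies, unsigned nr_cores,
                                unsigned nr_threads, unsigned cpu_index,
                                TopoIds *topo)
{
    assert(nr_dies > 0 && nr_cores > 0 && nr_threads > 0);

    topo->smt_id = cpu_index % nr_threads;
    topo->core_id = (cpu_index / nr_threads) % nr_cores;
    topo->die_id = (cpu_index / (nr_threads * nr_cores)) % nr_dies;
    topo->pkg_id = cpu_index / (nr_threads * nr_cores * nr_dies);
}
```

With props filled in this way, the APIC ID only has to be valid by the time
the board's init() runs.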



> And these props values are used to assign the nodes in
> machine_set_cpu_numa_node.
> 
> At this point we are still parsing the numa nodes and so we don't know the
> total number of numa nodes. Without that information, the arch_id
> generated here will not be correct for EPYC models.
> 
> This is the reason for changing the generic numa code(patch #12-Split the
> numa initialization).
> 
> > 
> >    
> >> If we want to move the arch_id generation into board init(), then we need
> >> to save the cpu indexes belonging to each node somewhere.  
> > 
> > when cpus are assigned explicitly, decision what cpus go to what nodes is
> > up to user and user configured mapping is stored in MachineState::possible_cpus
> > which is accessed by via possible_cpu_arch_ids() callback.
> > Hence I don see any reason to touch cpu indexes.  
> 
> Please see my reasoning above.
> 
> >   
> >>  
> >>>>>
> >>>>> In 2.2 case it calls get_default_cpu_node_id() -> x86_get_default_cpu_node_id()
> >>>>> which uses arch_id calculate numa node.
> >>>>> But then question is: does it have to use APIC id or could it infer 'pkg_id',
> >>>>> it's after, from ms->possible_cpus->cpus[i].props data?      
> >>>>
> >>>> Not sure if I got the question right. In this case because the numa
> >>>> information is not provided all the cpus are assigned to only one node.
> >>>> The apic id is used here to get the correct pkg_id.    
> >>>
> >>> apicid was composed from socket/core/thread[/die] tuple which cpus[i].props is.
> >>>
> >>> Question is if we can compose only pkg_id based on the same data without
> >>> converting it to apicid and then "reverse engineering" it back
> >>> original data?    
> >>
> >> Yes. It is possible.
> >>  
> >>>
> >>> Or more direct question: is socket-id the same as pkg_id?    
> >>
> >> Yes. Socket_id and pkg_id is same.
> >>  
> >>>
> >>>     
> >>>>    
> >>>>>   
> >>>>> With that out of the way APIC ID will be used only during board's init(),
> >>>>> so board could update possible_cpus with valid APIC IDs at the start of
> >>>>> x86_cpus_init().
> >>>>>
> >>>>> ====
> >>>>> in nutshell it would be much easier to do following:
> >>>>>
> >>>>>  1. make x86_get_default_cpu_node_id() APIC ID in-depended or
> >>>>>     if impossible as alternative recompute APIC IDs there if cpu
> >>>>>     type is EPYC based (since number of nodes is already known)
> >>>>>  2. recompute APIC IDs in x86_cpus_init() if cpu type is EPYC based
> >>>>>
> >>>>> this way one doesn't need to touch generic numa code, introduce
> >>>>> x86 specific init_apicid_fn() hook into generic code and keep
> >>>>> x86/EPYC nuances contained within x86 code only.      
> >>>>
> >>>> I was kind of already working in the similar direction in v4.
> >>>> 1. We already have split the numa initialization in patch #12(Split the
> >>>> numa initialization). This way we know exactly how many numa nodes are
> >>>> there before hand.    
> >>>
> >>> I suggest to drop that patch, It's the one that touches generic numa
> >>> code and adding more legacy based extensions like cpu_indexes.
> >>> Which I'd like to get rid of to begin with, so only -numa cpu is left.
> >>>
> >>> I think it's not necessary to touch numa code at all for apicid generation
> >>> purpose, as I tried to explain above. We should be able to keep
> >>> this x86 only business.    
> >>
> >> This is going to be difficult without touching the generic numa code.patch #12(Split the  
> >>>> numa initialization)  
> > 
> > Looking at current code I don't see why one would touch numa code.
> > Care to explain in more details why you'd have to touch it?  
> 
> Please see the reasoning above.
> >   
> >>>> 2. Planning to remove init_apicid_fn
> >>>> 3. Insert the handlers inside X86CPUDefinition.    
> >>> what handlers do you mean?    
> >>
> >> Apicid generation logic can be separated into 3 types of handlers.
> >> x86_apicid_from_cpu_idx: Generate apicid from cpu index.
> >> x86_topo_ids_from_apicid: Generate topo ids from apic id.
> >> x86_apicid_from_topo_ids: Generate apicid from topo ids.
> >>
> >> We should be able to generate one id from other(you can see topology.h).
> >>
> >> X86CPUDefinition will have the handlers specific to each model like the
> >> way we have features now. The above 3 handlers will be used as default
> >> handler.  
> > 
> > it probably shouldn't be a part of X86CPUDefinition,
> > as it's machines responsibility to generate and set APIC ID.
> > 
> > What you are doing with this topo functions in this version
> > looks more that enough to me.  
> 
> It is all the exact same topo functions. Only making these functions as
> the handlers inside the X86CPUDefinition.
> 
> >   
> >> The EPYC model will have its corresponding handlers.
> >>
> >> x86_apicid_from_cpu_idx_epyc
> >> x86_topo_ids_from_apicid_epyc
> >> x86_apicid_from_topo_ids_epyc.  
> > 
> > CPU might use call backs, but does it have to?
> > I see cpu_x86_cpuid() uses these functions to decode apic_id back to topo
> > info and then compose various leaves based on it.
> > Within CPU code I'd just use
> >  if (i_am_epyc)
> >     x86_topo_ids_from_apicid_epyc()
> >  else
> >     x86_topo_ids_from_apicid()
> > it's easier to read and one doesn't have to go figure
> > indirection chain to figure out what code is called.  
> 
> Eduardo already commented on this idea. Anything specific to cpu models
> should be part of the X86CPUDefinition. We should not compare the specific
> model here. Comparing the specific model does not scale. We are achieving
> this by loading the model definition(similar to what we do in
> x86_cpu_load_model).

ok

> 
> >      
> >>>> 4. EPYC model will have its own apid id handlers. Everything else will be
> >>>> initialized with a default handlers(current default handler).
> >>>> 5. The function pc_possible_cpu_arch_ids will load the model definition
> >>>> and initialize the PCMachineState data structure with the model specific
> >>>> handlers.    
> >>> I'm not sure what do you mean here.    
> >>
> >> PCMachineState will have the function pointers to the above handlers.
> >> I was going to load the correct handler based on the mode type.  
> > 
> > Could be done like this, but considering that within machine we need
> > to calculate apic_id only once, the same 'if' trick would be simpler
> > 
> > x86_cpus_init() {
> > 
> >   if (cpu == epyc) {
> >      make_epyc_apic_ids(mc->possible_cpu_arch_ids(ms))
> >   }  
> 
> Once again, this does not scale. Please see my response above.
> 
> > 
> >   // go on with creating cpus ...
> > }
> >   
> >>>> Does that sound similar to what you are thinking. Thoughts?    
> >>> If you have something to share and can push it on github,
> >>> I can look at, whether it has design issues to spare you a round trip on a list.
> >>> (it won't be proper review but at least I can help to pinpoint most problematic parts)
> >>>     
> >> My code for the current approach is kind of ready(yet to be tested). I can
> >> send it as v3.1 if you want to look. Or we can wait for our discussion to
> >> settle. I will post it after our discussion.  
> > ok, lets wait till we finish this discussion  
> 
> I can post my draft patch to give you more idea about what i am talking
> about now. Let me know.
> 
> >   
> >> There is one more problem we need to address. I was going to address later
> >> in v4 or v5.
> >>
> >> This works
> >> -numa node,nodeid=0,cpus=0-3 -numa node,nodeid=1,cpus=4-7
> >>
> >> This does not work
> >> -numa node,nodeid=0,cpus=0-5 -numa node,nodeid=1,cpus=6-7  
> > Is it supposed to work (i.e. can real hardware do such topology)?  
> 
> Hardware does not support this configuration. That is why I did not think
> it is serious enough to fix this problem right now.
> 
> >   
> >> This requires the generic code to pass the node information to the x86
> >> code which requires some handler changes. I was thinking my code will
> >> simplify the changes to address this issue.  
> > 
> > without more information, it's hard to comment on issue and whether
> > extra complexity of callbacks is justified.
> > 
> > There could be 2 ways here, add fixes to this series so we could see the reason
> > or make this series simple to solve apic_id problem only and then on top of
> > it send the second series that solves another issue.
> > 
> > Considering that this series is already big/complicated enough,
> > personally I'd go for 2nd option. As it's easier to describe what patches are
> > doing and easier to review => should result in faster reaching consensus and merging.
> > [...]
> >   
> 



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models
  2020-02-05 16:56             ` Igor Mammedov
@ 2020-02-05 19:07               ` Babu Moger
  2020-02-06 13:08                 ` Igor Mammedov
  0 siblings, 1 reply; 53+ messages in thread
From: Babu Moger @ 2020-02-05 19:07 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: ehabkost, mst, armbru, qemu-devel, pbonzini, rth



On 2/5/20 10:56 AM, Igor Mammedov wrote:
> On Wed, 5 Feb 2020 10:10:06 -0600
> Babu Moger <babu.moger@amd.com> wrote:
> 
>> On 2/5/20 3:38 AM, Igor Mammedov wrote:
>>> On Tue, 4 Feb 2020 13:08:58 -0600
>>> Babu Moger <babu.moger@amd.com> wrote:
>>>   
>>>> On 2/4/20 2:02 AM, Igor Mammedov wrote:  
>>>>> On Mon, 3 Feb 2020 13:31:29 -0600
>>>>> Babu Moger <babu.moger@amd.com> wrote:
>>>>>     
>>>>>> On 2/3/20 8:59 AM, Igor Mammedov wrote:    
>>>>>>> On Tue, 03 Dec 2019 18:36:54 -0600
>>>>>>> Babu Moger <babu.moger@amd.com> wrote:
>>>>>>>       
>>>>>>>> This series fixes APIC ID encoding problems on AMD EPYC CPUs.
>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1728166
>>>>>>>>
>>>>>>>> Currently, the APIC ID is decoded based on the sequence
>>>>>>>> sockets->dies->cores->threads. This works for most standard AMD and other
>>>>>>>> vendors' configurations, but this decoding sequence does not follow that of
>>>>>>>> AMD's APIC ID enumeration strictly. In some cases this can cause CPU topology
>>>>>>>> inconsistency.  When booting a guest VM, the kernel tries to validate the
>>>>>>>> topology, and finds it inconsistent with the enumeration of EPYC cpu models.
>>>>>>>>
>>>>>>>> To fix the problem we need to build the topology as per the Processor
>>>>>>>> Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1
>>>>>>>> Processors. It is available at https://www.amd.com/system/files/TechDocs/55570-B1_PUB.zip
>>>>>>>>
>>>>>>>> Here is the text from the PPR.
>>>>>>>> Operating systems are expected to use Core::X86::Cpuid::SizeId[ApicIdSize], the
>>>>>>>> number of least significant bits in the Initial APIC ID that indicate core ID
>>>>>>>> within a processor, in constructing per-core CPUID masks.
>>>>>>>> Core::X86::Cpuid::SizeId[ApicIdSize] determines the maximum number of cores
>>>>>>>> (MNC) that the processor could theoretically support, not the actual number of
>>>>>>>> cores that are actually implemented or enabled on the processor, as indicated
>>>>>>>> by Core::X86::Cpuid::SizeId[NC].
>>>>>>>> Each Core::X86::Apic::ApicId[ApicId] register is preset as follows:
>>>>>>>> • ApicId[6] = Socket ID.
>>>>>>>> • ApicId[5:4] = Node ID.
>>>>>>>> • ApicId[3] = Logical CCX L3 complex ID
>>>>>>>> • ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} : {1'b0,LogicalCoreID[1:0]}      
>>>>>>>
>>>>>>>
>>>>>>> After checking out all patches and some pondering, the approach used here
>>>>>>> looks too intrusive to me for the task at hand, especially where it
>>>>>>> comes to generic code.
>>>>>>>
>>>>>>> (Skip to ==== to see the suggestion for how to simplify it without reading
>>>>>>> the reasoning behind it first)
>>>>>>>
>>>>>>> Let's look for a way to simplify it a little bit.
>>>>>>>
>>>>>>> So the problem we are trying to solve:
>>>>>>>  1: calculate APIC IDs based on cpu type (to be more specific: for EPYC based CPUs)
>>>>>>>  2: it depends on knowing the total number of numa nodes.
>>>>>>>
>>>>>>> Externally workflow looks like following:
>>>>>>>   1. user provides -smp x,sockets,cores,...,maxcpus
>>>>>>>       that's used by possible_cpu_arch_ids() singleton to build list of
>>>>>>>       possible CPUs (which is available to user via command 'hotpluggable-cpus')
>>>>>>>
>>>>>>>       Hook could be called very early and possible_cpus data might be
>>>>>>>       not complete. It builds a list of possible CPUs which user could
>>>>>>>       modify later.
>>>>>>>
>>>>>>>   2.1 user uses "-numa cpu,node-id=x,..." or the legacy "-numa node,nodeid=x,cpus="
>>>>>>>       options to assign cpus to nodes, which one way or another calls
>>>>>>>       machine_set_cpu_numa_node(). The latter updates the 'possible_cpus' list
>>>>>>>       with node information. It happens early, when the total number of nodes
>>>>>>>       is not available.
>>>>>>>
>>>>>>>   2.2 user does not provide explicit node mappings for CPUs.
>>>>>>>       QEMU steps in and assigns possible cpus to nodes in machine_numa_finish_cpu_init()
>>>>>>>       (using the same machine_set_cpu_numa_node()) right before calling boards
>>>>>>>       specific machine init(). At that time total number of nodes is known.
>>>>>>>
>>>>>>> In 1 -- 2.1 cases, 'arch_id' in 'possible_cpus' list doesn't have to be defined before
>>>>>>> boards init() is run.    
>>>>
>>>> In case of 2.1, we need to have the arch_id already generated. This is
>>>> done inside possible_cpu_arch_ids. The arch_id is used by
>>>> machine_set_cpu_numa_node to assign the cpus to correct numa node.  
>>>
>>> I might have missed something but I don't see arch_id itself being used in
>>> machine_set_cpu_numa_node(). It only uses props part of possible_cpus  
>>
>> Before calling machine_set_cpu_numa_node, we call
>> cpu_index_to_instance_props -> x86_cpu_index_to_props->
>> possible_cpu_arch_ids->x86_possible_cpu_arch_ids.
>>
>> This sequence sets up the arch_id(in x86_cpu_apic_id_from_index) for all
>> the available cpus. Based on the arch_id, it also sets up the props.
> 
> 
> x86_possible_cpu_arch_ids()
>    arch_id = x86_cpu_apic_id_from_index(x86ms, i)
>    x86_topo_ids_from_apicid(arch_id, x86ms->smp_dies, ms->smp.cores,  ms->smp.threads, &topo);
>    // assign socket/die/core/thread from topo
> 
> so currently it uses an indirect way to convert the index in possible_cpus->cpus[]
> to socket/die/core/thread ids.
> Essentially it takes the '-smp' options and a [0..max_cpus) number as original data,
> converts them into an intermediate apic_id and then reverse engineers it back into
> topo info.
>
> Why not use x86_topo_ids_from_idx() directly to get rid of the 'props' dependency on apic_id?

It might work. But this feels like a workaround that just delays the problem
until later. We can address this by rearranging the numa code a little bit.
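
For illustration, the two routes being compared here can be sketched in
standalone C. This is not QEMU code: the types and function names below
(TopoInfo, TopoIds, width_for, topo_ids_from_idx and so on) are hypothetical,
only loosely modeled on the helpers in topology.h, and the layout is assumed
to be the generic sockets->dies->cores->threads one with each field rounded
up to a power-of-two width.

```c
#include <assert.h>

/* Hypothetical stand-ins for the QEMU topology types; not the real API. */
typedef struct { unsigned dies, cores, threads; } TopoInfo;
typedef struct { unsigned pkg, die, core, smt; } TopoIds;

/* Number of bits needed to hold the values 0..count-1. */
static unsigned width_for(unsigned count)
{
    unsigned bits = 0;
    while ((1u << bits) < count) {
        bits++;
    }
    return bits;
}

/* Direct route: cpu index -> topology ids, with no APIC ID involved. */
static TopoIds topo_ids_from_idx(const TopoInfo *t, unsigned idx)
{
    TopoIds ids;

    assert(t->dies > 0 && t->cores > 0 && t->threads > 0);
    ids.smt = idx % t->threads;
    ids.core = (idx / t->threads) % t->cores;
    ids.die = (idx / (t->threads * t->cores)) % t->dies;
    ids.pkg = idx / (t->threads * t->cores * t->dies);
    return ids;
}

/* Indirect route, step 1: pack the ids into an APIC ID. */
static unsigned apicid_from_topo_ids(const TopoInfo *t, const TopoIds *ids)
{
    unsigned smt_w = width_for(t->threads);
    unsigned core_w = width_for(t->cores);
    unsigned die_w = width_for(t->dies);

    return (ids->pkg << (smt_w + core_w + die_w)) |
           (ids->die << (smt_w + core_w)) |
           (ids->core << smt_w) |
           ids->smt;
}

/* Indirect route, step 2: unpack the APIC ID back into ids. */
static TopoIds topo_ids_from_apicid(const TopoInfo *t, unsigned apicid)
{
    unsigned smt_w = width_for(t->threads);
    unsigned core_w = width_for(t->cores);
    unsigned die_w = width_for(t->dies);
    TopoIds ids;

    ids.smt = apicid & ((1u << smt_w) - 1);
    ids.core = (apicid >> smt_w) & ((1u << core_w) - 1);
    ids.die = (apicid >> (smt_w + core_w)) & ((1u << die_w) - 1);
    ids.pkg = apicid >> (smt_w + core_w + die_w);
    return ids;
}

/* Returns 1 if both routes yield the same ids for every cpu index. */
static int routes_agree(const TopoInfo *t, unsigned max_cpus)
{
    for (unsigned i = 0; i < max_cpus; i++) {
        TopoIds direct = topo_ids_from_idx(t, i);
        TopoIds via = topo_ids_from_apicid(t, apicid_from_topo_ids(t, &direct));

        if (direct.pkg != via.pkg || direct.die != via.die ||
            direct.core != via.core || direct.smt != via.smt) {
            return 0;
        }
    }
    return 1;
}
```

Under these assumptions the props can be filled in from the index alone, and
the packed ID only matters once something actually needs an APIC ID. Note that
the packed IDs are sparse when a count is not a power of two (with 6 cores and
2 threads, index 12 packs to 16), so the ID cannot be used as an index.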

> 
> 
> 
>> And these props values are used to assign the nodes in
>> machine_set_cpu_numa_node.
>>
>> At this point we are still parsing the numa nodes and so we don't know the
>> total number of numa nodes. Without that information, the arch_id
>> generated here will not be correct for EPYC models.
>>
>> This is the reason for changing the generic numa code(patch #12-Split the
>> numa initialization).
>>
>>>
>>>    
>>>> If we want to move the arch_id generation into board init(), then we need
>>>> to save the cpu indexes belonging to each node somewhere.  
>>>
>>> when cpus are assigned explicitly, the decision of which cpus go to which nodes is
>>> up to the user, and the user-configured mapping is stored in MachineState::possible_cpus,
>>> which is accessed via the possible_cpu_arch_ids() callback.
>>> Hence I don't see any reason to touch cpu indexes.
>>
>> Please see my reasoning above.
>>
>>>   
>>>>  
>>>>>>>
>>>>>>> In the 2.2 case it calls get_default_cpu_node_id() -> x86_get_default_cpu_node_id(),
>>>>>>> which uses arch_id to calculate the numa node.
>>>>>>> But then the question is: does it have to use the APIC ID, or could it infer the 'pkg_id'
>>>>>>> it's after from ms->possible_cpus->cpus[i].props data?
>>>>>>
>>>>>> Not sure if I got the question right. In this case because the numa
>>>>>> information is not provided all the cpus are assigned to only one node.
>>>>>> The apic id is used here to get the correct pkg_id.    
>>>>>
>>>>> apicid was composed from socket/core/thread[/die] tuple which cpus[i].props is.
>>>>>
>>>>> Question is if we can compose only pkg_id based on the same data without
>>>>> converting it to apicid and then "reverse engineering" it back
>>>>> original data?    
>>>>
>>>> Yes. It is possible.
>>>>  
>>>>>
>>>>> Or more direct question: is socket-id the same as pkg_id?    
>>>>
>>>> Yes. Socket_id and pkg_id are the same.
>>>>  
>>>>>
>>>>>     
>>>>>>    
>>>>>>>   
>>>>>>> With that out of the way APIC ID will be used only during board's init(),
>>>>>>> so board could update possible_cpus with valid APIC IDs at the start of
>>>>>>> x86_cpus_init().
>>>>>>>
>>>>>>> ====
>>>>>>> in a nutshell, it would be much easier to do the following:
>>>>>>>
>>>>>>>  1. make x86_get_default_cpu_node_id() independent of the APIC ID or,
>>>>>>>     if that is impossible, as an alternative recompute APIC IDs there if the cpu
>>>>>>>     type is EPYC based (since the number of nodes is already known)
>>>>>>>  2. recompute APIC IDs in x86_cpus_init() if the cpu type is EPYC based
>>>>>>>
>>>>>>> this way one doesn't need to touch generic numa code or introduce an
>>>>>>> x86 specific init_apicid_fn() hook into generic code, and can keep
>>>>>>> x86/EPYC nuances contained within x86 code only.
>>>>>>
>>>>>> I was kind of already working in the similar direction in v4.
>>>>>> 1. We already have split the numa initialization in patch #12(Split the
>>>>>> numa initialization). This way we know exactly how many numa nodes are
>>>>>> there before hand.    
>>>>>
>>>>> I suggest to drop that patch, It's the one that touches generic numa
>>>>> code and adding more legacy based extensions like cpu_indexes.
>>>>> Which I'd like to get rid of to begin with, so only -numa cpu is left.
>>>>>
>>>>> I think it's not necessary to touch numa code at all for apicid generation
>>>>> purpose, as I tried to explain above. We should be able to keep
>>>>> this x86 only business.    
>>>>
>>>> This is going to be difficult without touching the generic numa code
>>>> (patch #12, Split the numa initialization).
>>>
>>> Looking at current code I don't see why one would touch numa code.
>>> Care to explain in more details why you'd have to touch it?  
>>
>> Please see the reasoning above.
>>>   
>>>>>> 2. Planning to remove init_apicid_fn
>>>>>> 3. Insert the handlers inside X86CPUDefinition.    
>>>>> what handlers do you mean?    
>>>>
>>>> Apicid generation logic can be separated into 3 types of handlers.
>>>> x86_apicid_from_cpu_idx: Generate apicid from cpu index.
>>>> x86_topo_ids_from_apicid: Generate topo ids from apic id.
>>>> x86_apicid_from_topo_ids: Generate apicid from topo ids.
>>>>
>>>> We should be able to generate one id from other(you can see topology.h).
>>>>
>>>> X86CPUDefinition will have the handlers specific to each model like the
>>>> way we have features now. The above 3 handlers will be used as default
>>>> handler.  
>>>
>>> it probably shouldn't be a part of X86CPUDefinition,
>>> as it's machines responsibility to generate and set APIC ID.
>>>
>>> What you are doing with this topo functions in this version
>>> looks more that enough to me.  
>>
>> It is all the exact same topo functions. Only making these functions as
>> the handlers inside the X86CPUDefinition.
>>
>>>   
>>>> The EPYC model will have its corresponding handlers.
>>>>
>>>> x86_apicid_from_cpu_idx_epyc
>>>> x86_topo_ids_from_apicid_epyc
>>>> x86_apicid_from_topo_ids_epyc.  
>>>
>>> CPU might use call backs, but does it have to?
>>> I see cpu_x86_cpuid() uses these functions to decode apic_id back to topo
>>> info and then compose various leaves based on it.
>>> Within CPU code I'd just use
>>>  if (i_am_epyc)
>>>     x86_topo_ids_from_apicid_epyc()
>>>  else
>>>     x86_topo_ids_from_apicid()
>>> it's easier to read and one doesn't have to go figure
>>> indirection chain to figure out what code is called.  
>>
>> Eduardo already commented on this idea. Anything specific to cpu models
>> should be part of the X86CPUDefinition. We should not compare the specific
>> model here. Comparing the specific model does not scale. We are achieving
>> this by loading the model definition(similar to what we do in
>> x86_cpu_load_model).
> 
> ok
> 
>>
>>>      
>>>>>> 4. EPYC model will have its own apic id handlers. Everything else will be
>>>>>> initialized with the default handlers (the current default handlers).
>>>>>> 5. The function pc_possible_cpu_arch_ids will load the model definition
>>>>>> and initialize the PCMachineState data structure with the model specific
>>>>>> handlers.    
>>>>> I'm not sure what do you mean here.    
>>>>
>>>> PCMachineState will have the function pointers to the above handlers.
>>>> I was going to load the correct handler based on the mode type.  
>>>
>>> Could be done like this, but considering that within machine we need
>>> to calculate apic_id only once, the same 'if' trick would be simpler
>>>
>>> x86_cpus_init() {
>>>
>>>   if (cpu == epyc) {
>>>      make_epyc_apic_ids(mc->possible_cpu_arch_ids(ms))
>>>   }  
>>
>> Once again, this does not scale. Please see my response above.
>>
>>>
>>>   // go on with creating cpus ...
>>> }
>>>   
>>>>>> Does that sound similar to what you are thinking. Thoughts?    
>>>>> If you have something to share and can push it on github,
>>>>> I can look at, whether it has design issues to spare you a round trip on a list.
>>>>> (it won't be proper review but at least I can help to pinpoint most problematic parts)
>>>>>     
>>>> My code for the current approach is kind of ready(yet to be tested). I can
>>>> send it as v3.1 if you want to look. Or we can wait for our discussion to
>>>> settle. I will post it after our discussion.  
>>> ok, lets wait till we finish this discussion  
>>
>> I can post my draft patch to give you more idea about what i am talking
>> about now. Let me know.
>>
>>>   
>>>> There is one more problem we need to address. I was going to address later
>>>> in v4 or v5.
>>>>
>>>> This works
>>>> -numa node,nodeid=0,cpus=0-3 -numa node,nodeid=1,cpus=4-7
>>>>
>>>> This does not work
>>>> -numa node,nodeid=0,cpus=0-5 -numa node,nodeid=1,cpus=6-7  
>>> Is it supposed to work (i.e. can real hardware do such topology)?  
>>
>> Hardware does not support this configuration. That is why I did not think
>> it is serious enough to fix this problem right now.
>>
>>>   
>>>> This requires the generic code to pass the node information to the x86
>>>> code which requires some handler changes. I was thinking my code will
>>>> simplify the changes to address this issue.  
>>>
>>> without more information, it's hard to comment on the issue and whether
>>> the extra complexity of callbacks is justified.
>>>
>>> There could be 2 ways here, add fixes to this series so we could see the reason
>>> or make this series simple to solve apic_id problem only and then on top of
>>> it send the second series that solves another issue.
>>>
>>> Considering that this series is already big/complicated enough,
>>> personally I'd go for 2nd option. As it's easier to describe what patches are
>>> doing and easier to review => should result in faster reaching consensus and merging.
>>> [...]
>>>   
>>
> 



* Re: [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models
  2020-02-05 19:07               ` Babu Moger
@ 2020-02-06 13:08                 ` Igor Mammedov
  2020-02-06 15:32                   ` Babu Moger
  0 siblings, 1 reply; 53+ messages in thread
From: Igor Mammedov @ 2020-02-06 13:08 UTC (permalink / raw)
  To: Babu Moger; +Cc: ehabkost, mst, armbru, qemu-devel, pbonzini, rth

On Wed, 5 Feb 2020 13:07:31 -0600
Babu Moger <babu.moger@amd.com> wrote:

> On 2/5/20 10:56 AM, Igor Mammedov wrote:
> > On Wed, 5 Feb 2020 10:10:06 -0600
> > Babu Moger <babu.moger@amd.com> wrote:
> >   
> >> On 2/5/20 3:38 AM, Igor Mammedov wrote:  
> >>> On Tue, 4 Feb 2020 13:08:58 -0600
> >>> Babu Moger <babu.moger@amd.com> wrote:
> >>>     
> >>>> On 2/4/20 2:02 AM, Igor Mammedov wrote:    
> >>>>> On Mon, 3 Feb 2020 13:31:29 -0600
> >>>>> Babu Moger <babu.moger@amd.com> wrote:
> >>>>>       
> >>>>>> On 2/3/20 8:59 AM, Igor Mammedov wrote:      
> >>>>>>> On Tue, 03 Dec 2019 18:36:54 -0600
> >>>>>>> Babu Moger <babu.moger@amd.com> wrote:
> >>>>>>>         
> >>>>>>>> This series fixes APIC ID encoding problems on AMD EPYC CPUs.
> >>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1728166
> >>>>>>>>
> >>>>>>>> Currently, the APIC ID is decoded based on the sequence
> >>>>>>>> sockets->dies->cores->threads. This works for most standard AMD and other
> >>>>>>>> vendors' configurations, but this decoding sequence does not follow that of
> >>>>>>>> AMD's APIC ID enumeration strictly. In some cases this can cause CPU topology
> >>>>>>>> inconsistency.  When booting a guest VM, the kernel tries to validate the
> >>>>>>>> topology, and finds it inconsistent with the enumeration of EPYC cpu models.
> >>>>>>>>
> >>>>>>>> To fix the problem we need to build the topology as per the Processor
> >>>>>>>> Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1
> >>>>>>>> Processors. It is available at https://www.amd.com/system/files/TechDocs/55570-B1_PUB.zip
> >>>>>>>>
> >>>>>>>> Here is the text from the PPR.
> >>>>>>>> Operating systems are expected to use Core::X86::Cpuid::SizeId[ApicIdSize], the
> >>>>>>>> number of least significant bits in the Initial APIC ID that indicate core ID
> >>>>>>>> within a processor, in constructing per-core CPUID masks.
> >>>>>>>> Core::X86::Cpuid::SizeId[ApicIdSize] determines the maximum number of cores
> >>>>>>>> (MNC) that the processor could theoretically support, not the actual number of
> >>>>>>>> cores that are actually implemented or enabled on the processor, as indicated
> >>>>>>>> by Core::X86::Cpuid::SizeId[NC].
> >>>>>>>> Each Core::X86::Apic::ApicId[ApicId] register is preset as follows:
> >>>>>>>> • ApicId[6] = Socket ID.
> >>>>>>>> • ApicId[5:4] = Node ID.
> >>>>>>>> • ApicId[3] = Logical CCX L3 complex ID
> >>>>>>>> • ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} : {1'b0,LogicalCoreID[1:0]}        
> >>>>>>>
> >>>>>>>
> >>>>>>> After checking out all patches and some pondering, the approach used here
> >>>>>>> looks too intrusive to me for the task at hand, especially where it
> >>>>>>> comes to generic code.
> >>>>>>>
> >>>>>>> (Skip to ==== to see the suggestion for how to simplify it without reading
> >>>>>>> the reasoning behind it first)
> >>>>>>>
> >>>>>>> Let's look for a way to simplify it a little bit.
> >>>>>>>
> >>>>>>> So the problem we are trying to solve:
> >>>>>>>  1: calculate APIC IDs based on cpu type (to be more specific: for EPYC based CPUs)
> >>>>>>>  2: it depends on knowing the total number of numa nodes.
> >>>>>>>
> >>>>>>> Externally workflow looks like following:
> >>>>>>>   1. user provides -smp x,sockets,cores,...,maxcpus
> >>>>>>>       that's used by possible_cpu_arch_ids() singleton to build list of
> >>>>>>>       possible CPUs (which is available to user via command 'hotpluggable-cpus')
> >>>>>>>
> >>>>>>>       Hook could be called very early and possible_cpus data might be
> >>>>>>>       not complete. It builds a list of possible CPUs which user could
> >>>>>>>       modify later.
> >>>>>>>
> >>>>>>>   2.1 user uses "-numa cpu,node-id=x,..." or the legacy "-numa node,nodeid=x,cpus="
> >>>>>>>       options to assign cpus to nodes, which one way or another calls
> >>>>>>>       machine_set_cpu_numa_node(). The latter updates the 'possible_cpus' list
> >>>>>>>       with node information. It happens early, when the total number of nodes
> >>>>>>>       is not available.
> >>>>>>>
> >>>>>>>   2.2 user does not provide explicit node mappings for CPUs.
> >>>>>>>       QEMU steps in and assigns possible cpus to nodes in machine_numa_finish_cpu_init()
> >>>>>>>       (using the same machine_set_cpu_numa_node()) right before calling boards
> >>>>>>>       specific machine init(). At that time total number of nodes is known.
> >>>>>>>
> >>>>>>> In 1 -- 2.1 cases, 'arch_id' in 'possible_cpus' list doesn't have to be defined before
> >>>>>>> boards init() is run.      
> >>>>
> >>>> In case of 2.1, we need to have the arch_id already generated. This is
> >>>> done inside possible_cpu_arch_ids. The arch_id is used by
> >>>> machine_set_cpu_numa_node to assign the cpus to correct numa node.    
> >>>
> >>> I might have missed something but I don't see arch_id itself being used in
> >>> machine_set_cpu_numa_node(). It only uses props part of possible_cpus    
> >>
> >> Before calling machine_set_cpu_numa_node, we call
> >> cpu_index_to_instance_props -> x86_cpu_index_to_props->
> >> possible_cpu_arch_ids->x86_possible_cpu_arch_ids.
> >>
> >> This sequence sets up the arch_id(in x86_cpu_apic_id_from_index) for all
> >> the available cpus. Based on the arch_id, it also sets up the props.  
> > 
> > 
> > x86_possible_cpu_arch_ids()
> >    arch_id = x86_cpu_apic_id_from_index(x86ms, i)
> >    x86_topo_ids_from_apicid(arch_id, x86ms->smp_dies, ms->smp.cores,  ms->smp.threads, &topo);
> >    // assign socket/die/core/thread from topo
> > 
> > so currently it uses an indirect way to convert the index in possible_cpus->cpus[]
> > to socket/die/core/thread ids.
> > Essentially it takes the '-smp' options and a [0..max_cpus) number as original data,
> > converts them into an intermediate apic_id and then reverse engineers it back into
> > topo info.
> >
> > Why not use x86_topo_ids_from_idx() directly to get rid of the 'props' dependency on apic_id?
> 
> It might work. But this feels like a workaround that just delays the problem
> until later. We can address this by rearranging the numa code a little bit.

The idea behind possible_cpus is to let users query the topology information
the board generates (based on -smp) at configuration time (or later), so they
know which -numa cpu,topo_options [and -device foo-cpu,topo_options]
to use. Initializing apic_id on first access is secondary; I did
it only because it could be done without additional data.

But the main purpose of possible_cpus is to keep topology information.
That includes the numa node mapping, which should be stored in possible_cpus
along with the rest of the cpu topology.

Looking at the [12/18] numa patch, it makes the legacy -numa node,cpus option
reintroduce data duplication by storing the mapping elsewhere and
then copying that mapping into possible_cpus at numa completion time
(that's what I dislike, and I don't see a valid reason to do so).

That also won't work if the user queries hotpluggable-cpus before that time,
and it doesn't work if the user uses the preferred -numa cpu,topo_options,
as both would initialize possible_cpus on first access.

So if you need board-specific post-processing of the topo
information once it is complete, and need to recalculate apic_id, do it at board
init time as was suggested before (x86_cpu_new() looks like a good
place to do it).
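
As a rough illustration of why the recalculation has to wait until the node
mapping is complete, here is a minimal standalone C sketch of the fixed bit
layout from the PPR excerpt quoted in the cover letter (ApicId[6] socket,
ApicId[5:4] node, ApicId[3] CCX, ApicId[2:0] core/thread). It is not the code
from this series: EpycTopoIds and both function names are hypothetical, and
the field widths are hardcoded to that one layout, whereas a real
implementation would presumably derive them from the configured topology.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the fields named in the PPR excerpt. */
typedef struct {
    unsigned socket;  /* ApicId[6] */
    unsigned node;    /* ApicId[5:4] */
    unsigned ccx;     /* ApicId[3] */
    unsigned core;    /* LogicalCoreID[1:0] */
    unsigned thread;  /* ThreadId, only used when SMT is enabled */
} EpycTopoIds;

/* Compose an APIC ID from the ids using the fixed PPR bit positions. */
static uint32_t epyc_apicid_from_topo_ids(const EpycTopoIds *ids, bool smt)
{
    uint32_t low;

    /* Each field must fit the width the PPR layout reserves for it. */
    assert(ids->socket < 2 && ids->node < 4 && ids->ccx < 2 && ids->core < 4);
    assert(smt ? ids->thread < 2 : ids->thread == 0);

    /* ApicId[2:0] = SMT ? {core[1:0], thread} : {1'b0, core[1:0]} */
    low = smt ? ((ids->core << 1) | ids->thread) : ids->core;

    return (ids->socket << 6) | (ids->node << 4) | (ids->ccx << 3) | low;
}

/* Split an APIC ID back into the ids, the inverse of the function above. */
static EpycTopoIds epyc_topo_ids_from_apicid(uint32_t apicid, bool smt)
{
    EpycTopoIds ids;
    uint32_t low = apicid & 0x7;

    ids.socket = (apicid >> 6) & 0x1;
    ids.node = (apicid >> 4) & 0x3;
    ids.ccx = (apicid >> 3) & 0x1;
    ids.core = smt ? (low >> 1) : (low & 0x3);
    ids.thread = smt ? (low & 0x1) : 0;
    return ids;
}
```

The node id sits in the middle of the encoding, so the value cannot be final
until the node assignment of every CPU is known, which is the case at board
init time.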

[...]




* Re: [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models
  2020-02-06 13:08                 ` Igor Mammedov
@ 2020-02-06 15:32                   ` Babu Moger
  0 siblings, 0 replies; 53+ messages in thread
From: Babu Moger @ 2020-02-06 15:32 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: ehabkost, mst, armbru, qemu-devel, pbonzini, rth



On 2/6/20 7:08 AM, Igor Mammedov wrote:
> On Wed, 5 Feb 2020 13:07:31 -0600
> Babu Moger <babu.moger@amd.com> wrote:
> 
>> On 2/5/20 10:56 AM, Igor Mammedov wrote:
>>> On Wed, 5 Feb 2020 10:10:06 -0600
>>> Babu Moger <babu.moger@amd.com> wrote:
>>>   
>>>> On 2/5/20 3:38 AM, Igor Mammedov wrote:  
>>>>> On Tue, 4 Feb 2020 13:08:58 -0600
>>>>> Babu Moger <babu.moger@amd.com> wrote:
>>>>>     
>>>>>> On 2/4/20 2:02 AM, Igor Mammedov wrote:    
>>>>>>> On Mon, 3 Feb 2020 13:31:29 -0600
>>>>>>> Babu Moger <babu.moger@amd.com> wrote:
>>>>>>>       
>>>>>>>> On 2/3/20 8:59 AM, Igor Mammedov wrote:      
>>>>>>>>> On Tue, 03 Dec 2019 18:36:54 -0600
>>>>>>>>> Babu Moger <babu.moger@amd.com> wrote:
>>>>>>>>>         
>>>>>>>>>> This series fixes APIC ID encoding problems on AMD EPYC CPUs.
>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1728166
>>>>>>>>>>
>>>>>>>>>> Currently, the APIC ID is decoded based on the sequence
>>>>>>>>>> sockets->dies->cores->threads. This works for most standard AMD and other
>>>>>>>>>> vendors' configurations, but this decoding sequence does not follow that of
>>>>>>>>>> AMD's APIC ID enumeration strictly. In some cases this can cause CPU topology
>>>>>>>>>> inconsistency.  When booting a guest VM, the kernel tries to validate the
>>>>>>>>>> topology, and finds it inconsistent with the enumeration of EPYC cpu models.
>>>>>>>>>>
>>>>>>>>>> To fix the problem we need to build the topology as per the Processor
>>>>>>>>>> Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1
>>>>>>>>>> Processors. It is available at https://www.amd.com/system/files/TechDocs/55570-B1_PUB.zip
>>>>>>>>>>
>>>>>>>>>> Here is the text from the PPR.
>>>>>>>>>> Operating systems are expected to use Core::X86::Cpuid::SizeId[ApicIdSize], the
>>>>>>>>>> number of least significant bits in the Initial APIC ID that indicate core ID
>>>>>>>>>> within a processor, in constructing per-core CPUID masks.
>>>>>>>>>> Core::X86::Cpuid::SizeId[ApicIdSize] determines the maximum number of cores
>>>>>>>>>> (MNC) that the processor could theoretically support, not the actual number of
>>>>>>>>>> cores that are actually implemented or enabled on the processor, as indicated
>>>>>>>>>> by Core::X86::Cpuid::SizeId[NC].
>>>>>>>>>> Each Core::X86::Apic::ApicId[ApicId] register is preset as follows:
>>>>>>>>>> • ApicId[6] = Socket ID.
>>>>>>>>>> • ApicId[5:4] = Node ID.
>>>>>>>>>> • ApicId[3] = Logical CCX L3 complex ID
>>>>>>>>>> • ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} : {1'b0,LogicalCoreID[1:0]}        
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> After checking out all patches and some pondering, the approach used here
>>>>>>>>> looks too intrusive to me for the task at hand, especially where it
>>>>>>>>> comes to generic code.
>>>>>>>>>
>>>>>>>>> (Skip to ==== to see the suggestion for how to simplify it without reading
>>>>>>>>> the reasoning behind it first)
>>>>>>>>>
>>>>>>>>> Let's look for a way to simplify it a little bit.
>>>>>>>>>
>>>>>>>>> So the problem we are trying to solve:
>>>>>>>>>  1: calculate APIC IDs based on cpu type (to be more specific: for EPYC based CPUs)
>>>>>>>>>  2: it depends on knowing the total number of numa nodes.
>>>>>>>>>
>>>>>>>>> Externally workflow looks like following:
>>>>>>>>>   1. user provides -smp x,sockets,cores,...,maxcpus
>>>>>>>>>       that's used by possible_cpu_arch_ids() singleton to build list of
>>>>>>>>>       possible CPUs (which is available to user via command 'hotpluggable-cpus')
>>>>>>>>>
>>>>>>>>>       Hook could be called very early and possible_cpus data might be
>>>>>>>>>       not complete. It builds a list of possible CPUs which user could
>>>>>>>>>       modify later.
>>>>>>>>>
>>>>>>>>>   2.1 user uses "-numa cpu,node-id=x,..." or the legacy "-numa node,nodeid=x,cpus="
>>>>>>>>>       options to assign cpus to nodes, which one way or another calls
>>>>>>>>>       machine_set_cpu_numa_node(). The latter updates the 'possible_cpus' list
>>>>>>>>>       with node information. It happens early, when the total number of nodes
>>>>>>>>>       is not available.
>>>>>>>>>
>>>>>>>>>   2.2 user does not provide explicit node mappings for CPUs.
>>>>>>>>>       QEMU steps in and assigns possible cpus to nodes in machine_numa_finish_cpu_init()
>>>>>>>>>       (using the same machine_set_cpu_numa_node()) right before calling boards
>>>>>>>>>       specific machine init(). At that time total number of nodes is known.
>>>>>>>>>
>>>>>>>>> In 1 -- 2.1 cases, 'arch_id' in 'possible_cpus' list doesn't have to be defined before
>>>>>>>>> boards init() is run.      
>>>>>>
>>>>>> In case of 2.1, we need to have the arch_id already generated. This is
>>>>>> done inside possible_cpu_arch_ids. The arch_id is used by
>>>>>> machine_set_cpu_numa_node to assign the cpus to correct numa node.    
>>>>>
>>>>> I might have missed something but I don't see arch_id itself being used in
>>>>> machine_set_cpu_numa_node(). It only uses props part of possible_cpus    
>>>>
>>>> Before calling machine_set_cpu_numa_node, we call
>>>> cpu_index_to_instance_props -> x86_cpu_index_to_props->
>>>> possible_cpu_arch_ids->x86_possible_cpu_arch_ids.
>>>>
>>>> This sequence sets up the arch_id(in x86_cpu_apic_id_from_index) for all
>>>> the available cpus. Based on the arch_id, it also sets up the props.  
>>>
>>>
>>> x86_possible_cpu_arch_ids()
>>>    arch_id = x86_cpu_apic_id_from_index(x86ms, i)
>>>    x86_topo_ids_from_apicid(arch_id, x86ms->smp_dies, ms->smp.cores,  ms->smp.threads, &topo);
>>>    // assign socket/die/core/thread from topo
>>>
>>> so currently it uses an indirect way to convert the index in possible_cpus->cpus[]
>>> to socket/die/core/thread ids.
>>> Essentially it takes the '-smp' options and a [0..max_cpus) number as original data,
>>> converts them into an intermediate apic_id and then reverse engineers it back into
>>> topo info.
>>>
>>> Why not use x86_topo_ids_from_idx() directly to get rid of the 'props' dependency on apic_id?
>>
>> It might work. But this feels like a workaround that just delays the problem
>> until later. We can address this by rearranging the numa code a little bit.
> 
> The idea behind possible_cpus is to let users query the topology information
> the board generates (based on -smp) at configuration time (or later), so they
> know which -numa cpu,topo_options [and -device foo-cpu,topo_options]
> to use. Initializing apic_id on first access is secondary; I did
> it only because it could be done without additional data.
>
> But the main purpose of possible_cpus is to keep topology information.
> That includes the numa node mapping, which should be stored in possible_cpus
> along with the rest of the cpu topology.
>
> Looking at the [12/18] numa patch, it makes the legacy -numa node,cpus option
> reintroduce data duplication by storing the mapping elsewhere and
> then copying that mapping into possible_cpus at numa completion time
> (that's what I dislike, and I don't see a valid reason to do so).
>
> That also won't work if the user queries hotpluggable-cpus before that time,
> and it doesn't work if the user uses the preferred -numa cpu,topo_options,
> as both would initialize possible_cpus on first access.
>
> So if you need board-specific post-processing of the topo
> information once it is complete, and need to recalculate apic_id, do it at board
> init time as was suggested before (x86_cpu_new() looks like a good
> place to do it).

Ok. Sure. Will start working on it. Thanks



end of thread, other threads:[~2020-02-06 15:33 UTC | newest]

Thread overview: 53+ messages
2019-12-04  0:36 [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models Babu Moger
2019-12-04  0:37 ` [PATCH v3 01/18] hw/i386: Rename X86CPUTopoInfo structure to X86CPUTopoIDs Babu Moger
2020-02-03 15:08   ` Igor Mammedov
2020-02-03 18:25     ` Babu Moger
2019-12-04  0:37 ` [PATCH v3 02/18] hw/i386: Introduce X86CPUTopoInfo to contain topology info Babu Moger
2020-01-28 15:44   ` Igor Mammedov
2019-12-04  0:37 ` [PATCH v3 03/18] hw/i386: Consolidate topology functions Babu Moger
2020-01-28 15:46   ` Igor Mammedov
2019-12-04  0:37 ` [PATCH v3 04/18] hw/i386: Introduce initialize_topo_info to initialize X86CPUTopoInfo Babu Moger
2020-01-28 15:49   ` Igor Mammedov
2020-01-28 16:42     ` Babu Moger
2019-12-04  0:37 ` [PATCH v3 05/18] machine: Add SMP Sockets in CpuTopology Babu Moger
2019-12-04  0:37 ` [PATCH v3 06/18] hw/core: Add core complex id in X86CPU topology Babu Moger
2020-01-28 16:27   ` Igor Mammedov
2020-01-28 16:44     ` Babu Moger
2020-01-28 16:31   ` Eric Blake
2020-01-28 16:44     ` Babu Moger
2019-12-04  0:37 ` [PATCH v3 07/18] machine: Add a new function init_apicid_fn in MachineClass Babu Moger
2020-01-28 16:29   ` Igor Mammedov
2020-01-28 19:45     ` Babu Moger
2020-01-28 20:12       ` Eduardo Habkost
2020-01-29  9:14       ` Igor Mammedov
2020-01-29 16:17         ` Babu Moger
2020-02-03 15:17           ` Igor Mammedov
2020-02-03 21:49             ` Babu Moger
2020-02-04  7:38               ` Igor Mammedov
2020-01-29 16:32         ` Babu Moger
2020-01-29 16:51           ` Eduardo Habkost
2020-01-29 17:05             ` Babu Moger
2019-12-04  0:37 ` [PATCH v3 08/18] hw/i386: Update structures for nodes_per_pkg Babu Moger
2019-12-04  0:37 ` [PATCH v3 09/18] i386: Add CPUX86Family type in CPUX86State Babu Moger
2019-12-04  0:38 ` [PATCH v3 10/18] hw/386: Add EPYC mode topology decoding functions Babu Moger
2019-12-04  0:38 ` [PATCH v3 11/18] i386: Cleanup and use the EPYC mode topology functions Babu Moger
2019-12-04  0:38 ` [PATCH v3 12/18] numa: Split the numa initialization Babu Moger
2019-12-04  0:38 ` [PATCH v3 13/18] hw/i386: Introduce apicid_from_cpu_idx in PCMachineState Babu Moger
2019-12-04  0:38 ` [PATCH v3 14/18] hw/i386: Introduce topo_ids_from_apicid handler PCMachineState Babu Moger
2019-12-04  0:38 ` [PATCH v3 15/18] hw/i386: Introduce apic_id_from_topo_ids handler in PCMachineState Babu Moger
2019-12-04  0:38 ` [PATCH v3 16/18] hw/i386: Introduce EPYC mode function handlers Babu Moger
2020-01-28 20:04   ` Eduardo Habkost
2020-01-28 21:48     ` Babu Moger
2020-01-29 16:41       ` Eduardo Habkost
2019-12-04  0:38 ` [PATCH v3 17/18] i386: Fix pkg_id offset for epyc mode Babu Moger
2019-12-04  0:39 ` [PATCH v3 18/18] tests: Update the Unit tests Babu Moger
2020-02-03 14:59 ` [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models Igor Mammedov
2020-02-03 19:31   ` Babu Moger
2020-02-04  8:02     ` Igor Mammedov
2020-02-04 19:08       ` Babu Moger
2020-02-05  9:38         ` Igor Mammedov
2020-02-05 16:10           ` Babu Moger
2020-02-05 16:56             ` Igor Mammedov
2020-02-05 19:07               ` Babu Moger
2020-02-06 13:08                 ` Igor Mammedov
2020-02-06 15:32                   ` Babu Moger
