* [PATCH v3 0/3] Fix couple of issues with AMD topology
@ 2020-08-07 16:32 Babu Moger
  2020-08-07 16:32 ` [PATCH v3 1/3] i386: Simplify CPUID_8000_001E for AMD Babu Moger
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Babu Moger @ 2020-08-07 16:32 UTC (permalink / raw)
  To: pbonzini, rth, ehabkost, imammedo; +Cc: qemu-devel, mst

This series fixes a couple of issues with the recent topology-related code.
1. Warn users to pass the 'dies' information if an EPYC CPU is NUMA-configured.

2. Remove the node_id references in the topology code and use die_id instead.

3. With node_id removed from the topology, the uninitialized-memory issue
   with -device and CPU hotplug is fixed.
   Link: https://bugzilla.redhat.com/show_bug.cgi?id=1828750

---
v3:
  Added a new check requiring the 'dies' parameter for EPYC NUMA configurations.
  Added the 'Simplify CPUID_8000_001E' patch back, with changes suggested by Igor.
  Dropped the patch to build the topology from CpuInstanceProperties.
  TODO: Not sure if we still need the AutoNUMA changes Igor mentioned.
  Needs more clarity on that.

v2:
   https://lore.kernel.org/qemu-devel/159362436285.36204.986406297373871949.stgit@naples-babu.amd.com/
 - Used the numa information from CpuInstanceProperties for building
   the apic_id, as suggested by Igor.
 - Also did some minor code rearrangement to accommodate the changes.
 - Dropped the patch "Simplify CPUID_8000_001E" from v1. Will send
   it later.

v1:
 https://lore.kernel.org/qemu-devel/159164739269.20543.3074052993891532749.stgit@naples-babu.amd.com

Babu Moger (3):
      i386: Simplify CPUID_8000_001E for AMD
      hw/i386: Add a new check to configure smp dies for EPYC
      hw/i386: Remove node_id, nr_nodes and nodes_per_pkg from topology


 hw/i386/pc.c               |    1 -
 hw/i386/x86.c              |    8 ++++
 include/hw/i386/topology.h |   40 +++++---------------
 target/i386/cpu.c          |   86 +++++++++++++++++++-------------------------
 target/i386/cpu.h          |    1 -
 5 files changed, 54 insertions(+), 82 deletions(-)

--



* [PATCH v3 1/3] i386: Simplify CPUID_8000_001E for AMD
  2020-08-07 16:32 [PATCH v3 0/3] Fix couple of issues with AMD topology Babu Moger
@ 2020-08-07 16:32 ` Babu Moger
  2020-08-07 18:11   ` Igor Mammedov
  2020-08-07 16:32 ` [PATCH v3 2/3] hw/i386: Add a new check to configure smp dies for EPYC Babu Moger
  2020-08-07 16:32 ` [PATCH v3 3/3] hw/i386: Remove node_id, nr_nodes and nodes_per_pkg from topology Babu Moger
  2 siblings, 1 reply; 13+ messages in thread
From: Babu Moger @ 2020-08-07 16:32 UTC (permalink / raw)
  To: pbonzini, rth, ehabkost, imammedo; +Cc: qemu-devel, mst

apic_id contains all the information required to build CPUID_8000_001E.
Also remove the restriction on the number of bits for core_id and node_id.
Remove all the hardcoded values and replace them with generalized
fields.

Refer to the Processor Programming Reference (PPR) documentation
available from the bugzilla link below.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 target/i386/cpu.c |   77 +++++++++++++++++++++++++----------------------------
 1 file changed, 36 insertions(+), 41 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 588f32e136..c892432cae 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -388,57 +388,52 @@ static void encode_topo_cpuid8000001e(X86CPUTopoInfo *topo_info, X86CPU *cpu,
 {
     X86CPUTopoIDs topo_ids = {0};
     unsigned long nodes = MAX(topo_info->nodes_per_pkg, 1);
-    int shift;
 
     x86_topo_ids_from_apicid_epyc(cpu->apic_id, topo_info, &topo_ids);
 
     *eax = cpu->apic_id;
+
     /*
+     * CPUID_Fn8000001E_EBX [Core Identifiers] (CoreId)
+     * Read-only. Reset: 0000_XXXXh.
+     * See Core::X86::Cpuid::ExtApicId.
+     * Core::X86::Cpuid::CoreId_lthree[1:0]_core[3:0]_thread[1:0];
      * CPUID_Fn8000001E_EBX
-     * 31:16 Reserved
-     * 15:8  Threads per core (The number of threads per core is
-     *       Threads per core + 1)
-     *  7:0  Core id (see bit decoding below)
-     *       SMT:
-     *           4:3 node id
-     *             2 Core complex id
-     *           1:0 Core id
-     *       Non SMT:
-     *           5:4 node id
-     *             3 Core complex id
-     *           1:0 Core id
+     * Bits Description
+     * 31:16 Reserved.
+     * 15:8 ThreadsPerCore: threads per core. Read-only. Reset: XXh.
+     *      The number of threads per core is ThreadsPerCore+1.
+     *  7:0 CoreId: core ID. Read-only. Reset: XXh.
+     *
+     *  NOTE: CoreId is already part of apic_id. Just use it. We can
+     *  use all 8 bits to represent the core_id here.
      */
-    *ebx = ((topo_info->threads_per_core - 1) << 8) | (topo_ids.node_id << 3) |
-            (topo_ids.core_id);
+    *ebx = ((topo_info->threads_per_core - 1) << 8) | (topo_ids.core_id & 0xFF);
+
     /*
+     * CPUID_Fn8000001E_ECX [Node Identifiers] (NodeId)
+     * Read-only. Reset: 0000_0XXXh.
+     * Core::X86::Cpuid::NodeId_lthree[1:0]_core[3:0]_thread[1:0];
      * CPUID_Fn8000001E_ECX
-     * 31:11 Reserved
-     * 10:8  Nodes per processor (Nodes per processor is number of nodes + 1)
-     *  7:0  Node id (see bit decoding below)
-     *         2  Socket id
-     *       1:0  Node id
+     * Bits Description
+     * 31:11 Reserved.
+     * 10:8 NodesPerProcessor: Nodes per processor. Read-only. Reset: XXXb.
+     *      ValidValues:
+     *      Value     Description
+     *      000b      1 node per processor.
+     *      001b      2 nodes per processor.
+     *      010b      Reserved.
+     *      011b      4 nodes per processor.
+     *      111b-100b Reserved.
+     *  7:0 NodeId: Node ID. Read-only. Reset: XXh.
+     *
+     * NOTE: Hardware reserves 3 bits for the number of nodes per processor.
+     * But users can create more nodes than the actual hardware can
+     * support. To generalize, we can use all the upper 8 bits for nodes.
+     * NodeId is a combination of node_id and socket_id, which is already
+     * decoded in apic_id. Just use it by shifting.
      */
-    if (nodes <= 4) {
-        *ecx = ((nodes - 1) << 8) | (topo_ids.pkg_id << 2) | topo_ids.node_id;
-    } else {
-        /*
-         * Node id fix up. Actual hardware supports up to 4 nodes. But with
-         * more than 32 cores, we may end up with more than 4 nodes.
-         * Node id is a combination of socket id and node id. Only requirement
-         * here is that this number should be unique accross the system.
-         * Shift the socket id to accommodate more nodes. We dont expect both
-         * socket id and node id to be big number at the same time. This is not
-         * an ideal config but we need to to support it. Max nodes we can have
-         * is 32 (255/8) with 8 cores per node and 255 max cores. We only need
-         * 5 bits for nodes. Find the left most set bit to represent the total
-         * number of nodes. find_last_bit returns last set bit(0 based). Left
-         * shift(+1) the socket id to represent all the nodes.
-         */
-        nodes -= 1;
-        shift = find_last_bit(&nodes, 8);
-        *ecx = (nodes << 8) | (topo_ids.pkg_id << (shift + 1)) |
-               topo_ids.node_id;
-    }
+    *ecx = ((nodes - 1) << 8) | cpu->node_id;
     *edx = 0;
 }
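
For illustration, the following is a standalone sketch (not QEMU code; the
sample topology values are assumptions, and only the bit layout follows the
patch) of the simplified EBX/ECX encoding described in the comments above:

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        /* Assumed example: SMT-2 EPYC config with 2 nodes per package. */
        unsigned threads_per_core = 2;
        unsigned nodes_per_pkg = 2;
        unsigned core_id = 5;    /* in QEMU this is decoded from apic_id */
        unsigned node_id = 1;    /* likewise decoded from apic_id */

        /* EBX[15:8] = ThreadsPerCore - 1, EBX[7:0] = CoreId (all 8 bits) */
        uint32_t ebx = ((threads_per_core - 1) << 8) | (core_id & 0xFF);

        /* ECX[10:8] = NodesPerProcessor - 1, ECX[7:0] = NodeId */
        uint32_t ecx = ((nodes_per_pkg - 1) << 8) | node_id;

        printf("EBX=0x%08x ECX=0x%08x\n", ebx, ecx);
        /* prints EBX=0x00000105 ECX=0x00000101 for this example */
        return 0;
    }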
 




* [PATCH v3 2/3] hw/i386: Add a new check to configure smp dies for EPYC
  2020-08-07 16:32 [PATCH v3 0/3] Fix couple of issues with AMD topology Babu Moger
  2020-08-07 16:32 ` [PATCH v3 1/3] i386: Simplify CPUID_8000_001E for AMD Babu Moger
@ 2020-08-07 16:32 ` Babu Moger
  2020-08-07 16:52   ` Daniel P. Berrangé
  2020-08-07 19:27   ` Igor Mammedov
  2020-08-07 16:32 ` [PATCH v3 3/3] hw/i386: Remove node_id, nr_nodes and nodes_per_pkg from topology Babu Moger
  2 siblings, 2 replies; 13+ messages in thread
From: Babu Moger @ 2020-08-07 16:32 UTC (permalink / raw)
  To: pbonzini, rth, ehabkost, imammedo; +Cc: qemu-devel, mst

Adding a new check to warn the users to configure 'dies' when
topology is numa configured. It makes it easy to build the
topology for EPYC models.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 hw/i386/x86.c |    7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 67bee1bcb8..2a6ce56ef1 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -138,6 +138,13 @@ void x86_cpus_init(X86MachineState *x86ms, int default_cpu_version)
 
     /* Check for apicid encoding */
     if (cpu_x86_use_epyc_apic_id_encoding(ms->cpu_type)) {
+        if ((ms->numa_state->num_nodes > 0) &&
+            ms->numa_state->num_nodes != (ms->smp.sockets * x86ms->smp_dies)) {
+            error_setg(&error_fatal, "Numa configuration requires smp 'dies' "
+                       "parameter. Configure the cpu topology properly with "
+                       "max_cpus = sockets * dies * cores * threads");
+            return;
+        }
         x86_set_epyc_topo_handlers(ms);
     }
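
As a standalone sketch of the rule being enforced (not QEMU code; the function
name and sample numbers are hypothetical), a NUMA-configured EPYC guest passes
only when the NUMA node count equals sockets * dies:

    #include <stdbool.h>
    #include <stdio.h>

    static bool epyc_numa_config_ok(unsigned num_nodes, unsigned sockets,
                                    unsigned dies)
    {
        /* num_nodes == 0 means no NUMA config, so there is nothing to check. */
        return num_nodes == 0 || num_nodes == sockets * dies;
    }

    int main(void)
    {
        /* 4 NUMA nodes over 2 sockets x 2 dies: accepted. */
        printf("%d\n", epyc_numa_config_ok(4, 2, 2));   /* 1 */
        /* 4 NUMA nodes over 1 socket x 1 die: rejected, 'dies' is missing. */
        printf("%d\n", epyc_numa_config_ok(4, 1, 1));   /* 0 */
        return 0;
    }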
 




* [PATCH v3 3/3] hw/i386: Remove node_id, nr_nodes and nodes_per_pkg from topology
  2020-08-07 16:32 [PATCH v3 0/3] Fix couple of issues with AMD topology Babu Moger
  2020-08-07 16:32 ` [PATCH v3 1/3] i386: Simplify CPUID_8000_001E for AMD Babu Moger
  2020-08-07 16:32 ` [PATCH v3 2/3] hw/i386: Add a new check to configure smp dies for EPYC Babu Moger
@ 2020-08-07 16:32 ` Babu Moger
  2 siblings, 0 replies; 13+ messages in thread
From: Babu Moger @ 2020-08-07 16:32 UTC (permalink / raw)
  To: pbonzini, rth, ehabkost, imammedo; +Cc: qemu-devel, mst

Remove node_id, nr_nodes and nodes_per_pkg from the topology code. Use
die_id, nr_dies and dies_per_pkg, which are already available.
This removes the confusion of having two sets of variables.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 hw/i386/pc.c               |    1 -
 hw/i386/x86.c              |    1 -
 include/hw/i386/topology.h |   40 +++++++++-------------------------------
 target/i386/cpu.c          |   11 +++--------
 target/i386/cpu.h          |    1 -
 5 files changed, 12 insertions(+), 42 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 47c5ca3e34..0ae5cb3af4 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1498,7 +1498,6 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
     init_topo_info(&topo_info, x86ms);
 
     env->nr_dies = x86ms->smp_dies;
-    env->nr_nodes = topo_info.nodes_per_pkg;
     env->pkg_offset = x86ms->apicid_pkg_offset(&topo_info);
 
     /*
diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 2a6ce56ef1..3ac46960db 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -62,7 +62,6 @@ inline void init_topo_info(X86CPUTopoInfo *topo_info,
 {
     MachineState *ms = MACHINE(x86ms);
 
-    topo_info->nodes_per_pkg = ms->numa_state->num_nodes / ms->smp.sockets;
     topo_info->dies_per_pkg = x86ms->smp_dies;
     topo_info->cores_per_die = ms->smp.cores;
     topo_info->threads_per_core = ms->smp.threads;
diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index 07239f95f4..05ddde7ba0 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -47,14 +47,12 @@ typedef uint32_t apic_id_t;
 
 typedef struct X86CPUTopoIDs {
     unsigned pkg_id;
-    unsigned node_id;
     unsigned die_id;
     unsigned core_id;
     unsigned smt_id;
 } X86CPUTopoIDs;
 
 typedef struct X86CPUTopoInfo {
-    unsigned nodes_per_pkg;
     unsigned dies_per_pkg;
     unsigned cores_per_die;
     unsigned threads_per_core;
@@ -89,11 +87,6 @@ static inline unsigned apicid_die_width(X86CPUTopoInfo *topo_info)
     return apicid_bitwidth_for_count(topo_info->dies_per_pkg);
 }
 
-/* Bit width of the node_id field per socket */
-static inline unsigned apicid_node_width_epyc(X86CPUTopoInfo *topo_info)
-{
-    return apicid_bitwidth_for_count(MAX(topo_info->nodes_per_pkg, 1));
-}
 /* Bit offset of the Core_ID field
  */
 static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
@@ -114,30 +107,23 @@ static inline unsigned apicid_pkg_offset(X86CPUTopoInfo *topo_info)
     return apicid_die_offset(topo_info) + apicid_die_width(topo_info);
 }
 
-#define NODE_ID_OFFSET 3 /* Minimum node_id offset if numa configured */
+#define EPYC_DIE_OFFSET 3 /* Minimum die_id offset if numa configured */
 
 /*
- * Bit offset of the node_id field
- *
- * Make sure nodes_per_pkg >  0 if numa configured else zero.
+ * Bit offset of the die_id field
  */
-static inline unsigned apicid_node_offset_epyc(X86CPUTopoInfo *topo_info)
+static inline unsigned apicid_die_offset_epyc(X86CPUTopoInfo *topo_info)
 {
-    unsigned offset = apicid_die_offset(topo_info) +
-                      apicid_die_width(topo_info);
+    unsigned offset = apicid_core_offset(topo_info) +
+                      apicid_core_width(topo_info);
 
-    if (topo_info->nodes_per_pkg) {
-        return MAX(NODE_ID_OFFSET, offset);
-    } else {
-        return offset;
-    }
+    return MAX(EPYC_DIE_OFFSET, offset);
 }
 
 /* Bit offset of the Pkg_ID (socket ID) field */
 static inline unsigned apicid_pkg_offset_epyc(X86CPUTopoInfo *topo_info)
 {
-    return apicid_node_offset_epyc(topo_info) +
-           apicid_node_width_epyc(topo_info);
+    return apicid_die_offset_epyc(topo_info) + apicid_die_width(topo_info);
 }
 
 /*
@@ -150,8 +136,7 @@ x86_apicid_from_topo_ids_epyc(X86CPUTopoInfo *topo_info,
                               const X86CPUTopoIDs *topo_ids)
 {
     return (topo_ids->pkg_id  << apicid_pkg_offset_epyc(topo_info)) |
-           (topo_ids->node_id << apicid_node_offset_epyc(topo_info)) |
-           (topo_ids->die_id  << apicid_die_offset(topo_info)) |
+           (topo_ids->die_id  << apicid_die_offset_epyc(topo_info)) |
            (topo_ids->core_id << apicid_core_offset(topo_info)) |
            topo_ids->smt_id;
 }
@@ -160,15 +145,11 @@ static inline void x86_topo_ids_from_idx_epyc(X86CPUTopoInfo *topo_info,
                                               unsigned cpu_index,
                                               X86CPUTopoIDs *topo_ids)
 {
-    unsigned nr_nodes = MAX(topo_info->nodes_per_pkg, 1);
     unsigned nr_dies = topo_info->dies_per_pkg;
     unsigned nr_cores = topo_info->cores_per_die;
     unsigned nr_threads = topo_info->threads_per_core;
-    unsigned cores_per_node = DIV_ROUND_UP((nr_dies * nr_cores * nr_threads),
-                                            nr_nodes);
 
     topo_ids->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
-    topo_ids->node_id = (cpu_index / cores_per_node) % nr_nodes;
     topo_ids->die_id = cpu_index / (nr_cores * nr_threads) % nr_dies;
     topo_ids->core_id = cpu_index / nr_threads % nr_cores;
     topo_ids->smt_id = cpu_index % nr_threads;
@@ -188,11 +169,8 @@ static inline void x86_topo_ids_from_apicid_epyc(apic_id_t apicid,
             (apicid >> apicid_core_offset(topo_info)) &
             ~(0xFFFFFFFFUL << apicid_core_width(topo_info));
     topo_ids->die_id =
-            (apicid >> apicid_die_offset(topo_info)) &
+            (apicid >> apicid_die_offset_epyc(topo_info)) &
             ~(0xFFFFFFFFUL << apicid_die_width(topo_info));
-    topo_ids->node_id =
-            (apicid >> apicid_node_offset_epyc(topo_info)) &
-            ~(0xFFFFFFFFUL << apicid_node_width_epyc(topo_info));
     topo_ids->pkg_id = apicid >> apicid_pkg_offset_epyc(topo_info);
 }
 
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index c892432cae..ba0a24f6b8 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -345,7 +345,6 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
                                        uint32_t *ecx, uint32_t *edx)
 {
     uint32_t l3_cores;
-    unsigned nodes = MAX(topo_info->nodes_per_pkg, 1);
 
     assert(cache->size == cache->line_size * cache->associativity *
                           cache->partitions * cache->sets);
@@ -355,10 +354,9 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
 
     /* L3 is shared among multiple cores */
     if (cache->level == 3) {
-        l3_cores = DIV_ROUND_UP((topo_info->dies_per_pkg *
-                                 topo_info->cores_per_die *
+        l3_cores = DIV_ROUND_UP((topo_info->cores_per_die *
                                  topo_info->threads_per_core),
-                                 nodes);
+                                 topo_info->dies_per_pkg);
         *eax |= (l3_cores - 1) << 14;
     } else {
         *eax |= ((topo_info->threads_per_core - 1) << 14);
@@ -387,7 +385,6 @@ static void encode_topo_cpuid8000001e(X86CPUTopoInfo *topo_info, X86CPU *cpu,
                                        uint32_t *ecx, uint32_t *edx)
 {
     X86CPUTopoIDs topo_ids = {0};
-    unsigned long nodes = MAX(topo_info->nodes_per_pkg, 1);
 
     x86_topo_ids_from_apicid_epyc(cpu->apic_id, topo_info, &topo_ids);
 
@@ -433,7 +430,7 @@ static void encode_topo_cpuid8000001e(X86CPUTopoInfo *topo_info, X86CPU *cpu,
      * NodeId is combination of node and socket_id which is already decoded
      * in apic_id. Just use it by shifting.
      */
-    *ecx = ((nodes - 1) << 8) | cpu->node_id;
+    *ecx = ((topo_info->dies_per_pkg - 1) << 8) | cpu->node_id;
     *edx = 0;
 }
 
@@ -5484,7 +5481,6 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
     uint32_t signature[3];
     X86CPUTopoInfo topo_info;
 
-    topo_info.nodes_per_pkg = env->nr_nodes;
     topo_info.dies_per_pkg = env->nr_dies;
     topo_info.cores_per_die = cs->nr_cores;
     topo_info.threads_per_core = cs->nr_threads;
@@ -6944,7 +6940,6 @@ static void x86_cpu_initfn(Object *obj)
     FeatureWord w;
 
     env->nr_dies = 1;
-    env->nr_nodes = 1;
     cpu_set_cpustate_pointers(cpu);
 
     object_property_add(obj, "family", "int",
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index e1a5c174dc..4c89bee8d1 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1629,7 +1629,6 @@ typedef struct CPUX86State {
     TPRAccess tpr_access_type;
 
     unsigned nr_dies;
-    unsigned nr_nodes;
     unsigned pkg_offset;
 } CPUX86State;
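
To make the resulting layout concrete, here is a standalone sketch (not QEMU
code; the helper name and sample topology are assumptions) of how the EPYC
APIC ID is composed once die_id replaces node_id, following the offset/width
helpers in the topology.h hunk above:

    #include <stdio.h>

    /* Bits needed to hold 'count' distinct IDs (0 if count <= 1). */
    static unsigned bitwidth_for_count(unsigned count)
    {
        unsigned w = 0;
        if (count <= 1) {
            return 0;
        }
        for (count -= 1; count; count >>= 1) {
            w++;
        }
        return w;
    }

    #define EPYC_DIE_OFFSET 3   /* minimum die_id offset, as in the patch */

    int main(void)
    {
        /* Assumed example: 2 dies, 4 cores per die, 2 threads per core. */
        unsigned threads = 2, cores = 4, dies = 2;

        unsigned core_off   = bitwidth_for_count(threads);
        unsigned core_width = bitwidth_for_count(cores);
        unsigned die_off    = core_off + core_width;
        if (die_off < EPYC_DIE_OFFSET) {   /* MAX(EPYC_DIE_OFFSET, offset) */
            die_off = EPYC_DIE_OFFSET;
        }
        unsigned pkg_off = die_off + bitwidth_for_count(dies);

        /* Compose: pkg 1, die 1, core 3, thread 0. */
        unsigned apicid = (1u << pkg_off) | (1u << die_off) | (3u << core_off);
        printf("apicid = 0x%x\n", apicid);   /* 0x1e for this example */
        return 0;
    }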
 




* Re: [PATCH v3 2/3] hw/i386: Add a new check to configure smp dies for EPYC
  2020-08-07 16:32 ` [PATCH v3 2/3] hw/i386: Add a new check to configure smp dies for EPYC Babu Moger
@ 2020-08-07 16:52   ` Daniel P. Berrangé
  2020-08-07 17:23     ` Babu Moger
  2020-08-07 19:11     ` Igor Mammedov
  2020-08-07 19:27   ` Igor Mammedov
  1 sibling, 2 replies; 13+ messages in thread
From: Daniel P. Berrangé @ 2020-08-07 16:52 UTC (permalink / raw)
  To: Babu Moger; +Cc: ehabkost, mst, qemu-devel, imammedo, pbonzini, rth

On Fri, Aug 07, 2020 at 11:32:51AM -0500, Babu Moger wrote:
> Adding a new check to warn the users to configure 'dies' when
> topology is numa configured. It makes it easy to build the
> topology for EPYC models.

This says you're adding a warning....

> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
>  hw/i386/x86.c |    7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> index 67bee1bcb8..2a6ce56ef1 100644
> --- a/hw/i386/x86.c
> +++ b/hw/i386/x86.c
> @@ -138,6 +138,13 @@ void x86_cpus_init(X86MachineState *x86ms, int default_cpu_version)
>  
>      /* Check for apicid encoding */
>      if (cpu_x86_use_epyc_apic_id_encoding(ms->cpu_type)) {
> +        if ((ms->numa_state->num_nodes > 0) &&
> +            ms->numa_state->num_nodes != (ms->smp.sockets * x86ms->smp_dies)) {
> +            error_setg(&error_fatal, "Numa configuration requires smp 'dies' "
> +                       "parameter. Configure the cpu topology properly with "
> +                       "max_cpus = sockets * dies * cores * threads");

...but you're actually making this a fatal error, not a warning.

I'm not sure this is really OK. Wouldn't this mean that existing VMs
deployed today risk triggering this fatal error the next time they
are booted or live migrated?  If it is possible someone is using
such a config today, I don't think we can break it.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [PATCH v3 2/3] hw/i386: Add a new check to configure smp dies for EPYC
  2020-08-07 16:52   ` Daniel P. Berrangé
@ 2020-08-07 17:23     ` Babu Moger
  2020-08-07 19:11     ` Igor Mammedov
  1 sibling, 0 replies; 13+ messages in thread
From: Babu Moger @ 2020-08-07 17:23 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: ehabkost, mst, qemu-devel, imammedo, pbonzini, rth



On 8/7/20 11:52 AM, Daniel P. Berrangé wrote:
> On Fri, Aug 07, 2020 at 11:32:51AM -0500, Babu Moger wrote:
>> Adding a new check to warn the users to configure 'dies' when
>> topology is numa configured. It makes it easy to build the
>> topology for EPYC models.
> 
> This says you're adding a warning....
> 
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>>  hw/i386/x86.c |    7 +++++++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
>> index 67bee1bcb8..2a6ce56ef1 100644
>> --- a/hw/i386/x86.c
>> +++ b/hw/i386/x86.c
>> @@ -138,6 +138,13 @@ void x86_cpus_init(X86MachineState *x86ms, int default_cpu_version)
>>  
>>      /* Check for apicid encoding */
>>      if (cpu_x86_use_epyc_apic_id_encoding(ms->cpu_type)) {
>> +        if ((ms->numa_state->num_nodes > 0) &&
>> +            ms->numa_state->num_nodes != (ms->smp.sockets * x86ms->smp_dies)) {
>> +            error_setg(&error_fatal, "Numa configuration requires smp 'dies' "
>> +                       "parameter. Configure the cpu topology properly with "
>> +                       "max_cpus = sockets * dies * cores * threads");
> 
> ...but you're actually making this a fatal error, not a warning.
> 
> I'm not sure this is really OK. Wouldn't this mean that existing VMs
> deployed today risk triggering this fatal error the next time they
> are booted or live migrated?  If it is possible someone is using
> such a config today, I don't think we can break it.

Yes, that is correct. Maybe we can display a warning (not fatal) and assign
smp_dies = numa_nodes / sockets. But that requires rearranging the smp
parameters. Not sure if that is OK.

> 
> Regards,
> Daniel
> 



* Re: [PATCH v3 1/3] i386: Simplify CPUID_8000_001E for AMD
  2020-08-07 16:32 ` [PATCH v3 1/3] i386: Simplify CPUID_8000_001E for AMD Babu Moger
@ 2020-08-07 18:11   ` Igor Mammedov
  0 siblings, 0 replies; 13+ messages in thread
From: Igor Mammedov @ 2020-08-07 18:11 UTC (permalink / raw)
  To: Babu Moger; +Cc: qemu-devel, pbonzini, mst, ehabkost, rth

On Fri, 07 Aug 2020 11:32:45 -0500
Babu Moger <babu.moger@amd.com> wrote:

> apic_id contains all the information required to build CPUID_8000_001E.
> Also remove the restriction on the number of bits for core_id and node_id.
> Remove all the hardcoded values and replace them with generalized
> fields.
> 
> Refer to the Processor Programming Reference (PPR) documentation
> available from the bugzilla link below.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>

Reviewed-by: Igor Mammedov <imammedo@redhat.com>

> ---
>  target/i386/cpu.c |   77 +++++++++++++++++++++++++----------------------------
>  1 file changed, 36 insertions(+), 41 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 588f32e136..c892432cae 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -388,57 +388,52 @@ static void encode_topo_cpuid8000001e(X86CPUTopoInfo *topo_info, X86CPU *cpu,
>  {
>      X86CPUTopoIDs topo_ids = {0};
>      unsigned long nodes = MAX(topo_info->nodes_per_pkg, 1);
> -    int shift;
>  
>      x86_topo_ids_from_apicid_epyc(cpu->apic_id, topo_info, &topo_ids);
>  
>      *eax = cpu->apic_id;
> +
>      /*
> +     * CPUID_Fn8000001E_EBX [Core Identifiers] (CoreId)
> +     * Read-only. Reset: 0000_XXXXh.
> +     * See Core::X86::Cpuid::ExtApicId.
> +     * Core::X86::Cpuid::CoreId_lthree[1:0]_core[3:0]_thread[1:0];
>       * CPUID_Fn8000001E_EBX
> -     * 31:16 Reserved
> -     * 15:8  Threads per core (The number of threads per core is
> -     *       Threads per core + 1)
> -     *  7:0  Core id (see bit decoding below)
> -     *       SMT:
> -     *           4:3 node id
> -     *             2 Core complex id
> -     *           1:0 Core id
> -     *       Non SMT:
> -     *           5:4 node id
> -     *             3 Core complex id
> -     *           1:0 Core id
> +     * Bits Description
> +     * 31:16 Reserved.
> +     * 15:8 ThreadsPerCore: threads per core. Read-only. Reset: XXh.
> +     *      The number of threads per core is ThreadsPerCore+1.
> +     *  7:0 CoreId: core ID. Read-only. Reset: XXh.
> +     *
> +     *  NOTE: CoreId is already part of apic_id. Just use it. We can
> +     *  use all 8 bits to represent the core_id here.
>       */
> -    *ebx = ((topo_info->threads_per_core - 1) << 8) | (topo_ids.node_id << 3) |
> -            (topo_ids.core_id);
> +    *ebx = ((topo_info->threads_per_core - 1) << 8) | (topo_ids.core_id & 0xFF);
> +
>      /*
> +     * CPUID_Fn8000001E_ECX [Node Identifiers] (NodeId)
> +     * Read-only. Reset: 0000_0XXXh.
> +     * Core::X86::Cpuid::NodeId_lthree[1:0]_core[3:0]_thread[1:0];
>       * CPUID_Fn8000001E_ECX
> -     * 31:11 Reserved
> -     * 10:8  Nodes per processor (Nodes per processor is number of nodes + 1)
> -     *  7:0  Node id (see bit decoding below)
> -     *         2  Socket id
> -     *       1:0  Node id
> +     * Bits Description
> +     * 31:11 Reserved.
> +     * 10:8 NodesPerProcessor: Nodes per processor. Read-only. Reset: XXXb.
> +     *      ValidValues:
> +     *      Value     Description
> +     *      000b      1 node per processor.
> +     *      001b      2 nodes per processor.
> +     *      010b      Reserved.
> +     *      011b      4 nodes per processor.
> +     *      111b-100b Reserved.
> +     *  7:0 NodeId: Node ID. Read-only. Reset: XXh.
> +     *
> +     * NOTE: Hardware reserves 3 bits for the number of nodes per processor.
> +     * But users can create more nodes than the actual hardware can
> +     * support. To generalize, we can use all the upper 8 bits for nodes.
> +     * NodeId is a combination of node_id and socket_id, which is already
> +     * decoded in apic_id. Just use it by shifting.
>       */
> -    if (nodes <= 4) {
> -        *ecx = ((nodes - 1) << 8) | (topo_ids.pkg_id << 2) | topo_ids.node_id;
> -    } else {
> -        /*
> -         * Node id fix up. Actual hardware supports up to 4 nodes. But with
> -         * more than 32 cores, we may end up with more than 4 nodes.
> -         * Node id is a combination of socket id and node id. Only requirement
> -         * here is that this number should be unique accross the system.
> -         * Shift the socket id to accommodate more nodes. We dont expect both
> -         * socket id and node id to be big number at the same time. This is not
> -         * an ideal config but we need to to support it. Max nodes we can have
> -         * is 32 (255/8) with 8 cores per node and 255 max cores. We only need
> -         * 5 bits for nodes. Find the left most set bit to represent the total
> -         * number of nodes. find_last_bit returns last set bit(0 based). Left
> -         * shift(+1) the socket id to represent all the nodes.
> -         */
> -        nodes -= 1;
> -        shift = find_last_bit(&nodes, 8);
> -        *ecx = (nodes << 8) | (topo_ids.pkg_id << (shift + 1)) |
> -               topo_ids.node_id;
> -    }
> +    *ecx = ((nodes - 1) << 8) | cpu->node_id;
>      *edx = 0;
>  }
>  
> 
> 




* Re: [PATCH v3 2/3] hw/i386: Add a new check to configure smp dies for EPYC
  2020-08-07 16:52   ` Daniel P. Berrangé
  2020-08-07 17:23     ` Babu Moger
@ 2020-08-07 19:11     ` Igor Mammedov
  2020-08-11 21:03       ` Babu Moger
  1 sibling, 1 reply; 13+ messages in thread
From: Igor Mammedov @ 2020-08-07 19:11 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: ehabkost, mst, qemu-devel, Babu Moger, pbonzini, rth

On Fri, 7 Aug 2020 17:52:22 +0100
Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Fri, Aug 07, 2020 at 11:32:51AM -0500, Babu Moger wrote:
> > Adding a new check to warn the users to configure 'dies' when
> > topology is numa configured. It makes it easy to build the
> > topology for EPYC models.  
> 
> This says you're adding a warning....
> 
> > 
> > Signed-off-by: Babu Moger <babu.moger@amd.com>
> > ---
> >  hw/i386/x86.c |    7 +++++++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> > index 67bee1bcb8..2a6ce56ef1 100644
> > --- a/hw/i386/x86.c
> > +++ b/hw/i386/x86.c
> > @@ -138,6 +138,13 @@ void x86_cpus_init(X86MachineState *x86ms, int default_cpu_version)
> >  
> >      /* Check for apicid encoding */
> >      if (cpu_x86_use_epyc_apic_id_encoding(ms->cpu_type)) {
> > +        if ((ms->numa_state->num_nodes > 0) &&
> > +            ms->numa_state->num_nodes != (ms->smp.sockets * x86ms->smp_dies)) {
> > +            error_setg(&error_fatal, "Numa configuration requires smp 'dies' "
> > +                       "parameter. Configure the cpu topology properly with "
> > +                       "max_cpus = sockets * dies * cores * threads");  
> 
> ...but you're actually making this a fatal error, not a warning.
> 
> I'm not sure this is really OK. Wouldn't this mean that existing VMs
> deployed today risk triggering this fatal error the next time they
> are booted or live migrated?  If it is possible someone is using
> such a config today, I don't think we can break it.

to begin with, users shouldn't have used 'dies' with the initial impl. at all.
(it was an Intel-introduced option, and EPYC added a very similar internal node_id
(removed by the next patch)).
Now we are trying to consolidate this mess and reuse dies for EPYC.

EPYC support has been out since 5.0 (though broken); users could start a VM with
such a config, but it would not be a correct EPYC from the apicid and cpuid point of view.
A guest OS might run if it doesn't know about EPYCs, or behave weirdly (suboptimal|crash|whatever)
on seeing unexpected values.

If we are hell-bent on keeping the bugs of the initial impl, then we should keep them
for <=5.1 machine versions and do the right thing for newer ones.
Though I'm not sure we should keep the broken variant around (all we would get from it is
bug reports*/complaints from users about the end result of their config anyway).
I'd rather error out with a clear error message so users can fix their broken config.

*) there is at least one thread/bz on qemu-devel where users are trying to run
with EPYC and to pick an option combination that would produce a sensible topology.


> Regards,
> Daniel




* Re: [PATCH v3 2/3] hw/i386: Add a new check to configure smp dies for EPYC
  2020-08-07 16:32 ` [PATCH v3 2/3] hw/i386: Add a new check to configure smp dies for EPYC Babu Moger
  2020-08-07 16:52   ` Daniel P. Berrangé
@ 2020-08-07 19:27   ` Igor Mammedov
  2020-08-11 21:04     ` Babu Moger
  1 sibling, 1 reply; 13+ messages in thread
From: Igor Mammedov @ 2020-08-07 19:27 UTC (permalink / raw)
  To: Babu Moger; +Cc: qemu-devel, pbonzini, mst, ehabkost, rth

On Fri, 07 Aug 2020 11:32:51 -0500
Babu Moger <babu.moger@amd.com> wrote:

> Adding a new check to warn the users to configure 'dies' when
s/warn .../error out .../

> topology is numa configured. It makes it easy to build the
> topology for EPYC models.

probably it should mention that this will break configs that
do not have the correct topology configured.

 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
>  hw/i386/x86.c |    7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> index 67bee1bcb8..2a6ce56ef1 100644
> --- a/hw/i386/x86.c
> +++ b/hw/i386/x86.c
> @@ -138,6 +138,13 @@ void x86_cpus_init(X86MachineState *x86ms, int default_cpu_version)
>  
>      /* Check for apicid encoding */
>      if (cpu_x86_use_epyc_apic_id_encoding(ms->cpu_type)) {
> +        if ((ms->numa_state->num_nodes > 0) &&
> +            ms->numa_state->num_nodes != (ms->smp.sockets * x86ms->smp_dies)) {
> +            error_setg(&error_fatal, "Numa configuration requires smp 'dies' "
> +                       "parameter. Configure the cpu topology properly with "
> +                       "max_cpus = sockets * dies * cores * threads");
> +            return;
> +        }
>          x86_set_epyc_topo_handlers(ms);
>      }

we also should error out in case
    ms->numa_state->num_nodes != (ms->smp.sockets * x86ms->smp_dies)
and ask the user to configure numa to match the total number of dies used.





* Re: [PATCH v3 2/3] hw/i386: Add a new check to configure smp dies for EPYC
  2020-08-07 19:11     ` Igor Mammedov
@ 2020-08-11 21:03       ` Babu Moger
  2020-08-13 13:56         ` Igor Mammedov
  0 siblings, 1 reply; 13+ messages in thread
From: Babu Moger @ 2020-08-11 21:03 UTC (permalink / raw)
  To: Igor Mammedov, Daniel P. Berrangé
  Cc: qemu-devel, pbonzini, rth, ehabkost, mst



On 8/7/20 2:11 PM, Igor Mammedov wrote:
> On Fri, 7 Aug 2020 17:52:22 +0100
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
>> On Fri, Aug 07, 2020 at 11:32:51AM -0500, Babu Moger wrote:
>>> Adding a new check to warn the users to configure 'dies' when
>>> topology is numa configured. It makes it easy to build the
>>> topology for EPYC models.  
>>
>> This says you're adding a warning....
>>
>>>
>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>> ---
>>>  hw/i386/x86.c |    7 +++++++
>>>  1 file changed, 7 insertions(+)
>>>
>>> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
>>> index 67bee1bcb8..2a6ce56ef1 100644
>>> --- a/hw/i386/x86.c
>>> +++ b/hw/i386/x86.c
>>> @@ -138,6 +138,13 @@ void x86_cpus_init(X86MachineState *x86ms, int default_cpu_version)
>>>  
>>>      /* Check for apicid encoding */
>>>      if (cpu_x86_use_epyc_apic_id_encoding(ms->cpu_type)) {
>>> +        if ((ms->numa_state->num_nodes > 0) &&
>>> +            ms->numa_state->num_nodes != (ms->smp.sockets * x86ms->smp_dies)) {
>>> +            error_setg(&error_fatal, "Numa configuration requires smp 'dies' "
>>> +                       "parameter. Configure the cpu topology properly with "
>>> +                       "max_cpus = sockets * dies * cores * threads");  
>>
>> ...but you're actually making this a fatal error, not a warning.
>>
>> I'm not sure this is really OK. Wouldn't this mean that existing VMs
>> deployed today risk triggering this fatal error the next time they
>> are booted or live migrated?  If it is possible someone is using
>> such a config today, I don't think we can break it.
> 
> to begin with, users shouldn't have used 'dies' with the initial impl. at all.
> (it was an Intel-introduced option, and EPYC added a very similar internal node_id
> (removed by the next patch)).
> Now we are trying to consolidate this mess and reuse dies for EPYC.
> 
> EPYC support has been out since 5.0 (though broken); users could start a VM with
> such a config, but it would not be a correct EPYC from the apicid and cpuid point of view.
> A guest OS might run if it doesn't know about EPYCs, or behave weirdly (suboptimal|crash|whatever)
> on seeing unexpected values.
> 
> If we are hell-bent on keeping the bugs of the initial impl, then we should keep them
> for <=5.1 machine versions and do the right thing for newer ones.
> Though I'm not sure we should keep the broken variant around (all we would get from it is
> bug reports*/complaints from users about the end result of their config anyway).
> I'd rather error out with a clear error message so users can fix their broken config.
> 
> *) there is at least one thread/bz on qemu-devel where users are trying to run
> with EPYC and to pick an option combination that would produce a sensible topology.


I am still not sure what the right approach is here.  I can think of a
couple of options.
1. If smp_dies != num_nodes, then go ahead and create the configuration with
 that many smp_dies, and warn (but not error out) users about the misconfiguration.

2. Introduce it as a fix based on the machine version (>5.1), like Igor
mentioned. I am not sure how to achieve that. I can look into that.

Thanks
Babu

> 
> 
>> Regards,
>> Daniel
> 



* RE: [PATCH v3 2/3] hw/i386: Add a new check to configure smp dies for EPYC
  2020-08-07 19:27   ` Igor Mammedov
@ 2020-08-11 21:04     ` Babu Moger
  0 siblings, 0 replies; 13+ messages in thread
From: Babu Moger @ 2020-08-11 21:04 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: qemu-devel, pbonzini, mst, ehabkost, rth



> -----Original Message-----
> From: Igor Mammedov <imammedo@redhat.com>
> Sent: Friday, August 7, 2020 2:27 PM
> To: Moger, Babu <Babu.Moger@amd.com>
> Cc: pbonzini@redhat.com; rth@twiddle.net; ehabkost@redhat.com; qemu-
> devel@nongnu.org; mst@redhat.com
> Subject: Re: [PATCH v3 2/3] hw/i386: Add a new check to configure smp dies for
> EPYC
> 
> On Fri, 07 Aug 2020 11:32:51 -0500
> Babu Moger <babu.moger@amd.com> wrote:
> 
> > Adding a new check to warn the users to configure 'dies' when
> s/warn .../error out .../
> 
> > topology is numa configured. It makes it easy to build the topology
> > for EPYC models.
> 
> probably it should mention that this will break configs that do not have the
> correct topology configured.

Sure. Will do that.

> 
> 
> > Signed-off-by: Babu Moger <babu.moger@amd.com>
> > ---
> >  hw/i386/x86.c |    7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/hw/i386/x86.c b/hw/i386/x86.c index
> > 67bee1bcb8..2a6ce56ef1 100644
> > --- a/hw/i386/x86.c
> > +++ b/hw/i386/x86.c
> > @@ -138,6 +138,13 @@ void x86_cpus_init(X86MachineState *x86ms, int
> > default_cpu_version)
> >
> >      /* Check for apicid encoding */
> >      if (cpu_x86_use_epyc_apic_id_encoding(ms->cpu_type)) {
> > +        if ((ms->numa_state->num_nodes > 0) &&
> > +            ms->numa_state->num_nodes != (ms->smp.sockets * x86ms-
> >smp_dies)) {
> > +            error_setg(&error_fatal, "Numa configuration requires smp 'dies' "
> > +                       "parameter. Configure the cpu topology properly with "
> > +                       "max_cpus = sockets * dies * cores * threads");
> > +            return;
> > +        }
> >          x86_set_epyc_topo_handlers(ms);
> >      }
> 
> we also should error out in case
>     ms->numa_state->num_nodes != (ms->smp.sockets * x86ms->smp_dies) and
> ask the user to configure numa to match the total number of dies used.
> 

I have already added the same check above. Do you want anything more?




* Re: [PATCH v3 2/3] hw/i386: Add a new check to configure smp dies for EPYC
  2020-08-11 21:03       ` Babu Moger
@ 2020-08-13 13:56         ` Igor Mammedov
  2020-08-13 21:10           ` Babu Moger
  0 siblings, 1 reply; 13+ messages in thread
From: Igor Mammedov @ 2020-08-13 13:56 UTC (permalink / raw)
  To: Babu Moger
  Cc: Daniel P. Berrangé, ehabkost, mst, qemu-devel, pbonzini, rth

On Tue, 11 Aug 2020 16:03:58 -0500
Babu Moger <babu.moger@amd.com> wrote:

> On 8/7/20 2:11 PM, Igor Mammedov wrote:
> > On Fri, 7 Aug 2020 17:52:22 +0100
> > Daniel P. Berrangé <berrange@redhat.com> wrote:
> >   
> >> On Fri, Aug 07, 2020 at 11:32:51AM -0500, Babu Moger wrote:  
> >>> Adding a new check to warn the users to configure 'dies' when
> >>> topology is numa configured. It makes it easy to build the
> >>> topology for EPYC models.    
> >>
> >> This says you're adding a warning....
> >>  
> >>>
> >>> Signed-off-by: Babu Moger <babu.moger@amd.com>
> >>> ---
> >>>  hw/i386/x86.c |    7 +++++++
> >>>  1 file changed, 7 insertions(+)
> >>>
> >>> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> >>> index 67bee1bcb8..2a6ce56ef1 100644
> >>> --- a/hw/i386/x86.c
> >>> +++ b/hw/i386/x86.c
> >>> @@ -138,6 +138,13 @@ void x86_cpus_init(X86MachineState *x86ms, int default_cpu_version)
> >>>  
> >>>      /* Check for apicid encoding */
> >>>      if (cpu_x86_use_epyc_apic_id_encoding(ms->cpu_type)) {
> >>> +        if ((ms->numa_state->num_nodes > 0) &&
> >>> +            ms->numa_state->num_nodes != (ms->smp.sockets * x86ms->smp_dies)) {
> >>> +            error_setg(&error_fatal, "Numa configuration requires smp 'dies' "
> >>> +                       "parameter. Configure the cpu topology properly with "
> >>> +                       "max_cpus = sockets * dies * cores * threads");    
> >>
> >> ...but you're actually making this a fatal error, not a warning.
> >>
> >> I'm not sure this is really OK. Wouldn't this mean that existing VMs
> >> deployed today risk triggering this fatal error the next time they
> >> are booted or live migrated?  If it is possible someone is using
> >> such a config today, I don't think we can break it.  
> > 
> > to begin with, users shouldn't have used 'dies' with the initial impl. at all.
> > (it was an Intel-introduced option, and EPYC added a very similar internal node_id
> > (removed by the next patch)).
> > Now we are trying to consolidate this mess and reuse dies for EPYC.
> > 
> > EPYC support has been out since 5.0 (though broken); users could start a VM with
> > such a config, but it would not be a correct EPYC from the apicid and cpuid point of view.
> > A guest OS might run if it doesn't know about EPYCs, or behave weirdly (suboptimal|crash|whatever)
> > on seeing unexpected values.
> > 
> > If we are hell-bent on keeping the bugs of the initial impl, then we should keep them
> > for <=5.1 machine versions and do the right thing for newer ones.
> > Though I'm not sure we should keep the broken variant around (all we would get from it is
> > bug reports*/complaints from users about the end result of their config anyway).
> > I'd rather error out with a clear error message so users can fix their broken config.
> > 
> > *) there is at least one thread/bz on qemu-devel where users are trying to run
> > with EPYC and to pick an option combination that would produce a sensible topology.
> 
> 
> I am still not sure what the right approach is here.  I can think of a
> couple of options.
> 1. If smp_dies != num_nodes, then go ahead and create the configuration with
>  that many smp_dies, and warn (but not error out) users about the misconfiguration.
a warning is a bad idea; that usually leads to trouble down the road.

Provided that the code is relatively new and produces misconfigured CPUs,
and if nobody insists on keeping the bug around, I'd try to go for erroring out.
Yes, that would break misconfigured configs, but that can be fixed by
reconfiguring on the user side.

> 2. Introduce it as a fix based on the machine version (>5.1), like Igor
> mentioned. I am not sure how to achieve that. I can look into that.
That's a headache from a maintenance point of view, so again, if nobody insists,
I'd rather avoid it.

> 
> Thanks
> Babu
> 
> > 
> >   
> >> Regards,
> >> Daniel  
> >   
> 




* RE: [PATCH v3 2/3] hw/i386: Add a new check to configure smp dies for EPYC
  2020-08-13 13:56         ` Igor Mammedov
@ 2020-08-13 21:10           ` Babu Moger
  0 siblings, 0 replies; 13+ messages in thread
From: Babu Moger @ 2020-08-13 21:10 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Daniel P. Berrangé, ehabkost, mst, qemu-devel, pbonzini, rth



> -----Original Message-----
> From: Igor Mammedov <imammedo@redhat.com>
> Sent: Thursday, August 13, 2020 8:56 AM
> To: Moger, Babu <Babu.Moger@amd.com>
> Cc: Daniel P. Berrangé <berrange@redhat.com>; ehabkost@redhat.com;
> mst@redhat.com; qemu-devel@nongnu.org; pbonzini@redhat.com;
> rth@twiddle.net
> Subject: Re: [PATCH v3 2/3] hw/i386: Add a new check to configure smp dies for
> EPYC
> 
> On Tue, 11 Aug 2020 16:03:58 -0500
> Babu Moger <babu.moger@amd.com> wrote:
> 
> > On 8/7/20 2:11 PM, Igor Mammedov wrote:
> > > On Fri, 7 Aug 2020 17:52:22 +0100
> > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > >
> > >> On Fri, Aug 07, 2020 at 11:32:51AM -0500, Babu Moger wrote:
> > >>> Adding a new check to warn the users to configure 'dies' when
> > >>> topology is numa configured. It makes it easy to build the
> > >>> topology for EPYC models.
> > >>
> > >> This says you're adding a warning....
> > >>
> > >>>
> > >>> Signed-off-by: Babu Moger <babu.moger@amd.com>
> > >>> ---
> > >>>  hw/i386/x86.c |    7 +++++++
> > >>>  1 file changed, 7 insertions(+)
> > >>>
> > >>> diff --git a/hw/i386/x86.c b/hw/i386/x86.c index
> > >>> 67bee1bcb8..2a6ce56ef1 100644
> > >>> --- a/hw/i386/x86.c
> > >>> +++ b/hw/i386/x86.c
> > >>> @@ -138,6 +138,13 @@ void x86_cpus_init(X86MachineState *x86ms,
> > >>> int default_cpu_version)
> > >>>
> > >>>      /* Check for apicid encoding */
> > >>>      if (cpu_x86_use_epyc_apic_id_encoding(ms->cpu_type)) {
> > >>> +        if ((ms->numa_state->num_nodes > 0) &&
> > >>> +            ms->numa_state->num_nodes != (ms->smp.sockets * x86ms-
> >smp_dies)) {
> > >>> +            error_setg(&error_fatal, "Numa configuration requires smp 'dies' "
> > >>> +                       "parameter. Configure the cpu topology properly with "
> > >>> +                       "max_cpus = sockets * dies * cores * threads");
> > >>
> > >> ...but you're actually making this a fatal error, not a warning.
> > >>
> > >> I'm not sure this is really OK. Wouldn't this mean that existing
> > >> VMs deployed today risk triggering this fatal error the next time they
> > >> are booted or live migrated?  If it is possible someone is using
> > >> such a config today, I don't think we can break it.
> > >
> > > to begin with, users shouldn't have used 'dies' with the initial impl. at all.
> > > (it was an Intel-introduced option, and EPYC added a very similar
> > > internal node_id (removed by the next patch)).
> > > Now we are trying to consolidate this mess and reuse dies for EPYC.
> > >
> > > EPYC support has been out since 5.0 (though broken); users could
> > > start a VM with such a config, but it would not be a correct EPYC from the
> > > apicid and cpuid point of view.
> > > A guest OS might run if it doesn't know about EPYCs, or behave weirdly
> > > (suboptimal|crash|whatever) on seeing unexpected values.
> > >
> > > If we are hell-bent on keeping the bugs of the initial impl, then we should
> > > keep them for <=5.1 machine versions and do the right thing for newer ones.
> > > Though I'm not sure we should keep the broken variant around (all we
> > > would get from it is bug reports*/complaints from users about the end result
> > > of their config anyway).
> > > I'd rather error out with a clear error message so users can fix their
> > > broken config.
> > >
> > > *) there is at least one thread/bz on qemu-devel where users are
> > > trying to run with EPYC and to pick an option combination that would
> > > produce a sensible topology.
> >
> >
> > I am still not sure what the right approach is here.  I can think of a
> > couple of options.
> > 1. If smp_dies != num_nodes, then go ahead and create the configuration
> > with that many smp_dies, and warn (but not error out) users about the
> > misconfiguration.
> a warning is a bad idea; that usually leads to trouble down the road.
> 
> Provided that the code is relatively new and produces misconfigured CPUs, and if
> nobody insists on keeping the bug around, I'd try to go for erroring out.
> Yes, that would break misconfigured configs, but that can be fixed by
> reconfiguring on the user side.

OK. I will refresh the patches if there are no other comments. Thanks.

> 
> > 2. Introduce it as a fix based on the machine version (>5.1), like Igor
> > mentioned. I am not sure how to achieve that. I can look into that.
> That's a headache from a maintenance point of view, so again, if nobody insists,
> I'd rather avoid it.
> 
> >
> > Thanks
> > Babu
> >
> > >
> > >
> > >> Regards,
> > >> Daniel
> > >
> >



