* [PATCH v3 00/17] Support smp.clusters for x86
@ 2023-08-01 10:35 Zhao Liu
  2023-08-01 10:35 ` [PATCH v3 01/17] i386: Fix comment style in topology.h Zhao Liu
                   ` (18 more replies)
  0 siblings, 19 replies; 63+ messages in thread
From: Zhao Liu @ 2023-08-01 10:35 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

Hi list,

This is our v3 patch series, rebased on the master branch at commit
234320cd0573 ("Merge tag 'pull-target-arm-20230731' of
https://git.linaro.org/people/pmaydell/qemu-arm into staging").

Compared with v2 [1], v3 mainly adds "Tested-by", "Reviewed-by" and
"Acked-by" tags (for PC-related patches) and minor code changes (please
see the changelog).


# Introduction

This series adds cluster support for the x86 PC machine, which allows
x86 to use smp.clusters to configure the x86 module-level CPU topology.

Also, due to a compatibility issue (see section: ## Why not share L2
cache in cluster directly), this series introduces a new "-cpu" property
to adjust the x86 L2 cache topology.

Welcome your comments!


# Background

The "clusters" parameter in "smp" was introduced for ARM [2], but x86
has not supported it yet.

At present, x86 defaults to the L2 cache being shared by one core, but
this is not enough. There are platforms where multiple cores share the
same L2 cache, e.g., Alder Lake-P shares one L2 cache per module of
Atom cores [3], that is, every four Atom cores share one L2 cache.
Therefore, we need the new CPU topology level (cluster/module).

Another reason is the hybrid architecture. Cluster support not only
provides another level of topology definition for x86, but also provides
the code changes required for our future hybrid topology support.


# Overview

## Introduction of module level for x86

"cluster" in smp is the CPU topology level between "core" and "die".

For x86, the "cluster" in smp corresponds to the module level [4],
which is above the core level. So we use "module" rather than "cluster"
in the x86 code.

Please also note that x86 already has a CPU topology level named
"cluster" [4]; that level sits above the package level. The cluster in
the x86 CPU topology is completely different from "clusters" as the smp
parameter. After the module level is introduced, the cluster smp
parameter will actually refer to the module level of x86.


## Why not share L2 cache in cluster directly

Though "clusters" was introduced to help define the L2 cache topology
[2], using cluster to define x86's L2 cache topology would cause a
compatibility problem:

Currently, x86 defaults to the L2 cache being shared by one core, which
actually implies a default setting of "1 core per L2 cache" and
therefore implicitly defaults to having as many L2 caches as cores.

For example (i386 PC machine):
-smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16 (*)

Considering the topology of the L2 cache, this (*) implicitly means "1
core per L2 cache" and "2 L2 caches per die".

If we use cluster to configure L2 cache topology with the new default
setting "clusters per L2 cache is 1", the above semantics will change
to "2 cores per cluster" and "1 cluster per L2 cache", that is, "2
cores per L2 cache".

So the same command (*) will cause changes in the L2 cache topology,
further affecting the performance of the virtual machine.

Therefore, x86 should only treat cluster as a CPU topology level and,
for compatibility, avoid using it to change the L2 cache topology by
default.


## module level in CPUID

Currently, we don't expose the module level in CPUID.1FH because Linux
(v6.2-rc6) doesn't support the module level yet, and exposing the module
and die levels at the same time in CPUID.1FH would cause Linux to
calculate a wrong die_id. The module level should not be exposed until
real machines have the module level in CPUID.1FH.

We can configure CPUID.04H.02H (the L2 cache topology) with the module
level via a new property:

        "-cpu <model>,x-l2-cache-topo=cluster"

For more information about this property, please see the section: "## New
property: x-l2-cache-topo".


## New cache topology info in CPUCacheInfo

Currently, by default, the cache topology is encoded as:
1. i/d cache is shared in one core.
2. L2 cache is shared in one core.
3. L3 cache is shared in one die.

This default general setting has caused a misunderstanding: that the
cache topology is completely equated with a specific CPU topology, such
as the connection between the L2 cache and the core level, and between
the L3 cache and the die level.

In fact, these topology settings depend on the specific platform and are
not static. For example, on Alder Lake-P, every four Atom cores share
the same L2 cache [3].

Thus, in this patch set, we explicitly define the corresponding cache
topology for different CPU models, which has two benefits:
1. It is easy to extend to new CPU models with a different cache
   topology in the future.
2. A custom cache topology can easily be supported via a property (e.g.,
   x-l2-cache-topo).


## New property: x-l2-cache-topo

The property x-l2-cache-topo will be used to change the L2 cache
topology in CPUID.04H.

It allows the user to set whether the L2 cache is shared at the core
level or at the cluster level.

If the user passes "-cpu x-l2-cache-topo=[core|cluster]", the old L2
cache topology will be overridden by the new topology setting.

Since CPUID.04H is used by Intel CPUs, this property is only available
on Intel CPUs for now.

When necessary, it can be extended to CPUID[0x8000001D] for AMD CPUs.


# Patch description

patch 1-2: Cleanups of coding style and a test name.

patch 3-4,15: Fixes for x86 topology, Intel L1 cache topology and AMD
              cache topology encoding.

patch 5-6: Cleanups of topology-related CPUID encoding and QEMU
           topology variables.

patch 7-12: Add the module as the new CPU topology level in x86; it
            corresponds to the cluster level in generic code.

patch 13,14,16: Add cache topology information to the cache models.

patch 17: Introduce a new property to configure the L2 cache topology.


[1]: https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg07179.html
[2]: https://patchew.org/QEMU/20211228092221.21068-1-wangyanan55@huawei.com/
[3]: https://www.intel.com/content/www/us/en/products/platforms/details/alder-lake-p.html
[4]: SDM, vol.3, ch.9, 9.9.1 Hierarchical Mapping of Shared Resources.

Best Regards,
Zhao

---
Changelog:

Changes since v2:
 * Add "Tested-by", "Reviewed-by" and "Acked-by" tags.
 * Use newly added wrapped helper to get cores per socket in
   qemu_init_vcpu().

Changes since v1:
 * Reordered patches. (Yanan)
 * Deprecated the patch to fix comment of machine_parse_smp_config().
   (Yanan)
 * Rename test-x86-cpuid.c to test-x86-topo.c. (Yanan)
 * Split the intel's l1 cache topology fix into a new separate patch.
   (Yanan)
 * Combined module_id and APIC ID for module level support into one
   patch. (Yanan)
 * Make the cache_info_passthrough case of the CPUID 0x04 leaf in
   cpu_x86_cpuid() use max_processor_ids_for_cache() and
   max_core_ids_in_package() to encode CPUID[4]. (Yanan)
 * Add the prefix "CPU_TOPO_LEVEL_*" for CPU topology level names.
   (Yanan)
 * Rename the "INVALID" level to "CPU_TOPO_LEVEL_UNKNOW". (Yanan)

---
Zhao Liu (10):
  i386: Fix comment style in topology.h
  tests: Rename test-x86-cpuid.c to test-x86-topo.c
  i386/cpu: Fix i/d-cache topology to core level for Intel CPU
  i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4]
  i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
  i386: Add cache topology info in CPUCacheInfo
  i386: Use CPUCacheInfo.share_level to encode CPUID[4]
  i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]
  i386: Use CPUCacheInfo.share_level to encode
    CPUID[0x8000001D].EAX[bits 25:14]
  i386: Add new property to control L2 cache topo in CPUID.04H

Zhuocheng Ding (7):
  softmmu: Fix CPUSTATE.nr_cores' calculation
  i386: Introduce module-level cpu topology to CPUX86State
  i386: Support modules_per_die in X86CPUTopoInfo
  i386: Support module_id in X86CPUTopoIDs
  i386/cpu: Introduce cluster-id to X86CPU
  tests: Add test case of APIC ID for module level parsing
  hw/i386/pc: Support smp.clusters for x86 PC machine

 MAINTAINERS                                   |   2 +-
 hw/i386/pc.c                                  |   1 +
 hw/i386/x86.c                                 |  49 +++++-
 include/hw/core/cpu.h                         |   2 +-
 include/hw/i386/topology.h                    |  68 +++++---
 qemu-options.hx                               |  10 +-
 softmmu/cpus.c                                |   2 +-
 target/i386/cpu.c                             | 158 ++++++++++++++----
 target/i386/cpu.h                             |  25 +++
 tests/unit/meson.build                        |   4 +-
 .../{test-x86-cpuid.c => test-x86-topo.c}     |  58 ++++---
 11 files changed, 280 insertions(+), 99 deletions(-)
 rename tests/unit/{test-x86-cpuid.c => test-x86-topo.c} (73%)

-- 
2.34.1




* [PATCH v3 01/17] i386: Fix comment style in topology.h
  2023-08-01 10:35 [PATCH v3 00/17] Support smp.clusters for x86 Zhao Liu
@ 2023-08-01 10:35 ` Zhao Liu
  2023-08-01 23:13   ` Moger, Babu
  2023-08-07  2:16   ` Xiaoyao Li
  2023-08-01 10:35 ` [PATCH v3 02/17] tests: Rename test-x86-cpuid.c to test-x86-topo.c Zhao Liu
                   ` (17 subsequent siblings)
  18 siblings, 2 replies; 63+ messages in thread
From: Zhao Liu @ 2023-08-01 10:35 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

For function comments in this file, keep the comment style consistent
with other places.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
---
 include/hw/i386/topology.h | 33 +++++++++++++++++----------------
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index 81573f6cfde0..5a19679f618b 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -24,7 +24,8 @@
 #ifndef HW_I386_TOPOLOGY_H
 #define HW_I386_TOPOLOGY_H
 
-/* This file implements the APIC-ID-based CPU topology enumeration logic,
+/*
+ * This file implements the APIC-ID-based CPU topology enumeration logic,
  * documented at the following document:
  *   Intel® 64 Architecture Processor Topology Enumeration
  *   http://software.intel.com/en-us/articles/intel-64-architecture-processor-topology-enumeration/
@@ -41,7 +42,8 @@
 
 #include "qemu/bitops.h"
 
-/* APIC IDs can be 32-bit, but beware: APIC IDs > 255 require x2APIC support
+/*
+ * APIC IDs can be 32-bit, but beware: APIC IDs > 255 require x2APIC support
  */
 typedef uint32_t apic_id_t;
 
@@ -58,8 +60,7 @@ typedef struct X86CPUTopoInfo {
     unsigned threads_per_core;
 } X86CPUTopoInfo;
 
-/* Return the bit width needed for 'count' IDs
- */
+/* Return the bit width needed for 'count' IDs */
 static unsigned apicid_bitwidth_for_count(unsigned count)
 {
     g_assert(count >= 1);
@@ -67,15 +68,13 @@ static unsigned apicid_bitwidth_for_count(unsigned count)
     return count ? 32 - clz32(count) : 0;
 }
 
-/* Bit width of the SMT_ID (thread ID) field on the APIC ID
- */
+/* Bit width of the SMT_ID (thread ID) field on the APIC ID */
 static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
 {
     return apicid_bitwidth_for_count(topo_info->threads_per_core);
 }
 
-/* Bit width of the Core_ID field
- */
+/* Bit width of the Core_ID field */
 static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
 {
     return apicid_bitwidth_for_count(topo_info->cores_per_die);
@@ -87,8 +86,7 @@ static inline unsigned apicid_die_width(X86CPUTopoInfo *topo_info)
     return apicid_bitwidth_for_count(topo_info->dies_per_pkg);
 }
 
-/* Bit offset of the Core_ID field
- */
+/* Bit offset of the Core_ID field */
 static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
 {
     return apicid_smt_width(topo_info);
@@ -100,14 +98,14 @@ static inline unsigned apicid_die_offset(X86CPUTopoInfo *topo_info)
     return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
 }
 
-/* Bit offset of the Pkg_ID (socket ID) field
- */
+/* Bit offset of the Pkg_ID (socket ID) field */
 static inline unsigned apicid_pkg_offset(X86CPUTopoInfo *topo_info)
 {
     return apicid_die_offset(topo_info) + apicid_die_width(topo_info);
 }
 
-/* Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
+/*
+ * Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
  *
  * The caller must make sure core_id < nr_cores and smt_id < nr_threads.
  */
@@ -120,7 +118,8 @@ static inline apic_id_t x86_apicid_from_topo_ids(X86CPUTopoInfo *topo_info,
            topo_ids->smt_id;
 }
 
-/* Calculate thread/core/package IDs for a specific topology,
+/*
+ * Calculate thread/core/package IDs for a specific topology,
  * based on (contiguous) CPU index
  */
 static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
@@ -137,7 +136,8 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
     topo_ids->smt_id = cpu_index % nr_threads;
 }
 
-/* Calculate thread/core/package IDs for a specific topology,
+/*
+ * Calculate thread/core/package IDs for a specific topology,
  * based on APIC ID
  */
 static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
@@ -155,7 +155,8 @@ static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
     topo_ids->pkg_id = apicid >> apicid_pkg_offset(topo_info);
 }
 
-/* Make APIC ID for the CPU 'cpu_index'
+/*
+ * Make APIC ID for the CPU 'cpu_index'
  *
  * 'cpu_index' is a sequential, contiguous ID for the CPU.
  */
-- 
2.34.1




* [PATCH v3 02/17] tests: Rename test-x86-cpuid.c to test-x86-topo.c
  2023-08-01 10:35 [PATCH v3 00/17] Support smp.clusters for x86 Zhao Liu
  2023-08-01 10:35 ` [PATCH v3 01/17] i386: Fix comment style in topology.h Zhao Liu
@ 2023-08-01 10:35 ` Zhao Liu
  2023-08-01 23:20   ` Moger, Babu
  2023-08-01 10:35 ` [PATCH v3 03/17] softmmu: Fix CPUSTATE.nr_cores' calculation Zhao Liu
                   ` (16 subsequent siblings)
  18 siblings, 1 reply; 63+ messages in thread
From: Zhao Liu @ 2023-08-01 10:35 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu, Yongwei Ma

From: Zhao Liu <zhao1.liu@intel.com>

In fact, this unit test tests the APIC ID rather than CPUID.
Rename it to test-x86-topo.c to make its name match its actual
content.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
---
Changes since v1:
 * Rename test-x86-apicid.c to test-x86-topo.c. (Yanan)
---
 MAINTAINERS                                      | 2 +-
 tests/unit/meson.build                           | 4 ++--
 tests/unit/{test-x86-cpuid.c => test-x86-topo.c} | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)
 rename tests/unit/{test-x86-cpuid.c => test-x86-topo.c} (99%)

diff --git a/MAINTAINERS b/MAINTAINERS
index 12e59b6b27de..51ba3d593e90 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1719,7 +1719,7 @@ F: include/hw/southbridge/ich9.h
 F: include/hw/southbridge/piix.h
 F: hw/isa/apm.c
 F: include/hw/isa/apm.h
-F: tests/unit/test-x86-cpuid.c
+F: tests/unit/test-x86-topo.c
 F: tests/qtest/test-x86-cpuid-compat.c
 
 PC Chipset
diff --git a/tests/unit/meson.build b/tests/unit/meson.build
index 93977cc32d2b..39b5d0007c69 100644
--- a/tests/unit/meson.build
+++ b/tests/unit/meson.build
@@ -21,8 +21,8 @@ tests = {
   'test-opts-visitor': [testqapi],
   'test-visitor-serialization': [testqapi],
   'test-bitmap': [],
-  # all code tested by test-x86-cpuid is inside topology.h
-  'test-x86-cpuid': [],
+  # all code tested by test-x86-topo is inside topology.h
+  'test-x86-topo': [],
   'test-cutils': [],
   'test-div128': [],
   'test-shift128': [],
diff --git a/tests/unit/test-x86-cpuid.c b/tests/unit/test-x86-topo.c
similarity index 99%
rename from tests/unit/test-x86-cpuid.c
rename to tests/unit/test-x86-topo.c
index bfabc0403a1a..2b104f86d7c2 100644
--- a/tests/unit/test-x86-cpuid.c
+++ b/tests/unit/test-x86-topo.c
@@ -1,5 +1,5 @@
 /*
- *  Test code for x86 CPUID and Topology functions
+ *  Test code for x86 APIC ID and Topology functions
  *
  *  Copyright (c) 2012 Red Hat Inc.
  *
-- 
2.34.1




* [PATCH v3 03/17] softmmu: Fix CPUSTATE.nr_cores' calculation
  2023-08-01 10:35 [PATCH v3 00/17] Support smp.clusters for x86 Zhao Liu
  2023-08-01 10:35 ` [PATCH v3 01/17] i386: Fix comment style in topology.h Zhao Liu
  2023-08-01 10:35 ` [PATCH v3 02/17] tests: Rename test-x86-cpuid.c to test-x86-topo.c Zhao Liu
@ 2023-08-01 10:35 ` Zhao Liu
  2023-08-02 15:25   ` Moger, Babu
  2023-08-07  7:03   ` Xiaoyao Li
  2023-08-01 10:35 ` [PATCH v3 04/17] i386/cpu: Fix i/d-cache topology to core level for Intel CPU Zhao Liu
                   ` (15 subsequent siblings)
  18 siblings, 2 replies; 63+ messages in thread
From: Zhao Liu @ 2023-08-01 10:35 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu,
	Zhuocheng Ding

From: Zhuocheng Ding <zhuocheng.ding@intel.com>

From CPUState.nr_cores' comment, it represents the "number of cores
within this CPU package".

After 003f230e37d7 ("machine: Tweak the order of topology members in
struct CpuTopology"), the meaning of smp.cores changed to "the number of
cores in one die", but that commit missed updating CPUState.nr_cores'
calculation, so CPUState.nr_cores became wrong: it no longer accounts
for the numbers of clusters and dies.

At present, only i386 is using CPUState.nr_cores.

But for i386, which supports the die level, the uses of
CPUState.nr_cores are very confusing:

Early uses are based on the meaning of "cores per package" (before die
was introduced into i386), and later uses are based on "cores per die"
(after die's introduction).

This difference is because commit a94e1428991f ("target/i386: Add
CPUID.1F generation support for multi-dies PCMachine") misunderstood
CPUState.nr_cores as "cores per die" when calculating
CPUID.1FH.01H:EBX. After that, the changes in i386 all followed this
wrong understanding.

With the influence of 003f230e37d7 and a94e1428991f, for i386 the
current result of CPUState.nr_cores is "cores per die", so the original
uses of CPUState.nr_cores based on the meaning of "cores per package"
are wrong when multiple dies exist:
1. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.01H:EBX[bits 23:16] is
   incorrect because it expects "cpus per package" but now gets
   "cpus per die".
2. In cpu_x86_cpuid() of target/i386/cpu.c, for all leaves of CPUID.04H:
   EAX[bits 31:26] is incorrect because it expects "cpus per package"
   but now gets "cpus per die". The error not only impacts the EAX
   calculation in the cache_info_passthrough case, but also impacts the
   other cases that set the cache topology for Intel CPUs according to
   the CPU topology (specifically, the incoming parameter "num_cores"
   expects "cores per package" in encode_cache_cpuid4()).
3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.0BH.01H:EBX[bits
   15:00] is incorrect because the EBX of 0BH.01H (core level) expects
   "cpus per package", which may differ from 1FH.01H (the reason is
   that 1FH can support more levels; for QEMU, 1FH also supports die,
   and 1FH.01H:EBX[bits 15:00] expects "cpus per die").
4. In cpu_x86_cpuid() of target/i386/cpu.c, when CPUID.80000001H is
   calculated, "cpus per package" is expected to be checked, but in
   fact it now checks "cpus per die". Though "cpus per die" also works
   for this code logic, it isn't consistent with AMD's APM.
5. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.80000008H:ECX expects
   "cpus per package" but obtains "cpus per die".
6. In simulate_rdmsr() of target/i386/hvf/x86_emu.c, in
   kvm_rdmsr_core_thread_count() of target/i386/kvm/kvm.c, and in
   helper_rdmsr() of target/i386/tcg/sysemu/misc_helper.c,
   MSR_CORE_THREAD_COUNT expects "cpus per package" and "cores per
   package", but these functions obtain "cpus per die" and
   "cores per die".

On the other hand, these uses are correct now (they were added in/after
a94e1428991f):
1. In cpu_x86_cpuid() of target/i386/cpu.c, topo_info.cores_per_die
   meets the actual meaning of CPUState.nr_cores ("cores per die").
2. In cpu_x86_cpuid() of target/i386/cpu.c, vcpus_per_socket (in the
   CPUID.04H calculation) considers the number of dies, so it's correct.
3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.1FH.01H:EBX[bits
   15:00] needs "cpus per die" and gets the correct result, and
   CPUID.1FH.02H:EBX[bits 15:00] gets the correct "cpus per package".

When CPUState.nr_cores is correctly changed back to "cores per package",
the above errors will be fixed without extra work, but the currently
correct cases will go wrong and need special handling to obtain the
correct "cpus/cores per die" they want.

Thus in this patch, we fix CPUState.nr_cores' calculation to fit the
original meaning "cores per package", and also adjust the calculation of
topo_info.cores_per_die, vcpus_per_socket and CPUID.1FH.

In addition, specify in nr_threads' comment that it represents the
number of threads in the "core" to avoid confusion, and also add a
comment for nr_dies in CPUX86State.

Fixes: a94e1428991f ("target/i386: Add CPUID.1F generation support for multi-dies PCMachine")
Fixes: 003f230e37d7 ("machine: Tweak the order of topology members in struct CpuTopology")
Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes since v2:
 * Use wrapped helper to get cores per socket in qemu_init_vcpu().
Changes since v1:
 * Add comment for nr_dies in CPUX86State. (Yanan)
---
 include/hw/core/cpu.h | 2 +-
 softmmu/cpus.c        | 2 +-
 target/i386/cpu.c     | 9 ++++-----
 target/i386/cpu.h     | 1 +
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index fdcbe8735258..57f4d50ace72 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -277,7 +277,7 @@ struct qemu_work_item;
  *   See TranslationBlock::TCG CF_CLUSTER_MASK.
  * @tcg_cflags: Pre-computed cflags for this cpu.
  * @nr_cores: Number of cores within this CPU package.
- * @nr_threads: Number of threads within this CPU.
+ * @nr_threads: Number of threads within this CPU core.
  * @running: #true if CPU is currently running (lockless).
  * @has_waiter: #true if a CPU is currently waiting for the cpu_exec_end;
  * valid under cpu_list_lock.
diff --git a/softmmu/cpus.c b/softmmu/cpus.c
index fed20ffb5dd2..984558d7b245 100644
--- a/softmmu/cpus.c
+++ b/softmmu/cpus.c
@@ -630,7 +630,7 @@ void qemu_init_vcpu(CPUState *cpu)
 {
     MachineState *ms = MACHINE(qdev_get_machine());
 
-    cpu->nr_cores = ms->smp.cores;
+    cpu->nr_cores = machine_topo_get_cores_per_socket(ms);
     cpu->nr_threads =  ms->smp.threads;
     cpu->stopped = true;
     cpu->random_seed = qemu_guest_random_seed_thread_part1();
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 97ad229d8ba3..50613cd04612 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6011,7 +6011,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
     X86CPUTopoInfo topo_info;
 
     topo_info.dies_per_pkg = env->nr_dies;
-    topo_info.cores_per_die = cs->nr_cores;
+    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
     topo_info.threads_per_core = cs->nr_threads;
 
     /* Calculate & apply limits for different index ranges */
@@ -6087,8 +6087,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
              */
             if (*eax & 31) {
                 int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
-                int vcpus_per_socket = env->nr_dies * cs->nr_cores *
-                                       cs->nr_threads;
+                int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
                 if (cs->nr_cores > 1) {
                     *eax &= ~0xFC000000;
                     *eax |= (pow2ceil(cs->nr_cores) - 1) << 26;
@@ -6266,12 +6265,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             break;
         case 1:
             *eax = apicid_die_offset(&topo_info);
-            *ebx = cs->nr_cores * cs->nr_threads;
+            *ebx = topo_info.cores_per_die * topo_info.threads_per_core;
             *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
             break;
         case 2:
             *eax = apicid_pkg_offset(&topo_info);
-            *ebx = env->nr_dies * cs->nr_cores * cs->nr_threads;
+            *ebx = cs->nr_cores * cs->nr_threads;
             *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
             break;
         default:
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index e0771a10433b..7638128d59cc 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1878,6 +1878,7 @@ typedef struct CPUArchState {
 
     TPRAccess tpr_access_type;
 
+    /* Number of dies within this CPU package. */
     unsigned nr_dies;
 } CPUX86State;
 
-- 
2.34.1




* [PATCH v3 04/17] i386/cpu: Fix i/d-cache topology to core level for Intel CPU
  2023-08-01 10:35 [PATCH v3 00/17] Support smp.clusters for x86 Zhao Liu
                   ` (2 preceding siblings ...)
  2023-08-01 10:35 ` [PATCH v3 03/17] softmmu: Fix CPUSTATE.nr_cores' calculation Zhao Liu
@ 2023-08-01 10:35 ` Zhao Liu
  2023-08-04  9:56   ` Xiaoyao Li
  2023-08-01 10:35 ` [PATCH v3 05/17] i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4] Zhao Liu
                   ` (14 subsequent siblings)
  18 siblings, 1 reply; 63+ messages in thread
From: Zhao Liu @ 2023-08-01 10:35 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu, Robert Hoo

From: Zhao Liu <zhao1.liu@intel.com>

For the i-cache and d-cache, the maximum IDs for CPUs sharing the cache
(CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits 25:14]) are
both 0, which means the i-cache and d-cache are shared at the SMT level.
This is correct if there's a single thread per core, but is wrong for
the hyper-threading case (one core contains multiple threads), since the
i-cache and d-cache are shared at the core level rather than the SMT
level.

For AMD CPUs, commit 8f4202fb1080 ("i386: Populate AMD Processor Cache
Information for cpuid 0x8000001D") has already introduced the i/d cache
topology as core level by default.

Therefore, in order to be compatible with both the multi-threaded and
single-threaded cases, we should make the i-cache and d-cache shared at
the core level by default.

This fix changes the default i/d cache topology from per-thread to
per-core. Potentially, this change in L1 cache topology may affect the
performance of the VM if the user does not explicitly specify the
topology or bind the vCPUs. However, the way to achieve optimal
performance is to create a reasonable topology and set the appropriate
vCPU affinity, without relying on QEMU's default topology structure.

Fixes: 7e3482f82480 ("i386: Helpers to encode cache information consistently")
Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes since v1:
 * Split this fix from the patch named "i386/cpu: Fix number of
   addressable IDs in CPUID.04H".
 * Add the explanation of the impact on performance. (Xiaoyao)
---
 target/i386/cpu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 50613cd04612..b439a05244ee 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6104,12 +6104,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             switch (count) {
             case 0: /* L1 dcache info */
                 encode_cache_cpuid4(env->cache_info_cpuid4.l1d_cache,
-                                    1, cs->nr_cores,
+                                    cs->nr_threads, cs->nr_cores,
                                     eax, ebx, ecx, edx);
                 break;
             case 1: /* L1 icache info */
                 encode_cache_cpuid4(env->cache_info_cpuid4.l1i_cache,
-                                    1, cs->nr_cores,
+                                    cs->nr_threads, cs->nr_cores,
                                     eax, ebx, ecx, edx);
                 break;
             case 2: /* L2 cache info */
-- 
2.34.1




* [PATCH v3 05/17] i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4]
  2023-08-01 10:35 [PATCH v3 00/17] Support smp.clusters for x86 Zhao Liu
                   ` (3 preceding siblings ...)
  2023-08-01 10:35 ` [PATCH v3 04/17] i386/cpu: Fix i/d-cache topology to core level for Intel CPU Zhao Liu
@ 2023-08-01 10:35 ` Zhao Liu
  2023-08-02 15:41   ` Moger, Babu
  2023-08-07  8:13   ` Xiaoyao Li
  2023-08-01 10:35 ` [PATCH v3 06/17] i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid() Zhao Liu
                   ` (13 subsequent siblings)
  18 siblings, 2 replies; 63+ messages in thread
From: Zhao Liu @ 2023-08-01 10:35 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu, Robert Hoo

From: Zhao Liu <zhao1.liu@intel.com>

Referring to the fixes for cache_info_passthrough ([1], [2]) and the
SDM, CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26] should use
the nearest power-of-2 integer.

The nearest power-of-2 integer can be calculated by pow2ceil() or by
using the APIC ID offset (like the L3 topology using 1 << die_offset
[3]).

But in fact, CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26]
are associated with the APIC ID. For example, in the Linux kernel, the
field "num_threads_sharing" (bits 25 - 14) is parsed with the APIC ID.
As another example, on Alder Lake-P, CPUID.04H:EAX[bits 31:26] does not
match the actual core count and is calculated as:
"(1 << (pkg_offset - core_offset)) - 1".

Therefore, the APIC ID offsets should be preferred for calculating the
nearest power-of-2 integer for CPUID.04H:EAX[bits 25:14] and
CPUID.04H:EAX[bits 31:26]:
1. The i/d cache is shared in a core, so 1 << core_offset should be used
   instead of "cs->nr_threads" in encode_cache_cpuid4() for
   CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits 25:14].
2. The L2 cache is supposed to be shared in a core as for now, so
   1 << core_offset should also be used instead of "cs->nr_threads" in
   encode_cache_cpuid4() for CPUID.04H.02H:EAX[bits 25:14].
3. Similarly, the value for CPUID.04H:EAX[bits 31:26] should also be
   replaced by the offsets above the SMT level in the APIC ID.

In addition, use the APIC ID offset to replace pow2ceil() for the
cache_info_passthrough case.

[1]: efb3934adf9e ("x86: cpu: make sure number of addressable IDs for processor cores meets the spec")
[2]: d7caf13b5fcf ("x86: cpu: fixup number of addressable IDs for logical processors sharing cache")
[3]: d65af288a84d ("i386: Update new x86_apicid parsing rules with die_offset support")

Fixes: 7e3482f82480 ("i386: Helpers to encode cache information consistently")
Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes since v1:
 * Use APIC ID offset to replace "pow2ceil()" for cache_info_passthrough
   case. (Yanan)
 * Split the L1 cache fix into a separate patch.
 * Rename the title of this patch (the original is "i386/cpu: Fix number
   of addressable IDs in CPUID.04H").
---
 target/i386/cpu.c | 30 +++++++++++++++++++++++-------
 1 file changed, 23 insertions(+), 7 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index b439a05244ee..c80613bfcded 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6005,7 +6005,6 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
 {
     X86CPU *cpu = env_archcpu(env);
     CPUState *cs = env_cpu(env);
-    uint32_t die_offset;
     uint32_t limit;
     uint32_t signature[3];
     X86CPUTopoInfo topo_info;
@@ -6089,39 +6088,56 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
                 int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
                 int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
                 if (cs->nr_cores > 1) {
+                    int addressable_cores_offset =
+                                                apicid_pkg_offset(&topo_info) -
+                                                apicid_core_offset(&topo_info);
+
                     *eax &= ~0xFC000000;
-                    *eax |= (pow2ceil(cs->nr_cores) - 1) << 26;
+                    *eax |= ((1 << addressable_cores_offset) - 1) << 26;
                 }
                 if (host_vcpus_per_cache > vcpus_per_socket) {
+                    int pkg_offset = apicid_pkg_offset(&topo_info);
+
                     *eax &= ~0x3FFC000;
-                    *eax |= (pow2ceil(vcpus_per_socket) - 1) << 14;
+                    *eax |= ((1 << pkg_offset) - 1) << 14;
                 }
             }
         } else if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
             *eax = *ebx = *ecx = *edx = 0;
         } else {
             *eax = 0;
+            int addressable_cores_offset = apicid_pkg_offset(&topo_info) -
+                                           apicid_core_offset(&topo_info);
+            int core_offset, die_offset;
+
             switch (count) {
             case 0: /* L1 dcache info */
+                core_offset = apicid_core_offset(&topo_info);
                 encode_cache_cpuid4(env->cache_info_cpuid4.l1d_cache,
-                                    cs->nr_threads, cs->nr_cores,
+                                    (1 << core_offset),
+                                    (1 << addressable_cores_offset),
                                     eax, ebx, ecx, edx);
                 break;
             case 1: /* L1 icache info */
+                core_offset = apicid_core_offset(&topo_info);
                 encode_cache_cpuid4(env->cache_info_cpuid4.l1i_cache,
-                                    cs->nr_threads, cs->nr_cores,
+                                    (1 << core_offset),
+                                    (1 << addressable_cores_offset),
                                     eax, ebx, ecx, edx);
                 break;
             case 2: /* L2 cache info */
+                core_offset = apicid_core_offset(&topo_info);
                 encode_cache_cpuid4(env->cache_info_cpuid4.l2_cache,
-                                    cs->nr_threads, cs->nr_cores,
+                                    (1 << core_offset),
+                                    (1 << addressable_cores_offset),
                                     eax, ebx, ecx, edx);
                 break;
             case 3: /* L3 cache info */
                 die_offset = apicid_die_offset(&topo_info);
                 if (cpu->enable_l3_cache) {
                     encode_cache_cpuid4(env->cache_info_cpuid4.l3_cache,
-                                        (1 << die_offset), cs->nr_cores,
+                                        (1 << die_offset),
+                                        (1 << addressable_cores_offset),
                                         eax, ebx, ecx, edx);
                     break;
                 }
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v3 06/17] i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
  2023-08-01 10:35 [PATCH v3 00/17] Support smp.clusters for x86 Zhao Liu
                   ` (4 preceding siblings ...)
  2023-08-01 10:35 ` [PATCH v3 05/17] i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4] Zhao Liu
@ 2023-08-01 10:35 ` Zhao Liu
  2023-08-02 16:31   ` Moger, Babu
  2023-08-01 10:35 ` [PATCH v3 07/17] i386: Introduce module-level cpu topology to CPUX86State Zhao Liu
                   ` (12 subsequent siblings)
  18 siblings, 1 reply; 63+ messages in thread
From: Zhao Liu @ 2023-08-01 10:35 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu, Robert Hoo

From: Zhao Liu <zhao1.liu@intel.com>

In cpu_x86_cpuid(), there are many variables representing the CPU
topology, e.g., topo_info and cs->nr_cores/cs->nr_threads.

Since the names of cs->nr_cores/cs->nr_threads do not accurately
represent their meaning, using cs->nr_cores/cs->nr_threads is prone
to confusion and mistakes.

And the structure X86CPUTopoInfo names its members clearly, so the
variable "topo_info" should be preferred.

In addition, to use the topology variables uniformly in
cpu_x86_cpuid(), replace env->dies with topo_info.dies_per_pkg as well.

Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes since v1:
 * Extract cores_per_socket from the code block and use it as a local
   variable for cpu_x86_cpuid(). (Yanan)
 * Remove vcpus_per_socket variable and use cpus_per_pkg directly.
   (Yanan)
 * Replace env->dies with topo_info.dies_per_pkg in cpu_x86_cpuid().
---
 target/i386/cpu.c | 31 ++++++++++++++++++-------------
 1 file changed, 18 insertions(+), 13 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index c80613bfcded..fc50bf98c60e 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6008,11 +6008,16 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
     uint32_t limit;
     uint32_t signature[3];
     X86CPUTopoInfo topo_info;
+    uint32_t cores_per_pkg;
+    uint32_t cpus_per_pkg;
 
     topo_info.dies_per_pkg = env->nr_dies;
     topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
     topo_info.threads_per_core = cs->nr_threads;
 
+    cores_per_pkg = topo_info.cores_per_die * topo_info.dies_per_pkg;
+    cpus_per_pkg = cores_per_pkg * topo_info.threads_per_core;
+
     /* Calculate & apply limits for different index ranges */
     if (index >= 0xC0000000) {
         limit = env->cpuid_xlevel2;
@@ -6048,8 +6053,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             *ecx |= CPUID_EXT_OSXSAVE;
         }
         *edx = env->features[FEAT_1_EDX];
-        if (cs->nr_cores * cs->nr_threads > 1) {
-            *ebx |= (cs->nr_cores * cs->nr_threads) << 16;
+        if (cpus_per_pkg > 1) {
+            *ebx |= cpus_per_pkg << 16;
             *edx |= CPUID_HT;
         }
         if (!cpu->enable_pmu) {
@@ -6086,8 +6091,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
              */
             if (*eax & 31) {
                 int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
-                int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
-                if (cs->nr_cores > 1) {
+
+                if (cores_per_pkg > 1) {
                     int addressable_cores_offset =
                                                 apicid_pkg_offset(&topo_info) -
                                                 apicid_core_offset(&topo_info);
@@ -6095,7 +6100,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
                     *eax &= ~0xFC000000;
                     *eax |= ((1 << addressable_cores_offset) - 1) << 26;
                 }
-                if (host_vcpus_per_cache > vcpus_per_socket) {
+                if (host_vcpus_per_cache > cpus_per_pkg) {
                     int pkg_offset = apicid_pkg_offset(&topo_info);
 
                     *eax &= ~0x3FFC000;
@@ -6240,12 +6245,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         switch (count) {
         case 0:
             *eax = apicid_core_offset(&topo_info);
-            *ebx = cs->nr_threads;
+            *ebx = topo_info.threads_per_core;
             *ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
             break;
         case 1:
             *eax = apicid_pkg_offset(&topo_info);
-            *ebx = cs->nr_cores * cs->nr_threads;
+            *ebx = cpus_per_pkg;
             *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
             break;
         default:
@@ -6266,7 +6271,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         break;
     case 0x1F:
         /* V2 Extended Topology Enumeration Leaf */
-        if (env->nr_dies < 2) {
+        if (topo_info.dies_per_pkg < 2) {
             *eax = *ebx = *ecx = *edx = 0;
             break;
         }
@@ -6276,7 +6281,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         switch (count) {
         case 0:
             *eax = apicid_core_offset(&topo_info);
-            *ebx = cs->nr_threads;
+            *ebx = topo_info.threads_per_core;
             *ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
             break;
         case 1:
@@ -6286,7 +6291,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             break;
         case 2:
             *eax = apicid_pkg_offset(&topo_info);
-            *ebx = cs->nr_cores * cs->nr_threads;
+            *ebx = cpus_per_pkg;
             *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
             break;
         default:
@@ -6511,7 +6516,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
          * discards multiple thread information if it is set.
          * So don't set it here for Intel to make Linux guests happy.
          */
-        if (cs->nr_cores * cs->nr_threads > 1) {
+        if (cpus_per_pkg > 1) {
             if (env->cpuid_vendor1 != CPUID_VENDOR_INTEL_1 ||
                 env->cpuid_vendor2 != CPUID_VENDOR_INTEL_2 ||
                 env->cpuid_vendor3 != CPUID_VENDOR_INTEL_3) {
@@ -6577,7 +6582,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
              *eax |= (cpu_x86_virtual_addr_width(env) << 8);
         }
         *ebx = env->features[FEAT_8000_0008_EBX];
-        if (cs->nr_cores * cs->nr_threads > 1) {
+        if (cpus_per_pkg > 1) {
             /*
              * Bits 15:12 is "The number of bits in the initial
              * Core::X86::Apic::ApicId[ApicId] value that indicate
@@ -6585,7 +6590,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
              * Bits 7:0 is "The number of threads in the package is NC+1"
              */
             *ecx = (apicid_pkg_offset(&topo_info) << 12) |
-                   ((cs->nr_cores * cs->nr_threads) - 1);
+                   (cpus_per_pkg - 1);
         } else {
             *ecx = 0;
         }
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v3 07/17] i386: Introduce module-level cpu topology to CPUX86State
  2023-08-01 10:35 [PATCH v3 00/17] Support smp.clusters for x86 Zhao Liu
                   ` (5 preceding siblings ...)
  2023-08-01 10:35 ` [PATCH v3 06/17] i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid() Zhao Liu
@ 2023-08-01 10:35 ` Zhao Liu
  2023-08-01 10:35 ` [PATCH v3 08/17] i386: Support modules_per_die in X86CPUTopoInfo Zhao Liu
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 63+ messages in thread
From: Zhao Liu @ 2023-08-01 10:35 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu,
	Zhuocheng Ding

From: Zhuocheng Ding <zhuocheng.ding@intel.com>

The smp command has the "clusters" parameter, but x86 hasn't supported
that level. "cluster" is a CPU topology level above cores, at which
the cores may share some resources (the L2 cache, or others such as L3
cache tags, depending on the architecture) [1][2]. For x86, the
resource shared by cores at the cluster level is mainly the L2 cache.

However, using cluster to define x86's L2 cache topology would cause a
compatibility problem:

Currently, x86 defaults to sharing the L2 cache within one core, which
actually implies a default setting of "1 core per L2 cache" and
therefore implicitly defaults to having as many L2 caches as cores.

For example (i386 PC machine):
-smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16 (*)

Considering the topology of the L2 cache, this (*) implicitly means "1
core per L2 cache" and "2 L2 caches per die".

If we use cluster to configure L2 cache topology with the new default
setting "clusters per L2 cache is 1", the above semantics will change
to "2 cores per cluster" and "1 cluster per L2 cache", that is, "2
cores per L2 cache".

So the same command (*) will cause changes in the L2 cache topology,
further affecting the performance of the virtual machine.

Therefore, x86 should only treat cluster as a CPU topology level and
avoid using it to change the L2 cache topology by default, for
compatibility.

"cluster" in smp is the CPU topology level between "core" and "die".

For x86, the "cluster" in smp corresponds to the module level [2],
which is above the core level. So use "module" rather than "cluster"
in the i386 code.

And please note that x86 already has a CPU topology level also named
"cluster" [3]; that level sits above the package. The cluster in the
x86 CPU topology is therefore completely different from the "clusters"
smp parameter. After the module level is introduced, the cluster smp
parameter will actually refer to the module level of x86.

[1]: 864c3b5c32f0 ("hw/core/machine: Introduce CPU cluster topology support")
[2]: Yanan's comment about "cluster",
     https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg04051.html
[3]: SDM, vol.3, ch.9, 9.9.1 Hierarchical Mapping of Shared Resources.

Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
---
Changes since v1:
 * The background of the introduction of the "cluster" parameter and its
   exact meaning were revised according to Yanan's explanation. (Yanan)
---
 hw/i386/x86.c     | 1 +
 target/i386/cpu.c | 1 +
 target/i386/cpu.h | 5 +++++
 3 files changed, 7 insertions(+)

diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index a88a126123be..4efc390905ff 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -309,6 +309,7 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
     init_topo_info(&topo_info, x86ms);
 
     env->nr_dies = ms->smp.dies;
+    env->nr_modules = ms->smp.clusters;
 
     /*
      * If APIC ID is not set,
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index fc50bf98c60e..8a9fd5682efc 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -7606,6 +7606,7 @@ static void x86_cpu_initfn(Object *obj)
     CPUX86State *env = &cpu->env;
 
     env->nr_dies = 1;
+    env->nr_modules = 1;
     cpu_set_cpustate_pointers(cpu);
 
     object_property_add(obj, "feature-words", "X86CPUFeatureWordInfo",
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 7638128d59cc..5e97d0b76574 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1880,6 +1880,11 @@ typedef struct CPUArchState {
 
     /* Number of dies within this CPU package. */
     unsigned nr_dies;
+    /*
+     * Number of modules within this CPU package.
+     * The module level in x86 CPU topology corresponds to smp.clusters.
+     */
+    unsigned nr_modules;
 } CPUX86State;
 
 struct kvm_msrs;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v3 08/17] i386: Support modules_per_die in X86CPUTopoInfo
  2023-08-01 10:35 [PATCH v3 00/17] Support smp.clusters for x86 Zhao Liu
                   ` (6 preceding siblings ...)
  2023-08-01 10:35 ` [PATCH v3 07/17] i386: Introduce module-level cpu topology to CPUX86State Zhao Liu
@ 2023-08-01 10:35 ` Zhao Liu
  2023-08-02 17:25   ` Moger, Babu
  2023-08-01 10:35 ` [PATCH v3 09/17] i386: Support module_id in X86CPUTopoIDs Zhao Liu
                   ` (10 subsequent siblings)
  18 siblings, 1 reply; 63+ messages in thread
From: Zhao Liu @ 2023-08-01 10:35 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu,
	Zhuocheng Ding

From: Zhuocheng Ding <zhuocheng.ding@intel.com>

Support module level in i386 cpu topology structure "X86CPUTopoInfo".

Since x86 does not yet support the "clusters" parameter in "-smp",
X86CPUTopoInfo.modules_per_die is currently always 1. Therefore, the
module level width in APIC ID, which can be calculated by
"apicid_bitwidth_for_count(topo_info->modules_per_die)", is always 0
for now, so we can directly add APIC ID related helpers to support
module level parsing.

At present, we don't expose the module level in CPUID.1FH because
current Linux (v6.4-rc1) doesn't support the module level. And
exposing the module and die levels at the same time in CPUID.1FH would
cause Linux to calculate the wrong die_id. The module level should not
be exposed until real machines have the module level in CPUID.1FH.

In addition, update topology structure in test-x86-topo.c.

Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
---
Changes since v1:
 * Include module level related helpers (apicid_module_width() and
   apicid_module_offset()) in this patch. (Yanan)
---
 hw/i386/x86.c              |  3 ++-
 include/hw/i386/topology.h | 22 +++++++++++++++----
 target/i386/cpu.c          | 12 ++++++----
 tests/unit/test-x86-topo.c | 45 ++++++++++++++++++++------------------
 4 files changed, 52 insertions(+), 30 deletions(-)

diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 4efc390905ff..a552ae8bb4a8 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -72,7 +72,8 @@ static void init_topo_info(X86CPUTopoInfo *topo_info,
     MachineState *ms = MACHINE(x86ms);
 
     topo_info->dies_per_pkg = ms->smp.dies;
-    topo_info->cores_per_die = ms->smp.cores;
+    topo_info->modules_per_die = ms->smp.clusters;
+    topo_info->cores_per_module = ms->smp.cores;
     topo_info->threads_per_core = ms->smp.threads;
 }
 
diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index 5a19679f618b..c807d3811dd3 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -56,7 +56,8 @@ typedef struct X86CPUTopoIDs {
 
 typedef struct X86CPUTopoInfo {
     unsigned dies_per_pkg;
-    unsigned cores_per_die;
+    unsigned modules_per_die;
+    unsigned cores_per_module;
     unsigned threads_per_core;
 } X86CPUTopoInfo;
 
@@ -77,7 +78,13 @@ static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
 /* Bit width of the Core_ID field */
 static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
 {
-    return apicid_bitwidth_for_count(topo_info->cores_per_die);
+    return apicid_bitwidth_for_count(topo_info->cores_per_module);
+}
+
+/* Bit width of the Module_ID (cluster ID) field */
+static inline unsigned apicid_module_width(X86CPUTopoInfo *topo_info)
+{
+    return apicid_bitwidth_for_count(topo_info->modules_per_die);
 }
 
 /* Bit width of the Die_ID field */
@@ -92,10 +99,16 @@ static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
     return apicid_smt_width(topo_info);
 }
 
+/* Bit offset of the Module_ID (cluster ID) field */
+static inline unsigned apicid_module_offset(X86CPUTopoInfo *topo_info)
+{
+    return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
+}
+
 /* Bit offset of the Die_ID field */
 static inline unsigned apicid_die_offset(X86CPUTopoInfo *topo_info)
 {
-    return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
+    return apicid_module_offset(topo_info) + apicid_module_width(topo_info);
 }
 
 /* Bit offset of the Pkg_ID (socket ID) field */
@@ -127,7 +140,8 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
                                          X86CPUTopoIDs *topo_ids)
 {
     unsigned nr_dies = topo_info->dies_per_pkg;
-    unsigned nr_cores = topo_info->cores_per_die;
+    unsigned nr_cores = topo_info->cores_per_module *
+                        topo_info->modules_per_die;
     unsigned nr_threads = topo_info->threads_per_core;
 
     topo_ids->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 8a9fd5682efc..d6969813ee02 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -339,7 +339,9 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
 
     /* L3 is shared among multiple cores */
     if (cache->level == 3) {
-        l3_threads = topo_info->cores_per_die * topo_info->threads_per_core;
+        l3_threads = topo_info->modules_per_die *
+                     topo_info->cores_per_module *
+                     topo_info->threads_per_core;
         *eax |= (l3_threads - 1) << 14;
     } else {
         *eax |= ((topo_info->threads_per_core - 1) << 14);
@@ -6012,10 +6014,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
     uint32_t cpus_per_pkg;
 
     topo_info.dies_per_pkg = env->nr_dies;
-    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
+    topo_info.modules_per_die = env->nr_modules;
+    topo_info.cores_per_module = cs->nr_cores / env->nr_dies / env->nr_modules;
     topo_info.threads_per_core = cs->nr_threads;
 
-    cores_per_pkg = topo_info.cores_per_die * topo_info.dies_per_pkg;
+    cores_per_pkg = topo_info.cores_per_module * topo_info.modules_per_die *
+                    topo_info.dies_per_pkg;
     cpus_per_pkg = cores_per_pkg * topo_info.threads_per_core;
 
     /* Calculate & apply limits for different index ranges */
@@ -6286,7 +6290,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             break;
         case 1:
             *eax = apicid_die_offset(&topo_info);
-            *ebx = topo_info.cores_per_die * topo_info.threads_per_core;
+            *ebx = cpus_per_pkg / topo_info.dies_per_pkg;
             *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
             break;
         case 2:
diff --git a/tests/unit/test-x86-topo.c b/tests/unit/test-x86-topo.c
index 2b104f86d7c2..f21b8a5d95c2 100644
--- a/tests/unit/test-x86-topo.c
+++ b/tests/unit/test-x86-topo.c
@@ -30,13 +30,16 @@ static void test_topo_bits(void)
 {
     X86CPUTopoInfo topo_info = {0};
 
-    /* simple tests for 1 thread per core, 1 core per die, 1 die per package */
-    topo_info = (X86CPUTopoInfo) {1, 1, 1};
+    /*
+     * simple tests for 1 thread per core, 1 core per module,
+     *                  1 module per die, 1 die per package
+     */
+    topo_info = (X86CPUTopoInfo) {1, 1, 1, 1};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 0);
     g_assert_cmpuint(apicid_core_width(&topo_info), ==, 0);
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
 
-    topo_info = (X86CPUTopoInfo) {1, 1, 1};
+    topo_info = (X86CPUTopoInfo) {1, 1, 1, 1};
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
@@ -45,39 +48,39 @@ static void test_topo_bits(void)
 
     /* Test field width calculation for multiple values
      */
-    topo_info = (X86CPUTopoInfo) {1, 1, 2};
+    topo_info = (X86CPUTopoInfo) {1, 1, 1, 2};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 1);
-    topo_info = (X86CPUTopoInfo) {1, 1, 3};
+    topo_info = (X86CPUTopoInfo) {1, 1, 1, 3};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
-    topo_info = (X86CPUTopoInfo) {1, 1, 4};
+    topo_info = (X86CPUTopoInfo) {1, 1, 1, 4};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
 
-    topo_info = (X86CPUTopoInfo) {1, 1, 14};
+    topo_info = (X86CPUTopoInfo) {1, 1, 1, 14};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
-    topo_info = (X86CPUTopoInfo) {1, 1, 15};
+    topo_info = (X86CPUTopoInfo) {1, 1, 1, 15};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
-    topo_info = (X86CPUTopoInfo) {1, 1, 16};
+    topo_info = (X86CPUTopoInfo) {1, 1, 1, 16};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
-    topo_info = (X86CPUTopoInfo) {1, 1, 17};
+    topo_info = (X86CPUTopoInfo) {1, 1, 1, 17};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 5);
 
 
-    topo_info = (X86CPUTopoInfo) {1, 30, 2};
+    topo_info = (X86CPUTopoInfo) {1, 1, 30, 2};
     g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
-    topo_info = (X86CPUTopoInfo) {1, 31, 2};
+    topo_info = (X86CPUTopoInfo) {1, 1, 31, 2};
     g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
-    topo_info = (X86CPUTopoInfo) {1, 32, 2};
+    topo_info = (X86CPUTopoInfo) {1, 1, 32, 2};
     g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
-    topo_info = (X86CPUTopoInfo) {1, 33, 2};
+    topo_info = (X86CPUTopoInfo) {1, 1, 33, 2};
     g_assert_cmpuint(apicid_core_width(&topo_info), ==, 6);
 
-    topo_info = (X86CPUTopoInfo) {1, 30, 2};
+    topo_info = (X86CPUTopoInfo) {1, 1, 30, 2};
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
-    topo_info = (X86CPUTopoInfo) {2, 30, 2};
+    topo_info = (X86CPUTopoInfo) {2, 1, 30, 2};
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 1);
-    topo_info = (X86CPUTopoInfo) {3, 30, 2};
+    topo_info = (X86CPUTopoInfo) {3, 1, 30, 2};
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
-    topo_info = (X86CPUTopoInfo) {4, 30, 2};
+    topo_info = (X86CPUTopoInfo) {4, 1, 30, 2};
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
 
     /* build a weird topology and see if IDs are calculated correctly
@@ -85,18 +88,18 @@ static void test_topo_bits(void)
 
     /* This will use 2 bits for thread ID and 3 bits for core ID
      */
-    topo_info = (X86CPUTopoInfo) {1, 6, 3};
+    topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
     g_assert_cmpuint(apicid_core_offset(&topo_info), ==, 2);
     g_assert_cmpuint(apicid_die_offset(&topo_info), ==, 5);
     g_assert_cmpuint(apicid_pkg_offset(&topo_info), ==, 5);
 
-    topo_info = (X86CPUTopoInfo) {1, 6, 3};
+    topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
 
-    topo_info = (X86CPUTopoInfo) {1, 6, 3};
+    topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 0), ==,
                      (1 << 2) | 0);
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 1), ==,
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v3 09/17] i386: Support module_id in X86CPUTopoIDs
  2023-08-01 10:35 [PATCH v3 00/17] Support smp.clusters for x86 Zhao Liu
                   ` (7 preceding siblings ...)
  2023-08-01 10:35 ` [PATCH v3 08/17] i386: Support modules_per_die in X86CPUTopoInfo Zhao Liu
@ 2023-08-01 10:35 ` Zhao Liu
  2023-08-01 10:35 ` [PATCH v3 10/17] i386/cpu: Introduce cluster-id to X86CPU Zhao Liu
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 63+ messages in thread
From: Zhao Liu @ 2023-08-01 10:35 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu,
	Zhuocheng Ding

From: Zhuocheng Ding <zhuocheng.ding@intel.com>

Add module_id member in X86CPUTopoIDs.

module_id can be parsed from the APIC ID, so also update the APIC ID
parsing rules to support the module level. With this support, the
module-level conversions between X86CPUTopoIDs, X86CPUTopoInfo and the
APIC ID are complete.

module_id can also be generated from the CPU topology, and since i386
does not yet support "clusters" in smp, the default "clusters per die"
is 1; thus the module_id generated this way is always 0, so it will
not conflict with the module_id parsed from the APIC ID.

Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
---
Changes since v1:
 * Merge the patch "i386: Update APIC ID parsing rule to support module
   level" into this one. (Yanan)
 * Move the apicid_module_width() and apicid_module_offset() support
   into the previous modules_per_die related patch. (Yanan)
---
 hw/i386/x86.c              | 28 +++++++++++++++++++++-------
 include/hw/i386/topology.h | 17 +++++++++++++----
 2 files changed, 34 insertions(+), 11 deletions(-)

diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index a552ae8bb4a8..0b460fd6074d 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -314,11 +314,11 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
 
     /*
      * If APIC ID is not set,
-     * set it based on socket/die/core/thread properties.
+     * set it based on socket/die/cluster/core/thread properties.
      */
     if (cpu->apic_id == UNASSIGNED_APIC_ID) {
-        int max_socket = (ms->smp.max_cpus - 1) /
-                                smp_threads / smp_cores / ms->smp.dies;
+        int max_socket = (ms->smp.max_cpus - 1) / smp_threads / smp_cores /
+                                ms->smp.clusters / ms->smp.dies;
 
         /*
          * die-id was optional in QEMU 4.0 and older, so keep it optional
@@ -365,6 +365,14 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
         topo_ids.die_id = cpu->die_id;
         topo_ids.core_id = cpu->core_id;
         topo_ids.smt_id = cpu->thread_id;
+
+        /*
+         * TODO: This is the temporary initialization for topo_ids.module_id to
+         * avoid "maybe-uninitialized" compilation errors. Will remove when
+         * X86CPU supports cluster_id.
+         */
+        topo_ids.module_id = 0;
+
         cpu->apic_id = x86_apicid_from_topo_ids(&topo_info, &topo_ids);
     }
 
@@ -373,11 +381,13 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
         MachineState *ms = MACHINE(x86ms);
 
         x86_topo_ids_from_apicid(cpu->apic_id, &topo_info, &topo_ids);
+
         error_setg(errp,
-            "Invalid CPU [socket: %u, die: %u, core: %u, thread: %u] with"
-            " APIC ID %" PRIu32 ", valid index range 0:%d",
-            topo_ids.pkg_id, topo_ids.die_id, topo_ids.core_id, topo_ids.smt_id,
-            cpu->apic_id, ms->possible_cpus->len - 1);
+            "Invalid CPU [socket: %u, die: %u, module: %u, core: %u, thread: %u]"
+            " with APIC ID %" PRIu32 ", valid index range 0:%d",
+            topo_ids.pkg_id, topo_ids.die_id, topo_ids.module_id,
+            topo_ids.core_id, topo_ids.smt_id, cpu->apic_id,
+            ms->possible_cpus->len - 1);
         return;
     }
 
@@ -498,6 +508,10 @@ const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
             ms->possible_cpus->cpus[i].props.has_die_id = true;
             ms->possible_cpus->cpus[i].props.die_id = topo_ids.die_id;
         }
+        if (ms->smp.clusters > 1) {
+            ms->possible_cpus->cpus[i].props.has_cluster_id = true;
+            ms->possible_cpus->cpus[i].props.cluster_id = topo_ids.module_id;
+        }
         ms->possible_cpus->cpus[i].props.has_core_id = true;
         ms->possible_cpus->cpus[i].props.core_id = topo_ids.core_id;
         ms->possible_cpus->cpus[i].props.has_thread_id = true;
diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index c807d3811dd3..3cec97b377f2 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -50,6 +50,7 @@ typedef uint32_t apic_id_t;
 typedef struct X86CPUTopoIDs {
     unsigned pkg_id;
     unsigned die_id;
+    unsigned module_id;
     unsigned core_id;
     unsigned smt_id;
 } X86CPUTopoIDs;
@@ -127,6 +128,7 @@ static inline apic_id_t x86_apicid_from_topo_ids(X86CPUTopoInfo *topo_info,
 {
     return (topo_ids->pkg_id  << apicid_pkg_offset(topo_info)) |
            (topo_ids->die_id  << apicid_die_offset(topo_info)) |
+           (topo_ids->module_id << apicid_module_offset(topo_info)) |
            (topo_ids->core_id << apicid_core_offset(topo_info)) |
            topo_ids->smt_id;
 }
@@ -140,12 +142,16 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
                                          X86CPUTopoIDs *topo_ids)
 {
     unsigned nr_dies = topo_info->dies_per_pkg;
-    unsigned nr_cores = topo_info->cores_per_module *
-                        topo_info->modules_per_die;
+    unsigned nr_modules = topo_info->modules_per_die;
+    unsigned nr_cores = topo_info->cores_per_module;
     unsigned nr_threads = topo_info->threads_per_core;
 
-    topo_ids->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
-    topo_ids->die_id = cpu_index / (nr_cores * nr_threads) % nr_dies;
+    topo_ids->pkg_id = cpu_index / (nr_dies * nr_modules *
+                       nr_cores * nr_threads);
+    topo_ids->die_id = cpu_index / (nr_modules * nr_cores *
+                       nr_threads) % nr_dies;
+    topo_ids->module_id = cpu_index / (nr_cores * nr_threads) %
+                          nr_modules;
     topo_ids->core_id = cpu_index / nr_threads % nr_cores;
     topo_ids->smt_id = cpu_index % nr_threads;
 }
@@ -163,6 +169,9 @@ static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
     topo_ids->core_id =
             (apicid >> apicid_core_offset(topo_info)) &
             ~(0xFFFFFFFFUL << apicid_core_width(topo_info));
+    topo_ids->module_id =
+            (apicid >> apicid_module_offset(topo_info)) &
+            ~(0xFFFFFFFFUL << apicid_module_width(topo_info));
     topo_ids->die_id =
             (apicid >> apicid_die_offset(topo_info)) &
             ~(0xFFFFFFFFUL << apicid_die_width(topo_info));
-- 
2.34.1




* [PATCH v3 10/17] i386/cpu: Introduce cluster-id to X86CPU
  2023-08-01 10:35 [PATCH v3 00/17] Support smp.clusters for x86 Zhao Liu
                   ` (8 preceding siblings ...)
  2023-08-01 10:35 ` [PATCH v3 09/17] i386: Support module_id in X86CPUTopoIDs Zhao Liu
@ 2023-08-01 10:35 ` Zhao Liu
  2023-08-02 22:44   ` Moger, Babu
  2023-08-01 10:35 ` [PATCH v3 11/17] tests: Add test case of APIC ID for module level parsing Zhao Liu
                   ` (8 subsequent siblings)
  18 siblings, 1 reply; 63+ messages in thread
From: Zhao Liu @ 2023-08-01 10:35 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu,
	Zhuocheng Ding

From: Zhuocheng Ding <zhuocheng.ding@intel.com>

We introduce cluster-id rather than module-id to be consistent with
CpuInstanceProperties.cluster-id; this avoids confusion over parameter
names when hotplugging.

Following the legacy smp check rules, also add cluster_id validity
checks to x86_cpu_pre_plug().
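
The validity rules added below can be modeled as a small standalone
predicate (a simplified sketch only; QEMU itself reports these errors
via error_setg() inside x86_cpu_pre_plug() rather than returning a
bool):

```c
#include <stdbool.h>

/*
 * Simplified model of the cluster-id checks added to
 * x86_cpu_pre_plug(): the id must be set (>= 0) and must fall
 * in the range 0..smp.clusters-1.
 */
static bool cluster_id_valid(int cluster_id, unsigned smp_clusters)
{
    if (cluster_id < 0) {
        return false;  /* "CPU cluster-id is not set" */
    }
    if ((unsigned)cluster_id > smp_clusters - 1) {
        return false;  /* "Invalid CPU cluster-id" */
    }
    return true;
}
```
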

Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
---
 hw/i386/x86.c     | 33 +++++++++++++++++++++++++--------
 target/i386/cpu.c |  2 ++
 target/i386/cpu.h |  1 +
 3 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 0b460fd6074d..8154b86f95c7 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -328,6 +328,14 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
             cpu->die_id = 0;
         }
 
+        /*
+         * cluster-id was optional in QEMU 8.0 and older, so keep it optional
+         * if there's only one cluster per die.
+         */
+        if (cpu->cluster_id < 0 && ms->smp.clusters == 1) {
+            cpu->cluster_id = 0;
+        }
+
         if (cpu->socket_id < 0) {
             error_setg(errp, "CPU socket-id is not set");
             return;
@@ -344,6 +352,14 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
                        cpu->die_id, ms->smp.dies - 1);
             return;
         }
+        if (cpu->cluster_id < 0) {
+            error_setg(errp, "CPU cluster-id is not set");
+            return;
+        } else if (cpu->cluster_id > ms->smp.clusters - 1) {
+            error_setg(errp, "Invalid CPU cluster-id: %u must be in range 0:%u",
+                       cpu->cluster_id, ms->smp.clusters - 1);
+            return;
+        }
         if (cpu->core_id < 0) {
             error_setg(errp, "CPU core-id is not set");
             return;
@@ -363,16 +379,9 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
 
         topo_ids.pkg_id = cpu->socket_id;
         topo_ids.die_id = cpu->die_id;
+        topo_ids.module_id = cpu->cluster_id;
         topo_ids.core_id = cpu->core_id;
         topo_ids.smt_id = cpu->thread_id;
-
-        /*
-         * TODO: This is the temporary initialization for topo_ids.module_id to
-         * avoid "maybe-uninitialized" compilation errors. Will remove when
-         * X86CPU supports cluster_id.
-         */
-        topo_ids.module_id = 0;
-
         cpu->apic_id = x86_apicid_from_topo_ids(&topo_info, &topo_ids);
     }
 
@@ -419,6 +428,14 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
     }
     cpu->die_id = topo_ids.die_id;
 
+    if (cpu->cluster_id != -1 && cpu->cluster_id != topo_ids.module_id) {
+        error_setg(errp, "property cluster-id: %u doesn't match set apic-id:"
+            " 0x%x (cluster-id: %u)", cpu->cluster_id, cpu->apic_id,
+            topo_ids.module_id);
+        return;
+    }
+    cpu->cluster_id = topo_ids.module_id;
+
     if (cpu->core_id != -1 && cpu->core_id != topo_ids.core_id) {
         error_setg(errp, "property core-id: %u doesn't match set apic-id:"
             " 0x%x (core-id: %u)", cpu->core_id, cpu->apic_id,
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index d6969813ee02..ffa282219078 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -7806,12 +7806,14 @@ static Property x86_cpu_properties[] = {
     DEFINE_PROP_UINT32("apic-id", X86CPU, apic_id, 0),
     DEFINE_PROP_INT32("thread-id", X86CPU, thread_id, 0),
     DEFINE_PROP_INT32("core-id", X86CPU, core_id, 0),
+    DEFINE_PROP_INT32("cluster-id", X86CPU, cluster_id, 0),
     DEFINE_PROP_INT32("die-id", X86CPU, die_id, 0),
     DEFINE_PROP_INT32("socket-id", X86CPU, socket_id, 0),
 #else
     DEFINE_PROP_UINT32("apic-id", X86CPU, apic_id, UNASSIGNED_APIC_ID),
     DEFINE_PROP_INT32("thread-id", X86CPU, thread_id, -1),
     DEFINE_PROP_INT32("core-id", X86CPU, core_id, -1),
+    DEFINE_PROP_INT32("cluster-id", X86CPU, cluster_id, -1),
     DEFINE_PROP_INT32("die-id", X86CPU, die_id, -1),
     DEFINE_PROP_INT32("socket-id", X86CPU, socket_id, -1),
 #endif
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 5e97d0b76574..d9577938ae04 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -2034,6 +2034,7 @@ struct ArchCPU {
     int32_t node_id; /* NUMA node this CPU belongs to */
     int32_t socket_id;
     int32_t die_id;
+    int32_t cluster_id;
     int32_t core_id;
     int32_t thread_id;
 
-- 
2.34.1




* [PATCH v3 11/17] tests: Add test case of APIC ID for module level parsing
  2023-08-01 10:35 [PATCH v3 00/17] Support smp.clusters for x86 Zhao Liu
                   ` (9 preceding siblings ...)
  2023-08-01 10:35 ` [PATCH v3 10/17] i386/cpu: Introduce cluster-id to X86CPU Zhao Liu
@ 2023-08-01 10:35 ` Zhao Liu
  2023-08-01 10:35 ` [PATCH v3 12/17] hw/i386/pc: Support smp.clusters for x86 PC machine Zhao Liu
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 63+ messages in thread
From: Zhao Liu @ 2023-08-01 10:35 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu,
	Zhuocheng Ding, Yongwei Ma

From: Zhuocheng Ding <zhuocheng.ding@intel.com>

Now that i386 supports the module level, add a test for module-level
APIC ID parsing.
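
The widths asserted in this test follow the usual bit-width rule: a
level with N entities needs ceil(log2(N)) APIC ID bits, so 6, 7, or 8
modules need 3 bits and 9 modules need 4. A standalone sketch of that
rule (assumed equivalent to QEMU's apicid_bitwidth_for_count(), written
here as a plain loop):

```c
/*
 * Bits needed to number `count` entities at one topology level,
 * i.e. ceil(log2(count)); e.g. 6 -> 3, 8 -> 3, 9 -> 4, 1 -> 0.
 */
static unsigned apicid_width_for_count(unsigned count)
{
    unsigned width = 0;

    count -= 1;
    while (count >> width) {
        width++;
    }
    return width;
}
```
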

Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
---
 tests/unit/test-x86-topo.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/tests/unit/test-x86-topo.c b/tests/unit/test-x86-topo.c
index f21b8a5d95c2..55b731ccae55 100644
--- a/tests/unit/test-x86-topo.c
+++ b/tests/unit/test-x86-topo.c
@@ -37,6 +37,7 @@ static void test_topo_bits(void)
     topo_info = (X86CPUTopoInfo) {1, 1, 1, 1};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 0);
     g_assert_cmpuint(apicid_core_width(&topo_info), ==, 0);
+    g_assert_cmpuint(apicid_module_width(&topo_info), ==, 0);
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
 
     topo_info = (X86CPUTopoInfo) {1, 1, 1, 1};
@@ -74,13 +75,22 @@ static void test_topo_bits(void)
     topo_info = (X86CPUTopoInfo) {1, 1, 33, 2};
     g_assert_cmpuint(apicid_core_width(&topo_info), ==, 6);
 
-    topo_info = (X86CPUTopoInfo) {1, 1, 30, 2};
+    topo_info = (X86CPUTopoInfo) {1, 6, 30, 2};
+    g_assert_cmpuint(apicid_module_width(&topo_info), ==, 3);
+    topo_info = (X86CPUTopoInfo) {1, 7, 30, 2};
+    g_assert_cmpuint(apicid_module_width(&topo_info), ==, 3);
+    topo_info = (X86CPUTopoInfo) {1, 8, 30, 2};
+    g_assert_cmpuint(apicid_module_width(&topo_info), ==, 3);
+    topo_info = (X86CPUTopoInfo) {1, 9, 30, 2};
+    g_assert_cmpuint(apicid_module_width(&topo_info), ==, 4);
+
+    topo_info = (X86CPUTopoInfo) {1, 6, 30, 2};
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
-    topo_info = (X86CPUTopoInfo) {2, 1, 30, 2};
+    topo_info = (X86CPUTopoInfo) {2, 6, 30, 2};
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 1);
-    topo_info = (X86CPUTopoInfo) {3, 1, 30, 2};
+    topo_info = (X86CPUTopoInfo) {3, 6, 30, 2};
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
-    topo_info = (X86CPUTopoInfo) {4, 1, 30, 2};
+    topo_info = (X86CPUTopoInfo) {4, 6, 30, 2};
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
 
     /* build a weird topology and see if IDs are calculated correctly
@@ -91,6 +101,7 @@ static void test_topo_bits(void)
     topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
     g_assert_cmpuint(apicid_core_offset(&topo_info), ==, 2);
+    g_assert_cmpuint(apicid_module_offset(&topo_info), ==, 5);
     g_assert_cmpuint(apicid_die_offset(&topo_info), ==, 5);
     g_assert_cmpuint(apicid_pkg_offset(&topo_info), ==, 5);
 
-- 
2.34.1




* [PATCH v3 12/17] hw/i386/pc: Support smp.clusters for x86 PC machine
  2023-08-01 10:35 [PATCH v3 00/17] Support smp.clusters for x86 Zhao Liu
                   ` (10 preceding siblings ...)
  2023-08-01 10:35 ` [PATCH v3 11/17] tests: Add test case of APIC ID for module level parsing Zhao Liu
@ 2023-08-01 10:35 ` Zhao Liu
  2023-08-01 10:35 ` [PATCH v3 13/17] i386: Add cache topology info in CPUCacheInfo Zhao Liu
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 63+ messages in thread
From: Zhao Liu @ 2023-08-01 10:35 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu,
	Zhuocheng Ding

From: Zhuocheng Ding <zhuocheng.ding@intel.com>

Now that module-level topology support has been added to X86CPU, we can
enable support for the cluster parameter on PC machines. With this
support, we can define a 5-level x86 CPU topology with "-smp":

-smp cpus=*,maxcpus=*,sockets=*,dies=*,clusters=*,cores=*,threads=*.

Additionally, add a 5-level topology example to the description of "-smp".

Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
---
 hw/i386/pc.c    |  1 +
 qemu-options.hx | 10 +++++-----
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 3109d5e0e035..f2ec5720d233 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1885,6 +1885,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
     mc->default_cpu_type = TARGET_DEFAULT_CPU_TYPE;
     mc->nvdimm_supported = true;
     mc->smp_props.dies_supported = true;
+    mc->smp_props.clusters_supported = true;
     mc->default_ram_id = "pc.ram";
     pcmc->default_smbios_ep_type = SMBIOS_ENTRY_POINT_TYPE_64;
 
diff --git a/qemu-options.hx b/qemu-options.hx
index 29b98c3d4c55..5fb73d996151 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -319,14 +319,14 @@ SRST
         -smp 8,sockets=2,cores=2,threads=2,maxcpus=8
 
     The following sub-option defines a CPU topology hierarchy (2 sockets
-    totally on the machine, 2 dies per socket, 2 cores per die, 2 threads
-    per core) for PC machines which support sockets/dies/cores/threads.
-    Some members of the option can be omitted but their values will be
-    automatically computed:
+    totally on the machine, 2 dies per socket, 2 clusters per die, 2 cores per
+    cluster, 2 threads per core) for PC machines which support sockets/dies
+    /clusters/cores/threads. Some members of the option can be omitted but
+    their values will be automatically computed:
 
     ::
 
-        -smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16
+        -smp 32,sockets=2,dies=2,clusters=2,cores=2,threads=2,maxcpus=32
 
     The following sub-option defines a CPU topology hierarchy (2 sockets
     totally on the machine, 2 clusters per socket, 2 cores per cluster,
-- 
2.34.1




* [PATCH v3 13/17] i386: Add cache topology info in CPUCacheInfo
  2023-08-01 10:35 [PATCH v3 00/17] Support smp.clusters for x86 Zhao Liu
                   ` (11 preceding siblings ...)
  2023-08-01 10:35 ` [PATCH v3 12/17] hw/i386/pc: Support smp.clusters for x86 PC machine Zhao Liu
@ 2023-08-01 10:35 ` Zhao Liu
  2023-08-01 10:35 ` [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode CPUID[4] Zhao Liu
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 63+ messages in thread
From: Zhao Liu @ 2023-08-01 10:35 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

Currently, by default, the cache topology is encoded as:
1. i/d cache is shared in one core.
2. L2 cache is shared in one core.
3. L3 cache is shared in one die.

This general default has caused the misunderstanding that the cache
topology is completely tied to a specific CPU topology level, e.g., that
the L2 cache is inherently bound to the core level and the L3 cache to
the die level.

In fact, the settings of these topologies depend on the specific
platform and are not static. For example, on Alder Lake-P, every
four Atom cores share the same L2 cache.

Thus, we should explicitly define the corresponding cache topology for
different cache models to increase scalability.

Except for legacy_l2_cache_cpuid2 (whose default topology level is
CPU_TOPO_LEVEL_UNKNOW), explicitly set the corresponding topology level
for all other cache models. To stay compatible with the existing cache
topology, set CPU_TOPO_LEVEL_CORE for the i/d caches and the L2 cache,
and CPU_TOPO_LEVEL_DIE for the L3 cache.

The field for CPUID[4].EAX[bits 25:14] or CPUID[0x8000001D].EAX[bits
25:14] will be set based on CPUCacheInfo.share_level.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes since v1:
 * Add the prefix "CPU_TOPO_LEVEL_*" for CPU topology level names.
   (Yanan)
 * Rename the "INVALID" level to "CPU_TOPO_LEVEL_UNKNOW". (Yanan)
---
 target/i386/cpu.c | 19 +++++++++++++++++++
 target/i386/cpu.h | 16 ++++++++++++++++
 2 files changed, 35 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index ffa282219078..55aba4889628 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -436,6 +436,7 @@ static CPUCacheInfo legacy_l1d_cache = {
     .sets = 64,
     .partitions = 1,
     .no_invd_sharing = true,
+    .share_level = CPU_TOPO_LEVEL_CORE,
 };
 
 /*FIXME: CPUID leaf 0x80000005 is inconsistent with leaves 2 & 4 */
@@ -450,6 +451,7 @@ static CPUCacheInfo legacy_l1d_cache_amd = {
     .partitions = 1,
     .lines_per_tag = 1,
     .no_invd_sharing = true,
+    .share_level = CPU_TOPO_LEVEL_CORE,
 };
 
 /* L1 instruction cache: */
@@ -463,6 +465,7 @@ static CPUCacheInfo legacy_l1i_cache = {
     .sets = 64,
     .partitions = 1,
     .no_invd_sharing = true,
+    .share_level = CPU_TOPO_LEVEL_CORE,
 };
 
 /*FIXME: CPUID leaf 0x80000005 is inconsistent with leaves 2 & 4 */
@@ -477,6 +480,7 @@ static CPUCacheInfo legacy_l1i_cache_amd = {
     .partitions = 1,
     .lines_per_tag = 1,
     .no_invd_sharing = true,
+    .share_level = CPU_TOPO_LEVEL_CORE,
 };
 
 /* Level 2 unified cache: */
@@ -490,6 +494,7 @@ static CPUCacheInfo legacy_l2_cache = {
     .sets = 4096,
     .partitions = 1,
     .no_invd_sharing = true,
+    .share_level = CPU_TOPO_LEVEL_CORE,
 };
 
 /*FIXME: CPUID leaf 2 descriptor is inconsistent with CPUID leaf 4 */
@@ -512,6 +517,7 @@ static CPUCacheInfo legacy_l2_cache_amd = {
     .associativity = 16,
     .sets = 512,
     .partitions = 1,
+    .share_level = CPU_TOPO_LEVEL_CORE,
 };
 
 /* Level 3 unified cache: */
@@ -527,6 +533,7 @@ static CPUCacheInfo legacy_l3_cache = {
     .self_init = true,
     .inclusive = true,
     .complex_indexing = true,
+    .share_level = CPU_TOPO_LEVEL_DIE,
 };
 
 /* TLB definitions: */
@@ -1819,6 +1826,7 @@ static const CPUCaches epyc_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -1831,6 +1839,7 @@ static const CPUCaches epyc_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -1841,6 +1850,7 @@ static const CPUCaches epyc_cache_info = {
         .partitions = 1,
         .sets = 1024,
         .lines_per_tag = 1,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -1854,6 +1864,7 @@ static const CPUCaches epyc_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = true,
+        .share_level = CPU_TOPO_LEVEL_DIE,
     },
 };
 
@@ -1919,6 +1930,7 @@ static const CPUCaches epyc_rome_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -1931,6 +1943,7 @@ static const CPUCaches epyc_rome_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -1941,6 +1954,7 @@ static const CPUCaches epyc_rome_cache_info = {
         .partitions = 1,
         .sets = 1024,
         .lines_per_tag = 1,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -1954,6 +1968,7 @@ static const CPUCaches epyc_rome_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = true,
+        .share_level = CPU_TOPO_LEVEL_DIE,
     },
 };
 
@@ -2019,6 +2034,7 @@ static const CPUCaches epyc_milan_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -2031,6 +2047,7 @@ static const CPUCaches epyc_milan_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2041,6 +2058,7 @@ static const CPUCaches epyc_milan_cache_info = {
         .partitions = 1,
         .sets = 1024,
         .lines_per_tag = 1,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2054,6 +2072,7 @@ static const CPUCaches epyc_milan_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = true,
+        .share_level = CPU_TOPO_LEVEL_DIE,
     },
 };
 
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index d9577938ae04..3f0cdc45607a 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1530,6 +1530,15 @@ enum CacheType {
     UNIFIED_CACHE
 };
 
+enum CPUTopoLevel {
+    CPU_TOPO_LEVEL_UNKNOW = 0,
+    CPU_TOPO_LEVEL_SMT,
+    CPU_TOPO_LEVEL_CORE,
+    CPU_TOPO_LEVEL_MODULE,
+    CPU_TOPO_LEVEL_DIE,
+    CPU_TOPO_LEVEL_PACKAGE,
+};
+
 typedef struct CPUCacheInfo {
     enum CacheType type;
     uint8_t level;
@@ -1571,6 +1580,13 @@ typedef struct CPUCacheInfo {
      * address bits.  CPUID[4].EDX[bit 2].
      */
     bool complex_indexing;
+
+    /*
+     * Cache Topology. The level that cache is shared in.
+     * Used to encode CPUID[4].EAX[bits 25:14] or
+     * CPUID[0x8000001D].EAX[bits 25:14].
+     */
+    enum CPUTopoLevel share_level;
 } CPUCacheInfo;
 
 
-- 
2.34.1




* [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode CPUID[4]
  2023-08-01 10:35 [PATCH v3 00/17] Support smp.clusters for x86 Zhao Liu
                   ` (12 preceding siblings ...)
  2023-08-01 10:35 ` [PATCH v3 13/17] i386: Add cache topology info in CPUCacheInfo Zhao Liu
@ 2023-08-01 10:35 ` Zhao Liu
  2023-08-02 23:49   ` Moger, Babu
  2023-08-01 10:35 ` [PATCH v3 15/17] i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14] Zhao Liu
                   ` (4 subsequent siblings)
  18 siblings, 1 reply; 63+ messages in thread
From: Zhao Liu @ 2023-08-01 10:35 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

CPUID[4].EAX[bits 25:14] is used to represent the cache topology for
Intel CPUs.

Now that cache models carry topology information, we can use
CPUCacheInfo.share_level to decide which topology level is encoded
into CPUID[4].EAX[bits 25:14].

Since maximum_processor_id (originally "num_apic_ids") is derived from
the CPU topology levels, which are already verified when parsing smp,
there is no need to check this value with "assert(num_apic_ids > 0)"
again, so remove this assert.

Additionally, wrap the encoding of CPUID[4].EAX[bits 31:26] into a
helper to make the code cleaner.
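
The bit layout the two new helpers produce can be sketched standalone
(a hypothetical illustration with made-up offsets; in QEMU the offsets
come from apicid_pkg_offset(), apicid_core_offset(), and the offset of
the cache's share level):

```c
#include <stdint.h>

/*
 * CPUID[4].EAX layout touched by this patch:
 *   bits 31:26 = (max addressable core IDs in the package) - 1
 *   bits 25:14 = (max addressable IDs sharing this cache) - 1
 * Both counts are powers of two derived from APIC ID bit offsets.
 */
static uint32_t encode_cpuid4_eax_topo(unsigned pkg_offset,
                                       unsigned core_offset,
                                       unsigned share_offset)
{
    uint32_t max_cores = (1u << (pkg_offset - core_offset)) - 1;
    uint32_t max_share = (1u << share_offset) - 1;

    return (max_cores << 26) | (max_share << 14);
}
```
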

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes since v1:
 * Use "enum CPUTopoLevel share_level" as the parameter in
   max_processor_ids_for_cache().
 * Make cache_into_passthrough case also use
   max_processor_ids_for_cache() and max_core_ids_in_package() to
   encode CPUID[4]. (Yanan)
 * Rename the title of this patch (the original is "i386: Use
   CPUCacheInfo.share_level to encode CPUID[4].EAX[bits 25:14]").
---
 target/i386/cpu.c | 70 +++++++++++++++++++++++++++++------------------
 1 file changed, 43 insertions(+), 27 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 55aba4889628..c9897c0fe91a 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -234,22 +234,53 @@ static uint8_t cpuid2_cache_descriptor(CPUCacheInfo *cache)
                        ((t) == UNIFIED_CACHE) ? CACHE_TYPE_UNIFIED : \
                        0 /* Invalid value */)
 
+static uint32_t max_processor_ids_for_cache(X86CPUTopoInfo *topo_info,
+                                            enum CPUTopoLevel share_level)
+{
+    uint32_t num_ids = 0;
+
+    switch (share_level) {
+    case CPU_TOPO_LEVEL_CORE:
+        num_ids = 1 << apicid_core_offset(topo_info);
+        break;
+    case CPU_TOPO_LEVEL_DIE:
+        num_ids = 1 << apicid_die_offset(topo_info);
+        break;
+    case CPU_TOPO_LEVEL_PACKAGE:
+        num_ids = 1 << apicid_pkg_offset(topo_info);
+        break;
+    default:
+        /*
+         * Currently there is no use case for SMT and MODULE, so use
+         * assert directly to facilitate debugging.
+         */
+        g_assert_not_reached();
+    }
+
+    return num_ids - 1;
+}
+
+static uint32_t max_core_ids_in_package(X86CPUTopoInfo *topo_info)
+{
+    uint32_t num_cores = 1 << (apicid_pkg_offset(topo_info) -
+                               apicid_core_offset(topo_info));
+    return num_cores - 1;
+}
 
 /* Encode cache info for CPUID[4] */
 static void encode_cache_cpuid4(CPUCacheInfo *cache,
-                                int num_apic_ids, int num_cores,
+                                X86CPUTopoInfo *topo_info,
                                 uint32_t *eax, uint32_t *ebx,
                                 uint32_t *ecx, uint32_t *edx)
 {
     assert(cache->size == cache->line_size * cache->associativity *
                           cache->partitions * cache->sets);
 
-    assert(num_apic_ids > 0);
     *eax = CACHE_TYPE(cache->type) |
            CACHE_LEVEL(cache->level) |
            (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0) |
-           ((num_cores - 1) << 26) |
-           ((num_apic_ids - 1) << 14);
+           (max_core_ids_in_package(topo_info) << 26) |
+           (max_processor_ids_for_cache(topo_info, cache->share_level) << 14);
 
     assert(cache->line_size > 0);
     assert(cache->partitions > 0);
@@ -6116,56 +6147,41 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
                 int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
 
                 if (cores_per_pkg > 1) {
-                    int addressable_cores_offset =
-                                                apicid_pkg_offset(&topo_info) -
-                                                apicid_core_offset(&topo_info);
-
                     *eax &= ~0xFC000000;
-                    *eax |= (1 << addressable_cores_offset - 1) << 26;
+                    *eax |= max_core_ids_in_package(&topo_info) << 26;
                 }
                 if (host_vcpus_per_cache > cpus_per_pkg) {
-                    int pkg_offset = apicid_pkg_offset(&topo_info);
-
                     *eax &= ~0x3FFC000;
-                    *eax |= (1 << pkg_offset - 1) << 14;
+                    *eax |=
+                        max_processor_ids_for_cache(&topo_info,
+                                                CPU_TOPO_LEVEL_PACKAGE) << 14;
                 }
             }
         } else if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
             *eax = *ebx = *ecx = *edx = 0;
         } else {
             *eax = 0;
-            int addressable_cores_offset = apicid_pkg_offset(&topo_info) -
-                                           apicid_core_offset(&topo_info);
-            int core_offset, die_offset;
 
             switch (count) {
             case 0: /* L1 dcache info */
-                core_offset = apicid_core_offset(&topo_info);
                 encode_cache_cpuid4(env->cache_info_cpuid4.l1d_cache,
-                                    (1 << core_offset),
-                                    (1 << addressable_cores_offset),
+                                    &topo_info,
                                     eax, ebx, ecx, edx);
                 break;
             case 1: /* L1 icache info */
-                core_offset = apicid_core_offset(&topo_info);
                 encode_cache_cpuid4(env->cache_info_cpuid4.l1i_cache,
-                                    (1 << core_offset),
-                                    (1 << addressable_cores_offset),
+                                    &topo_info,
                                     eax, ebx, ecx, edx);
                 break;
             case 2: /* L2 cache info */
-                core_offset = apicid_core_offset(&topo_info);
                 encode_cache_cpuid4(env->cache_info_cpuid4.l2_cache,
-                                    (1 << core_offset),
-                                    (1 << addressable_cores_offset),
+                                    &topo_info,
                                     eax, ebx, ecx, edx);
                 break;
             case 3: /* L3 cache info */
-                die_offset = apicid_die_offset(&topo_info);
                 if (cpu->enable_l3_cache) {
                     encode_cache_cpuid4(env->cache_info_cpuid4.l3_cache,
-                                        (1 << die_offset),
-                                        (1 << addressable_cores_offset),
+                                        &topo_info,
                                         eax, ebx, ecx, edx);
                     break;
                 }
-- 
2.34.1




* [PATCH v3 15/17] i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]
  2023-08-01 10:35 [PATCH v3 00/17] Support smp.clusters for x86 Zhao Liu
                   ` (13 preceding siblings ...)
  2023-08-01 10:35 ` [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode CPUID[4] Zhao Liu
@ 2023-08-01 10:35 ` Zhao Liu
  2023-08-03 20:40   ` Moger, Babu
  2023-08-01 10:35 ` [PATCH v3 16/17] i386: Use CPUCacheInfo.share_level to encode " Zhao Liu
                   ` (3 subsequent siblings)
  18 siblings, 1 reply; 63+ messages in thread
From: Zhao Liu @ 2023-08-01 10:35 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

Commit 8f4202fb1080 ("i386: Populate AMD Processor Cache Information
for cpuid 0x8000001D") added the cache topology for AMD CPUs by encoding
the number of sharing threads directly.

From AMD's APM, NumSharingCache (CPUID[0x8000001D].EAX[bits 25:14])
means [1]:

The number of logical processors sharing this cache is the value of
this field incremented by 1. To determine which logical processors are
sharing a cache, determine a Share Id for each processor as follows:

ShareId = LocalApicId >> log2(NumSharingCache+1)

Logical processors with the same ShareId then share a cache. If
NumSharingCache+1 is not a power of two, round it up to the next power
of two.

From the description above, the calculation of this field should be the
same as that of CPUID[4].EAX[bits 25:14] for Intel CPUs. So also use the
APIC ID offsets to calculate this field.

Note: I don't have AMD hardware available; I hope folks can help me
test this, thanks!

[1]: APM, vol.3, appendix.E.4.15 Function 8000_001Dh--Cache Topology
     Information

Cc: Babu Moger <babu.moger@amd.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes since v1:
 * Rename "l3_threads" to "num_apic_ids" in
   encode_cache_cpuid8000001d(). (Yanan)
 * Add the description of the original commit and add Cc.
---
 target/i386/cpu.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index c9897c0fe91a..f67b6be10b8d 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -361,7 +361,7 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
                                        uint32_t *eax, uint32_t *ebx,
                                        uint32_t *ecx, uint32_t *edx)
 {
-    uint32_t l3_threads;
+    uint32_t num_apic_ids;
     assert(cache->size == cache->line_size * cache->associativity *
                           cache->partitions * cache->sets);
 
@@ -370,13 +370,11 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
 
     /* L3 is shared among multiple cores */
     if (cache->level == 3) {
-        l3_threads = topo_info->modules_per_die *
-                     topo_info->cores_per_module *
-                     topo_info->threads_per_core;
-        *eax |= (l3_threads - 1) << 14;
+        num_apic_ids = 1 << apicid_die_offset(topo_info);
     } else {
-        *eax |= ((topo_info->threads_per_core - 1) << 14);
+        num_apic_ids = 1 << apicid_core_offset(topo_info);
     }
+    *eax |= (num_apic_ids - 1) << 14;
 
     assert(cache->line_size > 0);
     assert(cache->partitions > 0);
-- 
2.34.1




* [PATCH v3 16/17] i386: Use CPUCacheInfo.share_level to encode CPUID[0x8000001D].EAX[bits 25:14]
  2023-08-01 10:35 [PATCH v3 00/17] Support smp.clusters for x86 Zhao Liu
                   ` (14 preceding siblings ...)
  2023-08-01 10:35 ` [PATCH v3 15/17] i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14] Zhao Liu
@ 2023-08-01 10:35 ` Zhao Liu
  2023-08-03 20:44   ` Moger, Babu
  2023-08-01 10:35 ` [PATCH v3 17/17] i386: Add new property to control L2 cache topo in CPUID.04H Zhao Liu
                   ` (2 subsequent siblings)
  18 siblings, 1 reply; 63+ messages in thread
From: Zhao Liu @ 2023-08-01 10:35 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

CPUID[0x8000001D].EAX[bits 25:14] is used to represent the cache
topology for AMD CPUs.

Now that cache models carry topology information, we can use
CPUCacheInfo.share_level to decide which topology level is encoded
into CPUID[0x8000001D].EAX[bits 25:14].
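
The resulting encoding itself is unchanged. As a stand-alone
illustration (made-up helper, not QEMU code), the field packs
"sharing count - 1" into bits 25:14, where the count is the number of
APIC IDs covered by the cache's share level:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sketch of the CPUID[0x8000001D].EAX[bits 25:14] encoding.
 * apicid_offset is the APIC ID bit offset of the cache's share level
 * (core, module, die, ...), as returned by QEMU's apicid_*_offset()
 * helpers. */
static uint32_t encode_num_sharing_eax(unsigned apicid_offset)
{
    uint32_t num_apic_ids = 1u << apicid_offset; /* IDs below the level */
    return (num_apic_ids - 1) << 14;             /* field is "count - 1" */
}
```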

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes since v1:
 * Use cache->share_level as the parameter in
   max_processor_ids_for_cache().
---
 target/i386/cpu.c | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index f67b6be10b8d..6eee0274ade4 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -361,20 +361,12 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
                                        uint32_t *eax, uint32_t *ebx,
                                        uint32_t *ecx, uint32_t *edx)
 {
-    uint32_t num_apic_ids;
     assert(cache->size == cache->line_size * cache->associativity *
                           cache->partitions * cache->sets);
 
     *eax = CACHE_TYPE(cache->type) | CACHE_LEVEL(cache->level) |
                (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0);
-
-    /* L3 is shared among multiple cores */
-    if (cache->level == 3) {
-        num_apic_ids = 1 << apicid_die_offset(topo_info);
-    } else {
-        num_apic_ids = 1 << apicid_core_offset(topo_info);
-    }
-    *eax |= (num_apic_ids - 1) << 14;
+    *eax |= max_processor_ids_for_cache(topo_info, cache->share_level) << 14;
 
     assert(cache->line_size > 0);
     assert(cache->partitions > 0);
-- 
2.34.1




* [PATCH v3 17/17] i386: Add new property to control L2 cache topo in CPUID.04H
  2023-08-01 10:35 [PATCH v3 00/17] Support smp.clusters for x86 Zhao Liu
                   ` (15 preceding siblings ...)
  2023-08-01 10:35 ` [PATCH v3 16/17] i386: Use CPUCacheInfo.share_level to encode " Zhao Liu
@ 2023-08-01 10:35 ` Zhao Liu
  2023-08-01 15:35 ` [PATCH v3 00/17] Support smp.clusters for x86 Jonathan Cameron via
  2023-08-01 23:11 ` Moger, Babu
  18 siblings, 0 replies; 63+ messages in thread
From: Zhao Liu @ 2023-08-01 10:35 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu, Yongwei Ma

From: Zhao Liu <zhao1.liu@intel.com>

The property x-l2-cache-topo will be used to change the L2 cache
topology in CPUID.04H.

It allows the user to set whether the L2 cache is shared at the core
level or at the cluster level.

If the user passes "-cpu x-l2-cache-topo=[core|cluster]", the default L2
cache topology will be overridden by the new topology setting.

Here we expose "cluster" to the user instead of "module", to be
consistent with the "cluster-id" naming.

Since CPUID.04H is used by Intel CPUs, this property is only available
on Intel CPUs for now.

When necessary, it can be extended to CPUID.8000001DH for AMD CPUs.
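
As a usage sketch (the machine type, CPU model and topology values here
are only illustrative):

```shell
# Illustrative only: 8 CPUs in 2 clusters of 4 cores; with
# x-l2-cache-topo=cluster, CPUID.04H reports L2 as shared per cluster
# (i.e. per x86 module) instead of per core.
qemu-system-x86_64 -machine q35 \
    -smp 8,sockets=1,clusters=2,cores=4,threads=1 \
    -cpu Skylake-Server,x-l2-cache-topo=cluster
```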

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
---
Changes since v1:
 * Rename MODULE branch to CPU_TOPO_LEVEL_MODULE to match the previous
   renaming changes.
---
 target/i386/cpu.c | 34 +++++++++++++++++++++++++++++++++-
 target/i386/cpu.h |  2 ++
 2 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 6eee0274ade4..f4c48e19fa4e 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -243,6 +243,9 @@ static uint32_t max_processor_ids_for_cache(X86CPUTopoInfo *topo_info,
     case CPU_TOPO_LEVEL_CORE:
         num_ids = 1 << apicid_core_offset(topo_info);
         break;
+    case CPU_TOPO_LEVEL_MODULE:
+        num_ids = 1 << apicid_module_offset(topo_info);
+        break;
     case CPU_TOPO_LEVEL_DIE:
         num_ids = 1 << apicid_die_offset(topo_info);
         break;
@@ -251,7 +254,7 @@ static uint32_t max_processor_ids_for_cache(X86CPUTopoInfo *topo_info,
         break;
     default:
         /*
-         * Currently there is no use case for SMT and MODULE, so use
+         * Currently there is no use case for SMT, so use
          * assert directly to facilitate debugging.
          */
         g_assert_not_reached();
@@ -7458,6 +7461,34 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
         env->cache_info_amd.l3_cache = &legacy_l3_cache;
     }
 
+    if (cpu->l2_cache_topo_level) {
+        /*
+         * FIXME: Currently only supports changing CPUID[4] (for intel), and
+         * will support changing CPUID[0x8000001D] when necessary.
+         */
+        if (!IS_INTEL_CPU(env)) {
+            error_setg(errp, "only Intel CPUs support x-l2-cache-topo");
+            return;
+        }
+
+        if (!strcmp(cpu->l2_cache_topo_level, "core")) {
+            env->cache_info_cpuid4.l2_cache->share_level = CPU_TOPO_LEVEL_CORE;
+        } else if (!strcmp(cpu->l2_cache_topo_level, "cluster")) {
+            /*
+             * We expose to users "cluster" instead of "module", to be
+             * consistent with "cluster-id" naming.
+             */
+            env->cache_info_cpuid4.l2_cache->share_level =
+                                                        CPU_TOPO_LEVEL_MODULE;
+        } else {
+            error_setg(errp,
+                       "x-l2-cache-topo doesn't support '%s', "
+                       "and it only supports 'core' or 'cluster'",
+                       cpu->l2_cache_topo_level);
+            return;
+        }
+    }
+
 #ifndef CONFIG_USER_ONLY
     MachineState *ms = MACHINE(qdev_get_machine());
     qemu_register_reset(x86_cpu_machine_reset_cb, cpu);
@@ -7961,6 +7992,7 @@ static Property x86_cpu_properties[] = {
                      false),
     DEFINE_PROP_BOOL("x-intel-pt-auto-level", X86CPU, intel_pt_auto_level,
                      true),
+    DEFINE_PROP_STRING("x-l2-cache-topo", X86CPU, l2_cache_topo_level),
     DEFINE_PROP_END_OF_LIST()
 };
 
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 3f0cdc45607a..24db2a0d9588 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -2057,6 +2057,8 @@ struct ArchCPU {
     int32_t hv_max_vps;
 
     bool xen_vapic;
+
+    char *l2_cache_topo_level;
 };
 
 
-- 
2.34.1




* Re: [PATCH v3 00/17] Support smp.clusters for x86
  2023-08-01 10:35 [PATCH v3 00/17] Support smp.clusters for x86 Zhao Liu
                   ` (16 preceding siblings ...)
  2023-08-01 10:35 ` [PATCH v3 17/17] i386: Add new property to control L2 cache topo in CPUID.04H Zhao Liu
@ 2023-08-01 15:35 ` Jonathan Cameron via
  2023-08-04 13:17   ` Zhao Liu
  2023-08-01 23:11 ` Moger, Babu
  18 siblings, 1 reply; 63+ messages in thread
From: Jonathan Cameron via @ 2023-08-01 15:35 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Xiaoyao Li, Babu Moger,
	Zhao Liu

On Tue,  1 Aug 2023 18:35:10 +0800
Zhao Liu <zhao1.liu@linux.intel.com> wrote:

> From: Zhao Liu <zhao1.liu@intel.com>
> 
> Hi list,
> 
> This is the our v3 patch series, rebased on the master branch at the
> commit 234320cd0573 ("Merge tag 'pull-target-arm-20230731' of https:
> //git.linaro.org/people/pmaydell/qemu-arm into staging").
> 
> Comparing with v2 [1], v3 mainly adds "Tested-by", "Reviewed-by" and
> "ACKed-by" (for PC related patchies) tags and minor code changes (Pls
> see changelog).
> 
> 
> # Introduction
> 
> This series add the cluster support for x86 PC machine, which allows
> x86 can use smp.clusters to configure x86 modlue level CPU topology.
> 
> And since the compatibility issue (see section: ## Why not share L2
> cache in cluster directly), this series also introduce a new command
> to adjust the x86 L2 cache topology.
> 
> Welcome your comments!
> 
> 
> # Backgroud
> 
> The "clusters" parameter in "smp" is introduced by ARM [2], but x86
> hasn't supported it.
> 
> At present, x86 defaults L2 cache is shared in one core, but this is
> not enough. There're some platforms that multiple cores share the
> same L2 cache, e.g., Alder Lake-P shares L2 cache for one module of
> Atom cores [3], that is, every four Atom cores shares one L2 cache.
> Therefore, we need the new CPU topology level (cluster/module).
> 
> Another reason is for hybrid architecture. cluster support not only
> provides another level of topology definition in x86, but would aslo
> provide required code change for future our hybrid topology support.
> 
> 
> # Overview
> 
> ## Introduction of module level for x86
> 
> "cluster" in smp is the CPU topology level which is between "core" and
> die.
> 
> For x86, the "cluster" in smp is corresponding to the module level [4],
> which is above the core level. So use the "module" other than "cluster"
> in x86 code.
> 
> And please note that x86 already has a cpu topology level also named
> "cluster" [4], this level is at the upper level of the package. Here,
> the cluster in x86 cpu topology is completely different from the
> "clusters" as the smp parameter. After the module level is introduced,
> the cluster as the smp parameter will actually refer to the module level
> of x86.
> 
> 
> ## Why not share L2 cache in cluster directly
> 
> Though "clusters" was introduced to help define L2 cache topology
> [2], using cluster to define x86's L2 cache topology will cause the
> compatibility problem:
> 
> Currently, x86 defaults that the L2 cache is shared in one core, which
> actually implies a default setting "cores per L2 cache is 1" and
> therefore implicitly defaults to having as many L2 caches as cores.
> 
> For example (i386 PC machine):
> -smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16 (*)
> 
> Considering the topology of the L2 cache, this (*) implicitly means "1
> core per L2 cache" and "2 L2 caches per die".
> 
> If we use cluster to configure L2 cache topology with the new default
> setting "clusters per L2 cache is 1", the above semantics will change
> to "2 cores per cluster" and "1 cluster per L2 cache", that is, "2
> cores per L2 cache".
> 
> So the same command (*) will cause changes in the L2 cache topology,
> further affecting the performance of the virtual machine.
> 
> Therefore, x86 should only treat cluster as a cpu topology level and
> avoid using it to change L2 cache by default for compatibility.
> 
> 
> ## module level in CPUID
> 
> Currently, we don't expose module level in CPUID.1FH because currently
> linux (v6.2-rc6) doesn't support module level. And exposing module and
> die levels at the same time in CPUID.1FH will cause linux to calculate
> wrong die_id. The module level should be exposed until the real machine
> has the module level in CPUID.1FH.
> 
> We can configure CPUID.04H.02H (L2 cache topology) with module level by
> a new command:
> 
>         "-cpu,x-l2-cache-topo=cluster"
> 
> More information about this command, please see the section: "## New
> property: x-l2-cache-topo".
> 
> 
> ## New cache topology info in CPUCacheInfo
> 
> Currently, by default, the cache topology is encoded as:
> 1. i/d cache is shared in one core.
> 2. L2 cache is shared in one core.
> 3. L3 cache is shared in one die.
> 
> This default general setting has caused a misunderstanding, that is, the
> cache topology is completely equated with a specific cpu topology, such
> as the connection between L2 cache and core level, and the connection
> between L3 cache and die level.
> 
> In fact, the settings of these topologies depend on the specific
> platform and are not static. For example, on Alder Lake-P, every
> four Atom cores share the same L2 cache [2].
> 
> Thus, in this patch set, we explicitly define the corresponding cache
> topology for different cpu models and this has two benefits:
> 1. Easy to expand to new CPU models in the future, which has different
>    cache topology.
> 2. It can easily support custom cache topology by some command (e.g.,
>    x-l2-cache-topo).
> 
> 
> ## New property: x-l2-cache-topo
> 
> The property l2-cache-topo will be used to change the L2 cache topology
> in CPUID.04H.
> 
> Now it allows user to set the L2 cache is shared in core level or
> cluster level.
> 
> If user passes "-cpu x-l2-cache-topo=[core|cluster]" then older L2 cache
> topology will be overrided by the new topology setting.
> 
> Since CPUID.04H is used by intel cpus, this property is available on
> intel cpus as for now.
> 
> When necessary, it can be extended to CPUID[0x8000001D] for amd cpus.

Hi Zhao Liu,

As part of emulating arm's MPAM (cache partitioning controls) I needed
to add the missing cache description in the ACPI PPTT table. As such I ran
into a very similar problem to the one you are addressing.

I wonder if a more generic description is possible? We can rely on ordering
of the cache levels, so what I was planning to propose was the rather lengthy
but flexible (and with better names ;)

-smp 16,sockets=1,clusters=4,threads=2,cache-cluster-start-level=2,cache-node-start-level=3

Perhaps we can come up with a common scheme that covers both usecases?
It gets more fiddly to define if we have variable topology across different clusters
- and that was going to be an open question in the RFC proposing this - our current
definition of the more basic topology doesn't cover those cases anyway.

What I want:

1) No restriction on maximum cache levels - some systems have more than 3.
2) Easy ability to express everything from all caches are private to all caches are shared.
Is 3 levels enough? (private, shared at cluster level, shared at a level above that) I think
so, but if not any scheme should be extensible to cover another level.

Great if we can figure out a common scheme.

Jonathan

> 
> 
> # Patch description
> 
> patch 1-2 Cleanups about coding style and test name.
> 
> patch 3-4,15 Fixes about x86 topology, intel l1 cache topology and amd
>              cache topology encoding.
> 
> patch 5-6 Cleanups about topology related CPUID encoding and QEMU
>           topology variables.
> 
> patch 7-12 Add the module as the new CPU topology level in x86, and it
>            is corresponding to the cluster level in generic code.
> 
> patch 13,14,16 Add cache topology infomation in cache models.
> 
> patch 17 Introduce a new command to configure L2 cache topology.
> 
> 
> [1]: https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg07179.html
> [2]: https://patchew.org/QEMU/20211228092221.21068-1-wangyanan55@huawei.com/
> [3]: https://www.intel.com/content/www/us/en/products/platforms/details/alder-lake-p.html
> [4]: SDM, vol.3, ch.9, 9.9.1 Hierarchical Mapping of Shared Resources.
> 
> Best Regards,
> Zhao
> 
> ---
> Changelog:
> 
> Changes since v2:
>  * Add "Tested-by", "Reviewed-by" and "ACKed-by" tags.
>  * Use newly added wrapped helper to get cores per socket in
>    qemu_init_vcpu().
> 
> Changes since v1:
>  * Reordered patches. (Yanan)
>  * Deprecated the patch to fix comment of machine_parse_smp_config().
>    (Yanan)
>  * Rename test-x86-cpuid.c to test-x86-topo.c. (Yanan)
>  * Split the intel's l1 cache topology fix into a new separate patch.
>    (Yanan)
>  * Combined module_id and APIC ID for module level support into one
>    patch. (Yanan)
>  * Make cache_into_passthrough case of cpuid 0x04 leaf in
>  * cpu_x86_cpuid() use max_processor_ids_for_cache() and
>    max_core_ids_in_package() to encode CPUID[4]. (Yanan)
>  * Add the prefix "CPU_TOPO_LEVEL_*" for CPU topology level names.
>    (Yanan)
>  * Rename the "INVALID" level to "CPU_TOPO_LEVEL_UNKNOW". (Yanan)
> 
> ---
> Zhao Liu (10):
>   i386: Fix comment style in topology.h
>   tests: Rename test-x86-cpuid.c to test-x86-topo.c
>   i386/cpu: Fix i/d-cache topology to core level for Intel CPU
>   i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4]
>   i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
>   i386: Add cache topology info in CPUCacheInfo
>   i386: Use CPUCacheInfo.share_level to encode CPUID[4]
>   i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]
>   i386: Use CPUCacheInfo.share_level to encode
>     CPUID[0x8000001D].EAX[bits 25:14]
>   i386: Add new property to control L2 cache topo in CPUID.04H
> 
> Zhuocheng Ding (7):
>   softmmu: Fix CPUSTATE.nr_cores' calculation
>   i386: Introduce module-level cpu topology to CPUX86State
>   i386: Support modules_per_die in X86CPUTopoInfo
>   i386: Support module_id in X86CPUTopoIDs
>   i386/cpu: Introduce cluster-id to X86CPU
>   tests: Add test case of APIC ID for module level parsing
>   hw/i386/pc: Support smp.clusters for x86 PC machine
> 
>  MAINTAINERS                                   |   2 +-
>  hw/i386/pc.c                                  |   1 +
>  hw/i386/x86.c                                 |  49 +++++-
>  include/hw/core/cpu.h                         |   2 +-
>  include/hw/i386/topology.h                    |  68 +++++---
>  qemu-options.hx                               |  10 +-
>  softmmu/cpus.c                                |   2 +-
>  target/i386/cpu.c                             | 158 ++++++++++++++----
>  target/i386/cpu.h                             |  25 +++
>  tests/unit/meson.build                        |   4 +-
>  .../{test-x86-cpuid.c => test-x86-topo.c}     |  58 ++++---
>  11 files changed, 280 insertions(+), 99 deletions(-)
>  rename tests/unit/{test-x86-cpuid.c => test-x86-topo.c} (73%)
> 




* Re: [PATCH v3 00/17] Support smp.clusters for x86
  2023-08-01 10:35 [PATCH v3 00/17] Support smp.clusters for x86 Zhao Liu
                   ` (17 preceding siblings ...)
  2023-08-01 15:35 ` [PATCH v3 00/17] Support smp.clusters for x86 Jonathan Cameron via
@ 2023-08-01 23:11 ` Moger, Babu
  2023-08-04  7:44   ` Zhao Liu
  18 siblings, 1 reply; 63+ messages in thread
From: Moger, Babu @ 2023-08-01 23:11 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Zhao Liu

Hi Zhao,

On 8/1/23 05:35, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
> 
> Hi list,
> 
> This is the our v3 patch series, rebased on the master branch at the
> commit 234320cd0573 ("Merge tag 'pull-target-arm-20230731' of https:
> //git.linaro.org/people/pmaydell/qemu-arm into staging").
> 
> Comparing with v2 [1], v3 mainly adds "Tested-by", "Reviewed-by" and
> "ACKed-by" (for PC related patchies) tags and minor code changes (Pls
> see changelog).
> 
> 
> # Introduction
> 
> This series add the cluster support for x86 PC machine, which allows
> x86 can use smp.clusters to configure x86 modlue level CPU topology.

/s/modlue/module
> 
> And since the compatibility issue (see section: ## Why not share L2
> cache in cluster directly), this series also introduce a new command
> to adjust the x86 L2 cache topology.
> 
> Welcome your comments!
> 
> 
> # Backgroud
> 
> The "clusters" parameter in "smp" is introduced by ARM [2], but x86
> hasn't supported it.
> 
> At present, x86 defaults L2 cache is shared in one core, but this is
> not enough. There're some platforms that multiple cores share the
> same L2 cache, e.g., Alder Lake-P shares L2 cache for one module of
> Atom cores [3], that is, every four Atom cores shares one L2 cache.
> Therefore, we need the new CPU topology level (cluster/module).
> 
> Another reason is for hybrid architecture. cluster support not only
> provides another level of topology definition in x86, but would aslo
> provide required code change for future our hybrid topology support.
> 
> 
> # Overview
> 
> ## Introduction of module level for x86
> 
> "cluster" in smp is the CPU topology level which is between "core" and
> die.
> 
> For x86, the "cluster" in smp is corresponding to the module level [4],
> which is above the core level. So use the "module" other than "cluster"
> in x86 code.
> 
> And please note that x86 already has a cpu topology level also named
> "cluster" [4], this level is at the upper level of the package. Here,
> the cluster in x86 cpu topology is completely different from the
> "clusters" as the smp parameter. After the module level is introduced,
> the cluster as the smp parameter will actually refer to the module level
> of x86.
> 
> 
> ## Why not share L2 cache in cluster directly
> 
> Though "clusters" was introduced to help define L2 cache topology
> [2], using cluster to define x86's L2 cache topology will cause the
> compatibility problem:
> 
> Currently, x86 defaults that the L2 cache is shared in one core, which
> actually implies a default setting "cores per L2 cache is 1" and
> therefore implicitly defaults to having as many L2 caches as cores.
> 
> For example (i386 PC machine):
> -smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16 (*)
> 
> Considering the topology of the L2 cache, this (*) implicitly means "1
> core per L2 cache" and "2 L2 caches per die".
> 
> If we use cluster to configure L2 cache topology with the new default
> setting "clusters per L2 cache is 1", the above semantics will change
> to "2 cores per cluster" and "1 cluster per L2 cache", that is, "2
> cores per L2 cache".
> 
> So the same command (*) will cause changes in the L2 cache topology,
> further affecting the performance of the virtual machine.
> 
> Therefore, x86 should only treat cluster as a cpu topology level and
> avoid using it to change L2 cache by default for compatibility.
> 
> 
> ## module level in CPUID
> 
> Currently, we don't expose module level in CPUID.1FH because currently
> linux (v6.2-rc6) doesn't support module level. And exposing module and
> die levels at the same time in CPUID.1FH will cause linux to calculate
> wrong die_id. The module level should be exposed until the real machine
> has the module level in CPUID.1FH.
> 
> We can configure CPUID.04H.02H (L2 cache topology) with module level by
> a new command:
> 
>         "-cpu,x-l2-cache-topo=cluster"
> 
> More information about this command, please see the section: "## New
> property: x-l2-cache-topo".
> 
> 
> ## New cache topology info in CPUCacheInfo
> 
> Currently, by default, the cache topology is encoded as:
> 1. i/d cache is shared in one core.
> 2. L2 cache is shared in one core.
> 3. L3 cache is shared in one die.
> 
> This default general setting has caused a misunderstanding, that is, the
> cache topology is completely equated with a specific cpu topology, such
> as the connection between L2 cache and core level, and the connection
> between L3 cache and die level.
> 
> In fact, the settings of these topologies depend on the specific
> platform and are not static. For example, on Alder Lake-P, every
> four Atom cores share the same L2 cache [2].
> 
> Thus, in this patch set, we explicitly define the corresponding cache
> topology for different cpu models and this has two benefits:
> 1. Easy to expand to new CPU models in the future, which has different
>    cache topology.
> 2. It can easily support custom cache topology by some command (e.g.,
>    x-l2-cache-topo).
> 
> 
> ## New property: x-l2-cache-topo
> 
> The property l2-cache-topo will be used to change the L2 cache topology

Should this be x-l2-cache-topo ?

> in CPUID.04H.
> 
> Now it allows user to set the L2 cache is shared in core level or
> cluster level.
> 
> If user passes "-cpu x-l2-cache-topo=[core|cluster]" then older L2 cache
> topology will be overrided by the new topology setting.
> 
> Since CPUID.04H is used by intel cpus, this property is available on
> intel cpus as for now.

s/intel cpus/Intel CPUs/
I feel this looks better.

> 
> When necessary, it can be extended to CPUID[0x8000001D] for amd cpus.

s/amd cpus/AMD CPUs/

> 
> 
> # Patch description
> 
> patch 1-2 Cleanups about coding style and test name.
> 
> patch 3-4,15 Fixes about x86 topology, intel l1 cache topology and amd
>              cache topology encoding.
> 
> patch 5-6 Cleanups about topology related CPUID encoding and QEMU
>           topology variables.
> 
> patch 7-12 Add the module as the new CPU topology level in x86, and it
>            is corresponding to the cluster level in generic code.
> 
> patch 13,14,16 Add cache topology infomation in cache models.
> 
> patch 17 Introduce a new command to configure L2 cache topology.
> 
> 
> [1]: https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg07179.html
> [2]: https://patchew.org/QEMU/20211228092221.21068-1-wangyanan55@huawei.com/
> [3]: https://www.intel.com/content/www/us/en/products/platforms/details/alder-lake-p.html
> [4]: SDM, vol.3, ch.9, 9.9.1 Hierarchical Mapping of Shared Resources.
> 
> Best Regards,
> Zhao
> 
> ---
> Changelog:
> 
> Changes since v2:
>  * Add "Tested-by", "Reviewed-by" and "ACKed-by" tags.
>  * Use newly added wrapped helper to get cores per socket in
>    qemu_init_vcpu().
> 
> Changes since v1:
>  * Reordered patches. (Yanan)
>  * Deprecated the patch to fix comment of machine_parse_smp_config().
>    (Yanan)
>  * Rename test-x86-cpuid.c to test-x86-topo.c. (Yanan)
>  * Split the intel's l1 cache topology fix into a new separate patch.
>    (Yanan)
>  * Combined module_id and APIC ID for module level support into one
>    patch. (Yanan)
>  * Make cache_into_passthrough case of cpuid 0x04 leaf in
>  * cpu_x86_cpuid() use max_processor_ids_for_cache() and
>    max_core_ids_in_package() to encode CPUID[4]. (Yanan)
>  * Add the prefix "CPU_TOPO_LEVEL_*" for CPU topology level names.
>    (Yanan)
>  * Rename the "INVALID" level to "CPU_TOPO_LEVEL_UNKNOW". (Yanan)
> 
> ---
> Zhao Liu (10):
>   i386: Fix comment style in topology.h
>   tests: Rename test-x86-cpuid.c to test-x86-topo.c
>   i386/cpu: Fix i/d-cache topology to core level for Intel CPU
>   i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4]
>   i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
>   i386: Add cache topology info in CPUCacheInfo
>   i386: Use CPUCacheInfo.share_level to encode CPUID[4]
>   i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]
>   i386: Use CPUCacheInfo.share_level to encode
>     CPUID[0x8000001D].EAX[bits 25:14]
>   i386: Add new property to control L2 cache topo in CPUID.04H
> 
> Zhuocheng Ding (7):
>   softmmu: Fix CPUSTATE.nr_cores' calculation
>   i386: Introduce module-level cpu topology to CPUX86State
>   i386: Support modules_per_die in X86CPUTopoInfo
>   i386: Support module_id in X86CPUTopoIDs
>   i386/cpu: Introduce cluster-id to X86CPU
>   tests: Add test case of APIC ID for module level parsing
>   hw/i386/pc: Support smp.clusters for x86 PC machine
> 
>  MAINTAINERS                                   |   2 +-
>  hw/i386/pc.c                                  |   1 +
>  hw/i386/x86.c                                 |  49 +++++-
>  include/hw/core/cpu.h                         |   2 +-
>  include/hw/i386/topology.h                    |  68 +++++---
>  qemu-options.hx                               |  10 +-
>  softmmu/cpus.c                                |   2 +-
>  target/i386/cpu.c                             | 158 ++++++++++++++----
>  target/i386/cpu.h                             |  25 +++
>  tests/unit/meson.build                        |   4 +-
>  .../{test-x86-cpuid.c => test-x86-topo.c}     |  58 ++++---
>  11 files changed, 280 insertions(+), 99 deletions(-)
>  rename tests/unit/{test-x86-cpuid.c => test-x86-topo.c} (73%)
> 

-- 
Thanks
Babu Moger



* Re: [PATCH v3 01/17] i386: Fix comment style in topology.h
  2023-08-01 10:35 ` [PATCH v3 01/17] i386: Fix comment style in topology.h Zhao Liu
@ 2023-08-01 23:13   ` Moger, Babu
  2023-08-04  8:12     ` Zhao Liu
  2023-08-07  2:16   ` Xiaoyao Li
  1 sibling, 1 reply; 63+ messages in thread
From: Moger, Babu @ 2023-08-01 23:13 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Zhao Liu

Hi Zhao,

On 8/1/23 05:35, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
> 
> For function comments in this file, keep the comment style consistent
> with other places.

s/with other places./with other files in the directory./

> 
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org
> Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>  include/hw/i386/topology.h | 33 +++++++++++++++++----------------
>  1 file changed, 17 insertions(+), 16 deletions(-)
> 
> diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> index 81573f6cfde0..5a19679f618b 100644
> --- a/include/hw/i386/topology.h
> +++ b/include/hw/i386/topology.h
> @@ -24,7 +24,8 @@
>  #ifndef HW_I386_TOPOLOGY_H
>  #define HW_I386_TOPOLOGY_H
>  
> -/* This file implements the APIC-ID-based CPU topology enumeration logic,
> +/*
> + * This file implements the APIC-ID-based CPU topology enumeration logic,
>   * documented at the following document:
>   *   Intel® 64 Architecture Processor Topology Enumeration
>   *   http://software.intel.com/en-us/articles/intel-64-architecture-processor-topology-enumeration/
> @@ -41,7 +42,8 @@
>  
>  #include "qemu/bitops.h"
>  
> -/* APIC IDs can be 32-bit, but beware: APIC IDs > 255 require x2APIC support
> +/*
> + * APIC IDs can be 32-bit, but beware: APIC IDs > 255 require x2APIC support
>   */
>  typedef uint32_t apic_id_t;
>  
> @@ -58,8 +60,7 @@ typedef struct X86CPUTopoInfo {
>      unsigned threads_per_core;
>  } X86CPUTopoInfo;
>  
> -/* Return the bit width needed for 'count' IDs
> - */
> +/* Return the bit width needed for 'count' IDs */
>  static unsigned apicid_bitwidth_for_count(unsigned count)
>  {
>      g_assert(count >= 1);
> @@ -67,15 +68,13 @@ static unsigned apicid_bitwidth_for_count(unsigned count)
>      return count ? 32 - clz32(count) : 0;
>  }
>  
> -/* Bit width of the SMT_ID (thread ID) field on the APIC ID
> - */
> +/* Bit width of the SMT_ID (thread ID) field on the APIC ID */
>  static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
>  {
>      return apicid_bitwidth_for_count(topo_info->threads_per_core);
>  }
>  
> -/* Bit width of the Core_ID field
> - */
> +/* Bit width of the Core_ID field */
>  static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
>  {
>      return apicid_bitwidth_for_count(topo_info->cores_per_die);
> @@ -87,8 +86,7 @@ static inline unsigned apicid_die_width(X86CPUTopoInfo *topo_info)
>      return apicid_bitwidth_for_count(topo_info->dies_per_pkg);
>  }
>  
> -/* Bit offset of the Core_ID field
> - */
> +/* Bit offset of the Core_ID field */
>  static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
>  {
>      return apicid_smt_width(topo_info);
> @@ -100,14 +98,14 @@ static inline unsigned apicid_die_offset(X86CPUTopoInfo *topo_info)
>      return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
>  }
>  
> -/* Bit offset of the Pkg_ID (socket ID) field
> - */
> +/* Bit offset of the Pkg_ID (socket ID) field */
>  static inline unsigned apicid_pkg_offset(X86CPUTopoInfo *topo_info)
>  {
>      return apicid_die_offset(topo_info) + apicid_die_width(topo_info);
>  }
>  
> -/* Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
> +/*
> + * Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
>   *
>   * The caller must make sure core_id < nr_cores and smt_id < nr_threads.
>   */
> @@ -120,7 +118,8 @@ static inline apic_id_t x86_apicid_from_topo_ids(X86CPUTopoInfo *topo_info,
>             topo_ids->smt_id;
>  }
>  
> -/* Calculate thread/core/package IDs for a specific topology,
> +/*
> + * Calculate thread/core/package IDs for a specific topology,
>   * based on (contiguous) CPU index
>   */
>  static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
> @@ -137,7 +136,8 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
>      topo_ids->smt_id = cpu_index % nr_threads;
>  }
>  
> -/* Calculate thread/core/package IDs for a specific topology,
> +/*
> + * Calculate thread/core/package IDs for a specific topology,
>   * based on APIC ID
>   */
>  static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
> @@ -155,7 +155,8 @@ static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
>      topo_ids->pkg_id = apicid >> apicid_pkg_offset(topo_info);
>  }
>  
> -/* Make APIC ID for the CPU 'cpu_index'
> +/*
> + * Make APIC ID for the CPU 'cpu_index'
>   *
>   * 'cpu_index' is a sequential, contiguous ID for the CPU.
>   */

-- 
Thanks
Babu Moger



* Re: [PATCH v3 02/17] tests: Rename test-x86-cpuid.c to test-x86-topo.c
  2023-08-01 10:35 ` [PATCH v3 02/17] tests: Rename test-x86-cpuid.c to test-x86-topo.c Zhao Liu
@ 2023-08-01 23:20   ` Moger, Babu
  2023-08-04  8:14     ` Zhao Liu
  0 siblings, 1 reply; 63+ messages in thread
From: Moger, Babu @ 2023-08-01 23:20 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Zhao Liu, Yongwei Ma

Zhao,

On 8/1/23 05:35, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
> 
> In fact, this unit tests APIC ID other than CPUID.

This is not clear.

The tests in test-x86-topo.c actually test the APIC ID combinations.
Rename to test-x86-topo.c to make its name more in line with its actual
content.

> Rename to test-x86-topo.c to make its name more in line with its
> actual content.
> 
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> Tested-by: Yongwei Ma <yongwei.ma@intel.com>
> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> Changes since v1:
>  * Rename test-x86-apicid.c to test-x86-topo.c. (Yanan)
> ---
>  MAINTAINERS                                      | 2 +-
>  tests/unit/meson.build                           | 4 ++--
>  tests/unit/{test-x86-cpuid.c => test-x86-topo.c} | 2 +-
>  3 files changed, 4 insertions(+), 4 deletions(-)
>  rename tests/unit/{test-x86-cpuid.c => test-x86-topo.c} (99%)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 12e59b6b27de..51ba3d593e90 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1719,7 +1719,7 @@ F: include/hw/southbridge/ich9.h
>  F: include/hw/southbridge/piix.h
>  F: hw/isa/apm.c
>  F: include/hw/isa/apm.h
> -F: tests/unit/test-x86-cpuid.c
> +F: tests/unit/test-x86-topo.c
>  F: tests/qtest/test-x86-cpuid-compat.c
>  
>  PC Chipset
> diff --git a/tests/unit/meson.build b/tests/unit/meson.build
> index 93977cc32d2b..39b5d0007c69 100644
> --- a/tests/unit/meson.build
> +++ b/tests/unit/meson.build
> @@ -21,8 +21,8 @@ tests = {
>    'test-opts-visitor': [testqapi],
>    'test-visitor-serialization': [testqapi],
>    'test-bitmap': [],
> -  # all code tested by test-x86-cpuid is inside topology.h
> -  'test-x86-cpuid': [],
> +  # all code tested by test-x86-topo is inside topology.h
> +  'test-x86-topo': [],
>    'test-cutils': [],
>    'test-div128': [],
>    'test-shift128': [],
> diff --git a/tests/unit/test-x86-cpuid.c b/tests/unit/test-x86-topo.c
> similarity index 99%
> rename from tests/unit/test-x86-cpuid.c
> rename to tests/unit/test-x86-topo.c
> index bfabc0403a1a..2b104f86d7c2 100644
> --- a/tests/unit/test-x86-cpuid.c
> +++ b/tests/unit/test-x86-topo.c
> @@ -1,5 +1,5 @@
>  /*
> - *  Test code for x86 CPUID and Topology functions
> + *  Test code for x86 APIC ID and Topology functions
>   *
>   *  Copyright (c) 2012 Red Hat Inc.
>   *

-- 
Thanks
Babu Moger



* Re: [PATCH v3 03/17] softmmu: Fix CPUSTATE.nr_cores' calculation
  2023-08-01 10:35 ` [PATCH v3 03/17] softmmu: Fix CPUSTATE.nr_cores' calculation Zhao Liu
@ 2023-08-02 15:25   ` Moger, Babu
  2023-08-04  8:16     ` Zhao Liu
  2023-08-07  7:03   ` Xiaoyao Li
  1 sibling, 1 reply; 63+ messages in thread
From: Moger, Babu @ 2023-08-02 15:25 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Zhao Liu, Zhuocheng Ding

Hi Zhao,

On 8/1/23 05:35, Zhao Liu wrote:
> From: Zhuocheng Ding <zhuocheng.ding@intel.com>
> 
> From CPUState.nr_cores' comment, it represents "number of cores within
> this CPU package".
> 
> After 003f230e37d7 ("machine: Tweak the order of topology members in
> struct CpuTopology"), the meaning of smp.cores changed to "the number of
> cores in one die", but this commit missed to change CPUState.nr_cores'
> caculation, so that CPUState.nr_cores became wrong and now it
> misses to consider numbers of clusters and dies.
> 
> At present, only i386 is using CPUState.nr_cores.
> 
> But as for i386, which supports die level, the uses of CPUState.nr_cores
> are very confusing:
> 
> Early uses are based on the meaning of "cores per package" (before die
> is introduced into i386), and later uses are based on "cores per die"
> (after die's introduction).
> 
> This difference is due to that commit a94e1428991f ("target/i386: Add
> CPUID.1F generation support for multi-dies PCMachine") misunderstood
> that CPUState.nr_cores means "cores per die" when caculated
> CPUID.1FH.01H:EBX. After that, the changes in i386 all followed this
> wrong understanding.
> 
> With the influence of 003f230e37d7 and a94e1428991f, for i386 currently
> the result of CPUState.nr_cores is "cores per die", thus the original
> uses of CPUState.cores based on the meaning of "cores per package" are
> wrong when mutiple dies exist:
> 1. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.01H:EBX[bits 23:16] is
>    incorrect because it expects "cpus per package" but now the
>    result is "cpus per die".
> 2. In cpu_x86_cpuid() of target/i386/cpu.c, for all leaves of CPUID.04H:
>    EAX[bits 31:26] is incorrect because they expect "cpus per package"
>    but now the result is "cpus per die". The error not only impacts the
>    EAX caculation in cache_info_passthrough case, but also impacts other
>    cases of setting cache topology for Intel CPU according to cpu
>    topology (specifically, the incoming parameter "num_cores" expects
>    "cores per package" in encode_cache_cpuid4()).
> 3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.0BH.01H:EBX[bits
>    15:00] is incorrect because the EBX of 0BH.01H (core level) expects
>    "cpus per package", which may be different with 1FH.01H (The reason
>    is 1FH can support more levels. For QEMU, 1FH also supports die,
>    1FH.01H:EBX[bits 15:00] expects "cpus per die").
> 4. In cpu_x86_cpuid() of target/i386/cpu.c, when CPUID.80000001H is
>    caculated, here "cpus per package" is expected to be checked, but in
>    fact, now it checks "cpus per die". Though "cpus per die" also works
>    for this code logic, this isn't consistent with AMD's APM.
> 5. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.80000008H:ECX expects
>    "cpus per package" but it obtains "cpus per die".
> 6. In simulate_rdmsr() of target/i386/hvf/x86_emu.c, in
>    kvm_rdmsr_core_thread_count() of target/i386/kvm/kvm.c, and in
>    helper_rdmsr() of target/i386/tcg/sysemu/misc_helper.c,
>    MSR_CORE_THREAD_COUNT expects "cpus per package" and "cores per
>    package", but in these functions, it obtains "cpus per die" and
>    "cores per die".
> 
> On the other hand, these uses are correct now (they are added in/after
> a94e1428991f):
> 1. In cpu_x86_cpuid() of target/i386/cpu.c, topo_info.cores_per_die
>    meets the actual meaning of CPUState.nr_cores ("cores per die").
> 2. In cpu_x86_cpuid() of target/i386/cpu.c, vcpus_per_socket (in CPUID.
>    04H's caculation) considers number of dies, so it's correct.
> 3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.1FH.01H:EBX[bits
>    15:00] needs "cpus per die" and it gets the correct result, and
>    CPUID.1FH.02H:EBX[bits 15:00] gets correct "cpus per package".
> 
> When CPUState.nr_cores is correctly changed to "cores per package" again
> , the above errors will be fixed without extra work, but the "currently"
> correct cases will go wrong and need special handling to pass correct
> "cpus/cores per die" they want.
> 
> Thus in this patch, we fix CPUState.nr_cores' caculation to fit the

s/Thus in this patch, we fix CPUState.nr_cores' caculation/Fix
CPUState.nr_cores' calculation/


Describe your changes in the imperative mood; also run a spell check.
Thanks
Babu


> original meaning "cores per package", as well as changing calculation of
> topo_info.cores_per_die, vcpus_per_socket and CPUID.1FH.
> 
> In addition, in the nr_threads' comment, specify it represents the
> number of threads in the "core" to avoid confusion, and also add comment
> for nr_dies in CPUX86State.
> 
> Fixes: a94e1428991f ("target/i386: Add CPUID.1F generation support for multi-dies PCMachine")
> Fixes: 003f230e37d7 ("machine: Tweak the order of topology members in struct CpuTopology")
> Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
> Changes since v2:
>  * Use wrapped helper to get cores per socket in qemu_init_vcpu().
> Changes since v1:
>  * Add comment for nr_dies in CPUX86State. (Yanan)
> ---
>  include/hw/core/cpu.h | 2 +-
>  softmmu/cpus.c        | 2 +-
>  target/i386/cpu.c     | 9 ++++-----
>  target/i386/cpu.h     | 1 +
>  4 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> index fdcbe8735258..57f4d50ace72 100644
> --- a/include/hw/core/cpu.h
> +++ b/include/hw/core/cpu.h
> @@ -277,7 +277,7 @@ struct qemu_work_item;
>   *   See TranslationBlock::TCG CF_CLUSTER_MASK.
>   * @tcg_cflags: Pre-computed cflags for this cpu.
>   * @nr_cores: Number of cores within this CPU package.
> - * @nr_threads: Number of threads within this CPU.
> + * @nr_threads: Number of threads within this CPU core.
>   * @running: #true if CPU is currently running (lockless).
>   * @has_waiter: #true if a CPU is currently waiting for the cpu_exec_end;
>   * valid under cpu_list_lock.
> diff --git a/softmmu/cpus.c b/softmmu/cpus.c
> index fed20ffb5dd2..984558d7b245 100644
> --- a/softmmu/cpus.c
> +++ b/softmmu/cpus.c
> @@ -630,7 +630,7 @@ void qemu_init_vcpu(CPUState *cpu)
>  {
>      MachineState *ms = MACHINE(qdev_get_machine());
>  
> -    cpu->nr_cores = ms->smp.cores;
> +    cpu->nr_cores = machine_topo_get_cores_per_socket(ms);
>      cpu->nr_threads =  ms->smp.threads;
>      cpu->stopped = true;
>      cpu->random_seed = qemu_guest_random_seed_thread_part1();
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 97ad229d8ba3..50613cd04612 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -6011,7 +6011,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>      X86CPUTopoInfo topo_info;
>  
>      topo_info.dies_per_pkg = env->nr_dies;
> -    topo_info.cores_per_die = cs->nr_cores;
> +    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
>      topo_info.threads_per_core = cs->nr_threads;
>  
>      /* Calculate & apply limits for different index ranges */
> @@ -6087,8 +6087,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>               */
>              if (*eax & 31) {
>                  int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
> -                int vcpus_per_socket = env->nr_dies * cs->nr_cores *
> -                                       cs->nr_threads;
> +                int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
>                  if (cs->nr_cores > 1) {
>                      *eax &= ~0xFC000000;
>                      *eax |= (pow2ceil(cs->nr_cores) - 1) << 26;
> @@ -6266,12 +6265,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>              break;
>          case 1:
>              *eax = apicid_die_offset(&topo_info);
> -            *ebx = cs->nr_cores * cs->nr_threads;
> +            *ebx = topo_info.cores_per_die * topo_info.threads_per_core;
>              *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
>              break;
>          case 2:
>              *eax = apicid_pkg_offset(&topo_info);
> -            *ebx = env->nr_dies * cs->nr_cores * cs->nr_threads;
> +            *ebx = cs->nr_cores * cs->nr_threads;
>              *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
>              break;
>          default:
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index e0771a10433b..7638128d59cc 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1878,6 +1878,7 @@ typedef struct CPUArchState {
>  
>      TPRAccess tpr_access_type;
>  
> +    /* Number of dies within this CPU package. */
>      unsigned nr_dies;
>  } CPUX86State;
>  

-- 
Thanks
Babu Moger



* Re: [PATCH v3 05/17] i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4]
  2023-08-01 10:35 ` [PATCH v3 05/17] i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4] Zhao Liu
@ 2023-08-02 15:41   ` Moger, Babu
  2023-08-04  8:21     ` Zhao Liu
  2023-08-07  8:13   ` Xiaoyao Li
  1 sibling, 1 reply; 63+ messages in thread
From: Moger, Babu @ 2023-08-02 15:41 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Zhao Liu, Robert Hoo

Hi Zhao,

On 8/1/23 05:35, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
> 
> Refer to the fixes of cache_info_passthrough ([1], [2]) and SDM, the
> CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26] should use the
> nearest power-of-2 integer.
> 
> The nearest power-of-2 integer can be calculated by pow2ceil() or by
> using APIC ID offset (like L3 topology using 1 << die_offset [3]).
> 
> But in fact, CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26]
> are associated with APIC ID. For example, in linux kernel, the field
> "num_threads_sharing" (Bits 25 - 14) is parsed with APIC ID. And for
> another example, on Alder Lake P, the CPUID.04H:EAX[bits 31:26] is not
> matched with actual core numbers and it's calculated by:
> "(1 << (pkg_offset - core_offset)) - 1".
> 
> Therefore the offset of APIC ID should be preferred to calculate nearest
> power-of-2 integer for CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits
> 31:26]:
> 1. d/i cache is shared in a core, 1 << core_offset should be used
>    instead of "cs->nr_threads" in encode_cache_cpuid4() for
>    CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits 25:14].
> 2. L2 cache is supposed to be shared in a core as for now, thereby
>    1 << core_offset should also be used instead of "cs->nr_threads" in
>    encode_cache_cpuid4() for CPUID.04H.02H:EAX[bits 25:14].
> 3. Similarly, the value for CPUID.04H:EAX[bits 31:26] should also be
>    replaced by the offsets upper SMT level in APIC ID.
> 
> In addition, use APIC ID offset to replace "pow2ceil()" for
> cache_info_passthrough case.
> 
> [1]: efb3934adf9e ("x86: cpu: make sure number of addressable IDs for processor cores meets the spec")
> [2]: d7caf13b5fcf ("x86: cpu: fixup number of addressable IDs for logical processors sharing cache")
> [3]: d65af288a84d ("i386: Update new x86_apicid parsing rules with die_offset support")
> 
> Fixes: 7e3482f82480 ("i386: Helpers to encode cache information consistently")
> Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
> Changes since v1:
>  * Use APIC ID offset to replace "pow2ceil()" for cache_info_passthrough
>    case. (Yanan)
>  * Split the L1 cache fix into a separate patch.
>  * Rename the title of this patch (the original is "i386/cpu: Fix number
>    of addressable IDs in CPUID.04H").
> ---
>  target/i386/cpu.c | 30 +++++++++++++++++++++++-------
>  1 file changed, 23 insertions(+), 7 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index b439a05244ee..c80613bfcded 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -6005,7 +6005,6 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>  {
>      X86CPU *cpu = env_archcpu(env);
>      CPUState *cs = env_cpu(env);
> -    uint32_t die_offset;
>      uint32_t limit;
>      uint32_t signature[3];
>      X86CPUTopoInfo topo_info;
> @@ -6089,39 +6088,56 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>                  int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
>                  int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
>                  if (cs->nr_cores > 1) {
> +                    int addressable_cores_offset =
> +                                                apicid_pkg_offset(&topo_info) -
> +                                                apicid_core_offset(&topo_info);
> +
>                      *eax &= ~0xFC000000;
> -                    *eax |= (pow2ceil(cs->nr_cores) - 1) << 26;
> +                    *eax |= (1 << addressable_cores_offset - 1) << 26;
>                  }
>                  if (host_vcpus_per_cache > vcpus_per_socket) {
> +                    int pkg_offset = apicid_pkg_offset(&topo_info);
> +
>                      *eax &= ~0x3FFC000;
> -                    *eax |= (pow2ceil(vcpus_per_socket) - 1) << 14;
> +                    *eax |= (1 << pkg_offset - 1) << 14;
>                  }
>              }

I hit this compile error with this patch.

[1/18] Generating qemu-version.h with a custom command (wrapped by meson
to capture output)
[2/4] Compiling C object libqemu-x86_64-softmmu.fa.p/target_i386_cpu.c.o
FAILED: libqemu-x86_64-softmmu.fa.p/target_i386_cpu.c.o
..
..
softmmu.fa.p/target_i386_cpu.c.o -c ../target/i386/cpu.c
../target/i386/cpu.c: In function ‘cpu_x86_cpuid’:
../target/i386/cpu.c:6096:60: error: suggest parentheses around ‘-’ inside
‘<<’ [-Werror=parentheses]
 6096 |                     *eax |= (1 << addressable_cores_offset - 1) << 26;
      |                                   ~~~~~~~~~~~~~~~~~~~~~~~~~^~~
../target/i386/cpu.c:6102:46: error: suggest parentheses around ‘-’ inside
‘<<’ [-Werror=parentheses]
 6102 |                     *eax |= (1 << pkg_offset - 1) << 14;
      |                                   ~~~~~~~~~~~^~~
cc1: all warnings being treated as errors

Please fix this.
Thanks
Babu


>          } else if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
>              *eax = *ebx = *ecx = *edx = 0;
>          } else {
>              *eax = 0;
> +            int addressable_cores_offset = apicid_pkg_offset(&topo_info) -
> +                                           apicid_core_offset(&topo_info);
> +            int core_offset, die_offset;
> +
>              switch (count) {
>              case 0: /* L1 dcache info */
> +                core_offset = apicid_core_offset(&topo_info);
>                  encode_cache_cpuid4(env->cache_info_cpuid4.l1d_cache,
> -                                    cs->nr_threads, cs->nr_cores,
> +                                    (1 << core_offset),
> +                                    (1 << addressable_cores_offset),
>                                      eax, ebx, ecx, edx);
>                  break;
>              case 1: /* L1 icache info */
> +                core_offset = apicid_core_offset(&topo_info);
>                  encode_cache_cpuid4(env->cache_info_cpuid4.l1i_cache,
> -                                    cs->nr_threads, cs->nr_cores,
> +                                    (1 << core_offset),
> +                                    (1 << addressable_cores_offset),
>                                      eax, ebx, ecx, edx);
>                  break;
>              case 2: /* L2 cache info */
> +                core_offset = apicid_core_offset(&topo_info);
>                  encode_cache_cpuid4(env->cache_info_cpuid4.l2_cache,
> -                                    cs->nr_threads, cs->nr_cores,
> +                                    (1 << core_offset),
> +                                    (1 << addressable_cores_offset),
>                                      eax, ebx, ecx, edx);
>                  break;
>              case 3: /* L3 cache info */
>                  die_offset = apicid_die_offset(&topo_info);
>                  if (cpu->enable_l3_cache) {
>                      encode_cache_cpuid4(env->cache_info_cpuid4.l3_cache,
> -                                        (1 << die_offset), cs->nr_cores,
> +                                        (1 << die_offset),
> +                                        (1 << addressable_cores_offset),
>                                          eax, ebx, ecx, edx);
>                      break;
>                  }

-- 
Thanks
Babu Moger



* Re: [PATCH v3 06/17] i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
  2023-08-01 10:35 ` [PATCH v3 06/17] i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid() Zhao Liu
@ 2023-08-02 16:31   ` Moger, Babu
  2023-08-04  8:23     ` Zhao Liu
  0 siblings, 1 reply; 63+ messages in thread
From: Moger, Babu @ 2023-08-02 16:31 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Zhao Liu, Robert Hoo

Hi Zhao,

On 8/1/23 05:35, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
> 
> In cpu_x86_cpuid(), there are many variables in representing the cpu
> topology, e.g., topo_info, cs->nr_cores/cs->nr_threads.
> 
> Since the names of cs->nr_cores/cs->nr_threads does not accurately
> represent its meaning, the use of cs->nr_cores/cs->nr_threads is prone
> to confusion and mistakes.
> 
> And the structure X86CPUTopoInfo names its memebers clearly, thus the

s/memebers/members/
Thanks
Babu

> variable "topo_info" should be preferred.
> 
> In addition, in cpu_x86_cpuid(), to uniformly use the topology variable,
> replace env->dies with topo_info.dies_per_pkg as well.
> 
> Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
> Changes since v1:
>  * Extract cores_per_socket from the code block and use it as a local
>    variable for cpu_x86_cpuid(). (Yanan)
>  * Remove vcpus_per_socket variable and use cpus_per_pkg directly.
>    (Yanan)
>  * Replace env->dies with topo_info.dies_per_pkg in cpu_x86_cpuid().
> ---
>  target/i386/cpu.c | 31 ++++++++++++++++++-------------
>  1 file changed, 18 insertions(+), 13 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index c80613bfcded..fc50bf98c60e 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -6008,11 +6008,16 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>      uint32_t limit;
>      uint32_t signature[3];
>      X86CPUTopoInfo topo_info;
> +    uint32_t cores_per_pkg;
> +    uint32_t cpus_per_pkg;
>  
>      topo_info.dies_per_pkg = env->nr_dies;
>      topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
>      topo_info.threads_per_core = cs->nr_threads;
>  
> +    cores_per_pkg = topo_info.cores_per_die * topo_info.dies_per_pkg;
> +    cpus_per_pkg = cores_per_pkg * topo_info.threads_per_core;
> +
>      /* Calculate & apply limits for different index ranges */
>      if (index >= 0xC0000000) {
>          limit = env->cpuid_xlevel2;
> @@ -6048,8 +6053,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>              *ecx |= CPUID_EXT_OSXSAVE;
>          }
>          *edx = env->features[FEAT_1_EDX];
> -        if (cs->nr_cores * cs->nr_threads > 1) {
> -            *ebx |= (cs->nr_cores * cs->nr_threads) << 16;
> +        if (cpus_per_pkg > 1) {
> +            *ebx |= cpus_per_pkg << 16;
>              *edx |= CPUID_HT;
>          }
>          if (!cpu->enable_pmu) {
> @@ -6086,8 +6091,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>               */
>              if (*eax & 31) {
>                  int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
> -                int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
> -                if (cs->nr_cores > 1) {
> +
> +                if (cores_per_pkg > 1) {
>                      int addressable_cores_offset =
>                                                  apicid_pkg_offset(&topo_info) -
>                                                  apicid_core_offset(&topo_info);
> @@ -6095,7 +6100,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>                      *eax &= ~0xFC000000;
>                      *eax |= (1 << addressable_cores_offset - 1) << 26;
>                  }
> -                if (host_vcpus_per_cache > vcpus_per_socket) {
> +                if (host_vcpus_per_cache > cpus_per_pkg) {
>                      int pkg_offset = apicid_pkg_offset(&topo_info);
>  
>                      *eax &= ~0x3FFC000;
> @@ -6240,12 +6245,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>          switch (count) {
>          case 0:
>              *eax = apicid_core_offset(&topo_info);
> -            *ebx = cs->nr_threads;
> +            *ebx = topo_info.threads_per_core;
>              *ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
>              break;
>          case 1:
>              *eax = apicid_pkg_offset(&topo_info);
> -            *ebx = cs->nr_cores * cs->nr_threads;
> +            *ebx = cpus_per_pkg;
>              *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
>              break;
>          default:
> @@ -6266,7 +6271,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>          break;
>      case 0x1F:
>          /* V2 Extended Topology Enumeration Leaf */
> -        if (env->nr_dies < 2) {
> +        if (topo_info.dies_per_pkg < 2) {
>              *eax = *ebx = *ecx = *edx = 0;
>              break;
>          }
> @@ -6276,7 +6281,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>          switch (count) {
>          case 0:
>              *eax = apicid_core_offset(&topo_info);
> -            *ebx = cs->nr_threads;
> +            *ebx = topo_info.threads_per_core;
>              *ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
>              break;
>          case 1:
> @@ -6286,7 +6291,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>              break;
>          case 2:
>              *eax = apicid_pkg_offset(&topo_info);
> -            *ebx = cs->nr_cores * cs->nr_threads;
> +            *ebx = cpus_per_pkg;
>              *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
>              break;
>          default:
> @@ -6511,7 +6516,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>           * discards multiple thread information if it is set.
>           * So don't set it here for Intel to make Linux guests happy.
>           */
> -        if (cs->nr_cores * cs->nr_threads > 1) {
> +        if (cpus_per_pkg > 1) {
>              if (env->cpuid_vendor1 != CPUID_VENDOR_INTEL_1 ||
>                  env->cpuid_vendor2 != CPUID_VENDOR_INTEL_2 ||
>                  env->cpuid_vendor3 != CPUID_VENDOR_INTEL_3) {
> @@ -6577,7 +6582,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>               *eax |= (cpu_x86_virtual_addr_width(env) << 8);
>          }
>          *ebx = env->features[FEAT_8000_0008_EBX];
> -        if (cs->nr_cores * cs->nr_threads > 1) {
> +        if (cpus_per_pkg > 1) {
>              /*
>               * Bits 15:12 is "The number of bits in the initial
>               * Core::X86::Apic::ApicId[ApicId] value that indicate
> @@ -6585,7 +6590,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>               * Bits 7:0 is "The number of threads in the package is NC+1"
>               */
>              *ecx = (apicid_pkg_offset(&topo_info) << 12) |
> -                   ((cs->nr_cores * cs->nr_threads) - 1);
> +                   (cpus_per_pkg - 1);
>          } else {
>              *ecx = 0;
>          }

-- 
Thanks
Babu Moger



* Re: [PATCH v3 08/17] i386: Support modules_per_die in X86CPUTopoInfo
  2023-08-01 10:35 ` [PATCH v3 08/17] i386: Support modules_per_die in X86CPUTopoInfo Zhao Liu
@ 2023-08-02 17:25   ` Moger, Babu
  2023-08-04  9:05     ` Zhao Liu
  0 siblings, 1 reply; 63+ messages in thread
From: Moger, Babu @ 2023-08-02 17:25 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Zhao Liu, Zhuocheng Ding

Hi Zhao,

On 8/1/23 05:35, Zhao Liu wrote:
> From: Zhuocheng Ding <zhuocheng.ding@intel.com>
> 
> Support module level in i386 cpu topology structure "X86CPUTopoInfo".
> 
> Since x86 does not yet support the "clusters" parameter in "-smp",
> X86CPUTopoInfo.modules_per_die is currently always 1. Therefore, the
> module level width in APIC ID, which can be calculated by
> "apicid_bitwidth_for_count(topo_info->modules_per_die)", is always 0
> for now, so we can directly add APIC ID related helpers to support
> module level parsing.
> 
> At present, we don't expose module level in CPUID.1FH because currently
> linux (v6.4-rc1) doesn't support module level. And exposing module and
> die levels at the same time in CPUID.1FH will cause linux to calculate
> the wrong die_id. The module level should not be exposed until the real
> machine has the module level in CPUID.1FH.
> 
> In addition, update topology structure in test-x86-topo.c.
> 
> Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> Changes since v1:
>  * Include module level related helpers (apicid_module_width() and
>    apicid_module_offset()) in this patch. (Yanan)
> ---
>  hw/i386/x86.c              |  3 ++-
>  include/hw/i386/topology.h | 22 +++++++++++++++----
>  target/i386/cpu.c          | 12 ++++++----
>  tests/unit/test-x86-topo.c | 45 ++++++++++++++++++++------------------
>  4 files changed, 52 insertions(+), 30 deletions(-)
> 
> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> index 4efc390905ff..a552ae8bb4a8 100644
> --- a/hw/i386/x86.c
> +++ b/hw/i386/x86.c
> @@ -72,7 +72,8 @@ static void init_topo_info(X86CPUTopoInfo *topo_info,
>      MachineState *ms = MACHINE(x86ms);
>  
>      topo_info->dies_per_pkg = ms->smp.dies;
> -    topo_info->cores_per_die = ms->smp.cores;
> +    topo_info->modules_per_die = ms->smp.clusters;

It is confusing: in the previous patch you said that using "clusters" for
x86 is going to cause compatibility issues, so why is clusters used here
to initialize modules_per_die?

Why not define a new field "modules" (just like clusters) in smp and use
it for x86? Is that going to be a problem?
Maybe I am not clear here; I am yet to understand all the other changes.

Thanks
Babu

> +    topo_info->cores_per_module = ms->smp.cores;
>      topo_info->threads_per_core = ms->smp.threads;
>  }
>  
> diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> index 5a19679f618b..c807d3811dd3 100644
> --- a/include/hw/i386/topology.h
> +++ b/include/hw/i386/topology.h
> @@ -56,7 +56,8 @@ typedef struct X86CPUTopoIDs {
>  
>  typedef struct X86CPUTopoInfo {
>      unsigned dies_per_pkg;
> -    unsigned cores_per_die;
> +    unsigned modules_per_die;
> +    unsigned cores_per_module;
>      unsigned threads_per_core;
>  } X86CPUTopoInfo;
>  
> @@ -77,7 +78,13 @@ static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
>  /* Bit width of the Core_ID field */
>  static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
>  {
> -    return apicid_bitwidth_for_count(topo_info->cores_per_die);
> +    return apicid_bitwidth_for_count(topo_info->cores_per_module);
> +}
> +
> +/* Bit width of the Module_ID (cluster ID) field */
> +static inline unsigned apicid_module_width(X86CPUTopoInfo *topo_info)
> +{
> +    return apicid_bitwidth_for_count(topo_info->modules_per_die);
>  }
>  
>  /* Bit width of the Die_ID field */
> @@ -92,10 +99,16 @@ static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
>      return apicid_smt_width(topo_info);
>  }
>  
> +/* Bit offset of the Module_ID (cluster ID) field */
> +static inline unsigned apicid_module_offset(X86CPUTopoInfo *topo_info)
> +{
> +    return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
> +}
> +
>  /* Bit offset of the Die_ID field */
>  static inline unsigned apicid_die_offset(X86CPUTopoInfo *topo_info)
>  {
> -    return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
> +    return apicid_module_offset(topo_info) + apicid_module_width(topo_info);
>  }
>  
>  /* Bit offset of the Pkg_ID (socket ID) field */
> @@ -127,7 +140,8 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
>                                           X86CPUTopoIDs *topo_ids)
>  {
>      unsigned nr_dies = topo_info->dies_per_pkg;
> -    unsigned nr_cores = topo_info->cores_per_die;
> +    unsigned nr_cores = topo_info->cores_per_module *
> +                        topo_info->modules_per_die;
>      unsigned nr_threads = topo_info->threads_per_core;
>  
>      topo_ids->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 8a9fd5682efc..d6969813ee02 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -339,7 +339,9 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
>  
>      /* L3 is shared among multiple cores */
>      if (cache->level == 3) {
> -        l3_threads = topo_info->cores_per_die * topo_info->threads_per_core;
> +        l3_threads = topo_info->modules_per_die *
> +                     topo_info->cores_per_module *
> +                     topo_info->threads_per_core;
>          *eax |= (l3_threads - 1) << 14;
>      } else {
>          *eax |= ((topo_info->threads_per_core - 1) << 14);
> @@ -6012,10 +6014,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>      uint32_t cpus_per_pkg;
>  
>      topo_info.dies_per_pkg = env->nr_dies;
> -    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
> +    topo_info.modules_per_die = env->nr_modules;
> +    topo_info.cores_per_module = cs->nr_cores / env->nr_dies / env->nr_modules;
>      topo_info.threads_per_core = cs->nr_threads;
>  
> -    cores_per_pkg = topo_info.cores_per_die * topo_info.dies_per_pkg;
> +    cores_per_pkg = topo_info.cores_per_module * topo_info.modules_per_die *
> +                    topo_info.dies_per_pkg;
>      cpus_per_pkg = cores_per_pkg * topo_info.threads_per_core;
>  
>      /* Calculate & apply limits for different index ranges */
> @@ -6286,7 +6290,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>              break;
>          case 1:
>              *eax = apicid_die_offset(&topo_info);
> -            *ebx = topo_info.cores_per_die * topo_info.threads_per_core;
> +            *ebx = cpus_per_pkg / topo_info.dies_per_pkg;
>              *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
>              break;
>          case 2:
> diff --git a/tests/unit/test-x86-topo.c b/tests/unit/test-x86-topo.c
> index 2b104f86d7c2..f21b8a5d95c2 100644
> --- a/tests/unit/test-x86-topo.c
> +++ b/tests/unit/test-x86-topo.c
> @@ -30,13 +30,16 @@ static void test_topo_bits(void)
>  {
>      X86CPUTopoInfo topo_info = {0};
>  
> -    /* simple tests for 1 thread per core, 1 core per die, 1 die per package */
> -    topo_info = (X86CPUTopoInfo) {1, 1, 1};
> +    /*
> +     * simple tests for 1 thread per core, 1 core per module,
> +     *                  1 module per die, 1 die per package
> +     */
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 1};
>      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 0);
>      g_assert_cmpuint(apicid_core_width(&topo_info), ==, 0);
>      g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
>  
> -    topo_info = (X86CPUTopoInfo) {1, 1, 1};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 1};
>      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
>      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
>      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
> @@ -45,39 +48,39 @@ static void test_topo_bits(void)
>  
>      /* Test field width calculation for multiple values
>       */
> -    topo_info = (X86CPUTopoInfo) {1, 1, 2};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 2};
>      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 1);
> -    topo_info = (X86CPUTopoInfo) {1, 1, 3};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 3};
>      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
> -    topo_info = (X86CPUTopoInfo) {1, 1, 4};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 4};
>      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
>  
> -    topo_info = (X86CPUTopoInfo) {1, 1, 14};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 14};
>      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
> -    topo_info = (X86CPUTopoInfo) {1, 1, 15};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 15};
>      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
> -    topo_info = (X86CPUTopoInfo) {1, 1, 16};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 16};
>      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
> -    topo_info = (X86CPUTopoInfo) {1, 1, 17};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 17};
>      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 5);
>  
>  
> -    topo_info = (X86CPUTopoInfo) {1, 30, 2};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 30, 2};
>      g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
> -    topo_info = (X86CPUTopoInfo) {1, 31, 2};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 31, 2};
>      g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
> -    topo_info = (X86CPUTopoInfo) {1, 32, 2};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 32, 2};
>      g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
> -    topo_info = (X86CPUTopoInfo) {1, 33, 2};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 33, 2};
>      g_assert_cmpuint(apicid_core_width(&topo_info), ==, 6);
>  
> -    topo_info = (X86CPUTopoInfo) {1, 30, 2};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 30, 2};
>      g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
> -    topo_info = (X86CPUTopoInfo) {2, 30, 2};
> +    topo_info = (X86CPUTopoInfo) {2, 1, 30, 2};
>      g_assert_cmpuint(apicid_die_width(&topo_info), ==, 1);
> -    topo_info = (X86CPUTopoInfo) {3, 30, 2};
> +    topo_info = (X86CPUTopoInfo) {3, 1, 30, 2};
>      g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
> -    topo_info = (X86CPUTopoInfo) {4, 30, 2};
> +    topo_info = (X86CPUTopoInfo) {4, 1, 30, 2};
>      g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
>  
>      /* build a weird topology and see if IDs are calculated correctly
> @@ -85,18 +88,18 @@ static void test_topo_bits(void)
>  
>      /* This will use 2 bits for thread ID and 3 bits for core ID
>       */
> -    topo_info = (X86CPUTopoInfo) {1, 6, 3};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
>      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
>      g_assert_cmpuint(apicid_core_offset(&topo_info), ==, 2);
>      g_assert_cmpuint(apicid_die_offset(&topo_info), ==, 5);
>      g_assert_cmpuint(apicid_pkg_offset(&topo_info), ==, 5);
>  
> -    topo_info = (X86CPUTopoInfo) {1, 6, 3};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
>      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
>      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
>      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
>  
> -    topo_info = (X86CPUTopoInfo) {1, 6, 3};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
>      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 0), ==,
>                       (1 << 2) | 0);
>      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 1), ==,

-- 
Thanks
Babu Moger


^ permalink raw reply	[flat|nested] 63+ messages in thread
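[Editorial note: for readers following the bit arithmetic in the quoted
topology.h hunks above, here is a minimal sketch — in Python, not QEMU
code — of how the APIC ID field offsets stack once the module (cluster)
level is inserted between core and die. `apicid_bitwidth_for_count(n)`
in QEMU is the number of bits needed to encode n IDs, i.e. ceil(log2(n)).]

```python
def bitwidth_for_count(count):
    """Bits needed to encode `count` distinct IDs (0 bits when count == 1)."""
    return max(count - 1, 0).bit_length()

def apic_id_offsets(dies, modules, cores, threads):
    """Return (smt, core, module, die, pkg) bit offsets, mirroring the
    apicid_*_offset() helpers in include/hw/i386/topology.h after the
    module level is added: each field starts where the previous ends."""
    smt_off = 0
    core_off = smt_off + bitwidth_for_count(threads)
    module_off = core_off + bitwidth_for_count(cores)
    die_off = module_off + bitwidth_for_count(modules)
    pkg_off = die_off + bitwidth_for_count(dies)
    return smt_off, core_off, module_off, die_off, pkg_off

# With one module per die (the current default, since x86 does not yet
# accept smp.clusters), the module width is 0, so die_off == module_off
# and the legacy APIC ID layout is unchanged — matching the "weird
# topology" case in test-x86-topo.c: {1, 1, 6, 3} gives smt width 2,
# core offset 2, die offset 5, pkg offset 5.
print(apic_id_offsets(dies=1, modules=1, cores=6, threads=3))  # (0, 2, 5, 5, 5)
```

This is only an illustration of why the patch can add the module helpers
without breaking existing guests: a count of 1 contributes zero bits.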

* Re: [PATCH v3 10/17] i386/cpu: Introduce cluster-id to X86CPU
  2023-08-01 10:35 ` [PATCH v3 10/17] i386/cpu: Introduce cluster-id to X86CPU Zhao Liu
@ 2023-08-02 22:44   ` Moger, Babu
  2023-08-04  9:06     ` Zhao Liu
  0 siblings, 1 reply; 63+ messages in thread
From: Moger, Babu @ 2023-08-02 22:44 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Zhao Liu, Zhuocheng Ding

Hi Zhao,

On 8/1/23 05:35, Zhao Liu wrote:
> From: Zhuocheng Ding <zhuocheng.ding@intel.com>
> 
> We introduce cluster-id other than module-id to be consistent with

s/We introduce/Introduce/

Thanks
Babu

> CpuInstanceProperties.cluster-id, and this avoids the confusion
> of parameter names when hotplugging.
> 
> Following the legacy smp check rules, also add the cluster_id validity
> into x86_cpu_pre_plug().
> 
> Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>  hw/i386/x86.c     | 33 +++++++++++++++++++++++++--------
>  target/i386/cpu.c |  2 ++
>  target/i386/cpu.h |  1 +
>  3 files changed, 28 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> index 0b460fd6074d..8154b86f95c7 100644
> --- a/hw/i386/x86.c
> +++ b/hw/i386/x86.c
> @@ -328,6 +328,14 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
>              cpu->die_id = 0;
>          }
>  
> +        /*
> +         * cluster-id was optional in QEMU 8.0 and older, so keep it optional
> +         * if there's only one cluster per die.
> +         */
> +        if (cpu->cluster_id < 0 && ms->smp.clusters == 1) {
> +            cpu->cluster_id = 0;
> +        }
> +
>          if (cpu->socket_id < 0) {
>              error_setg(errp, "CPU socket-id is not set");
>              return;
> @@ -344,6 +352,14 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
>                         cpu->die_id, ms->smp.dies - 1);
>              return;
>          }
> +        if (cpu->cluster_id < 0) {
> +            error_setg(errp, "CPU cluster-id is not set");
> +            return;
> +        } else if (cpu->cluster_id > ms->smp.clusters - 1) {
> +            error_setg(errp, "Invalid CPU cluster-id: %u must be in range 0:%u",
> +                       cpu->cluster_id, ms->smp.clusters - 1);
> +            return;
> +        }
>          if (cpu->core_id < 0) {
>              error_setg(errp, "CPU core-id is not set");
>              return;
> @@ -363,16 +379,9 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
>  
>          topo_ids.pkg_id = cpu->socket_id;
>          topo_ids.die_id = cpu->die_id;
> +        topo_ids.module_id = cpu->cluster_id;
>          topo_ids.core_id = cpu->core_id;
>          topo_ids.smt_id = cpu->thread_id;
> -
> -        /*
> -         * TODO: This is the temporary initialization for topo_ids.module_id to
> -         * avoid "maybe-uninitialized" compilation errors. Will remove when
> -         * X86CPU supports cluster_id.
> -         */
> -        topo_ids.module_id = 0;
> -
>          cpu->apic_id = x86_apicid_from_topo_ids(&topo_info, &topo_ids);
>      }
>  
> @@ -419,6 +428,14 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
>      }
>      cpu->die_id = topo_ids.die_id;
>  
> +    if (cpu->cluster_id != -1 && cpu->cluster_id != topo_ids.module_id) {
> +        error_setg(errp, "property cluster-id: %u doesn't match set apic-id:"
> +            " 0x%x (cluster-id: %u)", cpu->cluster_id, cpu->apic_id,
> +            topo_ids.module_id);
> +        return;
> +    }
> +    cpu->cluster_id = topo_ids.module_id;
> +
>      if (cpu->core_id != -1 && cpu->core_id != topo_ids.core_id) {
>          error_setg(errp, "property core-id: %u doesn't match set apic-id:"
>              " 0x%x (core-id: %u)", cpu->core_id, cpu->apic_id,
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index d6969813ee02..ffa282219078 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -7806,12 +7806,14 @@ static Property x86_cpu_properties[] = {
>      DEFINE_PROP_UINT32("apic-id", X86CPU, apic_id, 0),
>      DEFINE_PROP_INT32("thread-id", X86CPU, thread_id, 0),
>      DEFINE_PROP_INT32("core-id", X86CPU, core_id, 0),
> +    DEFINE_PROP_INT32("cluster-id", X86CPU, cluster_id, 0),
>      DEFINE_PROP_INT32("die-id", X86CPU, die_id, 0),
>      DEFINE_PROP_INT32("socket-id", X86CPU, socket_id, 0),
>  #else
>      DEFINE_PROP_UINT32("apic-id", X86CPU, apic_id, UNASSIGNED_APIC_ID),
>      DEFINE_PROP_INT32("thread-id", X86CPU, thread_id, -1),
>      DEFINE_PROP_INT32("core-id", X86CPU, core_id, -1),
> +    DEFINE_PROP_INT32("cluster-id", X86CPU, cluster_id, -1),
>      DEFINE_PROP_INT32("die-id", X86CPU, die_id, -1),
>      DEFINE_PROP_INT32("socket-id", X86CPU, socket_id, -1),
>  #endif
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index 5e97d0b76574..d9577938ae04 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -2034,6 +2034,7 @@ struct ArchCPU {
>      int32_t node_id; /* NUMA node this CPU belongs to */
>      int32_t socket_id;
>      int32_t die_id;
> +    int32_t cluster_id;
>      int32_t core_id;
>      int32_t thread_id;
>  

-- 
Thanks
Babu Moger


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode CPUID[4]
  2023-08-01 10:35 ` [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode CPUID[4] Zhao Liu
@ 2023-08-02 23:49   ` Moger, Babu
  2023-08-03 16:41     ` Moger, Babu
  0 siblings, 1 reply; 63+ messages in thread
From: Moger, Babu @ 2023-08-02 23:49 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Zhao Liu

Hi Zhao,

Hitting this error after this patch.

ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code should
not be reached
Bail out! ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code
should not be reached
Aborted (core dumped)

Looks like share_level for all the caches for AMD is not initialized.

Thanks
Babu

On 8/1/23 05:35, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
> 
> CPUID[4].EAX[bits 25:14] is used to represent the cache topology for
> intel CPUs.
> 
> After cache models have topology information, we can use
> CPUCacheInfo.share_level to decide which topology level to be encoded
> into CPUID[4].EAX[bits 25:14].
> 
> And since maximum_processor_id (original "num_apic_ids") is parsed
> based on cpu topology levels, which are verified when parsing smp, it's
> no need to check this value by "assert(num_apic_ids > 0)" again, so
> remove this assert.
> 
> Additionally, wrap the encoding of CPUID[4].EAX[bits 31:26] into a
> helper to make the code cleaner.
> 
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
> Changes since v1:
>  * Use "enum CPUTopoLevel share_level" as the parameter in
>    max_processor_ids_for_cache().
>  * Make cache_into_passthrough case also use
>    max_processor_ids_for_cache() and max_core_ids_in_package() to
>    encode CPUID[4]. (Yanan)
>  * Rename the title of this patch (the original is "i386: Use
>    CPUCacheInfo.share_level to encode CPUID[4].EAX[bits 25:14]").
> ---
>  target/i386/cpu.c | 70 +++++++++++++++++++++++++++++------------------
>  1 file changed, 43 insertions(+), 27 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 55aba4889628..c9897c0fe91a 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -234,22 +234,53 @@ static uint8_t cpuid2_cache_descriptor(CPUCacheInfo *cache)
>                         ((t) == UNIFIED_CACHE) ? CACHE_TYPE_UNIFIED : \
>                         0 /* Invalid value */)
>  
> +static uint32_t max_processor_ids_for_cache(X86CPUTopoInfo *topo_info,
> +                                            enum CPUTopoLevel share_level)
> +{
> +    uint32_t num_ids = 0;
> +
> +    switch (share_level) {
> +    case CPU_TOPO_LEVEL_CORE:
> +        num_ids = 1 << apicid_core_offset(topo_info);
> +        break;
> +    case CPU_TOPO_LEVEL_DIE:
> +        num_ids = 1 << apicid_die_offset(topo_info);
> +        break;
> +    case CPU_TOPO_LEVEL_PACKAGE:
> +        num_ids = 1 << apicid_pkg_offset(topo_info);
> +        break;
> +    default:
> +        /*
> +         * Currently there is no use case for SMT and MODULE, so use
> +         * assert directly to facilitate debugging.
> +         */
> +        g_assert_not_reached();
> +    }
> +
> +    return num_ids - 1;
> +}
> +
> +static uint32_t max_core_ids_in_package(X86CPUTopoInfo *topo_info)
> +{
> +    uint32_t num_cores = 1 << (apicid_pkg_offset(topo_info) -
> +                               apicid_core_offset(topo_info));
> +    return num_cores - 1;
> +}
>  
>  /* Encode cache info for CPUID[4] */
>  static void encode_cache_cpuid4(CPUCacheInfo *cache,
> -                                int num_apic_ids, int num_cores,
> +                                X86CPUTopoInfo *topo_info,
>                                  uint32_t *eax, uint32_t *ebx,
>                                  uint32_t *ecx, uint32_t *edx)
>  {
>      assert(cache->size == cache->line_size * cache->associativity *
>                            cache->partitions * cache->sets);
>  
> -    assert(num_apic_ids > 0);
>      *eax = CACHE_TYPE(cache->type) |
>             CACHE_LEVEL(cache->level) |
>             (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0) |
> -           ((num_cores - 1) << 26) |
> -           ((num_apic_ids - 1) << 14);
> +           (max_core_ids_in_package(topo_info) << 26) |
> +           (max_processor_ids_for_cache(topo_info, cache->share_level) << 14);
>  
>      assert(cache->line_size > 0);
>      assert(cache->partitions > 0);
> @@ -6116,56 +6147,41 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>                  int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
>  
>                  if (cores_per_pkg > 1) {
> -                    int addressable_cores_offset =
> -                                                apicid_pkg_offset(&topo_info) -
> -                                                apicid_core_offset(&topo_info);
> -
>                      *eax &= ~0xFC000000;
> -                    *eax |= (1 << addressable_cores_offset - 1) << 26;
> +                    *eax |= max_core_ids_in_package(&topo_info) << 26;
>                  }
>                  if (host_vcpus_per_cache > cpus_per_pkg) {
> -                    int pkg_offset = apicid_pkg_offset(&topo_info);
> -
>                      *eax &= ~0x3FFC000;
> -                    *eax |= (1 << pkg_offset - 1) << 14;
> +                    *eax |=
> +                        max_processor_ids_for_cache(&topo_info,
> +                                                CPU_TOPO_LEVEL_PACKAGE) << 14;
>                  }
>              }
>          } else if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
>              *eax = *ebx = *ecx = *edx = 0;
>          } else {
>              *eax = 0;
> -            int addressable_cores_offset = apicid_pkg_offset(&topo_info) -
> -                                           apicid_core_offset(&topo_info);
> -            int core_offset, die_offset;
>  
>              switch (count) {
>              case 0: /* L1 dcache info */
> -                core_offset = apicid_core_offset(&topo_info);
>                  encode_cache_cpuid4(env->cache_info_cpuid4.l1d_cache,
> -                                    (1 << core_offset),
> -                                    (1 << addressable_cores_offset),
> +                                    &topo_info,
>                                      eax, ebx, ecx, edx);
>                  break;
>              case 1: /* L1 icache info */
> -                core_offset = apicid_core_offset(&topo_info);
>                  encode_cache_cpuid4(env->cache_info_cpuid4.l1i_cache,
> -                                    (1 << core_offset),
> -                                    (1 << addressable_cores_offset),
> +                                    &topo_info,
>                                      eax, ebx, ecx, edx);
>                  break;
>              case 2: /* L2 cache info */
> -                core_offset = apicid_core_offset(&topo_info);
>                  encode_cache_cpuid4(env->cache_info_cpuid4.l2_cache,
> -                                    (1 << core_offset),
> -                                    (1 << addressable_cores_offset),
> +                                    &topo_info,
>                                      eax, ebx, ecx, edx);
>                  break;
>              case 3: /* L3 cache info */
> -                die_offset = apicid_die_offset(&topo_info);
>                  if (cpu->enable_l3_cache) {
>                      encode_cache_cpuid4(env->cache_info_cpuid4.l3_cache,
> -                                        (1 << die_offset),
> -                                        (1 << addressable_cores_offset),
> +                                        &topo_info,
>                                          eax, ebx, ecx, edx);
>                      break;
>                  }

-- 
Thanks
Babu Moger


^ permalink raw reply	[flat|nested] 63+ messages in thread
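[Editorial note: a hypothetical sketch — not the QEMU source — of the
CPUID[4].EAX topology encoding that patch 14 moves behind
CPUCacheInfo.share_level: bits 25:14 hold (max APIC IDs sharing the
cache) - 1, selected by the cache's share level, and bits 31:26 hold
(max core IDs in the package) - 1. The offsets below are example values
for one sample topology, not derived from real hardware.]

```python
# Example APIC ID bit offsets (assumed values for illustration):
# 2 bits of thread ID below the core field, a 3-bit core field,
# and a single die per package, so die and pkg offsets coincide.
SHARE_LEVEL_OFFSET = {"core": 2, "die": 5, "pkg": 5}

def max_processor_ids_for_cache(share_level):
    """Mirror of the patch's helper: the number of addressable APIC IDs
    below `share_level`, minus one, as CPUID[4].EAX[25:14] expects."""
    return (1 << SHARE_LEVEL_OFFSET[share_level]) - 1

def encode_cpuid4_eax_topo(share_level):
    """Compose only the topology bits of CPUID[4].EAX (31:26 and 25:14)."""
    core_off, pkg_off = SHARE_LEVEL_OFFSET["core"], SHARE_LEVEL_OFFSET["pkg"]
    max_core_ids = (1 << (pkg_off - core_off)) - 1   # max_core_ids_in_package()
    return (max_core_ids << 26) | (max_processor_ids_for_cache(share_level) << 14)

# An L2 shared at core level reports 3 (4 logical CPUs - 1) in bits 25:14;
# an L3 shared at die level reports 31 there instead, with the same
# package-wide core count (7) in bits 31:26 either way.
print(hex(encode_cpuid4_eax_topo("core")), hex(encode_cpuid4_eax_topo("die")))
```

The crash Babu reports falls out of the same structure: any cache model
whose share_level is left at the zero/invalid value hits the `default:`
arm of the switch, i.e. `g_assert_not_reached()`, which is why his fix
adds an explicit `.share_level` to every AMD cache definition.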

* Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode CPUID[4]
  2023-08-02 23:49   ` Moger, Babu
@ 2023-08-03 16:41     ` Moger, Babu
  2023-08-04  9:48       ` Zhao Liu
  0 siblings, 1 reply; 63+ messages in thread
From: Moger, Babu @ 2023-08-03 16:41 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Zhao Liu

Hi Zhao,

On 8/2/23 18:49, Moger, Babu wrote:
> Hi Zhao,
> 
> Hitting this error after this patch.
> 
> ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code should
> not be reached
> Bail out! ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code
> should not be reached
> Aborted (core dumped)
> 
> Looks like share_level for all the caches for AMD is not initialized.

The following patch fixes the problem.

======================================================


diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index f4c48e19fa..976a2755d8 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -528,6 +528,7 @@ static CPUCacheInfo legacy_l2_cache_cpuid2 = {
     .size = 2 * MiB,
     .line_size = 64,
     .associativity = 8,
+    .share_level = CPU_TOPO_LEVEL_CORE,
 };


@@ -1904,6 +1905,7 @@ static CPUCaches epyc_v4_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -1916,6 +1918,7 @@ static CPUCaches epyc_v4_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -1926,6 +1929,7 @@ static CPUCaches epyc_v4_cache_info = {
         .partitions = 1,
         .sets = 1024,
         .lines_per_tag = 1,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -1939,6 +1943,7 @@ static CPUCaches epyc_v4_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = false,
+        .share_level = CPU_TOPO_LEVEL_DIE,
     },
 };

@@ -2008,6 +2013,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -2020,6 +2026,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2030,6 +2037,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
         .partitions = 1,
         .sets = 1024,
         .lines_per_tag = 1,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2043,6 +2051,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = false,
+        .share_level = CPU_TOPO_LEVEL_DIE,
     },
 };

@@ -2112,6 +2121,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -2124,6 +2134,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2134,6 +2145,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
         .partitions = 1,
         .sets = 1024,
         .lines_per_tag = 1,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2147,6 +2159,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = false,
+        .share_level = CPU_TOPO_LEVEL_DIE,
     },
 };

@@ -2162,6 +2175,7 @@ static const CPUCaches epyc_genoa_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -2174,6 +2188,7 @@ static const CPUCaches epyc_genoa_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2184,6 +2199,7 @@ static const CPUCaches epyc_genoa_cache_info = {
         .partitions = 1,
         .sets = 2048,
         .lines_per_tag = 1,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2197,6 +2213,7 @@ static const CPUCaches epyc_genoa_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = false,
+        .share_level = CPU_TOPO_LEVEL_DIE,
     },
 };


=========================================================================

Thanks
Babu
> 
> On 8/1/23 05:35, Zhao Liu wrote:
>> From: Zhao Liu <zhao1.liu@intel.com>
>>
>> CPUID[4].EAX[bits 25:14] is used to represent the cache topology for
>> intel CPUs.
>>
>> After cache models have topology information, we can use
>> CPUCacheInfo.share_level to decide which topology level to be encoded
>> into CPUID[4].EAX[bits 25:14].
>>
>> And since maximum_processor_id (original "num_apic_ids") is parsed
>> based on cpu topology levels, which are verified when parsing smp, it's
>> no need to check this value by "assert(num_apic_ids > 0)" again, so
>> remove this assert.
>>
>> Additionally, wrap the encoding of CPUID[4].EAX[bits 31:26] into a
>> helper to make the code cleaner.
>>
>> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
>> ---
>> Changes since v1:
>>  * Use "enum CPUTopoLevel share_level" as the parameter in
>>    max_processor_ids_for_cache().
>>  * Make cache_into_passthrough case also use
>>    max_processor_ids_for_cache() and max_core_ids_in_package() to
>>    encode CPUID[4]. (Yanan)
>>  * Rename the title of this patch (the original is "i386: Use
>>    CPUCacheInfo.share_level to encode CPUID[4].EAX[bits 25:14]").
>> ---
>>  target/i386/cpu.c | 70 +++++++++++++++++++++++++++++------------------
>>  1 file changed, 43 insertions(+), 27 deletions(-)
>>
>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>> index 55aba4889628..c9897c0fe91a 100644
>> --- a/target/i386/cpu.c
>> +++ b/target/i386/cpu.c
>> @@ -234,22 +234,53 @@ static uint8_t cpuid2_cache_descriptor(CPUCacheInfo *cache)
>>                         ((t) == UNIFIED_CACHE) ? CACHE_TYPE_UNIFIED : \
>>                         0 /* Invalid value */)
>>  
>> +static uint32_t max_processor_ids_for_cache(X86CPUTopoInfo *topo_info,
>> +                                            enum CPUTopoLevel share_level)
>> +{
>> +    uint32_t num_ids = 0;
>> +
>> +    switch (share_level) {
>> +    case CPU_TOPO_LEVEL_CORE:
>> +        num_ids = 1 << apicid_core_offset(topo_info);
>> +        break;
>> +    case CPU_TOPO_LEVEL_DIE:
>> +        num_ids = 1 << apicid_die_offset(topo_info);
>> +        break;
>> +    case CPU_TOPO_LEVEL_PACKAGE:
>> +        num_ids = 1 << apicid_pkg_offset(topo_info);
>> +        break;
>> +    default:
>> +        /*
>> +         * Currently there is no use case for SMT and MODULE, so use
>> +         * assert directly to facilitate debugging.
>> +         */
>> +        g_assert_not_reached();
>> +    }
>> +
>> +    return num_ids - 1;
>> +}
>> +
>> +static uint32_t max_core_ids_in_package(X86CPUTopoInfo *topo_info)
>> +{
>> +    uint32_t num_cores = 1 << (apicid_pkg_offset(topo_info) -
>> +                               apicid_core_offset(topo_info));
>> +    return num_cores - 1;
>> +}
>>  
>>  /* Encode cache info for CPUID[4] */
>>  static void encode_cache_cpuid4(CPUCacheInfo *cache,
>> -                                int num_apic_ids, int num_cores,
>> +                                X86CPUTopoInfo *topo_info,
>>                                  uint32_t *eax, uint32_t *ebx,
>>                                  uint32_t *ecx, uint32_t *edx)
>>  {
>>      assert(cache->size == cache->line_size * cache->associativity *
>>                            cache->partitions * cache->sets);
>>  
>> -    assert(num_apic_ids > 0);
>>      *eax = CACHE_TYPE(cache->type) |
>>             CACHE_LEVEL(cache->level) |
>>             (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0) |
>> -           ((num_cores - 1) << 26) |
>> -           ((num_apic_ids - 1) << 14);
>> +           (max_core_ids_in_package(topo_info) << 26) |
>> +           (max_processor_ids_for_cache(topo_info, cache->share_level) << 14);
>>  
>>      assert(cache->line_size > 0);
>>      assert(cache->partitions > 0);
>> @@ -6116,56 +6147,41 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>>                  int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
>>  
>>                  if (cores_per_pkg > 1) {
>> -                    int addressable_cores_offset =
>> -                                                apicid_pkg_offset(&topo_info) -
>> -                                                apicid_core_offset(&topo_info);
>> -
>>                      *eax &= ~0xFC000000;
>> -                    *eax |= (1 << addressable_cores_offset - 1) << 26;
>> +                    *eax |= max_core_ids_in_package(&topo_info) << 26;
>>                  }
>>                  if (host_vcpus_per_cache > cpus_per_pkg) {
>> -                    int pkg_offset = apicid_pkg_offset(&topo_info);
>> -
>>                      *eax &= ~0x3FFC000;
>> -                    *eax |= (1 << pkg_offset - 1) << 14;
>> +                    *eax |=
>> +                        max_processor_ids_for_cache(&topo_info,
>> +                                                CPU_TOPO_LEVEL_PACKAGE) << 14;
>>                  }
>>              }
>>          } else if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
>>              *eax = *ebx = *ecx = *edx = 0;
>>          } else {
>>              *eax = 0;
>> -            int addressable_cores_offset = apicid_pkg_offset(&topo_info) -
>> -                                           apicid_core_offset(&topo_info);
>> -            int core_offset, die_offset;
>>  
>>              switch (count) {
>>              case 0: /* L1 dcache info */
>> -                core_offset = apicid_core_offset(&topo_info);
>>                  encode_cache_cpuid4(env->cache_info_cpuid4.l1d_cache,
>> -                                    (1 << core_offset),
>> -                                    (1 << addressable_cores_offset),
>> +                                    &topo_info,
>>                                      eax, ebx, ecx, edx);
>>                  break;
>>              case 1: /* L1 icache info */
>> -                core_offset = apicid_core_offset(&topo_info);
>>                  encode_cache_cpuid4(env->cache_info_cpuid4.l1i_cache,
>> -                                    (1 << core_offset),
>> -                                    (1 << addressable_cores_offset),
>> +                                    &topo_info,
>>                                      eax, ebx, ecx, edx);
>>                  break;
>>              case 2: /* L2 cache info */
>> -                core_offset = apicid_core_offset(&topo_info);
>>                  encode_cache_cpuid4(env->cache_info_cpuid4.l2_cache,
>> -                                    (1 << core_offset),
>> -                                    (1 << addressable_cores_offset),
>> +                                    &topo_info,
>>                                      eax, ebx, ecx, edx);
>>                  break;
>>              case 3: /* L3 cache info */
>> -                die_offset = apicid_die_offset(&topo_info);
>>                  if (cpu->enable_l3_cache) {
>>                      encode_cache_cpuid4(env->cache_info_cpuid4.l3_cache,
>> -                                        (1 << die_offset),
>> -                                        (1 << addressable_cores_offset),
>> +                                        &topo_info,
>>                                          eax, ebx, ecx, edx);
>>                      break;
>>                  }
> 

-- 
Thanks
Babu Moger


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* Re: [PATCH v3 15/17] i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]
  2023-08-01 10:35 ` [PATCH v3 15/17] i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14] Zhao Liu
@ 2023-08-03 20:40   ` Moger, Babu
  2023-08-04  9:50     ` Zhao Liu
  0 siblings, 1 reply; 63+ messages in thread
From: Moger, Babu @ 2023-08-03 20:40 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Zhao Liu

Hi Zhao,

On 8/1/23 05:35, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
> 
> The commit 8f4202fb1080 ("i386: Populate AMD Processor Cache Information
> for cpuid 0x8000001D") adds the cache topology for AMD CPU by encoding
> the number of sharing threads directly.
> 
> From AMD's APM, NumSharingCache (CPUID[0x8000001D].EAX[bits 25:14])
> means [1]:
> 
> The number of logical processors sharing this cache is the value of
> this field incremented by 1. To determine which logical processors are
> sharing a cache, determine a Share Id for each processor as follows:
> 
> ShareId = LocalApicId >> log2(NumSharingCache+1)
> 
> Logical processors with the same ShareId then share a cache. If
> NumSharingCache+1 is not a power of two, round it up to the next power
> of two.
> 
> From the description above, the calculation of this field should be the
> same as CPUID[4].EAX[bits 25:14] for Intel CPUs. So also use the offsets
> of APIC ID to calculate this field.
> 
> Note: I don't have the AMD hardware available, hope folks can help me
> to test this, thanks!

Yes. Decode looks good. You can remove this note in next revision.

The subject line "Fix" gives the wrong impression. I would change the
subject to something like this:

i386: Use offsets to get NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]


> 
> [1]: APM, vol.3, appendix.E.4.15 Function 8000_001Dh--Cache Topology
>      Information
> 
> Cc: Babu Moger <babu.moger@amd.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
> Changes since v1:
>  * Rename "l3_threads" to "num_apic_ids" in
>    encode_cache_cpuid8000001d(). (Yanan)
>  * Add the description of the original commit and add Cc.
> ---
>  target/i386/cpu.c | 10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index c9897c0fe91a..f67b6be10b8d 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -361,7 +361,7 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
>                                         uint32_t *eax, uint32_t *ebx,
>                                         uint32_t *ecx, uint32_t *edx)
>  {
> -    uint32_t l3_threads;
> +    uint32_t num_apic_ids;

I would change it to match spec definition.

  uint32_t num_sharing_cache;


>      assert(cache->size == cache->line_size * cache->associativity *
>                            cache->partitions * cache->sets);
>  
> @@ -370,13 +370,11 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
>  
>      /* L3 is shared among multiple cores */
>      if (cache->level == 3) {
> -        l3_threads = topo_info->modules_per_die *
> -                     topo_info->cores_per_module *
> -                     topo_info->threads_per_core;
> -        *eax |= (l3_threads - 1) << 14;
> +        num_apic_ids = 1 << apicid_die_offset(topo_info);
>      } else {
> -        *eax |= ((topo_info->threads_per_core - 1) << 14);
> +        num_apic_ids = 1 << apicid_core_offset(topo_info);
>      }
> +    *eax |= (num_apic_ids - 1) << 14;
>  
>      assert(cache->line_size > 0);
>      assert(cache->partitions > 0);

-- 
Thanks
Babu Moger


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v3 16/17] i386: Use CPUCacheInfo.share_level to encode CPUID[0x8000001D].EAX[bits 25:14]
  2023-08-01 10:35 ` [PATCH v3 16/17] i386: Use CPUCacheInfo.share_level to encode " Zhao Liu
@ 2023-08-03 20:44   ` Moger, Babu
  2023-08-04  9:56     ` Zhao Liu
  0 siblings, 1 reply; 63+ messages in thread
From: Moger, Babu @ 2023-08-03 20:44 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Xiaoyao Li, Zhao Liu

Hi Zhao,
  Please copy the thread to kvm@vger.kernel.org also.  It makes it easier
to browse.


On 8/1/23 05:35, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
> 
> CPUID[0x8000001D].EAX[bits 25:14] is used to represent the cache
> topology for amd CPUs.
Please change this to.


CPUID[0x8000001D].EAX[bits 25:14] NumSharingCache: number of logical
processors sharing cache. The number of logical processors sharing this
cache is NumSharingCache + 1.

> 
> After cache models have topology information, we can use
> CPUCacheInfo.share_level to decide which topology level to be encoded
> into CPUID[0x8000001D].EAX[bits 25:14].
> 
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
> Changes since v1:
>  * Use cache->share_level as the parameter in
>    max_processor_ids_for_cache().
> ---
>  target/i386/cpu.c | 10 +---------
>  1 file changed, 1 insertion(+), 9 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index f67b6be10b8d..6eee0274ade4 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -361,20 +361,12 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
>                                         uint32_t *eax, uint32_t *ebx,
>                                         uint32_t *ecx, uint32_t *edx)
>  {
> -    uint32_t num_apic_ids;
>      assert(cache->size == cache->line_size * cache->associativity *
>                            cache->partitions * cache->sets);
>  
>      *eax = CACHE_TYPE(cache->type) | CACHE_LEVEL(cache->level) |
>                 (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0);
> -
> -    /* L3 is shared among multiple cores */
> -    if (cache->level == 3) {
> -        num_apic_ids = 1 << apicid_die_offset(topo_info);
> -    } else {
> -        num_apic_ids = 1 << apicid_core_offset(topo_info);
> -    }
> -    *eax |= (num_apic_ids - 1) << 14;
> +    *eax |= max_processor_ids_for_cache(topo_info, cache->share_level) << 14;
>  
>      assert(cache->line_size > 0);
>      assert(cache->partitions > 0);

-- 
Thanks
Babu Moger


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v3 00/17] Support smp.clusters for x86
  2023-08-01 23:11 ` Moger, Babu
@ 2023-08-04  7:44   ` Zhao Liu
  0 siblings, 0 replies; 63+ messages in thread
From: Zhao Liu @ 2023-08-04  7:44 UTC (permalink / raw)
  To: Moger, Babu
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Xiaoyao Li, Zhao Liu

On Tue, Aug 01, 2023 at 06:11:52PM -0500, Moger, Babu wrote:
> Date: Tue, 1 Aug 2023 18:11:52 -0500
> From: "Moger, Babu" <babu.moger@amd.com>
> Subject: Re: [PATCH v3 00/17] Support smp.clusters for x86
> 

Hi Babu,

Many thanks for your review and help!

> 
> On 8/1/23 05:35, Zhao Liu wrote:
> > From: Zhao Liu <zhao1.liu@intel.com>
> > 
> > Hi list,
> > 
> > This is the our v3 patch series, rebased on the master branch at the
> > commit 234320cd0573 ("Merge tag 'pull-target-arm-20230731' of https:
> > //git.linaro.org/people/pmaydell/qemu-arm into staging").
> > 
> > Comparing with v2 [1], v3 mainly adds "Tested-by", "Reviewed-by" and
> > "ACKed-by" (for PC related patchies) tags and minor code changes (Pls
> > see changelog).
> > 
> > 
> > # Introduction
> > 
> > This series add the cluster support for x86 PC machine, which allows
> > x86 can use smp.clusters to configure x86 modlue level CPU topology.
> 
> /s/modlue/module

Thanks!

> > 
> > And since the compatibility issue (see section: ## Why not share L2
> > cache in cluster directly), this series also introduce a new command
> > to adjust the x86 L2 cache topology.
> > 
> > Welcome your comments!
> > 
> > 
> > # Backgroud
> > 
> > The "clusters" parameter in "smp" is introduced by ARM [2], but x86
> > hasn't supported it.
> > 
> > At present, x86 defaults L2 cache is shared in one core, but this is
> > not enough. There're some platforms that multiple cores share the
> > same L2 cache, e.g., Alder Lake-P shares L2 cache for one module of
> > Atom cores [3], that is, every four Atom cores shares one L2 cache.
> > Therefore, we need the new CPU topology level (cluster/module).
> > 
> > Another reason is for hybrid architecture. cluster support not only
> > provides another level of topology definition in x86, but would aslo
> > provide required code change for future our hybrid topology support.
> > 
> > 
> > # Overview
> > 
> > ## Introduction of module level for x86
> > 
> > "cluster" in smp is the CPU topology level which is between "core" and
> > die.
> > 
> > For x86, the "cluster" in smp is corresponding to the module level [4],
> > which is above the core level. So use the "module" other than "cluster"
> > in x86 code.
> > 
> > And please note that x86 already has a cpu topology level also named
> > "cluster" [4], this level is at the upper level of the package. Here,
> > the cluster in x86 cpu topology is completely different from the
> > "clusters" as the smp parameter. After the module level is introduced,
> > the cluster as the smp parameter will actually refer to the module level
> > of x86.
> > 
> > 
> > ## Why not share L2 cache in cluster directly
> > 
> > Though "clusters" was introduced to help define L2 cache topology
> > [2], using cluster to define x86's L2 cache topology will cause the
> > compatibility problem:
> > 
> > Currently, x86 defaults that the L2 cache is shared in one core, which
> > actually implies a default setting "cores per L2 cache is 1" and
> > therefore implicitly defaults to having as many L2 caches as cores.
> > 
> > For example (i386 PC machine):
> > -smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16 (*)
> > 
> > Considering the topology of the L2 cache, this (*) implicitly means "1
> > core per L2 cache" and "2 L2 caches per die".
> > 
> > If we use cluster to configure L2 cache topology with the new default
> > setting "clusters per L2 cache is 1", the above semantics will change
> > to "2 cores per cluster" and "1 cluster per L2 cache", that is, "2
> > cores per L2 cache".
> > 
> > So the same command (*) will cause changes in the L2 cache topology,
> > further affecting the performance of the virtual machine.
> > 
> > Therefore, x86 should only treat cluster as a cpu topology level and
> > avoid using it to change L2 cache by default for compatibility.
> > 
> > 
> > ## module level in CPUID
> > 
> > Currently, we don't expose module level in CPUID.1FH because currently
> > linux (v6.2-rc6) doesn't support module level. And exposing module and
> > die levels at the same time in CPUID.1FH will cause linux to calculate
> > wrong die_id. The module level should be exposed until the real machine
> > has the module level in CPUID.1FH.
> > 
> > We can configure CPUID.04H.02H (L2 cache topology) with module level by
> > a new command:
> > 
> >         "-cpu,x-l2-cache-topo=cluster"
> > 
> > More information about this command, please see the section: "## New
> > property: x-l2-cache-topo".
> > 
> > 
> > ## New cache topology info in CPUCacheInfo
> > 
> > Currently, by default, the cache topology is encoded as:
> > 1. i/d cache is shared in one core.
> > 2. L2 cache is shared in one core.
> > 3. L3 cache is shared in one die.
> > 
> > This default general setting has caused a misunderstanding, that is, the
> > cache topology is completely equated with a specific cpu topology, such
> > as the connection between L2 cache and core level, and the connection
> > between L3 cache and die level.
> > 
> > In fact, the settings of these topologies depend on the specific
> > platform and are not static. For example, on Alder Lake-P, every
> > four Atom cores share the same L2 cache [2].
> > 
> > Thus, in this patch set, we explicitly define the corresponding cache
> > topology for different cpu models and this has two benefits:
> > 1. Easy to expand to new CPU models in the future, which has different
> >    cache topology.
> > 2. It can easily support custom cache topology by some command (e.g.,
> >    x-l2-cache-topo).
> > 
> > 
> > ## New property: x-l2-cache-topo
> > 
> > The property l2-cache-topo will be used to change the L2 cache topology
> 
> Should this be x-l2-cache-topo ?

Yes.

> 
> > in CPUID.04H.
> > 
> > Now it allows user to set the L2 cache is shared in core level or
> > cluster level.
> > 
> > If user passes "-cpu x-l2-cache-topo=[core|cluster]" then older L2 cache
> > topology will be overrided by the new topology setting.
> > 
> > Since CPUID.04H is used by intel cpus, this property is available on
> > intel cpus as for now.
> 
> s/intel cpus/Intel CPUs/
> I feel this looks  better

OK.

> 
> > 
> > When necessary, it can be extended to CPUID[0x8000001D] for amd cpus.
> 
> s/amd cpus/AMD CPUs/

Will fix.

Thanks,
Zhao

> 
> > 
> > 
> > # Patch description
> > 
> > patch 1-2 Cleanups about coding style and test name.
> > 
> > patch 3-4,15 Fixes about x86 topology, intel l1 cache topology and amd
> >              cache topology encoding.
> > 
> > patch 5-6 Cleanups about topology related CPUID encoding and QEMU
> >           topology variables.
> > 
> > patch 7-12 Add the module as the new CPU topology level in x86, and it
> >            is corresponding to the cluster level in generic code.
> > 
> > patch 13,14,16 Add cache topology infomation in cache models.
> > 
> > patch 17 Introduce a new command to configure L2 cache topology.
> > 
> > 
> > [1]: https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg07179.html
> > [2]: https://patchew.org/QEMU/20211228092221.21068-1-wangyanan55@huawei.com/
> > [3]: https://www.intel.com/content/www/us/en/products/platforms/details/alder-lake-p.html
> > [4]: SDM, vol.3, ch.9, 9.9.1 Hierarchical Mapping of Shared Resources.
> > 
> > Best Regards,
> > Zhao
> > 
> > ---
> > Changelog:
> > 
> > Changes since v2:
> >  * Add "Tested-by", "Reviewed-by" and "ACKed-by" tags.
> >  * Use newly added wrapped helper to get cores per socket in
> >    qemu_init_vcpu().
> > 
> > Changes since v1:
> >  * Reordered patches. (Yanan)
> >  * Deprecated the patch to fix comment of machine_parse_smp_config().
> >    (Yanan)
> >  * Rename test-x86-cpuid.c to test-x86-topo.c. (Yanan)
> >  * Split the intel's l1 cache topology fix into a new separate patch.
> >    (Yanan)
> >  * Combined module_id and APIC ID for module level support into one
> >    patch. (Yanan)
> >  * Make cache_into_passthrough case of cpuid 0x04 leaf in
> >  * cpu_x86_cpuid() use max_processor_ids_for_cache() and
> >    max_core_ids_in_package() to encode CPUID[4]. (Yanan)
> >  * Add the prefix "CPU_TOPO_LEVEL_*" for CPU topology level names.
> >    (Yanan)
> >  * Rename the "INVALID" level to "CPU_TOPO_LEVEL_UNKNOW". (Yanan)
> > 
> > ---
> > Zhao Liu (10):
> >   i386: Fix comment style in topology.h
> >   tests: Rename test-x86-cpuid.c to test-x86-topo.c
> >   i386/cpu: Fix i/d-cache topology to core level for Intel CPU
> >   i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4]
> >   i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
> >   i386: Add cache topology info in CPUCacheInfo
> >   i386: Use CPUCacheInfo.share_level to encode CPUID[4]
> >   i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]
> >   i386: Use CPUCacheInfo.share_level to encode
> >     CPUID[0x8000001D].EAX[bits 25:14]
> >   i386: Add new property to control L2 cache topo in CPUID.04H
> > 
> > Zhuocheng Ding (7):
> >   softmmu: Fix CPUSTATE.nr_cores' calculation
> >   i386: Introduce module-level cpu topology to CPUX86State
> >   i386: Support modules_per_die in X86CPUTopoInfo
> >   i386: Support module_id in X86CPUTopoIDs
> >   i386/cpu: Introduce cluster-id to X86CPU
> >   tests: Add test case of APIC ID for module level parsing
> >   hw/i386/pc: Support smp.clusters for x86 PC machine
> > 
> >  MAINTAINERS                                   |   2 +-
> >  hw/i386/pc.c                                  |   1 +
> >  hw/i386/x86.c                                 |  49 +++++-
> >  include/hw/core/cpu.h                         |   2 +-
> >  include/hw/i386/topology.h                    |  68 +++++---
> >  qemu-options.hx                               |  10 +-
> >  softmmu/cpus.c                                |   2 +-
> >  target/i386/cpu.c                             | 158 ++++++++++++++----
> >  target/i386/cpu.h                             |  25 +++
> >  tests/unit/meson.build                        |   4 +-
> >  .../{test-x86-cpuid.c => test-x86-topo.c}     |  58 ++++---
> >  11 files changed, 280 insertions(+), 99 deletions(-)
> >  rename tests/unit/{test-x86-cpuid.c => test-x86-topo.c} (73%)
> > 
> 
> -- 
> Thanks
> Babu Moger


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v3 01/17] i386: Fix comment style in topology.h
  2023-08-01 23:13   ` Moger, Babu
@ 2023-08-04  8:12     ` Zhao Liu
  0 siblings, 0 replies; 63+ messages in thread
From: Zhao Liu @ 2023-08-04  8:12 UTC (permalink / raw)
  To: Moger, Babu
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Xiaoyao Li, Zhao Liu

Hi Babu,

On Tue, Aug 01, 2023 at 06:13:55PM -0500, Moger, Babu wrote:
> Date: Tue, 1 Aug 2023 18:13:55 -0500
> From: "Moger, Babu" <babu.moger@amd.com>
> Subject: Re: [PATCH v3 01/17] i386: Fix comment style in topology.h
> 
> Hi Zhao,
> 
> On 8/1/23 05:35, Zhao Liu wrote:
> > From: Zhao Liu <zhao1.liu@intel.com>
> > 
> > For function comments in this file, keep the comment style consistent
> > with other places.
> 
> s/with other places./with other files in the directory./

OK, thanks!

-Zhao

> 
> > 
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org
> > Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
> > Acked-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> >  include/hw/i386/topology.h | 33 +++++++++++++++++----------------
> >  1 file changed, 17 insertions(+), 16 deletions(-)
> > 
> > diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> > index 81573f6cfde0..5a19679f618b 100644
> > --- a/include/hw/i386/topology.h
> > +++ b/include/hw/i386/topology.h
> > @@ -24,7 +24,8 @@
> >  #ifndef HW_I386_TOPOLOGY_H
> >  #define HW_I386_TOPOLOGY_H
> >  
> > -/* This file implements the APIC-ID-based CPU topology enumeration logic,
> > +/*
> > + * This file implements the APIC-ID-based CPU topology enumeration logic,
> >   * documented at the following document:
> >   *   Intel® 64 Architecture Processor Topology Enumeration
> >   *   http://software.intel.com/en-us/articles/intel-64-architecture-processor-topology-enumeration/
> > @@ -41,7 +42,8 @@
> >  
> >  #include "qemu/bitops.h"
> >  
> > -/* APIC IDs can be 32-bit, but beware: APIC IDs > 255 require x2APIC support
> > +/*
> > + * APIC IDs can be 32-bit, but beware: APIC IDs > 255 require x2APIC support
> >   */
> >  typedef uint32_t apic_id_t;
> >  
> > @@ -58,8 +60,7 @@ typedef struct X86CPUTopoInfo {
> >      unsigned threads_per_core;
> >  } X86CPUTopoInfo;
> >  
> > -/* Return the bit width needed for 'count' IDs
> > - */
> > +/* Return the bit width needed for 'count' IDs */
> >  static unsigned apicid_bitwidth_for_count(unsigned count)
> >  {
> >      g_assert(count >= 1);
> > @@ -67,15 +68,13 @@ static unsigned apicid_bitwidth_for_count(unsigned count)
> >      return count ? 32 - clz32(count) : 0;
> >  }
> >  
> > -/* Bit width of the SMT_ID (thread ID) field on the APIC ID
> > - */
> > +/* Bit width of the SMT_ID (thread ID) field on the APIC ID */
> >  static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
> >  {
> >      return apicid_bitwidth_for_count(topo_info->threads_per_core);
> >  }
> >  
> > -/* Bit width of the Core_ID field
> > - */
> > +/* Bit width of the Core_ID field */
> >  static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
> >  {
> >      return apicid_bitwidth_for_count(topo_info->cores_per_die);
> > @@ -87,8 +86,7 @@ static inline unsigned apicid_die_width(X86CPUTopoInfo *topo_info)
> >      return apicid_bitwidth_for_count(topo_info->dies_per_pkg);
> >  }
> >  
> > -/* Bit offset of the Core_ID field
> > - */
> > +/* Bit offset of the Core_ID field */
> >  static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
> >  {
> >      return apicid_smt_width(topo_info);
> > @@ -100,14 +98,14 @@ static inline unsigned apicid_die_offset(X86CPUTopoInfo *topo_info)
> >      return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
> >  }
> >  
> > -/* Bit offset of the Pkg_ID (socket ID) field
> > - */
> > +/* Bit offset of the Pkg_ID (socket ID) field */
> >  static inline unsigned apicid_pkg_offset(X86CPUTopoInfo *topo_info)
> >  {
> >      return apicid_die_offset(topo_info) + apicid_die_width(topo_info);
> >  }
> >  
> > -/* Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
> > +/*
> > + * Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
> >   *
> >   * The caller must make sure core_id < nr_cores and smt_id < nr_threads.
> >   */
> > @@ -120,7 +118,8 @@ static inline apic_id_t x86_apicid_from_topo_ids(X86CPUTopoInfo *topo_info,
> >             topo_ids->smt_id;
> >  }
> >  
> > -/* Calculate thread/core/package IDs for a specific topology,
> > +/*
> > + * Calculate thread/core/package IDs for a specific topology,
> >   * based on (contiguous) CPU index
> >   */
> >  static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
> > @@ -137,7 +136,8 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
> >      topo_ids->smt_id = cpu_index % nr_threads;
> >  }
> >  
> > -/* Calculate thread/core/package IDs for a specific topology,
> > +/*
> > + * Calculate thread/core/package IDs for a specific topology,
> >   * based on APIC ID
> >   */
> >  static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
> > @@ -155,7 +155,8 @@ static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
> >      topo_ids->pkg_id = apicid >> apicid_pkg_offset(topo_info);
> >  }
> >  
> > -/* Make APIC ID for the CPU 'cpu_index'
> > +/*
> > + * Make APIC ID for the CPU 'cpu_index'
> >   *
> >   * 'cpu_index' is a sequential, contiguous ID for the CPU.
> >   */
> 
> -- 
> Thanks
> Babu Moger


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v3 02/17] tests: Rename test-x86-cpuid.c to test-x86-topo.c
  2023-08-01 23:20   ` Moger, Babu
@ 2023-08-04  8:14     ` Zhao Liu
  0 siblings, 0 replies; 63+ messages in thread
From: Zhao Liu @ 2023-08-04  8:14 UTC (permalink / raw)
  To: Moger, Babu
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Xiaoyao Li, Zhao Liu,
	Yongwei Ma

Hi Babu,

On Tue, Aug 01, 2023 at 06:20:46PM -0500, Moger, Babu wrote:
> Date: Tue, 1 Aug 2023 18:20:46 -0500
> From: "Moger, Babu" <babu.moger@amd.com>
> Subject: Re: [PATCH v3 02/17] tests: Rename test-x86-cpuid.c to
>  test-x86-topo.c
> 
> Zhao,
> 
> On 8/1/23 05:35, Zhao Liu wrote:
> > From: Zhao Liu <zhao1.liu@intel.com>
> > 
> > In fact, this unit tests APIC ID other than CPUID.
> 
> This is not clear.
> 
> The tests in test-x86-topo.c actually test the APIC ID combinations.
> Rename to test-x86-topo.c to make its name more in line with its actual
> content.

Thanks, your description is better and clearer!

-Zhao

> 
> > Rename to test-x86-topo.c to make its name more in line with its
> > actual content.
> > 
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > Tested-by: Yongwei Ma <yongwei.ma@intel.com>
> > Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org
> > Acked-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> > Changes since v1:
> >  * Rename test-x86-apicid.c to test-x86-topo.c. (Yanan)
> > ---
> >  MAINTAINERS                                      | 2 +-
> >  tests/unit/meson.build                           | 4 ++--
> >  tests/unit/{test-x86-cpuid.c => test-x86-topo.c} | 2 +-
> >  3 files changed, 4 insertions(+), 4 deletions(-)
> >  rename tests/unit/{test-x86-cpuid.c => test-x86-topo.c} (99%)
> > 
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 12e59b6b27de..51ba3d593e90 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -1719,7 +1719,7 @@ F: include/hw/southbridge/ich9.h
> >  F: include/hw/southbridge/piix.h
> >  F: hw/isa/apm.c
> >  F: include/hw/isa/apm.h
> > -F: tests/unit/test-x86-cpuid.c
> > +F: tests/unit/test-x86-topo.c
> >  F: tests/qtest/test-x86-cpuid-compat.c
> >  
> >  PC Chipset
> > diff --git a/tests/unit/meson.build b/tests/unit/meson.build
> > index 93977cc32d2b..39b5d0007c69 100644
> > --- a/tests/unit/meson.build
> > +++ b/tests/unit/meson.build
> > @@ -21,8 +21,8 @@ tests = {
> >    'test-opts-visitor': [testqapi],
> >    'test-visitor-serialization': [testqapi],
> >    'test-bitmap': [],
> > -  # all code tested by test-x86-cpuid is inside topology.h
> > -  'test-x86-cpuid': [],
> > +  # all code tested by test-x86-topo is inside topology.h
> > +  'test-x86-topo': [],
> >    'test-cutils': [],
> >    'test-div128': [],
> >    'test-shift128': [],
> > diff --git a/tests/unit/test-x86-cpuid.c b/tests/unit/test-x86-topo.c
> > similarity index 99%
> > rename from tests/unit/test-x86-cpuid.c
> > rename to tests/unit/test-x86-topo.c
> > index bfabc0403a1a..2b104f86d7c2 100644
> > --- a/tests/unit/test-x86-cpuid.c
> > +++ b/tests/unit/test-x86-topo.c
> > @@ -1,5 +1,5 @@
> >  /*
> > - *  Test code for x86 CPUID and Topology functions
> > + *  Test code for x86 APIC ID and Topology functions
> >   *
> >   *  Copyright (c) 2012 Red Hat Inc.
> >   *
> 
> -- 
> Thanks
> Babu Moger


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v3 03/17] softmmu: Fix CPUSTATE.nr_cores' calculation
  2023-08-02 15:25   ` Moger, Babu
@ 2023-08-04  8:16     ` Zhao Liu
  0 siblings, 0 replies; 63+ messages in thread
From: Zhao Liu @ 2023-08-04  8:16 UTC (permalink / raw)
  To: Moger, Babu
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Xiaoyao Li, Zhao Liu,
	Zhuocheng Ding

Hi Babu,

On Wed, Aug 02, 2023 at 10:25:58AM -0500, Moger, Babu wrote:
> Date: Wed, 2 Aug 2023 10:25:58 -0500
> From: "Moger, Babu" <babu.moger@amd.com>
> Subject: Re: [PATCH v3 03/17] softmmu: Fix CPUSTATE.nr_cores' calculation
> 
> Hi Zhao,
> 
> On 8/1/23 05:35, Zhao Liu wrote:
> > From: Zhuocheng Ding <zhuocheng.ding@intel.com>
> > 
> > From CPUState.nr_cores' comment, it represents "number of cores within
> > this CPU package".
> > 
> > After 003f230e37d7 ("machine: Tweak the order of topology members in
> > struct CpuTopology"), the meaning of smp.cores changed to "the number of
> > cores in one die", but this commit missed to change CPUState.nr_cores'
> > caculation, so that CPUState.nr_cores became wrong and now it
> > misses to consider numbers of clusters and dies.
> > 
> > At present, only i386 is using CPUState.nr_cores.
> > 
> > But as for i386, which supports die level, the uses of CPUState.nr_cores
> > are very confusing:
> > 
> > Early uses are based on the meaning of "cores per package" (before die
> > is introduced into i386), and later uses are based on "cores per die"
> > (after die's introduction).
> > 
> > This difference is due to that commit a94e1428991f ("target/i386: Add
> > CPUID.1F generation support for multi-dies PCMachine") misunderstood
> > that CPUState.nr_cores means "cores per die" when caculated
> > CPUID.1FH.01H:EBX. After that, the changes in i386 all followed this
> > wrong understanding.
> > 
> > With the influence of 003f230e37d7 and a94e1428991f, for i386 currently
> > the result of CPUState.nr_cores is "cores per die", thus the original
> > uses of CPUState.cores based on the meaning of "cores per package" are
> > wrong when mutiple dies exist:
> > 1. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.01H:EBX[bits 23:16] is
> >    incorrect because it expects "cpus per package" but now the
> >    result is "cpus per die".
> > 2. In cpu_x86_cpuid() of target/i386/cpu.c, for all leaves of CPUID.04H:
> >    EAX[bits 31:26] is incorrect because they expect "cpus per package"
> >    but now the result is "cpus per die". The error not only impacts the
> >    EAX caculation in cache_info_passthrough case, but also impacts other
> >    cases of setting cache topology for Intel CPU according to cpu
> >    topology (specifically, the incoming parameter "num_cores" expects
> >    "cores per package" in encode_cache_cpuid4()).
> > 3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.0BH.01H:EBX[bits
> >    15:00] is incorrect because the EBX of 0BH.01H (core level) expects
> >    "cpus per package", which may be different with 1FH.01H (The reason
> >    is 1FH can support more levels. For QEMU, 1FH also supports die,
> >    1FH.01H:EBX[bits 15:00] expects "cpus per die").
> > 4. In cpu_x86_cpuid() of target/i386/cpu.c, when CPUID.80000001H is
> >    caculated, here "cpus per package" is expected to be checked, but in
> >    fact, now it checks "cpus per die". Though "cpus per die" also works
> >    for this code logic, this isn't consistent with AMD's APM.
> > 5. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.80000008H:ECX expects
> >    "cpus per package" but it obtains "cpus per die".
> > 6. In simulate_rdmsr() of target/i386/hvf/x86_emu.c, in
> >    kvm_rdmsr_core_thread_count() of target/i386/kvm/kvm.c, and in
> >    helper_rdmsr() of target/i386/tcg/sysemu/misc_helper.c,
> >    MSR_CORE_THREAD_COUNT expects "cpus per package" and "cores per
> >    package", but in these functions, it obtains "cpus per die" and
> >    "cores per die".
> > 
> > On the other hand, these uses are correct now (they are added in/after
> > a94e1428991f):
> > 1. In cpu_x86_cpuid() of target/i386/cpu.c, topo_info.cores_per_die
> >    meets the actual meaning of CPUState.nr_cores ("cores per die").
> > 2. In cpu_x86_cpuid() of target/i386/cpu.c, vcpus_per_socket (in CPUID.
> >    04H's caculation) considers number of dies, so it's correct.
> > 3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.1FH.01H:EBX[bits
> >    15:00] needs "cpus per die" and it gets the correct result, and
> >    CPUID.1FH.02H:EBX[bits 15:00] gets correct "cpus per package".
> > 
> > When CPUState.nr_cores is correctly changed to "cores per package" again
> > , the above errors will be fixed without extra work, but the "currently"
> > correct cases will go wrong and need special handling to pass correct
> > "cpus/cores per die" they want.
> > 
> > Thus in this patch, we fix CPUState.nr_cores' caculation to fit the
> 
> s/Thus in this patch, we fix CPUState.nr_cores' caculation/Fix
> CPUState.nr_cores' calculation/

Thanks!

> 
> 
> Describe your changes in imperative mood also spell check.

Thanks for your suggestion!

-Zhao

> 
> 
> > original meaning "cores per package", as well as changing calculation of
> > topo_info.cores_per_die, vcpus_per_socket and CPUID.1FH.
> > 
> > In addition, in the nr_threads' comment, specify it represents the
> > number of threads in the "core" to avoid confusion, and also add comment
> > for nr_dies in CPUX86State.
> > 
> > Fixes: a94e1428991f ("target/i386: Add CPUID.1F generation support for multi-dies PCMachine")
> > Fixes: 003f230e37d7 ("machine: Tweak the order of topology members in struct CpuTopology")
> > Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> > Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > ---
> > Changes since v2:
> >  * Use wrapped helper to get cores per socket in qemu_init_vcpu().
> > Changes since v1:
> >  * Add comment for nr_dies in CPUX86State. (Yanan)
> > ---
> >  include/hw/core/cpu.h | 2 +-
> >  softmmu/cpus.c        | 2 +-
> >  target/i386/cpu.c     | 9 ++++-----
> >  target/i386/cpu.h     | 1 +
> >  4 files changed, 7 insertions(+), 7 deletions(-)
> > 
> > diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> > index fdcbe8735258..57f4d50ace72 100644
> > --- a/include/hw/core/cpu.h
> > +++ b/include/hw/core/cpu.h
> > @@ -277,7 +277,7 @@ struct qemu_work_item;
> >   *   See TranslationBlock::TCG CF_CLUSTER_MASK.
> >   * @tcg_cflags: Pre-computed cflags for this cpu.
> >   * @nr_cores: Number of cores within this CPU package.
> > - * @nr_threads: Number of threads within this CPU.
> > + * @nr_threads: Number of threads within this CPU core.
> >   * @running: #true if CPU is currently running (lockless).
> >   * @has_waiter: #true if a CPU is currently waiting for the cpu_exec_end;
> >   * valid under cpu_list_lock.
> > diff --git a/softmmu/cpus.c b/softmmu/cpus.c
> > index fed20ffb5dd2..984558d7b245 100644
> > --- a/softmmu/cpus.c
> > +++ b/softmmu/cpus.c
> > @@ -630,7 +630,7 @@ void qemu_init_vcpu(CPUState *cpu)
> >  {
> >      MachineState *ms = MACHINE(qdev_get_machine());
> >  
> > -    cpu->nr_cores = ms->smp.cores;
> > +    cpu->nr_cores = machine_topo_get_cores_per_socket(ms);
> >      cpu->nr_threads =  ms->smp.threads;
> >      cpu->stopped = true;
> >      cpu->random_seed = qemu_guest_random_seed_thread_part1();
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 97ad229d8ba3..50613cd04612 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -6011,7 +6011,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >      X86CPUTopoInfo topo_info;
> >  
> >      topo_info.dies_per_pkg = env->nr_dies;
> > -    topo_info.cores_per_die = cs->nr_cores;
> > +    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
> >      topo_info.threads_per_core = cs->nr_threads;
> >  
> >      /* Calculate & apply limits for different index ranges */
> > @@ -6087,8 +6087,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >               */
> >              if (*eax & 31) {
> >                  int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
> > -                int vcpus_per_socket = env->nr_dies * cs->nr_cores *
> > -                                       cs->nr_threads;
> > +                int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
> >                  if (cs->nr_cores > 1) {
> >                      *eax &= ~0xFC000000;
> >                      *eax |= (pow2ceil(cs->nr_cores) - 1) << 26;
> > @@ -6266,12 +6265,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >              break;
> >          case 1:
> >              *eax = apicid_die_offset(&topo_info);
> > -            *ebx = cs->nr_cores * cs->nr_threads;
> > +            *ebx = topo_info.cores_per_die * topo_info.threads_per_core;
> >              *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
> >              break;
> >          case 2:
> >              *eax = apicid_pkg_offset(&topo_info);
> > -            *ebx = env->nr_dies * cs->nr_cores * cs->nr_threads;
> > +            *ebx = cs->nr_cores * cs->nr_threads;
> >              *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
> >              break;
> >          default:
> > diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> > index e0771a10433b..7638128d59cc 100644
> > --- a/target/i386/cpu.h
> > +++ b/target/i386/cpu.h
> > @@ -1878,6 +1878,7 @@ typedef struct CPUArchState {
> >  
> >      TPRAccess tpr_access_type;
> >  
> > +    /* Number of dies within this CPU package. */
> >      unsigned nr_dies;
> >  } CPUX86State;
> >  
> 
> -- 
> Thanks
> Babu Moger


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v3 05/17] i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4]
  2023-08-02 15:41   ` Moger, Babu
@ 2023-08-04  8:21     ` Zhao Liu
  0 siblings, 0 replies; 63+ messages in thread
From: Zhao Liu @ 2023-08-04  8:21 UTC (permalink / raw)
  To: Moger, Babu
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Xiaoyao Li, Zhao Liu,
	Robert Hoo

Hi Babu,

On Wed, Aug 02, 2023 at 10:41:17AM -0500, Moger, Babu wrote:
> Date: Wed, 2 Aug 2023 10:41:17 -0500
> From: "Moger, Babu" <babu.moger@amd.com>
> Subject: Re: [PATCH v3 05/17] i386/cpu: Use APIC ID offset to encode cache
>  topo in CPUID[4]
> 
> Hi Zhao,
> 
> On 8/1/23 05:35, Zhao Liu wrote:
> > From: Zhao Liu <zhao1.liu@intel.com>
> > 
> > Refer to the fixes of cache_info_passthrough ([1], [2]) and SDM, the
> > CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26] should use the
> > nearest power-of-2 integer.
> > 
> > The nearest power-of-2 integer can be calculated by pow2ceil() or by
> > using APIC ID offset (like L3 topology using 1 << die_offset [3]).
> > 
> > But in fact, CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26]
> > are associated with APIC ID. For example, in linux kernel, the field
> > "num_threads_sharing" (Bits 25 - 14) is parsed with APIC ID. And for
> > another example, on Alder Lake P, the CPUID.04H:EAX[bits 31:26] is not
> > matched with actual core numbers and it's calculated by:
> > "(1 << (pkg_offset - core_offset)) - 1".
> > 
> > Therefore the offset of APIC ID should be preferred to calculate nearest
> > power-of-2 integer for CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits
> > 31:26]:
> > 1. d/i cache is shared in a core, 1 << core_offset should be used
> >    instead of "cs->nr_threads" in encode_cache_cpuid4() for
> >    CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits 25:14].
> > 2. L2 cache is supposed to be shared in a core as for now, thereby
> >    1 << core_offset should also be used instead of "cs->nr_threads" in
> >    encode_cache_cpuid4() for CPUID.04H.02H:EAX[bits 25:14].
> > 3. Similarly, the value for CPUID.04H:EAX[bits 31:26] should also be
> >    replaced by the offsets upper SMT level in APIC ID.
> > 
> > In addition, use APIC ID offset to replace "pow2ceil()" for
> > cache_info_passthrough case.
> > 
> > [1]: efb3934adf9e ("x86: cpu: make sure number of addressable IDs for processor cores meets the spec")
> > [2]: d7caf13b5fcf ("x86: cpu: fixup number of addressable IDs for logical processors sharing cache")
> > [3]: d65af288a84d ("i386: Update new x86_apicid parsing rules with die_offset support")
> > 
> > Fixes: 7e3482f82480 ("i386: Helpers to encode cache information consistently")
> > Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > ---
> > Changes since v1:
> >  * Use APIC ID offset to replace "pow2ceil()" for cache_info_passthrough
> >    case. (Yanan)
> >  * Split the L1 cache fix into a separate patch.
> >  * Rename the title of this patch (the original is "i386/cpu: Fix number
> >    of addressable IDs in CPUID.04H").
> > ---
> >  target/i386/cpu.c | 30 +++++++++++++++++++++++-------
> >  1 file changed, 23 insertions(+), 7 deletions(-)
> > 
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index b439a05244ee..c80613bfcded 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -6005,7 +6005,6 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >  {
> >      X86CPU *cpu = env_archcpu(env);
> >      CPUState *cs = env_cpu(env);
> > -    uint32_t die_offset;
> >      uint32_t limit;
> >      uint32_t signature[3];
> >      X86CPUTopoInfo topo_info;
> > @@ -6089,39 +6088,56 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >                  int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
> >                  int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
> >                  if (cs->nr_cores > 1) {
> > +                    int addressable_cores_offset =
> > +                                                apicid_pkg_offset(&topo_info) -
> > +                                                apicid_core_offset(&topo_info);
> > +
> >                      *eax &= ~0xFC000000;
> > -                    *eax |= (pow2ceil(cs->nr_cores) - 1) << 26;
> > +                    *eax |= (1 << addressable_cores_offset - 1) << 26;
> >                  }
> >                  if (host_vcpus_per_cache > vcpus_per_socket) {
> > +                    int pkg_offset = apicid_pkg_offset(&topo_info);
> > +
> >                      *eax &= ~0x3FFC000;
> > -                    *eax |= (pow2ceil(vcpus_per_socket) - 1) << 14;
> > +                    *eax |= (1 << pkg_offset - 1) << 14;
> >                  }
> >              }
> 
> I hit this compile error with this patch.
> 
> [1/18] Generating qemu-version.h with a custom command (wrapped by meson
> to capture output)
> [2/4] Compiling C object libqemu-x86_64-softmmu.fa.p/target_i386_cpu.c.o
> FAILED: libqemu-x86_64-softmmu.fa.p/target_i386_cpu.c.o
> ..
> ..
> softmmu.fa.p/target_i386_cpu.c.o -c ../target/i386/cpu.c
> ../target/i386/cpu.c: In function ‘cpu_x86_cpuid’:
> ../target/i386/cpu.c:6096:60: error: suggest parentheses around ‘-’ inside
> ‘<<’ [-Werror=parentheses]
>  6096 |                     *eax |= (1 << addressable_cores_offset - 1) << 26;
>       |                                   ~~~~~~~~~~~~~~~~~~~~~~~~~^~~
> ../target/i386/cpu.c:6102:46: error: suggest parentheses around ‘-’ inside
> ‘<<’ [-Werror=parentheses]
>  6102 |                     *eax |= (1 << pkg_offset - 1) << 14;
>       |                                   ~~~~~~~~~~~^~~
> cc1: all warnings being treated as errors
> 
> Please fix this.

Thanks for your test! Sorry, I missed this warning. I'll fix it.

Thanks,
Zhao

> 
> 
> >          } else if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
> >              *eax = *ebx = *ecx = *edx = 0;
> >          } else {
> >              *eax = 0;
> > +            int addressable_cores_offset = apicid_pkg_offset(&topo_info) -
> > +                                           apicid_core_offset(&topo_info);
> > +            int core_offset, die_offset;
> > +
> >              switch (count) {
> >              case 0: /* L1 dcache info */
> > +                core_offset = apicid_core_offset(&topo_info);
> >                  encode_cache_cpuid4(env->cache_info_cpuid4.l1d_cache,
> > -                                    cs->nr_threads, cs->nr_cores,
> > +                                    (1 << core_offset),
> > +                                    (1 << addressable_cores_offset),
> >                                      eax, ebx, ecx, edx);
> >                  break;
> >              case 1: /* L1 icache info */
> > +                core_offset = apicid_core_offset(&topo_info);
> >                  encode_cache_cpuid4(env->cache_info_cpuid4.l1i_cache,
> > -                                    cs->nr_threads, cs->nr_cores,
> > +                                    (1 << core_offset),
> > +                                    (1 << addressable_cores_offset),
> >                                      eax, ebx, ecx, edx);
> >                  break;
> >              case 2: /* L2 cache info */
> > +                core_offset = apicid_core_offset(&topo_info);
> >                  encode_cache_cpuid4(env->cache_info_cpuid4.l2_cache,
> > -                                    cs->nr_threads, cs->nr_cores,
> > +                                    (1 << core_offset),
> > +                                    (1 << addressable_cores_offset),
> >                                      eax, ebx, ecx, edx);
> >                  break;
> >              case 3: /* L3 cache info */
> >                  die_offset = apicid_die_offset(&topo_info);
> >                  if (cpu->enable_l3_cache) {
> >                      encode_cache_cpuid4(env->cache_info_cpuid4.l3_cache,
> > -                                        (1 << die_offset), cs->nr_cores,
> > +                                        (1 << die_offset),
> > +                                        (1 << addressable_cores_offset),
> >                                          eax, ebx, ecx, edx);
> >                      break;
> >                  }
> 
> -- 
> Thanks
> Babu Moger


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v3 06/17] i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
  2023-08-02 16:31   ` Moger, Babu
@ 2023-08-04  8:23     ` Zhao Liu
  0 siblings, 0 replies; 63+ messages in thread
From: Zhao Liu @ 2023-08-04  8:23 UTC (permalink / raw)
  To: Moger, Babu
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Xiaoyao Li, Zhao Liu,
	Robert Hoo

Hi Babu,

On Wed, Aug 02, 2023 at 11:31:46AM -0500, Moger, Babu wrote:
> Date: Wed, 2 Aug 2023 11:31:46 -0500
> From: "Moger, Babu" <babu.moger@amd.com>
> Subject: Re: [PATCH v3 06/17] i386/cpu: Consolidate the use of topo_info in
>  cpu_x86_cpuid()
> 
> Hi Zhao,
> 
> On 8/1/23 05:35, Zhao Liu wrote:
> > From: Zhao Liu <zhao1.liu@intel.com>
> > 
> > In cpu_x86_cpuid(), there are many variables representing the CPU
> > topology, e.g., topo_info, cs->nr_cores/cs->nr_threads.
> > 
> > Since the names of cs->nr_cores/cs->nr_threads do not accurately
> > represent their meaning, the use of cs->nr_cores/cs->nr_threads is prone
> > to confusion and mistakes.
> > 
> > And the structure X86CPUTopoInfo names its memebers clearly, thus the
> 
> s/memebers/members/

Thanks! I'll be more careful with my spelling.

-Zhao


> Thanks
> Babu
> 
> > variable "topo_info" should be preferred.
> > 
> > In addition, in cpu_x86_cpuid(), to uniformly use the topology variable,
> > replace env->dies with topo_info.dies_per_pkg as well.
> > 
> > Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > ---
> > Changes since v1:
> >  * Extract cores_per_socket from the code block and use it as a local
> >    variable for cpu_x86_cpuid(). (Yanan)
> >  * Remove vcpus_per_socket variable and use cpus_per_pkg directly.
> >    (Yanan)
> >  * Replace env->dies with topo_info.dies_per_pkg in cpu_x86_cpuid().
> > ---
> >  target/i386/cpu.c | 31 ++++++++++++++++++-------------
> >  1 file changed, 18 insertions(+), 13 deletions(-)
> > 
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index c80613bfcded..fc50bf98c60e 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -6008,11 +6008,16 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >      uint32_t limit;
> >      uint32_t signature[3];
> >      X86CPUTopoInfo topo_info;
> > +    uint32_t cores_per_pkg;
> > +    uint32_t cpus_per_pkg;
> >  
> >      topo_info.dies_per_pkg = env->nr_dies;
> >      topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
> >      topo_info.threads_per_core = cs->nr_threads;
> >  
> > +    cores_per_pkg = topo_info.cores_per_die * topo_info.dies_per_pkg;
> > +    cpus_per_pkg = cores_per_pkg * topo_info.threads_per_core;
> > +
> >      /* Calculate & apply limits for different index ranges */
> >      if (index >= 0xC0000000) {
> >          limit = env->cpuid_xlevel2;
> > @@ -6048,8 +6053,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >              *ecx |= CPUID_EXT_OSXSAVE;
> >          }
> >          *edx = env->features[FEAT_1_EDX];
> > -        if (cs->nr_cores * cs->nr_threads > 1) {
> > -            *ebx |= (cs->nr_cores * cs->nr_threads) << 16;
> > +        if (cpus_per_pkg > 1) {
> > +            *ebx |= cpus_per_pkg << 16;
> >              *edx |= CPUID_HT;
> >          }
> >          if (!cpu->enable_pmu) {
> > @@ -6086,8 +6091,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >               */
> >              if (*eax & 31) {
> >                  int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
> > -                int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
> > -                if (cs->nr_cores > 1) {
> > +
> > +                if (cores_per_pkg > 1) {
> >                      int addressable_cores_offset =
> >                                                  apicid_pkg_offset(&topo_info) -
> >                                                  apicid_core_offset(&topo_info);
> > @@ -6095,7 +6100,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >                      *eax &= ~0xFC000000;
> >                      *eax |= (1 << addressable_cores_offset - 1) << 26;
> >                  }
> > -                if (host_vcpus_per_cache > vcpus_per_socket) {
> > +                if (host_vcpus_per_cache > cpus_per_pkg) {
> >                      int pkg_offset = apicid_pkg_offset(&topo_info);
> >  
> >                      *eax &= ~0x3FFC000;
> > @@ -6240,12 +6245,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >          switch (count) {
> >          case 0:
> >              *eax = apicid_core_offset(&topo_info);
> > -            *ebx = cs->nr_threads;
> > +            *ebx = topo_info.threads_per_core;
> >              *ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
> >              break;
> >          case 1:
> >              *eax = apicid_pkg_offset(&topo_info);
> > -            *ebx = cs->nr_cores * cs->nr_threads;
> > +            *ebx = cpus_per_pkg;
> >              *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
> >              break;
> >          default:
> > @@ -6266,7 +6271,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >          break;
> >      case 0x1F:
> >          /* V2 Extended Topology Enumeration Leaf */
> > -        if (env->nr_dies < 2) {
> > +        if (topo_info.dies_per_pkg < 2) {
> >              *eax = *ebx = *ecx = *edx = 0;
> >              break;
> >          }
> > @@ -6276,7 +6281,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >          switch (count) {
> >          case 0:
> >              *eax = apicid_core_offset(&topo_info);
> > -            *ebx = cs->nr_threads;
> > +            *ebx = topo_info.threads_per_core;
> >              *ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
> >              break;
> >          case 1:
> > @@ -6286,7 +6291,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >              break;
> >          case 2:
> >              *eax = apicid_pkg_offset(&topo_info);
> > -            *ebx = cs->nr_cores * cs->nr_threads;
> > +            *ebx = cpus_per_pkg;
> >              *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
> >              break;
> >          default:
> > @@ -6511,7 +6516,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >           * discards multiple thread information if it is set.
> >           * So don't set it here for Intel to make Linux guests happy.
> >           */
> > -        if (cs->nr_cores * cs->nr_threads > 1) {
> > +        if (cpus_per_pkg > 1) {
> >              if (env->cpuid_vendor1 != CPUID_VENDOR_INTEL_1 ||
> >                  env->cpuid_vendor2 != CPUID_VENDOR_INTEL_2 ||
> >                  env->cpuid_vendor3 != CPUID_VENDOR_INTEL_3) {
> > @@ -6577,7 +6582,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >               *eax |= (cpu_x86_virtual_addr_width(env) << 8);
> >          }
> >          *ebx = env->features[FEAT_8000_0008_EBX];
> > -        if (cs->nr_cores * cs->nr_threads > 1) {
> > +        if (cpus_per_pkg > 1) {
> >              /*
> >               * Bits 15:12 is "The number of bits in the initial
> >               * Core::X86::Apic::ApicId[ApicId] value that indicate
> > @@ -6585,7 +6590,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >               * Bits 7:0 is "The number of threads in the package is NC+1"
> >               */
> >              *ecx = (apicid_pkg_offset(&topo_info) << 12) |
> > -                   ((cs->nr_cores * cs->nr_threads) - 1);
> > +                   (cpus_per_pkg - 1);
> >          } else {
> >              *ecx = 0;
> >          }
> 
> -- 
> Thanks
> Babu Moger


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v3 08/17] i386: Support modules_per_die in X86CPUTopoInfo
  2023-08-02 17:25   ` Moger, Babu
@ 2023-08-04  9:05     ` Zhao Liu
  0 siblings, 0 replies; 63+ messages in thread
From: Zhao Liu @ 2023-08-04  9:05 UTC (permalink / raw)
  To: Moger, Babu
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Xiaoyao Li, Zhao Liu,
	Zhuocheng Ding

Hi Babu,

On Wed, Aug 02, 2023 at 12:25:07PM -0500, Moger, Babu wrote:
> Date: Wed, 2 Aug 2023 12:25:07 -0500
> From: "Moger, Babu" <babu.moger@amd.com>
> Subject: Re: [PATCH v3 08/17] i386: Support modules_per_die in
>  X86CPUTopoInfo
> 
> Hi Zhao,
> 
> On 8/1/23 05:35, Zhao Liu wrote:
> > From: Zhuocheng Ding <zhuocheng.ding@intel.com>
> > 
> > Support module level in i386 cpu topology structure "X86CPUTopoInfo".
> > 
> > Since x86 does not yet support the "clusters" parameter in "-smp",
> > X86CPUTopoInfo.modules_per_die is currently always 1. Therefore, the
> > module level width in APIC ID, which can be calculated by
> > "apicid_bitwidth_for_count(topo_info->modules_per_die)", is always 0
> > for now, so we can directly add APIC ID related helpers to support
> > module level parsing.
> > 
> > At present, we don't expose module level in CPUID.1FH because currently
> > linux (v6.4-rc1) doesn't support module level. And exposing module and
> > die levels at the same time in CPUID.1FH will cause linux to calculate
> > the wrong die_id. The module level should not be exposed until the real
> > machine has the module level in CPUID.1FH.
> > 
> > In addition, update topology structure in test-x86-topo.c.
> > 
> > Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> > Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > Acked-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> > Changes since v1:
> >  * Include module level related helpers (apicid_module_width() and
> >    apicid_module_offset()) in this patch. (Yanan)
> > ---
> >  hw/i386/x86.c              |  3 ++-
> >  include/hw/i386/topology.h | 22 +++++++++++++++----
> >  target/i386/cpu.c          | 12 ++++++----
> >  tests/unit/test-x86-topo.c | 45 ++++++++++++++++++++------------------
> >  4 files changed, 52 insertions(+), 30 deletions(-)
> > 
> > diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> > index 4efc390905ff..a552ae8bb4a8 100644
> > --- a/hw/i386/x86.c
> > +++ b/hw/i386/x86.c
> > @@ -72,7 +72,8 @@ static void init_topo_info(X86CPUTopoInfo *topo_info,
> >      MachineState *ms = MACHINE(x86ms);
> >  
> >      topo_info->dies_per_pkg = ms->smp.dies;
> > -    topo_info->cores_per_die = ms->smp.cores;
> > +    topo_info->modules_per_die = ms->smp.clusters;
> 
> It is confusing. You said in the previous patch, using the clusters for
> x86 is going to cause compatibility issues. 

The compatibility issue means the default L2 cache topology should be "1
L2 cache per core", and we shouldn't change this default setting.

If we want "1 L2 cache per module" instead, then we need another way to do
this (this is what x-l2-cache-topo is for).

Since "cluster" was originally introduced into QEMU to help define the
L2 cache topology, I explained that we can't just change the default
topology level of L2.

> Why is clusters used to initialize modules_per_die?

"cluster" v.s. "module" just like "socket" v.s. "package".

The former is the generic name in smp code, while the latter is the more
accurate naming in the i386 context.

> 
> Why not define a new field "modules" (just like clusters) in smp and use it
> for x86? Is it going to be a problem?

In this case (just adding a new "module" to smp), the "cluster" parameter
of smp would not be useful for i386, and different architectures would end
up with different smp parameters, which is not general enough. I think it's
clearest to have a common topology hierarchy in QEMU.

"cluster" was originally introduced into QEMU by Arm. From Yanan's
explanation [1], it is a CPU topology level above the core level, and L2 is
often shared at this level as well.

This description is very similar to i386's module, so I think we could align
cluster with module instead of introducing a new "module" in smp, just like
"socket" in smp is the same as "package" in i386.

[1]: https://patchew.org/QEMU/20211228092221.21068-1-wangyanan55@huawei.com/

> Maybe I am not clear here. I am yet to understand all the other changes.
> 

I hope the explanation above addresses your question.

Thanks,
Zhao

> Thanks
> Babu
> 
> > +    topo_info->cores_per_module = ms->smp.cores;
> >      topo_info->threads_per_core = ms->smp.threads;
> >  }
> >  
> > diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> > index 5a19679f618b..c807d3811dd3 100644
> > --- a/include/hw/i386/topology.h
> > +++ b/include/hw/i386/topology.h
> > @@ -56,7 +56,8 @@ typedef struct X86CPUTopoIDs {
> >  
> >  typedef struct X86CPUTopoInfo {
> >      unsigned dies_per_pkg;
> > -    unsigned cores_per_die;
> > +    unsigned modules_per_die;
> > +    unsigned cores_per_module;
> >      unsigned threads_per_core;
> >  } X86CPUTopoInfo;
> >  
> > @@ -77,7 +78,13 @@ static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
> >  /* Bit width of the Core_ID field */
> >  static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
> >  {
> > -    return apicid_bitwidth_for_count(topo_info->cores_per_die);
> > +    return apicid_bitwidth_for_count(topo_info->cores_per_module);
> > +}
> > +
> > +/* Bit width of the Module_ID (cluster ID) field */
> > +static inline unsigned apicid_module_width(X86CPUTopoInfo *topo_info)
> > +{
> > +    return apicid_bitwidth_for_count(topo_info->modules_per_die);
> >  }
> >  
> >  /* Bit width of the Die_ID field */
> > @@ -92,10 +99,16 @@ static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
> >      return apicid_smt_width(topo_info);
> >  }
> >  
> > +/* Bit offset of the Module_ID (cluster ID) field */
> > +static inline unsigned apicid_module_offset(X86CPUTopoInfo *topo_info)
> > +{
> > +    return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
> > +}
> > +
> >  /* Bit offset of the Die_ID field */
> >  static inline unsigned apicid_die_offset(X86CPUTopoInfo *topo_info)
> >  {
> > -    return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
> > +    return apicid_module_offset(topo_info) + apicid_module_width(topo_info);
> >  }
> >  
> >  /* Bit offset of the Pkg_ID (socket ID) field */
> > @@ -127,7 +140,8 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
> >                                           X86CPUTopoIDs *topo_ids)
> >  {
> >      unsigned nr_dies = topo_info->dies_per_pkg;
> > -    unsigned nr_cores = topo_info->cores_per_die;
> > +    unsigned nr_cores = topo_info->cores_per_module *
> > +                        topo_info->modules_per_die;
> >      unsigned nr_threads = topo_info->threads_per_core;
> >  
> >      topo_ids->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 8a9fd5682efc..d6969813ee02 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -339,7 +339,9 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
> >  
> >      /* L3 is shared among multiple cores */
> >      if (cache->level == 3) {
> > -        l3_threads = topo_info->cores_per_die * topo_info->threads_per_core;
> > +        l3_threads = topo_info->modules_per_die *
> > +                     topo_info->cores_per_module *
> > +                     topo_info->threads_per_core;
> >          *eax |= (l3_threads - 1) << 14;
> >      } else {
> >          *eax |= ((topo_info->threads_per_core - 1) << 14);
> > @@ -6012,10 +6014,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >      uint32_t cpus_per_pkg;
> >  
> >      topo_info.dies_per_pkg = env->nr_dies;
> > -    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
> > +    topo_info.modules_per_die = env->nr_modules;
> > +    topo_info.cores_per_module = cs->nr_cores / env->nr_dies / env->nr_modules;
> >      topo_info.threads_per_core = cs->nr_threads;
> >  
> > -    cores_per_pkg = topo_info.cores_per_die * topo_info.dies_per_pkg;
> > +    cores_per_pkg = topo_info.cores_per_module * topo_info.modules_per_die *
> > +                    topo_info.dies_per_pkg;
> >      cpus_per_pkg = cores_per_pkg * topo_info.threads_per_core;
> >  
> >      /* Calculate & apply limits for different index ranges */
> > @@ -6286,7 +6290,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >              break;
> >          case 1:
> >              *eax = apicid_die_offset(&topo_info);
> > -            *ebx = topo_info.cores_per_die * topo_info.threads_per_core;
> > +            *ebx = cpus_per_pkg / topo_info.dies_per_pkg;
> >              *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
> >              break;
> >          case 2:
> > diff --git a/tests/unit/test-x86-topo.c b/tests/unit/test-x86-topo.c
> > index 2b104f86d7c2..f21b8a5d95c2 100644
> > --- a/tests/unit/test-x86-topo.c
> > +++ b/tests/unit/test-x86-topo.c
> > @@ -30,13 +30,16 @@ static void test_topo_bits(void)
> >  {
> >      X86CPUTopoInfo topo_info = {0};
> >  
> > -    /* simple tests for 1 thread per core, 1 core per die, 1 die per package */
> > -    topo_info = (X86CPUTopoInfo) {1, 1, 1};
> > +    /*
> > +     * simple tests for 1 thread per core, 1 core per module,
> > +     *                  1 module per die, 1 die per package
> > +     */
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 1};
> >      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 0);
> >      g_assert_cmpuint(apicid_core_width(&topo_info), ==, 0);
> >      g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
> >  
> > -    topo_info = (X86CPUTopoInfo) {1, 1, 1};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 1};
> >      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
> >      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
> >      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
> > @@ -45,39 +48,39 @@ static void test_topo_bits(void)
> >  
> >      /* Test field width calculation for multiple values
> >       */
> > -    topo_info = (X86CPUTopoInfo) {1, 1, 2};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 2};
> >      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 1);
> > -    topo_info = (X86CPUTopoInfo) {1, 1, 3};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 3};
> >      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
> > -    topo_info = (X86CPUTopoInfo) {1, 1, 4};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 4};
> >      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
> >  
> > -    topo_info = (X86CPUTopoInfo) {1, 1, 14};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 14};
> >      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
> > -    topo_info = (X86CPUTopoInfo) {1, 1, 15};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 15};
> >      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
> > -    topo_info = (X86CPUTopoInfo) {1, 1, 16};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 16};
> >      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
> > -    topo_info = (X86CPUTopoInfo) {1, 1, 17};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 17};
> >      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 5);
> >  
> >  
> > -    topo_info = (X86CPUTopoInfo) {1, 30, 2};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 30, 2};
> >      g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
> > -    topo_info = (X86CPUTopoInfo) {1, 31, 2};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 31, 2};
> >      g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
> > -    topo_info = (X86CPUTopoInfo) {1, 32, 2};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 32, 2};
> >      g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
> > -    topo_info = (X86CPUTopoInfo) {1, 33, 2};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 33, 2};
> >      g_assert_cmpuint(apicid_core_width(&topo_info), ==, 6);
> >  
> > -    topo_info = (X86CPUTopoInfo) {1, 30, 2};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 30, 2};
> >      g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
> > -    topo_info = (X86CPUTopoInfo) {2, 30, 2};
> > +    topo_info = (X86CPUTopoInfo) {2, 1, 30, 2};
> >      g_assert_cmpuint(apicid_die_width(&topo_info), ==, 1);
> > -    topo_info = (X86CPUTopoInfo) {3, 30, 2};
> > +    topo_info = (X86CPUTopoInfo) {3, 1, 30, 2};
> >      g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
> > -    topo_info = (X86CPUTopoInfo) {4, 30, 2};
> > +    topo_info = (X86CPUTopoInfo) {4, 1, 30, 2};
> >      g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
> >  
> >      /* build a weird topology and see if IDs are calculated correctly
> > @@ -85,18 +88,18 @@ static void test_topo_bits(void)
> >  
> >      /* This will use 2 bits for thread ID and 3 bits for core ID
> >       */
> > -    topo_info = (X86CPUTopoInfo) {1, 6, 3};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
> >      g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
> >      g_assert_cmpuint(apicid_core_offset(&topo_info), ==, 2);
> >      g_assert_cmpuint(apicid_die_offset(&topo_info), ==, 5);
> >      g_assert_cmpuint(apicid_pkg_offset(&topo_info), ==, 5);
> >  
> > -    topo_info = (X86CPUTopoInfo) {1, 6, 3};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
> >      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
> >      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
> >      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
> >  
> > -    topo_info = (X86CPUTopoInfo) {1, 6, 3};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
> >      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 0), ==,
> >                       (1 << 2) | 0);
> >      g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 1), ==,
> 
> -- 
> Thanks
> Babu Moger


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v3 10/17] i386/cpu: Introduce cluster-id to X86CPU
  2023-08-02 22:44   ` Moger, Babu
@ 2023-08-04  9:06     ` Zhao Liu
  0 siblings, 0 replies; 63+ messages in thread
From: Zhao Liu @ 2023-08-04  9:06 UTC (permalink / raw)
  To: Moger, Babu
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Xiaoyao Li, Zhao Liu,
	Zhuocheng Ding

Hi Babu,

On Wed, Aug 02, 2023 at 05:44:38PM -0500, Moger, Babu wrote:
> Date: Wed, 2 Aug 2023 17:44:38 -0500
> From: "Moger, Babu" <babu.moger@amd.com>
> Subject: Re: [PATCH v3 10/17] i386/cpu: Introduce cluster-id to X86CPU
> 
> Hi Zhao,
> 
> On 8/1/23 05:35, Zhao Liu wrote:
> > From: Zhuocheng Ding <zhuocheng.ding@intel.com>
> > 
> > We introduce cluster-id other than module-id to be consistent with
> 
> s/We introduce/Introduce/

Thanks! Will fix.

-Zhao

> 
> Thanks
> Babu
> 
> > CpuInstanceProperties.cluster-id, and this avoids the confusion
> > of parameter names when hotplugging.
> > 
> > Following the legacy smp check rules, also add the cluster_id validity
> > into x86_cpu_pre_plug().
> > 
> > Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> > Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > Acked-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> >  hw/i386/x86.c     | 33 +++++++++++++++++++++++++--------
> >  target/i386/cpu.c |  2 ++
> >  target/i386/cpu.h |  1 +
> >  3 files changed, 28 insertions(+), 8 deletions(-)
> > 
> > diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> > index 0b460fd6074d..8154b86f95c7 100644
> > --- a/hw/i386/x86.c
> > +++ b/hw/i386/x86.c
> > @@ -328,6 +328,14 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
> >              cpu->die_id = 0;
> >          }
> >  
> > +        /*
> > +         * cluster-id was optional in QEMU 8.0 and older, so keep it optional
> > +         * if there's only one cluster per die.
> > +         */
> > +        if (cpu->cluster_id < 0 && ms->smp.clusters == 1) {
> > +            cpu->cluster_id = 0;
> > +        }
> > +
> >          if (cpu->socket_id < 0) {
> >              error_setg(errp, "CPU socket-id is not set");
> >              return;
> > @@ -344,6 +352,14 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
> >                         cpu->die_id, ms->smp.dies - 1);
> >              return;
> >          }
> > +        if (cpu->cluster_id < 0) {
> > +            error_setg(errp, "CPU cluster-id is not set");
> > +            return;
> > +        } else if (cpu->cluster_id > ms->smp.clusters - 1) {
> > +            error_setg(errp, "Invalid CPU cluster-id: %u must be in range 0:%u",
> > +                       cpu->cluster_id, ms->smp.clusters - 1);
> > +            return;
> > +        }
> >          if (cpu->core_id < 0) {
> >              error_setg(errp, "CPU core-id is not set");
> >              return;
> > @@ -363,16 +379,9 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
> >  
> >          topo_ids.pkg_id = cpu->socket_id;
> >          topo_ids.die_id = cpu->die_id;
> > +        topo_ids.module_id = cpu->cluster_id;
> >          topo_ids.core_id = cpu->core_id;
> >          topo_ids.smt_id = cpu->thread_id;
> > -
> > -        /*
> > -         * TODO: This is the temporary initialization for topo_ids.module_id to
> > -         * avoid "maybe-uninitialized" compilation errors. Will remove when
> > -         * X86CPU supports cluster_id.
> > -         */
> > -        topo_ids.module_id = 0;
> > -
> >          cpu->apic_id = x86_apicid_from_topo_ids(&topo_info, &topo_ids);
> >      }
> >  
> > @@ -419,6 +428,14 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
> >      }
> >      cpu->die_id = topo_ids.die_id;
> >  
> > +    if (cpu->cluster_id != -1 && cpu->cluster_id != topo_ids.module_id) {
> > +        error_setg(errp, "property cluster-id: %u doesn't match set apic-id:"
> > +            " 0x%x (cluster-id: %u)", cpu->cluster_id, cpu->apic_id,
> > +            topo_ids.module_id);
> > +        return;
> > +    }
> > +    cpu->cluster_id = topo_ids.module_id;
> > +
> >      if (cpu->core_id != -1 && cpu->core_id != topo_ids.core_id) {
> >          error_setg(errp, "property core-id: %u doesn't match set apic-id:"
> >              " 0x%x (core-id: %u)", cpu->core_id, cpu->apic_id,
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index d6969813ee02..ffa282219078 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -7806,12 +7806,14 @@ static Property x86_cpu_properties[] = {
> >      DEFINE_PROP_UINT32("apic-id", X86CPU, apic_id, 0),
> >      DEFINE_PROP_INT32("thread-id", X86CPU, thread_id, 0),
> >      DEFINE_PROP_INT32("core-id", X86CPU, core_id, 0),
> > +    DEFINE_PROP_INT32("cluster-id", X86CPU, cluster_id, 0),
> >      DEFINE_PROP_INT32("die-id", X86CPU, die_id, 0),
> >      DEFINE_PROP_INT32("socket-id", X86CPU, socket_id, 0),
> >  #else
> >      DEFINE_PROP_UINT32("apic-id", X86CPU, apic_id, UNASSIGNED_APIC_ID),
> >      DEFINE_PROP_INT32("thread-id", X86CPU, thread_id, -1),
> >      DEFINE_PROP_INT32("core-id", X86CPU, core_id, -1),
> > +    DEFINE_PROP_INT32("cluster-id", X86CPU, cluster_id, -1),
> >      DEFINE_PROP_INT32("die-id", X86CPU, die_id, -1),
> >      DEFINE_PROP_INT32("socket-id", X86CPU, socket_id, -1),
> >  #endif
> > diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> > index 5e97d0b76574..d9577938ae04 100644
> > --- a/target/i386/cpu.h
> > +++ b/target/i386/cpu.h
> > @@ -2034,6 +2034,7 @@ struct ArchCPU {
> >      int32_t node_id; /* NUMA node this CPU belongs to */
> >      int32_t socket_id;
> >      int32_t die_id;
> > +    int32_t cluster_id;
> >      int32_t core_id;
> >      int32_t thread_id;
> >  
> 
> -- 
> Thanks
> Babu Moger
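
As a side note, the validity rules the hunks above add can be summarized by a
small sketch (a hypothetical Python helper mirroring the C checks, not QEMU
code):

```python
def check_cluster_id(cluster_id, nr_clusters):
    # cluster-id may be omitted (modeled here as -1) only when there is a
    # single cluster per die, to stay compatible with QEMU 8.0 and older.
    if cluster_id < 0 and nr_clusters == 1:
        cluster_id = 0
    if cluster_id < 0:
        raise ValueError("CPU cluster-id is not set")
    if cluster_id > nr_clusters - 1:
        raise ValueError(f"Invalid CPU cluster-id: {cluster_id} must be "
                         f"in range 0:{nr_clusters - 1}")
    return cluster_id
```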



* Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode CPUID[4]
  2023-08-03 16:41     ` Moger, Babu
@ 2023-08-04  9:48       ` Zhao Liu
  2023-08-04 15:48         ` Moger, Babu
  0 siblings, 1 reply; 63+ messages in thread
From: Zhao Liu @ 2023-08-04  9:48 UTC (permalink / raw)
  To: Moger, Babu
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Xiaoyao Li, Zhao Liu

Hi Babu,

On Thu, Aug 03, 2023 at 11:41:40AM -0500, Moger, Babu wrote:
> Date: Thu, 3 Aug 2023 11:41:40 -0500
> From: "Moger, Babu" <babu.moger@amd.com>
> Subject: Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode
>  CPUID[4]
> 
> Hi Zhao,
> 
> On 8/2/23 18:49, Moger, Babu wrote:
> > Hi Zhao,
> > 
> > Hitting this error after this patch.
> > 
> > ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code should
> > not be reached
> > Bail out! ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code
> > should not be reached
> > Aborted (core dumped)
> > 
> > Looks like share_level for all the caches for AMD is not initialized.

I missed these changes when I rebased. Sorry for that.

BTW, could I ask a question? From a previous discussion [1], I understand
that the cache info is used to show the correct cache information on new
machine types. And from [2], wrong cache info may cause "compatibility
issues".

Are these "compatibility issues" AMD-specific? I'm not sure whether Intel
should update its cache info in the same way. Thanks!

[1]: https://patchwork.kernel.org/project/kvm/patch/CY4PR12MB1768A3CBE42AAFB03CB1081E95AA0@CY4PR12MB1768.namprd12.prod.outlook.com/
[2]: https://lore.kernel.org/qemu-devel/20180510204148.11687-1-babu.moger@amd.com/

> 
> The following patch fixes the problem.
> 
> ======================================================
> 
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index f4c48e19fa..976a2755d8 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -528,6 +528,7 @@ static CPUCacheInfo legacy_l2_cache_cpuid2 = {
>      .size = 2 * MiB,
>      .line_size = 64,
>      .associativity = 8,
> +    .share_level = CPU_TOPO_LEVEL_CORE,

This "legacy_l2_cache_cpuid2" is not used to encode cache topology.
I should explicitly set its default topology level to CPU_TOPO_LEVEL_UNKNOW.

>  };
> 
> 
> @@ -1904,6 +1905,7 @@ static CPUCaches epyc_v4_cache_info = {
>          .lines_per_tag = 1,
>          .self_init = 1,
>          .no_invd_sharing = true,
> +        .share_level = CPU_TOPO_LEVEL_CORE,
>      },
>      .l1i_cache = &(CPUCacheInfo) {
>          .type = INSTRUCTION_CACHE,
> @@ -1916,6 +1918,7 @@ static CPUCaches epyc_v4_cache_info = {
>          .lines_per_tag = 1,
>          .self_init = 1,
>          .no_invd_sharing = true,
> +        .share_level = CPU_TOPO_LEVEL_CORE,
>      },
>      .l2_cache = &(CPUCacheInfo) {
>          .type = UNIFIED_CACHE,
> @@ -1926,6 +1929,7 @@ static CPUCaches epyc_v4_cache_info = {
>          .partitions = 1,
>          .sets = 1024,
>          .lines_per_tag = 1,
> +        .share_level = CPU_TOPO_LEVEL_CORE,
>      },
>      .l3_cache = &(CPUCacheInfo) {
>          .type = UNIFIED_CACHE,
> @@ -1939,6 +1943,7 @@ static CPUCaches epyc_v4_cache_info = {
>          .self_init = true,
>          .inclusive = true,
>          .complex_indexing = false,
> +        .share_level = CPU_TOPO_LEVEL_DIE,
>      },
>  };
> 
> @@ -2008,6 +2013,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
>          .lines_per_tag = 1,
>          .self_init = 1,
>          .no_invd_sharing = true,
> +        .share_level = CPU_TOPO_LEVEL_CORE,
>      },
>      .l1i_cache = &(CPUCacheInfo) {
>          .type = INSTRUCTION_CACHE,
> @@ -2020,6 +2026,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
>          .lines_per_tag = 1,
>          .self_init = 1,
>          .no_invd_sharing = true,
> +        .share_level = CPU_TOPO_LEVEL_CORE,
>      },
>      .l2_cache = &(CPUCacheInfo) {
>          .type = UNIFIED_CACHE,
> @@ -2030,6 +2037,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
>          .partitions = 1,
>          .sets = 1024,
>          .lines_per_tag = 1,
> +        .share_level = CPU_TOPO_LEVEL_CORE,
>      },
>      .l3_cache = &(CPUCacheInfo) {
>          .type = UNIFIED_CACHE,
> @@ -2043,6 +2051,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
>          .self_init = true,
>          .inclusive = true,
>          .complex_indexing = false,
> +        .share_level = CPU_TOPO_LEVEL_DIE,
>      },
>  };
> 
> @@ -2112,6 +2121,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
>          .lines_per_tag = 1,
>          .self_init = 1,
>          .no_invd_sharing = true,
> +        .share_level = CPU_TOPO_LEVEL_CORE,
>      },
>      .l1i_cache = &(CPUCacheInfo) {
>          .type = INSTRUCTION_CACHE,
> @@ -2124,6 +2134,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
>          .lines_per_tag = 1,
>          .self_init = 1,
>          .no_invd_sharing = true,
> +        .share_level = CPU_TOPO_LEVEL_CORE,
>      },
>      .l2_cache = &(CPUCacheInfo) {
>          .type = UNIFIED_CACHE,
> @@ -2134,6 +2145,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
>          .partitions = 1,
>          .sets = 1024,
>          .lines_per_tag = 1,
> +        .share_level = CPU_TOPO_LEVEL_CORE,
>      },
>      .l3_cache = &(CPUCacheInfo) {
>          .type = UNIFIED_CACHE,
> @@ -2147,6 +2159,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
>          .self_init = true,
>          .inclusive = true,
>          .complex_indexing = false,
> +        .share_level = CPU_TOPO_LEVEL_DIE,
>      },
>  };
> 
> @@ -2162,6 +2175,7 @@ static const CPUCaches epyc_genoa_cache_info = {
>          .lines_per_tag = 1,
>          .self_init = 1,
>          .no_invd_sharing = true,
> +        .share_level = CPU_TOPO_LEVEL_CORE,
>      },
>      .l1i_cache = &(CPUCacheInfo) {
>          .type = INSTRUCTION_CACHE,
> @@ -2174,6 +2188,7 @@ static const CPUCaches epyc_genoa_cache_info = {
>          .lines_per_tag = 1,
>          .self_init = 1,
>          .no_invd_sharing = true,
> +        .share_level = CPU_TOPO_LEVEL_CORE,
>      },
>      .l2_cache = &(CPUCacheInfo) {
>          .type = UNIFIED_CACHE,
> @@ -2184,6 +2199,7 @@ static const CPUCaches epyc_genoa_cache_info = {
>          .partitions = 1,
>          .sets = 2048,
>          .lines_per_tag = 1,
> +        .share_level = CPU_TOPO_LEVEL_CORE,
>      },
>      .l3_cache = &(CPUCacheInfo) {
>          .type = UNIFIED_CACHE,
> @@ -2197,6 +2213,7 @@ static const CPUCaches epyc_genoa_cache_info = {
>          .self_init = true,
>          .inclusive = true,
>          .complex_indexing = false,
> +        .share_level = CPU_TOPO_LEVEL_DIE,
>      },
>  };
> 
> 
> =========================================================================


Looks good to me except for legacy_l2_cache_cpuid2, thanks very much!
I'll add this in the next version.

-Zhao

> 
> Thanks
> Babu
> > 
> > On 8/1/23 05:35, Zhao Liu wrote:
> >> From: Zhao Liu <zhao1.liu@intel.com>
> >>
> >> CPUID[4].EAX[bits 25:14] is used to represent the cache topology for
> >> intel CPUs.
> >>
> >> After cache models have topology information, we can use
> >> CPUCacheInfo.share_level to decide which topology level to be encoded
> >> into CPUID[4].EAX[bits 25:14].
> >>
> >> And since maximum_processor_id (original "num_apic_ids") is parsed
> >> based on cpu topology levels, which are verified when parsing smp, it's
> >> no need to check this value by "assert(num_apic_ids > 0)" again, so
> >> remove this assert.
> >>
> >> Additionally, wrap the encoding of CPUID[4].EAX[bits 31:26] into a
> >> helper to make the code cleaner.
> >>
> >> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> >> ---
> >> Changes since v1:
> >>  * Use "enum CPUTopoLevel share_level" as the parameter in
> >>    max_processor_ids_for_cache().
> >>  * Make cache_into_passthrough case also use
> >>    max_processor_ids_for_cache() and max_core_ids_in_package() to
> >>    encode CPUID[4]. (Yanan)
> >>  * Rename the title of this patch (the original is "i386: Use
> >>    CPUCacheInfo.share_level to encode CPUID[4].EAX[bits 25:14]").
> >> ---
> >>  target/i386/cpu.c | 70 +++++++++++++++++++++++++++++------------------
> >>  1 file changed, 43 insertions(+), 27 deletions(-)
> >>
> >> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> >> index 55aba4889628..c9897c0fe91a 100644
> >> --- a/target/i386/cpu.c
> >> +++ b/target/i386/cpu.c
> >> @@ -234,22 +234,53 @@ static uint8_t cpuid2_cache_descriptor(CPUCacheInfo *cache)
> >>                         ((t) == UNIFIED_CACHE) ? CACHE_TYPE_UNIFIED : \
> >>                         0 /* Invalid value */)
> >>  
> >> +static uint32_t max_processor_ids_for_cache(X86CPUTopoInfo *topo_info,
> >> +                                            enum CPUTopoLevel share_level)
> >> +{
> >> +    uint32_t num_ids = 0;
> >> +
> >> +    switch (share_level) {
> >> +    case CPU_TOPO_LEVEL_CORE:
> >> +        num_ids = 1 << apicid_core_offset(topo_info);
> >> +        break;
> >> +    case CPU_TOPO_LEVEL_DIE:
> >> +        num_ids = 1 << apicid_die_offset(topo_info);
> >> +        break;
> >> +    case CPU_TOPO_LEVEL_PACKAGE:
> >> +        num_ids = 1 << apicid_pkg_offset(topo_info);
> >> +        break;
> >> +    default:
> >> +        /*
> >> +         * Currently there is no use case for SMT and MODULE, so use
> >> +         * assert directly to facilitate debugging.
> >> +         */
> >> +        g_assert_not_reached();
> >> +    }
> >> +
> >> +    return num_ids - 1;
> >> +}
> >> +
> >> +static uint32_t max_core_ids_in_package(X86CPUTopoInfo *topo_info)
> >> +{
> >> +    uint32_t num_cores = 1 << (apicid_pkg_offset(topo_info) -
> >> +                               apicid_core_offset(topo_info));
> >> +    return num_cores - 1;
> >> +}
> >>  
> >>  /* Encode cache info for CPUID[4] */
> >>  static void encode_cache_cpuid4(CPUCacheInfo *cache,
> >> -                                int num_apic_ids, int num_cores,
> >> +                                X86CPUTopoInfo *topo_info,
> >>                                  uint32_t *eax, uint32_t *ebx,
> >>                                  uint32_t *ecx, uint32_t *edx)
> >>  {
> >>      assert(cache->size == cache->line_size * cache->associativity *
> >>                            cache->partitions * cache->sets);
> >>  
> >> -    assert(num_apic_ids > 0);
> >>      *eax = CACHE_TYPE(cache->type) |
> >>             CACHE_LEVEL(cache->level) |
> >>             (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0) |
> >> -           ((num_cores - 1) << 26) |
> >> -           ((num_apic_ids - 1) << 14);
> >> +           (max_core_ids_in_package(topo_info) << 26) |
> >> +           (max_processor_ids_for_cache(topo_info, cache->share_level) << 14);
> >>  
> >>      assert(cache->line_size > 0);
> >>      assert(cache->partitions > 0);
> >> @@ -6116,56 +6147,41 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >>                  int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
> >>  
> >>                  if (cores_per_pkg > 1) {
> >> -                    int addressable_cores_offset =
> >> -                                                apicid_pkg_offset(&topo_info) -
> >> -                                                apicid_core_offset(&topo_info);
> >> -
> >>                      *eax &= ~0xFC000000;
> >> -                    *eax |= (1 << addressable_cores_offset - 1) << 26;
> >> +                    *eax |= max_core_ids_in_package(&topo_info) << 26;
> >>                  }
> >>                  if (host_vcpus_per_cache > cpus_per_pkg) {
> >> -                    int pkg_offset = apicid_pkg_offset(&topo_info);
> >> -
> >>                      *eax &= ~0x3FFC000;
> >> -                    *eax |= (1 << pkg_offset - 1) << 14;
> >> +                    *eax |=
> >> +                        max_processor_ids_for_cache(&topo_info,
> >> +                                                CPU_TOPO_LEVEL_PACKAGE) << 14;
> >>                  }
> >>              }
> >>          } else if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
> >>              *eax = *ebx = *ecx = *edx = 0;
> >>          } else {
> >>              *eax = 0;
> >> -            int addressable_cores_offset = apicid_pkg_offset(&topo_info) -
> >> -                                           apicid_core_offset(&topo_info);
> >> -            int core_offset, die_offset;
> >>  
> >>              switch (count) {
> >>              case 0: /* L1 dcache info */
> >> -                core_offset = apicid_core_offset(&topo_info);
> >>                  encode_cache_cpuid4(env->cache_info_cpuid4.l1d_cache,
> >> -                                    (1 << core_offset),
> >> -                                    (1 << addressable_cores_offset),
> >> +                                    &topo_info,
> >>                                      eax, ebx, ecx, edx);
> >>                  break;
> >>              case 1: /* L1 icache info */
> >> -                core_offset = apicid_core_offset(&topo_info);
> >>                  encode_cache_cpuid4(env->cache_info_cpuid4.l1i_cache,
> >> -                                    (1 << core_offset),
> >> -                                    (1 << addressable_cores_offset),
> >> +                                    &topo_info,
> >>                                      eax, ebx, ecx, edx);
> >>                  break;
> >>              case 2: /* L2 cache info */
> >> -                core_offset = apicid_core_offset(&topo_info);
> >>                  encode_cache_cpuid4(env->cache_info_cpuid4.l2_cache,
> >> -                                    (1 << core_offset),
> >> -                                    (1 << addressable_cores_offset),
> >> +                                    &topo_info,
> >>                                      eax, ebx, ecx, edx);
> >>                  break;
> >>              case 3: /* L3 cache info */
> >> -                die_offset = apicid_die_offset(&topo_info);
> >>                  if (cpu->enable_l3_cache) {
> >>                      encode_cache_cpuid4(env->cache_info_cpuid4.l3_cache,
> >> -                                        (1 << die_offset),
> >> -                                        (1 << addressable_cores_offset),
> >> +                                        &topo_info,
> >>                                          eax, ebx, ecx, edx);
> >>                      break;
> >>                  }
> > 
> 
> -- 
> Thanks
> Babu Moger
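
To illustrate what the refactoring computes, here is a rough Python model of
the two helpers and the CPUID[4].EAX topology bits. The offset math follows
the patch, but the names are illustrative and this is not the QEMU code:

```python
def bitwidth_for_count(count):
    # Smallest number of bits needed to hold `count` distinct IDs.
    return (count - 1).bit_length()

def cpuid4_eax_topo_bits(threads_per_core, cores_per_module,
                         modules_per_die, dies_per_pkg, share_level):
    # Cumulative APIC ID offsets of each topology level.
    core_off = bitwidth_for_count(threads_per_core)
    module_off = core_off + bitwidth_for_count(cores_per_module)
    die_off = module_off + bitwidth_for_count(modules_per_die)
    pkg_off = die_off + bitwidth_for_count(dies_per_pkg)
    offsets = {"core": core_off, "die": die_off, "package": pkg_off}
    # EAX[25:14]: max addressable IDs sharing this cache, minus 1.
    max_processor_ids = (1 << offsets[share_level]) - 1
    # EAX[31:26]: max addressable core IDs in the package, minus 1.
    max_core_ids = (1 << (pkg_off - core_off)) - 1
    return (max_core_ids << 26) | (max_processor_ids << 14)
```

For example, with 2 threads per core and 4 cores, an L1/L2 cache shared at
core level reports 1 in EAX[25:14] (two sharing threads), while a cache
shared at package level reports 7.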



* Re: [PATCH v3 15/17] i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]
  2023-08-03 20:40   ` Moger, Babu
@ 2023-08-04  9:50     ` Zhao Liu
  0 siblings, 0 replies; 63+ messages in thread
From: Zhao Liu @ 2023-08-04  9:50 UTC (permalink / raw)
  To: Moger, Babu
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Xiaoyao Li, Zhao Liu

Hi Babu,

On Thu, Aug 03, 2023 at 03:40:20PM -0500, Moger, Babu wrote:
> Date: Thu, 3 Aug 2023 15:40:20 -0500
> From: "Moger, Babu" <babu.moger@amd.com>
> Subject: Re: [PATCH v3 15/17] i386: Fix NumSharingCache for
>  CPUID[0x8000001D].EAX[bits 25:14]
> 
> Hi Zhao,
> 
> On 8/1/23 05:35, Zhao Liu wrote:
> > From: Zhao Liu <zhao1.liu@intel.com>
> > 
> > The commit 8f4202fb1080 ("i386: Populate AMD Processor Cache Information
> > for cpuid 0x8000001D") adds the cache topology for AMD CPU by encoding
> > the number of sharing threads directly.
> > 
> > From AMD's APM, NumSharingCache (CPUID[0x8000001D].EAX[bits 25:14])
> > means [1]:
> > 
> > The number of logical processors sharing this cache is the value of
> > this field incremented by 1. To determine which logical processors are
> > sharing a cache, determine a Share Id for each processor as follows:
> > 
> > ShareId = LocalApicId >> log2(NumSharingCache+1)
> > 
> > Logical processors with the same ShareId then share a cache. If
> > NumSharingCache+1 is not a power of two, round it up to the next power
> > of two.
> > 
> > From the description above, the calculation of this field should be the
> > same as CPUID[4].EAX[bits 25:14] for Intel CPUs. So also use the offsets
> > of the APIC ID to calculate this field.
> > 
> > Note: I don't have the AMD hardware available, hope folks can help me
> > to test this, thanks!
> 
> Yes. Decode looks good. You can remove this note in next revision.

Many thanks! :-)

> 
> The subject line "Fix" gives the wrong impression. I would change the
> subject to (or something like this).
> 
> i386: Use offsets get NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]

Okay, will change like this.

> 
> 
> > 
> > [1]: APM, vol.3, appendix.E.4.15 Function 8000_001Dh--Cache Topology
> >      Information
> > 
> > Cc: Babu Moger <babu.moger@amd.com>
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > ---
> > Changes since v1:
> >  * Rename "l3_threads" to "num_apic_ids" in
> >    encode_cache_cpuid8000001d(). (Yanan)
> >  * Add the description of the original commit and add Cc.
> > ---
> >  target/i386/cpu.c | 10 ++++------
> >  1 file changed, 4 insertions(+), 6 deletions(-)
> > 
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index c9897c0fe91a..f67b6be10b8d 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -361,7 +361,7 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
> >                                         uint32_t *eax, uint32_t *ebx,
> >                                         uint32_t *ecx, uint32_t *edx)
> >  {
> > -    uint32_t l3_threads;
> > +    uint32_t num_apic_ids;
> 
> I would change it to match spec definition.
> 
>   uint32_t num_sharing_cache;

Okay.

Thanks,
Zhao

> 
> 
> >      assert(cache->size == cache->line_size * cache->associativity *
> >                            cache->partitions * cache->sets);
> >  
> > @@ -370,13 +370,11 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
> >  
> >      /* L3 is shared among multiple cores */
> >      if (cache->level == 3) {
> > -        l3_threads = topo_info->modules_per_die *
> > -                     topo_info->cores_per_module *
> > -                     topo_info->threads_per_core;
> > -        *eax |= (l3_threads - 1) << 14;
> > +        num_apic_ids = 1 << apicid_die_offset(topo_info);
> >      } else {
> > -        *eax |= ((topo_info->threads_per_core - 1) << 14);
> > +        num_apic_ids = 1 << apicid_core_offset(topo_info);
> >      }
> > +    *eax |= (num_apic_ids - 1) << 14;
> >  
> >      assert(cache->line_size > 0);
> >      assert(cache->partitions > 0);
> 
> -- 
> Thanks
> Babu Moger


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v3 16/17] i386: Use CPUCacheInfo.share_level to encode CPUID[0x8000001D].EAX[bits 25:14]
  2023-08-03 20:44   ` Moger, Babu
@ 2023-08-04  9:56     ` Zhao Liu
  2023-08-04 18:50       ` Moger, Babu
  0 siblings, 1 reply; 63+ messages in thread
From: Zhao Liu @ 2023-08-04  9:56 UTC (permalink / raw)
  To: Moger, Babu
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Xiaoyao Li, Zhao Liu

Hi Babu,

On Thu, Aug 03, 2023 at 03:44:13PM -0500, Moger, Babu wrote:
> Date: Thu, 3 Aug 2023 15:44:13 -0500
> From: "Moger, Babu" <babu.moger@amd.com>
> Subject: Re: [PATCH v3 16/17] i386: Use CPUCacheInfo.share_level to encode
>  CPUID[0x8000001D].EAX[bits 25:14]
> 
> Hi Zhao,
>   Please copy the thread to kvm@vger.kernel.org also.  It makes it easier
> to browse.
> 

OK. I'm not sure how to cc: should I forward all the mail of the
current version (v3) to the kvm list, or should I cc the kvm mailing
list on the next version (v4)?

> 
> On 8/1/23 05:35, Zhao Liu wrote:
> > From: Zhao Liu <zhao1.liu@intel.com>
> > 
> > CPUID[0x8000001D].EAX[bits 25:14] is used to represent the cache
> > topology for amd CPUs.
> Please change this to.
> 
> 
> CPUID[0x8000001D].EAX[bits 25:14] NumSharingCache: number of logical
> processors sharing cache. The number of
> logical processors sharing this cache is NumSharingCache + 1.

OK.

Thanks,
Zhao

> 
> > 
> > After cache models have topology information, we can use
> > CPUCacheInfo.share_level to decide which topology level to be encoded
> > into CPUID[0x8000001D].EAX[bits 25:14].
> > 
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > ---
> > Changes since v1:
> >  * Use cache->share_level as the parameter in
> >    max_processor_ids_for_cache().
> > ---
> >  target/i386/cpu.c | 10 +---------
> >  1 file changed, 1 insertion(+), 9 deletions(-)
> > 
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index f67b6be10b8d..6eee0274ade4 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -361,20 +361,12 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
> >                                         uint32_t *eax, uint32_t *ebx,
> >                                         uint32_t *ecx, uint32_t *edx)
> >  {
> > -    uint32_t num_apic_ids;
> >      assert(cache->size == cache->line_size * cache->associativity *
> >                            cache->partitions * cache->sets);
> >  
> >      *eax = CACHE_TYPE(cache->type) | CACHE_LEVEL(cache->level) |
> >                 (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0);
> > -
> > -    /* L3 is shared among multiple cores */
> > -    if (cache->level == 3) {
> > -        num_apic_ids = 1 << apicid_die_offset(topo_info);
> > -    } else {
> > -        num_apic_ids = 1 << apicid_core_offset(topo_info);
> > -    }
> > -    *eax |= (num_apic_ids - 1) << 14;
> > +    *eax |= max_processor_ids_for_cache(topo_info, cache->share_level) << 14;
> >  
> >      assert(cache->line_size > 0);
> >      assert(cache->partitions > 0);
> 
> -- 
> Thanks
> Babu Moger


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v3 04/17] i386/cpu: Fix i/d-cache topology to core level for Intel CPU
  2023-08-01 10:35 ` [PATCH v3 04/17] i386/cpu: Fix i/d-cache topology to core level for Intel CPU Zhao Liu
@ 2023-08-04  9:56   ` Xiaoyao Li
  2023-08-04 12:43     ` Zhao Liu
  0 siblings, 1 reply; 63+ messages in thread
From: Xiaoyao Li @ 2023-08-04  9:56 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Babu Moger, Zhao Liu, Robert Hoo

On 8/1/2023 6:35 PM, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
> 
> For i-cache and d-cache, the maximum IDs for CPUs sharing cache (
> CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits 25:14]) are
> both 0, 

This sounds like you are describing some architectural rules, which 
misleads me. I suggest changing the description to

For i-cache and d-cache, current QEMU hardcodes the maximum IDs for CPUs 
sharing cache (CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits 
25:14]) to 0. ...

> and this means i-cache and d-cache are shared at the SMT level.
> This is correct if there's a single thread per core, but is wrong for
> the hyper-threading case (one core contains multiple threads) since the
> i-cache and d-cache are shared at the core level rather than at the SMT
> level.
> 
> For AMD CPU, commit 8f4202fb1080 ("i386: Populate AMD Processor Cache
> Information for cpuid 0x8000001D") has already introduced i/d cache
> topology as core level by default.
> 
> Therefore, in order to be compatible with both multi-threaded and
> single-threaded situations, we should set the i-cache and d-cache to be
> shared at the core level by default.
> 
> This fix changes the default i/d cache topology from per-thread to
> per-core. Potentially, this change in L1 cache topology may affect the
> performance of the VM if the user does not specifically specify the
> topology or bind the vCPU. However, the way to achieve optimal
> performance should be to create a reasonable topology and set the
> appropriate vCPU affinity without relying on QEMU's default topology
> structure.
> 
> Fixes: 7e3482f82480 ("i386: Helpers to encode cache information consistently")
> Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>

Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>

> ---
> Changes since v1:
>   * Split this fix from the patch named "i386/cpu: Fix number of
>     addressable IDs in CPUID.04H".
>   * Add the explanation of the impact on performance. (Xiaoyao)
> ---
>   target/i386/cpu.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 50613cd04612..b439a05244ee 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -6104,12 +6104,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>               switch (count) {
>               case 0: /* L1 dcache info */
>                   encode_cache_cpuid4(env->cache_info_cpuid4.l1d_cache,
> -                                    1, cs->nr_cores,
> +                                    cs->nr_threads, cs->nr_cores,
>                                       eax, ebx, ecx, edx);
>                   break;
>               case 1: /* L1 icache info */
>                   encode_cache_cpuid4(env->cache_info_cpuid4.l1i_cache,
> -                                    1, cs->nr_cores,
> +                                    cs->nr_threads, cs->nr_cores,
>                                       eax, ebx, ecx, edx);
>                   break;
>               case 2: /* L2 cache info */



^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v3 04/17] i386/cpu: Fix i/d-cache topology to core level for Intel CPU
  2023-08-04  9:56   ` Xiaoyao Li
@ 2023-08-04 12:43     ` Zhao Liu
  0 siblings, 0 replies; 63+ messages in thread
From: Zhao Liu @ 2023-08-04 12:43 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Babu Moger, Zhao Liu,
	Robert Hoo

Hi Xiaoyao,

On Fri, Aug 04, 2023 at 05:56:47PM +0800, Xiaoyao Li wrote:
> Date: Fri, 4 Aug 2023 17:56:47 +0800
> From: Xiaoyao Li <xiaoyao.li@intel.com>
> Subject: Re: [PATCH v3 04/17] i386/cpu: Fix i/d-cache topology to core
>  level for Intel CPU
> 
> On 8/1/2023 6:35 PM, Zhao Liu wrote:
> > From: Zhao Liu <zhao1.liu@intel.com>
> > 
> > For i-cache and d-cache, the maximum IDs for CPUs sharing cache (
> > CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits 25:14]) are
> > both 0,
> 
> This sounds like you are describing some architectural rules, which misleads
> me. I suggest changing the description to
> 
> For i-cache and d-cache, current QEMU hardcodes the maximum IDs for CPUs
> sharing cache (CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits
> 25:14]) to 0. ...

Yeah, it's clearer. Will use your description. Thanks!

> 
> > and this means i-cache and d-cache are shared at the SMT level.
> > This is correct if there's a single thread per core, but is wrong for
> > the hyper-threading case (one core contains multiple threads) since the
> > i-cache and d-cache are shared at the core level rather than at the SMT
> > level.
> > 
> > For AMD CPU, commit 8f4202fb1080 ("i386: Populate AMD Processor Cache
> > Information for cpuid 0x8000001D") has already introduced i/d cache
> > topology as core level by default.
> > 
> > Therefore, in order to be compatible with both multi-threaded and
> > single-threaded situations, we should set the i-cache and d-cache to be
> > shared at the core level by default.
> > 
> > This fix changes the default i/d cache topology from per-thread to
> > per-core. Potentially, this change in L1 cache topology may affect the
> > performance of the VM if the user does not specifically specify the
> > topology or bind the vCPU. However, the way to achieve optimal
> > performance should be to create a reasonable topology and set the
> > appropriate vCPU affinity without relying on QEMU's default topology
> > structure.
> > 
> > Fixes: 7e3482f82480 ("i386: Helpers to encode cache information consistently")
> > Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> 
> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>

Thanks!

-Zhao

> 
> > ---
> > Changes since v1:
> >   * Split this fix from the patch named "i386/cpu: Fix number of
> >     addressable IDs in CPUID.04H".
> >   * Add the explanation of the impact on performance. (Xiaoyao)
> > ---
> >   target/i386/cpu.c | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 50613cd04612..b439a05244ee 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -6104,12 +6104,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >               switch (count) {
> >               case 0: /* L1 dcache info */
> >                   encode_cache_cpuid4(env->cache_info_cpuid4.l1d_cache,
> > -                                    1, cs->nr_cores,
> > +                                    cs->nr_threads, cs->nr_cores,
> >                                       eax, ebx, ecx, edx);
> >                   break;
> >               case 1: /* L1 icache info */
> >                   encode_cache_cpuid4(env->cache_info_cpuid4.l1i_cache,
> > -                                    1, cs->nr_cores,
> > +                                    cs->nr_threads, cs->nr_cores,
> >                                       eax, ebx, ecx, edx);
> >                   break;
> >               case 2: /* L2 cache info */
> 


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v3 00/17] Support smp.clusters for x86
  2023-08-01 15:35 ` [PATCH v3 00/17] Support smp.clusters for x86 Jonathan Cameron via
@ 2023-08-04 13:17   ` Zhao Liu
  2023-08-08 11:52     ` Jonathan Cameron via
  0 siblings, 1 reply; 63+ messages in thread
From: Zhao Liu @ 2023-08-04 13:17 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Xiaoyao Li, Babu Moger,
	Zhao Liu

Hi Jonathan,

On Tue, Aug 01, 2023 at 04:35:27PM +0100, Jonathan Cameron via wrote:
> > 

[snip]

> > 
> > ## New property: x-l2-cache-topo
> > 
> > The property l2-cache-topo will be used to change the L2 cache topology
> > in CPUID.04H.
> > 
> > Now it allows the user to set whether the L2 cache is shared at the
> > core level or at the cluster level.
> > 
> > If the user passes "-cpu x-l2-cache-topo=[core|cluster]", the default
> > L2 cache topology will be overridden by the new topology setting.
> > 
> > Since CPUID.04H is used by Intel CPUs, this property is only available
> > on Intel CPUs for now.
> > 
> > When necessary, it can be extended to CPUID[0x8000001D] for AMD CPUs.
> 
> Hi Zhao Liu,
> 
> As part of emulating arm's MPAM (cache partitioning controls) I needed
> to add the missing cache description in the ACPI PPTT table. As such I ran
> into a very similar problem to the one you are addressing.

May I ask if the cache topology you need is symmetric or heterogeneous?

I had a discussion with Yanan [5] about heterogeneous cache. If you
need a "symmetric" cache topology, maybe we could consider trying to
make this x-l2-cache-topo more generic.

But if you need a heterogeneous cache topology, e.g., some cores have
their own L2 cache while other cores share the same L2 cache, this
command alone is not enough.

Intel hybrid platforms have the case I mentioned above; we used "hybrid
CPU topology" [6] + "x-l2-cache-topo=cluster" to solve this:

For example, Alder Lake has 2 types of cores: the P-core, which has an
L2 per core, and the E-core, where 4 E-cores share one L2.

So we set a CPU topology like this:

Set 2 kinds of clusters:
* 1 P-core in a cluster.
* 4 E-cores in a cluster.

Then we use "x-l2-cache-topo" to make the L2 shared at the cluster
level. In this way, a P-core owns its own L2 because its cluster has
only 1 P-core, and 4 E-cores share one L2.
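As a concrete sketch of the homogeneous E-core part of such a setup
(the counts and CPU model are made up for illustration, and
x-l2-cache-topo is the experimental property from this series):

```shell
# Sketch only: 8 vCPUs in 2 clusters of 4 cores each; with
# x-l2-cache-topo=cluster, each 4-core cluster shares one L2,
# mimicking an Alder Lake E-core module.
qemu-system-x86_64 \
    -machine q35 -accel kvm \
    -smp 8,sockets=1,dies=1,clusters=2,cores=4,threads=1 \
    -cpu host,x-l2-cache-topo=cluster
```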

For a more general way to set the cache topology, Yanan and I discussed
2 ways ([7] [8]). [8] depends on the QOM CPU topology mechanism that I'm
working on.

[5]: https://mail.gnu.org/archive/html/qemu-devel/2023-02/msg04795.html
[6]: https://mail.gnu.org/archive/html/qemu-devel/2023-02/msg03205.html
[7]: https://mail.gnu.org/archive/html/qemu-devel/2023-02/msg05139.html
[8]: https://mail.gnu.org/archive/html/qemu-devel/2023-02/msg05167.html

> 
> I wonder if a more generic description is possible? We can rely on ordering
> of the cache levels, so what I was planning to propose was the rather lengthy
> but flexible (and with better names ;)
> 
> -smp 16,sockets=1,clusters=4,threads=2,cache-cluster-start-level=2,cache-node-start-level=3

Could you explain more about this command?
I don't understand what "cache-cluster-start-level=2,cache-node-start-level=3" mean.

> 
> Perhaps we can come up with a common scheme that covers both usecases?
> It gets more fiddly to define if we have variable topology across different clusters
> - and that was going to be an open question in the RFC proposing this - our current
> definition of the more basic topology doesn't cover those cases anyway.
> 
> What I want:
> 
> 1) No restriction on maximum cache levels - ...

Hmmm, if there's no cache name, it would be difficult to define in the CLI.

> ... some systems have more than 3

What about L4? A name can simplify a lot of issues.

> 2) Easy ability to express everything from all caches are private to all caches are shared.
> Is 3 levels enough? (private, shared at cluster level, shared at a level above that) I think
> so, but if not any scheme should be extensible to cover another level.

It seems you may need the "heterogeneous" cache topology.

I think "private" and "shared" are not good definitions for caches,
since they are not technical terms (correct me if I'm wrong). And i/d
cache, L1 cache and L2 cache are generic terms accepted by many
architectures.

Though cache topology is different from CPU topology, it's true that
the cache topology is related to the CPU hierarchy, so I think using
the CPU topology hierarchy to define the heterogeneous topology looks
like a more appropriate way to do it.

> 
> Great if we can figure out a common scheme.

Yeah, it's worth discussing.

Thanks,
Zhao

> 
> Jonathan
> 
> > 
> > 
> > # Patch description
> > 
> > patch 1-2 Cleanups about coding style and test name.
> > 
> > patch 3-4,15 Fixes about x86 topology, intel l1 cache topology and amd
> >              cache topology encoding.
> > 
> > patch 5-6 Cleanups about topology related CPUID encoding and QEMU
> >           topology variables.
> > 
> > patch 7-12 Add the module as the new CPU topology level in x86, and it
> >            is corresponding to the cluster level in generic code.
> > 
> > patch 13,14,16 Add cache topology information in cache models.
> > 
> > patch 17 Introduce a new command to configure L2 cache topology.
> > 
> > 
> > [1]: https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg07179.html
> > [2]: https://patchew.org/QEMU/20211228092221.21068-1-wangyanan55@huawei.com/
> > [3]: https://www.intel.com/content/www/us/en/products/platforms/details/alder-lake-p.html
> > [4]: SDM, vol.3, ch.9, 9.9.1 Hierarchical Mapping of Shared Resources.
> > 
> > Best Regards,
> > Zhao
> > 
> > ---
> > Changelog:
> > 
> > Changes since v2:
> >  * Add "Tested-by", "Reviewed-by" and "ACKed-by" tags.
> >  * Use newly added wrapped helper to get cores per socket in
> >    qemu_init_vcpu().
> > 
> > Changes since v1:
> >  * Reordered patches. (Yanan)
> >  * Deprecated the patch to fix comment of machine_parse_smp_config().
> >    (Yanan)
> >  * Rename test-x86-cpuid.c to test-x86-topo.c. (Yanan)
> >  * Split the intel's l1 cache topology fix into a new separate patch.
> >    (Yanan)
> >  * Combined module_id and APIC ID for module level support into one
> >    patch. (Yanan)
> >  * Make cache_into_passthrough case of cpuid 0x04 leaf in
> >  * cpu_x86_cpuid() use max_processor_ids_for_cache() and
> >    max_core_ids_in_package() to encode CPUID[4]. (Yanan)
> >  * Add the prefix "CPU_TOPO_LEVEL_*" for CPU topology level names.
> >    (Yanan)
> >  * Rename the "INVALID" level to "CPU_TOPO_LEVEL_UNKNOW". (Yanan)
> > 
> > ---
> > Zhao Liu (10):
> >   i386: Fix comment style in topology.h
> >   tests: Rename test-x86-cpuid.c to test-x86-topo.c
> >   i386/cpu: Fix i/d-cache topology to core level for Intel CPU
> >   i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4]
> >   i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
> >   i386: Add cache topology info in CPUCacheInfo
> >   i386: Use CPUCacheInfo.share_level to encode CPUID[4]
> >   i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]
> >   i386: Use CPUCacheInfo.share_level to encode
> >     CPUID[0x8000001D].EAX[bits 25:14]
> >   i386: Add new property to control L2 cache topo in CPUID.04H
> > 
> > Zhuocheng Ding (7):
> >   softmmu: Fix CPUSTATE.nr_cores' calculation
> >   i386: Introduce module-level cpu topology to CPUX86State
> >   i386: Support modules_per_die in X86CPUTopoInfo
> >   i386: Support module_id in X86CPUTopoIDs
> >   i386/cpu: Introduce cluster-id to X86CPU
> >   tests: Add test case of APIC ID for module level parsing
> >   hw/i386/pc: Support smp.clusters for x86 PC machine
> > 
> >  MAINTAINERS                                   |   2 +-
> >  hw/i386/pc.c                                  |   1 +
> >  hw/i386/x86.c                                 |  49 +++++-
> >  include/hw/core/cpu.h                         |   2 +-
> >  include/hw/i386/topology.h                    |  68 +++++---
> >  qemu-options.hx                               |  10 +-
> >  softmmu/cpus.c                                |   2 +-
> >  target/i386/cpu.c                             | 158 ++++++++++++++----
> >  target/i386/cpu.h                             |  25 +++
> >  tests/unit/meson.build                        |   4 +-
> >  .../{test-x86-cpuid.c => test-x86-topo.c}     |  58 ++++---
> >  11 files changed, 280 insertions(+), 99 deletions(-)
> >  rename tests/unit/{test-x86-cpuid.c => test-x86-topo.c} (73%)
> > 
> 
> 


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode CPUID[4]
  2023-08-04  9:48       ` Zhao Liu
@ 2023-08-04 15:48         ` Moger, Babu
  2023-08-14  8:22           ` Zhao Liu
  0 siblings, 1 reply; 63+ messages in thread
From: Moger, Babu @ 2023-08-04 15:48 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Xiaoyao Li, Zhao Liu

Hi Zhao,

On 8/4/23 04:48, Zhao Liu wrote:
> Hi Babu,
> 
> On Thu, Aug 03, 2023 at 11:41:40AM -0500, Moger, Babu wrote:
>> Date: Thu, 3 Aug 2023 11:41:40 -0500
>> From: "Moger, Babu" <babu.moger@amd.com>
>> Subject: Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode
>>  CPUID[4]
>>
>> Hi Zhao,
>>
>> On 8/2/23 18:49, Moger, Babu wrote:
>>> Hi Zhao,
>>>
>>> Hitting this error after this patch.
>>>
>>> ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code should
>>> not be reached
>>> Bail out! ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code
>>> should not be reached
>>> Aborted (core dumped)
>>>
>>> Looks like share_level for all the caches for AMD is not initialized.
> 
> I missed these change when I rebase. Sorry for that.
> 
> BTW, could I ask a question? From a previous discussion [1], I understand
> that the cache info is used to show the correct cache information in the
> new machine. And from [2], the wrong cache info may cause "compatibility
> issues".
> 
> Are these "compatibility issues" AMD-specific? I'm not sure whether Intel
> should update the cache info like that. Thanks!

I was going to comment about that. Good that you asked me.

Compatibility is a QEMU requirement. Otherwise, migrations will fail.

Any change in the topology is going to cause migration problems.

I am not sure how you are going to handle this. You can probably look at
the feature "x-intel-pt-auto-level".

Make sure to test the migration.

Thanks
Babu


> 
> [1]: https://patchwork.kernel.org/project/kvm/patch/CY4PR12MB1768A3CBE42AAFB03CB1081E95AA0@CY4PR12MB1768.namprd12.prod.outlook.com/
> [2]: https://lore.kernel.org/qemu-devel/20180510204148.11687-1-babu.moger@amd.com/
> 
>>
>> The following patch fixes the problem.
>>
>> ======================================================
>>
>>
>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>> index f4c48e19fa..976a2755d8 100644
>> --- a/target/i386/cpu.c
>> +++ b/target/i386/cpu.c
>> @@ -528,6 +528,7 @@ static CPUCacheInfo legacy_l2_cache_cpuid2 = {
>>      .size = 2 * MiB,
>>      .line_size = 64,
>>      .associativity = 8,
>> +    .share_level = CPU_TOPO_LEVEL_CORE,
> 
> This "legacy_l2_cache_cpuid2" is not used to encode cache topology.
> I should explicitly set this default topo level as CPU_TOPO_LEVEL_UNKNOW.
> 
>>  };
>>
>>
>> @@ -1904,6 +1905,7 @@ static CPUCaches epyc_v4_cache_info = {
>>          .lines_per_tag = 1,
>>          .self_init = 1,
>>          .no_invd_sharing = true,
>> +        .share_level = CPU_TOPO_LEVEL_CORE,
>>      },
>>      .l1i_cache = &(CPUCacheInfo) {
>>          .type = INSTRUCTION_CACHE,
>> @@ -1916,6 +1918,7 @@ static CPUCaches epyc_v4_cache_info = {
>>          .lines_per_tag = 1,
>>          .self_init = 1,
>>          .no_invd_sharing = true,
>> +        .share_level = CPU_TOPO_LEVEL_CORE,
>>      },
>>      .l2_cache = &(CPUCacheInfo) {
>>          .type = UNIFIED_CACHE,
>> @@ -1926,6 +1929,7 @@ static CPUCaches epyc_v4_cache_info = {
>>          .partitions = 1,
>>          .sets = 1024,
>>          .lines_per_tag = 1,
>> +        .share_level = CPU_TOPO_LEVEL_CORE,
>>      },
>>      .l3_cache = &(CPUCacheInfo) {
>>          .type = UNIFIED_CACHE,
>> @@ -1939,6 +1943,7 @@ static CPUCaches epyc_v4_cache_info = {
>>          .self_init = true,
>>          .inclusive = true,
>>          .complex_indexing = false,
>> +        .share_level = CPU_TOPO_LEVEL_DIE,
>>      },
>>  };
>>
>> @@ -2008,6 +2013,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
>>          .lines_per_tag = 1,
>>          .self_init = 1,
>>          .no_invd_sharing = true,
>> +        .share_level = CPU_TOPO_LEVEL_CORE,
>>      },
>>      .l1i_cache = &(CPUCacheInfo) {
>>          .type = INSTRUCTION_CACHE,
>> @@ -2020,6 +2026,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
>>          .lines_per_tag = 1,
>>          .self_init = 1,
>>          .no_invd_sharing = true,
>> +        .share_level = CPU_TOPO_LEVEL_CORE,
>>      },
>>      .l2_cache = &(CPUCacheInfo) {
>>          .type = UNIFIED_CACHE,
>> @@ -2030,6 +2037,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
>>          .partitions = 1,
>>          .sets = 1024,
>>          .lines_per_tag = 1,
>> +        .share_level = CPU_TOPO_LEVEL_CORE,
>>      },
>>      .l3_cache = &(CPUCacheInfo) {
>>          .type = UNIFIED_CACHE,
>> @@ -2043,6 +2051,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
>>          .self_init = true,
>>          .inclusive = true,
>>          .complex_indexing = false,
>> +        .share_level = CPU_TOPO_LEVEL_DIE,
>>      },
>>  };
>>
>> @@ -2112,6 +2121,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
>>          .lines_per_tag = 1,
>>          .self_init = 1,
>>          .no_invd_sharing = true,
>> +        .share_level = CPU_TOPO_LEVEL_CORE,
>>      },
>>      .l1i_cache = &(CPUCacheInfo) {
>>          .type = INSTRUCTION_CACHE,
>> @@ -2124,6 +2134,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
>>          .lines_per_tag = 1,
>>          .self_init = 1,
>>          .no_invd_sharing = true,
>> +        .share_level = CPU_TOPO_LEVEL_CORE,
>>      },
>>      .l2_cache = &(CPUCacheInfo) {
>>          .type = UNIFIED_CACHE,
>> @@ -2134,6 +2145,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
>>          .partitions = 1,
>>          .sets = 1024,
>>          .lines_per_tag = 1,
>> +        .share_level = CPU_TOPO_LEVEL_CORE,
>>      },
>>      .l3_cache = &(CPUCacheInfo) {
>>          .type = UNIFIED_CACHE,
>> @@ -2147,6 +2159,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
>>          .self_init = true,
>>          .inclusive = true,
>>          .complex_indexing = false,
>> +        .share_level = CPU_TOPO_LEVEL_DIE,
>>      },
>>  };
>>
>> @@ -2162,6 +2175,7 @@ static const CPUCaches epyc_genoa_cache_info = {
>>          .lines_per_tag = 1,
>>          .self_init = 1,
>>          .no_invd_sharing = true,
>> +        .share_level = CPU_TOPO_LEVEL_CORE,
>>      },
>>      .l1i_cache = &(CPUCacheInfo) {
>>          .type = INSTRUCTION_CACHE,
>> @@ -2174,6 +2188,7 @@ static const CPUCaches epyc_genoa_cache_info = {
>>          .lines_per_tag = 1,
>>          .self_init = 1,
>>          .no_invd_sharing = true,
>> +        .share_level = CPU_TOPO_LEVEL_CORE,
>>      },
>>      .l2_cache = &(CPUCacheInfo) {
>>          .type = UNIFIED_CACHE,
>> @@ -2184,6 +2199,7 @@ static const CPUCaches epyc_genoa_cache_info = {
>>          .partitions = 1,
>>          .sets = 2048,
>>          .lines_per_tag = 1,
>> +        .share_level = CPU_TOPO_LEVEL_CORE,
>>      },
>>      .l3_cache = &(CPUCacheInfo) {
>>          .type = UNIFIED_CACHE,
>> @@ -2197,6 +2213,7 @@ static const CPUCaches epyc_genoa_cache_info = {
>>          .self_init = true,
>>          .inclusive = true,
>>          .complex_indexing = false,
>> +        .share_level = CPU_TOPO_LEVEL_DIE,
>>      },
>>  };
>>
>>
>> =========================================================================
> 
> 
> Look good to me except legacy_l2_cache_cpuid2, thanks very much!
> I'll add this in next version.
> 
> -Zhao
> 
>>
>> Thanks
>> Babu
>>>
>>> On 8/1/23 05:35, Zhao Liu wrote:
>>>> From: Zhao Liu <zhao1.liu@intel.com>
>>>>
>>>> CPUID[4].EAX[bits 25:14] is used to represent the cache topology for
>>>> intel CPUs.
>>>>
>>>> After cache models have topology information, we can use
>>>> CPUCacheInfo.share_level to decide which topology level to be encoded
>>>> into CPUID[4].EAX[bits 25:14].
>>>>
>>>> And since maximum_processor_id (original "num_apic_ids") is parsed
>>>> based on CPU topology levels, which are verified when parsing smp,
>>>> there's no need to check this value with "assert(num_apic_ids > 0)"
>>>> again, so remove this assert.
>>>>
>>>> Additionally, wrap the encoding of CPUID[4].EAX[bits 31:26] into a
>>>> helper to make the code cleaner.
>>>>
>>>> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
>>>> ---
>>>> Changes since v1:
>>>>  * Use "enum CPUTopoLevel share_level" as the parameter in
>>>>    max_processor_ids_for_cache().
>>>>  * Make cache_into_passthrough case also use
>>>>    max_processor_ids_for_cache() and max_core_ids_in_package() to
>>>>    encode CPUID[4]. (Yanan)
>>>>  * Rename the title of this patch (the original is "i386: Use
>>>>    CPUCacheInfo.share_level to encode CPUID[4].EAX[bits 25:14]").
>>>> ---
>>>>  target/i386/cpu.c | 70 +++++++++++++++++++++++++++++------------------
>>>>  1 file changed, 43 insertions(+), 27 deletions(-)
>>>>
>>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>>> index 55aba4889628..c9897c0fe91a 100644
>>>> --- a/target/i386/cpu.c
>>>> +++ b/target/i386/cpu.c
>>>> @@ -234,22 +234,53 @@ static uint8_t cpuid2_cache_descriptor(CPUCacheInfo *cache)
>>>>                         ((t) == UNIFIED_CACHE) ? CACHE_TYPE_UNIFIED : \
>>>>                         0 /* Invalid value */)
>>>>  
>>>> +static uint32_t max_processor_ids_for_cache(X86CPUTopoInfo *topo_info,
>>>> +                                            enum CPUTopoLevel share_level)
>>>> +{
>>>> +    uint32_t num_ids = 0;
>>>> +
>>>> +    switch (share_level) {
>>>> +    case CPU_TOPO_LEVEL_CORE:
>>>> +        num_ids = 1 << apicid_core_offset(topo_info);
>>>> +        break;
>>>> +    case CPU_TOPO_LEVEL_DIE:
>>>> +        num_ids = 1 << apicid_die_offset(topo_info);
>>>> +        break;
>>>> +    case CPU_TOPO_LEVEL_PACKAGE:
>>>> +        num_ids = 1 << apicid_pkg_offset(topo_info);
>>>> +        break;
>>>> +    default:
>>>> +        /*
>>>> +         * Currently there is no use case for SMT and MODULE, so use
>>>> +         * assert directly to facilitate debugging.
>>>> +         */
>>>> +        g_assert_not_reached();
>>>> +    }
>>>> +
>>>> +    return num_ids - 1;
>>>> +}
>>>> +
>>>> +static uint32_t max_core_ids_in_package(X86CPUTopoInfo *topo_info)
>>>> +{
>>>> +    uint32_t num_cores = 1 << (apicid_pkg_offset(topo_info) -
>>>> +                               apicid_core_offset(topo_info));
>>>> +    return num_cores - 1;
>>>> +}
>>>>  
>>>>  /* Encode cache info for CPUID[4] */
>>>>  static void encode_cache_cpuid4(CPUCacheInfo *cache,
>>>> -                                int num_apic_ids, int num_cores,
>>>> +                                X86CPUTopoInfo *topo_info,
>>>>                                  uint32_t *eax, uint32_t *ebx,
>>>>                                  uint32_t *ecx, uint32_t *edx)
>>>>  {
>>>>      assert(cache->size == cache->line_size * cache->associativity *
>>>>                            cache->partitions * cache->sets);
>>>>  
>>>> -    assert(num_apic_ids > 0);
>>>>      *eax = CACHE_TYPE(cache->type) |
>>>>             CACHE_LEVEL(cache->level) |
>>>>             (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0) |
>>>> -           ((num_cores - 1) << 26) |
>>>> -           ((num_apic_ids - 1) << 14);
>>>> +           (max_core_ids_in_package(topo_info) << 26) |
>>>> +           (max_processor_ids_for_cache(topo_info, cache->share_level) << 14);
>>>>  
>>>>      assert(cache->line_size > 0);
>>>>      assert(cache->partitions > 0);
>>>> @@ -6116,56 +6147,41 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>>>>                  int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
>>>>  
>>>>                  if (cores_per_pkg > 1) {
>>>> -                    int addressable_cores_offset =
>>>> -                                                apicid_pkg_offset(&topo_info) -
>>>> -                                                apicid_core_offset(&topo_info);
>>>> -
>>>>                      *eax &= ~0xFC000000;
>>>> -                    *eax |= (1 << addressable_cores_offset - 1) << 26;
>>>> +                    *eax |= max_core_ids_in_package(&topo_info) << 26;
>>>>                  }
>>>>                  if (host_vcpus_per_cache > cpus_per_pkg) {
>>>> -                    int pkg_offset = apicid_pkg_offset(&topo_info);
>>>> -
>>>>                      *eax &= ~0x3FFC000;
>>>> -                    *eax |= (1 << pkg_offset - 1) << 14;
>>>> +                    *eax |=
>>>> +                        max_processor_ids_for_cache(&topo_info,
>>>> +                                                CPU_TOPO_LEVEL_PACKAGE) << 14;
>>>>                  }
>>>>              }
>>>>          } else if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
>>>>              *eax = *ebx = *ecx = *edx = 0;
>>>>          } else {
>>>>              *eax = 0;
>>>> -            int addressable_cores_offset = apicid_pkg_offset(&topo_info) -
>>>> -                                           apicid_core_offset(&topo_info);
>>>> -            int core_offset, die_offset;
>>>>  
>>>>              switch (count) {
>>>>              case 0: /* L1 dcache info */
>>>> -                core_offset = apicid_core_offset(&topo_info);
>>>>                  encode_cache_cpuid4(env->cache_info_cpuid4.l1d_cache,
>>>> -                                    (1 << core_offset),
>>>> -                                    (1 << addressable_cores_offset),
>>>> +                                    &topo_info,
>>>>                                      eax, ebx, ecx, edx);
>>>>                  break;
>>>>              case 1: /* L1 icache info */
>>>> -                core_offset = apicid_core_offset(&topo_info);
>>>>                  encode_cache_cpuid4(env->cache_info_cpuid4.l1i_cache,
>>>> -                                    (1 << core_offset),
>>>> -                                    (1 << addressable_cores_offset),
>>>> +                                    &topo_info,
>>>>                                      eax, ebx, ecx, edx);
>>>>                  break;
>>>>              case 2: /* L2 cache info */
>>>> -                core_offset = apicid_core_offset(&topo_info);
>>>>                  encode_cache_cpuid4(env->cache_info_cpuid4.l2_cache,
>>>> -                                    (1 << core_offset),
>>>> -                                    (1 << addressable_cores_offset),
>>>> +                                    &topo_info,
>>>>                                      eax, ebx, ecx, edx);
>>>>                  break;
>>>>              case 3: /* L3 cache info */
>>>> -                die_offset = apicid_die_offset(&topo_info);
>>>>                  if (cpu->enable_l3_cache) {
>>>>                      encode_cache_cpuid4(env->cache_info_cpuid4.l3_cache,
>>>> -                                        (1 << die_offset),
>>>> -                                        (1 << addressable_cores_offset),
>>>> +                                        &topo_info,
>>>>                                          eax, ebx, ecx, edx);
>>>>                      break;
>>>>                  }
>>>
>>
>> -- 
>> Thanks
>> Babu Moger

-- 
Thanks
Babu Moger


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v3 16/17] i386: Use CPUCacheInfo.share_level to encode CPUID[0x8000001D].EAX[bits 25:14]
  2023-08-04  9:56     ` Zhao Liu
@ 2023-08-04 18:50       ` Moger, Babu
  0 siblings, 0 replies; 63+ messages in thread
From: Moger, Babu @ 2023-08-04 18:50 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Xiaoyao Li, Zhao Liu

Hi Zhao,

On 8/4/23 04:56, Zhao Liu wrote:
> Hi Babu,
> 
> On Thu, Aug 03, 2023 at 03:44:13PM -0500, Moger, Babu wrote:
>> Date: Thu, 3 Aug 2023 15:44:13 -0500
>> From: "Moger, Babu" <babu.moger@amd.com>
>> Subject: Re: [PATCH v3 16/17] i386: Use CPUCacheInfo.share_level to encode
>>  CPUID[0x8000001D].EAX[bits 25:14]
>>
>> Hi Zhao,
>>   Please copy the thread to kvm@vger.kernel.org also.  It makes it easier
>> to browse.
>>
> 
> OK. I'm not sure how to cc: should I forward all mail to KVM for the
> current version (v3), or should I cc the kvm mailing list for the next
> version (v4)?

Yes. From v4.
Thanks
Babu
> 
>>
>> On 8/1/23 05:35, Zhao Liu wrote:
>>> From: Zhao Liu <zhao1.liu@intel.com>
>>>
>>> CPUID[0x8000001D].EAX[bits 25:14] is used to represent the cache
>>> topology for AMD CPUs.
>> Please change this to.
>>
>>
>> CPUID[0x8000001D].EAX[bits 25:14] NumSharingCache: number of logical
>> processors sharing cache. The number of logical processors sharing
>> this cache is NumSharingCache + 1.
> 
> OK.
> 
> Thanks,
> Zhao
> 
>>
>>>
>>> After cache models have topology information, we can use
>>> CPUCacheInfo.share_level to decide which topology level to be encoded
>>> into CPUID[0x8000001D].EAX[bits 25:14].
>>>
>>> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
>>> ---
>>> Changes since v1:
>>>  * Use cache->share_level as the parameter in
>>>    max_processor_ids_for_cache().
>>> ---
>>>  target/i386/cpu.c | 10 +---------
>>>  1 file changed, 1 insertion(+), 9 deletions(-)
>>>
>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>> index f67b6be10b8d..6eee0274ade4 100644
>>> --- a/target/i386/cpu.c
>>> +++ b/target/i386/cpu.c
>>> @@ -361,20 +361,12 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
>>>                                         uint32_t *eax, uint32_t *ebx,
>>>                                         uint32_t *ecx, uint32_t *edx)
>>>  {
>>> -    uint32_t num_apic_ids;
>>>      assert(cache->size == cache->line_size * cache->associativity *
>>>                            cache->partitions * cache->sets);
>>>  
>>>      *eax = CACHE_TYPE(cache->type) | CACHE_LEVEL(cache->level) |
>>>                 (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0);
>>> -
>>> -    /* L3 is shared among multiple cores */
>>> -    if (cache->level == 3) {
>>> -        num_apic_ids = 1 << apicid_die_offset(topo_info);
>>> -    } else {
>>> -        num_apic_ids = 1 << apicid_core_offset(topo_info);
>>> -    }
>>> -    *eax |= (num_apic_ids - 1) << 14;
>>> +    *eax |= max_processor_ids_for_cache(topo_info, cache->share_level) << 14;
>>>  
>>>      assert(cache->line_size > 0);
>>>      assert(cache->partitions > 0);
>>
>> -- 
>> Thanks
>> Babu Moger

-- 
Thanks
Babu Moger



* Re: [PATCH v3 01/17] i386: Fix comment style in topology.h
  2023-08-01 10:35 ` [PATCH v3 01/17] i386: Fix comment style in topology.h Zhao Liu
  2023-08-01 23:13   ` Moger, Babu
@ 2023-08-07  2:16   ` Xiaoyao Li
  2023-08-07  7:05     ` Zhao Liu
  1 sibling, 1 reply; 63+ messages in thread
From: Xiaoyao Li @ 2023-08-07  2:16 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Babu Moger, Zhao Liu

On 8/1/2023 6:35 PM, Zhao Liu wrote:
> From: Zhao Liu<zhao1.liu@intel.com>
> 
> For function comments in this file, keep the comment style consistent
> with other places.
> 
> Signed-off-by: Zhao Liu<zhao1.liu@intel.com>
> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org

missing '>' at the end.

> Reviewed-by: Yanan Wang<wangyanan55@huawei.com>
> Acked-by: Michael S. Tsirkin<mst@redhat.com>

Reviewed-by: Xiaoyao Li <xiaoyao.li@Intel.com>




* Re: [PATCH v3 03/17] softmmu: Fix CPUSTATE.nr_cores' calculation
  2023-08-01 10:35 ` [PATCH v3 03/17] softmmu: Fix CPUSTATE.nr_cores' calculation Zhao Liu
  2023-08-02 15:25   ` Moger, Babu
@ 2023-08-07  7:03   ` Xiaoyao Li
  2023-08-07  7:53     ` Zhao Liu
  1 sibling, 1 reply; 63+ messages in thread
From: Xiaoyao Li @ 2023-08-07  7:03 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Babu Moger, Zhao Liu, Zhuocheng Ding

On 8/1/2023 6:35 PM, Zhao Liu wrote:
> From: Zhuocheng Ding <zhuocheng.ding@intel.com>
> 
>  From CPUState.nr_cores' comment, it represents "number of cores within
> this CPU package".
> 
> After 003f230e37d7 ("machine: Tweak the order of topology members in
> struct CpuTopology"), the meaning of smp.cores changed to "the number of
> cores in one die", but this commit missed changing CPUState.nr_cores'
> calculation, so CPUState.nr_cores became wrong: it now fails to account
> for the numbers of clusters and dies.
> 
> At present, only i386 is using CPUState.nr_cores.
> 
> But as for i386, which supports die level, the uses of CPUState.nr_cores
> are very confusing:
> 
> Early uses are based on the meaning of "cores per package" (before die
> is introduced into i386), and later uses are based on "cores per die"
> (after die's introduction).
> 
> This difference is because commit a94e1428991f ("target/i386: Add
> CPUID.1F generation support for multi-dies PCMachine") misunderstood
> CPUState.nr_cores to mean "cores per die" when calculating
> CPUID.1FH.01H:EBX. After that, the changes in i386 all followed this
> wrong understanding.
> 
> With the influence of 003f230e37d7 and a94e1428991f, for i386 currently
> the result of CPUState.nr_cores is "cores per die", thus the original
> uses of CPUState.nr_cores based on the meaning of "cores per package"
> are wrong when multiple dies exist:
> 1. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.01H:EBX[bits 23:16] is
>     incorrect because it expects "cpus per package" but now the
>     result is "cpus per die".
> 2. In cpu_x86_cpuid() of target/i386/cpu.c, for all leaves of CPUID.04H:
>     EAX[bits 31:26] is incorrect because they expect "cpus per package"
>     but now the result is "cpus per die". The error not only impacts the
>     EAX calculation in the cache_info_passthrough case, but also impacts other
>     cases of setting cache topology for Intel CPU according to cpu
>     topology (specifically, the incoming parameter "num_cores" expects
>     "cores per package" in encode_cache_cpuid4()).
> 3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.0BH.01H:EBX[bits
>     15:00] is incorrect because the EBX of 0BH.01H (core level) expects
>     "cpus per package", which may be different from 1FH.01H (the reason
>     is 1FH can support more levels. For QEMU, 1FH also supports die,
>     1FH.01H:EBX[bits 15:00] expects "cpus per die").
> 4. In cpu_x86_cpuid() of target/i386/cpu.c, when CPUID.80000001H is
>     calculated, here "cpus per package" is expected to be checked, but in
>     fact, now it checks "cpus per die". Though "cpus per die" also works
>     for this code logic, this isn't consistent with AMD's APM.
> 5. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.80000008H:ECX expects
>     "cpus per package" but it obtains "cpus per die".
> 6. In simulate_rdmsr() of target/i386/hvf/x86_emu.c, in
>     kvm_rdmsr_core_thread_count() of target/i386/kvm/kvm.c, and in
>     helper_rdmsr() of target/i386/tcg/sysemu/misc_helper.c,
>     MSR_CORE_THREAD_COUNT expects "cpus per package" and "cores per
>     package", but in these functions, it obtains "cpus per die" and
>     "cores per die".
> 
> On the other hand, these uses are correct now (they are added in/after
> a94e1428991f):
> 1. In cpu_x86_cpuid() of target/i386/cpu.c, topo_info.cores_per_die
>     meets the actual meaning of CPUState.nr_cores ("cores per die").
> 2. In cpu_x86_cpuid() of target/i386/cpu.c, vcpus_per_socket (in CPUID.
>     04H's calculation) considers the number of dies, so it's correct.
> 3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.1FH.01H:EBX[bits
>     15:00] needs "cpus per die" and it gets the correct result, and
>     CPUID.1FH.02H:EBX[bits 15:00] gets correct "cpus per package".
> 
> When CPUState.nr_cores is correctly changed back to "cores per package",
> the above errors will be fixed without extra work, but the "currently"
> correct cases will go wrong and need special handling to pass correct
> "cpus/cores per die" they want.
> 
> Thus in this patch, we fix CPUState.nr_cores' calculation to fit the
> original meaning "cores per package", as well as changing calculation of
> topo_info.cores_per_die, vcpus_per_socket and CPUID.1FH.
> 
> In addition, in the nr_threads' comment, specify it represents the
> number of threads in the "core" to avoid confusion, and also add comment
> for nr_dies in CPUX86State.
> 
> Fixes: a94e1428991f ("target/i386: Add CPUID.1F generation support for multi-dies PCMachine")
> Fixes: 003f230e37d7 ("machine: Tweak the order of topology members in struct CpuTopology")
> Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
> Changes since v2:
>   * Use wrapped helper to get cores per socket in qemu_init_vcpu().
> Changes since v1:
>   * Add comment for nr_dies in CPUX86State. (Yanan)
> ---
>   include/hw/core/cpu.h | 2 +-
>   softmmu/cpus.c        | 2 +-
>   target/i386/cpu.c     | 9 ++++-----
>   target/i386/cpu.h     | 1 +
>   4 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> index fdcbe8735258..57f4d50ace72 100644
> --- a/include/hw/core/cpu.h
> +++ b/include/hw/core/cpu.h
> @@ -277,7 +277,7 @@ struct qemu_work_item;
>    *   See TranslationBlock::TCG CF_CLUSTER_MASK.
>    * @tcg_cflags: Pre-computed cflags for this cpu.
>    * @nr_cores: Number of cores within this CPU package.
> - * @nr_threads: Number of threads within this CPU.
> + * @nr_threads: Number of threads within this CPU core.

This seems to be better as a separate patch.

Besides, if could, I think it's better to rename them to 
nr_threads_per_core and nr_cores_per_pkg.

Side topic. Ideally, it seems redundant to maintain @nr_threads and 
@nr_cores in struct CPUState because we have ms->smp to carry all the 
topology info. However, various code across different arches uses 
@nr_threads and @nr_cores, so it looks painful to get rid of them or 
replace them with something else.

>    * @running: #true if CPU is currently running (lockless).
>    * @has_waiter: #true if a CPU is currently waiting for the cpu_exec_end;
>    * valid under cpu_list_lock.
> diff --git a/softmmu/cpus.c b/softmmu/cpus.c
> index fed20ffb5dd2..984558d7b245 100644
> --- a/softmmu/cpus.c
> +++ b/softmmu/cpus.c
> @@ -630,7 +630,7 @@ void qemu_init_vcpu(CPUState *cpu)
>   {
>       MachineState *ms = MACHINE(qdev_get_machine());
>   
> -    cpu->nr_cores = ms->smp.cores;
> +    cpu->nr_cores = machine_topo_get_cores_per_socket(ms);
>       cpu->nr_threads =  ms->smp.threads;
>       cpu->stopped = true;
>       cpu->random_seed = qemu_guest_random_seed_thread_part1();
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 97ad229d8ba3..50613cd04612 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -6011,7 +6011,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>       X86CPUTopoInfo topo_info;
>   
>       topo_info.dies_per_pkg = env->nr_dies;
> -    topo_info.cores_per_die = cs->nr_cores;
> +    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;

This and the things below make me think it looks ugly that @nr_dies is 
added separately in struct CPUArchState for i386, because CPUState only 
has @nr_cores and @nr_threads. Further, for i386, it defines a specific 
struct X86CPUTopoInfo to contain topology info when setting up CPUID. To 
me, struct X86CPUTopoInfo is redundant with struct CpuTopology.

Maybe we can carry a struct CpuTopology in CPUState, so that we can drop 
@nr_threads and @nr_cores in CPUState for all arches, and @nr_dies in 
struct CPUArchState for i386. As well, topo_info can be dropped here.

>       topo_info.threads_per_core = cs->nr_threads;
>   
>       /* Calculate & apply limits for different index ranges */
> @@ -6087,8 +6087,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>                */
>               if (*eax & 31) {
>                   int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
> -                int vcpus_per_socket = env->nr_dies * cs->nr_cores *
> -                                       cs->nr_threads;
> +                int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
>                   if (cs->nr_cores > 1) {
>                       *eax &= ~0xFC000000;
>                       *eax |= (pow2ceil(cs->nr_cores) - 1) << 26;
> @@ -6266,12 +6265,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>               break;
>           case 1:
>               *eax = apicid_die_offset(&topo_info);
> -            *ebx = cs->nr_cores * cs->nr_threads;
> +            *ebx = topo_info.cores_per_die * topo_info.threads_per_core;
>               *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
>               break;
>           case 2:
>               *eax = apicid_pkg_offset(&topo_info);
> -            *ebx = env->nr_dies * cs->nr_cores * cs->nr_threads;
> +            *ebx = cs->nr_cores * cs->nr_threads;
>               *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
>               break;
>           default:
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index e0771a10433b..7638128d59cc 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1878,6 +1878,7 @@ typedef struct CPUArchState {
>   
>       TPRAccess tpr_access_type;
>   
> +    /* Number of dies within this CPU package. */
>       unsigned nr_dies;
>   } CPUX86State;
>   




* Re: [PATCH v3 01/17] i386: Fix comment style in topology.h
  2023-08-07  2:16   ` Xiaoyao Li
@ 2023-08-07  7:05     ` Zhao Liu
  0 siblings, 0 replies; 63+ messages in thread
From: Zhao Liu @ 2023-08-07  7:05 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Babu Moger, Zhao Liu

Hi Xiaoyao,

On Mon, Aug 07, 2023 at 10:16:46AM +0800, Xiaoyao Li wrote:
> Date: Mon, 7 Aug 2023 10:16:46 +0800
> From: Xiaoyao Li <xiaoyao.li@intel.com>
> Subject: Re: [PATCH v3 01/17] i386: Fix comment style in topology.h
> 
> On 8/1/2023 6:35 PM, Zhao Liu wrote:
> > From: Zhao Liu<zhao1.liu@intel.com>
> > 
> > For function comments in this file, keep the comment style consistent
> > with other places.
> > 
> > Signed-off-by: Zhao Liu<zhao1.liu@intel.com>
> > Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org
> 
> missing '>' at the end.

Good catch!

> 
> > Reviewed-by: Yanan Wang<wangyanan55@huawei.com>
> > Acked-by: Michael S. Tsirkin<mst@redhat.com>
> 
> Reviewed-by: Xiaoyao Li <xiaoyao.li@Intel.com>

Thanks!

-Zhao

> 



* Re: [PATCH v3 03/17] softmmu: Fix CPUSTATE.nr_cores' calculation
  2023-08-07  7:03   ` Xiaoyao Li
@ 2023-08-07  7:53     ` Zhao Liu
  2023-08-07  8:43       ` Xiaoyao Li
  0 siblings, 1 reply; 63+ messages in thread
From: Zhao Liu @ 2023-08-07  7:53 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Babu Moger, Zhao Liu,
	Zhuocheng Ding

Hi Xiaoyao,

On Mon, Aug 07, 2023 at 03:03:13PM +0800, Xiaoyao Li wrote:
> Date: Mon, 7 Aug 2023 15:03:13 +0800
> From: Xiaoyao Li <xiaoyao.li@intel.com>
> Subject: Re: [PATCH v3 03/17] softmmu: Fix CPUSTATE.nr_cores' calculation
> 
> On 8/1/2023 6:35 PM, Zhao Liu wrote:
> > From: Zhuocheng Ding <zhuocheng.ding@intel.com>
> > 
> >  From CPUState.nr_cores' comment, it represents "number of cores within
> > this CPU package".
> > 
> > After 003f230e37d7 ("machine: Tweak the order of topology members in
> > struct CpuTopology"), the meaning of smp.cores changed to "the number of
> > cores in one die", but this commit missed changing CPUState.nr_cores'
> > calculation, so CPUState.nr_cores became wrong: it now fails to account
> > for the numbers of clusters and dies.
> > 
> > At present, only i386 is using CPUState.nr_cores.
> > 
> > But as for i386, which supports die level, the uses of CPUState.nr_cores
> > are very confusing:
> > 
> > Early uses are based on the meaning of "cores per package" (before die
> > is introduced into i386), and later uses are based on "cores per die"
> > (after die's introduction).
> > 
> > This difference is because commit a94e1428991f ("target/i386: Add
> > CPUID.1F generation support for multi-dies PCMachine") misunderstood
> > CPUState.nr_cores to mean "cores per die" when calculating
> > CPUID.1FH.01H:EBX. After that, the changes in i386 all followed this
> > wrong understanding.
> > 
> > With the influence of 003f230e37d7 and a94e1428991f, for i386 currently
> > the result of CPUState.nr_cores is "cores per die", thus the original
> > uses of CPUState.nr_cores based on the meaning of "cores per package"
> > are wrong when multiple dies exist:
> > 1. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.01H:EBX[bits 23:16] is
> >     incorrect because it expects "cpus per package" but now the
> >     result is "cpus per die".
> > 2. In cpu_x86_cpuid() of target/i386/cpu.c, for all leaves of CPUID.04H:
> >     EAX[bits 31:26] is incorrect because they expect "cpus per package"
> >     but now the result is "cpus per die". The error not only impacts the
> >     EAX calculation in the cache_info_passthrough case, but also impacts other
> >     cases of setting cache topology for Intel CPU according to cpu
> >     topology (specifically, the incoming parameter "num_cores" expects
> >     "cores per package" in encode_cache_cpuid4()).
> > 3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.0BH.01H:EBX[bits
> >     15:00] is incorrect because the EBX of 0BH.01H (core level) expects
> >     "cpus per package", which may be different from 1FH.01H (the reason
> >     is 1FH can support more levels. For QEMU, 1FH also supports die,
> >     1FH.01H:EBX[bits 15:00] expects "cpus per die").
> > 4. In cpu_x86_cpuid() of target/i386/cpu.c, when CPUID.80000001H is
> >     calculated, here "cpus per package" is expected to be checked, but in
> >     fact, now it checks "cpus per die". Though "cpus per die" also works
> >     for this code logic, this isn't consistent with AMD's APM.
> > 5. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.80000008H:ECX expects
> >     "cpus per package" but it obtains "cpus per die".
> > 6. In simulate_rdmsr() of target/i386/hvf/x86_emu.c, in
> >     kvm_rdmsr_core_thread_count() of target/i386/kvm/kvm.c, and in
> >     helper_rdmsr() of target/i386/tcg/sysemu/misc_helper.c,
> >     MSR_CORE_THREAD_COUNT expects "cpus per package" and "cores per
> >     package", but in these functions, it obtains "cpus per die" and
> >     "cores per die".
> > 
> > On the other hand, these uses are correct now (they are added in/after
> > a94e1428991f):
> > 1. In cpu_x86_cpuid() of target/i386/cpu.c, topo_info.cores_per_die
> >     meets the actual meaning of CPUState.nr_cores ("cores per die").
> > 2. In cpu_x86_cpuid() of target/i386/cpu.c, vcpus_per_socket (in CPUID.
> >     04H's calculation) considers the number of dies, so it's correct.
> > 3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.1FH.01H:EBX[bits
> >     15:00] needs "cpus per die" and it gets the correct result, and
> >     CPUID.1FH.02H:EBX[bits 15:00] gets correct "cpus per package".
> > 
> > When CPUState.nr_cores is correctly changed back to "cores per package",
> > the above errors will be fixed without extra work, but the "currently"
> > correct cases will go wrong and need special handling to pass correct
> > "cpus/cores per die" they want.
> > 
> > Thus in this patch, we fix CPUState.nr_cores' calculation to fit the
> > original meaning "cores per package", as well as changing calculation of
> > topo_info.cores_per_die, vcpus_per_socket and CPUID.1FH.
> > 
> > In addition, in the nr_threads' comment, specify it represents the
> > number of threads in the "core" to avoid confusion, and also add comment
> > for nr_dies in CPUX86State.
> > 
> > Fixes: a94e1428991f ("target/i386: Add CPUID.1F generation support for multi-dies PCMachine")
> > Fixes: 003f230e37d7 ("machine: Tweak the order of topology members in struct CpuTopology")
> > Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> > Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > ---
> > Changes since v2:
> >   * Use wrapped helper to get cores per socket in qemu_init_vcpu().
> > Changes since v1:
> >   * Add comment for nr_dies in CPUX86State. (Yanan)
> > ---
> >   include/hw/core/cpu.h | 2 +-
> >   softmmu/cpus.c        | 2 +-
> >   target/i386/cpu.c     | 9 ++++-----
> >   target/i386/cpu.h     | 1 +
> >   4 files changed, 7 insertions(+), 7 deletions(-)
> > 
> > diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> > index fdcbe8735258..57f4d50ace72 100644
> > --- a/include/hw/core/cpu.h
> > +++ b/include/hw/core/cpu.h
> > @@ -277,7 +277,7 @@ struct qemu_work_item;
> >    *   See TranslationBlock::TCG CF_CLUSTER_MASK.
> >    * @tcg_cflags: Pre-computed cflags for this cpu.
> >    * @nr_cores: Number of cores within this CPU package.
> > - * @nr_threads: Number of threads within this CPU.
> > + * @nr_threads: Number of threads within this CPU core.
> 
> This seems to be better as a separate patch.

OK, I can split it into a new patch in v4.

> 
> Besides, if could, I think it's better to rename them to nr_threads_per_core
> and nr_cores_per_pkg.

Yeah, the names nr_threads and nr_cores are mostly historical, having been
around for a long time. With more CPU topology levels available today,
it's really worth reconsidering the names.

In the previous RFC for hybrid topology [1], I proposed a new structure
to replace nr_threads/nr_cores:

typedef struct TopologyState {
    int sockets;
    int cores_per_socket;
    int threads_per_socket;
    int dies_per_socket;
    int clusters_per_die;
    int cores_per_cluster;
    int threads_per_core;
} TopologyState;

I'm still working on hybrid topology, and I'll continue to follow up on
this cleanup! :-)

[1]: https://mail.gnu.org/archive/html/qemu-devel/2023-02/msg03235.html

> 
> Side topic. Ideally, it seems redundant to maintain the @nr_threads and
> @nr_cores in struct CPUState because we have ms->smp to carry all the
> topology info. However, various codes across different arches use
> @nr_threads and @nr_cores. It looks painful to get rid of it or use
> something else to replace it.

IIUC, the user-mode emulator needs the topology info in CPUState, while
the system emulator could use ms->smp directly.

> 
> >    * @running: #true if CPU is currently running (lockless).
> >    * @has_waiter: #true if a CPU is currently waiting for the cpu_exec_end;
> >    * valid under cpu_list_lock.
> > diff --git a/softmmu/cpus.c b/softmmu/cpus.c
> > index fed20ffb5dd2..984558d7b245 100644
> > --- a/softmmu/cpus.c
> > +++ b/softmmu/cpus.c
> > @@ -630,7 +630,7 @@ void qemu_init_vcpu(CPUState *cpu)
> >   {
> >       MachineState *ms = MACHINE(qdev_get_machine());
> > -    cpu->nr_cores = ms->smp.cores;
> > +    cpu->nr_cores = machine_topo_get_cores_per_socket(ms);
> >       cpu->nr_threads =  ms->smp.threads;
> >       cpu->stopped = true;
> >       cpu->random_seed = qemu_guest_random_seed_thread_part1();
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 97ad229d8ba3..50613cd04612 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -6011,7 +6011,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >       X86CPUTopoInfo topo_info;
> >       topo_info.dies_per_pkg = env->nr_dies;
> > -    topo_info.cores_per_die = cs->nr_cores;
> > +    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
> 
> This and the things below make me think it looks ugly that @nr_dies is
> added separately in struct CPUArchState for i386, because CPUState only
> has @nr_cores and @nr_threads. Further, for i386, it defines a specific
> struct X86CPUTopoInfo to contain topology info when setting up CPUID. To
> me, struct X86CPUTopoInfo is redundant with struct CpuTopology.
> 
> Maybe we can carry a struct CpuTopology in CPUState, so that we can drop
> @nr_threads and @nr_cores in CPUState for all arches, and @nr_dies in
> struct CPUArchState for i386. As well, topo_info can be dropped here.

Yeah, I agree. We think the same way, as done in [1].

About X86CPUTopoInfo, it's still necessary to keep it to help encode the
APIC ID. In the hybrid topology case, the APIC ID is likely discontinuous,
and the width of each CPU level in the APIC ID depends on the maximum
number of elements in that level. So I also proposed to rename it to
X86ApicidTopoInfo [2] and count the maximum number of elements in each
CPU level.

[2]: https://mail.gnu.org/archive/html/qemu-devel/2023-02/msg03237.html

Thanks,
Zhao

> 
> >       topo_info.threads_per_core = cs->nr_threads;
> >       /* Calculate & apply limits for different index ranges */
> > @@ -6087,8 +6087,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >                */
> >               if (*eax & 31) {
> >                   int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
> > -                int vcpus_per_socket = env->nr_dies * cs->nr_cores *
> > -                                       cs->nr_threads;
> > +                int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
> >                   if (cs->nr_cores > 1) {
> >                       *eax &= ~0xFC000000;
> >                       *eax |= (pow2ceil(cs->nr_cores) - 1) << 26;
> > @@ -6266,12 +6265,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >               break;
> >           case 1:
> >               *eax = apicid_die_offset(&topo_info);
> > -            *ebx = cs->nr_cores * cs->nr_threads;
> > +            *ebx = topo_info.cores_per_die * topo_info.threads_per_core;
> >               *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
> >               break;
> >           case 2:
> >               *eax = apicid_pkg_offset(&topo_info);
> > -            *ebx = env->nr_dies * cs->nr_cores * cs->nr_threads;
> > +            *ebx = cs->nr_cores * cs->nr_threads;
> >               *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
> >               break;
> >           default:
> > diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> > index e0771a10433b..7638128d59cc 100644
> > --- a/target/i386/cpu.h
> > +++ b/target/i386/cpu.h
> > @@ -1878,6 +1878,7 @@ typedef struct CPUArchState {
> >       TPRAccess tpr_access_type;
> > +    /* Number of dies within this CPU package. */
> >       unsigned nr_dies;
> >   } CPUX86State;
> 



* Re: [PATCH v3 05/17] i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4]
  2023-08-01 10:35 ` [PATCH v3 05/17] i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4] Zhao Liu
  2023-08-02 15:41   ` Moger, Babu
@ 2023-08-07  8:13   ` Xiaoyao Li
  2023-08-07  9:30     ` Zhao Liu
  1 sibling, 1 reply; 63+ messages in thread
From: Xiaoyao Li @ 2023-08-07  8:13 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini
  Cc: qemu-devel, Zhenyu Wang, Babu Moger, Zhao Liu, Robert Hoo

On 8/1/2023 6:35 PM, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
> 
> Refer to the fixes of cache_info_passthrough ([1], [2]) and SDM, the
> CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26] should use the
> nearest power-of-2 integer.

I doubt it. Especially for [1].

The SDM doesn't state it should be the nearest power-of-2 integer.
For example, regarding EAX[25:14], the SDM states that:

1. The value needs to be added with 1
 
2. The nearest power-of-2 integer that is not smaller than 
(1+EAX[25:14]) is the number of unique initial APIC IDs reserved for 
addressing different logical processor sharing this cache.

The above indicates that

1. "EAX[25:14] + 1" is the real number of LPs sharing this cache, i.e.,
how many APIC IDs are actually in use,

while 2. "the nearest power-of-2 integer that is not smaller than
(EAX[25:14] + 1)" indicates how many APIC IDs are reserved for the LPs
sharing this cache. It doesn't require EAX[25:14] + 1 to be a power of 2.
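The decoding described above can be sketched as follows (a minimal sketch: pow2ceil32() is a stand-in for QEMU's pow2ceil(), not the real helper):

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for QEMU's pow2ceil(): smallest power of 2 >= x. */
static uint32_t pow2ceil32(uint32_t x)
{
    uint32_t p = 1;
    while (p < x) {
        p <<= 1;
    }
    return p;
}

/* Number of LPs actually sharing the cache: raw field plus 1. */
static uint32_t lps_sharing_cache(uint32_t eax)
{
    return ((eax >> 14) & 0xfff) + 1;
}

/* APIC IDs reserved for addressing those LPs: nearest power of 2
 * not smaller than the sharing count. */
static uint32_t apic_ids_reserved(uint32_t eax)
{
    return pow2ceil32(lps_sharing_cache(eax));
}
```

So a raw field of 6 means 7 LPs share the cache, while 8 APIC IDs are reserved for them; the two numbers need not match.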

> The nearest power-of-2 integer can be calculated by pow2ceil() or by
> using APIC ID offset (like L3 topology using 1 << die_offset [3]).
> 
> But in fact, CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26]
> are associated with APIC ID. For example, in linux kernel, the field
> "num_threads_sharing" (Bits 25 - 14) is parsed with APIC ID. And for
> another example, on Alder Lake P, the CPUID.04H:EAX[bits 31:26] is not
> matched with actual core numbers and it's calculated by:
> "(1 << (pkg_offset - core_offset)) - 1".
> 
> Therefore the offset of APIC ID should be preferred to calculate the
> nearest power-of-2 integer for CPUID.04H:EAX[bits 25:14] and
> CPUID.04H:EAX[bits 31:26]:
> 1. d/i cache is shared in a core, 1 << core_offset should be used
>     instead of "cs->nr_threads" in encode_cache_cpuid4() for
>     CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits 25:14].
> 2. L2 cache is supposed to be shared in a core as for now, thereby
>     1 << core_offset should also be used instead of "cs->nr_threads" in
>     encode_cache_cpuid4() for CPUID.04H.02H:EAX[bits 25:14].
> 3. Similarly, the value for CPUID.04H:EAX[bits 31:26] should also be
>     replaced by the offsets upper SMT level in APIC ID.
> 
> In addition, use APIC ID offset to replace "pow2ceil()" for
> cache_info_passthrough case.
> 
> [1]: efb3934adf9e ("x86: cpu: make sure number of addressable IDs for processor cores meets the spec")
> [2]: d7caf13b5fcf ("x86: cpu: fixup number of addressable IDs for logical processors sharing cache")
> [3]: d65af288a84d ("i386: Update new x86_apicid parsing rules with die_offset support")
> 
> Fixes: 7e3482f82480 ("i386: Helpers to encode cache information consistently")
> Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
> Changes since v1:
>   * Use APIC ID offset to replace "pow2ceil()" for cache_info_passthrough
>     case. (Yanan)
>   * Split the L1 cache fix into a separate patch.
>   * Rename the title of this patch (the original is "i386/cpu: Fix number
>     of addressable IDs in CPUID.04H").
> ---
>   target/i386/cpu.c | 30 +++++++++++++++++++++++-------
>   1 file changed, 23 insertions(+), 7 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index b439a05244ee..c80613bfcded 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -6005,7 +6005,6 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>   {
>       X86CPU *cpu = env_archcpu(env);
>       CPUState *cs = env_cpu(env);
> -    uint32_t die_offset;
>       uint32_t limit;
>       uint32_t signature[3];
>       X86CPUTopoInfo topo_info;
> @@ -6089,39 +6088,56 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>                   int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
>                   int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
>                   if (cs->nr_cores > 1) {
> +                    int addressable_cores_offset =
> +                                                apicid_pkg_offset(&topo_info) -
> +                                                apicid_core_offset(&topo_info);
> +
>                       *eax &= ~0xFC000000;
> -                    *eax |= (pow2ceil(cs->nr_cores) - 1) << 26;
> +                    *eax |= (1 << addressable_cores_offset - 1) << 26;
>                   }
>                   if (host_vcpus_per_cache > vcpus_per_socket) {
> +                    int pkg_offset = apicid_pkg_offset(&topo_info);
> +
>                       *eax &= ~0x3FFC000;
> -                    *eax |= (pow2ceil(vcpus_per_socket) - 1) << 14;
> +                    *eax |= (1 << pkg_offset - 1) << 14;
>                   }
>               }
>           } else if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
>               *eax = *ebx = *ecx = *edx = 0;
>           } else {
>               *eax = 0;
> +            int addressable_cores_offset = apicid_pkg_offset(&topo_info) -
> +                                           apicid_core_offset(&topo_info);
> +            int core_offset, die_offset;
> +
>               switch (count) {
>               case 0: /* L1 dcache info */
> +                core_offset = apicid_core_offset(&topo_info);
>                   encode_cache_cpuid4(env->cache_info_cpuid4.l1d_cache,
> -                                    cs->nr_threads, cs->nr_cores,
> +                                    (1 << core_offset),
> +                                    (1 << addressable_cores_offset),
>                                       eax, ebx, ecx, edx);
>                   break;
>               case 1: /* L1 icache info */
> +                core_offset = apicid_core_offset(&topo_info);
>                   encode_cache_cpuid4(env->cache_info_cpuid4.l1i_cache,
> -                                    cs->nr_threads, cs->nr_cores,
> +                                    (1 << core_offset),
> +                                    (1 << addressable_cores_offset),
>                                       eax, ebx, ecx, edx);
>                   break;
>               case 2: /* L2 cache info */
> +                core_offset = apicid_core_offset(&topo_info);
>                   encode_cache_cpuid4(env->cache_info_cpuid4.l2_cache,
> -                                    cs->nr_threads, cs->nr_cores,
> +                                    (1 << core_offset),
> +                                    (1 << addressable_cores_offset),
>                                       eax, ebx, ecx, edx);
>                   break;
>               case 3: /* L3 cache info */
>                   die_offset = apicid_die_offset(&topo_info);
>                   if (cpu->enable_l3_cache) {
>                       encode_cache_cpuid4(env->cache_info_cpuid4.l3_cache,
> -                                        (1 << die_offset), cs->nr_cores,
> +                                        (1 << die_offset),
> +                                        (1 << addressable_cores_offset),
>                                           eax, ebx, ecx, edx);
>                       break;
>                   }




* Re: [PATCH v3 03/17] softmmu: Fix CPUSTATE.nr_cores' calculation
  2023-08-07  7:53     ` Zhao Liu
@ 2023-08-07  8:43       ` Xiaoyao Li
  2023-08-07 10:00         ` Zhao Liu
  0 siblings, 1 reply; 63+ messages in thread
From: Xiaoyao Li @ 2023-08-07  8:43 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Babu Moger, Zhao Liu,
	Zhuocheng Ding

On 8/7/2023 3:53 PM, Zhao Liu wrote:
>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>> index 97ad229d8ba3..50613cd04612 100644
>>> --- a/target/i386/cpu.c
>>> +++ b/target/i386/cpu.c
>>> @@ -6011,7 +6011,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>>>        X86CPUTopoInfo topo_info;
>>>        topo_info.dies_per_pkg = env->nr_dies;
>>> -    topo_info.cores_per_die = cs->nr_cores;
>>> +    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
>> This and below things make me think that, it looks ugly that @nr_dies is
>> added separately in struct CPUArchState for i386 because CPUState only has
>> @nr_cores and nr_threads. Further, for i386, it defines a specific struct
>> X86CPUTopoInfo to contain topology info when setting up CPUID. To me, struct
>> X86CPUTopoInfo is redundant as struct CpuTopology.
>>
>> maybe we can carry a struct CpuTopology in CPUState, so that we can drop
>> @nr_threads, @nr_cores in CPUState for all ARCHes, and @nr_dies in struct
>> CPUArchState for i386. As well, topo_info can be dropped here.
> Yeah, I agree. We think the same way, as we did in [1].
> 
> About X86CPUTopoInfo, it's still necessary to keep it to help encode
> the APIC ID. 

typedef struct X86CPUTopoInfo {
     unsigned dies_per_pkg;
     unsigned cores_per_die;
     unsigned threads_per_core;
} X86CPUTopoInfo;

/**
  * CpuTopology:
  * @cpus: the number of present logical processors on the machine
  * @sockets: the number of sockets on the machine
  * @dies: the number of dies in one socket
  * @clusters: the number of clusters in one die
  * @cores: the number of cores in one cluster
  * @threads: the number of threads in one core
  * @max_cpus: the maximum number of logical processors on the machine
  */
typedef struct CpuTopology {
     unsigned int cpus;
     unsigned int sockets;
     unsigned int dies;
     unsigned int clusters;
     unsigned int cores;
     unsigned int threads;
     unsigned int max_cpus;
} CpuTopology;

I think 'struct X86CPUTopoInfo' is just a subset of 'struct CpuTopology'

IIUC, currently the value of X86CPUTopoInfo::dies_per_pkg should equal
CpuTopology::dies, and the same for cores_per_die and threads_per_core.

So it's OK to keep a copy of 'struct CpuTopology' in CPUState and drop
'struct X86CPUTopoInfo'.
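A rough sketch of the consolidation suggested here (hypothetical shapes mirroring the QEMU names, not the actual change):

```c
#include <assert.h>

/* Same shape as QEMU's CpuTopology (counts are per package). */
typedef struct CpuTopology {
    unsigned int sockets, dies, clusters, cores, threads;
} CpuTopology;

/* Hypothetical CPUState carrying the full topology, replacing the
 * separate cs->nr_cores / cs->nr_threads fields and i386's
 * env->nr_dies. */
typedef struct CPUState {
    CpuTopology topo;
} CPUState;

/* The cs->nr_cores / env->nr_dies division then falls out of one
 * structure instead of two. */
static unsigned int cores_per_die(const CPUState *cs)
{
    return cs->topo.cores / cs->topo.dies;
}
```

With 2 dies and 8 cores per package, cores_per_die() yields 4, matching the fixed calculation in the patch under discussion.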

> For hybrid topology case, the APICID is likely discontinuous,
> and the width of each CPU level in APICID depends on the maximum number
> of elements in this level. So I also proposed to rename it to
> X86ApicidTopoInfo [2] and count the maximum number of elements in each
> CPU level.

Do you mean, for example, for hybrid topology, 
X86CPUTopoInfo::dies_per_pkg != CpuTopology::dies? Or after rename
X86CPUTopoInfo::max_dies != CpuTopology::dies?

> [2]:https://mail.gnu.org/archive/html/qemu-devel/2023-02/msg03237.html
> 
> Thanks,
> Zhao
> 




* Re: [PATCH v3 05/17] i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4]
  2023-08-07  8:13   ` Xiaoyao Li
@ 2023-08-07  9:30     ` Zhao Liu
  0 siblings, 0 replies; 63+ messages in thread
From: Zhao Liu @ 2023-08-07  9:30 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Babu Moger, Zhao Liu,
	Robert Hoo

Hi Xiaoyao,

On Mon, Aug 07, 2023 at 04:13:36PM +0800, Xiaoyao Li wrote:
> Date: Mon, 7 Aug 2023 16:13:36 +0800
> From: Xiaoyao Li <xiaoyao.li@intel.com>
> Subject: Re: [PATCH v3 05/17] i386/cpu: Use APIC ID offset to encode cache
>  topo in CPUID[4]
> 
> On 8/1/2023 6:35 PM, Zhao Liu wrote:
> > From: Zhao Liu <zhao1.liu@intel.com>
> > 
> > Refer to the fixes of cache_info_passthrough ([1], [2]) and SDM, the
> > CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26] should use the
> > nearest power-of-2 integer.
> 
> I doubt it. Especially for [1].
> 
> SDM doesn't state it should be the nearest power-of-2 integer.
> For example, regarding EAX[25:14], what SDM states are,
> 
> 1. The value needs to be added with 1
>  
> 2. The nearest power-of-2 integer that is not smaller than (1+EAX[25:14]) is
> the number of unique initial APIC IDs reserved for addressing different
> logical processor sharing this cache.
> 
> The above indicates that
> 
> 1. "EAX[25:14] + 1" is the real number of LPs sharing this cache, i.e.,
> how many APIC IDs are actually in use,
> 
> while 2. "the nearest power-of-2 integer that is not smaller than
> (EAX[25:14] + 1)" indicates how many APIC IDs are reserved for the LPs
> sharing this cache. It doesn't require EAX[25:14] + 1 to be a power of 2.

Semantically, it is true that the SDM does not strictly require
EAX[25:14] + 1 to be a power of 2.

But for our emulation, it's hard to define how much bigger than the
nearest power of 2 EAX[25:14] + 1 should be...there's no rule for it...

Using the nearest power of 2 directly is a common and generic way (and
in line with the spec). :-)

On the actual machines I've seen, this field is also implemented as
"power-of-2 - 1". (If you meet a counterexample, please educate me.)
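The offset-based calculation the patch switches to can be illustrated like this (a sketch: these helpers mirror the apicid_*_offset() functions in include/hw/i386/topology.h, but they are simplified reimplementations, not the real ones):

```c
#include <assert.h>

/* Simplified stand-in for QEMU's X86CPUTopoInfo. */
typedef struct TopoInfo {
    unsigned dies_per_pkg;
    unsigned cores_per_die;
    unsigned threads_per_core;
} TopoInfo;

/* ceil(log2(count)): bits needed to address `count` elements. */
static unsigned apicid_bitwidth(unsigned count)
{
    unsigned w = 0;
    while ((1u << w) < count) {
        w++;
    }
    return w;
}

static unsigned apicid_core_offset(const TopoInfo *t)
{
    return apicid_bitwidth(t->threads_per_core);
}

static unsigned apicid_die_offset(const TopoInfo *t)
{
    return apicid_core_offset(t) + apicid_bitwidth(t->cores_per_die);
}

static unsigned apicid_pkg_offset(const TopoInfo *t)
{
    return apicid_die_offset(t) + apicid_bitwidth(t->dies_per_pkg);
}
```

For 2 dies x 3 cores x 2 threads, pkg_offset - core_offset is 3, so the addressable-cores value encoded into EAX[31:26] is (1 << 3) - 1 = 7, even though the package only has 6 cores.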

Thanks,
Zhao

> 
> > The nearest power-of-2 integer can be calculated by pow2ceil() or by
> > using APIC ID offset (like L3 topology using 1 << die_offset [3]).
> > 
> > But in fact, CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26]
> > are associated with APIC ID. For example, in linux kernel, the field
> > "num_threads_sharing" (Bits 25 - 14) is parsed with APIC ID. And for
> > another example, on Alder Lake P, the CPUID.04H:EAX[bits 31:26] is not
> > matched with actual core numbers and it's calculated by:
> > "(1 << (pkg_offset - core_offset)) - 1".
> > 
> > Therefore the offset of APIC ID should be preferred to calculate the
> > nearest power-of-2 integer for CPUID.04H:EAX[bits 25:14] and
> > CPUID.04H:EAX[bits 31:26]:
> > 1. d/i cache is shared in a core, 1 << core_offset should be used
> >     instead of "cs->nr_threads" in encode_cache_cpuid4() for
> >     CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits 25:14].
> > 2. L2 cache is supposed to be shared in a core as for now, thereby
> >     1 << core_offset should also be used instead of "cs->nr_threads" in
> >     encode_cache_cpuid4() for CPUID.04H.02H:EAX[bits 25:14].
> > 3. Similarly, the value for CPUID.04H:EAX[bits 31:26] should also be
> >     replaced by the offsets upper SMT level in APIC ID.
> > 
> > In addition, use APIC ID offset to replace "pow2ceil()" for
> > cache_info_passthrough case.
> > 
> > [1]: efb3934adf9e ("x86: cpu: make sure number of addressable IDs for processor cores meets the spec")
> > [2]: d7caf13b5fcf ("x86: cpu: fixup number of addressable IDs for logical processors sharing cache")
> > [3]: d65af288a84d ("i386: Update new x86_apicid parsing rules with die_offset support")
> > 
> > Fixes: 7e3482f82480 ("i386: Helpers to encode cache information consistently")
> > Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > ---
> > Changes since v1:
> >   * Use APIC ID offset to replace "pow2ceil()" for cache_info_passthrough
> >     case. (Yanan)
> >   * Split the L1 cache fix into a separate patch.
> >   * Rename the title of this patch (the original is "i386/cpu: Fix number
> >     of addressable IDs in CPUID.04H").
> > ---
> >   target/i386/cpu.c | 30 +++++++++++++++++++++++-------
> >   1 file changed, 23 insertions(+), 7 deletions(-)
> > 
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index b439a05244ee..c80613bfcded 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -6005,7 +6005,6 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >   {
> >       X86CPU *cpu = env_archcpu(env);
> >       CPUState *cs = env_cpu(env);
> > -    uint32_t die_offset;
> >       uint32_t limit;
> >       uint32_t signature[3];
> >       X86CPUTopoInfo topo_info;
> > @@ -6089,39 +6088,56 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >                   int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
> >                   int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
> >                   if (cs->nr_cores > 1) {
> > +                    int addressable_cores_offset =
> > +                                                apicid_pkg_offset(&topo_info) -
> > +                                                apicid_core_offset(&topo_info);
> > +
> >                       *eax &= ~0xFC000000;
> > -                    *eax |= (pow2ceil(cs->nr_cores) - 1) << 26;
> > +                    *eax |= (1 << addressable_cores_offset - 1) << 26;
> >                   }
> >                   if (host_vcpus_per_cache > vcpus_per_socket) {
> > +                    int pkg_offset = apicid_pkg_offset(&topo_info);
> > +
> >                       *eax &= ~0x3FFC000;
> > -                    *eax |= (pow2ceil(vcpus_per_socket) - 1) << 14;
> > +                    *eax |= (1 << pkg_offset - 1) << 14;
> >                   }
> >               }
> >           } else if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
> >               *eax = *ebx = *ecx = *edx = 0;
> >           } else {
> >               *eax = 0;
> > +            int addressable_cores_offset = apicid_pkg_offset(&topo_info) -
> > +                                           apicid_core_offset(&topo_info);
> > +            int core_offset, die_offset;
> > +
> >               switch (count) {
> >               case 0: /* L1 dcache info */
> > +                core_offset = apicid_core_offset(&topo_info);
> >                   encode_cache_cpuid4(env->cache_info_cpuid4.l1d_cache,
> > -                                    cs->nr_threads, cs->nr_cores,
> > +                                    (1 << core_offset),
> > +                                    (1 << addressable_cores_offset),
> >                                       eax, ebx, ecx, edx);
> >                   break;
> >               case 1: /* L1 icache info */
> > +                core_offset = apicid_core_offset(&topo_info);
> >                   encode_cache_cpuid4(env->cache_info_cpuid4.l1i_cache,
> > -                                    cs->nr_threads, cs->nr_cores,
> > +                                    (1 << core_offset),
> > +                                    (1 << addressable_cores_offset),
> >                                       eax, ebx, ecx, edx);
> >                   break;
> >               case 2: /* L2 cache info */
> > +                core_offset = apicid_core_offset(&topo_info);
> >                   encode_cache_cpuid4(env->cache_info_cpuid4.l2_cache,
> > -                                    cs->nr_threads, cs->nr_cores,
> > +                                    (1 << core_offset),
> > +                                    (1 << addressable_cores_offset),
> >                                       eax, ebx, ecx, edx);
> >                   break;
> >               case 3: /* L3 cache info */
> >                   die_offset = apicid_die_offset(&topo_info);
> >                   if (cpu->enable_l3_cache) {
> >                       encode_cache_cpuid4(env->cache_info_cpuid4.l3_cache,
> > -                                        (1 << die_offset), cs->nr_cores,
> > +                                        (1 << die_offset),
> > +                                        (1 << addressable_cores_offset),
> >                                           eax, ebx, ecx, edx);
> >                       break;
> >                   }
> 



* Re: [PATCH v3 03/17] softmmu: Fix CPUSTATE.nr_cores' calculation
  2023-08-07  8:43       ` Xiaoyao Li
@ 2023-08-07 10:00         ` Zhao Liu
  2023-08-07 14:20           ` Xiaoyao Li
  0 siblings, 1 reply; 63+ messages in thread
From: Zhao Liu @ 2023-08-07 10:00 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Babu Moger, Zhao Liu,
	Zhuocheng Ding

Hi Xiaoyao,

On Mon, Aug 07, 2023 at 04:43:32PM +0800, Xiaoyao Li wrote:
> Date: Mon, 7 Aug 2023 16:43:32 +0800
> From: Xiaoyao Li <xiaoyao.li@intel.com>
> Subject: Re: [PATCH v3 03/17] softmmu: Fix CPUSTATE.nr_cores' calculation
> 
> On 8/7/2023 3:53 PM, Zhao Liu wrote:
> > > > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > > > index 97ad229d8ba3..50613cd04612 100644
> > > > --- a/target/i386/cpu.c
> > > > +++ b/target/i386/cpu.c
> > > > @@ -6011,7 +6011,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> > > >        X86CPUTopoInfo topo_info;
> > > >        topo_info.dies_per_pkg = env->nr_dies;
> > > > -    topo_info.cores_per_die = cs->nr_cores;
> > > > +    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
> > > This and below things make me think that, it looks ugly that @nr_dies is
> > > added separately in struct CPUArchState for i386 because CPUState only has
> > > @nr_cores and nr_threads. Further, for i386, it defines a specific struct
> > > X86CPUTopoInfo to contain topology info when setting up CPUID. To me, struct
> > > X86CPUTopoInfo is redundant as struct CpuTopology.
> > > 
> > > maybe we can carry a struct CpuTopology in CPUState, so that we can drop
> > > @nr_threads, @nr_cores in CPUState for all ARCHes, and @nr_dies in struct
> > > CPUArchState for i386. As well, topo_info can be dropped here.
> > Yeah, I agree. We think the same way, as did in [1].
> > 
> > About X86CPUTopoInfo, it's still necessary to keep to help encode
> > APICID.
> 
> typedef struct X86CPUTopoInfo {
>     unsigned dies_per_pkg;
>     unsigned cores_per_die;
>     unsigned threads_per_core;
> } X86CPUTopoInfo;
> 
> /**
>  * CpuTopology:
>  * @cpus: the number of present logical processors on the machine
>  * @sockets: the number of sockets on the machine
>  * @dies: the number of dies in one socket
>  * @clusters: the number of clusters in one die
>  * @cores: the number of cores in one cluster
>  * @threads: the number of threads in one core
>  * @max_cpus: the maximum number of logical processors on the machine
>  */
> typedef struct CpuTopology {
>     unsigned int cpus;
>     unsigned int sockets;
>     unsigned int dies;
>     unsigned int clusters;
>     unsigned int cores;
>     unsigned int threads;
>     unsigned int max_cpus;
> } CpuTopology;
> 
> I think 'struct X86CPUTopoInfo' is just a subset of 'struct CpuTopology'

For smp topology, it's correct.

> 
> IIUC, currently the value of X86CPUTopoInfo::dies_per_pkg should equal
> CpuTopology::dies, and the same for cores_per_die and threads_per_core.
> 
> So it's OK to keep a copy of 'struct CpuTopology' in CPUState and drop
> 'struct X86CPUTopoInfo'.
> 
> > For hybrid topology case, the APICID is likely discontinuous,
> > and the width of each CPU level in APICID depends on the maximum number
> > of elements in this level. So I also proposed to rename it to
> > X86ApicidTopoInfo [2] and count the maximum number of elements in each
> > CPU level.
> 
> Do you mean, for example, for hybrid topology, X86CPUTopoInfo::dies_per_pkg
> != CpuTopology::dies? Or after rename
> X86CPUTopoInfo::max_dies != CpuTopology::dies?

I mean the latter.

A more typical example nowadays is the thread level.

X86CPUTopoInfo::max_threads may not equal CpuTopology::threads,

since a P-core has 2 threads per core and an E-core doesn't support SMT.

The CpuTopology in CPUState should reflect the topology information for
the current CPU, so CpuTopology::threads is 2 for a P-core and
CpuTopology::threads = 1 for an E-core.

But the width of the SMT level in the APIC ID must be fixed, so the SMT
width should be determined by X86CPUTopoInfo::max_threads. Current hybrid
platforms implement it the same way.
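The fixed SMT width can be shown with a tiny sketch (a hypothetical layout assuming only SMT bits below the core index; apicid() is an illustrative helper, not a QEMU function):

```c
#include <assert.h>

/* ceil(log2(n)): bits reserved for the SMT level. */
static unsigned ceil_log2(unsigned n)
{
    unsigned w = 0;
    while ((1u << w) < n) {
        w++;
    }
    return w;
}

/* The SMT field width comes from the widest core type (max_threads),
 * so single-thread E-cores leave holes in the APIC ID space. */
static unsigned apicid(unsigned core_idx, unsigned thread_idx,
                       unsigned max_threads)
{
    return (core_idx << ceil_log2(max_threads)) | thread_idx;
}
```

With max_threads = 2, two 2-thread P-cores occupy APIC IDs 0-3, while single-thread E-cores at core indices 2 and 3 get IDs 4 and 6; IDs 5 and 7 stay unused, which is exactly the discontinuity mentioned above.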

Thanks,
Zhao

> 
> > [2]:https://mail.gnu.org/archive/html/qemu-devel/2023-02/msg03237.html
> > 
> > Thanks,
> > Zhao
> > 
> 



* Re: [PATCH v3 03/17] softmmu: Fix CPUSTATE.nr_cores' calculation
  2023-08-07 10:00         ` Zhao Liu
@ 2023-08-07 14:20           ` Xiaoyao Li
  2023-08-07 14:42             ` Zhao Liu
  0 siblings, 1 reply; 63+ messages in thread
From: Xiaoyao Li @ 2023-08-07 14:20 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Babu Moger, Zhao Liu,
	Zhuocheng Ding

On 8/7/2023 6:00 PM, Zhao Liu wrote:
> Hi Xiaoyao,
> 
> On Mon, Aug 07, 2023 at 04:43:32PM +0800, Xiaoyao Li wrote:
>> Date: Mon, 7 Aug 2023 16:43:32 +0800
>> From: Xiaoyao Li <xiaoyao.li@intel.com>
>> Subject: Re: [PATCH v3 03/17] softmmu: Fix CPUSTATE.nr_cores' calculation
>>
>> On 8/7/2023 3:53 PM, Zhao Liu wrote:
>>>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>>>> index 97ad229d8ba3..50613cd04612 100644
>>>>> --- a/target/i386/cpu.c
>>>>> +++ b/target/i386/cpu.c
>>>>> @@ -6011,7 +6011,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>>>>>         X86CPUTopoInfo topo_info;
>>>>>         topo_info.dies_per_pkg = env->nr_dies;
>>>>> -    topo_info.cores_per_die = cs->nr_cores;
>>>>> +    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
>>>> This and below things make me think that, it looks ugly that @nr_dies is
>>>> added separately in struct CPUArchState for i386 because CPUState only has
>>>> @nr_cores and nr_threads. Further, for i386, it defines a specific struct
>>>> X86CPUTopoInfo to contain topology info when setting up CPUID. To me, struct
>>>> X86CPUTopoInfo is redundant as struct CpuTopology.
>>>>
>>>> maybe we can carry a struct CpuTopology in CPUState, so that we can drop
>>>> @nr_threads, @nr_cores in CPUState for all ARCHes, and @nr_dies in struct
>>>> CPUArchState for i386. As well, topo_info can be dropped here.
>>> Yeah, I agree. We think the same way, as did in [1].
>>>
>>> About X86CPUTopoInfo, it's still necessary to keep to help encode
>>> APICID.
>>
>> typedef struct X86CPUTopoInfo {
>>      unsigned dies_per_pkg;
>>      unsigned cores_per_die;
>>      unsigned threads_per_core;
>> } X86CPUTopoInfo;
>>
>> /**
>>   * CpuTopology:
>>   * @cpus: the number of present logical processors on the machine
>>   * @sockets: the number of sockets on the machine
>>   * @dies: the number of dies in one socket
>>   * @clusters: the number of clusters in one die
>>   * @cores: the number of cores in one cluster
>>   * @threads: the number of threads in one core
>>   * @max_cpus: the maximum number of logical processors on the machine
>>   */
>> typedef struct CpuTopology {
>>      unsigned int cpus;
>>      unsigned int sockets;
>>      unsigned int dies;
>>      unsigned int clusters;
>>      unsigned int cores;
>>      unsigned int threads;
>>      unsigned int max_cpus;
>> } CpuTopology;
>>
>> I think 'struct X86CPUTopoInfo' is just a subset of 'struct CpuTopology'
> 
> For smp topology, it's correct.
> 
>>
>> IIUC, currently the value of X86CPUTopoInfo::dies_per_pkg should equal
>> CpuTopology::dies, and the same for cores_per_die and threads_per_core.
>>
>> So it's OK to keep a copy of 'struct CpuTopology' in CPUState and drop
>> 'struct X86CPUTopoInfo'.
>>
>>> For hybrid topology case, the APICID is likely discontinuous,
>>> and the width of each CPU level in APICID depends on the maximum number
>>> of elements in this level. So I also proposed to rename it to
>>> X86ApicidTopoInfo [2] and count the maximum number of elements in each
>>> CPU level.
>>
>> Do you mean, for example, for hybrid topology, X86CPUTopoInfo::dies_per_pkg
>> != CpuTopology::dies? Or after rename
>> X86CPUTopoInfo::max_dies != CpuTopology::dies?
> 
> I mean the latter.
> 
> A more typical example nowadays is thread level.
> 
> X86CPUTopoInfo::max_threads may not equal CpuTopology::threads,
> 
> since P core has 2 threads per core and E core doesn't support SMT.
> 
> The CpuTopology in CPUState should reflect the topology information for
> current CPU, so CpuTopology::threads is 2 for P core and
> CpuTopology::threads = 1 for E core.
> 
> But the width of the SMT level in APICID must be fixed, so that SMT width
> should be determined by X86CPUTopoInfo::max_threads. Current hybrid
> platforms implement it the same way.

I see.

Can we pull the patch into this series (define a common CPUTopoInfo in 
CPUState and drop env->nr_dies, cs->nr_cores and cs->nr_threads) and let 
the later hybrid series rename it to X86ApicidTopoInfo?

> Thanks,
> Zhao
> 
>>
>>> [2]:https://mail.gnu.org/archive/html/qemu-devel/2023-02/msg03237.html
>>>
>>> Thanks,
>>> Zhao
>>>
>>




* Re: [PATCH v3 03/17] softmmu: Fix CPUSTATE.nr_cores' calculation
  2023-08-07 14:20           ` Xiaoyao Li
@ 2023-08-07 14:42             ` Zhao Liu
  0 siblings, 0 replies; 63+ messages in thread
From: Zhao Liu @ 2023-08-07 14:42 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Babu Moger, Zhao Liu,
	Zhuocheng Ding

Hi Xiaoyao,

[snip]

> 
> I see.
> 
> Can we pull the patch into this series (define a common CPUTopoInfo in
> CPUState and drop env->nr_dies, cs->nr_cores and cs->nr_threads) and let the
> later hybrid series rename it to X86ApicidTopoInfo?
>

Yes, we can split these from the hybrid series.

But if I merged these into this series, it would make the v4 changes a
bit too large for subsequent reviews.

I could cook up another series to do these cleanups after this series.

Thanks,
Zhao



* Re: [PATCH v3 00/17] Support smp.clusters for x86
  2023-08-04 13:17   ` Zhao Liu
@ 2023-08-08 11:52     ` Jonathan Cameron via
  0 siblings, 0 replies; 63+ messages in thread
From: Jonathan Cameron via @ 2023-08-08 11:52 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Xiaoyao Li, Babu Moger,
	Zhao Liu

On Fri, 4 Aug 2023 21:17:59 +0800
Zhao Liu <zhao1.liu@linux.intel.com> wrote:

> Hi Jonathan,
> 
> On Tue, Aug 01, 2023 at 04:35:27PM +0100, Jonathan Cameron via wrote:
> > >   
> 
> [snip]
> 
> > > 
> > > ## New property: x-l2-cache-topo
> > > 
> > > The property l2-cache-topo will be used to change the L2 cache topology
> > > in CPUID.04H.
> > > 
> > > Now it allows user to set the L2 cache is shared in core level or
> > > cluster level.
> > > 
> > > If user passes "-cpu x-l2-cache-topo=[core|cluster]" then older L2 cache
> > > topology will be overrided by the new topology setting.
> > > 
> > > Since CPUID.04H is used by intel cpus, this property is available on
> > > intel cpus as for now.
> > > 
> > > When necessary, it can be extended to CPUID[0x8000001D] for amd cpus.  
> > 
> > Hi Zhao Liu,
> > 
> > As part of emulating arm's MPAM (cache partitioning controls) I needed
> > to add the missing cache description in the ACPI PPTT table. As such I ran
> > into a very similar problem to the one you are addressing.  
> 
> May I ask if the cache topology you need is symmetric or heterogeneous?
> 
> I had a discussion with Yanan [5] about heterogeneous caches. If you
> need a "symmetric" cache topology, maybe we could consider trying to
> make this x-l2-cache-topo more generic.

For now, I'm interested in symmetric, but heterogeneous is certain
to pop up at some point as people will build MPAM mobile systems.
Right now a lot of other work would be needed to emulate those well
in QEMU, given the cores are likely to be quite different.

> 
> But if you need a heterogeneous cache topo, e.g., some cores have its
> own l2 cache, and other cores share the same l2 cache, only this command
> is not enough.
> 
> Intel hybrid platforms have the above case I mentioned, we used "hybrid
> CPU topology" [6] + "x-l2-cache-topo=cluster" to solve this:
> 
> For example, AdlerLake has 2 types of core, one type is P-core with L2 per
> P-core, and another type is E-core that 4 E-cores share a L2.
> 
> So we set a CPU topology like this:
> 
> Set 2 kinds of clusters:
> * 1 P-core is in a cluster.
> * 4 E-cores in a cluster.
> 
> Then we use "x-l2-cache-topo" to make L2 shared at the cluster level. In
> this way, a P-core can own an L2 because its cluster has only 1 P-core,
> and 4 E-cores can share an L2.
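A symmetric instance of this scheme (all E-cores, so no hybrid-topology
patches needed) could look roughly like the command line below; only
"-smp clusters" and the "x-l2-cache-topo" property come from this series,
while the machine type and "-cpu host" are assumptions for the example:

```sh
# Illustrative sketch only: 8 vCPUs arranged as 2 clusters of 4 cores,
# with L2 shared per cluster (every 4 cores share one L2, as in an
# Alder Lake-P Atom module).
qemu-system-x86_64 -machine q35 -enable-kvm \
    -smp 8,sockets=1,dies=1,clusters=2,cores=4,threads=1 \
    -cpu host,x-l2-cache-topo=cluster
```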

That can work if you aren't using clusters to describe other elements of
the topology.  We originally added PPTT-based cluster support to Linux
to enable sharing of L3 tags (but not L3 data) among a cluster of
CPUs. So we'd need it for some of our platforms independent of this
particular aspect.  Sometimes the cluster will have associated caches,
sometimes it won't, and they will be at a different level.  PPTT represents
that cache topology well but is complex.

> 
> For a more general way to set cache topology, Yanan and I discussed 2
> ways ([7] [8]). [8] depends on the QOM CPU topology mechanism that I'm
> working on.
> 
> [5]: https://mail.gnu.org/archive/html/qemu-devel/2023-02/msg04795.html
> [6]: https://mail.gnu.org/archive/html/qemu-devel/2023-02/msg03205.html
> [7]: https://mail.gnu.org/archive/html/qemu-devel/2023-02/msg05139.html
> [8]: https://mail.gnu.org/archive/html/qemu-devel/2023-02/msg05167.html

Great. I'll have a read!

> 
> > 
> > I wonder if a more generic description is possible? We can rely on ordering
> > of the cache levels, so what I was planning to propose was the rather lengthy
> > but flexible (and with better names ;)
> > 
> > -smp 16,sockets=1,clusters=4,threads=2,cache-cluster-start-level=2,cache-node-start-level=3  
> 
> Could you explain more about this command?

I'll cc you on the RFC patch set in a few mins.

> I don't understand what "cache-cluster-start-level=2,cache-node-start-level=3" mean.

Assume a hierarchical cache with a maximum of N levels.
Cache levels 1 to (cache-cluster-start-level - 1) are private to a core (shared by its threads).
Cache levels cache-cluster-start-level to (cache-node-start-level - 1) are shared at the cluster level.
Cache levels cache-node-start-level to N are at the NUMA node level, which may or may not
be the physical package.

It very much assumes a non-heterogeneous topology though, which is what I don't like about this scheme.
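That mapping can be sketched as a small function; the parameter names
mirror the example -smp options above, but the function name and scope
labels are purely illustrative, not QEMU code:

```python
def cache_share_scope(level, cluster_start_level, node_start_level):
    """Map a numbered cache level (1..N) to its sharing scope."""
    if level < 1:
        raise ValueError("cache levels are numbered from 1")
    if level < cluster_start_level:
        return "core-private"      # shared only by a core's threads
    if level < node_start_level:
        return "cluster"           # shared at cluster level
    return "numa-node"             # shared at NUMA-node level

# cache-cluster-start-level=2, cache-node-start-level=3 from the
# example: L1 per core, L2 per cluster, L3 per NUMA node.
for lvl in (1, 2, 3):
    print(f"L{lvl}: {cache_share_scope(lvl, 2, 3)}")
```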

> 
> > 
> > Perhaps we can come up with a common scheme that covers both usecases?
> > It gets more fiddly to define if we have variable topology across different clusters
> > - and that was going to be an open question in the RFC proposing this - our current
> > definition of the more basic topology doesn't cover those cases anyway.
> > 
> > What I want:
> > 
> > 1) No restriction on maximum cache levels - ...  
> 
> Hmmm, if there's no cache name, it would be difficult to define on the CLI.

Define by level number rather than name.

> 
> > ... some systems have more than 3  
> 
> What about L4? A name can simplify a lot of issues.

True but if we can make it take a number as a parameter it extends to
any level.

> 
> > 2) Easy ability to express everything from all caches are private to all caches are shared.
> > Is 3 levels enough? (private, shared at cluster level, shared at a level above that) I think
> > so, but if not any scheme should be extensible to cover another level.  
> 
> It seems you may need the "heterogeneous" cache topology.
So far, nope. I need to be able to define things flexibly.

> 
> I think "private" and "shared" are not good definitions for caches, since
> they are not technical terms? (Correct me if I'm wrong.) And I/D cache,
> L1 cache, L2 cache are generic terms accepted by many architectures.
Totally parallel concepts.

L1, L2 etc. are just distances from the core, and even that gets fiddly
if they aren't inclusive.

Private just means the cache is not shared by multiple cores.
The way you define it above is to put them in clusters, but the cluster concept
means a bunch of other things that don't necessarily have anything to do with
caches (many other resources may be shared).

> 
> Though cache topology is different from CPU topology, it's true that the
> cache topology is related to the CPU hierarchy, so I think using the CPU
> topology hierarchy to define the heterogeneous topology looks like a more
> appropriate way to do it.

Agreed this needs to be built off the CPU topology (PPTT in ACPI is a good
general way to describe things as I've not yet met a system it can't describe
to some degree), but there can be levels of that topology there for different
purposes than describing the sharing of caches.

> 
> > 
> > Great if we can figure out a common scheme.  
> 
> Yeah, it's worth discussing.

Let me catch up with the discussions you linked above - perhaps the proposals
are generic enough for my cases as well.  The ARM world tends to have a lot
more varied topology than x86 so we often see corner cases.

Jonathan

> 
> Thanks,
> Zhao
> 
> > 
> > Jonathan
> >   
> > > 
> > > 
> > > # Patch description
> > > 
> > > patch 1-2 Cleanups about coding style and test name.
> > > 
> > > patch 3-4,15 Fixes about x86 topology, intel l1 cache topology and amd
> > >              cache topology encoding.
> > > 
> > > patch 5-6 Cleanups about topology related CPUID encoding and QEMU
> > >           topology variables.
> > > 
> > > patch 7-12 Add the module as the new CPU topology level in x86, and it
> > >            is corresponding to the cluster level in generic code.
> > > 
> > > patch 13,14,16 Add cache topology information in cache models.
> > > 
> > > patch 17 Introduce a new command to configure L2 cache topology.
> > > 
> > > 
> > > [1]: https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg07179.html
> > > [2]: https://patchew.org/QEMU/20211228092221.21068-1-wangyanan55@huawei.com/
> > > [3]: https://www.intel.com/content/www/us/en/products/platforms/details/alder-lake-p.html
> > > [4]: SDM, vol.3, ch.9, 9.9.1 Hierarchical Mapping of Shared Resources.
> > > 
> > > Best Regards,
> > > Zhao
> > > 
> > > ---
> > > Changelog:
> > > 
> > > Changes since v2:
> > >  * Add "Tested-by", "Reviewed-by" and "ACKed-by" tags.
> > >  * Use newly added wrapped helper to get cores per socket in
> > >    qemu_init_vcpu().
> > > 
> > > Changes since v1:
> > >  * Reordered patches. (Yanan)
> > >  * Deprecated the patch to fix comment of machine_parse_smp_config().
> > >    (Yanan)
> > >  * Rename test-x86-cpuid.c to test-x86-topo.c. (Yanan)
> > >  * Split the intel's l1 cache topology fix into a new separate patch.
> > >    (Yanan)
> > >  * Combined module_id and APIC ID for module level support into one
> > >    patch. (Yanan)
> > >  * Make cache_into_passthrough case of cpuid 0x04 leaf in
> > >  * cpu_x86_cpuid() use max_processor_ids_for_cache() and
> > >    max_core_ids_in_package() to encode CPUID[4]. (Yanan)
> > >  * Add the prefix "CPU_TOPO_LEVEL_*" for CPU topology level names.
> > >    (Yanan)
> > >  * Rename the "INVALID" level to "CPU_TOPO_LEVEL_UNKNOW". (Yanan)
> > > 
> > > ---
> > > Zhao Liu (10):
> > >   i386: Fix comment style in topology.h
> > >   tests: Rename test-x86-cpuid.c to test-x86-topo.c
> > >   i386/cpu: Fix i/d-cache topology to core level for Intel CPU
> > >   i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4]
> > >   i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
> > >   i386: Add cache topology info in CPUCacheInfo
> > >   i386: Use CPUCacheInfo.share_level to encode CPUID[4]
> > >   i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]
> > >   i386: Use CPUCacheInfo.share_level to encode
> > >     CPUID[0x8000001D].EAX[bits 25:14]
> > >   i386: Add new property to control L2 cache topo in CPUID.04H
> > > 
> > > Zhuocheng Ding (7):
> > >   softmmu: Fix CPUSTATE.nr_cores' calculation
> > >   i386: Introduce module-level cpu topology to CPUX86State
> > >   i386: Support modules_per_die in X86CPUTopoInfo
> > >   i386: Support module_id in X86CPUTopoIDs
> > >   i386/cpu: Introduce cluster-id to X86CPU
> > >   tests: Add test case of APIC ID for module level parsing
> > >   hw/i386/pc: Support smp.clusters for x86 PC machine
> > > 
> > >  MAINTAINERS                                   |   2 +-
> > >  hw/i386/pc.c                                  |   1 +
> > >  hw/i386/x86.c                                 |  49 +++++-
> > >  include/hw/core/cpu.h                         |   2 +-
> > >  include/hw/i386/topology.h                    |  68 +++++---
> > >  qemu-options.hx                               |  10 +-
> > >  softmmu/cpus.c                                |   2 +-
> > >  target/i386/cpu.c                             | 158 ++++++++++++++----
> > >  target/i386/cpu.h                             |  25 +++
> > >  tests/unit/meson.build                        |   4 +-
> > >  .../{test-x86-cpuid.c => test-x86-topo.c}     |  58 ++++---
> > >  11 files changed, 280 insertions(+), 99 deletions(-)
> > >  rename tests/unit/{test-x86-cpuid.c => test-x86-topo.c} (73%)
> > >   
> > 
> >   
> 



^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode CPUID[4]
  2023-08-04 15:48         ` Moger, Babu
@ 2023-08-14  8:22           ` Zhao Liu
  2023-08-14 16:03             ` Moger, Babu
  0 siblings, 1 reply; 63+ messages in thread
From: Zhao Liu @ 2023-08-14  8:22 UTC (permalink / raw)
  To: Moger, Babu
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Xiaoyao Li, Wei Wang,
	Zhao Liu

Hi Babu,

On Fri, Aug 04, 2023 at 10:48:29AM -0500, Moger, Babu wrote:
> Date: Fri, 4 Aug 2023 10:48:29 -0500
> From: "Moger, Babu" <babu.moger@amd.com>
> Subject: Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode
>  CPUID[4]
> 
> Hi Zhao,
> 
> On 8/4/23 04:48, Zhao Liu wrote:
> > Hi Babu,
> > 
> > On Thu, Aug 03, 2023 at 11:41:40AM -0500, Moger, Babu wrote:
> >> Date: Thu, 3 Aug 2023 11:41:40 -0500
> >> From: "Moger, Babu" <babu.moger@amd.com>
> >> Subject: Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode
> >>  CPUID[4]
> >>
> >> Hi Zhao,
> >>
> >> On 8/2/23 18:49, Moger, Babu wrote:
> >>> Hi Zhao,
> >>>
> >>> Hitting this error after this patch.
> >>>
> >>> ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code should
> >>> not be reached
> >>> Bail out! ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code
> >>> should not be reached
> >>> Aborted (core dumped)
> >>>
> >>> Looks like share_level for all the caches for AMD is not initialized.
> > 
> > I missed these changes when I rebased. Sorry for that.
> > 
> > BTW, could I ask a question? From a previous discussion [1], I understand
> > that the cache info is used to show the correct cache information in
> > the new machine. And from [2], wrong cache info may cause "compatibility
> > issues".
> > 
> > Is this "compatibility issue" AMD-specific? I'm not sure if Intel should
> > update the cache info like that. Thanks!
> 
> I was going to comment about that. Good that you asked me.
> 
> Compatibility is a QEMU requirement. Otherwise migrations will fail.
> 
> Any change in the topology is going to cause migration problems.

Could you please educate me more about the details of the "migration
problem"?

I didn't understand why it was causing the problem and wasn't sure if I
was missing any cases.

Thanks,
Zhao

> 
> I am not sure how you are going to handle this. You can probably look at
> the feature "x-intel-pt-auto-level".
> 
> Make sure to test the migration.
> 
> Thanks
> Babu
> 
> 
> > 
> > [1]: https://patchwork.kernel.org/project/kvm/patch/CY4PR12MB1768A3CBE42AAFB03CB1081E95AA0@CY4PR12MB1768.namprd12.prod.outlook.com/
> > [2]: https://lore.kernel.org/qemu-devel/20180510204148.11687-1-babu.moger@amd.com/
> > 
> >>
> >> The following patch fixes the problem.
> >>
> >> ======================================================
> >>
> >>
> >> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> >> index f4c48e19fa..976a2755d8 100644
> >> --- a/target/i386/cpu.c
> >> +++ b/target/i386/cpu.c
> >> @@ -528,6 +528,7 @@ static CPUCacheInfo legacy_l2_cache_cpuid2 = {
> >>      .size = 2 * MiB,
> >>      .line_size = 64,
> >>      .associativity = 8,
> >> +    .share_level = CPU_TOPO_LEVEL_CORE,
> > 
> > This "legacy_l2_cache_cpuid2" is not used to encode cache topology.
> > I should explicitly set this default topo level as CPU_TOPO_LEVEL_UNKNOW.
> > 
> >>  };
> >>
> >>
> >> @@ -1904,6 +1905,7 @@ static CPUCaches epyc_v4_cache_info = {
> >>          .lines_per_tag = 1,
> >>          .self_init = 1,
> >>          .no_invd_sharing = true,
> >> +        .share_level = CPU_TOPO_LEVEL_CORE,
> >>      },
> >>      .l1i_cache = &(CPUCacheInfo) {
> >>          .type = INSTRUCTION_CACHE,
> >> @@ -1916,6 +1918,7 @@ static CPUCaches epyc_v4_cache_info = {
> >>          .lines_per_tag = 1,
> >>          .self_init = 1,
> >>          .no_invd_sharing = true,
> >> +        .share_level = CPU_TOPO_LEVEL_CORE,
> >>      },
> >>      .l2_cache = &(CPUCacheInfo) {
> >>          .type = UNIFIED_CACHE,
> >> @@ -1926,6 +1929,7 @@ static CPUCaches epyc_v4_cache_info = {
> >>          .partitions = 1,
> >>          .sets = 1024,
> >>          .lines_per_tag = 1,
> >> +        .share_level = CPU_TOPO_LEVEL_CORE,
> >>      },
> >>      .l3_cache = &(CPUCacheInfo) {
> >>          .type = UNIFIED_CACHE,
> >> @@ -1939,6 +1943,7 @@ static CPUCaches epyc_v4_cache_info = {
> >>          .self_init = true,
> >>          .inclusive = true,
> >>          .complex_indexing = false,
> >> +        .share_level = CPU_TOPO_LEVEL_DIE,
> >>      },
> >>  };
> >>
> >> @@ -2008,6 +2013,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
> >>          .lines_per_tag = 1,
> >>          .self_init = 1,
> >>          .no_invd_sharing = true,
> >> +        .share_level = CPU_TOPO_LEVEL_CORE,
> >>      },
> >>      .l1i_cache = &(CPUCacheInfo) {
> >>          .type = INSTRUCTION_CACHE,
> >> @@ -2020,6 +2026,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
> >>          .lines_per_tag = 1,
> >>          .self_init = 1,
> >>          .no_invd_sharing = true,
> >> +        .share_level = CPU_TOPO_LEVEL_CORE,
> >>      },
> >>      .l2_cache = &(CPUCacheInfo) {
> >>          .type = UNIFIED_CACHE,
> >> @@ -2030,6 +2037,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
> >>          .partitions = 1,
> >>          .sets = 1024,
> >>          .lines_per_tag = 1,
> >> +        .share_level = CPU_TOPO_LEVEL_CORE,
> >>      },
> >>      .l3_cache = &(CPUCacheInfo) {
> >>          .type = UNIFIED_CACHE,
> >> @@ -2043,6 +2051,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
> >>          .self_init = true,
> >>          .inclusive = true,
> >>          .complex_indexing = false,
> >> +        .share_level = CPU_TOPO_LEVEL_DIE,
> >>      },
> >>  };
> >>
> >> @@ -2112,6 +2121,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
> >>          .lines_per_tag = 1,
> >>          .self_init = 1,
> >>          .no_invd_sharing = true,
> >> +        .share_level = CPU_TOPO_LEVEL_CORE,
> >>      },
> >>      .l1i_cache = &(CPUCacheInfo) {
> >>          .type = INSTRUCTION_CACHE,
> >> @@ -2124,6 +2134,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
> >>          .lines_per_tag = 1,
> >>          .self_init = 1,
> >>          .no_invd_sharing = true,
> >> +        .share_level = CPU_TOPO_LEVEL_CORE,
> >>      },
> >>      .l2_cache = &(CPUCacheInfo) {
> >>          .type = UNIFIED_CACHE,
> >> @@ -2134,6 +2145,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
> >>          .partitions = 1,
> >>          .sets = 1024,
> >>          .lines_per_tag = 1,
> >> +        .share_level = CPU_TOPO_LEVEL_CORE,
> >>      },
> >>      .l3_cache = &(CPUCacheInfo) {
> >>          .type = UNIFIED_CACHE,
> >> @@ -2147,6 +2159,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
> >>          .self_init = true,
> >>          .inclusive = true,
> >>          .complex_indexing = false,
> >> +        .share_level = CPU_TOPO_LEVEL_DIE,
> >>      },
> >>  };
> >>
> >> @@ -2162,6 +2175,7 @@ static const CPUCaches epyc_genoa_cache_info = {
> >>          .lines_per_tag = 1,
> >>          .self_init = 1,
> >>          .no_invd_sharing = true,
> >> +        .share_level = CPU_TOPO_LEVEL_CORE,
> >>      },
> >>      .l1i_cache = &(CPUCacheInfo) {
> >>          .type = INSTRUCTION_CACHE,
> >> @@ -2174,6 +2188,7 @@ static const CPUCaches epyc_genoa_cache_info = {
> >>          .lines_per_tag = 1,
> >>          .self_init = 1,
> >>          .no_invd_sharing = true,
> >> +        .share_level = CPU_TOPO_LEVEL_CORE,
> >>      },
> >>      .l2_cache = &(CPUCacheInfo) {
> >>          .type = UNIFIED_CACHE,
> >> @@ -2184,6 +2199,7 @@ static const CPUCaches epyc_genoa_cache_info = {
> >>          .partitions = 1,
> >>          .sets = 2048,
> >>          .lines_per_tag = 1,
> >> +        .share_level = CPU_TOPO_LEVEL_CORE,
> >>      },
> >>      .l3_cache = &(CPUCacheInfo) {
> >>          .type = UNIFIED_CACHE,
> >> @@ -2197,6 +2213,7 @@ static const CPUCaches epyc_genoa_cache_info = {
> >>          .self_init = true,
> >>          .inclusive = true,
> >>          .complex_indexing = false,
> >> +        .share_level = CPU_TOPO_LEVEL_DIE,
> >>      },
> >>  };
> >>
> >>
> >> =========================================================================
> > 
> > 
> > Looks good to me except legacy_l2_cache_cpuid2, thanks very much!
> > I'll add this in the next version.
> > 
> > -Zhao
> > 
> >>
> >> Thanks
> >> Babu
> >>>
> >>> On 8/1/23 05:35, Zhao Liu wrote:
> >>>> From: Zhao Liu <zhao1.liu@intel.com>
> >>>>
> >>>> CPUID[4].EAX[bits 25:14] is used to represent the cache topology for
> >>>> Intel CPUs.
> >>>>
> >>>> After cache models have topology information, we can use
> >>>> CPUCacheInfo.share_level to decide which topology level to be encoded
> >>>> into CPUID[4].EAX[bits 25:14].
> >>>>
> >>>> And since maximum_processor_id (originally "num_apic_ids") is parsed
> >>>> based on CPU topology levels, which are verified when parsing smp,
> >>>> there's no need to check this value with "assert(num_apic_ids > 0)"
> >>>> again, so remove this assert.
> >>>>
> >>>> Additionally, wrap the encoding of CPUID[4].EAX[bits 31:26] into a
> >>>> helper to make the code cleaner.
> >>>>
> >>>> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> >>>> ---
> >>>> Changes since v1:
> >>>>  * Use "enum CPUTopoLevel share_level" as the parameter in
> >>>>    max_processor_ids_for_cache().
> >>>>  * Make cache_into_passthrough case also use
> >>>>    max_processor_ids_for_cache() and max_core_ids_in_package() to
> >>>>    encode CPUID[4]. (Yanan)
> >>>>  * Rename the title of this patch (the original is "i386: Use
> >>>>    CPUCacheInfo.share_level to encode CPUID[4].EAX[bits 25:14]").
> >>>> ---
> >>>>  target/i386/cpu.c | 70 +++++++++++++++++++++++++++++------------------
> >>>>  1 file changed, 43 insertions(+), 27 deletions(-)
> >>>>
> >>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> >>>> index 55aba4889628..c9897c0fe91a 100644
> >>>> --- a/target/i386/cpu.c
> >>>> +++ b/target/i386/cpu.c
> >>>> @@ -234,22 +234,53 @@ static uint8_t cpuid2_cache_descriptor(CPUCacheInfo *cache)
> >>>>                         ((t) == UNIFIED_CACHE) ? CACHE_TYPE_UNIFIED : \
> >>>>                         0 /* Invalid value */)
> >>>>  
> >>>> +static uint32_t max_processor_ids_for_cache(X86CPUTopoInfo *topo_info,
> >>>> +                                            enum CPUTopoLevel share_level)
> >>>> +{
> >>>> +    uint32_t num_ids = 0;
> >>>> +
> >>>> +    switch (share_level) {
> >>>> +    case CPU_TOPO_LEVEL_CORE:
> >>>> +        num_ids = 1 << apicid_core_offset(topo_info);
> >>>> +        break;
> >>>> +    case CPU_TOPO_LEVEL_DIE:
> >>>> +        num_ids = 1 << apicid_die_offset(topo_info);
> >>>> +        break;
> >>>> +    case CPU_TOPO_LEVEL_PACKAGE:
> >>>> +        num_ids = 1 << apicid_pkg_offset(topo_info);
> >>>> +        break;
> >>>> +    default:
> >>>> +        /*
> >>>> +         * Currently there is no use case for SMT and MODULE, so use
> >>>> +         * assert directly to facilitate debugging.
> >>>> +         */
> >>>> +        g_assert_not_reached();
> >>>> +    }
> >>>> +
> >>>> +    return num_ids - 1;
> >>>> +}
> >>>> +
> >>>> +static uint32_t max_core_ids_in_package(X86CPUTopoInfo *topo_info)
> >>>> +{
> >>>> +    uint32_t num_cores = 1 << (apicid_pkg_offset(topo_info) -
> >>>> +                               apicid_core_offset(topo_info));
> >>>> +    return num_cores - 1;
> >>>> +}
> >>>>  
> >>>>  /* Encode cache info for CPUID[4] */
> >>>>  static void encode_cache_cpuid4(CPUCacheInfo *cache,
> >>>> -                                int num_apic_ids, int num_cores,
> >>>> +                                X86CPUTopoInfo *topo_info,
> >>>>                                  uint32_t *eax, uint32_t *ebx,
> >>>>                                  uint32_t *ecx, uint32_t *edx)
> >>>>  {
> >>>>      assert(cache->size == cache->line_size * cache->associativity *
> >>>>                            cache->partitions * cache->sets);
> >>>>  
> >>>> -    assert(num_apic_ids > 0);
> >>>>      *eax = CACHE_TYPE(cache->type) |
> >>>>             CACHE_LEVEL(cache->level) |
> >>>>             (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0) |
> >>>> -           ((num_cores - 1) << 26) |
> >>>> -           ((num_apic_ids - 1) << 14);
> >>>> +           (max_core_ids_in_package(topo_info) << 26) |
> >>>> +           (max_processor_ids_for_cache(topo_info, cache->share_level) << 14);
> >>>>  
> >>>>      assert(cache->line_size > 0);
> >>>>      assert(cache->partitions > 0);
> >>>> @@ -6116,56 +6147,41 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >>>>                  int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
> >>>>  
> >>>>                  if (cores_per_pkg > 1) {
> >>>> -                    int addressable_cores_offset =
> >>>> -                                                apicid_pkg_offset(&topo_info) -
> >>>> -                                                apicid_core_offset(&topo_info);
> >>>> -
> >>>>                      *eax &= ~0xFC000000;
> >>>> -                    *eax |= (1 << addressable_cores_offset - 1) << 26;
> >>>> +                    *eax |= max_core_ids_in_package(&topo_info) << 26;
> >>>>                  }
> >>>>                  if (host_vcpus_per_cache > cpus_per_pkg) {
> >>>> -                    int pkg_offset = apicid_pkg_offset(&topo_info);
> >>>> -
> >>>>                      *eax &= ~0x3FFC000;
> >>>> -                    *eax |= (1 << pkg_offset - 1) << 14;
> >>>> +                    *eax |=
> >>>> +                        max_processor_ids_for_cache(&topo_info,
> >>>> +                                                CPU_TOPO_LEVEL_PACKAGE) << 14;
> >>>>                  }
> >>>>              }
> >>>>          } else if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
> >>>>              *eax = *ebx = *ecx = *edx = 0;
> >>>>          } else {
> >>>>              *eax = 0;
> >>>> -            int addressable_cores_offset = apicid_pkg_offset(&topo_info) -
> >>>> -                                           apicid_core_offset(&topo_info);
> >>>> -            int core_offset, die_offset;
> >>>>  
> >>>>              switch (count) {
> >>>>              case 0: /* L1 dcache info */
> >>>> -                core_offset = apicid_core_offset(&topo_info);
> >>>>                  encode_cache_cpuid4(env->cache_info_cpuid4.l1d_cache,
> >>>> -                                    (1 << core_offset),
> >>>> -                                    (1 << addressable_cores_offset),
> >>>> +                                    &topo_info,
> >>>>                                      eax, ebx, ecx, edx);
> >>>>                  break;
> >>>>              case 1: /* L1 icache info */
> >>>> -                core_offset = apicid_core_offset(&topo_info);
> >>>>                  encode_cache_cpuid4(env->cache_info_cpuid4.l1i_cache,
> >>>> -                                    (1 << core_offset),
> >>>> -                                    (1 << addressable_cores_offset),
> >>>> +                                    &topo_info,
> >>>>                                      eax, ebx, ecx, edx);
> >>>>                  break;
> >>>>              case 2: /* L2 cache info */
> >>>> -                core_offset = apicid_core_offset(&topo_info);
> >>>>                  encode_cache_cpuid4(env->cache_info_cpuid4.l2_cache,
> >>>> -                                    (1 << core_offset),
> >>>> -                                    (1 << addressable_cores_offset),
> >>>> +                                    &topo_info,
> >>>>                                      eax, ebx, ecx, edx);
> >>>>                  break;
> >>>>              case 3: /* L3 cache info */
> >>>> -                die_offset = apicid_die_offset(&topo_info);
> >>>>                  if (cpu->enable_l3_cache) {
> >>>>                      encode_cache_cpuid4(env->cache_info_cpuid4.l3_cache,
> >>>> -                                        (1 << die_offset),
> >>>> -                                        (1 << addressable_cores_offset),
> >>>> +                                        &topo_info,
> >>>>                                          eax, ebx, ecx, edx);
> >>>>                      break;
> >>>>                  }
> >>>
> >>
> >> -- 
> >> Thanks
> >> Babu Moger
> 
> -- 
> Thanks
> Babu Moger


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode CPUID[4]
  2023-08-14  8:22           ` Zhao Liu
@ 2023-08-14 16:03             ` Moger, Babu
  2023-08-18  7:37               ` Zhao Liu
  0 siblings, 1 reply; 63+ messages in thread
From: Moger, Babu @ 2023-08-14 16:03 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Xiaoyao Li, Wei Wang,
	Zhao Liu

Hi Zhao,


On 8/14/23 03:22, Zhao Liu wrote:
> Hi Babu,
> 
> On Fri, Aug 04, 2023 at 10:48:29AM -0500, Moger, Babu wrote:
>> Date: Fri, 4 Aug 2023 10:48:29 -0500
>> From: "Moger, Babu" <babu.moger@amd.com>
>> Subject: Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode
>>  CPUID[4]
>>
>> Hi Zhao,
>>
>> On 8/4/23 04:48, Zhao Liu wrote:
>>> Hi Babu,
>>>
>>> On Thu, Aug 03, 2023 at 11:41:40AM -0500, Moger, Babu wrote:
>>>> Date: Thu, 3 Aug 2023 11:41:40 -0500
>>>> From: "Moger, Babu" <babu.moger@amd.com>
>>>> Subject: Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode
>>>>  CPUID[4]
>>>>
>>>> Hi Zhao,
>>>>
>>>> On 8/2/23 18:49, Moger, Babu wrote:
>>>>> Hi Zhao,
>>>>>
>>>>> Hitting this error after this patch.
>>>>>
>>>>> ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code should
>>>>> not be reached
>>>>> Bail out! ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code
>>>>> should not be reached
>>>>> Aborted (core dumped)
>>>>>
>>>>> Looks like share_level for all the caches for AMD is not initialized.
>>>
>>> I missed these changes when I rebased. Sorry for that.
>>>
>>> BTW, could I ask a question? From a previous discussion[1], I understand
>>> that the cache info is used to show the correct cache information in
>>> new machine. And from [2], the wrong cache info may cause "compatibility
>>> issues".
>>>
>>> Is this "compatibility issues" AMD specific? I'm not sure if Intel should
>>> update the cache info like that. thanks!
>>
>> I was going to comment about that. Good that you asked me.
>>
>> Compatibility is a QEMU requirement. Otherwise migrations will fail.
>>
>> Any change in the topology is going to cause migration problems.
> 
> Could you please educate me more about the details of the "migration
> problem"?
> 
> I didn't understand why it was causing the problem and wasn't sure if I
> was missing any cases.
> 

I am not an expert on migration but I test VM migration sometimes.
Here are some guidelines.
https://developers.redhat.com/blog/2015/03/24/live-migrating-qemu-kvm-virtual-machines

When you migrate a VM to a newer QEMU using the same CPU type, the
migration should work seamlessly. That means the list of CPU features
should be compatible when you move to a newer QEMU version with the
same CPU type.

Thanks
Babu



^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode CPUID[4]
  2023-08-14 16:03             ` Moger, Babu
@ 2023-08-18  7:37               ` Zhao Liu
  2023-08-23 17:18                 ` Moger, Babu
  0 siblings, 1 reply; 63+ messages in thread
From: Zhao Liu @ 2023-08-18  7:37 UTC (permalink / raw)
  To: Moger, Babu
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Xiaoyao Li, Wei Wang,
	Yongwei Ma, Zhao Liu

Hi Babu,

On Mon, Aug 14, 2023 at 11:03:53AM -0500, Moger, Babu wrote:
> Date: Mon, 14 Aug 2023 11:03:53 -0500
> From: "Moger, Babu" <babu.moger@amd.com>
> Subject: Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode
>  CPUID[4]
> 
> Hi Zhao,
> 
> 
> On 8/14/23 03:22, Zhao Liu wrote:
> > Hi Babu,
> > 
> > On Fri, Aug 04, 2023 at 10:48:29AM -0500, Moger, Babu wrote:
> >> Date: Fri, 4 Aug 2023 10:48:29 -0500
> >> From: "Moger, Babu" <babu.moger@amd.com>
> >> Subject: Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode
> >>  CPUID[4]
> >>
> >> Hi Zhao,
> >>
> >> On 8/4/23 04:48, Zhao Liu wrote:
> >>> Hi Babu,
> >>>
> >>> On Thu, Aug 03, 2023 at 11:41:40AM -0500, Moger, Babu wrote:
> >>>> Date: Thu, 3 Aug 2023 11:41:40 -0500
> >>>> From: "Moger, Babu" <babu.moger@amd.com>
> >>>> Subject: Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode
> >>>>  CPUID[4]
> >>>>
> >>>> Hi Zhao,
> >>>>
> >>>> On 8/2/23 18:49, Moger, Babu wrote:
> >>>>> Hi Zhao,
> >>>>>
> >>>>> Hitting this error after this patch.
> >>>>>
> >>>>> ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code should
> >>>>> not be reached
> >>>>> Bail out! ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code
> >>>>> should not be reached
> >>>>> Aborted (core dumped)
> >>>>>
> >>>>> Looks like share_level for all the caches for AMD is not initialized.
> >>>
> >>> I missed these changes when I rebased. Sorry for that.
> >>>
> >>> BTW, could I ask a question? From a previous discussion[1], I understand
> >>> that the cache info is used to show the correct cache information in
> >>> new machine. And from [2], the wrong cache info may cause "compatibility
> >>> issues".
> >>>
> >>> Is this "compatibility issues" AMD specific? I'm not sure if Intel should
> >>> update the cache info like that. thanks!
> >>
> >> I was going to comment about that. Good that you asked me.
> >>
> >> Compatibility is a QEMU requirement. Otherwise migrations will fail.
> >>
> >> Any change in the topology is going to cause migration problems.
> > 
> > Could you please educate me more about the details of the "migration
> > problem"?
> > 
> > I didn't understand why it was causing the problem and wasn't sure if I
> > was missing any cases.
> > 
> 
> I am not an expert on migration but I test VM migration sometimes.
> Here are some guidelines.
> https://developers.redhat.com/blog/2015/03/24/live-migrating-qemu-kvm-virtual-machines

Thanks for the material!

> 
> When you migrate a VM to a newer QEMU using the same CPU type, the
> migration should work seamlessly. That means the list of CPU features
> should be compatible when you move to a newer QEMU version with the
> same CPU type.

I see. This patch set adds the "clusters" option of "-smp" and the
"x-l2-cache-topo" property. Migration requires that the target and
source VM command lines are the same, so the new options keep the
migration consistent.

But this patch set also includes some topology fixes (the nr_cores fix
and the L1 cache topology fix) and an encoding change (using the APIC ID
offset to encode addressable IDs); these changes would affect migration
and may change the CPUID values seen by the VM. Thus, if this patch set
is accepted, these changes also need to be pushed into the stable
versions. Do you agree?
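As a side note on that encoding change, a rough sketch of how the
APIC-ID-offset-based value for CPUID[4].EAX[bits 25:14] is formed,
mirroring the series' max_processor_ids_for_cache(); the helper names
and per-level counts here are illustrative, not QEMU code:

```python
import math

def level_offset(count):
    # Bits needed to address `count` IDs at one topology level
    # (0 for a count of 1), like QEMU's apicid_*_offset() helpers.
    return math.ceil(math.log2(count)) if count > 1 else 0

def max_ids_sharing_cache(threads_per_core, cores_sharing_cache):
    # CPUID[4].EAX[bits 25:14]: maximum number of addressable IDs
    # sharing this cache, minus 1 -- i.e. (1 << offset) - 1, where the
    # offset accumulates all levels below the cache's share level.
    offset = level_offset(threads_per_core) + level_offset(cores_sharing_cache)
    return (1 << offset) - 1

# 4 single-threaded E-cores sharing one L2 -> field value 3
print(max_ids_sharing_cache(1, 4))
# 2 SMT threads, cache private to the core -> field value 1
print(max_ids_sharing_cache(2, 1))
```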

And about cache info for different CPU generations, migration usually
happens on the same CPU type, and Intel uses the same default cache
info for all CPU types. With the consistent cache info, migration is
also Ok. So if we don't care about the specific cache info in the VM,
it's okay to use the same default cache info for all CPU types. Right?

Thanks,
Zhao


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode CPUID[4]
  2023-08-18  7:37               ` Zhao Liu
@ 2023-08-23 17:18                 ` Moger, Babu
  2023-09-01  8:43                   ` Zhao Liu
  0 siblings, 1 reply; 63+ messages in thread
From: Moger, Babu @ 2023-08-23 17:18 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Xiaoyao Li, Wei Wang,
	Yongwei Ma, Zhao Liu

Hi Zhao,

On 8/18/23 02:37, Zhao Liu wrote:
> Hi Babu,
> 
> On Mon, Aug 14, 2023 at 11:03:53AM -0500, Moger, Babu wrote:
>> Date: Mon, 14 Aug 2023 11:03:53 -0500
>> From: "Moger, Babu" <babu.moger@amd.com>
>> Subject: Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode
>>  CPUID[4]
>>
>> Hi Zhao,
>>
>>
>> On 8/14/23 03:22, Zhao Liu wrote:
>>> Hi Babu,
>>>
>>> On Fri, Aug 04, 2023 at 10:48:29AM -0500, Moger, Babu wrote:
>>>> Date: Fri, 4 Aug 2023 10:48:29 -0500
>>>> From: "Moger, Babu" <babu.moger@amd.com>
>>>> Subject: Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode
>>>>  CPUID[4]
>>>>
>>>> Hi Zhao,
>>>>
>>>> On 8/4/23 04:48, Zhao Liu wrote:
>>>>> Hi Babu,
>>>>>
>>>>> On Thu, Aug 03, 2023 at 11:41:40AM -0500, Moger, Babu wrote:
>>>>>> Date: Thu, 3 Aug 2023 11:41:40 -0500
>>>>>> From: "Moger, Babu" <babu.moger@amd.com>
>>>>>> Subject: Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode
>>>>>>  CPUID[4]
>>>>>>
>>>>>> Hi Zhao,
>>>>>>
>>>>>> On 8/2/23 18:49, Moger, Babu wrote:
>>>>>>> Hi Zhao,
>>>>>>>
>>>>>>> Hitting this error after this patch.
>>>>>>>
>>>>>>> ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code should
>>>>>>> not be reached
>>>>>>> Bail out! ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code
>>>>>>> should not be reached
>>>>>>> Aborted (core dumped)
>>>>>>>
>>>>>>> Looks like share_level for all the caches for AMD is not initialized.
>>>>>
>>>>> I missed these changes when I rebased. Sorry for that.
>>>>>
>>>>> BTW, could I ask a question? From a previous discussion[1], I understand
>>>>> that the cache info is used to show the correct cache information in
>>>>> new machine. And from [2], the wrong cache info may cause "compatibility
>>>>> issues".
>>>>>
>>>>> Is this "compatibility issues" AMD specific? I'm not sure if Intel should
>>>>> update the cache info like that. thanks!
>>>>
>>>> I was going to comment about that. Good that you asked me.
>>>>
>>>> Compatibility is a QEMU requirement. Otherwise, migrations will fail.
>>>>
>>>> Any change in the topology is going to cause migration problems.
>>>
>>> Could you please educate me more about the details of the "migration
>>> problem"?
>>>
>>> I didn't understand why it was causing the problem and wasn't sure if I
>>> was missing any cases.
>>>
>>
>> I am not an expert on migration but I test VM migration sometimes.
>> Here are some guidelines.
>> https://developers.redhat.com/blog/2015/03/24/live-migrating-qemu-kvm-virtual-machines
> 
> Thanks for the material!
> 
>>
>> When you migrate a VM to a newer QEMU using the same CPU type, migration
>> should work seamlessly. That means the list of CPU features should be
>> compatible when you move to a newer QEMU version with the same CPU type.
> 
> > I see. This patch set adds the "-smp clusters" option and the
> > "x-l2-cache-topo" property. Migration requires that the target and

Shouldn't the x-l2-cache-topo property be disabled by default? (For
example, look at the x-migrate-smi-count property in hw/i386/pc.c.)

It should be enabled only when the user passes
"-cpu x-l2-cache-topo=[core|cluster]". The current code enables it by
default as far as I can see.

> > source VM command lines are the same, so the new options ensure that
> > the migration is consistent.
> 
> > But this patch set also includes some topology fixes (the nr_cores fix and
> > the l1 cache topology fix) and an encoding change (using the APIC ID offset
> > to encode addressable IDs). These changes would affect migration and may
> > change the CPUID values seen by the guest. Thus, if this patch set is
> > accepted, these changes also need to be backported to stable versions. Do
> > you agree?

Yes. That sounds right.

> 
> And about cache info for different CPU generations, migration usually
> happens on the same CPU type, and Intel uses the same default cache
> info for all CPU types. With the consistent cache info, migration is
> also Ok. So if we don't care about the specific cache info in the VM,
> it's okay to use the same default cache info for all CPU types. Right?

I am not sure about this. Please run migration tests to be sure.
-- 
Thanks
Babu Moger


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode CPUID[4]
  2023-08-23 17:18                 ` Moger, Babu
@ 2023-09-01  8:43                   ` Zhao Liu
  0 siblings, 0 replies; 63+ messages in thread
From: Zhao Liu @ 2023-09-01  8:43 UTC (permalink / raw)
  To: Moger, Babu
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, qemu-devel, Zhenyu Wang, Xiaoyao Li, Wei Wang,
	Yongwei Ma, Zhao Liu

Hi Babu,

On Wed, Aug 23, 2023 at 12:18:30PM -0500, Moger, Babu wrote:
> Date: Wed, 23 Aug 2023 12:18:30 -0500
> From: "Moger, Babu" <babu.moger@amd.com>
> Subject: Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode
>  CPUID[4]
> 
> Hi Zhao,
> 
> On 8/18/23 02:37, Zhao Liu wrote:
> > Hi Babu,
> > 
> > On Mon, Aug 14, 2023 at 11:03:53AM -0500, Moger, Babu wrote:
> >> Date: Mon, 14 Aug 2023 11:03:53 -0500
> >> From: "Moger, Babu" <babu.moger@amd.com>
> >> Subject: Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode
> >>  CPUID[4]
> >>
> >> Hi Zhao,
> >>
> >>
> >> On 8/14/23 03:22, Zhao Liu wrote:
> >>> Hi Babu,
> >>>
> >>> On Fri, Aug 04, 2023 at 10:48:29AM -0500, Moger, Babu wrote:
> >>>> Date: Fri, 4 Aug 2023 10:48:29 -0500
> >>>> From: "Moger, Babu" <babu.moger@amd.com>
> >>>> Subject: Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode
> >>>>  CPUID[4]
> >>>>
> >>>> Hi Zhao,
> >>>>
> >>>> On 8/4/23 04:48, Zhao Liu wrote:
> >>>>> Hi Babu,
> >>>>>
> >>>>> On Thu, Aug 03, 2023 at 11:41:40AM -0500, Moger, Babu wrote:
> >>>>>> Date: Thu, 3 Aug 2023 11:41:40 -0500
> >>>>>> From: "Moger, Babu" <babu.moger@amd.com>
> >>>>>> Subject: Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode
> >>>>>>  CPUID[4]
> >>>>>>
> >>>>>> Hi Zhao,
> >>>>>>
> >>>>>> On 8/2/23 18:49, Moger, Babu wrote:
> >>>>>>> Hi Zhao,
> >>>>>>>
> >>>>>>> Hitting this error after this patch.
> >>>>>>>
> >>>>>>> ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code should
> >>>>>>> not be reached
> >>>>>>> Bail out! ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code
> >>>>>>> should not be reached
> >>>>>>> Aborted (core dumped)
> >>>>>>>
> >>>>>>> Looks like share_level for all the caches for AMD is not initialized.
> >>>>>
> >>>>> I missed these changes when I rebased. Sorry for that.
> >>>>>
> >>>>> BTW, could I ask a question? From a previous discussion[1], I understand
> >>>>> that the cache info is used to show the correct cache information in
> >>>>> new machine. And from [2], the wrong cache info may cause "compatibility
> >>>>> issues".
> >>>>>
> >>>>> Is this "compatibility issues" AMD specific? I'm not sure if Intel should
> >>>>> update the cache info like that. thanks!
> >>>>
> >>>> I was going to comment about that. Good that you asked me.
> >>>>
> >>>> Compatibility is a QEMU requirement. Otherwise, migrations will fail.
> >>>>
> >>>> Any change in the topology is going to cause migration problems.
> >>>
> >>> Could you please educate me more about the details of the "migration
> >>> problem"?
> >>>
> >>> I didn't understand why it was causing the problem and wasn't sure if I
> >>> was missing any cases.
> >>>
> >>
> >> I am not an expert on migration but I test VM migration sometimes.
> >> Here are some guidelines.
> >> https://developers.redhat.com/blog/2015/03/24/live-migrating-qemu-kvm-virtual-machines
> > 
> > Thanks for the material!
> > 
> >>
> >> When you migrate a VM to a newer QEMU using the same CPU type, migration
> >> should work seamlessly. That means the list of CPU features should be
> >> compatible when you move to a newer QEMU version with the same CPU type.
> > 
> > I see. This patch set adds the "-smp clusters" option and the
> > "x-l2-cache-topo" property. Migration requires that the target and
> 
> Shouldn't the x-l2-cache-topo property be disabled by default? (For
> example, look at the x-migrate-smi-count property in hw/i386/pc.c.)

Thanks!

Since we add the default topology level to the cache models, the default
L2 topology is the level hardcoded in the cache model.

From this point of view, this option won't affect migration between
different QEMU versions. If the user doesn't change the L2 topology to
cluster, the default L2 topology level is the same (core level).

> 
> It should be enabled only when the user passes
> "-cpu x-l2-cache-topo=[core|cluster]". The current code enables it by
> default as far as I can see.

I think the compatibility issue for x-migrate-smi-count arises because it
has different default settings in different QEMU versions.

And x-l2-cache-topo defaults to the level hardcoded in the cache model,
so there is no difference between new and old QEMU versions.

> 
> > source VM command lines are the same, so the new options ensure that
> > the migration is consistent.
> > 
> > But this patch set also includes some topology fixes (the nr_cores fix and
> > the l1 cache topology fix) and an encoding change (using the APIC ID offset
> > to encode addressable IDs). These changes would affect migration and may
> > change the CPUID values seen by the guest. Thus, if this patch set is
> > accepted, these changes also need to be backported to stable versions. Do
> > you agree?
> 
> Yes. That sounds right.
> 
> > 
> > And about cache info for different CPU generations, migration usually
> > happens on the same CPU type, and Intel uses the same default cache
> > info for all CPU types. With the consistent cache info, migration is
> > also Ok. So if we don't care about the specific cache info in the VM,
> > it's okay to use the same default cache info for all CPU types. Right?
> 
> I am not sure about this. Please run migration tests to be sure.

We tested these cases:

1. v3 <-> v3: cases with the same CLI (same x-l2-cache-topo setting)
   succeed.

2. v3 <-> master base (no v3 patches): cases with the same CLI, or v3 with
   the default level (as hardcoded in the cache models), succeed.

Thanks,
Zhao


^ permalink raw reply	[flat|nested] 63+ messages in thread

end of thread, other threads:[~2023-09-01  8:33 UTC | newest]

Thread overview: 63+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-01 10:35 [PATCH v3 00/17] Support smp.clusters for x86 Zhao Liu
2023-08-01 10:35 ` [PATCH v3 01/17] i386: Fix comment style in topology.h Zhao Liu
2023-08-01 23:13   ` Moger, Babu
2023-08-04  8:12     ` Zhao Liu
2023-08-07  2:16   ` Xiaoyao Li
2023-08-07  7:05     ` Zhao Liu
2023-08-01 10:35 ` [PATCH v3 02/17] tests: Rename test-x86-cpuid.c to test-x86-topo.c Zhao Liu
2023-08-01 23:20   ` Moger, Babu
2023-08-04  8:14     ` Zhao Liu
2023-08-01 10:35 ` [PATCH v3 03/17] softmmu: Fix CPUSTATE.nr_cores' calculation Zhao Liu
2023-08-02 15:25   ` Moger, Babu
2023-08-04  8:16     ` Zhao Liu
2023-08-07  7:03   ` Xiaoyao Li
2023-08-07  7:53     ` Zhao Liu
2023-08-07  8:43       ` Xiaoyao Li
2023-08-07 10:00         ` Zhao Liu
2023-08-07 14:20           ` Xiaoyao Li
2023-08-07 14:42             ` Zhao Liu
2023-08-01 10:35 ` [PATCH v3 04/17] i386/cpu: Fix i/d-cache topology to core level for Intel CPU Zhao Liu
2023-08-04  9:56   ` Xiaoyao Li
2023-08-04 12:43     ` Zhao Liu
2023-08-01 10:35 ` [PATCH v3 05/17] i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4] Zhao Liu
2023-08-02 15:41   ` Moger, Babu
2023-08-04  8:21     ` Zhao Liu
2023-08-07  8:13   ` Xiaoyao Li
2023-08-07  9:30     ` Zhao Liu
2023-08-01 10:35 ` [PATCH v3 06/17] i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid() Zhao Liu
2023-08-02 16:31   ` Moger, Babu
2023-08-04  8:23     ` Zhao Liu
2023-08-01 10:35 ` [PATCH v3 07/17] i386: Introduce module-level cpu topology to CPUX86State Zhao Liu
2023-08-01 10:35 ` [PATCH v3 08/17] i386: Support modules_per_die in X86CPUTopoInfo Zhao Liu
2023-08-02 17:25   ` Moger, Babu
2023-08-04  9:05     ` Zhao Liu
2023-08-01 10:35 ` [PATCH v3 09/17] i386: Support module_id in X86CPUTopoIDs Zhao Liu
2023-08-01 10:35 ` [PATCH v3 10/17] i386/cpu: Introduce cluster-id to X86CPU Zhao Liu
2023-08-02 22:44   ` Moger, Babu
2023-08-04  9:06     ` Zhao Liu
2023-08-01 10:35 ` [PATCH v3 11/17] tests: Add test case of APIC ID for module level parsing Zhao Liu
2023-08-01 10:35 ` [PATCH v3 12/17] hw/i386/pc: Support smp.clusters for x86 PC machine Zhao Liu
2023-08-01 10:35 ` [PATCH v3 13/17] i386: Add cache topology info in CPUCacheInfo Zhao Liu
2023-08-01 10:35 ` [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode CPUID[4] Zhao Liu
2023-08-02 23:49   ` Moger, Babu
2023-08-03 16:41     ` Moger, Babu
2023-08-04  9:48       ` Zhao Liu
2023-08-04 15:48         ` Moger, Babu
2023-08-14  8:22           ` Zhao Liu
2023-08-14 16:03             ` Moger, Babu
2023-08-18  7:37               ` Zhao Liu
2023-08-23 17:18                 ` Moger, Babu
2023-09-01  8:43                   ` Zhao Liu
2023-08-01 10:35 ` [PATCH v3 15/17] i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14] Zhao Liu
2023-08-03 20:40   ` Moger, Babu
2023-08-04  9:50     ` Zhao Liu
2023-08-01 10:35 ` [PATCH v3 16/17] i386: Use CPUCacheInfo.share_level to encode " Zhao Liu
2023-08-03 20:44   ` Moger, Babu
2023-08-04  9:56     ` Zhao Liu
2023-08-04 18:50       ` Moger, Babu
2023-08-01 10:35 ` [PATCH v3 17/17] i386: Add new property to control L2 cache topo in CPUID.04H Zhao Liu
2023-08-01 15:35 ` [PATCH v3 00/17] Support smp.clusters for x86 Jonathan Cameron via
2023-08-04 13:17   ` Zhao Liu
2023-08-08 11:52     ` Jonathan Cameron via
2023-08-01 23:11 ` Moger, Babu
2023-08-04  7:44   ` Zhao Liu
