* [PATCH v4 00/21] Support smp.clusters for x86 in QEMU
@ 2023-09-14  7:21 Zhao Liu
  2023-09-14  7:21 ` [PATCH v4 01/21] i386: Fix comment style in topology.h Zhao Liu
                   ` (22 more replies)
  0 siblings, 23 replies; 43+ messages in thread
From: Zhao Liu @ 2023-09-14  7:21 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

Hi list,

(CC kvm@vger.kernel.org for better browsing.)

This is our v4 patch series, rebased on the master branch at commit
9ef497755afc2 ("Merge tag 'pull-vfio-20230911' of
https://github.com/legoater/qemu into staging").

Compared with v3 [1], v4 mainly refactors the CPUID[0x1F] encoding and
exposes the module level in CPUID[0x1F] with these new patches:

* [PATCH v4 08/21] i386: Split topology types of CPUID[0x1F] from the
definitions of CPUID[0xB]
* [PATCH v4 09/21] i386: Decouple CPUID[0x1F] subleaf with specific
topology level
* [PATCH v4 12/21] i386: Expose module level in CPUID[0x1F]

v4 also fixes compile warnings and fixes bugs where the cache topology
was left uninitialized for some AMD CPUs.

Your comments are welcome!


# Introduction

This series adds cluster support for the x86 PC machine, which allows
x86 to use smp.clusters to configure the module-level CPU topology.

And due to a compatibility issue (see the section "## Why not share L2
cache in cluster directly"), this series also introduces a new property
to adjust the topology of the x86 L2 cache.


# Background

The "clusters" parameter in "smp" is introduced by ARM [2], but x86
hasn't supported it.

At present, x86 defaults to the L2 cache being shared by one core, but
this is not enough. There are platforms where multiple cores share the
same L2 cache, e.g., Alder Lake-P shares an L2 cache per module of
Atom cores [3], that is, every four Atom cores share one L2 cache.
Therefore, we need the new CPU topology level (cluster/module).

Another reason is hybrid architecture: cluster support not only
provides another level of topology definition for x86, but also
provides the code changes required for our future hybrid topology
support.


# Overview

## Introduction of module level for x86

"cluster" in smp is the CPU topology level which is between "core" and
die.

For x86, the "cluster" in smp is corresponding to the module level [4],
which is above the core level. So use the "module" other than "cluster"
in x86 code.

Please note that x86 already has a CPU topology level also named
"cluster" [4]; that level is above the package level. The cluster in
the x86 CPU topology is completely different from the "clusters" smp
parameter. After the module level is introduced, the "clusters" smp
parameter will actually refer to the module level of x86.
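
For example (an illustrative mapping, not a command taken from the
patches): with "-smp 16,sockets=1,dies=2,clusters=2,cores=2,threads=2",
the 2 clusters per die would be encoded as 2 modules per die in the
x86 CPU topology.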


## Why not share L2 cache in cluster directly

Though "clusters" was introduced to help define L2 cache topology
[2], using cluster to define x86's L2 cache topology will cause the
compatibility problem:

Currently, x86 defaults that the L2 cache is shared in one core, which
actually implies a default setting "cores per L2 cache is 1" and
therefore implicitly defaults to having as many L2 caches as cores.

For example (i386 PC machine):
-smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16 (*)

Considering the topology of the L2 cache, this (*) implicitly means "1
core per L2 cache" and "2 L2 caches per die".

If we used cluster to configure the L2 cache topology with a new
default setting of "1 cluster per L2 cache", the above semantics would
change to "2 cores per cluster" and "1 cluster per L2 cache", that is,
"2 cores per L2 cache".

So the same command (*) would cause changes in the L2 cache topology,
further affecting the performance of the virtual machine.

Therefore, x86 should treat cluster only as a CPU topology level and,
for compatibility, avoid using it to change the L2 cache topology by
default.


## module level in CPUID

The Linux kernel (from v6.4, with commit edc0a2b595765 ("x86/topology:
Fix erroneous smp_num_siblings on Intel Hybrid platforms")) is able to
handle platforms with the module level enumerated via CPUID.1F.

The module level is exposed in CPUID[0x1F] (for Intel CPUs) if the
machine has more than 1 module (a change since v3).

We can configure CPUID.04H.02H (the L2 cache topology) with the module
level via a new property:

        "-cpu,x-l2-cache-topo=cluster"

For more information about this property, please see the section
"## New property: x-l2-cache-topo".


## New cache topology info in CPUCacheInfo

Currently, by default, the cache topology is encoded as:
1. i/d cache is shared in one core.
2. L2 cache is shared in one core.
3. L3 cache is shared in one die.

This general default has caused a misunderstanding, namely that the
cache topology is completely equated with a specific CPU topology,
such as the connection between the L2 cache and the core level, and
the connection between the L3 cache and the die level.

In fact, these topology settings depend on the specific platform and
are not static. For example, on Alder Lake-P, every four Atom cores
share the same L2 cache [3].

Thus, in this patch set, we explicitly define the corresponding cache
topology for different CPU models, which has two benefits:
1. It is easy to extend to future CPU models that have a different
   cache topology.
2. It can easily support custom cache topologies via a property (e.g.,
   x-l2-cache-topo); see the sketch below.
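
A minimal sketch of the idea (share_level is the field added by this
series; the other fields are abbreviated from the existing
CPUCacheInfo in target/i386/cpu.h):

    typedef struct CPUCacheInfo {
        CacheType type;
        uint8_t level;
        /* ... size, associativity and other existing fields ... */

        /*
         * The CPU topology level at which this cache is shared,
         * e.g., CPU_TOPO_LEVEL_CORE for a per-core cache.
         */
        enum CPUTopoLevel share_level;
    } CPUCacheInfo;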


## New property: x-l2-cache-topo

The property x-l2-cache-topo will be used to change the L2 cache
topology in CPUID.04H.

Currently, it allows the user to set whether the L2 cache is shared at
the core level or at the cluster level.

If the user passes "-cpu x-l2-cache-topo=[core|cluster]", the default
L2 cache topology will be overridden by the new topology setting.

Since CPUID.04H is used by Intel CPUs, this property is only available
for Intel CPUs for now.

When necessary, it can be extended to CPUID[0x8000001D] for AMD CPUs.
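
A hedged usage sketch (the CPU model and the rest of the command line
are placeholders, not taken from the patches):

    qemu-system-x86_64 ... \
        -smp 16,sockets=1,dies=1,clusters=2,cores=4,threads=2 \
        -cpu host,x-l2-cache-topo=cluster

With such a configuration, every 4 cores (one cluster/module) would
share one L2 cache instead of the default of one L2 cache per core.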


# Patch description

patch 1-2 Cleanups about coding style and test name.

patch 3-5 Fixes for x86 topology calculation and Intel L1 cache
          topology.

patch 6-7 Cleanups about topology-related CPUID encoding and QEMU
          topology variables.

patch 8-9 Refactor CPUID[0x1F] encoding to prepare to introduce module
          level.

patch 10-16 Add the module as the new CPU topology level in x86; it
            corresponds to the cluster level in the generic code.

patch 17,18,20 Add cache topology information in cache models.

patch 19 Update AMD CPUs' cache topology encoding.

patch 21 Introduce a new command to configure L2 cache topology.


[1]: https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg00022.html
[2]: https://patchew.org/QEMU/20211228092221.21068-1-wangyanan55@huawei.com/
[3]: https://www.intel.com/content/www/us/en/products/platforms/details/alder-lake-p.html
[4]: SDM, vol.3, ch.9, 9.9.1 Hierarchical Mapping of Shared Resources.

Best Regards,
Zhao
---
Changelog:

Changes since v3 (main changes):
 * Expose module level in CPUID[0x1F].
 * Fix compile warnings. (Babu)
 * Fix uninitialized cache topology for some AMD CPUs. (Babu)

Changes since v2:
 * Add "Tested-by", "Reviewed-by" and "ACKed-by" tags.
 * Use newly added wrapped helper to get cores per socket in
   qemu_init_vcpu().

Changes since v1:
 * Reordered patches. (Yanan)
 * Deprecated the patch to fix comment of machine_parse_smp_config().
   (Yanan)
 * Rename test-x86-cpuid.c to test-x86-topo.c. (Yanan)
 * Split the intel's l1 cache topology fix into a new separate patch.
   (Yanan)
 * Combined module_id and APIC ID for module level support into one
   patch. (Yanan)
 * Make the cache_info_passthrough case of the CPUID 0x04 leaf in
   cpu_x86_cpuid() use max_processor_ids_for_cache() and
   max_core_ids_in_package() to encode CPUID[4]. (Yanan)
 * Add the prefix "CPU_TOPO_LEVEL_*" for CPU topology level names.
   (Yanan)
---
Zhao Liu (14):
  i386: Fix comment style in topology.h
  tests: Rename test-x86-cpuid.c to test-x86-topo.c
  hw/cpu: Update the comments of nr_cores and nr_dies
  i386/cpu: Fix i/d-cache topology to core level for Intel CPU
  i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4]
  i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
  i386: Split topology types of CPUID[0x1F] from the definitions of
    CPUID[0xB]
  i386: Decouple CPUID[0x1F] subleaf with specific topology level
  i386: Expose module level in CPUID[0x1F]
  i386: Add cache topology info in CPUCacheInfo
  i386: Use CPUCacheInfo.share_level to encode CPUID[4]
  i386: Use offsets get NumSharingCache for CPUID[0x8000001D].EAX[bits
    25:14]
  i386: Use CPUCacheInfo.share_level to encode
    CPUID[0x8000001D].EAX[bits 25:14]
  i386: Add new property to control L2 cache topo in CPUID.04H

Zhuocheng Ding (7):
  softmmu: Fix CPUSTATE.nr_cores' calculation
  i386: Introduce module-level cpu topology to CPUX86State
  i386: Support modules_per_die in X86CPUTopoInfo
  i386: Support module_id in X86CPUTopoIDs
  i386/cpu: Introduce cluster-id to X86CPU
  tests: Add test case of APIC ID for module level parsing
  hw/i386/pc: Support smp.clusters for x86 PC machine

 MAINTAINERS                                   |   2 +-
 hw/i386/pc.c                                  |   1 +
 hw/i386/x86.c                                 |  49 ++-
 include/hw/core/cpu.h                         |   2 +-
 include/hw/i386/topology.h                    |  68 ++--
 qemu-options.hx                               |  10 +-
 softmmu/cpus.c                                |   2 +-
 target/i386/cpu.c                             | 322 ++++++++++++++----
 target/i386/cpu.h                             |  46 ++-
 target/i386/kvm/kvm.c                         |   2 +-
 tests/unit/meson.build                        |   4 +-
 .../{test-x86-cpuid.c => test-x86-topo.c}     |  58 ++--
 12 files changed, 437 insertions(+), 129 deletions(-)
 rename tests/unit/{test-x86-cpuid.c => test-x86-topo.c} (73%)

-- 
2.34.1



* [PATCH v4 01/21] i386: Fix comment style in topology.h
  2023-09-14  7:21 [PATCH v4 00/21] Support smp.clusters for x86 in QEMU Zhao Liu
@ 2023-09-14  7:21 ` Zhao Liu
  2023-09-22 16:05   ` Moger, Babu
  2023-09-14  7:21 ` [PATCH v4 02/21] tests: Rename test-x86-cpuid.c to test-x86-topo.c Zhao Liu
                   ` (21 subsequent siblings)
  22 siblings, 1 reply; 43+ messages in thread
From: Zhao Liu @ 2023-09-14  7:21 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu,
	Xiaoyao Li

From: Zhao Liu <zhao1.liu@intel.com>

For function comments in this file, keep the comment style consistent
with other files in the directory.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@Intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
---
Changes since v3:
 * Optimized the description in commit message: Change "with other
   places" to "with other files in the directory". (Babu)
---
 include/hw/i386/topology.h | 33 +++++++++++++++++----------------
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index 81573f6cfde0..5a19679f618b 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -24,7 +24,8 @@
 #ifndef HW_I386_TOPOLOGY_H
 #define HW_I386_TOPOLOGY_H
 
-/* This file implements the APIC-ID-based CPU topology enumeration logic,
+/*
+ * This file implements the APIC-ID-based CPU topology enumeration logic,
  * documented at the following document:
  *   Intel® 64 Architecture Processor Topology Enumeration
  *   http://software.intel.com/en-us/articles/intel-64-architecture-processor-topology-enumeration/
@@ -41,7 +42,8 @@
 
 #include "qemu/bitops.h"
 
-/* APIC IDs can be 32-bit, but beware: APIC IDs > 255 require x2APIC support
+/*
+ * APIC IDs can be 32-bit, but beware: APIC IDs > 255 require x2APIC support
  */
 typedef uint32_t apic_id_t;
 
@@ -58,8 +60,7 @@ typedef struct X86CPUTopoInfo {
     unsigned threads_per_core;
 } X86CPUTopoInfo;
 
-/* Return the bit width needed for 'count' IDs
- */
+/* Return the bit width needed for 'count' IDs */
 static unsigned apicid_bitwidth_for_count(unsigned count)
 {
     g_assert(count >= 1);
@@ -67,15 +68,13 @@ static unsigned apicid_bitwidth_for_count(unsigned count)
     return count ? 32 - clz32(count) : 0;
 }
 
-/* Bit width of the SMT_ID (thread ID) field on the APIC ID
- */
+/* Bit width of the SMT_ID (thread ID) field on the APIC ID */
 static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
 {
     return apicid_bitwidth_for_count(topo_info->threads_per_core);
 }
 
-/* Bit width of the Core_ID field
- */
+/* Bit width of the Core_ID field */
 static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
 {
     return apicid_bitwidth_for_count(topo_info->cores_per_die);
@@ -87,8 +86,7 @@ static inline unsigned apicid_die_width(X86CPUTopoInfo *topo_info)
     return apicid_bitwidth_for_count(topo_info->dies_per_pkg);
 }
 
-/* Bit offset of the Core_ID field
- */
+/* Bit offset of the Core_ID field */
 static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
 {
     return apicid_smt_width(topo_info);
@@ -100,14 +98,14 @@ static inline unsigned apicid_die_offset(X86CPUTopoInfo *topo_info)
     return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
 }
 
-/* Bit offset of the Pkg_ID (socket ID) field
- */
+/* Bit offset of the Pkg_ID (socket ID) field */
 static inline unsigned apicid_pkg_offset(X86CPUTopoInfo *topo_info)
 {
     return apicid_die_offset(topo_info) + apicid_die_width(topo_info);
 }
 
-/* Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
+/*
+ * Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
  *
  * The caller must make sure core_id < nr_cores and smt_id < nr_threads.
  */
@@ -120,7 +118,8 @@ static inline apic_id_t x86_apicid_from_topo_ids(X86CPUTopoInfo *topo_info,
            topo_ids->smt_id;
 }
 
-/* Calculate thread/core/package IDs for a specific topology,
+/*
+ * Calculate thread/core/package IDs for a specific topology,
  * based on (contiguous) CPU index
  */
 static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
@@ -137,7 +136,8 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
     topo_ids->smt_id = cpu_index % nr_threads;
 }
 
-/* Calculate thread/core/package IDs for a specific topology,
+/*
+ * Calculate thread/core/package IDs for a specific topology,
  * based on APIC ID
  */
 static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
@@ -155,7 +155,8 @@ static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
     topo_ids->pkg_id = apicid >> apicid_pkg_offset(topo_info);
 }
 
-/* Make APIC ID for the CPU 'cpu_index'
+/*
+ * Make APIC ID for the CPU 'cpu_index'
  *
  * 'cpu_index' is a sequential, contiguous ID for the CPU.
  */
-- 
2.34.1



* [PATCH v4 02/21] tests: Rename test-x86-cpuid.c to test-x86-topo.c
  2023-09-14  7:21 [PATCH v4 00/21] Support smp.clusters for x86 in QEMU Zhao Liu
  2023-09-14  7:21 ` [PATCH v4 01/21] i386: Fix comment style in topology.h Zhao Liu
@ 2023-09-14  7:21 ` Zhao Liu
  2023-09-14  7:21 ` [PATCH v4 03/21] softmmu: Fix CPUSTATE.nr_cores' calculation Zhao Liu
                   ` (20 subsequent siblings)
  22 siblings, 0 replies; 43+ messages in thread
From: Zhao Liu @ 2023-09-14  7:21 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu,
	Yongwei Ma

From: Zhao Liu <zhao1.liu@intel.com>

The tests in this file actually test the APIC ID combinations.
Rename to test-x86-topo.c to make its name more in line with its
actual content.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
---
Changes since v3:
 * Modify the description in commit message to emphasize this file tests
   APIC ID combinations. (Babu)

Changes since v1:
 * Rename test-x86-apicid.c to test-x86-topo.c. (Yanan)
---
 MAINTAINERS                                      | 2 +-
 tests/unit/meson.build                           | 4 ++--
 tests/unit/{test-x86-cpuid.c => test-x86-topo.c} | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)
 rename tests/unit/{test-x86-cpuid.c => test-x86-topo.c} (99%)

diff --git a/MAINTAINERS b/MAINTAINERS
index 00562f924f7a..eaaa041804b2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1721,7 +1721,7 @@ F: include/hw/southbridge/ich9.h
 F: include/hw/southbridge/piix.h
 F: hw/isa/apm.c
 F: include/hw/isa/apm.h
-F: tests/unit/test-x86-cpuid.c
+F: tests/unit/test-x86-topo.c
 F: tests/qtest/test-x86-cpuid-compat.c
 
 PC Chipset
diff --git a/tests/unit/meson.build b/tests/unit/meson.build
index 0299ef6906cc..bb6f8177341d 100644
--- a/tests/unit/meson.build
+++ b/tests/unit/meson.build
@@ -21,8 +21,8 @@ tests = {
   'test-opts-visitor': [testqapi],
   'test-visitor-serialization': [testqapi],
   'test-bitmap': [],
-  # all code tested by test-x86-cpuid is inside topology.h
-  'test-x86-cpuid': [],
+  # all code tested by test-x86-topo is inside topology.h
+  'test-x86-topo': [],
   'test-cutils': [],
   'test-div128': [],
   'test-shift128': [],
diff --git a/tests/unit/test-x86-cpuid.c b/tests/unit/test-x86-topo.c
similarity index 99%
rename from tests/unit/test-x86-cpuid.c
rename to tests/unit/test-x86-topo.c
index bfabc0403a1a..2b104f86d7c2 100644
--- a/tests/unit/test-x86-cpuid.c
+++ b/tests/unit/test-x86-topo.c
@@ -1,5 +1,5 @@
 /*
- *  Test code for x86 CPUID and Topology functions
+ *  Test code for x86 APIC ID and Topology functions
  *
  *  Copyright (c) 2012 Red Hat Inc.
  *
-- 
2.34.1



* [PATCH v4 03/21] softmmu: Fix CPUSTATE.nr_cores' calculation
  2023-09-14  7:21 [PATCH v4 00/21] Support smp.clusters for x86 in QEMU Zhao Liu
  2023-09-14  7:21 ` [PATCH v4 01/21] i386: Fix comment style in topology.h Zhao Liu
  2023-09-14  7:21 ` [PATCH v4 02/21] tests: Rename test-x86-cpuid.c to test-x86-topo.c Zhao Liu
@ 2023-09-14  7:21 ` Zhao Liu
  2023-09-14  7:31   ` Philippe Mathieu-Daudé
  2023-09-14  7:21 ` [PATCH v4 04/21] hw/cpu: Update the comments of nr_cores and nr_dies Zhao Liu
                   ` (19 subsequent siblings)
  22 siblings, 1 reply; 43+ messages in thread
From: Zhao Liu @ 2023-09-14  7:21 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu,
	Zhuocheng Ding

From: Zhuocheng Ding <zhuocheng.ding@intel.com>

From CPUState.nr_cores' comment, it represents "number of cores within
this CPU package".

After 003f230e37d7 ("machine: Tweak the order of topology members in
struct CpuTopology"), the meaning of smp.cores changed to "the number
of cores in one die", but that commit missed updating
CPUState.nr_cores' calculation, so CPUState.nr_cores became wrong: it
no longer accounts for the numbers of clusters and dies.

At present, only i386 is using CPUState.nr_cores.

But for i386, which supports the die level, the uses of
CPUState.nr_cores are very confusing:

Early uses are based on the meaning of "cores per package" (before die
was introduced into i386), and later uses are based on "cores per die"
(after die's introduction).

This difference is because commit a94e1428991f ("target/i386: Add
CPUID.1F generation support for multi-dies PCMachine") misunderstood
CPUState.nr_cores as "cores per die" when calculating
CPUID.1FH.01H:EBX. After that, the changes in i386 all followed this
wrong understanding.

With the influence of 003f230e37d7 and a94e1428991f, the current
result of CPUState.nr_cores for i386 is "cores per die", thus the
original uses of CPUState.nr_cores based on the meaning of "cores per
package" are wrong when multiple dies exist:
1. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.01H:EBX[bits 23:16] is
   incorrect because it expects "cpus per package" but now the
   result is "cpus per die".
2. In cpu_x86_cpuid() of target/i386/cpu.c, for all leaves of CPUID.04H:
   EAX[bits 31:26] is incorrect because they expect "cpus per package"
   but now the result is "cpus per die". The error not only impacts the
   EAX calculation in cache_info_passthrough case, but also impacts other
   cases of setting cache topology for Intel CPU according to cpu
   topology (specifically, the incoming parameter "num_cores" expects
   "cores per package" in encode_cache_cpuid4()).
3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.0BH.01H:EBX[bits
   15:00] is incorrect because the EBX of 0BH.01H (core level) expects
   "cpus per package", which may differ from 1FH.01H (the reason is
   that 1FH can support more levels; for QEMU, 1FH also supports die,
   and 1FH.01H:EBX[bits 15:00] expects "cpus per die").
4. In cpu_x86_cpuid() of target/i386/cpu.c, when CPUID.80000001H is
   calculated, here "cpus per package" is expected to be checked, but in
   fact, now it checks "cpus per die". Though "cpus per die" also works
   for this code logic, this isn't consistent with AMD's APM.
5. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.80000008H:ECX expects
   "cpus per package" but it obtains "cpus per die".
6. In simulate_rdmsr() of target/i386/hvf/x86_emu.c, in
   kvm_rdmsr_core_thread_count() of target/i386/kvm/kvm.c, and in
   helper_rdmsr() of target/i386/tcg/sysemu/misc_helper.c,
   MSR_CORE_THREAD_COUNT expects "cpus per package" and "cores per
   package", but in these functions, it obtains "cpus per die" and
   "cores per die".

On the other hand, these uses are correct now (they were added in or
after a94e1428991f):
1. In cpu_x86_cpuid() of target/i386/cpu.c, topo_info.cores_per_die
   meets the actual meaning of CPUState.nr_cores ("cores per die").
2. In cpu_x86_cpuid() of target/i386/cpu.c, vcpus_per_socket (in CPUID.
   04H's calculation) considers number of dies, so it's correct.
3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.1FH.01H:EBX[bits
   15:00] needs "cpus per die" and it gets the correct result, and
   CPUID.1FH.02H:EBX[bits 15:00] gets correct "cpus per package".

When CPUState.nr_cores is correctly changed back to "cores per
package", the above errors will be fixed without extra work, but the
currently correct cases will go wrong and will need special handling
to get the correct "cpus/cores per die" they want.

Fix CPUState.nr_cores' calculation to fit the original meaning "cores
per package", and adjust the calculations of topo_info.cores_per_die,
vcpus_per_socket and CPUID.1FH accordingly.
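
For example, with "-smp 16,sockets=2,dies=2,cores=2,threads=2",
smp.cores is 2 ("cores per die"), so CPUState.nr_cores was also set to
2, while the correct "cores per package" value is 2 dies * 2 cores = 4.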

Fixes: a94e1428991f ("target/i386: Add CPUID.1F generation support for multi-dies PCMachine")
Fixes: 003f230e37d7 ("machine: Tweak the order of topology members in struct CpuTopology")
Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes since v3:
 * Describe changes in imperative mood. (Babu)
 * Fix spelling typo. (Babu)
 * Split the comment change into a separate patch. (Xiaoyao)

Changes since v2:
 * Use wrapped helper to get cores per socket in qemu_init_vcpu().

Changes since v1:
 * Add comment for nr_dies in CPUX86State. (Yanan)
---
 softmmu/cpus.c    | 2 +-
 target/i386/cpu.c | 9 ++++-----
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/softmmu/cpus.c b/softmmu/cpus.c
index 0848e0dbdb3f..fa8239c217ff 100644
--- a/softmmu/cpus.c
+++ b/softmmu/cpus.c
@@ -624,7 +624,7 @@ void qemu_init_vcpu(CPUState *cpu)
 {
     MachineState *ms = MACHINE(qdev_get_machine());
 
-    cpu->nr_cores = ms->smp.cores;
+    cpu->nr_cores = machine_topo_get_cores_per_socket(ms);
     cpu->nr_threads =  ms->smp.threads;
     cpu->stopped = true;
     cpu->random_seed = qemu_guest_random_seed_thread_part1();
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 24ee67b42d05..709c055c8468 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6015,7 +6015,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
     X86CPUTopoInfo topo_info;
 
     topo_info.dies_per_pkg = env->nr_dies;
-    topo_info.cores_per_die = cs->nr_cores;
+    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
     topo_info.threads_per_core = cs->nr_threads;
 
     /* Calculate & apply limits for different index ranges */
@@ -6091,8 +6091,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
              */
             if (*eax & 31) {
                 int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
-                int vcpus_per_socket = env->nr_dies * cs->nr_cores *
-                                       cs->nr_threads;
+                int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
                 if (cs->nr_cores > 1) {
                     *eax &= ~0xFC000000;
                     *eax |= (pow2ceil(cs->nr_cores) - 1) << 26;
@@ -6270,12 +6269,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             break;
         case 1:
             *eax = apicid_die_offset(&topo_info);
-            *ebx = cs->nr_cores * cs->nr_threads;
+            *ebx = topo_info.cores_per_die * topo_info.threads_per_core;
             *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
             break;
         case 2:
             *eax = apicid_pkg_offset(&topo_info);
-            *ebx = env->nr_dies * cs->nr_cores * cs->nr_threads;
+            *ebx = cs->nr_cores * cs->nr_threads;
             *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
             break;
         default:
-- 
2.34.1



* [PATCH v4 04/21] hw/cpu: Update the comments of nr_cores and nr_dies
  2023-09-14  7:21 [PATCH v4 00/21] Support smp.clusters for x86 in QEMU Zhao Liu
                   ` (2 preceding siblings ...)
  2023-09-14  7:21 ` [PATCH v4 03/21] softmmu: Fix CPUSTATE.nr_cores' calculation Zhao Liu
@ 2023-09-14  7:21 ` Zhao Liu
  2023-09-14  7:32   ` Philippe Mathieu-Daudé
  2023-09-14  7:21 ` [PATCH v4 05/21] i386/cpu: Fix i/d-cache topology to core level for Intel CPU Zhao Liu
                   ` (18 subsequent siblings)
  22 siblings, 1 reply; 43+ messages in thread
From: Zhao Liu @ 2023-09-14  7:21 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

In nr_threads' comment, specify that it represents the number of
threads in the "core" to avoid confusion.

Also add comment for nr_dies in CPUX86State.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes since v3:
 * The new patch split out of CPUSTATE.nr_cores' fix. (Xiaoyao)
---
 include/hw/core/cpu.h | 2 +-
 target/i386/cpu.h     | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 92a4234439a3..df908b23c692 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -277,7 +277,7 @@ struct qemu_work_item;
  *   See TranslationBlock::TCG CF_CLUSTER_MASK.
  * @tcg_cflags: Pre-computed cflags for this cpu.
  * @nr_cores: Number of cores within this CPU package.
- * @nr_threads: Number of threads within this CPU.
+ * @nr_threads: Number of threads within this CPU core.
  * @running: #true if CPU is currently running (lockless).
  * @has_waiter: #true if a CPU is currently waiting for the cpu_exec_end;
  * valid under cpu_list_lock.
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index fbb05eace57e..70eb3bc23eb8 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1881,6 +1881,7 @@ typedef struct CPUArchState {
 
     TPRAccess tpr_access_type;
 
+    /* Number of dies within this CPU package. */
     unsigned nr_dies;
 } CPUX86State;
 
-- 
2.34.1



* [PATCH v4 05/21] i386/cpu: Fix i/d-cache topology to core level for Intel CPU
  2023-09-14  7:21 [PATCH v4 00/21] Support smp.clusters for x86 in QEMU Zhao Liu
                   ` (3 preceding siblings ...)
  2023-09-14  7:21 ` [PATCH v4 04/21] hw/cpu: Update the comments of nr_cores and nr_dies Zhao Liu
@ 2023-09-14  7:21 ` Zhao Liu
  2023-09-14  7:21 ` [PATCH v4 06/21] i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4] Zhao Liu
                   ` (17 subsequent siblings)
  22 siblings, 0 replies; 43+ messages in thread
From: Zhao Liu @ 2023-09-14  7:21 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu,
	Robert Hoo

From: Zhao Liu <zhao1.liu@intel.com>

For the i-cache and d-cache, current QEMU hardcodes the maximum IDs
for CPUs sharing the cache (CPUID.04H.00H:EAX[bits 25:14] and
CPUID.04H.01H:EAX[bits 25:14]) to 0, which means the i-cache and
d-cache are shared at the SMT level.

This is correct if there's a single thread per core, but is wrong for
the hyper-threading case (one core contains multiple threads), since
the i-cache and d-cache are shared at the core level rather than the
SMT level.

For AMD CPUs, commit 8f4202fb1080 ("i386: Populate AMD Processor Cache
Information for cpuid 0x8000001D") has already introduced the i/d
cache topology at the core level by default.

Therefore, in order to be compatible with both multi-threaded and
single-threaded situations, we should make the i-cache and d-cache
shared at the core level by default.

This fix changes the default i/d cache topology from per-thread to
per-core. Potentially, this change in L1 cache topology may affect the
performance of the VM if the user does not explicitly specify the
topology or bind the vCPUs. However, the way to achieve optimal
performance should be to create a reasonable topology and set the
appropriate vCPU affinity without relying on QEMU's default topology
structure.
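
For example, with 2 threads per core, CPUID.04H.00H:EAX[bits 25:14]
and CPUID.04H.01H:EAX[bits 25:14] become 1 (i.e., 2 logical processors
share the L1 d/i-cache) instead of 0.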

Fixes: 7e3482f82480 ("i386: Helpers to encode cache information consistently")
Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
Changes since v3:
 * Change the description of current i/d cache encoding status to avoid
   misleading to "architectural rules". (Xiaoyao)

Changes since v1:
 * Split this fix from the patch named "i386/cpu: Fix number of
   addressable IDs in CPUID.04H".
 * Add the explanation of the impact on performance. (Xiaoyao)
---
 target/i386/cpu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 709c055c8468..c5c2a045e032 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6108,12 +6108,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             switch (count) {
             case 0: /* L1 dcache info */
                 encode_cache_cpuid4(env->cache_info_cpuid4.l1d_cache,
-                                    1, cs->nr_cores,
+                                    cs->nr_threads, cs->nr_cores,
                                     eax, ebx, ecx, edx);
                 break;
             case 1: /* L1 icache info */
                 encode_cache_cpuid4(env->cache_info_cpuid4.l1i_cache,
-                                    1, cs->nr_cores,
+                                    cs->nr_threads, cs->nr_cores,
                                     eax, ebx, ecx, edx);
                 break;
             case 2: /* L2 cache info */
-- 
2.34.1



* [PATCH v4 06/21] i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4]
  2023-09-14  7:21 [PATCH v4 00/21] Support smp.clusters for x86 in QEMU Zhao Liu
                   ` (4 preceding siblings ...)
  2023-09-14  7:21 ` [PATCH v4 05/21] i386/cpu: Fix i/d-cache topology to core level for Intel CPU Zhao Liu
@ 2023-09-14  7:21 ` Zhao Liu
  2023-09-14  7:21 ` [PATCH v4 07/21] i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid() Zhao Liu
                   ` (16 subsequent siblings)
  22 siblings, 0 replies; 43+ messages in thread
From: Zhao Liu @ 2023-09-14  7:21 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu,
	Robert Hoo

From: Zhao Liu <zhao1.liu@intel.com>

Referring to the fixes of cache_info_passthrough ([1], [2]) and to the
SDM, CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26] should
use the nearest power-of-2 integer.

The nearest power-of-2 integer can be calculated by pow2ceil() or by
using APIC ID offset (like L3 topology using 1 << die_offset [3]).

But in fact, CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26]
are associated with the APIC ID. For example, in the Linux kernel, the
field "num_threads_sharing" (bits 25 - 14) is parsed with the APIC ID.
As another example, on Alder Lake-P, CPUID.04H:EAX[bits 31:26] does
not match the actual core count; it is calculated as:
"(1 << (pkg_offset - core_offset)) - 1".

Therefore, the APIC ID offset should be preferred for calculating the
nearest power-of-2 integer for CPUID.04H:EAX[bits 25:14] and
CPUID.04H:EAX[bits 31:26]:
1. d/i cache is shared in a core, so 1 << core_offset should be used
   instead of "cs->nr_threads" in encode_cache_cpuid4() for
   CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits 25:14].
2. L2 cache is supposed to be shared in a core as for now, so
   1 << core_offset should also be used instead of "cs->nr_threads"
   in encode_cache_cpuid4() for CPUID.04H.02H:EAX[bits 25:14].
3. Similarly, the value for CPUID.04H:EAX[bits 31:26] should also be
   calculated with the bit width between the package and SMT levels in
   the APIC ID ((1 << (pkg_offset - core_offset)) - 1).

In addition, use the APIC ID offset to replace pow2ceil() for the
cache_info_passthrough case.
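
A worked example of these rules (numbers chosen for illustration):
with 2 threads per core, 3 cores per die and 1 die per package,
core_offset is 1 and pkg_offset is 3 (1 bit for 2 threads + 2 bits for
3 cores). Then CPUID.04H.00H:EAX[bits 25:14] = (1 << 1) - 1 = 1, and
CPUID.04H:EAX[bits 31:26] = (1 << (3 - 1)) - 1 = 3, i.e. 3 cores are
rounded up to 4 addressable core IDs.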

[1]: efb3934adf9e ("x86: cpu: make sure number of addressable IDs for processor cores meets the spec")
[2]: d7caf13b5fcf ("x86: cpu: fixup number of addressable IDs for logical processors sharing cache")
[3]: d65af288a84d ("i386: Update new x86_apicid parsing rules with die_offset support")

Fixes: 7e3482f82480 ("i386: Helpers to encode cache information consistently")
Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes since v3:
 * Fix compile warnings. (Babu)
 * Fix spelling typo.

Changes since v1:
 * Use APIC ID offset to replace "pow2ceil()" for cache_info_passthrough
   case. (Yanan)
 * Split the L1 cache fix into a separate patch.
 * Rename the title of this patch (the original is "i386/cpu: Fix number
   of addressable IDs in CPUID.04H").
---
 target/i386/cpu.c | 30 +++++++++++++++++++++++-------
 1 file changed, 23 insertions(+), 7 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index c5c2a045e032..a3d67c1762c0 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6009,7 +6009,6 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
 {
     X86CPU *cpu = env_archcpu(env);
     CPUState *cs = env_cpu(env);
-    uint32_t die_offset;
     uint32_t limit;
     uint32_t signature[3];
     X86CPUTopoInfo topo_info;
@@ -6093,39 +6092,56 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
                 int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
                 int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
                 if (cs->nr_cores > 1) {
+                    int addressable_cores_offset =
+                                                apicid_pkg_offset(&topo_info) -
+                                                apicid_core_offset(&topo_info);
+
                     *eax &= ~0xFC000000;
-                    *eax |= (pow2ceil(cs->nr_cores) - 1) << 26;
+                    *eax |= (1 << (addressable_cores_offset - 1)) << 26;
                 }
                 if (host_vcpus_per_cache > vcpus_per_socket) {
+                    int pkg_offset = apicid_pkg_offset(&topo_info);
+
                     *eax &= ~0x3FFC000;
-                    *eax |= (pow2ceil(vcpus_per_socket) - 1) << 14;
+                    *eax |= (1 << (pkg_offset - 1)) << 14;
                 }
             }
         } else if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
             *eax = *ebx = *ecx = *edx = 0;
         } else {
             *eax = 0;
+            int addressable_cores_offset = apicid_pkg_offset(&topo_info) -
+                                           apicid_core_offset(&topo_info);
+            int core_offset, die_offset;
+
             switch (count) {
             case 0: /* L1 dcache info */
+                core_offset = apicid_core_offset(&topo_info);
                 encode_cache_cpuid4(env->cache_info_cpuid4.l1d_cache,
-                                    cs->nr_threads, cs->nr_cores,
+                                    (1 << core_offset),
+                                    (1 << addressable_cores_offset),
                                     eax, ebx, ecx, edx);
                 break;
             case 1: /* L1 icache info */
+                core_offset = apicid_core_offset(&topo_info);
                 encode_cache_cpuid4(env->cache_info_cpuid4.l1i_cache,
-                                    cs->nr_threads, cs->nr_cores,
+                                    (1 << core_offset),
+                                    (1 << addressable_cores_offset),
                                     eax, ebx, ecx, edx);
                 break;
             case 2: /* L2 cache info */
+                core_offset = apicid_core_offset(&topo_info);
                 encode_cache_cpuid4(env->cache_info_cpuid4.l2_cache,
-                                    cs->nr_threads, cs->nr_cores,
+                                    (1 << core_offset),
+                                    (1 << addressable_cores_offset),
                                     eax, ebx, ecx, edx);
                 break;
             case 3: /* L3 cache info */
                 die_offset = apicid_die_offset(&topo_info);
                 if (cpu->enable_l3_cache) {
                     encode_cache_cpuid4(env->cache_info_cpuid4.l3_cache,
-                                        (1 << die_offset), cs->nr_cores,
+                                        (1 << die_offset),
+                                        (1 << addressable_cores_offset),
                                         eax, ebx, ecx, edx);
                     break;
                 }
-- 
2.34.1



* [PATCH v4 07/21] i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
  2023-09-14  7:21 [PATCH v4 00/21] Support smp.clusters for x86 in QEMU Zhao Liu
                   ` (5 preceding siblings ...)
  2023-09-14  7:21 ` [PATCH v4 06/21] i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4] Zhao Liu
@ 2023-09-14  7:21 ` Zhao Liu
  2023-09-14  7:21 ` [PATCH v4 08/21] i386: Split topology types of CPUID[0x1F] from the definitions of CPUID[0xB] Zhao Liu
                   ` (15 subsequent siblings)
  22 siblings, 0 replies; 43+ messages in thread
From: Zhao Liu @ 2023-09-14  7:21 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu,
	Robert Hoo

From: Zhao Liu <zhao1.liu@intel.com>

In cpu_x86_cpuid(), there are many variables representing the CPU
topology, e.g., topo_info and cs->nr_cores/cs->nr_threads.

Since the names of cs->nr_cores/cs->nr_threads do not accurately
represent their meaning, their use is prone to confusion and mistakes.

The structure X86CPUTopoInfo names its members clearly, thus the
variable "topo_info" should be preferred.

In addition, to uniformly use the topology variables in
cpu_x86_cpuid(), replace env->nr_dies with topo_info.dies_per_pkg as
well.
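
For example, with "-smp 16,sockets=2,dies=2,cores=2,threads=2",
cores_per_pkg = 2 * 2 = 4 and cpus_per_pkg = 4 * 2 = 8, which are the
per-package values the CPUID encoding expects.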

Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes since v3:
 * Fix typo. (Babu)

Changes since v1:
 * Extract cores_per_socket from the code block and use it as a local
   variable for cpu_x86_cpuid(). (Yanan)
 * Remove vcpus_per_socket variable and use cpus_per_pkg directly.
   (Yanan)
 * Replace env->dies with topo_info.dies_per_pkg in cpu_x86_cpuid().
---
 target/i386/cpu.c | 31 ++++++++++++++++++-------------
 1 file changed, 18 insertions(+), 13 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index a3d67c1762c0..0e9a33034026 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6012,11 +6012,16 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
     uint32_t limit;
     uint32_t signature[3];
     X86CPUTopoInfo topo_info;
+    uint32_t cores_per_pkg;
+    uint32_t cpus_per_pkg;
 
     topo_info.dies_per_pkg = env->nr_dies;
     topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
     topo_info.threads_per_core = cs->nr_threads;
 
+    cores_per_pkg = topo_info.cores_per_die * topo_info.dies_per_pkg;
+    cpus_per_pkg = cores_per_pkg * topo_info.threads_per_core;
+
     /* Calculate & apply limits for different index ranges */
     if (index >= 0xC0000000) {
         limit = env->cpuid_xlevel2;
@@ -6052,8 +6057,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             *ecx |= CPUID_EXT_OSXSAVE;
         }
         *edx = env->features[FEAT_1_EDX];
-        if (cs->nr_cores * cs->nr_threads > 1) {
-            *ebx |= (cs->nr_cores * cs->nr_threads) << 16;
+        if (cpus_per_pkg > 1) {
+            *ebx |= cpus_per_pkg << 16;
             *edx |= CPUID_HT;
         }
         if (!cpu->enable_pmu) {
@@ -6090,8 +6095,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
              */
             if (*eax & 31) {
                 int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
-                int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
-                if (cs->nr_cores > 1) {
+
+                if (cores_per_pkg > 1) {
                     int addressable_cores_offset =
                                                 apicid_pkg_offset(&topo_info) -
                                                 apicid_core_offset(&topo_info);
@@ -6099,7 +6104,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
                     *eax &= ~0xFC000000;
                     *eax |= (1 << (addressable_cores_offset - 1)) << 26;
                 }
-                if (host_vcpus_per_cache > vcpus_per_socket) {
+                if (host_vcpus_per_cache > cpus_per_pkg) {
                     int pkg_offset = apicid_pkg_offset(&topo_info);
 
                     *eax &= ~0x3FFC000;
@@ -6244,12 +6249,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         switch (count) {
         case 0:
             *eax = apicid_core_offset(&topo_info);
-            *ebx = cs->nr_threads;
+            *ebx = topo_info.threads_per_core;
             *ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
             break;
         case 1:
             *eax = apicid_pkg_offset(&topo_info);
-            *ebx = cs->nr_cores * cs->nr_threads;
+            *ebx = cpus_per_pkg;
             *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
             break;
         default:
@@ -6270,7 +6275,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         break;
     case 0x1F:
         /* V2 Extended Topology Enumeration Leaf */
-        if (env->nr_dies < 2) {
+        if (topo_info.dies_per_pkg < 2) {
             *eax = *ebx = *ecx = *edx = 0;
             break;
         }
@@ -6280,7 +6285,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         switch (count) {
         case 0:
             *eax = apicid_core_offset(&topo_info);
-            *ebx = cs->nr_threads;
+            *ebx = topo_info.threads_per_core;
             *ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
             break;
         case 1:
@@ -6290,7 +6295,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             break;
         case 2:
             *eax = apicid_pkg_offset(&topo_info);
-            *ebx = cs->nr_cores * cs->nr_threads;
+            *ebx = cpus_per_pkg;
             *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
             break;
         default:
@@ -6515,7 +6520,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
          * discards multiple thread information if it is set.
          * So don't set it here for Intel to make Linux guests happy.
          */
-        if (cs->nr_cores * cs->nr_threads > 1) {
+        if (cpus_per_pkg > 1) {
             if (env->cpuid_vendor1 != CPUID_VENDOR_INTEL_1 ||
                 env->cpuid_vendor2 != CPUID_VENDOR_INTEL_2 ||
                 env->cpuid_vendor3 != CPUID_VENDOR_INTEL_3) {
@@ -6581,7 +6586,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
              *eax |= (cpu_x86_virtual_addr_width(env) << 8);
         }
         *ebx = env->features[FEAT_8000_0008_EBX];
-        if (cs->nr_cores * cs->nr_threads > 1) {
+        if (cpus_per_pkg > 1) {
             /*
              * Bits 15:12 is "The number of bits in the initial
              * Core::X86::Apic::ApicId[ApicId] value that indicate
@@ -6589,7 +6594,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
              * Bits 7:0 is "The number of threads in the package is NC+1"
              */
             *ecx = (apicid_pkg_offset(&topo_info) << 12) |
-                   ((cs->nr_cores * cs->nr_threads) - 1);
+                   (cpus_per_pkg - 1);
         } else {
             *ecx = 0;
         }
-- 
2.34.1



* [PATCH v4 08/21] i386: Split topology types of CPUID[0x1F] from the definitions of CPUID[0xB]
  2023-09-14  7:21 [PATCH v4 00/21] Support smp.clusters for x86 in QEMU Zhao Liu
                   ` (6 preceding siblings ...)
  2023-09-14  7:21 ` [PATCH v4 07/21] i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid() Zhao Liu
@ 2023-09-14  7:21 ` Zhao Liu
  2023-09-14  7:21 ` [PATCH v4 09/21] i386: Decouple CPUID[0x1F] subleaf with specific topology level Zhao Liu
                   ` (14 subsequent siblings)
  22 siblings, 0 replies; 43+ messages in thread
From: Zhao Liu @ 2023-09-14  7:21 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

CPUID[0xB] defines the SMT, Core and Invalid types, and this leaf is
shared by Intel and AMD CPUs.

But for extended topology levels, Intel CPUs (in CPUID[0x1F]) and AMD
CPUs (in CPUID[0x80000026]) have different definitions with different
enumeration values.

Though CPUID[0x80000026] hasn't been implemented in QEMU, to avoid
possible misunderstanding, split the topology types of CPUID[0x1F]
from the definitions of CPUID[0xB] and introduce
CPUID[0x1F]-specific topology types.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes since v3:
 * New commit to prepare to refactor CPUID[0x1F] encoding.
---
 target/i386/cpu.c | 14 +++++++-------
 target/i386/cpu.h | 13 +++++++++----
 2 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 0e9a33034026..88ccc9df9118 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6250,17 +6250,17 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         case 0:
             *eax = apicid_core_offset(&topo_info);
             *ebx = topo_info.threads_per_core;
-            *ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
+            *ecx |= CPUID_B_ECX_TOPO_LEVEL_SMT << 8;
             break;
         case 1:
             *eax = apicid_pkg_offset(&topo_info);
             *ebx = cpus_per_pkg;
-            *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
+            *ecx |= CPUID_B_ECX_TOPO_LEVEL_CORE << 8;
             break;
         default:
             *eax = 0;
             *ebx = 0;
-            *ecx |= CPUID_TOPOLOGY_LEVEL_INVALID;
+            *ecx |= CPUID_B_ECX_TOPO_LEVEL_INVALID << 8;
         }
 
         assert(!(*eax & ~0x1f));
@@ -6286,22 +6286,22 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         case 0:
             *eax = apicid_core_offset(&topo_info);
             *ebx = topo_info.threads_per_core;
-            *ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
+            *ecx |= CPUID_1F_ECX_TOPO_LEVEL_SMT << 8;
             break;
         case 1:
             *eax = apicid_die_offset(&topo_info);
             *ebx = topo_info.cores_per_die * topo_info.threads_per_core;
-            *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
+            *ecx |= CPUID_1F_ECX_TOPO_LEVEL_CORE << 8;
             break;
         case 2:
             *eax = apicid_pkg_offset(&topo_info);
             *ebx = cpus_per_pkg;
-            *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
+            *ecx |= CPUID_1F_ECX_TOPO_LEVEL_DIE << 8;
             break;
         default:
             *eax = 0;
             *ebx = 0;
-            *ecx |= CPUID_TOPOLOGY_LEVEL_INVALID;
+            *ecx |= CPUID_1F_ECX_TOPO_LEVEL_INVALID << 8;
         }
         assert(!(*eax & ~0x1f));
         *ebx &= 0xffff; /* The count doesn't need to be reliable. */
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 70eb3bc23eb8..97113ad3d002 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1009,10 +1009,15 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 #define CPUID_MWAIT_EMX     (1U << 0) /* enumeration supported */
 
 /* CPUID[0xB].ECX level types */
-#define CPUID_TOPOLOGY_LEVEL_INVALID  (0U << 8)
-#define CPUID_TOPOLOGY_LEVEL_SMT      (1U << 8)
-#define CPUID_TOPOLOGY_LEVEL_CORE     (2U << 8)
-#define CPUID_TOPOLOGY_LEVEL_DIE      (5U << 8)
+#define CPUID_B_ECX_TOPO_LEVEL_INVALID  0
+#define CPUID_B_ECX_TOPO_LEVEL_SMT      1
+#define CPUID_B_ECX_TOPO_LEVEL_CORE     2
+
+/* CPUID[0x1F].ECX level types */
+#define CPUID_1F_ECX_TOPO_LEVEL_INVALID  CPUID_B_ECX_TOPO_LEVEL_INVALID
+#define CPUID_1F_ECX_TOPO_LEVEL_SMT      CPUID_B_ECX_TOPO_LEVEL_SMT
+#define CPUID_1F_ECX_TOPO_LEVEL_CORE     CPUID_B_ECX_TOPO_LEVEL_CORE
+#define CPUID_1F_ECX_TOPO_LEVEL_DIE      5
 
 /* MSR Feature Bits */
 #define MSR_ARCH_CAP_RDCL_NO            (1U << 0)
-- 
2.34.1



* [PATCH v4 09/21] i386: Decouple CPUID[0x1F] subleaf with specific topology level
  2023-09-14  7:21 [PATCH v4 00/21] Support smp.clusters for x86 in QEMU Zhao Liu
                   ` (7 preceding siblings ...)
  2023-09-14  7:21 ` [PATCH v4 08/21] i386: Split topology types of CPUID[0x1F] from the definitions of CPUID[0xB] Zhao Liu
@ 2023-09-14  7:21 ` Zhao Liu
  2023-09-14  7:21 ` [PATCH v4 10/21] i386: Introduce module-level cpu topology to CPUX86State Zhao Liu
                   ` (13 subsequent siblings)
  22 siblings, 0 replies; 43+ messages in thread
From: Zhao Liu @ 2023-09-14  7:21 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

At present, subleaf 0x02 of CPUID[0x1F] is bound to the "die" level.

In fact, the specific topology levels exposed in 0x1F depend on the
platform's support for the extension levels (module, tile and die).

To help expose the "module" level in 0x1F, decouple the CPUID[0x1F]
subleaves from specific topology levels.
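
For example, with this decoupling, once the module level is exposed
(in a later patch of this series), subleaf 2 can enumerate the module
level and subleaf 3 the die level, without hard-coding each subleaf
index to a fixed level.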

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes since v3:
 * New patch to prepare to expose module level in 0x1F.
 * Move the CPUTopoLevel enumeration definition from "i386: Add cache
   topology info in CPUCacheInfo" to this patch. Note, to align with
   topology types in SDM, revert the name of CPU_TOPO_LEVEL_UNKNOW to
   CPU_TOPO_LEVEL_INVALID.
---
 target/i386/cpu.c | 136 +++++++++++++++++++++++++++++++++++++---------
 target/i386/cpu.h |  15 +++++
 2 files changed, 126 insertions(+), 25 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 88ccc9df9118..10e11c85f459 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -268,6 +268,116 @@ static void encode_cache_cpuid4(CPUCacheInfo *cache,
            (cache->complex_indexing ? CACHE_COMPLEX_IDX : 0);
 }
 
+static uint32_t num_cpus_by_topo_level(X86CPUTopoInfo *topo_info,
+                                       enum CPUTopoLevel topo_level)
+{
+    switch (topo_level) {
+    case CPU_TOPO_LEVEL_SMT:
+        return 1;
+    case CPU_TOPO_LEVEL_CORE:
+        return topo_info->threads_per_core;
+    case CPU_TOPO_LEVEL_DIE:
+        return topo_info->threads_per_core * topo_info->cores_per_die;
+    case CPU_TOPO_LEVEL_PACKAGE:
+        return topo_info->threads_per_core * topo_info->cores_per_die *
+               topo_info->dies_per_pkg;
+    default:
+        g_assert_not_reached();
+    }
+    return 0;
+}
+
+static uint32_t apicid_offset_by_topo_level(X86CPUTopoInfo *topo_info,
+                                            enum CPUTopoLevel topo_level)
+{
+    switch (topo_level) {
+    case CPU_TOPO_LEVEL_SMT:
+        return 0;
+    case CPU_TOPO_LEVEL_CORE:
+        return apicid_core_offset(topo_info);
+    case CPU_TOPO_LEVEL_DIE:
+        return apicid_die_offset(topo_info);
+    case CPU_TOPO_LEVEL_PACKAGE:
+        return apicid_pkg_offset(topo_info);
+    default:
+        g_assert_not_reached();
+    }
+    return 0;
+}
+
+static uint32_t cpuid1f_topo_type(enum CPUTopoLevel topo_level)
+{
+    switch (topo_level) {
+    case CPU_TOPO_LEVEL_INVALID:
+        return CPUID_1F_ECX_TOPO_LEVEL_INVALID;
+    case CPU_TOPO_LEVEL_SMT:
+        return CPUID_1F_ECX_TOPO_LEVEL_SMT;
+    case CPU_TOPO_LEVEL_CORE:
+        return CPUID_1F_ECX_TOPO_LEVEL_CORE;
+    case CPU_TOPO_LEVEL_DIE:
+        return CPUID_1F_ECX_TOPO_LEVEL_DIE;
+    default:
+        /* Other types are not supported in QEMU. */
+        g_assert_not_reached();
+    }
+    return 0;
+}
+
+static void encode_topo_cpuid1f(CPUX86State *env, uint32_t count,
+                                X86CPUTopoInfo *topo_info,
+                                uint32_t *eax, uint32_t *ebx,
+                                uint32_t *ecx, uint32_t *edx)
+{
+    static DECLARE_BITMAP(topo_bitmap, CPU_TOPO_LEVEL_MAX);
+    X86CPU *cpu = env_archcpu(env);
+    unsigned long level, next_level;
+    uint32_t num_cpus_next_level, offset_next_level;
+
+    /*
+     * Initialize the bitmap to decide which levels should be
+     * encoded in 0x1f.
+     */
+    if (!count) {
+        /* SMT and core levels are exposed in 0x1f leaf by default. */
+        set_bit(CPU_TOPO_LEVEL_SMT, topo_bitmap);
+        set_bit(CPU_TOPO_LEVEL_CORE, topo_bitmap);
+
+        if (env->nr_dies > 1) {
+            set_bit(CPU_TOPO_LEVEL_DIE, topo_bitmap);
+        }
+    }
+
+    *ecx = count & 0xff;
+    *edx = cpu->apic_id;
+
+    level = find_first_bit(topo_bitmap, CPU_TOPO_LEVEL_MAX);
+    if (level == CPU_TOPO_LEVEL_MAX) {
+        num_cpus_next_level = 0;
+        offset_next_level = 0;
+
+        /* Encode CPU_TOPO_LEVEL_INVALID into the last subleaf of 0x1f. */
+        level = CPU_TOPO_LEVEL_INVALID;
+    } else {
+        next_level = find_next_bit(topo_bitmap, CPU_TOPO_LEVEL_MAX, level + 1);
+        if (next_level == CPU_TOPO_LEVEL_MAX) {
+            next_level = CPU_TOPO_LEVEL_PACKAGE;
+        }
+
+        num_cpus_next_level = num_cpus_by_topo_level(topo_info, next_level);
+        offset_next_level = apicid_offset_by_topo_level(topo_info, next_level);
+    }
+
+    *eax = offset_next_level;
+    *ebx = num_cpus_next_level;
+    *ecx |= cpuid1f_topo_type(level) << 8;
+
+    assert(!(*eax & ~0x1f));
+    *ebx &= 0xffff; /* The count doesn't need to be reliable. */
+    if (level != CPU_TOPO_LEVEL_MAX) {
+        clear_bit(level, topo_bitmap);
+    }
+}
+
 /* Encode cache info for CPUID[0x80000005].ECX or CPUID[0x80000005].EDX */
 static uint32_t encode_cache_cpuid80000005(CPUCacheInfo *cache)
 {
@@ -6280,31 +6390,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             break;
         }
 
-        *ecx = count & 0xff;
-        *edx = cpu->apic_id;
-        switch (count) {
-        case 0:
-            *eax = apicid_core_offset(&topo_info);
-            *ebx = topo_info.threads_per_core;
-            *ecx |= CPUID_1F_ECX_TOPO_LEVEL_SMT << 8;
-            break;
-        case 1:
-            *eax = apicid_die_offset(&topo_info);
-            *ebx = topo_info.cores_per_die * topo_info.threads_per_core;
-            *ecx |= CPUID_1F_ECX_TOPO_LEVEL_CORE << 8;
-            break;
-        case 2:
-            *eax = apicid_pkg_offset(&topo_info);
-            *ebx = cpus_per_pkg;
-            *ecx |= CPUID_1F_ECX_TOPO_LEVEL_DIE << 8;
-            break;
-        default:
-            *eax = 0;
-            *ebx = 0;
-            *ecx |= CPUID_1F_ECX_TOPO_LEVEL_INVALID << 8;
-        }
-        assert(!(*eax & ~0x1f));
-        *ebx &= 0xffff; /* The count doesn't need to be reliable. */
+        encode_topo_cpuid1f(env, count, &topo_info, eax, ebx, ecx, edx);
         break;
     case 0xD: {
         /* Processor Extended State */
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 97113ad3d002..470257b92240 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1008,6 +1008,21 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 #define CPUID_MWAIT_IBE     (1U << 1) /* Interrupts can exit capability */
 #define CPUID_MWAIT_EMX     (1U << 0) /* enumeration supported */
 
+/*
+ * CPUTopoLevel is the general i386 topology hierarchical representation,
+ * ordered by increasing hierarchical relationship.
+ * Its enumeration value is not bound to the type value of Intel (CPUID[0x1F])
+ * or AMD (CPUID[0x80000026]).
+ */
+enum CPUTopoLevel {
+    CPU_TOPO_LEVEL_INVALID,
+    CPU_TOPO_LEVEL_SMT,
+    CPU_TOPO_LEVEL_CORE,
+    CPU_TOPO_LEVEL_DIE,
+    CPU_TOPO_LEVEL_PACKAGE,
+    CPU_TOPO_LEVEL_MAX,
+};
+
 /* CPUID[0xB].ECX level types */
 #define CPUID_B_ECX_TOPO_LEVEL_INVALID  0
 #define CPUID_B_ECX_TOPO_LEVEL_SMT      1
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v4 10/21] i386: Introduce module-level cpu topology to CPUX86State
  2023-09-14  7:21 [PATCH v4 00/21] Support smp.clusters for x86 in QEMU Zhao Liu
                   ` (8 preceding siblings ...)
  2023-09-14  7:21 ` [PATCH v4 09/21] i386: Decouple CPUID[0x1F] subleaf with specific topology level Zhao Liu
@ 2023-09-14  7:21 ` Zhao Liu
  2023-09-14  7:38   ` Philippe Mathieu-Daudé
  2023-09-14  7:21 ` [PATCH v4 11/21] i386: Support modules_per_die in X86CPUTopoInfo Zhao Liu
                   ` (12 subsequent siblings)
  22 siblings, 1 reply; 43+ messages in thread
From: Zhao Liu @ 2023-09-14  7:21 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu,
	Zhuocheng Ding

From: Zhuocheng Ding <zhuocheng.ding@intel.com>

The smp command line has a "clusters" parameter, but x86 hasn't
supported that level yet. A "cluster" is a CPU topology level above
cores, in which the cores may share some resources (the L2 cache, or
others such as L3 cache tags, depending on the architecture) [1][2].
For x86, the resource shared by cores at the cluster level is mainly
the L2 cache.

However, using cluster to define x86's L2 cache topology would cause a
compatibility problem:

Currently, x86 defaults to sharing the L2 cache within one core, which
actually implies a default setting of "1 core per L2 cache" and
therefore, implicitly, as many L2 caches as cores.

For example (i386 PC machine):
-smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16 (*)

Considering the topology of the L2 cache, this (*) implicitly means "1
core per L2 cache" and "2 L2 caches per die".

If we used cluster to configure the L2 cache topology, with a new
default setting of "1 cluster per L2 cache", the above semantics would
change to "2 cores per cluster" and "1 cluster per L2 cache", that is,
"2 cores per L2 cache".

So the same command (*) would change the L2 cache topology, which in
turn affects the performance of the virtual machine.

Therefore, x86 should treat cluster only as a CPU topology level and,
for compatibility, avoid using it to change the default L2 cache
topology.

"cluster" in smp is the CPU topology level which is between "core" and
die.

For x86, the "cluster" in smp is corresponding to the module level [2],
which is above the core level. So use the "module" other than "cluster"
in i386 code.

And please note that x86 already has a cpu topology level also named
"cluster" [3], this level is at the upper level of the package. Here,
the cluster in x86 cpu topology is completely different from the
"clusters" as the smp parameter. After the module level is introduced,
the cluster as the smp parameter will actually refer to the module level
of x86.

[1]: 864c3b5c32f0 ("hw/core/machine: Introduce CPU cluster topology support")
[2]: Yanan's comment about "cluster",
     https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg04051.html
[3]: SDM, vol.3, ch.9, 9.9.1 Hierarchical Mapping of Shared Resources.

Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
---
Changes since v1:
 * The background of the introduction of the "cluster" parameter and its
   exact meaning were revised according to Yanan's explanation. (Yanan)
---
 hw/i386/x86.c     | 1 +
 target/i386/cpu.c | 1 +
 target/i386/cpu.h | 5 +++++
 3 files changed, 7 insertions(+)

diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index f034df8bf628..9c61b6882b99 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -306,6 +306,7 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
     init_topo_info(&topo_info, x86ms);
 
     env->nr_dies = ms->smp.dies;
+    env->nr_modules = ms->smp.clusters;
 
     /*
      * If APIC ID is not set,
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 10e11c85f459..401409c5db08 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -7696,6 +7696,7 @@ static void x86_cpu_initfn(Object *obj)
     CPUX86State *env = &cpu->env;
 
     env->nr_dies = 1;
+    env->nr_modules = 1;
     cpu_set_cpustate_pointers(cpu);
 
     object_property_add(obj, "feature-words", "X86CPUFeatureWordInfo",
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 470257b92240..556e80f29764 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1903,6 +1903,11 @@ typedef struct CPUArchState {
 
     /* Number of dies within this CPU package. */
     unsigned nr_dies;
+    /*
+     * Number of modules within this CPU package.
+     * Module level in x86 cpu topology is corresponding to smp.clusters.
+     */
+    unsigned nr_modules;
 } CPUX86State;
 
 struct kvm_msrs;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v4 11/21] i386: Support modules_per_die in X86CPUTopoInfo
  2023-09-14  7:21 [PATCH v4 00/21] Support smp.clusters for x86 in QEMU Zhao Liu
                   ` (9 preceding siblings ...)
  2023-09-14  7:21 ` [PATCH v4 10/21] i386: Introduce module-level cpu topology to CPUX86State Zhao Liu
@ 2023-09-14  7:21 ` Zhao Liu
  2023-09-14  7:21 ` [PATCH v4 12/21] i386: Expose module level in CPUID[0x1F] Zhao Liu
                   ` (11 subsequent siblings)
  22 siblings, 0 replies; 43+ messages in thread
From: Zhao Liu @ 2023-09-14  7:21 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu,
	Zhuocheng Ding

From: Zhuocheng Ding <zhuocheng.ding@intel.com>

Support the module level in the i386 CPU topology structure
"X86CPUTopoInfo".

Since x86 does not yet support the "clusters" parameter in "-smp",
X86CPUTopoInfo.modules_per_die is currently always 1. Therefore, the
module-level width in the APIC ID, which can be calculated as
"apicid_bitwidth_for_count(topo_info->modules_per_die)", is always 0
for now, so we can directly add the APIC-ID-related helpers to support
module-level parsing; a width-rule sketch follows below.

In addition, update the topology structure in test-x86-topo.c.
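
As a quick illustration of the width rule these helpers rely on, here
is a minimal self-contained sketch (not the QEMU implementation, which
uses clz32() in apicid_bitwidth_for_count()):

    /* Bits needed to represent the IDs 0..count-1; count is assumed
     * to be >= 1. */
    static unsigned bitwidth_for_count(unsigned count)
    {
        unsigned width = 0;

        count -= 1;            /* the highest ID that must fit */
        while (count) {
            width++;
            count >>= 1;
        }
        return width;          /* count=1 -> 0, count=6 -> 3 */
    }

So with modules_per_die == 1, the module field occupies 0 bits and the
APIC ID layout is unchanged.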

Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes since v3:
 * Drop the description about not exposing module level in commit
   message.
 * Update topology related calculation in newly added helpers:
   num_cpus_by_topo_level() and apicid_offset_by_topo_level().
 * Since the code change, drop the "Acked-by" tag.

Changes since v1:
 * Include module level related helpers (apicid_module_width() and
   apicid_module_offset()) in this patch. (Yanan)
---
 hw/i386/x86.c              |  3 ++-
 include/hw/i386/topology.h | 22 +++++++++++++++----
 target/i386/cpu.c          | 17 +++++++++-----
 tests/unit/test-x86-topo.c | 45 ++++++++++++++++++++------------------
 4 files changed, 55 insertions(+), 32 deletions(-)

diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 9c61b6882b99..267bb0f96ca5 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -72,7 +72,8 @@ static void init_topo_info(X86CPUTopoInfo *topo_info,
     MachineState *ms = MACHINE(x86ms);
 
     topo_info->dies_per_pkg = ms->smp.dies;
-    topo_info->cores_per_die = ms->smp.cores;
+    topo_info->modules_per_die = ms->smp.clusters;
+    topo_info->cores_per_module = ms->smp.cores;
     topo_info->threads_per_core = ms->smp.threads;
 }
 
diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index 5a19679f618b..c807d3811dd3 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -56,7 +56,8 @@ typedef struct X86CPUTopoIDs {
 
 typedef struct X86CPUTopoInfo {
     unsigned dies_per_pkg;
-    unsigned cores_per_die;
+    unsigned modules_per_die;
+    unsigned cores_per_module;
     unsigned threads_per_core;
 } X86CPUTopoInfo;
 
@@ -77,7 +78,13 @@ static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
 /* Bit width of the Core_ID field */
 static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
 {
-    return apicid_bitwidth_for_count(topo_info->cores_per_die);
+    return apicid_bitwidth_for_count(topo_info->cores_per_module);
+}
+
+/* Bit width of the Module_ID (cluster ID) field */
+static inline unsigned apicid_module_width(X86CPUTopoInfo *topo_info)
+{
+    return apicid_bitwidth_for_count(topo_info->modules_per_die);
 }
 
 /* Bit width of the Die_ID field */
@@ -92,10 +99,16 @@ static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
     return apicid_smt_width(topo_info);
 }
 
+/* Bit offset of the Module_ID (cluster ID) field */
+static inline unsigned apicid_module_offset(X86CPUTopoInfo *topo_info)
+{
+    return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
+}
+
 /* Bit offset of the Die_ID field */
 static inline unsigned apicid_die_offset(X86CPUTopoInfo *topo_info)
 {
-    return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
+    return apicid_module_offset(topo_info) + apicid_module_width(topo_info);
 }
 
 /* Bit offset of the Pkg_ID (socket ID) field */
@@ -127,7 +140,8 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
                                          X86CPUTopoIDs *topo_ids)
 {
     unsigned nr_dies = topo_info->dies_per_pkg;
-    unsigned nr_cores = topo_info->cores_per_die;
+    unsigned nr_cores = topo_info->cores_per_module *
+                        topo_info->modules_per_die;
     unsigned nr_threads = topo_info->threads_per_core;
 
     topo_ids->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 401409c5db08..cef9a4606d89 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -277,10 +277,11 @@ static uint32_t num_cpus_by_topo_level(X86CPUTopoInfo *topo_info,
     case CPU_TOPO_LEVEL_CORE:
         return topo_info->threads_per_core;
     case CPU_TOPO_LEVEL_DIE:
-        return topo_info->threads_per_core * topo_info->cores_per_die;
+        return topo_info->threads_per_core * topo_info->cores_per_module *
+               topo_info->modules_per_die;
     case CPU_TOPO_LEVEL_PACKAGE:
-        return topo_info->threads_per_core * topo_info->cores_per_die *
-               topo_info->dies_per_pkg;
+        return topo_info->threads_per_core * topo_info->cores_per_module *
+               topo_info->modules_per_die * topo_info->dies_per_pkg;
     default:
         g_assert_not_reached();
     }
@@ -449,7 +450,9 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
 
     /* L3 is shared among multiple cores */
     if (cache->level == 3) {
-        l3_threads = topo_info->cores_per_die * topo_info->threads_per_core;
+        l3_threads = topo_info->modules_per_die *
+                     topo_info->cores_per_module *
+                     topo_info->threads_per_core;
         *eax |= (l3_threads - 1) << 14;
     } else {
         *eax |= ((topo_info->threads_per_core - 1) << 14);
@@ -6126,10 +6129,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
     uint32_t cpus_per_pkg;
 
     topo_info.dies_per_pkg = env->nr_dies;
-    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
+    topo_info.modules_per_die = env->nr_modules;
+    topo_info.cores_per_module = cs->nr_cores / env->nr_dies / env->nr_modules;
     topo_info.threads_per_core = cs->nr_threads;
 
-    cores_per_pkg = topo_info.cores_per_die * topo_info.dies_per_pkg;
+    cores_per_pkg = topo_info.cores_per_module * topo_info.modules_per_die *
+                    topo_info.dies_per_pkg;
     cpus_per_pkg = cores_per_pkg * topo_info.threads_per_core;
 
     /* Calculate & apply limits for different index ranges */
diff --git a/tests/unit/test-x86-topo.c b/tests/unit/test-x86-topo.c
index 2b104f86d7c2..f21b8a5d95c2 100644
--- a/tests/unit/test-x86-topo.c
+++ b/tests/unit/test-x86-topo.c
@@ -30,13 +30,16 @@ static void test_topo_bits(void)
 {
     X86CPUTopoInfo topo_info = {0};
 
-    /* simple tests for 1 thread per core, 1 core per die, 1 die per package */
-    topo_info = (X86CPUTopoInfo) {1, 1, 1};
+    /*
+     * simple tests for 1 thread per core, 1 core per module,
+     *                  1 module per die, 1 die per package
+     */
+    topo_info = (X86CPUTopoInfo) {1, 1, 1, 1};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 0);
     g_assert_cmpuint(apicid_core_width(&topo_info), ==, 0);
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
 
-    topo_info = (X86CPUTopoInfo) {1, 1, 1};
+    topo_info = (X86CPUTopoInfo) {1, 1, 1, 1};
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
@@ -45,39 +48,39 @@ static void test_topo_bits(void)
 
     /* Test field width calculation for multiple values
      */
-    topo_info = (X86CPUTopoInfo) {1, 1, 2};
+    topo_info = (X86CPUTopoInfo) {1, 1, 1, 2};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 1);
-    topo_info = (X86CPUTopoInfo) {1, 1, 3};
+    topo_info = (X86CPUTopoInfo) {1, 1, 1, 3};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
-    topo_info = (X86CPUTopoInfo) {1, 1, 4};
+    topo_info = (X86CPUTopoInfo) {1, 1, 1, 4};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
 
-    topo_info = (X86CPUTopoInfo) {1, 1, 14};
+    topo_info = (X86CPUTopoInfo) {1, 1, 1, 14};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
-    topo_info = (X86CPUTopoInfo) {1, 1, 15};
+    topo_info = (X86CPUTopoInfo) {1, 1, 1, 15};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
-    topo_info = (X86CPUTopoInfo) {1, 1, 16};
+    topo_info = (X86CPUTopoInfo) {1, 1, 1, 16};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
-    topo_info = (X86CPUTopoInfo) {1, 1, 17};
+    topo_info = (X86CPUTopoInfo) {1, 1, 1, 17};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 5);
 
 
-    topo_info = (X86CPUTopoInfo) {1, 30, 2};
+    topo_info = (X86CPUTopoInfo) {1, 1, 30, 2};
     g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
-    topo_info = (X86CPUTopoInfo) {1, 31, 2};
+    topo_info = (X86CPUTopoInfo) {1, 1, 31, 2};
     g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
-    topo_info = (X86CPUTopoInfo) {1, 32, 2};
+    topo_info = (X86CPUTopoInfo) {1, 1, 32, 2};
     g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
-    topo_info = (X86CPUTopoInfo) {1, 33, 2};
+    topo_info = (X86CPUTopoInfo) {1, 1, 33, 2};
     g_assert_cmpuint(apicid_core_width(&topo_info), ==, 6);
 
-    topo_info = (X86CPUTopoInfo) {1, 30, 2};
+    topo_info = (X86CPUTopoInfo) {1, 1, 30, 2};
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
-    topo_info = (X86CPUTopoInfo) {2, 30, 2};
+    topo_info = (X86CPUTopoInfo) {2, 1, 30, 2};
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 1);
-    topo_info = (X86CPUTopoInfo) {3, 30, 2};
+    topo_info = (X86CPUTopoInfo) {3, 1, 30, 2};
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
-    topo_info = (X86CPUTopoInfo) {4, 30, 2};
+    topo_info = (X86CPUTopoInfo) {4, 1, 30, 2};
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
 
     /* build a weird topology and see if IDs are calculated correctly
@@ -85,18 +88,18 @@ static void test_topo_bits(void)
 
     /* This will use 2 bits for thread ID and 3 bits for core ID
      */
-    topo_info = (X86CPUTopoInfo) {1, 6, 3};
+    topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
     g_assert_cmpuint(apicid_core_offset(&topo_info), ==, 2);
     g_assert_cmpuint(apicid_die_offset(&topo_info), ==, 5);
     g_assert_cmpuint(apicid_pkg_offset(&topo_info), ==, 5);
 
-    topo_info = (X86CPUTopoInfo) {1, 6, 3};
+    topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
 
-    topo_info = (X86CPUTopoInfo) {1, 6, 3};
+    topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 0), ==,
                      (1 << 2) | 0);
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 1), ==,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v4 12/21] i386: Expose module level in CPUID[0x1F]
  2023-09-14  7:21 [PATCH v4 00/21] Support smp.clusters for x86 in QEMU Zhao Liu
                   ` (10 preceding siblings ...)
  2023-09-14  7:21 ` [PATCH v4 11/21] i386: Support modules_per_die in X86CPUTopoInfo Zhao Liu
@ 2023-09-14  7:21 ` Zhao Liu
  2023-09-14  7:21 ` [PATCH v4 13/21] i386: Support module_id in X86CPUTopoIDs Zhao Liu
                   ` (10 subsequent siblings)
  22 siblings, 0 replies; 43+ messages in thread
From: Zhao Liu @ 2023-09-14  7:21 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu,
	Yongwei Ma

From: Zhao Liu <zhao1.liu@intel.com>

The Linux kernel (since v6.4, with commit edc0a2b595765 ("x86/topology:
Fix erroneous smp_num_siblings on Intel Hybrid platforms")) is able to
handle platforms with the module level enumerated via CPUID.1F.

Expose the module level in CPUID[0x1F] if the machine has more than
one module.

(Tested the CPU topology in the CPUID[0x1F] leaf with various
die/cluster configurations in "-smp".)
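
For illustration, with the encoding above and the default APIC ID
layout, a guest started with

    -smp 8,sockets=1,dies=1,clusters=2,cores=2,threads=2

would be expected to see (hand-computed, illustrative values):

    subleaf   EAX (shift)   EBX (CPU count)   ECX[15:8] (level type)
    0x1F.0    1             2                 1 (SMT)
    0x1F.1    2             4                 2 (Core)
    0x1F.2    3             8                 3 (Module)
    0x1F.3    0             0                 0 (Invalid)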

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
---
Changes since v3:
 * New patch to expose module level in 0x1F.
 * Add Tested-by tag from Yongwei.
---
 target/i386/cpu.c     | 12 +++++++++++-
 target/i386/cpu.h     |  2 ++
 target/i386/kvm/kvm.c |  2 +-
 3 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index cef9a4606d89..f0ddb253b6b5 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -276,6 +276,8 @@ static uint32_t num_cpus_by_topo_level(X86CPUTopoInfo *topo_info,
         return 1;
     case CPU_TOPO_LEVEL_CORE:
         return topo_info->threads_per_core;
+    case CPU_TOPO_LEVEL_MODULE:
+        return topo_info->threads_per_core * topo_info->cores_per_module;
     case CPU_TOPO_LEVEL_DIE:
         return topo_info->threads_per_core * topo_info->cores_per_module *
                topo_info->modules_per_die;
@@ -296,6 +298,8 @@ static uint32_t apicid_offset_by_topo_level(X86CPUTopoInfo *topo_info,
         return 0;
     case CPU_TOPO_LEVEL_CORE:
         return apicid_core_offset(topo_info);
+    case CPU_TOPO_LEVEL_MODULE:
+        return apicid_module_offset(topo_info);
     case CPU_TOPO_LEVEL_DIE:
         return apicid_die_offset(topo_info);
     case CPU_TOPO_LEVEL_PACKAGE:
@@ -315,6 +319,8 @@ static uint32_t cpuid1f_topo_type(enum CPUTopoLevel topo_level)
         return CPUID_1F_ECX_TOPO_LEVEL_SMT;
     case CPU_TOPO_LEVEL_CORE:
         return CPUID_1F_ECX_TOPO_LEVEL_CORE;
+    case CPU_TOPO_LEVEL_MODULE:
+        return CPUID_1F_ECX_TOPO_LEVEL_MODULE;
     case CPU_TOPO_LEVEL_DIE:
         return CPUID_1F_ECX_TOPO_LEVEL_DIE;
     default:
@@ -346,6 +352,10 @@ static void encode_topo_cpuid1f(CPUX86State *env, uint32_t count,
         if (env->nr_dies > 1) {
             set_bit(CPU_TOPO_LEVEL_DIE, topo_bitmap);
         }
+
+        if (env->nr_modules > 1) {
+            set_bit(CPU_TOPO_LEVEL_MODULE, topo_bitmap);
+        }
     }
 
     *ecx = count & 0xff;
@@ -6390,7 +6400,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         break;
     case 0x1F:
         /* V2 Extended Topology Enumeration Leaf */
-        if (topo_info.dies_per_pkg < 2) {
+        if (topo_info.modules_per_die < 2 && topo_info.dies_per_pkg < 2) {
             *eax = *ebx = *ecx = *edx = 0;
             break;
         }
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 556e80f29764..54019e82fdb4 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1018,6 +1018,7 @@ enum CPUTopoLevel {
     CPU_TOPO_LEVEL_INVALID,
     CPU_TOPO_LEVEL_SMT,
     CPU_TOPO_LEVEL_CORE,
+    CPU_TOPO_LEVEL_MODULE,
     CPU_TOPO_LEVEL_DIE,
     CPU_TOPO_LEVEL_PACKAGE,
     CPU_TOPO_LEVEL_MAX,
@@ -1032,6 +1033,7 @@ enum CPUTopoLevel {
 #define CPUID_1F_ECX_TOPO_LEVEL_INVALID  CPUID_B_ECX_TOPO_LEVEL_INVALID
 #define CPUID_1F_ECX_TOPO_LEVEL_SMT      CPUID_B_ECX_TOPO_LEVEL_SMT
 #define CPUID_1F_ECX_TOPO_LEVEL_CORE     CPUID_B_ECX_TOPO_LEVEL_CORE
+#define CPUID_1F_ECX_TOPO_LEVEL_MODULE   3
 #define CPUID_1F_ECX_TOPO_LEVEL_DIE      5
 
 /* MSR Feature Bits */
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index e5cd7cc80616..545b2d46221e 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1959,7 +1959,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
             break;
         }
         case 0x1f:
-            if (env->nr_dies < 2) {
+            if (env->nr_modules < 2 && env->nr_dies < 2) {
                 break;
             }
             /* fallthrough */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v4 13/21] i386: Support module_id in X86CPUTopoIDs
  2023-09-14  7:21 [PATCH v4 00/21] Support smp.clusters for x86 in QEMU Zhao Liu
                   ` (11 preceding siblings ...)
  2023-09-14  7:21 ` [PATCH v4 12/21] i386: Expose module level in CPUID[0x1F] Zhao Liu
@ 2023-09-14  7:21 ` Zhao Liu
  2023-09-14  7:21 ` [PATCH v4 14/21] i386/cpu: Introduce cluster-id to X86CPU Zhao Liu
                   ` (9 subsequent siblings)
  22 siblings, 0 replies; 43+ messages in thread
From: Zhao Liu @ 2023-09-14  7:21 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu,
	Zhuocheng Ding

From: Zhuocheng Ding <zhuocheng.ding@intel.com>

Add module_id member in X86CPUTopoIDs.

The module_id can be parsed from the APIC ID, so also update the APIC
ID parsing rules to support the module level. With this support, the
module-level conversions between X86CPUTopoIDs, X86CPUTopoInfo and the
APIC ID are complete.

The module_id can also be generated from the CPU topology, and since
i386 does not yet support "clusters" in smp, the default "clusters per
die" is only 1; thus the module_id generated this way is 0 and will
not conflict with the module_id parsed from the APIC ID.
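
As a sketch of the parsing rule (the field offset and width come from
the apicid_module_offset()/apicid_module_width() helpers added
earlier):

    module_id = (apic_id >> apicid_module_offset(topo_info)) &
                ~(0xFFFFFFFFUL << apicid_module_width(topo_info));

E.g., with 1-bit SMT, core and module fields, APIC ID 0b0110 parses to
smt_id=0, core_id=1, module_id=1.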

Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
---
Changes since v1:
 * Merge the patch "i386: Update APIC ID parsing rule to support module
   level" into this one. (Yanan)
 * Move the apicid_module_width() and apicid_module_offset() support
   into the previous modules_per_die related patch. (Yanan)
---
 hw/i386/x86.c              | 28 +++++++++++++++++++++-------
 include/hw/i386/topology.h | 17 +++++++++++++----
 2 files changed, 34 insertions(+), 11 deletions(-)

diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 267bb0f96ca5..5b05dbdedbff 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -311,11 +311,11 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
 
     /*
      * If APIC ID is not set,
-     * set it based on socket/die/core/thread properties.
+     * set it based on socket/die/cluster/core/thread properties.
      */
     if (cpu->apic_id == UNASSIGNED_APIC_ID) {
-        int max_socket = (ms->smp.max_cpus - 1) /
-                                smp_threads / smp_cores / ms->smp.dies;
+        int max_socket = (ms->smp.max_cpus - 1) / smp_threads / smp_cores /
+                                ms->smp.clusters / ms->smp.dies;
 
         /*
          * die-id was optional in QEMU 4.0 and older, so keep it optional
@@ -362,6 +362,14 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
         topo_ids.die_id = cpu->die_id;
         topo_ids.core_id = cpu->core_id;
         topo_ids.smt_id = cpu->thread_id;
+
+        /*
+         * TODO: This is the temporary initialization for topo_ids.module_id to
+         * avoid "maybe-uninitialized" compilation errors. Will remove when
+         * X86CPU supports cluster_id.
+         */
+        topo_ids.module_id = 0;
+
         cpu->apic_id = x86_apicid_from_topo_ids(&topo_info, &topo_ids);
     }
 
@@ -370,11 +378,13 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
         MachineState *ms = MACHINE(x86ms);
 
         x86_topo_ids_from_apicid(cpu->apic_id, &topo_info, &topo_ids);
+
         error_setg(errp,
-            "Invalid CPU [socket: %u, die: %u, core: %u, thread: %u] with"
-            " APIC ID %" PRIu32 ", valid index range 0:%d",
-            topo_ids.pkg_id, topo_ids.die_id, topo_ids.core_id, topo_ids.smt_id,
-            cpu->apic_id, ms->possible_cpus->len - 1);
+            "Invalid CPU [socket: %u, die: %u, module: %u, core: %u, thread: %u]"
+            " with APIC ID %" PRIu32 ", valid index range 0:%d",
+            topo_ids.pkg_id, topo_ids.die_id, topo_ids.module_id,
+            topo_ids.core_id, topo_ids.smt_id, cpu->apic_id,
+            ms->possible_cpus->len - 1);
         return;
     }
 
@@ -495,6 +505,10 @@ const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
             ms->possible_cpus->cpus[i].props.has_die_id = true;
             ms->possible_cpus->cpus[i].props.die_id = topo_ids.die_id;
         }
+        if (ms->smp.clusters > 1) {
+            ms->possible_cpus->cpus[i].props.has_cluster_id = true;
+            ms->possible_cpus->cpus[i].props.cluster_id = topo_ids.module_id;
+        }
         ms->possible_cpus->cpus[i].props.has_core_id = true;
         ms->possible_cpus->cpus[i].props.core_id = topo_ids.core_id;
         ms->possible_cpus->cpus[i].props.has_thread_id = true;
diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index c807d3811dd3..3cec97b377f2 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -50,6 +50,7 @@ typedef uint32_t apic_id_t;
 typedef struct X86CPUTopoIDs {
     unsigned pkg_id;
     unsigned die_id;
+    unsigned module_id;
     unsigned core_id;
     unsigned smt_id;
 } X86CPUTopoIDs;
@@ -127,6 +128,7 @@ static inline apic_id_t x86_apicid_from_topo_ids(X86CPUTopoInfo *topo_info,
 {
     return (topo_ids->pkg_id  << apicid_pkg_offset(topo_info)) |
            (topo_ids->die_id  << apicid_die_offset(topo_info)) |
+           (topo_ids->module_id << apicid_module_offset(topo_info)) |
            (topo_ids->core_id << apicid_core_offset(topo_info)) |
            topo_ids->smt_id;
 }
@@ -140,12 +142,16 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
                                          X86CPUTopoIDs *topo_ids)
 {
     unsigned nr_dies = topo_info->dies_per_pkg;
-    unsigned nr_cores = topo_info->cores_per_module *
-                        topo_info->modules_per_die;
+    unsigned nr_modules = topo_info->modules_per_die;
+    unsigned nr_cores = topo_info->cores_per_module;
     unsigned nr_threads = topo_info->threads_per_core;
 
-    topo_ids->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
-    topo_ids->die_id = cpu_index / (nr_cores * nr_threads) % nr_dies;
+    topo_ids->pkg_id = cpu_index / (nr_dies * nr_modules *
+                       nr_cores * nr_threads);
+    topo_ids->die_id = cpu_index / (nr_modules * nr_cores *
+                       nr_threads) % nr_dies;
+    topo_ids->module_id = cpu_index / (nr_cores * nr_threads) %
+                          nr_modules;
     topo_ids->core_id = cpu_index / nr_threads % nr_cores;
     topo_ids->smt_id = cpu_index % nr_threads;
 }
@@ -163,6 +169,9 @@ static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
     topo_ids->core_id =
             (apicid >> apicid_core_offset(topo_info)) &
             ~(0xFFFFFFFFUL << apicid_core_width(topo_info));
+    topo_ids->module_id =
+            (apicid >> apicid_module_offset(topo_info)) &
+            ~(0xFFFFFFFFUL << apicid_module_width(topo_info));
     topo_ids->die_id =
             (apicid >> apicid_die_offset(topo_info)) &
             ~(0xFFFFFFFFUL << apicid_die_width(topo_info));
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v4 14/21] i386/cpu: Introduce cluster-id to X86CPU
  2023-09-14  7:21 [PATCH v4 00/21] Support smp.clusters for x86 in QEMU Zhao Liu
                   ` (12 preceding siblings ...)
  2023-09-14  7:21 ` [PATCH v4 13/21] i386: Support module_id in X86CPUTopoIDs Zhao Liu
@ 2023-09-14  7:21 ` Zhao Liu
  2023-09-14  7:21 ` [PATCH v4 15/21] tests: Add test case of APIC ID for module level parsing Zhao Liu
                   ` (8 subsequent siblings)
  22 siblings, 0 replies; 43+ messages in thread
From: Zhao Liu @ 2023-09-14  7:21 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu,
	Zhuocheng Ding

From: Zhuocheng Ding <zhuocheng.ding@intel.com>

Introduce cluster-id rather than module-id to stay consistent with
CpuInstanceProperties.cluster-id; this avoids confusion over parameter
names when hotplugging.

Following the legacy smp check rules, also add a cluster_id validity
check to x86_cpu_pre_plug().
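
For example, a hotplug invocation could then look like this (the CPU
driver name and the IDs are illustrative only):

    (qemu) device_add qemu64-x86_64-cpu,id=cpu1,socket-id=0,die-id=0,cluster-id=1,core-id=0,thread-id=0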

Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
---
Changes since v3:
 * Use the imperative in the commit message. (Babu)
---
 hw/i386/x86.c     | 33 +++++++++++++++++++++++++--------
 target/i386/cpu.c |  2 ++
 target/i386/cpu.h |  1 +
 3 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 5b05dbdedbff..a93f0771d97b 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -325,6 +325,14 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
             cpu->die_id = 0;
         }
 
+        /*
+         * cluster-id was optional in QEMU 8.0 and older, so keep it optional
+         * if there's only one cluster per die.
+         */
+        if (cpu->cluster_id < 0 && ms->smp.clusters == 1) {
+            cpu->cluster_id = 0;
+        }
+
         if (cpu->socket_id < 0) {
             error_setg(errp, "CPU socket-id is not set");
             return;
@@ -341,6 +349,14 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
                        cpu->die_id, ms->smp.dies - 1);
             return;
         }
+        if (cpu->cluster_id < 0) {
+            error_setg(errp, "CPU cluster-id is not set");
+            return;
+        } else if (cpu->cluster_id > ms->smp.clusters - 1) {
+            error_setg(errp, "Invalid CPU cluster-id: %u must be in range 0:%u",
+                       cpu->cluster_id, ms->smp.clusters - 1);
+            return;
+        }
         if (cpu->core_id < 0) {
             error_setg(errp, "CPU core-id is not set");
             return;
@@ -360,16 +376,9 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
 
         topo_ids.pkg_id = cpu->socket_id;
         topo_ids.die_id = cpu->die_id;
+        topo_ids.module_id = cpu->cluster_id;
         topo_ids.core_id = cpu->core_id;
         topo_ids.smt_id = cpu->thread_id;
-
-        /*
-         * TODO: This is the temporary initialization for topo_ids.module_id to
-         * avoid "maybe-uninitialized" compilation errors. Will remove when
-         * X86CPU supports cluster_id.
-         */
-        topo_ids.module_id = 0;
-
         cpu->apic_id = x86_apicid_from_topo_ids(&topo_info, &topo_ids);
     }
 
@@ -416,6 +425,14 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
     }
     cpu->die_id = topo_ids.die_id;
 
+    if (cpu->cluster_id != -1 && cpu->cluster_id != topo_ids.module_id) {
+        error_setg(errp, "property cluster-id: %u doesn't match set apic-id:"
+            " 0x%x (cluster-id: %u)", cpu->cluster_id, cpu->apic_id,
+            topo_ids.module_id);
+        return;
+    }
+    cpu->cluster_id = topo_ids.module_id;
+
     if (cpu->core_id != -1 && cpu->core_id != topo_ids.core_id) {
         error_setg(errp, "property core-id: %u doesn't match set apic-id:"
             " 0x%x (core-id: %u)", cpu->core_id, cpu->apic_id,
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index f0ddb253b6b5..d8c5e774cf95 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -7907,12 +7907,14 @@ static Property x86_cpu_properties[] = {
     DEFINE_PROP_UINT32("apic-id", X86CPU, apic_id, 0),
     DEFINE_PROP_INT32("thread-id", X86CPU, thread_id, 0),
     DEFINE_PROP_INT32("core-id", X86CPU, core_id, 0),
+    DEFINE_PROP_INT32("cluster-id", X86CPU, cluster_id, 0),
     DEFINE_PROP_INT32("die-id", X86CPU, die_id, 0),
     DEFINE_PROP_INT32("socket-id", X86CPU, socket_id, 0),
 #else
     DEFINE_PROP_UINT32("apic-id", X86CPU, apic_id, UNASSIGNED_APIC_ID),
     DEFINE_PROP_INT32("thread-id", X86CPU, thread_id, -1),
     DEFINE_PROP_INT32("core-id", X86CPU, core_id, -1),
+    DEFINE_PROP_INT32("cluster-id", X86CPU, cluster_id, -1),
     DEFINE_PROP_INT32("die-id", X86CPU, die_id, -1),
     DEFINE_PROP_INT32("socket-id", X86CPU, socket_id, -1),
 #endif
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 54019e82fdb4..fa1452380882 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -2059,6 +2059,7 @@ struct ArchCPU {
     int32_t node_id; /* NUMA node this CPU belongs to */
     int32_t socket_id;
     int32_t die_id;
+    int32_t cluster_id;
     int32_t core_id;
     int32_t thread_id;
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v4 15/21] tests: Add test case of APIC ID for module level parsing
  2023-09-14  7:21 [PATCH v4 00/21] Support smp.clusters for x86 in QEMU Zhao Liu
                   ` (13 preceding siblings ...)
  2023-09-14  7:21 ` [PATCH v4 14/21] i386/cpu: Introduce cluster-id to X86CPU Zhao Liu
@ 2023-09-14  7:21 ` Zhao Liu
  2023-09-14  7:21 ` [PATCH v4 16/21] hw/i386/pc: Support smp.clusters for x86 PC machine Zhao Liu
                   ` (7 subsequent siblings)
  22 siblings, 0 replies; 43+ messages in thread
From: Zhao Liu @ 2023-09-14  7:21 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu,
	Zhuocheng Ding, Yongwei Ma

From: Zhuocheng Ding <zhuocheng.ding@intel.com>

Now that i386 supports the module level, add a test for module-level
parsing.
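
The new assertions boil down to the bit-width rule, e.g. (values taken
from the test cases below):

    modules_per_die = 6..8  ->  apicid_module_width() == 3
    modules_per_die = 9     ->  apicid_module_width() == 4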

Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
---
 tests/unit/test-x86-topo.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/tests/unit/test-x86-topo.c b/tests/unit/test-x86-topo.c
index f21b8a5d95c2..55b731ccae55 100644
--- a/tests/unit/test-x86-topo.c
+++ b/tests/unit/test-x86-topo.c
@@ -37,6 +37,7 @@ static void test_topo_bits(void)
     topo_info = (X86CPUTopoInfo) {1, 1, 1, 1};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 0);
     g_assert_cmpuint(apicid_core_width(&topo_info), ==, 0);
+    g_assert_cmpuint(apicid_module_width(&topo_info), ==, 0);
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
 
     topo_info = (X86CPUTopoInfo) {1, 1, 1, 1};
@@ -74,13 +75,22 @@ static void test_topo_bits(void)
     topo_info = (X86CPUTopoInfo) {1, 1, 33, 2};
     g_assert_cmpuint(apicid_core_width(&topo_info), ==, 6);
 
-    topo_info = (X86CPUTopoInfo) {1, 1, 30, 2};
+    topo_info = (X86CPUTopoInfo) {1, 6, 30, 2};
+    g_assert_cmpuint(apicid_module_width(&topo_info), ==, 3);
+    topo_info = (X86CPUTopoInfo) {1, 7, 30, 2};
+    g_assert_cmpuint(apicid_module_width(&topo_info), ==, 3);
+    topo_info = (X86CPUTopoInfo) {1, 8, 30, 2};
+    g_assert_cmpuint(apicid_module_width(&topo_info), ==, 3);
+    topo_info = (X86CPUTopoInfo) {1, 9, 30, 2};
+    g_assert_cmpuint(apicid_module_width(&topo_info), ==, 4);
+
+    topo_info = (X86CPUTopoInfo) {1, 6, 30, 2};
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
-    topo_info = (X86CPUTopoInfo) {2, 1, 30, 2};
+    topo_info = (X86CPUTopoInfo) {2, 6, 30, 2};
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 1);
-    topo_info = (X86CPUTopoInfo) {3, 1, 30, 2};
+    topo_info = (X86CPUTopoInfo) {3, 6, 30, 2};
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
-    topo_info = (X86CPUTopoInfo) {4, 1, 30, 2};
+    topo_info = (X86CPUTopoInfo) {4, 6, 30, 2};
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
 
     /* build a weird topology and see if IDs are calculated correctly
@@ -91,6 +101,7 @@ static void test_topo_bits(void)
     topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
     g_assert_cmpuint(apicid_core_offset(&topo_info), ==, 2);
+    g_assert_cmpuint(apicid_module_offset(&topo_info), ==, 5);
     g_assert_cmpuint(apicid_die_offset(&topo_info), ==, 5);
     g_assert_cmpuint(apicid_pkg_offset(&topo_info), ==, 5);
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v4 16/21] hw/i386/pc: Support smp.clusters for x86 PC machine
  2023-09-14  7:21 [PATCH v4 00/21] Support smp.clusters for x86 in QEMU Zhao Liu
                   ` (14 preceding siblings ...)
  2023-09-14  7:21 ` [PATCH v4 15/21] tests: Add test case of APIC ID for module level parsing Zhao Liu
@ 2023-09-14  7:21 ` Zhao Liu
  2023-09-14  7:21 ` [PATCH v4 17/21] i386: Add cache topology info in CPUCacheInfo Zhao Liu
                   ` (6 subsequent siblings)
  22 siblings, 0 replies; 43+ messages in thread
From: Zhao Liu @ 2023-09-14  7:21 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu,
	Zhuocheng Ding

From: Zhuocheng Ding <zhuocheng.ding@intel.com>

Now that module-level topology support has been added to X86CPU, we
can enable support for the "clusters" parameter on PC machines. With
this support, a 5-level x86 CPU topology can be defined with "-smp":

-smp cpus=*,maxcpus=*,sockets=*,dies=*,clusters=*,cores=*,threads=*.

Additionally, add a 5-level topology example to the description of
"-smp".

Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
---
 hw/i386/pc.c    |  1 +
 qemu-options.hx | 10 +++++-----
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 54838c0c411d..b2b8ec4968c9 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1888,6 +1888,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
     mc->default_cpu_type = TARGET_DEFAULT_CPU_TYPE;
     mc->nvdimm_supported = true;
     mc->smp_props.dies_supported = true;
+    mc->smp_props.clusters_supported = true;
     mc->default_ram_id = "pc.ram";
     pcmc->default_smbios_ep_type = SMBIOS_ENTRY_POINT_TYPE_64;
 
diff --git a/qemu-options.hx b/qemu-options.hx
index 6be621c23249..ff4a67d6449d 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -334,14 +334,14 @@ SRST
         -smp 8,sockets=2,cores=2,threads=2,maxcpus=8
 
     The following sub-option defines a CPU topology hierarchy (2 sockets
-    totally on the machine, 2 dies per socket, 2 cores per die, 2 threads
-    per core) for PC machines which support sockets/dies/cores/threads.
-    Some members of the option can be omitted but their values will be
-    automatically computed:
+    totally on the machine, 2 dies per socket, 2 clusters per die, 2 cores per
+    cluster, 2 threads per core) for PC machines which support sockets/dies
+    /clusters/cores/threads. Some members of the option can be omitted but
+    their values will be automatically computed:
 
     ::
 
-        -smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16
+        -smp 32,sockets=2,dies=2,clusters=2,cores=2,threads=2,maxcpus=32
 
     The following sub-option defines a CPU topology hierarchy (2 sockets
     totally on the machine, 2 clusters per socket, 2 cores per cluster,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v4 17/21] i386: Add cache topology info in CPUCacheInfo
  2023-09-14  7:21 [PATCH v4 00/21] Support smp.clusters for x86 in QEMU Zhao Liu
                   ` (15 preceding siblings ...)
  2023-09-14  7:21 ` [PATCH v4 16/21] hw/i386/pc: Support smp.clusters for x86 PC machine Zhao Liu
@ 2023-09-14  7:21 ` Zhao Liu
  2023-09-14  7:21 ` [PATCH v4 18/21] i386: Use CPUCacheInfo.share_level to encode CPUID[4] Zhao Liu
                   ` (5 subsequent siblings)
  22 siblings, 0 replies; 43+ messages in thread
From: Zhao Liu @ 2023-09-14  7:21 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

Currently, by default, the cache topology is encoded as:
1. i/d cache is shared in one core.
2. L2 cache is shared in one core.
3. L3 cache is shared in one die.

This general default has caused a misunderstanding, namely that the
cache topology is completely equated with a specific CPU topology,
such as tying the L2 cache to the core level and the L3 cache to the
die level.

In fact, these cache topology settings depend on the specific platform
and are not static. For example, on Alder Lake-P, every four Atom
cores share the same L2 cache.

Thus, we should explicitly define the corresponding cache topology for
different cache models to increase scalability.

Except for legacy_l2_cache_cpuid2 (whose default topology level is
CPU_TOPO_LEVEL_INVALID), explicitly set the corresponding topology
level for all other cache models. In order to stay compatible with the
existing cache topology, set CPU_TOPO_LEVEL_CORE for the i/d caches
and the L2 cache, and CPU_TOPO_LEVEL_DIE for the L3 cache.

The field for CPUID[4].EAX[bits 25:14] or CPUID[0x8000001D].EAX[bits
25:14] will be set based on CPUCacheInfo.share_level.
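
As a hypothetical example of what this enables (illustrative values
only, not a cache model added by this series), an L2 cache shared at
the module (cluster) level could be described as:

    static const CPUCacheInfo example_module_l2 = {
        .type = UNIFIED_CACHE,
        .level = 2,
        .size = 2 * MiB,
        .line_size = 64,
        .associativity = 8,
        .sets = 4096,          /* 64 * 8 * 1 * 4096 == 2 MiB */
        .partitions = 1,
        .share_level = CPU_TOPO_LEVEL_MODULE,
    };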

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes since v3:
 * Fix cache topology uninitialization bugs for some AMD CPUs. (Babu)
 * Move the CPUTopoLevel enumeration definition to the previous 0x1f
   rework patch.

Changes since v1:
 * Add the prefix "CPU_TOPO_LEVEL_*" for CPU topology level names.
   (Yanan)
 * (Revert, pls refer "i386: Decouple CPUID[0x1F] subleaf with specific
   topology level") Rename the "INVALID" level to CPU_TOPO_LEVEL_UNKNOW.
   (Yanan)
---
 target/i386/cpu.c | 36 ++++++++++++++++++++++++++++++++++++
 target/i386/cpu.h |  7 +++++++
 2 files changed, 43 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index d8c5e774cf95..fcb4f4da3431 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -557,6 +557,7 @@ static CPUCacheInfo legacy_l1d_cache = {
     .sets = 64,
     .partitions = 1,
     .no_invd_sharing = true,
+    .share_level = CPU_TOPO_LEVEL_CORE,
 };
 
 /*FIXME: CPUID leaf 0x80000005 is inconsistent with leaves 2 & 4 */
@@ -571,6 +572,7 @@ static CPUCacheInfo legacy_l1d_cache_amd = {
     .partitions = 1,
     .lines_per_tag = 1,
     .no_invd_sharing = true,
+    .share_level = CPU_TOPO_LEVEL_CORE,
 };
 
 /* L1 instruction cache: */
@@ -584,6 +586,7 @@ static CPUCacheInfo legacy_l1i_cache = {
     .sets = 64,
     .partitions = 1,
     .no_invd_sharing = true,
+    .share_level = CPU_TOPO_LEVEL_CORE,
 };
 
 /*FIXME: CPUID leaf 0x80000005 is inconsistent with leaves 2 & 4 */
@@ -598,6 +601,7 @@ static CPUCacheInfo legacy_l1i_cache_amd = {
     .partitions = 1,
     .lines_per_tag = 1,
     .no_invd_sharing = true,
+    .share_level = CPU_TOPO_LEVEL_CORE,
 };
 
 /* Level 2 unified cache: */
@@ -611,6 +615,7 @@ static CPUCacheInfo legacy_l2_cache = {
     .sets = 4096,
     .partitions = 1,
     .no_invd_sharing = true,
+    .share_level = CPU_TOPO_LEVEL_CORE,
 };
 
 /*FIXME: CPUID leaf 2 descriptor is inconsistent with CPUID leaf 4 */
@@ -620,6 +625,7 @@ static CPUCacheInfo legacy_l2_cache_cpuid2 = {
     .size = 2 * MiB,
     .line_size = 64,
     .associativity = 8,
+    .share_level = CPU_TOPO_LEVEL_INVALID,
 };
 
 
@@ -633,6 +639,7 @@ static CPUCacheInfo legacy_l2_cache_amd = {
     .associativity = 16,
     .sets = 512,
     .partitions = 1,
+    .share_level = CPU_TOPO_LEVEL_CORE,
 };
 
 /* Level 3 unified cache: */
@@ -648,6 +655,7 @@ static CPUCacheInfo legacy_l3_cache = {
     .self_init = true,
     .inclusive = true,
     .complex_indexing = true,
+    .share_level = CPU_TOPO_LEVEL_DIE,
 };
 
 /* TLB definitions: */
@@ -1944,6 +1952,7 @@ static const CPUCaches epyc_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -1956,6 +1965,7 @@ static const CPUCaches epyc_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -1966,6 +1976,7 @@ static const CPUCaches epyc_cache_info = {
         .partitions = 1,
         .sets = 1024,
         .lines_per_tag = 1,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -1979,6 +1990,7 @@ static const CPUCaches epyc_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = true,
+        .share_level = CPU_TOPO_LEVEL_DIE,
     },
 };
 
@@ -1994,6 +2006,7 @@ static CPUCaches epyc_v4_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -2006,6 +2019,7 @@ static CPUCaches epyc_v4_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2016,6 +2030,7 @@ static CPUCaches epyc_v4_cache_info = {
         .partitions = 1,
         .sets = 1024,
         .lines_per_tag = 1,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2029,6 +2044,7 @@ static CPUCaches epyc_v4_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = false,
+        .share_level = CPU_TOPO_LEVEL_DIE,
     },
 };
 
@@ -2044,6 +2060,7 @@ static const CPUCaches epyc_rome_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -2056,6 +2073,7 @@ static const CPUCaches epyc_rome_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2066,6 +2084,7 @@ static const CPUCaches epyc_rome_cache_info = {
         .partitions = 1,
         .sets = 1024,
         .lines_per_tag = 1,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2079,6 +2098,7 @@ static const CPUCaches epyc_rome_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = true,
+        .share_level = CPU_TOPO_LEVEL_DIE,
     },
 };
 
@@ -2094,6 +2114,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -2106,6 +2127,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2116,6 +2138,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
         .partitions = 1,
         .sets = 1024,
         .lines_per_tag = 1,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2129,6 +2152,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = false,
+        .share_level = CPU_TOPO_LEVEL_DIE,
     },
 };
 
@@ -2144,6 +2168,7 @@ static const CPUCaches epyc_milan_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -2156,6 +2181,7 @@ static const CPUCaches epyc_milan_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2166,6 +2192,7 @@ static const CPUCaches epyc_milan_cache_info = {
         .partitions = 1,
         .sets = 1024,
         .lines_per_tag = 1,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2179,6 +2206,7 @@ static const CPUCaches epyc_milan_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = true,
+        .share_level = CPU_TOPO_LEVEL_DIE,
     },
 };
 
@@ -2194,6 +2222,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -2206,6 +2235,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2216,6 +2246,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
         .partitions = 1,
         .sets = 1024,
         .lines_per_tag = 1,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2229,6 +2260,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = false,
+        .share_level = CPU_TOPO_LEVEL_DIE,
     },
 };
 
@@ -2244,6 +2276,7 @@ static const CPUCaches epyc_genoa_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -2256,6 +2289,7 @@ static const CPUCaches epyc_genoa_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2266,6 +2300,7 @@ static const CPUCaches epyc_genoa_cache_info = {
         .partitions = 1,
         .sets = 2048,
         .lines_per_tag = 1,
+        .share_level = CPU_TOPO_LEVEL_CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2279,6 +2314,7 @@ static const CPUCaches epyc_genoa_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = false,
+        .share_level = CPU_TOPO_LEVEL_DIE,
     },
 };
 
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index fa1452380882..a13132007415 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1596,6 +1596,13 @@ typedef struct CPUCacheInfo {
      * address bits.  CPUID[4].EDX[bit 2].
      */
     bool complex_indexing;
+
+    /*
+     * Cache Topology. The level that cache is shared in.
+     * Used to encode CPUID[4].EAX[bits 25:14] or
+     * CPUID[0x8000001D].EAX[bits 25:14].
+     */
+    enum CPUTopoLevel share_level;
 } CPUCacheInfo;
 
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v4 18/21] i386: Use CPUCacheInfo.share_level to encode CPUID[4]
  2023-09-14  7:21 [PATCH v4 00/21] Support smp.clusters for x86 in QEMU Zhao Liu
                   ` (16 preceding siblings ...)
  2023-09-14  7:21 ` [PATCH v4 17/21] i386: Add cache topology info in CPUCacheInfo Zhao Liu
@ 2023-09-14  7:21 ` Zhao Liu
  2023-09-14  7:21 ` [PATCH v4 19/21] i386: Use offsets get NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14] Zhao Liu
                   ` (4 subsequent siblings)
  22 siblings, 0 replies; 43+ messages in thread
From: Zhao Liu @ 2023-09-14  7:21 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

CPUID[4].EAX[bits 25:14] is used to represent the cache topology for
Intel CPUs.

After cache models have topology information, we can use
CPUCacheInfo.share_level to decide which topology level is encoded
into CPUID[4].EAX[bits 25:14].
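
As a worked sketch of the encoding (the topology values and the helper
here are illustrative, not the QEMU code itself):

  #include <stdio.h>

  /* Sketch: EAX[bits 25:14] holds the maximum number of addressable
   * IDs sharing the cache, minus 1, derived from the APIC ID offset
   * of the cache's share level. */
  static unsigned max_processor_ids_minus_one(unsigned apicid_offset)
  {
      return (1u << apicid_offset) - 1;
  }

  int main(void)
  {
      /* Assume 2 threads per core, so the core's APIC ID offset is 1:
       * a core-shared cache encodes 1, i.e. 2 logical processors. */
      printf("EAX[25:14] = %u\n", max_processor_ids_minus_one(1));
      return 0;
  }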

Since maximum_processor_id (originally "num_apic_ids") is derived from
the CPU topology levels, which are already verified when parsing -smp,
there is no need to check this value with "assert(num_apic_ids > 0)"
again, so remove this assert.

Additionally, wrap the encoding of CPUID[4].EAX[bits 31:26] into a
helper to make the code cleaner.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes since v1:
 * Use "enum CPUTopoLevel share_level" as the parameter in
   max_processor_ids_for_cache().
 * Make the cache_info_passthrough case also use
   max_processor_ids_for_cache() and max_core_ids_in_package() to
   encode CPUID[4]. (Yanan)
 * Rename the title of this patch (the original is "i386: Use
   CPUCacheInfo.share_level to encode CPUID[4].EAX[bits 25:14]").
---
 target/i386/cpu.c | 70 +++++++++++++++++++++++++++++------------------
 1 file changed, 43 insertions(+), 27 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index fcb4f4da3431..5d066107d6ce 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -234,22 +234,53 @@ static uint8_t cpuid2_cache_descriptor(CPUCacheInfo *cache)
                        ((t) == UNIFIED_CACHE) ? CACHE_TYPE_UNIFIED : \
                        0 /* Invalid value */)
 
+static uint32_t max_processor_ids_for_cache(X86CPUTopoInfo *topo_info,
+                                            enum CPUTopoLevel share_level)
+{
+    uint32_t num_ids = 0;
+
+    switch (share_level) {
+    case CPU_TOPO_LEVEL_CORE:
+        num_ids = 1 << apicid_core_offset(topo_info);
+        break;
+    case CPU_TOPO_LEVEL_DIE:
+        num_ids = 1 << apicid_die_offset(topo_info);
+        break;
+    case CPU_TOPO_LEVEL_PACKAGE:
+        num_ids = 1 << apicid_pkg_offset(topo_info);
+        break;
+    default:
+        /*
+         * Currently there is no use case for SMT and MODULE, so use
+         * assert directly to facilitate debugging.
+         */
+        g_assert_not_reached();
+    }
+
+    return num_ids - 1;
+}
+
+static uint32_t max_core_ids_in_package(X86CPUTopoInfo *topo_info)
+{
+    uint32_t num_cores = 1 << (apicid_pkg_offset(topo_info) -
+                               apicid_core_offset(topo_info));
+    return num_cores - 1;
+}
 
 /* Encode cache info for CPUID[4] */
 static void encode_cache_cpuid4(CPUCacheInfo *cache,
-                                int num_apic_ids, int num_cores,
+                                X86CPUTopoInfo *topo_info,
                                 uint32_t *eax, uint32_t *ebx,
                                 uint32_t *ecx, uint32_t *edx)
 {
     assert(cache->size == cache->line_size * cache->associativity *
                           cache->partitions * cache->sets);
 
-    assert(num_apic_ids > 0);
     *eax = CACHE_TYPE(cache->type) |
            CACHE_LEVEL(cache->level) |
            (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0) |
-           ((num_cores - 1) << 26) |
-           ((num_apic_ids - 1) << 14);
+           (max_core_ids_in_package(topo_info) << 26) |
+           (max_processor_ids_for_cache(topo_info, cache->share_level) << 14);
 
     assert(cache->line_size > 0);
     assert(cache->partitions > 0);
@@ -6258,56 +6289,41 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
                 int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
 
                 if (cores_per_pkg > 1) {
-                    int addressable_cores_offset =
-                                                apicid_pkg_offset(&topo_info) -
-                                                apicid_core_offset(&topo_info);
-
                     *eax &= ~0xFC000000;
-                    *eax |= (1 << (addressable_cores_offset - 1)) << 26;
+                    *eax |= max_core_ids_in_package(&topo_info) << 26;
                 }
                 if (host_vcpus_per_cache > cpus_per_pkg) {
-                    int pkg_offset = apicid_pkg_offset(&topo_info);
-
                     *eax &= ~0x3FFC000;
-                    *eax |= (1 << (pkg_offset - 1)) << 14;
+                    *eax |=
+                        max_processor_ids_for_cache(&topo_info,
+                                                CPU_TOPO_LEVEL_PACKAGE) << 14;
                 }
             }
         } else if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
             *eax = *ebx = *ecx = *edx = 0;
         } else {
             *eax = 0;
-            int addressable_cores_offset = apicid_pkg_offset(&topo_info) -
-                                           apicid_core_offset(&topo_info);
-            int core_offset, die_offset;
 
             switch (count) {
             case 0: /* L1 dcache info */
-                core_offset = apicid_core_offset(&topo_info);
                 encode_cache_cpuid4(env->cache_info_cpuid4.l1d_cache,
-                                    (1 << core_offset),
-                                    (1 << addressable_cores_offset),
+                                    &topo_info,
                                     eax, ebx, ecx, edx);
                 break;
             case 1: /* L1 icache info */
-                core_offset = apicid_core_offset(&topo_info);
                 encode_cache_cpuid4(env->cache_info_cpuid4.l1i_cache,
-                                    (1 << core_offset),
-                                    (1 << addressable_cores_offset),
+                                    &topo_info,
                                     eax, ebx, ecx, edx);
                 break;
             case 2: /* L2 cache info */
-                core_offset = apicid_core_offset(&topo_info);
                 encode_cache_cpuid4(env->cache_info_cpuid4.l2_cache,
-                                    (1 << core_offset),
-                                    (1 << addressable_cores_offset),
+                                    &topo_info,
                                     eax, ebx, ecx, edx);
                 break;
             case 3: /* L3 cache info */
-                die_offset = apicid_die_offset(&topo_info);
                 if (cpu->enable_l3_cache) {
                     encode_cache_cpuid4(env->cache_info_cpuid4.l3_cache,
-                                        (1 << die_offset),
-                                        (1 << addressable_cores_offset),
+                                        &topo_info,
                                         eax, ebx, ecx, edx);
                     break;
                 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v4 19/21] i386: Use offsets get NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]
  2023-09-14  7:21 [PATCH v4 00/21] Support smp.clusters for x86 in QEMU Zhao Liu
                   ` (17 preceding siblings ...)
  2023-09-14  7:21 ` [PATCH v4 18/21] i386: Use CPUCacheInfo.share_level to encode CPUID[4] Zhao Liu
@ 2023-09-14  7:21 ` Zhao Liu
  2023-09-22 19:27   ` Moger, Babu
  2023-09-14  7:21 ` [PATCH v4 20/21] i386: Use CPUCacheInfo.share_level to encode " Zhao Liu
                   ` (3 subsequent siblings)
  22 siblings, 1 reply; 43+ messages in thread
From: Zhao Liu @ 2023-09-14  7:21 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

Commit 8f4202fb1080 ("i386: Populate AMD Processor Cache Information
for cpuid 0x8000001D") added the cache topology for AMD CPUs by
encoding the number of sharing threads directly.

From AMD's APM, NumSharingCache (CPUID[0x8000001D].EAX[bits 25:14])
means [1]:

The number of logical processors sharing this cache is the value of
this field incremented by 1. To determine which logical processors are
sharing a cache, determine a Share Id for each processor as follows:

ShareId = LocalApicId >> log2(NumSharingCache+1)

Logical processors with the same ShareId then share a cache. If
NumSharingCache+1 is not a power of two, round it up to the next power
of two.

From the description above, the calculation of this field should be
the same as CPUID[4].EAX[bits 25:14] for Intel CPUs. So also use the
APIC ID offsets to calculate this field.

[1]: APM, vol.3, appendix.E.4.15 Function 8000_001Dh--Cache Topology
     Information
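
As a sketch of the ShareId computation quoted above (illustrative
only, not code from the patch):

  #include <stdio.h>

  /* num_sharing_cache is the raw CPUID[0x8000001D].EAX[bits 25:14]
   * field value; round num_sharing_cache + 1 up to a power of two,
   * then shift the local APIC ID right by its log2. */
  static unsigned share_id(unsigned local_apic_id,
                           unsigned num_sharing_cache)
  {
      unsigned shift = 0;

      while ((1u << shift) < num_sharing_cache + 1) {
          shift++;
      }
      return local_apic_id >> shift;
  }

  int main(void)
  {
      /* Assume EAX[25:14] = 3, i.e. 4 logical processors per cache:
       * APIC IDs 0-3 share one cache, APIC IDs 4-7 the next. */
      printf("%u %u\n", share_id(3, 3), share_id(4, 3)); /* 0 1 */
      return 0;
  }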

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes since v3:
 * Rewrite the subject. (Babu)
 * Delete the original "comment/help" expression, as this behavior is
   confirmed for AMD CPUs. (Babu)
 * Rename "num_apic_ids" (v3) to "num_sharing_cache" to match spec
   definition. (Babu)

Changes since v1:
 * Rename "l3_threads" to "num_apic_ids" in
   encode_cache_cpuid8000001d(). (Yanan)
 * Add the description of the original commit and add Cc.
---
 target/i386/cpu.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 5d066107d6ce..bc28c59df089 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -482,7 +482,7 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
                                        uint32_t *eax, uint32_t *ebx,
                                        uint32_t *ecx, uint32_t *edx)
 {
-    uint32_t l3_threads;
+    uint32_t num_sharing_cache;
     assert(cache->size == cache->line_size * cache->associativity *
                           cache->partitions * cache->sets);
 
@@ -491,13 +491,11 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
 
     /* L3 is shared among multiple cores */
     if (cache->level == 3) {
-        l3_threads = topo_info->modules_per_die *
-                     topo_info->cores_per_module *
-                     topo_info->threads_per_core;
-        *eax |= (l3_threads - 1) << 14;
+        num_sharing_cache = 1 << apicid_die_offset(topo_info);
     } else {
-        *eax |= ((topo_info->threads_per_core - 1) << 14);
+        num_sharing_cache = 1 << apicid_core_offset(topo_info);
     }
+    *eax |= (num_sharing_cache - 1) << 14;
 
     assert(cache->line_size > 0);
     assert(cache->partitions > 0);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v4 20/21] i386: Use CPUCacheInfo.share_level to encode CPUID[0x8000001D].EAX[bits 25:14]
  2023-09-14  7:21 [PATCH v4 00/21] Support smp.clusters for x86 in QEMU Zhao Liu
                   ` (18 preceding siblings ...)
  2023-09-14  7:21 ` [PATCH v4 19/21] i386: Use offsets get NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14] Zhao Liu
@ 2023-09-14  7:21 ` Zhao Liu
  2023-09-22 19:27   ` Moger, Babu
  2023-09-14  7:21 ` [PATCH v4 21/21] i386: Add new property to control L2 cache topo in CPUID.04H Zhao Liu
                   ` (2 subsequent siblings)
  22 siblings, 1 reply; 43+ messages in thread
From: Zhao Liu @ 2023-09-14  7:21 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

CPUID[0x8000001D].EAX[bits 25:14] is NumSharingCache: the number of
logical processors sharing this cache is NumSharingCache + 1.

After cache models have topology information, we can use
CPUCacheInfo.share_level to decide which topology level is encoded
into CPUID[0x8000001D].EAX[bits 25:14].

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes since v3:
 * Explain what "CPUID[0x8000001D].EAX[bits 25:14]" means in the commit
   message. (Babu)

Changes since v1:
 * Use cache->share_level as the parameter in
   max_processor_ids_for_cache().
---
 target/i386/cpu.c | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index bc28c59df089..3bed823dc3b7 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -482,20 +482,12 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
                                        uint32_t *eax, uint32_t *ebx,
                                        uint32_t *ecx, uint32_t *edx)
 {
-    uint32_t num_sharing_cache;
     assert(cache->size == cache->line_size * cache->associativity *
                           cache->partitions * cache->sets);
 
     *eax = CACHE_TYPE(cache->type) | CACHE_LEVEL(cache->level) |
                (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0);
-
-    /* L3 is shared among multiple cores */
-    if (cache->level == 3) {
-        num_sharing_cache = 1 << apicid_die_offset(topo_info);
-    } else {
-        num_sharing_cache = 1 << apicid_core_offset(topo_info);
-    }
-    *eax |= (num_sharing_cache - 1) << 14;
+    *eax |= max_processor_ids_for_cache(topo_info, cache->share_level) << 14;
 
     assert(cache->line_size > 0);
     assert(cache->partitions > 0);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v4 21/21] i386: Add new property to control L2 cache topo in CPUID.04H
  2023-09-14  7:21 [PATCH v4 00/21] Support smp.clusters for x86 in QEMU Zhao Liu
                   ` (19 preceding siblings ...)
  2023-09-14  7:21 ` [PATCH v4 20/21] i386: Use CPUCacheInfo.share_level to encode " Zhao Liu
@ 2023-09-14  7:21 ` Zhao Liu
  2023-09-14  7:41   ` Philippe Mathieu-Daudé
  2023-09-22 16:03 ` [PATCH v4 00/21] Support smp.clusters for x86 in QEMU Moger, Babu
  2023-10-18 12:06 ` Michael S. Tsirkin
  22 siblings, 1 reply; 43+ messages in thread
From: Zhao Liu @ 2023-09-14  7:21 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu,
	Yongwei Ma

From: Zhao Liu <zhao1.liu@intel.com>

The property x-l2-cache-topo will be used to change the L2 cache
topology in CPUID.04H.

It allows the user to set whether the L2 cache is shared at the core
level or at the cluster level.

If the user passes "-cpu x-l2-cache-topo=[core|cluster]", the default
L2 cache topology will be overridden by the new setting.

Here we expose "cluster" to the user instead of "module", to be
consistent with the "cluster-id" naming.

Since CPUID.04H is used by Intel CPUs, this property is available only
on Intel CPUs for now.

When necessary, it can be extended to CPUID.8000001DH for AMD CPUs.

(Tested the cache topology in the CPUID[0x04] leaf with
"x-l2-cache-topo=[core|cluster]", and tested live migration between
QEMUs with and without this patch series.)
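
As a usage sketch (the machine type, CPU model and -smp values here
are illustrative, not taken from this series):

  qemu-system-x86_64 -machine q35 \
      -smp 16,sockets=1,dies=1,clusters=2,cores=4,threads=2 \
      -cpu Skylake-Server,x-l2-cache-topo=cluster

With this, CPUID.04H.02H reports the L2 cache as shared by the 8
logical processors of one cluster (module) rather than the 2 of one
core.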

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
---
Changes since v3:
 * Add description about test for live migration compatibility. (Babu)

Changes since v1:
 * Rename MODULE branch to CPU_TOPO_LEVEL_MODULE to match the previous
   renaming changes.
---
 target/i386/cpu.c | 34 +++++++++++++++++++++++++++++++++-
 target/i386/cpu.h |  2 ++
 2 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 3bed823dc3b7..b1282c8bd3b7 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -243,6 +243,9 @@ static uint32_t max_processor_ids_for_cache(X86CPUTopoInfo *topo_info,
     case CPU_TOPO_LEVEL_CORE:
         num_ids = 1 << apicid_core_offset(topo_info);
         break;
+    case CPU_TOPO_LEVEL_MODULE:
+        num_ids = 1 << apicid_module_offset(topo_info);
+        break;
     case CPU_TOPO_LEVEL_DIE:
         num_ids = 1 << apicid_die_offset(topo_info);
         break;
@@ -251,7 +254,7 @@ static uint32_t max_processor_ids_for_cache(X86CPUTopoInfo *topo_info,
         break;
     default:
         /*
-         * Currently there is no use case for SMT and MODULE, so use
+         * Currently there is no use case for SMT, so use
          * assert directly to facilitate debugging.
          */
         g_assert_not_reached();
@@ -7576,6 +7579,34 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
         env->cache_info_amd.l3_cache = &legacy_l3_cache;
     }
 
+    if (cpu->l2_cache_topo_level) {
+        /*
+         * FIXME: Currently only supports changing CPUID[4] (for intel), and
+         * will support changing CPUID[0x8000001D] when necessary.
+         */
+        if (!IS_INTEL_CPU(env)) {
+            error_setg(errp, "only intel cpus supports x-l2-cache-topo");
+            return;
+        }
+
+        if (!strcmp(cpu->l2_cache_topo_level, "core")) {
+            env->cache_info_cpuid4.l2_cache->share_level = CPU_TOPO_LEVEL_CORE;
+        } else if (!strcmp(cpu->l2_cache_topo_level, "cluster")) {
+            /*
+             * We expose to users "cluster" instead of "module", to be
+             * consistent with "cluster-id" naming.
+             */
+            env->cache_info_cpuid4.l2_cache->share_level =
+                                                        CPU_TOPO_LEVEL_MODULE;
+        } else {
+            error_setg(errp,
+                       "x-l2-cache-topo doesn't support '%s', "
+                       "and it only supports 'core' or 'cluster'",
+                       cpu->l2_cache_topo_level);
+            return;
+        }
+    }
+
 #ifndef CONFIG_USER_ONLY
     MachineState *ms = MACHINE(qdev_get_machine());
     qemu_register_reset(x86_cpu_machine_reset_cb, cpu);
@@ -8079,6 +8110,7 @@ static Property x86_cpu_properties[] = {
                      false),
     DEFINE_PROP_BOOL("x-intel-pt-auto-level", X86CPU, intel_pt_auto_level,
                      true),
+    DEFINE_PROP_STRING("x-l2-cache-topo", X86CPU, l2_cache_topo_level),
     DEFINE_PROP_END_OF_LIST()
 };
 
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index a13132007415..05ffc4c1cc6e 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -2073,6 +2073,8 @@ struct ArchCPU {
     int32_t hv_max_vps;
 
     bool xen_vapic;
+
+    char *l2_cache_topo_level;
 };
 
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 03/21] softmmu: Fix CPUSTATE.nr_cores' calculation
  2023-09-14  7:21 ` [PATCH v4 03/21] softmmu: Fix CPUSTATE.nr_cores' calculation Zhao Liu
@ 2023-09-14  7:31   ` Philippe Mathieu-Daudé
  2023-09-15  7:39     ` Zhao Liu
  0 siblings, 1 reply; 43+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-09-14  7:31 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum, Yanan Wang,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu,
	Zhuocheng Ding

Hi,

On 14/9/23 09:21, Zhao Liu wrote:
> From: Zhuocheng Ding <zhuocheng.ding@intel.com>
> 
>  From CPUState.nr_cores' comment, it represents "number of cores within
> this CPU package".
> 
> After 003f230e37d7 ("machine: Tweak the order of topology members in
> struct CpuTopology"), the meaning of smp.cores changed to "the number of
> cores in one die", but this commit missed to change CPUState.nr_cores'
> calculation, so that CPUState.nr_cores became wrong and now it
> misses to consider numbers of clusters and dies.
> 
> At present, only i386 is using CPUState.nr_cores.
> 
> But as for i386, which supports die level, the uses of CPUState.nr_cores
> are very confusing:
> 
> Early uses are based on the meaning of "cores per package" (before die
> is introduced into i386), and later uses are based on "cores per die"
> (after die's introduction).
> 
> This difference is due to that commit a94e1428991f ("target/i386: Add
> CPUID.1F generation support for multi-dies PCMachine") misunderstood
> that CPUState.nr_cores means "cores per die" when calculated
> CPUID.1FH.01H:EBX. After that, the changes in i386 all followed this
> wrong understanding.
> 
> With the influence of 003f230e37d7 and a94e1428991f, for i386 currently
> the result of CPUState.nr_cores is "cores per die", thus the original
> uses of CPUState.cores based on the meaning of "cores per package" are
> wrong when multiple dies exist:
> 1. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.01H:EBX[bits 23:16] is
>     incorrect because it expects "cpus per package" but now the
>     result is "cpus per die".
> 2. In cpu_x86_cpuid() of target/i386/cpu.c, for all leaves of CPUID.04H:
>     EAX[bits 31:26] is incorrect because they expect "cpus per package"
>     but now the result is "cpus per die". The error not only impacts the
>     EAX calculation in cache_info_passthrough case, but also impacts other
>     cases of setting cache topology for Intel CPU according to cpu
>     topology (specifically, the incoming parameter "num_cores" expects
>     "cores per package" in encode_cache_cpuid4()).
> 3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.0BH.01H:EBX[bits
>     15:00] is incorrect because the EBX of 0BH.01H (core level) expects
>     "cpus per package", which may be different with 1FH.01H (The reason
>     is 1FH can support more levels. For QEMU, 1FH also supports die,
>     1FH.01H:EBX[bits 15:00] expects "cpus per die").
> 4. In cpu_x86_cpuid() of target/i386/cpu.c, when CPUID.80000001H is
>     calculated, here "cpus per package" is expected to be checked, but in
>     fact, now it checks "cpus per die". Though "cpus per die" also works
>     for this code logic, this isn't consistent with AMD's APM.
> 5. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.80000008H:ECX expects
>     "cpus per package" but it obtains "cpus per die".
> 6. In simulate_rdmsr() of target/i386/hvf/x86_emu.c, in
>     kvm_rdmsr_core_thread_count() of target/i386/kvm/kvm.c, and in
>     helper_rdmsr() of target/i386/tcg/sysemu/misc_helper.c,
>     MSR_CORE_THREAD_COUNT expects "cpus per package" and "cores per
>     package", but in these functions, it obtains "cpus per die" and
>     "cores per die".
> 
> On the other hand, these uses are correct now (they are added in/after
> a94e1428991f):
> 1. In cpu_x86_cpuid() of target/i386/cpu.c, topo_info.cores_per_die
>     meets the actual meaning of CPUState.nr_cores ("cores per die").
> 2. In cpu_x86_cpuid() of target/i386/cpu.c, vcpus_per_socket (in CPUID.
>     04H's calculation) considers number of dies, so it's correct.
> 3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.1FH.01H:EBX[bits
>     15:00] needs "cpus per die" and it gets the correct result, and
>     CPUID.1FH.02H:EBX[bits 15:00] gets correct "cpus per package".
> 
> When CPUState.nr_cores is correctly changed to "cores per package" again
> , the above errors will be fixed without extra work, but the "currently"
> correct cases will go wrong and need special handling to pass correct
> "cpus/cores per die" they want.
> 
> Fix CPUState.nr_cores' calculation to fit the original meaning "cores
> per package", as well as changing calculation of topo_info.cores_per_die,
> vcpus_per_socket and CPUID.1FH.

What a pain. Can we split this patch in 2, first the x86 part
and then the common part (softmmu/cpus.c)?

> Fixes: a94e1428991f ("target/i386: Add CPUID.1F generation support for multi-dies PCMachine")
> Fixes: 003f230e37d7 ("machine: Tweak the order of topology members in struct CpuTopology")
> Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
> Changes since v3:
>   * Describe changes in imperative mood. (Babu)
>   * Fix spelling typo. (Babu)
>   * Split the comment change into a separate patch. (Xiaoyao)
> 
> Changes since v2:
>   * Use wrapped helper to get cores per socket in qemu_init_vcpu().
> 
> Changes since v1:
>   * Add comment for nr_dies in CPUX86State. (Yanan)
> ---
>   softmmu/cpus.c    | 2 +-
>   target/i386/cpu.c | 9 ++++-----
>   2 files changed, 5 insertions(+), 6 deletions(-)
> 
> diff --git a/softmmu/cpus.c b/softmmu/cpus.c
> index 0848e0dbdb3f..fa8239c217ff 100644
> --- a/softmmu/cpus.c
> +++ b/softmmu/cpus.c
> @@ -624,7 +624,7 @@ void qemu_init_vcpu(CPUState *cpu)
>   {
>       MachineState *ms = MACHINE(qdev_get_machine());
>   
> -    cpu->nr_cores = ms->smp.cores;
> +    cpu->nr_cores = machine_topo_get_cores_per_socket(ms);
>       cpu->nr_threads =  ms->smp.threads;
>       cpu->stopped = true;
>       cpu->random_seed = qemu_guest_random_seed_thread_part1();
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 24ee67b42d05..709c055c8468 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -6015,7 +6015,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>       X86CPUTopoInfo topo_info;
>   
>       topo_info.dies_per_pkg = env->nr_dies;
> -    topo_info.cores_per_die = cs->nr_cores;
> +    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
>       topo_info.threads_per_core = cs->nr_threads;
>   
>       /* Calculate & apply limits for different index ranges */
> @@ -6091,8 +6091,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>                */
>               if (*eax & 31) {
>                   int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
> -                int vcpus_per_socket = env->nr_dies * cs->nr_cores *
> -                                       cs->nr_threads;
> +                int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
>                   if (cs->nr_cores > 1) {
>                       *eax &= ~0xFC000000;
>                       *eax |= (pow2ceil(cs->nr_cores) - 1) << 26;
> @@ -6270,12 +6269,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>               break;
>           case 1:
>               *eax = apicid_die_offset(&topo_info);
> -            *ebx = cs->nr_cores * cs->nr_threads;
> +            *ebx = topo_info.cores_per_die * topo_info.threads_per_core;
>               *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
>               break;
>           case 2:
>               *eax = apicid_pkg_offset(&topo_info);
> -            *ebx = env->nr_dies * cs->nr_cores * cs->nr_threads;
> +            *ebx = cs->nr_cores * cs->nr_threads;
>               *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
>               break;
>           default:


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 04/21] hw/cpu: Update the comments of nr_cores and nr_dies
  2023-09-14  7:21 ` [PATCH v4 04/21] hw/cpu: Update the comments of nr_cores and nr_dies Zhao Liu
@ 2023-09-14  7:32   ` Philippe Mathieu-Daudé
  2023-09-15  7:40     ` Zhao Liu
  0 siblings, 1 reply; 43+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-09-14  7:32 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum, Yanan Wang,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu

On 14/9/23 09:21, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
> 
> In the nr_threads' comment, specify it represents the
> number of threads in the "core" to avoid confusion.
> 
> Also add comment for nr_dies in CPUX86State.
> 
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
> Changes since v3:
>   * The new patch split out of CPUSTATE.nr_cores' fix. (Xiaoyao)
> ---
>   include/hw/core/cpu.h | 2 +-
>   target/i386/cpu.h     | 1 +
>   2 files changed, 2 insertions(+), 1 deletion(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 10/21] i386: Introduce module-level cpu topology to CPUX86State
  2023-09-14  7:21 ` [PATCH v4 10/21] i386: Introduce module-level cpu topology to CPUX86State Zhao Liu
@ 2023-09-14  7:38   ` Philippe Mathieu-Daudé
  2023-09-15  7:50     ` Zhao Liu
  0 siblings, 1 reply; 43+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-09-14  7:38 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum, Yanan Wang,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu,
	Zhuocheng Ding

On 14/9/23 09:21, Zhao Liu wrote:
> From: Zhuocheng Ding <zhuocheng.ding@intel.com>
> 
> smp command has the "clusters" parameter but x86 hasn't supported that
> level. "cluster" is a CPU topology level concept above cores, in which
> the cores may share some resources (L2 cache or some others like L3
> cache tags, depending on the Archs) [1][2]. For x86, the resource shared
> by cores at the cluster level is mainly the L2 cache.
> 
> However, using cluster to define x86's L2 cache topology will cause the
> compatibility problem:
> 
> Currently, x86 defaults that the L2 cache is shared in one core, which
> actually implies a default setting "cores per L2 cache is 1" and
> therefore implicitly defaults to having as many L2 caches as cores.
> 
> For example (i386 PC machine):
> -smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16 (*)
> 
> Considering the topology of the L2 cache, this (*) implicitly means "1
> core per L2 cache" and "2 L2 caches per die".
> 
> If we use cluster to configure L2 cache topology with the new default
> setting "clusters per L2 cache is 1", the above semantics will change
> to "2 cores per cluster" and "1 cluster per L2 cache", that is, "2
> cores per L2 cache".
> 
> So the same command (*) will cause changes in the L2 cache topology,
> further affecting the performance of the virtual machine.
> 
> Therefore, x86 should only treat cluster as a cpu topology level and
> avoid using it to change L2 cache by default for compatibility.
> 
> "cluster" in smp is the CPU topology level which is between "core" and
> die.
> 
> For x86, the "cluster" in smp is corresponding to the module level [2],
> which is above the core level. So use the "module" other than "cluster"
> in i386 code.
> 
> And please note that x86 already has a cpu topology level also named
> "cluster" [3], this level is at the upper level of the package. Here,
> the cluster in x86 cpu topology is completely different from the
> "clusters" as the smp parameter. After the module level is introduced,
> the cluster as the smp parameter will actually refer to the module level
> of x86.
> 
> [1]: 864c3b5c32f0 ("hw/core/machine: Introduce CPU cluster topology support")
> [2]: Yanan's comment about "cluster",
>       https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg04051.html
> [3]: SDM, vol.3, ch.9, 9.9.1 Hierarchical Mapping of Shared Resources.
> 
> Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> Changes since v1:
>   * The background of the introduction of the "cluster" parameter and its
>     exact meaning were revised according to Yanan's explanation. (Yanan)
> ---
>   hw/i386/x86.c     | 1 +
>   target/i386/cpu.c | 1 +
>   target/i386/cpu.h | 5 +++++
>   3 files changed, 7 insertions(+)


> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index 470257b92240..556e80f29764 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1903,6 +1903,11 @@ typedef struct CPUArchState {
>   
>       /* Number of dies within this CPU package. */
>       unsigned nr_dies;
> +    /*
> +     * Number of modules within this CPU package.
> +     * Module level in x86 cpu topology is corresponding to smp.clusters.
> +     */
> +    unsigned nr_modules;
>   } CPUX86State;

It would be really useful to have an ASCII art comment showing
the architecture topology. Also for clarity the topo fields from
CPU[Arch]State could be moved into a 'topo' sub structure, or even
clearer would be to re-use the X86CPUTopoIDs structure?

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 21/21] i386: Add new property to control L2 cache topo in CPUID.04H
  2023-09-14  7:21 ` [PATCH v4 21/21] i386: Add new property to control L2 cache topo in CPUID.04H Zhao Liu
@ 2023-09-14  7:41   ` Philippe Mathieu-Daudé
  2023-09-15  7:53     ` Zhao Liu
  0 siblings, 1 reply; 43+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-09-14  7:41 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum, Yanan Wang,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu,
	Yongwei Ma

On 14/9/23 09:21, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
> 
> The property x-l2-cache-topo will be used to change the L2 cache
> topology in CPUID.04H.
> 
> Now it allows user to set the L2 cache is shared in core level or
> cluster level.
> 
> If user passes "-cpu x-l2-cache-topo=[core|cluster]" then older L2 cache
> topology will be overrode by the new topology setting.
> 
> Here we expose to user "cluster" instead of "module", to be consistent
> with "cluster-id" naming.
> 
> Since CPUID.04H is used by intel CPUs, this property is available on
> intel CPUs as for now.
> 
> When necessary, it can be extended to CPUID.8000001DH for AMD CPUs.
> 
> (Tested the cache topology in CPUID[0x04] leaf with "x-l2-cache-topo=[
> core|cluster]", and tested the live migration between the QEMUs w/ &
> w/o this patch series.)
> 
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> Tested-by: Yongwei Ma <yongwei.ma@intel.com>
> ---
> Changes since v3:
>   * Add description about test for live migration compatibility. (Babu)
> 
> Changes since v1:
>   * Rename MODULE branch to CPU_TOPO_LEVEL_MODULE to match the previous
>     renaming changes.
> ---
>   target/i386/cpu.c | 34 +++++++++++++++++++++++++++++++++-
>   target/i386/cpu.h |  2 ++
>   2 files changed, 35 insertions(+), 1 deletion(-)


> @@ -8079,6 +8110,7 @@ static Property x86_cpu_properties[] = {
>                        false),
>       DEFINE_PROP_BOOL("x-intel-pt-auto-level", X86CPU, intel_pt_auto_level,
>                        true),
> +    DEFINE_PROP_STRING("x-l2-cache-topo", X86CPU, l2_cache_topo_level),

We use the 'x-' prefix for unstable features, is it the case here?

>       DEFINE_PROP_END_OF_LIST()
>   };


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 03/21] softmmu: Fix CPUSTATE.nr_cores' calculation
  2023-09-14  7:31   ` Philippe Mathieu-Daudé
@ 2023-09-15  7:39     ` Zhao Liu
  0 siblings, 0 replies; 43+ messages in thread
From: Zhao Liu @ 2023-09-15  7:39 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Eduardo Habkost, Marcel Apfelbaum, Yanan Wang,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Marcelo Tosatti, qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li,
	Babu Moger, Zhao Liu, Zhuocheng Ding

Hi Philippe,

On Thu, Sep 14, 2023 at 09:31:52AM +0200, Philippe Mathieu-Daudé wrote:
> Date: Thu, 14 Sep 2023 09:31:52 +0200
> From: Philippe Mathieu-Daudé <philmd@linaro.org>
> Subject: Re: [PATCH v4 03/21] softmmu: Fix CPUSTATE.nr_cores' calculation
> 
> Hi,
> 
> On 14/9/23 09:21, Zhao Liu wrote:
> > From: Zhuocheng Ding <zhuocheng.ding@intel.com>
> > 
> >  From CPUState.nr_cores' comment, it represents "number of cores within
> > this CPU package".
> > 
> > After 003f230e37d7 ("machine: Tweak the order of topology members in
> > struct CpuTopology"), the meaning of smp.cores changed to "the number of
> > cores in one die", but this commit missed to change CPUState.nr_cores'
> > calculation, so that CPUState.nr_cores became wrong and now it
> > misses to consider numbers of clusters and dies.
> > 
> > At present, only i386 is using CPUState.nr_cores.
> > 
> > But as for i386, which supports die level, the uses of CPUState.nr_cores
> > are very confusing:
> > 
> > Early uses are based on the meaning of "cores per package" (before die
> > is introduced into i386), and later uses are based on "cores per die"
> > (after die's introduction).
> > 
> > This difference is due to that commit a94e1428991f ("target/i386: Add
> > CPUID.1F generation support for multi-dies PCMachine") misunderstood
> > that CPUState.nr_cores means "cores per die" when calculated
> > CPUID.1FH.01H:EBX. After that, the changes in i386 all followed this
> > wrong understanding.
> > 
> > With the influence of 003f230e37d7 and a94e1428991f, for i386 currently
> > the result of CPUState.nr_cores is "cores per die", thus the original
> > uses of CPUState.cores based on the meaning of "cores per package" are
> > wrong when multiple dies exist:
> > 1. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.01H:EBX[bits 23:16] is
> >     incorrect because it expects "cpus per package" but now the
> >     result is "cpus per die".
> > 2. In cpu_x86_cpuid() of target/i386/cpu.c, for all leaves of CPUID.04H:
> >     EAX[bits 31:26] is incorrect because they expect "cpus per package"
> >     but now the result is "cpus per die". The error not only impacts the
> >     EAX calculation in cache_info_passthrough case, but also impacts other
> >     cases of setting cache topology for Intel CPU according to cpu
> >     topology (specifically, the incoming parameter "num_cores" expects
> >     "cores per package" in encode_cache_cpuid4()).
> > 3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.0BH.01H:EBX[bits
> >     15:00] is incorrect because the EBX of 0BH.01H (core level) expects
> >     "cpus per package", which may be different with 1FH.01H (The reason
> >     is 1FH can support more levels. For QEMU, 1FH also supports die,
> >     1FH.01H:EBX[bits 15:00] expects "cpus per die").
> > 4. In cpu_x86_cpuid() of target/i386/cpu.c, when CPUID.80000001H is
> >     calculated, here "cpus per package" is expected to be checked, but in
> >     fact, now it checks "cpus per die". Though "cpus per die" also works
> >     for this code logic, this isn't consistent with AMD's APM.
> > 5. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.80000008H:ECX expects
> >     "cpus per package" but it obtains "cpus per die".
> > 6. In simulate_rdmsr() of target/i386/hvf/x86_emu.c, in
> >     kvm_rdmsr_core_thread_count() of target/i386/kvm/kvm.c, and in
> >     helper_rdmsr() of target/i386/tcg/sysemu/misc_helper.c,
> >     MSR_CORE_THREAD_COUNT expects "cpus per package" and "cores per
> >     package", but in these functions, it obtains "cpus per die" and
> >     "cores per die".
> > 
> > On the other hand, these uses are correct now (they are added in/after
> > a94e1428991f):
> > 1. In cpu_x86_cpuid() of target/i386/cpu.c, topo_info.cores_per_die
> >     meets the actual meaning of CPUState.nr_cores ("cores per die").
> > 2. In cpu_x86_cpuid() of target/i386/cpu.c, vcpus_per_socket (in CPUID.
> >     04H's calculation) considers number of dies, so it's correct.
> > 3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.1FH.01H:EBX[bits
> >     15:00] needs "cpus per die" and it gets the correct result, and
> >     CPUID.1FH.02H:EBX[bits 15:00] gets correct "cpus per package".
> > 
> > When CPUState.nr_cores is correctly changed to "cores per package" again
> > , the above errors will be fixed without extra work, but the "currently"
> > correct cases will go wrong and need special handling to pass correct
> > "cpus/cores per die" they want.
> > 
> > Fix CPUState.nr_cores' calculation to fit the original meaning "cores
> > per package", as well as changing calculation of topo_info.cores_per_die,
> > vcpus_per_socket and CPUID.1FH.
> 
> What a pain. Can we split this patch in 2, first the x86 part
> and then the common part (softmmu/cpus.c)?

Since x86 uses nr_cores to calculate many things, if the first patch
just fixed nr_cores without changing the x86-related code, that first
patch would break many places (there would be many incorrect CPUID
values).

So I think it's more appropriate to keep these changes in one patch.
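
A minimal sketch of the relationship at stake (illustrative):

  cores_per_package = dies_per_package * cores_per_die

so after the fix cs->nr_cores counts cores per package again, and
topo_info.cores_per_die recovers the per-die count by dividing by
env->nr_dies, as the hunks above show.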

Thanks,
Zhao

> 
> > Fixes: a94e1428991f ("target/i386: Add CPUID.1F generation support for multi-dies PCMachine")
> > Fixes: 003f230e37d7 ("machine: Tweak the order of topology members in struct CpuTopology")
> > Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> > Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > ---
> > Changes since v3:
> >   * Describe changes in imperative mood. (Babu)
> >   * Fix spelling typo. (Babu)
> >   * Split the comment change into a separate patch. (Xiaoyao)
> > 
> > Changes since v2:
> >   * Use wrapped helper to get cores per socket in qemu_init_vcpu().
> > 
> > Changes since v1:
> >   * Add comment for nr_dies in CPUX86State. (Yanan)
> > ---
> >   softmmu/cpus.c    | 2 +-
> >   target/i386/cpu.c | 9 ++++-----
> >   2 files changed, 5 insertions(+), 6 deletions(-)
> > 
> > diff --git a/softmmu/cpus.c b/softmmu/cpus.c
> > index 0848e0dbdb3f..fa8239c217ff 100644
> > --- a/softmmu/cpus.c
> > +++ b/softmmu/cpus.c
> > @@ -624,7 +624,7 @@ void qemu_init_vcpu(CPUState *cpu)
> >   {
> >       MachineState *ms = MACHINE(qdev_get_machine());
> > -    cpu->nr_cores = ms->smp.cores;
> > +    cpu->nr_cores = machine_topo_get_cores_per_socket(ms);
> >       cpu->nr_threads =  ms->smp.threads;
> >       cpu->stopped = true;
> >       cpu->random_seed = qemu_guest_random_seed_thread_part1();
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 24ee67b42d05..709c055c8468 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -6015,7 +6015,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >       X86CPUTopoInfo topo_info;
> >       topo_info.dies_per_pkg = env->nr_dies;
> > -    topo_info.cores_per_die = cs->nr_cores;
> > +    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
> >       topo_info.threads_per_core = cs->nr_threads;
> >       /* Calculate & apply limits for different index ranges */
> > @@ -6091,8 +6091,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >                */
> >               if (*eax & 31) {
> >                   int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
> > -                int vcpus_per_socket = env->nr_dies * cs->nr_cores *
> > -                                       cs->nr_threads;
> > +                int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
> >                   if (cs->nr_cores > 1) {
> >                       *eax &= ~0xFC000000;
> >                       *eax |= (pow2ceil(cs->nr_cores) - 1) << 26;
> > @@ -6270,12 +6269,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >               break;
> >           case 1:
> >               *eax = apicid_die_offset(&topo_info);
> > -            *ebx = cs->nr_cores * cs->nr_threads;
> > +            *ebx = topo_info.cores_per_die * topo_info.threads_per_core;
> >               *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
> >               break;
> >           case 2:
> >               *eax = apicid_pkg_offset(&topo_info);
> > -            *ebx = env->nr_dies * cs->nr_cores * cs->nr_threads;
> > +            *ebx = cs->nr_cores * cs->nr_threads;
> >               *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
> >               break;
> >           default:
> 

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 04/21] hw/cpu: Update the comments of nr_cores and nr_dies
  2023-09-14  7:32   ` Philippe Mathieu-Daudé
@ 2023-09-15  7:40     ` Zhao Liu
  0 siblings, 0 replies; 43+ messages in thread
From: Zhao Liu @ 2023-09-15  7:40 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Eduardo Habkost, Marcel Apfelbaum, Yanan Wang,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Marcelo Tosatti, qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li,
	Babu Moger, Zhao Liu

On Thu, Sep 14, 2023 at 09:32:31AM +0200, Philippe Mathieu-Daudé wrote:
> Date: Thu, 14 Sep 2023 09:32:31 +0200
> From: Philippe Mathieu-Daudé <philmd@linaro.org>
> Subject: Re: [PATCH v4 04/21] hw/cpu: Update the comments of nr_cores and
>  nr_dies
> 
> On 14/9/23 09:21, Zhao Liu wrote:
> > From: Zhao Liu <zhao1.liu@intel.com>
> > 
> > In the nr_threads' comment, specify it represents the
> > number of threads in the "core" to avoid confusion.
> > 
> > Also add comment for nr_dies in CPUX86State.
> > 
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > ---
> > Changes since v3:
> >   * The new patch split out of CPUSTATE.nr_cores' fix. (Xiaoyao)
> > ---
> >   include/hw/core/cpu.h | 2 +-
> >   target/i386/cpu.h     | 1 +
> >   2 files changed, 2 insertions(+), 1 deletion(-)
> 
> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
>

Thanks!

-Zhao

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 10/21] i386: Introduce module-level cpu topology to CPUX86State
  2023-09-14  7:38   ` Philippe Mathieu-Daudé
@ 2023-09-15  7:50     ` Zhao Liu
  0 siblings, 0 replies; 43+ messages in thread
From: Zhao Liu @ 2023-09-15  7:50 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Eduardo Habkost, Marcel Apfelbaum, Yanan Wang,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Marcelo Tosatti, qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li,
	Babu Moger, Zhao Liu, Zhuocheng Ding

Hi Philippe,

On Thu, Sep 14, 2023 at 09:38:46AM +0200, Philippe Mathieu-Daudé wrote:
> Date: Thu, 14 Sep 2023 09:38:46 +0200
> From: Philippe Mathieu-Daudé <philmd@linaro.org>
> Subject: Re: [PATCH v4 10/21] i386: Introduce module-level cpu topology to
>  CPUX86State
> 
> On 14/9/23 09:21, Zhao Liu wrote:
> > From: Zhuocheng Ding <zhuocheng.ding@intel.com>
> > 
> > smp command has the "clusters" parameter but x86 hasn't supported that
> > level. "cluster" is a CPU topology level concept above cores, in which
> > the cores may share some resources (L2 cache or some others like L3
> > cache tags, depending on the Archs) [1][2]. For x86, the resource shared
> > by cores at the cluster level is mainly the L2 cache.
> > 
> > However, using cluster to define x86's L2 cache topology will cause the
> > compatibility problem:
> > 
> > Currently, x86 defaults that the L2 cache is shared in one core, which
> > actually implies a default setting "cores per L2 cache is 1" and
> > therefore implicitly defaults to having as many L2 caches as cores.
> > 
> > For example (i386 PC machine):
> > -smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16 (*)
> > 
> > Considering the topology of the L2 cache, this (*) implicitly means "1
> > core per L2 cache" and "2 L2 caches per die".
> > 
> > If we use cluster to configure L2 cache topology with the new default
> > setting "clusters per L2 cache is 1", the above semantics will change
> > to "2 cores per cluster" and "1 cluster per L2 cache", that is, "2
> > cores per L2 cache".
> > 
> > So the same command (*) will cause changes in the L2 cache topology,
> > further affecting the performance of the virtual machine.
> > 
> > Therefore, x86 should only treat cluster as a cpu topology level and
> > avoid using it to change L2 cache by default for compatibility.
> > 
> > "cluster" in smp is the CPU topology level which is between "core" and
> > die.
> > 
> > For x86, the "cluster" in smp is corresponding to the module level [2],
> > which is above the core level. So use the "module" other than "cluster"
> > in i386 code.
> > 
> > And please note that x86 already has a cpu topology level also named
> > "cluster" [3], this level is at the upper level of the package. Here,
> > the cluster in x86 cpu topology is completely different from the
> > "clusters" as the smp parameter. After the module level is introduced,
> > the cluster as the smp parameter will actually refer to the module level
> > of x86.
> > 
> > [1]: 864c3b5c32f0 ("hw/core/machine: Introduce CPU cluster topology support")
> > [2]: Yanan's comment about "cluster",
> >       https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg04051.html
> > [3]: SDM, vol.3, ch.9, 9.9.1 Hierarchical Mapping of Shared Resources.
> > 
> > Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> > Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > Acked-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> > Changes since v1:
> >   * The background of the introduction of the "cluster" parameter and its
> >     exact meaning were revised according to Yanan's explanation. (Yanan)
> > ---
> >   hw/i386/x86.c     | 1 +
> >   target/i386/cpu.c | 1 +
> >   target/i386/cpu.h | 5 +++++
> >   3 files changed, 7 insertions(+)
> 
> 
> > diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> > index 470257b92240..556e80f29764 100644
> > --- a/target/i386/cpu.h
> > +++ b/target/i386/cpu.h
> > @@ -1903,6 +1903,11 @@ typedef struct CPUArchState {
> >       /* Number of dies within this CPU package. */
> >       unsigned nr_dies;
> > +    /*
> > +     * Number of modules within this CPU package.
> > +     * Module level in x86 cpu topology is corresponding to smp.clusters.
> > +     */
> > +    unsigned nr_modules;
> >   } CPUX86State;
> 
> It would be really useful to have an ASCII art comment showing
> the architecture topology.

Good idea, I could consider how to show that.
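
For reference, the hierarchy this series ends up with could be
sketched roughly like this (illustrative, not from the patches):

  package
  `-- die
      `-- module ("cluster" in -smp)
          `-- core
              `-- thread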

> Also for clarity the topo fields from
> CPU[Arch]State could be moved into a 'topo' sub structure, or even
> clearer would be to re-use the X86CPUTopoIDs structure?

Yeah, I also plan to do further cleanup of these topology structures
[1]. X86CPUTopoInfo is not suitable for the hybrid topology case (the
hybrid case needs another structure to store the maximum number of
elements for each level), so I will try to move X86CPUTopoInfo into
CPU[Arch]State.

[1]: https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg01032.html

Thanks,
Zhao

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 21/21] i386: Add new property to control L2 cache topo in CPUID.04H
  2023-09-14  7:41   ` Philippe Mathieu-Daudé
@ 2023-09-15  7:53     ` Zhao Liu
  2023-10-03 12:57       ` Michael S. Tsirkin
  0 siblings, 1 reply; 43+ messages in thread
From: Zhao Liu @ 2023-09-15  7:53 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Eduardo Habkost, Marcel Apfelbaum, Yanan Wang,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Marcelo Tosatti, qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li,
	Babu Moger, Zhao Liu, Yongwei Ma

Hi Philippe,

On Thu, Sep 14, 2023 at 09:41:30AM +0200, Philippe Mathieu-Daudé wrote:
> Date: Thu, 14 Sep 2023 09:41:30 +0200
> From: Philippe Mathieu-Daudé <philmd@linaro.org>
> Subject: Re: [PATCH v4 21/21] i386: Add new property to control L2 cache
>  topo in CPUID.04H
> 
> On 14/9/23 09:21, Zhao Liu wrote:
> > From: Zhao Liu <zhao1.liu@intel.com>
> > 
> > The property x-l2-cache-topo will be used to change the L2 cache
> > topology in CPUID.04H.
> > 
> > Now it allows user to set the L2 cache is shared in core level or
> > cluster level.
> > 
> > If user passes "-cpu x-l2-cache-topo=[core|cluster]" then older L2 cache
> > topology will be overrode by the new topology setting.
> > 
> > Here we expose to user "cluster" instead of "module", to be consistent
> > with "cluster-id" naming.
> > 
> > Since CPUID.04H is used by intel CPUs, this property is available on
> > intel CPUs as for now.
> > 
> > When necessary, it can be extended to CPUID.8000001DH for AMD CPUs.
> > 
> > (Tested the cache topology in CPUID[0x04] leaf with "x-l2-cache-topo=[
> > core|cluster]", and tested the live migration between the QEMUs w/ &
> > w/o this patch series.)
> > 
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > Tested-by: Yongwei Ma <yongwei.ma@intel.com>
> > ---
> > Changes since v3:
> >   * Add description about test for live migration compatibility. (Babu)
> > 
> > Changes since v1:
> >   * Rename MODULE branch to CPU_TOPO_LEVEL_MODULE to match the previous
> >     renaming changes.
> > ---
> >   target/i386/cpu.c | 34 +++++++++++++++++++++++++++++++++-
> >   target/i386/cpu.h |  2 ++
> >   2 files changed, 35 insertions(+), 1 deletion(-)
> 
> 
> > @@ -8079,6 +8110,7 @@ static Property x86_cpu_properties[] = {
> >                        false),
> >       DEFINE_PROP_BOOL("x-intel-pt-auto-level", X86CPU, intel_pt_auto_level,
> >                        true),
> > +    DEFINE_PROP_STRING("x-l2-cache-topo", X86CPU, l2_cache_topo_level),
> 
> We use the 'x-' prefix for unstable features, is it the case here?

I thought that if we get a more general CLI way to define the cache
topology in the future, this option can be removed.

I'm not sure whether this option should be treated as unstable, what
do you think?


Thanks,
Zhao

> 
> >       DEFINE_PROP_END_OF_LIST()
> >   };
> 

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 00/21] Support smp.clusters for x86 in QEMU
  2023-09-14  7:21 [PATCH v4 00/21] Support smp.clusters for x86 in QEMU Zhao Liu
                   ` (20 preceding siblings ...)
  2023-09-14  7:21 ` [PATCH v4 21/21] i386: Add new property to control L2 cache topo in CPUID.04H Zhao Liu
@ 2023-09-22 16:03 ` Moger, Babu
  2023-09-26  3:11   ` Zhao Liu
  2023-10-18 12:06 ` Michael S. Tsirkin
  22 siblings, 1 reply; 43+ messages in thread
From: Moger, Babu @ 2023-09-22 16:03 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu

Tested the series on an AMD system. Created a VM and ran some basic commands.

Everything looks good.

Tested-by: Babu Moger <babu.moger@amd.com>


On 9/14/2023 2:21 AM, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
>
> Hi list,
>
> (CC kvm@vger.kernel.org for better browsing.)
>
> This is the our v4 patch series, rebased on the master branch at the
> commit 9ef497755afc2 ("Merge tag 'pull-vfio-20230911' of
> https://github.com/legoater/qemu into staging").
>
> Comparing with v3 [1], v4 mainly refactors the CPUID[0x1F] encoding and
> exposes module level in CPUID[0x1F] with these new patches:
>
> * [PATCH v4 08/21] i386: Split topology types of CPUID[0x1F] from the
> definitions of CPUID[0xB]
> * [PATCH v4 09/21] i386: Decouple CPUID[0x1F] subleaf with specific
> topology level
> * [PATCH v4 12/21] i386: Expose module level in CPUID[0x1F]
>
> v4 also fixes compile warnings and fixes cache topology uninitialization
> bugs for some AMD CPUs.
>
> Welcome your comments!
>
>
> # Introduction
>
> This series add the cluster support for x86 PC machine, which allows
> x86 can use smp.clusters to configure the module level CPU topology
> of x86.
>
> And since the compatibility issue (see section: ## Why not share L2
> cache in cluster directly), this series also introduce a new command
> to adjust the topology of the x86 L2 cache.
>
> Welcome your comments!
>
>
> # Background
>
> The "clusters" parameter in "smp" is introduced by ARM [2], but x86
> hasn't supported it.
>
> At present, x86 defaults L2 cache is shared in one core, but this is
> not enough. There're some platforms that multiple cores share the
> same L2 cache, e.g., Alder Lake-P shares L2 cache for one module of
> Atom cores [3], that is, every four Atom cores shares one L2 cache.
> Therefore, we need the new CPU topology level (cluster/module).
>
> Another reason is for hybrid architecture. cluster support not only
> provides another level of topology definition in x86, but would also
> provide required code change for future our hybrid topology support.
>
>
> # Overview
>
> ## Introduction of module level for x86
>
> "cluster" in smp is the CPU topology level which is between "core" and
> die.
>
> For x86, the "cluster" in smp is corresponding to the module level [4],
> which is above the core level. So use the "module" other than "cluster"
> in x86 code.
>
> And please note that x86 already has a cpu topology level also named
> "cluster" [4], this level is at the upper level of the package. Here,
> the cluster in x86 cpu topology is completely different from the
> "clusters" as the smp parameter. After the module level is introduced,
> the cluster as the smp parameter will actually refer to the module level
> of x86.
>
>
> ## Why not share L2 cache in cluster directly
>
> Though "clusters" was introduced to help define L2 cache topology
> [2], using cluster to define x86's L2 cache topology will cause the
> compatibility problem:
>
> Currently, x86 defaults that the L2 cache is shared in one core, which
> actually implies a default setting "cores per L2 cache is 1" and
> therefore implicitly defaults to having as many L2 caches as cores.
>
> For example (i386 PC machine):
> -smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16 (*)
>
> Considering the topology of the L2 cache, this (*) implicitly means "1
> core per L2 cache" and "2 L2 caches per die".
>
> If we use cluster to configure L2 cache topology with the new default
> setting "clusters per L2 cache is 1", the above semantics will change
> to "2 cores per cluster" and "1 cluster per L2 cache", that is, "2
> cores per L2 cache".
>
> So the same command (*) will cause changes in the L2 cache topology,
> further affecting the performance of the virtual machine.
>
> Therefore, x86 should only treat cluster as a cpu topology level and
> avoid using it to change L2 cache by default for compatibility.
>
>
> ## module level in CPUID
>
> The Linux kernel (from v6.4, with commit edc0a2b595765 ("x86/topology:
> Fix erroneous smp_num_siblings on Intel Hybrid platforms")) is able to
> handle platforms with the module level enumerated via CPUID.1F.
> 
> Since v3, this series exposes the module level in CPUID[0x1F] (for
> Intel CPUs) if the machine has more than 1 module.
>
> We can configure CPUID.04H.02H (L2 cache topology) with the module
> level by a new property:
> 
>          "-cpu x-l2-cache-topo=cluster"
> 
> For more information about this property, please see the section:
> "## New property: x-l2-cache-topo".
>
>
> ## New cache topology info in CPUCacheInfo
>
> Currently, by default, the cache topology is encoded as:
> 1. i/d cache is shared in one core.
> 2. L2 cache is shared in one core.
> 3. L3 cache is shared in one die.
>
> This general default setting has caused a misunderstanding, namely that
> the cache topology is completely equated with a specific CPU topology,
> such as the connection between the L2 cache and the core level, and the
> connection between the L3 cache and the die level.
>
> In fact, the settings of these topologies depend on the specific
> platform and are not static. For example, on Alder Lake-P, every
> four Atom cores share the same L2 cache [3].
>
> Thus, in this patch set, we explicitly define the corresponding cache
> topology for different CPU models, and this has two benefits (see the
> sketch after this list):
> 1. It is easy to extend to new CPU models with different cache
>     topologies in the future.
> 2. It can easily support a custom cache topology via a property (e.g.,
>     x-l2-cache-topo).
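
A minimal sketch of the idea, with the share_level field name taken from
the patch titles and the CPU_TOPO_LEVEL_* prefix from the changelog (the
exact definitions in QEMU may differ):

    /* Hypothetical sketch; not the exact QEMU definitions. */
    enum CPUTopoLevel {
        CPU_TOPO_LEVEL_INVALID,
        CPU_TOPO_LEVEL_SMT,
        CPU_TOPO_LEVEL_CORE,
        CPU_TOPO_LEVEL_MODULE,   /* the smp "cluster" on x86 */
        CPU_TOPO_LEVEL_DIE,
        CPU_TOPO_LEVEL_PACKAGE,
    };

    typedef struct CPUCacheInfo {
        /* ... existing fields: type, level, size, line_size, ... */

        /* New: the topology level at which this cache is shared,
         * e.g. CPU_TOPO_LEVEL_CORE for a per-core L2 cache or
         * CPU_TOPO_LEVEL_DIE for a per-die L3 cache. */
        enum CPUTopoLevel share_level;
    } CPUCacheInfo;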
>
>
> ## New property: x-l2-cache-topo
>
> The property x-l2-cache-topo will be used to change the L2 cache
> topology in CPUID.04H.
>
> Now it allows the user to set whether the L2 cache is shared at the
> core level or at the cluster level.
> 
> If the user passes "-cpu x-l2-cache-topo=[core|cluster]", the default
> L2 cache topology will be overridden by the new topology setting.
> 
> Since CPUID.04H is used by Intel CPUs, this property is only available
> on Intel CPUs for now.
>
> When necessary, it can be extended to CPUID[0x8000001D] for AMD CPUs.
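
An illustrative invocation (the machine type and CPU model here are
arbitrary):

    qemu-system-x86_64 -machine q35,accel=kvm \
        -smp 8,sockets=1,dies=1,clusters=2,cores=2,threads=2 \
        -cpu host,x-l2-cache-topo=cluster

With x-l2-cache-topo=cluster, each 2-core cluster shares one L2 cache;
with x-l2-cache-topo=core (or by default), each core has its own L2
cache.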
>
>
> # Patch description
>
> patch 1-2 Cleanups about coding style and test name.
>
> patch 3-5 Fixes for x86 topology and Intel L1 cache topology.
>
> patch 6-7 Cleanups about topology related CPUID encoding and QEMU
>            topology variables.
>
> patch 8-9 Refactor CPUID[0x1F] encoding to prepare to introduce module
>            level.
>
> patch 10-16 Add the module as the new CPU topology level in x86; it
>              corresponds to the cluster level in generic code.
>
> patch 17,18,20 Add cache topology information in cache models.
>
> patch 19 Update AMD CPUs' cache topology encoding.
>
> patch 21 Introduce a new property to configure the L2 cache topology.
>
>
> [1]: https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg00022.html
> [2]: https://patchew.org/QEMU/20211228092221.21068-1-wangyanan55@huawei.com/
> [3]: https://www.intel.com/content/www/us/en/products/platforms/details/alder-lake-p.html
> [4]: SDM, vol.3, ch.9, 9.9.1 Hierarchical Mapping of Shared Resources.
>
> Best Regards,
> Zhao
> ---
> Changelog:
>
> Changes since v3 (main changes):
>   * Expose module level in CPUID[0x1F].
>   * Fix compile warnings. (Babu)
>   * Fix bugs where the cache topology was left uninitialized for some
>     AMD CPUs. (Babu)
>
> Changes since v2:
>   * Add "Tested-by", "Reviewed-by" and "ACKed-by" tags.
>   * Use newly added wrapped helper to get cores per socket in
>     qemu_init_vcpu().
>
> Changes since v1:
>   * Reordered patches. (Yanan)
>   * Deprecated the patch to fix comment of machine_parse_smp_config().
>     (Yanan)
>   * Rename test-x86-cpuid.c to test-x86-topo.c. (Yanan)
>   * Split the intel's l1 cache topology fix into a new separate patch.
>     (Yanan)
>   * Combined module_id and APIC ID for module level support into one
>     patch. (Yanan)
>   * Make cache_into_passthrough case of cpuid 0x04 leaf in
>     cpu_x86_cpuid() use max_processor_ids_for_cache() and
>     max_core_ids_in_package() to encode CPUID[4]. (Yanan)
>   * Add the prefix "CPU_TOPO_LEVEL_*" for CPU topology level names.
>     (Yanan)
> ---
> Zhao Liu (14):
>    i386: Fix comment style in topology.h
>    tests: Rename test-x86-cpuid.c to test-x86-topo.c
>    hw/cpu: Update the comments of nr_cores and nr_dies
>    i386/cpu: Fix i/d-cache topology to core level for Intel CPU
>    i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4]
>    i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
>    i386: Split topology types of CPUID[0x1F] from the definitions of
>      CPUID[0xB]
>    i386: Decouple CPUID[0x1F] subleaf with specific topology level
>    i386: Expose module level in CPUID[0x1F]
>    i386: Add cache topology info in CPUCacheInfo
>    i386: Use CPUCacheInfo.share_level to encode CPUID[4]
>    i386: Use offsets get NumSharingCache for CPUID[0x8000001D].EAX[bits
>      25:14]
>    i386: Use CPUCacheInfo.share_level to encode
>      CPUID[0x8000001D].EAX[bits 25:14]
>    i386: Add new property to control L2 cache topo in CPUID.04H
>
> Zhuocheng Ding (7):
>    softmmu: Fix CPUSTATE.nr_cores' calculation
>    i386: Introduce module-level cpu topology to CPUX86State
>    i386: Support modules_per_die in X86CPUTopoInfo
>    i386: Support module_id in X86CPUTopoIDs
>    i386/cpu: Introduce cluster-id to X86CPU
>    tests: Add test case of APIC ID for module level parsing
>    hw/i386/pc: Support smp.clusters for x86 PC machine
>
>   MAINTAINERS                                   |   2 +-
>   hw/i386/pc.c                                  |   1 +
>   hw/i386/x86.c                                 |  49 ++-
>   include/hw/core/cpu.h                         |   2 +-
>   include/hw/i386/topology.h                    |  68 ++--
>   qemu-options.hx                               |  10 +-
>   softmmu/cpus.c                                |   2 +-
>   target/i386/cpu.c                             | 322 ++++++++++++++----
>   target/i386/cpu.h                             |  46 ++-
>   target/i386/kvm/kvm.c                         |   2 +-
>   tests/unit/meson.build                        |   4 +-
>   .../{test-x86-cpuid.c => test-x86-topo.c}     |  58 ++--
>   12 files changed, 437 insertions(+), 129 deletions(-)
>   rename tests/unit/{test-x86-cpuid.c => test-x86-topo.c} (73%)
>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 01/21] i386: Fix comment style in topology.h
  2023-09-14  7:21 ` [PATCH v4 01/21] i386: Fix comment style in topology.h Zhao Liu
@ 2023-09-22 16:05   ` Moger, Babu
  2023-09-26  3:11     ` Zhao Liu
  0 siblings, 1 reply; 43+ messages in thread
From: Moger, Babu @ 2023-09-22 16:05 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu


On 9/14/2023 2:21 AM, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
>
> For function comments in this file, keep the comment style consistent
> with other files in the directory.
>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
> Reviewed-by: Xiaoyao Li <xiaoyao.li@Intel.com>
> Acked-by: Michael S. Tsirkin <mst@redhat.com>

Reviewed-by: Babu Moger <babu.moger@amd.com>

Thanks

Babu


> ---
> Changes since v3:
>   * Optimized the description in commit message: Change "with other
>     places" to "with other files in the directory". (Babu)
> ---
>   include/hw/i386/topology.h | 33 +++++++++++++++++----------------
>   1 file changed, 17 insertions(+), 16 deletions(-)
>
> diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> index 81573f6cfde0..5a19679f618b 100644
> --- a/include/hw/i386/topology.h
> +++ b/include/hw/i386/topology.h
> @@ -24,7 +24,8 @@
>   #ifndef HW_I386_TOPOLOGY_H
>   #define HW_I386_TOPOLOGY_H
>   
> -/* This file implements the APIC-ID-based CPU topology enumeration logic,
> +/*
> + * This file implements the APIC-ID-based CPU topology enumeration logic,
>    * documented at the following document:
>    *   Intel® 64 Architecture Processor Topology Enumeration
>    *   http://software.intel.com/en-us/articles/intel-64-architecture-processor-topology-enumeration/
> @@ -41,7 +42,8 @@
>   
>   #include "qemu/bitops.h"
>   
> -/* APIC IDs can be 32-bit, but beware: APIC IDs > 255 require x2APIC support
> +/*
> + * APIC IDs can be 32-bit, but beware: APIC IDs > 255 require x2APIC support
>    */
>   typedef uint32_t apic_id_t;
>   
> @@ -58,8 +60,7 @@ typedef struct X86CPUTopoInfo {
>       unsigned threads_per_core;
>   } X86CPUTopoInfo;
>   
> -/* Return the bit width needed for 'count' IDs
> - */
> +/* Return the bit width needed for 'count' IDs */
>   static unsigned apicid_bitwidth_for_count(unsigned count)
>   {
>       g_assert(count >= 1);
> @@ -67,15 +68,13 @@ static unsigned apicid_bitwidth_for_count(unsigned count)
>       return count ? 32 - clz32(count) : 0;
>   }
>   
> -/* Bit width of the SMT_ID (thread ID) field on the APIC ID
> - */
> +/* Bit width of the SMT_ID (thread ID) field on the APIC ID */
>   static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
>   {
>       return apicid_bitwidth_for_count(topo_info->threads_per_core);
>   }
>   
> -/* Bit width of the Core_ID field
> - */
> +/* Bit width of the Core_ID field */
>   static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
>   {
>       return apicid_bitwidth_for_count(topo_info->cores_per_die);
> @@ -87,8 +86,7 @@ static inline unsigned apicid_die_width(X86CPUTopoInfo *topo_info)
>       return apicid_bitwidth_for_count(topo_info->dies_per_pkg);
>   }
>   
> -/* Bit offset of the Core_ID field
> - */
> +/* Bit offset of the Core_ID field */
>   static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
>   {
>       return apicid_smt_width(topo_info);
> @@ -100,14 +98,14 @@ static inline unsigned apicid_die_offset(X86CPUTopoInfo *topo_info)
>       return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
>   }
>   
> -/* Bit offset of the Pkg_ID (socket ID) field
> - */
> +/* Bit offset of the Pkg_ID (socket ID) field */
>   static inline unsigned apicid_pkg_offset(X86CPUTopoInfo *topo_info)
>   {
>       return apicid_die_offset(topo_info) + apicid_die_width(topo_info);
>   }
>   
> -/* Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
> +/*
> + * Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
>    *
>    * The caller must make sure core_id < nr_cores and smt_id < nr_threads.
>    */
> @@ -120,7 +118,8 @@ static inline apic_id_t x86_apicid_from_topo_ids(X86CPUTopoInfo *topo_info,
>              topo_ids->smt_id;
>   }
>   
> -/* Calculate thread/core/package IDs for a specific topology,
> +/*
> + * Calculate thread/core/package IDs for a specific topology,
>    * based on (contiguous) CPU index
>    */
>   static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
> @@ -137,7 +136,8 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
>       topo_ids->smt_id = cpu_index % nr_threads;
>   }
>   
> -/* Calculate thread/core/package IDs for a specific topology,
> +/*
> + * Calculate thread/core/package IDs for a specific topology,
>    * based on APIC ID
>    */
>   static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
> @@ -155,7 +155,8 @@ static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
>       topo_ids->pkg_id = apicid >> apicid_pkg_offset(topo_info);
>   }
>   
> -/* Make APIC ID for the CPU 'cpu_index'
> +/*
> + * Make APIC ID for the CPU 'cpu_index'
>    *
>    * 'cpu_index' is a sequential, contiguous ID for the CPU.
>    */

^ permalink raw reply	[flat|nested] 43+ messages in thread
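
As a side note to the helpers shown in the patch above, the resulting
APIC ID bit layout can be worked out for the cover letter's example
topology. The following is a self-contained illustrative sketch that
re-implements the bit-width logic rather than including QEMU headers:

    #include <stdio.h>

    /* Mirrors apicid_bitwidth_for_count(): bits needed for IDs
     * 0..count-1. */
    static unsigned width(unsigned count)
    {
        unsigned w = 0;
        for (count -= 1; count; count >>= 1) {
            w++;
        }
        return w;
    }

    int main(void)
    {
        unsigned threads_per_core = 2, cores_per_die = 2, dies_per_pkg = 2;

        unsigned smt_off  = 0;
        unsigned core_off = smt_off + width(threads_per_core);   /* 1 */
        unsigned die_off  = core_off + width(cores_per_die);     /* 2 */
        unsigned pkg_off  = die_off + width(dies_per_pkg);       /* 3 */

        /* APIC ID for pkg 1, die 0, core 1, thread 0:
         * (1 << 3) | (0 << 2) | (1 << 1) | 0 = 0xa */
        unsigned apicid = (1u << pkg_off) | (0u << die_off) |
                          (1u << core_off) | 0u;
        printf("apicid = 0x%x\n", apicid);
        return 0;
    }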

* Re: [PATCH v4 19/21] i386: Use offsets get NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]
  2023-09-14  7:21 ` [PATCH v4 19/21] i386: Use offsets get NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14] Zhao Liu
@ 2023-09-22 19:27   ` Moger, Babu
  2023-09-26  3:10     ` Zhao Liu
  0 siblings, 1 reply; 43+ messages in thread
From: Moger, Babu @ 2023-09-22 19:27 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu


On 9/14/2023 2:21 AM, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
>
> The commit 8f4202fb1080 ("i386: Populate AMD Processor Cache Information
> for cpuid 0x8000001D") adds the cache topology for AMD CPU by encoding
> the number of sharing threads directly.
>
> From AMD's APM, NumSharingCache (CPUID[0x8000001D].EAX[bits 25:14])
> means [1]:
>
> The number of logical processors sharing this cache is the value of
> this field incremented by 1. To determine which logical processors are
> sharing a cache, determine a Share Id for each processor as follows:
>
> ShareId = LocalApicId >> log2(NumSharingCache+1)
>
> Logical processors with the same ShareId then share a cache. If
> NumSharingCache+1 is not a power of two, round it up to the next power
> of two.
>
> From the description above, the calculation of this field should be the same
> as CPUID[4].EAX[bits 25:14] for Intel CPUs. So also use the offsets of
> APIC ID to calculate this field.
>
> [1]: APM, vol.3, appendix.E.4.15 Function 8000_001Dh--Cache Topology
>       Information
>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
> ---
> Changes since v3:
>   * Rewrite the subject. (Babu)
>   * Delete the original "comment/help" expression, as this behavior is
>     confirmed for AMD CPUs. (Babu)
>   * Rename "num_apic_ids" (v3) to "num_sharing_cache" to match spec
>     definition. (Babu)
>
> Changes since v1:
>   * Rename "l3_threads" to "num_apic_ids" in
>     encode_cache_cpuid8000001d(). (Yanan)
>   * Add the description of the original commit and add Cc.
> ---
>   target/i386/cpu.c | 10 ++++------
>   1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 5d066107d6ce..bc28c59df089 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -482,7 +482,7 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
>                                          uint32_t *eax, uint32_t *ebx,
>                                          uint32_t *ecx, uint32_t *edx)
>   {
> -    uint32_t l3_threads;
> +    uint32_t num_sharing_cache;
>       assert(cache->size == cache->line_size * cache->associativity *
>                             cache->partitions * cache->sets);
>   
> @@ -491,13 +491,11 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
>   
>       /* L3 is shared among multiple cores */
>       if (cache->level == 3) {
> -        l3_threads = topo_info->modules_per_die *
> -                     topo_info->cores_per_module *
> -                     topo_info->threads_per_core;
> -        *eax |= (l3_threads - 1) << 14;
> +        num_sharing_cache = 1 << apicid_die_offset(topo_info);
>       } else {
> -        *eax |= ((topo_info->threads_per_core - 1) << 14);
> +        num_sharing_cache = 1 << apicid_core_offset(topo_info);
>       }
> +    *eax |= (num_sharing_cache - 1) << 14;
>   
>       assert(cache->line_size > 0);
>       assert(cache->partitions > 0);

^ permalink raw reply	[flat|nested] 43+ messages in thread
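
A small self-contained sketch of the ShareId derivation quoted from the
APM in the patch above (illustrative only, not part of the patch):

    #include <stdio.h>

    /* Round v up to the next power of two, per the APM quote above. */
    static unsigned round_up_pow2(unsigned v)
    {
        unsigned p = 1;
        while (p < v) {
            p <<= 1;
        }
        return p;
    }

    static unsigned log2u(unsigned v)
    {
        unsigned l = 0;
        while (v >>= 1) {
            l++;
        }
        return l;
    }

    int main(void)
    {
        /* NumSharingCache = 5 means 6 logical processors share the
         * cache; 6 is not a power of two, so round up to 8 (shift
         * by 3). */
        unsigned num_sharing_cache = 5;
        unsigned shift = log2u(round_up_pow2(num_sharing_cache + 1));
        unsigned apic_id;

        for (apic_id = 0; apic_id < 16; apic_id++) {
            printf("LocalApicId %2u -> ShareId %u\n", apic_id,
                   apic_id >> shift);
        }
        return 0;
    }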

* Re: [PATCH v4 20/21] i386: Use CPUCacheInfo.share_level to encode CPUID[0x8000001D].EAX[bits 25:14]
  2023-09-14  7:21 ` [PATCH v4 20/21] i386: Use CPUCacheInfo.share_level to encode " Zhao Liu
@ 2023-09-22 19:27   ` Moger, Babu
  2023-09-26  3:08     ` Zhao Liu
  0 siblings, 1 reply; 43+ messages in thread
From: Moger, Babu @ 2023-09-22 19:27 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti
  Cc: qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu


On 9/14/2023 2:21 AM, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
>
> CPUID[0x8000001D].EAX[bits 25:14] NumSharingCache: number of logical
> processors sharing cache.
>
> The number of logical processors sharing this cache is
> NumSharingCache + 1.
>
> After cache models have topology information, we can use
> CPUCacheInfo.share_level to decide which topology level to be encoded
> into CPUID[0x8000001D].EAX[bits 25:14].
>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>

Reviewed-by: Babu Moger <babu.moger@amd.com>


> ---
> Changes since v3:
>   * Explain what "CPUID[0x8000001D].EAX[bits 25:14]" means in the commit
>     message. (Babu)
>
> Changes since v1:
>   * Use cache->share_level as the parameter in
>     max_processor_ids_for_cache().
> ---
>   target/i386/cpu.c | 10 +---------
>   1 file changed, 1 insertion(+), 9 deletions(-)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index bc28c59df089..3bed823dc3b7 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -482,20 +482,12 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
>                                          uint32_t *eax, uint32_t *ebx,
>                                          uint32_t *ecx, uint32_t *edx)
>   {
> -    uint32_t num_sharing_cache;
>       assert(cache->size == cache->line_size * cache->associativity *
>                             cache->partitions * cache->sets);
>   
>       *eax = CACHE_TYPE(cache->type) | CACHE_LEVEL(cache->level) |
>                  (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0);
> -
> -    /* L3 is shared among multiple cores */
> -    if (cache->level == 3) {
> -        num_sharing_cache = 1 << apicid_die_offset(topo_info);
> -    } else {
> -        num_sharing_cache = 1 << apicid_core_offset(topo_info);
> -    }
> -    *eax |= (num_sharing_cache - 1) << 14;
> +    *eax |= max_processor_ids_for_cache(topo_info, cache->share_level) << 14;
>   
>       assert(cache->line_size > 0);
>       assert(cache->partitions > 0);

^ permalink raw reply	[flat|nested] 43+ messages in thread
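
For context, a hypothetical sketch of what max_processor_ids_for_cache()
might compute, inferred from patches 19 and 20 together (the real helper
in the series may differ). Since the patch above ORs the returned value
into EAX without subtracting 1, the helper presumably returns the
count-minus-one encoding:

    /* Hypothetical sketch inferred from patches 19 and 20; the actual
     * QEMU helper may differ. Returns the "NumSharingCache"-style
     * encoding: (number of APIC IDs at share_level) - 1. */
    static uint32_t max_processor_ids_for_cache(X86CPUTopoInfo *topo_info,
                                                enum CPUTopoLevel share_level)
    {
        unsigned offset;

        switch (share_level) {
        case CPU_TOPO_LEVEL_CORE:
            /* e.g. L1/L2 shared per core */
            offset = apicid_core_offset(topo_info);
            break;
        case CPU_TOPO_LEVEL_DIE:
            /* e.g. L3 shared per die */
            offset = apicid_die_offset(topo_info);
            break;
        default:
            g_assert_not_reached();
        }
        return (1u << offset) - 1;
    }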

* Re: [PATCH v4 20/21] i386: Use CPUCacheInfo.share_level to encode CPUID[0x8000001D].EAX[bits 25:14]
  2023-09-22 19:27   ` Moger, Babu
@ 2023-09-26  3:08     ` Zhao Liu
  0 siblings, 0 replies; 43+ messages in thread
From: Zhao Liu @ 2023-09-26  3:08 UTC (permalink / raw)
  To: Babu Moger
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti, qemu-devel, kvm, Zhenyu Wang,
	Xiaoyao Li, Zhao Liu

On Fri, Sep 22, 2023 at 02:27:58PM -0500, Moger, Babu wrote:
> Date: Fri, 22 Sep 2023 14:27:58 -0500
> From: "Moger, Babu" <bmoger@amd.com>
> Subject: Re: [PATCH v4 20/21] i386: Use CPUCacheInfo.share_level to encode
>  CPUID[0x8000001D].EAX[bits 25:14]
> 
> 
> On 9/14/2023 2:21 AM, Zhao Liu wrote:
> > From: Zhao Liu <zhao1.liu@intel.com>
> > 
> > CPUID[0x8000001D].EAX[bits 25:14] NumSharingCache: number of logical
> > processors sharing cache.
> > 
> > The number of logical processors sharing this cache is
> > NumSharingCache + 1.
> > 
> > After cache models have topology information, we can use
> > CPUCacheInfo.share_level to decide which topology level to be encoded
> > into CPUID[0x8000001D].EAX[bits 25:14].
> > 
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> 
> Reviewed-by: Babu Moger <babu.moger@amd.com>

Thanks Babu!

-Zhao

> > [...]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 19/21] i386: Use offsets get NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]
  2023-09-22 19:27   ` Moger, Babu
@ 2023-09-26  3:10     ` Zhao Liu
  0 siblings, 0 replies; 43+ messages in thread
From: Zhao Liu @ 2023-09-26  3:10 UTC (permalink / raw)
  To: Babu Moger
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti, qemu-devel, kvm, Zhenyu Wang,
	Xiaoyao Li, Zhao Liu

On Fri, Sep 22, 2023 at 02:27:18PM -0500, Moger, Babu wrote:
> Date: Fri, 22 Sep 2023 14:27:18 -0500
> From: "Moger, Babu" <bmoger@amd.com>
> Subject: Re: [PATCH v4 19/21] i386: Use offsets get NumSharingCache for
>  CPUID[0x8000001D].EAX[bits 25:14]
> 
> 
> On 9/14/2023 2:21 AM, Zhao Liu wrote:
> > From: Zhao Liu <zhao1.liu@intel.com>
> > 
> > The commit 8f4202fb1080 ("i386: Populate AMD Processor Cache Information
> > for cpuid 0x8000001D") adds the cache topology for AMD CPU by encoding
> > the number of sharing threads directly.
> > 
> >  From AMD's APM, NumSharingCache (CPUID[0x8000001D].EAX[bits 25:14])
> > means [1]:
> > 
> > The number of logical processors sharing this cache is the value of
> > this field incremented by 1. To determine which logical processors are
> > sharing a cache, determine a Share Id for each processor as follows:
> > 
> > ShareId = LocalApicId >> log2(NumSharingCache+1)
> > 
> > Logical processors with the same ShareId then share a cache. If
> > NumSharingCache+1 is not a power of two, round it up to the next power
> > of two.
> > 
> >  From the description above, the calculation of this field should be same
> > as CPUID[4].EAX[bits 25:14] for Intel CPUs. So also use the offsets of
> > APIC ID to calculate this field.
> > 
> > [1]: APM, vol.3, appendix.E.4.15 Function 8000_001Dh--Cache Topology
> >       Information
> > 
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> Reviewed-by: Babu Moger <babu.moger@amd.com>

Thanks Babu!

-Zhao

> > [...]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 00/21] Support smp.clusters for x86 in QEMU
  2023-09-22 16:03 ` [PATCH v4 00/21] Support smp.clusters for x86 in QEMU Moger, Babu
@ 2023-09-26  3:11   ` Zhao Liu
  0 siblings, 0 replies; 43+ messages in thread
From: Zhao Liu @ 2023-09-26  3:11 UTC (permalink / raw)
  To: babu.moger
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti, qemu-devel, kvm, Zhenyu Wang,
	Xiaoyao Li, Zhao Liu

On Fri, Sep 22, 2023 at 11:03:55AM -0500, Moger, Babu wrote:
> Date: Fri, 22 Sep 2023 11:03:55 -0500
> From: "Moger, Babu" <bmoger@amd.com>
> Subject: Re: [PATCH v4 00/21] Support smp.clusters for x86 in QEMU
> 
> Tested the series on an AMD system. Created a VM and ran some basic commands.
> 
> Everything looks good.
> 
> Tested-by: Babu Moger <babu.moger@amd.com>
> 

Many thanks!

-Zhao

> 
> On 9/14/2023 2:21 AM, Zhao Liu wrote:
> > From: Zhao Liu <zhao1.liu@intel.com>
> > 
> > Hi list,
> > 
> > (CC kvm@vger.kernel.org for better browsing.)
> > 
> > This is the our v4 patch series, rebased on the master branch at the
> > commit 9ef497755afc2 ("Merge tag 'pull-vfio-20230911' of
> > https://github.com/legoater/qemu into staging").
> > 
> > Comparing with v3 [1], v4 mainly refactors the CPUID[0x1F] encoding and
> > exposes module level in CPUID[0x1F] with these new patches:
> > 
> > * [PATCH v4 08/21] i386: Split topology types of CPUID[0x1F] from the
> > definitions of CPUID[0xB]
> > * [PATCH v4 09/21] i386: Decouple CPUID[0x1F] subleaf with specific
> > topology level
> > * [PATCH v4 12/21] i386: Expose module level in CPUID[0x1F]
> > 
> > v4 also fixes compile warnings and fixes cache topology uninitialization
> > bugs for some AMD CPUs.
> > 
> > Welcome your comments!
> > 
> > 
> > # Introduction
> > 
> > This series add the cluster support for x86 PC machine, which allows
> > x86 can use smp.clusters to configure the module level CPU topology
> > of x86.
> > 
> > And since the compatibility issue (see section: ## Why not share L2
> > cache in cluster directly), this series also introduce a new command
> > to adjust the topology of the x86 L2 cache.
> > 
> > Welcome your comments!
> > 
> > 
> > # Background
> > 
> > The "clusters" parameter in "smp" is introduced by ARM [2], but x86
> > hasn't supported it.
> > 
> > At present, x86 defaults L2 cache is shared in one core, but this is
> > not enough. There're some platforms that multiple cores share the
> > same L2 cache, e.g., Alder Lake-P shares L2 cache for one module of
> > Atom cores [3], that is, every four Atom cores shares one L2 cache.
> > Therefore, we need the new CPU topology level (cluster/module).
> > 
> > Another reason is for hybrid architecture. cluster support not only
> > provides another level of topology definition in x86, but would also
> > provide required code change for future our hybrid topology support.
> > 
> > 
> > # Overview
> > 
> > ## Introduction of module level for x86
> > 
> > "cluster" in smp is the CPU topology level which is between "core" and
> > die.
> > 
> > For x86, the "cluster" in smp is corresponding to the module level [4],
> > which is above the core level. So use the "module" other than "cluster"
> > in x86 code.
> > 
> > And please note that x86 already has a cpu topology level also named
> > "cluster" [4], this level is at the upper level of the package. Here,
> > the cluster in x86 cpu topology is completely different from the
> > "clusters" as the smp parameter. After the module level is introduced,
> > the cluster as the smp parameter will actually refer to the module level
> > of x86.
> > 
> > 
> > ## Why not share L2 cache in cluster directly
> > 
> > Though "clusters" was introduced to help define L2 cache topology
> > [2], using cluster to define x86's L2 cache topology will cause the
> > compatibility problem:
> > 
> > Currently, x86 defaults that the L2 cache is shared in one core, which
> > actually implies a default setting "cores per L2 cache is 1" and
> > therefore implicitly defaults to having as many L2 caches as cores.
> > 
> > For example (i386 PC machine):
> > -smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16 (*)
> > 
> > Considering the topology of the L2 cache, this (*) implicitly means "1
> > core per L2 cache" and "2 L2 caches per die".
> > 
> > If we use cluster to configure L2 cache topology with the new default
> > setting "clusters per L2 cache is 1", the above semantics will change
> > to "2 cores per cluster" and "1 cluster per L2 cache", that is, "2
> > cores per L2 cache".
> > 
> > So the same command (*) will cause changes in the L2 cache topology,
> > further affecting the performance of the virtual machine.
> > 
> > Therefore, x86 should only treat cluster as a cpu topology level and
> > avoid using it to change L2 cache by default for compatibility.
> > 
> > 
> > ## module level in CPUID
> > 
> > Linux kernel (from v6.4, with commit edc0a2b595765 ("x86/topology: Fix
> > erroneous smp_num_siblings on Intel Hybrid platforms") is able to
> > handle platforms with Module level enumerated via CPUID.1F.
> > 
> > Expose the module level in CPUID[0x1F] (for Intel CPUs) if the machine
> > has more than 1 modules since v3.
> > 
> > We can configure CPUID.04H.02H (L2 cache topology) with module level by
> > a new command:
> > 
> >          "-cpu,x-l2-cache-topo=cluster"
> > 
> > More information about this command, please see the section: "## New
> > property: x-l2-cache-topo".
> > 
> > 
> > ## New cache topology info in CPUCacheInfo
> > 
> > Currently, by default, the cache topology is encoded as:
> > 1. i/d cache is shared in one core.
> > 2. L2 cache is shared in one core.
> > 3. L3 cache is shared in one die.
> > 
> > This default general setting has caused a misunderstanding, that is, the
> > cache topology is completely equated with a specific cpu topology, such
> > as the connection between L2 cache and core level, and the connection
> > between L3 cache and die level.
> > 
> > In fact, the settings of these topologies depend on the specific
> > platform and are not static. For example, on Alder Lake-P, every
> > four Atom cores share the same L2 cache [2].
> > 
> > Thus, in this patch set, we explicitly define the corresponding cache
> > topology for different cpu models and this has two benefits:
> > 1. Easy to expand to new CPU models in the future, which has different
> >     cache topology.
> > 2. It can easily support custom cache topology by some command (e.g.,
> >     x-l2-cache-topo).
> > 
> > 
> > ## New property: x-l2-cache-topo
> > 
> > The property x-l2-cache-topo will be used to change the L2 cache
> > topology in CPUID.04H.
> > 
> > Now it allows user to set the L2 cache is shared in core level or
> > cluster level.
> > 
> > If user passes "-cpu x-l2-cache-topo=[core|cluster]" then older L2 cache
> > topology will be overrode by the new topology setting.
> > 
> > Since CPUID.04H is used by Intel CPUs, this property is available on
> > Intel CPUs as for now.
> > 
> > When necessary, it can be extended to CPUID[0x8000001D] for AMD CPUs.
> > 
> > 
> > # Patch description
> > 
> > patch 1-2 Cleanups about coding style and test name.
> > 
> > patch 3-5 Fixes about x86 topology and Intel l1 cache topology.
> > 
> > patch 6-7 Cleanups about topology related CPUID encoding and QEMU
> >            topology variables.
> > 
> > patch 8-9 Refactor CPUID[0x1F] encoding to prepare to introduce module
> >            level.
> > 
> > patch 10-16 Add the module as the new CPU topology level in x86, and it
> >              is corresponding to the cluster level in generic code.
> > 
> > patch 17,18,20 Add cache topology information in cache models.
> > 
> > patch 19 Update AMD CPUs' cache topology encoding.
> > 
> > patch 21 Introduce a new command to configure L2 cache topology.
> > 
> > 
> > [1]: https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg00022.html
> > [2]: https://patchew.org/QEMU/20211228092221.21068-1-wangyanan55@huawei.com/
> > [3]: https://www.intel.com/content/www/us/en/products/platforms/details/alder-lake-p.html
> > [4]: SDM, vol.3, ch.9, 9.9.1 Hierarchical Mapping of Shared Resources.
> > 
> > Best Regards,
> > Zhao
> > ---
> > Changelog:
> > 
> > Changes since v3 (main changes):
> >   * Expose module level in CPUID[0x1F].
> >   * Fix compile warnings. (Babu)
> >   * Fixes cache topology uninitialization bugs for some AMD CPUs. (Babu)
> > 
> > Changes since v2:
> >   * Add "Tested-by", "Reviewed-by" and "ACKed-by" tags.
> >   * Use newly added wrapped helper to get cores per socket in
> >     qemu_init_vcpu().
> > 
> > Changes since v1:
> >   * Reordered patches. (Yanan)
> >   * Deprecated the patch to fix comment of machine_parse_smp_config().
> >     (Yanan)
> >   * Rename test-x86-cpuid.c to test-x86-topo.c. (Yanan)
> >   * Split the intel's l1 cache topology fix into a new separate patch.
> >     (Yanan)
> >   * Combined module_id and APIC ID for module level support into one
> >     patch. (Yanan)
> >   * Make cache_into_passthrough case of cpuid 0x04 leaf in
> >   * cpu_x86_cpuid() use max_processor_ids_for_cache() and
> >     max_core_ids_in_package() to encode CPUID[4]. (Yanan)
> >   * Add the prefix "CPU_TOPO_LEVEL_*" for CPU topology level names.
> >     (Yanan)
> > ---
> > Zhao Liu (14):
> >    i386: Fix comment style in topology.h
> >    tests: Rename test-x86-cpuid.c to test-x86-topo.c
> >    hw/cpu: Update the comments of nr_cores and nr_dies
> >    i386/cpu: Fix i/d-cache topology to core level for Intel CPU
> >    i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4]
> >    i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
> >    i386: Split topology types of CPUID[0x1F] from the definitions of
> >      CPUID[0xB]
> >    i386: Decouple CPUID[0x1F] subleaf with specific topology level
> >    i386: Expose module level in CPUID[0x1F]
> >    i386: Add cache topology info in CPUCacheInfo
> >    i386: Use CPUCacheInfo.share_level to encode CPUID[4]
> >    i386: Use offsets get NumSharingCache for CPUID[0x8000001D].EAX[bits
> >      25:14]
> >    i386: Use CPUCacheInfo.share_level to encode
> >      CPUID[0x8000001D].EAX[bits 25:14]
> >    i386: Add new property to control L2 cache topo in CPUID.04H
> > 
> > Zhuocheng Ding (7):
> >    softmmu: Fix CPUSTATE.nr_cores' calculation
> >    i386: Introduce module-level cpu topology to CPUX86State
> >    i386: Support modules_per_die in X86CPUTopoInfo
> >    i386: Support module_id in X86CPUTopoIDs
> >    i386/cpu: Introduce cluster-id to X86CPU
> >    tests: Add test case of APIC ID for module level parsing
> >    hw/i386/pc: Support smp.clusters for x86 PC machine
> > 
> >   MAINTAINERS                                   |   2 +-
> >   hw/i386/pc.c                                  |   1 +
> >   hw/i386/x86.c                                 |  49 ++-
> >   include/hw/core/cpu.h                         |   2 +-
> >   include/hw/i386/topology.h                    |  68 ++--
> >   qemu-options.hx                               |  10 +-
> >   softmmu/cpus.c                                |   2 +-
> >   target/i386/cpu.c                             | 322 ++++++++++++++----
> >   target/i386/cpu.h                             |  46 ++-
> >   target/i386/kvm/kvm.c                         |   2 +-
> >   tests/unit/meson.build                        |   4 +-
> >   .../{test-x86-cpuid.c => test-x86-topo.c}     |  58 ++--
> >   12 files changed, 437 insertions(+), 129 deletions(-)
> >   rename tests/unit/{test-x86-cpuid.c => test-x86-topo.c} (73%)
> > 

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 01/21] i386: Fix comment style in topology.h
  2023-09-22 16:05   ` Moger, Babu
@ 2023-09-26  3:11     ` Zhao Liu
  0 siblings, 0 replies; 43+ messages in thread
From: Zhao Liu @ 2023-09-26  3:11 UTC (permalink / raw)
  To: Babu Moger
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti, qemu-devel, kvm, Zhenyu Wang,
	Xiaoyao Li, Zhao Liu

On Fri, Sep 22, 2023 at 11:05:59AM -0500, Moger, Babu wrote:
> Date: Fri, 22 Sep 2023 11:05:59 -0500
> From: "Moger, Babu" <bmoger@amd.com>
> Subject: Re: [PATCH v4 01/21] i386: Fix comment style in topology.h
> 
> 
> On 9/14/2023 2:21 AM, Zhao Liu wrote:
> > From: Zhao Liu <zhao1.liu@intel.com>
> > 
> > For function comments in this file, keep the comment style consistent
> > with other files in the directory.
> > 
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> > Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
> > Reviewed-by: Xiaoyao Li <xiaoyao.li@Intel.com>
> > Acked-by: Michael S. Tsirkin <mst@redhat.com>
> 
> Reviewed-by: Babu Moger <babu.moger@amd.com>

Thanks!

-Zhao

> 
> 
> > [...]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 21/21] i386: Add new property to control L2 cache topo in CPUID.04H
  2023-09-15  7:53     ` Zhao Liu
@ 2023-10-03 12:57       ` Michael S. Tsirkin
  2023-10-06 16:36         ` Zhao Liu
  0 siblings, 1 reply; 43+ messages in thread
From: Michael S. Tsirkin @ 2023-10-03 12:57 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Philippe Mathieu-Daudé,
	Eduardo Habkost, Marcel Apfelbaum, Yanan Wang, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti, qemu-devel, kvm, Zhenyu Wang,
	Xiaoyao Li, Babu Moger, Zhao Liu, Yongwei Ma

On Fri, Sep 15, 2023 at 03:53:25PM +0800, Zhao Liu wrote:
> Hi Philippe,
> 
> On Thu, Sep 14, 2023 at 09:41:30AM +0200, Philippe Mathieu-Daudé wrote:
> > Date: Thu, 14 Sep 2023 09:41:30 +0200
> > From: Philippe Mathieu-Daudé <philmd@linaro.org>
> > Subject: Re: [PATCH v4 21/21] i386: Add new property to control L2 cache
> >  topo in CPUID.04H
> > 
> > On 14/9/23 09:21, Zhao Liu wrote:
> > > From: Zhao Liu <zhao1.liu@intel.com>
> > > 
> > > The property x-l2-cache-topo will be used to change the L2 cache
> > > topology in CPUID.04H.
> > > 
> > > Now it allows user to set the L2 cache is shared in core level or
> > > cluster level.
> > > 
> > > If user passes "-cpu x-l2-cache-topo=[core|cluster]" then older L2 cache
> > > topology will be overrode by the new topology setting.
> > > 
> > > Here we expose to user "cluster" instead of "module", to be consistent
> > > with "cluster-id" naming.
> > > 
> > > Since CPUID.04H is used by intel CPUs, this property is available on
> > > intel CPUs as for now.
> > > 
> > > When necessary, it can be extended to CPUID.8000001DH for AMD CPUs.
> > > 
> > > (Tested the cache topology in CPUID[0x04] leaf with "x-l2-cache-topo=[
> > > core|cluster]", and tested the live migration between the QEMUs w/ &
> > > w/o this patch series.)
> > > 
> > > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > > Tested-by: Yongwei Ma <yongwei.ma@intel.com>
> > > ---
> > > Changes since v3:
> > >   * Add description about test for live migration compatibility. (Babu)
> > > 
> > > Changes since v1:
> > >   * Rename MODULE branch to CPU_TOPO_LEVEL_MODULE to match the previous
> > >     renaming changes.
> > > ---
> > >   target/i386/cpu.c | 34 +++++++++++++++++++++++++++++++++-
> > >   target/i386/cpu.h |  2 ++
> > >   2 files changed, 35 insertions(+), 1 deletion(-)
> > 
> > 
> > > @@ -8079,6 +8110,7 @@ static Property x86_cpu_properties[] = {
> > >                        false),
> > >       DEFINE_PROP_BOOL("x-intel-pt-auto-level", X86CPU, intel_pt_auto_level,
> > >                        true),
> > > +    DEFINE_PROP_STRING("x-l2-cache-topo", X86CPU, l2_cache_topo_level),
> > 
> > We use the 'x-' prefix for unstable features, is it the case here?
> 
> I thought that if we can have a more general CLI way to define cache
> topology in the future, then this option can be removed.
> 
> I'm not sure if this option could be treated as unstable, what do you
> think?
> 
> 
> Thanks,
> Zhao

Then, please work on this new generic thing.
What we don't want is people relying on unstable options.

> > 
> > >       DEFINE_PROP_END_OF_LIST()
> > >   };
> > 


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 21/21] i386: Add new property to control L2 cache topo in CPUID.04H
  2023-10-03 12:57       ` Michael S. Tsirkin
@ 2023-10-06 16:36         ` Zhao Liu
  2023-10-18 14:12           ` Zhao Liu
  0 siblings, 1 reply; 43+ messages in thread
From: Zhao Liu @ 2023-10-06 16:36 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Philippe Mathieu-Daudé,
	Eduardo Habkost, Marcel Apfelbaum, Yanan Wang, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti, qemu-devel, kvm, Zhenyu Wang,
	Xiaoyao Li, Babu Moger, Zhao Liu, Yongwei Ma, Jonathan Cameron

Hi Michael,

On Tue, Oct 03, 2023 at 08:57:27AM -0400, Michael S. Tsirkin wrote:
> Date: Tue, 3 Oct 2023 08:57:27 -0400
> From: "Michael S. Tsirkin" <mst@redhat.com>
> Subject: Re: [PATCH v4 21/21] i386: Add new property to control L2 cache
>  topo in CPUID.04H
> 
> On Fri, Sep 15, 2023 at 03:53:25PM +0800, Zhao Liu wrote:
> [...]
> > > We use the 'x-' prefix for unstable features, is it the case here?
> > 
> > I thought that if we can have a more general CLI way to define cache
> > topology in the future, then this option can be removed.
> > 
> > I'm not sure if this option could be treated as unstable, what do you
> > think?
> > 
> > 
> > Thanks,
> > Zhao
> 
> Then, please work on this new generic thing.
> What we don't want is people relying on unstable options.
> 

Okay, I'll remove this option in the next refresh.

BTW, about the generic cache topology, what about porting this option to
smp? Just like:

-smp cpus=4,sockets=2,cores=2,threads=1, \
     l3-cache=socket,l2-cache=core,l1-i-cache=core,l1-d-cache=core

From the previous discussion [1] with Jonathan, it seems this format
could also meet the requirement for ARM.

If you like this, I'll move forward in this direction. ;-)

[1]: https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg03997.html

Thanks,
Zhao


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 00/21] Support smp.clusters for x86 in QEMU
  2023-09-14  7:21 [PATCH v4 00/21] Support smp.clusters for x86 in QEMU Zhao Liu
                   ` (21 preceding siblings ...)
  2023-09-22 16:03 ` [PATCH v4 00/21] Support smp.clusters for x86 in QEMU Moger, Babu
@ 2023-10-18 12:06 ` Michael S. Tsirkin
  2023-10-18 14:08   ` Zhao Liu
  22 siblings, 1 reply; 43+ messages in thread
From: Michael S. Tsirkin @ 2023-10-18 12:06 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Richard Henderson, Paolo Bonzini, Marcelo Tosatti,
	qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu

On Thu, Sep 14, 2023 at 03:21:38PM +0800, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
> 
> Hi list,
> 
> (CC kvm@vger.kernel.org for better browsing.)
> 
> This is our v4 patch series, rebased on the master branch at the
> commit 9ef497755afc2 ("Merge tag 'pull-vfio-20230911' of
> https://github.com/legoater/qemu into staging").
> 
> Compared with v3 [1], v4 mainly refactors the CPUID[0x1F] encoding and
> exposes the module level in CPUID[0x1F] with these new patches:
> 
> * [PATCH v4 08/21] i386: Split topology types of CPUID[0x1F] from the
> definitions of CPUID[0xB]
> * [PATCH v4 09/21] i386: Decouple CPUID[0x1F] subleaf with specific
> topology level
> * [PATCH v4 12/21] i386: Expose module level in CPUID[0x1F]
> 
> v4 also fixes compile warnings and fixes bugs where the cache topology
> was left uninitialized for some AMD CPUs.
> 
> Comments are welcome!

Acked-by: Michael S. Tsirkin <mst@redhat.com>


> 
> # Introduction
> 
> This series add the cluster support for x86 PC machine, which allows
> x86 can use smp.clusters to configure the module level CPU topology
> of x86.
> 
> And since the compatibility issue (see section: ## Why not share L2
> cache in cluster directly), this series also introduce a new command
> to adjust the topology of the x86 L2 cache.
> 
> Welcome your comments!
> 
> 
> # Background
> 
> The "clusters" parameter in "smp" is introduced by ARM [2], but x86
> hasn't supported it.
> 
> At present, x86 defaults L2 cache is shared in one core, but this is
> not enough. There're some platforms that multiple cores share the
> same L2 cache, e.g., Alder Lake-P shares L2 cache for one module of
> Atom cores [3], that is, every four Atom cores shares one L2 cache.
> Therefore, we need the new CPU topology level (cluster/module).
> 
> Another reason is for hybrid architecture. cluster support not only
> provides another level of topology definition in x86, but would also
> provide required code change for future our hybrid topology support.
> 
> 
> # Overview
> 
> ## Introduction of module level for x86
> 
> "cluster" in smp is the CPU topology level which is between "core" and
> die.
> 
> For x86, the "cluster" in smp is corresponding to the module level [4],
> which is above the core level. So use the "module" other than "cluster"
> in x86 code.
> 
> And please note that x86 already has a cpu topology level also named
> "cluster" [4], this level is at the upper level of the package. Here,
> the cluster in x86 cpu topology is completely different from the
> "clusters" as the smp parameter. After the module level is introduced,
> the cluster as the smp parameter will actually refer to the module level
> of x86.
> 
> 
> ## Why not share L2 cache in cluster directly
> 
> Though "clusters" was introduced to help define L2 cache topology
> [2], using cluster to define x86's L2 cache topology will cause the
> compatibility problem:
> 
> Currently, x86 defaults that the L2 cache is shared in one core, which
> actually implies a default setting "cores per L2 cache is 1" and
> therefore implicitly defaults to having as many L2 caches as cores.
> 
> For example (i386 PC machine):
> -smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16 (*)
> 
> Considering the topology of the L2 cache, this (*) implicitly means "1
> core per L2 cache" and "2 L2 caches per die".
> 
> If we use cluster to configure L2 cache topology with the new default
> setting "clusters per L2 cache is 1", the above semantics will change
> to "2 cores per cluster" and "1 cluster per L2 cache", that is, "2
> cores per L2 cache".
> 
> So the same command (*) will cause changes in the L2 cache topology,
> further affecting the performance of the virtual machine.
> 
> Therefore, x86 should only treat cluster as a cpu topology level and
> avoid using it to change L2 cache by default for compatibility.
> 
> 
> ## module level in CPUID
> 
> Linux kernel (from v6.4, with commit edc0a2b595765 ("x86/topology: Fix
> erroneous smp_num_siblings on Intel Hybrid platforms") is able to
> handle platforms with Module level enumerated via CPUID.1F.
> 
> Expose the module level in CPUID[0x1F] (for Intel CPUs) if the machine
> has more than 1 modules since v3.
> 
> We can configure CPUID.04H.02H (L2 cache topology) with module level by
> a new command:
> 
>         "-cpu,x-l2-cache-topo=cluster"
> 
> More information about this command, please see the section: "## New
> property: x-l2-cache-topo".
> 
> 
> ## New cache topology info in CPUCacheInfo
> 
> Currently, by default, the cache topology is encoded as:
> 1. i/d cache is shared in one core.
> 2. L2 cache is shared in one core.
> 3. L3 cache is shared in one die.
> 
> This default general setting has caused a misunderstanding, that is, the
> cache topology is completely equated with a specific cpu topology, such
> as the connection between L2 cache and core level, and the connection
> between L3 cache and die level.
> 
> In fact, the settings of these topologies depend on the specific
> platform and are not static. For example, on Alder Lake-P, every
> four Atom cores share the same L2 cache [2].
> 
> Thus, in this patch set, we explicitly define the corresponding cache
> topology for different cpu models and this has two benefits:
> 1. Easy to expand to new CPU models in the future, which has different
>    cache topology.
> 2. It can easily support custom cache topology by some command (e.g.,
>    x-l2-cache-topo).
> 
> 
> ## New property: x-l2-cache-topo
> 
> The property x-l2-cache-topo will be used to change the L2 cache
> topology in CPUID.04H.
> 
> Now it allows user to set the L2 cache is shared in core level or
> cluster level.
> 
> If user passes "-cpu x-l2-cache-topo=[core|cluster]" then older L2 cache
> topology will be overrode by the new topology setting.
> 
> Since CPUID.04H is used by Intel CPUs, this property is available on
> Intel CPUs as for now.
> 
> When necessary, it can be extended to CPUID[0x8000001D] for AMD CPUs.
> 
> 
> # Patch description
> 
> patch 1-2 Cleanups about coding style and test name.
> 
> patch 3-5 Fixes about x86 topology and Intel l1 cache topology.
> 
> patch 6-7 Cleanups about topology related CPUID encoding and QEMU
>           topology variables.
> 
> patch 8-9 Refactor CPUID[0x1F] encoding to prepare to introduce module
>           level.
> 
> patch 10-16 Add the module as the new CPU topology level in x86; it
>             corresponds to the cluster level in generic code.
> 
> patch 17,18,20 Add cache topology information in cache models.
> 
> patch 19 Update AMD CPUs' cache topology encoding.
> 
> patch 21 Introduce a new command to configure L2 cache topology.
> 
> 
> [1]: https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg00022.html
> [2]: https://patchew.org/QEMU/20211228092221.21068-1-wangyanan55@huawei.com/
> [3]: https://www.intel.com/content/www/us/en/products/platforms/details/alder-lake-p.html
> [4]: SDM, vol.3, ch.9, 9.9.1 Hierarchical Mapping of Shared Resources.
> 
> Best Regards,
> Zhao
> ---
> Changelog:
> 
> Changes since v3 (main changes):
>  * Expose module level in CPUID[0x1F].
>  * Fix compile warnings. (Babu)
>  * Fix uninitialized cache topology for some AMD CPUs. (Babu)
> 
> Changes since v2:
>  * Add "Tested-by", "Reviewed-by" and "ACKed-by" tags.
>  * Use newly added wrapped helper to get cores per socket in
>    qemu_init_vcpu().
> 
> Changes since v1:
>  * Reordered patches. (Yanan)
>  * Deprecated the patch to fix comment of machine_parse_smp_config().
>    (Yanan)
>  * Rename test-x86-cpuid.c to test-x86-topo.c. (Yanan)
>  * Split the intel's l1 cache topology fix into a new separate patch.
>    (Yanan)
>  * Combined module_id and APIC ID for module level support into one
>    patch. (Yanan)
>  * Make the cache_into_passthrough case of the CPUID 0x04 leaf in
>    cpu_x86_cpuid() use max_processor_ids_for_cache() and
>    max_core_ids_in_package() to encode CPUID[4]. (Yanan)
>  * Add the prefix "CPU_TOPO_LEVEL_*" for CPU topology level names.
>    (Yanan)
> ---
> Zhao Liu (14):
>   i386: Fix comment style in topology.h
>   tests: Rename test-x86-cpuid.c to test-x86-topo.c
>   hw/cpu: Update the comments of nr_cores and nr_dies
>   i386/cpu: Fix i/d-cache topology to core level for Intel CPU
>   i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4]
>   i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
>   i386: Split topology types of CPUID[0x1F] from the definitions of
>     CPUID[0xB]
>   i386: Decouple CPUID[0x1F] subleaf with specific topology level
>   i386: Expose module level in CPUID[0x1F]
>   i386: Add cache topology info in CPUCacheInfo
>   i386: Use CPUCacheInfo.share_level to encode CPUID[4]
>   i386: Use offsets get NumSharingCache for CPUID[0x8000001D].EAX[bits
>     25:14]
>   i386: Use CPUCacheInfo.share_level to encode
>     CPUID[0x8000001D].EAX[bits 25:14]
>   i386: Add new property to control L2 cache topo in CPUID.04H
> 
> Zhuocheng Ding (7):
>   softmmu: Fix CPUSTATE.nr_cores' calculation
>   i386: Introduce module-level cpu topology to CPUX86State
>   i386: Support modules_per_die in X86CPUTopoInfo
>   i386: Support module_id in X86CPUTopoIDs
>   i386/cpu: Introduce cluster-id to X86CPU
>   tests: Add test case of APIC ID for module level parsing
>   hw/i386/pc: Support smp.clusters for x86 PC machine
> 
>  MAINTAINERS                                   |   2 +-
>  hw/i386/pc.c                                  |   1 +
>  hw/i386/x86.c                                 |  49 ++-
>  include/hw/core/cpu.h                         |   2 +-
>  include/hw/i386/topology.h                    |  68 ++--
>  qemu-options.hx                               |  10 +-
>  softmmu/cpus.c                                |   2 +-
>  target/i386/cpu.c                             | 322 ++++++++++++++----
>  target/i386/cpu.h                             |  46 ++-
>  target/i386/kvm/kvm.c                         |   2 +-
>  tests/unit/meson.build                        |   4 +-
>  .../{test-x86-cpuid.c => test-x86-topo.c}     |  58 ++--
>  12 files changed, 437 insertions(+), 129 deletions(-)
>  rename tests/unit/{test-x86-cpuid.c => test-x86-topo.c} (73%)
> 
> -- 
> 2.34.1



* Re: [PATCH v4 00/21] Support smp.clusters for x86 in QEMU
  2023-10-18 12:06 ` Michael S. Tsirkin
@ 2023-10-18 14:08   ` Zhao Liu
  0 siblings, 0 replies; 43+ messages in thread
From: Zhao Liu @ 2023-10-18 14:08 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Richard Henderson, Paolo Bonzini, Marcelo Tosatti,
	qemu-devel, kvm, Zhenyu Wang, Xiaoyao Li, Babu Moger, Zhao Liu

Hi Michael,

On Wed, Oct 18, 2023 at 08:06:47AM -0400, Michael S. Tsirkin wrote:
> Date: Wed, 18 Oct 2023 08:06:47 -0400
> From: "Michael S. Tsirkin" <mst@redhat.com>
> Subject: Re: [PATCH v4 00/21] Support smp.clusters for x86 in QEMU
> 
> On Thu, Sep 14, 2023 at 03:21:38PM +0800, Zhao Liu wrote:
> > From: Zhao Liu <zhao1.liu@intel.com>
> > 
> > Hi list,
> > 
> > (CC kvm@vger.kernel.org for better browsing.)
> > 
> > This is the our v4 patch series, rebased on the master branch at the
> > commit 9ef497755afc2 ("Merge tag 'pull-vfio-20230911' of
> > https://github.com/legoater/qemu into staging").
> > 
> > Comparing with v3 [1], v4 mainly refactors the CPUID[0x1F] encoding and
> > exposes module level in CPUID[0x1F] with these new patches:
> > 
> > * [PATCH v4 08/21] i386: Split topology types of CPUID[0x1F] from the
> > definitions of CPUID[0xB]
> > * [PATCH v4 09/21] i386: Decouple CPUID[0x1F] subleaf with specific
> > topology level
> > * [PATCH v4 12/21] i386: Expose module level in CPUID[0x1F]
> > 
> > v4 also fixes compile warnings and fixes cache topology uninitialization
> > bugs for some AMD CPUs.
> > 
> > Welcome your comments!
> 
> Acked-by: Michael S. Tsirkin <mst@redhat.com>

Thanks for your Ack!

Just to double check, does this mean that all patches in this series
(except the last one [5] about x-l2-cache-topo) got acknowledged, not
just the "PC things"? ;-)

Then I'll refresh v5 without the x-l2-cache-topo property.

[5]: https://lists.nongnu.org/archive/html/qemu-devel/2023-10/msg00438.html

Regards,
Zhao


* Re: [PATCH v4 21/21] i386: Add new property to control L2 cache topo in CPUID.04H
  2023-10-06 16:36         ` Zhao Liu
@ 2023-10-18 14:12           ` Zhao Liu
  0 siblings, 0 replies; 43+ messages in thread
From: Zhao Liu @ 2023-10-18 14:12 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Philippe Mathieu-Daudé,
	Eduardo Habkost, Marcel Apfelbaum, Yanan Wang, Richard Henderson,
	Paolo Bonzini, Marcelo Tosatti, qemu-devel, kvm, Zhenyu Wang,
	Xiaoyao Li, Babu Moger, Zhao Liu, Yongwei Ma, Jonathan Cameron

Hi Michael,

On Sat, Oct 07, 2023 at 12:36:41AM +0800, Zhao Liu wrote:
> Date: Sat, 7 Oct 2023 00:36:41 +0800
> From: Zhao Liu <zhao1.liu@linux.intel.com>
> Subject: Re: [PATCH v4 21/21] i386: Add new property to control L2 cache
>  topo in CPUID.04H
> 
> Hi Michael,
> 
> On Tue, Oct 03, 2023 at 08:57:27AM -0400, Michael S. Tsirkin wrote:
> > Date: Tue, 3 Oct 2023 08:57:27 -0400
> > From: "Michael S. Tsirkin" <mst@redhat.com>
> > Subject: Re: [PATCH v4 21/21] i386: Add new property to control L2 cache
> >  topo in CPUID.04H
> > 
> > On Fri, Sep 15, 2023 at 03:53:25PM +0800, Zhao Liu wrote:
> > > Hi Philippe,
> > > 
> > > > On Thu, Sep 14, 2023 at 09:41:30AM +0200, Philippe Mathieu-Daudé wrote:
> > > > Date: Thu, 14 Sep 2023 09:41:30 +0200
> > > > > From: Philippe Mathieu-Daudé <philmd@linaro.org>
> > > > Subject: Re: [PATCH v4 21/21] i386: Add new property to control L2 cache
> > > >  topo in CPUID.04H
> > > > 
> > > > On 14/9/23 09:21, Zhao Liu wrote:
> > > > > From: Zhao Liu <zhao1.liu@intel.com>
> > > > > 
> > > > > The property x-l2-cache-topo will be used to change the L2 cache
> > > > > topology in CPUID.04H.
> > > > > 
> > > > > For now it allows the user to choose whether the L2 cache is
> > > > > shared at the core level or at the cluster level.
> > > > > 
> > > > > If the user passes "-cpu x-l2-cache-topo=[core|cluster]", the old L2
> > > > > cache topology will be overridden by the new topology setting.
> > > > > 
> > > > > Here we expose "cluster" to the user instead of "module", to be
> > > > > consistent with the "cluster-id" naming.
> > > > > 
> > > > > Since CPUID.04H is used by Intel CPUs, this property is available on
> > > > > Intel CPUs only for now.
> > > > > 
> > > > > When necessary, it can be extended to CPUID.8000001DH for AMD CPUs.
> > > > > 
> > > > > (Tested the cache topology in the CPUID[0x04] leaf with
> > > > > "x-l2-cache-topo=[core|cluster]", and tested live migration between
> > > > > QEMU builds with and without this patch series.)
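> > > > > 
> > > > > As a quick guest-side check (illustration only, not part of this
> > > > > patch), one can read CPUID.04H subleaf 2 and decode EAX[25:14]:
> > > > > 
> > > > >     #include <cpuid.h>
> > > > >     #include <stdio.h>
> > > > > 
> > > > >     int main(void)
> > > > >     {
> > > > >         unsigned a, b, c, d;
> > > > > 
> > > > >         /* leaf 4, subleaf 2 = L2; EAX[4:0] == 0 means no cache */
> > > > >         if (!__get_cpuid_count(4, 2, &a, &b, &c, &d) || !(a & 0x1f)) {
> > > > >             return 1;
> > > > >         }
> > > > >         /* EAX[25:14] = max addressable IDs sharing this cache - 1 */
> > > > >         printf("L2 shared by up to %u logical CPUs\n",
> > > > >                ((a >> 14) & 0xfff) + 1);
> > > > >         return 0;
> > > > >     }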
> > > > > 
> > > > > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > > > > Tested-by: Yongwei Ma <yongwei.ma@intel.com>
> > > > > ---
> > > > > Changes since v3:
> > > > >   * Add description about test for live migration compatibility. (Babu)
> > > > > 
> > > > > Changes since v1:
> > > > >   * Rename MODULE branch to CPU_TOPO_LEVEL_MODULE to match the previous
> > > > >     renaming changes.
> > > > > ---
> > > > >   target/i386/cpu.c | 34 +++++++++++++++++++++++++++++++++-
> > > > >   target/i386/cpu.h |  2 ++
> > > > >   2 files changed, 35 insertions(+), 1 deletion(-)
> > > > 
> > > > 
> > > > > @@ -8079,6 +8110,7 @@ static Property x86_cpu_properties[] = {
> > > > >                        false),
> > > > >       DEFINE_PROP_BOOL("x-intel-pt-auto-level", X86CPU, intel_pt_auto_level,
> > > > >                        true),
> > > > > +    DEFINE_PROP_STRING("x-l2-cache-topo", X86CPU, l2_cache_topo_level),
> > > > 
> > > > We use the 'x-' prefix for unstable features; is that the case here?
> > > 
> > > My thinking was that if we have a more general CLI way to define the
> > > cache topology in the future, then this option can be removed.
> > > 
> > > I'm not sure whether this option should be treated as unstable; what
> > > do you think?
> > > 
> > > 
> > > Thanks,
> > > Zhao
> > 
> > Then, please work on this new generic thing.
> > What we don't want is people relying on unstable options.
> > 
> 
> Okay, I'll remove this option in the next refresh.
> 
> BTW, for the generic cache topology, what about porting this option to
> smp? Something like:
> 
> -smp cpus=4,sockets=2,cores=2,threads=1, \
>      l3-cache=socket,l2-cache=core,l1-i-cache=core,l1-d-cache=core
> 
> From the previous discussion [1] with Jonathan, it seems this format
> could also meet the requirements of ARM.
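> 
> For instance, an Alder-Lake-P-like module layout could then be
> described as (hypothetical syntax, reusing the option names above):
> 
> -smp 8,sockets=1,clusters=2,cores=4,threads=1, \
>      l3-cache=socket,l2-cache=cluster,l1-i-cache=core,l1-d-cache=core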
> 
> If you like this, I'll move forward in this direction. ;-)
> 
> [1]: https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg03997.html

Let me get this thread warmed up again.

Could I get your initial thoughts on the general cache topology idea
above?

Thanks,
Zhao


