* [PATCH RESEND 00/18] Support smp.clusters for x86
@ 2023-02-13  9:36 Zhao Liu
  2023-02-13  9:36 ` [PATCH RESEND 01/18] machine: Fix comment of machine_parse_smp_config() Zhao Liu
                   ` (17 more replies)
  0 siblings, 18 replies; 61+ messages in thread
From: Zhao Liu @ 2023-02-13  9:36 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Eric Blake, Markus Armbruster
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

Hi list,

This is the RESEND version of this patch series, rebased on the
latest commit of the master branch (3b33ae4: Merge tag 'block-
pull-request' of https://gitlab.com/stefanha/qemu into staging),
so that we can send subsequent RFCs of the hybrid topology series.

This series adds cluster support for the x86 PC machine, which allows
x86 to use smp.clusters to configure the module-level CPU topology
of x86.

And because of a compatibility issue (see section: ## Why not share L2
cache in cluster directly), this series also introduces a new property
to adjust the topology of the x86 L2 cache.

Comments are welcome!


# Background

The "clusters" parameter in "smp" was introduced for ARM [1], but x86
does not support it yet.

At present, x86 defaults to the L2 cache being shared within one core,
but this is not enough. There are platforms on which multiple cores
share the same L2 cache, e.g., Alder Lake-P shares one L2 cache per
module of Atom cores [2], that is, every four Atom cores share one L2
cache. Therefore, we need the new CPU topology level (cluster/module).

Another reason is the hybrid architecture. Cluster support not only
provides another level of topology definition for x86, but also
provides the code changes required for our future hybrid topology
support.


# Overview

## Introduction of module level for x86

"cluster" in smp is the CPU topology level between "core" and "die".

For x86, the "cluster" in smp corresponds to the module level [3],
which is above the core level. So we use "module" rather than
"cluster" in the x86 code.

And please note that x86 already has a CPU topology level also named
"cluster" [3], which sits above the package level. That cluster in the
x86 CPU topology is completely different from "clusters" as the smp
parameter. After the module level is introduced, the cluster smp
parameter will actually refer to the module level of x86.


## Why not share L2 cache in cluster directly

Though "clusters" was introduced to help define the L2 cache topology
[1], using cluster to define x86's L2 cache topology would cause a
compatibility problem:

Currently, x86 defaults to the L2 cache being shared in one core,
which actually implies a default setting of "1 core per L2 cache" and
therefore implicitly defaults to having as many L2 caches as cores.

For example (i386 PC machine):
-smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16 (*)

Considering the topology of the L2 cache, this (*) implicitly means "1
core per L2 cache" and "2 L2 caches per die".

If we used cluster to configure the L2 cache topology with a new
default setting of "1 cluster per L2 cache", the above semantics would
change to "2 cores per cluster" and "1 cluster per L2 cache", that is,
"2 cores per L2 cache".

So the same command (*) would change the L2 cache topology, in turn
affecting the performance of the virtual machine.

Therefore, for compatibility, x86 should only treat cluster as a CPU
topology level and avoid using it to change the L2 cache topology by
default.


## module level in CPUID

Currently, we don't expose the module level in CPUID.1FH because Linux
(v6.2-rc6) doesn't support the module level yet, and exposing the
module and die levels at the same time in CPUID.1FH would cause Linux
to calculate a wrong die_id. The module level should not be exposed
until real machines have the module level in CPUID.1FH.

We can configure CPUID.04H.02H (L2 cache topology) with the module
level via a new property:

        "-cpu x-l2-cache-topo=cluster"

For more information about this property, please see the section:
"## New property: x-l2-cache-topo".


## New cache topology info in CPUCacheInfo

Currently, by default, the cache topology is encoded as:
1. i/d cache is shared in one core.
2. L2 cache is shared in one core.
3. L3 cache is shared in one die.

This general default has caused a misunderstanding: that the cache
topology is completely equated with a specific CPU topology, such as a
fixed connection between the L2 cache and the core level, or between
the L3 cache and the die level.

In fact, the settings of these topologies depend on the specific
platform and are not static. For example, on Alder Lake-P, every
four Atom cores share the same L2 cache [2].

Thus, in this patch set, we explicitly define the corresponding cache
topology for different CPU models, which has two benefits:
1. It is easy to extend to new CPU models in the future that have a
   different cache topology.
2. It can easily support a custom cache topology via a property (e.g.,
   x-l2-cache-topo).


## New property: x-l2-cache-topo

The property x-l2-cache-topo will be used to change the L2 cache
topology in CPUID.04H.

It allows the user to set whether the L2 cache is shared at the core
level or at the cluster level.

If the user passes "-cpu x-l2-cache-topo=[core|cluster]", the old L2
cache topology will be overridden by the new topology setting.

Since CPUID.04H is used by Intel CPUs, this property is only available
on Intel CPUs for now.

When necessary, it can be extended to CPUID[0x8000001D] for AMD CPUs.


# Patch description

patch 1-5: Cleanups and fixes related to subsequent changes.

patch 6-13: Add the module as the new CPU topology level in x86; it
            corresponds to the cluster level in generic code.

patch 14-17: Add cache topology information to the cache models.

patch 18: Introduce a new property to configure the L2 cache topology.


[1]: https://patchew.org/QEMU/20211228092221.21068-1-wangyanan55@huawei.com/
[2]: https://www.intel.com/content/www/us/en/products/platforms/details/alder-lake-p.html
[3]: SDM, vol.3, ch.9, 9.9.1 Hierarchical Mapping of Shared Resources.

---
Zhao Liu (10):
  machine: Fix comment of machine_parse_smp_config()
  tests: Rename test-x86-cpuid.c to test-x86-apicid.c
  i386/cpu: Fix number of addressable IDs in CPUID.04H
  i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
  i386: Fix comment style in topology.h
  i386: Add cache topology info in CPUCacheInfo
  i386: Use CPUCacheInfo.share_level to encode CPUID[4].EAX[bits 25:14]
  i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]
  i386: Use CPUCacheInfo.share_level to encode
    CPUID[0x8000001D].EAX[bits 25:14]
  i386: Add new property to control L2 cache topo in CPUID.04H

Zhuocheng Ding (8):
  softmmu: Fix CPUSTATE.nr_cores' calculation
  i386: Introduce module-level cpu topology to CPUX86State
  i386: Support modules_per_die in X86CPUTopoInfo
  i386: Support module_id in X86CPUTopoIDs
  i386: Update APIC ID parsing rule to support module level
  i386/cpu: Introduce cluster-id to X86CPU
  tests: Add test case of APIC ID for module level parsing
  hw/i386/pc: Support smp.clusters for x86 PC machine

 MAINTAINERS                                   |   2 +-
 hw/core/machine-smp.c                         |   7 +-
 hw/i386/pc.c                                  |   1 +
 hw/i386/x86.c                                 |  49 +++++-
 include/hw/core/cpu.h                         |   2 +-
 include/hw/i386/topology.h                    |  68 +++++---
 qemu-options.hx                               |  10 +-
 softmmu/cpus.c                                |   2 +-
 target/i386/cpu.c                             | 146 ++++++++++++++----
 target/i386/cpu.h                             |  25 +++
 tests/unit/meson.build                        |   4 +-
 .../{test-x86-cpuid.c => test-x86-apicid.c}   |  58 ++++---
 12 files changed, 275 insertions(+), 99 deletions(-)
 rename tests/unit/{test-x86-cpuid.c => test-x86-apicid.c} (73%)

-- 
2.34.1



^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH RESEND 01/18] machine: Fix comment of machine_parse_smp_config()
  2023-02-13  9:36 [PATCH RESEND 00/18] Support smp.clusters for x86 Zhao Liu
@ 2023-02-13  9:36 ` Zhao Liu
  2023-02-13 13:31   ` wangyanan (Y) via
  2023-02-13  9:36 ` [PATCH RESEND 02/18] tests: Rename test-x86-cpuid.c to test-x86-apicid.c Zhao Liu
                   ` (16 subsequent siblings)
  17 siblings, 1 reply; 61+ messages in thread
From: Zhao Liu @ 2023-02-13  9:36 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Eric Blake, Markus Armbruster
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

Now that smp supports dies and clusters, add descriptions of these 2
levels to the comment of machine_parse_smp_config().

Fixes: 864c3b5 (hw/core/machine: Introduce CPU cluster topology support)
Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 hw/core/machine-smp.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/hw/core/machine-smp.c b/hw/core/machine-smp.c
index c3dab007dadc..3fd9e641efde 100644
--- a/hw/core/machine-smp.c
+++ b/hw/core/machine-smp.c
@@ -51,8 +51,8 @@ static char *cpu_hierarchy_to_string(MachineState *ms)
  * machine_parse_smp_config: Generic function used to parse the given
  *                           SMP configuration
  *
- * Any missing parameter in "cpus/maxcpus/sockets/cores/threads" will be
- * automatically computed based on the provided ones.
+ * Any missing parameter in "cpus/maxcpus/sockets/dies/clusters/cores/threads"
+ * will be automatically computed based on the provided ones.
  *
  * In the calculation of omitted sockets/cores/threads: we prefer sockets
  * over cores over threads before 6.2, while preferring cores over sockets
@@ -66,7 +66,8 @@ static char *cpu_hierarchy_to_string(MachineState *ms)
  *
  * For compatibility, apart from the parameters that will be computed, newly
  * introduced topology members which are likely to be target specific should
- * be directly set as 1 if they are omitted (e.g. dies for PC since 4.1).
+ * be directly set as 1 if they are omitted (e.g. dies for PC since v4.1 and
+ * clusters for arm since v7.0).
  */
 void machine_parse_smp_config(MachineState *ms,
                               const SMPConfiguration *config, Error **errp)
-- 
2.34.1




* [PATCH RESEND 02/18] tests: Rename test-x86-cpuid.c to test-x86-apicid.c
  2023-02-13  9:36 [PATCH RESEND 00/18] Support smp.clusters for x86 Zhao Liu
  2023-02-13  9:36 ` [PATCH RESEND 01/18] machine: Fix comment of machine_parse_smp_config() Zhao Liu
@ 2023-02-13  9:36 ` Zhao Liu
  2023-02-15  2:36   ` wangyanan (Y) via
  2023-02-13  9:36 ` [PATCH RESEND 03/18] softmmu: Fix CPUSTATE.nr_cores' calculation Zhao Liu
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 61+ messages in thread
From: Zhao Liu @ 2023-02-13  9:36 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Eric Blake, Markus Armbruster
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

In fact, this unit test covers APIC ID rather than CPUID.
Rename it to test-x86-apicid.c to make its name more in line with its
actual content.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 MAINTAINERS                                        | 2 +-
 tests/unit/meson.build                             | 4 ++--
 tests/unit/{test-x86-cpuid.c => test-x86-apicid.c} | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)
 rename tests/unit/{test-x86-cpuid.c => test-x86-apicid.c} (99%)

diff --git a/MAINTAINERS b/MAINTAINERS
index 96e25f62acaa..71c1bc24371b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1679,7 +1679,7 @@ F: include/hw/southbridge/piix.h
 F: hw/misc/sga.c
 F: hw/isa/apm.c
 F: include/hw/isa/apm.h
-F: tests/unit/test-x86-cpuid.c
+F: tests/unit/test-x86-apicid.c
 F: tests/qtest/test-x86-cpuid-compat.c
 
 PC Chipset
diff --git a/tests/unit/meson.build b/tests/unit/meson.build
index ffa444f4323c..a9df2843e92e 100644
--- a/tests/unit/meson.build
+++ b/tests/unit/meson.build
@@ -20,8 +20,8 @@ tests = {
   'test-opts-visitor': [testqapi],
   'test-visitor-serialization': [testqapi],
   'test-bitmap': [],
-  # all code tested by test-x86-cpuid is inside topology.h
-  'test-x86-cpuid': [],
+  # all code tested by test-x86-apicid is inside topology.h
+  'test-x86-apicid': [],
   'test-cutils': [],
   'test-div128': [],
   'test-shift128': [],
diff --git a/tests/unit/test-x86-cpuid.c b/tests/unit/test-x86-apicid.c
similarity index 99%
rename from tests/unit/test-x86-cpuid.c
rename to tests/unit/test-x86-apicid.c
index bfabc0403a1a..2b104f86d7c2 100644
--- a/tests/unit/test-x86-cpuid.c
+++ b/tests/unit/test-x86-apicid.c
@@ -1,5 +1,5 @@
 /*
- *  Test code for x86 CPUID and Topology functions
+ *  Test code for x86 APIC ID and Topology functions
  *
  *  Copyright (c) 2012 Red Hat Inc.
  *
-- 
2.34.1




* [PATCH RESEND 03/18] softmmu: Fix CPUSTATE.nr_cores' calculation
  2023-02-13  9:36 [PATCH RESEND 00/18] Support smp.clusters for x86 Zhao Liu
  2023-02-13  9:36 ` [PATCH RESEND 01/18] machine: Fix comment of machine_parse_smp_config() Zhao Liu
  2023-02-13  9:36 ` [PATCH RESEND 02/18] tests: Rename test-x86-cpuid.c to test-x86-apicid.c Zhao Liu
@ 2023-02-13  9:36 ` Zhao Liu
  2023-02-15  2:58   ` wangyanan (Y) via
  2023-02-13  9:36 ` [PATCH RESEND 04/18] i386/cpu: Fix number of addressable IDs in CPUID.04H Zhao Liu
                   ` (14 subsequent siblings)
  17 siblings, 1 reply; 61+ messages in thread
From: Zhao Liu @ 2023-02-13  9:36 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Eric Blake, Markus Armbruster
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu

From: Zhuocheng Ding <zhuocheng.ding@intel.com>

From CPUState.nr_cores' comment, it represents the "number of cores
within this CPU package".

After 003f230 (machine: Tweak the order of topology members in struct
CpuTopology), the meaning of smp.cores changed to "the number of cores
in one die", but that commit missed updating CPUState.nr_cores'
calculation, so CPUState.nr_cores became wrong: it now fails to
account for the numbers of clusters and dies.

At present, only i386 is using CPUState.nr_cores.

But for i386, which supports the die level, the uses of
CPUState.nr_cores are very confusing:

Early uses are based on the meaning of "cores per package" (before die
was introduced into i386), while later uses are based on "cores per
die" (after die's introduction).

This difference comes from commit a94e142 (target/i386: Add CPUID.1F
generation support for multi-dies PCMachine), which misunderstood
CPUState.nr_cores as meaning "cores per die" when calculating
CPUID.1FH.01H:EBX. After that, the changes in i386 all followed this
wrong understanding.

Under the influence of 003f230 and a94e142, for i386 the result of
CPUState.nr_cores is currently "cores per die", so the original uses
of CPUState.nr_cores based on the meaning of "cores per package" are
wrong when multiple dies exist:
1. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.01H:EBX[bits 23:16] is
   incorrect because it expects "cpus per package" but now the
   result is "cpus per die".
2. In cpu_x86_cpuid() of target/i386/cpu.c, for all leaves of CPUID.04H:
   EAX[bits 31:26] is incorrect because they expect "cpus per package"
   but now the result is "cpus per die". The error not only impacts the
   EAX calculation in the cache_info_passthrough case, but also impacts
   other cases of setting cache topology for Intel CPUs according to
   the CPU topology (specifically, the incoming parameter "num_cores"
   expects "cores per package" in encode_cache_cpuid4()).
3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.0BH.01H:EBX[bits
   15:00] is incorrect because the EBX of 0BH.01H (core level) expects
   "cpus per package", which may differ from 1FH.01H (the reason is
   that 1FH can support more levels; for QEMU, 1FH also supports die,
   and 1FH.01H:EBX[bits 15:00] expects "cpus per die").
4. In cpu_x86_cpuid() of target/i386/cpu.c, when CPUID.80000001H is
   calculated, "cpus per package" is expected to be checked, but in
   fact it now checks "cpus per die". Though "cpus per die" also works
   for this code logic, it isn't consistent with AMD's APM.
5. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.80000008H:ECX expects
   "cpus per package" but it obtains "cpus per die".
6. In simulate_rdmsr() of target/i386/hvf/x86_emu.c, in
   kvm_rdmsr_core_thread_count() of target/i386/kvm/kvm.c, and in
   helper_rdmsr() of target/i386/tcg/sysemu/misc_helper.c,
   MSR_CORE_THREAD_COUNT expects "cpus per package" and "cores per
   package", but in these functions, it obtains "cpus per die" and
   "cores per die".

On the other hand, these uses are correct now (they were added in or
after a94e142):
1. In cpu_x86_cpuid() of target/i386/cpu.c, topo_info.cores_per_die
   meets the actual meaning of CPUState.nr_cores ("cores per die").
2. In cpu_x86_cpuid() of target/i386/cpu.c, vcpus_per_socket (in CPUID.
   04H's calculation) considers the number of dies, so it's correct.
3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.1FH.01H:EBX[bits
   15:00] needs "cpus per die" and it gets the correct result, and
   CPUID.1FH.02H:EBX[bits 15:00] gets correct "cpus per package".

When CPUState.nr_cores is correctly changed back to "cores per
package", the above errors will be fixed without extra work, but the
currently correct cases will go wrong and need special handling to get
the correct "cpus/cores per die" they want.

Thus in this patch, we fix CPUState.nr_cores' calculation to match the
original meaning "cores per package", as well as changing the
calculation of topo_info.cores_per_die, vcpus_per_socket and CPUID.1FH.

In addition, in nr_threads' comment, specify that it represents the
number of threads in the "core" to avoid confusion.

Fixes: a94e142 (target/i386: Add CPUID.1F generation support for multi-dies PCMachine)
Fixes: 003f230 (machine: Tweak the order of topology members in struct CpuTopology)
Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 include/hw/core/cpu.h | 2 +-
 softmmu/cpus.c        | 2 +-
 target/i386/cpu.c     | 9 ++++-----
 3 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 2417597236bc..5253e4e839bb 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -274,7 +274,7 @@ struct qemu_work_item;
  *   QOM parent.
  * @tcg_cflags: Pre-computed cflags for this cpu.
  * @nr_cores: Number of cores within this CPU package.
- * @nr_threads: Number of threads within this CPU.
+ * @nr_threads: Number of threads within this CPU core.
  * @running: #true if CPU is currently running (lockless).
  * @has_waiter: #true if a CPU is currently waiting for the cpu_exec_end;
  * valid under cpu_list_lock.
diff --git a/softmmu/cpus.c b/softmmu/cpus.c
index 9cbc8172b5f2..9996e6a3b295 100644
--- a/softmmu/cpus.c
+++ b/softmmu/cpus.c
@@ -630,7 +630,7 @@ void qemu_init_vcpu(CPUState *cpu)
 {
     MachineState *ms = MACHINE(qdev_get_machine());
 
-    cpu->nr_cores = ms->smp.cores;
+    cpu->nr_cores = ms->smp.dies * ms->smp.clusters * ms->smp.cores;
     cpu->nr_threads =  ms->smp.threads;
     cpu->stopped = true;
     cpu->random_seed = qemu_guest_random_seed_thread_part1();
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 4d2b8d0444df..29afec12c281 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5218,7 +5218,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
     X86CPUTopoInfo topo_info;
 
     topo_info.dies_per_pkg = env->nr_dies;
-    topo_info.cores_per_die = cs->nr_cores;
+    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
     topo_info.threads_per_core = cs->nr_threads;
 
     /* Calculate & apply limits for different index ranges */
@@ -5294,8 +5294,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
              */
             if (*eax & 31) {
                 int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
-                int vcpus_per_socket = env->nr_dies * cs->nr_cores *
-                                       cs->nr_threads;
+                int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
                 if (cs->nr_cores > 1) {
                     *eax &= ~0xFC000000;
                     *eax |= (pow2ceil(cs->nr_cores) - 1) << 26;
@@ -5468,12 +5467,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             break;
         case 1:
             *eax = apicid_die_offset(&topo_info);
-            *ebx = cs->nr_cores * cs->nr_threads;
+            *ebx = topo_info.cores_per_die * topo_info.threads_per_core;
             *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
             break;
         case 2:
             *eax = apicid_pkg_offset(&topo_info);
-            *ebx = env->nr_dies * cs->nr_cores * cs->nr_threads;
+            *ebx = cs->nr_cores * cs->nr_threads;
             *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
             break;
         default:
-- 
2.34.1




* [PATCH RESEND 04/18] i386/cpu: Fix number of addressable IDs in CPUID.04H
  2023-02-13  9:36 [PATCH RESEND 00/18] Support smp.clusters for x86 Zhao Liu
                   ` (2 preceding siblings ...)
  2023-02-13  9:36 ` [PATCH RESEND 03/18] softmmu: Fix CPUSTATE.nr_cores' calculation Zhao Liu
@ 2023-02-13  9:36 ` Zhao Liu
  2023-02-15 10:11   ` wangyanan (Y) via
  2023-02-20  6:59   ` Xiaoyao Li
  2023-02-13  9:36 ` [PATCH RESEND 05/18] i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid() Zhao Liu
                   ` (13 subsequent siblings)
  17 siblings, 2 replies; 61+ messages in thread
From: Zhao Liu @ 2023-02-13  9:36 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Eric Blake, Markus Armbruster
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

For the i-cache and d-cache, the maximum IDs for CPUs sharing the cache
(CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits 25:14]) are
both 0, which means the i-cache and d-cache are shared at the SMT
level. This is correct if there is a single thread per core, but is
wrong for the hyper-threading case (one core contains multiple
threads), since the i-cache and d-cache are shared at the core level
rather than the SMT level.

Therefore, in order to be compatible with both the multi-threaded and
single-threaded cases, we should make the i-cache and d-cache shared
at the core level by default.

Referring to the fixes for cache_info_passthrough ([1], [2]) and the
SDM, CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26] should
use the nearest power-of-2 integer.

The nearest power-of-2 integer can be calculated by pow2ceil() or by
using the APIC ID offset (like the L3 topology using 1 << die_offset
[3]).

But in fact, CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26]
are associated with the APIC ID. For example, in the Linux kernel, the
field "num_threads_sharing" (bits 25 - 14) is parsed with the APIC ID.
As another example, on Alder Lake-P, CPUID.04H:EAX[bits 31:26] does
not match the actual core count and is calculated as:
"(1 << (pkg_offset - core_offset)) - 1".

Therefore, the APIC ID offsets should be preferred for calculating the
nearest power-of-2 integer for CPUID.04H:EAX[bits 25:14] and
CPUID.04H:EAX[bits 31:26]:
1. The d/i cache is shared in a core, so 1 << core_offset should be
   used instead of "1" in encode_cache_cpuid4() for
   CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits 25:14].
2. The L2 cache is supposed to be shared in a core for now, so
   1 << core_offset should also be used instead of "cs->nr_threads"
   in encode_cache_cpuid4() for CPUID.04H.02H:EAX[bits 25:14].
3. Similarly, the value for CPUID.04H:EAX[bits 31:26] should also be
   replaced by the span of APIC ID offsets above the SMT level.

And since [1] and [2] are good enough to make cache_info_passthrough
work well, its pow2ceil() uses are sufficient and do not need to be
replaced by the APIC ID offset approach.

[1]: efb3934 (x86: cpu: make sure number of addressable IDs for processor cores meets the spec)
[2]: d7caf13 (x86: cpu: fixup number of addressable IDs for logical processors sharing cache)
[3]: d65af28 (i386: Update new x86_apicid parsing rules with die_offset support)

Fixes: 7e3482f (i386: Helpers to encode cache information consistently)
Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 target/i386/cpu.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 29afec12c281..7833505092d8 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5212,7 +5212,6 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
 {
     X86CPU *cpu = env_archcpu(env);
     CPUState *cs = env_cpu(env);
-    uint32_t die_offset;
     uint32_t limit;
     uint32_t signature[3];
     X86CPUTopoInfo topo_info;
@@ -5308,27 +5307,38 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             *eax = *ebx = *ecx = *edx = 0;
         } else {
             *eax = 0;
+            int addressable_cores_offset = apicid_pkg_offset(&topo_info) -
+                                           apicid_core_offset(&topo_info);
+            int core_offset, die_offset;
+
             switch (count) {
             case 0: /* L1 dcache info */
+                core_offset = apicid_core_offset(&topo_info);
                 encode_cache_cpuid4(env->cache_info_cpuid4.l1d_cache,
-                                    1, cs->nr_cores,
+                                    (1 << core_offset),
+                                    (1 << addressable_cores_offset),
                                     eax, ebx, ecx, edx);
                 break;
             case 1: /* L1 icache info */
+                core_offset = apicid_core_offset(&topo_info);
                 encode_cache_cpuid4(env->cache_info_cpuid4.l1i_cache,
-                                    1, cs->nr_cores,
+                                    (1 << core_offset),
+                                    (1 << addressable_cores_offset),
                                     eax, ebx, ecx, edx);
                 break;
             case 2: /* L2 cache info */
+                core_offset = apicid_core_offset(&topo_info);
                 encode_cache_cpuid4(env->cache_info_cpuid4.l2_cache,
-                                    cs->nr_threads, cs->nr_cores,
+                                    (1 << core_offset),
+                                    (1 << addressable_cores_offset),
                                     eax, ebx, ecx, edx);
                 break;
             case 3: /* L3 cache info */
                 die_offset = apicid_die_offset(&topo_info);
                 if (cpu->enable_l3_cache) {
                     encode_cache_cpuid4(env->cache_info_cpuid4.l3_cache,
-                                        (1 << die_offset), cs->nr_cores,
+                                        (1 << die_offset),
+                                        (1 << addressable_cores_offset),
                                         eax, ebx, ecx, edx);
                     break;
                 }
-- 
2.34.1




* [PATCH RESEND 05/18] i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
  2023-02-13  9:36 [PATCH RESEND 00/18] Support smp.clusters for x86 Zhao Liu
                   ` (3 preceding siblings ...)
  2023-02-13  9:36 ` [PATCH RESEND 04/18] i386/cpu: Fix number of addressable IDs in CPUID.04H Zhao Liu
@ 2023-02-13  9:36 ` Zhao Liu
  2023-02-15  3:28   ` wangyanan (Y) via
  2023-02-13  9:36 ` [PATCH RESEND 06/18] i386: Introduce module-level cpu topology to CPUX86State Zhao Liu
                   ` (12 subsequent siblings)
  17 siblings, 1 reply; 61+ messages in thread
From: Zhao Liu @ 2023-02-13  9:36 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Eric Blake, Markus Armbruster
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

In cpu_x86_cpuid(), there are many variables representing the CPU
topology, e.g., topo_info and cs->nr_cores/cs->nr_threads.

Since the names of cs->nr_cores/cs->nr_threads do not accurately
represent their meaning, their use is prone to confusion and mistakes.

The structure X86CPUTopoInfo names its members clearly, so the
variable "topo_info" should be preferred.

Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 target/i386/cpu.c | 30 ++++++++++++++++++------------
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 7833505092d8..4cda84eb96f1 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5215,11 +5215,15 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
     uint32_t limit;
     uint32_t signature[3];
     X86CPUTopoInfo topo_info;
+    uint32_t cpus_per_pkg;
 
     topo_info.dies_per_pkg = env->nr_dies;
     topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
     topo_info.threads_per_core = cs->nr_threads;
 
+    cpus_per_pkg = topo_info.dies_per_pkg * topo_info.cores_per_die *
+                   topo_info.threads_per_core;
+
     /* Calculate & apply limits for different index ranges */
     if (index >= 0xC0000000) {
         limit = env->cpuid_xlevel2;
@@ -5255,8 +5259,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             *ecx |= CPUID_EXT_OSXSAVE;
         }
         *edx = env->features[FEAT_1_EDX];
-        if (cs->nr_cores * cs->nr_threads > 1) {
-            *ebx |= (cs->nr_cores * cs->nr_threads) << 16;
+        if (cpus_per_pkg > 1) {
+            *ebx |= cpus_per_pkg << 16;
             *edx |= CPUID_HT;
         }
         if (!cpu->enable_pmu) {
@@ -5293,10 +5297,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
              */
             if (*eax & 31) {
                 int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
-                int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
-                if (cs->nr_cores > 1) {
+                int vcpus_per_socket = cpus_per_pkg;
+                int cores_per_socket = topo_info.cores_per_die *
+                                       topo_info.dies_per_pkg;
+                if (cores_per_socket > 1) {
                     *eax &= ~0xFC000000;
-                    *eax |= (pow2ceil(cs->nr_cores) - 1) << 26;
+                    *eax |= (pow2ceil(cores_per_socket) - 1) << 26;
                 }
                 if (host_vcpus_per_cache > vcpus_per_socket) {
                     *eax &= ~0x3FFC000;
@@ -5436,12 +5442,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         switch (count) {
         case 0:
             *eax = apicid_core_offset(&topo_info);
-            *ebx = cs->nr_threads;
+            *ebx = topo_info.threads_per_core;
             *ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
             break;
         case 1:
             *eax = apicid_pkg_offset(&topo_info);
-            *ebx = cs->nr_cores * cs->nr_threads;
+            *ebx = cpus_per_pkg;
             *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
             break;
         default:
@@ -5472,7 +5478,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         switch (count) {
         case 0:
             *eax = apicid_core_offset(&topo_info);
-            *ebx = cs->nr_threads;
+            *ebx = topo_info.threads_per_core;
             *ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
             break;
         case 1:
@@ -5482,7 +5488,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             break;
         case 2:
             *eax = apicid_pkg_offset(&topo_info);
-            *ebx = cs->nr_cores * cs->nr_threads;
+            *ebx = cpus_per_pkg;
             *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
             break;
         default:
@@ -5707,7 +5713,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
          * discards multiple thread information if it is set.
          * So don't set it here for Intel to make Linux guests happy.
          */
-        if (cs->nr_cores * cs->nr_threads > 1) {
+        if (cpus_per_pkg > 1) {
             if (env->cpuid_vendor1 != CPUID_VENDOR_INTEL_1 ||
                 env->cpuid_vendor2 != CPUID_VENDOR_INTEL_2 ||
                 env->cpuid_vendor3 != CPUID_VENDOR_INTEL_3) {
@@ -5769,7 +5775,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
              *eax |= (cpu_x86_virtual_addr_width(env) << 8);
         }
         *ebx = env->features[FEAT_8000_0008_EBX];
-        if (cs->nr_cores * cs->nr_threads > 1) {
+        if (cpus_per_pkg > 1) {
             /*
              * Bits 15:12 is "The number of bits in the initial
              * Core::X86::Apic::ApicId[ApicId] value that indicate
@@ -5777,7 +5783,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
              * Bits 7:0 is "The number of threads in the package is NC+1"
              */
             *ecx = (apicid_pkg_offset(&topo_info) << 12) |
-                   ((cs->nr_cores * cs->nr_threads) - 1);
+                   (cpus_per_pkg - 1);
         } else {
             *ecx = 0;
         }
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH RESEND 06/18] i386: Introduce module-level cpu topology to CPUX86State
  2023-02-13  9:36 [PATCH RESEND 00/18] Support smp.clusters for x86 Zhao Liu
                   ` (4 preceding siblings ...)
  2023-02-13  9:36 ` [PATCH RESEND 05/18] i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid() Zhao Liu
@ 2023-02-13  9:36 ` Zhao Liu
  2023-02-15  7:41   ` wangyanan (Y) via
  2023-02-13  9:36 ` [PATCH RESEND 07/18] i386: Support modules_per_die in X86CPUTopoInfo Zhao Liu
                   ` (11 subsequent siblings)
  17 siblings, 1 reply; 61+ messages in thread
From: Zhao Liu @ 2023-02-13  9:36 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Eric Blake, Markus Armbruster
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu

From: Zhuocheng Ding <zhuocheng.ding@intel.com>

The smp command line has the "clusters" parameter, but x86 doesn't
support that level yet. Though "clusters" was introduced to help define
the L2 cache topology [1], using cluster to define x86's L2 cache
topology would cause a compatibility problem:

Currently, x86 defaults to the L2 cache being shared within one core,
which actually implies the default setting "cores per L2 cache is 1" and
therefore implicitly defaults to having as many L2 caches as cores.

For example (i386 PC machine):
-smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16 (*)

Considering the topology of the L2 cache, this (*) implicitly means "1
core per L2 cache" and "2 L2 caches per die".

If we used cluster to configure the L2 cache topology with a new default
setting "clusters per L2 cache is 1", the semantics above would change
to "2 cores per cluster" and "1 cluster per L2 cache", that is, "2
cores per L2 cache".

So the same command line (*) would change the L2 cache topology, further
affecting the performance of the virtual machine.

Therefore, x86 should treat cluster only as a CPU topology level and,
for compatibility, avoid using it to change the default L2 cache
topology.

"cluster" in smp is the CPU topology level between "core" and "die".

For x86, the "cluster" in smp corresponds to the module level [2],
which is above the core level. So use "module" rather than "cluster"
in the i386 code.

Please note that x86 already has a CPU topology level also named
"cluster" [2]; that level sits above the package. The cluster in the
x86 CPU topology is therefore completely different from "clusters" as
the smp parameter. Once the module level is introduced, the cluster smp
parameter will actually refer to the module level of x86.

[1]: 0d87178 (hw/core/machine: Introduce CPU cluster topology support)
[2]: SDM, vol.3, ch.9, 9.9.1 Hierarchical Mapping of Shared Resources.

Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 hw/i386/x86.c     | 1 +
 target/i386/cpu.c | 1 +
 target/i386/cpu.h | 6 ++++++
 3 files changed, 8 insertions(+)

diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index eaff4227bd68..ae1bb562d6e2 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -306,6 +306,7 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
     init_topo_info(&topo_info, x86ms);
 
     env->nr_dies = ms->smp.dies;
+    env->nr_modules = ms->smp.clusters;
 
     /*
      * If APIC ID is not set,
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 4cda84eb96f1..61ec9a7499b8 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6781,6 +6781,7 @@ static void x86_cpu_initfn(Object *obj)
     CPUX86State *env = &cpu->env;
 
     env->nr_dies = 1;
+    env->nr_modules = 1;
     cpu_set_cpustate_pointers(cpu);
 
     object_property_add(obj, "feature-words", "X86CPUFeatureWordInfo",
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index d4bc19577a21..f3afea765982 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1810,7 +1810,13 @@ typedef struct CPUArchState {
 
     TPRAccess tpr_access_type;
 
+    /* Number of dies per package. */
     unsigned nr_dies;
+    /*
+     * Number of modules per die. Module level in x86 cpu topology is
+     * corresponding to smp.clusters.
+     */
+    unsigned nr_modules;
 } CPUX86State;
 
 struct kvm_msrs;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH RESEND 07/18] i386: Support modules_per_die in X86CPUTopoInfo
  2023-02-13  9:36 [PATCH RESEND 00/18] Support smp.clusters for x86 Zhao Liu
                   ` (5 preceding siblings ...)
  2023-02-13  9:36 ` [PATCH RESEND 06/18] i386: Introduce module-level cpu topology to CPUX86State Zhao Liu
@ 2023-02-13  9:36 ` Zhao Liu
  2023-02-15 10:38   ` wangyanan (Y) via
  2023-02-16  2:34   ` wangyanan (Y) via
  2023-02-13  9:36 ` [PATCH RESEND 08/18] i386: Support module_id in X86CPUTopoIDs Zhao Liu
                   ` (10 subsequent siblings)
  17 siblings, 2 replies; 61+ messages in thread
From: Zhao Liu @ 2023-02-13  9:36 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Eric Blake, Markus Armbruster
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu

From: Zhuocheng Ding <zhuocheng.ding@intel.com>

Support the module level in the i386 CPU topology structure
"X86CPUTopoInfo".

Before the APIC ID parsing rule is updated with the module level,
apicid_core_width() temporarily combines the core and module levels
together.

At present, we don't expose the module level in CPUID.1FH because
current Linux (v6.2-rc6) doesn't support it. Exposing the module and
die levels at the same time in CPUID.1FH would cause Linux to calculate
the wrong die_id. The module level should not be exposed until real
machines have the module level in CPUID.1FH.

In addition, update the topology structure in test-x86-apicid.c.

Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 hw/i386/x86.c                |  3 ++-
 include/hw/i386/topology.h   | 13 ++++++++---
 target/i386/cpu.c            | 17 ++++++++------
 tests/unit/test-x86-apicid.c | 45 +++++++++++++++++++-----------------
 4 files changed, 46 insertions(+), 32 deletions(-)

diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index ae1bb562d6e2..1c069ff56ae7 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -71,7 +71,8 @@ inline void init_topo_info(X86CPUTopoInfo *topo_info,
     MachineState *ms = MACHINE(x86ms);
 
     topo_info->dies_per_pkg = ms->smp.dies;
-    topo_info->cores_per_die = ms->smp.cores;
+    topo_info->modules_per_die = ms->smp.clusters;
+    topo_info->cores_per_module = ms->smp.cores;
     topo_info->threads_per_core = ms->smp.threads;
 }
 
diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index 81573f6cfde0..bbb00dc4aad8 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -54,7 +54,8 @@ typedef struct X86CPUTopoIDs {
 
 typedef struct X86CPUTopoInfo {
     unsigned dies_per_pkg;
-    unsigned cores_per_die;
+    unsigned modules_per_die;
+    unsigned cores_per_module;
     unsigned threads_per_core;
 } X86CPUTopoInfo;
 
@@ -78,7 +79,12 @@ static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
  */
 static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
 {
-    return apicid_bitwidth_for_count(topo_info->cores_per_die);
+    /*
+     * TODO: Will separate module info from core_width when update
+     * APIC ID with module level.
+     */
+    return apicid_bitwidth_for_count(topo_info->cores_per_module *
+                                     topo_info->modules_per_die);
 }
 
 /* Bit width of the Die_ID field */
@@ -128,7 +134,8 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
                                          X86CPUTopoIDs *topo_ids)
 {
     unsigned nr_dies = topo_info->dies_per_pkg;
-    unsigned nr_cores = topo_info->cores_per_die;
+    unsigned nr_cores = topo_info->cores_per_module *
+                        topo_info->modules_per_die;
     unsigned nr_threads = topo_info->threads_per_core;
 
     topo_ids->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 61ec9a7499b8..6f3d114c7d12 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -336,7 +336,9 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
 
     /* L3 is shared among multiple cores */
     if (cache->level == 3) {
-        l3_threads = topo_info->cores_per_die * topo_info->threads_per_core;
+        l3_threads = topo_info->modules_per_die *
+                     topo_info->cores_per_module *
+                     topo_info->threads_per_core;
         *eax |= (l3_threads - 1) << 14;
     } else {
         *eax |= ((topo_info->threads_per_core - 1) << 14);
@@ -5218,11 +5220,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
     uint32_t cpus_per_pkg;
 
     topo_info.dies_per_pkg = env->nr_dies;
-    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
+    topo_info.modules_per_die = env->nr_modules;
+    topo_info.cores_per_module = cs->nr_cores / env->nr_dies / env->nr_modules;
     topo_info.threads_per_core = cs->nr_threads;
 
-    cpus_per_pkg = topo_info.dies_per_pkg * topo_info.cores_per_die *
-                   topo_info.threads_per_core;
+    cpus_per_pkg = topo_info.dies_per_pkg * topo_info.modules_per_die *
+                   topo_info.cores_per_module * topo_info.threads_per_core;
 
     /* Calculate & apply limits for different index ranges */
     if (index >= 0xC0000000) {
@@ -5298,8 +5301,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             if (*eax & 31) {
                 int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
                 int vcpus_per_socket = cpus_per_pkg;
-                int cores_per_socket = topo_info.cores_per_die *
-                                       topo_info.dies_per_pkg;
+                int cores_per_socket = cpus_per_pkg /
+                                       topo_info.threads_per_core;
                 if (cores_per_socket > 1) {
                     *eax &= ~0xFC000000;
                     *eax |= (pow2ceil(cores_per_socket) - 1) << 26;
@@ -5483,7 +5486,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             break;
         case 1:
             *eax = apicid_die_offset(&topo_info);
-            *ebx = topo_info.cores_per_die * topo_info.threads_per_core;
+            *ebx = cpus_per_pkg / topo_info.dies_per_pkg;
             *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
             break;
         case 2:
diff --git a/tests/unit/test-x86-apicid.c b/tests/unit/test-x86-apicid.c
index 2b104f86d7c2..f21b8a5d95c2 100644
--- a/tests/unit/test-x86-apicid.c
+++ b/tests/unit/test-x86-apicid.c
@@ -30,13 +30,16 @@ static void test_topo_bits(void)
 {
     X86CPUTopoInfo topo_info = {0};
 
-    /* simple tests for 1 thread per core, 1 core per die, 1 die per package */
-    topo_info = (X86CPUTopoInfo) {1, 1, 1};
+    /*
+     * simple tests for 1 thread per core, 1 core per module,
+     *                  1 module per die, 1 die per package
+     */
+    topo_info = (X86CPUTopoInfo) {1, 1, 1, 1};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 0);
     g_assert_cmpuint(apicid_core_width(&topo_info), ==, 0);
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
 
-    topo_info = (X86CPUTopoInfo) {1, 1, 1};
+    topo_info = (X86CPUTopoInfo) {1, 1, 1, 1};
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
@@ -45,39 +48,39 @@ static void test_topo_bits(void)
 
     /* Test field width calculation for multiple values
      */
-    topo_info = (X86CPUTopoInfo) {1, 1, 2};
+    topo_info = (X86CPUTopoInfo) {1, 1, 1, 2};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 1);
-    topo_info = (X86CPUTopoInfo) {1, 1, 3};
+    topo_info = (X86CPUTopoInfo) {1, 1, 1, 3};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
-    topo_info = (X86CPUTopoInfo) {1, 1, 4};
+    topo_info = (X86CPUTopoInfo) {1, 1, 1, 4};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
 
-    topo_info = (X86CPUTopoInfo) {1, 1, 14};
+    topo_info = (X86CPUTopoInfo) {1, 1, 1, 14};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
-    topo_info = (X86CPUTopoInfo) {1, 1, 15};
+    topo_info = (X86CPUTopoInfo) {1, 1, 1, 15};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
-    topo_info = (X86CPUTopoInfo) {1, 1, 16};
+    topo_info = (X86CPUTopoInfo) {1, 1, 1, 16};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
-    topo_info = (X86CPUTopoInfo) {1, 1, 17};
+    topo_info = (X86CPUTopoInfo) {1, 1, 1, 17};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 5);
 
 
-    topo_info = (X86CPUTopoInfo) {1, 30, 2};
+    topo_info = (X86CPUTopoInfo) {1, 1, 30, 2};
     g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
-    topo_info = (X86CPUTopoInfo) {1, 31, 2};
+    topo_info = (X86CPUTopoInfo) {1, 1, 31, 2};
     g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
-    topo_info = (X86CPUTopoInfo) {1, 32, 2};
+    topo_info = (X86CPUTopoInfo) {1, 1, 32, 2};
     g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
-    topo_info = (X86CPUTopoInfo) {1, 33, 2};
+    topo_info = (X86CPUTopoInfo) {1, 1, 33, 2};
     g_assert_cmpuint(apicid_core_width(&topo_info), ==, 6);
 
-    topo_info = (X86CPUTopoInfo) {1, 30, 2};
+    topo_info = (X86CPUTopoInfo) {1, 1, 30, 2};
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
-    topo_info = (X86CPUTopoInfo) {2, 30, 2};
+    topo_info = (X86CPUTopoInfo) {2, 1, 30, 2};
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 1);
-    topo_info = (X86CPUTopoInfo) {3, 30, 2};
+    topo_info = (X86CPUTopoInfo) {3, 1, 30, 2};
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
-    topo_info = (X86CPUTopoInfo) {4, 30, 2};
+    topo_info = (X86CPUTopoInfo) {4, 1, 30, 2};
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
 
     /* build a weird topology and see if IDs are calculated correctly
@@ -85,18 +88,18 @@ static void test_topo_bits(void)
 
     /* This will use 2 bits for thread ID and 3 bits for core ID
      */
-    topo_info = (X86CPUTopoInfo) {1, 6, 3};
+    topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
     g_assert_cmpuint(apicid_core_offset(&topo_info), ==, 2);
     g_assert_cmpuint(apicid_die_offset(&topo_info), ==, 5);
     g_assert_cmpuint(apicid_pkg_offset(&topo_info), ==, 5);
 
-    topo_info = (X86CPUTopoInfo) {1, 6, 3};
+    topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
 
-    topo_info = (X86CPUTopoInfo) {1, 6, 3};
+    topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 0), ==,
                      (1 << 2) | 0);
     g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 1), ==,
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH RESEND 08/18] i386: Support module_id in X86CPUTopoIDs
  2023-02-13  9:36 [PATCH RESEND 00/18] Support smp.clusters for x86 Zhao Liu
                   ` (6 preceding siblings ...)
  2023-02-13  9:36 ` [PATCH RESEND 07/18] i386: Support modules_per_die in X86CPUTopoInfo Zhao Liu
@ 2023-02-13  9:36 ` Zhao Liu
  2023-02-13  9:36 ` [PATCH RESEND 09/18] i386: Fix comment style in topology.h Zhao Liu
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 61+ messages in thread
From: Zhao Liu @ 2023-02-13  9:36 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Eric Blake, Markus Armbruster
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu

From: Zhuocheng Ding <zhuocheng.ding@intel.com>

Add the module_id member to X86CPUTopoIDs.

module_id can be parsed from the APIC ID, so before the APIC ID parsing
rule is updated, temporarily set the module_id generated in this way to
0.

module_id can also be generated from the CPU topology. Before i386
supports "clusters" in smp, the default "clusters per die" is only 1,
so the module_id generated in this way is 0 and will not conflict with
the module_id parsed from the APIC ID.

Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 hw/i386/x86.c              | 17 +++++++++++++++++
 include/hw/i386/topology.h | 24 ++++++++++++++++++++----
 2 files changed, 37 insertions(+), 4 deletions(-)

diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 1c069ff56ae7..b90c6584930a 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -362,6 +362,14 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
         topo_ids.die_id = cpu->die_id;
         topo_ids.core_id = cpu->core_id;
         topo_ids.smt_id = cpu->thread_id;
+
+        /*
+         * TODO: This is the temporary initialization for topo_ids.module_id to
+         * avoid "maybe-uninitialized" compilation errors. Will remove when
+         * X86CPU supports cluster_id.
+         */
+        topo_ids.module_id = 0;
+
         cpu->apic_id = x86_apicid_from_topo_ids(&topo_info, &topo_ids);
     }
 
@@ -370,6 +378,11 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
         MachineState *ms = MACHINE(x86ms);
 
         x86_topo_ids_from_apicid(cpu->apic_id, &topo_info, &topo_ids);
+
+        /*
+         * TODO: Before APIC ID supports module level parsing, there's no need
+         * to expose module_id info.
+         */
         error_setg(errp,
             "Invalid CPU [socket: %u, die: %u, core: %u, thread: %u] with"
             " APIC ID %" PRIu32 ", valid index range 0:%d",
@@ -495,6 +508,10 @@ const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
             ms->possible_cpus->cpus[i].props.has_die_id = true;
             ms->possible_cpus->cpus[i].props.die_id = topo_ids.die_id;
         }
+        if (ms->smp.clusters > 1) {
+            ms->possible_cpus->cpus[i].props.has_cluster_id = true;
+            ms->possible_cpus->cpus[i].props.cluster_id = topo_ids.module_id;
+        }
         ms->possible_cpus->cpus[i].props.has_core_id = true;
         ms->possible_cpus->cpus[i].props.core_id = topo_ids.core_id;
         ms->possible_cpus->cpus[i].props.has_thread_id = true;
diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index bbb00dc4aad8..b0174c18b7bd 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -48,6 +48,7 @@ typedef uint32_t apic_id_t;
 typedef struct X86CPUTopoIDs {
     unsigned pkg_id;
     unsigned die_id;
+    unsigned module_id;
     unsigned core_id;
     unsigned smt_id;
 } X86CPUTopoIDs;
@@ -134,12 +135,21 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
                                          X86CPUTopoIDs *topo_ids)
 {
     unsigned nr_dies = topo_info->dies_per_pkg;
-    unsigned nr_cores = topo_info->cores_per_module *
-                        topo_info->modules_per_die;
+    unsigned nr_modules = topo_info->modules_per_die;
+    unsigned nr_cores = topo_info->cores_per_module;
     unsigned nr_threads = topo_info->threads_per_core;
 
-    topo_ids->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
-    topo_ids->die_id = cpu_index / (nr_cores * nr_threads) % nr_dies;
+    /*
+     * Currently smp for i386 doesn't support "clusters", modules_per_die is
+     * only 1. Therefore, the module_id generated from the module topology will
+     * not conflict with the module_id generated according to the apicid.
+     */
+    topo_ids->pkg_id = cpu_index / (nr_dies * nr_modules *
+                       nr_cores * nr_threads);
+    topo_ids->die_id = cpu_index / (nr_modules * nr_cores *
+                       nr_threads) % nr_dies;
+    topo_ids->module_id = cpu_index / (nr_cores * nr_threads) %
+                          nr_modules;
     topo_ids->core_id = cpu_index / nr_threads % nr_cores;
     topo_ids->smt_id = cpu_index % nr_threads;
 }
@@ -156,6 +166,12 @@ static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
     topo_ids->core_id =
             (apicid >> apicid_core_offset(topo_info)) &
             ~(0xFFFFFFFFUL << apicid_core_width(topo_info));
+    /*
+     * TODO: This is the temporary initialization for topo_ids.module_id to
+     * avoid "maybe-uninitialized" compilation errors. Will remove when APIC
+     * ID supports module level parsing.
+     */
+    topo_ids->module_id = 0;
     topo_ids->die_id =
             (apicid >> apicid_die_offset(topo_info)) &
             ~(0xFFFFFFFFUL << apicid_die_width(topo_info));
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH RESEND 09/18] i386: Fix comment style in topology.h
  2023-02-13  9:36 [PATCH RESEND 00/18] Support smp.clusters for x86 Zhao Liu
                   ` (7 preceding siblings ...)
  2023-02-13  9:36 ` [PATCH RESEND 08/18] i386: Support module_id in X86CPUTopoIDs Zhao Liu
@ 2023-02-13  9:36 ` Zhao Liu
  2023-02-13 13:40   ` Philippe Mathieu-Daudé
                     ` (2 more replies)
  2023-02-13  9:36 ` [PATCH RESEND 10/18] i386: Update APIC ID parsing rule to support module level Zhao Liu
                   ` (8 subsequent siblings)
  17 siblings, 3 replies; 61+ messages in thread
From: Zhao Liu @ 2023-02-13  9:36 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Eric Blake, Markus Armbruster
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

For the function comments in this file, keep the comment style
consistent with the rest of the code base.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 include/hw/i386/topology.h | 33 +++++++++++++++++----------------
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index b0174c18b7bd..5de905dc00d3 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -24,7 +24,8 @@
 #ifndef HW_I386_TOPOLOGY_H
 #define HW_I386_TOPOLOGY_H
 
-/* This file implements the APIC-ID-based CPU topology enumeration logic,
+/*
+ * This file implements the APIC-ID-based CPU topology enumeration logic,
  * documented at the following document:
  *   Intel® 64 Architecture Processor Topology Enumeration
  *   http://software.intel.com/en-us/articles/intel-64-architecture-processor-topology-enumeration/
@@ -41,7 +42,8 @@
 
 #include "qemu/bitops.h"
 
-/* APIC IDs can be 32-bit, but beware: APIC IDs > 255 require x2APIC support
+/*
+ * APIC IDs can be 32-bit, but beware: APIC IDs > 255 require x2APIC support
  */
 typedef uint32_t apic_id_t;
 
@@ -60,8 +62,7 @@ typedef struct X86CPUTopoInfo {
     unsigned threads_per_core;
 } X86CPUTopoInfo;
 
-/* Return the bit width needed for 'count' IDs
- */
+/* Return the bit width needed for 'count' IDs */
 static unsigned apicid_bitwidth_for_count(unsigned count)
 {
     g_assert(count >= 1);
@@ -69,15 +70,13 @@ static unsigned apicid_bitwidth_for_count(unsigned count)
     return count ? 32 - clz32(count) : 0;
 }
 
-/* Bit width of the SMT_ID (thread ID) field on the APIC ID
- */
+/* Bit width of the SMT_ID (thread ID) field on the APIC ID */
 static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
 {
     return apicid_bitwidth_for_count(topo_info->threads_per_core);
 }
 
-/* Bit width of the Core_ID field
- */
+/* Bit width of the Core_ID field */
 static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
 {
     /*
@@ -94,8 +93,7 @@ static inline unsigned apicid_die_width(X86CPUTopoInfo *topo_info)
     return apicid_bitwidth_for_count(topo_info->dies_per_pkg);
 }
 
-/* Bit offset of the Core_ID field
- */
+/* Bit offset of the Core_ID field */
 static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
 {
     return apicid_smt_width(topo_info);
@@ -107,14 +105,14 @@ static inline unsigned apicid_die_offset(X86CPUTopoInfo *topo_info)
     return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
 }
 
-/* Bit offset of the Pkg_ID (socket ID) field
- */
+/* Bit offset of the Pkg_ID (socket ID) field */
 static inline unsigned apicid_pkg_offset(X86CPUTopoInfo *topo_info)
 {
     return apicid_die_offset(topo_info) + apicid_die_width(topo_info);
 }
 
-/* Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
+/*
+ * Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
  *
  * The caller must make sure core_id < nr_cores and smt_id < nr_threads.
  */
@@ -127,7 +125,8 @@ static inline apic_id_t x86_apicid_from_topo_ids(X86CPUTopoInfo *topo_info,
            topo_ids->smt_id;
 }
 
-/* Calculate thread/core/package IDs for a specific topology,
+/*
+ * Calculate thread/core/package IDs for a specific topology,
  * based on (contiguous) CPU index
  */
 static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
@@ -154,7 +153,8 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
     topo_ids->smt_id = cpu_index % nr_threads;
 }
 
-/* Calculate thread/core/package IDs for a specific topology,
+/*
+ * Calculate thread/core/package IDs for a specific topology,
  * based on APIC ID
  */
 static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
@@ -178,7 +178,8 @@ static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
     topo_ids->pkg_id = apicid >> apicid_pkg_offset(topo_info);
 }
 
-/* Make APIC ID for the CPU 'cpu_index'
+/*
+ * Make APIC ID for the CPU 'cpu_index'
  *
  * 'cpu_index' is a sequential, contiguous ID for the CPU.
  */
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH RESEND 10/18] i386: Update APIC ID parsing rule to support module level
  2023-02-13  9:36 [PATCH RESEND 00/18] Support smp.clusters for x86 Zhao Liu
                   ` (8 preceding siblings ...)
  2023-02-13  9:36 ` [PATCH RESEND 09/18] i386: Fix comment style in topology.h Zhao Liu
@ 2023-02-13  9:36 ` Zhao Liu
  2023-02-15 11:06   ` wangyanan (Y) via
  2023-02-13  9:36 ` [PATCH RESEND 11/18] i386/cpu: Introduce cluster-id to X86CPU Zhao Liu
                   ` (7 subsequent siblings)
  17 siblings, 1 reply; 61+ messages in thread
From: Zhao Liu @ 2023-02-13  9:36 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Eric Blake, Markus Armbruster
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu

From: Zhuocheng Ding <zhuocheng.ding@intel.com>

Add module-level parsing support for the APIC ID.

With this support, the conversion among X86CPUTopoIDs, X86CPUTopoInfo
and the APIC ID is now complete.

Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 hw/i386/x86.c              | 19 ++++++++-----------
 include/hw/i386/topology.h | 36 ++++++++++++++++++------------------
 2 files changed, 26 insertions(+), 29 deletions(-)

diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index b90c6584930a..2a9d080a8e7a 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -311,11 +311,11 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
 
     /*
      * If APIC ID is not set,
-     * set it based on socket/die/core/thread properties.
+     * set it based on socket/die/cluster/core/thread properties.
      */
     if (cpu->apic_id == UNASSIGNED_APIC_ID) {
-        int max_socket = (ms->smp.max_cpus - 1) /
-                                smp_threads / smp_cores / ms->smp.dies;
+        int max_socket = (ms->smp.max_cpus - 1) / smp_threads / smp_cores /
+                                ms->smp.clusters / ms->smp.dies;
 
         /*
          * die-id was optional in QEMU 4.0 and older, so keep it optional
@@ -379,15 +379,12 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
 
         x86_topo_ids_from_apicid(cpu->apic_id, &topo_info, &topo_ids);
 
-        /*
-         * TODO: Before APIC ID supports module level parsing, there's no need
-         * to expose module_id info.
-         */
         error_setg(errp,
-            "Invalid CPU [socket: %u, die: %u, core: %u, thread: %u] with"
-            " APIC ID %" PRIu32 ", valid index range 0:%d",
-            topo_ids.pkg_id, topo_ids.die_id, topo_ids.core_id, topo_ids.smt_id,
-            cpu->apic_id, ms->possible_cpus->len - 1);
+            "Invalid CPU [socket: %u, die: %u, module: %u, core: %u, thread: %u]"
+            " with APIC ID %" PRIu32 ", valid index range 0:%d",
+            topo_ids.pkg_id, topo_ids.die_id, topo_ids.module_id,
+            topo_ids.core_id, topo_ids.smt_id, cpu->apic_id,
+            ms->possible_cpus->len - 1);
         return;
     }
 
diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index 5de905dc00d3..3cec97b377f2 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -79,12 +79,13 @@ static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
 /* Bit width of the Core_ID field */
 static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
 {
-    /*
-     * TODO: Will separate module info from core_width when update
-     * APIC ID with module level.
-     */
-    return apicid_bitwidth_for_count(topo_info->cores_per_module *
-                                     topo_info->modules_per_die);
+    return apicid_bitwidth_for_count(topo_info->cores_per_module);
+}
+
+/* Bit width of the Module_ID (cluster ID) field */
+static inline unsigned apicid_module_width(X86CPUTopoInfo *topo_info)
+{
+    return apicid_bitwidth_for_count(topo_info->modules_per_die);
 }
 
 /* Bit width of the Die_ID field */
@@ -99,10 +100,16 @@ static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
     return apicid_smt_width(topo_info);
 }
 
+/* Bit offset of the Module_ID (cluster ID) field */
+static inline unsigned apicid_module_offset(X86CPUTopoInfo *topo_info)
+{
+    return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
+}
+
 /* Bit offset of the Die_ID field */
 static inline unsigned apicid_die_offset(X86CPUTopoInfo *topo_info)
 {
-    return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
+    return apicid_module_offset(topo_info) + apicid_module_width(topo_info);
 }
 
 /* Bit offset of the Pkg_ID (socket ID) field */
@@ -121,6 +128,7 @@ static inline apic_id_t x86_apicid_from_topo_ids(X86CPUTopoInfo *topo_info,
 {
     return (topo_ids->pkg_id  << apicid_pkg_offset(topo_info)) |
            (topo_ids->die_id  << apicid_die_offset(topo_info)) |
+           (topo_ids->module_id << apicid_module_offset(topo_info)) |
            (topo_ids->core_id << apicid_core_offset(topo_info)) |
            topo_ids->smt_id;
 }
@@ -138,11 +146,6 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
     unsigned nr_cores = topo_info->cores_per_module;
     unsigned nr_threads = topo_info->threads_per_core;
 
-    /*
-     * Currently smp for i386 doesn't support "clusters", modules_per_die is
-     * only 1. Therefore, the module_id generated from the module topology will
-     * not conflict with the module_id generated according to the apicid.
-     */
     topo_ids->pkg_id = cpu_index / (nr_dies * nr_modules *
                        nr_cores * nr_threads);
     topo_ids->die_id = cpu_index / (nr_modules * nr_cores *
@@ -166,12 +169,9 @@ static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
     topo_ids->core_id =
             (apicid >> apicid_core_offset(topo_info)) &
             ~(0xFFFFFFFFUL << apicid_core_width(topo_info));
-    /*
-     * TODO: This is the temporary initialization for topo_ids.module_id to
-     * avoid "maybe-uninitialized" compilation errors. Will remove when APIC
-     * ID supports module level parsing.
-     */
-    topo_ids->module_id = 0;
+    topo_ids->module_id =
+            (apicid >> apicid_module_offset(topo_info)) &
+            ~(0xFFFFFFFFUL << apicid_module_width(topo_info));
     topo_ids->die_id =
             (apicid >> apicid_die_offset(topo_info)) &
             ~(0xFFFFFFFFUL << apicid_die_width(topo_info));
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH RESEND 11/18] i386/cpu: Introduce cluster-id to X86CPU
  2023-02-13  9:36 [PATCH RESEND 00/18] Support smp.clusters for x86 Zhao Liu
                   ` (9 preceding siblings ...)
  2023-02-13  9:36 ` [PATCH RESEND 10/18] i386: Update APIC ID parsing rule to support module level Zhao Liu
@ 2023-02-13  9:36 ` Zhao Liu
  2023-02-13  9:36 ` [PATCH RESEND 12/18] tests: Add test case of APIC ID for module level parsing Zhao Liu
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 61+ messages in thread
From: Zhao Liu @ 2023-02-13  9:36 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Eric Blake, Markus Armbruster
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu

From: Zhuocheng Ding <zhuocheng.ding@intel.com>

We expose this level as cluster-id rather than module-id to stay
consistent with CpuInstanceProperties.cluster-id, which avoids
confusion between parameter names when hotplugging.

Following the legacy smp check rules, also add cluster_id validity
checks to x86_cpu_pre_plug().

Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 hw/i386/x86.c     | 33 +++++++++++++++++++++++++--------
 target/i386/cpu.c |  2 ++
 target/i386/cpu.h |  1 +
 3 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 2a9d080a8e7a..20ba2384bbb2 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -325,6 +325,14 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
             cpu->die_id = 0;
         }
 
+        /*
+         * cluster-id was optional in QEMU 8.0 and older, so keep it optional
+         * if there's only one cluster per die.
+         */
+        if (cpu->cluster_id < 0 && ms->smp.clusters == 1) {
+            cpu->cluster_id = 0;
+        }
+
         if (cpu->socket_id < 0) {
             error_setg(errp, "CPU socket-id is not set");
             return;
@@ -341,6 +349,14 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
                        cpu->die_id, ms->smp.dies - 1);
             return;
         }
+        if (cpu->cluster_id < 0) {
+            error_setg(errp, "CPU cluster-id is not set");
+            return;
+        } else if (cpu->cluster_id > ms->smp.clusters - 1) {
+            error_setg(errp, "Invalid CPU cluster-id: %u must be in range 0:%u",
+                       cpu->cluster_id, ms->smp.clusters - 1);
+            return;
+        }
         if (cpu->core_id < 0) {
             error_setg(errp, "CPU core-id is not set");
             return;
@@ -360,16 +376,9 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
 
         topo_ids.pkg_id = cpu->socket_id;
         topo_ids.die_id = cpu->die_id;
+        topo_ids.module_id = cpu->cluster_id;
         topo_ids.core_id = cpu->core_id;
         topo_ids.smt_id = cpu->thread_id;
-
-        /*
-         * TODO: This is the temporary initialization for topo_ids.module_id to
-         * avoid "maybe-uninitialized" compilation errors. Will remove when
-         * X86CPU supports cluster_id.
-         */
-        topo_ids.module_id = 0;
-
         cpu->apic_id = x86_apicid_from_topo_ids(&topo_info, &topo_ids);
     }
 
@@ -416,6 +425,14 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
     }
     cpu->die_id = topo_ids.die_id;
 
+    if (cpu->cluster_id != -1 && cpu->cluster_id != topo_ids.module_id) {
+        error_setg(errp, "property cluster-id: %u doesn't match set apic-id:"
+            " 0x%x (cluster-id: %u)", cpu->cluster_id, cpu->apic_id,
+            topo_ids.module_id);
+        return;
+    }
+    cpu->cluster_id = topo_ids.module_id;
+
     if (cpu->core_id != -1 && cpu->core_id != topo_ids.core_id) {
         error_setg(errp, "property core-id: %u doesn't match set apic-id:"
             " 0x%x (core-id: %u)", cpu->core_id, cpu->apic_id,
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 6f3d114c7d12..27bbbc36b11c 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6980,12 +6980,14 @@ static Property x86_cpu_properties[] = {
     DEFINE_PROP_UINT32("apic-id", X86CPU, apic_id, 0),
     DEFINE_PROP_INT32("thread-id", X86CPU, thread_id, 0),
     DEFINE_PROP_INT32("core-id", X86CPU, core_id, 0),
+    DEFINE_PROP_INT32("cluster-id", X86CPU, cluster_id, 0),
     DEFINE_PROP_INT32("die-id", X86CPU, die_id, 0),
     DEFINE_PROP_INT32("socket-id", X86CPU, socket_id, 0),
 #else
     DEFINE_PROP_UINT32("apic-id", X86CPU, apic_id, UNASSIGNED_APIC_ID),
     DEFINE_PROP_INT32("thread-id", X86CPU, thread_id, -1),
     DEFINE_PROP_INT32("core-id", X86CPU, core_id, -1),
+    DEFINE_PROP_INT32("cluster-id", X86CPU, cluster_id, -1),
     DEFINE_PROP_INT32("die-id", X86CPU, die_id, -1),
     DEFINE_PROP_INT32("socket-id", X86CPU, socket_id, -1),
 #endif
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index f3afea765982..8668e74e0c87 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1966,6 +1966,7 @@ struct ArchCPU {
     int32_t node_id; /* NUMA node this CPU belongs to */
     int32_t socket_id;
     int32_t die_id;
+    int32_t cluster_id;
     int32_t core_id;
     int32_t thread_id;
 
-- 
2.34.1




* [PATCH RESEND 12/18] tests: Add test case of APIC ID for module level parsing
  2023-02-13  9:36 [PATCH RESEND 00/18] Support smp.clusters for x86 Zhao Liu
                   ` (10 preceding siblings ...)
  2023-02-13  9:36 ` [PATCH RESEND 11/18] i386/cpu: Introduce cluster-id to X86CPU Zhao Liu
@ 2023-02-13  9:36 ` Zhao Liu
  2023-02-15 11:22   ` wangyanan (Y) via
  2023-02-13  9:36 ` [PATCH RESEND 13/18] hw/i386/pc: Support smp.clusters for x86 PC machine Zhao Liu
                   ` (5 subsequent siblings)
  17 siblings, 1 reply; 61+ messages in thread
From: Zhao Liu @ 2023-02-13  9:36 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Eric Blake, Markus Armbruster
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu

From: Zhuocheng Ding <zhuocheng.ding@intel.com>

Now that i386 supports the module level, add test cases for
module-level APIC ID parsing.

Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 tests/unit/test-x86-apicid.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/tests/unit/test-x86-apicid.c b/tests/unit/test-x86-apicid.c
index f21b8a5d95c2..55b731ccae55 100644
--- a/tests/unit/test-x86-apicid.c
+++ b/tests/unit/test-x86-apicid.c
@@ -37,6 +37,7 @@ static void test_topo_bits(void)
     topo_info = (X86CPUTopoInfo) {1, 1, 1, 1};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 0);
     g_assert_cmpuint(apicid_core_width(&topo_info), ==, 0);
+    g_assert_cmpuint(apicid_module_width(&topo_info), ==, 0);
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
 
     topo_info = (X86CPUTopoInfo) {1, 1, 1, 1};
@@ -74,13 +75,22 @@ static void test_topo_bits(void)
     topo_info = (X86CPUTopoInfo) {1, 1, 33, 2};
     g_assert_cmpuint(apicid_core_width(&topo_info), ==, 6);
 
-    topo_info = (X86CPUTopoInfo) {1, 1, 30, 2};
+    topo_info = (X86CPUTopoInfo) {1, 6, 30, 2};
+    g_assert_cmpuint(apicid_module_width(&topo_info), ==, 3);
+    topo_info = (X86CPUTopoInfo) {1, 7, 30, 2};
+    g_assert_cmpuint(apicid_module_width(&topo_info), ==, 3);
+    topo_info = (X86CPUTopoInfo) {1, 8, 30, 2};
+    g_assert_cmpuint(apicid_module_width(&topo_info), ==, 3);
+    topo_info = (X86CPUTopoInfo) {1, 9, 30, 2};
+    g_assert_cmpuint(apicid_module_width(&topo_info), ==, 4);
+
+    topo_info = (X86CPUTopoInfo) {1, 6, 30, 2};
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
-    topo_info = (X86CPUTopoInfo) {2, 1, 30, 2};
+    topo_info = (X86CPUTopoInfo) {2, 6, 30, 2};
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 1);
-    topo_info = (X86CPUTopoInfo) {3, 1, 30, 2};
+    topo_info = (X86CPUTopoInfo) {3, 6, 30, 2};
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
-    topo_info = (X86CPUTopoInfo) {4, 1, 30, 2};
+    topo_info = (X86CPUTopoInfo) {4, 6, 30, 2};
     g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
 
     /* build a weird topology and see if IDs are calculated correctly
@@ -91,6 +101,7 @@ static void test_topo_bits(void)
     topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
     g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
     g_assert_cmpuint(apicid_core_offset(&topo_info), ==, 2);
+    g_assert_cmpuint(apicid_module_offset(&topo_info), ==, 5);
     g_assert_cmpuint(apicid_die_offset(&topo_info), ==, 5);
     g_assert_cmpuint(apicid_pkg_offset(&topo_info), ==, 5);
 
-- 
2.34.1




* [PATCH RESEND 13/18] hw/i386/pc: Support smp.clusters for x86 PC machine
  2023-02-13  9:36 [PATCH RESEND 00/18] Support smp.clusters for x86 Zhao Liu
                   ` (11 preceding siblings ...)
  2023-02-13  9:36 ` [PATCH RESEND 12/18] tests: Add test case of APIC ID for module level parsing Zhao Liu
@ 2023-02-13  9:36 ` Zhao Liu
  2023-02-14  2:34   ` wangyanan (Y) via
  2023-02-13  9:36 ` [PATCH RESEND 14/18] i386: Add cache topology info in CPUCacheInfo Zhao Liu
                   ` (4 subsequent siblings)
  17 siblings, 1 reply; 61+ messages in thread
From: Zhao Liu @ 2023-02-13  9:36 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Eric Blake, Markus Armbruster
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu

From: Zhuocheng Ding <zhuocheng.ding@intel.com>

Now that module-level topology support has been added to X86CPU, we
can enable support for the cluster parameter on PC machines. With
this support, a 5-level x86 CPU topology can be defined with "-smp":

-smp cpus=*,maxcpus=*,sockets=*,dies=*,clusters=*,cores=*,threads=*.

Additionally, add a 5-level topology example to the description of "-smp".

Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 hw/i386/pc.c    |  1 +
 qemu-options.hx | 10 +++++-----
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 6e592bd969aa..c329df56ebd2 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1929,6 +1929,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
     mc->default_cpu_type = TARGET_DEFAULT_CPU_TYPE;
     mc->nvdimm_supported = true;
     mc->smp_props.dies_supported = true;
+    mc->smp_props.clusters_supported = true;
     mc->default_ram_id = "pc.ram";
 
     object_class_property_add(oc, PC_MACHINE_MAX_RAM_BELOW_4G, "size",
diff --git a/qemu-options.hx b/qemu-options.hx
index 88e93c610314..3caf9da4c3af 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -312,14 +312,14 @@ SRST
         -smp 8,sockets=2,cores=2,threads=2,maxcpus=8
 
     The following sub-option defines a CPU topology hierarchy (2 sockets
-    totally on the machine, 2 dies per socket, 2 cores per die, 2 threads
-    per core) for PC machines which support sockets/dies/cores/threads.
-    Some members of the option can be omitted but their values will be
-    automatically computed:
+    totally on the machine, 2 dies per socket, 2 clusters per die, 2 cores per
+    cluster, 2 threads per core) for PC machines which support sockets/dies
+    /clusters/cores/threads. Some members of the option can be omitted but
+    their values will be automatically computed:
 
     ::
 
-        -smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16
+        -smp 32,sockets=2,dies=2,clusters=2,cores=2,threads=2,maxcpus=32
 
     The following sub-option defines a CPU topology hierarchy (2 sockets
     totally on the machine, 2 clusters per socket, 2 cores per cluster,
-- 
2.34.1




* [PATCH RESEND 14/18] i386: Add cache topology info in CPUCacheInfo
  2023-02-13  9:36 [PATCH RESEND 00/18] Support smp.clusters for x86 Zhao Liu
                   ` (12 preceding siblings ...)
  2023-02-13  9:36 ` [PATCH RESEND 13/18] hw/i386/pc: Support smp.clusters for x86 PC machine Zhao Liu
@ 2023-02-13  9:36 ` Zhao Liu
  2023-02-15 12:17   ` wangyanan (Y) via
  2023-02-13  9:36 ` [PATCH RESEND 15/18] i386: Use CPUCacheInfo.share_level to encode CPUID[4].EAX[bits 25:14] Zhao Liu
                   ` (3 subsequent siblings)
  17 siblings, 1 reply; 61+ messages in thread
From: Zhao Liu @ 2023-02-13  9:36 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Eric Blake, Markus Armbruster
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

Currently, by default, the cache topology is encoded as:
1. i/d cache is shared in one core.
2. L2 cache is shared in one core.
3. L3 cache is shared in one die.

This default setting has caused a misunderstanding, namely that the
cache topology is completely equated with specific CPU topology
levels, e.g., L2 cache with the core level and L3 cache with the
die level.

In fact, the settings of these topologies depend on the specific
platform and are not static. For example, on Alder Lake-P, every
four Atom cores share the same L2 cache.

Thus, we should explicitly define the corresponding cache topology for
different cache models to increase scalability.

Except for legacy_l2_cache_cpuid2 (whose default topology level is
INVALID), explicitly set the corresponding topology level for all
other cache models. To stay compatible with the existing cache
topology, set the CORE level for the i/d caches and the L2 cache,
and the DIE level for the L3 cache.

The field for CPUID[4].EAX[bits 25:14] or CPUID[0x8000001D].EAX[bits
25:14] will be set based on CPUCacheInfo.share_level.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 target/i386/cpu.c | 19 +++++++++++++++++++
 target/i386/cpu.h | 16 ++++++++++++++++
 2 files changed, 35 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 27bbbc36b11c..364534e84b1b 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -433,6 +433,7 @@ static CPUCacheInfo legacy_l1d_cache = {
     .sets = 64,
     .partitions = 1,
     .no_invd_sharing = true,
+    .share_level = CORE,
 };
 
 /*FIXME: CPUID leaf 0x80000005 is inconsistent with leaves 2 & 4 */
@@ -447,6 +448,7 @@ static CPUCacheInfo legacy_l1d_cache_amd = {
     .partitions = 1,
     .lines_per_tag = 1,
     .no_invd_sharing = true,
+    .share_level = CORE,
 };
 
 /* L1 instruction cache: */
@@ -460,6 +462,7 @@ static CPUCacheInfo legacy_l1i_cache = {
     .sets = 64,
     .partitions = 1,
     .no_invd_sharing = true,
+    .share_level = CORE,
 };
 
 /*FIXME: CPUID leaf 0x80000005 is inconsistent with leaves 2 & 4 */
@@ -474,6 +477,7 @@ static CPUCacheInfo legacy_l1i_cache_amd = {
     .partitions = 1,
     .lines_per_tag = 1,
     .no_invd_sharing = true,
+    .share_level = CORE,
 };
 
 /* Level 2 unified cache: */
@@ -487,6 +491,7 @@ static CPUCacheInfo legacy_l2_cache = {
     .sets = 4096,
     .partitions = 1,
     .no_invd_sharing = true,
+    .share_level = CORE,
 };
 
 /*FIXME: CPUID leaf 2 descriptor is inconsistent with CPUID leaf 4 */
@@ -509,6 +514,7 @@ static CPUCacheInfo legacy_l2_cache_amd = {
     .associativity = 16,
     .sets = 512,
     .partitions = 1,
+    .share_level = CORE,
 };
 
 /* Level 3 unified cache: */
@@ -524,6 +530,7 @@ static CPUCacheInfo legacy_l3_cache = {
     .self_init = true,
     .inclusive = true,
     .complex_indexing = true,
+    .share_level = DIE,
 };
 
 /* TLB definitions: */
@@ -1668,6 +1675,7 @@ static const CPUCaches epyc_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -1680,6 +1688,7 @@ static const CPUCaches epyc_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -1690,6 +1699,7 @@ static const CPUCaches epyc_cache_info = {
         .partitions = 1,
         .sets = 1024,
         .lines_per_tag = 1,
+        .share_level = CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -1703,6 +1713,7 @@ static const CPUCaches epyc_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = true,
+        .share_level = DIE,
     },
 };
 
@@ -1718,6 +1729,7 @@ static const CPUCaches epyc_rome_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -1730,6 +1742,7 @@ static const CPUCaches epyc_rome_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -1740,6 +1753,7 @@ static const CPUCaches epyc_rome_cache_info = {
         .partitions = 1,
         .sets = 1024,
         .lines_per_tag = 1,
+        .share_level = CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -1753,6 +1767,7 @@ static const CPUCaches epyc_rome_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = true,
+        .share_level = DIE,
     },
 };
 
@@ -1768,6 +1783,7 @@ static const CPUCaches epyc_milan_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -1780,6 +1796,7 @@ static const CPUCaches epyc_milan_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
+        .share_level = CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -1790,6 +1807,7 @@ static const CPUCaches epyc_milan_cache_info = {
         .partitions = 1,
         .sets = 1024,
         .lines_per_tag = 1,
+        .share_level = CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -1803,6 +1821,7 @@ static const CPUCaches epyc_milan_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = true,
+        .share_level = DIE,
     },
 };
 
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 8668e74e0c87..5a955431f759 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1476,6 +1476,15 @@ enum CacheType {
     UNIFIED_CACHE
 };
 
+enum CPUTopoLevel {
+    INVALID = 0,
+    SMT,
+    CORE,
+    MODULE,
+    DIE,
+    PACKAGE,
+};
+
 typedef struct CPUCacheInfo {
     enum CacheType type;
     uint8_t level;
@@ -1517,6 +1526,13 @@ typedef struct CPUCacheInfo {
      * address bits.  CPUID[4].EDX[bit 2].
      */
     bool complex_indexing;
+
+    /*
+     * Cache Topology. The level that cache is shared in.
+     * Used to encode CPUID[4].EAX[bits 25:14] or
+     * CPUID[0x8000001D].EAX[bits 25:14].
+     */
+    enum CPUTopoLevel share_level;
 } CPUCacheInfo;
 
 
-- 
2.34.1




* [PATCH RESEND 15/18] i386: Use CPUCacheInfo.share_level to encode CPUID[4].EAX[bits 25:14]
  2023-02-13  9:36 [PATCH RESEND 00/18] Support smp.clusters for x86 Zhao Liu
                   ` (13 preceding siblings ...)
  2023-02-13  9:36 ` [PATCH RESEND 14/18] i386: Add cache topology info in CPUCacheInfo Zhao Liu
@ 2023-02-13  9:36 ` Zhao Liu
  2023-02-13  9:36 ` [PATCH RESEND 16/18] i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14] Zhao Liu
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 61+ messages in thread
From: Zhao Liu @ 2023-02-13  9:36 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Eric Blake, Markus Armbruster
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

CPUID[4].EAX[bits 25:14] is used to represent the cache topology for
Intel CPUs.

Now that cache models carry topology information, we can use
CPUCacheInfo.share_level to decide which topology level is encoded
into CPUID[4].EAX[bits 25:14].

Additionally, since maximum_processor_id (originally "num_apic_ids")
is derived from the CPU topology levels, which are already validated
when parsing smp, there is no need to check this value with
"assert(num_apic_ids > 0)" again, so remove this assert.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 target/i386/cpu.c | 55 +++++++++++++++++++++++++++++++----------------
 1 file changed, 36 insertions(+), 19 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 364534e84b1b..96ef96860604 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -231,22 +231,50 @@ static uint8_t cpuid2_cache_descriptor(CPUCacheInfo *cache)
                        ((t) == UNIFIED_CACHE) ? CACHE_TYPE_UNIFIED : \
                        0 /* Invalid value */)
 
+static uint32_t max_processor_ids_for_cache(CPUCacheInfo *cache,
+                                            X86CPUTopoInfo *topo_info)
+{
+    uint32_t num_ids = 0;
+
+    switch (cache->share_level) {
+    case CORE:
+        num_ids = 1 << apicid_core_offset(topo_info);
+        break;
+    case DIE:
+        num_ids = 1 << apicid_die_offset(topo_info);
+        break;
+    default:
+        /*
+         * Currently there is no use case for SMT, MODULE and PACKAGE, so use
+         * assert directly to facilitate debugging.
+         */
+        g_assert_not_reached();
+    }
+
+    return num_ids - 1;
+}
+
+static uint32_t max_core_ids_in_package(X86CPUTopoInfo *topo_info)
+{
+    uint32_t num_cores = 1 << (apicid_pkg_offset(topo_info) -
+                               apicid_core_offset(topo_info));
+    return num_cores - 1;
+}
 
 /* Encode cache info for CPUID[4] */
 static void encode_cache_cpuid4(CPUCacheInfo *cache,
-                                int num_apic_ids, int num_cores,
+                                X86CPUTopoInfo *topo_info,
                                 uint32_t *eax, uint32_t *ebx,
                                 uint32_t *ecx, uint32_t *edx)
 {
     assert(cache->size == cache->line_size * cache->associativity *
                           cache->partitions * cache->sets);
 
-    assert(num_apic_ids > 0);
     *eax = CACHE_TYPE(cache->type) |
            CACHE_LEVEL(cache->level) |
            (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0) |
-           ((num_cores - 1) << 26) |
-           ((num_apic_ids - 1) << 14);
+           (max_core_ids_in_package(topo_info) << 26) |
+           (max_processor_ids_for_cache(cache, topo_info) << 14);
 
     assert(cache->line_size > 0);
     assert(cache->partitions > 0);
@@ -5335,38 +5363,27 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             *eax = *ebx = *ecx = *edx = 0;
         } else {
             *eax = 0;
-            int addressable_cores_offset = apicid_pkg_offset(&topo_info) -
-                                           apicid_core_offset(&topo_info);
-            int core_offset, die_offset;
 
             switch (count) {
             case 0: /* L1 dcache info */
-                core_offset = apicid_core_offset(&topo_info);
                 encode_cache_cpuid4(env->cache_info_cpuid4.l1d_cache,
-                                    (1 << core_offset),
-                                    (1 << addressable_cores_offset),
+                                    &topo_info,
                                     eax, ebx, ecx, edx);
                 break;
             case 1: /* L1 icache info */
-                core_offset = apicid_core_offset(&topo_info);
                 encode_cache_cpuid4(env->cache_info_cpuid4.l1i_cache,
-                                    (1 << core_offset),
-                                    (1 << addressable_cores_offset),
+                                    &topo_info,
                                     eax, ebx, ecx, edx);
                 break;
             case 2: /* L2 cache info */
-                core_offset = apicid_core_offset(&topo_info);
                 encode_cache_cpuid4(env->cache_info_cpuid4.l2_cache,
-                                    (1 << core_offset),
-                                    (1 << addressable_cores_offset),
+                                    &topo_info,
                                     eax, ebx, ecx, edx);
                 break;
             case 3: /* L3 cache info */
-                die_offset = apicid_die_offset(&topo_info);
                 if (cpu->enable_l3_cache) {
                     encode_cache_cpuid4(env->cache_info_cpuid4.l3_cache,
-                                        (1 << die_offset),
-                                        (1 << addressable_cores_offset),
+                                        &topo_info,
                                         eax, ebx, ecx, edx);
                     break;
                 }
-- 
2.34.1




* [PATCH RESEND 16/18] i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]
  2023-02-13  9:36 [PATCH RESEND 00/18] Support smp.clusters for x86 Zhao Liu
                   ` (14 preceding siblings ...)
  2023-02-13  9:36 ` [PATCH RESEND 15/18] i386: Use CPUCacheInfo.share_level to encode CPUID[4].EAX[bits 25:14] Zhao Liu
@ 2023-02-13  9:36 ` Zhao Liu
  2023-02-15 12:32   ` wangyanan (Y) via
  2023-02-13  9:36 ` [PATCH RESEND 17/18] i386: Use CPUCacheInfo.share_level to encode " Zhao Liu
  2023-02-13  9:36 ` [PATCH RESEND 18/18] i386: Add new property to control L2 cache topo in CPUID.04H Zhao Liu
  17 siblings, 1 reply; 61+ messages in thread
From: Zhao Liu @ 2023-02-13  9:36 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Eric Blake, Markus Armbruster
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

From AMD's APM, NumSharingCache (CPUID[0x8000001D].EAX[bits 25:14])
means [1]:

The number of logical processors sharing this cache is the value of
this field incremented by 1. To determine which logical processors are
sharing a cache, determine a Share Id for each processor as follows:

ShareId = LocalApicId >> log2(NumSharingCache+1)

Logical processors with the same ShareId then share a cache. If
NumSharingCache+1 is not a power of two, round it up to the next power
of two.

From the description above, the calculation of this field should be
the same as for CPUID[4].EAX[bits 25:14] on Intel CPUs, so also use
the APIC ID offsets to calculate this field.

Note: I don't have the hardware available, hope someone can help me to
confirm whether this calculation is correct, thanks!

[1]: APM, vol.3, appendix.E.4.15 Function 8000_001Dh--Cache Topology
     Information

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 target/i386/cpu.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 96ef96860604..d691c02e3c06 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -355,7 +355,7 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
                                        uint32_t *eax, uint32_t *ebx,
                                        uint32_t *ecx, uint32_t *edx)
 {
-    uint32_t l3_threads;
+    uint32_t sharing_apic_ids;
     assert(cache->size == cache->line_size * cache->associativity *
                           cache->partitions * cache->sets);
 
@@ -364,13 +364,11 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
 
     /* L3 is shared among multiple cores */
     if (cache->level == 3) {
-        l3_threads = topo_info->modules_per_die *
-                     topo_info->cores_per_module *
-                     topo_info->threads_per_core;
-        *eax |= (l3_threads - 1) << 14;
+        sharing_apic_ids = 1 << apicid_die_offset(topo_info);
     } else {
-        *eax |= ((topo_info->threads_per_core - 1) << 14);
+        sharing_apic_ids = 1 << apicid_core_offset(topo_info);
     }
+    *eax |= (sharing_apic_ids - 1) << 14;
 
     assert(cache->line_size > 0);
     assert(cache->partitions > 0);
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH RESEND 17/18] i386: Use CPUCacheInfo.share_level to encode CPUID[0x8000001D].EAX[bits 25:14]
  2023-02-13  9:36 [PATCH RESEND 00/18] Support smp.clusters for x86 Zhao Liu
                   ` (15 preceding siblings ...)
  2023-02-13  9:36 ` [PATCH RESEND 16/18] i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14] Zhao Liu
@ 2023-02-13  9:36 ` Zhao Liu
  2023-02-13  9:36 ` [PATCH RESEND 18/18] i386: Add new property to control L2 cache topo in CPUID.04H Zhao Liu
  17 siblings, 0 replies; 61+ messages in thread
From: Zhao Liu @ 2023-02-13  9:36 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Eric Blake, Markus Armbruster
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

CPUID[0x8000001D].EAX[bits 25:14] is used to represent the cache
topology for AMD CPUs.

Now that cache models have topology information, we can use
CPUCacheInfo.share_level to decide which topology level is encoded
into CPUID[0x8000001D].EAX[bits 25:14].

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 target/i386/cpu.c | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index d691c02e3c06..5816dc99b1d4 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -355,20 +355,12 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
                                        uint32_t *eax, uint32_t *ebx,
                                        uint32_t *ecx, uint32_t *edx)
 {
-    uint32_t sharing_apic_ids;
     assert(cache->size == cache->line_size * cache->associativity *
                           cache->partitions * cache->sets);
 
     *eax = CACHE_TYPE(cache->type) | CACHE_LEVEL(cache->level) |
                (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0);
-
-    /* L3 is shared among multiple cores */
-    if (cache->level == 3) {
-        sharing_apic_ids = 1 << apicid_die_offset(topo_info);
-    } else {
-        sharing_apic_ids = 1 << apicid_core_offset(topo_info);
-    }
-    *eax |= (sharing_apic_ids - 1) << 14;
+    *eax |= max_processor_ids_for_cache(cache, topo_info) << 14;
 
     assert(cache->line_size > 0);
     assert(cache->partitions > 0);
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH RESEND 18/18] i386: Add new property to control L2 cache topo in CPUID.04H
  2023-02-13  9:36 [PATCH RESEND 00/18] Support smp.clusters for x86 Zhao Liu
                   ` (16 preceding siblings ...)
  2023-02-13  9:36 ` [PATCH RESEND 17/18] i386: Use CPUCacheInfo.share_level to encode " Zhao Liu
@ 2023-02-13  9:36 ` Zhao Liu
  2023-02-16 13:14   ` wangyanan (Y) via
  17 siblings, 1 reply; 61+ messages in thread
From: Zhao Liu @ 2023-02-13  9:36 UTC (permalink / raw)
  To: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Eric Blake, Markus Armbruster
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

The property x-l2-cache-topo will be used to change the L2 cache
topology in CPUID.04H.

Currently it allows the user to set whether the L2 cache is shared at
the core level or at the cluster level.

If the user passes "-cpu x-l2-cache-topo=[core|cluster]", the default L2
cache topology will be overridden by the new topology setting.

Here we expose "cluster" to the user instead of "module", to be
consistent with the "cluster-id" naming.

Since CPUID.04H is used by Intel CPUs, this property is only available
on Intel CPUs for now.

When necessary, it can be extended to CPUID.8000001DH for AMD CPUs.
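
For illustration, a usage sketch of the property (the CPU model and -smp
values below are placeholders, not part of this patch):

```shell
# Sketch: make the guest's L2 cache shared per cluster instead of per core.
# "Icelake-Server" and the -smp topology are example values; x-l2-cache-topo
# accepts "core" or "cluster" as described above.
qemu-system-x86_64 \
    -machine q35 \
    -smp 16,sockets=1,dies=1,clusters=2,cores=4,threads=2 \
    -cpu Icelake-Server,x-l2-cache-topo=cluster
```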

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 target/i386/cpu.c | 33 ++++++++++++++++++++++++++++++++-
 target/i386/cpu.h |  2 ++
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 5816dc99b1d4..cf84c720a431 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -240,12 +240,15 @@ static uint32_t max_processor_ids_for_cache(CPUCacheInfo *cache,
     case CORE:
         num_ids = 1 << apicid_core_offset(topo_info);
         break;
+    case MODULE:
+        num_ids = 1 << apicid_module_offset(topo_info);
+        break;
     case DIE:
         num_ids = 1 << apicid_die_offset(topo_info);
         break;
     default:
         /*
-         * Currently there is no use case for SMT, MODULE and PACKAGE, so use
+         * Currently there is no use case for SMT and PACKAGE, so use
          * assert directly to facilitate debugging.
          */
         g_assert_not_reached();
@@ -6633,6 +6636,33 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
         env->cache_info_amd.l3_cache = &legacy_l3_cache;
     }
 
+    if (cpu->l2_cache_topo_level) {
+        /*
+         * FIXME: Currently only supports changing CPUID[4] (for intel), and
+         * will support changing CPUID[0x8000001D] when necessary.
+         */
+        if (!IS_INTEL_CPU(env)) {
+            error_setg(errp, "only intel cpus supports x-l2-cache-topo");
+            return;
+        }
+
+        if (!strcmp(cpu->l2_cache_topo_level, "core")) {
+            env->cache_info_cpuid4.l2_cache->share_level = CORE;
+        } else if (!strcmp(cpu->l2_cache_topo_level, "cluster")) {
+            /*
+             * We expose to users "cluster" instead of "module", to be
+             * consistent with "cluster-id" naming.
+             */
+            env->cache_info_cpuid4.l2_cache->share_level = MODULE;
+        } else {
+            error_setg(errp,
+                       "x-l2-cache-topo doesn't support '%s', "
+                       "and it only supports 'core' or 'cluster'",
+                       cpu->l2_cache_topo_level);
+            return;
+        }
+    }
+
 #ifndef CONFIG_USER_ONLY
     MachineState *ms = MACHINE(qdev_get_machine());
     qemu_register_reset(x86_cpu_machine_reset_cb, cpu);
@@ -7135,6 +7165,7 @@ static Property x86_cpu_properties[] = {
                      false),
     DEFINE_PROP_BOOL("x-intel-pt-auto-level", X86CPU, intel_pt_auto_level,
                      true),
+    DEFINE_PROP_STRING("x-l2-cache-topo", X86CPU, l2_cache_topo_level),
     DEFINE_PROP_END_OF_LIST()
 };
 
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 5a955431f759..aa7e96c586c7 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1987,6 +1987,8 @@ struct ArchCPU {
     int32_t thread_id;
 
     int32_t hv_max_vps;
+
+    char *l2_cache_topo_level;
 };
 
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH RESEND 01/18] machine: Fix comment of machine_parse_smp_config()
  2023-02-13  9:36 ` [PATCH RESEND 01/18] machine: Fix comment of machine_parse_smp_config() Zhao Liu
@ 2023-02-13 13:31   ` wangyanan (Y) via
  2023-02-14 14:22     ` Zhao Liu
  0 siblings, 1 reply; 61+ messages in thread
From: wangyanan (Y) via @ 2023-02-13 13:31 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu


On 2023/2/13 17:36, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
>
> Now smp supports dies and clusters, so add description about these 2
> levels in the comment of machine_parse_smp_config().
>
> Fixes: 864c3b5 (hw/core/machine: Introduce CPU cluster topology support)
> Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
>   hw/core/machine-smp.c | 7 ++++---
>   1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/hw/core/machine-smp.c b/hw/core/machine-smp.c
> index c3dab007dadc..3fd9e641efde 100644
> --- a/hw/core/machine-smp.c
> +++ b/hw/core/machine-smp.c
> @@ -51,8 +51,8 @@ static char *cpu_hierarchy_to_string(MachineState *ms)
>    * machine_parse_smp_config: Generic function used to parse the given
>    *                           SMP configuration
>    *
> - * Any missing parameter in "cpus/maxcpus/sockets/cores/threads" will be
> - * automatically computed based on the provided ones.
> + * Any missing parameter in "cpus/maxcpus/sockets/dies/clusters/cores/threads"
> + * will be automatically computed based on the provided ones.
This is intentional. Newly added topo params (apart from maxcpus/
sockets/cores/threads) will be assigned to 1 and not computed
based on the provided ones. There is no problem with this part.
>    *
>    * In the calculation of omitted sockets/cores/threads: we prefer sockets
>    * over cores over threads before 6.2, while preferring cores over sockets
> @@ -66,7 +66,8 @@ static char *cpu_hierarchy_to_string(MachineState *ms)
>    *
>    * For compatibility, apart from the parameters that will be computed, newly
>    * introduced topology members which are likely to be target specific should
> - * be directly set as 1 if they are omitted (e.g. dies for PC since 4.1).
> + * be directly set as 1 if they are omitted (e.g. dies for PC since v4.1 and
> + * clusters for arm since v7.0).
>    */
Given that we are going to support clusters for the PC machine,
maybe a simple "(i.e. dies for PC since 4.1)" here is good enough?

Thanks,
Yanan
>   void machine_parse_smp_config(MachineState *ms,
>                                 const SMPConfiguration *config, Error **errp)



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RESEND 09/18] i386: Fix comment style in topology.h
  2023-02-13  9:36 ` [PATCH RESEND 09/18] i386: Fix comment style in topology.h Zhao Liu
@ 2023-02-13 13:40   ` Philippe Mathieu-Daudé
  2023-02-14  2:37   ` wangyanan (Y) via
  2023-02-15 10:54   ` wangyanan (Y) via
  2 siblings, 0 replies; 61+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-02-13 13:40 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum, Yanan Wang,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu

On 13/2/23 10:36, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
> 
> For function comments in this file, keep the comment style consistent
> with other places.
> 
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
>   include/hw/i386/topology.h | 33 +++++++++++++++++----------------
>   1 file changed, 17 insertions(+), 16 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RESEND 13/18] hw/i386/pc: Support smp.clusters for x86 PC machine
  2023-02-13  9:36 ` [PATCH RESEND 13/18] hw/i386/pc: Support smp.clusters for x86 PC machine Zhao Liu
@ 2023-02-14  2:34   ` wangyanan (Y) via
  0 siblings, 0 replies; 61+ messages in thread
From: wangyanan (Y) via @ 2023-02-14  2:34 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu

On 2023/2/13 17:36, Zhao Liu wrote:
> From: Zhuocheng Ding <zhuocheng.ding@intel.com>
>
> As module-level topology support is added to X86CPU, now we can enable
> the support for the cluster parameter on PC machines. With this support,
> we can define a 5-level x86 CPU topology with "-smp":
>
> -smp cpus=*,maxcpus=*,sockets=*,dies=*,clusters=*,cores=*,threads=*.
>
> Additionally, add the 5-level topology example in description of "-smp".
>
> Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
>   hw/i386/pc.c    |  1 +
>   qemu-options.hx | 10 +++++-----
>   2 files changed, 6 insertions(+), 5 deletions(-)
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>

Thanks,
Yanan
>
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 6e592bd969aa..c329df56ebd2 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1929,6 +1929,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
>       mc->default_cpu_type = TARGET_DEFAULT_CPU_TYPE;
>       mc->nvdimm_supported = true;
>       mc->smp_props.dies_supported = true;
> +    mc->smp_props.clusters_supported = true;
>       mc->default_ram_id = "pc.ram";
>   
>       object_class_property_add(oc, PC_MACHINE_MAX_RAM_BELOW_4G, "size",
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 88e93c610314..3caf9da4c3af 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -312,14 +312,14 @@ SRST
>           -smp 8,sockets=2,cores=2,threads=2,maxcpus=8
>   
>       The following sub-option defines a CPU topology hierarchy (2 sockets
> -    totally on the machine, 2 dies per socket, 2 cores per die, 2 threads
> -    per core) for PC machines which support sockets/dies/cores/threads.
> -    Some members of the option can be omitted but their values will be
> -    automatically computed:
> +    totally on the machine, 2 dies per socket, 2 clusters per die, 2 cores per
> +    cluster, 2 threads per core) for PC machines which support sockets/dies
> +    /clusters/cores/threads. Some members of the option can be omitted but
> +    their values will be automatically computed:
>   
>       ::
>   
> -        -smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16
> +        -smp 32,sockets=2,dies=2,clusters=2,cores=2,threads=2,maxcpus=32
>   
>       The following sub-option defines a CPU topology hierarchy (2 sockets
>       totally on the machine, 2 clusters per socket, 2 cores per cluster,



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RESEND 09/18] i386: Fix comment style in topology.h
  2023-02-13  9:36 ` [PATCH RESEND 09/18] i386: Fix comment style in topology.h Zhao Liu
  2023-02-13 13:40   ` Philippe Mathieu-Daudé
@ 2023-02-14  2:37   ` wangyanan (Y) via
  2023-02-15 10:54   ` wangyanan (Y) via
  2 siblings, 0 replies; 61+ messages in thread
From: wangyanan (Y) via @ 2023-02-14  2:37 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu

On 2023/2/13 17:36, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
>
> For function comments in this file, keep the comment style consistent
> with other places.
>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
>   include/hw/i386/topology.h | 33 +++++++++++++++++----------------
>   1 file changed, 17 insertions(+), 16 deletions(-)
>
> diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> index b0174c18b7bd..5de905dc00d3 100644
> --- a/include/hw/i386/topology.h
> +++ b/include/hw/i386/topology.h
> @@ -24,7 +24,8 @@
>   #ifndef HW_I386_TOPOLOGY_H
>   #define HW_I386_TOPOLOGY_H
>   
> -/* This file implements the APIC-ID-based CPU topology enumeration logic,
> +/*
> + * This file implements the APIC-ID-based CPU topology enumeration logic,
>    * documented at the following document:
>    *   Intel® 64 Architecture Processor Topology Enumeration
>    *   http://software.intel.com/en-us/articles/intel-64-architecture-processor-topology-enumeration/
> @@ -41,7 +42,8 @@
>   
>   #include "qemu/bitops.h"
>   
> -/* APIC IDs can be 32-bit, but beware: APIC IDs > 255 require x2APIC support
> +/*
> + * APIC IDs can be 32-bit, but beware: APIC IDs > 255 require x2APIC support
>    */
>   typedef uint32_t apic_id_t;
>   
> @@ -60,8 +62,7 @@ typedef struct X86CPUTopoInfo {
>       unsigned threads_per_core;
>   } X86CPUTopoInfo;
>   
> -/* Return the bit width needed for 'count' IDs
> - */
> +/* Return the bit width needed for 'count' IDs */
>   static unsigned apicid_bitwidth_for_count(unsigned count)
>   {
>       g_assert(count >= 1);
> @@ -69,15 +70,13 @@ static unsigned apicid_bitwidth_for_count(unsigned count)
>       return count ? 32 - clz32(count) : 0;
>   }
>   
> -/* Bit width of the SMT_ID (thread ID) field on the APIC ID
> - */
> +/* Bit width of the SMT_ID (thread ID) field on the APIC ID */
>   static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
>   {
>       return apicid_bitwidth_for_count(topo_info->threads_per_core);
>   }
>   
> -/* Bit width of the Core_ID field
> - */
> +/* Bit width of the Core_ID field */
>   static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
>   {
>       /*
> @@ -94,8 +93,7 @@ static inline unsigned apicid_die_width(X86CPUTopoInfo *topo_info)
>       return apicid_bitwidth_for_count(topo_info->dies_per_pkg);
>   }
>   
> -/* Bit offset of the Core_ID field
> - */
> +/* Bit offset of the Core_ID field */
>   static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
>   {
>       return apicid_smt_width(topo_info);
> @@ -107,14 +105,14 @@ static inline unsigned apicid_die_offset(X86CPUTopoInfo *topo_info)
>       return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
>   }
>   
> -/* Bit offset of the Pkg_ID (socket ID) field
> - */
> +/* Bit offset of the Pkg_ID (socket ID) field */
>   static inline unsigned apicid_pkg_offset(X86CPUTopoInfo *topo_info)
>   {
>       return apicid_die_offset(topo_info) + apicid_die_width(topo_info);
>   }
>   
> -/* Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
> +/*
> + * Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
>    *
>    * The caller must make sure core_id < nr_cores and smt_id < nr_threads.
>    */
> @@ -127,7 +125,8 @@ static inline apic_id_t x86_apicid_from_topo_ids(X86CPUTopoInfo *topo_info,
>              topo_ids->smt_id;
>   }
>   
> -/* Calculate thread/core/package IDs for a specific topology,
> +/*
> + * Calculate thread/core/package IDs for a specific topology,
>    * based on (contiguous) CPU index
>    */
>   static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
> @@ -154,7 +153,8 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
>       topo_ids->smt_id = cpu_index % nr_threads;
>   }
>   
> -/* Calculate thread/core/package IDs for a specific topology,
> +/*
> + * Calculate thread/core/package IDs for a specific topology,
>    * based on APIC ID
>    */
>   static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
> @@ -178,7 +178,8 @@ static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
>       topo_ids->pkg_id = apicid >> apicid_pkg_offset(topo_info);
>   }
>   
> -/* Make APIC ID for the CPU 'cpu_index'
> +/*
> + * Make APIC ID for the CPU 'cpu_index'
>    *
>    * 'cpu_index' is a sequential, contiguous ID for the CPU.
>    */
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>

Thanks,
Yanan


^ permalink raw reply	[flat|nested] 61+ messages in thread
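
[Editorial aside: the APIC ID width/offset helpers quoted in the review
above can be sketched outside QEMU as below. The struct and function
names are hypothetical simplifications of the topology.h code, shown
only to make the bit-layout arithmetic concrete.]

```c
#include <assert.h>

/* Bit width needed to represent 'count' IDs: ceil(log2(count)). */
static unsigned bitwidth_for_count(unsigned count)
{
    unsigned w = 0;

    while ((1u << w) < count) {
        w++;
    }
    return w;
}

/* Simplified stand-in for X86CPUTopoInfo. */
struct topo {
    unsigned dies_per_pkg;
    unsigned cores_per_die;
    unsigned threads_per_core;
};

/* APIC ID layout, low bits to high: SMT_ID | Core_ID | Die_ID | Pkg_ID */
static unsigned smt_width(const struct topo *t)
{
    return bitwidth_for_count(t->threads_per_core);
}

static unsigned core_offset(const struct topo *t)
{
    return smt_width(t);
}

static unsigned core_width(const struct topo *t)
{
    return bitwidth_for_count(t->cores_per_die);
}

static unsigned die_offset(const struct topo *t)
{
    return core_offset(t) + core_width(t);
}

static unsigned die_width(const struct topo *t)
{
    return bitwidth_for_count(t->dies_per_pkg);
}

static unsigned pkg_offset(const struct topo *t)
{
    return die_offset(t) + die_width(t);
}
```

For a package with 2 dies, 4 cores per die, and 2 threads per core, the
SMT field takes 1 bit, the core field 2 bits, and the die field 1 bit,
so the package ID starts at bit 4.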

* Re: [PATCH RESEND 01/18] machine: Fix comment of machine_parse_smp_config()
  2023-02-13 13:31   ` wangyanan (Y) via
@ 2023-02-14 14:22     ` Zhao Liu
  0 siblings, 0 replies; 61+ messages in thread
From: Zhao Liu @ 2023-02-14 14:22 UTC (permalink / raw)
  To: wangyanan (Y)
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster, qemu-devel, Zhenyu Wang,
	Dapeng Mi, Zhuocheng Ding, Robert Hoo, Xiaoyao Li, Like Xu,
	Zhao Liu

On Mon, Feb 13, 2023 at 09:31:53PM +0800, wangyanan (Y) wrote:
> Date: Mon, 13 Feb 2023 21:31:53 +0800
> From: "wangyanan (Y)" <wangyanan55@huawei.com>
> Subject: Re: [PATCH RESEND 01/18] machine: Fix comment of
>  machine_parse_smp_config()
> 
> 
> On 2023/2/13 17:36, Zhao Liu wrote:
> > From: Zhao Liu <zhao1.liu@intel.com>
> > 
> > Now smp supports dies and clusters, so add description about these 2
> > levels in the comment of machine_parse_smp_config().
> > 
> > Fixes: 864c3b5 (hw/core/machine: Introduce CPU cluster topology support)
> > Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > ---
> >   hw/core/machine-smp.c | 7 ++++---
> >   1 file changed, 4 insertions(+), 3 deletions(-)
> > 
> > diff --git a/hw/core/machine-smp.c b/hw/core/machine-smp.c
> > index c3dab007dadc..3fd9e641efde 100644
> > --- a/hw/core/machine-smp.c
> > +++ b/hw/core/machine-smp.c
> > @@ -51,8 +51,8 @@ static char *cpu_hierarchy_to_string(MachineState *ms)
> >    * machine_parse_smp_config: Generic function used to parse the given
> >    *                           SMP configuration
> >    *
> > - * Any missing parameter in "cpus/maxcpus/sockets/cores/threads" will be
> > - * automatically computed based on the provided ones.
> > + * Any missing parameter in "cpus/maxcpus/sockets/dies/clusters/cores/threads"
> > + * will be automatically computed based on the provided ones.
> This is intential. Newly added topo params (apart from maxcpus/
> socket/cores/threads) wiil be assigned to 1 and not computed
> based the provided ones. There is no problem about this part.

Sorry, I see. Will fix.

> >    *
> >    * In the calculation of omitted sockets/cores/threads: we prefer sockets
> >    * over cores over threads before 6.2, while preferring cores over sockets
> > @@ -66,7 +66,8 @@ static char *cpu_hierarchy_to_string(MachineState *ms)
> >    *
> >    * For compatibility, apart from the parameters that will be computed, newly
> >    * introduced topology members which are likely to be target specific should
> > - * be directly set as 1 if they are omitted (e.g. dies for PC since 4.1).
> > + * be directly set as 1 if they are omitted (e.g. dies for PC since v4.1 and
> > + * clusters for arm since v7.0).
> >    */
> Given that we are going to support cluster for PC machine.
> Maybe simple "(i.e. dies for PC since 4.1)" here is good enough?

Makes sense. Now I understand this logic, and I will drop this commit.

Thanks,
Zhao

> 
> Thanks,
> Yanan
> >   void machine_parse_smp_config(MachineState *ms,
> >                                 const SMPConfiguration *config, Error **errp)
> 


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RESEND 02/18] tests: Rename test-x86-cpuid.c to test-x86-apicid.c
  2023-02-13  9:36 ` [PATCH RESEND 02/18] tests: Rename test-x86-cpuid.c to test-x86-apicid.c Zhao Liu
@ 2023-02-15  2:36   ` wangyanan (Y) via
  2023-02-15  3:35     ` Zhao Liu
  0 siblings, 1 reply; 61+ messages in thread
From: wangyanan (Y) via @ 2023-02-15  2:36 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu

On 2023/2/13 17:36, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
>
> In fact, this unit tests APIC ID other than CPUID.
> Rename to test-x86-apicid.c to make its name more in line with its
> actual content.
>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
>   MAINTAINERS                                        | 2 +-
>   tests/unit/meson.build                             | 4 ++--
>   tests/unit/{test-x86-cpuid.c => test-x86-apicid.c} | 2 +-
>   3 files changed, 4 insertions(+), 4 deletions(-)
>   rename tests/unit/{test-x86-cpuid.c => test-x86-apicid.c} (99%)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 96e25f62acaa..71c1bc24371b 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1679,7 +1679,7 @@ F: include/hw/southbridge/piix.h
>   F: hw/misc/sga.c
>   F: hw/isa/apm.c
>   F: include/hw/isa/apm.h
> -F: tests/unit/test-x86-cpuid.c
> +F: tests/unit/test-x86-apicid.c
>   F: tests/qtest/test-x86-cpuid-compat.c
>   
>   PC Chipset
> diff --git a/tests/unit/meson.build b/tests/unit/meson.build
> index ffa444f4323c..a9df2843e92e 100644
> --- a/tests/unit/meson.build
> +++ b/tests/unit/meson.build
> @@ -20,8 +20,8 @@ tests = {
>     'test-opts-visitor': [testqapi],
>     'test-visitor-serialization': [testqapi],
>     'test-bitmap': [],
> -  # all code tested by test-x86-cpuid is inside topology.h
> -  'test-x86-cpuid': [],
> +  # all code tested by test-x86-apicid is inside topology.h
> +  'test-x86-apicid': [],
>     'test-cutils': [],
>     'test-div128': [],
>     'test-shift128': [],
> diff --git a/tests/unit/test-x86-cpuid.c b/tests/unit/test-x86-apicid.c
> similarity index 99%
> rename from tests/unit/test-x86-cpuid.c
> rename to tests/unit/test-x86-apicid.c
> index bfabc0403a1a..2b104f86d7c2 100644
> --- a/tests/unit/test-x86-cpuid.c
> +++ b/tests/unit/test-x86-apicid.c
> @@ -1,5 +1,5 @@
>   /*
> - *  Test code for x86 CPUID and Topology functions
> + *  Test code for x86 APIC ID and Topology functions
>    *
I'm not very sure. The name "CPUID" sounds like a general test for
various kinds of CPU IDs.
Besides APIC IDs computed from x86_apicid_from_cpu_idx(), there are also
topo IDs computed from x86_topo_ids_from_idx(), although this kind of ID
is not tested in test-x86-cpuid.c so far.

Thanks,
Yanan
>    *  Copyright (c) 2012 Red Hat Inc.
>    *



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RESEND 03/18] softmmu: Fix CPUSTATE.nr_cores' calculation
  2023-02-13  9:36 ` [PATCH RESEND 03/18] softmmu: Fix CPUSTATE.nr_cores' calculation Zhao Liu
@ 2023-02-15  2:58   ` wangyanan (Y) via
  2023-02-15  3:37     ` Zhao Liu
  0 siblings, 1 reply; 61+ messages in thread
From: wangyanan (Y) via @ 2023-02-15  2:58 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu

Hi Zhao,

On 2023/2/13 17:36, Zhao Liu wrote:
> From: Zhuocheng Ding <zhuocheng.ding@intel.com>
>
> >From CPUState.nr_cores' comment, it represents "number of cores within
> this CPU package".
>
> After 003f230 (machine: Tweak the order of topology members in struct
> CpuTopology), the meaning of smp.cores changed to "the number of cores
> in one die", but this commit missed to change CPUState.nr_cores'
> caculation, so that CPUState.nr_cores became wrong and now it
> misses to consider numbers of clusters and dies.
>
> At present, only i386 is using CPUState.nr_cores.
>
> But as for i386, which supports die level, the uses of CPUState.nr_cores
> are very confusing:
>
> Early uses are based on the meaning of "cores per package" (before die
> is introduced into i386), and later uses are based on "cores per die"
> (after die's introduction).
>
> This difference is due to that commit a94e142 (target/i386: Add CPUID.1F
> generation support for multi-dies PCMachine) misunderstood that
> CPUState.nr_cores means "cores per die" when caculated
> CPUID.1FH.01H:EBX. After that, the changes in i386 all followed this
> wrong understanding.
>
> With the influence of 003f230 and a94e142, for i386 currently the result
> of CPUState.nr_cores is "cores per die", thus the original uses of
> CPUState.cores based on the meaning of "cores per package" are wrong
> when mutiple dies exist:
> 1. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.01H:EBX[bits 23:16] is
>     incorrect because it expects "cpus per package" but now the
>     result is "cpus per die".
> 2. In cpu_x86_cpuid() of target/i386/cpu.c, for all leaves of CPUID.04H:
>     EAX[bits 31:26] is incorrect because they expect "cpus per package"
>     but now the result is "cpus per die". The error not only impacts the
>     EAX caculation in cache_info_passthrough case, but also impacts other
>     cases of setting cache topology for Intel CPU according to cpu
>     topology (specifically, the incoming parameter "num_cores" expects
>     "cores per package" in encode_cache_cpuid4()).
> 3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.0BH.01H:EBX[bits
>     15:00] is incorrect because the EBX of 0BH.01H (core level) expects
>     "cpus per package", which may be different with 1FH.01H (The reason
>     is 1FH can support more levels. For QEMU, 1FH also supports die,
>     1FH.01H:EBX[bits 15:00] expects "cpus per die").
> 4. In cpu_x86_cpuid() of target/i386/cpu.c, when CPUID.80000001H is
>     caculated, here "cpus per package" is expected to be checked, but in
>     fact, now it checks "cpus per die". Though "cpus per die" also works
>     for this code logic, this isn't consistent with AMD's APM.
> 5. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.80000008H:ECX expects
>     "cpus per package" but it obtains "cpus per die".
> 6. In simulate_rdmsr() of target/i386/hvf/x86_emu.c, in
>     kvm_rdmsr_core_thread_count() of target/i386/kvm/kvm.c, and in
>     helper_rdmsr() of target/i386/tcg/sysemu/misc_helper.c,
>     MSR_CORE_THREAD_COUNT expects "cpus per package" and "cores per
>     package", but in these functions, it obtains "cpus per die" and
>     "cores per die".
>
> On the other hand, these uses are correct now (they are added in/after
> a94e142):
> 1. In cpu_x86_cpuid() of target/i386/cpu.c, topo_info.cores_per_die
>     meets the actual meaning of CPUState.nr_cores ("cores per die").
> 2. In cpu_x86_cpuid() of target/i386/cpu.c, vcpus_per_socket (in CPUID.
>     04H's caculation) considers number of dies, so it's correct.
> 3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.1FH.01H:EBX[bits
>     15:00] needs "cpus per die" and it gets the correct result, and
>     CPUID.1FH.02H:EBX[bits 15:00] gets correct "cpus per package".
>
> When CPUState.nr_cores is correctly changed to "cores per package" again
> , the above errors will be fixed without extra work, but the "currently"
> correct cases will go wrong and need special handling to pass correct
> "cpus/cores per die" they want.
>
> Thus in this patch, we fix CPUState.nr_cores' caculation to fit the
> original meaning "cores per package", as well as changing caculation of
> topo_info.cores_per_die, vcpus_per_socket and CPUID.1FH.
>
> In addition, in the nr_threads' comment, specify it represents the
> number of threads in the "core" to avoid confusion.
>
> Fixes: a94e142 (target/i386: Add CPUID.1F generation support for multi-dies PCMachine)
> Fixes: 003f230 (machine: Tweak the order of topology members in struct CpuTopology)
> Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
>   include/hw/core/cpu.h | 2 +-
>   softmmu/cpus.c        | 2 +-
>   target/i386/cpu.c     | 9 ++++-----
>   3 files changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> index 2417597236bc..5253e4e839bb 100644
> --- a/include/hw/core/cpu.h
> +++ b/include/hw/core/cpu.h
> @@ -274,7 +274,7 @@ struct qemu_work_item;
>    *   QOM parent.
>    * @tcg_cflags: Pre-computed cflags for this cpu.
>    * @nr_cores: Number of cores within this CPU package.
> - * @nr_threads: Number of threads within this CPU.
> + * @nr_threads: Number of threads within this CPU core.
>    * @running: #true if CPU is currently running (lockless).
>    * @has_waiter: #true if a CPU is currently waiting for the cpu_exec_end;
>    * valid under cpu_list_lock.
> diff --git a/softmmu/cpus.c b/softmmu/cpus.c
> index 9cbc8172b5f2..9996e6a3b295 100644
> --- a/softmmu/cpus.c
> +++ b/softmmu/cpus.c
> @@ -630,7 +630,7 @@ void qemu_init_vcpu(CPUState *cpu)
>   {
>       MachineState *ms = MACHINE(qdev_get_machine());
>   
> -    cpu->nr_cores = ms->smp.cores;
> +    cpu->nr_cores = ms->smp.dies * ms->smp.clusters * ms->smp.cores;
>       cpu->nr_threads =  ms->smp.threads;
>       cpu->stopped = true;
>       cpu->random_seed = qemu_guest_random_seed_thread_part1();
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 4d2b8d0444df..29afec12c281 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -5218,7 +5218,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>       X86CPUTopoInfo topo_info;
>   
>       topo_info.dies_per_pkg = env->nr_dies;
> -    topo_info.cores_per_die = cs->nr_cores;
> +    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
Is it better to also add a description for env->nr_dies in X86CPUState,
e.g. "/* Number of dies within this CPU package */", to avoid confusion?
>       topo_info.threads_per_core = cs->nr_threads;
>   
>       /* Calculate & apply limits for different index ranges */
> @@ -5294,8 +5294,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>                */
>               if (*eax & 31) {
>                   int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
> -                int vcpus_per_socket = env->nr_dies * cs->nr_cores *
> -                                       cs->nr_threads;
> +                int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
>                   if (cs->nr_cores > 1) {
>                       *eax &= ~0xFC000000;
>                       *eax |= (pow2ceil(cs->nr_cores) - 1) << 26;
> @@ -5468,12 +5467,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>               break;
>           case 1:
>               *eax = apicid_die_offset(&topo_info);
> -            *ebx = cs->nr_cores * cs->nr_threads;
> +            *ebx = topo_info.cores_per_die * topo_info.threads_per_core;
>               *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
>               break;
>           case 2:
>               *eax = apicid_pkg_offset(&topo_info);
> -            *ebx = env->nr_dies * cs->nr_cores * cs->nr_threads;
> +            *ebx = cs->nr_cores * cs->nr_threads;
>               *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
>               break;
>           default:
Otherwise:
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>

Thanks,
Yanan



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RESEND 05/18] i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
  2023-02-13  9:36 ` [PATCH RESEND 05/18] i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid() Zhao Liu
@ 2023-02-15  3:28   ` wangyanan (Y) via
  2023-02-15  7:10     ` Zhao Liu
  0 siblings, 1 reply; 61+ messages in thread
From: wangyanan (Y) via @ 2023-02-15  3:28 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu

On 2023/2/13 17:36, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
>
> In cpu_x86_cpuid(), there are many variables representing the CPU
> topology, e.g., topo_info and cs->nr_cores/cs->nr_threads.
>
> Since the names of cs->nr_cores/cs->nr_threads do not accurately
> represent their meaning, the use of cs->nr_cores/cs->nr_threads is prone
> to confusion and mistakes.
>
> And the structure X86CPUTopoInfo names its members clearly, thus the
> variable "topo_info" should be preferred.
>
> Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
>   target/i386/cpu.c | 30 ++++++++++++++++++------------
>   1 file changed, 18 insertions(+), 12 deletions(-)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 7833505092d8..4cda84eb96f1 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -5215,11 +5215,15 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>       uint32_t limit;
>       uint32_t signature[3];
>       X86CPUTopoInfo topo_info;
> +    uint32_t cpus_per_pkg;
>   
>       topo_info.dies_per_pkg = env->nr_dies;
>       topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
>       topo_info.threads_per_core = cs->nr_threads;
>   
> +    cpus_per_pkg = topo_info.dies_per_pkg * topo_info.cores_per_die *
> +                   topo_info.threads_per_core;
> +
>       /* Calculate & apply limits for different index ranges */
>       if (index >= 0xC0000000) {
>           limit = env->cpuid_xlevel2;
> @@ -5255,8 +5259,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>               *ecx |= CPUID_EXT_OSXSAVE;
>           }
>           *edx = env->features[FEAT_1_EDX];
> -        if (cs->nr_cores * cs->nr_threads > 1) {
> -            *ebx |= (cs->nr_cores * cs->nr_threads) << 16;
> +        if (cpus_per_pkg > 1) {
> +            *ebx |= cpus_per_pkg << 16;
>               *edx |= CPUID_HT;
>           }
>           if (!cpu->enable_pmu) {
> @@ -5293,10 +5297,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>                */
>               if (*eax & 31) {
>                   int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
> -                int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
> -                if (cs->nr_cores > 1) {
> +                int vcpus_per_socket = cpus_per_pkg;
Would it make sense to directly use cpus_per_pkg here?
> +                int cores_per_socket = topo_info.cores_per_die *
> +                                       topo_info.dies_per_pkg;
There are other places in cpu_x86_cpuid where cs->nr_cores is used
separately; why not make a global "cores_per_pkg" like cpus_per_pkg
and also tweak the other places?
> +                if (cores_per_socket > 1) {
>                       *eax &= ~0xFC000000;
> -                    *eax |= (pow2ceil(cs->nr_cores) - 1) << 26;
> +                    *eax |= (pow2ceil(cores_per_socket) - 1) << 26;
>                   }
>                   if (host_vcpus_per_cache > vcpus_per_socket) {
>                       *eax &= ~0x3FFC000;
> @@ -5436,12 +5442,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>           switch (count) {
>           case 0:
>               *eax = apicid_core_offset(&topo_info);
> -            *ebx = cs->nr_threads;
> +            *ebx = topo_info.threads_per_core;
There are many other places in cpu_x86_cpuid where cs->nr_threads
is used separately, such as encode_cache_cpuid4(); should we
replace them all?
>               *ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
>               break;
>           case 1:
>               *eax = apicid_pkg_offset(&topo_info);
> -            *ebx = cs->nr_cores * cs->nr_threads;
> +            *ebx = cpus_per_pkg;
>               *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
>               break;
>           default:
> @@ -5472,7 +5478,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>           switch (count) {
>           case 0:
>               *eax = apicid_core_offset(&topo_info);
> -            *ebx = cs->nr_threads;
> +            *ebx = topo_info.threads_per_core;
>               *ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
>               break;
>           case 1:
> @@ -5482,7 +5488,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>               break;
>           case 2:
>               *eax = apicid_pkg_offset(&topo_info);
> -            *ebx = cs->nr_cores * cs->nr_threads;
> +            *ebx = cpus_per_pkg;
>               *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
>               break;
>           default:
> @@ -5707,7 +5713,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>            * discards multiple thread information if it is set.
>            * So don't set it here for Intel to make Linux guests happy.
>            */
> -        if (cs->nr_cores * cs->nr_threads > 1) {
> +        if (cpus_per_pkg > 1) {
>               if (env->cpuid_vendor1 != CPUID_VENDOR_INTEL_1 ||
>                   env->cpuid_vendor2 != CPUID_VENDOR_INTEL_2 ||
>                   env->cpuid_vendor3 != CPUID_VENDOR_INTEL_3) {
> @@ -5769,7 +5775,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>                *eax |= (cpu_x86_virtual_addr_width(env) << 8);
>           }
>           *ebx = env->features[FEAT_8000_0008_EBX];
> -        if (cs->nr_cores * cs->nr_threads > 1) {
> +        if (cpus_per_pkg > 1) {
>               /*
>                * Bits 15:12 is "The number of bits in the initial
>                * Core::X86::Apic::ApicId[ApicId] value that indicate
> @@ -5777,7 +5783,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>                * Bits 7:0 is "The number of threads in the package is NC+1"
>                */
>               *ecx = (apicid_pkg_offset(&topo_info) << 12) |
> -                   ((cs->nr_cores * cs->nr_threads) - 1);
> +                   (cpus_per_pkg - 1);
>           } else {
>               *ecx = 0;
>           }
Thanks,
Yanan



* Re: [PATCH RESEND 02/18] tests: Rename test-x86-cpuid.c to test-x86-apicid.c
  2023-02-15  2:36   ` wangyanan (Y) via
@ 2023-02-15  3:35     ` Zhao Liu
  2023-02-15  7:44       ` wangyanan (Y) via
  0 siblings, 1 reply; 61+ messages in thread
From: Zhao Liu @ 2023-02-15  3:35 UTC (permalink / raw)
  To: wangyanan (Y)
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster, qemu-devel, Zhenyu Wang,
	Dapeng Mi, Zhuocheng Ding, Robert Hoo, Xiaoyao Li, Like Xu,
	Zhao Liu

On Wed, Feb 15, 2023 at 10:36:34AM +0800, wangyanan (Y) wrote:
> Date: Wed, 15 Feb 2023 10:36:34 +0800
> From: "wangyanan (Y)" <wangyanan55@huawei.com>
> Subject: Re: [PATCH RESEND 02/18] tests: Rename test-x86-cpuid.c to
>  test-x86-apicid.c
> 
> On 2023/2/13 17:36, Zhao Liu wrote:
> > From: Zhao Liu <zhao1.liu@intel.com>
> > 
> > In fact, this unit tests APIC IDs rather than CPUID.
> > Rename it to test-x86-apicid.c to make its name more in line with its
> > actual content.
> > 
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > ---
> >   MAINTAINERS                                        | 2 +-
> >   tests/unit/meson.build                             | 4 ++--
> >   tests/unit/{test-x86-cpuid.c => test-x86-apicid.c} | 2 +-
> >   3 files changed, 4 insertions(+), 4 deletions(-)
> >   rename tests/unit/{test-x86-cpuid.c => test-x86-apicid.c} (99%)
> > 
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 96e25f62acaa..71c1bc24371b 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -1679,7 +1679,7 @@ F: include/hw/southbridge/piix.h
> >   F: hw/misc/sga.c
> >   F: hw/isa/apm.c
> >   F: include/hw/isa/apm.h
> > -F: tests/unit/test-x86-cpuid.c
> > +F: tests/unit/test-x86-apicid.c
> >   F: tests/qtest/test-x86-cpuid-compat.c
> >   PC Chipset
> > diff --git a/tests/unit/meson.build b/tests/unit/meson.build
> > index ffa444f4323c..a9df2843e92e 100644
> > --- a/tests/unit/meson.build
> > +++ b/tests/unit/meson.build
> > @@ -20,8 +20,8 @@ tests = {
> >     'test-opts-visitor': [testqapi],
> >     'test-visitor-serialization': [testqapi],
> >     'test-bitmap': [],
> > -  # all code tested by test-x86-cpuid is inside topology.h
> > -  'test-x86-cpuid': [],
> > +  # all code tested by test-x86-apicid is inside topology.h
> > +  'test-x86-apicid': [],
> >     'test-cutils': [],
> >     'test-div128': [],
> >     'test-shift128': [],
> > diff --git a/tests/unit/test-x86-cpuid.c b/tests/unit/test-x86-apicid.c
> > similarity index 99%
> > rename from tests/unit/test-x86-cpuid.c
> > rename to tests/unit/test-x86-apicid.c
> > index bfabc0403a1a..2b104f86d7c2 100644
> > --- a/tests/unit/test-x86-cpuid.c
> > +++ b/tests/unit/test-x86-apicid.c
> > @@ -1,5 +1,5 @@
> >   /*
> > - *  Test code for x86 CPUID and Topology functions
> > + *  Test code for x86 APIC ID and Topology functions
> >    *
> I'm not very sure. "CPUID" sounds like a general test for various kinds
> of CPU IDs.

CPUID usually refers to the basic x86 instruction used to obtain CPU
information, so such naming is prone to ambiguity.

The x86 CPU topology info is parsed from the APIC ID, including the
sub-IDs of each topology level (such as the thread ID, core ID, etc.).
These sub-IDs are all part of the APIC ID.

> Besides APIC IDs computed from x86_apicid_from_cpu_idx(), there are also
> topo IDs computed from x86_topo_ids_from_idx(), although these IDs
> are not tested in test-x86-cpuid.c so far.

What about "test-x86-topo.c" or "test-x86-topo-ids.c"?

> 
> Thanks,
> Yanan
> >    *  Copyright (c) 2012 Red Hat Inc.
> >    *
> 



* Re: [PATCH RESEND 03/18] softmmu: Fix CPUSTATE.nr_cores' calculation
  2023-02-15  2:58   ` wangyanan (Y) via
@ 2023-02-15  3:37     ` Zhao Liu
  0 siblings, 0 replies; 61+ messages in thread
From: Zhao Liu @ 2023-02-15  3:37 UTC (permalink / raw)
  To: wangyanan (Y)
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster, qemu-devel, Zhenyu Wang,
	Dapeng Mi, Zhuocheng Ding, Robert Hoo, Xiaoyao Li, Like Xu,
	Zhao Liu

On Wed, Feb 15, 2023 at 10:58:07AM +0800, wangyanan (Y) wrote:
> Date: Wed, 15 Feb 2023 10:58:07 +0800
> From: "wangyanan (Y)" <wangyanan55@huawei.com>
> Subject: Re: [PATCH RESEND 03/18] softmmu: Fix CPUSTATE.nr_cores'
>  calculation
> 
> Hi Zhao,
> 
> On 2023/2/13 17:36, Zhao Liu wrote:
> > From: Zhuocheng Ding <zhuocheng.ding@intel.com>
> > 
> > From CPUState.nr_cores' comment, it represents "number of cores within
> > this CPU package".
> > 
> > After 003f230 (machine: Tweak the order of topology members in struct
> > CpuTopology), the meaning of smp.cores changed to "the number of cores
> > in one die", but this commit missed updating CPUState.nr_cores'
> > calculation, so that CPUState.nr_cores became wrong: it now fails
> > to account for the numbers of clusters and dies.
> > 
> > At present, only i386 is using CPUState.nr_cores.
> > 
> > But as for i386, which supports die level, the uses of CPUState.nr_cores
> > are very confusing:
> > 
> > Early uses are based on the meaning of "cores per package" (before die
> > is introduced into i386), and later uses are based on "cores per die"
> > (after die's introduction).
> > 
> > This difference is because commit a94e142 (target/i386: Add CPUID.1F
> > generation support for multi-dies PCMachine) misunderstood
> > CPUState.nr_cores to mean "cores per die" when calculating
> > CPUID.1FH.01H:EBX. After that, the i386 changes all followed this
> > wrong understanding.
> > 
> > With the influence of 003f230 and a94e142, for i386 the result of
> > CPUState.nr_cores is currently "cores per die", thus the original uses
> > of CPUState.nr_cores based on the meaning of "cores per package" are
> > wrong when multiple dies exist:
> > 1. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.01H:EBX[bits 23:16] is
> >     incorrect because it expects "cpus per package" but now the
> >     result is "cpus per die".
> > 2. In cpu_x86_cpuid() of target/i386/cpu.c, for all leaves of CPUID.04H:
> >     EAX[bits 31:26] is incorrect because they expect "cpus per package"
> >     but now the result is "cpus per die". The error not only impacts the
> >     EAX calculation in the cache_info_passthrough case, but also impacts other
> >     cases of setting cache topology for Intel CPU according to cpu
> >     topology (specifically, the incoming parameter "num_cores" expects
> >     "cores per package" in encode_cache_cpuid4()).
> > 3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.0BH.01H:EBX[bits
> >     15:00] is incorrect because the EBX of 0BH.01H (core level) expects
> >     "cpus per package", which may be different from 1FH.01H (the reason
> >     is 1FH can support more levels. For QEMU, 1FH also supports die,
> >     1FH.01H:EBX[bits 15:00] expects "cpus per die").
> > 4. In cpu_x86_cpuid() of target/i386/cpu.c, when CPUID.80000001H is
> >     caculated, here "cpus per package" is expected to be checked, but in
> >     calculated, here "cpus per package" is expected to be checked, but in
> >     for this code logic, this isn't consistent with AMD's APM.
> > 5. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.80000008H:ECX expects
> >     "cpus per package" but it obtains "cpus per die".
> > 6. In simulate_rdmsr() of target/i386/hvf/x86_emu.c, in
> >     kvm_rdmsr_core_thread_count() of target/i386/kvm/kvm.c, and in
> >     helper_rdmsr() of target/i386/tcg/sysemu/misc_helper.c,
> >     MSR_CORE_THREAD_COUNT expects "cpus per package" and "cores per
> >     package", but in these functions, it obtains "cpus per die" and
> >     "cores per die".
> > 
> > On the other hand, these uses are correct now (they are added in/after
> > a94e142):
> > 1. In cpu_x86_cpuid() of target/i386/cpu.c, topo_info.cores_per_die
> >     meets the actual meaning of CPUState.nr_cores ("cores per die").
> > 2. In cpu_x86_cpuid() of target/i386/cpu.c, vcpus_per_socket (in CPUID.
> >     04H's calculation) considers the number of dies, so it's correct.
> > 3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.1FH.01H:EBX[bits
> >     15:00] needs "cpus per die" and it gets the correct result, and
> >     CPUID.1FH.02H:EBX[bits 15:00] gets correct "cpus per package".
> > 
> > When CPUState.nr_cores is correctly changed to "cores per package"
> > again, the above errors will be fixed without extra work, but the
> > "currently" correct cases will go wrong and will need special handling
> > to pass the correct "cpus/cores per die" they want.
> > 
> > Thus in this patch, we fix CPUState.nr_cores' calculation to fit the
> > original meaning "cores per package", as well as changing the
> > calculation of topo_info.cores_per_die, vcpus_per_socket and CPUID.1FH.
> > 
> > In addition, in the nr_threads' comment, specify it represents the
> > number of threads in the "core" to avoid confusion.
> > 
> > Fixes: a94e142 (target/i386: Add CPUID.1F generation support for multi-dies PCMachine)
> > Fixes: 003f230 (machine: Tweak the order of topology members in struct CpuTopology)
> > Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> > Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > ---
> >   include/hw/core/cpu.h | 2 +-
> >   softmmu/cpus.c        | 2 +-
> >   target/i386/cpu.c     | 9 ++++-----
> >   3 files changed, 6 insertions(+), 7 deletions(-)
> > 
> > diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> > index 2417597236bc..5253e4e839bb 100644
> > --- a/include/hw/core/cpu.h
> > +++ b/include/hw/core/cpu.h
> > @@ -274,7 +274,7 @@ struct qemu_work_item;
> >    *   QOM parent.
> >    * @tcg_cflags: Pre-computed cflags for this cpu.
> >    * @nr_cores: Number of cores within this CPU package.
> > - * @nr_threads: Number of threads within this CPU.
> > + * @nr_threads: Number of threads within this CPU core.
> >    * @running: #true if CPU is currently running (lockless).
> >    * @has_waiter: #true if a CPU is currently waiting for the cpu_exec_end;
> >    * valid under cpu_list_lock.
> > diff --git a/softmmu/cpus.c b/softmmu/cpus.c
> > index 9cbc8172b5f2..9996e6a3b295 100644
> > --- a/softmmu/cpus.c
> > +++ b/softmmu/cpus.c
> > @@ -630,7 +630,7 @@ void qemu_init_vcpu(CPUState *cpu)
> >   {
> >       MachineState *ms = MACHINE(qdev_get_machine());
> > -    cpu->nr_cores = ms->smp.cores;
> > +    cpu->nr_cores = ms->smp.dies * ms->smp.clusters * ms->smp.cores;
> >       cpu->nr_threads =  ms->smp.threads;
> >       cpu->stopped = true;
> >       cpu->random_seed = qemu_guest_random_seed_thread_part1();
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 4d2b8d0444df..29afec12c281 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -5218,7 +5218,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >       X86CPUTopoInfo topo_info;
> >       topo_info.dies_per_pkg = env->nr_dies;
> > -    topo_info.cores_per_die = cs->nr_cores;
> > +    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
> Is it better to also add a description for env->nr_dies in X86CPUState,
> e.g. "/* Number of dies within this CPU package */", to avoid confusion?

Yeah, thanks. I'll add this comment.

> >       topo_info.threads_per_core = cs->nr_threads;
> >       /* Calculate & apply limits for different index ranges */
> > @@ -5294,8 +5294,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >                */
> >               if (*eax & 31) {
> >                   int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
> > -                int vcpus_per_socket = env->nr_dies * cs->nr_cores *
> > -                                       cs->nr_threads;
> > +                int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
> >                   if (cs->nr_cores > 1) {
> >                       *eax &= ~0xFC000000;
> >                       *eax |= (pow2ceil(cs->nr_cores) - 1) << 26;
> > @@ -5468,12 +5467,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >               break;
> >           case 1:
> >               *eax = apicid_die_offset(&topo_info);
> > -            *ebx = cs->nr_cores * cs->nr_threads;
> > +            *ebx = topo_info.cores_per_die * topo_info.threads_per_core;
> >               *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
> >               break;
> >           case 2:
> >               *eax = apicid_pkg_offset(&topo_info);
> > -            *ebx = env->nr_dies * cs->nr_cores * cs->nr_threads;
> > +            *ebx = cs->nr_cores * cs->nr_threads;
> >               *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
> >               break;
> >           default:
> Otherwise:
> Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
> 
> Thanks,
> Yanan
> 



* Re: [PATCH RESEND 05/18] i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
  2023-02-15  7:10     ` Zhao Liu
@ 2023-02-15  7:08       ` wangyanan (Y) via
  0 siblings, 0 replies; 61+ messages in thread
From: wangyanan (Y) via @ 2023-02-15  7:08 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster, qemu-devel, Zhenyu Wang,
	Dapeng Mi, Zhuocheng Ding, Robert Hoo, Xiaoyao Li, Like Xu,
	Zhao Liu

On 2023/2/15 15:10, Zhao Liu wrote:
> On Wed, Feb 15, 2023 at 11:28:25AM +0800, wangyanan (Y) wrote:
>> Date: Wed, 15 Feb 2023 11:28:25 +0800
>> From: "wangyanan (Y)" <wangyanan55@huawei.com>
>> Subject: Re: [PATCH RESEND 05/18] i386/cpu: Consolidate the use of
>>   topo_info in cpu_x86_cpuid()
>>
>> On 2023/2/13 17:36, Zhao Liu wrote:
>>> From: Zhao Liu <zhao1.liu@intel.com>
>>>
>>> In cpu_x86_cpuid(), there are many variables representing the CPU
>>> topology, e.g., topo_info and cs->nr_cores/cs->nr_threads.
>>>
>>> Since the names of cs->nr_cores/cs->nr_threads do not accurately
>>> represent their meaning, the use of cs->nr_cores/cs->nr_threads is prone
>>> to confusion and mistakes.
>>>
>>> And the structure X86CPUTopoInfo names its members clearly, thus the
>>> variable "topo_info" should be preferred.
>>>
>>> Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
>>> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
>>> ---
>>>    target/i386/cpu.c | 30 ++++++++++++++++++------------
>>>    1 file changed, 18 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>> index 7833505092d8..4cda84eb96f1 100644
>>> --- a/target/i386/cpu.c
>>> +++ b/target/i386/cpu.c
>>> @@ -5215,11 +5215,15 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>>>        uint32_t limit;
>>>        uint32_t signature[3];
>>>        X86CPUTopoInfo topo_info;
>>> +    uint32_t cpus_per_pkg;
>>>        topo_info.dies_per_pkg = env->nr_dies;
>>>        topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
>>>        topo_info.threads_per_core = cs->nr_threads;
>>> +    cpus_per_pkg = topo_info.dies_per_pkg * topo_info.cores_per_die *
>>> +                   topo_info.threads_per_core;
>>> +
>>>        /* Calculate & apply limits for different index ranges */
>>>        if (index >= 0xC0000000) {
>>>            limit = env->cpuid_xlevel2;
>>> @@ -5255,8 +5259,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>>>                *ecx |= CPUID_EXT_OSXSAVE;
>>>            }
>>>            *edx = env->features[FEAT_1_EDX];
>>> -        if (cs->nr_cores * cs->nr_threads > 1) {
>>> -            *ebx |= (cs->nr_cores * cs->nr_threads) << 16;
>>> +        if (cpus_per_pkg > 1) {
>>> +            *ebx |= cpus_per_pkg << 16;
>>>                *edx |= CPUID_HT;
>>>            }
>>>            if (!cpu->enable_pmu) {
>>> @@ -5293,10 +5297,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>>>                 */
>>>                if (*eax & 31) {
>>>                    int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
>>> -                int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
>>> -                if (cs->nr_cores > 1) {
>>> +                int vcpus_per_socket = cpus_per_pkg;
>> Would it make sense to directly use cpus_per_pkg here?
>>> +                int cores_per_socket = topo_info.cores_per_die *
>>> +                                       topo_info.dies_per_pkg;
>> There are other places in cpu_x86_cpuid where cs->nr_cores is used
>> separately; why not make a global "cores_per_pkg" like cpus_per_pkg
>> and also tweak the other places?
> Yeah, good idea.
>
>>> +                if (cores_per_socket > 1) {
>>>                        *eax &= ~0xFC000000;
>>> -                    *eax |= (pow2ceil(cs->nr_cores) - 1) << 26;
>>> +                    *eax |= (pow2ceil(cores_per_socket) - 1) << 26;
>>>                    }
>>>                    if (host_vcpus_per_cache > vcpus_per_socket) {
>>>                        *eax &= ~0x3FFC000;
>>> @@ -5436,12 +5442,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>>>            switch (count) {
>>>            case 0:
>>>                *eax = apicid_core_offset(&topo_info);
>>> -            *ebx = cs->nr_threads;
>>> +            *ebx = topo_info.threads_per_core;
>> There are many other places in cpu_x86_cpuid where cs->nr_threads
>> is used separately, such as encode_cache_cpuid4(); should we
>> replace them all?
In a previous patch [1], I replaced the use of cs->nr_threads/nr_cores in
the call to encode_cache_cpuid4().

The cleanest way is to pass topo_info to encode_cache_cpuid4(), but this
involves modifying the interface and using the cache topology level, so I
included it in a follow-up patch [2].
Ok, I see. I have not got there yet.😉
>
> [1]: [PATCH RESEND 04/18] i386/cpu: Fix number of addressable IDs in
>       CPUID.04,
>       https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg03188.html
> [2]: [PATCH RESEND 15/18] i386: Use CPUCacheInfo.share_level to encode
>       CPUID[4].EAX[bits 25:14],
>       https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg03199.html
>
>>>                *ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
>>>                break;
>>>            case 1:
>>>                *eax = apicid_pkg_offset(&topo_info);
>>> -            *ebx = cs->nr_cores * cs->nr_threads;
>>> +            *ebx = cpus_per_pkg;
>>>                *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
>>>                break;
>>>            default:
>>> @@ -5472,7 +5478,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>>>            switch (count) {
>>>            case 0:
>>>                *eax = apicid_core_offset(&topo_info);
>>> -            *ebx = cs->nr_threads;
>>> +            *ebx = topo_info.threads_per_core;
>>>                *ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
>>>                break;
>>>            case 1:
>>> @@ -5482,7 +5488,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>>>                break;
>>>            case 2:
>>>                *eax = apicid_pkg_offset(&topo_info);
>>> -            *ebx = cs->nr_cores * cs->nr_threads;
>>> +            *ebx = cpus_per_pkg;
>>>                *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
>>>                break;
>>>            default:
>>> @@ -5707,7 +5713,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>>>             * discards multiple thread information if it is set.
>>>             * So don't set it here for Intel to make Linux guests happy.
>>>             */
>>> -        if (cs->nr_cores * cs->nr_threads > 1) {
>>> +        if (cpus_per_pkg > 1) {
>>>                if (env->cpuid_vendor1 != CPUID_VENDOR_INTEL_1 ||
>>>                    env->cpuid_vendor2 != CPUID_VENDOR_INTEL_2 ||
>>>                    env->cpuid_vendor3 != CPUID_VENDOR_INTEL_3) {
>>> @@ -5769,7 +5775,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>>>                 *eax |= (cpu_x86_virtual_addr_width(env) << 8);
>>>            }
>>>            *ebx = env->features[FEAT_8000_0008_EBX];
>>> -        if (cs->nr_cores * cs->nr_threads > 1) {
>>> +        if (cpus_per_pkg > 1) {
>>>                /*
>>>                 * Bits 15:12 is "The number of bits in the initial
>>>                 * Core::X86::Apic::ApicId[ApicId] value that indicate
>>> @@ -5777,7 +5783,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>>>                 * Bits 7:0 is "The number of threads in the package is NC+1"
>>>                 */
>>>                *ecx = (apicid_pkg_offset(&topo_info) << 12) |
>>> -                   ((cs->nr_cores * cs->nr_threads) - 1);
>>> +                   (cpus_per_pkg - 1);
>>>            } else {
>>>                *ecx = 0;
>>>            }
>> Thanks,
>> Yanan




* Re: [PATCH RESEND 05/18] i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
  2023-02-15  3:28   ` wangyanan (Y) via
@ 2023-02-15  7:10     ` Zhao Liu
  2023-02-15  7:08       ` wangyanan (Y) via
  0 siblings, 1 reply; 61+ messages in thread
From: Zhao Liu @ 2023-02-15  7:10 UTC (permalink / raw)
  To: wangyanan (Y)
  Cc: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster, qemu-devel, Zhenyu Wang,
	Dapeng Mi, Zhuocheng Ding, Robert Hoo, Xiaoyao Li, Like Xu,
	Zhao Liu

On Wed, Feb 15, 2023 at 11:28:25AM +0800, wangyanan (Y) wrote:
> Date: Wed, 15 Feb 2023 11:28:25 +0800
> From: "wangyanan (Y)" <wangyanan55@huawei.com>
> Subject: Re: [PATCH RESEND 05/18] i386/cpu: Consolidate the use of
>  topo_info in cpu_x86_cpuid()
> 
> On 2023/2/13 17:36, Zhao Liu wrote:
> > From: Zhao Liu <zhao1.liu@intel.com>
> > 
> > In cpu_x86_cpuid(), there are many variables representing the CPU
> > topology, e.g., topo_info and cs->nr_cores/cs->nr_threads.
> > 
> > Since the names of cs->nr_cores/cs->nr_threads do not accurately
> > represent their meaning, the use of cs->nr_cores/cs->nr_threads is prone
> > to confusion and mistakes.
> > 
> > And the structure X86CPUTopoInfo names its members clearly, thus the
> > variable "topo_info" should be preferred.
> > 
> > Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > ---
> >   target/i386/cpu.c | 30 ++++++++++++++++++------------
> >   1 file changed, 18 insertions(+), 12 deletions(-)
> > 
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 7833505092d8..4cda84eb96f1 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -5215,11 +5215,15 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >       uint32_t limit;
> >       uint32_t signature[3];
> >       X86CPUTopoInfo topo_info;
> > +    uint32_t cpus_per_pkg;
> >       topo_info.dies_per_pkg = env->nr_dies;
> >       topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
> >       topo_info.threads_per_core = cs->nr_threads;
> > +    cpus_per_pkg = topo_info.dies_per_pkg * topo_info.cores_per_die *
> > +                   topo_info.threads_per_core;
> > +
> >       /* Calculate & apply limits for different index ranges */
> >       if (index >= 0xC0000000) {
> >           limit = env->cpuid_xlevel2;
> > @@ -5255,8 +5259,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >               *ecx |= CPUID_EXT_OSXSAVE;
> >           }
> >           *edx = env->features[FEAT_1_EDX];
> > -        if (cs->nr_cores * cs->nr_threads > 1) {
> > -            *ebx |= (cs->nr_cores * cs->nr_threads) << 16;
> > +        if (cpus_per_pkg > 1) {
> > +            *ebx |= cpus_per_pkg << 16;
> >               *edx |= CPUID_HT;
> >           }
> >           if (!cpu->enable_pmu) {
> > @@ -5293,10 +5297,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >                */
> >               if (*eax & 31) {
> >                   int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
> > -                int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
> > -                if (cs->nr_cores > 1) {
> > +                int vcpus_per_socket = cpus_per_pkg;
> Would it make sense to directly use cpus_per_pkg here?
> > +                int cores_per_socket = topo_info.cores_per_die *
> > +                                       topo_info.dies_per_pkg;
> There are other places in cpu_x86_cpuid where cs->nr_cores is used
> separately; why not make a global "cores_per_pkg" like cpus_per_pkg
> and also tweak the other places?

Yeah, good idea.

> > +                if (cores_per_socket > 1) {
> >                       *eax &= ~0xFC000000;
> > -                    *eax |= (pow2ceil(cs->nr_cores) - 1) << 26;
> > +                    *eax |= (pow2ceil(cores_per_socket) - 1) << 26;
> >                   }
> >                   if (host_vcpus_per_cache > vcpus_per_socket) {
> >                       *eax &= ~0x3FFC000;
> > @@ -5436,12 +5442,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >           switch (count) {
> >           case 0:
> >               *eax = apicid_core_offset(&topo_info);
> > -            *ebx = cs->nr_threads;
> > +            *ebx = topo_info.threads_per_core;
> There are many other places in cpu_x86_cpuid where cs->nr_threads
> is used separately, such as in encode_cache_cpuid4(***); should we
> replace them all?

In a previous patch [1], I replaced the use of cs->nr_threads/nr_cores in
the call of encode_cache_cpuid4().

The cleanest way is to pass topo_info to encode_cache_cpuid4(), but this
involves changing the interface and using the cache topology level, so I
included it in a follow-up patch [2].

[1]: [PATCH RESEND 04/18] i386/cpu: Fix number of addressable IDs in
     CPUID.04H,
     https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg03188.html
[2]: [PATCH RESEND 15/18] i386: Use CPUCacheInfo.share_level to encode
     CPUID[4].EAX[bits 25:14],
     https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg03199.html

> >               *ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
> >               break;
> >           case 1:
> >               *eax = apicid_pkg_offset(&topo_info);
> > -            *ebx = cs->nr_cores * cs->nr_threads;
> > +            *ebx = cpus_per_pkg;
> >               *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
> >               break;
> >           default:
> > @@ -5472,7 +5478,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >           switch (count) {
> >           case 0:
> >               *eax = apicid_core_offset(&topo_info);
> > -            *ebx = cs->nr_threads;
> > +            *ebx = topo_info.threads_per_core;
> >               *ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
> >               break;
> >           case 1:
> > @@ -5482,7 +5488,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >               break;
> >           case 2:
> >               *eax = apicid_pkg_offset(&topo_info);
> > -            *ebx = cs->nr_cores * cs->nr_threads;
> > +            *ebx = cpus_per_pkg;
> >               *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
> >               break;
> >           default:
> > @@ -5707,7 +5713,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >            * discards multiple thread information if it is set.
> >            * So don't set it here for Intel to make Linux guests happy.
> >            */
> > -        if (cs->nr_cores * cs->nr_threads > 1) {
> > +        if (cpus_per_pkg > 1) {
> >               if (env->cpuid_vendor1 != CPUID_VENDOR_INTEL_1 ||
> >                   env->cpuid_vendor2 != CPUID_VENDOR_INTEL_2 ||
> >                   env->cpuid_vendor3 != CPUID_VENDOR_INTEL_3) {
> > @@ -5769,7 +5775,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >                *eax |= (cpu_x86_virtual_addr_width(env) << 8);
> >           }
> >           *ebx = env->features[FEAT_8000_0008_EBX];
> > -        if (cs->nr_cores * cs->nr_threads > 1) {
> > +        if (cpus_per_pkg > 1) {
> >               /*
> >                * Bits 15:12 is "The number of bits in the initial
> >                * Core::X86::Apic::ApicId[ApicId] value that indicate
> > @@ -5777,7 +5783,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >                * Bits 7:0 is "The number of threads in the package is NC+1"
> >                */
> >               *ecx = (apicid_pkg_offset(&topo_info) << 12) |
> > -                   ((cs->nr_cores * cs->nr_threads) - 1);
> > +                   (cpus_per_pkg - 1);
> >           } else {
> >               *ecx = 0;
> >           }
> Thanks,
> Yanan


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RESEND 06/18] i386: Introduce module-level cpu topology to CPUX86State
  2023-02-13  9:36 ` [PATCH RESEND 06/18] i386: Introduce module-level cpu topology to CPUX86State Zhao Liu
@ 2023-02-15  7:41   ` wangyanan (Y) via
  0 siblings, 0 replies; 61+ messages in thread
From: wangyanan (Y) via @ 2023-02-15  7:41 UTC (permalink / raw)
  To: Zhao Liu
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster

On 2023/2/13 17:36, Zhao Liu wrote:
> From: Zhuocheng Ding <zhuocheng.ding@intel.com>
>
> The smp command has the "clusters" parameter, but x86 hasn't supported
> that level. Though "clusters" was introduced to help define the L2 cache
> topology [1], using cluster to define x86's L2 cache topology would cause
> a compatibility problem:
Well, the smp parameter "clusters" isn't meant to define the L2 cache
topology. It's actually a CPU topology level concept above cores, in
which the cores may share some resources (the resources can be the L2
cache or others such as L3 cache tags, depending on the arch).

On some ARM64 chips, cores in the same cluster share an L2 cache and
hold their own L1D/I separately. There are also chips where cores
in the same cluster have their own L2 and L1D/I caches separately,
and share an L3 cache tag.
> Currently, x86 defaults to the L2 cache being shared within one core,
> which actually implies a default setting of "cores per L2 cache is 1"
> and therefore implicitly defaults to having as many L2 caches as cores.
>
> For example (i386 PC machine):
> -smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16 (*)
>
> Considering the topology of the L2 cache, this (*) implicitly means "1
> core per L2 cache" and "2 L2 caches per die".
>
> If we use cluster to configure L2 cache topology with the new default
> setting "clusters per L2 cache is 1", the above semantics will change
> to "2 cores per cluster" and "1 cluster per L2 cache", that is, "2
> cores per L2 cache".
>
> So the same command (*) will cause changes in the L2 cache topology,
> further affecting the performance of the virtual machine.
>
> Therefore, x86 should only treat cluster as a CPU topology level and
> avoid using it to change the L2 cache topology by default, for
> compatibility.
Agree. I think all the smp parameters only indicate the CPU hierarchy,
while the cache layout is much more flexible.
>
> "cluster" in smp is the CPU topology level which is between "core" and
> "die".
>
> For x86, the "cluster" in smp corresponds to the module level [2],
> which is above the core level. So use "module" rather than "cluster"
> in the i386 code.
>
> And please note that x86 already has a CPU topology level also named
> "cluster" [2]; that level sits above the package level. So the cluster
> in the x86 CPU topology is completely different from the "clusters" smp
> parameter. After the module level is introduced, the cluster smp
> parameter will actually refer to the module level of x86.
I see. So the reason for using "module" instead of "cluster" is that
there is already a cluster concept above the package in the x86 reference.

Thanks,
Yanan
> [1]: 0d87178 (hw/core/machine: Introduce CPU cluster topology support)
> [2]: SDM, vol.3, ch.9, 9.9.1 Hierarchical Mapping of Shared Resources.
>
> Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
>   hw/i386/x86.c     | 1 +
>   target/i386/cpu.c | 1 +
>   target/i386/cpu.h | 6 ++++++
>   3 files changed, 8 insertions(+)
>
> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> index eaff4227bd68..ae1bb562d6e2 100644
> --- a/hw/i386/x86.c
> +++ b/hw/i386/x86.c
> @@ -306,6 +306,7 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
>       init_topo_info(&topo_info, x86ms);
>   
>       env->nr_dies = ms->smp.dies;
> +    env->nr_modules = ms->smp.clusters;
>   
>       /*
>        * If APIC ID is not set,
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 4cda84eb96f1..61ec9a7499b8 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -6781,6 +6781,7 @@ static void x86_cpu_initfn(Object *obj)
>       CPUX86State *env = &cpu->env;
>   
>       env->nr_dies = 1;
> +    env->nr_modules = 1;
>       cpu_set_cpustate_pointers(cpu);
>   
>       object_property_add(obj, "feature-words", "X86CPUFeatureWordInfo",
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index d4bc19577a21..f3afea765982 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1810,7 +1810,13 @@ typedef struct CPUArchState {
>   
>       TPRAccess tpr_access_type;
>   
> +    /* Number of dies per package. */
>       unsigned nr_dies;
> +    /*
> +     * Number of modules per die. Module level in x86 cpu topology is
> +     * corresponding to smp.clusters.
> +     */
> +    unsigned nr_modules;
>   } CPUX86State;
>   
>   struct kvm_msrs;



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RESEND 02/18] tests: Rename test-x86-cpuid.c to test-x86-apicid.c
  2023-02-15  3:35     ` Zhao Liu
@ 2023-02-15  7:44       ` wangyanan (Y) via
  0 siblings, 0 replies; 61+ messages in thread
From: wangyanan (Y) via @ 2023-02-15  7:44 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster, qemu-devel, Zhenyu Wang,
	Dapeng Mi, Zhuocheng Ding, Robert Hoo, Xiaoyao Li, Like Xu,
	Zhao Liu

On 2023/2/15 11:35, Zhao Liu wrote:
> On Wed, Feb 15, 2023 at 10:36:34AM +0800, wangyanan (Y) wrote:
>> Date: Wed, 15 Feb 2023 10:36:34 +0800
>> From: "wangyanan (Y)" <wangyanan55@huawei.com>
>> Subject: Re: [PATCH RESEND 02/18] tests: Rename test-x86-cpuid.c to
>>   test-x86-apicid.c
>>
>> On 2023/2/13 17:36, Zhao Liu wrote:
>>> From: Zhao Liu <zhao1.liu@intel.com>
>>>
>>> In fact, this unit tests APIC ID rather than CPUID.
>>> Rename to test-x86-apicid.c to make its name more in line with its
>>> actual content.
>>>
>>> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
>>> ---
>>>    MAINTAINERS                                        | 2 +-
>>>    tests/unit/meson.build                             | 4 ++--
>>>    tests/unit/{test-x86-cpuid.c => test-x86-apicid.c} | 2 +-
>>>    3 files changed, 4 insertions(+), 4 deletions(-)
>>>    rename tests/unit/{test-x86-cpuid.c => test-x86-apicid.c} (99%)
>>>
>>> diff --git a/MAINTAINERS b/MAINTAINERS
>>> index 96e25f62acaa..71c1bc24371b 100644
>>> --- a/MAINTAINERS
>>> +++ b/MAINTAINERS
>>> @@ -1679,7 +1679,7 @@ F: include/hw/southbridge/piix.h
>>>    F: hw/misc/sga.c
>>>    F: hw/isa/apm.c
>>>    F: include/hw/isa/apm.h
>>> -F: tests/unit/test-x86-cpuid.c
>>> +F: tests/unit/test-x86-apicid.c
>>>    F: tests/qtest/test-x86-cpuid-compat.c
>>>    PC Chipset
>>> diff --git a/tests/unit/meson.build b/tests/unit/meson.build
>>> index ffa444f4323c..a9df2843e92e 100644
>>> --- a/tests/unit/meson.build
>>> +++ b/tests/unit/meson.build
>>> @@ -20,8 +20,8 @@ tests = {
>>>      'test-opts-visitor': [testqapi],
>>>      'test-visitor-serialization': [testqapi],
>>>      'test-bitmap': [],
>>> -  # all code tested by test-x86-cpuid is inside topology.h
>>> -  'test-x86-cpuid': [],
>>> +  # all code tested by test-x86-apicid is inside topology.h
>>> +  'test-x86-apicid': [],
>>>      'test-cutils': [],
>>>      'test-div128': [],
>>>      'test-shift128': [],
>>> diff --git a/tests/unit/test-x86-cpuid.c b/tests/unit/test-x86-apicid.c
>>> similarity index 99%
>>> rename from tests/unit/test-x86-cpuid.c
>>> rename to tests/unit/test-x86-apicid.c
>>> index bfabc0403a1a..2b104f86d7c2 100644
>>> --- a/tests/unit/test-x86-cpuid.c
>>> +++ b/tests/unit/test-x86-apicid.c
>>> @@ -1,5 +1,5 @@
>>>    /*
>>> - *  Test code for x86 CPUID and Topology functions
>>> + *  Test code for x86 APIC ID and Topology functions
>>>     *
>> I'm not so sure. "CPUID" sounds like a general test for various kinds
>> of CPU IDs.
> CPUID usually refers to the basic x86 instruction for obtaining basic
> CPU information, so such naming is prone to ambiguity.
>
> The x86 CPU topology info is parsed based on the APIC ID, including
> the sub-IDs of each topology level (such as thread ID/core ID, etc.).
> These sub-IDs are all part of the APIC ID.
>
>> Besides APIC IDs computed from x86_apicid_from_cpu_idx(), there are also
>> topo IDs computed from x86_topo_ids_from_idx(), although that kind of ID
>> is not tested in test-x86-cpuid.c so far.
> What about "test-x86-topo.c" or "test-x86-topo-ids.c"?
The first one is general enough, I think.
>> Thanks,
>> Yanan
>>>     *  Copyright (c) 2012 Red Hat Inc.
>>>     *



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RESEND 04/18] i386/cpu: Fix number of addressable IDs in CPUID.04H
  2023-02-13  9:36 ` [PATCH RESEND 04/18] i386/cpu: Fix number of addressable IDs in CPUID.04H Zhao Liu
@ 2023-02-15 10:11   ` wangyanan (Y) via
  2023-02-15 14:33     ` Zhao Liu
  2023-02-20  6:59   ` Xiaoyao Li
  1 sibling, 1 reply; 61+ messages in thread
From: wangyanan (Y) via @ 2023-02-15 10:11 UTC (permalink / raw)
  To: Zhao Liu
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster

Hi Zhao,

在 2023/2/13 17:36, Zhao Liu 写道:
> From: Zhao Liu <zhao1.liu@intel.com>
>
> For i-cache and d-cache, the maximum IDs for CPUs sharing the cache
> (CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits 25:14]) are
> both 0, which means the i-cache and d-cache are shared at the SMT level.
> This is correct if there's a single thread per core, but is wrong for
> the hyper-threading case (one core contains multiple threads), since the
> i-cache and d-cache are shared at the core level rather than the SMT level.
>
> Therefore, in order to be compatible with both multi-threaded and
> single-threaded situations, we should set the i-cache and d-cache to be
> shared at the core level by default.
>
> Referring to the cache_info_passthrough fixes ([1], [2]) and the SDM,
> CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26] should use the
> nearest power-of-2 integer.
>
> The nearest power-of-2 integer can be calculated by pow2ceil() or by
> using the APIC ID offset (like the L3 topology using 1 << die_offset [3]).
>
> But in fact, CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26]
> are associated with the APIC ID. For example, in the Linux kernel, the
> field "num_threads_sharing" (bits 25 - 14) is parsed with the APIC ID.
> And for another example, on Alder Lake-P, CPUID.04H:EAX[bits 31:26] does
> not match the actual core count and is calculated by:
> "(1 << (pkg_offset - core_offset)) - 1".
>
> Therefore the APIC ID offset should be preferred for calculating the
> nearest power-of-2 integer for CPUID.04H:EAX[bits 25:14] and
> CPUID.04H:EAX[bits 31:26]:
> 1. d/i-cache is shared in a core, so 1 << core_offset should be used
>     instead of "1" in encode_cache_cpuid4() for CPUID.04H.00H:EAX[bits
>     25:14] and CPUID.04H.01H:EAX[bits 25:14].
> 2. L2 cache is supposed to be shared in a core for now, so
>     1 << core_offset should also be used instead of "cs->nr_threads" in
>     encode_cache_cpuid4() for CPUID.04H.02H:EAX[bits 25:14].
> 3. Similarly, the value for CPUID.04H:EAX[bits 31:26] should also be
>     replaced by the offset above the SMT level in the APIC ID.
>
> And since [1] and [2] are good enough to make cache_info_passthrough
> work well, its pow2ceil() uses are sufficient and there's no need to
> replace them with the APIC ID offset approach.
If you uniformly tweak these two places with the APIC ID offset too, then
you can also use the more spec-compliant helpers
(e.g. max_processor_ids_for_cache and max_core_ids_in_pkg) here in
future patch #18. Would it be best to unify the code?

Thanks,
Yanan
> [1]: efb3934 (x86: cpu: make sure number of addressable IDs for processor cores meets the spec)
> [2]: d7caf13 (x86: cpu: fixup number of addressable IDs for logical processors sharing cache)
> [3]: d65af28 (i386: Update new x86_apicid parsing rules with die_offset support)
>
> Fixes: 7e3482f (i386: Helpers to encode cache information consistently)
> Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
>   target/i386/cpu.c | 20 +++++++++++++++-----
>   1 file changed, 15 insertions(+), 5 deletions(-)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 29afec12c281..7833505092d8 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -5212,7 +5212,6 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>   {
>       X86CPU *cpu = env_archcpu(env);
>       CPUState *cs = env_cpu(env);
> -    uint32_t die_offset;
>       uint32_t limit;
>       uint32_t signature[3];
>       X86CPUTopoInfo topo_info;
> @@ -5308,27 +5307,38 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>               *eax = *ebx = *ecx = *edx = 0;
>           } else {
>               *eax = 0;
> +            int addressable_cores_offset = apicid_pkg_offset(&topo_info) -
> +                                           apicid_core_offset(&topo_info);
> +            int core_offset, die_offset;
> +
>               switch (count) {
>               case 0: /* L1 dcache info */
> +                core_offset = apicid_core_offset(&topo_info);
>                   encode_cache_cpuid4(env->cache_info_cpuid4.l1d_cache,
> -                                    1, cs->nr_cores,
> +                                    (1 << core_offset),
> +                                    (1 << addressable_cores_offset),
>                                       eax, ebx, ecx, edx);
>                   break;
>               case 1: /* L1 icache info */
> +                core_offset = apicid_core_offset(&topo_info);
>                   encode_cache_cpuid4(env->cache_info_cpuid4.l1i_cache,
> -                                    1, cs->nr_cores,
> +                                    (1 << core_offset),
> +                                    (1 << addressable_cores_offset),
>                                       eax, ebx, ecx, edx);
>                   break;
>               case 2: /* L2 cache info */
> +                core_offset = apicid_core_offset(&topo_info);
>                   encode_cache_cpuid4(env->cache_info_cpuid4.l2_cache,
> -                                    cs->nr_threads, cs->nr_cores,
> +                                    (1 << core_offset),
> +                                    (1 << addressable_cores_offset),
>                                       eax, ebx, ecx, edx);
>                   break;
>               case 3: /* L3 cache info */
>                   die_offset = apicid_die_offset(&topo_info);
>                   if (cpu->enable_l3_cache) {
>                       encode_cache_cpuid4(env->cache_info_cpuid4.l3_cache,
> -                                        (1 << die_offset), cs->nr_cores,
> +                                        (1 << die_offset),
> +                                        (1 << addressable_cores_offset),
>                                           eax, ebx, ecx, edx);
>                       break;
>                   }



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RESEND 07/18] i386: Support modules_per_die in X86CPUTopoInfo
  2023-02-13  9:36 ` [PATCH RESEND 07/18] i386: Support modules_per_die in X86CPUTopoInfo Zhao Liu
@ 2023-02-15 10:38   ` wangyanan (Y) via
  2023-02-15 14:35     ` Zhao Liu
  2023-02-16  2:34   ` wangyanan (Y) via
  1 sibling, 1 reply; 61+ messages in thread
From: wangyanan (Y) via @ 2023-02-15 10:38 UTC (permalink / raw)
  To: Zhao Liu
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster

On 2023/2/13 17:36, Zhao Liu wrote:
> From: Zhuocheng Ding <zhuocheng.ding@intel.com>
>
> Support module level in i386 cpu topology structure "X86CPUTopoInfo".
>
> Before the APIC ID parsing rules are updated with the module level,
> apicid_core_width() temporarily combines the core and module levels
> together.
>
> At present, we don't expose the module level in CPUID.1FH because
> current Linux (v6.2-rc6) doesn't support the module level. And exposing
> the module and die levels at the same time in CPUID.1FH will cause Linux
> to calculate the wrong die_id. The module level should not be exposed
> until a real machine has the module level in CPUID.1FH.
>
> In addition, update topology structure in test-x86-apicid.c.
>
> Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
>   hw/i386/x86.c                |  3 ++-
>   include/hw/i386/topology.h   | 13 ++++++++---
>   target/i386/cpu.c            | 17 ++++++++------
>   tests/unit/test-x86-apicid.c | 45 +++++++++++++++++++-----------------
>   4 files changed, 46 insertions(+), 32 deletions(-)
>
> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> index ae1bb562d6e2..1c069ff56ae7 100644
> --- a/hw/i386/x86.c
> +++ b/hw/i386/x86.c
> @@ -71,7 +71,8 @@ inline void init_topo_info(X86CPUTopoInfo *topo_info,
>       MachineState *ms = MACHINE(x86ms);
>   
>       topo_info->dies_per_pkg = ms->smp.dies;
> -    topo_info->cores_per_die = ms->smp.cores;
> +    topo_info->modules_per_die = ms->smp.clusters;
> +    topo_info->cores_per_module = ms->smp.cores;
>       topo_info->threads_per_core = ms->smp.threads;
>   }
>   
> diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> index 81573f6cfde0..bbb00dc4aad8 100644
> --- a/include/hw/i386/topology.h
> +++ b/include/hw/i386/topology.h
> @@ -54,7 +54,8 @@ typedef struct X86CPUTopoIDs {
>   
>   typedef struct X86CPUTopoInfo {
>       unsigned dies_per_pkg;
> -    unsigned cores_per_die;
> +    unsigned modules_per_die;
> +    unsigned cores_per_module;
>       unsigned threads_per_core;
>   } X86CPUTopoInfo;
>   
> @@ -78,7 +79,12 @@ static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
>    */
>   static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
>   {
> -    return apicid_bitwidth_for_count(topo_info->cores_per_die);
> +    /*
> +     * TODO: Will separate module info from core_width when update
> +     * APIC ID with module level.
> +     */
> +    return apicid_bitwidth_for_count(topo_info->cores_per_module *
> +                                     topo_info->modules_per_die);
>   }
>   
>   /* Bit width of the Die_ID field */
> @@ -128,7 +134,8 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
>                                            X86CPUTopoIDs *topo_ids)
>   {
>       unsigned nr_dies = topo_info->dies_per_pkg;
> -    unsigned nr_cores = topo_info->cores_per_die;
> +    unsigned nr_cores = topo_info->cores_per_module *
> +                        topo_info->modules_per_die;
>       unsigned nr_threads = topo_info->threads_per_core;
>   
>       topo_ids->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 61ec9a7499b8..6f3d114c7d12 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -336,7 +336,9 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
>   
>       /* L3 is shared among multiple cores */
>       if (cache->level == 3) {
> -        l3_threads = topo_info->cores_per_die * topo_info->threads_per_core;
> +        l3_threads = topo_info->modules_per_die *
> +                     topo_info->cores_per_module *
> +                     topo_info->threads_per_core;
>           *eax |= (l3_threads - 1) << 14;
>       } else {
>           *eax |= ((topo_info->threads_per_core - 1) << 14);
> @@ -5218,11 +5220,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>       uint32_t cpus_per_pkg;
>   
>       topo_info.dies_per_pkg = env->nr_dies;
> -    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
> +    topo_info.modules_per_die = env->nr_modules;
> +    topo_info.cores_per_module = cs->nr_cores / env->nr_dies / env->nr_modules;
>       topo_info.threads_per_core = cs->nr_threads;
>   
> -    cpus_per_pkg = topo_info.dies_per_pkg * topo_info.cores_per_die *
> -                   topo_info.threads_per_core;
> +    cpus_per_pkg = topo_info.dies_per_pkg * topo_info.modules_per_die *
> +                   topo_info.cores_per_module * topo_info.threads_per_core;
>   
>       /* Calculate & apply limits for different index ranges */
>       if (index >= 0xC0000000) {
> @@ -5298,8 +5301,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>               if (*eax & 31) {
>                   int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
>                   int vcpus_per_socket = cpus_per_pkg;
> -                int cores_per_socket = topo_info.cores_per_die *
> -                                       topo_info.dies_per_pkg;
> +                int cores_per_socket = cpus_per_pkg /
> +                                       topo_info.threads_per_core;
As mentioned in patch 5, cores_per_socket can be function-global.
>                   if (cores_per_socket > 1) {
>                       *eax &= ~0xFC000000;
>                       *eax |= (pow2ceil(cores_per_socket) - 1) << 26;
> @@ -5483,7 +5486,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>               break;
>           case 1:
>               *eax = apicid_die_offset(&topo_info);
> -            *ebx = topo_info.cores_per_die * topo_info.threads_per_core;
> +            *ebx = cpus_per_pkg / topo_info.dies_per_pkg;
>               *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
>               break;
>           case 2:
> diff --git a/tests/unit/test-x86-apicid.c b/tests/unit/test-x86-apicid.c
> index 2b104f86d7c2..f21b8a5d95c2 100644
> --- a/tests/unit/test-x86-apicid.c
> +++ b/tests/unit/test-x86-apicid.c
> @@ -30,13 +30,16 @@ static void test_topo_bits(void)
>   {
>       X86CPUTopoInfo topo_info = {0};
>   
> -    /* simple tests for 1 thread per core, 1 core per die, 1 die per package */
> -    topo_info = (X86CPUTopoInfo) {1, 1, 1};
> +    /*
> +     * simple tests for 1 thread per core, 1 core per module,
> +     *                  1 module per die, 1 die per package
> +     */
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 1};
>       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 0);
>       g_assert_cmpuint(apicid_core_width(&topo_info), ==, 0);
>       g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
>   
> -    topo_info = (X86CPUTopoInfo) {1, 1, 1};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 1};
>       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
>       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
>       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
> @@ -45,39 +48,39 @@ static void test_topo_bits(void)
>   
>       /* Test field width calculation for multiple values
>        */
> -    topo_info = (X86CPUTopoInfo) {1, 1, 2};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 2};
>       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 1);
> -    topo_info = (X86CPUTopoInfo) {1, 1, 3};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 3};
>       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
> -    topo_info = (X86CPUTopoInfo) {1, 1, 4};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 4};
>       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
>   
> -    topo_info = (X86CPUTopoInfo) {1, 1, 14};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 14};
>       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
> -    topo_info = (X86CPUTopoInfo) {1, 1, 15};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 15};
>       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
> -    topo_info = (X86CPUTopoInfo) {1, 1, 16};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 16};
>       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
> -    topo_info = (X86CPUTopoInfo) {1, 1, 17};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 17};
>       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 5);
>   
>   
> -    topo_info = (X86CPUTopoInfo) {1, 30, 2};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 30, 2};
>       g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
> -    topo_info = (X86CPUTopoInfo) {1, 31, 2};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 31, 2};
>       g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
> -    topo_info = (X86CPUTopoInfo) {1, 32, 2};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 32, 2};
>       g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
> -    topo_info = (X86CPUTopoInfo) {1, 33, 2};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 33, 2};
>       g_assert_cmpuint(apicid_core_width(&topo_info), ==, 6);
>   
> -    topo_info = (X86CPUTopoInfo) {1, 30, 2};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 30, 2};
>       g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
> -    topo_info = (X86CPUTopoInfo) {2, 30, 2};
> +    topo_info = (X86CPUTopoInfo) {2, 1, 30, 2};
>       g_assert_cmpuint(apicid_die_width(&topo_info), ==, 1);
> -    topo_info = (X86CPUTopoInfo) {3, 30, 2};
> +    topo_info = (X86CPUTopoInfo) {3, 1, 30, 2};
>       g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
> -    topo_info = (X86CPUTopoInfo) {4, 30, 2};
> +    topo_info = (X86CPUTopoInfo) {4, 1, 30, 2};
>       g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
>   
>       /* build a weird topology and see if IDs are calculated correctly
> @@ -85,18 +88,18 @@ static void test_topo_bits(void)
>   
>       /* This will use 2 bits for thread ID and 3 bits for core ID
>        */
> -    topo_info = (X86CPUTopoInfo) {1, 6, 3};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
>       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
>       g_assert_cmpuint(apicid_core_offset(&topo_info), ==, 2);
>       g_assert_cmpuint(apicid_die_offset(&topo_info), ==, 5);
>       g_assert_cmpuint(apicid_pkg_offset(&topo_info), ==, 5);
>   
> -    topo_info = (X86CPUTopoInfo) {1, 6, 3};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
>       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
>       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
>       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
>   
> -    topo_info = (X86CPUTopoInfo) {1, 6, 3};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
>       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 0), ==,
>                        (1 << 2) | 0);
>       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 1), ==,



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RESEND 09/18] i386: Fix comment style in topology.h
  2023-02-13  9:36 ` [PATCH RESEND 09/18] i386: Fix comment style in topology.h Zhao Liu
  2023-02-13 13:40   ` Philippe Mathieu-Daudé
  2023-02-14  2:37   ` wangyanan (Y) via
@ 2023-02-15 10:54   ` wangyanan (Y) via
  2023-02-15 14:35     ` Zhao Liu
  2 siblings, 1 reply; 61+ messages in thread
From: wangyanan (Y) via @ 2023-02-15 10:54 UTC (permalink / raw)
  To: Zhao Liu
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster

On 2023/2/13 17:36, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
>
> For function comments in this file, keep the comment style consistent
> with other places.
>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
nit: Better to move this cleanup patch to the top of the series.
> ---
>   include/hw/i386/topology.h | 33 +++++++++++++++++----------------
>   1 file changed, 17 insertions(+), 16 deletions(-)
>
> diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> index b0174c18b7bd..5de905dc00d3 100644
> --- a/include/hw/i386/topology.h
> +++ b/include/hw/i386/topology.h
> @@ -24,7 +24,8 @@
>   #ifndef HW_I386_TOPOLOGY_H
>   #define HW_I386_TOPOLOGY_H
>   
> -/* This file implements the APIC-ID-based CPU topology enumeration logic,
> +/*
> + * This file implements the APIC-ID-based CPU topology enumeration logic,
>    * documented at the following document:
>    *   Intel® 64 Architecture Processor Topology Enumeration
>    *   http://software.intel.com/en-us/articles/intel-64-architecture-processor-topology-enumeration/
> @@ -41,7 +42,8 @@
>   
>   #include "qemu/bitops.h"
>   
> -/* APIC IDs can be 32-bit, but beware: APIC IDs > 255 require x2APIC support
> +/*
> + * APIC IDs can be 32-bit, but beware: APIC IDs > 255 require x2APIC support
>    */
>   typedef uint32_t apic_id_t;
>   
> @@ -60,8 +62,7 @@ typedef struct X86CPUTopoInfo {
>       unsigned threads_per_core;
>   } X86CPUTopoInfo;
>   
> -/* Return the bit width needed for 'count' IDs
> - */
> +/* Return the bit width needed for 'count' IDs */
>   static unsigned apicid_bitwidth_for_count(unsigned count)
>   {
>       g_assert(count >= 1);
> @@ -69,15 +70,13 @@ static unsigned apicid_bitwidth_for_count(unsigned count)
>       return count ? 32 - clz32(count) : 0;
>   }
>   
> -/* Bit width of the SMT_ID (thread ID) field on the APIC ID
> - */
> +/* Bit width of the SMT_ID (thread ID) field on the APIC ID */
>   static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
>   {
>       return apicid_bitwidth_for_count(topo_info->threads_per_core);
>   }
>   
> -/* Bit width of the Core_ID field
> - */
> +/* Bit width of the Core_ID field */
>   static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
>   {
>       /*
> @@ -94,8 +93,7 @@ static inline unsigned apicid_die_width(X86CPUTopoInfo *topo_info)
>       return apicid_bitwidth_for_count(topo_info->dies_per_pkg);
>   }
>   
> -/* Bit offset of the Core_ID field
> - */
> +/* Bit offset of the Core_ID field */
>   static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
>   {
>       return apicid_smt_width(topo_info);
> @@ -107,14 +105,14 @@ static inline unsigned apicid_die_offset(X86CPUTopoInfo *topo_info)
>       return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
>   }
>   
> -/* Bit offset of the Pkg_ID (socket ID) field
> - */
> +/* Bit offset of the Pkg_ID (socket ID) field */
>   static inline unsigned apicid_pkg_offset(X86CPUTopoInfo *topo_info)
>   {
>       return apicid_die_offset(topo_info) + apicid_die_width(topo_info);
>   }
>   
> -/* Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
> +/*
> + * Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
>    *
>    * The caller must make sure core_id < nr_cores and smt_id < nr_threads.
>    */
> @@ -127,7 +125,8 @@ static inline apic_id_t x86_apicid_from_topo_ids(X86CPUTopoInfo *topo_info,
>              topo_ids->smt_id;
>   }
>   
> -/* Calculate thread/core/package IDs for a specific topology,
> +/*
> + * Calculate thread/core/package IDs for a specific topology,
>    * based on (contiguous) CPU index
>    */
>   static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
> @@ -154,7 +153,8 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
>       topo_ids->smt_id = cpu_index % nr_threads;
>   }
>   
> -/* Calculate thread/core/package IDs for a specific topology,
> +/*
> + * Calculate thread/core/package IDs for a specific topology,
>    * based on APIC ID
>    */
>   static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
> @@ -178,7 +178,8 @@ static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
>       topo_ids->pkg_id = apicid >> apicid_pkg_offset(topo_info);
>   }
>   
> -/* Make APIC ID for the CPU 'cpu_index'
> +/*
> + * Make APIC ID for the CPU 'cpu_index'
>    *
>    * 'cpu_index' is a sequential, contiguous ID for the CPU.
>    */




* Re: [PATCH RESEND 10/18] i386: Update APIC ID parsing rule to support module level
  2023-02-13  9:36 ` [PATCH RESEND 10/18] i386: Update APIC ID parsing rule to support module level Zhao Liu
@ 2023-02-15 11:06   ` wangyanan (Y) via
  2023-02-15 15:03     ` Zhao Liu
  0 siblings, 1 reply; 61+ messages in thread
From: wangyanan (Y) via @ 2023-02-15 11:06 UTC (permalink / raw)
  To: Zhao Liu
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster

Hi Zhao,

On 2023/2/13 17:36, Zhao Liu wrote:
> From: Zhuocheng Ding <zhuocheng.ding@intel.com>
>
> Add module-level parsing support for the APIC ID.
>
> With this support, the conversion between X86CPUTopoIDs,
> X86CPUTopoInfo and the APIC ID is now complete.
IIUC, the contents of patches 6-8 and 10 are all about "Introduce the
module-level CPU topology support for x86", so why do we need to do this
gradually with various temporary things instead of wrapping them into one
patch? Before smp.clusters is supported in the CLI for x86, we can ensure
that modules_per_die is always 1 so that the code is safe in one diff. Or
am I missing something?

Thanks,
Yanan
> Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
>   hw/i386/x86.c              | 19 ++++++++-----------
>   include/hw/i386/topology.h | 36 ++++++++++++++++++------------------
>   2 files changed, 26 insertions(+), 29 deletions(-)
>
> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> index b90c6584930a..2a9d080a8e7a 100644
> --- a/hw/i386/x86.c
> +++ b/hw/i386/x86.c
> @@ -311,11 +311,11 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
>   
>       /*
>        * If APIC ID is not set,
> -     * set it based on socket/die/core/thread properties.
> +     * set it based on socket/die/cluster/core/thread properties.
>        */
>       if (cpu->apic_id == UNASSIGNED_APIC_ID) {
> -        int max_socket = (ms->smp.max_cpus - 1) /
> -                                smp_threads / smp_cores / ms->smp.dies;
> +        int max_socket = (ms->smp.max_cpus - 1) / smp_threads / smp_cores /
> +                                ms->smp.clusters / ms->smp.dies;
>   
>           /*
>            * die-id was optional in QEMU 4.0 and older, so keep it optional
> @@ -379,15 +379,12 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
>   
>           x86_topo_ids_from_apicid(cpu->apic_id, &topo_info, &topo_ids);
>   
> -        /*
> -         * TODO: Before APIC ID supports module level parsing, there's no need
> -         * to expose module_id info.
> -         */
>           error_setg(errp,
> -            "Invalid CPU [socket: %u, die: %u, core: %u, thread: %u] with"
> -            " APIC ID %" PRIu32 ", valid index range 0:%d",
> -            topo_ids.pkg_id, topo_ids.die_id, topo_ids.core_id, topo_ids.smt_id,
> -            cpu->apic_id, ms->possible_cpus->len - 1);
> +            "Invalid CPU [socket: %u, die: %u, module: %u, core: %u, thread: %u]"
> +            " with APIC ID %" PRIu32 ", valid index range 0:%d",
> +            topo_ids.pkg_id, topo_ids.die_id, topo_ids.module_id,
> +            topo_ids.core_id, topo_ids.smt_id, cpu->apic_id,
> +            ms->possible_cpus->len - 1);
>           return;
>       }
>   
> diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> index 5de905dc00d3..3cec97b377f2 100644
> --- a/include/hw/i386/topology.h
> +++ b/include/hw/i386/topology.h
> @@ -79,12 +79,13 @@ static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
>   /* Bit width of the Core_ID field */
>   static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
>   {
> -    /*
> -     * TODO: Will separate module info from core_width when update
> -     * APIC ID with module level.
> -     */
> -    return apicid_bitwidth_for_count(topo_info->cores_per_module *
> -                                     topo_info->modules_per_die);
> +    return apicid_bitwidth_for_count(topo_info->cores_per_module);
> +}
> +
> +/* Bit width of the Module_ID (cluster ID) field */
> +static inline unsigned apicid_module_width(X86CPUTopoInfo *topo_info)
> +{
> +    return apicid_bitwidth_for_count(topo_info->modules_per_die);
>   }
>   
>   /* Bit width of the Die_ID field */
> @@ -99,10 +100,16 @@ static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
>       return apicid_smt_width(topo_info);
>   }
>   
> +/* Bit offset of the Module_ID (cluster ID) field */
> +static inline unsigned apicid_module_offset(X86CPUTopoInfo *topo_info)
> +{
> +    return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
> +}
> +
>   /* Bit offset of the Die_ID field */
>   static inline unsigned apicid_die_offset(X86CPUTopoInfo *topo_info)
>   {
> -    return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
> +    return apicid_module_offset(topo_info) + apicid_module_width(topo_info);
>   }
>   
>   /* Bit offset of the Pkg_ID (socket ID) field */
> @@ -121,6 +128,7 @@ static inline apic_id_t x86_apicid_from_topo_ids(X86CPUTopoInfo *topo_info,
>   {
>       return (topo_ids->pkg_id  << apicid_pkg_offset(topo_info)) |
>              (topo_ids->die_id  << apicid_die_offset(topo_info)) |
> +           (topo_ids->module_id << apicid_module_offset(topo_info)) |
>              (topo_ids->core_id << apicid_core_offset(topo_info)) |
>              topo_ids->smt_id;
>   }
> @@ -138,11 +146,6 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
>       unsigned nr_cores = topo_info->cores_per_module;
>       unsigned nr_threads = topo_info->threads_per_core;
>   
> -    /*
> -     * Currently smp for i386 doesn't support "clusters", modules_per_die is
> -     * only 1. Therefore, the module_id generated from the module topology will
> -     * not conflict with the module_id generated according to the apicid.
> -     */
>       topo_ids->pkg_id = cpu_index / (nr_dies * nr_modules *
>                          nr_cores * nr_threads);
>       topo_ids->die_id = cpu_index / (nr_modules * nr_cores *
> @@ -166,12 +169,9 @@ static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
>       topo_ids->core_id =
>               (apicid >> apicid_core_offset(topo_info)) &
>               ~(0xFFFFFFFFUL << apicid_core_width(topo_info));
> -    /*
> -     * TODO: This is the temporary initialization for topo_ids.module_id to
> -     * avoid "maybe-uninitialized" compilation errors. Will remove when APIC
> -     * ID supports module level parsing.
> -     */
> -    topo_ids->module_id = 0;
> +    topo_ids->module_id =
> +            (apicid >> apicid_module_offset(topo_info)) &
> +            ~(0xFFFFFFFFUL << apicid_module_width(topo_info));
>       topo_ids->die_id =
>               (apicid >> apicid_die_offset(topo_info)) &
>               ~(0xFFFFFFFFUL << apicid_die_width(topo_info));




* Re: [PATCH RESEND 12/18] tests: Add test case of APIC ID for module level parsing
  2023-02-13  9:36 ` [PATCH RESEND 12/18] tests: Add test case of APIC ID for module level parsing Zhao Liu
@ 2023-02-15 11:22   ` wangyanan (Y) via
  0 siblings, 0 replies; 61+ messages in thread
From: wangyanan (Y) via @ 2023-02-15 11:22 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu

On 2023/2/13 17:36, Zhao Liu wrote:
> From: Zhuocheng Ding <zhuocheng.ding@intel.com>
>
> Now that i386 supports the module level, add a test case for
> module-level parsing.
>
> Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
>   tests/unit/test-x86-apicid.c | 19 +++++++++++++++----
>   1 file changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/tests/unit/test-x86-apicid.c b/tests/unit/test-x86-apicid.c
> index f21b8a5d95c2..55b731ccae55 100644
> --- a/tests/unit/test-x86-apicid.c
> +++ b/tests/unit/test-x86-apicid.c
> @@ -37,6 +37,7 @@ static void test_topo_bits(void)
>       topo_info = (X86CPUTopoInfo) {1, 1, 1, 1};
>       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 0);
>       g_assert_cmpuint(apicid_core_width(&topo_info), ==, 0);
> +    g_assert_cmpuint(apicid_module_width(&topo_info), ==, 0);
>       g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
>   
>       topo_info = (X86CPUTopoInfo) {1, 1, 1, 1};
> @@ -74,13 +75,22 @@ static void test_topo_bits(void)
>       topo_info = (X86CPUTopoInfo) {1, 1, 33, 2};
>       g_assert_cmpuint(apicid_core_width(&topo_info), ==, 6);
>   
> -    topo_info = (X86CPUTopoInfo) {1, 1, 30, 2};
> +    topo_info = (X86CPUTopoInfo) {1, 6, 30, 2};
> +    g_assert_cmpuint(apicid_module_width(&topo_info), ==, 3);
> +    topo_info = (X86CPUTopoInfo) {1, 7, 30, 2};
> +    g_assert_cmpuint(apicid_module_width(&topo_info), ==, 3);
> +    topo_info = (X86CPUTopoInfo) {1, 8, 30, 2};
> +    g_assert_cmpuint(apicid_module_width(&topo_info), ==, 3);
> +    topo_info = (X86CPUTopoInfo) {1, 9, 30, 2};
> +    g_assert_cmpuint(apicid_module_width(&topo_info), ==, 4);
> +
> +    topo_info = (X86CPUTopoInfo) {1, 6, 30, 2};
>       g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
> -    topo_info = (X86CPUTopoInfo) {2, 1, 30, 2};
> +    topo_info = (X86CPUTopoInfo) {2, 6, 30, 2};
>       g_assert_cmpuint(apicid_die_width(&topo_info), ==, 1);
> -    topo_info = (X86CPUTopoInfo) {3, 1, 30, 2};
> +    topo_info = (X86CPUTopoInfo) {3, 6, 30, 2};
>       g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
> -    topo_info = (X86CPUTopoInfo) {4, 1, 30, 2};
> +    topo_info = (X86CPUTopoInfo) {4, 6, 30, 2};
>       g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
>   
>       /* build a weird topology and see if IDs are calculated correctly
> @@ -91,6 +101,7 @@ static void test_topo_bits(void)
>       topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
>       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
>       g_assert_cmpuint(apicid_core_offset(&topo_info), ==, 2);
> +    g_assert_cmpuint(apicid_module_offset(&topo_info), ==, 5);
>       g_assert_cmpuint(apicid_die_offset(&topo_info), ==, 5);
>       g_assert_cmpuint(apicid_pkg_offset(&topo_info), ==, 5);
>   
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>

Thanks,
Yanan



* Re: [PATCH RESEND 14/18] i386: Add cache topology info in CPUCacheInfo
  2023-02-13  9:36 ` [PATCH RESEND 14/18] i386: Add cache topology info in CPUCacheInfo Zhao Liu
@ 2023-02-15 12:17   ` wangyanan (Y) via
  2023-02-15 15:07     ` Zhao Liu
  0 siblings, 1 reply; 61+ messages in thread
From: wangyanan (Y) via @ 2023-02-15 12:17 UTC (permalink / raw)
  To: Zhao Liu
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster

Hi Zhao,

On 2023/2/13 17:36, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
>
> Currently, by default, the cache topology is encoded as:
> 1. i/d cache is shared in one core.
> 2. L2 cache is shared in one core.
> 3. L3 cache is shared in one die.
>
> This general default has caused a misunderstanding, namely that the
> cache topology is completely equated with a specific CPU topology, such
> as tying the L2 cache to the core level and the L3 cache to the die
> level.
>
> In fact, the settings of these topologies depend on the specific
> platform and are not static. For example, on Alder Lake-P, every
> four Atom cores share the same L2 cache.
>
> Thus, we should explicitly define the corresponding cache topology for
> different cache models to increase scalability.
>
> Except for legacy_l2_cache_cpuid2 (its default topo level is INVALID),
It seems like its default topo level is UNKNOWN in this case.
> explicitly set the corresponding topology level for all other cache
> models. In order to be compatible with the existing cache topology, set
> the CORE level for the i/d cache, set the CORE level for L2 cache, and
> set the DIE level for L3 cache.
>
> The field for CPUID[4].EAX[bits 25:14] or CPUID[0x8000001D].EAX[bits
> 25:14] will be set based on CPUCacheInfo.share_level.
>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
>   target/i386/cpu.c | 19 +++++++++++++++++++
>   target/i386/cpu.h | 16 ++++++++++++++++
>   2 files changed, 35 insertions(+)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 27bbbc36b11c..364534e84b1b 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -433,6 +433,7 @@ static CPUCacheInfo legacy_l1d_cache = {
>       .sets = 64,
>       .partitions = 1,
>       .no_invd_sharing = true,
> +    .share_level = CORE,
>   };
>   
>   /*FIXME: CPUID leaf 0x80000005 is inconsistent with leaves 2 & 4 */
> @@ -447,6 +448,7 @@ static CPUCacheInfo legacy_l1d_cache_amd = {
>       .partitions = 1,
>       .lines_per_tag = 1,
>       .no_invd_sharing = true,
> +    .share_level = CORE,
>   };
>   
>   /* L1 instruction cache: */
> @@ -460,6 +462,7 @@ static CPUCacheInfo legacy_l1i_cache = {
>       .sets = 64,
>       .partitions = 1,
>       .no_invd_sharing = true,
> +    .share_level = CORE,
>   };
>   
>   /*FIXME: CPUID leaf 0x80000005 is inconsistent with leaves 2 & 4 */
> @@ -474,6 +477,7 @@ static CPUCacheInfo legacy_l1i_cache_amd = {
>       .partitions = 1,
>       .lines_per_tag = 1,
>       .no_invd_sharing = true,
> +    .share_level = CORE,
>   };
>   
>   /* Level 2 unified cache: */
> @@ -487,6 +491,7 @@ static CPUCacheInfo legacy_l2_cache = {
>       .sets = 4096,
>       .partitions = 1,
>       .no_invd_sharing = true,
> +    .share_level = CORE,
>   };
>   
>   /*FIXME: CPUID leaf 2 descriptor is inconsistent with CPUID leaf 4 */
> @@ -509,6 +514,7 @@ static CPUCacheInfo legacy_l2_cache_amd = {
>       .associativity = 16,
>       .sets = 512,
>       .partitions = 1,
> +    .share_level = CORE,
>   };
>   
>   /* Level 3 unified cache: */
> @@ -524,6 +530,7 @@ static CPUCacheInfo legacy_l3_cache = {
>       .self_init = true,
>       .inclusive = true,
>       .complex_indexing = true,
> +    .share_level = DIE,
>   };
>   
>   /* TLB definitions: */
> @@ -1668,6 +1675,7 @@ static const CPUCaches epyc_cache_info = {
>           .lines_per_tag = 1,
>           .self_init = 1,
>           .no_invd_sharing = true,
> +        .share_level = CORE,
>       },
>       .l1i_cache = &(CPUCacheInfo) {
>           .type = INSTRUCTION_CACHE,
> @@ -1680,6 +1688,7 @@ static const CPUCaches epyc_cache_info = {
>           .lines_per_tag = 1,
>           .self_init = 1,
>           .no_invd_sharing = true,
> +        .share_level = CORE,
>       },
>       .l2_cache = &(CPUCacheInfo) {
>           .type = UNIFIED_CACHE,
> @@ -1690,6 +1699,7 @@ static const CPUCaches epyc_cache_info = {
>           .partitions = 1,
>           .sets = 1024,
>           .lines_per_tag = 1,
> +        .share_level = CORE,
>       },
>       .l3_cache = &(CPUCacheInfo) {
>           .type = UNIFIED_CACHE,
> @@ -1703,6 +1713,7 @@ static const CPUCaches epyc_cache_info = {
>           .self_init = true,
>           .inclusive = true,
>           .complex_indexing = true,
> +        .share_level = DIE,
>       },
>   };
>   
> @@ -1718,6 +1729,7 @@ static const CPUCaches epyc_rome_cache_info = {
>           .lines_per_tag = 1,
>           .self_init = 1,
>           .no_invd_sharing = true,
> +        .share_level = CORE,
>       },
>       .l1i_cache = &(CPUCacheInfo) {
>           .type = INSTRUCTION_CACHE,
> @@ -1730,6 +1742,7 @@ static const CPUCaches epyc_rome_cache_info = {
>           .lines_per_tag = 1,
>           .self_init = 1,
>           .no_invd_sharing = true,
> +        .share_level = CORE,
>       },
>       .l2_cache = &(CPUCacheInfo) {
>           .type = UNIFIED_CACHE,
> @@ -1740,6 +1753,7 @@ static const CPUCaches epyc_rome_cache_info = {
>           .partitions = 1,
>           .sets = 1024,
>           .lines_per_tag = 1,
> +        .share_level = CORE,
>       },
>       .l3_cache = &(CPUCacheInfo) {
>           .type = UNIFIED_CACHE,
> @@ -1753,6 +1767,7 @@ static const CPUCaches epyc_rome_cache_info = {
>           .self_init = true,
>           .inclusive = true,
>           .complex_indexing = true,
> +        .share_level = DIE,
>       },
>   };
>   
> @@ -1768,6 +1783,7 @@ static const CPUCaches epyc_milan_cache_info = {
>           .lines_per_tag = 1,
>           .self_init = 1,
>           .no_invd_sharing = true,
> +        .share_level = CORE,
>       },
>       .l1i_cache = &(CPUCacheInfo) {
>           .type = INSTRUCTION_CACHE,
> @@ -1780,6 +1796,7 @@ static const CPUCaches epyc_milan_cache_info = {
>           .lines_per_tag = 1,
>           .self_init = 1,
>           .no_invd_sharing = true,
> +        .share_level = CORE,
>       },
>       .l2_cache = &(CPUCacheInfo) {
>           .type = UNIFIED_CACHE,
> @@ -1790,6 +1807,7 @@ static const CPUCaches epyc_milan_cache_info = {
>           .partitions = 1,
>           .sets = 1024,
>           .lines_per_tag = 1,
> +        .share_level = CORE,
>       },
>       .l3_cache = &(CPUCacheInfo) {
>           .type = UNIFIED_CACHE,
> @@ -1803,6 +1821,7 @@ static const CPUCaches epyc_milan_cache_info = {
>           .self_init = true,
>           .inclusive = true,
>           .complex_indexing = true,
> +        .share_level = DIE,
>       },
>   };
>   
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index 8668e74e0c87..5a955431f759 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1476,6 +1476,15 @@ enum CacheType {
>       UNIFIED_CACHE
>   };
>   
> +enum CPUTopoLevel {
> +    INVALID = 0,
Maybe UNKNOWN?
> +    SMT,
> +    CORE,
> +    MODULE,
> +    DIE,
> +    PACKAGE,
Do we need a prefix like CPU_TOPO_LEVEL_* or the shorter CPU_TL_*
to avoid possible naming conflicts with other macros or enums?
Not sure..

Thanks,
Yanan
> +};
> +
>   typedef struct CPUCacheInfo {
>       enum CacheType type;
>       uint8_t level;
> @@ -1517,6 +1526,13 @@ typedef struct CPUCacheInfo {
>        * address bits.  CPUID[4].EDX[bit 2].
>        */
>       bool complex_indexing;
> +
> +    /*
> +     * Cache Topology. The level that cache is shared in.
> +     * Used to encode CPUID[4].EAX[bits 25:14] or
> +     * CPUID[0x8000001D].EAX[bits 25:14].
> +     */
> +    enum CPUTopoLevel share_level;
>   } CPUCacheInfo;
>   
>   




* Re: [PATCH RESEND 16/18] i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]
  2023-02-13  9:36 ` [PATCH RESEND 16/18] i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14] Zhao Liu
@ 2023-02-15 12:32   ` wangyanan (Y) via
  2023-02-15 15:09     ` Zhao Liu
  0 siblings, 1 reply; 61+ messages in thread
From: wangyanan (Y) via @ 2023-02-15 12:32 UTC (permalink / raw)
  To: Zhao Liu
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster

On 2023/2/13 17:36, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
>
> From AMD's APM, NumSharingCache (CPUID[0x8000001D].EAX[bits 25:14])
> means [1]:
>
> The number of logical processors sharing this cache is the value of
> this field incremented by 1. To determine which logical processors are
> sharing a cache, determine a Share Id for each processor as follows:
>
> ShareId = LocalApicId >> log2(NumSharingCache+1)
>
> Logical processors with the same ShareId then share a cache. If
> NumSharingCache+1 is not a power of two, round it up to the next power
> of two.
>
> From the description above, the calculation of this field should be the
> same as CPUID[4].EAX[bits 25:14] for Intel CPUs. So also use the offsets
> of the APIC ID to calculate this field.
>
> Note: I don't have the hardware available, hope someone can help me to
> confirm whether this calculation is correct, thanks!
>
> [1]: APM, vol.3, appendix.E.4.15 Function 8000_001Dh--Cache Topology
>       Information
>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
>   target/i386/cpu.c | 10 ++++------
>   1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 96ef96860604..d691c02e3c06 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -355,7 +355,7 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
>                                          uint32_t *eax, uint32_t *ebx,
>                                          uint32_t *ecx, uint32_t *edx)
>   {
> -    uint32_t l3_threads;
> +    uint32_t sharing_apic_ids;
maybe num_apic_ids or num_ids?
>       assert(cache->size == cache->line_size * cache->associativity *
>                             cache->partitions * cache->sets);
>   
> @@ -364,13 +364,11 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
>   
>       /* L3 is shared among multiple cores */
>       if (cache->level == 3) {
> -        l3_threads = topo_info->modules_per_die *
> -                     topo_info->cores_per_module *
> -                     topo_info->threads_per_core;
> -        *eax |= (l3_threads - 1) << 14;
> +        sharing_apic_ids = 1 << apicid_die_offset(topo_info);
>       } else {
> -        *eax |= ((topo_info->threads_per_core - 1) << 14);
> +        sharing_apic_ids = 1 << apicid_core_offset(topo_info);
>       }
> +    *eax |= (sharing_apic_ids - 1) << 14;
>   
>       assert(cache->line_size > 0);
>       assert(cache->partitions > 0);




* Re: [PATCH RESEND 04/18] i386/cpu: Fix number of addressable IDs in CPUID.04H
  2023-02-15 10:11   ` wangyanan (Y) via
@ 2023-02-15 14:33     ` Zhao Liu
  0 siblings, 0 replies; 61+ messages in thread
From: Zhao Liu @ 2023-02-15 14:33 UTC (permalink / raw)
  To: wangyanan (Y)
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster

On Wed, Feb 15, 2023 at 06:11:23PM +0800, wangyanan (Y) wrote:
> Date: Wed, 15 Feb 2023 18:11:23 +0800
> From: "wangyanan (Y)" <wangyanan55@huawei.com>
> Subject: Re: [PATCH RESEND 04/18] i386/cpu: Fix number of addressable IDs
>  in CPUID.04H
> 
> Hi Zhao,
> 
> On 2023/2/13 17:36, Zhao Liu wrote:
> > From: Zhao Liu <zhao1.liu@intel.com>
> > 
> > For i-cache and d-cache, the maximum IDs for CPUs sharing cache (
> > CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits 25:14]) are
> > both 0, which means the i-cache and d-cache are shared at the SMT level.
> > This is correct if there's a single thread per core, but is wrong for
> > the hyper-threading case (one core contains multiple threads) since the
> > i-cache and d-cache are shared at the core level rather than the SMT level.
> > 
> > Therefore, in order to be compatible with both multi-threaded and
> > single-threaded situations, we should set the i-cache and d-cache to be shared
> > at the core level by default.
> > 
> > Refer to the fixes of cache_info_passthrough ([1], [2]) and SDM, the
> > CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26] should use the
> > nearest power-of-2 integer.
> > 
> > The nearest power-of-2 integer can be calculated by pow2ceil() or by
> > using the APIC ID offset (like the L3 topology using 1 << die_offset [3]).
> > 
> > But in fact, CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26]
> > are associated with APIC ID. For example, in linux kernel, the field
> > "num_threads_sharing" (Bits 25 - 14) is parsed with APIC ID. And for
> > another example, on Alder Lake P, the CPUID.04H:EAX[bits 31:26] is not
> > matched with the actual core count and is calculated by:
> > "(1 << (pkg_offset - core_offset)) - 1".
> > 
> > Therefore the APIC ID offsets should be preferred to calculate the nearest
> > power-of-2 integer for CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits
> > 31:26]:
> > 1. d/i cache is shared in a core, 1 << core_offset should be used
> >     instead of "1" in encode_cache_cpuid4() for CPUID.04H.00H:EAX[bits
> >     25:14] and CPUID.04H.01H:EAX[bits 25:14].
> > 2. L2 cache is supposed to be shared in a core for now, thereby
> >     1 << core_offset should also be used instead of "cs->nr_threads" in
> >     encode_cache_cpuid4() for CPUID.04H.02H:EAX[bits 25:14].
> > 3. Similarly, the value for CPUID.04H:EAX[bits 31:26] should also be
> >     replaced by the offsets above the SMT level in the APIC ID.
> > 
> > And since [1] and [2] are good enough to make cache_info_passthrough
> > work well, its "pow2ceil()" uses are sufficient and there's no need to
> > replace them with the APIC ID offset approach.
> If you uniformly tweak these two places with the APIC ID offset too, then
> you can also use the more spec-compliant helpers
> (e.g. max_processor_ids_for_cache and max_core_ids_in_pkg) here in
> future patch #18. Would it be best to unify the code?

Yes, it makes sense! Will also do that. Thanks!

> 
> Thanks,
> Yanan
> > [1]: efb3934 (x86: cpu: make sure number of addressable IDs for processor cores meets the spec)
> > [2]: d7caf13 (x86: cpu: fixup number of addressable IDs for logical processors sharing cache)
> > [3]: d65af28 (i386: Update new x86_apicid parsing rules with die_offset support)
> > 
> > Fixes: 7e3482f (i386: Helpers to encode cache information consistently)
> > Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > ---
> >   target/i386/cpu.c | 20 +++++++++++++++-----
> >   1 file changed, 15 insertions(+), 5 deletions(-)
> > 
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 29afec12c281..7833505092d8 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -5212,7 +5212,6 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >   {
> >       X86CPU *cpu = env_archcpu(env);
> >       CPUState *cs = env_cpu(env);
> > -    uint32_t die_offset;
> >       uint32_t limit;
> >       uint32_t signature[3];
> >       X86CPUTopoInfo topo_info;
> > @@ -5308,27 +5307,38 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >               *eax = *ebx = *ecx = *edx = 0;
> >           } else {
> >               *eax = 0;
> > +            int addressable_cores_offset = apicid_pkg_offset(&topo_info) -
> > +                                           apicid_core_offset(&topo_info);
> > +            int core_offset, die_offset;
> > +
> >               switch (count) {
> >               case 0: /* L1 dcache info */
> > +                core_offset = apicid_core_offset(&topo_info);
> >                   encode_cache_cpuid4(env->cache_info_cpuid4.l1d_cache,
> > -                                    1, cs->nr_cores,
> > +                                    (1 << core_offset),
> > +                                    (1 << addressable_cores_offset),
> >                                       eax, ebx, ecx, edx);
> >                   break;
> >               case 1: /* L1 icache info */
> > +                core_offset = apicid_core_offset(&topo_info);
> >                   encode_cache_cpuid4(env->cache_info_cpuid4.l1i_cache,
> > -                                    1, cs->nr_cores,
> > +                                    (1 << core_offset),
> > +                                    (1 << addressable_cores_offset),
> >                                       eax, ebx, ecx, edx);
> >                   break;
> >               case 2: /* L2 cache info */
> > +                core_offset = apicid_core_offset(&topo_info);
> >                   encode_cache_cpuid4(env->cache_info_cpuid4.l2_cache,
> > -                                    cs->nr_threads, cs->nr_cores,
> > +                                    (1 << core_offset),
> > +                                    (1 << addressable_cores_offset),
> >                                       eax, ebx, ecx, edx);
> >                   break;
> >               case 3: /* L3 cache info */
> >                   die_offset = apicid_die_offset(&topo_info);
> >                   if (cpu->enable_l3_cache) {
> >                       encode_cache_cpuid4(env->cache_info_cpuid4.l3_cache,
> > -                                        (1 << die_offset), cs->nr_cores,
> > +                                        (1 << die_offset),
> > +                                        (1 << addressable_cores_offset),
> >                                           eax, ebx, ecx, edx);
> >                       break;
> >                   }
> 


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RESEND 07/18] i386: Support modules_per_die in X86CPUTopoInfo
  2023-02-15 10:38   ` wangyanan (Y) via
@ 2023-02-15 14:35     ` Zhao Liu
  0 siblings, 0 replies; 61+ messages in thread
From: Zhao Liu @ 2023-02-15 14:35 UTC (permalink / raw)
  To: wangyanan (Y)
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster

On Wed, Feb 15, 2023 at 06:38:31PM +0800, wangyanan (Y) wrote:
> Date: Wed, 15 Feb 2023 18:38:31 +0800
> From: "wangyanan (Y)" <wangyanan55@huawei.com>
> Subject: Re: [PATCH RESEND 07/18] i386: Support modules_per_die in
>  X86CPUTopoInfo
> 
> 在 2023/2/13 17:36, Zhao Liu 写道:
> > From: Zhuocheng Ding <zhuocheng.ding@intel.com>
> > 
> > Support module level in i386 cpu topology structure "X86CPUTopoInfo".
> > 
> > Before updating APIC ID parsing rule with module level, the
> > apicid_core_width() temporarily combines the core and module levels
> > together.
> > 
> > At present, we don't expose the module level in CPUID.1FH because
> > current Linux (v6.2-rc6) doesn't support the module level. And
> > exposing the module and die levels at the same time in CPUID.1FH will
> > cause Linux to calculate the wrong die_id. The module level should
> > not be exposed until real machines have the module level in CPUID.1FH.
> > 
> > In addition, update topology structure in test-x86-apicid.c.
> > 
> > Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> > Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > ---
> >   hw/i386/x86.c                |  3 ++-
> >   include/hw/i386/topology.h   | 13 ++++++++---
> >   target/i386/cpu.c            | 17 ++++++++------
> >   tests/unit/test-x86-apicid.c | 45 +++++++++++++++++++-----------------
> >   4 files changed, 46 insertions(+), 32 deletions(-)
> > 
> > diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> > index ae1bb562d6e2..1c069ff56ae7 100644
> > --- a/hw/i386/x86.c
> > +++ b/hw/i386/x86.c
> > @@ -71,7 +71,8 @@ inline void init_topo_info(X86CPUTopoInfo *topo_info,
> >       MachineState *ms = MACHINE(x86ms);
> >       topo_info->dies_per_pkg = ms->smp.dies;
> > -    topo_info->cores_per_die = ms->smp.cores;
> > +    topo_info->modules_per_die = ms->smp.clusters;
> > +    topo_info->cores_per_module = ms->smp.cores;
> >       topo_info->threads_per_core = ms->smp.threads;
> >   }
> > diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> > index 81573f6cfde0..bbb00dc4aad8 100644
> > --- a/include/hw/i386/topology.h
> > +++ b/include/hw/i386/topology.h
> > @@ -54,7 +54,8 @@ typedef struct X86CPUTopoIDs {
> >   typedef struct X86CPUTopoInfo {
> >       unsigned dies_per_pkg;
> > -    unsigned cores_per_die;
> > +    unsigned modules_per_die;
> > +    unsigned cores_per_module;
> >       unsigned threads_per_core;
> >   } X86CPUTopoInfo;
> > @@ -78,7 +79,12 @@ static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
> >    */
> >   static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
> >   {
> > -    return apicid_bitwidth_for_count(topo_info->cores_per_die);
> > +    /*
> > +     * TODO: Will separate module info from core_width when update
> > +     * APIC ID with module level.
> > +     */
> > +    return apicid_bitwidth_for_count(topo_info->cores_per_module *
> > +                                     topo_info->modules_per_die);
> >   }
> >   /* Bit width of the Die_ID field */
> > @@ -128,7 +134,8 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
> >                                            X86CPUTopoIDs *topo_ids)
> >   {
> >       unsigned nr_dies = topo_info->dies_per_pkg;
> > -    unsigned nr_cores = topo_info->cores_per_die;
> > +    unsigned nr_cores = topo_info->cores_per_module *
> > +                        topo_info->modules_per_die;
> >       unsigned nr_threads = topo_info->threads_per_core;
> >       topo_ids->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 61ec9a7499b8..6f3d114c7d12 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -336,7 +336,9 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
> >       /* L3 is shared among multiple cores */
> >       if (cache->level == 3) {
> > -        l3_threads = topo_info->cores_per_die * topo_info->threads_per_core;
> > +        l3_threads = topo_info->modules_per_die *
> > +                     topo_info->cores_per_module *
> > +                     topo_info->threads_per_core;
> >           *eax |= (l3_threads - 1) << 14;
> >       } else {
> >           *eax |= ((topo_info->threads_per_core - 1) << 14);
> > @@ -5218,11 +5220,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >       uint32_t cpus_per_pkg;
> >       topo_info.dies_per_pkg = env->nr_dies;
> > -    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
> > +    topo_info.modules_per_die = env->nr_modules;
> > +    topo_info.cores_per_module = cs->nr_cores / env->nr_dies / env->nr_modules;
> >       topo_info.threads_per_core = cs->nr_threads;
> > -    cpus_per_pkg = topo_info.dies_per_pkg * topo_info.cores_per_die *
> > -                   topo_info.threads_per_core;
> > +    cpus_per_pkg = topo_info.dies_per_pkg * topo_info.modules_per_die *
> > +                   topo_info.cores_per_module * topo_info.threads_per_core;
> >       /* Calculate & apply limits for different index ranges */
> >       if (index >= 0xC0000000) {
> > @@ -5298,8 +5301,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >               if (*eax & 31) {
> >                   int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
> >                   int vcpus_per_socket = cpus_per_pkg;
> > -                int cores_per_socket = topo_info.cores_per_die *
> > -                                       topo_info.dies_per_pkg;
> > +                int cores_per_socket = cpus_per_pkg /
> > +                                       topo_info.threads_per_core;
> As mentioned in patch 5, cores_per_socket can be function-global.

Yeah, got it.

> >                   if (cores_per_socket > 1) {
> >                       *eax &= ~0xFC000000;
> >                       *eax |= (pow2ceil(cores_per_socket) - 1) << 26;
> > @@ -5483,7 +5486,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >               break;
> >           case 1:
> >               *eax = apicid_die_offset(&topo_info);
> > -            *ebx = topo_info.cores_per_die * topo_info.threads_per_core;
> > +            *ebx = cpus_per_pkg / topo_info.dies_per_pkg;
> >               *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
> >               break;
> >           case 2:
> > diff --git a/tests/unit/test-x86-apicid.c b/tests/unit/test-x86-apicid.c
> > index 2b104f86d7c2..f21b8a5d95c2 100644
> > --- a/tests/unit/test-x86-apicid.c
> > +++ b/tests/unit/test-x86-apicid.c
> > @@ -30,13 +30,16 @@ static void test_topo_bits(void)
> >   {
> >       X86CPUTopoInfo topo_info = {0};
> > -    /* simple tests for 1 thread per core, 1 core per die, 1 die per package */
> > -    topo_info = (X86CPUTopoInfo) {1, 1, 1};
> > +    /*
> > +     * simple tests for 1 thread per core, 1 core per module,
> > +     *                  1 module per die, 1 die per package
> > +     */
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 1};
> >       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 0);
> >       g_assert_cmpuint(apicid_core_width(&topo_info), ==, 0);
> >       g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
> > -    topo_info = (X86CPUTopoInfo) {1, 1, 1};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 1};
> >       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
> >       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
> >       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
> > @@ -45,39 +48,39 @@ static void test_topo_bits(void)
> >       /* Test field width calculation for multiple values
> >        */
> > -    topo_info = (X86CPUTopoInfo) {1, 1, 2};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 2};
> >       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 1);
> > -    topo_info = (X86CPUTopoInfo) {1, 1, 3};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 3};
> >       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
> > -    topo_info = (X86CPUTopoInfo) {1, 1, 4};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 4};
> >       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
> > -    topo_info = (X86CPUTopoInfo) {1, 1, 14};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 14};
> >       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
> > -    topo_info = (X86CPUTopoInfo) {1, 1, 15};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 15};
> >       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
> > -    topo_info = (X86CPUTopoInfo) {1, 1, 16};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 16};
> >       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
> > -    topo_info = (X86CPUTopoInfo) {1, 1, 17};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 17};
> >       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 5);
> > -    topo_info = (X86CPUTopoInfo) {1, 30, 2};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 30, 2};
> >       g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
> > -    topo_info = (X86CPUTopoInfo) {1, 31, 2};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 31, 2};
> >       g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
> > -    topo_info = (X86CPUTopoInfo) {1, 32, 2};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 32, 2};
> >       g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
> > -    topo_info = (X86CPUTopoInfo) {1, 33, 2};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 33, 2};
> >       g_assert_cmpuint(apicid_core_width(&topo_info), ==, 6);
> > -    topo_info = (X86CPUTopoInfo) {1, 30, 2};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 30, 2};
> >       g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
> > -    topo_info = (X86CPUTopoInfo) {2, 30, 2};
> > +    topo_info = (X86CPUTopoInfo) {2, 1, 30, 2};
> >       g_assert_cmpuint(apicid_die_width(&topo_info), ==, 1);
> > -    topo_info = (X86CPUTopoInfo) {3, 30, 2};
> > +    topo_info = (X86CPUTopoInfo) {3, 1, 30, 2};
> >       g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
> > -    topo_info = (X86CPUTopoInfo) {4, 30, 2};
> > +    topo_info = (X86CPUTopoInfo) {4, 1, 30, 2};
> >       g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
> >       /* build a weird topology and see if IDs are calculated correctly
> > @@ -85,18 +88,18 @@ static void test_topo_bits(void)
> >       /* This will use 2 bits for thread ID and 3 bits for core ID
> >        */
> > -    topo_info = (X86CPUTopoInfo) {1, 6, 3};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
> >       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
> >       g_assert_cmpuint(apicid_core_offset(&topo_info), ==, 2);
> >       g_assert_cmpuint(apicid_die_offset(&topo_info), ==, 5);
> >       g_assert_cmpuint(apicid_pkg_offset(&topo_info), ==, 5);
> > -    topo_info = (X86CPUTopoInfo) {1, 6, 3};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
> >       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
> >       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
> >       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
> > -    topo_info = (X86CPUTopoInfo) {1, 6, 3};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
> >       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 0), ==,
> >                        (1 << 2) | 0);
> >       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 1), ==,
> 


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RESEND 09/18] i386: Fix comment style in topology.h
  2023-02-15 10:54   ` wangyanan (Y) via
@ 2023-02-15 14:35     ` Zhao Liu
  0 siblings, 0 replies; 61+ messages in thread
From: Zhao Liu @ 2023-02-15 14:35 UTC (permalink / raw)
  To: wangyanan (Y)
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster

On Wed, Feb 15, 2023 at 06:54:15PM +0800, wangyanan (Y) wrote:
> Date: Wed, 15 Feb 2023 18:54:15 +0800
> From: "wangyanan (Y)" <wangyanan55@huawei.com>
> Subject: Re: [PATCH RESEND 09/18] i386: Fix comment style in topology.h
> 
> 在 2023/2/13 17:36, Zhao Liu 写道:
> > From: Zhao Liu <zhao1.liu@intel.com>
> > 
> > For function comments in this file, keep the comment style consistent
> > with other places.
> > 
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> nit: Better to move this cleanup patch to the top of the series.

Ok, will do that. Thanks!

> > ---
> >   include/hw/i386/topology.h | 33 +++++++++++++++++----------------
> >   1 file changed, 17 insertions(+), 16 deletions(-)
> > 
> > diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> > index b0174c18b7bd..5de905dc00d3 100644
> > --- a/include/hw/i386/topology.h
> > +++ b/include/hw/i386/topology.h
> > @@ -24,7 +24,8 @@
> >   #ifndef HW_I386_TOPOLOGY_H
> >   #define HW_I386_TOPOLOGY_H
> > -/* This file implements the APIC-ID-based CPU topology enumeration logic,
> > +/*
> > + * This file implements the APIC-ID-based CPU topology enumeration logic,
> >    * documented at the following document:
> >    *   Intel® 64 Architecture Processor Topology Enumeration
> >    *   http://software.intel.com/en-us/articles/intel-64-architecture-processor-topology-enumeration/
> > @@ -41,7 +42,8 @@
> >   #include "qemu/bitops.h"
> > -/* APIC IDs can be 32-bit, but beware: APIC IDs > 255 require x2APIC support
> > +/*
> > + * APIC IDs can be 32-bit, but beware: APIC IDs > 255 require x2APIC support
> >    */
> >   typedef uint32_t apic_id_t;
> > @@ -60,8 +62,7 @@ typedef struct X86CPUTopoInfo {
> >       unsigned threads_per_core;
> >   } X86CPUTopoInfo;
> > -/* Return the bit width needed for 'count' IDs
> > - */
> > +/* Return the bit width needed for 'count' IDs */
> >   static unsigned apicid_bitwidth_for_count(unsigned count)
> >   {
> >       g_assert(count >= 1);
> > @@ -69,15 +70,13 @@ static unsigned apicid_bitwidth_for_count(unsigned count)
> >       return count ? 32 - clz32(count) : 0;
> >   }
> > -/* Bit width of the SMT_ID (thread ID) field on the APIC ID
> > - */
> > +/* Bit width of the SMT_ID (thread ID) field on the APIC ID */
> >   static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
> >   {
> >       return apicid_bitwidth_for_count(topo_info->threads_per_core);
> >   }
> > -/* Bit width of the Core_ID field
> > - */
> > +/* Bit width of the Core_ID field */
> >   static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
> >   {
> >       /*
> > @@ -94,8 +93,7 @@ static inline unsigned apicid_die_width(X86CPUTopoInfo *topo_info)
> >       return apicid_bitwidth_for_count(topo_info->dies_per_pkg);
> >   }
> > -/* Bit offset of the Core_ID field
> > - */
> > +/* Bit offset of the Core_ID field */
> >   static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
> >   {
> >       return apicid_smt_width(topo_info);
> > @@ -107,14 +105,14 @@ static inline unsigned apicid_die_offset(X86CPUTopoInfo *topo_info)
> >       return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
> >   }
> > -/* Bit offset of the Pkg_ID (socket ID) field
> > - */
> > +/* Bit offset of the Pkg_ID (socket ID) field */
> >   static inline unsigned apicid_pkg_offset(X86CPUTopoInfo *topo_info)
> >   {
> >       return apicid_die_offset(topo_info) + apicid_die_width(topo_info);
> >   }
> > -/* Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
> > +/*
> > + * Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
> >    *
> >    * The caller must make sure core_id < nr_cores and smt_id < nr_threads.
> >    */
> > @@ -127,7 +125,8 @@ static inline apic_id_t x86_apicid_from_topo_ids(X86CPUTopoInfo *topo_info,
> >              topo_ids->smt_id;
> >   }
> > -/* Calculate thread/core/package IDs for a specific topology,
> > +/*
> > + * Calculate thread/core/package IDs for a specific topology,
> >    * based on (contiguous) CPU index
> >    */
> >   static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
> > @@ -154,7 +153,8 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
> >       topo_ids->smt_id = cpu_index % nr_threads;
> >   }
> > -/* Calculate thread/core/package IDs for a specific topology,
> > +/*
> > + * Calculate thread/core/package IDs for a specific topology,
> >    * based on APIC ID
> >    */
> >   static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
> > @@ -178,7 +178,8 @@ static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
> >       topo_ids->pkg_id = apicid >> apicid_pkg_offset(topo_info);
> >   }
> > -/* Make APIC ID for the CPU 'cpu_index'
> > +/*
> > + * Make APIC ID for the CPU 'cpu_index'
> >    *
> >    * 'cpu_index' is a sequential, contiguous ID for the CPU.
> >    */
> 


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RESEND 10/18] i386: Update APIC ID parsing rule to support module level
  2023-02-15 11:06   ` wangyanan (Y) via
@ 2023-02-15 15:03     ` Zhao Liu
  2023-02-16  2:40       ` wangyanan (Y) via
  0 siblings, 1 reply; 61+ messages in thread
From: Zhao Liu @ 2023-02-15 15:03 UTC (permalink / raw)
  To: wangyanan (Y)
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster

On Wed, Feb 15, 2023 at 07:06:32PM +0800, wangyanan (Y) wrote:
> Date: Wed, 15 Feb 2023 19:06:32 +0800
> From: "wangyanan (Y)" <wangyanan55@huawei.com>
> Subject: Re: [PATCH RESEND 10/18] i386: Update APIC ID parsing rule to
>  support module level
> 
> Hi Zhao,
> 
> 在 2023/2/13 17:36, Zhao Liu 写道:
> > From: Zhuocheng Ding <zhuocheng.ding@intel.com>
> > 
> > Add the module level parsing support for APIC ID.
> > 
> > With this support, now the conversion between X86CPUTopoIDs,
> > X86CPUTopoInfo and APIC ID is completed.
> IIUC, contents in patch 6-8 and 10 are all about "Introduce the module-level
> CPU topology support for x86", so why do we need to do this gradually with
> various temporary things instead of wrapping them into one patch?

Patch 6 is about CPUX86State.nr_dies, which is independent from
patch 7, 8, 10.

Patch 7 (X86CPUTopoInfo.modules_per_die), patch 8 (X86CPUTopoIDs.module_id),
and patch 10 (APIC ID parsing rule) are related but have their own
relatively clear small themes, and gradually complete full
support for the module level in the APIC ID.

Patch 7, 8, 10 can be combined into one big patch. The current
way of splitting is actually designed to make the series easier to
review... But if you think it is not convenient for review, sorry
for that, and I'm willing to merge them together. ;-)

Thanks,
Zhao

> Before support
> for smp.clusters in the CLI for x86, we can ensure that modules_per_die is
> always 1 so that the code is safe in one diff. Or do I miss something?
> 
> Thanks,
> Yanan
> > Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> > Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > ---
> >   hw/i386/x86.c              | 19 ++++++++-----------
> >   include/hw/i386/topology.h | 36 ++++++++++++++++++------------------
> >   2 files changed, 26 insertions(+), 29 deletions(-)
> > 
> > diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> > index b90c6584930a..2a9d080a8e7a 100644
> > --- a/hw/i386/x86.c
> > +++ b/hw/i386/x86.c
> > @@ -311,11 +311,11 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
> >       /*
> >        * If APIC ID is not set,
> > -     * set it based on socket/die/core/thread properties.
> > +     * set it based on socket/die/cluster/core/thread properties.
> >        */
> >       if (cpu->apic_id == UNASSIGNED_APIC_ID) {
> > -        int max_socket = (ms->smp.max_cpus - 1) /
> > -                                smp_threads / smp_cores / ms->smp.dies;
> > +        int max_socket = (ms->smp.max_cpus - 1) / smp_threads / smp_cores /
> > +                                ms->smp.clusters / ms->smp.dies;
> >           /*
> >            * die-id was optional in QEMU 4.0 and older, so keep it optional
> > @@ -379,15 +379,12 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
> >           x86_topo_ids_from_apicid(cpu->apic_id, &topo_info, &topo_ids);
> > -        /*
> > -         * TODO: Before APIC ID supports module level parsing, there's no need
> > -         * to expose module_id info.
> > -         */
> >           error_setg(errp,
> > -            "Invalid CPU [socket: %u, die: %u, core: %u, thread: %u] with"
> > -            " APIC ID %" PRIu32 ", valid index range 0:%d",
> > -            topo_ids.pkg_id, topo_ids.die_id, topo_ids.core_id, topo_ids.smt_id,
> > -            cpu->apic_id, ms->possible_cpus->len - 1);
> > +            "Invalid CPU [socket: %u, die: %u, module: %u, core: %u, thread: %u]"
> > +            " with APIC ID %" PRIu32 ", valid index range 0:%d",
> > +            topo_ids.pkg_id, topo_ids.die_id, topo_ids.module_id,
> > +            topo_ids.core_id, topo_ids.smt_id, cpu->apic_id,
> > +            ms->possible_cpus->len - 1);
> >           return;
> >       }
> > diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> > index 5de905dc00d3..3cec97b377f2 100644
> > --- a/include/hw/i386/topology.h
> > +++ b/include/hw/i386/topology.h
> > @@ -79,12 +79,13 @@ static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
> >   /* Bit width of the Core_ID field */
> >   static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
> >   {
> > -    /*
> > -     * TODO: Will separate module info from core_width when update
> > -     * APIC ID with module level.
> > -     */
> > -    return apicid_bitwidth_for_count(topo_info->cores_per_module *
> > -                                     topo_info->modules_per_die);
> > +    return apicid_bitwidth_for_count(topo_info->cores_per_module);
> > +}
> > +
> > +/* Bit width of the Module_ID (cluster ID) field */
> > +static inline unsigned apicid_module_width(X86CPUTopoInfo *topo_info)
> > +{
> > +    return apicid_bitwidth_for_count(topo_info->modules_per_die);
> >   }
> >   /* Bit width of the Die_ID field */
> > @@ -99,10 +100,16 @@ static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
> >       return apicid_smt_width(topo_info);
> >   }
> > +/* Bit offset of the Module_ID (cluster ID) field */
> > +static inline unsigned apicid_module_offset(X86CPUTopoInfo *topo_info)
> > +{
> > +    return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
> > +}
> > +
> >   /* Bit offset of the Die_ID field */
> >   static inline unsigned apicid_die_offset(X86CPUTopoInfo *topo_info)
> >   {
> > -    return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
> > +    return apicid_module_offset(topo_info) + apicid_module_width(topo_info);
> >   }
> >   /* Bit offset of the Pkg_ID (socket ID) field */
> > @@ -121,6 +128,7 @@ static inline apic_id_t x86_apicid_from_topo_ids(X86CPUTopoInfo *topo_info,
> >   {
> >       return (topo_ids->pkg_id  << apicid_pkg_offset(topo_info)) |
> >              (topo_ids->die_id  << apicid_die_offset(topo_info)) |
> > +           (topo_ids->module_id << apicid_module_offset(topo_info)) |
> >              (topo_ids->core_id << apicid_core_offset(topo_info)) |
> >              topo_ids->smt_id;
> >   }
> > @@ -138,11 +146,6 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
> >       unsigned nr_cores = topo_info->cores_per_module;
> >       unsigned nr_threads = topo_info->threads_per_core;
> > -    /*
> > -     * Currently smp for i386 doesn't support "clusters", modules_per_die is
> > -     * only 1. Therefore, the module_id generated from the module topology will
> > -     * not conflict with the module_id generated according to the apicid.
> > -     */
> >       topo_ids->pkg_id = cpu_index / (nr_dies * nr_modules *
> >                          nr_cores * nr_threads);
> >       topo_ids->die_id = cpu_index / (nr_modules * nr_cores *
> > @@ -166,12 +169,9 @@ static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
> >       topo_ids->core_id =
> >               (apicid >> apicid_core_offset(topo_info)) &
> >               ~(0xFFFFFFFFUL << apicid_core_width(topo_info));
> > -    /*
> > -     * TODO: This is the temporary initialization for topo_ids.module_id to
> > -     * avoid "maybe-uninitialized" compilation errors. Will remove when APIC
> > -     * ID supports module level parsing.
> > -     */
> > -    topo_ids->module_id = 0;
> > +    topo_ids->module_id =
> > +            (apicid >> apicid_module_offset(topo_info)) &
> > +            ~(0xFFFFFFFFUL << apicid_module_width(topo_info));
> >       topo_ids->die_id =
> >               (apicid >> apicid_die_offset(topo_info)) &
> >               ~(0xFFFFFFFFUL << apicid_die_width(topo_info));
> 


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RESEND 14/18] i386: Add cache topology info in CPUCacheInfo
  2023-02-15 12:17   ` wangyanan (Y) via
@ 2023-02-15 15:07     ` Zhao Liu
  0 siblings, 0 replies; 61+ messages in thread
From: Zhao Liu @ 2023-02-15 15:07 UTC (permalink / raw)
  To: wangyanan (Y)
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster

On Wed, Feb 15, 2023 at 08:17:22PM +0800, wangyanan (Y) wrote:
> Date: Wed, 15 Feb 2023 20:17:22 +0800
> From: "wangyanan (Y)" <wangyanan55@huawei.com>
> Subject: Re: [PATCH RESEND 14/18] i386: Add cache topology info in
>  CPUCacheInfo
> 
> Hi Zhao,
> 
> 在 2023/2/13 17:36, Zhao Liu 写道:
> > From: Zhao Liu <zhao1.liu@intel.com>
> > 
> > Currently, by default, the cache topology is encoded as:
> > 1. i/d cache is shared in one core.
> > 2. L2 cache is shared in one core.
> > 3. L3 cache is shared in one die.
> > 
> > This default general setting has caused a misunderstanding, namely that
> > the cache topology is completely equated with a specific CPU topology,
> > such as the connection between the L2 cache and the core level, and the
> > connection between the L3 cache and the die level.
> > 
> > In fact, the settings of these topologies depend on the specific
> > platform and are not static. For example, on Alder Lake-P, every
> > four Atom cores share the same L2 cache.
> > 
> > Thus, we should explicitly define the corresponding cache topology for
> > different cache models to increase scalability.
> > 
> > Except legacy_l2_cache_cpuid2 (its default topo level is INVALID),
> It seems like its default topo level is UNKNOWN in this case.
> > explicitly set the corresponding topology level for all other cache
> > models. In order to be compatible with the existing cache topology, set
> > the CORE level for the i/d cache, set the CORE level for L2 cache, and
> > set the DIE level for L3 cache.
> > 
> > The field for CPUID[4].EAX[bits 25:14] or CPUID[0x8000001D].EAX[bits
> > 25:14] will be set based on CPUCacheInfo.share_level.
> > 
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > ---
> >   target/i386/cpu.c | 19 +++++++++++++++++++
> >   target/i386/cpu.h | 16 ++++++++++++++++
> >   2 files changed, 35 insertions(+)
> > 
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 27bbbc36b11c..364534e84b1b 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -433,6 +433,7 @@ static CPUCacheInfo legacy_l1d_cache = {
> >       .sets = 64,
> >       .partitions = 1,
> >       .no_invd_sharing = true,
> > +    .share_level = CORE,
> >   };
> >   /*FIXME: CPUID leaf 0x80000005 is inconsistent with leaves 2 & 4 */
> > @@ -447,6 +448,7 @@ static CPUCacheInfo legacy_l1d_cache_amd = {
> >       .partitions = 1,
> >       .lines_per_tag = 1,
> >       .no_invd_sharing = true,
> > +    .share_level = CORE,
> >   };
> >   /* L1 instruction cache: */
> > @@ -460,6 +462,7 @@ static CPUCacheInfo legacy_l1i_cache = {
> >       .sets = 64,
> >       .partitions = 1,
> >       .no_invd_sharing = true,
> > +    .share_level = CORE,
> >   };
> >   /*FIXME: CPUID leaf 0x80000005 is inconsistent with leaves 2 & 4 */
> > @@ -474,6 +477,7 @@ static CPUCacheInfo legacy_l1i_cache_amd = {
> >       .partitions = 1,
> >       .lines_per_tag = 1,
> >       .no_invd_sharing = true,
> > +    .share_level = CORE,
> >   };
> >   /* Level 2 unified cache: */
> > @@ -487,6 +491,7 @@ static CPUCacheInfo legacy_l2_cache = {
> >       .sets = 4096,
> >       .partitions = 1,
> >       .no_invd_sharing = true,
> > +    .share_level = CORE,
> >   };
> >   /*FIXME: CPUID leaf 2 descriptor is inconsistent with CPUID leaf 4 */
> > @@ -509,6 +514,7 @@ static CPUCacheInfo legacy_l2_cache_amd = {
> >       .associativity = 16,
> >       .sets = 512,
> >       .partitions = 1,
> > +    .share_level = CORE,
> >   };
> >   /* Level 3 unified cache: */
> > @@ -524,6 +530,7 @@ static CPUCacheInfo legacy_l3_cache = {
> >       .self_init = true,
> >       .inclusive = true,
> >       .complex_indexing = true,
> > +    .share_level = DIE,
> >   };
> >   /* TLB definitions: */
> > @@ -1668,6 +1675,7 @@ static const CPUCaches epyc_cache_info = {
> >           .lines_per_tag = 1,
> >           .self_init = 1,
> >           .no_invd_sharing = true,
> > +        .share_level = CORE,
> >       },
> >       .l1i_cache = &(CPUCacheInfo) {
> >           .type = INSTRUCTION_CACHE,
> > @@ -1680,6 +1688,7 @@ static const CPUCaches epyc_cache_info = {
> >           .lines_per_tag = 1,
> >           .self_init = 1,
> >           .no_invd_sharing = true,
> > +        .share_level = CORE,
> >       },
> >       .l2_cache = &(CPUCacheInfo) {
> >           .type = UNIFIED_CACHE,
> > @@ -1690,6 +1699,7 @@ static const CPUCaches epyc_cache_info = {
> >           .partitions = 1,
> >           .sets = 1024,
> >           .lines_per_tag = 1,
> > +        .share_level = CORE,
> >       },
> >       .l3_cache = &(CPUCacheInfo) {
> >           .type = UNIFIED_CACHE,
> > @@ -1703,6 +1713,7 @@ static const CPUCaches epyc_cache_info = {
> >           .self_init = true,
> >           .inclusive = true,
> >           .complex_indexing = true,
> > +        .share_level = DIE,
> >       },
> >   };
> > @@ -1718,6 +1729,7 @@ static const CPUCaches epyc_rome_cache_info = {
> >           .lines_per_tag = 1,
> >           .self_init = 1,
> >           .no_invd_sharing = true,
> > +        .share_level = CORE,
> >       },
> >       .l1i_cache = &(CPUCacheInfo) {
> >           .type = INSTRUCTION_CACHE,
> > @@ -1730,6 +1742,7 @@ static const CPUCaches epyc_rome_cache_info = {
> >           .lines_per_tag = 1,
> >           .self_init = 1,
> >           .no_invd_sharing = true,
> > +        .share_level = CORE,
> >       },
> >       .l2_cache = &(CPUCacheInfo) {
> >           .type = UNIFIED_CACHE,
> > @@ -1740,6 +1753,7 @@ static const CPUCaches epyc_rome_cache_info = {
> >           .partitions = 1,
> >           .sets = 1024,
> >           .lines_per_tag = 1,
> > +        .share_level = CORE,
> >       },
> >       .l3_cache = &(CPUCacheInfo) {
> >           .type = UNIFIED_CACHE,
> > @@ -1753,6 +1767,7 @@ static const CPUCaches epyc_rome_cache_info = {
> >           .self_init = true,
> >           .inclusive = true,
> >           .complex_indexing = true,
> > +        .share_level = DIE,
> >       },
> >   };
> > @@ -1768,6 +1783,7 @@ static const CPUCaches epyc_milan_cache_info = {
> >           .lines_per_tag = 1,
> >           .self_init = 1,
> >           .no_invd_sharing = true,
> > +        .share_level = CORE,
> >       },
> >       .l1i_cache = &(CPUCacheInfo) {
> >           .type = INSTRUCTION_CACHE,
> > @@ -1780,6 +1796,7 @@ static const CPUCaches epyc_milan_cache_info = {
> >           .lines_per_tag = 1,
> >           .self_init = 1,
> >           .no_invd_sharing = true,
> > +        .share_level = CORE,
> >       },
> >       .l2_cache = &(CPUCacheInfo) {
> >           .type = UNIFIED_CACHE,
> > @@ -1790,6 +1807,7 @@ static const CPUCaches epyc_milan_cache_info = {
> >           .partitions = 1,
> >           .sets = 1024,
> >           .lines_per_tag = 1,
> > +        .share_level = CORE,
> >       },
> >       .l3_cache = &(CPUCacheInfo) {
> >           .type = UNIFIED_CACHE,
> > @@ -1803,6 +1821,7 @@ static const CPUCaches epyc_milan_cache_info = {
> >           .self_init = true,
> >           .inclusive = true,
> >           .complex_indexing = true,
> > +        .share_level = DIE,
> >       },
> >   };
> > diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> > index 8668e74e0c87..5a955431f759 100644
> > --- a/target/i386/cpu.h
> > +++ b/target/i386/cpu.h
> > @@ -1476,6 +1476,15 @@ enum CacheType {
> >       UNIFIED_CACHE
> >   };
> > +enum CPUTopoLevel {
> > +    INVALID = 0,
> Maybe UNKNOWN?

I agree, it's a better name than INVALID.

> > +    SMT,
> > +    CORE,
> > +    MODULE,
> > +    DIE,
> > +    PACKAGE,
> Do we need a prefix like CPU_TOPO_LEVEL_* or the shorter CPU_TL_*
> to avoid possible naming conflicts with other macros or enums?
> Not sure..

Yes, adding a prefix to the enum is more in line with the QEMU naming
style; I will add it. I prefer the first name, as it looks clearer.

Thanks,
Zhao

> 
> Thanks,
> Yanan
> > +};
> > +
> >   typedef struct CPUCacheInfo {
> >       enum CacheType type;
> >       uint8_t level;
> > @@ -1517,6 +1526,13 @@ typedef struct CPUCacheInfo {
> >        * address bits.  CPUID[4].EDX[bit 2].
> >        */
> >       bool complex_indexing;
> > +
> > +    /*
> > +     * Cache Topology. The level that cache is shared in.
> > +     * Used to encode CPUID[4].EAX[bits 25:14] or
> > +     * CPUID[0x8000001D].EAX[bits 25:14].
> > +     */
> > +    enum CPUTopoLevel share_level;
> >   } CPUCacheInfo;
> 


^ permalink raw reply	[flat|nested] 61+ messages in thread
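The encoding discussed in the patch above — CPUID[4].EAX[bits 25:14] derived from the share level's APIC ID offset — can be sketched as a standalone fragment. This is an illustration, not the patch's actual code; the function name and the `share_level_offset` parameter are invented here.

```c
#include <stdint.h>

/*
 * Illustrative sketch (hypothetical names): the "maximum number of
 * addressable IDs sharing this cache" field in CPUID[4].EAX[bits 25:14]
 * is the count of APIC IDs spanned by the cache's share level, minus 1.
 * share_level_offset stands in for the APIC ID bit offset of that level
 * (e.g. the core offset for CORE, the die offset for DIE).
 */
static uint32_t encode_num_sharing_eax(unsigned share_level_offset)
{
    uint32_t num_ids_sharing = 1u << share_level_offset;
    return (num_ids_sharing - 1) << 14;   /* lands in EAX[bits 25:14] */
}
```

For example, with a core offset of 1 (two threads per core), the field encodes 1, i.e. two logical processors share the cache.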

* Re: [PATCH RESEND 16/18] i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]
  2023-02-15 12:32   ` wangyanan (Y) via
@ 2023-02-15 15:09     ` Zhao Liu
  0 siblings, 0 replies; 61+ messages in thread
From: Zhao Liu @ 2023-02-15 15:09 UTC (permalink / raw)
  To: wangyanan (Y)
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster

On Wed, Feb 15, 2023 at 08:32:39PM +0800, wangyanan (Y) wrote:
> Date: Wed, 15 Feb 2023 20:32:39 +0800
> From: "wangyanan (Y)" <wangyanan55@huawei.com>
> Subject: Re: [PATCH RESEND 16/18] i386: Fix NumSharingCache for
>  CPUID[0x8000001D].EAX[bits 25:14]
> 
> 在 2023/2/13 17:36, Zhao Liu 写道:
> > From: Zhao Liu <zhao1.liu@intel.com>
> > 
> > From AMD's APM, NumSharingCache (CPUID[0x8000001D].EAX[bits 25:14])
> > means [1]:
> > 
> > The number of logical processors sharing this cache is the value of
> > this field incremented by 1. To determine which logical processors are
> > sharing a cache, determine a Share Id for each processor as follows:
> > 
> > ShareId = LocalApicId >> log2(NumSharingCache+1)
> > 
> > Logical processors with the same ShareId then share a cache. If
> > NumSharingCache+1 is not a power of two, round it up to the next power
> > of two.
> > 
> > From the description above, the calculation of this field should be the same
> > as for CPUID[4].EAX[bits 25:14] on Intel CPUs. So also use the APIC ID
> > offsets to calculate this field.
> > 
> > Note: I don't have the hardware available; I hope someone can help me
> > confirm whether this calculation is correct, thanks!
> > 
> > [1]: APM, vol.3, appendix.E.4.15 Function 8000_001Dh--Cache Topology
> >       Information
> > 
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > ---
> >   target/i386/cpu.c | 10 ++++------
> >   1 file changed, 4 insertions(+), 6 deletions(-)
> > 
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 96ef96860604..d691c02e3c06 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -355,7 +355,7 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
> >                                          uint32_t *eax, uint32_t *ebx,
> >                                          uint32_t *ecx, uint32_t *edx)
> >   {
> > -    uint32_t l3_threads;
> > +    uint32_t sharing_apic_ids;
> maybe num_apic_ids or num_ids?

Thanks, num is better as a prefix. I would use num_apic_ids.

Zhao

> >       assert(cache->size == cache->line_size * cache->associativity *
> >                             cache->partitions * cache->sets);
> > @@ -364,13 +364,11 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
> >       /* L3 is shared among multiple cores */
> >       if (cache->level == 3) {
> > -        l3_threads = topo_info->modules_per_die *
> > -                     topo_info->cores_per_module *
> > -                     topo_info->threads_per_core;
> > -        *eax |= (l3_threads - 1) << 14;
> > +        sharing_apic_ids = 1 << apicid_die_offset(topo_info);
> >       } else {
> > -        *eax |= ((topo_info->threads_per_core - 1) << 14);
> > +        sharing_apic_ids = 1 << apicid_core_offset(topo_info);
> >       }
> > +    *eax |= (sharing_apic_ids - 1) << 14;
> >       assert(cache->line_size > 0);
> >       assert(cache->partitions > 0);
> 



* Re: [PATCH RESEND 07/18] i386: Support modules_per_die in X86CPUTopoInfo
  2023-02-13  9:36 ` [PATCH RESEND 07/18] i386: Support modules_per_die in X86CPUTopoInfo Zhao Liu
  2023-02-15 10:38   ` wangyanan (Y) via
@ 2023-02-16  2:34   ` wangyanan (Y) via
  2023-02-16  4:33     ` Zhao Liu
  1 sibling, 1 reply; 61+ messages in thread
From: wangyanan (Y) via @ 2023-02-16  2:34 UTC (permalink / raw)
  To: Zhao Liu
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster

Hi Zhao,

在 2023/2/13 17:36, Zhao Liu 写道:
> From: Zhuocheng Ding <zhuocheng.ding@intel.com>
>
> Support module level in i386 cpu topology structure "X86CPUTopoInfo".
>
> Before updating APIC ID parsing rule with module level, the
> apicid_core_width() temporarily combines the core and module levels
> together.
If we don't merge this one with the following patches, then the nits below
may be meaningful.
> At present, we don't expose the module level in CPUID.1FH because current
> Linux (v6.2-rc6) doesn't support the module level, and exposing the module and
> die levels at the same time in CPUID.1FH will cause Linux to calculate
> the wrong die_id. The module level should not be exposed until a real
> machine has the module level in CPUID.1FH.
>
> In addition, update topology structure in test-x86-apicid.c.
>
> Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
>   hw/i386/x86.c                |  3 ++-
>   include/hw/i386/topology.h   | 13 ++++++++---
>   target/i386/cpu.c            | 17 ++++++++------
>   tests/unit/test-x86-apicid.c | 45 +++++++++++++++++++-----------------
>   4 files changed, 46 insertions(+), 32 deletions(-)
>
> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> index ae1bb562d6e2..1c069ff56ae7 100644
> --- a/hw/i386/x86.c
> +++ b/hw/i386/x86.c
> @@ -71,7 +71,8 @@ inline void init_topo_info(X86CPUTopoInfo *topo_info,
>       MachineState *ms = MACHINE(x86ms);
>   
>       topo_info->dies_per_pkg = ms->smp.dies;
> -    topo_info->cores_per_die = ms->smp.cores;
> +    topo_info->modules_per_die = ms->smp.clusters;
> +    topo_info->cores_per_module = ms->smp.cores;
Here we can ensure that topo_info->modules_per_die is always 1, so...
>       topo_info->threads_per_core = ms->smp.threads;
>   }
>   
> diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> index 81573f6cfde0..bbb00dc4aad8 100644
> --- a/include/hw/i386/topology.h
> +++ b/include/hw/i386/topology.h
> @@ -54,7 +54,8 @@ typedef struct X86CPUTopoIDs {
>   
>   typedef struct X86CPUTopoInfo {
>       unsigned dies_per_pkg;
> -    unsigned cores_per_die;
> +    unsigned modules_per_die;
> +    unsigned cores_per_module;
>       unsigned threads_per_core;
>   } X86CPUTopoInfo;
>   
> @@ -78,7 +79,12 @@ static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
>    */
>   static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
>   {
> -    return apicid_bitwidth_for_count(topo_info->cores_per_die);
> +    /*
> +     * TODO: Will separate module info from core_width when update
> +     * APIC ID with module level.
> +     */
> +    return apicid_bitwidth_for_count(topo_info->cores_per_module *
> +                                     topo_info->modules_per_die);
>   }
...We can directly add apicid_module_width() (which returns 0 so far)
and apicid_module_offset() here, which don't rely on the APIC ID rule
change, and avoid the "TODO..".

Then patches 8 and 10 are about module_id, so they can be merged.
Is this good?

Thanks,
Yanan
>   /* Bit width of the Die_ID field */
> @@ -128,7 +134,8 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
>                                            X86CPUTopoIDs *topo_ids)
>   {
>       unsigned nr_dies = topo_info->dies_per_pkg;
> -    unsigned nr_cores = topo_info->cores_per_die;
> +    unsigned nr_cores = topo_info->cores_per_module *
> +                        topo_info->modules_per_die;
>       unsigned nr_threads = topo_info->threads_per_core;
>   
>       topo_ids->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 61ec9a7499b8..6f3d114c7d12 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -336,7 +336,9 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
>   
>       /* L3 is shared among multiple cores */
>       if (cache->level == 3) {
> -        l3_threads = topo_info->cores_per_die * topo_info->threads_per_core;
> +        l3_threads = topo_info->modules_per_die *
> +                     topo_info->cores_per_module *
> +                     topo_info->threads_per_core;
>           *eax |= (l3_threads - 1) << 14;
>       } else {
>           *eax |= ((topo_info->threads_per_core - 1) << 14);
> @@ -5218,11 +5220,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>       uint32_t cpus_per_pkg;
>   
>       topo_info.dies_per_pkg = env->nr_dies;
> -    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
> +    topo_info.modules_per_die = env->nr_modules;
> +    topo_info.cores_per_module = cs->nr_cores / env->nr_dies / env->nr_modules;
>       topo_info.threads_per_core = cs->nr_threads;
>   
> -    cpus_per_pkg = topo_info.dies_per_pkg * topo_info.cores_per_die *
> -                   topo_info.threads_per_core;
> +    cpus_per_pkg = topo_info.dies_per_pkg * topo_info.modules_per_die *
> +                   topo_info.cores_per_module * topo_info.threads_per_core;
>   
>       /* Calculate & apply limits for different index ranges */
>       if (index >= 0xC0000000) {
> @@ -5298,8 +5301,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>               if (*eax & 31) {
>                   int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
>                   int vcpus_per_socket = cpus_per_pkg;
> -                int cores_per_socket = topo_info.cores_per_die *
> -                                       topo_info.dies_per_pkg;
> +                int cores_per_socket = cpus_per_pkg /
> +                                       topo_info.threads_per_core;
>                   if (cores_per_socket > 1) {
>                       *eax &= ~0xFC000000;
>                       *eax |= (pow2ceil(cores_per_socket) - 1) << 26;
> @@ -5483,7 +5486,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>               break;
>           case 1:
>               *eax = apicid_die_offset(&topo_info);
> -            *ebx = topo_info.cores_per_die * topo_info.threads_per_core;
> +            *ebx = cpus_per_pkg / topo_info.dies_per_pkg;
>               *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
>               break;
>           case 2:
> diff --git a/tests/unit/test-x86-apicid.c b/tests/unit/test-x86-apicid.c
> index 2b104f86d7c2..f21b8a5d95c2 100644
> --- a/tests/unit/test-x86-apicid.c
> +++ b/tests/unit/test-x86-apicid.c
> @@ -30,13 +30,16 @@ static void test_topo_bits(void)
>   {
>       X86CPUTopoInfo topo_info = {0};
>   
> -    /* simple tests for 1 thread per core, 1 core per die, 1 die per package */
> -    topo_info = (X86CPUTopoInfo) {1, 1, 1};
> +    /*
> +     * simple tests for 1 thread per core, 1 core per module,
> +     *                  1 module per die, 1 die per package
> +     */
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 1};
>       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 0);
>       g_assert_cmpuint(apicid_core_width(&topo_info), ==, 0);
>       g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
>   
> -    topo_info = (X86CPUTopoInfo) {1, 1, 1};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 1};
>       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
>       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
>       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
> @@ -45,39 +48,39 @@ static void test_topo_bits(void)
>   
>       /* Test field width calculation for multiple values
>        */
> -    topo_info = (X86CPUTopoInfo) {1, 1, 2};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 2};
>       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 1);
> -    topo_info = (X86CPUTopoInfo) {1, 1, 3};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 3};
>       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
> -    topo_info = (X86CPUTopoInfo) {1, 1, 4};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 4};
>       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
>   
> -    topo_info = (X86CPUTopoInfo) {1, 1, 14};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 14};
>       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
> -    topo_info = (X86CPUTopoInfo) {1, 1, 15};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 15};
>       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
> -    topo_info = (X86CPUTopoInfo) {1, 1, 16};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 16};
>       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
> -    topo_info = (X86CPUTopoInfo) {1, 1, 17};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 17};
>       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 5);
>   
>   
> -    topo_info = (X86CPUTopoInfo) {1, 30, 2};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 30, 2};
>       g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
> -    topo_info = (X86CPUTopoInfo) {1, 31, 2};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 31, 2};
>       g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
> -    topo_info = (X86CPUTopoInfo) {1, 32, 2};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 32, 2};
>       g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
> -    topo_info = (X86CPUTopoInfo) {1, 33, 2};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 33, 2};
>       g_assert_cmpuint(apicid_core_width(&topo_info), ==, 6);
>   
> -    topo_info = (X86CPUTopoInfo) {1, 30, 2};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 30, 2};
>       g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
> -    topo_info = (X86CPUTopoInfo) {2, 30, 2};
> +    topo_info = (X86CPUTopoInfo) {2, 1, 30, 2};
>       g_assert_cmpuint(apicid_die_width(&topo_info), ==, 1);
> -    topo_info = (X86CPUTopoInfo) {3, 30, 2};
> +    topo_info = (X86CPUTopoInfo) {3, 1, 30, 2};
>       g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
> -    topo_info = (X86CPUTopoInfo) {4, 30, 2};
> +    topo_info = (X86CPUTopoInfo) {4, 1, 30, 2};
>       g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
>   
>       /* build a weird topology and see if IDs are calculated correctly
> @@ -85,18 +88,18 @@ static void test_topo_bits(void)
>   
>       /* This will use 2 bits for thread ID and 3 bits for core ID
>        */
> -    topo_info = (X86CPUTopoInfo) {1, 6, 3};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
>       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
>       g_assert_cmpuint(apicid_core_offset(&topo_info), ==, 2);
>       g_assert_cmpuint(apicid_die_offset(&topo_info), ==, 5);
>       g_assert_cmpuint(apicid_pkg_offset(&topo_info), ==, 5);
>   
> -    topo_info = (X86CPUTopoInfo) {1, 6, 3};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
>       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
>       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
>       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
>   
> -    topo_info = (X86CPUTopoInfo) {1, 6, 3};
> +    topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
>       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 0), ==,
>                        (1 << 2) | 0);
>       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 1), ==,
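The width arithmetic these unit tests exercise can be sketched standalone. The code below is an assumed reimplementation for illustration, mirroring the behavior of `apicid_bitwidth_for_count()` and the temporary combined core width from this patch, not the actual QEMU source:

```c
/*
 * A field wide enough to hold 'count' distinct IDs needs
 * ceil(log2(count)) bits (0 bits when count == 1).
 */
static unsigned apicid_bitwidth_for_count(unsigned count)
{
    unsigned width = 0;
    while ((1u << width) < count) {
        width++;
    }
    return width;
}

/*
 * Patch 7's temporary scheme: the module and core levels are folded
 * into a single core-width field until APIC ID parsing learns about
 * modules.
 */
static unsigned temp_core_width(unsigned modules_per_die,
                                unsigned cores_per_module)
{
    return apicid_bitwidth_for_count(modules_per_die * cores_per_module);
}
```

This matches the test expectations above: 14-16 threads need 4 bits, 17 need 5; 30-32 cores need 5 bits, 33 need 6.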




* Re: [PATCH RESEND 10/18] i386: Update APIC ID parsing rule to support module level
  2023-02-15 15:03     ` Zhao Liu
@ 2023-02-16  2:40       ` wangyanan (Y) via
  0 siblings, 0 replies; 61+ messages in thread
From: wangyanan (Y) via @ 2023-02-16  2:40 UTC (permalink / raw)
  To: Zhao Liu
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster

在 2023/2/15 23:03, Zhao Liu 写道:
> On Wed, Feb 15, 2023 at 07:06:32PM +0800, wangyanan (Y) wrote:
>> Date: Wed, 15 Feb 2023 19:06:32 +0800
>> From: "wangyanan (Y)" <wangyanan55@huawei.com>
>> Subject: Re: [PATCH RESEND 10/18] i386: Update APIC ID parsing rule to
>>   support module level
>>
>> Hi Zhao,
>>
>> 在 2023/2/13 17:36, Zhao Liu 写道:
>>> From: Zhuocheng Ding <zhuocheng.ding@intel.com>
>>>
>>> Add the module level parsing support for APIC ID.
>>>
>>> With this support, now the conversion between X86CPUTopoIDs,
>>> X86CPUTopoInfo and APIC ID is completed.
>> IIUC, the contents of patches 6-8 and 10 are all about "Introduce the module-level
>> CPU topology support for x86"; why do we need to do this gradually with various
>> temporary changes instead of wrapping them into one patch?
> Patch 6 is about CPUX86State.nr_dies, which is independent from
> patch 7, 8, 10.
Ok
>
> Patch 7 (X86CPUTopoInfo.modules_per_die), patch 8 (X86CPUTopoIDs.module_id),
> and patch 10 (APIC ID parsing rule) are related, but each has its own
> relatively clear theme, and together they gradually complete full
> support for the module level in the APIC ID.
>
> Patches 7, 8 and 10 can be combined into one big patch. The current
> split is actually intended to make review easier...
> But if you think it is not convenient for review, sorry for that,
> and I'm willing to merge them together. ;-)
So, given my comments on patch 7, I think merging 8 and 10 will be clean enough.

Thanks,
Yanan
>
> Thanks,
> Zhao
>
>> Before smp.clusters
>> is supported in the CLI for x86, we can ensure that modules_per_die is
>> always 1, so that the code is safe in one diff. Or am I missing something?
>>
>> Thanks,
>> Yanan
>>> Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
>>> Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
>>> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
>>> ---
>>>    hw/i386/x86.c              | 19 ++++++++-----------
>>>    include/hw/i386/topology.h | 36 ++++++++++++++++++------------------
>>>    2 files changed, 26 insertions(+), 29 deletions(-)
>>>
>>> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
>>> index b90c6584930a..2a9d080a8e7a 100644
>>> --- a/hw/i386/x86.c
>>> +++ b/hw/i386/x86.c
>>> @@ -311,11 +311,11 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
>>>        /*
>>>         * If APIC ID is not set,
>>> -     * set it based on socket/die/core/thread properties.
>>> +     * set it based on socket/die/cluster/core/thread properties.
>>>         */
>>>        if (cpu->apic_id == UNASSIGNED_APIC_ID) {
>>> -        int max_socket = (ms->smp.max_cpus - 1) /
>>> -                                smp_threads / smp_cores / ms->smp.dies;
>>> +        int max_socket = (ms->smp.max_cpus - 1) / smp_threads / smp_cores /
>>> +                                ms->smp.clusters / ms->smp.dies;
>>>            /*
>>>             * die-id was optional in QEMU 4.0 and older, so keep it optional
>>> @@ -379,15 +379,12 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
>>>            x86_topo_ids_from_apicid(cpu->apic_id, &topo_info, &topo_ids);
>>> -        /*
>>> -         * TODO: Before APIC ID supports module level parsing, there's no need
>>> -         * to expose module_id info.
>>> -         */
>>>            error_setg(errp,
>>> -            "Invalid CPU [socket: %u, die: %u, core: %u, thread: %u] with"
>>> -            " APIC ID %" PRIu32 ", valid index range 0:%d",
>>> -            topo_ids.pkg_id, topo_ids.die_id, topo_ids.core_id, topo_ids.smt_id,
>>> -            cpu->apic_id, ms->possible_cpus->len - 1);
>>> +            "Invalid CPU [socket: %u, die: %u, module: %u, core: %u, thread: %u]"
>>> +            " with APIC ID %" PRIu32 ", valid index range 0:%d",
>>> +            topo_ids.pkg_id, topo_ids.die_id, topo_ids.module_id,
>>> +            topo_ids.core_id, topo_ids.smt_id, cpu->apic_id,
>>> +            ms->possible_cpus->len - 1);
>>>            return;
>>>        }
>>> diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
>>> index 5de905dc00d3..3cec97b377f2 100644
>>> --- a/include/hw/i386/topology.h
>>> +++ b/include/hw/i386/topology.h
>>> @@ -79,12 +79,13 @@ static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
>>>    /* Bit width of the Core_ID field */
>>>    static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
>>>    {
>>> -    /*
>>> -     * TODO: Will separate module info from core_width when update
>>> -     * APIC ID with module level.
>>> -     */
>>> -    return apicid_bitwidth_for_count(topo_info->cores_per_module *
>>> -                                     topo_info->modules_per_die);
>>> +    return apicid_bitwidth_for_count(topo_info->cores_per_module);
>>> +}
>>> +
>>> +/* Bit width of the Module_ID (cluster ID) field */
>>> +static inline unsigned apicid_module_width(X86CPUTopoInfo *topo_info)
>>> +{
>>> +    return apicid_bitwidth_for_count(topo_info->modules_per_die);
>>>    }
>>>    /* Bit width of the Die_ID field */
>>> @@ -99,10 +100,16 @@ static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
>>>        return apicid_smt_width(topo_info);
>>>    }
>>> +/* Bit offset of the Module_ID (cluster ID) field */
>>> +static inline unsigned apicid_module_offset(X86CPUTopoInfo *topo_info)
>>> +{
>>> +    return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
>>> +}
>>> +
>>>    /* Bit offset of the Die_ID field */
>>>    static inline unsigned apicid_die_offset(X86CPUTopoInfo *topo_info)
>>>    {
>>> -    return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
>>> +    return apicid_module_offset(topo_info) + apicid_module_width(topo_info);
>>>    }
>>>    /* Bit offset of the Pkg_ID (socket ID) field */
>>> @@ -121,6 +128,7 @@ static inline apic_id_t x86_apicid_from_topo_ids(X86CPUTopoInfo *topo_info,
>>>    {
>>>        return (topo_ids->pkg_id  << apicid_pkg_offset(topo_info)) |
>>>               (topo_ids->die_id  << apicid_die_offset(topo_info)) |
>>> +           (topo_ids->module_id << apicid_module_offset(topo_info)) |
>>>               (topo_ids->core_id << apicid_core_offset(topo_info)) |
>>>               topo_ids->smt_id;
>>>    }
>>> @@ -138,11 +146,6 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
>>>        unsigned nr_cores = topo_info->cores_per_module;
>>>        unsigned nr_threads = topo_info->threads_per_core;
>>> -    /*
>>> -     * Currently smp for i386 doesn't support "clusters", modules_per_die is
>>> -     * only 1. Therefore, the module_id generated from the module topology will
>>> -     * not conflict with the module_id generated according to the apicid.
>>> -     */
>>>        topo_ids->pkg_id = cpu_index / (nr_dies * nr_modules *
>>>                           nr_cores * nr_threads);
>>>        topo_ids->die_id = cpu_index / (nr_modules * nr_cores *
>>> @@ -166,12 +169,9 @@ static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
>>>        topo_ids->core_id =
>>>                (apicid >> apicid_core_offset(topo_info)) &
>>>                ~(0xFFFFFFFFUL << apicid_core_width(topo_info));
>>> -    /*
>>> -     * TODO: This is the temporary initialization for topo_ids.module_id to
>>> -     * avoid "maybe-uninitialized" compilation errors. Will remove when APIC
>>> -     * ID supports module level parsing.
>>> -     */
>>> -    topo_ids->module_id = 0;
>>> +    topo_ids->module_id =
>>> +            (apicid >> apicid_module_offset(topo_info)) &
>>> +            ~(0xFFFFFFFFUL << apicid_module_width(topo_info));
>>>        topo_ids->die_id =
>>>                (apicid >> apicid_die_offset(topo_info)) &
>>>                ~(0xFFFFFFFFUL << apicid_die_width(topo_info));
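The composition and extraction rules in this hunk can be sketched in a standalone form. The field layout is assumed to match the patch; the struct and helper names below are invented for illustration:

```c
#include <stdint.h>

/*
 * Invented illustration of the layout after this patch:
 * APIC ID = [pkg | die | module | core | smt], where each field's
 * offset is the sum of the widths of the levels below it.
 */
typedef struct {
    unsigned smt_w, core_w, module_w, die_w;
} ApicIdWidths;

static uint32_t compose_apicid(const ApicIdWidths *w, uint32_t pkg_id,
                               uint32_t die_id, uint32_t module_id,
                               uint32_t core_id, uint32_t smt_id)
{
    unsigned core_off = w->smt_w;
    unsigned module_off = core_off + w->core_w;
    unsigned die_off = module_off + w->module_w;
    unsigned pkg_off = die_off + w->die_w;

    return (pkg_id << pkg_off) | (die_id << die_off) |
           (module_id << module_off) | (core_id << core_off) | smt_id;
}

/* Mirrors the mask idiom in the patch: keep 'width' bits at 'offset'. */
static uint32_t apicid_field(uint32_t apicid, unsigned offset, unsigned width)
{
    return (apicid >> offset) & ~(0xFFFFFFFFUL << width);
}
```

Note that a module width of 0 (smp.clusters=1) makes the module field vanish entirely, so existing APIC ID values are unchanged — the compatibility point discussed in this thread.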




* Re: [PATCH RESEND 07/18] i386: Support modules_per_die in X86CPUTopoInfo
  2023-02-16  2:34   ` wangyanan (Y) via
@ 2023-02-16  4:33     ` Zhao Liu
  0 siblings, 0 replies; 61+ messages in thread
From: Zhao Liu @ 2023-02-16  4:33 UTC (permalink / raw)
  To: wangyanan (Y)
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster

On Thu, Feb 16, 2023 at 10:34:24AM +0800, wangyanan (Y) wrote:
> Date: Thu, 16 Feb 2023 10:34:24 +0800
> From: "wangyanan (Y)" <wangyanan55@huawei.com>
> Subject: Re: [PATCH RESEND 07/18] i386: Support modules_per_die in
>  X86CPUTopoInfo
> 
> Hi Zhao,
> 
> 在 2023/2/13 17:36, Zhao Liu 写道:
> > From: Zhuocheng Ding <zhuocheng.ding@intel.com>
> > 
> > Support module level in i386 cpu topology structure "X86CPUTopoInfo".
> > 
> > Before updating APIC ID parsing rule with module level, the
> > apicid_core_width() temporarily combines the core and module levels
> > together.
> If we don't merge this one with the following patches, then the nits below
> may be meaningful.
> > At present, we don't expose the module level in CPUID.1FH because current
> > Linux (v6.2-rc6) doesn't support the module level, and exposing the module and
> > die levels at the same time in CPUID.1FH will cause Linux to calculate
> > the wrong die_id. The module level should not be exposed until a real
> > machine has the module level in CPUID.1FH.
> > 
> > In addition, update topology structure in test-x86-apicid.c.
> > 
> > Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> > Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > ---
> >   hw/i386/x86.c                |  3 ++-
> >   include/hw/i386/topology.h   | 13 ++++++++---
> >   target/i386/cpu.c            | 17 ++++++++------
> >   tests/unit/test-x86-apicid.c | 45 +++++++++++++++++++-----------------
> >   4 files changed, 46 insertions(+), 32 deletions(-)
> > 
> > diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> > index ae1bb562d6e2..1c069ff56ae7 100644
> > --- a/hw/i386/x86.c
> > +++ b/hw/i386/x86.c
> > @@ -71,7 +71,8 @@ inline void init_topo_info(X86CPUTopoInfo *topo_info,
> >       MachineState *ms = MACHINE(x86ms);
> >       topo_info->dies_per_pkg = ms->smp.dies;
> > -    topo_info->cores_per_die = ms->smp.cores;
> > +    topo_info->modules_per_die = ms->smp.clusters;
> > +    topo_info->cores_per_module = ms->smp.cores;
> Here we can ensure that topo_info->modules_per_die is always 1, so...
> >       topo_info->threads_per_core = ms->smp.threads;
> >   }
> > diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> > index 81573f6cfde0..bbb00dc4aad8 100644
> > --- a/include/hw/i386/topology.h
> > +++ b/include/hw/i386/topology.h
> > @@ -54,7 +54,8 @@ typedef struct X86CPUTopoIDs {
> >   typedef struct X86CPUTopoInfo {
> >       unsigned dies_per_pkg;
> > -    unsigned cores_per_die;
> > +    unsigned modules_per_die;
> > +    unsigned cores_per_module;
> >       unsigned threads_per_core;
> >   } X86CPUTopoInfo;
> > @@ -78,7 +79,12 @@ static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
> >    */
> >   static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
> >   {
> > -    return apicid_bitwidth_for_count(topo_info->cores_per_die);
> > +    /*
> > +     * TODO: Will separate module info from core_width when update
> > +     * APIC ID with module level.
> > +     */
> > +    return apicid_bitwidth_for_count(topo_info->cores_per_module *
> > +                                     topo_info->modules_per_die);
> >   }
> ...We can directly add apicid_module_width (which returns 0 so far)
> and apicid_module_offset here, which don't rely on the APIC ID rule
> change, and avoid the "TODO".
> 
> Then patch 8 and 10 are about module_id, so can be merged.
> Is this good?

Thanks Yanan, this makes sense. I'll do it.

Zhao

> 
> Thanks,
> Yanan
> >   /* Bit width of the Die_ID field */
> > @@ -128,7 +134,8 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
> >                                            X86CPUTopoIDs *topo_ids)
> >   {
> >       unsigned nr_dies = topo_info->dies_per_pkg;
> > -    unsigned nr_cores = topo_info->cores_per_die;
> > +    unsigned nr_cores = topo_info->cores_per_module *
> > +                        topo_info->modules_per_die;
> >       unsigned nr_threads = topo_info->threads_per_core;
> >       topo_ids->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 61ec9a7499b8..6f3d114c7d12 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -336,7 +336,9 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
> >       /* L3 is shared among multiple cores */
> >       if (cache->level == 3) {
> > -        l3_threads = topo_info->cores_per_die * topo_info->threads_per_core;
> > +        l3_threads = topo_info->modules_per_die *
> > +                     topo_info->cores_per_module *
> > +                     topo_info->threads_per_core;
> >           *eax |= (l3_threads - 1) << 14;
> >       } else {
> >           *eax |= ((topo_info->threads_per_core - 1) << 14);
> > @@ -5218,11 +5220,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >       uint32_t cpus_per_pkg;
> >       topo_info.dies_per_pkg = env->nr_dies;
> > -    topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
> > +    topo_info.modules_per_die = env->nr_modules;
> > +    topo_info.cores_per_module = cs->nr_cores / env->nr_dies / env->nr_modules;
> >       topo_info.threads_per_core = cs->nr_threads;
> > -    cpus_per_pkg = topo_info.dies_per_pkg * topo_info.cores_per_die *
> > -                   topo_info.threads_per_core;
> > +    cpus_per_pkg = topo_info.dies_per_pkg * topo_info.modules_per_die *
> > +                   topo_info.cores_per_module * topo_info.threads_per_core;
> >       /* Calculate & apply limits for different index ranges */
> >       if (index >= 0xC0000000) {
> > @@ -5298,8 +5301,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >               if (*eax & 31) {
> >                   int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
> >                   int vcpus_per_socket = cpus_per_pkg;
> > -                int cores_per_socket = topo_info.cores_per_die *
> > -                                       topo_info.dies_per_pkg;
> > +                int cores_per_socket = cpus_per_pkg /
> > +                                       topo_info.threads_per_core;
> >                   if (cores_per_socket > 1) {
> >                       *eax &= ~0xFC000000;
> >                       *eax |= (pow2ceil(cores_per_socket) - 1) << 26;
> > @@ -5483,7 +5486,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >               break;
> >           case 1:
> >               *eax = apicid_die_offset(&topo_info);
> > -            *ebx = topo_info.cores_per_die * topo_info.threads_per_core;
> > +            *ebx = cpus_per_pkg / topo_info.dies_per_pkg;
> >               *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
> >               break;
> >           case 2:
> > diff --git a/tests/unit/test-x86-apicid.c b/tests/unit/test-x86-apicid.c
> > index 2b104f86d7c2..f21b8a5d95c2 100644
> > --- a/tests/unit/test-x86-apicid.c
> > +++ b/tests/unit/test-x86-apicid.c
> > @@ -30,13 +30,16 @@ static void test_topo_bits(void)
> >   {
> >       X86CPUTopoInfo topo_info = {0};
> > -    /* simple tests for 1 thread per core, 1 core per die, 1 die per package */
> > -    topo_info = (X86CPUTopoInfo) {1, 1, 1};
> > +    /*
> > +     * simple tests for 1 thread per core, 1 core per module,
> > +     *                  1 module per die, 1 die per package
> > +     */
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 1};
> >       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 0);
> >       g_assert_cmpuint(apicid_core_width(&topo_info), ==, 0);
> >       g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
> > -    topo_info = (X86CPUTopoInfo) {1, 1, 1};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 1};
> >       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
> >       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
> >       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
> > @@ -45,39 +48,39 @@ static void test_topo_bits(void)
> >       /* Test field width calculation for multiple values
> >        */
> > -    topo_info = (X86CPUTopoInfo) {1, 1, 2};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 2};
> >       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 1);
> > -    topo_info = (X86CPUTopoInfo) {1, 1, 3};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 3};
> >       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
> > -    topo_info = (X86CPUTopoInfo) {1, 1, 4};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 4};
> >       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
> > -    topo_info = (X86CPUTopoInfo) {1, 1, 14};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 14};
> >       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
> > -    topo_info = (X86CPUTopoInfo) {1, 1, 15};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 15};
> >       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
> > -    topo_info = (X86CPUTopoInfo) {1, 1, 16};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 16};
> >       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 4);
> > -    topo_info = (X86CPUTopoInfo) {1, 1, 17};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 1, 17};
> >       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 5);
> > -    topo_info = (X86CPUTopoInfo) {1, 30, 2};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 30, 2};
> >       g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
> > -    topo_info = (X86CPUTopoInfo) {1, 31, 2};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 31, 2};
> >       g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
> > -    topo_info = (X86CPUTopoInfo) {1, 32, 2};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 32, 2};
> >       g_assert_cmpuint(apicid_core_width(&topo_info), ==, 5);
> > -    topo_info = (X86CPUTopoInfo) {1, 33, 2};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 33, 2};
> >       g_assert_cmpuint(apicid_core_width(&topo_info), ==, 6);
> > -    topo_info = (X86CPUTopoInfo) {1, 30, 2};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 30, 2};
> >       g_assert_cmpuint(apicid_die_width(&topo_info), ==, 0);
> > -    topo_info = (X86CPUTopoInfo) {2, 30, 2};
> > +    topo_info = (X86CPUTopoInfo) {2, 1, 30, 2};
> >       g_assert_cmpuint(apicid_die_width(&topo_info), ==, 1);
> > -    topo_info = (X86CPUTopoInfo) {3, 30, 2};
> > +    topo_info = (X86CPUTopoInfo) {3, 1, 30, 2};
> >       g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
> > -    topo_info = (X86CPUTopoInfo) {4, 30, 2};
> > +    topo_info = (X86CPUTopoInfo) {4, 1, 30, 2};
> >       g_assert_cmpuint(apicid_die_width(&topo_info), ==, 2);
> >       /* build a weird topology and see if IDs are calculated correctly
> > @@ -85,18 +88,18 @@ static void test_topo_bits(void)
> >       /* This will use 2 bits for thread ID and 3 bits for core ID
> >        */
> > -    topo_info = (X86CPUTopoInfo) {1, 6, 3};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
> >       g_assert_cmpuint(apicid_smt_width(&topo_info), ==, 2);
> >       g_assert_cmpuint(apicid_core_offset(&topo_info), ==, 2);
> >       g_assert_cmpuint(apicid_die_offset(&topo_info), ==, 5);
> >       g_assert_cmpuint(apicid_pkg_offset(&topo_info), ==, 5);
> > -    topo_info = (X86CPUTopoInfo) {1, 6, 3};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
> >       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 0), ==, 0);
> >       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1), ==, 1);
> >       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 2), ==, 2);
> > -    topo_info = (X86CPUTopoInfo) {1, 6, 3};
> > +    topo_info = (X86CPUTopoInfo) {1, 1, 6, 3};
> >       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 0), ==,
> >                        (1 << 2) | 0);
> >       g_assert_cmpuint(x86_apicid_from_cpu_idx(&topo_info, 1 * 3 + 1), ==,
> 


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RESEND 18/18] i386: Add new property to control L2 cache topo in CPUID.04H
  2023-02-13  9:36 ` [PATCH RESEND 18/18] i386: Add new property to control L2 cache topo in CPUID.04H Zhao Liu
@ 2023-02-16 13:14   ` wangyanan (Y) via
  2023-02-17  3:35     ` Zhao Liu
  0 siblings, 1 reply; 61+ messages in thread
From: wangyanan (Y) via @ 2023-02-16 13:14 UTC (permalink / raw)
  To: Zhao Liu
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster

在 2023/2/13 17:36, Zhao Liu 写道:
> From: Zhao Liu <zhao1.liu@intel.com>
>
> The property x-l2-cache-topo will be used to change the L2 cache
> topology in CPUID.04H.
>
> Now it allows user to set the L2 cache is shared in core level or
> cluster level.
>
> If user passes "-cpu x-l2-cache-topo=[core|cluster]" then older L2 cache
> topology will be overrided by the new topology setting.
Currently x-l2-cache-topo only defines the share level *globally*.
I'm thinking about how we can make the property more powerful, so that
it can specify which CPUs share L2 at the core level and which CPUs
share L2 at the cluster level.

What would Intel's hybrid CPUs do? Determine whether the L2 share level
is core or cluster according to the CPU core type (Atom or Core)?
ARM, by contrast, has no core-type concept but does have systems where
L2 is shared at different levels by different CPUs.

Thanks,
Yanan
> Here we expose to user "cluster" instead of "module", to be consistent
> with "cluster-id" naming.
>
> Since CPUID.04H is used by intel CPUs, this property is available on
> intel CPUs as for now.
>
> When necessary, it can be extended to CPUID.8000001DH for amd CPUs.
>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
>   target/i386/cpu.c | 33 ++++++++++++++++++++++++++++++++-
>   target/i386/cpu.h |  2 ++
>   2 files changed, 34 insertions(+), 1 deletion(-)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 5816dc99b1d4..cf84c720a431 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -240,12 +240,15 @@ static uint32_t max_processor_ids_for_cache(CPUCacheInfo *cache,
>       case CORE:
>           num_ids = 1 << apicid_core_offset(topo_info);
>           break;
> +    case MODULE:
> +        num_ids = 1 << apicid_module_offset(topo_info);
> +        break;
>       case DIE:
>           num_ids = 1 << apicid_die_offset(topo_info);
>           break;
>       default:
>           /*
> -         * Currently there is no use case for SMT, MODULE and PACKAGE, so use
> +         * Currently there is no use case for SMT and PACKAGE, so use
>            * assert directly to facilitate debugging.
>            */
>           g_assert_not_reached();
> @@ -6633,6 +6636,33 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
>           env->cache_info_amd.l3_cache = &legacy_l3_cache;
>       }
>   
> +    if (cpu->l2_cache_topo_level) {
> +        /*
> +         * FIXME: Currently only supports changing CPUID[4] (for intel), and
> +         * will support changing CPUID[0x8000001D] when necessary.
> +         */
> +        if (!IS_INTEL_CPU(env)) {
> +            error_setg(errp, "only intel cpus supports x-l2-cache-topo");
> +            return;
> +        }
> +
> +        if (!strcmp(cpu->l2_cache_topo_level, "core")) {
> +            env->cache_info_cpuid4.l2_cache->share_level = CORE;
> +        } else if (!strcmp(cpu->l2_cache_topo_level, "cluster")) {
> +            /*
> +             * We expose to users "cluster" instead of "module", to be
> +             * consistent with "cluster-id" naming.
> +             */
> +            env->cache_info_cpuid4.l2_cache->share_level = MODULE;
> +        } else {
> +            error_setg(errp,
> +                       "x-l2-cache-topo doesn't support '%s', "
> +                       "and it only supports 'core' or 'cluster'",
> +                       cpu->l2_cache_topo_level);
> +            return;
> +        }
> +    }
> +
>   #ifndef CONFIG_USER_ONLY
>       MachineState *ms = MACHINE(qdev_get_machine());
>       qemu_register_reset(x86_cpu_machine_reset_cb, cpu);
> @@ -7135,6 +7165,7 @@ static Property x86_cpu_properties[] = {
>                        false),
>       DEFINE_PROP_BOOL("x-intel-pt-auto-level", X86CPU, intel_pt_auto_level,
>                        true),
> +    DEFINE_PROP_STRING("x-l2-cache-topo", X86CPU, l2_cache_topo_level),
>       DEFINE_PROP_END_OF_LIST()
>   };
>   
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index 5a955431f759..aa7e96c586c7 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1987,6 +1987,8 @@ struct ArchCPU {
>       int32_t thread_id;
>   
>       int32_t hv_max_vps;
> +
> +    char *l2_cache_topo_level;
>   };
>   
>   



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RESEND 18/18] i386: Add new property to control L2 cache topo in CPUID.04H
  2023-02-16 13:14   ` wangyanan (Y) via
@ 2023-02-17  3:35     ` Zhao Liu
  2023-02-17  3:45       ` wangyanan (Y) via
  2023-02-17  4:07       ` wangyanan (Y) via
  0 siblings, 2 replies; 61+ messages in thread
From: Zhao Liu @ 2023-02-17  3:35 UTC (permalink / raw)
  To: wangyanan (Y)
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster

On Thu, Feb 16, 2023 at 09:14:54PM +0800, wangyanan (Y) wrote:
> Date: Thu, 16 Feb 2023 21:14:54 +0800
> From: "wangyanan (Y)" <wangyanan55@huawei.com>
> Subject: Re: [PATCH RESEND 18/18] i386: Add new property to control L2
>  cache topo in CPUID.04H
> 
> 在 2023/2/13 17:36, Zhao Liu 写道:
> > From: Zhao Liu <zhao1.liu@intel.com>
> > 
> > The property x-l2-cache-topo will be used to change the L2 cache
> > topology in CPUID.04H.
> > 
> > Now it allows user to set the L2 cache is shared in core level or
> > cluster level.
> > 
> > If user passes "-cpu x-l2-cache-topo=[core|cluster]" then older L2 cache
> > topology will be overrided by the new topology setting.
> Currently x-l2-cache-topo only defines the share level *globally*.

Yes, it will be set for all CPUs.

> I'm thinking how we can make the property more powerful so that it
> can specify which CPUs share l2 on core level and which CPUs share
> l2 on cluster level.
> 
> What would Intel's Hybrid CPUs do? Determine the l2 share level
> is core or cluster according to the CPU core type (Atom or Core)?
> While ARM does not have the core type concept but have CPUs
> that l2 is shared on different levels in the same system.

For example, Alder Lake's "core" type has one L2 per core, and every
4 "atom" cores share one L2. For this case, we can set the topology as:

cluster0 has 1 "core" and cluster1 has 4 "atom"s, and L2 is shared at
the cluster level.

Since cluster0 has only one "core"-type core, L2 per "core" still works.

Not sure if this idea can be applied to ARM?

> 
> Thanks,
> Yanan
> > Here we expose to user "cluster" instead of "module", to be consistent
> > with "cluster-id" naming.
> > 
> > Since CPUID.04H is used by intel CPUs, this property is available on
> > intel CPUs as for now.
> > 
> > When necessary, it can be extended to CPUID.8000001DH for amd CPUs.
> > 
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > ---
> >   target/i386/cpu.c | 33 ++++++++++++++++++++++++++++++++-
> >   target/i386/cpu.h |  2 ++
> >   2 files changed, 34 insertions(+), 1 deletion(-)
> > 
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 5816dc99b1d4..cf84c720a431 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -240,12 +240,15 @@ static uint32_t max_processor_ids_for_cache(CPUCacheInfo *cache,
> >       case CORE:
> >           num_ids = 1 << apicid_core_offset(topo_info);
> >           break;
> > +    case MODULE:
> > +        num_ids = 1 << apicid_module_offset(topo_info);
> > +        break;
> >       case DIE:
> >           num_ids = 1 << apicid_die_offset(topo_info);
> >           break;
> >       default:
> >           /*
> > -         * Currently there is no use case for SMT, MODULE and PACKAGE, so use
> > +         * Currently there is no use case for SMT and PACKAGE, so use
> >            * assert directly to facilitate debugging.
> >            */
> >           g_assert_not_reached();
> > @@ -6633,6 +6636,33 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
> >           env->cache_info_amd.l3_cache = &legacy_l3_cache;
> >       }
> > +    if (cpu->l2_cache_topo_level) {
> > +        /*
> > +         * FIXME: Currently only supports changing CPUID[4] (for intel), and
> > +         * will support changing CPUID[0x8000001D] when necessary.
> > +         */
> > +        if (!IS_INTEL_CPU(env)) {
> > +            error_setg(errp, "only intel cpus supports x-l2-cache-topo");
> > +            return;
> > +        }
> > +
> > +        if (!strcmp(cpu->l2_cache_topo_level, "core")) {
> > +            env->cache_info_cpuid4.l2_cache->share_level = CORE;
> > +        } else if (!strcmp(cpu->l2_cache_topo_level, "cluster")) {
> > +            /*
> > +             * We expose to users "cluster" instead of "module", to be
> > +             * consistent with "cluster-id" naming.
> > +             */
> > +            env->cache_info_cpuid4.l2_cache->share_level = MODULE;
> > +        } else {
> > +            error_setg(errp,
> > +                       "x-l2-cache-topo doesn't support '%s', "
> > +                       "and it only supports 'core' or 'cluster'",
> > +                       cpu->l2_cache_topo_level);
> > +            return;
> > +        }
> > +    }
> > +
> >   #ifndef CONFIG_USER_ONLY
> >       MachineState *ms = MACHINE(qdev_get_machine());
> >       qemu_register_reset(x86_cpu_machine_reset_cb, cpu);
> > @@ -7135,6 +7165,7 @@ static Property x86_cpu_properties[] = {
> >                        false),
> >       DEFINE_PROP_BOOL("x-intel-pt-auto-level", X86CPU, intel_pt_auto_level,
> >                        true),
> > +    DEFINE_PROP_STRING("x-l2-cache-topo", X86CPU, l2_cache_topo_level),
> >       DEFINE_PROP_END_OF_LIST()
> >   };
> > diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> > index 5a955431f759..aa7e96c586c7 100644
> > --- a/target/i386/cpu.h
> > +++ b/target/i386/cpu.h
> > @@ -1987,6 +1987,8 @@ struct ArchCPU {
> >       int32_t thread_id;
> >       int32_t hv_max_vps;
> > +
> > +    char *l2_cache_topo_level;
> >   };
> 


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RESEND 18/18] i386: Add new property to control L2 cache topo in CPUID.04H
  2023-02-17  3:35     ` Zhao Liu
@ 2023-02-17  3:45       ` wangyanan (Y) via
  2023-02-20  2:54         ` Zhao Liu
  2023-02-17  4:07       ` wangyanan (Y) via
  1 sibling, 1 reply; 61+ messages in thread
From: wangyanan (Y) via @ 2023-02-17  3:45 UTC (permalink / raw)
  To: Zhao Liu
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster

在 2023/2/17 11:35, Zhao Liu 写道:
> On Thu, Feb 16, 2023 at 09:14:54PM +0800, wangyanan (Y) wrote:
>> Date: Thu, 16 Feb 2023 21:14:54 +0800
>> From: "wangyanan (Y)" <wangyanan55@huawei.com>
>> Subject: Re: [PATCH RESEND 18/18] i386: Add new property to control L2
>>   cache topo in CPUID.04H
>>
>> 在 2023/2/13 17:36, Zhao Liu 写道:
>>> From: Zhao Liu <zhao1.liu@intel.com>
>>>
>>> The property x-l2-cache-topo will be used to change the L2 cache
>>> topology in CPUID.04H.
>>>
>>> Now it allows user to set the L2 cache is shared in core level or
>>> cluster level.
>>>
>>> If user passes "-cpu x-l2-cache-topo=[core|cluster]" then older L2 cache
>>> topology will be overrided by the new topology setting.
>> Currently x-l2-cache-topo only defines the share level *globally*.
> Yes, will set for all CPUs.
>
>> I'm thinking how we can make the property more powerful so that it
>> can specify which CPUs share l2 on core level and which CPUs share
>> l2 on cluster level.
>>
>> What would Intel's Hybrid CPUs do? Determine the l2 share level
>> is core or cluster according to the CPU core type (Atom or Core)?
>> While ARM does not have the core type concept but have CPUs
>> that l2 is shared on different levels in the same system.
> For example, Alderlake's "core" shares 1 L2 per core and every 4 "atom"s
> share 1 L2. For this case, we can set the topology as:
>
> cluster0 has 1 "core" and cluster1 has 4 "atom". Then set L2 shared on
> cluster level.
>
> Since cluster0 has only 1 "core" type core, then L2 per "core" works.
This imposes a restriction on users: cluster0 must have exactly *1*
core-type core. What if we set 2 vCPU cores in cluster0 and 4 vCPU
cores in cluster1, and bind the cores in cluster0 to 2 core-type pCores
and the cores in cluster1 to 4 atom-type pCores? I think this is a
necessary use case too.
> Not sure if this idea can be applied to arm?
>
>> Thanks,
>> Yanan
>>> Here we expose to user "cluster" instead of "module", to be consistent
>>> with "cluster-id" naming.
>>>
>>> Since CPUID.04H is used by intel CPUs, this property is available on
>>> intel CPUs as for now.
>>>
>>> When necessary, it can be extended to CPUID.8000001DH for amd CPUs.
>>>
>>> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
>>> ---
>>>    target/i386/cpu.c | 33 ++++++++++++++++++++++++++++++++-
>>>    target/i386/cpu.h |  2 ++
>>>    2 files changed, 34 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>> index 5816dc99b1d4..cf84c720a431 100644
>>> --- a/target/i386/cpu.c
>>> +++ b/target/i386/cpu.c
>>> @@ -240,12 +240,15 @@ static uint32_t max_processor_ids_for_cache(CPUCacheInfo *cache,
>>>        case CORE:
>>>            num_ids = 1 << apicid_core_offset(topo_info);
>>>            break;
>>> +    case MODULE:
>>> +        num_ids = 1 << apicid_module_offset(topo_info);
>>> +        break;
>>>        case DIE:
>>>            num_ids = 1 << apicid_die_offset(topo_info);
>>>            break;
>>>        default:
>>>            /*
>>> -         * Currently there is no use case for SMT, MODULE and PACKAGE, so use
>>> +         * Currently there is no use case for SMT and PACKAGE, so use
>>>             * assert directly to facilitate debugging.
>>>             */
>>>            g_assert_not_reached();
>>> @@ -6633,6 +6636,33 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
>>>            env->cache_info_amd.l3_cache = &legacy_l3_cache;
>>>        }
>>> +    if (cpu->l2_cache_topo_level) {
>>> +        /*
>>> +         * FIXME: Currently only supports changing CPUID[4] (for intel), and
>>> +         * will support changing CPUID[0x8000001D] when necessary.
>>> +         */
>>> +        if (!IS_INTEL_CPU(env)) {
>>> +            error_setg(errp, "only intel cpus supports x-l2-cache-topo");
>>> +            return;
>>> +        }
>>> +
>>> +        if (!strcmp(cpu->l2_cache_topo_level, "core")) {
>>> +            env->cache_info_cpuid4.l2_cache->share_level = CORE;
>>> +        } else if (!strcmp(cpu->l2_cache_topo_level, "cluster")) {
>>> +            /*
>>> +             * We expose to users "cluster" instead of "module", to be
>>> +             * consistent with "cluster-id" naming.
>>> +             */
>>> +            env->cache_info_cpuid4.l2_cache->share_level = MODULE;
>>> +        } else {
>>> +            error_setg(errp,
>>> +                       "x-l2-cache-topo doesn't support '%s', "
>>> +                       "and it only supports 'core' or 'cluster'",
>>> +                       cpu->l2_cache_topo_level);
>>> +            return;
>>> +        }
>>> +    }
>>> +
>>>    #ifndef CONFIG_USER_ONLY
>>>        MachineState *ms = MACHINE(qdev_get_machine());
>>>        qemu_register_reset(x86_cpu_machine_reset_cb, cpu);
>>> @@ -7135,6 +7165,7 @@ static Property x86_cpu_properties[] = {
>>>                         false),
>>>        DEFINE_PROP_BOOL("x-intel-pt-auto-level", X86CPU, intel_pt_auto_level,
>>>                         true),
>>> +    DEFINE_PROP_STRING("x-l2-cache-topo", X86CPU, l2_cache_topo_level),
>>>        DEFINE_PROP_END_OF_LIST()
>>>    };
>>> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
>>> index 5a955431f759..aa7e96c586c7 100644
>>> --- a/target/i386/cpu.h
>>> +++ b/target/i386/cpu.h
>>> @@ -1987,6 +1987,8 @@ struct ArchCPU {
>>>        int32_t thread_id;
>>>        int32_t hv_max_vps;
>>> +
>>> +    char *l2_cache_topo_level;
>>>    };



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RESEND 18/18] i386: Add new property to control L2 cache topo in CPUID.04H
  2023-02-17  3:35     ` Zhao Liu
  2023-02-17  3:45       ` wangyanan (Y) via
@ 2023-02-17  4:07       ` wangyanan (Y) via
  2023-02-17  7:26         ` Zhao Liu
  1 sibling, 1 reply; 61+ messages in thread
From: wangyanan (Y) via @ 2023-02-17  4:07 UTC (permalink / raw)
  To: Zhao Liu
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster

在 2023/2/17 11:35, Zhao Liu 写道:
> On Thu, Feb 16, 2023 at 09:14:54PM +0800, wangyanan (Y) wrote:
>> Date: Thu, 16 Feb 2023 21:14:54 +0800
>> From: "wangyanan (Y)" <wangyanan55@huawei.com>
>> Subject: Re: [PATCH RESEND 18/18] i386: Add new property to control L2
>>   cache topo in CPUID.04H
>>
>> 在 2023/2/13 17:36, Zhao Liu 写道:
>>> From: Zhao Liu <zhao1.liu@intel.com>
>>>
>>> The property x-l2-cache-topo will be used to change the L2 cache
>>> topology in CPUID.04H.
>>>
>>> Now it allows user to set the L2 cache is shared in core level or
>>> cluster level.
>>>
>>> If user passes "-cpu x-l2-cache-topo=[core|cluster]" then older L2 cache
>>> topology will be overrided by the new topology setting.
>> Currently x-l2-cache-topo only defines the share level *globally*.
> Yes, will set for all CPUs.
>
>> I'm thinking how we can make the property more powerful so that it
>> can specify which CPUs share l2 on core level and which CPUs share
>> l2 on cluster level.
>>
>> What would Intel's Hybrid CPUs do? Determine the l2 share level
>> is core or cluster according to the CPU core type (Atom or Core)?
>> While ARM does not have the core type concept but have CPUs
>> that l2 is shared on different levels in the same system.
> For example, Alderlake's "core" shares 1 L2 per core and every 4 "atom"s
> share 1 L2. For this case, we can set the topology as:
>
> cluster0 has 1 "core" and cluster1 has 4 "atom". Then set L2 shared on
> cluster level.
>
> Since cluster0 has only 1 "core" type core, then L2 per "core" works.
>
> Not sure if this idea can be applied to arm?
For a CPU topology with 2 clusters in total, where the 2 cores in
cluster0 each have their own L1/L2 cache with 2 threads per core, and
the 4 cores in cluster1 share one L2 cache with 1 thread per core, the
global way does not work well.

What about defining something general, which looks like the -numa config:
-cache-topo cache=l2,share_level="core",cpus='0-3'
-cache-topo cache=l2,share_level="cluster",cpus='4-7'
If we ever want to support a custom share level for L3/L1, no extra work
is needed. We can also extend the CLI to support custom cache sizes, etc.

If you think this is a good idea to explore, I can work on it, since I'm
planning to add cache topology support for ARM.

Thanks,
Yanan
>> Thanks,
>> Yanan
>>> Here we expose to user "cluster" instead of "module", to be consistent
>>> with "cluster-id" naming.
>>>
>>> Since CPUID.04H is used by intel CPUs, this property is available on
>>> intel CPUs as for now.
>>>
>>> When necessary, it can be extended to CPUID.8000001DH for amd CPUs.
>>>
>>> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
>>> ---
>>>    target/i386/cpu.c | 33 ++++++++++++++++++++++++++++++++-
>>>    target/i386/cpu.h |  2 ++
>>>    2 files changed, 34 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>> index 5816dc99b1d4..cf84c720a431 100644
>>> --- a/target/i386/cpu.c
>>> +++ b/target/i386/cpu.c
>>> @@ -240,12 +240,15 @@ static uint32_t max_processor_ids_for_cache(CPUCacheInfo *cache,
>>>        case CORE:
>>>            num_ids = 1 << apicid_core_offset(topo_info);
>>>            break;
>>> +    case MODULE:
>>> +        num_ids = 1 << apicid_module_offset(topo_info);
>>> +        break;
>>>        case DIE:
>>>            num_ids = 1 << apicid_die_offset(topo_info);
>>>            break;
>>>        default:
>>>            /*
>>> -         * Currently there is no use case for SMT, MODULE and PACKAGE, so use
>>> +         * Currently there is no use case for SMT and PACKAGE, so use
>>>             * assert directly to facilitate debugging.
>>>             */
>>>            g_assert_not_reached();
>>> @@ -6633,6 +6636,33 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
>>>            env->cache_info_amd.l3_cache = &legacy_l3_cache;
>>>        }
>>> +    if (cpu->l2_cache_topo_level) {
>>> +        /*
>>> +         * FIXME: Currently only supports changing CPUID[4] (for intel), and
>>> +         * will support changing CPUID[0x8000001D] when necessary.
>>> +         */
>>> +        if (!IS_INTEL_CPU(env)) {
>>> +            error_setg(errp, "only intel cpus supports x-l2-cache-topo");
>>> +            return;
>>> +        }
>>> +
>>> +        if (!strcmp(cpu->l2_cache_topo_level, "core")) {
>>> +            env->cache_info_cpuid4.l2_cache->share_level = CORE;
>>> +        } else if (!strcmp(cpu->l2_cache_topo_level, "cluster")) {
>>> +            /*
>>> +             * We expose to users "cluster" instead of "module", to be
>>> +             * consistent with "cluster-id" naming.
>>> +             */
>>> +            env->cache_info_cpuid4.l2_cache->share_level = MODULE;
>>> +        } else {
>>> +            error_setg(errp,
>>> +                       "x-l2-cache-topo doesn't support '%s', "
>>> +                       "and it only supports 'core' or 'cluster'",
>>> +                       cpu->l2_cache_topo_level);
>>> +            return;
>>> +        }
>>> +    }
>>> +
>>>    #ifndef CONFIG_USER_ONLY
>>>        MachineState *ms = MACHINE(qdev_get_machine());
>>>        qemu_register_reset(x86_cpu_machine_reset_cb, cpu);
>>> @@ -7135,6 +7165,7 @@ static Property x86_cpu_properties[] = {
>>>                         false),
>>>        DEFINE_PROP_BOOL("x-intel-pt-auto-level", X86CPU, intel_pt_auto_level,
>>>                         true),
>>> +    DEFINE_PROP_STRING("x-l2-cache-topo", X86CPU, l2_cache_topo_level),
>>>        DEFINE_PROP_END_OF_LIST()
>>>    };
>>> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
>>> index 5a955431f759..aa7e96c586c7 100644
>>> --- a/target/i386/cpu.h
>>> +++ b/target/i386/cpu.h
>>> @@ -1987,6 +1987,8 @@ struct ArchCPU {
>>>        int32_t thread_id;
>>>        int32_t hv_max_vps;
>>> +
>>> +    char *l2_cache_topo_level;
>>>    };



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RESEND 18/18] i386: Add new property to control L2 cache topo in CPUID.04H
  2023-02-17  4:07       ` wangyanan (Y) via
@ 2023-02-17  7:26         ` Zhao Liu
  2023-02-17  9:08           ` wangyanan (Y) via
  0 siblings, 1 reply; 61+ messages in thread
From: Zhao Liu @ 2023-02-17  7:26 UTC (permalink / raw)
  To: wangyanan (Y), Daniel P . Berrangé
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster

On Fri, Feb 17, 2023 at 12:07:01PM +0800, wangyanan (Y) wrote:
> Date: Fri, 17 Feb 2023 12:07:01 +0800
> From: "wangyanan (Y)" <wangyanan55@huawei.com>
> Subject: Re: [PATCH RESEND 18/18] i386: Add new property to control L2
>  cache topo in CPUID.04H
> 
> > On 2023/2/17 11:35, Zhao Liu wrote:
> > On Thu, Feb 16, 2023 at 09:14:54PM +0800, wangyanan (Y) wrote:
> > > Date: Thu, 16 Feb 2023 21:14:54 +0800
> > > From: "wangyanan (Y)" <wangyanan55@huawei.com>
> > > Subject: Re: [PATCH RESEND 18/18] i386: Add new property to control L2
> > >   cache topo in CPUID.04H
> > > 
> > > On 2023/2/13 17:36, Zhao Liu wrote:
> > > > From: Zhao Liu <zhao1.liu@intel.com>
> > > > 
> > > > The property x-l2-cache-topo will be used to change the L2 cache
> > > > topology in CPUID.04H.
> > > > 
> > > > Now it allows user to set the L2 cache is shared in core level or
> > > > cluster level.
> > > > 
> > > > If user passes "-cpu x-l2-cache-topo=[core|cluster]" then older L2 cache
> > > > topology will be overrided by the new topology setting.
> > > Currently x-l2-cache-topo only defines the share level *globally*.
> > Yes, will set for all CPUs.
> > 
> > > I'm thinking how we can make the property more powerful so that it
> > > can specify which CPUs share l2 on core level and which CPUs share
> > > l2 on cluster level.
> > > 
> > > What would Intel's Hybrid CPUs do? Determine the l2 share level
> > > is core or cluster according to the CPU core type (Atom or Core)?
> > > While ARM does not have the core type concept but have CPUs
> > > that l2 is shared on different levels in the same system.
> > For example, Alderlake's "core" shares 1 L2 per core and every 4 "atom"s
> > share 1 L2. For this case, we can set the topology as:
> > 
> > cluster0 has 1 "core" and cluster1 has 4 "atom". Then set L2 shared on
> > cluster level.
> > 
> > Since cluster0 has only 1 "core" type core, then L2 per "core" works.
> > 
> > Not sure if this idea can be applied to arm?
> For a CPU topopoly where we have 2 clusters totally, 2 cores in cluster0
> have their own L1/L2 cache and 2 threads in each core, 4 cores in cluster1
> share one L2 cache and 1 thread in each core. The global way does not
> work well.
> 
> What about defining something general, which looks like -numa config:
> -cache-topo cache=l2, share_level="core", cpus='0-3'
> -cache-topo cache=l2, share_level="cluster", cpus='4-7'

Hi Yanan, here it may be necessary to check, against the specific cpu
topology, whether the cpu indexes set in "cpus" are reasonable.

For example, core0 has 2 CPUs: cpu0 and cpu1, and core1 has 2 CPUs: cpu2
and cpu3, then set l2 as:

-cache-topo cache=l2, share_level="core", cpus='0-2'
-cache-topo cache=l2, share_level="core", cpus='3'

Whether this command is legal depends on the meaning we give to the
parameter "cpu":
1. If "cpu" means all cpus share the cache set in this command, then
this command should fail since cpu2 and cpu3 are in a core.

2. If "cpu" means the affected cpus, then this command should find the
cores they belong to according to the cpu topology, and set L2 for those
cores. This command may return success.

What about removing share_level and asking "cpu" to mean all the sharing
cpus, to avoid checking the cpu topology?

Then the above example should be:

-cache-topo cache=l2, cpus='0-1'
-cache-topo cache=l2, cpus='2-3'

This decouples cpu topology and cache topology completely and is very
simple. In this way, determining the cache by specifying the sharing cpus
is similar to what x86 CPUID.04H does.

But the price of simplicity is we may build a cache topology that doesn't
match the reality.

But if the cache topology must be set based on the cpu topology, another
way is to consider specifying the cache when setting up the topology
structure, which can be based on @Daniel's format [1]:

  -object cpu-socket,id=sock0,cache=l3
  -object cpu-die,id=die0,parent=sock0
  -object cpu-cluster,id=cluster0,parent=die0
  -object cpu-cluster,id=cluster1,parent=die0,cache=l2
  -object x86-cpu-model-core,id=cpu0,parent=cluster0,threads=2,cache=l1i,l1d,l2
  -object x86-cpu-model-atom,id=cpu1,parent=cluster1,cache=l1i,l1d
  -object x86-cpu-model-atom,id=cpu2,parent=cluster1,cache=l1i,l1d

Then from this command, cpu0 has an l2 of its own, and cpu1 and cpu2 share
an l2 (the l2 is attached to cluster1).

This whole process is like designing or building a CPU: the user decides
where to insert the caches. The advantage is that it is intuitive and
easier to sanity-check. But it is complicated.

(Also CC @Daniel for comments).

[1]: https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg03320.html

Thanks,
Zhao

> If we ever want to support custom share-level for L3/L1, no extra work
> is needed. We can also extend the CLI to support custom cache size, etc..
> 
> If you thinks this a good idea to explore, I can work on it, since I'm
> planing to add support cache topology for ARM.
> 
> Thanks,
> Yanan
> > > Thanks,
> > > Yanan
> > > > Here we expose to user "cluster" instead of "module", to be consistent
> > > > with "cluster-id" naming.
> > > > 
> > > > Since CPUID.04H is used by intel CPUs, this property is available on
> > > > intel CPUs as for now.
> > > > 
> > > > When necessary, it can be extended to CPUID.8000001DH for amd CPUs.
> > > > 
> > > > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > > > ---
> > > >    target/i386/cpu.c | 33 ++++++++++++++++++++++++++++++++-
> > > >    target/i386/cpu.h |  2 ++
> > > >    2 files changed, 34 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > > > index 5816dc99b1d4..cf84c720a431 100644
> > > > --- a/target/i386/cpu.c
> > > > +++ b/target/i386/cpu.c
> > > > @@ -240,12 +240,15 @@ static uint32_t max_processor_ids_for_cache(CPUCacheInfo *cache,
> > > >        case CORE:
> > > >            num_ids = 1 << apicid_core_offset(topo_info);
> > > >            break;
> > > > +    case MODULE:
> > > > +        num_ids = 1 << apicid_module_offset(topo_info);
> > > > +        break;
> > > >        case DIE:
> > > >            num_ids = 1 << apicid_die_offset(topo_info);
> > > >            break;
> > > >        default:
> > > >            /*
> > > > -         * Currently there is no use case for SMT, MODULE and PACKAGE, so use
> > > > +         * Currently there is no use case for SMT and PACKAGE, so use
> > > >             * assert directly to facilitate debugging.
> > > >             */
> > > >            g_assert_not_reached();
> > > > @@ -6633,6 +6636,33 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
> > > >            env->cache_info_amd.l3_cache = &legacy_l3_cache;
> > > >        }
> > > > +    if (cpu->l2_cache_topo_level) {
> > > > +        /*
> > > > +         * FIXME: Currently only supports changing CPUID[4] (for intel), and
> > > > +         * will support changing CPUID[0x8000001D] when necessary.
> > > > +         */
> > > > +        if (!IS_INTEL_CPU(env)) {
> > > > +            error_setg(errp, "only intel cpus supports x-l2-cache-topo");
> > > > +            return;
> > > > +        }
> > > > +
> > > > +        if (!strcmp(cpu->l2_cache_topo_level, "core")) {
> > > > +            env->cache_info_cpuid4.l2_cache->share_level = CORE;
> > > > +        } else if (!strcmp(cpu->l2_cache_topo_level, "cluster")) {
> > > > +            /*
> > > > +             * We expose to users "cluster" instead of "module", to be
> > > > +             * consistent with "cluster-id" naming.
> > > > +             */
> > > > +            env->cache_info_cpuid4.l2_cache->share_level = MODULE;
> > > > +        } else {
> > > > +            error_setg(errp,
> > > > +                       "x-l2-cache-topo doesn't support '%s', "
> > > > +                       "and it only supports 'core' or 'cluster'",
> > > > +                       cpu->l2_cache_topo_level);
> > > > +            return;
> > > > +        }
> > > > +    }
> > > > +
> > > >    #ifndef CONFIG_USER_ONLY
> > > >        MachineState *ms = MACHINE(qdev_get_machine());
> > > >        qemu_register_reset(x86_cpu_machine_reset_cb, cpu);
> > > > @@ -7135,6 +7165,7 @@ static Property x86_cpu_properties[] = {
> > > >                         false),
> > > >        DEFINE_PROP_BOOL("x-intel-pt-auto-level", X86CPU, intel_pt_auto_level,
> > > >                         true),
> > > > +    DEFINE_PROP_STRING("x-l2-cache-topo", X86CPU, l2_cache_topo_level),
> > > >        DEFINE_PROP_END_OF_LIST()
> > > >    };
> > > > diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> > > > index 5a955431f759..aa7e96c586c7 100644
> > > > --- a/target/i386/cpu.h
> > > > +++ b/target/i386/cpu.h
> > > > @@ -1987,6 +1987,8 @@ struct ArchCPU {
> > > >        int32_t thread_id;
> > > >        int32_t hv_max_vps;
> > > > +
> > > > +    char *l2_cache_topo_level;
> > > >    };
> 


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RESEND 18/18] i386: Add new property to control L2 cache topo in CPUID.04H
  2023-02-17  7:26         ` Zhao Liu
@ 2023-02-17  9:08           ` wangyanan (Y) via
  2023-02-20  2:49             ` Zhao Liu
  0 siblings, 1 reply; 61+ messages in thread
From: wangyanan (Y) via @ 2023-02-17  9:08 UTC (permalink / raw)
  To: Zhao Liu, Daniel P. Berrangé
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster

On 2023/2/17 15:26, Zhao Liu wrote:
> On Fri, Feb 17, 2023 at 12:07:01PM +0800, wangyanan (Y) wrote:
>> Date: Fri, 17 Feb 2023 12:07:01 +0800
>> From: "wangyanan (Y)" <wangyanan55@huawei.com>
>> Subject: Re: [PATCH RESEND 18/18] i386: Add new property to control L2
>>   cache topo in CPUID.04H
>>
>> On 2023/2/17 11:35, Zhao Liu wrote:
>>> On Thu, Feb 16, 2023 at 09:14:54PM +0800, wangyanan (Y) wrote:
>>>> Date: Thu, 16 Feb 2023 21:14:54 +0800
>>>> From: "wangyanan (Y)" <wangyanan55@huawei.com>
>>>> Subject: Re: [PATCH RESEND 18/18] i386: Add new property to control L2
>>>>    cache topo in CPUID.04H
>>>>
>>>> On 2023/2/13 17:36, Zhao Liu wrote:
>>>>> From: Zhao Liu <zhao1.liu@intel.com>
>>>>>
>>>>> The property x-l2-cache-topo will be used to change the L2 cache
>>>>> topology in CPUID.04H.
>>>>>
>>>>> Now it allows user to set the L2 cache is shared in core level or
>>>>> cluster level.
>>>>>
>>>>> If user passes "-cpu x-l2-cache-topo=[core|cluster]" then older L2 cache
>>>>> topology will be overrided by the new topology setting.
>>>> Currently x-l2-cache-topo only defines the share level *globally*.
>>> Yes, will set for all CPUs.
>>>
>>>> I'm thinking how we can make the property more powerful so that it
>>>> can specify which CPUs share l2 on core level and which CPUs share
>>>> l2 on cluster level.
>>>>
>>>> What would Intel's Hybrid CPUs do? Determine the l2 share level
>>>> is core or cluster according to the CPU core type (Atom or Core)?
>>>> While ARM does not have the core type concept but have CPUs
>>>> that l2 is shared on different levels in the same system.
>>> For example, Alderlake's "core" shares 1 L2 per core and every 4 "atom"s
>>> share 1 L2. For this case, we can set the topology as:
>>>
>>> cluster0 has 1 "core" and cluster1 has 4 "atom". Then set L2 shared on
>>> cluster level.
>>>
>>> Since cluster0 has only 1 "core" type core, then L2 per "core" works.
>>>
>>> Not sure if this idea can be applied to arm?
>> For a CPU topopoly where we have 2 clusters totally, 2 cores in cluster0
>> have their own L1/L2 cache and 2 threads in each core, 4 cores in cluster1
>> share one L2 cache and 1 thread in each core. The global way does not
>> work well.
>>
>> What about defining something general, which looks like -numa config:
>> -cache-topo cache=l2, share_level="core", cpus='0-3'
>> -cache-topo cache=l2, share_level="cluster", cpus='4-7'
> Hi Yanan, here it may be necessary to check whether the cpu index set
> in "cpus" is reasonable through the specific cpu topology.
Yes, the validity of the cache topo configs from the users should be
checked in machine_parse_cache_topo (if we end up having this func).
It's not a big deal; we always need the validity checks.

In summary:
1. There can not be the same cpus in different "cpus" lists.
2. A combination of all the "cpus" lists should *just* cover all the CPUs
in the machine.
3. Most importantly, cpus in the same cluster must be set with the
same cache "share_level" (core or cluster), and cpus in the same core
must also be set with the same cache "share_level".
> For example, core0 has 2 CPUs: cpu0 and cpu1, and core1 has 2 CPUs: cpu2
> and cpu3, then set l2 as:
>
> -cache-topo cache=l2, share_level="core", cpus='0-2'
> -cache-topo cache=l2, share_level="core", cpus='3'
>
> Whether this command is legal depends on the meaning we give to the
> parameter "cpu":
s/cpus/cpu.
It means all the affected CPUs, e.g., the second case.
> 1. If "cpu" means all cpus share the cache set in this command, then
> this command should fail since cpu2 and cpu3 are in a core.
>
> 2. If "cpu" means the affected cpus, then this command should find the
> cores they belong to according to the cpu topology, and set L2 for those
> cores. This command may return success.
>
> What about removing share_level and ask "cpu" to mean all the sharing
> cpus to avoid checking the cpu topology?
>
> Then the above example should be:
>
> -cache-topo cache=l2, cpus='0-1'
> -cache-topo cache=l2, cpus='2-3'
Sorry, I don't understand how we will know the cache share_level of
cpus '0-1' or '2-3'. What will the CLIs be like if we change the
below CLIs by removing the "share_level" params?

-cache-topo cache=l2, share_level="core", cpus='0-3'
-cache-topo cache=l2, share_level="cluster", cpus='4-7'
> This decouples cpu topology and cache topology completely and very
> simple. In this way, determining the cache by specifying the shared cpu
> is similar to that in x86 CPUID.04H.
>
> But the price of simplicity is we may build a cache topology that doesn't
> match the reality.
>
> But if the cache topology must be set based on the cpu topology, another
> way is consider specifying the cache when setting the topology
> structure, which can be based on @Daniel's format [1]:
>
>    -object cpu-socket,id=sock0,cache=l3
>    -object cpu-die,id=die0,parent=sock0
>    -object cpu-cluster,id=cluster0,parent=die0
>    -object cpu-cluster,id=cluster1,parent=die0,cache=l2
>    -object x86-cpu-model-core,id=cpu0,parent=cluster0,threads=2,cache=l1i,lid,l2
>    -object x86-cpu-model-atom,id=cpu1,parent=cluster1,cache=l1i,lid
>    -object x86-cpu-model-atom,id=cpu2,parent=cluster1,cache=l1i,l1d
>
> Then from this command, cpu0 has a l2, and cpu1 and cpu2 shares a l2
> (the l2 is inserted in cluster1).
>
> This whole process is like when designing or building a CPU, the user
> decides where to insert the caches. The advantage is that it is easier
> to verify the rationality and is intuitive. But complicated.
Yeah, this is also a way.
Most of the concern is that it will not be easy/readable to extend the
cache configs, for example if we ever want to support custom cache size,
cache type or other cache properties in the future. And yes, it will also
complicate the -objects by making them huge.

I think keeping cache and cpu configs decoupled will leave simplicity
to the users, just like we keep the numa resource config decoupled from
cpu topology currently.

On the other hand, -cache-topo is not just for hybrid, it's also for
current smp, which makes it inappropriate to bind -cache-topo
to the hybrid case. For example, "-cache-topo name=l2, share_level=cluster"
will indicate that l2 cache is shared on cluster level globally. And this
is what "x-l2-cache-topo" in this patch does.

Thanks,
Yanan
> (Also CC @Daniel for comments).
>
> [1]: https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg03320.html
>
> Thanks,
> Zhao
>
>> If we ever want to support custom share-level for L3/L1, no extra work
>> is needed. We can also extend the CLI to support custom cache size, etc..
>>
>> If you thinks this a good idea to explore, I can work on it, since I'm
>> planing to add support cache topology for ARM.
>>
>> Thanks,
>> Yanan
>>>> Thanks,
>>>> Yanan
>>>>> Here we expose to user "cluster" instead of "module", to be consistent
>>>>> with "cluster-id" naming.
>>>>>
>>>>> Since CPUID.04H is used by intel CPUs, this property is available on
>>>>> intel CPUs as for now.
>>>>>
>>>>> When necessary, it can be extended to CPUID.8000001DH for amd CPUs.
>>>>>
>>>>> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
>>>>> ---
>>>>>     target/i386/cpu.c | 33 ++++++++++++++++++++++++++++++++-
>>>>>     target/i386/cpu.h |  2 ++
>>>>>     2 files changed, 34 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>>>> index 5816dc99b1d4..cf84c720a431 100644
>>>>> --- a/target/i386/cpu.c
>>>>> +++ b/target/i386/cpu.c
>>>>> @@ -240,12 +240,15 @@ static uint32_t max_processor_ids_for_cache(CPUCacheInfo *cache,
>>>>>         case CORE:
>>>>>             num_ids = 1 << apicid_core_offset(topo_info);
>>>>>             break;
>>>>> +    case MODULE:
>>>>> +        num_ids = 1 << apicid_module_offset(topo_info);
>>>>> +        break;
>>>>>         case DIE:
>>>>>             num_ids = 1 << apicid_die_offset(topo_info);
>>>>>             break;
>>>>>         default:
>>>>>             /*
>>>>> -         * Currently there is no use case for SMT, MODULE and PACKAGE, so use
>>>>> +         * Currently there is no use case for SMT and PACKAGE, so use
>>>>>              * assert directly to facilitate debugging.
>>>>>              */
>>>>>             g_assert_not_reached();
>>>>> @@ -6633,6 +6636,33 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
>>>>>             env->cache_info_amd.l3_cache = &legacy_l3_cache;
>>>>>         }
>>>>> +    if (cpu->l2_cache_topo_level) {
>>>>> +        /*
>>>>> +         * FIXME: Currently only supports changing CPUID[4] (for intel), and
>>>>> +         * will support changing CPUID[0x8000001D] when necessary.
>>>>> +         */
>>>>> +        if (!IS_INTEL_CPU(env)) {
>>>>> +            error_setg(errp, "only intel cpus supports x-l2-cache-topo");
>>>>> +            return;
>>>>> +        }
>>>>> +
>>>>> +        if (!strcmp(cpu->l2_cache_topo_level, "core")) {
>>>>> +            env->cache_info_cpuid4.l2_cache->share_level = CORE;
>>>>> +        } else if (!strcmp(cpu->l2_cache_topo_level, "cluster")) {
>>>>> +            /*
>>>>> +             * We expose to users "cluster" instead of "module", to be
>>>>> +             * consistent with "cluster-id" naming.
>>>>> +             */
>>>>> +            env->cache_info_cpuid4.l2_cache->share_level = MODULE;
>>>>> +        } else {
>>>>> +            error_setg(errp,
>>>>> +                       "x-l2-cache-topo doesn't support '%s', "
>>>>> +                       "and it only supports 'core' or 'cluster'",
>>>>> +                       cpu->l2_cache_topo_level);
>>>>> +            return;
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>>     #ifndef CONFIG_USER_ONLY
>>>>>         MachineState *ms = MACHINE(qdev_get_machine());
>>>>>         qemu_register_reset(x86_cpu_machine_reset_cb, cpu);
>>>>> @@ -7135,6 +7165,7 @@ static Property x86_cpu_properties[] = {
>>>>>                          false),
>>>>>         DEFINE_PROP_BOOL("x-intel-pt-auto-level", X86CPU, intel_pt_auto_level,
>>>>>                          true),
>>>>> +    DEFINE_PROP_STRING("x-l2-cache-topo", X86CPU, l2_cache_topo_level),
>>>>>         DEFINE_PROP_END_OF_LIST()
>>>>>     };
>>>>> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
>>>>> index 5a955431f759..aa7e96c586c7 100644
>>>>> --- a/target/i386/cpu.h
>>>>> +++ b/target/i386/cpu.h
>>>>> @@ -1987,6 +1987,8 @@ struct ArchCPU {
>>>>>         int32_t thread_id;
>>>>>         int32_t hv_max_vps;
>>>>> +
>>>>> +    char *l2_cache_topo_level;
>>>>>     };



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RESEND 18/18] i386: Add new property to control L2 cache topo in CPUID.04H
  2023-02-17  9:08           ` wangyanan (Y) via
@ 2023-02-20  2:49             ` Zhao Liu
  2023-02-20  3:52               ` wangyanan (Y) via
  0 siblings, 1 reply; 61+ messages in thread
From: Zhao Liu @ 2023-02-20  2:49 UTC (permalink / raw)
  To: wangyanan (Y)
  Cc: Daniel P. Berrangé,
	qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster

On Fri, Feb 17, 2023 at 05:08:31PM +0800, wangyanan (Y) wrote:
> Date: Fri, 17 Feb 2023 17:08:31 +0800
> From: "wangyanan (Y)" <wangyanan55@huawei.com>
> Subject: Re: [PATCH RESEND 18/18] i386: Add new property to control L2
>  cache topo in CPUID.04H
> 
> On 2023/2/17 15:26, Zhao Liu wrote:
> > On Fri, Feb 17, 2023 at 12:07:01PM +0800, wangyanan (Y) wrote:
> > > Date: Fri, 17 Feb 2023 12:07:01 +0800
> > > From: "wangyanan (Y)" <wangyanan55@huawei.com>
> > > Subject: Re: [PATCH RESEND 18/18] i386: Add new property to control L2
> > >   cache topo in CPUID.04H
> > > 
> > > On 2023/2/17 11:35, Zhao Liu wrote:
> > > > On Thu, Feb 16, 2023 at 09:14:54PM +0800, wangyanan (Y) wrote:
> > > > > Date: Thu, 16 Feb 2023 21:14:54 +0800
> > > > > From: "wangyanan (Y)" <wangyanan55@huawei.com>
> > > > > Subject: Re: [PATCH RESEND 18/18] i386: Add new property to control L2
> > > > >    cache topo in CPUID.04H
> > > > > 
> > > > > On 2023/2/13 17:36, Zhao Liu wrote:
> > > > > > From: Zhao Liu <zhao1.liu@intel.com>
> > > > > > 
> > > > > > The property x-l2-cache-topo will be used to change the L2 cache
> > > > > > topology in CPUID.04H.
> > > > > > 
> > > > > > Now it allows user to set the L2 cache is shared in core level or
> > > > > > cluster level.
> > > > > > 
> > > > > > If user passes "-cpu x-l2-cache-topo=[core|cluster]" then older L2 cache
> > > > > > topology will be overrided by the new topology setting.
> > > > > Currently x-l2-cache-topo only defines the share level *globally*.
> > > > Yes, will set for all CPUs.
> > > > 
> > > > > I'm thinking how we can make the property more powerful so that it
> > > > > can specify which CPUs share l2 on core level and which CPUs share
> > > > > l2 on cluster level.
> > > > > 
> > > > > What would Intel's Hybrid CPUs do? Determine the l2 share level
> > > > > is core or cluster according to the CPU core type (Atom or Core)?
> > > > > While ARM does not have the core type concept but have CPUs
> > > > > that l2 is shared on different levels in the same system.
> > > > For example, Alderlake's "core" shares 1 L2 per core and every 4 "atom"s
> > > > share 1 L2. For this case, we can set the topology as:
> > > > 
> > > > cluster0 has 1 "core" and cluster1 has 4 "atom". Then set L2 shared on
> > > > cluster level.
> > > > 
> > > > Since cluster0 has only 1 "core" type core, then L2 per "core" works.
> > > > 
> > > > Not sure if this idea can be applied to arm?
> > > For a CPU topopoly where we have 2 clusters totally, 2 cores in cluster0
> > > have their own L1/L2 cache and 2 threads in each core, 4 cores in cluster1
> > > share one L2 cache and 1 thread in each core. The global way does not
> > > work well.
> > > 
> > > What about defining something general, which looks like -numa config:
> > > -cache-topo cache=l2, share_level="core", cpus='0-3'
> > > -cache-topo cache=l2, share_level="cluster", cpus='4-7'
> > Hi Yanan, here it may be necessary to check whether the cpu index set
> > in "cpus" is reasonable through the specific cpu topology.
> Yes, the validity of the cache topo configs from the users should be
> check in machine_parse_cache_topo ( if we will have this func).
> It's not a big deal, we always need the validity checks.

I guess that verification needs to build up the full cpu topology, as
done in another hybrid RFC. So, should the cpu-topology.h related
patches in that RFC be split out and sent first?

> 
> In summary:
> 1、There can not be the same cpus in different "cpus" list.
> 2、A combination of all the "cpus" list should *just* cover all the CPUs
> in the machine
> 3、Most importantly, cpus in the same cluster must be set with the
> same cache "share_level" (core or cluster) and cpus in the same core
> must also be set with the same cache "share_level".

Got it, thx.

> > For example, core0 has 2 CPUs: cpu0 and cpu1, and core1 has 2 CPUs: cpu2
> > and cpu3, then set l2 as:
> > 
> > -cache-topo cache=l2, share_level="core", cpus='0-2'
> > -cache-topo cache=l2, share_level="core", cpus='3'
> > 
> > Whether this command is legal depends on the meaning we give to the
> > parameter "cpu":
> s/cpus/cpu.
> It means all the afftected CPUs, e.g, the second case.
> > 1. If "cpu" means all cpus share the cache set in this command, then
> > this command should fail since cpu2 and cpu3 are in a core.
> > 
> > 2. If "cpu" means the affected cpus, then this command should find the
> > cores they belong to according to the cpu topology, and set L2 for those
> > cores. This command may return success.
> > 
> > What about removing share_level and ask "cpu" to mean all the sharing
> > cpus to avoid checking the cpu topology?
> > 
> > Then the above example should be:
> > 
> > -cache-topo cache=l2, cpus='0-1'
> > -cache-topo cache=l2, cpus='2-3'
> Sorry, I dont understand how we will know the cache share_level of
> cpus '0-1' or '2-3'. What will the CLIs will be like if we change the
> belows CLIs by removing the "share_level" params.
> 
> -cache-topo cache=l2, share_level="core", cpus='0-3'
> -cache-topo cache=l2, share_level="cluster", cpus='4-7'
> > This decouples cpu topology and cache topology completely and very
> > simple. In this way, determining the cache by specifying the shared cpu
> > is similar to that in x86 CPUID.04H.
> > 
> > But the price of simplicity is we may build a cache topology that doesn't
> > match the reality.
> > 
> > But if the cache topology must be set based on the cpu topology, another
> > way is consider specifying the cache when setting the topology
> > structure, which can be based on @Daniel's format [1]:
> > 
> >    -object cpu-socket,id=sock0,cache=l3
> >    -object cpu-die,id=die0,parent=sock0
> >    -object cpu-cluster,id=cluster0,parent=die0
> >    -object cpu-cluster,id=cluster1,parent=die0,cache=l2
> >    -object x86-cpu-model-core,id=cpu0,parent=cluster0,threads=2,cache=l1i,lid,l2
> >    -object x86-cpu-model-atom,id=cpu1,parent=cluster1,cache=l1i,lid
> >    -object x86-cpu-model-atom,id=cpu2,parent=cluster1,cache=l1i,l1d
> > 
> > Then from this command, cpu0 has a l2, and cpu1 and cpu2 shares a l2
> > (the l2 is inserted in cluster1).
> > 
> > This whole process is like when designing or building a CPU, the user
> > decides where to insert the caches. The advantage is that it is easier
> > to verify the rationality and is intuitive. But complicated.
> Yeah, this is also a way.
> Most of the concern is that it will not be easy/readable to extand the
> cache configs, for example if we ever want to support custom cache size,
> cache type or other cache properties in the future. And yes, will also
> complex the -objects by making them huge.
> 
> I think keeping cache and cpu configs decouped will leave simplicity
> to the users, just like we keep numa resources config from cpu
> topology currently.
> 
> On the other hand, -cache-topo is not just for hybrid, it's also for
> current smp, which make it inappropriate to bind -cache-topo
> with hybrid case. For example, "-cache-topo name=l2, share_level=cluster"
> will indicates that l2 cache is shared on cluster level globally. And this
> is
> "x-l2-cache-topo" in this patch does.
> 
> Thanks,
> Yanan
> > (Also CC @Daniel for comments).
> > 
> > [1]: https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg03320.html
> > 
> > Thanks,
> > Zhao
> > 
> > > If we ever want to support custom share-level for L3/L1, no extra work
> > > is needed. We can also extend the CLI to support custom cache size, etc..
> > > 
> > > If you thinks this a good idea to explore, I can work on it, since I'm
> > > planing to add support cache topology for ARM.
> > > 
> > > Thanks,
> > > Yanan
> > > > > Thanks,
> > > > > Yanan
> > > > > > Here we expose to user "cluster" instead of "module", to be consistent
> > > > > > with "cluster-id" naming.
> > > > > > 
> > > > > > Since CPUID.04H is used by intel CPUs, this property is available on
> > > > > > intel CPUs as for now.
> > > > > > 
> > > > > > When necessary, it can be extended to CPUID.8000001DH for amd CPUs.
> > > > > > 
> > > > > > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > > > > > ---
> > > > > >     target/i386/cpu.c | 33 ++++++++++++++++++++++++++++++++-
> > > > > >     target/i386/cpu.h |  2 ++
> > > > > >     2 files changed, 34 insertions(+), 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > > > > > index 5816dc99b1d4..cf84c720a431 100644
> > > > > > --- a/target/i386/cpu.c
> > > > > > +++ b/target/i386/cpu.c
> > > > > > @@ -240,12 +240,15 @@ static uint32_t max_processor_ids_for_cache(CPUCacheInfo *cache,
> > > > > >         case CORE:
> > > > > >             num_ids = 1 << apicid_core_offset(topo_info);
> > > > > >             break;
> > > > > > +    case MODULE:
> > > > > > +        num_ids = 1 << apicid_module_offset(topo_info);
> > > > > > +        break;
> > > > > >         case DIE:
> > > > > >             num_ids = 1 << apicid_die_offset(topo_info);
> > > > > >             break;
> > > > > >         default:
> > > > > >             /*
> > > > > > -         * Currently there is no use case for SMT, MODULE and PACKAGE, so use
> > > > > > +         * Currently there is no use case for SMT and PACKAGE, so use
> > > > > >              * assert directly to facilitate debugging.
> > > > > >              */
> > > > > >             g_assert_not_reached();
> > > > > > @@ -6633,6 +6636,33 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
> > > > > >             env->cache_info_amd.l3_cache = &legacy_l3_cache;
> > > > > >         }
> > > > > > +    if (cpu->l2_cache_topo_level) {
> > > > > > +        /*
> > > > > > +         * FIXME: Currently only supports changing CPUID[4] (for intel), and
> > > > > > +         * will support changing CPUID[0x8000001D] when necessary.
> > > > > > +         */
> > > > > > +        if (!IS_INTEL_CPU(env)) {
> > > > > > +            error_setg(errp, "only intel cpus supports x-l2-cache-topo");
> > > > > > +            return;
> > > > > > +        }
> > > > > > +
> > > > > > +        if (!strcmp(cpu->l2_cache_topo_level, "core")) {
> > > > > > +            env->cache_info_cpuid4.l2_cache->share_level = CORE;
> > > > > > +        } else if (!strcmp(cpu->l2_cache_topo_level, "cluster")) {
> > > > > > +            /*
> > > > > > +             * We expose to users "cluster" instead of "module", to be
> > > > > > +             * consistent with "cluster-id" naming.
> > > > > > +             */
> > > > > > +            env->cache_info_cpuid4.l2_cache->share_level = MODULE;
> > > > > > +        } else {
> > > > > > +            error_setg(errp,
> > > > > > +                       "x-l2-cache-topo doesn't support '%s', "
> > > > > > +                       "and it only supports 'core' or 'cluster'",
> > > > > > +                       cpu->l2_cache_topo_level);
> > > > > > +            return;
> > > > > > +        }
> > > > > > +    }
> > > > > > +
> > > > > >     #ifndef CONFIG_USER_ONLY
> > > > > >         MachineState *ms = MACHINE(qdev_get_machine());
> > > > > >         qemu_register_reset(x86_cpu_machine_reset_cb, cpu);
> > > > > > @@ -7135,6 +7165,7 @@ static Property x86_cpu_properties[] = {
> > > > > >                          false),
> > > > > >         DEFINE_PROP_BOOL("x-intel-pt-auto-level", X86CPU, intel_pt_auto_level,
> > > > > >                          true),
> > > > > > +    DEFINE_PROP_STRING("x-l2-cache-topo", X86CPU, l2_cache_topo_level),
> > > > > >         DEFINE_PROP_END_OF_LIST()
> > > > > >     };
> > > > > > diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> > > > > > index 5a955431f759..aa7e96c586c7 100644
> > > > > > --- a/target/i386/cpu.h
> > > > > > +++ b/target/i386/cpu.h
> > > > > > @@ -1987,6 +1987,8 @@ struct ArchCPU {
> > > > > >         int32_t thread_id;
> > > > > >         int32_t hv_max_vps;
> > > > > > +
> > > > > > +    char *l2_cache_topo_level;
> > > > > >     };
> 


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RESEND 18/18] i386: Add new property to control L2 cache topo in CPUID.04H
  2023-02-17  3:45       ` wangyanan (Y) via
@ 2023-02-20  2:54         ` Zhao Liu
  0 siblings, 0 replies; 61+ messages in thread
From: Zhao Liu @ 2023-02-20  2:54 UTC (permalink / raw)
  To: wangyanan (Y)
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster

On Fri, Feb 17, 2023 at 11:45:58AM +0800, wangyanan (Y) wrote:
> Date: Fri, 17 Feb 2023 11:45:58 +0800
> From: "wangyanan (Y)" <wangyanan55@huawei.com>
> Subject: Re: [PATCH RESEND 18/18] i386: Add new property to control L2
>  cache topo in CPUID.04H
> 
> 在 2023/2/17 11:35, Zhao Liu 写道:
> > On Thu, Feb 16, 2023 at 09:14:54PM +0800, wangyanan (Y) wrote:
> > > Date: Thu, 16 Feb 2023 21:14:54 +0800
> > > From: "wangyanan (Y)" <wangyanan55@huawei.com>
> > > Subject: Re: [PATCH RESEND 18/18] i386: Add new property to control L2
> > >   cache topo in CPUID.04H
> > > 
> > > 在 2023/2/13 17:36, Zhao Liu 写道:
> > > > From: Zhao Liu <zhao1.liu@intel.com>
> > > > 
> > > > The property x-l2-cache-topo will be used to change the L2 cache
> > > > topology in CPUID.04H.
> > > > 
> > > > Now it allows user to set the L2 cache is shared in core level or
> > > > cluster level.
> > > > 
> > > > If user passes "-cpu x-l2-cache-topo=[core|cluster]" then older L2 cache
> > > > topology will be overrided by the new topology setting.
> > > Currently x-l2-cache-topo only defines the share level *globally*.
> > Yes, will set for all CPUs.
> > 
> > > I'm thinking how we can make the property more powerful so that it
> > > can specify which CPUs share l2 on core level and which CPUs share
> > > l2 on cluster level.
> > > 
> > > What would Intel's Hybrid CPUs do? Determine the l2 share level
> > > is core or cluster according to the CPU core type (Atom or Core)?
> > > While ARM does not have the core type concept but have CPUs
> > > that l2 is shared on different levels in the same system.
> > For example, Alderlake's "core" shares 1 L2 per core and every 4 "atom"s
> > share 1 L2. For this case, we can set the topology as:
> > 
> > cluster0 has 1 "core" and cluster1 has 4 "atom". Then set L2 shared on
> > cluster level.
> > 
> > Since cluster0 has only 1 "core" type core, then L2 per "core" works.
> This brings restriction to the users that cluster0 must have *1* core-type
> core. What if we set 2 vCores in cluster0 and 4 vCores in cluster1, and
> bind cores in cluster0 to 2 core-type pCores and bind cores in cluster1
> to 4 atom-type pCores? I think this is a necessary use case too.

At present, the cache topology level and core type are not bound, so the
cache topology level can also be adjusted for any vCores.
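
For illustration, a minimal sketch of such a configuration (hypothetical:
it assumes this series is applied, since upstream x86 does not yet accept
smp.clusters; the machine type and CPU model here are placeholders):

```shell
# 8 vCPUs in 2 clusters of 4 single-threaded cores, with the L2 cache
# declared shared at cluster (module) level via the experimental
# x-l2-cache-topo property proposed in this series.
qemu-system-x86_64 \
    -machine q35 \
    -smp 8,sockets=1,dies=1,clusters=2,cores=4,threads=1 \
    -cpu Snowridge,x-l2-cache-topo=cluster
```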

> > Not sure if this idea can be applied to arm?
> > 
> > > Thanks,
> > > Yanan
> > > > Here we expose to user "cluster" instead of "module", to be consistent
> > > > with "cluster-id" naming.
> > > > 
> > > > Since CPUID.04H is used by intel CPUs, this property is available on
> > > > intel CPUs as for now.
> > > > 
> > > > When necessary, it can be extended to CPUID.8000001DH for amd CPUs.
> > > > 
> > > > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > > > ---
> > > >    target/i386/cpu.c | 33 ++++++++++++++++++++++++++++++++-
> > > >    target/i386/cpu.h |  2 ++
> > > >    2 files changed, 34 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > > > index 5816dc99b1d4..cf84c720a431 100644
> > > > --- a/target/i386/cpu.c
> > > > +++ b/target/i386/cpu.c
> > > > @@ -240,12 +240,15 @@ static uint32_t max_processor_ids_for_cache(CPUCacheInfo *cache,
> > > >        case CORE:
> > > >            num_ids = 1 << apicid_core_offset(topo_info);
> > > >            break;
> > > > +    case MODULE:
> > > > +        num_ids = 1 << apicid_module_offset(topo_info);
> > > > +        break;
> > > >        case DIE:
> > > >            num_ids = 1 << apicid_die_offset(topo_info);
> > > >            break;
> > > >        default:
> > > >            /*
> > > > -         * Currently there is no use case for SMT, MODULE and PACKAGE, so use
> > > > +         * Currently there is no use case for SMT and PACKAGE, so use
> > > >             * assert directly to facilitate debugging.
> > > >             */
> > > >            g_assert_not_reached();
> > > > @@ -6633,6 +6636,33 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
> > > >            env->cache_info_amd.l3_cache = &legacy_l3_cache;
> > > >        }
> > > > +    if (cpu->l2_cache_topo_level) {
> > > > +        /*
> > > > +         * FIXME: Currently only supports changing CPUID[4] (for intel), and
> > > > +         * will support changing CPUID[0x8000001D] when necessary.
> > > > +         */
> > > > +        if (!IS_INTEL_CPU(env)) {
> > > > +            error_setg(errp, "only intel cpus supports x-l2-cache-topo");
> > > > +            return;
> > > > +        }
> > > > +
> > > > +        if (!strcmp(cpu->l2_cache_topo_level, "core")) {
> > > > +            env->cache_info_cpuid4.l2_cache->share_level = CORE;
> > > > +        } else if (!strcmp(cpu->l2_cache_topo_level, "cluster")) {
> > > > +            /*
> > > > +             * We expose to users "cluster" instead of "module", to be
> > > > +             * consistent with "cluster-id" naming.
> > > > +             */
> > > > +            env->cache_info_cpuid4.l2_cache->share_level = MODULE;
> > > > +        } else {
> > > > +            error_setg(errp,
> > > > +                       "x-l2-cache-topo doesn't support '%s', "
> > > > +                       "and it only supports 'core' or 'cluster'",
> > > > +                       cpu->l2_cache_topo_level);
> > > > +            return;
> > > > +        }
> > > > +    }
> > > > +
> > > >    #ifndef CONFIG_USER_ONLY
> > > >        MachineState *ms = MACHINE(qdev_get_machine());
> > > >        qemu_register_reset(x86_cpu_machine_reset_cb, cpu);
> > > > @@ -7135,6 +7165,7 @@ static Property x86_cpu_properties[] = {
> > > >                         false),
> > > >        DEFINE_PROP_BOOL("x-intel-pt-auto-level", X86CPU, intel_pt_auto_level,
> > > >                         true),
> > > > +    DEFINE_PROP_STRING("x-l2-cache-topo", X86CPU, l2_cache_topo_level),
> > > >        DEFINE_PROP_END_OF_LIST()
> > > >    };
> > > > diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> > > > index 5a955431f759..aa7e96c586c7 100644
> > > > --- a/target/i386/cpu.h
> > > > +++ b/target/i386/cpu.h
> > > > @@ -1987,6 +1987,8 @@ struct ArchCPU {
> > > >        int32_t thread_id;
> > > >        int32_t hv_max_vps;
> > > > +
> > > > +    char *l2_cache_topo_level;
> > > >    };
> 



* Re: [PATCH RESEND 18/18] i386: Add new property to control L2 cache topo in CPUID.04H
  2023-02-20  2:49             ` Zhao Liu
@ 2023-02-20  3:52               ` wangyanan (Y) via
  0 siblings, 0 replies; 61+ messages in thread
From: wangyanan (Y) via @ 2023-02-20  3:52 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Daniel P. Berrangé,
	qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Xiaoyao Li, Like Xu, Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Michael S . Tsirkin, Richard Henderson, Paolo Bonzini,
	Eric Blake, Markus Armbruster

在 2023/2/20 10:49, Zhao Liu 写道:
> On Fri, Feb 17, 2023 at 05:08:31PM +0800, wangyanan (Y) wrote:
>> Date: Fri, 17 Feb 2023 17:08:31 +0800
>> From: "wangyanan (Y)" <wangyanan55@huawei.com>
>> Subject: Re: [PATCH RESEND 18/18] i386: Add new property to control L2
>>   cache topo in CPUID.04H
>>
>> 在 2023/2/17 15:26, Zhao Liu 写道:
>>> On Fri, Feb 17, 2023 at 12:07:01PM +0800, wangyanan (Y) wrote:
>>>> Date: Fri, 17 Feb 2023 12:07:01 +0800
>>>> From: "wangyanan (Y)" <wangyanan55@huawei.com>
>>>> Subject: Re: [PATCH RESEND 18/18] i386: Add new property to control L2
>>>>    cache topo in CPUID.04H
>>>>
>>>> 在 2023/2/17 11:35, Zhao Liu 写道:
>>>>> On Thu, Feb 16, 2023 at 09:14:54PM +0800, wangyanan (Y) wrote:
>>>>>> Date: Thu, 16 Feb 2023 21:14:54 +0800
>>>>>> From: "wangyanan (Y)" <wangyanan55@huawei.com>
>>>>>> Subject: Re: [PATCH RESEND 18/18] i386: Add new property to control L2
>>>>>>     cache topo in CPUID.04H
>>>>>>
>>>>>> 在 2023/2/13 17:36, Zhao Liu 写道:
>>>>>>> From: Zhao Liu <zhao1.liu@intel.com>
>>>>>>>
>>>>>>> The property x-l2-cache-topo will be used to change the L2 cache
>>>>>>> topology in CPUID.04H.
>>>>>>>
>>>>>>> Now it allows user to set the L2 cache is shared in core level or
>>>>>>> cluster level.
>>>>>>>
>>>>>>> If user passes "-cpu x-l2-cache-topo=[core|cluster]" then older L2 cache
>>>>>>> topology will be overrided by the new topology setting.
>>>>>> Currently x-l2-cache-topo only defines the share level *globally*.
>>>>> Yes, will set for all CPUs.
>>>>>
>>>>>> I'm thinking how we can make the property more powerful so that it
>>>>>> can specify which CPUs share l2 on core level and which CPUs share
>>>>>> l2 on cluster level.
>>>>>>
>>>>>> What would Intel's Hybrid CPUs do? Determine the l2 share level
>>>>>> is core or cluster according to the CPU core type (Atom or Core)?
>>>>>> While ARM does not have the core type concept but have CPUs
>>>>>> that l2 is shared on different levels in the same system.
>>>>> For example, Alderlake's "core" shares 1 L2 per core and every 4 "atom"s
>>>>> share 1 L2. For this case, we can set the topology as:
>>>>>
>>>>> cluster0 has 1 "core" and cluster1 has 4 "atom". Then set L2 shared on
>>>>> cluster level.
>>>>>
>>>>> Since cluster0 has only 1 "core" type core, then L2 per "core" works.
>>>>>
>>>>> Not sure if this idea can be applied to arm?
>>>> For a CPU topology where we have 2 clusters in total, 2 cores in cluster0
>>>> have their own L1/L2 cache and 2 threads in each core, 4 cores in cluster1
>>>> share one L2 cache and 1 thread in each core. The global way does not
>>>> work well.
>>>>
>>>> What about defining something general, which looks like -numa config:
>>>> -cache-topo cache=l2, share_level="core", cpus='0-3'
>>>> -cache-topo cache=l2, share_level="cluster", cpus='4-7'
>>> Hi Yanan, here it may be necessary to check whether the cpu index set
>>> in "cpus" is reasonable through the specific cpu topology.
>> Yes, the validity of the cache topo configs from the users should be
>> check in machine_parse_cache_topo ( if we will have this func).
>> It's not a big deal, we always need the validity checks.
> I guess that verification needs to build up the full cpu topology, as
> done in another hybrid RFC. So, should the cpu-topology.h related
> patches in that RFC be split out and sent first?
Which patches? Do you mean the rework/generalization of smp?
Would it be reasonable to send them separately, without the hybrid
introduction? I'm not yet sure whether -cache-topo really needs the
cpu-topology.h related things; maybe the current upstream QEMU code is enough.
>> In summary:
>> 1、There can not be the same cpus in different "cpus" list.
>> 2、A combination of all the "cpus" list should *just* cover all the CPUs
>> in the machine
>> 3、Most importantly, cpus in the same cluster must be set with the
>> same cache "share_level" (core or cluster) and cpus in the same core
>> must also be set with the same cache "share_level".
> Got it, thx.
>
>>> For example, core0 has 2 CPUs: cpu0 and cpu1, and core1 has 2 CPUs: cpu2
>>> and cpu3, then set l2 as:
>>>
>>> -cache-topo cache=l2, share_level="core", cpus='0-2'
>>> -cache-topo cache=l2, share_level="core", cpus='3'
>>>
>>> Whether this command is legal depends on the meaning we give to the
>>> parameter "cpu":
>> s/cpus/cpu.
>> It means all the affected CPUs, e.g., the second case.
>>> 1. If "cpu" means all cpus share the cache set in this command, then
>>> this command should fail since cpu2 and cpu3 are in a core.
>>>
>>> 2. If "cpu" means the affected cpus, then this command should find the
>>> cores they belong to according to the cpu topology, and set L2 for those
>>> cores. This command may return success.
>>>
>>> What about removing share_level and ask "cpu" to mean all the sharing
>>> cpus to avoid checking the cpu topology?
>>>
>>> Then the above example should be:
>>>
>>> -cache-topo cache=l2, cpus='0-1'
>>> -cache-topo cache=l2, cpus='2-3'
>> Sorry, I dont understand how we will know the cache share_level of
>> cpus '0-1' or '2-3'. What will the CLIs will be like if we change the
>> belows CLIs by removing the "share_level" params.
>>
>> -cache-topo cache=l2, share_level="core", cpus='0-3'
>> -cache-topo cache=l2, share_level="cluster", cpus='4-7'
>>> This decouples cpu topology and cache topology completely and very
>>> simple. In this way, determining the cache by specifying the shared cpu
>>> is similar to that in x86 CPUID.04H.
>>>
>>> But the price of simplicity is we may build a cache topology that doesn't
>>> match the reality.
>>>
>>> But if the cache topology must be set based on the cpu topology, another
>>> way is consider specifying the cache when setting the topology
>>> structure, which can be based on @Daniel's format [1]:
>>>
>>>     -object cpu-socket,id=sock0,cache=l3
>>>     -object cpu-die,id=die0,parent=sock0
>>>     -object cpu-cluster,id=cluster0,parent=die0
>>>     -object cpu-cluster,id=cluster1,parent=die0,cache=l2
>>>     -object x86-cpu-model-core,id=cpu0,parent=cluster0,threads=2,cache=l1i,lid,l2
>>>     -object x86-cpu-model-atom,id=cpu1,parent=cluster1,cache=l1i,lid
>>>     -object x86-cpu-model-atom,id=cpu2,parent=cluster1,cache=l1i,l1d
>>>
>>> Then from this command, cpu0 has a l2, and cpu1 and cpu2 shares a l2
>>> (the l2 is inserted in cluster1).
>>>
>>> This whole process is like when designing or building a CPU, the user
>>> decides where to insert the caches. The advantage is that it is easier
>>> to verify the rationality and is intuitive. But complicated.
>> Yeah, this is also a way.
>> Most of the concern is that it will not be easy/readable to extend the
>> cache configs, for example if we ever want to support custom cache size,
>> cache type or other cache properties in the future. And yes, will also
>> complex the -objects by making them huge.
>>
>> I think keeping cache and cpu configs decouped will leave simplicity
>> to the users, just like we keep numa resources config from cpu
>> topology currently.
>>
>> On the other hand, -cache-topo is not just for hybrid, it's also for
>> current smp, which make it inappropriate to bind -cache-topo
>> with hybrid case. For example, "-cache-topo name=l2, share_level=cluster"
>> will indicate that l2 cache is shared on cluster level globally. And this
>> is what "x-l2-cache-topo" in this patch does.
>>
>> Thanks,
>> Yanan
>>> (Also CC @Daniel for comments).
>>>
>>> [1]: https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg03320.html
>>>
>>> Thanks,
>>> Zhao
>>>
>>>> If we ever want to support custom share-level for L3/L1, no extra work
>>>> is needed. We can also extend the CLI to support custom cache size, etc..
>>>>
>>>> If you thinks this a good idea to explore, I can work on it, since I'm
>>>> planing to add support cache topology for ARM.
>>>>
>>>> Thanks,
>>>> Yanan
>>>>>> Thanks,
>>>>>> Yanan
>>>>>>> Here we expose to user "cluster" instead of "module", to be consistent
>>>>>>> with "cluster-id" naming.
>>>>>>>
>>>>>>> Since CPUID.04H is used by intel CPUs, this property is available on
>>>>>>> intel CPUs as for now.
>>>>>>>
>>>>>>> When necessary, it can be extended to CPUID.8000001DH for amd CPUs.
>>>>>>>
>>>>>>> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
>>>>>>> ---
>>>>>>>      target/i386/cpu.c | 33 ++++++++++++++++++++++++++++++++-
>>>>>>>      target/i386/cpu.h |  2 ++
>>>>>>>      2 files changed, 34 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>>>>>> index 5816dc99b1d4..cf84c720a431 100644
>>>>>>> --- a/target/i386/cpu.c
>>>>>>> +++ b/target/i386/cpu.c
>>>>>>> @@ -240,12 +240,15 @@ static uint32_t max_processor_ids_for_cache(CPUCacheInfo *cache,
>>>>>>>          case CORE:
>>>>>>>              num_ids = 1 << apicid_core_offset(topo_info);
>>>>>>>              break;
>>>>>>> +    case MODULE:
>>>>>>> +        num_ids = 1 << apicid_module_offset(topo_info);
>>>>>>> +        break;
>>>>>>>          case DIE:
>>>>>>>              num_ids = 1 << apicid_die_offset(topo_info);
>>>>>>>              break;
>>>>>>>          default:
>>>>>>>              /*
>>>>>>> -         * Currently there is no use case for SMT, MODULE and PACKAGE, so use
>>>>>>> +         * Currently there is no use case for SMT and PACKAGE, so use
>>>>>>>               * assert directly to facilitate debugging.
>>>>>>>               */
>>>>>>>              g_assert_not_reached();
>>>>>>> @@ -6633,6 +6636,33 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
>>>>>>>              env->cache_info_amd.l3_cache = &legacy_l3_cache;
>>>>>>>          }
>>>>>>> +    if (cpu->l2_cache_topo_level) {
>>>>>>> +        /*
>>>>>>> +         * FIXME: Currently only supports changing CPUID[4] (for intel), and
>>>>>>> +         * will support changing CPUID[0x8000001D] when necessary.
>>>>>>> +         */
>>>>>>> +        if (!IS_INTEL_CPU(env)) {
>>>>>>> +            error_setg(errp, "only intel cpus supports x-l2-cache-topo");
>>>>>>> +            return;
>>>>>>> +        }
>>>>>>> +
>>>>>>> +        if (!strcmp(cpu->l2_cache_topo_level, "core")) {
>>>>>>> +            env->cache_info_cpuid4.l2_cache->share_level = CORE;
>>>>>>> +        } else if (!strcmp(cpu->l2_cache_topo_level, "cluster")) {
>>>>>>> +            /*
>>>>>>> +             * We expose to users "cluster" instead of "module", to be
>>>>>>> +             * consistent with "cluster-id" naming.
>>>>>>> +             */
>>>>>>> +            env->cache_info_cpuid4.l2_cache->share_level = MODULE;
>>>>>>> +        } else {
>>>>>>> +            error_setg(errp,
>>>>>>> +                       "x-l2-cache-topo doesn't support '%s', "
>>>>>>> +                       "and it only supports 'core' or 'cluster'",
>>>>>>> +                       cpu->l2_cache_topo_level);
>>>>>>> +            return;
>>>>>>> +        }
>>>>>>> +    }
>>>>>>> +
>>>>>>>      #ifndef CONFIG_USER_ONLY
>>>>>>>          MachineState *ms = MACHINE(qdev_get_machine());
>>>>>>>          qemu_register_reset(x86_cpu_machine_reset_cb, cpu);
>>>>>>> @@ -7135,6 +7165,7 @@ static Property x86_cpu_properties[] = {
>>>>>>>                           false),
>>>>>>>          DEFINE_PROP_BOOL("x-intel-pt-auto-level", X86CPU, intel_pt_auto_level,
>>>>>>>                           true),
>>>>>>> +    DEFINE_PROP_STRING("x-l2-cache-topo", X86CPU, l2_cache_topo_level),
>>>>>>>          DEFINE_PROP_END_OF_LIST()
>>>>>>>      };
>>>>>>> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
>>>>>>> index 5a955431f759..aa7e96c586c7 100644
>>>>>>> --- a/target/i386/cpu.h
>>>>>>> +++ b/target/i386/cpu.h
>>>>>>> @@ -1987,6 +1987,8 @@ struct ArchCPU {
>>>>>>>          int32_t thread_id;
>>>>>>>          int32_t hv_max_vps;
>>>>>>> +
>>>>>>> +    char *l2_cache_topo_level;
>>>>>>>      };
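
The three validity rules summarized above for a hypothetical -cache-topo
option could be sketched as follows (a pure illustration, not QEMU code;
all names are invented):

```python
def validate_cache_topo(num_cpus, cpu_lists, share_levels, topo_of):
    """Validate hypothetical '-cache-topo ... cpus=...' configs:
    1. no CPU may appear in two 'cpus' lists;
    2. the lists together must cover all CPUs exactly once;
    3. CPUs in the same cluster, and CPUs in the same core, must be
       given the same share level.
    `topo_of` maps a CPU index to its (cluster_id, core_id)."""
    level_of = {}
    for cpus, level in zip(cpu_lists, share_levels):
        for cpu in cpus:
            if cpu in level_of:                       # rule 1
                raise ValueError(f"CPU {cpu} listed twice")
            level_of[cpu] = level
    if set(level_of) != set(range(num_cpus)):         # rule 2
        raise ValueError("'cpus' lists must cover all CPUs exactly once")
    # rule 3: uniform level per cluster, then per (cluster, core)
    for key in (lambda c: topo_of(c)[0], topo_of):
        groups = {}
        for cpu, level in level_of.items():
            if groups.setdefault(key(cpu), level) != level:
                raise ValueError("conflicting share levels in one unit")

# 8 vCPUs: cluster 0 holds cores 0-1 (CPUs 0-3), cluster 1 holds cores 2-3.
topo = lambda cpu: (cpu // 4, cpu // 2)
validate_cache_topo(8, [range(0, 4), range(4, 8)], ["core", "cluster"], topo)
```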




* Re: [PATCH RESEND 04/18] i386/cpu: Fix number of addressable IDs in CPUID.04H
  2023-02-13  9:36 ` [PATCH RESEND 04/18] i386/cpu: Fix number of addressable IDs in CPUID.04H Zhao Liu
  2023-02-15 10:11   ` wangyanan (Y) via
@ 2023-02-20  6:59   ` Xiaoyao Li
  2023-02-22  6:37     ` Zhao Liu
  1 sibling, 1 reply; 61+ messages in thread
From: Xiaoyao Li @ 2023-02-20  6:59 UTC (permalink / raw)
  To: Zhao Liu, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Eric Blake, Markus Armbruster
  Cc: qemu-devel, Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo,
	Like Xu, Zhao Liu

On 2/13/2023 5:36 PM, Zhao Liu wrote:
> For i-cache and d-cache, the maximum IDs for CPUs sharing cache (
> CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits 25:14]) are
> both 0, and this means i-cache and d-cache are shared in the SMT level.
> This is correct if there's single thread per core, but is wrong for the
> hyper threading case (one core contains multiple threads) since the
> i-cache and d-cache are shared in the core level other than SMT level.
> 
> Therefore, in order to be compatible with both multi-threaded and
> single-threaded situations, we should set i-cache and d-cache be shared
> at the core level by default.

It's true for a VM only when the exact HW topology is configured for the
VM, i.e., the two virtual LPs of one virtual CORE are pinned to two
physical LPs of one physical CORE. Otherwise it's incorrect for the VM.

For example, given a VM of 4 threads and 2 cores: if the 4 threads are not
pinned to 4 physical LPs of 2 COREs, it's likely that each vCPU runs on an
LP of a different physical core, so no vCPUs share the L1i/L1d cache at
core level.
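
To make the encoding concrete, here is a rough Python sketch (not QEMU
code; the helper names are invented) of how the "maximum number of
addressable IDs for logical processors sharing this cache" field in
CPUID.04H:EAX[25:14] is derived, mirroring the round-up-to-power-of-two
bit-width logic QEMU uses for APIC IDs:

```python
def apicid_bitwidth(count: int) -> int:
    """Bits needed to address `count` IDs at one topology level
    (0 for a single ID, else ceil(log2(count)))."""
    return 0 if count <= 1 else (count - 1).bit_length()

def max_sharing_ids_field(threads_per_core: int, cores: int = 1) -> int:
    """CPUID.04H:EAX[25:14]: (number of addressable logical-processor
    IDs covered by the cache's share level) - 1."""
    width = apicid_bitwidth(threads_per_core) + apicid_bitwidth(cores)
    return (1 << width) - 1

# Per-thread (SMT-level) sharing always encodes 0, which is only right
# when there is a single thread per core:
assert max_sharing_ids_field(1) == 0
# With 2 threads per core, a core-level L1 must cover both threads:
assert max_sharing_ids_field(2) == 1
# A cluster/module-level L2 over 4 single-threaded cores covers 4 IDs:
assert max_sharing_ids_field(1, cores=4) == 3
```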



* Re: [PATCH RESEND 04/18] i386/cpu: Fix number of addressable IDs in CPUID.04H
  2023-02-20  6:59   ` Xiaoyao Li
@ 2023-02-22  6:37     ` Zhao Liu
  2023-02-23  3:52       ` Xiaoyao Li
  0 siblings, 1 reply; 61+ messages in thread
From: Zhao Liu @ 2023-02-22  6:37 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Eric Blake, Markus Armbruster, qemu-devel,
	Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo, Like Xu,
	Zhao Liu

Hi Xiaoyao,

Thanks, I've spent some time thinking about it here.

On Mon, Feb 20, 2023 at 02:59:20PM +0800, Xiaoyao Li wrote:
> Date: Mon, 20 Feb 2023 14:59:20 +0800
> From: Xiaoyao Li <xiaoyao.li@intel.com>
> Subject: Re: [PATCH RESEND 04/18] i386/cpu: Fix number of addressable IDs
>  in CPUID.04H
> 
> On 2/13/2023 5:36 PM, Zhao Liu wrote:
> > For i-cache and d-cache, the maximum IDs for CPUs sharing cache (
> > CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits 25:14]) are
> > both 0, and this means i-cache and d-cache are shared in the SMT level.
> > This is correct if there's single thread per core, but is wrong for the
> > hyper threading case (one core contains multiple threads) since the
> > i-cache and d-cache are shared in the core level other than SMT level.
> > 
> > Therefore, in order to be compatible with both multi-threaded and
> > single-threaded situations, we should set i-cache and d-cache be shared
> > at the core level by default.
> 
> It's true for VM only when the exactly HW topology is configured to VM.
> i.e., two virtual LPs of one virtual CORE are pinned to two physical LPs
> that of one physical CORE.

Yes, in this case the host and guest have the same topology, and their
topologies can match.

> Otherwise it's incorrect for VM.

My understanding here is that what we do in QEMU is to create
self-consistent CPU topology and cache topology for the guest.

If the VM topology is self-consistent and emulated to be almost
identical to the real machine, then the emulation in QEMU is correct,
right? ;-)

> 
> for example. given a VM of 4 threads and 2 cores. If not pinning the 4
> threads to physical 4 LPs of 2 CORES. It's likely each vcpu running on a LP
> of different physical cores.

Thanks for bringing this up, this is worth discussing.

I looked into it and found that the specific scheduling policy for the
vCPU actually depends on the host setting. For example, (IIUC) if the host
enables core scheduling, then the host will schedule the vCPUs on the SMT
threads of the same core.

Also, to explore the original purpose of the "per thread" i/d cache
topology, I have retraced its history.

The related commit should be from '09: 400281a (set CPUID bits
to present cores and threads topology). That commit added the
multithreading cache topology in CPUID.04H. In particular, it set the
L2 cache level to per-core, but it did not change the level of
L1 (i/d cache), that is, L1 still remained per-thread.

I think that is the problem: L1 should also be per-core in the
multithreading case. (So the fix in this patch is worth it?)

Another thing we can refer to is that AMD's i/d cache topology is per-core
rather than per-thread (in a different CPUID leaf than Intel's): in
encode_cache_cpuid8000001d() (target/i386/cpu.c), the i/d caches and L2
are encoded as core level in EAX. Presumably they were set per-core to
emulate the L1 topology of the real machine as well.

So, I guess this example is "unintentionally" benefiting from the
"per thread" level of i/d cache.

What do you think?

> So no vcpu shares L1i/L1d cache at core level.

Yes. Host scheduling is not guaranteed, and workload-balancing policies
in various scenarios, as well as some security mitigations, may
break the delicate balance we have carefully set up.

Perhaps another way is to also add a new property "x-l1-cache-topo" (like
[1] did for L2) that can adjust the i/d cache level from core to smt to
benefit cases like this.

[1]: https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg03201.html

Thanks,
Zhao




* Re: [PATCH RESEND 04/18] i386/cpu: Fix number of addressable IDs in CPUID.04H
  2023-02-22  6:37     ` Zhao Liu
@ 2023-02-23  3:52       ` Xiaoyao Li
  0 siblings, 0 replies; 61+ messages in thread
From: Xiaoyao Li @ 2023-02-23  3:52 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Richard Henderson,
	Paolo Bonzini, Eric Blake, Markus Armbruster, qemu-devel,
	Zhenyu Wang, Dapeng Mi, Zhuocheng Ding, Robert Hoo, Like Xu,
	Zhao Liu

On 2/22/2023 2:37 PM, Zhao Liu wrote:
> Hi Xiaoyao,
> 
> Thanks, I've spent some time thinking about it here.
> 
> On Mon, Feb 20, 2023 at 02:59:20PM +0800, Xiaoyao Li wrote:
>> Date: Mon, 20 Feb 2023 14:59:20 +0800
>> From: Xiaoyao Li <xiaoyao.li@intel.com>
>> Subject: Re: [PATCH RESEND 04/18] i386/cpu: Fix number of addressable IDs
>>   in CPUID.04H
>>
>> On 2/13/2023 5:36 PM, Zhao Liu wrote:
>>> For i-cache and d-cache, the maximum IDs for CPUs sharing cache (
>>> CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits 25:14]) are
>>> both 0, and this means i-cache and d-cache are shared in the SMT level.
>>> This is correct if there's single thread per core, but is wrong for the
>>> hyper threading case (one core contains multiple threads) since the
>>> i-cache and d-cache are shared in the core level other than SMT level.
>>>
>>> Therefore, in order to be compatible with both multi-threaded and
>>> single-threaded situations, we should set i-cache and d-cache be shared
>>> at the core level by default.
>>
>> It's true for VM only when the exactly HW topology is configured to VM.
>> i.e., two virtual LPs of one virtual CORE are pinned to two physical LPs
>> that of one physical CORE.
> 
> Yes, in this case, host and guest has the same topology, and their
> topology can match.
> 
>> Otherwise it's incorrect for VM.
> 
> My understanding here is that what we do in QEMU is to create
> self-consistent CPU topology and cache topology for the guest.
> 
> If the VM topology is self-consistent and emulated to be almost
> identical to the real machine, then the emulation in QEMU is correct,
> right? ;-)

A real machine tells the OS via CPUID that two threads in the same CORE
share the L1 cache because it's a fact; that is exactly how the hardware
resources are laid out. However, for a VM, when you tell the guest the
same thing (two threads share the L1 cache), is it true for the vCPUs?

The target is to emulate things correctly, not to emulate them
identically to a real machine. In fact, for these shared resources, it's
mostly infeasible to emulate correctly without pinning vCPUs to physical
LPs.

>>
>> for example. given a VM of 4 threads and 2 cores. If not pinning the 4
>> threads to physical 4 LPs of 2 CORES. It's likely each vcpu running on a LP
>> of different physical cores.
> 
> Thanks for bringing this up, this is worth discussing.
> 
> I looked into it and found that the specific scheduling policy for the
> vCPU actually depends on the host setting. For example, (IIUC) if host
> 
> enables core scheduling, then host will schedule the vCPU on the SMT
> threads of same core.
> 
> Also, to explore the original purpose of the "per thread" i/d cache
> topology, I have retraced its history.
> 
> The related commit should be in '09, which is 400281a (set CPUID bits
> to present cores and threads topology). In this commit, the
> multithreading cache topology is added in CPUID.04H. In particular, here
> it set the L2 cache level to per core, but it did not change the level of
> L1 (i/d cache), that is, L1 still remains per thread.
> 
> I think that here is the problem, L1 should also be per core in
> multithreading case. (the fix for this patch is worth it?)
> 
> Another thing we can refer to is that AMD's i/d cache topology is per
> core rather than per thread (different CPUID leaf than intel): In
> encode_cache_cpuid8000001d() (target/i386/cpu.c), i/d cache and L2
> are encoded as core level in EAX. They set up the per core supposedly
> to emulate the L1 topology of the real machine as well.
> 
> So, I guess this example is "unintentionally" benefiting from the
> "per thread" level of i/d cache.
> 
> What do you think?
> 
>> So no vcpu shares L1i/L1d cache at core level.
> 
> Yes. The scheduling of host is not guaranteed, and workload balance
> policies in various scenarios and some security mitigation ways may
> break the delicate balance we have carefully set up.
> 
> Perhaps another way is to also add a new command "x-l1-cache-topo" (like
> [1] did for L2) that can adjust the i/d cache level from core to smt to
> benefit cases like this.
> 
> [1]: https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg03201.html
> 
> Thanks,
> Zhao
> 




end of thread, other threads:[~2023-02-23  3:54 UTC | newest]

Thread overview: 61+ messages
-- links below jump to the message on this page --
2023-02-13  9:36 [PATCH RESEND 00/18] Support smp.clusters for x86 Zhao Liu
2023-02-13  9:36 ` [PATCH RESEND 01/18] machine: Fix comment of machine_parse_smp_config() Zhao Liu
2023-02-13 13:31   ` wangyanan (Y) via
2023-02-14 14:22     ` Zhao Liu
2023-02-13  9:36 ` [PATCH RESEND 02/18] tests: Rename test-x86-cpuid.c to test-x86-apicid.c Zhao Liu
2023-02-15  2:36   ` wangyanan (Y) via
2023-02-15  3:35     ` Zhao Liu
2023-02-15  7:44       ` wangyanan (Y) via
2023-02-13  9:36 ` [PATCH RESEND 03/18] softmmu: Fix CPUSTATE.nr_cores' calculation Zhao Liu
2023-02-15  2:58   ` wangyanan (Y) via
2023-02-15  3:37     ` Zhao Liu
2023-02-13  9:36 ` [PATCH RESEND 04/18] i386/cpu: Fix number of addressable IDs in CPUID.04H Zhao Liu
2023-02-15 10:11   ` wangyanan (Y) via
2023-02-15 14:33     ` Zhao Liu
2023-02-20  6:59   ` Xiaoyao Li
2023-02-22  6:37     ` Zhao Liu
2023-02-23  3:52       ` Xiaoyao Li
2023-02-13  9:36 ` [PATCH RESEND 05/18] i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid() Zhao Liu
2023-02-15  3:28   ` wangyanan (Y) via
2023-02-15  7:10     ` Zhao Liu
2023-02-15  7:08       ` wangyanan (Y) via
2023-02-13  9:36 ` [PATCH RESEND 06/18] i386: Introduce module-level cpu topology to CPUX86State Zhao Liu
2023-02-15  7:41   ` wangyanan (Y) via
2023-02-13  9:36 ` [PATCH RESEND 07/18] i386: Support modules_per_die in X86CPUTopoInfo Zhao Liu
2023-02-15 10:38   ` wangyanan (Y) via
2023-02-15 14:35     ` Zhao Liu
2023-02-16  2:34   ` wangyanan (Y) via
2023-02-16  4:33     ` Zhao Liu
2023-02-13  9:36 ` [PATCH RESEND 08/18] i386: Support module_id in X86CPUTopoIDs Zhao Liu
2023-02-13  9:36 ` [PATCH RESEND 09/18] i386: Fix comment style in topology.h Zhao Liu
2023-02-13 13:40   ` Philippe Mathieu-Daudé
2023-02-14  2:37   ` wangyanan (Y) via
2023-02-15 10:54   ` wangyanan (Y) via
2023-02-15 14:35     ` Zhao Liu
2023-02-13  9:36 ` [PATCH RESEND 10/18] i386: Update APIC ID parsing rule to support module level Zhao Liu
2023-02-15 11:06   ` wangyanan (Y) via
2023-02-15 15:03     ` Zhao Liu
2023-02-16  2:40       ` wangyanan (Y) via
2023-02-13  9:36 ` [PATCH RESEND 11/18] i386/cpu: Introduce cluster-id to X86CPU Zhao Liu
2023-02-13  9:36 ` [PATCH RESEND 12/18] tests: Add test case of APIC ID for module level parsing Zhao Liu
2023-02-15 11:22   ` wangyanan (Y) via
2023-02-13  9:36 ` [PATCH RESEND 13/18] hw/i386/pc: Support smp.clusters for x86 PC machine Zhao Liu
2023-02-14  2:34   ` wangyanan (Y) via
2023-02-13  9:36 ` [PATCH RESEND 14/18] i386: Add cache topology info in CPUCacheInfo Zhao Liu
2023-02-15 12:17   ` wangyanan (Y) via
2023-02-15 15:07     ` Zhao Liu
2023-02-13  9:36 ` [PATCH RESEND 15/18] i386: Use CPUCacheInfo.share_level to encode CPUID[4].EAX[bits 25:14] Zhao Liu
2023-02-13  9:36 ` [PATCH RESEND 16/18] i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14] Zhao Liu
2023-02-15 12:32   ` wangyanan (Y) via
2023-02-15 15:09     ` Zhao Liu
2023-02-13  9:36 ` [PATCH RESEND 17/18] i386: Use CPUCacheInfo.share_level to encode " Zhao Liu
2023-02-13  9:36 ` [PATCH RESEND 18/18] i386: Add new property to control L2 cache topo in CPUID.04H Zhao Liu
2023-02-16 13:14   ` wangyanan (Y) via
2023-02-17  3:35     ` Zhao Liu
2023-02-17  3:45       ` wangyanan (Y) via
2023-02-20  2:54         ` Zhao Liu
2023-02-17  4:07       ` wangyanan (Y) via
2023-02-17  7:26         ` Zhao Liu
2023-02-17  9:08           ` wangyanan (Y) via
2023-02-20  2:49             ` Zhao Liu
2023-02-20  3:52               ` wangyanan (Y) via
