From: Zhang Rui <rui.zhang@intel.com>
To: linux-kernel@vger.kernel.org, x86@kernel.org,
linux-hwmon@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
dave.hansen@linux.intel.com, hpa@zytor.com, corbet@lwn.net,
fenghua.yu@intel.com, jdelvare@suse.com, linux@roeck-us.net,
len.brown@intel.com, rui.zhang@intel.com
Subject: [PATCH 4/7] x86/topology: Fix max_siblings calculation
Date: Sat, 13 Aug 2022 00:41:41 +0800
Message-ID: <20220812164144.30829-5-rui.zhang@intel.com>
In-Reply-To: <20220812164144.30829-1-rui.zhang@intel.com>
The max siblings value returned by CPUID.1F SMT level EBX differs among
CPUs on Intel Hybrid platforms like ADL-S/P.
It returns 2 for Pcore CPUs, which have an HT sibling, and 1 for Ecore CPUs,
which do not.
Today, the CPUID SMT level EBX value is written to the global variable
smp_num_siblings by every CPU. Thus, smp_num_siblings is overwritten with
different values depending on the Pcore/Ecore enumeration order.
For example,
[ 0.201005] detect_extended_topology: CPU APICID 0x0, smp_num_siblings 2, x86_max_cores 10
[ 0.201117] start_kernel->check_bugs->cpu_smt_check_topology: smp_num_siblings 2
...
[ 0.010146] detect_extended_topology: CPU APICID 0x8, smp_num_siblings 2, x86_max_cores 10
...
[ 0.010146] detect_extended_topology: CPU APICID 0x39, smp_num_siblings 2, x86_max_cores 10
[ 0.010146] detect_extended_topology: CPU APICID 0x48, smp_num_siblings 1, x86_max_cores 20
...
[ 0.010146] detect_extended_topology: CPU APICID 0x4e, smp_num_siblings 1, x86_max_cores 20
[ 2.583800] sched_set_itmt_core_prio: smp_num_siblings 1
This inconsistency brings several potential issues:
1. Some kernel configuration, like cpu_smt_control as set in
start_kernel()->check_bugs()->cpu_smt_check_topology(), depends on the
smp_num_siblings value set by cpu0.
It is pure luck that all the current hybrid platforms use a Pcore as cpu0,
which hides this problem.
2. Some per-CPU data that depends on smp_num_siblings, like
cpuinfo_x86.x86_max_cores, becomes inconsistent and bogus.
3. The final smp_num_siblings value after boot depends on the last CPU
enumerated, which could be either a Pcore or an Ecore CPU.
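The last-writer-wins behaviour behind these issues can be sketched in
user-space C. This is only an illustration, not kernel code:
demo_num_siblings, enumerate_with_ebx() and demo_max_cores() are made-up
names, the EBX values follow the 2/1 pattern reported above, and the division
assumes x86_max_cores is derived as core-level siblings divided by
smp_num_siblings, which is consistent with the 10 and 20 seen in the log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel global smp_num_siblings (hypothetical name). */
static unsigned int demo_num_siblings = 1;

/*
 * Every enumerated CPU overwrites the global with its own SMT-level
 * EBX[15:0], so the value that survives is the one from the last CPU.
 */
static unsigned int enumerate_with_ebx(const uint32_t *smt_ebx, size_t ncpus)
{
	for (size_t i = 0; i < ncpus; i++)
		demo_num_siblings = smt_ebx[i] & 0xffff;

	return demo_num_siblings;
}

/* Assumed derivation: cores = core-level siblings / SMT siblings. */
static unsigned int demo_max_cores(unsigned int core_level_siblings,
				   unsigned int num_siblings)
{
	assert(num_siblings != 0);
	return core_level_siblings / num_siblings;
}
```

With Ecores enumerated last the stand-in global ends up as 1, with Pcores
last it ends up as 2, and the derived core count for the same 20 core-level
siblings flips between 20 and 10 accordingly.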
The solution is to use CPUID EAX bits_shift to get the maximum number of
addressable logical processors, and use this to determine max siblings.
Because:
1. the CPUID EAX bits_shift values are consistent among CPUs as far as
observed.
2. some code already uses the smp_num_siblings value to isolate the SMT ID
bits in the APIC-ID, like apic_id_is_primary_thread().
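The two reasons above can be illustrated with a small user-space sketch. It
is not the kernel implementation: siblings_from_ebx(), siblings_from_eax()
and demo_is_primary_thread() are hypothetical helpers, and the register
values used with them (SMT-level EAX shift of 1 on both core types, EBX of 2
on a Pcore and 1 on an Ecore) are assumptions that match the description
above rather than values read from hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* EAX[4:0] of a CPUID.0B/1F sub-leaf: APIC-ID shift to the next level. */
#define BITS_SHIFT_NEXT_LEVEL(eax)	((eax) & 0x1f)

/* Old calculation: logical processors reported in EBX[15:0]. */
static unsigned int siblings_from_ebx(uint32_t ebx)
{
	return ebx & 0xffff;
}

/* New calculation: number of addressable IDs at this level. */
static unsigned int siblings_from_eax(uint32_t eax)
{
	return 1u << BITS_SHIFT_NEXT_LEVEL(eax);
}

/*
 * A thread is treated as primary when its SMT ID bits are all zero.
 * The mask below is only valid for a power-of-two sibling count, which
 * the EAX-derived value always is.
 */
static bool demo_is_primary_thread(uint32_t apicid, unsigned int num_siblings)
{
	assert(num_siblings != 0 && (num_siblings & (num_siblings - 1)) == 0);
	return (apicid & (num_siblings - 1)) == 0;
}
```

Under these assumptions siblings_from_eax(1) is 2 for both core types, while
siblings_from_ebx() gives 2 or 1. With a sibling count of 1, an odd APIC ID
such as 0x9 would be classified as a primary thread, whereas with the
EAX-derived count of 2 it is not.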
Suggested-and-reviewed-by: Len Brown <len.brown@intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
---
arch/x86/kernel/cpu/topology.c | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index 5e868b62a7c4..2a88f2fa5756 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -23,7 +23,12 @@
#define LEAFB_SUBTYPE(ecx) (((ecx) >> 8) & 0xff)
#define BITS_SHIFT_NEXT_LEVEL(eax) ((eax) & 0x1f)
-#define LEVEL_MAX_SIBLINGS(ebx) ((ebx) & 0xffff)
+
+/*
+ * Use EAX bit_shift to calculate the maximum number of addressable logical
+ * processors sharing the current level.
+ */
+#define LEVEL_MAX_SIBLINGS(eax) (1 << BITS_SHIFT_NEXT_LEVEL(eax))
unsigned int __max_die_per_package __read_mostly = 1;
EXPORT_SYMBOL(__max_die_per_package);
@@ -79,7 +84,7 @@ int detect_extended_topology_early(struct cpuinfo_x86 *c)
* initial apic id, which also represents 32-bit extended x2apic id.
*/
c->initial_apicid = edx;
- smp_num_siblings = LEVEL_MAX_SIBLINGS(ebx);
+ smp_num_siblings = LEVEL_MAX_SIBLINGS(eax);
#endif
return 0;
}
@@ -109,9 +114,9 @@ int detect_extended_topology(struct cpuinfo_x86 *c)
*/
cpuid_count(leaf, SMT_LEVEL, &eax, &ebx, &ecx, &edx);
c->initial_apicid = edx;
- core_level_siblings = smp_num_siblings = LEVEL_MAX_SIBLINGS(ebx);
+ core_level_siblings = smp_num_siblings = LEVEL_MAX_SIBLINGS(eax);
core_plus_mask_width = ht_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
- die_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
+ die_level_siblings = LEVEL_MAX_SIBLINGS(eax);
pkg_mask_width = die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
sub_index = 1;
@@ -122,14 +127,14 @@ int detect_extended_topology(struct cpuinfo_x86 *c)
* Check for the Core type in the implemented sub leaves.
*/
if (LEAFB_SUBTYPE(ecx) == CORE_TYPE) {
- core_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
+ core_level_siblings = LEVEL_MAX_SIBLINGS(eax);
core_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
die_level_siblings = core_level_siblings;
die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
}
if (LEAFB_SUBTYPE(ecx) == DIE_TYPE) {
die_level_present = true;
- die_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
+ die_level_siblings = LEVEL_MAX_SIBLINGS(eax);
die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
}
--
2.34.1