From: Zhang Rui <rui.zhang@intel.com>
To: linux-kernel@vger.kernel.org, x86@kernel.org,
linux-hwmon@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
dave.hansen@linux.intel.com, hpa@zytor.com, corbet@lwn.net,
fenghua.yu@intel.com, jdelvare@suse.com, linux@roeck-us.net,
len.brown@intel.com, rui.zhang@intel.com
Subject: [PATCH V2 6/8] x86/topology: Fix max_siblings calculation
Date: Tue, 16 Aug 2022 13:16:31 +0800 [thread overview]
Message-ID: <20220816051633.17775-7-rui.zhang@intel.com> (raw)
In-Reply-To: <20220816051633.17775-1-rui.zhang@intel.com>
The max siblings value returned by CPUID.1F SMT level EBX differs among
CPUs on Intel Hybrid platforms like ADL-S/P.
It returns 2 for Pcore CPUs which have HT sibling and 1 for Ecore CPUs
which do not.
Today, CPUID SMT level EBX sets the global variable smp_num_siblings.
Thus, smp_num_siblings is overridden to different values based on the CPU
Pcore/Ecore enumeration order.
For example,
[ 0.201005] detect_extended_topology: CPU APICID 0x0, smp_num_siblings 2, x86_max_cores 10
[ 0.201117] start_kernel->check_bugs->cpu_smt_check_topology: smp_num_siblings 2
...
[ 0.010146] detect_extended_topology: CPU APICID 0x8, smp_num_siblings 2, x86_max_cores 10
...
[ 0.010146] detect_extended_topology: CPU APICID 0x39, smp_num_siblings 2, x86_max_cores 10
[ 0.010146] detect_extended_topology: CPU APICID 0x48, smp_num_siblings 1, x86_max_cores 20
...
[ 0.010146] detect_extended_topology: CPU APICID 0x4e, smp_num_siblings 1, x86_max_cores 20
[ 2.583800] sched_set_itmt_core_prio: smp_num_siblings 1
This inconsistency brings several potential issues:
1. some kernel configuration like cpu_smt_control, as set in
start_kernel()->check_bugs()->cpu_smt_check_topology(), depends on
smp_num_siblings set by cpu0.
It is pure luck that all the current hybrid platforms use Pcore as cpu0
and hide this problem.
2. some per CPU data like cpuinfo_x86.x86_max_cores that depends on
smp_num_siblings becomes inconsistent and bogus.
3. the final smp_num_siblings value after boot depends on the last CPU
enumerated, which could either be Pcore or Ecore CPU.
The solution is to use CPUID EAX bits_shift to get the maximum number of
addressable logical processors, and use this to determin max siblings.
Because:
1. the CPUID EAX bits_shift values are consistent among CPUs as far as
observed.
2. some code already uses smp_num_siblings value to isolate the SMT ID
bits in APIC-ID, like apic_id_is_primary_thread().
Suggested-and-reviewed-by: Len Brown <len.brown@intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
---
arch/x86/kernel/cpu/topology.c | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index 5e868b62a7c4..2a88f2fa5756 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -23,7 +23,12 @@
#define LEAFB_SUBTYPE(ecx) (((ecx) >> 8) & 0xff)
#define BITS_SHIFT_NEXT_LEVEL(eax) ((eax) & 0x1f)
-#define LEVEL_MAX_SIBLINGS(ebx) ((ebx) & 0xffff)
+
+/*
+ * Use EAX bit_shift to calculate the maximum number of addressable logical
+ * processors sharing the current level.
+ */
+#define LEVEL_MAX_SIBLINGS(eax) (1 << BITS_SHIFT_NEXT_LEVEL(eax))
unsigned int __max_die_per_package __read_mostly = 1;
EXPORT_SYMBOL(__max_die_per_package);
@@ -79,7 +84,7 @@ int detect_extended_topology_early(struct cpuinfo_x86 *c)
* initial apic id, which also represents 32-bit extended x2apic id.
*/
c->initial_apicid = edx;
- smp_num_siblings = LEVEL_MAX_SIBLINGS(ebx);
+ smp_num_siblings = LEVEL_MAX_SIBLINGS(eax);
#endif
return 0;
}
@@ -109,9 +114,9 @@ int detect_extended_topology(struct cpuinfo_x86 *c)
*/
cpuid_count(leaf, SMT_LEVEL, &eax, &ebx, &ecx, &edx);
c->initial_apicid = edx;
- core_level_siblings = smp_num_siblings = LEVEL_MAX_SIBLINGS(ebx);
+ core_level_siblings = smp_num_siblings = LEVEL_MAX_SIBLINGS(eax);
core_plus_mask_width = ht_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
- die_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
+ die_level_siblings = LEVEL_MAX_SIBLINGS(eax);
pkg_mask_width = die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
sub_index = 1;
@@ -122,14 +127,14 @@ int detect_extended_topology(struct cpuinfo_x86 *c)
* Check for the Core type in the implemented sub leaves.
*/
if (LEAFB_SUBTYPE(ecx) == CORE_TYPE) {
- core_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
+ core_level_siblings = LEVEL_MAX_SIBLINGS(eax);
core_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
die_level_siblings = core_level_siblings;
die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
}
if (LEAFB_SUBTYPE(ecx) == DIE_TYPE) {
die_level_present = true;
- die_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
+ die_level_siblings = LEVEL_MAX_SIBLINGS(eax);
die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
}
--
2.25.1
next prev parent reply other threads:[~2022-08-16 7:57 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-08-16 5:16 [PATCH V2 0/8] x86/topology: Improve CPUID.1F handling Zhang Rui
2022-08-16 5:16 ` [PATCH V2 1/8] perf/x86/intel/P4: Fix smp_num_siblings usage Zhang Rui
2022-08-16 8:27 ` Peter Zijlstra
2022-08-16 9:47 ` Zhang Rui
2022-08-16 10:08 ` Peter Zijlstra
2022-08-16 5:16 ` [PATCH V2 2/8] hwmon/coretemp: Rename indx to index Zhang Rui
2022-08-16 5:16 ` [PATCH V2 3/8] hwmon/coretemp: Handle large core ID value Zhang Rui
2022-08-16 11:16 ` Guenter Roeck
2022-08-16 5:16 ` [PATCH V2 4/8] x86/topology: Fix multiple packages shown on a single-package system Zhang Rui
2022-09-21 16:26 ` Dave Hansen
2022-09-22 13:38 ` Zhang Rui
2022-08-16 5:16 ` [PATCH V2 5/8] x86/topology: Fix duplicated core ID within a package Zhang Rui
2022-08-16 5:16 ` Zhang Rui [this message]
2022-08-16 5:16 ` [PATCH V2 7/8] Documentation: x86: Update smp_num_siblings/x86_max_cores description Zhang Rui
2022-08-16 5:16 ` [PATCH V2 8/8] Documentation: x86: Remove obsolete x86_max_dies description Zhang Rui
2022-09-20 1:57 ` [PATCH V2 0/8] x86/topology: Improve CPUID.1F handling Zhang Rui
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220816051633.17775-7-rui.zhang@intel.com \
--to=rui.zhang@intel.com \
--cc=bp@alien8.de \
--cc=corbet@lwn.net \
--cc=dave.hansen@linux.intel.com \
--cc=fenghua.yu@intel.com \
--cc=hpa@zytor.com \
--cc=jdelvare@suse.com \
--cc=len.brown@intel.com \
--cc=linux-hwmon@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux@roeck-us.net \
--cc=mingo@redhat.com \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).