* [RFC PATCH] x86/cpu: Do not check c->initialized in topology_phys_to_logical_die()
@ 2021-01-05 11:34 Borislav Petkov
From: Borislav Petkov @ 2021-01-05 11:34 UTC (permalink / raw)
  To: X86 ML; +Cc: Yazen Ghannam, LKML, Rafael Kitover, Johnathan Smithinovic

From: Borislav Petkov <bp@suse.de>

During boot, identify_secondary_cpu() at some point calls
validate_apic_and_package_id(), which in turn calls
topology_update_die_map() to update/verify the physical-to-logical die
map of the CPUs on the system.

There's a call down that path to topology_phys_to_logical_die() which
maps a physical die to a logical one. The check in there looks at
cpuinfo_x86.initialized first before comparing die_ids and proc_ids.

And this is where the problem lies: both ->cpu_die_id and ->phys_proc_id
have been initialized as part of the identify_secondary_cpu() dance -
just the cpuinfo_x86.initialized thing hasn't been set yet (it gets set
as the last thing in smp_store_cpu_info()).

So the fields which would be compared are already initialized but the
->initialized flag says they are not, so the comparison is skipped for
that CPU, leading to:

  smpboot: topology_phys_to_logical_die: init: 1, cpu 7, cur_cpu: 8, cpu_die_id: 0, die_id: 2, phys_proc_id: 0, proc_id: 0, logical_die_id: 0
  smpboot: topology_phys_to_logical_die: init: 0, cpu 8, cur_cpu: 8, cpu_die_id: 2, die_id: 2, phys_proc_id: 0, proc_id: 0, logical_die_id: 0
  ...
  smpboot: topology_phys_to_logical_die: init: 0, cpu 127, cur_cpu: 8, cpu_die_id: 0, die_id: 2, phys_proc_id: 0, proc_id: 0, logical_die_id: 0
  smpboot: CPU 8 Converting physical 2 to logical die 1

For CPU 8 and all the way up to the last of the possible CPUs, the
per-CPU cpuinfo_x86 is not marked as initialized yet (init: 0 above),
even though

  cpu_die_id == die_id &&
  phys_proc_id == proc_id

holds for CPU 8.

As a result, topology_update_die_map() increments logical_die which gets
written into cpuinfo_x86.logical_die_id of that CPU.
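
To make the two outcomes concrete, here is a minimal user-space sketch
of the lookup loop from the diff below. It is not kernel code: struct
cpu_model, NR_CPUS, model_setup() and the check_init switch are
hypothetical names made up for this illustration, the four-CPU layout
is a shrunken stand-in for the log above, and the -1 return for "no
match" is an assumption, since that part of the function is not shown
in the hunk.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical: tiny system, for illustration only */

/* Reduced stand-in for struct cpuinfo_x86: only the fields the lookup reads. */
struct cpu_model {
	bool initialized;
	unsigned int cpu_die_id;
	unsigned int phys_proc_id;
	int logical_die_id;
};

/*
 * Model of the lookup loop. check_init selects between the old condition
 * (which also tests ->initialized) and the proposed one (which does not).
 * Returns the stored logical die ID of the first match, or -1 (assumed)
 * when no CPU matches.
 */
static int model_phys_to_logical_die(const struct cpu_model *cpus,
				     unsigned int die_id, unsigned int cur_cpu,
				     bool check_init)
{
	unsigned int proc_id, cpu;

	assert(cur_cpu < NR_CPUS);
	proc_id = cpus[cur_cpu].phys_proc_id;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		const struct cpu_model *c = &cpus[cpu];

		if (check_init && !c->initialized)
			continue;

		if (c->cpu_die_id == die_id && c->phys_proc_id == proc_id)
			return c->logical_die_id;
	}

	return -1;
}

/*
 * Scenario shaped like the log: CPUs 0-1 are fully up on die 0, CPU 2 is
 * being brought up on die 2 with its IDs filled in but the flag still
 * clear, CPU 3 has not been touched yet.
 */
static void model_setup(struct cpu_model *cpus)
{
	cpus[0] = (struct cpu_model){ true,  0, 0, 0 };
	cpus[1] = (struct cpu_model){ true,  0, 0, 0 };
	cpus[2] = (struct cpu_model){ false, 2, 0, 0 };
	cpus[3] = (struct cpu_model){ false, 0, 0, 0 };
}
```

In this model, looking up die 2 for CPU 2 with check_init set finds no
match, which is the case where the caller goes on to hand out a new
logical die. With check_init cleared, CPU 2 matches its own entry and
the stored logical_die_id is returned instead.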

Later, in the RAPL code, that logical_die_id falls outside of the range
given by the maximum number of dies on the system:

  int maxdie = topology_max_packages() * topology_max_die_per_package();

which leads to indexing into the rapl_pmus->pmus[] array out of bounds.
Boom.
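
The bound can be sketched in user space as follows. This is only an
illustration under assumptions: model_die_index_in_range() is a
hypothetical helper, not a kernel function, and it assumes that pmus[]
holds exactly maxdie entries, so that valid indices are 0 to
maxdie - 1.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Model of the bound quoted above: with
 *   maxdie = max_packages * max_die_per_package
 * and an array assumed to hold maxdie entries, a logical die ID is a
 * valid index only if 0 <= logical_die_id < maxdie.
 */
static bool model_die_index_in_range(int logical_die_id,
				     int max_packages, int max_die_per_package)
{
	int maxdie;

	assert(max_packages > 0 && max_die_per_package > 0);
	maxdie = max_packages * max_die_per_package;

	return logical_die_id >= 0 && logical_die_id < maxdie;
}
```

Assuming, purely for illustration, one package with one die per
package, maxdie is 1 and only index 0 is valid, so the logical die 1
from the log above would already be out of range.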

Thus, drop the c->initialized check because the values it is supposed to
guard have actually been initialized already. (Yes, our boot order is
fragile. :-\).

Reported-by: Rafael Kitover <rkitover@gmail.com>
Reported-by: Johnathan Smithinovic <johnathan.smithinovic@gmx.at>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=210939
---
 arch/x86/kernel/smpboot.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 8ca66af96a54..56d2ac8c54ab 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -319,7 +319,7 @@ int topology_phys_to_logical_die(unsigned int die_id, unsigned int cur_cpu)
 	for_each_possible_cpu(cpu) {
 		struct cpuinfo_x86 *c = &cpu_data(cpu);
 
-		if (c->initialized && c->cpu_die_id == die_id &&
+		if (c->cpu_die_id == die_id &&
 		    c->phys_proc_id == proc_id)
 			return c->logical_die_id;
 	}
-- 
2.29.2


Thread overview: 5+ messages
2021-01-05 11:34 [RFC PATCH] x86/cpu: Do not check c->initialized in topology_phys_to_logical_die() Borislav Petkov
2021-01-06 11:21 ` Borislav Petkov
2021-01-12 11:29   ` [tip: x86/urgent] x86/cpu/amd: Set __max_die_per_package on AMD tip-bot2 for Yazen Ghannam
2021-01-10 14:54 ` [x86/cpu] 3756dbdf6f: WARNING:at_arch/x86/events/intel/uncore.c:#uncore_change_type_ctx[intel_uncore] kernel test robot
2021-01-10 14:54   ` kernel test robot