linux-kernel.vger.kernel.org archive mirror
* [PATCH] x86/mce/AMD: Fix partial SMCA bank init when CPU 0 != thread 0
@ 2017-06-28  0:06 Jack Miller
  2017-06-28  9:22 ` Borislav Petkov
  2017-06-29 18:08 ` [PATCH] x86/mce/AMD: Allow any CPU to initialize smca_banks array Yazen Ghannam
  0 siblings, 2 replies; 14+ messages in thread
From: Jack Miller @ 2017-06-28  0:06 UTC
  To: linux-kernel; +Cc: tglx, bp, Yazen.Ghannam, x86

After a call to firmware SwitchBSP(), Linux can be booted with a thread
that isn't the first in the system. That thread automatically becomes
CPU 0.

Currently get_smca_bank_info() queries CPU 0's MCA types, but if CPU 0
!= hardware thread 0, it will get an incomplete list of MCA types in
smca_banks.

This causes get_name() to return NULL when initializing hardware thread
0's additional types, which in turn produces the following error when
creating the bank kobject in threshold_create_bank():

[    1.171552] kobject: can not set name properly!
[    1.171569] kobject_create_and_add: kobject_add error: -12

This error path isn't correctly handled. threshold_init_device() fails,
but if a thread is later offlined, threshold_remove_bank() causes a BUG:

[   67.491772] BUG: unable to handle kernel NULL pointer dereference at           (null)
[   67.491781] IP: mce_threshold_remove_device.part.7+0x82/0x2c0

because per_cpu(threshold_banks, cpu) is unexpectedly NULL.

This patch fixes get_smca_bank_info() to query hardware thread 0, not
necessarily CPU 0, to get a full set of MCA types.

I'm not certain that reading the APIC ID is correct here: this will
fail on any AMD hardware where hardware thread 0's APIC ID != 0.
However, the other topology/CPUID based functions don't seem to easily
differentiate CPU 0 from thread 0, or possibly aren't initialized at
this point. Suggestions for a better mechanism are welcome.

Signed-off-by: Jack Miller <jack@codezen.org>
---
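For illustration only, here is a minimal userspace sketch of the failure
mode described above: a lookup table filled from one thread's view of
the banks returns NULL for a type that only another thread exposes. All
names here (bank_type, probe(), lookup_name(), and the type list) are
hypothetical stand-ins, not the kernel's real structures or the actual
SMCA type set.

```c
#include <stddef.h>

/* Hypothetical bank types; not the real SMCA type list. */
enum bank_type { TYPE_LS, TYPE_IF, TYPE_UMC, N_TYPES };

static const char *const type_names[N_TYPES] = {
	"load_store", "insn_fetch", "umc",
};

/* Table filled from a single thread's view of its banks. */
static int seen[N_TYPES];

/* Record the types visible on the thread that did the probing. */
static void probe(const enum bank_type *types, size_t n)
{
	for (size_t i = 0; i < n; i++)
		seen[types[i]] = 1;
}

/* Model of a name lookup: NULL when the type was never recorded. */
static const char *lookup_name(enum bank_type t)
{
	return seen[t] ? type_names[t] : NULL;
}
```

If probe() is only ever called with the boot thread's types, say
{ TYPE_LS, TYPE_IF }, then lookup_name(TYPE_UMC) returns NULL, which is
the analogue of get_name() returning NULL for thread 0's extra types.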
 arch/x86/kernel/cpu/mcheck/mce_amd.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index 6e4a047e4b68..9d74adcf34d2 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -170,8 +170,8 @@ static void get_smca_bank_info(unsigned int bank)
 	struct smca_hwid *s_hwid;
 	u32 high, instance_id;
 
-	/* Collect bank_info using CPU 0 for now. */
-	if (cpu)
+	/* Collect bank_info using hardware thread 0 for now. */
+	if (apic->get_apic_id(apic->read(APIC_ID)) != 0)
 		return;
 
 	if (rdmsr_safe_on_cpu(cpu, MSR_AMD64_SMCA_MCx_IPID(bank), &instance_id, &high)) {
-- 
2.13.2

Thread overview: 14+ messages
2017-06-28  0:06 [PATCH] x86/mce/AMD: Fix partial SMCA bank init when CPU 0 != thread 0 Jack Miller
2017-06-28  9:22 ` Borislav Petkov
2017-06-28 17:44   ` Jack Miller
2017-06-28 18:00     ` Ghannam, Yazen
2017-06-28 18:53       ` Jack Miller
2017-06-28 18:58         ` Ghannam, Yazen
2017-06-29 16:22           ` Jack Miller
2017-06-29 17:58             ` Ghannam, Yazen
2017-06-28 18:16     ` Borislav Petkov
2017-06-28 18:51       ` Ghannam, Yazen
2017-06-28 18:55         ` Borislav Petkov
2017-06-29 18:08 ` [PATCH] x86/mce/AMD: Allow any CPU to initialize smca_banks array Yazen Ghannam
2017-06-30 15:57   ` Jack Miller
2017-07-17  5:19   ` Borislav Petkov
