From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932367AbcECHuf (ORCPT ); Tue, 3 May 2016 03:50:35 -0400 Received: from terminus.zytor.com ([198.137.202.10]:37250 "EHLO terminus.zytor.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755506AbcECHu0 (ORCPT ); Tue, 3 May 2016 03:50:26 -0400 Date: Tue, 3 May 2016 00:49:15 -0700 From: tip-bot for Yazen Ghannam Message-ID: Cc: tony.luck@intel.com, peterz@infradead.org, bp@suse.de, tglx@linutronix.de, Yazen.Ghannam@amd.com, hpa@zytor.com, dvlasenk@redhat.com, mingo@kernel.org, linux-kernel@vger.kernel.org, torvalds@linux-foundation.org, brgerst@gmail.com, bp@alien8.de, luto@amacapital.net, linux-edac@vger.kernel.org Reply-To: tony.luck@intel.com, peterz@infradead.org, bp@suse.de, tglx@linutronix.de, Yazen.Ghannam@amd.com, hpa@zytor.com, dvlasenk@redhat.com, linux-kernel@vger.kernel.org, mingo@kernel.org, torvalds@linux-foundation.org, brgerst@gmail.com, bp@alien8.de, luto@amacapital.net, linux-edac@vger.kernel.org In-Reply-To: <1462019637-16474-8-git-send-email-bp@alien8.de> References: <1462019637-16474-8-git-send-email-bp@alien8.de> To: linux-tip-commits@vger.kernel.org Subject: [tip:ras/core] x86/mce: Detect local MCEs properly Git-Commit-ID: fead35c68926682c90c995f22b48f1c8d78865c1 X-Mailer: tip-git-log-daemon Robot-ID: Robot-Unsubscribe: Contact to get blacklisted from these emails MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset=UTF-8 Content-Disposition: inline Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Commit-ID: fead35c68926682c90c995f22b48f1c8d78865c1 Gitweb: http://git.kernel.org/tip/fead35c68926682c90c995f22b48f1c8d78865c1 Author: Yazen Ghannam AuthorDate: Sat, 30 Apr 2016 14:33:57 +0200 Committer: Ingo Molnar CommitDate: Tue, 3 May 2016 08:24:17 +0200 x86/mce: Detect local MCEs properly Check the MCG_STATUS_LMCES bit on Intel to verify that current MCE is local. It is always local on AMD. Signed-off-by: Yazen Ghannam [ Massaged it a bit. Reflowed comments. Shut up -Wmaybe-uninitialized. ] Signed-off-by: Borislav Petkov Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Denys Vlasenko Cc: H. Peter Anvin Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Tony Luck Cc: linux-edac Link: http://lkml.kernel.org/r/1462019637-16474-8-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar --- arch/x86/kernel/cpu/mcheck/mce.c | 33 ++++++++++++++++++++------------- 1 file changed, 20 insertions(+), 13 deletions(-) diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c index c356f47..aeda446 100644 --- a/arch/x86/kernel/cpu/mcheck/mce.c +++ b/arch/x86/kernel/cpu/mcheck/mce.c @@ -1038,11 +1038,12 @@ void do_machine_check(struct pt_regs *regs, long error_code) int i; int worst = 0; int severity; + /* * Establish sequential order between the CPUs entering the machine * check handler. */ - int order; + int order = -1; /* * If no_way_out gets set, there is no safe way to recover from this * MCE. If mca_cfg.tolerant is cranked up, we'll try anyway. @@ -1056,7 +1057,12 @@ void do_machine_check(struct pt_regs *regs, long error_code) DECLARE_BITMAP(toclear, MAX_NR_BANKS); DECLARE_BITMAP(valid_banks, MAX_NR_BANKS); char *msg = "Unknown"; - int lmce = 0; + + /* + * MCEs are always local on AMD. Same is determined by MCG_STATUS_LMCES + * on Intel. + */ + int lmce = 1; /* If this CPU is offline, just bail out. */ if (cpu_is_offline(smp_processor_id())) { @@ -1095,19 +1101,20 @@ void do_machine_check(struct pt_regs *regs, long error_code) kill_it = 1; /* - * Check if this MCE is signaled to only this logical processor + * Check if this MCE is signaled to only this logical processor, + * on Intel only. */ - if (m.mcgstatus & MCG_STATUS_LMCES) - lmce = 1; - else { - /* - * Go through all the banks in exclusion of the other CPUs. - * This way we don't report duplicated events on shared banks - * because the first one to see it will clear it. - * If this is a Local MCE, then no need to perform rendezvous. - */ + if (m.cpuvendor == X86_VENDOR_INTEL) + lmce = m.mcgstatus & MCG_STATUS_LMCES; + + /* + * Go through all banks in exclusion of the other CPUs. This way we + * don't report duplicated events on shared banks because the first one + * to see it will clear it. If this is a Local MCE, then no need to + * perform rendezvous. + */ + if (!lmce) order = mce_start(&no_way_out); - } for (i = 0; i < cfg->banks; i++) { __clear_bit(i, toclear);