* [PATCH 1/9] x86/mce-inject: Make it depend on X86_LOCAL_APIC
2017-01-23 18:35 [PATCH 0/9] x86/RAS: Queue for 4.11 Borislav Petkov
@ 2017-01-23 18:35 ` Borislav Petkov
2017-01-24 8:46 ` [tip:ras/core] x86/ras/inject: Make it depend on X86_LOCAL_APIC=y tip-bot for Borislav Petkov
2017-01-23 18:35 ` [PATCH 2/9] x86/MCE/therm_throt: Do not log a fake MCE for a thermal event Borislav Petkov
` (7 subsequent siblings)
8 siblings, 1 reply; 37+ messages in thread
From: Borislav Petkov @ 2017-01-23 18:35 UTC (permalink / raw)
To: X86 ML; +Cc: Tony Luck, Yazen Ghannam, linux-edac, LKML
From: Borislav Petkov <bp@suse.de>
... and get rid of the annoying:
arch/x86/kernel/cpu/mcheck/mce-inject.c:97:13: warning: ‘mce_irq_ipi’
defined but not used [-Wunused-function]
static void mce_irq_ipi(void *info
when doing randconfig builds.
Signed-off-by: Borislav Petkov <bp@suse.de>
---
arch/x86/Kconfig | 2 +-
arch/x86/kernel/cpu/mcheck/mce-inject.c | 5 +----
2 files changed, 2 insertions(+), 5 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index e487493bbd47..7b6fd68b4715 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1070,7 +1070,7 @@ config X86_MCE_THRESHOLD
def_bool y
config X86_MCE_INJECT
- depends on X86_MCE
+ depends on X86_MCE && X86_LOCAL_APIC
tristate "Machine check injector support"
---help---
Provide support for injecting machine checks for testing purposes.
diff --git a/arch/x86/kernel/cpu/mcheck/mce-inject.c b/arch/x86/kernel/cpu/mcheck/mce-inject.c
index 517619ea6498..99165b206df3 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-inject.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-inject.c
@@ -152,7 +152,6 @@ static void raise_mce(struct mce *m)
if (context == MCJ_CTX_RANDOM)
return;
-#ifdef CONFIG_X86_LOCAL_APIC
if (m->inject_flags & (MCJ_IRQ_BROADCAST | MCJ_NMI_BROADCAST)) {
unsigned long start;
int cpu;
@@ -192,9 +191,7 @@ static void raise_mce(struct mce *m)
raise_local();
put_cpu();
put_online_cpus();
- } else
-#endif
- {
+ } else {
preempt_disable();
raise_local();
preempt_enable();
--
2.11.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [tip:ras/core] x86/ras/inject: Make it depend on X86_LOCAL_APIC=y
2017-01-23 18:35 ` [PATCH 1/9] x86/mce-inject: Make it depend on X86_LOCAL_APIC Borislav Petkov
@ 2017-01-24 8:46 ` tip-bot for Borislav Petkov
0 siblings, 0 replies; 37+ messages in thread
From: tip-bot for Borislav Petkov @ 2017-01-24 8:46 UTC (permalink / raw)
To: linux-tip-commits
Cc: bp, hpa, linux-kernel, peterz, torvalds, tglx, mingo,
Yazen.Ghannam, linux-edac, tony.luck
Commit-ID: d4b2ac63b0eae461fc10c9791084be24724ef57a
Gitweb: http://git.kernel.org/tip/d4b2ac63b0eae461fc10c9791084be24724ef57a
Author: Borislav Petkov <bp@suse.de>
AuthorDate: Mon, 23 Jan 2017 19:35:06 +0100
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 24 Jan 2017 09:14:52 +0100
x86/ras/inject: Make it depend on X86_LOCAL_APIC=y
... and get rid of the annoying:
arch/x86/kernel/cpu/mcheck/mce-inject.c:97:13: warning: ‘mce_irq_ipi’ defined but not used [-Wunused-function]
when doing randconfig builds.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170123183514.13356-2-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/Kconfig | 2 +-
arch/x86/kernel/cpu/mcheck/mce-inject.c | 5 +----
2 files changed, 2 insertions(+), 5 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index e487493..7b6fd68 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1070,7 +1070,7 @@ config X86_MCE_THRESHOLD
def_bool y
config X86_MCE_INJECT
- depends on X86_MCE
+ depends on X86_MCE && X86_LOCAL_APIC
tristate "Machine check injector support"
---help---
Provide support for injecting machine checks for testing purposes.
diff --git a/arch/x86/kernel/cpu/mcheck/mce-inject.c b/arch/x86/kernel/cpu/mcheck/mce-inject.c
index 517619e..99165b2 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-inject.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-inject.c
@@ -152,7 +152,6 @@ static void raise_mce(struct mce *m)
if (context == MCJ_CTX_RANDOM)
return;
-#ifdef CONFIG_X86_LOCAL_APIC
if (m->inject_flags & (MCJ_IRQ_BROADCAST | MCJ_NMI_BROADCAST)) {
unsigned long start;
int cpu;
@@ -192,9 +191,7 @@ static void raise_mce(struct mce *m)
raise_local();
put_cpu();
put_online_cpus();
- } else
-#endif
- {
+ } else {
preempt_disable();
raise_local();
preempt_enable();
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PATCH 2/9] x86/MCE/therm_throt: Do not log a fake MCE for a thermal event
2017-01-23 18:35 [PATCH 0/9] x86/RAS: Queue for 4.11 Borislav Petkov
2017-01-23 18:35 ` [PATCH 1/9] x86/mce-inject: Make it depend on X86_LOCAL_APIC Borislav Petkov
@ 2017-01-23 18:35 ` Borislav Petkov
2017-01-24 8:47 ` [tip:ras/core] x86/ras/therm_throt: Do not log a fake MCE for thermal events tip-bot for Borislav Petkov
2017-01-23 18:35 ` [PATCH 3/9] x86/MCE/AMD: Make sysfs names of banks more user-friendly Borislav Petkov
` (6 subsequent siblings)
8 siblings, 1 reply; 37+ messages in thread
From: Borislav Petkov @ 2017-01-23 18:35 UTC (permalink / raw)
To: X86 ML; +Cc: Tony Luck, Yazen Ghannam, linux-edac, LKML
From: Borislav Petkov <bp@suse.de>
We log a fake bank 128 MCE to note that we're handling a CPU thermal
event. However, this confuses people into thinking that their hardware
generates MCEs. Hijacking MCA for logging thermal events is a gross
misuse anyway and it shouldn't have been done in the first place. And
besides we have other means for dealing with thermal events which are
much more suitable.
So let's kill the MCE logging part.
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Ashok Raj <ashok.raj@intel.com>
Link: http://lkml.kernel.org/r/20170105213846.GA12024@gmail.com
---
arch/x86/include/asm/mce.h | 6 ------
arch/x86/kernel/cpu/mcheck/mce.c | 25 -------------------------
arch/x86/kernel/cpu/mcheck/therm_throt.c | 30 +++++++++++-------------------
3 files changed, 11 insertions(+), 50 deletions(-)
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 5132f2a6c0a2..a09ed05725c2 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -97,10 +97,6 @@
#define MCE_OVERFLOW 0 /* bit 0 in flags means overflow */
-/* Software defined banks */
-#define MCE_EXTENDED_BANK 128
-#define MCE_THERMAL_BANK (MCE_EXTENDED_BANK + 0)
-
#define MCE_LOG_LEN 32
#define MCE_LOG_SIGNATURE "MACHINECHECK"
@@ -306,8 +302,6 @@ extern void (*deferred_error_int_vector)(void);
void intel_init_thermal(struct cpuinfo_x86 *c);
-void mce_log_therm_throt_event(__u64 status);
-
/* Interrupt Handler for core thermal thresholds */
extern int (*platform_thermal_notify)(__u64 msr_val);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 00ef43233e03..6eef6fde0f02 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1331,31 +1331,6 @@ static void mce_process_work(struct work_struct *dummy)
mce_gen_pool_process();
}
-#ifdef CONFIG_X86_MCE_INTEL
-/***
- * mce_log_therm_throt_event - Logs the thermal throttling event to mcelog
- * @cpu: The CPU on which the event occurred.
- * @status: Event status information
- *
- * This function should be called by the thermal interrupt after the
- * event has been processed and the decision was made to log the event
- * further.
- *
- * The status parameter will be saved to the 'status' field of 'struct mce'
- * and historically has been the register value of the
- * MSR_IA32_THERMAL_STATUS (Intel) msr.
- */
-void mce_log_therm_throt_event(__u64 status)
-{
- struct mce m;
-
- mce_setup(&m);
- m.bank = MCE_THERMAL_BANK;
- m.status = status;
- mce_log(&m);
-}
-#endif /* CONFIG_X86_MCE_INTEL */
-
/*
* Periodic polling timer for "silent" machine check errors. If the
* poller finds an MCE, poll 2x faster. When the poller finds no more
diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c
index 465aca8be009..85469f84c921 100644
--- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
+++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
@@ -6,7 +6,7 @@
*
* Maintains a counter in /sys that keeps track of the number of thermal
* events, such that the user knows how bad the thermal problem might be
- * (since the logging to syslog and mcelog is rate limited).
+ * (since the logging to syslog is rate limited).
*
* Author: Dmitriy Zavin (dmitriyz@google.com)
*
@@ -141,13 +141,8 @@ static struct attribute_group thermal_attr_group = {
* IRQ has been acknowledged.
*
* It will take care of rate limiting and printing messages to the syslog.
- *
- * Returns: 0 : Event should NOT be further logged, i.e. still in
- * "timeout" from previous log message.
- * 1 : Event should be logged further, and a message has been
- * printed to the syslog.
*/
-static int therm_throt_process(bool new_event, int event, int level)
+static void therm_throt_process(bool new_event, int event, int level)
{
struct _thermal_state *state;
unsigned int this_cpu = smp_processor_id();
@@ -162,16 +157,16 @@ static int therm_throt_process(bool new_event, int event, int level)
else if (event == POWER_LIMIT_EVENT)
state = &pstate->core_power_limit;
else
- return 0;
+ return;
} else if (level == PACKAGE_LEVEL) {
if (event == THERMAL_THROTTLING_EVENT)
state = &pstate->package_throttle;
else if (event == POWER_LIMIT_EVENT)
state = &pstate->package_power_limit;
else
- return 0;
+ return;
} else
- return 0;
+ return;
old_event = state->new_event;
state->new_event = new_event;
@@ -181,7 +176,7 @@ static int therm_throt_process(bool new_event, int event, int level)
if (time_before64(now, state->next_check) &&
state->count != state->last_count)
- return 0;
+ return;
state->next_check = now + CHECK_INTERVAL;
state->last_count = state->count;
@@ -193,16 +188,14 @@ static int therm_throt_process(bool new_event, int event, int level)
this_cpu,
level == CORE_LEVEL ? "Core" : "Package",
state->count);
- return 1;
+ return;
}
if (old_event) {
if (event == THERMAL_THROTTLING_EVENT)
pr_info("CPU%d: %s temperature/speed normal\n", this_cpu,
level == CORE_LEVEL ? "Core" : "Package");
- return 1;
+ return;
}
-
- return 0;
}
static int thresh_event_valid(int level, int event)
@@ -365,10 +358,9 @@ static void intel_thermal_interrupt(void)
/* Check for violation of core thermal thresholds*/
notify_thresholds(msr_val);
- if (therm_throt_process(msr_val & THERM_STATUS_PROCHOT,
- THERMAL_THROTTLING_EVENT,
- CORE_LEVEL) != 0)
- mce_log_therm_throt_event(msr_val);
+ therm_throt_process(msr_val & THERM_STATUS_PROCHOT,
+ THERMAL_THROTTLING_EVENT,
+ CORE_LEVEL);
if (this_cpu_has(X86_FEATURE_PLN) && int_pln_enable)
therm_throt_process(msr_val & THERM_STATUS_POWER_LIMIT,
--
2.11.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [tip:ras/core] x86/ras/therm_throt: Do not log a fake MCE for thermal events
2017-01-23 18:35 ` [PATCH 2/9] x86/MCE/therm_throt: Do not log a fake MCE for a thermal event Borislav Petkov
@ 2017-01-24 8:47 ` tip-bot for Borislav Petkov
0 siblings, 0 replies; 37+ messages in thread
From: tip-bot for Borislav Petkov @ 2017-01-24 8:47 UTC (permalink / raw)
To: linux-tip-commits
Cc: Yazen.Ghannam, torvalds, mingo, tglx, bp, linux-edac, ashok.raj,
hpa, tony.luck, peterz, linux-kernel
Commit-ID: 9b052ea4ced0fa1ad30a2eafe86984a16297e6f1
Gitweb: http://git.kernel.org/tip/9b052ea4ced0fa1ad30a2eafe86984a16297e6f1
Author: Borislav Petkov <bp@suse.de>
AuthorDate: Mon, 23 Jan 2017 19:35:07 +0100
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 24 Jan 2017 09:14:53 +0100
x86/ras/therm_throt: Do not log a fake MCE for thermal events
We log a fake bank 128 MCE to note that we're handling a CPU thermal
event. However, this confuses people into thinking that their hardware
generates MCEs. Hijacking MCA for logging thermal events is a gross
misuse anyway and it shouldn't have been done in the first place. And
besides we have other means for dealing with thermal events which are
much more suitable.
So let's kill the MCE logging part.
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Ashok Raj <ashok.raj@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170105213846.GA12024@gmail.com
Link: http://lkml.kernel.org/r/20170123183514.13356-3-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/include/asm/mce.h | 6 ------
arch/x86/kernel/cpu/mcheck/mce.c | 25 -------------------------
arch/x86/kernel/cpu/mcheck/therm_throt.c | 30 +++++++++++-------------------
3 files changed, 11 insertions(+), 50 deletions(-)
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 5132f2a..a09ed05 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -97,10 +97,6 @@
#define MCE_OVERFLOW 0 /* bit 0 in flags means overflow */
-/* Software defined banks */
-#define MCE_EXTENDED_BANK 128
-#define MCE_THERMAL_BANK (MCE_EXTENDED_BANK + 0)
-
#define MCE_LOG_LEN 32
#define MCE_LOG_SIGNATURE "MACHINECHECK"
@@ -306,8 +302,6 @@ extern void (*deferred_error_int_vector)(void);
void intel_init_thermal(struct cpuinfo_x86 *c);
-void mce_log_therm_throt_event(__u64 status);
-
/* Interrupt Handler for core thermal thresholds */
extern int (*platform_thermal_notify)(__u64 msr_val);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 00ef432..6eef6fd 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1331,31 +1331,6 @@ static void mce_process_work(struct work_struct *dummy)
mce_gen_pool_process();
}
-#ifdef CONFIG_X86_MCE_INTEL
-/***
- * mce_log_therm_throt_event - Logs the thermal throttling event to mcelog
- * @cpu: The CPU on which the event occurred.
- * @status: Event status information
- *
- * This function should be called by the thermal interrupt after the
- * event has been processed and the decision was made to log the event
- * further.
- *
- * The status parameter will be saved to the 'status' field of 'struct mce'
- * and historically has been the register value of the
- * MSR_IA32_THERMAL_STATUS (Intel) msr.
- */
-void mce_log_therm_throt_event(__u64 status)
-{
- struct mce m;
-
- mce_setup(&m);
- m.bank = MCE_THERMAL_BANK;
- m.status = status;
- mce_log(&m);
-}
-#endif /* CONFIG_X86_MCE_INTEL */
-
/*
* Periodic polling timer for "silent" machine check errors. If the
* poller finds an MCE, poll 2x faster. When the poller finds no more
diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c
index 465aca8..85469f8 100644
--- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
+++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
@@ -6,7 +6,7 @@
*
* Maintains a counter in /sys that keeps track of the number of thermal
* events, such that the user knows how bad the thermal problem might be
- * (since the logging to syslog and mcelog is rate limited).
+ * (since the logging to syslog is rate limited).
*
* Author: Dmitriy Zavin (dmitriyz@google.com)
*
@@ -141,13 +141,8 @@ static struct attribute_group thermal_attr_group = {
* IRQ has been acknowledged.
*
* It will take care of rate limiting and printing messages to the syslog.
- *
- * Returns: 0 : Event should NOT be further logged, i.e. still in
- * "timeout" from previous log message.
- * 1 : Event should be logged further, and a message has been
- * printed to the syslog.
*/
-static int therm_throt_process(bool new_event, int event, int level)
+static void therm_throt_process(bool new_event, int event, int level)
{
struct _thermal_state *state;
unsigned int this_cpu = smp_processor_id();
@@ -162,16 +157,16 @@ static int therm_throt_process(bool new_event, int event, int level)
else if (event == POWER_LIMIT_EVENT)
state = &pstate->core_power_limit;
else
- return 0;
+ return;
} else if (level == PACKAGE_LEVEL) {
if (event == THERMAL_THROTTLING_EVENT)
state = &pstate->package_throttle;
else if (event == POWER_LIMIT_EVENT)
state = &pstate->package_power_limit;
else
- return 0;
+ return;
} else
- return 0;
+ return;
old_event = state->new_event;
state->new_event = new_event;
@@ -181,7 +176,7 @@ static int therm_throt_process(bool new_event, int event, int level)
if (time_before64(now, state->next_check) &&
state->count != state->last_count)
- return 0;
+ return;
state->next_check = now + CHECK_INTERVAL;
state->last_count = state->count;
@@ -193,16 +188,14 @@ static int therm_throt_process(bool new_event, int event, int level)
this_cpu,
level == CORE_LEVEL ? "Core" : "Package",
state->count);
- return 1;
+ return;
}
if (old_event) {
if (event == THERMAL_THROTTLING_EVENT)
pr_info("CPU%d: %s temperature/speed normal\n", this_cpu,
level == CORE_LEVEL ? "Core" : "Package");
- return 1;
+ return;
}
-
- return 0;
}
static int thresh_event_valid(int level, int event)
@@ -365,10 +358,9 @@ static void intel_thermal_interrupt(void)
/* Check for violation of core thermal thresholds*/
notify_thresholds(msr_val);
- if (therm_throt_process(msr_val & THERM_STATUS_PROCHOT,
- THERMAL_THROTTLING_EVENT,
- CORE_LEVEL) != 0)
- mce_log_therm_throt_event(msr_val);
+ therm_throt_process(msr_val & THERM_STATUS_PROCHOT,
+ THERMAL_THROTTLING_EVENT,
+ CORE_LEVEL);
if (this_cpu_has(X86_FEATURE_PLN) && int_pln_enable)
therm_throt_process(msr_val & THERM_STATUS_POWER_LIMIT,
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PATCH 3/9] x86/MCE/AMD: Make sysfs names of banks more user-friendly
2017-01-23 18:35 [PATCH 0/9] x86/RAS: Queue for 4.11 Borislav Petkov
2017-01-23 18:35 ` [PATCH 1/9] x86/mce-inject: Make it depend on X86_LOCAL_APIC Borislav Petkov
2017-01-23 18:35 ` [PATCH 2/9] x86/MCE/therm_throt: Do not log a fake MCE for a thermal event Borislav Petkov
@ 2017-01-23 18:35 ` Borislav Petkov
2017-01-24 8:47 ` [tip:ras/core] x86/ras/amd: " tip-bot for Yazen Ghannam
2017-01-23 18:35 ` [PATCH 4/9] x86/MCE: Flip the TSC-adding logic Borislav Petkov
` (5 subsequent siblings)
8 siblings, 1 reply; 37+ messages in thread
From: Borislav Petkov @ 2017-01-23 18:35 UTC (permalink / raw)
To: X86 ML; +Cc: Tony Luck, Yazen Ghannam, linux-edac, LKML
From: Yazen Ghannam <Yazen.Ghannam@amd.com>
Currently, we append the MCA_IPID[InstanceId] to the bank name to create
the sysfs filename. The InstanceId field uniquely identifies a bank
instance but it doesn't look very nice for most banks.
Replace the InstanceId with a simpler, ascending (0, 1, ..) value.
Only use this in the sysfs name when there is more than 1 instance.
Otherwise, just use the bank's name as the sysfs name.
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/1484322741-41884-3-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
---
arch/x86/include/asm/mce.h | 5 +++--
arch/x86/kernel/cpu/mcheck/mce_amd.c | 6 +++++-
2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index a09ed05725c2..528f6ec897cb 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -356,12 +356,13 @@ struct smca_hwid {
unsigned int bank_type; /* Use with smca_bank_types for easy indexing. */
u32 hwid_mcatype; /* (hwid,mcatype) tuple */
u32 xec_bitmap; /* Bitmap of valid ExtErrorCodes; current max is 21. */
+ u8 count; /* Number of instances. */
};
struct smca_bank {
struct smca_hwid *hwid;
- /* Instance ID */
- u32 id;
+ u32 id; /* Value of MCA_IPID[InstanceId]. */
+ u8 sysfs_id; /* Value used for sysfs name. */
};
extern struct smca_bank smca_banks[MAX_NR_BANKS];
diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index a5fd137417a2..776379e4a39c 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -192,6 +192,7 @@ static void get_smca_bank_info(unsigned int bank)
smca_banks[bank].hwid = s_hwid;
smca_banks[bank].id = instance_id;
+ smca_banks[bank].sysfs_id = s_hwid->count++;
break;
}
}
@@ -1064,9 +1065,12 @@ static const char *get_name(unsigned int bank, struct threshold_block *b)
return NULL;
}
+ if (smca_banks[bank].hwid->count == 1)
+ return smca_get_name(bank_type);
+
snprintf(buf_mcatype, MAX_MCATYPE_NAME_LEN,
"%s_%x", smca_get_name(bank_type),
- smca_banks[bank].id);
+ smca_banks[bank].sysfs_id);
return buf_mcatype;
}
--
2.11.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [tip:ras/core] x86/ras/amd: Make sysfs names of banks more user-friendly
2017-01-23 18:35 ` [PATCH 3/9] x86/MCE/AMD: Make sysfs names of banks more user-friendly Borislav Petkov
@ 2017-01-24 8:47 ` tip-bot for Yazen Ghannam
0 siblings, 0 replies; 37+ messages in thread
From: tip-bot for Yazen Ghannam @ 2017-01-24 8:47 UTC (permalink / raw)
To: linux-tip-commits
Cc: tony.luck, Yazen.Ghannam, torvalds, peterz, mingo, linux-edac,
bp, linux-kernel, tglx, hpa
Commit-ID: 0b737a9c2af85cc8295f9308d9250f9111bbf94d
Gitweb: http://git.kernel.org/tip/0b737a9c2af85cc8295f9308d9250f9111bbf94d
Author: Yazen Ghannam <Yazen.Ghannam@amd.com>
AuthorDate: Mon, 23 Jan 2017 19:35:08 +0100
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 24 Jan 2017 09:14:53 +0100
x86/ras/amd: Make sysfs names of banks more user-friendly
Currently, we append the MCA_IPID[InstanceId] to the bank name to create
the sysfs filename. The InstanceId field uniquely identifies a bank
instance but it doesn't look very nice for most banks.
Replace the InstanceId with a simpler, ascending (0, 1, ..) value.
Only use this in the sysfs name when there is more than 1 instance.
Otherwise, just use the bank's name as the sysfs name.
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1484322741-41884-3-git-send-email-Yazen.Ghannam@amd.com
Link: http://lkml.kernel.org/r/20170123183514.13356-4-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/include/asm/mce.h | 5 +++--
arch/x86/kernel/cpu/mcheck/mce_amd.c | 6 +++++-
2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index a09ed05..528f6ec 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -356,12 +356,13 @@ struct smca_hwid {
unsigned int bank_type; /* Use with smca_bank_types for easy indexing. */
u32 hwid_mcatype; /* (hwid,mcatype) tuple */
u32 xec_bitmap; /* Bitmap of valid ExtErrorCodes; current max is 21. */
+ u8 count; /* Number of instances. */
};
struct smca_bank {
struct smca_hwid *hwid;
- /* Instance ID */
- u32 id;
+ u32 id; /* Value of MCA_IPID[InstanceId]. */
+ u8 sysfs_id; /* Value used for sysfs name. */
};
extern struct smca_bank smca_banks[MAX_NR_BANKS];
diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index a5fd137..776379e 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -192,6 +192,7 @@ static void get_smca_bank_info(unsigned int bank)
smca_banks[bank].hwid = s_hwid;
smca_banks[bank].id = instance_id;
+ smca_banks[bank].sysfs_id = s_hwid->count++;
break;
}
}
@@ -1064,9 +1065,12 @@ static const char *get_name(unsigned int bank, struct threshold_block *b)
return NULL;
}
+ if (smca_banks[bank].hwid->count == 1)
+ return smca_get_name(bank_type);
+
snprintf(buf_mcatype, MAX_MCATYPE_NAME_LEN,
"%s_%x", smca_get_name(bank_type),
- smca_banks[bank].id);
+ smca_banks[bank].sysfs_id);
return buf_mcatype;
}
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PATCH 4/9] x86/MCE: Flip the TSC-adding logic
2017-01-23 18:35 [PATCH 0/9] x86/RAS: Queue for 4.11 Borislav Petkov
` (2 preceding siblings ...)
2017-01-23 18:35 ` [PATCH 3/9] x86/MCE/AMD: Make sysfs names of banks more user-friendly Borislav Petkov
@ 2017-01-23 18:35 ` Borislav Petkov
2017-01-24 8:48 ` [tip:ras/core] x86/ras: " tip-bot for Borislav Petkov
2017-01-23 18:35 ` [PATCH 5/9] x86/ras/mce_amd_inj: Change dependency Borislav Petkov
` (4 subsequent siblings)
8 siblings, 1 reply; 37+ messages in thread
From: Borislav Petkov @ 2017-01-23 18:35 UTC (permalink / raw)
To: X86 ML; +Cc: Tony Luck, Yazen Ghannam, linux-edac, LKML
From: Borislav Petkov <bp@suse.de>
Add the TSC value to the MCE record only when the MCE being logged is
precise, i.e., it is logged as an exception or an MCE-related interrupt.
So it doesn't look particularly easy to do without touching/changing a
bunch of places. That's why I'm trying tricks first.
For example, the mce-apei.c case I'm addressing by setting ->tsc only
for errors of panic severity. The idea there is, that, panic errors will
have raised an #MC and not polled.
And then instead of propagating a flag to mce_setup(), it seems
easier/less code to set ->tsc depending on the call sites, i.e.,
are we polling or are we preparing an MCE record in an exception
handler/thresholding interrupt.
Signed-off-by: Borislav Petkov <bp@suse.de>
---
arch/x86/kernel/cpu/mcheck/mce-apei.c | 5 ++++-
arch/x86/kernel/cpu/mcheck/mce.c | 12 +++---------
arch/x86/kernel/cpu/mcheck/mce_amd.c | 3 ++-
3 files changed, 9 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kernel/cpu/mcheck/mce-apei.c b/arch/x86/kernel/cpu/mcheck/mce-apei.c
index 83f1a98d37db..2eee85379689 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-apei.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-apei.c
@@ -52,8 +52,11 @@ void apei_mce_report_mem_error(int severity, struct cper_sec_mem_err *mem_err)
if (severity >= GHES_SEV_RECOVERABLE)
m.status |= MCI_STATUS_UC;
- if (severity >= GHES_SEV_PANIC)
+
+ if (severity >= GHES_SEV_PANIC) {
m.status |= MCI_STATUS_PCC;
+ m.tsc = rdtsc();
+ }
m.addr = mem_err->physical_addr;
mce_log(&m);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 6eef6fde0f02..ca15a7e1f97d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -128,7 +128,6 @@ void mce_setup(struct mce *m)
{
memset(m, 0, sizeof(struct mce));
m->cpu = m->extcpu = smp_processor_id();
- m->tsc = rdtsc();
/* We hope get_seconds stays lockless */
m->time = get_seconds();
m->cpuvendor = boot_cpu_data.x86_vendor;
@@ -710,14 +709,8 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
mce_gather_info(&m, NULL);
- /*
- * m.tsc was set in mce_setup(). Clear it if not requested.
- *
- * FIXME: Propagate @flags to mce_gather_info/mce_setup() to avoid
- * that dance.
- */
- if (!(flags & MCP_TIMESTAMP))
- m.tsc = 0;
+ if (flags & MCP_TIMESTAMP)
+ m.tsc = rdtsc();
for (i = 0; i < mca_cfg.banks; i++) {
if (!mce_banks[i].ctl || !test_bit(i, *b))
@@ -1156,6 +1149,7 @@ void do_machine_check(struct pt_regs *regs, long error_code)
goto out;
mce_gather_info(&m, regs);
+ m.tsc = rdtsc();
final = this_cpu_ptr(&mces_seen);
*final = m;
diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index 776379e4a39c..9e5427df3243 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -778,7 +778,8 @@ __log_error(unsigned int bank, bool deferred_err, bool threshold_err, u64 misc)
mce_setup(&m);
m.status = status;
- m.bank = bank;
+ m.bank = bank;
+ m.tsc = rdtsc();
if (threshold_err)
m.misc = misc;
--
2.11.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [tip:ras/core] x86/ras: Flip the TSC-adding logic
2017-01-23 18:35 ` [PATCH 4/9] x86/MCE: Flip the TSC-adding logic Borislav Petkov
@ 2017-01-24 8:48 ` tip-bot for Borislav Petkov
0 siblings, 0 replies; 37+ messages in thread
From: tip-bot for Borislav Petkov @ 2017-01-24 8:48 UTC (permalink / raw)
To: linux-tip-commits
Cc: tony.luck, peterz, Yazen.Ghannam, linux-edac, bp, linux-kernel,
hpa, tglx, torvalds, mingo
Commit-ID: 669c00f09935fc7a22297eadee04536af141595b
Gitweb: http://git.kernel.org/tip/669c00f09935fc7a22297eadee04536af141595b
Author: Borislav Petkov <bp@suse.de>
AuthorDate: Mon, 23 Jan 2017 19:35:09 +0100
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 24 Jan 2017 09:14:54 +0100
x86/ras: Flip the TSC-adding logic
Add the TSC value to the MCE record only when the MCE being logged is
precise, i.e., it is logged as an exception or an MCE-related interrupt.
So it doesn't look particularly easy to do without touching/changing a
bunch of places. That's why I'm trying tricks first.
For example, the mce-apei.c case I'm addressing by setting ->tsc only
for errors of panic severity. The idea there is, that, panic errors will
have raised an #MC and not polled.
And then instead of propagating a flag to mce_setup(), it seems
easier/less code to set ->tsc depending on the call sites, i.e.,
are we polling or are we preparing an MCE record in an exception
handler/thresholding interrupt.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170123183514.13356-5-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/cpu/mcheck/mce-apei.c | 5 ++++-
arch/x86/kernel/cpu/mcheck/mce.c | 12 +++---------
arch/x86/kernel/cpu/mcheck/mce_amd.c | 3 ++-
3 files changed, 9 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kernel/cpu/mcheck/mce-apei.c b/arch/x86/kernel/cpu/mcheck/mce-apei.c
index 83f1a98..2eee853 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-apei.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-apei.c
@@ -52,8 +52,11 @@ void apei_mce_report_mem_error(int severity, struct cper_sec_mem_err *mem_err)
if (severity >= GHES_SEV_RECOVERABLE)
m.status |= MCI_STATUS_UC;
- if (severity >= GHES_SEV_PANIC)
+
+ if (severity >= GHES_SEV_PANIC) {
m.status |= MCI_STATUS_PCC;
+ m.tsc = rdtsc();
+ }
m.addr = mem_err->physical_addr;
mce_log(&m);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 6eef6fd..ca15a7e 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -128,7 +128,6 @@ void mce_setup(struct mce *m)
{
memset(m, 0, sizeof(struct mce));
m->cpu = m->extcpu = smp_processor_id();
- m->tsc = rdtsc();
/* We hope get_seconds stays lockless */
m->time = get_seconds();
m->cpuvendor = boot_cpu_data.x86_vendor;
@@ -710,14 +709,8 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
mce_gather_info(&m, NULL);
- /*
- * m.tsc was set in mce_setup(). Clear it if not requested.
- *
- * FIXME: Propagate @flags to mce_gather_info/mce_setup() to avoid
- * that dance.
- */
- if (!(flags & MCP_TIMESTAMP))
- m.tsc = 0;
+ if (flags & MCP_TIMESTAMP)
+ m.tsc = rdtsc();
for (i = 0; i < mca_cfg.banks; i++) {
if (!mce_banks[i].ctl || !test_bit(i, *b))
@@ -1156,6 +1149,7 @@ void do_machine_check(struct pt_regs *regs, long error_code)
goto out;
mce_gather_info(&m, regs);
+ m.tsc = rdtsc();
final = this_cpu_ptr(&mces_seen);
*final = m;
diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index 776379e..9e5427d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -778,7 +778,8 @@ __log_error(unsigned int bank, bool deferred_err, bool threshold_err, u64 misc)
mce_setup(&m);
m.status = status;
- m.bank = bank;
+ m.bank = bank;
+ m.tsc = rdtsc();
if (threshold_err)
m.misc = misc;
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PATCH 5/9] x86/ras/mce_amd_inj: Change dependency
2017-01-23 18:35 [PATCH 0/9] x86/RAS: Queue for 4.11 Borislav Petkov
` (3 preceding siblings ...)
2017-01-23 18:35 ` [PATCH 4/9] x86/MCE: Flip the TSC-adding logic Borislav Petkov
@ 2017-01-23 18:35 ` Borislav Petkov
2017-01-24 8:48 ` [tip:ras/core] x86/ras/amd/inj: " tip-bot for Borislav Petkov
2017-01-23 18:35 ` [PATCH 6/9] EDAC/mce_amd: Unexport amd_decode_mce() Borislav Petkov
` (3 subsequent siblings)
8 siblings, 1 reply; 37+ messages in thread
From: Borislav Petkov @ 2017-01-23 18:35 UTC (permalink / raw)
To: X86 ML; +Cc: Tony Luck, Yazen Ghannam, linux-edac, LKML
From: Borislav Petkov <bp@suse.de>
Change dependency to mce.c as we're using mce_inject_log() now to stick
an MCE into the MCA subsystem.
Signed-off-by: Borislav Petkov <bp@suse.de>
---
arch/x86/ras/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/ras/Kconfig b/arch/x86/ras/Kconfig
index d957d5f21a86..0bc60a308730 100644
--- a/arch/x86/ras/Kconfig
+++ b/arch/x86/ras/Kconfig
@@ -1,6 +1,6 @@
config MCE_AMD_INJ
tristate "Simple MCE injection interface for AMD processors"
- depends on RAS && EDAC_DECODE_MCE && DEBUG_FS && AMD_NB
+ depends on RAS && X86_MCE && DEBUG_FS && AMD_NB
default n
help
This is a simple debugfs interface to inject MCEs and test different
--
2.11.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [tip:ras/core] x86/ras/amd/inj: Change dependency
2017-01-23 18:35 ` [PATCH 5/9] x86/ras/mce_amd_inj: Change dependency Borislav Petkov
@ 2017-01-24 8:48 ` tip-bot for Borislav Petkov
0 siblings, 0 replies; 37+ messages in thread
From: tip-bot for Borislav Petkov @ 2017-01-24 8:48 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, linux-edac, hpa, tglx, mingo, Yazen.Ghannam,
peterz, bp, tony.luck, torvalds
Commit-ID: bd43f60a260c83cbc9befd7d710a3f2bfd3b2dd2
Gitweb: http://git.kernel.org/tip/bd43f60a260c83cbc9befd7d710a3f2bfd3b2dd2
Author: Borislav Petkov <bp@suse.de>
AuthorDate: Mon, 23 Jan 2017 19:35:10 +0100
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 24 Jan 2017 09:14:55 +0100
x86/ras/amd/inj: Change dependency
Change dependency to mce.c as we're using mce_inject_log() now to stick
an MCE into the MCA subsystem.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170123183514.13356-6-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/ras/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/ras/Kconfig b/arch/x86/ras/Kconfig
index d957d5f..0bc60a3 100644
--- a/arch/x86/ras/Kconfig
+++ b/arch/x86/ras/Kconfig
@@ -1,6 +1,6 @@
config MCE_AMD_INJ
tristate "Simple MCE injection interface for AMD processors"
- depends on RAS && EDAC_DECODE_MCE && DEBUG_FS && AMD_NB
+ depends on RAS && X86_MCE && DEBUG_FS && AMD_NB
default n
help
This is a simple debugfs interface to inject MCEs and test different
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PATCH 6/9] EDAC/mce_amd: Unexport amd_decode_mce()
2017-01-23 18:35 [PATCH 0/9] x86/RAS: Queue for 4.11 Borislav Petkov
` (4 preceding siblings ...)
2017-01-23 18:35 ` [PATCH 5/9] x86/ras/mce_amd_inj: Change dependency Borislav Petkov
@ 2017-01-23 18:35 ` Borislav Petkov
2017-01-24 8:49 ` [tip:ras/core] EDAC/mce/amd: " tip-bot for Borislav Petkov
2017-01-23 18:35 ` [PATCH 7/9] EDAC/mce_amd: Dump TSC value Borislav Petkov
` (2 subsequent siblings)
8 siblings, 1 reply; 37+ messages in thread
From: Borislav Petkov @ 2017-01-23 18:35 UTC (permalink / raw)
To: X86 ML; +Cc: Tony Luck, Yazen Ghannam, linux-edac, LKML
From: Borislav Petkov <bp@suse.de>
It is not used outside of the driver anymore.
Signed-off-by: Borislav Petkov <bp@suse.de>
---
drivers/edac/mce_amd.c | 4 ++--
drivers/edac/mce_amd.h | 1 -
2 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index 34208f38c5b1..5cd3c39bc695 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -942,7 +942,8 @@ static const char *decode_error_status(struct mce *m)
return "Corrected error, no action required.";
}
-int amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
+static int
+amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
{
struct mce *m = (struct mce *)data;
struct cpuinfo_x86 *c = &cpu_data(m->extcpu);
@@ -1047,7 +1048,6 @@ int amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
return NOTIFY_STOP;
}
-EXPORT_SYMBOL_GPL(amd_decode_mce);
static struct notifier_block amd_mce_dec_nb = {
.notifier_call = amd_decode_mce,
diff --git a/drivers/edac/mce_amd.h b/drivers/edac/mce_amd.h
index c2359a1ea6b3..0b6a68673e0e 100644
--- a/drivers/edac/mce_amd.h
+++ b/drivers/edac/mce_amd.h
@@ -79,6 +79,5 @@ struct amd_decoder_ops {
void amd_report_gart_errors(bool);
void amd_register_ecc_decoder(void (*f)(int, struct mce *));
void amd_unregister_ecc_decoder(void (*f)(int, struct mce *));
-int amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data);
#endif /* _EDAC_MCE_AMD_H */
--
2.11.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [tip:ras/core] EDAC/mce/amd: Unexport amd_decode_mce()
2017-01-23 18:35 ` [PATCH 6/9] EDAC/mce_amd: Unexport amd_decode_mce() Borislav Petkov
@ 2017-01-24 8:49 ` tip-bot for Borislav Petkov
0 siblings, 0 replies; 37+ messages in thread
From: tip-bot for Borislav Petkov @ 2017-01-24 8:49 UTC (permalink / raw)
To: linux-tip-commits
Cc: tglx, linux-edac, bp, mingo, torvalds, linux-kernel, tony.luck,
peterz, hpa, Yazen.Ghannam
Commit-ID: 1fbcd909035251b5eac267f1c5d6d67b32d16b62
Gitweb: http://git.kernel.org/tip/1fbcd909035251b5eac267f1c5d6d67b32d16b62
Author: Borislav Petkov <bp@suse.de>
AuthorDate: Mon, 23 Jan 2017 19:35:11 +0100
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 24 Jan 2017 09:14:55 +0100
EDAC/mce/amd: Unexport amd_decode_mce()
It is not used outside of the driver anymore.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170123183514.13356-7-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
drivers/edac/mce_amd.c | 4 ++--
drivers/edac/mce_amd.h | 1 -
2 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index 34208f3..5cd3c39 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -942,7 +942,8 @@ static const char *decode_error_status(struct mce *m)
return "Corrected error, no action required.";
}
-int amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
+static int
+amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
{
struct mce *m = (struct mce *)data;
struct cpuinfo_x86 *c = &cpu_data(m->extcpu);
@@ -1047,7 +1048,6 @@ int amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
return NOTIFY_STOP;
}
-EXPORT_SYMBOL_GPL(amd_decode_mce);
static struct notifier_block amd_mce_dec_nb = {
.notifier_call = amd_decode_mce,
diff --git a/drivers/edac/mce_amd.h b/drivers/edac/mce_amd.h
index c2359a1..0b6a686 100644
--- a/drivers/edac/mce_amd.h
+++ b/drivers/edac/mce_amd.h
@@ -79,6 +79,5 @@ struct amd_decoder_ops {
void amd_report_gart_errors(bool);
void amd_register_ecc_decoder(void (*f)(int, struct mce *));
void amd_unregister_ecc_decoder(void (*f)(int, struct mce *));
-int amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data);
#endif /* _EDAC_MCE_AMD_H */
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PATCH 7/9] EDAC/mce_amd: Dump TSC value
2017-01-23 18:35 [PATCH 0/9] x86/RAS: Queue for 4.11 Borislav Petkov
` (5 preceding siblings ...)
2017-01-23 18:35 ` [PATCH 6/9] EDAC/mce_amd: Unexport amd_decode_mce() Borislav Petkov
@ 2017-01-23 18:35 ` Borislav Petkov
2017-01-24 8:50 ` [tip:ras/core] EDAC/mce/amd: " tip-bot for Borislav Petkov
2017-01-23 18:35 ` [PATCH 8/9] x86/MCE: Get rid of mce_process_work() Borislav Petkov
2017-01-23 18:35 ` [PATCH 9/9] x86/MCE, EDAC, acpi: Assign MCE notifier handlers a priority Borislav Petkov
8 siblings, 1 reply; 37+ messages in thread
From: Borislav Petkov @ 2017-01-23 18:35 UTC (permalink / raw)
To: X86 ML; +Cc: Tony Luck, Yazen Ghannam, linux-edac, LKML
From: Borislav Petkov <bp@suse.de>
Dump the TSC value of the time when the MCE got logged.
Signed-off-by: Borislav Petkov <bp@suse.de>
---
drivers/edac/mce_amd.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index 5cd3c39bc695..ecad750fd090 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -1007,6 +1007,9 @@ amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
} else
pr_cont("\n");
+ if (m->tsc)
+ pr_emerg(HW_ERR "TSC: %llu\n", m->tsc);
+
if (!fam_ops)
goto err_code;
--
2.11.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [tip:ras/core] EDAC/mce/amd: Dump TSC value
2017-01-23 18:35 ` [PATCH 7/9] EDAC/mce_amd: Dump TSC value Borislav Petkov
@ 2017-01-24 8:50 ` tip-bot for Borislav Petkov
0 siblings, 0 replies; 37+ messages in thread
From: tip-bot for Borislav Petkov @ 2017-01-24 8:50 UTC (permalink / raw)
To: linux-tip-commits
Cc: hpa, linux-edac, tony.luck, bp, torvalds, peterz, mingo,
Yazen.Ghannam, tglx, linux-kernel
Commit-ID: 0bceab677dcef409f6281d922461057721d547b3
Gitweb: http://git.kernel.org/tip/0bceab677dcef409f6281d922461057721d547b3
Author: Borislav Petkov <bp@suse.de>
AuthorDate: Mon, 23 Jan 2017 19:35:12 +0100
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 24 Jan 2017 09:14:56 +0100
EDAC/mce/amd: Dump TSC value
Dump the TSC value of the time when the MCE got logged.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170123183514.13356-8-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
drivers/edac/mce_amd.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index 5cd3c39..ecad750 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -1007,6 +1007,9 @@ amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
} else
pr_cont("\n");
+ if (m->tsc)
+ pr_emerg(HW_ERR "TSC: %llu\n", m->tsc);
+
if (!fam_ops)
goto err_code;
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PATCH 8/9] x86/MCE: Get rid of mce_process_work()
2017-01-23 18:35 [PATCH 0/9] x86/RAS: Queue for 4.11 Borislav Petkov
` (6 preceding siblings ...)
2017-01-23 18:35 ` [PATCH 7/9] EDAC/mce_amd: Dump TSC value Borislav Petkov
@ 2017-01-23 18:35 ` Borislav Petkov
2017-01-24 8:50 ` [tip:ras/core] x86/ras: " tip-bot for Borislav Petkov
2017-01-23 18:35 ` [PATCH 9/9] x86/MCE, EDAC, acpi: Assign MCE notifier handlers a priority Borislav Petkov
8 siblings, 1 reply; 37+ messages in thread
From: Borislav Petkov @ 2017-01-23 18:35 UTC (permalink / raw)
To: X86 ML; +Cc: Tony Luck, Yazen Ghannam, linux-edac, LKML
From: Borislav Petkov <bp@suse.de>
Make mce_gen_pool_process() the workqueue function directly and save us
an indirection.
Signed-off-by: Borislav Petkov <bp@suse.de>
---
arch/x86/kernel/cpu/mcheck/mce-genpool.c | 2 +-
arch/x86/kernel/cpu/mcheck/mce-internal.h | 2 +-
arch/x86/kernel/cpu/mcheck/mce.c | 12 +-----------
3 files changed, 3 insertions(+), 13 deletions(-)
diff --git a/arch/x86/kernel/cpu/mcheck/mce-genpool.c b/arch/x86/kernel/cpu/mcheck/mce-genpool.c
index 93d824ec3120..1e5a50c11d3c 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-genpool.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-genpool.c
@@ -72,7 +72,7 @@ struct llist_node *mce_gen_pool_prepare_records(void)
return new_head.first;
}
-void mce_gen_pool_process(void)
+void mce_gen_pool_process(struct work_struct *__unused)
{
struct llist_node *head;
struct mce_evt_llist *node, *tmp;
diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
index cd74a3f00aea..903043e6a62b 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -31,7 +31,7 @@ struct mce_evt_llist {
struct mce mce;
};
-void mce_gen_pool_process(void);
+void mce_gen_pool_process(struct work_struct *__unused);
bool mce_gen_pool_empty(void);
int mce_gen_pool_add(struct mce *mce);
int mce_gen_pool_init(void);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index ca15a7e1f97d..0fef5406f0eb 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1316,16 +1316,6 @@ int memory_failure(unsigned long pfn, int vector, int flags)
#endif
/*
- * Action optional processing happens here (picking up
- * from the list of faulting pages that do_machine_check()
- * placed into the genpool).
- */
-static void mce_process_work(struct work_struct *dummy)
-{
- mce_gen_pool_process();
-}
-
-/*
* Periodic polling timer for "silent" machine check errors. If the
* poller finds an MCE, poll 2x faster. When the poller finds no more
* errors, poll 2x slower (up to check_interval seconds).
@@ -2165,7 +2155,7 @@ int __init mcheck_init(void)
mce_register_decode_chain(&mce_default_nb);
mcheck_vendor_init_severity();
- INIT_WORK(&mce_work, mce_process_work);
+ INIT_WORK(&mce_work, mce_gen_pool_process);
init_irq_work(&mce_irq_work, mce_irq_work_cb);
return 0;
--
2.11.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [tip:ras/core] x86/ras: Get rid of mce_process_work()
2017-01-23 18:35 ` [PATCH 8/9] x86/MCE: Get rid of mce_process_work() Borislav Petkov
@ 2017-01-24 8:50 ` tip-bot for Borislav Petkov
0 siblings, 0 replies; 37+ messages in thread
From: tip-bot for Borislav Petkov @ 2017-01-24 8:50 UTC (permalink / raw)
To: linux-tip-commits
Cc: tglx, tony.luck, linux-edac, linux-kernel, bp, peterz, torvalds,
hpa, Yazen.Ghannam, mingo
Commit-ID: cff4c0391a692cf9b89932c62a7f879fb3637148
Gitweb: http://git.kernel.org/tip/cff4c0391a692cf9b89932c62a7f879fb3637148
Author: Borislav Petkov <bp@suse.de>
AuthorDate: Mon, 23 Jan 2017 19:35:13 +0100
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 24 Jan 2017 09:14:56 +0100
x86/ras: Get rid of mce_process_work()
Make mce_gen_pool_process() the workqueue function directly and save us
an indirection.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170123183514.13356-9-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/cpu/mcheck/mce-genpool.c | 2 +-
arch/x86/kernel/cpu/mcheck/mce-internal.h | 2 +-
arch/x86/kernel/cpu/mcheck/mce.c | 12 +-----------
3 files changed, 3 insertions(+), 13 deletions(-)
diff --git a/arch/x86/kernel/cpu/mcheck/mce-genpool.c b/arch/x86/kernel/cpu/mcheck/mce-genpool.c
index 93d824e..1e5a50c 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-genpool.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-genpool.c
@@ -72,7 +72,7 @@ struct llist_node *mce_gen_pool_prepare_records(void)
return new_head.first;
}
-void mce_gen_pool_process(void)
+void mce_gen_pool_process(struct work_struct *__unused)
{
struct llist_node *head;
struct mce_evt_llist *node, *tmp;
diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
index cd74a3f..903043e 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -31,7 +31,7 @@ struct mce_evt_llist {
struct mce mce;
};
-void mce_gen_pool_process(void);
+void mce_gen_pool_process(struct work_struct *__unused);
bool mce_gen_pool_empty(void);
int mce_gen_pool_add(struct mce *mce);
int mce_gen_pool_init(void);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index ca15a7e..0fef540 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1316,16 +1316,6 @@ int memory_failure(unsigned long pfn, int vector, int flags)
#endif
/*
- * Action optional processing happens here (picking up
- * from the list of faulting pages that do_machine_check()
- * placed into the genpool).
- */
-static void mce_process_work(struct work_struct *dummy)
-{
- mce_gen_pool_process();
-}
-
-/*
* Periodic polling timer for "silent" machine check errors. If the
* poller finds an MCE, poll 2x faster. When the poller finds no more
* errors, poll 2x slower (up to check_interval seconds).
@@ -2165,7 +2155,7 @@ int __init mcheck_init(void)
mce_register_decode_chain(&mce_default_nb);
mcheck_vendor_init_severity();
- INIT_WORK(&mce_work, mce_process_work);
+ INIT_WORK(&mce_work, mce_gen_pool_process);
init_irq_work(&mce_irq_work, mce_irq_work_cb);
return 0;
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PATCH 9/9] x86/MCE, EDAC, acpi: Assign MCE notifier handlers a priority
2017-01-23 18:35 [PATCH 0/9] x86/RAS: Queue for 4.11 Borislav Petkov
` (7 preceding siblings ...)
2017-01-23 18:35 ` [PATCH 8/9] x86/MCE: Get rid of mce_process_work() Borislav Petkov
@ 2017-01-23 18:35 ` Borislav Petkov
2017-01-24 8:51 ` [tip:ras/core] x86/ras, " tip-bot for Borislav Petkov
8 siblings, 1 reply; 37+ messages in thread
From: Borislav Petkov @ 2017-01-23 18:35 UTC (permalink / raw)
To: X86 ML; +Cc: Tony Luck, Yazen Ghannam, linux-edac, LKML
From: Borislav Petkov <bp@suse.de>
Assign all notifiers on the MCE decode chain a priority so that they get
called in the correct order.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
---
arch/x86/include/asm/mce.h | 9 +++++++++
arch/x86/kernel/cpu/mcheck/mce.c | 8 +++-----
drivers/acpi/acpi_extlog.c | 1 +
drivers/acpi/nfit/mce.c | 1 +
drivers/edac/i7core_edac.c | 1 +
drivers/edac/mce_amd.c | 1 +
drivers/edac/sb_edac.c | 3 ++-
drivers/edac/skx_edac.c | 3 ++-
8 files changed, 20 insertions(+), 7 deletions(-)
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 528f6ec897cb..e63873683d4a 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -189,6 +189,15 @@ extern struct mce_vendor_flags mce_flags;
extern struct mca_config mca_cfg;
extern struct mca_msr_regs msr_ops;
+
+enum mce_notifier_prios {
+ MCE_PRIO_SRAO = INT_MAX,
+ MCE_PRIO_EXTLOG = INT_MAX - 1,
+ MCE_PRIO_NFIT = INT_MAX - 2,
+ MCE_PRIO_EDAC = INT_MAX - 3,
+ MCE_PRIO_LOWEST = 0,
+};
+
extern void mce_register_decode_chain(struct notifier_block *nb);
extern void mce_unregister_decode_chain(struct notifier_block *nb);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 0fef5406f0eb..e39bbc0e7c8b 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -216,9 +216,7 @@ void mce_register_decode_chain(struct notifier_block *nb)
{
atomic_inc(&num_notifiers);
- /* Ensure SRAO notifier has the highest priority in the decode chain. */
- if (nb != &mce_srao_nb && nb->priority == INT_MAX)
- nb->priority -= 1;
+ WARN_ON(nb->priority > MCE_PRIO_LOWEST && nb->priority < MCE_PRIO_EDAC);
atomic_notifier_chain_register(&x86_mce_decoder_chain, nb);
}
@@ -582,7 +580,7 @@ static int srao_decode_notifier(struct notifier_block *nb, unsigned long val,
}
static struct notifier_block mce_srao_nb = {
.notifier_call = srao_decode_notifier,
- .priority = INT_MAX,
+ .priority = MCE_PRIO_SRAO,
};
static int mce_default_notifier(struct notifier_block *nb, unsigned long val,
@@ -608,7 +606,7 @@ static int mce_default_notifier(struct notifier_block *nb, unsigned long val,
static struct notifier_block mce_default_nb = {
.notifier_call = mce_default_notifier,
/* lowest prio, we want it to run last. */
- .priority = 0,
+ .priority = MCE_PRIO_LOWEST,
};
/*
diff --git a/drivers/acpi/acpi_extlog.c b/drivers/acpi/acpi_extlog.c
index b3842ffc19ba..a15270a806fc 100644
--- a/drivers/acpi/acpi_extlog.c
+++ b/drivers/acpi/acpi_extlog.c
@@ -212,6 +212,7 @@ static bool __init extlog_get_l1addr(void)
}
static struct notifier_block extlog_mce_dec = {
.notifier_call = extlog_print,
+ .priority = MCE_PRIO_EXTLOG,
};
static int __init extlog_init(void)
diff --git a/drivers/acpi/nfit/mce.c b/drivers/acpi/nfit/mce.c
index e5ce81c38eed..3ba1c3472cf9 100644
--- a/drivers/acpi/nfit/mce.c
+++ b/drivers/acpi/nfit/mce.c
@@ -90,6 +90,7 @@ static int nfit_handle_mce(struct notifier_block *nb, unsigned long val,
static struct notifier_block nfit_mce_dec = {
.notifier_call = nfit_handle_mce,
+ .priority = MCE_PRIO_NFIT,
};
void nfit_mce_register(void)
diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
index 69b5adead0ad..75ad847593b7 100644
--- a/drivers/edac/i7core_edac.c
+++ b/drivers/edac/i7core_edac.c
@@ -1835,6 +1835,7 @@ static int i7core_mce_check_error(struct notifier_block *nb, unsigned long val,
static struct notifier_block i7_mce_dec = {
.notifier_call = i7core_mce_check_error,
+ .priority = MCE_PRIO_EDAC,
};
struct memdev_dmi_entry {
diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index ecad750fd090..0d9bc25543d8 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -1054,6 +1054,7 @@ amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
static struct notifier_block amd_mce_dec_nb = {
.notifier_call = amd_decode_mce,
+ .priority = MCE_PRIO_EDAC,
};
static int __init mce_amd_init(void)
diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index 54ae6dc45ab2..c585a014dd3d 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -3136,7 +3136,8 @@ static int sbridge_mce_check_error(struct notifier_block *nb, unsigned long val,
}
static struct notifier_block sbridge_mce_dec = {
- .notifier_call = sbridge_mce_check_error,
+ .notifier_call = sbridge_mce_check_error,
+ .priority = MCE_PRIO_EDAC,
};
/****************************************************************************
diff --git a/drivers/edac/skx_edac.c b/drivers/edac/skx_edac.c
index 79ef675e4d6f..1159dba4671f 100644
--- a/drivers/edac/skx_edac.c
+++ b/drivers/edac/skx_edac.c
@@ -1007,7 +1007,8 @@ static int skx_mce_check_error(struct notifier_block *nb, unsigned long val,
}
static struct notifier_block skx_mce_dec = {
- .notifier_call = skx_mce_check_error,
+ .notifier_call = skx_mce_check_error,
+ .priority = MCE_PRIO_EDAC,
};
static void skx_remove(void)
--
2.11.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [tip:ras/core] x86/ras, EDAC, acpi: Assign MCE notifier handlers a priority
2017-01-23 18:35 ` [PATCH 9/9] x86/MCE, EDAC, acpi: Assign MCE notifier handlers a priority Borislav Petkov
@ 2017-01-24 8:51 ` tip-bot for Borislav Petkov
0 siblings, 0 replies; 37+ messages in thread
From: tip-bot for Borislav Petkov @ 2017-01-24 8:51 UTC (permalink / raw)
To: linux-tip-commits
Cc: torvalds, tony.luck, peterz, tglx, mingo, linux-kernel, hpa, bp,
Yazen.Ghannam, linux-edac
Commit-ID: 9026cc82b632ed1a859935c82ed8ad65f27f2781
Gitweb: http://git.kernel.org/tip/9026cc82b632ed1a859935c82ed8ad65f27f2781
Author: Borislav Petkov <bp@suse.de>
AuthorDate: Mon, 23 Jan 2017 19:35:14 +0100
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 24 Jan 2017 09:14:57 +0100
x86/ras, EDAC, acpi: Assign MCE notifier handlers a priority
Assign all notifiers on the MCE decode chain a priority so that they get
called in the correct order.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170123183514.13356-10-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/include/asm/mce.h | 9 +++++++++
arch/x86/kernel/cpu/mcheck/mce.c | 8 +++-----
drivers/acpi/acpi_extlog.c | 1 +
drivers/acpi/nfit/mce.c | 1 +
drivers/edac/i7core_edac.c | 1 +
drivers/edac/mce_amd.c | 1 +
drivers/edac/sb_edac.c | 3 ++-
drivers/edac/skx_edac.c | 3 ++-
8 files changed, 20 insertions(+), 7 deletions(-)
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 528f6ec..e638736 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -189,6 +189,15 @@ extern struct mce_vendor_flags mce_flags;
extern struct mca_config mca_cfg;
extern struct mca_msr_regs msr_ops;
+
+enum mce_notifier_prios {
+ MCE_PRIO_SRAO = INT_MAX,
+ MCE_PRIO_EXTLOG = INT_MAX - 1,
+ MCE_PRIO_NFIT = INT_MAX - 2,
+ MCE_PRIO_EDAC = INT_MAX - 3,
+ MCE_PRIO_LOWEST = 0,
+};
+
extern void mce_register_decode_chain(struct notifier_block *nb);
extern void mce_unregister_decode_chain(struct notifier_block *nb);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 0fef540..e39bbc0 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -216,9 +216,7 @@ void mce_register_decode_chain(struct notifier_block *nb)
{
atomic_inc(&num_notifiers);
- /* Ensure SRAO notifier has the highest priority in the decode chain. */
- if (nb != &mce_srao_nb && nb->priority == INT_MAX)
- nb->priority -= 1;
+ WARN_ON(nb->priority > MCE_PRIO_LOWEST && nb->priority < MCE_PRIO_EDAC);
atomic_notifier_chain_register(&x86_mce_decoder_chain, nb);
}
@@ -582,7 +580,7 @@ static int srao_decode_notifier(struct notifier_block *nb, unsigned long val,
}
static struct notifier_block mce_srao_nb = {
.notifier_call = srao_decode_notifier,
- .priority = INT_MAX,
+ .priority = MCE_PRIO_SRAO,
};
static int mce_default_notifier(struct notifier_block *nb, unsigned long val,
@@ -608,7 +606,7 @@ static int mce_default_notifier(struct notifier_block *nb, unsigned long val,
static struct notifier_block mce_default_nb = {
.notifier_call = mce_default_notifier,
/* lowest prio, we want it to run last. */
- .priority = 0,
+ .priority = MCE_PRIO_LOWEST,
};
/*
diff --git a/drivers/acpi/acpi_extlog.c b/drivers/acpi/acpi_extlog.c
index b3842ff..a15270a 100644
--- a/drivers/acpi/acpi_extlog.c
+++ b/drivers/acpi/acpi_extlog.c
@@ -212,6 +212,7 @@ static bool __init extlog_get_l1addr(void)
}
static struct notifier_block extlog_mce_dec = {
.notifier_call = extlog_print,
+ .priority = MCE_PRIO_EXTLOG,
};
static int __init extlog_init(void)
diff --git a/drivers/acpi/nfit/mce.c b/drivers/acpi/nfit/mce.c
index e5ce81c..3ba1c34 100644
--- a/drivers/acpi/nfit/mce.c
+++ b/drivers/acpi/nfit/mce.c
@@ -90,6 +90,7 @@ static int nfit_handle_mce(struct notifier_block *nb, unsigned long val,
static struct notifier_block nfit_mce_dec = {
.notifier_call = nfit_handle_mce,
+ .priority = MCE_PRIO_NFIT,
};
void nfit_mce_register(void)
diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
index 69b5ade..75ad847 100644
--- a/drivers/edac/i7core_edac.c
+++ b/drivers/edac/i7core_edac.c
@@ -1835,6 +1835,7 @@ static int i7core_mce_check_error(struct notifier_block *nb, unsigned long val,
static struct notifier_block i7_mce_dec = {
.notifier_call = i7core_mce_check_error,
+ .priority = MCE_PRIO_EDAC,
};
struct memdev_dmi_entry {
diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index ecad750..0d9bc25 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -1054,6 +1054,7 @@ amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
static struct notifier_block amd_mce_dec_nb = {
.notifier_call = amd_decode_mce,
+ .priority = MCE_PRIO_EDAC,
};
static int __init mce_amd_init(void)
diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index 54ae6dc..c585a01 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -3136,7 +3136,8 @@ static int sbridge_mce_check_error(struct notifier_block *nb, unsigned long val,
}
static struct notifier_block sbridge_mce_dec = {
- .notifier_call = sbridge_mce_check_error,
+ .notifier_call = sbridge_mce_check_error,
+ .priority = MCE_PRIO_EDAC,
};
/****************************************************************************
diff --git a/drivers/edac/skx_edac.c b/drivers/edac/skx_edac.c
index 79ef675..1159dba 100644
--- a/drivers/edac/skx_edac.c
+++ b/drivers/edac/skx_edac.c
@@ -1007,7 +1007,8 @@ static int skx_mce_check_error(struct notifier_block *nb, unsigned long val,
}
static struct notifier_block skx_mce_dec = {
- .notifier_call = skx_mce_check_error,
+ .notifier_call = skx_mce_check_error,
+ .priority = MCE_PRIO_EDAC,
};
static void skx_remove(void)
^ permalink raw reply related [flat|nested] 37+ messages in thread