* [PATCH 0/3] x86/RAS: Some more fixes
From: Borislav Petkov @ 2018-03-06 14:21 UTC (permalink / raw)
To: X86 ML; +Cc: Tony Luck, LKML
From: Borislav Petkov <bp@suse.de>
Hi guys,
here are some more RAS fixes. Please queue the 1st and 3rd for urgent.
Thx.
Borislav Petkov (1):
x86/MCE: Cleanup and complete struct mce fields definitions
Seunghun Han (1):
x86/MCE: Synchronize sysfs changes
Tony Luck (1):
x86/MCE: Save microcode revision in machine check records
arch/x86/include/uapi/asm/mce.h | 51 ++++++++++++++++++++++------------------
arch/x86/kernel/cpu/mcheck/mce.c | 26 ++++++++++++++++++--
2 files changed, 52 insertions(+), 25 deletions(-)
--
2.13.0
* [PATCH 1/3] x86/MCE: Save microcode revision in machine check records
From: Borislav Petkov @ 2018-03-06 14:21 UTC (permalink / raw)
To: X86 ML; +Cc: Tony Luck, LKML
From: Tony Luck <tony.luck@intel.com>
Updating microcode used to be relatively rare. Now that it has become
more common, we should save the microcode version in the machine check
record so that people looking at the error have this important
information bundled with the rest of the logged information.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: Yazen Ghannam <yazen.ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20180301233449.24311-1-tony.luck@intel.com
[ Simplify a bit. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
---
arch/x86/include/uapi/asm/mce.h | 1 +
arch/x86/kernel/cpu/mcheck/mce.c | 4 +++-
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/uapi/asm/mce.h b/arch/x86/include/uapi/asm/mce.h
index 91723461dc1f..435db58a7bad 100644
--- a/arch/x86/include/uapi/asm/mce.h
+++ b/arch/x86/include/uapi/asm/mce.h
@@ -30,6 +30,7 @@ struct mce {
__u64 synd; /* MCA_SYND MSR: only valid on SMCA systems */
__u64 ipid; /* MCA_IPID MSR: only valid on SMCA systems */
__u64 ppin; /* Protected Processor Inventory Number */
+ __u32 microcode;/* Microcode revision */
};
#define MCE_GET_RECORD_LEN _IOR('M', 1, int)
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 3c9a25b93538..181f6cf25895 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -130,6 +130,8 @@ void mce_setup(struct mce *m)
if (this_cpu_has(X86_FEATURE_INTEL_PPIN))
rdmsrl(MSR_PPIN, m->ppin);
+
+ m->microcode = boot_cpu_data.microcode;
}
DEFINE_PER_CPU(struct mce, injectm);
@@ -262,7 +264,7 @@ static void __print_mce(struct mce *m)
*/
pr_emerg(HW_ERR "PROCESSOR %u:%x TIME %llu SOCKET %u APIC %x microcode %x\n",
m->cpuvendor, m->cpuid, m->time, m->socketid, m->apicid,
- cpu_data(m->extcpu).microcode);
+ m->microcode);
}
static void print_mce(struct mce *m)
--
2.13.0
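The idea in patch 1/3, capturing the microcode revision when the record is set up rather than looking it up when the record is printed, can be illustrated with a small userspace sketch. Everything here is hypothetical and is not kernel code: struct error_record, current_microcode_rev, record_setup() and record_format() are stand-ins invented for the illustration, and the revision values are arbitrary.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, trimmed-down error record; not the kernel's struct mce. */
struct error_record {
	uint32_t cpu;
	uint32_t microcode;	/* revision captured when the record was set up */
};

/* Stand-in for the revision the system currently reports. */
static uint32_t current_microcode_rev = 0x22;

/* Snapshot the revision into the record at setup time. */
static void record_setup(struct error_record *r, uint32_t cpu)
{
	r->cpu = cpu;
	r->microcode = current_microcode_rev;
}

/* Format from the record itself, not from the current system state. */
static int record_format(const struct error_record *r, char *buf, size_t len)
{
	return snprintf(buf, len, "CPU %u microcode %x",
			(unsigned int)r->cpu, (unsigned int)r->microcode);
}
```

If current_microcode_rev changes after record_setup() has run, record_format() still reports the revision that was active when the record was created, which is the property the commit message asks for.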
* [PATCH 2/3] x86/MCE: Cleanup and complete struct mce fields definitions
From: Borislav Petkov @ 2018-03-06 14:21 UTC (permalink / raw)
To: X86 ML; +Cc: Tony Luck, LKML
From: Borislav Petkov <bp@suse.de>
The struct is part of the uapi; document that fact and all fields properly
and fix formatting.
No functional changes.
Signed-off-by: Borislav Petkov <bp@suse.de>
---
arch/x86/include/uapi/asm/mce.h | 52 ++++++++++++++++++++++-------------------
1 file changed, 28 insertions(+), 24 deletions(-)
diff --git a/arch/x86/include/uapi/asm/mce.h b/arch/x86/include/uapi/asm/mce.h
index 435db58a7bad..955c2a2e1cf9 100644
--- a/arch/x86/include/uapi/asm/mce.h
+++ b/arch/x86/include/uapi/asm/mce.h
@@ -5,32 +5,36 @@
#include <linux/types.h>
#include <linux/ioctl.h>
-/* Fields are zero when not available */
+/*
+ * Fields are zero when not available. Also, this struct is shared with
+ * userspace mcelog and thus must keep existing fields at current offsets.
+ * Only add new fields to the end of the structure
+ */
struct mce {
- __u64 status;
- __u64 misc;
- __u64 addr;
- __u64 mcgstatus;
- __u64 ip;
- __u64 tsc; /* cpu time stamp counter */
- __u64 time; /* wall time_t when error was detected */
- __u8 cpuvendor; /* cpu vendor as encoded in system.h */
- __u8 inject_flags; /* software inject flags */
- __u8 severity;
+ __u64 status; /* Bank's MCi_STATUS MSR */
+ __u64 misc; /* Bank's MCi_MISC MSR */
+ __u64 addr; /* Bank's MCi_ADDR MSR */
+ __u64 mcgstatus; /* Machine Check Global Status MSR */
+ __u64 ip; /* Instruction Pointer when the error happened */
+ __u64 tsc; /* CPU time stamp counter */
+ __u64 time; /* Wall time_t when error was detected */
+ __u8 cpuvendor; /* Kernel's X86_VENDOR enum */
+ __u8 inject_flags; /* Software inject flags */
+ __u8 severity; /* Error severity */
__u8 pad;
- __u32 cpuid; /* CPUID 1 EAX */
- __u8 cs; /* code segment */
- __u8 bank; /* machine check bank */
- __u8 cpu; /* cpu number; obsolete; use extcpu now */
- __u8 finished; /* entry is valid */
- __u32 extcpu; /* linux cpu number that detected the error */
- __u32 socketid; /* CPU socket ID */
- __u32 apicid; /* CPU initial apic ID */
- __u64 mcgcap; /* MCGCAP MSR: machine check capabilities of CPU */
- __u64 synd; /* MCA_SYND MSR: only valid on SMCA systems */
- __u64 ipid; /* MCA_IPID MSR: only valid on SMCA systems */
- __u64 ppin; /* Protected Processor Inventory Number */
- __u32 microcode;/* Microcode revision */
+ __u32 cpuid; /* CPUID 1 EAX */
+ __u8 cs; /* Code segment */
+ __u8 bank; /* Machine check bank reporting the error */
+ __u8 cpu; /* CPU number; obsoleted by extcpu */
+ __u8 finished; /* Entry is valid */
+ __u32 extcpu; /* Linux CPU number that detected the error */
+ __u32 socketid; /* CPU socket ID */
+ __u32 apicid; /* CPU initial APIC ID */
+ __u64 mcgcap; /* MCGCAP MSR: machine check capabilities of CPU */
+ __u64 synd; /* MCA_SYND MSR: only valid on SMCA systems */
+ __u64 ipid; /* MCA_IPID MSR: only valid on SMCA systems */
+ __u64 ppin; /* Protected Processor Inventory Number */
+ __u32 microcode; /* Microcode revision */
};
#define MCE_GET_RECORD_LEN _IOR('M', 1, int)
--
2.13.0
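The new comment in patch 2/3 says that existing fields must keep their offsets and that new fields may only be appended. A minimal sketch of how a userspace consumer might pin that down is shown below. It assumes a local mirror of the struct using fixed-width types (struct mce_sketch is a hypothetical name, not the uapi header) and an ABI with natural alignment; record_has_microcode() is likewise an invented helper that decides from a record length whether the trailing microcode field is present.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the struct from the diff, for illustration only. */
struct mce_sketch {
	uint64_t status, misc, addr, mcgstatus, ip, tsc, time;
	uint8_t  cpuvendor, inject_flags, severity, pad;
	uint32_t cpuid;
	uint8_t  cs, bank, cpu, finished;
	uint32_t extcpu, socketid, apicid;
	uint64_t mcgcap, synd, ipid, ppin;
	uint32_t microcode;	/* appended last, so older offsets do not move */
};

/* Compile-time checks that earlier fields stay where a consumer expects. */
static_assert(offsetof(struct mce_sketch, cpuid) == 60, "cpuid moved");
static_assert(offsetof(struct mce_sketch, ppin) == 104, "ppin moved");
static_assert(offsetof(struct mce_sketch, microcode) == 112, "microcode moved");

/* A record carries the field only if it is long enough to contain it. */
static int record_has_microcode(size_t record_len)
{
	return record_len >= offsetof(struct mce_sketch, microcode) +
			     sizeof(uint32_t);
}
```

With this layout, a record from before the field was appended would be 112 bytes long, so record_has_microcode(112) is false, while a record that includes the new field passes the check.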
* [PATCH 3/3] x86/MCE: Synchronize sysfs changes
From: Borislav Petkov @ 2018-03-06 14:21 UTC (permalink / raw)
To: X86 ML; +Cc: Tony Luck, LKML
From: Seunghun Han <kkamagui@gmail.com>
The check_interval file in the
/sys/devices/system/machinecheck/machinecheck<cpu number>
directory holds a global timer value for MCE polling. If it is changed by
one CPU, mce_restart() broadcasts the event to the other CPUs to delete
and restart the MCE polling timer, and __mcheck_cpu_init_timer()
reinitializes the mce_timer variable.
If more than one CPU writes a specific value to the check_interval file
concurrently, mce_timer is not protected from such concurrent accesses
and all kinds of explosions happen.
Since only root can write to those sysfs variables, the issue is not a
big deal security-wise.
However, concurrent writes to these configuration variables are
void of reason, so the proper thing to do is to "slow" accesses
down by synchronizing them with a mutex and thus take care of the
synchronization issue too.
Boris:
- make store_int_with_restart() use device_store_ulong() to filter out
negative intervals
- limit min interval to 1 second
- correct locking
- massage commit message
Signed-off-by: Seunghun Han <kkamagui@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20180302202706.9434-1-kkamagui@gmail.com
Signed-off-by: Borislav Petkov <bp@suse.de>
---
arch/x86/kernel/cpu/mcheck/mce.c | 22 +++++++++++++++++++++-
1 file changed, 21 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 181f6cf25895..21962c48dad7 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -56,6 +56,9 @@
static DEFINE_MUTEX(mce_log_mutex);
+/* sysfs synchronization */
+static DEFINE_MUTEX(mce_sysfs_mutex);
+
#define CREATE_TRACE_POINTS
#include <trace/events/mce.h>
@@ -2104,6 +2107,7 @@ static ssize_t set_ignore_ce(struct device *s,
if (kstrtou64(buf, 0, &new) < 0)
return -EINVAL;
+ mutex_lock(&mce_sysfs_mutex);
if (mca_cfg.ignore_ce ^ !!new) {
if (new) {
/* disable ce features */
@@ -2116,6 +2120,8 @@ static ssize_t set_ignore_ce(struct device *s,
on_each_cpu(mce_enable_ce, (void *)1, 1);
}
}
+ mutex_unlock(&mce_sysfs_mutex);
+
return size;
}
@@ -2128,6 +2134,7 @@ static ssize_t set_cmci_disabled(struct device *s,
if (kstrtou64(buf, 0, &new) < 0)
return -EINVAL;
+ mutex_lock(&mce_sysfs_mutex);
if (mca_cfg.cmci_disabled ^ !!new) {
if (new) {
/* disable cmci */
@@ -2139,6 +2146,8 @@ static ssize_t set_cmci_disabled(struct device *s,
on_each_cpu(mce_enable_ce, NULL, 1);
}
}
+ mutex_unlock(&mce_sysfs_mutex);
+
return size;
}
@@ -2146,8 +2155,19 @@ static ssize_t store_int_with_restart(struct device *s,
struct device_attribute *attr,
const char *buf, size_t size)
{
- ssize_t ret = device_store_int(s, attr, buf, size);
+ unsigned long old_check_interval = check_interval;
+ ssize_t ret = device_store_ulong(s, attr, buf, size);
+
+ if (check_interval == old_check_interval)
+ return ret;
+
+ if (check_interval < 1)
+ check_interval = 1;
+
+ mutex_lock(&mce_sysfs_mutex);
mce_restart();
+ mutex_unlock(&mce_sysfs_mutex);
+
return ret;
}
--
2.13.0
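The pattern in patch 3/3 (serialize writers with a mutex, clamp the interval to at least 1 and skip the restart when nothing changed) can be sketched in userspace with a pthread mutex. This is an illustration under stated assumptions, not the kernel code: cfg_mutex, poll_interval, restart_count, restart_polling() and store_poll_interval() are hypothetical names, the starting value of 300 is arbitrary, and unlike the patch, which takes the mutex only around mce_restart(), the sketch holds the lock for the whole update to keep the userspace version free of data races.

```c
#include <pthread.h>

/* Hypothetical stand-ins for the sysfs mutex and the global interval. */
static pthread_mutex_t cfg_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned long poll_interval = 300;	/* arbitrary starting value */
static unsigned long restart_count;		/* how often polling restarted */

/* Stand-in for the expensive "restart the timers everywhere" step. */
static void restart_polling(void)
{
	restart_count++;
}

/*
 * Store a new interval. Returns 1 if polling was restarted, 0 if the
 * value was unchanged. Concurrent callers are serialized by cfg_mutex.
 */
int store_poll_interval(unsigned long new_interval)
{
	int restarted = 0;

	if (new_interval < 1)		/* limit the minimum interval to 1 */
		new_interval = 1;

	pthread_mutex_lock(&cfg_mutex);
	if (new_interval != poll_interval) {
		poll_interval = new_interval;
		restart_polling();
		restarted = 1;
	}
	pthread_mutex_unlock(&cfg_mutex);

	return restarted;
}
```

Several threads may call store_poll_interval() at the same time; each update and its restart then run one after another instead of interleaving.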
* [tip:ras/core] x86/MCE: Cleanup and complete struct mce fields definitions
From: tip-bot for Borislav Petkov @ 2018-03-08 15:00 UTC (permalink / raw)
To: linux-tip-commits; +Cc: bp, mingo, linux-kernel, hpa, tglx, tony.luck
Commit-ID: 24193c5de470358d0ed70e1f8e58fdaf83823b95
Gitweb: https://git.kernel.org/tip/24193c5de470358d0ed70e1f8e58fdaf83823b95
Author: Borislav Petkov <bp@suse.de>
AuthorDate: Tue, 6 Mar 2018 15:21:42 +0100
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 8 Mar 2018 15:52:59 +0100
x86/MCE: Cleanup and complete struct mce fields definitions
The struct is part of the uapi, document that fact and all fields properly
and fix formatting.
No functional changes.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20180306142143.19990-3-bp@alien8.de
---
arch/x86/include/uapi/asm/mce.h | 52 ++++++++++++++++++++++-------------------
1 file changed, 28 insertions(+), 24 deletions(-)
diff --git a/arch/x86/include/uapi/asm/mce.h b/arch/x86/include/uapi/asm/mce.h
index 435db58a7bad..955c2a2e1cf9 100644
--- a/arch/x86/include/uapi/asm/mce.h
+++ b/arch/x86/include/uapi/asm/mce.h
@@ -5,32 +5,36 @@
#include <linux/types.h>
#include <linux/ioctl.h>
-/* Fields are zero when not available */
+/*
+ * Fields are zero when not available. Also, this struct is shared with
+ * userspace mcelog and thus must keep existing fields at current offsets.
+ * Only add new fields to the end of the structure
+ */
struct mce {
- __u64 status;
- __u64 misc;
- __u64 addr;
- __u64 mcgstatus;
- __u64 ip;
- __u64 tsc; /* cpu time stamp counter */
- __u64 time; /* wall time_t when error was detected */
- __u8 cpuvendor; /* cpu vendor as encoded in system.h */
- __u8 inject_flags; /* software inject flags */
- __u8 severity;
+ __u64 status; /* Bank's MCi_STATUS MSR */
+ __u64 misc; /* Bank's MCi_MISC MSR */
+ __u64 addr; /* Bank's MCi_ADDR MSR */
+ __u64 mcgstatus; /* Machine Check Global Status MSR */
+ __u64 ip; /* Instruction Pointer when the error happened */
+ __u64 tsc; /* CPU time stamp counter */
+ __u64 time; /* Wall time_t when error was detected */
+ __u8 cpuvendor; /* Kernel's X86_VENDOR enum */
+ __u8 inject_flags; /* Software inject flags */
+ __u8 severity; /* Error severity */
__u8 pad;
- __u32 cpuid; /* CPUID 1 EAX */
- __u8 cs; /* code segment */
- __u8 bank; /* machine check bank */
- __u8 cpu; /* cpu number; obsolete; use extcpu now */
- __u8 finished; /* entry is valid */
- __u32 extcpu; /* linux cpu number that detected the error */
- __u32 socketid; /* CPU socket ID */
- __u32 apicid; /* CPU initial apic ID */
- __u64 mcgcap; /* MCGCAP MSR: machine check capabilities of CPU */
- __u64 synd; /* MCA_SYND MSR: only valid on SMCA systems */
- __u64 ipid; /* MCA_IPID MSR: only valid on SMCA systems */
- __u64 ppin; /* Protected Processor Inventory Number */
- __u32 microcode;/* Microcode revision */
+ __u32 cpuid; /* CPUID 1 EAX */
+ __u8 cs; /* Code segment */
+ __u8 bank; /* Machine check bank reporting the error */
+ __u8 cpu; /* CPU number; obsoleted by extcpu */
+ __u8 finished; /* Entry is valid */
+ __u32 extcpu; /* Linux CPU number that detected the error */
+ __u32 socketid; /* CPU socket ID */
+ __u32 apicid; /* CPU initial APIC ID */
+ __u64 mcgcap; /* MCGCAP MSR: machine check capabilities of CPU */
+ __u64 synd; /* MCA_SYND MSR: only valid on SMCA systems */
+ __u64 ipid; /* MCA_IPID MSR: only valid on SMCA systems */
+ __u64 ppin; /* Protected Processor Inventory Number */
+ __u32 microcode; /* Microcode revision */
};
#define MCE_GET_RECORD_LEN _IOR('M', 1, int)