linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/3] x86/RAS: Some more fixes
@ 2018-03-06 14:21 Borislav Petkov
  2018-03-06 14:21 ` [PATCH 1/3] x86/MCE: Save microcode revision in machine check records Borislav Petkov
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Borislav Petkov @ 2018-03-06 14:21 UTC
  To: X86 ML; +Cc: Tony Luck, LKML

From: Borislav Petkov <bp@suse.de>

Hi guys,

here are some more RAS fixes. Please queue 1st and 3rd for urgent.

Thx.

Borislav Petkov (1):
  x86/MCE: Cleanup and complete struct mce fields definitions

Seunghun Han (1):
  x86/MCE: Synchronize sysfs changes

Tony Luck (1):
  x86/MCE: Save microcode revision in machine check records

 arch/x86/include/uapi/asm/mce.h  | 51 ++++++++++++++++++++++------------------
 arch/x86/kernel/cpu/mcheck/mce.c | 26 ++++++++++++++++++--
 2 files changed, 52 insertions(+), 25 deletions(-)

-- 
2.13.0


* [PATCH 1/3] x86/MCE: Save microcode revision in machine check records
  2018-03-06 14:21 [PATCH 0/3] x86/RAS: Some more fixes Borislav Petkov
@ 2018-03-06 14:21 ` Borislav Petkov
  2018-03-06 14:21 ` [PATCH 2/3] x86/MCE: Cleanup and complete struct mce fields definitions Borislav Petkov
  2018-03-06 14:21 ` [PATCH 3/3] x86/MCE: Synchronize sysfs changes Borislav Petkov
  2 siblings, 0 replies; 5+ messages in thread
From: Borislav Petkov @ 2018-03-06 14:21 UTC
  To: X86 ML; +Cc: Tony Luck, LKML

From: Tony Luck <tony.luck@intel.com>

Updating microcode used to be relatively rare. Now that it has become
more common, we should save the microcode revision in the machine check
record so that people looking at the error have this important
information bundled with the rest of the logged information.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: Yazen Ghannam <yazen.ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20180301233449.24311-1-tony.luck@intel.com
[ Simplify a bit. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/uapi/asm/mce.h  | 1 +
 arch/x86/kernel/cpu/mcheck/mce.c | 4 +++-
 2 files changed, 4 insertions(+), 1 deletion(-)
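
Editorial note, not part of the commit: the reasoning above can be
illustrated with a minimal userspace sketch. It only shows why a value
captured when the record is created stays meaningful after a later
microcode update, whereas a lookup at print time would report whatever
is loaded by then. All names (demo_record, demo_record_setup,
demo_late_microcode_update, current_microcode_rev) and the revision
numbers are hypothetical and are not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the currently loaded microcode revision. */
static uint32_t current_microcode_rev = 0x21;

/* Trimmed, illustrative record; this is NOT the real struct mce. */
struct demo_record {
	uint64_t status;
	uint32_t microcode;	/* revision captured when the record was made */
};

/* Loose analogue of mce_setup(): store the revision with the error. */
static void demo_record_setup(struct demo_record *r, uint64_t status)
{
	assert(r != NULL);
	r->status = status;
	r->microcode = current_microcode_rev;
}

/* Models a microcode update that lands after the error was logged. */
static void demo_late_microcode_update(uint32_t new_rev)
{
	current_microcode_rev = new_rev;
}
```

A caller would set up a record, apply a later update, and then still
find the original revision in the record when it is finally decoded.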

diff --git a/arch/x86/include/uapi/asm/mce.h b/arch/x86/include/uapi/asm/mce.h
index 91723461dc1f..435db58a7bad 100644
--- a/arch/x86/include/uapi/asm/mce.h
+++ b/arch/x86/include/uapi/asm/mce.h
@@ -30,6 +30,7 @@ struct mce {
 	__u64 synd;	/* MCA_SYND MSR: only valid on SMCA systems */
 	__u64 ipid;	/* MCA_IPID MSR: only valid on SMCA systems */
 	__u64 ppin;	/* Protected Processor Inventory Number */
+	__u32 microcode;/* Microcode revision */
 };
 
 #define MCE_GET_RECORD_LEN   _IOR('M', 1, int)
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 3c9a25b93538..181f6cf25895 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -130,6 +130,8 @@ void mce_setup(struct mce *m)
 
 	if (this_cpu_has(X86_FEATURE_INTEL_PPIN))
 		rdmsrl(MSR_PPIN, m->ppin);
+
+	m->microcode = boot_cpu_data.microcode;
 }
 
 DEFINE_PER_CPU(struct mce, injectm);
@@ -262,7 +264,7 @@ static void __print_mce(struct mce *m)
 	 */
 	pr_emerg(HW_ERR "PROCESSOR %u:%x TIME %llu SOCKET %u APIC %x microcode %x\n",
 		m->cpuvendor, m->cpuid, m->time, m->socketid, m->apicid,
-		cpu_data(m->extcpu).microcode);
+		m->microcode);
 }
 
 static void print_mce(struct mce *m)
-- 
2.13.0


* [PATCH 2/3] x86/MCE: Cleanup and complete struct mce fields definitions
  2018-03-06 14:21 [PATCH 0/3] x86/RAS: Some more fixes Borislav Petkov
  2018-03-06 14:21 ` [PATCH 1/3] x86/MCE: Save microcode revision in machine check records Borislav Petkov
@ 2018-03-06 14:21 ` Borislav Petkov
  2018-03-08 15:00   ` [tip:ras/core] " tip-bot for Borislav Petkov
  2018-03-06 14:21 ` [PATCH 3/3] x86/MCE: Synchronize sysfs changes Borislav Petkov
  2 siblings, 1 reply; 5+ messages in thread
From: Borislav Petkov @ 2018-03-06 14:21 UTC
  To: X86 ML; +Cc: Tony Luck, LKML

From: Borislav Petkov <bp@suse.de>

struct mce is part of the uapi. Document that fact and all fields
properly, and fix formatting.

No functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/uapi/asm/mce.h | 52 ++++++++++++++++++++++-------------------
 1 file changed, 28 insertions(+), 24 deletions(-)
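
Editorial note, not part of the commit: the "only add new fields to the
end" rule can be sketched in userspace as follows. The mirror struct
copies the field list from the patch using <stdint.h> types, and the
helper mce_mirror_load() is a hypothetical name. The sketch assumes a
consumer that may receive a record shorter than its own struct
definition (for example, one produced before the microcode field
existed) and wants the missing tail to read as zero, matching the
"fields are zero when not available" convention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace mirror of struct mce as it reads after this patch. */
struct mce_mirror {
	uint64_t status;
	uint64_t misc;
	uint64_t addr;
	uint64_t mcgstatus;
	uint64_t ip;
	uint64_t tsc;
	uint64_t time;
	uint8_t  cpuvendor;
	uint8_t  inject_flags;
	uint8_t  severity;
	uint8_t  pad;
	uint32_t cpuid;
	uint8_t  cs;
	uint8_t  bank;
	uint8_t  cpu;
	uint8_t  finished;
	uint32_t extcpu;
	uint32_t socketid;
	uint32_t apicid;
	uint64_t mcgcap;
	uint64_t synd;
	uint64_t ipid;
	uint64_t ppin;
	uint32_t microcode;
};

/* Existing fields keep their offsets; the new field starts at the old end. */
_Static_assert(offsetof(struct mce_mirror, ppin) == 104, "ppin moved");
_Static_assert(offsetof(struct mce_mirror, microcode) == 112, "not appended");

/* Copy a record of src_len bytes; fields past src_len stay zero. */
static void mce_mirror_load(struct mce_mirror *dst, const void *src,
			    size_t src_len)
{
	size_t n = src_len < sizeof(*dst) ? src_len : sizeof(*dst);

	assert(dst != NULL && src != NULL);
	memset(dst, 0, sizeof(*dst));
	memcpy(dst, src, n);
}
```

Because new fields only ever extend the tail, a loader like this keeps
working for both older and newer record lengths without per-version
parsing.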

diff --git a/arch/x86/include/uapi/asm/mce.h b/arch/x86/include/uapi/asm/mce.h
index 435db58a7bad..955c2a2e1cf9 100644
--- a/arch/x86/include/uapi/asm/mce.h
+++ b/arch/x86/include/uapi/asm/mce.h
@@ -5,32 +5,36 @@
 #include <linux/types.h>
 #include <linux/ioctl.h>
 
-/* Fields are zero when not available */
+/*
+ * Fields are zero when not available. Also, this struct is shared with
+ * userspace mcelog and thus must keep existing fields at current offsets.
+ * Only add new fields to the end of the structure
+ */
 struct mce {
-	__u64 status;
-	__u64 misc;
-	__u64 addr;
-	__u64 mcgstatus;
-	__u64 ip;
-	__u64 tsc;	/* cpu time stamp counter */
-	__u64 time;	/* wall time_t when error was detected */
-	__u8  cpuvendor;	/* cpu vendor as encoded in system.h */
-	__u8  inject_flags;	/* software inject flags */
-	__u8  severity;
+	__u64 status;		/* Bank's MCi_STATUS MSR */
+	__u64 misc;		/* Bank's MCi_MISC MSR */
+	__u64 addr;		/* Bank's MCi_ADDR MSR */
+	__u64 mcgstatus;	/* Machine Check Global Status MSR */
+	__u64 ip;		/* Instruction Pointer when the error happened */
+	__u64 tsc;		/* CPU time stamp counter */
+	__u64 time;		/* Wall time_t when error was detected */
+	__u8  cpuvendor;	/* Kernel's X86_VENDOR enum */
+	__u8  inject_flags;	/* Software inject flags */
+	__u8  severity;		/* Error severity */
 	__u8  pad;
-	__u32 cpuid;	/* CPUID 1 EAX */
-	__u8  cs;		/* code segment */
-	__u8  bank;	/* machine check bank */
-	__u8  cpu;	/* cpu number; obsolete; use extcpu now */
-	__u8  finished;   /* entry is valid */
-	__u32 extcpu;	/* linux cpu number that detected the error */
-	__u32 socketid;	/* CPU socket ID */
-	__u32 apicid;	/* CPU initial apic ID */
-	__u64 mcgcap;	/* MCGCAP MSR: machine check capabilities of CPU */
-	__u64 synd;	/* MCA_SYND MSR: only valid on SMCA systems */
-	__u64 ipid;	/* MCA_IPID MSR: only valid on SMCA systems */
-	__u64 ppin;	/* Protected Processor Inventory Number */
-	__u32 microcode;/* Microcode revision */
+	__u32 cpuid;		/* CPUID 1 EAX */
+	__u8  cs;		/* Code segment */
+	__u8  bank;		/* Machine check bank reporting the error */
+	__u8  cpu;		/* CPU number; obsoleted by extcpu */
+	__u8  finished;		/* Entry is valid */
+	__u32 extcpu;		/* Linux CPU number that detected the error */
+	__u32 socketid;		/* CPU socket ID */
+	__u32 apicid;		/* CPU initial APIC ID */
+	__u64 mcgcap;		/* MCGCAP MSR: machine check capabilities of CPU */
+	__u64 synd;		/* MCA_SYND MSR: only valid on SMCA systems */
+	__u64 ipid;		/* MCA_IPID MSR: only valid on SMCA systems */
+	__u64 ppin;		/* Protected Processor Inventory Number */
+	__u32 microcode;	/* Microcode revision */
 };
 
 #define MCE_GET_RECORD_LEN   _IOR('M', 1, int)
-- 
2.13.0


* [PATCH 3/3] x86/MCE: Synchronize sysfs changes
  2018-03-06 14:21 [PATCH 0/3] x86/RAS: Some more fixes Borislav Petkov
  2018-03-06 14:21 ` [PATCH 1/3] x86/MCE: Save microcode revision in machine check records Borislav Petkov
  2018-03-06 14:21 ` [PATCH 2/3] x86/MCE: Cleanup and complete struct mce fields definitions Borislav Petkov
@ 2018-03-06 14:21 ` Borislav Petkov
  2 siblings, 0 replies; 5+ messages in thread
From: Borislav Petkov @ 2018-03-06 14:21 UTC
  To: X86 ML; +Cc: Tony Luck, LKML

From: Seunghun Han <kkamagui@gmail.com>

The check_interval file in

  /sys/devices/system/machinecheck/machinecheck<cpu number>

directory is a global timer value for MCE polling. If it is changed by
one CPU, mce_restart() broadcasts the event to other CPUs to delete
and restart the MCE polling timer and __mcheck_cpu_init_timer()
reinitializes the mce_timer variable.

If more than one CPU writes a specific value to the check_interval file
concurrently, mce_timer is not protected from such concurrent accesses
and all kinds of explosions happen.

Since only root can write to those sysfs variables, the issue is not a
big deal security-wise.

However, concurrent writes to these configuration variables serve no
purpose, so the proper thing to do is to "slow" accesses down by
synchronizing them with a mutex, which takes care of the
synchronization issue too.

Boris:

 - make store_int_with_restart() use device_store_ulong() to filter out
   negative intervals
 - limit min interval to 1 second
 - correct locking
 - massage commit message

Signed-off-by: Seunghun Han <kkamagui@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20180302202706.9434-1-kkamagui@gmail.com
Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/cpu/mcheck/mce.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)
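
Editorial note, not part of the commit: the serialization described
above can be sketched in userspace with a pthread mutex. This is an
analogy under stated assumptions, not the kernel code: all demo_* names
are hypothetical, demo_restart() merely counts calls where the kernel
would restart the polling timers, and the starting interval is
arbitrary. Unlike the patch, the sketch also compares and updates the
interval under the lock, which is a simplification chosen for clarity.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t demo_sysfs_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned long demo_check_interval = 300;	/* arbitrary start value */
static unsigned int demo_restart_count;

/* Stand-in for mce_restart(); callers must hold demo_sysfs_mutex. */
static void demo_restart(void)
{
	/* Cheap sanity check: trylock fails while the mutex is held. */
	assert(pthread_mutex_trylock(&demo_sysfs_mutex) != 0);
	demo_restart_count++;
}

/* Returns 1 if a restart happened, 0 if the value was unchanged. */
static int demo_store_interval(unsigned long new_interval)
{
	int restarted = 0;

	/* Clamp to a minimum of one second, as the patch does. */
	if (new_interval < 1)
		new_interval = 1;

	pthread_mutex_lock(&demo_sysfs_mutex);
	if (new_interval != demo_check_interval) {
		demo_check_interval = new_interval;
		demo_restart();
		restarted = 1;
	}
	pthread_mutex_unlock(&demo_sysfs_mutex);

	return restarted;
}
```

Writers arriving from several threads would queue on the mutex, so the
restart step never runs concurrently with itself, and a write that
does not change the value skips the restart entirely.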

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 181f6cf25895..21962c48dad7 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -56,6 +56,9 @@
 
 static DEFINE_MUTEX(mce_log_mutex);
 
+/* sysfs synchronization */
+static DEFINE_MUTEX(mce_sysfs_mutex);
+
 #define CREATE_TRACE_POINTS
 #include <trace/events/mce.h>
 
@@ -2104,6 +2107,7 @@ static ssize_t set_ignore_ce(struct device *s,
 	if (kstrtou64(buf, 0, &new) < 0)
 		return -EINVAL;
 
+	mutex_lock(&mce_sysfs_mutex);
 	if (mca_cfg.ignore_ce ^ !!new) {
 		if (new) {
 			/* disable ce features */
@@ -2116,6 +2120,8 @@ static ssize_t set_ignore_ce(struct device *s,
 			on_each_cpu(mce_enable_ce, (void *)1, 1);
 		}
 	}
+	mutex_unlock(&mce_sysfs_mutex);
+
 	return size;
 }
 
@@ -2128,6 +2134,7 @@ static ssize_t set_cmci_disabled(struct device *s,
 	if (kstrtou64(buf, 0, &new) < 0)
 		return -EINVAL;
 
+	mutex_lock(&mce_sysfs_mutex);
 	if (mca_cfg.cmci_disabled ^ !!new) {
 		if (new) {
 			/* disable cmci */
@@ -2139,6 +2146,8 @@ static ssize_t set_cmci_disabled(struct device *s,
 			on_each_cpu(mce_enable_ce, NULL, 1);
 		}
 	}
+	mutex_unlock(&mce_sysfs_mutex);
+
 	return size;
 }
 
@@ -2146,8 +2155,19 @@ static ssize_t store_int_with_restart(struct device *s,
 				      struct device_attribute *attr,
 				      const char *buf, size_t size)
 {
-	ssize_t ret = device_store_int(s, attr, buf, size);
+	unsigned long old_check_interval = check_interval;
+	ssize_t ret = device_store_ulong(s, attr, buf, size);
+
+	if (check_interval == old_check_interval)
+		return ret;
+
+	if (check_interval < 1)
+		check_interval = 1;
+
+	mutex_lock(&mce_sysfs_mutex);
 	mce_restart();
+	mutex_unlock(&mce_sysfs_mutex);
+
 	return ret;
 }
 
-- 
2.13.0


* [tip:ras/core] x86/MCE: Cleanup and complete struct mce fields definitions
  2018-03-06 14:21 ` [PATCH 2/3] x86/MCE: Cleanup and complete struct mce fields definitions Borislav Petkov
@ 2018-03-08 15:00   ` tip-bot for Borislav Petkov
  0 siblings, 0 replies; 5+ messages in thread
From: tip-bot for Borislav Petkov @ 2018-03-08 15:00 UTC
  To: linux-tip-commits; +Cc: bp, mingo, linux-kernel, hpa, tglx, tony.luck

Commit-ID:  24193c5de470358d0ed70e1f8e58fdaf83823b95
Gitweb:     https://git.kernel.org/tip/24193c5de470358d0ed70e1f8e58fdaf83823b95
Author:     Borislav Petkov <bp@suse.de>
AuthorDate: Tue, 6 Mar 2018 15:21:42 +0100
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 8 Mar 2018 15:52:59 +0100

x86/MCE: Cleanup and complete struct mce fields definitions

The struct is part of the uapi. Document that fact and all fields
properly, and fix formatting.

No functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20180306142143.19990-3-bp@alien8.de
---
 arch/x86/include/uapi/asm/mce.h | 52 ++++++++++++++++++++++-------------------
 1 file changed, 28 insertions(+), 24 deletions(-)

diff --git a/arch/x86/include/uapi/asm/mce.h b/arch/x86/include/uapi/asm/mce.h
index 435db58a7bad..955c2a2e1cf9 100644
--- a/arch/x86/include/uapi/asm/mce.h
+++ b/arch/x86/include/uapi/asm/mce.h
@@ -5,32 +5,36 @@
 #include <linux/types.h>
 #include <linux/ioctl.h>
 
-/* Fields are zero when not available */
+/*
+ * Fields are zero when not available. Also, this struct is shared with
+ * userspace mcelog and thus must keep existing fields at current offsets.
+ * Only add new fields to the end of the structure
+ */
 struct mce {
-	__u64 status;
-	__u64 misc;
-	__u64 addr;
-	__u64 mcgstatus;
-	__u64 ip;
-	__u64 tsc;	/* cpu time stamp counter */
-	__u64 time;	/* wall time_t when error was detected */
-	__u8  cpuvendor;	/* cpu vendor as encoded in system.h */
-	__u8  inject_flags;	/* software inject flags */
-	__u8  severity;
+	__u64 status;		/* Bank's MCi_STATUS MSR */
+	__u64 misc;		/* Bank's MCi_MISC MSR */
+	__u64 addr;		/* Bank's MCi_ADDR MSR */
+	__u64 mcgstatus;	/* Machine Check Global Status MSR */
+	__u64 ip;		/* Instruction Pointer when the error happened */
+	__u64 tsc;		/* CPU time stamp counter */
+	__u64 time;		/* Wall time_t when error was detected */
+	__u8  cpuvendor;	/* Kernel's X86_VENDOR enum */
+	__u8  inject_flags;	/* Software inject flags */
+	__u8  severity;		/* Error severity */
 	__u8  pad;
-	__u32 cpuid;	/* CPUID 1 EAX */
-	__u8  cs;		/* code segment */
-	__u8  bank;	/* machine check bank */
-	__u8  cpu;	/* cpu number; obsolete; use extcpu now */
-	__u8  finished;   /* entry is valid */
-	__u32 extcpu;	/* linux cpu number that detected the error */
-	__u32 socketid;	/* CPU socket ID */
-	__u32 apicid;	/* CPU initial apic ID */
-	__u64 mcgcap;	/* MCGCAP MSR: machine check capabilities of CPU */
-	__u64 synd;	/* MCA_SYND MSR: only valid on SMCA systems */
-	__u64 ipid;	/* MCA_IPID MSR: only valid on SMCA systems */
-	__u64 ppin;	/* Protected Processor Inventory Number */
-	__u32 microcode;/* Microcode revision */
+	__u32 cpuid;		/* CPUID 1 EAX */
+	__u8  cs;		/* Code segment */
+	__u8  bank;		/* Machine check bank reporting the error */
+	__u8  cpu;		/* CPU number; obsoleted by extcpu */
+	__u8  finished;		/* Entry is valid */
+	__u32 extcpu;		/* Linux CPU number that detected the error */
+	__u32 socketid;		/* CPU socket ID */
+	__u32 apicid;		/* CPU initial APIC ID */
+	__u64 mcgcap;		/* MCGCAP MSR: machine check capabilities of CPU */
+	__u64 synd;		/* MCA_SYND MSR: only valid on SMCA systems */
+	__u64 ipid;		/* MCA_IPID MSR: only valid on SMCA systems */
+	__u64 ppin;		/* Protected Processor Inventory Number */
+	__u32 microcode;	/* Microcode revision */
 };
 
 #define MCE_GET_RECORD_LEN   _IOR('M', 1, int)

