* [PATCH 0/3] x86/RAS: Some more fixes
@ 2018-03-06 14:21 Borislav Petkov
  2018-03-06 14:21 ` [PATCH 1/3] x86/MCE: Save microcode revision in machine check records Borislav Petkov
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Borislav Petkov @ 2018-03-06 14:21 UTC (permalink / raw)
  To: X86 ML; +Cc: Tony Luck, LKML

From: Borislav Petkov <bp@suse.de>

Hi guys,

here are some more RAS fixes. Please queue 1st and 3rd for urgent.

Thx.

Borislav Petkov (1):
  x86/MCE: Cleanup and complete struct mce fields definitions

Seunghun Han (1):
  x86/MCE: Synchronize sysfs changes

Tony Luck (1):
  x86/MCE: Save microcode revision in machine check records

 arch/x86/include/uapi/asm/mce.h  | 51 ++++++++++++++++++++++------------------
 arch/x86/kernel/cpu/mcheck/mce.c | 26 ++++++++++++++++++--
 2 files changed, 52 insertions(+), 25 deletions(-)

-- 
2.13.0


* [PATCH 1/3] x86/MCE: Save microcode revision in machine check records
  2018-03-06 14:21 [PATCH 0/3] x86/RAS: Some more fixes Borislav Petkov
@ 2018-03-06 14:21 ` Borislav Petkov
  2018-03-06 14:21 ` [PATCH 2/3] x86/MCE: Cleanup and complete struct mce fields definitions Borislav Petkov
  2018-03-06 14:21 ` [PATCH 3/3] x86/MCE: Synchronize sysfs changes Borislav Petkov
  2 siblings, 0 replies; 5+ messages in thread
From: Borislav Petkov @ 2018-03-06 14:21 UTC (permalink / raw)
  To: X86 ML; +Cc: Tony Luck, LKML

From: Tony Luck <tony.luck@intel.com>

Updating microcode used to be relatively rare. Now that it has become
more common, save the microcode revision in the machine check record so
that anyone looking at the error has this important information bundled
with the rest of the logged data.
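
To illustrate why the revision is stored in the record rather than looked
up when the record is printed, here is a minimal userspace sketch. The
names err_record, record_setup(), microcode_update() and the revision
values are hypothetical and are not part of this patch; the sketch only
models the idea of snapshotting a value at logging time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, cut-down error record; this is not the kernel's struct mce. */
struct err_record {
	uint32_t cpu;
	uint32_t microcode;	/* revision captured when the record was filled in */
};

/* Stand-in for the revision the system currently reports (illustrative value). */
static uint32_t current_microcode = 0x1e;

/* Fill in a record, snapshotting the revision in effect right now. */
void record_setup(struct err_record *r, uint32_t cpu)
{
	assert(r);
	r->cpu = cpu;
	r->microcode = current_microcode;
}

/* Simulate a later microcode update. */
void microcode_update(uint32_t new_rev)
{
	current_microcode = new_rev;
}

/* Revision a consumer reads from the record; later updates do not change it. */
uint32_t record_microcode(const struct err_record *r)
{
	assert(r);
	return r->microcode;
}
```

A record filled in before microcode_update() keeps reporting the revision
that was active when the error was logged, which is the property a reader
of the log would want.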

Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: Yazen Ghannam <yazen.ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20180301233449.24311-1-tony.luck@intel.com
[ Simplify a bit. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/uapi/asm/mce.h  | 1 +
 arch/x86/kernel/cpu/mcheck/mce.c | 4 +++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/uapi/asm/mce.h b/arch/x86/include/uapi/asm/mce.h
index 91723461dc1f..435db58a7bad 100644
--- a/arch/x86/include/uapi/asm/mce.h
+++ b/arch/x86/include/uapi/asm/mce.h
@@ -30,6 +30,7 @@ struct mce {
 	__u64 synd;	/* MCA_SYND MSR: only valid on SMCA systems */
 	__u64 ipid;	/* MCA_IPID MSR: only valid on SMCA systems */
 	__u64 ppin;	/* Protected Processor Inventory Number */
+	__u32 microcode;/* Microcode revision */
 };
 
 #define MCE_GET_RECORD_LEN   _IOR('M', 1, int)
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 3c9a25b93538..181f6cf25895 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -130,6 +130,8 @@ void mce_setup(struct mce *m)
 
 	if (this_cpu_has(X86_FEATURE_INTEL_PPIN))
 		rdmsrl(MSR_PPIN, m->ppin);
+
+	m->microcode = boot_cpu_data.microcode;
 }
 
 DEFINE_PER_CPU(struct mce, injectm);
@@ -262,7 +264,7 @@ static void __print_mce(struct mce *m)
 	 */
 	pr_emerg(HW_ERR "PROCESSOR %u:%x TIME %llu SOCKET %u APIC %x microcode %x\n",
 		m->cpuvendor, m->cpuid, m->time, m->socketid, m->apicid,
-		cpu_data(m->extcpu).microcode);
+		m->microcode);
 }
 
 static void print_mce(struct mce *m)
-- 
2.13.0


* [PATCH 2/3] x86/MCE: Cleanup and complete struct mce fields definitions
  2018-03-06 14:21 [PATCH 0/3] x86/RAS: Some more fixes Borislav Petkov
  2018-03-06 14:21 ` [PATCH 1/3] x86/MCE: Save microcode revision in machine check records Borislav Petkov
@ 2018-03-06 14:21 ` Borislav Petkov
  2018-03-08 15:00   ` [tip:ras/core] " tip-bot for Borislav Petkov
  2018-03-06 14:21 ` [PATCH 3/3] x86/MCE: Synchronize sysfs changes Borislav Petkov
  2 siblings, 1 reply; 5+ messages in thread
From: Borislav Petkov @ 2018-03-06 14:21 UTC (permalink / raw)
  To: X86 ML; +Cc: Tony Luck, LKML

From: Borislav Petkov <bp@suse.de>

struct mce is part of the uapi. Document that fact and all fields
properly, and fix the formatting.

No functional changes.
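
As a rough illustration of the "only append new fields" rule, the
following userspace sketch checks that appending a field leaves the
offsets of existing fields untouched. rec_v1, rec_v2 and
v1_reader_sees_same() are hypothetical, shortened stand-ins and not the
real struct mce layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical "old" layout that an existing userspace tool was built against. */
struct rec_v1 {
	uint64_t status;
	uint64_t ppin;
};

/* Hypothetical "new" layout: same fields, one more appended at the end. */
struct rec_v2 {
	uint64_t status;
	uint64_t ppin;
	uint32_t microcode;
};

/* Appending must not move any existing field. */
static_assert(offsetof(struct rec_v1, status) == offsetof(struct rec_v2, status),
	      "status moved");
static_assert(offsetof(struct rec_v1, ppin) == offsetof(struct rec_v2, ppin),
	      "ppin moved");

/*
 * Model an old reader: it copies only sizeof(struct rec_v1) bytes from a
 * new-style record. Returns 1 if it sees the same values for the old fields.
 */
int v1_reader_sees_same(const struct rec_v2 *r)
{
	struct rec_v1 old;

	memcpy(&old, r, sizeof(old));
	return old.status == r->status && old.ppin == r->ppin;
}
```

Had the new field been inserted in the middle instead, the static
assertions would fail at compile time, which is the kind of breakage the
new comment in the header warns about.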

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/uapi/asm/mce.h | 52 ++++++++++++++++++++++-------------------
 1 file changed, 28 insertions(+), 24 deletions(-)

diff --git a/arch/x86/include/uapi/asm/mce.h b/arch/x86/include/uapi/asm/mce.h
index 435db58a7bad..955c2a2e1cf9 100644
--- a/arch/x86/include/uapi/asm/mce.h
+++ b/arch/x86/include/uapi/asm/mce.h
@@ -5,32 +5,36 @@
 #include <linux/types.h>
 #include <linux/ioctl.h>
 
-/* Fields are zero when not available */
+/*
+ * Fields are zero when not available. Also, this struct is shared with
+ * userspace mcelog and thus must keep existing fields at current offsets.
+ * Only add new fields to the end of the structure
+ */
 struct mce {
-	__u64 status;
-	__u64 misc;
-	__u64 addr;
-	__u64 mcgstatus;
-	__u64 ip;
-	__u64 tsc;	/* cpu time stamp counter */
-	__u64 time;	/* wall time_t when error was detected */
-	__u8  cpuvendor;	/* cpu vendor as encoded in system.h */
-	__u8  inject_flags;	/* software inject flags */
-	__u8  severity;
+	__u64 status;		/* Bank's MCi_STATUS MSR */
+	__u64 misc;		/* Bank's MCi_MISC MSR */
+	__u64 addr;		/* Bank's MCi_ADDR MSR */
+	__u64 mcgstatus;	/* Machine Check Global Status MSR */
+	__u64 ip;		/* Instruction Pointer when the error happened */
+	__u64 tsc;		/* CPU time stamp counter */
+	__u64 time;		/* Wall time_t when error was detected */
+	__u8  cpuvendor;	/* Kernel's X86_VENDOR enum */
+	__u8  inject_flags;	/* Software inject flags */
+	__u8  severity;		/* Error severity */
 	__u8  pad;
-	__u32 cpuid;	/* CPUID 1 EAX */
-	__u8  cs;		/* code segment */
-	__u8  bank;	/* machine check bank */
-	__u8  cpu;	/* cpu number; obsolete; use extcpu now */
-	__u8  finished;   /* entry is valid */
-	__u32 extcpu;	/* linux cpu number that detected the error */
-	__u32 socketid;	/* CPU socket ID */
-	__u32 apicid;	/* CPU initial apic ID */
-	__u64 mcgcap;	/* MCGCAP MSR: machine check capabilities of CPU */
-	__u64 synd;	/* MCA_SYND MSR: only valid on SMCA systems */
-	__u64 ipid;	/* MCA_IPID MSR: only valid on SMCA systems */
-	__u64 ppin;	/* Protected Processor Inventory Number */
-	__u32 microcode;/* Microcode revision */
+	__u32 cpuid;		/* CPUID 1 EAX */
+	__u8  cs;		/* Code segment */
+	__u8  bank;		/* Machine check bank reporting the error */
+	__u8  cpu;		/* CPU number; obsoleted by extcpu */
+	__u8  finished;		/* Entry is valid */
+	__u32 extcpu;		/* Linux CPU number that detected the error */
+	__u32 socketid;		/* CPU socket ID */
+	__u32 apicid;		/* CPU initial APIC ID */
+	__u64 mcgcap;		/* MCGCAP MSR: machine check capabilities of CPU */
+	__u64 synd;		/* MCA_SYND MSR: only valid on SMCA systems */
+	__u64 ipid;		/* MCA_IPID MSR: only valid on SMCA systems */
+	__u64 ppin;		/* Protected Processor Inventory Number */
+	__u32 microcode;	/* Microcode revision */
 };
 
 #define MCE_GET_RECORD_LEN   _IOR('M', 1, int)
-- 
2.13.0


* [PATCH 3/3] x86/MCE: Synchronize sysfs changes
  2018-03-06 14:21 [PATCH 0/3] x86/RAS: Some more fixes Borislav Petkov
  2018-03-06 14:21 ` [PATCH 1/3] x86/MCE: Save microcode revision in machine check records Borislav Petkov
  2018-03-06 14:21 ` [PATCH 2/3] x86/MCE: Cleanup and complete struct mce fields definitions Borislav Petkov
@ 2018-03-06 14:21 ` Borislav Petkov
  2 siblings, 0 replies; 5+ messages in thread
From: Borislav Petkov @ 2018-03-06 14:21 UTC (permalink / raw)
  To: X86 ML; +Cc: Tony Luck, LKML

From: Seunghun Han <kkamagui@gmail.com>

The check_interval file in

  /sys/devices/system/machinecheck/machinecheck<cpu number>

directory is a global timer value for MCE polling. If it is changed by
one CPU, mce_restart() broadcasts the event to other CPUs to delete
and restart the MCE polling timer and __mcheck_cpu_init_timer()
reinitializes the mce_timer variable.

If more than one CPU writes a specific value to the check_interval file
concurrently, mce_timer is not protected from such concurrent accesses
and all kinds of explosions happen.

Since only root can write to those sysfs variables, the issue is not a
big deal security-wise.

However, concurrent writes to these configuration variables make no
sense, so the proper thing to do is to "slow" accesses down by
serializing them with a mutex, which takes care of the synchronization
issue too.
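
The serialization idea can be sketched in userspace with a pthread mutex.
This is only an analogue under stated assumptions: cfg_mutex,
store_interval(), restart_timer() and the starting value are hypothetical
names and numbers, and, unlike the patch, the sketch also updates the
interval under the lock to keep the userspace example free of data races.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t cfg_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned long interval = 300;	/* seconds; illustrative starting value */
static int timer_armed = 1;
static unsigned long restarts;

/* Stand-in for the restart step: tear down, then re-arm. Must not overlap. */
static void restart_timer(void)
{
	assert(timer_armed);	/* holds as long as callers are serialized */
	timer_armed = 0;
	restarts++;
	timer_armed = 1;
}

/* Hypothetical store handler. Returns 1 if a restart happened, 0 otherwise. */
int store_interval(unsigned long val)
{
	int restarted = 0;

	pthread_mutex_lock(&cfg_mutex);
	if (val != interval) {
		interval = val;
		restart_timer();
		restarted = 1;
	}
	pthread_mutex_unlock(&cfg_mutex);

	return restarted;
}

/* Read the current interval under the same lock. */
unsigned long get_interval(void)
{
	unsigned long val;

	pthread_mutex_lock(&cfg_mutex);
	val = interval;
	pthread_mutex_unlock(&cfg_mutex);
	return val;
}

/* Number of restarts performed so far. */
unsigned long get_restarts(void)
{
	unsigned long n;

	pthread_mutex_lock(&cfg_mutex);
	n = restarts;
	pthread_mutex_unlock(&cfg_mutex);
	return n;
}
```

store_interval() may be called from any number of threads; the mutex
guarantees that only one of them is inside restart_timer() at a time, and
a write that does not change the value skips the restart entirely.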

Boris:

 - make store_int_with_restart() use device_store_ulong() to filter out
   negative intervals
 - limit min interval to 1 second
 - correct locking
 - massage commit message
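
The first two items above (reject negative input, clamp the minimum to 1
second) can be sketched in plain C as follows. parse_interval() is a
hypothetical userspace helper, not a kernel function; it uses strtoul()
with an explicit check for a leading minus sign, because strtoul() on its
own accepts "-5" and wraps it to a large positive value.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a non-negative interval from a text buffer and
 * clamp it to at least 1. Returns 0 on success, -EINVAL on bad input.
 */
int parse_interval(const char *buf, unsigned long *out)
{
	unsigned long val;
	char *end;

	assert(buf && out);

	while (isspace((unsigned char)*buf))
		buf++;

	/* strtoul() would silently negate "-5", so refuse it up front. */
	if (*buf == '-' || *buf == '\0')
		return -EINVAL;

	errno = 0;
	val = strtoul(buf, &end, 0);
	if (errno || end == buf)
		return -EINVAL;

	/* Allow one trailing newline, as written by e.g. echo. */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;

	*out = val < 1 ? 1 : val;
	return 0;
}
```

With this helper, "30\n" parses to 30, "0" is clamped to 1, and "-5" or
"abc" are rejected with -EINVAL.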

Signed-off-by: Seunghun Han <kkamagui@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20180302202706.9434-1-kkamagui@gmail.com
Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/cpu/mcheck/mce.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 181f6cf25895..21962c48dad7 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -56,6 +56,9 @@
 
 static DEFINE_MUTEX(mce_log_mutex);
 
+/* sysfs synchronization */
+static DEFINE_MUTEX(mce_sysfs_mutex);
+
 #define CREATE_TRACE_POINTS
 #include <trace/events/mce.h>
 
@@ -2104,6 +2107,7 @@ static ssize_t set_ignore_ce(struct device *s,
 	if (kstrtou64(buf, 0, &new) < 0)
 		return -EINVAL;
 
+	mutex_lock(&mce_sysfs_mutex);
 	if (mca_cfg.ignore_ce ^ !!new) {
 		if (new) {
 			/* disable ce features */
@@ -2116,6 +2120,8 @@ static ssize_t set_ignore_ce(struct device *s,
 			on_each_cpu(mce_enable_ce, (void *)1, 1);
 		}
 	}
+	mutex_unlock(&mce_sysfs_mutex);
+
 	return size;
 }
 
@@ -2128,6 +2134,7 @@ static ssize_t set_cmci_disabled(struct device *s,
 	if (kstrtou64(buf, 0, &new) < 0)
 		return -EINVAL;
 
+	mutex_lock(&mce_sysfs_mutex);
 	if (mca_cfg.cmci_disabled ^ !!new) {
 		if (new) {
 			/* disable cmci */
@@ -2139,6 +2146,8 @@ static ssize_t set_cmci_disabled(struct device *s,
 			on_each_cpu(mce_enable_ce, NULL, 1);
 		}
 	}
+	mutex_unlock(&mce_sysfs_mutex);
+
 	return size;
 }
 
@@ -2146,8 +2155,19 @@ static ssize_t store_int_with_restart(struct device *s,
 				      struct device_attribute *attr,
 				      const char *buf, size_t size)
 {
-	ssize_t ret = device_store_int(s, attr, buf, size);
+	unsigned long old_check_interval = check_interval;
+	ssize_t ret = device_store_ulong(s, attr, buf, size);
+
+	if (check_interval == old_check_interval)
+		return ret;
+
+	if (check_interval < 1)
+		check_interval = 1;
+
+	mutex_lock(&mce_sysfs_mutex);
 	mce_restart();
+	mutex_unlock(&mce_sysfs_mutex);
+
 	return ret;
 }
 
-- 
2.13.0


* [tip:ras/core] x86/MCE: Cleanup and complete struct mce fields definitions
  2018-03-06 14:21 ` [PATCH 2/3] x86/MCE: Cleanup and complete struct mce fields definitions Borislav Petkov
@ 2018-03-08 15:00   ` tip-bot for Borislav Petkov
  0 siblings, 0 replies; 5+ messages in thread
From: tip-bot for Borislav Petkov @ 2018-03-08 15:00 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: bp, mingo, linux-kernel, hpa, tglx, tony.luck

Commit-ID:  24193c5de470358d0ed70e1f8e58fdaf83823b95
Gitweb:     https://git.kernel.org/tip/24193c5de470358d0ed70e1f8e58fdaf83823b95
Author:     Borislav Petkov <bp@suse.de>
AuthorDate: Tue, 6 Mar 2018 15:21:42 +0100
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 8 Mar 2018 15:52:59 +0100

x86/MCE: Cleanup and complete struct mce fields definitions

The struct is part of the uapi. Document that fact and all fields
properly, and fix the formatting.

No functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20180306142143.19990-3-bp@alien8.de
---
 arch/x86/include/uapi/asm/mce.h | 52 ++++++++++++++++++++++-------------------
 1 file changed, 28 insertions(+), 24 deletions(-)

diff --git a/arch/x86/include/uapi/asm/mce.h b/arch/x86/include/uapi/asm/mce.h
index 435db58a7bad..955c2a2e1cf9 100644
--- a/arch/x86/include/uapi/asm/mce.h
+++ b/arch/x86/include/uapi/asm/mce.h
@@ -5,32 +5,36 @@
 #include <linux/types.h>
 #include <linux/ioctl.h>
 
-/* Fields are zero when not available */
+/*
+ * Fields are zero when not available. Also, this struct is shared with
+ * userspace mcelog and thus must keep existing fields at current offsets.
+ * Only add new fields to the end of the structure
+ */
 struct mce {
-	__u64 status;
-	__u64 misc;
-	__u64 addr;
-	__u64 mcgstatus;
-	__u64 ip;
-	__u64 tsc;	/* cpu time stamp counter */
-	__u64 time;	/* wall time_t when error was detected */
-	__u8  cpuvendor;	/* cpu vendor as encoded in system.h */
-	__u8  inject_flags;	/* software inject flags */
-	__u8  severity;
+	__u64 status;		/* Bank's MCi_STATUS MSR */
+	__u64 misc;		/* Bank's MCi_MISC MSR */
+	__u64 addr;		/* Bank's MCi_ADDR MSR */
+	__u64 mcgstatus;	/* Machine Check Global Status MSR */
+	__u64 ip;		/* Instruction Pointer when the error happened */
+	__u64 tsc;		/* CPU time stamp counter */
+	__u64 time;		/* Wall time_t when error was detected */
+	__u8  cpuvendor;	/* Kernel's X86_VENDOR enum */
+	__u8  inject_flags;	/* Software inject flags */
+	__u8  severity;		/* Error severity */
 	__u8  pad;
-	__u32 cpuid;	/* CPUID 1 EAX */
-	__u8  cs;		/* code segment */
-	__u8  bank;	/* machine check bank */
-	__u8  cpu;	/* cpu number; obsolete; use extcpu now */
-	__u8  finished;   /* entry is valid */
-	__u32 extcpu;	/* linux cpu number that detected the error */
-	__u32 socketid;	/* CPU socket ID */
-	__u32 apicid;	/* CPU initial apic ID */
-	__u64 mcgcap;	/* MCGCAP MSR: machine check capabilities of CPU */
-	__u64 synd;	/* MCA_SYND MSR: only valid on SMCA systems */
-	__u64 ipid;	/* MCA_IPID MSR: only valid on SMCA systems */
-	__u64 ppin;	/* Protected Processor Inventory Number */
-	__u32 microcode;/* Microcode revision */
+	__u32 cpuid;		/* CPUID 1 EAX */
+	__u8  cs;		/* Code segment */
+	__u8  bank;		/* Machine check bank reporting the error */
+	__u8  cpu;		/* CPU number; obsoleted by extcpu */
+	__u8  finished;		/* Entry is valid */
+	__u32 extcpu;		/* Linux CPU number that detected the error */
+	__u32 socketid;		/* CPU socket ID */
+	__u32 apicid;		/* CPU initial APIC ID */
+	__u64 mcgcap;		/* MCGCAP MSR: machine check capabilities of CPU */
+	__u64 synd;		/* MCA_SYND MSR: only valid on SMCA systems */
+	__u64 ipid;		/* MCA_IPID MSR: only valid on SMCA systems */
+	__u64 ppin;		/* Protected Processor Inventory Number */
+	__u32 microcode;	/* Microcode revision */
 };
 
 #define MCE_GET_RECORD_LEN   _IOR('M', 1, int)
