linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/4] RAS: Add CEC collector and deprecate mcelog
@ 2017-03-09 10:08 Borislav Petkov
  2017-03-09 10:08 ` [PATCH 1/4] x86/MCE: Rename mce_log()'s argument Borislav Petkov
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Borislav Petkov @ 2017-03-09 10:08 UTC (permalink / raw)
  To: Tony Luck; +Cc: X86 ML, linux-edac, LKML

From: Borislav Petkov <bp@suse.de>

Hi,

here's the latest incarnation of the CEC collector. I think I've taken
care of all review comments but feel free to correct me here. The
introductory comment in cec.c should explain the whole deal. I'm
pointing there so that we have that text in the actual source and
don't spread it around commit messages. So please have a look there for
more info.

The thing now has knobs in debugfs which control its operation; I
hope I've chosen sane default values.
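
For orientation, the limits behind those knobs, as they appear in cec.c
in patch 3/4, can be summarized in a small userspace sketch. The helper
names check_decay_interval() and clamp_count_threshold() are made up for
illustration, and PAGE_SHIFT is assumed to be 12 (4K pages); the kernel
code does the equivalent checks inside its debugfs setters.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define PAGE_SHIFT	12	/* assumption: 4K pages */
#define DECAY_BITS	2
#define COUNT_MASK	((1ULL << (PAGE_SHIFT - DECAY_BITS)) - 1)

/* Decay timer limits, in seconds, mirroring the values in cec.c. */
#define CEC_TIMER_DEFAULT_INTERVAL	(24ULL * 60 * 60)	/* 24 hrs */
#define CEC_TIMER_MIN_INTERVAL		(1ULL * 60 * 60)	/* 1h */
#define CEC_TIMER_MAX_INTERVAL		(30ULL * 24 * 60 * 60)	/* one month */

/* With 4K pages the default (and maximum) action threshold is 1023. */
static_assert(COUNT_MASK == 1023, "10 count bits expected");

/* Hypothetical helper: a decay interval outside [1h, 30 days] is rejected. */
static inline int check_decay_interval(uint64_t secs)
{
	if (secs < CEC_TIMER_MIN_INTERVAL || secs > CEC_TIMER_MAX_INTERVAL)
		return -EINVAL;

	return 0;
}

/* Hypothetical helper: a too-large threshold is silently clamped. */
static inline unsigned int clamp_count_threshold(uint64_t val)
{
	return (unsigned int)(val > COUNT_MASK ? COUNT_MASK : val);
}
```

So the default decay interval is one day and, unless the threshold is
lowered, a page needs to collect the full 10-bit count before the
collector acts on it.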

Borislav Petkov (3):
  x86/MCE: Rename mce_log()'s argument
  x86/MCE: Rename mce_log to mce_log_buffer
  RAS: Add a Corrected Errors Collector

Tony Luck (1):
  x86/mce: Deprecate /dev/mcelog

 Documentation/admin-guide/kernel-parameters.txt |   6 +
 arch/x86/Kconfig                                |  10 +-
 arch/x86/include/asm/mce.h                      |  12 +-
 arch/x86/kernel/cpu/mcheck/Makefile             |   2 +
 arch/x86/kernel/cpu/mcheck/dev-mcelog.c         | 397 ++++++++++++++++++
 arch/x86/kernel/cpu/mcheck/mce-internal.h       |   8 +
 arch/x86/kernel/cpu/mcheck/mce.c                | 493 ++++------------------
 arch/x86/ras/Kconfig                            |  14 +
 drivers/ras/Makefile                            |   3 +-
 drivers/ras/cec.c                               | 530 ++++++++++++++++++++++++
 drivers/ras/debugfs.c                           |   2 +-
 drivers/ras/debugfs.h                           |   8 +
 drivers/ras/ras.c                               |  11 +
 include/linux/ras.h                             |  13 +-
 14 files changed, 1097 insertions(+), 412 deletions(-)
 create mode 100644 arch/x86/kernel/cpu/mcheck/dev-mcelog.c
 create mode 100644 drivers/ras/cec.c
 create mode 100644 drivers/ras/debugfs.h

-- 
2.11.0


* [PATCH 1/4] x86/MCE: Rename mce_log()'s argument
  2017-03-09 10:08 [PATCH 0/4] RAS: Add CEC collector and deprecate mcelog Borislav Petkov
@ 2017-03-09 10:08 ` Borislav Petkov
  2017-03-09 10:08 ` [PATCH 2/4] x86/MCE: Rename mce_log to mce_log_buffer Borislav Petkov
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2017-03-09 10:08 UTC (permalink / raw)
  To: Tony Luck; +Cc: X86 ML, linux-edac, LKML

From: Borislav Petkov <bp@suse.de>

We call it everywhere "struct mce *m". Adjust that here too to avoid
confusion.

No functionality change.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/cpu/mcheck/mce.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 177472ace838..dee5db6065cb 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -156,14 +156,14 @@ static struct mce_log mcelog = {
 	.recordlen	= sizeof(struct mce),
 };
 
-void mce_log(struct mce *mce)
+void mce_log(struct mce *m)
 {
 	unsigned next, entry;
 
 	/* Emit the trace record: */
-	trace_mce_record(mce);
+	trace_mce_record(m);
 
-	if (!mce_gen_pool_add(mce))
+	if (!mce_gen_pool_add(m))
 		irq_work_queue(&mce_irq_work);
 
 	wmb();
@@ -193,7 +193,7 @@ void mce_log(struct mce *mce)
 		if (cmpxchg(&mcelog.next, entry, next) == entry)
 			break;
 	}
-	memcpy(mcelog.entry + entry, mce, sizeof(struct mce));
+	memcpy(mcelog.entry + entry, m, sizeof(struct mce));
 	wmb();
 	mcelog.entry[entry].finished = 1;
 	wmb();
-- 
2.11.0


* [PATCH 2/4] x86/MCE: Rename mce_log to mce_log_buffer
  2017-03-09 10:08 [PATCH 0/4] RAS: Add CEC collector and deprecate mcelog Borislav Petkov
  2017-03-09 10:08 ` [PATCH 1/4] x86/MCE: Rename mce_log()'s argument Borislav Petkov
@ 2017-03-09 10:08 ` Borislav Petkov
  2017-03-09 10:08 ` [PATCH 3/4] RAS: Add a Corrected Errors Collector Borislav Petkov
  2017-03-09 10:08 ` [PATCH 4/4] x86/mce: Deprecate /dev/mcelog Borislav Petkov
  3 siblings, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2017-03-09 10:08 UTC (permalink / raw)
  To: Tony Luck; +Cc: X86 ML, linux-edac, LKML

From: Borislav Petkov <bp@suse.de>

It is confusing when staring at "struct mce_log mcelog" and then there's
also a function called mce_log(). So call the buffer what it is.

No functionality change.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/mce.h       |  2 +-
 arch/x86/kernel/cpu/mcheck/mce.c | 30 +++++++++++++++---------------
 2 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index e63873683d4a..0512dcc11750 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -128,7 +128,7 @@
  * debugging tools.  Each entry is only valid when its finished flag
  * is set.
  */
-struct mce_log {
+struct mce_log_buffer {
 	char signature[12]; /* "MACHINECHECK" */
 	unsigned len;	    /* = MCE_LOG_LEN */
 	unsigned next;
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index dee5db6065cb..8a347f98eda2 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -150,7 +150,7 @@ EXPORT_PER_CPU_SYMBOL_GPL(injectm);
  * separate MCEs from kernel messages to avoid bogus bug reports.
  */
 
-static struct mce_log mcelog = {
+static struct mce_log_buffer mcelog_buf = {
 	.signature	= MCE_LOG_SIGNATURE,
 	.len		= MCE_LOG_LEN,
 	.recordlen	= sizeof(struct mce),
@@ -168,7 +168,7 @@ void mce_log(struct mce *m)
 
 	wmb();
 	for (;;) {
-		entry = mce_log_get_idx_check(mcelog.next);
+		entry = mce_log_get_idx_check(mcelog_buf.next);
 		for (;;) {
 
 			/*
@@ -178,11 +178,11 @@ void mce_log(struct mce *m)
 			 */
 			if (entry >= MCE_LOG_LEN) {
 				set_bit(MCE_OVERFLOW,
-					(unsigned long *)&mcelog.flags);
+					(unsigned long *)&mcelog_buf.flags);
 				return;
 			}
 			/* Old left over entry. Skip: */
-			if (mcelog.entry[entry].finished) {
+			if (mcelog_buf.entry[entry].finished) {
 				entry++;
 				continue;
 			}
@@ -190,12 +190,12 @@ void mce_log(struct mce *m)
 		}
 		smp_rmb();
 		next = entry + 1;
-		if (cmpxchg(&mcelog.next, entry, next) == entry)
+		if (cmpxchg(&mcelog_buf.next, entry, next) == entry)
 			break;
 	}
-	memcpy(mcelog.entry + entry, m, sizeof(struct mce));
+	memcpy(mcelog_buf.entry + entry, m, sizeof(struct mce));
 	wmb();
-	mcelog.entry[entry].finished = 1;
+	mcelog_buf.entry[entry].finished = 1;
 	wmb();
 
 	set_bit(0, &mce_need_notify);
@@ -1947,7 +1947,7 @@ static ssize_t mce_chrdev_read(struct file *filp, char __user *ubuf,
 			goto out;
 	}
 
-	next = mce_log_get_idx_check(mcelog.next);
+	next = mce_log_get_idx_check(mcelog_buf.next);
 
 	/* Only supports full reads right now */
 	err = -EINVAL;
@@ -1959,7 +1959,7 @@ static ssize_t mce_chrdev_read(struct file *filp, char __user *ubuf,
 	do {
 		for (i = prev; i < next; i++) {
 			unsigned long start = jiffies;
-			struct mce *m = &mcelog.entry[i];
+			struct mce *m = &mcelog_buf.entry[i];
 
 			while (!m->finished) {
 				if (time_after_eq(jiffies, start + 2)) {
@@ -1975,10 +1975,10 @@ static ssize_t mce_chrdev_read(struct file *filp, char __user *ubuf,
 			;
 		}
 
-		memset(mcelog.entry + prev, 0,
+		memset(mcelog_buf.entry + prev, 0,
 		       (next - prev) * sizeof(struct mce));
 		prev = next;
-		next = cmpxchg(&mcelog.next, prev, 0);
+		next = cmpxchg(&mcelog_buf.next, prev, 0);
 	} while (next != prev);
 
 	synchronize_sched();
@@ -1990,7 +1990,7 @@ static ssize_t mce_chrdev_read(struct file *filp, char __user *ubuf,
 	on_each_cpu(collect_tscs, cpu_tsc, 1);
 
 	for (i = next; i < MCE_LOG_LEN; i++) {
-		struct mce *m = &mcelog.entry[i];
+		struct mce *m = &mcelog_buf.entry[i];
 
 		if (m->finished && m->tsc < cpu_tsc[m->cpu]) {
 			err |= copy_to_user(buf, m, sizeof(*m));
@@ -2013,7 +2013,7 @@ static ssize_t mce_chrdev_read(struct file *filp, char __user *ubuf,
 static unsigned int mce_chrdev_poll(struct file *file, poll_table *wait)
 {
 	poll_wait(file, &mce_chrdev_wait, wait);
-	if (READ_ONCE(mcelog.next))
+	if (READ_ONCE(mcelog_buf.next))
 		return POLLIN | POLLRDNORM;
 	if (!mce_apei_read_done && apei_check_mce())
 		return POLLIN | POLLRDNORM;
@@ -2037,8 +2037,8 @@ static long mce_chrdev_ioctl(struct file *f, unsigned int cmd,
 		unsigned flags;
 
 		do {
-			flags = mcelog.flags;
-		} while (cmpxchg(&mcelog.flags, flags, 0) != flags);
+			flags = mcelog_buf.flags;
+		} while (cmpxchg(&mcelog_buf.flags, flags, 0) != flags);
 
 		return put_user(flags, p);
 	}
-- 
2.11.0


* [PATCH 3/4] RAS: Add a Corrected Errors Collector
  2017-03-09 10:08 [PATCH 0/4] RAS: Add CEC collector and deprecate mcelog Borislav Petkov
  2017-03-09 10:08 ` [PATCH 1/4] x86/MCE: Rename mce_log()'s argument Borislav Petkov
  2017-03-09 10:08 ` [PATCH 2/4] x86/MCE: Rename mce_log to mce_log_buffer Borislav Petkov
@ 2017-03-09 10:08 ` Borislav Petkov
  2017-03-12 13:43   ` Boris Petkov
                     ` (2 more replies)
  2017-03-09 10:08 ` [PATCH 4/4] x86/mce: Deprecate /dev/mcelog Borislav Petkov
  3 siblings, 3 replies; 15+ messages in thread
From: Borislav Petkov @ 2017-03-09 10:08 UTC (permalink / raw)
  To: Tony Luck; +Cc: X86 ML, linux-edac, LKML

From: Borislav Petkov <bp@suse.de>

A simple data structure for collecting correctable errors along with
accessors. More detailed description in the code itself.
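
As a quick illustration of the element layout described in the cec.c
header comment, here is a userspace sketch of the packing, the decay
step and the LRU victim selection. The elem_*() and lru_victim() names
are invented for this example and PAGE_SHIFT is assumed to be 12; the
patch itself uses the PFN()/DECAY()/COUNT()/FULL_COUNT() macros.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT	12	/* assumption: 4K pages */
#define DECAY_BITS	2
#define DECAY_MASK	((1ULL << DECAY_BITS) - 1)
#define COUNT_BITS	(PAGE_SHIFT - DECAY_BITS)
#define COUNT_MASK	((1ULL << COUNT_BITS) - 1)
#define FULL_COUNT_MASK	((1ULL << PAGE_SHIFT) - 1)

/* u64: [ 63 ... PFN ... 12 | 11 generation 10 | 9 ... count ... 0 ] */
static inline uint64_t elem_pfn(uint64_t e)
{
	return e >> PAGE_SHIFT;
}

static inline unsigned int elem_decay(uint64_t e)
{
	return (unsigned int)((e >> COUNT_BITS) & DECAY_MASK);
}

static inline unsigned int elem_count(uint64_t e)
{
	return (unsigned int)(e & COUNT_MASK);
}

/* A fresh element starts with generation 11b and a count of 1. */
static inline uint64_t elem_new(uint64_t pfn)
{
	assert((pfn >> (64 - PAGE_SHIFT)) == 0);	/* PFN must fit above bit 11 */

	return (pfn << PAGE_SHIFT) | (DECAY_MASK << COUNT_BITS) | 1;
}

/* A repeated error resets the generation to 11b and bumps the count. */
static inline uint64_t elem_touch(uint64_t e)
{
	e |= DECAY_MASK << COUNT_BITS;

	return e + 1;
}

/* One spring cleaning pass: 11b -> 10b -> 01b -> 00b, count untouched. */
static inline uint64_t elem_decay_once(uint64_t e)
{
	uint64_t decay = elem_decay(e);

	if (!decay)
		return e;

	e &= ~(DECAY_MASK << COUNT_BITS);

	return e | ((decay - 1) << COUNT_BITS);
}

/* Eviction candidate: smallest 12-bit (generation | count) value. */
static inline unsigned int lru_victim(const uint64_t *arr, unsigned int n)
{
	uint64_t min = FULL_COUNT_MASK;
	unsigned int i, min_idx = 0;

	for (i = 0; i < n; i++) {
		uint64_t full = arr[i] & FULL_COUNT_MASK;

		if (min > full) {
			min = full;
			min_idx = i;
		}
	}

	return min_idx;
}
```

Because the generation sits above the count, a recently touched element
always compares higher than a decayed one, whatever their error counts
are, which is what gives the LRU-like eviction order.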

Error decoding now goes through the decoding chain. mce_first_notifier()
gets to see the error first and the CEC decides whether to absorb it
silently, in which case the rest of the chain doesn't hear about it
(basically the main reason for the CE collector), or to continue running
the notifiers.

When the CEC hits the action threshold, it will try to soft-offline the
page containing the ECC error and then the whole decoding chain gets to
see the error.
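
The flow above can be sketched in userspace as follows. This is a toy
model, not the code from the patch: struct toy_cec tracks a single PFN
instead of a sorted page-sized array, and the toy_*() names and the
*_SKETCH constants are made up for the example. Only the return value
convention (0 means "absorbed", non-zero means "let the chain see it")
follows cec_add_elem().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { NOTIFY_DONE_SKETCH = 0, NOTIFY_STOP_SKETCH = 1 };

/* Toy collector: remembers one PFN and how often it has been hit. */
struct toy_cec {
	uint64_t pfn;
	unsigned int count;		/* 0 means "empty" */
	unsigned int threshold;		/* action threshold */
};

/* Returns 0 if the error was absorbed, 1 if the threshold was reached. */
static inline int toy_cec_add(struct toy_cec *c, uint64_t pfn)
{
	assert(c);

	if (!c->count || c->pfn != pfn) {
		/* First sighting: insert with a count of 1. */
		c->pfn = pfn;
		c->count = 1;
		return 0;
	}

	if (c->count < c->threshold) {
		c->count++;
		return 0;
	}

	/* Threshold reached: the real code soft-offlines and deletes here. */
	c->count = 0;

	return 1;
}

/* First notifier: stop the chain only if the collector took the error. */
static inline int toy_first_notifier(struct toy_cec *c, uint64_t pfn,
				     bool is_memory_ce)
{
	if (is_memory_ce && !toy_cec_add(c, pfn))
		return NOTIFY_STOP_SKETCH;

	return NOTIFY_DONE_SKETCH;
}
```

With a threshold of 2, the first two errors on a page are swallowed, the
third one triggers the action and is passed down the chain, and the page
starts counting from scratch afterwards.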

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 Documentation/admin-guide/kernel-parameters.txt |   6 +
 arch/x86/include/asm/mce.h                      |   9 +-
 arch/x86/kernel/cpu/mcheck/mce.c                | 188 +++++----
 arch/x86/ras/Kconfig                            |  14 +
 drivers/ras/Makefile                            |   3 +-
 drivers/ras/cec.c                               | 530 ++++++++++++++++++++++++
 drivers/ras/debugfs.c                           |   2 +-
 drivers/ras/debugfs.h                           |   8 +
 drivers/ras/ras.c                               |  11 +
 include/linux/ras.h                             |  13 +-
 10 files changed, 701 insertions(+), 83 deletions(-)
 create mode 100644 drivers/ras/cec.c
 create mode 100644 drivers/ras/debugfs.h

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 986e44387dad..7a0844cff130 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3163,6 +3163,12 @@
 	ramdisk_size=	[RAM] Sizes of RAM disks in kilobytes
 			See Documentation/blockdev/ramdisk.txt.
 
+	ras=option[,option,...]	[KNL] RAS-specific options
+
+		cec_disable	[X86]
+				Disable the Correctable Errors Collector,
+				see CONFIG_RAS_CEC help text.
+
 	rcu_nocbs=	[KNL]
 			The argument is a cpu list, as described above.
 
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 0512dcc11750..c5ae545d27d8 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -191,10 +191,11 @@ extern struct mca_config mca_cfg;
 extern struct mca_msr_regs msr_ops;
 
 enum mce_notifier_prios {
-	MCE_PRIO_SRAO		= INT_MAX,
-	MCE_PRIO_EXTLOG		= INT_MAX - 1,
-	MCE_PRIO_NFIT		= INT_MAX - 2,
-	MCE_PRIO_EDAC		= INT_MAX - 3,
+	MCE_PRIO_FIRST		= INT_MAX,
+	MCE_PRIO_SRAO		= INT_MAX - 1,
+	MCE_PRIO_EXTLOG		= INT_MAX - 2,
+	MCE_PRIO_NFIT		= INT_MAX - 3,
+	MCE_PRIO_EDAC		= INT_MAX - 4,
 	MCE_PRIO_LOWEST		= 0,
 };
 
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 8a347f98eda2..cd750cfbcb93 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -35,6 +35,7 @@
 #include <linux/poll.h>
 #include <linux/nmi.h>
 #include <linux/cpu.h>
+#include <linux/ras.h>
 #include <linux/smp.h>
 #include <linux/fs.h>
 #include <linux/mm.h>
@@ -158,47 +159,8 @@ static struct mce_log_buffer mcelog_buf = {
 
 void mce_log(struct mce *m)
 {
-	unsigned next, entry;
-
-	/* Emit the trace record: */
-	trace_mce_record(m);
-
 	if (!mce_gen_pool_add(m))
 		irq_work_queue(&mce_irq_work);
-
-	wmb();
-	for (;;) {
-		entry = mce_log_get_idx_check(mcelog_buf.next);
-		for (;;) {
-
-			/*
-			 * When the buffer fills up discard new entries.
-			 * Assume that the earlier errors are the more
-			 * interesting ones:
-			 */
-			if (entry >= MCE_LOG_LEN) {
-				set_bit(MCE_OVERFLOW,
-					(unsigned long *)&mcelog_buf.flags);
-				return;
-			}
-			/* Old left over entry. Skip: */
-			if (mcelog_buf.entry[entry].finished) {
-				entry++;
-				continue;
-			}
-			break;
-		}
-		smp_rmb();
-		next = entry + 1;
-		if (cmpxchg(&mcelog_buf.next, entry, next) == entry)
-			break;
-	}
-	memcpy(mcelog_buf.entry + entry, m, sizeof(struct mce));
-	wmb();
-	mcelog_buf.entry[entry].finished = 1;
-	wmb();
-
-	set_bit(0, &mce_need_notify);
 }
 
 void mce_inject_log(struct mce *m)
@@ -211,6 +173,12 @@ EXPORT_SYMBOL_GPL(mce_inject_log);
 
 static struct notifier_block mce_srao_nb;
 
+/*
+ * We run the default notifier if we have only the SRAO, the first and the
+ * default notifier registered. I.e., the mandatory NUM_DEFAULT_NOTIFIERS
+ * notifiers registered on the chain.
+ */
+#define NUM_DEFAULT_NOTIFIERS	3
 static atomic_t num_notifiers;
 
 void mce_register_decode_chain(struct notifier_block *nb)
@@ -520,7 +488,6 @@ static void mce_schedule_work(void)
 
 static void mce_irq_work_cb(struct irq_work *entry)
 {
-	mce_notify_irq();
 	mce_schedule_work();
 }
 
@@ -563,6 +530,108 @@ static int mce_usable_address(struct mce *m)
 	return 1;
 }
 
+static bool memory_error(struct mce *m)
+{
+	struct cpuinfo_x86 *c = &boot_cpu_data;
+
+	if (c->x86_vendor == X86_VENDOR_AMD) {
+		/* ErrCodeExt[20:16] */
+		u8 xec = (m->status >> 16) & 0x1f;
+
+		return (xec == 0x0 || xec == 0x8);
+	} else if (c->x86_vendor == X86_VENDOR_INTEL) {
+		/*
+		 * Intel SDM Volume 3B - 15.9.2 Compound Error Codes
+		 *
+		 * Bit 7 of the MCACOD field of IA32_MCi_STATUS is used for
+		 * indicating a memory error. Bit 8 is used for indicating a
+		 * cache hierarchy error. The combination of bit 2 and bit 3
+		 * is used for indicating a `generic' cache hierarchy error
+		 * But we can't just blindly check the above bits, because if
+		 * bit 11 is set, then it is a bus/interconnect error - and
+		 * either way the above bits just gives more detail on what
+		 * bus/interconnect error happened. Note that bit 12 can be
+		 * ignored, as it's the "filter" bit.
+		 */
+		return (m->status & 0xef80) == BIT(7) ||
+		       (m->status & 0xef00) == BIT(8) ||
+		       (m->status & 0xeffc) == 0xc;
+	}
+
+	return false;
+}
+
+static bool cec_add_mce(struct mce *m)
+{
+	if (!m)
+		return false;
+
+	if (memory_error(m) && mce_usable_address(m))
+		if (!cec_add_elem(m->addr >> PAGE_SHIFT))
+			return true;
+
+	return false;
+}
+
+static int mce_first_notifier(struct notifier_block *nb, unsigned long val,
+			      void *data)
+{
+	struct mce *m = (struct mce *)data;
+	unsigned int next, entry;
+
+	if (!m)
+		return NOTIFY_DONE;
+
+	if (cec_add_mce(m))
+		return NOTIFY_STOP;
+
+	/* Emit the trace record: */
+	trace_mce_record(m);
+
+	wmb();
+	for (;;) {
+		entry = mce_log_get_idx_check(mcelog_buf.next);
+		for (;;) {
+
+			/*
+			 * When the buffer fills up discard new entries.
+			 * Assume that the earlier errors are the more
+			 * interesting ones:
+			 */
+			if (entry >= MCE_LOG_LEN) {
+				set_bit(MCE_OVERFLOW,
+					(unsigned long *)&mcelog_buf.flags);
+				return NOTIFY_DONE;
+			}
+			/* Old left over entry. Skip: */
+			if (mcelog_buf.entry[entry].finished) {
+				entry++;
+				continue;
+			}
+			break;
+		}
+		smp_rmb();
+		next = entry + 1;
+		if (cmpxchg(&mcelog_buf.next, entry, next) == entry)
+			break;
+	}
+	memcpy(mcelog_buf.entry + entry, m, sizeof(struct mce));
+	wmb();
+	mcelog_buf.entry[entry].finished = 1;
+	wmb();
+
+	set_bit(0, &mce_need_notify);
+
+	mce_notify_irq();
+
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block first_nb = {
+	.notifier_call	= mce_first_notifier,
+	.priority	= MCE_PRIO_FIRST,
+};
+
 static int srao_decode_notifier(struct notifier_block *nb, unsigned long val,
 				void *data)
 {
@@ -592,11 +661,7 @@ static int mce_default_notifier(struct notifier_block *nb, unsigned long val,
 	if (!m)
 		return NOTIFY_DONE;
 
-	/*
-	 * Run the default notifier if we have only the SRAO
-	 * notifier and us registered.
-	 */
-	if (atomic_read(&num_notifiers) > 2)
+	if (atomic_read(&num_notifiers) > NUM_DEFAULT_NOTIFIERS)
 		return NOTIFY_DONE;
 
 	__print_mce(m);
@@ -649,37 +714,6 @@ static void mce_read_aux(struct mce *m, int i)
 	}
 }
 
-static bool memory_error(struct mce *m)
-{
-	struct cpuinfo_x86 *c = &boot_cpu_data;
-
-	if (c->x86_vendor == X86_VENDOR_AMD) {
-		/* ErrCodeExt[20:16] */
-		u8 xec = (m->status >> 16) & 0x1f;
-
-		return (xec == 0x0 || xec == 0x8);
-	} else if (c->x86_vendor == X86_VENDOR_INTEL) {
-		/*
-		 * Intel SDM Volume 3B - 15.9.2 Compound Error Codes
-		 *
-		 * Bit 7 of the MCACOD field of IA32_MCi_STATUS is used for
-		 * indicating a memory error. Bit 8 is used for indicating a
-		 * cache hierarchy error. The combination of bit 2 and bit 3
-		 * is used for indicating a `generic' cache hierarchy error
-		 * But we can't just blindly check the above bits, because if
-		 * bit 11 is set, then it is a bus/interconnect error - and
-		 * either way the above bits just gives more detail on what
-		 * bus/interconnect error happened. Note that bit 12 can be
-		 * ignored, as it's the "filter" bit.
-		 */
-		return (m->status & 0xef80) == BIT(7) ||
-		       (m->status & 0xef00) == BIT(8) ||
-		       (m->status & 0xeffc) == 0xc;
-	}
-
-	return false;
-}
-
 DEFINE_PER_CPU(unsigned, mce_poll_count);
 
 /*
@@ -2156,6 +2190,7 @@ __setup("mce", mcheck_enable);
 int __init mcheck_init(void)
 {
 	mcheck_intel_therm_init();
+	mce_register_decode_chain(&first_nb);
 	mce_register_decode_chain(&mce_srao_nb);
 	mce_register_decode_chain(&mce_default_nb);
 	mcheck_vendor_init_severity();
@@ -2705,6 +2740,7 @@ static int __init mcheck_late_init(void)
 		static_branch_inc(&mcsafe_key);
 
 	mcheck_debugfs_init();
+	cec_init();
 
 	/*
 	 * Flush out everything that has been logged during early boot, now that
diff --git a/arch/x86/ras/Kconfig b/arch/x86/ras/Kconfig
index 0bc60a308730..2a2d89d39af6 100644
--- a/arch/x86/ras/Kconfig
+++ b/arch/x86/ras/Kconfig
@@ -7,3 +7,17 @@ config MCE_AMD_INJ
 	  aspects of the MCE handling code.
 
 	  WARNING: Do not even assume this interface is staying stable!
+
+config RAS_CEC
+	bool "Correctable Errors Collector"
+	depends on X86_MCE && MEMORY_FAILURE && DEBUG_FS
+	---help---
+	  This is a small cache which collects correctable memory errors per 4K
+	  page PFN and counts their repeated occurrence. Once the counter for a
+	  PFN overflows, we try to soft-offline that page as we take it to mean
+	  that it has reached a relatively high error count and would probably
+	  be best if we don't use it anymore.
+
+	  Bear in mind that this is absolutely useless if your platform doesn't
+	  have ECC DIMMs and doesn't have DRAM ECC checking enabled in the BIOS.
+
diff --git a/drivers/ras/Makefile b/drivers/ras/Makefile
index d7f73341ced3..7b26dd3aa5d0 100644
--- a/drivers/ras/Makefile
+++ b/drivers/ras/Makefile
@@ -1 +1,2 @@
-obj-$(CONFIG_RAS) += ras.o debugfs.o
+obj-$(CONFIG_RAS)	+= ras.o debugfs.o
+obj-$(CONFIG_RAS_CEC)	+= cec.o
diff --git a/drivers/ras/cec.c b/drivers/ras/cec.c
new file mode 100644
index 000000000000..bebf8ba10171
--- /dev/null
+++ b/drivers/ras/cec.c
@@ -0,0 +1,530 @@
+#include <linux/mm.h>
+#include <linux/gfp.h>
+#include <linux/kernel.h>
+
+#include <asm/mce.h>
+
+#include "debugfs.h"
+
+/*
+ * RAS Correctable Errors Collector
+ *
+ * This is a simple gadget which collects correctable errors and counts their
+ * occurrence per physical page address.
+ *
+ * We've opted for possibly the simplest data structure to collect those - an
+ * array of the size of a memory page. It stores 512 u64's with the following
+ * structure:
+ *
+ * [63 ... PFN ... 12 | 11 ... generation ... 10 | 9 ... count ... 0]
+ *
+ * The generation in the two highest order bits is two bits which are set to 11b
+ * on every insertion. During the course of each entry's existence, the
+ * generation field gets decremented during spring cleaning to 10b, then 01b and
+ * then 00b.
+ *
+ * This way we're employing the natural numeric ordering to make sure that newly
+ * inserted/touched elements have higher 12-bit counts (which we've manufactured)
+ * and thus iterating over the array initially won't kick out those elements
+ * which were inserted last.
+ *
+ * Spring cleaning is what we do when we reach a certain number CLEAN_ELEMS of
+ * elements entered into the array, during which, we're decaying all elements.
+ * If, after decay, an element gets inserted again, its generation is set to 11b
+ * to make sure it has higher numerical count than other, older elements and
+ * thus emulate an LRU-like behavior when deleting elements to free up space
+ * in the page.
+ *
+ * When an element reaches its max count of count_threshold, we try to poison
+ * it by assuming that errors triggered count_threshold times in a single page
+ * are excessive and that page shouldn't be used anymore. count_threshold is
+ * initialized to COUNT_MASK which is the maximum.
+ *
+ * That error event entry causes cec_add_elem() to return !0 value and thus
+ * signal to its callers to log the error.
+ *
+ * To the question why we've chosen a page and moving elements around with
+ * memmove(), it is because it is a very simple structure to handle and max data
+ * movement is 4K which on highly optimized modern CPUs is almost unnoticeable.
+ * We wanted to avoid the pointer traversal of more complex structures like a
+ * linked list or some sort of a balancing search tree.
+ *
+ * Deleting an element takes O(n) but since it is only a single page, it should
+ * be fast enough and it shouldn't happen all too often depending on error
+ * patterns.
+ */
+
+#undef pr_fmt
+#define pr_fmt(fmt) "RAS: " fmt
+
+/*
+ * We use DECAY_BITS bits of PAGE_SHIFT bits for counting decay, i.e., how long
+ * elements have stayed in the array without having been accessed again.
+ */
+#define DECAY_BITS		2
+#define DECAY_MASK		((1ULL << DECAY_BITS) - 1)
+#define MAX_ELEMS		(PAGE_SIZE / sizeof(u64))
+
+/*
+ * Threshold amount of inserted elements after which we start spring
+ * cleaning.
+ */
+#define CLEAN_ELEMS		(MAX_ELEMS >> DECAY_BITS)
+
+/* Bits which count the number of errors that happened in this 4K page. */
+#define COUNT_BITS		(PAGE_SHIFT - DECAY_BITS)
+#define COUNT_MASK		((1ULL << COUNT_BITS) - 1)
+#define FULL_COUNT_MASK		(PAGE_SIZE - 1)
+
+/*
+ * u64: [ 63 ... 12 | DECAY_BITS | COUNT_BITS ]
+ */
+
+#define PFN(e)			((e) >> PAGE_SHIFT)
+#define DECAY(e)		(((e) >> COUNT_BITS) & DECAY_MASK)
+#define COUNT(e)		((unsigned int)(e) & COUNT_MASK)
+#define FULL_COUNT(e)		((e) & (PAGE_SIZE - 1))
+
+static struct ce_array {
+	u64 *array;			/* container page */
+	unsigned int n;			/* number of elements in the array */
+
+	unsigned int decay_count;	/*
+					 * number of element insertions/increments
+					 * since the last spring cleaning.
+					 */
+
+	u64 pfns_poisoned;		/*
+					 * number of PFNs which got poisoned.
+					 */
+
+	u64 ces_entered;		/*
+					 * The number of correctable errors
+					 * entered into the collector.
+					 */
+
+	u64 decays_done;		/*
+					 * Times we did spring cleaning.
+					 */
+
+	union {
+		struct {
+			__u32	disabled : 1,	/* cmdline disabled */
+			__resv   : 31;
+		};
+		__u32 flags;
+	};
+} ce_arr;
+
+static DEFINE_MUTEX(ce_mutex);
+static u64 dfs_pfn;
+
+/* Amount of errors after which we offline */
+static unsigned int count_threshold = COUNT_MASK;
+
+/*
+ * The timer "decays" element count each timer_interval which is 24hrs by
+ * default.
+ */
+
+#define CEC_TIMER_DEFAULT_INTERVAL	(24 * 60 * 60)		/* 24 hrs */
+#define CEC_TIMER_MIN_INTERVAL		(1 * 60 * 60)		/* 1h */
+#define CEC_TIMER_MAX_INTERVAL		(30 * 24 * 60 * 60)	/* one month */
+static struct timer_list cec_timer;
+static u64 timer_interval = CEC_TIMER_DEFAULT_INTERVAL;
+
+/*
+ * Decrement decay value. We're using DECAY_BITS bits to denote decay of an
+ * element in the array. On insertion and any access, it gets reset to max.
+ */
+static void do_spring_cleaning(struct ce_array *ca)
+{
+	int i;
+
+	for (i = 0; i < ca->n; i++) {
+		u8 decay = DECAY(ca->array[i]);
+
+		if (!decay)
+			continue;
+
+		decay--;
+
+		ca->array[i] &= ~(DECAY_MASK << COUNT_BITS);
+		ca->array[i] |= (decay << COUNT_BITS);
+	}
+	ca->decay_count = 0;
+	ca->decays_done++;
+}
+
+/*
+ * @interval in seconds
+ */
+static void cec_mod_timer(struct timer_list *t, unsigned long interval)
+{
+	unsigned long iv;
+
+	iv = interval * HZ + jiffies;
+
+	mod_timer(t, round_jiffies(iv));
+}
+
+static void cec_timer_fn(unsigned long data)
+{
+	struct ce_array *ca = (struct ce_array *)data;
+
+	do_spring_cleaning(ca);
+
+	cec_mod_timer(&cec_timer, timer_interval);
+}
+
+/*
+ * @to: index of the smallest element which is >= than @pfn.
+ *
+ * Return the index of the pfn if found, otherwise negative value.
+ */
+static int __find_elem(struct ce_array *ca, u64 pfn, unsigned int *to)
+{
+	u64 this_pfn;
+	int min = 0, max = ca->n;
+
+	while (min < max) {
+		int tmp = (max + min) >> 1;
+
+		this_pfn = PFN(ca->array[tmp]);
+
+		if (this_pfn < pfn)
+			min = tmp + 1;
+		else if (this_pfn > pfn)
+			max = tmp;
+		else {
+			min = tmp;
+			break;
+		}
+	}
+
+	if (to)
+		*to = min;
+
+	this_pfn = PFN(ca->array[min]);
+
+	if (this_pfn == pfn)
+		return min;
+
+	return -ENOKEY;
+}
+
+static int find_elem(struct ce_array *ca, u64 pfn, unsigned int *to)
+{
+	WARN_ON(!to);
+
+	if (!ca->n) {
+		*to = 0;
+		return -ENOKEY;
+	}
+	return __find_elem(ca, pfn, to);
+}
+
+static void del_elem(struct ce_array *ca, int idx)
+{
+	/* Save us a function call when deleting the last element. */
+	if (ca->n - (idx + 1))
+		memmove((void *)&ca->array[idx],
+			(void *)&ca->array[idx + 1],
+			(ca->n - (idx + 1)) * sizeof(u64));
+
+	ca->n--;
+}
+
+static u64 del_lru_elem_unlocked(struct ce_array *ca)
+{
+	unsigned int min = FULL_COUNT_MASK;
+	int i, min_idx = 0;
+
+	for (i = 0; i < ca->n; i++) {
+		unsigned int this = FULL_COUNT(ca->array[i]);
+
+		if (min > this) {
+			min = this;
+			min_idx = i;
+		}
+	}
+
+	del_elem(ca, min_idx);
+
+	return PFN(ca->array[min_idx]);
+}
+
+/*
+ * We return the 0th pfn in the error case under the assumption that it cannot
+ * be poisoned and excessive CEs in there are a serious deal anyway.
+ */
+static u64 __maybe_unused del_lru_elem(void)
+{
+	struct ce_array *ca = &ce_arr;
+	u64 pfn;
+
+	if (!ca->n)
+		return 0;
+
+	mutex_lock(&ce_mutex);
+	pfn = del_lru_elem_unlocked(ca);
+	mutex_unlock(&ce_mutex);
+
+	return pfn;
+}
+
+
+int cec_add_elem(u64 pfn)
+{
+	struct ce_array *ca = &ce_arr;
+	unsigned int to;
+	int count, ret = 0;
+
+	/*
+	 * We can be called very early on the identify_cpu() path where we are
+	 * not initialized yet. We ignore the error for simplicity.
+	 */
+	if (!ce_arr.array || ce_arr.disabled)
+		return -ENODEV;
+
+	ca->ces_entered++;
+
+	mutex_lock(&ce_mutex);
+
+	if (ca->n == MAX_ELEMS)
+		WARN_ON(!del_lru_elem_unlocked(ca));
+
+	ret = find_elem(ca, pfn, &to);
+	if (ret < 0) {
+		/*
+		 * Shift range [to-end] to make room for one more element.
+		 */
+		memmove((void *)&ca->array[to + 1],
+			(void *)&ca->array[to],
+			(ca->n - to) * sizeof(u64));
+
+		ca->array[to] = (pfn << PAGE_SHIFT) |
+				(DECAY_MASK << COUNT_BITS) | 1;
+
+		ca->n++;
+
+		ret = 0;
+
+		goto decay;
+	}
+
+	count = COUNT(ca->array[to]);
+
+	if (count < count_threshold) {
+		ca->array[to] |= (DECAY_MASK << COUNT_BITS);
+		ca->array[to]++;
+	} else {
+		u64 pfn = ca->array[to] >> PAGE_SHIFT;
+
+		if (!pfn_valid(pfn)) {
+			pr_warn("CEC: Invalid pfn: 0x%llx\n", pfn);
+		} else {
+			/* We have reached max count for this page, soft-offline it. */
+			pr_err("Soft-offlining pfn: 0x%llx\n", pfn);
+			memory_failure_queue(pfn, 0, MF_SOFT_OFFLINE);
+			ca->pfns_poisoned++;
+		}
+
+		del_elem(ca, to);
+
+		/*
+		 * Return a >0 value to denote that we've reached the offlining
+		 * threshold.
+		 */
+		ret = 1;
+
+		goto unlock;
+	}
+
+decay:
+	ca->decay_count++;
+
+	if (ca->decay_count >= CLEAN_ELEMS)
+		do_spring_cleaning(ca);
+
+unlock:
+	mutex_unlock(&ce_mutex);
+
+	return ret;
+}
+
+static int u64_get(void *data, u64 *val)
+{
+	*val = *(u64 *)data;
+
+	return 0;
+}
+
+static int pfn_set(void *data, u64 val)
+{
+	*(u64 *)data = val;
+
+	return cec_add_elem(val);
+}
+
+DEFINE_DEBUGFS_ATTRIBUTE(pfn_ops, u64_get, pfn_set, "0x%llx\n");
+
+static int decay_interval_set(void *data, u64 val)
+{
+	*(u64 *)data = val;
+
+	if (val < CEC_TIMER_MIN_INTERVAL)
+		return -EINVAL;
+
+	if (val > CEC_TIMER_MAX_INTERVAL)
+		return -EINVAL;
+
+	timer_interval = val;
+
+	cec_mod_timer(&cec_timer, timer_interval);
+	return 0;
+}
+DEFINE_DEBUGFS_ATTRIBUTE(decay_interval_ops, u64_get, decay_interval_set, "%lld\n");
+
+static int count_threshold_set(void *data, u64 val)
+{
+	*(u64 *)data = val;
+
+	if (val > COUNT_MASK)
+		val = COUNT_MASK;
+
+	count_threshold = val;
+
+	return 0;
+}
+DEFINE_DEBUGFS_ATTRIBUTE(count_threshold_ops, u64_get, count_threshold_set, "%lld\n");
+
+static int array_dump(struct seq_file *m, void *v)
+{
+	struct ce_array *ca = &ce_arr;
+	u64 prev = 0;
+	int i;
+
+	mutex_lock(&ce_mutex);
+
+	seq_printf(m, "{ n: %d\n", ca->n);
+	for (i = 0; i < ca->n; i++) {
+		u64 this = PFN(ca->array[i]);
+
+		seq_printf(m, " %03d: [%016llx|%03llx]\n", i, this, FULL_COUNT(ca->array[i]));
+
+		WARN_ON(prev > this);
+
+		prev = this;
+	}
+
+	seq_printf(m, "}\n");
+
+	seq_printf(m, "Stats:\nCEs: %llu\nofflined pages: %llu\n",
+		   ca->ces_entered, ca->pfns_poisoned);
+
+	seq_printf(m, "Flags: 0x%x\n", ca->flags);
+
+	seq_printf(m, "Timer interval: %lld seconds\n", timer_interval);
+	seq_printf(m, "Decays: %lld\n", ca->decays_done);
+
+	seq_printf(m, "Action threshold: %d\n", count_threshold);
+
+	mutex_unlock(&ce_mutex);
+
+	return 0;
+}
+
+static int array_open(struct inode *inode, struct file *filp)
+{
+	return single_open(filp, array_dump, NULL);
+}
+
+static const struct file_operations array_ops = {
+	.owner	 = THIS_MODULE,
+	.open	 = array_open,
+	.read	 = seq_read,
+	.llseek	 = seq_lseek,
+	.release = single_release,
+};
+
+static int __init create_debugfs_nodes(void)
+{
+	struct dentry *d, *pfn, *decay, *count, *array;
+
+	d = debugfs_create_dir("cec", ras_debugfs_dir);
+	if (!d) {
+		pr_warn("Error creating cec debugfs node!\n");
+		return -1;
+	}
+
+	pfn = debugfs_create_file("pfn", S_IRUSR | S_IWUSR, d, &dfs_pfn, &pfn_ops);
+	if (!pfn) {
+		pr_warn("Error creating pfn debugfs node!\n");
+		goto err;
+	}
+
+	array = debugfs_create_file("array", S_IRUSR, d, NULL, &array_ops);
+	if (!array) {
+		pr_warn("Error creating array debugfs node!\n");
+		goto err;
+	}
+
+	decay = debugfs_create_file("decay_interval", S_IRUSR | S_IWUSR, d,
+				    &timer_interval, &decay_interval_ops);
+	if (!decay) {
+		pr_warn("Error creating decay_interval debugfs node!\n");
+		goto err;
+	}
+
+	count = debugfs_create_file("count_threshold", S_IRUSR | S_IWUSR, d,
+				    &count_threshold, &count_threshold_ops);
+	if (!count) {
+		pr_warn("Error creating count_threshold debugfs node!\n");
+		goto err;
+	}
+
+
+	return 0;
+
+err:
+	debugfs_remove_recursive(d);
+
+	return 1;
+}
+
+void __init cec_init(void)
+{
+	if (ce_arr.disabled)
+		return;
+
+	ce_arr.array = (void *)get_zeroed_page(GFP_KERNEL);
+	if (!ce_arr.array) {
+		pr_err("Error allocating CE array page!\n");
+		return;
+	}
+
+	if (create_debugfs_nodes())
+		return;
+
+	setup_timer(&cec_timer, cec_timer_fn, (unsigned long)&ce_arr);
+	cec_mod_timer(&cec_timer, CEC_TIMER_DEFAULT_INTERVAL);
+
+	pr_info("Correctable Errors collector initialized.\n");
+}
+
+int __init parse_cec_param(char *str)
+{
+	if (!str)
+		return 0;
+
+	if (*str == '=')
+		str++;
+
+	if (!strncmp(str, "cec_disable", 11))
+		ce_arr.disabled = 1;
+	else
+		return 0;
+
+	return 1;
+}
diff --git a/drivers/ras/debugfs.c b/drivers/ras/debugfs.c
index 0322acf67ea5..501603057dff 100644
--- a/drivers/ras/debugfs.c
+++ b/drivers/ras/debugfs.c
@@ -1,6 +1,6 @@
 #include <linux/debugfs.h>
 
-static struct dentry *ras_debugfs_dir;
+struct dentry *ras_debugfs_dir;
 
 static atomic_t trace_count = ATOMIC_INIT(0);
 
diff --git a/drivers/ras/debugfs.h b/drivers/ras/debugfs.h
new file mode 100644
index 000000000000..db72e4513191
--- /dev/null
+++ b/drivers/ras/debugfs.h
@@ -0,0 +1,8 @@
+#ifndef __RAS_DEBUGFS_H__
+#define __RAS_DEBUGFS_H__
+
+#include <linux/debugfs.h>
+
+extern struct dentry *ras_debugfs_dir;
+
+#endif /* __RAS_DEBUGFS_H__ */
diff --git a/drivers/ras/ras.c b/drivers/ras/ras.c
index b67dd362b7b6..94f8038864b4 100644
--- a/drivers/ras/ras.c
+++ b/drivers/ras/ras.c
@@ -27,3 +27,14 @@ subsys_initcall(ras_init);
 EXPORT_TRACEPOINT_SYMBOL_GPL(extlog_mem_event);
 #endif
 EXPORT_TRACEPOINT_SYMBOL_GPL(mc_event);
+
+
+int __init parse_ras_param(char *str)
+{
+#ifdef CONFIG_RAS_CEC
+	parse_cec_param(str);
+#endif
+
+	return 1;
+}
+__setup("ras", parse_ras_param);
diff --git a/include/linux/ras.h b/include/linux/ras.h
index 2aceeafd6fe5..ffb147185e8d 100644
--- a/include/linux/ras.h
+++ b/include/linux/ras.h
@@ -1,14 +1,25 @@
 #ifndef __RAS_H__
 #define __RAS_H__
 
+#include <asm/errno.h>
+
 #ifdef CONFIG_DEBUG_FS
 int ras_userspace_consumers(void);
 void ras_debugfs_init(void);
 int ras_add_daemon_trace(void);
 #else
 static inline int ras_userspace_consumers(void) { return 0; }
-static inline void ras_debugfs_init(void) { return; }
+static inline void ras_debugfs_init(void) { }
 static inline int ras_add_daemon_trace(void) { return 0; }
 #endif
 
+#ifdef CONFIG_RAS_CEC
+void __init cec_init(void);
+int __init parse_cec_param(char *str);
+int cec_add_elem(u64 pfn);
+#else
+static inline void __init cec_init(void)	{ }
+static inline int cec_add_elem(u64 pfn)		{ return -ENODEV; }
 #endif
+
+#endif /* __RAS_H__ */
-- 
2.11.0

* [PATCH 4/4] x86/mce: Deprecate /dev/mcelog
  2017-03-09 10:08 [PATCH 0/4] RAS: Add CEC collector and deprecate mcelog Borislav Petkov
                   ` (2 preceding siblings ...)
  2017-03-09 10:08 ` [PATCH 3/4] RAS: Add a Corrected Errors Collector Borislav Petkov
@ 2017-03-09 10:08 ` Borislav Petkov
  3 siblings, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2017-03-09 10:08 UTC (permalink / raw)
  To: Tony Luck; +Cc: X86 ML, linux-edac, LKML

From: Tony Luck <tony.luck@intel.com>

Move all code relating to /dev/mcelog to a separate source file.
The /dev/mcelog driver can now operate from the machine check notifier
chain with a low priority (MCE_PRIO_MCELOG, just above MCE_PRIO_LOWEST).

Boris:
* Move the mce_helper and trigger functionality behind
CONFIG_X86_MCELOG.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/Kconfig                          |  10 +-
 arch/x86/include/asm/mce.h                |   1 +
 arch/x86/kernel/cpu/mcheck/Makefile       |   2 +
 arch/x86/kernel/cpu/mcheck/dev-mcelog.c   | 397 ++++++++++++++++++++++++++++++
 arch/x86/kernel/cpu/mcheck/mce-internal.h |   8 +
 arch/x86/kernel/cpu/mcheck/mce.c          | 367 +--------------------------
 6 files changed, 426 insertions(+), 359 deletions(-)
 create mode 100644 arch/x86/kernel/cpu/mcheck/dev-mcelog.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index cc98d5a294ee..f1216d0114b0 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1043,6 +1043,14 @@ config X86_MCE
 	  The action the kernel takes depends on the severity of the problem,
 	  ranging from warning messages to halting the machine.
 
+config X86_MCELOG
+	bool "Support for deprecated /dev/mcelog character device"
+	depends on X86_MCE
+	---help---
+	  Enable support for /dev/mcelog which is needed by the old mcelog
+	  userspace logging daemon. Consider switching to the new generation
+	  rasdaemon solution.
+
 config X86_MCE_INTEL
 	def_bool y
 	prompt "Intel MCE features"
@@ -1072,7 +1080,7 @@ config X86_MCE_THRESHOLD
 	def_bool y
 
 config X86_MCE_INJECT
-	depends on X86_MCE && X86_LOCAL_APIC
+	depends on X86_MCE && X86_LOCAL_APIC && X86_MCELOG
 	tristate "Machine check injector support"
 	---help---
 	  Provide support for injecting machine checks for testing purposes.
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index c5ae545d27d8..4fd5195deed0 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -196,6 +196,7 @@ enum mce_notifier_prios {
 	MCE_PRIO_EXTLOG		= INT_MAX - 2,
 	MCE_PRIO_NFIT		= INT_MAX - 3,
 	MCE_PRIO_EDAC		= INT_MAX - 4,
+	MCE_PRIO_MCELOG		= 1,
 	MCE_PRIO_LOWEST		= 0,
 };
 
diff --git a/arch/x86/kernel/cpu/mcheck/Makefile b/arch/x86/kernel/cpu/mcheck/Makefile
index a3311c886194..950e9ff602ea 100644
--- a/arch/x86/kernel/cpu/mcheck/Makefile
+++ b/arch/x86/kernel/cpu/mcheck/Makefile
@@ -9,3 +9,5 @@ obj-$(CONFIG_X86_MCE_INJECT)	+= mce-inject.o
 obj-$(CONFIG_X86_THERMAL_VECTOR) += therm_throt.o
 
 obj-$(CONFIG_ACPI_APEI)		+= mce-apei.o
+
+obj-$(CONFIG_X86_MCELOG)	+= dev-mcelog.o
diff --git a/arch/x86/kernel/cpu/mcheck/dev-mcelog.c b/arch/x86/kernel/cpu/mcheck/dev-mcelog.c
new file mode 100644
index 000000000000..9c632cb88546
--- /dev/null
+++ b/arch/x86/kernel/cpu/mcheck/dev-mcelog.c
@@ -0,0 +1,397 @@
+/*
+ * /dev/mcelog driver
+ *
+ * K8 parts Copyright 2002,2003 Andi Kleen, SuSE Labs.
+ * Rest from unknown author(s).
+ * 2004 Andi Kleen. Rewrote most of it.
+ * Copyright 2008 Intel Corporation
+ * Author: Andi Kleen
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/miscdevice.h>
+#include <linux/slab.h>
+#include <linux/kmod.h>
+#include <linux/poll.h>
+
+#include "mce-internal.h"
+
+static DEFINE_MUTEX(mce_chrdev_read_mutex);
+
+static char mce_helper[128];
+static char *mce_helper_argv[2] = { mce_helper, NULL };
+
+#define mce_log_get_idx_check(p) \
+({ \
+	RCU_LOCKDEP_WARN(!rcu_read_lock_sched_held() && \
+			 !lockdep_is_held(&mce_chrdev_read_mutex), \
+			 "suspicious mce_log_get_idx_check() usage"); \
+	smp_load_acquire(&(p)); \
+})
+
+/*
+ * Lockless MCE logging infrastructure.
+ * This avoids deadlocks on printk locks without having to break locks. Also
+ * separate MCEs from kernel messages to avoid bogus bug reports.
+ */
+
+static struct mce_log_buffer mcelog = {
+	.signature	= MCE_LOG_SIGNATURE,
+	.len		= MCE_LOG_LEN,
+	.recordlen	= sizeof(struct mce),
+};
+
+static DECLARE_WAIT_QUEUE_HEAD(mce_chrdev_wait);
+
+/* User mode helper program triggered by machine check event */
+extern char			mce_helper[128];
+
+static int dev_mce_log(struct notifier_block *nb, unsigned long val,
+				void *data)
+{
+	struct mce *mce = (struct mce *)data;
+	unsigned int next, entry;
+
+	wmb();
+	for (;;) {
+		entry = mce_log_get_idx_check(mcelog.next);
+		for (;;) {
+
+			/*
+			 * When the buffer fills up discard new entries.
+			 * Assume that the earlier errors are the more
+			 * interesting ones:
+			 */
+			if (entry >= MCE_LOG_LEN) {
+				set_bit(MCE_OVERFLOW,
+					(unsigned long *)&mcelog.flags);
+				return NOTIFY_OK;
+			}
+			/* Old left over entry. Skip: */
+			if (mcelog.entry[entry].finished) {
+				entry++;
+				continue;
+			}
+			break;
+		}
+		smp_rmb();
+		next = entry + 1;
+		if (cmpxchg(&mcelog.next, entry, next) == entry)
+			break;
+	}
+	memcpy(mcelog.entry + entry, mce, sizeof(struct mce));
+	wmb();
+	mcelog.entry[entry].finished = 1;
+	wmb();
+
+	/* wake processes polling /dev/mcelog */
+	wake_up_interruptible(&mce_chrdev_wait);
+
+	return NOTIFY_OK;
+}
+
+static struct notifier_block dev_mcelog_nb = {
+	.notifier_call	= dev_mce_log,
+	.priority	= MCE_PRIO_MCELOG,
+};
+
+static void mce_do_trigger(struct work_struct *work)
+{
+	call_usermodehelper(mce_helper, mce_helper_argv, NULL, UMH_NO_WAIT);
+}
+
+static DECLARE_WORK(mce_trigger_work, mce_do_trigger);
+
+
+void mce_work_trigger(void)
+{
+	if (mce_helper[0])
+		schedule_work(&mce_trigger_work);
+}
+
+static ssize_t
+show_trigger(struct device *s, struct device_attribute *attr, char *buf)
+{
+	strcpy(buf, mce_helper);
+	strcat(buf, "\n");
+	return strlen(mce_helper) + 1;
+}
+
+static ssize_t set_trigger(struct device *s, struct device_attribute *attr,
+				const char *buf, size_t siz)
+{
+	char *p;
+
+	strncpy(mce_helper, buf, sizeof(mce_helper));
+	mce_helper[sizeof(mce_helper)-1] = 0;
+	p = strchr(mce_helper, '\n');
+
+	if (p)
+		*p = 0;
+
+	return strlen(mce_helper) + !!p;
+}
+
+DEVICE_ATTR(trigger, 0644, show_trigger, set_trigger);
+
+/*
+ * mce_chrdev: Character device /dev/mcelog to read and clear the MCE log.
+ */
+
+static DEFINE_SPINLOCK(mce_chrdev_state_lock);
+static int mce_chrdev_open_count;	/* #times opened */
+static int mce_chrdev_open_exclu;	/* already open exclusive? */
+
+static int mce_chrdev_open(struct inode *inode, struct file *file)
+{
+	spin_lock(&mce_chrdev_state_lock);
+
+	if (mce_chrdev_open_exclu ||
+	    (mce_chrdev_open_count && (file->f_flags & O_EXCL))) {
+		spin_unlock(&mce_chrdev_state_lock);
+
+		return -EBUSY;
+	}
+
+	if (file->f_flags & O_EXCL)
+		mce_chrdev_open_exclu = 1;
+	mce_chrdev_open_count++;
+
+	spin_unlock(&mce_chrdev_state_lock);
+
+	return nonseekable_open(inode, file);
+}
+
+static int mce_chrdev_release(struct inode *inode, struct file *file)
+{
+	spin_lock(&mce_chrdev_state_lock);
+
+	mce_chrdev_open_count--;
+	mce_chrdev_open_exclu = 0;
+
+	spin_unlock(&mce_chrdev_state_lock);
+
+	return 0;
+}
+
+static void collect_tscs(void *data)
+{
+	unsigned long *cpu_tsc = (unsigned long *)data;
+
+	cpu_tsc[smp_processor_id()] = rdtsc();
+}
+
+static int mce_apei_read_done;
+
+/* Collect MCE record of previous boot in persistent storage via APEI ERST. */
+static int __mce_read_apei(char __user **ubuf, size_t usize)
+{
+	int rc;
+	u64 record_id;
+	struct mce m;
+
+	if (usize < sizeof(struct mce))
+		return -EINVAL;
+
+	rc = apei_read_mce(&m, &record_id);
+	/* Error or no more MCE record */
+	if (rc <= 0) {
+		mce_apei_read_done = 1;
+		/*
+		 * When ERST is disabled, mce_chrdev_read() should return
+		 * "no record" instead of "no device."
+		 */
+		if (rc == -ENODEV)
+			return 0;
+		return rc;
+	}
+	rc = -EFAULT;
+	if (copy_to_user(*ubuf, &m, sizeof(struct mce)))
+		return rc;
+	/*
+	 * In fact, we should have cleared the record after that has
+	 * been flushed to the disk or sent to network in
+	 * /sbin/mcelog, but we have no interface to support that now,
+	 * so just clear it to avoid duplication.
+	 */
+	rc = apei_clear_mce(record_id);
+	if (rc) {
+		mce_apei_read_done = 1;
+		return rc;
+	}
+	*ubuf += sizeof(struct mce);
+
+	return 0;
+}
+
+static ssize_t mce_chrdev_read(struct file *filp, char __user *ubuf,
+				size_t usize, loff_t *off)
+{
+	char __user *buf = ubuf;
+	unsigned long *cpu_tsc;
+	unsigned prev, next;
+	int i, err;
+
+	cpu_tsc = kmalloc(nr_cpu_ids * sizeof(long), GFP_KERNEL);
+	if (!cpu_tsc)
+		return -ENOMEM;
+
+	mutex_lock(&mce_chrdev_read_mutex);
+
+	if (!mce_apei_read_done) {
+		err = __mce_read_apei(&buf, usize);
+		if (err || buf != ubuf)
+			goto out;
+	}
+
+	next = mce_log_get_idx_check(mcelog.next);
+
+	/* Only supports full reads right now */
+	err = -EINVAL;
+	if (*off != 0 || usize < MCE_LOG_LEN*sizeof(struct mce))
+		goto out;
+
+	err = 0;
+	prev = 0;
+	do {
+		for (i = prev; i < next; i++) {
+			unsigned long start = jiffies;
+			struct mce *m = &mcelog.entry[i];
+
+			while (!m->finished) {
+				if (time_after_eq(jiffies, start + 2)) {
+					memset(m, 0, sizeof(*m));
+					goto timeout;
+				}
+				cpu_relax();
+			}
+			smp_rmb();
+			err |= copy_to_user(buf, m, sizeof(*m));
+			buf += sizeof(*m);
+timeout:
+			;
+		}
+
+		memset(mcelog.entry + prev, 0,
+		       (next - prev) * sizeof(struct mce));
+		prev = next;
+		next = cmpxchg(&mcelog.next, prev, 0);
+	} while (next != prev);
+
+	synchronize_sched();
+
+	/*
+	 * Collect entries that were still getting written before the
+	 * synchronize.
+	 */
+	on_each_cpu(collect_tscs, cpu_tsc, 1);
+
+	for (i = next; i < MCE_LOG_LEN; i++) {
+		struct mce *m = &mcelog.entry[i];
+
+		if (m->finished && m->tsc < cpu_tsc[m->cpu]) {
+			err |= copy_to_user(buf, m, sizeof(*m));
+			smp_rmb();
+			buf += sizeof(*m);
+			memset(m, 0, sizeof(*m));
+		}
+	}
+
+	if (err)
+		err = -EFAULT;
+
+out:
+	mutex_unlock(&mce_chrdev_read_mutex);
+	kfree(cpu_tsc);
+
+	return err ? err : buf - ubuf;
+}
+
+static unsigned int mce_chrdev_poll(struct file *file, poll_table *wait)
+{
+	poll_wait(file, &mce_chrdev_wait, wait);
+	if (READ_ONCE(mcelog.next))
+		return POLLIN | POLLRDNORM;
+	if (!mce_apei_read_done && apei_check_mce())
+		return POLLIN | POLLRDNORM;
+	return 0;
+}
+
+static long mce_chrdev_ioctl(struct file *f, unsigned int cmd,
+				unsigned long arg)
+{
+	int __user *p = (int __user *)arg;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	switch (cmd) {
+	case MCE_GET_RECORD_LEN:
+		return put_user(sizeof(struct mce), p);
+	case MCE_GET_LOG_LEN:
+		return put_user(MCE_LOG_LEN, p);
+	case MCE_GETCLEAR_FLAGS: {
+		unsigned flags;
+
+		do {
+			flags = mcelog.flags;
+		} while (cmpxchg(&mcelog.flags, flags, 0) != flags);
+
+		return put_user(flags, p);
+	}
+	default:
+		return -ENOTTY;
+	}
+}
+
+static ssize_t (*mce_write)(struct file *filp, const char __user *ubuf,
+			    size_t usize, loff_t *off);
+
+void register_mce_write_callback(ssize_t (*fn)(struct file *filp,
+			     const char __user *ubuf,
+			     size_t usize, loff_t *off))
+{
+	mce_write = fn;
+}
+EXPORT_SYMBOL_GPL(register_mce_write_callback);
+
+static ssize_t mce_chrdev_write(struct file *filp, const char __user *ubuf,
+				size_t usize, loff_t *off)
+{
+	if (mce_write)
+		return mce_write(filp, ubuf, usize, off);
+	else
+		return -EINVAL;
+}
+
+static const struct file_operations mce_chrdev_ops = {
+	.open			= mce_chrdev_open,
+	.release		= mce_chrdev_release,
+	.read			= mce_chrdev_read,
+	.write			= mce_chrdev_write,
+	.poll			= mce_chrdev_poll,
+	.unlocked_ioctl		= mce_chrdev_ioctl,
+	.llseek			= no_llseek,
+};
+
+static struct miscdevice mce_chrdev_device = {
+	MISC_MCELOG_MINOR,
+	"mcelog",
+	&mce_chrdev_ops,
+};
+
+static __init int dev_mcelog_init_device(void)
+{
+	int err;
+
+	/* register character device /dev/mcelog */
+	err = misc_register(&mce_chrdev_device);
+	if (err) {
+		pr_err("Unable to init device /dev/mcelog (rc: %d)\n", err);
+		return err;
+	}
+	mce_register_decode_chain(&dev_mcelog_nb);
+	return 0;
+}
+device_initcall_sync(dev_mcelog_init_device);
diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
index 903043e6a62b..f096d64485d0 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -96,3 +96,11 @@ static inline bool mce_cmp(struct mce *m1, struct mce *m2)
 		m1->addr != m2->addr ||
 		m1->misc != m2->misc;
 }
+
+extern struct device_attribute dev_attr_trigger;
+
+#ifdef CONFIG_X86_MCELOG
+extern void mce_work_trigger(void);
+#else
+static inline void mce_work_trigger(void)	{ }
+#endif
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index cd750cfbcb93..1be00932ac47 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -54,15 +54,7 @@
 
 #include "mce-internal.h"
 
-static DEFINE_MUTEX(mce_chrdev_read_mutex);
-
-#define mce_log_get_idx_check(p) \
-({ \
-	RCU_LOCKDEP_WARN(!rcu_read_lock_sched_held() && \
-			 !lockdep_is_held(&mce_chrdev_read_mutex), \
-			 "suspicious mce_log_get_idx_check() usage"); \
-	smp_load_acquire(&(p)); \
-})
+static DEFINE_MUTEX(mce_log_mutex);
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/mce.h>
@@ -87,15 +79,9 @@ struct mca_config mca_cfg __read_mostly = {
 	.monarch_timeout = -1
 };
 
-/* User mode helper program triggered by machine check event */
-static unsigned long		mce_need_notify;
-static char			mce_helper[128];
-static char			*mce_helper_argv[2] = { mce_helper, NULL };
-
-static DECLARE_WAIT_QUEUE_HEAD(mce_chrdev_wait);
-
 static DEFINE_PER_CPU(struct mce, mces_seen);
-static int			cpu_missing;
+static unsigned long mce_need_notify;
+static int cpu_missing;
 
 /*
  * MCA banks polled by the period polling timer for corrected events.
@@ -145,18 +131,6 @@ void mce_setup(struct mce *m)
 DEFINE_PER_CPU(struct mce, injectm);
 EXPORT_PER_CPU_SYMBOL_GPL(injectm);
 
-/*
- * Lockless MCE logging infrastructure.
- * This avoids deadlocks on printk locks without having to break locks. Also
- * separate MCEs from kernel messages to avoid bogus bug reports.
- */
-
-static struct mce_log_buffer mcelog_buf = {
-	.signature	= MCE_LOG_SIGNATURE,
-	.len		= MCE_LOG_LEN,
-	.recordlen	= sizeof(struct mce),
-};
-
 void mce_log(struct mce *m)
 {
 	if (!mce_gen_pool_add(m))
@@ -165,9 +139,9 @@ void mce_log(struct mce *m)
 
 void mce_inject_log(struct mce *m)
 {
-	mutex_lock(&mce_chrdev_read_mutex);
+	mutex_lock(&mce_log_mutex);
 	mce_log(m);
-	mutex_unlock(&mce_chrdev_read_mutex);
+	mutex_unlock(&mce_log_mutex);
 }
 EXPORT_SYMBOL_GPL(mce_inject_log);
 
@@ -577,7 +551,6 @@ static int mce_first_notifier(struct notifier_block *nb, unsigned long val,
 			      void *data)
 {
 	struct mce *m = (struct mce *)data;
-	unsigned int next, entry;
 
 	if (!m)
 		return NOTIFY_DONE;
@@ -588,38 +561,6 @@ static int mce_first_notifier(struct notifier_block *nb, unsigned long val,
 	/* Emit the trace record: */
 	trace_mce_record(m);
 
-	wmb();
-	for (;;) {
-		entry = mce_log_get_idx_check(mcelog_buf.next);
-		for (;;) {
-
-			/*
-			 * When the buffer fills up discard new entries.
-			 * Assume that the earlier errors are the more
-			 * interesting ones:
-			 */
-			if (entry >= MCE_LOG_LEN) {
-				set_bit(MCE_OVERFLOW,
-					(unsigned long *)&mcelog_buf.flags);
-				return NOTIFY_DONE;
-			}
-			/* Old left over entry. Skip: */
-			if (mcelog_buf.entry[entry].finished) {
-				entry++;
-				continue;
-			}
-			break;
-		}
-		smp_rmb();
-		next = entry + 1;
-		if (cmpxchg(&mcelog_buf.next, entry, next) == entry)
-			break;
-	}
-	memcpy(mcelog_buf.entry + entry, m, sizeof(struct mce));
-	wmb();
-	mcelog_buf.entry[entry].finished = 1;
-	wmb();
-
 	set_bit(0, &mce_need_notify);
 
 	mce_notify_irq();
@@ -1447,13 +1388,6 @@ static void mce_timer_delete_all(void)
 		del_timer_sync(&per_cpu(mce_timer, cpu));
 }
 
-static void mce_do_trigger(struct work_struct *work)
-{
-	call_usermodehelper(mce_helper, mce_helper_argv, NULL, UMH_NO_WAIT);
-}
-
-static DECLARE_WORK(mce_trigger_work, mce_do_trigger);
-
 /*
  * Notify the user(s) about new machine check events.
  * Can be called from interrupt context, but not from machine check/NMI
@@ -1465,11 +1399,7 @@ int mce_notify_irq(void)
 	static DEFINE_RATELIMIT_STATE(ratelimit, 60*HZ, 2);
 
 	if (test_and_clear_bit(0, &mce_need_notify)) {
-		/* wake processes polling /dev/mcelog */
-		wake_up_interruptible(&mce_chrdev_wait);
-
-		if (mce_helper[0])
-			schedule_work(&mce_trigger_work);
+		mce_work_trigger();
 
 		if (__ratelimit(&ratelimit))
 			pr_info(HW_ERR "Machine check events logged\n");
@@ -1871,252 +1801,6 @@ void mcheck_cpu_clear(struct cpuinfo_x86 *c)
 
 }
 
-/*
- * mce_chrdev: Character device /dev/mcelog to read and clear the MCE log.
- */
-
-static DEFINE_SPINLOCK(mce_chrdev_state_lock);
-static int mce_chrdev_open_count;	/* #times opened */
-static int mce_chrdev_open_exclu;	/* already open exclusive? */
-
-static int mce_chrdev_open(struct inode *inode, struct file *file)
-{
-	spin_lock(&mce_chrdev_state_lock);
-
-	if (mce_chrdev_open_exclu ||
-	    (mce_chrdev_open_count && (file->f_flags & O_EXCL))) {
-		spin_unlock(&mce_chrdev_state_lock);
-
-		return -EBUSY;
-	}
-
-	if (file->f_flags & O_EXCL)
-		mce_chrdev_open_exclu = 1;
-	mce_chrdev_open_count++;
-
-	spin_unlock(&mce_chrdev_state_lock);
-
-	return nonseekable_open(inode, file);
-}
-
-static int mce_chrdev_release(struct inode *inode, struct file *file)
-{
-	spin_lock(&mce_chrdev_state_lock);
-
-	mce_chrdev_open_count--;
-	mce_chrdev_open_exclu = 0;
-
-	spin_unlock(&mce_chrdev_state_lock);
-
-	return 0;
-}
-
-static void collect_tscs(void *data)
-{
-	unsigned long *cpu_tsc = (unsigned long *)data;
-
-	cpu_tsc[smp_processor_id()] = rdtsc();
-}
-
-static int mce_apei_read_done;
-
-/* Collect MCE record of previous boot in persistent storage via APEI ERST. */
-static int __mce_read_apei(char __user **ubuf, size_t usize)
-{
-	int rc;
-	u64 record_id;
-	struct mce m;
-
-	if (usize < sizeof(struct mce))
-		return -EINVAL;
-
-	rc = apei_read_mce(&m, &record_id);
-	/* Error or no more MCE record */
-	if (rc <= 0) {
-		mce_apei_read_done = 1;
-		/*
-		 * When ERST is disabled, mce_chrdev_read() should return
-		 * "no record" instead of "no device."
-		 */
-		if (rc == -ENODEV)
-			return 0;
-		return rc;
-	}
-	rc = -EFAULT;
-	if (copy_to_user(*ubuf, &m, sizeof(struct mce)))
-		return rc;
-	/*
-	 * In fact, we should have cleared the record after that has
-	 * been flushed to the disk or sent to network in
-	 * /sbin/mcelog, but we have no interface to support that now,
-	 * so just clear it to avoid duplication.
-	 */
-	rc = apei_clear_mce(record_id);
-	if (rc) {
-		mce_apei_read_done = 1;
-		return rc;
-	}
-	*ubuf += sizeof(struct mce);
-
-	return 0;
-}
-
-static ssize_t mce_chrdev_read(struct file *filp, char __user *ubuf,
-				size_t usize, loff_t *off)
-{
-	char __user *buf = ubuf;
-	unsigned long *cpu_tsc;
-	unsigned prev, next;
-	int i, err;
-
-	cpu_tsc = kmalloc(nr_cpu_ids * sizeof(long), GFP_KERNEL);
-	if (!cpu_tsc)
-		return -ENOMEM;
-
-	mutex_lock(&mce_chrdev_read_mutex);
-
-	if (!mce_apei_read_done) {
-		err = __mce_read_apei(&buf, usize);
-		if (err || buf != ubuf)
-			goto out;
-	}
-
-	next = mce_log_get_idx_check(mcelog_buf.next);
-
-	/* Only supports full reads right now */
-	err = -EINVAL;
-	if (*off != 0 || usize < MCE_LOG_LEN*sizeof(struct mce))
-		goto out;
-
-	err = 0;
-	prev = 0;
-	do {
-		for (i = prev; i < next; i++) {
-			unsigned long start = jiffies;
-			struct mce *m = &mcelog_buf.entry[i];
-
-			while (!m->finished) {
-				if (time_after_eq(jiffies, start + 2)) {
-					memset(m, 0, sizeof(*m));
-					goto timeout;
-				}
-				cpu_relax();
-			}
-			smp_rmb();
-			err |= copy_to_user(buf, m, sizeof(*m));
-			buf += sizeof(*m);
-timeout:
-			;
-		}
-
-		memset(mcelog_buf.entry + prev, 0,
-		       (next - prev) * sizeof(struct mce));
-		prev = next;
-		next = cmpxchg(&mcelog_buf.next, prev, 0);
-	} while (next != prev);
-
-	synchronize_sched();
-
-	/*
-	 * Collect entries that were still getting written before the
-	 * synchronize.
-	 */
-	on_each_cpu(collect_tscs, cpu_tsc, 1);
-
-	for (i = next; i < MCE_LOG_LEN; i++) {
-		struct mce *m = &mcelog_buf.entry[i];
-
-		if (m->finished && m->tsc < cpu_tsc[m->cpu]) {
-			err |= copy_to_user(buf, m, sizeof(*m));
-			smp_rmb();
-			buf += sizeof(*m);
-			memset(m, 0, sizeof(*m));
-		}
-	}
-
-	if (err)
-		err = -EFAULT;
-
-out:
-	mutex_unlock(&mce_chrdev_read_mutex);
-	kfree(cpu_tsc);
-
-	return err ? err : buf - ubuf;
-}
-
-static unsigned int mce_chrdev_poll(struct file *file, poll_table *wait)
-{
-	poll_wait(file, &mce_chrdev_wait, wait);
-	if (READ_ONCE(mcelog_buf.next))
-		return POLLIN | POLLRDNORM;
-	if (!mce_apei_read_done && apei_check_mce())
-		return POLLIN | POLLRDNORM;
-	return 0;
-}
-
-static long mce_chrdev_ioctl(struct file *f, unsigned int cmd,
-				unsigned long arg)
-{
-	int __user *p = (int __user *)arg;
-
-	if (!capable(CAP_SYS_ADMIN))
-		return -EPERM;
-
-	switch (cmd) {
-	case MCE_GET_RECORD_LEN:
-		return put_user(sizeof(struct mce), p);
-	case MCE_GET_LOG_LEN:
-		return put_user(MCE_LOG_LEN, p);
-	case MCE_GETCLEAR_FLAGS: {
-		unsigned flags;
-
-		do {
-			flags = mcelog_buf.flags;
-		} while (cmpxchg(&mcelog_buf.flags, flags, 0) != flags);
-
-		return put_user(flags, p);
-	}
-	default:
-		return -ENOTTY;
-	}
-}
-
-static ssize_t (*mce_write)(struct file *filp, const char __user *ubuf,
-			    size_t usize, loff_t *off);
-
-void register_mce_write_callback(ssize_t (*fn)(struct file *filp,
-			     const char __user *ubuf,
-			     size_t usize, loff_t *off))
-{
-	mce_write = fn;
-}
-EXPORT_SYMBOL_GPL(register_mce_write_callback);
-
-static ssize_t mce_chrdev_write(struct file *filp, const char __user *ubuf,
-				size_t usize, loff_t *off)
-{
-	if (mce_write)
-		return mce_write(filp, ubuf, usize, off);
-	else
-		return -EINVAL;
-}
-
-static const struct file_operations mce_chrdev_ops = {
-	.open			= mce_chrdev_open,
-	.release		= mce_chrdev_release,
-	.read			= mce_chrdev_read,
-	.write			= mce_chrdev_write,
-	.poll			= mce_chrdev_poll,
-	.unlocked_ioctl		= mce_chrdev_ioctl,
-	.llseek			= no_llseek,
-};
-
-static struct miscdevice mce_chrdev_device = {
-	MISC_MCELOG_MINOR,
-	"mcelog",
-	&mce_chrdev_ops,
-};
-
 static void __mce_disable_bank(void *arg)
 {
 	int bank = *((int *)arg);
@@ -2335,29 +2019,6 @@ static ssize_t set_bank(struct device *s, struct device_attribute *attr,
 	return size;
 }
 
-static ssize_t
-show_trigger(struct device *s, struct device_attribute *attr, char *buf)
-{
-	strcpy(buf, mce_helper);
-	strcat(buf, "\n");
-	return strlen(mce_helper) + 1;
-}
-
-static ssize_t set_trigger(struct device *s, struct device_attribute *attr,
-				const char *buf, size_t siz)
-{
-	char *p;
-
-	strncpy(mce_helper, buf, sizeof(mce_helper));
-	mce_helper[sizeof(mce_helper)-1] = 0;
-	p = strchr(mce_helper, '\n');
-
-	if (p)
-		*p = 0;
-
-	return strlen(mce_helper) + !!p;
-}
-
 static ssize_t set_ignore_ce(struct device *s,
 			     struct device_attribute *attr,
 			     const char *buf, size_t size)
@@ -2414,7 +2075,6 @@ static ssize_t store_int_with_restart(struct device *s,
 	return ret;
 }
 
-static DEVICE_ATTR(trigger, 0644, show_trigger, set_trigger);
 static DEVICE_INT_ATTR(tolerant, 0644, mca_cfg.tolerant);
 static DEVICE_INT_ATTR(monarch_timeout, 0644, mca_cfg.monarch_timeout);
 static DEVICE_BOOL_ATTR(dont_log_ce, 0644, mca_cfg.dont_log_ce);
@@ -2437,7 +2097,9 @@ static struct dev_ext_attribute dev_attr_cmci_disabled = {
 static struct device_attribute *mce_device_attrs[] = {
 	&dev_attr_tolerant.attr,
 	&dev_attr_check_interval.attr,
+#ifdef CONFIG_X86_MCELOG
 	&dev_attr_trigger,
+#endif
 	&dev_attr_monarch_timeout.attr,
 	&dev_attr_dont_log_ce.attr,
 	&dev_attr_ignore_ce.attr,
@@ -2611,7 +2273,6 @@ static __init void mce_init_banks(void)
 
 static __init int mcheck_init_device(void)
 {
-	enum cpuhp_state hp_online;
 	int err;
 
 	if (!mce_available(&boot_cpu_data)) {
@@ -2639,21 +2300,11 @@ static __init int mcheck_init_device(void)
 				mce_cpu_online, mce_cpu_pre_down);
 	if (err < 0)
 		goto err_out_online;
-	hp_online = err;
 
 	register_syscore_ops(&mce_syscore_ops);
 
-	/* register character device /dev/mcelog */
-	err = misc_register(&mce_chrdev_device);
-	if (err)
-		goto err_register;
-
 	return 0;
 
-err_register:
-	unregister_syscore_ops(&mce_syscore_ops);
-	cpuhp_remove_state(hp_online);
-
 err_out_online:
 	cpuhp_remove_state(CPUHP_X86_MCE_DEAD);
 
@@ -2661,7 +2312,7 @@ static __init int mcheck_init_device(void)
 	free_cpumask_var(mce_device_initialized);
 
 err_out:
-	pr_err("Unable to init device /dev/mcelog (rc: %d)\n", err);
+	pr_err("Unable to init MCE device (rc: %d)\n", err);
 
 	return err;
 }
-- 
2.11.0

* Re: [PATCH 3/4] RAS: Add a Corrected Errors Collector
  2017-03-09 10:08 ` [PATCH 3/4] RAS: Add a Corrected Errors Collector Borislav Petkov
@ 2017-03-12 13:43   ` Boris Petkov
  2017-03-20 22:48   ` Luck, Tony
  2017-03-22 19:00   ` Luck, Tony
  2 siblings, 0 replies; 15+ messages in thread
From: Boris Petkov @ 2017-03-12 13:43 UTC (permalink / raw)
  To: Tony Luck; +Cc: X86 ML, linux-edac, LKML

On March 9, 2017 11:08:17 AM GMT+01:00, Borislav Petkov <bp@alien8.de> wrote:
>From: Borislav Petkov <bp@suse.de>

...

>diff --git a/arch/x86/ras/Kconfig b/arch/x86/ras/Kconfig
>index 0bc60a308730..2a2d89d39af6 100644
>--- a/arch/x86/ras/Kconfig
>+++ b/arch/x86/ras/Kconfig
>@@ -7,3 +7,17 @@ config MCE_AMD_INJ
> 	  aspects of the MCE handling code.
> 
> 	  WARNING: Do not even assume this interface is staying stable!
>+
>+config RAS_CEC
>+	bool "Correctable Errors Collector"
>+	depends on X86_MCE && MEMORY_FAILURE && DEBUG_FS
>+	---help---

Btw, we should run it for a couple of releases like this and, once we have the warm fuzzy feeling, make it default y.

-- 
Sent from a small device: formatting sux and brevity is inevitable. 

* Re: [PATCH 3/4] RAS: Add a Corrected Errors Collector
  2017-03-09 10:08 ` [PATCH 3/4] RAS: Add a Corrected Errors Collector Borislav Petkov
  2017-03-12 13:43   ` Boris Petkov
@ 2017-03-20 22:48   ` Luck, Tony
  2017-03-22 18:03     ` Borislav Petkov
  2017-03-22 19:00   ` Luck, Tony
  2 siblings, 1 reply; 15+ messages in thread
From: Luck, Tony @ 2017-03-20 22:48 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: X86 ML, linux-edac, LKML

On Thu, Mar 09, 2017 at 11:08:17AM +0100, Borislav Petkov wrote:
> +config RAS_CEC
> +	bool "Correctable Errors Collector"
> +	depends on X86_MCE && MEMORY_FAILURE && DEBUG_FS
> +	---help---
> +	  This is a small cache which collects correctable memory errors per 4K
> +	  page PFN and counts their repeated occurrence. Once the counter for a
> +	  PFN overflows, we try to soft-offline that page as we take it to mean
> +	  that it has reached a relatively high error count and would probably
> +	  be best if we don't use it anymore.

You added "count_threshold" for me ... so the condition isn't quite "overflows"
like it was in the early versions.

We may need to give some thought on what to do if the attempt to offline
the page fails (e.g. because the page belongs to the kernel). Right now
you delete it from the list, but we will see more errors as the page is
still in use. Eventually the counter will hit count_threshold and we will
try to offline again. Rinse, repeat.

Someone also recently sent me a log from a machine with corrected errors
in over 9000 unique addresses. Need a parameter to allocate more than one
page for the collector, or a way to grow the space.
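
To put a number on that, here is a minimal sketch, assuming 4K pages and
one u64 array entry per tracked pfn. The toy_* helpers are made up for
illustration and are not part of the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u	/* assumption: 4K pages */

/* The sketch relies on one array entry being 8 bytes wide. */
static_assert(sizeof(uint64_t) == 8, "one entry is expected to be 8 bytes");

/* Number of u64 slots that fit into the given number of pages. */
static size_t toy_cec_slots(size_t pages)
{
	return pages * TOY_PAGE_SIZE / sizeof(uint64_t);
}

/* Pages needed to hold unique_pfns entries at once, rounded up. */
static size_t toy_pages_needed(size_t unique_pfns)
{
	size_t per_page = toy_cec_slots(1);

	return (unique_pfns + per_page - 1) / per_page;
}
```

Under those assumptions a single page holds 512 entries, so keeping 9000
unique addresses resident at the same time would take 18 pages.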

-Tony

* Re: [PATCH 3/4] RAS: Add a Corrected Errors Collector
  2017-03-20 22:48   ` Luck, Tony
@ 2017-03-22 18:03     ` Borislav Petkov
  2017-03-23 15:22       ` Borislav Petkov
  0 siblings, 1 reply; 15+ messages in thread
From: Borislav Petkov @ 2017-03-22 18:03 UTC (permalink / raw)
  To: Luck, Tony; +Cc: X86 ML, linux-edac, LKML

On Mon, Mar 20, 2017 at 03:48:24PM -0700, Luck, Tony wrote:
> You added "count_threshold" for me ... so the condition isn't quite "overflows"
> like it was in the early versions.

It is a max count which, when reached, causes the soft offline attempt.
What did you mean by "overflows" exactly then?

> We may need to give some thought on what to do if the attempt to offline
> the page fails (e.g. because the page belongs to the kernel). Right now
> you delete it from the list, but we will see more errors as the page is
> still in use. Eventually the counter will hit count_threshold and we will
> try to offline again. Rinse, repeat.

Well, what *is* there we can do? If the offlining code can't offline
it, there's not a whole lot we *can* do. The error would keep repeating
as a corrected error, rinse, repeat and we will keep trying to offline
the containing page.

That is, until it degrades to an uncorrectable error and then we're
dead.

Either way, the collector can't really do anything about it. This would
be beyond its functionality anyway.

IMO.

> Someone also recently sent me a log from a machine with corrected errors
> in over 9000 unique addresses. Need a parameter to allocate more than one
> page for the collector, or a way to grow the space.

Well, so even with the amount of unique addresses higher than the CEC
slots, we should be able to deal with them ok: the moment we enter more
than CLEAN_ELEMS pfns, we will trigger a spring cleaning which will
degrade the already logged errors. Once the array is filled up, we will
replace the LRU pfn with the new one.

And so on.

And this way it would fulfill its purpose of *not* generating error
records into the decoding chain after it. If one of those 9000 errors
overflows, we will try to offline the page.

Either way we work as advertised.

Lemme try to write a small script exercising exactly that scenario to
see whether I'm actually not talking crap here :-)
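
For illustration, the spring-cleaning and LRU-replacement behaviour
described above can be sketched in userspace C. This is a minimal sketch
and not the cec.c code: SLOTS, CLEAN_EVERY, DECAY_MAX, the struct layout
and all function names are assumptions made for the example, and the
array is kept tiny so the eviction is easy to follow.

```c
#include <stddef.h>
#include <stdint.h>

#define SLOTS        4     /* hypothetical; the real array fills one page */
#define CLEAN_EVERY  3     /* hypothetical stand-in for CLEAN_ELEMS */
#define DECAY_MAX    3     /* assumed: fresh or re-hit entries start here */
#define THRESHOLD    1023  /* "Action threshold" in the stats output */

struct slot {
	uint64_t pfn;
	unsigned int decay;
	unsigned int count;
};

struct collector {
	struct slot s[SLOTS];
	size_t n;                      /* slots currently in use */
	unsigned int new_since_clean;  /* new PFNs since the last decay pass */
	unsigned int decays;           /* number of decay passes so far */
};

/* Return the index of pfn, or c->n if it is not in the array. */
static size_t find_pfn(const struct collector *c, uint64_t pfn)
{
	size_t i;

	for (i = 0; i < c->n; i++)
		if (c->s[i].pfn == pfn)
			break;
	return i;
}

/* "Spring cleaning": age every logged entry by one step. */
static void spring_clean(struct collector *c)
{
	for (size_t i = 0; i < c->n; i++)
		if (c->s[i].decay)
			c->s[i].decay--;
	c->new_since_clean = 0;
	c->decays++;
}

/* Least valuable entry: lowest decay first, then lowest count. */
static size_t lru_index(const struct collector *c)
{
	size_t lru = 0;

	for (size_t i = 1; i < c->n; i++) {
		const struct slot *a = &c->s[i], *b = &c->s[lru];

		if (a->decay < b->decay ||
		    (a->decay == b->decay && a->count < b->count))
			lru = i;
	}
	return lru;
}

/* Log one corrected error. Returns 1 if the page should be offlined. */
static int add_pfn(struct collector *c, uint64_t pfn)
{
	size_t i = find_pfn(c, pfn);
	int is_new = (i == c->n);

	if (is_new) {
		if (c->n == SLOTS)
			i = lru_index(c);  /* array full: replace the LRU entry */
		else
			c->n++;            /* i already points at the free slot */
		c->s[i] = (struct slot){ .pfn = pfn };
	}

	c->s[i].decay = DECAY_MAX;
	c->s[i].count++;

	if (c->s[i].count >= THRESHOLD) {
		c->s[i] = c->s[--c->n];    /* drop the entry, ask for offlining */
		return 1;
	}

	if (is_new && ++c->new_since_clean >= CLEAN_EVERY)
		spring_clean(c);
	return 0;
}
```

With a zero-initialized struct collector, feeding add_pfn() more unique
PFNs than SLOTS never grows the array: old, rarely hit entries are aged
and then replaced, and only a PFN that repeats THRESHOLD times makes
add_pfn() return 1.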

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

* Re: [PATCH 3/4] RAS: Add a Corrected Errors Collector
  2017-03-09 10:08 ` [PATCH 3/4] RAS: Add a Corrected Errors Collector Borislav Petkov
  2017-03-12 13:43   ` Boris Petkov
  2017-03-20 22:48   ` Luck, Tony
@ 2017-03-22 19:00   ` Luck, Tony
  2017-03-22 19:22     ` Borislav Petkov
  2 siblings, 1 reply; 15+ messages in thread
From: Luck, Tony @ 2017-03-22 19:00 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: X86 ML, linux-edac, LKML

On Thu, Mar 09, 2017 at 11:08:17AM +0100, Borislav Petkov wrote:
> +static bool cec_add_mce(struct mce *m)
> +{
> +	if (!m)
> +		return false;
> +
> +	if (memory_error(m) && mce_usable_address(m))
> +		if (!cec_add_elem(m->addr >> PAGE_SHIFT))
> +			return true;
> +
> +	return false;
> +}

You also need to check that bit 61 of m->status is zero here.
The collector is hiding uncorrected errors too.

-Tony

* Re: [PATCH 3/4] RAS: Add a Corrected Errors Collector
  2017-03-22 19:00   ` Luck, Tony
@ 2017-03-22 19:22     ` Borislav Petkov
  0 siblings, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2017-03-22 19:22 UTC (permalink / raw)
  To: Luck, Tony; +Cc: X86 ML, linux-edac, LKML

On Wed, Mar 22, 2017 at 12:00:25PM -0700, Luck, Tony wrote:
> You also need to check that bit 61 of m->status is zero here.
> The collector is hiding uncorrected errors too.

Good catch.

I think I wanna do something like this:

	if (memory_error(m) && !(m->status & MCI_STATUS_UC) ...

as we want to make sure we're looking at a memory error first and then
decide on severity.

Alternatively I could stick that logic in another helper called
correctable_memory_error() or so but I don't have a strong preference.
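
A minimal userspace sketch of what such a helper could look like. The
struct below is a trimmed-down, hypothetical stand-in for struct mce:
its is_memory_error and addr_usable fields replace the kernel's
memory_error() and mce_usable_address() checks, and cec_would_take() and
PAGE_SHIFT_SKETCH are names invented for the example. Only the bit 61
test (MCI_STATUS_UC) comes from the discussion itself.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MCI_STATUS_UC      (1ULL << 61)  /* uncorrected-error bit in status */
#define PAGE_SHIFT_SKETCH  12            /* assumed 4K pages */

/* Hypothetical stand-in for the fields of struct mce used here. */
struct mce_sketch {
	uint64_t status;
	uint64_t addr;
	bool is_memory_error;  /* stands in for memory_error(m) */
	bool addr_usable;      /* stands in for mce_usable_address(m) */
};

/* Memory error first, then severity: only corrected errors qualify. */
static bool correctable_memory_error(const struct mce_sketch *m)
{
	return m->is_memory_error && !(m->status & MCI_STATUS_UC);
}

/*
 * Return true and store the PFN if the collector would take this error,
 * false if it must go down the normal decoding chain instead.
 */
static bool cec_would_take(const struct mce_sketch *m, uint64_t *pfn)
{
	if (!m)
		return false;

	if (!correctable_memory_error(m) || !m->addr_usable)
		return false;

	*pfn = m->addr >> PAGE_SHIFT_SKETCH;
	return true;
}
```

With the UC bit set, cec_would_take() returns false, so an uncorrected
error is no longer hidden by the collector.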

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

* Re: [PATCH 3/4] RAS: Add a Corrected Errors Collector
  2017-03-22 18:03     ` Borislav Petkov
@ 2017-03-23 15:22       ` Borislav Petkov
  2017-03-23 17:20         ` Luck, Tony
  0 siblings, 1 reply; 15+ messages in thread
From: Borislav Petkov @ 2017-03-23 15:22 UTC (permalink / raw)
  To: Luck, Tony; +Cc: X86 ML, linux-edac, LKML

On Wed, Mar 22, 2017 at 07:03:39PM +0100, Borislav Petkov wrote:
> Lemme try to write a small script exercising exactly that scenario to
> see whether I'm actually not talking crap here :-)

Ok, here's a snapshot from the CEC after letting it run for a couple of
hours in a guest with a script running twice in parallel and injecting
random PFNs. We have 0 offlined pages because a PFN number doesn't
repeat frequently enough to cause an overflow.

When I force a single PFN to occur 1023 or more times and
do that more than once, this happens:

[ 6629.091239] RAS: Soft-offlining pfn: 0x7fff
[ 6629.093036] __get_any_page: 0x7fff free buddy page
[ 6653.259476] RAS: Soft-offlining pfn: 0x7fff
[ 6653.260100] soft offline: 0x7fff page already poisoned

...

Stats:
CEs: 32614
offlined pages: 2
^^^^^^^^^^^^^^^^^

Flags: 0x0
Timer interval: 86400 seconds
Decays: 254
Action threshold: 1023

The "already poisoned" thing shouldn't happen in real life because once
the page frame is poisoned, it shouldn't generate MCEs.




Every 2.0s: head -n 40 array; tail -n 40 array                                                   Thu Mar 23 17:15:15 2017

{ n: 512
 000: [0000000000000056|c01]
 001: [000000000000011f|801]
 002: [0000000000000171|401]
 003: [00000000000001ce|401]
 004: [000000000000024a|401]
 005: [000000000000026e|401]
 006: [000000000000034d|c01]
 007: [0000000000000395|c01]
 008: [00000000000003b9|801]
 009: [0000000000000458|003]
 010: [000000000000045c|401]
 011: [00000000000004f9|401]
 012: [00000000000005d1|c01]
 013: [0000000000000677|801]
 014: [000000000000069d|401]
 015: [00000000000006b3|401]
 016: [00000000000006f5|c01]
 017: [00000000000006fc|401]
 018: [000000000000074d|401]
 019: [0000000000000764|c01]
 020: [00000000000008a8|801]
 021: [0000000000000951|401]
 022: [0000000000000994|401]
 023: [0000000000000aa8|401]
 024: [0000000000000ac7|801]
 025: [0000000000000af2|801]
 026: [0000000000000bb5|801]
 027: [0000000000000bd5|401]
 028: [0000000000000be0|c01]
 029: [0000000000000c30|c01]
 030: [0000000000000c61|801]
 031: [0000000000000c8a|401]
 032: [0000000000000d0d|801]
 033: [0000000000000d2a|003]
 034: [0000000000000d4d|401]
 035: [0000000000000d87|c01]
 036: [0000000000000da4|c01]
 037: [0000000000000e06|401]
 038: [0000000000000e23|c01]

 ...

 480: [0000000000007d22|005]
 481: [0000000000007d5f|002]
 482: [0000000000007d9f|004]
 483: [0000000000007db1|c01]
 484: [0000000000007dbf|002]
 485: [0000000000007dcf|002]
 486: [0000000000007dd8|401]
 487: [0000000000007df0|001]
 488: [0000000000007df4|002]
 489: [0000000000007e1f|003]
 490: [0000000000007e35|801]
 491: [0000000000007e73|003]
 492: [0000000000007e77|401]
 493: [0000000000007e80|002]
 494: [0000000000007e9c|002]
 495: [0000000000007eac|002]
 496: [0000000000007ecb|002]
 497: [0000000000007ed8|801]
 498: [0000000000007edc|003]
 499: [0000000000007ee3|801]
 500: [0000000000007f05|004]
 501: [0000000000007f15|002]
 502: [0000000000007f51|004]
 503: [0000000000007f5e|003]
 504: [0000000000007f80|801]
 505: [0000000000007f92|003]
 506: [0000000000007fb2|002]
 507: [0000000000007fd9|002]
 508: [0000000000007fdf|002]
 509: [0000000000007fe5|004]
 510: [0000000000007ff4|801]
 511: [0000000000007ffa|001]
}
Stats:
CEs: 30074
offlined pages: 0
Flags: 0x0
Timer interval: 86400 seconds
Decays: 234
Action threshold: 1023
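
For reading the dump above: each line is [pfn|low bits]. The sketch
below decodes one such line, assuming the layout documented in cec.c,
where the low 12 bits of an element hold 2 decay bits on top of a 10-bit
error count (which is also why the action threshold tops out at 1023).
The struct and function names are made up for the example.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define COUNT_BITS  10                         /* assumed: 12 - DECAY_BITS */
#define COUNT_MASK  ((1u << COUNT_BITS) - 1)   /* 0x3ff == 1023 */
#define DECAY_MASK  0x3u                       /* assumed: 2 decay bits */

struct cec_entry {
	uint64_t pfn;
	unsigned int decay;
	unsigned int count;
};

/* Parse a line such as " 009: [0000000000000458|003]". 0 on success. */
static int parse_entry(const char *line, struct cec_entry *e)
{
	uint64_t pfn;
	unsigned int low;

	if (sscanf(line, " %*d: [%" SCNx64 "|%x]", &pfn, &low) != 2)
		return -1;

	e->pfn   = pfn;
	e->decay = (low >> COUNT_BITS) & DECAY_MASK;
	e->count = low & COUNT_MASK;
	return 0;
}
```

Under that assumption, "c01" reads as decay 3 with one error, "401" as
decay 1 with one error, and "003" as a fully decayed entry that has been
hit three times.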

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

* Re: [PATCH 3/4] RAS: Add a Corrected Errors Collector
  2017-03-23 15:22       ` Borislav Petkov
@ 2017-03-23 17:20         ` Luck, Tony
  2017-03-23 17:28           ` Borislav Petkov
  0 siblings, 1 reply; 15+ messages in thread
From: Luck, Tony @ 2017-03-23 17:20 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: X86 ML, linux-edac, LKML

On Thu, Mar 23, 2017 at 04:22:28PM +0100, Borislav Petkov wrote:
> On Wed, Mar 22, 2017 at 07:03:39PM +0100, Borislav Petkov wrote:
> > Lemme try to write a small script exercising exactly that scenario to
> > see whether I'm actually not talking crap here :-)
> 
> Ok, here's a snapshot from the CEC after letting it run for a couple of
> hours in a guest with a script running twice in parallel and injecting
> random PFNs. We have 0 offlined pages because a PFN number doesn't
> repeat frequently enough to cause an overflow.
> 
> When I force the occurrence of a single PFN for 1023 and more times and
> do that more than once, this happens:
> 
> [ 6629.091239] RAS: Soft-offlining pfn: 0x7fff
> [ 6629.093036] __get_any_page: 0x7fff free buddy page
> [ 6653.259476] RAS: Soft-offlining pfn: 0x7fff
> [ 6653.260100] soft offline: 0x7fff page already poisoned
> 
> ...
> 
> Stats:
> CEs: 32614
> offlined pages: 2
> ^^^^^^^^^^^^^^^^^
> 
> Flags: 0x0
> Timer interval: 86400 seconds
> Decays: 254
> Action threshold: 1023
> 
> The "already poisoned" thing shouldn't happen in real life because once
> the page frame is poisoned, it shouldn't generate MCEs.

It can happen if Linux didn't actually take the page offline
(because it was a kernel page). The CEC code only knows that
it queued this page to be taken offline ... and has no way
to know if that succeeded or not.

Some people have grumbled about mcelog(8) doing the same thing.

So is it worth keeping track of the page numbers that we
tried to offline?  If they show up again we shouldn't add
them back into the array.

-Tony

* Re: [PATCH 3/4] RAS: Add a Corrected Errors Collector
  2017-03-23 17:20         ` Luck, Tony
@ 2017-03-23 17:28           ` Borislav Petkov
  2017-03-23 18:20             ` Luck, Tony
  0 siblings, 1 reply; 15+ messages in thread
From: Borislav Petkov @ 2017-03-23 17:28 UTC (permalink / raw)
  To: Luck, Tony; +Cc: X86 ML, linux-edac, LKML

On Thu, Mar 23, 2017 at 10:20:31AM -0700, Luck, Tony wrote:
> It can happen if Linux didn't actually take the page offline
> (because it was a kernel page). The CEC code only knows that
> it queued this page to be taken offline ... and has no way
> to know if that succeeded or not.

Right, that's the case when the offlining fails for whatever reason.

> So is it worth keeping track of the page numbers that we
> tried to offline?  If they show up again we shouldn't add
> them back into the array.

Meh, I don't like the idea of keeping an ever-growing list of PFNs we
can't do anything about anyway.

And actually, you want the kernel to keep complaining about not being
able to offline those because then admins should consider speeding up
the arrival of the maintenance window - the kernel memory itself is
going sick so that not even RAS actions help here.

I'm wondering if we should make the offlining code dump a more
comprehensible message with hints about what to do...

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

* Re: [PATCH 3/4] RAS: Add a Corrected Errors Collector
  2017-03-23 17:28           ` Borislav Petkov
@ 2017-03-23 18:20             ` Luck, Tony
  2017-03-24 11:09               ` Borislav Petkov
  0 siblings, 1 reply; 15+ messages in thread
From: Luck, Tony @ 2017-03-23 18:20 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: X86 ML, linux-edac, LKML

On Thu, Mar 23, 2017 at 06:28:39PM +0100, Borislav Petkov wrote:
> Meh, I don't like the idea of keeping an ever-growing list of PFNs we
> can't do anything about anyway.

Keeping every PFN would be overkill (most of them should be taken
offline with no issues).  A fixed array of a few of them with timestamps
to drop the oldest would likely be a good enough(TM) solution.

> And actually, you want the kernel to keep complaining about not being
> able to offline those because then admins should consider speeding up
> the arrival of the maintenance window - the kernel memory itself is
> going sick so that not even RAS actions help here.

Worst case is pretty ugly. A frequently used kernel page with a stuck
bit could be added to the CEC array, overflow, and generate a message
at a pretty high rate.

> I'm wondering if we should make the offlining code dump a more
> comprehensible message with hints what to do...

Maybe ... but it gets into opinion rather than science. Some folks
think that very low numbers of corrected errors warrant DIMM replacement.
Others think that you can keep running almost forever with several
stuck bits per DIMM.

Some of the best decisions would be made by correlating error logs
from multiple reboots ... which the kernel can't do.

-Tony

* Re: [PATCH 3/4] RAS: Add a Corrected Errors Collector
  2017-03-23 18:20             ` Luck, Tony
@ 2017-03-24 11:09               ` Borislav Petkov
  0 siblings, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2017-03-24 11:09 UTC (permalink / raw)
  To: Luck, Tony; +Cc: X86 ML, linux-edac, LKML

On Thu, Mar 23, 2017 at 11:20:44AM -0700, Luck, Tony wrote:
> Keeping every PFN would be overkill (most of them should be taken
> offline with no issues).  A fixed array of a few of them with timestamps
> to drop the oldest would likely be a good enough(TM) solution.

The reason being? Prevent the CEC from adding it and trying to
unsuccessfully offline it again?

If so, that means, we will query that list on every element insertion so
it needs to be something we can search pretty quickly.
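
To make the trade-off concrete, here is a minimal sketch of the
suggested "fixed array of a few PFNs with timestamps". Everything in it
is hypothetical (FAILED_SLOTS, the struct layout, the function names,
the caller-supplied timestamp); it only shows that with a handful of
slots the lookup on every insertion is a short linear scan.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FAILED_SLOTS 4  /* hypothetical size: "a few of them" */

struct failed_pfn {
	uint64_t pfn;
	uint64_t stamp;  /* time of the last failed offline attempt */
	bool used;
};

struct failed_list {
	struct failed_pfn e[FAILED_SLOTS];
};

/* Checked before adding a PFN to the collector array. */
static bool failed_contains(const struct failed_list *l, uint64_t pfn)
{
	for (size_t i = 0; i < FAILED_SLOTS; i++)
		if (l->e[i].used && l->e[i].pfn == pfn)
			return true;
	return false;
}

/* Remember a PFN we tried to offline; the oldest entry is dropped. */
static void failed_record(struct failed_list *l, uint64_t pfn, uint64_t now)
{
	size_t victim = 0;

	/* Already known: just refresh the timestamp. */
	for (size_t i = 0; i < FAILED_SLOTS; i++) {
		if (l->e[i].used && l->e[i].pfn == pfn) {
			l->e[i].stamp = now;
			return;
		}
	}

	/* Otherwise take the first free slot, or else the oldest entry. */
	for (size_t i = 0; i < FAILED_SLOTS; i++) {
		if (!l->e[i].used) {
			victim = i;
			break;
		}
		if (l->e[i].stamp < l->e[victim].stamp)
			victim = i;
	}

	l->e[victim] = (struct failed_pfn){ .pfn = pfn, .stamp = now, .used = true };
}
```

The list never grows, so it avoids an unbounded set of PFNs, at the
price of forgetting old failures once more than FAILED_SLOTS pages have
failed to offline.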

> Worst case is pretty ugly. A frequently used kernel page with a stuck
> bit could be added to the CEC array, overflow, and generate a message
> at a pretty high rate.

Oh sure, but it would still be lower rate than generating a message for
*each* correctable error. And I really think that these messages should
*not* be suppressed as they're important. The CEC kinda ratelimits them a
bit though...

> Maybe ... but it gets into opinion rather than science. Some folks
> think that very low numbers of corrected errors warrant DIMM replacement.
> Others think that you can keep running almost forever with several
> stuck bits per DIMM.
> 
> Some of the best decisions would be made by correlating error logs
> from multiple reboots ... which the kernel can't do.

... and maybe even then it doesn't fit everybody's use case.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

end of thread, other threads:[~2017-03-24 11:09 UTC | newest]

Thread overview: 15+ messages
2017-03-09 10:08 [PATCH 0/4] RAS: Add CEC collector and deprecate mcelog Borislav Petkov
2017-03-09 10:08 ` [PATCH 1/4] x86/MCE: Rename mce_log()'s argument Borislav Petkov
2017-03-09 10:08 ` [PATCH 2/4] x86/MCE: Rename mce_log to mce_log_buffer Borislav Petkov
2017-03-09 10:08 ` [PATCH 3/4] RAS: Add a Corrected Errors Collector Borislav Petkov
2017-03-12 13:43   ` Boris Petkov
2017-03-20 22:48   ` Luck, Tony
2017-03-22 18:03     ` Borislav Petkov
2017-03-23 15:22       ` Borislav Petkov
2017-03-23 17:20         ` Luck, Tony
2017-03-23 17:28           ` Borislav Petkov
2017-03-23 18:20             ` Luck, Tony
2017-03-24 11:09               ` Borislav Petkov
2017-03-22 19:00   ` Luck, Tony
2017-03-22 19:22     ` Borislav Petkov
2017-03-09 10:08 ` [PATCH 4/4] x86/mce: Deprecate /dev/mcelog Borislav Petkov
