linux-kernel.vger.kernel.org archive mirror
* [PATCH] x86_64: Notify user of MCE events.
@ 2005-01-09  4:29 Zwane Mwaikambo
  2005-01-09 16:44 ` Andi Kleen
  0 siblings, 1 reply; 8+ messages in thread
From: Zwane Mwaikambo @ 2005-01-09  4:29 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andrew Morton, Linux Kernel

x86_64 uses a userspace mce utility to decode MCEs; this patch ensures
that the user is also notified when MCE events are logged.

Signed-off-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>

Index: linux-2.6.10-mm1/arch/x86_64/kernel/mce.c
===================================================================
RCS file: /home/cvsroot/linux-2.6.10-mm1/arch/x86_64/kernel/mce.c,v
retrieving revision 1.1.1.1
diff -u -p -B -r1.1.1.1 mce.c
--- linux-2.6.10-mm1/arch/x86_64/kernel/mce.c	4 Jan 2005 04:03:35 -0000	1.1.1.1
+++ linux-2.6.10-mm1/arch/x86_64/kernel/mce.c	9 Jan 2005 03:15:48 -0000
@@ -31,6 +32,8 @@ static int mce_dont_init;
 static int tolerant = 1;
 static int banks;
 static unsigned long bank[NR_BANKS] = { [0 ... NR_BANKS-1] = ~0UL };
+static unsigned long console_logged;
+static int notify_user;
 
 /*
  * Lockless MCE logging infrastructure.
@@ -68,6 +71,9 @@ void mce_log(struct mce *mce)
 	smp_wmb();
 	mcelog.entry[entry].finished = 1;
 	smp_wmb();
+
+	if (!test_and_set_bit(0, &console_logged))
+		notify_user = 1;
 }
 
 static void print_mce(struct mce *m)
@@ -252,6 +258,12 @@ static void mcheck_timer(void *data)
 {
 	on_each_cpu(mcheck_check_cpu, NULL, 1, 1);
 	schedule_delayed_work(&mcheck_work, check_interval * HZ);
+
+	if (notify_user && console_logged) {
+		notify_user = 0;
+		clear_bit(0, &console_logged);
+		printk(KERN_EMERG "Machine check exception logged, run mcelog\n");
+	}
 }
 
 
 
 


* Re: [PATCH] x86_64: Notify user of MCE events.
  2005-01-09  4:29 [PATCH] x86_64: Notify user of MCE events Zwane Mwaikambo
@ 2005-01-09 16:44 ` Andi Kleen
  2005-01-09 17:10   ` [PATCH] x86_64: Notify user of MCE events (updated) Zwane Mwaikambo
  0 siblings, 1 reply; 8+ messages in thread
From: Andi Kleen @ 2005-01-09 16:44 UTC (permalink / raw)
  To: Zwane Mwaikambo; +Cc: Andrew Morton, Linux Kernel

Zwane Mwaikambo <zwane@arm.linux.org.uk> writes:

> +
> +	if (!test_and_set_bit(0, &console_logged))
> +		notify_user = 1;
>  }
>  
>  static void print_mce(struct mce *m)
> @@ -252,6 +258,12 @@ static void mcheck_timer(void *data)
>  {
>  	on_each_cpu(mcheck_check_cpu, NULL, 1, 1);
>  	schedule_delayed_work(&mcheck_work, check_interval * HZ);
> +
> +	if (notify_user && console_logged) {

Perhaps a comment here that the race is harmless? 

> +		notify_user = 0;
> +		clear_bit(0, &console_logged);
> +		printk(KERN_EMERG "Machine check exception logged, run mcelog\n");

I would drop the "run mcelog". It's misleading if mcelog is already
running from cron, as it should be.

-Andi
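
The handshake between mce_log() and mcheck_timer() in the patch above can be modelled in userspace. What follows is a minimal sketch, not kernel code: it assumes C11 atomics as stand-ins for test_and_set_bit() and clear_bit(), and the function names log_event(), poll_notify() and demo() are hypothetical. It shows why any number of logged events between two timer runs produces at most one message.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-ins for the patch's variables (assumption: C11 atomics
 * replace the kernel's test_and_set_bit()/clear_bit()). */
static atomic_ulong console_logged;	/* bit 0 set: a notification is pending */
static atomic_int notify_user;

/* Producer side, modelled on the lines added to mce_log(). */
static void log_event(void)
{
	/* atomic_fetch_or() returns the previous value, so only the first
	 * logger since the last notification sees bit 0 clear. */
	if (!(atomic_fetch_or(&console_logged, 1UL) & 1UL))
		atomic_store(&notify_user, 1);
}

/* Consumer side, modelled on the lines added to mcheck_timer().
 * Returns true where the patch would call printk(). */
static bool poll_notify(void)
{
	if (atomic_load(&notify_user) &&
	    (atomic_load(&console_logged) & 1UL)) {
		atomic_store(&notify_user, 0);
		atomic_fetch_and(&console_logged, ~1UL);
		return true;
	}
	return false;
}

/* Hypothetical usage: a burst of events collapses into one notification. */
static void demo(void)
{
	assert(!poll_notify());	/* nothing logged yet */
	log_event();
	log_event();
	assert(poll_notify());	/* one message for the whole burst */
	assert(!poll_notify());	/* nothing pending afterwards */
}
```

If the timer side reads a stale value in this model, the notification is only delayed until the next poll, which is the sense in which the race is harmless.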


* Re: [PATCH] x86_64: Notify user of MCE events (updated)
  2005-01-09 16:44 ` Andi Kleen
@ 2005-01-09 17:10   ` Zwane Mwaikambo
  2005-01-09 18:14     ` Andi Kleen
  0 siblings, 1 reply; 8+ messages in thread
From: Zwane Mwaikambo @ 2005-01-09 17:10 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andrew Morton, Linux Kernel

x86_64 uses a userspace mce utility to decode MCEs; this patch ensures
that the user is also notified when MCE events are logged.

Updated to incorporate suggestions from Andi Kleen.

Signed-off-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>

Index: linux-2.6.10-mm1/arch/x86_64/kernel/mce.c
===================================================================
RCS file: /home/cvsroot/linux-2.6.10-mm1/arch/x86_64/kernel/mce.c,v
retrieving revision 1.1.1.1
diff -u -p -B -r1.1.1.1 mce.c
--- linux-2.6.10-mm1/arch/x86_64/kernel/mce.c	4 Jan 2005 04:03:35 -0000	1.1.1.1
+++ linux-2.6.10-mm1/arch/x86_64/kernel/mce.c	9 Jan 2005 17:09:39 -0000
@@ -31,6 +31,8 @@ static int mce_dont_init;
 static int tolerant = 1;
 static int banks;
 static unsigned long bank[NR_BANKS] = { [0 ... NR_BANKS-1] = ~0UL };
+static unsigned long console_logged;
+static int notify_user;
 
 /*
  * Lockless MCE logging infrastructure.
@@ -68,6 +70,9 @@ void mce_log(struct mce *mce)
 	smp_wmb();
 	mcelog.entry[entry].finished = 1;
 	smp_wmb();
+
+	if (!test_and_set_bit(0, &console_logged))
+		notify_user = 1;
 }
 
 static void print_mce(struct mce *m)
@@ -252,6 +257,19 @@ static void mcheck_timer(void *data)
 {
 	on_each_cpu(mcheck_check_cpu, NULL, 1, 1);
 	schedule_delayed_work(&mcheck_work, check_interval * HZ);
+
+	/*
+	 * It's ok to read stale data here for notify_user and
+	 * console_logged as we'll simply get the updated versions
+	 * on the next mcheck_timer execution and atomic operations
+	 * on console_logged act as synchronization for notify_user
+	 * writes.
+	 */
+	if (notify_user && console_logged) {
+		notify_user = 0;
+		clear_bit(0, &console_logged);
+		printk(KERN_EMERG "Machine check exception logged\n");
+	}
 }
 
 


* Re: [PATCH] x86_64: Notify user of MCE events (updated)
  2005-01-09 17:10   ` [PATCH] x86_64: Notify user of MCE events (updated) Zwane Mwaikambo
@ 2005-01-09 18:14     ` Andi Kleen
  2005-01-09 18:20       ` [PATCH] x86_64: Notify user of MCE events (updated 2) Zwane Mwaikambo
                         ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Andi Kleen @ 2005-01-09 18:14 UTC (permalink / raw)
  To: Zwane Mwaikambo; +Cc: Andrew Morton, Linux Kernel

Zwane Mwaikambo <zwane@arm.linux.org.uk> writes:
> +	 */
> +	if (notify_user && console_logged) {
> +		notify_user = 0;
> +		clear_bit(0, &console_logged);
> +		printk(KERN_EMERG "Machine check exception logged\n");

Another suggestion: don't make this KERN_EMERG. Make it KERN_INFO.
Logged errors are usually corrected, so there is no need for an
emergency.

Also, since these are not always exceptions (they can also be read from
the polling timer), I would call them "machine check events".
-Andi


* Re: [PATCH] x86_64: Notify user of MCE events (updated 2)
  2005-01-09 18:14     ` Andi Kleen
@ 2005-01-09 18:20       ` Zwane Mwaikambo
  2005-01-09 18:27       ` [PATCH] x86_64: Notify user of MCE events (updated) Zwane Mwaikambo
  2005-01-09 18:30       ` Andreas Steinmetz
  2 siblings, 0 replies; 8+ messages in thread
From: Zwane Mwaikambo @ 2005-01-09 18:20 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andrew Morton, Linux Kernel

x86_64 uses a userspace mce utility to decode MCEs; this patch ensures
that the user is also notified when MCE events are logged.

Updated to incorporate suggestions from Andi Kleen

Signed-off-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>

Index: linux-2.6.10-mm1/arch/x86_64/kernel/mce.c
===================================================================
RCS file: /home/cvsroot/linux-2.6.10-mm1/arch/x86_64/kernel/mce.c,v
retrieving revision 1.1.1.1
diff -u -p -B -r1.1.1.1 mce.c
--- linux-2.6.10-mm1/arch/x86_64/kernel/mce.c	4 Jan 2005 04:03:35 -0000	1.1.1.1
+++ linux-2.6.10-mm1/arch/x86_64/kernel/mce.c	9 Jan 2005 18:18:25 -0000
@@ -31,6 +31,8 @@ static int mce_dont_init;
 static int tolerant = 1;
 static int banks;
 static unsigned long bank[NR_BANKS] = { [0 ... NR_BANKS-1] = ~0UL };
+static unsigned long console_logged;
+static int notify_user;
 
 /*
  * Lockless MCE logging infrastructure.
@@ -68,6 +70,9 @@ void mce_log(struct mce *mce)
 	smp_wmb();
 	mcelog.entry[entry].finished = 1;
 	smp_wmb();
+
+	if (!test_and_set_bit(0, &console_logged))
+		notify_user = 1;
 }
 
 static void print_mce(struct mce *m)
@@ -252,6 +257,19 @@ static void mcheck_timer(void *data)
 {
 	on_each_cpu(mcheck_check_cpu, NULL, 1, 1);
 	schedule_delayed_work(&mcheck_work, check_interval * HZ);
+
+	/*
+	 * It's ok to read stale data here for notify_user and
+	 * console_logged as we'll simply get the updated versions
+	 * on the next mcheck_timer execution and atomic operations
+	 * on console_logged act as synchronization for notify_user
+	 * writes.
+	 */
+	if (notify_user && console_logged) {
+		notify_user = 0;
+		clear_bit(0, &console_logged);
+		printk(KERN_INFO "Machine check events logged\n");
+	}
 }
 
 


* Re: [PATCH] x86_64: Notify user of MCE events (updated)
  2005-01-09 18:14     ` Andi Kleen
  2005-01-09 18:20       ` [PATCH] x86_64: Notify user of MCE events (updated 2) Zwane Mwaikambo
@ 2005-01-09 18:27       ` Zwane Mwaikambo
  2005-01-09 18:30       ` Andreas Steinmetz
  2 siblings, 0 replies; 8+ messages in thread
From: Zwane Mwaikambo @ 2005-01-09 18:27 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andrew Morton, Linux Kernel

On Sun, 9 Jan 2005, Andi Kleen wrote:

> Zwane Mwaikambo <zwane@arm.linux.org.uk> writes:
> > +	 */
> > +	if (notify_user && console_logged) {
> > +		notify_user = 0;
> > +		clear_bit(0, &console_logged);
> > +		printk(KERN_EMERG "Machine check exception logged\n");
> 
> Another suggestion: don't make this KERN_EMERG. Make it KERN_INFO.
> Logged errors are usually corrected, so there is no need for an
> emergency.
> 
> Also since these are not always exceptions (but can be read from
> the polling timer) I would call them "machine check events" 

Thanks for the comments; I've updated the patch.



* Re: [PATCH] x86_64: Notify user of MCE events (updated)
  2005-01-09 18:14     ` Andi Kleen
  2005-01-09 18:20       ` [PATCH] x86_64: Notify user of MCE events (updated 2) Zwane Mwaikambo
  2005-01-09 18:27       ` [PATCH] x86_64: Notify user of MCE events (updated) Zwane Mwaikambo
@ 2005-01-09 18:30       ` Andreas Steinmetz
  2005-01-09 18:40         ` Andi Kleen
  2 siblings, 1 reply; 8+ messages in thread
From: Andreas Steinmetz @ 2005-01-09 18:30 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Zwane Mwaikambo, Andrew Morton, Linux Kernel

Andi Kleen wrote:
> Zwane Mwaikambo <zwane@arm.linux.org.uk> writes:
> 
>>+	 */
>>+	if (notify_user && console_logged) {
>>+		notify_user = 0;
>>+		clear_bit(0, &console_logged);
>>+		printk(KERN_EMERG "Machine check exception logged\n");
> 
> 
> Another suggestion: don't make this KERN_EMERG. Make it KERN_INFO.
> Logged errors are usually corrected, so there is no need for an
> emergency.

Just asking:
How about KERN_NOTICE? KERN_INFO is in my opinion too easily lost in the 
syslog noise. Personally I'm logging KERN_INFO just to console, 
KERN_NOTICE however to file.
An MCE event would suit the description "normal but significant 
condition" of KERN_NOTICE as far as I can see.
-- 
Andreas Steinmetz                       SPAMmers use robotrap@domdv.de


* Re: [PATCH] x86_64: Notify user of MCE events (updated)
  2005-01-09 18:30       ` Andreas Steinmetz
@ 2005-01-09 18:40         ` Andi Kleen
  0 siblings, 0 replies; 8+ messages in thread
From: Andi Kleen @ 2005-01-09 18:40 UTC (permalink / raw)
  To: Andreas Steinmetz; +Cc: Zwane Mwaikambo, Andrew Morton, Linux Kernel

On Sun, Jan 09, 2005 at 07:30:56PM +0100, Andreas Steinmetz wrote:
> Just asking:
> How about KERN_NOTICE? KERN_INFO is in my opinion too easily lost in the 

It's still in the mcelog if you want it.

The main idea behind the separate mcelog was to make it unobtrusive, to
reduce the MCE-related support load (about 20% of all users who get such a
message seem to think it's a kernel bug and ask kernel developers).
KERN_INFO is at the right level of unobtrusiveness.

-Andi
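
For readers comparing the levels discussed in this thread, here is a rough sketch of how the severity numbers order and filter messages. It assumes the usual syslog numbering (0 is most severe) and the kernel rule that a message reaches the console only when its level is numerically lower than the console loglevel. The LVL_* names and both helper functions are hypothetical, and the file-logging threshold is only a model of a setup like the one Andreas describes.

```c
#include <stdbool.h>

/* Severity numbers as used by syslog and the kernel's KERN_* prefixes.
 * The LVL_* names are hypothetical; the numeric values are the standard ones. */
enum {
	LVL_EMERG   = 0,	/* KERN_EMERG: system is unusable */
	LVL_ALERT   = 1,
	LVL_CRIT    = 2,
	LVL_ERR     = 3,
	LVL_WARNING = 4,
	LVL_NOTICE  = 5,	/* KERN_NOTICE: normal but significant condition */
	LVL_INFO    = 6,	/* KERN_INFO: informational */
	LVL_DEBUG   = 7
};

/* Kernel-style console filter: a message is shown on the console only if
 * its level is numerically lower than the console loglevel. */
static bool reaches_console(int msg_level, int console_loglevel)
{
	return msg_level < console_loglevel;
}

/* Syslog-style threshold (an assumed model of a "NOTICE and above to file"
 * rule): keep messages at least as severe as the least severe level kept. */
static bool passes_threshold(int msg_level, int least_severe_kept)
{
	return msg_level <= least_severe_kept;
}
```

Under this model, a KERN_INFO message is dropped by a file rule that keeps KERN_NOTICE and above, which is the concern raised about KERN_INFO being easy to miss. The event is still recorded in the mcelog either way.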

