* [RFC PATCH] x86: Do not panic if mce=2 is passed
@ 2016-09-16 20:23 Yinghai Lu
2016-09-16 20:28 ` Luck, Tony
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Yinghai Lu @ 2016-09-16 20:23 UTC (permalink / raw)
To: Tony Luck, Borislav Petkov
Cc: the arch/x86 maintainers, Linux Kernel Mailing List, linux-edac,
Yinghai Lu
From: Yinghai Lu <yinghai.lu@oracle.com>
For UE (uncorrected error) recovery support, we currently need mce=2 on
the command line and also need to disable panic_on_oops with sysctl.
But other users may still need to keep panic_on_oops set to 1 at all times.
We can remove the panic_on_oops check from the mce-severity path.
We should be OK, as on the default path when mce=2 is not passed, tolerant
is 0, so they will still get MCE_PANIC_SEVERITY returned.
Signed-off-by: Yinghai Lu <yinghai.lu@oracle.com>
---
arch/x86/kernel/cpu/mcheck/mce-severity.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: linux-2.6/arch/x86/kernel/cpu/mcheck/mce-severity.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/cpu/mcheck/mce-severity.c
+++ linux-2.6/arch/x86/kernel/cpu/mcheck/mce-severity.c
@@ -311,7 +311,7 @@ static int mce_severity_intel(struct mce
 		*msg = s->msg;
 		s->covered = 1;
 		if (s->sev >= MCE_UC_SEVERITY && ctx == IN_KERNEL) {
-			if (panic_on_oops || tolerant < 1)
+			if (tolerant < 1)
 				return MCE_PANIC_SEVERITY;
 		}
 		return s->sev;
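
[ Illustration, not part of the patch. ] The effect of the one-line change
can be sketched as a small standalone C model. This is only a sketch under
stated assumptions: the enum values, the CTX_* names and the functions
grade_old(), grade_new() and print_grading_table() are hypothetical
stand-ins, and only the ordering UC < AR < PANIC is taken from the thread.
It is not the kernel's mce_severity_intel().

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel's severity levels; the numeric
 * values are assumptions, only the ordering UC < AR < PANIC matters. */
enum sev { SEV_UC = 1, SEV_AR = 2, SEV_PANIC = 3 };

/* Hypothetical stand-in for the context the error was detected in. */
enum ctx { CTX_IN_USER, CTX_IN_KERNEL };

/* Grading before the patch: panic_on_oops alone forces a panic. */
int grade_old(int sev, enum ctx ctx, int tolerant, bool panic_on_oops)
{
	if (sev >= SEV_UC && ctx == CTX_IN_KERNEL) {
		if (panic_on_oops || tolerant < 1)
			return SEV_PANIC;
	}
	return sev;
}

/* Grading after the patch: only the tolerant level decides.
 * panic_on_oops is kept as a parameter purely to make the two
 * functions comparable; it is deliberately ignored. */
int grade_new(int sev, enum ctx ctx, int tolerant, bool panic_on_oops)
{
	(void)panic_on_oops;
	if (sev >= SEV_UC && ctx == CTX_IN_KERNEL) {
		if (tolerant < 1)
			return SEV_PANIC;
	}
	return sev;
}

/* Print old and new grades for an AR-severity error in kernel context
 * across every (panic_on_oops, tolerant) combination. */
void print_grading_table(void)
{
	for (int poo = 0; poo <= 1; poo++)
		for (int tol = 0; tol <= 3; tol++)
			printf("panic_on_oops=%d tolerant=%d old=%d new=%d\n",
			       poo, tol,
			       grade_old(SEV_AR, CTX_IN_KERNEL, tol, poo),
			       grade_new(SEV_AR, CTX_IN_KERNEL, tol, poo));
}
```

In this model the two functions differ only when panic_on_oops is set and
tolerant is 1 or higher, which is the mce=2 case the patch is about.
With tolerant at 0 both still return the panic severity.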
* RE: [RFC PATCH] x86: Do not panic if mce=2 is passed
2016-09-16 20:23 [RFC PATCH] x86: Do not panic if mce=2 is passed Yinghai Lu
@ 2016-09-16 20:28 ` Luck, Tony
2016-09-18 18:39 ` Borislav Petkov
2016-10-31 10:57 ` Borislav Petkov
2016-11-08 16:18 ` [tip:ras/core] x86/MCE: Do not look at panic_on_oops in the severity grading tip-bot for Yinghai Lu
2 siblings, 1 reply; 5+ messages in thread
From: Luck, Tony @ 2016-09-16 20:28 UTC (permalink / raw)
To: Yinghai Lu, Borislav Petkov
Cc: the arch/x86 maintainers, Linux Kernel Mailing List, linux-edac,
Yinghai Lu
> For UE recovery support, we currently need mce=2 on the command line
> and also need to disable panic_on_oops with sysctl.
Please explain. I've never given mce=2 on command line, and have
had my kernel recover from thousands of (injected) UE memory errors.
-Tony
* Re: [RFC PATCH] x86: Do not panic if mce=2 is passed
2016-09-16 20:28 ` Luck, Tony
@ 2016-09-18 18:39 ` Borislav Petkov
0 siblings, 0 replies; 5+ messages in thread
From: Borislav Petkov @ 2016-09-18 18:39 UTC (permalink / raw)
To: Luck, Tony
Cc: Yinghai Lu, the arch/x86 maintainers, Linux Kernel Mailing List,
linux-edac, Yinghai Lu
On Fri, Sep 16, 2016 at 08:28:44PM +0000, Luck, Tony wrote:
> > For UE recovery support, we currently need mce=2 on the command line
> > and also need to disable panic_on_oops with sysctl.
>
> Please explain. I've never given mce=2 on command line, and have
> had my kernel recover from thousands of (injected) UE memory errors.
So frankly, that panic_on_oops doesn't make a whole lotta sense to me.
It is promoting MCEs with severity MCE_UC_SEVERITY and higher to a
panic.
So let's look at those:
	MCE_UC_SEVERITY,	- we don't do anything special in the kernel for
				  those so just as well.
	MCE_AR_SEVERITY,	- those end up in the memory failure code if
				  they're memory errors
	MCE_PANIC_SEVERITY,	- causes panic
so if anything, panic_on_oops shouldn't control the panicking behavior
as tolerant does that already:
 * Tolerant levels:
 *   0: always panic on uncorrected errors, log corrected errors
 *   1: panic or SIGBUS on uncorrected errors, log corrected errors
 *   2: SIGBUS or log uncorrected errors (if possible), log corr. errors
 *   3: never panic or SIGBUS, log all errors (for testing only)
IOW, I think that patch makes sense but please doublecheck my logic
above first.
Thanks.
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
--
* Re: [RFC PATCH] x86: Do not panic if mce=2 is passed
2016-09-16 20:23 [RFC PATCH] x86: Do not panic if mce=2 is passed Yinghai Lu
2016-09-16 20:28 ` Luck, Tony
@ 2016-10-31 10:57 ` Borislav Petkov
2016-11-08 16:18 ` [tip:ras/core] x86/MCE: Do not look at panic_on_oops in the severity grading tip-bot for Yinghai Lu
2 siblings, 0 replies; 5+ messages in thread
From: Borislav Petkov @ 2016-10-31 10:57 UTC (permalink / raw)
To: Yinghai Lu
Cc: Tony Luck, the arch/x86 maintainers, Linux Kernel Mailing List,
linux-edac, Yinghai Lu
On Fri, Sep 16, 2016 at 01:23:25PM -0700, Yinghai Lu wrote:
> From: Yinghai Lu <yinghai.lu@oracle.com>
>
> For UE recovery support, we currently need mce=2 on the command line
> and also need to disable panic_on_oops with sysctl.
>
> But other users may still need to keep panic_on_oops set to 1 at all times.
>
> We can remove the panic_on_oops check from the mce-severity path.
>
> We should be OK, as on the default path when mce=2 is not passed, tolerant
> is 0, so they will still get MCE_PANIC_SEVERITY returned.
>
> Signed-off-by: Yinghai Lu <yinghai.lu@oracle.com>
>
>
> ---
> arch/x86/kernel/cpu/mcheck/mce-severity.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> Index: linux-2.6/arch/x86/kernel/cpu/mcheck/mce-severity.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/kernel/cpu/mcheck/mce-severity.c
> +++ linux-2.6/arch/x86/kernel/cpu/mcheck/mce-severity.c
> @@ -311,7 +311,7 @@ static int mce_severity_intel(struct mce
> *msg = s->msg;
> s->covered = 1;
> if (s->sev >= MCE_UC_SEVERITY && ctx == IN_KERNEL) {
> - if (panic_on_oops || tolerant < 1)
> + if (tolerant < 1)
> return MCE_PANIC_SEVERITY;
> }
> return s->sev;
Applied,
thanks.
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
* [tip:ras/core] x86/MCE: Do not look at panic_on_oops in the severity grading
2016-09-16 20:23 [RFC PATCH] x86: Do not panic if mce=2 is passed Yinghai Lu
2016-09-16 20:28 ` Luck, Tony
2016-10-31 10:57 ` Borislav Petkov
@ 2016-11-08 16:18 ` tip-bot for Yinghai Lu
2 siblings, 0 replies; 5+ messages in thread
From: tip-bot for Yinghai Lu @ 2016-11-08 16:18 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, mingo, tony.luck, hpa, linux-edac, tglx, x86,
yinghai.lu, bp
Commit-ID: f5e886ef9b45a3dbfd42b054a13c755894ea8402
Gitweb: http://git.kernel.org/tip/f5e886ef9b45a3dbfd42b054a13c755894ea8402
Author: Yinghai Lu <yinghai.lu@oracle.com>
AuthorDate: Fri, 16 Sep 2016 13:23:25 -0700
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitDate: Tue, 8 Nov 2016 17:10:12 +0100
x86/MCE: Do not look at panic_on_oops in the severity grading
The MCE tolerance levels control whether we panic on a machine check or do
something else like generating a signal and logging error information. This
is controlled by the mce=<level> command line parameter.
However, if panic_on_oops is set, it will force a panic for such an MCE
even though the user didn't want that.
So don't check panic_on_oops in the severity grading anymore.
One of the use cases for that is recovery from uncorrectable errors with
mce=2.
[ Boris: rewrite commit message. ]
Signed-off-by: Yinghai Lu <yinghai.lu@oracle.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/20160916202325.4972-1-yinghai@kernel.org
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/kernel/cpu/mcheck/mce-severity.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity.c
index 631356c..c7efbcf 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
@@ -311,7 +311,7 @@ static int mce_severity_intel(struct mce *m, int tolerant, char **msg, bool is_e
 		*msg = s->msg;
 		s->covered = 1;
 		if (s->sev >= MCE_UC_SEVERITY && ctx == IN_KERNEL) {
-			if (panic_on_oops || tolerant < 1)
+			if (tolerant < 1)
 				return MCE_PANIC_SEVERITY;
 		}
 		return s->sev;