All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] x86, mce, severities: Add AMD severities function
@ 2015-03-16 17:16 Aravind Gopalakrishnan
  2015-03-17  7:42 ` Ingo Molnar
  2015-03-17 10:20 ` Borislav Petkov
  0 siblings, 2 replies; 16+ messages in thread
From: Aravind Gopalakrishnan @ 2015-03-16 17:16 UTC (permalink / raw)
  To: tglx, mingo, hpa, tony.luck, bp, slaoub, luto, x86, linux-kernel,
	linux-edac
  Cc: Aravind Gopalakrishnan, Aravind Gopalakrishnan

Add a severities function that caters to AMD processors.
This allows us to do some vendor specific work within the
function if necessary.

Also, introduce a vendor flag bitfield which contains vendor
specific flags. The severities code uses this to define error
scope based on the prescence of the flags field.

This is based off of work by Boris Petkov.

Testing details:
Tested the patch for any regressions on F15hM0h-0fh (Orochi)
and F15hM60h-6fh (Carrizo) and it works fine.

Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
---
 arch/x86/include/asm/mce.h                |  6 ++++
 arch/x86/kernel/cpu/mcheck/mce-severity.c | 52 +++++++++++++++++++++++++++++++
 arch/x86/kernel/cpu/mcheck/mce.c          |  9 ++++++
 3 files changed, 67 insertions(+)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index fd38a23..b574fbf 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -116,6 +116,12 @@ struct mca_config {
 	u32 rip_msr;
 };
 
+struct mce_vendor_flags {
+	__u64		overflow_recov	: 1, /* cpuid_ebx(80000007) */
+			__reserved_0	: 63;
+};
+extern struct mce_vendor_flags mce_flags;
+
 extern struct mca_config mca_cfg;
 extern void mce_register_decode_chain(struct notifier_block *nb);
 extern void mce_unregister_decode_chain(struct notifier_block *nb);
diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity.c
index 8bb4330..2af82b5 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
@@ -186,12 +186,64 @@ static int error_context(struct mce *m)
 	return ((m->cs & 3) == 3) ? IN_USER : IN_KERNEL;
 }
 
+/* keeping amd_mce_severity in sync with AMD error scope heirarchy table */
+static int amd_mce_severity(struct mce *m, enum context ctx)
+{
+	/* Processor Context Corrupt, no need to fumble too much, die! */
+	if (m->status & MCI_STATUS_PCC)
+		return MCE_PANIC_SEVERITY;
+
+	if (m->status & MCI_STATUS_UC) {
+		/*
+		 * On older systems, where overflow_recov flag is not
+		 * present, we should simply PANIC if Overflow occurs.
+		 * If overflow_recov flag set, then SW can try
+		 * to at least kill process to salvage systen operation.
+		 */
+
+		/* at least one error was not logged */
+		if (m->status & MCI_STATUS_OVER && !mce_flags.overflow_recov)
+				return MCE_PANIC_SEVERITY;
+
+		/* software can try to contain */
+		if (!(m->mcgstatus & MCG_STATUS_RIPV) &&
+		      mce_flags.overflow_recov) {
+			if (ctx == IN_KERNEL)
+				return MCE_PANIC_SEVERITY;
+
+			/* kill current process */
+			return MCE_AR_SEVERITY;
+		}
+		/*
+		 * any other case, return MCE_UC_SEVERITY so that
+		 * we log the error and exit #MC handler.
+		 */
+		return MCE_UC_SEVERITY;
+	}
+
+	/*
+	 * deferred error: poll handler catches these and adds to mce_ring
+	 * so memory-failure can take recovery actions.
+	 */
+	if (m->status & MCI_STATUS_DEFERRED)
+		return MCE_DEFERRED_SEVERITY;
+
+	/*
+	 * corrected error: poll handler catches these and passes
+	 * responsibility of decoding the error to EDAC
+	 */
+	return MCE_KEEP_SEVERITY;
+}
+
 int mce_severity(struct mce *m, int tolerant, char **msg, bool is_excp)
 {
 	enum exception excp = (is_excp ? EXCP_CONTEXT : NO_EXCP);
 	enum context ctx = error_context(m);
 	struct severity *s;
 
+	if (m->cpuvendor == X86_VENDOR_AMD)
+		return amd_mce_severity(m, ctx);
+
 	for (s = severities;; s++) {
 		if ((m->status & s->mask) != s->result)
 			continue;
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 17ad025..680cfb2 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -65,6 +65,7 @@ static DEFINE_MUTEX(mce_chrdev_read_mutex);
 DEFINE_PER_CPU(unsigned, mce_exception_count);
 
 struct mce_bank *mce_banks __read_mostly;
+struct mce_vendor_flags mce_flags __read_mostly;
 
 struct mca_config mca_cfg __read_mostly = {
 	.bootlog  = -1,
@@ -1532,6 +1533,13 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
 		 if (c->x86 == 6 && cfg->banks > 0)
 			mce_banks[0].ctl = 0;
 
+		/*
+		 * overflow_recov is supported for F15h Models 00h-0fh
+		 * even though we don't have cpuid bit for this
+		 */
+		if (c->x86 == 0x15 && c->x86_model <= 0xf)
+			mce_flags.overflow_recov = 1;
+
 		 /*
 		  * Turn off MC4_MISC thresholding banks on those models since
 		  * they're not supported there.
@@ -1637,6 +1645,7 @@ static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c)
 		break;
 	case X86_VENDOR_AMD:
 		mce_amd_feature_init(c);
+		mce_flags.overflow_recov = cpuid_ebx(0x80000007) & 0x1;
 		break;
 	default:
 		break;
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86, mce, severities: Add AMD severities function
  2015-03-16 17:16 [PATCH] x86, mce, severities: Add AMD severities function Aravind Gopalakrishnan
@ 2015-03-17  7:42 ` Ingo Molnar
  2015-03-17  9:04   ` Borislav Petkov
  2015-03-17 10:20 ` Borislav Petkov
  1 sibling, 1 reply; 16+ messages in thread
From: Ingo Molnar @ 2015-03-17  7:42 UTC (permalink / raw)
  To: Aravind Gopalakrishnan
  Cc: tglx, mingo, hpa, tony.luck, bp, slaoub, luto, x86, linux-kernel,
	linux-edac


* Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> wrote:

> Add a severities function that caters to AMD processors.
> This allows us to do some vendor specific work within the
> function if necessary.
> 
> Also, introduce a vendor flag bitfield which contains vendor
> specific flags. The severities code uses this to define error
> scope based on the prescence of the flags field.

So it would be nice to see actual use of this new facility, within the 
same patch series?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86, mce, severities: Add AMD severities function
  2015-03-17  7:42 ` Ingo Molnar
@ 2015-03-17  9:04   ` Borislav Petkov
  2015-03-17 10:11     ` Ingo Molnar
  0 siblings, 1 reply; 16+ messages in thread
From: Borislav Petkov @ 2015-03-17  9:04 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Aravind Gopalakrishnan, tglx, mingo, hpa, tony.luck, slaoub,
	luto, x86, linux-kernel, linux-edac

On Tue, Mar 17, 2015 at 08:42:14AM +0100, Ingo Molnar wrote:
> So it would be nice to see actual use of this new facility, within the
> same patch series?

All AMD hardware will use this. It is basically a much cleaner, nicer
and *actually* *readable* severities grading function replacing the
default one on AMD.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86, mce, severities: Add AMD severities function
  2015-03-17  9:04   ` Borislav Petkov
@ 2015-03-17 10:11     ` Ingo Molnar
  0 siblings, 0 replies; 16+ messages in thread
From: Ingo Molnar @ 2015-03-17 10:11 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Aravind Gopalakrishnan, tglx, mingo, hpa, tony.luck, slaoub,
	luto, x86, linux-kernel, linux-edac


* Borislav Petkov <bp@alien8.de> wrote:

> On Tue, Mar 17, 2015 at 08:42:14AM +0100, Ingo Molnar wrote:
> > So it would be nice to see actual use of this new facility, within the
> > same patch series?
> 
> All AMD hardware will use this. It is basically a much cleaner, 
> nicer and *actually* *readable* severities grading function 
> replacing the default one on AMD.

Ok, fine with me!

	Ingo

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86, mce, severities: Add AMD severities function
  2015-03-16 17:16 [PATCH] x86, mce, severities: Add AMD severities function Aravind Gopalakrishnan
  2015-03-17  7:42 ` Ingo Molnar
@ 2015-03-17 10:20 ` Borislav Petkov
  2015-03-17 18:41   ` Aravind Gopalakrishnan
  1 sibling, 1 reply; 16+ messages in thread
From: Borislav Petkov @ 2015-03-17 10:20 UTC (permalink / raw)
  To: Aravind Gopalakrishnan
  Cc: tglx, mingo, hpa, tony.luck, slaoub, luto, x86, linux-kernel, linux-edac

On Mon, Mar 16, 2015 at 12:16:04PM -0500, Aravind Gopalakrishnan wrote:
> +/* keeping amd_mce_severity in sync with AMD error scope heirarchy table */
> +static int amd_mce_severity(struct mce *m, enum context ctx)
> +{
> +	/* Processor Context Corrupt, no need to fumble too much, die! */
> +	if (m->status & MCI_STATUS_PCC)
> +		return MCE_PANIC_SEVERITY;
> +
> +	if (m->status & MCI_STATUS_UC) {
> +		/*
> +		 * On older systems, where overflow_recov flag is not
> +		 * present, we should simply PANIC if Overflow occurs.
> +		 * If overflow_recov flag set, then SW can try
> +		 * to at least kill process to salvage systen operation.
> +		 */
> +
> +		/* at least one error was not logged */
> +		if (m->status & MCI_STATUS_OVER && !mce_flags.overflow_recov)
> +				return MCE_PANIC_SEVERITY;
> +
> +		/* software can try to contain */
> +		if (!(m->mcgstatus & MCG_STATUS_RIPV) &&
> +		      mce_flags.overflow_recov) {
> +			if (ctx == IN_KERNEL)
> +				return MCE_PANIC_SEVERITY;

we're testing mce_flags.overflow_recov twice here, perhaps do instead:

	/*
	 * < Comment about overflow recovery bit>
	 */
	if (mce_flags.overflow_recov) {
		if (!(m->mcgstatus & MCG_STATUS_RIPV) && (ctx == IN_KERNEL))
			return MCE_PANIC_SEVERITY;
	} else {
		if (m->status & MCI_STATUS_OVER)
			return MCE_PANIC_SEVERITY;
	}

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86, mce, severities: Add AMD severities function
  2015-03-17 10:20 ` Borislav Petkov
@ 2015-03-17 18:41   ` Aravind Gopalakrishnan
  2015-03-17 18:44     ` Borislav Petkov
  0 siblings, 1 reply; 16+ messages in thread
From: Aravind Gopalakrishnan @ 2015-03-17 18:41 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: tglx, mingo, hpa, tony.luck, slaoub, luto, x86, linux-kernel, linux-edac

On 3/17/2015 5:20 AM, Borislav Petkov wrote:
> On Mon, Mar 16, 2015 at 12:16:04PM -0500, Aravind Gopalakrishnan wrote:
>> +/* keeping amd_mce_severity in sync with AMD error scope heirarchy table */
>> +static int amd_mce_severity(struct mce *m, enum context ctx)
>> +{
>> +	/* Processor Context Corrupt, no need to fumble too much, die! */
>> +	if (m->status & MCI_STATUS_PCC)
>> +		return MCE_PANIC_SEVERITY;
>> +
>> +	if (m->status & MCI_STATUS_UC) {
>> +		/*
>> +		 * On older systems, where overflow_recov flag is not
>> +		 * present, we should simply PANIC if Overflow occurs.
>> +		 * If overflow_recov flag set, then SW can try
>> +		 * to at least kill process to salvage systen operation.
>> +		 */
>> +
>> +		/* at least one error was not logged */
>> +		if (m->status & MCI_STATUS_OVER && !mce_flags.overflow_recov)
>> +				return MCE_PANIC_SEVERITY;
>> +
>> +		/* software can try to contain */
>> +		if (!(m->mcgstatus & MCG_STATUS_RIPV) &&
>> +		      mce_flags.overflow_recov) {
>> +			if (ctx == IN_KERNEL)
>> +				return MCE_PANIC_SEVERITY;
> we're testing mce_flags.overflow_recov twice here, perhaps do instead:
>
> 	/*
> 	 * < Comment about overflow recovery bit>
> 	 */
> 	if (mce_flags.overflow_recov) {
> 		if (!(m->mcgstatus & MCG_STATUS_RIPV) && (ctx == IN_KERNEL))
> 			return MCE_PANIC_SEVERITY;

return MCE_AR_SEVERITY if ctx == IN_USER also needs to be within this 
block here.
Will do that and resend.

> 	} else {
> 		if (m->status & MCI_STATUS_OVER)
> 			return MCE_PANIC_SEVERITY;
> 	}
>

Thanks,
-Aravind

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86, mce, severities: Add AMD severities function
  2015-03-17 18:41   ` Aravind Gopalakrishnan
@ 2015-03-17 18:44     ` Borislav Petkov
  2015-03-19  0:01       ` Luck, Tony
  0 siblings, 1 reply; 16+ messages in thread
From: Borislav Petkov @ 2015-03-17 18:44 UTC (permalink / raw)
  To: Aravind Gopalakrishnan
  Cc: tglx, mingo, hpa, tony.luck, slaoub, luto, x86, linux-kernel, linux-edac

On Tue, Mar 17, 2015 at 01:41:46PM -0500, Aravind Gopalakrishnan wrote:
> return MCE_AR_SEVERITY if ctx == IN_USER also needs to be within this
> block here. Will do that and resend.

Right, but please wait a couple of days for people to take a look first.

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH] x86, mce, severities: Add AMD severities function
  2015-03-17 18:44     ` Borislav Petkov
@ 2015-03-19  0:01       ` Luck, Tony
  2015-03-19  9:29         ` Borislav Petkov
  0 siblings, 1 reply; 16+ messages in thread
From: Luck, Tony @ 2015-03-19  0:01 UTC (permalink / raw)
  To: Borislav Petkov, Aravind Gopalakrishnan
  Cc: tglx, mingo, hpa, slaoub, luto, x86, linux-kernel, linux-edac

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 497 bytes --]

One other thought.  Instead of the run-time test to see if this is an AMD processor on every call
to this function, would it be cleaner to:

1) Rename existing mce_severity() function to mce_severity_intel()
2) Declare a function pointer named mce_severity.
3) Assign that pointer to the _intel() or _amd() function in mce_init()

-Tony
ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86, mce, severities: Add AMD severities function
  2015-03-19  0:01       ` Luck, Tony
@ 2015-03-19  9:29         ` Borislav Petkov
  2015-03-19 14:41           ` Aravind Gopalakrishnan
  0 siblings, 1 reply; 16+ messages in thread
From: Borislav Petkov @ 2015-03-19  9:29 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Aravind Gopalakrishnan, tglx, mingo, hpa, slaoub, luto, x86,
	linux-kernel, linux-edac

On Thu, Mar 19, 2015 at 12:01:32AM +0000, Luck, Tony wrote:
> One other thought.  Instead of the run-time test to see if this is an AMD processor on every call
> to this function, would it be cleaner to:
> 
> 1) Rename existing mce_severity() function to mce_severity_intel()
> 2) Declare a function pointer named mce_severity.
> 3) Assign that pointer to the _intel() or _amd() function in mce_init()

Yes, of course. We do that (or at least pretty close) in other paths
too.

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86, mce, severities: Add AMD severities function
  2015-03-19  9:29         ` Borislav Petkov
@ 2015-03-19 14:41           ` Aravind Gopalakrishnan
  2015-03-19 15:53             ` Borislav Petkov
  0 siblings, 1 reply; 16+ messages in thread
From: Aravind Gopalakrishnan @ 2015-03-19 14:41 UTC (permalink / raw)
  To: Borislav Petkov, Luck, Tony
  Cc: tglx, mingo, hpa, slaoub, luto, x86, linux-kernel, linux-edac

On 3/19/2015 4:29 AM, Borislav Petkov wrote:
> On Thu, Mar 19, 2015 at 12:01:32AM +0000, Luck, Tony wrote:
>> One other thought.  Instead of the run-time test to see if this is an AMD processor on every call
>> to this function, would it be cleaner to:
>>
>> 1) Rename existing mce_severity() function to mce_severity_intel()
>> 2) Declare a function pointer named mce_severity.
>> 3) Assign that pointer to the _intel() or _amd() function in mce_init()
> Yes, of course. We do that (or at least pretty close) in other paths
> too.
>
>

Ok, I'll work on this and prepend the changes to the current version of 
the patch.
Would you prefer the changes be in a separate patch or lump it in along 
with current version?

Thanks,
-Aravind.





^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86, mce, severities: Add AMD severities function
  2015-03-19 14:41           ` Aravind Gopalakrishnan
@ 2015-03-19 15:53             ` Borislav Petkov
  2015-03-19 16:20               ` Aravind Gopalakrishnan
  0 siblings, 1 reply; 16+ messages in thread
From: Borislav Petkov @ 2015-03-19 15:53 UTC (permalink / raw)
  To: Aravind Gopalakrishnan
  Cc: Luck, Tony, tglx, mingo, hpa, slaoub, luto, x86, linux-kernel,
	linux-edac

On Thu, Mar 19, 2015 at 09:41:01AM -0500, Aravind Gopalakrishnan wrote:
> Ok, I'll work on this and prepend the changes to the current version
> of the patch. Would you prefer the changes be in a separate patch or
> lump it in along with current version?

It would be best if what Tony suggested comes ontop of your patch with
his Suggested-by: tag. This ordering should be also the easiest wrt
functionality and bisectability.

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86, mce, severities: Add AMD severities function
  2015-03-19 15:53             ` Borislav Petkov
@ 2015-03-19 16:20               ` Aravind Gopalakrishnan
  2015-03-19 17:15                 ` Luck, Tony
  0 siblings, 1 reply; 16+ messages in thread
From: Aravind Gopalakrishnan @ 2015-03-19 16:20 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Luck, Tony, tglx, mingo, hpa, slaoub, luto, x86, linux-kernel,
	linux-edac

On 3/19/2015 10:53 AM, Borislav Petkov wrote:
> On Thu, Mar 19, 2015 at 09:41:01AM -0500, Aravind Gopalakrishnan wrote:
>> Ok, I'll work on this and prepend the changes to the current version
>> of the patch. Would you prefer the changes be in a separate patch or
>> lump it in along with current version?
> It would be best if what Tony suggested comes ontop of your patch with
> his Suggested-by: tag. This ordering should be also the easiest wrt
> functionality and bisectability.
>
>

Ok, I'll have it ready and send out a V2 by tomorrow if there are no 
other comments/reviews.

Thanks,
-Aravind.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH] x86, mce, severities: Add AMD severities function
  2015-03-19 16:20               ` Aravind Gopalakrishnan
@ 2015-03-19 17:15                 ` Luck, Tony
  2015-03-20 15:59                   ` Aravind Gopalakrishnan
  0 siblings, 1 reply; 16+ messages in thread
From: Luck, Tony @ 2015-03-19 17:15 UTC (permalink / raw)
  To: Aravind Gopalakrishnan, Borislav Petkov
  Cc: tglx, mingo, hpa, slaoub, luto, x86, linux-kernel, linux-edac

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 684 bytes --]

>> It would be best if what Tony suggested comes ontop of your patch with
>> his Suggested-by: tag. This ordering should be also the easiest wrt
>> functionality and bisectability.
>>
>>
>
> Ok, I'll have it ready and send out a V2 by tomorrow if there are no 
> other comments/reviews.

Fame & glory ... whatever. Just be sure to make the _intel() and _amd()
severity functions "static" ... so you'll need something like this in mce-severity.c

void mce_vendor_severity_init(void)
{
	.... assign mce_severity here
}

-Tony
ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86, mce, severities: Add AMD severities function
  2015-03-19 17:15                 ` Luck, Tony
@ 2015-03-20 15:59                   ` Aravind Gopalakrishnan
  2015-03-20 16:03                     ` Borislav Petkov
  0 siblings, 1 reply; 16+ messages in thread
From: Aravind Gopalakrishnan @ 2015-03-20 15:59 UTC (permalink / raw)
  To: Luck, Tony, Borislav Petkov
  Cc: tglx, mingo, hpa, slaoub, luto, x86, linux-kernel, linux-edac

On 3/19/2015 12:15 PM, Luck, Tony wrote:
>>> It would be best if what Tony suggested comes ontop of your patch with
>>> his Suggested-by: tag. This ordering should be also the easiest wrt
>>> functionality and bisectability.
>>>
>>>
>> Ok, I'll have it ready and send out a V2 by tomorrow if there are no
>> other comments/reviews.
> Fame & glory ... whatever. Just be sure to make the _intel() and _amd()
> severity functions "static" ... so you'll need something like this in mce-severity.c
>
> void mce_vendor_severity_init(void)
> {
> 	.... assign mce_severity here
> }
>
>

Ok, Here's how I have it currently:
void __init mcheck_vendor_init_severity(void)
{
         struct cpuinfo_x86 *c = &boot_cpu_data;

         switch (c->x86_vendor) {
         case X86_VENDOR_INTEL:
                 mce_severity = mce_severity_intel;
                 break;
         case X86_VENDOR_AMD:
                 mce_severity = mce_severity_amd;
                 break;
         default:
                 break;
         }
}
And I call this from mcheck_init().
I tested the above bits on AMD and the severities grading works fine.

Should we also come up with a '_default' function to assign to 
mce_severity function pointer?
Something like-
static int mce_severity_default(struct mce *m, int tolerant,
                                 char **msg, bool is_excp)
{
         pr_err("CPU#%d: No vendor specific severities grader assigned.
                 Implementing default grader\n", smp_processor_id());

         if (m->status & MCI_STATUS_PCC || m->status & MCI_STATUS_OVER)
                 return MCE_PANIC_SEVERITY;

         if (m->status & MCI_STATUS_UC)
                 return MCE_UC_SEVERITY;

         return MCE_KEEP_SEVERITY;
}

int (*mce_severity) (struct mce *m, int tolerant, char **msg, bool 
is_excp) =
                     mce_severity_default;

How much of grading should '_default' do if at all?

Thanks,
-Aravind.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86, mce, severities: Add AMD severities function
  2015-03-20 15:59                   ` Aravind Gopalakrishnan
@ 2015-03-20 16:03                     ` Borislav Petkov
  2015-03-20 17:49                       ` Luck, Tony
  0 siblings, 1 reply; 16+ messages in thread
From: Borislav Petkov @ 2015-03-20 16:03 UTC (permalink / raw)
  To: Aravind Gopalakrishnan
  Cc: Luck, Tony, tglx, mingo, hpa, slaoub, luto, x86, linux-kernel,
	linux-edac

On Fri, Mar 20, 2015 at 10:59:33AM -0500, Aravind Gopalakrishnan wrote:
> Ok, Here's how I have it currently:
> void __init mcheck_vendor_init_severity(void)
> {
>         struct cpuinfo_x86 *c = &boot_cpu_data;
> 
>         switch (c->x86_vendor) {
>         case X86_VENDOR_INTEL:
>                 mce_severity = mce_severity_intel;
>                 break;
>         case X86_VENDOR_AMD:
>                 mce_severity = mce_severity_amd;
>                 break;
>         default:
>                 break;
>         }
> }
> And I call this from mcheck_init().
> I tested the above bits on AMD and the severities grading works fine.
> 
> Should we also come up with a '_default' function to assign to mce_severity
> function pointer?

I think that should be

	default:
		WARN_ONCE("WTF?!");
		break;

above.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH] x86, mce, severities: Add AMD severities function
  2015-03-20 16:03                     ` Borislav Petkov
@ 2015-03-20 17:49                       ` Luck, Tony
  0 siblings, 0 replies; 16+ messages in thread
From: Luck, Tony @ 2015-03-20 17:49 UTC (permalink / raw)
  To: Borislav Petkov, Aravind Gopalakrishnan
  Cc: tglx, mingo, hpa, slaoub, luto, x86, linux-kernel, linux-edac

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 871 bytes --]

>> And I call this from mcheck_init().
>> I tested the above bits on AMD and the severities grading works fine.
>> 
>> Should we also come up with a '_default' function to assign to mce_severity
>> function pointer?
>
>I think that should be
>
>	default:
>		WARN_ONCE("WTF?!");
>		break;

mcheck_init() is called unconditionally from setup_arch().  So if anyone is
still using a Cyrix, Transmeta or other non-Intel, non-Amd processor we'd trip this WARN_ON.

I think you can have a default severity function that just does:

	return MCE_PANIC_SEVERITY;

just to avoid the unpleasant thought that we might jump through a NULL pointer
if we did somehow end up in do_machine_check() on another vendors cpu.

-Tony
ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2015-03-20 17:50 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-16 17:16 [PATCH] x86, mce, severities: Add AMD severities function Aravind Gopalakrishnan
2015-03-17  7:42 ` Ingo Molnar
2015-03-17  9:04   ` Borislav Petkov
2015-03-17 10:11     ` Ingo Molnar
2015-03-17 10:20 ` Borislav Petkov
2015-03-17 18:41   ` Aravind Gopalakrishnan
2015-03-17 18:44     ` Borislav Petkov
2015-03-19  0:01       ` Luck, Tony
2015-03-19  9:29         ` Borislav Petkov
2015-03-19 14:41           ` Aravind Gopalakrishnan
2015-03-19 15:53             ` Borislav Petkov
2015-03-19 16:20               ` Aravind Gopalakrishnan
2015-03-19 17:15                 ` Luck, Tony
2015-03-20 15:59                   ` Aravind Gopalakrishnan
2015-03-20 16:03                     ` Borislav Petkov
2015-03-20 17:49                       ` Luck, Tony

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.