* [GIT PULL] MCE recovery changes
From: Luck, Tony @ 2012-01-24 23:06 UTC
To: Ingo Molnar; +Cc: linux-kernel, Borislav Petkov
Ingo,
Time to move these from the "ras" tree to "tip" so they will be nicely
seasoned for the 3.4 merge window.
I tried to follow the instructions in
http://git-blame.blogspot.com/2012/01/using-signed-tag-in-pull-requests.html
to use the fancy new signed tag scheme. If something is wrong here, then it
is most likely that I typoed (or thinkoed) while following them.
-Tony
The following changes since commit dc47ce90c3a822cd7c9e9339fe4d5f61dcb26b50:
Linux 3.2-rc5 (2011-12-09 15:09:32 -0800)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git mce-recovery-for-tip
for you to fetch changes up to 5f7b88d51e89771f64c15903b96b5933dd0bc6d8:
x86/mce: Recognise machine check bank signature for data path error (2012-01-03 12:07:07 -0800)
----------------------------------------------------------------
MCE recovery (data path only)
----------------------------------------------------------------
Tony Luck (6):
      HWPOISON: Clean up memory_failure() vs. __memory_failure()
      HWPOISON: Add code to handle "action required" errors.
      x86/mce: Create helper function to save addr/misc when needed
      x86/mce: Add mechanism to safely save information in MCE handler
      x86/mce: Handle "action required" errors
      x86/mce: Recognise machine check bank signature for data path error

 arch/x86/kernel/cpu/mcheck/mce-severity.c |   16 +++-
 arch/x86/kernel/cpu/mcheck/mce.c          |  179 ++++++++++++++++++++---------
 drivers/base/memory.c                     |    2 +-
 include/linux/mm.h                        |    4 +-
 mm/hwpoison-inject.c                      |    4 +-
 mm/madvise.c                              |    2 +-
 mm/memory-failure.c                       |   96 ++++++++--------
 7 files changed, 197 insertions(+), 106 deletions(-)
* Re: [GIT PULL] MCE recovery changes
From: Ingo Molnar @ 2012-01-26 10:46 UTC
To: Luck, Tony; +Cc: linux-kernel, Borislav Petkov, Thomas Gleixner, H. Peter Anvin
* Luck, Tony <tony.luck@intel.com> wrote:
> Ingo,
>
> Time to move these from the "ras" tree to "tip" so they will be nicely
> seasoned for the 3.4 merge window.
>
> I tried to follow the instructions in
> http://git-blame.blogspot.com/2012/01/using-signed-tag-in-pull-requests.html
> to use the fancy new signed tag scheme. If something is wrong here, then it
> is most likely that I typoed (or thinkoed) while following them.
It worked perfectly.
>
> -Tony
>
> The following changes since commit dc47ce90c3a822cd7c9e9339fe4d5f61dcb26b50:
>
> Linux 3.2-rc5 (2011-12-09 15:09:32 -0800)
>
> are available in the git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git mce-recovery-for-tip
>
> for you to fetch changes up to 5f7b88d51e89771f64c15903b96b5933dd0bc6d8:
>
> x86/mce: Recognise machine check bank signature for data path error (2012-01-03 12:07:07 -0800)
>
> ----------------------------------------------------------------
> MCE recovery (data path only)
>
> ----------------------------------------------------------------
> Tony Luck (6):
>       HWPOISON: Clean up memory_failure() vs. __memory_failure()
>       HWPOISON: Add code to handle "action required" errors.
>       x86/mce: Create helper function to save addr/misc when needed
>       x86/mce: Add mechanism to safely save information in MCE handler
>       x86/mce: Handle "action required" errors
>       x86/mce: Recognise machine check bank signature for data path error
>
>  arch/x86/kernel/cpu/mcheck/mce-severity.c |   16 +++-
>  arch/x86/kernel/cpu/mcheck/mce.c          |  179 ++++++++++++++++++++---------
>  drivers/base/memory.c                     |    2 +-
>  include/linux/mm.h                        |    4 +-
>  mm/hwpoison-inject.c                      |    4 +-
>  mm/madvise.c                              |    2 +-
>  mm/memory-failure.c                       |   96 ++++++++--------
>  7 files changed, 197 insertions(+), 106 deletions(-)
Pulled, thanks!
One thing I noticed was the magic constant 0x134:
+ SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|0x0134),
Don't we want that defined a bit more clearly?
Thanks,
Ingo
* Re: [GIT PULL] MCE recovery changes
From: Tony Luck @ 2012-01-26 17:29 UTC
To: Ingo Molnar
Cc: linux-kernel, Borislav Petkov, Thomas Gleixner, H. Peter Anvin
On Thu, Jan 26, 2012 at 2:46 AM, Ingo Molnar <mingo@elte.hu> wrote:
> It worked perfectly.
Hurrah!
> Pulled, thanks!
Thank you.
> One thing I noticed was the magic constant 0x134:
>
> + SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|0x0134),
>
> Don't we want that defined a bit more clearly?
Stylistically it is compatible with the existing:
MASK(MCI_STATUS_OVER|MCI_UC_SAR|0xfff0, MCI_UC_S|0x00c0)
and
MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCACOD, MCI_UC_S|0x017a)
... but that's just a sign that they need some love too :-)
I'll see what I can do - but meaningful names will clearly be longer than
the hex constants that they replace - and I'm already pushing line length
limits here, so it will need more than a trivial restructure.
-Tony
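
For readers unfamiliar with the severity table, each MASK(mask, result) pair discussed above can be read as a test of the form (status & mask) == result against a bank's MCi_STATUS value. The following is a minimal sketch in C under that assumption, not the kernel's actual code. The MCI_STATUS_* bit positions follow the kernel's asm/mce.h; status_matches(), is_data_load_ar(), DATA_LOAD_MASK and DATA_LOAD_RESULT are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* MCi_STATUS bits, as defined in the kernel's asm/mce.h. */
#define MCI_STATUS_OVER  (1ULL << 62)  /* previous errors lost */
#define MCI_STATUS_UC    (1ULL << 61)  /* uncorrected error */
#define MCI_STATUS_MISCV (1ULL << 59)  /* MISC register valid */
#define MCI_STATUS_ADDRV (1ULL << 58)  /* ADDR register valid */
#define MCI_STATUS_S     (1ULL << 56)  /* signaled machine check */
#define MCI_STATUS_AR    (1ULL << 55)  /* action required */

/* Shorthands copied from mce-severity.c as quoted in this thread. */
#define MCI_UC_SAR (MCI_STATUS_UC | MCI_STATUS_S | MCI_STATUS_AR)
#define MCI_ADDR   (MCI_STATUS_ADDRV | MCI_STATUS_MISCV)
#define MCACOD     0xffffULL

/* Hypothetical names for the two halves of the "data load" rule. */
#define DATA_LOAD_MASK   (MCI_STATUS_OVER | MCI_UC_SAR | MCI_ADDR | MCACOD)
#define DATA_LOAD_RESULT (MCI_UC_SAR | MCI_ADDR | 0x0134ULL)

/* A rule whose result has bits outside its mask could never match. */
static_assert((DATA_LOAD_RESULT & ~DATA_LOAD_MASK) == 0,
              "result bits must lie inside the mask");

/* Hypothetical helper: the bits selected by mask must equal result. */
static bool status_matches(uint64_t status, uint64_t mask, uint64_t result)
{
    return (status & mask) == result;
}

/* Hypothetical wrapper for the "Action required: data load error" rule. */
static bool is_data_load_ar(uint64_t status)
{
    return status_matches(status, DATA_LOAD_MASK, DATA_LOAD_RESULT);
}
```

Under these assumptions, a status word with UC, S, AR, ADDRV and MISCV set and an MCACOD of 0x0134 matches the rule. The same word with the overflow bit set, or with an MCACOD of 0x0150, does not, because both the overflow bit and the full 16-bit MCACOD field are part of the mask.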
* Re: [GIT PULL] MCE recovery changes
From: Ingo Molnar @ 2012-01-26 18:28 UTC
To: Tony Luck; +Cc: linux-kernel, Borislav Petkov, Thomas Gleixner, H. Peter Anvin
* Tony Luck <tony.luck@intel.com> wrote:
> On Thu, Jan 26, 2012 at 2:46 AM, Ingo Molnar <mingo@elte.hu> wrote:
> > It worked perfectly.
>
> Hurrah!
>
> > Pulled, thanks!
>
> Thank you.
>
> > One thing I noticed was the magic constant 0x134:
> >
> > + SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|0x0134),
> >
> > Don't we want that defined a bit more clearly?
>
> Stylistically it is compatible with the existing:
> MASK(MCI_STATUS_OVER|MCI_UC_SAR|0xfff0, MCI_UC_S|0x00c0)
> and
> MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCACOD, MCI_UC_S|0x017a)
>
> ... but that's just a sign that they need some love too :-)
>
> I'll see what I can do - but meaningful names will clearly be
> longer than the hex constants that they replace - and I'm
> already pushing line length limits here, so it will need more
> than a trivial restructure.
Well, one option is to let the line grow - for such things it's
ok up to 100 cols or so.
Thanks,
Ingo
* [PATCH] x86/mce: Replace hard coded hex constants with symbolic defines
From: Tony Luck @ 2012-01-27 0:02 UTC
To: Ingo Molnar
Cc: linux-kernel, Borislav Petkov, Thomas Gleixner, H. Peter Anvin
Magic constants like 0x0134 in code just invite questions: where do they
come from, what do they mean, and can they be changed?
Provide #defines for the architecturally defined MCACOD values
with a reference to the Intel Software Developer's Manual, which
describes them.
Signed-off-by: Tony Luck <tony.luck@intel.com>
---
Ingo: You said "ok up to 100 cols or so" ... this goes to 103.
If this is OK, you can either apply it from here on top of the
x86/mce branch in tip - or I can push it to the ras tree and
send you another pull.
 arch/x86/kernel/cpu/mcheck/mce-severity.c |   14 ++++++++++----
 1 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity.c
index f6c92f9..0c82091 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
@@ -56,6 +56,12 @@ static struct severity {
 #define MCI_UC_SAR (MCI_STATUS_UC|MCI_STATUS_S|MCI_STATUS_AR)
 #define MCI_ADDR (MCI_STATUS_ADDRV|MCI_STATUS_MISCV)
 #define MCACOD 0xffff
+/* Architecturally defined codes from SDM Vol. 3B Chapter 15 */
+#define MCACOD_SCRUB 0x00C0 /* 0xC0-0xCF Memory Scrubbing */
+#define MCACOD_SCRUBMSK 0xfff0
+#define MCACOD_L3WB 0x017A /* L3 Explicit Writeback */
+#define MCACOD_DATA 0x0134 /* Data Load */
+#define MCACOD_INSTR 0x0150 /* Instruction Fetch */
 
 	MCESEV(
 		NO, "Invalid",
@@ -112,12 +118,12 @@ static struct severity {
 #ifdef CONFIG_MEMORY_FAILURE
 	MCESEV(
 		KEEP, "HT thread notices Action required: data load error",
-		SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|0x0134),
+		SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|MCACOD_DATA),
 		MCGMASK(MCG_STATUS_EIPV, 0)
 		),
 	MCESEV(
 		AR, "Action required: data load error",
-		SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|0x0134),
+		SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|MCACOD_DATA),
 		USER
 		),
 #endif
@@ -129,11 +135,11 @@ static struct severity {
 	/* known AO MCACODs: */
 	MCESEV(
 		AO, "Action optional: memory scrubbing error",
-		SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|0xfff0, MCI_UC_S|0x00c0)
+		SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCACOD_SCRUBMSK, MCI_UC_S|MCACOD_SCRUB)
 		),
 	MCESEV(
 		AO, "Action optional: last level cache writeback error",
-		SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCACOD, MCI_UC_S|0x017a)
+		SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCACOD, MCI_UC_S|MCACOD_L3WB)
 		),
 	MCESEV(
 		SOME, "Action optional: unknown MCACOD",
--
1.7.9.rc2.1.g69204
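
As a reading aid for the new defines, the sketch below illustrates why MCACOD_SCRUB comes with a companion MCACOD_SCRUBMSK while the other codes are compared against the full 16-bit MCACOD field: memory scrubbing covers the range 0xC0-0xCF, so the low four bits have to be ignored. The #define values are copied from the patch above; is_scrub_code() and count_scrub_codes() are hypothetical helpers that exist only in this example, and the static assertions merely confirm that switching the hex digits to upper case leaves the values unchanged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the patch above. */
#define MCACOD_SCRUB    0x00C0  /* 0xC0-0xCF Memory Scrubbing */
#define MCACOD_SCRUBMSK 0xfff0
#define MCACOD_L3WB     0x017A  /* L3 Explicit Writeback */
#define MCACOD_DATA     0x0134  /* Data Load */
#define MCACOD_INSTR    0x0150  /* Instruction Fetch */

/* The replaced literals were lower case; the values are identical. */
static_assert(MCACOD_SCRUB == 0x00c0, "same value as the old literal");
static_assert(MCACOD_L3WB == 0x017a, "same value as the old literal");

/* Hypothetical helper: true for any code in the scrubbing range. */
static bool is_scrub_code(uint16_t mcacod)
{
    return (mcacod & MCACOD_SCRUBMSK) == MCACOD_SCRUB;
}

/* Hypothetical helper: count how many 16-bit codes the mask accepts. */
static unsigned count_scrub_codes(void)
{
    unsigned n = 0;

    /* A 32-bit counter is used so the loop condition can become false. */
    for (uint32_t code = 0; code <= 0xffff; code++)
        if (is_scrub_code((uint16_t)code))
            n++;
    return n;
}
```

With this mask exactly the sixteen codes 0x00C0 through 0x00CF are accepted, and none of the other codes named in the patch fall into that range.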
* Re: [PATCH] x86/mce: Replace hard coded hex constants with symbolic defines
From: Ingo Molnar @ 2012-01-27 10:49 UTC
To: Tony Luck; +Cc: linux-kernel, Borislav Petkov, Thomas Gleixner, H. Peter Anvin
* Tony Luck <tony.luck@intel.com> wrote:
> Magic constants like 0x0134 in code just invite questions: where do they
> come from, what do they mean, and can they be changed?
>
> Provide #defines for the architecturally defined MCACOD values
> with a reference to the Intel Software Developer's Manual, which
> describes them.
>
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> ---
>
> Ingo: You said "ok up to 100 cols or so" ... this goes to 103.
> If this is OK, you can either apply it from here on top of the
> x86/mce branch in tip - or I can push it to the ras tree and
> send you another pull.
Yeah, looks good to me.
Thanks,
Ingo