* [PATCH 0/6] MCE, AMD: Update MCE decoding code
@ 2012-02-07 14:26 Borislav Petkov
2012-02-07 14:26 ` [PATCH 1/6] MCE, AMD: Correct some MC0 error types Borislav Petkov
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: Borislav Petkov @ 2012-02-07 14:26 UTC (permalink / raw)
To: EDAC devel; +Cc: LKML, Borislav Petkov
From: Borislav Petkov <borislav.petkov@amd.com>
Hi,
Nothing earth-shattering here, just a bunch of minor updates for the next
round.
--
Regards/Gruss,
Boris.
Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551
* [PATCH 1/6] MCE, AMD: Correct some MC0 error types
2012-02-07 14:26 [PATCH 0/6] MCE, AMD: Update MCE decoding code Borislav Petkov
@ 2012-02-07 14:26 ` Borislav Petkov
2012-02-07 14:26 ` [PATCH 2/6] MCE, AMD: Correct ucode patch buffer description Borislav Petkov
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Borislav Petkov @ 2012-02-07 14:26 UTC (permalink / raw)
To: EDAC devel; +Cc: LKML, Borislav Petkov
From: Borislav Petkov <borislav.petkov@amd.com>
Use "System Read Data Error" as a more general name for MC0 bus errors
on F15h and update some error definitions.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Reviewed-by: Andreas Herrmann <andreas.herrmann3@amd.com>
---
drivers/edac/mce_amd.c | 5 ++---
1 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index bd926ea..0ee1c0a 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -255,10 +255,9 @@ static bool f15h_dc_mce(u16 ec, u8 xec)
} else if (BUS_ERROR(ec)) {
if (!xec)
- pr_cont("during system linefill.\n");
+ pr_cont("System Read Data Error.\n");
else
- pr_cont(" Internal %s condition.\n",
- ((xec == 1) ? "livelock" : "deadlock"));
+ pr_cont(" Internal error condition type %d.\n", xec);
} else
ret = false;
--
1.7.8.rc0
* [PATCH 2/6] MCE, AMD: Correct ucode patch buffer description
2012-02-07 14:26 [PATCH 0/6] MCE, AMD: Update MCE decoding code Borislav Petkov
2012-02-07 14:26 ` [PATCH 1/6] MCE, AMD: Correct some MC0 error types Borislav Petkov
@ 2012-02-07 14:26 ` Borislav Petkov
2012-02-07 14:26 ` [PATCH 3/6] MCE, AMD: Correct VB data error description Borislav Petkov
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Borislav Petkov @ 2012-02-07 14:26 UTC (permalink / raw)
To: EDAC devel; +Cc: LKML, Borislav Petkov
From: Borislav Petkov <borislav.petkov@amd.com>
This MC1 error signature has a different name now; update its description
and print it without the "Decoder ... parity error" wrapper.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Reviewed-by: Andreas Herrmann <andreas.herrmann3@amd.com>
---
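For illustration only, a minimal userspace sketch of the case split in the
second hunk below. Before the change, xec 0x10 went through the same format
string as 0x11..0x14 and so printed "Decoder patch RAM parity error."; after
it, 0x10 prints the bare description. The table `ic_desc_excerpt[]` and the
helper `ic_format()` are hypothetical names, the table holds only the four
entries visible in the diff context, and `snprintf()` stands in for
`pr_cont()`.

```c
#include <stdio.h>
#include <stdint.h>

/* Excerpt only: the entries for xec 0x10..0x13 visible in the diff context. */
static const char * const ic_desc_excerpt[] = {
	"Microcode Patch Buffer",	/* xec = 0x10 */
	"uop queue",
	"insn buffer",
	"predecode buffer"
};

/*
 * Hypothetical helper, not part of the patch: format the message for xec
 * into buf. Returns 0 on success, -1 if xec is outside the excerpt.
 */
static int ic_format(uint8_t xec, char *buf, size_t len)
{
	const char *desc;

	if (xec < 0x10 || xec > 0x13)
		return -1;

	desc = ic_desc_excerpt[xec - 0x10];

	if (xec == 0x10)
		/* the patch buffer is not a decoder array, print it as-is */
		snprintf(buf, len, "%s.\n", desc);
	else
		snprintf(buf, len, "Decoder %s parity error.\n", desc);

	return 0;
}
```

In the driver itself the index is `xec - 4` because the real table also
carries the lower-numbered signatures; the excerpt above starts at 0x10 only
to stay self-contained.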
drivers/edac/mce_amd.c | 8 ++++++--
1 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index 0ee1c0a..5626e17 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -88,7 +88,7 @@ static const char * const f15h_ic_mce_desc[] = {
"Parity error for IC probe tag valid bit",
"PFB non-cacheable bit parity error",
"PFB valid bit parity error", /* xec = 0xd */
- "patch RAM", /* xec = 010 */
+ "Microcode Patch Buffer", /* xec = 010 */
"uop queue",
"insn buffer",
"predecode buffer",
@@ -354,7 +354,11 @@ static bool f15h_ic_mce(u16 ec, u8 xec)
pr_cont("%s.\n", f15h_ic_mce_desc[xec-2]);
break;
- case 0x10 ... 0x14:
+ case 0x10:
+ pr_cont("%s.\n", f15h_ic_mce_desc[xec-4]);
+ break;
+
+ case 0x11 ... 0x14:
pr_cont("Decoder %s parity error.\n", f15h_ic_mce_desc[xec-4]);
break;
--
1.7.8.rc0
* [PATCH 3/6] MCE, AMD: Correct VB data error description
2012-02-07 14:26 [PATCH 0/6] MCE, AMD: Update MCE decoding code Borislav Petkov
2012-02-07 14:26 ` [PATCH 1/6] MCE, AMD: Correct some MC0 error types Borislav Petkov
2012-02-07 14:26 ` [PATCH 2/6] MCE, AMD: Correct ucode patch buffer description Borislav Petkov
@ 2012-02-07 14:26 ` Borislav Petkov
2012-02-07 14:26 ` [PATCH 4/6] MCE, AMD: Rework NB MCE signatures Borislav Petkov
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Borislav Petkov @ 2012-02-07 14:26 UTC (permalink / raw)
To: EDAC devel; +Cc: LKML, Borislav Petkov
From: Borislav Petkov <borislav.petkov@amd.com>
Sync with latest BKDG error types.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Reviewed-by: Andreas Herrmann <andreas.herrmann3@amd.com>
---
drivers/edac/mce_amd.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index 5626e17..bf6dd99 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -104,7 +104,7 @@ static const char * const f15h_cu_mce_desc[] = {
"WCC Tag ECC error",
"WCC Data ECC error",
"WCB Data parity error",
- "VB Data/ECC error",
+ "VB Data ECC or parity error",
"L2 Tag ECC error", /* xec = 0x10 */
"Hard L2 Tag ECC error",
"Multiple hits on L2 tag",
--
1.7.8.rc0
* [PATCH 4/6] MCE, AMD: Rework NB MCE signatures
2012-02-07 14:26 [PATCH 0/6] MCE, AMD: Update MCE decoding code Borislav Petkov
` (2 preceding siblings ...)
2012-02-07 14:26 ` [PATCH 3/6] MCE, AMD: Correct VB data error description Borislav Petkov
@ 2012-02-07 14:26 ` Borislav Petkov
2012-02-07 14:26 ` [PATCH 5/6] MCE, AMD: Correct bank 5 error signatures Borislav Petkov
2012-02-07 14:26 ` [PATCH 6/6] MCE, AMD: Constify error tables Borislav Petkov
5 siblings, 0 replies; 7+ messages in thread
From: Borislav Petkov @ 2012-02-07 14:26 UTC (permalink / raw)
To: EDAC devel; +Cc: LKML, Borislav Petkov
From: Borislav Petkov <borislav.petkov@amd.com>
Correct the wording of the NB MCE signatures and replace the per-family
decoding functions with a single, unified lookup table.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Reviewed-by: Andreas Herrmann <andreas.herrmann3@amd.com>
---
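For illustration only, a minimal userspace sketch of the offset-based lookup
that the unified table relies on. The strings are copied from the
nb_mce_desc[] table in the diff below; `nb_desc_lookup()` is a hypothetical
helper that does not exist in the driver. The sketch assumes the table is
dense for xec 0x0..0xe and resumes at xec 0x1c, which is why the second range
uses an offset of 13 (0x1c - 15). The special cases handled separately in the
patch (0xf, 0x19, and the F11h check for DRAM ECC errors) are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Descriptions copied from the nb_mce_desc[] table added by this patch. */
static const char * const nb_mce_desc[] = {
	"DRAM ECC error detected on the NB",		/* xec = 0x0 */
	"CRC error detected on HT link",
	"Link-defined sync error packets detected on HT link",
	"HT Master abort",
	"HT Target abort",
	"Invalid GART PTE entry during GART table walk",
	"Unsupported atomic RMW received from an IO link",
	"Watchdog timeout due to lack of progress",
	"DRAM ECC error detected on the NB",		/* xec = 0x8 */
	"SVM DMA Exclusion Vector error",
	"HT data error detected on link",
	"Protocol error (link, L3, probe filter)",
	"NB internal arrays parity error",
	"DRAM addr/ctl signals parity error",
	"IO link transmission error",			/* xec = 0xe */
	"L3 data cache ECC error",			/* xec = 0x1c */
	"L3 cache tag error",
	"L3 LRU parity bits error",
	"ECC Error in the Probe Filter directory"	/* xec = 0x1f */
};

/*
 * Hypothetical helper, not part of the patch: map an extended error code
 * to its description, or return NULL when the code has no table entry.
 */
static const char *nb_desc_lookup(uint8_t xec)
{
	uint8_t offset;

	if (xec <= 0xe)
		offset = 0;		/* dense range, index == xec */
	else if (xec >= 0x1c && xec <= 0x1f)
		offset = 13;		/* 0x1c lands on index 15 */
	else
		return NULL;		/* gap between 0xe and 0x1c */

	return nb_mce_desc[xec - offset];
}
```

Keeping one table with a per-range offset means a new signature only needs a
new string (and possibly a new range), rather than another per-family
function.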
drivers/edac/mce_amd.c | 176 +++++++++++++-----------------------------------
drivers/edac/mce_amd.h | 1 -
2 files changed, 48 insertions(+), 129 deletions(-)
diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index bf6dd99..f6ebe5e 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -64,17 +64,6 @@ EXPORT_SYMBOL_GPL(to_msgs);
const char *ii_msgs[] = { "MEM", "RESV", "IO", "GEN" };
EXPORT_SYMBOL_GPL(ii_msgs);
-static const char *f10h_nb_mce_desc[] = {
- "HT link data error",
- "Protocol error (link, L3, probe filter, etc.)",
- "Parity error in NB-internal arrays",
- "Link Retry due to IO link transmission error",
- "L3 ECC data cache error",
- "ECC error in L3 cache tag",
- "L3 LRU parity bits error",
- "ECC Error in the Probe Filter directory"
-};
-
static const char * const f15h_ic_mce_desc[] = {
"UC during a demand linefill from L2",
"Parity error during data load from IC",
@@ -112,6 +101,28 @@ static const char * const f15h_cu_mce_desc[] = {
"PRB address parity error"
};
+static const char *nb_mce_desc[] = {
+ "DRAM ECC error detected on the NB",
+ "CRC error detected on HT link",
+ "Link-defined sync error packets detected on HT link",
+ "HT Master abort",
+ "HT Target abort",
+ "Invalid GART PTE entry during GART table walk",
+ "Unsupported atomic RMW received from an IO link",
+ "Watchdog timeout due to lack of progress",
+ "DRAM ECC error detected on the NB",
+ "SVM DMA Exclusion Vector error",
+ "HT data error detected on link",
+ "Protocol error (link, L3, probe filter)",
+ "NB internal arrays parity error",
+ "DRAM addr/ctl signals parity error",
+ "IO link transmission error",
+ "L3 data cache ECC error", /* xec = 0x1c */
+ "L3 cache tag error",
+ "L3 LRU parity bits error",
+ "ECC Error in the Probe Filter directory"
+};
+
static const char * const fr_ex_mce_desc[] = {
"CPU Watchdog timer expire",
"Wakeup array dest tag",
@@ -499,58 +510,31 @@ wrong_ls_mce:
pr_emerg(HW_ERR "Corrupted LS MCE info?\n");
}
-static bool k8_nb_mce(u16 ec, u8 xec)
+void amd_decode_nb_mce(struct mce *m)
{
- bool ret = true;
-
- switch (xec) {
- case 0x1:
- pr_cont("CRC error detected on HT link.\n");
- break;
-
- case 0x5:
- pr_cont("Invalid GART PTE entry during GART table walk.\n");
- break;
-
- case 0x6:
- pr_cont("Unsupported atomic RMW received from an IO link.\n");
- break;
-
- case 0x0:
- case 0x8:
- if (boot_cpu_data.x86 == 0x11)
- return false;
-
- pr_cont("DRAM ECC error detected on the NB.\n");
- break;
-
- case 0xd:
- pr_cont("Parity error on the DRAM addr/ctl signals.\n");
- break;
-
- default:
- ret = false;
- break;
- }
+ struct cpuinfo_x86 *c = &boot_cpu_data;
+ int node_id = amd_get_nb_id(m->extcpu);
+ u16 ec = EC(m->status);
+ u8 xec = XEC(m->status, 0x1f);
+ u8 offset = 0;
- return ret;
-}
+ pr_emerg(HW_ERR "Northbridge Error (node %d): ", node_id);
-static bool f10h_nb_mce(u16 ec, u8 xec)
-{
- bool ret = true;
- u8 offset = 0;
+ switch (xec) {
+ case 0x0 ... 0xe:
- if (k8_nb_mce(ec, xec))
- return true;
+ /* special handling for DRAM ECCs */
+ if (xec == 0x0 || xec == 0x8) {
+ /* no ECCs on F11h */
+ if (c->x86 == 0x11)
+ goto wrong_nb_mce;
- switch(xec) {
- case 0xa ... 0xc:
- offset = 10;
- break;
+ pr_cont("%s.\n", nb_mce_desc[xec]);
- case 0xe:
- offset = 11;
+ if (nb_bus_decoder)
+ nb_bus_decoder(node_id, m);
+ return;
+ }
break;
case 0xf:
@@ -559,83 +543,25 @@ static bool f10h_nb_mce(u16 ec, u8 xec)
else if (BUS_ERROR(ec))
pr_cont("DMA Exclusion Vector Table Walk error.\n");
else
- ret = false;
-
- goto out;
- break;
+ goto wrong_nb_mce;
+ return;
case 0x19:
if (boot_cpu_data.x86 == 0x15)
pr_cont("Compute Unit Data Error.\n");
else
- ret = false;
-
- goto out;
- break;
+ goto wrong_nb_mce;
+ return;
case 0x1c ... 0x1f:
- offset = 24;
+ offset = 13;
break;
default:
- ret = false;
-
- goto out;
- break;
- }
-
- pr_cont("%s.\n", f10h_nb_mce_desc[xec - offset]);
-
-out:
- return ret;
-}
-
-static bool nb_noop_mce(u16 ec, u8 xec)
-{
- return false;
-}
-
-void amd_decode_nb_mce(struct mce *m)
-{
- struct cpuinfo_x86 *c = &boot_cpu_data;
- int node_id = amd_get_nb_id(m->extcpu);
- u16 ec = EC(m->status);
- u8 xec = XEC(m->status, 0x1f);
-
- pr_emerg(HW_ERR "Northbridge Error (node %d): ", node_id);
-
- switch (xec) {
- case 0x2:
- pr_cont("Sync error (sync packets on HT link detected).\n");
- return;
-
- case 0x3:
- pr_cont("HT Master abort.\n");
- return;
-
- case 0x4:
- pr_cont("HT Target abort.\n");
- return;
-
- case 0x7:
- pr_cont("NB Watchdog timeout.\n");
- return;
-
- case 0x9:
- pr_cont("SVM DMA Exclusion Vector error.\n");
- return;
-
- default:
- break;
- }
-
- if (!fam_ops->nb_mce(ec, xec))
goto wrong_nb_mce;
+ }
- if (c->x86 == 0xf || c->x86 == 0x10 || c->x86 == 0x15)
- if ((xec == 0x8 || xec == 0x0) && nb_bus_decoder)
- nb_bus_decoder(node_id, m);
-
+ pr_cont("%s.\n", nb_mce_desc[xec - offset]);
return;
wrong_nb_mce:
@@ -844,39 +770,33 @@ static int __init mce_amd_init(void)
case 0xf:
fam_ops->dc_mce = k8_dc_mce;
fam_ops->ic_mce = k8_ic_mce;
- fam_ops->nb_mce = k8_nb_mce;
break;
case 0x10:
fam_ops->dc_mce = f10h_dc_mce;
fam_ops->ic_mce = k8_ic_mce;
- fam_ops->nb_mce = f10h_nb_mce;
break;
case 0x11:
fam_ops->dc_mce = k8_dc_mce;
fam_ops->ic_mce = k8_ic_mce;
- fam_ops->nb_mce = f10h_nb_mce;
break;
case 0x12:
fam_ops->dc_mce = f12h_dc_mce;
fam_ops->ic_mce = k8_ic_mce;
- fam_ops->nb_mce = nb_noop_mce;
break;
case 0x14:
nb_err_cpumask = 0x3;
fam_ops->dc_mce = f14h_dc_mce;
fam_ops->ic_mce = f14h_ic_mce;
- fam_ops->nb_mce = nb_noop_mce;
break;
case 0x15:
xec_mask = 0x1f;
fam_ops->dc_mce = f15h_dc_mce;
fam_ops->ic_mce = f15h_ic_mce;
- fam_ops->nb_mce = f10h_nb_mce;
break;
default:
diff --git a/drivers/edac/mce_amd.h b/drivers/edac/mce_amd.h
index 0106747..6fcf599 100644
--- a/drivers/edac/mce_amd.h
+++ b/drivers/edac/mce_amd.h
@@ -82,7 +82,6 @@ extern const char *ii_msgs[];
struct amd_decoder_ops {
bool (*dc_mce)(u16, u8);
bool (*ic_mce)(u16, u8);
- bool (*nb_mce)(u16, u8);
};
void amd_report_gart_errors(bool);
--
1.7.8.rc0
* [PATCH 5/6] MCE, AMD: Correct bank 5 error signatures
2012-02-07 14:26 [PATCH 0/6] MCE, AMD: Update MCE decoding code Borislav Petkov
` (3 preceding siblings ...)
2012-02-07 14:26 ` [PATCH 4/6] MCE, AMD: Rework NB MCE signatures Borislav Petkov
@ 2012-02-07 14:26 ` Borislav Petkov
2012-02-07 14:26 ` [PATCH 6/6] MCE, AMD: Constify error tables Borislav Petkov
5 siblings, 0 replies; 7+ messages in thread
From: Borislav Petkov @ 2012-02-07 14:26 UTC (permalink / raw)
To: EDAC devel; +Cc: LKML, Borislav Petkov
From: Borislav Petkov <borislav.petkov@amd.com>
... and remove superfluous ErrorCodeExt check.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Reviewed-by: Andreas Herrmann <andreas.herrmann3@amd.com>
---
drivers/edac/mce_amd.c | 5 +----
1 files changed, 1 insertions(+), 4 deletions(-)
diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index f6ebe5e..88a9297 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -136,7 +136,7 @@ static const char * const fr_ex_mce_desc[] = {
"Physical register file AG0 port",
"Physical register file AG1 port",
"Flag register file",
- "DE correctable error could not be corrected"
+ "DE error occurred"
};
static bool f12h_dc_mce(u16 ec, u8 xec)
@@ -577,9 +577,6 @@ static void amd_decode_fr_mce(struct mce *m)
if (c->x86 == 0xf || c->x86 == 0x11)
goto wrong_fr_mce;
- if (c->x86 != 0x15 && xec != 0x0)
- goto wrong_fr_mce;
-
pr_emerg(HW_ERR "%s Error: ",
(c->x86 == 0x15 ? "Execution Unit" : "FIROB"));
--
1.7.8.rc0
* [PATCH 6/6] MCE, AMD: Constify error tables
2012-02-07 14:26 [PATCH 0/6] MCE, AMD: Update MCE decoding code Borislav Petkov
` (4 preceding siblings ...)
2012-02-07 14:26 ` [PATCH 5/6] MCE, AMD: Correct bank 5 error signatures Borislav Petkov
@ 2012-02-07 14:26 ` Borislav Petkov
5 siblings, 0 replies; 7+ messages in thread
From: Borislav Petkov @ 2012-02-07 14:26 UTC (permalink / raw)
To: EDAC devel; +Cc: LKML, Borislav Petkov
From: Borislav Petkov <borislav.petkov@amd.com>
... so that checkpatch can chill out.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Reviewed-by: Andreas Herrmann <andreas.herrmann3@amd.com>
---
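For illustration only, a minimal sketch of what the added `const` changes.
`const char *tbl[]` protects the characters but leaves the pointers in the
array writable; `const char * const tbl[]` makes the array elements read-only
as well, which is presumably the form checkpatch asks for. The names
`tt_msgs_demo`, `DEMO_ARRAY_SIZE` and `tt_name()` are hypothetical and exist
only in this sketch.

```c
#include <stddef.h>

/* Same declaration style as the patched tt_msgs[]: const pointers to const chars. */
static const char * const tt_msgs_demo[] = { "INSN", "DATA", "GEN", "RESV" };

/* Local stand-in for the kernel's ARRAY_SIZE() macro. */
#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Hypothetical helper: bounds-checked lookup into the table.
 * Returns NULL for an index past the end of the array.
 */
static const char *tt_name(unsigned int tt)
{
	if (tt >= DEMO_ARRAY_SIZE(tt_msgs_demo))
		return NULL;

	return tt_msgs_demo[tt];
}

/*
 * With the second const in place, an assignment such as
 *
 *	tt_msgs_demo[0] = "oops";
 *
 * is rejected at compile time. With only "const char *tt_msgs_demo[]" it
 * would compile, because just the characters, not the pointers, were const.
 */
```

Since the arrays are exported, the `extern` declarations in mce_amd.h have to
carry the same qualifiers, which is why the header changes along with the
definitions.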
drivers/edac/mce_amd.c | 14 +++++++-------
drivers/edac/mce_amd.h | 12 ++++++------
2 files changed, 13 insertions(+), 13 deletions(-)
diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index 88a9297..36e1486 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -39,29 +39,29 @@ EXPORT_SYMBOL_GPL(amd_unregister_ecc_decoder);
*/
/* transaction type */
-const char *tt_msgs[] = { "INSN", "DATA", "GEN", "RESV" };
+const char * const tt_msgs[] = { "INSN", "DATA", "GEN", "RESV" };
EXPORT_SYMBOL_GPL(tt_msgs);
/* cache level */
-const char *ll_msgs[] = { "RESV", "L1", "L2", "L3/GEN" };
+const char * const ll_msgs[] = { "RESV", "L1", "L2", "L3/GEN" };
EXPORT_SYMBOL_GPL(ll_msgs);
/* memory transaction type */
-const char *rrrr_msgs[] = {
+const char * const rrrr_msgs[] = {
"GEN", "RD", "WR", "DRD", "DWR", "IRD", "PRF", "EV", "SNP"
};
EXPORT_SYMBOL_GPL(rrrr_msgs);
/* participating processor */
-const char *pp_msgs[] = { "SRC", "RES", "OBS", "GEN" };
+const char * const pp_msgs[] = { "SRC", "RES", "OBS", "GEN" };
EXPORT_SYMBOL_GPL(pp_msgs);
/* request timeout */
-const char *to_msgs[] = { "no timeout", "timed out" };
+const char * const to_msgs[] = { "no timeout", "timed out" };
EXPORT_SYMBOL_GPL(to_msgs);
/* memory or i/o */
-const char *ii_msgs[] = { "MEM", "RESV", "IO", "GEN" };
+const char * const ii_msgs[] = { "MEM", "RESV", "IO", "GEN" };
EXPORT_SYMBOL_GPL(ii_msgs);
static const char * const f15h_ic_mce_desc[] = {
@@ -101,7 +101,7 @@ static const char * const f15h_cu_mce_desc[] = {
"PRB address parity error"
};
-static const char *nb_mce_desc[] = {
+static const char * const nb_mce_desc[] = {
"DRAM ECC error detected on the NB",
"CRC error detected on HT link",
"Link-defined sync error packets detected on HT link",
diff --git a/drivers/edac/mce_amd.h b/drivers/edac/mce_amd.h
index 6fcf599..c6074c5 100644
--- a/drivers/edac/mce_amd.h
+++ b/drivers/edac/mce_amd.h
@@ -69,12 +69,12 @@ enum rrrr_ids {
R4_SNOOP,
};
-extern const char *tt_msgs[];
-extern const char *ll_msgs[];
-extern const char *rrrr_msgs[];
-extern const char *pp_msgs[];
-extern const char *to_msgs[];
-extern const char *ii_msgs[];
+extern const char * const tt_msgs[];
+extern const char * const ll_msgs[];
+extern const char * const rrrr_msgs[];
+extern const char * const pp_msgs[];
+extern const char * const to_msgs[];
+extern const char * const ii_msgs[];
/*
* per-family decoder ops
--
1.7.8.rc0