Linux-EDAC Archive on lore.kernel.org
* [PATCH 0/6] rasdaemon:add logging of HiSilicon HIP08 non-standard H/W errors and changes in the error decoding code
@ 2019-06-17 14:28 ` Shiju Jose
  2019-06-17 14:28   ` [PATCH 1/6] rasdaemon:print non-standard error data if not decoded Shiju Jose
                     ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Shiju Jose @ 2019-06-17 14:28 UTC (permalink / raw)
  To: mchehab, linux-edac, linuxarm; +Cc: Shiju Jose

This patch set makes a few changes to the non-standard error decoding code
and adds logging for the HiSilicon HIP08 non-standard H/W errors.

Shiju Jose (6):
  rasdaemon:print non-standard error data if not decoded
  rasdaemon: rearrange HiSilicon HIP07 decoding function table
  rasdaemon: update iteration logic for the non-standard error decoding
    functions
  rasdaemon:add logging HiSilicon HIP08 H/W errors reported in the OEM
    format1
  rasdaemon:add logging HiSilicon HIP08 H/W errors reported in the OEM
    format2
  rasdaemon:add logging HiSilicon HIP08 PCIe local errors

 Makefile.am                |   2 +-
 non-standard-hisi_hip07.c  |  36 +-
 non-standard-hisi_hip08.c  | 855 +++++++++++++++++++++++++++++++++++++++++++++
 ras-non-standard-handler.c |  36 +-
 ras-non-standard-handler.h |   8 +-
 ras-record.c               |  30 +-
 ras-record.h               |  13 +
 7 files changed, 932 insertions(+), 48 deletions(-)
 create mode 100644 non-standard-hisi_hip08.c

-- 
1.9.1




* [PATCH 1/6] rasdaemon:print non-standard error data if not decoded
  2019-06-17 14:28 ` [PATCH 0/6] rasdaemon:add logging of HiSilicon HIP08 non-standard H/W errors and changes in the error decoding code Shiju Jose
@ 2019-06-17 14:28   ` Shiju Jose
  2019-06-17 14:28   ` [PATCH 2/6] rasdaemon: rearrange HiSilicon HIP07 decoding function table Shiju Jose
                     ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Shiju Jose @ 2019-06-17 14:28 UTC (permalink / raw)
  To: mchehab, linux-edac, linuxarm; +Cc: Shiju Jose

This patch changes the handler to print the raw non-standard error
data only if no decoder has decoded it.

Suggested-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
 ras-non-standard-handler.c | 31 +++++++++++++++++--------------
 1 file changed, 17 insertions(+), 14 deletions(-)
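
The control flow of this change can be sketched outside rasdaemon as
follows. This is a minimal, self-contained illustration and not the
daemon's code: try_decode(), hex_dump_le() and handle_record() are
hypothetical names, the "decoder" is a stand-in that accepts records
starting with 0xA5, and the output goes to a caller-supplied buffer
(assumed to be at least one byte long) instead of a trace_seq.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical decoder: claims the record only if it starts with 0xA5. */
static bool try_decode(const uint8_t *buf, size_t len,
		       char *out, size_t out_sz)
{
	if (len == 0 || buf[0] != 0xA5)
		return false;
	snprintf(out, out_sz, "decoded record, %zu bytes", len);
	return true;
}

/* Fallback: print whole 32-bit words, little-endian, four per line. */
static void hex_dump_le(const uint8_t *buf, size_t len,
			char *out, size_t out_sz)
{
	size_t used = 0, i = 0;
	int words = 0;

	out[0] = '\0';
	while (len - i >= 4 && used < out_sz) {
		uint32_t w = (uint32_t)buf[i] |
			     (uint32_t)buf[i + 1] << 8 |
			     (uint32_t)buf[i + 2] << 16 |
			     (uint32_t)buf[i + 3] << 24;
		int n = snprintf(out + used, out_sz - used, "%08x%s",
				 (unsigned)w,
				 ++words % 4 == 0 ? "\n" : " ");

		if (n < 0)
			break;
		used += (size_t)n;
		i += 4;
	}
}

/* Decode if possible; dump the raw payload only when nothing decoded it. */
static void handle_record(const uint8_t *buf, size_t len,
			  char *out, size_t out_sz)
{
	if (!try_decode(buf, len, out, out_sz))
		hex_dump_le(buf, len, out, out_sz);
}
```

With this ordering a record that a decoder understands produces only
the decoded text, while an unknown record still leaves its raw words
in the log for later analysis.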

diff --git a/ras-non-standard-handler.c b/ras-non-standard-handler.c
index 21e6a76..d343a2a 100644
--- a/ras-non-standard-handler.c
+++ b/ras-non-standard-handler.c
@@ -160,20 +160,6 @@ int ras_non_standard_event_handler(struct trace_seq *s,
 	ev.error = pevent_get_field_raw(s, event, "buf", record, &len, 1);
 	if(!ev.error)
 		return -1;
-	len = ev.length;
-	i = 0;
-	line_count = 0;
-	trace_seq_printf(s, " error:\n  %08x: ", i);
-	while(len >= 4) {
-		print_le_hex(s, ev.error, i);
-		i+=4;
-		len-=4;
-		if(++line_count == 4) {
-			trace_seq_printf(s, "\n  %08x: ", i);
-			line_count = 0;
-		} else
-			trace_seq_printf(s, " ");
-	}
 
 	for (count = 0; count < dec_tab_count && !dec_done; count++) {
 		dec_tab = ns_dec_tab[count];
@@ -187,6 +173,23 @@ int ras_non_standard_event_handler(struct trace_seq *s,
 		}
 	}
 
+	if (!dec_done) {
+		len = ev.length;
+		i = 0;
+		line_count = 0;
+		trace_seq_printf(s, " error:\n  %08x: ", i);
+		while (len >= 4) {
+			print_le_hex(s, ev.error, i);
+			i += 4;
+			len -= 4;
+			if (++line_count == 4) {
+				trace_seq_printf(s, "\n  %08x: ", i);
+				line_count = 0;
+			} else
+				trace_seq_printf(s, " ");
+		}
+	}
+
 	/* Insert data into the SGBD */
 #ifdef HAVE_SQLITE3
 	ras_store_non_standard_record(ras, &ev);
-- 
1.9.1




* [PATCH 2/6] rasdaemon: rearrange HiSilicon HIP07 decoding function table
  2019-06-17 14:28 ` [PATCH 0/6] rasdaemon:add logging of HiSilicon HIP08 non-standard H/W errors and changes in the error decoding code Shiju Jose
  2019-06-17 14:28   ` [PATCH 1/6] rasdaemon:print non-standard error data if not decoded Shiju Jose
@ 2019-06-17 14:28   ` Shiju Jose
  2019-06-17 14:28   ` [PATCH 3/6] rasdaemon: update iteration logic for the non-standard error decoding functions Shiju Jose
                     ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Shiju Jose @ 2019-06-17 14:28 UTC (permalink / raw)
  To: mchehab, linux-edac, linuxarm; +Cc: Shiju Jose

This patch moves the decoding function table for the HiSilicon
HIP07 non-standard errors below the decoding functions, so the
forward declarations are no longer needed.

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
 non-standard-hisi_hip07.c | 26 ++++++++++++--------------
 1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/non-standard-hisi_hip07.c b/non-standard-hisi_hip07.c
index 3e9dabd..19a5c47 100644
--- a/non-standard-hisi_hip07.c
+++ b/non-standard-hisi_hip07.c
@@ -24,20 +24,6 @@
 #define HISI_SAS_VALID_ERR_TYPE       BIT(2)
 #define HISI_SAS_VALID_AXI_ERR_INFO   BIT(3)
 
-static int decode_hip07_sas_error(struct trace_seq *s, const void *error);
-static int decode_hip07_hns_error(struct trace_seq *s, const void *error);
-
-struct ras_ns_dec_tab hisi_ns_dec_tab[] = {
-	{
-		.sec_type = "daffd8146eba4d8c8a91bc9bbf4aa301",
-		.decode = decode_hip07_sas_error,
-	},
-	{
-		.sec_type = "fbc2d923ea7a453dab132949f5af9e53",
-		.decode = decode_hip07_hns_error,
-	},
-};
-
 struct hisi_sas_err_sec {
 	uint64_t   val_bits;
 	uint64_t   physical_addr;
@@ -138,6 +124,18 @@ static int decode_hip07_hns_error(struct trace_seq *s, const void *error)
 {
 	return 0;
 }
+
+struct ras_ns_dec_tab hisi_ns_dec_tab[] = {
+	{
+		.sec_type = "daffd8146eba4d8c8a91bc9bbf4aa301",
+		.decode = decode_hip07_sas_error,
+	},
+	{
+		.sec_type = "fbc2d923ea7a453dab132949f5af9e53",
+		.decode = decode_hip07_hns_error,
+	},
+};
+
 __attribute__((constructor))
 static void hip07_init(void)
 {
-- 
1.9.1




* [PATCH 3/6] rasdaemon: update iteration logic for the non-standard error decoding functions
  2019-06-17 14:28 ` [PATCH 0/6] rasdaemon:add logging of HiSilicon HIP08 non-standard H/W errors and changes in the error decoding code Shiju Jose
  2019-06-17 14:28   ` [PATCH 1/6] rasdaemon:print non-standard error data if not decoded Shiju Jose
  2019-06-17 14:28   ` [PATCH 2/6] rasdaemon: rearrange HiSilicon HIP07 decoding function table Shiju Jose
@ 2019-06-17 14:28   ` Shiju Jose
  2019-06-17 14:28   ` [PATCH 4/6] rasdaemon:add logging HiSilicon HIP08 H/W errors reported in the OEM format1 Shiju Jose
                     ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Shiju Jose @ 2019-06-17 14:28 UTC (permalink / raw)
  To: mchehab, linux-edac, linuxarm; +Cc: Shiju Jose

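This patch updates the iteration logic for the non-standard
error decoding functions: each decoding table now ends with an
empty sentinel entry, and the handler walks a table until it finds
an entry without a decode callback instead of reading a length
stored in the first entry.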

Suggested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
 non-standard-hisi_hip07.c  | 2 +-
 ras-non-standard-handler.c | 2 +-
 ras-non-standard-handler.h | 1 -
 3 files changed, 2 insertions(+), 3 deletions(-)
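
A sentinel-terminated table of this kind can be sketched in isolation
as below. The struct is a simplified stand-in for ras_ns_dec_tab, and
dec_entry, demo_tab, decode_a(), decode_b() and find_decoder() are
hypothetical names. The sketch compares section types with strcmp(),
which is an assumption made to keep it self-contained; the handler in
this series uses uuid_le_cmp().

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for struct ras_ns_dec_tab. */
struct dec_entry {
	const char *sec_type;		/* section type UUID as hex text */
	int (*decode)(const void *err);	/* NULL marks the end of the table */
};

static int decode_a(const void *err) { (void)err; return 1; }
static int decode_b(const void *err) { (void)err; return 2; }

static const struct dec_entry demo_tab[] = {
	{ .sec_type = "daffd8146eba4d8c8a91bc9bbf4aa301", .decode = decode_a },
	{ .sec_type = "fbc2d923ea7a453dab132949f5af9e53", .decode = decode_b },
	{ NULL, NULL }	/* sentinel */
};

/* Walk the table until the sentinel; no element count is needed. */
static const struct dec_entry *find_decoder(const struct dec_entry *tab,
					    const char *sec_type)
{
	for (size_t i = 0; tab[i].decode; i++)
		if (strcmp(tab[i].sec_type, sec_type) == 0)
			return &tab[i];
	return NULL;
}
```

The benefit is that a table carries its own end marker, so a
constructor only has to register the table and cannot forget to fill
in a length.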

diff --git a/non-standard-hisi_hip07.c b/non-standard-hisi_hip07.c
index 19a5c47..bb2576e 100644
--- a/non-standard-hisi_hip07.c
+++ b/non-standard-hisi_hip07.c
@@ -134,11 +134,11 @@ struct ras_ns_dec_tab hisi_ns_dec_tab[] = {
 		.sec_type = "fbc2d923ea7a453dab132949f5af9e53",
 		.decode = decode_hip07_hns_error,
 	},
+	{ /* sentinel */ }
 };
 
 __attribute__((constructor))
 static void hip07_init(void)
 {
-	hisi_ns_dec_tab[0].len = ARRAY_SIZE(hisi_ns_dec_tab);
 	register_ns_dec_tab(hisi_ns_dec_tab);
 }
diff --git a/ras-non-standard-handler.c b/ras-non-standard-handler.c
index d343a2a..392bb27 100644
--- a/ras-non-standard-handler.c
+++ b/ras-non-standard-handler.c
@@ -163,7 +163,7 @@ int ras_non_standard_event_handler(struct trace_seq *s,
 
 	for (count = 0; count < dec_tab_count && !dec_done; count++) {
 		dec_tab = ns_dec_tab[count];
-		for (i = 0; i < dec_tab[0].len; i++) {
+		for (i = 0; dec_tab[i].decode; i++) {
 			if (uuid_le_cmp(ev.sec_type,
 					dec_tab[i].sec_type) == 0) {
 				dec_tab[i].decode(s, ev.error);
diff --git a/ras-non-standard-handler.h b/ras-non-standard-handler.h
index b9e9fb1..b2c9743 100644
--- a/ras-non-standard-handler.h
+++ b/ras-non-standard-handler.h
@@ -23,7 +23,6 @@
 typedef struct ras_ns_dec_tab {
 	const char *sec_type;
 	int (*decode)(struct trace_seq *s, const void *err);
-	size_t len;
 } *p_ns_dec_tab;
 
 int ras_non_standard_event_handler(struct trace_seq *s,
-- 
1.9.1




* [PATCH 4/6] rasdaemon:add logging HiSilicon HIP08 H/W errors reported in the OEM format1
  2019-06-17 14:28 ` [PATCH 0/6] rasdaemon:add logging of HiSilicon HIP08 non-standard H/W errors and changes in the error decoding code Shiju Jose
                     ` (2 preceding siblings ...)
  2019-06-17 14:28   ` [PATCH 3/6] rasdaemon: update iteration logic for the non-standard error decoding functions Shiju Jose
@ 2019-06-17 14:28   ` Shiju Jose
  2019-06-17 14:28   ` [PATCH 5/6] rasdaemon:add logging HiSilicon HIP08 H/W errors reported in the OEM format2 Shiju Jose
                     ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Shiju Jose @ 2019-06-17 14:28 UTC (permalink / raw)
  To: mchehab, linux-edac, linuxarm; +Cc: Shiju Jose

This patch adds logging of the HiSilicon HIP08 H/W errors reported
in the non-standard OEM format1.
These errors come from the H/W modules MN, PLL, SLLC, AA, SIOE,
POE, DISP, LPC, SAS and SATA.

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
 Makefile.am                |   2 +-
 non-standard-hisi_hip07.c  |   8 +-
 non-standard-hisi_hip08.c  | 332 +++++++++++++++++++++++++++++++++++++++++++++
 ras-non-standard-handler.c |   3 +-
 ras-non-standard-handler.h |   7 +-
 ras-record.c               |  30 ++--
 ras-record.h               |  13 ++
 7 files changed, 378 insertions(+), 17 deletions(-)
 create mode 100644 non-standard-hisi_hip08.c
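
The decoder added here reports a field only when its validity bit is
set in val_bits. A reduced sketch of that pattern follows. It is an
illustration under assumptions: demo_err_sec keeps only three of the
fields, the DEMO_* macros, append() and demo_decode() are
hypothetical names, and the sketch formats with a bounded
vsnprintf() helper whereas the patch uses sprintf() into a
1024-byte buffer.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define BIT(n)			(1U << (n))
#define DEMO_VALID_SOC_ID	BIT(0)
#define DEMO_VALID_SOCKET_ID	BIT(1)
#define DEMO_VALID_MODULE_ID	BIT(3)

/* Reduced stand-in for struct hisi_oem_type1_err_sec. */
struct demo_err_sec {
	uint32_t val_bits;
	uint8_t  soc_id;
	uint8_t  socket_id;
	uint8_t  module_id;
};

static const char *demo_module_name(uint8_t id)
{
	switch (id) {
	case 0: return "MN";
	case 1: return "PLL";
	case 2: return "SLLC";
	}
	return "unknown";
}

/* Append formatted text at offset `used`; never writes past out_sz. */
static size_t append(char *out, size_t out_sz, size_t used,
		     const char *fmt, ...)
{
	va_list ap;
	int n;

	if (used >= out_sz)
		return used;
	va_start(ap, fmt);
	n = vsnprintf(out + used, out_sz - used, fmt, ap);
	va_end(ap);
	return n < 0 ? used : used + (size_t)n;
}

/* Report only the fields whose validity bit is set. */
static int demo_decode(const struct demo_err_sec *err,
		       char *out, size_t out_sz)
{
	size_t used = 0;

	if (err->val_bits == 0)
		return -1;	/* no valid error information */

	used = append(out, out_sz, used, "[ ");
	if (err->val_bits & DEMO_VALID_SOC_ID)
		used = append(out, out_sz, used, "SOC ID=%d ", err->soc_id);
	if (err->val_bits & DEMO_VALID_SOCKET_ID)
		used = append(out, out_sz, used, "socket ID=%d ",
			      err->socket_id);
	if (err->val_bits & DEMO_VALID_MODULE_ID)
		used = append(out, out_sz, used, "module=%s ",
			      demo_module_name(err->module_id));
	append(out, out_sz, used, "]");
	return 0;
}
```

A record with only the SOC ID and module ID bits set therefore never
shows a socket ID, even if that byte happens to be non-zero.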

diff --git a/Makefile.am b/Makefile.am
index f036ffd..3d89672 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -49,7 +49,7 @@ if WITH_ABRT_REPORT
    rasdaemon_SOURCES += ras-report.c
 endif
 if WITH_HISI_NS_DECODE
-   rasdaemon_SOURCES += non-standard-hisi_hip07.c
+   rasdaemon_SOURCES += non-standard-hisi_hip07.c non-standard-hisi_hip08.c
 endif
 rasdaemon_LDADD = -lpthread $(SQLITE3_LIBS) libtrace/libtrace.a
 
diff --git a/non-standard-hisi_hip07.c b/non-standard-hisi_hip07.c
index bb2576e..7f58fb3 100644
--- a/non-standard-hisi_hip07.c
+++ b/non-standard-hisi_hip07.c
@@ -87,7 +87,9 @@ static char *sas_axi_err_type(int etype)
 	return "unknown error";
 }
 
-static int decode_hip07_sas_error(struct trace_seq *s, const void *error)
+static int decode_hip07_sas_error(struct ras_events *ras,
+				  struct ras_ns_dec_tab *dec_tab,
+				  struct trace_seq *s, const void *error)
 {
 	char buf[1024];
 	char *p = buf;
@@ -120,7 +122,9 @@ static int decode_hip07_sas_error(struct trace_seq *s, const void *error)
 	return 0;
 }
 
-static int decode_hip07_hns_error(struct trace_seq *s, const void *error)
+static int decode_hip07_hns_error(struct ras_events *ras,
+				  struct ras_ns_dec_tab *dec_tab,
+				  struct trace_seq *s, const void *error)
 {
 	return 0;
 }
diff --git a/non-standard-hisi_hip08.c b/non-standard-hisi_hip08.c
new file mode 100644
index 0000000..240e832
--- /dev/null
+++ b/non-standard-hisi_hip08.c
@@ -0,0 +1,332 @@
+/*
+ * Copyright (c) 2019 Hisilicon Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include "ras-record.h"
+#include "ras-logger.h"
+#include "ras-report.h"
+#include "ras-non-standard-handler.h"
+
+/* HISI OEM error definitions */
+/* HISI OEM format1 error definitions */
+#define HISI_OEM_MODULE_ID_MN	0
+#define HISI_OEM_MODULE_ID_PLL	1
+#define HISI_OEM_MODULE_ID_SLLC	2
+#define HISI_OEM_MODULE_ID_AA	3
+#define HISI_OEM_MODULE_ID_SIOE	4
+#define HISI_OEM_MODULE_ID_POE	5
+#define HISI_OEM_MODULE_ID_DISP	8
+#define HISI_OEM_MODULE_ID_LPC	9
+#define HISI_OEM_MODULE_ID_SAS	15
+#define HISI_OEM_MODULE_ID_SATA	16
+
+#define HISI_OEM_VALID_SOC_ID		BIT(0)
+#define HISI_OEM_VALID_SOCKET_ID	BIT(1)
+#define HISI_OEM_VALID_NIMBUS_ID	BIT(2)
+#define HISI_OEM_VALID_MODULE_ID	BIT(3)
+#define HISI_OEM_VALID_SUB_MODULE_ID	BIT(4)
+#define HISI_OEM_VALID_ERR_SEVERITY	BIT(5)
+
+#define HISI_OEM_TYPE1_VALID_ERR_MISC_0	BIT(6)
+#define HISI_OEM_TYPE1_VALID_ERR_MISC_1	BIT(7)
+#define HISI_OEM_TYPE1_VALID_ERR_MISC_2	BIT(8)
+#define HISI_OEM_TYPE1_VALID_ERR_MISC_3	BIT(9)
+#define HISI_OEM_TYPE1_VALID_ERR_MISC_4	BIT(10)
+#define HISI_OEM_TYPE1_VALID_ERR_ADDR	BIT(11)
+
+struct hisi_oem_type1_err_sec {
+	uint32_t   val_bits;
+	uint8_t    version;
+	uint8_t    soc_id;
+	uint8_t    socket_id;
+	uint8_t    nimbus_id;
+	uint8_t    module_id;
+	uint8_t    sub_module_id;
+	uint8_t    err_severity;
+	uint8_t    reserv;
+	uint32_t   err_misc_0;
+	uint32_t   err_misc_1;
+	uint32_t   err_misc_2;
+	uint32_t   err_misc_3;
+	uint32_t   err_misc_4;
+	uint64_t   err_addr;
+};
+
+enum hisi_oem_data_type {
+	hisi_oem_data_type_int,
+	hisi_oem_data_type_int64,
+	hisi_oem_data_type_text,
+};
+
+enum {
+	hip08_oem_type1_field_id,
+	hip08_oem_type1_field_version,
+	hip08_oem_type1_field_soc_id,
+	hip08_oem_type1_field_socket_id,
+	hip08_oem_type1_field_nimbus_id,
+	hip08_oem_type1_field_module_id,
+	hip08_oem_type1_field_sub_module_id,
+	hip08_oem_type1_field_err_sev,
+	hip08_oem_type1_field_err_misc_0,
+	hip08_oem_type1_field_err_misc_1,
+	hip08_oem_type1_field_err_misc_2,
+	hip08_oem_type1_field_err_misc_3,
+	hip08_oem_type1_field_err_misc_4,
+	hip08_oem_type1_field_err_addr,
+};
+
+/* helper functions */
+static char *err_severity(uint8_t err_sev)
+{
+	switch (err_sev) {
+	case 0: return "recoverable";
+	case 1: return "fatal";
+	case 2: return "corrected";
+	case 3: return "none";
+	}
+	return "unknown";
+}
+
+static char *oem_type1_module_name(uint8_t module_id)
+{
+	switch (module_id) {
+	case HISI_OEM_MODULE_ID_MN: return "MN";
+	case HISI_OEM_MODULE_ID_PLL: return "PLL";
+	case HISI_OEM_MODULE_ID_SLLC: return "SLLC";
+	case HISI_OEM_MODULE_ID_AA: return "AA";
+	case HISI_OEM_MODULE_ID_SIOE: return "SIOE";
+	case HISI_OEM_MODULE_ID_POE: return "POE";
+	case HISI_OEM_MODULE_ID_DISP: return "DISP";
+	case HISI_OEM_MODULE_ID_LPC: return "LPC";
+	case HISI_OEM_MODULE_ID_SAS: return "SAS";
+	case HISI_OEM_MODULE_ID_SATA: return "SATA";
+	}
+	return "unknown";
+}
+
+#ifdef HAVE_SQLITE3
+static const struct db_fields hip08_oem_type1_event_fields[] = {
+	{ .name = "id",			.type = "INTEGER PRIMARY KEY" },
+	{ .name = "version",		.type = "INTEGER" },
+	{ .name = "soc_id",		.type = "INTEGER" },
+	{ .name = "socket_id",		.type = "INTEGER" },
+	{ .name = "nimbus_id",		.type = "INTEGER" },
+	{ .name = "module_id",		.type = "TEXT" },
+	{ .name = "sub_module_id",	.type = "INTEGER" },
+	{ .name = "err_severity",	.type = "TEXT" },
+	{ .name = "err_misc_0",		.type = "INTEGER" },
+	{ .name = "err_misc_1",		.type = "INTEGER" },
+	{ .name = "err_misc_2",		.type = "INTEGER" },
+	{ .name = "err_misc_3",		.type = "INTEGER" },
+	{ .name = "err_misc_4",		.type = "INTEGER" },
+	{ .name = "err_addr",		.type = "INTEGER" },
+};
+
+static const struct db_table_descriptor hip08_oem_type1_event_tab = {
+	.name = "hip08_oem_type1_event",
+	.fields = hip08_oem_type1_event_fields,
+	.num_fields = ARRAY_SIZE(hip08_oem_type1_event_fields),
+};
+
+static void record_vendor_data(struct ras_ns_dec_tab *dec_tab,
+			       enum hisi_oem_data_type data_type,
+			       int id, int64_t data, const char *text)
+{
+	switch (data_type) {
+	case hisi_oem_data_type_int:
+		sqlite3_bind_int(dec_tab->stmt_dec_record, id, data);
+		break;
+	case hisi_oem_data_type_int64:
+		sqlite3_bind_int64(dec_tab->stmt_dec_record, id, data);
+		break;
+	case hisi_oem_data_type_text:
+		sqlite3_bind_text(dec_tab->stmt_dec_record, id, text, -1, NULL);
+		break;
+	default:
+		break;
+	}
+}
+
+static int step_vendor_data_tab(struct ras_ns_dec_tab *dec_tab, char *name)
+{
+	int rc;
+
+	rc = sqlite3_step(dec_tab->stmt_dec_record);
+	if (rc != SQLITE_OK && rc != SQLITE_DONE)
+		log(TERM, LOG_ERR,
+		    "Failed to do %s step on sqlite: error = %d\n", name, rc);
+
+	rc = sqlite3_reset(dec_tab->stmt_dec_record);
+	if (rc != SQLITE_OK && rc != SQLITE_DONE)
+		log(TERM, LOG_ERR,
+		    "Failed to reset %s on sqlite: error = %d\n", name, rc);
+
+	rc = sqlite3_clear_bindings(dec_tab->stmt_dec_record);
+	if (rc != SQLITE_OK && rc != SQLITE_DONE)
+		log(TERM, LOG_ERR,
+		    "Failed to clear bindings %s on sqlite: error = %d\n",
+		    name, rc);
+
+	return rc;
+}
+#else
+static void record_vendor_data(struct ras_ns_dec_tab *dec_tab,
+			       enum hisi_oem_data_type data_type,
+			       int id, int64_t data, const char *text)
+{ }
+
+static int step_vendor_data_tab(struct ras_ns_dec_tab *dec_tab, char *name)
+{
+	return 0;
+}
+#endif
+
+/* error data decoding functions */
+static int decode_hip08_oem_type1_error(struct ras_events *ras,
+					struct ras_ns_dec_tab *dec_tab,
+					struct trace_seq *s, const void *error)
+{
+	const struct hisi_oem_type1_err_sec *err = error;
+	char buf[1024];
+	char *p = buf;
+
+	if (err->val_bits == 0) {
+		trace_seq_printf(s, "%s: no valid error information\n",
+				 __func__);
+		return -1;
+	}
+
+#ifdef HAVE_SQLITE3
+	if (!dec_tab->stmt_dec_record) {
+		if (ras_mc_add_vendor_table(ras, &dec_tab->stmt_dec_record,
+					    &hip08_oem_type1_event_tab)
+			!= SQLITE_OK) {
+			trace_seq_printf(s,
+					"create sql hip08_oem_type1_event_tab fail\n");
+			return -1;
+		}
+	}
+#endif
+
+	p += sprintf(p, "[ ");
+	p += sprintf(p, "Table version=%d ", err->version);
+	record_vendor_data(dec_tab, hisi_oem_data_type_int,
+			   hip08_oem_type1_field_version, err->version, NULL);
+
+	if (err->val_bits & HISI_OEM_VALID_SOC_ID) {
+		p += sprintf(p, "SOC ID=%d ", err->soc_id);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_oem_type1_field_soc_id,
+				   err->soc_id, NULL);
+	}
+
+	if (err->val_bits & HISI_OEM_VALID_SOCKET_ID) {
+		p += sprintf(p, "socket ID=%d ", err->socket_id);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_oem_type1_field_socket_id,
+				   err->socket_id, NULL);
+	}
+
+	if (err->val_bits & HISI_OEM_VALID_NIMBUS_ID) {
+		p += sprintf(p, "nimbus ID=%d ", err->nimbus_id);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_oem_type1_field_nimbus_id,
+				   err->nimbus_id, NULL);
+	}
+
+	if (err->val_bits & HISI_OEM_VALID_MODULE_ID) {
+		p += sprintf(p, "module=%s-",
+			     oem_type1_module_name(err->module_id));
+		record_vendor_data(dec_tab, hisi_oem_data_type_text,
+				   hip08_oem_type1_field_module_id,
+				   0, oem_type1_module_name(err->module_id));
+		if (err->val_bits & HISI_OEM_VALID_SUB_MODULE_ID) {
+			p += sprintf(p, "%d ", err->sub_module_id);
+			record_vendor_data(dec_tab, hisi_oem_data_type_int,
+					   hip08_oem_type1_field_sub_module_id,
+					   err->sub_module_id, NULL);
+		}
+	}
+
+	if (err->val_bits & HISI_OEM_VALID_ERR_SEVERITY) {
+		p += sprintf(p, "error severity=%s ",
+			     err_severity(err->err_severity));
+		record_vendor_data(dec_tab, hisi_oem_data_type_text,
+				   hip08_oem_type1_field_err_sev,
+				   0, err_severity(err->err_severity));
+	}
+
+	p += sprintf(p, "]");
+	trace_seq_printf(s, "\nHISI HIP08: OEM Type-1 Error\n");
+	trace_seq_printf(s, "%s\n", buf);
+
+	trace_seq_printf(s, "Reg Dump:\n");
+	if (err->val_bits & HISI_OEM_TYPE1_VALID_ERR_MISC_0) {
+		trace_seq_printf(s, "ERR_MISC0=0x%x\n", err->err_misc_0);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_oem_type1_field_err_misc_0,
+				   err->err_misc_0, NULL);
+	}
+
+	if (err->val_bits & HISI_OEM_TYPE1_VALID_ERR_MISC_1) {
+		trace_seq_printf(s, "ERR_MISC1=0x%x\n", err->err_misc_1);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_oem_type1_field_err_misc_1,
+				   err->err_misc_1, NULL);
+	}
+
+	if (err->val_bits & HISI_OEM_TYPE1_VALID_ERR_MISC_2) {
+		trace_seq_printf(s, "ERR_MISC2=0x%x\n", err->err_misc_2);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_oem_type1_field_err_misc_2,
+				   err->err_misc_2, NULL);
+	}
+
+	if (err->val_bits & HISI_OEM_TYPE1_VALID_ERR_MISC_3) {
+		trace_seq_printf(s, "ERR_MISC3=0x%x\n", err->err_misc_3);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_oem_type1_field_err_misc_3,
+				   err->err_misc_3, NULL);
+	}
+
+	if (err->val_bits & HISI_OEM_TYPE1_VALID_ERR_MISC_4) {
+		trace_seq_printf(s, "ERR_MISC4=0x%x\n", err->err_misc_4);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_oem_type1_field_err_misc_4,
+				   err->err_misc_4, NULL);
+	}
+
+	if (err->val_bits & HISI_OEM_TYPE1_VALID_ERR_ADDR) {
+		trace_seq_printf(s, "ERR_ADDR=0x%llx\n", (unsigned long long)err->err_addr);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int64,
+				   hip08_oem_type1_field_err_addr,
+				   err->err_addr, NULL);
+	}
+
+	step_vendor_data_tab(dec_tab, "hip08_oem_type1_event_tab");
+
+	return 0;
+}
+
+struct ras_ns_dec_tab hip08_ns_oem_tab[] = {
+	{
+		.sec_type = "1f8161e155d641e6bd107afd1dc5f7c5",
+		.decode = decode_hip08_oem_type1_error,
+	},
+	{ /* sentinel */ }
+};
+
+__attribute__((constructor))
+static void hip08_init(void)
+{
+	register_ns_dec_tab(hip08_ns_oem_tab);
+}
diff --git a/ras-non-standard-handler.c b/ras-non-standard-handler.c
index 392bb27..4eda80b 100644
--- a/ras-non-standard-handler.c
+++ b/ras-non-standard-handler.c
@@ -166,7 +166,8 @@ int ras_non_standard_event_handler(struct trace_seq *s,
 		for (i = 0; dec_tab[i].decode; i++) {
 			if (uuid_le_cmp(ev.sec_type,
 					dec_tab[i].sec_type) == 0) {
-				dec_tab[i].decode(s, ev.error);
+				dec_tab[i].decode(ras, &dec_tab[i],
+						  s, ev.error);
 				dec_done = true;
 				break;
 			}
diff --git a/ras-non-standard-handler.h b/ras-non-standard-handler.h
index b2c9743..a7e48a3 100644
--- a/ras-non-standard-handler.h
+++ b/ras-non-standard-handler.h
@@ -22,7 +22,12 @@
 
 typedef struct ras_ns_dec_tab {
 	const char *sec_type;
-	int (*decode)(struct trace_seq *s, const void *err);
+	int (*decode)(struct ras_events *ras, struct ras_ns_dec_tab *dec_tab,
+		      struct trace_seq *s, const void *err);
+#ifdef HAVE_SQLITE3
+#include <sqlite3.h>
+	sqlite3_stmt *stmt_dec_record;
+#endif
 } *p_ns_dec_tab;
 
 int ras_non_standard_event_handler(struct trace_seq *s,
diff --git a/ras-record.c b/ras-record.c
index 4c8b55b..b212607 100644
--- a/ras-record.c
+++ b/ras-record.c
@@ -38,17 +38,6 @@
 
 #define ARRAY_SIZE(x) (sizeof(x)/sizeof(*(x)))
 
-struct db_fields {
-	char *name;
-	char *type;
-};
-
-struct db_table_descriptor {
-	char			*name;
-	const struct db_fields	*fields;
-	size_t			num_fields;
-};
-
 /*
  * Table and functions to handle ras:mc_event
  */
@@ -511,7 +500,7 @@ static int ras_mc_create_table(struct sqlite3_priv *priv,
 {
 	const struct db_fields *field;
 	char sql[1024], *p = sql, *end = sql + sizeof(sql);
-	int i,rc;
+	int i, rc;
 
 	p += snprintf(p, end - p, "CREATE TABLE IF NOT EXISTS %s (",
 		      db_tab->name);
@@ -538,6 +527,23 @@ static int ras_mc_create_table(struct sqlite3_priv *priv,
 	return rc;
 }
 
+int ras_mc_add_vendor_table(struct ras_events *ras,
+			    sqlite3_stmt **stmt,
+			    const struct db_table_descriptor *db_tab)
+{
+	int rc;
+	struct sqlite3_priv *priv = ras->db_priv;
+
+	if (!priv)
+		return -1;
+
+	rc = ras_mc_create_table(priv, db_tab);
+	if (rc == SQLITE_OK)
+		rc = ras_mc_prepare_stmt(priv, stmt, db_tab);
+
+	return rc;
+}
+
 int ras_mc_event_opendb(unsigned cpu, struct ras_events *ras)
 {
 	int rc;
diff --git a/ras-record.h b/ras-record.h
index 2183167..432a571 100644
--- a/ras-record.h
+++ b/ras-record.h
@@ -119,7 +119,20 @@ struct sqlite3_priv {
 #endif
 };
 
+struct db_fields {
+	char *name;
+	char *type;
+};
+
+struct db_table_descriptor {
+	char                    *name;
+	const struct db_fields  *fields;
+	size_t                  num_fields;
+};
+
 int ras_mc_event_opendb(unsigned cpu, struct ras_events *ras);
+int ras_mc_add_vendor_table(struct ras_events *ras, sqlite3_stmt **stmt,
+			    const struct db_table_descriptor *db_tab);
 int ras_store_mc_event(struct ras_events *ras, struct ras_mc_event *ev);
 int ras_store_aer_event(struct ras_events *ras, struct ras_aer_event *ev);
 int ras_store_mce_record(struct ras_events *ras, struct mce_event *ev);
-- 
1.9.1




* [PATCH 5/6] rasdaemon:add logging HiSilicon HIP08 H/W errors reported in the OEM format2
  2019-06-17 14:28 ` [PATCH 0/6] rasdaemon:add logging of HiSilicon HIP08 non-standard H/W errors and changes in the error decoding code Shiju Jose
                     ` (3 preceding siblings ...)
  2019-06-17 14:28   ` [PATCH 4/6] rasdaemon:add logging HiSilicon HIP08 H/W errors reported in the OEM format1 Shiju Jose
@ 2019-06-17 14:28   ` Shiju Jose
  2019-06-17 14:28   ` [PATCH 6/6] rasdaemon:add logging HiSilicon HIP08 PCIe local errors Shiju Jose
  2019-06-21 18:42   ` [PATCH 0/6] rasdaemon:add logging of HiSilicon HIP08 non-standard H/W errors and changes in the error decoding code Mauro Carvalho Chehab
  6 siblings, 0 replies; 8+ messages in thread
From: Shiju Jose @ 2019-06-17 14:28 UTC (permalink / raw)
  To: mchehab, linux-edac, linuxarm; +Cc: Shiju Jose

This patch adds logging of the HiSilicon HIP08 H/W errors reported
in the non-standard OEM format2.
These errors come from the H/W modules SMMU, HHA, HLLC, PA and DDRC.

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
 non-standard-hisi_hip08.c | 300 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 300 insertions(+)
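
For format2 the sub-module ID is rendered differently per module: HHA
and DDRC IDs map to a "TA"/"TB" prefix plus an index, the other
modules print the number. The patch spells this out as if/else
chains. The sketch below shows an arithmetic equivalent of the same
mapping, offered only as an illustration: demo_sub_module_name() and
the DEMO_* constants are hypothetical names, the module ID values are
taken from the definitions in this patch, and the output buffer is
assumed to be at least one byte long.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Module IDs as defined for the OEM format2 in this patch. */
enum {
	DEMO_MODULE_SMMU = 0,
	DEMO_MODULE_HHA  = 1,
	DEMO_MODULE_HLLC = 2,
	DEMO_MODULE_PA   = 3,
	DEMO_MODULE_DDRC = 4,
};

/* Write the sub-module name into `out`; empty string if unknown. */
static void demo_sub_module_name(char *out, size_t out_sz,
				 uint8_t module_id, uint8_t sub_module_id)
{
	const char *name;
	unsigned per_group;	/* instances behind each TA/TB prefix */

	out[0] = '\0';
	switch (module_id) {
	case DEMO_MODULE_SMMU:
	case DEMO_MODULE_HLLC:
	case DEMO_MODULE_PA:
		snprintf(out, out_sz, "%u", (unsigned)sub_module_id);
		return;
	case DEMO_MODULE_HHA:
		name = "HHA";
		per_group = 2;
		break;
	case DEMO_MODULE_DDRC:
		name = "DDRC";
		per_group = 4;
		break;
	default:
		return;
	}

	if (sub_module_id >= 2 * per_group)
		return;		/* outside the documented range */

	snprintf(out, out_sz, "%s %s%u",
		 sub_module_id / per_group ? "TB" : "TA",
		 name, sub_module_id % per_group);
}
```

The explicit chains in the patch are easier to compare against a
hardware manual, while the arithmetic form is shorter; either gives
the same names for the IDs listed in the patch.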

diff --git a/non-standard-hisi_hip08.c b/non-standard-hisi_hip08.c
index 240e832..6fe5cbe 100644
--- a/non-standard-hisi_hip08.c
+++ b/non-standard-hisi_hip08.c
@@ -43,6 +43,20 @@
 #define HISI_OEM_TYPE1_VALID_ERR_MISC_4	BIT(10)
 #define HISI_OEM_TYPE1_VALID_ERR_ADDR	BIT(11)
 
+/* HISI OEM format2 error definitions */
+#define HISI_OEM_MODULE_ID_SMMU	0
+#define HISI_OEM_MODULE_ID_HHA	1
+#define HISI_OEM_MODULE_ID_HLLC	2
+#define HISI_OEM_MODULE_ID_PA	3
+#define HISI_OEM_MODULE_ID_DDRC	4
+
+#define HISI_OEM_TYPE2_VALID_ERR_FR	BIT(6)
+#define HISI_OEM_TYPE2_VALID_ERR_CTRL	BIT(7)
+#define HISI_OEM_TYPE2_VALID_ERR_STATUS	BIT(8)
+#define HISI_OEM_TYPE2_VALID_ERR_ADDR	BIT(9)
+#define HISI_OEM_TYPE2_VALID_ERR_MISC_0	BIT(10)
+#define HISI_OEM_TYPE2_VALID_ERR_MISC_1	BIT(11)
+
 struct hisi_oem_type1_err_sec {
 	uint32_t   val_bits;
 	uint8_t    version;
@@ -61,6 +75,30 @@ struct hisi_oem_type1_err_sec {
 	uint64_t   err_addr;
 };
 
+struct hisi_oem_type2_err_sec {
+	uint32_t   val_bits;
+	uint8_t    version;
+	uint8_t    soc_id;
+	uint8_t    socket_id;
+	uint8_t    nimbus_id;
+	uint8_t    module_id;
+	uint8_t    sub_module_id;
+	uint8_t    err_severity;
+	uint8_t    reserv;
+	uint32_t   err_fr_0;
+	uint32_t   err_fr_1;
+	uint32_t   err_ctrl_0;
+	uint32_t   err_ctrl_1;
+	uint32_t   err_status_0;
+	uint32_t   err_status_1;
+	uint32_t   err_addr_0;
+	uint32_t   err_addr_1;
+	uint32_t   err_misc0_0;
+	uint32_t   err_misc0_1;
+	uint32_t   err_misc1_0;
+	uint32_t   err_misc1_1;
+};
+
 enum hisi_oem_data_type {
 	hisi_oem_data_type_int,
 	hisi_oem_data_type_int64,
@@ -84,6 +122,29 @@ enum {
 	hip08_oem_type1_field_err_addr,
 };
 
+enum {
+	hip08_oem_type2_field_id,
+	hip08_oem_type2_field_version,
+	hip08_oem_type2_field_soc_id,
+	hip08_oem_type2_field_socket_id,
+	hip08_oem_type2_field_nimbus_id,
+	hip08_oem_type2_field_module_id,
+	hip08_oem_type2_field_sub_module_id,
+	hip08_oem_type2_field_err_sev,
+	hip08_oem_type2_field_err_fr_0,
+	hip08_oem_type2_field_err_fr_1,
+	hip08_oem_type2_field_err_ctrl_0,
+	hip08_oem_type2_field_err_ctrl_1,
+	hip08_oem_type2_field_err_status_0,
+	hip08_oem_type2_field_err_status_1,
+	hip08_oem_type2_field_err_addr_0,
+	hip08_oem_type2_field_err_addr_1,
+	hip08_oem_type2_field_err_misc0_0,
+	hip08_oem_type2_field_err_misc0_1,
+	hip08_oem_type2_field_err_misc1_0,
+	hip08_oem_type2_field_err_misc1_1,
+};
+
 /* helper functions */
 static char *err_severity(uint8_t err_sev)
 {
@@ -113,6 +174,62 @@ static char *oem_type1_module_name(uint8_t module_id)
 	return "unknown";
 }
 
+static char *oem_type2_module_name(uint8_t module_id)
+{
+	switch (module_id) {
+	case HISI_OEM_MODULE_ID_SMMU: return "SMMU";
+	case HISI_OEM_MODULE_ID_HHA: return "HHA";
+	case HISI_OEM_MODULE_ID_HLLC: return "HLLC";
+	case HISI_OEM_MODULE_ID_PA: return "PA";
+	case HISI_OEM_MODULE_ID_DDRC: return "DDRC";
+	}
+	return "unknown module";
+}
+
+static char *oem_type2_sub_module_id(char *p, uint8_t module_id,
+				     uint8_t sub_module_id)
+{
+	switch (module_id) {
+	case HISI_OEM_MODULE_ID_SMMU:
+	case HISI_OEM_MODULE_ID_HLLC:
+	case HISI_OEM_MODULE_ID_PA:
+		p += sprintf(p, "%d ", sub_module_id);
+		break;
+
+	case HISI_OEM_MODULE_ID_HHA:
+		if (sub_module_id == 0)
+			p += sprintf(p, "TA HHA0 ");
+		else if (sub_module_id == 1)
+			p += sprintf(p, "TA HHA1 ");
+		else if (sub_module_id == 2)
+			p += sprintf(p, "TB HHA0 ");
+		else if (sub_module_id == 3)
+			p += sprintf(p, "TB HHA1 ");
+		break;
+
+	case HISI_OEM_MODULE_ID_DDRC:
+		if (sub_module_id == 0)
+			p += sprintf(p, "TA DDRC0 ");
+		else if (sub_module_id == 1)
+			p += sprintf(p, "TA DDRC1 ");
+		else if (sub_module_id == 2)
+			p += sprintf(p, "TA DDRC2 ");
+		else if (sub_module_id == 3)
+			p += sprintf(p, "TA DDRC3 ");
+		else if (sub_module_id == 4)
+			p += sprintf(p, "TB DDRC0 ");
+		else if (sub_module_id == 5)
+			p += sprintf(p, "TB DDRC1 ");
+		else if (sub_module_id == 6)
+			p += sprintf(p, "TB DDRC2 ");
+		else if (sub_module_id == 7)
+			p += sprintf(p, "TB DDRC3 ");
+		break;
+	}
+
+	return p;
+}
+
 #ifdef HAVE_SQLITE3
 static const struct db_fields hip08_oem_type1_event_fields[] = {
 	{ .name = "id",			.type = "INTEGER PRIMARY KEY" },
@@ -137,6 +254,35 @@ static const struct db_table_descriptor hip08_oem_type1_event_tab = {
 	.num_fields = ARRAY_SIZE(hip08_oem_type1_event_fields),
 };
 
+static const struct db_fields hip08_oem_type2_event_fields[] = {
+	{ .name = "id",                 .type = "INTEGER PRIMARY KEY" },
+	{ .name = "version",            .type = "INTEGER" },
+	{ .name = "soc_id",             .type = "INTEGER" },
+	{ .name = "socket_id",          .type = "INTEGER" },
+	{ .name = "nimbus_id",          .type = "INTEGER" },
+	{ .name = "module_id",          .type = "TEXT" },
+	{ .name = "sub_module_id",      .type = "INTEGER" },
+	{ .name = "err_severity",       .type = "TEXT" },
+	{ .name = "err_fr_0",		.type = "INTEGER" },
+	{ .name = "err_fr_1",		.type = "INTEGER" },
+	{ .name = "err_ctrl_0",		.type = "INTEGER" },
+	{ .name = "err_ctrl_1",		.type = "INTEGER" },
+	{ .name = "err_status_0",	.type = "INTEGER" },
+	{ .name = "err_status_1",	.type = "INTEGER" },
+	{ .name = "err_addr_0",         .type = "INTEGER" },
+	{ .name = "err_addr_1",         .type = "INTEGER" },
+	{ .name = "err_misc0_0",	.type = "INTEGER" },
+	{ .name = "err_misc0_1",	.type = "INTEGER" },
+	{ .name = "err_misc1_0",	.type = "INTEGER" },
+	{ .name = "err_misc1_1",	.type = "INTEGER" },
+};
+
+static const struct db_table_descriptor hip08_oem_type2_event_tab = {
+	.name = "hip08_oem_type2_event",
+	.fields = hip08_oem_type2_event_fields,
+	.num_fields = ARRAY_SIZE(hip08_oem_type2_event_fields),
+};
+
 static void record_vendor_data(struct ras_ns_dec_tab *dec_tab,
 			       enum hisi_oem_data_type data_type,
 			       int id, int64_t data, const char *text)
@@ -317,11 +463,165 @@ static int decode_hip08_oem_type1_error(struct ras_events *ras,
 	return 0;
 }
 
+static int decode_hip08_oem_type2_error(struct ras_events *ras,
+					struct ras_ns_dec_tab *dec_tab,
+					struct trace_seq *s, const void *error)
+{
+	const struct hisi_oem_type2_err_sec *err = error;
+	char buf[1024];
+	char *p = buf;
+
+	if (err->val_bits == 0) {
+		trace_seq_printf(s, "%s: no valid error information\n",
+				 __func__);
+		return -1;
+	}
+
+#ifdef HAVE_SQLITE3
+	if (!dec_tab->stmt_dec_record) {
+		if (ras_mc_add_vendor_table(ras, &dec_tab->stmt_dec_record,
+			&hip08_oem_type2_event_tab) != SQLITE_OK) {
+			trace_seq_printf(s,
+				"create sql hip08_oem_type2_event_tab fail\n");
+			return -1;
+		}
+	}
+#endif
+	p += sprintf(p, "[ ");
+	p += sprintf(p, "Table version=%d ", err->version);
+	record_vendor_data(dec_tab, hisi_oem_data_type_int,
+			   hip08_oem_type2_field_version,
+			   err->version, NULL);
+	if (err->val_bits & HISI_OEM_VALID_SOC_ID) {
+		p += sprintf(p, "SOC ID=%d ", err->soc_id);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_oem_type2_field_soc_id,
+				   err->soc_id, NULL);
+	}
+
+	if (err->val_bits & HISI_OEM_VALID_SOCKET_ID) {
+		p += sprintf(p, "socket ID=%d ", err->socket_id);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_oem_type2_field_socket_id,
+				   err->socket_id, NULL);
+	}
+
+	if (err->val_bits & HISI_OEM_VALID_NIMBUS_ID) {
+		p += sprintf(p, "nimbus ID=%d ", err->nimbus_id);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_oem_type2_field_nimbus_id,
+				   err->nimbus_id, NULL);
+	}
+
+	if (err->val_bits & HISI_OEM_VALID_MODULE_ID) {
+		p += sprintf(p, "module=%s ",
+			     oem_type2_module_name(err->module_id));
+		record_vendor_data(dec_tab, hisi_oem_data_type_text,
+				   hip08_oem_type2_field_module_id,
+				   0, oem_type2_module_name(err->module_id));
+	}
+
+	if (err->val_bits & HISI_OEM_VALID_SUB_MODULE_ID) {
+		p = oem_type2_sub_module_id(p, err->module_id,
+					    err->sub_module_id);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_oem_type2_field_sub_module_id,
+				   err->sub_module_id, NULL);
+	}
+
+	if (err->val_bits & HISI_OEM_VALID_ERR_SEVERITY) {
+		p += sprintf(p, "error severity=%s ",
+			     err_severity(err->err_severity));
+		record_vendor_data(dec_tab, hisi_oem_data_type_text,
+				   hip08_oem_type2_field_err_sev,
+				   0, err_severity(err->err_severity));
+	}
+
+	p += sprintf(p, "]");
+	trace_seq_printf(s, "\nHISI HIP08: OEM Type-2 Error\n");
+	trace_seq_printf(s, "%s\n", buf);
+
+	trace_seq_printf(s, "Reg Dump:\n");
+	if (err->val_bits & HISI_OEM_TYPE2_VALID_ERR_FR) {
+		trace_seq_printf(s, "ERR_FR_0=0x%x\n", err->err_fr_0);
+		trace_seq_printf(s, "ERR_FR_1=0x%x\n", err->err_fr_1);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_oem_type2_field_err_fr_0,
+				   err->err_fr_0, NULL);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_oem_type2_field_err_fr_1,
+				   err->err_fr_1, NULL);
+	}
+
+	if (err->val_bits & HISI_OEM_TYPE2_VALID_ERR_CTRL) {
+		trace_seq_printf(s, "ERR_CTRL_0=0x%x\n", err->err_ctrl_0);
+		trace_seq_printf(s, "ERR_CTRL_1=0x%x\n", err->err_ctrl_1);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_oem_type2_field_err_ctrl_0,
+				   err->err_ctrl_0, NULL);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_oem_type2_field_err_ctrl_1,
+				   err->err_ctrl_1, NULL);
+	}
+
+	if (err->val_bits & HISI_OEM_TYPE2_VALID_ERR_STATUS) {
+		trace_seq_printf(s, "ERR_STATUS_0=0x%x\n", err->err_status_0);
+		trace_seq_printf(s, "ERR_STATUS_1=0x%x\n", err->err_status_1);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_oem_type2_field_err_status_0,
+				   err->err_status_0, NULL);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_oem_type2_field_err_status_1,
+				   err->err_status_1, NULL);
+	}
+
+	if (err->val_bits & HISI_OEM_TYPE2_VALID_ERR_ADDR) {
+		trace_seq_printf(s, "ERR_ADDR_0=0x%x\n", err->err_addr_0);
+		trace_seq_printf(s, "ERR_ADDR_1=0x%x\n", err->err_addr_1);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_oem_type2_field_err_addr_0,
+				   err->err_addr_0, NULL);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_oem_type2_field_err_addr_1,
+				   err->err_addr_1, NULL);
+	}
+
+	if (err->val_bits & HISI_OEM_TYPE2_VALID_ERR_MISC_0) {
+		trace_seq_printf(s, "ERR_MISC0_0=0x%x\n", err->err_misc0_0);
+		trace_seq_printf(s, "ERR_MISC0_1=0x%x\n", err->err_misc0_1);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_oem_type2_field_err_misc0_0,
+				   err->err_misc0_0, NULL);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_oem_type2_field_err_misc0_1,
+				   err->err_misc0_1, NULL);
+	}
+
+	if (err->val_bits & HISI_OEM_TYPE2_VALID_ERR_MISC_1) {
+		trace_seq_printf(s, "ERR_MISC1_0=0x%x\n", err->err_misc1_0);
+		trace_seq_printf(s, "ERR_MISC1_1=0x%x\n", err->err_misc1_1);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_oem_type2_field_err_misc1_0,
+				   err->err_misc1_0, NULL);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_oem_type2_field_err_misc1_1,
+				   err->err_misc1_1, NULL);
+	}
+
+	step_vendor_data_tab(dec_tab, "hip08_oem_type2_event_tab");
+
+	return 0;
+}
+
 struct ras_ns_dec_tab hip08_ns_oem_tab[] = {
 	{
 		.sec_type = "1f8161e155d641e6bd107afd1dc5f7c5",
 		.decode = decode_hip08_oem_type1_error,
 	},
+	{
+		.sec_type = "45534ea6ce2341158535e07ab3aef91d",
+		.decode = decode_hip08_oem_type2_error,
+	},
 	{ /* sentinel */ }
 };
 
-- 
1.9.1
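
Note on the summary string: decode_hip08_oem_type2_error() above builds its
one-line summary with chained sprintf() calls into a 1024-byte stack buffer.
The fields printed today fit comfortably, but a bounded append would keep the
pattern safe if more fields are added later. A minimal sketch, assuming a
hypothetical helper named buf_append() that is not part of rasdaemon:

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of rasdaemon): append formatted text at
 * offset `pos` of a `size`-byte buffer and return the new offset.
 * The returned offset never exceeds size - 1, so calls can be chained
 * safely even after truncation, and the buffer stays NUL-terminated.
 */
static size_t buf_append(char *buf, size_t size, size_t pos,
			 const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(size > 0 && pos < size);

	va_start(ap, fmt);
	n = vsnprintf(buf + pos, size - pos, fmt, ap);
	va_end(ap);

	if (n < 0) {
		buf[pos] = '\0';	/* formatting failed: drop this piece */
		return pos;
	}
	if ((size_t)n >= size - pos)
		return size - 1;	/* output was truncated */
	return pos + (size_t)n;
}
```

With such a helper, each `p += sprintf(p, "SOC ID=%d ", err->soc_id);` would
become `pos = buf_append(buf, sizeof(buf), pos, "SOC ID=%d ", err->soc_id);`.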




* [PATCH 6/6] rasdaemon:add logging HiSilicon HIP08 PCIe local errors
  2019-06-17 14:28 ` [PATCH 0/6] rasdaemon:add logging of HiSilicon HIP08 non-standard H/W errors and changes in the error decoding code Shiju Jose
                     ` (4 preceding siblings ...)
  2019-06-17 14:28   ` [PATCH 5/6] rasdaemon:add logging HiSilicon HIP08 H/W errors reported in the OEM format2 Shiju Jose
@ 2019-06-17 14:28   ` Shiju Jose
  2019-06-21 18:42   ` [PATCH 0/6] rasdaemon:add logging of HiSilicon HIP08 non-standard H/W errors and changes in the error decoding code Mauro Carvalho Chehab
  6 siblings, 0 replies; 8+ messages in thread
From: Shiju Jose @ 2019-06-17 14:28 UTC (permalink / raw)
  To: mchehab, linux-edac, linuxarm; +Cc: Shiju Jose

This patch adds logging for the HiSilicon HIP08 PCIe local errors.

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
 non-standard-hisi_hip08.c | 223 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 223 insertions(+)
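
Note on the record layout: the decoder in this patch casts the raw section
payload straight to struct hisi_pcie_local_err_sec, so the compiler's layout
has to match the layout of the reported record. The sketch below restates the
struct from the diff and pins down what a compiler with natural alignment
produces. The macro name HISI_PCIE_LOCAL_ERR_MISC_MAX is an assumption made
for this example (the patch uses the literal 33), and the offsets describe
the compiler's layout only; they are not taken from a firmware specification.

```c
#include <assert.h>	/* static_assert, C11 */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name; the patch hard-codes 33. */
#define HISI_PCIE_LOCAL_ERR_MISC_MAX	33

/* Same members, in the same order, as the struct added by the patch. */
struct hisi_pcie_local_err_sec {
	uint64_t   val_bits;
	uint8_t    version;
	uint8_t    soc_id;
	uint8_t    socket_id;
	uint8_t    nimbus_id;
	uint8_t    sub_module_id;
	uint8_t    core_id;
	uint8_t    port_id;
	uint8_t    err_severity;
	uint16_t   err_type;
	uint8_t    reserv[2];
	uint32_t   err_misc[HISI_PCIE_LOCAL_ERR_MISC_MAX];
};

/* 8 bytes of val_bits plus eight one-byte IDs put err_type at offset 16. */
static_assert(offsetof(struct hisi_pcie_local_err_sec, err_type) == 16,
	      "unexpected padding before err_type");
/* err_type (2 bytes) plus reserv[2] put the register dump at offset 20. */
static_assert(offsetof(struct hisi_pcie_local_err_sec, err_misc) == 20,
	      "unexpected padding before err_misc");
/* 20 bytes of header plus 33 32-bit words, with no tail padding. */
static_assert(sizeof(struct hisi_pcie_local_err_sec) == 152,
	      "unexpected tail padding");
```

If any of these assertions fired on a given target, the struct would need an
explicit packing attribute or field-by-field parsing instead of a cast.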

diff --git a/non-standard-hisi_hip08.c b/non-standard-hisi_hip08.c
index 6fe5cbe..ae543d6 100644
--- a/non-standard-hisi_hip08.c
+++ b/non-standard-hisi_hip08.c
@@ -57,6 +57,24 @@
 #define HISI_OEM_TYPE2_VALID_ERR_MISC_0	BIT(10)
 #define HISI_OEM_TYPE2_VALID_ERR_MISC_1	BIT(11)
 
+/* HISI PCIe Local error definitions */
+#define HISI_PCIE_SUB_MODULE_ID_AP	0
+#define HISI_PCIE_SUB_MODULE_ID_TL	1
+#define HISI_PCIE_SUB_MODULE_ID_MAC	2
+#define HISI_PCIE_SUB_MODULE_ID_DL	3
+#define HISI_PCIE_SUB_MODULE_ID_SDI	4
+
+#define HISI_PCIE_LOCAL_VALID_VERSION		BIT(0)
+#define HISI_PCIE_LOCAL_VALID_SOC_ID		BIT(1)
+#define HISI_PCIE_LOCAL_VALID_SOCKET_ID		BIT(2)
+#define HISI_PCIE_LOCAL_VALID_NIMBUS_ID		BIT(3)
+#define HISI_PCIE_LOCAL_VALID_SUB_MODULE_ID	BIT(4)
+#define HISI_PCIE_LOCAL_VALID_CORE_ID		BIT(5)
+#define HISI_PCIE_LOCAL_VALID_PORT_ID		BIT(6)
+#define HISI_PCIE_LOCAL_VALID_ERR_TYPE		BIT(7)
+#define HISI_PCIE_LOCAL_VALID_ERR_SEVERITY	BIT(8)
+#define HISI_PCIE_LOCAL_VALID_ERR_MISC		9
+
 struct hisi_oem_type1_err_sec {
 	uint32_t   val_bits;
 	uint8_t    version;
@@ -99,6 +117,21 @@ struct hisi_oem_type2_err_sec {
 	uint32_t   err_misc1_1;
 };
 
+struct hisi_pcie_local_err_sec {
+	uint64_t   val_bits;
+	uint8_t    version;
+	uint8_t    soc_id;
+	uint8_t    socket_id;
+	uint8_t    nimbus_id;
+	uint8_t    sub_module_id;
+	uint8_t    core_id;
+	uint8_t    port_id;
+	uint8_t    err_severity;
+	uint16_t   err_type;
+	uint8_t    reserv[2];
+	uint32_t   err_misc[33];
+};
+
 enum hisi_oem_data_type {
 	hisi_oem_data_type_int,
 	hisi_oem_data_type_int64,
@@ -145,6 +178,20 @@ enum {
 	hip08_oem_type2_field_err_misc1_1,
 };
 
+enum {
+	hip08_pcie_local_field_id,
+	hip08_pcie_local_field_version,
+	hip08_pcie_local_field_soc_id,
+	hip08_pcie_local_field_socket_id,
+	hip08_pcie_local_field_nimbus_id,
+	hip08_pcie_local_field_sub_module_id,
+	hip08_pcie_local_field_core_id,
+	hip08_pcie_local_field_port_id,
+	hip08_pcie_local_field_err_sev,
+	hip08_pcie_local_field_err_type,
+	hip08_pcie_local_field_err_misc,
+};
+
 /* helper functions */
 static char *err_severity(uint8_t err_sev)
 {
@@ -230,6 +277,18 @@ static char *oem_type2_sub_module_id(char *p, uint8_t module_id,
 	return p;
 }
 
+static char *pcie_local_sub_module_name(uint8_t id)
+{
+	switch (id) {
+	case HISI_PCIE_SUB_MODULE_ID_AP: return "AP Layer";
+	case HISI_PCIE_SUB_MODULE_ID_TL: return "TL Layer";
+	case HISI_PCIE_SUB_MODULE_ID_MAC: return "MAC Layer";
+	case HISI_PCIE_SUB_MODULE_ID_DL: return "DL Layer";
+	case HISI_PCIE_SUB_MODULE_ID_SDI: return "SDI Layer";
+	}
+	return "unknown";
+}
+
 #ifdef HAVE_SQLITE3
 static const struct db_fields hip08_oem_type1_event_fields[] = {
 	{ .name = "id",			.type = "INTEGER PRIMARY KEY" },
@@ -283,6 +342,58 @@ static const struct db_table_descriptor hip08_oem_type2_event_tab = {
 	.num_fields = ARRAY_SIZE(hip08_oem_type2_event_fields),
 };
 
+static const struct db_fields hip08_pcie_local_event_fields[] = {
+	{ .name = "id",                 .type = "INTEGER PRIMARY KEY" },
+	{ .name = "version",            .type = "INTEGER" },
+	{ .name = "soc_id",             .type = "INTEGER" },
+	{ .name = "socket_id",          .type = "INTEGER" },
+	{ .name = "nimbus_id",          .type = "INTEGER" },
+	{ .name = "sub_module_id",      .type = "TEXT" },
+	{ .name = "core_id",		.type = "INTEGER" },
+	{ .name = "port_id",		.type = "INTEGER" },
+	{ .name = "err_severity",       .type = "TEXT" },
+	{ .name = "err_type",		.type = "INTEGER" },
+	{ .name = "err_misc0",		.type = "INTEGER" },
+	{ .name = "err_misc1",		.type = "INTEGER" },
+	{ .name = "err_misc2",		.type = "INTEGER" },
+	{ .name = "err_misc3",		.type = "INTEGER" },
+	{ .name = "err_misc4",		.type = "INTEGER" },
+	{ .name = "err_misc5",		.type = "INTEGER" },
+	{ .name = "err_misc6",		.type = "INTEGER" },
+	{ .name = "err_misc7",		.type = "INTEGER" },
+	{ .name = "err_misc8",		.type = "INTEGER" },
+	{ .name = "err_misc9",		.type = "INTEGER" },
+	{ .name = "err_misc10",		.type = "INTEGER" },
+	{ .name = "err_misc11",		.type = "INTEGER" },
+	{ .name = "err_misc12",		.type = "INTEGER" },
+	{ .name = "err_misc13",		.type = "INTEGER" },
+	{ .name = "err_misc14",		.type = "INTEGER" },
+	{ .name = "err_misc15",		.type = "INTEGER" },
+	{ .name = "err_misc16",		.type = "INTEGER" },
+	{ .name = "err_misc17",		.type = "INTEGER" },
+	{ .name = "err_misc18",		.type = "INTEGER" },
+	{ .name = "err_misc19",		.type = "INTEGER" },
+	{ .name = "err_misc20",		.type = "INTEGER" },
+	{ .name = "err_misc21",		.type = "INTEGER" },
+	{ .name = "err_misc22",		.type = "INTEGER" },
+	{ .name = "err_misc23",		.type = "INTEGER" },
+	{ .name = "err_misc24",		.type = "INTEGER" },
+	{ .name = "err_misc25",		.type = "INTEGER" },
+	{ .name = "err_misc26",		.type = "INTEGER" },
+	{ .name = "err_misc27",		.type = "INTEGER" },
+	{ .name = "err_misc28",		.type = "INTEGER" },
+	{ .name = "err_misc29",		.type = "INTEGER" },
+	{ .name = "err_misc30",		.type = "INTEGER" },
+	{ .name = "err_misc31",		.type = "INTEGER" },
+	{ .name = "err_misc32",		.type = "INTEGER" },
+};
+
+static const struct db_table_descriptor hip08_pcie_local_event_tab = {
+	.name = "hip08_pcie_local_event",
+	.fields = hip08_pcie_local_event_fields,
+	.num_fields = ARRAY_SIZE(hip08_pcie_local_event_fields),
+};
+
 static void record_vendor_data(struct ras_ns_dec_tab *dec_tab,
 			       enum hisi_oem_data_type data_type,
 			       int id, int64_t data, const char *text)
@@ -613,6 +724,114 @@ static int decode_hip08_oem_type2_error(struct ras_events *ras,
 	return 0;
 }
 
+static int decode_hip08_pcie_local_error(struct ras_events *ras,
+					 struct ras_ns_dec_tab *dec_tab,
+					 struct trace_seq *s, const void *error)
+{
+	const struct hisi_pcie_local_err_sec *err = error;
+	char buf[1024];
+	char *p = buf;
+	uint32_t i;
+
+	if (err->val_bits == 0) {
+		trace_seq_printf(s, "%s: no valid error information\n",
+				 __func__);
+		return -1;
+	}
+
+#ifdef HAVE_SQLITE3
+	if (!dec_tab->stmt_dec_record) {
+		if (ras_mc_add_vendor_table(ras, &dec_tab->stmt_dec_record,
+				&hip08_pcie_local_event_tab) != SQLITE_OK) {
+			trace_seq_printf(s,
+				"create sql hip08_pcie_local_event_tab fail\n");
+			return -1;
+		}
+	}
+#endif
+	p += sprintf(p, "[ ");
+	p += sprintf(p, "Table version=%d ", err->version);
+	record_vendor_data(dec_tab, hisi_oem_data_type_int,
+			   hip08_pcie_local_field_version,
+			   err->version, NULL);
+	if (err->val_bits & HISI_PCIE_LOCAL_VALID_SOC_ID) {
+		p += sprintf(p, "SOC ID=%d ", err->soc_id);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_pcie_local_field_soc_id,
+				   err->soc_id, NULL);
+	}
+
+	if (err->val_bits & HISI_PCIE_LOCAL_VALID_SOCKET_ID) {
+		p += sprintf(p, "socket ID=%d ", err->socket_id);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_pcie_local_field_socket_id,
+				   err->socket_id, NULL);
+	}
+
+	if (err->val_bits & HISI_PCIE_LOCAL_VALID_NIMBUS_ID) {
+		p += sprintf(p, "nimbus ID=%d ", err->nimbus_id);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_pcie_local_field_nimbus_id,
+				   err->nimbus_id, NULL);
+	}
+
+	if (err->val_bits & HISI_PCIE_LOCAL_VALID_SUB_MODULE_ID) {
+		p += sprintf(p, "sub module=%s ",
+			     pcie_local_sub_module_name(err->sub_module_id));
+		record_vendor_data(dec_tab, hisi_oem_data_type_text,
+				   hip08_pcie_local_field_sub_module_id,
+				   0, pcie_local_sub_module_name(err->sub_module_id));
+	}
+
+	if (err->val_bits & HISI_PCIE_LOCAL_VALID_CORE_ID) {
+		p += sprintf(p, "core ID=core%d ", err->core_id);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_pcie_local_field_core_id,
+				   err->core_id, NULL);
+	}
+
+	if (err->val_bits & HISI_PCIE_LOCAL_VALID_PORT_ID) {
+		p += sprintf(p, "port ID=port%d ", err->port_id);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_pcie_local_field_port_id,
+				   err->port_id, NULL);
+	}
+
+	if (err->val_bits & HISI_PCIE_LOCAL_VALID_ERR_SEVERITY) {
+		p += sprintf(p, "error severity=%s ",
+			     err_severity(err->err_severity));
+		record_vendor_data(dec_tab, hisi_oem_data_type_text,
+				   hip08_pcie_local_field_err_sev,
+				   0, err_severity(err->err_severity));
+	}
+
+	if (err->val_bits & HISI_PCIE_LOCAL_VALID_ERR_TYPE) {
+		p += sprintf(p, "error type=0x%x ", err->err_type);
+		record_vendor_data(dec_tab, hisi_oem_data_type_int,
+				   hip08_pcie_local_field_err_type,
+				   err->err_type, NULL);
+	}
+	p += sprintf(p, "]");
+
+	trace_seq_printf(s, "\nHISI HIP08: PCIe local error\n");
+	trace_seq_printf(s, "%s\n", buf);
+
+	trace_seq_printf(s, "Reg Dump:\n");
+	for (i = 0; i < 33; i++) {
+		if (err->val_bits & BIT(HISI_PCIE_LOCAL_VALID_ERR_MISC + i)) {
+			trace_seq_printf(s, "ERR_MISC_%u=0x%x\n", i,
+					 err->err_misc[i]);
+			record_vendor_data(dec_tab, hisi_oem_data_type_int,
+					   (hip08_pcie_local_field_err_misc + i),
+					   err->err_misc[i], NULL);
+		}
+	}
+
+	step_vendor_data_tab(dec_tab, "hip08_pcie_local_event_tab");
+
+	return 0;
+}
+
 struct ras_ns_dec_tab hip08_ns_oem_tab[] = {
 	{
 		.sec_type = "1f8161e155d641e6bd107afd1dc5f7c5",
@@ -622,6 +841,10 @@ struct ras_ns_dec_tab hip08_ns_oem_tab[] = {
 		.sec_type = "45534ea6ce2341158535e07ab3aef91d",
 		.decode = decode_hip08_oem_type2_error,
 	},
+	{
+		.sec_type = "b2889fc9e7d74f9da867af42e98be772",
+		.decode = decode_hip08_pcie_local_error,
+	},
 	{ /* sentinel */ }
 };
 
-- 
1.9.1
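
Note on the ERR_MISC valid bits: the register dump loop above tests
BIT(HISI_PCIE_LOCAL_VALID_ERR_MISC + i) for i in 0..32, that is, bits 9
through 41 of the 64-bit val_bits field. Whether that works for bits 32 and
above depends on how BIT() is defined elsewhere in rasdaemon, which is not
shown in this thread. The sketch below illustrates the same test with an
explicitly 64-bit mask; the names bit64(), count_valid_err_misc(),
ERR_MISC_FIRST_BIT and ERR_MISC_COUNT are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

#define ERR_MISC_FIRST_BIT	9	/* mirrors HISI_PCIE_LOCAL_VALID_ERR_MISC */
#define ERR_MISC_COUNT		33	/* number of err_misc[] words in the patch */

/* Build the mask from a 64-bit operand so that bits 32..63 are reachable. */
static inline uint64_t bit64(unsigned int nr)
{
	assert(nr < 64);
	return UINT64_C(1) << nr;
}

/* Count how many err_misc[] entries val_bits marks as valid. */
static unsigned int count_valid_err_misc(uint64_t val_bits)
{
	unsigned int i, n = 0;

	for (i = 0; i < ERR_MISC_COUNT; i++) {
		if (val_bits & bit64(ERR_MISC_FIRST_BIT + i))
			n++;
	}
	return n;
}
```

A mask built from a plain 32-bit int, by contrast, cannot represent bits 32
to 41, so the valid flags for err_misc[23] through err_misc[32] would not be
tested correctly.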




* Re: [PATCH 0/6] rasdaemon:add logging of HiSilicon HIP08 non-standard H/W errors and changes in the error decoding code
  2019-06-17 14:28 ` [PATCH 0/6] rasdaemon:add logging of HiSilicon HIP08 non-standard H/W errors and changes in the error decoding code Shiju Jose
                     ` (5 preceding siblings ...)
  2019-06-17 14:28   ` [PATCH 6/6] rasdaemon:add logging HiSilicon HIP08 PCIe local errors Shiju Jose
@ 2019-06-21 18:42   ` Mauro Carvalho Chehab
  6 siblings, 0 replies; 8+ messages in thread
From: Mauro Carvalho Chehab @ 2019-06-21 18:42 UTC (permalink / raw)
  To: Shiju Jose; +Cc: linux-edac, linuxarm

On Mon, 17 Jun 2019 15:28:46 +0100
Shiju Jose <shiju.jose@huawei.com> wrote:

> This patch set adds a few changes to the non-standard error decoding code and
> logging for the HiSilicon HIP08 non-standard H/W errors.
> 
> Shiju Jose (6):
>   rasdaemon:print non-standard error data if not decoded
>   rasdaemon: rearrange HiSilicon HIP07 decoding function table
>   rasdaemon: update iteration logic for the non-standard error decoding
>     functions
>   rasdaemon:add logging HiSilicon HIP08 H/W errors reported in the OEM
>     format1
>   rasdaemon:add logging HiSilicon HIP08 H/W errors reported in the OEM
>     format2
>   rasdaemon:add logging HiSilicon HIP08 PCIe local errors
> 
>  Makefile.am                |   2 +-
>  non-standard-hisi_hip07.c  |  36 +-
>  non-standard-hisi_hip08.c  | 855 +++++++++++++++++++++++++++++++++++++++++++++
>  ras-non-standard-handler.c |  36 +-
>  ras-non-standard-handler.h |   8 +-
>  ras-record.c               |  30 +-
>  ras-record.h               |  13 +
>  7 files changed, 932 insertions(+), 48 deletions(-)
>  create mode 100644 non-standard-hisi_hip08.c
> 

Applied, thanks!


Thanks,
Mauro
