Linux-EDAC Archive on lore.kernel.org
* [PATCH V2 0/6] CCIX rasdaemon support
@ 2019-08-27 11:30 Jonathan Cameron
  2019-08-27 11:30 ` [PATCH V2 1/6] rasdaemon: CCIX: memory error support Jonathan Cameron
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Jonathan Cameron @ 2019-08-27 11:30 UTC
  To: Mauro Carvalho Chehab, linux-edac
  Cc: linuxarm, jcm, shiju.jose, Jonathan Cameron

Depends on the kernel patches being accepted:
https://lore.kernel.org/linux-edac/20190820144732.2370-1-Jonathan.Cameron@huawei.com/T/#t

Changes since v1:
* Separated out the ras-record section into its own file.
* Rebased on current rasdaemon tree.

This series introduces rasdaemon support matching the above kernel
series, which provides the tracepoints for CCIX PER error reporting from
the kernel to userspace.

These are errors that occur at the CCIX protocol layer, which sits
on top of PCIe (for which we have AER).  They are defined in the
CCIX base specification v1.0, an evaluation version of which is available
at www.ccixconsortium.org.

Note that the following is a trademark grant and does not restrict
normal use covered by fair use.  Since this set doesn't quote from
the spec (other than field names), it carries none of the copyright
notices described below.

This patch is being distributed by the CCIX Consortium, Inc. (CCIX) to
you and other parties that are participating (the "participants") in
the rasdaemon project with the understanding that the participants will use CCIX's
name and trademark only when this patch is used in association with
rasdaemon.

CCIX is also distributing this patch to these participants with the
understanding that if any portion of the CCIX specification will be
used or referenced in rasdaemon, the participants will not modify
the cited portion of the CCIX specification and will give CCIX proper
copyright attribution by including the following copyright notice with
the cited part of the CCIX specification:
"© 2019 CCIX CONSORTIUM, INC. ALL RIGHTS RESERVED."

Jonathan Cameron (6):
  rasdaemon: CCIX: memory error support
  rasdaemon: CCIX: Cache error support
  rasdaemon: CCIX: ATC error support
  rasdaemon: CCIX: Port error support
  rasdaemon: CCIX: Link error support
  rasdaemon: CCIX: Agent Internal error support

 Makefile.am        |   8 +-
 configure.ac       |  10 +
 ras-ccix-handler.c | 648 +++++++++++++++++++++++++++++++++++++++++++++
 ras-ccix-handler.h | 139 ++++++++++
 ras-events.c       |  61 +++++
 ras-record-ccix.c  | 596 +++++++++++++++++++++++++++++++++++++++++
 ras-record.c       |  15 +-
 ras-record.h       |  43 +++
 ras-report.h       |   6 +-
 9 files changed, 1519 insertions(+), 7 deletions(-)
 create mode 100644 ras-ccix-handler.c
 create mode 100644 ras-ccix-handler.h
 create mode 100644 ras-record-ccix.c

-- 
2.20.1



* [PATCH V2 1/6] rasdaemon: CCIX: memory error support
  2019-08-27 11:30 [PATCH V2 0/6] CCIX rasdaemon support Jonathan Cameron
@ 2019-08-27 11:30 ` Jonathan Cameron
  2019-08-27 11:30 ` [PATCH V2 2/6] rasdaemon: CCIX: Cache " Jonathan Cameron
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Jonathan Cameron @ 2019-08-27 11:30 UTC
  To: Mauro Carvalho Chehab, linux-edac
  Cc: linuxarm, jcm, shiju.jose, Jonathan Cameron

Adds support for basic decoding and logging of CCIX memory errors,
plus storing them in the sqlite3 DB.

Given that the CCIX memory record is very tightly defined by the
specification and that databases with large blobs in them
are not particularly useful, I have separately exposed all of the
standard fields.  Note that this means setting them to NULL if the
validation bits indicate that the field is not valid.

Includes making a few ras-record.c functions available from other
files to allow us to split off the CCIX error recording functionality.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
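As a rough illustration of the "NULL if not valid" behaviour described
above (not part of the patch), here is a minimal, self-contained sketch.
The struct demo_mem_err and the demo_* helpers are hypothetical, cut-down
stand-ins for struct cper_ccix_mem_err_compact and the binding code; only
the two bit values are taken from the CCIX_MEM_ERR_*_VALID defines in this
patch.  It renders each column as its number when the validation bit is
set and as the literal "NULL" otherwise.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, cut-down stand-in for struct cper_ccix_mem_err_compact. */
struct demo_mem_err {
	uint32_t validation_bits;
	uint8_t card;
	uint16_t bank;
};

/* Bit values copied from CCIX_MEM_ERR_CARD_VALID / CCIX_MEM_ERR_BANK_VALID. */
#define DEMO_CARD_VALID 0x0008
#define DEMO_BANK_VALID 0x0010

/*
 * Render one column value: the number if its validation bit is set,
 * otherwise the literal "NULL" (standing in for an unbound SQL parameter).
 */
static const char *demo_column(uint32_t valid, uint32_t bit, unsigned int v,
			       char *buf, size_t len)
{
	if (!(valid & bit))
		return "NULL";
	snprintf(buf, len, "%u", v);
	return buf;
}

/* Format the row as "card=<v> bank=<v>" into out and return out. */
static char *demo_format_row(const struct demo_mem_err *e, char *out,
			     size_t len)
{
	char c[16], b[16];

	snprintf(out, len, "card=%s bank=%s",
		 demo_column(e->validation_bits, DEMO_CARD_VALID, e->card,
			     c, sizeof(c)),
		 demo_column(e->validation_bits, DEMO_BANK_VALID, e->bank,
			     b, sizeof(b)));
	return out;
}
```

In the patch itself no "NULL" string is produced: the column's
sqlite3_bind_int() call is simply skipped when the bit is clear, and
sqlite3 treats a parameter that was never bound as NULL.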
 Makefile.am        |   8 +-
 configure.ac       |  10 ++
 ras-ccix-handler.c | 244 +++++++++++++++++++++++++++++++++++++++++++++
 ras-ccix-handler.h |  61 ++++++++++++
 ras-events.c       |  16 +++
 ras-record-ccix.c  | 204 +++++++++++++++++++++++++++++++++++++
 ras-record.c       |  15 ++-
 ras-record.h       |  28 ++++++
 ras-report.h       |   6 +-
 9 files changed, 585 insertions(+), 7 deletions(-)

diff --git a/Makefile.am b/Makefile.am
index 3d89672..9d54390 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -20,10 +20,16 @@ rasdaemon_SOURCES = rasdaemon.c ras-events.c ras-mc-handler.c \
 		    bitfield.c
 if WITH_SQLITE3
    rasdaemon_SOURCES += ras-record.c
+if WITH_CCIX
+   rasdaemon_SOURCES += ras-record-ccix.c
+endif
 endif
 if WITH_AER
    rasdaemon_SOURCES += ras-aer-handler.c
 endif
+if WITH_CCIX
+   rasdaemon_SOURCES += ras-ccix-handler.c
+endif
 if WITH_NON_STANDARD
    rasdaemon_SOURCES += ras-non-standard-handler.c
 endif
@@ -56,7 +62,7 @@ rasdaemon_LDADD = -lpthread $(SQLITE3_LIBS) libtrace/libtrace.a
 include_HEADERS = config.h  ras-events.h  ras-logger.h  ras-mc-handler.h \
 		  ras-aer-handler.h ras-mce-handler.h ras-record.h bitfield.h ras-report.h \
 		  ras-extlog-handler.h ras-arm-handler.h ras-non-standard-handler.h \
-		  ras-devlink-handler.h
+		  ras-devlink-handler.h ras-ccix-handler.h
 
 # This rule can't be called with more than one Makefile job (like make -j8)
 # I can't figure out a way to fix that
diff --git a/configure.ac b/configure.ac
index fecff51..ca8977c 100644
--- a/configure.ac
+++ b/configure.ac
@@ -44,6 +44,15 @@ AS_IF([test "x$enable_aer" = "xyes"], [
 ])
 AM_CONDITIONAL([WITH_AER], [test x$enable_aer = xyes])
 
+AC_ARG_ENABLE([ccix],
+    AS_HELP_STRING([--enable-ccix], [enable CCIX PER events (currently experimental)]))
+
+AS_IF([test "x$enable_ccix" = "xyes"], [
+  AC_DEFINE(HAVE_CCIX,1,"have CCIX PER events collect")
+  AC_SUBST([WITH_CCIX])
+])
+AM_CONDITIONAL([WITH_CCIX], [test x$enable_ccix = xyes])
+
 AC_ARG_ENABLE([non_standard],
     AS_HELP_STRING([--enable-non-standard], [enable NON_STANDARD events (currently experimental)]))
 
@@ -137,4 +146,5 @@ compile time options summary
     HIP07 SAS HW errors : $enable_hisi_ns_decode
     ARM events          : $enable_arm
     DEVLINK             : $enable_devlink
+    CCIX                : $enable_ccix
 EOF
diff --git a/ras-ccix-handler.c b/ras-ccix-handler.c
new file mode 100644
index 0000000..2be413f
--- /dev/null
+++ b/ras-ccix-handler.c
@@ -0,0 +1,244 @@
+/*
+ * Copyright (c) 2019 Hisilicon Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include "libtrace/kbuffer.h"
+#include "ras-record.h"
+#include "ras-logger.h"
+#include "bitfield.h"
+#include "ras-report.h"
+
+static char *ccix_mem_pool_type(uint8_t pt)
+{
+	switch (pt) {
+	case 0: return "other/not-specified";
+	case 1: return "ROM";
+	case 2: return "volatile";
+	case 3: return "non-volatile";
+	case 4: return "device/register";
+	}
+	if (pt >= 0x80)
+		return "vendor";
+	return "unknown";
+}
+
+static char *ccix_mem_spec_type(uint8_t st)
+{
+	switch (st) {
+	case 0: return "other/not-specified";
+	case 1: return "SRAM";
+	case 2: return "DDR";
+	case 3: return "NVDIMM-F";
+	case 4: return "NVDIMM-N";
+	case 5: return "HBM";
+	case 6: return "flash";
+	}
+	if (st >= 0x80)
+		return "vendor";
+	return "unknown";
+}
+
+static char *ccix_mem_op(uint8_t op)
+{
+	switch (op) {
+	case 0: return "generic";
+	case 1: return "read";
+	case 2: return "write";
+	case 4: return "scrub";
+	}
+	return "unknown";
+}
+
+static char *ccix_mem_err_type(int etype)
+{
+	switch (etype) {
+	case 0: return "unknown";
+	case 1: return "no error";
+	case 2: return "single-bit ECC";
+	case 3: return "multi-bit ECC";
+	case 4: return "single-symbol chipkill ECC";
+	case 5: return "multi-symbol chipkill ECC";
+	case 6: return "master abort";
+	case 7: return "target abort";
+	case 8: return "parity error";
+	case 9: return "watchdog timeout";
+	case 10: return "invalid address";
+	case 11: return "mirror broken";
+	case 12: return "memory sparing";
+	case 13: return "scrub";
+	case 14: return "physical memory map-out event";
+	}
+	return "unknown-type";
+}
+
+static char *ccix_mem_err_cper_data(const char *c)
+{
+	const struct cper_ccix_mem_err_compact *cpd =
+		(struct cper_ccix_mem_err_compact *)c;
+	static char buf[1024];
+	char *p = buf;
+
+	p += sprintf(p, " (");
+	p += sprintf(p, "fru: %u ", cpd->fru);
+	if (cpd->validation_bits & CCIX_MEM_ERR_MEM_ERR_TYPE_VALID)
+		p += sprintf(p, "error: %s ",
+			     ccix_mem_err_type(cpd->mem_err_type));
+	if (cpd->validation_bits & CCIX_MEM_ERR_GENERIC_MEM_VALID)
+		p += sprintf(p, "type: %s ",
+			     ccix_mem_pool_type(cpd->pool_generic_type));
+	if (cpd->validation_bits & CCIX_MEM_ERR_SPEC_TYPE_VALID)
+		p += sprintf(p, "sub_type: %s ",
+			     ccix_mem_spec_type(cpd->pool_specific_type));
+	if (cpd->validation_bits & CCIX_MEM_ERR_OP_VALID)
+		p += sprintf(p, "op: %s ", ccix_mem_op(cpd->op_type));
+	if (cpd->validation_bits & CCIX_MEM_ERR_CARD_VALID)
+		p += sprintf(p, "card: %u ", cpd->card);
+	if (cpd->validation_bits & CCIX_MEM_ERR_MOD_VALID)
+		p += sprintf(p, "mod: %u ", cpd->module);
+	if (cpd->validation_bits & CCIX_MEM_ERR_BANK_VALID)
+		p += sprintf(p, "bank: %u ", cpd->bank);
+	if (cpd->validation_bits & CCIX_MEM_ERR_DEVICE_VALID)
+		p += sprintf(p, "device: %u ", cpd->device);
+	if (cpd->validation_bits & CCIX_MEM_ERR_ROW_VALID)
+		p += sprintf(p, "row: %u ", cpd->row);
+	if (cpd->validation_bits & CCIX_MEM_ERR_COL_VALID)
+		p += sprintf(p, "col: %u ", cpd->column);
+	if (cpd->validation_bits & CCIX_MEM_ERR_RANK_VALID)
+		p += sprintf(p, "rank: %u ", cpd->rank);
+	if (cpd->validation_bits & CCIX_MEM_ERR_BIT_POS_VALID)
+		p += sprintf(p, "bitpos: %u ", cpd->bit_pos);
+	if (cpd->validation_bits & CCIX_MEM_ERR_CHIP_ID_VALID)
+		p += sprintf(p, "chipid: %u ", cpd->chip_id);
+	p += sprintf(p - 1, ")");
+
+	return buf;
+}
+
+static char *ccix_component_type(int type)
+{
+	switch (type) {
+	case 0: return "RA";
+	case 1: return "HA";
+	case 2: return "SA";
+	case 3: return "Port";
+	case 4: return "CCIX-Link";
+	}
+	return "unknown-component";
+}
+
+static char *err_severity(int severity)
+{
+	switch (severity) {
+	case 0: return "recoverable";
+	case 1: return "fatal";
+	case 2: return "corrected";
+	case 3: return "informational";
+	}
+	return "unknown-severity";
+}
+
+static unsigned long long err_mask(int lsb)
+{
+	if (lsb == 0xff)
+		return ~0ull;
+	return ~((1ull << lsb) - 1);
+}
+
+static int ras_ccix_common_parse(struct trace_seq *s,
+				 struct pevent_record *record,
+				 struct event_format *event, void *context,
+				 struct ras_ccix_event *ev)
+{
+	unsigned long long val;
+	int len;
+
+	if (pevent_get_field_val(s,  event, "err_seq", record, &val, 1) < 0)
+		return -1;
+	ev->error_seq = val;
+	if (pevent_get_field_val(s,  event, "sev", record, &val, 1) < 0)
+		return -1;
+	ev->severity = val;
+	if (pevent_get_field_val(s, event, "sevdetail", record, &val, 1) < 0)
+		return -1;
+	ev->severity_detail = val;
+	if (pevent_get_field_val(s,  event, "pa", record, &val, 1) < 0)
+		return -1;
+	ev->address = val;
+	if (pevent_get_field_val(s,  event, "pa_mask_lsb", record, &val, 1) < 0)
+		return -1;
+	ev->pa_mask_lsb = val;
+	if (pevent_get_field_val(s, event, "source", record, &val, 1) < 0)
+		return -1;
+	ev->source = val;
+	if (pevent_get_field_val(s, event, "component", record, &val, 1) < 0)
+		return -1;
+	ev->component = val;
+
+	ev->cper_data = pevent_get_field_raw(s, event, "data", record, &len, 1);
+	ev->cper_data_length = len;
+
+	if (pevent_get_field_val(s, event, "vendor_data_length", record, &val,
+				 1) < 0)
+		return -1;
+	ev->vendor_data_length = val;
+
+	ev->vendor_data = pevent_get_field_raw(s, event, "vendor_data", record,
+					       &len, 1);
+
+	return 0;
+}
+
+int ras_ccix_memory_event_handler(struct trace_seq *s,
+				  struct pevent_record *record,
+				  struct event_format *event, void *context)
+{
+	struct ras_events *ras = context;
+	struct tm *tm;
+	struct ras_ccix_event ev = { .timestamp = "" };
+	time_t now;
+	int ret;
+
+	if (ras->use_uptime)
+		now = record->ts/user_hz + ras->uptime_diff;
+	else
+		now = time(NULL);
+
+	tm = localtime(&now);
+
+	if (tm)
+		strftime(ev.timestamp, sizeof(ev.timestamp),
+			 "%Y-%m-%d %H:%M:%S %z", tm);
+	trace_seq_printf(s, "%s ", ev.timestamp);
+
+	ret = ras_ccix_common_parse(s, record, event, context, &ev);
+	if (ret)
+		return ret;
+
+	trace_seq_printf(s, "%d %s id:%d CCIX memory error %s ue:%d nocomm:%d degraded:%d deferred:%d physical addr: 0x%llx mask: 0x%llx %s",
+			 ev.error_seq, err_severity(ev.severity),
+			 ev.source, ccix_component_type(ev.component),
+			 (ev.severity_detail & 0x1) ? 1 : 0,
+			 (ev.severity_detail & 0x2) ? 1 : 0,
+			 (ev.severity_detail & 0x4) ? 1 : 0,
+			 (ev.severity_detail & 0x8) ? 1 : 0,
+			 ev.address,
+			 err_mask(ev.pa_mask_lsb),
+			 ccix_mem_err_cper_data(ev.cper_data));
+
+	ras_store_ccix_memory_event(ras, &ev);
+
+	return 0;
+}
diff --git a/ras-ccix-handler.h b/ras-ccix-handler.h
new file mode 100644
index 0000000..f6d25b1
--- /dev/null
+++ b/ras-ccix-handler.h
@@ -0,0 +1,61 @@
+/*
+ * Copyright (c) 2019 Hisilicon Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef __RAS_CCIX_HANDLER_H
+#define __RAS_CCIX_HANDLER_H
+
+#include "ras-events.h"
+#include "libtrace/event-parse.h"
+
+int ras_ccix_memory_event_handler(struct trace_seq *s,
+				  struct pevent_record *record,
+				  struct event_format *event, void *context);
+
+/* Perhaps unnecessary paranoia, but the tracepoint structure is packed */
+#pragma pack(1)
+struct cper_ccix_mem_err_compact {
+	uint32_t validation_bits;
+	uint8_t mem_err_type;
+	uint8_t pool_generic_type;
+	uint8_t pool_specific_type;
+	uint8_t op_type;
+	uint8_t card;
+	uint16_t module;
+	uint16_t bank;
+	uint32_t device;
+	uint32_t row;
+	uint32_t column;
+	uint32_t rank;
+	uint8_t bit_pos;
+	uint8_t chip_id;
+	uint8_t fru;
+};
+#pragma pack()
+
+#define CCIX_MEM_ERR_GENERIC_MEM_VALID		0x0001
+#define CCIX_MEM_ERR_OP_VALID			0x0002
+#define CCIX_MEM_ERR_MEM_ERR_TYPE_VALID		0x0004
+#define CCIX_MEM_ERR_CARD_VALID			0x0008
+#define CCIX_MEM_ERR_BANK_VALID			0x0010
+#define CCIX_MEM_ERR_DEVICE_VALID		0x0020
+#define CCIX_MEM_ERR_ROW_VALID			0x0040
+#define CCIX_MEM_ERR_COL_VALID			0x0080
+#define CCIX_MEM_ERR_RANK_VALID			0x0100
+#define CCIX_MEM_ERR_BIT_POS_VALID		0x0200
+#define CCIX_MEM_ERR_CHIP_ID_VALID		0x0400
+#define CCIX_MEM_ERR_VENDOR_DATA_VALID		0x0800
+#define CCIX_MEM_ERR_MOD_VALID			0x1000
+#define CCIX_MEM_ERR_SPEC_TYPE_VALID		0x2000
+
+#endif
diff --git a/ras-events.c b/ras-events.c
index 6ba7a6a..e365d97 100644
--- a/ras-events.c
+++ b/ras-events.c
@@ -29,6 +29,7 @@
 #include "libtrace/event-parse.h"
 #include "ras-mc-handler.h"
 #include "ras-aer-handler.h"
+#include "ras-ccix-handler.h"
 #include "ras-non-standard-handler.h"
 #include "ras-arm-handler.h"
 #include "ras-mce-handler.h"
@@ -203,6 +204,10 @@ int toggle_ras_mc_event(int enable)
 	rc |= __toggle_ras_mc_event(ras, "ras", "aer_event", enable);
 #endif
 
+#ifdef HAVE_CCIX
+	rc |= __toggle_ras_mc_event(ras, "ras", "ccix_memory_error_event", enable);
+#endif
+
 #ifdef HAVE_MCE
 	rc |= __toggle_ras_mc_event(ras, "mce", "mce_record", enable);
 #endif
@@ -717,6 +722,17 @@ int handle_ras_events(int record_events)
 		    "ras", "aer_event");
 #endif
 
+#ifdef HAVE_CCIX
+	rc = add_event_handler(ras, pevent, page_size, "ras",
+			       "ccix_memory_error_event",
+			       ras_ccix_memory_event_handler, NULL);
+	if (!rc)
+		num_events++;
+	else
+		log(ALL, LOG_ERR, "Can't get traces from %s:%s\n",
+		    "ras", "ccix_memory_error_event");
+#endif
+
 #ifdef HAVE_NON_STANDARD
         rc = add_event_handler(ras, pevent, page_size, "ras", "non_standard_event",
                                ras_non_standard_event_handler, NULL);
diff --git a/ras-record-ccix.c b/ras-record-ccix.c
new file mode 100644
index 0000000..6e46b40
--- /dev/null
+++ b/ras-record-ccix.c
@@ -0,0 +1,204 @@
+/*
+ * Copyright (C) 2019 Jonathan Cameron <Jonathan.Cameron@huawei.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+*/
+
+#include <string.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include "bitfield.h"
+#include "ras-ccix-handler.h"
+#include "ras-logger.h"
+#include "ras-record.h"
+#include "ras-report.h"
+
+enum {
+	ccix_field_id,
+	ccix_field_timestamp,
+	ccix_field_error_count,
+	ccix_field_severity,
+	ccix_field_severity_detail,
+	ccix_field_address,
+	ccix_field_address_mask,
+	ccix_field_source,
+	ccix_field_component,
+	ccix_field_common_end
+};
+
+#define CCIX_COMMON_FIELDS \
+	[ccix_field_id] =		{ .name = "id",			.type = "INTEGER PRIMARY KEY" }, \
+	[ccix_field_timestamp] =	{ .name = "timestamp",		.type = "TEXT" },	\
+	[ccix_field_error_count] =	{ .name = "error_count",	.type = "INTEGER" }, \
+	[ccix_field_severity] =		{ .name = "severity",		.type = "INTEGER" }, \
+	[ccix_field_severity_detail] =	{ .name = "severity_detail",	.type = "INTEGER" }, \
+	[ccix_field_address] =		{ .name = "address",		.type = "INTEGER" }, \
+	[ccix_field_address_mask] =	{ .name = "address_mask",	.type = "INTEGER" }, \
+	[ccix_field_source] =		{ .name = "source",		.type = "INTEGER" }, \
+	[ccix_field_component] =	{ .name = "component",		.type = "INTEGER" }
+
+enum {
+	ccix_mem_field_error_type = ccix_field_common_end,
+	ccix_mem_field_fru,
+	ccix_mem_field_type,
+	ccix_mem_field_sub_type,
+	ccix_mem_field_operation,
+	ccix_mem_field_card,
+	ccix_mem_field_mod,
+	ccix_mem_field_bank,
+	ccix_mem_field_device,
+	ccix_mem_field_row,
+	ccix_mem_field_col,
+	ccix_mem_field_rank,
+	ccix_mem_field_bit_pos,
+	ccix_mem_field_chip_id,
+	ccix_mem_field_vendor
+};
+
+static const struct db_fields ccix_memory_event_fields[] = {
+	CCIX_COMMON_FIELDS,
+	[ccix_mem_field_error_type] =	{ .name = "mem_err_type",	.type = "INTEGER" },
+	[ccix_mem_field_fru] =		{ .name = "fru",		.type = "INTEGER" },
+	[ccix_mem_field_type] =		{ .name = "type",		.type = "INTEGER" },
+	[ccix_mem_field_sub_type] =	{ .name = "sub_type",		.type = "INTEGER" },
+	[ccix_mem_field_operation] =	{ .name = "operation",		.type = "INTEGER" },
+	[ccix_mem_field_card] =		{ .name = "card",		.type = "INTEGER" },
+	[ccix_mem_field_mod] =		{ .name = "mod",		.type = "INTEGER" },
+	[ccix_mem_field_bank] =		{ .name = "bank",		.type = "INTEGER" },
+	[ccix_mem_field_device] =	{ .name = "device",		.type = "INTEGER" },
+	[ccix_mem_field_row] =		{ .name = "row",		.type = "INTEGER" },
+	[ccix_mem_field_col] =		{ .name = "col",		.type = "INTEGER" },
+	[ccix_mem_field_rank] =		{ .name = "rank",		.type = "INTEGER" },
+	[ccix_mem_field_bit_pos] =	{ .name = "bit_position",	.type = "INTEGER" },
+	[ccix_mem_field_chip_id] =	{ .name = "chip_id",		.type = "INTEGER" },
+	[ccix_mem_field_vendor] =	{ .name = "vendor_data",	.type = "BLOB" },
+};
+
+static const struct db_table_descriptor ccix_memory_event_tab = {
+	.name = "ccix_memory_event",
+	.fields = ccix_memory_event_fields,
+	.num_fields = ARRAY_SIZE(ccix_memory_event_fields),
+};
+
+static void ras_store_ccix_common(sqlite3_stmt *record,
+				  struct ras_ccix_event *ev)
+{
+	sqlite3_bind_text(record, ccix_field_timestamp, ev->timestamp, -1,
+			  NULL);
+	sqlite3_bind_int(record, ccix_field_error_count, ev->error_seq);
+	sqlite3_bind_int(record, ccix_field_severity, ev->severity);
+	sqlite3_bind_int(record, ccix_field_severity_detail,
+			 ev->severity_detail);
+	sqlite3_bind_int64(record, ccix_field_address, ev->address);
+	sqlite3_bind_int64(record, ccix_field_address_mask, ev->pa_mask_lsb);
+	sqlite3_bind_int(record, ccix_field_source, ev->source);
+	sqlite3_bind_int(record, ccix_field_component, ev->component);
+}
+
+int ras_store_ccix_memory_event(struct ras_events *ras,
+				struct ras_ccix_event *ev)
+{
+	int rc;
+	struct sqlite3_priv *priv = ras->db_priv;
+	struct cper_ccix_mem_err_compact *mem =
+	  (struct cper_ccix_mem_err_compact *)ev->cper_data;
+	sqlite3_stmt *rec = priv ? priv->stmt_ccix_mem_record : NULL;
+
+	if (!priv || !rec)
+		return 0;
+	log(TERM, LOG_INFO, "ccix_memory_eventstore: %p\n", rec);
+
+	ras_store_ccix_common(rec, ev);
+
+	sqlite3_bind_int(rec, ccix_mem_field_fru, mem->fru);
+
+	if (mem->validation_bits & CCIX_MEM_ERR_MEM_ERR_TYPE_VALID)
+		sqlite3_bind_int(rec, ccix_mem_field_error_type,
+				 mem->mem_err_type);
+
+	if (mem->validation_bits & CCIX_MEM_ERR_GENERIC_MEM_VALID)
+		sqlite3_bind_int(rec, ccix_mem_field_type,
+				 mem->pool_generic_type);
+
+	if (mem->validation_bits & CCIX_MEM_ERR_SPEC_TYPE_VALID)
+		sqlite3_bind_int(rec, ccix_mem_field_sub_type,
+				 mem->pool_specific_type);
+
+	if (mem->validation_bits & CCIX_MEM_ERR_OP_VALID)
+		sqlite3_bind_int(rec, ccix_mem_field_operation, mem->op_type);
+
+	if (mem->validation_bits & CCIX_MEM_ERR_CARD_VALID)
+		sqlite3_bind_int(rec, ccix_mem_field_card, mem->card);
+
+	if (mem->validation_bits & CCIX_MEM_ERR_MOD_VALID)
+		sqlite3_bind_int(rec, ccix_mem_field_mod, mem->module);
+
+	if (mem->validation_bits & CCIX_MEM_ERR_BANK_VALID)
+		sqlite3_bind_int(rec, ccix_mem_field_bank, mem->bank);
+
+	if (mem->validation_bits & CCIX_MEM_ERR_DEVICE_VALID)
+		sqlite3_bind_int(rec, ccix_mem_field_device, mem->device);
+
+	if (mem->validation_bits & CCIX_MEM_ERR_ROW_VALID)
+		sqlite3_bind_int(rec, ccix_mem_field_row, mem->row);
+
+	if (mem->validation_bits & CCIX_MEM_ERR_COL_VALID)
+		sqlite3_bind_int(rec, ccix_mem_field_col, mem->column);
+
+	if (mem->validation_bits & CCIX_MEM_ERR_RANK_VALID)
+		sqlite3_bind_int(rec, ccix_mem_field_rank, mem->rank);
+
+	if (mem->validation_bits & CCIX_MEM_ERR_BIT_POS_VALID)
+		sqlite3_bind_int(rec, ccix_mem_field_bit_pos, mem->bit_pos);
+
+	if (mem->validation_bits & CCIX_MEM_ERR_CHIP_ID_VALID)
+		sqlite3_bind_int(rec, ccix_mem_field_chip_id, mem->chip_id);
+
+	if (mem->validation_bits & CCIX_MEM_ERR_VENDOR_DATA_VALID)
+		sqlite3_bind_blob(rec, ccix_mem_field_vendor,
+				  ev->vendor_data, ev->vendor_data_length,
+				  NULL);
+
+	rc = sqlite3_step(rec);
+	if (rc != SQLITE_OK && rc != SQLITE_DONE)
+		log(TERM, LOG_ERR,
+		    "Failed to do ccix_mem_record step on sqlite: error = %d\n",
+		    rc);
+
+	rc = sqlite3_reset(rec);
+	if (rc != SQLITE_OK && rc != SQLITE_DONE)
+		log(TERM, LOG_ERR,
+		    "Failed reset ccix_mem_record on sqlite: error = %d\n",
+		    rc);
+
+	rc = sqlite3_clear_bindings(rec);
+	if (rc != SQLITE_OK && rc != SQLITE_DONE)
+		log(TERM, LOG_ERR,
+		    "Failed to clear ccix_mem_record: error %d\n",
+		    rc);
+	log(TERM, LOG_INFO, "register inserted at db\n");
+	return rc;
+}
+
+void ras_ccix_create_table(struct sqlite3_priv *priv)
+{
+	int rc;
+
+	rc = ras_mc_create_table(priv, &ccix_memory_event_tab);
+	if (rc == SQLITE_OK)
+		rc = ras_mc_prepare_stmt(priv, &priv->stmt_ccix_mem_record,
+					 &ccix_memory_event_tab);
+}
diff --git a/ras-record.c b/ras-record.c
index b212607..874902c 100644
--- a/ras-record.c
+++ b/ras-record.c
@@ -28,6 +28,7 @@
 #include "ras-events.h"
 #include "ras-mc-handler.h"
 #include "ras-aer-handler.h"
+#include "ras-ccix-handler.h"
 #include "ras-mce-handler.h"
 #include "ras-logger.h"
 
@@ -449,9 +450,9 @@ int ras_store_devlink_event(struct ras_events *ras, struct devlink_event *ev)
  * Generic code
  */
 
-static int ras_mc_prepare_stmt(struct sqlite3_priv *priv,
-			       sqlite3_stmt **stmt,
-			       const struct db_table_descriptor *db_tab)
+int ras_mc_prepare_stmt(struct sqlite3_priv *priv,
+			sqlite3_stmt **stmt,
+			const struct db_table_descriptor *db_tab)
 
 {
 	int i, rc;
@@ -495,8 +496,8 @@ static int ras_mc_prepare_stmt(struct sqlite3_priv *priv,
 	return rc;
 }
 
-static int ras_mc_create_table(struct sqlite3_priv *priv,
-			       const struct db_table_descriptor *db_tab)
+int ras_mc_create_table(struct sqlite3_priv *priv,
+			const struct db_table_descriptor *db_tab)
 {
 	const struct db_fields *field;
 	char sql[1024], *p = sql, *end = sql + sizeof(sql);
@@ -604,6 +605,10 @@ int ras_mc_event_opendb(unsigned cpu, struct ras_events *ras)
 					 &extlog_event_tab);
 #endif
 
+#ifdef HAVE_CCIX
+	ras_ccix_create_table(priv);
+#endif
+
 #ifdef HAVE_MCE
 	rc = ras_mc_create_table(priv, &mce_record_tab);
 	if (rc == SQLITE_OK)
diff --git a/ras-record.h b/ras-record.h
index 432a571..c094c91 100644
--- a/ras-record.h
+++ b/ras-record.h
@@ -44,6 +44,21 @@ struct ras_aer_event {
 	const char *msg;
 };
 
+struct ras_ccix_event {
+	char timestamp[64];
+	int32_t error_seq;
+	int8_t severity;
+	int8_t severity_detail;
+	unsigned long long address;
+	uint8_t pa_mask_lsb;
+	uint8_t source;
+	uint8_t component;
+	const char *cper_data;
+	unsigned short cper_data_length;
+	uint16_t vendor_data_length;
+	const char *vendor_data;
+};
+
 struct ras_extlog_event {
 	char timestamp[64];
 	int32_t error_seq;
@@ -108,6 +123,9 @@ struct sqlite3_priv {
 #ifdef HAVE_EXTLOG
 	sqlite3_stmt	*stmt_extlog_record;
 #endif
+#ifdef HAVE_CCIX
+	sqlite3_stmt	*stmt_ccix_mem_record;
+#endif
 #ifdef HAVE_NON_STANDARD
 	sqlite3_stmt	*stmt_non_standard_record;
 #endif
@@ -131,12 +149,20 @@ struct db_table_descriptor {
 };
 
 int ras_mc_event_opendb(unsigned cpu, struct ras_events *ras);
+int ras_mc_prepare_stmt(struct sqlite3_priv *priv,
+			sqlite3_stmt **stmt,
+			const struct db_table_descriptor *db_tab);
+int ras_mc_create_table(struct sqlite3_priv *priv,
+			const struct db_table_descriptor *db_tab);
+
 int ras_mc_add_vendor_table(struct ras_events *ras, sqlite3_stmt **stmt,
 			    const struct db_table_descriptor *db_tab);
 int ras_store_mc_event(struct ras_events *ras, struct ras_mc_event *ev);
 int ras_store_aer_event(struct ras_events *ras, struct ras_aer_event *ev);
 int ras_store_mce_record(struct ras_events *ras, struct mce_event *ev);
 int ras_store_extlog_mem_record(struct ras_events *ras, struct ras_extlog_event *ev);
+void ras_ccix_create_table(struct sqlite3_priv *priv);
+int ras_store_ccix_memory_event(struct ras_events *ras, struct ras_ccix_event *ev);
 int ras_store_non_standard_record(struct ras_events *ras, struct ras_non_standard_event *ev);
 int ras_store_arm_record(struct ras_events *ras, struct ras_arm_event *ev);
 int ras_store_devlink_event(struct ras_events *ras, struct devlink_event *ev);
@@ -147,6 +173,8 @@ static inline int ras_store_mc_event(struct ras_events *ras, struct ras_mc_event
 static inline int ras_store_aer_event(struct ras_events *ras, struct ras_aer_event *ev) { return 0; };
 static inline int ras_store_mce_record(struct ras_events *ras, struct mce_event *ev) { return 0; };
 static inline int ras_store_extlog_mem_record(struct ras_events *ras, struct ras_extlog_event *ev) { return 0; };
+static inline void ras_ccix_create_table(void *priv) {};
+static inline int ras_store_ccix_memory_event(struct ras_events *ras, struct ras_ccix_event *ev) { return 0; };
 static inline int ras_store_non_standard_record(struct ras_events *ras, struct ras_non_standard_event *ev) { return 0; };
 static inline int ras_store_arm_record(struct ras_events *ras, struct ras_arm_event *ev) { return 0; };
 static inline int ras_store_devlink_event(struct ras_events *ras, struct devlink_event *ev) { return 0; };
diff --git a/ras-report.h b/ras-report.h
index cb133a1..4684fdc 100644
--- a/ras-report.h
+++ b/ras-report.h
@@ -19,6 +19,7 @@
 #include "ras-mc-handler.h"
 #include "ras-mce-handler.h"
 #include "ras-aer-handler.h"
+#include "ras-ccix-handler.h"
 
 /* Maximal length of backtrace. */
 #define MAX_BACKTRACE_SIZE (1024*1024)
@@ -35,7 +36,8 @@ enum {
 	AER_EVENT,
 	NON_STANDARD_EVENT,
 	ARM_EVENT,
-	DEVLINK_EVENT
+	DEVLINK_EVENT,
+	CCIX_EVENT,
 };
 
 #ifdef HAVE_ABRT_REPORT
@@ -46,6 +48,7 @@ int ras_report_mce_event(struct ras_events *ras, struct mce_event *ev);
 int ras_report_non_standard_event(struct ras_events *ras, struct ras_non_standard_event *ev);
 int ras_report_arm_event(struct ras_events *ras, struct ras_arm_event *ev);
 int ras_report_devlink_event(struct ras_events *ras, struct devlink_event *ev);
+int ras_report_ccix_event(struct ras_events *ras, struct ras_ccix_event *ev);
 
 #else
 
@@ -55,6 +58,7 @@ static inline int ras_report_mce_event(struct ras_events *ras, struct mce_event
 static inline int ras_report_non_standard_event(struct ras_events *ras, struct ras_non_standard_event *ev) { return 0; };
 static inline int ras_report_arm_event(struct ras_events *ras, struct ras_arm_event *ev) { return 0; };
 static inline int ras_report_devlink_event(struct ras_events *ras, struct devlink_event *ev) { return 0; };
+static inline int ras_report_ccix_event(struct ras_events *ras, struct ras_ccix_event *ev) { return 0; };
 
 #endif
 
-- 
2.20.1



* [PATCH V2 2/6] rasdaemon: CCIX: Cache error support
  2019-08-27 11:30 [PATCH V2 0/6] CCIX rasdaemon support Jonathan Cameron
  2019-08-27 11:30 ` [PATCH V2 1/6] rasdaemon: CCIX: memory error support Jonathan Cameron
@ 2019-08-27 11:30 ` " Jonathan Cameron
  2019-08-27 11:30 ` [PATCH V2 3/6] rasdaemon: CCIX: ATC " Jonathan Cameron
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Jonathan Cameron @ 2019-08-27 11:30 UTC
  To: Mauro Carvalho Chehab, linux-edac
  Cc: linuxarm, jcm, shiju.jose, Jonathan Cameron

Adds support for CCIX cache error reporting and logging
to sqlite3.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 ras-ccix-handler.c | 114 +++++++++++++++++++++++++++++++++++++++++++++
 ras-ccix-handler.h |  24 ++++++++++
 ras-events.c       |   9 ++++
 ras-record-ccix.c  | 100 +++++++++++++++++++++++++++++++++++++++
 ras-record.h       |   3 ++
 5 files changed, 250 insertions(+)

diff --git a/ras-ccix-handler.c b/ras-ccix-handler.c
index 2be413f..f68c297 100644
--- a/ras-ccix-handler.c
+++ b/ras-ccix-handler.c
@@ -127,6 +127,79 @@ static char *ccix_mem_err_cper_data(const char *c)
 	return buf;
 }
 
+static char *ccix_cache_type(uint8_t type)
+{
+	switch (type) {
+	case 0: return "instruction";
+	case 1: return "data";
+	case 2: return "generic/unified";
+	case 3: return "snoop filter directory";
+	}
+	return "unknown";
+}
+
+static char *ccix_cache_err_type(int etype)
+{
+	switch (etype) {
+	case 0: return "data";
+	case 1: return "tag";
+	case 2: return "timeout";
+	case 3: return "hang";
+	case 4: return "data loss";
+	case 5: return "invalid address";
+	}
+	return "unknown-type";
+}
+
+static char *ccix_cache_op(uint8_t op)
+{
+	switch (op) {
+	case 0: return "generic";
+	case 1: return "generic read";
+	case 2: return "generic write";
+	case 3: return "data read";
+	case 4: return "data write";
+	case 5: return "instruction fetch";
+	case 6: return "prefetch";
+	case 7: return "eviction";
+	case 8: return "snooping";
+	case 9: return "snooped";
+	case 10: return "management/command";
+	}
+	return "unknown";
+}
+
+static char *ccix_cache_err_cper_data(const char *c)
+{
+	const struct cper_ccix_cache_err_compact *cpd =
+		(struct cper_ccix_cache_err_compact *)c;
+	static char buf[1024];
+	char *p = buf;
+
+	if (!(cpd->validation_bits))
+		return "";
+
+	p += sprintf(p, " (");
+	if (cpd->validation_bits & CCIX_CACHE_ERR_CACHE_ERR_TYPE_VALID)
+		p += sprintf(p, "error: %s ",
+			     ccix_cache_err_type(cpd->cache_error_type));
+	if (cpd->validation_bits & CCIX_CACHE_ERR_TYPE_VALID)
+		p += sprintf(p, "type: %s ", ccix_cache_type(cpd->cache_type));
+	if (cpd->validation_bits & CCIX_CACHE_ERR_OP_VALID)
+		p += sprintf(p, "op: %s ", ccix_cache_op(cpd->op_type));
+	if (cpd->validation_bits & CCIX_CACHE_ERR_LEVEL_VALID)
+		p += sprintf(p, "level: %u ", cpd->cache_level);
+	if (cpd->validation_bits & CCIX_CACHE_ERR_SET_VALID)
+		p += sprintf(p, "set: %u ", cpd->set);
+	if (cpd->validation_bits & CCIX_CACHE_ERR_WAY_VALID)
+		p += sprintf(p, "way: %u ", cpd->way);
+	if (cpd->validation_bits & CCIX_CACHE_ERR_INSTANCE_ID_VALID)
+		p += sprintf(p, "instance: %u ", cpd->instance);
+	p += sprintf(p - 1, ")");
+
+	return buf;
+}
+
 static char *ccix_component_type(int type)
 {
 	switch (type) {
@@ -242,3 +315,44 @@ int ras_ccix_memory_event_handler(struct trace_seq *s,
 
 	return 0;
 }
+
+int ras_ccix_cache_event_handler(struct trace_seq *s,
+				  struct pevent_record *record,
+				  struct event_format *event, void *context)
+{
+	struct ras_events *ras = context;
+	struct tm *tm;
+	struct ras_ccix_event ev;
+	time_t now;
+	int ret;
+
+	if (ras->use_uptime)
+		now = record->ts/user_hz + ras->uptime_diff;
+	else
+		now = time(NULL);
+
+	tm = localtime(&now);
+
+	if (tm)
+		strftime(ev.timestamp, sizeof(ev.timestamp),
+			 "%Y-%m-%d %H:%M:%S %z", tm);
+	trace_seq_printf(s, "%s ", ev.timestamp);
+	ret = ras_ccix_common_parse(s, record, event, context, &ev);
+	if (ret)
+		return ret;
+
+	trace_seq_printf(s, "%d %s id:%d CCIX cache error %s ue:%d nocomm:%d degraded:%d deferred:%d physical addr: 0x%llx mask: 0x%llx %s",
+			 ev.error_seq, err_severity(ev.severity),
+			 ev.source, ccix_component_type(ev.component),
+			 (ev.severity_detail & 0x1) ? 1 : 0,
+			 (ev.severity_detail & 0x2) ? 1 : 0,
+			 (ev.severity_detail & 0x4) ? 1 : 0,
+			 (ev.severity_detail & 0x8) ? 1 : 0,
+			 ev.address,
+			 err_mask(ev.pa_mask_lsb),
+			 ccix_cache_err_cper_data(ev.cper_data));
+
+	ras_store_ccix_cache_event(ras, &ev);
+
+	return 0;
+}
diff --git a/ras-ccix-handler.h b/ras-ccix-handler.h
index f6d25b1..629ccbe 100644
--- a/ras-ccix-handler.h
+++ b/ras-ccix-handler.h
@@ -21,6 +21,9 @@
 int ras_ccix_memory_event_handler(struct trace_seq *s,
 				  struct pevent_record *record,
 				  struct event_format *event, void *context);
+int ras_ccix_cache_event_handler(struct trace_seq *s,
+				 struct pevent_record *record,
+				 struct event_format *event, void *context);
 
 /* Perhaps unnecessary paranoia, but the tracepoint structure is packed */
 #pragma pack(1)
@@ -41,6 +44,18 @@ struct cper_ccix_mem_err_compact {
 	uint8_t chip_id;
 	uint8_t fru;
 };
+
+struct cper_ccix_cache_err_compact {
+	uint32_t validation_bits;
+	uint32_t set;
+	uint32_t way;
+	uint8_t cache_type;
+	uint8_t op_type;
+	uint8_t cache_error_type;
+	uint8_t cache_level;
+	uint8_t instance;
+};
+
 #pragma pack()
 
 #define CCIX_MEM_ERR_GENERIC_MEM_VALID		0x0001
@@ -58,4 +73,13 @@ struct cper_ccix_mem_err_compact {
 #define CCIX_MEM_ERR_MOD_VALID			0x1000
 #define CCIX_MEM_ERR_SPEC_TYPE_VALID		0x2000
 
+#define CCIX_CACHE_ERR_TYPE_VALID		0x0001
+#define CCIX_CACHE_ERR_OP_VALID			0x0002
+#define CCIX_CACHE_ERR_CACHE_ERR_TYPE_VALID	0x0004
+#define CCIX_CACHE_ERR_LEVEL_VALID		0x0008
+#define CCIX_CACHE_ERR_SET_VALID		0x0010
+#define CCIX_CACHE_ERR_WAY_VALID		0x0020
+#define CCIX_CACHE_ERR_INSTANCE_ID_VALID	0x0040
+#define CCIX_CACHE_ERR_VENDOR_DATA_VALID	0x0080
+
 #endif
diff --git a/ras-events.c b/ras-events.c
index e365d97..f1b67cd 100644
--- a/ras-events.c
+++ b/ras-events.c
@@ -206,6 +206,7 @@ int toggle_ras_mc_event(int enable)
 
 #ifdef HAVE_CCIX
 	rc |= __toggle_ras_mc_event(ras, "ras", "ccix_memory_event", enable);
+	rc |= __toggle_ras_mc_event(ras, "ras", "ccix_cache_event", enable);
 #endif
 
 #ifdef HAVE_MCE
@@ -731,6 +732,14 @@ int handle_ras_events(int record_events)
 	else
 		log(ALL, LOG_ERR, "Can't get traces from %s:%s\n",
 		    "ras", "ccix_memory_event");
+	rc = add_event_handler(ras, pevent, page_size, "ras",
+			       "ccix_cache_error_event",
+			       ras_ccix_cache_event_handler, NULL);
+	if (!rc)
+		num_events++;
+	else
+		log(ALL, LOG_ERR, "Can't get traces from %s:%s\n",
+		    "ras", "ccix_cache_event");
 #endif
 
 #ifdef HAVE_NON_STANDARD
diff --git a/ras-record-ccix.c b/ras-record-ccix.c
index 6e46b40..5b6e044 100644
--- a/ras-record-ccix.c
+++ b/ras-record-ccix.c
@@ -193,6 +193,101 @@ int ras_store_ccix_memory_event(struct ras_events *ras,
 	return rc;
 }
 
+enum {
+	ccix_cache_field_type = ccix_field_common_end,
+	ccix_cache_field_operation,
+	ccix_cache_field_error_type,
+	ccix_cache_field_level,
+	ccix_cache_field_set,
+	ccix_cache_field_way,
+	ccix_cache_field_instance,
+	ccix_cache_field_vendor,
+};
+
+static const struct db_fields ccix_cache_event_fields[] = {
+	CCIX_COMMON_FIELDS,
+	[ccix_cache_field_type] =	{ .name = "type",		.type = "INTEGER" },
+	[ccix_cache_field_operation] =	{ .name = "operation",		.type = "INTEGER" },
+	[ccix_cache_field_error_type] =	{ .name = "cache_err_type",	.type = "INTEGER" },
+	[ccix_cache_field_level] =	{ .name = "\"level\"",		.type = "INTEGER" },
+	[ccix_cache_field_set] =	{ .name = "\"set\"",		.type = "INTEGER" },
+	[ccix_cache_field_way] =	{ .name = "way",		.type = "INTEGER" },
+	[ccix_cache_field_instance] =	{ .name = "instance",		.type = "INTEGER" },
+	[ccix_cache_field_vendor] =	{ .name = "vendor_data",	.type = "BLOB" },
+};
+
+static const struct db_table_descriptor ccix_cache_event_tab = {
+	.name = "ccix_cache_event",
+	.fields = ccix_cache_event_fields,
+	.num_fields = ARRAY_SIZE(ccix_cache_event_fields),
+};
+
+int ras_store_ccix_cache_event(struct ras_events *ras,
+			       struct ras_ccix_event *ev)
+{
+	int rc;
+	struct sqlite3_priv *priv = ras->db_priv;
+	struct cper_ccix_cache_err_compact *cache =
+		(struct cper_ccix_cache_err_compact *)ev->cper_data;
+	sqlite3_stmt *rec = priv ? priv->stmt_ccix_cache_record : NULL;
+
+	if (!priv || !rec)
+		return 0;
+	log(TERM, LOG_INFO, "ccix_cache_eventstore: %p\n", rec);
+
+	ras_store_ccix_common(rec, ev);
+
+	if (cache->validation_bits & CCIX_CACHE_ERR_CACHE_ERR_TYPE_VALID)
+		sqlite3_bind_int(rec, ccix_cache_field_error_type,
+				 cache->cache_error_type);
+
+	if (cache->validation_bits & CCIX_CACHE_ERR_TYPE_VALID)
+		sqlite3_bind_int(rec, ccix_cache_field_type, cache->cache_type);
+
+	if (cache->validation_bits & CCIX_CACHE_ERR_OP_VALID)
+		sqlite3_bind_int(rec, ccix_cache_field_operation,
+				 cache->op_type);
+
+	if (cache->validation_bits & CCIX_CACHE_ERR_LEVEL_VALID)
+		sqlite3_bind_int(rec, ccix_cache_field_level,
+				 cache->cache_level);
+
+	if (cache->validation_bits & CCIX_CACHE_ERR_SET_VALID)
+		sqlite3_bind_int(rec, ccix_cache_field_set, cache->set);
+
+	if (cache->validation_bits & CCIX_CACHE_ERR_WAY_VALID)
+		sqlite3_bind_int(rec, ccix_cache_field_way, cache->way);
+
+	if (cache->validation_bits & CCIX_CACHE_ERR_INSTANCE_ID_VALID)
+		sqlite3_bind_int(rec, ccix_cache_field_instance,
+				 cache->instance);
+
+	if (cache->validation_bits & CCIX_CACHE_ERR_VENDOR_DATA_VALID)
+		sqlite3_bind_blob(rec, ccix_cache_field_vendor,
+				  ev->vendor_data, ev->vendor_data_length,
+				  NULL);
+
+	rc = sqlite3_step(rec);
+	if (rc != SQLITE_OK && rc != SQLITE_DONE)
+		log(TERM, LOG_ERR,
+		    "Failed to do ccix_cache_record step on sqlite: error = %d\n",
+		    rc);
+
+	rc = sqlite3_reset(rec);
+	if (rc != SQLITE_OK && rc != SQLITE_DONE)
+		log(TERM, LOG_ERR,
+		    "Failed reset ccix_cache_record on sqlite: error = %d\n",
+		    rc);
+
+	rc = sqlite3_clear_bindings(rec);
+	if (rc != SQLITE_OK && rc != SQLITE_DONE)
+		log(TERM, LOG_ERR,
+		    "Failed to clear ccix_cache_record: error %d\n",
+		    rc);
+	log(TERM, LOG_INFO, "register inserted at db\n");
+	return rc;
+}
+
 void ras_ccix_create_table(struct sqlite3_priv *priv)
 {
 	int rc;
@@ -201,4 +296,9 @@ void ras_ccix_create_table(struct sqlite3_priv *priv)
 	if (rc == SQLITE_OK)
 		rc = ras_mc_prepare_stmt(priv, &priv->stmt_ccix_mem_record,
 					 &ccix_memory_event_tab);
+
+	rc = ras_mc_create_table(priv, &ccix_cache_event_tab);
+	if (rc == SQLITE_OK)
+		rc = ras_mc_prepare_stmt(priv, &priv->stmt_ccix_cache_record,
+					 &ccix_cache_event_tab);
 }
diff --git a/ras-record.h b/ras-record.h
index c094c91..ac25ffc 100644
--- a/ras-record.h
+++ b/ras-record.h
@@ -125,6 +125,7 @@ struct sqlite3_priv {
 #endif
 #ifdef HAVE_CCIX
 	sqlite3_stmt	*stmt_ccix_mem_record;
+	sqlite3_stmt	*stmt_ccix_cache_record;
 #endif
 #ifdef HAVE_NON_STANDARD
 	sqlite3_stmt	*stmt_non_standard_record;
@@ -163,6 +164,7 @@ int ras_store_mce_record(struct ras_events *ras, struct mce_event *ev);
 int ras_store_extlog_mem_record(struct ras_events *ras, struct ras_extlog_event *ev);
 void ras_ccix_create_table(struct sqlite3_priv *priv);
 int ras_store_ccix_memory_event(struct ras_events *ras, struct ras_ccix_event *ev);
+int ras_store_ccix_cache_event(struct ras_events *ras, struct ras_ccix_event *ev);
 int ras_store_non_standard_record(struct ras_events *ras, struct ras_non_standard_event *ev);
 int ras_store_arm_record(struct ras_events *ras, struct ras_arm_event *ev);
 int ras_store_devlink_event(struct ras_events *ras, struct devlink_event *ev);
@@ -175,6 +177,7 @@ static inline int ras_store_mce_record(struct ras_events *ras, struct mce_event
 static inline int ras_store_extlog_mem_record(struct ras_events *ras, struct ras_extlog_event *ev) { return 0; };
 static inline void ras_ccix_create_table(void *priv) {};
 static inline int ras_store_ccix_memory_event(struct ras_events *ras, struct ras_ccix_event *ev) { return 0; };
+static inline int ras_store_ccix_cache_event(struct ras_events *ras, struct ras_ccix_event *ev) {return 0; };
 static inline int ras_store_non_standard_record(struct ras_events *ras, struct ras_non_standard_event *ev) { return 0; };
 static inline int ras_store_arm_record(struct ras_events *ras, struct ras_arm_event *ev) { return 0; };
 static inline int ras_store_devlink_event(struct ras_events *ras, struct devlink_event *ev) { return 0; };
-- 
2.20.1



* [PATCH V2 3/6] rasdaemon: CCIX: ATC error support
  2019-08-27 11:30 [PATCH V2 0/6] CCIX rasdaemon support Jonathan Cameron
  2019-08-27 11:30 ` [PATCH V2 1/6] rasdaemon: CCIX: memory error support Jonathan Cameron
  2019-08-27 11:30 ` [PATCH V2 2/6] rasdaemon: CCIX: Cache " Jonathan Cameron
@ 2019-08-27 11:30 ` " Jonathan Cameron
  2019-08-27 11:30 ` [PATCH V2 4/6] rasdaemon: CCIX: Port error support Jonathan Cameron
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Jonathan Cameron @ 2019-08-27 11:30 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, linux-edac
  Cc: linuxarm, jcm, shiju.jose, Jonathan Cameron

Add support for reporting CCIX address translation cache (ATC) errors
and logging them to sqlite3.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
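Note, not part of the commit: the handler casts the raw cper_data bytes
to the compact struct, so the userspace layout has to match the packed
tracepoint layout byte for byte. The sketch below checks that at compile
time. The field order is copied from struct cper_ccix_atc_err_compact in
this patch; the struct names and atc_err_tail_padding() are hypothetical,
and it assumes a compiler that honours #pragma pack (GCC and Clang do).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#pragma pack(1)
/* Field order copied from struct cper_ccix_atc_err_compact. */
struct atc_err_packed {
	uint32_t validation_bits;
	uint8_t op_type;
	uint8_t instance;
};
#pragma pack()

/* Same fields with default alignment, for comparison only. */
struct atc_err_natural {
	uint32_t validation_bits;
	uint8_t op_type;
	uint8_t instance;
};

/* Packed: 4 + 1 + 1 bytes, no tail padding. */
static_assert(sizeof(struct atc_err_packed) == 6, "packed size");
static_assert(offsetof(struct atc_err_packed, op_type) == 4,
	      "op_type offset");
static_assert(offsetof(struct atc_err_packed, instance) == 5,
	      "instance offset");

/* Tail padding the compiler adds when the struct is not packed. */
static size_t atc_err_tail_padding(void)
{
	return sizeof(struct atc_err_natural) - sizeof(struct atc_err_packed);
}
```

On common ABIs the unpacked variant is padded up to a multiple of four
bytes, so sizeof() based copies would disagree with the kernel side
without the pragma.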
 ras-ccix-handler.c | 61 ++++++++++++++++++++++++++++++++++++++++
 ras-ccix-handler.h | 13 +++++++++
 ras-events.c       |  9 ++++++
 ras-record-ccix.c  | 69 ++++++++++++++++++++++++++++++++++++++++++++++
 ras-record.h       |  3 ++
 5 files changed, 155 insertions(+)

diff --git a/ras-ccix-handler.c b/ras-ccix-handler.c
index f68c297..f7b9e8e 100644
--- a/ras-ccix-handler.c
+++ b/ras-ccix-handler.c
@@ -200,6 +200,26 @@ static char *ccix_cache_err_cper_data(const char *c)
 	return buf;
 }
 
+static char *ccix_atc_err_cper_data(const char *c)
+{
+	const struct cper_ccix_atc_err_compact *cpd =
+		(struct cper_ccix_atc_err_compact *)c;
+	static char buf[1024];
+	char *p = buf;
+
+	if (!cpd->validation_bits)
+		return "";
+
+	p += sprintf(p, " (");
+	if (cpd->validation_bits & CCIX_ATC_ERR_OP_VALID)
+		p += sprintf(p, "op: %s ", ccix_cache_op(cpd->op_type));
+	if (cpd->validation_bits & CCIX_ATC_ERR_INSTANCE_ID_VALID)
+		p += sprintf(p, "instance: %u ", cpd->instance);
+	p += sprintf(p - 1, ")");
+
+	return buf;
+}
+
 static char *ccix_component_type(int type)
 {
 	switch (type) {
@@ -356,3 +376,44 @@ int ras_ccix_cache_event_handler(struct trace_seq *s,
 
 	return 0;
 }
+
+int ras_ccix_atc_event_handler(struct trace_seq *s,
+			       struct pevent_record *record,
+			       struct event_format *event, void *context)
+{
+	struct ras_events *ras = context;
+	struct tm *tm;
+	struct ras_ccix_event ev;
+	time_t now;
+	int ret;
+
+	if (ras->use_uptime)
+		now = record->ts/user_hz + ras->uptime_diff;
+	else
+		now = time(NULL);
+
+	tm = localtime(&now);
+
+	if (tm)
+		strftime(ev.timestamp, sizeof(ev.timestamp),
+			 "%Y-%m-%d %H:%M:%S %z", tm);
+	trace_seq_printf(s, "%s ", ev.timestamp);
+	ret = ras_ccix_common_parse(s, record, event, context, &ev);
+	if (ret)
+		return ret;
+
+	trace_seq_printf(s, "%d %s id:%d CCIX ATC error: %s ue:%d nocomm:%d degraded:%d deferred:%d physical addr: 0x%llx mask: 0x%llx %s",
+			 ev.error_seq, err_severity(ev.severity),
+			 ev.source, ccix_component_type(ev.component),
+			 (ev.severity_detail & 0x1) ? 1 : 0,
+			 (ev.severity_detail & 0x2) ? 1 : 0,
+			 (ev.severity_detail & 0x4) ? 1 : 0,
+			 (ev.severity_detail & 0x8) ? 1 : 0,
+			 ev.address,
+			 err_mask(ev.pa_mask_lsb),
+			 ccix_atc_err_cper_data(ev.cper_data));
+
+	ras_store_ccix_atc_event(ras, &ev);
+
+	return 0;
+}
diff --git a/ras-ccix-handler.h b/ras-ccix-handler.h
index 629ccbe..4528af7 100644
--- a/ras-ccix-handler.h
+++ b/ras-ccix-handler.h
@@ -24,6 +24,9 @@ int ras_ccix_memory_event_handler(struct trace_seq *s,
 int ras_ccix_cache_event_handler(struct trace_seq *s,
 				 struct pevent_record *record,
 				 struct event_format *event, void *context);
+int ras_ccix_atc_event_handler(struct trace_seq *s,
+			       struct pevent_record *record,
+			       struct event_format *event, void *context);
 
 /* Perhaps unnecessary paranoia, but the tracepoint structure is packed */
 #pragma pack(1)
@@ -56,6 +59,12 @@ struct cper_ccix_cache_err_compact {
 	uint8_t instance;
 };
 
+struct cper_ccix_atc_err_compact {
+	uint32_t validation_bits;
+	uint8_t op_type;
+	uint8_t instance;
+};
+
 #pragma pack()
 
 #define CCIX_MEM_ERR_GENERIC_MEM_VALID		0x0001
@@ -82,4 +91,8 @@ struct cper_ccix_cache_err_compact {
 #define CCIX_CACHE_ERR_INSTANCE_ID_VALID	0x0040
 #define CCIX_CACHE_ERR_VENDOR_DATA_VALID	0x0080
 
+#define CCIX_ATC_ERR_OP_VALID			0x0001
+#define CCIX_ATC_ERR_INSTANCE_ID_VALID		0x0002
+#define CCIX_ATC_ERR_VENDOR_DATA_VALID		0x0004
+
 #endif
diff --git a/ras-events.c b/ras-events.c
index f1b67cd..68ed246 100644
--- a/ras-events.c
+++ b/ras-events.c
@@ -207,6 +207,7 @@ int toggle_ras_mc_event(int enable)
 #ifdef HAVE_CCIX
 	rc |= __toggle_ras_mc_event(ras, "ras", "ccix_memory_event", enable);
 	rc |= __toggle_ras_mc_event(ras, "ras", "ccix_cache_event", enable);
+	rc |= __toggle_ras_mc_event(ras, "ras", "ccix_atc_event", enable);
 #endif
 
 #ifdef HAVE_MCE
@@ -740,6 +741,14 @@ int handle_ras_events(int record_events)
 	else
 		log(ALL, LOG_ERR, "Can't get traces from %s:%s\n",
 		    "ras", "ccix_cache_event");
+	rc = add_event_handler(ras, pevent, page_size, "ras",
+			       "ccix_atc_error_event",
+			       ras_ccix_atc_event_handler, NULL);
+	if (!rc)
+		num_events++;
+	else
+		log(ALL, LOG_ERR, "Can't get traces from %s:%s\n",
+		    "ras", "ccix_atc_event");
 #endif
 
 #ifdef HAVE_NON_STANDARD
diff --git a/ras-record-ccix.c b/ras-record-ccix.c
index 5b6e044..df68eef 100644
--- a/ras-record-ccix.c
+++ b/ras-record-ccix.c
@@ -288,6 +288,70 @@ int ras_store_ccix_cache_event(struct ras_events *ras,
 	return rc;
 }
 
+enum {
+	ccix_atc_field_operation = ccix_field_common_end,
+	ccix_atc_field_instance,
+	ccix_atc_field_vendor,
+};
+
+static const struct db_fields ccix_atc_event_fields[] = {
+	CCIX_COMMON_FIELDS,
+	[ccix_atc_field_operation] =	{ .name = "operation",		.type = "INTEGER" },
+	[ccix_atc_field_instance] =	{ .name = "instance",		.type = "INTEGER" },
+	[ccix_atc_field_vendor] =	{ .name = "vendor_data",	.type = "BLOB" },
+};
+
+static const struct db_table_descriptor ccix_atc_event_tab = {
+	.name = "ccix_atc_event",
+	.fields = ccix_atc_event_fields,
+	.num_fields = ARRAY_SIZE(ccix_atc_event_fields),
+};
+
+int ras_store_ccix_atc_event(struct ras_events *ras, struct ras_ccix_event *ev)
+{
+	int rc;
+	struct sqlite3_priv *priv = ras->db_priv;
+	struct cper_ccix_atc_err_compact *atc =
+		(struct cper_ccix_atc_err_compact *)ev->cper_data;
+	sqlite3_stmt *rec = priv ? priv->stmt_ccix_atc_record : NULL;
+
+	if (!priv || !rec)
+		return 0;
+	log(TERM, LOG_INFO, "ccix_atc_eventstore: %p\n", rec);
+
+	ras_store_ccix_common(rec, ev);
+	if (atc->validation_bits & CCIX_ATC_ERR_OP_VALID)
+		sqlite3_bind_int(rec, ccix_atc_field_operation, atc->op_type);
+
+	if (atc->validation_bits & CCIX_ATC_ERR_INSTANCE_ID_VALID)
+		sqlite3_bind_int(rec, ccix_atc_field_instance, atc->instance);
+
+	if (atc->validation_bits & CCIX_ATC_ERR_VENDOR_DATA_VALID)
+		sqlite3_bind_blob(rec, ccix_atc_field_vendor,
+				  ev->vendor_data, ev->vendor_data_length,
+				  NULL);
+
+	rc = sqlite3_step(rec);
+	if (rc != SQLITE_OK && rc != SQLITE_DONE)
+		log(TERM, LOG_ERR,
+		    "Failed to do ccix_atc_record step on sqlite: error = %d\n",
+		    rc);
+
+	rc = sqlite3_reset(rec);
+	if (rc != SQLITE_OK && rc != SQLITE_DONE)
+		log(TERM, LOG_ERR,
+		    "Failed reset ccix_atc_record on sqlite: error = %d\n",
+		    rc);
+
+	rc = sqlite3_clear_bindings(rec);
+	if (rc != SQLITE_OK && rc != SQLITE_DONE)
+		log(TERM, LOG_ERR,
+		    "Failed to clear ccix_atc_record: error %d\n",
+		    rc);
+	log(TERM, LOG_INFO, "register inserted at db\n");
+	return rc;
+}
+
 void ras_ccix_create_table(struct sqlite3_priv *priv)
 {
 	int rc;
@@ -301,4 +365,9 @@ void ras_ccix_create_table(struct sqlite3_priv *priv)
 	if (rc == SQLITE_OK)
 		rc = ras_mc_prepare_stmt(priv, &priv->stmt_ccix_cache_record,
 					 &ccix_cache_event_tab);
+
+	rc = ras_mc_create_table(priv, &ccix_atc_event_tab);
+	if (rc == SQLITE_OK)
+		rc = ras_mc_prepare_stmt(priv, &priv->stmt_ccix_atc_record,
+					 &ccix_atc_event_tab);
 }
diff --git a/ras-record.h b/ras-record.h
index ac25ffc..c3b3586 100644
--- a/ras-record.h
+++ b/ras-record.h
@@ -126,6 +126,7 @@ struct sqlite3_priv {
 #ifdef HAVE_CCIX
 	sqlite3_stmt	*stmt_ccix_mem_record;
 	sqlite3_stmt	*stmt_ccix_cache_record;
+	sqlite3_stmt	*stmt_ccix_atc_record;
 #endif
 #ifdef HAVE_NON_STANDARD
 	sqlite3_stmt	*stmt_non_standard_record;
@@ -165,6 +166,7 @@ int ras_store_extlog_mem_record(struct ras_events *ras, struct ras_extlog_event
 void ras_ccix_create_table(struct sqlite3_priv *priv);
 int ras_store_ccix_memory_event(struct ras_events *ras, struct ras_ccix_event *ev);
 int ras_store_ccix_cache_event(struct ras_events *ras, struct ras_ccix_event *ev);
+int ras_store_ccix_atc_event(struct ras_events *ras, struct ras_ccix_event *ev);
 int ras_store_non_standard_record(struct ras_events *ras, struct ras_non_standard_event *ev);
 int ras_store_arm_record(struct ras_events *ras, struct ras_arm_event *ev);
 int ras_store_devlink_event(struct ras_events *ras, struct devlink_event *ev);
@@ -178,6 +180,7 @@ static inline int ras_store_extlog_mem_record(struct ras_events *ras, struct ras
 static inline void ras_ccix_create_table(void *priv) {};
 static inline int ras_store_ccix_memory_event(struct ras_events *ras, struct ras_ccix_event *ev) { return 0; };
 static inline int ras_store_ccix_cache_event(struct ras_events *ras, struct ras_ccix_event *ev) {return 0; };
+static inline int ras_store_ccix_atc_event(struct ras_events *ras, struct ras_ccix_event *ev) {return 0; };
 static inline int ras_store_non_standard_record(struct ras_events *ras, struct ras_non_standard_event *ev) { return 0; };
 static inline int ras_store_arm_record(struct ras_events *ras, struct ras_arm_event *ev) { return 0; };
 static inline int ras_store_devlink_event(struct ras_events *ras, struct devlink_event *ev) { return 0; };
-- 
2.20.1



* [PATCH V2 4/6] rasdaemon: CCIX: Port error support
  2019-08-27 11:30 [PATCH V2 0/6] CCIX rasdaemon support Jonathan Cameron
                   ` (2 preceding siblings ...)
  2019-08-27 11:30 ` [PATCH V2 3/6] rasdaemon: CCIX: ATC " Jonathan Cameron
@ 2019-08-27 11:30 ` Jonathan Cameron
  2019-08-27 11:30 ` [PATCH V2 5/6] rasdaemon: CCIX: Link error support Jonathan Cameron
  2019-08-27 11:30 ` [PATCH V2 6/6] rasdaemon: CCIX: Agent Internal " Jonathan Cameron
  5 siblings, 0 replies; 7+ messages in thread
From: Jonathan Cameron @ 2019-08-27 11:30 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, linux-edac
  Cc: linuxarm, jcm, shiju.jose, Jonathan Cameron

Add support for reporting CCIX Port errors and storing them
in sqlite3.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
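Note, not part of the commit: each CCIX handler in this series formats
the event time with localtime() and strftime(), and only calls strftime()
when localtime() succeeds. The sketch below is one possible way to make
sure the buffer holds a valid (possibly empty) string in every case
before it is printed. format_event_time() is a hypothetical helper, not
something this series adds; only the format string is taken from the
patches.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper: format "now" the way the handlers do, but leave
 * buf as an empty string when localtime() or strftime() fails.
 */
static size_t format_event_time(time_t now, char *buf, size_t len)
{
	struct tm *tm;
	size_t n;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	tm = localtime(&now);
	if (!tm)
		return 0;

	/* Same format string as the CCIX handlers in this series. */
	n = strftime(buf, len, "%Y-%m-%d %H:%M:%S %z", tm);
	if (n == 0)
		buf[0] = '\0';	/* contents are unspecified on overflow */
	return n;
}
```

A caller could then pass the buffer straight to trace_seq_printf()
without first checking whether the conversion succeeded.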
 ras-ccix-handler.c | 93 ++++++++++++++++++++++++++++++++++++++++++++++
 ras-ccix-handler.h | 14 +++++++
 ras-events.c       |  9 +++++
 ras-record-ccix.c  | 75 +++++++++++++++++++++++++++++++++++++
 ras-record.h       |  3 ++
 5 files changed, 194 insertions(+)

diff --git a/ras-ccix-handler.c b/ras-ccix-handler.c
index f7b9e8e..0a79627 100644
--- a/ras-ccix-handler.c
+++ b/ras-ccix-handler.c
@@ -220,6 +220,58 @@ static char *ccix_atc_err_cper_data(const char *c)
 	return buf;
 }
 
+static char *ccix_port_op(uint8_t op)
+{
+	switch (op) {
+	case 0: return "command";
+	case 1: return "read";
+	case 2: return "write";
+	}
+	return "unknown";
+}
+
+static char *ccix_port_err_type(uint8_t type)
+{
+	switch (type) {
+	case 0: return "generic bus / slave error";
+	case 1: return "bus parity / ECC error";
+	case 2: return "BDF not present";
+	case 3: return "invalid address";
+	case 4: return "invalid agent ID";
+	case 5: return "bus timeout";
+	case 6: return "hang";
+	case 7: return "egress blocked";
+	}
+	return "unknown-type";
+}
+
+static char *ccix_port_err_cper_data(const char *c)
+{
+	const struct cper_ccix_port_err_compact *cpd =
+		(struct cper_ccix_port_err_compact *)c;
+	static char buf[1024];
+	char *p = buf;
+	int i;
+
+	if (!cpd->validation_bits)
+		return "";
+
+	p += sprintf(p, " (");
+	if (cpd->validation_bits & CCIX_PORT_ERR_TYPE_VALID)
+		p += sprintf(p, "error: %s ",
+			     ccix_port_err_type(cpd->err_type));
+	if (cpd->validation_bits & CCIX_PORT_ERR_OP_VALID)
+		p += sprintf(p, "op: %s ", ccix_port_op(cpd->op_type));
+	if (cpd->validation_bits & CCIX_PORT_ERR_MESSAGE_VALID) {
+		p += sprintf(p, "message: ");
+		for (i = 0; i < 8; i++)
+			p += sprintf(p, "0x%08x ", cpd->message[i]);
+	}
+	p += sprintf(p - 1, ")");
+
+	return buf;
+}
+
 static char *ccix_component_type(int type)
 {
 	switch (type) {
@@ -417,3 +469,44 @@ int ras_ccix_atc_event_handler(struct trace_seq *s,
 
 	return 0;
 }
+
+int ras_ccix_port_event_handler(struct trace_seq *s,
+				struct pevent_record *record,
+				struct event_format *event, void *context)
+{
+	struct ras_events *ras = context;
+	struct tm *tm;
+	struct ras_ccix_event ev;
+	time_t now;
+	int ret;
+
+	if (ras->use_uptime)
+		now = record->ts/user_hz + ras->uptime_diff;
+	else
+		now = time(NULL);
+
+	tm = localtime(&now);
+
+	if (tm)
+		strftime(ev.timestamp, sizeof(ev.timestamp),
+			 "%Y-%m-%d %H:%M:%S %z", tm);
+	trace_seq_printf(s, "%s ", ev.timestamp);
+	ret = ras_ccix_common_parse(s, record, event, context, &ev);
+	if (ret)
+		return ret;
+
+	trace_seq_printf(s, "%d %s id:%d CCIX Port error: %s ue:%d nocomm:%d degraded:%d deferred:%d physical addr: 0x%llx mask: 0x%llx %s",
+			 ev.error_seq, err_severity(ev.severity),
+			 ev.source, ccix_component_type(ev.component),
+			 (ev.severity_detail & 0x1) ? 1 : 0,
+			 (ev.severity_detail & 0x2) ? 1 : 0,
+			 (ev.severity_detail & 0x4) ? 1 : 0,
+			 (ev.severity_detail & 0x8) ? 1 : 0,
+			 ev.address,
+			 err_mask(ev.pa_mask_lsb),
+			 ccix_port_err_cper_data(ev.cper_data));
+
+	ras_store_ccix_port_event(ras, &ev);
+
+	return 0;
+}
diff --git a/ras-ccix-handler.h b/ras-ccix-handler.h
index 4528af7..e824aed 100644
--- a/ras-ccix-handler.h
+++ b/ras-ccix-handler.h
@@ -27,6 +27,9 @@ int ras_ccix_cache_event_handler(struct trace_seq *s,
 int ras_ccix_atc_event_handler(struct trace_seq *s,
 			       struct pevent_record *record,
 			       struct event_format *event, void *context);
+int ras_ccix_port_event_handler(struct trace_seq *s,
+				struct pevent_record *record,
+				struct event_format *event, void *context);
 
 /* Perhaps unnecessary paranoia, but the tracepoint structure is packed */
 #pragma pack(1)
@@ -65,6 +68,12 @@ struct cper_ccix_atc_err_compact {
 	uint8_t instance;
 };
 
+struct cper_ccix_port_err_compact {
+	uint32_t validation_bits;
+	uint32_t message[8];
+	uint8_t err_type;
+	uint8_t op_type;
+};
 #pragma pack()
 
 #define CCIX_MEM_ERR_GENERIC_MEM_VALID		0x0001
@@ -95,4 +104,9 @@ struct cper_ccix_atc_err_compact {
 #define CCIX_ATC_ERR_INSTANCE_ID_VALID		0x0002
 #define CCIX_ATC_ERR_VENDOR_DATA_VALID		0x0004
 
+#define CCIX_PORT_ERR_OP_VALID			0x0001
+#define CCIX_PORT_ERR_TYPE_VALID		0x0002
+#define CCIX_PORT_ERR_MESSAGE_VALID		0x0004
+#define CCIX_PORT_ERR_VENDOR_DATA_VALID		0x0008
+
 #endif
diff --git a/ras-events.c b/ras-events.c
index 68ed246..83e28a7 100644
--- a/ras-events.c
+++ b/ras-events.c
@@ -208,6 +208,7 @@ int toggle_ras_mc_event(int enable)
 	rc |= __toggle_ras_mc_event(ras, "ras", "ccix_memory_event", enable);
 	rc |= __toggle_ras_mc_event(ras, "ras", "ccix_cache_event", enable);
 	rc |= __toggle_ras_mc_event(ras, "ras", "ccix_atc_event", enable);
+	rc |= __toggle_ras_mc_event(ras, "ras", "ccix_port_event", enable);
 #endif
 
 #ifdef HAVE_MCE
@@ -749,6 +750,14 @@ int handle_ras_events(int record_events)
 	else
 		log(ALL, LOG_ERR, "Can't get traces from %s:%s\n",
 		    "ras", "ccix_atc_event");
+	rc = add_event_handler(ras, pevent, page_size, "ras",
+			       "ccix_port_error_event",
+			       ras_ccix_port_event_handler, NULL);
+	if (!rc)
+		num_events++;
+	else
+		log(ALL, LOG_ERR, "Can't get traces from %s:%s\n",
+		    "ras", "ccix_port_event");
 #endif
 
 #ifdef HAVE_NON_STANDARD
diff --git a/ras-record-ccix.c b/ras-record-ccix.c
index df68eef..e1c5df4 100644
--- a/ras-record-ccix.c
+++ b/ras-record-ccix.c
@@ -352,6 +352,76 @@ int ras_store_ccix_atc_event(struct ras_events *ras, struct ras_ccix_event *ev)
 	return rc;
 }
 
+enum {
+	ccix_port_field_operation = ccix_field_common_end,
+	ccix_port_field_etype,
+	ccix_port_field_message,
+	ccix_port_field_vendor,
+};
+
+static const struct db_fields ccix_port_event_fields[] = {
+	CCIX_COMMON_FIELDS,
+	[ccix_port_field_operation] =	{ .name = "operation",		.type = "INTEGER" },
+	[ccix_port_field_etype] =	{ .name = "etype",		.type = "INTEGER" },
+	[ccix_port_field_message] =	{ .name = "message",		.type = "BLOB" },
+	[ccix_port_field_vendor] =	{ .name = "vendor_data",	.type = "BLOB" },
+};
+
+static const struct db_table_descriptor ccix_port_event_tab = {
+	.name = "ccix_port_event",
+	.fields = ccix_port_event_fields,
+	.num_fields = ARRAY_SIZE(ccix_port_event_fields),
+};
+
+int ras_store_ccix_port_event(struct ras_events *ras, struct ras_ccix_event *ev)
+{
+	int rc;
+	struct sqlite3_priv *priv = ras->db_priv;
+	struct cper_ccix_port_err_compact *port =
+		(struct cper_ccix_port_err_compact *)ev->cper_data;
+	sqlite3_stmt *rec = priv ? priv->stmt_ccix_port_record : NULL;
+
+	if (!priv || !rec)
+		return 0;
+	log(TERM, LOG_INFO, "ccix_port_eventstore: %p\n", rec);
+
+	ras_store_ccix_common(rec, ev);
+	if (port->validation_bits & CCIX_PORT_ERR_OP_VALID)
+		sqlite3_bind_int(rec, ccix_port_field_operation, port->op_type);
+
+	if (port->validation_bits & CCIX_PORT_ERR_TYPE_VALID)
+		sqlite3_bind_int(rec, ccix_port_field_etype, port->err_type);
+
+	if (port->validation_bits & CCIX_PORT_ERR_MESSAGE_VALID)
+		sqlite3_bind_blob(rec, ccix_port_field_message,
+				  port->message, sizeof(port->message), NULL);
+
+	if (port->validation_bits & CCIX_PORT_ERR_VENDOR_DATA_VALID)
+		sqlite3_bind_blob(rec, ccix_port_field_vendor,
+				  ev->vendor_data, ev->vendor_data_length,
+				  NULL);
+
+	rc = sqlite3_step(rec);
+	if (rc != SQLITE_OK && rc != SQLITE_DONE)
+		log(TERM, LOG_ERR,
+		    "Failed to do ccix_port_record step on sqlite: error = %d\n",
+		    rc);
+
+	rc = sqlite3_reset(rec);
+	if (rc != SQLITE_OK && rc != SQLITE_DONE)
+		log(TERM, LOG_ERR,
+		    "Failed reset ccix_port_record on sqlite: error = %d\n",
+		    rc);
+
+	rc = sqlite3_clear_bindings(rec);
+	if (rc != SQLITE_OK && rc != SQLITE_DONE)
+		log(TERM, LOG_ERR,
+		    "Failed to clear ccix_port_record: error %d\n",
+		    rc);
+	log(TERM, LOG_INFO, "register inserted at db\n");
+	return rc;
+}
+
 void ras_ccix_create_table(struct sqlite3_priv *priv)
 {
 	int rc;
@@ -370,4 +440,9 @@ void ras_ccix_create_table(struct sqlite3_priv *priv)
 	if (rc == SQLITE_OK)
 		rc = ras_mc_prepare_stmt(priv, &priv->stmt_ccix_atc_record,
 					 &ccix_atc_event_tab);
+
+	rc = ras_mc_create_table(priv, &ccix_port_event_tab);
+	if (rc == SQLITE_OK)
+		rc = ras_mc_prepare_stmt(priv, &priv->stmt_ccix_port_record,
+					 &ccix_port_event_tab);
 }
diff --git a/ras-record.h b/ras-record.h
index c3b3586..778de25 100644
--- a/ras-record.h
+++ b/ras-record.h
@@ -127,6 +127,7 @@ struct sqlite3_priv {
 	sqlite3_stmt	*stmt_ccix_mem_record;
 	sqlite3_stmt	*stmt_ccix_cache_record;
 	sqlite3_stmt	*stmt_ccix_atc_record;
+	sqlite3_stmt	*stmt_ccix_port_record;
 #endif
 #ifdef HAVE_NON_STANDARD
 	sqlite3_stmt	*stmt_non_standard_record;
@@ -167,6 +168,7 @@ void ras_ccix_create_table(struct sqlite3_priv *priv);
 int ras_store_ccix_memory_event(struct ras_events *ras, struct ras_ccix_event *ev);
 int ras_store_ccix_cache_event(struct ras_events *ras, struct ras_ccix_event *ev);
 int ras_store_ccix_atc_event(struct ras_events *ras, struct ras_ccix_event *ev);
+int ras_store_ccix_port_event(struct ras_events *ras, struct ras_ccix_event *ev);
 int ras_store_non_standard_record(struct ras_events *ras, struct ras_non_standard_event *ev);
 int ras_store_arm_record(struct ras_events *ras, struct ras_arm_event *ev);
 int ras_store_devlink_event(struct ras_events *ras, struct devlink_event *ev);
@@ -181,6 +183,7 @@ static inline void ras_ccix_create_table(void *priv) {};
 static inline int ras_store_ccix_memory_event(struct ras_events *ras, struct ras_ccix_event *ev) { return 0; };
 static inline int ras_store_ccix_cache_event(struct ras_events *ras, struct ras_ccix_event *ev) {return 0; };
 static inline int ras_store_ccix_atc_event(struct ras_events *ras, struct ras_ccix_event *ev) {return 0; };
+static inline int ras_store_ccix_port_event(struct ras_events *ras, struct ras_ccix_event *ev) {return 0; };
 static inline int ras_store_non_standard_record(struct ras_events *ras, struct ras_non_standard_event *ev) { return 0; };
 static inline int ras_store_arm_record(struct ras_events *ras, struct ras_arm_event *ev) { return 0; };
 static inline int ras_store_devlink_event(struct ras_events *ras, struct devlink_event *ev) { return 0; };
-- 
2.20.1



* [PATCH V2 5/6] rasdaemon: CCIX: Link error support
  2019-08-27 11:30 [PATCH V2 0/6] CCIX rasdaemon support Jonathan Cameron
                   ` (3 preceding siblings ...)
  2019-08-27 11:30 ` [PATCH V2 4/6] rasdaemon: CCIX: Port error support Jonathan Cameron
@ 2019-08-27 11:30 ` Jonathan Cameron
  2019-08-27 11:30 ` [PATCH V2 6/6] rasdaemon: CCIX: Agent Internal " Jonathan Cameron
  5 siblings, 0 replies; 7+ messages in thread
From: Jonathan Cameron @ 2019-08-27 11:30 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, linux-edac
  Cc: linuxarm, jcm, shiju.jose, Jonathan Cameron

Add support for reporting CCIX Link errors and storing them in
sqlite3.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 ras-ccix-handler.c | 96 ++++++++++++++++++++++++++++++++++++++++++++++
 ras-ccix-handler.h | 19 +++++++++
 ras-events.c       |  9 +++++
 ras-record-ccix.c  | 87 +++++++++++++++++++++++++++++++++++++++++
 ras-record.h       |  3 ++
 5 files changed, 214 insertions(+)

diff --git a/ras-ccix-handler.c b/ras-ccix-handler.c
index 0a79627..69baa48 100644
--- a/ras-ccix-handler.c
+++ b/ras-ccix-handler.c
@@ -272,6 +272,61 @@ static char *ccix_port_err_cper_data(const char *c)
 	return buf;
 }
 
+static char *ccix_link_err_type(uint8_t err)
+{
+	switch (err) {
+	case 0: return "generic";
+	case 1: return "credit underflow";
+	case 2: return "credit overflow";
+	case 3: return "unusable credit";
+	case 4: return "credit timeout";
+	}
+	return "unknown";
+}
+
+static char *ccix_link_credit(uint8_t credit)
+{
+	switch (credit) {
+	case 0: return "memory";
+	case 1: return "snoop";
+	case 2: return "data";
+	case 3: return "misc";
+	}
+	return "unknown";
+}
+
+static char *ccix_link_err_cper_data(const char *c)
+{
+	const struct cper_ccix_link_err_compact *cpd =
+		(struct cper_ccix_link_err_compact *)c;
+	static char buf[1024];
+	char *p = buf;
+	int i;
+
+	if (!cpd->validation_bits)
+		return "";
+
+	p += sprintf(p, " (");
+	if (cpd->validation_bits & CCIX_LINK_ERR_TYPE_VALID)
+		p += sprintf(p, "error: %s ",
+			     ccix_link_err_type(cpd->err_type));
+	if (cpd->validation_bits & CCIX_LINK_ERR_OP_VALID)
+		p += sprintf(p, "op: %s ", ccix_port_op(cpd->op_type));
+	if (cpd->validation_bits & CCIX_LINK_ERR_LINK_ID_VALID)
+		p += sprintf(p, "id: %u ", cpd->link_id);
+	if (cpd->validation_bits & CCIX_LINK_ERR_CREDIT_TYPE_VALID)
+		p += sprintf(p, "credit-type: %s ",
+			     ccix_link_credit(cpd->credit_type));
+	if (cpd->validation_bits & CCIX_LINK_ERR_MESSAGE_VALID) {
+		p += sprintf(p, "message: ");
+		for (i = 0; i < 8; i++)
+			p += sprintf(p, "0x%08x ", cpd->message[i]);
+	}
+	p += sprintf(p - 1, ")");
+
+	return buf;
+}
+
 static char *ccix_component_type(int type)
 {
 	switch (type) {
@@ -510,3 +565,44 @@ int ras_ccix_port_event_handler(struct trace_seq *s,
 
 	return 0;
 }
+
+int ras_ccix_link_event_handler(struct trace_seq *s,
+				struct pevent_record *record,
+				struct event_format *event, void *context)
+{
+	struct ras_events *ras = context;
+	struct tm *tm;
+	struct ras_ccix_event ev;
+	time_t now;
+	int ret;
+
+	if (ras->use_uptime)
+		now = record->ts/user_hz + ras->uptime_diff;
+	else
+		now = time(NULL);
+
+	tm = localtime(&now);
+
+	if (tm)
+		strftime(ev.timestamp, sizeof(ev.timestamp),
+			 "%Y-%m-%d %H:%M:%S %z", tm);
+	trace_seq_printf(s, "%s ", ev.timestamp);
+	ret = ras_ccix_common_parse(s, record, event, context, &ev);
+	if (ret)
+		return ret;
+
+	trace_seq_printf(s, "%d %s id:%d CCIX Link error: %s ue:%d nocomm:%d degraded:%d deferred:%d physical addr: 0x%llx mask: 0x%llx %s",
+			 ev.error_seq, err_severity(ev.severity),
+			 ev.source, ccix_component_type(ev.component),
+			 (ev.severity_detail & 0x1) ? 1 : 0,
+			 (ev.severity_detail & 0x2) ? 1 : 0,
+			 (ev.severity_detail & 0x4) ? 1 : 0,
+			 (ev.severity_detail & 0x8) ? 1 : 0,
+			 ev.address,
+			 err_mask(ev.pa_mask_lsb),
+			 ccix_link_err_cper_data(ev.cper_data));
+
+	ras_store_ccix_link_event(ras, &ev);
+
+	return 0;
+}
diff --git a/ras-ccix-handler.h b/ras-ccix-handler.h
index e824aed..3def534 100644
--- a/ras-ccix-handler.h
+++ b/ras-ccix-handler.h
@@ -30,6 +30,9 @@ int ras_ccix_atc_event_handler(struct trace_seq *s,
 int ras_ccix_port_event_handler(struct trace_seq *s,
 				struct pevent_record *record,
 				struct event_format *event, void *context);
+int ras_ccix_link_event_handler(struct trace_seq *s,
+				struct pevent_record *record,
+				struct event_format *event, void *context);
 
 /* Perhaps unnecessary paranoia, but the tracepoint structure is packed */
 #pragma pack(1)
@@ -74,6 +77,15 @@ struct cper_ccix_port_err_compact {
 	uint8_t err_type;
 	uint8_t op_type;
 };
+
+struct cper_ccix_link_err_compact {
+	uint32_t validation_bits;
+	uint32_t message[8];
+	uint8_t err_type;
+	uint8_t op_type;
+	uint8_t link_id;
+	uint8_t credit_type;
+};
 #pragma pack()
 
 #define CCIX_MEM_ERR_GENERIC_MEM_VALID		0x0001
@@ -109,4 +121,11 @@ struct cper_ccix_port_err_compact {
 #define CCIX_PORT_ERR_MESSAGE_VALID		0x0004
 #define CCIX_PORT_ERR_VENDOR_DATA_VALID		0x0008
 
+#define CCIX_LINK_ERR_OP_VALID			0x0001
+#define CCIX_LINK_ERR_TYPE_VALID		0x0002
+#define CCIX_LINK_ERR_LINK_ID_VALID		0x0004
+#define CCIX_LINK_ERR_CREDIT_TYPE_VALID		0x0008
+#define CCIX_LINK_ERR_MESSAGE_VALID		0x0010
+#define CCIX_LINK_ERR_VENDOR_DATA_VALID		0x0020
+
 #endif
diff --git a/ras-events.c b/ras-events.c
index 83e28a7..c73a36d 100644
--- a/ras-events.c
+++ b/ras-events.c
@@ -209,6 +209,7 @@ int toggle_ras_mc_event(int enable)
 	rc |= __toggle_ras_mc_event(ras, "ras", "ccix_cache_event", enable);
 	rc |= __toggle_ras_mc_event(ras, "ras", "ccix_atc_event", enable);
 	rc |= __toggle_ras_mc_event(ras, "ras", "ccix_port_event", enable);
+	rc |= __toggle_ras_mc_event(ras, "ras", "ccix_link_event", enable);
 #endif
 
 #ifdef HAVE_MCE
@@ -758,6 +759,14 @@ int handle_ras_events(int record_events)
 	else
 		log(ALL, LOG_ERR, "Can't get traces from %s:%s\n",
 		    "ras", "ccix_port_event");
+	rc = add_event_handler(ras, pevent, page_size, "ras",
+			       "ccix_link_error_event",
+			       ras_ccix_link_event_handler, NULL);
+	if (!rc)
+		num_events++;
+	else
+		log(ALL, LOG_ERR, "Can't get traces from %s:%s\n",
+		    "ras", "ccix_link_error_event");
 #endif
 
 #ifdef HAVE_NON_STANDARD
diff --git a/ras-record-ccix.c b/ras-record-ccix.c
index e1c5df4..1e03e84 100644
--- a/ras-record-ccix.c
+++ b/ras-record-ccix.c
@@ -422,6 +422,88 @@ int ras_store_ccix_port_event(struct ras_events *ras, struct ras_ccix_event *ev)
 	return rc;
 }
 
+enum {
+	ccix_link_field_operation = ccix_field_common_end,
+	ccix_link_field_etype,
+	ccix_link_field_link_id,
+	ccix_link_field_credit_type,
+	ccix_link_field_message,
+	ccix_link_field_vendor,
+};
+
+static const struct db_fields ccix_link_event_fields[] = {
+	CCIX_COMMON_FIELDS,
+	[ccix_link_field_operation] =	{ .name = "operation",		.type = "INTEGER" },
+	[ccix_link_field_etype] =	{ .name = "etype",		.type = "INTEGER" },
+	[ccix_link_field_link_id] =	{ .name = "link_id",		.type = "INTEGER" },
+	[ccix_link_field_credit_type] =	{ .name = "credit_type",	.type = "INTEGER" },
+	[ccix_link_field_message] =	{ .name = "message",		.type = "BLOB" },
+	[ccix_link_field_vendor] =	{ .name = "vendor_data",	.type = "BLOB" },
+};
+
+static const struct db_table_descriptor ccix_link_event_tab = {
+	.name = "ccix_link_event",
+	.fields = ccix_link_event_fields,
+	.num_fields = ARRAY_SIZE(ccix_link_event_fields),
+};
+
+int ras_store_ccix_link_event(struct ras_events *ras, struct ras_ccix_event *ev)
+{
+	int rc;
+	struct sqlite3_priv *priv = ras->db_priv;
+	struct cper_ccix_link_err_compact *link =
+		(struct cper_ccix_link_err_compact *)ev->cper_data;
+	sqlite3_stmt *rec = priv ? priv->stmt_ccix_link_record : NULL;
+
+	if (!rec)
+		return 0;
+	log(TERM, LOG_INFO, "ccix_link_eventstore: %p\n", rec);
+
+	ras_store_ccix_common(rec, ev);
+	if (link->validation_bits & CCIX_LINK_ERR_OP_VALID)
+		sqlite3_bind_int(rec, ccix_link_field_operation, link->op_type);
+
+	if (link->validation_bits & CCIX_LINK_ERR_TYPE_VALID)
+		sqlite3_bind_int(rec, ccix_link_field_etype,
+				 link->err_type);
+
+	if (link->validation_bits & CCIX_LINK_ERR_LINK_ID_VALID)
+		sqlite3_bind_int(rec, ccix_link_field_link_id, link->link_id);
+
+	if (link->validation_bits & CCIX_LINK_ERR_CREDIT_TYPE_VALID)
+		sqlite3_bind_int(rec, ccix_link_field_credit_type,
+				 link->credit_type);
+
+	if (link->validation_bits & CCIX_LINK_ERR_MESSAGE_VALID)
+		sqlite3_bind_blob(rec, ccix_link_field_message,
+				  link->message, sizeof(link->message), NULL);
+
+	if (link->validation_bits & CCIX_LINK_ERR_VENDOR_DATA_VALID)
+		sqlite3_bind_blob(rec, ccix_link_field_vendor,
+				  ev->vendor_data, ev->vendor_data_length,
+				  NULL);
+
+	rc = sqlite3_step(rec);
+	if (rc != SQLITE_OK && rc != SQLITE_DONE)
+		log(TERM, LOG_ERR,
+		    "Failed to do ccix_link_record step on sqlite: error = %d\n",
+		    rc);
+
+	rc = sqlite3_reset(rec);
+	if (rc != SQLITE_OK && rc != SQLITE_DONE)
+		log(TERM, LOG_ERR,
+		    "Failed reset ccix_link_record on sqlite: error = %d\n",
+		    rc);
+
+	rc = sqlite3_clear_bindings(rec);
+	if (rc != SQLITE_OK && rc != SQLITE_DONE)
+		log(TERM, LOG_ERR,
+		    "Failed to clear ccix_link_record: error %d\n",
+		    rc);
+	log(TERM, LOG_INFO, "register inserted at db\n");
+	return rc;
+}
+
 void ras_ccix_create_table(struct sqlite3_priv *priv)
 {
 	int rc;
@@ -445,4 +527,9 @@ void ras_ccix_create_table(struct sqlite3_priv *priv)
 	if (rc == SQLITE_OK)
 		rc = ras_mc_prepare_stmt(priv, &priv->stmt_ccix_port_record,
 					 &ccix_port_event_tab);
+
+	rc = ras_mc_create_table(priv, &ccix_link_event_tab);
+	if (rc == SQLITE_OK)
+		rc = ras_mc_prepare_stmt(priv, &priv->stmt_ccix_link_record,
+					 &ccix_link_event_tab);
 }
diff --git a/ras-record.h b/ras-record.h
index 778de25..f13e286 100644
--- a/ras-record.h
+++ b/ras-record.h
@@ -128,6 +128,7 @@ struct sqlite3_priv {
 	sqlite3_stmt	*stmt_ccix_cache_record;
 	sqlite3_stmt	*stmt_ccix_atc_record;
 	sqlite3_stmt	*stmt_ccix_port_record;
+	sqlite3_stmt	*stmt_ccix_link_record;
 #endif
 #ifdef HAVE_NON_STANDARD
 	sqlite3_stmt	*stmt_non_standard_record;
@@ -169,6 +170,7 @@ int ras_store_ccix_memory_event(struct ras_events *ras, struct ras_ccix_event *e
 int ras_store_ccix_cache_event(struct ras_events *ras, struct ras_ccix_event *ev);
 int ras_store_ccix_atc_event(struct ras_events *ras, struct ras_ccix_event *ev);
 int ras_store_ccix_port_event(struct ras_events *ras, struct ras_ccix_event *ev);
+int ras_store_ccix_link_event(struct ras_events *ras, struct ras_ccix_event *ev);
 int ras_store_non_standard_record(struct ras_events *ras, struct ras_non_standard_event *ev);
 int ras_store_arm_record(struct ras_events *ras, struct ras_arm_event *ev);
 int ras_store_devlink_event(struct ras_events *ras, struct devlink_event *ev);
@@ -184,6 +186,7 @@ static inline int ras_store_ccix_memory_event(struct ras_events *ras, struct ras
 static inline int ras_store_ccix_cache_event(struct ras_events *ras, struct ras_ccix_event *ev) {return 0; };
 static inline int ras_store_ccix_atc_event(struct ras_events *ras, struct ras_ccix_event *ev) {return 0; };
 static inline int ras_store_ccix_port_event(struct ras_events *ras, struct ras_ccix_event *ev) {return 0; };
+static inline int ras_store_ccix_link_event(struct ras_events *ras, struct ras_ccix_event *ev) {return 0; };
 static inline int ras_store_non_standard_record(struct ras_events *ras, struct ras_non_standard_event *ev) { return 0; };
 static inline int ras_store_arm_record(struct ras_events *ras, struct ras_arm_event *ev) { return 0; };
 static inline int ras_store_devlink_event(struct ras_events *ras, struct devlink_event *ev) { return 0; };
-- 
2.20.1
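
The link error decode above builds its string with unbounded sprintf()
calls into a static buffer.  As a rough sketch only, and not part of the
posted patch, the same validation-bit walk can be written with a
caller-supplied buffer and bounded writes.  The validation bit values are
copied from the patch; struct link_err_sketch, append() and
link_err_format() are hypothetical names made up for this illustration,
and the op and message fields are left out to keep it short.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Validation bits, values as in the patch (ras-ccix-handler.h). */
#define CCIX_LINK_ERR_TYPE_VALID	0x0002
#define CCIX_LINK_ERR_LINK_ID_VALID	0x0004
#define CCIX_LINK_ERR_CREDIT_TYPE_VALID	0x0008

/* Hypothetical, trimmed stand-in for struct cper_ccix_link_err_compact. */
struct link_err_sketch {
	uint32_t validation_bits;
	uint8_t err_type;
	uint8_t link_id;
	uint8_t credit_type;
};

static const char *link_err_type(uint8_t err)
{
	static const char * const names[] = {
		"generic", "credit underflow", "credit overflow",
		"unusable credit", "credit timeout",
	};

	return err < sizeof(names) / sizeof(names[0]) ? names[err] : "unknown";
}

static const char *link_credit(uint8_t credit)
{
	static const char * const names[] = {
		"memory", "snoop", "data", "misc",
	};

	return credit < sizeof(names) / sizeof(names[0]) ? names[credit] : "unknown";
}

/* Append formatted text at buf + *len, never writing past buf[size - 1]. */
static void append(char *buf, size_t size, size_t *len, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (*len >= size)
		return;
	va_start(ap, fmt);
	n = vsnprintf(buf + *len, size - *len, fmt, ap);
	va_end(ap);
	if (n < 0)
		return;
	*len += (size_t)n;
	if (*len >= size)
		*len = size - 1;	/* output was truncated */
}

/* Returns buf, holding "" when none of the handled fields is valid. */
static const char *link_err_format(const struct link_err_sketch *e,
				   char *buf, size_t size)
{
	const uint32_t handled = CCIX_LINK_ERR_TYPE_VALID |
				 CCIX_LINK_ERR_LINK_ID_VALID |
				 CCIX_LINK_ERR_CREDIT_TYPE_VALID;
	size_t len = 0;

	if (size == 0)
		return "";
	buf[0] = '\0';
	if (!(e->validation_bits & handled))
		return buf;

	append(buf, size, &len, " (");
	if (e->validation_bits & CCIX_LINK_ERR_TYPE_VALID)
		append(buf, size, &len, "error: %s ",
		       link_err_type(e->err_type));
	if (e->validation_bits & CCIX_LINK_ERR_LINK_ID_VALID)
		append(buf, size, &len, "id: %u ", (unsigned int)e->link_id);
	if (e->validation_bits & CCIX_LINK_ERR_CREDIT_TYPE_VALID)
		append(buf, size, &len, "credit-type: %s ",
		       link_credit(e->credit_type));
	/* Turn the trailing separator into the closing bracket. */
	if (len > 0 && buf[len - 1] == ' ')
		buf[len - 1] = ')';

	return buf;
}
```

Checking the handled bits before opening the bracket avoids emitting a
lone " )" when only an undecoded field (for example vendor data) is
flagged as valid.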



* [PATCH V2 6/6] rasdaemon: CCIX: Agent Internal error support
  2019-08-27 11:30 [PATCH V2 0/6] CCIX rasdaemon support Jonathan Cameron
                   ` (4 preceding siblings ...)
  2019-08-27 11:30 ` [PATCH V2 5/6] rasdaemon: CCIX: Link error support Jonathan Cameron
@ 2019-08-27 11:30 ` " Jonathan Cameron
  5 siblings, 0 replies; 7+ messages in thread
From: Jonathan Cameron @ 2019-08-27 11:30 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, linux-edac
  Cc: linuxarm, jcm, shiju.jose, Jonathan Cameron

Add support for reporting CCIX Agent Internal errors and storing them
in sqlite3.

In the current 1.0 CCIX specification these only have vendor_data
defined.  However, they are structured to allow additional fields
in future so we handle them the same way as all the other CCIX
error types.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 ras-ccix-handler.c | 40 ++++++++++++++++++++++++++++++
 ras-ccix-handler.h |  8 ++++++
 ras-events.c       |  9 +++++++
 ras-record-ccix.c  | 61 ++++++++++++++++++++++++++++++++++++++++++++++
 ras-record.h       |  3 +++
 5 files changed, 121 insertions(+)

diff --git a/ras-ccix-handler.c b/ras-ccix-handler.c
index 69baa48..2088790 100644
--- a/ras-ccix-handler.c
+++ b/ras-ccix-handler.c
@@ -606,3 +606,43 @@ int ras_ccix_link_event_handler(struct trace_seq *s,
 
 	return 0;
 }
+
+int ras_ccix_agent_event_handler(struct trace_seq *s,
+				 struct pevent_record *record,
+				 struct event_format *event, void *context)
+{
+	struct ras_events *ras = context;
+	struct tm *tm;
+	struct ras_ccix_event ev;
+	time_t now;
+	int ret;
+
+	if (ras->use_uptime)
+		now = record->ts/user_hz + ras->uptime_diff;
+	else
+		now = time(NULL);
+
+	tm = localtime(&now);
+
+	if (tm)
+		strftime(ev.timestamp, sizeof(ev.timestamp),
+			 "%Y-%m-%d %H:%M:%S %z", tm);
+	trace_seq_printf(s, "%s ", ev.timestamp);
+	ret = ras_ccix_common_parse(s, record, event, context, &ev);
+	if (ret)
+		return ret;
+
+	trace_seq_printf(s, "%d %s id:%d CCIX Agent Internal error: %s ue:%d nocomm:%d degraded:%d deferred:%d physical addr: 0x%llx mask: 0x%llx",
+			 ev.error_seq, err_severity(ev.severity),
+			 ev.source, ccix_component_type(ev.component),
+			 (ev.severity_detail & 0x1) ? 1 : 0,
+			 (ev.severity_detail & 0x2) ? 1 : 0,
+			 (ev.severity_detail & 0x4) ? 1 : 0,
+			 (ev.severity_detail & 0x8) ? 1 : 0,
+			 ev.address,
+			 err_mask(ev.pa_mask_lsb));
+
+	ras_store_ccix_agent_event(ras, &ev);
+
+	return 0;
+}
diff --git a/ras-ccix-handler.h b/ras-ccix-handler.h
index 3def534..c53e3ee 100644
--- a/ras-ccix-handler.h
+++ b/ras-ccix-handler.h
@@ -33,6 +33,9 @@ int ras_ccix_port_event_handler(struct trace_seq *s,
 int ras_ccix_link_event_handler(struct trace_seq *s,
 				struct pevent_record *record,
 				struct event_format *event, void *context);
+int ras_ccix_agent_event_handler(struct trace_seq *s,
+				 struct pevent_record *record,
+				 struct event_format *event, void *context);
 
 /* Perhaps unnecessary paranoia, but the tracepoint structure is packed */
 #pragma pack(1)
@@ -86,6 +89,10 @@ struct cper_ccix_link_err_compact {
 	uint8_t link_id;
 	uint8_t credit_type;
 };
+
+struct cper_ccix_agent_internal_err_compact {
+	uint32_t validation_bits;
+};
 #pragma pack()
 
 #define CCIX_MEM_ERR_GENERIC_MEM_VALID		0x0001
@@ -128,4 +135,5 @@ struct cper_ccix_link_err_compact {
 #define CCIX_LINK_ERR_MESSAGE_VALID		0x0010
 #define CCIX_LINK_ERR_VENDOR_DATA_VALID		0x0020
 
+#define CCIX_AGENT_ERR_VENDOR_DATA_VALID	0x0001
 #endif
diff --git a/ras-events.c b/ras-events.c
index c73a36d..4de28b7 100644
--- a/ras-events.c
+++ b/ras-events.c
@@ -210,6 +210,7 @@ int toggle_ras_mc_event(int enable)
 	rc |= __toggle_ras_mc_event(ras, "ras", "ccix_atc_event", enable);
 	rc |= __toggle_ras_mc_event(ras, "ras", "ccix_port_event", enable);
 	rc |= __toggle_ras_mc_event(ras, "ras", "ccix_link_event", enable);
+	rc |= __toggle_ras_mc_event(ras, "ras", "ccix_agent_event", enable);
 #endif
 
 #ifdef HAVE_MCE
@@ -767,6 +768,14 @@ int handle_ras_events(int record_events)
 	else
 		log(ALL, LOG_ERR, "Can't get traces from %s:%s\n",
 		    "ras", "ccix_link_event");
+	rc = add_event_handler(ras, pevent, page_size, "ras",
+			       "ccix_agent_error_event",
+			       ras_ccix_agent_event_handler, NULL);
+	if (!rc)
+		num_events++;
+	else
+		log(ALL, LOG_ERR, "Can't get traces from %s:%s\n",
+		    "ras", "ccix_agent_error_event");
 #endif
 
 #ifdef HAVE_NON_STANDARD
diff --git a/ras-record-ccix.c b/ras-record-ccix.c
index 1e03e84..79c6e52 100644
--- a/ras-record-ccix.c
+++ b/ras-record-ccix.c
@@ -504,6 +504,62 @@ int ras_store_ccix_link_event(struct ras_events *ras, struct ras_ccix_event *ev)
 	return rc;
 }
 
+enum {
+	ccix_agent_field_vendor = ccix_field_common_end,
+};
+
+static const struct db_fields ccix_agent_event_fields[] = {
+	CCIX_COMMON_FIELDS,
+	[ccix_agent_field_vendor] =	{ .name = "vendor_data",	.type = "BLOB" },
+};
+
+static const struct db_table_descriptor ccix_agent_event_tab = {
+	.name = "ccix_agent_event",
+	.fields = ccix_agent_event_fields,
+	.num_fields = ARRAY_SIZE(ccix_agent_event_fields),
+};
+
+int ras_store_ccix_agent_event(struct ras_events *ras,
+			       struct ras_ccix_event *ev)
+{
+	int rc;
+	struct sqlite3_priv *priv = ras->db_priv;
+	struct cper_ccix_agent_internal_err_compact *agent =
+		(struct cper_ccix_agent_internal_err_compact *)ev->cper_data;
+	sqlite3_stmt *rec = priv ? priv->stmt_ccix_agent_record : NULL;
+
+	if (!rec)
+		return 0;
+	log(TERM, LOG_INFO, "ccix_agent_eventstore: %p\n", rec);
+
+	ras_store_ccix_common(rec, ev);
+
+	if (agent->validation_bits & CCIX_AGENT_ERR_VENDOR_DATA_VALID)
+		sqlite3_bind_blob(rec, ccix_agent_field_vendor,
+				  ev->vendor_data, ev->vendor_data_length,
+				  NULL);
+
+	rc = sqlite3_step(rec);
+	if (rc != SQLITE_OK && rc != SQLITE_DONE)
+		log(TERM, LOG_ERR,
+		    "Failed to do ccix_agent_record step on sqlite: error = %d\n",
+		    rc);
+
+	rc = sqlite3_reset(rec);
+	if (rc != SQLITE_OK && rc != SQLITE_DONE)
+		log(TERM, LOG_ERR,
+		    "Failed reset ccix_agent_record on sqlite: error = %d\n",
+		    rc);
+
+	rc = sqlite3_clear_bindings(rec);
+	if (rc != SQLITE_OK && rc != SQLITE_DONE)
+		log(TERM, LOG_ERR,
+		    "Failed to clear ccix_agent_record: error %d\n",
+		    rc);
+	log(TERM, LOG_INFO, "register inserted at db\n");
+	return rc;
+}
+
 void ras_ccix_create_table(struct sqlite3_priv *priv)
 {
 	int rc;
@@ -532,4 +588,9 @@ void ras_ccix_create_table(struct sqlite3_priv *priv)
 	if (rc == SQLITE_OK)
 		rc = ras_mc_prepare_stmt(priv, &priv->stmt_ccix_link_record,
 					 &ccix_link_event_tab);
+
+	rc = ras_mc_create_table(priv, &ccix_agent_event_tab);
+	if (rc == SQLITE_OK)
+		rc = ras_mc_prepare_stmt(priv, &priv->stmt_ccix_agent_record,
+					 &ccix_agent_event_tab);
 }
diff --git a/ras-record.h b/ras-record.h
index f13e286..4f78e1d 100644
--- a/ras-record.h
+++ b/ras-record.h
@@ -129,6 +129,7 @@ struct sqlite3_priv {
 	sqlite3_stmt	*stmt_ccix_atc_record;
 	sqlite3_stmt	*stmt_ccix_port_record;
 	sqlite3_stmt	*stmt_ccix_link_record;
+	sqlite3_stmt	*stmt_ccix_agent_record;
 #endif
 #ifdef HAVE_NON_STANDARD
 	sqlite3_stmt	*stmt_non_standard_record;
@@ -171,6 +172,7 @@ int ras_store_ccix_cache_event(struct ras_events *ras, struct ras_ccix_event *ev
 int ras_store_ccix_atc_event(struct ras_events *ras, struct ras_ccix_event *ev);
 int ras_store_ccix_port_event(struct ras_events *ras, struct ras_ccix_event *ev);
 int ras_store_ccix_link_event(struct ras_events *ras, struct ras_ccix_event *ev);
+int ras_store_ccix_agent_event(struct ras_events *ras, struct ras_ccix_event *ev);
 int ras_store_non_standard_record(struct ras_events *ras, struct ras_non_standard_event *ev);
 int ras_store_arm_record(struct ras_events *ras, struct ras_arm_event *ev);
 int ras_store_devlink_event(struct ras_events *ras, struct devlink_event *ev);
@@ -187,6 +189,7 @@ static inline int ras_store_ccix_cache_event(struct ras_events *ras, struct ras_
 static inline int ras_store_ccix_atc_event(struct ras_events *ras, struct ras_ccix_event *ev) {return 0; };
 static inline int ras_store_ccix_port_event(struct ras_events *ras, struct ras_ccix_event *ev) {return 0; };
 static inline int ras_store_ccix_link_event(struct ras_events *ras, struct ras_ccix_event *ev) {return 0; };
+static inline int ras_store_ccix_agent_event(struct ras_events *ras, struct ras_ccix_event *ev) {return 0; };
 static inline int ras_store_non_standard_record(struct ras_events *ras, struct ras_non_standard_event *ev) { return 0; };
 static inline int ras_store_arm_record(struct ras_events *ras, struct ras_arm_event *ev) { return 0; };
 static inline int ras_store_devlink_event(struct ras_events *ras, struct devlink_event *ev) { return 0; };
-- 
2.20.1
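
Both the link and agent handlers print ue, nocomm, degraded and deferred
by testing bits 0 to 3 of ev.severity_detail inline.  A small sketch of
the same decode as a helper is given below.  The macro names, the struct
and sev_detail_decode() are hypothetical and not taken from the series,
and the field is assumed to fit in a uint8_t; only the bit positions come
from the trace_seq_printf() calls in the patches.

```c
#include <stdint.h>

/* Hypothetical names for the bits the handlers test in severity_detail. */
#define SEV_DETAIL_UE		0x1
#define SEV_DETAIL_NOCOMM	0x2
#define SEV_DETAIL_DEGRADED	0x4
#define SEV_DETAIL_DEFERRED	0x8

/* One 0/1 flag per bit, in the order the handlers print them. */
struct sev_detail_flags {
	int ue;
	int nocomm;
	int degraded;
	int deferred;
};

static struct sev_detail_flags sev_detail_decode(uint8_t detail)
{
	struct sev_detail_flags f = {
		.ue		= !!(detail & SEV_DETAIL_UE),
		.nocomm		= !!(detail & SEV_DETAIL_NOCOMM),
		.degraded	= !!(detail & SEV_DETAIL_DEGRADED),
		.deferred	= !!(detail & SEV_DETAIL_DEFERRED),
	};

	return f;
}
```

With a helper along these lines each handler could pass f.ue, f.nocomm,
f.degraded and f.deferred to trace_seq_printf() instead of repeating the
four mask tests.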



