* [PATCH V2 0/6] CCIX rasdaemon support
@ 2019-08-27 11:30 Jonathan Cameron
2019-08-27 11:30 ` [PATCH V2 1/6] rasdaemon: CCIX: memory error support Jonathan Cameron
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: Jonathan Cameron @ 2019-08-27 11:30 UTC (permalink / raw)
To: Mauro Carvalho Chehab, linux-edac
Cc: linuxarm, jcm, shiju.jose, Jonathan Cameron
Depends on the kernel patches being accepted:
https://lore.kernel.org/linux-edac/20190820144732.2370-1-Jonathan.Cameron@huawei.com/T/#t
Changes since v1:
* Separated out the ras-record section into its own file.
* Rebased on current rasdaemon tree.
This series introduces rasdaemon support to match the above kernel
series, which provides the tracepoints for CCIX PER error reporting from
the kernel to userspace.
These are errors which occur at the CCIX protocol layer, which sits
on top of PCIe (for which we have AER). They are defined in the
CCIX base specification v1.0, an evaluation version of which is available
at www.ccixconsortium.org.
Note that the following is a trademark grant and doesn't prevent normal
use covered by fair use. Given that this set doesn't quote from
the spec (other than field names), it carries no such copyright
notices.
This patch is being distributed by the CCIX Consortium, Inc. (CCIX) to
you and other parties that are participating (the "participants") in
the rasdaemon project with the understanding that the participants will use CCIX's
name and trademark only when this patch is used in association with
rasdaemon.
CCIX is also distributing this patch to these participants with the
understanding that if any portion of the CCIX specification will be
used or referenced in rasdaemon, the participants will not modify
the cited portion of the CCIX specification and will give CCIX proper
copyright attribution by including the following copyright notice with
the cited part of the CCIX specification:
"© 2019 CCIX CONSORTIUM, INC. ALL RIGHTS RESERVED."
Jonathan Cameron (6):
rasdaemon: CCIX: memory error support
rasdaemon: CCIX: Cache error support
rasdaemon: CCIX: ATC error support
rasdaemon: CCIX: Port error support
rasdaemon: CCIX: Link error support
rasdaemon: CCIX: Agent Internal error support
Makefile.am | 8 +-
configure.ac | 10 +
ras-ccix-handler.c | 648 +++++++++++++++++++++++++++++++++++++++++++++
ras-ccix-handler.h | 139 ++++++++++
ras-events.c | 61 +++++
ras-record-ccix.c | 596 +++++++++++++++++++++++++++++++++++++++++
ras-record.c | 15 +-
ras-record.h | 43 +++
ras-report.h | 6 +-
9 files changed, 1519 insertions(+), 7 deletions(-)
create mode 100644 ras-ccix-handler.c
create mode 100644 ras-ccix-handler.h
create mode 100644 ras-record-ccix.c
--
2.20.1
* [PATCH V2 1/6] rasdaemon: CCIX: memory error support
2019-08-27 11:30 [PATCH V2 0/6] CCIX rasdaemon support Jonathan Cameron
@ 2019-08-27 11:30 ` Jonathan Cameron
2019-08-27 11:30 ` [PATCH V2 2/6] rasdaemon: CCIX: Cache " Jonathan Cameron
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Jonathan Cameron @ 2019-08-27 11:30 UTC (permalink / raw)
To: Mauro Carvalho Chehab, linux-edac
Cc: linuxarm, jcm, shiju.jose, Jonathan Cameron
Adds support for basic decoding and logging of CCIX memory errors,
and for storing them in the sqlite3 DB.
Given that the CCIX memory record is very tightly defined by the
specification and that databases with large blobs in them
are not particularly useful, I have separately exposed all of the
standard fields. Note that this means leaving a column NULL when the
validation bits indicate that the corresponding field is not valid.
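As a minimal sketch of the validation-bit gating described above (not
the code in this patch): the CCIX_MEM_ERR_*_VALID values are the ones
from ras-ccix-handler.h, while struct nullable_int and
column_if_valid() are hypothetical names used only for illustration.
In the patch itself the same effect comes from simply not calling
sqlite3_bind_int() for a field whose bit is clear, which leaves that
parameter bound to NULL.

```c
#include <stdint.h>

/* Validation bits as defined in ras-ccix-handler.h in this patch. */
#define CCIX_MEM_ERR_CARD_VALID 0x0008
#define CCIX_MEM_ERR_BANK_VALID 0x0010
#define CCIX_MEM_ERR_ROW_VALID  0x0040

/* Hypothetical stand-in for one nullable INTEGER column. */
struct nullable_int {
	int is_null;
	long long value;
};

/*
 * Hypothetical helper: the column only receives a value when the
 * matching validation bit is set; otherwise it stays NULL.
 */
static struct nullable_int column_if_valid(uint32_t validation_bits,
					   uint32_t bit, long long value)
{
	struct nullable_int col = { .is_null = 1, .value = 0 };

	if (validation_bits & bit) {
		col.is_null = 0;
		col.value = value;
	}
	return col;
}
```

With validation_bits set to CARD_VALID | ROW_VALID, the card and row
columns receive values and the bank column stays NULL, so a query can
tell "bank 0" apart from "bank not reported".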
Includes making a few ras-record.c functions available from other
files to allow us to split off the CCIX error recording functionality.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
Makefile.am | 8 +-
configure.ac | 10 ++
ras-ccix-handler.c | 244 +++++++++++++++++++++++++++++++++++++++++++++
ras-ccix-handler.h | 61 ++++++++++++
ras-events.c | 16 +++
ras-record-ccix.c | 204 +++++++++++++++++++++++++++++++++++++
ras-record.c | 15 ++-
ras-record.h | 28 ++++++
ras-report.h | 6 +-
9 files changed, 585 insertions(+), 7 deletions(-)
diff --git a/Makefile.am b/Makefile.am
index 3d89672..9d54390 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -20,10 +20,16 @@ rasdaemon_SOURCES = rasdaemon.c ras-events.c ras-mc-handler.c \
bitfield.c
if WITH_SQLITE3
rasdaemon_SOURCES += ras-record.c
+if WITH_CCIX
+ rasdaemon_SOURCES += ras-record-ccix.c
+endif
endif
if WITH_AER
rasdaemon_SOURCES += ras-aer-handler.c
endif
+if WITH_CCIX
+ rasdaemon_SOURCES += ras-ccix-handler.c
+endif
if WITH_NON_STANDARD
rasdaemon_SOURCES += ras-non-standard-handler.c
endif
@@ -56,7 +62,7 @@ rasdaemon_LDADD = -lpthread $(SQLITE3_LIBS) libtrace/libtrace.a
include_HEADERS = config.h ras-events.h ras-logger.h ras-mc-handler.h \
ras-aer-handler.h ras-mce-handler.h ras-record.h bitfield.h ras-report.h \
ras-extlog-handler.h ras-arm-handler.h ras-non-standard-handler.h \
- ras-devlink-handler.h
+ ras-devlink-handler.h ras-ccix-handler.h
# This rule can't be called with more than one Makefile job (like make -j8)
# I can't figure out a way to fix that
diff --git a/configure.ac b/configure.ac
index fecff51..ca8977c 100644
--- a/configure.ac
+++ b/configure.ac
@@ -44,6 +44,15 @@ AS_IF([test "x$enable_aer" = "xyes"], [
])
AM_CONDITIONAL([WITH_AER], [test x$enable_aer = xyes])
+AC_ARG_ENABLE([ccix],
+ AS_HELP_STRING([--enable-ccix], [enable CCIX PER events (currently experimental)]))
+
+AS_IF([test "x$enable_ccix" = "xyes"], [
+ AC_DEFINE(HAVE_CCIX,1,"have CCIX PER events collect")
+ AC_SUBST([WITH_CCIX])
+])
+AM_CONDITIONAL([WITH_CCIX], [test x$enable_ccix = xyes])
+
AC_ARG_ENABLE([non_standard],
AS_HELP_STRING([--enable-non-standard], [enable NON_STANDARD events (currently experimental)]))
@@ -137,4 +146,5 @@ compile time options summary
HIP07 SAS HW errors : $enable_hisi_ns_decode
ARM events : $enable_arm
DEVLINK : $enable_devlink
+ CCIX : $enable_ccix
EOF
diff --git a/ras-ccix-handler.c b/ras-ccix-handler.c
new file mode 100644
index 0000000..2be413f
--- /dev/null
+++ b/ras-ccix-handler.c
@@ -0,0 +1,244 @@
+/*
+ * Copyright (c) 2019 Hisilicon Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include "libtrace/kbuffer.h"
+#include "ras-record.h"
+#include "ras-logger.h"
+#include "bitfield.h"
+#include "ras-report.h"
+
+static char *ccix_mem_pool_type(uint8_t pt)
+{
+ switch (pt) {
+ case 0: return "other/not-specified";
+ case 1: return "ROM";
+ case 2: return "volatile";
+ case 3: return "non-volatile";
+ case 4: return "device/register";
+ }
+ if (pt >= 0x80)
+ return "vendor";
+ return "unknown";
+}
+
+static char *ccix_mem_spec_type(uint8_t st)
+{
+ switch (st) {
+ case 0: return "other/not-specified";
+ case 1: return "SRAM";
+ case 2: return "DDR";
+ case 3: return "NVDIMM-F";
+ case 4: return "NVDIMM-N";
+ case 5: return "HBM";
+ case 6: return "flash";
+ }
+ if (st >= 0x80)
+ return "vendor";
+ return "unknown";
+}
+
+static char *ccix_mem_op(uint8_t op)
+{
+ switch (op) {
+ case 0: return "generic";
+ case 1: return "read";
+ case 2: return "write";
+ case 4: return "scrub";
+ }
+ return "unknown";
+}
+
+static char *ccix_mem_err_type(int etype)
+{
+ switch (etype) {
+ case 0: return "unknown";
+ case 1: return "no error";
+ case 2: return "single-bit ECC";
+ case 3: return "multi-bit ECC";
+ case 4: return "single-symbol chipkill ECC";
+ case 5: return "multi-symbol chipkill ECC";
+ case 6: return "master abort";
+ case 7: return "target abort";
+ case 8: return "parity error";
+ case 9: return "watchdog timeout";
+ case 10: return "invalid address";
+ case 11: return "mirror broken";
+ case 12: return "memory sparing";
+ case 13: return "scrub";
+ case 14: return "physical memory map-out event";
+ }
+ return "unknown-type";
+}
+
+static char *ccix_mem_err_cper_data(const char *c)
+{
+ const struct cper_ccix_mem_err_compact *cpd =
+ (struct cper_ccix_mem_err_compact *)c;
+ static char buf[1024];
+ char *p = buf;
+
+ p += sprintf(p, " (");
+ p += sprintf(p, "fru: %u ", cpd->fru);
+ if (cpd->validation_bits & CCIX_MEM_ERR_MEM_ERR_TYPE_VALID)
+ p += sprintf(p, "error: %s ",
+ ccix_mem_err_type(cpd->mem_err_type));
+ if (cpd->validation_bits & CCIX_MEM_ERR_GENERIC_MEM_VALID)
+ p += sprintf(p, "type: %s ",
+ ccix_mem_pool_type(cpd->pool_generic_type));
+ if (cpd->validation_bits & CCIX_MEM_ERR_SPEC_TYPE_VALID)
+ p += sprintf(p, "sub_type: %s ",
+ ccix_mem_spec_type(cpd->pool_specific_type));
+ if (cpd->validation_bits & CCIX_MEM_ERR_OP_VALID)
+ p += sprintf(p, "op: %s ", ccix_mem_op(cpd->op_type));
+ if (cpd->validation_bits & CCIX_MEM_ERR_CARD_VALID)
+ p += sprintf(p, "card: %u ", cpd->card);
+ if (cpd->validation_bits & CCIX_MEM_ERR_MOD_VALID)
+ p += sprintf(p, "mod: %u ", cpd->module);
+ if (cpd->validation_bits & CCIX_MEM_ERR_BANK_VALID)
+ p += sprintf(p, "bank: %u ", cpd->bank);
+ if (cpd->validation_bits & CCIX_MEM_ERR_DEVICE_VALID)
+ p += sprintf(p, "device: %u ", cpd->device);
+ if (cpd->validation_bits & CCIX_MEM_ERR_ROW_VALID)
+ p += sprintf(p, "row: %u ", cpd->row);
+ if (cpd->validation_bits & CCIX_MEM_ERR_COL_VALID)
+ p += sprintf(p, "col: %u ", cpd->column);
+ if (cpd->validation_bits & CCIX_MEM_ERR_RANK_VALID)
+ p += sprintf(p, "rank: %u ", cpd->rank);
+ if (cpd->validation_bits & CCIX_MEM_ERR_BIT_POS_VALID)
+ p += sprintf(p, "bitpos: %u ", cpd->bit_pos);
+ if (cpd->validation_bits & CCIX_MEM_ERR_CHIP_ID_VALID)
+ p += sprintf(p, "chipid: %u ", cpd->chip_id);
+ p += sprintf(p - 1, ")");
+
+ return buf;
+}
+
+static char *ccix_component_type(int type)
+{
+ switch (type) {
+ case 0: return "RA";
+ case 1: return "HA";
+ case 2: return "SA";
+ case 3: return "Port";
+ case 4: return "CCIX-Link";
+ }
+ return "unknown-component";
+}
+
+static char *err_severity(int severity)
+{
+ switch (severity) {
+ case 0: return "recoverable";
+ case 1: return "fatal";
+ case 2: return "corrected";
+ case 3: return "informational";
+ }
+ return "unknown-severity";
+}
+
+static unsigned long long err_mask(int lsb)
+{
+ if (lsb == 0xff)
+ return ~0ull;
+ return ~((1ull << lsb) - 1);
+}
+
+static int ras_ccix_common_parse(struct trace_seq *s,
+ struct pevent_record *record,
+ struct event_format *event, void *context,
+ struct ras_ccix_event *ev)
+{
+ unsigned long long val;
+ int len;
+
+ if (pevent_get_field_val(s, event, "err_seq", record, &val, 1) < 0)
+ return -1;
+ ev->error_seq = val;
+ if (pevent_get_field_val(s, event, "sev", record, &val, 1) < 0)
+ return -1;
+ ev->severity = val;
+ if (pevent_get_field_val(s, event, "sevdetail", record, &val, 1) < 0)
+ return -1;
+ ev->severity_detail = val;
+ if (pevent_get_field_val(s, event, "pa", record, &val, 1) < 0)
+ return -1;
+ ev->address = val;
+ if (pevent_get_field_val(s, event, "pa_mask_lsb", record, &val, 1) < 0)
+ return -1;
+ ev->pa_mask_lsb = val;
+ if (pevent_get_field_val(s, event, "source", record, &val, 1) < 0)
+ return -1;
+ ev->source = val;
+ if (pevent_get_field_val(s, event, "component", record, &val, 1) < 0)
+ return -1;
+ ev->component = val;
+
+ ev->cper_data = pevent_get_field_raw(s, event, "data", record, &len, 1);
+ ev->cper_data_length = len;
+
+ if (pevent_get_field_val(s, event, "vendor_data_length", record, &val,
+ 1))
+ return -1;
+ ev->vendor_data_length = val;
+
+ ev->vendor_data = pevent_get_field_raw(s, event, "vendor_data", record,
+ &len, 1);
+
+ return 0;
+}
+
+int ras_ccix_memory_event_handler(struct trace_seq *s,
+ struct pevent_record *record,
+ struct event_format *event, void *context)
+{
+ struct ras_events *ras = context;
+ struct tm *tm;
+ struct ras_ccix_event ev;
+ time_t now;
+ int ret;
+
+ if (ras->use_uptime)
+ now = record->ts/user_hz + ras->uptime_diff;
+ else
+ now = time(NULL);
+
+ tm = localtime(&now);
+
+ if (tm)
+ strftime(ev.timestamp, sizeof(ev.timestamp),
+ "%Y-%m-%d %H:%M:%S %z", tm);
+ trace_seq_printf(s, "%s ", ev.timestamp);
+
+ ret = ras_ccix_common_parse(s, record, event, context, &ev);
+ if (ret)
+ return ret;
+
+ trace_seq_printf(s, "%d %s id:%d CCIX memory error %s ue:%d nocomm:%d degraded:%d deferred:%d physical addr: 0x%llx mask: 0x%llx %s",
+ ev.error_seq, err_severity(ev.severity),
+ ev.source, ccix_component_type(ev.component),
+ (ev.severity_detail & 0x1) ? 1 : 0,
+ (ev.severity_detail & 0x2) ? 1 : 0,
+ (ev.severity_detail & 0x4) ? 1 : 0,
+ (ev.severity_detail & 0x8) ? 1 : 0,
+ ev.address,
+ err_mask(ev.pa_mask_lsb),
+ ccix_mem_err_cper_data(ev.cper_data));
+
+ ras_store_ccix_memory_event(ras, &ev);
+
+ return 0;
+}
diff --git a/ras-ccix-handler.h b/ras-ccix-handler.h
new file mode 100644
index 0000000..f6d25b1
--- /dev/null
+++ b/ras-ccix-handler.h
@@ -0,0 +1,61 @@
+/*
+ * Copyright (c) 2019 Hisilicon Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef __RAS_CCIX_HANDLER_H
+#define __RAS_CCIX_HANDLER_H
+
+#include "ras-events.h"
+#include "libtrace/event-parse.h"
+
+int ras_ccix_memory_event_handler(struct trace_seq *s,
+ struct pevent_record *record,
+ struct event_format *event, void *context);
+
+/* Perhaps unnecessary paranoia, but the tracepoint structure is packed */
+#pragma pack(1)
+struct cper_ccix_mem_err_compact {
+ uint32_t validation_bits;
+ uint8_t mem_err_type;
+ uint8_t pool_generic_type;
+ uint8_t pool_specific_type;
+ uint8_t op_type;
+ uint8_t card;
+ uint16_t module;
+ uint16_t bank;
+ uint32_t device;
+ uint32_t row;
+ uint32_t column;
+ uint32_t rank;
+ uint8_t bit_pos;
+ uint8_t chip_id;
+ uint8_t fru;
+};
+#pragma pack()
+
+#define CCIX_MEM_ERR_GENERIC_MEM_VALID 0x0001
+#define CCIX_MEM_ERR_OP_VALID 0x0002
+#define CCIX_MEM_ERR_MEM_ERR_TYPE_VALID 0x0004
+#define CCIX_MEM_ERR_CARD_VALID 0x0008
+#define CCIX_MEM_ERR_BANK_VALID 0x0010
+#define CCIX_MEM_ERR_DEVICE_VALID 0x0020
+#define CCIX_MEM_ERR_ROW_VALID 0x0040
+#define CCIX_MEM_ERR_COL_VALID 0x0080
+#define CCIX_MEM_ERR_RANK_VALID 0x0100
+#define CCIX_MEM_ERR_BIT_POS_VALID 0x0200
+#define CCIX_MEM_ERR_CHIP_ID_VALID 0x0400
+#define CCIX_MEM_ERR_VENDOR_DATA_VALID 0x0800
+#define CCIX_MEM_ERR_MOD_VALID 0x1000
+#define CCIX_MEM_ERR_SPEC_TYPE_VALID 0x2000
+
+#endif
diff --git a/ras-events.c b/ras-events.c
index 6ba7a6a..e365d97 100644
--- a/ras-events.c
+++ b/ras-events.c
@@ -29,6 +29,7 @@
#include "libtrace/event-parse.h"
#include "ras-mc-handler.h"
#include "ras-aer-handler.h"
+#include "ras-ccix-handler.h"
#include "ras-non-standard-handler.h"
#include "ras-arm-handler.h"
#include "ras-mce-handler.h"
@@ -203,6 +204,10 @@ int toggle_ras_mc_event(int enable)
rc |= __toggle_ras_mc_event(ras, "ras", "aer_event", enable);
#endif
+#ifdef HAVE_CCIX
+ rc |= __toggle_ras_mc_event(ras, "ras", "ccix_memory_error_event", enable);
+#endif
+
#ifdef HAVE_MCE
rc |= __toggle_ras_mc_event(ras, "mce", "mce_record", enable);
#endif
@@ -717,6 +722,17 @@ int handle_ras_events(int record_events)
"ras", "aer_event");
#endif
+#ifdef HAVE_CCIX
+ rc = add_event_handler(ras, pevent, page_size, "ras",
+ "ccix_memory_error_event",
+ ras_ccix_memory_event_handler, NULL);
+ if (!rc)
+ num_events++;
+ else
+ log(ALL, LOG_ERR, "Can't get traces from %s:%s\n",
+ "ras", "ccix_memory_error_event");
+#endif
+
#ifdef HAVE_NON_STANDARD
rc = add_event_handler(ras, pevent, page_size, "ras", "non_standard_event",
ras_non_standard_event_handler, NULL);
diff --git a/ras-record-ccix.c b/ras-record-ccix.c
new file mode 100644
index 0000000..6e46b40
--- /dev/null
+++ b/ras-record-ccix.c
@@ -0,0 +1,204 @@
+/*
+ * Copyright (C) 2019 Jonathan Cameron <Jonathan.Cameron@huawei.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+*/
+
+#include <string.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include "bitfield.h"
+#include "ras-ccix-handler.h"
+#include "ras-logger.h"
+#include "ras-record.h"
+#include "ras-report.h"
+
+enum {
+ ccix_field_id,
+ ccix_field_timestamp,
+ ccix_field_error_count,
+ ccix_field_severity,
+ ccix_field_severity_detail,
+ ccix_field_address,
+ ccix_field_address_mask,
+ ccix_field_source,
+ ccix_field_component,
+ ccix_field_common_end
+};
+
+#define CCIX_COMMON_FIELDS \
+ [ccix_field_id] = { .name = "id", .type = "INTEGER PRIMARY KEY" }, \
+ [ccix_field_timestamp] = { .name = "timestamp", .type = "TEXT" }, \
+ [ccix_field_error_count] = { .name = "error_count", .type = "INTEGER" }, \
+ [ccix_field_severity] = { .name = "severity", .type = "INTEGER" }, \
+ [ccix_field_severity_detail] = { .name = "severity_detail", .type = "INTEGER" }, \
+ [ccix_field_address] = { .name = "address", .type = "INTEGER" }, \
+ [ccix_field_address_mask] = { .name = "address_mask", .type = "INTEGER" }, \
+ [ccix_field_source] = { .name = "source", .type = "INTEGER" }, \
+ [ccix_field_component] = { .name = "component", .type = "INTEGER" }
+
+enum {
+ ccix_mem_field_error_type = ccix_field_common_end,
+ ccix_mem_field_fru,
+ ccix_mem_field_type,
+ ccix_mem_field_sub_type,
+ ccix_mem_field_operation,
+ ccix_mem_field_card,
+ ccix_mem_field_mod,
+ ccix_mem_field_bank,
+ ccix_mem_field_device,
+ ccix_mem_field_row,
+ ccix_mem_field_col,
+ ccix_mem_field_rank,
+ ccix_mem_field_bit_pos,
+ ccix_mem_field_chip_id,
+ ccix_mem_field_vendor
+};
+
+static const struct db_fields ccix_memory_event_fields[] = {
+ CCIX_COMMON_FIELDS,
+ [ccix_mem_field_error_type] = { .name = "mem_err_type", .type = "INTEGER" },
+ [ccix_mem_field_fru] = { .name = "fru", .type = "INTEGER" },
+ [ccix_mem_field_type] = { .name = "type", .type = "INTEGER" },
+ [ccix_mem_field_sub_type] = { .name = "sub_type", .type = "INTEGER" },
+ [ccix_mem_field_operation] = { .name = "operation", .type = "INTEGER" },
+ [ccix_mem_field_card] = { .name = "card", .type = "INTEGER" },
+ [ccix_mem_field_mod] = { .name = "mod", .type = "INTEGER" },
+ [ccix_mem_field_bank] = { .name = "bank", .type = "INTEGER" },
+ [ccix_mem_field_device] = { .name = "device", .type = "INTEGER" },
+ [ccix_mem_field_row] = { .name = "row", .type = "INTEGER" },
+ [ccix_mem_field_col] = { .name = "col", .type = "INTEGER" },
+ [ccix_mem_field_rank] = { .name = "rank", .type = "INTEGER" },
+ [ccix_mem_field_bit_pos] = { .name = "bit_position", .type = "INTEGER" },
+ [ccix_mem_field_chip_id] = { .name = "chip_id", .type = "INTEGER" },
+ [ccix_mem_field_vendor] = { .name = "vendor_data", .type = "BLOB" },
+};
+
+static const struct db_table_descriptor ccix_memory_event_tab = {
+ .name = "ccix_memory_event",
+ .fields = ccix_memory_event_fields,
+ .num_fields = ARRAY_SIZE(ccix_memory_event_fields),
+};
+
+static void ras_store_ccix_common(sqlite3_stmt *record,
+ struct ras_ccix_event *ev)
+{
+ sqlite3_bind_text(record, ccix_field_timestamp, ev->timestamp, -1,
+ NULL);
+ sqlite3_bind_int(record, ccix_field_error_count, ev->error_seq);
+ sqlite3_bind_int(record, ccix_field_severity, ev->severity);
+ sqlite3_bind_int(record, ccix_field_severity_detail,
+ ev->severity_detail);
+ sqlite3_bind_int64(record, ccix_field_address, ev->address);
+ sqlite3_bind_int64(record, ccix_field_address_mask, ev->pa_mask_lsb);
+ sqlite3_bind_int(record, ccix_field_source, ev->source);
+ sqlite3_bind_int(record, ccix_field_component, ev->component);
+}
+
+int ras_store_ccix_memory_event(struct ras_events *ras,
+ struct ras_ccix_event *ev)
+{
+ int rc;
+ struct sqlite3_priv *priv = ras->db_priv;
+ struct cper_ccix_mem_err_compact *mem =
+ (struct cper_ccix_mem_err_compact *)ev->cper_data;
+ sqlite3_stmt *rec = priv ? priv->stmt_ccix_mem_record : NULL;
+
+ if (!priv || !rec)
+ return 0;
+ log(TERM, LOG_INFO, "ccix_memory_event store: %p\n", rec);
+
+ ras_store_ccix_common(rec, ev);
+
+ sqlite3_bind_int(rec, ccix_mem_field_fru, mem->fru);
+
+ if (mem->validation_bits & CCIX_MEM_ERR_MEM_ERR_TYPE_VALID)
+ sqlite3_bind_int(rec, ccix_mem_field_error_type,
+ mem->mem_err_type);
+
+ if (mem->validation_bits & CCIX_MEM_ERR_GENERIC_MEM_VALID)
+ sqlite3_bind_int(rec, ccix_mem_field_type,
+ mem->pool_generic_type);
+
+ if (mem->validation_bits & CCIX_MEM_ERR_SPEC_TYPE_VALID)
+ sqlite3_bind_int(rec, ccix_mem_field_sub_type,
+ mem->pool_specific_type);
+
+ if (mem->validation_bits & CCIX_MEM_ERR_OP_VALID)
+ sqlite3_bind_int(rec, ccix_mem_field_operation, mem->op_type);
+
+ if (mem->validation_bits & CCIX_MEM_ERR_CARD_VALID)
+ sqlite3_bind_int(rec, ccix_mem_field_card, mem->card);
+
+ if (mem->validation_bits & CCIX_MEM_ERR_MOD_VALID)
+ sqlite3_bind_int(rec, ccix_mem_field_mod, mem->module);
+
+ if (mem->validation_bits & CCIX_MEM_ERR_BANK_VALID)
+ sqlite3_bind_int(rec, ccix_mem_field_bank, mem->bank);
+
+ if (mem->validation_bits & CCIX_MEM_ERR_DEVICE_VALID)
+ sqlite3_bind_int(rec, ccix_mem_field_device, mem->device);
+
+ if (mem->validation_bits & CCIX_MEM_ERR_ROW_VALID)
+ sqlite3_bind_int(rec, ccix_mem_field_row, mem->row);
+
+ if (mem->validation_bits & CCIX_MEM_ERR_COL_VALID)
+ sqlite3_bind_int(rec, ccix_mem_field_col, mem->column);
+
+ if (mem->validation_bits & CCIX_MEM_ERR_RANK_VALID)
+ sqlite3_bind_int(rec, ccix_mem_field_rank, mem->rank);
+
+ if (mem->validation_bits & CCIX_MEM_ERR_BIT_POS_VALID)
+ sqlite3_bind_int(rec, ccix_mem_field_bit_pos, mem->bit_pos);
+
+ if (mem->validation_bits & CCIX_MEM_ERR_CHIP_ID_VALID)
+ sqlite3_bind_int(rec, ccix_mem_field_chip_id, mem->chip_id);
+
+ if (mem->validation_bits & CCIX_MEM_ERR_VENDOR_DATA_VALID)
+ sqlite3_bind_blob(rec, ccix_mem_field_vendor,
+ ev->vendor_data, ev->vendor_data_length,
+ NULL);
+
+ rc = sqlite3_step(rec);
+ if (rc != SQLITE_OK && rc != SQLITE_DONE)
+ log(TERM, LOG_ERR,
+ "Failed to do ccix_mem_record step on sqlite: error = %d\n",
+ rc);
+
+ rc = sqlite3_reset(rec);
+ if (rc != SQLITE_OK && rc != SQLITE_DONE)
+ log(TERM, LOG_ERR,
+ "Failed to reset ccix_mem_record on sqlite: error = %d\n",
+ rc);
+
+ rc = sqlite3_clear_bindings(rec);
+ if (rc != SQLITE_OK && rc != SQLITE_DONE)
+ log(TERM, LOG_ERR,
+ "Failed to clear ccix_mem_record: error %d\n",
+ rc);
+ log(TERM, LOG_INFO, "register inserted at db\n");
+ return rc;
+}
+
+void ras_ccix_create_table(struct sqlite3_priv *priv)
+{
+ int rc;
+
+ rc = ras_mc_create_table(priv, &ccix_memory_event_tab);
+ if (rc == SQLITE_OK)
+ rc = ras_mc_prepare_stmt(priv, &priv->stmt_ccix_mem_record,
+ &ccix_memory_event_tab);
+}
diff --git a/ras-record.c b/ras-record.c
index b212607..874902c 100644
--- a/ras-record.c
+++ b/ras-record.c
@@ -28,6 +28,7 @@
#include "ras-events.h"
#include "ras-mc-handler.h"
#include "ras-aer-handler.h"
+#include "ras-ccix-handler.h"
#include "ras-mce-handler.h"
#include "ras-logger.h"
@@ -449,9 +450,9 @@ int ras_store_devlink_event(struct ras_events *ras, struct devlink_event *ev)
* Generic code
*/
-static int ras_mc_prepare_stmt(struct sqlite3_priv *priv,
- sqlite3_stmt **stmt,
- const struct db_table_descriptor *db_tab)
+int ras_mc_prepare_stmt(struct sqlite3_priv *priv,
+ sqlite3_stmt **stmt,
+ const struct db_table_descriptor *db_tab)
{
int i, rc;
@@ -495,8 +496,8 @@ static int ras_mc_prepare_stmt(struct sqlite3_priv *priv,
return rc;
}
-static int ras_mc_create_table(struct sqlite3_priv *priv,
- const struct db_table_descriptor *db_tab)
+int ras_mc_create_table(struct sqlite3_priv *priv,
+ const struct db_table_descriptor *db_tab)
{
const struct db_fields *field;
char sql[1024], *p = sql, *end = sql + sizeof(sql);
@@ -604,6 +605,10 @@ int ras_mc_event_opendb(unsigned cpu, struct ras_events *ras)
&extlog_event_tab);
#endif
+#ifdef HAVE_CCIX
+ ras_ccix_create_table(priv);
+#endif
+
#ifdef HAVE_MCE
rc = ras_mc_create_table(priv, &mce_record_tab);
if (rc == SQLITE_OK)
diff --git a/ras-record.h b/ras-record.h
index 432a571..c094c91 100644
--- a/ras-record.h
+++ b/ras-record.h
@@ -44,6 +44,21 @@ struct ras_aer_event {
const char *msg;
};
+struct ras_ccix_event {
+ char timestamp[64];
+ int32_t error_seq;
+ int8_t severity;
+ int8_t severity_detail;
+ unsigned long long address;
+ uint8_t pa_mask_lsb;
+ uint8_t source;
+ uint8_t component;
+ const char *cper_data;
+ unsigned short cper_data_length;
+ uint16_t vendor_data_length;
+ const char *vendor_data;
+};
+
struct ras_extlog_event {
char timestamp[64];
int32_t error_seq;
@@ -108,6 +123,9 @@ struct sqlite3_priv {
#ifdef HAVE_EXTLOG
sqlite3_stmt *stmt_extlog_record;
#endif
+#ifdef HAVE_CCIX
+ sqlite3_stmt *stmt_ccix_mem_record;
+#endif
#ifdef HAVE_NON_STANDARD
sqlite3_stmt *stmt_non_standard_record;
#endif
@@ -131,12 +149,20 @@ struct db_table_descriptor {
};
int ras_mc_event_opendb(unsigned cpu, struct ras_events *ras);
+int ras_mc_prepare_stmt(struct sqlite3_priv *priv,
+ sqlite3_stmt **stmt,
+ const struct db_table_descriptor *db_tab);
+int ras_mc_create_table(struct sqlite3_priv *priv,
+ const struct db_table_descriptor *db_tab);
+
int ras_mc_add_vendor_table(struct ras_events *ras, sqlite3_stmt **stmt,
const struct db_table_descriptor *db_tab);
int ras_store_mc_event(struct ras_events *ras, struct ras_mc_event *ev);
int ras_store_aer_event(struct ras_events *ras, struct ras_aer_event *ev);
int ras_store_mce_record(struct ras_events *ras, struct mce_event *ev);
int ras_store_extlog_mem_record(struct ras_events *ras, struct ras_extlog_event *ev);
+void ras_ccix_create_table(struct sqlite3_priv *priv);
+int ras_store_ccix_memory_event(struct ras_events *ras, struct ras_ccix_event *ev);
int ras_store_non_standard_record(struct ras_events *ras, struct ras_non_standard_event *ev);
int ras_store_arm_record(struct ras_events *ras, struct ras_arm_event *ev);
int ras_store_devlink_event(struct ras_events *ras, struct devlink_event *ev);
@@ -147,6 +173,8 @@ static inline int ras_store_mc_event(struct ras_events *ras, struct ras_mc_event
static inline int ras_store_aer_event(struct ras_events *ras, struct ras_aer_event *ev) { return 0; };
static inline int ras_store_mce_record(struct ras_events *ras, struct mce_event *ev) { return 0; };
static inline int ras_store_extlog_mem_record(struct ras_events *ras, struct ras_extlog_event *ev) { return 0; };
+static inline void ras_ccix_create_table(void *priv) {};
+static inline int ras_store_ccix_memory_event(struct ras_events *ras, struct ras_ccix_event *ev) { return 0; };
static inline int ras_store_non_standard_record(struct ras_events *ras, struct ras_non_standard_event *ev) { return 0; };
static inline int ras_store_arm_record(struct ras_events *ras, struct ras_arm_event *ev) { return 0; };
static inline int ras_store_devlink_event(struct ras_events *ras, struct devlink_event *ev) { return 0; };
diff --git a/ras-report.h b/ras-report.h
index cb133a1..4684fdc 100644
--- a/ras-report.h
+++ b/ras-report.h
@@ -19,6 +19,7 @@
#include "ras-mc-handler.h"
#include "ras-mce-handler.h"
#include "ras-aer-handler.h"
+#include "ras-ccix-handler.h"
/* Maximal length of backtrace. */
#define MAX_BACKTRACE_SIZE (1024*1024)
@@ -35,7 +36,8 @@ enum {
AER_EVENT,
NON_STANDARD_EVENT,
ARM_EVENT,
- DEVLINK_EVENT
+ DEVLINK_EVENT,
+ CCIX_EVENT,
};
#ifdef HAVE_ABRT_REPORT
@@ -46,6 +48,7 @@ int ras_report_mce_event(struct ras_events *ras, struct mce_event *ev);
int ras_report_non_standard_event(struct ras_events *ras, struct ras_non_standard_event *ev);
int ras_report_arm_event(struct ras_events *ras, struct ras_arm_event *ev);
int ras_report_devlink_event(struct ras_events *ras, struct devlink_event *ev);
+int ras_report_ccix_event(struct ras_events *ras, struct ras_ccix_event *ev);
#else
@@ -55,6 +58,7 @@ static inline int ras_report_mce_event(struct ras_events *ras, struct mce_event
static inline int ras_report_non_standard_event(struct ras_events *ras, struct ras_non_standard_event *ev) { return 0; };
static inline int ras_report_arm_event(struct ras_events *ras, struct ras_arm_event *ev) { return 0; };
static inline int ras_report_devlink_event(struct ras_events *ras, struct devlink_event *ev) { return 0; };
+static inline int ras_report_ccix_event(struct ras_events *ras, struct ras_ccix_event *ev) { return 0; };
#endif
--
2.20.1
* [PATCH V2 2/6] rasdaemon: CCIX: Cache error support
2019-08-27 11:30 [PATCH V2 0/6] CCIX rasdaemon support Jonathan Cameron
2019-08-27 11:30 ` [PATCH V2 1/6] rasdaemon: CCIX: memory error support Jonathan Cameron
@ 2019-08-27 11:30 ` Jonathan Cameron
2019-08-27 11:30 ` [PATCH V2 3/6] rasdaemon: CCIX: ATC " Jonathan Cameron
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Jonathan Cameron @ 2019-08-27 11:30 UTC (permalink / raw)
To: Mauro Carvalho Chehab, linux-edac
Cc: linuxarm, jcm, shiju.jose, Jonathan Cameron
Adds support for CCIX cache error reporting and logging
to sqlite3.
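Like the memory handler, the cache handler prints a physical address
mask expanded from the pa_mask_lsb tracepoint field. A rough sketch of
that expansion follows, modelled on err_mask() from patch 1.
ccix_pa_mask() is a hypothetical name. Taking the value as an
unsigned byte and returning an empty mask for shifts of 64 or more
are assumptions of this sketch, not behaviour taken from the patch or
the specification.

```c
#include <stdint.h>

/*
 * Hypothetical helper: expand the "least significant valid bit"
 * value into a 64-bit address mask.
 */
static uint64_t ccix_pa_mask(uint8_t lsb)
{
	/* err_mask() in patch 1 returns an all-ones mask for 0xff. */
	if (lsb == 0xff)
		return ~UINT64_C(0);
	/* Assumption: avoid an undefined shift by 64 or more bits. */
	if (lsb >= 64)
		return 0;
	/* Clear the low 'lsb' bits, keep everything above them. */
	return ~((UINT64_C(1) << lsb) - 1);
}
```

For example, a pa_mask_lsb of 12 yields 0xfffffffffffff000, meaning the
reported physical address is only meaningful at 4KiB granularity.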
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
ras-ccix-handler.c | 114 +++++++++++++++++++++++++++++++++++++++++++++
ras-ccix-handler.h | 24 ++++++++++
ras-events.c | 9 ++++
ras-record-ccix.c | 100 +++++++++++++++++++++++++++++++++++++++
ras-record.h | 3 ++
5 files changed, 250 insertions(+)
diff --git a/ras-ccix-handler.c b/ras-ccix-handler.c
index 2be413f..f68c297 100644
--- a/ras-ccix-handler.c
+++ b/ras-ccix-handler.c
@@ -127,6 +127,79 @@ static char *ccix_mem_err_cper_data(const char *c)
return buf;
}
+static char *ccix_cache_type(uint8_t type)
+{
+ switch (type) {
+ case 0: return "instruction";
+ case 1: return "data";
+ case 2: return "generic/unified";
+ case 3: return "snoop filter directory";
+ }
+ return "unknown";
+}
+
+static char *ccix_cache_err_type(int etype)
+{
+ switch (etype) {
+ case 0: return "data";
+ case 1: return "tag";
+ case 2: return "timeout";
+ case 3: return "hang";
+ case 4: return "data loss";
+ case 5: return "invalid address";
+ }
+ return "unknown-type";
+}
+
+static char *ccix_cache_op(uint8_t op)
+{
+ switch (op) {
+ case 0: return "generic";
+ case 1: return "generic read";
+ case 2: return "generic write";
+ case 3: return "data read";
+ case 4: return "data write";
+ case 5: return "instruction fetch";
+ case 6: return "prefetch";
+ case 7: return "eviction";
+ case 8: return "snooping";
+ case 9: return "snooped";
+ case 10: return "management/command";
+ }
+ return "unknown";
+}
+
+static char *ccix_cache_err_cper_data(const char *c)
+{
+ const struct cper_ccix_cache_err_compact *cpd =
+ (struct cper_ccix_cache_err_compact *)c;
+ static char buf[1024];
+ char *p = buf;
+
+ if (!(cpd->validation_bits))
+ return "";
+
+ p += sprintf(p, " (");
+ if (cpd->validation_bits & CCIX_CACHE_ERR_CACHE_ERR_TYPE_VALID)
+ p += sprintf(p, "error: %s ",
+ ccix_cache_err_type(cpd->cache_error_type));
+ if (cpd->validation_bits & CCIX_CACHE_ERR_TYPE_VALID)
+ p += sprintf(p, "type: %s ", ccix_cache_type(cpd->cache_type));
+ if (cpd->validation_bits & CCIX_CACHE_ERR_OP_VALID)
+ p += sprintf(p, "op: %s ", ccix_cache_op(cpd->op_type));
+ if (cpd->validation_bits & CCIX_CACHE_ERR_LEVEL_VALID)
+ p += sprintf(p, "level: %u ", cpd->cache_level);
+ if (cpd->validation_bits & CCIX_CACHE_ERR_SET_VALID)
+ p += sprintf(p, "set: %u ", cpd->set);
+ if (cpd->validation_bits & CCIX_CACHE_ERR_WAY_VALID)
+ p += sprintf(p, "way: %u ", cpd->way);
+ if (cpd->validation_bits & CCIX_CACHE_ERR_INSTANCE_ID_VALID)
+ p += sprintf(p, "instance: %u ", cpd->instance);
+ p += sprintf(p - 1, ")");
+
+ return buf;
+}
+
static char *ccix_component_type(int type)
{
switch (type) {
@@ -242,3 +315,44 @@ int ras_ccix_memory_event_handler(struct trace_seq *s,
return 0;
}
+
+int ras_ccix_cache_event_handler(struct trace_seq *s,
+ struct pevent_record *record,
+ struct event_format *event, void *context)
+{
+ struct ras_events *ras = context;
+ struct tm *tm;
+ struct ras_ccix_event ev;
+ time_t now;
+ int ret;
+
+ if (ras->use_uptime)
+ now = record->ts/user_hz + ras->uptime_diff;
+ else
+ now = time(NULL);
+
+ tm = localtime(&now);
+
+ if (tm)
+ strftime(ev.timestamp, sizeof(ev.timestamp),
+ "%Y-%m-%d %H:%M:%S %z", tm);
+ trace_seq_printf(s, "%s ", ev.timestamp);
+ ret = ras_ccix_common_parse(s, record, event, context, &ev);
+ if (ret)
+ return ret;
+
+ trace_seq_printf(s, "%d %s id:%d CCIX cache error: %s ue:%d nocomm:%d degraded:%d deferred:%d physical addr: 0x%llx mask: 0x%llx %s",
+ ev.error_seq, err_severity(ev.severity),
+ ev.source, ccix_component_type(ev.component),
+ (ev.severity_detail & 0x1) ? 1 : 0,
+ (ev.severity_detail & 0x2) ? 1 : 0,
+ (ev.severity_detail & 0x4) ? 1 : 0,
+ (ev.severity_detail & 0x8) ? 1 : 0,
+ ev.address,
+ err_mask(ev.pa_mask_lsb),
+ ccix_cache_err_cper_data(ev.cper_data));
+
+ ras_store_ccix_cache_event(ras, &ev);
+
+ return 0;
+}
diff --git a/ras-ccix-handler.h b/ras-ccix-handler.h
index f6d25b1..629ccbe 100644
--- a/ras-ccix-handler.h
+++ b/ras-ccix-handler.h
@@ -21,6 +21,9 @@
int ras_ccix_memory_event_handler(struct trace_seq *s,
struct pevent_record *record,
struct event_format *event, void *context);
+int ras_ccix_cache_event_handler(struct trace_seq *s,
+ struct pevent_record *record,
+ struct event_format *event, void *context);
/* Perhaps unnecessary paranoia, but the tracepoint structure is packed */
#pragma pack(1)
@@ -41,6 +44,18 @@ struct cper_ccix_mem_err_compact {
uint8_t chip_id;
uint8_t fru;
};
+
+struct cper_ccix_cache_err_compact {
+ uint32_t validation_bits;
+ uint32_t set;
+ uint32_t way;
+ uint8_t cache_type;
+ uint8_t op_type;
+ uint8_t cache_error_type;
+ uint8_t cache_level;
+ uint8_t instance;
+};
+
#pragma pack()
#define CCIX_MEM_ERR_GENERIC_MEM_VALID 0x0001
@@ -58,4 +73,13 @@ struct cper_ccix_mem_err_compact {
#define CCIX_MEM_ERR_MOD_VALID 0x1000
#define CCIX_MEM_ERR_SPEC_TYPE_VALID 0x2000
+#define CCIX_CACHE_ERR_TYPE_VALID 0x0001
+#define CCIX_CACHE_ERR_OP_VALID 0x0002
+#define CCIX_CACHE_ERR_CACHE_ERR_TYPE_VALID 0x0004
+#define CCIX_CACHE_ERR_LEVEL_VALID 0x0008
+#define CCIX_CACHE_ERR_SET_VALID 0x0010
+#define CCIX_CACHE_ERR_WAY_VALID 0x0020
+#define CCIX_CACHE_ERR_INSTANCE_ID_VALID 0x0040
+#define CCIX_CACHE_ERR_VENDOR_DATA_VALID 0x0080
+
#endif
diff --git a/ras-events.c b/ras-events.c
index e365d97..f1b67cd 100644
--- a/ras-events.c
+++ b/ras-events.c
@@ -206,6 +206,7 @@ int toggle_ras_mc_event(int enable)
#ifdef HAVE_CCIX
rc |= __toggle_ras_mc_event(ras, "ras", "ccix_memory_event", enable);
+ rc |= __toggle_ras_mc_event(ras, "ras", "ccix_cache_event", enable);
#endif
#ifdef HAVE_MCE
@@ -731,6 +732,14 @@ int handle_ras_events(int record_events)
else
log(ALL, LOG_ERR, "Can't get traces from %s:%s\n",
"ras", "ccix_memory_event");
+ rc = add_event_handler(ras, pevent, page_size, "ras",
+ "ccix_cache_error_event",
+ ras_ccix_cache_event_handler, NULL);
+ if (!rc)
+ num_events++;
+ else
+ log(ALL, LOG_ERR, "Can't get traces from %s:%s\n",
+ "ras", "ccix_cache_event");
#endif
#ifdef HAVE_NON_STANDARD
diff --git a/ras-record-ccix.c b/ras-record-ccix.c
index 6e46b40..5b6e044 100644
--- a/ras-record-ccix.c
+++ b/ras-record-ccix.c
@@ -193,6 +193,101 @@ int ras_store_ccix_memory_event(struct ras_events *ras,
return rc;
}
+enum {
+ ccix_cache_field_type = ccix_field_common_end,
+ ccix_cache_field_operation,
+ ccix_cache_field_error_type,
+ ccix_cache_field_level,
+ ccix_cache_field_set,
+ ccix_cache_field_way,
+ ccix_cache_field_instance,
+ ccix_cache_field_vendor,
+};
+
+static const struct db_fields ccix_cache_event_fields[] = {
+ CCIX_COMMON_FIELDS,
+ [ccix_cache_field_type] = { .name = "type", .type = "INTEGER" },
+ [ccix_cache_field_operation] = { .name = "operation", .type = "INTEGER" },
+ [ccix_cache_field_error_type] = { .name = "cache_err_type", .type = "INTEGER" },
+ [ccix_cache_field_level] = { .name = "\"level\"", .type = "INTEGER" },
+ [ccix_cache_field_set] = { .name = "\"set\"", .type = "INTEGER" },
+ [ccix_cache_field_way] = { .name = "way", .type = "INTEGER" },
+ [ccix_cache_field_instance] = { .name = "instance", .type = "INTEGER" },
+ [ccix_cache_field_vendor] = { .name = "vendor_data", .type = "BLOB" },
+};
+
+static const struct db_table_descriptor ccix_cache_event_tab = {
+ .name = "ccix_cache_event",
+ .fields = ccix_cache_event_fields,
+ .num_fields = ARRAY_SIZE(ccix_cache_event_fields),
+};
+
+int ras_store_ccix_cache_event(struct ras_events *ras,
+ struct ras_ccix_event *ev)
+{
+ int rc;
+ struct sqlite3_priv *priv = ras->db_priv;
+ struct cper_ccix_cache_err_compact *cache =
+ (struct cper_ccix_cache_err_compact *)ev->cper_data;
+ sqlite3_stmt *rec = priv ? priv->stmt_ccix_cache_record : NULL;
+
+ if (!priv || !rec)
+ return 0;
+ log(TERM, LOG_INFO, "ccix_cache_eventstore: %p\n", rec);
+
+ ras_store_ccix_common(rec, ev);
+
+ if (cache->validation_bits & CCIX_CACHE_ERR_CACHE_ERR_TYPE_VALID)
+ sqlite3_bind_int(rec, ccix_cache_field_error_type,
+ cache->cache_error_type);
+
+ if (cache->validation_bits & CCIX_CACHE_ERR_TYPE_VALID)
+ sqlite3_bind_int(rec, ccix_cache_field_type, cache->cache_type);
+
+ if (cache->validation_bits & CCIX_CACHE_ERR_OP_VALID)
+ sqlite3_bind_int(rec, ccix_cache_field_operation,
+ cache->op_type);
+
+ if (cache->validation_bits & CCIX_CACHE_ERR_LEVEL_VALID)
+ sqlite3_bind_int(rec, ccix_cache_field_level,
+ cache->cache_level);
+
+ if (cache->validation_bits & CCIX_CACHE_ERR_SET_VALID)
+ sqlite3_bind_int(rec, ccix_cache_field_set, cache->set);
+
+ if (cache->validation_bits & CCIX_CACHE_ERR_WAY_VALID)
+ sqlite3_bind_int(rec, ccix_cache_field_way, cache->way);
+
+ if (cache->validation_bits & CCIX_CACHE_ERR_INSTANCE_ID_VALID)
+ sqlite3_bind_int(rec, ccix_cache_field_instance,
+ cache->instance);
+
+ if (cache->validation_bits & CCIX_CACHE_ERR_VENDOR_DATA_VALID)
+ sqlite3_bind_blob(rec, ccix_cache_field_vendor,
+ ev->vendor_data, ev->vendor_data_length,
+ NULL);
+
+ rc = sqlite3_step(rec);
+ if (rc != SQLITE_OK && rc != SQLITE_DONE)
+ log(TERM, LOG_ERR,
+ "Failed to do ccix_cache_record step on sqlite: error = %d\n",
+ rc);
+
+ rc = sqlite3_reset(rec);
+ if (rc != SQLITE_OK && rc != SQLITE_DONE)
+ log(TERM, LOG_ERR,
+ "Failed to reset ccix_cache_record on sqlite: error = %d\n",
+ rc);
+
+ rc = sqlite3_clear_bindings(rec);
+ if (rc != SQLITE_OK && rc != SQLITE_DONE)
+ log(TERM, LOG_ERR,
+ "Failed to clear ccix_cache_record: error %d\n",
+ rc);
+ log(TERM, LOG_INFO, "register inserted at db\n");
+ return rc;
+}
+
void ras_ccix_create_table(struct sqlite3_priv *priv)
{
int rc;
@@ -201,4 +296,9 @@ void ras_ccix_create_table(struct sqlite3_priv *priv)
if (rc == SQLITE_OK)
rc = ras_mc_prepare_stmt(priv, &priv->stmt_ccix_mem_record,
&ccix_memory_event_tab);
+
+ rc = ras_mc_create_table(priv, &ccix_cache_event_tab);
+ if (rc == SQLITE_OK)
+ rc = ras_mc_prepare_stmt(priv, &priv->stmt_ccix_cache_record,
+ &ccix_cache_event_tab);
}
diff --git a/ras-record.h b/ras-record.h
index c094c91..ac25ffc 100644
--- a/ras-record.h
+++ b/ras-record.h
@@ -125,6 +125,7 @@ struct sqlite3_priv {
#endif
#ifdef HAVE_CCIX
sqlite3_stmt *stmt_ccix_mem_record;
+ sqlite3_stmt *stmt_ccix_cache_record;
#endif
#ifdef HAVE_NON_STANDARD
sqlite3_stmt *stmt_non_standard_record;
@@ -163,6 +164,7 @@ int ras_store_mce_record(struct ras_events *ras, struct mce_event *ev);
int ras_store_extlog_mem_record(struct ras_events *ras, struct ras_extlog_event *ev);
void ras_ccix_create_table(struct sqlite3_priv *priv);
int ras_store_ccix_memory_event(struct ras_events *ras, struct ras_ccix_event *ev);
+int ras_store_ccix_cache_event(struct ras_events *ras, struct ras_ccix_event *ev);
int ras_store_non_standard_record(struct ras_events *ras, struct ras_non_standard_event *ev);
int ras_store_arm_record(struct ras_events *ras, struct ras_arm_event *ev);
int ras_store_devlink_event(struct ras_events *ras, struct devlink_event *ev);
@@ -175,6 +177,7 @@ static inline int ras_store_mce_record(struct ras_events *ras, struct mce_event
static inline int ras_store_extlog_mem_record(struct ras_events *ras, struct ras_extlog_event *ev) { return 0; };
static inline void ras_ccix_create_table(void *priv) {};
static inline int ras_store_ccix_memory_event(struct ras_events *ras, struct ras_ccix_event *ev) { return 0; };
+static inline int ras_store_ccix_cache_event(struct ras_events *ras, struct ras_ccix_event *ev) {return 0; };
static inline int ras_store_non_standard_record(struct ras_events *ras, struct ras_non_standard_event *ev) { return 0; };
static inline int ras_store_arm_record(struct ras_events *ras, struct ras_arm_event *ev) { return 0; };
static inline int ras_store_devlink_event(struct ras_events *ras, struct devlink_event *ev) { return 0; };
--
2.20.1
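
For comparison with ccix_cache_err_cper_data() in this patch, here is a minimal standalone sketch of the same validity-bit-gated formatting done with bounded writes. It is not the rasdaemon code: the ex_* names, the EX_*_VALID constants and the reduced struct are hypothetical stand-ins chosen for illustration, and only three of the fields are shown. It also returns an empty string when no known field is valid, instead of emitting a lone " )".

```c
#include <assert.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for three of the CCIX_CACHE_ERR_*_VALID bits. */
#define EX_LEVEL_VALID 0x0008
#define EX_SET_VALID   0x0010
#define EX_WAY_VALID   0x0020

/* Hypothetical subset of the compact record; not the tracepoint layout. */
struct ex_cache_err {
	uint32_t validation_bits;
	uint32_t set;
	uint32_t way;
	uint8_t cache_level;
};

/* Append formatted text at *off, never writing past buf[len - 1]. */
static void ex_append(char *buf, size_t len, size_t *off, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (*off >= len)
		return;
	va_start(ap, fmt);
	n = vsnprintf(buf + *off, len - *off, fmt, ap);
	va_end(ap);
	if (n < 0)
		return;
	*off += (size_t)n;
	if (*off >= len)
		*off = len - 1;	/* output was truncated */
}

/* Format only the fields whose validity bit is set; "" if none are. */
static const char *ex_cache_err_format(const struct ex_cache_err *e,
				       char *buf, size_t len)
{
	size_t off = 0;
	const char *sep = "";

	assert(buf && len > 0);
	buf[0] = '\0';

	ex_append(buf, len, &off, " (");
	if (e->validation_bits & EX_LEVEL_VALID) {
		ex_append(buf, len, &off, "%slevel: %u", sep,
			  (unsigned)e->cache_level);
		sep = " ";
	}
	if (e->validation_bits & EX_SET_VALID) {
		ex_append(buf, len, &off, "%sset: %u", sep, (unsigned)e->set);
		sep = " ";
	}
	if (e->validation_bits & EX_WAY_VALID) {
		ex_append(buf, len, &off, "%sway: %u", sep, (unsigned)e->way);
		sep = " ";
	}
	if (sep[0] == '\0') {
		/* No known field was valid: report nothing at all. */
		buf[0] = '\0';
		return buf;
	}
	ex_append(buf, len, &off, ")");
	return buf;
}
```

Passing the buffer in, rather than returning a function-local static, keeps the helper reentrant; whether that matters for the single-threaded event loop here is a judgment call.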
* [PATCH V2 3/6] rasdaemon: CCIX: ATC error support
2019-08-27 11:30 [PATCH V2 0/6] CCIX rasdaemon support Jonathan Cameron
2019-08-27 11:30 ` [PATCH V2 1/6] rasdaemon: CCIX: memory error support Jonathan Cameron
2019-08-27 11:30 ` [PATCH V2 2/6] rasdaemon: CCIX: Cache " Jonathan Cameron
@ 2019-08-27 11:30 ` Jonathan Cameron
2019-08-27 11:30 ` [PATCH V2 4/6] rasdaemon: CCIX: Port error support Jonathan Cameron
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Jonathan Cameron @ 2019-08-27 11:30 UTC (permalink / raw)
To: Mauro Carvalho Chehab, linux-edac
Cc: linuxarm, jcm, shiju.jose, Jonathan Cameron
Add support for CCIX address translation cache (ATC) errors.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
ras-ccix-handler.c | 61 ++++++++++++++++++++++++++++++++++++++++
ras-ccix-handler.h | 13 +++++++++
ras-events.c | 9 ++++++
ras-record-ccix.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++
ras-record.h | 3 ++
5 files changed, 155 insertions(+)
diff --git a/ras-ccix-handler.c b/ras-ccix-handler.c
index f68c297..f7b9e8e 100644
--- a/ras-ccix-handler.c
+++ b/ras-ccix-handler.c
@@ -200,6 +200,26 @@ static char *ccix_cache_err_cper_data(const char *c)
return buf;
}
+static char *ccix_atc_err_cper_data(const char *c)
+{
+ const struct cper_ccix_atc_err_compact *cpd =
+ (struct cper_ccix_atc_err_compact *)c;
+ static char buf[1024];
+ char *p = buf;
+
+ if (!cpd->validation_bits)
+ return "";
+
+ p += sprintf(p, " (");
+ if (cpd->validation_bits & CCIX_ATC_ERR_OP_VALID)
+ p += sprintf(p, "op: %s ", ccix_cache_op(cpd->op_type));
+ if (cpd->validation_bits & CCIX_ATC_ERR_INSTANCE_ID_VALID)
+ p += sprintf(p, "instance: %u ", cpd->instance);
+ p += sprintf(p - 1, ")");
+
+ return buf;
+}
+
static char *ccix_component_type(int type)
{
switch (type) {
@@ -356,3 +376,44 @@ int ras_ccix_cache_event_handler(struct trace_seq *s,
return 0;
}
+
+int ras_ccix_atc_event_handler(struct trace_seq *s,
+ struct pevent_record *record,
+ struct event_format *event, void *context)
+{
+ struct ras_events *ras = context;
+ struct tm *tm;
+ struct ras_ccix_event ev;
+ time_t now;
+ int ret;
+
+ if (ras->use_uptime)
+ now = record->ts/user_hz + ras->uptime_diff;
+ else
+ now = time(NULL);
+
+ tm = localtime(&now);
+
+ if (tm)
+ strftime(ev.timestamp, sizeof(ev.timestamp),
+ "%Y-%m-%d %H:%M:%S %z", tm);
+ trace_seq_printf(s, "%s ", ev.timestamp);
+ ret = ras_ccix_common_parse(s, record, event, context, &ev);
+ if (ret)
+ return ret;
+
+ trace_seq_printf(s, "%d %s id:%d CCIX ATC error: %s ue:%d nocomm:%d degraded:%d deferred:%d physical addr: 0x%llx mask: 0x%llx %s",
+ ev.error_seq, err_severity(ev.severity),
+ ev.source, ccix_component_type(ev.component),
+ (ev.severity_detail & 0x1) ? 1 : 0,
+ (ev.severity_detail & 0x2) ? 1 : 0,
+ (ev.severity_detail & 0x4) ? 1 : 0,
+ (ev.severity_detail & 0x8) ? 1 : 0,
+ ev.address,
+ err_mask(ev.pa_mask_lsb),
+ ccix_atc_err_cper_data(ev.cper_data));
+
+ ras_store_ccix_atc_event(ras, &ev);
+
+ return 0;
+}
diff --git a/ras-ccix-handler.h b/ras-ccix-handler.h
index 629ccbe..4528af7 100644
--- a/ras-ccix-handler.h
+++ b/ras-ccix-handler.h
@@ -24,6 +24,9 @@ int ras_ccix_memory_event_handler(struct trace_seq *s,
int ras_ccix_cache_event_handler(struct trace_seq *s,
struct pevent_record *record,
struct event_format *event, void *context);
+int ras_ccix_atc_event_handler(struct trace_seq *s,
+ struct pevent_record *record,
+ struct event_format *event, void *context);
/* Perhaps unnecessary paranoia, but the tracepoint structure is packed */
#pragma pack(1)
@@ -56,6 +59,12 @@ struct cper_ccix_cache_err_compact {
uint8_t instance;
};
+struct cper_ccix_atc_err_compact {
+ uint32_t validation_bits;
+ uint8_t op_type;
+ uint8_t instance;
+};
+
#pragma pack()
#define CCIX_MEM_ERR_GENERIC_MEM_VALID 0x0001
@@ -82,4 +91,8 @@ struct cper_ccix_cache_err_compact {
#define CCIX_CACHE_ERR_INSTANCE_ID_VALID 0x0040
#define CCIX_CACHE_ERR_VENDOR_DATA_VALID 0x0080
+#define CCIX_ATC_ERR_OP_VALID 0x0001
+#define CCIX_ATC_ERR_INSTANCE_ID_VALID 0x0002
+#define CCIX_ATC_ERR_VENDOR_DATA_VALID 0x0004
+
#endif
diff --git a/ras-events.c b/ras-events.c
index f1b67cd..68ed246 100644
--- a/ras-events.c
+++ b/ras-events.c
@@ -207,6 +207,7 @@ int toggle_ras_mc_event(int enable)
#ifdef HAVE_CCIX
rc |= __toggle_ras_mc_event(ras, "ras", "ccix_memory_event", enable);
rc |= __toggle_ras_mc_event(ras, "ras", "ccix_cache_event", enable);
+ rc |= __toggle_ras_mc_event(ras, "ras", "ccix_atc_event", enable);
#endif
#ifdef HAVE_MCE
@@ -740,6 +741,14 @@ int handle_ras_events(int record_events)
else
log(ALL, LOG_ERR, "Can't get traces from %s:%s\n",
"ras", "ccix_cache_event");
+ rc = add_event_handler(ras, pevent, page_size, "ras",
+ "ccix_atc_error_event",
+ ras_ccix_atc_event_handler, NULL);
+ if (!rc)
+ num_events++;
+ else
+ log(ALL, LOG_ERR, "Can't get traces from %s:%s\n",
+ "ras", "ccix_atc_event");
#endif
#ifdef HAVE_NON_STANDARD
diff --git a/ras-record-ccix.c b/ras-record-ccix.c
index 5b6e044..df68eef 100644
--- a/ras-record-ccix.c
+++ b/ras-record-ccix.c
@@ -288,6 +288,70 @@ int ras_store_ccix_cache_event(struct ras_events *ras,
return rc;
}
+enum {
+ ccix_atc_field_operation = ccix_field_common_end,
+ ccix_atc_field_instance,
+ ccix_atc_field_vendor,
+};
+
+static const struct db_fields ccix_atc_event_fields[] = {
+ CCIX_COMMON_FIELDS,
+ [ccix_atc_field_operation] = { .name = "operation", .type = "INTEGER" },
+ [ccix_atc_field_instance] = { .name = "instance", .type = "INTEGER" },
+ [ccix_atc_field_vendor] = { .name = "vendor_data", .type = "BLOB" },
+};
+
+static const struct db_table_descriptor ccix_atc_event_tab = {
+ .name = "ccix_atc_event",
+ .fields = ccix_atc_event_fields,
+ .num_fields = ARRAY_SIZE(ccix_atc_event_fields),
+};
+
+int ras_store_ccix_atc_event(struct ras_events *ras, struct ras_ccix_event *ev)
+{
+ int rc;
+ struct sqlite3_priv *priv = ras->db_priv;
+ struct cper_ccix_atc_err_compact *atc =
+ (struct cper_ccix_atc_err_compact *)ev->cper_data;
+ sqlite3_stmt *rec = priv ? priv->stmt_ccix_atc_record : NULL;
+
+ if (!priv || !rec)
+ return 0;
+ log(TERM, LOG_INFO, "ccix_atc_eventstore: %p\n", rec);
+
+ ras_store_ccix_common(priv->stmt_ccix_atc_record, ev);
+ if (atc->validation_bits & CCIX_ATC_ERR_OP_VALID)
+ sqlite3_bind_int(rec, ccix_atc_field_operation, atc->op_type);
+
+ if (atc->validation_bits & CCIX_ATC_ERR_INSTANCE_ID_VALID)
+ sqlite3_bind_int(rec, ccix_atc_field_instance, atc->instance);
+
+ if (atc->validation_bits & CCIX_ATC_ERR_VENDOR_DATA_VALID)
+ sqlite3_bind_blob(rec, ccix_atc_field_vendor,
+ ev->vendor_data, ev->vendor_data_length,
+ NULL);
+
+ rc = sqlite3_step(rec);
+ if (rc != SQLITE_OK && rc != SQLITE_DONE)
+ log(TERM, LOG_ERR,
+ "Failed to do ccix_atc_record step on sqlite: error = %d\n",
+ rc);
+
+ rc = sqlite3_reset(rec);
+ if (rc != SQLITE_OK && rc != SQLITE_DONE)
+ log(TERM, LOG_ERR,
+ "Failed to reset ccix_atc_record on sqlite: error = %d\n",
+ rc);
+
+ rc = sqlite3_clear_bindings(rec);
+ if (rc != SQLITE_OK && rc != SQLITE_DONE)
+ log(TERM, LOG_ERR,
+ "Failed to clear ccix_atc_record: error %d\n",
+ rc);
+ log(TERM, LOG_INFO, "register inserted at db\n");
+ return rc;
+}
+
void ras_ccix_create_table(struct sqlite3_priv *priv)
{
int rc;
@@ -301,4 +365,9 @@ void ras_ccix_create_table(struct sqlite3_priv *priv)
if (rc == SQLITE_OK)
rc = ras_mc_prepare_stmt(priv, &priv->stmt_ccix_cache_record,
&ccix_cache_event_tab);
+
+ rc = ras_mc_create_table(priv, &ccix_atc_event_tab);
+ if (rc == SQLITE_OK)
+ rc = ras_mc_prepare_stmt(priv, &priv->stmt_ccix_atc_record,
+ &ccix_atc_event_tab);
}
diff --git a/ras-record.h b/ras-record.h
index ac25ffc..c3b3586 100644
--- a/ras-record.h
+++ b/ras-record.h
@@ -126,6 +126,7 @@ struct sqlite3_priv {
#ifdef HAVE_CCIX
sqlite3_stmt *stmt_ccix_mem_record;
sqlite3_stmt *stmt_ccix_cache_record;
+ sqlite3_stmt *stmt_ccix_atc_record;
#endif
#ifdef HAVE_NON_STANDARD
sqlite3_stmt *stmt_non_standard_record;
@@ -165,6 +166,7 @@ int ras_store_extlog_mem_record(struct ras_events *ras, struct ras_extlog_event
void ras_ccix_create_table(struct sqlite3_priv *priv);
int ras_store_ccix_memory_event(struct ras_events *ras, struct ras_ccix_event *ev);
int ras_store_ccix_cache_event(struct ras_events *ras, struct ras_ccix_event *ev);
+int ras_store_ccix_atc_event(struct ras_events *ras, struct ras_ccix_event *ev);
int ras_store_non_standard_record(struct ras_events *ras, struct ras_non_standard_event *ev);
int ras_store_arm_record(struct ras_events *ras, struct ras_arm_event *ev);
int ras_store_devlink_event(struct ras_events *ras, struct devlink_event *ev);
@@ -178,6 +180,7 @@ static inline int ras_store_extlog_mem_record(struct ras_events *ras, struct ras
static inline void ras_ccix_create_table(void *priv) {};
static inline int ras_store_ccix_memory_event(struct ras_events *ras, struct ras_ccix_event *ev) { return 0; };
static inline int ras_store_ccix_cache_event(struct ras_events *ras, struct ras_ccix_event *ev) {return 0; };
+static inline int ras_store_ccix_atc_event(struct ras_events *ras, struct ras_ccix_event *ev) {return 0; };
static inline int ras_store_non_standard_record(struct ras_events *ras, struct ras_non_standard_event *ev) { return 0; };
static inline int ras_store_arm_record(struct ras_events *ras, struct ras_arm_event *ev) { return 0; };
static inline int ras_store_devlink_event(struct ras_events *ras, struct devlink_event *ev) { return 0; };
--
2.20.1
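
The handlers in this patch read the compact record by casting the raw cper_data pointer to a packed struct. A minimal sketch of an alternative that copies the record out with a length check is below. The ex_* names are hypothetical, and the raw_len parameter is an assumption: nothing in this patch shows whether the record length is available at this point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the field order of cper_ccix_atc_err_compact in this patch. */
#pragma pack(1)
struct ex_atc_err {
	uint32_t validation_bits;
	uint8_t op_type;
	uint8_t instance;
};
#pragma pack()

/* With pack(1) there is no padding, so the record is exactly 6 bytes. */
_Static_assert(sizeof(struct ex_atc_err) == 6, "unexpected padding");

/*
 * Copy a packed record out of a raw buffer in host byte order.
 * Returns 0 on success, -1 if the buffer is too short.
 * raw_len is a hypothetical parameter, see the note above.
 */
static int ex_atc_decode(const void *raw, size_t raw_len,
			 struct ex_atc_err *out)
{
	assert(raw && out);
	if (raw_len < sizeof(*out))
		return -1;
	memcpy(out, raw, sizeof(*out));
	return 0;
}
```

Copying into a local struct avoids relying on the source buffer's alignment and makes a short record an explicit error rather than an out-of-bounds read.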
* [PATCH V2 4/6] rasdaemon: CCIX: Port error support
2019-08-27 11:30 [PATCH V2 0/6] CCIX rasdaemon support Jonathan Cameron
` (2 preceding siblings ...)
2019-08-27 11:30 ` [PATCH V2 3/6] rasdaemon: CCIX: ATC " Jonathan Cameron
@ 2019-08-27 11:30 ` Jonathan Cameron
2019-08-27 11:30 ` [PATCH V2 5/6] rasdaemon: CCIX: Link error support Jonathan Cameron
2019-08-27 11:30 ` [PATCH V2 6/6] rasdaemon: CCIX: Agent Internal " Jonathan Cameron
5 siblings, 0 replies; 7+ messages in thread
From: Jonathan Cameron @ 2019-08-27 11:30 UTC (permalink / raw)
To: Mauro Carvalho Chehab, linux-edac
Cc: linuxarm, jcm, shiju.jose, Jonathan Cameron
Add support for reporting CCIX Port errors and storing them
in sqlite3.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
ras-ccix-handler.c | 93 ++++++++++++++++++++++++++++++++++++++++++++++
ras-ccix-handler.h | 14 +++++++
ras-events.c | 9 +++++
ras-record-ccix.c | 75 +++++++++++++++++++++++++++++++++++++
ras-record.h | 3 ++
5 files changed, 194 insertions(+)
diff --git a/ras-ccix-handler.c b/ras-ccix-handler.c
index f7b9e8e..0a79627 100644
--- a/ras-ccix-handler.c
+++ b/ras-ccix-handler.c
@@ -220,6 +220,58 @@ static char *ccix_atc_err_cper_data(const char *c)
return buf;
}
+static char *ccix_port_op(uint8_t op)
+{
+ switch (op) {
+ case 0: return "command";
+ case 1: return "read";
+ case 2: return "write";
+ }
+ return "unknown";
+}
+
+static char *ccix_port_err_type(uint8_t type)
+{
+ switch (type) {
+ case 0: return "generic bus / slave error";
+ case 1: return "bus parity / ECC error";
+ case 2: return "BDF not present";
+ case 3: return "invalid address";
+ case 4: return "invalid agent ID";
+ case 5: return "bus timeout";
+ case 6: return "hang";
+ case 7: return "egress blocked";
+ }
+ return "unknown-type";
+}
+
+static char *ccix_port_err_cper_data(const char *c)
+{
+ const struct cper_ccix_port_err_compact *cpd =
+ (struct cper_ccix_port_err_compact *)c;
+ static char buf[1024];
+ char *p = buf;
+ int i;
+
+ if (!cpd->validation_bits)
+ return "";
+
+ p += sprintf(p, " (");
+ if (cpd->validation_bits & CCIX_PORT_ERR_TYPE_VALID)
+ p += sprintf(p, "error: %s ",
+ ccix_port_err_type(cpd->err_type));
+ if (cpd->validation_bits & CCIX_PORT_ERR_OP_VALID)
+ p += sprintf(p, "op: %s ", ccix_port_op(cpd->op_type));
+ if (cpd->validation_bits & CCIX_PORT_ERR_MESSAGE_VALID) {
+ p += sprintf(p, "message: ");
+ for (i = 0; i < 8; i++)
+ p += sprintf(p, "0x%08x ", cpd->message[i]);
+ }
+ p += sprintf(p - 1, ")");
+
+ return buf;
+}
+
static char *ccix_component_type(int type)
{
switch (type) {
@@ -417,3 +469,44 @@ int ras_ccix_atc_event_handler(struct trace_seq *s,
return 0;
}
+
+int ras_ccix_port_event_handler(struct trace_seq *s,
+ struct pevent_record *record,
+ struct event_format *event, void *context)
+{
+ struct ras_events *ras = context;
+ struct tm *tm;
+ struct ras_ccix_event ev;
+ time_t now;
+ int ret;
+
+ if (ras->use_uptime)
+ now = record->ts/user_hz + ras->uptime_diff;
+ else
+ now = time(NULL);
+
+ tm = localtime(&now);
+
+ if (tm)
+ strftime(ev.timestamp, sizeof(ev.timestamp),
+ "%Y-%m-%d %H:%M:%S %z", tm);
+ trace_seq_printf(s, "%s ", ev.timestamp);
+ ret = ras_ccix_common_parse(s, record, event, context, &ev);
+ if (ret)
+ return ret;
+
+ trace_seq_printf(s, "%d %s id:%d CCIX Port error: %s ue:%d nocomm:%d degraded:%d deferred:%d physical addr: 0x%llx mask: 0x%llx %s",
+ ev.error_seq, err_severity(ev.severity),
+ ev.source, ccix_component_type(ev.component),
+ (ev.severity_detail & 0x1) ? 1 : 0,
+ (ev.severity_detail & 0x2) ? 1 : 0,
+ (ev.severity_detail & 0x4) ? 1 : 0,
+ (ev.severity_detail & 0x8) ? 1 : 0,
+ ev.address,
+ err_mask(ev.pa_mask_lsb),
+ ccix_port_err_cper_data(ev.cper_data));
+
+ ras_store_ccix_port_event(ras, &ev);
+
+ return 0;
+}
diff --git a/ras-ccix-handler.h b/ras-ccix-handler.h
index 4528af7..e824aed 100644
--- a/ras-ccix-handler.h
+++ b/ras-ccix-handler.h
@@ -27,6 +27,9 @@ int ras_ccix_cache_event_handler(struct trace_seq *s,
int ras_ccix_atc_event_handler(struct trace_seq *s,
struct pevent_record *record,
struct event_format *event, void *context);
+int ras_ccix_port_event_handler(struct trace_seq *s,
+ struct pevent_record *record,
+ struct event_format *event, void *context);
/* Perhaps unnecessary paranoia, but the tracepoint structure is packed */
#pragma pack(1)
@@ -65,6 +68,12 @@ struct cper_ccix_atc_err_compact {
uint8_t instance;
};
+struct cper_ccix_port_err_compact {
+ uint32_t validation_bits;
+ uint32_t message[8];
+ uint8_t err_type;
+ uint8_t op_type;
+};
#pragma pack()
#define CCIX_MEM_ERR_GENERIC_MEM_VALID 0x0001
@@ -95,4 +104,9 @@ struct cper_ccix_atc_err_compact {
#define CCIX_ATC_ERR_INSTANCE_ID_VALID 0x0002
#define CCIX_ATC_ERR_VENDOR_DATA_VALID 0x0004
+#define CCIX_PORT_ERR_OP_VALID 0x0001
+#define CCIX_PORT_ERR_TYPE_VALID 0x0002
+#define CCIX_PORT_ERR_MESSAGE_VALID 0x0004
+#define CCIX_PORT_ERR_VENDOR_DATA_VALID 0x0008
+
#endif
diff --git a/ras-events.c b/ras-events.c
index 68ed246..83e28a7 100644
--- a/ras-events.c
+++ b/ras-events.c
@@ -208,6 +208,7 @@ int toggle_ras_mc_event(int enable)
rc |= __toggle_ras_mc_event(ras, "ras", "ccix_memory_event", enable);
rc |= __toggle_ras_mc_event(ras, "ras", "ccix_cache_event", enable);
rc |= __toggle_ras_mc_event(ras, "ras", "ccix_atc_event", enable);
+ rc |= __toggle_ras_mc_event(ras, "ras", "ccix_port_event", enable);
#endif
#ifdef HAVE_MCE
@@ -749,6 +750,14 @@ int handle_ras_events(int record_events)
else
log(ALL, LOG_ERR, "Can't get traces from %s:%s\n",
"ras", "ccix_atc_event");
+ rc = add_event_handler(ras, pevent, page_size, "ras",
+ "ccix_port_error_event",
+ ras_ccix_port_event_handler, NULL);
+ if (!rc)
+ num_events++;
+ else
+ log(ALL, LOG_ERR, "Can't get traces from %s:%s\n",
+ "ras", "ccix_port_event");
#endif
#ifdef HAVE_NON_STANDARD
diff --git a/ras-record-ccix.c b/ras-record-ccix.c
index df68eef..e1c5df4 100644
--- a/ras-record-ccix.c
+++ b/ras-record-ccix.c
@@ -352,6 +352,76 @@ int ras_store_ccix_atc_event(struct ras_events *ras, struct ras_ccix_event *ev)
return rc;
}
+enum {
+ ccix_port_field_operation = ccix_field_common_end,
+ ccix_port_field_etype,
+ ccix_port_field_message,
+ ccix_port_field_vendor,
+};
+
+static const struct db_fields ccix_port_event_fields[] = {
+ CCIX_COMMON_FIELDS,
+ [ccix_port_field_operation] = { .name = "operation", .type = "INTEGER" },
+ [ccix_port_field_etype] = { .name = "etype", .type = "INTEGER" },
+ [ccix_port_field_message] = { .name = "message", .type = "BLOB" },
+ [ccix_port_field_vendor] = { .name = "vendor_data", .type = "BLOB" },
+};
+
+static const struct db_table_descriptor ccix_port_event_tab = {
+ .name = "ccix_port_event",
+ .fields = ccix_port_event_fields,
+ .num_fields = ARRAY_SIZE(ccix_port_event_fields),
+};
+
+int ras_store_ccix_port_event(struct ras_events *ras, struct ras_ccix_event *ev)
+{
+ int rc;
+ struct sqlite3_priv *priv = ras->db_priv;
+ struct cper_ccix_port_err_compact *port =
+ (struct cper_ccix_port_err_compact *)ev->cper_data;
+ sqlite3_stmt *rec = priv ? priv->stmt_ccix_port_record : NULL;
+
+ if (!priv || !rec)
+ return 0;
+ log(TERM, LOG_INFO, "ccix_port_eventstore: %p\n", rec);
+
+ ras_store_ccix_common(rec, ev);
+ if (port->validation_bits & CCIX_PORT_ERR_OP_VALID)
+ sqlite3_bind_int(rec, ccix_port_field_operation, port->op_type);
+
+ if (port->validation_bits & CCIX_PORT_ERR_TYPE_VALID)
+ sqlite3_bind_int(rec, ccix_port_field_etype, port->err_type);
+
+ if (port->validation_bits & CCIX_PORT_ERR_MESSAGE_VALID)
+ sqlite3_bind_blob(rec, ccix_port_field_message,
+ port->message, sizeof(port->message), NULL);
+
+ if (port->validation_bits & CCIX_PORT_ERR_VENDOR_DATA_VALID)
+ sqlite3_bind_blob(rec, ccix_port_field_vendor,
+ ev->vendor_data, ev->vendor_data_length,
+ NULL);
+
+ rc = sqlite3_step(rec);
+ if (rc != SQLITE_OK && rc != SQLITE_DONE)
+ log(TERM, LOG_ERR,
+ "Failed to do ccix_port_record step on sqlite: error = %d\n",
+ rc);
+
+ rc = sqlite3_reset(rec);
+ if (rc != SQLITE_OK && rc != SQLITE_DONE)
+ log(TERM, LOG_ERR,
+ "Failed to reset ccix_port_record on sqlite: error = %d\n",
+ rc);
+
+ rc = sqlite3_clear_bindings(rec);
+ if (rc != SQLITE_OK && rc != SQLITE_DONE)
+ log(TERM, LOG_ERR,
+ "Failed to clear ccix_port_record: error %d\n",
+ rc);
+ log(TERM, LOG_INFO, "register inserted at db\n");
+ return rc;
+}
+
void ras_ccix_create_table(struct sqlite3_priv *priv)
{
int rc;
@@ -370,4 +440,9 @@ void ras_ccix_create_table(struct sqlite3_priv *priv)
if (rc == SQLITE_OK)
rc = ras_mc_prepare_stmt(priv, &priv->stmt_ccix_atc_record,
&ccix_atc_event_tab);
+
+ rc = ras_mc_create_table(priv, &ccix_port_event_tab);
+ if (rc == SQLITE_OK)
+ rc = ras_mc_prepare_stmt(priv, &priv->stmt_ccix_port_record,
+ &ccix_port_event_tab);
}
diff --git a/ras-record.h b/ras-record.h
index c3b3586..778de25 100644
--- a/ras-record.h
+++ b/ras-record.h
@@ -127,6 +127,7 @@ struct sqlite3_priv {
sqlite3_stmt *stmt_ccix_mem_record;
sqlite3_stmt *stmt_ccix_cache_record;
sqlite3_stmt *stmt_ccix_atc_record;
+ sqlite3_stmt *stmt_ccix_port_record;
#endif
#ifdef HAVE_NON_STANDARD
sqlite3_stmt *stmt_non_standard_record;
@@ -167,6 +168,7 @@ void ras_ccix_create_table(struct sqlite3_priv *priv);
int ras_store_ccix_memory_event(struct ras_events *ras, struct ras_ccix_event *ev);
int ras_store_ccix_cache_event(struct ras_events *ras, struct ras_ccix_event *ev);
int ras_store_ccix_atc_event(struct ras_events *ras, struct ras_ccix_event *ev);
+int ras_store_ccix_port_event(struct ras_events *ras, struct ras_ccix_event *ev);
int ras_store_non_standard_record(struct ras_events *ras, struct ras_non_standard_event *ev);
int ras_store_arm_record(struct ras_events *ras, struct ras_arm_event *ev);
int ras_store_devlink_event(struct ras_events *ras, struct devlink_event *ev);
@@ -181,6 +183,7 @@ static inline void ras_ccix_create_table(void *priv) {};
static inline int ras_store_ccix_memory_event(struct ras_events *ras, struct ras_ccix_event *ev) { return 0; };
static inline int ras_store_ccix_cache_event(struct ras_events *ras, struct ras_ccix_event *ev) {return 0; };
static inline int ras_store_ccix_atc_event(struct ras_events *ras, struct ras_ccix_event *ev) {return 0; };
+static inline int ras_store_ccix_port_event(struct ras_events *ras, struct ras_ccix_event *ev) {return 0; };
static inline int ras_store_non_standard_record(struct ras_events *ras, struct ras_non_standard_event *ev) { return 0; };
static inline int ras_store_arm_record(struct ras_events *ras, struct ras_arm_event *ev) { return 0; };
static inline int ras_store_devlink_event(struct ras_events *ras, struct devlink_event *ev) { return 0; };
--
2.20.1
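
As an alternative to the switch in ccix_port_err_type(), the same code-to-name mapping could be written as a bounds-checked table. This is only a sketch of that different technique, not what the patch does; the ex_* names are hypothetical, while the strings are the ones used in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Names indexed by error type code, in the order used by the patch. */
static const char *const ex_port_err_names[] = {
	"generic bus / slave error",
	"bus parity / ECC error",
	"BDF not present",
	"invalid address",
	"invalid agent ID",
	"bus timeout",
	"hang",
	"egress blocked",
};

#define EX_PORT_ERR_COUNT \
	(sizeof(ex_port_err_names) / sizeof(ex_port_err_names[0]))

/* Catch an accidentally added or dropped entry at compile time. */
static_assert(EX_PORT_ERR_COUNT == 8, "one name per defined error code");

/* Map an error type code to its name; out-of-range codes get a fallback. */
static const char *ex_port_err_type(uint8_t type)
{
	return type < EX_PORT_ERR_COUNT ? ex_port_err_names[type]
					: "unknown-type";
}
```

The table keeps codes and names in one place and makes the valid range explicit; the switch form has the advantage of tolerating sparse code assignments.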
* [PATCH V2 5/6] rasdaemon: CCIX: Link error support
2019-08-27 11:30 [PATCH V2 0/6] CCIX rasdaemon support Jonathan Cameron
` (3 preceding siblings ...)
2019-08-27 11:30 ` [PATCH V2 4/6] rasdaemon: CCIX: Port error support Jonathan Cameron
@ 2019-08-27 11:30 ` Jonathan Cameron
2019-08-27 11:30 ` [PATCH V2 6/6] rasdaemon: CCIX: Agent Internal " Jonathan Cameron
5 siblings, 0 replies; 7+ messages in thread
From: Jonathan Cameron @ 2019-08-27 11:30 UTC (permalink / raw)
To: Mauro Carvalho Chehab, linux-edac
Cc: linuxarm, jcm, shiju.jose, Jonathan Cameron
Add support for reporting CCIX Link errors and storing them
in sqlite3.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
ras-ccix-handler.c | 96 ++++++++++++++++++++++++++++++++++++++++++++++
ras-ccix-handler.h | 19 +++++++++
ras-events.c | 9 +++++
ras-record-ccix.c | 87 +++++++++++++++++++++++++++++++++++++++++
ras-record.h | 3 ++
5 files changed, 214 insertions(+)
diff --git a/ras-ccix-handler.c b/ras-ccix-handler.c
index 0a79627..69baa48 100644
--- a/ras-ccix-handler.c
+++ b/ras-ccix-handler.c
@@ -272,6 +272,61 @@ static char *ccix_port_err_cper_data(const char *c)
return buf;
}
+static char *ccix_link_err_type(uint8_t err)
+{
+ switch (err) {
+ case 0: return "generic";
+ case 1: return "credit underflow";
+ case 2: return "credit overflow";
+ case 3: return "unusable credit";
+ case 4: return "credit timeout";
+ }
+ return "unknown";
+}
+
+static char *ccix_link_credit(uint8_t credit)
+{
+ switch (credit) {
+ case 0: return "memory";
+ case 1: return "snoop";
+ case 2: return "data";
+ case 3: return "misc";
+ }
+ return "unknown";
+}
+
+static char *ccix_link_err_cper_data(const char *c)
+{
+ const struct cper_ccix_link_err_compact *cpd =
+ (struct cper_ccix_link_err_compact *)c;
+ static char buf[1024];
+ char *p = buf;
+ int i;
+
+ if (!cpd->validation_bits)
+ return "";
+
+ p += sprintf(p, " (");
+ if (cpd->validation_bits & CCIX_LINK_ERR_TYPE_VALID)
+ p += sprintf(p, "error: %s ",
+ ccix_link_err_type(cpd->err_type));
+ if (cpd->validation_bits & CCIX_LINK_ERR_OP_VALID)
+ p += sprintf(p, "op: %s ", ccix_port_op(cpd->op_type));
+ if (cpd->validation_bits & CCIX_LINK_ERR_LINK_ID_VALID)
+ p += sprintf(p, "id: %u ", cpd->link_id);
+ if (cpd->validation_bits & CCIX_LINK_ERR_CREDIT_TYPE_VALID)
+ p += sprintf(p, "credit-type: %s ",
+ ccix_link_credit(cpd->credit_type));
+ if (cpd->validation_bits & CCIX_LINK_ERR_MESSAGE_VALID) {
+ p += sprintf(p, "message: ");
+ for (i = 0; i < 8; i++)
+ p += sprintf(p, "0x%08x ", cpd->message[i]);
+ }
+ p += sprintf(p - 1, ")");
+
+ return buf;
+}
+
static char *ccix_component_type(int type)
{
switch (type) {
@@ -510,3 +565,44 @@ int ras_ccix_port_event_handler(struct trace_seq *s,
return 0;
}
+
+int ras_ccix_link_event_handler(struct trace_seq *s,
+ struct pevent_record *record,
+ struct event_format *event, void *context)
+{
+ struct ras_events *ras = context;
+ struct tm *tm;
+ struct ras_ccix_event ev;
+ time_t now;
+ int ret;
+
+ if (ras->use_uptime)
+ now = record->ts/user_hz + ras->uptime_diff;
+ else
+ now = time(NULL);
+
+ tm = localtime(&now);
+
+ if (tm)
+ strftime(ev.timestamp, sizeof(ev.timestamp),
+ "%Y-%m-%d %H:%M:%S %z", tm);
+ trace_seq_printf(s, "%s ", ev.timestamp);
+ ret = ras_ccix_common_parse(s, record, event, context, &ev);
+ if (ret)
+ return ret;
+
+ trace_seq_printf(s, "%d %s id:%d CCIX Link error: %s ue:%d nocomm:%d degraded:%d deferred:%d physical addr: 0x%llx mask: 0x%llx %s",
+ ev.error_seq, err_severity(ev.severity),
+ ev.source, ccix_component_type(ev.component),
+ (ev.severity_detail & 0x1) ? 1 : 0,
+ (ev.severity_detail & 0x2) ? 1 : 0,
+ (ev.severity_detail & 0x4) ? 1 : 0,
+ (ev.severity_detail & 0x8) ? 1 : 0,
+ ev.address,
+ err_mask(ev.pa_mask_lsb),
+ ccix_link_err_cper_data(ev.cper_data));
+
+ ras_store_ccix_link_event(ras, &ev);
+
+ return 0;
+}
diff --git a/ras-ccix-handler.h b/ras-ccix-handler.h
index e824aed..3def534 100644
--- a/ras-ccix-handler.h
+++ b/ras-ccix-handler.h
@@ -30,6 +30,9 @@ int ras_ccix_atc_event_handler(struct trace_seq *s,
int ras_ccix_port_event_handler(struct trace_seq *s,
struct pevent_record *record,
struct event_format *event, void *context);
+int ras_ccix_link_event_handler(struct trace_seq *s,
+ struct pevent_record *record,
+ struct event_format *event, void *context);
/* Perhaps unnecessary paranoia, but the tracepoint structure is packed */
#pragma pack(1)
@@ -74,6 +77,15 @@ struct cper_ccix_port_err_compact {
uint8_t err_type;
uint8_t op_type;
};
+
+struct cper_ccix_link_err_compact {
+ uint32_t validation_bits;
+ uint32_t message[8];
+ uint8_t err_type;
+ uint8_t op_type;
+ uint8_t link_id;
+ uint8_t credit_type;
+};
#pragma pack()
#define CCIX_MEM_ERR_GENERIC_MEM_VALID 0x0001
@@ -109,4 +121,11 @@ struct cper_ccix_port_err_compact {
#define CCIX_PORT_ERR_MESSAGE_VALID 0x0004
#define CCIX_PORT_ERR_VENDOR_DATA_VALID 0x0008
+#define CCIX_LINK_ERR_OP_VALID 0x0001
+#define CCIX_LINK_ERR_TYPE_VALID 0x0002
+#define CCIX_LINK_ERR_LINK_ID_VALID 0x0004
+#define CCIX_LINK_ERR_CREDIT_TYPE_VALID 0x0008
+#define CCIX_LINK_ERR_MESSAGE_VALID 0x0010
+#define CCIX_LINK_ERR_VENDOR_DATA_VALID 0x0020
+
#endif
diff --git a/ras-events.c b/ras-events.c
index 83e28a7..c73a36d 100644
--- a/ras-events.c
+++ b/ras-events.c
@@ -209,6 +209,7 @@ int toggle_ras_mc_event(int enable)
rc |= __toggle_ras_mc_event(ras, "ras", "ccix_cache_event", enable);
rc |= __toggle_ras_mc_event(ras, "ras", "ccix_atc_event", enable);
rc |= __toggle_ras_mc_event(ras, "ras", "ccix_port_event", enable);
+ rc |= __toggle_ras_mc_event(ras, "ras", "ccix_link_event", enable);
#endif
#ifdef HAVE_MCE
@@ -758,6 +759,14 @@ int handle_ras_events(int record_events)
else
log(ALL, LOG_ERR, "Can't get traces from %s:%s\n",
"ras", "ccix_port_event");
+ rc = add_event_handler(ras, pevent, page_size, "ras",
+ "ccix_link_error_event",
+ ras_ccix_link_event_handler, NULL);
+ if (!rc)
+ num_events++;
+ else
+ log(ALL, LOG_ERR, "Can't get traces from %s:%s\n",
+ "ras", "ccix_link_event");
#endif
#ifdef HAVE_NON_STANDARD
diff --git a/ras-record-ccix.c b/ras-record-ccix.c
index e1c5df4..1e03e84 100644
--- a/ras-record-ccix.c
+++ b/ras-record-ccix.c
@@ -422,6 +422,88 @@ int ras_store_ccix_port_event(struct ras_events *ras, struct ras_ccix_event *ev)
return rc;
}
+enum {
+ ccix_link_field_operation = ccix_field_common_end,
+ ccix_link_field_etype,
+ ccix_link_field_link_id,
+ ccix_link_field_credit_type,
+ ccix_link_field_message,
+ ccix_link_field_vendor,
+};
+
+static const struct db_fields ccix_link_event_fields[] = {
+ CCIX_COMMON_FIELDS,
+ [ccix_link_field_operation] = { .name = "operation", .type = "INTEGER" },
+ [ccix_link_field_etype] = { .name = "etype", .type = "INTEGER" },
+ [ccix_link_field_link_id] = { .name = "link_id", .type = "INTEGER" },
+ [ccix_link_field_credit_type] = { .name = "credit_type", .type = "INTEGER" },
+ [ccix_link_field_message] = { .name = "message", .type = "BLOB" },
+ [ccix_link_field_vendor] = { .name = "vendor_data", .type = "BLOB" },
+};
+
+static const struct db_table_descriptor ccix_link_event_tab = {
+ .name = "ccix_link_event",
+ .fields = ccix_link_event_fields,
+ .num_fields = ARRAY_SIZE(ccix_link_event_fields),
+};
+
+int ras_store_ccix_link_event(struct ras_events *ras, struct ras_ccix_event *ev)
+{
+ int rc;
+ struct sqlite3_priv *priv = ras->db_priv;
+ struct cper_ccix_link_err_compact *link =
+ (struct cper_ccix_link_err_compact *)ev->cper_data;
+ sqlite3_stmt *rec = priv ? priv->stmt_ccix_link_record : NULL;
+
+ if (!priv || !rec)
+ return 0;
+ log(TERM, LOG_INFO, "ccix_link_eventstore: %p\n", rec);
+
+ ras_store_ccix_common(rec, ev);
+ if (link->validation_bits & CCIX_LINK_ERR_OP_VALID)
+ sqlite3_bind_int(rec, ccix_link_field_operation, link->op_type);
+
+ if (link->validation_bits & CCIX_LINK_ERR_TYPE_VALID)
+ sqlite3_bind_int(rec, ccix_link_field_etype,
+ link->err_type);
+
+ if (link->validation_bits & CCIX_LINK_ERR_LINK_ID_VALID)
+ sqlite3_bind_int(rec, ccix_link_field_link_id, link->link_id);
+
+ if (link->validation_bits & CCIX_LINK_ERR_CREDIT_TYPE_VALID)
+ sqlite3_bind_int(rec, ccix_link_field_credit_type,
+ link->credit_type);
+
+ if (link->validation_bits & CCIX_LINK_ERR_MESSAGE_VALID)
+ sqlite3_bind_blob(rec, ccix_link_field_message,
+ link->message, sizeof(link->message), NULL);
+
+ if (link->validation_bits & CCIX_LINK_ERR_VENDOR_DATA_VALID)
+ sqlite3_bind_blob(rec, ccix_link_field_vendor,
+ ev->vendor_data, ev->vendor_data_length,
+ NULL);
+
+ rc = sqlite3_step(rec);
+ if (rc != SQLITE_OK && rc != SQLITE_DONE)
+ log(TERM, LOG_ERR,
+ "Failed to do ccix_link_record step on sqlite: error = %d\n",
+ rc);
+
+ rc = sqlite3_reset(rec);
+ if (rc != SQLITE_OK && rc != SQLITE_DONE)
+ log(TERM, LOG_ERR,
+ "Failed reset ccix_link_record on sqlite: error = %d\n",
+ rc);
+
+ rc = sqlite3_clear_bindings(rec);
+ if (rc != SQLITE_OK && rc != SQLITE_DONE)
+ log(TERM, LOG_ERR,
+ "Failed to clear ccix_link_record: error %d\n",
+ rc);
+ log(TERM, LOG_INFO, "register inserted at db\n");
+ return rc;
+}
+
void ras_ccix_create_table(struct sqlite3_priv *priv)
{
int rc;
@@ -445,4 +527,9 @@ void ras_ccix_create_table(struct sqlite3_priv *priv)
if (rc == SQLITE_OK)
rc = ras_mc_prepare_stmt(priv, &priv->stmt_ccix_port_record,
&ccix_port_event_tab);
+
+ rc = ras_mc_create_table(priv, &ccix_link_event_tab);
+ if (rc == SQLITE_OK)
+ rc = ras_mc_prepare_stmt(priv, &priv->stmt_ccix_link_record,
+ &ccix_link_event_tab);
}
diff --git a/ras-record.h b/ras-record.h
index 778de25..f13e286 100644
--- a/ras-record.h
+++ b/ras-record.h
@@ -128,6 +128,7 @@ struct sqlite3_priv {
sqlite3_stmt *stmt_ccix_cache_record;
sqlite3_stmt *stmt_ccix_atc_record;
sqlite3_stmt *stmt_ccix_port_record;
+ sqlite3_stmt *stmt_ccix_link_record;
#endif
#ifdef HAVE_NON_STANDARD
sqlite3_stmt *stmt_non_standard_record;
@@ -169,6 +170,7 @@ int ras_store_ccix_memory_event(struct ras_events *ras, struct ras_ccix_event *e
int ras_store_ccix_cache_event(struct ras_events *ras, struct ras_ccix_event *ev);
int ras_store_ccix_atc_event(struct ras_events *ras, struct ras_ccix_event *ev);
int ras_store_ccix_port_event(struct ras_events *ras, struct ras_ccix_event *ev);
+int ras_store_ccix_link_event(struct ras_events *ras, struct ras_ccix_event *ev);
int ras_store_non_standard_record(struct ras_events *ras, struct ras_non_standard_event *ev);
int ras_store_arm_record(struct ras_events *ras, struct ras_arm_event *ev);
int ras_store_devlink_event(struct ras_events *ras, struct devlink_event *ev);
@@ -184,6 +186,7 @@ static inline int ras_store_ccix_memory_event(struct ras_events *ras, struct ras
static inline int ras_store_ccix_cache_event(struct ras_events *ras, struct ras_ccix_event *ev) {return 0; };
static inline int ras_store_ccix_atc_event(struct ras_events *ras, struct ras_ccix_event *ev) {return 0; };
static inline int ras_store_ccix_port_event(struct ras_events *ras, struct ras_ccix_event *ev) {return 0; };
+static inline int ras_store_ccix_link_event(struct ras_events *ras, struct ras_ccix_event *ev) {return 0; };
static inline int ras_store_non_standard_record(struct ras_events *ras, struct ras_non_standard_event *ev) { return 0; };
static inline int ras_store_arm_record(struct ras_events *ras, struct ras_arm_event *ev) { return 0; };
static inline int ras_store_devlink_event(struct ras_events *ras, struct devlink_event *ev) { return 0; };
--
2.20.1
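
For readers following the record layout: the handler above casts
ev.cper_data directly to struct cper_ccix_link_err_compact. A minimal
sketch of the same layout, assuming the struct definition from this
patch and a hypothetical link_err_from_bytes() helper that copies the
bytes instead of casting, might look like this. The length check and
the memcpy() are illustrative choices, not what the series does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout copied from ras-ccix-handler.h in this patch. */
#pragma pack(1)
struct cper_ccix_link_err_compact {
	uint32_t validation_bits;
	uint32_t message[8];
	uint8_t err_type;
	uint8_t op_type;
	uint8_t link_id;
	uint8_t credit_type;
};
#pragma pack()

/*
 * Hypothetical helper: copy a raw tracepoint payload into the packed
 * struct. Copying avoids any alignment assumption about the source
 * pointer, and the length check rejects short payloads.
 * Returns 0 on success, -1 if the payload is too short.
 */
static int link_err_from_bytes(struct cper_ccix_link_err_compact *out,
			       const void *data, size_t len)
{
	assert(out != NULL);
	if (data == NULL || len < sizeof(*out))
		return -1;
	memcpy(out, data, sizeof(*out));
	return 0;
}
```

The struct occupies 4 + 32 + 4 = 40 bytes, with link_id at byte
offset 38, so a payload shorter than that cannot hold a complete
record.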
* [PATCH V2 6/6] rasdaemon: CCIX: Agent Internal error support
2019-08-27 11:30 [PATCH V2 0/6] CCIX rasdaemon support Jonathan Cameron
` (4 preceding siblings ...)
2019-08-27 11:30 ` [PATCH V2 5/6] rasdaemon: CCIX: Link error support Jonathan Cameron
@ 2019-08-27 11:30 ` Jonathan Cameron
5 siblings, 0 replies; 7+ messages in thread
From: Jonathan Cameron @ 2019-08-27 11:30 UTC (permalink / raw)
To: Mauro Carvalho Chehab, linux-edac
Cc: linuxarm, jcm, shiju.jose, Jonathan Cameron
Add support for reporting CCIX Agent Internal errors and storing
them in sqlite3.
In the current 1.0 CCIX specification these only have vendor_data
defined. However, they are structured to allow additional fields
in future so we handle them the same way as all the other CCIX
error types.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
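As an illustration of how the per-type table continues on from the
common CCIX columns, here is a minimal sketch. The enum values, the
two stand-in common columns, and build_create_sql() are all
hypothetical; the real CCIX_COMMON_FIELDS list and the
ras_mc_create_table() implementation live elsewhere in rasdaemon and
are not reproduced here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct db_field {
	const char *name;
	const char *type;
};

/* Hypothetical stand-in for the shared CCIX columns. */
enum {
	field_id,
	field_timestamp,
	field_common_end,
};

/* Type-specific columns start where the common ones end. */
enum {
	agent_field_vendor = field_common_end,
};

static const struct db_field agent_fields[] = {
	[field_id]		= { "id", "INTEGER PRIMARY KEY" },
	[field_timestamp]	= { "timestamp", "TEXT" },
	[agent_field_vendor]	= { "vendor_data", "BLOB" },
};

/*
 * Hypothetical helper: build a CREATE TABLE statement from a field
 * array. Returns 0 on success, -1 if buf is too small.
 */
static int build_create_sql(char *buf, size_t len, const char *table,
			    const struct db_field *f, size_t n)
{
	size_t off, i;
	int w;

	w = snprintf(buf, len, "CREATE TABLE IF NOT EXISTS %s (", table);
	if (w < 0 || (size_t)w >= len)
		return -1;
	off = (size_t)w;

	for (i = 0; i < n; i++) {
		w = snprintf(buf + off, len - off, "%s%s %s",
			     i ? ", " : "", f[i].name, f[i].type);
		if (w < 0 || (size_t)w >= len - off)
			return -1;
		off += (size_t)w;
	}

	w = snprintf(buf + off, len - off, ")");
	if (w < 0 || (size_t)w >= len - off)
		return -1;
	return 0;
}
```

Because the enum value doubles as the array index, adding a field to
the common block shifts every type-specific column without touching
the per-type tables.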
ras-ccix-handler.c | 40 ++++++++++++++++++++++++++++++
ras-ccix-handler.h | 8 ++++++
ras-events.c | 9 +++++++
ras-record-ccix.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++
ras-record.h | 3 +++
5 files changed, 121 insertions(+)
diff --git a/ras-ccix-handler.c b/ras-ccix-handler.c
index 69baa48..2088790 100644
--- a/ras-ccix-handler.c
+++ b/ras-ccix-handler.c
@@ -606,3 +606,43 @@ int ras_ccix_link_event_handler(struct trace_seq *s,
return 0;
}
+
+int ras_ccix_agent_event_handler(struct trace_seq *s,
+ struct pevent_record *record,
+ struct event_format *event, void *context)
+{
+ struct ras_events *ras = context;
+ struct tm *tm;
+ struct ras_ccix_event ev;
+ time_t now;
+ int ret;
+
+ if (ras->use_uptime)
+ now = record->ts/user_hz + ras->uptime_diff;
+ else
+ now = time(NULL);
+
+ tm = localtime(&now);
+
+ if (tm)
+ strftime(ev.timestamp, sizeof(ev.timestamp),
+ "%Y-%m-%d %H:%M:%S %z", tm);
+ trace_seq_printf(s, "%s ", ev.timestamp);
+ ret = ras_ccix_common_parse(s, record, event, context, &ev);
+ if (ret)
+ return ret;
+
+ trace_seq_printf(s, "%d %s id:%d CCIX Agent Internal error: %s ue:%d nocomm:%d degraded:%d deferred:%d physical addr: 0x%llx mask: 0x%llx",
+ ev.error_seq, err_severity(ev.severity),
+ ev.source, ccix_component_type(ev.component),
+ (ev.severity_detail & 0x1) ? 1 : 0,
+ (ev.severity_detail & 0x2) ? 1 : 0,
+ (ev.severity_detail & 0x4) ? 1 : 0,
+ (ev.severity_detail & 0x8) ? 1 : 0,
+ ev.address,
+ err_mask(ev.pa_mask_lsb));
+
+ ras_store_ccix_agent_event(ras, &ev);
+
+ return 0;
+}
diff --git a/ras-ccix-handler.h b/ras-ccix-handler.h
index 3def534..c53e3ee 100644
--- a/ras-ccix-handler.h
+++ b/ras-ccix-handler.h
@@ -33,6 +33,9 @@ int ras_ccix_port_event_handler(struct trace_seq *s,
int ras_ccix_link_event_handler(struct trace_seq *s,
struct pevent_record *record,
struct event_format *event, void *context);
+int ras_ccix_agent_event_handler(struct trace_seq *s,
+ struct pevent_record *record,
+ struct event_format *event, void *context);
/* Perhaps unnecessary paranoia, but the tracepoint structure is packed */
#pragma pack(1)
@@ -86,6 +89,10 @@ struct cper_ccix_link_err_compact {
uint8_t link_id;
uint8_t credit_type;
};
+
+struct cper_ccix_agent_internal_err_compact {
+ uint32_t validation_bits;
+};
#pragma pack()
#define CCIX_MEM_ERR_GENERIC_MEM_VALID 0x0001
@@ -128,4 +135,5 @@ struct cper_ccix_link_err_compact {
#define CCIX_LINK_ERR_MESSAGE_VALID 0x0010
#define CCIX_LINK_ERR_VENDOR_DATA_VALID 0x0020
+#define CCIX_AGENT_ERR_VENDOR_DATA_VALID 0x0001
#endif
diff --git a/ras-events.c b/ras-events.c
index c73a36d..4de28b7 100644
--- a/ras-events.c
+++ b/ras-events.c
@@ -210,6 +210,7 @@ int toggle_ras_mc_event(int enable)
rc |= __toggle_ras_mc_event(ras, "ras", "ccix_atc_event", enable);
rc |= __toggle_ras_mc_event(ras, "ras", "ccix_port_event", enable);
rc |= __toggle_ras_mc_event(ras, "ras", "ccix_link_event", enable);
+ rc |= __toggle_ras_mc_event(ras, "ras", "ccix_agent_event", enable);
#endif
#ifdef HAVE_MCE
@@ -767,6 +768,14 @@ int handle_ras_events(int record_events)
else
log(ALL, LOG_ERR, "Can't get traces from %s:%s\n",
"ras", "ccix_link_event");
+ rc = add_event_handler(ras, pevent, page_size, "ras",
+ "ccix_agent_error_event",
+ ras_ccix_agent_event_handler, NULL);
+ if (!rc)
+ num_events++;
+ else
+ log(ALL, LOG_ERR, "Can't get traces from %s:%s\n",
+ "ras", "ccix_agent_error_event");
#endif
#ifdef HAVE_NON_STANDARD
diff --git a/ras-record-ccix.c b/ras-record-ccix.c
index 1e03e84..79c6e52 100644
--- a/ras-record-ccix.c
+++ b/ras-record-ccix.c
@@ -504,6 +504,62 @@ int ras_store_ccix_link_event(struct ras_events *ras, struct ras_ccix_event *ev)
return rc;
}
+enum {
+ ccix_agent_field_vendor = ccix_field_common_end,
+};
+
+static const struct db_fields ccix_agent_event_fields[] = {
+ CCIX_COMMON_FIELDS,
+ [ccix_agent_field_vendor] = { .name = "vendor_data", .type = "BLOB" },
+};
+
+static const struct db_table_descriptor ccix_agent_event_tab = {
+ .name = "ccix_agent_event",
+ .fields = ccix_agent_event_fields,
+ .num_fields = ARRAY_SIZE(ccix_agent_event_fields),
+};
+
+int ras_store_ccix_agent_event(struct ras_events *ras,
+ struct ras_ccix_event *ev)
+{
+ int rc;
+ struct sqlite3_priv *priv = ras->db_priv;
+ struct cper_ccix_agent_internal_err_compact *agent =
+ (struct cper_ccix_agent_internal_err_compact *)ev->cper_data;
+ sqlite3_stmt *rec = priv ? priv->stmt_ccix_agent_record : NULL;
+
+ if (!priv || !rec)
+ return 0;
+ log(TERM, LOG_INFO, "ccix_agent_eventstore: %p\n", rec);
+
+ ras_store_ccix_common(rec, ev);
+
+ if (agent->validation_bits & CCIX_AGENT_ERR_VENDOR_DATA_VALID)
+ sqlite3_bind_blob(rec, ccix_agent_field_vendor,
+ ev->vendor_data, ev->vendor_data_length,
+ NULL);
+
+ rc = sqlite3_step(rec);
+ if (rc != SQLITE_OK && rc != SQLITE_DONE)
+ log(TERM, LOG_ERR,
+ "Failed to do ccix_agent_record step on sqlite: error = %d\n",
+ rc);
+
+ rc = sqlite3_reset(rec);
+ if (rc != SQLITE_OK && rc != SQLITE_DONE)
+ log(TERM, LOG_ERR,
+ "Failed reset ccix_agent_record on sqlite: error = %d\n",
+ rc);
+
+ rc = sqlite3_clear_bindings(rec);
+ if (rc != SQLITE_OK && rc != SQLITE_DONE)
+ log(TERM, LOG_ERR,
+ "Failed to clear ccix_agent_record: error %d\n",
+ rc);
+ log(TERM, LOG_INFO, "register inserted at db\n");
+ return rc;
+}
+
void ras_ccix_create_table(struct sqlite3_priv *priv)
{
int rc;
@@ -532,4 +588,9 @@ void ras_ccix_create_table(struct sqlite3_priv *priv)
if (rc == SQLITE_OK)
rc = ras_mc_prepare_stmt(priv, &priv->stmt_ccix_link_record,
&ccix_link_event_tab);
+
+ rc = ras_mc_create_table(priv, &ccix_agent_event_tab);
+ if (rc == SQLITE_OK)
+ rc = ras_mc_prepare_stmt(priv, &priv->stmt_ccix_agent_record,
+ &ccix_agent_event_tab);
}
diff --git a/ras-record.h b/ras-record.h
index f13e286..4f78e1d 100644
--- a/ras-record.h
+++ b/ras-record.h
@@ -129,6 +129,7 @@ struct sqlite3_priv {
sqlite3_stmt *stmt_ccix_atc_record;
sqlite3_stmt *stmt_ccix_port_record;
sqlite3_stmt *stmt_ccix_link_record;
+ sqlite3_stmt *stmt_ccix_agent_record;
#endif
#ifdef HAVE_NON_STANDARD
sqlite3_stmt *stmt_non_standard_record;
@@ -171,6 +172,7 @@ int ras_store_ccix_cache_event(struct ras_events *ras, struct ras_ccix_event *ev
int ras_store_ccix_atc_event(struct ras_events *ras, struct ras_ccix_event *ev);
int ras_store_ccix_port_event(struct ras_events *ras, struct ras_ccix_event *ev);
int ras_store_ccix_link_event(struct ras_events *ras, struct ras_ccix_event *ev);
+int ras_store_ccix_agent_event(struct ras_events *ras, struct ras_ccix_event *ev);
int ras_store_non_standard_record(struct ras_events *ras, struct ras_non_standard_event *ev);
int ras_store_arm_record(struct ras_events *ras, struct ras_arm_event *ev);
int ras_store_devlink_event(struct ras_events *ras, struct devlink_event *ev);
@@ -187,6 +189,7 @@ static inline int ras_store_ccix_cache_event(struct ras_events *ras, struct ras_
static inline int ras_store_ccix_atc_event(struct ras_events *ras, struct ras_ccix_event *ev) {return 0; };
static inline int ras_store_ccix_port_event(struct ras_events *ras, struct ras_ccix_event *ev) {return 0; };
static inline int ras_store_ccix_link_event(struct ras_events *ras, struct ras_ccix_event *ev) {return 0; };
+static inline int ras_store_ccix_agent_event(struct ras_events *ras, struct ras_ccix_event *ev) {return 0; };
static inline int ras_store_non_standard_record(struct ras_events *ras, struct ras_non_standard_event *ev) { return 0; };
static inline int ras_store_arm_record(struct ras_events *ras, struct ras_arm_event *ev) { return 0; };
static inline int ras_store_devlink_event(struct ras_events *ras, struct devlink_event *ev) { return 0; };
--
2.20.1
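
The handlers in this series print four flags taken from the low bits
of severity_detail. A small sketch of that decoding follows, assuming
the bit assignments used in the trace_seq_printf() calls above (0x1
ue, 0x2 nocomm, 0x4 degraded, 0x8 deferred). The struct and function
names are hypothetical and only serve to make the mapping explicit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the four flags the handlers print. */
struct ccix_sev_detail {
	int ue;		/* bit 0 */
	int nocomm;	/* bit 1 */
	int degraded;	/* bit 2 */
	int deferred;	/* bit 3 */
};

/* Decode the low four bits of severity_detail into 0/1 flags. */
static struct ccix_sev_detail ccix_decode_sev_detail(uint8_t detail)
{
	struct ccix_sev_detail d = {
		.ue       = (detail & 0x1) ? 1 : 0,
		.nocomm   = (detail & 0x2) ? 1 : 0,
		.degraded = (detail & 0x4) ? 1 : 0,
		.deferred = (detail & 0x8) ? 1 : 0,
	};

	return d;
}
```

For example, a severity_detail of 0x5 decodes to ue:1 nocomm:0
degraded:1 deferred:0.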