* [PATCH 0/3] rasdaemon: ras-mc-ctl: Support vendor-specific error events
From: Shiju Jose @ 2020-12-04 10:13 UTC (permalink / raw)
To: linux-edac, mchehab+huawei
Cc: linuxarm, xuwei5, jonathan.cameron, john.garry, tanxiaofei,
shameerali.kolothum.thodi, salil.mehta, shiju.jose
Add support in ras-mc-ctl for the logged vendor-specific error
events, including the HiSilicon KunPeng9xx server errors.
Shiju Jose (3):
rasdaemon: ras-mc-ctl: Add support for the vendor-specific errors
rasdaemon: ras-mc-ctl: Add support for HiSilicon KunPeng920 errors
rasdaemon: ras-mc-ctl: Add support for HiSilicon KunPeng9xx common
errors
util/ras-mc-ctl.in | 289 ++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 288 insertions(+), 1 deletion(-)
--
2.17.1
* [PATCH 1/3] rasdaemon: ras-mc-ctl: Add support for the vendor-specific errors
From: Shiju Jose @ 2020-12-04 10:13 UTC (permalink / raw)
To: linux-edac, mchehab+huawei
Cc: linuxarm, xuwei5, jonathan.cameron, john.garry, tanxiaofei,
shameerali.kolothum.thodi, salil.mehta, shiju.jose
Add commands to ras-mc-ctl to support reporting the logged
vendor-specific error info.
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Reviewed-by: Xiaofei Tan <tanxiaofei@huawei.com>
---
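A note on how the new options pick up <platform-id>: the option specs added in this patch take no value, so Getopt::Long treats them as plain flags and leaves the platform-id behind as the first remaining non-option word, which is what the subs read from $ARGV[0]. The sketch below illustrates that behaviour on a private copy of the argument list. The helper name parse_vendor_args is hypothetical and is not part of ras-mc-ctl; only the option names come from the patch.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper (not in ras-mc-ctl): parse a copy of the argument
# list and return the option flags plus the first leftover word.
sub parse_vendor_args {
    my @args = @_;
    my %opt = (
        vendor_errors_summary => 0,
        vendor_errors         => 0,
        vendor_platforms      => 0,
    );
    my $ok = GetOptionsFromArray(
        \@args,
        "vendor-errors-summary" => \$opt{vendor_errors_summary},
        "vendor-errors"         => \$opt{vendor_errors},
        "vendor-platforms"      => \$opt{vendor_platforms},
    );
    return unless $ok;

    # Recognized options have been removed from @args; any non-option
    # words, such as the platform-id, are still there.
    my $platform_id = @args ? $args[0] : undef;
    return (\%opt, $platform_id);
}

my ($opt, $platform_id) = parse_vendor_args("--vendor-errors", "KunPeng920");
print "vendor_errors=$opt->{vendor_errors} platform-id=$platform_id\n";
```

One consequence of this design is that a missing platform-id is not a parse error: the subs in this patch simply return without printing anything.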
util/ras-mc-ctl.in | 68 +++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 67 insertions(+), 1 deletion(-)
diff --git a/util/ras-mc-ctl.in b/util/ras-mc-ctl.in
index 07b52e9..f457f0a 100755
--- a/util/ras-mc-ctl.in
+++ b/util/ras-mc-ctl.in
@@ -80,6 +80,9 @@ Usage: $prog [OPTIONS...]
--summary Presents a summary of the logged errors.
--errors Shows the errors stored at the error database.
--error-count Shows the corrected and uncorrected error counts using sysfs.
+ --vendor-errors-summary <platform-id> Presents a summary of the vendor-specific logged errors.
+ --vendor-errors <platform-id> Shows the vendor-specific errors stored in the error database.
+ --vendor-platforms List the supported platforms with platform-ids for the vendor-specific errors.
--help This help message.
EOF
@@ -127,6 +130,18 @@ if ($conf{opt}{errors}) {
errors ();
}
+if ($conf{opt}{vendor_errors_summary}) {
+ vendor_errors_summary ();
+}
+
+if ($conf{opt}{vendor_errors}) {
+ vendor_errors ();
+}
+
+if ($conf{opt}{vendor_platforms}) {
+ vendor_platforms ();
+}
+
exit (0);
sub parse_cmdline
@@ -142,6 +157,9 @@ sub parse_cmdline
$conf{opt}{summary} = 0;
$conf{opt}{errors} = 0;
$conf{opt}{error_count} = 0;
+ $conf{opt}{vendor_errors_summary} = 0;
+ $conf{opt}{vendor_errors} = 0;
+ $conf{opt}{vendor_platforms} = 0;
my $rref = \$conf{opt}{report};
my $mref = \$conf{opt}{mainboard};
@@ -159,7 +177,10 @@ sub parse_cmdline
"layout" => \$conf{opt}{display_memory_layout},
"summary" => \$conf{opt}{summary},
"errors" => \$conf{opt}{errors},
- "error-count" => \$conf{opt}{error_count}
+ "error-count" => \$conf{opt}{error_count},
+ "vendor-errors-summary" => \$conf{opt}{vendor_errors_summary},
+ "vendor-errors" => \$conf{opt}{vendor_errors},
+ "vendor-platforms" => \$conf{opt}{vendor_platforms},
);
usage(1) if !$rc;
@@ -1531,6 +1552,51 @@ sub errors
undef($dbh);
}
+sub vendor_errors_summary
+{
+ require DBI;
+ my ($num_args, $platform_id);
+
+ $num_args = $#ARGV + 1;
+ $platform_id = 0;
+ if ($num_args != 0) {
+ $platform_id = $ARGV[0];
+ } else {
+ return;
+ }
+
+ my $dbh = DBI->connect("dbi:SQLite:dbname=$dbname", "", "", {});
+ # Disable the DBI automatic exception log
+ $dbh->{PrintError} = 0;
+
+ undef($dbh);
+}
+
+sub vendor_errors
+{
+ require DBI;
+ my ($num_args, $platform_id);
+
+ $num_args = $#ARGV + 1;
+ $platform_id = 0;
+ if ($num_args != 0) {
+ $platform_id = $ARGV[0];
+ } else {
+ return;
+ }
+
+ my $dbh = DBI->connect("dbi:SQLite:dbname=$dbname", "", "", {});
+ # Disable the DBI automatic exception log
+ $dbh->{PrintError} = 0;
+
+ undef($dbh);
+}
+
+sub vendor_platforms
+{
+ print "\nSupported platforms for the vendor-specific errors:\n";
+}
+
sub log_msg { print STDERR "$prog: ", @_ unless $conf{opt}{quiet}; }
sub log_error { log_msg ("Error: @_"); }
--
2.17.1
* [PATCH 2/3] rasdaemon: ras-mc-ctl: Add support for HiSilicon KunPeng920 errors
From: Shiju Jose @ 2020-12-04 10:13 UTC (permalink / raw)
To: linux-edac, mchehab+huawei
Cc: linuxarm, xuwei5, jonathan.cameron, john.garry, tanxiaofei,
shameerali.kolothum.thodi, salil.mehta, shiju.jose
Add support for the HiSilicon KunPeng920 errors.
Support the following error formats: OEM type 1, OEM type 2 and
PCIe controller.
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Reviewed-by: Xiaofei Tan <tanxiaofei@huawei.com>
---
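For readers following the summary loops in this patch: each one prints a severity header only when err_severity differs from the previous row, then one line per module. A minimal sketch of that logic on in-memory rows, without DBI, is given here. The sub name format_summary and the sample rows are made up for illustration; only the loop shape mirrors the patch.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the fetched rows: each entry is
# [err_severity, module_id, count].
sub format_summary {
    my @rows = @_;
    my $out      = "";
    my $last_sev = "";
    for my $row (@rows) {
        my ($sev, $module, $count) = @$row;
        # Emit a header each time the severity changes.
        if ($sev ne $last_sev) {
            $out .= "$sev errors:\n";
            $last_sev = $sev;
        }
        $out .= "\t$module: $count\n";
    }
    return $out;
}

# Made-up sample data, already ordered by severity.
print format_summary(
    ["corrected", "module_a", 3],
    ["corrected", "module_b", 1],
    ["fatal",     "module_a", 2],
);
```

The header-on-change approach relies on rows of the same severity arriving next to each other. The queries in this patch use "group by" without "order by", and SQL does not guarantee a row order in that case, so adding an explicit "order by err_severity" may be worth considering.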
util/ras-mc-ctl.in | 173 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 173 insertions(+)
diff --git a/util/ras-mc-ctl.in b/util/ras-mc-ctl.in
index f457f0a..711e4b0 100755
--- a/util/ras-mc-ctl.in
+++ b/util/ras-mc-ctl.in
@@ -1552,10 +1552,17 @@ sub errors
undef($dbh);
}
+# Definitions of the vendor platform IDs.
+use constant {
+ HISILICON_KUNPENG_920 => "KunPeng920",
+};
+
sub vendor_errors_summary
{
require DBI;
my ($num_args, $platform_id);
+ my ($query, $query_handle, $count, $out);
+ my ($module_id, $sub_module_id, $err_severity, $err_sev);
$num_args = $#ARGV + 1;
$platform_id = 0;
@@ -1569,6 +1576,81 @@ sub vendor_errors_summary
# Disable the DBI automatic exception log
$dbh->{PrintError} = 0;
+ # HiSilicon KunPeng920 errors
+ if ($platform_id eq HISILICON_KUNPENG_920) {
+ try {
+ $query = "select err_severity, module_id, count(*) from hip08_oem_type1_event_v2 group by err_severity, module_id";
+ $query_handle = $dbh->prepare($query);
+ $query_handle->execute();
+ $query_handle->bind_columns(\($err_severity, $module_id, $count));
+ $out = "";
+ $err_sev = "";
+ while($query_handle->fetch()) {
+ if ($err_severity ne $err_sev) {
+ $out .= "$err_severity errors:\n";
+ $err_sev = $err_severity;
+ }
+ $out .= "\t$module_id: $count\n";
+ }
+ if ($out ne "") {
+ print "HiSilicon KunPeng920 OEM type1 error events summary:\n$out\n";
+ } else {
+ print "No HiSilicon KunPeng920 OEM type1 errors.\n\n";
+ }
+ $query_handle->finish;
+ } catch {
+ print "Exception: $DBI::errstr\n\n";
+ };
+
+ try {
+ $query = "select err_severity, module_id, count(*) from hip08_oem_type2_event_v2 group by err_severity, module_id";
+ $query_handle = $dbh->prepare($query);
+ $query_handle->execute();
+ $query_handle->bind_columns(\($err_severity, $module_id, $count));
+ $out = "";
+ $err_sev = "";
+ while($query_handle->fetch()) {
+ if ($err_severity ne $err_sev) {
+ $out .= "$err_severity errors:\n";
+ $err_sev = $err_severity;
+ }
+ $out .= "\t$module_id: $count\n";
+ }
+ if ($out ne "") {
+ print "HiSilicon KunPeng920 OEM type2 error events summary:\n$out\n";
+ } else {
+ print "No HiSilicon KunPeng920 OEM type2 errors.\n\n";
+ }
+ $query_handle->finish;
+ } catch {
+ print "Exception: $DBI::errstr\n\n";
+ };
+
+ try {
+ $query = "select err_severity, sub_module_id, count(*) from hip08_pcie_local_event_v2 group by err_severity, sub_module_id";
+ $query_handle = $dbh->prepare($query);
+ $query_handle->execute();
+ $query_handle->bind_columns(\($err_severity, $sub_module_id, $count));
+ $out = "";
+ $err_sev = "";
+ while($query_handle->fetch()) {
+ if ($err_severity ne $err_sev) {
+ $out .= "$err_severity errors:\n";
+ $err_sev = $err_severity;
+ }
+ $out .= "\t$sub_module_id: $count\n";
+ }
+ if ($out ne "") {
+ print "HiSilicon KunPeng920 PCIe controller error events summary:\n$out\n";
+ } else {
+ print "No HiSilicon KunPeng920 PCIe controller errors.\n\n";
+ }
+ $query_handle->finish;
+ } catch {
+ print "Exception: $DBI::errstr\n\n";
+ };
+ }
+
undef($dbh);
}
@@ -1576,6 +1658,9 @@ sub vendor_errors
{
require DBI;
my ($num_args, $platform_id);
+ my ($query, $query_handle, $id, $timestamp, $out);
+ my ($version, $soc_id, $socket_id, $nimbus_id, $core_id, $port_id);
+ my ($module_id, $sub_module_id, $err_severity, $err_type, $regs);
$num_args = $#ARGV + 1;
$platform_id = 0;
@@ -1589,12 +1674,100 @@ sub vendor_errors
# Disable the DBI automatic exception log
$dbh->{PrintError} = 0;
+ # HiSilicon KunPeng920 errors
+ if ($platform_id eq HISILICON_KUNPENG_920) {
+ try {
+ $query = "select id, timestamp, version, soc_id, socket_id, nimbus_id, module_id, sub_module_id, err_severity, regs_dump from hip08_oem_type1_event_v2 order by id, module_id, err_severity";
+ $query_handle = $dbh->prepare($query);
+ $query_handle->execute();
+ $query_handle->bind_columns(\($id, $timestamp, $version, $soc_id, $socket_id, $nimbus_id, $module_id, $sub_module_id, $err_severity, $regs));
+ $out = "";
+ while($query_handle->fetch()) {
+ $out .= "$id. $timestamp Error Info: ";
+ $out .= "version=$version, ";
+ $out .= "soc_id=$soc_id, " if ($soc_id);
+ $out .= "socket_id=$socket_id, " if ($socket_id);
+ $out .= "nimbus_id=$nimbus_id, " if ($nimbus_id);
+ $out .= "module_id=$module_id, " if ($module_id);
+ $out .= "sub_module_id=$sub_module_id, " if ($sub_module_id);
+ $out .= "err_severity=$err_severity, \n" if ($err_severity);
+ $out .= "Error Registers: $regs\n\n" if ($regs);
+ }
+ if ($out ne "") {
+ print "HiSilicon KunPeng920 OEM type1 error events:\n$out\n";
+ } else {
+ print "No HiSilicon KunPeng920 OEM type1 errors.\n";
+ }
+ $query_handle->finish;
+ } catch {
+ print "Exception: $DBI::errstr\n\n";
+ };
+
+ try {
+ $query = "select id, timestamp, version, soc_id, socket_id, nimbus_id, module_id, sub_module_id, err_severity, regs_dump from hip08_oem_type2_event_v2 order by id, module_id, err_severity";
+ $query_handle = $dbh->prepare($query);
+ $query_handle->execute();
+ $query_handle->bind_columns(\($id, $timestamp, $version, $soc_id, $socket_id, $nimbus_id, $module_id, $sub_module_id, $err_severity, $regs));
+ $out = "";
+ while($query_handle->fetch()) {
+ $out .= "$id. $timestamp Error Info: ";
+ $out .= "version=$version, ";
+ $out .= "soc_id=$soc_id, " if ($soc_id);
+ $out .= "socket_id=$socket_id, " if ($socket_id);
+ $out .= "nimbus_id=$nimbus_id, " if ($nimbus_id);
+ $out .= "module_id=$module_id, " if ($module_id);
+ $out .= "sub_module_id=$sub_module_id, " if ($sub_module_id);
+ $out .= "err_severity=$err_severity, \n" if ($err_severity);
+ $out .= "Error Registers: $regs\n\n" if ($regs);
+ }
+ if ($out ne "") {
+ print "HiSilicon KunPeng920 OEM type2 error events:\n$out\n";
+ } else {
+ print "No HiSilicon KunPeng920 OEM type2 errors.\n";
+ }
+ $query_handle->finish;
+ } catch {
+ print "Exception: $DBI::errstr\n\n";
+ };
+
+ try {
+ $query = "select id, timestamp, version, soc_id, socket_id, nimbus_id, sub_module_id, core_id, port_id, err_severity, err_type, regs_dump from hip08_pcie_local_event_v2 order by id, sub_module_id, err_severity";
+ $query_handle = $dbh->prepare($query);
+ $query_handle->execute();
+ $query_handle->bind_columns(\($id, $timestamp, $version, $soc_id, $socket_id, $nimbus_id, $sub_module_id, $core_id, $port_id, $err_severity, $err_type, $regs));
+ $out = "";
+ while($query_handle->fetch()) {
+ $out .= "$id. $timestamp Error Info: ";
+ $out .= "version=$version, ";
+ $out .= "soc_id=$soc_id, " if ($soc_id);
+ $out .= "socket_id=$socket_id, " if ($socket_id);
+ $out .= "nimbus_id=$nimbus_id, " if ($nimbus_id);
+ $out .= "sub_module_id=$sub_module_id, " if ($sub_module_id);
+ $out .= "core_id=$core_id, " if ($core_id);
+ $out .= "port_id=$port_id, " if ($port_id);
+ $out .= "err_severity=$err_severity, " if ($err_severity);
+ $out .= "err_type=$err_type, \n" if ($err_type);
+ $out .= "Error Registers: $regs\n\n" if ($regs);
+ }
+ if ($out ne "") {
+ print "HiSilicon KunPeng920 PCIe controller error events:\n$out\n";
+ } else {
+ print "No HiSilicon KunPeng920 PCIe controller errors.\n";
+ }
+ $query_handle->finish;
+ } catch {
+ print "Exception: $DBI::errstr\n\n";
+ };
+ }
+
undef($dbh);
}
sub vendor_platforms
{
print "\nSupported platforms for the vendor-specific errors:\n";
+ print "\tHiSilicon KunPeng920, platform-id=\"", HISILICON_KUNPENG_920, "\"\n";
+ print "\n";
}
sub log_msg { print STDERR "$prog: ", @_ unless $conf{opt}{quiet}; }
--
2.17.1
* [PATCH 3/3] rasdaemon: ras-mc-ctl: Add support for HiSilicon KunPeng9xx common errors
From: Shiju Jose @ 2020-12-04 10:13 UTC (permalink / raw)
To: linux-edac, mchehab+huawei
Cc: linuxarm, xuwei5, jonathan.cameron, john.garry, tanxiaofei,
shameerali.kolothum.thodi, salil.mehta, shiju.jose
Add support for the errors common to the HiSilicon KunPeng9xx platforms.
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Reviewed-by: Xiaofei Tan <tanxiaofei@huawei.com>
---
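With a second platform-id added, the subs compare $platform_id against each constant in turn, and an unrecognized id falls through all the checks without printing anything. The sketch below shows one possible way to keep the ids in a single table that can drive both the listing and a validity check. The @platforms table and the subs platform_listing and is_supported_platform are hypothetical and not part of this series; the constants and the output strings are taken from the patch.

```perl
use strict;
use warnings;

# Platform-id constants as defined in this series.
use constant {
    HISILICON_KUNPENG_920 => "KunPeng920",
    HISILICON_KUNPENG_9XX => "KunPeng9xx",
};

# Hypothetical table (not in the patch): display name and platform-id.
my @platforms = (
    ["HiSilicon KunPeng920", HISILICON_KUNPENG_920],
    ["HiSilicon KunPeng9xx", HISILICON_KUNPENG_9XX],
);

# Build the same text that vendor_platforms prints.
sub platform_listing {
    my $out = "\nSupported platforms for the vendor-specific errors:\n";
    for my $p (@platforms) {
        my ($name, $id) = @$p;
        $out .= "\t$name, platform-id=\"$id\"\n";
    }
    return $out . "\n";
}

# Return the number of table entries matching the given platform-id.
sub is_supported_platform {
    my ($id) = @_;
    return scalar grep { $_->[1] eq $id } @platforms;
}

print platform_listing();
print "unknown id rejected\n" unless is_supported_platform("Unknown");
```

A check of this kind would let the tool report an unsupported platform-id instead of exiting silently.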
util/ras-mc-ctl.in | 52 ++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 50 insertions(+), 2 deletions(-)
diff --git a/util/ras-mc-ctl.in b/util/ras-mc-ctl.in
index 711e4b0..0885de1 100755
--- a/util/ras-mc-ctl.in
+++ b/util/ras-mc-ctl.in
@@ -1555,6 +1555,7 @@ sub errors
# Definitions of the vendor platform IDs.
use constant {
HISILICON_KUNPENG_920 => "KunPeng920",
+ HISILICON_KUNPENG_9XX => "KunPeng9xx",
};
sub vendor_errors_summary
@@ -1562,7 +1563,7 @@ sub vendor_errors_summary
require DBI;
my ($num_args, $platform_id);
my ($query, $query_handle, $count, $out);
- my ($module_id, $sub_module_id, $err_severity, $err_sev);
+ my ($module_id, $sub_module_id, $err_severity, $err_sev, $err_info);
$num_args = $#ARGV + 1;
$platform_id = 0;
@@ -1651,6 +1652,28 @@ sub vendor_errors_summary
};
}
+ # HiSilicon KunPeng9xx common errors
+ if ($platform_id eq HISILICON_KUNPENG_9XX) {
+ try {
+ $query = "select err_info, count(*) from hisi_common_section";
+ $query_handle = $dbh->prepare($query);
+ $query_handle->execute();
+ $query_handle->bind_columns(\($err_info, $count));
+ $out = "";
+ while($query_handle->fetch()) {
+ $out .= "\terrors: $count\n";
+ }
+ if ($out ne "") {
+ print "HiSilicon KunPeng9xx common error events summary:\n$out\n";
+ } else {
+ print "No HiSilicon KunPeng9xx common errors.\n\n";
+ }
+ $query_handle->finish;
+ } catch {
+ print "Exception: $DBI::errstr\n\n";
+ };
+ }
+
undef($dbh);
}
@@ -1660,7 +1683,7 @@ sub vendor_errors
my ($num_args, $platform_id);
my ($query, $query_handle, $id, $timestamp, $out);
my ($version, $soc_id, $socket_id, $nimbus_id, $core_id, $port_id);
- my ($module_id, $sub_module_id, $err_severity, $err_type, $regs);
+ my ($module_id, $sub_module_id, $err_severity, $err_type, $err_info, $regs);
$num_args = $#ARGV + 1;
$platform_id = 0;
@@ -1760,6 +1783,30 @@ sub vendor_errors
};
}
+ # HiSilicon KunPeng9xx common errors
+ if ($platform_id eq HISILICON_KUNPENG_9XX) {
+ try {
+ $query = "select id, timestamp, err_info, regs_dump from hisi_common_section order by id";
+ $query_handle = $dbh->prepare($query);
+ $query_handle->execute();
+ $query_handle->bind_columns(\($id, $timestamp, $err_info, $regs));
+ $out = "";
+ while($query_handle->fetch()) {
+ $out .= "$id. $timestamp ";
+ $out .= "Error Info:$err_info \n" if ($err_info);
+ $out .= "Error Registers: $regs\n\n" if ($regs);
+ }
+ if ($out ne "") {
+ print "HiSilicon KunPeng9xx common error events:\n$out\n";
+ } else {
+ print "No HiSilicon KunPeng9xx common errors.\n";
+ }
+ $query_handle->finish;
+ } catch {
+ print "Exception: $DBI::errstr\n\n";
+ };
+ }
+
undef($dbh);
}
@@ -1767,6 +1814,7 @@ sub vendor_platforms
{
print "\nSupported platforms for the vendor-specific errors:\n";
print "\tHiSilicon KunPeng920, platform-id=\"", HISILICON_KUNPENG_920, "\"\n";
+ print "\tHiSilicon KunPeng9xx, platform-id=\"", HISILICON_KUNPENG_9XX, "\"\n";
print "\n";
}
--
2.17.1