linux-edac.vger.kernel.org archive mirror
* [PATCH 0/3] rasdaemon: ras-mc-ctl: Support vendor-specific error events
From: Shiju Jose @ 2020-12-04 10:13 UTC
  To: linux-edac, mchehab+huawei
  Cc: linuxarm, xuwei5, jonathan.cameron, john.garry, tanxiaofei,
	shameerali.kolothum.thodi, salil.mehta, shiju.jose

Add support to ras-mc-ctl for logging vendor-specific error events,
and add support for the HiSilicon KunPeng9xx server errors.

Shiju Jose (3):
  rasdaemon: ras-mc-ctl: Add support for the vendor-specific errors
  rasdaemon: ras-mc-ctl: Add support for HiSilicon KunPeng920 errors
  rasdaemon: ras-mc-ctl: Add support for HiSilicon KunPeng9xx common
    errors

 util/ras-mc-ctl.in | 289 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 288 insertions(+), 1 deletion(-)

-- 
2.17.1



* [PATCH 1/3] rasdaemon: ras-mc-ctl: Add support for the vendor-specific errors
From: Shiju Jose @ 2020-12-04 10:13 UTC
  To: linux-edac, mchehab+huawei
  Cc: linuxarm, xuwei5, jonathan.cameron, john.garry, tanxiaofei,
	shameerali.kolothum.thodi, salil.mehta, shiju.jose

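Add the --vendor-errors-summary, --vendor-errors and --vendor-platforms
commands to ras-mc-ctl for reporting vendor-specific error information.
The handlers only open and close the error database for now; the
per-platform queries are added by the later patches in this series.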

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Reviewed-by: Xiaofei Tan <tanxiaofei@huawei.com>
---
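Note on how <platform-id> reaches the handlers: the three new options are
registered as plain boolean flags, so Getopt::Long does not consume the
platform-id; it stays behind in @ARGV and the handlers read $ARGV[0]. The
sketch below illustrates that behaviour in isolation. It is not code from
the patch: parse_vendor_opts is a hypothetical helper, and it uses
GetOptionsFromArray instead of GetOptions only so that it can run without
touching the real @ARGV.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: parse an argument list with the same three boolean
# flags the patch registers, and report what is left over afterwards.
sub parse_vendor_opts {
    my @args = @_;
    my %opt = (
        vendor_errors_summary => 0,
        vendor_errors         => 0,
        vendor_platforms      => 0,
    );
    my $ok = GetOptionsFromArray(\@args,
        "vendor-errors-summary" => \$opt{vendor_errors_summary},
        "vendor-errors"         => \$opt{vendor_errors},
        "vendor-platforms"      => \$opt{vendor_platforms},
    );
    # The leftover arguments play the role of @ARGV in ras-mc-ctl.
    my $platform_id = @args ? $args[0] : undef;
    return ($ok, \%opt, $platform_id);
}

my ($ok, $opt, $platform_id) = parse_vendor_opts("--vendor-errors", "KunPeng920");
print "vendor_errors=$opt->{vendor_errors} platform_id=$platform_id\n";
```

With no platform-id on the command line the leftover list is empty, which
corresponds to the early "return" in vendor_errors_summary() and
vendor_errors().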
 util/ras-mc-ctl.in | 68 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 67 insertions(+), 1 deletion(-)

diff --git a/util/ras-mc-ctl.in b/util/ras-mc-ctl.in
index 07b52e9..f457f0a 100755
--- a/util/ras-mc-ctl.in
+++ b/util/ras-mc-ctl.in
@@ -80,6 +80,9 @@ Usage: $prog [OPTIONS...]
  --summary          Presents a summary of the logged errors.
  --errors           Shows the errors stored at the error database.
  --error-count      Shows the corrected and uncorrected error counts using sysfs.
+ --vendor-errors-summary <platform-id>    Presents a summary of the vendor-specific logged errors.
+ --vendor-errors         <platform-id>    Shows the vendor-specific errors stored in the error database.
+ --vendor-platforms List the supported platforms with platform-ids for the vendor-specific errors.
  --help             This help message.
 EOF
 
@@ -127,6 +130,18 @@ if ($conf{opt}{errors}) {
     errors ();
 }
 
+if ($conf{opt}{vendor_errors_summary}) {
+    vendor_errors_summary ();
+}
+
+if ($conf{opt}{vendor_errors}) {
+    vendor_errors ();
+}
+
+if ($conf{opt}{vendor_platforms}) {
+    vendor_platforms ();
+}
+
 exit (0);
 
 sub parse_cmdline
@@ -142,6 +157,9 @@ sub parse_cmdline
     $conf{opt}{summary} = 0;
     $conf{opt}{errors} = 0;
     $conf{opt}{error_count} = 0;
+    $conf{opt}{vendor_errors_summary} = 0;
+    $conf{opt}{vendor_errors} = 0;
+    $conf{opt}{vendor_platforms} = 0;
 
     my $rref = \$conf{opt}{report};
     my $mref = \$conf{opt}{mainboard};
@@ -159,7 +177,10 @@ sub parse_cmdline
                          "layout" =>          \$conf{opt}{display_memory_layout},
                          "summary" =>         \$conf{opt}{summary},
                          "errors" =>          \$conf{opt}{errors},
-                         "error-count" =>     \$conf{opt}{error_count}
+                         "error-count" =>     \$conf{opt}{error_count},
+                         "vendor-errors-summary" =>    \$conf{opt}{vendor_errors_summary},
+                         "vendor-errors" =>   \$conf{opt}{vendor_errors},
+                         "vendor-platforms" =>    \$conf{opt}{vendor_platforms},
             );
 
     usage(1) if !$rc;
@@ -1531,6 +1552,51 @@ sub errors
     undef($dbh);
 }
 
+sub vendor_errors_summary
+{
+    require DBI;
+    my ($num_args, $platform_id);
+
+    $num_args = $#ARGV + 1;
+    $platform_id = 0;
+    if ($num_args ne 0) {
+        $platform_id = $ARGV[0];
+    } else {
+        return;
+    }
+
+    my $dbh = DBI->connect("dbi:SQLite:dbname=$dbname", "", "", {});
+    # Disable the DBI automatic exception log
+    $dbh->{PrintError} = 0;
+
+    undef($dbh);
+}
+
+sub vendor_errors
+{
+    require DBI;
+    my ($num_args, $platform_id);
+
+    $num_args = $#ARGV + 1;
+    $platform_id = 0;
+    if ($num_args ne 0) {
+        $platform_id = $ARGV[0];
+    } else {
+        return;
+    }
+
+    my $dbh = DBI->connect("dbi:SQLite:dbname=$dbname", "", "", {});
+    # Disable the DBI automatic exception log
+    $dbh->{PrintError} = 0;
+
+    undef($dbh);
+}
+
+sub vendor_platforms
+{
+        print "\nSupported platforms for the vendor-specific errors:\n";
+}
+
 sub log_msg   { print STDERR "$prog: ", @_ unless $conf{opt}{quiet}; }
 sub log_error { log_msg ("Error: @_"); }
 
-- 
2.17.1



* [PATCH 2/3] rasdaemon: ras-mc-ctl: Add support for HiSilicon KunPeng920 errors
From: Shiju Jose @ 2020-12-04 10:13 UTC
  To: linux-edac, mchehab+huawei
  Cc: linuxarm, xuwei5, jonathan.cameron, john.garry, tanxiaofei,
	shameerali.kolothum.thodi, salil.mehta, shiju.jose

Add support for the HiSilicon KunPeng920 errors.
The supported error formats are OEM type 1, OEM type 2 and
PCIe controller.

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Reviewed-by: Xiaofei Tan <tanxiaofei@huawei.com>
---
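Note on the summary output: each summary query returns
(err_severity, module_id, count) rows, and the loop prints a severity
heading whenever the severity differs from the previous row. The sketch
below shows that formatting step on its own, with an in-memory list in
place of the database. format_summary and the sample rows are made up for
illustration; the severity and module strings are not taken from a real
error log.

```perl
use strict;
use warnings;

# Hypothetical helper: turn (err_severity, module_id, count) rows into the
# summary text, emitting a heading each time the severity changes.
sub format_summary {
    my @rows = @_;
    my ($out, $last_sev) = ("", "");
    for my $row (@rows) {
        my ($err_severity, $module_id, $count) = @$row;
        if ($err_severity ne $last_sev) {
            $out .= "$err_severity errors:\n";
            $last_sev = $err_severity;
        }
        $out .= "\t$module_id: $count\n";
    }
    return $out;
}

# Made-up sample rows, already ordered by severity.
my @rows = (
    ["corrected", "MN",  3],
    ["corrected", "PLL", 1],
    ["fatal",     "MN",  2],
);
print format_summary(@rows);
```

The headings come out once per severity only if rows of the same severity
arrive next to each other. SQL does not promise any row order without an
ORDER BY clause, so adding "order by err_severity, module_id" to the
queries may be worth considering.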
 util/ras-mc-ctl.in | 173 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 173 insertions(+)

diff --git a/util/ras-mc-ctl.in b/util/ras-mc-ctl.in
index f457f0a..711e4b0 100755
--- a/util/ras-mc-ctl.in
+++ b/util/ras-mc-ctl.in
@@ -1552,10 +1552,17 @@ sub errors
     undef($dbh);
 }
 
+# Definitions of the vendor platform IDs.
+use constant {
+    HISILICON_KUNPENG_920 => "KunPeng920",
+};
+
 sub vendor_errors_summary
 {
     require DBI;
     my ($num_args, $platform_id);
+    my ($query, $query_handle, $count, $out);
+    my ($module_id, $sub_module_id, $err_severity, $err_sev);
 
     $num_args = $#ARGV + 1;
     $platform_id = 0;
@@ -1569,6 +1576,81 @@ sub vendor_errors_summary
     # Disable the DBI automatic exception log
     $dbh->{PrintError} = 0;
 
+    # HiSilicon KunPeng920 errors
+    if ($platform_id eq HISILICON_KUNPENG_920) {
+        try {
+            $query = "select err_severity, module_id, count(*) from hip08_oem_type1_event_v2 group by err_severity, module_id";
+            $query_handle = $dbh->prepare($query);
+            $query_handle->execute();
+            $query_handle->bind_columns(\($err_severity, $module_id, $count));
+            $out = "";
+            $err_sev = "";
+	    while($query_handle->fetch()) {
+                if ($err_severity ne $err_sev) {
+                        $out .= "$err_severity errors:\n";
+                        $err_sev = $err_severity;
+		}
+                $out .= "\t$module_id: $count\n";
+            }
+            if ($out ne "") {
+                print "HiSilicon KunPeng920 OEM type1 error events summary:\n$out\n";
+            } else {
+                print "No HiSilicon KunPeng920 OEM type1 errors.\n\n";
+            }
+            $query_handle->finish;
+        } catch {
+            print "Exception: $DBI::errstr\n\n";
+        };
+
+	try {
+            $query = "select err_severity, module_id, count(*) from hip08_oem_type2_event_v2 group by err_severity, module_id";
+            $query_handle = $dbh->prepare($query);
+            $query_handle->execute();
+            $query_handle->bind_columns(\($err_severity, $module_id, $count));
+            $out = "";
+            $err_sev = "";
+            while($query_handle->fetch()) {
+                if ($err_severity ne $err_sev) {
+                        $out .= "$err_severity errors:\n";
+                        $err_sev = $err_severity;
+		}
+                $out .= "\t$module_id: $count\n";
+            }
+            if ($out ne "") {
+                print "HiSilicon KunPeng920 OEM type2 error events summary:\n$out\n";
+            } else {
+                print "No HiSilicon KunPeng920 OEM type2 errors.\n\n";
+            }
+            $query_handle->finish;
+        } catch {
+            print "Exception: $DBI::errstr\n\n";
+        };
+
+        try {
+            $query = "select err_severity, sub_module_id, count(*) from hip08_pcie_local_event_v2 group by err_severity, sub_module_id";
+            $query_handle = $dbh->prepare($query);
+            $query_handle->execute();
+            $query_handle->bind_columns(\($err_severity, $sub_module_id, $count));
+            $out = "";
+            $err_sev = "";
+            while($query_handle->fetch()) {
+                if ($err_severity ne $err_sev) {
+                        $out .= "$err_severity errors:\n";
+                        $err_sev = $err_severity;
+		}
+                $out .= "\t$sub_module_id: $count\n";
+            }
+            if ($out ne "") {
+                print "HiSilicon KunPeng920 PCIe controller error events summary:\n$out\n";
+            } else {
+                print "No HiSilicon KunPeng920 PCIe controller errors.\n\n";
+            }
+            $query_handle->finish;
+        } catch {
+            print "Exception: $DBI::errstr\n\n";
+        };
+    }
+
     undef($dbh);
 }
 
@@ -1576,6 +1658,9 @@ sub vendor_errors
 {
     require DBI;
     my ($num_args, $platform_id);
+    my ($query, $query_handle, $id, $timestamp, $out);
+    my ($version, $soc_id, $socket_id, $nimbus_id, $core_id, $port_id);
+    my ($module_id, $sub_module_id, $err_severity, $err_type, $regs);
 
     $num_args = $#ARGV + 1;
     $platform_id = 0;
@@ -1589,12 +1674,100 @@ sub vendor_errors
     # Disable the DBI automatic exception log
     $dbh->{PrintError} = 0;
 
+    # HiSilicon KunPeng920 errors
+    if ($platform_id eq HISILICON_KUNPENG_920) {
+        try {
+            $query = "select id, timestamp, version, soc_id, socket_id, nimbus_id, module_id, sub_module_id, err_severity, regs_dump from hip08_oem_type1_event_v2 order by id, module_id, err_severity";
+            $query_handle = $dbh->prepare($query);
+            $query_handle->execute();
+            $query_handle->bind_columns(\($id, $timestamp, $version, $soc_id, $socket_id, $nimbus_id, $module_id, $sub_module_id, $err_severity, $regs));
+            $out = "";
+            while($query_handle->fetch()) {
+                $out .= "$id. $timestamp Error Info: ";
+                $out .= "version=$version, ";
+                $out .= "soc_id=$soc_id, " if ($soc_id);
+                $out .= "socket_id=$socket_id, " if ($socket_id);
+                $out .= "nimbus_id=$nimbus_id, " if ($nimbus_id);
+                $out .= "module_id=$module_id, " if ($module_id);
+                $out .= "sub_module_id=$sub_module_id, " if ($sub_module_id);
+                $out .= "err_severity=$err_severity, \n" if ($err_severity);
+                $out .= "Error Registers: $regs\n\n" if ($regs);
+            }
+            if ($out ne "") {
+                print "HiSilicon KunPeng920 OEM type1 error events:\n$out\n";
+            } else {
+                print "No HiSilicon KunPeng920 OEM type1 errors.\n";
+            }
+            $query_handle->finish;
+        } catch {
+            print "Exception: $DBI::errstr\n\n";
+        };
+
+        try {
+            $query = "select id, timestamp, version, soc_id, socket_id, nimbus_id, module_id, sub_module_id, err_severity, regs_dump from hip08_oem_type2_event_v2 order by id, module_id, err_severity";
+            $query_handle = $dbh->prepare($query);
+            $query_handle->execute();
+            $query_handle->bind_columns(\($id, $timestamp, $version, $soc_id, $socket_id, $nimbus_id, $module_id, $sub_module_id, $err_severity, $regs));
+            $out = "";
+            while($query_handle->fetch()) {
+                $out .= "$id. $timestamp Error Info: ";
+                $out .= "version=$version, ";
+                $out .= "soc_id=$soc_id, " if ($soc_id);
+                $out .= "socket_id=$socket_id, " if ($socket_id);
+                $out .= "nimbus_id=$nimbus_id, " if ($nimbus_id);
+                $out .= "module_id=$module_id, " if ($module_id);
+                $out .= "sub_module_id=$sub_module_id, " if ($sub_module_id);
+                $out .= "err_severity=$err_severity, \n" if ($err_severity);
+                $out .= "Error Registers: $regs\n\n" if ($regs);
+            }
+            if ($out ne "") {
+                print "HiSilicon KunPeng920 OEM type2 error events:\n$out\n";
+            } else {
+                print "No HiSilicon KunPeng920 OEM type2 errors.\n";
+            }
+            $query_handle->finish;
+        } catch {
+            print "Exception: $DBI::errstr\n\n";
+        };
+
+	try {
+            $query = "select id, timestamp, version, soc_id, socket_id, nimbus_id, sub_module_id, core_id, port_id, err_severity, err_type, regs_dump from hip08_pcie_local_event_v2 order by id, sub_module_id, err_severity";
+            $query_handle = $dbh->prepare($query);
+            $query_handle->execute();
+            $query_handle->bind_columns(\($id, $timestamp, $version, $soc_id, $socket_id, $nimbus_id, $sub_module_id, $core_id, $port_id, $err_severity, $err_type, $regs));
+            $out = "";
+            while($query_handle->fetch()) {
+                $out .= "$id. $timestamp Error Info: ";
+                $out .= "version=$version, ";
+                $out .= "soc_id=$soc_id, " if ($soc_id);
+                $out .= "socket_id=$socket_id, " if ($socket_id);
+                $out .= "nimbus_id=$nimbus_id, " if ($nimbus_id);
+                $out .= "sub_module_id=$sub_module_id, " if ($sub_module_id);
+                $out .= "core_id=$core_id, " if ($core_id);
+                $out .= "port_id=$port_id, " if ($port_id);
+                $out .= "err_severity=$err_severity, " if ($err_severity);
+                $out .= "err_type=$err_type, \n" if ($err_type);
+                $out .= "Error Registers: $regs\n\n" if ($regs);
+            }
+            if ($out ne "") {
+                print "HiSilicon KunPeng920 PCIe controller error events:\n$out\n";
+            } else {
+                print "No HiSilicon KunPeng920 PCIe controller errors.\n";
+            }
+            $query_handle->finish;
+        } catch {
+            print "Exception: $DBI::errstr\n\n";
+        };
+    }
+
     undef($dbh);
 }
 
 sub vendor_platforms
 {
         print "\nSupported platforms for the vendor-specific errors:\n";
+        print "\tHiSilicon KunPeng920, platform-id=\"", HISILICON_KUNPENG_920, "\"\n";
+        print "\n";
 }
 
 sub log_msg   { print STDERR "$prog: ", @_ unless $conf{opt}{quiet}; }
-- 
2.17.1



* [PATCH 3/3] rasdaemon: ras-mc-ctl: Add support for HiSilicon KunPeng9xx common errors
From: Shiju Jose @ 2020-12-04 10:13 UTC
  To: linux-edac, mchehab+huawei
  Cc: linuxarm, xuwei5, jonathan.cameron, john.garry, tanxiaofei,
	shameerali.kolothum.thodi, salil.mehta, shiju.jose

Add support for the common errors of the HiSilicon KunPeng9xx platforms.

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Reviewed-by: Xiaofei Tan <tanxiaofei@huawei.com>
---
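Note on the summary query: "select err_info, count(*) from
hisi_common_section" is an aggregate query with no GROUP BY, and SQLite
returns exactly one row for such a query even when the table is empty. If
that holds here, the summary would print "errors: 0" and never reach the
"No HiSilicon KunPeng9xx common errors" branch. The sketch below checks the
row counts against an in-memory database. It assumes DBI and DBD::SQLite
are installed (ras-mc-ctl needs them anyway); the table layout is a
cut-down stand-in and row_count is a hypothetical helper.

```perl
use strict;
use warnings;
use DBI;

# In-memory database with a cut-down, hypothetical version of the table.
my $dbh = DBI->connect("dbi:SQLite:dbname=:memory:", "", "",
                       { RaiseError => 1, PrintError => 0 });
$dbh->do("create table hisi_common_section (id integer primary key, err_info text)");

# Hypothetical helper: number of rows a query returns.
sub row_count {
    my ($sql) = @_;
    return scalar @{ $dbh->selectall_arrayref($sql) };
}

my $plain   = "select err_info, count(*) from hisi_common_section";
my $grouped = "$plain group by err_info";

# The table is still empty at this point.
my $plain_rows   = row_count($plain);
my $grouped_rows = row_count($grouped);
print "empty table: plain=$plain_rows grouped=$grouped_rows\n";
```

Either testing $count against zero or adding a GROUP BY would let the
empty case be detected; which of the two fits the intended output is a
judgment call for the patch author.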
 util/ras-mc-ctl.in | 52 ++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 50 insertions(+), 2 deletions(-)

diff --git a/util/ras-mc-ctl.in b/util/ras-mc-ctl.in
index 711e4b0..0885de1 100755
--- a/util/ras-mc-ctl.in
+++ b/util/ras-mc-ctl.in
@@ -1555,6 +1555,7 @@ sub errors
 # Definitions of the vendor platform IDs.
 use constant {
     HISILICON_KUNPENG_920 => "KunPeng920",
+    HISILICON_KUNPENG_9XX => "KunPeng9xx",
 };
 
 sub vendor_errors_summary
@@ -1562,7 +1563,7 @@ sub vendor_errors_summary
     require DBI;
     my ($num_args, $platform_id);
     my ($query, $query_handle, $count, $out);
-    my ($module_id, $sub_module_id, $err_severity, $err_sev);
+    my ($module_id, $sub_module_id, $err_severity, $err_sev, $err_info);
 
     $num_args = $#ARGV + 1;
     $platform_id = 0;
@@ -1651,6 +1652,28 @@ sub vendor_errors_summary
         };
     }
 
+    # HiSilicon KunPeng9xx common errors
+    if ($platform_id eq HISILICON_KUNPENG_9XX) {
+        try {
+            $query = "select err_info, count(*) from hisi_common_section";
+            $query_handle = $dbh->prepare($query);
+            $query_handle->execute();
+            $query_handle->bind_columns(\($err_info, $count));
+            $out = "";
+            while($query_handle->fetch()) {
+                $out .= "\terrors: $count\n";
+            }
+            if ($out ne "") {
+                print "HiSilicon KunPeng9xx common error events summary:\n$out\n";
+            } else {
+                print "No HiSilicon KunPeng9xx common errors.\n\n";
+            }
+            $query_handle->finish;
+        } catch {
+            print "Exception: $DBI::errstr\n\n";
+        };
+    }
+
     undef($dbh);
 }
 
@@ -1660,7 +1683,7 @@ sub vendor_errors
     my ($num_args, $platform_id);
     my ($query, $query_handle, $id, $timestamp, $out);
     my ($version, $soc_id, $socket_id, $nimbus_id, $core_id, $port_id);
-    my ($module_id, $sub_module_id, $err_severity, $err_type, $regs);
+    my ($module_id, $sub_module_id, $err_severity, $err_type, $err_info, $regs);
 
     $num_args = $#ARGV + 1;
     $platform_id = 0;
@@ -1760,6 +1783,30 @@ sub vendor_errors
         };
     }
 
+    # HiSilicon KunPeng9xx common errors
+    if ($platform_id eq HISILICON_KUNPENG_9XX) {
+        try {
+            $query = "select id, timestamp, err_info, regs_dump from hisi_common_section order by id";
+            $query_handle = $dbh->prepare($query);
+            $query_handle->execute();
+            $query_handle->bind_columns(\($id, $timestamp, $err_info, $regs));
+            $out = "";
+            while($query_handle->fetch()) {
+                $out .= "$id. $timestamp ";
+                $out .= "Error Info:$err_info \n" if ($err_info);
+                $out .= "Error Registers: $regs\n\n" if ($regs);
+            }
+            if ($out ne "") {
+                print "HiSilicon KunPeng9xx common error events:\n$out\n";
+            } else {
+                print "No HiSilicon KunPeng9xx common errors.\n";
+            }
+            $query_handle->finish;
+        } catch {
+            print "Exception: $DBI::errstr\n\n";
+        };
+    }
+
     undef($dbh);
 }
 
@@ -1767,6 +1814,7 @@ sub vendor_platforms
 {
         print "\nSupported platforms for the vendor-specific errors:\n";
         print "\tHiSilicon KunPeng920, platform-id=\"", HISILICON_KUNPENG_920, "\"\n";
+        print "\tHiSilicon KunPeng9xx, platform-id=\"", HISILICON_KUNPENG_9XX, "\"\n";
         print "\n";
 }
 
-- 
2.17.1


