From: Tyler Baicar <tbaicar@codeaurora.org>
To: christoffer.dall@linaro.org, marc.zyngier@arm.com,
	pbonzini@redhat.com, rkrcmar@redhat.com, linux@armlinux.org.uk,
	catalin.marinas@arm.com, will.deacon@arm.com, rjw@rjwysocki.net,
	lenb@kernel.org, matt@codeblueprint.co.uk,
	robert.moore@intel.com, lv.zheng@intel.com, nkaje@codeaurora.org,
	zjzhang@codeaurora.org, mark.rutland@arm.com,
	james.morse@arm.com, akpm@linux-foundation.org,
	eun.taik.lee@samsung.com, sandeepa.s.prabhu@gmail.com,
	labbott@redhat.com, shijie.huang@arm.com,
	rruigrok@codeaurora.org, paul.gortmaker@windriver.com,
	tn@semihalf.com, fu.wei@linaro.org, rostedt@goodmis.org,
	bristot@redhat.com, linux-arm-kernel@lists.infradead.org,
	kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org,
	linux-efi@vger.kernel.org, devel@acpica.org,
	Suzuki.Poulose@arm.com, punit.agrawal@arm.com, astone@redhat.com,
	harba@codeaur
Cc: Tyler Baicar <tbaicar@codeaurora.org>
Subject: [PATCH V12 06/10] acpi: apei: panic OS with fatal error status block
Date: Mon,  6 Mar 2017 13:44:59 -0700
Message-ID: <1488833103-21082-7-git-send-email-tbaicar@codeaurora.org>
In-Reply-To: <1488833103-21082-1-git-send-email-tbaicar@codeaurora.org>

From: "Jonathan (Zhixiong) Zhang" <zjzhang@codeaurora.org>

Even if an error status block's severity is fatal, the kernel does not
honor the severity level and does not panic.

With the firmware-first model, the platform can inform the OS about a
fatal hardware error through a non-NMI GHES notification type. The OS
should panic when it receives a hardware error record with fatal
severity.

If the severity is fatal, call panic() after the CPER data in the error
status block is printed and before any error section is handled.

Signed-off-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
Reviewed-by: James Morse <james.morse@arm.com>
---
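Note: the two decisions this patch introduces can be sketched as a small
userspace model. This is an illustration only, not kernel code. The
model_* names are hypothetical, and the enum ordering is an assumption
meant to mirror the GHES_SEV_* values that ghes_proc() compares against.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed severity ordering, modeled on the kernel's GHES_SEV_* values. */
enum model_sev {
	MODEL_SEV_NO,
	MODEL_SEV_CORRECTED,
	MODEL_SEV_RECOVERABLE,
	MODEL_SEV_PANIC,
};

/* Stand-in for ghes_panic_timeout, which the patch keeps at 30 seconds. */
#define MODEL_GHES_PANIC_TIMEOUT 30

/*
 * Models the fallback in __ghes_call_panic(): a non-zero panic_timeout
 * set by the administrator is kept; zero is replaced by the GHES default
 * so the machine reboots and the error can be logged.
 */
static int model_effective_panic_timeout(int panic_timeout)
{
	return panic_timeout == 0 ? MODEL_GHES_PANIC_TIMEOUT : panic_timeout;
}

/* Models the new check in ghes_proc(): panic at or above PANIC severity. */
static bool model_should_panic(enum model_sev sev)
{
	return sev >= MODEL_SEV_PANIC;
}

/* Walks through the cases the commit message describes. */
static void model_demo(void)
{
	/* A fatal record panics; a recoverable one continues as before. */
	assert(model_should_panic(MODEL_SEV_PANIC));
	assert(!model_should_panic(MODEL_SEV_RECOVERABLE));

	/* Only an unset (zero) timeout is overridden. */
	assert(model_effective_panic_timeout(0) == MODEL_GHES_PANIC_TIMEOUT);
	assert(model_effective_panic_timeout(5) == 5);
}
```

In the patch itself the severity check runs right after
ghes_read_estatus() succeeds, so a fatal record is printed at KERN_EMERG
and panics before the estatus cache or any section handler is consulted.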
 drivers/acpi/apei/ghes.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index b0596ba..d6a3b9f 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -133,6 +133,8 @@
 static struct ghes_estatus_cache *ghes_estatus_caches[GHES_ESTATUS_CACHES_SIZE];
 static atomic_t ghes_estatus_cache_alloced;
 
+static int ghes_panic_timeout __read_mostly = 30;
+
 static int ghes_ioremap_init(void)
 {
 	ghes_ioremap_area = __get_vm_area(PAGE_SIZE * GHES_IOREMAP_PAGES,
@@ -688,6 +690,13 @@ static int ghes_ack_error(struct acpi_hest_generic_v2 *generic_v2)
 	return rc;
 }
 
+static void __ghes_call_panic(void)
+{
+	if (panic_timeout == 0)
+		panic_timeout = ghes_panic_timeout;
+	panic("Fatal hardware error!");
+}
+
 static int ghes_proc(struct ghes *ghes)
 {
 	int rc;
@@ -695,6 +704,10 @@ static int ghes_proc(struct ghes *ghes)
 	rc = ghes_read_estatus(ghes, 0);
 	if (rc)
 		goto out;
+	if (ghes_severity(ghes->estatus->error_severity) >= GHES_SEV_PANIC) {
+		__ghes_print_estatus(KERN_EMERG, ghes->generic, ghes->estatus);
+		__ghes_call_panic();
+	}
 	if (!ghes_estatus_cached(ghes->estatus)) {
 		if (ghes_print_estatus(NULL, ghes->generic, ghes->estatus))
 			ghes_estatus_cache_add(ghes->generic, ghes->estatus);
@@ -831,8 +844,6 @@ static inline void ghes_sea_remove(struct ghes *ghes)
 
 static LIST_HEAD(ghes_nmi);
 
-static int ghes_panic_timeout	__read_mostly = 30;
-
 static void ghes_proc_in_irq(struct irq_work *irq_work)
 {
 	struct llist_node *llnode, *next;
@@ -925,9 +936,7 @@ static void __ghes_panic(struct ghes *ghes)
 	__ghes_print_estatus(KERN_EMERG, ghes->generic, ghes->estatus);
 
 	/* reboot to log the error! */
-	if (panic_timeout == 0)
-		panic_timeout = ghes_panic_timeout;
-	panic("Fatal hardware error!");
+	__ghes_call_panic();
 }
 
 static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.
