All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4] ACPI / APEI: Boot Error Record Table processing was needlessly complicated
@ 2017-06-15 22:41 Luck, Tony
  2017-06-19 15:58 ` Borislav Petkov
  0 siblings, 1 reply; 6+ messages in thread
From: Luck, Tony @ 2017-06-15 22:41 UTC (permalink / raw)
  To: Rafael J . Wysocki
  Cc: Tony Luck, Len Brown, Huang Ying, Borislav Petkov,
	Tomasz Nowicki, Jonathan Zhang, Tyler Baicar, linux-acpi

From: Tony Luck <tony.luck@intel.com>

Quoting version 6.1 of the ACPI specification. Section 18.3.1 "Boot
Error Source" says:

  The Boot Error Region is a range of addressable memory OSPM can access
  during initialization to determine if an unhandled error condition
  occurred. System firmware must report this memory range as firmware
  reserved. The format of the Boot Error Region follow that of an Error
  Status Block, this is defined in Section 18.3.2.7. The format of the
  error status block is described by Table 18-342.

This clarifies some points that were obfuscated in earlier versions.
E.g. there is no longer a separate table to describe the format of the
"Boot Error Region" (which was identical to the "Error Status Block").
Also saying "follow that of *an* Error Status Block" makes it clear that
there is just one block (which can still contain multiple "Generic Error
Data Entry structures").

The loop inside bert_print_all() is unnecessary (but probably harmless
as the "while (remain > sizeof(struct acpi_bert_region))" loop should
terminate after we skipped over the first entry.

We can drop the "bert_print_all()" function and just move the four
relevant lines inline in "bert_init()".

Also this driver clears the "block_status" status field of the error
status block in ACPI memory. I can see no justification for doing this
in the ACPI spec.  It is also highly kernel-ego-centric.  There are
user mode BERT harvesting tools that dig into /sys/firmware/acpi/tables/BERT
which might like to see this summary value.

Drop the:
	estatus->block_status = 0;

Cc: Len Brown <lenb@kernel.org>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Cc: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Cc: Tyler Baicar <tbaicar@codeaurora.org>
Cc: linux-acpi@vger.kernel.org
Signed-off-by: Tony Luck <tony.luck@intel.com>
---
 drivers/acpi/apei/bert.c | 59 ++++++++++--------------------------------------
 1 file changed, 12 insertions(+), 47 deletions(-)

v4: don't clear block_status

previous version was
	Reviewed-by: Borislav Petkov <bp@suse.de>
	Tested-by: Tyler Baicar <tbaicar@codeaurora.org>


diff --git a/drivers/acpi/apei/bert.c b/drivers/acpi/apei/bert.c
index 12771fcf0417..3b9b296dc84e 100644
--- a/drivers/acpi/apei/bert.c
+++ b/drivers/acpi/apei/bert.c
@@ -34,50 +34,6 @@
 
 static int bert_disable;
 
-static void __init bert_print_all(struct acpi_bert_region *region,
-				  unsigned int region_len)
-{
-	struct acpi_hest_generic_status *estatus =
-		(struct acpi_hest_generic_status *)region;
-	int remain = region_len;
-	u32 estatus_len;
-
-	if (!estatus->block_status)
-		return;
-
-	while (remain > sizeof(struct acpi_bert_region)) {
-		if (cper_estatus_check(estatus)) {
-			pr_err(FW_BUG "Invalid error record.\n");
-			return;
-		}
-
-		estatus_len = cper_estatus_len(estatus);
-		if (remain < estatus_len) {
-			pr_err(FW_BUG "Truncated status block (length: %u).\n",
-			       estatus_len);
-			return;
-		}
-
-		pr_info_once("Error records from previous boot:\n");
-
-		cper_estatus_print(KERN_INFO HW_ERR, estatus);
-
-		/*
-		 * Because the boot error source is "one-time polled" type,
-		 * clear Block Status of current Generic Error Status Block,
-		 * once it's printed.
-		 */
-		estatus->block_status = 0;
-
-		estatus = (void *)estatus + estatus_len;
-		/* No more error records. */
-		if (!estatus->block_status)
-			return;
-
-		remain -= estatus_len;
-	}
-}
-
 static int __init setup_bert_disable(char *str)
 {
 	bert_disable = 1;
@@ -89,7 +45,7 @@ __setup("bert_disable", setup_bert_disable);
 static int __init bert_check_table(struct acpi_table_bert *bert_tab)
 {
 	if (bert_tab->header.length < sizeof(struct acpi_table_bert) ||
-	    bert_tab->region_length < sizeof(struct acpi_bert_region))
+	    bert_tab->region_length < sizeof(struct acpi_hest_generic_status))
 		return -EINVAL;
 
 	return 0;
@@ -98,7 +54,7 @@ static int __init bert_check_table(struct acpi_table_bert *bert_tab)
 static int __init bert_init(void)
 {
 	struct apei_resources bert_resources;
-	struct acpi_bert_region *boot_error_region;
+	struct acpi_hest_generic_status *boot_error_region;
 	struct acpi_table_bert *bert_tab;
 	unsigned int region_len;
 	acpi_status status;
@@ -138,7 +94,16 @@ static int __init bert_init(void)
 		goto out_fini;
 	boot_error_region = ioremap_cache(bert_tab->address, region_len);
 	if (boot_error_region) {
-		bert_print_all(boot_error_region, region_len);
+		if (boot_error_region->block_status) {
+			rc = cper_estatus_check(boot_error_region);
+			if (rc) {
+				pr_err(FW_BUG "Invalid error record.\n");
+				iounmap(boot_error_region);
+				return rc;
+			}
+			pr_info("Error records from previous boot:\n");
+			cper_estatus_print(KERN_INFO HW_ERR, boot_error_region);
+		}
 		iounmap(boot_error_region);
 	} else {
 		rc = -ENOMEM;
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v4] ACPI / APEI: Boot Error Record Table processing was needlessly complicated
  2017-06-15 22:41 [PATCH v4] ACPI / APEI: Boot Error Record Table processing was needlessly complicated Luck, Tony
@ 2017-06-19 15:58 ` Borislav Petkov
  2017-06-21 17:20   ` [PATCH v5 0/2] Fixes for ACPI BERT table Luck, Tony
  0 siblings, 1 reply; 6+ messages in thread
From: Borislav Petkov @ 2017-06-19 15:58 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Rafael J . Wysocki, Len Brown, Huang Ying, Tomasz Nowicki,
	Jonathan Zhang, Tyler Baicar, linux-acpi

On Thu, Jun 15, 2017 at 03:41:21PM -0700, Luck, Tony wrote:
> From: Tony Luck <tony.luck@intel.com>
> 
> Quoting version 6.1 of the ACPI specification. Section 18.3.1 "Boot
> Error Source" says:
> 
>   The Boot Error Region is a range of addressable memory OSPM can access
>   during initialization to determine if an unhandled error condition
>   occurred. System firmware must report this memory range as firmware
>   reserved. The format of the Boot Error Region follow that of an Error
>   Status Block, this is defined in Section 18.3.2.7. The format of the
>   error status block is described by Table 18-342.
> 
> This clarifies some points that were obfuscated in earlier versions.
> E.g. there is no longer a separate table to describe the format of the
> "Boot Error Region" (which was identical to the "Error Status Block").

Hmm, ok, so do we know whether some fw writers "misunderstood" the
previous version of the spec and actually added that separate table and
now we need to handle the different layout?

Probably not but I'm not going to be surprised someone did. At all.

> Also saying "follow that of *an* Error Status Block" makes it clear that
> there is just one block (which can still contain multiple "Generic Error
> Data Entry structures").
> 
> The loop inside bert_print_all() is unnecessary (but probably harmless
> as the "while (remain > sizeof(struct acpi_bert_region))" loop should
> terminate after we skipped over the first entry.
> 
> We can drop the "bert_print_all()" function and just move the four
> relevant lines inline in "bert_init()".
> 
> Also this driver clears the "block_status" status field of the error
> status block in ACPI memory. I can see no justification for doing this
> in the ACPI spec.  It is also highly kernel-ego-centric.  There are
> user mode BERT harvesting tools that dig into /sys/firmware/acpi/tables/BERT
> which might like to see this summary value.
> 
> Drop the:
> 	estatus->block_status = 0;

This whole treatise of the clearing of block_status should most likely
be a separate patch, though. As it is clearly a fix. Probably make it
the first patch and the second one removing bert_print_all().

Something like that.

Thanks.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v5 0/2] Fixes for ACPI BERT table
  2017-06-19 15:58 ` Borislav Petkov
@ 2017-06-21 17:20   ` Luck, Tony
  2017-06-21 17:20     ` [PATCH v5 1/2] ACPI / APEI / BERT: Don't clear "block_status" in header Luck, Tony
                       ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Luck, Tony @ 2017-06-21 17:20 UTC (permalink / raw)
  To: Rafael J . Wysocki
  Cc: Tony Luck, Len Brown, Huang Ying, Borislav Petkov,
	Tomasz Nowicki, Jonathan Zhang, Tyler Baicar, linux-acpi

From: Tony Luck <tony.luck@intel.com>

At Boris' suggestion I've split this into two parts for version 5.
First one stops the driver from clearing the block_status field as:
 1) it is unnecessary
 2) not mentioned in the spec
 3) may upset user mode tools that extract information from the BERT table.

Second is the code simplification that you've seen before in ealier
versions of this patch.

Cc: Len Brown <lenb@kernel.org>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Cc: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Cc: Tyler Baicar <tbaicar@codeaurora.org>
Cc: linux-acpi@vger.kernel.org

Tony Luck (2):
  ACPI / APEI / BERT: Don't clear "block_status" in header
  ACPI / APEI: Boot Error Record Table processing was needlessly
    complicated

 drivers/acpi/apei/bert.c | 59 ++++++++++--------------------------------------
 1 file changed, 12 insertions(+), 47 deletions(-)

-- 
2.11.0


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v5 1/2] ACPI / APEI / BERT: Don't clear "block_status" in header
  2017-06-21 17:20   ` [PATCH v5 0/2] Fixes for ACPI BERT table Luck, Tony
@ 2017-06-21 17:20     ` Luck, Tony
  2017-06-21 17:20     ` [PATCH v5 2/2] ACPI / APEI: Boot Error Record Table processing was needlessly complicated Luck, Tony
  2017-06-28 16:38     ` [PATCH v5 0/2] Fixes for ACPI BERT table Borislav Petkov
  2 siblings, 0 replies; 6+ messages in thread
From: Luck, Tony @ 2017-06-21 17:20 UTC (permalink / raw)
  To: Rafael J . Wysocki
  Cc: Tony Luck, Len Brown, Huang Ying, Borislav Petkov,
	Tomasz Nowicki, Jonathan Zhang, Tyler Baicar, linux-acpi

From: Tony Luck <tony.luck@intel.com>

Although the ACPI specification says this is a "one time polled" error
source, it makes no mention that the OS should stomp on fields in the
error status block to enforce this.

There are user mode BERT harvesting tools that dig into
/sys/firmware/acpi/tables/BERT which might like to see this summary value.

Drop the:
	estatus->block_status = 0;

Cc: Len Brown <lenb@kernel.org>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Cc: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Cc: Tyler Baicar <tbaicar@codeaurora.org>
Cc: linux-acpi@vger.kernel.org
Signed-off-by: Tony Luck <tony.luck@intel.com>
---
 drivers/acpi/apei/bert.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/drivers/acpi/apei/bert.c b/drivers/acpi/apei/bert.c
index 12771fcf0417..9197555dc3c7 100644
--- a/drivers/acpi/apei/bert.c
+++ b/drivers/acpi/apei/bert.c
@@ -62,13 +62,6 @@ static void __init bert_print_all(struct acpi_bert_region *region,
 
 		cper_estatus_print(KERN_INFO HW_ERR, estatus);
 
-		/*
-		 * Because the boot error source is "one-time polled" type,
-		 * clear Block Status of current Generic Error Status Block,
-		 * once it's printed.
-		 */
-		estatus->block_status = 0;
-
 		estatus = (void *)estatus + estatus_len;
 		/* No more error records. */
 		if (!estatus->block_status)
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v5 2/2] ACPI / APEI: Boot Error Record Table processing was needlessly complicated
  2017-06-21 17:20   ` [PATCH v5 0/2] Fixes for ACPI BERT table Luck, Tony
  2017-06-21 17:20     ` [PATCH v5 1/2] ACPI / APEI / BERT: Don't clear "block_status" in header Luck, Tony
@ 2017-06-21 17:20     ` Luck, Tony
  2017-06-28 16:38     ` [PATCH v5 0/2] Fixes for ACPI BERT table Borislav Petkov
  2 siblings, 0 replies; 6+ messages in thread
From: Luck, Tony @ 2017-06-21 17:20 UTC (permalink / raw)
  To: Rafael J . Wysocki
  Cc: Tony Luck, Len Brown, Huang Ying, Borislav Petkov,
	Tomasz Nowicki, Jonathan Zhang, Tyler Baicar, linux-acpi

From: Tony Luck <tony.luck@intel.com>

Quoting version 6.1 of the ACPI specification. Section 18.3.1 "Boot
Error Source" says:

  The Boot Error Region is a range of addressable memory OSPM can access
  during initialization to determine if an unhandled error condition
  occurred. System firmware must report this memory range as firmware
  reserved. The format of the Boot Error Region follow that of an Error
  Status Block, this is defined in Section 18.3.2.7. The format of the
  error status block is described by Table 18-342.

This clarifies some points that were obfuscated in earlier versions.
E.g. there is no longer a separate table to describe the format of the
"Boot Error Region" (which was identical to the "Error Status Block").
Also saying "follow that of *an* Error Status Block" makes it clear that
there is just one block (which can still contain multiple "Generic Error
Data Entry structures").

The loop inside bert_print_all() is unnecessary (but probably harmless
as the "while (remain > sizeof(struct acpi_bert_region))" loop should
terminate after we skipped over the first entry.

We can drop the "bert_print_all()" function and just move the four
relevant lines inline in "bert_init()".

Cc: Len Brown <lenb@kernel.org>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Cc: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Cc: Tyler Baicar <tbaicar@codeaurora.org>
Cc: linux-acpi@vger.kernel.org
Signed-off-by: Tony Luck <tony.luck@intel.com>
---
 drivers/acpi/apei/bert.c | 52 +++++++++++-------------------------------------
 1 file changed, 12 insertions(+), 40 deletions(-)

diff --git a/drivers/acpi/apei/bert.c b/drivers/acpi/apei/bert.c
index 9197555dc3c7..3b9b296dc84e 100644
--- a/drivers/acpi/apei/bert.c
+++ b/drivers/acpi/apei/bert.c
@@ -34,43 +34,6 @@
 
 static int bert_disable;
 
-static void __init bert_print_all(struct acpi_bert_region *region,
-				  unsigned int region_len)
-{
-	struct acpi_hest_generic_status *estatus =
-		(struct acpi_hest_generic_status *)region;
-	int remain = region_len;
-	u32 estatus_len;
-
-	if (!estatus->block_status)
-		return;
-
-	while (remain > sizeof(struct acpi_bert_region)) {
-		if (cper_estatus_check(estatus)) {
-			pr_err(FW_BUG "Invalid error record.\n");
-			return;
-		}
-
-		estatus_len = cper_estatus_len(estatus);
-		if (remain < estatus_len) {
-			pr_err(FW_BUG "Truncated status block (length: %u).\n",
-			       estatus_len);
-			return;
-		}
-
-		pr_info_once("Error records from previous boot:\n");
-
-		cper_estatus_print(KERN_INFO HW_ERR, estatus);
-
-		estatus = (void *)estatus + estatus_len;
-		/* No more error records. */
-		if (!estatus->block_status)
-			return;
-
-		remain -= estatus_len;
-	}
-}
-
 static int __init setup_bert_disable(char *str)
 {
 	bert_disable = 1;
@@ -82,7 +45,7 @@ __setup("bert_disable", setup_bert_disable);
 static int __init bert_check_table(struct acpi_table_bert *bert_tab)
 {
 	if (bert_tab->header.length < sizeof(struct acpi_table_bert) ||
-	    bert_tab->region_length < sizeof(struct acpi_bert_region))
+	    bert_tab->region_length < sizeof(struct acpi_hest_generic_status))
 		return -EINVAL;
 
 	return 0;
@@ -91,7 +54,7 @@ static int __init bert_check_table(struct acpi_table_bert *bert_tab)
 static int __init bert_init(void)
 {
 	struct apei_resources bert_resources;
-	struct acpi_bert_region *boot_error_region;
+	struct acpi_hest_generic_status *boot_error_region;
 	struct acpi_table_bert *bert_tab;
 	unsigned int region_len;
 	acpi_status status;
@@ -131,7 +94,16 @@ static int __init bert_init(void)
 		goto out_fini;
 	boot_error_region = ioremap_cache(bert_tab->address, region_len);
 	if (boot_error_region) {
-		bert_print_all(boot_error_region, region_len);
+		if (boot_error_region->block_status) {
+			rc = cper_estatus_check(boot_error_region);
+			if (rc) {
+				pr_err(FW_BUG "Invalid error record.\n");
+				iounmap(boot_error_region);
+				return rc;
+			}
+			pr_info("Error records from previous boot:\n");
+			cper_estatus_print(KERN_INFO HW_ERR, boot_error_region);
+		}
 		iounmap(boot_error_region);
 	} else {
 		rc = -ENOMEM;
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v5 0/2] Fixes for ACPI BERT table
  2017-06-21 17:20   ` [PATCH v5 0/2] Fixes for ACPI BERT table Luck, Tony
  2017-06-21 17:20     ` [PATCH v5 1/2] ACPI / APEI / BERT: Don't clear "block_status" in header Luck, Tony
  2017-06-21 17:20     ` [PATCH v5 2/2] ACPI / APEI: Boot Error Record Table processing was needlessly complicated Luck, Tony
@ 2017-06-28 16:38     ` Borislav Petkov
  2 siblings, 0 replies; 6+ messages in thread
From: Borislav Petkov @ 2017-06-28 16:38 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Rafael J . Wysocki, Len Brown, Huang Ying, Tomasz Nowicki,
	Jonathan Zhang, Tyler Baicar, linux-acpi

On Wed, Jun 21, 2017 at 10:20:40AM -0700, Luck, Tony wrote:
> From: Tony Luck <tony.luck@intel.com>
> 
> At Boris' suggestion I've split this into two parts for version 5.
> First one stops the driver from clearing the block_status field as:
>  1) it is unnecessary
>  2) not mentioned in the spec
>  3) may upset user mode tools that extract information from the BERT table.
> 
> Second is the code simplification that you've seen before in ealier
> versions of this patch.
> 
> Cc: Len Brown <lenb@kernel.org>
> Cc: Huang Ying <ying.huang@intel.com>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Tomasz Nowicki <tomasz.nowicki@linaro.org>
> Cc: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
> Cc: Tyler Baicar <tbaicar@codeaurora.org>
> Cc: linux-acpi@vger.kernel.org
> 
> Tony Luck (2):
>   ACPI / APEI / BERT: Don't clear "block_status" in header
>   ACPI / APEI: Boot Error Record Table processing was needlessly
>     complicated

Both:

Reviewed-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-06-28 16:39 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-15 22:41 [PATCH v4] ACPI / APEI: Boot Error Record Table processing was needlessly complicated Luck, Tony
2017-06-19 15:58 ` Borislav Petkov
2017-06-21 17:20   ` [PATCH v5 0/2] Fixes for ACPI BERT table Luck, Tony
2017-06-21 17:20     ` [PATCH v5 1/2] ACPI / APEI / BERT: Don't clear "block_status" in header Luck, Tony
2017-06-21 17:20     ` [PATCH v5 2/2] ACPI / APEI: Boot Error Record Table processing was needlessly complicated Luck, Tony
2017-06-28 16:38     ` [PATCH v5 0/2] Fixes for ACPI BERT table Borislav Petkov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.