Linux-ARM-Kernel Archive on lore.kernel.org
* [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up
@ 2018-12-03 18:05 James Morse
  2018-12-03 18:05 ` [PATCH v7 01/25] ACPI / APEI: Don't wait to serialise with oops messages when panic()ing James Morse
                   ` (24 more replies)
  0 siblings, 25 replies; 72+ messages in thread
From: James Morse @ 2018-12-03 18:05 UTC (permalink / raw)
  To: linux-acpi
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier,
	Catalin Marinas, Will Deacon, Christoffer Dall, Dongjiu Geng,
	linux-mm, Borislav Petkov, James Morse, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

Hello,

This series aims to wire-up arm64's fancy new software-NMI notifications
for firmware-first RAS. These need to use the estatus-queue, which is
also needed for notifications via emulated-SError. All of these
things take the 'in_nmi()' path through ghes_copy_tofrom_phys(), and
so will deadlock if they can interact, which they might.

To that end, this series removes the in_nmi() stuff from ghes.c.
Locks are pushed out to the notification helpers, and fixmap entries
are passed in to the code that needs them. This means the estatus-queue
users can interrupt each other however they like.
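
The deadlock described above can be illustrated with a toy userspace model.
This is only a sketch: toy_lock, toy_trylock() and
nested_notification_completes() are hypothetical names, not kernel APIs, and
a failed trylock stands in for a spin_lock() that would never return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a spinlock: a second acquire would spin forever. */
struct toy_lock {
	bool held;
};

/* Returns false where a real spinlock would deadlock. */
bool toy_trylock(struct toy_lock *l)
{
	assert(l != NULL);
	if (l->held)
		return false;
	l->held = true;
	return true;
}

void toy_unlock(struct toy_lock *l)
{
	assert(l != NULL && l->held);
	l->held = false;
}

/*
 * Model one NMI-like notification interrupting another on the same CPU.
 * 'outer' and 'inner' are the locks each notification's helper takes.
 * Returns true if the nested notification can make progress.
 */
bool nested_notification_completes(struct toy_lock *outer,
				   struct toy_lock *inner)
{
	bool ok;

	if (!toy_trylock(outer))
		return false;

	/* The second notification arrives while the first holds its lock. */
	ok = toy_trylock(inner);
	if (ok)
		toy_unlock(inner);

	toy_unlock(outer);
	return ok;
}
```

Passing the same lock for both arguments models the shared in_nmi() path and
fails; passing one lock per notification type models what this series aims
for and succeeds.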

While doing this there is a fair amount of cleanup, which is (now) at the
beginning of the series. NMI-like notifications interrupting
ghes_probe() can go wrong for three different reasons. CPER record
blocks greater than PAGE_SIZE don't work.
The estatus-pool allocation is simplified and the silent-flag/oops-begin
is removed.

Nothing in this series is intended as fixes, as it's all cleanup or
never-worked.


----------%<----------
The earlier boiler-plate:

What's SDEI? It's ARM's "Software Delegated Exception Interface" [0]. It's
used by firmware to tell the OS about firmware-first RAS events.

These software exceptions can interrupt anything, so I describe them as
NMI-like. They aren't the only NMI-like way to notify the OS about
firmware-first RAS events: the ACPI spec also defines 'NOTIFY_SEA' and
'NOTIFY_SEI'.

(Acronyms: SEA, Synchronous External Abort. The CPU requested some memory,
but the owner of that memory said no. These are always synchronous with the
instruction that caused them. SEI, System-Error Interrupt, commonly called
SError. This is an asynchronous external abort, the memory-owner didn't say no
at the right point. Collectively these things are called external-aborts.
How is firmware involved? It traps these and re-injects them into the kernel
once it's written the CPER records).

APEI's GHES code only expects one source of NMI. If a platform implements
more than one of these mechanisms, APEI needs to handle the interaction.
'SEA' and 'SEI' can interact as 'SEI' is asynchronous. SDEI can interact
with itself: its exceptions can be 'normal' or 'critical', and firmware
could use both types for RAS. (errors using normal, 'panic-now' using
critical).
----------%<----------


Known issue:
 * ghes_copy_tofrom_phys() already takes a lock in NMI context. This
   series moves that around, and makes sure we never try to take the
   same lock from different NMI-like notifications. Since the switch to
   queued spinlocks it looks like the kernel can only be 4 contexts
   deep in spinlock, which arm64 could exceed as it doesn't have a
   single architected NMI. It either needs an additional
   idx-bit in the qspinlock, or for ghes.c to switch to using a
   different type of lock for NMI-like notifications.
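
The nesting limit in the known issue above can be sketched as follows. This
is a toy model that assumes the queued-spinlock slow path keeps a fixed
per-CPU array of four queue nodes, one per nesting level; struct toy_cpu,
toy_node_get(), toy_node_put() and toy_nest() are hypothetical names, not
the kernel's.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_NODES 4	/* assumed: task, softirq, hardirq, NMI */

struct toy_cpu {
	int count;			/* nesting levels currently in use */
	int nodes[TOY_MAX_NODES];	/* placeholder queue nodes */
};

/* Claim the node for the next nesting level, or NULL if none is left. */
int *toy_node_get(struct toy_cpu *cpu)
{
	assert(cpu != NULL && cpu->count >= 0);
	if (cpu->count >= TOY_MAX_NODES)
		return NULL;
	return &cpu->nodes[cpu->count++];
}

void toy_node_put(struct toy_cpu *cpu)
{
	assert(cpu != NULL && cpu->count > 0);
	cpu->count--;
}

/* Try to nest 'depth' contexts on one CPU; report how many got a node. */
int toy_nest(struct toy_cpu *cpu, int depth)
{
	int got = 0;

	while (got < depth && toy_node_get(cpu))
		got++;
	for (int i = 0; i < got; i++)
		toy_node_put(cpu);
	return got;
}
```

With several independent NMI-like notifications (SEA, SEI, SDEI normal and
critical) stacked on top of task, softirq and irq context, a request for a
fifth node is the case this model reports as a failure.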

Changes since v6:
 * Changed the order of the series.
 * Made hest.c own the estatus pool, which is now vmalloc()d.
 * Culled #ifdef, hopefully without generating too much noise.
 * Added GHESv2 'ack' support to NMI-like notifications
 * Use task-work to kick the memory_failure_queue()

Specific changes are noted in each patch.

[v6] https://www.spinics.net/lists/linux-acpi/msg84228.html
[v5] https://www.spinics.net/lists/linux-acpi/msg82993.html
[v4] https://www.spinics.net/lists/arm-kernel/msg653078.html
[v3] https://www.spinics.net/lists/arm-kernel/msg649230.html

[0] https://static.docs.arm.com/den0054/a/ARM_DEN0054A_Software_Delegated_Exception_Interface.pdf


Feedback welcome,

Thanks

James Morse (25):
  ACPI / APEI: Don't wait to serialise with oops messages when
    panic()ing
  ACPI / APEI: Remove silent flag from ghes_read_estatus()
  ACPI / APEI: Switch estatus pool to use vmalloc memory
  ACPI / APEI: Make hest.c manage the estatus memory pool
  ACPI / APEI: Make estatus pool allocation a static size
  ACPI / APEI: Don't store CPER records physical address in struct ghes
  ACPI / APEI: Remove spurious GHES_TO_CLEAR check
  ACPI / APEI: Don't update struct ghes' flags in read/clear estatus
  ACPI / APEI: Generalise the estatus queue's notify code
  ACPI / APEI: Tell firmware the estatus queue consumed the records
  ACPI / APEI: Move NOTIFY_SEA between the estatus-queue and NOTIFY_NMI
  ACPI / APEI: Switch NOTIFY_SEA to use the estatus queue
  KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing
  arm64: KVM/mm: Move SEA handling behind a single 'claim' interface
  ACPI / APEI: Move locking to the notification helper
  ACPI / APEI: Let the notification helper specify the fixmap slot
  ACPI / APEI: Pass ghes and estatus separately to avoid a later copy
  ACPI / APEI: Split ghes_read_estatus() to allow a peek at the CPER
    length
  ACPI / APEI: Only use queued estatus entry during _in_nmi_notify_one()
  ACPI / APEI: Use separate fixmap pages for arm64 NMI-like
    notifications
  mm/memory-failure: Add memory_failure_queue_kick()
  ACPI / APEI: Kick the memory_failure() queue for synchronous errors
  arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work
  firmware: arm_sdei: Add ACPI GHES registration helper
  ACPI / APEI: Add support for the SDEI GHES Notification type

 arch/arm/include/asm/kvm_ras.h       |  14 +
 arch/arm/include/asm/system_misc.h   |   5 -
 arch/arm64/include/asm/acpi.h        |   4 +-
 arch/arm64/include/asm/daifflags.h   |   1 +
 arch/arm64/include/asm/fixmap.h      |   6 +-
 arch/arm64/include/asm/kvm_ras.h     |  25 +
 arch/arm64/include/asm/system_misc.h |   2 -
 arch/arm64/kernel/acpi.c             |  51 +++
 arch/arm64/mm/fault.c                |  25 +-
 drivers/acpi/apei/Kconfig            |  12 +-
 drivers/acpi/apei/ghes.c             | 652 ++++++++++++++++-----------
 drivers/acpi/apei/hest.c             |   5 +
 drivers/firmware/arm_sdei.c          |  70 +++
 include/acpi/ghes.h                  |   4 +-
 include/linux/arm_sdei.h             |   9 +
 include/linux/mm.h                   |   1 +
 mm/memory-failure.c                  |  15 +-
 virt/kvm/arm/mmu.c                   |   4 +-
 18 files changed, 606 insertions(+), 299 deletions(-)
 create mode 100644 arch/arm/include/asm/kvm_ras.h
 create mode 100644 arch/arm64/include/asm/kvm_ras.h

-- 
2.19.2


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

* [PATCH v7 01/25] ACPI / APEI: Don't wait to serialise with oops messages when panic()ing
  2018-12-03 18:05 [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up James Morse
@ 2018-12-03 18:05 ` James Morse
  2018-12-03 18:05 ` [PATCH v7 02/25] ACPI / APEI: Remove silent flag from ghes_read_estatus() James Morse
                   ` (23 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: James Morse @ 2018-12-03 18:05 UTC (permalink / raw)
  To: linux-acpi
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier,
	Catalin Marinas, Will Deacon, Christoffer Dall, Dongjiu Geng,
	linux-mm, Borislav Petkov, James Morse, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

oops_begin() exists to group printk() messages with the oops message
printed by die(). To reach this caller we know that platform firmware
took this error first, then notified the OS via NMI with a 'panic'
severity.

Don't wait for another CPU to release the die-lock before panic()ing,
our only goal is to print this fatal error and panic().

This code is always called in_nmi(), and since commit 42a0bb3f7138
("printk/nmi: generic solution for safe printk in NMI"), it has been
safe to call printk() from this context. Messages are batched in a
per-cpu buffer and printed via irq-work, or a call back from panic().

Link: https://patchwork.kernel.org/patch/10313555/
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: James Morse <james.morse@arm.com>

---
Changes since v6:
 * Capitals in patch subject
 * Tinkered with the commit message.
---
 drivers/acpi/apei/ghes.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 02c6fd9caff7..ab2dae6fc7e4 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -33,7 +33,6 @@
 #include <linux/interrupt.h>
 #include <linux/timer.h>
 #include <linux/cper.h>
-#include <linux/kdebug.h>
 #include <linux/platform_device.h>
 #include <linux/mutex.h>
 #include <linux/ratelimit.h>
@@ -947,7 +946,6 @@ static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 
 		sev = ghes_severity(ghes->estatus->error_severity);
 		if (sev >= GHES_SEV_PANIC) {
-			oops_begin();
 			ghes_print_queued_estatus();
 			__ghes_panic(ghes);
 		}
-- 
2.19.2


* [PATCH v7 02/25] ACPI / APEI: Remove silent flag from ghes_read_estatus()
  2018-12-03 18:05 [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up James Morse
  2018-12-03 18:05 ` [PATCH v7 01/25] ACPI / APEI: Don't wait to serialise with oops messages when panic()ing James Morse
@ 2018-12-03 18:05 ` James Morse
  2018-12-04 11:36   ` Borislav Petkov
  2018-12-03 18:05 ` [PATCH v7 03/25] ACPI / APEI: Switch estatus pool to use vmalloc memory James Morse
                   ` (22 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: James Morse @ 2018-12-03 18:05 UTC (permalink / raw)
  To: linux-acpi
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier,
	Catalin Marinas, Will Deacon, Christoffer Dall, Dongjiu Geng,
	linux-mm, Borislav Petkov, James Morse, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

Subsequent patches will split up ghes_read_estatus(), at which
point passing around the 'silent' flag gets annoying. The flag exists
to suppress printk() messages, which prior to commit 42a0bb3f7138
("printk/nmi: generic solution for safe printk in NMI"), were
unsafe in NMI context.

This is no longer necessary, remove the flag. printk() messages
are batched in a per-cpu buffer and printed via irq-work, or a call
back from panic().
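
The batching described above can be pictured with a toy userspace model.
This is only a sketch of the idea, not the printk implementation: struct
toy_nmi_log, toy_nmi_printk() and toy_flush() are hypothetical names, and a
single buffer stands in for the per-cpu one.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_BUF_SIZE 256

struct toy_nmi_log {
	char buf[TOY_BUF_SIZE];
	size_t len;
};

/* Called from the "NMI": only copies into the buffer, never prints. */
size_t toy_nmi_printk(struct toy_nmi_log *log, const char *msg)
{
	size_t n, room;

	assert(log != NULL && msg != NULL && log->len < TOY_BUF_SIZE);
	n = strlen(msg);
	room = TOY_BUF_SIZE - 1 - log->len;
	if (n > room)
		n = room;	/* truncate rather than block */
	memcpy(log->buf + log->len, msg, n);
	log->len += n;
	log->buf[log->len] = '\0';
	return n;
}

/* Called later from a safe context (irq-work or panic() in the kernel). */
size_t toy_flush(struct toy_nmi_log *log, FILE *out)
{
	size_t flushed;

	assert(log != NULL && out != NULL);
	flushed = log->len;
	if (flushed)
		fputs(log->buf, out);
	log->len = 0;
	log->buf[0] = '\0';
	return flushed;
}
```

Because the NMI-side function never takes a lock or touches the console, the
caller has no reason to ask for silence, which is the argument for dropping
the flag.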

Signed-off-by: James Morse <james.morse@arm.com>
---
Changes since v6:
 * Moved earlier in the series.
 * Tinkered with the commit message.
 * Switched to pr_warn_ratelimited() to shut checkpatch up.
---
 drivers/acpi/apei/ghes.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index ab2dae6fc7e4..e8503c7d721f 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -324,7 +324,7 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
 	}
 }
 
-static int ghes_read_estatus(struct ghes *ghes, int silent)
+static int ghes_read_estatus(struct ghes *ghes)
 {
 	struct acpi_hest_generic *g = ghes->generic;
 	u64 buf_paddr;
@@ -333,8 +333,7 @@ static int ghes_read_estatus(struct ghes *ghes, int silent)
 
 	rc = apei_read(&buf_paddr, &g->error_status_address);
 	if (rc) {
-		if (!silent && printk_ratelimit())
-			pr_warning(FW_WARN GHES_PFX
+		pr_warn_ratelimited(FW_WARN GHES_PFX
 "Failed to read error status block address for hardware error source: %d.\n",
 				   g->header.source_id);
 		return -EIO;
@@ -366,9 +365,9 @@ static int ghes_read_estatus(struct ghes *ghes, int silent)
 	rc = 0;
 
 err_read_block:
-	if (rc && !silent && printk_ratelimit())
-		pr_warning(FW_WARN GHES_PFX
-			   "Failed to read error status block!\n");
+	if (rc)
+		pr_warn_ratelimited(FW_WARN GHES_PFX
+				    "Failed to read error status block!\n");
 	return rc;
 }
 
@@ -700,7 +699,7 @@ static int ghes_proc(struct ghes *ghes)
 {
 	int rc;
 
-	rc = ghes_read_estatus(ghes, 0);
+	rc = ghes_read_estatus(ghes);
 	if (rc)
 		goto out;
 
@@ -937,7 +936,7 @@ static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 		return ret;
 
 	list_for_each_entry_rcu(ghes, &ghes_nmi, list) {
-		if (ghes_read_estatus(ghes, 1)) {
+		if (ghes_read_estatus(ghes)) {
 			ghes_clear_estatus(ghes);
 			continue;
 		} else {
-- 
2.19.2


* [PATCH v7 03/25] ACPI / APEI: Switch estatus pool to use vmalloc memory
  2018-12-03 18:05 [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up James Morse
  2018-12-03 18:05 ` [PATCH v7 01/25] ACPI / APEI: Don't wait to serialise with oops messages when panic()ing James Morse
  2018-12-03 18:05 ` [PATCH v7 02/25] ACPI / APEI: Remove silent flag from ghes_read_estatus() James Morse
@ 2018-12-03 18:05 ` James Morse
  2018-12-04 13:01   ` Borislav Petkov
  2018-12-03 18:05 ` [PATCH v7 04/25] ACPI / APEI: Make hest.c manage the estatus memory pool James Morse
                   ` (21 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: James Morse @ 2018-12-03 18:05 UTC (permalink / raw)
  To: linux-acpi
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier,
	Catalin Marinas, Will Deacon, Christoffer Dall, Dongjiu Geng,
	linux-mm, Borislav Petkov, James Morse, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

The ghes code is careful to parse and round firmware's advertised
memory requirements for CPER records, up to a maximum of 64K.
However when ghes_estatus_pool_expand() does its work, it splits
the requested size into PAGE_SIZE granules.

This means if firmware generates 5K of CPER records, and correctly
describes this in the table, __process_error() will silently fail as it
is unable to allocate more than PAGE_SIZE.

Switch the estatus pool to vmalloc() memory. On x86 vmalloc() memory
may fault and be fixed up by vmalloc_fault(). To prevent this, call
vmalloc_sync_all() before an NMI handler could discover the memory.
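
The PAGE_SIZE limitation described above can be reproduced with a toy pool.
This is a sketch under the assumption that an allocation must fit inside a
single chunk of the pool; struct toy_pool and the toy_*() helpers are
hypothetical and only model the chunk accounting, not genalloc itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096UL
#define TOY_MAX_CHUNKS 16

/* A pool is a set of chunks; one allocation must fit inside one chunk. */
struct toy_pool {
	size_t chunk_free[TOY_MAX_CHUNKS];
	size_t nr_chunks;
};

bool toy_pool_add(struct toy_pool *pool, size_t size)
{
	assert(pool != NULL);
	if (pool->nr_chunks >= TOY_MAX_CHUNKS)
		return false;
	pool->chunk_free[pool->nr_chunks++] = size;
	return true;
}

/* First-fit: succeeds only if a single chunk can hold 'size' bytes. */
bool toy_pool_alloc(struct toy_pool *pool, size_t size)
{
	assert(pool != NULL);
	for (size_t i = 0; i < pool->nr_chunks; i++) {
		if (pool->chunk_free[i] >= size) {
			pool->chunk_free[i] -= size;
			return true;
		}
	}
	return false;
}

/* Old behaviour: grow the pool one page-sized chunk at a time. */
bool toy_expand_by_pages(struct toy_pool *pool, size_t len)
{
	size_t pages = (len + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE;

	for (size_t i = 0; i < pages; i++)
		if (!toy_pool_add(pool, TOY_PAGE_SIZE))
			return false;
	return true;
}

/* New behaviour: add one contiguous, page-aligned chunk. */
bool toy_expand_contiguous(struct toy_pool *pool, size_t len)
{
	size_t pages = (len + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE;

	return toy_pool_add(pool, pages * TOY_PAGE_SIZE);
}
```

Expanding by 5K in page-sized chunks yields two 4K chunks, neither of which
can satisfy a 5K allocation; a single 8K chunk can.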

Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/acpi/apei/ghes.c | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index e8503c7d721f..c15264f2dc4b 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -170,40 +170,40 @@ static int ghes_estatus_pool_init(void)
 	return 0;
 }
 
-static void ghes_estatus_pool_free_chunk_page(struct gen_pool *pool,
+static void ghes_estatus_pool_free_chunk(struct gen_pool *pool,
 					      struct gen_pool_chunk *chunk,
 					      void *data)
 {
-	free_page(chunk->start_addr);
+	vfree((void *)chunk->start_addr);
 }
 
 static void ghes_estatus_pool_exit(void)
 {
 	gen_pool_for_each_chunk(ghes_estatus_pool,
-				ghes_estatus_pool_free_chunk_page, NULL);
+				ghes_estatus_pool_free_chunk, NULL);
 	gen_pool_destroy(ghes_estatus_pool);
 }
 
 static int ghes_estatus_pool_expand(unsigned long len)
 {
-	unsigned long i, pages, size, addr;
-	int ret;
+	unsigned long size, addr;
 
 	ghes_estatus_pool_size_request += PAGE_ALIGN(len);
 	size = gen_pool_size(ghes_estatus_pool);
 	if (size >= ghes_estatus_pool_size_request)
 		return 0;
-	pages = (ghes_estatus_pool_size_request - size) / PAGE_SIZE;
-	for (i = 0; i < pages; i++) {
-		addr = __get_free_page(GFP_KERNEL);
-		if (!addr)
-			return -ENOMEM;
-		ret = gen_pool_add(ghes_estatus_pool, addr, PAGE_SIZE, -1);
-		if (ret)
-			return ret;
-	}
 
-	return 0;
+	addr = (unsigned long)vmalloc(PAGE_ALIGN(len));
+	if (!addr)
+		return -ENOMEM;
+
+	/*
+	 * New allocation must be visible in all pgd before it can be found by
+	 * an NMI allocating from the pool.
+	 */
+	vmalloc_sync_all();
+
+	return gen_pool_add(ghes_estatus_pool, addr, PAGE_ALIGN(len), -1);
 }
 
 static int map_gen_v2(struct ghes *ghes)
-- 
2.19.2


* [PATCH v7 04/25] ACPI / APEI: Make hest.c manage the estatus memory pool
  2018-12-03 18:05 [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up James Morse
                   ` (2 preceding siblings ...)
  2018-12-03 18:05 ` [PATCH v7 03/25] ACPI / APEI: Switch estatus pool to use vmalloc memory James Morse
@ 2018-12-03 18:05 ` James Morse
  2018-12-11 16:48   ` Borislav Petkov
  2018-12-03 18:05 ` [PATCH v7 05/25] ACPI / APEI: Make estatus pool allocation a static size James Morse
                   ` (20 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: James Morse @ 2018-12-03 18:05 UTC (permalink / raw)
  To: linux-acpi
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier,
	Catalin Marinas, Will Deacon, Christoffer Dall, Dongjiu Geng,
	linux-mm, Borislav Petkov, James Morse, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

ghes.c has a memory pool it uses for the estatus cache and the estatus
queue. The cache is initialised when registering the platform driver.
For the queue, an NMI-like notification has to grow/shrink the pool
as it is registered and unregistered.

This is all pretty noisy when adding new NMI-like notifications; it
would be better to replace this with a static pool size based on the
number of users.

As a precursor, move the call that creates the pool from ghes_init(),
into hest.c. Later this will take the number of ghes entries and
consolidate the queue allocations.
Remove ghes_estatus_pool_exit() as hest.c doesn't have anywhere to put
this.

The pool is now initialised as part of ACPI's subsys_initcall():
(acpi_init(), acpi_scan_init(), acpi_pci_root_init(), acpi_hest_init())
Before this patch it happened later as a GHES specific device_initcall().

Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/acpi/apei/ghes.c | 33 ++++++---------------------------
 drivers/acpi/apei/hest.c |  5 +++++
 include/acpi/ghes.h      |  2 ++
 3 files changed, 13 insertions(+), 27 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index c15264f2dc4b..78058adb2574 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -162,26 +162,16 @@ static void ghes_iounmap_irq(void)
 	clear_fixmap(FIX_APEI_GHES_IRQ);
 }
 
-static int ghes_estatus_pool_init(void)
+static int ghes_estatus_pool_expand(unsigned long len); //temporary
+
+int ghes_estatus_pool_init(void)
 {
 	ghes_estatus_pool = gen_pool_create(GHES_ESTATUS_POOL_MIN_ALLOC_ORDER, -1);
 	if (!ghes_estatus_pool)
 		return -ENOMEM;
-	return 0;
-}
 
-static void ghes_estatus_pool_free_chunk(struct gen_pool *pool,
-					      struct gen_pool_chunk *chunk,
-					      void *data)
-{
-	vfree((void *)chunk->start_addr);
-}
-
-static void ghes_estatus_pool_exit(void)
-{
-	gen_pool_for_each_chunk(ghes_estatus_pool,
-				ghes_estatus_pool_free_chunk, NULL);
-	gen_pool_destroy(ghes_estatus_pool);
+	return ghes_estatus_pool_expand(GHES_ESTATUS_CACHE_AVG_SIZE *
+					GHES_ESTATUS_CACHE_ALLOCED_MAX);
 }
 
 static int ghes_estatus_pool_expand(unsigned long len)
@@ -1225,18 +1215,9 @@ static int __init ghes_init(void)
 
 	ghes_nmi_init_cxt();
 
-	rc = ghes_estatus_pool_init();
-	if (rc)
-		goto err;
-
-	rc = ghes_estatus_pool_expand(GHES_ESTATUS_CACHE_AVG_SIZE *
-				      GHES_ESTATUS_CACHE_ALLOCED_MAX);
-	if (rc)
-		goto err_pool_exit;
-
 	rc = platform_driver_register(&ghes_platform_driver);
 	if (rc)
-		goto err_pool_exit;
+		goto err;
 
 	rc = apei_osc_setup();
 	if (rc == 0 && osc_sb_apei_support_acked)
@@ -1249,8 +1230,6 @@ static int __init ghes_init(void)
 		pr_info(GHES_PFX "Failed to enable APEI firmware first mode.\n");
 
 	return 0;
-err_pool_exit:
-	ghes_estatus_pool_exit();
 err:
 	return rc;
 }
diff --git a/drivers/acpi/apei/hest.c b/drivers/acpi/apei/hest.c
index b1e9f81ebeea..da5fabaeb48f 100644
--- a/drivers/acpi/apei/hest.c
+++ b/drivers/acpi/apei/hest.c
@@ -32,6 +32,7 @@
 #include <linux/io.h>
 #include <linux/platform_device.h>
 #include <acpi/apei.h>
+#include <acpi/ghes.h>
 
 #include "apei-internal.h"
 
@@ -200,6 +201,10 @@ static int __init hest_ghes_dev_register(unsigned int ghes_count)
 	if (!ghes_arr.ghes_devs)
 		return -ENOMEM;
 
+	rc = ghes_estatus_pool_init();
+	if (rc)
+		goto out;
+
 	rc = apei_hest_parse(hest_parse_ghes, &ghes_arr);
 	if (rc)
 		goto err;
diff --git a/include/acpi/ghes.h b/include/acpi/ghes.h
index 82cb4eb225a4..46ef5566e052 100644
--- a/include/acpi/ghes.h
+++ b/include/acpi/ghes.h
@@ -52,6 +52,8 @@ enum {
 	GHES_SEV_PANIC = 0x3,
 };
 
+int ghes_estatus_pool_init(void);
+
 /* From drivers/edac/ghes_edac.c */
 
 #ifdef CONFIG_EDAC_GHES
-- 
2.19.2


* [PATCH v7 05/25] ACPI / APEI: Make estatus pool allocation a static size
  2018-12-03 18:05 [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up James Morse
                   ` (3 preceding siblings ...)
  2018-12-03 18:05 ` [PATCH v7 04/25] ACPI / APEI: Make hest.c manage the estatus memory pool James Morse
@ 2018-12-03 18:05 ` James Morse
  2018-12-11 16:54   ` Borislav Petkov
  2018-12-03 18:05 ` [PATCH v7 06/25] ACPI / APEI: Don't store CPER records physical address in struct ghes James Morse
                   ` (19 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: James Morse @ 2018-12-03 18:05 UTC (permalink / raw)
  To: linux-acpi
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier,
	Catalin Marinas, Will Deacon, Christoffer Dall, Dongjiu Geng,
	linux-mm, Borislav Petkov, James Morse, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

Adding new NMI-like notifications duplicates the calls that grow
and shrink the estatus pool. This is all pretty pointless, as the
size is capped to 64K. Allocate this for each ghes and drop
the code that grows and shrinks the pool.
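
As a rough worked example of the static size, the arithmetic from
ghes_estatus_pool_init() in the diff below can be tried in userspace. The
constant values here (4K pages, 512-byte average cache entry, 6 cached
entries, 64K per source) are assumptions for illustration; the real
definitions live in drivers/acpi/apei/ghes.c. The toy_*() names are
hypothetical.

```c
#include <assert.h>

/* Assumed values; check drivers/acpi/apei/ghes.c for the real definitions. */
#define TOY_PAGE_SIZE			4096UL
#define TOY_CACHE_AVG_SIZE		512UL
#define TOY_CACHE_ALLOCED_MAX		6UL
#define TOY_ESOURCE_PREALLOC_MAX_SIZE	65536UL

/* Round up to the next multiple of the (power-of-two) page size. */
unsigned long toy_page_align(unsigned long len)
{
	return (len + TOY_PAGE_SIZE - 1) & ~(TOY_PAGE_SIZE - 1);
}

/* Same shape as the patch: cache space plus a fixed share per GHES. */
unsigned long toy_estatus_pool_size(unsigned int num_ghes)
{
	unsigned long len;

	len = TOY_CACHE_AVG_SIZE * TOY_CACHE_ALLOCED_MAX;
	len += num_ghes * TOY_ESOURCE_PREALLOC_MAX_SIZE;
	assert(len >= num_ghes);	/* crude overflow check for the sketch */
	return toy_page_align(len);
}
```

Under these assumed values, two GHES entries need 3072 + 2 * 65536 = 134144
bytes, which rounds up to 33 pages (135168 bytes).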

Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/acpi/apei/ghes.c | 49 +++++-----------------------------------
 drivers/acpi/apei/hest.c |  2 +-
 include/acpi/ghes.h      |  2 +-
 3 files changed, 8 insertions(+), 45 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 78058adb2574..7c2e9ac140d4 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -162,27 +162,18 @@ static void ghes_iounmap_irq(void)
 	clear_fixmap(FIX_APEI_GHES_IRQ);
 }
 
-static int ghes_estatus_pool_expand(unsigned long len); //temporary
-
-int ghes_estatus_pool_init(void)
+int ghes_estatus_pool_init(int num_ghes)
 {
+	unsigned long addr, len;
+
 	ghes_estatus_pool = gen_pool_create(GHES_ESTATUS_POOL_MIN_ALLOC_ORDER, -1);
 	if (!ghes_estatus_pool)
 		return -ENOMEM;
 
-	return ghes_estatus_pool_expand(GHES_ESTATUS_CACHE_AVG_SIZE *
-					GHES_ESTATUS_CACHE_ALLOCED_MAX);
-}
-
-static int ghes_estatus_pool_expand(unsigned long len)
-{
-	unsigned long size, addr;
-
-	ghes_estatus_pool_size_request += PAGE_ALIGN(len);
-	size = gen_pool_size(ghes_estatus_pool);
-	if (size >= ghes_estatus_pool_size_request)
-		return 0;
+	len = GHES_ESTATUS_CACHE_AVG_SIZE * GHES_ESTATUS_CACHE_ALLOCED_MAX;
+	len += (num_ghes * GHES_ESOURCE_PREALLOC_MAX_SIZE);
 
+	ghes_estatus_pool_size_request = PAGE_ALIGN(len);
 	addr = (unsigned long)vmalloc(PAGE_ALIGN(len));
 	if (!addr)
 		return -ENOMEM;
@@ -954,32 +945,8 @@ static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 	return ret;
 }
 
-static unsigned long ghes_esource_prealloc_size(
-	const struct acpi_hest_generic *generic)
-{
-	unsigned long block_length, prealloc_records, prealloc_size;
-
-	block_length = min_t(unsigned long, generic->error_block_length,
-			     GHES_ESTATUS_MAX_SIZE);
-	prealloc_records = max_t(unsigned long,
-				 generic->records_to_preallocate, 1);
-	prealloc_size = min_t(unsigned long, block_length * prealloc_records,
-			      GHES_ESOURCE_PREALLOC_MAX_SIZE);
-
-	return prealloc_size;
-}
-
-static void ghes_estatus_pool_shrink(unsigned long len)
-{
-	ghes_estatus_pool_size_request -= PAGE_ALIGN(len);
-}
-
 static void ghes_nmi_add(struct ghes *ghes)
 {
-	unsigned long len;
-
-	len = ghes_esource_prealloc_size(ghes->generic);
-	ghes_estatus_pool_expand(len);
 	mutex_lock(&ghes_list_mutex);
 	if (list_empty(&ghes_nmi))
 		register_nmi_handler(NMI_LOCAL, ghes_notify_nmi, 0, "ghes");
@@ -989,8 +956,6 @@ static void ghes_nmi_add(struct ghes *ghes)
 
 static void ghes_nmi_remove(struct ghes *ghes)
 {
-	unsigned long len;
-
 	mutex_lock(&ghes_list_mutex);
 	list_del_rcu(&ghes->list);
 	if (list_empty(&ghes_nmi))
@@ -1001,8 +966,6 @@ static void ghes_nmi_remove(struct ghes *ghes)
 	 * freed after NMI handler finishes.
 	 */
 	synchronize_rcu();
-	len = ghes_esource_prealloc_size(ghes->generic);
-	ghes_estatus_pool_shrink(len);
 }
 
 static void ghes_nmi_init_cxt(void)
diff --git a/drivers/acpi/apei/hest.c b/drivers/acpi/apei/hest.c
index da5fabaeb48f..66e1e2fd7bc4 100644
--- a/drivers/acpi/apei/hest.c
+++ b/drivers/acpi/apei/hest.c
@@ -201,7 +201,7 @@ static int __init hest_ghes_dev_register(unsigned int ghes_count)
 	if (!ghes_arr.ghes_devs)
 		return -ENOMEM;
 
-	rc = ghes_estatus_pool_init();
+	rc = ghes_estatus_pool_init(ghes_count);
 	if (rc)
 		goto out;
 
diff --git a/include/acpi/ghes.h b/include/acpi/ghes.h
index 46ef5566e052..cd9ee507d860 100644
--- a/include/acpi/ghes.h
+++ b/include/acpi/ghes.h
@@ -52,7 +52,7 @@ enum {
 	GHES_SEV_PANIC = 0x3,
 };
 
-int ghes_estatus_pool_init(void);
+int ghes_estatus_pool_init(int num_ghes);
 
 /* From drivers/edac/ghes_edac.c */
 
-- 
2.19.2


* [PATCH v7 06/25] ACPI / APEI: Don't store CPER records physical address in struct ghes
  2018-12-03 18:05 [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up James Morse
                   ` (4 preceding siblings ...)
  2018-12-03 18:05 ` [PATCH v7 05/25] ACPI / APEI: Make estatus pool allocation a static size James Morse
@ 2018-12-03 18:05 ` James Morse
  2018-12-11 17:04   ` Borislav Petkov
  2018-12-03 18:05 ` [PATCH v7 07/25] ACPI / APEI: Remove spurious GHES_TO_CLEAR check James Morse
                   ` (18 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: James Morse @ 2018-12-03 18:05 UTC (permalink / raw)
  To: linux-acpi
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier,
	Catalin Marinas, Will Deacon, Christoffer Dall, Dongjiu Geng,
	linux-mm, Borislav Petkov, James Morse, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

When CPER records are found the address of the records is stashed
in the struct ghes. Once the records have been processed, this
address is overwritten with zero so that it won't be processed
again without being re-populated by firmware.

This goes wrong if a struct ghes can be processed concurrently,
as can happen at probe time when an NMI occurs. If the NMI arrives
on another CPU, the probing CPU may call ghes_clear_estatus() on the
records before the handler had finished with them.
Even on the same CPU, once the interrupted handler is resumed, it
will call ghes_clear_estatus() on the NMI's records. This memory may
have already been re-used by firmware.

Avoid this stashing by letting the caller hold the address. A
later patch will do away with the use of ghes->flags in the
read/clear code too.
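
The effect of letting the caller hold the address can be shown with a toy
model of an interrupted handler. This is a sketch, not the ghes.c code:
toy_fw_mem, struct toy_ghes and the toy_*() functions are hypothetical, and
array slots stand in for physical addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy "firmware memory": each slot holds one record's block_status. */
#define TOY_NR_SLOTS 4
uint32_t toy_fw_mem[TOY_NR_SLOTS];

/* Toy error source: firmware publishes the slot of the current record. */
struct toy_ghes {
	size_t error_status_slot;
};

/* Returns 0 and stores the record's slot in *slot, or -1 if none pending. */
int toy_read_estatus(const struct toy_ghes *ghes, size_t *slot)
{
	assert(ghes != NULL && slot != NULL);
	assert(ghes->error_status_slot < TOY_NR_SLOTS);
	*slot = ghes->error_status_slot;
	if (!toy_fw_mem[*slot])
		return -1;
	return 0;
}

/* Clears exactly the record the caller read, whatever firmware did since. */
void toy_clear_estatus(size_t slot)
{
	assert(slot < TOY_NR_SLOTS);
	toy_fw_mem[slot] = 0;
}

/* An NMI-like handler interrupts another one that is half-way through. */
int toy_interrupted_handler(struct toy_ghes *ghes)
{
	size_t outer, nested;

	toy_fw_mem[1] = 1;
	ghes->error_status_slot = 1;
	if (toy_read_estatus(ghes, &outer))
		return -1;

	/* Firmware reports a second error; the nested handler completes. */
	toy_fw_mem[2] = 1;
	ghes->error_status_slot = 2;
	if (toy_read_estatus(ghes, &nested))
		return -1;
	toy_clear_estatus(nested);

	/* The outer handler resumes and still clears its own record. */
	toy_clear_estatus(outer);
	return (outer == 1 && nested == 2) ? 0 : -1;
}
```

Had the slot been stashed in struct toy_ghes instead of in each handler's
local variable, the nested read would have overwritten it and the outer
handler would have cleared the wrong record.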

Signed-off-by: James Morse <james.morse@arm.com>

---
Changes since v6:
 * Moved earlier in the series
 * Added buf_paddr = 0 on all the error paths, and test for it in
   ghes_clear_estatus() for extra sanity.
---
 drivers/acpi/apei/ghes.c | 40 +++++++++++++++++++++++-----------------
 include/acpi/ghes.h      |  1 -
 2 files changed, 23 insertions(+), 18 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 7c2e9ac140d4..acf0c37e9af9 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -305,29 +305,30 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
 	}
 }
 
-static int ghes_read_estatus(struct ghes *ghes)
+static int ghes_read_estatus(struct ghes *ghes, u64 *buf_paddr)
 {
 	struct acpi_hest_generic *g = ghes->generic;
-	u64 buf_paddr;
 	u32 len;
 	int rc;
 
-	rc = apei_read(&buf_paddr, &g->error_status_address);
+	rc = apei_read(buf_paddr, &g->error_status_address);
 	if (rc) {
+		*buf_paddr = 0;
 		pr_warn_ratelimited(FW_WARN GHES_PFX
 "Failed to read error status block address for hardware error source: %d.\n",
 				   g->header.source_id);
 		return -EIO;
 	}
-	if (!buf_paddr)
+	if (!*buf_paddr)
 		return -ENOENT;
 
-	ghes_copy_tofrom_phys(ghes->estatus, buf_paddr,
+	ghes_copy_tofrom_phys(ghes->estatus, *buf_paddr,
 			      sizeof(*ghes->estatus), 1);
-	if (!ghes->estatus->block_status)
+	if (!ghes->estatus->block_status) {
+		*buf_paddr = 0;
 		return -ENOENT;
+	}
 
-	ghes->buffer_paddr = buf_paddr;
 	ghes->flags |= GHES_TO_CLEAR;
 
 	rc = -EIO;
@@ -339,7 +340,7 @@ static int ghes_read_estatus(struct ghes *ghes)
 	if (cper_estatus_check_header(ghes->estatus))
 		goto err_read_block;
 	ghes_copy_tofrom_phys(ghes->estatus + 1,
-			      buf_paddr + sizeof(*ghes->estatus),
+			      *buf_paddr + sizeof(*ghes->estatus),
 			      len - sizeof(*ghes->estatus), 1);
 	if (cper_estatus_check(ghes->estatus))
 		goto err_read_block;
@@ -349,17 +350,20 @@ static int ghes_read_estatus(struct ghes *ghes)
 	if (rc)
 		pr_warn_ratelimited(FW_WARN GHES_PFX
 				    "Failed to read error status block!\n");
+
 	return rc;
 }
 
-static void ghes_clear_estatus(struct ghes *ghes)
+static void ghes_clear_estatus(struct ghes *ghes, u64 buf_paddr)
 {
 	ghes->estatus->block_status = 0;
 	if (!(ghes->flags & GHES_TO_CLEAR))
 		return;
-	ghes_copy_tofrom_phys(ghes->estatus, ghes->buffer_paddr,
-			      sizeof(ghes->estatus->block_status), 0);
-	ghes->flags &= ~GHES_TO_CLEAR;
+	if (buf_paddr) {
+		ghes_copy_tofrom_phys(ghes->estatus, buf_paddr,
+				      sizeof(ghes->estatus->block_status), 0);
+		ghes->flags &= ~GHES_TO_CLEAR;
+	}
 }
 
 static void ghes_handle_memory_failure(struct acpi_hest_generic_data *gdata, int sev)
@@ -678,9 +682,10 @@ static void __ghes_panic(struct ghes *ghes)
 
 static int ghes_proc(struct ghes *ghes)
 {
+	u64 buf_paddr;
 	int rc;
 
-	rc = ghes_read_estatus(ghes);
+	rc = ghes_read_estatus(ghes, &buf_paddr);
 	if (rc)
 		goto out;
 
@@ -695,7 +700,7 @@ static int ghes_proc(struct ghes *ghes)
 	ghes_do_proc(ghes, ghes->estatus);
 
 out:
-	ghes_clear_estatus(ghes);
+	ghes_clear_estatus(ghes, buf_paddr);
 
 	if (rc == -ENOENT)
 		return rc;
@@ -910,6 +915,7 @@ static void __process_error(struct ghes *ghes)
 
 static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 {
+	u64 buf_paddr;
 	struct ghes *ghes;
 	int sev, ret = NMI_DONE;
 
@@ -917,8 +923,8 @@ static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 		return ret;
 
 	list_for_each_entry_rcu(ghes, &ghes_nmi, list) {
-		if (ghes_read_estatus(ghes)) {
-			ghes_clear_estatus(ghes);
+		if (ghes_read_estatus(ghes, &buf_paddr)) {
+			ghes_clear_estatus(ghes, buf_paddr);
 			continue;
 		} else {
 			ret = NMI_HANDLED;
@@ -934,7 +940,7 @@ static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 			continue;
 
 		__process_error(ghes);
-		ghes_clear_estatus(ghes);
+		ghes_clear_estatus(ghes, buf_paddr);
 	}
 
 #ifdef CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG
diff --git a/include/acpi/ghes.h b/include/acpi/ghes.h
index cd9ee507d860..f82f4a7ddd90 100644
--- a/include/acpi/ghes.h
+++ b/include/acpi/ghes.h
@@ -22,7 +22,6 @@ struct ghes {
 		struct acpi_hest_generic_v2 *generic_v2;
 	};
 	struct acpi_hest_generic_status *estatus;
-	u64 buffer_paddr;
 	unsigned long flags;
 	union {
 		struct list_head list;
-- 
2.19.2


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* [PATCH v7 07/25] ACPI / APEI: Remove spurious GHES_TO_CLEAR check
  2018-12-03 18:05 [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up James Morse
                   ` (5 preceding siblings ...)
  2018-12-03 18:05 ` [PATCH v7 06/25] ACPI / APEI: Don't store CPER records physical address in struct ghes James Morse
@ 2018-12-03 18:05 ` James Morse
  2018-12-11 17:18   ` Borislav Petkov
  2018-12-03 18:05 ` [PATCH v7 08/25] ACPI / APEI: Don't update struct ghes' flags in read/clear estatus James Morse
                   ` (17 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: James Morse @ 2018-12-03 18:05 UTC (permalink / raw)
  To: linux-acpi
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier,
	Catalin Marinas, Will Deacon, Christoffer Dall, Dongjiu Geng,
	linux-mm, Borislav Petkov, James Morse, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

ghes_notify_nmi() checks ghes->flags for GHES_TO_CLEAR before going
on to __process_error(). This is pointless as ghes_read_estatus()
will always set this flag if it returns success, which was checked
earlier in the loop. Remove it.

Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/acpi/apei/ghes.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index acf0c37e9af9..f7a0ff1c785a 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -936,9 +936,6 @@ static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 			__ghes_panic(ghes);
 		}
 
-		if (!(ghes->flags & GHES_TO_CLEAR))
-			continue;
-
 		__process_error(ghes);
 		ghes_clear_estatus(ghes, buf_paddr);
 	}
-- 
2.19.2



* [PATCH v7 08/25] ACPI / APEI: Don't update struct ghes' flags in read/clear estatus
  2018-12-03 18:05 [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up James Morse
                   ` (6 preceding siblings ...)
  2018-12-03 18:05 ` [PATCH v7 07/25] ACPI / APEI: Remove spurious GHES_TO_CLEAR check James Morse
@ 2018-12-03 18:05 ` James Morse
  2018-12-03 18:05 ` [PATCH v7 09/25] ACPI / APEI: Generalise the estatus queue's notify code James Morse
                   ` (16 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: James Morse @ 2018-12-03 18:05 UTC (permalink / raw)
  To: linux-acpi
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier,
	Catalin Marinas, Will Deacon, Christoffer Dall, Dongjiu Geng,
	linux-mm, Borislav Petkov, James Morse, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

ghes_read_estatus() sets a flag in struct ghes if the buffer of
CPER records needs to be cleared once the records have been
processed. This flag value is a problem if a struct ghes can be
processed concurrently, as happens at probe time if an NMI arrives
for the same error source. The NMI clears the flag, meaning the
interrupted handler may never do the ghes_clear_estatus() work.

The GHES_TO_CLEAR flag is only set at the same time as
buffer_paddr, which is now owned by the caller and passed to
ghes_clear_estatus(). Use this value as the flag.

A non-zero buf_paddr returned by ghes_read_estatus() means
ghes_clear_estatus() should clear this address. ghes_read_estatus()
already checks for a read of error_status_address being zero,
so CPER records cannot be written here.
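
As an illustration only, the "address doubles as the flag" idea can be
modelled in user-space C. Everything named fake_* below is hypothetical
and is not part of ghes.c; the array stands in for firmware memory and
the checks are deliberately simpler than the real read path.

```c
#include <stdint.h>

#define FAKE_MEM_WORDS 16

/* Hypothetical stand-in for firmware memory, indexed by a fake address. */
static uint32_t fake_phys_mem[FAKE_MEM_WORDS];

/* Hypothetical error source; addr_reg models error_status_address. */
struct fake_source {
	uint64_t addr_reg;
	uint32_t block_status;	/* local copy of the record's status word */
};

/*
 * Returns 0 with a non-zero *buf_paddr when a record was found.
 * On failure *buf_paddr is zeroed, so the caller has nothing to clear.
 */
static int fake_read_estatus(struct fake_source *src, uint64_t *buf_paddr)
{
	*buf_paddr = src->addr_reg;
	if (!*buf_paddr || *buf_paddr >= FAKE_MEM_WORDS) {
		*buf_paddr = 0;
		return -1;
	}

	src->block_status = fake_phys_mem[*buf_paddr];
	if (!src->block_status) {
		*buf_paddr = 0;
		return -1;
	}

	return 0;
}

/* No per-source flag: a non-zero address is the only "needs clearing" signal. */
static void fake_clear_estatus(struct fake_source *src, uint64_t buf_paddr)
{
	src->block_status = 0;
	if (buf_paddr)
		fake_phys_mem[buf_paddr] = 0;
}
```

Because the address lives on the caller's stack, two contexts handling
the same source each keep their own copy and cannot lose the other's
"needs clearing" state.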

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Borislav Petkov <bp@suse.de>

---
Changes since v6:
 * Added Boris' RB, then:
 * Moved earlier in the series,
 * Tinkered with the commit message,
 * Always cleared buf_paddr on errors in the previous patch, which was
   previously in here.
---
 drivers/acpi/apei/ghes.c | 8 +-------
 include/acpi/ghes.h      | 1 -
 2 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index f7a0ff1c785a..d06456e60318 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -329,8 +329,6 @@ static int ghes_read_estatus(struct ghes *ghes, u64 *buf_paddr)
 		return -ENOENT;
 	}
 
-	ghes->flags |= GHES_TO_CLEAR;
-
 	rc = -EIO;
 	len = cper_estatus_len(ghes->estatus);
 	if (len < sizeof(*ghes->estatus))
@@ -357,13 +355,9 @@ static int ghes_read_estatus(struct ghes *ghes, u64 *buf_paddr)
 static void ghes_clear_estatus(struct ghes *ghes, u64 buf_paddr)
 {
 	ghes->estatus->block_status = 0;
-	if (!(ghes->flags & GHES_TO_CLEAR))
-		return;
-	if (buf_paddr) {
+	if (buf_paddr)
 		ghes_copy_tofrom_phys(ghes->estatus, buf_paddr,
 				      sizeof(ghes->estatus->block_status), 0);
-		ghes->flags &= ~GHES_TO_CLEAR;
-	}
 }
 
 static void ghes_handle_memory_failure(struct acpi_hest_generic_data *gdata, int sev)
diff --git a/include/acpi/ghes.h b/include/acpi/ghes.h
index f82f4a7ddd90..e3f1cddb4ac8 100644
--- a/include/acpi/ghes.h
+++ b/include/acpi/ghes.h
@@ -13,7 +13,6 @@
  * estatus: memory buffer for error status block, allocated during
  * HEST parsing.
  */
-#define GHES_TO_CLEAR		0x0001
 #define GHES_EXITING		0x0002
 
 struct ghes {
-- 
2.19.2



* [PATCH v7 09/25] ACPI / APEI: Generalise the estatus queue's notify code
  2018-12-03 18:05 [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up James Morse
                   ` (7 preceding siblings ...)
  2018-12-03 18:05 ` [PATCH v7 08/25] ACPI / APEI: Don't update struct ghes' flags in read/clear estatus James Morse
@ 2018-12-03 18:05 ` James Morse
  2018-12-11 17:44   ` Borislav Petkov
  2018-12-03 18:05 ` [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records James Morse
                   ` (15 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: James Morse @ 2018-12-03 18:05 UTC (permalink / raw)
  To: linux-acpi
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier,
	Catalin Marinas, Will Deacon, Christoffer Dall, Dongjiu Geng,
	linux-mm, Borislav Petkov, James Morse, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

Refactor the estatus queue's pool notification routine from
NOTIFY_NMI's handlers. This will allow another notification
method to use the estatus queue without duplicating this code.

This patch adds rcu_read_lock()/rcu_read_unlock() around the
list_for_each_entry_rcu() walker. These aren't strictly necessary as
the whole nmi_enter/nmi_exit() window is a spooky RCU read-side
critical section.

_in_nmi_notify_one() is separate from the rcu-list walker for a later
caller that doesn't need to walk a list.
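
The split between a per-source helper and a list walker can be sketched
in plain C as below. This is a simplified model, not the kernel code:
struct fake_ghes, notify_one() and queue_notified() are hypothetical
names, and a singly linked list stands in for the RCU list.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical error source on a plain singly linked list. */
struct fake_ghes {
	int has_record;
	struct fake_ghes *next;
};

/* Handle one source: 0 if it had a record, -ENOENT otherwise. */
static int notify_one(struct fake_ghes *g)
{
	if (!g->has_record)
		return -ENOENT;

	g->has_record = 0;	/* record consumed */
	return 0;
}

/* Walk every source: 0 if at least one of them reported something. */
static int queue_notified(struct fake_ghes *head)
{
	int ret = -ENOENT;

	for (struct fake_ghes *g = head; g; g = g->next) {
		if (!notify_one(g))
			ret = 0;
	}

	return ret;
}
```

A caller that only ever has one source can call notify_one() directly,
which is the reason given above for keeping the helper separate.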

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Punit Agrawal <punit.agrawal@arm.com>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>

---
Changes since v6:
 * Removed pool grow/remove code as this is no longer necessary.

Changes since v3:
 * Removed duplicate or redundant paragraphs in commit message.
 * Fixed the style of a zero check.
Changes since v1:
   * Tidied up _in_nmi_notify_one().
---
 drivers/acpi/apei/ghes.c | 63 ++++++++++++++++++++++++++--------------
 1 file changed, 41 insertions(+), 22 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index d06456e60318..366dbdd41ef3 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -907,37 +907,56 @@ static void __process_error(struct ghes *ghes)
 #endif
 }
 
-static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
+static int _in_nmi_notify_one(struct ghes *ghes)
 {
 	u64 buf_paddr;
-	struct ghes *ghes;
-	int sev, ret = NMI_DONE;
+	int sev;
 
-	if (!atomic_add_unless(&ghes_in_nmi, 1, 1))
-		return ret;
+	if (ghes_read_estatus(ghes, &buf_paddr)) {
+		ghes_clear_estatus(ghes, buf_paddr);
+		return -ENOENT;
+	}
 
-	list_for_each_entry_rcu(ghes, &ghes_nmi, list) {
-		if (ghes_read_estatus(ghes, &buf_paddr)) {
-			ghes_clear_estatus(ghes, buf_paddr);
-			continue;
-		} else {
-			ret = NMI_HANDLED;
-		}
+	sev = ghes_severity(ghes->estatus->error_severity);
+	if (sev >= GHES_SEV_PANIC) {
+		ghes_print_queued_estatus();
+		__ghes_panic(ghes);
+	}
 
-		sev = ghes_severity(ghes->estatus->error_severity);
-		if (sev >= GHES_SEV_PANIC) {
-			ghes_print_queued_estatus();
-			__ghes_panic(ghes);
-		}
+	__process_error(ghes);
+	ghes_clear_estatus(ghes, buf_paddr);
 
-		__process_error(ghes);
-		ghes_clear_estatus(ghes, buf_paddr);
+	return 0;
+}
+
+static int ghes_estatus_queue_notified(struct list_head *rcu_list)
+{
+	int ret = -ENOENT;
+	struct ghes *ghes;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(ghes, rcu_list, list) {
+		if (!_in_nmi_notify_one(ghes))
+			ret = 0;
 	}
+	rcu_read_unlock();
 
-#ifdef CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG
-	if (ret == NMI_HANDLED)
+	if (IS_ENABLED(CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG) && !ret)
 		irq_work_queue(&ghes_proc_irq_work);
-#endif
+
+	return ret;
+}
+
+static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
+{
+	int ret = NMI_DONE;
+
+	if (!atomic_add_unless(&ghes_in_nmi, 1, 1))
+		return ret;
+
+	if (!ghes_estatus_queue_notified(&ghes_nmi))
+		ret = NMI_HANDLED;
+
 	atomic_dec(&ghes_in_nmi);
 	return ret;
 }
-- 
2.19.2



* [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records
  2018-12-03 18:05 [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up James Morse
                   ` (8 preceding siblings ...)
  2018-12-03 18:05 ` [PATCH v7 09/25] ACPI / APEI: Generalise the estatus queue's notify code James Morse
@ 2018-12-03 18:05 ` James Morse
  2018-12-11 18:36   ` Borislav Petkov
  2018-12-03 18:05 ` [PATCH v7 11/25] ACPI / APEI: Move NOTIFY_SEA between the estatus-queue and NOTIFY_NMI James Morse
                   ` (14 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: James Morse @ 2018-12-03 18:05 UTC (permalink / raw)
  To: linux-acpi
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier,
	Catalin Marinas, Will Deacon, Christoffer Dall, Dongjiu Geng,
	linux-mm, Borislav Petkov, James Morse, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

ACPI has a GHESv2 which is used on hardware reduced platforms to
explicitly acknowledge that the memory for CPER records has been
consumed. This lets an external agent know it can re-use this
memory for something else.

Previously notify_nmi and the estatus queue didn't do this as
they were never used on hardware reduced platforms. Once we move
notify_sea over to use the estatus queue, it may become necessary.

Add the call. This is safe for use in NMI context as the
read_ack_register is pre-mapped by ghes_new() before the
ghes can be added to an RCU list, and then found by the
notification handler.
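
For readers unfamiliar with the acknowledgement, here is a rough sketch
of the read-modify-write it implies, assuming the GHESv2 entry supplies
a "preserve" mask and a "write" value for the Read Ack Register. The
struct and function names are hypothetical, and the register is
modelled as a plain integer rather than a mapped address.

```c
#include <stdint.h>

/* Hypothetical description of the ack register fields. */
struct fake_ack_desc {
	uint64_t preserve;		/* bits to keep from the current value */
	uint64_t write;			/* bits to set to signal "consumed" */
	unsigned int bit_offset;	/* assumed to be less than 64 */
};

/* Compute the value to write back, given the register's current value. */
static uint64_t fake_ack_value(uint64_t current,
			       const struct fake_ack_desc *d)
{
	uint64_t val = current & (d->preserve << d->bit_offset);

	val |= d->write << d->bit_offset;
	return val;
}
```

Nothing here sleeps or takes a lock, which is in line with the argument
above that the only NMI-unsafe step, mapping the register, has already
happened in ghes_new().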

Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/acpi/apei/ghes.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 366dbdd41ef3..15d94373ba72 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -926,6 +926,10 @@ static int _in_nmi_notify_one(struct ghes *ghes)
 	__process_error(ghes);
 	ghes_clear_estatus(ghes, buf_paddr);
 
+	if (is_hest_type_generic_v2(ghes) && ghes_ack_error(ghes->generic_v2))
+		pr_warn_ratelimited(FW_WARN GHES_PFX
+				    "Failed to ack error status block!\n");
+
 	return 0;
 }
 
-- 
2.19.2



* [PATCH v7 11/25] ACPI / APEI: Move NOTIFY_SEA between the estatus-queue and NOTIFY_NMI
  2018-12-03 18:05 [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up James Morse
                   ` (9 preceding siblings ...)
  2018-12-03 18:05 ` [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records James Morse
@ 2018-12-03 18:05 ` James Morse
  2019-01-21 13:01   ` Borislav Petkov
  2018-12-03 18:06 ` [PATCH v7 12/25] ACPI / APEI: Switch NOTIFY_SEA to use the estatus queue James Morse
                   ` (13 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: James Morse @ 2018-12-03 18:05 UTC (permalink / raw)
  To: linux-acpi
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier,
	Catalin Marinas, Will Deacon, Christoffer Dall, Dongjiu Geng,
	linux-mm, Borislav Petkov, James Morse, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

The estatus-queue code is currently hidden by the NOTIFY_NMI #ifdefs.
Once NOTIFY_SEA starts using the estatus-queue we can stop hiding
it as each architecture has a user that can't be turned off.

Split the existing CONFIG_HAVE_ACPI_APEI_NMI block in two, and move
the SEA code into the gap.

This patch moves code around ... and changes the stale comment
describing why the estatus queue is necessary: printk() is no
longer the issue, it's the helpers like memory_failure_queue() that
aren't NMI safe.
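
The deferral pattern the comment describes (save the record without
taking locks, process it later from a safer context) can be sketched
with C11 atomics. This is a user-space model under stated assumptions:
the fake_* names are hypothetical and the kernel's llist and irq_work
APIs are not used here.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical queued record. */
struct fake_node {
	int payload;
	struct fake_node *next;
};

static _Atomic(struct fake_node *) fake_llist_head;

/* Producer: lock-free push, usable where taking a lock is not allowed. */
static void fake_llist_add(struct fake_node *n)
{
	struct fake_node *old = atomic_load(&fake_llist_head);

	do {
		n->next = old;	/* 'old' is refreshed when the CAS fails */
	} while (!atomic_compare_exchange_weak(&fake_llist_head, &old, n));
}

/* Consumer: detach the whole list at once, then restore arrival order. */
static struct fake_node *fake_llist_del_all_ordered(void)
{
	struct fake_node *n = atomic_exchange(&fake_llist_head,
					      (struct fake_node *)NULL);
	struct fake_node *rev = NULL;

	while (n) {
		struct fake_node *next = n->next;

		n->next = rev;
		rev = n;
		n = next;
	}

	return rev;
}
```

The push leaves the newest entry at the head, so the consumer reverses
the detached list before walking it if records should be handled in the
order they arrived.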

Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/acpi/apei/ghes.c | 113 ++++++++++++++++++++-------------------
 1 file changed, 59 insertions(+), 54 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 15d94373ba72..00fe4785e469 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -765,66 +765,21 @@ static struct notifier_block ghes_notifier_hed = {
 	.notifier_call = ghes_notify_hed,
 };
 
-#ifdef CONFIG_ACPI_APEI_SEA
-static LIST_HEAD(ghes_sea);
-
-/*
- * Return 0 only if one of the SEA error sources successfully reported an error
- * record sent from the firmware.
- */
-int ghes_notify_sea(void)
-{
-	struct ghes *ghes;
-	int ret = -ENOENT;
-
-	rcu_read_lock();
-	list_for_each_entry_rcu(ghes, &ghes_sea, list) {
-		if (!ghes_proc(ghes))
-			ret = 0;
-	}
-	rcu_read_unlock();
-	return ret;
-}
-
-static void ghes_sea_add(struct ghes *ghes)
-{
-	mutex_lock(&ghes_list_mutex);
-	list_add_rcu(&ghes->list, &ghes_sea);
-	mutex_unlock(&ghes_list_mutex);
-}
-
-static void ghes_sea_remove(struct ghes *ghes)
-{
-	mutex_lock(&ghes_list_mutex);
-	list_del_rcu(&ghes->list);
-	mutex_unlock(&ghes_list_mutex);
-	synchronize_rcu();
-}
-#else /* CONFIG_ACPI_APEI_SEA */
-static inline void ghes_sea_add(struct ghes *ghes) { }
-static inline void ghes_sea_remove(struct ghes *ghes) { }
-#endif /* CONFIG_ACPI_APEI_SEA */
-
 #ifdef CONFIG_HAVE_ACPI_APEI_NMI
 /*
- * printk is not safe in NMI context.  So in NMI handler, we allocate
- * required memory from lock-less memory allocator
- * (ghes_estatus_pool), save estatus into it, put them into lock-less
- * list (ghes_estatus_llist), then delay printk into IRQ context via
- * irq_work (ghes_proc_irq_work).  ghes_estatus_size_request record
- * required pool size by all NMI error source.
+ * Handlers for CPER records may not be NMI safe. For example,
+ * memory_failure_queue() takes spinlocks and calls schedule_work_on().
+ * In any NMI-like handler, memory from ghes_estatus_pool is used to save
+ * estatus, and added to the ghes_estatus_llist. irq_work_queue() causes
+ * ghes_proc_in_irq() to run in IRQ context where each estatus in
+ * ghes_estatus_llist is processed.
+ *
+ * Memory from the ghes_estatus_pool is also used with the ghes_estatus_cache
+ * to suppress frequent messages.
  */
 static struct llist_head ghes_estatus_llist;
 static struct irq_work ghes_proc_irq_work;
 
-/*
- * NMI may be triggered on any CPU, so ghes_in_nmi is used for
- * having only one concurrent reader.
- */
-static atomic_t ghes_in_nmi = ATOMIC_INIT(0);
-
-static LIST_HEAD(ghes_nmi);
-
 static void ghes_proc_in_irq(struct irq_work *irq_work)
 {
 	struct llist_node *llnode, *next;
@@ -950,6 +905,56 @@ static int ghes_estatus_queue_notified(struct list_head *rcu_list)
 
 	return ret;
 }
+#endif /* CONFIG_HAVE_ACPI_APEI_NMI */
+
+#ifdef CONFIG_ACPI_APEI_SEA
+static LIST_HEAD(ghes_sea);
+
+/*
+ * Return 0 only if one of the SEA error sources successfully reported an error
+ * record sent from the firmware.
+ */
+int ghes_notify_sea(void)
+{
+	struct ghes *ghes;
+	int ret = -ENOENT;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(ghes, &ghes_sea, list) {
+		if (!ghes_proc(ghes))
+			ret = 0;
+	}
+	rcu_read_unlock();
+	return ret;
+}
+
+static void ghes_sea_add(struct ghes *ghes)
+{
+	mutex_lock(&ghes_list_mutex);
+	list_add_rcu(&ghes->list, &ghes_sea);
+	mutex_unlock(&ghes_list_mutex);
+}
+
+static void ghes_sea_remove(struct ghes *ghes)
+{
+	mutex_lock(&ghes_list_mutex);
+	list_del_rcu(&ghes->list);
+	mutex_unlock(&ghes_list_mutex);
+	synchronize_rcu();
+}
+#else /* CONFIG_ACPI_APEI_SEA */
+static inline void ghes_sea_add(struct ghes *ghes) { }
+static inline void ghes_sea_remove(struct ghes *ghes) { }
+#endif /* CONFIG_ACPI_APEI_SEA */
+
+#ifdef CONFIG_HAVE_ACPI_APEI_NMI
+/*
+ * NMI may be triggered on any CPU, so ghes_in_nmi is used for
+ * having only one concurrent reader.
+ */
+static atomic_t ghes_in_nmi = ATOMIC_INIT(0);
+
+static LIST_HEAD(ghes_nmi);
 
 static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 {
-- 
2.19.2



* [PATCH v7 12/25] ACPI / APEI: Switch NOTIFY_SEA to use the estatus queue
  2018-12-03 18:05 [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up James Morse
                   ` (10 preceding siblings ...)
  2018-12-03 18:05 ` [PATCH v7 11/25] ACPI / APEI: Move NOTIFY_SEA between the estatus-queue and NOTIFY_NMI James Morse
@ 2018-12-03 18:06 ` James Morse
  2018-12-03 18:06 ` [PATCH v7 13/25] KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing James Morse
                   ` (12 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: James Morse @ 2018-12-03 18:06 UTC (permalink / raw)
  To: linux-acpi
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier,
	Catalin Marinas, Will Deacon, Christoffer Dall, Dongjiu Geng,
	linux-mm, Borislav Petkov, James Morse, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

Now that the estatus queue can be used by more than one notification
method, we can move notifications that have NMI-like behaviour over.

Switch NOTIFY_SEA over to use the estatus queue. This makes it behave
in the same way as x86's NOTIFY_NMI.

Remove Kconfig's ability to turn ACPI_APEI_SEA off if ACPI_APEI_GHES
is selected. This roughly matches the x86 NOTIFY_NMI behaviour, and means
each architecture has at least one user of the estatus-queue, meaning it
doesn't need guarding with ifdef.

Signed-off-by: James Morse <james.morse@arm.com>
---
Changes since v6:
 * Lost all the pool grow/shrink stuff,
 * Changed Kconfig so this can't be turned off to avoid kconfig complexity:
 * Dropped Tyler's tested-by.
 * For now we need #ifdef around the SEA code as the arch code assumes it's there.
 * Removed Punit's reviewed-by due to the swirling #ifdeffery
---
 drivers/acpi/apei/Kconfig | 12 +-----------
 drivers/acpi/apei/ghes.c  | 22 +++++-----------------
 2 files changed, 6 insertions(+), 28 deletions(-)

diff --git a/drivers/acpi/apei/Kconfig b/drivers/acpi/apei/Kconfig
index 52ae5438edeb..6b18f8bc7be3 100644
--- a/drivers/acpi/apei/Kconfig
+++ b/drivers/acpi/apei/Kconfig
@@ -41,19 +41,9 @@ config ACPI_APEI_PCIEAER
 	  Turn on this option to enable the corresponding support.
 
 config ACPI_APEI_SEA
-	bool "APEI Synchronous External Abort logging/recovering support"
+	bool
 	depends on ARM64 && ACPI_APEI_GHES
 	default y
-	help
-	  This option should be enabled if the system supports
-	  firmware first handling of SEA (Synchronous External Abort).
-	  SEA happens with certain faults of data abort or instruction
-	  abort synchronous exceptions on ARMv8 systems. If a system
-	  supports firmware first handling of SEA, the platform analyzes
-	  and handles hardware error notifications from SEA, and it may then
-	  form a HW error record for the OS to parse and handle. This
-	  option allows the OS to look for such hardware error record, and
-	  take appropriate action.
 
 config ACPI_APEI_MEMORY_FAILURE
 	bool "APEI memory error recovering support"
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 00fe4785e469..4b33fa562e32 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -765,7 +765,6 @@ static struct notifier_block ghes_notifier_hed = {
 	.notifier_call = ghes_notify_hed,
 };
 
-#ifdef CONFIG_HAVE_ACPI_APEI_NMI
 /*
  * Handlers for CPER records may not be NMI safe. For example,
  * memory_failure_queue() takes spinlocks and calls schedule_work_on().
@@ -905,7 +904,6 @@ static int ghes_estatus_queue_notified(struct list_head *rcu_list)
 
 	return ret;
 }
-#endif /* CONFIG_HAVE_ACPI_APEI_NMI */
 
 #ifdef CONFIG_ACPI_APEI_SEA
 static LIST_HEAD(ghes_sea);
@@ -916,16 +914,7 @@ static LIST_HEAD(ghes_sea);
  */
 int ghes_notify_sea(void)
 {
-	struct ghes *ghes;
-	int ret = -ENOENT;
-
-	rcu_read_lock();
-	list_for_each_entry_rcu(ghes, &ghes_sea, list) {
-		if (!ghes_proc(ghes))
-			ret = 0;
-	}
-	rcu_read_unlock();
-	return ret;
+	return ghes_estatus_queue_notified(&ghes_sea);
 }
 
 static void ghes_sea_add(struct ghes *ghes)
@@ -992,16 +981,15 @@ static void ghes_nmi_remove(struct ghes *ghes)
 	 */
 	synchronize_rcu();
 }
+#else /* CONFIG_HAVE_ACPI_APEI_NMI */
+static inline void ghes_nmi_add(struct ghes *ghes) { }
+static inline void ghes_nmi_remove(struct ghes *ghes) { }
+#endif /* CONFIG_HAVE_ACPI_APEI_NMI */
 
 static void ghes_nmi_init_cxt(void)
 {
 	init_irq_work(&ghes_proc_irq_work, ghes_proc_in_irq);
 }
-#else /* CONFIG_HAVE_ACPI_APEI_NMI */
-static inline void ghes_nmi_add(struct ghes *ghes) { }
-static inline void ghes_nmi_remove(struct ghes *ghes) { }
-static inline void ghes_nmi_init_cxt(void) { }
-#endif /* CONFIG_HAVE_ACPI_APEI_NMI */
 
 static int ghes_probe(struct platform_device *ghes_dev)
 {
-- 
2.19.2



* [PATCH v7 13/25] KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing
  2018-12-03 18:05 [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up James Morse
                   ` (11 preceding siblings ...)
  2018-12-03 18:06 ` [PATCH v7 12/25] ACPI / APEI: Switch NOTIFY_SEA to use the estatus queue James Morse
@ 2018-12-03 18:06 ` James Morse
  2018-12-06 16:17   ` Catalin Marinas
  2018-12-03 18:06 ` [PATCH v7 14/25] arm64: KVM/mm: Move SEA handling behind a single 'claim' interface James Morse
                   ` (11 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: James Morse @ 2018-12-03 18:06 UTC (permalink / raw)
  To: linux-acpi
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier,
	Catalin Marinas, Will Deacon, Christoffer Dall, Dongjiu Geng,
	linux-mm, Borislav Petkov, James Morse, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

To split up APEI's in_nmi() path, the caller needs to always be
in_nmi(). KVM shouldn't have to know about this, so pull the RAS
plumbing out into a header file.

Currently guest synchronous external aborts are claimed as RAS
notifications by handle_guest_sea(), which is hidden in the arch code's
mm/fault.c. 32bit gets a dummy declaration in system_misc.h.

There is going to be more of this in the future if/when the kernel
supports the SError-based firmware-first notification mechanism and/or
kernel-first notifications for both synchronous external abort and
SError. Each of these will come with some Kconfig symbols and a
handful of header files.

Create a header file for all this.

This patch gives handle_guest_sea() a 'kvm_' prefix, and moves the
declarations to kvm_ras.h as preparation for a future patch that moves
the ACPI-specific RAS code out of mm/fault.c.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>

---
Changes since v6:
 * Tinkered with the commit message
---
 arch/arm/include/asm/kvm_ras.h       | 14 ++++++++++++++
 arch/arm/include/asm/system_misc.h   |  5 -----
 arch/arm64/include/asm/kvm_ras.h     | 11 +++++++++++
 arch/arm64/include/asm/system_misc.h |  2 --
 arch/arm64/mm/fault.c                |  2 +-
 virt/kvm/arm/mmu.c                   |  4 ++--
 6 files changed, 28 insertions(+), 10 deletions(-)
 create mode 100644 arch/arm/include/asm/kvm_ras.h
 create mode 100644 arch/arm64/include/asm/kvm_ras.h

diff --git a/arch/arm/include/asm/kvm_ras.h b/arch/arm/include/asm/kvm_ras.h
new file mode 100644
index 000000000000..e9577292dfe4
--- /dev/null
+++ b/arch/arm/include/asm/kvm_ras.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (C) 2018 - Arm Ltd */
+
+#ifndef __ARM_KVM_RAS_H__
+#define __ARM_KVM_RAS_H__
+
+#include <linux/types.h>
+
+static inline int kvm_handle_guest_sea(phys_addr_t addr, unsigned int esr)
+{
+	return -1;
+}
+
+#endif /* __ARM_KVM_RAS_H__ */
diff --git a/arch/arm/include/asm/system_misc.h b/arch/arm/include/asm/system_misc.h
index 8e76db83c498..66f6a3ae68d2 100644
--- a/arch/arm/include/asm/system_misc.h
+++ b/arch/arm/include/asm/system_misc.h
@@ -38,11 +38,6 @@ static inline void harden_branch_predictor(void)
 
 extern unsigned int user_debug;
 
-static inline int handle_guest_sea(phys_addr_t addr, unsigned int esr)
-{
-	return -1;
-}
-
 #endif /* !__ASSEMBLY__ */
 
 #endif /* __ASM_ARM_SYSTEM_MISC_H */
diff --git a/arch/arm64/include/asm/kvm_ras.h b/arch/arm64/include/asm/kvm_ras.h
new file mode 100644
index 000000000000..6096f0251812
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_ras.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (C) 2018 - Arm Ltd */
+
+#ifndef __ARM64_KVM_RAS_H__
+#define __ARM64_KVM_RAS_H__
+
+#include <linux/types.h>
+
+int kvm_handle_guest_sea(phys_addr_t addr, unsigned int esr);
+
+#endif /* __ARM64_KVM_RAS_H__ */
diff --git a/arch/arm64/include/asm/system_misc.h b/arch/arm64/include/asm/system_misc.h
index 0e2a0ecaf484..32693f34f431 100644
--- a/arch/arm64/include/asm/system_misc.h
+++ b/arch/arm64/include/asm/system_misc.h
@@ -46,8 +46,6 @@ extern void __show_regs(struct pt_regs *);
 
 extern void (*arm_pm_restart)(enum reboot_mode reboot_mode, const char *cmd);
 
-int handle_guest_sea(phys_addr_t addr, unsigned int esr);
-
 #endif	/* __ASSEMBLY__ */
 
 #endif	/* __ASM_SYSTEM_MISC_H */
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 7d9571f4ae3d..eeeb576b33d7 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -720,7 +720,7 @@ static const struct fault_info fault_info[] = {
 	{ do_bad,		SIGKILL, SI_KERNEL,	"unknown 63"			},
 };
 
-int handle_guest_sea(phys_addr_t addr, unsigned int esr)
+int kvm_handle_guest_sea(phys_addr_t addr, unsigned int esr)
 {
 	return ghes_notify_sea();
 }
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 5eca48bdb1a6..f322b66ef975 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -27,10 +27,10 @@
 #include <asm/kvm_arm.h>
 #include <asm/kvm_mmu.h>
 #include <asm/kvm_mmio.h>
+#include <asm/kvm_ras.h>
 #include <asm/kvm_asm.h>
 #include <asm/kvm_emulate.h>
 #include <asm/virt.h>
-#include <asm/system_misc.h>
 
 #include "trace.h"
 
@@ -1703,7 +1703,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		 * For RAS the host kernel may handle this abort.
 		 * There is no need to pass the error into the guest.
 		 */
-		if (!handle_guest_sea(fault_ipa, kvm_vcpu_get_hsr(vcpu)))
+		if (!kvm_handle_guest_sea(fault_ipa, kvm_vcpu_get_hsr(vcpu)))
 			return 1;
 
 		if (unlikely(!is_iabt)) {
-- 
2.19.2



* [PATCH v7 14/25] arm64: KVM/mm: Move SEA handling behind a single 'claim' interface
  2018-12-03 18:05 [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up James Morse
                   ` (12 preceding siblings ...)
  2018-12-03 18:06 ` [PATCH v7 13/25] KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing James Morse
@ 2018-12-03 18:06 ` James Morse
  2018-12-06 16:17   ` Catalin Marinas
  2018-12-03 18:06 ` [PATCH v7 15/25] ACPI / APEI: Move locking to the notification helper James Morse
                   ` (10 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: James Morse @ 2018-12-03 18:06 UTC (permalink / raw)
  To: linux-acpi
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier,
	Catalin Marinas, Will Deacon, Christoffer Dall, Dongjiu Geng,
	linux-mm, Borislav Petkov, James Morse, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

To split up APEI's in_nmi() path, the caller needs to always be
in_nmi(). Add a helper to do the work and claim the notification.

When KVM or the arch code takes an exception that might be a RAS
notification, it asks the APEI firmware-first code whether it wants
to claim the exception. A future kernel-first mechanism may be queried
afterwards, and claim the notification; otherwise we fall through
to the existing default behaviour.

The NOTIFY_SEA code was merged before considering multiple, possibly
interacting, NMI-like notifications and the need to consider kernel
first in the future. Make the 'claiming' behaviour explicit.
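
A rough sketch of what "claiming" means, written as stand-alone C
rather than arch code: each candidate handler is asked in turn, and the
first one to return 0 owns the exception. All names below are
hypothetical, and the second handler only marks where a kernel-first
mechanism might sit in the future.

```c
#include <errno.h>
#include <stddef.h>

typedef int (*claim_fn)(void *ctx);

/* Hypothetical firmware-first handler: claims if a record is pending. */
static int fake_firmware_first_claim(void *ctx)
{
	int *pending = ctx;

	if (!*pending)
		return -ENOENT;

	*pending = 0;
	return 0;
}

/* Hypothetical placeholder for a later mechanism that claims nothing yet. */
static int fake_kernel_first_claim(void *ctx)
{
	(void)ctx;
	return -ENOENT;
}

/* Ask each handler in order; 0 means somebody claimed the exception. */
static int fake_claim_exception(const claim_fn *claimers, size_t n, void *ctx)
{
	for (size_t i = 0; i < n; i++) {
		if (!claimers[i](ctx))
			return 0;
	}

	return -ENOENT;	/* unclaimed: fall through to the default path */
}
```

A non-zero return from fake_claim_exception() corresponds to the
"existing default behaviour" mentioned above.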

Restructuring the APEI code to allow multiple NMI-like notifications
means any notification that might interrupt interrupts-masked
code must always be wrapped in nmi_enter()/nmi_exit(). This will
allow APEI to use in_nmi() to pick the right fixmap entries.

Mask SError over this window to prevent an asynchronous RAS error
arriving and tripping 'nmi_enter()'s BUG_ON(in_nmi()).

Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>

---
Why does apei_claim_sea() take a pt_regs? This gets used later to take
APEI by the hand through NMI->IRQ context, depending on what we
interrupted.

Changes since v6:
 * Moved the voice of the commit message.
 * moved arch_local_save_flags() below the !IS_ENABLED drop-out
 * Moved the dummy declaration into the if-ACPI part of the header instead
   of if-APEI.

Changes since v4:
 * Made irqs-unmasked comment a lockdep assert.

Changes since v3:
 * Removed spurious whitespace change
 * Updated comment in acpi.c to cover SError masking

Changes since v2:
 * Added dummy definition for !ACPI and culled IS_ENABLED() checks.
---
 arch/arm64/include/asm/acpi.h      |  4 +++-
 arch/arm64/include/asm/daifflags.h |  1 +
 arch/arm64/include/asm/kvm_ras.h   | 16 ++++++++++++++-
 arch/arm64/kernel/acpi.c           | 31 ++++++++++++++++++++++++++++++
 arch/arm64/mm/fault.c              | 24 +++++------------------
 5 files changed, 55 insertions(+), 21 deletions(-)

diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
index 709208dfdc8b..4ed19b26ec1a 100644
--- a/arch/arm64/include/asm/acpi.h
+++ b/arch/arm64/include/asm/acpi.h
@@ -18,6 +18,7 @@
 
 #include <asm/cputype.h>
 #include <asm/io.h>
+#include <asm/ptrace.h>
 #include <asm/smp_plat.h>
 #include <asm/tlbflush.h>
 
@@ -99,9 +100,10 @@ static inline u32 get_acpi_id_for_cpu(unsigned int cpu)
 
 static inline void arch_fix_phys_package_id(int num, u32 slot) { }
 void __init acpi_init_cpus(void);
-
+int apei_claim_sea(struct pt_regs *regs);
 #else
 static inline void acpi_init_cpus(void) { }
+static inline int apei_claim_sea(struct pt_regs *regs) { return -ENOENT; }
 #endif /* CONFIG_ACPI */
 
 #ifdef CONFIG_ARM64_ACPI_PARKING_PROTOCOL
diff --git a/arch/arm64/include/asm/daifflags.h b/arch/arm64/include/asm/daifflags.h
index 8d91f2233135..fa90779fc752 100644
--- a/arch/arm64/include/asm/daifflags.h
+++ b/arch/arm64/include/asm/daifflags.h
@@ -20,6 +20,7 @@
 
 #define DAIF_PROCCTX		0
 #define DAIF_PROCCTX_NOIRQ	PSR_I_BIT
+#define DAIF_ERRCTX		(PSR_I_BIT | PSR_A_BIT)
 
 /* mask/save/unmask/restore all exceptions, including interrupts. */
 static inline void local_daif_mask(void)
diff --git a/arch/arm64/include/asm/kvm_ras.h b/arch/arm64/include/asm/kvm_ras.h
index 6096f0251812..8ac6ee77437c 100644
--- a/arch/arm64/include/asm/kvm_ras.h
+++ b/arch/arm64/include/asm/kvm_ras.h
@@ -4,8 +4,22 @@
 #ifndef __ARM64_KVM_RAS_H__
 #define __ARM64_KVM_RAS_H__
 
+#include <linux/acpi.h>
+#include <linux/errno.h>
 #include <linux/types.h>
 
-int kvm_handle_guest_sea(phys_addr_t addr, unsigned int esr);
+#include <asm/acpi.h>
+
+/*
+ * Was this synchronous external abort a RAS notification?
+ * Returns '0' for errors handled by some RAS subsystem, or -ENOENT.
+ */
+static inline int kvm_handle_guest_sea(phys_addr_t addr, unsigned int esr)
+{
+	/* apei_claim_sea(NULL) expects to mask interrupts itself */
+	lockdep_assert_irqs_enabled();
+
+	return apei_claim_sea(NULL);
+}
 
 #endif /* __ARM64_KVM_RAS_H__ */
diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
index 44e3c351e1ea..803f0494dd3e 100644
--- a/arch/arm64/kernel/acpi.c
+++ b/arch/arm64/kernel/acpi.c
@@ -27,8 +27,10 @@
 #include <linux/smp.h>
 #include <linux/serial_core.h>
 
+#include <acpi/ghes.h>
 #include <asm/cputype.h>
 #include <asm/cpu_ops.h>
+#include <asm/daifflags.h>
 #include <asm/pgtable.h>
 #include <asm/smp_plat.h>
 
@@ -256,3 +258,32 @@ pgprot_t __acpi_get_mem_attribute(phys_addr_t addr)
 		return __pgprot(PROT_NORMAL_NC);
 	return __pgprot(PROT_DEVICE_nGnRnE);
 }
+
+/*
+ * Claim Synchronous External Aborts as a firmware first notification.
+ *
+ * Used by KVM and the arch do_sea handler.
+ * @regs may be NULL when called from process context.
+ */
+int apei_claim_sea(struct pt_regs *regs)
+{
+	int err = -ENOENT;
+	unsigned long current_flags;
+
+	if (!IS_ENABLED(CONFIG_ACPI_APEI_GHES))
+		return err;
+
+	current_flags = arch_local_save_flags();
+
+	/*
+	 * SEA can interrupt SError, mask it and describe this as an NMI so
+	 * that APEI defers the handling.
+	 */
+	local_daif_restore(DAIF_ERRCTX);
+	nmi_enter();
+	err = ghes_notify_sea();
+	nmi_exit();
+	local_daif_restore(current_flags);
+
+	return err;
+}
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index eeeb576b33d7..956afc7d932a 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -18,6 +18,7 @@
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <linux/acpi.h>
 #include <linux/extable.h>
 #include <linux/signal.h>
 #include <linux/mm.h>
@@ -33,6 +34,7 @@
 #include <linux/preempt.h>
 #include <linux/hugetlb.h>
 
+#include <asm/acpi.h>
 #include <asm/bug.h>
 #include <asm/cmpxchg.h>
 #include <asm/cpufeature.h>
@@ -46,8 +48,6 @@
 #include <asm/tlbflush.h>
 #include <asm/traps.h>
 
-#include <acpi/ghes.h>
-
 struct fault_info {
 	int	(*fn)(unsigned long addr, unsigned int esr,
 		      struct pt_regs *regs);
@@ -630,19 +630,10 @@ static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs)
 	inf = esr_to_fault_info(esr);
 
 	/*
-	 * Synchronous aborts may interrupt code which had interrupts masked.
-	 * Before calling out into the wider kernel tell the interested
-	 * subsystems.
+	 * Return value ignored as we rely on signal merging.
+	 * Future patches will make this more robust.
 	 */
-	if (IS_ENABLED(CONFIG_ACPI_APEI_SEA)) {
-		if (interrupts_enabled(regs))
-			nmi_enter();
-
-		ghes_notify_sea();
-
-		if (interrupts_enabled(regs))
-			nmi_exit();
-	}
+	apei_claim_sea(regs);
 
 	if (esr & ESR_ELx_FnV)
 		siaddr = NULL;
@@ -720,11 +711,6 @@ static const struct fault_info fault_info[] = {
 	{ do_bad,		SIGKILL, SI_KERNEL,	"unknown 63"			},
 };
 
-int kvm_handle_guest_sea(phys_addr_t addr, unsigned int esr)
-{
-	return ghes_notify_sea();
-}
-
 asmlinkage void __exception do_mem_abort(unsigned long addr, unsigned int esr,
 					 struct pt_regs *regs)
 {
-- 
2.19.2


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v7 15/25] ACPI / APEI: Move locking to the notification helper
  2018-12-03 18:05 [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up James Morse
                   ` (13 preceding siblings ...)
  2018-12-03 18:06 ` [PATCH v7 14/25] arm64: KVM/mm: Move SEA handling behind a single 'claim' interface James Morse
@ 2018-12-03 18:06 ` James Morse
  2018-12-03 18:06 ` [PATCH v7 16/25] ACPI / APEI: Let the notification helper specify the fixmap slot James Morse
                   ` (9 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: James Morse @ 2018-12-03 18:06 UTC (permalink / raw)
  To: linux-acpi
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier,
	Catalin Marinas, Will Deacon, Christoffer Dall, Dongjiu Geng,
	linux-mm, Borislav Petkov, James Morse, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

ghes_copy_tofrom_phys() takes different locks depending on in_nmi().
This doesn't work if there are multiple NMI-like notifications that
can interrupt each other.

Now that NOTIFY_SEA is always called in the same context, move the
lock-taking to the notification helper. The helper will always know
which lock to take. This avoids ghes_copy_tofrom_phys() taking a guess
based on in_nmi().

This splits NOTIFY_NMI and NOTIFY_SEA to use different locks. All
the other notifications use ghes_proc(), and are called in process
or IRQ context. Move the spin_lock_irqsave() around their ghes_proc()
calls.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
Changes since v6:
 * Tinkered with the commit message
 * Lock definitions have moved due to the #ifdefs
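
The lock placement described above can be modelled in userspace C. This
is a hedged sketch, not the ghes.c code: the model_*() names are
hypothetical, and a C11 atomic_flag stands in for the kernel's raw
spinlock. The point it illustrates is that each notification helper owns
its lock, while the shared copy routine takes none and never has to
guess its calling context.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* One lock per notification type, owned by that type's helper. */
static atomic_flag lock_sea = ATOMIC_FLAG_INIT;
static atomic_flag lock_nmi = ATOMIC_FLAG_INIT;

static void model_lock(atomic_flag *lock)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set(lock))
		;
}

static void model_unlock(atomic_flag *lock)
{
	atomic_flag_clear(lock);
}

/* The shared copy routine takes no lock of its own. */
static void model_copy(char *dst, const char *src, size_t len)
{
	memcpy(dst, src, len);
}

/* Each helper knows its context, so it knows which lock to take. */
static void model_notify_sea(char *dst, const char *src, size_t len)
{
	model_lock(&lock_sea);
	model_copy(dst, src, len);
	model_unlock(&lock_sea);
}

static void model_notify_nmi(char *dst, const char *src, size_t len)
{
	model_lock(&lock_nmi);
	model_copy(dst, src, len);
	model_unlock(&lock_nmi);
}
```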
---
 drivers/acpi/apei/ghes.c | 34 +++++++++++++++++++++++++---------
 1 file changed, 25 insertions(+), 9 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 4b33fa562e32..30490eff7704 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -114,11 +114,10 @@ static DEFINE_MUTEX(ghes_list_mutex);
  * handler, but general ioremap can not be used in atomic context, so
  * the fixmap is used instead.
  *
- * These 2 spinlocks are used to prevent the fixmap entries from being used
+ * This spinlock is used to prevent the fixmap entry from being used
  * simultaneously.
  */
-static DEFINE_RAW_SPINLOCK(ghes_ioremap_lock_nmi);
-static DEFINE_SPINLOCK(ghes_ioremap_lock_irq);
+static DEFINE_SPINLOCK(ghes_notify_lock_irq);
 
 static struct gen_pool *ghes_estatus_pool;
 static unsigned long ghes_estatus_pool_size_request;
@@ -272,7 +271,6 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
 				  int from_phys)
 {
 	void __iomem *vaddr;
-	unsigned long flags = 0;
 	int in_nmi = in_nmi();
 	u64 offset;
 	u32 trunk;
@@ -280,10 +278,8 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
 	while (len > 0) {
 		offset = paddr - (paddr & PAGE_MASK);
 		if (in_nmi) {
-			raw_spin_lock(&ghes_ioremap_lock_nmi);
 			vaddr = ghes_ioremap_pfn_nmi(paddr >> PAGE_SHIFT);
 		} else {
-			spin_lock_irqsave(&ghes_ioremap_lock_irq, flags);
 			vaddr = ghes_ioremap_pfn_irq(paddr >> PAGE_SHIFT);
 		}
 		trunk = PAGE_SIZE - offset;
@@ -297,10 +293,8 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
 		buffer += trunk;
 		if (in_nmi) {
 			ghes_iounmap_nmi();
-			raw_spin_unlock(&ghes_ioremap_lock_nmi);
 		} else {
 			ghes_iounmap_irq();
-			spin_unlock_irqrestore(&ghes_ioremap_lock_irq, flags);
 		}
 	}
 }
@@ -727,8 +721,11 @@ static void ghes_add_timer(struct ghes *ghes)
 static void ghes_poll_func(struct timer_list *t)
 {
 	struct ghes *ghes = from_timer(ghes, t, timer);
+	unsigned long flags;
 
+	spin_lock_irqsave(&ghes_notify_lock_irq, flags);
 	ghes_proc(ghes);
+	spin_unlock_irqrestore(&ghes_notify_lock_irq, flags);
 	if (!(ghes->flags & GHES_EXITING))
 		ghes_add_timer(ghes);
 }
@@ -736,9 +733,12 @@ static void ghes_poll_func(struct timer_list *t)
 static irqreturn_t ghes_irq_func(int irq, void *data)
 {
 	struct ghes *ghes = data;
+	unsigned long flags;
 	int rc;
 
+	spin_lock_irqsave(&ghes_notify_lock_irq, flags);
 	rc = ghes_proc(ghes);
+	spin_unlock_irqrestore(&ghes_notify_lock_irq, flags);
 	if (rc)
 		return IRQ_NONE;
 
@@ -749,14 +749,17 @@ static int ghes_notify_hed(struct notifier_block *this, unsigned long event,
 			   void *data)
 {
 	struct ghes *ghes;
+	unsigned long flags;
 	int ret = NOTIFY_DONE;
 
+	spin_lock_irqsave(&ghes_notify_lock_irq, flags);
 	rcu_read_lock();
 	list_for_each_entry_rcu(ghes, &ghes_hed, list) {
 		if (!ghes_proc(ghes))
 			ret = NOTIFY_OK;
 	}
 	rcu_read_unlock();
+	spin_unlock_irqrestore(&ghes_notify_lock_irq, flags);
 
 	return ret;
 }
@@ -906,6 +909,7 @@ static int ghes_estatus_queue_notified(struct list_head *rcu_list)
 }
 
 #ifdef CONFIG_ACPI_APEI_SEA
+static DEFINE_RAW_SPINLOCK(ghes_notify_lock_sea);
 static LIST_HEAD(ghes_sea);
 
 /*
@@ -914,7 +918,13 @@ static LIST_HEAD(ghes_sea);
  */
 int ghes_notify_sea(void)
 {
-	return ghes_estatus_queue_notified(&ghes_sea);
+	int rv;
+
+	raw_spin_lock(&ghes_notify_lock_sea);
+	rv = ghes_estatus_queue_notified(&ghes_sea);
+	raw_spin_unlock(&ghes_notify_lock_sea);
+
+	return rv;
 }
 
 static void ghes_sea_add(struct ghes *ghes)
@@ -943,6 +953,7 @@ static inline void ghes_sea_remove(struct ghes *ghes) { }
  */
 static atomic_t ghes_in_nmi = ATOMIC_INIT(0);
 
+static DEFINE_RAW_SPINLOCK(ghes_notify_lock_nmi);
 static LIST_HEAD(ghes_nmi);
 
 static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
@@ -952,8 +963,10 @@ static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 	if (!atomic_add_unless(&ghes_in_nmi, 1, 1))
 		return ret;
 
+	raw_spin_lock(&ghes_notify_lock_nmi);
 	if (!ghes_estatus_queue_notified(&ghes_nmi))
 		ret = NMI_HANDLED;
+	raw_spin_unlock(&ghes_notify_lock_nmi);
 
 	atomic_dec(&ghes_in_nmi);
 	return ret;
@@ -995,6 +1008,7 @@ static int ghes_probe(struct platform_device *ghes_dev)
 {
 	struct acpi_hest_generic *generic;
 	struct ghes *ghes = NULL;
+	unsigned long flags;
 
 	int rc = -EINVAL;
 
@@ -1097,7 +1111,9 @@ static int ghes_probe(struct platform_device *ghes_dev)
 	ghes_edac_register(ghes, &ghes_dev->dev);
 
 	/* Handle any pending errors right away */
+	spin_lock_irqsave(&ghes_notify_lock_irq, flags);
 	ghes_proc(ghes);
+	spin_unlock_irqrestore(&ghes_notify_lock_irq, flags);
 
 	return 0;
 
-- 
2.19.2


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v7 16/25] ACPI / APEI: Let the notification helper specify the fixmap slot
  2018-12-03 18:05 [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up James Morse
                   ` (14 preceding siblings ...)
  2018-12-03 18:06 ` [PATCH v7 15/25] ACPI / APEI: Move locking to the notification helper James Morse
@ 2018-12-03 18:06 ` James Morse
  2018-12-03 18:06 ` [PATCH v7 17/25] ACPI / APEI: Pass ghes and estatus separately to avoid a later copy James Morse
                   ` (8 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: James Morse @ 2018-12-03 18:06 UTC (permalink / raw)
  To: linux-acpi
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier,
	Catalin Marinas, Will Deacon, Christoffer Dall, Dongjiu Geng,
	linux-mm, Borislav Petkov, James Morse, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

ghes_copy_tofrom_phys() uses a different fixmap slot depending on in_nmi().
This doesn't work when there are multiple NMI-like notifications that
could interrupt each other.

As with the locking, move the chosen fixmap_idx to the notification helper.
This only matters for NMI-like notifications; anything calling
ghes_proc() can use the IRQ fixmap slot as it's already holding an irqsave
spinlock.

This lets us collapse the ghes_ioremap_pfn_*() helpers.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
---

The fixmap-idx and vaddr are passed back to ghes_unmap()
to allow ioremap() to be used in process context in the
future. This will let us send tlbi-ipi for notifications
that don't mask irqs.
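
The page-at-a-time copy with a caller-chosen slot can be sketched in
userspace C. This is a model under loud assumptions, not the kernel
implementation: "physical memory" is a plain array, the page size is
shrunk to 16 bytes so a short copy crosses pages, and every model_*()
name is hypothetical. Only the shape of the loop (offset within the
page, chunk clamped to the remaining length, map and unmap through the
slot passed in) follows ghes_copy_tofrom_phys().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_PAGE_SIZE	16u	/* tiny page so copies cross pages */
#define MODEL_NR_PAGES	4u

enum model_slot { MODEL_SLOT_IRQ, MODEL_SLOT_NMI, MODEL_NR_SLOTS };

static uint8_t model_phys[MODEL_NR_PAGES * MODEL_PAGE_SIZE];
static uint8_t *model_slots[MODEL_NR_SLOTS];

/* Point the caller's slot at one page of the fake physical memory. */
static uint8_t *model_map(uint64_t pfn, enum model_slot slot)
{
	model_slots[slot] = &model_phys[pfn * MODEL_PAGE_SIZE];
	return model_slots[slot];
}

static void model_unmap(enum model_slot slot)
{
	model_slots[slot] = NULL;
}

/* Copy len bytes starting at paddr, one page-sized chunk at a time. */
static void model_copy_from_phys(void *buffer, uint64_t paddr, uint32_t len,
				 enum model_slot slot)
{
	uint8_t *out = buffer;

	while (len > 0) {
		uint32_t offset = paddr % MODEL_PAGE_SIZE;
		uint32_t chunk = MODEL_PAGE_SIZE - offset;
		uint8_t *vaddr = model_map(paddr / MODEL_PAGE_SIZE, slot);

		if (chunk > len)
			chunk = len;
		memcpy(out, vaddr + offset, chunk);
		len -= chunk;
		paddr += chunk;
		out += chunk;
		model_unmap(slot);
	}
}
```

Because the slot is an argument, two callers that can interrupt each
other only need to pass different slots; the copy loop itself stays
unaware of the context it runs in.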
---
 drivers/acpi/apei/ghes.c | 79 +++++++++++++++-------------------------
 1 file changed, 30 insertions(+), 49 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 30490eff7704..b5c31f65a1c0 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -41,6 +41,7 @@
 #include <linux/llist.h>
 #include <linux/genalloc.h>
 #include <linux/pci.h>
+#include <linux/pfn.h>
 #include <linux/aer.h>
 #include <linux/nmi.h>
 #include <linux/sched/clock.h>
@@ -127,38 +128,24 @@ static atomic_t ghes_estatus_cache_alloced;
 
 static int ghes_panic_timeout __read_mostly = 30;
 
-static void __iomem *ghes_ioremap_pfn_nmi(u64 pfn)
+static void __iomem *ghes_map(u64 pfn, int fixmap_idx)
 {
 	phys_addr_t paddr;
 	pgprot_t prot;
 
-	paddr = pfn << PAGE_SHIFT;
+	paddr = PFN_PHYS(pfn);
 	prot = arch_apei_get_mem_attribute(paddr);
-	__set_fixmap(FIX_APEI_GHES_NMI, paddr, prot);
+	__set_fixmap(fixmap_idx, paddr, prot);
 
-	return (void __iomem *) fix_to_virt(FIX_APEI_GHES_NMI);
+	return (void __iomem *) __fix_to_virt(fixmap_idx);
 }
 
-static void __iomem *ghes_ioremap_pfn_irq(u64 pfn)
+static void ghes_unmap(int fixmap_idx, void __iomem *vaddr)
 {
-	phys_addr_t paddr;
-	pgprot_t prot;
-
-	paddr = pfn << PAGE_SHIFT;
-	prot = arch_apei_get_mem_attribute(paddr);
-	__set_fixmap(FIX_APEI_GHES_IRQ, paddr, prot);
+	int _idx = virt_to_fix((unsigned long)vaddr);
 
-	return (void __iomem *) fix_to_virt(FIX_APEI_GHES_IRQ);
-}
-
-static void ghes_iounmap_nmi(void)
-{
-	clear_fixmap(FIX_APEI_GHES_NMI);
-}
-
-static void ghes_iounmap_irq(void)
-{
-	clear_fixmap(FIX_APEI_GHES_IRQ);
+	WARN_ON_ONCE(fixmap_idx != _idx);
+	clear_fixmap(fixmap_idx);
 }
 
 int ghes_estatus_pool_init(int num_ghes)
@@ -268,20 +255,15 @@ static inline int ghes_severity(int severity)
 }
 
 static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
-				  int from_phys)
+				  int from_phys, int fixmap_idx)
 {
 	void __iomem *vaddr;
-	int in_nmi = in_nmi();
 	u64 offset;
 	u32 trunk;
 
 	while (len > 0) {
 		offset = paddr - (paddr & PAGE_MASK);
-		if (in_nmi) {
-			vaddr = ghes_ioremap_pfn_nmi(paddr >> PAGE_SHIFT);
-		} else {
-			vaddr = ghes_ioremap_pfn_irq(paddr >> PAGE_SHIFT);
-		}
+		vaddr = ghes_map(PHYS_PFN(paddr), fixmap_idx);
 		trunk = PAGE_SIZE - offset;
 		trunk = min(trunk, len);
 		if (from_phys)
@@ -291,15 +273,12 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
 		len -= trunk;
 		paddr += trunk;
 		buffer += trunk;
-		if (in_nmi) {
-			ghes_iounmap_nmi();
-		} else {
-			ghes_iounmap_irq();
-		}
+		ghes_unmap(fixmap_idx, vaddr);
 	}
 }
 
-static int ghes_read_estatus(struct ghes *ghes, u64 *buf_paddr)
+static int ghes_read_estatus(struct ghes *ghes, u64 *buf_paddr, int fixmap_idx)
+
 {
 	struct acpi_hest_generic *g = ghes->generic;
 	u32 len;
@@ -317,7 +296,7 @@ static int ghes_read_estatus(struct ghes *ghes, u64 *buf_paddr)
 		return -ENOENT;
 
 	ghes_copy_tofrom_phys(ghes->estatus, *buf_paddr,
-			      sizeof(*ghes->estatus), 1);
+			      sizeof(*ghes->estatus), 1, fixmap_idx);
 	if (!ghes->estatus->block_status) {
 		*buf_paddr = 0;
 		return -ENOENT;
@@ -333,7 +312,7 @@ static int ghes_read_estatus(struct ghes *ghes, u64 *buf_paddr)
 		goto err_read_block;
 	ghes_copy_tofrom_phys(ghes->estatus + 1,
 			      *buf_paddr + sizeof(*ghes->estatus),
-			      len - sizeof(*ghes->estatus), 1);
+			      len - sizeof(*ghes->estatus), 1, fixmap_idx);
 	if (cper_estatus_check(ghes->estatus))
 		goto err_read_block;
 	rc = 0;
@@ -346,12 +325,13 @@ static int ghes_read_estatus(struct ghes *ghes, u64 *buf_paddr)
 	return rc;
 }
 
-static void ghes_clear_estatus(struct ghes *ghes, u64 buf_paddr)
+static void ghes_clear_estatus(struct ghes *ghes, u64 buf_paddr, int fixmap_idx)
 {
 	ghes->estatus->block_status = 0;
 	if (buf_paddr)
 		ghes_copy_tofrom_phys(ghes->estatus, buf_paddr,
-				      sizeof(ghes->estatus->block_status), 0);
+				      sizeof(ghes->estatus->block_status), 0,
+				      fixmap_idx);
 }
 
 static void ghes_handle_memory_failure(struct acpi_hest_generic_data *gdata, int sev)
@@ -673,7 +653,7 @@ static int ghes_proc(struct ghes *ghes)
 	u64 buf_paddr;
 	int rc;
 
-	rc = ghes_read_estatus(ghes, &buf_paddr);
+	rc = ghes_read_estatus(ghes, &buf_paddr, FIX_APEI_GHES_IRQ);
 	if (rc)
 		goto out;
 
@@ -688,7 +668,7 @@ static int ghes_proc(struct ghes *ghes)
 	ghes_do_proc(ghes, ghes->estatus);
 
 out:
-	ghes_clear_estatus(ghes, buf_paddr);
+	ghes_clear_estatus(ghes, buf_paddr, FIX_APEI_GHES_IRQ);
 
 	if (rc == -ENOENT)
 		return rc;
@@ -864,13 +844,13 @@ static void __process_error(struct ghes *ghes)
 #endif
 }
 
-static int _in_nmi_notify_one(struct ghes *ghes)
+static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
 {
 	u64 buf_paddr;
 	int sev;
 
-	if (ghes_read_estatus(ghes, &buf_paddr)) {
-		ghes_clear_estatus(ghes, buf_paddr);
+	if (ghes_read_estatus(ghes, &buf_paddr, fixmap_idx)) {
+		ghes_clear_estatus(ghes, buf_paddr, fixmap_idx);
 		return -ENOENT;
 	}
 
@@ -881,7 +861,7 @@ static int _in_nmi_notify_one(struct ghes *ghes)
 	}
 
 	__process_error(ghes);
-	ghes_clear_estatus(ghes, buf_paddr);
+	ghes_clear_estatus(ghes, buf_paddr, fixmap_idx);
 
 	if (is_hest_type_generic_v2(ghes) && ghes_ack_error(ghes->generic_v2))
 		pr_warn_ratelimited(FW_WARN GHES_PFX
@@ -890,14 +870,15 @@ static int _in_nmi_notify_one(struct ghes *ghes)
 	return 0;
 }
 
-static int ghes_estatus_queue_notified(struct list_head *rcu_list)
+static int ghes_estatus_queue_notified(struct list_head *rcu_list,
+				       int fixmap_idx)
 {
 	int ret = -ENOENT;
 	struct ghes *ghes;
 
 	rcu_read_lock();
 	list_for_each_entry_rcu(ghes, rcu_list, list) {
-		if (!_in_nmi_notify_one(ghes))
+		if (!_in_nmi_notify_one(ghes, fixmap_idx))
 			ret = 0;
 	}
 	rcu_read_unlock();
@@ -921,7 +902,7 @@ int ghes_notify_sea(void)
 	int rv;
 
 	raw_spin_lock(&ghes_notify_lock_sea);
-	rv = ghes_estatus_queue_notified(&ghes_sea);
+	rv = ghes_estatus_queue_notified(&ghes_sea, FIX_APEI_GHES_NMI);
 	raw_spin_unlock(&ghes_notify_lock_sea);
 
 	return rv;
@@ -964,7 +945,7 @@ static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 		return ret;
 
 	raw_spin_lock(&ghes_notify_lock_nmi);
-	if (!ghes_estatus_queue_notified(&ghes_nmi))
+	if (!ghes_estatus_queue_notified(&ghes_nmi, FIX_APEI_GHES_NMI))
 		ret = NMI_HANDLED;
 	raw_spin_unlock(&ghes_notify_lock_nmi);
 
-- 
2.19.2


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v7 17/25] ACPI / APEI: Pass ghes and estatus separately to avoid a later copy
  2018-12-03 18:05 [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up James Morse
                   ` (15 preceding siblings ...)
  2018-12-03 18:06 ` [PATCH v7 16/25] ACPI / APEI: Let the notification helper specify the fixmap slot James Morse
@ 2018-12-03 18:06 ` James Morse
  2019-01-21 13:35   ` Borislav Petkov
  2018-12-03 18:06 ` [PATCH v7 18/25] ACPI / APEI: Split ghes_read_estatus() to allow a peek at the CPER length James Morse
                   ` (7 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: James Morse @ 2018-12-03 18:06 UTC (permalink / raw)
  To: linux-acpi
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier,
	Catalin Marinas, Will Deacon, Christoffer Dall, Dongjiu Geng,
	linux-mm, Borislav Petkov, James Morse, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

The NMI-like notifications scribble over ghes->estatus, before
copying it somewhere else. If this interrupts the ghes_probe() code
calling ghes_proc() on each struct ghes, the data is corrupted.

All the NMI-like notifications should use a queued estatus entry
from the beginning, instead of reading into the ghes version and then
copying it. To do this, break up any use of "ghes->estatus" so that all
functions take the estatus as an argument.

This patch just moves these ghes->estatus dereferences into separate
arguments; there is no change in behaviour. struct ghes becomes unused in
ghes_clear_estatus() as it only wanted ghes->estatus, which we now
pass directly, so that argument is removed.

Signed-off-by: James Morse <james.morse@arm.com>

---
Changes since v6:
 * Changed subject
 * Renamed ghes_estatus to src_estatus, which is a little clearer
 * Removed struct ghes from ghes_clear_estatus() now that this becomes
   unused in this patch.
 * Mangled the commit message to be different
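
The effect of passing the destination buffer separately can be shown
with a small userspace model. The struct and function names below are
hypothetical and greatly simplified; the sketch only illustrates that
once the reader takes its destination as an argument, an interrupting
context can fill its own buffer and leave the per-source one untouched.

```c
/* Simplified stand-in for struct acpi_hest_generic_status. */
struct model_status {
	unsigned int severity;
};

/* Simplified stand-in for struct ghes and its embedded buffer. */
struct model_source {
	struct model_status shared;
};

/* The reader no longer assumes the per-source buffer as destination. */
static void model_read_status(unsigned int fw_value,
			      struct model_status *dst)
{
	dst->severity = fw_value;
}
```

A process-context caller would pass &src.shared, while an NMI-like
caller would pass a separately queued entry.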
---
 drivers/acpi/apei/ghes.c | 84 +++++++++++++++++++++-------------------
 1 file changed, 45 insertions(+), 39 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index b5c31f65a1c0..b70f5fd962cc 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -277,8 +277,9 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
 	}
 }
 
-static int ghes_read_estatus(struct ghes *ghes, u64 *buf_paddr, int fixmap_idx)
-
+static int ghes_read_estatus(struct ghes *ghes,
+			     struct acpi_hest_generic_status *estatus,
+			     u64 *buf_paddr, int fixmap_idx)
 {
 	struct acpi_hest_generic *g = ghes->generic;
 	u32 len;
@@ -295,25 +296,25 @@ static int ghes_read_estatus(struct ghes *ghes, u64 *buf_paddr, int fixmap_idx)
 	if (!*buf_paddr)
 		return -ENOENT;
 
-	ghes_copy_tofrom_phys(ghes->estatus, *buf_paddr,
-			      sizeof(*ghes->estatus), 1, fixmap_idx);
-	if (!ghes->estatus->block_status) {
+	ghes_copy_tofrom_phys(estatus, *buf_paddr, sizeof(*estatus), 1,
+			      fixmap_idx);
+	if (!estatus->block_status) {
 		*buf_paddr = 0;
 		return -ENOENT;
 	}
 
 	rc = -EIO;
-	len = cper_estatus_len(ghes->estatus);
-	if (len < sizeof(*ghes->estatus))
+	len = cper_estatus_len(estatus);
+	if (len < sizeof(*estatus))
 		goto err_read_block;
 	if (len > ghes->generic->error_block_length)
 		goto err_read_block;
-	if (cper_estatus_check_header(ghes->estatus))
+	if (cper_estatus_check_header(estatus))
 		goto err_read_block;
-	ghes_copy_tofrom_phys(ghes->estatus + 1,
-			      *buf_paddr + sizeof(*ghes->estatus),
-			      len - sizeof(*ghes->estatus), 1, fixmap_idx);
-	if (cper_estatus_check(ghes->estatus))
+	ghes_copy_tofrom_phys(estatus + 1,
+			      *buf_paddr + sizeof(*estatus),
+			      len - sizeof(*estatus), 1, fixmap_idx);
+	if (cper_estatus_check(estatus))
 		goto err_read_block;
 	rc = 0;
 
@@ -325,12 +326,13 @@ static int ghes_read_estatus(struct ghes *ghes, u64 *buf_paddr, int fixmap_idx)
 	return rc;
 }
 
-static void ghes_clear_estatus(struct ghes *ghes, u64 buf_paddr, int fixmap_idx)
+static void ghes_clear_estatus(struct acpi_hest_generic_status *estatus,
+			       u64 buf_paddr, int fixmap_idx)
 {
-	ghes->estatus->block_status = 0;
+	estatus->block_status = 0;
 	if (buf_paddr)
-		ghes_copy_tofrom_phys(ghes->estatus, buf_paddr,
-				      sizeof(ghes->estatus->block_status), 0,
+		ghes_copy_tofrom_phys(estatus, buf_paddr,
+				      sizeof(estatus->block_status), 0,
 				      fixmap_idx);
 }
 
@@ -638,9 +640,10 @@ static int ghes_ack_error(struct acpi_hest_generic_v2 *gv2)
 	return apei_write(val, &gv2->read_ack_register);
 }
 
-static void __ghes_panic(struct ghes *ghes)
+static void __ghes_panic(struct ghes *ghes,
+			 struct acpi_hest_generic_status *estatus)
 {
-	__ghes_print_estatus(KERN_EMERG, ghes->generic, ghes->estatus);
+	__ghes_print_estatus(KERN_EMERG, ghes->generic, estatus);
 
 	/* reboot to log the error! */
 	if (!panic_timeout)
@@ -650,25 +653,25 @@ static void __ghes_panic(struct ghes *ghes)
 
 static int ghes_proc(struct ghes *ghes)
 {
+	struct acpi_hest_generic_status *estatus = ghes->estatus;
 	u64 buf_paddr;
 	int rc;
 
-	rc = ghes_read_estatus(ghes, &buf_paddr, FIX_APEI_GHES_IRQ);
+	rc = ghes_read_estatus(ghes, estatus, &buf_paddr, FIX_APEI_GHES_IRQ);
 	if (rc)
 		goto out;
 
-	if (ghes_severity(ghes->estatus->error_severity) >= GHES_SEV_PANIC) {
-		__ghes_panic(ghes);
-	}
+	if (ghes_severity(estatus->error_severity) >= GHES_SEV_PANIC)
+		__ghes_panic(ghes, estatus);
 
-	if (!ghes_estatus_cached(ghes->estatus)) {
-		if (ghes_print_estatus(NULL, ghes->generic, ghes->estatus))
-			ghes_estatus_cache_add(ghes->generic, ghes->estatus);
+	if (!ghes_estatus_cached(estatus)) {
+		if (ghes_print_estatus(NULL, ghes->generic, estatus))
+			ghes_estatus_cache_add(ghes->generic, estatus);
 	}
-	ghes_do_proc(ghes, ghes->estatus);
+	ghes_do_proc(ghes, estatus);
 
 out:
-	ghes_clear_estatus(ghes, buf_paddr, FIX_APEI_GHES_IRQ);
+	ghes_clear_estatus(estatus, buf_paddr, FIX_APEI_GHES_IRQ);
 
 	if (rc == -ENOENT)
 		return rc;
@@ -819,17 +822,20 @@ static void ghes_print_queued_estatus(void)
 }
 
 /* Save estatus for further processing in IRQ context */
-static void __process_error(struct ghes *ghes)
+static void __process_error(struct ghes *ghes,
+			    struct acpi_hest_generic_status *src_estatus)
 {
-#ifdef CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG
 	u32 len, node_len;
 	struct ghes_estatus_node *estatus_node;
 	struct acpi_hest_generic_status *estatus;
 
-	if (ghes_estatus_cached(ghes->estatus))
+	if (!IS_ENABLED(CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG))
 		return;
 
-	len = cper_estatus_len(ghes->estatus);
+	if (ghes_estatus_cached(src_estatus))
+		return;
+
+	len = cper_estatus_len(src_estatus);
 	node_len = GHES_ESTATUS_NODE_LEN(len);
 
 	estatus_node = (void *)gen_pool_alloc(ghes_estatus_pool, node_len);
@@ -839,29 +845,29 @@ static void __process_error(struct ghes *ghes)
 	estatus_node->ghes = ghes;
 	estatus_node->generic = ghes->generic;
 	estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
-	memcpy(estatus, ghes->estatus, len);
+	memcpy(estatus, src_estatus, len);
 	llist_add(&estatus_node->llnode, &ghes_estatus_llist);
-#endif
 }
 
 static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
 {
+	struct acpi_hest_generic_status *estatus = ghes->estatus;
 	u64 buf_paddr;
 	int sev;
 
-	if (ghes_read_estatus(ghes, &buf_paddr, fixmap_idx)) {
-		ghes_clear_estatus(ghes, buf_paddr, fixmap_idx);
+	if (ghes_read_estatus(ghes, estatus, &buf_paddr, fixmap_idx)) {
+		ghes_clear_estatus(estatus, buf_paddr, fixmap_idx);
 		return -ENOENT;
 	}
 
-	sev = ghes_severity(ghes->estatus->error_severity);
+	sev = ghes_severity(estatus->error_severity);
 	if (sev >= GHES_SEV_PANIC) {
 		ghes_print_queued_estatus();
-		__ghes_panic(ghes);
+		__ghes_panic(ghes, estatus);
 	}
 
-	__process_error(ghes);
-	ghes_clear_estatus(ghes, buf_paddr, fixmap_idx);
+	__process_error(ghes, estatus);
+	ghes_clear_estatus(estatus, buf_paddr, fixmap_idx);
 
 	if (is_hest_type_generic_v2(ghes) && ghes_ack_error(ghes->generic_v2))
 		pr_warn_ratelimited(FW_WARN GHES_PFX
-- 
2.19.2


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v7 18/25] ACPI / APEI: Split ghes_read_estatus() to allow a peek at the CPER length
  2018-12-03 18:05 [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up James Morse
                   ` (16 preceding siblings ...)
  2018-12-03 18:06 ` [PATCH v7 17/25] ACPI / APEI: Pass ghes and estatus separately to avoid a later copy James Morse
@ 2018-12-03 18:06 ` James Morse
  2019-01-21 13:53   ` Borislav Petkov
  2018-12-03 18:06 ` [PATCH v7 19/25] ACPI / APEI: Only use queued estatus entry during _in_nmi_notify_one() James Morse
                   ` (6 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: James Morse @ 2018-12-03 18:06 UTC (permalink / raw)
  To: linux-acpi
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier,
	Catalin Marinas, Will Deacon, Christoffer Dall, Dongjiu Geng,
	linux-mm, Borislav Petkov, James Morse, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

ghes_read_estatus() reads the record address, then the record's
header, then performs some sanity checks before reading the
records into the provided estatus buffer.

To provide this estatus buffer the caller must know the size of the
records in advance, or always provide a worst-case sized buffer as
happens today for the non-NMI notifications.

Add a function to peek at the record's header to find the size. This
will let the NMI path allocate the right amount of memory before reading
the records, instead of using the worst-case size, and having to copy
the records.

Split ghes_read_estatus() to create __ghes_peek_estatus() which
returns the address and size of the CPER records.

Signed-off-by: James Morse <james.morse@arm.com>

---
Changes since v6:
 * Additional buf_addr = 0 error handling
 * Moved checking out of peek-estatus
 * Reworded an error message so we can tell them apart
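
The peek, check, read sequence can be sketched in userspace C. This is
a model under stated assumptions, not the ghes.c code: the header
layout, the model_*() names and the use of malloc() are all
hypothetical (the kernel's NMI path would allocate from the estatus
pool instead). It shows why peeking at the header first lets the caller
size its buffer exactly.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, much simplified record header. */
struct model_hdr {
	uint32_t block_status;	/* zero means "no record present" */
	uint32_t total_len;	/* header plus payload, in bytes */
};

/* Step 1: copy only the header, enough to learn the record length. */
static int model_peek(const uint8_t *fw_buf, struct model_hdr *hdr)
{
	memcpy(hdr, fw_buf, sizeof(*hdr));
	return hdr->block_status ? 0 : -ENOENT;
}

/* Step 2: sanity-check the advertised length before trusting it. */
static int model_check(const struct model_hdr *hdr, uint32_t max_len)
{
	if (hdr->total_len < sizeof(*hdr) || hdr->total_len > max_len)
		return -EIO;
	return 0;
}

/* Step 3: allocate exactly total_len bytes, then read the whole block. */
static uint8_t *model_read(const uint8_t *fw_buf, uint32_t max_len, int *err)
{
	struct model_hdr hdr;
	uint8_t *copy;

	*err = model_peek(fw_buf, &hdr);
	if (*err)
		return NULL;

	*err = model_check(&hdr, max_len);
	if (*err)
		return NULL;

	copy = malloc(hdr.total_len);
	if (!copy) {
		*err = -ENOMEM;
		return NULL;
	}
	memcpy(copy, fw_buf, hdr.total_len);
	return copy;
}
```

The caller owns the returned buffer and must free() it; on failure the
function returns NULL and reports the reason through err.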
---
 drivers/acpi/apei/ghes.c | 59 ++++++++++++++++++++++++++++++++--------
 1 file changed, 47 insertions(+), 12 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index b70f5fd962cc..07a12aac4c1a 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -277,12 +277,12 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
 	}
 }
 
-static int ghes_read_estatus(struct ghes *ghes,
-			     struct acpi_hest_generic_status *estatus,
-			     u64 *buf_paddr, int fixmap_idx)
+/* Read the CPER block, returning its address, and header in estatus. */
+static int __ghes_peek_estatus(struct ghes *ghes, int fixmap_idx,
+			       struct acpi_hest_generic_status *estatus,
+			       u64 *buf_paddr)
 {
 	struct acpi_hest_generic *g = ghes->generic;
-	u32 len;
 	int rc;
 
 	rc = apei_read(buf_paddr, &g->error_status_address);
@@ -303,29 +303,64 @@ static int ghes_read_estatus(struct ghes *ghes,
 		return -ENOENT;
 	}
 
-	rc = -EIO;
-	len = cper_estatus_len(estatus);
+	return 0;
+}
+
+/* Check the top-level record header has an appropriate size. */
+static int __ghes_check_estatus(struct ghes *ghes,
+				struct acpi_hest_generic_status *estatus)
+{
+	u32 len = cper_estatus_len(estatus);
+	int rc = -EIO;
+
 	if (len < sizeof(*estatus))
 		goto err_read_block;
 	if (len > ghes->generic->error_block_length)
 		goto err_read_block;
 	if (cper_estatus_check_header(estatus))
 		goto err_read_block;
-	ghes_copy_tofrom_phys(estatus + 1,
-			      *buf_paddr + sizeof(*estatus),
-			      len - sizeof(*estatus), 1, fixmap_idx);
-	if (cper_estatus_check(estatus))
-		goto err_read_block;
 	rc = 0;
 
 err_read_block:
 	if (rc)
 		pr_warn_ratelimited(FW_WARN GHES_PFX
-				    "Failed to read error status block!\n");
+				    "Invalid Error status block!\n");
 
 	return rc;
 }
 
+static int __ghes_read_estatus(struct acpi_hest_generic_status *estatus,
+			       u64 buf_paddr, size_t buf_len,
+			       int fixmap_idx)
+{
+	ghes_copy_tofrom_phys(estatus, buf_paddr, buf_len, 1, fixmap_idx);
+	if (cper_estatus_check(estatus)) {
+		pr_warn_ratelimited(FW_WARN GHES_PFX
+				   "Failed to read error status block!\n");
+		return -EIO;
+	}
+
+	return 0;
+}
+
+static int ghes_read_estatus(struct ghes *ghes,
+			     struct acpi_hest_generic_status *estatus,
+			     u64 *buf_paddr, int fixmap_idx)
+{
+	int rc;
+
+	rc = __ghes_peek_estatus(ghes, fixmap_idx, estatus, buf_paddr);
+	if (rc)
+		return rc;
+
+	rc = __ghes_check_estatus(ghes, estatus);
+	if (rc)
+		return rc;
+
+	return __ghes_read_estatus(estatus, *buf_paddr,
+				   cper_estatus_len(estatus), fixmap_idx);
+}
+
 static void ghes_clear_estatus(struct acpi_hest_generic_status *estatus,
 			       u64 buf_paddr, int fixmap_idx)
 {
-- 
2.19.2


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v7 19/25] ACPI / APEI: Only use queued estatus entry during _in_nmi_notify_one()
  2018-12-03 18:05 [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up James Morse
                   ` (17 preceding siblings ...)
  2018-12-03 18:06 ` [PATCH v7 18/25] ACPI / APEI: Split ghes_read_estatus() to allow a peek at the CPER length James Morse
@ 2018-12-03 18:06 ` James Morse
  2019-01-21 17:19   ` Borislav Petkov
  2018-12-03 18:06 ` [PATCH v7 20/25] ACPI / APEI: Use separate fixmap pages for arm64 NMI-like notifications James Morse
                   ` (5 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: James Morse @ 2018-12-03 18:06 UTC (permalink / raw)
  To: linux-acpi
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier,
	Catalin Marinas, Will Deacon, Christoffer Dall, Dongjiu Geng,
	linux-mm, Borislav Petkov, James Morse, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

Each struct ghes has a worst-case sized buffer for storing the
estatus. If an error is being processed by ghes_proc() in process
context this buffer will be in use. If the error source then triggers
an NMI-like notification, the same buffer will be used by
_in_nmi_notify_one() to stage the estatus data, before
__process_error() copies it into a queued estatus entry.

Merge __process_error()'s work into _in_nmi_notify_one() so that
the queued estatus entry is used from the beginning. Use the new
__ghes_peek_estatus() to know how much memory to allocate from
the ghes_estatus_pool before reading the records.
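
A small user-space sketch of the idea, for illustration only: the node is
allocated at exactly the size the peeked header reported, the record is
copied straight into it, and the node is released again if the record
turns out to be unusable. malloc()/free() stand in for the estatus pool,
and all demo_* names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Queue node; the record payload is stored directly after it. */
struct demo_node {
	struct demo_node *next;	/* queue linkage */
	uint32_t len;		/* payload bytes stored after this header */
};

#define DEMO_NODE_LEN(payload_len)	(sizeof(struct demo_node) + (payload_len))
#define DEMO_PAYLOAD(node)		((void *)((struct demo_node *)(node) + 1))

/*
 * Stage a record in a node sized for it and queue the node.
 * Returns NULL, with nothing queued, on allocation failure or when the
 * caller reports the record as invalid.
 */
static struct demo_node *demo_stage_record(struct demo_node **head,
					   const void *record, uint32_t len,
					   int record_is_valid)
{
	struct demo_node *node;

	assert(head != NULL);

	node = malloc(DEMO_NODE_LEN(len));
	if (!node)
		return NULL;

	node->len = len;
	memcpy(DEMO_PAYLOAD(node), record, len);

	if (!record_is_valid) {
		/* Mirrors the 'no_work' path: give the memory back. */
		free(node);
		return NULL;
	}

	node->next = *head;
	*head = node;
	return node;
}
```

Because the record lands in its final home first, no shared staging buffer
is touched on this path.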

Reported-by: Borislav Petkov <bp@suse.de>
Signed-off-by: James Morse <james.morse@arm.com>

Change since v6:
 * Added a comment explaining the 'ack-error, then goto no_work'.
 * Added missing estatus-clearing, which is necessary after reading the GAS.
---
 drivers/acpi/apei/ghes.c | 59 ++++++++++++++++++++++++----------------
 1 file changed, 35 insertions(+), 24 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 07a12aac4c1a..849da0d43a21 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -856,43 +856,43 @@ static void ghes_print_queued_estatus(void)
 	}
 }
 
-/* Save estatus for further processing in IRQ context */
-static void __process_error(struct ghes *ghes,
-			    struct acpi_hest_generic_status *src_estatus)
+static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
 {
-	u32 len, node_len;
+	struct acpi_hest_generic_status *estatus, tmp_header;
 	struct ghes_estatus_node *estatus_node;
-	struct acpi_hest_generic_status *estatus;
+	u32 len, node_len;
+	u64 buf_paddr;
+	int sev, rc;
 
 	if (!IS_ENABLED(CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG))
-		return;
+		return -EOPNOTSUPP;
 
-	if (ghes_estatus_cached(src_estatus))
-		return;
+	rc = __ghes_peek_estatus(ghes, fixmap_idx, &tmp_header, &buf_paddr);
+	if (rc) {
+		ghes_clear_estatus(&tmp_header, buf_paddr, fixmap_idx);
+		return rc;
+	}
 
-	len = cper_estatus_len(src_estatus);
-	node_len = GHES_ESTATUS_NODE_LEN(len);
+	rc = __ghes_check_estatus(ghes, &tmp_header);
+	if (rc) {
+		ghes_clear_estatus(&tmp_header, buf_paddr, fixmap_idx);
+		return rc;
+	}
 
+	len = cper_estatus_len(&tmp_header);
+	node_len = GHES_ESTATUS_NODE_LEN(len);
 	estatus_node = (void *)gen_pool_alloc(ghes_estatus_pool, node_len);
 	if (!estatus_node)
-		return;
+		return -ENOMEM;
 
 	estatus_node->ghes = ghes;
 	estatus_node->generic = ghes->generic;
 	estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
-	memcpy(estatus, src_estatus, len);
-	llist_add(&estatus_node->llnode, &ghes_estatus_llist);
-}
-
-static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
-{
-	struct acpi_hest_generic_status *estatus = ghes->estatus;
-	u64 buf_paddr;
-	int sev;
 
-	if (ghes_read_estatus(ghes, estatus, &buf_paddr, fixmap_idx)) {
+	if (__ghes_read_estatus(estatus, buf_paddr, len, fixmap_idx)) {
 		ghes_clear_estatus(estatus, buf_paddr, fixmap_idx);
-		return -ENOENT;
+		rc = -ENOENT;
+		goto no_work;
 	}
 
 	sev = ghes_severity(estatus->error_severity);
@@ -901,14 +901,25 @@ static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
 		__ghes_panic(ghes, estatus);
 	}
 
-	__process_error(ghes, estatus);
 	ghes_clear_estatus(estatus, buf_paddr, fixmap_idx);
 
 	if (is_hest_type_generic_v2(ghes) && ghes_ack_error(ghes->generic_v2))
 		pr_warn_ratelimited(FW_WARN GHES_PFX
 				    "Failed to ack error status block!\n");
 
-	return 0;
+	/* This error has been reported before, don't process it again. */
+	if (ghes_estatus_cached(estatus))
+		goto no_work;
+
+	llist_add(&estatus_node->llnode, &ghes_estatus_llist);
+
+	return rc;
+
+no_work:
+	gen_pool_free(ghes_estatus_pool, (unsigned long)estatus_node,
+		      node_len);
+
+	return rc;
 }
 
 static int ghes_estatus_queue_notified(struct list_head *rcu_list,
-- 
2.19.2



* [PATCH v7 20/25] ACPI / APEI: Use separate fixmap pages for arm64 NMI-like notifications
  2018-12-03 18:05 [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up James Morse
                   ` (18 preceding siblings ...)
  2018-12-03 18:06 ` [PATCH v7 19/25] ACPI / APEI: Only use queued estatus entry during _in_nmi_notify_one() James Morse
@ 2018-12-03 18:06 ` James Morse
  2019-01-21 17:27   ` Borislav Petkov
  2018-12-03 18:06 ` [PATCH v7 21/25] mm/memory-failure: Add memory_failure_queue_kick() James Morse
                   ` (4 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: James Morse @ 2018-12-03 18:06 UTC (permalink / raw)
  To: linux-acpi
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier,
	Catalin Marinas, Will Deacon, Christoffer Dall, Dongjiu Geng,
	linux-mm, Borislav Petkov, James Morse, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

Now that ghes notification helpers provide the fixmap slots and
take the lock themselves, multiple NMI-like notifications can
be used on arm64.

These should be named after their notification method as they can't
all be called 'NMI'. x86's NOTIFY_NMI entry already is, so rename the
SEA fixmap entry to FIX_APEI_GHES_SEA.

Future patches can add support for FIX_APEI_GHES_SEI and
FIX_APEI_GHES_SDEI_{NORMAL,CRITICAL}.

Because all of ghes.c builds on both architectures, provide a
constant for each fixmap entry that the architecture will never
use.
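
To illustrate the placeholder-constant trick in isolation, here is a
self-contained sketch (not taken from the patch; the DEMO_* names and the
DEMO_HAVE_* switches are hypothetical). The "architecture" only provides
the SEA slot, and the common code fills the gap for the NMI slot so that
both names compile everywhere:

```c
#include <assert.h>
#include <stdbool.h>

/* Pretend architecture configuration: SEA exists, NMI does not. */
#define DEMO_HAVE_SEA 1

/* Pretend architecture header: only the slots this arch can use. */
enum demo_fixed_addresses {
	DEMO_FIX_IRQ,
#ifdef DEMO_HAVE_SEA
	DEMO_FIX_SEA,
#endif
	DEMO_FIX_END
};

/* Common code: give every missing slot a value that is never valid. */
#ifndef DEMO_HAVE_NMI
#define DEMO_FIX_NMI	(-1)
#endif
#ifndef DEMO_HAVE_SEA
#define DEMO_FIX_SEA	(-1)
#endif

/* True only for slots the architecture really provides. */
static bool demo_slot_is_real(int idx)
{
	return idx >= 0 && idx < DEMO_FIX_END;
}

/* Mapping through a placeholder would be a bug, so trap it. */
static int demo_map_slot(int idx)
{
	assert(demo_slot_is_real(idx));
	return idx;
}
```

The placeholder is only ever referenced from code paths the architecture
cannot reach, so its value just has to be distinguishable from a real slot.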

Signed-off-by: James Morse <james.morse@arm.com>

---
Changes since v6:
 * Added #ifdef definitions of each missing fixmap entry.

Changes since v3:
 * idx/lock are now in a separate struct.
 * Add to the comment above ghes_fixmap_lock_irq so that it makes more
   sense in isolation.

fixup for split fixmap
---
 arch/arm64/include/asm/fixmap.h |  2 +-
 drivers/acpi/apei/ghes.c        | 10 +++++++++-
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
index ec1e6d6fa14c..966dd4bb23f2 100644
--- a/arch/arm64/include/asm/fixmap.h
+++ b/arch/arm64/include/asm/fixmap.h
@@ -55,7 +55,7 @@ enum fixed_addresses {
 #ifdef CONFIG_ACPI_APEI_GHES
 	/* Used for GHES mapping from assorted contexts */
 	FIX_APEI_GHES_IRQ,
-	FIX_APEI_GHES_NMI,
+	FIX_APEI_GHES_SEA,
 #endif /* CONFIG_ACPI_APEI_GHES */
 
 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 849da0d43a21..6cbf9471b2a2 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -85,6 +85,14 @@
 	((struct acpi_hest_generic_status *)				\
 	 ((struct ghes_estatus_node *)(estatus_node) + 1))
 
+/* NMI-like notifications vary by architecture. Fill in the fixmap gaps */
+#ifndef CONFIG_HAVE_ACPI_APEI_NMI
+#define FIX_APEI_GHES_NMI	-1
+#endif
+#ifndef CONFIG_ACPI_APEI_SEA
+#define FIX_APEI_GHES_SEA	-1
+#endif
+
 static inline bool is_hest_type_generic_v2(struct ghes *ghes)
 {
 	return ghes->generic->header.type == ACPI_HEST_TYPE_GENERIC_ERROR_V2;
@@ -954,7 +962,7 @@ int ghes_notify_sea(void)
 	int rv;
 
 	raw_spin_lock(&ghes_notify_lock_sea);
-	rv = ghes_estatus_queue_notified(&ghes_sea, FIX_APEI_GHES_NMI);
+	rv = ghes_estatus_queue_notified(&ghes_sea, FIX_APEI_GHES_SEA);
 	raw_spin_unlock(&ghes_notify_lock_sea);
 
 	return rv;
-- 
2.19.2



* [PATCH v7 21/25] mm/memory-failure: Add memory_failure_queue_kick()
  2018-12-03 18:05 [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up James Morse
                   ` (19 preceding siblings ...)
  2018-12-03 18:06 ` [PATCH v7 20/25] ACPI / APEI: Use separate fixmap pages for arm64 NMI-like notifications James Morse
@ 2018-12-03 18:06 ` James Morse
  2018-12-03 18:06 ` [PATCH v7 22/25] ACPI / APEI: Kick the memory_failure() queue for synchronous errors James Morse
                   ` (3 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: James Morse @ 2018-12-03 18:06 UTC (permalink / raw)
  To: linux-acpi
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier,
	Catalin Marinas, Will Deacon, Christoffer Dall, Dongjiu Geng,
	linux-mm, Borislav Petkov, James Morse, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

The GHES code calls memory_failure_queue() from IRQ context to schedule
work on the current CPU so that memory_failure() can sleep.

For synchronous memory errors the arch code needs to know any signals
that memory_failure() will trigger are pending before it returns to
user-space, possibly when exiting from the IRQ.

Add a helper to kick the memory failure queue, to ensure the scheduled
work has happened. This has to be called from process context, so may
have been migrated from the original cpu. Pass the cpu the work was
queued on.

Change memory_failure_work_func() to permit being called on the 'wrong'
cpu.
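
The reason container_of() makes the work function CPU-agnostic can be seen
in a small user-space model (illustrative only; the demo_* types, the
array standing in for per-cpu data and the local container_of variant are
assumptions made for the example):

```c
#include <assert.h>
#include <stddef.h>

/* Local equivalent of the kernel's container_of(), for this example only. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_work {
	int pending;
};

struct demo_mf_cpu {
	int cpu;		/* which CPU this queue belongs to */
	int queued;		/* number of entries waiting */
	struct demo_work work;	/* embedded work item */
};

#define DEMO_NR_CPUS 4
static struct demo_mf_cpu demo_percpu[DEMO_NR_CPUS];

/*
 * Drain the queue that owns 'work'. The owner is derived from the work
 * pointer itself, so the result does not depend on where this runs.
 * Returns the CPU number whose queue was drained.
 */
static int demo_work_func(struct demo_work *work)
{
	struct demo_mf_cpu *mf_cpu;

	mf_cpu = demo_container_of(work, struct demo_mf_cpu, work);
	mf_cpu->queued = 0;
	return mf_cpu->cpu;
}

/* Process the work that was queued on 'cpu', from any caller. */
static int demo_queue_kick(int cpu)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	return demo_work_func(&demo_percpu[cpu].work);
}
```

A "current CPU" lookup would have drained the caller's own queue instead,
which is the wrong one once the task has migrated.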

Signed-off-by: James Morse <james.morse@arm.com>
---
 include/linux/mm.h  |  1 +
 mm/memory-failure.c | 15 ++++++++++++++-
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5411de93a363..37b4884b2a1e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2692,6 +2692,7 @@ enum mf_flags {
 };
 extern int memory_failure(unsigned long pfn, int flags);
 extern void memory_failure_queue(unsigned long pfn, int flags);
+extern void memory_failure_queue_kick(int cpu);
 extern int unpoison_memory(unsigned long pfn);
 extern int get_hwpoison_page(struct page *page);
 #define put_hwpoison_page(page)	put_page(page)
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 0cd3de3550f0..ec05e1dfce37 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1480,7 +1480,7 @@ static void memory_failure_work_func(struct work_struct *work)
 	unsigned long proc_flags;
 	int gotten;
 
-	mf_cpu = this_cpu_ptr(&memory_failure_cpu);
+	mf_cpu = container_of(work, struct memory_failure_cpu, work);
 	for (;;) {
 		spin_lock_irqsave(&mf_cpu->lock, proc_flags);
 		gotten = kfifo_get(&mf_cpu->fifo, &entry);
@@ -1494,6 +1494,19 @@ static void memory_failure_work_func(struct work_struct *work)
 	}
 }
 
+/*
+ * Process memory_failure work queued on the specified CPU.
+ * Used to avoid return-to-userspace racing with the memory_failure workqueue.
+ */
+void memory_failure_queue_kick(int cpu)
+{
+	struct memory_failure_cpu *mf_cpu;
+
+	mf_cpu = &per_cpu(memory_failure_cpu, cpu);
+	cancel_work_sync(&mf_cpu->work);
+	memory_failure_work_func(&mf_cpu->work);
+}
+
 static int __init memory_failure_init(void)
 {
 	struct memory_failure_cpu *mf_cpu;
-- 
2.19.2



* [PATCH v7 22/25] ACPI / APEI: Kick the memory_failure() queue for synchronous errors
  2018-12-03 18:05 [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up James Morse
                   ` (20 preceding siblings ...)
  2018-12-03 18:06 ` [PATCH v7 21/25] mm/memory-failure: Add memory_failure_queue_kick() James Morse
@ 2018-12-03 18:06 ` James Morse
  2018-12-05  2:02   ` Xie XiuQi
  2019-01-21 17:58   ` Borislav Petkov
  2018-12-03 18:06 ` [PATCH v7 23/25] arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work James Morse
                   ` (2 subsequent siblings)
  24 siblings, 2 replies; 72+ messages in thread
From: James Morse @ 2018-12-03 18:06 UTC (permalink / raw)
  To: linux-acpi
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier,
	Catalin Marinas, Will Deacon, Christoffer Dall, Dongjiu Geng,
	linux-mm, Borislav Petkov, James Morse, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

memory_failure() offlines or repairs pages of memory that have been
discovered to be corrupt. These may be detected by an external
component (e.g. the memory controller) and notified via an IRQ.
In this case the work is queued as not all of memory_failure()'s work
can happen in IRQ context.

If the error was detected as a result of user-space accessing a
corrupt memory location the CPU may take an abort instead. On arm64
this is a 'synchronous external abort', and on a firmware-first
system it is replayed using NOTIFY_SEA.

This notification has NMI-like properties (it can interrupt
IRQ-masked code), so the memory_failure() work is queued. If we
return to user-space before the queued memory_failure() work is
processed, we will take the fault again. This loop may cause platform
firmware to exceed some threshold and reboot when Linux could have
recovered from this error.

If a ghes notification type indicates that it may be triggered again
when we return to user-space, use the task-work and notify-resume
hooks to kick the relevant memory_failure() queue before returning
to user-space.
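
The deferral pattern can be modelled in user space as follows. This is a
sketch under stated assumptions, not the kernel's task_work
implementation: struct demo_task, demo_return_to_user() and the other
demo_* names are hypothetical, and a simple linked list stands in for the
task-work machinery.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal callback head, run once before "returning to user-space". */
struct demo_callback {
	struct demo_callback *next;
	void (*func)(struct demo_callback *head);
};

struct demo_task {
	struct demo_callback *works;	/* callbacks pending for this task */
};

/* The callback carries the CPU whose queue must be kicked. */
struct demo_kick {
	int cpu;
	struct demo_callback work;
};

static int demo_last_kicked_cpu = -1;

/* Runs in task context: kick the recorded CPU's queue, then free. */
static void demo_kick_fn(struct demo_callback *head)
{
	struct demo_kick *kick;

	kick = (struct demo_kick *)((char *)head -
				    offsetof(struct demo_kick, work));
	demo_last_kicked_cpu = kick->cpu;
	free(kick);
}

/* Called from the notification path: remember which CPU queued the work. */
static int demo_defer_kick(struct demo_task *task, int cpu)
{
	struct demo_kick *kick = calloc(1, sizeof(*kick));

	if (!kick)
		return -ENOMEM;

	kick->cpu = cpu;
	kick->work.func = demo_kick_fn;
	kick->work.next = task->works;
	task->works = &kick->work;
	return 0;
}

/* Run every pending callback before the task resumes user code. */
static void demo_return_to_user(struct demo_task *task)
{
	while (task->works) {
		struct demo_callback *cb = task->works;

		task->works = cb->next;
		assert(cb->func != NULL);
		cb->func(cb);
	}
}
```

The point of recording the CPU at queue time is that the callback may run
after the task has moved elsewhere.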

Signed-off-by: James Morse <james.morse@arm.com>

---
current->mm == &init_mm ? I couldn't find a helper for this.
The intent is not to set TIF flags on kernel threads. What happens
if a kernel-thread takes one of these? It's just one of the many
not-handled-very-well cases we have already, as memory_failure()
puts it: "try to be lucky".

I assume that if NOTIFY_NMI is coming from SMM it must suffer from
this problem too.
---
 drivers/acpi/apei/ghes.c | 65 ++++++++++++++++++++++++++++++++++++----
 1 file changed, 60 insertions(+), 5 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 6cbf9471b2a2..3e7da9243153 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -47,6 +47,7 @@
 #include <linux/sched/clock.h>
 #include <linux/uuid.h>
 #include <linux/ras.h>
+#include <linux/task_work.h>
 
 #include <acpi/actbl1.h>
 #include <acpi/ghes.h>
@@ -136,6 +137,26 @@ static atomic_t ghes_estatus_cache_alloced;
 
 static int ghes_panic_timeout __read_mostly = 30;
 
+static bool ghes_is_synchronous(struct ghes *ghes)
+{
+	switch (ghes->generic->notify.type) {
+	case ACPI_HEST_NOTIFY_NMI:	/* fall through */
+	case ACPI_HEST_NOTIFY_SEA:
+		/*
+		 * These notifications could be repeated if the interrupted
+		 * instruction is run again. e.g. a read of bad-memory causing
+		 * a trap to platform firmware.
+		 */
+		return true;
+	default:
+		/*
+		 * Other notifications are asynchronous, and not related to the
+		 * interrupted instruction. e.g. an IRQ.
+		 */
+		return false;
+	}
+}
+
 static void __iomem *ghes_map(u64 pfn, int fixmap_idx)
 {
 	phys_addr_t paddr;
@@ -379,14 +400,33 @@ static void ghes_clear_estatus(struct acpi_hest_generic_status *estatus,
 				      fixmap_idx);
 }
 
-static void ghes_handle_memory_failure(struct acpi_hest_generic_data *gdata, int sev)
+struct ghes_memory_failure_work {
+	int cpu;
+	struct callback_head work;
+};
+
+static void ghes_kick_memory_failure(struct callback_head *head)
+{
+	struct ghes_memory_failure_work *callback;
+
+	callback = container_of(head, struct ghes_memory_failure_work, work);
+	memory_failure_queue_kick(callback->cpu);
+	kfree(callback);
+}
+
+static void ghes_handle_memory_failure(struct ghes *ghes,
+				       struct acpi_hest_generic_data *gdata,
+				       int sev)
 {
-#ifdef CONFIG_ACPI_APEI_MEMORY_FAILURE
 	unsigned long pfn;
-	int flags = -1;
+	int flags = -1, ret;
+	struct ghes_memory_failure_work	*callback;
 	int sec_sev = ghes_severity(gdata->error_severity);
 	struct cper_sec_mem_err *mem_err = acpi_hest_get_payload(gdata);
 
+	if (!IS_ENABLED(CONFIG_ACPI_APEI_MEMORY_FAILURE))
+		return;
+
 	if (!(mem_err->validation_bits & CPER_MEM_VALID_PA))
 		return;
 
@@ -407,7 +447,22 @@ static void ghes_handle_memory_failure(struct acpi_hest_generic_data *gdata, int
 
 	if (flags != -1)
 		memory_failure_queue(pfn, flags);
-#endif
+
+	/*
+	 * If the notification indicates that it was the interrupted
+	 * instruction that caused the error, try to kick the
+	 * memory_failure() queue before returning to user-space.
+	 */
+	if (ghes_is_synchronous(ghes) && current->mm != &init_mm) {
+		callback = kzalloc(sizeof(*callback), GFP_ATOMIC);
+		if (!callback)
+			return;
+		callback->work.func = ghes_kick_memory_failure;
+		callback->cpu = smp_processor_id();
+		ret = task_work_add(current, &callback->work, true);
+		if (ret)
+			kfree(callback);
+	}
 }
 
 /*
@@ -480,7 +535,7 @@ static void ghes_do_proc(struct ghes *ghes,
 			ghes_edac_report_mem_error(sev, mem_err);
 
 			arch_apei_report_mem_error(sev, mem_err);
-			ghes_handle_memory_failure(gdata, sev);
+			ghes_handle_memory_failure(ghes, gdata, sev);
 		}
 		else if (guid_equal(sec_type, &CPER_SEC_PCIE)) {
 			ghes_handle_aer(gdata);
-- 
2.19.2



* [PATCH v7 23/25] arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work
  2018-12-03 18:05 [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up James Morse
                   ` (21 preceding siblings ...)
  2018-12-03 18:06 ` [PATCH v7 22/25] ACPI / APEI: Kick the memory_failure() queue for synchronous errors James Morse
@ 2018-12-03 18:06 ` James Morse
  2018-12-06 16:18   ` Catalin Marinas
  2018-12-03 18:06 ` [PATCH v7 24/25] firmware: arm_sdei: Add ACPI GHES registration helper James Morse
  2018-12-03 18:06 ` [PATCH v7 25/25] ACPI / APEI: Add support for the SDEI GHES Notification type James Morse
  24 siblings, 1 reply; 72+ messages in thread
From: James Morse @ 2018-12-03 18:06 UTC (permalink / raw)
  To: linux-acpi
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier,
	Catalin Marinas, Will Deacon, Christoffer Dall, Dongjiu Geng,
	linux-mm, Borislav Petkov, James Morse, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

APEI is unable to do all of its error handling work in nmi-context, so
it defers non-fatal work onto the irq_work queue. arch_irq_work_raise()
sends an IPI to the calling cpu, but this is not guaranteed to be taken
before returning to user-space.

Unless the exception interrupted a context with irqs-masked,
irq_work_run() can run immediately. Otherwise return -EINPROGRESS to
indicate ghes_notify_sea() found some work to do, but it hasn't
finished yet.

With this, apei_claim_sea() returning '0' means this external-abort was
also notification of a firmware-first RAS error, and that APEI has
processed the CPER records.
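
The three outcomes described above can be summarised in a tiny model
(illustrative only; demo_claim_sea() and demo_fault_is_handled() are
hypothetical names, and the two parameters stand in for the result of the
GHES notification and the interrupted context's IRQ mask):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * notify_rc: 0 if the notification found records, -ENOENT otherwise.
 * interrupted_irqs_masked: whether the interrupted context had IRQs masked.
 */
static int demo_claim_sea(int notify_rc, bool interrupted_irqs_masked)
{
	/* This model only knows "found work" and "nothing found". */
	assert(notify_rc == 0 || notify_rc == -ENOENT);

	if (notify_rc)
		return notify_rc;	/* not a firmware-first error */

	if (interrupted_irqs_masked)
		return -EINPROGRESS;	/* work queued, but not run yet */

	return 0;			/* deferred work ran to completion */
}

/* The fault handler may only stop early when the claim fully succeeded. */
static bool demo_fault_is_handled(int claim_rc)
{
	return claim_rc == 0;
}
```

Only the '0' case lets the caller skip its own handling; both error values
fall through to the existing path.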

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Punit Agrawal <punit.agrawal@arm.com>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
CC: Xie XiuQi <xiexiuqi@huawei.com>
CC: gengdongjiu <gengdongjiu@huawei.com>
---
Changes since v6:
 * Added pr_warn() for the EINPROGRESS case so panic-tracebacks show
   'APEI was here'.
 * Tinkered with the commit message

Changes since v2:
 * Removed IS_ENABLED() check, done by the caller unless we have a dummy
   definition.
---
 arch/arm64/kernel/acpi.c | 22 +++++++++++++++++++++-
 arch/arm64/mm/fault.c    |  9 ++++-----
 2 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
index 803f0494dd3e..421331157e8f 100644
--- a/arch/arm64/kernel/acpi.c
+++ b/arch/arm64/kernel/acpi.c
@@ -22,6 +22,7 @@
 #include <linux/init.h>
 #include <linux/irq.h>
 #include <linux/irqdomain.h>
+#include <linux/irq_work.h>
 #include <linux/memblock.h>
 #include <linux/of_fdt.h>
 #include <linux/smp.h>
@@ -268,13 +269,17 @@ pgprot_t __acpi_get_mem_attribute(phys_addr_t addr)
 int apei_claim_sea(struct pt_regs *regs)
 {
 	int err = -ENOENT;
-	unsigned long current_flags;
+	unsigned long current_flags, interrupted_flags;
 
 	if (!IS_ENABLED(CONFIG_ACPI_APEI_GHES))
 		return err;
 
 	current_flags = arch_local_save_flags();
 
+	interrupted_flags = current_flags;
+	if (regs)
+		interrupted_flags = regs->pstate;
+
 	/*
 	 * SEA can interrupt SError, mask it and describe this as an NMI so
 	 * that APEI defers the handling.
@@ -283,6 +288,21 @@ int apei_claim_sea(struct pt_regs *regs)
 	nmi_enter();
 	err = ghes_notify_sea();
 	nmi_exit();
+
+	/*
+	 * APEI NMI-like notifications are deferred to irq_work. Unless
+	 * we interrupted irqs-masked code, we can do that now.
+	 */
+	if (!err) {
+		if (!arch_irqs_disabled_flags(interrupted_flags)) {
+			local_daif_restore(DAIF_PROCCTX_NOIRQ);
+			irq_work_run();
+		} else {
+			pr_warn("APEI work queued but not completed\n");
+			err = -EINPROGRESS;
+		}
+	}
+
 	local_daif_restore(current_flags);
 
 	return err;
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 956afc7d932a..c26ee1f1cc36 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -629,11 +629,10 @@ static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs)
 
 	inf = esr_to_fault_info(esr);
 
-	/*
-	 * Return value ignored as we rely on signal merging.
-	 * Future patches will make this more robust.
-	 */
-	apei_claim_sea(regs);
+	if (apei_claim_sea(regs) == 0) {
+		/* APEI claimed this as a firmware-first notification */
+		return 0;
+	}
 
 	if (esr & ESR_ELx_FnV)
 		siaddr = NULL;
-- 
2.19.2



* [PATCH v7 24/25] firmware: arm_sdei: Add ACPI GHES registration helper
  2018-12-03 18:05 [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up James Morse
                   ` (22 preceding siblings ...)
  2018-12-03 18:06 ` [PATCH v7 23/25] arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work James Morse
@ 2018-12-03 18:06 ` James Morse
  2018-12-06 16:18   ` Catalin Marinas
  2018-12-03 18:06 ` [PATCH v7 25/25] ACPI / APEI: Add support for the SDEI GHES Notification type James Morse
  24 siblings, 1 reply; 72+ messages in thread
From: James Morse @ 2018-12-03 18:06 UTC (permalink / raw)
  To: linux-acpi
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier,
	Catalin Marinas, Will Deacon, Christoffer Dall, Dongjiu Geng,
	linux-mm, Borislav Petkov, James Morse, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

APEI's Generic Hardware Error Source structures do not describe
whether the SDEI event is shared or private, as this information is
discoverable via the API.

GHES needs to know whether an event is normal or critical to avoid
sharing locks or fixmap entries, but GHES shouldn't have to know about
the SDEI API.

Add a helper to register the GHES using the appropriate normal or
critical callback.
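
Stripped of the SDEI calls, the selection logic amounts to something like
the sketch below (an illustration, not the helper itself; the demo_* names
and the priority enum are invented, and the priority is passed in rather
than queried from firmware):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef int (*demo_event_cb)(unsigned int event_num, void *arg);

enum demo_priority {
	DEMO_PRIORITY_NORMAL,
	DEMO_PRIORITY_CRITICAL,
};

/* Two distinguishable callbacks, standing in for the GHES handlers. */
static int demo_normal_cb(unsigned int event_num, void *arg)
{
	(void)event_num;
	(void)arg;
	return 1;
}

static int demo_critical_cb(unsigned int event_num, void *arg)
{
	(void)event_num;
	(void)arg;
	return 2;
}

/*
 * Choose the callback matching the event's priority.
 * Event 0 is rejected, as it is reserved for signalling.
 */
static int demo_pick_callback(unsigned int event_num,
			      enum demo_priority prio,
			      demo_event_cb normal_cb,
			      demo_event_cb critical_cb,
			      demo_event_cb *out)
{
	assert(out != NULL);

	if (event_num == 0)
		return -EINVAL;

	*out = (prio == DEMO_PRIORITY_CRITICAL) ? critical_cb : normal_cb;
	return 0;
}
```

The caller hands over both callbacks and never needs to learn which
priority the event has.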

Signed-off-by: James Morse <james.morse@arm.com>
---
Changes since v4:
 * Moved normal/critical callbacks into the helper, as APEI needs to know.
 * Tinkered with the commit message.
 * Dropped Punit's Reviewed-by.

Changes since v3:
 * Removed acpi_disabled() checks that aren't necessary after v2's #ifdef
   change.

Changes since v2:
 * Added header file, thanks kbuild-robot!
 * changed ifdef to the GHES version to match the fixmap definition

Changes since v1:
 * ghes->fixmap_idx variable rename
---
 arch/arm64/include/asm/fixmap.h |  4 ++
 drivers/firmware/arm_sdei.c     | 70 +++++++++++++++++++++++++++++++++
 include/linux/arm_sdei.h        |  6 +++
 3 files changed, 80 insertions(+)

diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
index 966dd4bb23f2..f987b8a8f325 100644
--- a/arch/arm64/include/asm/fixmap.h
+++ b/arch/arm64/include/asm/fixmap.h
@@ -56,6 +56,10 @@ enum fixed_addresses {
 	/* Used for GHES mapping from assorted contexts */
 	FIX_APEI_GHES_IRQ,
 	FIX_APEI_GHES_SEA,
+#ifdef CONFIG_ARM_SDE_INTERFACE
+	FIX_APEI_GHES_SDEI_NORMAL,
+	FIX_APEI_GHES_SDEI_CRITICAL,
+#endif
 #endif /* CONFIG_ACPI_APEI_GHES */
 
 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
diff --git a/drivers/firmware/arm_sdei.c b/drivers/firmware/arm_sdei.c
index 1ea71640fdc2..4bcbe3a3f597 100644
--- a/drivers/firmware/arm_sdei.c
+++ b/drivers/firmware/arm_sdei.c
@@ -2,6 +2,7 @@
 // Copyright (C) 2017 Arm Ltd.
 #define pr_fmt(fmt) "sdei: " fmt
 
+#include <acpi/ghes.h>
 #include <linux/acpi.h>
 #include <linux/arm_sdei.h>
 #include <linux/arm-smccc.h>
@@ -32,6 +33,8 @@
 #include <linux/spinlock.h>
 #include <linux/uaccess.h>
 
+#include <asm/fixmap.h>
+
 /*
  * The call to use to reach the firmware.
  */
@@ -887,6 +890,73 @@ static void sdei_smccc_hvc(unsigned long function_id,
 	arm_smccc_hvc(function_id, arg0, arg1, arg2, arg3, arg4, 0, 0, res);
 }
 
+int sdei_register_ghes(struct ghes *ghes, sdei_event_callback *normal_cb,
+		       sdei_event_callback *critical_cb)
+{
+	int err;
+	u64 result;
+	u32 event_num;
+	sdei_event_callback *cb;
+
+	if (!IS_ENABLED(CONFIG_ACPI_APEI_GHES))
+		return -EOPNOTSUPP;
+
+	event_num = ghes->generic->notify.vector;
+	if (event_num == 0) {
+		/*
+		 * Event 0 is reserved by the specification for
+		 * SDEI_EVENT_SIGNAL.
+		 */
+		return -EINVAL;
+	}
+
+	err = sdei_api_event_get_info(event_num, SDEI_EVENT_INFO_EV_PRIORITY,
+				      &result);
+	if (err)
+		return err;
+
+	if (result == SDEI_EVENT_PRIORITY_CRITICAL)
+		cb = critical_cb;
+	else
+		cb = normal_cb;
+
+	err = sdei_event_register(event_num, cb, ghes);
+	if (!err)
+		err = sdei_event_enable(event_num);
+
+	return err;
+}
+
+int sdei_unregister_ghes(struct ghes *ghes)
+{
+	int i;
+	int err;
+	u32 event_num = ghes->generic->notify.vector;
+
+	might_sleep();
+
+	if (!IS_ENABLED(CONFIG_ACPI_APEI_GHES))
+		return -EOPNOTSUPP;
+
+	/*
+	 * The event may be running on another CPU. Disable it
+	 * to stop new events, then try to unregister a few times.
+	 */
+	err = sdei_event_disable(event_num);
+	if (err)
+		return err;
+
+	for (i = 0; i < 3; i++) {
+		err = sdei_event_unregister(event_num);
+		if (err != -EINPROGRESS)
+			break;
+
+		schedule();
+	}
+
+	return err;
+}
+
 static int sdei_get_conduit(struct platform_device *pdev)
 {
 	const char *method;
diff --git a/include/linux/arm_sdei.h b/include/linux/arm_sdei.h
index 942afbd544b7..393899192906 100644
--- a/include/linux/arm_sdei.h
+++ b/include/linux/arm_sdei.h
@@ -11,6 +11,7 @@ enum sdei_conduit_types {
 	CONDUIT_HVC,
 };
 
+#include <acpi/ghes.h>
 #include <asm/sdei.h>
 
 /* Arch code should override this to set the entry point from firmware... */
@@ -39,6 +40,11 @@ int sdei_event_unregister(u32 event_num);
 int sdei_event_enable(u32 event_num);
 int sdei_event_disable(u32 event_num);
 
+/* GHES register/unregister helpers */
+int sdei_register_ghes(struct ghes *ghes, sdei_event_callback *normal_cb,
+		       sdei_event_callback *critical_cb);
+int sdei_unregister_ghes(struct ghes *ghes);
+
 #ifdef CONFIG_ARM_SDE_INTERFACE
 /* For use by arch code when CPU hotplug notifiers are not appropriate. */
 int sdei_mask_local_cpu(void);
-- 
2.19.2



* [PATCH v7 25/25] ACPI / APEI: Add support for the SDEI GHES Notification type
  2018-12-03 18:05 [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up James Morse
                   ` (23 preceding siblings ...)
  2018-12-03 18:06 ` [PATCH v7 24/25] firmware: arm_sdei: Add ACPI GHES registration helper James Morse
@ 2018-12-03 18:06 ` James Morse
  24 siblings, 0 replies; 72+ messages in thread
From: James Morse @ 2018-12-03 18:06 UTC (permalink / raw)
  To: linux-acpi
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier,
	Catalin Marinas, Will Deacon, Christoffer Dall, Dongjiu Geng,
	linux-mm, Borislav Petkov, James Morse, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

If the GHES notification type is SDEI, register the provided event
using the SDEI-GHES helper.

SDEI may be one of two types of event, normal and critical. Critical
events can interrupt normal events, so these must have separate
fixmap slots and locks in case both event types are in use.
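
Why a shared lock would not work can be shown with a toy model
(illustrative only; struct demo_lock and the demo_* helpers are
hypothetical, and a try-lock is used so the example reports the problem
instead of spinning forever):

```c
#include <assert.h>
#include <stdbool.h>

/* Non-recursive lock, reduced to a flag for the purposes of the model. */
struct demo_lock {
	bool held;
};

static bool demo_trylock(struct demo_lock *lock)
{
	if (lock->held)
		return false;
	lock->held = true;
	return true;
}

static void demo_unlock(struct demo_lock *lock)
{
	assert(lock->held);
	lock->held = false;
}

/*
 * Model a critical event arriving while the normal handler still holds
 * its lock. Returns true if the critical handler could take its own lock;
 * false is the case that would deadlock with a real spinlock.
 */
static bool demo_critical_interrupts_normal(struct demo_lock *normal_lock,
					    struct demo_lock *critical_lock)
{
	bool got_normal, critical_ran;

	got_normal = demo_trylock(normal_lock);
	assert(got_normal);

	/* The critical event interrupts here, before the unlock below. */
	critical_ran = demo_trylock(critical_lock);
	if (critical_ran)
		demo_unlock(critical_lock);

	demo_unlock(normal_lock);
	return critical_ran;
}
```

Passing the same lock for both roles makes the nested acquisition fail,
which is the situation the separate locks and fixmap slots avoid.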

Signed-off-by: James Morse <james.morse@arm.com>

---
Changes since v6:
 * Tinkering due to the absence of #ifdef
 * Added SDEI to the new ghes_is_synchronous() helper.
---
 drivers/acpi/apei/ghes.c | 82 +++++++++++++++++++++++++++++++++++++++-
 include/linux/arm_sdei.h |  3 ++
 2 files changed, 84 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 3e7da9243153..6325f1d6d9df 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -25,6 +25,7 @@
  * GNU General Public License for more details.
  */
 
+#include <linux/arm_sdei.h>
 #include <linux/kernel.h>
 #include <linux/moduleparam.h>
 #include <linux/init.h>
@@ -93,6 +94,10 @@
 #ifndef CONFIG_ACPI_APEI_SEA
 #define FIX_APEI_GHES_SEA	-1
 #endif
+#ifndef CONFIG_ARM_SDE_INTERFACE
+#define FIX_APEI_GHES_SDEI_NORMAL	-1
+#define FIX_APEI_GHES_SDEI_CRITICAL	-1
+#endif
 
 static inline bool is_hest_type_generic_v2(struct ghes *ghes)
 {
@@ -128,6 +133,8 @@ static DEFINE_MUTEX(ghes_list_mutex);
  * simultaneously.
  */
 static DEFINE_SPINLOCK(ghes_notify_lock_irq);
+static DEFINE_RAW_SPINLOCK(ghes_notify_lock_sdei_normal);
+static DEFINE_RAW_SPINLOCK(ghes_notify_lock_sdei_critical);
 
 static struct gen_pool *ghes_estatus_pool;
 static unsigned long ghes_estatus_pool_size_request;
@@ -141,7 +148,8 @@ static bool ghes_is_synchronous(struct ghes *ghes)
 {
 	switch (ghes->generic->notify.type) {
 	case ACPI_HEST_NOTIFY_NMI:	/* fall through */
-	case ACPI_HEST_NOTIFY_SEA:
+	case ACPI_HEST_NOTIFY_SEA:	/* fall through */
+	case ACPI_HEST_NOTIFY_SOFTWARE_DELEGATED:
 		/*
 		 * These notifications could be repeated if the interrupted
 		 * instruction is run again. e.g. a read of bad-memory causing
@@ -1100,6 +1108,60 @@ static void ghes_nmi_init_cxt(void)
 	init_irq_work(&ghes_proc_irq_work, ghes_proc_in_irq);
 }
 
+static int __ghes_sdei_callback(struct ghes *ghes, int fixmap_idx)
+{
+	if (!_in_nmi_notify_one(ghes, fixmap_idx)) {
+		irq_work_queue(&ghes_proc_irq_work);
+
+		return 0;
+	}
+
+	return -ENOENT;
+}
+
+static int ghes_sdei_normal_callback(u32 event_num, struct pt_regs *regs,
+				      void *arg)
+{
+	int err;
+	struct ghes *ghes = arg;
+
+	raw_spin_lock(&ghes_notify_lock_sdei_normal);
+	err = __ghes_sdei_callback(ghes, FIX_APEI_GHES_SDEI_NORMAL);
+	raw_spin_unlock(&ghes_notify_lock_sdei_normal);
+
+	return err;
+}
+
+static int ghes_sdei_critical_callback(u32 event_num, struct pt_regs *regs,
+				       void *arg)
+{
+	int err;
+	struct ghes *ghes = arg;
+
+	raw_spin_lock(&ghes_notify_lock_sdei_critical);
+	err = __ghes_sdei_callback(ghes, FIX_APEI_GHES_SDEI_CRITICAL);
+	raw_spin_unlock(&ghes_notify_lock_sdei_critical);
+
+	return err;
+}
+
+static int apei_sdei_register_ghes(struct ghes *ghes)
+{
+	if (!IS_ENABLED(CONFIG_ARM_SDE_INTERFACE))
+		return -EOPNOTSUPP;
+
+	return sdei_register_ghes(ghes, ghes_sdei_normal_callback,
+				 ghes_sdei_critical_callback);
+}
+
+static int apei_sdei_unregister_ghes(struct ghes *ghes)
+{
+	if (!IS_ENABLED(CONFIG_ARM_SDE_INTERFACE))
+		return -EOPNOTSUPP;
+
+	return sdei_unregister_ghes(ghes);
+}
+
 static int ghes_probe(struct platform_device *ghes_dev)
 {
 	struct acpi_hest_generic *generic;
@@ -1135,6 +1197,13 @@ static int ghes_probe(struct platform_device *ghes_dev)
 			goto err;
 		}
 		break;
+	case ACPI_HEST_NOTIFY_SOFTWARE_DELEGATED:
+		if (!IS_ENABLED(CONFIG_ARM_SDE_INTERFACE)) {
+			pr_warn(GHES_PFX "Generic hardware error source: %d notified via SDE Interface is not supported!\n",
+				generic->header.source_id);
+			goto err;
+		}
+		break;
 	case ACPI_HEST_NOTIFY_LOCAL:
 		pr_warning(GHES_PFX "Generic hardware error source: %d notified via local interrupt is not supported!\n",
 			   generic->header.source_id);
@@ -1198,6 +1267,11 @@ static int ghes_probe(struct platform_device *ghes_dev)
 	case ACPI_HEST_NOTIFY_NMI:
 		ghes_nmi_add(ghes);
 		break;
+	case ACPI_HEST_NOTIFY_SOFTWARE_DELEGATED:
+		rc = apei_sdei_register_ghes(ghes);
+		if (rc)
+			goto err;
+		break;
 	default:
 		BUG();
 	}
@@ -1223,6 +1297,7 @@ static int ghes_probe(struct platform_device *ghes_dev)
 
 static int ghes_remove(struct platform_device *ghes_dev)
 {
+	int rc;
 	struct ghes *ghes;
 	struct acpi_hest_generic *generic;
 
@@ -1255,6 +1330,11 @@ static int ghes_remove(struct platform_device *ghes_dev)
 	case ACPI_HEST_NOTIFY_NMI:
 		ghes_nmi_remove(ghes);
 		break;
+	case ACPI_HEST_NOTIFY_SOFTWARE_DELEGATED:
+		rc = apei_sdei_unregister_ghes(ghes);
+		if (rc)
+			return rc;
+		break;
 	default:
 		BUG();
 		break;
diff --git a/include/linux/arm_sdei.h b/include/linux/arm_sdei.h
index 393899192906..3305ea7f9dc7 100644
--- a/include/linux/arm_sdei.h
+++ b/include/linux/arm_sdei.h
@@ -12,7 +12,10 @@ enum sdei_conduit_types {
 };
 
 #include <acpi/ghes.h>
+
+#ifdef CONFIG_ARM_SDE_INTERFACE
 #include <asm/sdei.h>
+#endif
 
 /* Arch code should override this to set the entry point from firmware... */
 #ifndef sdei_arch_get_entry_point
-- 
2.19.2
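
The commit message's point about separate locks can be sketched outside the kernel. This is a toy model, not code from the patch: struct toy_lock, toy_trylock() and critical_can_preempt_normal() are hypothetical names, and a try-lock stands in for raw_spin_lock() so the would-be deadlock shows up as a failed attempt rather than a hang. It assumes a critical event arrives on the same CPU while the normal handler holds its lock.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a raw spinlock: just a "held" flag. */
struct toy_lock {
	bool held;
};

/* Try to take the lock; a real spinlock would spin forever instead of failing. */
static bool toy_trylock(struct toy_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void toy_unlock(struct toy_lock *l)
{
	l->held = false;
}

/*
 * Model a critical event interrupting a normal event's handler on the
 * same CPU. Returns true if the critical handler can take its lock.
 */
static bool critical_can_preempt_normal(bool separate_locks)
{
	struct toy_lock normal = { false };
	struct toy_lock critical = { false };
	struct toy_lock *normal_lock = &normal;
	struct toy_lock *critical_lock = separate_locks ? &critical : &normal;
	bool progress;

	/* The normal handler takes its lock ... */
	if (!toy_trylock(normal_lock))
		return false;

	/* ... and is interrupted by a critical event before unlocking. */
	progress = toy_trylock(critical_lock);
	if (progress)
		toy_unlock(critical_lock);

	toy_unlock(normal_lock);
	return progress;
}
```

With separate_locks set, the critical handler makes progress. With one shared lock it cannot, which in the kernel would mean spinning on a lock whose holder can never run again. The same reasoning applies to the two fixmap slots.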


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v7 02/25] ACPI / APEI: Remove silent flag from ghes_read_estatus()
  2018-12-03 18:05 ` [PATCH v7 02/25] ACPI / APEI: Remove silent flag from ghes_read_estatus() James Morse
@ 2018-12-04 11:36   ` Borislav Petkov
  0 siblings, 0 replies; 72+ messages in thread
From: Borislav Petkov @ 2018-12-04 11:36 UTC (permalink / raw)
  To: James Morse
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, linux-acpi, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Mon, Dec 03, 2018 at 06:05:50PM +0000, James Morse wrote:
> Subsequent patches will split up ghes_read_estatus(), at which
> point passing around the 'silent' flag gets annoying. This is to
> suppress printk() messages, which prior to commit 42a0bb3f7138
> ("printk/nmi: generic solution for safe printk in NMI"), were
> unsafe in NMI context.
> 
> This is no longer necessary, remove the flag. printk() messages
> are batched in a per-cpu buffer and printed via irq-work, or a call
> back from panic().
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> Changes since v6:
>  * Moved earlier in the series,
>  * Tinkered with the commit message.
>  * switched to pr_warn_ratelimited() to shut checkpatch up
> 
> shutup checkpatch
> ---
>  drivers/acpi/apei/ghes.c | 15 +++++++--------
>  1 file changed, 7 insertions(+), 8 deletions(-)

Reviewed-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: [PATCH v7 03/25] ACPI / APEI: Switch estatus pool to use vmalloc memory
  2018-12-03 18:05 ` [PATCH v7 03/25] ACPI / APEI: Switch estatus pool to use vmalloc memory James Morse
@ 2018-12-04 13:01   ` Borislav Petkov
  0 siblings, 0 replies; 72+ messages in thread
From: Borislav Petkov @ 2018-12-04 13:01 UTC (permalink / raw)
  To: James Morse
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, linux-acpi, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Mon, Dec 03, 2018 at 06:05:51PM +0000, James Morse wrote:
> The ghes code is careful to parse and round firmware's advertised
> memory requirements for CPER records, up to a maximum of 64K.
> However when ghes_estatus_pool_expand() does its work, it splits
> the requested size into PAGE_SIZE granules.
> 
> This means if firmware generates 5K of CPER records, and correctly
> describes this in the table, __process_error() will silently fail as it
> is unable to allocate more than PAGE_SIZE.
> 
> Switch the estatus pool to vmalloc() memory. On x86 vmalloc() memory
> may fault and be fixed up by vmalloc_fault(). To prevent this call
> vmalloc_sync_all() before an NMI handler could discover the memory.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
>  drivers/acpi/apei/ghes.c | 30 +++++++++++++++---------------
>  1 file changed, 15 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index e8503c7d721f..c15264f2dc4b 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -170,40 +170,40 @@ static int ghes_estatus_pool_init(void)
>  	return 0;
>  }
>  
> -static void ghes_estatus_pool_free_chunk_page(struct gen_pool *pool,
> +static void ghes_estatus_pool_free_chunk(struct gen_pool *pool,
>  					      struct gen_pool_chunk *chunk,
>  					      void *data)
>  {
> -	free_page(chunk->start_addr);
> +	vfree((void *)chunk->start_addr);
>  }
>  
>  static void ghes_estatus_pool_exit(void)
>  {
>  	gen_pool_for_each_chunk(ghes_estatus_pool,
> -				ghes_estatus_pool_free_chunk_page, NULL);
> +				ghes_estatus_pool_free_chunk, NULL);
>  	gen_pool_destroy(ghes_estatus_pool);
>  }
>  
>  static int ghes_estatus_pool_expand(unsigned long len)
>  {
> -	unsigned long i, pages, size, addr;
> -	int ret;
> +	unsigned long size, addr;
>  
>  	ghes_estatus_pool_size_request += PAGE_ALIGN(len);

So here we increment with page-aligned len...

>  	size = gen_pool_size(ghes_estatus_pool);
>  	if (size >= ghes_estatus_pool_size_request)
>  		return 0;
> -	pages = (ghes_estatus_pool_size_request - size) / PAGE_SIZE;
> -	for (i = 0; i < pages; i++) {
> -		addr = __get_free_page(GFP_KERNEL);
> -		if (!addr)
> -			return -ENOMEM;
> -		ret = gen_pool_add(ghes_estatus_pool, addr, PAGE_SIZE, -1);
> -		if (ret)
> -			return ret;
> -	}
>  
> -	return 0;
> +	addr = (unsigned long)vmalloc(PAGE_ALIGN(len));
> +	if (!addr)
> +		return -ENOMEM;

... and if we return here due to the ENOMEM, that increment above
remains.

I see you're reworking all that stuff in the next patches which is cool,
thx. So I guess we should leave it as is, as the code before was broken
too.

IOW,

Reviewed-by: Borislav Petkov <bp@suse.de>
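
The accounting concern in the two comments above can be sketched in user space. This is a toy model under stated assumptions, not the kernel code: struct toy_pool, toy_pool_expand() and the alloc_ok/rollback parameters are hypothetical, a 4K page size is assumed, and the rollback variant is only one possible fix; it is not what the patch does, since later patches in the series rework this code.

```c
#include <errno.h>

#define TOY_PAGE_SIZE	4096UL
#define TOY_PAGE_ALIGN(x) (((x) + TOY_PAGE_SIZE - 1) & ~(TOY_PAGE_SIZE - 1))

/* Hypothetical stand-in for the pool and its size_request counter. */
struct toy_pool {
	unsigned long size;		/* bytes actually in the pool */
	unsigned long size_request;	/* bytes callers have asked for */
};

/*
 * alloc_ok simulates whether the backing allocation succeeds.
 * rollback selects the (hypothetical) variant that undoes the
 * increment when the allocation fails.
 */
static int toy_pool_expand(struct toy_pool *p, unsigned long len,
			   int alloc_ok, int rollback)
{
	unsigned long aligned = TOY_PAGE_ALIGN(len);

	/* The request counter is bumped before anything is allocated. */
	p->size_request += aligned;
	if (p->size >= p->size_request)
		return 0;

	if (!alloc_ok) {
		/* Without rollback the counter stays ahead of the pool. */
		if (rollback)
			p->size_request -= aligned;
		return -ENOMEM;
	}

	p->size += aligned;
	return 0;
}
```

After a failed 5000-byte request without rollback, size_request reads 8192 while size is still 0, which is the leftover increment pointed out above.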

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: [PATCH v7 22/25] ACPI / APEI: Kick the memory_failure() queue for synchronous errors
  2018-12-03 18:06 ` [PATCH v7 22/25] ACPI / APEI: Kick the memory_failure() queue for synchronous errors James Morse
@ 2018-12-05  2:02   ` Xie XiuQi
  2018-12-10 19:15     ` James Morse
  2019-01-21 17:58   ` Borislav Petkov
  1 sibling, 1 reply; 72+ messages in thread
From: Xie XiuQi @ 2018-12-05  2:02 UTC (permalink / raw)
  To: James Morse, Borislav Petkov
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Will Deacon, Christoffer Dall, Dongjiu Geng,
	Wang Xiongfeng, linux-acpi, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

Hi James & Boris,

On 2018/12/4 2:06, James Morse wrote:
> memory_failure() offlines or repairs pages of memory that have been
> discovered to be corrupt. These may be detected by an external
> component, (e.g. the memory controller), and notified via an IRQ.
> In this case the work is queued as not all of memory_failure()'s work
> can happen in IRQ context.
> 
> If the error was detected as a result of user-space accessing a
> corrupt memory location the CPU may take an abort instead. On arm64
> this is a 'synchronous external abort', and on a firmware first
> system it is replayed using NOTIFY_SEA.
> 
> This notification has NMI like properties, (it can interrupt
> IRQ-masked code), so the memory_failure() work is queued. If we
> return to user-space before the queued memory_failure() work is
> processed, we will take the fault again. This loop may cause platform
> firmware to exceed some threshold and reboot when Linux could have
> recovered from this error.
> 
> If a ghes notification type indicates that it may be triggered again
> when we return to user-space, use the task-work and notify-resume
> hooks to kick the relevant memory_failure() queue before returning
> to user-space.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> 
> ---
> current->mm == &init_mm ? I couldn't find a helper for this.
> The intent is not to set TIF flags on kernel threads. What happens
> if a kernel-thread takes one of these? It's just one of the many
> not-handled-very-well cases we have already, as memory_failure()
> puts it: "try to be lucky".
> 
> I assume that if NOTIFY_NMI is coming from SMM it must suffer from
> this problem too.
> ---
>  drivers/acpi/apei/ghes.c | 65 ++++++++++++++++++++++++++++++++++++----
>  1 file changed, 60 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index 6cbf9471b2a2..3e7da9243153 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -47,6 +47,7 @@
>  #include <linux/sched/clock.h>
>  #include <linux/uuid.h>
>  #include <linux/ras.h>
> +#include <linux/task_work.h>
>  
>  #include <acpi/actbl1.h>
>  #include <acpi/ghes.h>
> @@ -136,6 +137,26 @@ static atomic_t ghes_estatus_cache_alloced;
>  
>  static int ghes_panic_timeout __read_mostly = 30;
>  
> +static bool ghes_is_synchronous(struct ghes *ghes)
> +{
> +	switch (ghes->generic->notify.type) {
> +	case ACPI_HEST_NOTIFY_NMI:	/* fall through */
> +	case ACPI_HEST_NOTIFY_SEA:
> +		/*
> +		 * These notifications could be repeated if the interrupted
> +		 * instruction is run again. e.g. a read of bad-memory causing
> +		 * a trap to platform firmware.
> +		 */
> +		return true;
> +	default:
> +		/*
> +		 * Other notifications are asynchronous, and not related to the
> +		 * interrupted instruction. e.g. an IRQ.
> +		 */
> +		return false;
> +	}
> +}
> +
>  static void __iomem *ghes_map(u64 pfn, int fixmap_idx)
>  {
>  	phys_addr_t paddr;
> @@ -379,14 +400,33 @@ static void ghes_clear_estatus(struct acpi_hest_generic_status *estatus,
>  				      fixmap_idx);
>  }
>  
> -static void ghes_handle_memory_failure(struct acpi_hest_generic_data *gdata, int sev)
> +struct ghes_memory_failure_work {
> +	int cpu;
> +	struct callback_head work;
> +};
> +
> +static void ghes_kick_memory_failure(struct callback_head *head)
> +{
> +	struct ghes_memory_failure_work *callback;
> +
> +	callback = container_of(head, struct ghes_memory_failure_work, work);
> +	memory_failure_queue_kick(callback->cpu);
> +	kfree(callback);
> +}
> +
> +static void ghes_handle_memory_failure(struct ghes *ghes,
> +				       struct acpi_hest_generic_data *gdata,
> +				       int sev)
>  {
> -#ifdef CONFIG_ACPI_APEI_MEMORY_FAILURE
>  	unsigned long pfn;
> -	int flags = -1;
> +	int flags = -1, ret;
> +	struct ghes_memory_failure_work	*callback;
>  	int sec_sev = ghes_severity(gdata->error_severity);
>  	struct cper_sec_mem_err *mem_err = acpi_hest_get_payload(gdata);
>  
> +	if (!IS_ENABLED(CONFIG_ACPI_APEI_MEMORY_FAILURE))
> +		return;
> +
>  	if (!(mem_err->validation_bits & CPER_MEM_VALID_PA))
>  		return;
>  
> @@ -407,7 +447,22 @@ static void ghes_handle_memory_failure(struct acpi_hest_generic_data *gdata, int
>  
>  	if (flags != -1)
>  		memory_failure_queue(pfn, flags);

We may need to pass the MF_ACTION_REQUIRED flag to memory_failure() in the SEA case.
Also, memory_failure_work_func() does not check the return value of memory_failure();
I'm not sure whether we need to check it.

static void memory_failure_work_func(struct work_struct *work)
{
        struct memory_failure_cpu *mf_cpu;
        struct memory_failure_entry entry = { 0, };
        unsigned long proc_flags;
        int gotten;

        mf_cpu = container_of(work, struct memory_failure_cpu, work);
        for (;;) {
                spin_lock_irqsave(&mf_cpu->lock, proc_flags);
                gotten = kfifo_get(&mf_cpu->fifo, &entry);
                spin_unlock_irqrestore(&mf_cpu->lock, proc_flags);
                if (!gotten)
                        break;
                if (entry.flags & MF_SOFT_OFFLINE)
                        soft_offline_page(pfn_to_page(entry.pfn), entry.flags);
                else
                        memory_failure(entry.pfn, entry.flags);
        }
}

If the recovery fails here, we need to take other action, such as forcing a SIGBUS signal to be sent.


> -#endif
> +
> +	/*
> +	 * If the notification indicates that it was the interrupted
> +	 * instruction that caused the error, try to kick the
> +	 * memory_failure() queue before returning to user-space.
> +	 */
> +	if (ghes_is_synchronous(ghes) && current->mm != &init_mm) {
> +		callback = kzalloc(sizeof(*callback), GFP_ATOMIC);
> +		if (!callback)
> +			return;
> +		callback->work.func = ghes_kick_memory_failure;
> +		callback->cpu = smp_processor_id();
> +		ret = task_work_add(current, &callback->work, true);
> +		if (ret)
> +			kfree(callback);
> +	}
>  }
>  
>  /*
> @@ -480,7 +535,7 @@ static void ghes_do_proc(struct ghes *ghes,
>  			ghes_edac_report_mem_error(sev, mem_err);
>  
>  			arch_apei_report_mem_error(sev, mem_err);
> -			ghes_handle_memory_failure(gdata, sev);
> +			ghes_handle_memory_failure(ghes, gdata, sev);
>  		}
>  		else if (guid_equal(sec_type, &CPER_SEC_PCIE)) {
>  			ghes_handle_aer(gdata);
> 

-- 
Thanks,
Xie XiuQi



* Re: [PATCH v7 13/25] KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing
  2018-12-03 18:06 ` [PATCH v7 13/25] KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing James Morse
@ 2018-12-06 16:17   ` Catalin Marinas
  0 siblings, 0 replies; 72+ messages in thread
From: Catalin Marinas @ 2018-12-06 16:17 UTC (permalink / raw)
  To: James Morse
  Cc: Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier, Will Deacon,
	Rafael Wysocki, Christoffer Dall, Dongjiu Geng, linux-mm,
	linux-acpi, Borislav Petkov, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Mon, Dec 03, 2018 at 06:06:01PM +0000, James Morse wrote:
> To split up APEI's in_nmi() path, the caller needs to always be
> in_nmi(). KVM shouldn't have to know about this, pull the RAS plumbing
> out into a header file.
> 
> Currently guest synchronous external aborts are claimed as RAS
> notifications by handle_guest_sea(), which is hidden in the arch code's
> mm/fault.c. 32bit gets a dummy declaration in system_misc.h.
> 
> There is going to be more of this in the future if/when the kernel
> supports the SError-based firmware-first notification mechanism and/or
> kernel-first notifications for both synchronous external abort and
> SError. Each of these will come with some Kconfig symbols and a
> handful of header files.
> 
> Create a header file for all this.
> 
> This patch gives handle_guest_sea() a 'kvm_' prefix, and moves the
> declarations to kvm_ras.h as preparation for a future patch that moves
> the ACPI-specific RAS code out of mm/fault.c.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Punit Agrawal <punit.agrawal@arm.com>
> Acked-by: Marc Zyngier <marc.zyngier@arm.com>
> Tested-by: Tyler Baicar <tbaicar@codeaurora.org>

For the arm64 bits here:

Acked-by: Catalin Marinas <catalin.marinas@arm.com>


* Re: [PATCH v7 14/25] arm64: KVM/mm: Move SEA handling behind a single 'claim' interface
  2018-12-03 18:06 ` [PATCH v7 14/25] arm64: KVM/mm: Move SEA handling behind a single 'claim' interface James Morse
@ 2018-12-06 16:17   ` Catalin Marinas
  0 siblings, 0 replies; 72+ messages in thread
From: Catalin Marinas @ 2018-12-06 16:17 UTC (permalink / raw)
  To: James Morse
  Cc: Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier, Will Deacon,
	Rafael Wysocki, Christoffer Dall, Dongjiu Geng, linux-mm,
	linux-acpi, Borislav Petkov, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Mon, Dec 03, 2018 at 06:06:02PM +0000, James Morse wrote:
> To split up APEI's in_nmi() path, the caller needs to always be
> in_nmi(). Add a helper to do the work and claim the notification.
> 
> When KVM or the arch code takes an exception that might be a RAS
> notification, it asks the APEI firmware-first code whether it wants
> to claim the exception. A future kernel-first mechanism may be queried
> afterwards, and claim the notification, otherwise we fall through
> to the existing default behaviour.
> 
> The NOTIFY_SEA code was merged before considering multiple, possibly
> interacting, NMI-like notifications and the need to consider kernel
> first in the future. Make the 'claiming' behaviour explicit.
> 
> Restructuring the APEI code to allow multiple NMI-like notifications
> means any notification that might interrupt interrupts-masked
> code must always be wrapped in nmi_enter()/nmi_exit(). This will
> allow APEI to use in_nmi() to use the right fixmap entries.
> 
> Mask SError over this window to prevent an asynchronous RAS error
> arriving and tripping 'nmi_enter()'s BUG_ON(in_nmi()).
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> Acked-by: Marc Zyngier <marc.zyngier@arm.com>
> Tested-by: Tyler Baicar <tbaicar@codeaurora.org>

Acked-by: Catalin Marinas <catalin.marinas@arm.com>


* Re: [PATCH v7 23/25] arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work
  2018-12-03 18:06 ` [PATCH v7 23/25] arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work James Morse
@ 2018-12-06 16:18   ` Catalin Marinas
  0 siblings, 0 replies; 72+ messages in thread
From: Catalin Marinas @ 2018-12-06 16:18 UTC (permalink / raw)
  To: James Morse
  Cc: Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier, Will Deacon,
	Rafael Wysocki, Christoffer Dall, Dongjiu Geng, linux-mm,
	linux-acpi, Borislav Petkov, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Mon, Dec 03, 2018 at 06:06:11PM +0000, James Morse wrote:
> APEI is unable to do all of its error handling work in nmi-context, so
> it defers non-fatal work onto the irq_work queue. arch_irq_work_raise()
> sends an IPI to the calling cpu, but this is not guaranteed to be taken
> before returning to user-space.
> 
> Unless the exception interrupted a context with irqs-masked,
> irq_work_run() can run immediately. Otherwise return -EINPROGRESS to
> indicate ghes_notify_sea() found some work to do, but it hasn't
> finished yet.
> 
> With this, apei_claim_sea() returning '0' means this external-abort was
> also notification of a firmware-first RAS error, and that APEI has
> processed the CPER records.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Punit Agrawal <punit.agrawal@arm.com>
> Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
> CC: Xie XiuQi <xiexiuqi@huawei.com>
> CC: gengdongjiu <gengdongjiu@huawei.com>

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
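
The return-value contract described in the quoted commit message can be sketched as a small model. Everything here is hypothetical (struct toy_sea, toy_claim_sea()); in particular, -ENOENT for "nothing found" is an assumption of this sketch, as the quoted text only defines 0 and -EINPROGRESS.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of one notification; these names are not kernel APIs. */
struct toy_sea {
	bool cper_found;	/* did the handler find CPER records? */
	bool irqs_were_masked;	/* was the interrupted context IRQ-masked? */
	bool work_done;		/* set once the deferred work has run */
};

static int toy_claim_sea(struct toy_sea *sea)
{
	/* Nothing to claim: assumed error code, not taken from the patch. */
	if (!sea->cper_found)
		return -ENOENT;

	/* Work was queued, but it cannot run until IRQs are unmasked. */
	if (sea->irqs_were_masked)
		return -EINPROGRESS;

	/* Stands in for running the queued irq work immediately. */
	sea->work_done = true;
	return 0;
}
```

A caller seeing 0 can rely on the records having been processed; -EINPROGRESS only says that work was found and is still pending.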


* Re: [PATCH v7 24/25] firmware: arm_sdei: Add ACPI GHES registration helper
  2018-12-03 18:06 ` [PATCH v7 24/25] firmware: arm_sdei: Add ACPI GHES registration helper James Morse
@ 2018-12-06 16:18   ` Catalin Marinas
  0 siblings, 0 replies; 72+ messages in thread
From: Catalin Marinas @ 2018-12-06 16:18 UTC (permalink / raw)
  To: James Morse
  Cc: Tony Luck, Fan Wu, Xie XiuQi, Marc Zyngier, Will Deacon,
	Rafael Wysocki, Christoffer Dall, Dongjiu Geng, linux-mm,
	linux-acpi, Borislav Petkov, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Mon, Dec 03, 2018 at 06:06:12PM +0000, James Morse wrote:
> APEI's Generic Hardware Error Source structures do not describe
> whether the SDEI event is shared or private, as this information is
> discoverable via the API.
> 
> GHES needs to know whether an event is normal or critical to avoid
> sharing locks or fixmap entries, but GHES shouldn't have to know about
> the SDEI API.
> 
> Add a helper to register the GHES using the appropriate normal or
> critical callback.
> 
> Signed-off-by: James Morse <james.morse@arm.com>

Acked-by: Catalin Marinas <catalin.marinas@arm.com>


* Re: [PATCH v7 22/25] ACPI / APEI: Kick the memory_failure() queue for synchronous errors
  2018-12-05  2:02   ` Xie XiuQi
@ 2018-12-10 19:15     ` James Morse
  2019-01-22 10:51       ` Borislav Petkov
  0 siblings, 1 reply; 72+ messages in thread
From: James Morse @ 2018-12-10 19:15 UTC (permalink / raw)
  To: Xie XiuQi
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Will Deacon, Christoffer Dall, Dongjiu Geng,
	Wang Xiongfeng, linux-acpi, Borislav Petkov, Naoya Horiguchi,
	kvmarm, linux-arm-kernel, Len Brown

Hi Xie XiuQi,

On 05/12/2018 02:02, Xie XiuQi wrote:
> On 2018/12/4 2:06, James Morse wrote:
>> memory_failure() offlines or repairs pages of memory that have been
>> discovered to be corrupt. These may be detected by an external
>> component, (e.g. the memory controller), and notified via an IRQ.
>> In this case the work is queued as not all of memory_failure()'s work
>> can happen in IRQ context.
>>
>> If the error was detected as a result of user-space accessing a
>> corrupt memory location the CPU may take an abort instead. On arm64
>> this is a 'synchronous external abort', and on a firmware first
>> system it is replayed using NOTIFY_SEA.
>>
>> This notification has NMI like properties, (it can interrupt
>> IRQ-masked code), so the memory_failure() work is queued. If we
>> return to user-space before the queued memory_failure() work is
>> processed, we will take the fault again. This loop may cause platform
>> firmware to exceed some threshold and reboot when Linux could have
>> recovered from this error.
>>
>> If a ghes notification type indicates that it may be triggered again
>> when we return to user-space, use the task-work and notify-resume
>> hooks to kick the relevant memory_failure() queue before returning


>> @@ -407,7 +447,22 @@ static void ghes_handle_memory_failure(struct acpi_hest_generic_data *gdata, int
>>  
>>  	if (flags != -1)
>>  		memory_failure_queue(pfn, flags);
> 
> We may need to take MF_ACTION_REQUIRED flags for memory_failure() in SEA condition.

Hmmm, I'd forgotten about the extra flags. They're only used by x86's
do_machine_check(), which knows more about what is going on. I agree we do know
it should be a 'MF_ACTION_REQUIRED' for Synchronous-external-abort, but I'd
really like all the notifications to behave in the same way as we can't change
which notification firmware uses.
(This ghes_is_synchronous() affects when memory_failure() runs, not what it does.)

What happens if we miss MF_ACTION_REQUIRED? Surely the page still gets unmapped
as it's PG_Poisoned; an AO signal may be pending, but if user-space touches the
page it will get an AR signal. Is this just about removing an extra AO signal to
user-space?

If we do need this, I'd like to pick it up from the CPER records, as x86's
NOTIFY_NMI looks like it covers both AO/AR cases. (as does NOTIFY_SDEI). The
Master/Target abort or Invalid-address types in the memory-error-section CPER
records look like the best bet.


> And there is no return value check for memory_failure() in memory_failure_work_func(),
> I'm not sure whether we need to check the return value.

What would we do if it fails? The reasons look fairly broad, -EBUSY can mean
"(page) still referenced by [..] users", 'thp split failed' or 'page already
poisoned'. I don't think the behaviour or return-codes are consistent enough to use.


> If the recovery fails here, we need to take other actions, such as force to send a SIGBUS signal.

We don't do this today. If it fixes some mis-behaviour, and we can key it from
something in the CPER records then I'm all ears!


Thanks,

James
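
The "when, not what" point about ghes_is_synchronous() can be illustrated with a toy loop. This is a sketch under assumptions, not kernel behaviour: faults_until_recovery() and work_delay are hypothetical, and the model supposes that queued work not kicked before returning to user-space only runs after work_delay further returns.

```c
#include <stdbool.h>

/*
 * Count how many times a user-space access to a bad page faults before
 * the queued memory_failure()-style work has dealt with the page.
 *
 * kick_before_return: run the queued work before going back to user-space.
 * work_delay: if not kicked, number of extra returns to user-space that
 *             happen before the queued work gets to run (hypothetical).
 */
static unsigned int faults_until_recovery(bool kick_before_return,
					  unsigned int work_delay)
{
	unsigned int faults = 0;
	unsigned int pending_delay = work_delay;
	bool page_handled = false;

	while (!page_handled) {
		/* The access faults and the notification is delivered. */
		faults++;

		if (kick_before_return || pending_delay == 0)
			page_handled = true;	/* work ran before returning */
		else
			pending_delay--;	/* returned too early, fault again */
	}

	return faults;
}
```

With the kick there is a single fault whatever the delay. Without it the count grows with the delay, which is the loop the commit message warns could push platform firmware past some threshold.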


* Re: [PATCH v7 04/25] ACPI / APEI: Make hest.c manage the estatus memory pool
  2018-12-03 18:05 ` [PATCH v7 04/25] ACPI / APEI: Make hest.c manage the estatus memory pool James Morse
@ 2018-12-11 16:48   ` Borislav Petkov
  2018-12-14 13:56     ` James Morse
  0 siblings, 1 reply; 72+ messages in thread
From: Borislav Petkov @ 2018-12-11 16:48 UTC (permalink / raw)
  To: James Morse
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, linux-acpi, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Mon, Dec 03, 2018 at 06:05:52PM +0000, James Morse wrote:
> ghes.c has a memory pool it uses for the estatus cache and the estatus
> queue. The cache is initialised when registering the platform driver.
> For the queue, an NMI-like notification has to grow/shrink the pool
> as it is registered and unregistered.
> 
> This is all pretty noisy when adding new NMI-like notifications, it
> would be better to replace this with a static pool size based on the
> number of users.
> 
> As a precursor, move the call that creates the pool from ghes_init(),
> into hest.c. Later this will take the number of ghes entries and
> consolidate the queue allocations.
> Remove ghes_estatus_pool_exit() as hest.c doesn't have anywhere to put
> this.
> 
> The pool is now initialised as part of ACPI's subsys_initcall():
> (acpi_init(), acpi_scan_init(), acpi_pci_root_init(), acpi_hest_init())
> Before this patch it happened later as a GHES specific device_initcall().
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
>  drivers/acpi/apei/ghes.c | 33 ++++++---------------------------
>  drivers/acpi/apei/hest.c |  5 +++++
>  include/acpi/ghes.h      |  2 ++
>  3 files changed, 13 insertions(+), 27 deletions(-)

...

> diff --git a/drivers/acpi/apei/hest.c b/drivers/acpi/apei/hest.c
> index b1e9f81ebeea..da5fabaeb48f 100644
> --- a/drivers/acpi/apei/hest.c
> +++ b/drivers/acpi/apei/hest.c
> @@ -32,6 +32,7 @@
>  #include <linux/io.h>
>  #include <linux/platform_device.h>
>  #include <acpi/apei.h>
> +#include <acpi/ghes.h>
>  
>  #include "apei-internal.h"
>  
> @@ -200,6 +201,10 @@ static int __init hest_ghes_dev_register(unsigned int ghes_count)
>  	if (!ghes_arr.ghes_devs)
>  		return -ENOMEM;
>  
> +	rc = ghes_estatus_pool_init();
> +	if (rc)
> +		goto out;

Right, this happens before...

> +
>  	rc = apei_hest_parse(hest_parse_ghes, &ghes_arr);

... this but do we even want to do any memory allocations if we don't
have any HEST tables or we've been disabled by hest_disable?

IOW, we should swap those two calls, methinks.
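
The ordering suggestion can be shown with a minimal model. All names here are hypothetical (struct toy_hest, toy_parse(), toy_pool_init(), toy_register()), and the parse step simply fails on request; whether such a failure can actually be reached in hest.c is not established by the lines quoted here.

```c
#include <errno.h>

/* Hypothetical model of the registration state. */
struct toy_hest {
	int parse_fails;	/* non-zero to simulate the table parse failing */
	int pool_allocated;	/* set once the pool allocation has happened */
};

static int toy_parse(const struct toy_hest *h)
{
	return h->parse_fails ? -EINVAL : 0;
}

static int toy_pool_init(struct toy_hest *h)
{
	h->pool_allocated = 1;
	return 0;
}

/* parse_first selects the swapped order suggested in the review. */
static int toy_register(struct toy_hest *h, int parse_first)
{
	int rc;

	if (parse_first) {
		rc = toy_parse(h);
		if (rc)
			return rc;	/* nothing was allocated */
		return toy_pool_init(h);
	}

	rc = toy_pool_init(h);
	if (rc)
		return rc;
	return toy_parse(h);		/* may fail with the pool already allocated */
}
```

In the pool-first order a failed parse leaves an allocation behind; in the parse-first order it does not.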

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: [PATCH v7 05/25] ACPI / APEI: Make estatus pool allocation a static size
  2018-12-03 18:05 ` [PATCH v7 05/25] ACPI / APEI: Make estatus pool allocation a static size James Morse
@ 2018-12-11 16:54   ` Borislav Petkov
  0 siblings, 0 replies; 72+ messages in thread
From: Borislav Petkov @ 2018-12-11 16:54 UTC (permalink / raw)
  To: James Morse
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, linux-acpi, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Mon, Dec 03, 2018 at 06:05:53PM +0000, James Morse wrote:
> Adding new NMI-like notifications duplicates the calls that grow
> and shrink the estatus pool. This is all pretty pointless, as the
> size is capped to 64K. Allocate this for each ghes and drop
> the code that grows and shrinks the pool.
> 
> Suggested-by: Borislav Petkov <bp@suse.de>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
>  drivers/acpi/apei/ghes.c | 49 +++++-----------------------------------
>  drivers/acpi/apei/hest.c |  2 +-
>  include/acpi/ghes.h      |  2 +-
>  3 files changed, 8 insertions(+), 45 deletions(-)

Nice and simple, cool. Thanks for doing that.

Reviewed-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

* Re: [PATCH v7 06/25] ACPI / APEI: Don't store CPER records physical address in struct ghes
  2018-12-03 18:05 ` [PATCH v7 06/25] ACPI / APEI: Don't store CPER records physical address in struct ghes James Morse
@ 2018-12-11 17:04   ` Borislav Petkov
  0 siblings, 0 replies; 72+ messages in thread
From: Borislav Petkov @ 2018-12-11 17:04 UTC (permalink / raw)
  To: James Morse
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, linux-acpi, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Mon, Dec 03, 2018 at 06:05:54PM +0000, James Morse wrote:
> When CPER records are found the address of the records is stashed
> in the struct ghes. Once the records have been processed, this
> address is overwritten with zero so that it won't be processed
> again without being re-populated by firmware.
> 
> This goes wrong if a struct ghes can be processed concurrently,
> as can happen at probe time when an NMI occurs. If the NMI arrives
> on another CPU, the probing CPU may call ghes_clear_estatus() on the
> records before the handler had finished with them.
> Even on the same CPU, once the interrupted handler is resumed, it
> will call ghes_clear_estatus() on the NMIs records, this memory may
> have already been re-used by firmware.
> 
> Avoid this stashing by letting the caller hold the address. A
> later patch will do away with the use of ghes->flags in the
> read/clear code too.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> 
> ---
> Changes since v6:
>  * Moved earlier in the series
>  * Added buf_paddr = 0 on all the error paths, and test for it in
>    ghes_clear_estatus() for extra sanity.
> ---
>  drivers/acpi/apei/ghes.c | 40 +++++++++++++++++++++++-----------------
>  include/acpi/ghes.h      |  1 -
>  2 files changed, 23 insertions(+), 18 deletions(-)

...

> @@ -349,17 +350,20 @@ static int ghes_read_estatus(struct ghes *ghes)
>  	if (rc)
>  		pr_warn_ratelimited(FW_WARN GHES_PFX
>  				    "Failed to read error status block!\n");
> +
>  	return rc;
>  }
>  
> -static void ghes_clear_estatus(struct ghes *ghes)
> +static void ghes_clear_estatus(struct ghes *ghes, u64 buf_paddr)
>  {
>  	ghes->estatus->block_status = 0;
>  	if (!(ghes->flags & GHES_TO_CLEAR))
>  		return;
> -	ghes_copy_tofrom_phys(ghes->estatus, ghes->buffer_paddr,
> -			      sizeof(ghes->estatus->block_status), 0);
> -	ghes->flags &= ~GHES_TO_CLEAR;

<---- newline here.

> +	if (buf_paddr) {

Also, you can save yourself an indentation level:

	if (!buf_paddr)
		return;

	ghes_copy...

> +		ghes_copy_tofrom_phys(ghes->estatus, buf_paddr,
> +				      sizeof(ghes->estatus->block_status), 0);
> +		ghes->flags &= ~GHES_TO_CLEAR;
> +	}
>  }
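
A minimal userspace sketch of the early-return shape suggested above. The
struct layouts, the GHES_TO_CLEAR value and fake_copy_to_phys() are stand-ins
invented for illustration, not the kernel's definitions; only the control flow
is the point.

```c
#include <stddef.h>
#include <stdint.h>

#define GHES_TO_CLEAR 0x0001u	/* value chosen for this sketch only */

struct estatus {
	uint32_t block_status;
};

struct ghes {
	struct estatus *estatus;
	unsigned int flags;
};

/* Stand-in for ghes_copy_tofrom_phys(): counts writes instead of doing them. */
static int fake_phys_writes;

static void fake_copy_to_phys(const void *buf, uint64_t paddr, size_t len)
{
	(void)buf;
	(void)paddr;
	(void)len;
	fake_phys_writes++;
}

static void ghes_clear_estatus(struct ghes *ghes, uint64_t buf_paddr)
{
	ghes->estatus->block_status = 0;

	if (!(ghes->flags & GHES_TO_CLEAR))
		return;

	/* Nothing was read from firmware, so there is nothing to write back. */
	if (!buf_paddr)
		return;

	fake_copy_to_phys(ghes->estatus, buf_paddr,
			  sizeof(ghes->estatus->block_status));
	ghes->flags &= ~GHES_TO_CLEAR;
}
```

Both guards return early, so the write-back and the flag update sit at the
function's base indentation.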

With that addressed:

Reviewed-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

* Re: [PATCH v7 07/25] ACPI / APEI: Remove spurious GHES_TO_CLEAR check
  2018-12-03 18:05 ` [PATCH v7 07/25] ACPI / APEI: Remove spurious GHES_TO_CLEAR check James Morse
@ 2018-12-11 17:18   ` Borislav Petkov
  0 siblings, 0 replies; 72+ messages in thread
From: Borislav Petkov @ 2018-12-11 17:18 UTC (permalink / raw)
  To: James Morse
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, linux-acpi, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Mon, Dec 03, 2018 at 06:05:55PM +0000, James Morse wrote:
> ghes_notify_nmi() checks ghes->flags for GHES_TO_CLEAR before going
> on to __process_error(). This is pointless as ghes_read_estatus()
> will always set this flag if it returns success, which was checked
> earlier in the loop. Remove it.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
>  drivers/acpi/apei/ghes.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index acf0c37e9af9..f7a0ff1c785a 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -936,9 +936,6 @@ static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
>  			__ghes_panic(ghes);
>  		}
>  
> -		if (!(ghes->flags & GHES_TO_CLEAR))
> -			continue;
> -
>  		__process_error(ghes);
>  		ghes_clear_estatus(ghes, buf_paddr);
>  	}
> -- 

Reviewed-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

* Re: [PATCH v7 09/25] ACPI / APEI: Generalise the estatus queue's notify code
  2018-12-03 18:05 ` [PATCH v7 09/25] ACPI / APEI: Generalise the estatus queue's notify code James Morse
@ 2018-12-11 17:44   ` Borislav Petkov
  2019-01-10 18:21     ` James Morse
  0 siblings, 1 reply; 72+ messages in thread
From: Borislav Petkov @ 2018-12-11 17:44 UTC (permalink / raw)
  To: James Morse
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, linux-acpi, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Mon, Dec 03, 2018 at 06:05:57PM +0000, James Morse wrote:
> Refactor the estatus queue's pool notification routine from
> NOTIFY_NMI's handlers. This will allow another notification
> method to use the estatus queue without duplicating this code.
> 
> This patch adds rcu_read_lock()/rcu_read_unlock() around the list

s/This patch adds/Add/

> list_for_each_entry_rcu() walker. These aren't strictly necessary as
> the whole nmi_enter/nmi_exit() window is a spooky RCU read-side
> critical section.
> 
> _in_nmi_notify_one() is separate from the rcu-list walker for a later
> caller that doesn't need to walk a list.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Punit Agrawal <punit.agrawal@arm.com>
> Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
> 
> ---
> Changes since v6:
>  * Removed pool grow/remove code as this is no longer necessary.
> 
> Changes since v3:
>  * Removed duplicate or redundant paragraphs in commit message.
>  * Fixed the style of a zero check.
> Changes since v1:
>    * Tidied up _in_nmi_notify_one().
> ---
>  drivers/acpi/apei/ghes.c | 63 ++++++++++++++++++++++++++--------------
>  1 file changed, 41 insertions(+), 22 deletions(-)

...

> +static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
> +{
> +	int ret = NMI_DONE;
> +
> +	if (!atomic_add_unless(&ghes_in_nmi, 1, 1))
> +		return ret;
> +
> +	if (!ghes_estatus_queue_notified(&ghes_nmi))
> +		ret = NMI_HANDLED;

So this reads kinda the other way around, at least to me:

	"if the queue was *not* notified, the NMI was handled."

Maybe rename to this:

	err = process_queue(&ghes_nmi);
	if (!err)
		ret = NMI_HANDLED;

to make it clearer...
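
For illustration, a userspace sketch of how the handler might read with that
naming. add_unless() is an approximation of the kernel's atomic_add_unless()
built on C11 atomics, process_queue() is a hypothetical stub, and dropping the
guard at the end is an assumption because the quoted hunk is trimmed.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

enum { NMI_DONE = 0, NMI_HANDLED = 1 };	/* values chosen for this sketch */

static atomic_int ghes_in_nmi;

/* Add 'a' to *v unless *v == u; return true if the add happened. */
static bool add_unless(atomic_int *v, int a, int u)
{
	int cur = atomic_load(v);

	while (cur != u) {
		/* On failure 'cur' is reloaded, so the loop re-checks it. */
		if (atomic_compare_exchange_weak(v, &cur, cur + a))
			return true;
	}
	return false;
}

/* Hypothetical stub: returns 0 when at least one record was queued. */
static int pending_records;

static int process_queue(void)
{
	if (!pending_records)
		return -ENOENT;
	pending_records--;
	return 0;
}

static int ghes_notify_nmi_sketch(void)
{
	int ret = NMI_DONE;
	int err;

	/* Only one CPU at a time may walk the NMI sources. */
	if (!add_unless(&ghes_in_nmi, 1, 1))
		return ret;

	err = process_queue();
	if (!err)
		ret = NMI_HANDLED;

	atomic_fetch_sub(&ghes_in_nmi, 1);
	return ret;
}
```

Read this way, "no error" maps onto NMI_HANDLED without a negated predicate.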

And yeah, all those static functions having "ghes_" prefix is just
encumbering readability for no good reason.

Thx.

-- 
Regards/Gruss,
    Boris.

* Re: [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records
  2018-12-03 18:05 ` [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records James Morse
@ 2018-12-11 18:36   ` Borislav Petkov
  2019-01-10 18:22     ` James Morse
  0 siblings, 1 reply; 72+ messages in thread
From: Borislav Petkov @ 2018-12-11 18:36 UTC (permalink / raw)
  To: James Morse
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, linux-acpi, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Mon, Dec 03, 2018 at 06:05:58PM +0000, James Morse wrote:
> ACPI has a GHESv2 which is used on hardware reduced platforms to
> explicitly acknowledge that the memory for CPER records has been
> consumed. This lets an external agent know it can re-use this
> memory for something else.
> 
> Previously notify_nmi and the estatus queue didn't do this as
> they were never used on hardware reduced platforms. Once we move
> notify_sea over to use the estatus queue, it may become necessary.
> 
> Add the call. This is safe for use in NMI context as the
> read_ack_register is pre-mapped by ghes_new() before the
> ghes can be added to an RCU list, and then found by the
> notification handler.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
>  drivers/acpi/apei/ghes.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index 366dbdd41ef3..15d94373ba72 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -926,6 +926,10 @@ static int _in_nmi_notify_one(struct ghes *ghes)
>  	__process_error(ghes);
>  	ghes_clear_estatus(ghes, buf_paddr);
>  
> +	if (is_hest_type_generic_v2(ghes) && ghes_ack_error(ghes->generic_v2))

Since ghes_ack_error() is always prepended with this check, you could
push it down into the function:

ghes_ack_error(ghes)
...

	if (!is_hest_type_generic_v2(ghes))
		return 0;

and simplify the two callsites :)
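
Spelled out as a compilable sketch: struct ghes here is a hypothetical
two-field stand-in, and the register write is simulated with a counter, so
this only shows where the type check would live, not the real ack sequence.

```c
#include <stdbool.h>

struct ghes {
	bool is_generic_v2;	/* stand-in for the HEST entry type */
	int acks_written;	/* simulated writes to the read-ack register */
};

static bool is_hest_type_generic_v2(const struct ghes *ghes)
{
	return ghes->is_generic_v2;
}

static int ghes_ack_error(struct ghes *ghes)
{
	/* GHESv1 sources have no ack register: succeed without doing anything. */
	if (!is_hest_type_generic_v2(ghes))
		return 0;

	ghes->acks_written++;	/* the kernel would update the ack register here */
	return 0;
}
```

Callers can then invoke ghes_ack_error(ghes) unconditionally.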

-- 
Regards/Gruss,
    Boris.

* Re: [PATCH v7 04/25] ACPI / APEI: Make hest.c manage the estatus memory pool
  2018-12-11 16:48   ` Borislav Petkov
@ 2018-12-14 13:56     ` James Morse
  2018-12-19 14:42       ` Borislav Petkov
  0 siblings, 1 reply; 72+ messages in thread
From: James Morse @ 2018-12-14 13:56 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, linux-acpi, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

Hi Boris,

On 11/12/2018 16:48, Borislav Petkov wrote:
> On Mon, Dec 03, 2018 at 06:05:52PM +0000, James Morse wrote:
>> ghes.c has a memory pool it uses for the estatus cache and the estatus
>> queue. The cache is initialised when registering the platform driver.
>> For the queue, an NMI-like notification has to grow/shrink the pool
>> as it is registered and unregistered.
>>
>> This is all pretty noisy when adding new NMI-like notifications, it
>> would be better to replace this with a static pool size based on the
>> number of users.
>>
>> As a precursor, move the call that creates the pool from ghes_init(),
>> into hest.c. Later this will take the number of ghes entries and
>> consolidate the queue allocations.
>> Remove ghes_estatus_pool_exit() as hest.c doesn't have anywhere to put
>> this.
>>
>> The pool is now initialised as part of ACPI's subsys_initcall():
>> (acpi_init(), acpi_scan_init(), acpi_pci_root_init(), acpi_hest_init())
>> Before this patch it happened later as a GHES specific device_initcall().

>> diff --git a/drivers/acpi/apei/hest.c b/drivers/acpi/apei/hest.c
>> index b1e9f81ebeea..da5fabaeb48f 100644
>> --- a/drivers/acpi/apei/hest.c
>> +++ b/drivers/acpi/apei/hest.c
>> @@ -32,6 +32,7 @@
>>  #include <linux/io.h>
>>  #include <linux/platform_device.h>
>>  #include <acpi/apei.h>
>> +#include <acpi/ghes.h>
>>  
>>  #include "apei-internal.h"
>>  
>> @@ -200,6 +201,10 @@ static int __init hest_ghes_dev_register(unsigned int ghes_count)
>>  	if (!ghes_arr.ghes_devs)
>>  		return -ENOMEM;
>>  
>> +	rc = ghes_estatus_pool_init();
>> +	if (rc)
>> +		goto out;
> 
> Right, this happens before...
> 
>> +
>>  	rc = apei_hest_parse(hest_parse_ghes, &ghes_arr);
> 
> ... this but do we even want to do any memory allocations if we don't
> have any HEST tables or we've been disabled by hest_disable?

I agree we shouldn't,


> IOW, we should swap those two calls, methinks.

/me digs a bit,

ghes_estatus_pool_init() allocates memory from hest_ghes_dev_register().
Its caller is behind a 'if (!ghes_disable)' in acpi_hest_init(), and is after
another 2 calls to apei_hest_parse().

If ghes_disable is set, we don't call this thing.
If hest_disable is set, acpi_hest_init() exits early.
If we don't have a HEST table, acpi_hest_init() exits early.

... if the HEST table doesn't have any GHES entries, hest_ghes_dev_register() is
called with ghes_count==0, and does nothing useful. (kmalloc_array(0,...)
great!) But we do call ghes_estatus_pool_init().

I think a check that ghes_count is non-zero before calling
hest_ghes_dev_register() is the cleanest way to avoid this.
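
Roughly, as a userspace sketch: the counters and the acpi_hest_init_tail()
helper are invented for illustration, and the real acpi_hest_init() does more
than this.

```c
static int pool_inits;		/* how many times the pool was set up */
static unsigned int registered;	/* how many GHES devices were registered */

static int ghes_estatus_pool_init(void)
{
	pool_inits++;
	return 0;
}

static int hest_ghes_dev_register(unsigned int ghes_count)
{
	int rc = ghes_estatus_pool_init();

	if (rc)
		return rc;

	registered += ghes_count;
	return 0;
}

/* Hypothetical tail of acpi_hest_init(), after the GHES entries are counted. */
static int acpi_hest_init_tail(int ghes_disable, unsigned int ghes_count)
{
	/* No GHES entries: skip registration, and with it the pool allocation. */
	if (ghes_disable || !ghes_count)
		return 0;

	return hest_ghes_dev_register(ghes_count);
}
```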

I wanted the estatus pool to be initialised before creating the platform devices
in case the order of these things is changed in the future and they get probed
immediately, before the pool is initialised.


Thanks,

James

* Re: [PATCH v7 04/25] ACPI / APEI: Make hest.c manage the estatus memory pool
  2018-12-14 13:56     ` James Morse
@ 2018-12-19 14:42       ` Borislav Petkov
  2019-01-10 18:20         ` James Morse
  0 siblings, 1 reply; 72+ messages in thread
From: Borislav Petkov @ 2018-12-19 14:42 UTC (permalink / raw)
  To: James Morse
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, linux-acpi, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Fri, Dec 14, 2018 at 01:56:16PM +0000, James Morse wrote:
> /me digs a bit,
> 
> ghes_estatus_pool_init() allocates memory from hest_ghes_dev_register().
> Its caller is behind a 'if (!ghes_disable)' in acpi_hest_init(), and is after
> another 2 calls to apei_hest_parse().
> 
> If ghes_disable is set, we don't call this thing.
> If hest_disable is set, acpi_hest_init() exits early.
> If we don't have a HEST table, acpi_hest_init() exits early.
> 
> ... if the HEST table doesn't have any GHES entries, hest_ghes_dev_register() is
> called with ghes_count==0, and does nothing useful. (kmalloc_array(0,...)
> great!) But we do call ghes_estatus_pool_init().
> 
> I think a check that ghes_count is non-zero before calling
> hest_ghes_dev_register() is the cleanest way to avoid this.

Grrr, what an effing mess that code is! There's hest_disable *and*
ghes_disable. Do we really need them both?

With my simplifier hat on I wanna say, we should have a single switch -
apei_disable - and kill those other two. What a damn mess that is.

> I wanted the estatus pool to be initialised before creating the platform devices
> in case the order of these things is changed in the future and they get probed
> immediately, before the pool is initialised.

Hmmm.

Actually, I meant flipping those two calls:

        rc = ghes_estatus_pool_init(ghes_count);
        if (rc)
                goto out;

        rc = apei_hest_parse(hest_parse_ghes, &ghes_arr);
        if (rc)
                goto err;

to

        rc = apei_hest_parse(hest_parse_ghes, &ghes_arr);
        if (rc)
                goto err;

        rc = ghes_estatus_pool_init(ghes_count);
        if (rc)
                goto out;

so as not to alloc the pool unnecessarily if the parsing fails.

Also, AFAICT, the order you have them in now might be a problem anyway
if

	apei_hest_parse(hest_parse_ghes, &ghes_arr);

fails because then you goto err and that pool leaks, right?
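
A small userspace sketch of the flipped order, to make the leak argument
concrete. The two stubs and the parse_should_fail switch are hypothetical, and
the device array is only a placeholder for ghes_arr.

```c
#include <errno.h>
#include <stdlib.h>

static int parse_should_fail;	/* test knob for this sketch */
static int pool_allocated;

static int apei_hest_parse_stub(void)
{
	return parse_should_fail ? -EINVAL : 0;
}

static int ghes_estatus_pool_init_stub(void)
{
	pool_allocated = 1;
	return 0;
}

static int hest_ghes_dev_register_sketch(unsigned int ghes_count)
{
	void **devs;
	int rc;

	devs = calloc(ghes_count, sizeof(*devs));
	if (!devs)
		return -ENOMEM;

	/* Parse first: if this fails, no pool exists yet, so nothing leaks. */
	rc = apei_hest_parse_stub();
	if (rc)
		goto out;

	rc = ghes_estatus_pool_init_stub();
out:
	free(devs);
	return rc;
}
```

With the pool set up last, the parse failure path has only the array to undo.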

Thx.

-- 
Regards/Gruss,
    Boris.

* Re: [PATCH v7 04/25] ACPI / APEI: Make hest.c manage the estatus memory pool
  2018-12-19 14:42       ` Borislav Petkov
@ 2019-01-10 18:20         ` James Morse
  0 siblings, 0 replies; 72+ messages in thread
From: James Morse @ 2019-01-10 18:20 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, linux-acpi, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

Hi Boris,

On 19/12/2018 14:42, Borislav Petkov wrote:
> On Fri, Dec 14, 2018 at 01:56:16PM +0000, James Morse wrote:
>> /me digs a bit,
>>
>> ghes_estatus_pool_init() allocates memory from hest_ghes_dev_register().
>> Its caller is behind a 'if (!ghes_disable)' in acpi_hest_init(), and is after
>> another 2 calls to apei_hest_parse().
>>
>> If ghes_disable is set, we don't call this thing.
>> If hest_disable is set, acpi_hest_init() exits early.
>> If we don't have a HEST table, acpi_hest_init() exits early.
>>
>> ... if the HEST table doesn't have any GHES entries, hest_ghes_dev_register() is
>> called with ghes_count==0, and does nothing useful. (kmalloc_array(0,...)
>> great!) But we do call ghes_estatus_pool_init().
>>
>> I think a check that ghes_count is non-zero before calling
>> hest_ghes_dev_register() is the cleanest way to avoid this.
> 
> Grrr, what an effing mess that code is! There's hest_disable *and*
> ghes_disable. Do we really need them both?

ghes_disable lets you ignore the firmware-first notifications, but still 'use'
the other error sources:
drivers/pci/pcie/aer.c picks out the three AER types, and uses apei_hest_parse()
to know if firmware is controlling AER, even if ghes_disable is set.

x86's arch_apei_enable_cmcff() looks like it disables MCE to get firmware to
handle them. hest_disable would stop this, but instead ghes_disable keeps that,
and stops the NOTIFY_NMI being registered.


> With my simplifier hat on I wanna say, we should have a single switch -
> apei_disable - and kill those other two. What a damn mess that is.

(do you consider cmdline arguments as ABI, or hard to justify and hard to remove?)

I don't think it's broken enough to justify ripping them out. A user of
ghes_disable would be someone with broken firmware-first handling of AER. They
need to know firmware is changing the register values behind their back (so need
to parse the HEST), but want to ignore the junk notifications. It doesn't sound
like an unlikely scenario.


>> I wanted the estatus pool to be initialised before creating the platform devices
>> in case the order of these things is changed in the future and they get probed
>> immediately, before the pool is initialised.
> 
> Hmmm.
> 
> Actually, I meant flipping those two calls:
> 
>         rc = ghes_estatus_pool_init(ghes_count);
>         if (rc)
>                 goto out;
> 
>         rc = apei_hest_parse(hest_parse_ghes, &ghes_arr);
>         if (rc)
>                 goto err;
> 
> to
> 
>         rc = apei_hest_parse(hest_parse_ghes, &ghes_arr);
>         if (rc)
>                 goto err;
> 
>         rc = ghes_estatus_pool_init(ghes_count);
>         if (rc)
>                 goto out;
> 
> so as not to alloc the pool unnecessarily if the parsing fails.
> 
> Also, AFAICT, the order you have them in now might be a problem anyway
> if
> 
> 	apei_hest_parse(hest_parse_ghes, &ghes_arr);
> 
> fails because then you goto err and that pool leaks, right?

Right, yes. I've been ignoring errors like this on the probe path as it implies
you've got busted ACPI tables, or so little memory you're never going to make it
to user-space. I was more worried about ghes_probe() trying to use the pool
memory before it's been allocated. It doesn't seem right to register the device if
the driver wouldn't work yet. But one is a subsys_initcall(), the driver's is a
device_initcall(), which is obvious enough.

Fixed.


Thanks,

James

* Re: [PATCH v7 09/25] ACPI / APEI: Generalise the estatus queue's notify code
  2018-12-11 17:44   ` Borislav Petkov
@ 2019-01-10 18:21     ` James Morse
  2019-01-11 11:46       ` Borislav Petkov
  0 siblings, 1 reply; 72+ messages in thread
From: James Morse @ 2019-01-10 18:21 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, linux-acpi, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

Hi Boris,

On 11/12/2018 17:44, Borislav Petkov wrote:
> On Mon, Dec 03, 2018 at 06:05:57PM +0000, James Morse wrote:
>> Refactor the estatus queue's pool notification routine from
>> NOTIFY_NMI's handlers. This will allow another notification
>> method to use the estatus queue without duplicating this code.
>>
>> This patch adds rcu_read_lock()/rcu_read_unlock() around the list
> 
> s/This patch adds/Add/
> 
>> list_for_each_entry_rcu() walker. These aren't strictly necessary as
>> the whole nmi_enter/nmi_exit() window is a spooky RCU read-side
>> critical section.
>>
>> _in_nmi_notify_one() is separate from the rcu-list walker for a later
>> caller that doesn't need to walk a list.

>> +static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
>> +{
>> +	int ret = NMI_DONE;
>> +
>> +	if (!atomic_add_unless(&ghes_in_nmi, 1, 1))
>> +		return ret;
>> +
>> +	if (!ghes_estatus_queue_notified(&ghes_nmi))
>> +		ret = NMI_HANDLED;
> 
> So this reads kinda the other way around, at least to me:
> 
> 	"if the queue was *not* notified, the NMI was handled."
> 
> Maybe rename to this:
> 
> 	err = process_queue(&ghes_nmi);
> 	if (!err)
> 		ret = NMI_HANDLED;
> 
> to make it clearer...

(yup, that's clearer).

But now we've opened Pandora's box of naming-things: This thing isn't really
processing anything, it's walking a list of 'maybe it was one of these' and
copying anything it finds into the estatus-queue to be handled later.

I've evidently overloaded 'notified' to mean this.
__process_error() doesn't process anything either, it does the add-to-queue.

'spool' is the word that best conveys what's going on here. I should probably
use that 'in_nmi' prefix more to make it clear this has to be NMI-safe.

Something like:
ghes_notify_nmi() -> in_nmi_spool_from_list(list) -> in_nmi_queue_one_entry(ghes).



Thanks,

James

* Re: [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records
  2018-12-11 18:36   ` Borislav Petkov
@ 2019-01-10 18:22     ` James Morse
  2019-01-10 21:01       ` Tyler Baicar
  0 siblings, 1 reply; 72+ messages in thread
From: James Morse @ 2019-01-10 18:22 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, linux-acpi, baicar.tyler, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

Hi Boris,

(CC: +Tyler)

On 11/12/2018 18:36, Borislav Petkov wrote:
> On Mon, Dec 03, 2018 at 06:05:58PM +0000, James Morse wrote:
>> ACPI has a GHESv2 which is used on hardware reduced platforms to
>> explicitly acknowledge that the memory for CPER records has been
>> consumed. This lets an external agent know it can re-use this
>> memory for something else.
>>
>> Previously notify_nmi and the estatus queue didn't do this as
>> they were never used on hardware reduced platforms. Once we move
>> notify_sea over to use the estatus queue, it may become necessary.
>>
>> Add the call. This is safe for use in NMI context as the
>> read_ack_register is pre-mapped by ghes_new() before the
>> ghes can be added to an RCU list, and then found by the
>> notification handler.

>> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
>> index 366dbdd41ef3..15d94373ba72 100644
>> --- a/drivers/acpi/apei/ghes.c
>> +++ b/drivers/acpi/apei/ghes.c
>> @@ -926,6 +926,10 @@ static int _in_nmi_notify_one(struct ghes *ghes)
>>  	__process_error(ghes);
>>  	ghes_clear_estatus(ghes, buf_paddr);
>>  
>> +	if (is_hest_type_generic_v2(ghes) && ghes_ack_error(ghes->generic_v2))
> 
> Since ghes_ack_error() is always prepended with this check, you could
> push it down into the function:
> 
> ghes_ack_error(ghes)
> ...
> 
> 	if (!is_hest_type_generic_v2(ghes))
> 		return 0;
> 
> and simplify the two callsites :)

Great idea! ...

.. huh. Turns out for ghes_proc() we discard any errors other than ENOENT from
ghes_read_estatus() if is_hest_type_generic_v2(). This masks EIO.

Most of the error sources discard the result, the worst thing I can find is
ghes_irq_func() will return IRQ_HANDLED, instead of IRQ_NONE when we didn't
really handle the IRQ. They're registered as SHARED, but I don't have an example
of what goes wrong next.

I think this will also stop the spurious handling code kicking in to shut it up
if it's broken and screaming. Unlikely, but not impossible.

Fixed in a prior patch. With Boris' suggestion, ghes_proc()'s tail ends up looking
like this:
----------------------%<----------------------
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 0321d9420b1e..8d1f9930b159 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -700,18 +708,11 @@ static int ghes_proc(struct ghes *ghes)

 out:
        ghes_clear_estatus(ghes, buf_paddr);
+       if (rc != -ENOENT)
+               rc_ack = ghes_ack_error(ghes);

-       if (rc == -ENOENT)
-               return rc;
-
-       /*
-        * GHESv2 type HEST entries introduce support for error acknowledgment,
-        * so only acknowledge the error if this support is present.
-        */
-       if (is_hest_type_generic_v2(ghes))
-               return ghes_ack_error(ghes->generic_v2);
-
-       return rc;
+       /* If rc and rc_ack failed, return the first one */
+       return rc ? rc : rc_ack;
 }
----------------------%<----------------------
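
The return-value logic of that tail, isolated as a userspace sketch.
ghes_proc_tail() is a hypothetical helper, the ack result is passed in rather
than computed, and rc_ack starting at 0 is an assumption since its declaration
is outside the hunk.

```c
#include <errno.h>

/*
 * rc:         result of reading/processing the error status block
 * ack_result: what ghes_ack_error() would return if it were called
 * acked:      set to 1 when the ack path is taken
 */
static int ghes_proc_tail(int rc, int ack_result, int *acked)
{
	int rc_ack = 0;

	/* -ENOENT means there was nothing to consume, so nothing to ack. */
	if (rc != -ENOENT) {
		rc_ack = ack_result;
		*acked = 1;
	}

	/* If both failed, the earlier error wins. */
	return rc ? rc : rc_ack;
}
```

Unlike the removed code, a read failure such as -EIO is no longer replaced by
the ack's return value on GHESv2 sources.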


Thanks,

James

* Re: [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records
  2019-01-10 18:22     ` James Morse
@ 2019-01-10 21:01       ` Tyler Baicar
  2019-01-11 12:03         ` Borislav Petkov
  0 siblings, 1 reply; 72+ messages in thread
From: Tyler Baicar @ 2019-01-10 21:01 UTC (permalink / raw)
  To: James Morse
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, Linux ACPI, Borislav Petkov, Naoya Horiguchi,
	kvmarm, arm-mail-list, Len Brown

On Thu, Jan 10, 2019 at 1:23 PM James Morse <james.morse@arm.com> wrote:
> >>
> >> +    if (is_hest_type_generic_v2(ghes) && ghes_ack_error(ghes->generic_v2))
> >
> > Since ghes_ack_error() is always prepended with this check, you could
> > push it down into the function:
> >
> > ghes_ack_error(ghes)
> > ...
> >
> >       if (!is_hest_type_generic_v2(ghes))
> >               return 0;
> >
> > and simplify the two callsites :)
>
> Great idea! ...
>
> .. huh. Turns out for ghes_proc() we discard any errors other than ENOENT from
> ghes_read_estatus() if is_hest_type_generic_v2(). This masks EIO.
>
> Most of the error sources discard the result, the worst thing I can find is
> ghes_irq_func() will return IRQ_HANDLED, instead of IRQ_NONE when we didn't
> really handle the IRQ. They're registered as SHARED, but I don't have an example
> of what goes wrong next.
>
> I think this will also stop the spurious handling code kicking in to shut it up
> if it's broken and screaming. Unlikely, but not impossible.
>
> Fixed in a prior patch. With Boris' suggestion, ghes_proc()'s tail ends up looking
> like this:
> ----------------------%<----------------------
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index 0321d9420b1e..8d1f9930b159 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -700,18 +708,11 @@ static int ghes_proc(struct ghes *ghes)
>
>  out:
>         ghes_clear_estatus(ghes, buf_paddr);
> +       if (rc != -ENOENT)
> +               rc_ack = ghes_ack_error(ghes);
>
> -       if (rc == -ENOENT)
> -               return rc;
> -
> -       /*
> -        * GHESv2 type HEST entries introduce support for error acknowledgment,
> -        * so only acknowledge the error if this support is present.
> -        */
> -       if (is_hest_type_generic_v2(ghes))
> -               return ghes_ack_error(ghes->generic_v2);
> -
> -       return rc;
> +       /* If rc and rc_ack failed, return the first one */
> +       return rc ? rc : rc_ack;
>  }
> ----------------------%<----------------------
>

Looks good to me, I guess there's no harm in acking invalid error status blocks.

T

* Re: [PATCH v7 09/25] ACPI / APEI: Generalise the estatus queue's notify code
  2019-01-10 18:21     ` James Morse
@ 2019-01-11 11:46       ` Borislav Petkov
  0 siblings, 0 replies; 72+ messages in thread
From: Borislav Petkov @ 2019-01-11 11:46 UTC (permalink / raw)
  To: James Morse
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, linux-acpi, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Thu, Jan 10, 2019 at 06:21:21PM +0000, James Morse wrote:
> Something like:
> ghes_notify_nmi() -> in_nmi_spool_from_list(list) -> in_nmi_queue_one_entry(ghes).

Yah, but make that

ghes_notify_nmi() -> ghes_nmi_spool_from_list(list) -> ghes_nmi_queue_one_entry(ghes).

to denote it is the GHES NMI path.
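
As a rough picture of that call chain, here is a userspace sketch in which an
array stands in for the RCU list and a fixed-size array for the estatus queue.
The struct fields, QUEUE_DEPTH and the return codes are assumptions made for
illustration.

```c
#include <errno.h>
#include <stddef.h>

#define QUEUE_DEPTH 4	/* arbitrary size for this sketch */

struct ghes {
	int has_record;	/* firmware has populated a record for this source */
	int record;	/* stand-in for the CPER payload */
};

static int estatus_queue[QUEUE_DEPTH];
static size_t estatus_queued;

/* Copy one source's record into the queue; nothing is processed here. */
static int ghes_nmi_queue_one_entry(struct ghes *ghes)
{
	if (!ghes->has_record)
		return -ENOENT;
	if (estatus_queued == QUEUE_DEPTH)
		return -ENOMEM;

	estatus_queue[estatus_queued++] = ghes->record;
	ghes->has_record = 0;
	return 0;
}

/* Walk every candidate source; succeed if at least one record was queued. */
static int ghes_nmi_spool_from_list(struct ghes *list, size_t n)
{
	int ret = -ENOENT;

	for (size_t i = 0; i < n; i++) {
		if (!ghes_nmi_queue_one_entry(&list[i]))
			ret = 0;
	}
	return ret;
}
```

The handler itself would only call ghes_nmi_spool_from_list() and translate
its result; the queued records are handled later, outside NMI context.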

Thx.

-- 
Regards/Gruss,
    Boris.

* Re: [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records
  2019-01-10 21:01       ` Tyler Baicar
@ 2019-01-11 12:03         ` Borislav Petkov
  2019-01-11 15:32           ` Tyler Baicar
  0 siblings, 1 reply; 72+ messages in thread
From: Borislav Petkov @ 2019-01-11 12:03 UTC (permalink / raw)
  To: Tyler Baicar, James Morse
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, Linux ACPI, Naoya Horiguchi, kvmarm, arm-mail-list,
	Len Brown

On Thu, Jan 10, 2019 at 04:01:27PM -0500, Tyler Baicar wrote:
> On Thu, Jan 10, 2019 at 1:23 PM James Morse <james.morse@arm.com> wrote:
> > >>
> > >> +    if (is_hest_type_generic_v2(ghes) && ghes_ack_error(ghes->generic_v2))
> > >
> > > Since ghes_ack_error() is always prepended with this check, you could
> > > push it down into the function:
> > >
> > > ghes_ack_error(ghes)
> > > ...
> > >
> > >       if (!is_hest_type_generic_v2(ghes))
> > >               return 0;
> > >
> > > and simplify the two callsites :)
> >
> > Great idea! ...
> >
> > .. huh. Turns out for ghes_proc() we discard any errors other than ENOENT from
> > ghes_read_estatus() if is_hest_type_generic_v2(). This masks EIO.
> >
> > Most of the error sources discard the result, the worst thing I can find is
> > ghes_irq_func() will return IRQ_HANDLED, instead of IRQ_NONE when we didn't
> > really handle the IRQ. They're registered as SHARED, but I don't have an example
> > of what goes wrong next.
> >
> > I think this will also stop the spurious handling code kicking in to shut it up
> > if it's broken and screaming. Unlikely, but not impossible.
> >
> > Fixed in a prior patch. With Boris' suggestion, ghes_proc()'s tail ends up looking
> > like this:
> > ----------------------%<----------------------
> > diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> > index 0321d9420b1e..8d1f9930b159 100644
> > --- a/drivers/acpi/apei/ghes.c
> > +++ b/drivers/acpi/apei/ghes.c
> > @@ -700,18 +708,11 @@ static int ghes_proc(struct ghes *ghes)
> >
> >  out:
> >         ghes_clear_estatus(ghes, buf_paddr);
> > +       if (rc != -ENOENT)
> > +               rc_ack = ghes_ack_error(ghes);
> >
> > -       if (rc == -ENOENT)
> > -               return rc;
> > -
> > -       /*
> > -        * GHESv2 type HEST entries introduce support for error acknowledgment,
> > -        * so only acknowledge the error if this support is present.
> > -        */
> > -       if (is_hest_type_generic_v2(ghes))
> > -               return ghes_ack_error(ghes->generic_v2);
> > -
> > -       return rc;
> > +       /* If rc and rc_ack failed, return the first one */
> > +       return rc ? rc : rc_ack;
> >  }
> > ----------------------%<----------------------
> >
> 
> Looks good to me, I guess there's no harm in acking invalid error status blocks.

Err, why?

I don't know what the firmware glue does on ARM but if I'd have to
remain logical - which is hard to do with firmware - the proper thing to
do would be this:

	rc = ghes_read_estatus(ghes, &buf_paddr);
	if (rc) {
		ghes_reset_hardware();
	}

	/* clear estatus and bla bla */

	/* Now, I'm in the success case: */
	 ghes_ack_error();


This way, you have the error path clear if something unexpected happened
when reading the hardware, obvious and separated. ghes_reset_hardware()
clears the registers and does the necessary steps to put the hardware in
good state again so that it can report the next error.

And the success path simply acks the error and does possibly the same
thing. The naming of the functions is important though, to denote what
gets called when.

This way you handle all the cases just fine. No looking at the error
type and blabla.

Right?

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records
  2019-01-11 12:03         ` Borislav Petkov
@ 2019-01-11 15:32           ` Tyler Baicar
  2019-01-11 17:45             ` Borislav Petkov
  2019-01-11 18:09             ` James Morse
  0 siblings, 2 replies; 72+ messages in thread
From: Tyler Baicar @ 2019-01-11 15:32 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, Linux ACPI, James Morse, Naoya Horiguchi, kvmarm,
	arm-mail-list, Len Brown

On Fri, Jan 11, 2019 at 7:03 AM Borislav Petkov <bp@alien8.de> wrote:
> On Thu, Jan 10, 2019 at 04:01:27PM -0500, Tyler Baicar wrote:
> > On Thu, Jan 10, 2019 at 1:23 PM James Morse <james.morse@arm.com> wrote:
> > > >>
> > > >> +    if (is_hest_type_generic_v2(ghes) && ghes_ack_error(ghes->generic_v2))
> > > >
> > > > Since ghes_ack_error() is always prepended with this check, you could
> > > > push it down into the function:
> > > >
> > > > ghes_ack_error(ghes)
> > > > ...
> > > >
> > > >       if (!is_hest_type_generic_v2(ghes))
> > > >               return 0;
> > > >
> > > > and simplify the two callsites :)
> > >
> > > Great idea! ...
> > >
> > > .. huh. Turns out for ghes_proc() we discard any errors other than ENOENT from
> > > ghes_read_estatus() if is_hest_type_generic_v2(). This masks EIO.
> > >
> > > Most of the error sources discard the result, the worst thing I can find is
> > > ghes_irq_func() will return IRQ_HANDLED, instead of IRQ_NONE when we didn't
> > > really handle the IRQ. They're registered as SHARED, but I don't have an example
> > > of what goes wrong next.
> > >
> > > I think this will also stop the spurious handling code kicking in to shut it up
> > > if it's broken and screaming. Unlikely, but not impossible.
> > >
> > > Fixed in a prior patch. With Boris' suggestion, ghes_proc()'s tail ends up looking
> > > like this:
> > > ----------------------%<----------------------
> > > diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> > > index 0321d9420b1e..8d1f9930b159 100644
> > > --- a/drivers/acpi/apei/ghes.c
> > > +++ b/drivers/acpi/apei/ghes.c
> > > @@ -700,18 +708,11 @@ static int ghes_proc(struct ghes *ghes)
> > >
> > >  out:
> > >         ghes_clear_estatus(ghes, buf_paddr);
> > > +       if (rc != -ENOENT)
> > > +               rc_ack = ghes_ack_error(ghes);
> > >
> > > -       if (rc == -ENOENT)
> > > -               return rc;
> > > -
> > > -       /*
> > > -        * GHESv2 type HEST entries introduce support for error acknowledgment,
> > > -        * so only acknowledge the error if this support is present.
> > > -        */
> > > -       if (is_hest_type_generic_v2(ghes))
> > > -               return ghes_ack_error(ghes->generic_v2);
> > > -
> > > -       return rc;
> > > +       /* If rc and rc_ack failed, return the first one */
> > > +       return rc ? rc : rc_ack;
> > >  }
> > > ----------------------%<----------------------
> > >
> >
> > Looks good to me, I guess there's no harm in acking invalid error status blocks.
>
> Err, why?

If ghes_read_estatus() fails, then either there was no error populated or the
error status block was invalid. If the error status block is invalid, then the
kernel doesn't know what happened in hardware.

I originally thought this was changing what's acked, but it's just changing the
return value of ghes_proc() when ghes_read_estatus() returns -EIO.
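A minimal sketch of that difference, as a user-space C model rather than the
kernel code. The model_* names and the ack_result parameter are hypothetical;
they stand in for the two versions of the ghes_proc() tail in the quoted diff
and for whatever ghes_ack_error() would return. It assumes, per the suggestion
earlier in the thread, that the new ghes_ack_error() returns 0 for non-GHESv2
sources.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Old tail: on GHESv2 the ack result replaces rc, so a corrupt record
 * (-EIO) is reported as success when the ack itself succeeds.
 */
static int model_old_tail(int rc, bool is_v2, int ack_result)
{
	if (rc == -ENOENT)
		return rc;
	if (is_v2)
		return ack_result;	/* rc, including -EIO, is dropped here */
	return rc;
}

/*
 * New tail: the ack happens in the same cases as before, but the first
 * failure is the one returned.
 */
static int model_new_tail(int rc, bool is_v2, int ack_result)
{
	int rc_ack = 0;

	if (rc != -ENOENT)
		rc_ack = is_v2 ? ack_result : 0;	/* assumed: 0 for non-v2 */

	/* If rc and rc_ack failed, return the first one */
	return rc ? rc : rc_ack;
}
```

In this model both tails ack under the same condition (rc != -ENOENT on a
GHESv2 source); only the value handed back to the caller differs, and only
when the read failed with -EIO.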

> I don't know what the firmware glue does on ARM but if I'd have to
> remain logical - which is hard to do with firmware - the proper thing to
> do would be this:
>
>         rc = ghes_read_estatus(ghes, &buf_paddr);
>         if (rc) {
>                 ghes_reset_hardware();

The kernel would have no way of knowing what to do here.

>         }
>
>         /* clear estatus and bla bla */
>
>         /* Now, I'm in the success case: */
>          ghes_ack_error();
>
>
> This way, you have the error path clear if something unexpected happened
> when reading the hardware, obvious and separated. ghes_reset_hardware()
> clears the registers and does the necessary steps to put the hardware in
> good state again so that it can report the next error.
>
> And the success path simply acks the error and does possibly the same
> thing. The naming of the functions is important though, to denote what
> gets called when.
>
> This way you handle all the cases just fine. No looking at the error
> type and blabla.
>
> Right?


* Re: [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records
  2019-01-11 15:32           ` Tyler Baicar
@ 2019-01-11 17:45             ` Borislav Petkov
  2019-01-11 18:25               ` James Morse
  2019-01-11 18:09             ` James Morse
  1 sibling, 1 reply; 72+ messages in thread
From: Borislav Petkov @ 2019-01-11 17:45 UTC (permalink / raw)
  To: Tyler Baicar
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, Linux ACPI, James Morse, Naoya Horiguchi, kvmarm,
	arm-mail-list, Len Brown

On Fri, Jan 11, 2019 at 10:32:23AM -0500, Tyler Baicar wrote:
> The kernel would have no way of knowing what to do here.

What do you mean, there's no way of knowing what to do? It needs to
clear registers so that the next error can get reported properly.

Or if the status read failed and it doesn't need to do anything, then it
shouldn't.

Whatever it is, the kernel either needs to do something in the error
case to clean up, or nothing if the firmware doesn't need anything done
in the error case; *or* ack the error in the success case.

This should all be written down somewhere in that GHES v2
spec/doc/writeup whatever, explaining what the OS is supposed to do to
signal the error has been read by the OS.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records
  2019-01-11 15:32           ` Tyler Baicar
  2019-01-11 17:45             ` Borislav Petkov
@ 2019-01-11 18:09             ` James Morse
  2019-01-11 20:01               ` Borislav Petkov
  2019-01-11 20:53               ` Tyler Baicar
  1 sibling, 2 replies; 72+ messages in thread
From: James Morse @ 2019-01-11 18:09 UTC (permalink / raw)
  To: Tyler Baicar, Borislav Petkov
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, Linux ACPI, Naoya Horiguchi, kvmarm, arm-mail-list,
	Len Brown

Hi guys,

On 11/01/2019 15:32, Tyler Baicar wrote:
> On Fri, Jan 11, 2019 at 7:03 AM Borislav Petkov <bp@alien8.de> wrote:
>> On Thu, Jan 10, 2019 at 04:01:27PM -0500, Tyler Baicar wrote:
>>> On Thu, Jan 10, 2019 at 1:23 PM James Morse <james.morse@arm.com> wrote:
>>>>>>
>>>>>> +    if (is_hest_type_generic_v2(ghes) && ghes_ack_error(ghes->generic_v2))
>>>>>
>>>>> Since ghes_ack_error() is always prepended with this check, you could
>>>>> push it down into the function:
>>>>>
>>>>> ghes_ack_error(ghes)
>>>>> ...
>>>>>
>>>>>       if (!is_hest_type_generic_v2(ghes))
>>>>>               return 0;
>>>>>
>>>>> and simplify the two callsites :)
>>>>
>>>> Great idea! ...
>>>>
>>>> .. huh. Turns out for ghes_proc() we discard any errors other than ENOENT from
>>>> ghes_read_estatus() if is_hest_type_generic_v2(). This masks EIO.
>>>>
>>>> Most of the error sources discard the result, the worst thing I can find is
>>>> ghes_irq_func() will return IRQ_HANDLED, instead of IRQ_NONE when we didn't
>>>> really handle the IRQ. They're registered as SHARED, but I don't have an example
>>>> of what goes wrong next.
>>>>
>>>> I think this will also stop the spurious handling code kicking in to shut it up
>>>> if it's broken and screaming. Unlikely, but not impossible.

[....]

>>> Looks good to me, I guess there's no harm in acking invalid error status blocks.

Great, I didn't miss something nasty...


>> Err, why?
> 
> If ghes_read_estatus() fails, then either there was no error populated or the
> error status block was invalid.
> If the error status block is invalid, then the kernel doesn't know what happened
> in hardware.

What do we mean by 'hardware' here? We're receiving a corrupt report of
something via memory.
The GHESv2 ack just means we're done with the memory. I think it exists because
the external-agent can't peek into the CPU to see if it's returned from the
notification.


> I originally thought this was changing what's acked, but it's just changing the
> return value of ghes_proc() when ghes_read_estatus() returns -EIO.

Sorry, that will be due to my bad description.


>> I don't know what the firmware glue does on ARM but if I'd have to
>> remain logical - which is hard to do with firmware - the proper thing to
>> do would be this:
>>
>>         rc = ghes_read_estatus(ghes, &buf_paddr);
>>         if (rc) {
>>                 ghes_reset_hardware();
> 
> The kernel would have no way of knowing what to do here.

Is there anything wrong with what we do today? We stamp on the records so that
we don't process them again (especially if it is polled), and we tell firmware
it can re-use this memory.

(I think we should return an error, or print a ratelimited warning for corrupt
records)


>>         }
>>
>>         /* clear estatus and bla bla */
>>
>>         /* Now, I'm in the success case: */
>>          ghes_ack_error();
>>
>>
>> This way, you have the error path clear if something unexpected happened
>> when reading the hardware, obvious and separated. ghes_reset_hardware()
>> clears the registers and does the necessary steps to put the hardware in
>> good state again so that it can report the next error.
>>
>> And the success path simply acks the error and does possibly the same
>> thing. The naming of the functions is important though, to denote what
>> gets called when.

I think this duplicates the record-stamping/acking. If there is anything in that
memory region, the action for processed/copied/ignored-because-its-corrupt is
the same.

We can return on ENOENT earlier, as nothing needs doing in that case. It's
what the GHES_TO_CLEAR spaghetti is for. We can probably move the ack into
ghes_clear_estatus(), so that function means 'I'm done with this memory'.

Something like:
-------------------------
rc = ghes_read_estatus();
if (rc == -ENOENT)
	return 0;

if (!rc) {
	ghes_do_proc() and friends;
}

ghes_clear_estatus();

return rc;
-------------------------

We would no longer return errors from the ack code. I suspect that can only
happen for a corrupt GAS, which we would have caught earlier as we rely on the
mapping being cached.
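
The flow outlined above can be modelled in user space as follows. This is a
sketch under stated assumptions, not the eventual ghes.c code: every model_*
name and struct field is hypothetical, the firmware block is reduced to a
status word plus a 'corrupt' flag, and the ack is reduced to a counter.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the firmware-shared error status block. */
struct model_estatus {
	unsigned int block_status;	/* non-zero: firmware wrote a record */
	bool corrupt;			/* header failed the sanity checks */
};

struct model_ghes {
	struct model_estatus *fw_block;	/* memory shared with firmware */
	bool is_v2;			/* GHESv2 sources expect an ack */
	int acks;			/* acks sent to firmware so far */
	int processed;			/* records handed to the handlers */
};

/* Returns 0, -ENOENT (nothing reported) or -EIO (corrupt record). */
static int model_read_estatus(struct model_ghes *ghes)
{
	if (!ghes->fw_block->block_status)
		return -ENOENT;
	if (ghes->fw_block->corrupt)
		return -EIO;
	return 0;
}

/* "I'm done with this memory": stamp the record, then ack on GHESv2. */
static void model_clear_estatus(struct model_ghes *ghes)
{
	ghes->fw_block->block_status = 0;
	if (ghes->is_v2)
		ghes->acks++;
}

static int model_ghes_proc(struct model_ghes *ghes)
{
	int rc = model_read_estatus(ghes);

	if (rc == -ENOENT)
		return 0;	/* nothing in the block, nothing to release */

	if (!rc)
		ghes->processed++;	/* ghes_do_proc() and friends */

	model_clear_estatus(ghes);	/* same for processed and corrupt */

	return rc;
}
```

With a corrupt record in the block, model_ghes_proc() returns -EIO and still
stamps and acks it once; a second call finds the block empty and returns 0
without acking again.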



Thanks,

James


* Re: [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records
  2019-01-11 17:45             ` Borislav Petkov
@ 2019-01-11 18:25               ` James Morse
  2019-01-11 19:58                 ` Borislav Petkov
  0 siblings, 1 reply; 72+ messages in thread
From: James Morse @ 2019-01-11 18:25 UTC (permalink / raw)
  To: Borislav Petkov, Tyler Baicar
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, Linux ACPI, Naoya Horiguchi, kvmarm, arm-mail-list,
	Len Brown

Hi Boris,

On 11/01/2019 17:45, Borislav Petkov wrote:
> On Fri, Jan 11, 2019 at 10:32:23AM -0500, Tyler Baicar wrote:
>> The kernel would have no way of knowing what to do here.
> 
> What do you mean, there's no way of knowing what to do? It needs to
> clear registers so that the next error can get reported properly.
> 
> Or if the status read failed and it doesn't need to do anything, then it
> shouldn't.

I think we're speaking at cross-purposes. If the error-detecting-hardware has
some state, that's firmware's problem to deal with.
What we're dealing with here is the memory we read the error records from.


> Whatever it is, the kernel either needs to do something in the error
> case to clean up, or nothing if the firmware doesn't need anything done
> in the error case; *or* ack the error in the success case.

We ack it in the corrupt-record case too, because we are done with the memory.


> This should all be written down somewhere in that GHES v2
> spec/doc/writeup whatever, explaining what the OS is supposed to do to
> signal the error has been read by the OS.

I think it is. 18.3.2.8 of ACPI v6.2 (search for "Generic Hardware Error Source
version 2", then below the table):
* OSPM detects error (via interrupt/exception or polling the block status)
* OSPM copies the error status block
* OSPM clears the block status field of the error status block
* OSPM acknowledges the error via Read Ack register

The ENOENT case is excluded by 'polling the block status'.
Unsurprisingly the spec doesn't consider the case that firmware generates
corrupt records!
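
The four OSPM steps listed above can be sketched as a user-space C model. The
struct layouts and model_* names are hypothetical simplifications; the
preserve/write masks mirror the Read Ack Preserve and Read Ack Write fields of
the GHESv2 table, and the register's bit offset is assumed to be zero here.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, much reduced error status block. */
struct model_status_block {
	uint32_t block_status;	/* non-zero when firmware reports an error */
	uint32_t data_length;
	uint8_t data[64];
};

struct model_ghesv2 {
	struct model_status_block *shared;	/* firmware-owned memory */
	uint64_t read_ack_register;		/* modelled as plain memory */
	uint64_t read_ack_preserve;		/* bits to keep on the ack */
	uint64_t read_ack_write;		/* bits to set on the ack */
};

/* Returns 1 if a record was consumed, 0 if the block was empty. */
static int model_ospm_handle(struct model_ghesv2 *src,
			     struct model_status_block *copy)
{
	uint64_t val;

	/* 1. Detect the error, here by polling the block status. */
	if (!src->shared->block_status)
		return 0;

	/* 2. Copy the error status block. */
	memcpy(copy, src->shared, sizeof(*copy));

	/* 3. Clear the block status field of the error status block. */
	src->shared->block_status = 0;

	/* 4. Acknowledge via the Read Ack register: keep, then set bits. */
	val = src->read_ack_register;
	val &= src->read_ack_preserve;
	val |= src->read_ack_write;
	src->read_ack_register = val;

	return 1;
}
```

The early return in step 1 is the ENOENT case: with nothing in the block there
is no copy, no clear and no ack.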


Thanks,

James


* Re: [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records
  2019-01-11 18:25               ` James Morse
@ 2019-01-11 19:58                 ` Borislav Petkov
  2019-01-23 18:36                   ` James Morse
  0 siblings, 1 reply; 72+ messages in thread
From: Borislav Petkov @ 2019-01-11 19:58 UTC (permalink / raw)
  To: James Morse
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Linux ACPI,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Christoffer Dall, Dongjiu Geng, linux-mm, Naoya Horiguchi,
	kvmarm, arm-mail-list, Len Brown

On Fri, Jan 11, 2019 at 06:25:21PM +0000, James Morse wrote:
> We ack it in the corrupt-record case too, because we are done with the
> memory.

Ok, so the only thing that we need to do unconditionally is ACK in order
to free the memory. Or is there an exception to that set of steps in
error handling?

> I think it is. 18.3.2.8 of ACPI v6.2 (search for "Generic Hardware Error Source
> version 2", then below the table):
> * OSPM detects error (via interrupt/exception or polling the block status)
> * OSPM copies the error status block
> * OSPM clears the block status field of the error status block
> * OSPM acknowledges the error via Read Ack register
> 
> The ENOENT case is excluded by 'polling the block status'.

Ok, so we signal the absence of an error record with ENOENT.

        if (!buf_paddr)
                return -ENOENT;

Can that even happen?

Also, in that case, what would happen if we ACK the error anyway? We'd
confuse the firmware?

I sure hope firmware is prepared for spurious ACKs :)

> Unsurprisingly the spec doesn't consider the case that firmware generates
> corrupt records!

You mean the EIO case?

Not surprised at all. But we do not report that record so all good.

Thx.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records
  2019-01-11 18:09             ` James Morse
@ 2019-01-11 20:01               ` Borislav Petkov
  2019-01-11 20:53               ` Tyler Baicar
  1 sibling, 0 replies; 72+ messages in thread
From: Borislav Petkov @ 2019-01-11 20:01 UTC (permalink / raw)
  To: James Morse
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Linux ACPI,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Christoffer Dall, Dongjiu Geng, linux-mm, Naoya Horiguchi,
	kvmarm, arm-mail-list, Len Brown

On Fri, Jan 11, 2019 at 06:09:28PM +0000, James Morse wrote:
> We can return on ENOENT earlier, as nothing needs doing in that case. It's
> what the GHES_TO_CLEAR spaghetti is for. We can probably move the ack into
> ghes_clear_estatus(), so that function means 'I'm done with this memory'.

That actually sounds nice and other code in the kernel already does
that: when a failure has been encountered during reading status, you
free up resources right then and there. No need for passing retvals back
and forth. And this would simplify the spaghetti. Which is something
good(tm) on its own!

Thx.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records
  2019-01-11 18:09             ` James Morse
  2019-01-11 20:01               ` Borislav Petkov
@ 2019-01-11 20:53               ` Tyler Baicar
  2019-01-29 18:48                 ` James Morse
  1 sibling, 1 reply; 72+ messages in thread
From: Tyler Baicar @ 2019-01-11 20:53 UTC (permalink / raw)
  To: James Morse
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, Linux ACPI, Borislav Petkov, Naoya Horiguchi,
	kvmarm, arm-mail-list, Len Brown

On Fri, Jan 11, 2019 at 1:09 PM James Morse <james.morse@arm.com> wrote:
> On 11/01/2019 15:32, Tyler Baicar wrote:
> > On Fri, Jan 11, 2019 at 7:03 AM Borislav Petkov <bp@alien8.de> wrote:
> >> On Thu, Jan 10, 2019 at 04:01:27PM -0500, Tyler Baicar wrote:
> >>> On Thu, Jan 10, 2019 at 1:23 PM James Morse <james.morse@arm.com> wrote:
> >>>>>>
> >>>>>> +    if (is_hest_type_generic_v2(ghes) && ghes_ack_error(ghes->generic_v2))
> >>>>>
> >>>>> Since ghes_ack_error() is always prepended with this check, you could
> >>>>> push it down into the function:
> >>>>>
> >>>>> ghes_ack_error(ghes)
> >>>>> ...
> >>>>>
> >>>>>       if (!is_hest_type_generic_v2(ghes))
> >>>>>               return 0;
> >>>>>
> >>>>> and simplify the two callsites :)
> >>>>
> >>>> Great idea! ...
> >>>>
> >>>> .. huh. Turns out for ghes_proc() we discard any errors other than ENOENT from
> >>>> ghes_read_estatus() if is_hest_type_generic_v2(). This masks EIO.
> >>>>
> >>>> Most of the error sources discard the result, the worst thing I can find is
> >>>> ghes_irq_func() will return IRQ_HANDLED, instead of IRQ_NONE when we didn't
> >>>> really handle the IRQ. They're registered as SHARED, but I don't have an example
> >>>> of what goes wrong next.
> >>>>
> >>>> I think this will also stop the spurious handling code kicking in to shut it up
> >>>> if it's broken and screaming. Unlikely, but not impossible.
>
> [....]
>
> >>> Looks good to me, I guess there's no harm in acking invalid error status blocks.
>
> Great, I didn't miss something nasty...
>
>
> >> Err, why?
> >
> > If ghes_read_estatus() fails, then either there was no error populated or the
> > error status block was invalid.
> > If the error status block is invalid, then the kernel doesn't know what happened
> > in hardware.
>
> What do we mean by 'hardware' here? We're receiving a corrupt report of
> something via memory.

By Hardware here I meant whatever hardware was reporting the error.

> The GHESv2 ack just means we're done with the memory. I think it exists because
> the external-agent can't peek into the CPU to see if it's returned from the
> notification.
>
>
> > I originally thought this was changing what's acked, but it's just changing the
> > return value of ghes_proc() when ghes_read_estatus() returns -EIO.
>
> Sorry, that will be due to my bad description.
>
>
> >> I don't know what the firmware glue does on ARM but if I'd have to
> >> remain logical - which is hard to do with firmware - the proper thing to
> >> do would be this:
> >>
> >>         rc = ghes_read_estatus(ghes, &buf_paddr);
> >>         if (rc) {
> >>                 ghes_reset_hardware();
> >
> > The kernel would have no way of knowing what to do here.
>
> Is there anything wrong with what we do today? We stamp on the records so that
> we don't process them again (especially if it is polled), and we tell firmware
> it can re-use this memory.
>
> (I think we should return an error, or print a ratelimited warning for corrupt
> records)

Agree, the print is already present in ghes_read_estatus.

> >>         }
> >>
> >>         /* clear estatus and bla bla */
> >>
> >>         /* Now, I'm in the success case: */
> >>          ghes_ack_error();
> >>
> >>
> >> This way, you have the error path clear if something unexpected happened
> >> when reading the hardware, obvious and separated. ghes_reset_hardware()
> >> clears the registers and does the necessary steps to put the hardware in
> >> good state again so that it can report the next error.
> >>
> >> And the success path simply acks the error and does possibly the same
> >> thing. The naming of the functions is important though, to denote what
> >> gets called when.
>
> I think this duplicates the record-stamping/acking. If there is anything in that
> memory region, the action for processed/copied/ignored-because-its-corrupt is
> the same.
>
> We can return on ENOENT earlier, as nothing needs doing in that case. It's
> what the GHES_TO_CLEAR spaghetti is for. We can probably move the ack into
> ghes_clear_estatus(), so that function means 'I'm done with this memory'.
>
> Something like:
> -------------------------
> rc = ghes_read_estatus();
> if (rc == -ENOENT)
>         return 0;

We still should be returning at least the -ENOENT from ghes_read_estatus().
That is being used by the SEA handling to determine if an SEA was properly
reported/handled by the host kernel in the KVM SEA case.

Here are the relevant functions:
https://elixir.bootlin.com/linux/latest/source/drivers/acpi/apei/ghes.c#L797
https://elixir.bootlin.com/linux/latest/source/arch/arm64/mm/fault.c#L723
https://elixir.bootlin.com/linux/latest/source/virt/kvm/arm/mmu.c#L1706
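
A rough user-space model of why the -ENOENT matters to the SEA callers. It is
a sketch, not the code behind the links above: the model_* names are
hypothetical, and each array entry stands for what ghes_proc() would return
for one SEA error source. The assumption is that the notification is only
claimed if at least one source returned 0.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Model of the SEA notification: start from -ENOENT and report success
 * only if some source actually had a record to process.
 */
static int model_notify_sea(const int *proc_results, size_t n)
{
	int ret = -ENOENT;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!proc_results[i])
			ret = 0;
	}

	return ret;
}

/*
 * Model of a caller such as the KVM guest-abort path: the abort counts as
 * handled by the host only when firmware described it.
 */
static int model_guest_abort_claimed(const int *proc_results, size_t n)
{
	return model_notify_sea(proc_results, n) == 0;
}
```

If ghes_proc() folded -ENOENT into 0, every entry would read as 0 and an abort
that firmware never described would be claimed as handled.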

>
> if (!rc) {
>         ghes_do_proc() and friends;
> }
>
> ghes_clear_estatus();
>
> return rc;
> -------------------------
>
> We would no longer return errors from the ack code. I suspect that can only
> happen for a corrupt GAS, which we would have caught earlier as we rely on the
> mapping being cached.


* Re: [PATCH v7 11/25] ACPI / APEI: Move NOTIFY_SEA between the estatus-queue and NOTIFY_NMI
  2018-12-03 18:05 ` [PATCH v7 11/25] ACPI / APEI: Move NOTIFY_SEA between the estatus-queue and NOTIFY_NMI James Morse
@ 2019-01-21 13:01   ` Borislav Petkov
  0 siblings, 0 replies; 72+ messages in thread
From: Borislav Petkov @ 2019-01-21 13:01 UTC (permalink / raw)
  To: James Morse
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, linux-acpi, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Mon, Dec 03, 2018 at 06:05:59PM +0000, James Morse wrote:
> The estatus-queue code is currently hidden by the NOTIFY_NMI #ifdefs.
> Once NOTIFY_SEA starts using the estatus-queue we can stop hiding
> it as each architecture has a user that can't be turned off.
> 
> Split the existing CONFIG_HAVE_ACPI_APEI_NMI block in two, and move
> the SEA code into the gap.
> 
> This patch moves code around ... and changes the stale comment

s/This patch moves/Move the/

> describing why the status queue is necessary: printk() is no
> longer the issue, its the helpers like memory_failure_queue() that
> aren't nmi safe.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
>  drivers/acpi/apei/ghes.c | 113 ++++++++++++++++++++-------------------
>  1 file changed, 59 insertions(+), 54 deletions(-)

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: [PATCH v7 17/25] ACPI / APEI: Pass ghes and estatus separately to avoid a later copy
  2018-12-03 18:06 ` [PATCH v7 17/25] ACPI / APEI: Pass ghes and estatus separately to avoid a later copy James Morse
@ 2019-01-21 13:35   ` Borislav Petkov
  0 siblings, 0 replies; 72+ messages in thread
From: Borislav Petkov @ 2019-01-21 13:35 UTC (permalink / raw)
  To: James Morse
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, linux-acpi, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Mon, Dec 03, 2018 at 06:06:05PM +0000, James Morse wrote:
> The NMI-like notifications scribble over ghes->estatus, before
> copying it somewhere else. If this interrupts the ghes_probe() code
> calling ghes_proc() on each struct ghes, the data is corrupted.
> 
> All the NMI-like notifications should use a queued estatus entry
> from the beginning, instead of the ghes version, then copying it.
> To do this, break up any use of "ghes->estatus" so that all
> functions take the estatus as an argument.
> 
> This patch just moves these ghes->estatus dereferences into separate

s/This patch just moves/Move/

> arguments, no change in behaviour. struct ghes becomes unused in
> ghes_clear_estatus() as it only wanted ghes->estatus, which we now
> pass directly. This is removed.
> 
> Signed-off-by: James Morse <james.morse@arm.com>

...

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: [PATCH v7 18/25] ACPI / APEI: Split ghes_read_estatus() to allow a peek at the CPER length
  2018-12-03 18:06 ` [PATCH v7 18/25] ACPI / APEI: Split ghes_read_estatus() to allow a peek at the CPER length James Morse
@ 2019-01-21 13:53   ` Borislav Petkov
  0 siblings, 0 replies; 72+ messages in thread
From: Borislav Petkov @ 2019-01-21 13:53 UTC (permalink / raw)
  To: James Morse
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, linux-acpi, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Mon, Dec 03, 2018 at 06:06:06PM +0000, James Morse wrote:
> ghes_read_estatus() reads the record address, then the record's
> header, then performs some sanity checks before reading the
> records into the provided estatus buffer.
> 
> To provide this estatus buffer the caller must know the size of the
> records in advance, or always provide a worst-case sized buffer as
> happens today for the non-NMI notifications.
> 
> Add a function to peek at the record's header to find the size. This
> will let the NMI path allocate the right amount of memory before reading
> the records, instead of using the worst-case size, and having to copy
> the records.
> 
> Split ghes_read_estatus() to create __ghes_peek_estatus() which
> returns the address and size of the CPER records.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> 
> Changes since v6:
>  * Additional buf_addr = 0 error handling
>  * Moved checking out of peek-estatus
>  * Reworded an error message so we can tell them apart
> ---
>  drivers/acpi/apei/ghes.c | 59 ++++++++++++++++++++++++++++++++--------
>  1 file changed, 47 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index b70f5fd962cc..07a12aac4c1a 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -277,12 +277,12 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
>  	}
>  }
>  
> -static int ghes_read_estatus(struct ghes *ghes,
> -			     struct acpi_hest_generic_status *estatus,
> -			     u64 *buf_paddr, int fixmap_idx)
> +/* Read the CPER block and returning its address, and header in estatus. */

s/and /,/

> +static int __ghes_peek_estatus(struct ghes *ghes, int fixmap_idx,
> +			       struct acpi_hest_generic_status *estatus,

Also, we probably should stick to some order of arguments of those
functions for easier code staring, i.e.

	function_name(ghes, estatus, buf_paddr, fixmap_idx)

or so.

> +			       u64 *buf_paddr)
>  {
>  	struct acpi_hest_generic *g = ghes->generic;
> -	u32 len;
>  	int rc;
>  
>  	rc = apei_read(buf_paddr, &g->error_status_address);
> @@ -303,29 +303,64 @@ static int ghes_read_estatus(struct ghes *ghes,
>  		return -ENOENT;
>  	}
>  
> -	rc = -EIO;
> -	len = cper_estatus_len(estatus);
> +	return 0;
> +}
> +
> +/* Check the top-level record header has an appropriate size. */
> +int __ghes_check_estatus(struct ghes *ghes,
> +			 struct acpi_hest_generic_status *estatus)
> +{
> +	u32 len = cper_estatus_len(estatus);
> +	int rc = -EIO;
> +
>  	if (len < sizeof(*estatus))
>  		goto err_read_block;
>  	if (len > ghes->generic->error_block_length)
>  		goto err_read_block;
>  	if (cper_estatus_check_header(estatus))
>  		goto err_read_block;

Please make this chunk more user-friendly, maybe in a separate patch ontop:

/* Check the top-level record header has an appropriate size. */
int __ghes_check_estatus(struct ghes *ghes,
                         struct acpi_hest_generic_status *estatus)
{
        u32 len = cper_estatus_len(estatus);

        if (len < sizeof(*estatus)) {
                pr_warn_ratelimited(FW_WARN GHES_PFX "Truncated error status block!\n");
                return -EIO;
        }

        if (len > ghes->generic->error_block_length) {
                pr_warn_ratelimited(FW_WARN GHES_PFX "Invalid error status block length!\n");
                return -EIO;
        }

        if (cper_estatus_check_header(estatus)) {
                pr_warn_ratelimited(FW_WARN GHES_PFX "Invalid CPER header!\n");
                return -EIO;
        }

        return 0;
}
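
For illustration only, here is a minimal userspace sketch of the same
three checks, with one distinct outcome per failure. It is not code from
the series: struct estatus_hdr, estatus_len(), check_estatus() and the
verdict enum are hypothetical stand-ins, and the last check is a
simplified assumption, not what cper_estatus_check_header() really does.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct acpi_hest_generic_status. */
struct estatus_hdr {
	uint32_t block_status;
	uint32_t raw_data_offset;
	uint32_t raw_data_length;
	uint32_t data_length;
	uint32_t error_severity;
};

enum estatus_verdict {
	ESTATUS_OK,
	ESTATUS_TRUNCATED,	/* shorter than the header itself */
	ESTATUS_TOO_LONG,	/* longer than the advertised block */
	ESTATUS_BAD_HEADER,	/* header fields contradict each other */
};

/* Total length claimed by the header (assumed layout, see lead-in). */
static uint32_t estatus_len(const struct estatus_hdr *h)
{
	if (h->raw_data_length)
		return h->raw_data_offset + h->raw_data_length;
	return (uint32_t)sizeof(*h) + h->data_length;
}

/* One verdict per failure, so each can get its own message. */
enum estatus_verdict check_estatus(const struct estatus_hdr *h,
				   uint32_t error_block_length)
{
	uint32_t len = estatus_len(h);

	if (len < sizeof(*h))
		return ESTATUS_TRUNCATED;
	if (len > error_block_length)
		return ESTATUS_TOO_LONG;
	/* Simplified: raw data must not overlap the data sections. */
	if (h->raw_data_length &&
	    h->raw_data_offset < sizeof(*h) + h->data_length)
		return ESTATUS_BAD_HEADER;
	return ESTATUS_OK;
}

/* Callers still only see success or -EIO. */
int verdict_to_errno(enum estatus_verdict v)
{
	return v == ESTATUS_OK ? 0 : -EIO;
}
```

Every failure still collapses to -EIO for the caller; only the
diagnostic differs.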

Thx.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v7 19/25] ACPI / APEI: Only use queued estatus entry during _in_nmi_notify_one()
  2018-12-03 18:06 ` [PATCH v7 19/25] ACPI / APEI: Only use queued estatus entry during _in_nmi_notify_one() James Morse
@ 2019-01-21 17:19   ` Borislav Petkov
  0 siblings, 0 replies; 72+ messages in thread
From: Borislav Petkov @ 2019-01-21 17:19 UTC (permalink / raw)
  To: James Morse
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, linux-acpi, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Mon, Dec 03, 2018 at 06:06:07PM +0000, James Morse wrote:
> Each struct ghes has a worst-case sized buffer for storing the
> estatus. If an error is being processed by ghes_proc() in process
> context this buffer will be in use. If the error source then triggers
> an NMI-like notification, the same buffer will be used by
> _in_nmi_notify_one() to stage the estatus data, before
> __process_error() copies it into a queued estatus entry.
> 
> Merge __process_error()'s work into _in_nmi_notify_one() so that
> the queued estatus entry is used from the beginning. Use the new
> ghes_peek_estatus() to know how much memory to allocate from
> the ghes_estatus_pool before reading the records.
> 
> Reported-by: Borislav Petkov <bp@suse.de>
> Signed-off-by: James Morse <james.morse@arm.com>
> 
> Change since v6:
>  * Added a comment explaining the 'ack-error, then goto no_work'.
>  * Added missing estatus-clearing, which is necessary after reading the GAS.
> ---
>  drivers/acpi/apei/ghes.c | 59 ++++++++++++++++++++++++----------------
>  1 file changed, 35 insertions(+), 24 deletions(-)

Reviewed-by: Borislav Petkov <bp@suse.de>
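
The peek-then-allocate flow described in the quoted commit message can
be sketched in userspace roughly as follows. This is an assumption-laden
model, not the patch: struct rec_hdr, peek_record(), check_record() and
read_record() are hypothetical names, and malloc() stands in for the
allocation from ghes_estatus_pool.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record header: total length of header plus payload. */
struct rec_hdr {
	uint32_t total_len;
};

/* Peek: copy only the header out of the firmware buffer. No checks. */
void peek_record(const uint8_t *fw, struct rec_hdr *hdr)
{
	memcpy(hdr, fw, sizeof(*hdr));
}

/* Check: the length must cover the header and fit the firmware buffer. */
int check_record(const struct rec_hdr *hdr, size_t fw_size)
{
	if (hdr->total_len < sizeof(*hdr) || hdr->total_len > fw_size)
		return -1;
	return 0;
}

/*
 * Read: peek, check, then allocate exactly total_len bytes and copy the
 * record once, straight into its final home. Returns NULL on failure.
 */
uint8_t *read_record(const uint8_t *fw, size_t fw_size, uint32_t *len_out)
{
	struct rec_hdr hdr;
	uint8_t *copy;

	if (fw_size < sizeof(hdr))
		return NULL;
	peek_record(fw, &hdr);
	if (check_record(&hdr, fw_size))
		return NULL;

	copy = malloc(hdr.total_len);
	if (!copy)
		return NULL;
	memcpy(copy, fw, hdr.total_len);
	*len_out = hdr.total_len;
	return copy;
}
```

The point of the split is that the destination is sized from the header,
so no worst-case staging buffer and no second copy are needed.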

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v7 20/25] ACPI / APEI: Use separate fixmap pages for arm64 NMI-like notifications
  2018-12-03 18:06 ` [PATCH v7 20/25] ACPI / APEI: Use separate fixmap pages for arm64 NMI-like notifications James Morse
@ 2019-01-21 17:27   ` Borislav Petkov
  2019-01-23 18:33     ` James Morse
  0 siblings, 1 reply; 72+ messages in thread
From: Borislav Petkov @ 2019-01-21 17:27 UTC (permalink / raw)
  To: James Morse
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, linux-acpi, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Mon, Dec 03, 2018 at 06:06:08PM +0000, James Morse wrote:
> Now that ghes notification helpers provide the fixmap slots and
> take the lock themselves, multiple NMI-like notifications can
> be used on arm64.
> 
> These should be named after their notification method as they can't
> all be called 'NMI'. x86's NOTIFY_NMI already is, change the SEA
> fixmap entry to be called FIX_APEI_GHES_SEA.
> 
> Future patches can add support for FIX_APEI_GHES_SEI and
> FIX_APEI_GHES_SDEI_{NORMAL,CRITICAL}.
> 
> Because all of ghes.c builds on both architectures, provide a
> constant for each fixmap entry that the architecture will never
> use.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> 
> ---
> Changes since v6:
>  * Added #ifdef definitions of each missing fixmap entry.
> 
> Changes since v3:
>  * idx/lock are now in a separate struct.
>  * Add to the comment above ghes_fixmap_lock_irq so that it makes more
>    sense in isolation.
> 
> fixup for split fixmap
> ---
>  arch/arm64/include/asm/fixmap.h |  2 +-
>  drivers/acpi/apei/ghes.c        | 10 +++++++++-
>  2 files changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
> index ec1e6d6fa14c..966dd4bb23f2 100644
> --- a/arch/arm64/include/asm/fixmap.h
> +++ b/arch/arm64/include/asm/fixmap.h
> @@ -55,7 +55,7 @@ enum fixed_addresses {
>  #ifdef CONFIG_ACPI_APEI_GHES
>  	/* Used for GHES mapping from assorted contexts */
>  	FIX_APEI_GHES_IRQ,
> -	FIX_APEI_GHES_NMI,
> +	FIX_APEI_GHES_SEA,
>  #endif /* CONFIG_ACPI_APEI_GHES */
>  
>  #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index 849da0d43a21..6cbf9471b2a2 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -85,6 +85,14 @@
>  	((struct acpi_hest_generic_status *)				\
>  	 ((struct ghes_estatus_node *)(estatus_node) + 1))
>  
> +/* NMI-like notifications vary by architecture. Fill in the fixmap gaps */
> +#ifndef CONFIG_HAVE_ACPI_APEI_NMI
> +#define FIX_APEI_GHES_NMI	-1
> +#endif
> +#ifndef CONFIG_ACPI_APEI_SEA
> +#define FIX_APEI_GHES_SEA	-1

I'm guessing those -1 are going to cause __set_fixmap() to fail, right?

I'm wondering if we could catch that situation in ghes_map() already to
protect ourselves against future changes in the fixmap code...

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v7 22/25] ACPI / APEI: Kick the memory_failure() queue for synchronous errors
  2018-12-03 18:06 ` [PATCH v7 22/25] ACPI / APEI: Kick the memory_failure() queue for synchronous errors James Morse
  2018-12-05  2:02   ` Xie XiuQi
@ 2019-01-21 17:58   ` Borislav Petkov
  2019-01-23 18:40     ` James Morse
  1 sibling, 1 reply; 72+ messages in thread
From: Borislav Petkov @ 2019-01-21 17:58 UTC (permalink / raw)
  To: James Morse
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, linux-acpi, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Mon, Dec 03, 2018 at 06:06:10PM +0000, James Morse wrote:
> memory_failure() offlines or repairs pages of memory that have been
> discovered to be corrupt. These may be detected by an external
> component, (e.g. the memory controller), and notified via an IRQ.
> In this case the work is queued as not all of memory_failure()'s work
> can happen in IRQ context.
> 
> If the error was detected as a result of user-space accessing a
> corrupt memory location the CPU may take an abort instead. On arm64
> this is a 'synchronous external abort', and on a firmware first
> system it is replayed using NOTIFY_SEA.
> 
> This notification has NMI like properties, (it can interrupt
> IRQ-masked code), so the memory_failure() work is queued. If we
> return to user-space before the queued memory_failure() work is
> processed, we will take the fault again. This loop may cause platform
> firmware to exceed some threshold and reboot when Linux could have
> recovered from this error.
> 
> If a ghes notification type indicates that it may be triggered again
> when we return to user-space, use the task-work and notify-resume
> hooks to kick the relevant memory_failure() queue before returning
> to user-space.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> 
> ---
> current->mm == &init_mm ? I couldn't find a helper for this.
> The intent is not to set TIF flags on kernel threads. What happens
> if a kernel-thread takes one of these? It's just one of the many
> not-handled-very-well cases we have already, as memory_failure()
> puts it: "try to be lucky".
> 
> I assume that if NOTIFY_NMI is coming from SMM it must suffer from
> this problem too.

Good question.

I'm guessing all those things should be queued on a normal struct
work_struct queue, no?

Now, memory_failure_queue() does that and can run from IRQ context so
you need only an irq_work which can queue from NMI context. We do it
this way in the MCA code:

We queue in an irq_work in NMI context and work through the items in
process context.

> ---
>  drivers/acpi/apei/ghes.c | 65 ++++++++++++++++++++++++++++++++++++----
>  1 file changed, 60 insertions(+), 5 deletions(-)

...

> @@ -407,7 +447,22 @@ static void ghes_handle_memory_failure(struct acpi_hest_generic_data *gdata, int
>  
>  	if (flags != -1)
>  		memory_failure_queue(pfn, flags);
> -#endif
> +
> +	/*
> +	 * If the notification indicates that it was the interrupted
> +	 * instruction that caused the error, try to kick the
> +	 * memory_failure() queue before returning to user-space.
> +	 */
> +	if (ghes_is_synchronous(ghes) && current->mm != &init_mm) {
> +		callback = kzalloc(sizeof(*callback), GFP_ATOMIC);

Can we avoid that GFP_ATOMIC allocation and kfree() in
ghes_kick_memory_failure()?

I mean, that struct ghes_memory_failure_work is small enough and we
already do lockless allocation:

	estatus_node = (void *)gen_pool_alloc(ghes_estatus_pool, node_len);

so I guess we could add that ghes_memory_failure_work struct to that
estatus_node, hand it into ghes_do_proc() and then free it.

No?

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v7 22/25] ACPI / APEI: Kick the memory_failure() queue for synchronous errors
  2018-12-10 19:15     ` James Morse
@ 2019-01-22 10:51       ` Borislav Petkov
  2019-01-23 18:37         ` James Morse
  0 siblings, 1 reply; 72+ messages in thread
From: Borislav Petkov @ 2019-01-22 10:51 UTC (permalink / raw)
  To: James Morse
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-acpi, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, Wang Xiongfeng, linux-mm, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Mon, Dec 10, 2018 at 07:15:13PM +0000, James Morse wrote:
> What happens if we miss MF_ACTION_REQUIRED?

AFAICU, the logic is to force-send a signal to the user process, i.e.,
force_sig_info() which cannot be ignored. IOW, an "enlightened" process
would know how to do recovery action from a memory error.

VS the action optional thing which you can handle at your leisure.

So the question boils down to what kind of severity do the errors
reported through SEA have? I mean, if the hw would go to the trouble to do
the synchronous reporting, then something important must've happened and
it wants us to know about it and handle it.

> Surely the page still gets unmapped as it's PG_Poisoned, an AO signal
> may be pending, but if user-space touches the page it will get an AR
> signal. Is this just about removing an extra AO signal to user-space?
>
> If we do need this, I'd like to pick it up from the CPER records, as x86's
> NOTIFY_NMI looks like it covers both AO/AR cases. (as does NOTIFY_SDEI). The
> Master/Target abort or Invalid-address types in the memory-error-section CPER
> records look like the best bet.

Right, and we do all kinds of severity mapping there aka ghes_severity()
so that'll be a good start, methinks.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v7 20/25] ACPI / APEI: Use separate fixmap pages for arm64 NMI-like notifications
  2019-01-21 17:27   ` Borislav Petkov
@ 2019-01-23 18:33     ` James Morse
  2019-01-31 13:38       ` Borislav Petkov
  0 siblings, 1 reply; 72+ messages in thread
From: James Morse @ 2019-01-23 18:33 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, linux-acpi, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

Hi Boris,

On 21/01/2019 17:27, Borislav Petkov wrote:
> On Mon, Dec 03, 2018 at 06:06:08PM +0000, James Morse wrote:
>> Now that ghes notification helpers provide the fixmap slots and
>> take the lock themselves, multiple NMI-like notifications can
>> be used on arm64.
>>
>> These should be named after their notification method as they can't
>> all be called 'NMI'. x86's NOTIFY_NMI already is, change the SEA
>> fixmap entry to be called FIX_APEI_GHES_SEA.
>>
>> Future patches can add support for FIX_APEI_GHES_SEI and
>> FIX_APEI_GHES_SDEI_{NORMAL,CRITICAL}.
>>
>> Because all of ghes.c builds on both architectures, provide a
>> constant for each fixmap entry that the architecture will never
>> use.

>> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
>> index 849da0d43a21..6cbf9471b2a2 100644
>> --- a/drivers/acpi/apei/ghes.c
>> +++ b/drivers/acpi/apei/ghes.c
>> @@ -85,6 +85,14 @@
>>  	((struct acpi_hest_generic_status *)				\
>>  	 ((struct ghes_estatus_node *)(estatus_node) + 1))
>>  
>> +/* NMI-like notifications vary by architecture. Fill in the fixmap gaps */
>> +#ifndef CONFIG_HAVE_ACPI_APEI_NMI
>> +#define FIX_APEI_GHES_NMI	-1
>> +#endif
>> +#ifndef CONFIG_ACPI_APEI_SEA
>> +#define FIX_APEI_GHES_SEA	-1
> 
> I'm guessing those -1 are going to cause __set_fixmap() to fail, right?

It shouldn't be possible, these are just to give the compiler something int
shaped to work with, until it prunes all the callers.

But for arm64, yes it would fail. -1 shouldn't alias an existing entry, and it
will get caught by:
| BUG_ON(idx <= FIX_HOLE || idx >= __end_of_fixed_addresses);

I wanted BUILD_BUG_ON() here, as any user of these should be optimised out, but
the compiler choked on that.

__end_of_fixed_addresses would be a better arch-agnostic invalid value. It has
to be defined as the last value in the enum for core code's fix_to_virt() to work.


These two look like something left behind from when we had different #ifdeffery.
The users of these two are now behind arch specific #ifdefs that, since patch 12
of this series, can't be turned off, so I can remove these.

We do need them for SDEI, as it is relying on IS_ENABLED() and the compiler's
dead code elimination. But the compiler wants that symbol to have the right type
before it gets that far.


|#define FIX_APEI_GHES_SDEI_NORMAL      (BUILD_BUG(), -1)

Was the best I had, but this trips the BUILD_BUG() too early.
With it, x86 BUILD_BUG()s. With just the -1 the path gets pruned out, and there
are no 'sdei' symbols in the object file.

...at this point, I stopped caring!


> I'm wondering if we could catch that situation in ghes_map() already to
> protect ourselves against future changes in the fixmap code...

We already skip registering notifiers if the kconfig option wasn't selected.

We can't catch this at compile time, as the dead-code elimination seems to
happen in multiple passes.

I'll switch the SDEI ones to __end_of_fixed_addresses, as both architectures
BUG() when they see this.
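
As a rough userspace sketch of the idea (an illustration under stated
assumptions, not the ghes.c change): the enum below is a hypothetical
miniature of a fixmap table, SLOT_END plays the role of
__end_of_fixed_addresses, and slot_is_valid() mirrors the range test in
the arm64 BUG_ON() quoted above.

```c
#include <stdbool.h>

/* Hypothetical miniature of an architecture's fixed_addresses enum. */
enum fixed_slot {
	SLOT_HOLE,		/* index 0 is never handed out */
	SLOT_GHES_IRQ,
	SLOT_GHES_SEA,
	SLOT_END		/* stands in for __end_of_fixed_addresses */
};

/*
 * An architecture without SDEI still needs the name to exist and to be
 * int shaped, so map it to the one value every range check rejects.
 */
#define SLOT_GHES_SDEI_NORMAL	SLOT_END

/* Same shape as: idx <= FIX_HOLE || idx >= __end_of_fixed_addresses */
bool slot_is_valid(int idx)
{
	return idx > SLOT_HOLE && idx < SLOT_END;
}
```

Both -1 and the end-of-enum value fail this test; the difference is that
the end-of-enum value is invalid by construction on every architecture.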


Thanks,

James


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records
  2019-01-11 19:58                 ` Borislav Petkov
@ 2019-01-23 18:36                   ` James Morse
  2019-01-29 11:49                     ` Borislav Petkov
  0 siblings, 1 reply; 72+ messages in thread
From: James Morse @ 2019-01-23 18:36 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Linux ACPI,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Christoffer Dall, Dongjiu Geng, linux-mm, Naoya Horiguchi,
	kvmarm, arm-mail-list, Len Brown

Hi Boris,

On 11/01/2019 19:58, Borislav Petkov wrote:
> On Fri, Jan 11, 2019 at 06:25:21PM +0000, James Morse wrote:
>> We ack it in the corrupt-record case too, because we are done with the
>> memory.
> 
> Ok, so the only thing that we need to do unconditionally is ACK in order
> to free the memory. Or is there an exception to that set of steps in
> error handling?

Do you consider ENOENT an error? We don't ack in that case as the memory wasn't
in use.

For the other cases it's because the records are bogus, but we still
unconditionally tell firmware we're done with them.


>> I think it is. 18.3.2.8 of ACPI v6.2 (search for "Generic Hardware Error Source
>> version 2", then below the table):
>> * OSPM detects error (via interrupt/exception or polling the block status)
>> * OSPM copies the error status block
>> * OSPM clears the block status field of the error status block
>> * OSPM acknowledges the error via Read Ack register
>>
>> The ENOENT case is excluded by 'polling the block status'.
> 
> Ok, so we signal the absence of an error record with ENOENT.
> 
>         if (!buf_paddr)
>                 return -ENOENT;
> 
> Can that even happen?

Yes, for NOTIFY_POLLED it's the norm. For the IRQ flavours that walk a list of
GHES, all but one of them will return ENOENT.


> Also, in that case, what would happen if we ACK the error anyway? We'd
> confuse the firmware?

No idea.

> I sure hope firmware is prepared for spurious ACKs :)

We could try it and see. It depends if firmware shares ack locations between
multiple GHES. We could ack an empty GHES, and it removes the records of one we
haven't looked at yet.


>> Unsurprisingly the spec doesn't consider the case that firmware generates
>> corrupt records!
> 
> You mean the EIO case?

Yup,

> Not surprised at all. But we do not report that record so all good.



Thanks,

James


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v7 22/25] ACPI / APEI: Kick the memory_failure() queue for synchronous errors
  2019-01-22 10:51       ` Borislav Petkov
@ 2019-01-23 18:37         ` James Morse
  0 siblings, 0 replies; 72+ messages in thread
From: James Morse @ 2019-01-23 18:37 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-acpi, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, Wang Xiongfeng, linux-mm, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

Hi Boris,

On 22/01/2019 10:51, Borislav Petkov wrote:
> On Mon, Dec 10, 2018 at 07:15:13PM +0000, James Morse wrote:
>> What happens if we miss MF_ACTION_REQUIRED?
> 
> AFAICU, the logic is to force-send a signal to the user process, i.e.,
> force_sig_info() which cannot be ignored. IOW, an "enlightened" process
> would know how to do recovery action from a memory error.
> 
> VS the action optional thing which you can handle at your leisure.

> So the question boils down to what kind of severity do the errors
> reported through SEA have? I mean, if the hw would go to the trouble to do
> the synchronous reporting, then something important must've happened and
> it wants us to know about it and handle it.

Before v8.2 we assumed these were fatal for the thread: it couldn't make progress.
Since v8.2 we get a value from the CPU. The severity values are (the flippant
summary is obviously mine!):
* Recoverable: "You're about to step in it, fix it or die"
* Uncontainable: "It was here, but it escaped, we don't know where it went, panic!"
* Restartable/Corrected: "it's fine, pretend this didn't happen"

Firmware should duplicate these values into the CPER severity fields.


>> Surely the page still gets unmapped as it's PG_Poisoned, an AO signal
>> may be pending, but if user-space touches the page it will get an AR
>> signal. Is this just about removing an extra AO signal to user-space?

If we miss MF_ACTION_REQUIRED, the page still gets unmapped from user-space, and
user-space gets an AO signal. With this patch it takes that signal before it
continues. If it ignores it, the access gets a translation-fault->EHWPOISON->AR
signal from the arch code.

... so missing the flag gives us an extra signal. I'm not convinced this results
in any observable difference.
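
From user-space's side, the AO/AR distinction shows up in the si_code of
the SIGBUS it receives. A minimal sketch, assuming a Linux libc that
exposes BUS_MCEERR_AR and BUS_MCEERR_AO in <signal.h> (the fallback
values are the Linux ABI numbers); enum mce_action and classify_sigbus()
are hypothetical names used only for this example.

```c
#include <signal.h>

/* Fallbacks in case the libc headers hide these behind feature macros. */
#ifndef BUS_MCEERR_AR
#define BUS_MCEERR_AR 4
#endif
#ifndef BUS_MCEERR_AO
#define BUS_MCEERR_AO 5
#endif

enum mce_action {
	MCE_NOT_MEMORY_ERROR,	/* some other kind of SIGBUS */
	MCE_MUST_HANDLE_NOW,	/* AR: the faulting access cannot complete */
	MCE_MAY_HANDLE_LATER,	/* AO: page is poisoned but not yet consumed */
};

/* Map the si_code of a SIGBUS onto what the process has to do about it. */
enum mce_action classify_sigbus(int si_code)
{
	switch (si_code) {
	case BUS_MCEERR_AR:
		return MCE_MUST_HANDLE_NOW;
	case BUS_MCEERR_AO:
		return MCE_MAY_HANDLE_LATER;
	default:
		return MCE_NOT_MEMORY_ERROR;
	}
}
```

A process that ignores the AO case and later touches the page ends up in
the AR case anyway, which is the "extra signal" described above.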


>> If we do need this, I'd like to pick it up from the CPER records, as x86's
>> NOTIFY_NMI looks like it covers both AO/AR cases. (as does NOTIFY_SDEI). The
>> Master/Target abort or Invalid-address types in the memory-error-section CPER
>> records look like the best bet.
> 
> Right, and we do all kinds of severity mapping there aka ghes_severity()
> so that'll be a good start, methinks.

The options are those 'aborts' in the memory error. These must have been a
result of some request. If we get a CPU error structure as part of the same
block, it may have a cache/bus error structure, which has a precise bit that
tells us whether this is a coincidence (but Linux doesn't support any of those
structures today).



Thanks,

James


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v7 22/25] ACPI / APEI: Kick the memory_failure() queue for synchronous errors
  2019-01-21 17:58   ` Borislav Petkov
@ 2019-01-23 18:40     ` James Morse
  2019-01-31 14:04       ` Borislav Petkov
  0 siblings, 1 reply; 72+ messages in thread
From: James Morse @ 2019-01-23 18:40 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, linux-acpi, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

Hi Boris,

On 21/01/2019 17:58, Borislav Petkov wrote:
> On Mon, Dec 03, 2018 at 06:06:10PM +0000, James Morse wrote:
>> memory_failure() offlines or repairs pages of memory that have been
>> discovered to be corrupt. These may be detected by an external
>> component, (e.g. the memory controller), and notified via an IRQ.
>> In this case the work is queued as not all of memory_failure()'s work
>> can happen in IRQ context.
>>
>> If the error was detected as a result of user-space accessing a
>> corrupt memory location the CPU may take an abort instead. On arm64
>> this is a 'synchronous external abort', and on a firmware first
>> system it is replayed using NOTIFY_SEA.
>>
>> This notification has NMI like properties, (it can interrupt
>> IRQ-masked code), so the memory_failure() work is queued. If we
>> return to user-space before the queued memory_failure() work is
>> processed, we will take the fault again. This loop may cause platform
>> firmware to exceed some threshold and reboot when Linux could have
>> recovered from this error.
>>
>> If a ghes notification type indicates that it may be triggered again
>> when we return to user-space, use the task-work and notify-resume
>> hooks to kick the relevant memory_failure() queue before returning
>> to user-space.

>> ---

>> I assume that if NOTIFY_NMI is coming from SMM it must suffer from
>> this problem too.
> 
> Good question.
> 
> I'm guessing all those things should be queued on a normal struct
> work_struct queue, no?

ghes_notify_nmi() does this today with its:
|	irq_work_queue(&ghes_proc_irq_work);

Once it's in IRQ context, the irq_work pokes memory_failure_queue(), which
schedule_work_on()s.

Finally we schedule() in process context, and can unmap the affected memory.


The problem is between each of these steps we might return to user-space and run
the instruction that tripped all this to begin with.


My SMM comment was because the CPU must jump from user-space->SMM, which injects
an NMI into the kernel. The kernel's EIP must point into user-space, so
returning from the NMI without doing the memory_failure() work puts us back in the
same position we started in.


> Now, memory_failure_queue() does that and can run from IRQ context so
> you need only an irq_work which can queue from NMI context. We do it
> this way in the MCA code:
> 

(was there something missing here?)

> We queue in an irq_work in NMI context and work through the items in
> process context.

How are you getting from NMI to process context in one go?

This patch causes the IRQ->process transition.
The arch specific bit of this gives the irq work queue a kick if returning from
the NMI would unmask IRQs. This makes it look like we moved from NMI to IRQ
context without returning to user-space.

Once ghes_handle_memory_failure() runs in IRQ context, it task_work_add()s the
call to ghes_kick_memory_failure().

Finally on the way out of the kernel to user-space that task_work runs and the
memory_failure() work happens in process context.

During all this the user-space program counter can point at a poisoned location,
but we don't return there until the memory_failure() work has been done.
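
The chain of hand-offs above can be modelled as a toy state machine. The
sketch below is purely illustrative userspace C: struct cpu_model and all
four functions are hypothetical, and each flag merely stands for "work is
pending at this stage"; none of it is kernel API.

```c
#include <stdbool.h>

/* Hypothetical per-CPU model: one flag per deferral stage. */
struct cpu_model {
	bool irq_work_pending;	/* queued from NMI-like context */
	bool mf_work_queued;	/* memory_failure() work waiting */
	bool task_work_pending;	/* kick requested before ret-to-user */
	bool page_handled;	/* poisoned page has been dealt with */
};

/* NMI-like context: almost nothing is safe, so only flag irq work. */
void nmi_notify(struct cpu_model *c)
{
	c->irq_work_pending = true;
}

/* IRQ context: queue the real work; ask for a kick if synchronous. */
void run_irq_work(struct cpu_model *c, bool synchronous)
{
	if (!c->irq_work_pending)
		return;
	c->irq_work_pending = false;
	c->mf_work_queued = true;
	if (synchronous)
		c->task_work_pending = true;
}

/* Process context: the queued work finally runs. */
void run_mf_queue(struct cpu_model *c)
{
	if (c->mf_work_queued) {
		c->mf_work_queued = false;
		c->page_handled = true;
	}
}

/*
 * Exit path: pending task work runs first. Returns true if the page was
 * handled before user-space gets to retry the faulting instruction.
 */
bool return_to_user(struct cpu_model *c)
{
	if (c->task_work_pending) {
		c->task_work_pending = false;
		run_mf_queue(c);
	}
	return c->page_handled;
}
```

In the synchronous case the queue is drained on the exit path; without
the kick, user-space can resume before the queue has run.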


>> @@ -407,7 +447,22 @@ static void ghes_handle_memory_failure(struct acpi_hest_generic_data *gdata, int
>>  
>>  	if (flags != -1)
>>  		memory_failure_queue(pfn, flags);
>> -#endif
>> +
>> +	/*
>> +	 * If the notification indicates that it was the interrupted
>> +	 * instruction that caused the error, try to kick the
>> +	 * memory_failure() queue before returning to user-space.
>> +	 */
>> +	if (ghes_is_synchronous(ghes) && current->mm != &init_mm) {
>> +		callback = kzalloc(sizeof(*callback), GFP_ATOMIC);
> 
> Can we avoid that GFP_ATOMIC allocation and kfree() in
> ghes_kick_memory_failure()?
> 
> I mean, that struct ghes_memory_failure_work is small enough and we
> already do lockless allocation:
> 
> 	estatus_node = (void *)gen_pool_alloc(ghes_estatus_pool, node_len);
> 
> so I guess we could add that ghes_memory_failure_work struct to that
> estatus_node, hand it into ghes_do_proc() and then free it.

I forget estatus_node is a linux thing, not an ACPI-spec thing!

Hmmm, ghes_handle_memory_failure() runs for POLLED and irq error sources too,
they don't have an estatus_node. We don't care about this ret_to_user() problem
as they are all asynchronous, this is why we have ghes_is_synchronous()...

It feels like there should be a way to do this, let me have a go...


Thanks,

James


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records
  2019-01-23 18:36                   ` James Morse
@ 2019-01-29 11:49                     ` Borislav Petkov
  2019-01-29 18:48                       ` James Morse
  0 siblings, 1 reply; 72+ messages in thread
From: Borislav Petkov @ 2019-01-29 11:49 UTC (permalink / raw)
  To: James Morse
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Linux ACPI,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Christoffer Dall, Dongjiu Geng, linux-mm, Naoya Horiguchi,
	kvmarm, arm-mail-list, Len Brown

On Wed, Jan 23, 2019 at 06:36:38PM +0000, James Morse wrote:
> Do you consider ENOENT an error? We don't ack in that case as the
> memory wasn't in use.

So let's see:

        if (!*buf_paddr)
                return -ENOENT;

can happen when apei_read() has returned 0 but it has managed to do

	*val = 0;

Now, that function returns error values which we should be checking
but we're checking the buf_paddr pointed to value for being 0. Are
we fearing that even if acpi_os_read_memory() or acpi_os_read_port()
succeed, *buf_paddr could still be 0 ?

Because if not, we should be checking whether rc == -EINVAL and then
convert it to -ENOENT.

But ghes_read_estatus() handles the error case first *and* *then* checks
buf_paddr too, to make really really sure we won't be reading from
address 0.

> For the other cases it's because the records are bogus, but we still
> unconditionally tell firmware we're done with them.

... to free the memory, yes, ok.

> >> I think it is. 18.3.2.8 of ACPI v6.2 (search for "Generic Hardware Error Source
> >> version 2", then below the table):
> >> * OSPM detects error (via interrupt/exception or polling the block status)
> >> * OSPM copies the error status block
> >> * OSPM clears the block status field of the error status block
> >> * OSPM acknowledges the error via Read Ack register
> >>
> >> The ENOENT case is excluded by 'polling the block status'.
> > 
> > Ok, so we signal the absence of an error record with ENOENT.
> > 
> >         if (!buf_paddr)
> >                 return -ENOENT;
> > 
> > Can that even happen?
> 
> Yes, for NOTIFY_POLLED it's the norm. For the IRQ flavours that walk a list of
> GHES, all but one of them will return ENOENT.

Lemme get this straight: when we do

	apei_read(&buf_paddr, &g->error_status_address);

in the polled case, buf_paddr can be 0?

> We could try it and see. It depends if firmware shares ack locations between
> multiple GHES. We could ack an empty GHES, and it removes the records of one we
> haven't looked at yet.

Yeah, OTOH, we shouldn't be pushing our luck here, I guess.

So let's sum up: we'll ack the GHES error in all but the -ENOENT cases
in order to free the memory occupied by the error record.

The slightly "pathological" -ENOENT case is I guess how the fw behaves
when it is being polled and also for broken firmware which could report
a 0 buf_paddr.

Btw, that last thing I'm assuming because

  d334a49113a4 ("ACPI, APEI, Generic Hardware Error Source memory error support")

doesn't say what that check was needed for.

Thx.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* Re: [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records
  2019-01-29 11:49                     ` Borislav Petkov
@ 2019-01-29 18:48                       ` James Morse
  2019-01-31 13:29                         ` Borislav Petkov
  0 siblings, 1 reply; 72+ messages in thread
From: James Morse @ 2019-01-29 18:48 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Linux ACPI,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Christoffer Dall, Dongjiu Geng, linux-mm, Naoya Horiguchi,
	kvmarm, arm-mail-list, Len Brown

Hi Boris,

On 29/01/2019 11:49, Borislav Petkov wrote:
> On Wed, Jan 23, 2019 at 06:36:38PM +0000, James Morse wrote:
>> Do you consider ENOENT an error? We don't ack in that case as the
>> memory wasn't in use.
> 
> So let's see:
> 
>         if (!*buf_paddr)
>                 return -ENOENT;
> 
> can happen when apei_read() has returned 0 but it has managed to do
> 
> 	*val = 0;

> Now, that function returns error values which we should be checking
> but we're checking the buf_paddr pointed to value for being 0. Are
> we fearing that even if acpi_os_read_memory() or acpi_os_read_port()
> succeed, *buf_paddr could still be 0 ?

That's what this code is doing, checking for a successful read, of zero.
The g->error_status_address has to point somewhere as its location is advertised
in the tables.

What is the value of g->error_status_address 'out of reset' or before any error
has occurred? This code expects it to be zero, or to point to a CPER block with
an empty block_status.

(the ACPI spec is unclear on when *(g->error_status_address) is written)


> Because if not, we should be checking whether rc == -EINVAL and then
> convert it to -ENOENT.

EINVAL implies the reg->space_id wasn't one of the two "System IO or System
Memory". (I thought the spec required this, but it only says this for EINJ:
'This constraint is an attempt to ensure that the registers are accessible in
the presence of hardware error conditions'.)

apei_check_gar() checks for these two in apei_map_generic_address(), so if this
is the case we would have failed at ghes_new() time.


> But ghes_read_estatus() handles the error case first *and* *then* checks
> buf_paddr too, to make really really sure we won't be reading from
> address 0.

I think this is the distinction between 'failed to read', (because
g->error_status_address has bad alignment or an unsupported address-space
id/access-size), and successfully read 0, which is treated as ENOENT.
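
To make that distinction concrete, here is a minimal userspace sketch. It is not
the kernel code: struct fake_reg, fake_apei_read() and fake_peek_estatus() are
made-up names, and returning -EIO for a failed read is an assumption of the
sketch. It only shows that 'the read failed' and 'the read worked and returned
zero' are reported as different things.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a register described by the HEST. */
struct fake_reg {
	int readable;	/* 0 models a bad space-id, alignment or access size */
	uint64_t value;	/* what a successful read returns */
};

/* Models a register read: 0 on success, -EINVAL if the read cannot be done. */
int fake_apei_read(uint64_t *val, const struct fake_reg *reg)
{
	assert(val != NULL && reg != NULL);

	*val = 0;
	if (!reg->readable)
		return -EINVAL;

	*val = reg->value;
	return 0;
}

/*
 * -EIO:    the read itself failed (assumed error code for this sketch)
 * -ENOENT: the read succeeded, but the value was zero, so no records
 *  0:      *buf_paddr holds the address of the records
 */
int fake_peek_estatus(const struct fake_reg *reg, uint64_t *buf_paddr)
{
	int rc = fake_apei_read(buf_paddr, reg);

	if (rc)
		return -EIO;
	if (!*buf_paddr)
		return -ENOENT;

	return 0;
}
```

A register with readable == 0 gives -EIO, a readable register holding 0 gives
-ENOENT, and only a non-zero value is treated as a pointer to records.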


>> For the other cases it's because the records are bogus, but we still
>> unconditionally tell firmware we're done with them.
> 
> ... to free the memory, yes, ok.
> 
>>>> I think it is. 18.3.2.8 of ACPI v6.2 (search for "Generic Hardware Error Source
>>>> version 2", then below the table):
>>>> * OSPM detects error (via interrupt/exception or polling the block status)
>>>> * OSPM copies the error status block
>>>> * OSPM clears the block status field of the error status block
>>>> * OSPM acknowledges the error via Read Ack register
>>>>
>>>> The ENOENT case is excluded by 'polling the block status'.
>>>
>>> Ok, so we signal the absence of an error record with ENOENT.
>>>
>>>         if (!buf_paddr)
>>>                 return -ENOENT;
>>>
>>> Can that even happen?
>>
>> Yes, for NOTIFY_POLLED it's the norm. For the IRQ flavours that walk a list of
>> GHES, all but one of them will return ENOENT.


> Lemme get this straight: when we do
> 
> 	apei_read(&buf_paddr, &g->error_status_address);
> 
> in the polled case, buf_paddr can be 0?

If firmware has never generated CPER records, so it has never written to void
*error_status_address, yes.

There seem to be two ways of doing this. This zero check implies an example
system could be:
| g->error_status_address == 0xf00d
| *(u64 *)0xf00d == 0
Firmware populates CPER records, then updates 0xf00d.
(0xf00d would have been pre-mapped by apei_map_generic_address() in ghes_new())
Reads of 0xf00d before CPER records are generated get 0.

Once an error occurs, this system now looks like this:
| g->error_status_address == 0xf00d
| *(u64 *)0xf00d == 0xbeef
| *(u64 *)0xbeef == 0

For new errors, firmware populates CPER records, then updates 0xf00d.
Alternatively firmware could re-use the memory at 0xbeef, generating the CPER
records backwards, so that once 0xbeef is updated, the rest of the record is
visible. (firmware knows not to race with another CPU right?)

Firmware could equally point 0xf00d at 0xbeef at startup, so it has one fewer
value to write when an error occurs. I have an arm64 system with a HEST that
does this. (I'm pretty sure its ACPI support is a copy-and-paste from x86, it
even describes NOTIFY_NMI, who knows what that means on arm!)

When Linux processes an error, ghes_clear_estatus() NULLs the
estatus->block_status, (which in this example is at 0xbeef). This is the
documented sequence for GHESv2.
Elsewhere the spec talks of checking the block status which is part of the
records, (not the error_status_address, which is the pointer to the records).

Linux can't NULL 0xf00d, because it doesn't know if firmware will write it again
next time it updates the records.
I can't find where in the spec it says the error status address is written to.
Linux works with both 'at boot' and 'on each error'.
If it were known to have a static value, ghes_copy_tofrom_phys() would not have
been necessary, but it's been there since d334a49113a4.

In the worst case, if there is a value at the error_status_address, we have to
map/unmap it every time we poll in case firmware wrote new records at that same
location.

I don't think we can change Linux's behaviour here, without interpreting zero as
CPER records or missing new errors.
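
The 0xf00d/0xbeef example can be played out in a toy model. This is only a
sketch under my own assumptions: memory is a four-entry array, cell 1 stands in
for 0xf00d and cell 2 for 0xbeef, and every name here (fake_mem,
fw_report_error(), os_poll(), os_clear_estatus()) is made up. It shows both
ENOENT flavours, and that clearing the block status leaves the pointer alone.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MEM_CELLS	4
#define ESA_ADDR	1u	/* plays the role of 0xf00d */
#define CPER_ADDR	2u	/* plays the role of 0xbeef */

/* Toy physical memory: an 'address' is just an index into cell[]. */
struct fake_mem {
	uint64_t cell[MEM_CELLS];
};

/* Firmware side: write the record first, then publish the pointer to it. */
void fw_report_error(struct fake_mem *m, uint64_t block_status)
{
	assert(m != NULL);
	m->cell[CPER_ADDR] = block_status;
	m->cell[ESA_ADDR] = CPER_ADDR;
}

/* OS side: 0 and *status when records are present, -ENOENT when not. */
int os_poll(const struct fake_mem *m, uint64_t *status)
{
	uint64_t buf_paddr;

	assert(m != NULL && status != NULL);
	buf_paddr = m->cell[ESA_ADDR];

	if (!buf_paddr)
		return -ENOENT;		/* firmware never wrote the pointer */
	if (buf_paddr >= MEM_CELLS)
		return -EIO;		/* pointer is outside the toy memory */
	if (!m->cell[buf_paddr])
		return -ENOENT;		/* pointer set, block status empty */

	*status = m->cell[buf_paddr];
	return 0;
}

/* OS side: clear the block status only, never the pointer to it. */
void os_clear_estatus(struct fake_mem *m)
{
	uint64_t buf_paddr;

	assert(m != NULL);
	buf_paddr = m->cell[ESA_ADDR];
	if (buf_paddr && buf_paddr < MEM_CELLS)
		m->cell[buf_paddr] = 0;
}
```

Polling an all-zero fake_mem, and polling one where the pointer was set at
'boot' but no record was ever written, both come back as -ENOENT. That is the
sense in which the zero check covers both firmware styles.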


>> We could try it and see. It depends if firmware shares ack locations between
>> multiple GHES. We could ack an empty GHES, and it removes the records of one we
>> haven't looked at yet.
> 
> Yeah, OTOH, we shouldn't be pushing our luck here, I guess.
> 
> So let's sum up: we'll ack the GHES error in all but the -ENOENT cases
> in order to free the memory occupied by the error record.

I agree.
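
As a sketch of the rule we just agreed on (should_ack() is a made-up helper, not
something in ghes.c, and the is_ghes_v2 argument is my assumption about where
the 'only GHESv2 has a Read Ack register' test would sit):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Decide whether to write the Read Ack register after trying to read
 * the records. read_rc is 0 or a negative errno.
 */
bool should_ack(int read_rc, bool is_ghes_v2)
{
	assert(read_rc <= 0);

	if (!is_ghes_v2)
		return false;	/* no Read Ack register to write */

	/* ENOENT: the memory was never in use, so there is nothing to free */
	return read_rc != -ENOENT;
}
```

Bogus records (any other error) still get acked, so firmware can reuse the
memory.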


> The slightly "pathological" -ENOENT case is I guess how the fw behaves
> when it is being polled and also for broken firmware which could report
> a 0 buf_paddr.
> 
> Btw, that last thing I'm assuming because
> 
>   d334a49113a4 ("ACPI, APEI, Generic Hardware Error Source memory error support")
> 
> doesn't say what that check was needed for.

Heh. I'd assume this was the out-of-reset value on the platform that was
developed for, which implicitly assumed we could never get CPER records at zero.


Thanks,

James


* Re: [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records
  2019-01-11 20:53               ` Tyler Baicar
@ 2019-01-29 18:48                 ` James Morse
  0 siblings, 0 replies; 72+ messages in thread
From: James Morse @ 2019-01-29 18:48 UTC (permalink / raw)
  To: Tyler Baicar
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, Linux ACPI, Borislav Petkov, Naoya Horiguchi,
	kvmarm, arm-mail-list, Len Brown

Hi Tyler,

On 11/01/2019 20:53, Tyler Baicar wrote:
> On Fri, Jan 11, 2019 at 1:09 PM James Morse <james.morse@arm.com> wrote:
>> We can return on ENOENT earlier, as nothing needs doing in that case. It's
>> what the GHES_TO_CLEAR spaghetti is for, we can probably move the ack thing into
>> ghes_clear_estatus(), that way that thing means 'I'm done with this memory'.
>>
>> Something like:
>> -------------------------
>> rc = ghes_read_estatus();
>> if (rc == -ENOENT)
>>         return 0;
> 
> We still should be returning at least the -ENOENT from ghes_read_estatus().
> That is being used by the SEA handling to determine if an SEA was properly
> reported/handled by the host kernel in the KVM SEA case.

Sorry, my terrible example code. You'll be glad to know I would have caught this
when testing it!


Thanks,

James


* Re: [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records
  2019-01-29 18:48                       ` James Morse
@ 2019-01-31 13:29                         ` Borislav Petkov
  0 siblings, 0 replies; 72+ messages in thread
From: Borislav Petkov @ 2019-01-31 13:29 UTC (permalink / raw)
  To: James Morse
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, Xie XiuQi, Linux ACPI,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Christoffer Dall, Dongjiu Geng, linux-mm, Naoya Horiguchi,
	kvmarm, arm-mail-list, Len Brown

On Tue, Jan 29, 2019 at 06:48:33PM +0000, James Morse wrote:
> If firmware has never generated CPER records, so it has never written to void
> *error_status_address, yes.

I guess this is the bit of information I was missing.

> There seem to be two ways of doing this. This zero check implies an example
> system could be:
> | g->error_status_address == 0xf00d
> | *(u64 *)0xf00d == 0
> Firmware populates CPER records, then updates 0xf00d.
> (0xf00d would have been pre-mapped by apei_map_generic_address() in ghes_new())
> Reads of 0xf00d before CPER records are generated get 0.

Ok, this sounds like the polled case. FW better have a record ready
before raising the NMI.

> Once an error occurs, this system now looks like this:
> | g->error_status_address == 0xf00d
> | *(u64 *)0xf00d == 0xbeef
> | *(u64 *)0xbeef == 0
> 
> For new errors, firmware populates CPER records, then updates 0xf00d.
> Alternatively firmware could re-use the memory at 0xbeef, generating the CPER
> records backwards, so that once 0xbeef is updated, the rest of the record is
> visible. (firmware knows not to race with another CPU right?)

Thanks for the comic relief. :-P

> Firmware could equally point 0xf00d at 0xbeef at startup, so it has one fewer
> value to write when an error occurs. I have an arm64 system with a HEST that
> does this. (I'm pretty sure its ACPI support is a copy-and-paste from x86, it
> even describes NOTIFY_NMI, who knows what that means on arm!)

Oh the fun.

> When Linux processes an error, ghes_clear_estatus() NULLs the
> estatus->block_status, (which in this example is at 0xbeef). This is the
> documented sequence for GHESv2.
> Elsewhere the spec talks of checking the block status which is part of the
> records, (not the error_status_address, which is the pointer to the records).
>
> Linux can't NULL 0xf00d, because it doesn't know if firmware will write it again
> next time it updates the records.
> I can't find where in the spec it says the error status address is written to.
> Linux works with both 'at boot' and 'on each error'.
> If it were known to have a static value, ghes_copy_tofrom_phys() would not have
> been necessary, but it's been there since d334a49113a4.
>
> In the worst case, if there is a value at the error_status_address, we have to
> map/unmap it every time we poll in case firmware wrote new records at that same
> location.
> 
> I don't think we can change Linux's behaviour here, without interpreting zero as
> CPER records or missing new errors.

Nah, I was simply trying to figure out why we do that buf_paddr check.
Thanks for the extensive clarification.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: [PATCH v7 20/25] ACPI / APEI: Use separate fixmap pages for arm64 NMI-like notifications
  2019-01-23 18:33     ` James Morse
@ 2019-01-31 13:38       ` Borislav Petkov
  0 siblings, 0 replies; 72+ messages in thread
From: Borislav Petkov @ 2019-01-31 13:38 UTC (permalink / raw)
  To: James Morse
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, linux-acpi, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Wed, Jan 23, 2019 at 06:33:02PM +0000, James Morse wrote:
> Was the best I had, but this trips the BUILD_BUG() too early.
> With it, x86 BUILD_BUG()s. With just the -1 the path gets pruned out, and there
> are no 'sdei' symbols in the object file.
> 
> ...at this point, I stopped caring!

Yah, you said it: __end_of_fixed_addresses will practically give you the
BUG behavior:

        if (idx >= __end_of_fixed_addresses) {
                BUG();
                return;
        }

and ARM64 does the same.

> We already skip registering notifiers if the kconfig option wasn't selected.
> 
> We can't catch this at compile time, as the dead-code elimination seems to
> happen in multiple passes.
> 
> I'll switch the SDEI ones to __end_of_fixed_addresses, as both architectures
> BUG() when they see this.

Right.
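
Roughly, as a standalone sketch (the enum and both helpers are made up for
illustration, they are not the real fixmap code): a compiled-out notification
type gets the end-of-range sentinel as its slot, and the range check then
catches any attempt to actually use it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical fixmap slots; the last entry is a sentinel, not a slot. */
enum fake_fixmap_idx {
	FAKE_FIX_APEI_GHES_IRQ,
	FAKE_FIX_APEI_GHES_NMI,
	FAKE_END_OF_FIXED_ADDRESSES,
};

/* Models the range check: true means the real code would BUG() here. */
bool fake_fixmap_would_bug(int idx)
{
	assert(idx >= 0);
	return idx >= FAKE_END_OF_FIXED_ADDRESSES;
}

/* Slot for a notification type whose support may be compiled out. */
int fake_pick_slot(bool support_built_in)
{
	return support_built_in ? FAKE_FIX_APEI_GHES_NMI
				: FAKE_END_OF_FIXED_ADDRESSES;
}
```

With support built in, the slot passes the check; without it, the sentinel
trips it at run time instead of at build time.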

Thx.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: [PATCH v7 22/25] ACPI / APEI: Kick the memory_failure() queue for synchronous errors
  2019-01-23 18:40     ` James Morse
@ 2019-01-31 14:04       ` Borislav Petkov
  0 siblings, 0 replies; 72+ messages in thread
From: Borislav Petkov @ 2019-01-31 14:04 UTC (permalink / raw)
  To: James Morse
  Cc: Rafael Wysocki, Tony Luck, Fan Wu, linux-mm, Marc Zyngier,
	Catalin Marinas, Xie XiuQi, Will Deacon, Christoffer Dall,
	Dongjiu Geng, linux-acpi, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Wed, Jan 23, 2019 at 06:40:08PM +0000, James Morse wrote:
> My SMM comment was because the CPU must jump from user-space->SMM, which injects
> an NMI into the kernel. The kernel's EIP must point into user-space, so
> returning from the NMI without doing the memory_failure() work puts us back the
> same position we started in.

Yeah, known issue. We dealt with that on x86 at the time:

d4812e169de4 ("x86, mce: Get rid of TIF_MCE_NOTIFY and associated mce tricks")

> > Now, memory_failure_queue() does that and can run from IRQ context so
> > you need only an irq_work which can queue from NMI context. We do it
> > this way in the MCA code:
> > 
> 
> (was there something missing here?)

Whoops. Yeah, I was about to paste this:

void mce_log(struct mce *m)
{
        if (!mce_gen_pool_add(m))
                irq_work_queue(&mce_irq_work);
}

we're basically queueing only into the lockless buffer and kicking the
IRQ work.
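
The same shape in plain C11, as a userspace sketch only: struct deferred_queue,
queue_from_nmi() and drain_deferred() are made-up names, and an atomic flag
stands in for the irq_work kick. The producer never takes a lock, so it is
safe to call from a context that can interrupt anything.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct node {
	struct node *next;
	int payload;
};

struct deferred_queue {
	_Atomic(struct node *) head;	/* lock-free LIFO list of pending items */
	atomic_bool work_pending;	/* stands in for the irq_work kick */
};

/* NMI-like side: push without taking a lock, then ask for deferred work. */
void queue_from_nmi(struct deferred_queue *q, struct node *n)
{
	struct node *old;

	assert(q != NULL && n != NULL);
	old = atomic_load(&q->head);
	do {
		n->next = old;	/* old is refreshed when the exchange fails */
	} while (!atomic_compare_exchange_weak(&q->head, &old, n));

	atomic_store(&q->work_pending, true);
}

/* Deferred side: detach the whole list at once and count what was queued. */
int drain_deferred(struct deferred_queue *q)
{
	struct node *n;
	int handled = 0;

	assert(q != NULL);
	if (!atomic_exchange(&q->work_pending, false))
		return 0;	/* nobody kicked us */

	for (n = atomic_exchange(&q->head, NULL); n; n = n->next)
		handled++;

	return handled;
}
```

The real code would do something useful with each item rather than count it;
the point is only that the queueing side is lockless and the heavy lifting
happens later, in a context where it is allowed.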

> > We queue in an irq_work in NMI context and work through the items in
> > process context.
> 
> How are you getting from NMI to process context in one go?

Well, #MC is basically an NMI context on x86 and when it is done, we
work through the items queued in process context. But see the commit
above too - for really urgent errors we run memory_failure *before* we
return to user.

> This patch causes the IRQ->process transition.
> The arch specific bit of this gives the irq work queue a kick if returning from
> the NMI would unmask IRQs. This makes it look like we moved from NMI to IRQ
> context without returning to user-space.
> 
> Once ghes_handle_memory_failure() runs in IRQ context, it task_work_add()s the
> call to ghes_kick_memory_failure().
> 
> Finally on the way out of the kernel to user-space that task_work runs and the
> memory_failure() work happens in process context.
> 
> During all this the user-space program counter can point at a poisoned location,
> but we don't return there until the memory_failure() work has been done.

Sounds very similar.

Actually, yours is even a bit more elegant. I wonder why we didn't use
task_work_add() then...

Thx.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

Thread overview: 72+ messages
2018-12-03 18:05 [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up James Morse
2018-12-03 18:05 ` [PATCH v7 01/25] ACPI / APEI: Don't wait to serialise with oops messages when panic()ing James Morse
2018-12-03 18:05 ` [PATCH v7 02/25] ACPI / APEI: Remove silent flag from ghes_read_estatus() James Morse
2018-12-04 11:36   ` Borislav Petkov
2018-12-03 18:05 ` [PATCH v7 03/25] ACPI / APEI: Switch estatus pool to use vmalloc memory James Morse
2018-12-04 13:01   ` Borislav Petkov
2018-12-03 18:05 ` [PATCH v7 04/25] ACPI / APEI: Make hest.c manage the estatus memory pool James Morse
2018-12-11 16:48   ` Borislav Petkov
2018-12-14 13:56     ` James Morse
2018-12-19 14:42       ` Borislav Petkov
2019-01-10 18:20         ` James Morse
2018-12-03 18:05 ` [PATCH v7 05/25] ACPI / APEI: Make estatus pool allocation a static size James Morse
2018-12-11 16:54   ` Borislav Petkov
2018-12-03 18:05 ` [PATCH v7 06/25] ACPI / APEI: Don't store CPER records physical address in struct ghes James Morse
2018-12-11 17:04   ` Borislav Petkov
2018-12-03 18:05 ` [PATCH v7 07/25] ACPI / APEI: Remove spurious GHES_TO_CLEAR check James Morse
2018-12-11 17:18   ` Borislav Petkov
2018-12-03 18:05 ` [PATCH v7 08/25] ACPI / APEI: Don't update struct ghes' flags in read/clear estatus James Morse
2018-12-03 18:05 ` [PATCH v7 09/25] ACPI / APEI: Generalise the estatus queue's notify code James Morse
2018-12-11 17:44   ` Borislav Petkov
2019-01-10 18:21     ` James Morse
2019-01-11 11:46       ` Borislav Petkov
2018-12-03 18:05 ` [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records James Morse
2018-12-11 18:36   ` Borislav Petkov
2019-01-10 18:22     ` James Morse
2019-01-10 21:01       ` Tyler Baicar
2019-01-11 12:03         ` Borislav Petkov
2019-01-11 15:32           ` Tyler Baicar
2019-01-11 17:45             ` Borislav Petkov
2019-01-11 18:25               ` James Morse
2019-01-11 19:58                 ` Borislav Petkov
2019-01-23 18:36                   ` James Morse
2019-01-29 11:49                     ` Borislav Petkov
2019-01-29 18:48                       ` James Morse
2019-01-31 13:29                         ` Borislav Petkov
2019-01-11 18:09             ` James Morse
2019-01-11 20:01               ` Borislav Petkov
2019-01-11 20:53               ` Tyler Baicar
2019-01-29 18:48                 ` James Morse
2018-12-03 18:05 ` [PATCH v7 11/25] ACPI / APEI: Move NOTIFY_SEA between the estatus-queue and NOTIFY_NMI James Morse
2019-01-21 13:01   ` Borislav Petkov
2018-12-03 18:06 ` [PATCH v7 12/25] ACPI / APEI: Switch NOTIFY_SEA to use the estatus queue James Morse
2018-12-03 18:06 ` [PATCH v7 13/25] KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing James Morse
2018-12-06 16:17   ` Catalin Marinas
2018-12-03 18:06 ` [PATCH v7 14/25] arm64: KVM/mm: Move SEA handling behind a single 'claim' interface James Morse
2018-12-06 16:17   ` Catalin Marinas
2018-12-03 18:06 ` [PATCH v7 15/25] ACPI / APEI: Move locking to the notification helper James Morse
2018-12-03 18:06 ` [PATCH v7 16/25] ACPI / APEI: Let the notification helper specify the fixmap slot James Morse
2018-12-03 18:06 ` [PATCH v7 17/25] ACPI / APEI: Pass ghes and estatus separately to avoid a later copy James Morse
2019-01-21 13:35   ` Borislav Petkov
2018-12-03 18:06 ` [PATCH v7 18/25] ACPI / APEI: Split ghes_read_estatus() to allow a peek at the CPER length James Morse
2019-01-21 13:53   ` Borislav Petkov
2018-12-03 18:06 ` [PATCH v7 19/25] ACPI / APEI: Only use queued estatus entry during _in_nmi_notify_one() James Morse
2019-01-21 17:19   ` Borislav Petkov
2018-12-03 18:06 ` [PATCH v7 20/25] ACPI / APEI: Use separate fixmap pages for arm64 NMI-like notifications James Morse
2019-01-21 17:27   ` Borislav Petkov
2019-01-23 18:33     ` James Morse
2019-01-31 13:38       ` Borislav Petkov
2018-12-03 18:06 ` [PATCH v7 21/25] mm/memory-failure: Add memory_failure_queue_kick() James Morse
2018-12-03 18:06 ` [PATCH v7 22/25] ACPI / APEI: Kick the memory_failure() queue for synchronous errors James Morse
2018-12-05  2:02   ` Xie XiuQi
2018-12-10 19:15     ` James Morse
2019-01-22 10:51       ` Borislav Petkov
2019-01-23 18:37         ` James Morse
2019-01-21 17:58   ` Borislav Petkov
2019-01-23 18:40     ` James Morse
2019-01-31 14:04       ` Borislav Petkov
2018-12-03 18:06 ` [PATCH v7 23/25] arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work James Morse
2018-12-06 16:18   ` Catalin Marinas
2018-12-03 18:06 ` [PATCH v7 24/25] firmware: arm_sdei: Add ACPI GHES registration helper James Morse
2018-12-06 16:18   ` Catalin Marinas
2018-12-03 18:06 ` [PATCH v7 25/25] ACPI / APEI: Add support for the SDEI GHES Notification type James Morse
