* [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up
@ 2018-06-26 17:00 James Morse
  2018-06-26 17:00 ` [PATCH v5 01/20] ACPI / APEI: Move the estatus queue code up, and under its own ifdef James Morse
                   ` (21 more replies)
  0 siblings, 22 replies; 26+ messages in thread
From: James Morse @ 2018-06-26 17:00 UTC (permalink / raw)
  To: linux-acpi
  Cc: kvmarm, linux-arm-kernel, linux-mm, Borislav Petkov,
	Marc Zyngier, Christoffer Dall, Will Deacon, Catalin Marinas,
	Naoya Horiguchi, Rafael Wysocki, Len Brown, Tony Luck,
	Tyler Baicar, Dongjiu Geng, Xie XiuQi, Punit Agrawal,
	jonathan.zhang, James Morse

The aim of this series is to wire arm64's SDEI into APEI.

On arm64 we have three APEI notifications that are NMI-like, and
in the unlikely event that all three are supported by a platform,
they can interrupt each other.
The GHES driver shouldn't have to deal with this, so this series aims
to make it re-entrant.

To do that, we refactor the estatus queue to allow multiple notifications
to use it, then convert NOTIFY_SEA to always be described as NMI-like,
and to use the estatus queue.

From here we push the locking and fixmap choices out to the notification
functions, and remove the use of per-ghes estatus and flags. This removes
the in_nmi() 'timebomb' in ghes_copy_tofrom_phys().

Things get sticky when an NMI notification needs to know how big the
CPER records might be before reading them. This series splits
ghes_read_estatus() to let us peek at the buffer. A side effect of this
is that the 20-byte header will get read twice. (How does it work today?
It reads the records into a per-ghes worst-case sized buffer, allocates
the correct size and copies the records. The in_nmi() use of this
per-ghes buffer needs eliminating.)
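
As a rough illustration of the peek-then-read split, the userspace sketch
below reads only the 20-byte header, derives the total record length from
it, then copies the whole block into an exactly-sized allocation (so the
header is copied twice). The struct layout follows the ACPI generic error
status block header and the length rule follows the kernel's
cper_estatus_len(). The names peek_estatus(), read_estatus(),
struct estatus_hdr and ESTATUS_MAX_SIZE are hypothetical and exist only
for this sketch; they are not the helpers this series adds.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-in for the 20-byte generic error status header. */
struct estatus_hdr {
	uint32_t block_status;
	uint32_t raw_data_offset;
	uint32_t raw_data_length;
	uint32_t data_length;
	uint32_t error_severity;
};
_Static_assert(sizeof(struct estatus_hdr) == 20, "header must be 20 bytes");

/* Assumed upper bound for this sketch. */
#define ESTATUS_MAX_SIZE 65536u

/* Same rule as cper_estatus_len(), computed in 64 bits to avoid wrap. */
static uint64_t estatus_len(const struct estatus_hdr *hdr)
{
	if (hdr->raw_data_length)
		return (uint64_t)hdr->raw_data_offset + hdr->raw_data_length;
	return sizeof(*hdr) + (uint64_t)hdr->data_length;
}

/* Step 1: read just the header and work out how big the records are. */
static int peek_estatus(const uint8_t *fw_buf, struct estatus_hdr *hdr,
			uint32_t *len)
{
	uint64_t total;

	assert(fw_buf && hdr && len);
	memcpy(hdr, fw_buf, sizeof(*hdr));
	if (!hdr->block_status)
		return -1;	/* nothing pending */

	total = estatus_len(hdr);
	if (total < sizeof(*hdr) || total > ESTATUS_MAX_SIZE)
		return -2;	/* firmware gave us a nonsense length */

	*len = (uint32_t)total;
	return 0;
}

/* Step 2: allocate the exact size and copy everything, header included. */
static uint8_t *read_estatus(const uint8_t *fw_buf, uint32_t len)
{
	uint8_t *copy = malloc(len);

	if (copy)
		memcpy(copy, fw_buf, len);
	return copy;
}
```

A caller would run peek_estatus() first and only call read_estatus() when
it returns 0, which removes the need for a worst-case sized intermediate
buffer.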

One alternative was to trust firmware's 'max raw data length' and use
that to allocate 'enough' memory. We don't use this value today, so it's
probably wrong on some system somewhere.

Since v4, patches 5 and 8-15 are new; otherwise changes are noted in each
patch.


The earlier boiler-plate:

What's SDEI? It's ARM's "Software Delegated Exception Interface" [0]. It's
used by firmware to tell the OS about firmware-first RAS events.

These software exceptions can interrupt anything, so I describe them as
NMI-like. They aren't the only NMI-like way to notify the OS about
firmware-first RAS events: the ACPI spec also defines 'NOTIFY_SEA' and
'NOTIFY_SEI'.

(Acronyms: SEA, Synchronous External Abort. The CPU requested some memory,
but the owner of that memory said no. These are always synchronous with the
instruction that caused them. SEI, System-Error Interrupt, commonly called
SError. This is an asynchronous external abort: the memory-owner didn't say
no at the right point. Collectively these things are called external-aborts.
How is firmware involved? It traps these and re-injects them into the kernel
once it's written the CPER records.)

APEI's GHES code only expects one source of NMI. If a platform implements
more than one of these mechanisms, APEI needs to handle the interaction.
'SEA' and 'SEI' can interact as 'SEI' is asynchronous. SDEI can interact
with itself: its exceptions can be 'normal' or 'critical', and firmware
could use both types for RAS. (errors using normal, 'panic-now' using
critical).


ghes.c became clearer to me when I worked out that it has three sets of
functions with 'estatus' in the name. One is a pool of memory that can be
allocated-from atomically. This is grown/shrunk when new NMI users are
allocated.
The second is the estatus-cache, which holds recent notifications so it
can suppress notifications we've already handled.
The last is the estatus-queue, which holds data from NMI-like notifications
(in pool memory) to be processed from irq_work.
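
To make the estatus-queue idea concrete, here is a minimal userspace
sketch of the pattern: the NMI-like producer pushes onto a lock-less
singly linked list, and the deferred consumer takes the whole list in one
atomic exchange, reverses it to restore arrival order, then walks it. It
uses C11 atomics in place of the kernel's llist helpers; struct node,
struct lhead, push(), del_all(), reverse() and drain() are hypothetical
names for this sketch only, and pool allocation is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct node {
	struct node *next;
	int seq;			/* stands in for the saved estatus */
};

struct lhead {
	_Atomic(struct node *) first;
};

/* Producer side: safe to call from an NMI-like context, no locks taken. */
static void push(struct lhead *h, struct node *n)
{
	struct node *old = atomic_load(&h->first);

	do {
		n->next = old;
	} while (!atomic_compare_exchange_weak(&h->first, &old, n));
}

/* Consumer side: detach every queued node in a single atomic step. */
static struct node *del_all(struct lhead *h)
{
	return atomic_exchange(&h->first, (struct node *)NULL);
}

/* Pushes build the list newest-first, so flip it before processing. */
static struct node *reverse(struct node *n)
{
	struct node *prev = NULL;

	while (n) {
		struct node *next = n->next;

		n->next = prev;
		prev = n;
		n = next;
	}
	return prev;
}

/*
 * Deferred work: process entries oldest-first. Records up to max sequence
 * numbers in out[] and returns how many nodes were queued.
 */
static size_t drain(struct lhead *h, int *out, size_t max)
{
	struct node *n = reverse(del_all(h));
	size_t count = 0;

	assert(out != NULL || max == 0);
	for (; n; n = n->next) {
		if (count < max)
			out[count] = n->seq;
		count++;
	}
	return count;
}
```

The producer never waits on the consumer, which is the property the
NMI-like notifications need; the cost is that ordering has to be restored
on the consumer side.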


Testing?
Tested with the SDEI FVP based software model and a mocked up NOTIFY_SEA using
KVM. I've added a case where 'corrected errors' are discovered at probe time
to exercise ghes_probe() during boot. I've only build-tested this on x86.

This series based on v4.18-rc2 can be retrieved from:
git://linux-arm.org/linux-jm.git -b apei_sdei/v5


Thanks,

James

[0] http://infocenter.arm.com/help/topic/com.arm.doc.den0054a/ARM_DEN0054A_Software_Delegated_Exception_Interface.pdf

James Morse (20):
  ACPI / APEI: Move the estatus queue code up, and under its own ifdef
  ACPI / APEI: Generalise the estatus queue's add/remove and notify code
  ACPI / APEI: don't wait to serialise with oops messages when
    panic()ing
  ACPI / APEI: Switch NOTIFY_SEA to use the estatus queue
  ACPI / APEI: Make estatus queue a Kconfig symbol
  KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing
  arm64: KVM/mm: Move SEA handling behind a single 'claim' interface
  ACPI / APEI: Move locking to the notification helper
  ACPI / APEI: Let the notification helper specify the fixmap slot
  ACPI / APEI: preparatory split of ghes->estatus
  ACPI / APEI: Remove silent flag from ghes_read_estatus()
  ACPI / APEI: Don't store CPER records physical address in struct ghes
  ACPI / APEI: Don't update struct ghes' flags in read/clear estatus
  ACPI / APEI: Split ghes_read_estatus() to read CPER length
  ACPI / APEI: Only use queued estatus entry during _in_nmi_notify_one()
  ACPI / APEI: Split fixmap pages for arm64 NMI-like notifications
  firmware: arm_sdei: Add ACPI GHES registration helper
  ACPI / APEI: Add support for the SDEI GHES Notification type
  mm/memory-failure: increase queued recovery work's priority
  arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work

 arch/arm/include/asm/kvm_ras.h       |  14 +
 arch/arm/include/asm/system_misc.h   |   5 -
 arch/arm64/include/asm/acpi.h        |   4 +
 arch/arm64/include/asm/daifflags.h   |   1 +
 arch/arm64/include/asm/fixmap.h      |   8 +-
 arch/arm64/include/asm/kvm_ras.h     |  25 ++
 arch/arm64/include/asm/system_misc.h |   2 -
 arch/arm64/kernel/acpi.c             |  49 ++
 arch/arm64/mm/fault.c                |  30 +-
 drivers/acpi/apei/Kconfig            |  11 +
 drivers/acpi/apei/ghes.c             | 649 ++++++++++++++++-----------
 drivers/firmware/Kconfig             |   1 +
 drivers/firmware/arm_sdei.c          |  66 +++
 include/acpi/ghes.h                  |   2 -
 include/linux/arm_sdei.h             |   9 +
 mm/memory-failure.c                  |  11 +-
 virt/kvm/arm/mmu.c                   |   4 +-
 17 files changed, 591 insertions(+), 300 deletions(-)
 create mode 100644 arch/arm/include/asm/kvm_ras.h
 create mode 100644 arch/arm64/include/asm/kvm_ras.h

-- 
2.17.1


* [PATCH v5 01/20] ACPI / APEI: Move the estatus queue code up, and under its own ifdef
  2018-06-26 17:00 [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up James Morse
@ 2018-06-26 17:00 ` James Morse
  2018-06-26 17:00 ` [PATCH v5 02/20] ACPI / APEI: Generalise the estatus queue's add/remove and notify code James Morse
                   ` (20 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: James Morse @ 2018-06-26 17:00 UTC (permalink / raw)
  To: linux-acpi
  Cc: kvmarm, linux-arm-kernel, linux-mm, Borislav Petkov,
	Marc Zyngier, Christoffer Dall, Will Deacon, Catalin Marinas,
	Naoya Horiguchi, Rafael Wysocki, Len Brown, Tony Luck,
	Tyler Baicar, Dongjiu Geng, Xie XiuQi, Punit Agrawal,
	jonathan.zhang, James Morse

To support asynchronous NMI-like notifications on arm64 we need to use
the estatus-queue. These patches refactor it to allow multiple APEI
notification types to use it.

First we move the estatus-queue code higher in the file so that any
notify_foo() handler can make use of it.

This patch moves code around ... and makes the following trivial change:
Freshen the dated comment above ghes_estatus_llist. printk() is no
longer the issue; it's the helpers like memory_failure_queue() that
still aren't NMI-safe.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Punit Agrawal <punit.agrawal@arm.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
---
 drivers/acpi/apei/ghes.c | 265 ++++++++++++++++++++-------------------
 1 file changed, 137 insertions(+), 128 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 02c6fd9caff7..f5732e6b5be8 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -545,6 +545,16 @@ static int ghes_print_estatus(const char *pfx,
 	return 0;
 }
 
+static void __ghes_panic(struct ghes *ghes)
+{
+	__ghes_print_estatus(KERN_EMERG, ghes->generic, ghes->estatus);
+
+	/* reboot to log the error! */
+	if (!panic_timeout)
+		panic_timeout = ghes_panic_timeout;
+	panic("Fatal hardware error!");
+}
+
 /*
  * GHES error status reporting throttle, to report more kinds of
  * errors, instead of just most frequently occurred errors.
@@ -672,6 +682,133 @@ static void ghes_estatus_cache_add(
 	rcu_read_unlock();
 }
 
+#ifdef CONFIG_HAVE_ACPI_APEI_NMI
+/*
+ * Handlers for CPER records may not be NMI safe. For example,
+ * memory_failure_queue() takes spinlocks and calls schedule_work_on().
+ * In any NMI-like handler, memory from ghes_estatus_pool is used to save
+ * estatus, and added to the ghes_estatus_llist. irq_work_queue() causes
+ * ghes_proc_in_irq() to run in IRQ context where each estatus in
+ * ghes_estatus_llist is processed. Each NMI-like error source must grow
+ * the ghes_estatus_pool to ensure memory is available.
+ *
+ * Memory from the ghes_estatus_pool is also used with the ghes_estatus_cache
+ * to suppress frequent messages.
+ */
+static struct llist_head ghes_estatus_llist;
+static struct irq_work ghes_proc_irq_work;
+
+static void ghes_print_queued_estatus(void)
+{
+	struct llist_node *llnode;
+	struct ghes_estatus_node *estatus_node;
+	struct acpi_hest_generic *generic;
+	struct acpi_hest_generic_status *estatus;
+
+	llnode = llist_del_all(&ghes_estatus_llist);
+	/*
+	 * Because the time order of estatus in list is reversed,
+	 * revert it back to proper order.
+	 */
+	llnode = llist_reverse_order(llnode);
+	while (llnode) {
+		estatus_node = llist_entry(llnode, struct ghes_estatus_node,
+					   llnode);
+		estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
+		generic = estatus_node->generic;
+		ghes_print_estatus(NULL, generic, estatus);
+		llnode = llnode->next;
+	}
+}
+
+/* Save estatus for further processing in IRQ context */
+static void __process_error(struct ghes *ghes)
+{
+#ifdef CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG
+	u32 len, node_len;
+	struct ghes_estatus_node *estatus_node;
+	struct acpi_hest_generic_status *estatus;
+
+	if (ghes_estatus_cached(ghes->estatus))
+		return;
+
+	len = cper_estatus_len(ghes->estatus);
+	node_len = GHES_ESTATUS_NODE_LEN(len);
+
+	estatus_node = (void *)gen_pool_alloc(ghes_estatus_pool, node_len);
+	if (!estatus_node)
+		return;
+
+	estatus_node->ghes = ghes;
+	estatus_node->generic = ghes->generic;
+	estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
+	memcpy(estatus, ghes->estatus, len);
+	llist_add(&estatus_node->llnode, &ghes_estatus_llist);
+#endif
+}
+
+static unsigned long ghes_esource_prealloc_size(
+	const struct acpi_hest_generic *generic)
+{
+	unsigned long block_length, prealloc_records, prealloc_size;
+
+	block_length = min_t(unsigned long, generic->error_block_length,
+			     GHES_ESTATUS_MAX_SIZE);
+	prealloc_records = max_t(unsigned long,
+				 generic->records_to_preallocate, 1);
+	prealloc_size = min_t(unsigned long, block_length * prealloc_records,
+			      GHES_ESOURCE_PREALLOC_MAX_SIZE);
+
+	return prealloc_size;
+}
+
+static void ghes_estatus_pool_shrink(unsigned long len)
+{
+	ghes_estatus_pool_size_request -= PAGE_ALIGN(len);
+}
+
+static void ghes_proc_in_irq(struct irq_work *irq_work)
+{
+	struct llist_node *llnode, *next;
+	struct ghes_estatus_node *estatus_node;
+	struct acpi_hest_generic *generic;
+	struct acpi_hest_generic_status *estatus;
+	u32 len, node_len;
+
+	llnode = llist_del_all(&ghes_estatus_llist);
+	/*
+	 * Because the time order of estatus in list is reversed,
+	 * revert it back to proper order.
+	 */
+	llnode = llist_reverse_order(llnode);
+	while (llnode) {
+		next = llnode->next;
+		estatus_node = llist_entry(llnode, struct ghes_estatus_node,
+					   llnode);
+		estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
+		len = cper_estatus_len(estatus);
+		node_len = GHES_ESTATUS_NODE_LEN(len);
+		ghes_do_proc(estatus_node->ghes, estatus);
+		if (!ghes_estatus_cached(estatus)) {
+			generic = estatus_node->generic;
+			if (ghes_print_estatus(NULL, generic, estatus))
+				ghes_estatus_cache_add(generic, estatus);
+		}
+		gen_pool_free(ghes_estatus_pool, (unsigned long)estatus_node,
+			      node_len);
+		llnode = next;
+	}
+}
+
+static void ghes_nmi_init_cxt(void)
+{
+	init_irq_work(&ghes_proc_irq_work, ghes_proc_in_irq);
+}
+
+#else
+static inline void ghes_nmi_init_cxt(void) { }
+#endif /* CONFIG_HAVE_ACPI_APEI_NMI */
+
 static int ghes_ack_error(struct acpi_hest_generic_v2 *gv2)
 {
 	int rc;
@@ -687,16 +824,6 @@ static int ghes_ack_error(struct acpi_hest_generic_v2 *gv2)
 	return apei_write(val, &gv2->read_ack_register);
 }
 
-static void __ghes_panic(struct ghes *ghes)
-{
-	__ghes_print_estatus(KERN_EMERG, ghes->generic, ghes->estatus);
-
-	/* reboot to log the error! */
-	if (!panic_timeout)
-		panic_timeout = ghes_panic_timeout;
-	panic("Fatal hardware error!");
-}
-
 static int ghes_proc(struct ghes *ghes)
 {
 	int rc;
@@ -828,17 +955,6 @@ static inline void ghes_sea_remove(struct ghes *ghes) { }
 #endif /* CONFIG_ACPI_APEI_SEA */
 
 #ifdef CONFIG_HAVE_ACPI_APEI_NMI
-/*
- * printk is not safe in NMI context.  So in NMI handler, we allocate
- * required memory from lock-less memory allocator
- * (ghes_estatus_pool), save estatus into it, put them into lock-less
- * list (ghes_estatus_llist), then delay printk into IRQ context via
- * irq_work (ghes_proc_irq_work).  ghes_estatus_size_request record
- * required pool size by all NMI error source.
- */
-static struct llist_head ghes_estatus_llist;
-static struct irq_work ghes_proc_irq_work;
-
 /*
  * NMI may be triggered on any CPU, so ghes_in_nmi is used for
  * having only one concurrent reader.
@@ -847,88 +963,6 @@ static atomic_t ghes_in_nmi = ATOMIC_INIT(0);
 
 static LIST_HEAD(ghes_nmi);
 
-static void ghes_proc_in_irq(struct irq_work *irq_work)
-{
-	struct llist_node *llnode, *next;
-	struct ghes_estatus_node *estatus_node;
-	struct acpi_hest_generic *generic;
-	struct acpi_hest_generic_status *estatus;
-	u32 len, node_len;
-
-	llnode = llist_del_all(&ghes_estatus_llist);
-	/*
-	 * Because the time order of estatus in list is reversed,
-	 * revert it back to proper order.
-	 */
-	llnode = llist_reverse_order(llnode);
-	while (llnode) {
-		next = llnode->next;
-		estatus_node = llist_entry(llnode, struct ghes_estatus_node,
-					   llnode);
-		estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
-		len = cper_estatus_len(estatus);
-		node_len = GHES_ESTATUS_NODE_LEN(len);
-		ghes_do_proc(estatus_node->ghes, estatus);
-		if (!ghes_estatus_cached(estatus)) {
-			generic = estatus_node->generic;
-			if (ghes_print_estatus(NULL, generic, estatus))
-				ghes_estatus_cache_add(generic, estatus);
-		}
-		gen_pool_free(ghes_estatus_pool, (unsigned long)estatus_node,
-			      node_len);
-		llnode = next;
-	}
-}
-
-static void ghes_print_queued_estatus(void)
-{
-	struct llist_node *llnode;
-	struct ghes_estatus_node *estatus_node;
-	struct acpi_hest_generic *generic;
-	struct acpi_hest_generic_status *estatus;
-
-	llnode = llist_del_all(&ghes_estatus_llist);
-	/*
-	 * Because the time order of estatus in list is reversed,
-	 * revert it back to proper order.
-	 */
-	llnode = llist_reverse_order(llnode);
-	while (llnode) {
-		estatus_node = llist_entry(llnode, struct ghes_estatus_node,
-					   llnode);
-		estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
-		generic = estatus_node->generic;
-		ghes_print_estatus(NULL, generic, estatus);
-		llnode = llnode->next;
-	}
-}
-
-/* Save estatus for further processing in IRQ context */
-static void __process_error(struct ghes *ghes)
-{
-#ifdef CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG
-	u32 len, node_len;
-	struct ghes_estatus_node *estatus_node;
-	struct acpi_hest_generic_status *estatus;
-
-	if (ghes_estatus_cached(ghes->estatus))
-		return;
-
-	len = cper_estatus_len(ghes->estatus);
-	node_len = GHES_ESTATUS_NODE_LEN(len);
-
-	estatus_node = (void *)gen_pool_alloc(ghes_estatus_pool, node_len);
-	if (!estatus_node)
-		return;
-
-	estatus_node->ghes = ghes;
-	estatus_node->generic = ghes->generic;
-	estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
-	memcpy(estatus, ghes->estatus, len);
-	llist_add(&estatus_node->llnode, &ghes_estatus_llist);
-#endif
-}
-
 static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 {
 	struct ghes *ghes;
@@ -967,26 +1001,6 @@ static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 	return ret;
 }
 
-static unsigned long ghes_esource_prealloc_size(
-	const struct acpi_hest_generic *generic)
-{
-	unsigned long block_length, prealloc_records, prealloc_size;
-
-	block_length = min_t(unsigned long, generic->error_block_length,
-			     GHES_ESTATUS_MAX_SIZE);
-	prealloc_records = max_t(unsigned long,
-				 generic->records_to_preallocate, 1);
-	prealloc_size = min_t(unsigned long, block_length * prealloc_records,
-			      GHES_ESOURCE_PREALLOC_MAX_SIZE);
-
-	return prealloc_size;
-}
-
-static void ghes_estatus_pool_shrink(unsigned long len)
-{
-	ghes_estatus_pool_size_request -= PAGE_ALIGN(len);
-}
-
 static void ghes_nmi_add(struct ghes *ghes)
 {
 	unsigned long len;
@@ -1018,14 +1032,9 @@ static void ghes_nmi_remove(struct ghes *ghes)
 	ghes_estatus_pool_shrink(len);
 }
 
-static void ghes_nmi_init_cxt(void)
-{
-	init_irq_work(&ghes_proc_irq_work, ghes_proc_in_irq);
-}
 #else /* CONFIG_HAVE_ACPI_APEI_NMI */
 static inline void ghes_nmi_add(struct ghes *ghes) { }
 static inline void ghes_nmi_remove(struct ghes *ghes) { }
-static inline void ghes_nmi_init_cxt(void) { }
 #endif /* CONFIG_HAVE_ACPI_APEI_NMI */
 
 static int ghes_probe(struct platform_device *ghes_dev)
-- 
2.17.1


* [PATCH v5 02/20] ACPI / APEI: Generalise the estatus queue's add/remove and notify code
  2018-06-26 17:00 [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up James Morse
  2018-06-26 17:00 ` [PATCH v5 01/20] ACPI / APEI: Move the estatus queue code up, and under its own ifdef James Morse
@ 2018-06-26 17:00 ` James Morse
  2018-06-26 17:00 ` [PATCH v5 03/20] ACPI / APEI: don't wait to serialise with oops messages when panic()ing James Morse
                   ` (19 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: James Morse @ 2018-06-26 17:00 UTC (permalink / raw)
  To: linux-acpi
  Cc: kvmarm, linux-arm-kernel, linux-mm, Borislav Petkov,
	Marc Zyngier, Christoffer Dall, Will Deacon, Catalin Marinas,
	Naoya Horiguchi, Rafael Wysocki, Len Brown, Tony Luck,
	Tyler Baicar, Dongjiu Geng, Xie XiuQi, Punit Agrawal,
	jonathan.zhang, James Morse

Refactor the estatus queue's pool grow/shrink code and notification
routine from NOTIFY_NMI's handlers. This will allow another notification
method to use the estatus queue without duplicating this code.

This patch adds rcu_read_lock()/rcu_read_unlock() around the
list_for_each_entry_rcu() walker. These aren't strictly necessary as
the whole nmi_enter/nmi_exit() window is a spooky RCU read-side
critical section.

The existing ghes_estatus_pool_shrink() is folded into the new
ghes_estatus_queue_shrink_pool() as only the queue uses it.

_in_nmi_notify_one() is separate from the rcu-list walker for a later
caller that doesn't need to walk a list.
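
For reference, the pool sizing that the grow/shrink helpers rely on can
be worked through in a small userspace sketch. The clamping mirrors
ghes_esource_prealloc_size() as it appears in ghes.c. The 4096-byte page
size, the pool_size_request counter and the helper names here are
assumptions for illustration only, and the real grow path also adds
memory to the pool, which this sketch does not model.

```c
#include <assert.h>

#define ESTATUS_MAX_SIZE		65536UL
#define ESOURCE_PREALLOC_MAX_SIZE	65536UL
#define SKETCH_PAGE_SIZE		4096UL	/* assumed page size */

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/* Clamp the block length, preallocate at least one record, cap the total. */
static unsigned long esource_prealloc_size(unsigned long error_block_length,
					   unsigned long records_to_preallocate)
{
	unsigned long block_length, prealloc_records;

	block_length = min_ul(error_block_length, ESTATUS_MAX_SIZE);
	prealloc_records = max_ul(records_to_preallocate, 1);

	return min_ul(block_length * prealloc_records,
		      ESOURCE_PREALLOC_MAX_SIZE);
}

/* Round up to a whole number of pages, as PAGE_ALIGN() would. */
static unsigned long page_align(unsigned long len)
{
	return (len + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* Stand-in for the running total of memory requested by queue users. */
static unsigned long pool_size_request;

static void queue_grow_pool(unsigned long block_len, unsigned long records)
{
	assert(block_len > 0);
	pool_size_request += page_align(esource_prealloc_size(block_len, records));
}

static void queue_shrink_pool(unsigned long block_len, unsigned long records)
{
	pool_size_request -= page_align(esource_prealloc_size(block_len, records));
}
```

Because grow and shrink both derive the length from the same error
source description, a matched add/remove pair leaves the request counter
where it started.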

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Punit Agrawal <punit.agrawal@arm.com>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>

---
Changes since v3:
 * Removed duplicate or redundant paragraphs in commit message.
 * Fixed the style of a zero check
Changes since v1:
 * Tidied up _in_nmi_notify_one().
---
 drivers/acpi/apei/ghes.c | 100 +++++++++++++++++++++++++--------------
 1 file changed, 65 insertions(+), 35 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index f5732e6b5be8..29d863ff2f87 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -747,6 +747,51 @@ static void __process_error(struct ghes *ghes)
 #endif
 }
 
+static int _in_nmi_notify_one(struct ghes *ghes)
+{
+	int sev;
+
+	if (ghes_read_estatus(ghes, 1)) {
+		ghes_clear_estatus(ghes);
+		return -ENOENT;
+	}
+
+	sev = ghes_severity(ghes->estatus->error_severity);
+	if (sev >= GHES_SEV_PANIC) {
+#ifdef CONFIG_X86
+		oops_begin();
+#endif
+		ghes_print_queued_estatus();
+		__ghes_panic(ghes);
+	}
+
+	if (!(ghes->flags & GHES_TO_CLEAR))
+		return 0;
+
+	__process_error(ghes);
+	ghes_clear_estatus(ghes);
+
+	return 0;
+}
+
+static int ghes_estatus_queue_notified(struct list_head *rcu_list)
+{
+	int ret = -ENOENT;
+	struct ghes *ghes;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(ghes, rcu_list, list) {
+		if (!_in_nmi_notify_one(ghes))
+			ret = 0;
+	}
+	rcu_read_unlock();
+
+	if (IS_ENABLED(CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG) && !ret)
+		irq_work_queue(&ghes_proc_irq_work);
+
+	return ret;
+}
+
 static unsigned long ghes_esource_prealloc_size(
 	const struct acpi_hest_generic *generic)
 {
@@ -762,11 +807,24 @@ static unsigned long ghes_esource_prealloc_size(
 	return prealloc_size;
 }
 
-static void ghes_estatus_pool_shrink(unsigned long len)
+/* After removing a queue user, we can shrink the pool */
+static void ghes_estatus_queue_shrink_pool(struct ghes *ghes)
 {
+	unsigned long len;
+
+	len = ghes_esource_prealloc_size(ghes->generic);
 	ghes_estatus_pool_size_request -= PAGE_ALIGN(len);
 }
 
+/* Before adding a queue user, grow the pool */
+static void ghes_estatus_queue_grow_pool(struct ghes *ghes)
+{
+	unsigned long len;
+
+	len = ghes_esource_prealloc_size(ghes->generic);
+	ghes_estatus_pool_expand(len);
+}
+
 static void ghes_proc_in_irq(struct irq_work *irq_work)
 {
 	struct llist_node *llnode, *next;
@@ -965,48 +1023,22 @@ static LIST_HEAD(ghes_nmi);
 
 static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 {
-	struct ghes *ghes;
-	int sev, ret = NMI_DONE;
+	int ret = NMI_DONE;
 
 	if (!atomic_add_unless(&ghes_in_nmi, 1, 1))
 		return ret;
 
-	list_for_each_entry_rcu(ghes, &ghes_nmi, list) {
-		if (ghes_read_estatus(ghes, 1)) {
-			ghes_clear_estatus(ghes);
-			continue;
-		} else {
-			ret = NMI_HANDLED;
-		}
-
-		sev = ghes_severity(ghes->estatus->error_severity);
-		if (sev >= GHES_SEV_PANIC) {
-			oops_begin();
-			ghes_print_queued_estatus();
-			__ghes_panic(ghes);
-		}
+	if (!ghes_estatus_queue_notified(&ghes_nmi))
+		ret = NMI_HANDLED;
 
-		if (!(ghes->flags & GHES_TO_CLEAR))
-			continue;
-
-		__process_error(ghes);
-		ghes_clear_estatus(ghes);
-	}
-
-#ifdef CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG
-	if (ret == NMI_HANDLED)
-		irq_work_queue(&ghes_proc_irq_work);
-#endif
 	atomic_dec(&ghes_in_nmi);
 	return ret;
 }
 
 static void ghes_nmi_add(struct ghes *ghes)
 {
-	unsigned long len;
+	ghes_estatus_queue_grow_pool(ghes);
 
-	len = ghes_esource_prealloc_size(ghes->generic);
-	ghes_estatus_pool_expand(len);
 	mutex_lock(&ghes_list_mutex);
 	if (list_empty(&ghes_nmi))
 		register_nmi_handler(NMI_LOCAL, ghes_notify_nmi, 0, "ghes");
@@ -1016,8 +1048,6 @@ static void ghes_nmi_add(struct ghes *ghes)
 
 static void ghes_nmi_remove(struct ghes *ghes)
 {
-	unsigned long len;
-
 	mutex_lock(&ghes_list_mutex);
 	list_del_rcu(&ghes->list);
 	if (list_empty(&ghes_nmi))
@@ -1028,8 +1058,8 @@ static void ghes_nmi_remove(struct ghes *ghes)
 	 * freed after NMI handler finishes.
 	 */
 	synchronize_rcu();
-	len = ghes_esource_prealloc_size(ghes->generic);
-	ghes_estatus_pool_shrink(len);
+
+	ghes_estatus_queue_shrink_pool(ghes);
 }
 
 #else /* CONFIG_HAVE_ACPI_APEI_NMI */
-- 
2.17.1


* [PATCH v5 03/20] ACPI / APEI: don't wait to serialise with oops messages when panic()ing
  2018-06-26 17:00 [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up James Morse
  2018-06-26 17:00 ` [PATCH v5 01/20] ACPI / APEI: Move the estatus queue code up, and under its own ifdef James Morse
  2018-06-26 17:00 ` [PATCH v5 02/20] ACPI / APEI: Generalise the estatus queue's add/remove and notify code James Morse
@ 2018-06-26 17:00 ` James Morse
  2018-06-26 17:01 ` [PATCH v5 04/20] ACPI / APEI: Switch NOTIFY_SEA to use the estatus queue James Morse
                   ` (18 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: James Morse @ 2018-06-26 17:00 UTC (permalink / raw)
  To: linux-acpi
  Cc: kvmarm, linux-arm-kernel, linux-mm, Borislav Petkov,
	Marc Zyngier, Christoffer Dall, Will Deacon, Catalin Marinas,
	Naoya Horiguchi, Rafael Wysocki, Len Brown, Tony Luck,
	Tyler Baicar, Dongjiu Geng, Xie XiuQi, Punit Agrawal,
	jonathan.zhang, James Morse

oops_begin() exists to group printk() messages with the oops message
printed by die(). To reach this caller we know that platform firmware
took this error first, then notified the OS via NMI with a 'panic'
severity.

Don't wait for another CPU to release the die-lock before we can
panic(); our only goal is to print this fatal error and panic().

This code is always called in_nmi(), and since 42a0bb3f7138 ("printk/nmi:
generic solution for safe printk in NMI"), it has been safe to call
printk() from this context. Messages are batched in a per-cpu buffer
and printed via irq-work, or a call back from panic().

Link: https://patchwork.kernel.org/patch/10313555/
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/acpi/apei/ghes.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 29d863ff2f87..d7c46236b353 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -33,7 +33,6 @@
 #include <linux/interrupt.h>
 #include <linux/timer.h>
 #include <linux/cper.h>
-#include <linux/kdebug.h>
 #include <linux/platform_device.h>
 #include <linux/mutex.h>
 #include <linux/ratelimit.h>
@@ -758,9 +757,6 @@ static int _in_nmi_notify_one(struct ghes *ghes)
 
 	sev = ghes_severity(ghes->estatus->error_severity);
 	if (sev >= GHES_SEV_PANIC) {
-#ifdef CONFIG_X86
-		oops_begin();
-#endif
 		ghes_print_queued_estatus();
 		__ghes_panic(ghes);
 	}
-- 
2.17.1


* [PATCH v5 04/20] ACPI / APEI: Switch NOTIFY_SEA to use the estatus queue
  2018-06-26 17:00 [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up James Morse
                   ` (2 preceding siblings ...)
  2018-06-26 17:00 ` [PATCH v5 03/20] ACPI / APEI: don't wait to serialise with oops messages when panic()ing James Morse
@ 2018-06-26 17:01 ` James Morse
  2018-06-26 17:01 ` [PATCH v5 05/20] ACPI / APEI: Make estatus queue a Kconfig symbol James Morse
                   ` (17 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: James Morse @ 2018-06-26 17:01 UTC (permalink / raw)
  To: linux-acpi
  Cc: kvmarm, linux-arm-kernel, linux-mm, Borislav Petkov,
	Marc Zyngier, Christoffer Dall, Will Deacon, Catalin Marinas,
	Naoya Horiguchi, Rafael Wysocki, Len Brown, Tony Luck,
	Tyler Baicar, Dongjiu Geng, Xie XiuQi, Punit Agrawal,
	jonathan.zhang, James Morse

Now that the estatus queue can be used by more than one notification
method, we can move notifications that have NMI-like behaviour over to
it, and start abstracting GHES's single in_nmi() path.

Switch NOTIFY_SEA over to use the estatus queue. This makes it behave
in the same way as x86's NOTIFY_NMI.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Punit Agrawal <punit.agrawal@arm.com>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
---
 drivers/acpi/apei/ghes.c | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index d7c46236b353..150fb184c7cb 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -58,6 +58,10 @@
 
 #define GHES_PFX	"GHES: "
 
+#if defined(CONFIG_HAVE_ACPI_APEI_NMI) || defined(CONFIG_ACPI_APEI_SEA)
+#define WANT_NMI_ESTATUS_QUEUE	1
+#endif
+
 #define GHES_ESTATUS_MAX_SIZE		65536
 #define GHES_ESOURCE_PREALLOC_MAX_SIZE	65536
 
@@ -681,7 +685,7 @@ static void ghes_estatus_cache_add(
 	rcu_read_unlock();
 }
 
-#ifdef CONFIG_HAVE_ACPI_APEI_NMI
+#ifdef WANT_NMI_ESTATUS_QUEUE
 /*
  * Handlers for CPER records may not be NMI safe. For example,
  * memory_failure_queue() takes spinlocks and calls schedule_work_on().
@@ -861,7 +865,7 @@ static void ghes_nmi_init_cxt(void)
 
 #else
 static inline void ghes_nmi_init_cxt(void) { }
-#endif /* CONFIG_HAVE_ACPI_APEI_NMI */
+#endif /* WANT_NMI_ESTATUS_QUEUE */
 
 static int ghes_ack_error(struct acpi_hest_generic_v2 *gv2)
 {
@@ -977,20 +981,13 @@ static LIST_HEAD(ghes_sea);
  */
 int ghes_notify_sea(void)
 {
-	struct ghes *ghes;
-	int ret = -ENOENT;
-
-	rcu_read_lock();
-	list_for_each_entry_rcu(ghes, &ghes_sea, list) {
-		if (!ghes_proc(ghes))
-			ret = 0;
-	}
-	rcu_read_unlock();
-	return ret;
+	return ghes_estatus_queue_notified(&ghes_sea);
 }
 
 static void ghes_sea_add(struct ghes *ghes)
 {
+	ghes_estatus_queue_grow_pool(ghes);
+
 	mutex_lock(&ghes_list_mutex);
 	list_add_rcu(&ghes->list, &ghes_sea);
 	mutex_unlock(&ghes_list_mutex);
@@ -1002,6 +999,8 @@ static void ghes_sea_remove(struct ghes *ghes)
 	list_del_rcu(&ghes->list);
 	mutex_unlock(&ghes_list_mutex);
 	synchronize_rcu();
+
+	ghes_estatus_queue_shrink_pool(ghes);
 }
 #else /* CONFIG_ACPI_APEI_SEA */
 static inline void ghes_sea_add(struct ghes *ghes) { }
-- 
2.17.1


* [PATCH v5 05/20] ACPI / APEI: Make estatus queue a Kconfig symbol
  2018-06-26 17:00 [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up James Morse
                   ` (3 preceding siblings ...)
  2018-06-26 17:01 ` [PATCH v5 04/20] ACPI / APEI: Switch NOTIFY_SEA to use the estatus queue James Morse
@ 2018-06-26 17:01 ` James Morse
  2018-06-26 17:01 ` [PATCH v5 06/20] KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing James Morse
                   ` (16 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: James Morse @ 2018-06-26 17:01 UTC (permalink / raw)
  To: linux-acpi
  Cc: kvmarm, linux-arm-kernel, linux-mm, Borislav Petkov,
	Marc Zyngier, Christoffer Dall, Will Deacon, Catalin Marinas,
	Naoya Horiguchi, Rafael Wysocki, Len Brown, Tony Luck,
	Tyler Baicar, Dongjiu Geng, Xie XiuQi, Punit Agrawal,
	jonathan.zhang, James Morse

Now that there are two users of the estatus queue, and there are likely to
be more, make it a Kconfig symbol selected by the appropriate notification.
We can move the ARCH_HAVE_NMI_SAFE_CMPXCHG checks in here too.

Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/acpi/apei/Kconfig |  6 ++++++
 drivers/acpi/apei/ghes.c  | 12 +++---------
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/acpi/apei/Kconfig b/drivers/acpi/apei/Kconfig
index 52ae5438edeb..2b191e09b647 100644
--- a/drivers/acpi/apei/Kconfig
+++ b/drivers/acpi/apei/Kconfig
@@ -4,6 +4,7 @@ config HAVE_ACPI_APEI
 
 config HAVE_ACPI_APEI_NMI
 	bool
+	select ACPI_APEI_GHES_ESTATUS_QUEUE
 
 config ACPI_APEI
 	bool "ACPI Platform Error Interface (APEI)"
@@ -33,6 +34,10 @@ config ACPI_APEI_GHES
 	  by firmware to produce more valuable hardware error
 	  information for Linux.
 
+config ACPI_APEI_GHES_ESTATUS_QUEUE
+	bool
+	depends on ACPI_APEI_GHES && ARCH_HAVE_NMI_SAFE_CMPXCHG
+
 config ACPI_APEI_PCIEAER
 	bool "APEI PCIe AER logging/recovering support"
 	depends on ACPI_APEI && PCIEAER
@@ -43,6 +48,7 @@ config ACPI_APEI_PCIEAER
 config ACPI_APEI_SEA
 	bool "APEI Synchronous External Abort logging/recovering support"
 	depends on ARM64 && ACPI_APEI_GHES
+	select ACPI_APEI_GHES_ESTATUS_QUEUE
 	default y
 	help
 	  This option should be enabled if the system supports
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 150fb184c7cb..2880547e13b8 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -58,10 +58,6 @@
 
 #define GHES_PFX	"GHES: "
 
-#if defined(CONFIG_HAVE_ACPI_APEI_NMI) || defined(CONFIG_ACPI_APEI_SEA)
-#define WANT_NMI_ESTATUS_QUEUE	1
-#endif
-
 #define GHES_ESTATUS_MAX_SIZE		65536
 #define GHES_ESOURCE_PREALLOC_MAX_SIZE	65536
 
@@ -685,7 +681,7 @@ static void ghes_estatus_cache_add(
 	rcu_read_unlock();
 }
 
-#ifdef WANT_NMI_ESTATUS_QUEUE
+#ifdef CONFIG_ACPI_APEI_GHES_ESTATUS_QUEUE
 /*
  * Handlers for CPER records may not be NMI safe. For example,
  * memory_failure_queue() takes spinlocks and calls schedule_work_on().
@@ -727,7 +723,6 @@ static void ghes_print_queued_estatus(void)
 /* Save estatus for further processing in IRQ context */
 static void __process_error(struct ghes *ghes)
 {
-#ifdef CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG
 	u32 len, node_len;
 	struct ghes_estatus_node *estatus_node;
 	struct acpi_hest_generic_status *estatus;
@@ -747,7 +742,6 @@ static void __process_error(struct ghes *ghes)
 	estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
 	memcpy(estatus, ghes->estatus, len);
 	llist_add(&estatus_node->llnode, &ghes_estatus_llist);
-#endif
 }
 
 static int _in_nmi_notify_one(struct ghes *ghes)
@@ -786,7 +780,7 @@ static int ghes_estatus_queue_notified(struct list_head *rcu_list)
 	}
 	rcu_read_unlock();
 
-	if (IS_ENABLED(CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG) && !ret)
+	if (!ret)
 		irq_work_queue(&ghes_proc_irq_work);
 
 	return ret;
@@ -865,7 +859,7 @@ static void ghes_nmi_init_cxt(void)
 
 #else
 static inline void ghes_nmi_init_cxt(void) { }
-#endif /* WANT_NMI_ESTATUS_QUEUE */
+#endif /* CONFIG_ACPI_APEI_GHES_ESTATUS_QUEUE */
 
 static int ghes_ack_error(struct acpi_hest_generic_v2 *gv2)
 {
-- 
2.17.1

* [PATCH v5 06/20] KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing
  2018-06-26 17:00 [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up James Morse
                   ` (4 preceding siblings ...)
  2018-06-26 17:01 ` [PATCH v5 05/20] ACPI / APEI: Make estatus queue a Kconfig symbol James Morse
@ 2018-06-26 17:01 ` James Morse
  2018-06-26 17:01 ` [PATCH v5 07/20] arm64: KVM/mm: Move SEA handling behind a single 'claim' interface James Morse
                   ` (15 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: James Morse @ 2018-06-26 17:01 UTC (permalink / raw)
  To: linux-acpi
  Cc: kvmarm, linux-arm-kernel, linux-mm, Borislav Petkov,
	Marc Zyngier, Christoffer Dall, Will Deacon, Catalin Marinas,
	Naoya Horiguchi, Rafael Wysocki, Len Brown, Tony Luck,
	Tyler Baicar, Dongjiu Geng, Xie XiuQi, Punit Agrawal,
	jonathan.zhang, James Morse

To split up APEI's in_nmi() path, we need any NMI-like callers to always
be in_nmi(). KVM shouldn't have to know about this, so pull the RAS
plumbing out into a header file.

Currently guest synchronous external aborts are claimed as RAS
notifications by handle_guest_sea(), which is hidden in the arch code's
mm/fault.c. 32-bit gets a dummy declaration in system_misc.h.

There is going to be more of this in the future if/when we support
the SError-based firmware-first notification mechanism and/or
kernel-first notifications for both synchronous external abort and
SError. Each of these will come with some Kconfig symbols and a
handful of header files.

Create a header file for all this.

This patch gives handle_guest_sea() a 'kvm_' prefix, and moves the
declarations to kvm_ras.h as preparation for a future patch that moves
the ACPI-specific RAS code out of mm/fault.c.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
---
 arch/arm/include/asm/kvm_ras.h       | 14 ++++++++++++++
 arch/arm/include/asm/system_misc.h   |  5 -----
 arch/arm64/include/asm/kvm_ras.h     | 11 +++++++++++
 arch/arm64/include/asm/system_misc.h |  2 --
 arch/arm64/mm/fault.c                |  2 +-
 virt/kvm/arm/mmu.c                   |  4 ++--
 6 files changed, 28 insertions(+), 10 deletions(-)
 create mode 100644 arch/arm/include/asm/kvm_ras.h
 create mode 100644 arch/arm64/include/asm/kvm_ras.h

diff --git a/arch/arm/include/asm/kvm_ras.h b/arch/arm/include/asm/kvm_ras.h
new file mode 100644
index 000000000000..aaff56bf338f
--- /dev/null
+++ b/arch/arm/include/asm/kvm_ras.h
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2018 - Arm Ltd
+
+#ifndef __ARM_KVM_RAS_H__
+#define __ARM_KVM_RAS_H__
+
+#include <linux/types.h>
+
+static inline int kvm_handle_guest_sea(phys_addr_t addr, unsigned int esr)
+{
+	return -1;
+}
+
+#endif /* __ARM_KVM_RAS_H__ */
diff --git a/arch/arm/include/asm/system_misc.h b/arch/arm/include/asm/system_misc.h
index 8e76db83c498..66f6a3ae68d2 100644
--- a/arch/arm/include/asm/system_misc.h
+++ b/arch/arm/include/asm/system_misc.h
@@ -38,11 +38,6 @@ static inline void harden_branch_predictor(void)
 
 extern unsigned int user_debug;
 
-static inline int handle_guest_sea(phys_addr_t addr, unsigned int esr)
-{
-	return -1;
-}
-
 #endif /* !__ASSEMBLY__ */
 
 #endif /* __ASM_ARM_SYSTEM_MISC_H */
diff --git a/arch/arm64/include/asm/kvm_ras.h b/arch/arm64/include/asm/kvm_ras.h
new file mode 100644
index 000000000000..5f72b07b7912
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_ras.h
@@ -0,0 +1,11 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2018 - Arm Ltd
+
+#ifndef __ARM64_KVM_RAS_H__
+#define __ARM64_KVM_RAS_H__
+
+#include <linux/types.h>
+
+int kvm_handle_guest_sea(phys_addr_t addr, unsigned int esr);
+
+#endif /* __ARM64_KVM_RAS_H__ */
diff --git a/arch/arm64/include/asm/system_misc.h b/arch/arm64/include/asm/system_misc.h
index 28893a0b141d..48ded3628a89 100644
--- a/arch/arm64/include/asm/system_misc.h
+++ b/arch/arm64/include/asm/system_misc.h
@@ -45,8 +45,6 @@ extern void __show_regs(struct pt_regs *);
 
 extern void (*arm_pm_restart)(enum reboot_mode reboot_mode, const char *cmd);
 
-int handle_guest_sea(phys_addr_t addr, unsigned int esr);
-
 #endif	/* __ASSEMBLY__ */
 
 #endif	/* __ASM_SYSTEM_MISC_H */
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index b8eecc7b9531..167772fe4360 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -725,7 +725,7 @@ static const struct fault_info fault_info[] = {
 	{ do_bad,		SIGKILL, SI_KERNEL,	"unknown 63"			},
 };
 
-int handle_guest_sea(phys_addr_t addr, unsigned int esr)
+int kvm_handle_guest_sea(phys_addr_t addr, unsigned int esr)
 {
 	int ret = -ENOENT;
 
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 1d90d79706bd..15b85c10f6db 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -27,10 +27,10 @@
 #include <asm/kvm_arm.h>
 #include <asm/kvm_mmu.h>
 #include <asm/kvm_mmio.h>
+#include <asm/kvm_ras.h>
 #include <asm/kvm_asm.h>
 #include <asm/kvm_emulate.h>
 #include <asm/virt.h>
-#include <asm/system_misc.h>
 
 #include "trace.h"
 
@@ -1650,7 +1650,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		 * For RAS the host kernel may handle this abort.
 		 * There is no need to pass the error into the guest.
 		 */
-		if (!handle_guest_sea(fault_ipa, kvm_vcpu_get_hsr(vcpu)))
+		if (!kvm_handle_guest_sea(fault_ipa, kvm_vcpu_get_hsr(vcpu)))
 			return 1;
 
 		if (unlikely(!is_iabt)) {
-- 
2.17.1

* [PATCH v5 07/20] arm64: KVM/mm: Move SEA handling behind a single 'claim' interface
  2018-06-26 17:00 [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up James Morse
                   ` (5 preceding siblings ...)
  2018-06-26 17:01 ` [PATCH v5 06/20] KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing James Morse
@ 2018-06-26 17:01 ` James Morse
  2018-06-26 17:01 ` [PATCH v5 08/20] ACPI / APEI: Move locking to the notification helper James Morse
                   ` (14 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: James Morse @ 2018-06-26 17:01 UTC (permalink / raw)
  To: linux-acpi
  Cc: kvmarm, linux-arm-kernel, linux-mm, Borislav Petkov,
	Marc Zyngier, Christoffer Dall, Will Deacon, Catalin Marinas,
	Naoya Horiguchi, Rafael Wysocki, Len Brown, Tony Luck,
	Tyler Baicar, Dongjiu Geng, Xie XiuQi, Punit Agrawal,
	jonathan.zhang, James Morse

To split up APEI's in_nmi() path, we need the NMI-like callers to always
be in_nmi(). Add a helper to do the work and claim the notification.

When KVM or the arch code takes an exception that might be a RAS
notification, it asks the APEI firmware-first code whether it wants
to claim the exception. We can then go on to see if (a future)
kernel-first mechanism wants to claim the notification, before
falling through to the existing default behaviour.

The NOTIFY_SEA code was merged before we had multiple, possibly
interacting, NMI-like notifications and the need to consider
kernel-first in the future. Make the 'claiming' behaviour explicit.

As we're restructuring the APEI code to allow multiple NMI-like
notifications, any notification that might interrupt interrupts-masked
code must always be wrapped in nmi_enter()/nmi_exit(). This allows APEI
to use in_nmi() to pick the right fixmap entries.

We mask SError over this window to prevent an asynchronous RAS error
arriving and tripping nmi_enter()'s BUG_ON(in_nmi()).

Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>

---
Why does apei_claim_sea() take a pt_regs? This gets used later to take
APEI by the hand through NMI->IRQ context, depending on what we
interrupted.

Changes since v4:
 * Made irqs-unmasked comment a lockdep assert.

Changes since v3:
 * Removed spurious whitespace change
 * Updated comment in acpi.c to cover SError masking

Changes since v2:
 * Added dummy definition for !ACPI and culled IS_ENABLED() checks.

squash: make 'call with irqs unmasked' a lockdep assert, much better
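
Not part of the patch: the 'claim' ordering described in the commit
message could be sketched in plain userspace C roughly as below. This is
only an illustration under stated assumptions. Every demo_* name is
hypothetical, and the kernel-first hook is a placeholder for a mechanism
that this series does not add.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical state: did firmware describe an error for this abort? */
static bool demo_fw_has_record;

/* Stand-in for apei_claim_sea(): 0 when claimed, -ENOENT otherwise. */
static int demo_claim_firmware_first(void)
{
	return demo_fw_has_record ? 0 : -ENOENT;
}

/* Placeholder for a possible future kernel-first mechanism; claims nothing. */
static int demo_claim_kernel_first(void)
{
	return -ENOENT;
}

enum demo_outcome { DEMO_CLAIMED, DEMO_DEFAULT };

/* Ask each mechanism in turn, then fall through to the default path. */
static enum demo_outcome demo_handle_sea(void)
{
	if (!demo_claim_firmware_first())
		return DEMO_CLAIMED;
	if (!demo_claim_kernel_first())
		return DEMO_CLAIMED;
	return DEMO_DEFAULT;
}
```

The point of the sketch is the ordering: the caller does not need to
know which mechanism claimed the exception, only whether anyone did.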
---
 arch/arm64/include/asm/acpi.h      |  4 ++++
 arch/arm64/include/asm/daifflags.h |  1 +
 arch/arm64/include/asm/kvm_ras.h   | 16 +++++++++++++++-
 arch/arm64/kernel/acpi.c           | 30 ++++++++++++++++++++++++++++++
 arch/arm64/mm/fault.c              | 29 +++++------------------------
 5 files changed, 55 insertions(+), 25 deletions(-)

diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
index 0db62a4cbce2..bfd23f0f0060 100644
--- a/arch/arm64/include/asm/acpi.h
+++ b/arch/arm64/include/asm/acpi.h
@@ -16,6 +16,7 @@
 #include <linux/psci.h>
 
 #include <asm/cputype.h>
+#include <asm/ptrace.h>
 #include <asm/smp_plat.h>
 #include <asm/tlbflush.h>
 
@@ -130,6 +131,9 @@ static inline const char *acpi_get_enable_method(int cpu)
  */
 #define acpi_disable_cmcff 1
 pgprot_t arch_apei_get_mem_attribute(phys_addr_t addr);
+int apei_claim_sea(struct pt_regs *regs);
+#else
+static inline int apei_claim_sea(struct pt_regs *regs) { return -ENOENT; }
 #endif /* CONFIG_ACPI_APEI */
 
 #ifdef CONFIG_ACPI_NUMA
diff --git a/arch/arm64/include/asm/daifflags.h b/arch/arm64/include/asm/daifflags.h
index 22e4c83de5a5..cbd753855bf3 100644
--- a/arch/arm64/include/asm/daifflags.h
+++ b/arch/arm64/include/asm/daifflags.h
@@ -20,6 +20,7 @@
 
 #define DAIF_PROCCTX		0
 #define DAIF_PROCCTX_NOIRQ	PSR_I_BIT
+#define DAIF_ERRCTX		(PSR_I_BIT | PSR_A_BIT)
 
 /* mask/save/unmask/restore all exceptions, including interrupts. */
 static inline void local_daif_mask(void)
diff --git a/arch/arm64/include/asm/kvm_ras.h b/arch/arm64/include/asm/kvm_ras.h
index 5f72b07b7912..5b56e7e297b1 100644
--- a/arch/arm64/include/asm/kvm_ras.h
+++ b/arch/arm64/include/asm/kvm_ras.h
@@ -4,8 +4,22 @@
 #ifndef __ARM64_KVM_RAS_H__
 #define __ARM64_KVM_RAS_H__
 
+#include <linux/acpi.h>
+#include <linux/errno.h>
 #include <linux/types.h>
 
-int kvm_handle_guest_sea(phys_addr_t addr, unsigned int esr);
+#include <asm/acpi.h>
+
+/*
+ * Was this synchronous external abort a RAS notification?
+ * Returns '0' for errors handled by some RAS subsystem, or -ENOENT.
+ */
+static inline int kvm_handle_guest_sea(phys_addr_t addr, unsigned int esr)
+{
+	/* apei_claim_sea(NULL) expects to mask interrupts itself */
+	lockdep_assert_irqs_enabled();
+
+	return apei_claim_sea(NULL);
+}
 
 #endif /* __ARM64_KVM_RAS_H__ */
diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
index 7b09487ff8fb..df2c6bff8c58 100644
--- a/arch/arm64/kernel/acpi.c
+++ b/arch/arm64/kernel/acpi.c
@@ -33,6 +33,8 @@
 
 #ifdef CONFIG_ACPI_APEI
 # include <linux/efi.h>
+# include <acpi/ghes.h>
+# include <asm/daifflags.h>
 # include <asm/pgtable.h>
 #endif
 
@@ -261,4 +263,32 @@ pgprot_t arch_apei_get_mem_attribute(phys_addr_t addr)
 		return __pgprot(PROT_NORMAL_NC);
 	return __pgprot(PROT_DEVICE_nGnRnE);
 }
+
+
+/*
+ * Claim Synchronous External Aborts as a firmware first notification.
+ *
+ * Used by KVM and the arch do_sea handler.
+ * @regs may be NULL when called from process context.
+ */
+int apei_claim_sea(struct pt_regs *regs)
+{
+	int err = -ENOENT;
+	unsigned long current_flags = arch_local_save_flags();
+
+	if (!IS_ENABLED(CONFIG_ACPI_APEI_SEA))
+		return err;
+
+	/*
+	 * SEA can interrupt SError, mask it and describe this as an NMI so
+	 * that APEI defers the handling.
+	 */
+	local_daif_restore(DAIF_ERRCTX);
+	nmi_enter();
+	err = ghes_notify_sea();
+	nmi_exit();
+	local_daif_restore(current_flags);
+
+	return err;
+}
 #endif
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 167772fe4360..fb2761172cd4 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -18,6 +18,7 @@
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <linux/acpi.h>
 #include <linux/extable.h>
 #include <linux/signal.h>
 #include <linux/mm.h>
@@ -33,6 +34,7 @@
 #include <linux/preempt.h>
 #include <linux/hugetlb.h>
 
+#include <asm/acpi.h>
 #include <asm/bug.h>
 #include <asm/cmpxchg.h>
 #include <asm/cpufeature.h>
@@ -45,8 +47,6 @@
 #include <asm/tlbflush.h>
 #include <asm/traps.h>
 
-#include <acpi/ghes.h>
-
 struct fault_info {
 	int	(*fn)(unsigned long addr, unsigned int esr,
 		      struct pt_regs *regs);
@@ -631,19 +631,10 @@ static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs)
 	inf = esr_to_fault_info(esr);
 
 	/*
-	 * Synchronous aborts may interrupt code which had interrupts masked.
-	 * Before calling out into the wider kernel tell the interested
-	 * subsystems.
+	 * Return value ignored as we rely on signal merging.
+	 * Future patches will make this more robust.
 	 */
-	if (IS_ENABLED(CONFIG_ACPI_APEI_SEA)) {
-		if (interrupts_enabled(regs))
-			nmi_enter();
-
-		ghes_notify_sea();
-
-		if (interrupts_enabled(regs))
-			nmi_exit();
-	}
+	apei_claim_sea(regs);
 
 	clear_siginfo(&info);
 	info.si_signo = inf->sig;
@@ -725,16 +716,6 @@ static const struct fault_info fault_info[] = {
 	{ do_bad,		SIGKILL, SI_KERNEL,	"unknown 63"			},
 };
 
-int kvm_handle_guest_sea(phys_addr_t addr, unsigned int esr)
-{
-	int ret = -ENOENT;
-
-	if (IS_ENABLED(CONFIG_ACPI_APEI_SEA))
-		ret = ghes_notify_sea();
-
-	return ret;
-}
-
 asmlinkage void __exception do_mem_abort(unsigned long addr, unsigned int esr,
 					 struct pt_regs *regs)
 {
-- 
2.17.1

* [PATCH v5 08/20] ACPI / APEI: Move locking to the notification helper
  2018-06-26 17:00 [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up James Morse
                   ` (6 preceding siblings ...)
  2018-06-26 17:01 ` [PATCH v5 07/20] arm64: KVM/mm: Move SEA handling behind a single 'claim' interface James Morse
@ 2018-06-26 17:01 ` James Morse
  2018-06-26 17:01 ` [PATCH v5 09/20] ACPI / APEI: Let the notification helper specify the fixmap slot James Morse
                   ` (13 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: James Morse @ 2018-06-26 17:01 UTC (permalink / raw)
  To: linux-acpi
  Cc: kvmarm, linux-arm-kernel, linux-mm, Borislav Petkov,
	Marc Zyngier, Christoffer Dall, Will Deacon, Catalin Marinas,
	Naoya Horiguchi, Rafael Wysocki, Len Brown, Tony Luck,
	Tyler Baicar, Dongjiu Geng, Xie XiuQi, Punit Agrawal,
	jonathan.zhang, James Morse

ghes_copy_tofrom_phys() takes different locks depending on in_nmi().
This doesn't work when we have multiple NMI-like notifications that
can interrupt each other.

Now that NOTIFY_SEA is always called as an NMI, move the lock-taking
to the notification helper. The helper will always know which lock
to take. This avoids ghes_copy_tofrom_phys() taking a guess based
on in_nmi().

This splits NOTIFY_NMI and NOTIFY_SEA to use different locks. All
the other notifications use ghes_proc(), and are called in process
or IRQ context. Move the spin_lock_irqsave() into ghes_proc().

Signed-off-by: James Morse <james.morse@arm.com>
---
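
Not part of the patch: a minimal userspace sketch of why each NMI-like
notification gets its own lock, assuming toy ownership-flag locks in
place of raw spinlocks. All demo_* names are made up for illustration,
and the 'nested_sea' flag merely models one notification interrupting
another.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock: records ownership only, enough to show which path holds what. */
struct demo_lock { bool held; };

static struct demo_lock demo_lock_nmi, demo_lock_sea;

static bool demo_trylock(struct demo_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void demo_unlock(struct demo_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Stand-in for the copy routine: takes no lock and makes no context guess. */
static int demo_copy_records(void)
{
	return 0;
}

/* SEA-like helper: owns its lock for the whole notification. */
static int demo_notify_sea(void)
{
	int rv;

	if (!demo_trylock(&demo_lock_sea))
		return -1;	/* a real spinlock would deadlock here */
	rv = demo_copy_records();
	demo_unlock(&demo_lock_sea);
	return rv;
}

/* NMI-like helper; 'nested_sea' models an SEA arriving mid-handler. */
static int demo_notify_nmi(bool nested_sea)
{
	int rv;

	if (!demo_trylock(&demo_lock_nmi))
		return -1;
	rv = demo_copy_records();
	if (nested_sea && demo_notify_sea())
		rv = -1;
	demo_unlock(&demo_lock_nmi);
	return rv;
}
```

With a single lock shared by both helpers, the nested call would find
the lock already held by the context it interrupted.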
 drivers/acpi/apei/ghes.c | 38 +++++++++++++++++++++++++-------------
 1 file changed, 25 insertions(+), 13 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 2880547e13b8..f30e6fae57c0 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -113,12 +113,13 @@ static DEFINE_MUTEX(ghes_list_mutex);
  * from BIOS to Linux can be determined only in NMI, IRQ or timer
  * handler, but general ioremap can not be used in atomic context, so
  * the fixmap is used instead.
- *
- * These 2 spinlocks are used to prevent the fixmap entries from being used
- * simultaneously.
  */
-static DEFINE_RAW_SPINLOCK(ghes_ioremap_lock_nmi);
-static DEFINE_SPINLOCK(ghes_ioremap_lock_irq);
+
+/*
+ * Used by ghes_proc() to prevent non-NMI notifications from interacting.
+ * This also protects the FIX_APEI_GHES_IRQ fixmap slot.
+ */
+static DEFINE_SPINLOCK(ghes_notify_lock_irq);
 
 static struct gen_pool *ghes_estatus_pool;
 static unsigned long ghes_estatus_pool_size_request;
@@ -291,7 +292,6 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
 				  int from_phys)
 {
 	void __iomem *vaddr;
-	unsigned long flags = 0;
 	int in_nmi = in_nmi();
 	u64 offset;
 	u32 trunk;
@@ -299,10 +299,8 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
 	while (len > 0) {
 		offset = paddr - (paddr & PAGE_MASK);
 		if (in_nmi) {
-			raw_spin_lock(&ghes_ioremap_lock_nmi);
 			vaddr = ghes_ioremap_pfn_nmi(paddr >> PAGE_SHIFT);
 		} else {
-			spin_lock_irqsave(&ghes_ioremap_lock_irq, flags);
 			vaddr = ghes_ioremap_pfn_irq(paddr >> PAGE_SHIFT);
 		}
 		trunk = PAGE_SIZE - offset;
@@ -316,10 +314,8 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
 		buffer += trunk;
 		if (in_nmi) {
 			ghes_iounmap_nmi();
-			raw_spin_unlock(&ghes_ioremap_lock_nmi);
 		} else {
 			ghes_iounmap_irq();
-			spin_unlock_irqrestore(&ghes_ioremap_lock_irq, flags);
 		}
 	}
 }
@@ -879,6 +875,9 @@ static int ghes_ack_error(struct acpi_hest_generic_v2 *gv2)
 static int ghes_proc(struct ghes *ghes)
 {
 	int rc;
+	unsigned long flags;
+
+	spin_lock_irqsave(&ghes_notify_lock_irq, flags);
 
 	rc = ghes_read_estatus(ghes, 0);
 	if (rc)
@@ -898,14 +897,17 @@ static int ghes_proc(struct ghes *ghes)
 	ghes_clear_estatus(ghes);
 
 	if (rc == -ENOENT)
-		return rc;
+		goto unlock;
 
 	/*
 	 * GHESv2 type HEST entries introduce support for error acknowledgment,
 	 * so only acknowledge the error if this support is present.
 	 */
 	if (is_hest_type_generic_v2(ghes))
-		return ghes_ack_error(ghes->generic_v2);
+		rc = ghes_ack_error(ghes->generic_v2);
+
+unlock:
+	spin_unlock_irqrestore(&ghes_notify_lock_irq, flags);
 
 	return rc;
 }
@@ -968,6 +970,7 @@ static struct notifier_block ghes_notifier_hed = {
 
 #ifdef CONFIG_ACPI_APEI_SEA
 static LIST_HEAD(ghes_sea);
+static DEFINE_RAW_SPINLOCK(ghes_notify_lock_sea);
 
 /*
  * Return 0 only if one of the SEA error sources successfully reported an error
@@ -975,7 +978,13 @@ static LIST_HEAD(ghes_sea);
  */
 int ghes_notify_sea(void)
 {
-	return ghes_estatus_queue_notified(&ghes_sea);
+	int rv;
+
+	raw_spin_lock(&ghes_notify_lock_sea);
+	rv = ghes_estatus_queue_notified(&ghes_sea);
+	raw_spin_unlock(&ghes_notify_lock_sea);
+
+	return rv;
 }
 
 static void ghes_sea_add(struct ghes *ghes)
@@ -1009,6 +1018,7 @@ static inline void ghes_sea_remove(struct ghes *ghes) { }
 static atomic_t ghes_in_nmi = ATOMIC_INIT(0);
 
 static LIST_HEAD(ghes_nmi);
+static DEFINE_RAW_SPINLOCK(ghes_notify_lock_nmi);
 
 static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 {
@@ -1017,8 +1027,10 @@ static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 	if (!atomic_add_unless(&ghes_in_nmi, 1, 1))
 		return ret;
 
+	raw_spin_lock(&ghes_notify_lock_nmi);
 	if (!ghes_estatus_queue_notified(&ghes_nmi))
 		ret = NMI_HANDLED;
+	raw_spin_unlock(&ghes_notify_lock_nmi);
 
 	atomic_dec(&ghes_in_nmi);
 	return ret;
-- 
2.17.1

* [PATCH v5 09/20] ACPI / APEI: Let the notification helper specify the fixmap slot
  2018-06-26 17:00 [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up James Morse
                   ` (7 preceding siblings ...)
  2018-06-26 17:01 ` [PATCH v5 08/20] ACPI / APEI: Move locking to the notification helper James Morse
@ 2018-06-26 17:01 ` James Morse
  2018-06-26 17:01 ` [PATCH v5 10/20] ACPI / APEI: preparatory split of ghes->estatus James Morse
                   ` (12 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: James Morse @ 2018-06-26 17:01 UTC (permalink / raw)
  To: linux-acpi
  Cc: kvmarm, linux-arm-kernel, linux-mm, Borislav Petkov,
	Marc Zyngier, Christoffer Dall, Will Deacon, Catalin Marinas,
	Naoya Horiguchi, Rafael Wysocki, Len Brown, Tony Luck,
	Tyler Baicar, Dongjiu Geng, Xie XiuQi, Punit Agrawal,
	jonathan.zhang, James Morse

ghes_copy_tofrom_phys() uses a different fixmap slot depending on in_nmi().
This doesn't work when we have multiple NMI-like notifications that
can interrupt each other.

As with the locking, move the chosen fixmap_idx to the notification helper.
This only matters for NMI-like notifications; anything calling
ghes_proc() can use the IRQ fixmap slot.

This lets us collapse the ghes_ioremap_pfn_*() helpers.

Signed-off-by: James Morse <james.morse@arm.com>
---
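
Not part of the patch: a userspace sketch of a page-at-a-time copy where
the caller names the mapping slot, assuming a fake 'physical' array and
a 16-byte page. The demo_* names and the slot enum are hypothetical and
only loosely modelled on FIX_APEI_GHES_IRQ/FIX_APEI_GHES_NMI.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE_SIZE 16u

/* Hypothetical slot indices, one per kind of caller. */
enum demo_slot { DEMO_SLOT_IRQ, DEMO_SLOT_NMI, DEMO_NR_SLOTS };

static unsigned char demo_phys[4 * DEMO_PAGE_SIZE];	/* fake physical memory */
static const unsigned char *demo_slots[DEMO_NR_SLOTS];	/* one mapping per slot */

/* Map the page containing 'paddr' into the slot the caller picked. */
static const unsigned char *demo_fixmap(size_t paddr, enum demo_slot idx)
{
	assert(idx < DEMO_NR_SLOTS);
	demo_slots[idx] = &demo_phys[paddr - (paddr % DEMO_PAGE_SIZE)];
	return demo_slots[idx];
}

static void demo_clear_fixmap(enum demo_slot idx)
{
	demo_slots[idx] = NULL;
}

/* Copy 'len' bytes out of fake physical memory, one page per iteration. */
static void demo_copy_from_phys(unsigned char *buf, size_t paddr, size_t len,
				enum demo_slot idx)
{
	while (len > 0) {
		size_t offset = paddr % DEMO_PAGE_SIZE;
		size_t chunk = DEMO_PAGE_SIZE - offset;
		const unsigned char *vaddr = demo_fixmap(paddr, idx);

		if (chunk > len)
			chunk = len;
		memcpy(buf, vaddr + offset, chunk);
		len -= chunk;
		paddr += chunk;
		buf += chunk;
		demo_clear_fixmap(idx);
	}
}
```

The copy routine never inspects its calling context; a caller using
DEMO_SLOT_NMI leaves DEMO_SLOT_IRQ untouched.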
 drivers/acpi/apei/ghes.c | 73 ++++++++++++----------------------------
 1 file changed, 21 insertions(+), 52 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index f30e6fae57c0..77505cfa930e 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -129,38 +129,16 @@ static atomic_t ghes_estatus_cache_alloced;
 
 static int ghes_panic_timeout __read_mostly = 30;
 
-static void __iomem *ghes_ioremap_pfn_nmi(u64 pfn)
+static void __iomem *ghes_fixmap(u64 pfn, int fixmap_idx)
 {
 	phys_addr_t paddr;
 	pgprot_t prot;
 
 	paddr = pfn << PAGE_SHIFT;
 	prot = arch_apei_get_mem_attribute(paddr);
-	__set_fixmap(FIX_APEI_GHES_NMI, paddr, prot);
+	__set_fixmap(fixmap_idx, paddr, prot);
 
-	return (void __iomem *) fix_to_virt(FIX_APEI_GHES_NMI);
-}
-
-static void __iomem *ghes_ioremap_pfn_irq(u64 pfn)
-{
-	phys_addr_t paddr;
-	pgprot_t prot;
-
-	paddr = pfn << PAGE_SHIFT;
-	prot = arch_apei_get_mem_attribute(paddr);
-	__set_fixmap(FIX_APEI_GHES_IRQ, paddr, prot);
-
-	return (void __iomem *) fix_to_virt(FIX_APEI_GHES_IRQ);
-}
-
-static void ghes_iounmap_nmi(void)
-{
-	clear_fixmap(FIX_APEI_GHES_NMI);
-}
-
-static void ghes_iounmap_irq(void)
-{
-	clear_fixmap(FIX_APEI_GHES_IRQ);
+	return (void __iomem *) __fix_to_virt(fixmap_idx);
 }
 
 static int ghes_estatus_pool_init(void)
@@ -289,20 +267,15 @@ static inline int ghes_severity(int severity)
 }
 
 static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
-				  int from_phys)
+				  int from_phys, int fixmap_idx)
 {
 	void __iomem *vaddr;
-	int in_nmi = in_nmi();
 	u64 offset;
 	u32 trunk;
 
 	while (len > 0) {
 		offset = paddr - (paddr & PAGE_MASK);
-		if (in_nmi) {
-			vaddr = ghes_ioremap_pfn_nmi(paddr >> PAGE_SHIFT);
-		} else {
-			vaddr = ghes_ioremap_pfn_irq(paddr >> PAGE_SHIFT);
-		}
+		vaddr = ghes_fixmap(paddr >> PAGE_SHIFT, fixmap_idx);
 		trunk = PAGE_SIZE - offset;
 		trunk = min(trunk, len);
 		if (from_phys)
@@ -312,15 +285,11 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
 		len -= trunk;
 		paddr += trunk;
 		buffer += trunk;
-		if (in_nmi) {
-			ghes_iounmap_nmi();
-		} else {
-			ghes_iounmap_irq();
-		}
+		clear_fixmap(fixmap_idx);
 	}
 }
 
-static int ghes_read_estatus(struct ghes *ghes, int silent)
+static int ghes_read_estatus(struct ghes *ghes, int silent, int fixmap_idx)
 {
 	struct acpi_hest_generic *g = ghes->generic;
 	u64 buf_paddr;
@@ -339,7 +308,7 @@ static int ghes_read_estatus(struct ghes *ghes, int silent)
 		return -ENOENT;
 
 	ghes_copy_tofrom_phys(ghes->estatus, buf_paddr,
-			      sizeof(*ghes->estatus), 1);
+			      sizeof(*ghes->estatus), 1, fixmap_idx);
 	if (!ghes->estatus->block_status)
 		return -ENOENT;
 
@@ -356,7 +325,7 @@ static int ghes_read_estatus(struct ghes *ghes, int silent)
 		goto err_read_block;
 	ghes_copy_tofrom_phys(ghes->estatus + 1,
 			      buf_paddr + sizeof(*ghes->estatus),
-			      len - sizeof(*ghes->estatus), 1);
+			      len - sizeof(*ghes->estatus), 1, fixmap_idx);
 	if (cper_estatus_check(ghes->estatus))
 		goto err_read_block;
 	rc = 0;
@@ -368,13 +337,13 @@ static int ghes_read_estatus(struct ghes *ghes, int silent)
 	return rc;
 }
 
-static void ghes_clear_estatus(struct ghes *ghes)
+static void ghes_clear_estatus(struct ghes *ghes, int fixmap_idx)
 {
 	ghes->estatus->block_status = 0;
 	if (!(ghes->flags & GHES_TO_CLEAR))
 		return;
 	ghes_copy_tofrom_phys(ghes->estatus, ghes->buffer_paddr,
-			      sizeof(ghes->estatus->block_status), 0);
+			      sizeof(ghes->estatus->block_status), 0, fixmap_idx);
 	ghes->flags &= ~GHES_TO_CLEAR;
 }
 
@@ -740,12 +709,12 @@ static void __process_error(struct ghes *ghes)
 	llist_add(&estatus_node->llnode, &ghes_estatus_llist);
 }
 
-static int _in_nmi_notify_one(struct ghes *ghes)
+static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
 {
 	int sev;
 
-	if (ghes_read_estatus(ghes, 1)) {
-		ghes_clear_estatus(ghes);
+	if (ghes_read_estatus(ghes, 1, fixmap_idx)) {
+		ghes_clear_estatus(ghes, fixmap_idx);
 		return -ENOENT;
 	}
 
@@ -759,19 +728,19 @@ static int _in_nmi_notify_one(struct ghes *ghes)
 		return 0;
 
 	__process_error(ghes);
-	ghes_clear_estatus(ghes);
+	ghes_clear_estatus(ghes, fixmap_idx);
 
 	return 0;
 }
 
-static int ghes_estatus_queue_notified(struct list_head *rcu_list)
+static int ghes_estatus_queue_notified(struct list_head *rcu_list, int fixmap_idx)
 {
 	int ret = -ENOENT;
 	struct ghes *ghes;
 
 	rcu_read_lock();
 	list_for_each_entry_rcu(ghes, rcu_list, list) {
-		if (!_in_nmi_notify_one(ghes))
+		if (!_in_nmi_notify_one(ghes, fixmap_idx))
 			ret = 0;
 	}
 	rcu_read_unlock();
@@ -879,7 +848,7 @@ static int ghes_proc(struct ghes *ghes)
 
 	spin_lock_irqsave(&ghes_notify_lock_irq, flags);
 
-	rc = ghes_read_estatus(ghes, 0);
+	rc = ghes_read_estatus(ghes, 0, FIX_APEI_GHES_IRQ);
 	if (rc)
 		goto out;
 
@@ -894,7 +863,7 @@ static int ghes_proc(struct ghes *ghes)
 	ghes_do_proc(ghes, ghes->estatus);
 
 out:
-	ghes_clear_estatus(ghes);
+	ghes_clear_estatus(ghes, FIX_APEI_GHES_IRQ);
 
 	if (rc == -ENOENT)
 		goto unlock;
@@ -981,7 +950,7 @@ int ghes_notify_sea(void)
 	int rv;
 
 	raw_spin_lock(&ghes_notify_lock_sea);
-	rv = ghes_estatus_queue_notified(&ghes_sea);
+	rv = ghes_estatus_queue_notified(&ghes_sea, FIX_APEI_GHES_NMI);
 	raw_spin_unlock(&ghes_notify_lock_sea);
 
 	return rv;
@@ -1028,7 +997,7 @@ static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 		return ret;
 
 	raw_spin_lock(&ghes_notify_lock_nmi);
-	if (!ghes_estatus_queue_notified(&ghes_nmi))
+	if (!ghes_estatus_queue_notified(&ghes_nmi, FIX_APEI_GHES_NMI))
 		ret = NMI_HANDLED;
 	raw_spin_unlock(&ghes_notify_lock_nmi);
 
-- 
2.17.1

* [PATCH v5 10/20] ACPI / APEI: preparatory split of ghes->estatus
  2018-06-26 17:00 [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up James Morse
                   ` (8 preceding siblings ...)
  2018-06-26 17:01 ` [PATCH v5 09/20] ACPI / APEI: Let the notification helper specify the fixmap slot James Morse
@ 2018-06-26 17:01 ` James Morse
  2018-06-26 17:01 ` [PATCH v5 11/20] ACPI / APEI: Remove silent flag from ghes_read_estatus() James Morse
                   ` (11 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: James Morse @ 2018-06-26 17:01 UTC (permalink / raw)
  To: linux-acpi
  Cc: kvmarm, linux-arm-kernel, linux-mm, Borislav Petkov,
	Marc Zyngier, Christoffer Dall, Will Deacon, Catalin Marinas,
	Naoya Horiguchi, Rafael Wysocki, Len Brown, Tony Luck,
	Tyler Baicar, Dongjiu Geng, Xie XiuQi, Punit Agrawal,
	jonathan.zhang, James Morse

The NMI-like notifications scribble over ghes->estatus before
copying it somewhere else. If this interrupts the ghes_probe() code
calling ghes_proc() on each struct ghes, the data is corrupted.

We want the NMI-like notifications to use a queued estatus entry
from the beginning. To that end, break up any use of "ghes->estatus"
so that all functions take the estatus as an argument.

This patch only changes how the estatus is passed around; there is no
change in behaviour.

Signed-off-by: James Morse <james.morse@arm.com>
---
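
Not part of the patch: a small userspace sketch of the signature change,
assuming a simplified two-field record. The demo_* types and function
are hypothetical. In this patch the callers still pass ghes->estatus;
the sketch shows what becomes possible once the destination is an
argument.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Simplified record; the real structure has more fields. */
struct demo_estatus {
	unsigned int block_status;
	unsigned int severity;
};

struct demo_ghes {
	struct demo_estatus fw_record;	/* what firmware wrote */
	struct demo_estatus shared;	/* models the per-ghes estatus buffer */
};

/* The destination is supplied by the caller, not taken from 'ghes'. */
static int demo_read_estatus(const struct demo_ghes *ghes,
			     struct demo_estatus *estatus)
{
	assert(estatus);
	memcpy(estatus, &ghes->fw_record, sizeof(*estatus));
	return estatus->block_status ? 0 : -ENOENT;
}
```

A caller that passes its own buffer leaves the shared one alone, which
is the property the NMI-like paths need.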
 drivers/acpi/apei/ghes.c | 82 ++++++++++++++++++++++------------------
 1 file changed, 45 insertions(+), 37 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 77505cfa930e..9bb00a06ba6e 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -289,7 +289,9 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
 	}
 }
 
-static int ghes_read_estatus(struct ghes *ghes, int silent, int fixmap_idx)
+static int ghes_read_estatus(struct ghes *ghes,
+			     struct acpi_hest_generic_status *estatus,
+			     int silent, int fixmap_idx)
 {
 	struct acpi_hest_generic *g = ghes->generic;
 	u64 buf_paddr;
@@ -307,26 +309,26 @@ static int ghes_read_estatus(struct ghes *ghes, int silent, int fixmap_idx)
 	if (!buf_paddr)
 		return -ENOENT;
 
-	ghes_copy_tofrom_phys(ghes->estatus, buf_paddr,
-			      sizeof(*ghes->estatus), 1, fixmap_idx);
-	if (!ghes->estatus->block_status)
+	ghes_copy_tofrom_phys(estatus, buf_paddr,
+			      sizeof(*estatus), 1, fixmap_idx);
+	if (!estatus->block_status)
 		return -ENOENT;
 
 	ghes->buffer_paddr = buf_paddr;
 	ghes->flags |= GHES_TO_CLEAR;
 
 	rc = -EIO;
-	len = cper_estatus_len(ghes->estatus);
-	if (len < sizeof(*ghes->estatus))
+	len = cper_estatus_len(estatus);
+	if (len < sizeof(*estatus))
 		goto err_read_block;
 	if (len > ghes->generic->error_block_length)
 		goto err_read_block;
-	if (cper_estatus_check_header(ghes->estatus))
+	if (cper_estatus_check_header(estatus))
 		goto err_read_block;
-	ghes_copy_tofrom_phys(ghes->estatus + 1,
-			      buf_paddr + sizeof(*ghes->estatus),
-			      len - sizeof(*ghes->estatus), 1, fixmap_idx);
-	if (cper_estatus_check(ghes->estatus))
+	ghes_copy_tofrom_phys(estatus + 1,
+			      buf_paddr + sizeof(*estatus),
+			      len - sizeof(*estatus), 1, fixmap_idx);
+	if (cper_estatus_check(estatus))
 		goto err_read_block;
 	rc = 0;
 
@@ -337,13 +339,15 @@ static int ghes_read_estatus(struct ghes *ghes, int silent, int fixmap_idx)
 	return rc;
 }
 
-static void ghes_clear_estatus(struct ghes *ghes, int fixmap_idx)
+static void ghes_clear_estatus(struct ghes *ghes,
+			       struct acpi_hest_generic_status *estatus,
+			       int fixmap_idx)
 {
-	ghes->estatus->block_status = 0;
+	estatus->block_status = 0;
 	if (!(ghes->flags & GHES_TO_CLEAR))
 		return;
-	ghes_copy_tofrom_phys(ghes->estatus, ghes->buffer_paddr,
-			      sizeof(ghes->estatus->block_status), 0, fixmap_idx);
+	ghes_copy_tofrom_phys(estatus, ghes->buffer_paddr,
+			      sizeof(estatus->block_status), 0, fixmap_idx);
 	ghes->flags &= ~GHES_TO_CLEAR;
 }
 
@@ -509,9 +513,10 @@ static int ghes_print_estatus(const char *pfx,
 	return 0;
 }
 
-static void __ghes_panic(struct ghes *ghes)
+static void __ghes_panic(struct ghes *ghes,
+			 struct acpi_hest_generic_status *estatus)
 {
-	__ghes_print_estatus(KERN_EMERG, ghes->generic, ghes->estatus);
+	__ghes_print_estatus(KERN_EMERG, ghes->generic, estatus);
 
 	/* reboot to log the error! */
 	if (!panic_timeout)
@@ -686,16 +691,17 @@ static void ghes_print_queued_estatus(void)
 }
 
 /* Save estatus for further processing in IRQ context */
-static void __process_error(struct ghes *ghes)
+static void __process_error(struct ghes *ghes,
+			    struct acpi_hest_generic_status *ghes_estatus)
 {
 	u32 len, node_len;
 	struct ghes_estatus_node *estatus_node;
 	struct acpi_hest_generic_status *estatus;
 
-	if (ghes_estatus_cached(ghes->estatus))
+	if (ghes_estatus_cached(ghes_estatus))
 		return;
 
-	len = cper_estatus_len(ghes->estatus);
+	len = cper_estatus_len(ghes_estatus);
 	node_len = GHES_ESTATUS_NODE_LEN(len);
 
 	estatus_node = (void *)gen_pool_alloc(ghes_estatus_pool, node_len);
@@ -705,35 +711,37 @@ static void __process_error(struct ghes *ghes)
 	estatus_node->ghes = ghes;
 	estatus_node->generic = ghes->generic;
 	estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
-	memcpy(estatus, ghes->estatus, len);
+	memcpy(estatus, ghes_estatus, len);
 	llist_add(&estatus_node->llnode, &ghes_estatus_llist);
 }
 
 static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
 {
 	int sev;
+	struct acpi_hest_generic_status *estatus = ghes->estatus;
 
-	if (ghes_read_estatus(ghes, 1, fixmap_idx)) {
-		ghes_clear_estatus(ghes, fixmap_idx);
+	if (ghes_read_estatus(ghes, estatus, 1, fixmap_idx)) {
+		ghes_clear_estatus(ghes, estatus, fixmap_idx);
 		return -ENOENT;
 	}
 
-	sev = ghes_severity(ghes->estatus->error_severity);
+	sev = ghes_severity(estatus->error_severity);
 	if (sev >= GHES_SEV_PANIC) {
 		ghes_print_queued_estatus();
-		__ghes_panic(ghes);
+		__ghes_panic(ghes, estatus);
 	}
 
 	if (!(ghes->flags & GHES_TO_CLEAR))
 		return 0;
 
-	__process_error(ghes);
-	ghes_clear_estatus(ghes, fixmap_idx);
+	__process_error(ghes, estatus);
+	ghes_clear_estatus(ghes, estatus, fixmap_idx);
 
 	return 0;
 }
 
-static int ghes_estatus_queue_notified(struct list_head *rcu_list, int fixmap_idx)
+static int ghes_estatus_queue_notified(struct list_head *rcu_list,
+				       int fixmap_idx)
 {
 	int ret = -ENOENT;
 	struct ghes *ghes;
@@ -845,25 +853,25 @@ static int ghes_proc(struct ghes *ghes)
 {
 	int rc;
 	unsigned long flags;
+	struct acpi_hest_generic_status *estatus = ghes->estatus;
 
 	spin_lock_irqsave(&ghes_notify_lock_irq, flags);
 
-	rc = ghes_read_estatus(ghes, 0, FIX_APEI_GHES_IRQ);
+	rc = ghes_read_estatus(ghes, estatus, 0, FIX_APEI_GHES_IRQ);
 	if (rc)
 		goto out;
 
-	if (ghes_severity(ghes->estatus->error_severity) >= GHES_SEV_PANIC) {
-		__ghes_panic(ghes);
-	}
+	if (ghes_severity(estatus->error_severity) >= GHES_SEV_PANIC)
+		__ghes_panic(ghes, estatus);
 
-	if (!ghes_estatus_cached(ghes->estatus)) {
-		if (ghes_print_estatus(NULL, ghes->generic, ghes->estatus))
-			ghes_estatus_cache_add(ghes->generic, ghes->estatus);
+	if (!ghes_estatus_cached(estatus)) {
+		if (ghes_print_estatus(NULL, ghes->generic, estatus))
+			ghes_estatus_cache_add(ghes->generic, estatus);
 	}
-	ghes_do_proc(ghes, ghes->estatus);
+	ghes_do_proc(ghes, estatus);
 
 out:
-	ghes_clear_estatus(ghes, FIX_APEI_GHES_IRQ);
+	ghes_clear_estatus(ghes, estatus, FIX_APEI_GHES_IRQ);
 
 	if (rc == -ENOENT)
 		goto unlock;
-- 
2.17.1


* [PATCH v5 11/20] ACPI / APEI: Remove silent flag from ghes_read_estatus()
  2018-06-26 17:00 [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up James Morse
                   ` (9 preceding siblings ...)
  2018-06-26 17:01 ` [PATCH v5 10/20] ACPI / APEI: preparatory split of ghes->estatus James Morse
@ 2018-06-26 17:01 ` James Morse
  2018-06-26 17:01 ` [PATCH v5 12/20] ACPI / APEI: Don't store CPER records physical address in struct ghes James Morse
                   ` (10 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: James Morse @ 2018-06-26 17:01 UTC (permalink / raw)
  To: linux-acpi
  Cc: kvmarm, linux-arm-kernel, linux-mm, Borislav Petkov,
	Marc Zyngier, Christoffer Dall, Will Deacon, Catalin Marinas,
	Naoya Horiguchi, Rafael Wysocki, Len Brown, Tony Luck,
	Tyler Baicar, Dongjiu Geng, Xie XiuQi, Punit Agrawal,
	jonathan.zhang, James Morse

Subsequent patches will split up ghes_read_estatus(), at which
point passing around the 'silent' flag gets annoying. The flag exists
to suppress printk() messages, which prior to 42a0bb3f7138 ("printk/nmi:
generic solution for safe printk in NMI"), were unsafe in NMI context.

We don't need to do this anymore, so remove the flag. printk() messages
are batched in a per-cpu buffer and printed via irq-work, or a callback
from panic().

Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/acpi/apei/ghes.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 9bb00a06ba6e..7b412508b3ea 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -291,7 +291,7 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
 
 static int ghes_read_estatus(struct ghes *ghes,
 			     struct acpi_hest_generic_status *estatus,
-			     int silent, int fixmap_idx)
+			     int fixmap_idx)
 {
 	struct acpi_hest_generic *g = ghes->generic;
 	u64 buf_paddr;
@@ -300,7 +300,7 @@ static int ghes_read_estatus(struct ghes *ghes,
 
 	rc = apei_read(&buf_paddr, &g->error_status_address);
 	if (rc) {
-		if (!silent && printk_ratelimit())
+		if (printk_ratelimit())
 			pr_warning(FW_WARN GHES_PFX
 "Failed to read error status block address for hardware error source: %d.\n",
 				   g->header.source_id);
@@ -333,7 +333,7 @@ static int ghes_read_estatus(struct ghes *ghes,
 	rc = 0;
 
 err_read_block:
-	if (rc && !silent && printk_ratelimit())
+	if (rc && printk_ratelimit())
 		pr_warning(FW_WARN GHES_PFX
 			   "Failed to read error status block!\n");
 	return rc;
@@ -720,7 +720,7 @@ static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
 	int sev;
 	struct acpi_hest_generic_status *estatus = ghes->estatus;
 
-	if (ghes_read_estatus(ghes, estatus, 1, fixmap_idx)) {
+	if (ghes_read_estatus(ghes, estatus, fixmap_idx)) {
 		ghes_clear_estatus(ghes, estatus, fixmap_idx);
 		return -ENOENT;
 	}
@@ -857,7 +857,7 @@ static int ghes_proc(struct ghes *ghes)
 
 	spin_lock_irqsave(&ghes_notify_lock_irq, flags);
 
-	rc = ghes_read_estatus(ghes, estatus, 0, FIX_APEI_GHES_IRQ);
+	rc = ghes_read_estatus(ghes, estatus, FIX_APEI_GHES_IRQ);
 	if (rc)
 		goto out;
 
-- 
2.17.1


* [PATCH v5 12/20] ACPI / APEI: Don't store CPER records physical address in struct ghes
  2018-06-26 17:00 [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up James Morse
                   ` (10 preceding siblings ...)
  2018-06-26 17:01 ` [PATCH v5 11/20] ACPI / APEI: Remove silent flag from ghes_read_estatus() James Morse
@ 2018-06-26 17:01 ` James Morse
  2018-06-26 20:55   ` kbuild test robot
  2018-06-26 17:01 ` [PATCH v5 13/20] ACPI / APEI: Don't update struct ghes' flags in read/clear estatus James Morse
                   ` (9 subsequent siblings)
  21 siblings, 1 reply; 26+ messages in thread
From: James Morse @ 2018-06-26 17:01 UTC (permalink / raw)
  To: linux-acpi
  Cc: kvmarm, linux-arm-kernel, linux-mm, Borislav Petkov,
	Marc Zyngier, Christoffer Dall, Will Deacon, Catalin Marinas,
	Naoya Horiguchi, Rafael Wysocki, Len Brown, Tony Luck,
	Tyler Baicar, Dongjiu Geng, Xie XiuQi, Punit Agrawal,
	jonathan.zhang, James Morse

When CPER records are found the address of the records is stashed
in the struct ghes. Once the records have been processed, this
address is overwritten with zero so that it won't be processed
again without being re-populated by firmware.

This goes wrong if a struct ghes can be processed concurrently,
as can happen at probe time when an NMI occurs.

Avoid this stashing by letting the caller hold the address. A
later patch will do away with the use of ghes->flags in the
read/clear code too.

Signed-off-by: James Morse <james.morse@arm.com>
---
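For illustration only (not part of the patch): a minimal user-space
sketch of the caller-held address described above. fake_phys,
FAKE_RECORD_ADDR, read_status() and clear_status() are hypothetical
stand-ins for firmware memory, ghes_read_estatus() and
ghes_clear_estatus(); the real code goes through a fixmap mapping.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the CPER block header. */
struct status_hdr {
	uint32_t block_status;
};

/* Pretend firmware memory; "physical addresses" are offsets into it. */
static unsigned char fake_phys[64];
#define FAKE_RECORD_ADDR 16	/* non-zero, like a real record address */

/*
 * Read the header. The record address is returned through *paddr,
 * which lives on the caller's stack rather than in a shared struct.
 */
static int read_status(struct status_hdr *st, size_t *paddr)
{
	*paddr = FAKE_RECORD_ADDR;
	memcpy(st, &fake_phys[*paddr], sizeof(*st));
	return st->block_status ? 0 : -1;
}

/* Clear block_status at the address the caller was handed earlier. */
static void clear_status(struct status_hdr *st, size_t paddr)
{
	st->block_status = 0;
	memcpy(&fake_phys[paddr], &st->block_status,
	       sizeof(st->block_status));
}
```

Two contexts that interrupt each other each keep their own paddr on
their own stack, so neither can overwrite the other's copy.
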
 drivers/acpi/apei/ghes.c | 30 +++++++++++++++---------------
 include/acpi/ghes.h      |  1 -
 2 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 7b412508b3ea..b0054dfad9cc 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -266,7 +266,7 @@ static inline int ghes_severity(int severity)
 	}
 }
 
-static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
+static void ghes_copy_tofrom_phys(void *buffer, phys_addr_t paddr, u32 len,
 				  int from_phys, int fixmap_idx)
 {
 	void __iomem *vaddr;
@@ -291,14 +291,13 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
 
 static int ghes_read_estatus(struct ghes *ghes,
 			     struct acpi_hest_generic_status *estatus,
-			     int fixmap_idx)
+			     phys_addr_t *buf_paddr, int fixmap_idx)
 {
 	struct acpi_hest_generic *g = ghes->generic;
-	u64 buf_paddr;
 	u32 len;
 	int rc;
 
-	rc = apei_read(&buf_paddr, &g->error_status_address);
+	rc = apei_read(buf_paddr, &g->error_status_address);
 	if (rc) {
 		if (printk_ratelimit())
 			pr_warning(FW_WARN GHES_PFX
@@ -306,15 +305,14 @@ static int ghes_read_estatus(struct ghes *ghes,
 				   g->header.source_id);
 		return -EIO;
 	}
-	if (!buf_paddr)
+	if (!*buf_paddr)
 		return -ENOENT;
 
-	ghes_copy_tofrom_phys(estatus, buf_paddr,
+	ghes_copy_tofrom_phys(estatus, *buf_paddr,
 			      sizeof(*estatus), 1, fixmap_idx);
 	if (!estatus->block_status)
 		return -ENOENT;
 
-	ghes->buffer_paddr = buf_paddr;
 	ghes->flags |= GHES_TO_CLEAR;
 
 	rc = -EIO;
@@ -326,7 +324,7 @@ static int ghes_read_estatus(struct ghes *ghes,
 	if (cper_estatus_check_header(estatus))
 		goto err_read_block;
 	ghes_copy_tofrom_phys(estatus + 1,
-			      buf_paddr + sizeof(*estatus),
+			      *buf_paddr + sizeof(*estatus),
 			      len - sizeof(*estatus), 1, fixmap_idx);
 	if (cper_estatus_check(estatus))
 		goto err_read_block;
@@ -341,12 +339,12 @@ static int ghes_read_estatus(struct ghes *ghes,
 
 static void ghes_clear_estatus(struct ghes *ghes,
 			       struct acpi_hest_generic_status *estatus,
-			       int fixmap_idx)
+			       phys_addr_t buf_paddr, int fixmap_idx)
 {
 	estatus->block_status = 0;
 	if (!(ghes->flags & GHES_TO_CLEAR))
 		return;
-	ghes_copy_tofrom_phys(estatus, ghes->buffer_paddr,
+	ghes_copy_tofrom_phys(estatus, buf_paddr,
 			      sizeof(estatus->block_status), 0, fixmap_idx);
 	ghes->flags &= ~GHES_TO_CLEAR;
 }
@@ -718,10 +716,11 @@ static void __process_error(struct ghes *ghes,
 static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
 {
 	int sev;
+	phys_addr_t buf_paddr;
 	struct acpi_hest_generic_status *estatus = ghes->estatus;
 
-	if (ghes_read_estatus(ghes, estatus, fixmap_idx)) {
-		ghes_clear_estatus(ghes, estatus, fixmap_idx);
+	if (ghes_read_estatus(ghes, estatus, &buf_paddr, fixmap_idx)) {
+		ghes_clear_estatus(ghes, estatus, buf_paddr, fixmap_idx);
 		return -ENOENT;
 	}
 
@@ -735,7 +734,7 @@ static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
 		return 0;
 
 	__process_error(ghes, estatus);
-	ghes_clear_estatus(ghes, estatus, fixmap_idx);
+	ghes_clear_estatus(ghes, estatus, buf_paddr, fixmap_idx);
 
 	return 0;
 }
@@ -853,11 +852,12 @@ static int ghes_proc(struct ghes *ghes)
 {
 	int rc;
 	unsigned long flags;
+	phys_addr_t buf_paddr;
 	struct acpi_hest_generic_status *estatus = ghes->estatus;
 
 	spin_lock_irqsave(&ghes_notify_lock_irq, flags);
 
-	rc = ghes_read_estatus(ghes, estatus, FIX_APEI_GHES_IRQ);
+	rc = ghes_read_estatus(ghes, estatus, &buf_paddr, FIX_APEI_GHES_IRQ);
 	if (rc)
 		goto out;
 
@@ -871,7 +871,7 @@ static int ghes_proc(struct ghes *ghes)
 	ghes_do_proc(ghes, estatus);
 
 out:
-	ghes_clear_estatus(ghes, estatus, FIX_APEI_GHES_IRQ);
+	ghes_clear_estatus(ghes, estatus, buf_paddr, FIX_APEI_GHES_IRQ);
 
 	if (rc == -ENOENT)
 		goto unlock;
diff --git a/include/acpi/ghes.h b/include/acpi/ghes.h
index 1624e2be485c..3d77452e3a1d 100644
--- a/include/acpi/ghes.h
+++ b/include/acpi/ghes.h
@@ -22,7 +22,6 @@ struct ghes {
 		struct acpi_hest_generic_v2 *generic_v2;
 	};
 	struct acpi_hest_generic_status *estatus;
-	u64 buffer_paddr;
 	unsigned long flags;
 	union {
 		struct list_head list;
-- 
2.17.1


* [PATCH v5 13/20] ACPI / APEI: Don't update struct ghes' flags in read/clear estatus
  2018-06-26 17:00 [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up James Morse
                   ` (11 preceding siblings ...)
  2018-06-26 17:01 ` [PATCH v5 12/20] ACPI / APEI: Don't store CPER records physical address in struct ghes James Morse
@ 2018-06-26 17:01 ` James Morse
  2018-06-26 17:01 ` [PATCH v5 14/20] ACPI / APEI: Split ghes_read_estatus() to read CPER length James Morse
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: James Morse @ 2018-06-26 17:01 UTC (permalink / raw)
  To: linux-acpi
  Cc: kvmarm, linux-arm-kernel, linux-mm, Borislav Petkov,
	Marc Zyngier, Christoffer Dall, Will Deacon, Catalin Marinas,
	Naoya Horiguchi, Rafael Wysocki, Len Brown, Tony Luck,
	Tyler Baicar, Dongjiu Geng, Xie XiuQi, Punit Agrawal,
	jonathan.zhang, James Morse

ghes_read_estatus() sets a flag in struct ghes if the buffer of
CPER records needs to be cleared once the records have been
processed. This global flags value is a problem if a struct ghes
can be processed concurrently, as happens at probe time if an
NMI arrives for the same error source.

The GHES_TO_CLEAR flag was only set at the same time as
buffer_paddr, which is now owned by the caller and passed to
ghes_clear_estatus(). Use the address itself as the flag.

A non-zero buf_paddr returned by ghes_read_estatus() means
ghes_clear_estatus() will clear this address. ghes_read_estatus()
already checks for a read of error_status_address being zero,
so we can never get CPER records written at zero.

After this, ghes_clear_estatus() no longer needs the struct ghes.

Signed-off-by: James Morse <james.morse@arm.com>
---
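For illustration only (not part of the patch): a user-space sketch of
using a zero address as the 'nothing to clear' flag. fake_phys, writes,
read_status() and clear_status() are hypothetical names; the 'writes'
counter exists only so the behaviour can be observed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static unsigned char fake_phys[64];	/* pretend firmware memory */
static int writes;			/* counts write-backs, for demonstration */

/* Returns 0 with a non-zero *paddr only if there is something to clear. */
static int read_status(size_t record_addr, uint32_t *status, size_t *paddr)
{
	*paddr = record_addr;
	if (!*paddr)
		return -1;	/* firmware published no address */

	memcpy(status, &fake_phys[*paddr], sizeof(*status));
	if (!*status) {
		*paddr = 0;	/* nothing pending: caller must not clear */
		return -1;
	}
	return 0;
}

/* Write the cleared status back only when the address is non-zero. */
static void clear_status(uint32_t *status, size_t paddr)
{
	*status = 0;
	if (paddr) {
		memcpy(&fake_phys[paddr], status, sizeof(*status));
		writes++;
	}
}
```

The caller can invoke clear_status() unconditionally on its error path;
the zero address makes it a no-op without any shared flag.
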
 drivers/acpi/apei/ghes.c | 26 ++++++++++++--------------
 include/acpi/ghes.h      |  1 -
 2 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index b0054dfad9cc..75360525935d 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -310,10 +310,10 @@ static int ghes_read_estatus(struct ghes *ghes,
 
 	ghes_copy_tofrom_phys(estatus, *buf_paddr,
 			      sizeof(*estatus), 1, fixmap_idx);
-	if (!estatus->block_status)
+	if (!estatus->block_status) {
+		*buf_paddr = 0;
 		return -ENOENT;
-
-	ghes->flags |= GHES_TO_CLEAR;
+	}
 
 	rc = -EIO;
 	len = cper_estatus_len(estatus);
@@ -337,16 +337,14 @@ static int ghes_read_estatus(struct ghes *ghes,
 	return rc;
 }
 
-static void ghes_clear_estatus(struct ghes *ghes,
-			       struct acpi_hest_generic_status *estatus,
+static void ghes_clear_estatus(struct acpi_hest_generic_status *estatus,
 			       phys_addr_t buf_paddr, int fixmap_idx)
 {
 	estatus->block_status = 0;
-	if (!(ghes->flags & GHES_TO_CLEAR))
-		return;
-	ghes_copy_tofrom_phys(estatus, buf_paddr,
-			      sizeof(estatus->block_status), 0, fixmap_idx);
-	ghes->flags &= ~GHES_TO_CLEAR;
+	if (buf_paddr)
+		ghes_copy_tofrom_phys(estatus, buf_paddr,
+				      sizeof(estatus->block_status), 0,
+				      fixmap_idx);
 }
 
 static void ghes_handle_memory_failure(struct acpi_hest_generic_data *gdata, int sev)
@@ -720,7 +718,7 @@ static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
 	struct acpi_hest_generic_status *estatus = ghes->estatus;
 
 	if (ghes_read_estatus(ghes, estatus, &buf_paddr, fixmap_idx)) {
-		ghes_clear_estatus(ghes, estatus, buf_paddr, fixmap_idx);
+		ghes_clear_estatus(estatus, buf_paddr, fixmap_idx);
 		return -ENOENT;
 	}
 
@@ -730,11 +728,11 @@ static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
 		__ghes_panic(ghes, estatus);
 	}
 
-	if (!(ghes->flags & GHES_TO_CLEAR))
+	if (!buf_paddr)
 		return 0;
 
 	__process_error(ghes, estatus);
-	ghes_clear_estatus(ghes, estatus, buf_paddr, fixmap_idx);
+	ghes_clear_estatus(estatus, buf_paddr, fixmap_idx);
 
 	return 0;
 }
@@ -871,7 +869,7 @@ static int ghes_proc(struct ghes *ghes)
 	ghes_do_proc(ghes, estatus);
 
 out:
-	ghes_clear_estatus(ghes, estatus, buf_paddr, FIX_APEI_GHES_IRQ);
+	ghes_clear_estatus(estatus, buf_paddr, FIX_APEI_GHES_IRQ);
 
 	if (rc == -ENOENT)
 		goto unlock;
diff --git a/include/acpi/ghes.h b/include/acpi/ghes.h
index 3d77452e3a1d..0b6fe48e6671 100644
--- a/include/acpi/ghes.h
+++ b/include/acpi/ghes.h
@@ -13,7 +13,6 @@
  * estatus: memory buffer for error status block, allocated during
  * HEST parsing.
  */
-#define GHES_TO_CLEAR		0x0001
 #define GHES_EXITING		0x0002
 
 struct ghes {
-- 
2.17.1


* [PATCH v5 14/20] ACPI / APEI: Split ghes_read_estatus() to read CPER length
  2018-06-26 17:00 [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up James Morse
                   ` (12 preceding siblings ...)
  2018-06-26 17:01 ` [PATCH v5 13/20] ACPI / APEI: Don't update struct ghes' flags in read/clear estatus James Morse
@ 2018-06-26 17:01 ` James Morse
  2018-06-26 17:01 ` [PATCH v5 15/20] ACPI / APEI: Only use queued estatus entry during _in_nmi_notify_one() James Morse
                   ` (7 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: James Morse @ 2018-06-26 17:01 UTC (permalink / raw)
  To: linux-acpi
  Cc: kvmarm, linux-arm-kernel, linux-mm, Borislav Petkov,
	Marc Zyngier, Christoffer Dall, Will Deacon, Catalin Marinas,
	Naoya Horiguchi, Rafael Wysocki, Len Brown, Tony Luck,
	Tyler Baicar, Dongjiu Geng, Xie XiuQi, Punit Agrawal,
	jonathan.zhang, James Morse

ghes_read_estatus() reads the record address, then the record's
header, then performs some sanity checks before reading the
records into the provided estatus buffer.

We either need to know the size of the records before we call
ghes_read_estatus(), or always provide a worst-case sized buffer,
as happens today.

Add a function to peek at the record's header to find the size. This
will let the NMI path allocate the right amount of memory before reading
the records, instead of using the worst-case size, and having to copy
the records.

Split ghes_read_estatus() to create ghes_peek_estatus() which
returns the address and size of the CPER records.

Signed-off-by: James Morse <james.morse@arm.com>
---
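For illustration only (not part of the patch): a user-space sketch of
the peek-then-read split. struct rec_hdr, peek_records() and
read_records() are hypothetical; the real header is struct
acpi_hest_generic_status and the length comes from cper_estatus_len().

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header: a status word plus the payload length. */
struct rec_hdr {
	uint32_t block_status;
	uint32_t data_length;
};

/* Read only the header and report the total record size. */
static int peek_records(const unsigned char *src, size_t max_len, size_t *len)
{
	struct rec_hdr hdr;

	if (max_len < sizeof(hdr))
		return -1;
	memcpy(&hdr, src, sizeof(hdr));
	if (!hdr.block_status)
		return -1;	/* nothing to read */
	if (hdr.data_length > max_len - sizeof(hdr))
		return -1;	/* claims more than the region holds */

	*len = sizeof(hdr) + hdr.data_length;
	return 0;
}

/* Allocate exactly *len bytes, then read header and payload in one go. */
static void *read_records(const unsigned char *src, size_t max_len, size_t *len)
{
	void *buf;

	if (peek_records(src, max_len, len))
		return NULL;

	buf = malloc(*len);
	if (buf)
		memcpy(buf, src, *len);	/* the header is read a second time */
	return buf;
}
```

The cost is the second read of the header; the gain is that no
worst-case sized staging buffer is needed.
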
 drivers/acpi/apei/ghes.c | 55 ++++++++++++++++++++++++++++++----------
 1 file changed, 41 insertions(+), 14 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 75360525935d..1d59d85b38d2 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -289,11 +289,12 @@ static void ghes_copy_tofrom_phys(void *buffer, phys_addr_t paddr, u32 len,
 	}
 }
 
-static int ghes_read_estatus(struct ghes *ghes,
-			     struct acpi_hest_generic_status *estatus,
-			     phys_addr_t *buf_paddr, int fixmap_idx)
+/* read the CPER block returning its address and size */
+static int ghes_peek_estatus(struct ghes *ghes, int fixmap_idx,
+			     phys_addr_t *buf_paddr, u32 *buf_len)
 {
 	struct acpi_hest_generic *g = ghes->generic;
+	struct acpi_hest_generic_status estatus;
 	u32 len;
 	int rc;
 
@@ -308,26 +309,23 @@ static int ghes_read_estatus(struct ghes *ghes,
 	if (!*buf_paddr)
 		return -ENOENT;
 
-	ghes_copy_tofrom_phys(estatus, *buf_paddr,
-			      sizeof(*estatus), 1, fixmap_idx);
-	if (!estatus->block_status) {
+	ghes_copy_tofrom_phys(&estatus, *buf_paddr,
+			      sizeof(estatus), 1, fixmap_idx);
+	if (!estatus.block_status) {
 		*buf_paddr = 0;
 		return -ENOENT;
 	}
 
 	rc = -EIO;
-	len = cper_estatus_len(estatus);
-	if (len < sizeof(*estatus))
+	len = cper_estatus_len(&estatus);
+	if (len < sizeof(estatus))
 		goto err_read_block;
 	if (len > ghes->generic->error_block_length)
 		goto err_read_block;
-	if (cper_estatus_check_header(estatus))
-		goto err_read_block;
-	ghes_copy_tofrom_phys(estatus + 1,
-			      *buf_paddr + sizeof(*estatus),
-			      len - sizeof(*estatus), 1, fixmap_idx);
-	if (cper_estatus_check(estatus))
+	if (cper_estatus_check_header(&estatus))
 		goto err_read_block;
+	*buf_len = len;
+
 	rc = 0;
 
 err_read_block:
@@ -337,6 +335,35 @@ static int ghes_read_estatus(struct ghes *ghes,
 	return rc;
 }
 
+static int __ghes_read_estatus(struct acpi_hest_generic_status *estatus,
+			       phys_addr_t buf_paddr, size_t buf_len,
+			       int fixmap_idx)
+{
+	ghes_copy_tofrom_phys(estatus, buf_paddr, buf_len, 1, fixmap_idx);
+	if (cper_estatus_check(estatus)) {
+		if (printk_ratelimit())
+			pr_warning(FW_WARN GHES_PFX
+				   "Failed to read error status block!\n");
+		return -EIO;
+	}
+
+	return 0;
+}
+
+static int ghes_read_estatus(struct ghes *ghes,
+			     struct acpi_hest_generic_status *estatus,
+			     phys_addr_t *buf_paddr, int fixmap_idx)
+{
+	int rc;
+	u32 buf_len;
+
+	rc = ghes_peek_estatus(ghes, fixmap_idx, buf_paddr, &buf_len);
+	if (rc)
+		return rc;
+
+	return __ghes_read_estatus(estatus, *buf_paddr, buf_len, fixmap_idx);
+}
+
 static void ghes_clear_estatus(struct acpi_hest_generic_status *estatus,
 			       phys_addr_t buf_paddr, int fixmap_idx)
 {
-- 
2.17.1


* [PATCH v5 15/20] ACPI / APEI: Only use queued estatus entry during _in_nmi_notify_one()
  2018-06-26 17:00 [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up James Morse
                   ` (13 preceding siblings ...)
  2018-06-26 17:01 ` [PATCH v5 14/20] ACPI / APEI: Split ghes_read_estatus() to read CPER length James Morse
@ 2018-06-26 17:01 ` James Morse
  2018-06-26 17:01 ` [PATCH v5 16/20] ACPI / APEI: Split fixmap pages for arm64 NMI-like notifications James Morse
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: James Morse @ 2018-06-26 17:01 UTC (permalink / raw)
  To: linux-acpi
  Cc: kvmarm, linux-arm-kernel, linux-mm, Borislav Petkov,
	Marc Zyngier, Christoffer Dall, Will Deacon, Catalin Marinas,
	Naoya Horiguchi, Rafael Wysocki, Len Brown, Tony Luck,
	Tyler Baicar, Dongjiu Geng, Xie XiuQi, Punit Agrawal,
	jonathan.zhang, James Morse

Each struct ghes has a worst-case sized buffer for storing the
estatus. If an error is being processed by ghes_proc() in process
context this buffer will be in use. If the error source then triggers
an NMI-like notification, the same buffer will be used by
_in_nmi_notify_one() to stage the estatus data, before
__process_error() copies it into a queued estatus entry.

Merge __process_error()'s work into _in_nmi_notify_one() so that
the queued estatus entry is used from the beginning. Use the new
ghes_peek_estatus() so we know how much memory to allocate from
the ghes_estatus_pool before we read the records.

Reported-by: Borislav Petkov <bp@suse.de>
Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/acpi/apei/ghes.c | 45 ++++++++++++++++++++--------------------
 1 file changed, 22 insertions(+), 23 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 1d59d85b38d2..f196f8797fc1 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -713,40 +713,32 @@ static void ghes_print_queued_estatus(void)
 	}
 }
 
-/* Save estatus for further processing in IRQ context */
-static void __process_error(struct ghes *ghes,
-			    struct acpi_hest_generic_status *ghes_estatus)
+static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
 {
+	int sev, rc = 0;
 	u32 len, node_len;
+	phys_addr_t buf_paddr;
 	struct ghes_estatus_node *estatus_node;
 	struct acpi_hest_generic_status *estatus;
 
-	if (ghes_estatus_cached(ghes_estatus))
-		return;
+	rc = ghes_peek_estatus(ghes, fixmap_idx, &buf_paddr, &len);
+	if (rc)
+		return rc;
 
-	len = cper_estatus_len(ghes_estatus);
 	node_len = GHES_ESTATUS_NODE_LEN(len);
 
 	estatus_node = (void *)gen_pool_alloc(ghes_estatus_pool, node_len);
 	if (!estatus_node)
-		return;
+		return -ENOMEM;
 
 	estatus_node->ghes = ghes;
 	estatus_node->generic = ghes->generic;
 	estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
-	memcpy(estatus, ghes_estatus, len);
-	llist_add(&estatus_node->llnode, &ghes_estatus_llist);
-}
-
-static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
-{
-	int sev;
-	phys_addr_t buf_paddr;
-	struct acpi_hest_generic_status *estatus = ghes->estatus;
 
-	if (ghes_read_estatus(ghes, estatus, &buf_paddr, fixmap_idx)) {
+	if (__ghes_read_estatus(estatus, buf_paddr, len, fixmap_idx)) {
 		ghes_clear_estatus(estatus, buf_paddr, fixmap_idx);
-		return -ENOENT;
+		rc = -ENOENT;
+		goto no_work;
 	}
 
 	sev = ghes_severity(estatus->error_severity);
@@ -755,13 +747,20 @@ static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
 		__ghes_panic(ghes, estatus);
 	}
 
-	if (!buf_paddr)
-		return 0;
-
-	__process_error(ghes, estatus);
 	ghes_clear_estatus(estatus, buf_paddr, fixmap_idx);
 
-	return 0;
+	if (!buf_paddr || ghes_estatus_cached(estatus))
+		goto no_work;
+
+	llist_add(&estatus_node->llnode, &ghes_estatus_llist);
+
+	return rc;
+
+no_work:
+	gen_pool_free(ghes_estatus_pool, (unsigned long)estatus_node,
+			      node_len);
+
+	return rc;
 }
 
 static int ghes_estatus_queue_notified(struct list_head *rcu_list,
-- 
2.17.1


* [PATCH v5 16/20] ACPI / APEI: Split fixmap pages for arm64 NMI-like notifications
  2018-06-26 17:00 [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up James Morse
                   ` (14 preceding siblings ...)
  2018-06-26 17:01 ` [PATCH v5 15/20] ACPI / APEI: Only use queued estatus entry during _in_nmi_notify_one() James Morse
@ 2018-06-26 17:01 ` James Morse
  2018-06-26 17:01 ` [PATCH v5 17/20] firmware: arm_sdei: Add ACPI GHES registration helper James Morse
                   ` (5 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: James Morse @ 2018-06-26 17:01 UTC (permalink / raw)
  To: linux-acpi
  Cc: kvmarm, linux-arm-kernel, linux-mm, Borislav Petkov,
	Marc Zyngier, Christoffer Dall, Will Deacon, Catalin Marinas,
	Naoya Horiguchi, Rafael Wysocki, Len Brown, Tony Luck,
	Tyler Baicar, Dongjiu Geng, Xie XiuQi, Punit Agrawal,
	jonathan.zhang, James Morse

Now that ghes notification helpers provide the fixmap slots and
take the lock themselves, we can support multiple NMI-like
notifications on arm64.

These should be named after their notification method. x86's
NOTIFY_NMI already is; move it to live with the ghes_nmi list.
Change the SEA fixmap entry to be called FIX_APEI_GHES_SEA.

Future patches can add support for FIX_APEI_GHES_SEI and
FIX_APEI_GHES_SDEI_{NORMAL,CRITICAL}.

Signed-off-by: James Morse <james.morse@arm.com>

---
Changes since v3:
 * idx/lock are now in a separate struct.
 * Add to the comment above ghes_fixmap_lock_irq so that it makes more
   sense in isolation.
---
 arch/arm64/include/asm/fixmap.h | 4 +++-
 drivers/acpi/apei/ghes.c        | 6 +++---
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
index ec1e6d6fa14c..c3974517c2cb 100644
--- a/arch/arm64/include/asm/fixmap.h
+++ b/arch/arm64/include/asm/fixmap.h
@@ -55,7 +55,9 @@ enum fixed_addresses {
 #ifdef CONFIG_ACPI_APEI_GHES
 	/* Used for GHES mapping from assorted contexts */
 	FIX_APEI_GHES_IRQ,
-	FIX_APEI_GHES_NMI,
+#ifdef CONFIG_ACPI_APEI_SEA
+	FIX_APEI_GHES_SEA,
+#endif
 #endif /* CONFIG_ACPI_APEI_GHES */
 
 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index f196f8797fc1..15d472048aa8 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -982,7 +982,7 @@ int ghes_notify_sea(void)
 	int rv;
 
 	raw_spin_lock(&ghes_notify_lock_sea);
-	rv = ghes_estatus_queue_notified(&ghes_sea, FIX_APEI_GHES_NMI);
+	rv = ghes_estatus_queue_notified(&ghes_sea, FIX_APEI_GHES_SEA);
 	raw_spin_unlock(&ghes_notify_lock_sea);
 
 	return rv;
@@ -1013,8 +1013,8 @@ static inline void ghes_sea_remove(struct ghes *ghes) { }
 
 #ifdef CONFIG_HAVE_ACPI_APEI_NMI
 /*
- * NMI may be triggered on any CPU, so ghes_in_nmi is used for
- * having only one concurrent reader.
+ * NOTIFY_NMI may be triggered on any CPU, so ghes_in_nmi is
+ * used for having only one concurrent reader.
  */
 static atomic_t ghes_in_nmi = ATOMIC_INIT(0);
 
-- 
2.17.1


* [PATCH v5 17/20] firmware: arm_sdei: Add ACPI GHES registration helper
  2018-06-26 17:00 [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up James Morse
                   ` (15 preceding siblings ...)
  2018-06-26 17:01 ` [PATCH v5 16/20] ACPI / APEI: Split fixmap pages for arm64 NMI-like notifications James Morse
@ 2018-06-26 17:01 ` James Morse
  2018-06-26 17:01 ` [PATCH v5 18/20] ACPI / APEI: Add support for the SDEI GHES Notification type James Morse
                   ` (4 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: James Morse @ 2018-06-26 17:01 UTC (permalink / raw)
  To: linux-acpi
  Cc: kvmarm, linux-arm-kernel, linux-mm, Borislav Petkov,
	Marc Zyngier, Christoffer Dall, Will Deacon, Catalin Marinas,
	Naoya Horiguchi, Rafael Wysocki, Len Brown, Tony Luck,
	Tyler Baicar, Dongjiu Geng, Xie XiuQi, Punit Agrawal,
	jonathan.zhang, James Morse

APEI's Generic Hardware Error Source structures do not describe
whether the SDEI event is shared or private, as this information is
discoverable via the API.

GHES needs to know whether an event is normal or critical to avoid
sharing locks or fixmap entries, but we don't want GHES to have to
know much about the SDEI API.

Add a helper to register the GHES using the appropriate normal or
critical callback.

Signed-off-by: James Morse <james.morse@arm.com>

---
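For illustration only (not part of the patch): a user-space sketch of
choosing the callback from the event's priority. Every name here
(event_priority, event_cb, prio_query, choose_callback(),
fake_get_priority() and the two handlers) is hypothetical; the real
helper queries firmware via sdei_api_event_get_info(). The 0x100
threshold in the fake query is arbitrary.

```c
#include <errno.h>
#include <stdint.h>

enum event_priority { PRIORITY_NORMAL, PRIORITY_CRITICAL };

typedef int (*event_cb)(uint32_t event_num, void *arg);
typedef int (*prio_query)(uint32_t event_num, enum event_priority *out);

/* Fake firmware query: events at or above 0x100 are 'critical'. */
static int fake_get_priority(uint32_t event_num, enum event_priority *out)
{
	*out = event_num >= 0x100 ? PRIORITY_CRITICAL : PRIORITY_NORMAL;
	return 0;
}

/* Example handlers; a real pair would use separate locks and fixmaps. */
static int normal_handler(uint32_t event_num, void *arg)
{
	(void)event_num; (void)arg;
	return 0;
}

static int critical_handler(uint32_t event_num, void *arg)
{
	(void)event_num; (void)arg;
	return 1;
}

/* Reject event 0, ask for the priority, then pick the matching callback. */
static int choose_callback(uint32_t event_num, prio_query query,
			   event_cb normal_cb, event_cb critical_cb,
			   event_cb *chosen)
{
	enum event_priority prio;
	int err;

	if (event_num == 0)
		return -EINVAL;

	err = query(event_num, &prio);
	if (err)
		return err;

	*chosen = prio == PRIORITY_CRITICAL ? critical_cb : normal_cb;
	return 0;
}
```

The caller supplies both callbacks and never needs to know which
priority the event turned out to have.
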
Changes since v4:
 * Moved normal/critical callbacks into the helper, as APEI needs to know.
 * Dropped Punit's Reviewed-by.

Changes since v3:
 * Removed acpi_disabled() checks that aren't necessary after v2's #ifdef
   change.

Changes since v2:
 * Added header file, thanks kbuild-robot!
 * changed ifdef to the GHES version to match the fixmap definition

Changes since v1:
 * ghes->fixmap_idx variable rename
---
 arch/arm64/include/asm/fixmap.h |  4 ++
 drivers/firmware/arm_sdei.c     | 66 +++++++++++++++++++++++++++++++++
 include/linux/arm_sdei.h        |  6 +++
 3 files changed, 76 insertions(+)

diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
index c3974517c2cb..e2b423a5feaf 100644
--- a/arch/arm64/include/asm/fixmap.h
+++ b/arch/arm64/include/asm/fixmap.h
@@ -58,6 +58,10 @@ enum fixed_addresses {
 #ifdef CONFIG_ACPI_APEI_SEA
 	FIX_APEI_GHES_SEA,
 #endif
+#ifdef CONFIG_ARM_SDE_INTERFACE
+	FIX_APEI_GHES_SDEI_NORMAL,
+	FIX_APEI_GHES_SDEI_CRITICAL,
+#endif
 #endif /* CONFIG_ACPI_APEI_GHES */
 
 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
diff --git a/drivers/firmware/arm_sdei.c b/drivers/firmware/arm_sdei.c
index 1ea71640fdc2..c1b6591f2183 100644
--- a/drivers/firmware/arm_sdei.c
+++ b/drivers/firmware/arm_sdei.c
@@ -2,6 +2,7 @@
 // Copyright (C) 2017 Arm Ltd.
 #define pr_fmt(fmt) "sdei: " fmt
 
+#include <acpi/ghes.h>
 #include <linux/acpi.h>
 #include <linux/arm_sdei.h>
 #include <linux/arm-smccc.h>
@@ -32,6 +33,8 @@
 #include <linux/spinlock.h>
 #include <linux/uaccess.h>
 
+#include <asm/fixmap.h>
+
 /*
  * The call to use to reach the firmware.
  */
@@ -887,6 +890,69 @@ static void sdei_smccc_hvc(unsigned long function_id,
 	arm_smccc_hvc(function_id, arg0, arg1, arg2, arg3, arg4, 0, 0, res);
 }
 
+#ifdef CONFIG_ACPI_APEI_GHES
+int sdei_register_ghes(struct ghes *ghes, sdei_event_callback *normal_cb,
+		       sdei_event_callback *critical_cb)
+{
+	int err;
+	u64 result;
+	u32 event_num;
+	sdei_event_callback *cb;
+
+	event_num = ghes->generic->notify.vector;
+	if (event_num == 0) {
+		/*
+		 * Event 0 is reserved by the specification for
+		 * SDEI_EVENT_SIGNAL.
+		 */
+		return -EINVAL;
+	}
+
+	err = sdei_api_event_get_info(event_num, SDEI_EVENT_INFO_EV_PRIORITY,
+				      &result);
+	if (err)
+		return err;
+
+	if (result == SDEI_EVENT_PRIORITY_CRITICAL)
+		cb = critical_cb;
+	else
+		cb = normal_cb;
+
+	err = sdei_event_register(event_num, cb, ghes);
+	if (!err)
+		err = sdei_event_enable(event_num);
+
+	return err;
+}
+
+int sdei_unregister_ghes(struct ghes *ghes)
+{
+	int i;
+	int err;
+	u32 event_num = ghes->generic->notify.vector;
+
+	might_sleep();
+
+	/*
+	 * The event may be running on another CPU. Disable it
+	 * to stop new events, then try to unregister a few times.
+	 */
+	err = sdei_event_disable(event_num);
+	if (err)
+		return err;
+
+	for (i = 0; i < 3; i++) {
+		err = sdei_event_unregister(event_num);
+		if (err != -EINPROGRESS)
+			break;
+
+		schedule();
+	}
+
+	return err;
+}
+#endif /* CONFIG_ACPI_APEI_GHES */
+
 static int sdei_get_conduit(struct platform_device *pdev)
 {
 	const char *method;
diff --git a/include/linux/arm_sdei.h b/include/linux/arm_sdei.h
index 942afbd544b7..393899192906 100644
--- a/include/linux/arm_sdei.h
+++ b/include/linux/arm_sdei.h
@@ -11,6 +11,7 @@ enum sdei_conduit_types {
 	CONDUIT_HVC,
 };
 
+#include <acpi/ghes.h>
 #include <asm/sdei.h>
 
 /* Arch code should override this to set the entry point from firmware... */
@@ -39,6 +40,11 @@ int sdei_event_unregister(u32 event_num);
 int sdei_event_enable(u32 event_num);
 int sdei_event_disable(u32 event_num);
 
+/* GHES register/unregister helpers */
+int sdei_register_ghes(struct ghes *ghes, sdei_event_callback *normal_cb,
+		       sdei_event_callback *critical_cb);
+int sdei_unregister_ghes(struct ghes *ghes);
+
 #ifdef CONFIG_ARM_SDE_INTERFACE
 /* For use by arch code when CPU hotplug notifiers are not appropriate. */
 int sdei_mask_local_cpu(void);
-- 
2.17.1


* [PATCH v5 18/20] ACPI / APEI: Add support for the SDEI GHES Notification type
  2018-06-26 17:00 [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up James Morse
                   ` (16 preceding siblings ...)
  2018-06-26 17:01 ` [PATCH v5 17/20] firmware: arm_sdei: Add ACPI GHES registration helper James Morse
@ 2018-06-26 17:01 ` James Morse
  2018-06-26 17:01 ` [PATCH v5 19/20] mm/memory-failure: increase queued recovery work's priority James Morse
                   ` (3 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: James Morse @ 2018-06-26 17:01 UTC (permalink / raw)
  To: linux-acpi
  Cc: kvmarm, linux-arm-kernel, linux-mm, Borislav Petkov,
	Marc Zyngier, Christoffer Dall, Will Deacon, Catalin Marinas,
	Naoya Horiguchi, Rafael Wysocki, Len Brown, Tony Luck,
	Tyler Baicar, Dongjiu Geng, Xie XiuQi, Punit Agrawal,
	jonathan.zhang, James Morse

If the GHES notification type is SDEI, register the provided event
using the SDEI-GHES helper.

SDEI may be one of two types of event, normal and critical. Critical
events can interrupt normal events, so these must have separate
fixmap slots and locks in case both event types are in use.
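
The need for separate locks can be sketched in user-space C. This is a
simplified model, not the GHES code: the model_* names are hypothetical,
and a C11 atomic_flag try-lock stands in for the raw spinlock. A critical
event arriving while the normal handler holds its lock can only make
progress if it takes a different lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical try-lock built on a C11 atomic flag. */
struct model_lock {
	atomic_flag held;
};

static bool model_trylock(struct model_lock *l)
{
	/* test_and_set returns the previous value: false means we got it. */
	return !atomic_flag_test_and_set(&l->held);
}

static void model_unlock(struct model_lock *l)
{
	atomic_flag_clear(&l->held);
}

/*
 * Model a critical event interrupting the normal handler on the same CPU.
 * Returns true if the nested critical handler could take its lock. With a
 * real spinlock, the 'false' case would spin forever.
 */
static bool model_nested_critical_can_run(struct model_lock *normal,
					  struct model_lock *critical)
{
	bool ok;

	if (!model_trylock(normal))
		return false;

	/* The critical event arrives here, before the normal handler unlocks. */
	ok = model_trylock(critical);
	if (ok)
		model_unlock(critical);

	model_unlock(normal);
	return ok;
}
```

Passing the same lock for both arguments models a shared lock and
returns false; passing two distinct locks returns true.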

Signed-off-by: James Morse <james.morse@arm.com>
---
Changes since v4:
 * We now have two callbacks and separate locks and fixmap entries for
   normal/critical calls.
 * Use the new Kconfig selector for ESTATUS_QUEUE

 arch/arm64/include/asm/fixmap.h |  2 +-
 drivers/acpi/apei/Kconfig       |  5 ++
 drivers/acpi/apei/ghes.c        | 92 ++++++++++++++++++++++++++++++++-
 drivers/firmware/Kconfig        |  1 +
 drivers/firmware/arm_sdei.c     |  4 +-
 include/linux/arm_sdei.h        |  3 ++
 6 files changed, 102 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
index e2b423a5feaf..e875b97032ce 100644
--- a/arch/arm64/include/asm/fixmap.h
+++ b/arch/arm64/include/asm/fixmap.h
@@ -58,7 +58,7 @@ enum fixed_addresses {
 #ifdef CONFIG_ACPI_APEI_SEA
 	FIX_APEI_GHES_SEA,
 #endif
-#ifdef CONFIG_ARM_SDE_INTERFACE
+#ifdef CONFIG_ACPI_APEI_SDEI
 	FIX_APEI_GHES_SDEI_NORMAL,
 	FIX_APEI_GHES_SDEI_CRITICAL,
 #endif
diff --git a/drivers/acpi/apei/Kconfig b/drivers/acpi/apei/Kconfig
index 2b191e09b647..a8d09065e6a9 100644
--- a/drivers/acpi/apei/Kconfig
+++ b/drivers/acpi/apei/Kconfig
@@ -61,6 +61,11 @@ config ACPI_APEI_SEA
 	  option allows the OS to look for such hardware error record, and
 	  take appropriate action.
 
+config ACPI_APEI_SDEI
+	bool
+	depends on ACPI_APEI_GHES && ARM_SDE_INTERFACE
+	select ACPI_APEI_GHES_ESTATUS_QUEUE
+
 config ACPI_APEI_MEMORY_FAILURE
 	bool "APEI memory error recovering support"
 	depends on ACPI_APEI && MEMORY_FAILURE
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 15d472048aa8..b7b335450a6b 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -25,6 +25,7 @@
  * GNU General Public License for more details.
  */
 
+#include <linux/arm_sdei.h>
 #include <linux/kernel.h>
 #include <linux/moduleparam.h>
 #include <linux/init.h>
@@ -763,8 +764,8 @@ static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
 	return rc;
 }
 
-static int ghes_estatus_queue_notified(struct list_head *rcu_list,
-				       int fixmap_idx)
+static int __maybe_unused
+ghes_estatus_queue_notified(struct list_head *rcu_list, int fixmap_idx)
 {
 	int ret = -ENOENT;
 	struct ghes *ghes;
@@ -1069,6 +1070,75 @@ static inline void ghes_nmi_add(struct ghes *ghes) { }
 static inline void ghes_nmi_remove(struct ghes *ghes) { }
 #endif /* CONFIG_HAVE_ACPI_APEI_NMI */
 
+#ifdef CONFIG_ACPI_APEI_SDEI
+static int __ghes_sdei_callback(struct ghes *ghes, int fixmap_idx)
+{
+	if (!_in_nmi_notify_one(ghes, fixmap_idx)) {
+		irq_work_queue(&ghes_proc_irq_work);
+
+		return 0;
+	}
+
+	return -ENOENT;
+}
+
+static DEFINE_RAW_SPINLOCK(ghes_notify_lock_sdei_normal);
+static DEFINE_RAW_SPINLOCK(ghes_notify_lock_sdei_critical);
+
+static int ghes_sdei_normal_callback(u32 event_num, struct pt_regs *regs,
+				      void *arg)
+{
+	int err = -ENOENT;
+	struct ghes *ghes = arg;
+
+	raw_spin_lock(&ghes_notify_lock_sdei_normal);
+	err = __ghes_sdei_callback(ghes, FIX_APEI_GHES_SDEI_NORMAL);
+	raw_spin_unlock(&ghes_notify_lock_sdei_normal);
+
+	return err;
+}
+
+static int ghes_sdei_critical_callback(u32 event_num, struct pt_regs *regs,
+				       void *arg)
+{
+	int err = -ENOENT;
+	struct ghes *ghes = arg;
+
+	raw_spin_lock(&ghes_notify_lock_sdei_critical);
+	err = __ghes_sdei_callback(ghes, FIX_APEI_GHES_SDEI_CRITICAL);
+	raw_spin_unlock(&ghes_notify_lock_sdei_critical);
+
+	return err;
+}
+
+static int apei_sdei_register_ghes(struct ghes *ghes)
+{
+	int err;
+
+	ghes_estatus_queue_grow_pool(ghes);
+
+	err = sdei_register_ghes(ghes, ghes_sdei_normal_callback,
+				 ghes_sdei_critical_callback);
+	if (err)
+		ghes_estatus_queue_shrink_pool(ghes);
+
+	return err;
+}
+
+static int apei_sdei_unregister_ghes(struct ghes *ghes)
+{
+	int err = sdei_unregister_ghes(ghes);
+
+	if (!err)
+		ghes_estatus_queue_shrink_pool(ghes);
+
+	return err;
+}
+#else
+static int apei_sdei_register_ghes(struct ghes *ghes) { return -EINVAL; }
+static int apei_sdei_unregister_ghes(struct ghes *ghes) { return -EINVAL; }
+#endif /* CONFIG_ACPI_APEI_SDEI */
+
 static int ghes_probe(struct platform_device *ghes_dev)
 {
 	struct acpi_hest_generic *generic;
@@ -1103,6 +1173,13 @@ static int ghes_probe(struct platform_device *ghes_dev)
 			goto err;
 		}
 		break;
+	case ACPI_HEST_NOTIFY_SOFTWARE_DELEGATED:
+		if (!IS_ENABLED(CONFIG_ACPI_APEI_SDEI)) {
+			pr_warn(GHES_PFX "Generic hardware error source: %d notified via SDE Interface is not supported!\n",
+				generic->header.source_id);
+			goto err;
+		}
+		break;
 	case ACPI_HEST_NOTIFY_LOCAL:
 		pr_warning(GHES_PFX "Generic hardware error source: %d notified via local interrupt is not supported!\n",
 			   generic->header.source_id);
@@ -1166,6 +1243,11 @@ static int ghes_probe(struct platform_device *ghes_dev)
 	case ACPI_HEST_NOTIFY_NMI:
 		ghes_nmi_add(ghes);
 		break;
+	case ACPI_HEST_NOTIFY_SOFTWARE_DELEGATED:
+		rc = apei_sdei_register_ghes(ghes);
+		if (rc)
+			goto err;
+		break;
 	default:
 		BUG();
 	}
@@ -1189,6 +1271,7 @@ static int ghes_probe(struct platform_device *ghes_dev)
 
 static int ghes_remove(struct platform_device *ghes_dev)
 {
+	int rc;
 	struct ghes *ghes;
 	struct acpi_hest_generic *generic;
 
@@ -1221,6 +1304,11 @@ static int ghes_remove(struct platform_device *ghes_dev)
 	case ACPI_HEST_NOTIFY_NMI:
 		ghes_nmi_remove(ghes);
 		break;
+	case ACPI_HEST_NOTIFY_SOFTWARE_DELEGATED:
+		rc = apei_sdei_unregister_ghes(ghes);
+		if (rc)
+			return rc;
+		break;
 	default:
 		BUG();
 		break;
diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
index 6e83880046d7..69c52c0591da 100644
--- a/drivers/firmware/Kconfig
+++ b/drivers/firmware/Kconfig
@@ -85,6 +85,7 @@ config ARM_SCPI_POWER_DOMAIN
 config ARM_SDE_INTERFACE
 	bool "ARM Software Delegated Exception Interface (SDEI)"
 	depends on ARM64
+	select ACPI_APEI_SDEI if ACPI_APEI_GHES
 	help
 	  The Software Delegated Exception Interface (SDEI) is an ARM
 	  standard for registering callbacks from the platform firmware
diff --git a/drivers/firmware/arm_sdei.c b/drivers/firmware/arm_sdei.c
index c1b6591f2183..97fd8d22bf15 100644
--- a/drivers/firmware/arm_sdei.c
+++ b/drivers/firmware/arm_sdei.c
@@ -890,7 +890,7 @@ static void sdei_smccc_hvc(unsigned long function_id,
 	arm_smccc_hvc(function_id, arg0, arg1, arg2, arg3, arg4, 0, 0, res);
 }
 
-#ifdef CONFIG_ACPI_APEI_GHES
+#ifdef CONFIG_ACPI_APEI_SDEI
 int sdei_register_ghes(struct ghes *ghes, sdei_event_callback *normal_cb,
 		       sdei_event_callback *critical_cb)
 {
@@ -951,7 +951,7 @@ int sdei_unregister_ghes(struct ghes *ghes)
 
 	return err;
 }
-#endif /* CONFIG_ACPI_APEI_GHES */
+#endif /* CONFIG_ACPI_APEI_SDEI */
 
 static int sdei_get_conduit(struct platform_device *pdev)
 {
diff --git a/include/linux/arm_sdei.h b/include/linux/arm_sdei.h
index 393899192906..3305ea7f9dc7 100644
--- a/include/linux/arm_sdei.h
+++ b/include/linux/arm_sdei.h
@@ -12,7 +12,10 @@ enum sdei_conduit_types {
 };
 
 #include <acpi/ghes.h>
+
+#ifdef CONFIG_ARM_SDE_INTERFACE
 #include <asm/sdei.h>
+#endif
 
 /* Arch code should override this to set the entry point from firmware... */
 #ifndef sdei_arch_get_entry_point
-- 
2.17.1


* [PATCH v5 19/20] mm/memory-failure: increase queued recovery work's priority
  2018-06-26 17:00 [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up James Morse
                   ` (17 preceding siblings ...)
  2018-06-26 17:01 ` [PATCH v5 18/20] ACPI / APEI: Add support for the SDEI GHES Notification type James Morse
@ 2018-06-26 17:01 ` James Morse
  2018-06-26 17:01 ` [PATCH v5 20/20] arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work James Morse
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 26+ messages in thread
From: James Morse @ 2018-06-26 17:01 UTC (permalink / raw)
  To: linux-acpi
  Cc: kvmarm, linux-arm-kernel, linux-mm, Borislav Petkov,
	Marc Zyngier, Christoffer Dall, Will Deacon, Catalin Marinas,
	Naoya Horiguchi, Rafael Wysocki, Len Brown, Tony Luck,
	Tyler Baicar, Dongjiu Geng, Xie XiuQi, Punit Agrawal,
	jonathan.zhang, James Morse

arm64 can take an NMI-like error notification when user-space steps in
some corrupt memory. APEI's GHES code will call memory_failure_queue()
to schedule the recovery work. We then return to user-space, possibly
taking the fault again.

Currently the arch code unconditionally signals user-space from this
path, so we don't get stuck in this loop, but the affected process
never benefits from memory_failure()'s recovery work. To fix this we
need to know the recovery work will run before we get back to user-space.

Increase the priority of the recovery work by scheduling it on the
system_highpri_wq, then try to bump the current task off this CPU
so that the recovery work starts immediately.

Reported-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Punit Agrawal <punit.agrawal@arm.com>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
Tested-by: gengdongjiu <gengdongjiu@huawei.com>
CC: Xie XiuQi <xiexiuqi@huawei.com>
CC: gengdongjiu <gengdongjiu@huawei.com>
---
 mm/memory-failure.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 9d142b9b86dc..f0e69d7ac406 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -55,6 +55,7 @@
 #include <linux/hugetlb.h>
 #include <linux/memory_hotplug.h>
 #include <linux/mm_inline.h>
+#include <linux/preempt.h>
 #include <linux/kfifo.h>
 #include <linux/ratelimit.h>
 #include "internal.h"
@@ -1333,6 +1334,7 @@ static DEFINE_PER_CPU(struct memory_failure_cpu, memory_failure_cpu);
  */
 void memory_failure_queue(unsigned long pfn, int flags)
 {
+	int cpu = smp_processor_id();
 	struct memory_failure_cpu *mf_cpu;
 	unsigned long proc_flags;
 	struct memory_failure_entry entry = {
@@ -1342,11 +1344,14 @@ void memory_failure_queue(unsigned long pfn, int flags)
 
 	mf_cpu = &get_cpu_var(memory_failure_cpu);
 	spin_lock_irqsave(&mf_cpu->lock, proc_flags);
-	if (kfifo_put(&mf_cpu->fifo, entry))
-		schedule_work_on(smp_processor_id(), &mf_cpu->work);
-	else
+	if (kfifo_put(&mf_cpu->fifo, entry)) {
+		queue_work_on(cpu, system_highpri_wq, &mf_cpu->work);
+		set_tsk_need_resched(current);
+		preempt_set_need_resched();
+	} else {
 		pr_err("Memory failure: buffer overflow when queuing memory failure at %#lx\n",
 		       pfn);
+	}
 	spin_unlock_irqrestore(&mf_cpu->lock, proc_flags);
 	put_cpu_var(memory_failure_cpu);
 }
-- 
2.17.1


* [PATCH v5 20/20] arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work
  2018-06-26 17:00 [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up James Morse
                   ` (18 preceding siblings ...)
  2018-06-26 17:01 ` [PATCH v5 19/20] mm/memory-failure: increase queued recovery work's priority James Morse
@ 2018-06-26 17:01 ` James Morse
  2018-07-04 14:37 ` [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up Will Deacon
  2018-07-05  9:50 ` Rafael J. Wysocki
  21 siblings, 0 replies; 26+ messages in thread
From: James Morse @ 2018-06-26 17:01 UTC (permalink / raw)
  To: linux-acpi
  Cc: kvmarm, linux-arm-kernel, linux-mm, Borislav Petkov,
	Marc Zyngier, Christoffer Dall, Will Deacon, Catalin Marinas,
	Naoya Horiguchi, Rafael Wysocki, Len Brown, Tony Luck,
	Tyler Baicar, Dongjiu Geng, Xie XiuQi, Punit Agrawal,
	jonathan.zhang, James Morse

APEI is unable to do all of its error handling work in nmi-context, so
it defers non-fatal work onto the irq_work queue. arch_irq_work_raise()
sends an IPI to the calling cpu, but we can't guarantee this will be
taken before we return.

Unless we interrupted a context with irqs-masked, we can call
irq_work_run() to do the work now. Otherwise return -EINPROGRESS to
indicate ghes_notify_sea() found some work to do, but it hasn't
finished yet.

With this we can take apei_claim_sea() returning '0' to mean this
external-abort was also notification of a firmware-first RAS error,
and that APEI has processed the CPER records.
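
The three outcomes can be summarised with a small user-space model. This
is a sketch of the decision only, with hypothetical names: the
notification result and the interrupted context's irq state are passed
in as parameters, and the real irq_work_run() call is reduced to a
comment.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * notify_err: what the notification handler returned (0 if it found
 *             records, a negative errno such as -ENOENT otherwise).
 * interrupted_irqs_masked: whether the interrupted context had irqs masked.
 */
static int model_claim_sea(int notify_err, bool interrupted_irqs_masked)
{
	if (notify_err)
		return notify_err;	/* not a firmware-first notification */

	if (interrupted_irqs_masked)
		return -EINPROGRESS;	/* work is queued but has not run yet */

	/* The real code would run the deferred irq work at this point. */
	return 0;			/* claimed, and records processed */
}
```

Only a return value of 0 lets the caller skip its own signalling;
-EINPROGRESS and -ENOENT both fall through to the existing path.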

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Punit Agrawal <punit.agrawal@arm.com>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
CC: Xie XiuQi <xiexiuqi@huawei.com>
CC: gengdongjiu <gengdongjiu@huawei.com>
---
Changes since v2:
 * Removed IS_ENABLED() check, done by the caller unless we have a dummy
   definition.
---
 arch/arm64/kernel/acpi.c | 19 +++++++++++++++++++
 arch/arm64/mm/fault.c    |  9 ++++-----
 2 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
index df2c6bff8c58..9ef2d91f0000 100644
--- a/arch/arm64/kernel/acpi.c
+++ b/arch/arm64/kernel/acpi.c
@@ -22,6 +22,7 @@
 #include <linux/init.h>
 #include <linux/irq.h>
 #include <linux/irqdomain.h>
+#include <linux/irq_work.h>
 #include <linux/memblock.h>
 #include <linux/of_fdt.h>
 #include <linux/smp.h>
@@ -275,10 +276,14 @@ int apei_claim_sea(struct pt_regs *regs)
 {
 	int err = -ENOENT;
 	unsigned long current_flags = arch_local_save_flags();
+	unsigned long interrupted_flags = current_flags;
 
 	if (!IS_ENABLED(CONFIG_ACPI_APEI_SEA))
 		return err;
 
+	if (regs)
+		interrupted_flags = regs->pstate;
+
 	/*
 	 * SEA can interrupt SError, mask it and describe this as an NMI so
 	 * that APEI defers the handling.
@@ -287,6 +292,20 @@ int apei_claim_sea(struct pt_regs *regs)
 	nmi_enter();
 	err = ghes_notify_sea();
 	nmi_exit();
+
+	/*
+	 * APEI NMI-like notifications are deferred to irq_work. Unless
+	 * we interrupted irqs-masked code, we can do that now.
+	 */
+	if (!err) {
+		if (!arch_irqs_disabled_flags(interrupted_flags)) {
+			local_daif_restore(DAIF_PROCCTX_NOIRQ);
+			irq_work_run();
+		} else {
+			err = -EINPROGRESS;
+		}
+	}
+
 	local_daif_restore(current_flags);
 
 	return err;
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index fb2761172cd4..7e5985559a79 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -630,11 +630,10 @@ static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs)
 
 	inf = esr_to_fault_info(esr);
 
-	/*
-	 * Return value ignored as we rely on signal merging.
-	 * Future patches will make this more robust.
-	 */
-	apei_claim_sea(regs);
+	if (apei_claim_sea(regs) == 0) {
+		/* APEI claimed this as a firmware-first notification */
+		return 0;
+	}
 
 	clear_siginfo(&info);
 	info.si_signo = inf->sig;
-- 
2.17.1


* Re: [PATCH v5 12/20] ACPI / APEI: Don't store CPER records physical address in struct ghes
  2018-06-26 17:01 ` [PATCH v5 12/20] ACPI / APEI: Don't store CPER records physical address in struct ghes James Morse
@ 2018-06-26 20:55   ` kbuild test robot
  2018-06-27  8:40     ` James Morse
  0 siblings, 1 reply; 26+ messages in thread
From: kbuild test robot @ 2018-06-26 20:55 UTC (permalink / raw)
  To: James Morse
  Cc: kbuild-all, linux-acpi, kvmarm, linux-arm-kernel, linux-mm,
	Borislav Petkov, Marc Zyngier, Christoffer Dall, Will Deacon,
	Catalin Marinas, Naoya Horiguchi, Rafael Wysocki, Len Brown,
	Tony Luck, Tyler Baicar, Dongjiu Geng, Xie XiuQi, Punit Agrawal,
	jonathan.zhang

[-- Attachment #1: Type: text/plain, Size: 3098 bytes --]

Hi James,

I love your patch! Yet something to improve:

[auto build test ERROR on pm/linux-next]
[also build test ERROR on v4.18-rc2 next-20180626]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/James-Morse/APEI-in_nmi-rework-and-arm64-SDEI-wire-up/20180627-024229
base:   https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next
config: i386-randconfig-i1-201825 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

   drivers/acpi/apei/ghes.c: In function 'ghes_read_estatus':
>> drivers/acpi/apei/ghes.c:300:17: error: passing argument 1 of 'apei_read' from incompatible pointer type [-Werror=incompatible-pointer-types]
     rc = apei_read(buf_paddr, &g->error_status_address);
                    ^~~~~~~~~
   In file included from drivers/acpi/apei/ghes.c:57:0:
   drivers/acpi/apei/apei-internal.h:80:5: note: expected 'u64 * {aka long long unsigned int *}' but argument is of type 'phys_addr_t * {aka unsigned int *}'
    int apei_read(u64 *val, struct acpi_generic_address *reg);
        ^~~~~~~~~
   cc1: some warnings being treated as errors

vim +/apei_read +300 drivers/acpi/apei/ghes.c

   291	
   292	static int ghes_read_estatus(struct ghes *ghes,
   293				     struct acpi_hest_generic_status *estatus,
   294				     phys_addr_t *buf_paddr, int fixmap_idx)
   295	{
   296		struct acpi_hest_generic *g = ghes->generic;
   297		u32 len;
   298		int rc;
   299	
 > 300		rc = apei_read(buf_paddr, &g->error_status_address);
   301		if (rc) {
   302			if (printk_ratelimit())
   303				pr_warning(FW_WARN GHES_PFX
   304	"Failed to read error status block address for hardware error source: %d.\n",
   305					   g->header.source_id);
   306			return -EIO;
   307		}
   308		if (!*buf_paddr)
   309			return -ENOENT;
   310	
   311		ghes_copy_tofrom_phys(estatus, *buf_paddr,
   312				      sizeof(*estatus), 1, fixmap_idx);
   313		if (!estatus->block_status)
   314			return -ENOENT;
   315	
   316		ghes->flags |= GHES_TO_CLEAR;
   317	
   318		rc = -EIO;
   319		len = cper_estatus_len(estatus);
   320		if (len < sizeof(*estatus))
   321			goto err_read_block;
   322		if (len > ghes->generic->error_block_length)
   323			goto err_read_block;
   324		if (cper_estatus_check_header(estatus))
   325			goto err_read_block;
   326		ghes_copy_tofrom_phys(estatus + 1,
   327				      *buf_paddr + sizeof(*estatus),
   328				      len - sizeof(*estatus), 1, fixmap_idx);
   329		if (cper_estatus_check(estatus))
   330			goto err_read_block;
   331		rc = 0;
   332	
   333	err_read_block:
   334		if (rc && printk_ratelimit())
   335			pr_warning(FW_WARN GHES_PFX
   336				   "Failed to read error status block!\n");
   337		return rc;
   338	}
   339	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 26080 bytes --]


* Re: [PATCH v5 12/20] ACPI / APEI: Don't store CPER records physical address in struct ghes
  2018-06-26 20:55   ` kbuild test robot
@ 2018-06-27  8:40     ` James Morse
  0 siblings, 0 replies; 26+ messages in thread
From: James Morse @ 2018-06-27  8:40 UTC (permalink / raw)
  To: linux-acpi
  Cc: kvmarm, linux-arm-kernel, linux-mm, Borislav Petkov,
	Marc Zyngier, Christoffer Dall, Will Deacon, Catalin Marinas,
	Naoya Horiguchi, Rafael Wysocki, Len Brown, Tony Luck,
	Tyler Baicar, Dongjiu Geng, Xie XiuQi, Punit Agrawal,
	jonathan.zhang

On 26/06/18 21:55, kbuild test robot wrote:
>         # save the attached .config to linux build tree
>         make ARCH=i386 

Gah, guess who forgot about 32bit.


> All errors (new ones prefixed by >>):
> 
>    drivers/acpi/apei/ghes.c: In function 'ghes_read_estatus':
>>> drivers/acpi/apei/ghes.c:300:17: error: passing argument 1 of 'apei_read' from incompatible pointer type [-Werror=incompatible-pointer-types]
>      rc = apei_read(buf_paddr, &g->error_status_address);
>                     ^~~~~~~~~

This takes a u64 pointer even on 32bit systems, because that's the size of the
GAS structure in the spec. (I wonder what it expects you to do if the high bits
are set...)
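
For illustration only, one way to handle the high-bits case is to read
into a 64-bit temporary and range-check before narrowing. This is a
user-space sketch with hypothetical model_* names, and it is not the
approach the fixup below takes (that one simply uses u64 throughout).

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical 32-bit physical address type, as on the i386 randconfig. */
typedef uint32_t model_phys_addr_t;

/* Stand-in for a register read that always produces a 64-bit value. */
static int model_apei_read(uint64_t *val, uint64_t reg_value)
{
	*val = reg_value;
	return 0;
}

/*
 * Read via a 64-bit temporary, then narrow. Passing a pointer to the
 * 32-bit type directly would let the callee write 8 bytes into a 4-byte
 * object, which is why the compiler rejects it.
 */
static int model_read_paddr(model_phys_addr_t *out, uint64_t reg_value)
{
	uint64_t tmp;
	int rc = model_apei_read(&tmp, reg_value);

	if (rc)
		return rc;
	if (tmp > UINT32_MAX)
		return -ERANGE;	/* address not representable in 32 bits */

	*out = (model_phys_addr_t)tmp;
	return 0;
}
```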

I'll fix this locally[0].


Thanks,

James



[0] phys_addr_t is a good thing, let's not use it:
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index b7b335450a6b..930adecd87d4 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -267,7 +267,7 @@ static inline int ghes_severity(int severity)
        }
 }

-static void ghes_copy_tofrom_phys(void *buffer, phys_addr_t paddr, u32 len,
+static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
                                  int from_phys, int fixmap_idx)
 {
        void __iomem *vaddr;
@@ -292,7 +292,7 @@ static void ghes_copy_tofrom_phys(void *buffer, phys_addr_t paddr, u32 len,

 /* read the CPER block returning its address and size */
 static int ghes_peek_estatus(struct ghes *ghes, int fixmap_idx,
-                            phys_addr_t *buf_paddr, u32 *buf_len)
+                            u64 *buf_paddr, u32 *buf_len)
 {
        struct acpi_hest_generic *g = ghes->generic;
        struct acpi_hest_generic_status estatus;

@@ -337,7 +337,7 @@ static int ghes_peek_estatus(struct ghes *ghes, int fixmap_idx,
 }

 static int __ghes_read_estatus(struct acpi_hest_generic_status *estatus,
-                              phys_addr_t buf_paddr, size_t buf_len,
+                              u64 buf_paddr, size_t buf_len,
                               int fixmap_idx)
 {
        ghes_copy_tofrom_phys(estatus, buf_paddr, buf_len, 1, fixmap_idx);
@@ -353,7 +353,7 @@ static int __ghes_read_estatus(struct acpi_hest_generic_status *estatus,

 static int ghes_read_estatus(struct ghes *ghes,
                             struct acpi_hest_generic_status *estatus,
-                            phys_addr_t *buf_paddr, int fixmap_idx)
+                            u64 *buf_paddr, int fixmap_idx)
 {
        int rc;
        u32 buf_len;
@@ -366,7 +366,7 @@ static int ghes_read_estatus(struct ghes *ghes,
 }

 static void ghes_clear_estatus(struct acpi_hest_generic_status *estatus,
-                              phys_addr_t buf_paddr, int fixmap_idx)
+                              u64 buf_paddr, int fixmap_idx)
 {
        estatus->block_status = 0;
        if (buf_paddr)
@@ -716,9 +716,9 @@ static void ghes_print_queued_estatus(void)

 static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
 {
+       u64 buf_paddr;
        int sev, rc = 0;
        u32 len, node_len;
-       phys_addr_t buf_paddr;
        struct ghes_estatus_node *estatus_node;
        struct acpi_hest_generic_status *estatus;

@@ -876,8 +876,8 @@ static int ghes_ack_error(struct acpi_hest_generic_v2 *gv2)
 static int ghes_proc(struct ghes *ghes)
 {
        int rc;
+       u64 buf_paddr;
        unsigned long flags;
-       phys_addr_t buf_paddr;
        struct acpi_hest_generic_status *estatus = ghes->estatus;

        spin_lock_irqsave(&ghes_notify_lock_irq, flags);


* Re: [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up
  2018-06-26 17:00 [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up James Morse
                   ` (19 preceding siblings ...)
  2018-06-26 17:01 ` [PATCH v5 20/20] arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work James Morse
@ 2018-07-04 14:37 ` Will Deacon
  2018-07-05  9:50 ` Rafael J. Wysocki
  21 siblings, 0 replies; 26+ messages in thread
From: Will Deacon @ 2018-07-04 14:37 UTC (permalink / raw)
  To: James Morse
  Cc: linux-acpi, kvmarm, linux-arm-kernel, linux-mm, Borislav Petkov,
	Marc Zyngier, Christoffer Dall, Catalin Marinas, Naoya Horiguchi,
	Rafael Wysocki, Len Brown, Tony Luck, Tyler Baicar, Dongjiu Geng,
	Xie XiuQi, Punit Agrawal, jonathan.zhang

Hi James,

On Tue, Jun 26, 2018 at 06:00:56PM +0100, James Morse wrote:
> The aim of this series is to wire arm64's SDEI into APEI.
> 
> On arm64 we have three APEI notifications that are NMI-like, and
> in the unlikely event that all three are supported by a platform,
> they can interrupt each other.
> The GHES driver shouldn't have to deal with this, so this series aims
> to make it re-entrant.
> 
> To do that, we refactor the estatus queue to allow multiple notifications
> to use it, then convert NOTIFY_SEA to always be described as NMI-like,
> and to use the estatus queue.
> 
> From here we push the locking and fixmap choices out to the notification
> functions, and remove the use of per-ghes estatus and flags. This removes
> the in_nmi() 'timebomb' in ghes_copy_tofrom_phys().
> 
> Things get sticky when an NMI notification needs to know how big the
> CPER records might be, before reading it. This series splits
> ghes_estatus_read() to let us peek at the buffer. A side effect of this
> is the 20byte header will get read twice. (how does it work today? it
> reads the records into a per-ghes worst-case sized buffer, allocates
> the correct size and copies the records. in_nmi() use of this per-ghes
> buffer needs eliminating).
> 
> One alternative was to trust firmware's 'max raw data length' and use
> that to allocate 'enough' memory. We don't use this value today, so it's
> probably wrong on some system somewhere.
> 
> Since v4 patches 5,8-15 are new, otherwise changes are noted in the patch.

The little bits touching arch/arm64/ all look fine to me here, but it looks
like other patches need review separately and ultimately I suspect you're
going to route it via some other tree.

Let me know if you need me to help with anything.

Will


* Re: [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up
  2018-06-26 17:00 [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up James Morse
                   ` (20 preceding siblings ...)
  2018-07-04 14:37 ` [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up Will Deacon
@ 2018-07-05  9:50 ` Rafael J. Wysocki
  2018-07-05 15:42   ` James Morse
  21 siblings, 1 reply; 26+ messages in thread
From: Rafael J. Wysocki @ 2018-07-05  9:50 UTC (permalink / raw)
  To: James Morse, Tony Luck
  Cc: linux-acpi, kvmarm, linux-arm-kernel, linux-mm, Borislav Petkov,
	Marc Zyngier, Christoffer Dall, Will Deacon, Catalin Marinas,
	Naoya Horiguchi, Len Brown, Tyler Baicar, Dongjiu Geng,
	Xie XiuQi, Punit Agrawal, jonathan.zhang

On Tuesday, June 26, 2018 7:00:56 PM CEST James Morse wrote:
> The aim of this series is to wire arm64's SDEI into APEI.
> 
> On arm64 we have three APEI notifications that are NMI-like, and
> in the unlikely event that all three are supported by a platform,
> they can interrupt each other.
> The GHES driver shouldn't have to deal with this, so this series aims
> to make it re-entrant.
> 
> To do that, we refactor the estatus queue to allow multiple notifications
> to use it, then convert NOTIFY_SEA to always be described as NMI-like,
> and to use the estatus queue.
> 
> From here we push the locking and fixmap choices out to the notification
> functions, and remove the use of per-ghes estatus and flags. This removes
> the in_nmi() 'timebomb' in ghes_copy_tofrom_phys().
> 
> Things get sticky when an NMI notification needs to know how big the
> CPER records might be, before reading it. This series splits
> ghes_estatus_read() to let us peek at the buffer. A side effect of this
> is the 20byte header will get read twice. (how does it work today? it
> reads the records into a per-ghes worst-case sized buffer, allocates
> the correct size and copies the records. in_nmi() use of this per-ghes
> buffer needs eliminating).
> 
> One alternative was to trust firmware's 'max raw data length' and use
> that to allocate 'enough' memory. We don't use this value today, so it's
> probably wrong on some system somewhere.
> 
> Since v4 patches 5,8-15 are new, otherwise changes are noted in the patch.
> 
> 
> The earlier boiler-plate:
> 
> What's SDEI? It's ARM's "Software Delegated Exception Interface" [0]. It's
> used by firmware to tell the OS about firmware-first RAS events.
> 
> These Software exceptions can interrupt anything, so I describe them as
> NMI-like. They aren't the only NMI-like way to notify the OS about
> firmware-first RAS events, the ACPI spec also defines 'NOTIFY_SEA' and
> 'NOTIFY_SEI'.
> 
> (Acronyms: SEA, Synchronous External Abort. The CPU requested some memory,
> but the owner of that memory said no. These are always synchronous with the
> instruction that caused them. SEI, System-Error Interrupt, commonly called
> SError. This is an asynchronous external abort, the memory-owner didn't say no
> at the right point. Collectively these things are called external-aborts.
> How is firmware involved? It traps these and re-injects them into the kernel
> once it's written the CPER records).
> 
> APEI's GHES code only expects one source of NMI. If a platform implements
> more than one of these mechanisms, APEI needs to handle the interaction.
> 'SEA' and 'SEI' can interact as 'SEI' is asynchronous. SDEI can interact
> with itself: its exceptions can be 'normal' or 'critical', and firmware
> could use both types for RAS. (errors using normal, 'panic-now' using
> critical).
> 
> 
> ghes.c became clearer to me when I worked out that it has three sets of
> functions with 'estatus' in the name. One is a pool of memory that can be
> allocated-from atomically. This is grown/shrunk when new NMI users are
> allocated.
> The second is the estatus-cache, which holds recent notifications so it
> can suppress notifications we've already handled.
> The last is the estatus-queue, which holds data from NMI-like notifications
> (in pool memory) to be processed from irq_work.
> 
> 
> Testing?
> Tested with the SDEI FVP based software model and a mocked up NOTIFY_SEA using
> KVM. I've added a case where 'corrected errors' are discovered at probe time
> to exercise ghes_probe() during boot. I've only build-tested this on x86.
> 
> This series, based on v4.18-rc2, can be retrieved from:
> git://linux-arm.org/linux-jm.git -b apei_sdei/v5
> 
> 
> Thanks,
> 
> James
> 
> [0] http://infocenter.arm.com/help/topic/com.arm.doc.den0054a/ARM_DEN0054A_Software_Delegated_Exception_Interface.pdf
> 
> James Morse (20):
>   ACPI / APEI: Move the estatus queue code up, and under its own ifdef
>   ACPI / APEI: Generalise the estatus queue's add/remove and notify code
>   ACPI / APEI: don't wait to serialise with oops messages when
>     panic()ing
>   ACPI / APEI: Switch NOTIFY_SEA to use the estatus queue
>   ACPI / APEI: Make estatus queue a Kconfig symbol
>   KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing
>   arm64: KVM/mm: Move SEA handling behind a single 'claim' interface
>   ACPI / APEI: Move locking to the notification helper
>   ACPI / APEI: Let the notification helper specify the fixmap slot
>   ACPI / APEI: preparatory split of ghes->estatus
>   ACPI / APEI: Remove silent flag from ghes_read_estatus()
>   ACPI / APEI: Don't store CPER records physical address in struct ghes
>   ACPI / APEI: Don't update struct ghes' flags in read/clear estatus
>   ACPI / APEI: Split ghes_read_estatus() to read CPER length
>   ACPI / APEI: Only use queued estatus entry during _in_nmi_notify_one()
>   ACPI / APEI: Split fixmap pages for arm64 NMI-like notifications
>   firmware: arm_sdei: Add ACPI GHES registration helper
>   ACPI / APEI: Add support for the SDEI GHES Notification type
>   mm/memory-failure: increase queued recovery work's priority
>   arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work
> 
>  arch/arm/include/asm/kvm_ras.h       |  14 +
>  arch/arm/include/asm/system_misc.h   |   5 -
>  arch/arm64/include/asm/acpi.h        |   4 +
>  arch/arm64/include/asm/daifflags.h   |   1 +
>  arch/arm64/include/asm/fixmap.h      |   8 +-
>  arch/arm64/include/asm/kvm_ras.h     |  25 ++
>  arch/arm64/include/asm/system_misc.h |   2 -
>  arch/arm64/kernel/acpi.c             |  49 ++
>  arch/arm64/mm/fault.c                |  30 +-
>  drivers/acpi/apei/Kconfig            |  11 +
>  drivers/acpi/apei/ghes.c             | 649 ++++++++++++++++-----------
>  drivers/firmware/Kconfig             |   1 +
>  drivers/firmware/arm_sdei.c          |  66 +++
>  include/acpi/ghes.h                  |   2 -
>  include/linux/arm_sdei.h             |   9 +
>  mm/memory-failure.c                  |  11 +-
>  virt/kvm/arm/mmu.c                   |   4 +-
>  17 files changed, 591 insertions(+), 300 deletions(-)
>  create mode 100644 arch/arm/include/asm/kvm_ras.h
>  create mode 100644 arch/arm64/include/asm/kvm_ras.h

Tony, I need your help with reviewing the APEI-related material here.
Can you please have a look at this series and let me know if there are
any concerns regarding it?

Thanks,
Rafael


* Re: [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up
  2018-07-05  9:50 ` Rafael J. Wysocki
@ 2018-07-05 15:42   ` James Morse
  0 siblings, 0 replies; 26+ messages in thread
From: James Morse @ 2018-07-05 15:42 UTC (permalink / raw)
  To: Rafael J. Wysocki, Tony Luck
  Cc: linux-acpi, kvmarm, linux-arm-kernel, linux-mm, Borislav Petkov,
	Marc Zyngier, Christoffer Dall, Will Deacon, Catalin Marinas,
	Naoya Horiguchi, Len Brown, Tyler Baicar, Dongjiu Geng,
	Xie XiuQi, Punit Agrawal, jonathan.zhang

Hi guys,

On 05/07/18 10:50, Rafael J. Wysocki wrote:
> On Tuesday, June 26, 2018 7:00:56 PM CEST James Morse wrote:
>> The aim of this series is to wire arm64's SDEI into APEI.
>>
>> On arm64 we have three APEI notifications that are NMI-like, and
>> in the unlikely event that all three are supported by a platform,
>> they can interrupt each other.
>> The GHES driver shouldn't have to deal with this, so this series aims
>> to make it re-entrant.
>>
>> To do that, we refactor the estatus queue to allow multiple notifications
>> to use it, then convert NOTIFY_SEA to always be described as NMI-like,
>> and to use the estatus queue.
>>
>> From here we push the locking and fixmap choices out to the notification
>> functions, and remove the use of per-ghes estatus and flags. This removes
>> the in_nmi() 'timebomb' in ghes_copy_tofrom_phys().
>>
>> Things get sticky when an NMI notification needs to know how big the
>> CPER records might be, before reading them. This series splits
>> ghes_read_estatus() to let us peek at the buffer. A side effect of this
>> is the 20-byte header will get read twice. (how does it work today? it
>> reads the records into a per-ghes worst-case sized buffer, allocates
>> the correct size and copies the records. in_nmi() use of this per-ghes
>> buffer needs eliminating).
>>
>> One alternative was to trust firmware's 'max raw data length' and use
>> that to allocate 'enough' memory. We don't use this value today, so it's
>> probably wrong on some system somewhere.
>>
>> Since v4, patches 5 and 8-15 are new; otherwise changes are noted in each patch.

> Tony, I need your help with reviewing the APEI-related material here.
> Can you please have a look at this series and let me know if there are
> any concerns regarding it?

Thanks.

I think the only context from earlier versions is where Borislav spotted some
issues with the ghes_proc() call at probe time and NMI-like notifications.

From https://www.spinics.net/lists/arm-kernel/msg653332.html :
| Which means, that this code is not really reentrant and if should be
| fixed to be callable from different contexts, then it should use private
| buffers and be careful about locking.

... the patches for which have bloated this series.


Thanks,

James


Thread overview: 26+ messages
2018-06-26 17:00 [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up James Morse
2018-06-26 17:00 ` [PATCH v5 01/20] ACPI / APEI: Move the estatus queue code up, and under its own ifdef James Morse
2018-06-26 17:00 ` [PATCH v5 02/20] ACPI / APEI: Generalise the estatus queue's add/remove and notify code James Morse
2018-06-26 17:00 ` [PATCH v5 03/20] ACPI / APEI: don't wait to serialise with oops messages when panic()ing James Morse
2018-06-26 17:01 ` [PATCH v5 04/20] ACPI / APEI: Switch NOTIFY_SEA to use the estatus queue James Morse
2018-06-26 17:01 ` [PATCH v5 05/20] ACPI / APEI: Make estatus queue a Kconfig symbol James Morse
2018-06-26 17:01 ` [PATCH v5 06/20] KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing James Morse
2018-06-26 17:01 ` [PATCH v5 07/20] arm64: KVM/mm: Move SEA handling behind a single 'claim' interface James Morse
2018-06-26 17:01 ` [PATCH v5 08/20] ACPI / APEI: Move locking to the notification helper James Morse
2018-06-26 17:01 ` [PATCH v5 09/20] ACPI / APEI: Let the notification helper specify the fixmap slot James Morse
2018-06-26 17:01 ` [PATCH v5 10/20] ACPI / APEI: preparatory split of ghes->estatus James Morse
2018-06-26 17:01 ` [PATCH v5 11/20] ACPI / APEI: Remove silent flag from ghes_read_estatus() James Morse
2018-06-26 17:01 ` [PATCH v5 12/20] ACPI / APEI: Don't store CPER records physical address in struct ghes James Morse
2018-06-26 20:55   ` kbuild test robot
2018-06-27  8:40     ` James Morse
2018-06-26 17:01 ` [PATCH v5 13/20] ACPI / APEI: Don't update struct ghes' flags in read/clear estatus James Morse
2018-06-26 17:01 ` [PATCH v5 14/20] ACPI / APEI: Split ghes_read_estatus() to read CPER length James Morse
2018-06-26 17:01 ` [PATCH v5 15/20] ACPI / APEI: Only use queued estatus entry during _in_nmi_notify_one() James Morse
2018-06-26 17:01 ` [PATCH v5 16/20] ACPI / APEI: Split fixmap pages for arm64 NMI-like notifications James Morse
2018-06-26 17:01 ` [PATCH v5 17/20] firmware: arm_sdei: Add ACPI GHES registration helper James Morse
2018-06-26 17:01 ` [PATCH v5 18/20] ACPI / APEI: Add support for the SDEI GHES Notification type James Morse
2018-06-26 17:01 ` [PATCH v5 19/20] mm/memory-failure: increase queued recovery work's priority James Morse
2018-06-26 17:01 ` [PATCH v5 20/20] arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work James Morse
2018-07-04 14:37 ` [PATCH v5 00/20] APEI in_nmi() rework and arm64 SDEI wire-up Will Deacon
2018-07-05  9:50 ` Rafael J. Wysocki
2018-07-05 15:42   ` James Morse
