* [PATCH v6 00/18] APEI in_nmi() rework
From: James Morse @ 2018-09-21 22:16 UTC
  To: linux-acpi
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, Punit Agrawal,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-mm, Borislav Petkov, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

Hello,

The GHES driver has collected quite a few bugs:

ghes_proc() at ghes_probe() time can be interrupted by an NMI that
will clobber the ghes->estatus fields, flags, and the buffer_paddr.

ghes_copy_tofrom_phys() uses in_nmi() to decide which path to take. arm64's
SEA takes both paths, depending on what it interrupted.

There is no guarantee that queued memory_failure() errors will be processed
before this CPU returns to user-space.

x86 can't TLBI from interrupt-masked code, which this driver does all the
time.


This series aims to fix the first three, with an eye to fixing the
last one with a follow-up series.

Previous postings included the SDEI notification calls, which I haven't
finished re-testing. This series is big enough as it is.


Any NMI-like notification should always be in_nmi(), and should use the
ghes estatus cache to hold the CPER records until they can be processed.

The path through GHES should be nmi-safe, without the need to look at
in_nmi(). Abstract the estatus cache, and re-plumb arm64 to always
nmi_enter() before making the ghes_notify_sea() call.
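
For readers unfamiliar with the estatus queue, the idea above can be
sketched in userspace C11. This is only an illustrative model, not the
driver code: demo_node, demo_push(), demo_del_all(), demo_reverse() and
demo_drain() are hypothetical names that loosely mirror what llist_add(),
llist_del_all() and llist_reverse_order() do for ghes_estatus_llist. The
producer (the NMI-like side) only pushes, without taking a lock; the
consumer (the irq_work side) detaches the whole list and walks it
oldest-first.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for an estatus queue node. */
struct demo_node {
	struct demo_node *next;
	int record_id;		/* stands in for a saved CPER record */
};

/* Stand-in for ghes_estatus_llist: head of a lock-less LIFO list. */
static _Atomic(struct demo_node *) demo_head;

/* Producer side (NMI-like context): push without taking any lock. */
static void demo_push(struct demo_node *node)
{
	struct demo_node *first = atomic_load(&demo_head);

	assert(node != NULL);
	do {
		/* On failure 'first' is refreshed, so re-link and retry. */
		node->next = first;
	} while (!atomic_compare_exchange_weak(&demo_head, &first, node));
}

/* Consumer side (IRQ context): detach the whole list in one exchange. */
static struct demo_node *demo_del_all(void)
{
	return atomic_exchange(&demo_head, (struct demo_node *)NULL);
}

/* The detached list is newest-first; reverse it to get arrival order. */
static struct demo_node *demo_reverse(struct demo_node *head)
{
	struct demo_node *prev = NULL;

	while (head) {
		struct demo_node *next = head->next;

		head->next = prev;
		prev = head;
		head = next;
	}
	return prev;
}

/*
 * Drain the queue, writing up to 'max' record ids in arrival order.
 * Returns how many ids were written; entries beyond 'max' are dropped.
 */
static size_t demo_drain(int *out, size_t max)
{
	struct demo_node *node = demo_reverse(demo_del_all());
	size_t n = 0;

	while (node && n < max) {
		out[n++] = node->record_id;
		node = node->next;
	}
	return n;
}
```

The point of the split is that nothing on the push side can sleep or
spin on a lock, so it is usable from a context that may have interrupted
anything, while all the handlers that aren't nmi-safe run later from the
drain side.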

To remove the use of in_nmi(), the locks are pushed out to the notification
helpers, and the fixmap slot to use is passed in. (A future series could
change as many notification helpers as possible to not mask IRQs, and
pass in some GHES_FIXMAP_NONE that indicates ioremap() should be used.)
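
As a rough illustration of "the fixmap slot to use is passed in", here
is a toy model in plain C. Everything in it is hypothetical (the
demo_slot enum, demo_map() and demo_unmap() are not the driver's API,
and the real fixmap indices are arch-specific). It only shows the shape
of the change: the mapping helper stops asking which context it is in
and instead uses whichever slot its caller names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical slot identifiers, one per notification context. */
enum demo_slot {
	DEMO_SLOT_IRQ,
	DEMO_SLOT_NMI,
	DEMO_SLOT_COUNT,
};

/* One "mapping window" per slot, so the two contexts never share one. */
static bool demo_slot_busy[DEMO_SLOT_COUNT];

/*
 * The helper does not inspect its calling context. The notification
 * helper, which knows its own context and holds its own lock, says
 * which slot to use. Returns false if that slot is already in use.
 */
static bool demo_map(enum demo_slot slot)
{
	assert(slot < DEMO_SLOT_COUNT);
	if (demo_slot_busy[slot])
		return false;
	demo_slot_busy[slot] = true;
	return true;
}

/* Release the slot once the copy is finished. */
static void demo_unmap(enum demo_slot slot)
{
	assert(slot < DEMO_SLOT_COUNT);
	demo_slot_busy[slot] = false;
}
```

With the slot chosen by the caller, an NMI-like notification that
interrupts an IRQ-context one lands in a different slot, rather than
relying on in_nmi() to pick the right one.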

Change the now common _in_nmi_notify_one() to use local estatus/paddr/flags,
instead of clobbering those in the struct ghes.
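
The clobbering problem can be modelled outside the kernel. The sketch
below is an assumption-laden toy, not the driver: demo_src stands in for
struct ghes, and demo_read_shared()/demo_read_local() are made-up names
for the before and after shapes of a read helper. A second read that
"interrupts" the first overwrites shared scratch state, but cannot touch
a buffer owned by the first caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_LEN 8

/* Hypothetical stand-in for struct ghes with a shared scratch buffer. */
struct demo_src {
	char shared[DEMO_LEN];	/* before: every reader writes here */
	const char *hw;		/* what the "firmware" currently reports */
};

/* Bounded copy that always NUL-terminates the destination. */
static void demo_copy(char *dst, const char *src)
{
	size_t n = strlen(src);

	if (n >= DEMO_LEN)
		n = DEMO_LEN - 1;
	memcpy(dst, src, n);
	dst[n] = '\0';
}

/* Before: the result lives in state shared with any interrupter. */
static const char *demo_read_shared(struct demo_src *src)
{
	demo_copy(src->shared, src->hw);
	return src->shared;
}

/* After: the caller owns the buffer, so a nested read is harmless. */
static const char *demo_read_local(const struct demo_src *src, char *buf)
{
	demo_copy(buf, src->hw);
	return buf;
}
```

In the shared version, whoever was part-way through processing its
record ends up looking at the interrupter's data. In the local version
each caller keeps its own copy on its own stack.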

Finally we try to ensure the memory_failure() work will run before this
CPU returns to user-space where the error may be triggered again.


Changes since v5:
 * Fixed phys_addr_t/u64 that failed to build on 32bit x86.
 * Removed buffer/flags from struct ghes, these are now on the stack.

To make future irq/tlbi fixes easier:
 * Moved the locking further out to make it easier to avoid masking interrupts
   for notifications where it isn't needed.
 * Restored map/unmap helpers so they can use ioremap() when interrupts aren't
   masked.


Feedback welcome,

Thanks

James Morse (18):
  ACPI / APEI: Move the estatus queue code up, and under its own ifdef
  ACPI / APEI: Generalise the estatus queue's add/remove and notify code
  ACPI / APEI: don't wait to serialise with oops messages when
    panic()ing
  ACPI / APEI: Switch NOTIFY_SEA to use the estatus queue
  ACPI / APEI: Make estatus queue a Kconfig symbol
  KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing
  arm64: KVM/mm: Move SEA handling behind a single 'claim' interface
  ACPI / APEI: Move locking to the notification helper
  ACPI / APEI: Let the notification helper specify the fixmap slot
  ACPI / APEI: preparatory split of ghes->estatus
  ACPI / APEI: Remove silent flag from ghes_read_estatus()
  ACPI / APEI: Don't store CPER records physical address in struct ghes
  ACPI / APEI: Don't update struct ghes' flags in read/clear estatus
  ACPI / APEI: Split ghes_read_estatus() to read CPER length
  ACPI / APEI: Only use queued estatus entry during _in_nmi_notify_one()
  ACPI / APEI: Split fixmap pages for arm64 NMI-like notifications
  mm/memory-failure: increase queued recovery work's priority
  arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work

 arch/arm/include/asm/kvm_ras.h       |  14 +
 arch/arm/include/asm/system_misc.h   |   5 -
 arch/arm64/include/asm/acpi.h        |   4 +
 arch/arm64/include/asm/daifflags.h   |   1 +
 arch/arm64/include/asm/fixmap.h      |   4 +-
 arch/arm64/include/asm/kvm_ras.h     |  25 ++
 arch/arm64/include/asm/system_misc.h |   2 -
 arch/arm64/kernel/acpi.c             |  48 +++
 arch/arm64/mm/fault.c                |  25 +-
 drivers/acpi/apei/Kconfig            |   6 +
 drivers/acpi/apei/ghes.c             | 564 +++++++++++++++------------
 include/acpi/ghes.h                  |   2 -
 mm/memory-failure.c                  |  11 +-
 virt/kvm/arm/mmu.c                   |   4 +-
 14 files changed, 426 insertions(+), 289 deletions(-)
 create mode 100644 arch/arm/include/asm/kvm_ras.h
 create mode 100644 arch/arm64/include/asm/kvm_ras.h

-- 
2.19.0


* [PATCH v6 01/18] ACPI / APEI: Move the estatus queue code up, and under its own ifdef
From: James Morse @ 2018-09-21 22:16 UTC
  To: linux-acpi
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, Punit Agrawal,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-mm, Borislav Petkov, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

To support asynchronous NMI-like notifications on arm64 we need to use
the estatus-queue. These patches refactor it to allow multiple APEI
notification types to use it.

First we move the estatus-queue code higher in the file so that any
notify_foo() handler can make use of it.

This patch moves code around ... and makes the following trivial change:
Freshen the dated comment above ghes_estatus_llist. printk() is no
longer the issue, it's the helpers like memory_failure_queue() that
still aren't nmi safe.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Punit Agrawal <punit.agrawal@arm.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
---
 drivers/acpi/apei/ghes.c | 265 ++++++++++++++++++++-------------------
 1 file changed, 137 insertions(+), 128 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 02c6fd9caff7..f5732e6b5be8 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -545,6 +545,16 @@ static int ghes_print_estatus(const char *pfx,
 	return 0;
 }
 
+static void __ghes_panic(struct ghes *ghes)
+{
+	__ghes_print_estatus(KERN_EMERG, ghes->generic, ghes->estatus);
+
+	/* reboot to log the error! */
+	if (!panic_timeout)
+		panic_timeout = ghes_panic_timeout;
+	panic("Fatal hardware error!");
+}
+
 /*
  * GHES error status reporting throttle, to report more kinds of
  * errors, instead of just most frequently occurred errors.
@@ -672,6 +682,133 @@ static void ghes_estatus_cache_add(
 	rcu_read_unlock();
 }
 
+#ifdef CONFIG_HAVE_ACPI_APEI_NMI
+/*
+ * Handlers for CPER records may not be NMI safe. For example,
+ * memory_failure_queue() takes spinlocks and calls schedule_work_on().
+ * In any NMI-like handler, memory from ghes_estatus_pool is used to save
+ * estatus, and added to the ghes_estatus_llist. irq_work_queue() causes
+ * ghes_proc_in_irq() to run in IRQ context where each estatus in
+ * ghes_estatus_llist is processed. Each NMI-like error source must grow
+ * the ghes_estatus_pool to ensure memory is available.
+ *
+ * Memory from the ghes_estatus_pool is also used with the ghes_estatus_cache
+ * to suppress frequent messages.
+ */
+static struct llist_head ghes_estatus_llist;
+static struct irq_work ghes_proc_irq_work;
+
+static void ghes_print_queued_estatus(void)
+{
+	struct llist_node *llnode;
+	struct ghes_estatus_node *estatus_node;
+	struct acpi_hest_generic *generic;
+	struct acpi_hest_generic_status *estatus;
+
+	llnode = llist_del_all(&ghes_estatus_llist);
+	/*
+	 * Because the time order of estatus in list is reversed,
+	 * revert it back to proper order.
+	 */
+	llnode = llist_reverse_order(llnode);
+	while (llnode) {
+		estatus_node = llist_entry(llnode, struct ghes_estatus_node,
+					   llnode);
+		estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
+		generic = estatus_node->generic;
+		ghes_print_estatus(NULL, generic, estatus);
+		llnode = llnode->next;
+	}
+}
+
+/* Save estatus for further processing in IRQ context */
+static void __process_error(struct ghes *ghes)
+{
+#ifdef CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG
+	u32 len, node_len;
+	struct ghes_estatus_node *estatus_node;
+	struct acpi_hest_generic_status *estatus;
+
+	if (ghes_estatus_cached(ghes->estatus))
+		return;
+
+	len = cper_estatus_len(ghes->estatus);
+	node_len = GHES_ESTATUS_NODE_LEN(len);
+
+	estatus_node = (void *)gen_pool_alloc(ghes_estatus_pool, node_len);
+	if (!estatus_node)
+		return;
+
+	estatus_node->ghes = ghes;
+	estatus_node->generic = ghes->generic;
+	estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
+	memcpy(estatus, ghes->estatus, len);
+	llist_add(&estatus_node->llnode, &ghes_estatus_llist);
+#endif
+}
+
+static unsigned long ghes_esource_prealloc_size(
+	const struct acpi_hest_generic *generic)
+{
+	unsigned long block_length, prealloc_records, prealloc_size;
+
+	block_length = min_t(unsigned long, generic->error_block_length,
+			     GHES_ESTATUS_MAX_SIZE);
+	prealloc_records = max_t(unsigned long,
+				 generic->records_to_preallocate, 1);
+	prealloc_size = min_t(unsigned long, block_length * prealloc_records,
+			      GHES_ESOURCE_PREALLOC_MAX_SIZE);
+
+	return prealloc_size;
+}
+
+static void ghes_estatus_pool_shrink(unsigned long len)
+{
+	ghes_estatus_pool_size_request -= PAGE_ALIGN(len);
+}
+
+static void ghes_proc_in_irq(struct irq_work *irq_work)
+{
+	struct llist_node *llnode, *next;
+	struct ghes_estatus_node *estatus_node;
+	struct acpi_hest_generic *generic;
+	struct acpi_hest_generic_status *estatus;
+	u32 len, node_len;
+
+	llnode = llist_del_all(&ghes_estatus_llist);
+	/*
+	 * Because the time order of estatus in list is reversed,
+	 * revert it back to proper order.
+	 */
+	llnode = llist_reverse_order(llnode);
+	while (llnode) {
+		next = llnode->next;
+		estatus_node = llist_entry(llnode, struct ghes_estatus_node,
+					   llnode);
+		estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
+		len = cper_estatus_len(estatus);
+		node_len = GHES_ESTATUS_NODE_LEN(len);
+		ghes_do_proc(estatus_node->ghes, estatus);
+		if (!ghes_estatus_cached(estatus)) {
+			generic = estatus_node->generic;
+			if (ghes_print_estatus(NULL, generic, estatus))
+				ghes_estatus_cache_add(generic, estatus);
+		}
+		gen_pool_free(ghes_estatus_pool, (unsigned long)estatus_node,
+			      node_len);
+		llnode = next;
+	}
+}
+
+static void ghes_nmi_init_cxt(void)
+{
+	init_irq_work(&ghes_proc_irq_work, ghes_proc_in_irq);
+}
+
+#else
+static inline void ghes_nmi_init_cxt(void) { }
+#endif /* CONFIG_HAVE_ACPI_APEI_NMI */
+
 static int ghes_ack_error(struct acpi_hest_generic_v2 *gv2)
 {
 	int rc;
@@ -687,16 +824,6 @@ static int ghes_ack_error(struct acpi_hest_generic_v2 *gv2)
 	return apei_write(val, &gv2->read_ack_register);
 }
 
-static void __ghes_panic(struct ghes *ghes)
-{
-	__ghes_print_estatus(KERN_EMERG, ghes->generic, ghes->estatus);
-
-	/* reboot to log the error! */
-	if (!panic_timeout)
-		panic_timeout = ghes_panic_timeout;
-	panic("Fatal hardware error!");
-}
-
 static int ghes_proc(struct ghes *ghes)
 {
 	int rc;
@@ -828,17 +955,6 @@ static inline void ghes_sea_remove(struct ghes *ghes) { }
 #endif /* CONFIG_ACPI_APEI_SEA */
 
 #ifdef CONFIG_HAVE_ACPI_APEI_NMI
-/*
- * printk is not safe in NMI context.  So in NMI handler, we allocate
- * required memory from lock-less memory allocator
- * (ghes_estatus_pool), save estatus into it, put them into lock-less
- * list (ghes_estatus_llist), then delay printk into IRQ context via
- * irq_work (ghes_proc_irq_work).  ghes_estatus_size_request record
- * required pool size by all NMI error source.
- */
-static struct llist_head ghes_estatus_llist;
-static struct irq_work ghes_proc_irq_work;
-
 /*
  * NMI may be triggered on any CPU, so ghes_in_nmi is used for
  * having only one concurrent reader.
@@ -847,88 +963,6 @@ static atomic_t ghes_in_nmi = ATOMIC_INIT(0);
 
 static LIST_HEAD(ghes_nmi);
 
-static void ghes_proc_in_irq(struct irq_work *irq_work)
-{
-	struct llist_node *llnode, *next;
-	struct ghes_estatus_node *estatus_node;
-	struct acpi_hest_generic *generic;
-	struct acpi_hest_generic_status *estatus;
-	u32 len, node_len;
-
-	llnode = llist_del_all(&ghes_estatus_llist);
-	/*
-	 * Because the time order of estatus in list is reversed,
-	 * revert it back to proper order.
-	 */
-	llnode = llist_reverse_order(llnode);
-	while (llnode) {
-		next = llnode->next;
-		estatus_node = llist_entry(llnode, struct ghes_estatus_node,
-					   llnode);
-		estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
-		len = cper_estatus_len(estatus);
-		node_len = GHES_ESTATUS_NODE_LEN(len);
-		ghes_do_proc(estatus_node->ghes, estatus);
-		if (!ghes_estatus_cached(estatus)) {
-			generic = estatus_node->generic;
-			if (ghes_print_estatus(NULL, generic, estatus))
-				ghes_estatus_cache_add(generic, estatus);
-		}
-		gen_pool_free(ghes_estatus_pool, (unsigned long)estatus_node,
-			      node_len);
-		llnode = next;
-	}
-}
-
-static void ghes_print_queued_estatus(void)
-{
-	struct llist_node *llnode;
-	struct ghes_estatus_node *estatus_node;
-	struct acpi_hest_generic *generic;
-	struct acpi_hest_generic_status *estatus;
-
-	llnode = llist_del_all(&ghes_estatus_llist);
-	/*
-	 * Because the time order of estatus in list is reversed,
-	 * revert it back to proper order.
-	 */
-	llnode = llist_reverse_order(llnode);
-	while (llnode) {
-		estatus_node = llist_entry(llnode, struct ghes_estatus_node,
-					   llnode);
-		estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
-		generic = estatus_node->generic;
-		ghes_print_estatus(NULL, generic, estatus);
-		llnode = llnode->next;
-	}
-}
-
-/* Save estatus for further processing in IRQ context */
-static void __process_error(struct ghes *ghes)
-{
-#ifdef CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG
-	u32 len, node_len;
-	struct ghes_estatus_node *estatus_node;
-	struct acpi_hest_generic_status *estatus;
-
-	if (ghes_estatus_cached(ghes->estatus))
-		return;
-
-	len = cper_estatus_len(ghes->estatus);
-	node_len = GHES_ESTATUS_NODE_LEN(len);
-
-	estatus_node = (void *)gen_pool_alloc(ghes_estatus_pool, node_len);
-	if (!estatus_node)
-		return;
-
-	estatus_node->ghes = ghes;
-	estatus_node->generic = ghes->generic;
-	estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
-	memcpy(estatus, ghes->estatus, len);
-	llist_add(&estatus_node->llnode, &ghes_estatus_llist);
-#endif
-}
-
 static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 {
 	struct ghes *ghes;
@@ -967,26 +1001,6 @@ static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 	return ret;
 }
 
-static unsigned long ghes_esource_prealloc_size(
-	const struct acpi_hest_generic *generic)
-{
-	unsigned long block_length, prealloc_records, prealloc_size;
-
-	block_length = min_t(unsigned long, generic->error_block_length,
-			     GHES_ESTATUS_MAX_SIZE);
-	prealloc_records = max_t(unsigned long,
-				 generic->records_to_preallocate, 1);
-	prealloc_size = min_t(unsigned long, block_length * prealloc_records,
-			      GHES_ESOURCE_PREALLOC_MAX_SIZE);
-
-	return prealloc_size;
-}
-
-static void ghes_estatus_pool_shrink(unsigned long len)
-{
-	ghes_estatus_pool_size_request -= PAGE_ALIGN(len);
-}
-
 static void ghes_nmi_add(struct ghes *ghes)
 {
 	unsigned long len;
@@ -1018,14 +1032,9 @@ static void ghes_nmi_remove(struct ghes *ghes)
 	ghes_estatus_pool_shrink(len);
 }
 
-static void ghes_nmi_init_cxt(void)
-{
-	init_irq_work(&ghes_proc_irq_work, ghes_proc_in_irq);
-}
 #else /* CONFIG_HAVE_ACPI_APEI_NMI */
 static inline void ghes_nmi_add(struct ghes *ghes) { }
 static inline void ghes_nmi_remove(struct ghes *ghes) { }
-static inline void ghes_nmi_init_cxt(void) { }
 #endif /* CONFIG_HAVE_ACPI_APEI_NMI */
 
 static int ghes_probe(struct platform_device *ghes_dev)
-- 
2.19.0


-		llnode = llnode->next;
-	}
-}
-
-/* Save estatus for further processing in IRQ context */
-static void __process_error(struct ghes *ghes)
-{
-#ifdef CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG
-	u32 len, node_len;
-	struct ghes_estatus_node *estatus_node;
-	struct acpi_hest_generic_status *estatus;
-
-	if (ghes_estatus_cached(ghes->estatus))
-		return;
-
-	len = cper_estatus_len(ghes->estatus);
-	node_len = GHES_ESTATUS_NODE_LEN(len);
-
-	estatus_node = (void *)gen_pool_alloc(ghes_estatus_pool, node_len);
-	if (!estatus_node)
-		return;
-
-	estatus_node->ghes = ghes;
-	estatus_node->generic = ghes->generic;
-	estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
-	memcpy(estatus, ghes->estatus, len);
-	llist_add(&estatus_node->llnode, &ghes_estatus_llist);
-#endif
-}
-
 static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 {
 	struct ghes *ghes;
@@ -967,26 +1001,6 @@ static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 	return ret;
 }
 
-static unsigned long ghes_esource_prealloc_size(
-	const struct acpi_hest_generic *generic)
-{
-	unsigned long block_length, prealloc_records, prealloc_size;
-
-	block_length = min_t(unsigned long, generic->error_block_length,
-			     GHES_ESTATUS_MAX_SIZE);
-	prealloc_records = max_t(unsigned long,
-				 generic->records_to_preallocate, 1);
-	prealloc_size = min_t(unsigned long, block_length * prealloc_records,
-			      GHES_ESOURCE_PREALLOC_MAX_SIZE);
-
-	return prealloc_size;
-}
-
-static void ghes_estatus_pool_shrink(unsigned long len)
-{
-	ghes_estatus_pool_size_request -= PAGE_ALIGN(len);
-}
-
 static void ghes_nmi_add(struct ghes *ghes)
 {
 	unsigned long len;
@@ -1018,14 +1032,9 @@ static void ghes_nmi_remove(struct ghes *ghes)
 	ghes_estatus_pool_shrink(len);
 }
 
-static void ghes_nmi_init_cxt(void)
-{
-	init_irq_work(&ghes_proc_irq_work, ghes_proc_in_irq);
-}
 #else /* CONFIG_HAVE_ACPI_APEI_NMI */
 static inline void ghes_nmi_add(struct ghes *ghes) { }
 static inline void ghes_nmi_remove(struct ghes *ghes) { }
-static inline void ghes_nmi_init_cxt(void) { }
 #endif /* CONFIG_HAVE_ACPI_APEI_NMI */
 
 static int ghes_probe(struct platform_device *ghes_dev)
-- 
2.19.0


* [PATCH v6 01/18] ACPI / APEI: Move the estatus queue code up, and under its own ifdef
@ 2018-09-21 22:16   ` James Morse
  0 siblings, 0 replies; 123+ messages in thread
From: James Morse @ 2018-09-21 22:16 UTC (permalink / raw)
  To: linux-arm-kernel

To support asynchronous NMI-like notifications on arm64 we need to use
the estatus-queue. These patches refactor it to allow multiple APEI
notification types to use it.

First we move the estatus-queue code higher in the file so that any
notify_foo() handler can make use of it.

This patch moves code around ... and makes the following trivial change:
Freshen the dated comment above ghes_estatus_llist. printk() is no
longer the issue, it's the helpers like memory_failure_queue() that
still aren't nmi safe.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Punit Agrawal <punit.agrawal@arm.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
---
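
For readers unfamiliar with the estatus queue, the pattern it relies on
can be sketched in userspace C: producers push onto a lock-less list
with a compare-and-swap loop, the consumer detaches the whole list in
one step, then reverses it so records are handled in arrival order.
This is only an illustration under assumptions; struct node,
queue_add(), queue_del_all() and queue_reverse() are made-up names that
stand in for the kernel's llist helpers, and C11 atomics replace the
kernel's cmpxchg().

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued estatus node. */
struct node {
	struct node *next;
	int seq;	/* order in which the record was queued */
};

/* List head; plays the role of ghes_estatus_llist in this sketch. */
static _Atomic(struct node *) queue_head;

/* Producer side: push with a compare-and-swap loop, never blocks. */
static void queue_add(struct node *n)
{
	struct node *first = atomic_load(&queue_head);

	do {
		n->next = first;
	} while (!atomic_compare_exchange_weak(&queue_head, &first, n));
}

/* Consumer side: detach the whole list in one atomic exchange. */
static struct node *queue_del_all(void)
{
	return atomic_exchange(&queue_head, (struct node *)NULL);
}

/* The detached list is newest-first; reverse it to get time order. */
static struct node *queue_reverse(struct node *head)
{
	struct node *prev = NULL;

	while (head) {
		struct node *next = head->next;

		head->next = prev;
		prev = head;
		head = next;
	}
	return prev;
}
```

Pushing at the head is what makes the add path safe from an
interrupting context, and it is also why the consumer has to reverse
the list before walking it.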
 drivers/acpi/apei/ghes.c | 265 ++++++++++++++++++++-------------------
 1 file changed, 137 insertions(+), 128 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 02c6fd9caff7..f5732e6b5be8 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -545,6 +545,16 @@ static int ghes_print_estatus(const char *pfx,
 	return 0;
 }
 
+static void __ghes_panic(struct ghes *ghes)
+{
+	__ghes_print_estatus(KERN_EMERG, ghes->generic, ghes->estatus);
+
+	/* reboot to log the error! */
+	if (!panic_timeout)
+		panic_timeout = ghes_panic_timeout;
+	panic("Fatal hardware error!");
+}
+
 /*
  * GHES error status reporting throttle, to report more kinds of
  * errors, instead of just most frequently occurred errors.
@@ -672,6 +682,133 @@ static void ghes_estatus_cache_add(
 	rcu_read_unlock();
 }
 
+#ifdef CONFIG_HAVE_ACPI_APEI_NMI
+/*
+ * Handlers for CPER records may not be NMI safe. For example,
+ * memory_failure_queue() takes spinlocks and calls schedule_work_on().
+ * In any NMI-like handler, memory from ghes_estatus_pool is used to save
+ * estatus, and added to the ghes_estatus_llist. irq_work_queue() causes
+ * ghes_proc_in_irq() to run in IRQ context where each estatus in
+ * ghes_estatus_llist is processed. Each NMI-like error source must grow
+ * the ghes_estatus_pool to ensure memory is available.
+ *
+ * Memory from the ghes_estatus_pool is also used with the ghes_estatus_cache
+ * to suppress frequent messages.
+ */
+static struct llist_head ghes_estatus_llist;
+static struct irq_work ghes_proc_irq_work;
+
+static void ghes_print_queued_estatus(void)
+{
+	struct llist_node *llnode;
+	struct ghes_estatus_node *estatus_node;
+	struct acpi_hest_generic *generic;
+	struct acpi_hest_generic_status *estatus;
+
+	llnode = llist_del_all(&ghes_estatus_llist);
+	/*
+	 * Because the time order of estatus in list is reversed,
+	 * revert it back to proper order.
+	 */
+	llnode = llist_reverse_order(llnode);
+	while (llnode) {
+		estatus_node = llist_entry(llnode, struct ghes_estatus_node,
+					   llnode);
+		estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
+		generic = estatus_node->generic;
+		ghes_print_estatus(NULL, generic, estatus);
+		llnode = llnode->next;
+	}
+}
+
+/* Save estatus for further processing in IRQ context */
+static void __process_error(struct ghes *ghes)
+{
+#ifdef CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG
+	u32 len, node_len;
+	struct ghes_estatus_node *estatus_node;
+	struct acpi_hest_generic_status *estatus;
+
+	if (ghes_estatus_cached(ghes->estatus))
+		return;
+
+	len = cper_estatus_len(ghes->estatus);
+	node_len = GHES_ESTATUS_NODE_LEN(len);
+
+	estatus_node = (void *)gen_pool_alloc(ghes_estatus_pool, node_len);
+	if (!estatus_node)
+		return;
+
+	estatus_node->ghes = ghes;
+	estatus_node->generic = ghes->generic;
+	estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
+	memcpy(estatus, ghes->estatus, len);
+	llist_add(&estatus_node->llnode, &ghes_estatus_llist);
+#endif
+}
+
+static unsigned long ghes_esource_prealloc_size(
+	const struct acpi_hest_generic *generic)
+{
+	unsigned long block_length, prealloc_records, prealloc_size;
+
+	block_length = min_t(unsigned long, generic->error_block_length,
+			     GHES_ESTATUS_MAX_SIZE);
+	prealloc_records = max_t(unsigned long,
+				 generic->records_to_preallocate, 1);
+	prealloc_size = min_t(unsigned long, block_length * prealloc_records,
+			      GHES_ESOURCE_PREALLOC_MAX_SIZE);
+
+	return prealloc_size;
+}
+
+static void ghes_estatus_pool_shrink(unsigned long len)
+{
+	ghes_estatus_pool_size_request -= PAGE_ALIGN(len);
+}
+
+static void ghes_proc_in_irq(struct irq_work *irq_work)
+{
+	struct llist_node *llnode, *next;
+	struct ghes_estatus_node *estatus_node;
+	struct acpi_hest_generic *generic;
+	struct acpi_hest_generic_status *estatus;
+	u32 len, node_len;
+
+	llnode = llist_del_all(&ghes_estatus_llist);
+	/*
+	 * Because the time order of estatus in list is reversed,
+	 * revert it back to proper order.
+	 */
+	llnode = llist_reverse_order(llnode);
+	while (llnode) {
+		next = llnode->next;
+		estatus_node = llist_entry(llnode, struct ghes_estatus_node,
+					   llnode);
+		estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
+		len = cper_estatus_len(estatus);
+		node_len = GHES_ESTATUS_NODE_LEN(len);
+		ghes_do_proc(estatus_node->ghes, estatus);
+		if (!ghes_estatus_cached(estatus)) {
+			generic = estatus_node->generic;
+			if (ghes_print_estatus(NULL, generic, estatus))
+				ghes_estatus_cache_add(generic, estatus);
+		}
+		gen_pool_free(ghes_estatus_pool, (unsigned long)estatus_node,
+			      node_len);
+		llnode = next;
+	}
+}
+
+static void ghes_nmi_init_cxt(void)
+{
+	init_irq_work(&ghes_proc_irq_work, ghes_proc_in_irq);
+}
+
+#else
+static inline void ghes_nmi_init_cxt(void) { }
+#endif /* CONFIG_HAVE_ACPI_APEI_NMI */
+
 static int ghes_ack_error(struct acpi_hest_generic_v2 *gv2)
 {
 	int rc;
@@ -687,16 +824,6 @@ static int ghes_ack_error(struct acpi_hest_generic_v2 *gv2)
 	return apei_write(val, &gv2->read_ack_register);
 }
 
-static void __ghes_panic(struct ghes *ghes)
-{
-	__ghes_print_estatus(KERN_EMERG, ghes->generic, ghes->estatus);
-
-	/* reboot to log the error! */
-	if (!panic_timeout)
-		panic_timeout = ghes_panic_timeout;
-	panic("Fatal hardware error!");
-}
-
 static int ghes_proc(struct ghes *ghes)
 {
 	int rc;
@@ -828,17 +955,6 @@ static inline void ghes_sea_remove(struct ghes *ghes) { }
 #endif /* CONFIG_ACPI_APEI_SEA */
 
 #ifdef CONFIG_HAVE_ACPI_APEI_NMI
-/*
- * printk is not safe in NMI context.  So in NMI handler, we allocate
- * required memory from lock-less memory allocator
- * (ghes_estatus_pool), save estatus into it, put them into lock-less
- * list (ghes_estatus_llist), then delay printk into IRQ context via
- * irq_work (ghes_proc_irq_work).  ghes_estatus_size_request record
- * required pool size by all NMI error source.
- */
-static struct llist_head ghes_estatus_llist;
-static struct irq_work ghes_proc_irq_work;
-
 /*
  * NMI may be triggered on any CPU, so ghes_in_nmi is used for
  * having only one concurrent reader.
@@ -847,88 +963,6 @@ static atomic_t ghes_in_nmi = ATOMIC_INIT(0);
 
 static LIST_HEAD(ghes_nmi);
 
-static void ghes_proc_in_irq(struct irq_work *irq_work)
-{
-	struct llist_node *llnode, *next;
-	struct ghes_estatus_node *estatus_node;
-	struct acpi_hest_generic *generic;
-	struct acpi_hest_generic_status *estatus;
-	u32 len, node_len;
-
-	llnode = llist_del_all(&ghes_estatus_llist);
-	/*
-	 * Because the time order of estatus in list is reversed,
-	 * revert it back to proper order.
-	 */
-	llnode = llist_reverse_order(llnode);
-	while (llnode) {
-		next = llnode->next;
-		estatus_node = llist_entry(llnode, struct ghes_estatus_node,
-					   llnode);
-		estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
-		len = cper_estatus_len(estatus);
-		node_len = GHES_ESTATUS_NODE_LEN(len);
-		ghes_do_proc(estatus_node->ghes, estatus);
-		if (!ghes_estatus_cached(estatus)) {
-			generic = estatus_node->generic;
-			if (ghes_print_estatus(NULL, generic, estatus))
-				ghes_estatus_cache_add(generic, estatus);
-		}
-		gen_pool_free(ghes_estatus_pool, (unsigned long)estatus_node,
-			      node_len);
-		llnode = next;
-	}
-}
-
-static void ghes_print_queued_estatus(void)
-{
-	struct llist_node *llnode;
-	struct ghes_estatus_node *estatus_node;
-	struct acpi_hest_generic *generic;
-	struct acpi_hest_generic_status *estatus;
-
-	llnode = llist_del_all(&ghes_estatus_llist);
-	/*
-	 * Because the time order of estatus in list is reversed,
-	 * revert it back to proper order.
-	 */
-	llnode = llist_reverse_order(llnode);
-	while (llnode) {
-		estatus_node = llist_entry(llnode, struct ghes_estatus_node,
-					   llnode);
-		estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
-		generic = estatus_node->generic;
-		ghes_print_estatus(NULL, generic, estatus);
-		llnode = llnode->next;
-	}
-}
-
-/* Save estatus for further processing in IRQ context */
-static void __process_error(struct ghes *ghes)
-{
-#ifdef CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG
-	u32 len, node_len;
-	struct ghes_estatus_node *estatus_node;
-	struct acpi_hest_generic_status *estatus;
-
-	if (ghes_estatus_cached(ghes->estatus))
-		return;
-
-	len = cper_estatus_len(ghes->estatus);
-	node_len = GHES_ESTATUS_NODE_LEN(len);
-
-	estatus_node = (void *)gen_pool_alloc(ghes_estatus_pool, node_len);
-	if (!estatus_node)
-		return;
-
-	estatus_node->ghes = ghes;
-	estatus_node->generic = ghes->generic;
-	estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
-	memcpy(estatus, ghes->estatus, len);
-	llist_add(&estatus_node->llnode, &ghes_estatus_llist);
-#endif
-}
-
 static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 {
 	struct ghes *ghes;
@@ -967,26 +1001,6 @@ static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 	return ret;
 }
 
-static unsigned long ghes_esource_prealloc_size(
-	const struct acpi_hest_generic *generic)
-{
-	unsigned long block_length, prealloc_records, prealloc_size;
-
-	block_length = min_t(unsigned long, generic->error_block_length,
-			     GHES_ESTATUS_MAX_SIZE);
-	prealloc_records = max_t(unsigned long,
-				 generic->records_to_preallocate, 1);
-	prealloc_size = min_t(unsigned long, block_length * prealloc_records,
-			      GHES_ESOURCE_PREALLOC_MAX_SIZE);
-
-	return prealloc_size;
-}
-
-static void ghes_estatus_pool_shrink(unsigned long len)
-{
-	ghes_estatus_pool_size_request -= PAGE_ALIGN(len);
-}
-
 static void ghes_nmi_add(struct ghes *ghes)
 {
 	unsigned long len;
@@ -1018,14 +1032,9 @@ static void ghes_nmi_remove(struct ghes *ghes)
 	ghes_estatus_pool_shrink(len);
 }
 
-static void ghes_nmi_init_cxt(void)
-{
-	init_irq_work(&ghes_proc_irq_work, ghes_proc_in_irq);
-}
 #else /* CONFIG_HAVE_ACPI_APEI_NMI */
 static inline void ghes_nmi_add(struct ghes *ghes) { }
 static inline void ghes_nmi_remove(struct ghes *ghes) { }
-static inline void ghes_nmi_init_cxt(void) { }
 #endif /* CONFIG_HAVE_ACPI_APEI_NMI */
 
 static int ghes_probe(struct platform_device *ghes_dev)
-- 
2.19.0


* [PATCH v6 02/18] ACPI / APEI: Generalise the estatus queue's add/remove and notify code
  2018-09-21 22:16 ` James Morse
  (?)
@ 2018-09-21 22:16   ` James Morse
  -1 siblings, 0 replies; 123+ messages in thread
From: James Morse @ 2018-09-21 22:16 UTC (permalink / raw)
  To: linux-acpi
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, Punit Agrawal,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-mm, Borislav Petkov, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

Refactor the estatus queue's pool grow/shrink code and notification
routine from NOTIFY_NMI's handlers. This will allow another notification
method to use the estatus queue without duplicating this code.

This patch adds rcu_read_lock()/rcu_read_unlock() around the
list_for_each_entry_rcu() walker. These aren't strictly necessary as
the whole nmi_enter/nmi_exit() window is a spooky RCU read-side
critical section.

The existing ghes_estatus_pool_shrink() is folded into the new
ghes_estatus_queue_shrink_pool() as only the queue uses it.

_in_nmi_notify_one() is separate from the rcu-list walker for a later
caller that doesn't need to walk a list.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Punit Agrawal <punit.agrawal@arm.com>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>

---
Changes since v3:
 * Removed duplicate or redundant paragraphs in commit message.
 * Fixed the style of a zero check
Changes since v1:
 * Tidied up _in_nmi_notify_one().
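
The return convention used by the new helpers may be easier to follow
in isolation. A rough userspace sketch, assuming a made-up struct
source with a 'pending' flag in place of a readable estatus block: the
per-source helper returns -ENOENT when there is nothing to read, and
the walker returns 0 if at least one source had work. notify_one() and
queue_notified() are hypothetical names, not the functions in this
patch.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical error source; 'pending' stands in for a readable estatus. */
struct source {
	int pending;
	int handled;
};

/* Follows the return convention of _in_nmi_notify_one(). */
static int notify_one(struct source *src)
{
	if (!src->pending)
		return -ENOENT;

	src->pending = 0;
	src->handled++;
	return 0;
}

/* Follows ghes_estatus_queue_notified(): 0 if any source had work. */
static int queue_notified(struct source *srcs, size_t nr)
{
	int ret = -ENOENT;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (!notify_one(&srcs[i]))
			ret = 0;
	}
	return ret;
}
```

Keeping the per-source step separate from the walker is what lets a
caller with a single source skip the list walk, and lets each
notification type map the 0/-ENOENT result onto its own return codes.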
---
 drivers/acpi/apei/ghes.c | 100 +++++++++++++++++++++++++--------------
 1 file changed, 65 insertions(+), 35 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index f5732e6b5be8..29d863ff2f87 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -747,6 +747,51 @@ static void __process_error(struct ghes *ghes)
 #endif
 }
 
+static int _in_nmi_notify_one(struct ghes *ghes)
+{
+	int sev;
+
+	if (ghes_read_estatus(ghes, 1)) {
+		ghes_clear_estatus(ghes);
+		return -ENOENT;
+	}
+
+	sev = ghes_severity(ghes->estatus->error_severity);
+	if (sev >= GHES_SEV_PANIC) {
+#ifdef CONFIG_X86
+		oops_begin();
+#endif
+		ghes_print_queued_estatus();
+		__ghes_panic(ghes);
+	}
+
+	if (!(ghes->flags & GHES_TO_CLEAR))
+		return 0;
+
+	__process_error(ghes);
+	ghes_clear_estatus(ghes);
+
+	return 0;
+}
+
+static int ghes_estatus_queue_notified(struct list_head *rcu_list)
+{
+	int ret = -ENOENT;
+	struct ghes *ghes;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(ghes, rcu_list, list) {
+		if (!_in_nmi_notify_one(ghes))
+			ret = 0;
+	}
+	rcu_read_unlock();
+
+	if (IS_ENABLED(CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG) && !ret)
+		irq_work_queue(&ghes_proc_irq_work);
+
+	return ret;
+}
+
 static unsigned long ghes_esource_prealloc_size(
 	const struct acpi_hest_generic *generic)
 {
@@ -762,11 +807,24 @@ static unsigned long ghes_esource_prealloc_size(
 	return prealloc_size;
 }
 
-static void ghes_estatus_pool_shrink(unsigned long len)
+/* After removing a queue user, we can shrink the pool */
+static void ghes_estatus_queue_shrink_pool(struct ghes *ghes)
 {
+	unsigned long len;
+
+	len = ghes_esource_prealloc_size(ghes->generic);
 	ghes_estatus_pool_size_request -= PAGE_ALIGN(len);
 }
 
+/* Before adding a queue user, grow the pool */
+static void ghes_estatus_queue_grow_pool(struct ghes *ghes)
+{
+	unsigned long len;
+
+	len = ghes_esource_prealloc_size(ghes->generic);
+	ghes_estatus_pool_expand(len);
+}
+
 static void ghes_proc_in_irq(struct irq_work *irq_work)
 {
 	struct llist_node *llnode, *next;
@@ -965,48 +1023,22 @@ static LIST_HEAD(ghes_nmi);
 
 static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 {
-	struct ghes *ghes;
-	int sev, ret = NMI_DONE;
+	int ret = NMI_DONE;
 
 	if (!atomic_add_unless(&ghes_in_nmi, 1, 1))
 		return ret;
 
-	list_for_each_entry_rcu(ghes, &ghes_nmi, list) {
-		if (ghes_read_estatus(ghes, 1)) {
-			ghes_clear_estatus(ghes);
-			continue;
-		} else {
-			ret = NMI_HANDLED;
-		}
-
-		sev = ghes_severity(ghes->estatus->error_severity);
-		if (sev >= GHES_SEV_PANIC) {
-			oops_begin();
-			ghes_print_queued_estatus();
-			__ghes_panic(ghes);
-		}
+	if (!ghes_estatus_queue_notified(&ghes_nmi))
+		ret = NMI_HANDLED;
 
-		if (!(ghes->flags & GHES_TO_CLEAR))
-			continue;
-
-		__process_error(ghes);
-		ghes_clear_estatus(ghes);
-	}
-
-#ifdef CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG
-	if (ret == NMI_HANDLED)
-		irq_work_queue(&ghes_proc_irq_work);
-#endif
 	atomic_dec(&ghes_in_nmi);
 	return ret;
 }
 
 static void ghes_nmi_add(struct ghes *ghes)
 {
-	unsigned long len;
+	ghes_estatus_queue_grow_pool(ghes);
 
-	len = ghes_esource_prealloc_size(ghes->generic);
-	ghes_estatus_pool_expand(len);
 	mutex_lock(&ghes_list_mutex);
 	if (list_empty(&ghes_nmi))
 		register_nmi_handler(NMI_LOCAL, ghes_notify_nmi, 0, "ghes");
@@ -1016,8 +1048,6 @@ static void ghes_nmi_add(struct ghes *ghes)
 
 static void ghes_nmi_remove(struct ghes *ghes)
 {
-	unsigned long len;
-
 	mutex_lock(&ghes_list_mutex);
 	list_del_rcu(&ghes->list);
 	if (list_empty(&ghes_nmi))
@@ -1028,8 +1058,8 @@ static void ghes_nmi_remove(struct ghes *ghes)
 	 * freed after NMI handler finishes.
 	 */
 	synchronize_rcu();
-	len = ghes_esource_prealloc_size(ghes->generic);
-	ghes_estatus_pool_shrink(len);
+
+	ghes_estatus_queue_shrink_pool(ghes);
 }
 
 #else /* CONFIG_HAVE_ACPI_APEI_NMI */
-- 
2.19.0


* [PATCH v6 03/18] ACPI / APEI: don't wait to serialise with oops messages when panic()ing
  2018-09-21 22:16 ` James Morse
  (?)
@ 2018-09-21 22:16   ` James Morse
  -1 siblings, 0 replies; 123+ messages in thread
From: James Morse @ 2018-09-21 22:16 UTC (permalink / raw)
  To: linux-acpi
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, Punit Agrawal,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-mm, Borislav Petkov, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

oops_begin() exists to group printk() messages with the oops message
printed by die(). To reach this caller we know that platform firmware
took this error first, then notified the OS via NMI with a 'panic'
severity.

Don't wait for another CPU to release the die-lock before we can
panic(): our only goal is to print this fatal error and panic().

This code is always called in_nmi(), and since 42a0bb3f7138 ("printk/nmi:
generic solution for safe printk in NMI"), it has been safe to call
printk() from this context. Messages are batched in a per-cpu buffer
and printed via irq-work, or a callback from panic().

Link: https://patchwork.kernel.org/patch/10313555/
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/acpi/apei/ghes.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 29d863ff2f87..d7c46236b353 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -33,7 +33,6 @@
 #include <linux/interrupt.h>
 #include <linux/timer.h>
 #include <linux/cper.h>
-#include <linux/kdebug.h>
 #include <linux/platform_device.h>
 #include <linux/mutex.h>
 #include <linux/ratelimit.h>
@@ -758,9 +757,6 @@ static int _in_nmi_notify_one(struct ghes *ghes)
 
 	sev = ghes_severity(ghes->estatus->error_severity);
 	if (sev >= GHES_SEV_PANIC) {
-#ifdef CONFIG_X86
-		oops_begin();
-#endif
 		ghes_print_queued_estatus();
 		__ghes_panic(ghes);
 	}
-- 
2.19.0

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* [PATCH v6 04/18] ACPI / APEI: Switch NOTIFY_SEA to use the estatus queue
  2018-09-21 22:16 ` James Morse
  (?)
@ 2018-09-21 22:16   ` James Morse
  -1 siblings, 0 replies; 123+ messages in thread
From: James Morse @ 2018-09-21 22:16 UTC (permalink / raw)
  To: linux-acpi
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, Punit Agrawal,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-mm, Borislav Petkov, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

Now that the estatus queue can be used by more than one notification
method, we can move notifications that have NMI-like behaviour over to
it, and start abstracting GHES's single in_nmi() path.

Switch NOTIFY_SEA over to use the estatus queue. This makes it behave
in the same way as x86's NOTIFY_NMI.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Punit Agrawal <punit.agrawal@arm.com>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
---
 drivers/acpi/apei/ghes.c | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index d7c46236b353..150fb184c7cb 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -58,6 +58,10 @@
 
 #define GHES_PFX	"GHES: "
 
+#if defined(CONFIG_HAVE_ACPI_APEI_NMI) || defined(CONFIG_ACPI_APEI_SEA)
+#define WANT_NMI_ESTATUS_QUEUE	1
+#endif
+
 #define GHES_ESTATUS_MAX_SIZE		65536
 #define GHES_ESOURCE_PREALLOC_MAX_SIZE	65536
 
@@ -681,7 +685,7 @@ static void ghes_estatus_cache_add(
 	rcu_read_unlock();
 }
 
-#ifdef CONFIG_HAVE_ACPI_APEI_NMI
+#ifdef WANT_NMI_ESTATUS_QUEUE
 /*
  * Handlers for CPER records may not be NMI safe. For example,
  * memory_failure_queue() takes spinlocks and calls schedule_work_on().
@@ -861,7 +865,7 @@ static void ghes_nmi_init_cxt(void)
 
 #else
 static inline void ghes_nmi_init_cxt(void) { }
-#endif /* CONFIG_HAVE_ACPI_APEI_NMI */
+#endif /* WANT_NMI_ESTATUS_QUEUE */
 
 static int ghes_ack_error(struct acpi_hest_generic_v2 *gv2)
 {
@@ -977,20 +981,13 @@ static LIST_HEAD(ghes_sea);
  */
 int ghes_notify_sea(void)
 {
-	struct ghes *ghes;
-	int ret = -ENOENT;
-
-	rcu_read_lock();
-	list_for_each_entry_rcu(ghes, &ghes_sea, list) {
-		if (!ghes_proc(ghes))
-			ret = 0;
-	}
-	rcu_read_unlock();
-	return ret;
+	return ghes_estatus_queue_notified(&ghes_sea);
 }
 
 static void ghes_sea_add(struct ghes *ghes)
 {
+	ghes_estatus_queue_grow_pool(ghes);
+
 	mutex_lock(&ghes_list_mutex);
 	list_add_rcu(&ghes->list, &ghes_sea);
 	mutex_unlock(&ghes_list_mutex);
@@ -1002,6 +999,8 @@ static void ghes_sea_remove(struct ghes *ghes)
 	list_del_rcu(&ghes->list);
 	mutex_unlock(&ghes_list_mutex);
 	synchronize_rcu();
+
+	ghes_estatus_queue_shrink_pool(ghes);
 }
 #else /* CONFIG_ACPI_APEI_SEA */
 static inline void ghes_sea_add(struct ghes *ghes) { }
-- 
2.19.0
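
The queueing that NOTIFY_SEA now shares with NOTIFY_NMI can be modelled in
userspace roughly as below. This is only a sketch under stated assumptions:
the record_* names are hypothetical, C11 atomics stand in for the kernel's
llist helpers, and nothing here is taken from ghes.c. The producer pushes a
node without taking a lock (the part that has to be NMI-safe), and the
consumer later detaches the whole list in one step and reverses it so that
records are handled in arrival order.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for one queued error record. */
struct record_node {
	struct record_node *next;
	int severity;
};

/* Head of the lock-free list; plays the role of ghes_estatus_llist. */
static _Atomic(struct record_node *) record_list;

/* Producer: usable from a context that must not take locks. */
static void record_push(struct record_node *node)
{
	struct record_node *first = atomic_load(&record_list);

	assert(node != NULL);
	do {
		/* On CAS failure 'first' is refreshed, so relink and retry. */
		node->next = first;
	} while (!atomic_compare_exchange_weak(&record_list, &first, node));
}

/* Consumer: detach every queued node with a single atomic exchange. */
static struct record_node *record_take_all(void)
{
	return atomic_exchange(&record_list, (struct record_node *)NULL);
}

/* Nodes come off newest-first; reverse them to get arrival order. */
static struct record_node *record_reverse(struct record_node *head)
{
	struct record_node *prev = NULL;

	while (head) {
		struct record_node *next = head->next;

		head->next = prev;
		prev = head;
		head = next;
	}
	return prev;
}
```

The design point is that the producer only needs a compare-and-exchange,
which is why the queue depends on the architecture providing an NMI-safe
cmpxchg.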

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* [PATCH v6 05/18] ACPI / APEI: Make estatus queue a Kconfig symbol
  2018-09-21 22:16 ` James Morse
  (?)
@ 2018-09-21 22:16   ` James Morse
  -1 siblings, 0 replies; 123+ messages in thread
From: James Morse @ 2018-09-21 22:16 UTC (permalink / raw)
  To: linux-acpi
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, Punit Agrawal,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-mm, Borislav Petkov, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

Now that there are two users of the estatus queue, and there are likely to
be more, make it a Kconfig symbol selected by the appropriate notification.
We can move the ARCH_HAVE_NMI_SAFE_CMPXCHG checks into Kconfig too.

Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/acpi/apei/Kconfig |  6 ++++++
 drivers/acpi/apei/ghes.c  | 12 +++---------
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/acpi/apei/Kconfig b/drivers/acpi/apei/Kconfig
index 52ae5438edeb..2b191e09b647 100644
--- a/drivers/acpi/apei/Kconfig
+++ b/drivers/acpi/apei/Kconfig
@@ -4,6 +4,7 @@ config HAVE_ACPI_APEI
 
 config HAVE_ACPI_APEI_NMI
 	bool
+	select ACPI_APEI_GHES_ESTATUS_QUEUE
 
 config ACPI_APEI
 	bool "ACPI Platform Error Interface (APEI)"
@@ -33,6 +34,10 @@ config ACPI_APEI_GHES
 	  by firmware to produce more valuable hardware error
 	  information for Linux.
 
+config ACPI_APEI_GHES_ESTATUS_QUEUE
+	bool
+	depends on ACPI_APEI_GHES && ARCH_HAVE_NMI_SAFE_CMPXCHG
+
 config ACPI_APEI_PCIEAER
 	bool "APEI PCIe AER logging/recovering support"
 	depends on ACPI_APEI && PCIEAER
@@ -43,6 +48,7 @@ config ACPI_APEI_PCIEAER
 config ACPI_APEI_SEA
 	bool "APEI Synchronous External Abort logging/recovering support"
 	depends on ARM64 && ACPI_APEI_GHES
+	select ACPI_APEI_GHES_ESTATUS_QUEUE
 	default y
 	help
 	  This option should be enabled if the system supports
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 150fb184c7cb..2880547e13b8 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -58,10 +58,6 @@
 
 #define GHES_PFX	"GHES: "
 
-#if defined(CONFIG_HAVE_ACPI_APEI_NMI) || defined(CONFIG_ACPI_APEI_SEA)
-#define WANT_NMI_ESTATUS_QUEUE	1
-#endif
-
 #define GHES_ESTATUS_MAX_SIZE		65536
 #define GHES_ESOURCE_PREALLOC_MAX_SIZE	65536
 
@@ -685,7 +681,7 @@ static void ghes_estatus_cache_add(
 	rcu_read_unlock();
 }
 
-#ifdef WANT_NMI_ESTATUS_QUEUE
+#ifdef CONFIG_ACPI_APEI_GHES_ESTATUS_QUEUE
 /*
  * Handlers for CPER records may not be NMI safe. For example,
  * memory_failure_queue() takes spinlocks and calls schedule_work_on().
@@ -727,7 +723,6 @@ static void ghes_print_queued_estatus(void)
 /* Save estatus for further processing in IRQ context */
 static void __process_error(struct ghes *ghes)
 {
-#ifdef CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG
 	u32 len, node_len;
 	struct ghes_estatus_node *estatus_node;
 	struct acpi_hest_generic_status *estatus;
@@ -747,7 +742,6 @@ static void __process_error(struct ghes *ghes)
 	estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
 	memcpy(estatus, ghes->estatus, len);
 	llist_add(&estatus_node->llnode, &ghes_estatus_llist);
-#endif
 }
 
 static int _in_nmi_notify_one(struct ghes *ghes)
@@ -786,7 +780,7 @@ static int ghes_estatus_queue_notified(struct list_head *rcu_list)
 	}
 	rcu_read_unlock();
 
-	if (IS_ENABLED(CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG) && !ret)
+	if (!ret)
 		irq_work_queue(&ghes_proc_irq_work);
 
 	return ret;
@@ -865,7 +859,7 @@ static void ghes_nmi_init_cxt(void)
 
 #else
 static inline void ghes_nmi_init_cxt(void) { }
-#endif /* WANT_NMI_ESTATUS_QUEUE */
+#endif /* CONFIG_ACPI_APEI_GHES_ESTATUS_QUEUE */
 
 static int ghes_ack_error(struct acpi_hest_generic_v2 *gv2)
 {
-- 
2.19.0
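
The effect of gating on a single symbol can be illustrated with a small
standalone sketch. Everything here is an assumption made for illustration:
DEMO_HAVE_ESTATUS_QUEUE stands in for a Kconfig-generated CONFIG_ symbol
and the demo_* names are hypothetical. It shows one #ifdef selecting between
a real implementation and an inline stub, so that callers need no
conditionals of their own.

```c
#include <assert.h>
#include <errno.h>

/* Assumption: stands in for a Kconfig-generated CONFIG_ symbol. */
#define DEMO_HAVE_ESTATUS_QUEUE 1

#ifdef DEMO_HAVE_ESTATUS_QUEUE
/* Non-zero while a record is waiting to be claimed. */
static int demo_pending;

/* Real implementation: returns 0 if a pending record was claimed. */
static int demo_queue_notified(void)
{
	if (!demo_pending)
		return -ENOENT;
	demo_pending = 0;
	return 0;
}
#else
/* Stub: with the feature compiled out, nothing is ever claimed. */
static inline int demo_queue_notified(void)
{
	return -ENOENT;
}
#endif
```

Because the dependency is expressed once where the symbol is defined, the
function body no longer needs a second, nested check.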

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* [PATCH v6 06/18] KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing
  2018-09-21 22:16 ` James Morse
  (?)
@ 2018-09-21 22:16   ` James Morse
  -1 siblings, 0 replies; 123+ messages in thread
From: James Morse @ 2018-09-21 22:16 UTC (permalink / raw)
  To: linux-acpi
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, Punit Agrawal,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-mm, Borislav Petkov, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

To split up APEI's in_nmi() path, we need any NMI-like callers to always
be in_nmi(). KVM shouldn't have to know about this, so pull the RAS plumbing
out into a header file.

Currently guest synchronous external aborts are claimed as RAS
notifications by handle_guest_sea(), which is hidden in the arch code's
mm/fault.c. 32-bit gets a dummy definition in system_misc.h.

There is going to be more of this in the future if/when we support
the SError-based firmware-first notification mechanism and/or
kernel-first notifications for both synchronous external abort and
SError. Each of these will come with some Kconfig symbols and a
handful of header files.

Create a header file for all this.

This patch gives handle_guest_sea() a 'kvm_' prefix, and moves the
declarations to kvm_ras.h as preparation for a future patch that moves
the ACPI-specific RAS code out of mm/fault.c.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
---
 arch/arm/include/asm/kvm_ras.h       | 14 ++++++++++++++
 arch/arm/include/asm/system_misc.h   |  5 -----
 arch/arm64/include/asm/kvm_ras.h     | 11 +++++++++++
 arch/arm64/include/asm/system_misc.h |  2 --
 arch/arm64/mm/fault.c                |  2 +-
 virt/kvm/arm/mmu.c                   |  4 ++--
 6 files changed, 28 insertions(+), 10 deletions(-)
 create mode 100644 arch/arm/include/asm/kvm_ras.h
 create mode 100644 arch/arm64/include/asm/kvm_ras.h

diff --git a/arch/arm/include/asm/kvm_ras.h b/arch/arm/include/asm/kvm_ras.h
new file mode 100644
index 000000000000..aaff56bf338f
--- /dev/null
+++ b/arch/arm/include/asm/kvm_ras.h
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2018 - Arm Ltd
+
+#ifndef __ARM_KVM_RAS_H__
+#define __ARM_KVM_RAS_H__
+
+#include <linux/types.h>
+
+static inline int kvm_handle_guest_sea(phys_addr_t addr, unsigned int esr)
+{
+	return -1;
+}
+
+#endif /* __ARM_KVM_RAS_H__ */
diff --git a/arch/arm/include/asm/system_misc.h b/arch/arm/include/asm/system_misc.h
index 8e76db83c498..66f6a3ae68d2 100644
--- a/arch/arm/include/asm/system_misc.h
+++ b/arch/arm/include/asm/system_misc.h
@@ -38,11 +38,6 @@ static inline void harden_branch_predictor(void)
 
 extern unsigned int user_debug;
 
-static inline int handle_guest_sea(phys_addr_t addr, unsigned int esr)
-{
-	return -1;
-}
-
 #endif /* !__ASSEMBLY__ */
 
 #endif /* __ASM_ARM_SYSTEM_MISC_H */
diff --git a/arch/arm64/include/asm/kvm_ras.h b/arch/arm64/include/asm/kvm_ras.h
new file mode 100644
index 000000000000..5f72b07b7912
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_ras.h
@@ -0,0 +1,11 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2018 - Arm Ltd
+
+#ifndef __ARM64_KVM_RAS_H__
+#define __ARM64_KVM_RAS_H__
+
+#include <linux/types.h>
+
+int kvm_handle_guest_sea(phys_addr_t addr, unsigned int esr);
+
+#endif /* __ARM64_KVM_RAS_H__ */
diff --git a/arch/arm64/include/asm/system_misc.h b/arch/arm64/include/asm/system_misc.h
index 28893a0b141d..48ded3628a89 100644
--- a/arch/arm64/include/asm/system_misc.h
+++ b/arch/arm64/include/asm/system_misc.h
@@ -45,8 +45,6 @@ extern void __show_regs(struct pt_regs *);
 
 extern void (*arm_pm_restart)(enum reboot_mode reboot_mode, const char *cmd);
 
-int handle_guest_sea(phys_addr_t addr, unsigned int esr);
-
 #endif	/* __ASSEMBLY__ */
 
 #endif	/* __ASM_SYSTEM_MISC_H */
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 50b30ff30de4..1a30d7a8c9bf 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -725,7 +725,7 @@ static const struct fault_info fault_info[] = {
 	{ do_bad,		SIGKILL, SI_KERNEL,	"unknown 63"			},
 };
 
-int handle_guest_sea(phys_addr_t addr, unsigned int esr)
+int kvm_handle_guest_sea(phys_addr_t addr, unsigned int esr)
 {
 	return ghes_notify_sea();
 }
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index ed162a6c57c5..100c8f2d67ac 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -27,10 +27,10 @@
 #include <asm/kvm_arm.h>
 #include <asm/kvm_mmu.h>
 #include <asm/kvm_mmio.h>
+#include <asm/kvm_ras.h>
 #include <asm/kvm_asm.h>
 #include <asm/kvm_emulate.h>
 #include <asm/virt.h>
-#include <asm/system_misc.h>
 
 #include "trace.h"
 
@@ -1699,7 +1699,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		 * For RAS the host kernel may handle this abort.
 		 * There is no need to pass the error into the guest.
 		 */
-		if (!handle_guest_sea(fault_ipa, kvm_vcpu_get_hsr(vcpu)))
+		if (!kvm_handle_guest_sea(fault_ipa, kvm_vcpu_get_hsr(vcpu)))
 			return 1;
 
 		if (unlikely(!is_iabt)) {
-- 
2.19.0

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* [PATCH v6 06/18] KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing
@ 2018-09-21 22:16   ` James Morse
  0 siblings, 0 replies; 123+ messages in thread
From: James Morse @ 2018-09-21 22:16 UTC (permalink / raw)
  To: linux-acpi
  Cc: kvmarm, linux-arm-kernel, linux-mm, Borislav Petkov,
	Marc Zyngier, Christoffer Dall, Will Deacon, Catalin Marinas,
	Naoya Horiguchi, Rafael Wysocki, Len Brown, Tony Luck,
	Tyler Baicar, Dongjiu Geng, Xie XiuQi, Punit Agrawal,
	jonathan.zhang, James Morse

To split up APEI's in_nmi() path, we need any NMI-like callers to always
be in_nmi(). KVM shouldn't have to know about this, so pull the RAS plumbing
out into a header file.

Currently guest synchronous external aborts are claimed as RAS
notifications by handle_guest_sea(), which is hidden in the arch code's
mm/fault.c. 32-bit gets a dummy definition in system_misc.h.

There is going to be more of this in the future if/when we support
the SError-based firmware-first notification mechanism and/or
kernel-first notifications for both synchronous external abort and
SError. Each of these will come with some Kconfig symbols and a
handful of header files.

Create a header file for all this.

This patch gives handle_guest_sea() a 'kvm_' prefix, and moves the
declarations to kvm_ras.h as preparation for a future patch that moves
the ACPI-specific RAS code out of mm/fault.c.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
---
 arch/arm/include/asm/kvm_ras.h       | 14 ++++++++++++++
 arch/arm/include/asm/system_misc.h   |  5 -----
 arch/arm64/include/asm/kvm_ras.h     | 11 +++++++++++
 arch/arm64/include/asm/system_misc.h |  2 --
 arch/arm64/mm/fault.c                |  2 +-
 virt/kvm/arm/mmu.c                   |  4 ++--
 6 files changed, 28 insertions(+), 10 deletions(-)
 create mode 100644 arch/arm/include/asm/kvm_ras.h
 create mode 100644 arch/arm64/include/asm/kvm_ras.h

diff --git a/arch/arm/include/asm/kvm_ras.h b/arch/arm/include/asm/kvm_ras.h
new file mode 100644
index 000000000000..aaff56bf338f
--- /dev/null
+++ b/arch/arm/include/asm/kvm_ras.h
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2018 - Arm Ltd
+
+#ifndef __ARM_KVM_RAS_H__
+#define __ARM_KVM_RAS_H__
+
+#include <linux/types.h>
+
+static inline int kvm_handle_guest_sea(phys_addr_t addr, unsigned int esr)
+{
+	return -1;
+}
+
+#endif /* __ARM_KVM_RAS_H__ */
diff --git a/arch/arm/include/asm/system_misc.h b/arch/arm/include/asm/system_misc.h
index 8e76db83c498..66f6a3ae68d2 100644
--- a/arch/arm/include/asm/system_misc.h
+++ b/arch/arm/include/asm/system_misc.h
@@ -38,11 +38,6 @@ static inline void harden_branch_predictor(void)
 
 extern unsigned int user_debug;
 
-static inline int handle_guest_sea(phys_addr_t addr, unsigned int esr)
-{
-	return -1;
-}
-
 #endif /* !__ASSEMBLY__ */
 
 #endif /* __ASM_ARM_SYSTEM_MISC_H */
diff --git a/arch/arm64/include/asm/kvm_ras.h b/arch/arm64/include/asm/kvm_ras.h
new file mode 100644
index 000000000000..5f72b07b7912
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_ras.h
@@ -0,0 +1,11 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2018 - Arm Ltd
+
+#ifndef __ARM64_KVM_RAS_H__
+#define __ARM64_KVM_RAS_H__
+
+#include <linux/types.h>
+
+int kvm_handle_guest_sea(phys_addr_t addr, unsigned int esr);
+
+#endif /* __ARM64_KVM_RAS_H__ */
diff --git a/arch/arm64/include/asm/system_misc.h b/arch/arm64/include/asm/system_misc.h
index 28893a0b141d..48ded3628a89 100644
--- a/arch/arm64/include/asm/system_misc.h
+++ b/arch/arm64/include/asm/system_misc.h
@@ -45,8 +45,6 @@ extern void __show_regs(struct pt_regs *);
 
 extern void (*arm_pm_restart)(enum reboot_mode reboot_mode, const char *cmd);
 
-int handle_guest_sea(phys_addr_t addr, unsigned int esr);
-
 #endif	/* __ASSEMBLY__ */
 
 #endif	/* __ASM_SYSTEM_MISC_H */
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 50b30ff30de4..1a30d7a8c9bf 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -725,7 +725,7 @@ static const struct fault_info fault_info[] = {
 	{ do_bad,		SIGKILL, SI_KERNEL,	"unknown 63"			},
 };
 
-int handle_guest_sea(phys_addr_t addr, unsigned int esr)
+int kvm_handle_guest_sea(phys_addr_t addr, unsigned int esr)
 {
 	return ghes_notify_sea();
 }
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index ed162a6c57c5..100c8f2d67ac 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -27,10 +27,10 @@
 #include <asm/kvm_arm.h>
 #include <asm/kvm_mmu.h>
 #include <asm/kvm_mmio.h>
+#include <asm/kvm_ras.h>
 #include <asm/kvm_asm.h>
 #include <asm/kvm_emulate.h>
 #include <asm/virt.h>
-#include <asm/system_misc.h>
 
 #include "trace.h"
 
@@ -1699,7 +1699,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		 * For RAS the host kernel may handle this abort.
 		 * There is no need to pass the error into the guest.
 		 */
-		if (!handle_guest_sea(fault_ipa, kvm_vcpu_get_hsr(vcpu)))
+		if (!kvm_handle_guest_sea(fault_ipa, kvm_vcpu_get_hsr(vcpu)))
 			return 1;
 
 		if (unlikely(!is_iabt)) {
-- 
2.19.0

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* [PATCH v6 07/18] arm64: KVM/mm: Move SEA handling behind a single 'claim' interface
  2018-09-21 22:16 ` James Morse
  (?)
@ 2018-09-21 22:16   ` James Morse
  -1 siblings, 0 replies; 123+ messages in thread
From: James Morse @ 2018-09-21 22:16 UTC (permalink / raw)
  To: linux-acpi
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, Punit Agrawal,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-mm, Borislav Petkov, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

To split up APEI's in_nmi() path, we need the nmi-like callers to always
be in_nmi(). Add a helper to do the work and claim the notification.

When KVM or the arch code takes an exception that might be a RAS
notification, it asks the APEI firmware-first code whether it wants
to claim the exception. We can then go on to see if (a future)
kernel-first mechanism wants to claim the notification, before
falling through to the existing default behaviour.

The NOTIFY_SEA code was merged before we had multiple, possibly
interacting, NMI-like notifications and the need to consider kernel
first in the future. Make the 'claiming' behaviour explicit.
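
A rough userspace sketch of the 'claiming' order described above: each
potential owner is asked in turn, the first to return 0 has claimed the
exception, and -ENOENT means the caller falls through to its default
behaviour. The claim_fn type, claim_exception() and the two stand-in
claimants are assumptions made up for this sketch, not kernel
interfaces:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical claimant: returns 0 if it handled the exception. */
typedef int (*claim_fn)(void *ctx);

/* Ask each claimant in priority order; the first to return 0 wins. */
static int claim_exception(const claim_fn *claimants, size_t n, void *ctx)
{
	size_t i;

	assert(claimants != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (claimants[i] && claimants[i](ctx) == 0)
			return 0;
	}

	/* Nobody claimed it: the caller takes its default path. */
	return -ENOENT;
}

/* Stand-ins for illustration only. */
static int model_firmware_first(void *ctx)
{
	(void)ctx;
	return 0;
}

static int model_not_mine(void *ctx)
{
	(void)ctx;
	return -ENOENT;
}
```

In this patch there is only one claimant (firmware-first); the list
form is just to show where a future kernel-first mechanism could slot
in.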

As we're restructuring the APEI code to allow multiple NMI-like
notifications, any notification that might interrupt interrupts-masked
code must always be wrapped in nmi_enter()/nmi_exit(). This allows APEI
to use in_nmi() to select the right fixmap entries.

We mask SError over this window to prevent an asynchronous RAS error
arriving and tripping 'nmi_enter()'s BUG_ON(in_nmi()).
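
The ordering matters here, so a small userspace model may help: save
the current mask state, mask IRQ and SError, enter the NMI-like
section, notify, then unwind in reverse. Everything prefixed model_ or
MODEL_ is a stand-in invented for this sketch. The bit values are
arbitrary and the assert() only mimics the BUG_ON(in_nmi()) mentioned
above:

```c
#include <assert.h>
#include <errno.h>

/* Userspace stand-ins; none of these are the kernel's definitions. */
#define MODEL_I_BIT	0x1u	/* IRQ masked */
#define MODEL_A_BIT	0x2u	/* SError masked */
#define MODEL_ERRCTX	(MODEL_I_BIT | MODEL_A_BIT)

static unsigned int model_daif;	/* current mask state */
static int model_nmi_depth;	/* non-zero while "in NMI" */

static void model_nmi_enter(void)
{
	assert(model_nmi_depth == 0);	/* mimics BUG_ON(in_nmi()) */
	model_nmi_depth++;
}

static void model_nmi_exit(void)
{
	assert(model_nmi_depth == 1);
	model_nmi_depth--;
}

typedef int (*model_notify_fn)(void);

/* Mask, enter, notify, exit, restore: the window is closed in reverse. */
static int model_claim_sea(model_notify_fn notify)
{
	unsigned int saved = model_daif;
	int err;

	model_daif = MODEL_ERRCTX;
	model_nmi_enter();
	err = notify();
	model_nmi_exit();
	model_daif = saved;

	return err;
}

/* Claims only when called NMI-like with IRQ and SError both masked. */
static int model_notify_checks_context(void)
{
	if (model_nmi_depth == 1 && model_daif == MODEL_ERRCTX)
		return 0;
	return -ENOENT;
}
```

Because SError is masked before model_nmi_enter(), nothing asynchronous
can re-enter the window in this model, and the caller's mask state is
restored whatever it was on entry.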

Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>

---
Why does apei_claim_sea() take a pt_regs? This gets used later to take
APEI by the hand through NMI->IRQ context, depending on what we
interrupted.

Changes since v4:
 * Made irqs-unmasked comment a lockdep assert.

Changes since v3:
 * Removed spurious whitespace change
 * Updated comment in acpi.c to cover SError masking

Changes since v2:
 * Added dummy definition for !ACPI and culled IS_ENABLED() checks.

squash: make 'call with irqs unmasked' a lockdep assert, much better
---
 arch/arm64/include/asm/acpi.h      |  4 ++++
 arch/arm64/include/asm/daifflags.h |  1 +
 arch/arm64/include/asm/kvm_ras.h   | 16 +++++++++++++++-
 arch/arm64/kernel/acpi.c           | 29 +++++++++++++++++++++++++++++
 arch/arm64/mm/fault.c              | 24 +++++-------------------
 5 files changed, 54 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
index 709208dfdc8b..f722d2d6bf2b 100644
--- a/arch/arm64/include/asm/acpi.h
+++ b/arch/arm64/include/asm/acpi.h
@@ -18,6 +18,7 @@
 
 #include <asm/cputype.h>
 #include <asm/io.h>
+#include <asm/ptrace.h>
 #include <asm/smp_plat.h>
 #include <asm/tlbflush.h>
 
@@ -139,6 +140,9 @@ static inline pgprot_t arch_apei_get_mem_attribute(phys_addr_t addr)
 {
 	return __acpi_get_mem_attribute(addr);
 }
+int apei_claim_sea(struct pt_regs *regs);
+#else
+static inline int apei_claim_sea(struct pt_regs *regs) { return -ENOENT; }
 #endif /* CONFIG_ACPI_APEI */
 
 #ifdef CONFIG_ACPI_NUMA
diff --git a/arch/arm64/include/asm/daifflags.h b/arch/arm64/include/asm/daifflags.h
index 22e4c83de5a5..cbd753855bf3 100644
--- a/arch/arm64/include/asm/daifflags.h
+++ b/arch/arm64/include/asm/daifflags.h
@@ -20,6 +20,7 @@
 
 #define DAIF_PROCCTX		0
 #define DAIF_PROCCTX_NOIRQ	PSR_I_BIT
+#define DAIF_ERRCTX		(PSR_I_BIT | PSR_A_BIT)
 
 /* mask/save/unmask/restore all exceptions, including interrupts. */
 static inline void local_daif_mask(void)
diff --git a/arch/arm64/include/asm/kvm_ras.h b/arch/arm64/include/asm/kvm_ras.h
index 5f72b07b7912..5b56e7e297b1 100644
--- a/arch/arm64/include/asm/kvm_ras.h
+++ b/arch/arm64/include/asm/kvm_ras.h
@@ -4,8 +4,22 @@
 #ifndef __ARM64_KVM_RAS_H__
 #define __ARM64_KVM_RAS_H__
 
+#include <linux/acpi.h>
+#include <linux/errno.h>
 #include <linux/types.h>
 
-int kvm_handle_guest_sea(phys_addr_t addr, unsigned int esr);
+#include <asm/acpi.h>
+
+/*
+ * Was this synchronous external abort a RAS notification?
+ * Returns '0' for errors handled by some RAS subsystem, or -ENOENT.
+ */
+static inline int kvm_handle_guest_sea(phys_addr_t addr, unsigned int esr)
+{
+	/* apei_claim_sea(NULL) expects to mask interrupts itself */
+	lockdep_assert_irqs_enabled();
+
+	return apei_claim_sea(NULL);
+}
 
 #endif /* __ARM64_KVM_RAS_H__ */
diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
index ed46dc188b22..a9b8bba014b5 100644
--- a/arch/arm64/kernel/acpi.c
+++ b/arch/arm64/kernel/acpi.c
@@ -28,8 +28,10 @@
 #include <linux/smp.h>
 #include <linux/serial_core.h>
 
+#include <acpi/ghes.h>
 #include <asm/cputype.h>
 #include <asm/cpu_ops.h>
+#include <asm/daifflags.h>
 #include <asm/pgtable.h>
 #include <asm/smp_plat.h>
 
@@ -257,3 +259,30 @@ pgprot_t __acpi_get_mem_attribute(phys_addr_t addr)
 		return __pgprot(PROT_NORMAL_NC);
 	return __pgprot(PROT_DEVICE_nGnRnE);
 }
+
+/*
+ * Claim Synchronous External Aborts as a firmware first notification.
+ *
+ * Used by KVM and the arch do_sea handler.
+ * @regs may be NULL when called from process context.
+ */
+int apei_claim_sea(struct pt_regs *regs)
+{
+	int err = -ENOENT;
+	unsigned long current_flags = arch_local_save_flags();
+
+	if (!IS_ENABLED(CONFIG_ACPI_APEI_SEA))
+		return err;
+
+	/*
+	 * SEA can interrupt SError, mask it and describe this as an NMI so
+	 * that APEI defers the handling.
+	 */
+	local_daif_restore(DAIF_ERRCTX);
+	nmi_enter();
+	err = ghes_notify_sea();
+	nmi_exit();
+	local_daif_restore(current_flags);
+
+	return err;
+}
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 1a30d7a8c9bf..2c38776bb71f 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -18,6 +18,7 @@
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <linux/acpi.h>
 #include <linux/extable.h>
 #include <linux/signal.h>
 #include <linux/mm.h>
@@ -33,6 +34,7 @@
 #include <linux/preempt.h>
 #include <linux/hugetlb.h>
 
+#include <asm/acpi.h>
 #include <asm/bug.h>
 #include <asm/cmpxchg.h>
 #include <asm/cpufeature.h>
@@ -45,8 +47,6 @@
 #include <asm/tlbflush.h>
 #include <asm/traps.h>
 
-#include <acpi/ghes.h>
-
 struct fault_info {
 	int	(*fn)(unsigned long addr, unsigned int esr,
 		      struct pt_regs *regs);
@@ -631,19 +631,10 @@ static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs)
 	inf = esr_to_fault_info(esr);
 
 	/*
-	 * Synchronous aborts may interrupt code which had interrupts masked.
-	 * Before calling out into the wider kernel tell the interested
-	 * subsystems.
+	 * Return value ignored as we rely on signal merging.
+	 * Future patches will make this more robust.
 	 */
-	if (IS_ENABLED(CONFIG_ACPI_APEI_SEA)) {
-		if (interrupts_enabled(regs))
-			nmi_enter();
-
-		ghes_notify_sea();
-
-		if (interrupts_enabled(regs))
-			nmi_exit();
-	}
+	apei_claim_sea(regs);
 
 	clear_siginfo(&info);
 	info.si_signo = inf->sig;
@@ -725,11 +716,6 @@ static const struct fault_info fault_info[] = {
 	{ do_bad,		SIGKILL, SI_KERNEL,	"unknown 63"			},
 };
 
-int kvm_handle_guest_sea(phys_addr_t addr, unsigned int esr)
-{
-	return ghes_notify_sea();
-}
-
 asmlinkage void __exception do_mem_abort(unsigned long addr, unsigned int esr,
 					 struct pt_regs *regs)
 {
-- 
2.19.0

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* [PATCH v6 08/18] ACPI / APEI: Move locking to the notification helper
  2018-09-21 22:16 ` James Morse
  (?)
@ 2018-09-21 22:16   ` James Morse
  -1 siblings, 0 replies; 123+ messages in thread
From: James Morse @ 2018-09-21 22:16 UTC (permalink / raw)
  To: linux-acpi
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, Punit Agrawal,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-mm, Borislav Petkov, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

ghes_copy_tofrom_phys() takes different locks depending on in_nmi().
This doesn't work when we have multiple NMI-like notifications that
can interrupt each other.

Now that NOTIFY_SEA is always called as an NMI, move the lock-taking
to the notification helper. The helper will always know which lock
to take. This avoids ghes_copy_tofrom_phys() taking a guess based
on in_nmi().

This splits NOTIFY_NMI and NOTIFY_SEA to use different locks. All
the other notifications use ghes_proc(), and are called in process
or IRQ context. Move the spin_lock_irqsave() around their ghes_proc()
calls.
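
To make the intent concrete, here is a simplified userspace model of
"the caller takes the lock, the copy helper does not guess". The
model_lock type, the two lock instances and the slot_lock parameter are
assumptions for this sketch only. The real code uses spinlocks and does
not pass a lock pointer to ghes_copy_tofrom_phys():

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy lock: just enough state to check who holds what. */
struct model_lock {
	bool held;
};

static void model_lock_acquire(struct model_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void model_lock_release(struct model_lock *l)
{
	assert(l->held);
	l->held = false;
}

static struct model_lock model_lock_irq;	/* non-NMI notifications */
static struct model_lock model_lock_sea;	/* one NMI-like notification */

/* The copy helper takes no lock; it relies on the caller holding one. */
static void model_copy(void *dst, const void *src, size_t len,
		       const struct model_lock *slot_lock)
{
	assert(slot_lock->held);
	memcpy(dst, src, len);
}

/* Each notification helper knows which lock belongs to it. */
static void model_notify_irq(void *dst, const void *src, size_t len)
{
	model_lock_acquire(&model_lock_irq);
	model_copy(dst, src, len, &model_lock_irq);
	model_lock_release(&model_lock_irq);
}

static void model_notify_sea(void *dst, const void *src, size_t len)
{
	model_lock_acquire(&model_lock_sea);
	model_copy(dst, src, len, &model_lock_sea);
	model_lock_release(&model_lock_sea);
}
```

Since the two helpers never share a lock, one interrupting the other
cannot deadlock on it in this model, which is the property the split
is after.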

Signed-off-by: James Morse <james.morse@arm.com>
---
Changes since v5:
 * Moved locking further out, to allow no-irq-masking in the future.


 drivers/acpi/apei/ghes.c | 40 +++++++++++++++++++++++++++++-----------
 1 file changed, 29 insertions(+), 11 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 2880547e13b8..ed8669a6c100 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -113,12 +113,13 @@ static DEFINE_MUTEX(ghes_list_mutex);
  * from BIOS to Linux can be determined only in NMI, IRQ or timer
  * handler, but general ioremap can not be used in atomic context, so
  * the fixmap is used instead.
- *
- * These 2 spinlocks are used to prevent the fixmap entries from being used
- * simultaneously.
  */
-static DEFINE_RAW_SPINLOCK(ghes_ioremap_lock_nmi);
-static DEFINE_SPINLOCK(ghes_ioremap_lock_irq);
+
+/*
+ * Used by ghes_proc() to prevent non-NMI notifications from interacting.
+ * This also protects the FIX_APEI_GHES_IRQ fixmap slot.
+ */
+static DEFINE_SPINLOCK(ghes_notify_lock_irq);
 
 static struct gen_pool *ghes_estatus_pool;
 static unsigned long ghes_estatus_pool_size_request;
@@ -291,7 +292,6 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
 				  int from_phys)
 {
 	void __iomem *vaddr;
-	unsigned long flags = 0;
 	int in_nmi = in_nmi();
 	u64 offset;
 	u32 trunk;
@@ -299,10 +299,8 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
 	while (len > 0) {
 		offset = paddr - (paddr & PAGE_MASK);
 		if (in_nmi) {
-			raw_spin_lock(&ghes_ioremap_lock_nmi);
 			vaddr = ghes_ioremap_pfn_nmi(paddr >> PAGE_SHIFT);
 		} else {
-			spin_lock_irqsave(&ghes_ioremap_lock_irq, flags);
 			vaddr = ghes_ioremap_pfn_irq(paddr >> PAGE_SHIFT);
 		}
 		trunk = PAGE_SIZE - offset;
@@ -316,10 +314,8 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
 		buffer += trunk;
 		if (in_nmi) {
 			ghes_iounmap_nmi();
-			raw_spin_unlock(&ghes_ioremap_lock_nmi);
 		} else {
 			ghes_iounmap_irq();
-			spin_unlock_irqrestore(&ghes_ioremap_lock_irq, flags);
 		}
 	}
 }
@@ -928,8 +924,11 @@ static void ghes_add_timer(struct ghes *ghes)
 static void ghes_poll_func(struct timer_list *t)
 {
 	struct ghes *ghes = from_timer(ghes, t, timer);
+	unsigned long flags;
 
+	spin_lock_irqsave(&ghes_notify_lock_irq, flags);
 	ghes_proc(ghes);
+	spin_unlock_irqrestore(&ghes_notify_lock_irq, flags);
 	if (!(ghes->flags & GHES_EXITING))
 		ghes_add_timer(ghes);
 }
@@ -937,9 +936,12 @@ static void ghes_poll_func(struct timer_list *t)
 static irqreturn_t ghes_irq_func(int irq, void *data)
 {
 	struct ghes *ghes = data;
+	unsigned long flags;
 	int rc;
 
+	spin_lock_irqsave(&ghes_notify_lock_irq, flags);
 	rc = ghes_proc(ghes);
+	spin_unlock_irqrestore(&ghes_notify_lock_irq, flags);
 	if (rc)
 		return IRQ_NONE;
 
@@ -950,14 +952,17 @@ static int ghes_notify_hed(struct notifier_block *this, unsigned long event,
 			   void *data)
 {
 	struct ghes *ghes;
+	unsigned long flags;
 	int ret = NOTIFY_DONE;
 
+	spin_lock_irqsave(&ghes_notify_lock_irq, flags);
 	rcu_read_lock();
 	list_for_each_entry_rcu(ghes, &ghes_hed, list) {
 		if (!ghes_proc(ghes))
 			ret = NOTIFY_OK;
 	}
 	rcu_read_unlock();
+	spin_unlock_irqrestore(&ghes_notify_lock_irq, flags);
 
 	return ret;
 }
@@ -968,6 +973,7 @@ static struct notifier_block ghes_notifier_hed = {
 
 #ifdef CONFIG_ACPI_APEI_SEA
 static LIST_HEAD(ghes_sea);
+static DEFINE_RAW_SPINLOCK(ghes_notify_lock_sea);
 
 /*
  * Return 0 only if one of the SEA error sources successfully reported an error
@@ -975,7 +981,13 @@ static LIST_HEAD(ghes_sea);
  */
 int ghes_notify_sea(void)
 {
-	return ghes_estatus_queue_notified(&ghes_sea);
+	int rv;
+
+	raw_spin_lock(&ghes_notify_lock_sea);
+	rv = ghes_estatus_queue_notified(&ghes_sea);
+	raw_spin_unlock(&ghes_notify_lock_sea);
+
+	return rv;
 }
 
 static void ghes_sea_add(struct ghes *ghes)
@@ -1009,6 +1021,7 @@ static inline void ghes_sea_remove(struct ghes *ghes) { }
 static atomic_t ghes_in_nmi = ATOMIC_INIT(0);
 
 static LIST_HEAD(ghes_nmi);
+static DEFINE_RAW_SPINLOCK(ghes_notify_lock_nmi);
 
 static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 {
@@ -1017,8 +1030,10 @@ static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 	if (!atomic_add_unless(&ghes_in_nmi, 1, 1))
 		return ret;
 
+	raw_spin_lock(&ghes_notify_lock_nmi);
 	if (!ghes_estatus_queue_notified(&ghes_nmi))
 		ret = NMI_HANDLED;
+	raw_spin_unlock(&ghes_notify_lock_nmi);
 
 	atomic_dec(&ghes_in_nmi);
 	return ret;
@@ -1060,6 +1075,7 @@ static int ghes_probe(struct platform_device *ghes_dev)
 {
 	struct acpi_hest_generic *generic;
 	struct ghes *ghes = NULL;
+	unsigned long flags;
 
 	int rc = -EINVAL;
 
@@ -1162,7 +1178,9 @@ static int ghes_probe(struct platform_device *ghes_dev)
 	ghes_edac_register(ghes, &ghes_dev->dev);
 
 	/* Handle any pending errors right away */
+	spin_lock_irqsave(&ghes_notify_lock_irq, flags);
 	ghes_proc(ghes);
+	spin_unlock_irqrestore(&ghes_notify_lock_irq, flags);
 
 	return 0;
 
-- 
2.19.0

* [PATCH v6 09/18] ACPI / APEI: Let the notification helper specify the fixmap slot
  2018-09-21 22:16 ` James Morse
  (?)
@ 2018-09-21 22:16   ` James Morse
  -1 siblings, 0 replies; 123+ messages in thread
From: James Morse @ 2018-09-21 22:16 UTC (permalink / raw)
  To: linux-acpi
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, Punit Agrawal,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-mm, Borislav Petkov, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

ghes_copy_tofrom_phys() uses a different fixmap slot depending on in_nmi().
This doesn't work when multiple NMI-like notifications can interrupt
each other.

As with the locking, move the choice of fixmap_idx to the notification helper.
This only matters for NMI-like notifications: anything calling
ghes_proc() can use the IRQ fixmap slot, as it is already holding an irqsave
spinlock.

This lets us collapse the ghes_ioremap_pfn_*() helpers.

Signed-off-by: James Morse <james.morse@arm.com>
---

The fixmap-idx and vaddr are passed back to ghes_unmap()
to allow ioremap() to be used in process context in the
future.
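
For readers following along without a kernel tree, the shape of the change
can be shown with a small userspace sketch. It is an illustration under
stated assumptions, not the driver code: all sketch_* names are
hypothetical, a static byte array plays the role of physical memory, and a
per-slot pointer plays the role of a fixmap entry. The caller passes the
slot in, the copy proceeds one page-sized chunk at a time, and the unmap
step checks that the address it is handed belongs to that slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical slot indices standing in for the fixmap entries. */
enum sketch_slot { SKETCH_SLOT_IRQ, SKETCH_SLOT_NMI, SKETCH_SLOT_COUNT };

static uint8_t sketch_phys[4 * SKETCH_PAGE_SIZE];	/* fake physical memory */
static uint8_t *sketch_window[SKETCH_SLOT_COUNT];	/* current mapping per slot */
static unsigned int sketch_maps[SKETCH_SLOT_COUNT];	/* map count, for the example */

/* Map one page through the slot chosen by the caller. */
static uint8_t *sketch_map(uint64_t pfn, enum sketch_slot slot)
{
	assert((unsigned int)slot < SKETCH_SLOT_COUNT);
	assert(sketch_window[slot] == NULL);	/* slot must not already be in use */
	sketch_window[slot] = sketch_phys + pfn * SKETCH_PAGE_SIZE;
	sketch_maps[slot]++;
	return sketch_window[slot];
}

/* Unmap, checking that vaddr really is what this slot currently maps. */
static void sketch_unmap(enum sketch_slot slot, uint8_t *vaddr)
{
	assert(sketch_window[slot] == vaddr);
	sketch_window[slot] = NULL;
}

/* Copy to or from "physical" memory, never crossing a page per mapping. */
static void sketch_copy_tofrom_phys(void *buffer, uint64_t paddr, uint32_t len,
				    int from_phys, enum sketch_slot slot)
{
	uint8_t *buf = buffer;

	while (len > 0) {
		uint32_t offset = (uint32_t)(paddr % SKETCH_PAGE_SIZE);
		uint8_t *vaddr = sketch_map(paddr / SKETCH_PAGE_SIZE, slot);
		uint32_t trunk = SKETCH_PAGE_SIZE - offset;

		if (trunk > len)
			trunk = len;
		if (from_phys)
			memcpy(buf, vaddr + offset, trunk);
		else
			memcpy(vaddr + offset, buf, trunk);
		len -= trunk;
		paddr += trunk;
		buf += trunk;
		sketch_unmap(slot, vaddr);
	}
}
```

A copy that straddles a page boundary maps the slot twice, once per page,
which is why the slot has to stay reserved for the caller for the whole
duration of the copy.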
---
 drivers/acpi/apei/ghes.c | 76 ++++++++++++++--------------------------
 1 file changed, 27 insertions(+), 49 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index ed8669a6c100..adf7fd402813 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -41,6 +41,7 @@
 #include <linux/llist.h>
 #include <linux/genalloc.h>
 #include <linux/pci.h>
+#include <linux/pfn.h>
 #include <linux/aer.h>
 #include <linux/nmi.h>
 #include <linux/sched/clock.h>
@@ -129,38 +130,24 @@ static atomic_t ghes_estatus_cache_alloced;
 
 static int ghes_panic_timeout __read_mostly = 30;
 
-static void __iomem *ghes_ioremap_pfn_nmi(u64 pfn)
+static void __iomem *ghes_map(u64 pfn, int fixmap_idx)
 {
 	phys_addr_t paddr;
 	pgprot_t prot;
 
-	paddr = pfn << PAGE_SHIFT;
+	paddr = PFN_PHYS(pfn);
 	prot = arch_apei_get_mem_attribute(paddr);
-	__set_fixmap(FIX_APEI_GHES_NMI, paddr, prot);
+	__set_fixmap(fixmap_idx, paddr, prot);
 
-	return (void __iomem *) fix_to_virt(FIX_APEI_GHES_NMI);
+	return (void __iomem *) __fix_to_virt(fixmap_idx);
 }
 
-static void __iomem *ghes_ioremap_pfn_irq(u64 pfn)
+static void ghes_unmap(int fixmap_idx, void __iomem *vaddr)
 {
-	phys_addr_t paddr;
-	pgprot_t prot;
-
-	paddr = pfn << PAGE_SHIFT;
-	prot = arch_apei_get_mem_attribute(paddr);
-	__set_fixmap(FIX_APEI_GHES_IRQ, paddr, prot);
-
-	return (void __iomem *) fix_to_virt(FIX_APEI_GHES_IRQ);
-}
-
-static void ghes_iounmap_nmi(void)
-{
-	clear_fixmap(FIX_APEI_GHES_NMI);
-}
+	int _idx = virt_to_fix((unsigned long)vaddr);
 
-static void ghes_iounmap_irq(void)
-{
-	clear_fixmap(FIX_APEI_GHES_IRQ);
+	WARN_ON_ONCE(fixmap_idx != _idx);
+	clear_fixmap(fixmap_idx);
 }
 
 static int ghes_estatus_pool_init(void)
@@ -289,20 +276,15 @@ static inline int ghes_severity(int severity)
 }
 
 static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
-				  int from_phys)
+				  int from_phys, int fixmap_idx)
 {
 	void __iomem *vaddr;
-	int in_nmi = in_nmi();
 	u64 offset;
 	u32 trunk;
 
 	while (len > 0) {
 		offset = paddr - (paddr & PAGE_MASK);
-		if (in_nmi) {
-			vaddr = ghes_ioremap_pfn_nmi(paddr >> PAGE_SHIFT);
-		} else {
-			vaddr = ghes_ioremap_pfn_irq(paddr >> PAGE_SHIFT);
-		}
+		vaddr = ghes_map(PHYS_PFN(paddr), fixmap_idx);
 		trunk = PAGE_SIZE - offset;
 		trunk = min(trunk, len);
 		if (from_phys)
@@ -312,15 +294,11 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
 		len -= trunk;
 		paddr += trunk;
 		buffer += trunk;
-		if (in_nmi) {
-			ghes_iounmap_nmi();
-		} else {
-			ghes_iounmap_irq();
-		}
+		ghes_unmap(fixmap_idx, vaddr);
 	}
 }
 
-static int ghes_read_estatus(struct ghes *ghes, int silent)
+static int ghes_read_estatus(struct ghes *ghes, int silent, int fixmap_idx)
 {
 	struct acpi_hest_generic *g = ghes->generic;
 	u64 buf_paddr;
@@ -339,7 +317,7 @@ static int ghes_read_estatus(struct ghes *ghes, int silent)
 		return -ENOENT;
 
 	ghes_copy_tofrom_phys(ghes->estatus, buf_paddr,
-			      sizeof(*ghes->estatus), 1);
+			      sizeof(*ghes->estatus), 1, fixmap_idx);
 	if (!ghes->estatus->block_status)
 		return -ENOENT;
 
@@ -356,7 +334,7 @@ static int ghes_read_estatus(struct ghes *ghes, int silent)
 		goto err_read_block;
 	ghes_copy_tofrom_phys(ghes->estatus + 1,
 			      buf_paddr + sizeof(*ghes->estatus),
-			      len - sizeof(*ghes->estatus), 1);
+			      len - sizeof(*ghes->estatus), 1, fixmap_idx);
 	if (cper_estatus_check(ghes->estatus))
 		goto err_read_block;
 	rc = 0;
@@ -368,13 +346,13 @@ static int ghes_read_estatus(struct ghes *ghes, int silent)
 	return rc;
 }
 
-static void ghes_clear_estatus(struct ghes *ghes)
+static void ghes_clear_estatus(struct ghes *ghes, int fixmap_idx)
 {
 	ghes->estatus->block_status = 0;
 	if (!(ghes->flags & GHES_TO_CLEAR))
 		return;
 	ghes_copy_tofrom_phys(ghes->estatus, ghes->buffer_paddr,
-			      sizeof(ghes->estatus->block_status), 0);
+			      sizeof(ghes->estatus->block_status), 0, fixmap_idx);
 	ghes->flags &= ~GHES_TO_CLEAR;
 }
 
@@ -740,12 +718,12 @@ static void __process_error(struct ghes *ghes)
 	llist_add(&estatus_node->llnode, &ghes_estatus_llist);
 }
 
-static int _in_nmi_notify_one(struct ghes *ghes)
+static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
 {
 	int sev;
 
-	if (ghes_read_estatus(ghes, 1)) {
-		ghes_clear_estatus(ghes);
+	if (ghes_read_estatus(ghes, 1, fixmap_idx)) {
+		ghes_clear_estatus(ghes, fixmap_idx);
 		return -ENOENT;
 	}
 
@@ -759,19 +737,19 @@ static int _in_nmi_notify_one(struct ghes *ghes)
 		return 0;
 
 	__process_error(ghes);
-	ghes_clear_estatus(ghes);
+	ghes_clear_estatus(ghes, fixmap_idx);
 
 	return 0;
 }
 
-static int ghes_estatus_queue_notified(struct list_head *rcu_list)
+static int ghes_estatus_queue_notified(struct list_head *rcu_list, int fixmap_idx)
 {
 	int ret = -ENOENT;
 	struct ghes *ghes;
 
 	rcu_read_lock();
 	list_for_each_entry_rcu(ghes, rcu_list, list) {
-		if (!_in_nmi_notify_one(ghes))
+		if (!_in_nmi_notify_one(ghes, fixmap_idx))
 			ret = 0;
 	}
 	rcu_read_unlock();
@@ -876,7 +854,7 @@ static int ghes_proc(struct ghes *ghes)
 {
 	int rc;
 
-	rc = ghes_read_estatus(ghes, 0);
+	rc = ghes_read_estatus(ghes, 0, FIX_APEI_GHES_IRQ);
 	if (rc)
 		goto out;
 
@@ -891,7 +869,7 @@ static int ghes_proc(struct ghes *ghes)
 	ghes_do_proc(ghes, ghes->estatus);
 
 out:
-	ghes_clear_estatus(ghes);
+	ghes_clear_estatus(ghes, FIX_APEI_GHES_IRQ);
 
 	if (rc == -ENOENT)
 		return rc;
@@ -984,7 +962,7 @@ int ghes_notify_sea(void)
 	int rv;
 
 	raw_spin_lock(&ghes_notify_lock_sea);
-	rv = ghes_estatus_queue_notified(&ghes_sea);
+	rv = ghes_estatus_queue_notified(&ghes_sea, FIX_APEI_GHES_NMI);
 	raw_spin_unlock(&ghes_notify_lock_sea);
 
 	return rv;
@@ -1031,7 +1009,7 @@ static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 		return ret;
 
 	raw_spin_lock(&ghes_notify_lock_nmi);
-	if (!ghes_estatus_queue_notified(&ghes_nmi))
+	if (!ghes_estatus_queue_notified(&ghes_nmi, FIX_APEI_GHES_NMI))
 		ret = NMI_HANDLED;
 	raw_spin_unlock(&ghes_notify_lock_nmi);
 
-- 
2.19.0

* [PATCH v6 09/18] ACPI / APEI: Let the notification helper specify the fixmap slot
@ 2018-09-21 22:16   ` James Morse
  0 siblings, 0 replies; 123+ messages in thread
From: James Morse @ 2018-09-21 22:16 UTC (permalink / raw)
  To: linux-arm-kernel

ghes_copy_tofrom_phys() uses a different fixmap slot depending on in_nmi().
This doesn't work when multiple NMI-like notifications can interrupt
each other.

As with the locking, move the choice of fixmap_idx to the notification helper.
This only matters for NMI-like notifications: anything calling
ghes_proc() can use the IRQ fixmap slot, as it is already holding an irqsave
spinlock.

This lets us collapse the ghes_ioremap_pfn_*() helpers.

Signed-off-by: James Morse <james.morse@arm.com>
---

The fixmap-idx and vaddr are passed back to ghes_unmap()
to allow ioremap() to be used in process context in the
future.
---
 drivers/acpi/apei/ghes.c | 76 ++++++++++++++--------------------------
 1 file changed, 27 insertions(+), 49 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index ed8669a6c100..adf7fd402813 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -41,6 +41,7 @@
 #include <linux/llist.h>
 #include <linux/genalloc.h>
 #include <linux/pci.h>
+#include <linux/pfn.h>
 #include <linux/aer.h>
 #include <linux/nmi.h>
 #include <linux/sched/clock.h>
@@ -129,38 +130,24 @@ static atomic_t ghes_estatus_cache_alloced;
 
 static int ghes_panic_timeout __read_mostly = 30;
 
-static void __iomem *ghes_ioremap_pfn_nmi(u64 pfn)
+static void __iomem *ghes_map(u64 pfn, int fixmap_idx)
 {
 	phys_addr_t paddr;
 	pgprot_t prot;
 
-	paddr = pfn << PAGE_SHIFT;
+	paddr = PFN_PHYS(pfn);
 	prot = arch_apei_get_mem_attribute(paddr);
-	__set_fixmap(FIX_APEI_GHES_NMI, paddr, prot);
+	__set_fixmap(fixmap_idx, paddr, prot);
 
-	return (void __iomem *) fix_to_virt(FIX_APEI_GHES_NMI);
+	return (void __iomem *) __fix_to_virt(fixmap_idx);
 }
 
-static void __iomem *ghes_ioremap_pfn_irq(u64 pfn)
+static void ghes_unmap(int fixmap_idx, void __iomem *vaddr)
 {
-	phys_addr_t paddr;
-	pgprot_t prot;
-
-	paddr = pfn << PAGE_SHIFT;
-	prot = arch_apei_get_mem_attribute(paddr);
-	__set_fixmap(FIX_APEI_GHES_IRQ, paddr, prot);
-
-	return (void __iomem *) fix_to_virt(FIX_APEI_GHES_IRQ);
-}
-
-static void ghes_iounmap_nmi(void)
-{
-	clear_fixmap(FIX_APEI_GHES_NMI);
-}
+	int _idx = virt_to_fix((unsigned long)vaddr);
 
-static void ghes_iounmap_irq(void)
-{
-	clear_fixmap(FIX_APEI_GHES_IRQ);
+	WARN_ON_ONCE(fixmap_idx != _idx);
+	clear_fixmap(fixmap_idx);
 }
 
 static int ghes_estatus_pool_init(void)
@@ -289,20 +276,15 @@ static inline int ghes_severity(int severity)
 }
 
 static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
-				  int from_phys)
+				  int from_phys, int fixmap_idx)
 {
 	void __iomem *vaddr;
-	int in_nmi = in_nmi();
 	u64 offset;
 	u32 trunk;
 
 	while (len > 0) {
 		offset = paddr - (paddr & PAGE_MASK);
-		if (in_nmi) {
-			vaddr = ghes_ioremap_pfn_nmi(paddr >> PAGE_SHIFT);
-		} else {
-			vaddr = ghes_ioremap_pfn_irq(paddr >> PAGE_SHIFT);
-		}
+		vaddr = ghes_map(PHYS_PFN(paddr), fixmap_idx);
 		trunk = PAGE_SIZE - offset;
 		trunk = min(trunk, len);
 		if (from_phys)
@@ -312,15 +294,11 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
 		len -= trunk;
 		paddr += trunk;
 		buffer += trunk;
-		if (in_nmi) {
-			ghes_iounmap_nmi();
-		} else {
-			ghes_iounmap_irq();
-		}
+		ghes_unmap(fixmap_idx, vaddr);
 	}
 }
 
-static int ghes_read_estatus(struct ghes *ghes, int silent)
+static int ghes_read_estatus(struct ghes *ghes, int silent, int fixmap_idx)
 {
 	struct acpi_hest_generic *g = ghes->generic;
 	u64 buf_paddr;
@@ -339,7 +317,7 @@ static int ghes_read_estatus(struct ghes *ghes, int silent)
 		return -ENOENT;
 
 	ghes_copy_tofrom_phys(ghes->estatus, buf_paddr,
-			      sizeof(*ghes->estatus), 1);
+			      sizeof(*ghes->estatus), 1, fixmap_idx);
 	if (!ghes->estatus->block_status)
 		return -ENOENT;
 
@@ -356,7 +334,7 @@ static int ghes_read_estatus(struct ghes *ghes, int silent)
 		goto err_read_block;
 	ghes_copy_tofrom_phys(ghes->estatus + 1,
 			      buf_paddr + sizeof(*ghes->estatus),
-			      len - sizeof(*ghes->estatus), 1);
+			      len - sizeof(*ghes->estatus), 1, fixmap_idx);
 	if (cper_estatus_check(ghes->estatus))
 		goto err_read_block;
 	rc = 0;
@@ -368,13 +346,13 @@ static int ghes_read_estatus(struct ghes *ghes, int silent)
 	return rc;
 }
 
-static void ghes_clear_estatus(struct ghes *ghes)
+static void ghes_clear_estatus(struct ghes *ghes, int fixmap_idx)
 {
 	ghes->estatus->block_status = 0;
 	if (!(ghes->flags & GHES_TO_CLEAR))
 		return;
 	ghes_copy_tofrom_phys(ghes->estatus, ghes->buffer_paddr,
-			      sizeof(ghes->estatus->block_status), 0);
+			      sizeof(ghes->estatus->block_status), 0, fixmap_idx);
 	ghes->flags &= ~GHES_TO_CLEAR;
 }
 
@@ -740,12 +718,12 @@ static void __process_error(struct ghes *ghes)
 	llist_add(&estatus_node->llnode, &ghes_estatus_llist);
 }
 
-static int _in_nmi_notify_one(struct ghes *ghes)
+static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
 {
 	int sev;
 
-	if (ghes_read_estatus(ghes, 1)) {
-		ghes_clear_estatus(ghes);
+	if (ghes_read_estatus(ghes, 1, fixmap_idx)) {
+		ghes_clear_estatus(ghes, fixmap_idx);
 		return -ENOENT;
 	}
 
@@ -759,19 +737,19 @@ static int _in_nmi_notify_one(struct ghes *ghes)
 		return 0;
 
 	__process_error(ghes);
-	ghes_clear_estatus(ghes);
+	ghes_clear_estatus(ghes, fixmap_idx);
 
 	return 0;
 }
 
-static int ghes_estatus_queue_notified(struct list_head *rcu_list)
+static int ghes_estatus_queue_notified(struct list_head *rcu_list, int fixmap_idx)
 {
 	int ret = -ENOENT;
 	struct ghes *ghes;
 
 	rcu_read_lock();
 	list_for_each_entry_rcu(ghes, rcu_list, list) {
-		if (!_in_nmi_notify_one(ghes))
+		if (!_in_nmi_notify_one(ghes, fixmap_idx))
 			ret = 0;
 	}
 	rcu_read_unlock();
@@ -876,7 +854,7 @@ static int ghes_proc(struct ghes *ghes)
 {
 	int rc;
 
-	rc = ghes_read_estatus(ghes, 0);
+	rc = ghes_read_estatus(ghes, 0, FIX_APEI_GHES_IRQ);
 	if (rc)
 		goto out;
 
@@ -891,7 +869,7 @@ static int ghes_proc(struct ghes *ghes)
 	ghes_do_proc(ghes, ghes->estatus);
 
 out:
-	ghes_clear_estatus(ghes);
+	ghes_clear_estatus(ghes, FIX_APEI_GHES_IRQ);
 
 	if (rc == -ENOENT)
 		return rc;
@@ -984,7 +962,7 @@ int ghes_notify_sea(void)
 	int rv;
 
 	raw_spin_lock(&ghes_notify_lock_sea);
-	rv = ghes_estatus_queue_notified(&ghes_sea);
+	rv = ghes_estatus_queue_notified(&ghes_sea, FIX_APEI_GHES_NMI);
 	raw_spin_unlock(&ghes_notify_lock_sea);
 
 	return rv;
@@ -1031,7 +1009,7 @@ static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 		return ret;
 
 	raw_spin_lock(&ghes_notify_lock_nmi);
-	if (!ghes_estatus_queue_notified(&ghes_nmi))
+	if (!ghes_estatus_queue_notified(&ghes_nmi, FIX_APEI_GHES_NMI))
 		ret = NMI_HANDLED;
 	raw_spin_unlock(&ghes_notify_lock_nmi);
 
-- 
2.19.0

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* [PATCH v6 10/18] ACPI / APEI: preparatory split of ghes->estatus
  2018-09-21 22:16 ` James Morse
  (?)
@ 2018-09-21 22:16   ` James Morse
  -1 siblings, 0 replies; 123+ messages in thread
From: James Morse @ 2018-09-21 22:16 UTC (permalink / raw)
  To: linux-acpi
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, Punit Agrawal,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-mm, Borislav Petkov, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

The NMI-like notifications scribble over ghes->estatus, before
copying it somewhere else. If this interrupts the ghes_probe() code
calling ghes_proc() on each struct ghes, the data is corrupted.

We want the NMI-like notifications to use a queued estatus entry
from the beginning. To that end, break up any use of "ghes->estatus"
so that all functions take the estatus as an argument.

This patch is just moving code around, no change in behaviour.

Signed-off-by: James Morse <james.morse@arm.com>
---
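
A small userspace sketch of why taking the estatus as an argument helps
(the sketch_* types and functions are invented for illustration and are
not the driver's structures). This patch still passes ghes->estatus
everywhere, but once the buffer is a parameter a caller is free to
supply its own:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for the generic status header. */
struct sketch_status {
	uint32_t block_status;
	uint32_t error_severity;
};

struct sketch_source {
	struct sketch_status firmware;	/* models firmware's status block */
	struct sketch_status estatus;	/* shared per-source buffer */
};

/*
 * Read the firmware record into whichever buffer the caller provides.
 * Returns 0 if a record was present, -1 otherwise.
 */
static int sketch_read_status(const struct sketch_source *src,
			      struct sketch_status *estatus)
{
	assert(estatus != NULL);
	memcpy(estatus, &src->firmware, sizeof(*estatus));
	return estatus->block_status ? 0 : -1;
}
```

A reader that passes a local buffer leaves the shared one untouched, so
an interrupted user of the shared buffer still sees the record it read.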
 drivers/acpi/apei/ghes.c | 82 ++++++++++++++++++++++------------------
 1 file changed, 45 insertions(+), 37 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index adf7fd402813..586689cbc0fd 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -298,7 +298,9 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
 	}
 }
 
-static int ghes_read_estatus(struct ghes *ghes, int silent, int fixmap_idx)
+static int ghes_read_estatus(struct ghes *ghes,
+			     struct acpi_hest_generic_status *estatus,
+			     int silent, int fixmap_idx)
 {
 	struct acpi_hest_generic *g = ghes->generic;
 	u64 buf_paddr;
@@ -316,26 +318,26 @@ static int ghes_read_estatus(struct ghes *ghes, int silent, int fixmap_idx)
 	if (!buf_paddr)
 		return -ENOENT;
 
-	ghes_copy_tofrom_phys(ghes->estatus, buf_paddr,
-			      sizeof(*ghes->estatus), 1, fixmap_idx);
-	if (!ghes->estatus->block_status)
+	ghes_copy_tofrom_phys(estatus, buf_paddr,
+			      sizeof(*estatus), 1, fixmap_idx);
+	if (!estatus->block_status)
 		return -ENOENT;
 
 	ghes->buffer_paddr = buf_paddr;
 	ghes->flags |= GHES_TO_CLEAR;
 
 	rc = -EIO;
-	len = cper_estatus_len(ghes->estatus);
-	if (len < sizeof(*ghes->estatus))
+	len = cper_estatus_len(estatus);
+	if (len < sizeof(*estatus))
 		goto err_read_block;
 	if (len > ghes->generic->error_block_length)
 		goto err_read_block;
-	if (cper_estatus_check_header(ghes->estatus))
+	if (cper_estatus_check_header(estatus))
 		goto err_read_block;
-	ghes_copy_tofrom_phys(ghes->estatus + 1,
-			      buf_paddr + sizeof(*ghes->estatus),
-			      len - sizeof(*ghes->estatus), 1, fixmap_idx);
-	if (cper_estatus_check(ghes->estatus))
+	ghes_copy_tofrom_phys(estatus + 1,
+			      buf_paddr + sizeof(*estatus),
+			      len - sizeof(*estatus), 1, fixmap_idx);
+	if (cper_estatus_check(estatus))
 		goto err_read_block;
 	rc = 0;
 
@@ -346,13 +348,15 @@ static int ghes_read_estatus(struct ghes *ghes, int silent, int fixmap_idx)
 	return rc;
 }
 
-static void ghes_clear_estatus(struct ghes *ghes, int fixmap_idx)
+static void ghes_clear_estatus(struct ghes *ghes,
+			       struct acpi_hest_generic_status *estatus,
+			       int fixmap_idx)
 {
-	ghes->estatus->block_status = 0;
+	estatus->block_status = 0;
 	if (!(ghes->flags & GHES_TO_CLEAR))
 		return;
-	ghes_copy_tofrom_phys(ghes->estatus, ghes->buffer_paddr,
-			      sizeof(ghes->estatus->block_status), 0, fixmap_idx);
+	ghes_copy_tofrom_phys(estatus, ghes->buffer_paddr,
+			      sizeof(estatus->block_status), 0, fixmap_idx);
 	ghes->flags &= ~GHES_TO_CLEAR;
 }
 
@@ -518,9 +522,10 @@ static int ghes_print_estatus(const char *pfx,
 	return 0;
 }
 
-static void __ghes_panic(struct ghes *ghes)
+static void __ghes_panic(struct ghes *ghes,
+			 struct acpi_hest_generic_status *estatus)
 {
-	__ghes_print_estatus(KERN_EMERG, ghes->generic, ghes->estatus);
+	__ghes_print_estatus(KERN_EMERG, ghes->generic, estatus);
 
 	/* reboot to log the error! */
 	if (!panic_timeout)
@@ -695,16 +700,17 @@ static void ghes_print_queued_estatus(void)
 }
 
 /* Save estatus for further processing in IRQ context */
-static void __process_error(struct ghes *ghes)
+static void __process_error(struct ghes *ghes,
+			    struct acpi_hest_generic_status *ghes_estatus)
 {
 	u32 len, node_len;
 	struct ghes_estatus_node *estatus_node;
 	struct acpi_hest_generic_status *estatus;
 
-	if (ghes_estatus_cached(ghes->estatus))
+	if (ghes_estatus_cached(ghes_estatus))
 		return;
 
-	len = cper_estatus_len(ghes->estatus);
+	len = cper_estatus_len(ghes_estatus);
 	node_len = GHES_ESTATUS_NODE_LEN(len);
 
 	estatus_node = (void *)gen_pool_alloc(ghes_estatus_pool, node_len);
@@ -714,35 +720,37 @@ static void __process_error(struct ghes *ghes)
 	estatus_node->ghes = ghes;
 	estatus_node->generic = ghes->generic;
 	estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
-	memcpy(estatus, ghes->estatus, len);
+	memcpy(estatus, ghes_estatus, len);
 	llist_add(&estatus_node->llnode, &ghes_estatus_llist);
 }
 
 static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
 {
 	int sev;
+	struct acpi_hest_generic_status *estatus = ghes->estatus;
 
-	if (ghes_read_estatus(ghes, 1, fixmap_idx)) {
-		ghes_clear_estatus(ghes, fixmap_idx);
+	if (ghes_read_estatus(ghes, estatus, 1, fixmap_idx)) {
+		ghes_clear_estatus(ghes, estatus, fixmap_idx);
 		return -ENOENT;
 	}
 
-	sev = ghes_severity(ghes->estatus->error_severity);
+	sev = ghes_severity(estatus->error_severity);
 	if (sev >= GHES_SEV_PANIC) {
 		ghes_print_queued_estatus();
-		__ghes_panic(ghes);
+		__ghes_panic(ghes, estatus);
 	}
 
 	if (!(ghes->flags & GHES_TO_CLEAR))
 		return 0;
 
-	__process_error(ghes);
-	ghes_clear_estatus(ghes, fixmap_idx);
+	__process_error(ghes, estatus);
+	ghes_clear_estatus(ghes, estatus, fixmap_idx);
 
 	return 0;
 }
 
-static int ghes_estatus_queue_notified(struct list_head *rcu_list, int fixmap_idx)
+static int ghes_estatus_queue_notified(struct list_head *rcu_list,
+				       int fixmap_idx)
 {
 	int ret = -ENOENT;
 	struct ghes *ghes;
@@ -853,23 +861,23 @@ static int ghes_ack_error(struct acpi_hest_generic_v2 *gv2)
 static int ghes_proc(struct ghes *ghes)
 {
 	int rc;
+	struct acpi_hest_generic_status *estatus = ghes->estatus;
 
-	rc = ghes_read_estatus(ghes, 0, FIX_APEI_GHES_IRQ);
+	rc = ghes_read_estatus(ghes, estatus, 0, FIX_APEI_GHES_IRQ);
 	if (rc)
 		goto out;
 
-	if (ghes_severity(ghes->estatus->error_severity) >= GHES_SEV_PANIC) {
-		__ghes_panic(ghes);
-	}
+	if (ghes_severity(estatus->error_severity) >= GHES_SEV_PANIC)
+		__ghes_panic(ghes, estatus);
 
-	if (!ghes_estatus_cached(ghes->estatus)) {
-		if (ghes_print_estatus(NULL, ghes->generic, ghes->estatus))
-			ghes_estatus_cache_add(ghes->generic, ghes->estatus);
+	if (!ghes_estatus_cached(estatus)) {
+		if (ghes_print_estatus(NULL, ghes->generic, estatus))
+			ghes_estatus_cache_add(ghes->generic, estatus);
 	}
-	ghes_do_proc(ghes, ghes->estatus);
+	ghes_do_proc(ghes, estatus);
 
 out:
-	ghes_clear_estatus(ghes, FIX_APEI_GHES_IRQ);
+	ghes_clear_estatus(ghes, estatus, FIX_APEI_GHES_IRQ);
 
 	if (rc == -ENOENT)
 		return rc;
-- 
2.19.0

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* [PATCH v6 11/18] ACPI / APEI: Remove silent flag from ghes_read_estatus()
  2018-09-21 22:16 ` James Morse
  (?)
@ 2018-09-21 22:16   ` James Morse
  -1 siblings, 0 replies; 123+ messages in thread
From: James Morse @ 2018-09-21 22:16 UTC (permalink / raw)
  To: linux-acpi
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, Punit Agrawal,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-mm, Borislav Petkov, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

Subsequent patches will split up ghes_read_estatus(), at which
point passing around the 'silent' flag gets annoying. The flag exists to
suppress printk() messages, which prior to 42a0bb3f7138 ("printk/nmi:
generic solution for safe printk in NMI"), were unsafe in NMI context.

We don't need to do this anymore, so remove the flag. printk() messages
are batched in a per-cpu buffer and printed via irq-work, or a callback
from panic().

Signed-off-by: James Morse <james.morse@arm.com>
---
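
To make the batching described above concrete, here is a toy userspace
model of it. This is not how printk is implemented; the sketch_* names
and the single fixed-size buffer are assumptions made for the example.
Messages logged from the "NMI" side are only appended to a buffer, and
a later flush from a safer context hands them on:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BUF_LEN 256

/* Stands in for one CPU's deferred-message buffer. */
static char sketch_buf[SKETCH_BUF_LEN];
static size_t sketch_used;

/*
 * Called from the "NMI" side: append only, never print.
 * Returns 0 on success, -1 if the message had to be dropped.
 */
static int sketch_nmi_log(const char *msg)
{
	size_t n = strlen(msg);

	if (n > SKETCH_BUF_LEN - sketch_used)
		return -1;
	memcpy(sketch_buf + sketch_used, msg, n);
	sketch_used += n;
	return 0;
}

/*
 * Called later from a safe context: copy out what was batched and
 * reset the buffer. Returns the number of bytes copied.
 */
static size_t sketch_flush(char *out, size_t out_len)
{
	size_t n = sketch_used < out_len ? sketch_used : out_len;

	memcpy(out, sketch_buf, n);
	sketch_used = 0;
	return n;
}
```

With that split the caller no longer needs a flag to say whether
printing is safe; logging is always an append and the output happens
elsewhere.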
 drivers/acpi/apei/ghes.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 586689cbc0fd..ba5344d26a39 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -300,7 +300,7 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
 
 static int ghes_read_estatus(struct ghes *ghes,
 			     struct acpi_hest_generic_status *estatus,
-			     int silent, int fixmap_idx)
+			     int fixmap_idx)
 {
 	struct acpi_hest_generic *g = ghes->generic;
 	u64 buf_paddr;
@@ -309,7 +309,7 @@ static int ghes_read_estatus(struct ghes *ghes,
 
 	rc = apei_read(&buf_paddr, &g->error_status_address);
 	if (rc) {
-		if (!silent && printk_ratelimit())
+		if (printk_ratelimit())
 			pr_warning(FW_WARN GHES_PFX
 "Failed to read error status block address for hardware error source: %d.\n",
 				   g->header.source_id);
@@ -342,7 +342,7 @@ static int ghes_read_estatus(struct ghes *ghes,
 	rc = 0;
 
 err_read_block:
-	if (rc && !silent && printk_ratelimit())
+	if (rc && printk_ratelimit())
 		pr_warning(FW_WARN GHES_PFX
 			   "Failed to read error status block!\n");
 	return rc;
@@ -729,7 +729,7 @@ static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
 	int sev;
 	struct acpi_hest_generic_status *estatus = ghes->estatus;
 
-	if (ghes_read_estatus(ghes, estatus, 1, fixmap_idx)) {
+	if (ghes_read_estatus(ghes, estatus, fixmap_idx)) {
 		ghes_clear_estatus(ghes, estatus, fixmap_idx);
 		return -ENOENT;
 	}
@@ -863,7 +863,7 @@ static int ghes_proc(struct ghes *ghes)
 	int rc;
 	struct acpi_hest_generic_status *estatus = ghes->estatus;
 
-	rc = ghes_read_estatus(ghes, estatus, 0, FIX_APEI_GHES_IRQ);
+	rc = ghes_read_estatus(ghes, estatus, FIX_APEI_GHES_IRQ);
 	if (rc)
 		goto out;
 
-- 
2.19.0

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* [PATCH v6 12/18] ACPI / APEI: Don't store CPER records physical address in struct ghes
  2018-09-21 22:16 ` James Morse
  (?)
@ 2018-09-21 22:16   ` James Morse
  -1 siblings, 0 replies; 123+ messages in thread
From: James Morse @ 2018-09-21 22:16 UTC (permalink / raw)
  To: linux-acpi
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, Punit Agrawal,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-mm, Borislav Petkov, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

When CPER records are found the address of the records is stashed
in the struct ghes. Once the records have been processed, this
address is overwritten with zero so that it won't be processed
again without being re-populated by firmware.

This goes wrong if a struct ghes can be processed concurrently,
as can happen at probe time when an NMI occurs.

Avoid this stashing by letting the caller hold the address. A
later patch will do away with the use of ghes->flags in the
read/clear code too.

Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/acpi/apei/ghes.c | 28 ++++++++++++++--------------
 include/acpi/ghes.h      |  1 -
 2 files changed, 14 insertions(+), 15 deletions(-)
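
As an illustration of the idea in the commit message above, here is a
minimal user-space sketch, not the driver code. It assumes a hypothetical
struct fake_source and helpers fake_read() and outer_handler(), none of
which exist in the kernel. The point it tries to show: when the address is
returned through an out-parameter and kept in a caller's local variable, a
nested handler working on the same source cannot overwrite it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an error source; not the kernel's struct ghes. */
struct fake_source {
	uint64_t status_address;	/* 0 means "no records" */
};

/* The address is handed back through an out-parameter, not stored in src. */
int fake_read(const struct fake_source *src, uint64_t *buf_paddr)
{
	assert(src != NULL && buf_paddr != NULL);

	*buf_paddr = src->status_address;
	return *buf_paddr ? 0 : -1;
}

/*
 * Outer handler that is "interrupted" after its read. The nested handler
 * reads a different address into its own local. Returns the address the
 * outer handler would go on to clear.
 */
uint64_t outer_handler(struct fake_source *src, uint64_t nested_addr)
{
	uint64_t buf_paddr = 0;
	uint64_t nested_paddr = 0;

	if (fake_read(src, &buf_paddr))
		return 0;

	/* Simulated nested notification on the same source. */
	src->status_address = nested_addr;
	(void)fake_read(src, &nested_paddr);
	src->status_address = 0;

	return buf_paddr;	/* unaffected by the nested read */
}
```

Calling outer_handler() on a source whose status_address is 0x1000, with a
nested address of 0x2000, still returns 0x1000. Had the address been stored
in the shared structure, the nested read would have replaced it.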

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index ba5344d26a39..c58f9b330ed3 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -300,14 +300,13 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
 
 static int ghes_read_estatus(struct ghes *ghes,
 			     struct acpi_hest_generic_status *estatus,
-			     int fixmap_idx)
+			     u64 *buf_paddr, int fixmap_idx)
 {
 	struct acpi_hest_generic *g = ghes->generic;
-	u64 buf_paddr;
 	u32 len;
 	int rc;
 
-	rc = apei_read(&buf_paddr, &g->error_status_address);
+	rc = apei_read(buf_paddr, &g->error_status_address);
 	if (rc) {
 		if (printk_ratelimit())
 			pr_warning(FW_WARN GHES_PFX
@@ -315,15 +314,14 @@ static int ghes_read_estatus(struct ghes *ghes,
 				   g->header.source_id);
 		return -EIO;
 	}
-	if (!buf_paddr)
+	if (!*buf_paddr)
 		return -ENOENT;
 
-	ghes_copy_tofrom_phys(estatus, buf_paddr,
+	ghes_copy_tofrom_phys(estatus, *buf_paddr,
 			      sizeof(*estatus), 1, fixmap_idx);
 	if (!estatus->block_status)
 		return -ENOENT;
 
-	ghes->buffer_paddr = buf_paddr;
 	ghes->flags |= GHES_TO_CLEAR;
 
 	rc = -EIO;
@@ -335,7 +333,7 @@ static int ghes_read_estatus(struct ghes *ghes,
 	if (cper_estatus_check_header(estatus))
 		goto err_read_block;
 	ghes_copy_tofrom_phys(estatus + 1,
-			      buf_paddr + sizeof(*estatus),
+			      *buf_paddr + sizeof(*estatus),
 			      len - sizeof(*estatus), 1, fixmap_idx);
 	if (cper_estatus_check(estatus))
 		goto err_read_block;
@@ -350,12 +348,12 @@ static int ghes_read_estatus(struct ghes *ghes,
 
 static void ghes_clear_estatus(struct ghes *ghes,
 			       struct acpi_hest_generic_status *estatus,
-			       int fixmap_idx)
+			       u64 buf_paddr, int fixmap_idx)
 {
 	estatus->block_status = 0;
 	if (!(ghes->flags & GHES_TO_CLEAR))
 		return;
-	ghes_copy_tofrom_phys(estatus, ghes->buffer_paddr,
+	ghes_copy_tofrom_phys(estatus, buf_paddr,
 			      sizeof(estatus->block_status), 0, fixmap_idx);
 	ghes->flags &= ~GHES_TO_CLEAR;
 }
@@ -727,10 +725,11 @@ static void __process_error(struct ghes *ghes,
 static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
 {
 	int sev;
+	u64 buf_paddr;
 	struct acpi_hest_generic_status *estatus = ghes->estatus;
 
-	if (ghes_read_estatus(ghes, estatus, fixmap_idx)) {
-		ghes_clear_estatus(ghes, estatus, fixmap_idx);
+	if (ghes_read_estatus(ghes, estatus, &buf_paddr, fixmap_idx)) {
+		ghes_clear_estatus(ghes, estatus, buf_paddr, fixmap_idx);
 		return -ENOENT;
 	}
 
@@ -744,7 +743,7 @@ static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
 		return 0;
 
 	__process_error(ghes, estatus);
-	ghes_clear_estatus(ghes, estatus, fixmap_idx);
+	ghes_clear_estatus(ghes, estatus, buf_paddr, fixmap_idx);
 
 	return 0;
 }
@@ -861,9 +860,10 @@ static int ghes_ack_error(struct acpi_hest_generic_v2 *gv2)
 static int ghes_proc(struct ghes *ghes)
 {
 	int rc;
+	u64 buf_paddr;
 	struct acpi_hest_generic_status *estatus = ghes->estatus;
 
-	rc = ghes_read_estatus(ghes, estatus, FIX_APEI_GHES_IRQ);
+	rc = ghes_read_estatus(ghes, estatus, &buf_paddr, FIX_APEI_GHES_IRQ);
 	if (rc)
 		goto out;
 
@@ -877,7 +877,7 @@ static int ghes_proc(struct ghes *ghes)
 	ghes_do_proc(ghes, estatus);
 
 out:
-	ghes_clear_estatus(ghes, estatus, FIX_APEI_GHES_IRQ);
+	ghes_clear_estatus(ghes, estatus, buf_paddr, FIX_APEI_GHES_IRQ);
 
 	if (rc == -ENOENT)
 		return rc;
diff --git a/include/acpi/ghes.h b/include/acpi/ghes.h
index 82cb4eb225a4..6dc021e9cdad 100644
--- a/include/acpi/ghes.h
+++ b/include/acpi/ghes.h
@@ -22,7 +22,6 @@ struct ghes {
 		struct acpi_hest_generic_v2 *generic_v2;
 	};
 	struct acpi_hest_generic_status *estatus;
-	u64 buffer_paddr;
 	unsigned long flags;
 	union {
 		struct list_head list;
-- 
2.19.0

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* [PATCH v6 13/18] ACPI / APEI: Don't update struct ghes' flags in read/clear estatus
  2018-09-21 22:16 ` James Morse
  (?)
@ 2018-09-21 22:17   ` James Morse
  -1 siblings, 0 replies; 123+ messages in thread
From: James Morse @ 2018-09-21 22:17 UTC (permalink / raw)
  To: linux-acpi
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, Punit Agrawal,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-mm, Borislav Petkov, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

ghes_read_estatus() sets a flag in struct ghes if the buffer of
CPER records needs to be cleared once the records have been
processed. This global flags value is a problem if a struct ghes
can be processed concurrently, as happens at probe time if an
NMI arrives for the same error source.

The GHES_TO_CLEAR flag was only set at the same time as
buffer_paddr, which is now owned by the caller and passed to
ghes_clear_estatus(). Use buf_paddr as the flag instead.

A non-zero buf_paddr returned by ghes_read_estatus() means
ghes_clear_estatus() will clear this address. ghes_read_estatus()
already checks for a read of error_status_address being zero,
so we can never get CPER records written at zero.

After this, ghes_clear_estatus() no longer needs the struct ghes.

Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/acpi/apei/ghes.c | 26 ++++++++++++--------------
 include/acpi/ghes.h      |  1 -
 2 files changed, 12 insertions(+), 15 deletions(-)
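
The "address doubles as the flag" convention described above can be
sketched in user-space C as follows. This is an illustration under stated
assumptions, not the driver code: fake_mem, fake_read_status() and
fake_clear_status() are hypothetical names, and a small array indexed by
"physical address" stands in for the firmware's memory.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define FAKE_MEM_SIZE 64

/* Hypothetical memory: fake_mem[addr] holds the block_status at addr. */
uint32_t fake_mem[FAKE_MEM_SIZE];

/*
 * On success *buf_paddr is non-zero and the caller must clear it later.
 * On -ENOENT *buf_paddr is zero, meaning there is nothing to clear.
 */
int fake_read_status(uint64_t status_address, uint64_t *buf_paddr,
		     uint32_t *block_status)
{
	assert(status_address < FAKE_MEM_SIZE);

	*buf_paddr = status_address;
	if (!*buf_paddr)
		return -ENOENT;

	*block_status = fake_mem[*buf_paddr];
	if (!*block_status) {
		*buf_paddr = 0;		/* no records, so nothing to clear */
		return -ENOENT;
	}

	return 0;
}

/* Needs no shared state: a zero address means "do nothing". */
void fake_clear_status(uint64_t buf_paddr)
{
	assert(buf_paddr < FAKE_MEM_SIZE);

	if (buf_paddr)
		fake_mem[buf_paddr] = 0;
}
```

Because address zero is rejected before any records are read, zero is free
to act as the "nothing to clear" marker, and the clear helper can be called
unconditionally on every exit path.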

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index c58f9b330ed3..3028487d43a3 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -319,10 +319,10 @@ static int ghes_read_estatus(struct ghes *ghes,
 
 	ghes_copy_tofrom_phys(estatus, *buf_paddr,
 			      sizeof(*estatus), 1, fixmap_idx);
-	if (!estatus->block_status)
+	if (!estatus->block_status) {
+		*buf_paddr = 0;
 		return -ENOENT;
-
-	ghes->flags |= GHES_TO_CLEAR;
+	}
 
 	rc = -EIO;
 	len = cper_estatus_len(estatus);
@@ -346,16 +346,14 @@ static int ghes_read_estatus(struct ghes *ghes,
 	return rc;
 }
 
-static void ghes_clear_estatus(struct ghes *ghes,
-			       struct acpi_hest_generic_status *estatus,
+static void ghes_clear_estatus(struct acpi_hest_generic_status *estatus,
 			       u64 buf_paddr, int fixmap_idx)
 {
 	estatus->block_status = 0;
-	if (!(ghes->flags & GHES_TO_CLEAR))
-		return;
-	ghes_copy_tofrom_phys(estatus, buf_paddr,
-			      sizeof(estatus->block_status), 0, fixmap_idx);
-	ghes->flags &= ~GHES_TO_CLEAR;
+	if (buf_paddr)
+		ghes_copy_tofrom_phys(estatus, buf_paddr,
+				      sizeof(estatus->block_status), 0,
+				      fixmap_idx);
 }
 
 static void ghes_handle_memory_failure(struct acpi_hest_generic_data *gdata, int sev)
@@ -729,7 +727,7 @@ static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
 	struct acpi_hest_generic_status *estatus = ghes->estatus;
 
 	if (ghes_read_estatus(ghes, estatus, &buf_paddr, fixmap_idx)) {
-		ghes_clear_estatus(ghes, estatus, buf_paddr, fixmap_idx);
+		ghes_clear_estatus(estatus, buf_paddr, fixmap_idx);
 		return -ENOENT;
 	}
 
@@ -739,11 +737,11 @@ static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
 		__ghes_panic(ghes, estatus);
 	}
 
-	if (!(ghes->flags & GHES_TO_CLEAR))
+	if (!buf_paddr)
 		return 0;
 
 	__process_error(ghes, estatus);
-	ghes_clear_estatus(ghes, estatus, buf_paddr, fixmap_idx);
+	ghes_clear_estatus(estatus, buf_paddr, fixmap_idx);
 
 	return 0;
 }
@@ -877,7 +875,7 @@ static int ghes_proc(struct ghes *ghes)
 	ghes_do_proc(ghes, estatus);
 
 out:
-	ghes_clear_estatus(ghes, estatus, buf_paddr, FIX_APEI_GHES_IRQ);
+	ghes_clear_estatus(estatus, buf_paddr, FIX_APEI_GHES_IRQ);
 
 	if (rc == -ENOENT)
 		return rc;
diff --git a/include/acpi/ghes.h b/include/acpi/ghes.h
index 6dc021e9cdad..536f90dd1e34 100644
--- a/include/acpi/ghes.h
+++ b/include/acpi/ghes.h
@@ -13,7 +13,6 @@
  * estatus: memory buffer for error status block, allocated during
  * HEST parsing.
  */
-#define GHES_TO_CLEAR		0x0001
 #define GHES_EXITING		0x0002
 
 struct ghes {
-- 
2.19.0

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* [PATCH v6 14/18] ACPI / APEI: Split ghes_read_estatus() to read CPER length
  2018-09-21 22:16 ` James Morse
  (?)
@ 2018-09-21 22:17   ` James Morse
  -1 siblings, 0 replies; 123+ messages in thread
From: James Morse @ 2018-09-21 22:17 UTC (permalink / raw)
  To: linux-acpi
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, Punit Agrawal,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-mm, Borislav Petkov, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

ghes_read_estatus() reads the record address, then the record's
header, then performs some sanity checks before reading the
records into the provided estatus buffer.

We either need to know the size of the records before we call
ghes_read_estatus(), or always provide a worst-case sized buffer,
as happens today.

Add a function to peek at the record's header to find the size. This
will let the NMI path allocate the right amount of memory before reading
the records, instead of using the worst-case size, and having to copy
the records.

Split ghes_read_estatus() to create ghes_peek_estatus() which
returns the address and size of the CPER records.

Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/acpi/apei/ghes.c | 55 ++++++++++++++++++++++++++++++----------
 1 file changed, 41 insertions(+), 14 deletions(-)
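
The peek-then-read split described above can be sketched in user-space C
as follows. This is only an illustration: struct fake_header, fake_peek()
and fake_read_records() are hypothetical, the header layout is simplified
and is not the real CPER layout, and malloc() stands in for whatever
NMI-safe allocator the real NMI path would have to use.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified, hypothetical record header. */
struct fake_header {
	uint32_t block_status;	/* 0 means "no records present" */
	uint32_t data_length;	/* payload bytes following the header */
};

/* Peek: copy only the header, validate it, report the total length. */
int fake_peek(const uint8_t *region, size_t region_len, uint32_t *buf_len)
{
	struct fake_header hdr;

	assert(region != NULL && buf_len != NULL);

	if (region_len < sizeof(hdr))
		return -EIO;
	memcpy(&hdr, region, sizeof(hdr));
	if (!hdr.block_status)
		return -ENOENT;
	if (hdr.data_length > region_len - sizeof(hdr))
		return -EIO;

	*buf_len = (uint32_t)(sizeof(hdr) + hdr.data_length);
	return 0;
}

/*
 * Read: size the buffer from the peeked length, then copy the records
 * once. Returns a malloc()ed copy the caller must free(), or NULL.
 */
uint8_t *fake_read_records(const uint8_t *region, size_t region_len,
			   uint32_t *out_len)
{
	uint32_t len;
	uint8_t *buf;

	if (fake_peek(region, region_len, &len))
		return NULL;

	buf = malloc(len);
	if (!buf)
		return NULL;

	memcpy(buf, region, len);
	*out_len = len;
	return buf;
}
```

The design choice this tries to show is that the destination buffer is
sized from the header alone, so the caller neither reserves a worst-case
buffer nor copies the records a second time.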

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 3028487d43a3..055176ed68ac 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -298,11 +298,12 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
 	}
 }
 
-static int ghes_read_estatus(struct ghes *ghes,
-			     struct acpi_hest_generic_status *estatus,
-			     u64 *buf_paddr, int fixmap_idx)
+/* read the CPER block returning its address and size */
+static int ghes_peek_estatus(struct ghes *ghes, int fixmap_idx,
+			     u64 *buf_paddr, u32 *buf_len)
 {
 	struct acpi_hest_generic *g = ghes->generic;
+	struct acpi_hest_generic_status estatus;
 	u32 len;
 	int rc;
 
@@ -317,26 +318,23 @@ static int ghes_read_estatus(struct ghes *ghes,
 	if (!*buf_paddr)
 		return -ENOENT;
 
-	ghes_copy_tofrom_phys(estatus, *buf_paddr,
-			      sizeof(*estatus), 1, fixmap_idx);
-	if (!estatus->block_status) {
+	ghes_copy_tofrom_phys(&estatus, *buf_paddr,
+			      sizeof(estatus), 1, fixmap_idx);
+	if (!estatus.block_status) {
 		*buf_paddr = 0;
 		return -ENOENT;
 	}
 
 	rc = -EIO;
-	len = cper_estatus_len(estatus);
-	if (len < sizeof(*estatus))
+	len = cper_estatus_len(&estatus);
+	if (len < sizeof(estatus))
 		goto err_read_block;
 	if (len > ghes->generic->error_block_length)
 		goto err_read_block;
-	if (cper_estatus_check_header(estatus))
-		goto err_read_block;
-	ghes_copy_tofrom_phys(estatus + 1,
-			      *buf_paddr + sizeof(*estatus),
-			      len - sizeof(*estatus), 1, fixmap_idx);
-	if (cper_estatus_check(estatus))
+	if (cper_estatus_check_header(&estatus))
 		goto err_read_block;
+	*buf_len = len;
+
 	rc = 0;
 
 err_read_block:
@@ -346,6 +344,35 @@ static int ghes_read_estatus(struct ghes *ghes,
 	return rc;
 }
 
+static int __ghes_read_estatus(struct acpi_hest_generic_status *estatus,
+			       u64 buf_paddr, size_t buf_len,
+			       int fixmap_idx)
+{
+	ghes_copy_tofrom_phys(estatus, buf_paddr, buf_len, 1, fixmap_idx);
+	if (cper_estatus_check(estatus)) {
+		if (printk_ratelimit())
+			pr_warning(FW_WARN GHES_PFX
+				   "Failed to read error status block!\n");
+		return -EIO;
+	}
+
+	return 0;
+}
+
+static int ghes_read_estatus(struct ghes *ghes,
+			     struct acpi_hest_generic_status *estatus,
+			     u64 *buf_paddr, int fixmap_idx)
+{
+	int rc;
+	u32 buf_len;
+
+	rc = ghes_peek_estatus(ghes, fixmap_idx, buf_paddr, &buf_len);
+	if (rc)
+		return rc;
+
+	return __ghes_read_estatus(estatus, *buf_paddr, buf_len, fixmap_idx);
+}
+
 static void ghes_clear_estatus(struct acpi_hest_generic_status *estatus,
 			       u64 buf_paddr, int fixmap_idx)
 {
-- 
2.19.0

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* [PATCH v6 15/18] ACPI / APEI: Only use queued estatus entry during _in_nmi_notify_one()
  2018-09-21 22:16 ` James Morse
  (?)
@ 2018-09-21 22:17   ` James Morse
  -1 siblings, 0 replies; 123+ messages in thread
From: James Morse @ 2018-09-21 22:17 UTC (permalink / raw)
  To: linux-acpi
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, Punit Agrawal,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-mm, Borislav Petkov, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

Each struct ghes has a worst-case sized buffer for storing the
estatus. If an error is being processed by ghes_proc() in process
context this buffer will be in use. If the error source then triggers
an NMI-like notification, the same buffer will be used by
_in_nmi_notify_one() to stage the estatus data, before
__process_error() copies it into a queued estatus entry.

Merge __process_error()'s work into _in_nmi_notify_one() so that
the queued estatus entry is used from the beginning. Use
ghes_peek_estatus() so we know how much memory to allocate from
the ghes_estatus_pool before we read the records.

Reported-by: Borislav Petkov <bp@suse.de>
Signed-off-by: James Morse <james.morse@arm.com>
---
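For illustration only, a minimal user-space sketch of the control flow
this patch moves to: size the queue node from the peeked length, read
the record straight into it, and release it through a single cleanup
label when there is nothing to queue. All demo_* names are hypothetical,
and malloc()/free() merely stand in for the pool allocator here; they
would not be suitable in a real NMI handler.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical queue node: bookkeeping followed by the record bytes. */
struct demo_node {
	struct demo_node *next;
	size_t len;
	unsigned char data[];	/* record is read straight into here */
};

static struct demo_node *demo_queue;	/* singly linked list head */

/*
 * Allocate a node sized for the record, fill it via read_fn(), and
 * queue it. A failed read releases the node through one cleanup label.
 */
static int demo_notify_one(size_t len,
			   int (*read_fn)(void *buf, size_t len))
{
	struct demo_node *node;
	int rc = 0;

	assert(read_fn);

	node = malloc(sizeof(*node) + len);	/* stand-in for the pool */
	if (!node)
		return -ENOMEM;
	node->len = len;

	if (read_fn(node->data, len)) {
		rc = -ENOENT;
		goto no_work;
	}

	node->next = demo_queue;
	demo_queue = node;
	return rc;

no_work:
	free(node);
	return rc;
}

/* Two hypothetical readers used to exercise both paths. */
static int demo_read_ok(void *buf, size_t len)
{
	memset(buf, 0x5a, len);
	return 0;
}

static int demo_read_fail(void *buf, size_t len)
{
	(void)buf;
	(void)len;
	return -EIO;
}
```

Calling demo_notify_one(16, demo_read_ok) leaves one node on demo_queue;
calling it with demo_read_fail returns -ENOENT and leaves the queue
untouched, which mirrors the no_work path in the diff below.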
 drivers/acpi/apei/ghes.c | 45 ++++++++++++++++++++--------------------
 1 file changed, 22 insertions(+), 23 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 055176ed68ac..a0c10b60ad44 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -722,40 +722,32 @@ static void ghes_print_queued_estatus(void)
 	}
 }
 
-/* Save estatus for further processing in IRQ context */
-static void __process_error(struct ghes *ghes,
-			    struct acpi_hest_generic_status *ghes_estatus)
+static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
 {
+	u64 buf_paddr;
+	int sev, rc = 0;
 	u32 len, node_len;
 	struct ghes_estatus_node *estatus_node;
 	struct acpi_hest_generic_status *estatus;
 
-	if (ghes_estatus_cached(ghes_estatus))
-		return;
+	rc = ghes_peek_estatus(ghes, fixmap_idx, &buf_paddr, &len);
+	if (rc)
+		return rc;
 
-	len = cper_estatus_len(ghes_estatus);
 	node_len = GHES_ESTATUS_NODE_LEN(len);
 
 	estatus_node = (void *)gen_pool_alloc(ghes_estatus_pool, node_len);
 	if (!estatus_node)
-		return;
+		return -ENOMEM;
 
 	estatus_node->ghes = ghes;
 	estatus_node->generic = ghes->generic;
 	estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
-	memcpy(estatus, ghes_estatus, len);
-	llist_add(&estatus_node->llnode, &ghes_estatus_llist);
-}
-
-static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
-{
-	int sev;
-	u64 buf_paddr;
-	struct acpi_hest_generic_status *estatus = ghes->estatus;
 
-	if (ghes_read_estatus(ghes, estatus, &buf_paddr, fixmap_idx)) {
+	if (__ghes_read_estatus(estatus, buf_paddr, len, fixmap_idx)) {
 		ghes_clear_estatus(estatus, buf_paddr, fixmap_idx);
-		return -ENOENT;
+		rc = -ENOENT;
+		goto no_work;
 	}
 
 	sev = ghes_severity(estatus->error_severity);
@@ -764,13 +756,20 @@ static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
 		__ghes_panic(ghes, estatus);
 	}
 
-	if (!buf_paddr)
-		return 0;
-
-	__process_error(ghes, estatus);
 	ghes_clear_estatus(estatus, buf_paddr, fixmap_idx);
 
-	return 0;
+	if (!buf_paddr || ghes_estatus_cached(estatus))
+		goto no_work;
+
+	llist_add(&estatus_node->llnode, &ghes_estatus_llist);
+
+	return rc;
+
+no_work:
+	gen_pool_free(ghes_estatus_pool, (unsigned long)estatus_node,
+			      node_len);
+
+	return rc;
 }
 
 static int ghes_estatus_queue_notified(struct list_head *rcu_list,
-- 
2.19.0

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* [PATCH v6 16/18] ACPI / APEI: Split fixmap pages for arm64 NMI-like notifications
  2018-09-21 22:16 ` James Morse
  (?)
@ 2018-09-21 22:17   ` James Morse
  -1 siblings, 0 replies; 123+ messages in thread
From: James Morse @ 2018-09-21 22:17 UTC (permalink / raw)
  To: linux-acpi
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, Punit Agrawal,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-mm, Borislav Petkov, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

Now that the ghes notification helpers provide the fixmap slots and
take the lock themselves, we can support multiple NMI-like
notifications on arm64.

These should be named after their notification method. x86's
NOTIFY_NMI already is; move it to live with the ghes_nmi list.
Change the SEA fixmap entry to be called FIX_APEI_GHES_SEA.

Future patches can add support for FIX_APEI_GHES_SEI and
FIX_APEI_GHES_SDEI_{NORMAL,CRITICAL}.

Signed-off-by: James Morse <james.morse@arm.com>

---
Changes since v3:
 * idx/lock are now in a separate struct.
 * Add to the comment above ghes_fixmap_lock_irq so that it makes more
   sense in isolation.
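
For illustration only, a minimal sketch of the idea of one fixmap slot
per notification method. The DEMO_* names, the page size, and the top
address are assumptions made up for this sketch; the point is only that
each method maps through its own page, selected at build time.

```c
#include <assert.h>

#define DEMO_HAVE_SEA	1		/* stand-in for CONFIG_ACPI_APEI_SEA */
#define DEMO_PAGE_SIZE	4096UL		/* hypothetical page size */
#define DEMO_FIXADDR_TOP 0xffff0000UL	/* hypothetical top of the region */

/* One entry per notification method; SEA only exists when enabled. */
enum demo_fixed_addresses {
	DEMO_FIX_IRQ,
#if DEMO_HAVE_SEA
	DEMO_FIX_SEA,
#endif
	DEMO_FIX_END,	/* number of slots actually reserved */
};

/* Each slot index resolves to its own page-sized virtual address. */
static unsigned long demo_fix_to_virt(enum demo_fixed_addresses idx)
{
	assert(idx < DEMO_FIX_END);
	return DEMO_FIXADDR_TOP - idx * DEMO_PAGE_SIZE;
}
```

With separate slots, a notification that interrupts code using the IRQ
slot maps its record through a different page rather than reusing the
interrupted mapping.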
---
 arch/arm64/include/asm/fixmap.h | 4 +++-
 drivers/acpi/apei/ghes.c        | 6 +++---
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
index ec1e6d6fa14c..c3974517c2cb 100644
--- a/arch/arm64/include/asm/fixmap.h
+++ b/arch/arm64/include/asm/fixmap.h
@@ -55,7 +55,9 @@ enum fixed_addresses {
 #ifdef CONFIG_ACPI_APEI_GHES
 	/* Used for GHES mapping from assorted contexts */
 	FIX_APEI_GHES_IRQ,
-	FIX_APEI_GHES_NMI,
+#ifdef CONFIG_ACPI_APEI_SEA
+	FIX_APEI_GHES_SEA,
+#endif
 #endif /* CONFIG_ACPI_APEI_GHES */
 
 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index a0c10b60ad44..463c8e6d1bb5 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -994,7 +994,7 @@ int ghes_notify_sea(void)
 	int rv;
 
 	raw_spin_lock(&ghes_notify_lock_sea);
-	rv = ghes_estatus_queue_notified(&ghes_sea, FIX_APEI_GHES_NMI);
+	rv = ghes_estatus_queue_notified(&ghes_sea, FIX_APEI_GHES_SEA);
 	raw_spin_unlock(&ghes_notify_lock_sea);
 
 	return rv;
@@ -1025,8 +1025,8 @@ static inline void ghes_sea_remove(struct ghes *ghes) { }
 
 #ifdef CONFIG_HAVE_ACPI_APEI_NMI
 /*
- * NMI may be triggered on any CPU, so ghes_in_nmi is used for
- * having only one concurrent reader.
+ * NOTIFY_NMI may be triggered on any CPU, so ghes_in_nmi is
+ * used for having only one concurrent reader.
  */
 static atomic_t ghes_in_nmi = ATOMIC_INIT(0);
 
-- 
2.19.0

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* [PATCH v6 17/18] mm/memory-failure: increase queued recovery work's priority
  2018-09-21 22:16 ` James Morse
  (?)
@ 2018-09-21 22:17   ` James Morse
  -1 siblings, 0 replies; 123+ messages in thread
From: James Morse @ 2018-09-21 22:17 UTC (permalink / raw)
  To: linux-acpi
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, Punit Agrawal,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-mm, Borislav Petkov, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

arm64 can take an NMI-like error notification when user-space steps in
some corrupt memory. APEI's GHES code will call memory_failure_queue()
to schedule the recovery work. We then return to user-space, possibly
taking the fault again.

Currently the arch code unconditionally signals user-space from this
path, so we don't get stuck in this loop, but the affected process
never benefits from memory_failure()'s recovery work. To fix this we
need to know the recovery work will run before we get back to user-space.

Increase the priority of the recovery work by scheduling it on the
system_highpri_wq, then try to bump the current task off this CPU
so that the recovery work starts immediately.

Reported-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Punit Agrawal <punit.agrawal@arm.com>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
Tested-by: gengdongjiu <gengdongjiu@huawei.com>
CC: Xie XiuQi <xiexiuqi@huawei.com>
CC: gengdongjiu <gengdongjiu@huawei.com>
---
 mm/memory-failure.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 0cd3de3550f0..4e7b115cea5a 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -56,6 +56,7 @@
 #include <linux/memory_hotplug.h>
 #include <linux/mm_inline.h>
 #include <linux/memremap.h>
+#include <linux/preempt.h>
 #include <linux/kfifo.h>
 #include <linux/ratelimit.h>
 #include <linux/page-isolation.h>
@@ -1454,6 +1455,7 @@ static DEFINE_PER_CPU(struct memory_failure_cpu, memory_failure_cpu);
  */
 void memory_failure_queue(unsigned long pfn, int flags)
 {
+	int cpu = smp_processor_id();
 	struct memory_failure_cpu *mf_cpu;
 	unsigned long proc_flags;
 	struct memory_failure_entry entry = {
@@ -1463,11 +1465,14 @@ void memory_failure_queue(unsigned long pfn, int flags)
 
 	mf_cpu = &get_cpu_var(memory_failure_cpu);
 	spin_lock_irqsave(&mf_cpu->lock, proc_flags);
-	if (kfifo_put(&mf_cpu->fifo, entry))
-		schedule_work_on(smp_processor_id(), &mf_cpu->work);
-	else
+	if (kfifo_put(&mf_cpu->fifo, entry)) {
+		queue_work_on(cpu, system_highpri_wq, &mf_cpu->work);
+		set_tsk_need_resched(current);
+		preempt_set_need_resched();
+	} else {
 		pr_err("Memory failure: buffer overflow when queuing memory failure at %#lx\n",
 		       pfn);
+	}
 	spin_unlock_irqrestore(&mf_cpu->lock, proc_flags);
 	put_cpu_var(memory_failure_cpu);
 }
-- 
2.19.0

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* [PATCH v6 18/18] arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work
  2018-09-21 22:16 ` James Morse
  (?)
@ 2018-09-21 22:17   ` James Morse
  -1 siblings, 0 replies; 123+ messages in thread
From: James Morse @ 2018-09-21 22:17 UTC (permalink / raw)
  To: linux-acpi
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, Punit Agrawal,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-mm, Borislav Petkov, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

APEI is unable to do all of its error handling work in nmi-context, so
it defers non-fatal work onto the irq_work queue. arch_irq_work_raise()
sends an IPI to the calling cpu, but we can't guarantee this will be
taken before we return.

Unless we interrupted a context with irqs-masked, we can call
irq_work_run() to do the work now. Otherwise return -EINPROGRESS to
indicate ghes_notify_sea() found some work to do, but it hasn't
finished yet.

With this we can take apei_claim_sea() returning '0' to mean that this
external abort was also the notification of a firmware-first RAS error,
and that APEI has processed the CPER records.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Punit Agrawal <punit.agrawal@arm.com>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
CC: Xie XiuQi <xiexiuqi@huawei.com>
CC: gengdongjiu <gengdongjiu@huawei.com>
---
Changes since v2:
 * Removed IS_ENABLED() check, done by the caller unless we have a dummy
   definition.
---
 arch/arm64/kernel/acpi.c | 19 +++++++++++++++++++
 arch/arm64/mm/fault.c    |  9 ++++-----
 2 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
index a9b8bba014b5..09744e2d15a0 100644
--- a/arch/arm64/kernel/acpi.c
+++ b/arch/arm64/kernel/acpi.c
@@ -23,6 +23,7 @@
 #include <linux/init.h>
 #include <linux/irq.h>
 #include <linux/irqdomain.h>
+#include <linux/irq_work.h>
 #include <linux/memblock.h>
 #include <linux/of_fdt.h>
 #include <linux/smp.h>
@@ -270,10 +271,14 @@ int apei_claim_sea(struct pt_regs *regs)
 {
 	int err = -ENOENT;
 	unsigned long current_flags = arch_local_save_flags();
+	unsigned long interrupted_flags = current_flags;
 
 	if (!IS_ENABLED(CONFIG_ACPI_APEI_SEA))
 		return err;
 
+	if (regs)
+		interrupted_flags = regs->pstate;
+
 	/*
 	 * SEA can interrupt SError, mask it and describe this as an NMI so
 	 * that APEI defers the handling.
@@ -282,6 +287,20 @@ int apei_claim_sea(struct pt_regs *regs)
 	nmi_enter();
 	err = ghes_notify_sea();
 	nmi_exit();
+
+	/*
+	 * APEI NMI-like notifications are deferred to irq_work. Unless
+	 * we interrupted irqs-masked code, we can do that now.
+	 */
+	if (!err) {
+		if (!arch_irqs_disabled_flags(interrupted_flags)) {
+			local_daif_restore(DAIF_PROCCTX_NOIRQ);
+			irq_work_run();
+		} else {
+			err = -EINPROGRESS;
+		}
+	}
+
 	local_daif_restore(current_flags);
 
 	return err;
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 2c38776bb71f..97036e01522a 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -630,11 +630,10 @@ static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs)
 
 	inf = esr_to_fault_info(esr);
 
-	/*
-	 * Return value ignored as we rely on signal merging.
-	 * Future patches will make this more robust.
-	 */
-	apei_claim_sea(regs);
+	if (apei_claim_sea(regs) == 0) {
+		/* APEI claimed this as a firmware-first notification */
+		return 0;
+	}
 
 	clear_siginfo(&info);
 	info.si_signo = inf->sig;
-- 
2.19.0
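
For readers following the return values, the contract described in the
commit message can be modelled in user space. This is only a sketch under
stated assumptions: claim_sea_model(), struct sea_model_result and
sea_fully_handled() are hypothetical names, ran_irq_work stands in for the
irq_work_run() call, and notify_err stands in for whatever
ghes_notify_sea() returned.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct sea_model_result {
	int err;		/* value handed back to the caller */
	bool ran_irq_work;	/* true if the deferred work ran synchronously */
};

/*
 * notify_err: 0 if the notification found work to do, negative errno
 * otherwise (for example -ENOENT when nothing claimed the abort).
 */
struct sea_model_result
claim_sea_model(int notify_err, bool interrupted_irqs_masked)
{
	struct sea_model_result res = { .err = notify_err, .ran_irq_work = false };

	assert(notify_err <= 0);	/* kernel-style: zero or negative errno */

	if (res.err)
		return res;		/* not claimed, nothing was deferred */

	if (!interrupted_irqs_masked)
		res.ran_irq_work = true;	/* safe to run the deferred work now */
	else
		res.err = -EINPROGRESS;		/* work is pending, not finished */

	return res;
}

/* Mirrors the do_sea() hunk: only 0 means "claimed and fully processed". */
bool sea_fully_handled(struct sea_model_result res)
{
	return res.err == 0;
}
```

In this model a caller that sees -EINPROGRESS falls through to its normal
signalling path, exactly as it does for -ENOENT, because the records have
not been processed yet.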

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* Re: [PATCH v6 00/18] APEI in_nmi() rework
  2018-09-21 22:16 ` James Morse
  (?)
@ 2018-09-25 12:45   ` Borislav Petkov
  -1 siblings, 0 replies; 123+ messages in thread
From: Borislav Petkov @ 2018-09-25 12:45 UTC (permalink / raw)
  To: James Morse
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, linux-mm,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-acpi, Punit Agrawal, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Fri, Sep 21, 2018 at 11:16:47PM +0100, James Morse wrote:
> Hello,
> 
> The GHES driver has collected quite a few bugs:
> 
> ghes_proc() at ghes_probe() time can be interrupted by an NMI that
> will clobber the ghes->estatus fields, flags, and the buffer_paddr.
> 
> ghes_copy_tofrom_phys() uses in_nmi() to decide which path to take. arm64's
> SEA taking both paths, depending on what it interrupted.
> 
> There is no guarantee that queued memory_failure() errors will be processed
> before this CPU returns to user-space.
> 
> x86 can't TLBI from interrupt-masked code which this driver does all the
> time.
> 
> 
> This series aims to fix the first three, with an eye to fixing the
> last one with a follow-up series.
> 
> Previous postings included the SDEI notification calls, which I haven't
> finished re-testing. This series is big enough as it is.

Yeah, and everywhere I look, this thing looks overengineered. Like,
for example, what's the purpose of this ghes_esource_prealloc_size()
computing a size each time the pool changes size?

AFAICT, this size can be computed exactly *once* at driver init and be
done with it. Right?

Or am I missing something subtle?

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v6 04/18] ACPI / APEI: Switch NOTIFY_SEA to use the estatus queue
  2018-09-21 22:16   ` James Morse
  (?)
@ 2018-09-28 17:04     ` Borislav Petkov
  -1 siblings, 0 replies; 123+ messages in thread
From: Borislav Petkov @ 2018-09-28 17:04 UTC (permalink / raw)
  To: James Morse
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, linux-mm,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-acpi, Punit Agrawal, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Fri, Sep 21, 2018 at 11:16:51PM +0100, James Morse wrote:
> Now that the estatus queue can be used by more than one notification
> method, we can move notifications that have NMI-like behaviour over to
> it, and start abstracting GHES's single in_nmi() path.
> 
> Switch NOTIFY_SEA over to use the estatus queue. This makes it behave
> in the same way as x86's NOTIFY_NMI.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Punit Agrawal <punit.agrawal@arm.com>
> Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
> ---
>  drivers/acpi/apei/ghes.c | 23 +++++++++++------------
>  1 file changed, 11 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index d7c46236b353..150fb184c7cb 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -58,6 +58,10 @@
>  
>  #define GHES_PFX	"GHES: "
>  
> +#if defined(CONFIG_HAVE_ACPI_APEI_NMI) || defined(CONFIG_ACPI_APEI_SEA)
> +#define WANT_NMI_ESTATUS_QUEUE	1
> +#endif

Is that just so that you have shorter ifdeffery lines? Because if so, an
additional level of indirection is silly. Or maybe there's more coming -
I'll see when I continue going through this set. :)

Otherwise looks good - trying to reuse the facilities and all. Better. :)

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v6 05/18] ACPI / APEI: Make estatus queue a Kconfig symbol
  2018-09-21 22:16   ` James Morse
  (?)
@ 2018-10-01 17:59     ` Borislav Petkov
  -1 siblings, 0 replies; 123+ messages in thread
From: Borislav Petkov @ 2018-10-01 17:59 UTC (permalink / raw)
  To: James Morse
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, linux-mm,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-acpi, Punit Agrawal, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Fri, Sep 21, 2018 at 11:16:52PM +0100, James Morse wrote:
> Now that there are two users of the estatus queue, and likely to be more,
> make it a Kconfig symbol selected by the appropriate notification. We
> can move the ARCH_HAVE_NMI_SAFE_CMPXCHG checks in here too.

Ok, question: why do we need to complicate things at all? I mean, why do
we even need a Kconfig symbol?

This code is being used by two arches now so why not simply build it in
unconditionally and be done with it. The couple of KB saved are simply
not worth the effort, especially if it is going to end up being enabled
on 99% of the setups...

Or?

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v6 05/18] ACPI / APEI: Make estatus queue a Kconfig symbol
  2018-10-01 17:59     ` Borislav Petkov
  (?)
@ 2018-10-03 17:50       ` James Morse
  -1 siblings, 0 replies; 123+ messages in thread
From: James Morse @ 2018-10-03 17:50 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, linux-mm,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-acpi, Punit Agrawal, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

Hi Boris,

On 01/10/18 18:59, Borislav Petkov wrote:
> On Fri, Sep 21, 2018 at 11:16:52PM +0100, James Morse wrote:
>> Now that there are two users of the estatus queue, and likely to be more,
>> make it a Kconfig symbol selected by the appropriate notification. We
>> can move the ARCH_HAVE_NMI_SAFE_CMPXCHG checks in here too.
> 
> Ok, question: why do we need to complicate things at all? I mean, why do
> we even need a Kconfig symbol?

Before patch 4, this was behind CONFIG_HAVE_ACPI_APEI_NMI (so it made use of an
existing Kconfig symbol), and there was only one user: x86's NMI.

The ACPI spec has four ~NMI notifications; so far, support for each of these in
Linux has been selectable separately. If you build the kernel without any of
them, this code would be unused and would generate warnings, because all of its
users are behind #ifdef too.


> This code is being used by two arches now so why not simply build it in
> unconditionally and be done with it. The couple of KB saved are simply
> not worth the effort, especially if it is going to end up being enabled
> on 99% of the setups...

I'm all in favour of letting the compiler work it out, but the existing ghes
code has #ifdef/#else all over the place. This is 'keeping the style'.
I assumed it was done this way to support an older compiler on x86 (I see the
minimum GCC version jumped from 3.2 to 4.6 with commit cafa0010cd51).

We could strip the lot away to a few IS_ENABLED() in ghes_probe() and the
memory_failure()/AER calls if you'd prefer.
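
To illustrate the IS_ENABLED() option, here is a minimal userspace sketch of
the pattern. It is not the actual ghes.c change: the SK_* macros are a
simplified stand-in for the kernel's IS_ENABLED() (the real one also handles
=m), and the CONFIG_SKETCH_* symbols and sketch_probe() are hypothetical names.

```c
/*
 * Simplified model of the preprocessor trick behind IS_ENABLED():
 * an option defined to 1 expands to 1, an undefined option to 0.
 */
#define SK_PLACEHOLDER_1 0,
#define SK_SECOND(ignored, val, ...) val
#define SK_IS_DEFINED(x) SK_IS_DEFINED_(x)
#define SK_IS_DEFINED_(val) SK_IS_DEFINED__(SK_PLACEHOLDER_##val)
#define SK_IS_DEFINED__(arg_or_junk) SK_SECOND(arg_or_junk 1, 0, 0)
#define SK_IS_ENABLED(option) SK_IS_DEFINED(option)

/* Hypothetical configuration: SEA selected, NMI left undefined. */
#define CONFIG_SKETCH_NOTIFY_SEA 1

/*
 * Both branches are parsed and type-checked on every build; the
 * compiler is free to discard the one whose condition is constant 0.
 * Returns a bitmask of the notifications that were "registered".
 */
int sketch_probe(void)
{
	int registered = 0;

	if (SK_IS_ENABLED(CONFIG_SKETCH_NOTIFY_SEA))
		registered |= 1;	/* stand-in for SEA registration */

	if (SK_IS_ENABLED(CONFIG_SKETCH_NOTIFY_NMI))
		registered |= 2;	/* stand-in for NMI registration */

	return registered;
}
```

Compared with #ifdef/#else, a disabled path still gets compile coverage,
which is the main attraction of doing it this way.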


Thanks,

James

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v6 00/18] APEI in_nmi() rework
  2018-09-25 12:45   ` Borislav Petkov
  (?)
@ 2018-10-03 17:50     ` James Morse
  -1 siblings, 0 replies; 123+ messages in thread
From: James Morse @ 2018-10-03 17:50 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, linux-mm,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-acpi, Punit Agrawal, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

Hi Boris,

On 25/09/18 13:45, Borislav Petkov wrote:
> On Fri, Sep 21, 2018 at 11:16:47PM +0100, James Morse wrote:
>> Hello,
>>
>> The GHES driver has collected quite a few bugs:
>>
>> ghes_proc() at ghes_probe() time can be interrupted by an NMI that
>> will clobber the ghes->estatus fields, flags, and the buffer_paddr.
>>
>> ghes_copy_tofrom_phys() uses in_nmi() to decide which path to take. arm64's
>> SEA taking both paths, depending on what it interrupted.
>>
>> There is no guarantee that queued memory_failure() errors will be processed
>> before this CPU returns to user-space.
>>
>> x86 can't TLBI from interrupt-masked code which this driver does all the
>> time.
>>
>>
>> This series aims to fix the first three, with an eye to fixing the
>> last one with a follow-up series.
>>
>> Previous postings included the SDEI notification calls, which I haven't
>> finished re-testing. This series is big enough as it is.

> Yeah, and everywhere I look, this thing looks overengineered. Like,
> for example, what's the purpose of this ghes_esource_prealloc_size()
> computing a size each time the pool changes size?

The size to grow the pool by, because each error-source described by a GHES
entry has its own worst-case size.

Today ghes_nmi_add() does this each time it's called. You could have multiple
GHES entries in the HEST that describe NMI as the notification. The worst-case
size for the records is described in the GHES entry, and could be different for
each one (error_block_length and records_to_preallocate; see Table 18-379 of
ACPI v6.2).

These different error-sources could be delivered on different CPUs at the same
time, so they need their own pre-allocated reserved memory. ghes_notify_nmi()'s
atomic_add_unless() suggests this can happen on x86, but I don't know the
arch-specifics. It definitely can happen on arm64.
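
A rough userspace sketch of the per-entry sizing described above, assuming
the 64K cap applies both to a single block and to the per-source total. The
struct, the SKETCH_MAX_SIZE constant and the function names are hypothetical;
only error_block_length and records_to_preallocate come from the GHES entry.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_SIZE (64u * 1024u)	/* assumed 64K cap */

/* Hypothetical cut-down view of one GHES entry in the HEST. */
struct sketch_ghes_entry {
	uint32_t error_block_length;
	uint32_t records_to_preallocate;
};

/* Worst-case bytes to reserve for one error source. */
uint64_t sketch_prealloc_size(const struct sketch_ghes_entry *g)
{
	uint64_t block = g->error_block_length;
	uint64_t records = g->records_to_preallocate;
	uint64_t size;

	if (block > SKETCH_MAX_SIZE)
		block = SKETCH_MAX_SIZE;
	if (records < 1)
		records = 1;		/* always room for one record */

	size = block * records;
	if (size > SKETCH_MAX_SIZE)
		size = SKETCH_MAX_SIZE;

	return size;
}

/* Each source adds its own share, so the pool grows per entry. */
uint64_t sketch_pool_total(const struct sketch_ghes_entry *e, size_t n)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += sketch_prealloc_size(&e[i]);

	return total;
}
```

The point is only that the amount differs per entry, which is why the pool
is grown as each GHES device is registered.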


> AFAICT, this size can be computed exactly *once* at driver init and be
> done with it. Right?

We could do two passes of the HEST to pre-compute the total size of this
estatus-queue memory, allocate it, then do the notification registration stuff.
But this doesn't really work with the way this driver acts as platform-driver
for a ghes device...

The non-ghes HEST entries have a "number of records to pre-allocate" too, we
could make this memory pool something hest.c looks after, but I can't see if the
other error sources use those values.

Hmmm,
The size is capped to 64K, we could ignore the firmware description of the
memory requirements, and allocate SZ_64K each time. Doing it per-GHES is still
the only way to avoid allocating nmi-safe memory for irqs.


Thanks,

James

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v6 00/18] APEI in_nmi() rework
  2018-10-03 17:50     ` James Morse
  (?)
@ 2018-10-04 15:15       ` Borislav Petkov
  -1 siblings, 0 replies; 123+ messages in thread
From: Borislav Petkov @ 2018-10-04 15:15 UTC (permalink / raw)
  To: James Morse
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, linux-mm,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-acpi, Punit Agrawal, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Wed, Oct 03, 2018 at 06:50:38PM +0100, James Morse wrote:

...

> The non-ghes HEST entries have a "number of records to pre-allocate" too, we
> could make this memory pool something hest.c looks after, but I can't see if the
> other error sources use those values.

Thanks for the detailed analysis!

> Hmmm, The size is capped to 64K, we could ignore the firmware description of the
> memory requirements, and allocate SZ_64K each time. Doing it per-GHES is still
> the only way to avoid allocating nmi-safe memory for irqs.

Right, so I'm thinking a lot simpler: allocate a pool which should
be large enough to handle all situations and drop all that logic
which recomputes and reallocates pool size. Just a static thing which
JustWorks(tm).

For a couple of reasons:

 - you state it above: all those synchronization issues are gone with a
 preallocated pool

 - 64K per-GHES pool is nothing if you consider the machines this thing
 runs on - fat servers with lotsa memory. And RAS there *is* important.
 And TBH 64K is nothing even on a small client sporting gigabytes of
 memory.

 - code is a lot simpler and cleaner - you don't need all that pool
 expanding and shrinking. I mean, I'm all for smarter solutions if they
 have any clear advantages warranting the complication but this is a
 lot of machinery just so that we can save a couple of KBs. Which, as a
 whole, sounds just too much to me.

But this is just me.
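
As a rough illustration of the "static thing" idea, here is a userspace
sketch using C11 atomics. It is not the kernel's pool code: the slot size,
slot count and sketch_* names are all assumptions chosen so that the pool
adds up to 64K.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_SLOT_SIZE 1024u
#define SKETCH_NR_SLOTS    64u	/* 64 * 1K = 64K, sized exactly once */

/* Storage and ownership flags live in static memory: no resizing. */
static unsigned char sketch_pool[SKETCH_NR_SLOTS][SKETCH_SLOT_SIZE];
static atomic_bool sketch_busy[SKETCH_NR_SLOTS];

/* Claim a free slot without taking a lock; NULL when the pool is full. */
void *sketch_alloc(void)
{
	size_t i;

	for (i = 0; i < SKETCH_NR_SLOTS; i++) {
		/* exchange returns the old value: false means we own it */
		if (!atomic_exchange(&sketch_busy[i], true))
			return sketch_pool[i];
	}

	return NULL;
}

/* Hand a slot back; p must have come from sketch_alloc(). */
void sketch_free(void *p)
{
	size_t i = (size_t)((unsigned char *)p - &sketch_pool[0][0]) /
		   SKETCH_SLOT_SIZE;

	atomic_store(&sketch_busy[i], false);
}
```

Nothing is ever added to or removed from the pool after init, so there is
no expand/shrink path left to get wrong.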

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v6 05/18] ACPI / APEI: Make estatus queue a Kconfig symbol
  2018-10-03 17:50       ` James Morse
  (?)
@ 2018-10-04 17:34         ` Borislav Petkov
  -1 siblings, 0 replies; 123+ messages in thread
From: Borislav Petkov @ 2018-10-04 17:34 UTC (permalink / raw)
  To: James Morse
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, linux-mm,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-acpi, Punit Agrawal, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Wed, Oct 03, 2018 at 06:50:36PM +0100, James Morse wrote:
> I'm all in favour of letting the compiler work it out, but the existing ghes
> code has #ifdef/#else all over the place. This is 'keeping the style'.

Yeah, but this "style" is not the optimal one and we should
simplify/clean up and fix this thing.

Swapping the order of your statements here:

> The ACPI spec has four ~NMI notifications, so far the support for
> these in Linux has been selectable separately.

Yes, but: distro kernels end up enabling all those options anyway and
distro kernels are 90-ish% of the setups. Which means, this will get
enabled anyway and this additional Kconfig symbol is simply going to be
one automatic reply "Yes".

So let's build it in by default and if someone complains about it, we
can always carve it out. But right now I don't see the need for the
unnecessary separation...

Thx.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v6 06/18] KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing
  2018-09-21 22:16   ` James Morse
  (?)
@ 2018-10-12  9:57     ` Borislav Petkov
  -1 siblings, 0 replies; 123+ messages in thread
From: Borislav Petkov @ 2018-10-12  9:57 UTC (permalink / raw)
  To: James Morse
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, linux-mm,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-acpi, Punit Agrawal, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Fri, Sep 21, 2018 at 11:16:53PM +0100, James Morse wrote:
> To split up APEIs in_nmi() path, we need any nmi-like callers to always
> be in_nmi(). KVM shouldn't have to know about this, pull the RAS plumbing
> out into a header file.
> 
> Currently guest synchronous external aborts are claimed as RAS
> notifications by handle_guest_sea(), which is hidden in the arch codes
> mm/fault.c. 32bit gets a dummy declaration in system_misc.h.
> 
> There is going to be more of this in the future if/when we support
> the SError-based firmware-first notification mechanism and/or
> kernel-first notifications for both synchronous external abort and
> SError. Each of these will come with some Kconfig symbols and a
> handful of header files.
> 
> Create a header file for all this.
> 
> This patch gives handle_guest_sea() a 'kvm_' prefix, and moves the
> declarations to kvm_ras.h as preparation for a future patch that moves
> the ACPI-specific RAS code out of mm/fault.c.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Punit Agrawal <punit.agrawal@arm.com>
> Acked-by: Marc Zyngier <marc.zyngier@arm.com>
> Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
> ---
>  arch/arm/include/asm/kvm_ras.h       | 14 ++++++++++++++
>  arch/arm/include/asm/system_misc.h   |  5 -----
>  arch/arm64/include/asm/kvm_ras.h     | 11 +++++++++++
>  arch/arm64/include/asm/system_misc.h |  2 --
>  arch/arm64/mm/fault.c                |  2 +-
>  virt/kvm/arm/mmu.c                   |  4 ++--
>  6 files changed, 28 insertions(+), 10 deletions(-)
>  create mode 100644 arch/arm/include/asm/kvm_ras.h
>  create mode 100644 arch/arm64/include/asm/kvm_ras.h
> 
> diff --git a/arch/arm/include/asm/kvm_ras.h b/arch/arm/include/asm/kvm_ras.h
> new file mode 100644
> index 000000000000..aaff56bf338f
> --- /dev/null
> +++ b/arch/arm/include/asm/kvm_ras.h
> @@ -0,0 +1,14 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Copyright (C) 2018 - Arm Ltd

checkpatch is complaining for some reason:

WARNING: Missing or malformed SPDX-License-Identifier tag in line 1
#66: FILE: arch/arm/include/asm/kvm_ras.h:1:
+// SPDX-License-Identifier: GPL-2.0
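
One possible explanation, going by Documentation/process/license-rules.rst:
C headers are expected to carry the SPDX tag in a /* */ comment, while the //
form is for .c source files. If that is what checkpatch is tripping over, the
top of the header would look something like this (the comment style of the
copyright line is an assumption):

```c
/* SPDX-License-Identifier: GPL-2.0 */
/* Copyright (C) 2018 - Arm Ltd */
```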

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v6 07/18] arm64: KVM/mm: Move SEA handling behind a single 'claim' interface
  2018-09-21 22:16   ` James Morse
  (?)
@ 2018-10-12 10:02     ` Borislav Petkov
  -1 siblings, 0 replies; 123+ messages in thread
From: Borislav Petkov @ 2018-10-12 10:02 UTC (permalink / raw)
  To: James Morse
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, linux-mm,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-acpi, Punit Agrawal, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Fri, Sep 21, 2018 at 11:16:54PM +0100, James Morse wrote:
> To split up APEI's in_nmi() path, we need the nmi-like callers to always
> be in_nmi(). Add a helper to do the work and claim the notification.
> 
> When KVM or the arch code takes an exception that might be a RAS
> notification, it asks the APEI firmware-first code whether it wants
> to claim the exception. We can then go on to see if (a future)
> kernel-first mechanism wants to claim the notification, before
> falling through to the existing default behaviour.
> 
> The NOTIFY_SEA code was merged before we had multiple, possibly
> interacting, NMI-like notifications and the need to consider kernel
> first in the future. Make the 'claiming' behaviour explicit.
> 
> As we're restructuring the APEI code to allow multiple NMI-like
> notifications, any notification that might interrupt interrupts-masked
> code must always be wrapped in nmi_enter()/nmi_exit(). This allows APEI
> to use in_nmi() to use the right fixmap entries.
> 
> We mask SError over this window to prevent an asynchronous RAS error
> arriving and tripping 'nmi_enter()'s BUG_ON(in_nmi()).
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> Acked-by: Marc Zyngier <marc.zyngier@arm.com>
> Tested-by: Tyler Baicar <tbaicar@codeaurora.org>

...

> diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
> index ed46dc188b22..a9b8bba014b5 100644
> --- a/arch/arm64/kernel/acpi.c
> +++ b/arch/arm64/kernel/acpi.c
> @@ -28,8 +28,10 @@
>  #include <linux/smp.h>
>  #include <linux/serial_core.h>
>  
> +#include <acpi/ghes.h>
>  #include <asm/cputype.h>
>  #include <asm/cpu_ops.h>
> +#include <asm/daifflags.h>
>  #include <asm/pgtable.h>
>  #include <asm/smp_plat.h>
>  
> @@ -257,3 +259,30 @@ pgprot_t __acpi_get_mem_attribute(phys_addr_t addr)
>  		return __pgprot(PROT_NORMAL_NC);
>  	return __pgprot(PROT_DEVICE_nGnRnE);
>  }
> +
> +/*
> + * Claim Synchronous External Aborts as a firmware first notification.
> + *
> + * Used by KVM and the arch do_sea handler.
> + * @regs may be NULL when called from process context.
> + */
> +int apei_claim_sea(struct pt_regs *regs)
> +{
> +	int err = -ENOENT;
> +	unsigned long current_flags = arch_local_save_flags();
> +
> +	if (!IS_ENABLED(CONFIG_ACPI_APEI_SEA))
> +		return err;

I don't know what side effects arch_local_save_flags() has on ARM but if
we return here, it looks to me like useless work.
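
One way to avoid it would be to do the config check first and only read the
flags afterwards. A rough userspace sketch of that ordering; IS_ENABLED_SEA,
fake_save_flags() and claim_sea_sketch() are stand-ins invented for this
example, not kernel code:

```c
#include <errno.h>

/* Stand-in for IS_ENABLED(CONFIG_ACPI_APEI_SEA); 0 models the option being off. */
#ifndef IS_ENABLED_SEA
#define IS_ENABLED_SEA 0
#endif

/* Counts how often the mock flags read runs, so the ordering is observable. */
static int save_flags_calls;

/* Mock of arch_local_save_flags(): returns a dummy value. */
static unsigned long fake_save_flags(void)
{
	save_flags_calls++;
	return 0;
}

/* Check the config option first, read the flags only when they are needed. */
static int claim_sea_sketch(void)
{
	unsigned long current_flags;

	if (!IS_ENABLED_SEA)
		return -ENOENT;

	current_flags = fake_save_flags();
	(void)current_flags;	/* the real function would mask/restore around the GHES call */

	return 0;
}
```

With the option off, the early return is taken and the flags are never read.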

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v6 08/18] ACPI / APEI: Move locking to the notification helper
  2018-09-21 22:16   ` James Morse
  (?)
@ 2018-10-12 11:08     ` Borislav Petkov
  -1 siblings, 0 replies; 123+ messages in thread
From: Borislav Petkov @ 2018-10-12 11:08 UTC (permalink / raw)
  To: James Morse
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, linux-mm,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-acpi, Punit Agrawal, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Fri, Sep 21, 2018 at 11:16:55PM +0100, James Morse wrote:
> ghes_copy_tofrom_phys() takes different locks depending on in_nmi().
> This doesn't work when we have multiple NMI-like notifications that
> can interrupt each other.
> 
> Now that NOTIFY_SEA is always called as an NMI, move the lock-taking
> to the notification helper. The helper will always know which lock
> to take. This avoids ghes_copy_tofrom_phys() taking a guess based
> on in_nmi().
> 
> This splits NOTIFY_NMI and NOTIFY_SEA to use different locks. All
> the other notifications use ghes_proc(), and are called in process
> or IRQ context. Move the spin_lock_irqsave() around their ghes_proc()
> calls.

Right, should ghes_proc() be renamed to ghes_proc_irq() now, to be
absolutely clear on the processing context it is operating in?

Other than that:

Reviewed-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v6 09/18] ACPI / APEI: Let the notification helper specify the fixmap slot
  2018-09-21 22:16   ` James Morse
  (?)
@ 2018-10-12 11:14     ` Borislav Petkov
  -1 siblings, 0 replies; 123+ messages in thread
From: Borislav Petkov @ 2018-10-12 11:14 UTC (permalink / raw)
  To: James Morse
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, linux-mm,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-acpi, Punit Agrawal, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Fri, Sep 21, 2018 at 11:16:56PM +0100, James Morse wrote:
> ghes_copy_tofrom_phys() uses a different fixmap slot depending on in_nmi().
> This doesn't work when we have multiple NMI-like notifications that
> can interrupt each other.
> 
> As with the locking, move the chosen fixmap_idx to the notification helper.
> This only matters for NMI-like notifications; anything calling
> ghes_proc() can use the IRQ fixmap slot as it's already holding an irqsave
> spinlock.
> 
> This lets us collapse the ghes_ioremap_pfn_*() helpers.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> 
> The fixmap-idx and vaddr are passed back to ghes_unmap()
> to allow ioremap() to be used in process context in the
> future.
> ---
>  drivers/acpi/apei/ghes.c | 76 ++++++++++++++--------------------------
>  1 file changed, 27 insertions(+), 49 deletions(-)

Nice.

Reviewed-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v6 10/18] ACPI / APEI: preparatory split of ghes->estatus
  2018-09-21 22:16   ` James Morse
  (?)
@ 2018-10-12 16:37     ` Borislav Petkov
  -1 siblings, 0 replies; 123+ messages in thread
From: Borislav Petkov @ 2018-10-12 16:37 UTC (permalink / raw)
  To: James Morse
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, linux-mm,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-acpi, Punit Agrawal, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

Nitpick:

Subject: Re: [PATCH v6 10/18] ACPI / APEI: preparatory split of ghes->estatus

Pls have an active formulation in your Subject and start it with a capital
letter, i.e., something like:

	"Split ghes->estatus in preparation for... "

On Fri, Sep 21, 2018 at 11:16:57PM +0100, James Morse wrote:
> The NMI-like notifications scribble over ghes->estatus, before
> copying it somewhere else. If this interrupts the ghes_probe() code
> calling ghes_proc() on each struct ghes, the data is corrupted.
> 
> We want the NMI-like notifications to use a queued estatus entry

Pls formulate commit messages in passive voice.

> from the beginning. To that end, break up any use of "ghes->estatus"
> so that all functions take the estatus as an argument.
> 
> This patch is just moving code around, no change in behaviour.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
>  drivers/acpi/apei/ghes.c | 82 ++++++++++++++++++++++------------------
>  1 file changed, 45 insertions(+), 37 deletions(-)
> 
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index adf7fd402813..586689cbc0fd 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -298,7 +298,9 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
>  	}
>  }
>  
> -static int ghes_read_estatus(struct ghes *ghes, int silent, int fixmap_idx)
> +static int ghes_read_estatus(struct ghes *ghes,
> +			     struct acpi_hest_generic_status *estatus,

acpi_hest_generic_status - geez, could this name have been any longer ?!

> +			     int silent, int fixmap_idx)
>  {
>  	struct acpi_hest_generic *g = ghes->generic;
>  	u64 buf_paddr;
> @@ -316,26 +318,26 @@ static int ghes_read_estatus(struct ghes *ghes, int silent, int fixmap_idx)
>  	if (!buf_paddr)
>  		return -ENOENT;
>  
> -	ghes_copy_tofrom_phys(ghes->estatus, buf_paddr,
> -			      sizeof(*ghes->estatus), 1, fixmap_idx);
> -	if (!ghes->estatus->block_status)
> +	ghes_copy_tofrom_phys(estatus, buf_paddr,
> +			      sizeof(*estatus), 1, fixmap_idx);

Yeah, let that line stick out - it is easier to follow the code this
way.

> +	if (!estatus->block_status)
>  		return -ENOENT;
>  
>  	ghes->buffer_paddr = buf_paddr;
>  	ghes->flags |= GHES_TO_CLEAR;
>  
>  	rc = -EIO;
> -	len = cper_estatus_len(ghes->estatus);
> -	if (len < sizeof(*ghes->estatus))
> +	len = cper_estatus_len(estatus);
> +	if (len < sizeof(*estatus))
>  		goto err_read_block;
>  	if (len > ghes->generic->error_block_length)
>  		goto err_read_block;
> -	if (cper_estatus_check_header(ghes->estatus))
> +	if (cper_estatus_check_header(estatus))
>  		goto err_read_block;
> -	ghes_copy_tofrom_phys(ghes->estatus + 1,
> -			      buf_paddr + sizeof(*ghes->estatus),
> -			      len - sizeof(*ghes->estatus), 1, fixmap_idx);
> -	if (cper_estatus_check(ghes->estatus))
> +	ghes_copy_tofrom_phys(estatus + 1,
> +			      buf_paddr + sizeof(*estatus),
> +			      len - sizeof(*estatus), 1, fixmap_idx);
> +	if (cper_estatus_check(estatus))
>  		goto err_read_block;
>  	rc = 0;
>  
> @@ -346,13 +348,15 @@ static int ghes_read_estatus(struct ghes *ghes, int silent, int fixmap_idx)
>  	return rc;
>  }
>  
> -static void ghes_clear_estatus(struct ghes *ghes, int fixmap_idx)
> +static void ghes_clear_estatus(struct ghes *ghes,
> +			       struct acpi_hest_generic_status *estatus,
> +			       int fixmap_idx)
>  {
> -	ghes->estatus->block_status = 0;
> +	estatus->block_status = 0;
>  	if (!(ghes->flags & GHES_TO_CLEAR))
>  		return;

<---- newline here.

> -	ghes_copy_tofrom_phys(ghes->estatus, ghes->buffer_paddr,
> -			      sizeof(ghes->estatus->block_status), 0, fixmap_idx);
> +	ghes_copy_tofrom_phys(estatus, ghes->buffer_paddr,
> +			      sizeof(estatus->block_status), 0, fixmap_idx);
>  	ghes->flags &= ~GHES_TO_CLEAR;
>  }
>  
> @@ -518,9 +522,10 @@ static int ghes_print_estatus(const char *pfx,
>  	return 0;
>  }
>  
> -static void __ghes_panic(struct ghes *ghes)
> +static void __ghes_panic(struct ghes *ghes,
> +			 struct acpi_hest_generic_status *estatus)

Yeah, let that one stick out too. That struct naming needs slimming.

>  {
> -	__ghes_print_estatus(KERN_EMERG, ghes->generic, ghes->estatus);
> +	__ghes_print_estatus(KERN_EMERG, ghes->generic, estatus);
>  
>  	/* reboot to log the error! */
>  	if (!panic_timeout)
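
To illustrate the point of the split: once the helpers take the estatus as an
argument, an NMI-like caller can hand in its own buffer and the shared one in
struct ghes is left alone. A small userspace sketch, with all struct and
function names invented for the example:

```c
#include <string.h>

/* Cut-down stand-in for struct acpi_hest_generic_status. */
struct estatus_sketch {
	unsigned int block_status;
};

/* Cut-down stand-in for struct ghes: one buffer shared by every caller. */
struct ghes_sketch {
	struct estatus_sketch shared;
	struct estatus_sketch firmware;	/* models the firmware-owned record */
};

/* Copy the record into a buffer chosen by the caller, not into ghes->shared. */
static int read_estatus_sketch(struct ghes_sketch *ghes,
			       struct estatus_sketch *estatus)
{
	memcpy(estatus, &ghes->firmware, sizeof(*estatus));
	if (!estatus->block_status)
		return -1;	/* nothing to report */
	return 0;
}
```

A caller that passes a local struct estatus_sketch gets the record without
touching the shared buffer, which is what an interrupted ghes_proc() needs.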

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v6 11/18] ACPI / APEI: Remove silent flag from ghes_read_estatus()
  2018-09-21 22:16   ` James Morse
  (?)
@ 2018-10-12 16:55     ` Borislav Petkov
  -1 siblings, 0 replies; 123+ messages in thread
From: Borislav Petkov @ 2018-10-12 16:55 UTC (permalink / raw)
  To: James Morse
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, linux-mm,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-acpi, Punit Agrawal, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Fri, Sep 21, 2018 at 11:16:58PM +0100, James Morse wrote:
> Subsequent patches will split up ghes_read_estatus(), at which
> point passing around the 'silent' flag gets annoying. This is to
> suppress printk() messages, which prior to 42a0bb3f7138 ("printk/nmi:
> generic solution for safe printk in NMI"), were unsafe in NMI context.

Put that commit onto a separate line:

"... which prior to

  42a0bb3f7138 ("printk/nmi: generic solution for safe printk in NMI")

were unsafe ..."

This way it is immediately visible.

In any case, this patch looks like a cleanup so move it to the beginning
of the queue, I'd say.

> We don't need to do this anymore, remove the flag. printk() messages
> are batched in a per-cpu buffer and printed via irq-work, or a call
> back from panic().
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
>  drivers/acpi/apei/ghes.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index 586689cbc0fd..ba5344d26a39 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -300,7 +300,7 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
>  
>  static int ghes_read_estatus(struct ghes *ghes,
>  			     struct acpi_hest_generic_status *estatus,
> -			     int silent, int fixmap_idx)
> +			     int fixmap_idx)
>  {
>  	struct acpi_hest_generic *g = ghes->generic;
>  	u64 buf_paddr;
> @@ -309,7 +309,7 @@ static int ghes_read_estatus(struct ghes *ghes,
>  
>  	rc = apei_read(&buf_paddr, &g->error_status_address);
>  	if (rc) {
> -		if (!silent && printk_ratelimit())
> +		if (printk_ratelimit())

Btw, checkpatch complains here:

WARNING: Prefer printk_ratelimited or pr_<level>_ratelimited to printk_ratelimit
#57: FILE: drivers/acpi/apei/ghes.c:312:
+               if (printk_ratelimit())

WARNING: Prefer printk_ratelimited or pr_<level>_ratelimited to printk_ratelimit
#66: FILE: drivers/acpi/apei/ghes.c:345:
+       if (rc && printk_ratelimit())
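
For context on the warning above: in the kernel, printk_ratelimit() shares one global rate-limit state between all of its callers, while printk_ratelimited() and the pr_<level>_ratelimited() helpers keep a separate static state per call site. The following is a minimal userspace sketch of that difference, not kernel code. Every sketch_* name is hypothetical, and the limiter is simplified to a fixed message budget with no time interval.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, simplified limiter: a fixed budget of messages, no timer. */
struct sketch_ratelimit {
	int burst;	/* how many messages may be emitted */
	int printed;	/* how many have been emitted so far */
};

static bool sketch_ratelimit_ok(struct sketch_ratelimit *rs)
{
	if (rs->printed >= rs->burst)
		return false;
	rs->printed++;
	return true;
}

static int sketch_emitted;	/* messages that actually got out */

/* Old pattern: every caller draws from the same global budget. */
static struct sketch_ratelimit sketch_global_rs = { .burst = 2 };

static void sketch_old_site_a(int rc)
{
	if (sketch_ratelimit_ok(&sketch_global_rs)) {
		sketch_emitted++;
		fprintf(stderr, "site A: read failed: %d\n", rc);
	}
}

static void sketch_old_site_b(int rc)
{
	if (sketch_ratelimit_ok(&sketch_global_rs)) {
		sketch_emitted++;
		fprintf(stderr, "site B: check failed: %d\n", rc);
	}
}

/* New pattern: each macro expansion owns its own static budget. */
#define sketch_warn_ratelimited(...)					\
	do {								\
		static struct sketch_ratelimit rs_ = { .burst = 2 };	\
		if (sketch_ratelimit_ok(&rs_)) {			\
			sketch_emitted++;				\
			fprintf(stderr, __VA_ARGS__);			\
		}							\
	} while (0)

static void sketch_new_site_a(int rc)
{
	sketch_warn_ratelimited("site A: read failed: %d\n", rc);
}

static void sketch_new_site_b(int rc)
{
	sketch_warn_ratelimited("site B: check failed: %d\n", rc);
}
```

With the shared state, a noisy call site can use up the budget and silence an unrelated one. With per-call-site state each message is limited independently, which is presumably why checkpatch prefers the second form.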


-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v6 13/18] ACPI / APEI: Don't update struct ghes' flags in read/clear estatus
  2018-09-21 22:17   ` James Morse
  (?)
@ 2018-10-12 17:14     ` Borislav Petkov
  -1 siblings, 0 replies; 123+ messages in thread
From: Borislav Petkov @ 2018-10-12 17:14 UTC (permalink / raw)
  To: James Morse
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, linux-mm,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-acpi, Punit Agrawal, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Fri, Sep 21, 2018 at 11:17:00PM +0100, James Morse wrote:
> ghes_read_estatus() sets a flag in struct ghes if the buffer of
> CPER records needs to be cleared once the records have been
> processed. This global flags value is a problem if a struct ghes
> can be processed concurrently, as happens at probe time if an
> NMI arrives for the same error source.
> 
> The GHES_TO_CLEAR flag was only set at the same time as
> buffer_paddr, which is now owned by the caller and passed to
> ghes_clear_estatus(). Use this as the flag.
> 
> A non-zero buf_paddr returned by ghes_read_estatus() means
> ghes_clear_estatus() will clear this address. ghes_read_estatus()
> already checks for a read of error_status_address being zero,
> so we can never get CPER records written at zero.
> 
> After this ghes_clear_estatus() no longer needs the struct ghes.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
>  drivers/acpi/apei/ghes.c | 26 ++++++++++++--------------
>  include/acpi/ghes.h      |  1 -
>  2 files changed, 12 insertions(+), 15 deletions(-)

Nice.

Reviewed-by: Borislav Petkov <bp@suse.de>
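
The idea in the quoted commit message, using a non-zero buf_paddr owned by the caller in place of a GHES_TO_CLEAR bit stored in the shared struct ghes, can be sketched in userspace roughly as follows. This is an illustration under stated assumptions, not the patch itself: the sketch_* names, the fake memory array and the one-field status struct are all hypothetical stand-ins.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MEM_SIZE 64

/* Fake "physical memory". Address 0 never holds a record in this sketch. */
static uint8_t sketch_mem[SKETCH_MEM_SIZE];

/* Hypothetical stand-in for the start of the generic status block. */
struct sketch_estatus {
	uint32_t block_status;
};

/*
 * Read the status block at status_addr into the caller's buffer.
 * On success *buf_paddr is the non-zero address that needs clearing
 * later. On failure it is 0, which tells the clear step to do nothing.
 */
static int sketch_read_estatus(uint64_t status_addr,
			       struct sketch_estatus *estatus,
			       uint64_t *buf_paddr)
{
	*buf_paddr = 0;

	if (!status_addr || status_addr + sizeof(*estatus) > SKETCH_MEM_SIZE)
		return -ENOENT;

	memcpy(estatus, &sketch_mem[status_addr], sizeof(*estatus));
	if (!estatus->block_status)
		return -ENOENT;

	*buf_paddr = status_addr;
	return 0;
}

/* No shared flags word: the address itself says whether to write back. */
static void sketch_clear_estatus(struct sketch_estatus *estatus,
				 uint64_t buf_paddr)
{
	estatus->block_status = 0;
	if (!buf_paddr)
		return;

	memcpy(&sketch_mem[buf_paddr], &estatus->block_status,
	       sizeof(estatus->block_status));
}
```

Because both the status buffer and the address live with the caller, a second context that interrupts the first works on its own copies and cannot clobber a flag that the first one still depends on.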

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v6 05/18] ACPI / APEI: Make estatus queue a Kconfig symbol
  2018-10-04 17:34         ` Borislav Petkov
  (?)
@ 2018-10-12 17:17           ` James Morse
  -1 siblings, 0 replies; 123+ messages in thread
From: James Morse @ 2018-10-12 17:17 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, linux-mm,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-acpi, Punit Agrawal, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

Hi Boris,

On 04/10/2018 18:34, Borislav Petkov wrote:
> On Wed, Oct 03, 2018 at 06:50:36PM +0100, James Morse wrote:
>> I'm all in favour of letting the compiler work it out, but the existing ghes
>> code has #ifdef/#else all over the place. This is 'keeping the style'.
> 
> Yeah, but this "style" is not the optimal one and we should
> simplify/clean up and fix this thing.
> 
> Swapping the order of your statements here:
> 
>> The ACPI spec has four ~NMI notifications, so far the support for
>> these in Linux has been selectable separately.
> 
> Yes, but: distro kernels end up enabling all those options anyway and
> distro kernels are 90-ish% of the setups. Which means, this will get
> enabled anyway and this additional Kconfig symbol is simply going to be
> one automatic reply "Yes".
> 
> So let's build it in by default and if someone complains about it, we
> can always carve it out. But right now I don't see the need for the
> unnecessary separation...

Ripping out the existing #ifdefs and replacing them with IS_ENABLED() would let
the compiler work out the estatus stuff is unused, and saves us describing the
what-uses-it logic in Kconfig.

But this does expose the x86 nmi stuff on arm64, which doesn't build today.
Dragging NMI_HANDLED and friends up to the 'linux' header causes a fair amount
of noise under arch/x86 (include the new header in 22 files). Adding dummy
declarations to arm64 fixes this, and doesn't affect the other architectures
that have an asm/nmi.h
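
To make the trade-off above concrete, here is a minimal userspace sketch of the IS_ENABLED() pattern. It is an illustration only: the SKETCH_/sketch_ names and the two CONFIG_SKETCH_* options are hypothetical, and the macro is a simplified re-creation of the kernel's preprocessor trick that only handles "defined as 1" versus "not defined" (the real macro also covers =m).

```c
/*
 * If 'option' is defined as 1, the token paste below produces
 * SKETCH_PLACEHOLDER_1, which expands to "0," and shifts a literal 1
 * into the second argument position. If 'option' is undefined, the
 * paste produces an unknown identifier and the second argument is 0.
 */
#define SKETCH_PLACEHOLDER_1 0,
#define sketch_second_arg(ignored, val, ...) val
#define sketch_is_defined(x) sketch_is_defined_2(x)
#define sketch_is_defined_2(val) sketch_is_defined_3(SKETCH_PLACEHOLDER_##val)
#define sketch_is_defined_3(arg1_or_junk) \
	sketch_second_arg(arg1_or_junk 1, 0, 0)
#define SKETCH_IS_ENABLED(option) sketch_is_defined(option)

#define CONFIG_SKETCH_ESTATUS_QUEUE 1
/* CONFIG_SKETCH_X86_NMI is deliberately not defined. */

static int sketch_registered;

static void sketch_estatus_queue_init(void)
{
	sketch_registered++;
}

/*
 * This must be declared even though its option is off: the branch that
 * calls it is still parsed and type-checked, then dropped as dead code.
 */
static int sketch_register_nmi_handler(void)
{
	sketch_registered++;
	return 0;
}

static int sketch_probe(void)
{
	if (SKETCH_IS_ENABLED(CONFIG_SKETCH_ESTATUS_QUEUE))
		sketch_estatus_queue_init();

	if (SKETCH_IS_ENABLED(CONFIG_SKETCH_X86_NMI))
		sketch_register_nmi_handler();

	return sketch_registered;
}
```

Unlike an #ifdef block, the disabled branch has to compile on every architecture. That is the reason the x86 NMI symbols would need either a common header or dummy declarations on arm64.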

Alternatively we could leave {un,}register_nmi_handler() under
CONFIG_HAVE_ACPI_APEI_NMI. I think we need to keep the NOTIFY_NMI kconfig symbol
around, as it's one of the two I can't work out how to fix without the TLBI-IPI.


Thanks,

James

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v6 06/18] KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing
  2018-10-12  9:57     ` Borislav Petkov
  (?)
@ 2018-10-12 17:18       ` James Morse
  -1 siblings, 0 replies; 123+ messages in thread
From: James Morse @ 2018-10-12 17:18 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, linux-mm,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-acpi, Punit Agrawal, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

Hi Boris,

On 12/10/2018 10:57, Borislav Petkov wrote:
> On Fri, Sep 21, 2018 at 11:16:53PM +0100, James Morse wrote:
>> To split up APEI's in_nmi() path, we need any nmi-like callers to always
>> be in_nmi(). KVM shouldn't have to know about this, pull the RAS plumbing
>> out into a header file.
>>
>> Currently guest synchronous external aborts are claimed as RAS
>> notifications by handle_guest_sea(), which is hidden in the arch code's
>> mm/fault.c. 32bit gets a dummy declaration in system_misc.h.
>>
>> There is going to be more of this in the future if/when we support
>> the SError-based firmware-first notification mechanism and/or
>> kernel-first notifications for both synchronous external abort and
>> SError. Each of these will come with some Kconfig symbols and a
>> handful of header files.
>>
>> Create a header file for all this.
>>
>> This patch gives handle_guest_sea() a 'kvm_' prefix, and moves the
>> declarations to kvm_ras.h as preparation for a future patch that moves
>> the ACPI-specific RAS code out of mm/fault.c.

>> diff --git a/arch/arm/include/asm/kvm_ras.h b/arch/arm/include/asm/kvm_ras.h
>> new file mode 100644
>> index 000000000000..aaff56bf338f
>> --- /dev/null
>> +++ b/arch/arm/include/asm/kvm_ras.h
>> @@ -0,0 +1,14 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +// Copyright (C) 2018 - Arm Ltd
> 
> checkpatch is complaining for some reason:
> 
> WARNING: Missing or malformed SPDX-License-Identifier tag in line 1
> #66: FILE: arch/arm/include/asm/kvm_ras.h:1:
> +// SPDX-License-Identifier: GPL-2.0

Gah, I copied it from a C file, the comment-style has to be different for headers.

Fixed,


Thanks

James

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v6 07/18] arm64: KVM/mm: Move SEA handling behind a single 'claim' interface
  2018-10-12 10:02     ` Borislav Petkov
  (?)
@ 2018-10-12 17:18       ` James Morse
  -1 siblings, 0 replies; 123+ messages in thread
From: James Morse @ 2018-10-12 17:18 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, linux-mm,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-acpi, Punit Agrawal, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

Hi Boris,

On 12/10/2018 11:02, Borislav Petkov wrote:
> On Fri, Sep 21, 2018 at 11:16:54PM +0100, James Morse wrote:
>> To split up APEI's in_nmi() path, we need the nmi-like callers to always
>> be in_nmi(). Add a helper to do the work and claim the notification.
>>
>> When KVM or the arch code takes an exception that might be a RAS
>> notification, it asks the APEI firmware-first code whether it wants
>> to claim the exception. We can then go on to see if (a future)
>> kernel-first mechanism wants to claim the notification, before
>> falling through to the existing default behaviour.
>>
>> The NOTIFY_SEA code was merged before we had multiple, possibly
>> interacting, NMI-like notifications and the need to consider kernel
>> first in the future. Make the 'claiming' behaviour explicit.
>>
>> As we're restructuring the APEI code to allow multiple NMI-like
>> notifications, any notification that might interrupt interrupts-masked
>> code must always be wrapped in nmi_enter()/nmi_exit(). This allows APEI
>> to use in_nmi() to use the right fixmap entries.
>>
>> We mask SError over this window to prevent an asynchronous RAS error
>> arriving and tripping 'nmi_enter()'s BUG_ON(in_nmi()).

>> diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
>> index ed46dc188b22..a9b8bba014b5 100644
>> --- a/arch/arm64/kernel/acpi.c
>> +++ b/arch/arm64/kernel/acpi.c
>> @@ -257,3 +259,30 @@ pgprot_t __acpi_get_mem_attribute(phys_addr_t addr)
>>  		return __pgprot(PROT_NORMAL_NC);
>>  	return __pgprot(PROT_DEVICE_nGnRnE);
>>  }
>> +
>> +/*
>> + * Claim Synchronous External Aborts as a firmware first notification.
>> + *
>> + * Used by KVM and the arch do_sea handler.
>> + * @regs may be NULL when called from process context.
>> + */
>> +int apei_claim_sea(struct pt_regs *regs)
>> +{
>> +	int err = -ENOENT;
>> +	unsigned long current_flags = arch_local_save_flags();
>> +
>> +	if (!IS_ENABLED(CONFIG_ACPI_APEI_SEA))
>> +		return err;
> 
> I don't know what side effects arch_local_save_flags() has on ARM but if

It reads the current 'masked' state for IRQs, debug exceptions and 'SError'.


> we return here, it looks to me like useless work.

Yes. I lazily assume the compiler will rip that out as the value is never used.
But in this case it can't, because it's wrapped in asm-volatile, so it doesn't
know it has no side-effects.

I'll move it further down.
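
Roughly, the reordering would look like the userspace sketch below. It is a stand-in rather than the actual arm64 code: sketch_save_flags() plays the role of arch_local_save_flags() and counts its calls to model a read the compiler cannot discard, the config check is passed in as a parameter so both paths can be exercised, and the body of the claim is a placeholder.

```c
#include <errno.h>
#include <stdbool.h>

/* Counts reads of the pretend flags register. */
static int sketch_flag_reads;

/* Hypothetical stand-in for arch_local_save_flags(). */
static unsigned long sketch_save_flags(void)
{
	sketch_flag_reads++;
	return 0x3c0UL;	/* arbitrary made-up value */
}

static void sketch_restore_flags(unsigned long flags)
{
	(void)flags;	/* nothing to restore in userspace */
}

static int sketch_claim_sea(bool sea_enabled)
{
	unsigned long current_flags;
	int err = -ENOENT;

	/* Bail out first, so the disabled case does no extra work. */
	if (!sea_enabled)
		return err;

	/* Only read the flags once we know they will be used. */
	current_flags = sketch_save_flags();

	/* Placeholder for: mask, nmi_enter(), notify, nmi_exit(). */
	err = 0;

	sketch_restore_flags(current_flags);
	return err;
}
```

In this sketch the flags read does not happen at all when the feature is off, which is the effect the move is meant to have.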

Thanks!

James

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v6 07/18] arm64: KVM/mm: Move SEA handling behind a single 'claim' interface
@ 2018-10-12 17:18       ` James Morse
  0 siblings, 0 replies; 123+ messages in thread
From: James Morse @ 2018-10-12 17:18 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-acpi, kvmarm, linux-arm-kernel, linux-mm, Marc Zyngier,
	Christoffer Dall, Will Deacon, Catalin Marinas, Naoya Horiguchi,
	Rafael Wysocki, Len Brown, Tony Luck, Tyler Baicar, Dongjiu Geng,
	Xie XiuQi, Punit Agrawal, jonathan.zhang

Hi Boris,

On 12/10/2018 11:02, Borislav Petkov wrote:
> On Fri, Sep 21, 2018 at 11:16:54PM +0100, James Morse wrote:
>> To split up APEIs in_nmi() path, we need the nmi-like callers to always
>> be in_nmi(). Add a helper to do the work and claim the notification.
>>
>> When KVM or the arch code takes an exception that might be a RAS
>> notification, it asks the APEI firmware-first code whether it wants
>> to claim the exception. We can then go on to see if (a future)
>> kernel-first mechanism wants to claim the notification, before
>> falling through to the existing default behaviour.
>>
>> The NOTIFY_SEA code was merged before we had multiple, possibly
>> interacting, NMI-like notifications and the need to consider kernel
>> first in the future. Make the 'claiming' behaviour explicit.
>>
>> As we're restructuring the APEI code to allow multiple NMI-like
>> notifications, any notification that might interrupt interrupts-masked
>> code must always be wrapped in nmi_enter()/nmi_exit(). This allows APEI
>> to use in_nmi() to use the right fixmap entries.
>>
>> We mask SError over this window to prevent an asynchronous RAS error
>> arriving and tripping 'nmi_enter()'s BUG_ON(in_nmi()).

>> diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
>> index ed46dc188b22..a9b8bba014b5 100644
>> --- a/arch/arm64/kernel/acpi.c
>> +++ b/arch/arm64/kernel/acpi.c
>> @@ -257,3 +259,30 @@ pgprot_t __acpi_get_mem_attribute(phys_addr_t addr)
>>  		return __pgprot(PROT_NORMAL_NC);
>>  	return __pgprot(PROT_DEVICE_nGnRnE);
>>  }
>> +
>> +/*
>> + * Claim Synchronous External Aborts as a firmware first notification.
>> + *
>> + * Used by KVM and the arch do_sea handler.
>> + * @regs may be NULL when called from process context.
>> + */
>> +int apei_claim_sea(struct pt_regs *regs)
>> +{
>> +	int err = -ENOENT;
>> +	unsigned long current_flags = arch_local_save_flags();
>> +
>> +	if (!IS_ENABLED(CONFIG_ACPI_APEI_SEA))
>> +		return err;
> 
> I don't know what side effects arch_local_save_flags() has on ARM but if

It reads the current 'masked' state for IRQs, debug exceptions and 'SError'.


> we return here, it looks to me like useless work.

Yes. I lazily assumed the compiler would rip that out as the value is never used.
But in this case it can't, because it's wrapped in asm volatile, so the compiler
doesn't know the read has no side effects.

I'll move it further down.
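
A minimal userspace sketch of that reordering, not the actual arm64 change. It assumes a hypothetical sketch_save_flags() standing in for arch_local_save_flags() (a counter makes each read visible), a sea_enabled parameter standing in for IS_ENABLED(CONFIG_ACPI_APEI_SEA), and arbitrary flag values. Reading the flags only after the early return means the disabled path does no work:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

static unsigned long live_flags = 0x80UL;	/* arbitrary pretend mask state */
static unsigned int flag_reads;			/* how often the flags were read */

/* Hypothetical stand-in for arch_local_save_flags(). */
static unsigned long sketch_save_flags(void)
{
	flag_reads++;
	return live_flags;
}

/* Hypothetical stand-in for restoring the saved mask state. */
static void sketch_restore_flags(unsigned long flags)
{
	live_flags = flags;
}

int sketch_claim_sea(bool sea_enabled)
{
	unsigned long current_flags;
	int err = -ENOENT;

	if (!sea_enabled)
		return err;	/* early return: the flags are never read */

	current_flags = sketch_save_flags();

	sketch_restore_flags(current_flags | 0x100UL);	/* mask more for the window */
	err = 0;					/* pretend the SEA was claimed */
	sketch_restore_flags(current_flags);

	assert(live_flags == current_flags);
	return err;
}

unsigned int sketch_flag_reads(void)
{
	return flag_reads;
}
```

Calling sketch_claim_sea(false) leaves sketch_flag_reads() at zero; only the enabled path pays for the read.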

Thanks!

James

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v6 14/18] ACPI / APEI: Split ghes_read_estatus() to read CPER length
  2018-09-21 22:17   ` James Morse
  (?)
@ 2018-10-12 17:25     ` Borislav Petkov
  -1 siblings, 0 replies; 123+ messages in thread
From: Borislav Petkov @ 2018-10-12 17:25 UTC (permalink / raw)
  To: James Morse
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, linux-mm,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-acpi, Punit Agrawal, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Fri, Sep 21, 2018 at 11:17:01PM +0100, James Morse wrote:
> ghes_read_estatus() reads the record address, then the record's
> header, then performs some sanity checks before reading the
> records into the provided estatus buffer.
> 
> We either need to know the size of the records before we call
> ghes_read_estatus(), or always provide a worst-case sized buffer,
> as happens today.
> 
> Add a function to peek at the record's header to find the size. This
> will let the NMI path allocate the right amount of memory before reading
> the records, instead of using the worst-case size, and having to copy
> the records.
> 
> Split ghes_read_estatus() to create ghes_peek_estatus() which
> returns the address and size of the CPER records.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
>  drivers/acpi/apei/ghes.c | 55 ++++++++++++++++++++++++++++++----------
>  1 file changed, 41 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index 3028487d43a3..055176ed68ac 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -298,11 +298,12 @@ static void ghes_copy_tofrom_phys(void *buffer, u64 paddr, u32 len,
>  	}
>  }
>  
> -static int ghes_read_estatus(struct ghes *ghes,
> -			     struct acpi_hest_generic_status *estatus,
> -			     u64 *buf_paddr, int fixmap_idx)
> +/* read the CPER block returning its address and size */

Make that comment a proper sentence:

"/* ... Read the CPER ... and size. */"

> +static int ghes_peek_estatus(struct ghes *ghes, int fixmap_idx,
> +			     u64 *buf_paddr, u32 *buf_len)
>  {

I find the functionality split a bit strange:

ghes_peek_estatus() peeks *and* verifies sizes. The verification probably
belongs in ghes_read_estatus(), together with the
cper_estatus_check_header() call. Or maybe in a separate

	__ghes_check_estatus()

to keep the pieces nicely apart.
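
One way to picture a peek/check/read split is the userspace sketch below. It is an illustration under stated assumptions, not the driver code: struct sketch_estatus is a hypothetical two-field stand-in for struct acpi_hest_generic_status, the "firmware" region is an ordinary byte buffer, and every sketch_* name is invented for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for struct acpi_hest_generic_status (hypothetical layout). */
struct sketch_estatus {
	uint32_t block_status;	/* zero means "no error pending" */
	uint32_t data_length;	/* payload bytes that follow this header */
};

static_assert(sizeof(struct sketch_estatus) == 8, "header has no padding");

/* Peek: copy only the header and report the total block length. */
static int sketch_peek(const uint8_t *fw, uint32_t *len)
{
	struct sketch_estatus hdr;

	memcpy(&hdr, fw, sizeof(hdr));
	if (!hdr.block_status)
		return -ENOENT;

	*len = (uint32_t)sizeof(hdr) + hdr.data_length;
	return 0;
}

/* Check: reject lengths that cannot describe a valid block in this region. */
static int sketch_check(uint32_t len, uint32_t region_len)
{
	if (len < sizeof(struct sketch_estatus) || len > region_len)
		return -EIO;
	return 0;
}

/* Read: peek, check, then copy the whole block into dst. */
int sketch_read(const uint8_t *fw, uint32_t region_len,
		uint8_t *dst, uint32_t dst_len, uint32_t *len)
{
	int rc;

	rc = sketch_peek(fw, len);
	if (rc)
		return rc;

	rc = sketch_check(*len, region_len);
	if (rc)
		return rc;

	if (*len > dst_len)
		return -ENOMEM;	/* caller's buffer is too small */

	memcpy(dst, fw, *len);
	return 0;
}

/* Usage helper: build a fake region with the given header, run all three steps. */
int sketch_roundtrip(uint32_t block_status, uint32_t data_length,
		     uint32_t region_len)
{
	struct sketch_estatus hdr = { block_status, data_length };
	uint8_t fw[64] = { 0 }, dst[64];
	uint32_t len = 0;
	int rc;

	if (region_len > sizeof(fw))
		region_len = sizeof(fw);

	memcpy(fw, &hdr, sizeof(hdr));
	rc = sketch_read(fw, region_len, dst, sizeof(dst), &len);
	return rc ? rc : (int)len;
}
```

Keeping the size checks in their own function means a caller that only wants the length (to size an allocation) and a caller that wants the full records can share the same validation.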

>  	struct acpi_hest_generic *g = ghes->generic;
> +	struct acpi_hest_generic_status estatus;
>  	u32 len;
>  	int rc;
>  
> @@ -317,26 +318,23 @@ static int ghes_read_estatus(struct ghes *ghes,
>  	if (!*buf_paddr)
>  		return -ENOENT;
>  
> -	ghes_copy_tofrom_phys(estatus, *buf_paddr,
> -			      sizeof(*estatus), 1, fixmap_idx);
> -	if (!estatus->block_status) {
> +	ghes_copy_tofrom_phys(&estatus, *buf_paddr,
> +			      sizeof(estatus), 1, fixmap_idx);
> +	if (!estatus.block_status) {
>  		*buf_paddr = 0;
>  		return -ENOENT;
>  	}
>  
>  	rc = -EIO;
> -	len = cper_estatus_len(estatus);
> -	if (len < sizeof(*estatus))
> +	len = cper_estatus_len(&estatus);
> +	if (len < sizeof(estatus))
>  		goto err_read_block;
>  	if (len > ghes->generic->error_block_length)
>  		goto err_read_block;
> -	if (cper_estatus_check_header(estatus))
> -		goto err_read_block;
> -	ghes_copy_tofrom_phys(estatus + 1,
> -			      *buf_paddr + sizeof(*estatus),
> -			      len - sizeof(*estatus), 1, fixmap_idx);
> -	if (cper_estatus_check(estatus))
> +	if (cper_estatus_check_header(&estatus))
>  		goto err_read_block;
> +	*buf_len = len;
> +
>  	rc = 0;
>  
>  err_read_block:
> @@ -346,6 +344,35 @@ static int ghes_read_estatus(struct ghes *ghes,
>  	return rc;
>  }
>  
> +static int __ghes_read_estatus(struct acpi_hest_generic_status *estatus,
> +			       u64 buf_paddr, size_t buf_len,
> +			       int fixmap_idx)
> +{
> +	ghes_copy_tofrom_phys(estatus, buf_paddr, buf_len, 1, fixmap_idx);
> +	if (cper_estatus_check(estatus)) {
> +		if (printk_ratelimit())
> +			pr_warning(FW_WARN GHES_PFX
> +				   "Failed to read error status block!\n");

Then you won't need two identical messages:

	"Failed to read error status block!\n"

which make it hard to tell where exactly in the code the failure
happened.

> +		return -EIO;
> +	}
> +
> +	return 0;
> +}
> +
> +static int ghes_read_estatus(struct ghes *ghes,
> +			     struct acpi_hest_generic_status *estatus,
> +			     u64 *buf_paddr, int fixmap_idx)
> +{
> +	int rc;
> +	u32 buf_len;
> +
> +	rc = ghes_peek_estatus(ghes, fixmap_idx, buf_paddr, &buf_len);

Also, if we have a __ghes_read_estatus() helper now, maybe prefixing
ghes_peek_estatus() with "__" would make sense too...

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v6 15/18] ACPI / APEI: Only use queued estatus entry during _in_nmi_notify_one()
  2018-09-21 22:17   ` James Morse
  (?)
@ 2018-10-12 17:34     ` Borislav Petkov
  -1 siblings, 0 replies; 123+ messages in thread
From: Borislav Petkov @ 2018-10-12 17:34 UTC (permalink / raw)
  To: James Morse
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, linux-mm,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-acpi, Punit Agrawal, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Fri, Sep 21, 2018 at 11:17:02PM +0100, James Morse wrote:
> Each struct ghes has a worst-case sized buffer for storing the
> estatus. If an error is being processed by ghes_proc() in process
> context this buffer will be in use. If the error source then triggers
> an NMI-like notification, the same buffer will be used by
> _in_nmi_notify_one() to stage the estatus data, before
> __process_error() copies it into a queued estatus entry.
> 
> Merge __process_error()'s work into _in_nmi_notify_one() so that
> the queued estatus entry is used from the beginning. Use the
> ghes_peek_estatus() so we know how much memory to allocate from
> the ghes_estatus_pool before we read the records.
> 
> Reported-by: Borislav Petkov <bp@suse.de>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
>  drivers/acpi/apei/ghes.c | 45 ++++++++++++++++++++--------------------
>  1 file changed, 22 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index 055176ed68ac..a0c10b60ad44 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -722,40 +722,32 @@ static void ghes_print_queued_estatus(void)
>  	}
>  }
>  
> -/* Save estatus for further processing in IRQ context */
> -static void __process_error(struct ghes *ghes,
> -			    struct acpi_hest_generic_status *ghes_estatus)
> +static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
>  {
> +	u64 buf_paddr;
> +	int sev, rc = 0;
>  	u32 len, node_len;
>  	struct ghes_estatus_node *estatus_node;
>  	struct acpi_hest_generic_status *estatus;

Please sort the function's local variable declarations in reverse
Christmas tree order:

	<type> longest_variable_name;
	<type> shorter_var_name;
	<type> even_shorter;
	<type> i;

>  
> -	if (ghes_estatus_cached(ghes_estatus))
> -		return;
> +	rc = ghes_peek_estatus(ghes, fixmap_idx, &buf_paddr, &len);

Oh ok, maybe not prefix it with "__" - we're using it somewhere else.

> +	if (rc)
> +		return rc;
>  
> -	len = cper_estatus_len(ghes_estatus);
>  	node_len = GHES_ESTATUS_NODE_LEN(len);
>  
>  	estatus_node = (void *)gen_pool_alloc(ghes_estatus_pool, node_len);
>  	if (!estatus_node)
> -		return;
> +		return -ENOMEM;
>  
>  	estatus_node->ghes = ghes;
>  	estatus_node->generic = ghes->generic;
>  	estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
> -	memcpy(estatus, ghes_estatus, len);
> -	llist_add(&estatus_node->llnode, &ghes_estatus_llist);
> -}
> -
> -static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
> -{
> -	int sev;
> -	u64 buf_paddr;
> -	struct acpi_hest_generic_status *estatus = ghes->estatus;
>  
> -	if (ghes_read_estatus(ghes, estatus, &buf_paddr, fixmap_idx)) {
> +	if (__ghes_read_estatus(estatus, buf_paddr, len, fixmap_idx)) {
>  		ghes_clear_estatus(estatus, buf_paddr, fixmap_idx);
> -		return -ENOENT;
> +		rc = -ENOENT;
> +		goto no_work;
>  	}
>  
>  	sev = ghes_severity(estatus->error_severity);
> @@ -764,13 +756,20 @@ static int _in_nmi_notify_one(struct ghes *ghes, int fixmap_idx)
>  		__ghes_panic(ghes, estatus);
>  	}
>  
> -	if (!buf_paddr)
> -		return 0;
> -
> -	__process_error(ghes, estatus);
>  	ghes_clear_estatus(estatus, buf_paddr, fixmap_idx);
>  
> -	return 0;
> +	if (!buf_paddr || ghes_estatus_cached(estatus))
> +		goto no_work;
> +
> +	llist_add(&estatus_node->llnode, &ghes_estatus_llist);
> +
> +	return rc;
> +
> +no_work:
> +	gen_pool_free(ghes_estatus_pool, (unsigned long)estatus_node,
> +			      node_len);

Yeah, let it stick out.
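
For readers following the control flow of the quoted hunk, here is a hedged userspace sketch of the allocate-first, free-on-no_work pattern. It is not the driver code: malloc()/free() stand in for the gen_pool calls, a plain singly linked list stands in for ghes_estatus_llist, and sketch_records_ok() is a hypothetical replacement for cper_estatus_check().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct sketch_node {
	struct sketch_node *next;
	uint32_t len;
	uint8_t estatus[];	/* records are read straight into the queued entry */
};

static struct sketch_node *sketch_queue;	/* stand-in for ghes_estatus_llist */
static unsigned int sketch_live_nodes;		/* allocated and not yet freed */

/* Hypothetical validity check standing in for cper_estatus_check(). */
static int sketch_records_ok(const uint8_t *buf, uint32_t len)
{
	return len > 0 && buf[0] != 0;
}

int sketch_notify_one(const uint8_t *fw, uint32_t len)
{
	struct sketch_node *node;
	int rc = 0;

	/* len is known up front (the "peek"), so allocate exactly enough. */
	node = malloc(sizeof(*node) + len);
	if (!node)
		return -ENOMEM;
	sketch_live_nodes++;

	node->len = len;
	memcpy(node->estatus, fw, len);
	if (!sketch_records_ok(node->estatus, len)) {
		rc = -ENOENT;
		goto no_work;
	}

	node->next = sketch_queue;
	sketch_queue = node;
	return rc;

no_work:
	/* Nothing to queue: give the entry back before returning. */
	assert(sketch_live_nodes > 0);
	sketch_live_nodes--;
	free(node);
	return rc;
}

unsigned int sketch_queued(void)
{
	unsigned int n = 0;

	for (struct sketch_node *p = sketch_queue; p; p = p->next)
		n++;
	return n;
}

unsigned int sketch_live(void)
{
	return sketch_live_nodes;
}
```

Because the records land directly in the queued entry, no per-source staging buffer is shared between contexts; the cost is that every early exit after the allocation has to pass through the no_work label.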

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v6 05/18] ACPI / APEI: Make estatus queue a Kconfig symbol
  2018-10-12 17:17           ` James Morse
  (?)
@ 2018-10-12 18:10             ` Borislav Petkov
  -1 siblings, 0 replies; 123+ messages in thread
From: Borislav Petkov @ 2018-10-12 18:10 UTC (permalink / raw)
  To: James Morse
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, linux-mm,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-acpi, Punit Agrawal, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

On Fri, Oct 12, 2018 at 06:17:48PM +0100, James Morse wrote:
> Ripping out the existing #ifdefs and replacing them with IS_ENABLED() would let
> the compiler work out that the estatus stuff is unused, and save us describing the
> what-uses-it logic in Kconfig.
> 
> But this does expose the x86 nmi stuff on arm64, which doesn't build today.

Gah, that ifdeffery is one big mess. ;-\

One fine day...
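
A small standalone sketch of the trade-off being discussed, under stated assumptions: SKETCH_IS_ENABLED() is a simplified imitation of the kernel's IS_ENABLED() for built-in options, and the CONFIG_SKETCH_* options and sketch_* functions are invented for illustration. The disabled branch is a compile-time constant 0, so it never runs and the compiler can drop it, yet everything it references must still be declared. That is the reason x86-only NMI symbols become a build problem on arm64 once the #ifdefs go away.

```c
#include <assert.h>

/* Simplified imitation of the kernel's IS_ENABLED() trick for =y options. */
#define SKETCH_PLACEHOLDER_1 0,
#define SKETCH_SECOND(ignored, val, ...) val
#define SKETCH_DEFINED3(arg1_or_junk) SKETCH_SECOND(arg1_or_junk 1, 0, 0)
#define SKETCH_DEFINED2(val) SKETCH_DEFINED3(SKETCH_PLACEHOLDER_##val)
#define SKETCH_DEFINED1(x) SKETCH_DEFINED2(x)
#define SKETCH_IS_ENABLED(option) SKETCH_DEFINED1(option)

#define CONFIG_SKETCH_QUEUE 1	/* hypothetical option, set to "y" */
/* CONFIG_SKETCH_NMI is deliberately not defined ("is not set"). */

static_assert(SKETCH_IS_ENABLED(CONFIG_SKETCH_QUEUE) == 1, "defined option");
static_assert(SKETCH_IS_ENABLED(CONFIG_SKETCH_NMI) == 0, "undefined option");

static unsigned int queue_inits, nmi_inits;

static void sketch_queue_init(void) { queue_inits++; }
static void sketch_nmi_init(void) { nmi_inits++; }

void sketch_probe(void)
{
	if (SKETCH_IS_ENABLED(CONFIG_SKETCH_QUEUE))
		sketch_queue_init();

	/* Constant-false branch: never runs, but must still compile. */
	if (SKETCH_IS_ENABLED(CONFIG_SKETCH_NMI))
		sketch_nmi_init();
}

unsigned int sketch_queue_inits(void) { return queue_inits; }
unsigned int sketch_nmi_inits(void) { return nmi_inits; }
```

With an #ifdef, sketch_nmi_init() would not need to exist at all when the option is off; with the IS_ENABLED() style it needs at least a declaration on every architecture.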

> Dragging NMI_HANDLED and friends up to the 'linux' header causes a fair amount
> of noise under arch/x86 (include the new header in 22 files). Adding dummy
> declarations to arm64 fixes this, and doesn't affect the other architectures
> that have an asm/nmi.h
> 
> Alternatively we could leave {un,}register_nmi_handler() under
> CONFIG_HAVE_ACPI_APEI_NMI. I think we need to keep the NOTIFY_NMI kconfig symbol
> around, as it's one of the two I can't work out how to fix without the TLBI-IPI.

Hmm, so I just tried the diff below with my arm64 cross compiler and a
defconfig with

CONFIG_ACPI_APEI_GHES=y
CONFIG_EDAC_GHES=y

and it did build fine. What am I missing?

---
diff --git a/drivers/acpi/apei/Kconfig b/drivers/acpi/apei/Kconfig
index 2b191e09b647..52ae5438edeb 100644
--- a/drivers/acpi/apei/Kconfig
+++ b/drivers/acpi/apei/Kconfig
@@ -4,7 +4,6 @@ config HAVE_ACPI_APEI
 
 config HAVE_ACPI_APEI_NMI
 	bool
-	select ACPI_APEI_GHES_ESTATUS_QUEUE
 
 config ACPI_APEI
 	bool "ACPI Platform Error Interface (APEI)"
@@ -34,10 +33,6 @@ config ACPI_APEI_GHES
 	  by firmware to produce more valuable hardware error
 	  information for Linux.
 
-config ACPI_APEI_GHES_ESTATUS_QUEUE
-	bool
-	depends on ACPI_APEI_GHES && ARCH_HAVE_NMI_SAFE_CMPXCHG
-
 config ACPI_APEI_PCIEAER
 	bool "APEI PCIe AER logging/recovering support"
 	depends on ACPI_APEI && PCIEAER
@@ -48,7 +43,6 @@ config ACPI_APEI_PCIEAER
 config ACPI_APEI_SEA
 	bool "APEI Synchronous External Abort logging/recovering support"
 	depends on ARM64 && ACPI_APEI_GHES
-	select ACPI_APEI_GHES_ESTATUS_QUEUE
 	default y
 	help
 	  This option should be enabled if the system supports
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 463c8e6d1bb5..8191d711564b 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -683,7 +683,6 @@ static void ghes_estatus_cache_add(
 	rcu_read_unlock();
 }
 
-#ifdef CONFIG_ACPI_APEI_GHES_ESTATUS_QUEUE
 /*
  * Handlers for CPER records may not be NMI safe. For example,
  * memory_failure_queue() takes spinlocks and calls schedule_work_on().
@@ -862,10 +861,6 @@ static void ghes_nmi_init_cxt(void)
 	init_irq_work(&ghes_proc_irq_work, ghes_proc_in_irq);
 }
 
-#else
-static inline void ghes_nmi_init_cxt(void) { }
-#endif /* CONFIG_ACPI_APEI_GHES_ESTATUS_QUEUE */
-
 static int ghes_ack_error(struct acpi_hest_generic_v2 *gv2)
 {
 	int rc;


-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* Re: [PATCH v6 17/18] mm/memory-failure: increase queued recovery work's priority
  2018-09-21 22:17   ` James Morse
  (?)
@ 2018-10-15 16:49     ` Borislav Petkov
  -1 siblings, 0 replies; 123+ messages in thread
From: Borislav Petkov @ 2018-10-15 16:49 UTC (permalink / raw)
  To: James Morse, Peter Zijlstra
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, linux-mm,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Dongjiu Geng, linux-acpi, Punit Agrawal, Naoya Horiguchi, kvmarm,
	linux-arm-kernel, Len Brown

+ Peter.

On Fri, Sep 21, 2018 at 11:17:04PM +0100, James Morse wrote:
> arm64 can take an NMI-like error notification when user-space steps in
> some corrupt memory. APEI's GHES code will call memory_failure_queue()
> to schedule the recovery work. We then return to user-space, possibly
> taking the fault again.
> 
> Currently the arch code unconditionally signals user-space from this
> path, so we don't get stuck in this loop, but the affected process
> never benefits from memory_failure()'s recovery work. To fix this we
> need to know the recovery work will run before we get back to user-space.
> 
> Increase the priority of the recovery work by scheduling it on the
> system_highpri_wq, then try to bump the current task off this CPU
> so that the recovery work starts immediately.
> 
> Reported-by: Xie XiuQi <xiexiuqi@huawei.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Punit Agrawal <punit.agrawal@arm.com>
> Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
> Tested-by: gengdongjiu <gengdongjiu@huawei.com>
> CC: Xie XiuQi <xiexiuqi@huawei.com>
> CC: gengdongjiu <gengdongjiu@huawei.com>
> ---
>  mm/memory-failure.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 0cd3de3550f0..4e7b115cea5a 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -56,6 +56,7 @@
>  #include <linux/memory_hotplug.h>
>  #include <linux/mm_inline.h>
>  #include <linux/memremap.h>
> +#include <linux/preempt.h>
>  #include <linux/kfifo.h>
>  #include <linux/ratelimit.h>
>  #include <linux/page-isolation.h>
> @@ -1454,6 +1455,7 @@ static DEFINE_PER_CPU(struct memory_failure_cpu, memory_failure_cpu);
>   */
>  void memory_failure_queue(unsigned long pfn, int flags)
>  {
> +	int cpu = smp_processor_id();
>  	struct memory_failure_cpu *mf_cpu;
>  	unsigned long proc_flags;
>  	struct memory_failure_entry entry = {
> @@ -1463,11 +1465,14 @@ void memory_failure_queue(unsigned long pfn, int flags)
>  
>  	mf_cpu = &get_cpu_var(memory_failure_cpu);
>  	spin_lock_irqsave(&mf_cpu->lock, proc_flags);
> -	if (kfifo_put(&mf_cpu->fifo, entry))
> -		schedule_work_on(smp_processor_id(), &mf_cpu->work);
> -	else
> +	if (kfifo_put(&mf_cpu->fifo, entry)) {
> +		queue_work_on(cpu, system_highpri_wq, &mf_cpu->work);
> +		set_tsk_need_resched(current);
> +		preempt_set_need_resched();

What guarantees the workqueue would run before the process? I see this:

``WQ_HIGHPRI``
  Work items of a highpri wq are queued to the highpri
  worker-pool of the target cpu.  Highpri worker-pools are
  served by worker threads with elevated nice level.

but is that enough?

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v6 17/18] mm/memory-failure: increase queued recovery work's priority
  2018-10-15 16:49     ` Borislav Petkov
  (?)
@ 2018-10-16  7:43       ` Peter Zijlstra
  -1 siblings, 0 replies; 123+ messages in thread
From: Peter Zijlstra @ 2018-10-16  7:43 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: jonathan.zhang, Rafael Wysocki, Tony Luck, Xie XiuQi, linux-mm,
	Marc Zyngier, Catalin Marinas, Tyler Baicar, Will Deacon,
	Christoffer Dall, Dongjiu Geng, linux-acpi, Punit Agrawal,
	James Morse, Naoya Horiguchi, kvmarm, linux-arm-kernel,
	Len Brown

On Mon, Oct 15, 2018 at 06:49:13PM +0200, Borislav Petkov wrote:
> On Fri, Sep 21, 2018 at 11:17:04PM +0100, James Morse wrote:
> > @@ -1463,11 +1465,14 @@ void memory_failure_queue(unsigned long pfn, int flags)
> >  
> >  	mf_cpu = &get_cpu_var(memory_failure_cpu);
> >  	spin_lock_irqsave(&mf_cpu->lock, proc_flags);
> > -	if (kfifo_put(&mf_cpu->fifo, entry))
> > -		schedule_work_on(smp_processor_id(), &mf_cpu->work);
> > -	else
> > +	if (kfifo_put(&mf_cpu->fifo, entry)) {
> > +		queue_work_on(cpu, system_highpri_wq, &mf_cpu->work);
> > +		set_tsk_need_resched(current);
> > +		preempt_set_need_resched();
> 
> What guarantees the workqueue would run before the process? I see this:
> 
> ``WQ_HIGHPRI``
>   Work items of a highpri wq are queued to the highpri
>   worker-pool of the target cpu.  Highpri worker-pools are
>   served by worker threads with elevated nice level.
> 
> but is that enough?

Nope. Nice just makes it more likely, but gives no guarantees whatsoever.

If you want to absolutely run something before we return to userspace,
would not task_work() be what we're looking for?
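
To make the difference concrete, here is a small userspace model of the
idea behind this suggestion: callbacks attached to a task are drained at
the point where the task would return to user-space, so the scheduler
cannot reorder them after it. This is only a sketch under stated
assumptions. The demo_* types and functions are hypothetical stand-ins;
they do not reproduce the kernel's task_work API, its locking, or its
NMI-safety rules.

```c
#include <assert.h>
#include <stddef.h>

struct demo_task;

/* Hypothetical per-task callback, loosely modelled on a callback list. */
struct demo_work {
	struct demo_work *next;
	void (*func)(struct demo_task *task, struct demo_work *work);
	unsigned long pfn;		/* page the recovery should handle */
};

struct demo_task {
	struct demo_work *pending;	/* work to run before user-space resumes */
	unsigned long last_recovered_pfn;
	int in_user;			/* 1 once the task is "back in user-space" */
};

/* Attach work to the task itself instead of handing it to a worker thread. */
static void demo_task_work_add(struct demo_task *task, struct demo_work *work)
{
	assert(work->func != NULL);
	work->next = task->pending;
	task->pending = work;
}

/* Model of the return-to-user path: drain all pending work first. */
static void demo_return_to_user(struct demo_task *task)
{
	while (task->pending) {
		struct demo_work *work = task->pending;

		task->pending = work->next;
		work->func(task, work);
	}
	task->in_user = 1;
}

/* Stand-in for the recovery step; records which page it handled. */
static void demo_recover(struct demo_task *task, struct demo_work *work)
{
	/* By construction this runs before the task re-enters user-space. */
	assert(!task->in_user);
	task->last_recovered_pfn = work->pfn;
}
```

The ordering guarantee in this model comes from where the work is run,
not from its priority: demo_return_to_user() cannot set in_user until the
list is empty. A high-priority worker thread, by contrast, only competes
for the CPU with the faulting task.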

^ permalink raw reply	[flat|nested] 123+ messages in thread

end of thread, other threads:[~2018-10-16  7:44 UTC | newest]

Thread overview: 123+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-21 22:16 [PATCH v6 00/18] APEI in_nmi() rework James Morse
2018-09-21 22:16 ` James Morse
2018-09-21 22:16 ` James Morse
2018-09-21 22:16 ` [PATCH v6 01/18] ACPI / APEI: Move the estatus queue code up, and under its own ifdef James Morse
2018-09-21 22:16   ` James Morse
2018-09-21 22:16   ` James Morse
2018-09-21 22:16 ` [PATCH v6 02/18] ACPI / APEI: Generalise the estatus queue's add/remove and notify code James Morse
2018-09-21 22:16   ` James Morse
2018-09-21 22:16   ` James Morse
2018-09-21 22:16 ` [PATCH v6 03/18] ACPI / APEI: don't wait to serialise with oops messages when panic()ing James Morse
2018-09-21 22:16   ` James Morse
2018-09-21 22:16   ` James Morse
2018-09-21 22:16 ` [PATCH v6 04/18] ACPI / APEI: Switch NOTIFY_SEA to use the estatus queue James Morse
2018-09-21 22:16   ` James Morse
2018-09-21 22:16   ` James Morse
2018-09-28 17:04   ` Borislav Petkov
2018-09-28 17:04     ` Borislav Petkov
2018-09-28 17:04     ` Borislav Petkov
2018-09-21 22:16 ` [PATCH v6 05/18] ACPI / APEI: Make estatus queue a Kconfig symbol James Morse
2018-09-21 22:16   ` James Morse
2018-09-21 22:16   ` James Morse
2018-10-01 17:59   ` Borislav Petkov
2018-10-01 17:59     ` Borislav Petkov
2018-10-01 17:59     ` Borislav Petkov
2018-10-03 17:50     ` James Morse
2018-10-03 17:50       ` James Morse
2018-10-03 17:50       ` James Morse
2018-10-04 17:34       ` Borislav Petkov
2018-10-04 17:34         ` Borislav Petkov
2018-10-04 17:34         ` Borislav Petkov
2018-10-12 17:17         ` James Morse
2018-10-12 17:17           ` James Morse
2018-10-12 17:17           ` James Morse
2018-10-12 18:10           ` Borislav Petkov
2018-10-12 18:10             ` Borislav Petkov
2018-10-12 18:10             ` Borislav Petkov
2018-09-21 22:16 ` [PATCH v6 06/18] KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing James Morse
2018-09-21 22:16   ` James Morse
2018-09-21 22:16   ` James Morse
2018-10-12  9:57   ` Borislav Petkov
2018-10-12  9:57     ` Borislav Petkov
2018-10-12  9:57     ` Borislav Petkov
2018-10-12 17:18     ` James Morse
2018-10-12 17:18       ` James Morse
2018-10-12 17:18       ` James Morse
2018-09-21 22:16 ` [PATCH v6 07/18] arm64: KVM/mm: Move SEA handling behind a single 'claim' interface James Morse
2018-09-21 22:16   ` James Morse
2018-09-21 22:16   ` James Morse
2018-10-12 10:02   ` Borislav Petkov
2018-10-12 10:02     ` Borislav Petkov
2018-10-12 10:02     ` Borislav Petkov
2018-10-12 17:18     ` James Morse
2018-10-12 17:18       ` James Morse
2018-10-12 17:18       ` James Morse
2018-09-21 22:16 ` [PATCH v6 08/18] ACPI / APEI: Move locking to the notification helper James Morse
2018-09-21 22:16   ` James Morse
2018-09-21 22:16   ` James Morse
2018-10-12 11:08   ` Borislav Petkov
2018-10-12 11:08     ` Borislav Petkov
2018-10-12 11:08     ` Borislav Petkov
2018-09-21 22:16 ` [PATCH v6 09/18] ACPI / APEI: Let the notification helper specify the fixmap slot James Morse
2018-09-21 22:16   ` James Morse
2018-09-21 22:16   ` James Morse
2018-10-12 11:14   ` Borislav Petkov
2018-10-12 11:14     ` Borislav Petkov
2018-10-12 11:14     ` Borislav Petkov
2018-09-21 22:16 ` [PATCH v6 10/18] ACPI / APEI: preparatory split of ghes->estatus James Morse
2018-09-21 22:16   ` James Morse
2018-09-21 22:16   ` James Morse
2018-10-12 16:37   ` Borislav Petkov
2018-10-12 16:37     ` Borislav Petkov
2018-10-12 16:37     ` Borislav Petkov
2018-09-21 22:16 ` [PATCH v6 11/18] ACPI / APEI: Remove silent flag from ghes_read_estatus() James Morse
2018-09-21 22:16   ` James Morse
2018-09-21 22:16   ` James Morse
2018-10-12 16:55   ` Borislav Petkov
2018-10-12 16:55     ` Borislav Petkov
2018-10-12 16:55     ` Borislav Petkov
2018-09-21 22:16 ` [PATCH v6 12/18] ACPI / APEI: Don't store CPER records physical address in struct ghes James Morse
2018-09-21 22:16   ` James Morse
2018-09-21 22:16   ` James Morse
2018-09-21 22:17 ` [PATCH v6 13/18] ACPI / APEI: Don't update struct ghes' flags in read/clear estatus James Morse
2018-09-21 22:17   ` James Morse
2018-09-21 22:17   ` James Morse
2018-10-12 17:14   ` Borislav Petkov
2018-10-12 17:14     ` Borislav Petkov
2018-10-12 17:14     ` Borislav Petkov
2018-09-21 22:17 ` [PATCH v6 14/18] ACPI / APEI: Split ghes_read_estatus() to read CPER length James Morse
2018-09-21 22:17   ` James Morse
2018-09-21 22:17   ` James Morse
2018-10-12 17:25   ` Borislav Petkov
2018-10-12 17:25     ` Borislav Petkov
2018-10-12 17:25     ` Borislav Petkov
2018-09-21 22:17 ` [PATCH v6 15/18] ACPI / APEI: Only use queued estatus entry during _in_nmi_notify_one() James Morse
2018-09-21 22:17   ` James Morse
2018-09-21 22:17   ` James Morse
2018-10-12 17:34   ` Borislav Petkov
2018-10-12 17:34     ` Borislav Petkov
2018-10-12 17:34     ` Borislav Petkov
2018-09-21 22:17 ` [PATCH v6 16/18] ACPI / APEI: Split fixmap pages for arm64 NMI-like notifications James Morse
2018-09-21 22:17   ` James Morse
2018-09-21 22:17   ` James Morse
2018-09-21 22:17 ` [PATCH v6 17/18] mm/memory-failure: increase queued recovery work's priority James Morse
2018-09-21 22:17   ` James Morse
2018-09-21 22:17   ` James Morse
2018-10-15 16:49   ` Borislav Petkov
2018-10-15 16:49     ` Borislav Petkov
2018-10-15 16:49     ` Borislav Petkov
2018-10-16  7:43     ` Peter Zijlstra
2018-10-16  7:43       ` Peter Zijlstra
2018-10-16  7:43       ` Peter Zijlstra
2018-09-21 22:17 ` [PATCH v6 18/18] arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work James Morse
2018-09-21 22:17   ` James Morse
2018-09-21 22:17   ` James Morse
2018-09-25 12:45 ` [PATCH v6 00/18] APEI in_nmi() rework Borislav Petkov
2018-09-25 12:45   ` Borislav Petkov
2018-09-25 12:45   ` Borislav Petkov
2018-10-03 17:50   ` James Morse
2018-10-03 17:50     ` James Morse
2018-10-03 17:50     ` James Morse
2018-10-04 15:15     ` Borislav Petkov
2018-10-04 15:15       ` Borislav Petkov
2018-10-04 15:15       ` Borislav Petkov
