From: James Morse <james.morse@arm.com>
To: linux-acpi@vger.kernel.org
Cc: kvmarm@lists.cs.columbia.edu,
	linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
	Borislav Petkov <bp@alien8.de>,
	Marc Zyngier <marc.zyngier@arm.com>,
	Christoffer Dall <cdall@kernel.org>,
	Will Deacon <will.deacon@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Rafael Wysocki <rjw@rjwysocki.net>, Len Brown <lenb@kernel.org>,
	Tony Luck <tony.luck@intel.com>,
	Tyler Baicar <tbaicar@codeaurora.org>,
	Dongjiu Geng <gengdongjiu@huawei.com>,
	Xie XiuQi <xiexiuqi@huawei.com>,
	Punit Agrawal <punit.agrawal@arm.com>,
	jonathan.zhang@cavium.com, James Morse <james.morse@arm.com>
Subject: [PATCH v3 01/12] ACPI / APEI: Move the estatus queue code up, and under its own ifdef
Date: Fri, 27 Apr 2018 16:34:59 +0100
Message-ID: <20180427153510.5799-2-james.morse@arm.com>
In-Reply-To: <20180427153510.5799-1-james.morse@arm.com>

To support asynchronous NMI-like notifications on arm64 we need to use
the estatus-queue. These patches refactor it to allow multiple APEI
notification types to use it.

First we move the estatus-queue code higher in the file so that any
notify_foo() handler can make use of it.

This patch moves code around ... and makes the following trivial change:
Freshen the dated comment above ghes_estatus_llist. printk() is no
longer the issue; it's the helpers like memory_failure_queue() that
still aren't NMI safe.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Punit Agrawal <punit.agrawal@arm.com>

Notes for cover letter:
ghes.c has three things all called 'estatus'. One is a pool of memory
that has a static size, and is grown/shrunk when new NMI users are
allocated.
The second is the cache, which holds recent notifications so we can
suppress notifications we've already handled.
The last is the queue, which holds data from NMI notifications (in pool
memory) that can't be handled immediately.
---
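Illustration only, not part of the patch: the queue described in the
notes above is a lock-less LIFO list that the NMI-like handler pushes
onto and the IRQ-context worker drains. Because every push lands at the
head, the drained chain is newest-first and has to be reversed before
processing, which is why ghes_proc_in_irq() calls llist_reverse_order().
A minimal userspace sketch of that pattern follows. It assumes C11
atomics as a stand-in for the kernel's llist helpers; struct node,
struct lockless_list and the queue_*() names are hypothetical and do
not exist in ghes.c.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct llist_node plus a payload. */
struct node {
	struct node *next;
	int seq;			/* arrival order, for illustration */
};

/* Hypothetical stand-in for struct llist_head. */
struct lockless_list {
	_Atomic(struct node *) first;
};

/* Push at the head with a compare-and-swap loop, in the style of llist_add(). */
static void queue_push(struct lockless_list *list, struct node *n)
{
	struct node *old = atomic_load(&list->first);

	do {
		n->next = old;	/* 'old' is refreshed when the CAS fails */
	} while (!atomic_compare_exchange_weak(&list->first, &old, n));
}

/* Detach the whole chain with one atomic exchange, in the style of llist_del_all(). */
static struct node *queue_take_all(struct lockless_list *list)
{
	return atomic_exchange(&list->first, (struct node *)NULL);
}

/* Reverse a detached chain so the oldest entry comes first. */
static struct node *queue_reverse(struct node *head)
{
	struct node *prev = NULL;

	while (head) {
		struct node *next = head->next;

		head->next = prev;
		prev = head;
		head = next;
	}
	return prev;
}
```

Pushing three nodes and then calling queue_take_all() returns the last
one pushed at the head of the chain; queue_reverse() restores arrival
order. The consumer owns the detached chain outright, so the reversal
itself needs no atomics.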
 drivers/acpi/apei/ghes.c | 265 ++++++++++++++++++++++++-----------------------
 1 file changed, 137 insertions(+), 128 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 1efefe919555..e2af91c92135 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -545,6 +545,16 @@ static int ghes_print_estatus(const char *pfx,
 	return 0;
 }
 
+static void __ghes_panic(struct ghes *ghes)
+{
+	__ghes_print_estatus(KERN_EMERG, ghes->generic, ghes->estatus);
+
+	/* reboot to log the error! */
+	if (!panic_timeout)
+		panic_timeout = ghes_panic_timeout;
+	panic("Fatal hardware error!");
+}
+
 /*
  * GHES error status reporting throttle, to report more kinds of
  * errors, instead of just most frequently occurred errors.
@@ -672,6 +682,133 @@ static void ghes_estatus_cache_add(
 	rcu_read_unlock();
 }
 
+#ifdef CONFIG_HAVE_ACPI_APEI_NMI
+/*
+ * Handlers for CPER records may not be NMI safe. For example,
+ * memory_failure_queue() takes spinlocks and calls schedule_work_on().
+ * In any NMI-like handler, memory from ghes_estatus_pool is used to save
+ * estatus, and added to the ghes_estatus_llist. irq_work_queue() causes
+ * ghes_proc_in_irq() to run in IRQ context where each estatus in
+ * ghes_estatus_llist is processed. Each NMI-like error source must grow
+ * the ghes_estatus_pool to ensure memory is available.
+ *
+ * Memory from the ghes_estatus_pool is also used with the ghes_estatus_cache
+ * to suppress frequent messages.
+ */
+static struct llist_head ghes_estatus_llist;
+static struct irq_work ghes_proc_irq_work;
+
+static void ghes_print_queued_estatus(void)
+{
+	struct llist_node *llnode;
+	struct ghes_estatus_node *estatus_node;
+	struct acpi_hest_generic *generic;
+	struct acpi_hest_generic_status *estatus;
+
+	llnode = llist_del_all(&ghes_estatus_llist);
+	/*
+	 * Because the time order of estatus in list is reversed,
+	 * revert it back to proper order.
+	 */
+	llnode = llist_reverse_order(llnode);
+	while (llnode) {
+		estatus_node = llist_entry(llnode, struct ghes_estatus_node,
+					   llnode);
+		estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
+		generic = estatus_node->generic;
+		ghes_print_estatus(NULL, generic, estatus);
+		llnode = llnode->next;
+	}
+}
+
+/* Save estatus for further processing in IRQ context */
+static void __process_error(struct ghes *ghes)
+{
+#ifdef CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG
+	u32 len, node_len;
+	struct ghes_estatus_node *estatus_node;
+	struct acpi_hest_generic_status *estatus;
+
+	if (ghes_estatus_cached(ghes->estatus))
+		return;
+
+	len = cper_estatus_len(ghes->estatus);
+	node_len = GHES_ESTATUS_NODE_LEN(len);
+
+	estatus_node = (void *)gen_pool_alloc(ghes_estatus_pool, node_len);
+	if (!estatus_node)
+		return;
+
+	estatus_node->ghes = ghes;
+	estatus_node->generic = ghes->generic;
+	estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
+	memcpy(estatus, ghes->estatus, len);
+	llist_add(&estatus_node->llnode, &ghes_estatus_llist);
+#endif
+}
+
+static unsigned long ghes_esource_prealloc_size(
+	const struct acpi_hest_generic *generic)
+{
+	unsigned long block_length, prealloc_records, prealloc_size;
+
+	block_length = min_t(unsigned long, generic->error_block_length,
+			     GHES_ESTATUS_MAX_SIZE);
+	prealloc_records = max_t(unsigned long,
+				 generic->records_to_preallocate, 1);
+	prealloc_size = min_t(unsigned long, block_length * prealloc_records,
+			      GHES_ESOURCE_PREALLOC_MAX_SIZE);
+
+	return prealloc_size;
+}
+
+static void ghes_estatus_pool_shrink(unsigned long len)
+{
+	ghes_estatus_pool_size_request -= PAGE_ALIGN(len);
+}
+
+static void ghes_proc_in_irq(struct irq_work *irq_work)
+{
+	struct llist_node *llnode, *next;
+	struct ghes_estatus_node *estatus_node;
+	struct acpi_hest_generic *generic;
+	struct acpi_hest_generic_status *estatus;
+	u32 len, node_len;
+
+	llnode = llist_del_all(&ghes_estatus_llist);
+	/*
+	 * Because the time order of estatus in list is reversed,
+	 * revert it back to proper order.
+	 */
+	llnode = llist_reverse_order(llnode);
+	while (llnode) {
+		next = llnode->next;
+		estatus_node = llist_entry(llnode, struct ghes_estatus_node,
+					   llnode);
+		estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
+		len = cper_estatus_len(estatus);
+		node_len = GHES_ESTATUS_NODE_LEN(len);
+		ghes_do_proc(estatus_node->ghes, estatus);
+		if (!ghes_estatus_cached(estatus)) {
+			generic = estatus_node->generic;
+			if (ghes_print_estatus(NULL, generic, estatus))
+				ghes_estatus_cache_add(generic, estatus);
+		}
+		gen_pool_free(ghes_estatus_pool, (unsigned long)estatus_node,
+			      node_len);
+		llnode = next;
+	}
+}
+
+static void ghes_nmi_init_cxt(void)
+{
+	init_irq_work(&ghes_proc_irq_work, ghes_proc_in_irq);
+}
+
+#else
+static inline void ghes_nmi_init_cxt(void) { }
+#endif /* CONFIG_HAVE_ACPI_APEI_NMI */
+
 static int ghes_ack_error(struct acpi_hest_generic_v2 *gv2)
 {
 	int rc;
@@ -687,16 +824,6 @@ static int ghes_ack_error(struct acpi_hest_generic_v2 *gv2)
 	return apei_write(val, &gv2->read_ack_register);
 }
 
-static void __ghes_panic(struct ghes *ghes)
-{
-	__ghes_print_estatus(KERN_EMERG, ghes->generic, ghes->estatus);
-
-	/* reboot to log the error! */
-	if (!panic_timeout)
-		panic_timeout = ghes_panic_timeout;
-	panic("Fatal hardware error!");
-}
-
 static int ghes_proc(struct ghes *ghes)
 {
 	int rc;
@@ -828,17 +955,6 @@ static inline void ghes_sea_remove(struct ghes *ghes) { }
 #endif /* CONFIG_ACPI_APEI_SEA */
 
 #ifdef CONFIG_HAVE_ACPI_APEI_NMI
-/*
- * printk is not safe in NMI context.  So in NMI handler, we allocate
- * required memory from lock-less memory allocator
- * (ghes_estatus_pool), save estatus into it, put them into lock-less
- * list (ghes_estatus_llist), then delay printk into IRQ context via
- * irq_work (ghes_proc_irq_work).  ghes_estatus_size_request record
- * required pool size by all NMI error source.
- */
-static struct llist_head ghes_estatus_llist;
-static struct irq_work ghes_proc_irq_work;
-
 /*
  * NMI may be triggered on any CPU, so ghes_in_nmi is used for
  * having only one concurrent reader.
@@ -847,88 +963,6 @@ static atomic_t ghes_in_nmi = ATOMIC_INIT(0);
 
 static LIST_HEAD(ghes_nmi);
 
-static void ghes_proc_in_irq(struct irq_work *irq_work)
-{
-	struct llist_node *llnode, *next;
-	struct ghes_estatus_node *estatus_node;
-	struct acpi_hest_generic *generic;
-	struct acpi_hest_generic_status *estatus;
-	u32 len, node_len;
-
-	llnode = llist_del_all(&ghes_estatus_llist);
-	/*
-	 * Because the time order of estatus in list is reversed,
-	 * revert it back to proper order.
-	 */
-	llnode = llist_reverse_order(llnode);
-	while (llnode) {
-		next = llnode->next;
-		estatus_node = llist_entry(llnode, struct ghes_estatus_node,
-					   llnode);
-		estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
-		len = cper_estatus_len(estatus);
-		node_len = GHES_ESTATUS_NODE_LEN(len);
-		ghes_do_proc(estatus_node->ghes, estatus);
-		if (!ghes_estatus_cached(estatus)) {
-			generic = estatus_node->generic;
-			if (ghes_print_estatus(NULL, generic, estatus))
-				ghes_estatus_cache_add(generic, estatus);
-		}
-		gen_pool_free(ghes_estatus_pool, (unsigned long)estatus_node,
-			      node_len);
-		llnode = next;
-	}
-}
-
-static void ghes_print_queued_estatus(void)
-{
-	struct llist_node *llnode;
-	struct ghes_estatus_node *estatus_node;
-	struct acpi_hest_generic *generic;
-	struct acpi_hest_generic_status *estatus;
-
-	llnode = llist_del_all(&ghes_estatus_llist);
-	/*
-	 * Because the time order of estatus in list is reversed,
-	 * revert it back to proper order.
-	 */
-	llnode = llist_reverse_order(llnode);
-	while (llnode) {
-		estatus_node = llist_entry(llnode, struct ghes_estatus_node,
-					   llnode);
-		estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
-		generic = estatus_node->generic;
-		ghes_print_estatus(NULL, generic, estatus);
-		llnode = llnode->next;
-	}
-}
-
-/* Save estatus for further processing in IRQ context */
-static void __process_error(struct ghes *ghes)
-{
-#ifdef CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG
-	u32 len, node_len;
-	struct ghes_estatus_node *estatus_node;
-	struct acpi_hest_generic_status *estatus;
-
-	if (ghes_estatus_cached(ghes->estatus))
-		return;
-
-	len = cper_estatus_len(ghes->estatus);
-	node_len = GHES_ESTATUS_NODE_LEN(len);
-
-	estatus_node = (void *)gen_pool_alloc(ghes_estatus_pool, node_len);
-	if (!estatus_node)
-		return;
-
-	estatus_node->ghes = ghes;
-	estatus_node->generic = ghes->generic;
-	estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
-	memcpy(estatus, ghes->estatus, len);
-	llist_add(&estatus_node->llnode, &ghes_estatus_llist);
-#endif
-}
-
 static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 {
 	struct ghes *ghes;
@@ -967,26 +1001,6 @@ static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
 	return ret;
 }
 
-static unsigned long ghes_esource_prealloc_size(
-	const struct acpi_hest_generic *generic)
-{
-	unsigned long block_length, prealloc_records, prealloc_size;
-
-	block_length = min_t(unsigned long, generic->error_block_length,
-			     GHES_ESTATUS_MAX_SIZE);
-	prealloc_records = max_t(unsigned long,
-				 generic->records_to_preallocate, 1);
-	prealloc_size = min_t(unsigned long, block_length * prealloc_records,
-			      GHES_ESOURCE_PREALLOC_MAX_SIZE);
-
-	return prealloc_size;
-}
-
-static void ghes_estatus_pool_shrink(unsigned long len)
-{
-	ghes_estatus_pool_size_request -= PAGE_ALIGN(len);
-}
-
 static void ghes_nmi_add(struct ghes *ghes)
 {
 	unsigned long len;
@@ -1018,14 +1032,9 @@ static void ghes_nmi_remove(struct ghes *ghes)
 	ghes_estatus_pool_shrink(len);
 }
 
-static void ghes_nmi_init_cxt(void)
-{
-	init_irq_work(&ghes_proc_irq_work, ghes_proc_in_irq);
-}
 #else /* CONFIG_HAVE_ACPI_APEI_NMI */
 static inline void ghes_nmi_add(struct ghes *ghes) { }
 static inline void ghes_nmi_remove(struct ghes *ghes) { }
-static inline void ghes_nmi_init_cxt(void) { }
 #endif /* CONFIG_HAVE_ACPI_APEI_NMI */
 
 static int ghes_probe(struct platform_device *ghes_dev)
-- 
2.16.2

Thread overview: 26+ messages
2018-04-27 15:34 [PATCH v3 00/12] APEI in_nmi() rework and arm64 SDEI wire-up James Morse
2018-04-27 15:34 ` James Morse [this message]
2018-05-01 10:43   ` [PATCH v3 01/12] ACPI / APEI: Move the estatus queue code up, and under its own ifdef Punit Agrawal
2018-05-01 12:50     ` James Morse
2018-05-05  9:58   ` Borislav Petkov
2018-04-27 15:35 ` [PATCH v3 02/12] ACPI / APEI: Generalise the estatus queue's add/remove and notify code James Morse
2018-05-05 10:12   ` Borislav Petkov
2018-04-27 15:35 ` [PATCH v3 03/12] ACPI / APEI: don't wait to serialise with oops messages when panic()ing James Morse
2018-04-27 15:35 ` [PATCH v3 04/12] ACPI / APEI: Switch NOTIFY_SEA to use the estatus queue James Morse
2018-04-27 15:35 ` [PATCH v3 05/12] KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing James Morse
2018-04-27 15:35 ` [PATCH v3 06/12] arm64: KVM/mm: Move SEA handling behind a single 'claim' interface James Morse
2018-04-27 15:35 ` [PATCH v3 07/12] ACPI / APEI: Make the nmi_fixmap_idx per-ghes to allow multiple in_nmi() users James Morse
2018-05-05 12:27   ` Borislav Petkov
2018-05-08  8:45     ` James Morse
2018-05-16 11:05       ` Borislav Petkov
2018-05-16 14:51         ` James Morse
2018-05-17 13:36           ` Borislav Petkov
2018-05-17 18:11             ` James Morse
2018-05-16 15:38         ` Tyler Baicar
2018-05-17 13:39           ` Borislav Petkov
2018-04-27 15:35 ` [PATCH v3 08/12] ACPI / APEI: Split fixmap pages for arm64 NMI-like notifications James Morse
2018-04-27 15:35 ` [PATCH v3 09/12] firmware: arm_sdei: Add ACPI GHES registration helper James Morse
2018-04-27 15:35 ` [PATCH v3 10/12] ACPI / APEI: Add support for the SDEI GHES Notification type James Morse
2018-04-27 15:35 ` [PATCH v3 11/12] mm/memory-failure: increase queued recovery work's priority James Morse
2018-04-27 15:35 ` [PATCH v3 12/12] arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work James Morse
2018-05-01 20:15 ` [PATCH v3 00/12] APEI in_nmi() rework and arm64 SDEI wire-up Tyler Baicar
