linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH for v4.9 LTS 01/86] sparc64: Handle PIO & MEM non-resumable errors.
@ 2017-06-17 22:23 Levin, Alexander (Sasha Levin)
  2017-06-17 22:23 ` [PATCH for v4.9 LTS 02/86] sparc64: Zero pages on allocation for mondo and error queues Levin, Alexander (Sasha Levin)
  0 siblings, 1 reply; 3+ messages in thread
From: Levin, Alexander (Sasha Levin) @ 2017-06-17 22:23 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Liam R. Howlett, David S . Miller, Levin, Alexander (Sasha Levin)

From: "Liam R. Howlett" <Liam.Howlett@Oracle.com>

[ Upstream commit 047487241ff59374fded8c477f21453681f5995c ]

User processes trying to access an invalid memory address via PIO will
receive a SIGBUS signal instead of causing a panic.  Memory errors will
receive a SIGKILL since a SIGBUS may result in a coredump which may
attempt to repeat the faulting access.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
---
 arch/sparc/kernel/traps_64.c | 73 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)

diff --git a/arch/sparc/kernel/traps_64.c b/arch/sparc/kernel/traps_64.c
index 496fa926e1e0..d44fb806bbd7 100644
--- a/arch/sparc/kernel/traps_64.c
+++ b/arch/sparc/kernel/traps_64.c
@@ -2051,6 +2051,73 @@ void sun4v_resum_overflow(struct pt_regs *regs)
 	atomic_inc(&sun4v_resum_oflow_cnt);
 }
 
+/* Given a set of registers, get the virtual addressi that was being accessed
+ * by the faulting instructions at tpc.
+ */
+static unsigned long sun4v_get_vaddr(struct pt_regs *regs)
+{
+	unsigned int insn;
+
+	if (!copy_from_user(&insn, (void __user *)regs->tpc, 4)) {
+		return compute_effective_address(regs, insn,
+						 (insn >> 25) & 0x1f);
+	}
+	return 0;
+}
+
+/* Attempt to handle non-resumable errors generated from userspace.
+ * Returns true if the signal was handled, false otherwise.
+ */
+bool sun4v_nonresum_error_user_handled(struct pt_regs *regs,
+				  struct sun4v_error_entry *ent) {
+
+	unsigned int attrs = ent->err_attrs;
+
+	if (attrs & SUN4V_ERR_ATTRS_MEMORY) {
+		unsigned long addr = ent->err_raddr;
+		siginfo_t info;
+
+		if (addr == ~(u64)0) {
+			/* This seems highly unlikely to ever occur */
+			pr_emerg("SUN4V NON-RECOVERABLE ERROR: Memory error detected in unknown location!\n");
+		} else {
+			unsigned long page_cnt = DIV_ROUND_UP(ent->err_size,
+							      PAGE_SIZE);
+
+			/* Break the unfortunate news. */
+			pr_emerg("SUN4V NON-RECOVERABLE ERROR: Memory failed at %016lX\n",
+				 addr);
+			pr_emerg("SUN4V NON-RECOVERABLE ERROR:   Claiming %lu ages.\n",
+				 page_cnt);
+
+			while (page_cnt-- > 0) {
+				if (pfn_valid(addr >> PAGE_SHIFT))
+					get_page(pfn_to_page(addr >> PAGE_SHIFT));
+				addr += PAGE_SIZE;
+			}
+		}
+		info.si_signo = SIGKILL;
+		info.si_errno = 0;
+		info.si_trapno = 0;
+		force_sig_info(info.si_signo, &info, current);
+
+		return true;
+	}
+	if (attrs & SUN4V_ERR_ATTRS_PIO) {
+		siginfo_t info;
+
+		info.si_signo = SIGBUS;
+		info.si_code = BUS_ADRERR;
+		info.si_addr = (void __user *)sun4v_get_vaddr(regs);
+		force_sig_info(info.si_signo, &info, current);
+
+		return true;
+	}
+
+	/* Default to doing nothing */
+	return false;
+}
+
 /* We run with %pil set to PIL_NORMAL_MAX and PSTATE_IE enabled in %pstate.
  * Log the event, clear the first word of the entry, and die.
  */
@@ -2075,6 +2142,12 @@ void sun4v_nonresum_error(struct pt_regs *regs, unsigned long offset)
 
 	put_cpu();
 
+	if (!(regs->tstate & TSTATE_PRIV) &&
+	    sun4v_nonresum_error_user_handled(regs, &local_copy)) {
+		/* DON'T PANIC: This userspace error was handled. */
+		return;
+	}
+
 #ifdef CONFIG_PCI
 	/* Check for the special PCI poke sequence. */
 	if (pci_poke_in_progress && pci_poke_cpu == cpu) {
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [PATCH for v4.9 LTS 02/86] sparc64: Zero pages on allocation for mondo and error queues.
  2017-06-17 22:23 [PATCH for v4.9 LTS 01/86] sparc64: Handle PIO & MEM non-resumable errors Levin, Alexander (Sasha Levin)
@ 2017-06-17 22:23 ` Levin, Alexander (Sasha Levin)
  0 siblings, 0 replies; 3+ messages in thread
From: Levin, Alexander (Sasha Levin) @ 2017-06-17 22:23 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Liam R. Howlett, David S . Miller, Levin, Alexander (Sasha Levin)

From: "Liam R. Howlett" <Liam.Howlett@Oracle.com>

[ Upstream commit 7a7dc961a28b965a0d0303c2e989df17b411708b ]

Error queues use a non-zero first word to detect if the queues are full.
Using pages that have not been zeroed may result in false positive
overflow events.  These queues are set up once during boot so zeroing
all mondo and error queue pages is safe.

Note that the false positive overflow does not always occur because the
page allocation for these queues is so early in the boot cycle that
higher number CPUs get fresh pages.  It is only when traps are serviced
with lower number CPUs who were given already used pages that this issue
is exposed.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
---
 arch/sparc/kernel/irq_64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/sparc/kernel/irq_64.c b/arch/sparc/kernel/irq_64.c
index e1b1ce63a328..5cbf03c14981 100644
--- a/arch/sparc/kernel/irq_64.c
+++ b/arch/sparc/kernel/irq_64.c
@@ -1021,7 +1021,7 @@ static void __init alloc_one_queue(unsigned long *pa_ptr, unsigned long qmask)
 	unsigned long order = get_order(size);
 	unsigned long p;
 
-	p = __get_free_pages(GFP_KERNEL, order);
+	p = __get_free_pages(GFP_KERNEL | __GFP_ZERO, order);
 	if (!p) {
 		prom_printf("SUN4V: Error, cannot allocate queue.\n");
 		prom_halt();
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [PATCH for v4.9 LTS 02/86] sparc64: Zero pages on allocation for mondo and error queues.
  2017-06-17 22:24 [PATCH for v4.9 LTS 01/86] sparc64: Handle PIO & MEM non-resumable errors Levin, Alexander (Sasha Levin)
@ 2017-06-17 22:24 ` Levin, Alexander (Sasha Levin)
  0 siblings, 0 replies; 3+ messages in thread
From: Levin, Alexander (Sasha Levin) @ 2017-06-17 22:24 UTC (permalink / raw)
  To: stable, linux-kernel
  Cc: Liam R. Howlett, David S . Miller, Levin, Alexander (Sasha Levin)

From: "Liam R. Howlett" <Liam.Howlett@Oracle.com>

[ Upstream commit 7a7dc961a28b965a0d0303c2e989df17b411708b ]

Error queues use a non-zero first word to detect if the queues are full.
Using pages that have not been zeroed may result in false positive
overflow events.  These queues are set up once during boot so zeroing
all mondo and error queue pages is safe.

Note that the false positive overflow does not always occur because the
page allocation for these queues is so early in the boot cycle that
higher number CPUs get fresh pages.  It is only when traps are serviced
with lower number CPUs who were given already used pages that this issue
is exposed.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
---
 arch/sparc/kernel/irq_64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/sparc/kernel/irq_64.c b/arch/sparc/kernel/irq_64.c
index e1b1ce63a328..5cbf03c14981 100644
--- a/arch/sparc/kernel/irq_64.c
+++ b/arch/sparc/kernel/irq_64.c
@@ -1021,7 +1021,7 @@ static void __init alloc_one_queue(unsigned long *pa_ptr, unsigned long qmask)
 	unsigned long order = get_order(size);
 	unsigned long p;
 
-	p = __get_free_pages(GFP_KERNEL, order);
+	p = __get_free_pages(GFP_KERNEL | __GFP_ZERO, order);
 	if (!p) {
 		prom_printf("SUN4V: Error, cannot allocate queue.\n");
 		prom_halt();
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2017-06-17 22:25 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-17 22:23 [PATCH for v4.9 LTS 01/86] sparc64: Handle PIO & MEM non-resumable errors Levin, Alexander (Sasha Levin)
2017-06-17 22:23 ` [PATCH for v4.9 LTS 02/86] sparc64: Zero pages on allocation for mondo and error queues Levin, Alexander (Sasha Levin)
2017-06-17 22:24 [PATCH for v4.9 LTS 01/86] sparc64: Handle PIO & MEM non-resumable errors Levin, Alexander (Sasha Levin)
2017-06-17 22:24 ` [PATCH for v4.9 LTS 02/86] sparc64: Zero pages on allocation for mondo and error queues Levin, Alexander (Sasha Levin)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).